My Phish Tank

OLLYMEDZ
Published in codeburst
3 min read · Jun 20, 2019

A phishing website (sometimes called a “spoofed” site) tries to steal your account password or other confidential information by tricking you into believing you’re on a legitimate website. You could even land on a phishing site by mistyping a URL (web address).

While doing some work experience at Netcraft Ltd, I gained insight into how they tackle cybercrime on a large scale using machine learning. Intrigued by this area, I wanted to explore further and found that Netcraft has a service where you can report phishing websites and earn points for doing so. So, to begin with, I wanted to create a program that would find these phishing websites.

I came up with a simple plan:

  • Identify all the recently registered URLs
  • Detect keywords in the URLs
  • Report a list of URLs to Netcraft

A phishing website is designed to imitate the real destination while stealing your data. So how might you land on one of these sites? One common trick is to put the name of the legitimate website in the URL and add other things around it. For example, the official URL for Apple ID is “https://support.apple.com/en-gb/apple-id”, whereas a phishing URL might look like “store-apple-idd.com”. Notice how “id” is spelt “idd”. When searching for Apple ID you might mistype the URL like that and land on a polished UI that looks exactly like Apple's, unaware of the potential threat. Because scammers can’t register a domain identical to the one the real Apple already owns, they resort to simple misspellings or extra words and phrases that, in our fast-paced lives, you are likely not to notice. Hence the second part of my plan is to create a list of keywords and phrases that the program can use to detect these phishing sites.
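The lookalike-domain idea can be sketched in a few lines of Python. This is only an illustration, not the code from my program: the function names `hostname_of` and `looks_like_phish`, and the default keyword and legitimate domain, are assumptions picked for the Apple example above.

```python
from urllib.parse import urlsplit


def hostname_of(url):
    """Return the lowercase hostname of a URL, with or without a scheme."""
    if "://" not in url:
        url = "//" + url  # lets urlsplit treat a bare domain as the host
    return (urlsplit(url).hostname or "").lower()


def looks_like_phish(url, keyword="apple", legit_domain="apple.com"):
    """Flag hostnames that mention the brand but are not on its real domain."""
    host = hostname_of(url)
    on_legit_domain = host == legit_domain or host.endswith("." + legit_domain)
    return keyword in host and not on_legit_domain


print(looks_like_phish("https://support.apple.com/en-gb/apple-id"))  # False
print(looks_like_phish("store-apple-idd.com"))  # True
```

A check like this only says that a domain deserves a closer look. A hostname that contains a brand name is not proof of phishing.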

The list could go on forever, and it would be good to improve on my code, as chaining tonnes of elif statements is a little inefficient. I used Vade Secure's graph of commonly phished companies to create my list.
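One possible way to avoid a long elif chain is to keep the keywords in a dictionary and loop over it, so adding a brand means adding a line of data rather than another branch. The sketch below assumes a hypothetical `KEYWORDS` table and `matched_brands` helper. The entries are made up for illustration and are not the list I built from the graph.

```python
# Hypothetical keyword table: each brand maps to substrings worth flagging.
# These entries are illustrative only.
KEYWORDS = {
    "apple": ["apple-id", "appleid", "icloud"],
    "paypal": ["paypal", "paypa1"],
    "microsoft": ["microsoft", "office365"],
}


def matched_brands(domain):
    """Return the brands whose keywords appear in the domain, sorted by name."""
    domain = domain.lower()
    return sorted(
        brand
        for brand, words in KEYWORDS.items()
        if any(word in domain for word in words)
    )


print(matched_brands("store-apple-idd.com"))  # ['apple']
print(matched_brands("example.org"))  # []
```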

Once the list was created, the third and final part of my plan was put into action. The first function searches through the big list of URLs and, if a keyword is detected, appends that URL to a new list called suspicious. The next step is to check whether the website is up and running or has already been removed. Function 2 uses the “isup function” to check whether the website is up and, if it is, appends it to a new list, suspicious_and_up. Finally, Function 3 submits all the suspicious and up URLs, along with the info for the report, to Netcraft. That's how I created this simple program, which sorts recently registered domains into a list when keywords or phrases in the domain suggest that it is illegitimate, and reports those domains to be taken down.
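The three functions could look roughly like the sketch below. It is a reconstruction under assumptions, not my original code: `is_up` is a stand-in for the “isup function” built on the standard library, `build_report` only assembles the report text rather than submitting anything to Netcraft, and the `reporter_email` parameter is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def find_suspicious(urls, keywords):
    """Function 1: keep the URLs that contain at least one keyword."""
    suspicious = []
    for url in urls:
        if any(keyword in url.lower() for keyword in keywords):
            suspicious.append(url)
    return suspicious


def is_up(url, timeout=5):
    """Stand-in availability check: True if the server answers over HTTP."""
    if "://" not in url:
        url = "http://" + url
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError, ValueError,
            http.client.HTTPException):
        return False


def filter_up(suspicious, check=is_up):
    """Function 2: keep only the suspicious URLs that are still online."""
    return [url for url in suspicious if check(url)]


def build_report(suspicious_and_up, reporter_email):
    """Function 3 (partial): assemble the report text; submission is omitted."""
    lines = [f"Reporter: {reporter_email}", "Suspected phishing URLs:"]
    lines.extend(suspicious_and_up)
    return "\n".join(lines)


# Demo with a fake availability check so that no network access is needed.
urls = ["store-apple-idd.com", "example.org", "paypal-verify.net"]
suspicious = find_suspicious(urls, ["apple", "paypal"])
suspicious_and_up = filter_up(suspicious, check=lambda url: "apple" in url)
print(build_report(suspicious_and_up, "me@example.com"))
```

Passing the availability check in as a parameter keeps the filtering step easy to try out offline. Calling `filter_up(suspicious)` without the `check` argument uses the real HTTP request instead.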
