Jerry @ TDOHCON 2017-10-14
jlee58.tw@gmail.com
Jerry
• https://jerrynest.io/
http://best0969.cdn7-network17-server2.club
http://app9259.cdn7-network27-server2.club
http://apps4684.cdn7-bignetwork17-server9.top
Redirect!
• 84% of phishing sites exist for less than 24 hours, and some are online for less than 15 minutes.
• Almost all phishing sites are hidden within legitimate domains.
Changing Fast | Cross-platform | Hacked Server
1. Establishment and maintenance of infrastructure
– Collection of public phishing data
– Updating of blacklist
– Streaming analysis with Storm
– The analysis of duplication
2. The evolution of detection and prevention technology
– List-based
– Visual-based
– Feature-based
– Ensemble model
[Figure: system pipeline. Phishing site → Crawler → Feature Extraction → Detection Model → Analysis & Report. Data collected: HTML, CSS, JavaScript, fonts, …; HTTP response header, DNS record, WHOIS record, IP address, URL, SSL record, screenshot, …]
PhishTank
Suspicious phishing URL
The interface for verification
Alexa Top List
http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
• Multi-threaded crawlers are deployed in containers on Google Cloud Platform (GCP).
• Features and screenshots are extracted and stored in file storage, an image server, and a MongoDB database.
[Figure: crawler architecture. Components: data sources (legitimate sites, phishing sites, …), URL Fetcher, URL Pool, Web Crawlers (Crawler 1 to Crawler N), Feature Extractor, File Storage, Image Server, MongoDB, Analysis]
# Mirror one page with its requisites (fonts rejected), 10 MB quota, mobile Safari user agent
wget --no-parent -Q10m --timestamping \
  --reject otf,woff,woff2,ttf,eot \
  --convert-links --page-requisites --span-hosts --adjust-extension \
  --no-check-certificate -e robots=off \
  -U "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1" \
  -P download/ "https://www.google.com.tw/"
• Domain based Features
– Age of Domain
– DNS Record
– Website Traffic
– PageRank
– Google Index
– Number of Links Pointing to Page
– Statistical-Reports Based Feature
Rule (URL length):
\[
\text{feature} =
\begin{cases}
\text{Legitimate} & \text{if URL length} < 54 \\
\text{Suspicious} & \text{if } 54 \le \text{URL length} \le 75 \\
\text{Phishing} & \text{otherwise}
\end{cases}
\]
http://federmacedoadv.com.br/3f/aze/ab51e2e319e51502f416dbe46b773a5e/?cmd=_home&dispatch=11004d58f5b74f8dc1e7c2e8dd4105e811004d58f5b74f8dc1e7c2e8dd4105e8@phishing.website.html
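The URL-length rule above can be written as a small function. This is a minimal sketch that uses the slide's thresholds (54 and 75 characters); the function name `url_length_feature` and the string return values are assumptions made for illustration.

```python
def url_length_feature(url: str) -> str:
    """Map a URL to a label using the length thresholds from the rule above."""
    length = len(url)
    if length < 54:
        return "Legitimate"
    if length <= 75:  # 54 <= length <= 75
        return "Suspicious"
    return "Phishing"
```

A short URL such as `https://www.google.com.tw/` falls in the first branch, while the long example URL above exceeds 75 characters and falls in the last one.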
• Black/white list-based
– Google Safe Browsing
– PhishNet
– Automated individual white-list (AIWL)
• Visual-based
– Earth Mover’s Distance (EMD) algorithm
– SURF
– Histogram of Oriented Gradients (HOG)
• Feature-based
– CANTINA
– PhishWho
– Mobile features
• Ensemble
– AJNA (SSL/TLS feature and JavaScript-based visual clues)
– kAYO
– MobileFish
Displaying phishing information
The interface for labelers to verify phishing sites
[Figure: The lifecycle of a phish on PhishTank. States: Submitted, Voting, Verified, Blacklist, Invalid. Voting module: Crawl, Voter, Classifier, Evaluation. Monitoring module: Crawl, Classifier, Evaluation.]
/ Selenium
IEEE DASC 2017 Accepted
The integrated architecture
[Figure: PhishBox integrated architecture. Infrastructure: data collection, blacklist update, monitoring. Modules: ETL, Monitoring, Model (Validation/Detection), Visualization, Voting. Phishing detection technology: visual-based, feature-based, feature selection.]
Two-stage phishing detection model
• The two-stage phishing detection model combines a validation model with a detection model
– Non-phish = invalid + legitimate
• Build the validation model from manually labeled data
– Apply a supervised learning algorithm
– Apply active learning
• Improve the performance of the detection model with the validated phishing data
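The way the two stages combine can be sketched as follows. This is only an illustration of the control flow, assuming each model is exposed as a boolean function; the names `two_stage_predict`, `validation_model`, and `detection_model` are hypothetical, and the slides do not show the actual implementation.

```python
from typing import Callable, Tuple


def two_stage_predict(
    page: object,
    validation_model: Callable[[object], bool],  # assumed: True if the page is valid
    detection_model: Callable[[object], bool],   # assumed: True if the page is a phish
) -> Tuple[str, str]:
    """Return (target, detail), where target is "Phish" or "Non-Phish"."""
    # Stage 1: invalid pages (offline, redirected, suspended) count as non-phish.
    if not validation_model(page):
        return ("Non-Phish", "Invalid")
    # Stage 2: only valid pages reach the detection model.
    if detection_model(page):
        return ("Phish", "Phish")
    return ("Non-Phish", "Legitimate")
```

Filtering invalid pages first means the detection model is trained and evaluated only on pages that still show live content.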
[Figure: two-stage model. Target: Non-Phish or Phish. Validation model: Invalid or Valid. Detection model: Legitimate or Phish.]
Phishing data validation
• A page is called invalid when it meets any of the following conditions:
– Offline: the website is not reachable, e.g. status code 404.
– Redirection: the page redirects to a legitimate page.
– Invalid content: the page content has changed and contains an invalid keyword such as “this account has been suspended” or “the page is forbidden”.
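The three conditions above can be approximated with a few hand-written checks. This is a rule-based sketch for illustration, not the trained validation classifier the talk describes; the function name, its parameters, the `legitimate_hosts` set, and the choice to treat any 4xx/5xx status as offline are all assumptions.

```python
from typing import Optional, Set
from urllib.parse import urlparse

# Keywords quoted on the slide; a real list would likely be longer.
INVALID_KEYWORDS = ("this account has been suspended", "the page is forbidden")


def invalid_reason(
    status_code: Optional[int],
    requested_url: str,
    final_url: str,
    page_text: str,
    legitimate_hosts: Set[str],
) -> Optional[str]:
    """Return why a crawled page is invalid, or None if no rule matches."""
    # Offline: no response at all, or an HTTP error such as 404.
    if status_code is None or status_code >= 400:
        return "offline"
    # Redirection: the crawl ended on a different host known to be legitimate.
    requested_host = urlparse(requested_url).hostname
    final_host = urlparse(final_url).hostname
    if final_host != requested_host and final_host in legitimate_hosts:
        return "redirection"
    # Invalid content: the page now shows a suspension or forbidden notice.
    text = page_text.lower()
    if any(keyword in text for keyword in INVALID_KEYWORDS):
        return "invalid content"
    return None
```

Rules like these are brittle against wording changes, which is one reason to learn a classifier from labeled examples instead.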
[Invalid content] The account has been suspended by the host provider.
[Redirection] Redirect to the Google homepage
Construct a validation classifier!
Examples of invalid page
The page has been removed
Blocked by host provider
Domain Parking
Redirect to homepage
Error message from host provider
Redirect to legitimate site
Examples of phishing page
Multi-provider login page
Specific target
Active learning
[Figure: active learning loop. Components: ensemble validation model, labeled training set, unlabeled pool, sampling algorithm. Parameters: initial label size, query block size.]
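A pool-based active learning loop with these parts can be sketched as below. The slides do not name the sampling algorithm, so uncertainty sampling is assumed here, and a toy one-dimensional threshold model (`fit_threshold`) stands in for the ensemble validation model; every name in the block is hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Labeled = List[Tuple[float, int]]


def fit_threshold(labeled: Labeled) -> Callable[[float], float]:
    """Toy stand-in model: returns a function giving P(label == 1 | x)."""
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    if not zeros or not ones:
        return lambda x: 0.5  # only one class seen: everything is uncertain
    boundary = (max(zeros) + min(ones)) / 2
    return lambda x: 1 / (1 + math.exp(-10 * (x - boundary)))


def active_learning(
    pool: Sequence[float],
    oracle: Callable[[float], int],  # the human labeler
    fit: Callable[[Labeled], Callable[[float], float]],
    initial_label_size: int,
    query_block_size: int,
    rounds: int,
    seed: int = 0,
) -> Labeled:
    """Grow the labeled training set by querying the most uncertain items."""
    unlabeled = list(pool)
    random.Random(seed).shuffle(unlabeled)
    # Seed the labeled set with a random sample of the pool.
    labeled = [(x, oracle(x)) for x in unlabeled[:initial_label_size]]
    unlabeled = unlabeled[initial_label_size:]
    for _ in range(rounds):
        if not unlabeled:
            break
        prob = fit(labeled)
        # Uncertainty sampling: scores closest to 0.5 are queried first.
        unlabeled.sort(key=lambda x: abs(prob(x) - 0.5))
        block = unlabeled[:query_block_size]
        unlabeled = unlabeled[query_block_size:]
        labeled.extend((x, oracle(x)) for x in block)
    return labeled
```

Each round costs `query_block_size` manual labels, so the two parameters in the figure trade labeling effort against how often the model is retrained.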
The rules of manual labeling
The screenshot on PhishTank
The screenshot we took
URL and host information
Label area
1. Check the screenshots to confirm whether the page is invalid
2. Check the URL and WHOIS record to confirm whether the page is invalid
3. Look up the website with a search engine to confirm whether the page is invalid
Real
New version
Fake Fake Fake
Recommended reading: 未來的犯罪:當萬物都可駭,我們該如何面對 (Future Crimes: When Everything Can Be Hacked, How Should We Respond?)
• PhishTank - https://www.phishtank.com/
• UCI Phishing dataset - https://archive.ics.uci.edu/ml/datasets/phishing+websites
• Google Cloud Platform - https://cloud.google.com/
• Weka (Data mining) - https://www.cs.waikato.ac.nz/ml/weka/
• scikit-learn (Machine Learning) - http://scikit-learn.org/stable/
• Microsoft Machine Learning - https://azure.microsoft.com/zh-tw/services/machine-learning-studio/
• PhishBox : An approach for phishing validation and detection
• Don't be fooled! An analysis of the Google member prize-draw scam page (不要被騙了!帶你分析 Google 會員抽獎詐騙網頁) - https://jerrynest.io/google-scam-site/
Blog: https://jerrynest.io/
Facebook: https://www.facebook.com/jerrynest.io/

Lessons from fighting phishing and scam websites (對抗釣魚與詐騙網站的經驗談)