Fake Website Detection using Machine
Learning Algorithms
Presented By
MD SAJADUL ISLAM (193002138)
MST. NUSRAT JAHAN JYOTI (192002022)
Supervised By
MD. SOLAIMAN MIA
Assistant Professor
(Green University of Bangladesh)
Co-Supervised By
MD. GULZAR HUSSAIN
Lecturer, GUB
Green University of Bangladesh
Department of CSE
Contents
 Introduction
 Motivation
 Objectives
 Literature Review
 Problem Description
 Draft Plan
 Gantt Chart
 Proposed Budget Components
 Conclusion
 References
Introduction
 Phishing attacks are increasing by 200% every year
 Fake websites are the most common way to conduct a phishing attack
 Multinational companies lose 100 billion dollars per year to phishing attacks
Motivation
 Protect internet users from phishing attacks
 Reduce the rate of phishing attacks
 Use URLs to identify phishing websites
 Safeguard people’s personal information
Objectives
 Detect fake website URLs using machine learning algorithms
 Train machine learning models on the dataset and predict phishing websites
 Use efficient machine learning models
 Maintain an accuracy rate of more than 90%
Literature Review
 A Novel Approach to Detect Phishing Attacks using Binary Visualization and Machine Learning [1]
Working Process
 Uses a neural network to detect phishing URLs
 Uses binary visualization
Usefulness
 Can be applied to phishing and non-phishing website classification
Drawbacks
 Limited dataset available for the experiment
 The limited dataset affects the model's predictions and efficiency
Scope of Improvement
 Add more data for both training and testing
 Use different models to make predictions
Literature Review (Cont.)
 Phishing attacks in Qatar: A literature review of the problems and solutions [2]
Working Process
 Phishing detection
 Mitigation in emails and websites
Usefulness
 Phishing detection and mitigation in emails and websites
Drawbacks
 Many phishing detection methods (rule-based methods, decision trees, associative classification, SVM, NN) were listed, but none were demonstrated in the research
Scope of Improvement
 Incorporate other phishing detection techniques, for example Random Forest, LightGBM, and XGBoost
 Improve the accuracy level
Literature Review (Cont.)
 Phishing detection: A literature survey [3]
Working Process
 Uses binary visualization
 User training to increase awareness of phishing
Usefulness
 Useful for phishing detection
 Offensive defense
Drawbacks
 No metrics for end-user evaluation of each website
 No features for user involvement
Scope of Improvement
 Add user metrics for end-user evaluation of each website
 Increase the accuracy level
Literature Review (Cont.)
 Phishing Website Detection using Machine Learning Techniques and CNN [4]
Working Process
 Uses Support Vector Machine (SVM), Random Forest (RF), and CNN
 Combines machine learning and deep learning
Usefulness
 CNN gave higher accuracy than SVM and RF
 Identifies fake URLs
Drawbacks
 Accuracy was lower for RF and SVM (67% and 64%, respectively)
 Missing user involvement
Scope of Improvement
 Use Random Forest, LightGBM, and XGBoost
 Expected accuracy of up to 90%
Literature Review (Cont.)
 Phishing Website Detection using Machine Learning Techniques and CNN [4]
Working Process
 Uses a decision tree and a Support Vector Machine
Usefulness
 Proposes a deep learning-based URL detector; the authors argue that the method can produce insights from URLs
Drawbacks
 Deep learning methods demand more time to produce an output
 The method also processes the URL and matches it against a library to generate an output
Scope of Improvement
 Use Random Forest, LightGBM, and XGBoost to produce an output in a short time
 Increase the accuracy
Problem Description
 Fake URL detection
 Dataset description
 Three machine learning classifiers
1st stage (URL Detection)
 Phishing URLs
 Malware URLs
 Create lexical numeric features from input URLs
2nd stage (Dataset, expected)
 Dataset from https://phishtank.org/
 Dataset from Kaggle
 428,103 benign or safe URLs
 96,457 defacement URLs
 94,111 phishing URLs
 32,520 malware URLs
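The first-stage step "create lexical numeric features from input URLs" can be sketched as follows. The slides do not list the exact features, so the feature names below are illustrative assumptions, and `lexical_features` is a hypothetical helper rather than the authors' implementation.

```python
import re
from urllib.parse import urlparse

# Matches a leading scheme such as "http://" or "https://".
_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")
# Matches a dotted-quad IPv4 host such as "192.168.0.1".
_IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def lexical_features(url: str) -> dict:
    """Map a URL string to simple numeric lexical features.

    The feature set is an assumption chosen for illustration only.
    """
    # urlparse only recognises the host when a scheme is present,
    # so prepend one for bare inputs like "example.com/login".
    parsed = urlparse(url if _SCHEME.match(url) else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_subdomains": max(host.count(".") - 1, 0),
        "has_at_symbol": int("@" in url),
        "has_ip_host": int(bool(_IPV4.fullmatch(host))),
        "uses_https": int(parsed.scheme == "https"),
    }


if __name__ == "__main__":
    for sample in (
        "https://www.example.com/login",
        "http://192.168.0.1/secure-update/index.php?id=7",
    ):
        print(sample, lexical_features(sample))
```

Each URL becomes a fixed-length row of numbers, which is the form tree-based classifiers such as Random Forest, LightGBM, and XGBoost expect as input.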
Problem Description (cont.)
3rd stage (Three Machine Learning Classifiers)
Random Forest
 Performs well even if the data contains null/missing values
 Expected accuracy of up to 90%
 Performs both regression and classification tasks
LightGBM
 Lower memory usage
 Aims for high efficiency
 Faster training speed and higher efficiency
XGBoost
 Effective for large data sets
 Does not need normalized features
 Built-in capability to handle missing values
Draft Plan
[Figure: How a phisher conducts phishing attacks through a fake website; labeled element: Dataset]
Draft Plan (cont.)
 Dataset: collect the URLs
 Lexical feature extraction: convert each URL into lexical numeric features
 Evaluation using machine learning: Random Forest, LightGBM, XGBoost
 Classifier performance analysis: evaluate the accuracy
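The final step of the plan, evaluating accuracy, can be sketched as below. Accuracy alone can mislead on this dataset: benign URLs make up about 66% of the listed counts, so always predicting "benign" would already score roughly 65.7%. The helpers `binary_metrics` and `majority_baseline` are hypothetical names, and the label strings are assumptions; only the class counts come from the dataset slide.

```python
def binary_metrics(y_true, y_pred, positive="phishing"):
    """Accuracy, precision, recall and F1, treating one class as positive."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one prediction to evaluate")
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


# Class counts as listed on the dataset slide.
DATASET_COUNTS = {
    "benign": 428_103,
    "defacement": 96_457,
    "phishing": 94_111,
    "malware": 32_520,
}

if __name__ == "__main__":
    # Toy labels, made up purely to exercise the functions.
    truth = ["phishing", "benign", "phishing", "benign", "malware"]
    preds = ["phishing", "benign", "benign", "benign", "phishing"]
    print(binary_metrics(truth, preds))
    print(f"majority baseline: {majority_baseline(DATASET_COUNTS):.4f}")
```

Reporting precision and recall for the phishing class next to accuracy, and comparing every classifier against the majority baseline, makes the 90% target easier to interpret.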
Gantt Chart
Months: Jul Aug Sep Oct Nov Dec Jan Feb Mar Apr May Jun
SL. No.  Thesis Activities
1        Read Existing Papers
2        Finding Limitations
3        Planning
4        Fixing Objectives
5        Optimization Formulation
6        Design Learning Algorithm
7        Data Collection
8        Implementation
9        Comparison
10       Paper Writing & Publication
Proposed Budget Components
SL. No.  Budget Title                 No. of Units  Per Unit Cost  Total Cost (BDT)
1        Research Supervision         3             6000           18000
2        Data Collection              -             12000          12000
3        Access Researching Website   -             1200           3000
4        Environment Setup            2             5000           10000
5        Implementation and Testing   -             15000          15000
6        Paper Publication Cost       2             7000           14000
7        Others                       -             4000           4000
Total: 76000 (BDT)
Table 1: Budget Plan for Research
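The arithmetic in Table 1 can be rechecked with a short sketch. It assumes that a missing unit count ("-") means a single unit, which the slides do not state. Under that reading the listed totals sum to 76000 BDT, while the "Access Researching Website" row does not reconcile with its per-unit cost (1200 versus a listed total of 3000).

```python
# Rows of Table 1 as (title, units or None, per-unit cost, listed total).
BUDGET = [
    ("Research Supervision", 3, 6000, 18000),
    ("Data Collection", None, 12000, 12000),
    ("Access Researching Website", None, 1200, 3000),
    ("Environment Setup", 2, 5000, 10000),
    ("Implementation and Testing", None, 15000, 15000),
    ("Paper Publication Cost", 2, 7000, 14000),
    ("Others", None, 4000, 4000),
]


def grand_total(rows):
    """Sum of the listed total-cost column."""
    return sum(total for _, _, _, total in rows)


def inconsistent_rows(rows):
    """Titles whose listed total differs from units x per-unit cost.

    Assumption: a row with no unit count is treated as one unit.
    """
    return [
        title
        for title, units, unit_cost, total in rows
        if (units or 1) * unit_cost != total
    ]


if __name__ == "__main__":
    print("grand total:", grand_total(BUDGET))
    print("rows to recheck:", inconsistent_rows(BUDGET))
```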
Conclusion
 Build a fake website detection system
 Identify fake website URLs with the best achievable accuracy
 In the future, the system can be upgraded to detect web pages automatically and to improve the application's compatibility with web browsers
 Further work can add other characteristics that distinguish fake web pages from legitimate ones
References
[1] L. Barlow, G. Bendiab, S. Shiaeles, and N. Savage, “A Novel Approach to Detect Phishing Attacks using Binary
Visualization and Machine Learning,” in Proceedings - 2020 IEEE World Congress on Services, SERVICES 2020, Oct.
2020, pp. 177–182. doi: 10.1109/SERVICES48979.2020.00046.
[2] Y. Al-Hamar, H. Kolivand, and A. Al-Hamar, “Phishing attacks in Qatar: A literature review of the problems and
solutions,” in Proceedings – International Conference on Developments in eSystems Engineering, DeSE, Oct. 2021, vol.
October-2021, pp. 837–842. doi: 10.1109/DeSE.2020.00155.
[3] A. Basit, M. Zafar, A. R. Javed, and Z. Jalil, “A Novel Ensemble Machine Learning Method to Detect Phishing Attack,”
Nov. 2020. doi: 10.1109/INMIC50486.2020.9318210.
[4] D. M. Vargheese and Sreelakshmi N. R., “Phishing Website Detection using Machine Learning Techniques and
CNN,” International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, ICCIDT - 2022
Conference Proceedings, www.ijert.org.
[5] E. Gandotra and D. Gupta, “An Efficient Approach for Phishing Detection using Machine Learning,” in Algorithms for
Intelligent Systems, Springer, Singapore, 2021. doi: 10.1007/978-981-15-8711-5_12.
THANK YOU
