EMAIL SPAM
DETECTION
CONTENT
■ Slide 1: Title: Email Spam Classification
■ Slide 2: Title: Dataset Overview
■ Slide 3: Title: Data Preprocessing & Vectorization
■ Slide 4: Title: Naive Bayes & Logistic Regression Classifiers
■ Slide 5: Title: Accuracy Comparison & Email Prediction
INTRODUCTION
• Introduction:
• Email spam is a significant issue affecting individuals, businesses, and organizations
worldwide.
• Spam emails often contain malicious content, scams, or unwanted advertising.
• Accurately classifying spam and ham (non-spam) emails is crucial for ensuring email
security and protecting users from potential threats.
• Goal of the Code:
• The purpose of the presented code is to demonstrate the application of machine learning
algorithms for email spam classification.
• By training and evaluating classifiers on a labeled dataset, the code aims to accurately
differentiate between spam and legitimate emails.
• Importance of Email Spam Classification:
• Enhanced Security: Effective spam classification helps keep users from falling victim to
phishing attempts, malware distribution, and fraudulent schemes.
• User Experience: Reducing the influx of spam emails improves productivity and ensures
that users receive relevant and legitimate messages.
• Resource Optimization: Identifying and filtering out spam emails saves storage space and
reduces the burden on email servers.
Data Overview
• 'spam_ham_dataset.csv' dataset
used in the code.
• Contains labeled examples of spam
and ham emails.
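A minimal sketch of loading such a dataset with pandas. The slides do not list the file's columns, so the column names `label` and `text` are assumptions, and a tiny in-memory CSV stands in for 'spam_ham_dataset.csv'. The helper name `load_emails` is hypothetical.

```python
import io

import pandas as pd

# Stand-in for 'spam_ham_dataset.csv'. The column names `label` and `text`
# are assumptions, since the slides do not list the file's columns.
SAMPLE_CSV = io.StringIO(
    "label,text\n"
    "spam,Win a free prize now\n"
    "ham,Meeting moved to 3 pm\n"
    "spam,Cheap loans click here\n"
    "ham,Please review the attached report\n"
)


def load_emails(source):
    """Read a labeled email CSV and drop rows with a missing label or text."""
    frame = pd.read_csv(source)
    return frame.dropna(subset=["label", "text"]).reset_index(drop=True)


emails = load_emails(SAMPLE_CSV)
label_counts = emails["label"].value_counts().to_dict()
print(emails.shape)
print(label_counts)
```

With the real file, the same helper could be called with a path, for example `load_emails("spam_ham_dataset.csv")`, provided the column names are adjusted to match the file.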
Data Preprocessing & Vectorization
• Data Splitting:
• Dataset split into 80% training and 20% testing sets.
• Random state 42 for reproducibility.
• Vectorization:
• CountVectorizer converts text data into numerical feature vectors.
• Training and testing data transformed using CountVectorizer.
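The splitting and vectorization steps can be sketched as follows with scikit-learn. The toy corpus is hypothetical and only stands in for the dataset. Fitting the vocabulary on the training texts alone is a common practice and an assumption here; the slides only state that both sets are transformed with CountVectorizer.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical toy corpus standing in for the dataset's texts and labels.
texts = [
    "win a free prize now",
    "cheap loans click here",
    "claim your free reward",
    "free money click now",
    "limited offer win cash",
    "meeting moved to three",
    "please review the report",
    "lunch tomorrow at noon",
    "notes from the call",
    "project update attached",
]
labels = ["spam"] * 5 + ["ham"] * 5

# 80/20 split with random_state=42, as stated on the slide.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)

# Learn the vocabulary from the training texts only (an assumption), then
# reuse it for the test texts so both matrices share the same columns.
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)

print(X_train_counts.shape, X_test_counts.shape)
```

Each row of the resulting sparse matrices is one email and each column is the count of one vocabulary word.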
Naive Bayes & Logistic Regression Classifiers
• Naive Bayes Classifier:
• Training: MultinomialNB classifier trained with vectorized training
data.
• Prediction: Naive Bayes predicts labels for the test data.
• Accuracy Calculation: accuracy_score metric used to evaluate
Naive Bayes accuracy.
• Logistic Regression Classifier:
• Training: Logistic Regression classifier trained with vectorized
training data.
• Prediction: Logistic Regression predicts labels for the test data.
• Accuracy Calculation: accuracy_score metric used to evaluate
Logistic Regression accuracy.
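A minimal sketch of the train, predict, and score loop described above, using scikit-learn's `MultinomialNB`, `LogisticRegression`, and `accuracy_score`. The toy corpus is hypothetical, and `max_iter=1000` is an assumption added to help the solver converge; the slides do not give hyperparameters.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus standing in for the dataset's texts and labels.
texts = [
    "win a free prize now",
    "cheap loans click here",
    "claim your free reward",
    "free money click now",
    "limited offer win cash",
    "meeting moved to three",
    "please review the report",
    "lunch tomorrow at noon",
    "notes from the call",
    "project update attached",
]
labels = ["spam"] * 5 + ["ham"] * 5

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)

models = {
    "Naive Bayes": MultinomialNB(),
    # max_iter is an assumption, not a setting taken from the slides.
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train_counts, y_train)            # training
    predictions = model.predict(X_test_counts)    # prediction on test data
    accuracies[name] = accuracy_score(y_test, predictions)

for name, score in accuracies.items():
    print(f"{name}: {score:.2f}")
```

Scores on a corpus this small say nothing about the real dataset; the sketch only shows the shape of the workflow.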
Accuracy Comparison & Email Prediction
Accuracy Comparison
• Bar chart comparing accuracies of
Naive Bayes and Logistic
Regression classifiers.
• X-axis: Algorithm names (Naive
Bayes, Logistic Regression).
• Y-axis: Corresponding accuracies.
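A bar chart like the one described could be drawn with matplotlib as sketched below. The helper name `plot_accuracies` is hypothetical, and the accuracy values are placeholders for illustration only, not results from the slides.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_accuracies(accuracies):
    """Draw one bar per algorithm, with accuracy on the y-axis."""
    fig, ax = plt.subplots()
    ax.bar(list(accuracies.keys()), list(accuracies.values()))
    ax.set_xlabel("Algorithm")
    ax.set_ylabel("Accuracy")
    ax.set_ylim(0.0, 1.0)
    ax.set_title("Accuracy Comparison")
    return fig, ax


# Placeholder numbers for illustration only; not results from the slides.
placeholder_accuracies = {"Naive Bayes": 0.5, "Logistic Regression": 0.5}
fig, ax = plot_accuracies(placeholder_accuracies)
```

Calling `fig.savefig("accuracy_comparison.png")` afterwards would write the chart to disk; in an interactive session, `plt.show()` can be used instead once the `Agg` line is removed.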
Email Prediction Examples:
• Sample test spam and ham emails
used for prediction.
• Predictions made by both Naive
Bayes and Logistic Regression
classifiers.
• Display predicted labels for the test
spam and ham emails.
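The prediction step can be sketched as below: new emails are transformed with the same fitted vectorizer and passed to each trained classifier. The training corpus and the two sample emails are hypothetical, and `max_iter=1000` is an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data standing in for the real dataset.
train_texts = [
    "win a free prize now",
    "cheap loans click here",
    "claim your free reward",
    "free money click now",
    "meeting moved to three",
    "please review the report",
    "lunch tomorrow at noon",
    "project update attached",
]
train_labels = ["spam"] * 4 + ["ham"] * 4

vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(train_texts)

nb_model = MultinomialNB().fit(X_train_counts, train_labels)
lr_model = LogisticRegression(max_iter=1000).fit(X_train_counts, train_labels)

# Hypothetical sample emails; the slides do not show the ones actually used.
sample_emails = [
    "Click now to claim your free prize",
    "Please review the project report before the meeting",
]
# New text must go through the same vectorizer that was fitted for training.
sample_counts = vectorizer.transform(sample_emails)

nb_predictions = nb_model.predict(sample_counts)
lr_predictions = lr_model.predict(sample_counts)

for email, nb_label, lr_label in zip(sample_emails, nb_predictions, lr_predictions):
    print(f"{email!r}: Naive Bayes -> {nb_label}, Logistic Regression -> {lr_label}")
```

Words in a new email that never appeared in the training vocabulary are ignored by `transform`, so very short or unusual messages may carry little signal.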
SUMMARY
■ In this project, we developed machine learning models for email spam
classification using the Naive Bayes and logistic regression algorithms. On the
held-out 20% test set, both models reach high accuracy and separate spam from
ham messages effectively. These results support the use of these algorithms
for the email spam problem. The classifiers could be improved further by
incorporating more advanced techniques or exploring ensemble methods.
THANK
YOU
