Machine Learning vs Rule-Based Systems
DataTalks.Club
Machine Learning Zoomcamp
Session #1.2
DataTalks.Club — mlzoomcamp.com — @Al_Grigor
Session #1.2: Plan
● A rule-based system for spam detection
● Using ML for spam detection
● Extracting features for ML
Email system
Spam
Subject: Get 50% off now
From: promotions@online.com
Whether or not you use PayPal, you have likely received at least one PayPal
spam message. In it, a spammer impersonates PayPal and informs you that
you have to log in to your account and authorize some recent changes. If you
click on the link included below the message, you will be taken to a fake
PayPal login page set up by the spammer to steal your password and
withdraw funds from your account.
Subject: URGENT: tax review
From: tax@online.com
Your tax review is pending acceptance. Review within 24 hours:
https://taxes.we-are-legit.com
Tax office.
Rules
● If sender = promotions@online.com then “spam”
● If title contains “tax review” and sender domain is “online.com” then “spam”
● Otherwise, “good email”
Code
def detect_spam(email):
    # contains(), domain(), SPAM and GOOD are helpers assumed to be defined elsewhere
    if email.sender == 'promotions@online.com':
        return SPAM
    if (contains(email.title, ['tax', 'review']) and
            domain(email.sender, 'online.com')):
        return SPAM
    return GOOD
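The slide code relies on helpers it never defines. Below is a minimal runnable sketch of the same two rules. It assumes a hypothetical `Email` record with `sender`, `title`, and `body` fields, and simple guesses at what `contains` and `domain` do; none of these definitions come from the slides.

```python
from dataclasses import dataclass

SPAM = "spam"
GOOD = "good email"


@dataclass
class Email:
    # Hypothetical container; the slides only imply these three fields.
    sender: str
    title: str
    body: str = ""


def contains(text, words):
    """Return True if every word in `words` occurs in `text` (case-insensitive)."""
    lowered = text.lower()
    return all(word.lower() in lowered for word in words)


def domain(sender, expected):
    """Return True if the part of `sender` after the last '@' equals `expected`."""
    return sender.rsplit("@", 1)[-1].lower() == expected.lower()


def detect_spam(email):
    if email.sender == "promotions@online.com":
        return SPAM
    if contains(email.title, ["tax", "review"]) and domain(email.sender, "online.com"):
        return SPAM
    return GOOD


promo = Email("promotions@online.com", "Get 50% off now")
tax = Email("tax@online.com", "URGENT: tax review")
other = Email("pedro@gmail.com", "Totally legit email")

print(detect_spam(promo))  # spam
print(detect_spam(tax))    # spam
print(detect_spam(other))  # good email
```

Note that `contains` here does a plain substring match, which is an assumption; a real filter would likely tokenize the title first.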
More
Subject: Waiting for your reply
From: prince1@test.com
We are delighted to inform you that you won 1.000.000 (one million) US
Dollars. To claim the prize, you need to pay a small processing fee. Please
deposit $10 to our PayPal account at prince@test.com. Once we receive the
money, we will start the transfer.
Congratulations again!
Rules
● If sender = promotions@online.com then “spam”
● If title contains “tax review” and sender domain is “online.com” then “spam”
● If body contains the word “deposit” then “spam”
● Otherwise, “good email”
Code
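def detect_spam(email):
    # contains(), domain(), SPAM and GOOD are helpers assumed to be defined elsewhere
    if email.sender == 'promotions@online.com':
        return SPAM
    if (contains(email.title, ['tax', 'review']) and
            domain(email.sender, 'online.com')):
        return SPAM
    # New rule added for the "deposit" scam email
    if contains(email.body, ['deposit']):
        return SPAM
    return GOOD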
More
Subject: Totally legit email
From: pedro@gmail.com
I transferred $50 to you one year ago, and now I’m moving out.
Please refund my deposit.
Pedro.
Rules
● If sender = promotions@online.com then “spam”
● If title contains “tax review” and sender domain is “online.com” then “spam”
● If body contains the word “deposit”
○ If sender domain is “test.com” then “spam”
○ If body length >= 100 words then “spam”
● Otherwise, “good email”
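The slides give no code for this third version of the rules. One way it could look is sketched below, assuming a hypothetical `Email` record and a small `sender_domain` helper (both invented for illustration). It shows how the "deposit" rule now needs a second signal, so Pedro's legitimate email passes.

```python
from dataclasses import dataclass

SPAM = "spam"
GOOD = "good email"


@dataclass
class Email:
    # Hypothetical container; field names are assumptions.
    sender: str
    title: str
    body: str


def sender_domain(sender):
    """Return the lowercased part of the address after the last '@'."""
    return sender.rsplit("@", 1)[-1].lower()


def detect_spam(email):
    if email.sender == "promotions@online.com":
        return SPAM
    title = email.title.lower()
    if "tax" in title and "review" in title and sender_domain(email.sender) == "online.com":
        return SPAM
    if "deposit" in email.body.lower():
        # "deposit" alone is no longer enough; it needs a second signal.
        if sender_domain(email.sender) == "test.com":
            return SPAM
        if len(email.body.split()) >= 100:
            return SPAM
    return GOOD


prince = Email(
    sender="prince1@test.com",
    title="Waiting for your reply",
    body="Please deposit $10 to our PayPal account at prince@test.com.",
)
pedro = Email(
    sender="pedro@gmail.com",
    title="Totally legit email",
    body="I transferred $50 to you one year ago, and now I'm moving out. "
         "Please refund my deposit.",
)

print(detect_spam(prince))  # spam
print(detect_spam(pedro))   # good email
```

Every new exception adds another branch like this, which is the maintenance problem the following slides point at.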
Repeat
🤯
Use Machine Learning!
Machine Learning
● Get data
● Define & calculate features
● Train and use the model
Getting data
Subject: Waiting for your reply
From: prince1@test.com
We are delighted to inform you that you won 1.000.000 (one million) US
Dollars. To claim the prize, you need to pay a small processing fee. Please
deposit $10 to our PayPal account at prince@test.com. Once we receive the
money, we will start the transfer.
Congratulations again!
SPAM
Machine Learning
● Get data
● Define & calculate features
● Train and use the model
Features
● Length of title > 10? true/false
● Length of body > 10? true/false
● Sender “promotions@online.com”? true/false
● Sender “hpYOSKmL@test.com”? true/false
● Sender domain “test.com”? true/false
● Description contains “deposit”? true/false
Rules
📝
Start with rules and then use these rules as features
Subject: Waiting for your reply
From: prince1@test.com
We are delighted to inform you that you won 1.000.000 (one million) US
Dollars. To claim the prize, you need to pay a small processing fee. Please
deposit $10 to our PayPal account at prince@test.com. Once we receive the
money, we will start the transfer.
Congratulations again!
SPAM
[1, 1, 0, 0, 1, 1]
Length of title > 10? True
Length of body > 10? True
Sender “promotions@online.com”? False
Sender “hpYOSKmL@test.com”? False
Sender domain “test.com”? True
Description contains “deposit”? True
[1, 1, 0, 0, 1, 1] 1
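The feature-by-feature walk-through above can be reproduced with a short function. This is a sketch, assuming "length" means number of characters (the slides do not say, but a four-word title being longer than 10 is only consistent with characters) and using a hypothetical `extract_features` helper.

```python
def extract_features(sender, title, body):
    """Turn one email into the six binary features listed on the slides."""
    sender_domain = sender.rsplit("@", 1)[-1]
    checks = [
        len(title) > 10,                    # Length of title > 10?
        len(body) > 10,                     # Length of body > 10?
        sender == "promotions@online.com",  # Sender "promotions@online.com"?
        sender == "hpYOSKmL@test.com",      # Sender "hpYOSKmL@test.com"?
        sender_domain == "test.com",        # Sender domain "test.com"?
        "deposit" in body.lower(),          # Description contains "deposit"?
    ]
    return [int(flag) for flag in checks]


body = (
    "We are delighted to inform you that you won 1.000.000 (one million) US "
    "Dollars. To claim the prize, you need to pay a small processing fee. Please "
    "deposit $10 to our PayPal account at prince@test.com. Once we receive the "
    "money, we will start the transfer. Congratulations again!"
)
features = extract_features("prince1@test.com", "Waiting for your reply", body)
print(features)  # [1, 1, 0, 0, 1, 1]
```

Each check is one of the earlier rules turned into a yes/no question; the model, not the programmer, decides how much weight each answer gets.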
Features (data)      Target (desired output)
[1, 1, 0, 0, 1, 1]   1
[0, 0, 0, 1, 0, 1]   0
[1, 1, 1, 0, 1, 0]   1
[1, 0, 0, 0, 0, 1]   1
[0, 0, 0, 1, 1, 0]   0
[1, 0, 1, 0, 1, 1]   0
Machine Learning
● Get data
● Define & calculate features
● Train and use the model
Features (data)      Target (desired output)
[1, 1, 0, 0, 1, 1]   1
[0, 0, 0, 1, 0, 1]   0
[1, 1, 1, 0, 1, 0]   1
[1, 0, 0, 0, 0, 1]   1
[0, 0, 0, 1, 1, 0]   0
[1, 0, 1, 0, 1, 1]   0
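The slides do not say which model is trained on this table. As an illustration only, the sketch below fits a perceptron, chosen because it fits in a few lines of plain Python; `train_perceptron` and `predict` are hypothetical names, and a real project would more likely use a library model.

```python
# Feature rows and targets copied from the table above.
X = [
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 1],
]
y = [1, 0, 1, 1, 0, 0]


def train_perceptron(X, y, max_epochs=1000):
    """Learn weights and a bias with the classic perceptron update rule."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for row, target in zip(X, y):
            sign = 1 if target == 1 else -1
            score = sum(w * x for w, x in zip(weights, row)) + bias
            if sign * score <= 0:  # misclassified (or exactly on the boundary)
                weights = [w + sign * x for w, x in zip(weights, row)]
                bias += sign
                mistakes += 1
        if mistakes == 0:  # every training row is classified correctly
            break
    return weights, bias


def predict(weights, bias, row):
    """Return 1 (spam) if the weighted sum is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if score > 0 else 0


weights, bias = train_perceptron(X, y)
print([predict(weights, bias, row) for row in X])  # [1, 0, 1, 1, 0, 0]
```

These six rows happen to be linearly separable, so the perceptron reproduces every training label; with real, noisy data that would not be guaranteed, and accuracy on unseen emails is what matters.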
[0, 0, 0, 1, 0, 1]
[0, 0, 0, 1, 1, 0]
[1, 0, 1, 0, 1, 1]
[1, 1, 1, 0, 1, 0]
[1, 0, 0, 0, 0, 1]
[1, 1, 0, 0, 1, 1]
Features (data)
Predictions (output)
Apply
Final outcome (decision)
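The last step, turning predictions into a decision, can be sketched as a simple threshold. The probabilities and the 0.5 cut-off below are made up for illustration and are not from the slides; `decide` is a hypothetical helper.

```python
def decide(probability, threshold=0.5):
    """Map a predicted spam probability to a final decision."""
    return "spam" if probability >= threshold else "good email"


# Made-up model outputs for illustration only; they are not from the slides.
predictions = [0.8, 0.6, 0.1, 0.75, 0.2, 0.1]
decisions = [decide(p) for p in predictions]
print(decisions)
```

Raising the threshold makes the filter more cautious about flagging mail as spam, at the cost of letting more spam through.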
Summary
data + code => software => outcome
data + outcome => ML => model
Next
Supervised machine learning
● A bit more formal definition
● Examples: regression, classification, ranking

More Related Content

What's hot

Quora: Because you cannot Google everything
Quora: Because you cannot Google everythingQuora: Because you cannot Google everything
Quora: Because you cannot Google everythingMrinal Chandra
 
Reddit Advertisement Sales Pitch
Reddit Advertisement Sales PitchReddit Advertisement Sales Pitch
Reddit Advertisement Sales PitchJoseph Hsieh
 
Sendgrid pitch deck
Sendgrid pitch deckSendgrid pitch deck
Sendgrid pitch deckDavid Cohen
 
Log file analysis with advertools
Log file analysis with advertoolsLog file analysis with advertools
Log file analysis with advertoolsElias Dabbas
 
Google Tag Manager Can Do What
Google Tag Manager Can Do WhatGoogle Tag Manager Can Do What
Google Tag Manager Can Do Whatpatrickstox
 
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildSujit Pal
 
MySQL fundraising pitch deck ($16 million Series B round - 2003)
MySQL fundraising pitch deck ($16 million Series B round - 2003)MySQL fundraising pitch deck ($16 million Series B round - 2003)
MySQL fundraising pitch deck ($16 million Series B round - 2003)Robin Wauters
 
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deck
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deckDropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deck
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deckAA BB
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsBenjamin Le
 
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deck
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deckTheranos: $500K VC investment turned into $10B. Theranos' initial pitch deck
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deckAA BB
 
The investor presentation we used to raise 2 million dollars
The investor presentation we used to raise 2 million dollarsThe investor presentation we used to raise 2 million dollars
The investor presentation we used to raise 2 million dollarsMikael Cho
 
Building an Implicit Recommendation Engine with Spark with Sophie Watson
Building an Implicit Recommendation Engine with Spark with Sophie WatsonBuilding an Implicit Recommendation Engine with Spark with Sophie Watson
Building an Implicit Recommendation Engine with Spark with Sophie WatsonDatabricks
 
Zenpayroll Pitch Deck Template
Zenpayroll Pitch Deck TemplateZenpayroll Pitch Deck Template
Zenpayroll Pitch Deck TemplateJoseph Hsieh
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation SystemsTrieu Nguyen
 
Cracking the PM Interview
Cracking the PM InterviewCracking the PM Interview
Cracking the PM InterviewGayle McDowell
 
Digital Analytics with the Google Tag Manager (GTM)
Digital Analytics with the Google Tag Manager (GTM)Digital Analytics with the Google Tag Manager (GTM)
Digital Analytics with the Google Tag Manager (GTM)Yourposition AG
 
Flowhaven Pitch Deck
Flowhaven Pitch DeckFlowhaven Pitch Deck
Flowhaven Pitch DeckLaytonHughes
 
Mint.com Pre-Launch Pitch Deck
Mint.com Pre-Launch Pitch DeckMint.com Pre-Launch Pitch Deck
Mint.com Pre-Launch Pitch DeckHiten Shah
 
Google Product Manager Interview Cheat Sheet
Google Product Manager Interview Cheat SheetGoogle Product Manager Interview Cheat Sheet
Google Product Manager Interview Cheat SheetLewis Lin 🦊
 

What's hot (20)

Quora: Because you cannot Google everything
Quora: Because you cannot Google everythingQuora: Because you cannot Google everything
Quora: Because you cannot Google everything
 
Reddit Advertisement Sales Pitch
Reddit Advertisement Sales PitchReddit Advertisement Sales Pitch
Reddit Advertisement Sales Pitch
 
Sendgrid pitch deck
Sendgrid pitch deckSendgrid pitch deck
Sendgrid pitch deck
 
Log file analysis with advertools
Log file analysis with advertoolsLog file analysis with advertools
Log file analysis with advertools
 
Google Tag Manager Can Do What
Google Tag Manager Can Do WhatGoogle Tag Manager Can Do What
Google Tag Manager Can Do What
 
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search Guild
 
MySQL fundraising pitch deck ($16 million Series B round - 2003)
MySQL fundraising pitch deck ($16 million Series B round - 2003)MySQL fundraising pitch deck ($16 million Series B round - 2003)
MySQL fundraising pitch deck ($16 million Series B round - 2003)
 
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deck
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deckDropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deck
Dropbox: $15K VC investment turned into $16.8B. Dropbox's initial pitch deck
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deck
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deckTheranos: $500K VC investment turned into $10B. Theranos' initial pitch deck
Theranos: $500K VC investment turned into $10B. Theranos' initial pitch deck
 
The investor presentation we used to raise 2 million dollars
The investor presentation we used to raise 2 million dollarsThe investor presentation we used to raise 2 million dollars
The investor presentation we used to raise 2 million dollars
 
Building an Implicit Recommendation Engine with Spark with Sophie Watson
Building an Implicit Recommendation Engine with Spark with Sophie WatsonBuilding an Implicit Recommendation Engine with Spark with Sophie Watson
Building an Implicit Recommendation Engine with Spark with Sophie Watson
 
Zenpayroll Pitch Deck Template
Zenpayroll Pitch Deck TemplateZenpayroll Pitch Deck Template
Zenpayroll Pitch Deck Template
 
Understanding AlphaGo
Understanding AlphaGoUnderstanding AlphaGo
Understanding AlphaGo
 
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
 
Cracking the PM Interview
Cracking the PM InterviewCracking the PM Interview
Cracking the PM Interview
 
Digital Analytics with the Google Tag Manager (GTM)
Digital Analytics with the Google Tag Manager (GTM)Digital Analytics with the Google Tag Manager (GTM)
Digital Analytics with the Google Tag Manager (GTM)
 
Flowhaven Pitch Deck
Flowhaven Pitch DeckFlowhaven Pitch Deck
Flowhaven Pitch Deck
 
Mint.com Pre-Launch Pitch Deck
Mint.com Pre-Launch Pitch DeckMint.com Pre-Launch Pitch Deck
Mint.com Pre-Launch Pitch Deck
 
Google Product Manager Interview Cheat Sheet
Google Product Manager Interview Cheat SheetGoogle Product Manager Interview Cheat Sheet
Google Product Manager Interview Cheat Sheet
 

Similar to ML Zoomcamp 1.2 - ML vs Rule-Based Systems

Introduction to Machine Learning: Process and Roles
 Introduction to Machine Learning: Process and Roles Introduction to Machine Learning: Process and Roles
Introduction to Machine Learning: Process and RolesAlexey Grigorev
 
9 Learnings from 10 Years of SaaS Investing
9 Learnings from 10 Years of SaaS Investing9 Learnings from 10 Years of SaaS Investing
9 Learnings from 10 Years of SaaS InvestingChristoph Janz
 
List of paying_websites
List of paying_websitesList of paying_websites
List of paying_websitesPri Yazzo
 
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...Insight Technology, Inc.
 
Immediately Sales Deck
Immediately Sales DeckImmediately Sales Deck
Immediately Sales DeckLilly Skolnik
 
SEO Training In Ambala ! BATRA COMPUTER CENTRE
SEO Training In Ambala ! BATRA COMPUTER CENTRESEO Training In Ambala ! BATRA COMPUTER CENTRE
SEO Training In Ambala ! BATRA COMPUTER CENTREjatin batra
 
You've Got Fail
You've Got FailYou've Got Fail
You've Got FailJames Boyd
 
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email Infrastructure
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email InfrastructureLAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email Infrastructure
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email InfrastructureSendGrid
 
Google & Yahoo's Email Update: Your Must-Do Checklist
Google & Yahoo's Email Update: Your Must-Do ChecklistGoogle & Yahoo's Email Update: Your Must-Do Checklist
Google & Yahoo's Email Update: Your Must-Do ChecklistBloomerang
 
How to become hacker
How to become hackerHow to become hacker
How to become hackerRaman Sanoria
 
Killer profitsecrets
Killer profitsecretsKiller profitsecrets
Killer profitsecretsPaul Counts
 
How B2B Tech Marketers Can Radically Improve Email Performance NOW
How B2B Tech Marketers Can Radically Improve Email Performance NOWHow B2B Tech Marketers Can Radically Improve Email Performance NOW
How B2B Tech Marketers Can Radically Improve Email Performance NOWKiwi Creative
 
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GOD
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GODHypocrisy DEFINATION OF DELUSION AND ONES WITH GOD
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GODKeith Andrew Taylor
 
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...Learnings from the Field: Best Practices for Making Money with Alexa Skills (...
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...Amazon Web Services
 
The Phishing Ecosystem
The Phishing EcosystemThe Phishing Ecosystem
The Phishing Ecosystemamiable_indian
 
How to Attract New People to Your Brand
How to Attract New People to Your BrandHow to Attract New People to Your Brand
How to Attract New People to Your BrandHighRoad Solution
 

Similar to ML Zoomcamp 1.2 - ML vs Rule-Based Systems (17)

Introduction to Machine Learning: Process and Roles
 Introduction to Machine Learning: Process and Roles Introduction to Machine Learning: Process and Roles
Introduction to Machine Learning: Process and Roles
 
9 Learnings from 10 Years of SaaS Investing
9 Learnings from 10 Years of SaaS Investing9 Learnings from 10 Years of SaaS Investing
9 Learnings from 10 Years of SaaS Investing
 
List of paying_websites
List of paying_websitesList of paying_websites
List of paying_websites
 
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...
[db tech showcase Tokyo 2018] #dbts2018 #C37 『進化を続ける Amazon Redshift のパフォーマンス...
 
Immediately Sales Deck
Immediately Sales DeckImmediately Sales Deck
Immediately Sales Deck
 
Gmail hacking
Gmail hackingGmail hacking
Gmail hacking
 
SEO Training In Ambala ! BATRA COMPUTER CENTRE
SEO Training In Ambala ! BATRA COMPUTER CENTRESEO Training In Ambala ! BATRA COMPUTER CENTRE
SEO Training In Ambala ! BATRA COMPUTER CENTRE
 
You've Got Fail
You've Got FailYou've Got Fail
You've Got Fail
 
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email Infrastructure
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email InfrastructureLAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email Infrastructure
LAPHP/LAMPSig Talk: Intro to SendGrid - Building a Scalable Email Infrastructure
 
Google & Yahoo's Email Update: Your Must-Do Checklist
Google & Yahoo's Email Update: Your Must-Do ChecklistGoogle & Yahoo's Email Update: Your Must-Do Checklist
Google & Yahoo's Email Update: Your Must-Do Checklist
 
How to become hacker
How to become hackerHow to become hacker
How to become hacker
 
Killer profitsecrets
Killer profitsecretsKiller profitsecrets
Killer profitsecrets
 
How B2B Tech Marketers Can Radically Improve Email Performance NOW
How B2B Tech Marketers Can Radically Improve Email Performance NOWHow B2B Tech Marketers Can Radically Improve Email Performance NOW
How B2B Tech Marketers Can Radically Improve Email Performance NOW
 
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GOD
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GODHypocrisy DEFINATION OF DELUSION AND ONES WITH GOD
Hypocrisy DEFINATION OF DELUSION AND ONES WITH GOD
 
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...Learnings from the Field: Best Practices for Making Money with Alexa Skills (...
Learnings from the Field: Best Practices for Making Money with Alexa Skills (...
 
The Phishing Ecosystem
The Phishing EcosystemThe Phishing Ecosystem
The Phishing Ecosystem
 
How to Attract New People to Your Brand
How to Attract New People to Your BrandHow to Attract New People to Your Brand
How to Attract New People to Your Brand
 

More from Alexey Grigorev

Codementor - Data Science at OLX
Codementor - Data Science at OLX Codementor - Data Science at OLX
Codementor - Data Science at OLX Alexey Grigorev
 
Data Monitoring with whylogs
Data Monitoring with whylogsData Monitoring with whylogs
Data Monitoring with whylogsAlexey Grigorev
 
Data engineering zoomcamp introduction
Data engineering zoomcamp  introductionData engineering zoomcamp  introduction
Data engineering zoomcamp introductionAlexey Grigorev
 
AI in Fashion - Size & Fit - Nour Karessli
 AI in Fashion - Size & Fit - Nour Karessli AI in Fashion - Size & Fit - Nour Karessli
AI in Fashion - Size & Fit - Nour KaressliAlexey Grigorev
 
AI-Powered Computer Vision Applications in Media Industry - Yulia Pavlova
AI-Powered Computer Vision Applications in Media Industry - Yulia PavlovaAI-Powered Computer Vision Applications in Media Industry - Yulia Pavlova
AI-Powered Computer Vision Applications in Media Industry - Yulia PavlovaAlexey Grigorev
 
ML Zoomcamp 10 - Kubernetes
ML Zoomcamp 10 - KubernetesML Zoomcamp 10 - Kubernetes
ML Zoomcamp 10 - KubernetesAlexey Grigorev
 
Paradoxes in Data Science
Paradoxes in Data ScienceParadoxes in Data Science
Paradoxes in Data ScienceAlexey Grigorev
 
ML Zoomcamp 8 - Neural networks and deep learning
ML Zoomcamp 8 - Neural networks and deep learningML Zoomcamp 8 - Neural networks and deep learning
ML Zoomcamp 8 - Neural networks and deep learningAlexey Grigorev
 
ML Zoomcamp 6 - Decision Trees and Ensemble Learning
ML Zoomcamp 6 - Decision Trees and Ensemble LearningML Zoomcamp 6 - Decision Trees and Ensemble Learning
ML Zoomcamp 6 - Decision Trees and Ensemble LearningAlexey Grigorev
 
ML Zoomcamp 5 - Model deployment
ML Zoomcamp 5 - Model deploymentML Zoomcamp 5 - Model deployment
ML Zoomcamp 5 - Model deploymentAlexey Grigorev
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
ML Zoomcamp 4 - Evaluation Metrics for Classification
ML Zoomcamp 4 - Evaluation Metrics for ClassificationML Zoomcamp 4 - Evaluation Metrics for Classification
ML Zoomcamp 4 - Evaluation Metrics for ClassificationAlexey Grigorev
 
ML Zoomcamp 3 - Machine Learning for Classification
ML Zoomcamp 3 - Machine Learning for ClassificationML Zoomcamp 3 - Machine Learning for Classification
ML Zoomcamp 3 - Machine Learning for ClassificationAlexey Grigorev
 
ML Zoomcamp Week #2 Office Hours
ML Zoomcamp Week #2 Office HoursML Zoomcamp Week #2 Office Hours
ML Zoomcamp Week #2 Office HoursAlexey Grigorev
 
AMLD2021 - ML in online marketplaces
AMLD2021 - ML in online marketplacesAMLD2021 - ML in online marketplaces
AMLD2021 - ML in online marketplacesAlexey Grigorev
 
ML Zoomcamp 2.1 - Car Price Prediction Project
ML Zoomcamp 2.1 - Car Price Prediction ProjectML Zoomcamp 2.1 - Car Price Prediction Project
ML Zoomcamp 2.1 - Car Price Prediction ProjectAlexey Grigorev
 
ML Zoomcamp 1.5 - Model Selection Process
ML Zoomcamp 1.5 - Model Selection ProcessML Zoomcamp 1.5 - Model Selection Process
ML Zoomcamp 1.5 - Model Selection ProcessAlexey Grigorev
 

More from Alexey Grigorev (20)

MLOps week 1 intro
MLOps week 1 introMLOps week 1 intro
MLOps week 1 intro
 
Codementor - Data Science at OLX
Codementor - Data Science at OLX Codementor - Data Science at OLX
Codementor - Data Science at OLX
 
Data Monitoring with whylogs
Data Monitoring with whylogsData Monitoring with whylogs
Data Monitoring with whylogs
 
Data engineering zoomcamp introduction
Data engineering zoomcamp  introductionData engineering zoomcamp  introduction
Data engineering zoomcamp introduction
 
AI in Fashion - Size & Fit - Nour Karessli
 AI in Fashion - Size & Fit - Nour Karessli AI in Fashion - Size & Fit - Nour Karessli
AI in Fashion - Size & Fit - Nour Karessli
 
AI-Powered Computer Vision Applications in Media Industry - Yulia Pavlova
AI-Powered Computer Vision Applications in Media Industry - Yulia PavlovaAI-Powered Computer Vision Applications in Media Industry - Yulia Pavlova
AI-Powered Computer Vision Applications in Media Industry - Yulia Pavlova
 
ML Zoomcamp 10 - Kubernetes
ML Zoomcamp 10 - KubernetesML Zoomcamp 10 - Kubernetes
ML Zoomcamp 10 - Kubernetes
 
Paradoxes in Data Science
Paradoxes in Data ScienceParadoxes in Data Science
Paradoxes in Data Science
 
ML Zoomcamp 8 - Neural networks and deep learning
ML Zoomcamp 8 - Neural networks and deep learningML Zoomcamp 8 - Neural networks and deep learning
ML Zoomcamp 8 - Neural networks and deep learning
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairness
 
ML Zoomcamp 6 - Decision Trees and Ensemble Learning
ML Zoomcamp 6 - Decision Trees and Ensemble LearningML Zoomcamp 6 - Decision Trees and Ensemble Learning
ML Zoomcamp 6 - Decision Trees and Ensemble Learning
 
ML Zoomcamp 5 - Model deployment
ML Zoomcamp 5 - Model deploymentML Zoomcamp 5 - Model deployment
ML Zoomcamp 5 - Model deployment
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
ML Zoomcamp 4 - Evaluation Metrics for Classification
ML Zoomcamp 4 - Evaluation Metrics for ClassificationML Zoomcamp 4 - Evaluation Metrics for Classification
ML Zoomcamp 4 - Evaluation Metrics for Classification
 
ML Zoomcamp 3 - Machine Learning for Classification
ML Zoomcamp 3 - Machine Learning for ClassificationML Zoomcamp 3 - Machine Learning for Classification
ML Zoomcamp 3 - Machine Learning for Classification
 
ML Zoomcamp Week #2 Office Hours
ML Zoomcamp Week #2 Office HoursML Zoomcamp Week #2 Office Hours
ML Zoomcamp Week #2 Office Hours
 
AMLD2021 - ML in online marketplaces
AMLD2021 - ML in online marketplacesAMLD2021 - ML in online marketplaces
AMLD2021 - ML in online marketplaces
 
ML Zoomcamp 2 - Slides
ML Zoomcamp 2 - SlidesML Zoomcamp 2 - Slides
ML Zoomcamp 2 - Slides
 
ML Zoomcamp 2.1 - Car Price Prediction Project
ML Zoomcamp 2.1 - Car Price Prediction ProjectML Zoomcamp 2.1 - Car Price Prediction Project
ML Zoomcamp 2.1 - Car Price Prediction Project
 
ML Zoomcamp 1.5 - Model Selection Process
ML Zoomcamp 1.5 - Model Selection ProcessML Zoomcamp 1.5 - Model Selection Process
ML Zoomcamp 1.5 - Model Selection Process
 

Recently uploaded

Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Class 11 Legal Studies Ch-1 Concept of State .pdf

ML Zoomcamp 1.2 - ML vs Rule-Based Systems

  • 1. Machine Learning vs Rule-Based Systems DataTalks.Club Machine Learning Zoomcamp Session #1.2
  • 2. DataTalks.Club — mlzoomcamp.com — @Al_Grigor Session #1.2: Plan ● A rule-based system for spam detection ● Using ML for spam detection ● Extracting features for ML
  • 3. Email system
  • 4. Spam Subject: Get 50% off now From: promotions@online.com Whether or not you use PayPal, you have likely received at least one PayPal spam message. In it, a spammer impersonates PayPal and informs you that you have to log in to your account and authorize some recent changes. If you click on the link included below the message, you will be taken to a fake PayPal login page set up by the spammer to steal your password and withdraw funds from your account. Subject: URGENT: tax review From: tax@online.com Your tax review is pending acceptance. Review within 24 hours: https://taxes.we-are-legit.com Tax office.
  • 5. Rules
      ● If sender = promotions@online.com then “spam”
      ● If title contains “tax review” and sender domain is “online.com” then “spam”
      ● Otherwise, “good email”
  • 6. Code
      def detect_spam(email):
          if email.sender == 'promotions@online.com':
              return SPAM
          if contains(email.title, ['tax', 'review']) and domain(email.sender, 'online.com'):
              return SPAM
          return GOOD
  • 7. More Subject: Waiting for your reply From: prince1@test.com We are delighted to inform you that you won 1.000.000 (one million) US Dollars. To claim the prize, you need to pay a small processing fee. Please deposit $10 to our PayPal account at prince@test.com. Once we receive the money, we will start the transfer. Congratulations again!
  • 8. Rules
      ● If sender = promotions@online.com then “spam”
      ● If title contains “tax review” and sender domain is “online.com” then “spam”
      ● If body contains the word “deposit” then “spam”
      ● Otherwise, “good email”
  • 9. Code
      def detect_spam(email):
          if email.sender == 'promotions@online.com':
              return SPAM
          if contains(email.title, ['tax', 'review']) and domain(email.sender, 'online.com'):
              return SPAM
          if contains(email.body, ['deposit']):
              return SPAM
          return GOOD
  • 10. More Subject: Totally legit email From: pedro@gmail.com I transferred $50 to you one year ago, and now I’m moving out. Please refund my deposit. Pedro.
  • 11. Rules
      ● If sender = promotions@online.com then “spam”
      ● If title contains “tax review” and sender domain is “online.com” then “spam”
      ● If body contains the word “deposit”
          ○ If sender domain is “test.com” then “spam”
          ○ If body >= 100 words then “spam”
      ● Otherwise, “good email”
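The rules on slide 11 can be sketched as runnable Python. This is only an illustration: the `Email` class and the `contains` and `domain` helpers are assumptions (the slides call `contains` and `domain` but never define them), and `SPAM` / `GOOD` are plain strings chosen here.

```python
from dataclasses import dataclass

SPAM = "spam"
GOOD = "good email"


@dataclass
class Email:
    # Hypothetical container; the slides only use .sender, .title and .body
    sender: str
    title: str
    body: str


def contains(text, words):
    """Assumed helper: True if every word in `words` occurs in `text`."""
    lowered = text.lower()
    return all(word.lower() in lowered for word in words)


def domain(sender, expected):
    """Assumed helper: True if the part after the last '@' equals `expected`."""
    return sender.rsplit("@", 1)[-1].lower() == expected.lower()


def detect_spam(email):
    if email.sender == "promotions@online.com":
        return SPAM
    if contains(email.title, ["tax", "review"]) and domain(email.sender, "online.com"):
        return SPAM
    if contains(email.body, ["deposit"]):
        # Nested rules: "deposit" alone is no longer enough
        if domain(email.sender, "test.com"):
            return SPAM
        if len(email.body.split()) >= 100:
            return SPAM
    return GOOD


prince = Email(
    sender="prince1@test.com",
    title="Waiting for your reply",
    body="Please deposit $10 to our PayPal account at prince@test.com.",
)
pedro = Email(
    sender="pedro@gmail.com",
    title="Totally legit email",
    body="I transferred $50 to you one year ago, and now I'm moving out. "
         "Please refund my deposit. Pedro.",
)
print(detect_spam(prince))  # spam
print(detect_spam(pedro))   # good email
```

Every new kind of spam means another branch in `detect_spam`, which is the maintenance problem the following slides point at.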
  • 12. Repeat
  • 17. 🤯
  • 18. 🤯 Use Machine Learning!
  • 19. Machine Learning
      ● Get data
      ● Define & calculate features
      ● Train and use the model
  • 20. Getting data Subject: Waiting for your reply From: prince1@test.com We are delighted to inform you that you won 1.000.000 (one million) US Dollars. To claim the prize, you need to pay a small processing fee. Please deposit $10 to our PayPal account at prince@test.com. Once we receive the money, we will start the transfer. Congratulations again! SPAM
  • 21. Machine Learning
      ● Get data
      ● Define & calculate features
      ● Train and use the model
  • 22. Features
      ● Length of title > 10? true/false
      ● Length of body > 10? true/false
      ● Sender “promotions@online.com”? true/false
      ● Sender “hpYOSKmL@test.com”? true/false
      ● Sender domain “test.com”? true/false
      ● Description contains “deposit”? true/false
      Rules
  • 23. 📝 Start with rules and then use these rules as features
  • 24. Subject: Waiting for your reply From: prince1@test.com We are delighted to inform you that you won 1.000.000 (one million) US Dollars. To claim the prize, you need to pay a small processing fee. Please deposit $10 to our PayPal account at prince@test.com. Once we receive the money, we will start the transfer. Congratulations again! SPAM [1, 1, 0, 0, 1, 1]
  • 25. Same email (SPAM) [1, 1, 0, 0, 1, 1] Length of title > 10? True
  • 26. Same email (SPAM) [1, 1, 0, 0, 1, 1] Length of body > 10? True
  • 27. Same email (SPAM) [1, 1, 0, 0, 1, 1] Sender “promotions@online.com”? False
  • 28. Same email (SPAM) [1, 1, 0, 0, 1, 1] Sender “hpYOSKmL@test.com”? False
  • 29. Same email (SPAM) [1, 1, 0, 0, 1, 1] Sender domain “test.com”? True
  • 30. Same email (SPAM) [1, 1, 0, 0, 1, 1] Description contains “deposit”? True
  • 31. Same email (SPAM) [1, 1, 0, 0, 1, 1] 1
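The feature calculation walked through on slides 24 to 31 can be sketched as a small function. The function name and its arguments are assumptions made for illustration, and measuring length in characters is also an assumption, since the slides do not say whether "length" means characters or words.

```python
def extract_features(sender, title, body):
    """Turn one email into the six binary features listed on slide 22."""
    sender_domain = sender.rsplit("@", 1)[-1]
    return [
        int(len(title) > 10),                    # Length of title > 10? (characters, assumed)
        int(len(body) > 10),                     # Length of body > 10? (characters, assumed)
        int(sender == "promotions@online.com"),  # Sender "promotions@online.com"?
        int(sender == "hpYOSKmL@test.com"),      # Sender "hpYOSKmL@test.com"?
        int(sender_domain == "test.com"),        # Sender domain "test.com"?
        int("deposit" in body.lower()),          # Description contains "deposit"?
    ]


features = extract_features(
    sender="prince1@test.com",
    title="Waiting for your reply",
    body=(
        "We are delighted to inform you that you won 1.000.000 (one million) "
        "US Dollars. To claim the prize, you need to pay a small processing fee. "
        "Please deposit $10 to our PayPal account at prince@test.com. Once we "
        "receive the money, we will start the transfer. Congratulations again!"
    ),
)
print(features)  # [1, 1, 0, 0, 1, 1]
```

Each former rule becomes one position in the vector, so adding a rule means adding a feature rather than another branch of code.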
  • 32. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1
  • 33. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0
  • 34. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0, [1, 1, 1, 0, 1, 0] → 1
  • 35. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0, [1, 1, 1, 0, 1, 0] → 1, [1, 0, 0, 0, 0, 1] → 1
  • 36. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0, [1, 1, 1, 0, 1, 0] → 1, [1, 0, 0, 0, 0, 1] → 1, [0, 0, 0, 1, 1, 0] → 0
  • 37. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0, [1, 1, 1, 0, 1, 0] → 1, [1, 0, 0, 0, 0, 1] → 1, [0, 0, 0, 1, 1, 0] → 0, [1, 0, 1, 0, 1, 1] → 0
  • 38. Machine Learning
      ● Get data
      ● Define & calculate features
      ● Train and use the model
  • 39. Features (data) → Target (desired output): [1, 1, 0, 0, 1, 1] → 1, [0, 0, 0, 1, 0, 1] → 0, [1, 1, 1, 0, 1, 0] → 1, [1, 0, 0, 0, 0, 1] → 1, [0, 0, 0, 1, 1, 0] → 0, [1, 0, 1, 0, 1, 1] → 0
  • 40. [0, 0, 0, 1, 0, 1] [0, 0, 0, 1, 1, 0] [1, 0, 1, 0, 1, 1] [1, 1, 1, 0, 1, 0] [1, 0, 0, 0, 0, 1] [1, 1, 0, 0, 1, 1] Features (data) Predictions (output) Apply Final outcome (decision)
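The "train and use the model" step can be sketched with the six rows from slide 39. The slides do not name a model, so a perceptron is swapped in here as an assumption: it is about the simplest classifier that can be written in plain Python. The function names are hypothetical.

```python
# Feature rows and targets copied from slide 39
X = [
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 1],
]
y = [1, 0, 1, 1, 0, 0]


def predict(weights, bias, row):
    """Return 1 (spam) if the weighted sum of features is positive, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, row))
    return 1 if score > 0 else 0


def train_perceptron(X, y, max_epochs=1000):
    """Nudge the weights after every misclassified row until an epoch is error-free."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for row, target in zip(X, y):
            error = target - predict(weights, bias, row)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, row)]
                bias += error
        if mistakes == 0:
            break
    return weights, bias


# Train: data + desired output => model (here, the weights and bias)
weights, bias = train_perceptron(X, y)

# Use: features => predictions => final decision
predictions = [predict(weights, bias, row) for row in X]
decisions = ["spam" if p == 1 else "good email" for p in predictions]
print(predictions)
print(decisions)
```

On this toy data the perceptron reproduces the training targets, because the six rows happen to be linearly separable. A real spam filter would need far more labeled emails and a separate set of emails to evaluate on.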
  • 41. Summary data + code => software => outcome
  • 42. Summary data + outcome => ML => model
  • 43. Next Supervised machine learning
      ● A bit more formal definition
      ● Examples: regression, classification, ranking