Classification
Machine Learning
Supervised Learning:
• Classification: predict a discrete value (label) associated with a feature vector.
• Regression: predict a real number associated with a feature vector.
  E.g., use linear regression to fit a curve to data.
Example:
Distance Matrix:
Using Distance Matrix for Classification:
• The simplest approach is probably nearest neighbors.
• Training: remember (store) the training data.
• When predicting the label of a new example:
    • Find the nearest example in the training data.
    • Predict the label associated with that example.
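The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `nearest_neighbor_predict`, the use of Euclidean distance, and the toy 2-D training points are all assumptions made for the example.

```python
import math

def nearest_neighbor_predict(train, query):
    """Return the label of the training example closest to `query`.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance
    is an assumption here; any distance measure could be substituted.
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical toy data: two small clusters labelled "A" and "B".
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((5.0, 5.0), "B"), ((6.0, 5.5), "B")]
print(nearest_neighbor_predict(train, (5.2, 4.8)))
```

"Training" here is just keeping the list `train` in memory; all the work happens at prediction time.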
Distance Matrix:
Hand-Written Character Recognition:
K-nearest neighbors
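A minimal sketch of K-nearest neighbors, assuming Euclidean distance and a simple majority vote among the k closest training examples. The function name `knn_predict` and the toy data are hypothetical; the point is only to show how a single noisy neighbor can be outvoted when k > 1.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict by majority vote among the k training examples nearest to `query`."""
    # Sort all training examples by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    # most_common breaks ties by first-seen order, i.e. in favour of nearer neighbors.
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one mislabelled-looking "B" point sits inside the "A" cluster.
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((0.4, 0.4), "B")]
print(knn_predict(train, (0.5, 0.5), k=1))  # follows the single nearest point
print(knn_predict(train, (0.5, 0.5), k=3))  # majority of the three nearest points
```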
Advantages and Disadvantages of KNN:
Advantages:
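• Learning is fast: there is no explicit training step.
• No theory required.
• The method and its results are easy to explain.
Disadvantages:
• Memory intensive, and predictions can take a long time, because the
  training data must be stored and searched for every prediction.
• No model is built to shed light on the process that generated the data.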
Naïve Bayes Text Classification:
Why?
• Learn which news articles are of interest.
• Learn to classify web pages by category.
Basic Intuition:
• A simple ("naïve") classification method based on Bayes' rule.
• Relies on a very simple representation of documents: the bag of words.
Bag of words representation:
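As a rough illustration of the bag-of-words idea, the sketch below reduces a text to word counts and discards word order. The tokenizer (lowercasing plus a simple regular expression) and the example sentence are assumptions for this sketch, not something specified in the slides.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Map a text to {word: count}; word order and position are thrown away."""
    # Assumed tokenization: lowercase, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

bow = bag_of_words("A great movie. Good movie!")
print(bow)
```

Two texts containing the same words in a different order produce the same bag, which is exactly the simplification the classifier relies on.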
Naïve Bayes Text Classification:
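Bayes' Rule:
For a document d and class c:
$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$
Goal of Classifier:
Choose the most probable class for the document. P(d) is the same for every class, so it can be dropped:
$$c_{\mathrm{MAP}} = \arg\max_{c} P(c \mid d) = \arg\max_{c} P(d \mid c)\,P(c)$$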
Learn to Classify Text using Naïve Bayes:
Target concept Interesting?: Document → {+, -}
• Represent each document by a vector of words:
  one attribute per word position in the document.
• Learning: use training examples to estimate
  P(+), P(-), P(doc|+), P(doc|-).
Naïve Bayes conditional independence assumption:
$$P(\mathrm{doc} \mid V_j) = \prod_{i=1}^{\mathrm{length(doc)}} P(a_i = W_k \mid V_j)$$
where P(a_i = W_k | V_j) is the probability that the word in position i is W_k, given class V_j.
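Putting the estimated quantities together gives the decision rule that the worked example later applies. The block below is a sketch in the slide's notation; the symbol V_NB for the predicted class and the log form are additions for illustration (the log form is the usual way to avoid numerical underflow when many small probabilities are multiplied).

```latex
V_{NB} = \arg\max_{V_j \in \{+,-\}} \; P(V_j) \prod_{i=1}^{\mathrm{length(doc)}} P(a_i = W_k \mid V_j)

% Equivalent form, since \log is monotonic:
V_{NB} = \arg\max_{V_j \in \{+,-\}} \; \Big[ \log P(V_j) + \sum_{i=1}^{\mathrm{length(doc)}} \log P(a_i = W_k \mid V_j) \Big]
```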
An example: Movie Review
Dictionary: 10 unique words
<I, loved, the, movie, hated, a, great, good, poor, acting>
Steps:
• Convert the documents into feature sets, where the attributes are the
  possible words and the values are the number of times each word occurs
  in the given document.
Doc  I  loved  the  movie  hated  a  great  good  poor  acting  Class
1    1  1      1    1      0      0  0      0     0     0       +
2    1  0      1    1      1      0  0      0     0     0       -
3    0  0      0    2      0      1  1      1     0     0       +
4    0  0      0    0      0      0  0      0     1     1       -
5    0  0      0    1      0      1  1      1     0     1       +
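The conversion step can be sketched as follows. The slides give only the counts, so the token lists in `DOCS` are an assumption: they are hypothetical documents chosen so that their word counts agree with the table, and their word order carries no meaning. The helper name `to_count_vector` is likewise made up for this example.

```python
# Dictionary from the example, in the slide's column order.
VOCAB = ["I", "loved", "the", "movie", "hated", "a", "great", "good", "poor", "acting"]

# Assumed token lists (not given in the slides), consistent with the count table.
DOCS = [
    (["I", "loved", "the", "movie"], "+"),
    (["I", "hated", "the", "movie"], "-"),
    (["a", "great", "movie", "good", "movie"], "+"),
    (["poor", "acting"], "-"),
    (["great", "acting", "a", "good", "movie"], "+"),
]

def to_count_vector(tokens, vocab=VOCAB):
    """One attribute per dictionary word; the value is its count in the document."""
    return [tokens.count(word) for word in vocab]

for doc_id, (tokens, label) in enumerate(DOCS, start=1):
    print(doc_id, to_count_vector(tokens), label)
```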
Let us look at the probabilities for each outcome (+ or -).
Naïve Bayes…
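• Documents with positive outcomes:
P(+) = 3/5 = 0.6
Compute: P(I|+), P(loved|+), P(the|+), P(movie|+), P(a|+),
P(great|+), P(good|+), P(acting|+)
Let n be the total number of words in the (+) documents (n = 14), and n_k the
number of times word k occurs in those (+) documents.
With add-one (Laplace) smoothing and |vocabulary| = 10:
$$P(W_k \mid +) = \frac{n_k + 1}{n + |\mathrm{vocabulary}|}$$
For example, P(movie|+) = (4 + 1)/(14 + 10) = 0.2083.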
Doc  I  loved  the  movie  hated  a  great  good  poor  acting  Class
1    1  1      1    1      0      0  0      0     0     0       +
3    0  0      0    2      0      1  1      1     0     0       +
5    0  0      0    1      0      1  1      1     0     1       +
Naïve Bayes…
P(I|+)= 0.0833 P(acting|+)= 0.0833
P(loved|+)= 0.0833 P(poor|+)= 0.0417
P(the|+)= 0.0833 P(hated|+) = 0.0417
P(movie|+)= 0.2083 P(great|+)= 0.1250
P(a|+)= 0.1250 P(good|+)= 0.1250
• Now, documents with the negative class:
Doc  I  loved  the  movie  hated  a  great  good  poor  acting  Class
2    1  0      1    1      1      0  0      0     0     0       -
4    0  0      0    0      0      0  0      0     1     1       -
P(-) = 2/5 = 0.4, and the (-) documents contain n = 6 words in total.
P(I|-)= 0.1250 P(acting|-)= 0.1250
P(loved|-)= 0.0625 P(poor|-)= 0.1250
P(the|-)= 0.1250 P(hated|-) = 0.1250
P(movie|-)= 0.1250 P(great|-)= 0.0625
P(a|-)= 0.0625 P(good|-)= 0.0625
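The smoothed probabilities for both classes can be reproduced with a short script. This is a sketch: the per-class word totals in `COUNTS` are read off the count table, and the names `COUNTS` and `smoothed_likelihoods` are hypothetical helpers introduced here.

```python
VOCAB = ["I", "loved", "the", "movie", "hated", "a", "great", "good", "poor", "acting"]

# Total occurrences of each word per class, summed over the training documents.
COUNTS = {
    "+": {"I": 1, "loved": 1, "the": 1, "movie": 4, "a": 2, "great": 2, "good": 2, "acting": 1},
    "-": {"I": 1, "hated": 1, "the": 1, "movie": 1, "poor": 1, "acting": 1},
}

def smoothed_likelihoods(word_counts, vocab):
    """Add-one estimate: P(W_k | class) = (n_k + 1) / (n + |vocabulary|)."""
    n = sum(word_counts.values())  # total words in this class (14 for +, 6 for -)
    return {w: (word_counts.get(w, 0) + 1) / (n + len(vocab)) for w in vocab}

likelihoods = {label: smoothed_likelihoods(wc, VOCAB) for label, wc in COUNTS.items()}

for label in ("+", "-"):
    for word in VOCAB:
        print(f"P({word}|{label}) = {likelihoods[label][word]:.4f}")
```

Because of the added 1, words never seen in a class (such as "hated" in the + documents) still receive a small non-zero probability, so a single unseen word cannot force the whole product to zero.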
Now, let's classify a new sentence using our training samples:
Test document: I hated the poor acting
If Vj = +: P(+)*P(I|+)*P(hated|+)*P(the|+)*P(poor|+)*P(acting|+)
= 6.03 × 10^(-7)
If Vj = -: P(-)*P(I|-)*P(hated|-)*P(the|-)*P(poor|-)*P(acting|-)
= 1.22 × 10^(-5)
Since 1.22 × 10^(-5) > 6.03 × 10^(-7), the test document is classified as negative (-).
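The final comparison can be checked with a small script. It is a sketch under stated assumptions: the probabilities are entered as the exact fractions behind the rounded values in the slides, P(-) = 2/5 is taken from the two negative documents out of five, and the sum of logarithms (rather than the direct product) is a common implementation choice to avoid underflow, not something the slides prescribe.

```python
import math

# Class priors: 3 of the 5 training documents are +, 2 are -.
PRIORS = {"+": 3 / 5, "-": 2 / 5}

# Smoothed word probabilities, only for the words in the test document.
LIKELIHOODS = {
    "+": {"I": 2 / 24, "hated": 1 / 24, "the": 2 / 24, "poor": 1 / 24, "acting": 2 / 24},
    "-": {"I": 2 / 16, "hated": 2 / 16, "the": 2 / 16, "poor": 2 / 16, "acting": 2 / 16},
}

def log_score(tokens, label):
    """log P(label) + sum of log P(word | label) over the document's words."""
    return math.log(PRIORS[label]) + sum(math.log(LIKELIHOODS[label][t]) for t in tokens)

test_doc = ["I", "hated", "the", "poor", "acting"]
scores = {label: log_score(test_doc, label) for label in PRIORS}
prediction = max(scores, key=scores.get)

for label, score in scores.items():
    print(label, f"{math.exp(score):.3g}")
print("predicted class:", prediction)
```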

Lecture 10
