Text Classification and Naïve Bayes
An example of text classification
Definition of a machine learning problem
A refresher on probability
The Naive Bayes classifier
1
Google News
2
Different ways for classification
Human labor (people assign categories to every incoming
article)
Hand-crafted rules for automatic classification
- If article contains: stock, Dow, share, Nasdaq, etc. → Business
- If article contains: set, breakpoint, player, Federer, etc. → Tennis
Machine learning algorithms
3
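The hand-crafted rules above can be sketched in a few lines of Python. This is only an illustration: the keyword sets copy the two example rules from the slide, while the function name `classify_by_rules` and the "most keyword hits wins" tie-breaking are assumptions introduced here.

```python
# Keyword lists taken from the two example rules on the slide.
RULES = {
    "Business": {"stock", "dow", "share", "nasdaq"},
    "Tennis": {"set", "breakpoint", "player", "federer"},
}


def classify_by_rules(text):
    """Return the category whose keywords match the most words, or None."""
    words = set(text.lower().split())
    best_label, best_hits = None, 0
    for label, keywords in RULES.items():
        hits = len(words & keywords)  # number of rule keywords found in the text
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label


print(classify_by_rules("Nasdaq and Dow close higher as share prices rally"))
print(classify_by_rules("Federer saves a breakpoint in the second set"))
```

Rules like these are easy to read but have to be written and maintained by hand for every category, which is the motivation for learning the classifier from data instead.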
What is Machine Learning?
4
Definition: A computer program is said to learn from
experience E when its performance P at a task T
improves with experience E.
Tom Mitchell, Machine Learning, 1997
Examples:
- Learning to recognize spoken words
- Learning to drive a vehicle
- Learning to play backgammon
Components of a ML System (1)
Experience (a set of examples that pair the input with the output for a task)
- Text categorization: document + category
- Speech recognition: spoken text + written text
Experience is referred to as Training Data. When training
data is available, we talk of Supervised Learning.
Performance metrics
- Error or accuracy on the Test Data
- Test Data are not present in the Training Data
- When there are few training data, methods like ‘leave-one-out’ or ‘ten-fold cross validation’ are used to measure error.
5
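A minimal sketch of how k-fold cross-validation (and leave-one-out, which is the special case k = number of examples) might be implemented. The helper names `k_fold_splits` and `cross_validated_error`, the majority-class baseline, and the toy labels are assumptions for illustration; the folds are taken in order without shuffling to keep the example deterministic.

```python
from collections import Counter


def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_examples))
    # Spread examples as evenly as possible: the first (n % k) folds get one extra.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validated_error(examples, labels, k, train_fn, predict_fn):
    """Average error over all held-out examples; every example is tested once."""
    errors = 0
    for train_idx, test_idx in k_fold_splits(len(examples), k):
        model = train_fn([examples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        errors += sum(predict_fn(model, examples[i]) != labels[i] for i in test_idx)
    return errors / len(examples)


# Hypothetical baseline "learner": always predict the most frequent training label.
def train_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    return model


docs = ["d1", "d2", "d3", "d4"]
labels = ["Yes", "Yes", "Yes", "No"]

# Leave-one-out: k equals the number of examples.
loo_error = cross_validated_error(docs, labels, len(docs), train_majority, predict_majority)
print(loo_error)
```

Because each example is held out exactly once, no test example is ever part of the data its model was trained on.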
Components of a ML System (2)
Type of knowledge to be learned (known as the target
function, that will map between input and output)
Representation of the target function
- Decision trees
- Neural networks
- Linear functions
The learning algorithm
- C4.5 (learns decision trees)
- Gradient descent (learns a neural network)
- Linear programming (learns linear functions)
6
Task
Defining Text Classification
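7
$d \in \mathbb{X}$: the document in the multi-dimensional space $\mathbb{X}$
$\mathbb{C} = \{c_1, c_2, \ldots, c_J\}$: a set of classes (categories, or labels)
$\mathbb{D}$: the training set of labeled documents $\langle d, c \rangle$, where $\langle d, c \rangle \in \mathbb{X} \times \mathbb{C}$
Target function: $\gamma : \mathbb{X} \to \mathbb{C}$
Learning algorithm: $\Gamma(\mathbb{D}) = \gamma$
Example: $\langle d, c \rangle = \langle$“Beijing joins the World Trade Organization”, China$\rangle$
$\gamma(d) = c$, here $\gamma(d) =$ China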
Naïve Bayes Learning
8
Learning Algorithm: Naïve Bayes
Target Function: $\gamma(d) = c$, where
$c_{MAP} = \arg\max_{c \in \mathbb{C}} \hat{P}(c|d) = \arg\max_{c \in \mathbb{C}} \hat{P}(c) \prod_{1 \le k \le n_d} \hat{P}(t_k|c)$
The generative process:
$c_{MAP} = \arg\max_{c \in \mathbb{C}} P(c|d) = \arg\max_{c \in \mathbb{C}} P(c) P(d|c)$
$P(c)$: a priori probability of choosing a category
$P(d|c)$: the conditional probability of generating d, given the fixed c
$P(c|d)$: a posteriori probability that c generated d
A Refresher on Probability
9
Visualizing probability
A is a random variable that denotes an uncertain event
- Example: A = “I’ll get an A+ in the final exam”
P(A) is “the fraction of possible worlds where A is true”
10
[Figure: the event space of all possible worlds, drawn with area 1. A blue circle contains the worlds in which A is true; the worlds outside it are those in which A is false. P(A) = area of the blue circle.]
Slide: Andrew W. Moore
Axioms and Theorems of Probability
Axioms:
- 0 <= P(A) <= 1
- P(True) = 1
- P(False) = 0
- P(A or B) = P(A) + P(B) - P(A and B)
Theorems:
- P(not A) = P(~A) = 1 - P(A)
- P(A) = P(A ^ B) + P(A ^ ~B)
11
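The "fraction of possible worlds" reading makes the axioms and theorems easy to check by enumeration. The sketch below assumes a made-up universe of 20 equally likely worlds and two arbitrary events A and B; none of these choices come from the slides.

```python
from fractions import Fraction

# Illustrative assumption: 20 equally likely "possible worlds", numbered 1..20.
worlds = range(1, 21)


def prob(event):
    """P(event) = fraction of possible worlds in which the event is true."""
    return Fraction(sum(1 for w in worlds if event(w)), len(worlds))


def A(w):
    return w % 2 == 0  # event A: the world's number is even


def B(w):
    return w <= 5      # event B: the world's number is at most 5


p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(lambda w: A(w) and B(w))
p_a_or_b = prob(lambda w: A(w) or B(w))

# Axiom: P(A or B) = P(A) + P(B) - P(A and B)
assert p_a_or_b == p_a + p_b - p_a_and_b
# Theorem: P(not A) = 1 - P(A)
assert prob(lambda w: not A(w)) == 1 - p_a
# Theorem: P(A) = P(A ^ B) + P(A ^ ~B)
assert p_a == p_a_and_b + prob(lambda w: A(w) and not B(w))

print(p_a, p_b, p_a_and_b, p_a_or_b)
```

Using `Fraction` keeps the arithmetic exact, so the identities hold with `==` rather than only up to rounding.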
Conditional Probability
P(A|B) = the probability of A being true, given that we
know that B is true
12
[Figure: Venn diagram of the events F and H]
H = “I have a headache”
F = “Coming down with flu”
P(H) = 1/10
P(F) = 1/40
P(H|F) = 1/2
Headaches are rare and flu even rarer, but if you have the flu, there is a 50-50 chance you’ll have a headache.
Slide: Andrew W. Moore
Deriving the Bayes Rule
13
Conditional Probability: $P(A|B) = \frac{P(A \wedge B)}{P(B)}$
Chain rule: $P(A \wedge B) = P(A|B) P(B)$
$P(A \wedge B) = P(B \wedge A) = P(B|A) P(A)$
Bayes Rule: $P(B|A) = \frac{P(A|B) P(B)}{P(A)}$
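As a worked instance of the Bayes rule, the headache and flu numbers from the conditional-probability slide can be plugged in to get the reverse conditional P(F|H). The variable names are chosen here for illustration.

```python
from fractions import Fraction

p_h = Fraction(1, 10)         # P(H): probability of a headache
p_f = Fraction(1, 40)         # P(F): probability of coming down with flu
p_h_given_f = Fraction(1, 2)  # P(H|F): probability of a headache given flu

# Chain rule: P(H ^ F) = P(H|F) * P(F)
p_h_and_f = p_h_given_f * p_f

# Bayes rule: P(F|H) = P(H|F) * P(F) / P(H)
p_f_given_h = p_h_given_f * p_f / p_h

print(p_h_and_f)    # 1/80
print(p_f_given_h)  # 1/8
```

So even though half of all flu cases come with a headache, only one headache in eight is due to flu, because flu itself is rare.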
Back to the Naïve Bayes Classifier
14
Deriving the Naïve Bayes
15
$P(B|A) = \frac{P(A|B) P(B)}{P(A)}$ (Bayes Rule)
Given two classes $c_1, c_2$ and the document $d'$:
$P(c_1|d') = \frac{P(c_1) P(d'|c_1)}{P(d')}$
$P(c_2|d') = \frac{P(c_2) P(d'|c_2)}{P(d')}$
We are looking for a $c_i$ that maximizes the a posteriori probability $P(c_i|d')$.
$P(d')$ (the denominator) is the same in both cases.
Thus: $c_{MAP} = \arg\max_{c \in \mathbb{C}} P(c) P(d|c)$
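A small numeric check of why the denominator can be dropped. All numbers below are hypothetical, and the sketch assumes the two classes are mutually exclusive and exhaustive so that P(d') can be obtained by summing over them.

```python
# Hypothetical values for two classes and one document d' (not from the slides).
prior = {"c1": 0.7, "c2": 0.3}         # P(c)
likelihood = {"c1": 0.01, "c2": 0.05}  # P(d'|c)

# Numerators of the Bayes rule: P(c) * P(d'|c)
unnormalized = {c: prior[c] * likelihood[c] for c in prior}

# P(d'), assuming c1 and c2 are the only possible classes
evidence = sum(unnormalized.values())

# Full posteriors P(c|d')
posterior = {c: unnormalized[c] / evidence for c in prior}

c_map_unnormalized = max(unnormalized, key=unnormalized.get)
c_map_posterior = max(posterior, key=posterior.get)

print(c_map_unnormalized, c_map_posterior)
```

Dividing every score by the same positive constant never changes which class has the largest score, so the classifier only needs the numerators.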
Estimating parameters for the
target function
16
We are looking for the estimates $\hat{P}(c)$ and $\hat{P}(d|c)$.
P(c) is the fraction of possible worlds where c is true.
$\hat{P}(c) = \frac{N_c}{N}$
N – number of all documents
N_c – number of documents in class c
d is a vector in the space $\mathbb{X}$, where each dimension is a term:
$P(d|c) = P(t_1, t_2, \ldots, t_{n_d} | c)$
By using the chain rule $P(A \wedge B) = P(A|B) P(B)$, we have:
$P(t_1, t_2, \ldots, t_{n_d} | c) = P(t_1 | t_2, \ldots, t_{n_d}, c) \, P(t_2, \ldots, t_{n_d} | c) = \ldots$
Naïve assumptions of independence
1. All attribute values are independent of each other given
the class. (conditional independence assumption)
2. The conditional probabilities for a term are the same
independent of position in the document.
We assume the document is a “bag-of-words”.
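17
$P(d|c) = P(t_1, t_2, \ldots, t_{n_d} | c) = \prod_{1 \le k \le n_d} P(t_k|c)$
Finally, we get the target function of Slide 8:
$c_{MAP} = \arg\max_{c \in \mathbb{C}} \hat{P}(c|d) = \arg\max_{c \in \mathbb{C}} \hat{P}(c) \prod_{1 \le k \le n_d} \hat{P}(t_k|c)$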
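To illustrate the bag-of-words consequence of the two assumptions: once every position uses the same P(t|c), the likelihood of a document depends only on how often each term occurs, not on word order. The probabilities and function names below are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import prod

# Hypothetical conditional probabilities P(t|c) for a single class c.
p_term_given_c = {
    "chinese": Fraction(1, 2),
    "tokyo": Fraction(1, 10),
    "japan": Fraction(1, 10),
}


def likelihood_by_position(tokens, p_term):
    """P(d|c) as a product of P(t_k|c) over the positions k = 1..n_d."""
    return prod(p_term[t] for t in tokens)


def likelihood_by_counts(tokens, p_term):
    """The same quantity from the bag of words: P(t|c) raised to the term count."""
    return prod(p_term[t] ** n for t, n in Counter(tokens).items())


d1 = "chinese chinese tokyo japan".split()
d2 = "japan chinese tokyo chinese".split()  # same words, different order

print(likelihood_by_position(d1, p_term_given_c))
print(likelihood_by_position(d2, p_term_given_c))
print(likelihood_by_counts(d1, p_term_given_c))
```

All three values coincide, which is why a document can be stored as term counts without losing anything the model uses.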
Again about estimation
18
For each term, t, we need to estimate P(t|c)
$\hat{P}(t|c) = \frac{T_{ct}}{\sum_{t' \in V} T_{ct'}}$
Because an estimate will be 0 if a term does not appear with a class in the training data, we need smoothing:
Laplace Smoothing: $\hat{P}(t|c) = \frac{T_{ct} + 1}{\sum_{t' \in V} (T_{ct'} + 1)} = \frac{T_{ct} + 1}{(\sum_{t' \in V} T_{ct'}) + |V|}$
$|V|$ is the number of terms in the vocabulary
$T_{ct}$ is the count of term t in all documents of class c
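A minimal sketch of the two estimators, assuming the tokens of all documents of one class have been concatenated into a single list. The function names and the toy vocabulary are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction


def unsmoothed_estimates(class_tokens, vocabulary):
    """P(t|c) = T_ct / sum over t' of T_ct' (zero for terms unseen in the class)."""
    counts = Counter(class_tokens)
    total = sum(counts[t] for t in vocabulary)
    return {t: Fraction(counts[t], total) for t in vocabulary}


def laplace_estimates(class_tokens, vocabulary):
    """P(t|c) = (T_ct + 1) / (sum over t' of T_ct' + |V|)."""
    counts = Counter(class_tokens)
    denominator = sum(counts[t] for t in vocabulary) + len(vocabulary)
    return {t: Fraction(counts[t] + 1, denominator) for t in vocabulary}


# Toy data: the term "stock" never occurs in this class.
vocabulary = ["goal", "match", "stock"]
class_tokens = ["goal", "goal", "match"]

raw = unsmoothed_estimates(class_tokens, vocabulary)
smoothed = laplace_estimates(class_tokens, vocabulary)

print(raw["stock"], smoothed["stock"])
print(sum(smoothed.values()))
```

Without smoothing, a single unseen term would force the whole product of conditional probabilities to zero; with add-one smoothing every term keeps a small positive probability and the estimates still sum to 1.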
An Example of classification with
Naïve Bayes
19
Example 13.1 (Part 1)
20
set       docID  words in document                    c = China?
Training  1      Chinese Beijing Chinese              Yes
Training  2      Chinese Chinese Shanghai             Yes
Training  3      Chinese Macao                        Yes
Training  4      Tokyo Japan Chinese                  No
Test      5      Chinese Chinese Chinese Tokyo Japan  ?
Two classes: “China” ($c$), “not China” ($\bar{c}$)
$N = 4$, $\hat{P}(c) = 3/4$, $\hat{P}(\bar{c}) = 1/4$
$V$ = {Beijing, Chinese, Japan, Macao, Shanghai, Tokyo}
Example 13.1 (Part 2)
21
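Estimation
$\hat{P}(\text{Chinese}|c) = (5+1)/(8+6) = 3/7$
$\hat{P}(\text{Tokyo}|c) = \hat{P}(\text{Japan}|c) = (0+1)/(8+6) = 1/14$
$\hat{P}(\text{Chinese}|\bar{c}) = (1+1)/(3+6) = 2/9$
$\hat{P}(\text{Tokyo}|\bar{c}) = \hat{P}(\text{Japan}|\bar{c}) = (1+1)/(3+6) = 2/9$
Classification
$P(c|d) \propto P(c) \prod_{1 \le k \le n_d} P(t_k|c)$
$\hat{P}(c|d_5) \propto 3/4 \cdot (3/7)^3 \cdot 1/14 \cdot 1/14 \approx 0.0003$
$\hat{P}(\bar{c}|d_5) \propto 1/4 \cdot (2/9)^3 \cdot 2/9 \cdot 2/9 \approx 0.0001$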
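The whole example can be reproduced with a short script. This is a sketch of multinomial Naïve Bayes with Laplace smoothing written for this example; the function names and data layout are choices made here, and terms outside the vocabulary are simply skipped, which is an assumption the slides do not discuss.

```python
from collections import Counter
from fractions import Fraction

training_set = [
    ("Chinese Beijing Chinese", "China"),
    ("Chinese Chinese Shanghai", "China"),
    ("Chinese Macao", "China"),
    ("Tokyo Japan Chinese", "not China"),
]
test_doc = "Chinese Chinese Chinese Tokyo Japan"


def train_naive_bayes(labeled_docs):
    """Estimate P(c) and Laplace-smoothed P(t|c) from (text, class) pairs."""
    vocabulary = sorted({t for text, _ in labeled_docs for t in text.split()})
    prior, cond = {}, {}
    for c in sorted({label for _, label in labeled_docs}):
        docs_in_c = [text for text, label in labeled_docs if label == c]
        prior[c] = Fraction(len(docs_in_c), len(labeled_docs))  # N_c / N
        counts = Counter(t for text in docs_in_c for t in text.split())  # T_ct
        denominator = sum(counts.values()) + len(vocabulary)
        cond[c] = {t: Fraction(counts[t] + 1, denominator) for t in vocabulary}
    return prior, cond


def score(doc, c, prior, cond):
    """P(c) times the product of P(t_k|c); proportional to P(c|d)."""
    result = prior[c]
    for t in doc.split():
        if t in cond[c]:  # assumption: ignore terms outside the vocabulary
            result *= cond[c][t]
    return result


prior, cond = train_naive_bayes(training_set)
scores = {c: score(test_doc, c, prior, cond) for c in prior}
prediction = max(scores, key=scores.get)

for c, s in scores.items():
    print(c, s, round(float(s), 4))
print("predicted class:", prediction)
```

The exact scores are 81/268912 for "China" and 8/59049 for "not China", which round to the 0.0003 and 0.0001 shown above, so document 5 is assigned to "China": the three occurrences of Chinese outweigh the single occurrences of Tokyo and Japan.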
Summary: Miscellaneous
22
Naïve Bayes is linear in the time it takes to scan the data.
For a large training set, the vocabulary is large. It is better to select only a subset of terms, using “feature selection” (Section 13.5).
When we have many terms, the product of probabilities will cause a floating-point underflow; therefore, we sum logarithms instead:
$c_{MAP} = \arg\max_{c \in \mathbb{C}} [\log \hat{P}(c) + \sum_{1 \le k \le n_d} \log \hat{P}(t_k|c)]$
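The underflow problem and the log-space fix can be seen with a made-up long document. The document length and the probabilities below are hypothetical and chosen only to force the underflow.

```python
import math


def product_score(prior, cond_probs):
    """P(c) times the product of P(t_k|c), computed directly in floating point."""
    result = prior
    for p in cond_probs:
        result *= p
    return result


def log_score(prior, cond_probs):
    """log P(c) plus the sum of log P(t_k|c); same ranking, no underflow."""
    return math.log(prior) + sum(math.log(p) for p in cond_probs)


# Hypothetical document of 2000 terms, scored under two classes.
doc_len = 2000
probs_c1 = [1e-3] * doc_len
probs_c2 = [2e-3] * doc_len

# Both direct products underflow to 0.0, so the classes cannot be compared.
print(product_score(0.5, probs_c1), product_score(0.5, probs_c2))

# The log scores stay finite and still rank c2 above c1.
print(log_score(0.5, probs_c1), log_score(0.5, probs_c2))
```

Since the logarithm is monotonically increasing, the class with the largest log score is also the class with the largest product.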
Story boards and shot lists for my a level piece
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Text classification

  • 1. Text Classification and Naïve Bayes. Outline: an example of text classification; definition of a machine learning problem; a refresher on probability; the Naïve Bayes classifier.
  • 3. Different ways for classification: (a) human labor (people assign categories to every incoming article); (b) hand-crafted rules for automatic classification, e.g. if an article contains stock, Dow, share, Nasdaq, etc. → Business; if an article contains set, breakpoint, player, Federer, etc. → Tennis; (c) machine learning algorithms.
  • 4. What is Machine Learning? Definition: A computer program is said to learn from experience E when its performance P at a task T improves with experience E (Tom Mitchell, Machine Learning, 1997). Examples: learning to recognize spoken words; learning to drive a vehicle; learning to play backgammon.
  • 5. Components of a ML System (1). Experience: a set of examples that pair input and output for a task (text categorization: document + category; speech recognition: spoken text + written text). Experience is referred to as Training Data. When training data is available, we talk of Supervised Learning. Performance metrics: error or accuracy on the Test Data; Test Data are not present in the Training Data; when there are few training data, methods like "leave-one-out" or "ten-fold cross validation" are used to measure error.
  • 6. Components of a ML System (2). Type of knowledge to be learned: known as the target function, which maps input to output. Representation of the target function: decision trees; neural networks; linear functions. The learning algorithm: C4.5 (learns decision trees); gradient descent (learns a neural network); linear programming (learns linear functions).
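  • 7. Defining Text Classification. $d \in \mathbb{X}$: the document in the multi-dimensional space $\mathbb{X}$. $\mathbb{C} = \{c_1, c_2, \ldots, c_J\}$: a set of classes (categories, or labels). $\mathbb{D}$: the training set of labeled documents $\langle d, c \rangle \in \mathbb{X} \times \mathbb{C}$. Target function: $\gamma : \mathbb{X} \to \mathbb{C}$. Learning algorithm: $\Gamma(\mathbb{D}) = \gamma$. Example: $\langle d, c \rangle = \langle$"Beijing joins the World Trade Organization", China$\rangle$, so $\gamma(d) = c$, i.e. $\gamma(d) = \text{China}$.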
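  • 8. Naïve Bayes Learning. Learning algorithm: Naïve Bayes. Target function: $\gamma(d) = c_{MAP}$, where $c_{MAP} = \arg\max_{c \in \mathbb{C}} P(c|d) = \arg\max_{c \in \mathbb{C}} P(c)\,P(d|c)$. With estimated parameters: $c_{MAP} = \arg\max_{c \in \mathbb{C}} \hat{P}(c|d) = \arg\max_{c \in \mathbb{C}} \hat{P}(c) \prod_{1 \le k \le n_d} \hat{P}(t_k|c)$. The generative process: $P(c)$ is the a priori probability of choosing a category; $P(d|c)$ is the conditional probability of generating $d$, given the fixed $c$; $P(c|d)$ is the a posteriori probability that $c$ generated $d$.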
  • 9. A Refresher on Probability
  • 10. Visualizing probability. A is a random variable that denotes an uncertain event. Example: A = "I'll get an A+ in the final exam". P(A) is "the fraction of possible worlds where A is true". [Figure: the event space of all possible worlds, with area 1; a blue circle marks the worlds in which A is true, the rest are worlds in which A is false; P(A) = area of the blue circle.] Slide: Andrew W. Moore.
  • 11. Axioms and Theorems of Probability. Axioms: 0 <= P(A) <= 1; P(True) = 1; P(False) = 0; P(A or B) = P(A) + P(B) - P(A and B). Theorems: P(not A) = P(~A) = 1 - P(A); P(A) = P(A ^ B) + P(A ^ ~B).
  • 12. Conditional Probability. P(A|B) = the probability of A being true, given that we know that B is true. H = "I have a headache", F = "Coming down with flu". P(H) = 1/10, P(F) = 1/40, P(H|F) = 1/2. Headaches are rare and flu even rarer, but if you have the flu, there is a 50-50 chance you'll have a headache. [Figure: overlapping regions F and H.] Slide: Andrew W. Moore.
  • 13. Deriving the Bayes Rule. Conditional probability: $P(A|B) = \frac{P(A \wedge B)}{P(B)}$. Chain rule: $P(A \wedge B) = P(A|B)\,P(B)$. By symmetry: $P(A \wedge B) = P(B \wedge A) = P(B|A)\,P(A)$. Bayes Rule: $P(B|A) = \frac{P(A|B)\,P(B)}{P(A)}$.
  • 14. Back to the Naïve Bayes Classifier
  • 15. Deriving the Naïve Bayes. Bayes Rule: $P(B|A) = \frac{P(A|B)\,P(B)}{P(A)}$. Given two classes $c_1, c_2$ and the document $d'$: $P(c_1|d') = \frac{P(c_1)\,P(d'|c_1)}{P(d')}$ and $P(c_2|d') = \frac{P(c_2)\,P(d'|c_2)}{P(d')}$. We are looking for a $c_i$ that maximizes the a posteriori probability $P(c_i|d')$. $P(d')$ (the denominator) is the same in both cases. Thus: $c_{MAP} = \arg\max_{c \in \mathbb{C}} P(c)\,P(d|c)$.
  • 16. Estimating parameters for the target function. We are looking for the estimates $\hat{P}(c)$ and $\hat{P}(d|c)$. P(c) is the fraction of possible worlds where c is true: $\hat{P}(c) = \frac{N_c}{N}$, where $N$ is the number of all documents and $N_c$ is the number of documents in class $c$. $d$ is a vector in the space $\mathbb{X}$ where each dimension is a term: $P(d|c) = P(t_1, t_2, \ldots, t_{n_d} \mid c)$. By using the chain rule $P(A \wedge B) = P(A|B)\,P(B)$ we have: $P(t_1, t_2, \ldots, t_{n_d} \mid c) = P(t_1 \mid t_2, \ldots, t_{n_d}, c)\,P(t_2, \ldots, t_{n_d} \mid c) = \ldots$
  • 17. Naïve assumptions of independence. 1. All attribute values are independent of each other given the class (conditional independence assumption). 2. The conditional probabilities for a term are the same regardless of its position in the document; we assume the document is a "bag-of-words". Therefore: $P(d|c) = P(t_1, t_2, \ldots, t_{n_d} \mid c) = \prod_{1 \le k \le n_d} P(t_k|c)$. Finally, we get the target function of Slide 8: $c_{MAP} = \arg\max_{c \in \mathbb{C}} \hat{P}(c|d) = \arg\max_{c \in \mathbb{C}} \hat{P}(c) \prod_{1 \le k \le n_d} \hat{P}(t_k|c)$.
  • 18. Again about estimation. For each term $t$, we need to estimate $P(t|c)$: $\hat{P}(t|c) = \frac{T_{ct}}{\sum_{t' \in V} T_{ct'}}$, where $T_{ct}$ is the count of term $t$ in all documents of class $c$. Because an estimate will be 0 if a term does not appear with a class in the training data, we need smoothing. Laplace smoothing: $\hat{P}(t|c) = \frac{T_{ct} + 1}{\sum_{t' \in V} (T_{ct'} + 1)} = \frac{T_{ct} + 1}{\left(\sum_{t' \in V} T_{ct'}\right) + |V|}$, where $|V|$ is the number of terms in the vocabulary.
  • 19. An Example of classification with Naïve Bayes
  • 20. Example 13.1 (Part 1). Training set (docID: words, c = China?): 1: Chinese Beijing Chinese, yes; 2: Chinese Chinese Shanghai, yes; 3: Chinese Macao, yes; 4: Tokyo Japan Chinese, no. Test set: 5: Chinese Chinese Chinese Tokyo Japan, ?. Two classes: "China" ($c$) and "not China" ($\bar{c}$). $N = 4$, $\hat{P}(c) = 3/4$, $\hat{P}(\bar{c}) = 1/4$. V = {Beijing, Chinese, Japan, Macao, Shanghai, Tokyo}, so $|V| = 6$.
  • 21. Example 13.1 (Part 1), same training and test sets as on slide 20. Estimation: $\hat{P}(\text{Chinese}|c) = (5+1)/(8+6) = 3/7$; $\hat{P}(\text{Tokyo}|c) = \hat{P}(\text{Japan}|c) = (0+1)/(8+6) = 1/14$; $\hat{P}(\text{Chinese}|\bar{c}) = (1+1)/(3+6) = 2/9$; $\hat{P}(\text{Tokyo}|\bar{c}) = \hat{P}(\text{Japan}|\bar{c}) = (1+1)/(3+6) = 2/9$. Classification, using $P(c|d) \propto P(c) \prod_{1 \le k \le n_d} P(t_k|c)$: $\hat{P}(c|d_5) \propto 3/4 \cdot (3/7)^3 \cdot 1/14 \cdot 1/14 \approx 0.0003$ and $\hat{P}(\bar{c}|d_5) \propto 1/4 \cdot (2/9)^3 \cdot 2/9 \cdot 2/9 \approx 0.0001$, so document 5 is assigned to the class China.
  • 22. Summary: Miscellaneous. Naïve Bayes is linear in the time it takes to scan the data. When we have many terms, the product of probabilities will cause a floating point underflow, therefore we sum logarithms instead: $c_{MAP} = \arg\max_{c \in \mathbb{C}} \left[ \log \hat{P}(c) + \sum_{1 \le k \le n_d} \log \hat{P}(t_k|c) \right]$. For a large training set, the vocabulary is large, and it is better to select only a subset of terms. "Feature selection" (Section 13.5) is used for that.
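
The estimation and classification steps of slides 16 to 22 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the slides: the function names `train_nb` and `log_scores` are hypothetical, the class labels are plain strings, and terms that never occurred in training are simply skipped at test time (an assumption the slides do not discuss). It uses Laplace smoothing and the log-space sum from slide 22, applied to the data of Example 13.1.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Estimate P(c) and Laplace-smoothed P(t|c) from (tokens, label) pairs."""
    vocab = {t for tokens, _ in docs for t in tokens}
    docs_per_class = Counter(label for _, label in docs)
    term_counts = defaultdict(Counter)          # T_ct: count of term t in class c
    for tokens, label in docs:
        term_counts[label].update(tokens)
    prior = {c: n / len(docs) for c, n in docs_per_class.items()}
    cond = {}
    for c in docs_per_class:
        total = sum(term_counts[c].values())    # sum over t' of T_ct'
        cond[c] = {t: (term_counts[c][t] + 1) / (total + len(vocab))
                   for t in vocab}
    return prior, cond, vocab

def log_scores(tokens, prior, cond, vocab):
    """Return log P(c) + sum_k log P(t_k|c) for every class c."""
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for t in tokens:
            if t in vocab:                      # assumption: skip unseen terms
                score += math.log(cond[c][t])
        scores[c] = score
    return scores

# Training and test data of Example 13.1
training = [
    ("Chinese Beijing Chinese".split(), "China"),
    ("Chinese Chinese Shanghai".split(), "China"),
    ("Chinese Macao".split(), "China"),
    ("Tokyo Japan Chinese".split(), "not China"),
]
test_doc = "Chinese Chinese Chinese Tokyo Japan".split()

prior, cond, vocab = train_nb(training)
scores = log_scores(test_doc, prior, cond, vocab)
predicted = max(scores, key=scores.get)         # c_MAP
print(predicted, {c: math.exp(s) for c, s in scores.items()})
```

Exponentiating the log scores recovers the unnormalized posteriors of slide 21; in practice the comparison is done directly on the log scores, which is the point of working in log space.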

Editor's Notes

  1. Q: How does this definition differ from the way we describe other types of computer programs? A: For other programs we do not speak about experience, only about the task and the performance criteria. Q: If the task T is speech recognition, what would E and P be? A: E would be examples of spoken text paired with the written text, i.e., the computer has the written text and, while someone speaks, it matches the written words to the spoken words. P (performance) would be the number of words that the computer recognizes correctly.
  2. We give the target function at the beginning, but we say that we will explain later how this formula is derived (after the refresher on probability). Give the example of selecting topics for the class project, which corresponds to selecting c. Then, given c, the choice of d is conditional: P(d|c).
  3. It is clear that calculating all the parameters that result from applying the chain rule is infeasible. Therefore, we need the naïve assumptions of independence on the next slide.
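
To make note 3 concrete, the following sketch compares how many conditional probabilities would have to be estimated with and without the naïve assumptions. It is a rough count under stated assumptions, not a figure from the slides: without independence, one probability is counted per possible term sequence of length n_d per class; with the bag-of-words assumption, one per term per class. The function names are hypothetical, and the numbers plugged in are those of Example 13.1 (6 terms, a 5-token test document, 2 classes).

```python
def full_conditional_entries(vocab_size, doc_len, n_classes):
    """One P(t_1, ..., t_n | c) per possible term sequence and class."""
    return n_classes * vocab_size ** doc_len

def naive_bayes_entries(vocab_size, n_classes):
    """One P(t | c) per term and class (class priors not counted)."""
    return n_classes * vocab_size

full = full_conditional_entries(vocab_size=6, doc_len=5, n_classes=2)
naive = naive_bayes_entries(vocab_size=6, n_classes=2)
print(full, naive)
```

Even for this toy vocabulary the gap is three orders of magnitude, and it grows exponentially with document length, which is why the independence assumptions are needed.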