NLP For
Text Categorization/Classification
Domain: Natural Language Processing
Prepared By: Abhishek Oswal
Guide: Jayshree Ghorpade
Some Questions
What is NLP
Detecting -> Patterns, Features, Models
What is Classification
What is Text Classification
Why Text Classification
Problem Type
•Supervised
•You know about it
•Training (labeled) data
•Fruits Analogy
•Unsupervised
•You don't know about it
•Untrained (unlabeled) data
Supervised
•Regression and Classification
•Regression -> Predict the price in the real estate market
•Price is a continuous output
•Classification -> Whether it sells for more or less than the asked price; discrete output
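The difference between the two outputs can be sketched in a few lines. This is a minimal illustration, assuming a single made-up feature (floor area) and invented prices; the helper names `fit_line`, `predict_price` and `sells_for` are hypothetical and not part of the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Invented training data: floor area and the price the house sold for.
areas = [50, 80, 100, 120]
prices = [100, 160, 200, 240]
slope, intercept = fit_line(areas, prices)

def predict_price(area):
    """Regression: the output is a continuous number."""
    return slope * area + intercept

def sells_for(area, asked_price):
    """Classification: the output is one of two discrete labels."""
    return "more" if predict_price(area) > asked_price else "less"

print(predict_price(90))
print(sells_for(90, 150), sells_for(90, 200))
```

The same fitted line serves both tasks here; only the form of the answer changes.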
Process of Classification
•Data preprocessing
•Training and Test set
•Creation of model
•Algorithm
•Classify
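The steps above can be sketched end to end. This is only a rough outline under stated assumptions: the data set is invented, and the "model" is a deliberately simple word-overlap rule standing in for a real algorithm; `preprocess`, `split`, `train` and `classify` are hypothetical helper names.

```python
import re

def preprocess(text):
    """Data preprocessing: lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def split(data, test_fraction=0.25):
    """Training and test set: hold out the last part of the data."""
    cut = int(len(data) * (1 - test_fraction))
    return data[:cut], data[cut:]

def train(examples):
    """Creation of model: remember the words seen for each label."""
    model = {}
    for text, label in examples:
        model.setdefault(label, set()).update(preprocess(text))
    return model

def classify(model, text):
    """Algorithm (stand-in): pick the label sharing the most words."""
    tokens = set(preprocess(text))
    return max(sorted(model), key=lambda label: len(tokens & model[label]))

# Invented labeled data for illustration only.
data = [
    ("Win a free prize now", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Free cash offer, claim now", "spam"),
    ("Project report attached", "ham"),
    ("Lunch on Monday?", "ham"),
    ("Claim your free prize", "spam"),
    ("Claim your free cash prize", "spam"),
    ("Monday project meeting", "ham"),
]

train_set, test_set = split(data)
model = train(train_set)
accuracy = sum(classify(model, t) == y for t, y in test_set) / len(test_set)
print(accuracy)
```

The held-out test set is never shown to `train`, so the accuracy measures how the model behaves on unseen text.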
Methods To Represent
•Document -term Matrix
•Bag of words
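As a small illustration of both representations, the sketch below builds a document-term matrix in which each row is the bag-of-words count vector of one document. The three sentences and the helper name `build_dtm` are made up for this example, and tokenization is plain whitespace splitting.

```python
from collections import Counter

def build_dtm(documents):
    """Return (vocabulary, matrix): one row per document, one column per term."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag of words: word order is discarded
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix

docs = ["the cat sat", "the dog sat", "the cat saw the dog"]
vocabulary, matrix = build_dtm(docs)

print(vocabulary)
for row in matrix:
    print(row)
```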
Methods To Classify
•Using Probability
•Naive Bayes
•Using Graphs
•Support Vector Machine
•Tree
•Decision Tree
Naive Bayes
•Predictive Model
•Conditional Probability
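The conditional probability used here is usually written with Bayes' theorem. In the form below, the symbols c (a class) and d (a document) are notation assumed for this sketch rather than taken from the slides.

```latex
P(c \mid d) = \frac{P(d \mid c)\, P(c)}{P(d)}
```

Since P(d) is the same for every class, the classifier only needs to compare P(d | c) P(c) across classes.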
Naive Bayes
•Independent Features
•Prior
•Likelihood
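The three ideas above fit together as follows: the prior is the share of training documents in a class, the likelihood is how often a word appears in that class, and the independence assumption lets the word likelihoods be multiplied. The sketch below is one possible word-count version under stated assumptions: the training sentences are invented, add-one (Laplace) smoothing is added so unseen words do not zero out a class, and `train_nb` and `classify_nb` are hypothetical names.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Count documents per class and words per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary

def classify_nb(model, text):
    """Return the class with the highest log(prior) + sum of log(likelihood)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)  # prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Likelihood with add-one smoothing (an assumption of this sketch).
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)  # independence: log-probabilities add
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data for illustration only.
examples = [
    ("free prize now", "spam"),
    ("claim free cash", "spam"),
    ("meeting on monday", "ham"),
    ("project report attached", "ham"),
]
model = train_nb(examples)
print(classify_nb(model, "free cash prize"))
print(classify_nb(model, "monday meeting"))
```

Working with logarithms avoids multiplying many small probabilities, which would underflow on long documents.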
Another Method
Support Vector Machine
SVM
•Optimal plane
SVM
•Using margin
•Margin is no man's land
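To make the "no man's land" idea concrete, the sketch below measures the empty band around a separating plane as the distance from the plane to its closest point. It assumes the plane is described by a weight vector `w` and an offset `b`, and the four labeled points are invented; `signed_distance` and `margin` are hypothetical helper names.

```python
import math

def signed_distance(w, b, x):
    """Distance from point x to the plane w.x + b = 0; the sign gives the side."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points):
    """Distance from the plane to the closest point: no point lies nearer."""
    return min(abs(signed_distance(w, b, x)) for x, _ in points)

# Invented 2-D points with labels -1 and +1.
points = [((1.0, 1.0), -1), ((2.0, 0.0), -1), ((4.0, 4.0), 1), ((5.0, 3.0), 1)]

# Two candidate planes that both separate the classes.
plane_a = ((1.0, 1.0), -5.0)  # x1 + x2 - 5 = 0
plane_b = ((1.0, 0.0), -3.0)  # x1 - 3 = 0

print(margin(*plane_a, points))
print(margin(*plane_b, points))
```

Both candidates classify every point correctly, but the first leaves a wider empty band around itself.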
SVM
•Optimal plane would be the one with the biggest margin
•Equation of hyperplane
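In the standard hard-margin formulation, which is assumed here since the slide's own equation is not preserved, the hyperplane, the separation constraint and the resulting margin width are written as below. The symbols w (weight vector), b (offset), x_i (training point) and y_i (label, +1 or -1) are notation chosen for this sketch.

```latex
\mathbf{w} \cdot \mathbf{x} + b = 0, \qquad
y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1, \qquad
\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}
```

Maximizing the margin is therefore equivalent to minimizing the length of w while keeping every training point on its correct side.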
Applications
•Email Classification
•Spam Filtering
•News Organization
•Classification of documents based on language
•Opinion Mining
•Eg.
•Sakaal Classifieds
•Gmail Spam Mail Detection
Example
Comparison
Naive Bayes
•Easy
•Fast
•Multi-class output
SVM
•Difficult
•Slow (training time)
•Binary output
Questions
•My Questions First
•Classify my presentation
•Class -> Good/Bad
Fei Yu, Jiyao An, Hong Li, Mialiang Zhu and Ouyang Yang, "Intelligence Text Categorization Based on Bayes Algorithm", Proceedings of International Conference on Information Acquisition.
