Venkat Java Projects
Mobile: +91 9966499110
Visit: www.venkatjavaprojects.com  Email: venkatjavaprojects@gmail.com
NLP Text Classification
In this paper the author implements the Naïve Bayes algorithm for text classification. For feature extraction we import CountVectorizer and TfidfTransformer; from the linear model module we use the SGDClassifier, and for model selection we import GridSearchCV.
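A minimal sketch of how these imported pieces fit together in scikit-learn. The tiny corpus and labels below are invented for illustration and are not the project's actual dataset:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Toy corpus (illustrative assumption, not the project's dataset).
texts = [
    "the movie was great and fun",
    "a wonderful and enjoyable film",
    "the weather today is rainy and cold",
    "storm clouds bring heavy rain",
]
labels = ["cinema", "cinema", "weather", "weather"]

# CountVectorizer turns text into token counts, TfidfTransformer
# reweights those counts, and SGDClassifier fits a linear model.
pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", SGDClassifier(random_state=42)),
])
pipeline.fit(texts, labels)
preds = pipeline.predict(["rain and cold weather", "a fun film"])
print(preds)
```

Chaining the steps in a Pipeline also makes it easy to hand the whole model to GridSearchCV later.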
Naïve Bayes Classifier Algorithm
It would be difficult and practically impossible to classify a web page, a document, an email or any other lengthy text manually. This is where the Naïve Bayes Classifier machine learning algorithm comes to the rescue. A classifier is a function that assigns each element of a population to one of the available categories. For instance, spam filtering is a popular application of the Naïve Bayes algorithm. The spam filter here is a classifier that assigns a label "Spam" or "Not Spam" to each email.
Naïve Bayes Classifier is amongst the most popular learning methods grouped by similarities that work on the popular Bayes Theorem of Probability. It is used to build machine learning models, particularly for disease prediction and document classification, and performs a simple classification of words based on the Bayes Probability Theorem for subjective analysis of content.
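The spam-filtering idea can be sketched with scikit-learn's MultinomialNB on a handful of hypothetical emails (the messages and labels below are invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy emails (invented for illustration).
emails = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for tomorrow",
    "please review the project report",
]
labels = ["Spam", "Spam", "Not Spam", "Not Spam"]

vect = CountVectorizer()
X = vect.fit_transform(emails)        # word-count features
clf = MultinomialNB().fit(X, labels)  # class-conditional word probabilities via Bayes' theorem

new = vect.transform(["free prize money", "project meeting tomorrow"])
preds = clf.predict(new)
print(preds)
```

Words like "free" and "prize" only ever appear in the spam examples, so the Bayes-rule posterior strongly favours the "Spam" label for the first new message.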
Stochastic Gradient Descent (SGD):
It is a simple yet very efficient approach to discriminative learning of linear classifiers under convex loss functions such as (linear) Support Vector Machines and Logistic Regression. Even though SGD has been around in the machine learning community for a long time, it has received a considerable amount of attention just recently in the context of large-scale learning.
SGD has been successfully applied to large-scale and sparse machine learning problems often encountered in text classification and natural language processing. Given that the data is sparse, the classifiers in this module easily scale to problems with more than 10^5 training examples and more than 10^5 features.
The advantages of Stochastic Gradient Descent are:
- Efficiency.
- Ease of implementation (lots of opportunities for code tuning).
The disadvantages of Stochastic Gradient Descent include:
- SGD requires a number of hyperparameters such as the regularization parameter and the number of iterations.
- SGD is sensitive to feature scaling.
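Because of that sensitivity, inputs are usually standardized before fitting an SGD model. A minimal sketch on toy numeric data (the two clusters below are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Two classes whose features live on very different scales (toy data).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal([0, 0], [1, 1000], size=(50, 2)),
    rng.normal([5, 5000], [1, 1000], size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

# StandardScaler brings both features to zero mean and unit variance,
# which keeps SGD's gradient steps well-behaved.
model = make_pipeline(StandardScaler(), SGDClassifier(max_iter=1000, random_state=0))
model.fit(X, y)
acc = model.score(X, y)
print(acc)
```

Without the scaler, the second feature (thousands of units) would dominate the gradient updates.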
Logistic Regression:
The name of this algorithm could be a little confusing, in the sense that the Logistic Regression machine learning algorithm is for classification tasks and not regression problems. The name 'Regression' here implies that a linear model is fit into the feature space. This algorithm applies a logistic function to a linear combination of features to predict the outcome of a categorical dependent variable based on predictor variables.
The odds or probabilities that describe the outcome of a single trial are modeled as a function of explanatory variables. Logistic regression algorithms help estimate the probability of falling into a specific level of the categorical dependent variable based on the given predictor variables.
Just suppose that you want to predict if there will be a snowfall tomorrow in New York. Here the outcome of the prediction is not a continuous number, because there will either be snowfall or no snowfall, and hence linear regression cannot be applied. Here the outcome variable is one of several categories, and using logistic regression helps.
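The snowfall example can be sketched with scikit-learn's LogisticRegression. The temperature/humidity features and the observations below are invented for illustration:

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical past observations: [temperature in °C, humidity in %]
# and whether snowfall occurred (1) or not (0).
X = [[-5, 80], [-3, 85], [-8, 90], [10, 40], [15, 30], [8, 50]]
y = [1, 1, 1, 0, 0, 0]

clf = LogisticRegression().fit(X, y)

# The model predicts a category (snow / no snow), not a continuous number.
preds = clf.predict([[-4, 88], [12, 35]])
print(preds)
```

`predict_proba` would additionally return the estimated probability of each outcome, which is the quantity the logistic function actually models.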
Based on the nature of the categorical response, logistic regression is classified into 3 types:
Binary Logistic Regression - The most commonly used logistic regression, when the categorical response has 2 possible outcomes, i.e. either yes or no. Example - predicting whether a student will pass or fail an exam, predicting whether a patient will have low or high blood pressure, predicting whether a tumor is cancerous or not.
Multinomial Logistic Regression - The categorical response has 3 or more possible outcomes with no ordering. Example - predicting what kind of search engine (Yahoo, Bing, Google, or MSN) is used by the majority of US citizens.
Ordinal Logistic Regression - The categorical response has 3 or more possible outcomes with natural ordering. Example - how a customer rates the service and quality of food at a restaurant based on a scale of 1 to 10.
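The multinomial case can be sketched with scikit-learn's LogisticRegression, which handles 3 or more unordered classes directly. The toy clusters and search-engine labels below are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One well-separated cluster of points per class (toy data).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal([0, 0], 0.5, size=(20, 2)),
    rng.normal([5, 0], 0.5, size=(20, 2)),
    rng.normal([0, 5], 0.5, size=(20, 2)),
])
y = np.array(["Google"] * 20 + ["Bing"] * 20 + ["Yahoo"] * 20)

# With 3+ classes, LogisticRegression fits a multiclass model.
clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict([[5, 0], [0, 5]])
print(preds)
```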
Let us consider a simple example where a cake manufacturer wants to find out if baking a cake at 160°C, 180°C and 200°C will produce a 'hard' or 'soft' variety of cake (assuming the fact that the bakery sells both varieties of cake with different names and prices).
Logistic regression is a perfect fit in this scenario instead of other statistical techniques. For example, suppose the manufacturer produces 2 cake batches, wherein the first batch contains 20 cakes (of which 7 were hard and 13 were soft) and the second batch consists of 80 cakes (of which 41 were hard and 39 were soft). In this case, if a linear regression algorithm is used, it will give equal importance to both batches of cakes regardless of the number of cakes in each batch. Applying a logistic regression algorithm will consider this factor and give the second batch of cakes more weightage than the first batch.
Support Vector Machine:
Machine learning involves predicting and classifying data, and to do so we employ various machine learning algorithms according to the dataset. SVM, or Support Vector Machine, is a linear model for classification and regression problems. It can solve linear and non-linear problems and works well for many practical problems. The idea of SVM is simple: the algorithm creates a line or a hyperplane which separates the data into classes. In machine learning, the radial basis function kernel, or RBF kernel, is a popular kernel function used in various kernelized learning algorithms. In particular, it is commonly used in support vector machine classification. As a simple example, for a classification task with only two features, you can think of a hyperplane as a line that linearly separates and classifies a set of data.
Intuitively, the further from the hyperplane our data points lie, the more confident we are that they have been correctly classified. We therefore want our data points to be as far away from the hyperplane as possible, while still being on the correct side of it. So when new testing data is added, whichever side of the hyperplane it lands on will decide the class that we assign to it.
How do we find the right hyperplane? Or, in other words, how do we best segregate the two classes within the data?
The distance between the hyperplane and the nearest data point from either set is known as the margin. The goal is to choose a hyperplane with the greatest possible margin between the hyperplane and any point within the training set, giving a greater chance of new data being classified correctly. Both algorithms generate a model from the training dataset, and new data is applied to the trained model to predict its class. The SVM algorithm gives better prediction accuracy compared to the ANN algorithm.
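The RBF-kernel idea above can be sketched on a problem no straight line can separate; the `make_circles` toy dataset is used purely for illustration:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# One class sits inside a ring of the other: linearly inseparable.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel implicitly maps points to a higher-dimensional space
# where a separating hyperplane exists.
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
acc = clf.score(X, y)
print(acc)
```

A linear kernel would fail here, since no single line can place the inner circle on one side and the outer ring on the other.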
Python Packages and Libraries used: NumPy, pandas, tkinter, NLTK (for NLP)
Package          Installed    Latest
PyVISA           1.10.1       1.10.1
PyVISA-py        0.3.1        0.3.1
cycler           0.10.0       0.10.0
imutils          0.5.3        0.5.3
joblib           0.14.1       0.14.1
kiwisolver       1.1.0        1.1.0
matplotlib       3.1.2        3.1.2
nltk             3.4.5        3.4.5
numpy            1.18.1       1.18.1
opencv-python    4.1.2.30     4.1.2.30
pandas           0.25.3       0.25.3
pip              19.0.3       20.0.1
pylab            0.0.2        0.0.2
pyparsing        2.4.6        2.4.6
python-dateutil  2.8.1        2.8.1
pytz             2019.3       2019.3
pyusb            1.0.2        1.0.2
scikit-learn     0.22.1       0.22.1
scipy            1.4.1        1.4.1
seaborn          0.9.0        0.9.0
setuptools       40.8.0       45.1.0
six              1.14.0       1.14.0
sklearn          0.0          0.0
style            1.1.6        1.1.6
styled           0.2.0.post1  0.2.0.post1
Screenshots
When we run the code, it displays the below window.
Now click on 'Download and categories' to display the dataset.
Now click on 'preprocess dataset' to preprocess the data.
Now click on 'count vectorizer' to fit the CountVectorizer.
Now click on 'TF_IDF' to fit the TF-IDF transformation.
Now click on 'MultinominalNB' to fit the MultinomialNB classifier.
Now click on 'SGD Classifier' to fit the SGD classifier.
Now click on 'Tuned Naïve_Bayes' to run hyperparameter tuning for Naïve Bayes.
Now click on 'Tuned SGD' to run hyperparameter tuning for SGD.
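A minimal sketch of how such tuning could be done with GridSearchCV; the toy corpus and the parameter grid below are illustrative assumptions, not the project's exact settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy corpus (illustrative assumption, not the project's dataset).
texts = [
    "great fun movie", "wonderful film", "enjoyable cinema classic",
    "rainy cold weather", "heavy rain storm", "cold wind and snow",
]
labels = ["cinema"] * 3 + ["weather"] * 3

pipe = Pipeline([("tfidf", TfidfVectorizer()), ("nb", MultinomialNB())])

# Grid search cross-validates each candidate smoothing value alpha
# and keeps the best-scoring one.
grid = GridSearchCV(pipe, {"nb__alpha": [0.1, 0.5, 1.0]}, cv=3)
grid.fit(texts, labels)
print(grid.best_params_)
```

The same pattern applies to the SGD classifier, with a grid over parameters such as `alpha` or `loss` instead.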