TEXT ANALYTICS
USING NLTK
By,
Vaishnavi A
III CSE B
What is Natural Language Processing?
NLP is a branch of computer science and
artificial intelligence that deals with
human languages.
Think about how much text you see each day:
• Email
• SMS
• Web Pages
• Newspaper
• and so much more…
The list is endless.
What is Text Mining?
Text Mining / Text Analytics is the process of deriving
meaningful information from natural language text.
Applications of NLP
NLTK
What is NLTK?
Data Cleaning
•TOKENIZATION
Tokenization, splitting text into smaller units such as words or sentences, is the first step in NLP
Tokenization
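A minimal sketch of word tokenization, assuming NLTK is installed. It uses `wordpunct_tokenize`, a regex-based tokenizer that needs no extra data downloads; the commonly used `word_tokenize` serves the same purpose but requires NLTK's tokenizer models to be downloaded first. The sample sentence is only illustrative.

```python
from nltk.tokenize import wordpunct_tokenize

# Illustrative sentence (an assumption for this sketch)
text = "There is a book on the table."

# Split the text into word tokens and punctuation tokens
tokens = wordpunct_tokenize(text)
print(tokens)
```

Each word and each punctuation mark becomes a separate token, which is the form the later cleaning steps expect.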
Removal of StopWords
•Stopwords might not add much value to the meaning of
the statement
•Tokenize the text before removing stopwords.
Eg: “There is a book on the table”
The words “is”, “a”, “on” and “the” → Stopwords
Words like “there”, “book” and “table” → Keywords
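The example above can be sketched in code as follows. To keep the sketch self-contained, the stopword set is hand-picked from the slide's example (an assumption); in practice NLTK ships its own English list via `nltk.corpus.stopwords`, which has to be downloaded before use.

```python
from nltk.tokenize import wordpunct_tokenize

# Hand-picked stopwords taken from the example sentence (an assumption;
# NLTK's own list is available through nltk.corpus.stopwords after download)
STOPWORDS = {"is", "a", "on", "the"}

sentence = "There is a book on the table"

# Tokenize first, then drop every token that is a stopword (case-insensitive)
tokens = wordpunct_tokenize(sentence)
keywords = [tok for tok in tokens if tok.lower() not in STOPWORDS]
print(keywords)
```

NLTK's built-in English list is much larger than this four-word set, so it may filter out more words than the example shows.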
STEMMING
Normalizes words to their base or root form
Affects, Affections, Affected, Affection, Affecting → Affect
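A short sketch of stemming on the words above, assuming NLTK is installed. It uses the Porter stemmer, one of several stemmers NLTK provides; the choice of stemmer here is an assumption, not something the slides specify.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# Inflected forms from the slide
words = ["Affects", "Affections", "Affected", "Affection", "Affecting"]

# Reduce each word to its stem (the stemmer lowercases its input by default)
stems = [stemmer.stem(word) for word in words]

for word, stem in zip(words, stems):
    print(f"{word} -> {stem}")
```

A stemmer works by stripping suffixes with fixed rules, so for other inputs the result is not guaranteed to be a dictionary word.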
LEMMATIZATION
•Groups together the different inflected forms of a word under one base form, called the lemma
•Somewhat similar to stemming, as it maps several words to one common root
•The output of lemmatization is a proper word
•For example, a lemmatizer should map gone, going and went to go
Text Processing
Vectorization
Text Classification
POS : Tags and Description
NAMED ENTITY RECOGNITION
Text Analytics using NLTK