SlideShare a Scribd company logo
NLTK
Mohammed Shokr
16-Mar-16
Natural Language Toolkit (NLTK)
■ A collection of Python programs, modules, data set and tutorial to
support research and development in Natural Language Processing
(NLP)
■ Written by Steven Bird, Edvard Loper and Ewan Klien
■ NLTK is
– Free and Open source
– Easy to use
– Modular
– Well documented
– Simple and extensible
Installation of NLTK
■ NLTK requires Python versions 2.7 or 3.2+
Python REPL
• Read Eval Print Loop
• REPL: This is a procedure that just
loops, accepts one command at a
time, executing it, and printing the
result.
• GUI OR CLI
Installation of NLTK
1. Start a Command Prompt as an Administrator ( Windows User )
1. Click Start.
2. In the Start Search box, type cmd, and then press CTRL+SHIFT+ENTER.
3. If the User Account Control dialog box appears, confirm that the action it
displays is what you want, and then click Continue.
2. changing from user to Superuser ( linux user )
– sudo su
Installation of NLTK
Install NLTK: run
pip install nltk
Test installation
run
python
then type
import nltk
Installing NLTK Data
■ NLTK comes with many corpora, toy grammars, trained models, etc. A complete
list is posted at: http://nltk.org/nltk_data/
■ Run the Python REPL and type the commands:
■ A new window should open, showing the NLTK Downloader. Click on the File menu and select Change Download Directory. For central
installation, set this to C:nltk_data (Windows), /usr/local/share/nltk_data (Mac), or /usr/share/nltk_data (Unix). Next, select the
packages or collections you want to download.
>>> import nltk
>>> nltk.download()
NLTK Downloader
Getting Started with NLTK
NLTK Text
Text Pre-processing
Sentence splitter
from nltk.tokenize import sent_tokenize
input_sting = ‘Hello Every One. How Are you ? life is not
easy.’
all_sent = sent_tokenize(input_sting)
print (all_sent)
Tokenization
sent = "Hi Everyone ! How do you do ?"
# Split() built-in string function
print (sent.split())
# word_tokenize
from nltk.tokenize import word_tokenize
print (word_tokenize(sent))
Stemming
Lemmatization
Morphology
Edit-Distance
We can create a very basic spellchecker
by just using a dictionary lookup.
Part of Speech Tagging
Part of Speech
Tagging
Penn Bank Part-of-Speech Tags
Part of Speech Tagging
■ Stanford tagger
■ N-gram tagger
■ Regex tagger
■ Brill tagger
■ Machine learning based tagger
■ NER tagger
– Named Entity Recognition (NER)
– NLTK provides the ne_chunk() method
NER tagger
 Tokens
 Tagged
 Entities
Parsing Structure in Text
Shallow VS deep parsing
■ In deep or full parsing, typically, grammar concepts such as CFG, and probabilistic
context-free grammar (PCFG), and a search strategy is used to give a complete
syntactic structure to a sentence.
■ Shallow parsing is the task of parsing a limited part of the syntactic information
from the given text.
The two approaches in parsing
The rule-based approach The probabilistic approach
This approach is based on rules/grammar In this approach, you learn rules/grammar
by using probabilistic models
Manual grammatical rules are coded down
in CFG, and so on, in this approach
This uses observed probabilities of linguistic
features
This has a top-down approach This has a bottom-up approach
This approach includes CFG and Regexbased
parser
This approach includes PCFG and the
Stanford parser
context-free grammar (CFG)
■ Generating sentences from context-free grammars:
Different types of parsers
■ Recursive descent parser
■ Shift-reduce parser
■ Chart parser
■ Regex parser
■ Dependency parsing
Different types of parsers
Recursive descent parser One of the most straightforward forms of parsing is recursive
descent parsing. This is a top-down process in which the parser
attempts to verify that the syntax of the input stream is correct,
as it is read from left to right.
Shift-reduce parser The shift-reduce parser is a simple kind of bottom-up parser.
Chart parser We will apply the algorithm design technique of dynamic
programming to the parsing problem.
Regex parser A regex parser uses a regular expression defined in the form of
grammar on top of a POS-tagged string.
Dependency parsing Dependency parsing (DP) is a modern parsing mechanism. The
main concept of DP is that each linguistic unit (words) is
connected with each other by a directed link.
Chunking
■ Chunking is shallow parsing where instead of reaching out to the deep structure of
the sentence, we try to club some chunks of the sentences that constitute some
meaning.
■ For example, the sentence "the President speaks about the health care reforms"
So, let's write some code snippets to do some basic
chunking:
Display a parse tree
# import treebank corpus
from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()
Relation Extraction
Output
Resources
■ NLTK 3.0 documentation
– http://www.nltk.org/
■ NLTK Essentials
– https://www.packtpub.com/big-data-and-business-intelligence/nltk-essentials
■ nltk_tutorial_repo [Code]
– https://git.io/vaRIR
Thanks.
Questions? Send me an email! (I love talking!)
Mohammed Shokr
mohammedshokr2014@gmail.com
@MShokr1

More Related Content

What's hot

Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Text generation and_advanced_topics
Text generation and_advanced_topicsText generation and_advanced_topics
Text generation and_advanced_topics
ankit_ppt
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
Prakash Pimpale
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
Rajnish Raj
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
Alia Hamwi
 
Nltk
NltkNltk
Nltk
Anirudh
 
NLP
NLPNLP
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
VenkateshMurugadas
 
NLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in PythonNLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in Python
shanbady
 
Introduction to NLTK
Introduction to NLTKIntroduction to NLTK
Introduction to NLTK
Sreejith Sasidharan
 
word level analysis
word level analysis word level analysis
word level analysis
tjs1
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Rishikese MR
 
Natural language processing (NLP)
Natural language processing (NLP) Natural language processing (NLP)
Natural language processing (NLP)
ASWINKP11
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Saurav Aryal
 
NLTK in 20 minutes
NLTK in 20 minutesNLTK in 20 minutes
NLTK in 20 minutes
Jacob Perkins
 
Language Model (N-Gram).pptx
Language Model (N-Gram).pptxLanguage Model (N-Gram).pptx
Language Model (N-Gram).pptx
HeneWijaya
 
Improving Neural Abstractive Text Summarization with Prior Knowledge
Improving Neural Abstractive Text Summarization with Prior KnowledgeImproving Neural Abstractive Text Summarization with Prior Knowledge
Improving Neural Abstractive Text Summarization with Prior Knowledge
Gaetano Rossiello, PhD
 
NLP
NLPNLP
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Bhavya Chawla
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
theyaseen51
 

What's hot (20)

Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Text generation and_advanced_topics
Text generation and_advanced_topicsText generation and_advanced_topics
Text generation and_advanced_topics
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
 
Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)Introduction to natural language processing (NLP)
Introduction to natural language processing (NLP)
 
Nltk
NltkNltk
Nltk
 
NLP
NLPNLP
NLP
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
NLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in PythonNLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in Python
 
Introduction to NLTK
Introduction to NLTKIntroduction to NLTK
Introduction to NLTK
 
word level analysis
word level analysis word level analysis
word level analysis
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language processing (NLP)
Natural language processing (NLP) Natural language processing (NLP)
Natural language processing (NLP)
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
NLTK in 20 minutes
NLTK in 20 minutesNLTK in 20 minutes
NLTK in 20 minutes
 
Language Model (N-Gram).pptx
Language Model (N-Gram).pptxLanguage Model (N-Gram).pptx
Language Model (N-Gram).pptx
 
Improving Neural Abstractive Text Summarization with Prior Knowledge
Improving Neural Abstractive Text Summarization with Prior KnowledgeImproving Neural Abstractive Text Summarization with Prior Knowledge
Improving Neural Abstractive Text Summarization with Prior Knowledge
 
NLP
NLPNLP
NLP
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
 

Viewers also liked

Python NLTK
Python NLTKPython NLTK
Python NLTK
Alberts Pumpurs
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Mariana Soffer
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
Vijay Ganti
 
Statistical Learning and Text Classification with NLTK and scikit-learn
Statistical Learning and Text Classification with NLTK and scikit-learnStatistical Learning and Text Classification with NLTK and scikit-learn
Statistical Learning and Text Classification with NLTK and scikit-learn
Olivier Grisel
 
Natural language processing (Python)
Natural language processing (Python)Natural language processing (Python)
Natural language processing (Python)
Sumit Raj
 
Text classification in scikit-learn
Text classification in scikit-learnText classification in scikit-learn
Text classification in scikit-learn
Jimmy Lai
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Jimmy Lai
 
Sentiment analysis-by-nltk
Sentiment analysis-by-nltkSentiment analysis-by-nltk
Sentiment analysis-by-nltk
Wei-Ting Kuo
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Hansi Thenuwara
 
Spam Filtering
Spam FilteringSpam Filtering
Spam Filtering
Umar Alharaky
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
rohitnayak
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
CJ Jenkins
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
Pranav Gupta
 
Neuro Linguistic Programming
Neuro Linguistic ProgrammingNeuro Linguistic Programming
Neuro Linguistic Programming
smjk
 
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTKStatistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Olivier Grisel
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
Jaganadh Gopinadhan
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Rahul Jain
 

Viewers also liked (17)

Python NLTK
Python NLTKPython NLTK
Python NLTK
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
 
Statistical Learning and Text Classification with NLTK and scikit-learn
Statistical Learning and Text Classification with NLTK and scikit-learnStatistical Learning and Text Classification with NLTK and scikit-learn
Statistical Learning and Text Classification with NLTK and scikit-learn
 
Natural language processing (Python)
Natural language processing (Python)Natural language processing (Python)
Natural language processing (Python)
 
Text classification in scikit-learn
Text classification in scikit-learnText classification in scikit-learn
Text classification in scikit-learn
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
 
Sentiment analysis-by-nltk
Sentiment analysis-by-nltkSentiment analysis-by-nltk
Sentiment analysis-by-nltk
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Spam Filtering
Spam FilteringSpam Filtering
Spam Filtering
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Neuro Linguistic Programming
Neuro Linguistic ProgrammingNeuro Linguistic Programming
Neuro Linguistic Programming
 
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTKStatistical Machine Learning for Text Classification with scikit-learn and NLTK
Statistical Machine Learning for Text Classification with scikit-learn and NLTK
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Similar to NLTK

NLP CHEAT SHEET.pdf
NLP CHEAT SHEET.pdfNLP CHEAT SHEET.pdf
NLP CHEAT SHEET.pdf
ssuserc8990f1
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Databricks
 
Techniques & applications of Compiler
Techniques & applications of CompilerTechniques & applications of Compiler
Techniques & applications of Compiler
Preethi AKNR
 
Nltk:a tool for_nlp - py_con-dhaka-2014
Nltk:a tool for_nlp - py_con-dhaka-2014Nltk:a tool for_nlp - py_con-dhaka-2014
Nltk:a tool for_nlp - py_con-dhaka-2014
Fasihul Kabir
 
NLP with TensorFlow.pdf
NLP with TensorFlow.pdfNLP with TensorFlow.pdf
NLP with TensorFlow.pdf
AkankshaPathak42
 
python presentation
python presentationpython presentation
python presentation
VaibhavMawal
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
Radhika Talaviya
 
SDD Translation
SDD  TranslationSDD  Translation
SDD Translation
gavhays
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
Nico Ludwig
 
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this pptAI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
pavankalyanadroittec
 
Introduction to Python for Bioinformatics
Introduction to Python for BioinformaticsIntroduction to Python for Bioinformatics
Introduction to Python for Bioinformatics
José Héctor Gálvez
 
lec00-Introduction.pdf
lec00-Introduction.pdflec00-Introduction.pdf
lec00-Introduction.pdf
wigewej294
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
CloudxLab
 
An introduction to erlang
An introduction to erlangAn introduction to erlang
An introduction to erlang
Mirko Bonadei
 
compiler introduction vtu syllabus 1st chapter.pptx
compiler introduction vtu syllabus 1st chapter.pptxcompiler introduction vtu syllabus 1st chapter.pptx
compiler introduction vtu syllabus 1st chapter.pptx
ranjan317165
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratch
Mahmoud Yasser
 
Insight into progam execution ppt
Insight into progam execution pptInsight into progam execution ppt
Insight into progam execution ppt
Keerty Smile
 
Assignment4.pptx
Assignment4.pptxAssignment4.pptx
Assignment4.pptx
jatinchand3
 
1 cc
1 cc1 cc
1 cc
Jay Soni
 
Nltk - Boston Text Analytics
Nltk - Boston Text AnalyticsNltk - Boston Text Analytics
Nltk - Boston Text Analytics
shanbady
 

Similar to NLTK (20)

NLP CHEAT SHEET.pdf
NLP CHEAT SHEET.pdfNLP CHEAT SHEET.pdf
NLP CHEAT SHEET.pdf
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
 
Techniques & applications of Compiler
Techniques & applications of CompilerTechniques & applications of Compiler
Techniques & applications of Compiler
 
Nltk:a tool for_nlp - py_con-dhaka-2014
Nltk:a tool for_nlp - py_con-dhaka-2014Nltk:a tool for_nlp - py_con-dhaka-2014
Nltk:a tool for_nlp - py_con-dhaka-2014
 
NLP with TensorFlow.pdf
NLP with TensorFlow.pdfNLP with TensorFlow.pdf
NLP with TensorFlow.pdf
 
python presentation
python presentationpython presentation
python presentation
 
The Phases of a Compiler
The Phases of a CompilerThe Phases of a Compiler
The Phases of a Compiler
 
SDD Translation
SDD  TranslationSDD  Translation
SDD Translation
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
 
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this pptAI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
 
Introduction to Python for Bioinformatics
Introduction to Python for BioinformaticsIntroduction to Python for Bioinformatics
Introduction to Python for Bioinformatics
 
lec00-Introduction.pdf
lec00-Introduction.pdflec00-Introduction.pdf
lec00-Introduction.pdf
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
An introduction to erlang
An introduction to erlangAn introduction to erlang
An introduction to erlang
 
compiler introduction vtu syllabus 1st chapter.pptx
compiler introduction vtu syllabus 1st chapter.pptxcompiler introduction vtu syllabus 1st chapter.pptx
compiler introduction vtu syllabus 1st chapter.pptx
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratch
 
Insight into progam execution ppt
Insight into progam execution pptInsight into progam execution ppt
Insight into progam execution ppt
 
Assignment4.pptx
Assignment4.pptxAssignment4.pptx
Assignment4.pptx
 
1 cc
1 cc1 cc
1 cc
 
Nltk - Boston Text Analytics
Nltk - Boston Text AnalyticsNltk - Boston Text Analytics
Nltk - Boston Text Analytics
 

Recently uploaded

Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
kalichargn70th171
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
mz5nrf0n
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
Hornet Dynamics
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Envertis Software Solutions
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
lorraineandreiamcidl
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
pavan998932
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 

Recently uploaded (20)

Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptxLORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
LORRAINE ANDREI_LEQUIGAN_HOW TO USE WHATSAPP.pptx
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 

NLTK

  • 2. Natural Language Toolkit (NLTK) ■ A collection of Python programs, modules, data set and tutorial to support research and development in Natural Language Processing (NLP) ■ Written by Steven Bird, Edvard Loper and Ewan Klien ■ NLTK is – Free and Open source – Easy to use – Modular – Well documented – Simple and extensible
  • 3. Installation of NLTK ■ NLTK requires Python versions 2.7 or 3.2+
  • 4. Python REPL • Read Eval Print Loop • REPL: This is a procedure that just loops, accepts one command at a time, executing it, and printing the result. • GUI OR CLI
  • 5. Installation of NLTK 1. Start a Command Prompt as an Administrator ( Windows User ) 1. Click Start. 2. In the Start Search box, type cmd, and then press CTRL+SHIFT+ENTER. 3. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue. 2. changing from user to Superuser ( linux user ) – sudo su
  • 6. Installation of NLTK Install NLTK: run pip install nltk
  • 8. Installing NLTK Data ■ NLTK comes with many corpora, toy grammars, trained models, etc. A complete list is posted at: http://nltk.org/nltk_data/ ■ Run the Python REPL and type the commands: ■ A new window should open, showing the NLTK Downloader. Click on the File menu and select Change Download Directory. For central installation, set this to C:nltk_data (Windows), /usr/local/share/nltk_data (Mac), or /usr/share/nltk_data (Unix). Next, select the packages or collections you want to download. >>> import nltk >>> nltk.download()
  • 13. Sentence splitter from nltk.tokenize import sent_tokenize input_sting = ‘Hello Every One. How Are you ? life is not easy.’ all_sent = sent_tokenize(input_sting) print (all_sent)
  • 14. Tokenization sent = "Hi Everyone ! How do you do ?" # Split() built-in string function print (sent.split()) # word_tokenize from nltk.tokenize import word_tokenize print (word_tokenize(sent))
  • 18. Edit-Distance We can create a very basic spellchecker by just using a dictionary lookup.
  • 19. Part of Speech Tagging
  • 22. Part of Speech Tagging ■ Stanford tagger ■ N-gram tagger ■ Regex tagger ■ Brill tagger ■ Machine learning based tagger ■ NER tagger – Named Entity Recognition (NER) – NLTK provides the ne_chunk() method
  • 23. NER tagger  Tokens  Tagged  Entities
  • 25. Shallow VS deep parsing ■ In deep or full parsing, typically, grammar concepts such as CFG, and probabilistic context-free grammar (PCFG), and a search strategy is used to give a complete syntactic structure to a sentence. ■ Shallow parsing is the task of parsing a limited part of the syntactic information from the given text.
  • 26. The two approaches in parsing The rule-based approach The probabilistic approach This approach is based on rules/grammar In this approach, you learn rules/grammar by using probabilistic models Manual grammatical rules are coded down in CFG, and so on, in this approach This uses observed probabilities of linguistic features This has a top-down approach This has a bottom-up approach This approach includes CFG and Regexbased parser This approach includes PCFG and the Stanford parser
  • 27. context-free grammar (CFG) ■ Generating sentences from context-free grammars:
  • 28.
  • 29. Different types of parsers ■ Recursive descent parser ■ Shift-reduce parser ■ Chart parser ■ Regex parser ■ Dependency parsing
  • 30. Different types of parsers Recursive descent parser One of the most straightforward forms of parsing is recursive descent parsing. This is a top-down process in which the parser attempts to verify that the syntax of the input stream is correct, as it is read from left to right. Shift-reduce parser The shift-reduce parser is a simple kind of bottom-up parser. Chart parser We will apply the algorithm design technique of dynamic programming to the parsing problem. Regex parser A regex parser uses a regular expression defined in the form of grammar on top of a POS-tagged string. Dependency parsing Dependency parsing (DP) is a modern parsing mechanism. The main concept of DP is that each linguistic unit (words) is connected with each other by a directed link.
  • 31. Chunking ■ Chunking is shallow parsing where instead of reaching out to the deep structure of the sentence, we try to club some chunks of the sentences that constitute some meaning. ■ For example, the sentence "the President speaks about the health care reforms"
  • 32. So, let's write some code snippets to do some basic chunking:
  • 33. Display a parse tree # import treebank corpus from nltk.corpus import treebank t = treebank.parsed_sents('wsj_0001.mrg')[0] t.draw()
  • 35. Resources ■ NLTK 3.0 documentation – http://www.nltk.org/ ■ NLTK Essentials – https://www.packtpub.com/big-data-and-business-intelligence/nltk-essentials ■ nltk_tutorial_repo [Code] – https://git.io/vaRIR
  • 36. Thanks. Questions? Send me an email! (I love talking!) Mohammed Shokr mohammedshokr2014@gmail.com @MShokr1

Editor's Notes

  1. While deep parsing is required for more complex NLP applications, such as dialogue systems and summarization , shallow parsing is more suited for information extraction and text mining varieties of applications.