Introduction to NLTK

Published in: Technology

  1. Getting Started with NLTK
     An Introduction to NLTK
     Sreejith S, srssreejith@gmail.com, @tweet2sree
     FOSSMeet 2011, NIC Calicut, 06 February 2011
  2. Just a word about me
     Working in Natural Language Processing (NLP), machine learning and text mining
     Active member of ilugcbe, http://ilugcbe.techstud.org
     Works for 365Media Pvt. Ltd., Coimbatore, India
     @tweet2sree, srssreejith@gmail.com
  12. Introduction - NLP
      Natural Language Processing (NLP) is an interdisciplinary subject: computer science, linguistics, statistics, etc.
      NLP is a subfield of artificial intelligence
      NLP is any kind of computer manipulation of natural language
      It is a rapidly developing field of study
      Everyday applications of NLP: handwriting recognition, machine translation, question-answering systems, spell checkers, grammar checkers, etc.
  21. Natural Language Toolkit (NLTK)
      A collection of Python programs, modules, data sets and tutorials to support research and development in Natural Language Processing (NLP)
      Written by Steven Bird, Edward Loper and Ewan Klein
      NLTK is free and open source, easy to use, modular, well documented, simple and extensible
      http://www.nltk.org
  25. What You Will Learn
      How simple programs can help you manipulate and analyze language data, and how to write these programs
      How key concepts from NLP and linguistics are used to describe and analyze language
      How data structures and algorithms are used in NLP
      How language data is stored in standard formats, and how data can be used to evaluate the performance of NLP techniques
  35. Installation of NLTK
      Make sure that Python 2.4, 2.5 or 2.6 is available on your system
      Install the Python Tkinter package
      Install NumPy, Matplotlib, Prover9, MaltParser and MegaM
      Download NLTK and install it
      If you are installing NLTK from source:
        Download http://nltk.googlecode.com/files/nltk-2.0b9.zip
        Unzip it; this creates the folder nltk-2.0b9
        Open a terminal, cd into this folder and, as superuser, run: python setup.py install
      To install the data, start the Python interpreter:
        >>> import nltk
        >>> nltk.download()
      Now you are ready to play with NLTK!
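After the installation steps, it can help to confirm which packages the interpreter can actually find. The snippet below is a minimal sketch, not part of the slides: it is written for Python 3 (an assumption, since the slides target Python 2.4 to 2.6), and the helper name `check_modules` is hypothetical. It only looks the modules up and does not import them.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: True/False}, True if the interpreter can locate the module."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Import names of the Python packages listed on the slide
# (Tkinter is importable as "tkinter" on Python 3).
for name, found in check_modules(["nltk", "numpy", "matplotlib", "tkinter"]).items():
    print(name, "found" if found else "missing")
```

Prover9, MaltParser and MegaM are external programs rather than Python modules, so a check like this does not cover them.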
  48. NLTK Modules
      NLTK modules                  Functionality
      nltk.corpus                   Corpus
      nltk.tokenize, nltk.stem      Tokenizers, stemmers
      nltk.collocations             t-test, chi-squared, mutual information
      nltk.tag                      n-gram, backoff, Brill, HMM, TnT
      nltk.classify, nltk.cluster   Decision tree, Naive Bayes, k-means
      nltk.chunk                    Regex, n-gram, named entity
      nltk.parse                    Parsing
      nltk.sem, nltk.inference      Semantic interpretation
      nltk.metrics                  Evaluation metrics
      nltk.probability              Probability and estimation
      nltk.app, nltk.chat           Applications
  56. Let us start the game
      To access the data for working through the examples in the book, start the Python interpreter
      Some basic workouts from the book:
      Concordance
        >>> from nltk.book import *
        >>> text1.concordance("monstrous")
      Similar
        >>> text1.similar("monstrous")
      Dispersion plot (positional information)
        >>> text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"])
        >>> text4.dispersion_plot(["and", "to", "of", "with", "the"])
      What is it? Why?
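To make the idea behind a concordance and a dispersion plot concrete, here is a plain-Python sketch that needs no NLTK data. It is an illustration only, not how NLTK implements these methods; the function names and the sample sentence are made up for this example.

```python
def concordance(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    target = word.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [token] + right))
    return lines


def positions(tokens, word):
    """Return the token offsets of `word`: the raw data a dispersion plot draws."""
    return [i for i, token in enumerate(tokens) if token == word]


tokens = "the monstrous whale swam past a most monstrous ship".split()
for line in concordance(tokens, "monstrous", width=2):
    print(line)
print(positions(tokens, "monstrous"))
```

A very common word such as "the" produces offsets spread over the whole text, which is one way to think about the second dispersion plot on the slide.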
  65. Continued...
      Some basic workouts from the book:
      Generate
        >>> text3.generate()
      Counting vocabulary
        >>> len(text3)
      List of distinct words, sorted in dictionary order
        >>> sorted(set(text3))
      Count occurrences of a particular word in a text
        >>> text3.count("and")
      What percentage of the text is taken up by a specific word
        >>> 100.0 * text3.count("and") / len(text3)
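The same counts work on any Python list of tokens, because an NLTK text behaves like a list of words. The sketch below uses a short sample sentence instead of `text3` so that it runs without NLTK; the variable names are chosen for this example only.

```python
tokens = "in the beginning God created the heaven and the earth".split()

total = len(tokens)                  # number of tokens, repeats included
vocabulary = sorted(set(tokens))     # distinct words; uppercase sorts before lowercase
the_count = tokens.count("the")      # occurrences of one word
share = 100.0 * the_count / total    # float arithmetic avoids integer division on Python 2

print(total, len(vocabulary), the_count, share)
```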
  71. Collocation & Bigram
      Collocation: a sequence of words that occur together unusually often, e.g. red wine, strong tea; but strong computer is not a collocation
        >>> text4.collocations()
      Bigrams: list of word pairs
        >>> text = "sreejith is talking about NLTK"
        >>> wordlist = text.split()
        >>> bigrams(wordlist)
      What will happen if I do this?
        >>> bigrams(text)
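The question on the slide can be explored with a small stand-in for `bigrams`. This is a sketch in plain Python, not NLTK's own function, and the name `pairwise_bigrams` is hypothetical. It pairs each item of a sequence with its successor, so the result depends on what the items are.

```python
def pairwise_bigrams(sequence):
    """Pair every item of `sequence` with the item that follows it."""
    items = list(sequence)
    return list(zip(items, items[1:]))


text = "sreejith is talking about NLTK"

word_pairs = pairwise_bigrams(text.split())   # items are words
char_pairs = pairwise_bigrams(text)           # items are single characters

print(word_pairs)
print(char_pairs[:3])
```

A string is a sequence of characters, so passing the unsplit text yields character pairs rather than word pairs.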
  75. Work with our own data
      Populate our own corpora with NLTK and analyse them
        >>> from nltk.corpus import PlaintextCorpusReader as ptr
        >>> corpus = '/home/developer/Desktop/Sreejith'
        >>> wordlist = ptr(corpus, '.*')
        >>> wordlist.fileids()
      Let us find out how to count the number of characters, words and sentences in the corpus
        >>> for fid in wordlist.fileids(): print len(wordlist.raw(fid))
        >>> for fid in wordlist.fileids(): print len(wordlist.words(fid))
        >>> for fid in wordlist.fileids(): print len(wordlist.sents(fid))
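For comparison, the three counts can be approximated with the standard library alone. This is a rough sketch under stated assumptions: words are split on whitespace and sentences on ".", "!" and "?", which is cruder than NLTK's tokenizers, so the numbers will generally differ from the corpus reader's. The function name `corpus_counts` and the sample file are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def corpus_counts(root, pattern="*.txt"):
    """Map each matching file name to (characters, words, sentences)."""
    counts = {}
    for path in sorted(Path(root).glob(pattern)):
        raw = path.read_text(encoding="utf-8")
        words = raw.split()
        # Split on runs of sentence-ending punctuation and drop empty pieces.
        sentences = [s for s in re.split(r"[.!?]+", raw) if s.strip()]
        counts[path.name] = (len(raw), len(words), len(sentences))
    return counts


# Demonstration on a throwaway directory holding one small file.
with tempfile.TemporaryDirectory() as root:
    Path(root, "sample.txt").write_text("Hello world. How are you?", encoding="utf-8")
    demo = corpus_counts(root)
print(demo)
```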
  81. Continued...
      Plotting a conditional frequency distribution (CFD)
        >>> text = "sreejith is talking about NLTK"
        >>> words = text.split()
        >>> big = bigrams(words)
        >>> gd = nltk.ConditionalFreqDist(big)
        >>> gd.plot()
      Tabulate the CFD
        >>> gd.tabulate()
      Plot a frequency distribution
        >>> fdist = FreqDist(text1)
        >>> fdist.plot(50, cumulative=True)
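Conceptually, a conditional frequency distribution built from bigrams counts, for each first word (the condition), how often each second word follows it. The sketch below shows that idea with `collections.Counter`; it is an illustration, not NLTK's `ConditionalFreqDist` class, and the sample sentence and the name `conditional_freq` are made up.

```python
from collections import Counter, defaultdict


def conditional_freq(pairs):
    """Count outcomes per condition from an iterable of (condition, outcome) pairs."""
    cfd = defaultdict(Counter)
    for condition, outcome in pairs:
        cfd[condition][outcome] += 1
    return cfd


words = "the cat sat on the mat".split()
cfd = conditional_freq(zip(words, words[1:]))

# Words observed after "the", with their counts.
print(dict(cfd["the"]))
```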
  84. Normalizing Text
      Stemming: the process of reducing inflected (or sometimes derived) words to their stem, base or root form, generally a written word form
        >>> porter = nltk.PorterStemmer()
        >>> word = 'running'
        >>> porter.stem(word)
        >>> lancaster = nltk.LancasterStemmer()
        >>> lancaster.stem(word)
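To show what suffix stripping means in practice, here is a deliberately tiny stemmer. It is a toy sketch and not the Porter or Lancaster algorithm; the suffix list, the minimum stem length of three letters and the name `toy_stem` are all assumptions made for this example.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[:-len(suffix)]
            # Undo consonant doubling ("runn" -> "run"), leaving vowels, l and s alone.
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


for word in ["running", "jumped", "boxes", "cats", "is"]:
    print(word, "->", toy_stem(word))
```

Nothing here checks that the result is a real word, which is the gap lemmatization addresses.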
  87. Normalizing Text
      Lemmatization: stemming, plus making sure that the resulting form is a known word in a dictionary
        >>> wnl = nltk.WordNetLemmatizer()
        >>> wnl.lemmatize(word)
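The "known word in a dictionary" check can be sketched as follows. This is a toy illustration, not the WordNet lemmatizer: the word list `KNOWN_WORDS` stands in for a real dictionary, and the rewrite rules and the name `toy_lemmatize` are hypothetical.

```python
KNOWN_WORDS = {"run", "study", "box", "cat", "happy"}  # stand-in for a real dictionary

# (suffix, replacement) rules, tried in order.
RULES = [("ies", "y"), ("ing", ""), ("es", ""), ("s", ""), ("ed", "")]


def toy_lemmatize(word, known=KNOWN_WORDS):
    """Return a suffix-stripped form only if it is a known word, else the word itself."""
    if word in known:
        return word
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            candidate = word[:-len(suffix)] + replacement
            candidates = [candidate]
            # Also try undoing a doubled final letter ("runn" -> "run").
            if len(candidate) >= 2 and candidate[-1] == candidate[-2]:
                candidates.append(candidate[:-1])
            for form in candidates:
                if form in known:
                    return form
    return word


for word in ["studies", "running", "boxes", "walking"]:
    print(word, "->", toy_lemmatize(word))
```

Unlike a bare stemmer, the function leaves "walking" untouched here because "walk" is not in the sample dictionary.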
  90. POS Tagging
      The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech (POS) tagging
        >>> text = nltk.word_tokenize("we are attending FOSS meet at NIC calicut")
        >>> nltk.pos_tag(text)
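The simplest form of tagging is a lookup table with a default tag for unknown words, which is also the idea behind backing off from one tagger to another. The sketch below is an illustration only and far weaker than `nltk.pos_tag`; the hand-made `LEXICON` and the name `lookup_tag` are hypothetical.

```python
def lookup_tag(tokens, lexicon, default="NN"):
    """Tag each token from `lexicon`, backing off to `default` for unknown words."""
    return [(token, lexicon.get(token.lower(), default)) for token in tokens]


# Tiny hand-made lexicon using Penn Treebank style tags.
LEXICON = {"we": "PRP", "are": "VBP", "attending": "VBG", "at": "IN"}

tokens = "we are attending FOSS meet at NIC calicut".split()
tagged = lookup_tag(tokens, LEXICON)
print(tagged)
```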
  93. Parsing
      Sentence parsing: analyzing sentence structure and creating a parse tree
        >>> sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"), ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
        >>> grammar = "NP: {<DT>?<JJ>*<NN>}"
        >>> cp = nltk.RegexpParser(grammar)
        >>> result = cp.parse(sentence)
        >>> print result
        >>> result.draw()
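The grammar rule `NP: {<DT>?<JJ>*<NN>}` reads as "an optional determiner, any number of adjectives, then a noun". The sketch below applies the same pattern to the tag sequence with Python's `re` module. It is a simplified illustration, not how `nltk.RegexpParser` is implemented, and it returns flat word lists rather than a tree; the name `chunk_noun_phrases` is hypothetical.

```python
import re

# Same shape as the slide's rule: optional <DT>, any number of <JJ>, then <NN>.
NP_PATTERN = re.compile(r"(<DT>)?(<JJ>)*<NN>")


def chunk_noun_phrases(tagged):
    """Return the words of every tag run that matches NP_PATTERN."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>".
    tags = "".join("<%s>" % tag for _, tag in tagged)
    chunks = []
    for match in NP_PATTERN.finditer(tags):
        # Every tag contributes exactly one "<", so counting "<" turns
        # character offsets back into token indices.
        start = tags.count("<", 0, match.start())
        end = start + match.group().count("<")
        chunks.append([word for word, _ in tagged[start:end]])
    return chunks


sentence = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("dog", "NN"),
            ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]
noun_phrases = chunk_noun_phrases(sentence)
print(noun_phrases)
```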
  97. Machine Translation
      Babelizer shell: translates a sentence from its source language to a specified language. NLTK provides the babelize shell
        >>> babelize_shell()
        Babel> hello how are you?
        Babel> german
        Babel> run
      You can also try Google Translate or Yahoo Babel Fish
  98. What can you do?
      Contribute to NLTK
      GSoC
      NLP training
      Real-time research
  99. References
      Steven Bird, Edward Loper and Ewan Klein, Natural Language Processing with Python
      Jacob Perkins, Python Text Processing with NLTK 2.0 Cookbook
      http://www.nltk.org
  100. Questions
  101. And finally... Sreejith S
