Introduction to Natural Language Processing

Rutu Mulkar-Mehta, PhD
Founder and Data Scientist @Ticary
@RutuMulkar

Co-hosted Meetup
Data Science Dojo
http://www.meetup.com/data-science-dojo
Natural Language Processing
http://www.meetup.com/Natural-Language-Processing-Meetup/
  
About Me
•  Founder and Data Scientist at Ticary
•  Background:
   – PhD in Natural Language Processing
   – Computer Science
•  Worked on applying NLP to:
   – Healthcare
   – SEO (Search Engine Optimization)
   – Other Stuff: Sentiment Analysis, Question Answering, Natural Language Understanding ++
  
Agenda
•  Understanding Natural Language
•  Introduction to different NLP Problems
•  Part of Speech tagging
•  Linguistic Resources
  
	
  
UNDERSTANDING NATURAL LANGUAGE
  
Some Example Sentences
•  Children make delicious snacks
•  I saw the Grand Canyon flying to New York
•  Stolen painting found by the tree
•  Two sentences:
   – Monkeys like bananas when they wake up.
   – Monkeys like bananas when they are ripe.
  
Why is NLP Hard?
Brazil crowds attend funeral of late candidate Campos

More than 100,000 people in Brazil have paid their last respects to the late presidential candidate, Eduardo Campos, who died in a plane crash on Wednesday.
They attended a funeral Mass and filled the streets of the city of Recife to follow the passage of his coffin.
Later this week, Mr. Campos's Socialist Party is expected to appoint former Environment Minister Marina Silva as a replacement candidate.
Mr. Campos's jet crashed in bad weather in Santos, near Sao Paulo.
Investigators are still trying to establish the exact causes of the crash, which killed six other people.
  
Why is NLP Hard?
•  To understand the current event, you need to understand several other concepts:
   – Current Event
   – Background Event
   – Property
   – References to other events
   – Pronouns
  
NLP TASKS
What can we solve with Natural Language Processing?
  
NLP Tasks
•  Text Categorization
•  Sentiment Analysis
•  Information Extraction
•  Information Retrieval
•  Question Answering
•  Text Summarization
•  Machine Translation
  
Text Categorization
Input Document
What is the document about:
   sports: 0.2%
   politics: 2%
   entertainment: 96%
   religion: …
   finance: …
  
Text Classification
finance.yahoo.com vs. sports.yahoo.com
(make your own wordle using wordle.net)
Vocabulary used in one genre of text is different from vocabulary used in another genre
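The idea that genres differ in vocabulary can be sketched as a tiny bag-of-words Naive Bayes classifier. This is only an illustration under stated assumptions: the four training sentences and the labels "finance" and "sports" are made up for the example, and Naive Bayes is one possible technique, not necessarily the one behind the slide.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def train(labeled_docs):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for w in tokenize(text):
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, invented for this sketch.
TRAINING = [
    ("stocks fell as the market closed lower", "finance"),
    ("investors bought shares and bonds", "finance"),
    ("the team won the game in overtime", "sports"),
    ("the striker scored a late goal", "sports"),
]

model = train(TRAINING)
print(classify("the market rallied and shares rose", *model))
print(classify("the team scored a goal", *model))
```

With real data the same structure applies, but the training set would need many documents per genre.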
  
Sentiment Analysis
Product Reviews – Kindle Paperwhite
•  Sharp screen resolution
•  Low battery life

Sentiment Analysis
•  What are people saying?
   –  Twitter
   –  Reviews
   –  Blogs
   –  Emails
•  Can be for:
   –  Products
   –  Companies
   –  Movies
   –  Books
  
Sentiment Analysis
Possible Features
•  Important keywords and key phrases:
   –  POS: dazzling, brilliant, phenomenal
   –  NEG: hideous, awful, unwatchable
•  Emoticons
   –  POS :-)
   –  NEG :-(
•  Ontologies
   –  WordNet: https://wordnet.princeton.edu/
   –  SentiWordNet: http://sentiwordnet.isti.cnr.it/
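A minimal sketch of how the keyword and emoticon features could be combined into a score. The word lists are the ones from the slide; the scoring rule (positive hits minus negative hits) and the function names are assumptions made for illustration only.

```python
import re

# Keyword and emoticon features taken from the slide.
POSITIVE = {"dazzling", "brilliant", "phenomenal", ":-)"}
NEGATIVE = {"hideous", "awful", "unwatchable", ":-("}


def sentiment_score(text):
    """Count positive features minus negative features in the text."""
    tokens = re.findall(r":-\)|:-\(|[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos - neg


def sentiment_label(text):
    """Map the score to POS, NEG, or NEUTRAL."""
    score = sentiment_score(text)
    if score > 0:
        return "POS"
    if score < 0:
        return "NEG"
    return "NEUTRAL"


print(sentiment_label("A dazzling, brilliant film :-)"))
print(sentiment_label("Hideous plot and awful acting :-("))
```

A lexicon this small misses most opinion words, and it has no way to handle contrast or sarcasm.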
  
Challenges
•  People express opinions in complex ways
   – “The acting was great and the plots were intense and mesmerizing, but I hated the movie”
•  Sarcasm, humor and other expressions
   – “It was a great movie for a Sunday nap. I only fell asleep twice, but it was very restful”
  
Information Extraction
Input Document
What are the key pieces of information?
   Location:
   Time:
   People:
   …
  
Extracting Named Entities from Documents
  
Other ways for IE:
Hypernyms (type of)
colors such as red, blue and …
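The "X such as A, B and C" cue can be turned into a small pattern matcher (lexical patterns of this kind are often called Hearst patterns). The sketch below assumes single-word terms and a simple "A, B and C" list; the regular expression and the function name `extract_hypernyms` are illustrative, not part of the slides.

```python
import re

# "<hypernym> such as <item>, <item> and <item>" with single-word items.
SUCH_AS = re.compile(r"(\w+) such as ((?:\w+, )*\w+(?: (?:and|or) \w+)?)")


def extract_hypernyms(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in re.split(r",\s*|\s+(?:and|or)\s+", match.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


print(extract_hypernyms("colors such as red, blue and green are popular"))
```

Multi-word terms ("New York") and other cue phrases ("including", "and other") would need additional patterns.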
  
Other ways for IE:
Synonyms
Find different relations between 2 concepts:
Microsoft bought Farecast
  
Information Retrieval
Input: a query and a set of Input Documents
What are the documents relevant to the query?
  
Information Retrieval
Q) Which documents are most relevant to a given query?
A) Similar vocabulary between query and document?
Quantify similarity based on maximum overlap
   – Cosine Similarity
   – Jaccard Similarity
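Both measures can be sketched over plain word counts. The whitespace tokenizer and the `rank` helper below are assumptions made for this example; real search engines add term weighting, stemming and indexing on top.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def jaccard(a, b):
    """Shared vocabulary divided by combined vocabulary."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def rank(query, docs, similarity=cosine):
    """Order documents from most to least similar to the query."""
    return sorted(docs, key=lambda d: similarity(query, d), reverse=True)


docs = ["football match result", "plane crash in brazil"]
print(rank("plane crash", docs))
print(jaccard("plane crash brazil", "plane crash in brazil"))
```

Jaccard ignores how often a word occurs, while cosine similarity takes the counts into account.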
  
Information Retrieval
Q) If you rewrite the query, will that give you more precise results?
A) Yes! It is called “Query Expansion”
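One simple form of query expansion adds synonyms of each query term before searching. The sketch below assumes a hand-written synonym table (`SYNONYMS` is hypothetical); in practice the synonyms could come from a resource such as WordNet.

```python
# Hypothetical synonym table, invented for this sketch.
SYNONYMS = {
    "car": ["automobile"],
    "film": ["movie"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms followed by any known synonyms of each term."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        expanded.extend(synonyms.get(term, []))
    return expanded


print(expand_query("used car"))
```

The expanded term list can then be matched against documents with any similarity measure.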
  
Commercial Search Tools
•  Lucene
   – http://lucene.apache.org/
•  ElasticSearch
   – https://www.elastic.co/
Underlying technology in most of these is the same, with some variations

Meetup about this topic scheduled for early 2016
  
Question Answering - Closed
Input Data Source
Questions:
   What event happened?
   When did the event happen?
   Why did the event happen?
   How long was the event?
   How did the event happen?

Question Answering - Open
  
Text Summarization

Types of Text Summarization
•  Keyword Summaries
   –  Extract significant keywords from text
   –  Easy to implement
   –  Hard for end user to understand
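A keyword summary can be sketched as "most frequent words after dropping stopwords". The stopword list and the frequency criterion are assumptions for this example; other notions of "significant" (such as TF-IDF) are common too.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "by", "after", "is", "was"}


def keyword_summary(text, k=3):
    """Return the k most frequent non-stopword words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


text = ("The strike continued. Students joined the strike and faculty "
        "supported the strike. Faculty cancelled classes.")
print(keyword_summary(text, k=2))
```

The output is a bare list of words, which shows why this kind of summary is easy to build but hard for a reader to interpret.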
  
Types of Text Summarization
•  Sentence/Phrase Extraction
   –  Extract relevant sentences
   –  Medium-Hard to implement
   –  Easy for end user to understand

Types of Text Summarization
•  Natural Language Understanding and Generation
   –  Understand meaning of text
   –  Generate sentences from meaning of original text
   –  Hard to implement
   –  Easy for end user

President of University of Missouri resigned after graduate student hunger strike and class cancellations by faculty
  
Machine Translation
translate.google.com

Why is MT Hard?
•  It is not a 1 to 1 translation
   – In the previous example 4 words in English translate into 2 in Spanish
•  Grammar is different in different languages
   – SOV (Subject – Object – Verb)
      •  “She him loves” (Hindi, Japanese)
   – SVO (Subject – Verb – Object)
      •  “She loves him” (English, Mandarin)
  
Machine Translation
•  Waygoapp
•  Instantly translates Chinese, Japanese and Korean
•  Simply point and translate
•  Offline
http://waygoapp.com/
  
LINGUISTIC NUANCES
Back to the basics

Example
All the gobulins were gramzies.
It was grimbleton.
What are the underlined words?
gobulins
•  Noun
gramzies
•  Noun or Adjective
grimbleton
•  Noun or Adjective

Why is the example important?
We can get a sense of what the word means, based on how it is used in language.
  
Nouns
•  E.g. cat, car, computer, tree
•  Variations:
   – Number: singular, plural
      •  one car, two cars
   – Gender: masculine, feminine, neuter
   – Case: nominative, genitive, accusative, dative

Pronouns
•  Vary in
   –  E.g. she, ourselves, mine
   –  Person
   –  Gender
      •  his, her
   –  Number
   –  Case: nominative, accusative, possessive, 2nd possessive
   –  Reflexive and Anaphoric Forms:
      •  herself, each other

Determiners
•  Articles
   – a, an, the
•  Demonstratives
   – this, that
  
	
  
Adjectives
•  Describe Properties
   – sunny, beautiful, calm
•  Attributive and predicative properties
•  Agreement
   – in gender, number
•  Comparative and superlative forms
   – derivative and periphrastic
•  positive form

Verbs
•  Tense: past, present, future
   – danced, dancing, will dance
•  Aspect: progressive, perfective
•  Voice: active, passive
•  Other: number, person
•  Arguments: transitive, intransitive, ditransitive

Other POS tags
•  Adverbs
   – happily
•  Prepositions
   – of, on, in
•  Particles
   – ran a bill vs ran up a bill
  
Morphological Analysis
•  Sleeps = sleep + v + 3rd Person + Singular
•  If we have a good enough grammar with all of these rules, we have a good shot at understanding syntax of language
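The decomposition "sleeps = sleep + v + 3rd Person + Singular" can be sketched with a few suffix rules. This is a toy under strong assumptions: the input is already known to be a regular English verb, and the three rules and feature names are chosen for illustration only.

```python
def analyze_verb(word):
    """Split a regular English verb form into (root, features) using suffix rules."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        return w[:-3], ["V", "Progressive"]
    if w.endswith("ed") and len(w) > 4:
        return w[:-2], ["V", "Past"]
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1], ["V", "3rd Person", "Singular"]
    return w, ["V"]


print(analyze_verb("Sleeps"))
print(analyze_verb("walked"))
```

Irregular forms (slept, is) and spelling changes (running, tried) are exactly the cases where a handful of rules stops being enough.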
  
Automatic Taggers
•  Almost all the POS taggers use the Penn Treebank list of tags
•  https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
   –  Nouns:
      •  NN (house), NNS (houses), NNP (White House), NNPS
   –  Verbs:
      •  VB (say), VBD (said), VBG (saying), VBN, VBP, VBZ
   –  Adjectives:
      •  JJ (good), JJR (better), JJS (best)
   –  Adverbs: RB, RBR, RBS
   –  Prepositions: IN
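Because the Penn Treebank tags for one word class share a prefix (NN*, VB*, JJ*, RB*), tagger output is easy to group into coarse classes. The sketch below assumes the (word, tag) pairs come from some POS tagger; the sample pairs are hand-written and the helper names are hypothetical.

```python
# First two letters of a Penn Treebank tag mapped to a coarse word class.
COARSE = {
    "NN": "noun",
    "VB": "verb",
    "JJ": "adjective",
    "RB": "adverb",
    "IN": "preposition",
}


def coarse_pos(tag):
    """Collapse a fine-grained Penn Treebank tag into a coarse class."""
    return COARSE.get(tag[:2], "other")


def content_words(tagged, keep=("noun", "verb")):
    """Keep only the words whose coarse class is in `keep`."""
    return [word for word, tag in tagged if coarse_pos(tag) in keep]


# Hand-written example of tagger output (not produced by a real tagger here).
tagged = [("The", "DT"), ("house", "NN"), ("stands", "VBZ"), ("proudly", "RB")]
print(content_words(tagged))
```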
  
  
POS Tagging and Parsing
•  Stanford Core NLP
   – http://nlp.stanford.edu:8080/corenlp/
•  NLTK
   – Natural Language Toolkit
   – For best results you may need to provide your own training data and train models (NLTK also ships a default pretrained English tagger)
  
Other Linguistic Features of Interest
– We want to get nouns and verbs into a root form
   E.g.
   •  am, are, is → be
   •  car, cars, car’s → car
– Two approaches:
   •  Stemming
   •  Lemmatization
  
Stemming and Lemmatization
•  Lemmatization
   –  use of a vocabulary
   –  morphological analysis of words
   –  returns the base or dictionary form of a word
   –  base form is known as the lemma
   –  e.g. am, are, is → be
•  Stemming
   –  crude heuristic process
   –  chops off the ends of words
   –  in the hope of reaching the root form most of the time
   –  e.g. Marked → Mark, Marker → Mark
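The contrast between the two approaches can be sketched side by side. Both functions are toys: the suffix list is an assumption (real stemmers such as Porter's use many more rules and may treat these words differently), and `LEMMAS` is a tiny hypothetical vocabulary standing in for a real dictionary.

```python
# Suffixes the toy stemmer chops off, tried in this order.
SUFFIXES = ("ing", "ed", "er", "s")

# Tiny hypothetical vocabulary for the toy lemmatizer.
LEMMAS = {"am": "be", "are": "be", "is": "be", "cars": "car", "car's": "car"}


def stem(word):
    """Crude heuristic: strip the first matching suffix, keeping at least 3 letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def lemmatize(word):
    """Vocabulary lookup: return the dictionary form if the word is known."""
    w = word.lower()
    return LEMMAS.get(w, w)


for word in ["Marked", "Marker", "cars", "is"]:
    print(word, stem(word), lemmatize(word))
```

The stemmer handles "Marked" and "Marker" without any vocabulary, but only the lemmatizer can map "is" to "be".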
  
Parsing Resources
•  NLTK
   – python, low accuracy, fast
   – http://www.nltk.org/
•  Stanford Core NLP
   – java, high accuracy, slow
   – http://nlp.stanford.edu/software/corenlp.shtml
•  SpaCy
   – python, medium accuracy, fast
   – https://spacy.io/
  
Other Resources: Ontologies
•  WordNet
   –  groups words when they have the same meaning
   –  represents hierarchical links between groups
   –  E.g. car is the same thing as an automobile
•  SentiWordNet
   –  WordNet + Sentiment
•  ConceptNet
   –  broader relationships than WordNet
   –  E.g. bread is typically found near a toaster.
•  FrameNet
   –  Frames represent concepts and their associated roles
  
SOMETHING TO THINK ABOUT
  
Semantics and Word Collocations
•  It is important to know which words occur together
   – Strong Beer vs Powerful Beer
   – Big Sister vs Large Sister
•  Two approaches have been used
   – Semantics – ontologies and word meanings
   – Statistics – word collocations and probabilities
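The statistical approach can be sketched by counting adjacent word pairs and scoring them with pointwise mutual information (PMI). PMI is one common collocation measure, chosen here as an assumption, and the three-sentence corpus is invented for the example.

```python
import math
from collections import Counter


def pmi_table(sentences):
    """Score each adjacent word pair by PMI: log2(P(x, y) / (P(x) * P(y)))."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        toks = sentence.lower().split()
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    table = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table


# Hypothetical toy corpus, invented for this sketch.
corpus = ["strong beer", "strong beer and strong coffee", "a powerful engine"]
scores = pmi_table(corpus)
print(scores[("strong", "beer")])
print(("powerful", "beer") in scores)
```

A positive score means the pair occurs together more often than chance would predict; pairs never seen together, like "powerful beer" here, do not appear in the table at all.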
  
Thank you for Listening
rutu@ticary.com
@RutuMulkar
  
	
  

PLAIN2013   Rethink, Reorganize, Reword, RedesignPLAIN2013   Rethink, Reorganize, Reword, Redesign
PLAIN2013 Rethink, Reorganize, Reword, Redesign
 

Recently uploaded

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 

Recently uploaded (20)

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 

Intro to nlp

  • 1. Introduction to Natural Language Processing
      Rutu Mulkar-Mehta, PhD
      Founder and Data Scientist @Ticary
      @RutuMulkar
  • 2. Co-hosted Meetup
      Data Science Dojo
      http://www.meetup.com/data-science-dojo
      Natural Language Processing
      http://www.meetup.com/Natural-Language-Processing-Meetup/
  • 4. About Me
      • Founder and Data Scientist at Ticary
      • Background:
          – PhD in Natural Language Processing
          – Computer Science
      • Worked on applying NLP to:
          – Healthcare
          – SEO (Search Engine Optimization)
          – Other Stuff: Sentiment Analysis, Question Answering, Natural Language Understanding ++
  • 5. Agenda
      • Understanding Natural Language
      • Introduction to different NLP Problems
      • Part of Speech tagging
      • Linguistic Resources
  • 7. Some Example Sentences
      • Children make delicious snacks
      • I saw the Grand Canyon flying to New York
      • Stolen painting found by the tree
      • Two sentences:
          – Monkeys like bananas when they wake up.
          – Monkeys like bananas when they are ripe.
  • 8. Why is NLP Hard?
      Brazil crowds attend funeral of late candidate Campos
      More than 100,000 people in Brazil have paid their last respects to the late presidential candidate, Eduardo Campos, who died in a plane crash on Wednesday.
      They attended a funeral Mass and filled the streets of the city of Recife to follow the passage of his coffin.
      Later this week, Mr. Campos's Socialist Party is expected to appoint former Environment Minister Marina Silva as a replacement candidate.
      Mr. Campos's jet crashed in bad weather in Santos, near Sao Paulo. Investigators are still trying to establish the exact causes of the crash, which killed six other people.
  • 11. Why is NLP Hard?
      • To understand the current event, you need to understand several other concepts:
          – Current Event
          – Background Event
          – Property
          – References to other events
          – Pronouns
  • 12. NLP TASKS
      What can we solve with Natural Language Processing
  • 13. NLP Tasks
      • Text Categorization
      • Sentiment Analysis
      • Information Extraction
      • Information Retrieval
      • Question Answering
      • Text Summarization
      • Machine Translation
  • 14. Text Categorization
      Input Document
      What is the document about:
          sports: 0.2%
          politics: 2%
          entertainment: 96%
          religion: …
          finance: …
  • 15. Text Classification
      finance.yahoo.com
      sports.yahoo.com
      (make your own wordle using wordle.net)
      Vocabulary used in one genre of text is different from vocabulary used in another genre
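The idea that each genre has its own vocabulary can be sketched as a tiny Naive Bayes classifier. This is only an illustration, not the method used in the talk: the training sentences, the genre names, and the helper names (`train`, `classify`) are all made up for the example.

```python
import math
from collections import Counter

# Hypothetical toy training data; a real system needs many documents per genre.
TRAINING = {
    "sports": ["the team won the match", "the striker scored a late goal"],
    "finance": ["the stock market fell today", "investors sold shares and bonds"],
}

def tokenize(text):
    return text.lower().split()

def train(training):
    """Count how often each word appears in each genre."""
    counts = {genre: Counter() for genre in training}
    for genre, docs in training.items():
        for doc in docs:
            counts[genre].update(tokenize(doc))
    return counts

def classify(text, counts):
    """Return genre probabilities from Naive Bayes with add-one smoothing."""
    vocab = set().union(*counts.values())
    log_scores = {}
    for genre, words in counts.items():
        total = sum(words.values())
        # Uniform prior over genres, so only the word likelihoods matter.
        log_scores[genre] = sum(
            math.log((words[w] + 1) / (total + len(vocab)))
            for w in tokenize(text)
        )
    # Normalise the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp = {g: math.exp(s - top) for g, s in log_scores.items()}
    norm = sum(exp.values())
    return {g: v / norm for g, v in exp.items()}

counts = train(TRAINING)
probs = classify("the goal won the match", counts)
print(max(probs, key=probs.get), probs)
```

The output has the same shape as the per-genre percentages on the Text Categorization slide: one probability per category.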
  • 17. Sentiment Analysis
      Product Reviews – Kindle Paperwhite
      • Sharp screen resolution
      • Low battery life
  • 18. Sentiment Analysis
      • What are people saying?
          – Twitter
          – Reviews
          – Blogs
          – Emails
      • Can be for:
          – Products
          – Companies
          – Movies
          – Books
  • 19. Sentiment Analysis: Possible Features
      • Important keywords and key phrases:
          – POS: dazzling, brilliant, phenomenal
          – NEG: hideous, awful, unwatchable
      • Emoticons
          – POS :-)
          – NEG :-(
      • Ontologies
          – WordNet: https://wordnet.princeton.edu/
          – SentiWordNet: http://sentiwordnet.isti.cnr.it/
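A minimal sketch of how keyword and emoticon features like these could be turned into a score. The word lists are the ones on the slide; the scoring rule (positive hits minus negative hits) and the function names are assumptions made for illustration.

```python
import re

# Keyword and emoticon features taken from the slide.
POSITIVE = {"dazzling", "brilliant", "phenomenal", ":-)"}
NEGATIVE = {"hideous", "awful", "unwatchable", ":-("}

def sentiment_score(text):
    """Count positive features minus negative features."""
    # Match emoticons first so they are not lost when splitting into words.
    tokens = re.findall(r":-\)|:-\(|[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos - neg

def label(text):
    score = sentiment_score(text)
    if score > 0:
        return "POS"
    if score < 0:
        return "NEG"
    return "NEUTRAL"

print(label("A dazzling, brilliant film :-)"))
print(label("Awful and unwatchable :-("))
```

A counter like this has no notion of contrast or sarcasm, so mixed or ironic reviews are exactly where it breaks down.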
  • 20. Challenges
      • People express opinions in complex ways
          – “The acting was great and the plots were intense and mesmerizing, but I hated the movie”
      • Sarcasm, humor and other expressions
          – “It was a great movie for a Sunday nap. I only fell asleep twice, but it was very restful”
  • 23. Information Extraction
      Input Document
      What are the key pieces of information?
          Location:
          Time:
          People:
          …
      Extracting Named Entities from Documents
  • 25. Other ways for IE: Hypernyms (type of)
      colors such as red, blue and …
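The "X such as A, B and C" pattern can be matched with a regular expression. The sketch below handles only this one pattern and single-word terms; the function name and the example sentence are illustrative assumptions, and real text needs noun-phrase detection rather than `\w+`.

```python
import re

# Separator between listed items: ", ", " and ", " or ", ", and ", ", or ".
SEP = r"(?:,? (?:and|or) |, )"
# "<hypernym> such as <item>(<sep><item>)*", with single-word items only.
PATTERN = re.compile(rf"(\w+) such as (\w+(?:{SEP}\w+)*)")

def hypernym_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        hypernym = match.group(1)
        for hyponym in re.split(SEP, match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs

print(hypernym_pairs("She likes colors such as red, blue and green."))
```

Each pair reads as "red is a type of colors"; a later step would normalise the plural.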
  • 26. Other ways for IE: Synonyms
      Find different relations between 2 concepts:
      Microsoft bought Farecast
  • 29. Information Retrieval
      What are the documents relevant to the query?
      (Diagram: a query and several input documents)
  • 31. Information Retrieval
      Q) Which documents are most relevant to a given query?
      A) Similar vocabulary between query and document?
      Quantify similarity based on maximum overlap
          – Cosine Similarity
          – Jaccard Similarity
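Both similarity measures can be computed from plain word counts. This is a minimal sketch with whitespace tokenization and raw term frequencies; production search engines add weighting such as TF-IDF, which is not shown. The function names and sample documents are invented for the example.

```python
import math
from collections import Counter

def tokens(text):
    return text.lower().split()

def jaccard(a, b):
    """Shared vocabulary divided by combined vocabulary."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def cosine(a, b):
    """Cosine of the angle between the two term-frequency vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    """Order documents from most to least similar to the query."""
    return sorted(docs, key=lambda d: cosine(query, d), reverse=True)

docs = ["plane crash in Brazil", "stock market news"]
print(rank("plane crash", docs))
print(jaccard("a b c", "b c d"))
```

Jaccard ignores how often a word occurs, while cosine rewards repeated shared terms, so the two can rank documents differently.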
  • 32. Information Retrieval
      Q) If you rewrite the query – will that give you more precise results?
      A) Yes! It is called “Query Expansion”
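One simple form of query expansion adds known synonyms to the query terms before searching. The synonym table below is a hypothetical stand-in for a resource such as WordNet, and `expand_query` is an illustrative name, not a function from any search library.

```python
# Hypothetical synonym table; a real system might draw these from WordNet.
SYNONYMS = {
    "car": ["automobile"],
    "film": ["movie"],
}

def expand_query(query, synonyms=SYNONYMS):
    """Return the query terms plus their synonyms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term, *synonyms.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

print(expand_query("cheap car"))
```

With the expanded terms, a document that only says "automobile" can still match a query for "car".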
  • 33. Commercial Search Tools
      • Lucene
          – http://lucene.apache.org/
      • ElasticSearch
          – https://www.elastic.co/
      Underlying technology in most of these is the same, with some variations
      Meetup about this topic scheduled for early 2016
  • 35. Question Answering - Closed
      Input Data Source
      Questions:
          What event happened?
          When did the event happen?
          Why did the event happen?
          How long was the event?
          How did the event happen?
  • 41. Types of Text Summarization
      • Keyword Summaries
          – Extract significant keywords from text
          – Easy to implement
          – Hard for end user to understand
  • 42. Types of Text Summarization
      • Sentence/Phrase Extraction
          – Extract relevant sentences
          – Medium-Hard to implement
          – Easy for end user to understand
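The first two summary types can be sketched with word frequencies alone. The scoring rule (average frequency of a sentence's non-stopword terms), the stopword list, and the function names are assumptions chosen for brevity; they are one common baseline, not the approach from the talk.

```python
import re
from collections import Counter

# Small illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "was", "it", "on"}

def content_words(text):
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def keywords(text, k=3):
    """Keyword summary: the k most frequent content words."""
    return [w for w, _ in Counter(content_words(text)).most_common(k)]

def summarize(text, n=1):
    """Sentence extraction: keep the n sentences with the densest frequent words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n]
    # Return the chosen sentences in their original order.
    return [s for s in sentences if s in top]

text = "Cats chase mice. Cats sleep. Dogs bark."
print(keywords(text, 1))
print(summarize(text, 1))
```

The keyword list is short but hard to read as a summary, while the extracted sentence is readable as is, which mirrors the trade-off described on the slides.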
  • 43. Types of Text Summarization
      • Natural Language Understanding and Generation
          – Understand meaning of text
          – Generate sentences from meaning of original text
          – Hard to implement
          – Easy for end user
      President of University of Missouri resigned after graduate student hunger strike and class cancellations by faculty
  • 46. Why is MT Hard?
      • It is not a 1 to 1 translation
          – In the previous example 4 words in English translate into 2 in Spanish
      • Grammar is different in different languages
          – SOV (Subject – Object – Verb)
              • “She him loves” (Hindi, Japanese)
          – SVO (Subject – Verb – Object)
              • “She loves him” (English, Mandarin)
  • 47. Machine Translation
      • Waygoapp
      • Instantly translates Chinese, Japanese and Korean
      • Simply point and translate
      • Offline
      http://waygoapp.com/
  • 48. LINGUISTIC NUANCES
      Back to the basics
  • 49. Example
      All the gobulins were gramzies.
      It was grimbleton.
      What are the underlined words?
      • gobulins: Noun
      • gramzies: Noun or Adjective
      • grimbleton: Noun or Adjective
  • 50. Why is the example important?
      We can get a sense of what the word means, based on how it is used in language.
  • 51. Nouns
      • E.g. cat, car, computer, tree
      • Variations:
          – Number: singular, plural
              • one car, two cars
          – Gender: masculine, feminine, neuter
          – Case: nominative, genitive, accusative, dative
  • 52. Pronouns
      • E.g. she, ourselves, mine
      • Vary in:
          – Person
          – Gender
              • his, her
          – Number
          – Case: nominative, accusative, possessive, 2nd possessive
          – Reflexive and Anaphoric Forms:
              • herself, each other
  • 53. Determiners
      • Articles
          – a, an, the
      • Demonstratives
          – this, that
  • 54. Adjectives
      • Describe Properties
          – sunny, beautiful, calm
      • Attributive and predicative properties
      • Agreement
          – in gender, number
      • Comparative and superlative forms
          – derivative and periphrastic
              • positive form
  • 55. Verbs
      • Tense: past, present, future
          – danced, dancing, will dance
      • Aspect: progressive, perfective
      • Voice: active, passive
      • Other: number, person
      • Arguments: transitive, intransitive, ditransitive
  • 56. Other POS tags
      • Adverbs
          – happily
      • Prepositions
          – of, on, in
      • Particles
          – ran a bill vs ran up a bill
  • 57. Morphological Analysis
      • Sleeps = sleep + v + 3rd Person + Singular
      • If we have a good enough grammar with all of these rules, we have a good shot at understanding syntax of language
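A rule-based analyzer in the spirit of the "Sleeps" decomposition might look like the sketch below. The mini-lexicon, the three suffix rules, and the feature labels are hypothetical and cover only regular English verb endings.

```python
# Hypothetical mini-lexicon of verb stems.
VERBS = {"sleep", "dance", "walk"}

# Suffix rules: (ending, features the ending contributes).
SUFFIX_RULES = [
    ("ing", ["V", "Progressive"]),
    ("ed", ["V", "Past"]),
    ("s", ["V", "3rd Person", "Singular"]),
]

def analyze(word):
    """Split a word into a known stem plus grammatical features, or return None."""
    w = word.lower()
    if w in VERBS:
        return [w, "V", "Base"]
    for suffix, features in SUFFIX_RULES:
        if w.endswith(suffix):
            stem = w[: -len(suffix)]
            # Also try restoring a dropped final "e" (danc + e -> dance).
            for candidate in (stem, stem + "e"):
                if candidate in VERBS:
                    return [candidate, *features]
    return None

print(" + ".join(analyze("Sleeps")))
print(analyze("danced"))
```

Words outside the lexicon, such as the nonsense word "gobulins", get no analysis, which is why coverage of the grammar and lexicon matters.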
  • 59. Automatic Taggers
      • Almost all the POS taggers use the Penn-Treebank list of tags
      • https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
          – Nouns:
              • NN (house), NNS (houses), NNP (White House), NNPS
          – Verbs:
              • VB (say), VBD (said), VBG (saying), VBN, VBP, VBZ
          – Adjectives:
              • JJ (good), JJR (better), JJS (best)
          – Adverbs: RB, RBR, RBS
          – Prepositions: IN
  • 61. POS Tagging and Parsing
      • Stanford Core NLP
          – http://nlp.stanford.edu:8080/corenlp/
      • NLTK
          – Natural Language Toolkit
          – You need to provide your own training data, and train models for NLTK to be effective
  • 62. Other Linguistic Features of Interest
      – We want to get nouns and verbs into a root form, e.g.
          • am, are, is → be
          • car, cars, car’s → car
      – Two approaches:
          • Stemming
          • Lemmatization
  • 63. Stemming and Lemmatization
      • Lemmatization
          – use of a vocabulary
          – morphological analysis of words
          – returns the base or dictionary form of a word
          – base form is known as the lemma
          – e.g. am, are, is → be
      • Stemming
          – crude heuristic process
          – chops off the ends of words in the hope of reaching the root form
          – e.g. Marked → Mark, Marker → Mark
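The contrast between the two approaches can be shown in a few lines. The lemma dictionary and the suffix list here are toy assumptions; real stemmers such as Porter's use many more rules, and real lemmatizers use a full vocabulary.

```python
# Hypothetical lemma dictionary standing in for a real vocabulary.
LEMMAS = {"am": "be", "are": "be", "is": "be", "cars": "car"}

def lemmatize(word):
    """Dictionary lookup: return the lemma if known, else the word itself."""
    w = word.lower()
    return LEMMAS.get(w, w)

def stem(word):
    """Crude heuristic: chop a known suffix if at least 3 letters remain."""
    w = word.lower()
    for suffix in ("ing", "ed", "er", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w

for word in ["Marked", "Marker", "are", "cars"]:
    print(word, "stem:", stem(word), "lemma:", lemmatize(word))
```

The stemmer maps "Marked" and "Marker" to the same string without knowing either word, but it cannot relate "are" to "be"; only the dictionary lookup can.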
  • 64. Parsing Resources
      • NLTK
          – python, low accuracy, fast
          – http://www.nltk.org/
      • Stanford Core NLP
          – java, high accuracy, slow
          – http://nlp.stanford.edu/software/corenlp.shtml
      • SpaCy
          – python, medium accuracy, fast
          – https://spacy.io/
  • 65. Other Resources: Ontologies
      • WordNet
          – groups words when they have the same meaning
          – represents hierarchical links between groups
          – E.g. car is the same thing as an automobile
      • SentiWordNet
          – WordNet + Sentiment
      • ConceptNet
          – broader relationships than WordNet
          – E.g. bread is typically found near a toaster.
      • FrameNet
          – Frames represent concepts and their associated roles
  • 67. Semantics and Word Co-locations
      • It is important to know which words occur together
          – Strong Beer vs Powerful Beer
          – Big Sister vs Large Sister
      • Two approaches have been used
          – Semantics – ontologies and word meanings
          – Statistics – word collocations and probabilities
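The statistical approach can be illustrated with pointwise mutual information (PMI) over adjacent word pairs. PMI is one standard collocation measure, chosen here as an assumption since the slide does not name a specific statistic; the toy corpus is invented.

```python
import math
from collections import Counter

def bigram_pmi(tokens):
    """Map each adjacent word pair to log2( P(pair) / (P(w1) * P(w2)) )."""
    if len(tokens) < 2:
        return {}
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1
    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p1, p2 = unigrams[w1] / n_uni, unigrams[w2] / n_uni
        pmi[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return pmi

# Toy corpus: "strong beer" occurs, "powerful beer" never does.
corpus = "strong beer strong tea powerful engine strong beer".split()
scores = bigram_pmi(corpus)
print(scores[("strong", "beer")])
print(("powerful", "beer") in scores)
```

A positive PMI means the pair occurs together more often than chance would predict; pairs that never co-occur, like "powerful beer" here, simply get no score.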
  • 68. Thank you for Listening
      rutu@ticary.com
      @RutuMulkar