600.465 - Intro to NLP - J. Eisner 1
NLP Tasks and Applications
It’s a big world out there
And everyone uses language
The NLP Research Community
 Papers
 ACL Anthology has nearly everything, free!
 Over 80,000 papers!
 Free-text searchable
 Great way to learn about current research on a topic
 New search interfaces currently available in beta
 Find recent or highly cited work; follow citations
 Used as a dataset by various projects
 Analyzing the text of the papers (e.g., parsing it)
 Extracting a graph of papers, authors, and institutions
(Who wrote what? Who works where? What cites what?)
 Google Scholar to sort by citation count / track citations
The NLP Research Community
 Conferences
 Most work in NLP is published as 9-page conference
papers with 3 double-blind reviewers.
 Papers are presented via talks, posters, videos
 Also:
 Conference short papers (5 pages)
 “Findings” papers (accepted but without presentation)
 “System demo track” / “Industry track”
 Journal papers (up to 12 pages, or unlimited)
 Main annual conferences: ACL, EMNLP, NAACL
 + EACL, AACL/IJCNLP, COLING, …; also LREC
 + journals: TACL, CL, JNLE, …
 + AI/ML journals: JAIR, JMLR, …
 + various specialized conferences and workshops
The NLP Research Community
 Conferences
 Pre-COVID, ACL had > 2000 in-person attendees
 Kind of a zoo! It used to be 200 people and few enough talks that
you could go to all of them …
 ACL 2022:
 3378 papers submitted (701 accepted = 20%)
 Accepted:
 86% “long” (9 pages)
 14% “short” (5 pages)
 + 361 “findings” published outside main conf (raising the total to 31%)
 Awards: Several papers (will be widely read)
“Tracks” (at ACL 2021)
 Machine Learning for NLP
 Interpretability & Analysis of
Models for NLP
 Resources & Evaluation
 Ethics & NLP
 Phonology, Morphology & Word
Segmentation
 Syntax: Tagging, Chunking & Parsing
 Semantics: Lexical
 Semantics: Sentence-level
Semantics, Textual Inference &
Other areas
 Linguistic Theories, Cognitive
Modeling & Psycholinguistics
 Information Extraction
 Information Retrieval & Text Mining
 Question Answering
 Summarization
 Machine Translation & Multilinguality
 Speech & Multimodality
 Discourse & Pragmatics
 Sentiment Analysis, Stylistic Analysis,
& Argument Mining
 Dialogue & Interactive Systems
 Language Grounding to Vision,
Robotics & Beyond
 Computational Social Science &
Cultural Analytics
 NLP Applications
 Special theme: NLP for Social Good
used to organize reviewing and presentation for most papers
+ 23 workshops on various topics
The NLP Research Community
 Chitchat
 arXiv papers
 Twitter accounts
 NLP researchers with active accounts
(grad students, profs, industry folks)
 Official conference accounts
 “NLP Highlights” podcast
 “NLP News” newsletter
The NLP Research Community
 Institutions
 Universities: Many have 2+ NLP faculty
 Several “big players” with many faculty
 Some of them also have good linguistics,
cognitive science, machine learning, AI
 Companies:
 Old days: AT&T Bell Labs, IBM
 Now: Microsoft Research, Google Brain/DeepMind,
FAIR, Amazon, startups …
 Nonprofits: AI2, HuggingFace, TTIC, …
 Many niche markets – online reviews, medical transcription,
news summarization, legal search and discovery …
The NLP Research Community
 Standard tasks
 If you want people to work on your problem, make it
easy for them to get started and to measure their
progress. Provide:
 Test data, for evaluating the final systems
 Development data (= validation data) for measuring whether a
change to the system helps, and for tuning parameters
 An evaluation metric (formula for measuring how well a system
does on the dev and test data)
 “If you can’t measure it, you can’t improve it” – Peter Drucker
 A program for computing the evaluation metric
 Labeled training data and other data resources
 A prize? – with clear rules on what data can be used
 NLP-progress repo tracks the “state of the art” (SoTA)
 Workshops sometimes create new “shared tasks”
The NLP Research Community
 Software
 Lots of people distribute code for these tasks
 Search github – fun to download packages and play around!
 Or you can email a paper’s authors to ask for their code
 Some lists of software, but no central site
 PapersWithCode.com
 Search for “awesome NLP” for some lists
 Toolkits and end-to-end pipelines for text analysis
 Hugging Face – > 20,000 models, > 2,000 datasets
 Large pretrained models: pip install transformers (quick tour)
 Task-specific models: pip install allennlp, etc.
 Allen NLP (Python), Spacy (Cython), UDPipe (C++),
Stanza (Python), CoreNLP (Java), NLTK (Python)
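For example, here is a minimal sketch of the transformers “quick tour” usage mentioned above; each pipeline() call downloads a small default model on first use.

# Minimal sketch of the Hugging Face `transformers` pipeline API
# (pip install transformers; the first call downloads a default model).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")            # text classification
print(classifier("This lecture on NLP tasks is great!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998...}]

ner = pipeline("ner", aggregation_strategy="simple")   # named-entity recognition
print(ner("United Airlines is based in Chicago."))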
The NLP Research Community
 Software
 To find good or popular tools:
 Identify current papers, ask around, search the web
 Trying to identify the best tool for your job:
 Produces appropriate, sufficiently detailed output?
 Accurate? (on the measure you care about)
 Robust? (accurate on your data, not just theirs)
 Fast?
 Easy and flexible to use? Nice file formats, command line
options, visualization?
 Trainable for new data and languages? How slow is training?
 Open-source and easy to extend?
The NLP Research Community
 Datasets
 Raw text or speech corpora
 Or just their n-gram counts, for super-big corpora
 Various languages and genres
 Usually there’s some metadata (each document’s date, author, etc.)
 Sometimes there are licensing restrictions (proprietary or copyright data)
 Text or speech with manual or automatic annotations
 What kind of annotations? That’s the rest of this lecture …
 May include translations into other languages
 Words and their relationships
 phonological, morphological, semantic, translational, evolutionary
 Grammars
 World Atlas of Linguistic Structures
 Parameters of statistical models (e.g., grammar weights)
The NLP Research Community
 Datasets
 Read papers to find out what datasets others are using
 Linguistic Data Consortium (searchable) hosts many large datasets
 Many projects and competitions post data on their websites
 But sometimes you have to email the author for a copy
 CORPORA mailing list is also a good place to ask around
 LREC Conference publishes papers about new datasets & metrics
 Pay humans to annotate your data
 Or to correct the annotations produced by your initial system
 Old task, new domain: Annotate parses etc. on your kind of data
 New task: Annotate something new that you want your system to find
 Auxiliary task: Annotate something new that your system may benefit
from finding (e.g., annotate subjunctive mood to improve translation)
 Can you make annotation so much fun or so worthwhile
that they’ll do it for free?
Crowdsourcing
 E.g., Mechanical Turk
 Pay humans small amounts of money to do
“Human Intelligence Tasks” (HITs)
 If you post it, they will come …
 … although they’ll go away again if other
people’s tasks are more pleasant / better paid
600.465 - Intro to NLP - J. Eisner 13
Crowdsourcing
 Callison-Burch (2010), section 5.3:
 Did the Italian → English MT work well?
 Let’s use humans to evaluate, but how?
 English-speaking Turkers read the English news article,
and answer reading comprehension questions.
 If they answer correctly, the translation was good.
 Who grades the answers?
 Who writes the questions and sample answers?
 Who translates the questions/answers to English?
 Who filters out bad or redundant questions?
 (Answer to all of the above: other Turkers!)
 Pro tip: GPT-3 is also good at the kind of well-defined
small tasks that you’d trust human strangers to do
600.465 - Intro to NLP - J. Eisner 14
The NLP Research Community
 Standard data formats
 Often just simple ad hoc text-file formats
 Documented in a README; easily read with scripts
 Some standards:
 Unicode – strings in any language (see ICU toolkit)
 PCM (.wav, .aiff) – uncompressed audio
 BWF and AUP extend w/metadata; also many compressed formats
 XML – documents with embedded annotations
 Text Encoding Initiative – faithful digital representations of
printed text
 Protocol Buffers, JSON – structured data
 UIMA – “unstructured information management”; Watson uses it
 Standoff markup: raw text in one file, annotations in
other files (“noun phrase from bytes 378–392”)
 Annotations can be independently contributed & distributed
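To illustrate standoff markup concretely, here is a small hypothetical sketch (the file name and annotation format are invented; the offsets-into-raw-text idea is the standard one):

# Hypothetical standoff-markup reader: the raw text lives in one file,
# and annotations refer to it only by label + character offsets.
raw = open("article.txt", encoding="utf-8").read()

annotations = [                      # e.g. loaded from a separate .ann file
    {"label": "NP", "start": 378, "end": 392},
]
for ann in annotations:
    print(ann["label"], "=", raw[ann["start"]:ann["end"]])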
The NLP Research Community
 Survey articles
 May help you get oriented in a new area
 Synthesis Lectures on Human Language Technologies
 Oxford Handbook of Computational Linguistics (2014)
 Foundations & Trends in Machine Learning
 Survey articles in journals – JAIR, CL, JMLR
 ACM Computing Surveys?
 Search online for tutorial papers and survey papers
 Slides from tutorials at conferences
 Textbooks
 Dissertation literature review chapters
To Write A Typical Paper
 Need some of these ingredients:
  A domain of inquiry (a scientific or engineering question)
  A task (input & output representations, evaluation metrics*)
  Resources (corpora, annotations, dictionaries, …)
  A method for training & testing (perhaps derived from a model?)
  An algorithm
  Analysis of results (comparison to baselines & other systems, significance testing, learning curves, ablation analysis, error analysis)
 There are other kinds of papers too: theoretical papers on formal grammars and their properties, new error metrics, new tasks or resources, etc.
*Evaluation is harder for unsupervised learning or text generation. There’s not a single right answer …
Supervised Learning Methods
 Easy to build a “yes” or “no” predictor from supervised training data
 Plenty of software packages to do the learning & prediction
 Lots of people in NLP never go beyond this!
 Similarly, easy to build a system that chooses from a small finite set
 Basically the same deal
 But runtime goes up linearly with the size of the set
 Harder to predict the best string or tree (set is exponentially large or infinite)
 Turn it into a sequence of predictions (words, tags, structure-building actions)
 Tagging / finite-state / parsing algorithms will find the best sequence for you
 Might also find k-best, or k-random (sample), or sum over all possible outputs
 An algorithm for any of these can be turned into a learning algorithm
 If the predictions are independent or interact only locally, algorithms can use
dynamic programming
 If they interact non-locally, you need an approximate method like beam search
 Other approximate methods include variational inference, belief propagation, cutset
conditioning, simulated annealing, MCMC, genetic algorithms, learning to search, …
600.465 - Intro to NLP - J. Eisner 19
Text Annotation Tasks
1. Classify the entire document
(“text categorization”)
600.465 - Intro to NLP - J. Eisner 20
Sentiment classification (example from amazon.com, thanks to Delip Rao)
What features of the text could help predict # of stars? (e.g., using a log-linear model) How to identify more? Are the features hard to compute? (syntax? sarcasm?)
600.465 - Intro to NLP - J. Eisner 21
Other text categorization tasks
 Is it spam? (see features)
 What medical billing code for this visit?
 What grade, as an answer to this essay question?
 Is it interesting to this user?
 News filtering; helpdesk routing
 Is it interesting to this NLP program?
 Skill classification for a digital assistant!
 If it’s Spanish, translate it from Spanish
 If it’s subjective, run the sentiment classifier
 If it’s an appointment, run information extraction
 Where should it be filed?
 Which mail folder? (work, friends, junk, urgent ...)
 Yahoo! / Open Directory / digital libraries
600.465 - Intro to NLP - J. Eisner 22
Measuring Performance
 Classification accuracy: What % of
messages were classified correctly?
 Is this what we care about?
            Overall accuracy   Accuracy on spam   Accuracy on gen
System 1         95%               99.99%              90%
System 2         95%                90%                99.99%
 Which system do you prefer?
600.465 - Intro to NLP - J. Eisner 23
Measuring Performance
 Precision = (good messages kept) / (all messages kept)
 Recall = (good messages kept) / (all good messages)
Move from high precision to high recall by deleting fewer messages (delete only if spamminess > high threshold)
[Figure: precision vs. recall curve for good (non-spam) email; both axes run 0–100%]
600.465 - Intro to NLP - J. Eisner 24
Measuring Performance
[Figure: precision vs. recall curve for good (non-spam) email]
 low threshold: keep all the good stuff, but a lot of the bad too (OK for spam filtering and legal search)
 high threshold: all we keep is good, but we don’t keep much (OK for search engines, whose users only want the top 10)
 we would prefer to be in the upper-right corner: high precision and high recall!
 the point where precision = recall is occasionally reported
600.465 - Intro to NLP - J. Eisner 25
Measuring Performance
 Precision = (good messages kept) / (all messages kept)
 Recall = (good messages kept) / (all good messages)
 F-measure = ( (precision⁻¹ + recall⁻¹) / 2 )⁻¹, the harmonic mean of precision and recall
Move from high precision to high recall by deleting fewer messages (raise the threshold)
 Conventional to tune system and threshold to optimize F-measure on dev data
 But it’s more informative to report the whole curve
 Since in real life, the user should be able to pick a tradeoff point they like
[Figure: the same precision vs. recall curve, plus another system’s curve crossing it: better for some users, worse for others (can’t tell just by comparing F-measures)]
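A sketch of these three formulas in code (the message counts here are made up):

def prf(good_kept, all_kept, all_good):
    precision = good_kept / all_kept
    recall = good_kept / all_good
    f = 2 / (1 / precision + 1 / recall)   # harmonic mean, as above
    return precision, recall, f

print(prf(good_kept=90, all_kept=100, all_good=120))
# (0.9, 0.75, 0.8181...)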
More than 2 classes
 Report F-measure for each class
 Show a confusion matrix
600.465 - Intro to NLP - J. Eisner 26
[Confusion matrix: rows = true class, columns = predicted class; image from https://www.analyticsvidhya.com/blog/2020/04/confusion-matrix-machine-learning/]
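For instance, scikit-learn can compute the confusion matrix and per-class F-measures directly (a sketch; the class names and predictions are made up):

from sklearn.metrics import confusion_matrix, f1_score

y_true = ["spam", "gen", "gen", "promo", "spam", "gen"]
y_pred = ["spam", "gen", "promo", "promo", "gen", "gen"]
labels = ["spam", "gen", "promo"]

print(confusion_matrix(y_true, y_pred, labels=labels))        # rows = true class
print(f1_score(y_true, y_pred, labels=labels, average=None))  # per-class F1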
Supervised Learning Methods
600.465 - Intro to NLP - J. Eisner 29
Text Annotation Tasks
1. Classify the entire document
2. Classify individual word tokens
slides courtesy of D. Yarowsky
Word sense disambiguation (WSD): model p(class | token in context)
 Build a special classifier just for tokens of “plant”
 WSD for “sentence”: build a special classifier just for tokens of “sentence”
 [several further example slides, each disambiguating another ambiguous word]
600.465 - Intro to NLP - J. Eisner 37
What features? Example: “word to left”
slide courtesy of D. Yarowsky (modified)
Spelling correction using an n-gram language model (n ≥ 2) would use words to left and right to help predict the true word. Similarly, an HMM would predict a word’s class using classes to left and right. But we’d like to throw in all kinds of other features, too …
600.465 - Intro to NLP - J. Eisner 38
Feature Templates
slide courtesy of D. Yarowsky (modified)
generates a whole bunch of features – use data to find out which ones work best
600.465 - Intro to NLP - J. Eisner 39
Feature Templates
slide courtesy of D. Yarowsky (modified)
merged ranking of all features of all these types
This feature is relatively weak, but weak features are still useful, especially since very few features will fire in a given context.
600.465 - Intro to NLP - J. Eisner 40
Final decision list for lead (abbreviated)
slide courtesy of D. Yarowsky (modified)
List of all features, ranked by their weight. (These weights are for a simple “decision list” model where the single highest-weighted feature that fires gets to make the decision all by itself. However, a log-linear model, which adds up the weights of all features that fire, would be roughly similar.)
600.465 - Intro to NLP - J. Eisner 41
Text Annotation Tasks
1. Classify the entire document
2. Classify individual word tokens
3. Identify phrases (“chunking”)
Slide from Jim Martin 42
Named Entity Recognition
CHICAGO (AP) — Citing high fuel prices, United Airlines said Friday
it has increased fares by $6 per round trip on flights to some cities
also served by lower-cost carriers. American Airlines, a unit of AMR,
immediately matched the move, spokesman Tim Wagner said.
United, a unit of UAL, said the increase took effect Thursday night
and applies to most routes where it competes against discount
carriers, such as Chicago to Dallas and Atlanta and Denver to San
Francisco, Los Angeles and New York.
Slide from Jim Martin 43
NE Types
Identifying phrases (chunking)
 Phrases that are useful for information extraction:
 Named entities
 As on previous slides
 Relationship phrases
 “said”, “according to”, …
 “was born in”, “hails from”, …
 “bought”, “hopes to acquire”, “formed a joint agreement with”, …
 Simple syntactic chunks (e.g., non-recursive NPs)
 “Syntactic chunking” sometimes done before (or instead of) parsing
 Also, “segmentation”: divide Chinese text into words (no spaces)
 So, how do we learn to mark phrases?
 Earlier, we built an FST to mark dates by inserting brackets
 But, it’s common to set this up as a tagging problem …
45
Reduce to a tagging problem …
• The IOB encoding (Ramshaw & Marcus 1995):
 B_X = “beginning” (first word of an X)
 I_X = “inside” (non-first word of an X)
 O = “outside” (not in any phrase)
 Does not allow overlapping or recursive phrases
…United Airlines said Friday it has increased …
B_ORG I_ORG O O O O O
… the move , spokesman Tim Wagner said …
O O O O B_PER I_PER O
Slide adapted from Chris Brew
What if this were tagged as B_ORG instead?
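A sketch of decoding an IOB tag sequence back into phrase spans, using the example above:

def iob_to_spans(words, tags):
    spans, cur = [], None                     # cur = [label, start, end)
    for i, tag in enumerate(tags):
        if tag.startswith("B_"):              # first word of a phrase
            if cur: spans.append(cur)
            cur = [tag[2:], i, i + 1]
        elif tag.startswith("I_") and cur and cur[0] == tag[2:]:
            cur[2] = i + 1                    # extend the current phrase
        else:                                 # O (or an ill-formed I_X)
            if cur: spans.append(cur)
            cur = None
    if cur: spans.append(cur)
    return [(label, " ".join(words[s:e])) for label, s, e in spans]

words = "United Airlines said Friday it has increased".split()
tags = ["B_ORG", "I_ORG", "O", "O", "O", "O", "O"]
print(iob_to_spans(words, tags))              # [('ORG', 'United Airlines')]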
Slide adapted from Jim Martin 46
Attributes and Feature Templates for NER
POS tags and chunks come from earlier processing; now predict the NER tag sequence.
A feature of this tag sequence might give a positive or negative weight to this B_ORG in conjunction with some subset of the nearby attributes. Or even faraway attributes: B_ORG is more likely in a sentence with a spokesman!
Alternative: CRF Chunker
 Log-linear model of p(structure | sentence):
 CRF tagging takes O(n) time (“Markov property”)
 Score each possible tag or tag bigram at each position
 Then use dynamic programming to find best overall tagging
 CRF chunking takes O(n²) time (“semi-Markov”)
 Score each possible chunk, e.g., using a BiRNN-CRF
 Then use dynamic programming to find best overall chunking
 Forward algorithm:
 for all j: for all i < j: for all labels L:
α(j) += α(i) * score(possible chunk from i to j with label L)
 CRF parsing takes O(n³) time
 Score each possible rule at each position
 Then use dynamic programming to find best overall parse
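Here is the semi-Markov forward recursion above as a runnable sketch; score(i, j, L) is a stand-in for a learned chunk scorer (e.g., from a BiRNN-CRF), and the returned total is the partition function used for normalization.

def forward(n, labels, score):
    # alpha[j] = total score of all ways to chunk words 0..j-1
    alpha = [1.0] + [0.0] * n
    for j in range(1, n + 1):
        for i in range(j):                 # candidate chunk covers words i..j-1
            for L in labels:
                alpha[j] += alpha[i] * score(i, j, L)
    return alpha[n]                        # partition function Z

Z = forward(3, ["NP", "O"], lambda i, j, L: 0.5)   # toy scorer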
600.465 - Intro to NLP - J. Eisner 48
Text Annotation Tasks
1. Classify the entire document
2. Classify individual word tokens
3. Identify phrases (“chunking”)
4. Syntactic annotation (parsing)
Parser Evaluation Metrics
 Runtime
 Exact match
 Is the parse 100% correct?
 Labeled precision, recall, F-measure of constituents
 Precision: You predicted (NP,5,8); was it right?
 Recall: (NP,5,8) was right; did you predict it?
 Easier versions:
 Unlabeled: Don’t worry about getting (NP,5,8) right, only (5,8)
 Short sentences: Only test on sentences of ≤ 15, ≤ 40, ≤ 100 words
 Dependency parsing: Labeled and unlabeled attachment accuracy
 Crossing brackets
 You predicted (…,5,8), but there was really a constituent (…,6,10)
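A sketch of the labeled precision/recall computation above, with each constituent represented as a (label, start, end) triple like (NP,5,8):

def constituent_prf(gold, predicted):      # both are sets of triples
    correct = len(gold & predicted)
    p = correct / len(predicted)
    r = correct / len(gold)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = {("NP", 5, 8), ("VP", 2, 8)}
predicted = {("NP", 5, 8), ("NP", 6, 10)}
print(constituent_prf(gold, predicted))    # (0.5, 0.5, 0.5)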
Labeled Dependency Parsing (slide adapted from Yuji Matsumoto)
Raw sentence:
He reckons the current account deficit will narrow to only 1.8 billion in September.
↓ Part-of-speech tagging
POS-tagged sentence:
He/PRP reckons/VBZ the/DT current/JJ account/NN deficit/NN will/MD narrow/VB to/TO only/RB 1.8/CD billion/CD in/IN September/NNP ./.
↓ Word dependency parsing
Word dependency parsed sentence:
[Figure: dependency tree over the sentence, with labeled arcs such as SUBJ, ROOT, S-COMP, SPEC, MOD, COMP]
Dependency Trees 1. Assign heads
[Figure: phrase-structure parse of “The plan to swallow Wanda has been thrilling Otto,” with each node annotated by its head word: [head=plan], [head=swallow], [head=thrill], …]
Dependency Trees 2. Each word is the head of a whole connected subgraph
Dependency Trees 3. Just look at which words are related
Dependency Trees 4. Optionally flatten the drawing
The plan to swallow Wanda has been thrilling Otto.
 Shows which words modify (“depend on”) another word
 Each subtree of the dependency tree is still a constituent
 But not all of the original constituents are subtrees (e.g., VP)
 Easy to spot semantic relations (“who did what to whom?”)
 Good source of syntactic features for other tasks
 Easy to annotate (high agreement)
 Easy to evaluate (what % of words have correct parent?)
 Data available in more languages (Universal Dependencies)
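The evaluation in the next-to-last bullet above is essentially one line of code (heads are parent indices per word, with 0 for the root):

def uas(gold_heads, pred_heads):
    # unlabeled attachment score: % of words with the correct parent
    return sum(g == p for g, p in zip(gold_heads, pred_heads)) / len(gold_heads)

print(uas([2, 0, 2], [2, 0, 1]))   # 2 of 3 parents correct = 0.666...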
600.465 - Intro to NLP - J. Eisner 56
Text Annotation Tasks
1. Classify the entire document
2. Classify individual word tokens
3. Identify phrases (“chunking”)
4. Syntactic annotation (parsing)
5. Semantic annotation
57
Semantic Role Labeling (SRL)
• For each predicate (e.g., verb)
1. find its arguments (e.g., NPs)
2. determine their semantic roles
John drove Mary from Austin to Dallas in his Toyota Prius.
The hammer broke the window.
– agent: Actor of an action
– patient: Entity affected by the action
– source: Origin of the affected entity
– destination: Destination of the affected entity
– instrument: Tool used in performing action.
– beneficiary: Entity for whom action is performed
Slide thanks to Ray Mooney (modified)
58
Might be helped by syntactic parsing first …
• Consider one verb at a time: “bit”
• Classify the role (if any) of each of the 3 NPs
[Figure: parse tree of a sentence built from “the big dog,” “the girl,” “the boy,” and the verb “bit”; a color code marks each NP’s candidate role: not-a-role, agent, patient, source, destination, instrument, beneficiary]
Slide thanks to Ray Mooney (modified)
59
Parse tree paths as classification features
[Figure: the same parse tree, highlighting the path from the verb up to S and down to the subject NP]
Path feature is V ↑ VP ↑ S ↓ NP, which tends to be associated with the agent role
Slide thanks to Ray Mooney (modified)
60
Parse tree paths as classification features
[Figure: the same parse tree, highlighting the path from the verb down into the PP’s object NP]
Path feature is V ↑ VP ↑ S ↓ NP ↓ PP ↓ NP, which tends to be associated with no role
Slide thanks to Ray Mooney (modified)
61
Head words as features
• Some roles prefer to be filled by certain kinds of NPs.
• This can give us useful features for classifying accurately:
– “John ate the spaghetti with chopsticks.” (instrument)
“John ate the spaghetti with meatballs.” (patient)
“John ate the spaghetti with Mary.”
• Instruments should be tools
• Patient of “eat” should be edible
– “John bought the car for $21K.” (instrument)
“John bought the car for Mary.” (beneficiary)
• Instrument of “buy” should be Money
• Beneficiaries should be animate (things with desires)
– “John drove Mary to school in the van”
“John drove the van to work with Mary.”
• What do you think?
Slide thanks to Ray Mooney (modified)
62
Semantic roles can feed into further tasks
• Find the answer to a user’s question
– “Who” questions usually want Agents
– “What” questions usually want Patients
– “How” and “with what” questions usually want Instruments
– “Where” questions frequently want Sources/Destinations.
– “For whom” questions usually want Beneficiaries
– “To whom” questions usually want Destinations
• Generate text
– Many languages have specific syntactic constructions that must or should
be used for specific semantic roles.
• Word sense disambiguation, using selectional restrictions
– The bat ate the bug. (what kind of bat? what kind of bug?)
• Agents (particularly of “eat”) should be animate – animal bat, not baseball bat
• Patients of “eat” should be edible – animal bug, not software bug
– John fired the secretary.
John fired the rifle.
Patients of fire1 are different than patients of fire2
Slide thanks to Ray Mooney (modified)
Other Current Semantic Annotation Tasks (similar to SRL)
 PropBank – coarse-grained roles of verbs
 NomBank – similar, but for nouns
 TimeBank – temporal expressions
 FrameNet – fine-grained roles of any word
 Like finding the intent (= frame) for each “trigger phrase,” and filling its slots:
Alarm(delay=“20 minutes”, content=“turn off the rice”)
where Alarm is the intent, delay and content are its slots, and “20 minutes” and “turn off the rice” are the slot fillers
FrameNet Example (slide thanks to CJ Fillmore, modified)
We avenged the insult by setting fire to his village.
“avenged” is a word/phrase that triggers the REVENGE frame:
 Avenger: we
 Injury: the insult
 Punishment: setting fire to his village
 Offender: (unexpressed in this sentence)
 Injured Party: (unexpressed in this sentence)
REVENGE FRAME: triggering words and phrases (not limited to verbs)
avenge, revenge, retaliate, get back at, pay back, get even, …
revenge, vengeance, retaliation, retribution, reprisal, …
vengeful, retaliatory, retributive; in revenge, in retaliation, …
take revenge, wreak vengeance, exact retribution, …
FrameNet Example
Slide thanks to CJ Fillmore (modified)
Slide from Chris Brew, adapted from slide by William Cohen
Information Extraction
Filling slots in a database from sub-segments of text.
As a task:
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill
Gates railed against the economic philosophy
of open-source software with Orwellian fervor,
denouncing its communal licensing as a
"cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is
made public to encourage improvement and
development by outside programmers. Gates
himself says Microsoft will gladly disclose its
crown jewels--the coveted code behind the
Windows operating system--to select
customers.
"We can be open source. We love the concept
of shared source," said Bill Veghte, a
Microsoft VP. "That's a super-important shift
for us in terms of code access."
Richard Stallman, founder of the Free
Software Foundation, countered saying…
IE fills the database:
NAME              TITLE    ORGANIZATION
Bill Gates        CEO      Microsoft
Bill Veghte       VP       Microsoft
Richard Stallman  founder  Free Soft..
Phrase Types to Identify for IE
 Closed set, e.g. U.S. states:
“He was born in Alabama…” / “The big Wyoming sky…”
 Regular set, e.g. U.S. phone numbers:
“Phone: (413) 545-1323” / “The CALD main office can be reached at 412-268-1299”
 Complex pattern, e.g. U.S. postal addresses:
“University of Arkansas, P.O. Box 140, Hope, AR 71802” / “Headquarters: 1128 Main Street, 4th Floor, Cincinnati, Ohio 45210”
 Ambiguous patterns, needing context and many sources of evidence, e.g. person names:
“…was among the six houses sold by Hope Feldman that year.” / “Pawel Opalinski, Software Engineer at WhizBang Labs.”
Slide from Chris Brew, adapted from slide by William Cohen
Example applications for IE
 Classified ads
 Restaurant reviews
 Bibliographic citations
 Appointment emails
 Legal opinions
 Papers describing clinical medical studies
 …
 Adding facts to the semantic web
The Semantic Web
 A simple scheme for representing factual
knowledge as a labeled graph
 Many information extraction tasks aim to
produce something like this
 [examples on next slides]
The Semantic Web
[Example semantic-web graphs, from programmableweb.com and en.wikipedia.org]
The Semantic Web
 A simple scheme for representing factual
knowledge as a labeled graph
 Many information extraction tasks aim to
produce something like this
 Is a labeled graph (triples) really enough?
 On the plus side: can transform k-tuples to triples (cf. Davidsonian event variable)
 On the minus side: supports facts about individuals, but no direct support for quantifiers or reasoning
Semantic Parsing
 Syntactic parsing maps from a sentence to a syntactic structure
 Semantic parsing maps from a sentence directly to a semantic structure
 Often executable code, like a SQL query
 Or perhaps a logical form like a λ-term
 Or perhaps a graph as in Abstract Meaning Representation (AMR)
Semantic Parsing: SPIDER dataset
[“easy,” “medium,” and “extra hard” text-to-SQL examples from Yu et al. (2018)]
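The actual examples are in the Yu et al. (2018) paper; to give the flavor, here is a made-up pair in the same style (question, schema, and SQL are all invented for illustration):

# A made-up SPIDER-style text-to-SQL pair (NOT from the actual dataset):
question = "How many singers are from France?"
sql = "SELECT count(*) FROM singer WHERE country = 'France'"
# A semantic parser maps the question, given the database schema, to the SQL.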
Semantic Parsing: AMR
 AMR = Abstract Meaning Representation
 Proposed as an “interlingua” for MT
 That is, encode Chinese into AMR, then
decode AMR (or Chinese+AMR) into English
 Tutorial slides with examples
600.465 - Intro to NLP - J. Eisner 78
600.465 - Intro to NLP - J. Eisner 79
Generating new text
1. Speech recognition (transcribe as text)
2. Machine translation
3. Text generation from semantics
4. Inflect, analyze, pronounce, or
transliterate words
5. Single- or multi-doc summarization
600.465 - Intro to NLP - J. Eisner 80
Deeper Information Extraction
1. Coreference resolution (within a document)
2. Entity linking (across documents)
3. Event extraction and linking
4. Knowledge base population (KBP)
5. Recognizing textual entailment (RTE)
600.465 - Intro to NLP - J. Eisner 81
User interfaces
1. Dialogue systems
 Task-oriented dialogue (specific domains)
 Chatbots (unrestricted topic, fun or useful)
 Human-computer collaboration
 Interactive teaching
2. Language teaching; writing help
3. Question answering
4. Information retrieval
600.465 - Intro to NLP - J. Eisner 82
Multimodal interfaces or modeling
1. Sign languages
2. Speech + gestures
3. Images + captions
4. Brain recordings, human reaction
times
600.465 - Intro to NLP - J. Eisner 83
Discovering Linguistic Structure
1. Decipherment
2. Word meanings
3. Grammar induction (syntax, morphology)
4. Topic modeling
5. Language evolution (historical linguistics)
6. Grounded semantics
NLP automates things that humans do well, so that they can be done
automatically on more sentences. But this slide is about language analysis
that’s hard even for humans. Computational linguistics (like comp bio, etc.)
can discover underlying patterns in large datasets: things we didn’t know!
Some of these may already be hidden in vector representations …
Engineering conventional wisdom (circa 2020)
[Figure: a “bidirectional RNN (BiRNN) tagger” reading “Every nation wants Joe to love Jill” between BOS and EOS, predicting a tag such as Verb for each word]
600.465 - Intro to NLP - J. Eisner 84
Engineering conventional wisdom (circa 2020)
85
[Figure: a Transformer seq2seq model translating “Every nation wants Joe to love Jill” into “Chaque nation veut que …”]
Engineering conventional wisdom
(circa 2020)
 Encode your input text
 Probably using a Transformer
 Or maybe a BiRNN (such as a BiLSTM)
 Let your network look at other helpful features/data
86
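A sketch of the encoding step with the transformers library (bert-base-uncased is just one common choice of pretrained encoder):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

batch = tokenizer("Every nation wants Joe to love Jill", return_tensors="pt")
with torch.no_grad():
    encoding = encoder(**batch).last_hidden_state   # (1, n_tokens, 768)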
Engineering conventional wisdom
(circa 2020)
 Encode your input text
 Feed that pretrained encoding into another
network that makes predictions
 Classify the whole text
 Classify each token (tagging)
 Locally score possible tags, n-grams, rules at each token
 Then use dynamic programming to predict a global structure
 E.g., BiRNN-CRF in HW6
 Predict a sequence of structure-building rules
 Try to find the highest-scoring sequence with beam search
87
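A sketch of the beam search in the last bullet above; next_actions, apply, and is_final are stand-ins for your model and transition system, and only the search strategy itself is shown:

import heapq

def beam_search(init_state, next_actions, apply, is_final, beam_size=5):
    beam = [(0.0, init_state)]                    # (total log-prob, state)
    while not all(is_final(s) for _, s in beam):
        candidates = []
        for logp, s in beam:
            if is_final(s):
                candidates.append((logp, s))
            else:
                for action, a_logp in next_actions(s):
                    candidates.append((logp + a_logp, apply(s, action)))
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])          # best-scoring final state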
Engineering conventional wisdom
(circa 2020)
 Encode your input text
 Feed that pretrained encoding into another
network that makes predictions
 Fine-tune the whole system end-to-end to
minimize some regularized training loss
 This includes fine-tuning the pretrained encoder
 Log-loss or some kind of differentiable error metric
 L2 regularization with early stopping
 Can also regularize by adding other problem-specific
terms (multitask learning, posterior regularization)
88
Engineering conventional wisdom
(circa 2020)
 Encode your input text
 Feed that pretrained encoding into another
network that makes predictions
 Fine-tune the whole system end-to-end to
minimize some regularized training loss
 Outer loop: Tune architecture & parameters
 Try to minimize development loss
 Keep doing manual error analysis
89
Engineering conventional wisdom
(circa 2022)
 Write some input/output examples (training set)
90
Engineering conventional wisdom
(circa 2022)
 Write some input/output examples (training set)
 Prompt a large pretrained LM as follows:
The following examples show how to …
Input: [example input 8]
Output: [example output 8]
Input: [example input 42]
Output: [example output 42]
Input: [example input 3]
Output: [example output 3]
Input: [your actual test input]
Output: [LM produces the output]
91
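A sketch of assembling such a prompt (the instruction string and examples here are made up; yours would come from your training set):

def make_prompt(instructions, examples, test_input):
    parts = [instructions]
    for x, y in examples:
        parts.append(f"Input: {x}\nOutput: {y}")
    parts.append(f"Input: {test_input}\nOutput:")  # the LM continues from here
    return "\n\n".join(parts)

prompt = make_prompt(
    "The following examples show how to negate a sentence.",
    [("It is raining.", "It is not raining."),
     ("Joe loves Jill.", "Joe does not love Jill.")],
    "Every nation wants Joe.",
)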
Engineering conventional wisdom
(circa 2022)
 Write some input/output examples (training set)
 Prompt a large pretrained LM as follows
 Learn a policy which, given an input, retrieves a
sequence of “similar” examples from training set
 Retrieve inputs with similar words? (as in IR)
 Retrieve inputs with similar vector encodings?
 Try to enforce diversity?
 Tune the example selection policy using
reinforcement learning?
92
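One simple instantiation of such a policy is nearest-neighbor retrieval over vector encodings (a sketch; the encoder that produced the vectors is a stand-in):

import numpy as np

def retrieve(test_vec, train_vecs, k=4):
    # cosine similarity of the test encoding to each training encoding
    sims = (train_vecs @ test_vec) / (
        np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(test_vec))
    return np.argsort(-sims)[:k]          # indices of the k nearest examples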
Engineering conventional wisdom
(circa 2022)
 Write some input/output examples (training set)
 Prompt a large pretrained LM as follows
 Learn a policy which, given an input, retrieves a
sequence of “similar” examples from training set
 Fine-tune the LM to do well on your chosen
prompts or randomized prompts
 Maybe only tune the top few layers
 Or tune low-rank perturbations to the neural matrices
 Alternatively, fine-tune the instructions in the prompt
93
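A sketch of the “low-rank perturbations” idea (as in adapters like LoRA), in PyTorch with made-up dimensions; the pretrained weight stays frozen while only the small A and B matrices are tuned:

import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    def __init__(self, linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.linear = linear
        self.linear.weight.requires_grad_(False)              # freeze pretrained W
        d_out, d_in = linear.weight.shape
        self.A = nn.Parameter(torch.zeros(d_out, rank))       # A starts at zero,
        self.B = nn.Parameter(torch.randn(rank, d_in) * 0.01) # so W is unchanged

    def forward(self, x):
        return self.linear(x) + x @ (self.A @ self.B).T       # W x + (A B) x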
Engineering conventional wisdom
(circa 2022)
 Write some input/output examples (training set)
 Prompt a large pretrained LM as follows
 Learn a policy which, given an input, retrieves a
sequence of “similar” examples from training set
 Fine-tune the LM to do well on your chosen
prompts or randomized prompts
 Once your big slow system is accurate, “distill” it
 Run it on lots of real inputs
 Train a small fast neural net (previous slide) to agree
with the distribution predicted by the big slow system
 Use small system; when it’s unsure, fall back to big one
94
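A sketch of the distillation objective: train the small model to match the distribution predicted by the big slow system (the logits are stand-ins for the two models’ outputs):

import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between temperature-softened teacher and student
    teacher = F.softmax(teacher_logits / T, dim=-1)
    student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * T * T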
Some Big Questions
 Neural nets are fluent at generating text, but do they
really represent and reason about the world the text
describes? Are their answers consistent? Can they
explain them?
 How can models learn effectively through interaction
with the world or with human teachers?
 What kinds of linguistic biases should we build in, and
how? Huge Transformer LMs with enormous training
sets work well, but can we find architectures that
generalize like humans from much smaller datasets?
(Or is that just pretraining + few-shot or fine-tuning?)
95
Parting advice from my Ph.D. students
600.465 - Intro to NLP - J. Eisner 96

More Related Content

Similar to Intro to NLP - Overview of NLP Research Community and Tasks

Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsAndrea Wiggins
 
Master Beginners' Workshop September 2018
Master Beginners' Workshop September 2018Master Beginners' Workshop September 2018
Master Beginners' Workshop September 2018Miguel Pardal
 
Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Miguel Pardal
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingTheodore J. LaGrow
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationJohn Doove
 
Publishing in high impact factor journals
Publishing in high impact factor journalsPublishing in high impact factor journals
Publishing in high impact factor journalsMohamed Alrshah
 
Question Focus Recognition in Question Answering Systems
Question Focus Recognition in Question  Answering Systems Question Focus Recognition in Question  Answering Systems
Question Focus Recognition in Question Answering Systems Waheeb Ahmed
 
Text-mining and Automation
Text-mining and AutomationText-mining and Automation
Text-mining and Automationbenosteen
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest linkCS, NcState
 
Analyzing Big Data's Weakest Link (hint: it might be you)
Analyzing Big Data's Weakest Link  (hint: it might be you)Analyzing Big Data's Weakest Link  (hint: it might be you)
Analyzing Big Data's Weakest Link (hint: it might be you)HPCC Systems
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesMax Irwin
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibEl Habib NFAOUI
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...PhD Assistance
 
Rasa NLU and ML Interpretability
Rasa NLU and ML InterpretabilityRasa NLU and ML Interpretability
Rasa NLU and ML Interpretabilityztopol
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the WebRinke Hoekstra
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...PhD Assistance
 
Using construction grammar in conversational systems
Using construction grammar in conversational systemsUsing construction grammar in conversational systems
Using construction grammar in conversational systemsCJ Jenkins
 

Similar to Intro to NLP - Overview of NLP Research Community and Tasks (20)

Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna Workflows
 
Master Beginners' Workshop September 2018
Master Beginners' Workshop September 2018Master Beginners' Workshop September 2018
Master Beginners' Workshop September 2018
 
Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019Master Beginners Workshop - September 2019
Master Beginners Workshop - September 2019
 
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-ProcessingAn-Exploration-of-scientific-literature-using-Natural-Language-Processing
An-Exploration-of-scientific-literature-using-Natural-Language-Processing
 
CNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundationCNI fall 2009 enhanced publications john_doove-SURFfoundation
CNI fall 2009 enhanced publications john_doove-SURFfoundation
 
Publishing in high impact factor journals
Publishing in high impact factor journalsPublishing in high impact factor journals
Publishing in high impact factor journals
 
Question Focus Recognition in Question Answering Systems
Question Focus Recognition in Question  Answering Systems Question Focus Recognition in Question  Answering Systems
Question Focus Recognition in Question Answering Systems
 
Text-mining and Automation
Text-mining and AutomationText-mining and Automation
Text-mining and Automation
 
Big Data: the weakest link
Big Data: the weakest linkBig Data: the weakest link
Big Data: the weakest link
 
ppt
pptppt
ppt
 
Analyzing Big Data's Weakest Link (hint: it might be you)
Analyzing Big Data's Weakest Link  (hint: it might be you)Analyzing Big Data's Weakest Link  (hint: it might be you)
Analyzing Big Data's Weakest Link (hint: it might be you)
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
 
NLP todo
NLP todoNLP todo
NLP todo
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
 
Rasa NLU and ML Interpretability
Rasa NLU and ML InterpretabilityRasa NLU and ML Interpretability
Rasa NLU and ML Interpretability
 
Knowledge Representation on the Web
Knowledge Representation on the WebKnowledge Representation on the Web
Knowledge Representation on the Web
 
WALS and eLanguage (Leipzig)
WALS and eLanguage (Leipzig)WALS and eLanguage (Leipzig)
WALS and eLanguage (Leipzig)
 
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
Future of Natural Language Processing - Potential Lists of Topics for PhD stu...
 
Using construction grammar in conversational systems
Using construction grammar in conversational systemsUsing construction grammar in conversational systems
Using construction grammar in conversational systems
 

Recently uploaded

/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...lizamodels9
 
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfpollardmorgan
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechNewman George Leech
 
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...lizamodels9
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewasmakika9823
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...lizamodels9
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Roomdivyansh0kumar0
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth MarketingShawn Pang
 
rishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdfrishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdfmuskan1121w
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsApsara Of India
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurSuhani Kapoor
 
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckHajeJanKamps
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst SummitHolger Mueller
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...lizamodels9
 

Recently uploaded (20)

/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In.../:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
/:Call Girls In Indirapuram Ghaziabad ➥9990211544 Independent Best Escorts In...
 
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
 
Best Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting PartnershipBest Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting Partnership
 
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Old Faridabad ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdfIntro to BCG's Carbon Emissions Benchmark_vF.pdf
Intro to BCG's Carbon Emissions Benchmark_vF.pdf
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman Leech
 
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
Lowrate Call Girls In Sector 18 Noida ❤️8860477959 Escorts 100% Genuine Servi...
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
 
rishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdfrishikeshgirls.in- Rishikesh call girl.pdf
rishikeshgirls.in- Rishikesh call girl.pdf
 
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call GirlsCash Payment 9602870969 Escort Service in Udaipur Call Girls
Cash Payment 9602870969 Escort Service in Udaipur Call Girls
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
 
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In BELLMONT HOTEL ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
 

Intro to NLP - Overview of NLP Research Community and Tasks

  • 1. 600.465 - Intro to NLP - J. Eisner 1 NLP Tasks and Applications It’s a big world out there And everyone uses language
  • 2. The NLP Research Community  Papers  ACL Anthology has nearly everything, free!  Over 80,000 papers!  Free-text searchable  Great way to learn about current research on a topic  New search interfaces currently available in beta  Find recent or highly cited work; follow citations  Used as a dataset by various projects  Analyzing the text of the papers (e.g., parsing it)  Extracting a graph of papers, authors, and institutions (Who wrote what? Who works where? What cites what?)  Google Scholar to sort by citation count / track citations
  • 3. The NLP Research Community  Conferences  Most work in NLP is published as 9-page conference papers with 3 double-blind reviewers.  Papers are presented via talks, posters, videos  Also:  Conference short papers (5 pages)  “Findings” papers (accepted but without presentation)  “System demo track” / “Industry track”  Journal papers (up to 12 pages, or unlimited)  Main annual conferences: ACL, EMNLP, NAACL  + EACL, AACL/IJCNLP, COLING, …; also LREC  + journals: TACL, CL, JNLE, …  + AI/ML journals: JAIR, JMLR, …  + various specialized conferences and workshops
  • 4. The NLP Research Community  Conferences  Pre-COVID, ACL had > 2000 in-person attendees  Kind of a zoo! It used to be 200 people and few enough talks that you could go to all of them …  ACL 2022:  3378 papers submitted (701 accepted = 20%)  Accepted:  86% “long” (9 pages)  14% “short” (5 pages)  + 361 “findings” published outside main conf ( total 31%)  Awards: Several papers (will be widely read)
  • 5. “Tracks” (at ACL 2021)  Machine Learning for NLP  Interpretability & Analysis of Models for NLP  Resources & Evaluation  Ethics & NLP  Phonology, Morphology & Word Segmentation  Syntax: Tagging, Chunking & Parsing  Semantics: Lexical  Semantics: Sentence-level Semantics, Textual Inference & Other areas  Linguistic Theories, Cognitive Modeling & Psycholinguistics  Information Extraction  Information Retrieval & Text Mining  Question Answering  Summarization  Machine Translation & Multilinguality  Speech & Multimodality  Discourse & Pragmatics  Sentiment Analysis, Stylistic Analysis, & Argument Mining  Dialogue & Interactive Systems  Language Grounding to Vision, Robotics & Beyond  Computational Social Science & Cultural Analytics  NLP Applications  Special theme: NLP for Social Good used to organize reviewing and presentation for most papers + 23 workshops on various topics
  • 6. The NLP Research Community  Chitchat  arXiv papers  Twitter accounts  NLP researchers with active accounts (grad students, profs, industry folks)  Official conference accounts  “NLP Highlights” podcast  “NLP News” newsletter
  • 7. The NLP Research Community  Institutions  Universities: Many have 2+ NLP faculty  Several “big players” with many faculty  Some of them also have good linguistics, cognitive science, machine learning, AI  Companies:  Old days: AT&T Bell Labs, IBM  Now: Microsoft Research, Google Brain/DeepMind, FAIR, Amazon, startups …  Nonprofits: AI2, HuggingFace, TTIC, …  Many niche markets – online reviews, medical transcription, news summarization, legal search and discovery …
  • 8. The NLP Research Community  Standard tasks  If you want people to work on your problem, make it easy for them to get started and to measure their progress. Provide:  Test data, for evaluating the final systems  Development data (= validation data) for measuring whether a change to the system helps, and for tuning parameters  An evaluation metric (formula for measuring how well a system does on the dev and test data)  “If you can’t measure it, you can’t improve it” – Peter Drucker  A program for computing the evaluation metric  Labeled training data and other data resources  A prize? – with clear rules on what data can be used  NLP-progress repo tracks the “state of the art” (SoTA)  Workshops sometimes create new “shared tasks”
  • 9. The NLP Research Community
 Software
 Lots of people distribute code for these tasks
 Search GitHub – fun to download packages and play around!
 Or you can email a paper’s authors to ask for their code
 Some lists of software, but no central site
 PapersWithCode.com
 Search for “awesome NLP” for some lists
 Toolkits and end-to-end pipelines for text analysis
 Hugging Face – > 20,000 models, > 2,000 datasets
 Large pretrained models: pip install transformers (quick tour; see the sketch below)
 Task-specific models: pip install allennlp, etc.
 AllenNLP (Python), spaCy (Cython), UDPipe (C++), Stanza (Python), CoreNLP (Java), NLTK (Python)
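As a quick illustration of those toolkits, here is a minimal sketch of the Hugging Face transformers pipeline API; the model it downloads is just the library's default, so treat the exact output as an assumption:

    # A minimal sketch of the Hugging Face `transformers` pipeline API.
    # Assumes `pip install transformers` plus a backend such as PyTorch.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
    print(classifier("This course is a great way to learn NLP!"))
    # e.g., [{'label': 'POSITIVE', 'score': 0.99...}]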
  • 10. The NLP Research Community
 Software
 To find good or popular tools:
 Identify current papers, ask around, search the web
 Trying to identify the best tool for your job:
 Produces appropriate, sufficiently detailed output?
 Accurate? (on the measure you care about)
 Robust? (accurate on your data, not just theirs)
 Fast?
 Easy and flexible to use? Nice file formats, command-line options, visualization?
 Trainable for new data and languages? How slow is training?
 Open-source and easy to extend?
  • 11. The NLP Research Community
 Datasets
 Raw text or speech corpora
 Or just their n-gram counts, for super-big corpora
 Various languages and genres
 Usually there’s some metadata (each document’s date, author, etc.)
 Sometimes licensing restrictions (proprietary or copyrighted data)
 Text or speech with manual or automatic annotations
 What kind of annotations? That’s the rest of this lecture …
 May include translations into other languages
 Words and their relationships
 phonological, morphological, semantic, translational, evolutionary
 Grammars
 World Atlas of Language Structures
 Parameters of statistical models (e.g., grammar weights)
  • 12. The NLP Research Community
 Datasets
 Read papers to find out what datasets others are using
 Linguistic Data Consortium (searchable) hosts many large datasets
 Many projects and competitions post data on their websites
 But sometimes you have to email the author for a copy
 The CORPORA mailing list is also a good place to ask around
 The LREC conference publishes papers about new datasets & metrics
 Pay humans to annotate your data
 Or to correct the annotations produced by your initial system
 Old task, new domain: annotate parses etc. on your kind of data
 New task: annotate something new that you want your system to find
 Auxiliary task: annotate something new that your system may benefit from finding (e.g., annotate subjunctive mood to improve translation)
 Can you make annotation so much fun or so worthwhile that they’ll do it for free?
  • 13. Crowdsourcing
 E.g., Mechanical Turk
 Pay humans small amounts of money to do “Human Intelligence Tasks” (HITs)
 If you post it, they will come …
 … although they’ll go away again if other people’s tasks are more pleasant / better paid
  • 14. Crowdsourcing
 Callison-Burch (2010), section 5.3:
 Did the Italian → English MT work well? Let’s use humans to evaluate, but how?
 English-speaking Turkers read the English news article and answer reading comprehension questions. If they answer correctly, the translation was good.
 Who grades the answers?
 Who writes the questions and sample answers?
 Who translates the questions/answers to English?
 Who filters out bad or redundant questions?
 (Answer to each, per the slide: other Turkers.)
 Pro tip: GPT-3 is also good at the kind of well-defined small tasks that you’d trust human strangers to do
  • 15. The NLP Research Community
 Standard data formats
 Often just simple ad hoc text-file formats
 Documented in a README; easily read with scripts
 Some standards:
 Unicode – strings in any language (see the ICU toolkit)
 PCM (.wav, .aiff) – uncompressed audio
 BWF and AUP extend it with metadata; there are also many compressed formats
 XML – documents with embedded annotations
 Text Encoding Initiative – faithful digital representations of printed text
 Protocol Buffers, JSON – structured data
 UIMA – “unstructured information management”; Watson uses it
 Standoff markup: raw text in one file, annotations in other files (“noun phrase from bytes 378–392”); a small sketch follows
 Annotations can be independently contributed & distributed
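To make the standoff idea concrete, here is a minimal sketch; the JSON schema and the offsets are invented for illustration:

    import json

    # Raw text lives in its own file/string and is never modified.
    text = "United Airlines said Friday it has increased fares."

    # Annotations live in a separate file and point into the text by offsets.
    annotations = [{"type": "NP", "start": 0, "end": 15}]  # "United Airlines"
    print(json.dumps(annotations))
    for a in annotations:
        print(a["type"], "=", text[a["start"]:a["end"]])  # NP = United Airlines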
  • 16. The NLP Research Community
 Survey articles
 May help you get oriented in a new area
 Synthesis Lectures on Human Language Technologies
 Oxford Handbook of Computational Linguistics (2014)
 Foundations & Trends in Machine Learning
 Survey articles in journals – JAIR, CL, JMLR
 ACM Computing Surveys?
 Search online for tutorial papers and survey papers
 Slides from tutorials at conferences
 Textbooks
 Dissertation literature review chapters
  • 17. To Write A Typical Paper
 Need some of these ingredients:
 A domain of inquiry (a scientific or engineering question)
 A task (input & output representations, evaluation metrics*)
 Resources (corpora, annotations, dictionaries, …)
 A method for training & testing
 An algorithm (derived from a model?)
 Analysis of results (comparison to baselines & other systems, significance testing, learning curves, ablation analysis, error analysis)
 There are other kinds of papers too: theoretical papers on formal grammars and their properties, new error metrics, new tasks or resources, etc.
 *Evaluation is harder for unsupervised learning or text generation. There’s not a single right answer …
  • 18. Supervised Learning Methods
 Easy to build a “yes” or “no” predictor from supervised training data
 Plenty of software packages to do the learning & prediction
 Lots of people in NLP never go beyond this
 Similarly, easy to build a system that chooses from a small finite set
 Basically the same deal
 But runtime goes up linearly with the size of the set
 Harder to predict the best string or tree (the set is exponentially large or infinite)
 Turn it into a sequence of predictions (words, tags, structure-building actions)
 Tagging / finite-state / parsing algorithms will find the best sequence for you
 Might also find the k-best, or k-random (samples), or the sum over all possible outputs
 An algorithm for any of these can be turned into a learning algorithm
 If the predictions are independent or interact only locally, algorithms can use dynamic programming
 If they interact non-locally, you need an approximate method like beam search (a minimal sketch follows)
 Other approximate methods include variational inference, belief propagation, cutset conditioning, simulated annealing, MCMC, genetic algorithms, learning to search, …
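Here is a minimal beam-search sketch; the vocabulary and the stand-in scoring function are invented, where a real system would query a trained model:

    # Keep only the `beam_size` best partial outputs after each prediction step.
    VOCAB = ["the", "dog", "barks", "<eos>"]

    def score(prefix, word):
        # Stand-in local score; a real system would use a model's log-probability.
        return -0.1 * len(word) - 0.01 * len(prefix)

    def beam_search(beam_size=3, max_len=5):
        beams = [([], 0.0)]  # (partial output, total score so far)
        for _ in range(max_len):
            candidates = []
            for prefix, total in beams:
                if prefix and prefix[-1] == "<eos>":
                    candidates.append((prefix, total))  # finished hypothesis
                    continue
                for w in VOCAB:
                    candidates.append((prefix + [w], total + score(prefix, w)))
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        return beams[0]

    print(beam_search())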
  • 19. Text Annotation Tasks
 1. Classify the entire document (“text categorization”)
  • 20. Sentiment classification
 Example from amazon.com, thanks to Delip Rao
 What features of the text could help predict the # of stars? (e.g., using a log-linear model)
 How would you identify more features? Are the features hard to compute? (syntax? sarcasm?)
  • 21. Other text categorization tasks
 Is it spam? (see features)
 What medical billing code for this visit?
 What grade, as an answer to this essay question?
 Is it interesting to this user?
 News filtering; helpdesk routing
 Is it interesting to this NLP program?
 Skill classification for a digital assistant!
 If it’s Spanish, translate it from Spanish
 If it’s subjective, run the sentiment classifier
 If it’s an appointment, run information extraction
 Where should it be filed?
 Which mail folder? (work, friends, junk, urgent …)
 Yahoo! / Open Directory / digital libraries
  • 22. Measuring Performance
 Classification accuracy: what % of messages were classified correctly?
 Is this what we care about?

              Overall accuracy   Accuracy on spam   Accuracy on gen (non-spam)
   System 1        95%               99.99%              90%
   System 2        95%               90%                 99.99%

 Which system do you prefer?
  • 23. Measuring Performance
 Precision = good messages kept / all messages kept
 Recall = good messages kept / all good messages
 Move from high precision to high recall by deleting fewer messages (delete only if spamminess > a high threshold)
 [Figure: precision vs. recall curve for good (non-spam) email]
  • 24. Measuring Performance
 [Figure: precision vs. recall tradeoff curve for good (non-spam) email, annotated as follows]
 Low threshold: keep all the good stuff, but a lot of the bad too – OK for spam filtering and legal search (don’t want to miss anything)
 High threshold: all we keep is good, but we don’t keep much – OK for search engines (users only want the top 10)
 Would prefer to be in the upper right!
 Point where precision = recall (occasionally reported)
  • 25. Measuring Performance
 Precision = good messages kept / all messages kept
 Recall = good messages kept / all good messages
 F-measure = ((precision⁻¹ + recall⁻¹) / 2)⁻¹, i.e., the harmonic mean of precision and recall
 Move from high precision to high recall by deleting fewer messages (raise the threshold for deletion)
 Conventional to tune the system and threshold to optimize F-measure on dev data
 But it’s more informative to report the whole curve, since in real life the user should be able to pick a tradeoff point they like
 Another system may be better for some users and worse for others (can’t tell just by comparing F-measures)
 (A short code sketch of these metrics follows.)
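A minimal sketch of these three metrics in code (the function and variable names are ours):

    def precision_recall_f(kept, good):
        # `kept` = set of messages the system kept; `good` = set of good messages.
        true_pos = len(kept & good)            # good messages kept
        precision = true_pos / len(kept)       # out of all messages kept
        recall = true_pos / len(good)          # out of all good messages
        f = 2 / (1/precision + 1/recall) if true_pos else 0.0  # harmonic mean
        return precision, recall, f

    print(precision_recall_f(kept={1, 2, 3, 4}, good={2, 3, 4, 5, 6}))
    # (0.75, 0.6, 0.666...)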
  • 26. More than 2 classes
 Report the F-measure for each class
 Show a confusion matrix (predicted class vs. true class)
 Image from https://www.analyticsvidhya.com/blog/2020/04/confusion-matrix-machine-learning/
  • 28. Text Annotation Tasks
 1. Classify the entire document
 2. Classify individual word tokens
  • 29. Word sense disambiguation (WSD): p(class | token in context)
 Build a special classifier just for tokens of “plant”
 Slide courtesy of D. Yarowsky
  • 30. WSD: p(class | token in context)
 Build a special classifier just for tokens of “sentence”
 Slide courtesy of D. Yarowsky
  • 31–35. [Slides courtesy of D. Yarowsky: further examples of modeling p(class | token in context) for WSD.]
  • 36. What features? Example: “word to the left”
 Spelling correction using an n-gram language model (n ≥ 2) would use the words to the left and right to help predict the true word.
 Similarly, an HMM would predict a word’s class using the classes to the left and right.
 But we’d like to throw in all kinds of other features, too …
 Slide courtesy of D. Yarowsky (modified)
  • 37. Feature Templates
 A feature template generates a whole bunch of features – use data to find out which ones work best.
 Slide courtesy of D. Yarowsky (modified)
  • 38. Feature Templates
 Merged ranking of all features of all these types.
 This feature is relatively weak, but weak features are still useful, especially since very few features will fire in a given context.
 Slide courtesy of D. Yarowsky (modified)
  • 39. Final decision list for “lead” (abbreviated)
 List of all features, ranked by their weight.
 (These weights are for a simple “decision list” model, where the single highest-weighted feature that fires gets to make the decision all by itself. However, a log-linear model, which adds up the weights of all the features that fire, would behave roughly similarly.)
 Slide courtesy of D. Yarowsky (modified)
  • 40. Text Annotation Tasks
 1. Classify the entire document
 2. Classify individual word tokens
 3. Identify phrases (“chunking”)
  • 41. Named Entity Recognition
 CHICAGO (AP) — Citing high fuel prices, United Airlines said Friday it has increased fares by $6 per round trip on flights to some cities also served by lower-cost carriers. American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said. United, a unit of UAL, said the increase took effect Thursday night and applies to most routes where it competes against discount carriers, such as Chicago to Dallas and Atlanta and Denver to San Francisco, Los Angeles and New York.
 Slide from Jim Martin
  • 42. NE Types
 [Table of named-entity types shown on slide]
 Slide from Jim Martin
  • 43. Identifying phrases (chunking)
 Phrases that are useful for information extraction:
 Named entities
 As on previous slides
 Relationship phrases
 “said”, “according to”, …
 “was born in”, “hails from”, …
 “bought”, “hopes to acquire”, “formed a joint agreement with”, …
 Simple syntactic chunks (e.g., non-recursive NPs)
 “Syntactic chunking” is sometimes done before (or instead of) parsing
 Also “segmentation”: divide Chinese text into words (no spaces)
 So, how do we learn to mark phrases?
 Earlier, we built an FST to mark dates by inserting brackets
 But it’s common to set this up as a tagging problem …
  • 44. Reduce to a tagging problem …
 The IOB encoding (Ramshaw & Marcus 1995):
 B_X = “beginning” (first word of an X)
 I_X = “inside” (non-first word of an X)
 O = “outside” (not in any phrase)
 Does not allow overlapping or recursive phrases
 … United/B_ORG Airlines/I_ORG said/O Friday/O it/O has/O increased/O …
 … the/O move/O ,/O spokesman/O Tim/B_PER Wagner/I_PER said/O …
 What if this were tagged as B_ORG instead?
 (A small conversion sketch follows.)
 Slide adapted from Chris Brew
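A small conversion sketch (our own helper): turning labeled spans into IOB tags.

    def spans_to_iob(tokens, spans):
        # spans: list of (start, end, label); end exclusive, non-overlapping
        tags = ["O"] * len(tokens)
        for start, end, label in spans:
            tags[start] = "B_" + label                 # first word of the phrase
            for i in range(start + 1, end):
                tags[i] = "I_" + label                 # non-first words
        return tags

    tokens = ["United", "Airlines", "said", "Friday", "it", "has", "increased"]
    print(list(zip(tokens, spans_to_iob(tokens, [(0, 2, "ORG")]))))
    # [('United', 'B_ORG'), ('Airlines', 'I_ORG'), ('said', 'O'), ...]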
  • 45. Attributes and Feature Templates for NER
 POS tags and chunks come from earlier processing; now predict the NER tag sequence.
 A feature of this tag sequence might give a positive or negative weight to this B_ORG in conjunction with some subset of the nearby attributes
 Or even faraway attributes: B_ORG is more likely in a sentence that contains “spokesman”!
 Slide adapted from Jim Martin
  • 46. Alternative: CRF Chunker
 A log-linear model of p(structure | sentence):
 CRF tagging takes O(n) time (“Markov property”)
 Score each possible tag or tag bigram at each position
 Then use dynamic programming to find the best overall tagging
 CRF chunking takes O(n²) time (“semi-Markov”)
 Score each possible chunk, e.g., using a BiRNN-CRF
 Then use dynamic programming to find the best overall chunking
 Forward algorithm:
   for all j: for all i < j: for all labels L:
     α(j) += α(i) · score(possible chunk from i to j with label L)
 CRF parsing takes O(n³) time
 Score each possible rule at each position
 Then use dynamic programming to find the best overall parse
 (A runnable sketch of the semi-Markov forward recurrence follows.)
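Here is a runnable sketch of that semi-Markov forward recurrence; the chunk scorer is an invented stand-in for a real model such as a BiRNN-CRF:

    import math

    LABELS = ["NP", "O"]
    n = 5  # sentence length

    def chunk_score(i, j, label):
        # Invented potential for a chunk covering tokens i..j-1 with this label.
        return math.exp(-(j - i)) * (1.2 if label == "NP" else 1.0)

    alpha = [0.0] * (n + 1)
    alpha[0] = 1.0                     # empty prefix
    for j in range(1, n + 1):          # O(n^2 * |labels|) overall
        for i in range(j):
            for L in LABELS:
                alpha[j] += alpha[i] * chunk_score(i, j, L)

    print("sum over all chunkings:", alpha[n])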
  • 47. Text Annotation Tasks
 1. Classify the entire document
 2. Classify individual word tokens
 3. Identify phrases (“chunking”)
 4. Syntactic annotation (parsing)
  • 48. Parser Evaluation Metrics
 Runtime
 Exact match: is the parse 100% correct?
 Labeled precision, recall, and F-measure of constituents
 Precision: you predicted (NP, 5, 8); was it right?
 Recall: (NP, 5, 8) was right; did you predict it?
 Easier versions:
 Unlabeled: don’t worry about getting (NP, 5, 8) right, only (5, 8)
 Short sentences: only test on sentences of ≤ 15, ≤ 40, or ≤ 100 words
 Dependency parsing: labeled and unlabeled attachment accuracy
 Crossing brackets: you predicted (…, 5, 8), but there was really a constituent (…, 6, 10)
 (A small scoring sketch follows.)
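A small scoring sketch (our own code), treating each parse as a set of (label, start, end) constituents:

    def parseval(predicted, gold):
        correct = len(predicted & gold)
        precision = correct / len(predicted)   # you predicted it; was it right?
        recall = correct / len(gold)           # it was right; did you predict it?
        f = 2 * precision * recall / (precision + recall) if correct else 0.0
        return precision, recall, f

    gold = {("S", 0, 10), ("NP", 5, 8), ("VP", 2, 10)}
    pred = {("S", 0, 10), ("NP", 5, 8), ("NP", 6, 10)}
    print(parseval(pred, gold))  # (0.667, 0.667, 0.667)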
  • 49. Labeled Dependency Parsing
 Raw sentence: He reckons the current account deficit will narrow to only 1.8 billion in September.
 Part-of-speech tagging → POS-tagged sentence: He/PRP reckons/VBZ the/DT current/JJ account/NN deficit/NN will/MD narrow/VB to/TO only/RB 1.8/CD billion/CD in/IN September/NNP ./.
 Word dependency parsing → word-dependency-parsed sentence, with labeled arcs such as SUBJ, ROOT, S-COMP, SPEC, MOD, COMP
 Slide adapted from Yuji Matsumoto
  • 54. Dependency Trees
 Example: The plan to swallow Wanda has been thrilling Otto.
 (Step 4 of the construction: optionally flatten the drawing.)
 Shows which words modify (“depend on”) another word
 Each subtree of the dependency tree is still a constituent
 But not all of the original constituents are subtrees (e.g., the VP)
 Easy to spot semantic relations (“who did what to whom?”)
 Good source of syntactic features for other tasks
 Easy to annotate (high agreement)
 Easy to evaluate (what % of words have the correct parent?)
 Data available in more languages (Universal Dependencies)
  • 55. Text Annotation Tasks
 1. Classify the entire document
 2. Classify individual word tokens
 3. Identify phrases (“chunking”)
 4. Syntactic annotation (parsing)
 5. Semantic annotation
  • 56. Semantic Role Labeling (SRL)
 For each predicate (e.g., verb):
 1. find its arguments (e.g., NPs)
 2. determine their semantic roles
 Examples: John drove Mary from Austin to Dallas in his Toyota Prius. The hammer broke the window.
 agent: actor of an action
 patient: entity affected by the action
 source: origin of the affected entity
 destination: destination of the affected entity
 instrument: tool used in performing the action
 beneficiary: entity for whom the action is performed
 Slide thanks to Ray Mooney (modified)
  • 57. Might be helped by syntactic parsing first …
 Consider one verb at a time: “bit”
 Classify the role (if any) of each of the 3 NPs
 Example sentence (reconstructed from the parse tree on the slide): “The big dog bit a girl with the boy.”
 Color code (on slide): not-a-role, agent, patient, source, destination, instrument, beneficiary
 Slide thanks to Ray Mooney (modified)
  • 58. Parse tree paths as classification features
 [Parse tree shown on slide]
 The path feature V ↑ VP ↑ S ↓ NP tends to be associated with the agent role.
 Slide thanks to Ray Mooney (modified)
  • 59. Parse tree paths as classification features
 [Same parse tree shown on slide]
 The path feature V ↑ VP ↑ S ↓ NP ↓ PP ↓ NP tends to be associated with no role.
 Slide thanks to Ray Mooney (modified)
  • 60. Head words as features
 Some roles prefer to be filled by certain kinds of NPs. This can give us useful features for classifying accurately:
 “John ate the spaghetti with chopsticks.” (instrument)
 “John ate the spaghetti with meatballs.” (patient)
 “John ate the spaghetti with Mary.”
 Instruments should be tools
 The patient of “eat” should be edible
 “John bought the car for $21K.” (instrument)
 “John bought the car for Mary.” (beneficiary)
 The instrument of “buy” should be money
 Beneficiaries should be animate (things with desires)
 “John drove Mary to school in the van.” “John drove the van to work with Mary.”
 What do you think?
 Slide thanks to Ray Mooney (modified)
  • 61. Semantic roles can feed into further tasks
 Find the answer to a user’s question:
 “Who” questions usually want Agents
 “What” questions usually want Patients
 “How” and “with what” questions usually want Instruments
 “Where” questions frequently want Sources/Destinations
 “For whom” questions usually want Beneficiaries
 “To whom” questions usually want Destinations
 Generate text:
 Many languages have specific syntactic constructions that must or should be used for specific semantic roles.
 Word sense disambiguation, using selectional restrictions:
 The bat ate the bug. (What kind of bat? What kind of bug?)
 Agents (particularly of “eat”) should be animate – an animal bat, not a baseball bat
 Patients of “eat” should be edible – an animal bug, not a software bug
 John fired the secretary. John fired the rifle. Patients of fire₁ are different from patients of fire₂.
 Slide thanks to Ray Mooney (modified)
  • 62. Other Current Semantic Annotation Tasks (similar to SRL)
 PropBank – coarse-grained roles of verbs
 NomBank – similar, but for nouns
 TimeBank – temporal expressions
 FrameNet – fine-grained roles of any word
 Like finding the intent (= frame) for each “trigger phrase” and filling its slots, e.g.:
   Alarm(delay=“20 minutes”, content=“turn off the rice”)
   (intent: Alarm; slots: delay, content; fillers: “20 minutes”, “turn off the rice”)
  • 63. FrameNet Example
 We avenged the insult by setting fire to his village.
 “Avenged” is a word/phrase that triggers the REVENGE frame:
 Avenger = we
 Offender = (unexpressed in this sentence)
 Injury = the insult
 Injured Party = (unexpressed in this sentence)
 Punishment = setting fire to his village
 Slide thanks to CJ Fillmore (modified)
  • 64. FrameNet Example
 REVENGE frame: triggering words and phrases (not limited to verbs)
 Verbs: avenge, revenge, retaliate, get back at, pay back, get even, …
 Nouns: revenge, vengeance, retaliation, retribution, reprisal, …
 Adjectives and adverbials: vengeful, retaliatory, retributive; in revenge, in retaliation, …
 Support constructions: take revenge, wreak vengeance, exact retribution, …
 Slide thanks to CJ Fillmore (modified)
  • 66. Information Extraction
 Filling slots in a database from sub-segments of text. As a task, given this news story (October 14, 2002, 4:00 a.m. PT):
 For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation. Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers. "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access." Richard Stallman, founder of the Free Software Foundation, countered saying …
 IE extracts the table:

   NAME               TITLE     ORGANIZATION
   Bill Gates         CEO       Microsoft
   Bill Veghte        VP        Microsoft
   Richard Stallman   founder   Free Soft..

 Slide from Chris Brew, adapted from slide by William Cohen
  • 67. Phrase Types to Identify for IE
 Closed set (e.g., U.S. states):
 He was born in Alabama …
 The big Wyoming sky …
 Regular set (e.g., U.S. phone numbers; a regex sketch follows this list):
 Phone: (413) 545-1323
 The CALD main office can be reached at 412-268-1299
 Complex pattern (e.g., U.S. postal addresses):
 University of Arkansas, P.O. Box 140, Hope, AR 71802
 Headquarters: 1128 Main Street, 4th Floor, Cincinnati, Ohio 45210
 Ambiguous patterns, needing context and many sources of evidence (e.g., person names):
 … was among the six houses sold by Hope Feldman that year.
 Pawel Opalinski, Software Engineer at WhizBang Labs.
 Slide from Chris Brew, adapted from slide by William Cohen
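For the “regular set” case, a regular expression really can do the job; this simple pattern (ours) covers the two formats above, though real extractors handle many more variants:

    import re

    PHONE = re.compile(r"\(?\b\d{3}\)?[-. ]?\d{3}[-.]\d{4}\b")

    text = ("Phone: (413) 545-1323. The CALD main office "
            "can be reached at 412-268-1299.")
    print(PHONE.findall(text))  # ['(413) 545-1323', '412-268-1299']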
  • 68. Example applications for IE
 Classified ads
 Restaurant reviews
 Bibliographic citations
 Appointment emails
 Legal opinions
 Papers describing clinical medical studies
 …
 Adding facts to the Semantic Web
  • 69. The Semantic Web
 A simple scheme for representing factual knowledge as a labeled graph
 Many information extraction tasks aim to produce something like this
 [examples on next slides]
  • 70. The Semantic Web
 [Example labeled graph, from programmableweb.com]
  • 71. The Semantic Web
 [Example labeled graph, from en.wikipedia.org]
  • 72. The Semantic Web
 A simple scheme for representing factual knowledge as a labeled graph
 Many information extraction tasks aim to produce something like this
 Is a labeled graph (triples) really enough?
 Pro: you can transform k-tuples to triples (cf. the Davidsonian event variable), as sketched below
 Con: it supports facts about individuals, but has no direct support for quantifiers or reasoning
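A sketch of that k-tuple-to-triples trick (all names invented): a 4-ary fact such as drove(John, Mary, Austin, Dallas) becomes triples hanging off a fresh event node, in the spirit of the Davidsonian event variable.

    # Reify a k-ary fact as triples: mint a fresh node for the event,
    # then attach each argument with its own labeled edge.
    predicate = "drove"
    args = {"agent": "John", "patient": "Mary",
            "source": "Austin", "destination": "Dallas"}

    event = "e1"  # fresh event identifier
    triples = [(event, "instanceOf", predicate)]
    triples += [(event, role, value) for role, value in args.items()]
    print(triples)
    # [('e1', 'instanceOf', 'drove'), ('e1', 'agent', 'John'), ...]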
  • 73. Semantic Parsing
 Syntactic parsing maps from a sentence to a syntactic structure
 Semantic parsing maps from a sentence directly to a semantic structure
 Often executable code, like a SQL query
 Or perhaps a logical form like a λ-term
 Or perhaps a graph, as in Abstract Meaning Representation (AMR)
  • 74. Semantic Parsing: SPIDER dataset
 [“Easy” text-to-SQL example from Yu et al. (2018)]
  • 75. Semantic Parsing: SPIDER dataset
 [“Medium” text-to-SQL example from Yu et al. (2018)]
  • 76. Semantic Parsing: SPIDER dataset
 [“Extra hard” text-to-SQL example from Yu et al. (2018)]
  • 77. Semantic Parsing: AMR
 AMR = Abstract Meaning Representation
 Proposed as an “interlingua” for MT
 That is, encode Chinese into AMR, then decode AMR (or Chinese + AMR) into English
 Tutorial slides with examples
  • 78. Generating new text
 1. Speech recognition (transcribe as text)
 2. Machine translation
 3. Text generation from semantics
 4. Inflect, analyze, pronounce, or transliterate words
 5. Single- or multi-document summarization
  • 79. Deeper Information Extraction
 1. Coreference resolution (within a document)
 2. Entity linking (across documents)
 3. Event extraction and linking
 4. Knowledge base population (KBP)
 5. Recognizing textual entailment (RTE)
  • 80. User interfaces
 1. Dialogue systems
 Task-oriented dialogue (specific domains)
 Chatbots (unrestricted topic, fun or useful)
 Human-computer collaboration
 Interactive teaching
 2. Language teaching; writing help
 3. Question answering
 4. Information retrieval
  • 81. Multimodal interfaces or modeling
 1. Sign languages
 2. Speech + gestures
 3. Images + captions
 4. Brain recordings, human reaction times
  • 82. Discovering Linguistic Structure
 1. Decipherment
 2. Word meanings
 3. Grammar induction (syntax, morphology)
 4. Topic modeling
 5. Language evolution (historical linguistics)
 6. Grounded semantics
 NLP automates things that humans do well, so that they can be done automatically on more sentences. But this slide is about language analysis that’s hard even for humans.
 Computational linguistics (like computational biology, etc.) can discover underlying patterns in large datasets: things we didn’t know!
 Some of these may already be hidden in vector representations …
  • 83. Engineering conventional wisdom (circa 2020)
 [Diagram: a “bidirectional RNN (BiRNN) tagger” reading BOS Every nation wants Joe to love Jill EOS and emitting a tag (e.g., Verb) at each position. A code sketch of such a tagger follows.]
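A minimal PyTorch sketch of such a tagger, with invented sizes (a real system would train it on tagged sentences):

    import torch
    import torch.nn as nn

    class BiRNNTagger(nn.Module):
        def __init__(self, vocab_size, num_tags, emb=64, hidden=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb)
            self.birnn = nn.LSTM(emb, hidden, bidirectional=True, batch_first=True)
            self.out = nn.Linear(2 * hidden, num_tags)   # left + right contexts

        def forward(self, word_ids):                     # (batch, seq_len)
            states, _ = self.birnn(self.embed(word_ids))
            return self.out(states)                      # (batch, seq_len, num_tags)

    tagger = BiRNNTagger(vocab_size=10000, num_tags=17)
    scores = tagger(torch.randint(0, 10000, (1, 8)))     # one 8-word sentence
    print(scores.argmax(-1))                             # best tag at each position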
  • 84. Engineering conventional wisdom (circa 2020)
 [Diagram: a Transformer seq2seq model translating “Every nation wants Joe to love Jill” into French, “Chaque nation veut que …”, generating one token at a time.]
  • 85. Engineering conventional wisdom (circa 2020)
 Encode your input text
 Probably using a Transformer
 Or maybe a BiRNN (such as a BiLSTM)
 Let your network look at other helpful features/data
  • 86. Engineering conventional wisdom (circa 2020)
 Encode your input text
 Feed that pretrained encoding into another network that makes predictions
 Classify the whole text
 Classify each token (tagging)
 Locally score possible tags, n-grams, and rules at each token
 Then use dynamic programming to predict a global structure
 E.g., the BiRNN-CRF in HW6
 Predict a sequence of structure-building rules
 Try to find the highest-scoring sequence with beam search
  • 87. Engineering conventional wisdom (circa 2020)
 Encode your input text
 Feed that pretrained encoding into another network that makes predictions
 Fine-tune the whole system end-to-end to minimize some regularized training loss
 This includes fine-tuning the pretrained encoder
 Log-loss or some other differentiable error metric
 L2 regularization with early stopping
 Can also regularize by adding other problem-specific terms (multitask learning, posterior regularization)
  • 88. Engineering conventional wisdom (circa 2020)
 Encode your input text
 Feed that pretrained encoding into another network that makes predictions
 Fine-tune the whole system end-to-end to minimize some regularized training loss
 Outer loop: tune the architecture & hyperparameters
 Try to minimize development loss
 Keep doing manual error analysis
  • 89. Engineering conventional wisdom (circa 2022)
 Write some input/output examples (a training set)
  • 90. Engineering conventional wisdom (circa 2022)
 Write some input/output examples (a training set)
 Prompt a large pretrained LM as follows (a small prompt-building sketch appears after this list):
   The following examples show how to …
   Input: [example input 8]
   Output: [example output 8]
   Input: [example input 42]
   Output: [example output 42]
   Input: [example input 3]
   Output: [example output 3]
   Input: [your actual test input]
   Output: [the LM produces the output]
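A small prompt-building sketch (our own helper; the demonstrations are invented):

    def build_prompt(instruction, demos, test_input):
        parts = [instruction]
        for x, y in demos:                                  # retrieved examples
            parts.append(f"Input: {x}\nOutput: {y}")
        parts.append(f"Input: {test_input}\nOutput:")       # LM continues here
        return "\n\n".join(parts)

    demos = [("I loved this film!", "positive"),
             ("Two hours I will never get back.", "negative")]
    print(build_prompt("The following examples show how to label sentiment.",
                       demos, "A charming, funny ride."))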
  • 91. Engineering conventional wisdom (circa 2022)
 Write some input/output examples (a training set)
 Prompt a large pretrained LM as above
 Learn a policy that, given an input, retrieves a sequence of “similar” examples from the training set
 Retrieve inputs with similar words? (as in IR)
 Retrieve inputs with similar vector encodings?
 Try to enforce diversity?
 Tune the example selection policy using reinforcement learning?
  • 92. Engineering conventional wisdom (circa 2022)
 Write some input/output examples (a training set)
 Prompt a large pretrained LM as above
 Learn a policy that retrieves “similar” training examples for each input
 Fine-tune the LM to do well on your chosen prompts or randomized prompts
 Maybe only tune the top few layers
 Or tune low-rank perturbations of the neural matrices
 Alternatively, fine-tune the instructions in the prompt
  • 93. Engineering conventional wisdom (circa 2022)
 Write some input/output examples (a training set)
 Prompt a large pretrained LM as above
 Learn a policy that retrieves “similar” training examples for each input
 Fine-tune the LM to do well on your chosen prompts or randomized prompts
 Once your big slow system is accurate, “distill” it:
 Run it on lots of real inputs
 Train a small fast neural net (previous slide) to agree with the distribution predicted by the big slow system (a sketch of this loss follows)
 Use the small system; when it’s unsure, fall back to the big one
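A sketch of the usual distillation loss, assuming both systems expose logits over the same output classes (sizes invented):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Make the student's distribution match the teacher's (KL divergence,
        # i.e., cross-entropy against the teacher's softened labels).
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        student_logp = F.log_softmax(student_logits / temperature, dim=-1)
        return F.kl_div(student_logp, teacher_probs, reduction="batchmean")

    teacher = torch.randn(3, 5)                      # big slow system's logits
    student = torch.randn(3, 5, requires_grad=True)  # small fast net's logits
    print(distillation_loss(student, teacher))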
  • 94. Some Big Questions
 Neural nets are fluent at generating text, but do they really represent and reason about the world the text describes? Are their answers consistent? Can they explain them?
 How can models learn effectively through interaction with the world or with human teachers?
 What kinds of linguistic biases should we build in, and how? Huge Transformer LMs with enormous training sets work well, but can we find architectures that generalize like humans from much smaller datasets? (Or is that just pretraining + few-shot or fine-tuning?)
  • 95. Parting advice from my Ph.D. students