Distant Supervision with
Imitation Learning
Isabelle Augenstein
i.augenstein@sheffield.ac.uk
Department of Computer Science, University of Sheffield, UK
Joint work with Andreas Vlachos, Diana Maynard (EMNLP 2015)
30 November 2015
Heriot-Watt University Computer Science Seminar
2
Talk Overview
•  Relation Extraction from the Web with Distant Supervision
•  Extracting Relations from Web pages
•  Relations are used for populating knowledge bases
•  Distant supervision makes it possible to automatically generate relation
extraction training data from a knowledge base
Ø  No manual effort necessary
3
Talk Overview
•  Imitation Learning for Distant Supervision
•  Relation extraction relies on recognising and classifying named entities,
but sentences only have relation annotations
•  Suitable manually labeled NERC training data can be difficult to obtain
•  Imitation Learning decomposes a task (RE) into a sequence of actions
(e.g. NEC, RE) and is able to deal with latent variables
•  Imitation Learning is a structured prediction method, also called
learning-to-search or inverse reinforcement learning
Ø  Only labels for last action (RE) needed, no additional manual effort
4
•  Large knowledge bases are useful for search, question
answering etc.
Overall Problem
Structured Information from
Google Knowledge Graph
5
•  Large knowledge bases are useful for search, question
answering etc. but far from complete
Overall Problem
Structured Information from
Google Knowledge Graph
Band members,
genre missing
6
•  Large knowledge bases are useful for search, question
answering etc. but far from complete
•  Approach: automatic knowledge base population (KBP)
methods using Web information extraction (IE)
1)  Extracting entities and relations between them from text on Web pages
2)  Combining information from several sources to populate KBs
Overall Problem
7
Relation extraction for knowledge base completion
•  Given subject and name of relation, find object of relation in corpus
•  E.g. “Where was Bill Gates born?”
•  Answer: birthplace(Bill Gates, Seattle_Washington)
Relation Extraction Overview
birthplace
Bill Gates was born in Seattle, Washington
LOC
8
•  Why distant supervision for relation extraction (RE)?
•  RE methods requiring manual effort
•  Rule-based approaches: manually created patterns, e.g.
“X is a professor at Y”
•  Supervised learning: statistical models, manually annotated training data
Ø  Biased towards a domain, e.g. Biology, newswire, Wikipedia
•  RE methods requiring no manual effort
•  Bootstrapping: semi-supervised, learning patterns iteratively starting with
prior knowledge, e.g. list of names
Ø  “Semantic drift”, e.g. “X is a professor at Y” -> “X lives in Y”
•  Open Information Extraction: unsupervised learning, discovering
patterns, clustering
Ø  Difficult to map to schema
Existing Approaches
9
“If two entities participate in a relation, any sentence that contains those two
entities might express that relation.” (Mintz et al., 2009)
Amy Jade Winehouse was a
singer and songwriter known for
her eclectic mix of musical genres
including R&B, soul and jazz.
Blur helped to popularise the
Britpop genre.
Beckham rose to fame with the
all-female pop group Spice Girls.
Name                                         Genre                 …
Amy Winehouse, Amy Jade Winehouse, Wino, …   R&B, soul, jazz, …    …
Blur, …                                      Britpop, …            …
Spice Girls, …                               pop, …                …
(the Name column lists different lexicalisations of the same entity)
Distant Supervision
10
Creating positive & negative training examples -> Feature Extraction -> Classifier Training -> Prediction of New Relations
Distant Supervision
11
Distant Supervision
1)  Creating positive & negative training examples
    KB: album(The Beatles, Abbey Road)
    Positive: The Beatles released their album Abbey Road in 1969.
    Negative: The Beatles played in Edinburgh.
2)  Feature Extraction
    depLemmaPath=released_OJB, possPath=VBD_PRP_album, …
3)  Classifier Training
    possPath=_release+VBN=0.354677, depLemmaPath=_release=1.81213, …
4)  Prediction of New Relations
    Michael Jackson’s third album is Music & Me
    album(Michael Jackson, Music & Me)
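The labelling step in this pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Example` tuple, the `distant_label` function and the dictionary layout of the knowledge base are all hypothetical, and candidate objects are assumed to be given for each sentence.

```python
from typing import NamedTuple


class Example(NamedTuple):
    sentence: str
    subject: str
    candidate: str
    label: bool


def distant_label(kb, relation, subject, sentences_with_candidates):
    """Label every candidate in a sentence mentioning the subject.

    A candidate is positive if the KB lists it as an object of
    relation(subject, .), and negative otherwise.
    """
    known_objects = kb.get((relation, subject), set())
    examples = []
    for sentence, candidates in sentences_with_candidates:
        if subject not in sentence:
            continue  # the sentence must mention the subject
        for candidate in candidates:
            examples.append(
                Example(sentence, subject, candidate, candidate in known_objects)
            )
    return examples


# Toy knowledge base (hypothetical layout): (relation, subject) -> objects
kb = {("album", "The Beatles"): {"Abbey Road"}}
sentences = [
    ("The Beatles released their album Abbey Road in 1969.", ["Abbey Road"]),
    ("The Beatles played in Edinburgh.", ["Edinburgh"]),
]
examples = distant_label(kb, "album", "The Beatles", sentences)
```

The two sentences from the slide come out as one positive and one negative example for the album relation, with no manual annotation involved.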
12
Distant Supervision
Creating positive & negative training examples -> Feature Extraction -> Classifier Training -> Prediction of New Relations
Supervised learning + automatically generated training data = Distant Supervision
13
•  Requires no manual effort
•  Automatically label text with relations from knowledge base
•  Train statistical model (not patterns)
•  Extract relations with respect to knowledge base
Ø  Combine benefits of supervised approaches (learn statistical
model) and bootstrapping RE approaches (only list of extractions
as input)
Distant Supervision
14
•  Web crawl corpus, created using entity-specific search
queries, e.g. “`The Beatles’ Musical Artist album”
Evaluation: Corpus
Class                     Property / Relation
Book                      author, characters
Musical Artist            album, record label, track
Film                      director, producer, actor, character
Politician                birthplace, educational institution, spouse
Business                  employees, founders
Educational Institution   mascot, city
River                     origin, mouth
15
•  Distant Supervision does not require manual annotation but
depends on NERC for candidate identification
NERC for Distant Supervision
birthplace
Bill Gates was born in Seattle, Washington
LOC
16
•  Existing works use Stanford NER (Finkel et al. 2005) or
FIGER (Ling and Weld 2012)
Stanford NER    FIGER
Location        14  Location (City, Country, County, Province, Railway, …)
Person          15  Person (Actor, Architect, Artist, Musician, Terrorist, …)
Organisation    13  Org (Airline, Company, Educational_Institution, …)
Misc            13  Product (Car, Train, Camera, Software, Weapon, …)
                 9  Building (Airport, Hospital, Restaurant, Theater, …)
                 5  Art (Film, Play, Written_Work, Music, Newspaper)
                 7  Event (Election, Military_Conflict, Terrorist_Attack, …)
                30  Misc (Time, Educational_Degree, Drug, Algorithm, …)
NERC for Distant Supervision
17
•  Problem 1: missing NE types even with fine-grained schemas
album
Michael Jackson’s third album is Music & Me
Musician ? Misc
NERC for Distant Supervision
18
•  Problem 1: missing NE types even with fine-grained schemas
•  Problem 2: domain difference between training and testing
data (e.g. newswire, Wikipedia vs. Web)
album
Michael Jackson’s third album is Music & Me
? Misc
NERC for Distant Supervision
19
•  Task decomposition
•  NER: Named Entity Boundary Recognition
•  NEC: Assigning Types to NEs
•  RE: Relation Extraction
•  Solution 1:
•  NER: recognise NEs with heuristics (e.g. POS-based, HTML)
•  NEC: apply trained model (e.g. Stanford, FIGER), add labels of objects
to RE features
•  RE: train model with distantly annotated data as usual
•  NER Heuristics:
•  Noun phrases, capitalised phrases
•  Phrases from HTML markup: <a href>, <li>, <h1>, <h2>, <h3>,
<strong>, <b>, <em>, <i>
NERC for Distant Supervision
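The two kinds of NER heuristics listed above could look roughly like the following sketch. It only uses the Python standard library; the class name `MarkupCandidateParser`, the function `capitalised_phrases` and the regular expression are illustrative assumptions, not the heuristics used in the experiments (which also rely on POS tags and noun phrases).

```python
import re
from html.parser import HTMLParser

# Tags named on the slide; text inside them is treated as an entity candidate.
MARKUP_TAGS = {"a", "li", "h1", "h2", "h3", "strong", "b", "em", "i"}


class MarkupCandidateParser(HTMLParser):
    """Collects text that appears inside any of the MARKUP_TAGS."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag in MARKUP_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recently opened tag of this name.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.candidates.append(text)


# Runs of capitalised words, optionally joined by "&" (a rough assumption).
CAPITALISED = re.compile(r"[A-Z]\w*(?:\s+(?:&\s+)?[A-Z]\w*)*")


def capitalised_phrases(sentence):
    return CAPITALISED.findall(sentence)


parser = MarkupCandidateParser()
parser.feed(
    "<h1>Arctic Monkeys</h1><p>Arctic Monkeys are a rock band from "
    '<a href="#">Sheffield</a>, famous for albums such as <b>AM</b>.</p>'
    "<ul><li>AM</li></ul>"
)
parser.close()
markup_candidates = parser.candidates
phrase_candidates = capitalised_phrases(
    "Michael Jackson's third album is Music & Me"
)
```

Because the candidates come from surface cues rather than a trained tagger, they do not depend on a fixed NE type inventory.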
20
album
Michael Jackson’s third album is Music & Me
O
NERC for Distant Supervision
•  Solution 1:
•  NER: recognise NEs with heuristics (e.g. POS-based, HTML)
•  NEC: add object candidate labels (e.g. with Stanford, FIGER)
•  RE: train model with distantly annotated data as usual
•  RE features: ne=O, depLemmaPath=poss_album_subj,
possPath=POS_JJ_album_VBZ, …
21
•  Experiments with 16 relations (e.g. album, character, record
label, author, origin)
Recall of NER with off-the-shelf Stanford model compared to
heuristics
NERC for Distant Supervision
22
•  Solution 2:
•  NER: with heuristics
•  NEC & RE: train one-stage model
•  NEC features: obj=Music & Me, w[-1-2]=album is, …
•  RE features: depLemmaPath=poss_album_subj,
possPath=POS_JJ_album_VBZ, …
album
Michael Jackson’s third album is Music & Me
NERC for Distant Supervision
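As an illustration of the NEC features named on this slide, the sketch below builds the object string and left-context features for a candidate span. The function name `nec_features`, the tokenisation and the extra `w[-1]` feature are assumptions made for the example; only the `obj=` and `w[-1-2]=` feature names are taken from the slide.

```python
def nec_features(tokens, start, end):
    """Sparse NEC features for the candidate tokens[start:end].

    Returns a dict mapping feature names to 1.0, using the candidate
    string itself and up to two words of left context.
    """
    features = {"obj=" + " ".join(tokens[start:end]): 1.0}
    left = tokens[max(0, start - 2):start]
    if left:
        features["w[-1]=" + left[-1]] = 1.0
    if len(left) == 2:
        features["w[-1-2]=" + " ".join(left)] = 1.0
    return features


tokens = ["Michael", "Jackson", "'s", "third", "album", "is", "Music", "&", "Me"]
features = nec_features(tokens, start=6, end=9)
```

In a one-stage model these features are simply merged with the RE features for the same candidate, which is what later lets them overpower the sparser RE features.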
23
•  Solution 2:
•  NER: with heuristics
•  NEC & RE: train one-stage model
•  Problem 3: NEC features useful for RE but
•  RE features are sparse (e.g. path between subject and object)
•  NEC features can overpower RE features
album
Michael Jackson’s third album is Music & Me
NERC for Distant Supervision
24
•  Problem 3: NEC features useful for RE but:
•  RE features are sparse (e.g. path between subject and object)
•  NEC features can overpower RE features
Ø  Model would incorrectly predict Stephen Spielberg,
because context is stronger (w[-1]=director)
One of director Stephen Spielberg’s greatest heroes
was Alfred Hitchcock, the mastermind behind
Psycho.
Candidates for director relation with subject Psycho:
Stephen Spielberg, Alfred Hitchcock
NERC for Distant Supervision
25
•  Ideal Solution:
•  NER: with heuristics
•  NEC: trained classifier
•  RE: trained classifier
Ø  That would be great, but how can we do this without NEC
training data?
NERC for Distant Supervision
26
•  Imitation learning with DAGGER (Ross et al. 2011)
•  Also called learning-to-search, inverse reinforcement learning
•  Structured prediction method
•  Able to deal with latent variables, only labels for last stage (RE) needed
•  Decompose tasks into sequence of actions made at different stages
•  Dependencies between tasks are learnt by appropriate generation of
training examples
•  Classifiers are trained iteratively
•  Relationship between Reinforcement Learning and
Imitation learning
•  In reinforcement learning, the policy is being learnt and the actions are given
•  In imitation learning, the policy is given and the actions are learnt
•  (hence inverse)
Imitation Learning for Distant
Supervision
27
Imitation Learning for Distant
Supervision
•  Learning from demonstrator
•  Possible actions are given
•  Correctness of actions (i.e.
costs) are assessed by
taking actions, predicting
remaining ones and
evaluating result
•  Dependencies between
actions are learnt by
observation
•  Origins of Imitation learning
•  Robotics
•  Game playing (e.g. Ortega et al. 2013)
•  Mario’s possible actions (simplified): move left, move right,
duck, run, jump, fire
28
Imitation Learning for Distant
Supervision
•  Imitation Learning for NLP
•  Actions: NEC, if NEC positive followed by RE
•  Demonstrator (expert policy) tries to replicate labelled RE data
•  Base classifier: cost-sensitive classification (CSC) learning with PA
(passive-aggressive classifier)
•  NEC labels are needed but not specified by labelled RE data
•  Solution: look-ahead!
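The look-ahead idea can be sketched as follows: the cost of an NEC action is obtained by taking that action, letting the expert complete the RE stage, and comparing the final outcome with the labelled RE data. The function names `roll_out` and `nec_action_costs` and the 0/1 cost are assumptions for this illustration.

```python
def roll_out(nec_action, re_expert_label):
    """Final relation outcome if `nec_action` is taken and the expert finishes.

    If NEC rejects the candidate, the RE stage is never reached and no
    relation is extracted; otherwise the expert predicts the gold RE label.
    """
    if not nec_action:
        return False
    return re_expert_label


def nec_action_costs(gold_relation):
    """0/1 cost of each NEC action, judged only by the final RE outcome."""
    costs = {}
    for action in (True, False):
        outcome = roll_out(action, re_expert_label=gold_relation)
        costs[action] = 0.0 if outcome == gold_relation else 1.0
    return costs


costs_for_positive = nec_action_costs(gold_relation=True)
costs_for_negative = nec_action_costs(gold_relation=False)
```

For a positive RE example, rejecting the candidate at the NEC stage costs 1 and accepting it costs 0, so NEC training signal is derived without any NEC annotation.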
29
•  Iteration 1, NEC Stage
Imitation Learning for Distant
Supervision
Michael Jackson’s third album is Music & Me   (NEC: ?)
            True  False  Features
NEC Stage   ?     ?      obj=Music & Me, …
RE Stage                 depLemma=poss_album_subj, …
30
•  Iteration 1, RE Stage
Imitation Learning for Distant
Supervision
Michael Jackson’s third album is Music & Me   (RE: True, NEC: ?)
            True  False  Features
NEC Stage   ?     ?      obj=Music & Me, …
RE Stage    0     1      depLemma=poss_album_subj, …
31
•  Iteration 1, RE Stage
Imitation Learning for Distant
Supervision
Michael Jackson’s third album is Music & Me   (RE: True, NEC: True)
            True  False  Features
NEC Stage   0     1      obj=Music & Me, …
RE Stage    0     1      depLemma=poss_album_subj, …
32
•  Iteration 1
•  NEC and RE Stage: predict labels according to labelled data
(expert policy) with look-ahead
•  Extract features
•  Assess costs
•  CSC example: features, costs -> will be remembered for next iterations!
•  Train classifier for each stage based on CSC example (learned policy)
Imitation Learning for Distant
Supervision
33
•  Iteration >= 2
•  Predict labels according to expert policy or learned policy
•  The policy is chosen stochastically: the expert policy is used with
probability $p = (1-\beta)^{i-1}$, the learned policy otherwise
i: iteration number, β: learning rate
•  With each iteration it is more likely that the learned policy is chosen
•  The bigger the learning rate, the faster the learner moves away from the
labelled data
Imitation Learning for Distant
Supervision
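A minimal sketch of the stochastic policy choice, assuming that p = (1 − β)^(i − 1) is the probability of following the expert policy in iteration i (so that iteration 1 always follows the labelled data). The function names and the example learning rate are hypothetical.

```python
import random


def expert_probability(iteration, beta):
    """Probability of following the expert policy in a given iteration."""
    return (1.0 - beta) ** (iteration - 1)


def choose_policy(iteration, beta, rng):
    """Pick 'expert' or 'learned' for one prediction step."""
    if rng.random() < expert_probability(iteration, beta):
        return "expert"
    return "learned"


rng = random.Random(0)
first_iteration_choice = choose_policy(1, beta=0.3, rng=rng)
schedule = [expert_probability(i, beta=0.5) for i in range(1, 5)]
```

With a larger β the expert probability decays faster, which matches the remark that a bigger learning rate moves the learner away from the labelled data more quickly.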
34
•  Reminder: Problem 3: NEC features useful for RE but:
•  RE features are sparse (e.g. path between subject and object)
•  NEC features can overpower RE features
Ø  Model would incorrectly predict Stephen Spielberg,
because context is stronger (w[-1]=director)
One of director Stephen Spielberg’s greatest heroes
was Alfred Hitchcock, the mastermind behind
Psycho.
Candidates for director relation with subject Psycho:
Stephen Spielberg, Alfred Hitchcock
NERC for Distant Supervision
35
•  Multi-stage modelling compensates for mistakes
Imitation Learning for Distant
Supervision
Steven Spielberg’s greatest heroes (…) Psycho   (RE: False, NEC: True)
            Confidence  Prediction  Features
NEC Stage   0.629       True        obj=Steven Spielberg, …
RE Stage    -0.571      False       depLemma=_POSS_heroes_ …
36
•  Multi-stage modelling compensates for mistakes
Imitation Learning for Distant
Supervision
Alfred Hitchcock, the mastermind behind Psycho   (RE: True, NEC: True)
            Confidence  Prediction  Features
NEC Stage   0.629       True        obj=Alfred Hitchcock, …
RE Stage    0.571       True        depLemma=_APPOS_mastermind …
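The two-stage decision in these examples can be sketched as below. The function `two_stage_predict` and the decision threshold of 0 on the confidence scores are assumptions; the scores are the ones shown on the slides.

```python
def two_stage_predict(nec_confidence, re_confidence):
    """Extract the relation only if both stages are positive.

    The RE stage is consulted only when the NEC stage accepts the
    candidate, so a negative RE score can veto a confident NEC decision.
    """
    if nec_confidence <= 0:
        return False
    return re_confidence > 0


spielberg = two_stage_predict(nec_confidence=0.629, re_confidence=-0.571)
hitchcock = two_stage_predict(nec_confidence=0.629, re_confidence=0.571)
```

Both candidates pass the NEC stage, but only Alfred Hitchcock is accepted by the RE stage, so the strong NEC context of Steven Spielberg no longer overrides the relation evidence.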
37
38
•  Improving NEC for RE with Web Features
Evaluation: NEC Features
Arctic Monkeys
Arctic Monkeys are a rock band from Sheffield,
famous for albums such as AM.
Albums:
- Whatever People Say I Am, That's What I'm Not
- AM
header
link
bold
list
39
•  NEC:
•  Word features: Object occurrence, POS, digit and capitalisation
pattern etc.
•  Context features: 2 words to left and right: BOW, sequence, bag of
POS, POS sequence, as 1-grams and 2-grams
•  Web features
Ø  Best F1 and P-avg achieved with all of those
•  RE:
•  Context features (as for NEC)
•  POS and words between subject and object, as seq and BOW
•  Dependency path with/without lemmas
Ø  Best F1 and P-avg with sparse dependency features and 2-gram
context features
Evaluation: Features
40
Evaluation Setting
•  Models:
•  All models: NER with candidate identification heuristics (POS,
Web-based)
•  Rel only: one-stage, only relation features
•  Stanf: one-stage with Stanf NEC labels added to RE features
•  FIGER: one-stage with FIGER labels added to RE features
•  OS: one-stage with NEC features added to RE features
•  IL: two-stage with imitation learning
41
Overall Results
42
Conclusions EMNLP Experiments
•  Imitation learning approach outperforms baselines with
supervised NEC (Stanford NER and FIGER) by 10 points in
average precision
•  For NEC: Web features such as appearance in lists or links to
other Web pages improve average precision by 7 points
•  For RE: sparse, high-precision features (such as parse features)
outperform high-recall, low-precision features (such as BOW
features)
43
Distant Supervision Challenges
•  Automatically generating training data
•  Can lead to noisy training examples
Let It Be is the twelfth album by
The Beatles which contains their
hit single Let It Be.
Name          Album               Track
The Beatles   …, Let It Be, …     Let It Be, …
44
Distant Supervision Challenges
•  Automatically generating training data
•  Can lead to noisy training examples
•  Use ‘Let It Be’ mentions as positive training examples for album or for
track?
•  Problem: if both mentions of ‘Let It Be’ are used to extract features for
both album and track, wrong weights are learnt
Let It Be is the twelfth album by
The Beatles which contains their
hit single Let It Be.
Name          Album               Track
The Beatles   …, Let It Be, …     Let It Be, …
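One way to spot such ambiguous cases before generating training data is to look for object strings that the knowledge base lists under more than one relation of the same subject. The sketch below is an illustration only; the triple layout and the function name `ambiguous_objects` are assumptions, and it does not describe how the cited work handles these examples.

```python
from collections import defaultdict


def ambiguous_objects(triples):
    """Map (subject, object) to its relations when there is more than one."""
    relations = defaultdict(set)
    for subject, relation, obj in triples:
        relations[(subject, obj)].add(relation)
    return {key: rels for key, rels in relations.items() if len(rels) > 1}


triples = [
    ("The Beatles", "album", "Let It Be"),
    ("The Beatles", "track", "Let It Be"),
    ("The Beatles", "album", "Abbey Road"),
]
ambiguous = ambiguous_objects(triples)
```

Mentions of the flagged strings could then be discarded or treated separately instead of being used as positive examples for both relations.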
45
Distant Supervision Challenges
•  Automatically generating training data
•  Can lead to noisy training examples
•  Evaluation
•  If training data is generated automatically, how / on what data can
approaches be evaluated?
•  Co-Reference Resolution
•  Does training / testing data have to contain names of subj and obj
directly?
•  Named Entity Recognition and Classification
•  Supervised off-the-shelf NERC approaches are not perfect (see rest of
talk)
46
Conclusions / Future Work
•  Distant supervision makes it possible to populate knowledge bases
automatically, without manual effort
•  Distant supervision can be applied to any domain
•  Ongoing challenges:
•  Reducing errors made by automatic labeling
•  Distant supervision with co-reference resolution
•  NERC for distant supervision
47
References
•  Isabelle Augenstein, Andreas Vlachos, Diana Maynard (2015).
Extracting Relations between Non-Standard Entities using Distant
Supervision and Imitation Learning. EMNLP 2015.
•  Isabelle Augenstein, Diana Maynard, Fabio Ciravegna (2015). Distantly
Supervised Web Relation Extraction for Knowledge Base Population.
Semantic Web Journal.
•  Isabelle Augenstein, Diana Maynard, Fabio Ciravegna (2014). Relation
Extraction from the Web using Distant Supervision. EKAW 2014, nominated
for best paper award.
•  Isabelle Augenstein (2014). Joint Information Extraction from the Web using
Linked Data. ISWC 2014.
•  Isabelle Augenstein (2014). Seed Selection for Distantly Supervised
Web-Based Relation Extraction. SWAIE Workshop at COLING 2014.
48
References
Distant Supervision:
•  Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant
supervision for relation extraction without labeled data. ACL-IJCNLP.
NERC:
•  Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005.
Incorporating Non-local Information into Information Extraction Systems by
Gibbs Sampling. ACL.
•  Xiao Ling and Daniel S. Weld. 2012. Fine-Grained Entity Recognition. AAAI.
Imitation Learning:
•  Stéphane Ross, Geoffrey J. Gordon, and Drew Bagnell. 2011. A Reduction
of Imitation Learning and Structured Prediction to No-Regret Online
Learning. JMLR.
•  Juan Ortega, Noor Shaker, Julian Togelius and Georgios N. Yannakakis
(2013): Imitating human playing styles in Super Mario Bros. Entertainment
Computing, Elsevier.
49
Thank you for
your attention!
Questions?

 
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
Isabelle Augenstein
 
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
Isabelle Augenstein
 
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
Isabelle Augenstein
 
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Isabelle Augenstein
 
Extracting Relations between Non-Standard Entities using Distant Supervision ...
Extracting Relations between Non-Standard Entities using Distant Supervision ...Extracting Relations between Non-Standard Entities using Distant Supervision ...
Extracting Relations between Non-Standard Entities using Distant Supervision ...
Isabelle Augenstein
 

More from Isabelle Augenstein (17)

Beyond Fact Checking — Modelling Information Change in Scientific Communication
Beyond Fact Checking — Modelling Information Change in Scientific CommunicationBeyond Fact Checking — Modelling Information Change in Scientific Communication
Beyond Fact Checking — Modelling Information Change in Scientific Communication
 
Automatically Detecting Scientific Misinformation
Automatically Detecting Scientific MisinformationAutomatically Detecting Scientific Misinformation
Automatically Detecting Scientific Misinformation
 
Accountable and Robust Automatic Fact Checking
Accountable and Robust Automatic Fact CheckingAccountable and Robust Automatic Fact Checking
Accountable and Robust Automatic Fact Checking
 
Determining the Credibility of Science Communication
Determining the Credibility of Science CommunicationDetermining the Credibility of Science Communication
Determining the Credibility of Science Communication
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)
 
Explainability for NLP
Explainability for NLPExplainability for NLP
Explainability for NLP
 
Towards Explainable Fact Checking
Towards Explainable Fact CheckingTowards Explainable Fact Checking
Towards Explainable Fact Checking
 
Tracking False Information Online
Tracking False Information OnlineTracking False Information Online
Tracking False Information Online
 
What can typological knowledge bases and language representations tell us abo...
What can typological knowledge bases and language representations tell us abo...What can typological knowledge bases and language representations tell us abo...
What can typological knowledge bases and language representations tell us abo...
 
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate ...
 
Learning with limited labelled data in NLP: multi-task learning and beyond
Learning with limited labelled data in NLP: multi-task learning and beyondLearning with limited labelled data in NLP: multi-task learning and beyond
Learning with limited labelled data in NLP: multi-task learning and beyond
 
Learning to read for automated fact checking
Learning to read for automated fact checkingLearning to read for automated fact checking
Learning to read for automated fact checking
 
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
SemEval 2017 Task 10: ScienceIE – Extracting Keyphrases and Relations from Sc...
 
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
 
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
1st Workshop for Women and Underrepresented Minorities (WiNLP) at ACL 2017 - ...
 
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
Machine Reading Using Neural Machines (talk at Microsoft Research Faculty Sum...
 
Extracting Relations between Non-Standard Entities using Distant Supervision ...
Extracting Relations between Non-Standard Entities using Distant Supervision ...Extracting Relations between Non-Standard Entities using Distant Supervision ...
Extracting Relations between Non-Standard Entities using Distant Supervision ...
 

Recently uploaded

Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 

Recently uploaded (20)

Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

Distant Supervision with Imitation Learning

  • 1. Distant Supervision with Imitation Learning Isabelle Augenstein i.augenstein@sheffield.ac.uk Department of Computer Science, University of Sheffield, UK Joint work with Andreas Vlachos, Diana Maynard (EMNLP 2015) 30 November 2015 Heriot-Watt University Computer Science Seminar
  • 2. 2 Talk Overview •  Relation Extraction from the Web with Distant Supervision •  Extracting Relations from Web pages •  Relations are used for populating Knowledge Bases •  Distant Supervision allows relation extraction training data to be generated automatically using a knowledge base Ø  No manual effort necessary
  • 3. 3 Talk Overview •  Imitation Learning for Distant Supervision •  Relation extraction relies on recognising and classifying named entities, but sentences only have relation annotations •  Suitable manually labeled NERC training data can be difficult to obtain •  Imitation Learning decomposes tasks (RE) into sequence of actions (e.g. NEC, RE), able to deal with latent variables •  Imitation Learning is a structured prediction method, also called learning-to-search, inverse reinforcement learning Ø  Only labels for last action (RE) needed, no additional manual effort
  • 4. 4 •  Large knowledge bases are useful for search, question answering etc. Overall Problem Structured Information from Google Knowledge Graph
  • 5. 5 •  Large knowledge bases are useful for search, question answering etc. but far from complete Overall Problem Structured Information from Google Knowledge Graph Band members, genre missing
  • 6. 6 •  Large knowledge bases are useful for search, question answering etc. but far from complete •  Approach: automatic knowledge base population (KBP) methods using Web information extraction (IE) 1)  Extracting entities and relations between them from text on Web pages 2)  Combining information from several sources to populate KBs Overall Problem
  • 7. 7 Relation extraction for knowledge base completion •  Given subject and name of relation, find object of relation in corpus •  E.g. “Where was Bill Gates born?” •  Answer: birthplace(Bill Gates, Seattle_Washington) Relation Extraction Overview birthplace Bill Gates was born in Seattle, Washington LOC
  • 8. 8 •  Why distant supervision for relation extraction (RE)? •  RE methods requiring manual effort •  Rule-based approaches: manually created patterns, e.g. “X is a professor at Y” •  Supervised learning: statistical models, manually annotated training data Ø  Biased towards a domain, e.g. Biology, newswire, Wikipedia •  RE methods requiring no manual effort •  Bootstrapping: semi-supervised, learning patterns iteratively starting with prior knowledge, e.g. list of names Ø  “Semantic drift”, e.g. “X is a professor at Y” -> “X lives in Y” •  Open Information Extraction: unsupervised learning, discovering patterns, clustering Ø  Difficult to map to schema Existing Approaches
  • 9. 9 “If two entities participate in a relation, any sentence that contains those two entities might express that relation.” (Mintz, 2009) Amy Jade Winehouse was a singer and songwriter known for her eclectic mix of musical genres including R&B, soul and jazz. Blur helped to popularise the Britpop genre. Beckham rose to fame with the all-female pop group Spice Girls. Name Genre … Amy Winehouse Amy Jade Winehouse Wino … R&B soul jazz … … Blur … Britpop … … Spice Girls … pop … … different lexicalisations Distant Supervision
  • 10. 10 Creating positive & negative training examples Feature Extraction Classifier Training Prediction of New Relations Distant Supervision
  • 11. 11 Creating positive & negative training examples Feature Extraction Classifier Training Prediction of New Relations Distant Supervision KB: album(The Beatles, Abbey Road) Positive: The Beatles released their album Abbey Road in 1969. Negative: The Beatles played in Edinburgh. depLemmaPath=released_OJB, possPath=VBD_PRP_album, … possPath=_release+VBN=0.354677 depLemmaPath=_release=1.81213, … Michael Jackson’s third album is Music & Me album(Michael Jackson, Music & Me)
  • 12. 12 Distant Supervision Creating positive & negative training examples Feature Extraction Classifier Training Prediction of New Relations Supervised learning Automatically generated training data + Distant Supervision
  • 13. 13 •  Requires no manual effort •  Automatically label text with relations from knowledge base •  Train statistical model (not patterns) •  Extract relations with respect to knowledge base Ø  Combine benefits of supervised approaches (learn statistical model) and bootstrapping RE approaches (only list of extractions as input) Distant Supervision
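The labelling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy knowledge base `KB`, the function name `label_sentences`, and the plain substring matching are all hypothetical simplifications (the talk works with entity candidates and several lexicalisations per entity rather than whole-sentence matching).

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: (subject, relation) -> known objects.
KB: Dict[Tuple[str, str], List[str]] = {
    ("The Beatles", "album"): ["Abbey Road"],
}


def label_sentences(
    sentences: List[str],
    subject: str,
    relation: str,
    kb: Dict[Tuple[str, str], List[str]],
) -> List[Tuple[str, bool]]:
    """Label sentences mentioning the subject as positive or negative.

    A sentence is positive if it also contains an object that the
    knowledge base lists for (subject, relation), negative otherwise.
    Sentences that do not mention the subject are skipped.
    """
    objects = kb.get((subject, relation), [])
    examples = []
    for sentence in sentences:
        if subject not in sentence:
            continue  # not a candidate for this subject
        is_positive = any(obj in sentence for obj in objects)
        examples.append((sentence, is_positive))
    return examples


sentences = [
    "The Beatles released their album Abbey Road in 1969.",
    "The Beatles played in Edinburgh.",
    "Abbey Road is a street in London.",
]
examples = label_sentences(sentences, "The Beatles", "album", KB)
```

Here the first sentence becomes a positive example, the second a negative one, and the third is ignored because the subject is absent. The labelled examples would then go through feature extraction and classifier training as in any supervised setup.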
  • 14. 14 •  Web crawl corpus, created using entity-specific search queries, e.g. “`The Beatles’ Musical Artist album” Class Property / Relation Book author, characters Musical Artist album, record label, track Film director, producer, actor, character Politician birthplace, educational institution, spouse Evaluation: Corpus Class Property / Relation Business employees, founders Educational Institution mascot, city River origin, mouth
  • 15. 15 •  Distant Supervision does not require manual annotation but depends on NERC for candidate identification NERC for Distant Supervision birthplace Bill Gates was born in Seattle, Washington LOC
  • 16. 16 •  Existing works use Stanford NER (Finkel et al. 2005) or FIGER (Ling and Weld 2012) Stanford NER FIGER Location 14 Location (City, Country, County, Province, Railway, …) Person 15 Person (Actor, Architect, Artist, Musician, Terrorist, …) Organisation 13 Org (Airline, Company, Educational_Institution, ….) Misc 13 Product (Car, Train, Camera, Software, Weapon, …) 9 Building (Airport, Hospital, Restaurant, Theater, …) 5 Art (Film, Play, Written_Work, Music, Newspaper) 7 Event (Election, Military_Conflict, Terrorist_Attack, …) 30 Misc (Time, Educational_Degree, Drug, Algorithm, …) NERC for Distant Supervision
  • 17. 17 •  Problem 1: missing NE types even with fine-grained schemas album Michael Jackson’s third album is Music & Me Musician ? Misc NERC for Distant Supervision
  • 18. 18 •  Problem 1: missing NE types even with fine-grained schemas •  Problem 2: domain difference between training and testing data (e.g. newswire, Wikipedia vs. Web) album Michael Jackson’s third album is Music & Me ? Misc NERC for Distant Supervision
  • 19. 19 •  Task decomposition •  NER: Named Entity Boundary Recognition •  NEC: Assigning Types to NEs •  RE: Relation Extraction •  Solution 1: •  NER: recognise NEs with heuristics (e.g. POS-based, HTML) •  NEC: apply trained model (e.g. Stanford, FIGER), add labels of objects to RE features •  RE: train model with distantly annotated data as usual •  NER Heuristics: •  Noun phrases, capitalised phrases •  Phrases from HTML markup: <a href>, <li>, <h1>, <h2>, <h3>, <strong>, <b>, <em>, <i> NERC for Distant Supervision
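The two NER heuristics listed above (capitalised phrases and phrases inside HTML markup) might be approximated as below. This is a rough sketch using only the Python standard library; the regular expression, the class name `MarkupCandidates`, and the helper names are assumptions made for illustration, and the noun-phrase heuristic is omitted because it would need a POS tagger.

```python
import re
from html.parser import HTMLParser
from typing import List

# Tags treated as entity markup, following the list on the slide.
MARKUP_TAGS = {"a", "li", "h1", "h2", "h3", "strong", "b", "em", "i"}

# Runs of capitalised words, optionally joined by "&" (assumed pattern).
CAPITALISED = re.compile(r"[A-Z]\w*(?:\s+(?:&\s+)?[A-Z]\w*)*")


def capitalised_phrases(text: str) -> List[str]:
    """Return maximal runs of capitalised words as entity candidates."""
    return CAPITALISED.findall(text)


class MarkupCandidates(HTMLParser):
    """Collect text that appears inside any of the MARKUP_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.depth = 0  # how many markup tags are currently open
        self.candidates: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in MARKUP_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in MARKUP_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.candidates.append(text)


def html_candidates(html: str) -> List[str]:
    """Return phrases found inside entity-like HTML markup."""
    parser = MarkupCandidates()
    parser.feed(html)
    parser.close()
    return parser.candidates
```

For example, `capitalised_phrases("Michael Jackson's third album is Music & Me")` yields the candidates "Michael Jackson" and "Music & Me". Such heuristics favour recall over precision, which fits a setup where later stages decide which candidates are kept.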
  • 20. 20 album Michael Jackson’s third album is Music & Me O NERC for Distant Supervision •  Solution 1: •  NER: recognise NEs with heuristics (e.g. POS-based, HTML) •  NEC: add object candidate labels (e.g. with Stanford, FIGER) •  RE: train model with distantly annotated data as usual •  RE features: ne=O, depLemmaPath=poss_album_subj, possPath=POS_JJ_album_VBZ, …
  • 21. 21 •  Experiments with 16 relations (e.g. album, character, record label, author, origin) Recall of NER with off-the-shelf Stanford model compared to heuristics NERC for Distant Supervision
  • 22. 22 •  Solution 2: •  NER: with heuristics •  NEC & RE: train one-stage model •  NEC features: obj=Music & Me, w[-1-2]=album is, … •  RE features: depLemmaPath=poss_album_subj, possPath=POS_JJ_album_VBZ, … album Michael Jackson’s third album is Music & Me NERC for Distant Supervision
  • 23. 23 •  Solution 2: •  NER: with heuristics •  NEC & RE: train one-stage model •  Problem 3: NEC features useful for RE but •  RE features are sparse (e.g. path between subject and object) •  NEC features can overpower RE features album Michael Jackson’s third album is Music & Me NERC for Distant Supervision
  • 24. 24 •  Problem 3: NEC features useful for RE but: •  RE features are sparse (e.g. path between subject and object) •  NEC features can overpower RE features Ø  Model would incorrectly predict Steven Spielberg, because context is stronger (w[-1]=director) One of director Steven Spielberg’s greatest heroes was Alfred Hitchcock, the mastermind behind Psycho. Candidates for director relation with subject Psycho: Steven Spielberg, Alfred Hitchcock NERC for Distant Supervision
  • 25. 25 •  Ideal Solution: •  NER: with heuristics •  NEC: trained classifier •  RE: trained classifier Ø  That would be great, but how can we do this without NEC training data? NERC for Distant Supervision
  • 26. 26 •  Imitation learning with DAGGER (Ross et al. 2011) •  Also called learning-to-search, inverse reinforcement learning •  Structured prediction method •  Able to deal with latent variables, only labels for last stage (RE) needed •  Decompose tasks into sequence of actions made at different stages •  Dependencies between tasks are learnt by appropriate generation of training examples •  Classifiers are trained iteratively •  Relationship between Reinforcement Learning and Imitation learning •  In reinforcement learning, the policy is being learnt and the actions are given •  In imitation learning, the policy is given and the actions are learnt •  (hence inverse) Imitation Learning for Distant Supervision
  • 27. 27 Imitation Learning for Distant Supervision •  Learning from demonstrator •  Possible actions are given •  Correctness of actions (i.e. costs) is assessed by taking actions, predicting remaining ones and evaluating result •  Dependencies between actions are learnt by observation •  Origins of Imitation learning •  Robotics •  Game playing (e.g. Ortega et al. 2012) •  Mario’s possible actions (simplified): move left, move right, duck, run, jump, fire
  • 28. 28 Imitation Learning for Distant Supervision •  Imitation Learning for NLP •  Actions: NEC, if NEC positive followed by RE •  Demonstrator (expert policy) tries to replicate labelled RE data •  Base classifier: cost sensitive classification learning with PA (passive-aggressive classifier) •  NEC labels are needed but not specified by labelled RE data •  Solution: look-ahead!
  • 29. 29 •  Iteration 1, NEC Stage Imitation Learning for Distant Supervision True False Features NEC Stage ? ? obj=Music & Me, … RE Stage depLemma=poss_album_subj, … Michael Jackson’s third album is Music & Me ?
  • 30. 30 •  Iteration 1, RE Stage Imitation Learning for Distant Supervision True False Features NEC Stage ? ? obj=Music & Me, … RE Stage 0 1 depLemma=poss_album_subj, … True Michael Jackson’s third album is Music & Me ?
  • 31. 31 •  Iteration 1, RE Stage Imitation Learning for Distant Supervision True False Features NEC Stage 0 1 obj=Music & Me, … RE Stage 0 1 depLemma=poss_album_subj, … True Michael Jackson’s third album is Music & Me True
  • 32. 32 •  Iteration 1 •  NEC and RE Stage: predict labels according to labelled data (expert policy) with look-ahead •  Extract features •  Assess costs •  CSC example: features, costs -> will be remembered for next iterations! •  Train classifier for each stage based on CSC example (learned policy) Imitation Learning for Distant Supervision
  • 33. 33 •  Iteration 1 •  NEC and RE Stage: predict labels according to labelled data (expert policy) with look-ahead •  Extract features •  Assess costs •  CSC example: features, costs -> will be remembered for next iterations! •  Train classifier for each stage based on CSC example (learned policy) •  Iteration >= 2 •  Predict labels according to expert policy or learned policy •  Expert policy is chosen stochastically with probability $p = (1-\beta)^{i-1}$, i: iteration number, β: learning rate •  With each iteration it is more likely that learned policy is chosen •  The bigger the learning rate the faster learner moves away from labelled data Imitation Learning for Distant Supervision
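The stochastic choice between policies in iterations >= 2 can be illustrated with a short sketch. It assumes, as in DAGGER (Ross et al. 2011), that the decaying quantity (1−β)^(i−1) is the probability of following the expert policy, so that the first iteration always uses the expert and later iterations rely increasingly on the learned classifiers. The function names are hypothetical.

```python
import random


def expert_probability(iteration: int, beta: float) -> float:
    """Probability of following the expert policy in a given iteration.

    iteration is 1-based; beta is the learning rate in [0, 1].
    """
    return (1.0 - beta) ** (iteration - 1)


def choose_policy(iteration: int, beta: float, rng: random.Random) -> str:
    """Sample which policy predicts the labels for one training instance."""
    if rng.random() < expert_probability(iteration, beta):
        return "expert"
    return "learned"


rng = random.Random(0)  # fixed seed so runs are repeatable
for i in range(1, 5):
    print(i, expert_probability(i, 0.5), choose_policy(i, 0.5, rng))
```

With β = 0.5 the expert probability is 1.0, 0.5, 0.25, 0.125 over the first four iterations, which matches the remark that a larger learning rate moves the learner away from the labelled data faster.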
  • 34. 34 •  Reminder: Problem 3: NEC features useful for RE but: •  RE features are sparse (e.g. path between subject and object) •  NEC features can overpower RE features Ø  Model would incorrectly predict Steven Spielberg, because context is stronger (w[-1]=director) One of director Steven Spielberg’s greatest heroes was Alfred Hitchcock, the mastermind behind Psycho. Candidates for director relation with subject Psycho: Steven Spielberg, Alfred Hitchcock NERC for Distant Supervision
  • 35. 35 •  Multi-stage modelling compensates for mistakes Imitation Learning for Distant Supervision Confidence Prediction Features NEC Stage 0.629 True obj=Steven Spielberg, … RE Stage -0.571 False depLemma=_POSS_heroes_ … False Steven Spielberg’s greatest heroes (…) Psycho True
  • 36. 36 •  Multi-stage modelling compensates for mistakes Imitation Learning for Distant Supervision True Alfred Hitchcock, the mastermind behind Psycho True Confidence Prediction Features NEC Stage 0.629 True obj=Alfred Hitchcock, … RE Stage 0.571 True depLemma=_APPOS_mastermind …
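The two-stage decision shown on the last two slides can be written down as a small sketch. It assumes that each stage outputs a signed confidence and that a positive sign means "True", which is consistent with the values on the slides but is not stated there explicitly; the function name and the data layout are hypothetical.

```python
from typing import Dict, Tuple


def predict_relation(nec_score: float, re_score: float) -> bool:
    """Two-stage prediction: NEC first, RE only if NEC is positive."""
    if nec_score <= 0:
        return False  # candidate rejected at the NEC stage
    return re_score > 0  # the RE stage has the final say


# (NEC confidence, RE confidence) per candidate, values taken from the slides.
candidates: Dict[str, Tuple[float, float]] = {
    "Steven Spielberg": (0.629, -0.571),
    "Alfred Hitchcock": (0.629, 0.571),
}

predictions = {
    name: predict_relation(nec, rel) for name, (nec, rel) in candidates.items()
}
```

Both candidates pass the NEC stage, but only Alfred Hitchcock is accepted by the RE stage, so the sparse relation features are no longer drowned out by the strong entity context.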
  • 37. 37 •  Web crawl corpus, created using entity-specific search queries, e.g. “`The Beatles’ Musical Artist album” Class Property / Relation Book author, characters Musical Artist album, record label, track Film director, producer, actor, character Politician birthplace, educational institution, spouse Evaluation: Corpus Class Property / Relation Business employees, founders Educational Institution mascot, city River origin, mouth
  • 38. 38 •  Improving NEC for RE with Web Features Evaluation: NEC Features Example Web page (markup annotated as: header, link, bold, list): Arctic Monkeys. Arctic Monkeys are a rock band from Sheffield, famous for albums such as AM. Albums: - Whatever People Say I Am, That's What I'm Not - AM
  • 39. 39 •  NEC: •  Word features: Object occurrence, POS, digit and capitalisation pattern etc. •  Context features: 2 words to left and right: BOW, sequence, bag of POS, POS sequence, as 1-grams and 2-grams •  Web features Ø  Best F1 and P-avg achieved with all of those •  RE: •  Context features (as for NEC) •  POS and words between subject and object, as seq and BOW •  Dependency path with/without lemmas Ø  Best F1 and P-avg with sparse dependency features and 2-gram context features Evaluation: Features
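To make the context features concrete, here is a minimal Python sketch that extracts bag-of-words and 2-gram features from the two words to the left and right of a candidate. The feature name `w[-1]` follows the notation used on slide 34. The other feature names and the function `context_features` are hypothetical, and POS and dependency features are left out because they would need a tagger and parser.

```python
def context_features(tokens, start, end, window=2):
    """Context features for the candidate spanning tokens[start:end].

    Only word-based features are produced; the feature naming scheme is
    an illustrative assumption.
    """
    left = tokens[max(0, start - window):start]
    right = tokens[end:end + window]
    feats = set()
    for side, words in (("left", left), ("right", right)):
        for word in words:  # bag of words (1-grams)
            feats.add(f"bow_{side}={word.lower()}")
        for a, b in zip(words, words[1:]):  # 2-grams in sequence order
            feats.add(f"bigram_{side}={a.lower()}_{b.lower()}")
    if left:
        feats.add(f"w[-1]={left[-1].lower()}")
    if right:
        feats.add(f"w[+1]={right[0].lower()}")
    return feats


if __name__ == "__main__":
    tokens = "One of director Steven Spielberg 's greatest heroes".split()
    for feat in sorted(context_features(tokens, 3, 5)):
        print(feat)
```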
  • 40. 40 Evaluation Setting •  Models: •  All models: NER with candidate identification heuristics (POS, Web-based) •  Rel only: one-stage, only relation features •  Stanf: one-stage with Stanford NER labels added to RE features •  FIGER: one-stage with FIGER labels added to RE features •  OS: one-stage with NEC features added to RE features •  IL: two-stage with imitation learning
  • 42. 42 Conclusions EMNLP Experiments •  Imitation learning approach outperforms baselines with supervised NEC (Stanford NER and FIGER) by 10 points in average precision •  For NEC: Web features such as appearance in lists or links to other Web pages improve average precision by 7 points •  For RE: sparse, high-precision features (such as parse features) outperform high-recall, low-precision features (such as BOW features)
  • 43. 43 Distant Supervision Challenges •  Automatically generating training data •  Can lead to noisy training examples Let It Be is the twelfth album by The Beatles which contains their hit single Let It Be. Knowledge base entry: Name: The Beatles; Album: … Let It Be …; Track: Let It Be …
  • 44. 44 Distant Supervision Challenges •  Automatically generating training data •  Can lead to noisy training examples •  Use ‘Let It Be’ mentions as positive training examples for album or for track? •  Problem: if both mentions of ‘Let It Be’ are used to extract features for both album and track, wrong weights are learnt Let It Be is the twelfth album by The Beatles which contains their hit single Let It Be. Knowledge base entry: Name: The Beatles; Album: … Let It Be …; Track: Let It Be …
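The ambiguity described above is easy to reproduce with a naive distant supervision labeller. The sketch below is an illustration only: the knowledge base is a toy dictionary built from the slide's example, matching is plain substring search, and the function name `label_sentence` is hypothetical.

```python
def label_sentence(sentence, subject, kb):
    """Naively label every mention of a known object as a positive example.

    Returns (relation, object, character offset) triples. A sentence is
    only used if it mentions the subject by name.
    """
    if subject not in sentence:
        return []
    examples = []
    for relation, objects in kb[subject].items():
        for obj in objects:
            start = sentence.find(obj)
            while start != -1:
                examples.append((relation, obj, start))
                start = sentence.find(obj, start + 1)
    return examples


# Toy knowledge base containing only the facts from the slide.
kb = {"The Beatles": {"album": ["Let It Be"], "track": ["Let It Be"]}}
sentence = ("Let It Be is the twelfth album by The Beatles "
            "which contains their hit single Let It Be.")

if __name__ == "__main__":
    for example in label_sentence(sentence, "The Beatles", kb):
        print(example)
```

Each of the two mentions is labelled as a positive example for both album and track, so half of the four generated examples are wrong, which is the noise the slide warns about.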
  • 45. 45 Distant Supervision Challenges •  Automatically generating training data •  Can lead to noisy training examples •  Evaluation •  If training data is generated automatically, how / on what data can approaches be evaluated? •  Co-Reference Resolution •  Does training / testing data have to contain names of subj and obj directly? •  Named Entity Recognition and Classification •  Supervised off-the-shelf NERC approaches are not perfect (see rest of talk)
  • 46. 46 Conclusions / Future Work •  Distant supervision makes it possible to populate knowledge bases automatically, without manual effort •  Distant supervision can be applied to any domain •  Ongoing challenges: •  Reducing errors made by automatic labeling •  Distant supervision with co-reference resolution •  NERC for distant supervision
  • 47. 47 References •  Isabelle Augenstein, Andreas Vlachos, Diana Maynard (2015). Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning. EMNLP 2015. •  Isabelle Augenstein, Diana Maynard, Fabio Ciravegna (2015). Distantly Supervised Web Relation Extraction for Knowledge Base Population. Semantic Web Journal. •  Isabelle Augenstein, Diana Maynard, Fabio Ciravegna (2014). Relation Extraction from the Web using Distant Supervision. EKAW 2014, nominated for best paper award. •  Isabelle Augenstein (2014). Joint Information Extraction from the Web using Linked Data. ISWC 2014. •  Isabelle Augenstein (2014). Seed Selection for Distantly Supervised Web-Based Relation Extraction. SWAIE Workshop at COLING 2014.
  • 48. 48 References Distant Supervision: •  Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data. ACL-IJCNLP. NERC: •  Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. ACL. •  Xiao Ling and Daniel S. Weld. 2012. Fine-Grained Entity Recognition. AAAI. Imitation Learning: •  Stéphane Ross, Geoffrey J. Gordon, and Drew Bagnell. 2011. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. JMLR. •  Juan Ortega, Noor Shaker, Julian Togelius and Georgios N. Yannakakis. 2013. Imitating human playing styles in Super Mario Bros. Entertainment Computing, Elsevier.
  • 49. 49 Thank you for your attention! Questions?