Deep Learning for NLP
Yves Peirsman
About me
2010 PhD Computational Linguistics, KULeuven
2011 Post-doctoral researcher, Stanford University
2012 NLP engineer
2014 NLP Town
Deep Learning in NLP
1950s Rule-based NLP
● Hand-written linguistic rules
● Knowledge models
1990 Statistical NLP
● Machine learning from data
● Many different models
2012 Deep Learning
● Comeback of neural networks
● Unified framework for many problems
20?? ??
Deep Learning in NLP
● Basic models: the basic NLPer’s toolkit
● Main advantages: why has DL become so popular in NLP?
● Beyond the hype: deeper dive & recent trends
Word embeddings
Words as atomic units vs. words as dense embeddings
● The movie has an excellent cast. → M
● I like the cover of the book. → B
● There were too many pages in the novel. → ?
[Figure: the words "book", "novel" and "movie" plotted in an embedding space]
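The contrast on this slide can be sketched in a few lines of Python. The three-dimensional vectors below are made-up values for illustration only, not trained embeddings, and `one_hot` and `cosine` are hypothetical helper names. With one-hot (atomic) vectors, "novel" has no similarity to "book"; with dense vectors it can.

```python
import math

# Toy dense embeddings; the numbers are invented for illustration only.
EMBEDDINGS = {
    "movie": [0.9, 0.1, 0.3],
    "book": [0.1, 0.9, 0.4],
    "novel": [0.2, 0.8, 0.5],
}


def one_hot(word, vocab):
    """Represent a word as an atomic unit: 1 at its own index, 0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocab = sorted(EMBEDDINGS)

# Atomic units: every pair of distinct words is equally unrelated.
atomic_sim = cosine(one_hot("novel", vocab), one_hot("book", vocab))

# Dense embeddings: similarity is graded.
dense_book = cosine(EMBEDDINGS["novel"], EMBEDDINGS["book"])
dense_movie = cosine(EMBEDDINGS["novel"], EMBEDDINGS["movie"])

print(atomic_sim, dense_book, dense_movie)
```

A classifier that has only seen "book" in training can still do something sensible with "novel" if the two vectors are close, which is the point of the "?" on the slide.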
Feature engineering
● Major feature engineering: strings → POS, syntax, capitalization, prefix, suffix, bigrams, lemmas
● Little feature engineering (if any): strings
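To make "major feature engineering" concrete, here is a minimal sketch of the kind of hand-written feature extractor a pre-deep-learning pipeline might use. The function name and the exact feature set are assumptions for illustration; POS tags, syntax and lemmas would need external tools and are left out.

```python
def handcrafted_features(tokens, i):
    """Return a feature dictionary for tokens[i] (illustrative feature set)."""
    word = tokens[i]
    prev = tokens[i - 1] if i > 0 else "<START>"
    return {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "prefix3": word[:3],
        "suffix3": word[-3:],
        "bigram": prev + " " + word,
    }


features = handcrafted_features(["John", "lives", "in", "London", "."], 3)
print(features)
```

In the deep learning setting, the model receives only the token strings (mapped to embeddings) and is expected to learn useful features itself.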
Models
● Distinct models for different problems
○ Text classification: PP + SVM
○ NER: PP + CRF
○ MT: PP + LM, TM, decoder
● Unified toolkit
○ Text classification, NER, MT, ...: LSTM (or similar)
NLP Toolkit: LSTM for classification
Applications: text classification, language modelling
[Figure: "The movie was boring ." → embeddings → LSTM (one cell per token) → dense layer (weights, biases) → positive / neutral / negative]
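The last stage of this architecture, the dense layer that maps the final LSTM state to class scores, can be sketched as follows. The hidden state, weights and biases are made-up numbers, and applying a softmax to the scores is an assumption, since the slide only names the dense layer and the three labels.

```python
import math

LABELS = ["positive", "neutral", "negative"]


def dense(hidden, weights, biases):
    """Dense layer: one score (logit) per label, computed as W·h + b."""
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Turn logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical final LSTM state for "The movie was boring ." (invented values).
final_hidden = [0.2, -0.7, 0.5]
weights = [
    [0.5, 1.0, -0.3],   # row for "positive"
    [0.1, 0.1, 0.1],    # row for "neutral"
    [-0.4, -1.2, 0.6],  # row for "negative"
]
biases = [0.0, 0.0, 0.0]

probs = softmax(dense(final_hidden, weights, biases))
prediction = LABELS[probs.index(max(probs))]
print(prediction, probs)
```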
NLP Toolkit: inside the LSTM
[Figure: LSTM cell processing "The movie was boring ...", with forget, input and output gates and two tanh nonlinearities]
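The gates named on this slide correspond to the standard LSTM update, which can be written as a single step function. This is a minimal pure-Python sketch of that textbook formulation, not necessarily the exact variant drawn on the slide; the parameter names (`Wf`, `bf`, ...) and the all-zero toy parameters are assumptions chosen so the result is easy to check by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vec):
    """Compute W·vec + b for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]    # input gate
    g = [math.tanh(a) for a in affine(params["Wg"], params["bg"], z)]  # candidate (tanh)
    o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]    # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]                   # second tanh
    return h, c


def zero_params(hidden, inputs):
    """All-zero toy parameters: every gate is 0.5 and the candidate is 0."""
    params = {}
    for gate in "figo":
        params["W" + gate] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
        params["b" + gate] = [0.0] * hidden
    return params


params = zero_params(hidden=2, inputs=2)
h, c = lstm_step([1.0, 0.0], [0.0, 0.0], [1.0, -2.0], params)
print(h, c)
```

With all-zero parameters the forget gate is 0.5, so the cell state is simply halved; trained weights let the network decide per dimension how much of the past to keep.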
NLP Toolkit: LSTM for sequence labelling
Applications: named entity recognition
[Figure: "John lives in London ." → embeddings → LSTM (one cell per token) → dense layer (weights, biases) → logits → B-PER O O B-LOC O]
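The per-token labels in this example (B-PER, O, B-LOC) still have to be turned into entities. The sketch below assumes the common BIO scheme, in which an `I-` tag continues the entity opened by a `B-` tag; the slide itself only shows `B-` and `O` tags, and `bio_to_spans` is a hypothetical helper name.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    spans = []
    current = None  # (label, list of words) for the entity being built
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)
        else:
            # "O", or an "I-" tag with no matching open entity: close any span.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]


entities = bio_to_spans(
    ["John", "lives", "in", "London", "."],
    ["B-PER", "O", "O", "B-LOC", "O"],
)
print(entities)
```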
NLP Toolkit: Encoder-Decoder Architectures
Applications: machine translation, text summarization, dialogue modelling, etc.
[Figure: encoder LSTM over the source embeddings of "Je t’ aime ." followed by a decoder LSTM over the target embeddings, starting from <END> and producing "I love you ."]
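The decoding loop implied by this architecture can be sketched with stand-ins for the two networks. Everything below is hypothetical: `toy_encode` and `toy_decode_step` replace the encoder and decoder LSTMs with a lookup table of invented scores, and greedy selection of the best-scoring word is one possible decoding strategy (the slide does not say which one is used). Only the control flow is the point: encode once, then feed each predicted word back in until `<END>` is produced.

```python
END = "<END>"

# Invented next-word scores, keyed by the previously generated word.
TOY_TABLE = {
    "<END>": {"I": 0.9, "you": 0.1},
    "I": {"love": 0.8, "you": 0.2},
    "love": {"you": 0.7, ".": 0.3},
    "you": {".": 0.9, "<END>": 0.1},
    ".": {"<END>": 1.0},
}


def toy_encode(source_tokens):
    """Stand-in for the encoder LSTM: the 'state' is just the source tokens."""
    return tuple(source_tokens)


def toy_decode_step(state, prev_token):
    """Stand-in for one decoder LSTM step: returns state and next-word scores."""
    return state, TOY_TABLE[prev_token]


def greedy_decode(source_tokens, encode, decode_step, max_len=10):
    """Encode the source once, then repeatedly pick the best-scoring next word."""
    state = encode(source_tokens)
    prev, output = END, []  # as on the slide, <END> is the first decoder input
    for _ in range(max_len):
        state, scores = decode_step(state, prev)
        prev = max(scores, key=scores.get)
        if prev == END:
            break
        output.append(prev)
    return output


translation = greedy_decode(["Je", "t'", "aime", "."], toy_encode, toy_decode_step)
print(translation)
```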
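NLP Toolkit: Attention
Applications: machine translation, question answering, etc.
[Figure: encoder LSTM over the source embeddings of "Je t’ aime ." and decoder LSTM over the target embeddings, starting from <END> and producing "I love you .", with an attention layer connecting the decoder to the encoder states]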
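A minimal sketch of what the attention layer computes at one decoder step: a weight for every encoder state, and a weighted average of those states as context. Dot-product scoring is an assumption here (the slide does not specify the scoring function), and the state vectors are invented numbers.

```python
import math


def attention(query, keys, values):
    """Dot-product attention: softmax over query·key scores, then a weighted sum."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


# Hypothetical encoder states for "Je", "t'", "aime", "." (invented values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical decoder state at the step that should produce "love".
decoder_state = [2.0, 2.0]

weights, context = attention(decoder_state, encoder_states, encoder_states)
print(weights, context)
```

Instead of relying on a single final encoder state, the decoder gets a fresh, input-dependent summary of the source sentence at every step.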
NLP under threat?
Deep learning models have taken NLP by storm, achieving superior
results across many applications.
Many DL approaches do not model any linguistic knowledge.
They view language as a sequence of strings.
Is this the end of NLP as a separate discipline?
Language models
Rajeswar et al. 2017, https://arxiv.org/pdf/1705.10929.pdf
Language models
● Great performance when explicitly trained for the task (predicting whether the upcoming verb is singular or plural): 99% correct
○ > 120,000 sentence starts, labelled with singular or plural.
○ 50-dimensional LSTM followed by logistic regression.
○ In > 95% of the cases, the last noun determines the number.
● Performance drop for generic language models: 93% correct
○ Worse than chance on cases where a noun of the “incorrect” number occurs between the subject and the verb.
Linzen, Dupoux & Goldberg 2016, https://arxiv.org/pdf/1611.01368.pdf
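The "last noun" observation suggests a trivial baseline, sketched below, and also shows why intervening nouns are the hard cases. The tag names (`NN` for singular, `NNS` for plural nouns) and the hand-tagged toy prefixes are assumptions for illustration; this is not the model or the data of the cited paper.

```python
def last_noun_number(tagged_prefix):
    """Guess the verb's number from the last noun in a tagged sentence start."""
    for _word, tag in reversed(tagged_prefix):
        if tag == "NNS":
            return "plural"
        if tag == "NN":
            return "singular"
    return None  # no noun seen


# Simple case: the subject is the last noun, so the heuristic is right.
simple = [("The", "DT"), ("keys", "NNS")]

# A noun of the "incorrect" number occurs between subject and verb:
# the subject "keys" is plural, but the last noun "cabinet" is singular.
with_intervening_noun = [
    ("The", "DT"), ("keys", "NNS"), ("to", "TO"), ("the", "DT"), ("cabinet", "NN"),
]

print(last_noun_number(simple))                 # correct guess
print(last_noun_number(with_intervening_noun))  # wrong guess
```

A model that only learned this shortcut would score well on most sentences and still fail systematically on exactly the cases that require syntactic structure.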
Machine Translation
● NMT can behave strangely
● Problems for languages with a very different syntax, such as English and Chinese:
○ 25% of Chinese noun phrases are translated into discontinuous phrases in English
○ Chinese noun phrases are often translated twice
Li et al. 2017, https://arxiv.org/abs/1705.01020
Question Answering
Jia & Liang 2017, https://arxiv.org/pdf/1707.07328.pdf
Textual entailment
Deep Learning in NLP
● Deep Learning produces great results on many tasks.
● But:
○ Race to the bottom on standard data sets:
■ Language models: Penn Treebank, WikiText-2
■ Machine Translation: WMT datasets
■ Question Answering: SQuAD
○ Its ignorance of linguistic structure is problematic in the evolution towards NLU
● So:
○ What do neural networks model?
○ How can we make them better?
Linguistic knowledge in MT
● What linguistic knowledge does MT model?
● Simple syntactic labels
○ Encoder output + logistic regression
■ Word-level output: part-of-speech
■ Sentence-level output: voice (active or passive), tense (past or present)
● Deep syntactic structure
○ Encoder output + decoder to predict parse trees
● Two benchmarks:
○ Upper bound: neural parser
○ Lower bound: English-to-English “MT” auto-encoder
Shi et al. 2016, https://www.isi.edu/natural-language/mt/emnlp16-nmt-grammar.pdf
Linguistic knowledge in MT
Solution 1: present the encoder with both syntactic and lexical information
Li et al. 2017, https://arxiv.org/abs/1705.01020
Linguistic knowledge in MT
Solution 2: combine MT with parsing in multi-task learning
Eriguchi et al. 2017, http://www.aclweb.org/anthology/P/P17/P17-2012.pdf
Linguistic knowledge in QA
● Most answers to questions are constituents in the sentence.
● Restricting our candidate answers to constituents reduces the search space.
● Instead of feeding the network flat sequences, we need to feed it syntax trees.
Xie and Xing 2017, http://www.aclweb.org/anthology/P/P17/P17-1129.pdf
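The reduction in search space can be illustrated by enumerating the constituents of a parsed sentence. This is a sketch under stated assumptions: the bracketed tree format, the example parse and the helper names are all illustrative and are not taken from the cited paper.

```python
def read_tree(text):
    """Parse a bracketed tree such as "(NP (D the) (N book))" into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # skip ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return the words covered by a tree, left to right."""
    _label, children = tree
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def constituents(tree):
    """List every constituent as (label, text): the candidate answers."""
    label, children = tree
    found = [(label, " ".join(leaves(tree)))]
    for child in children:
        if isinstance(child, tuple):
            found.extend(constituents(child))
    return found


tree = read_tree("(S (NP John) (VP (V lives) (PP (P in) (NP London))))")
candidates = constituents(tree)
print(candidates)
```

For this four-word sentence there are ten contiguous word spans but only seven constituents, and a non-constituent span such as "John lives" is never proposed as an answer.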
Conclusions
● Deep Learning works great for NLP, but it is not a silver bullet.
● For simple tasks, simple string input may suffice, but for deeper natural language understanding it likely will not.
● To tackle this challenge, we need to:
○ Better understand what neural networks model,
○ Help them model more linguistic knowledge,
○ Combine language with other modalities.
yves@nlp.town
Yves Peirsman - Deep Learning for NLP

  • 1. Deep Learning for NLP Yves Peirsman
  • 2. About me 2012 NLP engineer 2011 Post-doctoral researcher, Stanford University 2014 NLP Town 2010 PhD Computational Linguistics, KULeuven
  • 3. Deep Learning in NLP 2012 Deep Learning ● Comeback of neural networks ● Unified framework for many problems 1990 Statistical NLP ● Machine learning from data ● Many different models 1950s Rule-based NLP ● Hand-written linguistic rules ● Knowledge models 20?? ??
  • 4. Deep Learning in NLP Basic models The basic NLPer’s toolkit Main advantages Why has DL become so popular in NLP? Beyond the hype Deeper dive & recent trends
  • 5. Word embeddings Words as atomic units Words as dense embeddings The movie has an excellent cast. M I like the cover of the book. B There were too many pages in the novel. ? book novel movie
  • 6. POS syntaxcapitaliz ation prefix suffix bigrams lemmas Feature engineering Major feature engineering Little feature engineering (if any) strings strings
  • 7. Models Distinct models for different problems Unified toolkit Text classification NER MT ... PP PP PP SVM CRF LM TM decoder Text classification NER MT ... LSTM (or similar)
  • 8. NLP Toolkit: LSTM for classification Applications: text classification, language modelling LSTM LSTM LSTM LSTM LSTM embeddings dense layer weights biases The movie was boring . positive neutral negative
  • 9. NLP Toolkit: inside the LSTM forget input tanh output tanh was boring The movie ...
  • 10. NLP Toolkit: LSTM for sequence labelling Applications: named entity recognition LSTM LSTM LSTM LSTM LSTM embeddings dense layer logits B-PER O O B-LOC O weights biases John lives in London .
  • 11. NLP Toolkit: Encoder-Decoder Architectures Applications: machine translation, text summarization, dialogue modelling, etc. LSTM LSTM LSTM LSTM source embeddings LSTM LSTM LSTM LSTM LSTM target embeddings Je t’ aime . <END> I love you .
  • 12. NLP Toolkit: Attention Applications: machine translation, question answering, etc. LSTM LSTM LSTM LSTM source embeddings LSTM LSTM LSTM LSTM LSTM target embeddings Je t’ aime . <END> I love you . attention
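The attention step added on this slide can be illustrated with a small sketch: the decoder state is compared with every encoder state, the scores are normalised with a softmax, and the encoder states are averaged with those weights. The slide does not say which scoring function is used, so the dot product below is an assumption, and all vectors are made-up toy values.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Dot-product attention: returns the weights and the context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context

# Hypothetical encoder states for the source tokens "Je", "t'", "aime", "."
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
# Hypothetical decoder state while producing the next target word
decoder_state = [0.0, 5.0]

weights, context = attend(decoder_state, encoder_states)
print([round(w, 3) for w in weights], [round(v, 3) for v in context])
```

With these toy values the second source position receives most of the weight, so the context vector is dominated by that encoder state.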
  • 13. NLP under threat? Deep learning models have taken NLP by storm, achieving superior results across many applications. Many DL approaches do not model any linguistic knowledge. They view language as a sequence of strings. Is this the end of NLP as a separate discipline?
  • 14. NLP under threat? Deep learning models have taken NLP by storm, achieving superior results across many applications.
  • 15. Language models Rajeswar et al. 2017, https://arxiv.org/pdf/1705.10929.pdf
  • 16. Language models
      ● Great performance when explicitly trained for the task: 99% correct
          ○ > 120,000 sentence starts, labelled with singular or plural.
          ○ 50-dimensional LSTM followed by logistic regression.
          ○ In > 95% of the cases, the last noun determines the number.
      ● Performance drop for generic language models: 93% correct
          ○ Worse than chance on cases where a noun of the “incorrect” number occurs between the subject and the verb
      Linzen, Dupoux & Goldberg 2016, https://arxiv.org/pdf/1611.01368.pdf
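The observation that the last noun determines the number in more than 95% of cases suggests a trivial baseline, sketched below. It also shows why the remaining cases are hard: when a noun of the other number sits between the subject and the verb, the heuristic is wrong. The function name is hypothetical, the input is assumed to be already part-of-speech tagged with Penn Treebank style noun tags (`NN`, `NNS`, `NNP`, `NNPS`), and the two sentence starts are illustrative examples rather than items from the paper's data.

```python
def last_noun_number(tagged_prefix):
    """Predict the number of the upcoming verb from the last noun seen."""
    for _word, tag in reversed(tagged_prefix):
        if tag in ("NN", "NNP"):
            return "singular"
        if tag in ("NNS", "NNPS"):
            return "plural"
    return None  # no noun in the sentence start

# Easy case: the last noun is the subject.
easy = [("The", "DT"), ("key", "NN")]
# Hard case: an intervening noun of the other number ("cabinet"),
# although the subject "keys" requires a plural verb.
hard = [("The", "DT"), ("keys", "NNS"), ("to", "TO"),
        ("the", "DT"), ("cabinet", "NN")]

print(last_noun_number(easy))  # singular, which is correct
print(last_noun_number(hard))  # singular, although the correct answer is plural
```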
  • 17. Machine Translation
      ● NMT can behave strangely
      ● Problems for languages with a very different syntax, such as English and Chinese:
          ○ 25% of Chinese noun phrases are translated into discontinuous phrases in English
          ○ Chinese noun phrases are often translated twice
      Li et al. 2017, https://arxiv.org/abs/1705.01020
  • 18. Question Answering Jia & Liang 2017, https://arxiv.org/pdf/1707.07328.pdf
  • 20. Deep Learning in NLP
      ● Deep Learning produces great results on many tasks.
      ● But:
          ○ Race to the bottom on standard data sets:
              ■ Language models: Penn Treebank, WikiText-2
              ■ Machine Translation: WMT datasets
              ■ Question Answering: SQuAD
          ○ Its ignorance of linguistic structure is problematic in the evolution towards NLU
      ● So:
          ○ What do neural networks model?
          ○ How can we make them better?
  • 21. Linguistic knowledge in MT
      ● What linguistic knowledge does MT model?
      ● Simple syntactic labels
          ○ Encoder output + logistic regression
              ■ Word-level output: part-of-speech
              ■ Sentence-level output: voice (active or passive), tense (past or present)
      ● Deep syntactic structure
          ○ Encoder output + decoder to predict parse trees
      ● Two benchmarks:
          ○ Upper bound: neural parser
          ○ Lower bound: English-to-English “MT” auto-encoder
      Shi et al. 2016, https://www.isi.edu/natural-language/mt/emnlp16-nmt-grammar.pdf
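The "encoder output + logistic regression" setup can be pictured as a probe: the encoder is kept fixed and a small classifier is trained on its output vectors to predict a label such as voice. The sketch below is only an illustration of that idea. The two-dimensional vectors stand in for real encoder outputs and are made up, the labels (1 for passive, 0 for active) are toy values, and the plain stochastic gradient descent loop is an assumption about how such a classifier might be fitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_probe(vectors, labels, epochs=200, lr=0.5):
    """Fit a logistic regression probe with plain stochastic gradient descent."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, vec)) + bias)
            err = y - p  # gradient of the log-likelihood with respect to the logit
            weights = [w + lr * err * v for w, v in zip(weights, vec)]
            bias += lr * err
    return weights, bias

def predict(vec, weights, bias):
    return 1 if sum(w * v for w, v in zip(weights, vec)) + bias > 0 else 0

# Made-up stand-ins for sentence-level encoder outputs (not real data).
vectors = [[1.0, 0.2], [0.8, -0.1], [-0.9, 0.3], [-1.1, -0.2]]
labels = [1, 1, 0, 0]  # toy labels: 1 = passive, 0 = active

weights, bias = train_probe(vectors, labels)
accuracy = sum(predict(v, weights, bias) == y
               for v, y in zip(vectors, labels)) / len(labels)
print(accuracy)
```

If a simple probe like this reaches high accuracy on held-out sentences, the label is easy to read off the encoder output, which is the kind of evidence the slide refers to.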
  • 24. Linguistic knowledge in MT Solution 1: present the encoder with both syntactic and lexical information Li et al. 2017, https://arxiv.org/abs/1705.01020
  • 25. Linguistic knowledge in MT Li et al. 2017, https://arxiv.org/abs/1705.01020
  • 26. Linguistic knowledge in MT Solution 2: combine MT with parsing in multi-task learning Eriguchi et al. 2017 http://www.aclweb.org/anthology/P/P17/P17-2012.pdf
  • 27. Linguistic knowledge in MT Eriguchi et al. 2017 http://www.aclweb.org/anthology/P/P17/P17-2012.pdf
  • 28. Linguistic knowledge in QA
      ● Most answers to questions are constituents in the sentence.
      ● Restricting our candidate answers to constituents reduces the search space.
      ● Instead of feeding the network flat sequences, we need to feed it syntax trees.
      Xie and Xing 2017, http://www.aclweb.org/anthology/P/P17/P17-1129.pdf
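The effect of restricting candidate answers to constituents can be made concrete by counting spans. The sketch below compares all contiguous token spans of a sentence with the spans that correspond to a node in its parse tree. The nested-tuple tree format and the hand-written parse of "John lives in London" are assumptions for illustration; they are not taken from Xie and Xing's system.

```python
def constituent_spans(tree, start=0):
    """Return (end, spans): spans are (start, end, label) for every tree node."""
    if isinstance(tree, str):  # a leaf is a single token
        return start + 1, []
    label, *children = tree
    spans = []
    pos = start
    for child in children:
        pos, child_spans = constituent_spans(child, pos)
        spans.extend(child_spans)
    spans.append((start, pos, label))
    return pos, spans

def all_spans(n):
    """Every contiguous span (start, end) over n tokens."""
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]

# Hand-written illustrative parse of "John lives in London".
tree = ("S",
        ("NP", "John"),
        ("VP", ("V", "lives"),
               ("PP", ("P", "in"), ("NP", "London"))))

n_tokens, spans = constituent_spans(tree)
candidates = {(s, e) for s, e, _label in spans}
print(len(all_spans(n_tokens)), len(candidates))
```

For a sentence of n tokens there are n(n+1)/2 contiguous spans, while the number of constituents grows only linearly with n, so the gap widens quickly for longer sentences.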
  • 29. Linguistic knowledge in QA Xie and Xing 2017, http://www.aclweb.org/anthology/P/P17/P17-1129.pdf
  • 30. Conclusions
      ● Deep Learning works great for NLP, but it is not a silver bullet.
      ● For simple tasks, simple string input may suffice, but for deeper natural language understanding likely not.
      ● To tackle this challenge, we need to:
          ○ Better understand what neural networks model,
          ○ Help them model more linguistic knowledge,
          ○ Combine language with other modalities.
      yves@nlp.town