Distributed Representations for
Natural Language Processing
Tomas Mikolov, Facebook
ML Prague 2016
Structure of this talk
• Motivation
• Word2vec
• Architecture
• Evaluation
• Examples
• Discussion
Motivation
Representation of text is very important for performance of many real-world
applications: search, ads recommendation, ranking, spam filtering, …
• Local representations
• N-grams
• 1-of-N coding
• Bag-of-words
• Continuous representations
• Latent Semantic Analysis
• Latent Dirichlet Allocation
• Distributed Representations
Motivation: example
Suppose you want to quickly build a classifier:
• Input = keyword, or user query
• Output = is user interested in X? (where X can be a service, ad, …)
• Toy classifier: is X capital city?
• Getting training examples can be difficult, costly, and time consuming
• With local representations of input (1-of-N), one will need many
training examples for decent performance
Motivation: example
Suppose we have a few training examples:
• (Rome, 1)
• (Turkey, 0)
• (Prague, 1)
• (Australia, 0)
• …
Can we build a good classifier without much effort?
YES, if we use good pre-trained features.
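As a rough illustration of the point above, here is a minimal sketch of a nearest-centroid classifier trained on the four labeled examples. The feature vectors, the `FEATURES` table and the helper names are made up for this example; they are not real pre-trained word2vec output.

```python
from math import dist

# Hypothetical pre-trained features (invented numbers, for illustration only).
FEATURES = {
    "Rome": (0.2, 0.5, 0.1),
    "Prague": (0.2, 0.4, 0.1),
    "Turkey": (0.5, 0.7, 0.2),
    "Australia": (0.6, 0.8, 0.3),
    "Tokyo": (0.2, 0.4, 0.3),
    "Italy": (0.5, 0.8, 0.2),
}

# The few labeled examples from the slide: 1 = capital city, 0 = not.
TRAIN = [("Rome", 1), ("Turkey", 0), ("Prague", 1), ("Australia", 0)]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def fit(train, features):
    """Compute one centroid per label from the labeled words."""
    by_label = {}
    for word, label in train:
        by_label.setdefault(label, []).append(features[word])
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(word, centroids, features):
    """Return the label whose centroid is nearest to the word's feature vector."""
    return min(centroids, key=lambda label: dist(features[word], centroids[label]))


centroids = fit(TRAIN, FEATURES)
print(predict("Tokyo", centroids, FEATURES))  # unseen word, classified via its features
print(predict("Italy", centroids, FEATURES))
```

The classifier never saw Tokyo or Italy during training; whatever it gets right comes from the features, which is the argument the slide makes.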
Motivation: example
Pre-trained features: a way to leverage vast amounts of unannotated text data
• Local features:
• Prague = (0, 1, 0, 0, ..)
• Tokyo = (0, 0, 1, 0, ..)
• Italy = (1, 0, 0, 0, ..)
• Distributed features:
• Prague = (0.2, 0.4, 0.1, ..)
• Tokyo = (0.2, 0.4, 0.3, ..)
• Italy = (0.5, 0.8, 0.2, ..)
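The difference between the two feature types can be sketched with a distance computation. The example below uses only the components shown on the slide (the elided ".." parts are ignored, and the local vectors are assumed to have four dimensions), so it is an illustration rather than a measurement on real vectors.

```python
from math import dist, isclose

# Local (1-of-N) features, truncated to the four components shown above.
LOCAL = {
    "Prague": (0, 1, 0, 0),
    "Tokyo": (0, 0, 1, 0),
    "Italy": (1, 0, 0, 0),
}

# Distributed features, truncated to the three components shown above.
DISTRIBUTED = {
    "Prague": (0.2, 0.4, 0.1),
    "Tokyo": (0.2, 0.4, 0.3),
    "Italy": (0.5, 0.8, 0.2),
}

for name, table in (("local", LOCAL), ("distributed", DISTRIBUTED)):
    to_tokyo = dist(table["Prague"], table["Tokyo"])
    to_italy = dist(table["Prague"], table["Italy"])
    print(f"{name}: Prague-Tokyo={to_tokyo:.3f} Prague-Italy={to_italy:.3f}")
```

With 1-of-N coding every pair of distinct words is equally far apart, so the representation says nothing about which words are related. With the distributed vectors, Prague ends up closer to Tokyo (another capital) than to Italy.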
Distributed representations
• We hope to learn such representations so that Prague, Rome, Berlin,
Paris etc. will be close to each other
• We do not want just to cluster words: we seek representations that
can capture multiple degrees of similarity: Prague is similar to Berlin
in some way, and to Czech Republic in another way
• Can this even be done without manually created databases like
WordNet / knowledge graphs?
Word2vec
• Simple neural nets can be used to obtain distributed representations
of words (Hinton et al, 1986; Elman, 1991; …)
• The resulting representations have interesting structure – vectors can
be obtained using a shallow network (Mikolov, 2007)
Word2vec
• Deep learning for NLP (Collobert & Weston, 2008): let’s use deep
neural networks! It works great!
• Back to shallow nets: Word2vec toolkit (Mikolov et al, 2013) -> much
more efficient than deep networks for this task
Word2vec
Two basic architectures:
• Skip-gram
• CBOW
Two training objectives:
• Hierarchical softmax
• Negative sampling
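For negative sampling, the loss for a single (current word, context word) pair can be sketched as follows, in the form described in Mikolov, Sutskever, Chen, Corrado, Dean (2013). The function and parameter names are chosen for this example, and drawing the negative words from a noise distribution is left out.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(center, positive, negatives):
    """Loss for one training pair with k sampled negative words.

    center    -- input vector of the current word
    positive  -- output vector of the observed context word
    negatives -- output vectors of k words drawn from a noise distribution
    """
    # Push the observed pair towards a high score ...
    loss = -math.log(sigmoid(dot(positive, center)))
    # ... and each sampled negative word towards a low score.
    for negative in negatives:
        loss -= math.log(sigmoid(-dot(negative, center)))
    return loss


good = negative_sampling_loss((1.0, 0.0), (2.0, 0.0), [(-2.0, 0.0)])
bad = negative_sampling_loss((1.0, 0.0), (-2.0, 0.0), [(2.0, 0.0)])
print(good, bad)
```

Only the k negative words and the one positive word are touched per update, rather than the whole vocabulary, which is where the efficiency comes from.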
Plus a bunch of tricks: weighting of distant words, down-sampling of frequent
words
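The two tricks can be sketched as small probability functions. The down-sampling formula follows the one given in Mikolov, Sutskever, Chen, Corrado, Dean (2013); the threshold default `t=1e-5` and the assumption that distant words are down-weighted by drawing the window size uniformly from 1 to the maximum are illustrative choices, and the released toolkit may differ in detail.

```python
import math


def discard_probability(freq, t=1e-5):
    """Chance of dropping one occurrence of a word.

    freq -- relative frequency of the word in the corpus (0 < freq <= 1)
    t    -- threshold; words rarer than t are never dropped
    """
    return max(0.0, 1.0 - math.sqrt(t / freq))


def inclusion_probability(distance, max_window):
    """Chance that a word at the given distance is used as context,
    when the window size is drawn uniformly from 1..max_window."""
    if distance < 1 or distance > max_window:
        return 0.0
    return (max_window - distance + 1) / max_window


print(discard_probability(0.01))     # a very frequent word is mostly dropped
print(discard_probability(1e-7))     # a rare word is always kept
print([inclusion_probability(d, 5) for d in range(1, 6)])
```

Adjacent words are always used, while words at the edge of the window are used only a fraction of the time, which amounts to giving distant words less weight.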
Skip-gram Architecture
• Predicts the surrounding words given the current word
Continuous Bag-of-words Architecture
• Predicts the current word given the context
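The two architectures differ mainly in how training examples are cut out of the text. A minimal sketch, with a fixed window and helper names invented for this example:

```python
def context_of(tokens, i, window):
    """Words within `window` positions of tokens[i], excluding tokens[i] itself."""
    lo = max(0, i - window)
    return tokens[lo:i] + tokens[i + 1:i + 1 + window]


def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (current word, surrounding word) pair per context word."""
    return [
        (tokens[i], context_word)
        for i in range(len(tokens))
        for context_word in context_of(tokens, i, window)
    ]


def cbow_examples(tokens, window=2):
    """CBOW: one (surrounding words, current word) example per position."""
    return [(context_of(tokens, i, window), tokens[i]) for i in range(len(tokens))]


sentence = "prague is the capital of the czech republic".split()
print(skipgram_pairs(sentence)[:4])
print(cbow_examples(sentence)[:2])
```

Skip-gram produces many small prediction tasks per position, while CBOW produces one task per position with the whole context as input.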
Word2vec: Linguistic Regularities
• After training is finished, the weight matrix between the input and hidden layers
represents the word feature vectors
• The word vector space implicitly encodes many regularities among words:
Linguistic Regularities in Word Vector Space
• The resulting distributed representations of words contain
a surprising amount of syntactic and semantic information
• There are multiple degrees of similarity among words:
• KING is similar to QUEEN as MAN is similar to WOMAN
• KING is similar to KINGS as MAN is similar to MEN
• Simple vector operations with the word vectors provide very intuitive
results (King – man + woman ~= Queen)
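The vector arithmetic above can be sketched on toy data. The 2-d vectors below are invented so that one axis roughly means "royalty" and the other "gender"; real word vectors are learned and have hundreds of dimensions. Excluding the three query words from the candidates is an assumption of this sketch.

```python
import math

# Hypothetical toy vectors: (royalty, gender). Not learned from data.
VECTORS = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "prince": (0.8, 0.9),
    "apple": (0.1, 0.0),
}


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c, vectors):
    """Word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(vectors[word], target))


print(analogy("king", "man", "woman", VECTORS))
```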
Linguistic Regularities - Evaluation
• Regularity of the learned word vector space was evaluated using a test
set with about 20K analogy questions
• The test set contains both syntactic and semantic questions
• Comparison to the previous state of the art (pre-2013)
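Scoring such a test set can be sketched as exact-match accuracy over "a is to b as c is to ?" questions. The questions and the lookup-table predictor below are stand-ins invented for this example; in practice `predict` would query trained word vectors.

```python
def analogy_accuracy(questions, predict):
    """Fraction of (a, b, c, expected) questions where predict(a, b, c) == expected."""
    if not questions:
        return 0.0
    hits = sum(1 for a, b, c, expected in questions if predict(a, b, c) == expected)
    return hits / len(questions)


# Hypothetical questions, mixing semantic and syntactic relations.
QUESTIONS = [
    ("man", "woman", "king", "queen"),               # semantic
    ("man", "men", "king", "kings"),                 # syntactic
    ("Rome", "Italy", "Prague", "Czech Republic"),   # semantic
]

# Stand-in for a trained model: a fixed answer table with one deliberate mistake.
ANSWERS = {
    ("man", "woman", "king"): "queen",
    ("man", "men", "king"): "kings",
    ("Rome", "Italy", "Prague"): "Austria",
}


def lookup_predict(a, b, c):
    return ANSWERS.get((a, b, c))


print(analogy_accuracy(QUESTIONS, lookup_predict))
```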
Linguistic Regularities - Evaluation
Linguistic Regularities - Examples
Visualization using PCA
Summary and discussion
• Word2vec: much faster and far more accurate than previous neural-net-based
solutions - the speed-up of training compared to the prior state of the art is
more than 10 000 times! (literally from weeks to seconds)
• Features derived from word2vec are now used across all big IT companies
in plenty of applications (search, ads, ..)
• Also very popular in the research community: a simple way to boost
performance in many NLP tasks
• Main reasons for its success: very fast, open-source, and the resulting
features are easy to use to boost many applications (even non-NLP)
Follow up work
Baroni, Dinu, Kruszewski (2014): Don't count, predict! A systematic
comparison of context-counting vs. context-predicting semantic vectors
• Turns out neural-based approaches are very close to traditional
distributional semantics models
• Luckily, word2vec significantly outperformed the best previous
models across many tasks
Follow up work
Pennington, Socher, Manning (2014): Glove: Global Vectors for Word
Representation
• Word2vec version from Stanford: almost identical, but a new name
• In some sense a step back: word2vec counts co-occurrences and does
dimensionality reduction together, Glove is a two-pass algorithm
Follow up work
Levy, Goldberg, Dagan (2015): Improving distributional similarity with
lessons learned from word embeddings
• Hyper-parameter tuning is important: debunks the claims of
superiority of Glove
• Compares models trained on the same data (unlike Glove…):
word2vec is faster, its vectors are better, and it consumes much less memory
• Many others reached similar conclusions (Radim Rehurek, …)
Final notes
• Word2vec is successful because it is simple, but it cannot be applied
everywhere
• For modeling sequences of words, consider recurrent networks
• Do not sum word vectors to obtain representations of sentences; it will not
work well
• Be careful about the hype, as always … the most cited papers often contain
non-reproducible results
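One likely reason why summing word vectors represents sentences poorly is that a sum ignores word order. The sketch below shows this with made-up vectors; it illustrates that single limitation only and is not necessarily the argument the speaker had in mind.

```python
# Hypothetical word vectors; values are exact binary fractions so that
# floating-point sums do not depend on the order of addition.
VECTORS = {
    "dog": (0.75, 0.25),
    "bites": (0.5, 0.5),
    "man": (0.25, 0.75),
}


def sum_vectors(sentence, vectors):
    """Represent a sentence as the component-wise sum of its word vectors."""
    words = sentence.split()
    return tuple(sum(column) for column in zip(*(vectors[w] for w in words)))


first = sum_vectors("dog bites man", VECTORS)
second = sum_vectors("man bites dog", VECTORS)
print(first, second, first == second)
```

Two sentences with opposite meanings receive the same representation, which a sequence model such as a recurrent network can avoid.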
References
• Mikolov (2007): Language Modeling for Speech Recognition in Czech
• Collobert, Weston (2008): A unified architecture for natural language processing: Deep neural networks with
multitask learning
• Mikolov, Karafiat, Burget, Cernocky, Khudanpur (2010): Recurrent neural network based language model
• Mikolov (2012): Statistical Language Models Based on Neural Networks
• Mikolov, Yih, Zweig (2013): Linguistic Regularities in Continuous Space Word Representations
• Mikolov, Chen, Corrado, Dean (2013): Efficient estimation of word representations in vector space
• Mikolov, Sutskever, Chen, Corrado, Dean (2013): Distributed representations of words and phrases and their
compositionality
• Baroni, Dinu, Kruszewski (2014): Don't count, predict! A systematic comparison of context-counting vs.
context-predicting semantic vectors
• Pennington, Socher, Manning (2014): Glove: Global Vectors for Word Representation
• Levy, Goldberg, Dagan (2015): Improving distributional similarity with lessons learned from word
embeddings

More Related Content

What's hot

I Am A Donut - How To Avoid International SEO Mistakes
I Am A Donut - How To Avoid International SEO MistakesI Am A Donut - How To Avoid International SEO Mistakes
I Am A Donut - How To Avoid International SEO Mistakes
Tom Brennan
 
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Koray Tugberk GUBUR
 
brightonSEO - Stress Is Contagious Don't Catch It From Your Clients
brightonSEO - Stress Is Contagious Don't Catch It From Your ClientsbrightonSEO - Stress Is Contagious Don't Catch It From Your Clients
brightonSEO - Stress Is Contagious Don't Catch It From Your Clients
Kathryn Monkcom
 
How Search Works
How Search WorksHow Search Works
How Search Works
Ahrefs
 
Quality Content at Scale Through Automated Text Summarization of UGC
Quality Content at Scale Through Automated Text Summarization of UGCQuality Content at Scale Through Automated Text Summarization of UGC
Quality Content at Scale Through Automated Text Summarization of UGC
Hamlet Batista
 
Log File Analysis
Log File AnalysisLog File Analysis
Log File Analysis
Elias Dabbas
 
Keyword Research for SEO: Best Practices & Top Tips
Keyword Research for SEO: Best Practices & Top TipsKeyword Research for SEO: Best Practices & Top Tips
Keyword Research for SEO: Best Practices & Top Tips
Search Engine Journal
 
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
Areej AbuAli
 
Explainability for Learning to Rank
Explainability for Learning to RankExplainability for Learning to Rank
Explainability for Learning to Rank
Sease
 
Advanced Ways to Use Ahrefs (That You Didn't Know About)
Advanced Ways to Use Ahrefs (That You Didn't Know About)Advanced Ways to Use Ahrefs (That You Didn't Know About)
Advanced Ways to Use Ahrefs (That You Didn't Know About)
Ahrefs
 
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
SEO Case Study - Hangikredi.com From 12 March to 24 September Core UpdateSEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
Koray Tugberk GUBUR
 
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance FrameworkGoodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Aleyda Solís
 
4-Step SEO Waltz: Tackle SEO Challenges Head-On
4-Step SEO Waltz: Tackle SEO Challenges Head-On4-Step SEO Waltz: Tackle SEO Challenges Head-On
4-Step SEO Waltz: Tackle SEO Challenges Head-On
Search Engine Journal
 
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Massimiliano Geraci
 
How to control googlebot
How to control googlebotHow to control googlebot
How to control googlebot
Serge Bezborodov
 
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
Aleyda Solís
 
Semantic search Bill Slawski DEEP SEA Con
Semantic search Bill Slawski DEEP SEA ConSemantic search Bill Slawski DEEP SEA Con
Semantic search Bill Slawski DEEP SEA Con
Bill Slawski
 
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Koray Tugberk GUBUR
 
Google Sheets For SEO - Tom Pool - London SEO Meetup XL
Google Sheets For SEO - Tom Pool - London SEO Meetup XLGoogle Sheets For SEO - Tom Pool - London SEO Meetup XL
Google Sheets For SEO - Tom Pool - London SEO Meetup XL
Tom Pool
 
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
Martijn Scheijbeler
 

What's hot (20)

I Am A Donut - How To Avoid International SEO Mistakes
I Am A Donut - How To Avoid International SEO MistakesI Am A Donut - How To Avoid International SEO Mistakes
I Am A Donut - How To Avoid International SEO Mistakes
 
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
Semantic Search Engine: Semantic Search and Query Parsing with Phrases and En...
 
brightonSEO - Stress Is Contagious Don't Catch It From Your Clients
brightonSEO - Stress Is Contagious Don't Catch It From Your ClientsbrightonSEO - Stress Is Contagious Don't Catch It From Your Clients
brightonSEO - Stress Is Contagious Don't Catch It From Your Clients
 
How Search Works
How Search WorksHow Search Works
How Search Works
 
Quality Content at Scale Through Automated Text Summarization of UGC
Quality Content at Scale Through Automated Text Summarization of UGCQuality Content at Scale Through Automated Text Summarization of UGC
Quality Content at Scale Through Automated Text Summarization of UGC
 
Log File Analysis
Log File AnalysisLog File Analysis
Log File Analysis
 
Keyword Research for SEO: Best Practices & Top Tips
Keyword Research for SEO: Best Practices & Top TipsKeyword Research for SEO: Best Practices & Top Tips
Keyword Research for SEO: Best Practices & Top Tips
 
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
[BrightonSEO 2022] Unlocking the Hidden Potential of Product Listing Pages
 
Explainability for Learning to Rank
Explainability for Learning to RankExplainability for Learning to Rank
Explainability for Learning to Rank
 
Advanced Ways to Use Ahrefs (That You Didn't Know About)
Advanced Ways to Use Ahrefs (That You Didn't Know About)Advanced Ways to Use Ahrefs (That You Didn't Know About)
Advanced Ways to Use Ahrefs (That You Didn't Know About)
 
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
SEO Case Study - Hangikredi.com From 12 March to 24 September Core UpdateSEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
SEO Case Study - Hangikredi.com From 12 March to 24 September Core Update
 
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance FrameworkGoodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
Goodbye SEO fck ups! Learn to set an SEO Quality Assurance Framework
 
4-Step SEO Waltz: Tackle SEO Challenges Head-On
4-Step SEO Waltz: Tackle SEO Challenges Head-On4-Step SEO Waltz: Tackle SEO Challenges Head-On
4-Step SEO Waltz: Tackle SEO Challenges Head-On
 
Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022Semantic Publishing and Entity SEO - Conteference 20-11-2022
Semantic Publishing and Entity SEO - Conteference 20-11-2022
 
How to control googlebot
How to control googlebotHow to control googlebot
How to control googlebot
 
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
E-Commerce SEO Horror Stories : How to tackle the most common issues 
at scal...
 
Semantic search Bill Slawski DEEP SEA Con
Semantic search Bill Slawski DEEP SEA ConSemantic search Bill Slawski DEEP SEA Con
Semantic search Bill Slawski DEEP SEA Con
 
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
Opinion-based Article Ranking for Information Retrieval Systems: Factoids and...
 
Google Sheets For SEO - Tom Pool - London SEO Meetup XL
Google Sheets For SEO - Tom Pool - London SEO Meetup XLGoogle Sheets For SEO - Tom Pool - London SEO Meetup XL
Google Sheets For SEO - Tom Pool - London SEO Meetup XL
 
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
BrightonSEO October 2022 - Martijn Scheybeler - SEO Testing: Find Out What Wo...
 

Viewers also liked

Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Emerging Trends in Online Search
Emerging Trends in Online SearchEmerging Trends in Online Search
Emerging Trends in Online Search
Distilled
 
word2vec - From theory to practice
word2vec - From theory to practiceword2vec - From theory to practice
word2vec - From theory to practice
hen_drik
 
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
Yurii Chernukha
 
12- MMA Forum Argentina 2016 - La televisión conectada - Verne
12- MMA Forum Argentina 2016 - La televisión conectada - Verne12- MMA Forum Argentina 2016 - La televisión conectada - Verne
12- MMA Forum Argentina 2016 - La televisión conectada - Verne
Mobile Marketing Association
 
Tools and tips for simplifying startup formation.
Tools and tips for simplifying startup formation.Tools and tips for simplifying startup formation.
Tools and tips for simplifying startup formation.
Alex Shoer
 
Evaluation question 1 notes
Evaluation question 1 notesEvaluation question 1 notes
Evaluation question 1 notesJoel Ryan
 
Evaluation; Question 02
Evaluation; Question 02Evaluation; Question 02
Evaluation; Question 02
30040996
 
Evaluation Question 1 - 8288
Evaluation Question 1 - 8288Evaluation Question 1 - 8288
Evaluation Question 1 - 8288aragorn337
 
Tu tecnologia es la mia
Tu tecnologia es la miaTu tecnologia es la mia
Tu tecnologia es la mia
maria camila rojas idarraga
 
Slides from Writing for Wikipedia Event
Slides from Writing for Wikipedia EventSlides from Writing for Wikipedia Event
Slides from Writing for Wikipedia Event
Kristen T
 
ILLICH YNEDT2074
ILLICH YNEDT2074ILLICH YNEDT2074
ILLICH YNEDT2074Ruth Davies
 
Почему я программирую на Perl‎
Почему я программирую на Perl‎Почему я программирую на Perl‎
Почему я программирую на Perl‎
Anatoly Sharifulin
 
Researching Genre
Researching GenreResearching Genre
Researching GenreHenry Tait
 
JJUG CCC 2014 ATL
JJUG CCC 2014 ATLJJUG CCC 2014 ATL
JJUG CCC 2014 ATL
Recruit Technologies
 
Devopsdays se-2011
Devopsdays se-2011Devopsdays se-2011
Devopsdays se-2011lusis
 
Int. sistemas compu.
Int. sistemas compu.Int. sistemas compu.
Int. sistemas compu.
Al-FreDox' Salaas R
 
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...SlideTeam.net
 
Redes socialesparaempresas - Actualizada
Redes socialesparaempresas - ActualizadaRedes socialesparaempresas - Actualizada
Redes socialesparaempresas - Actualizada
Adriana Alban
 

Viewers also liked (20)

HOPFIELD NETWORK
HOPFIELD NETWORKHOPFIELD NETWORK
HOPFIELD NETWORK
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Emerging Trends in Online Search
Emerging Trends in Online SearchEmerging Trends in Online Search
Emerging Trends in Online Search
 
word2vec - From theory to practice
word2vec - From theory to practiceword2vec - From theory to practice
word2vec - From theory to practice
 
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
Аналіз рівнів реалізуємості технічного потенціалу енергозбереження за енергот...
 
12- MMA Forum Argentina 2016 - La televisión conectada - Verne
12- MMA Forum Argentina 2016 - La televisión conectada - Verne12- MMA Forum Argentina 2016 - La televisión conectada - Verne
12- MMA Forum Argentina 2016 - La televisión conectada - Verne
 
Tools and tips for simplifying startup formation.
Tools and tips for simplifying startup formation.Tools and tips for simplifying startup formation.
Tools and tips for simplifying startup formation.
 
Evaluation question 1 notes
Evaluation question 1 notesEvaluation question 1 notes
Evaluation question 1 notes
 
Evaluation; Question 02
Evaluation; Question 02Evaluation; Question 02
Evaluation; Question 02
 
Evaluation Question 1 - 8288
Evaluation Question 1 - 8288Evaluation Question 1 - 8288
Evaluation Question 1 - 8288
 
Tu tecnologia es la mia
Tu tecnologia es la miaTu tecnologia es la mia
Tu tecnologia es la mia
 
Slides from Writing for Wikipedia Event
Slides from Writing for Wikipedia EventSlides from Writing for Wikipedia Event
Slides from Writing for Wikipedia Event
 
ILLICH YNEDT2074
ILLICH YNEDT2074ILLICH YNEDT2074
ILLICH YNEDT2074
 
Почему я программирую на Perl‎
Почему я программирую на Perl‎Почему я программирую на Perl‎
Почему я программирую на Perl‎
 
Researching Genre
Researching GenreResearching Genre
Researching Genre
 
JJUG CCC 2014 ATL
JJUG CCC 2014 ATLJJUG CCC 2014 ATL
JJUG CCC 2014 ATL
 
Devopsdays se-2011
Devopsdays se-2011Devopsdays se-2011
Devopsdays se-2011
 
Int. sistemas compu.
Int. sistemas compu.Int. sistemas compu.
Int. sistemas compu.
 
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...
3 d pie chart circular puzzle with hole in center pieces 8 stages style 1 pow...
 
Redes socialesparaempresas - Actualizada
Redes socialesparaempresas - ActualizadaRedes socialesparaempresas - Actualizada
Redes socialesparaempresas - Actualizada
 

Similar to Tomáš Mikolov - Distributed Representations for NLP

Bridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full versionBridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full version
Liad Magen
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
indico data
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
Traian Rebedea
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
Paris Open Source Summit
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Matthew Lease
 
Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
LaraOlmosCamarena
 
AINL 2016: Nikolenko
AINL 2016: NikolenkoAINL 2016: Nikolenko
AINL 2016: Nikolenko
Lidia Pivovarova
 
ICS1020 NLP 2020
ICS1020 NLP 2020ICS1020 NLP 2020
ICS1020 NLP 2020
Vanessa Camilleri
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
Roelof Pieters
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
Young Seok Kim
 
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
linshanleearchive
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
MENGSAYLOEM1
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
Surya Sg
 
Foundation Models in Recommender Systems
Foundation Models in Recommender SystemsFoundation Models in Recommender Systems
Foundation Models in Recommender Systems
Anoop Deoras
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
jcscholtes
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System Review
Nguyen Quang
 
Introducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en AzureIntroducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en Azure
Plain Concepts
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
Roelof Pieters
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
ananth
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ers
Roelof Pieters
 

Similar to Tomáš Mikolov - Distributed Representations for NLP (20)

Bridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full versionBridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full version
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
 
AINL 2016: Nikolenko
AINL 2016: NikolenkoAINL 2016: Nikolenko
AINL 2016: Nikolenko
 
ICS1020 NLP 2020
ICS1020 NLP 2020ICS1020 NLP 2020
ICS1020 NLP 2020
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
 
Foundation Models in Recommender Systems
Foundation Models in Recommender SystemsFoundation Models in Recommender Systems
Foundation Models in Recommender Systems
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
 
Deep Dialog System Review
Deep Dialog System ReviewDeep Dialog System Review
Deep Dialog System Review
 
Introducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en AzureIntroducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en Azure
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ers
 

Tomáš Mikolov - Distributed Representations for NLP

  • 1. Distributed Representations for Natural Language Processing Tomas Mikolov, Facebook ML Prague 2016
  • 2. Structure of this talk • Motivation • Word2vec • Architecture • Evaluation • Examples • Discussion
  • 3. Motivation Representation of text is very important for performance of many real-world applications: search, ads recommendation, ranking, spam filtering, … • Local representations • N-grams • 1-of-N coding • Bag-of-words • Continuous representations • Latent Semantic Analysis • Latent Dirichlet Allocation • Distributed Representations
  • 4. Motivation: example Suppose you want to quickly build a classifier: • Input = keyword, or user query • Output = is user interested in X? (where X can be a service, ad, …) • Toy classifier: is X a capital city? • Getting training examples can be difficult, costly, and time-consuming • With local representations of input (1-of-N), one will need many training examples for decent performance
  • 5. Motivation: example Suppose we have a few training examples: • (Rome, 1) • (Turkey, 0) • (Prague, 1) • (Australia, 0) • … Can we build a good classifier without much effort?
  • 6. Motivation: example Suppose we have a few training examples: • (Rome, 1) • (Turkey, 0) • (Prague, 1) • (Australia, 0) • … Can we build a good classifier without much effort? YES, if we use good pre-trained features.
  • 7. Motivation: example Pre-trained features: to leverage the vast amount of unannotated text data • Local features: • Prague = (0, 1, 0, 0, ..) • Tokyo = (0, 0, 1, 0, ..) • Italy = (1, 0, 0, 0, ..) • Distributed features: • Prague = (0.2, 0.4, 0.1, ..) • Tokyo = (0.2, 0.4, 0.3, ..) • Italy = (0.5, 0.8, 0.2, ..)
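The contrast between local and distributed features can be made concrete with a small sketch. It uses only the components printed on the slide (the remaining ones are elided there, so the numbers are illustrative, not real trained vectors), and the choice of Euclidean distance is an assumption made for this example.

```python
import math

# Only the components shown on the slide; the elided ones ("..") are left out.
local = {"Prague": (0, 1, 0, 0), "Tokyo": (0, 0, 1, 0), "Italy": (1, 0, 0, 0)}
distributed = {
    "Prague": (0.2, 0.4, 0.1),
    "Tokyo": (0.2, 0.4, 0.3),
    "Italy": (0.5, 0.8, 0.2),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


for name, feats in (("local", local), ("distributed", distributed)):
    d_city = euclidean(feats["Prague"], feats["Tokyo"])
    d_country = euclidean(feats["Prague"], feats["Italy"])
    print(f"{name}: Prague-Tokyo={d_city:.2f}, Prague-Italy={d_country:.2f}")
```

With 1-of-N coding every pair of distinct words is equally far apart, so a classifier cannot generalize from Prague to Tokyo. With the distributed values from the slide, Prague lies closer to Tokyo than to Italy.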
  • 8. Distributed representations • We hope to learn such representations so that Prague, Rome, Berlin, Paris etc. will be close to each other • We do not want just to cluster words: we seek representations that can capture multiple degrees of similarity: Prague is similar to Berlin in some way, and to Czech Republic in another way • Can this even be done without manually created databases like WordNet / Knowledge graphs?
  • 9. Word2vec • Simple neural nets can be used to obtain distributed representations of words (Hinton et al., 1986; Elman, 1991; …) • The resulting representations have interesting structure – vectors can be obtained using a shallow network (Mikolov, 2007)
  • 10. Word2vec • Deep learning for NLP (Collobert & Weston, 2008): let’s use deep neural networks! It works great! • Back to shallow nets: Word2vec toolkit (Mikolov et al., 2013) -> much more efficient than deep networks for this task
  • 11. Word2vec Two basic architectures: • Skip-gram • CBOW Two training objectives: • Hierarchical softmax • Negative sampling Plus a bunch of tricks: weighting of distant words, down-sampling of frequent words
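One of the tricks named above, down-sampling of frequent words, can be sketched as follows. The sketch assumes the discard probability 1 - sqrt(t / f(w)) given in Mikolov, Sutskever, Chen, Corrado, Dean (2013), where f(w) is the relative frequency of word w and t is a threshold around 1e-5. The function names are hypothetical, and the released toolkit may differ in detail.

```python
import math
import random
from collections import Counter


def discard_probability(freq, threshold=1e-5):
    """Probability of dropping one occurrence of a word with relative frequency freq."""
    if freq <= threshold:
        return 0.0  # rare words are always kept
    return 1.0 - math.sqrt(threshold / freq)


def subsample(tokens, threshold=1e-5, rng=random):
    """Randomly drop occurrences of frequent words from a token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [
        w for w in tokens
        if rng.random() >= discard_probability(counts[w] / total, threshold)
    ]
```

Under this formula a word that makes up 0.1% of the corpus is dropped about 90% of the time at t = 1e-5, while words at or below the threshold frequency are never dropped.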
  • 12. Skip-gram Architecture • Predicts the surrounding words given the current word
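As a minimal sketch of what "predicting the surrounding words given the current word" means for the training data, the function below enumerates (current word, surrounding word) pairs from a token list. The function name and the fixed symmetric window are assumptions for illustration; the actual toolkit also applies the tricks listed on slide 11.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every position in tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # the center word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


for center, context in skipgram_pairs(["prague", "is", "a", "capital"], window=1):
    print(center, "->", context)
```

Each pair is one training example: the model is given the first word as input and is trained to assign high probability to the second.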
  • 13. Continuous Bag-of-words Architecture • Predicts the current word given the context
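The CBOW training data can be sketched the same way, with the direction reversed: the whole context is the input and the current word is the target. The function name and the fixed symmetric window are again assumptions for illustration.

```python
def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples for every position in tokens."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:  # skip positions with no neighbours at all
            examples.append((context, target))
    return examples


for context, target in cbow_examples(["prague", "is", "a", "capital"], window=1):
    print(context, "->", target)
```

In the model itself the context word vectors are combined into a single vector before the prediction, which is why word order inside the window does not matter (hence "bag-of-words").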
  • 14. Word2vec: Linguistic Regularities • After training is finished, the weight matrix between the input and hidden layers represents the word feature vectors • The word vector space implicitly encodes many regularities among words:
  • 15. Linguistic Regularities in Word Vector Space • The resulting distributed representations of words contain a surprising amount of syntactic and semantic information • There are multiple degrees of similarity among words: • KING is similar to QUEEN as MAN is similar to WOMAN • KING is similar to KINGS as MAN is similar to MEN • Simple vector operations with the word vectors provide very intuitive results (King – man + woman ~= Queen)
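The vector operation "King – man + woman ~= Queen" can be sketched on hand-made toy vectors. The two-dimensional vectors below (roughly "royalty" and "gender") are invented for illustration and are not trained word2vec vectors. Ranking candidates by cosine similarity and excluding the three query words are assumptions of this sketch, following common practice.

```python
import math

# Hypothetical 2-d toy vectors: (royalty, gender). Not real embeddings.
vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "apple": (-1.0, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    target = tuple(vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "king", "woman"))
```

On these toy vectors the offset king - man + woman lands exactly on the vector for queen, so that word is returned; with trained vectors the match is only approximate, which is why the nearest neighbour is searched for.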
  • 16. Linguistic Regularities - Evaluation • Regularity of the learned word vector space was evaluated using a test set with about 20K analogy questions • The test set contains both syntactic and semantic questions • Comparison to previous state of the art (pre-2013)
  • 20. Summary and discussion • Word2vec: much faster and far more accurate than previous neural-net-based solutions; the training speed-up over the prior state of the art is more than 10 000 times (literally from weeks to seconds) • Features derived from word2vec are now used across all big IT companies in plenty of applications (search, ads, ..) • Also very popular in the research community: a simple way to boost performance in many NLP tasks • Main reasons for success: very fast, open-source, and the resulting features are easy to use to boost many applications (even non-NLP)
  • 21. Follow up work Baroni, Dinu, Kruszewski (2014): Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors • Turns out neural-based approaches are very close to traditional distributional semantics models • Luckily, word2vec significantly outperformed the best previous models across many tasks
  • 22. Follow up work Pennington, Socher, Manning (2014): Glove: Global Vectors for Word Representation • Word2vec version from Stanford: almost identical, but a new name • In some sense a step back: word2vec counts co-occurrences and does dimensionality reduction together, Glove is a two-pass algorithm
  • 23. Follow up work Levy, Goldberg, Dagan (2015): Improving distributional similarity with lessons learned from word embeddings • Hyper-parameter tuning is important: debunks the claims of superiority of Glove • Compares models trained on the same data (unlike Glove…): word2vec is faster, its vectors are better, and it consumes much less memory • Many others reached similar conclusions (Radim Rehurek, …)
  • 24. Final notes • Word2vec is successful because it is simple, but it cannot be applied everywhere • For modeling sequences of words, consider recurrent networks • Do not sum word vectors to obtain representations of sentences; it will not work well • Be careful about the hype, as always … the most cited papers often contain non-reproducible results
  • 25. References • Mikolov (2007): Language Modeling for Speech Recognition in Czech • Collobert, Weston (2008): A unified architecture for natural language processing: Deep neural networks with multitask learning • Mikolov, Karafiat, Burget, Cernocky, Khudanpur (2010): Recurrent neural network based language model • Mikolov (2012): Statistical Language Models Based on Neural Networks • Mikolov, Yih, Zweig (2013): Linguistic Regularities in Continuous Space Word Representations • Mikolov, Chen, Corrado, Dean (2013): Efficient estimation of word representations in vector space • Mikolov, Sutskever, Chen, Corrado, Dean (2013): Distributed representations of words and phrases and their compositionality • Baroni, Dinu, Kruszewski (2014): Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors • Pennington, Socher, Manning (2014): Glove: Global Vectors for Word Representation • Levy, Goldberg, Dagan (2015): Improving distributional similarity with lessons learned from word embeddings