Document Informed Neural Autoregressive Topic Models
with Distributional Prior
Informed Neural Topic Model with Pre-trained Word Embeddings
Pankaj Gupta¹,², Yatin Chaudhary², Florian Buettner² & Hinrich Schütze¹
¹ CIS, University of Munich (LMU), Germany
² Corporate Technology, Machine-Intelligence, Siemens AG Munich, Germany
pankaj.gupta@campus.lmu.de | pankaj.gupta@siemens.com
Introduction
A novel Neural Autoregressive Topic Model for short and long texts, empowered by:
• Context-awareness in learning better representations
• Distributional semantics, i.e., word embeddings as prior knowledge
Problem Statement / Motivation
1. “Need for context-awareness in representation learning”?
• To determine the actual meaning of ambiguous words
• To improve word and document representations
Figure 1: Need for context-awareness in learning representations
2. “Need for prior knowledge in limited-context settings”?
• Fewer (↓) word co-occurrences in short texts (e.g., headlines, tweets) or small corpora
• Difficult to learn good representations → incoherent topics are generated
Figure 2: (left): Word embedding similarity (right): Topic examples
Figure 3: Contributions in this work
Evaluation and Analysis
• 8 short-text and 7 long-text datasets from news, Q&A, sentiment and industrial domains
• Generalization (perplexity), interpretability (topic coherence), text retrieval (IR) and classification
Table 1: Perplexity (PPL) and IR-precision (at fraction 0.02) scores for short and long texts
Table 2: IR-precision at different retrieval fractions
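The two quantitative measures named in Tables 1 and 2 can be illustrated with a short sketch. It is not the authors' evaluation code. It assumes perplexity is the exponential of the negative per-word log-likelihood averaged over documents, and that IR-precision at a fraction means: rank training documents by cosine similarity to each query document, keep the top fraction, and count how many share the query's label. The function names and the toy data in the usage note are hypothetical.

```python
import math


def perplexity(log_likelihoods, doc_lengths):
    """exp of the negative per-word log-likelihood, averaged over documents."""
    per_word = [ll / n for ll, n in zip(log_likelihoods, doc_lengths)]
    return math.exp(-sum(per_word) / len(per_word))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ir_precision(query_vecs, query_labels, train_vecs, train_labels, fraction):
    """Mean precision over queries when retrieving the top `fraction` of training docs."""
    k = max(1, int(round(fraction * len(train_vecs))))
    precisions = []
    for q_vec, q_label in zip(query_vecs, query_labels):
        # Rank training documents from most to least similar to the query.
        ranked = sorted(
            range(len(train_vecs)),
            key=lambda i: cosine(q_vec, train_vecs[i]),
            reverse=True,
        )
        hits = sum(1 for i in ranked[:k] if train_labels[i] == q_label)
        precisions.append(hits / k)
    return sum(precisions) / len(precisions)
```

With four training vectors, two per class, and one query that points at the first class, `ir_precision(..., fraction=0.5)` retrieves two documents; widening the fraction to 1.0 retrieves all four, so precision can only stay equal or drop as off-class documents enter the list.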
Methodology: Document Neural Autoregressive Topic Models
Figure 4: (left): DocNADE[1] (baseline) (right): iDocNADE (DocNADE + context-awareness)
Figure 5: (left): DocNADE[1] (baseline) (right): DocNADEe (DocNADE + Word Embeddings)
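The models in Figures 4 and 5 can be sketched in plain Python. This is a minimal sketch, not the authors' implementation. It assumes the DocNADE formulation of [1], where the hidden state for position i is sigmoid(c + sum of the columns of W for the preceding words) and the next word is predicted with a full softmax. Treating context-awareness as the average of a forward and a backward pass that share W, and treating the embedding prior as an extra term weighted by `lam` from a fixed matrix `E`, are assumptions about how iDocNADE and DocNADEe extend the baseline. All function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def docnade_log_likelihood(doc, W, U, b, c, E=None, lam=0.0):
    """log p(v) = sum_i log p(v_i | v_<i) for one document (a list of word ids).

    W: H x V word matrix, U: V x H output matrix, b: V biases, c: H biases.
    E (assumed DocNADEe prior): optional fixed H x V embedding matrix, weighted by lam.
    """
    hidden, vocab = len(c), len(b)
    pre = list(c)  # running pre-activation: c plus columns of all preceding words
    total = 0.0
    for word in doc:
        h = [sigmoid(a) for a in pre]
        scores = [
            b[w] + sum(U[w][j] * h[j] for j in range(hidden)) for w in range(vocab)
        ]
        total += log_softmax(scores)[word]
        # Add the current word to the context used for the next position.
        for j in range(hidden):
            pre[j] += W[j][word]
            if E is not None:
                pre[j] += lam * E[j][word]
    return total


def idocnade_log_likelihood(doc, W, U_fwd, U_bwd, b_fwd, b_bwd, c_fwd, c_bwd):
    """Assumed iDocNADE objective: mean of forward and backward log-likelihoods."""
    forward = docnade_log_likelihood(doc, W, U_fwd, b_fwd, c_fwd)
    # Running the same recurrence on the reversed document conditions each word
    # on the words that follow it.
    backward = docnade_log_likelihood(doc[::-1], W, U_bwd, b_bwd, c_bwd)
    return 0.5 * (forward + backward)
```

Because every conditional is a normalized softmax, the probabilities of all documents of a fixed length sum to one, which is a quick sanity check for any reimplementation. The running `pre` vector avoids recomputing the context sum at every position.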
Table 3: (left): Topic coherence with the top 10 and 20 words (right): Qualitative example
Table 4: (left): Text classification (F1 and Accuracy) scores for short texts
Conclusion & Key Takeaways
• Leverage full context + pre-trained word embeddings in a neural autoregressive topic model
• Gain of 5.2% (404 vs 426) in PPL, 2.8% (.74 vs .72) in coherence, 11.1% (.60 vs .54) in IR-precision, and 5.2% (.664 vs .631) in F1 for text categorization, on average over 15 datasets
• Demonstrate learning of better word/document representations for short and long texts
• Try it out: code available at https://github.com/pgcool/iDocNADEe
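The reported percentage gains can be checked with simple arithmetic. The sketch below assumes that the second number in each "x vs y" pair is the baseline and that each gain is the improvement relative to that baseline; under this reading the four figures above are reproduced after rounding to one decimal place. The helper name `relative_gain` is hypothetical.

```python
def relative_gain(new, baseline, lower_is_better=False):
    """Improvement over the baseline, as a percentage of the baseline value."""
    diff = (baseline - new) if lower_is_better else (new - baseline)
    return 100.0 * diff / baseline


gains = {
    "PPL": relative_gain(404, 426, lower_is_better=True),  # lower perplexity is better
    "coherence": relative_gain(0.74, 0.72),
    "IR-precision": relative_gain(0.60, 0.54),
    "F1": relative_gain(0.664, 0.631),
}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.1f}%")
```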
Our recent extension of this work: “textTOvec”
Pankaj Gupta, Yatin Chaudhary, Florian Buettner and Hinrich Schütze. textTOvec: Deep Contextualized
Neural Autoregressive Topic Models of Language with Distributed Compositional Prior. To
appear in ICLR 2019. TL;DR: A Neural Topic Model with Language Structures.
References
[1] Hugo Larochelle and Stanislas Lauly. A neural autoregressive topic model. In Advances in Neural Information
Processing Systems 25, pages 2708–2716. Curran Associates, Inc., 2012.
