Document Informed Neural Autoregressive Topic Models with Distributional Prior
Authors: Pankaj Gupta¹·², Yatin Chaudhary², Florian Büttner² and Hinrich Schütze¹
Presenter: Pankaj Gupta @ AAAI-19, Honolulu, Hawaii, USA, Jan 2019
¹CIS, University of Munich (LMU) | ²Machine Intelligence, Siemens AG | Jan 2019
Outline

Motivation
→ Context awareness in topic models
→ Distributional semantics in neural topic models
→ Baseline: DocNADE

Proposed Models
→ DocNADE + context awareness, i.e., iDocNADE
→ DocNADE + word embedding priors, i.e., DocNADEe

Evaluation
→ Generalization, topic coherence, text retrieval and classification
This Work: CONTRIBUTIONS

Incorporate context information around words
→ determining the actual meaning of ambiguous words
→ improving word and document representations

Improve topic modeling for short-text and long-text documents

Incorporate external knowledge for each word
→ using distributional semantics, i.e., word embeddings
→ improving document representations and topics
Motivation1: Need for Full Contextual Information

Source Text                                                                   | Sense/Topic
In biological brains, we study noisy neurons at cellular level                | → "biological neural network"
Like biological brains, study of noisy neurons in artificial neural networks | → "artificial neural network"

Preceding Context                       + Following Context              | Sense/Topic of "neurons"
Like biological brains, study of noisy  + in artificial neural networks  | → "biological neural network" (preceding context only)
Like biological brains, study of noisy  + in artificial neural networks  | → "artificial neural network" (full context)

Context information around words helps determine their actual meaning!
Motivation2: Need for Distributional Semantics or Prior Knowledge

➢ "Lack of context" in short-text documents, e.g., headlines, tweets, etc.
➢ "Lack of context" in a corpus of few documents

Small number of word co-occurrences → lack of context
→ difficult to learn good representations → generates incoherent topics

Example topics for 'trading':
Topic1: price, wall, china, fall, shares        (incoherent)
Topic2: shares, price, profits, rises, earnings (coherent)
[Figure: two documents, both of topic class 'trading', with no word overlap (e.g., in 1-hot encoding) yet the same topic class]
TO THE RESCUE: use external/additional information, e.g., WORD EMBEDDINGS
(encoding semantic and syntactic relatedness of words in a vector space)
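A tiny illustration of this point (the vectors below are made-up stand-ins, not real pre-trained embeddings): two related words never overlap in 1-hot space, yet can lie close in embedding space.

```python
import numpy as np

vocab = {"profits": 0, "earnings": 1}
one_hot = np.eye(len(vocab))
print(one_hot[vocab["profits"]] @ one_hot[vocab["earnings"]])   # 0.0 -> no word overlap

emb = {"profits":  np.array([0.90, 0.10, 0.30]),    # in practice: GloVe/word2vec vectors
       "earnings": np.array([0.80, 0.20, 0.35])}
cos = emb["profits"] @ emb["earnings"] / (
      np.linalg.norm(emb["profits"]) * np.linalg.norm(emb["earnings"]))
print(round(cos, 2))   # ~0.99 -> high similarity encodes semantic relatedness
```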
Baseline: Neural Autoregressive Topic Model (DocNADE)

A probabilistic graphical model inspired by the RBM, RSM and NADE models
➢ learns topics over sequences of words, v
➢ learns distributed word representations based on word co-occurrences
➢ follows the encoder-decoder principle of unsupervised learning
➢ computes the joint distribution (log-likelihood) of a document v in language-modeling fashion via autoregressive conditionals, i.e., predicts each word v_i given the sequence of preceding words v_{<i}:

  p(v) = \prod_{i=1}^{D} p(v_i \mid v_{<i})

Each autoregressive conditional, e.g., p(v_3 \mid v_{<3}), is computed via a feed-forward neural network:

  Encoding (embedding aggregation):  h_i(v_{<i}) = g\Big(c + \sum_{k<i} W_{:,v_k}\Big)

  Decoding:  p(v_i = w \mid v_{<i}) = \frac{\exp(b_w + U_{w,:}\, h_i(v_{<i}))}{\sum_{w'} \exp(b_{w'} + U_{w',:}\, h_i(v_{<i}))}

where W is the topic matrix (its columns act as learned word embeddings that are aggregated in the encoding), U the decoding matrix, and b, c bias vectors.
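To make the encoding/decoding steps concrete, here is a minimal NumPy sketch of the autoregressive conditionals: a toy re-implementation under simplifying assumptions, not the authors' released code (in particular, DocNADE uses a tree-structured softmax for efficiency; a full softmax is used here for clarity, and all sizes and names are illustrative).

```python
import numpy as np

V, H = 1000, 50                          # vocabulary size, hidden/topic dimension
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((H, V))   # topic matrix; column W[:, w] = embedding of word w
U = 0.01 * rng.standard_normal((V, H))   # decoding matrix
b, c = np.zeros(V), np.zeros(H)          # output and hidden biases

def log_softmax(x):
    m = x.max()
    return x - m - np.log(np.exp(x - m).sum())

def docnade_log_likelihood(doc):
    """log p(v) = sum_i log p(v_i | v_<i) for a document given as word indices."""
    ll, acc = 0.0, np.zeros(H)                        # acc = sum_{k<i} W[:, v_k]
    for v_i in doc:
        h = 1.0 / (1.0 + np.exp(-(c + acc)))          # encoding: h_i = sigmoid(c + acc)
        ll += log_softmax(b + U @ h)[v_i]             # decoding: log p(v_i = w | v_<i)
        acc += W[:, v_i]                              # add v_i to the history for step i+1
    return ll

print(docnade_log_likelihood([3, 17, 256, 3]))        # toy document of word indices
```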
DocNADE Limitations
➢ does not take into account the following words v_{>i} in the sequence
➢ poor at modeling short-text documents due to limited context
➢ does not use pre-trained word embeddings (or external knowledge)
Proposed Neural Architectures, extending DocNADE to
→ incorporate full context information
→ incorporate pre-trained word embeddings
Proposed Variant1: Contextualized DocNADE (iDocNADE)

➢ incorporates full contextual information around each word in a document (preceding and following words)
➢ boosts the likelihood of each word and, subsequently, the document likelihood
➢ improved representation learning
DocNADE vs. iDocNADE:
DocNADE → incomplete context around words (preceding words only)
iDocNADE → full context around words (preceding and following words)
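A minimal sketch of the bidirectional extension, reusing the toy parameters from the DocNADE sketch above (parameter sharing across directions is a simplifying assumption here): forward and backward log-likelihoods are averaged, so each word is conditioned on both v_{<i} and v_{>i}.

```python
def idocnade_log_likelihood(doc):
    """Average of forward log p(v_i | v_<i) and backward log p(v_i | v_>i)."""
    def directional_ll(words):
        ll, acc = 0.0, np.zeros(H)
        for v_i in words:
            h = 1.0 / (1.0 + np.exp(-(c + acc)))
            ll += log_softmax(b + U @ h)[v_i]
            acc += W[:, v_i]
        return ll
    # the backward pass scans the document in reverse, so each word is
    # conditioned on the words that follow it in the original order
    return 0.5 * (directional_ll(doc) + directional_ll(doc[::-1]))

print(idocnade_log_likelihood([3, 17, 256, 3]))
```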
Proposed Variant2: DocNADE + Embedding Priors 'e' (DocNADEe)

➢ introduces weighted aggregation of pre-trained word embeddings at each autoregressive step k
➢ E: pre-trained embedding matrix used as a fixed prior
➢ generates topics enriched with embedding information
➢ learns a complementary textual representation

Encoding with the embedding prior (λ is the mixture weight):

  h_i(v_{<i}) = g\Big(c + \sum_{k<i} W_{:,v_k} + \lambda \sum_{k<i} E_{:,v_k}\Big)
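A sketch of the prior-augmented encoding step, continuing the toy setup above; E here is a random stand-in for a fixed pre-trained embedding matrix (its dimension is assumed to match the hidden size) and lam is an illustrative mixture weight.

```python
E = 0.1 * rng.standard_normal((H, V))   # stand-in for fixed pre-trained embeddings (not updated)
lam = 1.0                               # mixture weight for the embedding prior

def docnade_e_hidden(history):
    """h_i = sigmoid(c + sum_{k<i} W[:, v_k] + lam * sum_{k<i} E[:, v_k])."""
    acc_w = W[:, history].sum(axis=1) if history else np.zeros(H)
    acc_e = E[:, history].sum(axis=1) if history else np.zeros(H)
    return 1.0 / (1.0 + np.exp(-(c + acc_w + lam * acc_e)))

print(docnade_e_hidden([3, 17]).shape)   # (50,) hidden/topic vector
```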
Proposed Variant3: iDocNADE + Embedding Priors 'e' (iDocNADEe)

Variant1 (iDocNADE) + Variant2 (DocNADEe) = Variant3 (iDocNADEe)
Evaluation: Datasets, Statistics and Properties

➢ 8 short-text and 7 long-text datasets
➢ short-text → fewer than 25 words per document
➢ includes a corpus of few documents (e.g., 20NSsmall)
➢ topics: 50 and 200 (hidden layer size)
➢ quantitatively evaluated using:
   - generalization (perplexity, PPL)
   - interpretability (topic coherence)
   - text/information retrieval (IR)
   - text classification
Evaluation: Generalization via Perplexity (PPL)

Generalization (PPL) → the lower, the better → on short-text datasets
Gain (%): 4.1%, 4.3%, 5.1%
Generalization (PPL) → the lower, the better → on long-text datasets
Gain (%): 5.3%, 4.8%, 5.5%
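For reference, the perplexity reported here is the standard per-word measure derived from document log-likelihoods; a sketch, assuming natural-log likelihoods as in the toy functions above:

```python
def perplexity(docs, log_likelihood_fn):
    """PPL = exp(-(1/N) * sum_d log p(v_d) / |v_d|): averaged per-word negative log-likelihood."""
    per_word = [log_likelihood_fn(d) / len(d) for d in docs]
    return float(np.exp(-np.mean(per_word)))

print(perplexity([[3, 17, 256], [42, 7]], docnade_log_likelihood))
```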
Evaluation: Applicability (Information Retrieval)

IR-precision on short-text datasets
→ precision at retrieval fraction 0.02
→ the higher, the better
Gain (%): 5.6%, 7.4%, 11.1%
IR-precision on long-text datasets
→ precision at retrieval fraction 0.02
→ the higher, the better
Gain (%): 7.1%, 7.1%, 7.1%
[Figure: precision vs. retrieval fraction curves on the TMNtitle and AGnewstitle datasets]
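A sketch of the retrieval setup under common assumptions (the paper's exact protocol may differ in details): each test document's hidden vector queries the training set by cosine similarity, and precision at fraction 0.02 is the label-match rate among the top 2% of retrieved documents, averaged over queries.

```python
def ir_precision(query_vecs, query_labels, db_vecs, db_labels, fraction=0.02):
    """Mean precision over queries at a given retrieval fraction."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    k = max(1, int(fraction * len(db_vecs)))          # documents retrieved per query
    sims = q @ d.T                                    # cosine similarities
    hits = [np.mean(db_labels[np.argsort(-s)[:k]] == ql)
            for s, ql in zip(sims, query_labels)]
    return float(np.mean(hits))

# toy usage with random vectors and labels
qv, dv = rng.standard_normal((5, H)), rng.standard_normal((200, H))
print(ir_precision(qv, rng.integers(0, 4, 5), dv, rng.integers(0, 4, 200)))
```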
Evaluation: Interpretability (Topic Coherence)

→ assesses the meaningfulness of the captured topics
→ coherence measure proposed by Röder, Both, and Hinneburg (2015)
→ higher scores imply more coherent topics
→ reported on both short-text and long-text datasets
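The Röder, Both, and Hinneburg (2015) measure corresponds to gensim's 'c_v' coherence; a minimal sketch of scoring one topic's top words against a toy, placeholder tokenized corpus:

```python
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel

texts = [["shares", "price", "profits", "rise"],          # placeholder tokenized documents
         ["price", "rises", "earnings", "profits"],
         ["shares", "earnings", "profits", "price"]]
topics = [["shares", "price", "profits", "rises", "earnings"]]   # top words of one topic

dictionary = Dictionary(texts)
cm = CoherenceModel(topics=topics, texts=texts, dictionary=dictionary, coherence="c_v")
print(cm.get_coherence())   # higher -> more coherent topic
```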
Evaluation: Qualitative Topics (e.g., 'religion')

[Table: example topics showing coherent topic words]
Conclusion: Take Away

➢ Leveraging full contextual information in a neural autoregressive topic model
➢ Introducing distributional priors via pre-trained word embeddings
➢ On average over 15 datasets: gains of
   5.2% (404 vs 426) in perplexity,
   2.8% (.74 vs .72) in topic coherence,
   11.1% (.60 vs .54) in precision at retrieval fraction 0.02,
   and 5.2% (.664 vs .631) in F1 for text categorization
➢ Learning better word/document representations for short and long texts
➢ State-of-the-art topic models unified with textual representation learning

Tryout: the code and data are available at https://github.com/pgcool/iDocNADEe

Thanks!

"textTOvec": latest work, to appear at ICLR19 | a neural topic model with language structures
  • 57. Intern © Siemens AG 2017 May 2017Seite 57 Corporate Technology Conclusion: Take Away ➢ Leveraging full contextual information in neural autoregressive topic model ➢ Introducing distributional priors via pre-trained word embeddings ➢ Gain of 5.2% (404 vs 426) in perplexity, 2.8% (.74 vs .72) in topic coherence, 11.1% (.60 vs .54) in precision at retrieval fraction 0.02, 5.2% (.664 vs .631) in F1 for text categorization on avg over 15 datasets ➢ Learning better word/document representation for short/long texts ➢ State-of-the-art topic models unified with textual representation learning Tryout: The code and data are available at https://github.com/pgcool/iDocNADEe Thanks !! “textTOvec”: Latest work to appear in ICLR19 | A Neural Topic Model with Language Structures