NS-CUK Seminar: H.E.Lee, Review on "PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks", KDD 2015
Hyo Eun Lee
Network Science Lab
Dept. of Biotechnology
The Catholic University of Korea
E-mail: gydnsml@gmail.com
2023.08.14
1. Problem definition
2. Predictive text embedding
• Bipartite Network Embedding
• Heterogeneous Text Network Embedding
• Text Embedding
3. Experiments
4. Discussion and conclusion
1. Problem definition
Definition
• Definition 1. (Word-Word Network) $G_{ww} = (V, E_{ww})$
• Captures word co-occurrence information in local contexts of the unlabeled data
• This is the information exploited by traditional word embedding approaches such as the skip-gram
• Definition 2. (Word-Document Network) $G_{wd} = (V \cup D, E_{wd})$
• Captures the connections between words and the documents of a corpus; the weight of the edge between word $v_i$ and document $d_j$ is the term frequency of $v_i$ in $d_j$
• Definition 3. (Word-Label Network) $G_{wl} = (V \cup L, E_{wl})$
• Captures category-level word co-occurrences; the weight of the edge between word $v_i$ and label $c_j$ is $w_{ij} = \sum_{d:\, l_d = j} n_{di}$, where $n_{di}$ is the term frequency of $v_i$ in document $d$ and $l_d$ is the label of $d$ (a construction sketch of all three networks follows below)
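To make these definitions concrete, here is a minimal sketch (not the paper's code; the corpus, window size, and variable names are illustrative) that accumulates the edge weights of the three networks from a small labeled corpus:

from collections import Counter

# Toy labeled corpus: (tokens, label); None marks an unlabeled document.
docs = [
    (["deep", "learning", "for", "text"], "cs"),
    (["stock", "market", "text", "mining"], "finance"),
    (["deep", "market", "models"], None),
]

window = 2  # local context window for word-word co-occurrences
G_ww, G_wd, G_wl = Counter(), Counter(), Counter()

for d, (tokens, label) in enumerate(docs):
    for i, w in enumerate(tokens):
        # E_ww: co-occurrence counts of word pairs within the context window
        for c in tokens[max(0, i - window):i]:
            G_ww[(w, c)] += 1
            G_ww[(c, w)] += 1
        # E_wd: term frequency n_di of word w in document d
        G_wd[(w, d)] += 1
        # E_wl: term frequency aggregated over all documents carrying the label
        if label is not None:
            G_wl[(w, label)] += 1

print(G_wl[("text", "cs")])  # -> 1

Note that only G_wl requires labels; G_ww and G_wd can be built from the unlabeled portion of the corpus as well.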
1. Problem definition
Definition
• Definition 4. (Heterogeneous Text Network) The combination of the word-word, word-document, and word-label networks defined above
• Captures word co-occurrences at multiple levels and integrates both labeled and unlabeled data
• Definition 5. (Predictive Text Embedding) Learn low-dimensional word representations by embedding the heterogeneous text network into a low-dimensional vector space
• The resulting embeddings are optimized for a specific prediction task (here, text classification) rather than being generic
2. Predictive text embedding
Bipartite Network Embedding
• The LINE model was introduced for embedding large-scale information networks, but it targets homogeneous networks; in a heterogeneous network, the weights of different types of edges are not directly comparable
• The authors therefore adapt LINE's second-order proximity between vertices to each bipartite network
• $G = (V_A \cup V_B, E)$, where $V_A$ and $V_B$ are two disjoint vertex sets and $E$ is the set of edges between them
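For such a bipartite network, the paper defines the conditional probability of vertex $v_i \in V_A$ given $v_j \in V_B$ as a softmax over embedding inner products, and minimizes the weighted negative log-likelihood:

p(v_i \mid v_j) = \frac{\exp(\vec{u}_i^{\top} \vec{u}_j)}{\sum_{i' \in A} \exp(\vec{u}_{i'}^{\top} \vec{u}_j)}, \qquad O = -\sum_{(i,j) \in E} w_{ij} \log p(v_i \mid v_j)

Minimizing $O$ places vertices with similar neighborhoods (second-order proximity) close together in the embedding space.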
2. Predictive text embedding
Bipartite Network Embedding
• The objective function is optimized with stochastic gradient descent (SGD)
• Edge sampling and negative sampling techniques make the optimization efficient
• At each step, edge sampling draws an edge as a binary edge with probability proportional to its weight, and negative sampling draws negative vertices from a noise distribution $P_n$
• With the objective defined for a single bipartite network, the objective for the whole heterogeneous text network can then be defined by combining the bipartite objectives (a sketch of one update step follows below)
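A minimal sketch of one such update, assuming numpy embedding matrices; the function and argument names are mine, and the paper's implementation uses the alias method for O(1) weighted sampling rather than the direct draw shown here:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(U, C, edges, edge_probs, noise_probs, k=5, lr=0.025):
    # U: embeddings of the word side; C: embeddings of the other vertex set
    # edges: list of (i, j) index pairs; edge_probs: edge weights normalized to sum to 1
    # noise_probs: noise distribution over C's vertices (e.g. degree ** 0.75, normalized)
    i, j = edges[rng.choice(len(edges), p=edge_probs)]
    # one positive target plus k negative samples drawn from the noise distribution
    targets = [(j, 1.0)] + [(rng.choice(len(C), p=noise_probs), 0.0) for _ in range(k)]
    for c, label in targets:
        score = sigmoid(U[i] @ C[c])
        g = lr * (label - score)
        u_old = U[i].copy()
        U[i] += g * C[c]   # update the word vertex
        C[c] += g * u_old  # update the (positive or negative) context vertex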
2. Predictive text embedding
Heterogeneous Text Network Embedding
• The heterogeneous text network consists of three bipartite networks (word-word, word-document, word-label) that share the same set of word vertices
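Concretely, the joint objective sums the three bipartite objectives of the form above, coupled through the shared word embeddings:

O_{pte} = O_{ww} + O_{wd} + O_{wl}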
2. Predictive text embedding
Heterogeneous Text Network Embedding
• Two training strategies: train jointly on the unlabeled and labeled networks, or pre-train on the unlabeled data and then fine-tune with the labeled word-label network (a joint-training sketch follows below)
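Under the same assumptions as the sketch above, the joint scheme alternately samples edges from the three bipartite networks in each iteration, so the shared word embeddings U receive updates from all of them (reusing the illustrative sgd_step defined earlier):

def joint_train(U, networks, steps=1_000_000):
    # networks: list of (C, edges, edge_probs, noise_probs) tuples for
    # G_ww, G_wd, and G_wl; U holds the shared word embeddings
    for _ in range(steps):
        for C, edges, edge_probs, noise_probs in networks:
            sgd_step(U, C, edges, edge_probs, noise_probs)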
2. Predictive text embedding
Text Embedding
• Once the word vectors are trained, the representation of an arbitrary piece of text is obtained by averaging the embeddings of the words it contains
• Equivalently, the text embedding is learned by minimizing a loss function, specified as the Euclidean distance between the text embedding and its word embeddings, with a gradient descent algorithm
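In symbols, for a piece of text $d = w_1 w_2 \cdots w_n$, the average of the word embeddings is exactly the minimizer once the loss is taken as the squared Euclidean distance (the squaring is my reading; with it the mean is the closed-form solution, so no iterative optimization is needed when all words are known):

\vec{d} = \frac{1}{n} \sum_{i=1}^{n} \vec{u}_{w_i} = \arg\min_{\vec{z}} \sum_{i=1}^{n} \left\| \vec{z} - \vec{u}_{w_i} \right\|^{2}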
4. Discussion and conclusion
Discussion and conclusion
• Unsupervised learning uses either local context-level or document-level word co-occurrences, with
document-level co-occurrences being more useful for long documents and local context-level
being more useful for short documents.
• PTE trains jointly on both labeled and unlabeled data, and outperforms CNNs when more labeled data is available
• PTE leaves room for improvement, e.g., taking the order of words into account