Ashis Kumar Chanda
PhD Candidate
Understanding Word2Vector
Authors: Tomas Mikolov et al. 2013
Contents
• Problem description
• Motivation
• Proposed Method
• Experiments
• Conclusion
• Criticism
Problem description
• Every word has a meaning
• But how do we learn a new word?
• We can look it up in a dictionary
– That takes time, and a dictionary is not always at hand
• Otherwise, we can guess the meaning of a new word from its context
Her limpid prose made even the most difficult subjects accessible to all.
The surrounding words help us guess the meaning of “limpid”
It could mean “pleasant” or “clear”
Problem description
• How can a machine understand the meaning of a word?
• It can look the word up in a dictionary or word library
– Such a library is difficult to create and maintain
• However, a word can have different meanings
– Neighboring (context) words can suggest which meaning is intended
• The machine should learn word representations by itself
Word embeddings
• There are many methods to find word embeddings
– Frequency based embeddings
– Count vectors
– TF-IDF
– Co-occurrence matrix
– Skip gram model
– CBOW
• We are going to discuss the last two methods
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
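For contrast with the two prediction-based methods, the count-based entries in the list above can be sketched in a few lines. This is a minimal illustration only: the two-sentence corpus, the function names and the window size of 1 are assumptions made for the example, not something taken from the slides.

```python
from collections import Counter

# Hypothetical two-document corpus, chosen only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

def count_vector(doc, vocab):
    """Return how often each vocabulary word occurs in one document."""
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]

def cooccurrence(docs, window=1):
    """Count how often two words appear within `window` positions of each other."""
    pairs = Counter()
    for doc in docs:
        tokens = doc.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs[(word, tokens[j])] += 1
    return pairs

# Sorted vocabulary: cat, dog, mat, on, rug, sat, the
vocab = sorted({w for doc in docs for w in doc.split()})
vectors = [count_vector(doc, vocab) for doc in docs]
cooc = cooccurrence(docs, window=1)
```

Both representations grow with the vocabulary size and are mostly zeros, which is one reason the low-dimensional vectors discussed next are attractive.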
Motivation
• Finding the semantic meaning of words
• Learning a word from its context words
• Representing a word as a low-dimensional vector
• Making it easy to compare two words in vector space
Proposed method
• Representing a word as a vector
• How should we learn these vector values?
• There are two methods
– 1. Continuous Bag of Words (CBOW)
– 2. Skip gram model (SG)
cat = [0.2, 0, 0, 0.7, 0, 0, 0, 0, …, 0]
dog = [0.1, 0.3, 0.9, 0, 0, 0, 0, 0, …, 0]
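Once words are vectors, comparing them becomes simple arithmetic. The sketch below uses cosine similarity, a common choice that the slides do not name explicitly, so treat it as an assumption. It takes only the first eight components shown for "cat" and "dog" and leaves out the elided part of each vector.

```python
import math

# First eight components as shown on the slide; the elided part ("…") is left out.
cat = [0.2, 0, 0, 0.7, 0, 0, 0, 0]
dog = [0.1, 0.3, 0.9, 0, 0, 0, 0, 0]

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

similarity = cosine_similarity(cat, dog)
```

A value near 1 means the vectors point in almost the same direction, and a value near 0 means they share almost no features. These toy numbers overlap in a single component, so their similarity is low.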
Proposed method
• CBOW: use the words around a position, within a fixed-length window, to predict the middle word
• SG: use a word to predict the surrounding words within a fixed distance (window)
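The two prediction tasks can be written as objectives to maximize. The notation here is an assumption introduced for illustration: T is the number of words in the corpus, w_t is the word at position t, and c is the number of context words on each side (a window of 5 corresponds to c = 2).

```latex
% CBOW: predict the middle word from its context
J_{\text{CBOW}} = \frac{1}{T} \sum_{t=1}^{T}
  \log p\left(w_t \mid w_{t-c}, \dots, w_{t-1}, w_{t+1}, \dots, w_{t+c}\right)

% SG: predict each context word from the middle word
J_{\text{SG}} = \frac{1}{T} \sum_{t=1}^{T}
  \sum_{\substack{-c \le j \le c \\ j \ne 0}}
  \log p\left(w_{t+j} \mid w_t\right)
```

CBOW makes one prediction per position, while SG makes up to 2c predictions per position, one for each context word.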
Proposed method
• Scanning words in a window from an article
• Word order is not important within a window
• E.g.: Many days ago, there was a king who had ……
Here, “king” is our target word = Wt
Wt-2 Wt-1 Wt Wt+1 Wt+2 (Window = 5)
The window then slides along the text to form the next window
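The window scan can be sketched as follows. The function names and the `half_width` parameter are hypothetical, and the token list contains only the words visible in the example sentence (lowercased, punctuation dropped). A window of 5 corresponds to `half_width=2`.

```python
def windows(tokens, half_width=2):
    """Yield (target, context) for every position; the context excludes the target."""
    for t, target in enumerate(tokens):
        left = tokens[max(0, t - half_width):t]
        right = tokens[t + 1:t + 1 + half_width]
        yield target, left + right

def cbow_examples(tokens, half_width=2):
    """CBOW: the context words are the input, the middle word is the label."""
    return [(context, target) for target, context in windows(tokens, half_width)]

def skipgram_examples(tokens, half_width=2):
    """SG: the middle word is the input, each context word is a separate label."""
    return [(target, c)
            for target, context in windows(tokens, half_width)
            for c in context]

tokens = "many days ago there was a king who had".split()
cbow = cbow_examples(tokens)
skipgram = skipgram_examples(tokens)
king_pairs = [pair for pair in skipgram if pair[0] == "king"]
```

For the target "king", CBOW produces one example with the context "was", "a", "who", "had", while SG produces four (input, label) pairs, one per context word. Windows at the edges of the text are simply shorter.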
Proposed method
• A two-layer neural network is used
• The first layer is fully connected
• The final layer uses a softmax function to compute the probability of each word relative to all the others
• Stochastic gradient descent with backpropagation is used to learn the parameters
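A minimal sketch of this network for the CBOW case, in plain Python, is given below. It is an illustration under stated assumptions, not the original implementation: the embedding size, learning rate, random initialization and all names are chosen for the example, and it uses a full softmax over a tiny vocabulary.

```python
import math
import random

random.seed(0)

vocab = ["many", "days", "ago", "there", "was", "a", "king", "who", "had"]
index = {w: i for i, w in enumerate(vocab)}
V, N = len(vocab), 4  # vocabulary size; embedding size (N is an arbitrary choice)

# W_in: V x N input embeddings (first layer); W_out: N x V output weights.
W_in = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(V)]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(N)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(context):
    """Average the context embeddings, then score every vocabulary word."""
    ids = [index[w] for w in context]
    hidden = [sum(W_in[i][k] for i in ids) / len(ids) for k in range(N)]
    scores = [sum(hidden[k] * W_out[k][j] for k in range(N)) for j in range(V)]
    return ids, hidden, softmax(scores)

def sgd_step(context, target, lr=0.1):
    """One stochastic gradient descent update; returns the loss before the update."""
    ids, hidden, probs = forward(context)
    t = index[target]
    # Gradient of the cross-entropy loss w.r.t. the scores: probs - one_hot(target)
    err = [p - (1.0 if j == t else 0.0) for j, p in enumerate(probs)]
    # Backpropagate to the hidden layer before W_out is modified
    grad_hidden = [sum(err[j] * W_out[k][j] for j in range(V)) for k in range(N)]
    for k in range(N):
        for j in range(V):
            W_out[k][j] -= lr * hidden[k] * err[j]
    for i in ids:
        for k in range(N):
            W_in[i][k] -= lr * grad_hidden[k] / len(ids)
    return -math.log(probs[t])

context, target = ["was", "a", "who", "had"], "king"
losses = [sgd_step(context, target) for _ in range(50)]
```

After training, row `W_in[index[word]]` is the learned vector for `word`. Repeating the update on this single example lowers its loss, which is all the sketch is meant to show. A real vocabulary is far larger, which makes the full softmax used here expensive.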
Proposed method
• Representing a word as a vector
cat = [0.2, 0, 0, 0.7, 0, 0, 0, 0, …, 0]
dog = [0.1, 0.3, 0.9, 0, 0, 0, 0, 0, …, 0]
Fig (axes: word, feature): collected from the Coursera course of Andrew Ng
Conclusions
• Introduced a new state of the art in natural language processing
• A large corpus is needed to learn a good embedding
• The training process takes a long time
• A W2V model trained on Wikipedia documents is publicly available
• Used successfully in many applications
Project Links
• https://code.google.com/archive/p/word2vec/
• https://radimrehurek.com/gensim/models/word2vec.html
Application on Medical Data
• Medical data contains notes and codes
• A note is a description of a patient’s condition and treatments
• Codes are unique values used to represent diagnoses and medicines
• There are many standard coding systems, such as ICD-9, CPT …
• W2V can be applied to a medical dataset to learn embeddings of the medical codes
T. Bai, A. K. Chanda, S. Vucetic, B. L. Egleston. "Joint learning of representations of medical concepts and words from EHR data". In the BIBM conference, 2017
References
• T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, CoRR abs/1301.3781. arXiv:1301.3781. URL http://arxiv.org/abs/1301.3781
• X. Rong, word2vec parameter learning explained, CoRR abs/1411.2738. arXiv:1411.2738. URL http://arxiv.org/abs/1411.2738
• T. Bai, A. K. Chanda, B. L. Egleston, S. Vucetic, Joint learning of representations of medical concepts and words from EHR data, in: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, Kansas City, MO, USA, November 13-16, 2017, pp. 764–769. doi:10.1109/BIBM.2017.8217752. URL https://doi.org/10.1109/BIBM.2017.8217752
