SlideShare a Scribd company logo
Dual Embedding Space Model (DESM)
Bhaskar Mitra, Eric Nalisnick, Nick Craswell and Rich Caruana
https://arxiv.org/abs/1602.01137
How do you learn a neural embedding?
Setup a prediction task
Source Item → Target Item
(The bottleneck layers are crucial for generalization)
Target
item
(sparse)
Source
item
(sparse)
Source
embedding
(dense)
Target
Embedding
(dense)
Distance
Metric
The bottleneck
Word2vec
Mikolov et. al. (2013)
Word → Neighboring word
I/O: One-Hot
DSSM (Query-Document)
Huang et. al. (2013), Shen et. al. (2014)
Query → Document
I/O: Bag-of-trigrams
DSSM (Session Pairs)
Mitra (2015)
Query → Neighboring query in session
I/O: Bag-of-trigrams
DSSM (Language Model)
Mitra and Craswell (2015)
Query prefix → query suffix
I/O: Bag-of-trigrams
Not all embeddings are created equal
The source-target training pairs strictly dictate what notion of
relatedness will be modelled in the embedding space
Is eminem more similar to rihanna or rap?
Is yale more similar to harvard or alumni?
Is seahawks more similar to broncos or seattle?
(Be careful of using pre-trained embeddings as inputs to a different model –
one-hot representations or learning an in situ embedding may be better!)
Word2vec
Learning word embeddings based
on word co-occurrence data.
Well-known for word analogy tasks,
[king] – [man] + [woman] ≈ [queen]
What if I told you that everyone
who uses Word2vec is throwing half
the model away?
Typical vs. Topical Relatedness
The IN-IN and the OUT-OUT similarities cluster words that occur in the same context
and therefore of the same Type. The overall word2vec model is trained to predict
neighboring words. Therefore the IN-OUT similarity clusters words that commonly co-
occur under the same Topic.
Typical embeddings for Web search?
B. Mitra and N. Craswell. Query
auto-completion for rare prefixes.
In Proc. CIKM. ACM, 2015.
Which passage is about Albuquerque?
Traditionally in Search we look for evidence of
relevance of a document to a query in terms
of the number of matches of the query
terms in the document.
But there is useful signal in the non-matching
terms in the document about whether the
document is really about the query terms, or
simply mentions them.
A word co-occurrence model can be used to
check if the other words in the document
support the presence of the matching terms.
Passage about Albuquerque
Passage not about Albuquerque
Dual Embedding Space Model
• All pairs comparison between query
and document terms
• Document embedding can be pre-
computed as the centroid of all the
unit vectors of the words in the
document
• DESMIN-OUT uses IN-embeddings for
query words and OUT-embeddings
for document words
• DESMIN-IN uses IN-embeddings
document words as well
IN-OUT vs. IN-IN
Because Cambridge is not an African mammal
DESM = ✔
BM25 = ✔
DESM = ✘
BM25 = ✔
DESM = ✔
BM25 = ✘
Query: cambridge
Telescoping Evaluation
As a weak ranking feature DESMIN-OUT performs better than BM25,
LSA and DESMIN-IN models on a UHRS (Overall) set and a click based
test set.
Full retrieval evaluation
The DESM models only a specific aspect of document relevance. In the presence
of many random documents (distractors) it is susceptible to spurious false
positives and needs to be combined with lexical ranking features such as BM25
DESM vs. BM25
Making different mistakes
Questions?

More Related Content

What's hot

Utilidad del Encerado Diagnostico y Enfilado Dentario
Utilidad del Encerado Diagnostico y Enfilado DentarioUtilidad del Encerado Diagnostico y Enfilado Dentario
Utilidad del Encerado Diagnostico y Enfilado DentarioFelipeMaldo
 
Sistema essix
Sistema essixSistema essix
Sistema essix
IsabellaArdilaGalean
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
Bhaskar Mitra
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
Marco Righini
 
Cefalometria Posteroanterior
Cefalometria PosteroanteriorCefalometria Posteroanterior
Cefalometria Posteroanterior
claudia cano
 
Molares inferiores
Molares inferioresMolares inferiores
Molares inferiores
karlamargine
 
Histología y embriología del esmalte dental
Histología y embriología del esmalte dentalHistología y embriología del esmalte dental
Histología y embriología del esmalte dental
CornArlett
 
Mesialización 29 ene 2013
Mesialización 29 ene 2013Mesialización 29 ene 2013
Mesialización 29 ene 2013Carolum
 
ANALISIS_DE_MODELOS_CLASE_I.ppt
ANALISIS_DE_MODELOS_CLASE_I.pptANALISIS_DE_MODELOS_CLASE_I.ppt
ANALISIS_DE_MODELOS_CLASE_I.ppt
sucrelc
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
Bhaskar Mitra
 
El Sistema Damon
El Sistema Damon El Sistema Damon
El Sistema Damon
Igor Hanssen
 
Yesos de uso_odontologico[1]
Yesos de uso_odontologico[1]Yesos de uso_odontologico[1]
Yesos de uso_odontologico[1]
Cirujano Dentista
 
Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)
Jeet Das
 
Revision bibliografica de prescripciones de ortodoncia
Revision bibliografica de prescripciones de ortodonciaRevision bibliografica de prescripciones de ortodoncia
Revision bibliografica de prescripciones de ortodoncia
xime_alex19
 
Periodonto e inervacion
Periodonto e inervacionPeriodonto e inervacion
Periodonto e inervacionNatalia GF
 
Técnicas quirúrgicas gingivales.
Técnicas quirúrgicas gingivales.Técnicas quirúrgicas gingivales.
Técnicas quirúrgicas gingivales.
Angel Nadal
 
Syllabus.quinto protesis fija y removible (pre clinica)
Syllabus.quinto   protesis fija y removible (pre clinica)Syllabus.quinto   protesis fija y removible (pre clinica)
Syllabus.quinto protesis fija y removible (pre clinica)Celina Zalzer
 

What's hot (20)

Utilidad del Encerado Diagnostico y Enfilado Dentario
Utilidad del Encerado Diagnostico y Enfilado DentarioUtilidad del Encerado Diagnostico y Enfilado Dentario
Utilidad del Encerado Diagnostico y Enfilado Dentario
 
Sistema essix
Sistema essixSistema essix
Sistema essix
 
Carillas
CarillasCarillas
Carillas
 
Análisis de modelos
Análisis de modelosAnálisis de modelos
Análisis de modelos
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
Cefalometria Posteroanterior
Cefalometria PosteroanteriorCefalometria Posteroanterior
Cefalometria Posteroanterior
 
Molares inferiores
Molares inferioresMolares inferiores
Molares inferiores
 
Histología y embriología del esmalte dental
Histología y embriología del esmalte dentalHistología y embriología del esmalte dental
Histología y embriología del esmalte dental
 
Mesialización 29 ene 2013
Mesialización 29 ene 2013Mesialización 29 ene 2013
Mesialización 29 ene 2013
 
ANALISIS_DE_MODELOS_CLASE_I.ppt
ANALISIS_DE_MODELOS_CLASE_I.pptANALISIS_DE_MODELOS_CLASE_I.ppt
ANALISIS_DE_MODELOS_CLASE_I.ppt
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
Yeso
YesoYeso
Yeso
 
El Sistema Damon
El Sistema Damon El Sistema Damon
El Sistema Damon
 
Yesos de uso_odontologico[1]
Yesos de uso_odontologico[1]Yesos de uso_odontologico[1]
Yesos de uso_odontologico[1]
 
Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)
 
Revision bibliografica de prescripciones de ortodoncia
Revision bibliografica de prescripciones de ortodonciaRevision bibliografica de prescripciones de ortodoncia
Revision bibliografica de prescripciones de ortodoncia
 
Periodonto e inervacion
Periodonto e inervacionPeriodonto e inervacion
Periodonto e inervacion
 
Técnicas quirúrgicas gingivales.
Técnicas quirúrgicas gingivales.Técnicas quirúrgicas gingivales.
Técnicas quirúrgicas gingivales.
 
Syllabus.quinto protesis fija y removible (pre clinica)
Syllabus.quinto   protesis fija y removible (pre clinica)Syllabus.quinto   protesis fija y removible (pre clinica)
Syllabus.quinto protesis fija y removible (pre clinica)
 

Similar to Dual Embedding Space Model (DESM)

The Duet model
The Duet modelThe Duet model
The Duet model
Bhaskar Mitra
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
Bhaskar Mitra
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
Bhaskar Mitra
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
Bhaskar Mitra
 
Using topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchUsing topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic search
Dawn Anderson MSc DigM
 
Document Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word RepresentationDocument Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word Representation
suthi
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Matthew Lease
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
kevig
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
kevig
 
Designing, Visualizing and Understanding Deep Neural Networks
Designing, Visualizing and Understanding Deep Neural NetworksDesigning, Visualizing and Understanding Deep Neural Networks
Designing, Visualizing and Understanding Deep Neural Networks
connectbeubax
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
BereketAraya
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for Retrieval
Bhaskar Mitra
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
Bhaskar Mitra
 
Automated Software Requirements Labeling
Automated Software Requirements LabelingAutomated Software Requirements Labeling
Automated Software Requirements Labeling
Data Works MD
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
IRJET Journal
 
Vectorization In NLP.pptx
Vectorization In NLP.pptxVectorization In NLP.pptx
Vectorization In NLP.pptx
Chode Amarnath
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text mining
IJSRD
 
Cc35451454
Cc35451454Cc35451454
Cc35451454
IJERA Editor
 

Similar to Dual Embedding Space Model (DESM) (20)

The Duet model
The Duet modelThe Duet model
The Duet model
 
5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval5 Lessons Learned from Designing Neural Models for Information Retrieval
5 Lessons Learned from Designing Neural Models for Information Retrieval
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
 
Using topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchUsing topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic search
 
Document Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word RepresentationDocument Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word Representation
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
 
EDI 2009- Advanced Search: What’s Under the Hood of your Favorite Search System?
EDI 2009- Advanced Search: What’s Under the Hood of your Favorite Search System?EDI 2009- Advanced Search: What’s Under the Hood of your Favorite Search System?
EDI 2009- Advanced Search: What’s Under the Hood of your Favorite Search System?
 
Designing, Visualizing and Understanding Deep Neural Networks
Designing, Visualizing and Understanding Deep Neural NetworksDesigning, Visualizing and Understanding Deep Neural Networks
Designing, Visualizing and Understanding Deep Neural Networks
 
6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt6&7-Query Languages & Operations.ppt
6&7-Query Languages & Operations.ppt
 
Deep Neural Methods for Retrieval
Deep Neural Methods for RetrievalDeep Neural Methods for Retrieval
Deep Neural Methods for Retrieval
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
Automated Software Requirements Labeling
Automated Software Requirements LabelingAutomated Software Requirements Labeling
Automated Software Requirements Labeling
 
Eurolan 2005 Pedersen
Eurolan 2005 PedersenEurolan 2005 Pedersen
Eurolan 2005 Pedersen
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
 
Vectorization In NLP.pptx
Vectorization In NLP.pptxVectorization In NLP.pptx
Vectorization In NLP.pptx
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text mining
 
Cc35451454
Cc35451454Cc35451454
Cc35451454
 

More from Bhaskar Mitra

Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
Joint Multisided Exposure Fairness for Search and Recommendation
Joint Multisided Exposure Fairness for Search and RecommendationJoint Multisided Exposure Fairness for Search and Recommendation
Joint Multisided Exposure Fairness for Search and Recommendation
Bhaskar Mitra
 
What’s next for deep learning for Search?
What’s next for deep learning for Search?What’s next for deep learning for Search?
What’s next for deep learning for Search?
Bhaskar Mitra
 
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
Bhaskar Mitra
 
Efficient Machine Learning and Machine Learning for Efficiency in Information...
Efficient Machine Learning and Machine Learning for Efficiency in Information...Efficient Machine Learning and Machine Learning for Efficiency in Information...
Efficient Machine Learning and Machine Learning for Efficiency in Information...
Bhaskar Mitra
 
Multisided Exposure Fairness for Search and Recommendation
Multisided Exposure Fairness for Search and RecommendationMultisided Exposure Fairness for Search and Recommendation
Multisided Exposure Fairness for Search and Recommendation
Bhaskar Mitra
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
Bhaskar Mitra
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progress
Bhaskar Mitra
 
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning TrackConformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Bhaskar Mitra
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
Bhaskar Mitra
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning Track
Bhaskar Mitra
 
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and BeyondBenchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Bhaskar Mitra
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
Bhaskar Mitra
 
Learning to Rank with Neural Networks
Learning to Rank with Neural NetworksLearning to Rank with Neural Networks
Learning to Rank with Neural Networks
Bhaskar Mitra
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
Bhaskar Mitra
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
Bhaskar Mitra
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
Bhaskar Mitra
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
Bhaskar Mitra
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrieval
Bhaskar Mitra
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
Bhaskar Mitra
 

More from Bhaskar Mitra (20)

Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Joint Multisided Exposure Fairness for Search and Recommendation
Joint Multisided Exposure Fairness for Search and RecommendationJoint Multisided Exposure Fairness for Search and Recommendation
Joint Multisided Exposure Fairness for Search and Recommendation
 
What’s next for deep learning for Search?
What’s next for deep learning for Search?What’s next for deep learning for Search?
What’s next for deep learning for Search?
 
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
So, You Want to Release a Dataset? Reflections on Benchmark Development, Comm...
 
Efficient Machine Learning and Machine Learning for Efficiency in Information...
Efficient Machine Learning and Machine Learning for Efficiency in Information...Efficient Machine Learning and Machine Learning for Efficiency in Information...
Efficient Machine Learning and Machine Learning for Efficiency in Information...
 
Multisided Exposure Fairness for Search and Recommendation
Multisided Exposure Fairness for Search and RecommendationMultisided Exposure Fairness for Search and Recommendation
Multisided Exposure Fairness for Search and Recommendation
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progress
 
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning TrackConformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning Track
 
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and BeyondBenchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
Benchmarking for Neural Information Retrieval: MS MARCO, TREC, and Beyond
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Learning to Rank with Neural Networks
Learning to Rank with Neural NetworksLearning to Rank with Neural Networks
Learning to Rank with Neural Networks
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Adversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrievalAdversarial and reinforcement learning-based approaches to information retrieval
Adversarial and reinforcement learning-based approaches to information retrieval
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
 

Recently uploaded

Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
ViralQR
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 

Recently uploaded (20)

Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 

Dual Embedding Space Model (DESM)

  • 1. Dual Embedding Space Model (DESM) Bhaskar Mitra, Eric Nalisnick, Nick Craswell and Rich Caruana https://arxiv.org/abs/1602.01137
  • 2. How do you learn a neural embedding? Setup a prediction task Source Item → Target Item (The bottleneck layers are crucial for generalization) Target item (sparse) Source item (sparse) Source embedding (dense) Target Embedding (dense) Distance Metric The bottleneck Word2vec Mikolov et. al. (2013) Word → Neighboring word I/O: One-Hot DSSM (Query-Document) Huang et. al. (2013), Shen et. al. (2014) Query → Document I/O: Bag-of-trigrams DSSM (Session Pairs) Mitra (2015) Query → Neighboring query in session I/O: Bag-of-trigrams DSSM (Language Model) Mitra and Craswell (2015) Query prefix → query suffix I/O: Bag-of-trigrams
  • 3. Not all embeddings are created equal The source-target training pairs strictly dictate what notion of relatedness will be modelled in the embedding space Is eminem more similar to rihanna or rap? Is yale more similar to harvard or alumni? Is seahawks more similar to broncos or seattle? (Be careful of using pre-trained embeddings as inputs to a different model – one-hot representations or learning an in situ embedding may be better!)
  • 4. Word2vec Learning word embeddings based on word co-occurrence data. Well-known for word analogy tasks, [king] – [man] + [woman] ≈ [queen] What if I told you that everyone who uses Word2vec is throwing half the model away?
  • 5. Typical vs. Topical Relatedness The IN-IN and the OUT-OUT similarities cluster words that occur in the same context and therefore of the same Type. The overall word2vec model is trained to predict neighboring words. Therefore the IN-OUT similarity clusters words that commonly co- occur under the same Topic.
  • 6. Typical embeddings for Web search? B. Mitra and N. Craswell. Query auto-completion for rare prefixes. In Proc. CIKM. ACM, 2015.
  • 7. Which passage is about Albuquerque? Traditionally in Search we look for evidence of relevance of a document to a query in terms of the number of matches of the query terms in the document. But there is useful signal in the non-matching terms in the document about whether the document is really about the query terms, or simply mentions them. A word co-occurrence model can be used to check if the other words in the document support the presence of the matching terms. Passage about Albuquerque Passage not about Albuquerque
  • 8. Dual Embedding Space Model • All pairs comparison between query and document terms • Document embedding can be pre- computed as the centroid of all the unit vectors of the words in the document • DESMIN-OUT uses IN-embeddings for query words and OUT-embeddings for document words • DESMIN-IN uses IN-embeddings document words as well
  • 10. Because Cambridge is not an African mammal DESM = ✔ BM25 = ✔ DESM = ✘ BM25 = ✔ DESM = ✔ BM25 = ✘ Query: cambridge
  • 11. Telescoping Evaluation As a weak ranking feature DESMIN-OUT performs better than BM25, LSA and DESMIN-IN models on a UHRS (Overall) set and a click based test set.
  • 12. Full retrieval evaluation The DESM models only a specific aspect of document relevance. In the presence of many random documents (distractors) it is susceptible to spurious false positives and needs to be combined with lexical ranking features such as BM25