NEURAL MODELS FOR
DOCUMENT RANKING
BHASKAR MITRA
Principal Applied Scientist
Microsoft Research and AI
Research Student
Dept. of Computer Science
University College London
Joint work with Nick Craswell, Fernando Diaz,
Federico Nanni, Matt Magnusson, and Laura Dietz
PAPERS WE WILL DISCUSS
Learning to Match Using Local and Distributed Representations of
Text for Web Search
Bhaskar Mitra, Fernando Diaz, and Nick Craswell, in Proc. WWW, 2017.
https://dl.acm.org/citation.cfm?id=3052579
Benchmark for Complex Answer Retrieval
Federico Nanni, Bhaskar Mitra, Matt Magnusson, and Laura Dietz, in Proc. ICTIR, 2017.
https://dl.acm.org/citation.cfm?id=3121099
THE DOCUMENT RANKING TASK
Given a query, rank documents
according to relevance
The query text has few terms
The document representation can be
long (e.g., body text) or short (e.g., title)
query
ranked results
search engine w/ an
index of retrievable items
This talk is focused on ranking documents
based on their long body text
CHALLENGES IN SHORT VS. LONG
TEXT RETRIEVAL
Short-text
Vocabulary mismatch is a more serious problem
Long-text
Documents contain mixture of many topics
Matches in different parts of a long document contribute unequally
Term proximity is an important consideration
MANY DNN MODELS FOR SHORT TEXT RANKING
(Huang et al., 2013)
(Severyn and Moschitti, 2015)
(Shen et al., 2014)
(Palangi et al., 2015)
(Hu et al., 2014)
(Tai et al., 2015)
BUT FEW FOR LONG DOCUMENT RANKING…
(Guo et al., 2016)
(Salakhutdinov and Hinton, 2009)
DESIDERATA OF DOCUMENT RANKING
EXACT MATCHING
Frequency and positions of matches are
good indicators of relevance
Term proximity is important
Important if query term is rare / fresh
INEXACT MATCHING
Synonymy relationships
united states president ↔ Obama
Evidence for document aboutness
Documents about Australia likely to contain
related terms like Sydney and koala
Proximity and position are important
DIFFERENT TEXT REPRESENTATIONS FOR
MATCHING
LOCAL REPRESENTATION
Terms are considered distinct entities
Term representation is local (one-hot vectors)
Matching is exact (term-level)
DISTRIBUTED REPRESENTATION
Represent text as dense vectors (embeddings)
Inexact matching in the embedding space
Local (one-hot) representation Distributed representation
A TALE OF TWO QUERIES
“PEKAROVIC LAND COMPANY”
Hard to learn good representation for
rare term pekarovic
But easy to estimate relevance based
on patterns of exact matches
Proposal: Learn a neural model to
estimate relevance from patterns of
exact matches
“WHAT CHANNEL ARE THE SEAHAWKS ON
TODAY”
Target document likely contains ESPN
or sky sports instead of channel
An embedding model can associate
ESPN in document to channel in query
Proposal: Learn embeddings of text
and match query with document in
the embedding space
The Duet Architecture
Use a neural network to model both functions and learn their parameters jointly
THE DUET
ARCHITECTURE
Linear combination of two models
trained jointly on labelled query-
document pairs
Local model operates on lexical
interaction matrix
Distributed model projects n-graph
vectors of text into an embedding
space and then estimates match
LOCAL
SUB-MODEL
Focuses on patterns of
exact matches of query
terms in document
INTERACTION MATRIX OF QUERY-DOCUMENT
TERMS
\[
X_{i,j} =
\begin{cases}
1, & \text{if } q_i = d_j \\
0, & \text{otherwise}
\end{cases}
\]
In relevant documents,
→Many matches, typically in clusters
→Matches localized early in
document
→Matches for all query terms
→In-order (phrasal) matches
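The interaction matrix defined above can be sketched in a few lines. This is a minimal illustration, assuming whitespace-tokenized terms and exact string equality; the function name `interaction_matrix` and the example query and document are hypothetical, not taken from the paper's code.

```python
import numpy as np

def interaction_matrix(query_terms, doc_terms):
    """Return X with X[i, j] = 1 if query term i equals document term j, else 0."""
    return np.array(
        [[1 if q == d else 0 for d in doc_terms] for q in query_terms],
        dtype=np.int8,
    )

# Hypothetical example: rows are query terms, columns are document positions.
query = ["united", "states", "president"]
doc = "the president of the united states lives in the white house".split()
X = interaction_matrix(query, doc)
print(X)
```

Each row shows where one query term occurs in the document, so clusters, early matches, and in-order matches show up as visible patterns in the matrix.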
ESTIMATING RELEVANCE FROM INTERACTION
MATRIX
← document words →
Convolve using a window of size $n_d \times 1$
Each window instance compares a query term w/
whole document
Fully connected layers aggregate evidence
across query terms - can model phrasal matches
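A rough sketch of the two steps above, under stated assumptions: because each window spans the whole document for a single query term, the convolution reduces to a matrix product between the interaction matrix and a filter bank. The tanh non-linearity, the number of filters, the random weights, and the names `local_features` and `W_fc` are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def local_features(X, W, b):
    """Apply filters of size n_d x 1: one feature vector per query term.

    X: (n_q, n_d) binary interaction matrix
    W: (n_d, n_filters) filter weights, b: (n_filters,) bias
    """
    return np.tanh(X @ W + b)

rng = np.random.default_rng(0)
n_q, n_d, n_filters = 3, 11, 4          # assumed toy sizes

X = np.zeros((n_q, n_d))
X[0, 4] = X[1, 5] = X[2, 1] = 1         # one exact match per query term

W = rng.normal(scale=0.1, size=(n_d, n_filters))
b = np.zeros(n_filters)
H = local_features(X, W, b)             # shape (n_q, n_filters)

# A fully connected layer sees all query terms at once, so it can
# combine evidence across terms (e.g., adjacent, in-order matches).
W_fc = rng.normal(scale=0.1, size=n_q * n_filters)
score = float(H.reshape(-1) @ W_fc)
print(H.shape, score)
```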
DISTRIBUTED
SUB-MODEL
Learns representation of
text and matches query
with document in the
embedding space
INPUT REPRESENTATION
dogs → [ d , o , g , s , #d , do , og , gs , s# , #do , dog , ogs , gs#, #dog, dogs, ogs#, #dogs, dogs# ]
(we consider 2K most popular n-graphs only for encoding)
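The n-graph expansion of "dogs" shown above can be reproduced with a short sketch. It assumes `#` as the word-boundary marker, a maximum n-graph length of 5, and that the lone boundary marker is not emitted as a unigram, all inferred from the example on this slide; the function name `char_ngraphs` is hypothetical. Restricting to the 2K most popular n-graphs would be a separate vocabulary lookup that is not shown here.

```python
def char_ngraphs(word, max_n=5, boundary="#"):
    """List character n-graphs of a word, shortest first, with boundary markers."""
    padded = f"{boundary}{word}{boundary}"
    grams = []
    for n in range(1, max_n + 1):
        for start in range(len(padded) - n + 1):
            gram = padded[start:start + n]
            if gram != boundary:        # skip the bare boundary marker
                grams.append(gram)
    return grams

print(char_ngraphs("dogs"))
```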
[Figure: distributed sub-model. Query and document text (e.g., "dogs have owners cats have staff") → n-graph encoding → concatenate into a [words x channels] matrix (channels = 2K) → convolution → pooling → query embedding; Hadamard product of query and document representations → fully connected layers]
ESTIMATING RELEVANCE FROM TEXT
EMBEDDINGS
Convolve over query and
document terms
Match query with moving
windows over document
Learn text embeddings
specifically for the task
Matching happens in
embedding space
* Network architecture slightly
simplified for visualization; refer to the paper
for exact details
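The matching step can be illustrated with a simplified sketch: the query embedding is multiplied element-wise (Hadamard product) with the embedding of each document window, and a fully connected layer turns the result into a score. The shapes, the single linear layer, and the name `match_in_embedding_space` are assumptions made for illustration; the actual network has more layers.

```python
import numpy as np

def match_in_embedding_space(q_emb, d_emb, w_fc):
    """Score a document against a query in the embedding space.

    q_emb: (dim,) query embedding
    d_emb: (n_windows, dim) one embedding per moving window over the document
    w_fc:  (n_windows * dim,) weights of a single linear layer (assumed)
    """
    hadamard = d_emb * q_emb            # broadcasts the query over all windows
    return float(hadamard.reshape(-1) @ w_fc)

# Toy values, chosen only to make the arithmetic easy to follow.
q_emb = np.array([1.0, 2.0])
d_emb = np.array([[1.0, 1.0],
                  [0.0, 3.0]])
w_fc = np.ones(4)
score = match_in_embedding_space(q_emb, d_emb, w_fc)
print(score)
```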
PUTTING THE TWO MODELS
TOGETHER…
THE DUET
MODEL
Training sample: $Q, D^{+}, D_1^{-}, D_2^{-}, D_3^{-}, D_4^{-}$
$D^{+}$ = document rated Excellent or Good
$D^{-}$ = document 2 ratings worse than $D^{+}$
Optimize cross-entropy loss
Implemented using CNTK (GitHub link)
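One way to read the training setup above is as a five-way classification: the model scores the positive document and the four negatives, and the cross-entropy loss pushes the positive's softmax probability up. The sketch below assumes a softmax over the five scores; the function name `duet_style_loss` and the example scores are hypothetical, and this is NumPy rather than the CNTK implementation.

```python
import numpy as np

def duet_style_loss(pos_score, neg_scores):
    """Cross-entropy of the positive document under a softmax over all scores."""
    scores = np.concatenate(([pos_score], neg_scores)).astype(float)
    scores -= scores.max()                          # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return float(-log_probs[0])                     # positive sits at index 0

# Hypothetical scores for one training sample (Q, D+, four D-).
loss_uninformed = duet_style_loss(0.0, [0.0, 0.0, 0.0, 0.0])
loss_confident = duet_style_loss(5.0, [0.0, 0.0, 0.0, 0.0])
print(loss_uninformed, loss_confident)
```

When all five scores are equal the loss is log 5, and it shrinks as the positive document's score rises above the negatives.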
RESULTS ON DOCUMENT RANKING
Key finding: Duet performs significantly better than local and distributed
models trained individually
DUET ON
OTHER IR TASKS
Promising early results on TREC
2017 Complex Answer Retrieval
(TREC-CAR)
Duet performs significantly
better when trained on large
data (~32 million samples)
RANDOM NEGATIVES VS. JUDGED NEGATIVES
Key finding: training w/ judged
bad as negatives significantly
better than w/ random negatives
LOCAL VS. DISTRIBUTED MODEL
Key finding: local and distributed
models perform better on
different segments, but
combination is always better
EFFECT OF TRAINING DATA VOLUME
Key finding: large quantity of training data necessary for learning good
representations, less impactful for training local model
EFFECT OF TRAINING DATA VOLUME (TREC CAR)
Key finding: large quantity of training data necessary for learning good
representations, less impactful for training local model
TERM IMPORTANCE
LOCAL MODEL DISTRIBUTED MODEL
Query: united states president
If we classify models by
query level performance
there is a clear clustering of
lexical (local) and semantic
(distributed) models
GET THE CODE
Implemented using the CNTK Python API
https://github.com/bmitra-msft/NDRM/blob/master/notebooks/Duet.ipynb
Download
AN INTRODUCTION TO NEURAL
INFORMATION RETRIEVAL
Manuscript under review for
Foundations and Trends® in Information Retrieval
Pre-print is available for free download
http://bit.ly/neuralir-intro
(Final manuscript may contain additional content and changes)
THANK YOU

More Related Content

What's hot

Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackBhaskar Mitra
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalBhaskar Mitra
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchBhaskar Mitra
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressBhaskar Mitra
 
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning TrackConformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning TrackBhaskar Mitra
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information RetrievalNik Spirin
 
Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modelingHiroyuki Kuromiya
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksLeonardo Di Donato
 
Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Sebastian Ruder
 
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...Andre Freitas
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Parang Saraf
 
Topic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsTopic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsClaudia Wagner
 
Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introductionYueshen Xu
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analyticsFarheen Nilofer
 

What's hot (20)

Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
Duet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning TrackDuet @ TREC 2019 Deep Learning Track
Duet @ TREC 2019 Deep Learning Track
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
 
Neural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progressNeural Information Retrieval: In search of meaningful progress
Neural Information Retrieval: In search of meaningful progress
 
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning TrackConformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
Conformer-Kernel with Query Term Independence @ TREC 2020 Deep Learning Track
 
Word Embedding In IR
Word Embedding In IRWord Embedding In IR
Word Embedding In IR
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information Retrieval
 
Topics Modeling
Topics ModelingTopics Modeling
Topics Modeling
 
Basic review on topic modeling
Basic review on  topic modelingBasic review on  topic modeling
Basic review on topic modeling
 
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasksTopic Modeling for Information Retrieval and Word Sense Disambiguation tasks
Topic Modeling for Information Retrieval and Word Sense Disambiguation tasks
 
Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...Transformation Functions for Text Classification: A case study with StackOver...
Transformation Functions for Text Classification: A case study with StackOver...
 
Topic Modeling
Topic ModelingTopic Modeling
Topic Modeling
 
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
 
Topic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic ModelsTopic Models - LDA and Correlated Topic Models
Topic Models - LDA and Correlated Topic Models
 
Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introduction
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
 
Topic modeling using big data analytics
Topic modeling using big data analyticsTopic modeling using big data analytics
Topic modeling using big data analytics
 

Viewers also liked

Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Association for Computational Linguistics
 
Deep Learning for Chatbot (4/4)
Deep Learning for Chatbot (4/4)Deep Learning for Chatbot (4/4)
Deep Learning for Chatbot (4/4)Jaemin Cho
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageRoelof Pieters
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Association for Computational Linguistics
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingYanbin Kong
 
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedBlockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedMelanie Swan
 
Cs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksCs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksYanbin Kong
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Roelof Pieters
 
Cs231n 2017 lecture11 Detection and Segmentation
Cs231n 2017 lecture11 Detection and SegmentationCs231n 2017 lecture11 Detection and Segmentation
Cs231n 2017 lecture11 Detection and SegmentationYanbin Kong
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalBhaskar Mitra
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS MeetupLINAGORA
 
iPhone5c的最后猜测
iPhone5c的最后猜测iPhone5c的最后猜测
iPhone5c的最后猜测Yanbin Kong
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRoelof Pieters
 

Viewers also liked (20)

Roee Aharoni - 2017 - Towards String-to-Tree Neural Machine Translation
Roee Aharoni - 2017 - Towards String-to-Tree Neural Machine TranslationRoee Aharoni - 2017 - Towards String-to-Tree Neural Machine Translation
Roee Aharoni - 2017 - Towards String-to-Tree Neural Machine Translation
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Deep Learning for Chatbot (4/4)
Deep Learning for Chatbot (4/4)Deep Learning for Chatbot (4/4)
Deep Learning for Chatbot (4/4)
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and Understanding
 
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedBlockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
 
Cs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural NetworksCs231n 2017 lecture10 Recurrent Neural Networks
Cs231n 2017 lecture10 Recurrent Neural Networks
 
Care your Child
Care your ChildCare your Child
Care your Child
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Cs231n 2017 lecture11 Detection and Segmentation
Cs231n 2017 lecture11 Detection and SegmentationCs231n 2017 lecture11 Detection and Segmentation
Cs231n 2017 lecture11 Detection and Segmentation
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS Meetup
 
iPhone5c的最后猜测
iPhone5c的最后猜测iPhone5c的最后猜测
iPhone5c的最后猜测
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and Graphs
 

Similar to NEURAL MODELS FOR DOCUMENT RANKING

A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationNinad Samel
 
[ ] uottawa_copeck.doc
[ ] uottawa_copeck.doc[ ] uottawa_copeck.doc
[ ] uottawa_copeck.docbutest
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningIJSRD
 
Effective Approach For Content Based Image Retrieval In Peer-Peer To Networks
Effective Approach For Content Based Image Retrieval In Peer-Peer To NetworksEffective Approach For Content Based Image Retrieval In Peer-Peer To Networks
Effective Approach For Content Based Image Retrieval In Peer-Peer To NetworksIRJET Journal
 
03 interlinking-dass
03 interlinking-dass03 interlinking-dass
03 interlinking-dassDiego Pessoa
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# ProjectsVijay Karan
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# ProjectsVijay Karan
 
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routingIEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routingIEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routingIEEEMEMTECHSTUDENTSPROJECTS
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )SBGC
 
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) cseij
 
Hybrid geo textual index structure
Hybrid geo textual index structureHybrid geo textual index structure
Hybrid geo textual index structurecseij
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine LearningIRJET Journal
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text miningIRJET Journal
 
Semantic Text Processing Powered by Wikipedia
Semantic Text Processing Powered by WikipediaSemantic Text Processing Powered by Wikipedia
Semantic Text Processing Powered by WikipediaMaxim Grinev
 
Semantic Relatedness of Web Resources by XESA - Philipp Scholl
Semantic Relatedness of Web Resources by XESA - Philipp SchollSemantic Relatedness of Web Resources by XESA - Philipp Scholl
Semantic Relatedness of Web Resources by XESA - Philipp SchollCROKODIl consortium
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Innovation Quotient Pvt Ltd
 
Topic Extraction on Domain Ontology
Topic Extraction on Domain OntologyTopic Extraction on Domain Ontology
Topic Extraction on Domain OntologyKeerti Bhogaraju
 
keyword query routing
keyword query routingkeyword query routing
keyword query routingswathi78
 

Similar to NEURAL MODELS FOR DOCUMENT RANKING (20)

G1803054653
G1803054653G1803054653
G1803054653
 
A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorization
 
[ ] uottawa_copeck.doc
[ ] uottawa_copeck.doc[ ] uottawa_copeck.doc
[ ] uottawa_copeck.doc
 
A Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text miningA Novel Approach for Keyword extraction in learning objects using text mining
A Novel Approach for Keyword extraction in learning objects using text mining
 
Effective Approach For Content Based Image Retrieval In Peer-Peer To Networks
Effective Approach For Content Based Image Retrieval In Peer-Peer To NetworksEffective Approach For Content Based Image Retrieval In Peer-Peer To Networks
Effective Approach For Content Based Image Retrieval In Peer-Peer To Networks
 
03 interlinking-dass
03 interlinking-dass03 interlinking-dass
03 interlinking-dass
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# Projects
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# Projects
 
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routingIEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
IEEE 2014 JAVA DATA MINING PROJECTS Keyword query routing
 
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
2014 IEEE JAVA DATA MINING PROJECT Keyword query routing
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
 
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS) ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
ROMAN URDU OPINION MINING SYSTEM (RUOMIS)
 
Hybrid geo textual index structure
Hybrid geo textual index structureHybrid geo textual index structure
Hybrid geo textual index structure
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine Learning
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
 
Semantic Text Processing Powered by Wikipedia
Semantic Text Processing Powered by WikipediaSemantic Text Processing Powered by Wikipedia
Semantic Text Processing Powered by Wikipedia
 
Semantic Relatedness of Web Resources by XESA - Philipp Scholl
Semantic Relatedness of Web Resources by XESA - Philipp SchollSemantic Relatedness of Web Resources by XESA - Philipp Scholl
Semantic Relatedness of Web Resources by XESA - Philipp Scholl
 
Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...Usage of word sense disambiguation in concept identification in ontology cons...
Usage of word sense disambiguation in concept identification in ontology cons...
 
Topic Extraction on Domain Ontology
Topic Extraction on Domain OntologyTopic Extraction on Domain Ontology
Topic Extraction on Domain Ontology
 
keyword query routing
keyword query routingkeyword query routing
keyword query routing
 





NEURAL MODELS FOR DOCUMENT RANKING

  • 1. NEURAL MODELS FOR DOCUMENT RANKING BHASKAR MITRA Principal Applied Scientist Microsoft Research and AI Research Student Dept. of Computer Science University College London Joint work with Nick Craswell, Fernando Diaz, Federico Nanni, Matt Magnusson, and Laura Dietz
  • 2. PAPERS WE WILL DISCUSS Learning to Match Using Local and Distributed Representations of Text for Web Search Bhaskar Mitra, Fernando Diaz, and Nick Craswell, in Proc. WWW, 2017. https://dl.acm.org/citation.cfm?id=3052579 Benchmark for Complex Answer Retrieval Federico Nanni, Bhaskar Mitra, Matt Magnusson, and Laura Dietz, in Proc. ICTIR, 2017. https://dl.acm.org/citation.cfm?id=3121099
  • 3. THE DOCUMENT RANKING TASK Given a query rank documents according to relevance The query text has few terms The document representation can be long (e.g., body text) or short (e.g., title) query ranked results search engine w/ an index of retrievable items
  • 4. This talk is focused on ranking documents based on their long body text
  • 5. CHALLENGES IN SHORT VS. LONG TEXT RETRIEVAL Short-text Vocabulary mismatch more serious problem Long-text Documents contain mixture of many topics Matches in different parts of a long document contribute unequally Term proximity is an important consideration
  • 6. MANY DNN MODELS FOR SHORT TEXT RANKING (Huang et al., 2013) (Severyn and Moschitti, 2015) (Shen et al., 2014) (Palangi et al., 2015) (Hu et al., 2014) (Tai et al., 2015)
  • 7. BUT FEW FOR LONG DOCUMENT RANKING… (Guo et al., 2016) (Salakhutdinov and Hinton, 2009)
  • 8. DESIDERATA OF DOCUMENT RANKING EXACT MATCHING Frequency and positions of matches good indicators of relevance Term proximity is important Important if query term is rare / fresh INEXACT MATCHING Synonymy relationships united states president ↔ Obama Evidence for document aboutness Documents about Australia likely to contain related terms like Sydney and koala Proximity and position are important
  • 9. DIFFERENT TEXT REPRESENTATIONS FOR MATCHING LOCAL REPRESENTATION Terms are considered distinct entities Term representation is local (one-hot vectors) Matching is exact (term-level) DISTRIBUTED REPRESENTATION Represent text as dense vectors (embeddings) Inexact matching in the embedding space Local (one-hot) representation Distributed representation
  • 10. A TALE OF TWO QUERIES “PEKAROVIC LAND COMPANY” Hard to learn good representation for rare term pekarovic But easy to estimate relevance based on patterns of exact matches Proposal: Learn a neural model to estimate relevance from patterns of exact matches “WHAT CHANNEL ARE THE SEAHAWKS ON TODAY” Target document likely contains ESPN or sky sports instead of channel An embedding model can associate ESPN in document to channel in query Proposal: Learn embeddings of text and match query with document in the embedding space The Duet Architecture Use a neural network to model both functions and learn their parameters jointly
  • 11. THE DUET ARCHITECTURE Linear combination of two models trained jointly on labelled query-document pairs Local model operates on lexical interaction matrix Distributed model projects n-graph vectors of text into an embedding space and then estimates match
  • 12. LOCAL SUB-MODEL Focuses on patterns of exact matches of query terms in document
  • 13. INTERACTION MATRIX OF QUERY-DOCUMENT TERMS $X_{i,j} = \begin{cases} 1, & \text{if } q_i = d_j \\ 0, & \text{otherwise} \end{cases}$ In relevant documents: → Many matches, typically in clusters → Matches localized early in document → Matches for all query terms → In-order (phrasal) matches
  • 14. ESTIMATING RELEVANCE FROM INTERACTION MATRIX [Figure axis: document words] Convolve using window of size $n_d \times 1$ Each window instance compares a query term w/ whole document Fully connected layers aggregate evidence across query terms - can model phrasal matches
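The interaction matrix from slide 13 can be illustrated with a minimal sketch. The function name `interaction_matrix` and the whitespace tokenisation are assumptions made for this example only; they are not taken from the talk or from the released notebook.

```python
import numpy as np

def interaction_matrix(query_terms, doc_terms):
    """Return X with X[i, j] = 1 if query term i equals document term j, else 0."""
    X = np.zeros((len(query_terms), len(doc_terms)), dtype=np.float32)
    for i, q in enumerate(query_terms):
        for j, d in enumerate(doc_terms):
            if q == d:
                X[i, j] = 1.0
    return X

# Illustrative input (whitespace tokenisation is an assumption of this sketch).
query = "united states president".split()
doc = "the president of the united states".split()
X = interaction_matrix(query, doc)  # shape: (3 query terms, 6 document terms)
```

Each row of `X` holds the exact-match positions of one query term across the document, which is the input that the $n_d \times 1$ convolution window on slide 14 scans. A fixed-size network would also need the query and document truncated or padded to fixed lengths; that step is omitted here.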
  • 15. LOCAL SUB-MODEL Focuses on patterns of exact matches of query terms in document
  • 16. THE DUET ARCHITECTURE Linear combination of two models trained jointly on labelled query-document pairs Local model operates on lexical interaction matrix Distributed model projects n-graph vectors of text into an embedding space and then estimates match
  • 17. DISTRIBUTED SUB-MODEL Learns representation of text and matches query with document in the embedding space
  • 18. INPUT REPRESENTATION dogs → [ d, o, g, s, #d, do, og, gs, s#, #do, dog, ogs, gs#, #dog, dogs, ogs#, #dogs, dogs# ] (we consider 2K most popular n-graphs only for encoding) [Figure: each word of "dogs have owners cats have staff" is n-graph encoded and the encodings are concatenated into a [words x channels] input, with channels = 2K]
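The n-graph encoding on slide 18 can be sketched as follows. The maximum length of 5 and the `#` boundary marker are inferred from the `dogs` example on the slide; the function names and the `vocab` mapping (n-graph to channel index, standing in for the 2K most popular n-graphs) are hypothetical.

```python
def char_ngraphs(word, max_n=5, boundary="#"):
    """List the character n-graphs of a word, shortest first, with boundary markers."""
    padded = f"{boundary}{word}{boundary}"
    graphs = []
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            g = padded[i:i + n]
            if g != boundary:  # skip the lone boundary marker
                graphs.append(g)
    return graphs

def encode_word(word, vocab):
    """Count-vector encoding of a word over a fixed n-graph vocabulary.

    `vocab` maps n-graph -> channel index; n-graphs outside it are ignored.
    """
    counts = [0] * len(vocab)
    for g in char_ngraphs(word):
        idx = vocab.get(g)
        if idx is not None:
            counts[idx] += 1
    return counts
```

Calling `char_ngraphs("dogs")` yields the 18 n-graphs listed on the slide, in the same order. Stacking `encode_word` outputs for consecutive words gives the [words x channels] input described there.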
  • 19. ESTIMATING RELEVANCE FROM TEXT EMBEDDINGS [Figure labels: query, document, convolution, pooling, query embedding, Hadamard product, fully connected] Convolve over query and document terms Match query with moving windows over document Learn text embeddings specifically for the task Matching happens in embedding space * Network architecture slightly simplified for visualization; refer to the paper for exact details
  • 20. PUTTING THE TWO MODELS TOGETHER…
  • 21. THE DUET MODEL Training sample: $Q, D^+, D_1^-, D_2^-, D_3^-, D_4^-$ where $D^+$ = document rated Excellent or Good and $D^-$ = document 2 ratings worse than $D^+$ Optimize cross-entropy loss Implemented using CNTK (GitHub link)
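One way to read the training objective on slide 21 is a softmax over the scores of the positive document and its four negatives, followed by cross-entropy against the positive. The slide states only "cross-entropy loss", so the softmax normalisation and the name `duet_style_loss` are assumptions of this sketch, not the confirmed implementation.

```python
import numpy as np

def duet_style_loss(pos_score, neg_scores):
    """Cross-entropy of the positive document under a softmax over all candidates.

    Assumption: scores are normalised with a softmax across D+ and the D- set.
    """
    scores = np.concatenate(([pos_score], np.asarray(neg_scores, dtype=np.float64)))
    scores = scores - scores.max()               # shift for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]                         # negative log-probability of D+

# Hypothetical model scores for one training sample (Q, D+, four D-).
loss = duet_style_loss(2.0, [0.5, 0.1, -0.3, 0.0])
```

The loss shrinks as the positive document's score rises relative to the negatives, which matches the intent of ranking $D^+$ above each $D^-$.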
  • 22. RESULTS ON DOCUMENT RANKING Key finding: Duet performs significantly better than local and distributed models trained individually
  • 23. DUET ON OTHER IR TASKS Promising early results on TREC 2017 Complex Answer Retrieval (TREC-CAR) Duet performs significantly better when trained on large data (~32 million samples)
  • 24. RANDOM NEGATIVES VS. JUDGED NEGATIVES Key finding: training w/ judged bad as negatives significantly better than w/ random negatives
  • 25. LOCAL VS. DISTRIBUTED MODEL Key finding: the local and distributed models perform better on different segments, but the combination is always better
  • 26. EFFECT OF TRAINING DATA VOLUME Key finding: large quantity of training data necessary for learning good representations, less impactful for training local model
  • 27. EFFECT OF TRAINING DATA VOLUME (TREC CAR) Key finding: large quantity of training data necessary for learning good representations, less impactful for training local model
  • 28. TERM IMPORTANCE LOCAL MODEL DISTRIBUTED MODEL Query: united states president
  • 29. If we classify models by query level performance there is a clear clustering of lexical (local) and semantic (distributed) models
  • 30. GET THE CODE Implemented using CNTK python API https://github.com/bmitra-msft/NDRM/blob/master/notebooks/Duet.ipynb
  • 31. AN INTRODUCTION TO NEURAL INFORMATION RETRIEVAL Manuscript under review for Foundations and Trends® in Information Retrieval Pre-print is available for free download http://bit.ly/neuralir-intro (Final manuscript may contain additional content and changes) THANK YOU