Rank by Time or by Relevance?
Revisiting Email Search
November 17th, 2015
David Carmel Guy Halawi Liane Lewin-Eytan Yoelle Maarek Ariel Raviv
Haifa Labs
Motivation
▪  “Email search still remains difficult, time-consuming and
frustrating” (Elsweiler et al. 2011)
▪  By default, all existing Web mail services display search results
in reverse chronological order
▪  Makes the discovery of older messages very hard
▪  Imposes strict constraints on message matching
Email Search Today (Time ordered)
Searching for an (old) application form for “Visa to India”
Search in Yahoo Mail
▪  Boolean Search model
•  Each query is a Boolean expression (AND, OR, NOT)
•  Generally, all query terms must appear in at least one of the
message fields (AND operation)
▪  Ranking
•  Default: by Recency (Reverse Chronological ordering)
•  (pseudo)-Relevance – implementation is based on matching
query terms
▪  almost never used by users
Challenge
▪  Challenge the traditional prevalent chronological ranking in Web email
search
›  investigate whether an email-specific relevance ranking could bring any value to our
users
▪  Introduce mail-specific relevance ranking consisting of two phases:
›  Relaxed matching phase to improve recall
›  Comprehensive ranking phase using a rich set of mail-specific features to improve
precision
Email Queries: What people search for?
▪ Very short queries: 1.5 terms on avg.
› Re-find Intent
– looking for specific previous message
› Contact queries ~40%
– Picture, Email address, Phone number,
Physical address, Links, Attachments,
Appointments (time/date), Conversation
▪ Tasks involved:
› Couponing (Pizza coupon?)
› Tracking Items (bill paid, package shipped)
› Looking up Account / Registration info
› Social media (searching for comments/posts)
The Search Process
▪  Standard two-phase retrieval process:
›  First phase: Retrieve a pool of messages qualified as potentially relevant
to the query
•  Two matching models:
–  Restricted (AND mode)
–  Relaxed: any message containing at least one of the query terms in any
of its fields is considered a match
›  Second phase:
•  Ranks these messages using a rich set of features
–  Scores messages with a linear regression model learned using a
learning-to-rank approach
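A minimal sketch of the two first-phase matching modes, assuming a message is a plain dictionary of field name to text and a naive whitespace tokenizer. The names `tokenize`, `message_terms`, `matches_restricted` and `matches_relaxed` are hypothetical and are not taken from the Yahoo Mail implementation.

```python
def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption for illustration)."""
    return text.lower().split()


def message_terms(message):
    """Collect the set of tokens over all fields of a message."""
    terms = set()
    for field_text in message.values():
        terms.update(tokenize(field_text))
    return terms


def matches_restricted(query, message):
    """AND mode: every query term must occur in at least one field."""
    terms = message_terms(message)
    return all(t in terms for t in tokenize(query))


def matches_relaxed(query, message):
    """Relaxed mode: at least one query term occurs in any field."""
    terms = message_terms(message)
    return any(t in terms for t in tokenize(query))


msg = {"subject": "Visa application form", "body": "Please find the form attached"}
print(matches_restricted("visa india", msg))  # False: "india" is missing
print(matches_relaxed("visa india", msg))     # True: "visa" matches
```

The relaxed mode admits more candidates (better recall), which is why a second ranking phase is needed to restore precision.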
REX - Relevance EXtended Ranking Model
Based on an LTR framework using
several sets of features:
▪  Message
▪  Recipient
▪  Sender
▪  Message-Query Similarity
Message Features
▪  Freshness: exponential decay over the message age
▪  User Actions: replied, forwarded, flagged, drafted, read, ...
▪  Attachment: has attachment, attachment type / size
▪  Folder: folder type (inbox, draft, sent, user defined folder)
▪  Exchange Type: reply/forward, in-thread
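The freshness feature is described as an exponential decay over message age. A minimal sketch, assuming a half-life parameterization; the function name `freshness` and the 30-day half-life are illustrative assumptions, not values reported in the slides.

```python
import math


def freshness(age_days, half_life_days=30.0):
    """Exponential decay: 1.0 for a brand-new message, 0.5 at one half-life."""
    decay_rate = math.log(2) / half_life_days
    return math.exp(-decay_rate * age_days)


for age in (0, 30, 365):
    print(age, round(freshness(age), 4))
```

A decayed score, unlike a strict chronological sort, lets an old message still win when its other features are strong enough.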
Recipient Features
▪  To: recipient mentioned in To
▪  Cc: recipient mentioned in Cc
▪  In Group: recipient was not mentioned explicitly
Sender Features
Vertical
▪  User-sender connection: correspondence volume / type
▪  Self correspondence: sender is user
Horizontal
▪  Sender inbound / outbound traffic: volume and ratio
▪  Sender urls usage: volume and ratio in messages
▪  Sender recipients number: avg. per message
▪  Sender recipients actions: ratio over messages
Message-Query Similarity Features
▪  BM25f: textual similarity between a query and the entire message
•  Considering query term distribution over message fields (Subject, From, To, Body,
Attachment)
▪  TF-IDF: measures the tf-idf similarity of each message field
independently of the others
▪  Coord: fraction of query terms that occur in the message
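The coord feature is the simplest of the three to illustrate. A minimal sketch, assuming lowercase query terms, a message given as a dictionary of field texts, and whitespace tokenization; the function name `coord` mirrors the slide, but the code is an illustration and not the authors' implementation.

```python
def coord(query_terms, message_fields):
    """Fraction of distinct query terms that occur anywhere in the message."""
    query = set(query_terms)
    if not query:
        return 0.0
    message_terms = set()
    for text in message_fields.values():
        message_terms.update(text.lower().split())
    return len(query & message_terms) / len(query)


message = {"subject": "Visa application", "body": "form attached"}
print(coord(["visa", "india"], message))  # 0.5: one of two query terms matches
```

Under relaxed matching, coord distinguishes messages that contain all query terms from those that contain only some of them.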
Proximity
Taking into account proximity between query terms in content
▪  Neighborhood: boosting consecutive matches
▪  Proximity: boosting tokens found closely in a fixed
window (5) with no ordering
▪  Prefix: allowing prefix match but with score decay using
length difference
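A rough sketch of the proximity and prefix ideas. The slides give only the window size (5) and say that prefix matches decay with the length difference; the exact scoring formulas are not stated, so the geometric decay, its factor of 0.8, and the function names `within_window` and `prefix_score` are assumptions made for illustration.

```python
def within_window(query_terms, tokens, window=5):
    """True if all query terms occur inside some span of `window`
    consecutive tokens, in any order."""
    needed = set(query_terms)
    if not needed:
        return False
    for start in range(len(tokens)):
        if needed <= set(tokens[start:start + window]):
            return True
    return False


def prefix_score(query_term, token, decay=0.8):
    """1.0 for an exact match; decays per extra character when the query
    term is only a prefix of the token; 0.0 when there is no prefix match."""
    if not query_term or not token.startswith(query_term):
        return 0.0
    return decay ** (len(token) - len(query_term))


tokens = "the visa application form for india is attached".split()
print(within_window(["india", "visa"], tokens))
print(prefix_score("appl", "application"))
```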
Learning to Rank (LTR)
▪  Data point: < query | ~100 matched messages | clicked message >
▪  Datasets:
›  Corporate: 100K random queries from the corporate query log
›  Web-mail: 10K random queries
›  Editorial: 500 queries judged by editors
▪  LTR Algorithm: AROW (Crammer et al. 2013)
Learning to Rank (LTR)
[Figure: messages d1, d2, d3, d4 ranked by linear score ∑ wi fi(d); after the "update w" step the ranking becomes d1, d2, d4, d3.]
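The linear scoring ∑ wi fi(d) and the "update w" step can be sketched as follows. This uses a plain perceptron-style pairwise update as a simpler stand-in; it is not the AROW algorithm named in the slides. The feature vectors, the learning rate and the function names are all hypothetical.

```python
def score(w, f):
    """Linear score: sum_i w_i * f_i(d)."""
    return sum(wi * fi for wi, fi in zip(w, f))


def pairwise_update(w, clicked, others, lr=0.1):
    """Move w toward the clicked message for every candidate that
    currently scores at least as high as the clicked one."""
    w = list(w)
    for other in others:
        if score(w, other) >= score(w, clicked):
            w = [wi + lr * (c - o) for wi, c, o in zip(w, clicked, other)]
    return w


# Hypothetical two-dimensional features: [freshness, coord].
docs = {"d1": [0.9, 0.0], "d2": [0.8, 0.5], "d3": [0.3, 0.0], "d4": [0.2, 1.0]}
clicked = docs["d4"]
others = [docs[name] for name in ("d1", "d2", "d3")]

w = [1.0, 0.0]  # start by ranking on freshness only
for _ in range(100):  # repeat passes; updates stop once d4 is on top
    w = pairwise_update(w, clicked, others)

ranking = sorted(docs, key=lambda name: score(w, docs[name]), reverse=True)
print(w, ranking)
```

Each update shifts weight away from features on which the wrongly preferred message is stronger and toward features on which the clicked message is stronger.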
Experimental Results
Performance Measures
▪  Mean reciprocal rank (MRR)
the average over queries of 1/rank of the clicked message; it is the
reciprocal of the harmonic mean of those ranks
▪  Success@K
The number of queries for which the clicked message is
found in the top-K results
▪  NDCG@K
used when we have several relevance feedback levels
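A short sketch of the three measures, assuming 1-based ranks of the clicked message per query and, for NDCG, a list of graded relevance labels in ranked order. Two choices here are assumptions: Success@K is normalized by the number of queries (the slide states it as a count), and NDCG uses linear gains with a log2 discount, which is one common variant.

```python
import math


def mrr(clicked_ranks):
    """Mean over queries of 1 / rank of the clicked message (ranks are 1-based)."""
    return sum(1.0 / r for r in clicked_ranks) / len(clicked_ranks)


def success_at_k(clicked_ranks, k):
    """Fraction of queries whose clicked message is within the top k."""
    return sum(1 for r in clicked_ranks if r <= k) / len(clicked_ranks)


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k results."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


ranks = [1, 2, 4]  # hypothetical clicked-message ranks for three queries
print(mrr(ranks), success_at_k(ranks, 2), ndcg_at_k([0, 1, 2], 3))
```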
Time Vs. REX (Corporate Dataset)
Algorithm                                MRR (+lift %)
Time                                     0.3722
REX (fresh. + sim.)                      0.4261 (+14.48%)
REX (fresh. + sim. + actions)            0.4550 (+22.24%)
REX (fresh. + sim. + actions + sender)   0.4548 (+22.19%)
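The lift column is the relative MRR improvement over the Time baseline. A small sketch that recomputes it from the Corporate figures above; the function name `lift_percent` is hypothetical. Because the reported MRR values are rounded to four decimals, a recomputed lift can differ from the reported one in the last digit.

```python
def lift_percent(baseline, value):
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


time_mrr = 0.3722
rex_variants = [
    ("fresh. + sim.", 0.4261),
    ("fresh. + sim. + actions", 0.4550),
    ("fresh. + sim. + actions + sender", 0.4548),
]
for name, value in rex_variants:
    print(name, round(lift_percent(time_mrr, value), 2))
```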
Time Vs. REX (Web Mail Dataset)
Algorithm                                MRR (+lift %)
Time                                     0.3717
REX (fresh. + sim.)                      0.3785 (+1.81%)
REX (fresh. + sim. + actions)            0.4238 (+14%)
REX (fresh. + sim. + actions + sender)   0.4258 (+14.55%)
Time vs REX (as a function of the Result set size)
Relative improvement of REX over Time increases as more
messages in the user inbox match the query
Relative Feature Importance
▪  In general, REX ranker significantly outperforms Chronological ranker
›  both in the Corporate and in the Web datasets
▪  Relative Feature Importance:
Freshness >> User actions >> Similarity >> Sender features
›  Freshness:
•  Years >> Months >> Weeks >> Days
›  User actions
•  Read >> Forwarded >> Flagged >> Replied >> Draft >> Ham >> Spam
›  Similarity:
•  coord >> tf-idf (From > Subject > Body > Attachment > To) >> BM25f
▪  Low significance of the sender features ??
Time Vs. REX (Editorial Dataset)
Algorithm   MRR (+lift %)      NDCG@10 (+lift %)
Time        0.3629             0.4936
REX         0.5105 (+40.65%)   0.6647 (+34.66%)

Query        Intent                             Algo A Most Relevant   Algo A Related   Algo B Most Relevant   Algo B Related
Lila dress   Discussion about dress for party   2                      3,5,7            4                      1,2,6,7
Spense KE    Schedule for Spense KE meeting     5                      2,4,8,9          1                      3,4,5
Editors Feedback
“... Sometimes, I had the feeling that Algo. B was
really reading my mind to put in the first place
exactly the email message I was thinking of ...”
“...Today, after I ran it again, it was not that much
impressive, but still I have the feeling it was the
type of search that gave me the best results...”
Email Search Tomorrow (REX ordered)
Searching for an (old) application form for “Visa to India”
Conclusions
●  We challenged the traditional chronological sort for email search
o  While freshness is still super important, it should be integrated into the
relevance model with many other important features
o  REX performs significantly better than Time-based ranking
o  The model can be easily extended with more signals as they become
available
●  Are mail users ready to depart from chronological sort in favor of
modern relevance ranking?
●  Time will tell
●  REX provides our users the opportunity, at least
●  More details can be found in our CIKM 2015 paper:
o  Rank by Time or by Relevance? Revisiting Email Search
Future Work
▪  Enriching the set of Ranking features
›  Solving the mystery:
•  why do Sender features not contribute to the ranking?
›  Adding Query based features Based on Query Intent Analysis
▪  Personalization
›  Adding the User into the ranking model
▪  User Study
›  Better understanding user needs
•  how users search over their mailboxes
Yahoo's new Mobile Mail application
Thanks for listening
