Rank by Time or by Relevance?
Revisiting Email Search
November 17th, 2015
David Carmel Guy Halawi Liane Lewin-Eytan Yoelle Maarek Ariel Raviv
Haifa Labs
Motivation
▪  “Email search still remains difficult, time-consuming and
frustrating” (Elsweiler et al. 2011)
▪  By default, all existing Web mail services display search results
in reverse chronological order
›  makes the discovery of older messages very hard
›  imposes strict constraints on message matching
Email Search Today (Time ordered)
Searching for an (old) application form for “Visa to India”
Search in Yahoo Mail
▪  Boolean Search model
•  Each query is a Boolean expression (AND, OR, NOT)
•  Generally, every query term must appear in at least one of the
message fields (AND operation)
▪  Ranking
•  Default: by Recency (Reverse Chronological ordering)
•  (Pseudo-)relevance: implementation is based on matching
query terms
–  almost never used by users
Challenge
▪  Challenge the prevalent chronological ranking in Web email
search
›  Investigate whether an email-specific relevance ranking could bring value to our
users
▪  Introduce mail-specific relevance ranking consisting of two phases:
›  Relaxed matching phase to improve recall
›  Comprehensive ranking phase using a rich set of mail-specific features to improve
precision
Email Queries: What do people search for?
▪ Very short queries: 1.5 terms on avg.
› Re-find intent
– looking for a specific previous message
› Contact queries ~40%
– Picture, Email address, Phone number,
Physical address, Links, Attachments,
Appointments (time/date), Conversation
▪ Tasks involved:
› Couponing (Pizza coupon?)
› Tracking Items (bill paid, package shipped)
› Looking up Account / Registration info
› Social media (searching for comments/posts)
The Search Process
▪  Standard two-phase retrieval process:
›  First phase: retrieve a pool of messages that qualify as potentially relevant
to the query
•  Two matching models:
–  Restricted (AND mode)
–  Relaxed: any message containing at least one of the query terms in any
of its fields is considered a match
›  Second phase:
•  Rank these messages using a rich set of features
–  Score messages with a linear regression model learned using a learning-to-rank
approach
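The two matching models of the first phase can be sketched as follows. This is a minimal illustration, not the Yahoo Mail implementation: the dictionary representation of a message, the whitespace tokenizer, and the names `matches`, `retrieve` and `MAILBOX` are all assumptions made for the example.

```python
from typing import Dict, List, Set

Message = Dict[str, str]  # hypothetical: field name -> field text


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return set(text.lower().split())


def message_terms(message: Message) -> Set[str]:
    """Union of the terms found in all fields of the message."""
    terms: Set[str] = set()
    for text in message.values():
        terms |= tokenize(text)
    return terms


def matches(query: str, message: Message, relaxed: bool) -> bool:
    """Restricted: every query term occurs in some field.
    Relaxed: at least one query term occurs in some field."""
    q_terms = tokenize(query)
    if not q_terms:
        return False
    m_terms = message_terms(message)
    if relaxed:
        return bool(q_terms & m_terms)
    return q_terms <= m_terms


def retrieve(query: str, mailbox: List[Message], relaxed: bool) -> List[Message]:
    """First phase: collect the pool of candidate messages."""
    return [msg for msg in mailbox if matches(query, msg, relaxed)]


# Toy mailbox, invented for illustration only.
MAILBOX = [
    {"subject": "Visa application", "body": "form for India"},
    {"subject": "India trip photos", "body": "see attached"},
    {"subject": "Pizza coupon", "body": "20% off"},
]
restricted_pool = retrieve("visa india", MAILBOX, relaxed=False)
relaxed_pool = retrieve("visa india", MAILBOX, relaxed=True)
```

On this toy mailbox the relaxed pool is larger than the restricted one, which is the recall gain the relaxed mode is meant to provide; ordering that larger pool is left to the second phase.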
REX - Relevance EXtended Ranking Model
Based on an LTR framework using
several sets of features:
▪  Message
▪  Recipient
▪  Sender
▪  Message-Query Similarity
Message Features
▪  Freshness: exponential decay over the message age
▪  User Actions: replied, forwarded, flagged, drafted, read, ...
▪  Attachment: has attachment, attachment type / size
▪  Folder: folder type (inbox, draft, sent, user-defined folder)
▪  Exchange Type: reply/forward, in-thread
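As an illustration of the freshness feature, exponential decay over the message age could look like the sketch below. The slides do not give the decay rate; the 30-day half-life and the function name `freshness` are assumptions chosen only for the example.

```python
import math


def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay over message age.

    Returns 1.0 for a brand-new message and halves every
    `half_life_days` (a hypothetical parameter, not from the slides).
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return math.exp(-math.log(2) * age_days / half_life_days)


fresh_today = freshness(0)
fresh_month = freshness(30)
fresh_year = freshness(365)
```

The feature never reaches zero, so an old message can still be ranked first when other features (user actions, similarity) outweigh its low freshness.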
Recipient Features
▪  To: recipient mentioned in To
▪  Cc: recipient mentioned in Cc
▪  In Group: recipient was not mentioned explicitly
Sender Features
Vertical
▪  User-sender connection: correspondence volume / type
▪  Self correspondence: sender is user
Horizontal
▪  Sender inbound / outbound traffic: volume and ratio
▪  Sender urls usage: volume and ratio in messages
▪  Sender recipients number: avg. per message
▪  Sender recipients actions: ratio over messages
Message-Query Similarity Features
▪  BM25f: textual similarity between a query and the entire message
•  Considers the query term distribution over message fields (Subject, From, To, Body,
Attachment)
▪  TF-IDF: measures the (tf-idf) similarity of each message field
independently of the others
▪  Coord: fraction of query terms that occur in the message
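The Coord feature is the simplest of the three and can be sketched directly from its definition. The message representation (field name to text) and the whitespace tokenizer are assumptions made for this example.

```python
from typing import Dict


def coord(query: str, message: Dict[str, str]) -> float:
    """Fraction of distinct query terms that occur anywhere in the message."""
    q_terms = set(query.lower().split())
    if not q_terms:
        return 0.0
    msg_terms = set()
    for text in message.values():
        msg_terms.update(text.lower().split())
    return len(q_terms & msg_terms) / len(q_terms)


# Toy message, invented for illustration.
example = {"subject": "Visa application", "body": "for India"}
coord_value = coord("visa india form", example)  # 2 of 3 terms match
```

Under relaxed matching a message may contain only some of the query terms, and Coord is what lets the ranker prefer messages that contain more of them.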
Proximity
Taking into account proximity between query terms in content
▪  Neighborhood: boosting consecutive matches
▪  Proximity: boosting tokens found close together in a fixed
window (5), in any order
▪  Prefix: allowing prefix matches, but with score decay based on
length difference
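A rough sketch of the three proximity signals is given below. The slides do not specify the exact definitions, so the following are assumptions: "window (5)" is read as token positions differing by fewer than 5, and the prefix decay is taken to be 1 / (1 + length difference). All function names are hypothetical.

```python
from typing import List


def positions(tokens: List[str], term: str) -> List[int]:
    """Indices at which `term` occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def consecutive(tokens: List[str], first: str, second: str) -> bool:
    """Neighborhood: `second` appears immediately after `first`."""
    second_pos = set(positions(tokens, second))
    return any(i + 1 in second_pos for i in positions(tokens, first))


def within_window(tokens: List[str], a: str, b: str, window: int = 5) -> bool:
    """Proximity: both terms fall inside a span of `window` tokens, any order."""
    return any(
        abs(i - j) < window
        for i in positions(tokens, a)
        for j in positions(tokens, b)
    )


def prefix_score(query_term: str, token: str) -> float:
    """Prefix match whose score decays with the length difference (assumed form)."""
    if not query_term or not token.startswith(query_term):
        return 0.0
    return 1.0 / (1 + len(token) - len(query_term))


TOKENS = "application form for visa to india".split()
```

For example, `consecutive(TOKENS, "visa", "to")` holds, while "visa" and "india" are only matched by the looser `within_window` check.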
Learning to Rank (LTR)
▪  Data point: < query | ~100 matched messages | clicked message >
▪  Datasets:
›  Corporate: 100K random queries from the corporate query log
›  Web-mail: 10K random queries
›  Editorial: 500 queries judged by editors
▪  LTR Algorithm: AROW (Crammer et al. 2013)
Learning to Rank (LTR)
[Figure: messages d1, d2, d3, d4 ranked by the linear score ∑ wi fi(d); after the "update w" step the order becomes d1, d2, d4, d3.]
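The linear score ∑ wi fi(d) and the "update w" step on this slide can be illustrated with a deliberately simplified pairwise update. Note that this is a plain perceptron-style rule, not AROW, which additionally maintains per-weight confidence; the learning rate, feature values and function names are assumptions for the example.

```python
from typing import List, Sequence


def score(w: Sequence[float], features: Sequence[float]) -> float:
    """Linear score: sum of w_i * f_i(d)."""
    return sum(wi * fi for wi, fi in zip(w, features))


def pairwise_update(w: Sequence[float], clicked: Sequence[float],
                    other: Sequence[float], lr: float = 0.1) -> List[float]:
    """If the clicked message does not outscore `other`, move w toward
    the feature difference (clicked - other). Simplified; not AROW."""
    if score(w, clicked) > score(w, other):
        return list(w)
    return [wi + lr * (c - o) for wi, c, o in zip(w, clicked, other)]


def rank(w: Sequence[float], messages: Sequence[Sequence[float]]) -> List[int]:
    """Indices of messages sorted by descending score."""
    return sorted(range(len(messages)),
                  key=lambda i: score(w, messages[i]), reverse=True)


# Hypothetical features: [freshness, similarity].
other = [0.8, 0.1]     # newer but less similar message
clicked = [0.2, 0.9]   # older message the user actually clicked
w = [1.0, 0.0]         # start from a purely time-based ranker
before = rank(w, [other, clicked])
for _ in range(10):
    w = pairwise_update(w, clicked, other)
after = rank(w, [other, clicked])
```

Starting from weights that only look at freshness, repeated updates shift weight toward similarity until the clicked message is ranked first.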
Experimental Results
Performance Measures
▪  Mean reciprocal rank (MRR)
the mean over queries of 1/rank of the relevant document; its reciprocal
corresponds to the harmonic mean of the ranks of the relevant documents
▪  Success@K
the number of queries for which the clicked message is
found in the top K results
▪  NDCG@K
used when we have several relevance feedback levels
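The three measures can be sketched as below. The input conventions are assumptions: `ranks` holds the 1-based rank of the clicked message per query (`None` if it was not retrieved), and NDCG uses linear gains with a log2 discount, which is one common variant among several.

```python
import math
from typing import List, Optional, Sequence


def mrr(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over all queries; a missing message contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def success_at_k(ranks: Sequence[Optional[int]], k: int) -> int:
    """Number of queries whose clicked message is in the top k."""
    return sum(1 for r in ranks if r is not None and r <= k)


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


example_ranks: List[Optional[int]] = [1, 2, 4]
example_mrr = mrr(example_ranks)               # (1 + 1/2 + 1/4) / 3
example_success = success_at_k(example_ranks, 2)
perfect = ndcg_at_k([2, 1, 0], 3)              # already ideally ordered
reversed_order = ndcg_at_k([0, 1, 2], 3)       # worst ordering of the same gains
```

Success@K is returned as a count to follow the wording of the slide; dividing by the number of queries gives the fraction that is often reported instead.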
Time vs. REX (Corporate Dataset)
Algorithm                                | MRR (+lift %)
Time                                     | 0.3722
REX (fresh. + sim.)                      | 0.4261 (+14.48%)
REX (fresh. + sim. + actions)            | 0.4550 (+22.24%)
REX (fresh. + sim. + actions + sender)   | 0.4548 (+22.19%)
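The lift column appears to be the relative improvement of REX over the Time baseline; the slides do not state the formula, so the definition below is an assumption. The sketch recomputes two of the Corporate figures from the MRR values shown.

```python
def lift_percent(baseline: float, value: float) -> float:
    """Relative improvement over the baseline, in percent (assumed definition)."""
    return (value - baseline) / baseline * 100.0


TIME_MRR = 0.3722
lift_sim = round(lift_percent(TIME_MRR, 0.4261), 2)     # fresh. + sim.
lift_sender = round(lift_percent(TIME_MRR, 0.4548), 2)  # all feature sets
```

Recomputing from the rounded MRR values can differ from the reported lift in the last digit, presumably because the reported lifts were derived from unrounded scores.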
Time vs. REX (Web Mail Dataset)
Algorithm                                | MRR (+lift %)
Time                                     | 0.3717
REX (fresh. + sim.)                      | 0.3785 (+1.81%)
REX (fresh. + sim. + actions)            | 0.4238 (+14%)
REX (fresh. + sim. + actions + sender)   | 0.4258 (+14.55%)
Time vs REX (as a function of the Result set size)
Relative improvement of REX over Time increases as more
messages in the user inbox match the query
Relative Feature Importance
▪  In general, REX ranker significantly outperforms Chronological ranker
›  both in the Corporate and in the Web datasets
▪  Relative Feature Importance:
Freshness >> User actions >> Similarity >> Sender features
›  Freshness:
•  Years >> Months >> Weeks >> Days
›  User actions
•  Read >> Forwarded >> Flagged >> Replied >> Draft >> Ham >> Spam
›  Similarity:
•  coord >> tf-idf (From > Subject > Body > Attachment > To) >> BM25f
▪  Low significance of the sender features ??
Time vs. REX (Editorial Dataset)
Algorithm | MRR (+lift %)    | NDCG@10 (+lift %)
Time      | 0.3629           | 0.4936
REX       | 0.5105 (+40.65%) | 0.6647 (+34.66%)

Query      | Intent                           | Algo A Most Relevant | Algo A Related | Algo B Most Relevant | Algo B Related
Lila dress | Discussion about dress for party | 2                    | 3,5,7          | 4                    | 1,2,6,7
Spense KE  | Schedule for Spense KE meeting   | 5                    | 2,4,8,9        | 1                    | 3,4,5
Editors' Feedback
“... Sometimes, I had the feeling that Algo. B was
really reading my mind to put in the first place
exactly the email message I was thinking of ...”
“... Today, after I ran it again, it was not that much
impressive, but still I have the feeling it was the
type of search that gave me the best results ...”
Email Search Tomorrow (REX ordered)
Searching for an (old) application form for “Visa to India”
Conclusions
●  We challenged the traditional chronological sort for email search
o  While freshness is still super important, it should be integrated into the
relevance model with many other important features
o  REX performs significantly better than Time-based ranking
o  The model can be easily extended with more signals as they become
available
●  Are mail users ready to depart from chronological sort in favor of
modern relevance ranking?
●  Time will tell
●  REX provides our users the opportunity, at least
●  More details can be found in our CIKM 2015 paper:
o  Rank by Time or by Relevance? Revisiting Email Search
Future Work
▪  Enriching the set of ranking features
›  Solving the mystery:
•  why do Sender features not contribute to the ranking?
›  Adding query-based features based on Query Intent Analysis
▪  Personalization
›  Adding the User into the ranking model
▪  User Study
›  Better understanding user needs
•  how users search over their mailboxes
Yahoo's new Mobile Mail application
Thanks for listening
