Rank by time or by relevance - Revisiting Email Search
It is quite surprising that in spite of the huge progress of relevance ranking in Web Search, mail search results are still typically ranked by date. In this paper, we discuss the limitations of ranking search results by date. We argue that this sort-by-date paradigm needs to be revisited in order to account for the specific structure and nature of mail messages, as well as the high-recall needs of users.
Rank by Time or by Relevance? Revisiting Email Search
November 17th, 2015
David Carmel, Guy Halawi, Liane Lewin-Eytan, Yoelle Maarek, Ariel Raviv
Haifa Labs
Motivation
▪  “Email search still remains difficult, time-consuming and frustrating" (Elsweiler et al. 2011)
▪  By default, all existing Web mail services display search results in reverse chronological order
  ›  makes the discovery of older messages very hard
  ›  imposes strict constraints on message matching
Email Search Today (Time ordered)
Searching for an (old) application form for “Visa to India”
Search in Yahoo Mail
▪  Boolean search model
  •  Each query is a Boolean expression (AND, OR, NOT)
  •  Generally, all query terms must appear in at least one of the message fields (AND operation)
▪  Ranking
  •  Default: by recency (reverse chronological ordering)
  •  (Pseudo-)relevance, with an implementation based on matching query terms
    ›  almost never used by users
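The Boolean AND matching described above can be sketched in a few lines. This is an illustrative toy, not Yahoo Mail's actual code; the field names and tokenization are assumptions.

```python
# Toy sketch of AND-mode Boolean matching over message fields.
# A message matches only if every query term appears in at least one field.

def matches_and(query_terms, message_fields):
    tokens = " ".join(message_fields.values()).lower().split()
    return all(term.lower() in tokens for term in query_terms)

msg = {"subject": "visa application form", "from": "consulate", "body": "attached is the form"}
matches_and(["visa", "form"], msg)   # both terms appear -> match
matches_and(["visa", "india"], msg)  # "india" is missing -> no match
```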
Challenge
▪  Challenge the traditional, prevalent chronological ranking in Web email search
  ›  investigate whether an email-specific relevance ranking could bring any value to our users
▪  Introduce mail-specific relevance ranking consisting of two phases:
  ›  Relaxed matching phase to improve recall
  ›  Comprehensive ranking phase using a rich set of mail-specific features to improve precision
Email Queries: What do people search for?
▪  Very short queries: 1.5 terms on avg.
  ›  Re-find intent: looking for a specific previous message
  ›  Contact queries: ~40%
    –  Picture, email address, phone number, physical address, links, attachments, appointments (time/date), conversation
▪  Tasks involved:
  ›  Couponing (pizza coupon?)
  ›  Tracking items (bill paid, package shipped)
  ›  Looking up account / registration info
  ›  Social media (searching for comments/posts)
The Search Process
▪  Standard two-phase retrieval process:
  ›  First phase: retrieve a pool of messages qualified as potentially relevant to the query
    •  Two matching models:
      –  Restricted (AND mode)
      –  Relaxed: any message containing at least one of the query terms in any of its fields is considered a match
  ›  Second phase: rank these messages using a rich set of features
    •  Messages are scored by a linear model learned using a learning-to-rank approach
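The two phases can be sketched end to end: a relaxed matching pass for recall, then a linear feature-weighted scoring pass for precision. The feature names, values, and weights below are illustrative assumptions, not the trained model.

```python
# Sketch of the two-phase retrieval process: relaxed match, then linear scoring.

def relaxed_match(query_terms, message):
    tokens = " ".join(message["fields"].values()).lower().split()
    return any(t.lower() in tokens for t in query_terms)

def score(message, weights):
    # linear model: sum_i w_i * f_i(message)
    return sum(weights[name] * value for name, value in message["features"].items())

def search(query_terms, mailbox, weights):
    pool = [m for m in mailbox if relaxed_match(query_terms, m)]        # phase 1: recall
    return sorted(pool, key=lambda m: score(m, weights), reverse=True)  # phase 2: precision

mailbox = [
    {"id": 1, "fields": {"subject": "visa application form"}, "features": {"freshness": 0.2, "sim": 0.9}},
    {"id": 2, "fields": {"subject": "pizza coupon"},          "features": {"freshness": 0.9, "sim": 0.1}},
    {"id": 3, "fields": {"subject": "visa appointment"},      "features": {"freshness": 0.5, "sim": 0.6}},
]
weights = {"freshness": 0.5, "sim": 1.0}
results = search(["visa"], mailbox, weights)  # only messages 1 and 3 match the query
```

Note how a highly similar but older message (id 1) can outrank a fresher but less similar one (id 3) once freshness is just one feature among several.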
REX - Relevance EXtended Ranking Model
Based on an LTR framework using several sets of features:
▪  Message
▪  Recipient
▪  Sender
▪  Message-Query Similarity
Message Features
▪  Freshness: exponential decay over the message age
▪  User Actions: replied, forwarded, flagged, drafted, read, ...
▪  Attachment: has attachment, attachment type / size
▪  Folder: folder type (inbox, draft, sent, user-defined folder)
▪  Exchange Type: reply/forward, in-thread
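The freshness feature, exponential decay over message age, can be written directly. The decay rate used here is an assumption for illustration; the slides do not give the actual parameter.

```python
import math

# Sketch of the freshness feature: exponential decay over message age.
# The decay constant is illustrative, not the value used in REX.

def freshness(age_days, decay=0.01):
    return math.exp(-decay * age_days)

freshness(0)    # 1.0: a brand-new message gets maximal freshness
freshness(365)  # decays smoothly toward 0 for old messages
```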
Recipient Features
▪  To: recipient mentioned in To
▪  Cc: recipient mentioned in Cc
▪  In Group: recipient was not mentioned explicitly
Sender Features
▪  Vertical:
  ›  User-sender connection: correspondence volume / type
  ›  Self correspondence: sender is the user
▪  Horizontal:
  ›  Sender inbound / outbound traffic: volume and ratio
  ›  Sender URL usage: volume and ratio in messages
  ›  Sender recipients number: avg. per message
  ›  Sender recipients actions: ratio over messages
Message-Query Similarity Features
▪  BM25F: textual similarity between a query and the entire message
  •  considering query term distribution over message fields (Subject, From, To, Body, Attachment)
▪  TF-IDF: measures the (tf-idf) similarity of each message field independently of the others
▪  Coord: fraction of query terms that occur in the message
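Of these, Coord is the simplest to state precisely: the fraction of query terms found anywhere in the message. A minimal sketch, with tokenization assumed:

```python
# Sketch of the Coord feature: fraction of query terms occurring in the message.

def coord(query_terms, message_fields):
    tokens = set(" ".join(message_fields.values()).lower().split())
    hits = sum(1 for t in query_terms if t.lower() in tokens)
    return hits / len(query_terms)

coord(["visa", "india"], {"subject": "visa application", "body": "travel plans"})  # 1 of 2 terms
coord(["visa"], {"subject": "visa application"})                                   # all terms present
```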
Proximity
Taking into account proximity between query terms in content:
▪  Neighborhood: boosting consecutive matches
▪  Proximity: boosting tokens found close together in a fixed window (5) with no ordering
▪  Prefix: allowing prefix match, but with score decay by length difference
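The unordered fixed-window check behind the proximity boost can be sketched as follows. This is a rough illustration of the window test, not the production scoring formula.

```python
# Sketch of the proximity check: are all query terms found together
# within a fixed window of 5 tokens, in any order?

def within_window(query_terms, tokens, window=5):
    positions = [i for i, tok in enumerate(tokens) if tok in query_terms]
    for i in positions:
        span = set(tokens[i:i + window])
        if all(t in span for t in query_terms):
            return True
    return False

tokens = "please find the visa application form attached".split()
within_window({"visa", "form"}, tokens)  # "visa" and "form" fall in one 5-token window
```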
Learning to Rank (LTR)
▪  Data point: < query | ~100 matched messages | clicked message >
▪  Datasets:
  ›  Corporate: 100K random queries from the corporate query log
  ›  Web-mail: 10K random queries
  ›  Editorial: 500 queries judged by editors
▪  LTR algorithm: AROW (Crammer et al. 2013)
Learning to Rank (LTR)
[Diagram: pairwise LTR update. Documents d1..d4 are scored by ∑ᵢ wᵢ fᵢ(d); when the clicked document is scored below another document, the weight vector w is updated.]
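The pairwise update pictured above can be sketched with a simple perceptron-style rule: nudge the weights whenever the clicked document does not outscore a competing one. The actual algorithm used is AROW, which additionally maintains per-weight confidence; this simplification only illustrates the idea.

```python
# Simplified perceptron-style pairwise update (AROW in the actual system).
# score(d) = sum_i w_i * f_i(d); update w when the clicked doc is not ranked above.

def score(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def pairwise_update(w, f_clicked, f_other, lr=0.1):
    if score(w, f_clicked) <= score(w, f_other):
        w = [wi + lr * (fc - fo) for wi, fc, fo in zip(w, f_clicked, f_other)]
    return w

w = pairwise_update([0.0, 0.0], f_clicked=[1.0, 0.2], f_other=[0.1, 0.9])
# after one update, the clicked document outscores the other one
```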
Experimental Results
Performance Measures
▪  Mean reciprocal rank (MRR): the mean of the reciprocal ranks of the relevant documents
▪  Success@K: the fraction of queries for which the clicked message is found in the top-k results
▪  NDCG@K: used when we have several relevance feedback levels
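MRR and Success@K are straightforward to compute from the 1-based rank of the clicked message in each query's result list:

```python
# Evaluation measures, computed over the rank (1-based) of the clicked message.

def mrr(ranks):
    """Mean reciprocal rank: the mean of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def success_at_k(ranks, k):
    """Fraction of queries whose clicked message appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

ranks = [1, 3, 2, 10]
mrr(ranks)              # (1 + 1/3 + 1/2 + 1/10) / 4
success_at_k(ranks, 3)  # 3 of the 4 queries succeed -> 0.75
```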
Time Vs. REX (Corporate Dataset)

Algorithm                                 MRR (+lift %)
Time                                      0.3722
REX (fresh. + sim.)                       0.4261 (+14.48%)
REX (fresh. + sim. + actions)             0.4550 (+22.24%)
REX (fresh. + sim. + actions + sender)    0.4548 (+22.19%)
Time Vs. REX (Web Mail Dataset)

Algorithm                                 MRR (+lift %)
Time                                      0.3717
REX (fresh. + sim.)                       0.3785 (+1.81%)
REX (fresh. + sim. + actions)             0.4238 (+14.00%)
REX (fresh. + sim. + actions + sender)    0.4258 (+14.55%)
Time vs. REX (as a function of the result set size)
The relative improvement of REX over Time increases as more messages in the user's inbox match the query.
Relative Feature Importance
▪  In general, the REX ranker significantly outperforms the chronological ranker
  ›  both in the Corporate and in the Web datasets
▪  Relative feature importance: Freshness >> User actions >> Similarity >> Sender features
  ›  Freshness: Years >> Months >> Weeks >> Days
  ›  User actions: Read >> Forwarded >> Flagged >> Replied >> Draft >> Ham >> Spam
  ›  Similarity: coord >> tf-idf (From > Subject > Body > Attachment > To) >> BM25F
▪  The low significance of the sender features remains an open question
Time Vs. REX (Editorial Dataset)

Algorithm    MRR (+lift %)       NDCG@10 (+lift %)
Time         0.3629              0.4936
REX          0.5105 (+40.65%)    0.6647 (+34.66%)

Query        Intent                             Algo A Most Relevant   Algo A Related   Algo B Most Relevant   Algo B Related
Lila dress   Discussion about dress for party   2                      3,5,7            4                      1,2,6,7
Spense KE    Schedule for Spense KE meeting     5                      2,4,8,9          1                      3,4,5
Editors Feedback
“... Sometimes, I had the feeling that Algo. B was really reading my mind to put in the first place exactly the email message I was thinking of ...”
“... Today, after I ran it again, it was not that much impressive, but still I have the feeling it was the type of search that gave me the best results ...”
Email Search Tomorrow (REX ordered)
Searching for an (old) application form for “Visa to India”
Conclusions
●  We challenged the traditional chronological sort for email search
  o  While freshness is still super important, it should be integrated into the relevance model with many other important features
  o  REX performs significantly better than time-based ranking
  o  The model can easily be expanded with more signals as they become available
●  Are mail users ready to depart from chronological sort in favor of modern relevance ranking?
  o  Time will tell; REX at least provides our users the opportunity
●  More details can be found in our CIKM 2015 paper:
  o  Rank by Time or by Relevance? Revisiting Email Search
Future Work
▪  Enriching the set of ranking features
  ›  Solving the mystery: why don't the sender features contribute to the ranking?
  ›  Adding query-based features, based on query intent analysis
▪  Personalization
  ›  Adding the user into the ranking model
▪  User study
  ›  Better understanding user needs
    •  how users search over their mailboxes
Yahoo's new Mobile Mail application
Thanks for listening