Hadoop in Love


eHarmony was founded to give people a better chance of finding someone for a long-lasting relationship. We were one of the first companies to apply advanced technology, what is now known as data science, to the age-old problem of matchmaking. Over the years eHarmony has accumulated a vast amount of data on a variety of romantic interactions. This data is a treasure trove of entertaining tidbits and nuggets of insight into human nature. I will share some of those in the hope that people may find them useful, but more importantly I will also demonstrate how we actually use this data to make recommendations and give single people an upper hand in finding “The One”. In particular, I will show how we use Hadoop (YARN) to process billions of pairs of user profiles to find n-grams and other features that are predictive of romantic attraction, and how we use the discovered features for large-scale machine learning with Vowpal Wabbit's AllReduce parallel learning. Finally, I will describe an optimization technique that decides which matches to deliver to whom and when, and which is more broadly applicable to other domains such as advertising or constrained recommendations.
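
The talk does not spell out the optimization technique, so the following is only a minimal sketch of the general problem shape: deciding which scored pairs to deliver when each user can receive a limited number of matches. It uses a simple greedy heuristic, not eHarmony's actual method. The function name `distribute_matches`, the user names, the probabilities, and the per-user limits are all hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str, float]  # (user, candidate, predicted probability)


def distribute_matches(scored_pairs: List[Pair], capacity: Dict[str, int]) -> List[Pair]:
    """Greedily deliver the highest-scoring pairs while respecting per-user limits.

    This is an illustrative heuristic only; an exact solution would treat the
    problem as a constrained graph optimization.
    """
    remaining = dict(capacity)  # copy so the caller's limits are not mutated
    delivered: List[Pair] = []
    # Consider the most promising pairs first.
    for user, cand, prob in sorted(scored_pairs, key=lambda p: p[2], reverse=True):
        if remaining.get(user, 0) > 0 and remaining.get(cand, 0) > 0:
            delivered.append((user, cand, prob))
            remaining[user] -= 1
            remaining[cand] -= 1
    return delivered


# Hypothetical scores and daily limits, for illustration only.
pairs = [
    ("alice", "bob", 0.9),
    ("alice", "carl", 0.8),
    ("dana", "bob", 0.7),
    ("dana", "carl", 0.4),
]
limits = {"alice": 2, "dana": 2, "bob": 1, "carl": 1}
result = distribute_matches(pairs, limits)
print(result)  # [('alice', 'bob', 0.9), ('alice', 'carl', 0.8)]
```

In this toy input the greedy pass gives both available candidates to one user and leaves the other with none. That is the kind of outcome a network-wide optimization would presumably be designed to avoid, and it is why the problem is usually framed over the whole graph rather than pair by pair.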

Published in: Technology, Business

Transcript

  • 1. Wednesday, June 26, 13
  • 2. Hadoop in Love @petricek
  • 3. The eHarmony Difference › Who we are: ~45% Tech
  • 4. The eHarmony Difference › Who we are: ~15% Customer Care, ~45% Tech
  • 5. The eHarmony Difference › Who we are: ~15% Customer Care, ~45% Tech, ~10% Marketing
  • 6. The eHarmony Difference › Compatibility Matching System®
  • 8. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching
  • 9. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching
  • 10. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution
  • 11. The eHarmony Difference
  • 12. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution
  • 15. 150 questions
  • 16. 150 questions: Personality, Values, Attributes, Beliefs
  • 17. Compatibility Matching › Obstreperousness
  • 18. Compatibility Matching › Romantic
  • 19. Marital satisfaction
  • 21. Compatibility Matching
  • 23. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution
  • 24. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution. Layers on Top of Compatibility Matching
  • 25. Affinity Matching
  • 26. Affinity Matching › 61 21
  • 27. Affinity Matching › 61 21 3000
  • 29. Affinity Matching
  • 30. Affinity Matching › ………
  • 31. Affinity Matching › Distance: Prob( )
  • 32. Affinity Matching › Distance
  • 33. Affinity Matching › Height difference: Prob( ), 4 - 8 in cm
  • 34. Affinity Matching › Zoom level
  • 37. Affinity Matching › Semi-structured Text: life, my smile, world, my, me, I
  • 38. Affinity Matching › Semi-structured Text: life, my smile, world
  • 40. Affinity Matching › ~40M registered users, ~10^7 matches per day, ~10^3 attributes ... Prob( | data)? ~10^8 daily Prob( | features)
  • 41. Affinity Matching › Model Training: Maestro. User, Match, Communication; Protocol Buffers; feature expansion; sparse ML format; models; vowpal wabbit, boosted trees
  • 42. Affinity Matching › Production: Conductor. 750M compressed Protocol Buffers; map-side joins (~TB); Matching User Service; Pairings Browser Service; 1+G compressed Protocol Buffers
  • 43. Affinity Matching › Scorer: ... [User[Demographic][Photo][Activity][FX]] [Cand[Demographic][Photo][Activity][FX]] [Pairing[Distance][Flags]] (record layout repeated)
  • 44. Affinity Matching › Scorer: ... Prob( | data) for each [User[Demographic][Photo][Activity][FX]] [Cand[Demographic][Photo][Activity][FX]] [Pairing[Distance][Flags]] record
  • 46. Affinity Matching › Scala DSL: "same_religion": "${user.profile.religion}=={cand.profile.religion}"; "cmp_drinking": "cmp(${user.profile.drinking},{cand.profile.drinking})"; "strict_distance_u": "${user.profile.accepted_distance}<={pairing.distance}" (slide annotations: <, 60 miles)
  • 47. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution
  • 48. The eHarmony Difference › Compatibility Matching System®: (1) Compatibility Matching, (2) Affinity Matching, (3) Match Distribution. Delivering the right matches at the right time to as many people as possible across the entire network.
  • 49. Match Distribution › Graph optimization
  • 51. Match Distribution › Graph optimization: 2 2
  • 52. Match Distribution › Graph optimization: 2 2 1
  • 53. Match Distribution › Graph optimization: 2 2 1 Prob( | data)
  • 55. Match Distribution › Graph optimization: 2 2 Prob( | data)
  • 57. Resulting Customer Experience › Guided Communication
  • 59. Resulting Customer Experience › Guided Communication: ? !
  • 60. Resulting Customer Experience › Success!
  • 62. eHarmony Results › The eHarmony Impact: 2005. 90 eHarmony Members Married Every Day
  • 63. eHarmony Results › The eHarmony Impact: 2005, 2007. 236 eHarmony Members Married Every Day
  • 64. eHarmony Results › The eHarmony Impact: 2005, 2007, 2009. 542 eHarmony Members Married Every Day
  • 65. Proceedings of National Academy of Sciences
  • 66. Press coverage
  • 67. eHarmony Results › The eHarmony Impact: Since 2005, about 1/3 of couples who have married in the US have met online (35%). * according to survey of couples married between 2005-2012 by Harris Interactive for eHarmony
  • 68. Rates of breakup or divorce (chart; axis 0% to 8.0%; categories: All Online, Offline). * according to survey of couples married between 2005-2012 by Harris Interactive for eHarmony
  • 69. eHarmony Results › The eHarmony Impact: The largest number of marriages surveyed who met via online dating had met on eHarmony (25%). * according to survey of couples married between 2005-2012 by Harris Interactive for eHarmony
  • 70. Rates of breakup or divorce (chart; axis 0% to 8.0%; categories: eHarmony, All Other Online, Offline). * according to survey of couples married between 2005-2012 by Harris Interactive for eHarmony. bit.ly/jobateharmony
  • 71. Rates of breakup or divorce (chart; axis 0% to 8.0%; categories: eHarmony, All Other Online, Offline). * according to survey of couples married between 2005-2012 by Harris Interactive for eHarmony. linkedin.com/in/petricek bit.ly/jobateharmony @petricek
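
The three feature expressions on slide 46 can be mirrored in plain Python to show what feature expansion over a (user, candidate, pairing) record might produce. This is a sketch under assumptions, not the Scala DSL itself: the dictionary field names, the sample values, the reading of `cmp` as a three-way comparison, and the helpers `expand` and `to_vw_line` are all hypothetical. The output line follows Vowpal Wabbit's plain-text input format (`label |namespace name:value ...`); the namespace name `pair` is made up for the example.

```python
def cmp(a, b):
    """Three-way comparison: -1 if a < b, 0 if equal, 1 if a > b (assumed DSL semantics)."""
    return (a > b) - (a < b)


# Python equivalents of the three expressions shown on slide 46.
FEATURES = {
    "same_religion": lambda user, cand, pairing: user["religion"] == cand["religion"],
    "cmp_drinking": lambda user, cand, pairing: cmp(user["drinking"], cand["drinking"]),
    "strict_distance_u": lambda user, cand, pairing: user["accepted_distance"] <= pairing["distance"],
}


def expand(user, cand, pairing):
    """Evaluate every feature expression for one pairing and return numeric values."""
    return {name: int(fn(user, cand, pairing)) for name, fn in FEATURES.items()}


def to_vw_line(label, features, namespace="pair"):
    """Format features as one Vowpal Wabbit example, omitting zeros to keep it sparse."""
    body = " ".join(f"{name}:{value}" for name, value in features.items() if value != 0)
    return f"{label} |{namespace} {body}"


# Hypothetical record; distances are in miles, drinking is an ordinal code.
user = {"religion": "none", "drinking": 1, "accepted_distance": 60}
cand = {"religion": "none", "drinking": 3}
pairing = {"distance": 75}

features = expand(user, cand, pairing)
print(features)                 # {'same_religion': 1, 'cmp_drinking': -1, 'strict_distance_u': 1}
print(to_vw_line(1, features))  # 1 |pair same_religion:1 cmp_drinking:-1 strict_distance_u:1
```

Keeping the expressions in a name-to-function table loosely mirrors the declarative style of the slide: adding a feature means adding one entry, and the same table can serve both training-time expansion and production scoring.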
