Data Science of Love

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/12jQfPk.

Vaclav Petricek digs up some of the nuggets about romantic interactions hidden in eHarmony's large collection of data on human relationships. Filmed at qconnewyork.com.

Vaclav Petricek is a Principal Data Scientist at Santa Monica-based eHarmony, where he is responsible for optimization and machine learning in eHarmony's core matchmaking algorithms. He also runs a series of invited ML talks at eHarmony. He was a Visiting Researcher at University College London, where his research spanned recommender systems, social networks, web structure and online auctions.

Usage Rights

© All Rights Reserved

Data Science of Love: Presentation Transcript

  • Wednesday, June 12, 13
  • InfoQ.com: News & Community Site. 750,000 unique visitors/month. Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese). Post content from our QCon conferences. News 15-20 / week; Articles 3-4 / week; Presentations (videos) 12-15 / week; Interviews 2-3 / week; Books 1 / month. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/eharmony-hadoop
  • Presented at QCon New York, www.qconnewyork.com. Purpose of QCon: to empower software development by facilitating the spread of knowledge and innovation. Strategy: practitioner-driven conference designed for YOU, influencers of change and innovation in your teams; speakers and topics driving the evolution and innovation; connecting and catalyzing the influencers and innovators. Highlights: attended by more than 12,000 delegates since 2007; held in 9 cities worldwide.
  • Data Science of Love. Vaclav Petricek, @petricek
  • The eHarmony Difference › Who we are: ~45% Tech, ~15% Customer Care, ~10% Marketing
  • The eHarmony Difference › Compatibility Matching System®: 1. Compatibility Matching, 2. Affinity Matching, 3. Match Distribution
  • 150 questions: Personality, Values, Attributes, Beliefs
  • Compatibility Matching › Obstreperousness
  • Compatibility Matching › Romantic
  • CMP (CMP Makes Pairings): Compatibility Models
  • The eHarmony Difference › Compatibility Matching System®: 1. Compatibility Matching, 2. Affinity Matching, 3. Match Distribution. Layers on Top of Compatibility Matching
  • Affinity Matching › 61, 21, 3000
  • Affinity Matching › Distance: Prob( )
  • Affinity Matching › Height difference: Prob( ), 4 - 8 in cm
  • Affinity Matching › "Attractiveness": Prob( )
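
The Prob( ) curves on these slides can be read as an empirical rate per bucket of a feature such as distance or height difference. The following is a minimal sketch under that reading. It assumes the event inside Prob( ) is a binary outcome, for example two matched users communicating (the transcript does not preserve what was inside the parentheses). The function name, bucket width and sample observations are hypothetical.

```python
from collections import defaultdict


def binned_rate(samples, bin_width):
    """Return {bin_start: fraction of positive outcomes} for (value, outcome) pairs."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for value, outcome in samples:
        # Assign the value to the bucket [bin_start, bin_start + bin_width).
        bin_start = int(value // bin_width) * bin_width
        totals[bin_start] += 1
        if outcome:
            positives[bin_start] += 1
    return {b: positives[b] / totals[b] for b in sorted(totals)}


# Hypothetical (distance in miles, outcome observed?) pairs, not real data.
observations = [(3, True), (8, True), (9, False), (25, False), (27, True), (55, False)]
rates = binned_rate(observations, bin_width=10)
print(rates)
```

With real data, each bucket would need enough observations for the rate to be meaningful; the toy sample here only shows the mechanics.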
  • Affinity Matching › Zoom level
  • Affinity Matching › Food preference
      25%   -1%   -1%  -24%   20%   13%
       9%   -5%   -5%  -27%    7%    0%
       9%   -5%   -5%  -27%    7%    0%
     -12%  -21%  -21%  -42%  -19%  -23%
      19%    0%    0%  -28%   28%   10%
       9%  -11%  -11%  -35%   11%   44%
  • Affinity Matching › ~40M registered users; ~10^7 matches per day; ~10^3 attributes; Prob( | data)?; ~10^8 daily Prob( | features). Unsupervised features (LDA, classifiers); constructed features
  • 1TB RAM
  • Maestro: Data. Protocol Buffers, distcp
  • Modeling: Maestro. UserMatchCommunication, feature expansion, sparse ML format, models
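
The feature expansion step can be pictured as turning one user/candidate record into a sparse text line. The sketch below assumes the Vowpal Wabbit text input format. vowpal_wabbit appears among the open source links in this deck, but the transcript does not say it is the "sparse ML format" meant on this slide. The record fields, namespaces and helper name are hypothetical.

```python
def to_vw_line(label, namespaces):
    """Format one example as a Vowpal Wabbit text line: 'label |ns f1:v1 f2 ...'."""
    parts = [str(label)]
    for namespace, features in namespaces.items():
        tokens = []
        for name, value in features.items():
            if isinstance(value, bool):
                # Boolean features are written only when true (sparse encoding).
                if value:
                    tokens.append(name)
            elif isinstance(value, (int, float)):
                # Numeric features are written as name:value, zeros are omitted.
                if value != 0:
                    tokens.append(f"{name}:{value}")
            else:
                # Categorical values become a single indicator feature.
                tokens.append(f"{name}={value}")
        parts.append(f"|{namespace} " + " ".join(tokens))
    return " ".join(parts)


# Hypothetical record: label 1 means the outcome of interest occurred.
line = to_vw_line(
    1,
    {
        "user": {"age": 34, "smoker": False},
        "pair": {"distance": 12.5, "same_religion": True, "drinking": "social"},
    },
)
print(line)
```

Omitting false and zero-valued features keeps each line short, which matters when there are on the order of 10^3 attributes per user.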
  • Modeling: Model parametrizations. Model parameters: features, weights, tree splits, Calibration Spline. DISTANCE:534. DSL
  • Modeling: Scala DSL
    "same_religion": "${user.profile.religion}=={cand.profile.religion}"
    "cmp_drinking": "cmp(${user.profile.drinking},{cand.profile.drinking})"
    "strict_distance_u": "${user.profile.accepted_distance}<={pairing.distance}" (60 miles)
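
To make the three feature definitions on this slide concrete, here is a small Python sketch that evaluates expressions of that shape against nested dictionaries. It is not the eHarmony DSL, whose semantics the transcript does not describe. The assumptions are that placeholders are dotted paths into user, cand and pairing objects, that cmp() is a three-way comparison, and that == and <= have their usual meaning. All values in the context are hypothetical.

```python
import operator
import re

# Accepts both "${user.profile.x}" and "{cand.profile.x}", since both forms appear on the slide.
PLACEHOLDER = re.compile(r"\$?\{([A-Za-z_.]+)\}")


def resolve(token, context):
    """Look up a dotted placeholder path in nested dictionaries."""
    match = PLACEHOLDER.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a placeholder: {token!r}")
    value = context
    for key in match.group(1).split("."):
        value = value[key]
    return value


def three_way(a, b):
    """Assumed meaning of cmp(): -1 if a < b, 0 if equal, 1 if a > b."""
    return (a > b) - (a < b)


def evaluate(expression, context):
    """Evaluate one expression of the form cmp(x,y), x==y or x<=y."""
    call = re.fullmatch(r"cmp\((.+),(.+)\)", expression)
    if call:
        return three_way(resolve(call.group(1), context), resolve(call.group(2), context))
    for symbol, op in (("==", operator.eq), ("<=", operator.le)):
        if symbol in expression:
            left, right = expression.split(symbol, 1)
            return op(resolve(left, context), resolve(right, context))
    raise ValueError(f"unsupported expression: {expression!r}")


features = {
    "same_religion": "${user.profile.religion}=={cand.profile.religion}",
    "cmp_drinking": "cmp(${user.profile.drinking},{cand.profile.drinking})",
    "strict_distance_u": "${user.profile.accepted_distance}<={pairing.distance}",
}
# Hypothetical data for one user/candidate pairing.
context = {
    "user": {"profile": {"religion": "none", "drinking": 1, "accepted_distance": 60}},
    "cand": {"profile": {"religion": "none", "drinking": 3}},
    "pairing": {"distance": 80},
}
values = {name: evaluate(expr, context) for name, expr in features.items()}
print(values)
```

Keeping feature definitions as data rather than code is what lets the same definitions be shared between model training and production scoring, which is one plausible reason for using a DSL here.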
  • Production: Spring Conductor. Map-side joins (TB); 750M compressed Protocol Buffers; 1+G compressed Protocol Buffers; Matching User Service; Pairings Browser Service; Scorer
  • Production: FeatureX (expensive features). FeatureX: LSH, NLP; Voldemort-backed service
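
LSH (locality-sensitive hashing) is named on this slide without detail. One common variant is MinHash with banding, which groups items whose token sets are likely to be similar, so that expensive pairwise features only need to be computed within a bucket. The following is a minimal sketch under that assumption. The profile tokens, number of hashes and band size are hypothetical, and nothing here is taken from the FeatureX service itself.

```python
import hashlib
from collections import defaultdict


def minhash_signature(tokens, num_hashes=16):
    """For each seeded hash function, keep the minimum hash value over the token set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in tokens
        ))
    return signature


def lsh_buckets(items, num_hashes=16, rows_per_band=4):
    """Return groups of item ids whose signatures agree on at least one whole band."""
    buckets = defaultdict(set)
    for item_id, tokens in items.items():
        sig = minhash_signature(tokens, num_hashes)
        for start in range(0, num_hashes, rows_per_band):
            band = tuple(sig[start:start + rows_per_band])
            buckets[(start, band)].add(item_id)
    return [ids for ids in buckets.values() if len(ids) > 1]


# Hypothetical interest tokens per profile.
profiles = {
    "a": {"hiking", "jazz", "cooking", "dogs"},
    "b": {"hiking", "jazz", "cooking", "dogs"},
    "c": {"opera", "chess"},
}
candidates = lsh_buckets(profiles)
print(candidates)
```

More rows per band makes buckets stricter (fewer false candidates), while more bands makes it more likely that moderately similar items share at least one bucket.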
  • Production: User Activity Service. 10K events/s; Event Listener; Matching User Service; ~5 ms response
  • eHarmony & OpenSource: github.com/petricek/datatools, github.com/eHarmony/seeking, github.com/eHarmony/hive, springsource.org/spring-data/hadoop, github.com/JohnLangford/vowpal_wabbit
  • The eHarmony Difference › Compatibility Matching System®: 1. Compatibility Matching, 2. Affinity Matching, 3. Match Distribution. Delivering the right matches at the right time to as many people as possible across the entire network.
  • Match Distribution › Graph optimization: 2, 2, 1; Prob( | data)
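
The graph optimization slides show small numbers (2, 2, 1) next to Prob( | data). One way to read this is as a capacity-constrained matching problem: each user can receive a limited number of matches, and edges are chosen to make the total predicted probability as large as possible. The sketch below uses a simple greedy heuristic under that reading. The interpretation of the numbers as capacities, the user names and the scores are all assumptions, and greedy selection is not guaranteed to be optimal nor claimed to be the method used in production.

```python
def greedy_match(scores, capacity):
    """Pick edges in descending score order while both endpoints have capacity left."""
    remaining = dict(capacity)
    chosen = []
    for (u, v), _score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if remaining[u] > 0 and remaining[v] > 0:
            chosen.append((u, v))
            remaining[u] -= 1
            remaining[v] -= 1
    return chosen


# Hypothetical per-user match capacities and predicted probabilities per pairing.
capacity = {"ann": 2, "bea": 2, "cat": 1, "dan": 2, "eli": 1}
scores = {
    ("ann", "dan"): 0.30,
    ("ann", "eli"): 0.25,
    ("bea", "dan"): 0.20,
    ("bea", "eli"): 0.10,
    ("cat", "dan"): 0.15,
}
matches = greedy_match(scores, capacity)
print(matches)
```

In this toy run "cat" ends up with no match because "dan" is already at capacity, which illustrates why a global optimization over the whole graph can serve more people than picking the best edges one at a time.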
  • Resulting Customer Experience › Guided Communication
  • Resulting Customer Experience › Success!
  • eHarmony Results › The eHarmony Impact: eHarmony Members Married Every Day: 90 (2005), 236 (2007), 542 (2009)
  • Proceedings of the National Academy of Sciences
  • Press coverage
  • eHarmony Results › The eHarmony Impact: Since 2005, about 1/3 of couples who have married in the US have met online (35%). * According to a survey of couples married between 2005-2012 by Harris Interactive for eHarmony
  • Rates of breakup or divorce (chart, axis from 0% to 8.0%): All Online, Offline
  • eHarmony Results › The eHarmony Impact: The largest number of marriages surveyed who met via online dating had met on eHarmony (25%)
  • Rates of breakup or divorce (chart, axis from 0% to 8.0%): eHarmony, All Other Online, Offline
  • @petricek, linkedin.com/in/petricek, bit.ly/jobateharmony