Dating with Models
A How-to Guide for Programmers and Architects
Ryan Barker
The eHarmony Difference › How are we different?
• 30+ years as a clinical psychologist and marriage counselor
• Many failing marriages due to fundamental incompatibility
Can we do better?
The fundamental idea ›
320 Questions
› Personality
› Values
› Attitudes
› Beliefs
Compatibility Matching
Compatibility Matching › Obstreperousness
Compatibility Matching › Romantic
Compatibility Matching › 29 Dimensions®
So let's build it! › Models as a stored procedure (~2001)
Problems › Stored procedures are awesome
• Problem #1 – Thousands of users, very few matches. Entire company is at stake
• Resolution – Line-by-line debugging of stored procedure finds an AND that should be an OR
• Problem #2 – Database load increasing
• Resolution – Optimize stored procedure? More hardware? Rewrite?
• Problem #3 – Order by compatibility does not work
• Resolution – Change stored procedure? Find a way to introduce models
The eHarmony Difference › Compatibility Matching System®
1. Compatibility Matching
2. Affinity Matching
3. Match Distribution
Layers on Top of Compatibility Matching
Affinity Matching ›
[Figure: slide graphic showing the values 61, 21, and 3000]
Affinity Matching › Distance
Prob( )
Affinity Matching › Height difference
Prob( ) 4 - 8 in
cm
Affinity Matching › “Attractiveness”
Prob( )
Redesign › Event based matching with Java/Groovy models
Problems › Better but still suboptimal
• Problem #1 – Suboptimal distribution of matches
• Resolution – Shuffle loop order each day? Introduce an optimizer!
• Problem #2 – Nightly match run taking 27 hours, heavy database load
• Resolution – Move to an offline process
• Problem #3 – Java models require testing and new releases. Groovy models are too slow
• Resolution – Change to configuration based models
The eHarmony Difference › Compatibility Matching System®
1. Compatibility Matching
2. Affinity Matching
3. Match Distribution
Delivering the right matches at the right time to as many people as possible across the entire network.
Match Distribution › Graph optimization
Prob( | data)
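The slides do not name the optimizer, so the following is only a minimal sketch of why a graph-level optimizer can beat a per-user loop. It assumes each candidate pair carries an estimated probability (the Prob( | data) on the slide) and that each user receives at most one match. The function names, the one-match limit, and the sample probabilities are all hypothetical, and the exhaustive search is only practical for tiny graphs.

```python
def distribute_greedy(pair_probs, capacity=1):
    """Loop over pairs from best to worst, taking each pair that still fits."""
    used, chosen = {}, []
    for (a, b), _ in sorted(pair_probs.items(), key=lambda kv: (-kv[1], kv[0])):
        if used.get(a, 0) < capacity and used.get(b, 0) < capacity:
            chosen.append((a, b))
            used[a] = used.get(a, 0) + 1
            used[b] = used.get(b, 0) + 1
    return chosen


def best_matching(pair_probs):
    """Exhaustive search for the one-match-per-user set with the highest total."""
    pairs = sorted(pair_probs.items())

    def solve(i, used):
        if i == len(pairs):
            return 0.0, []
        (a, b), p = pairs[i]
        best_score, best_set = solve(i + 1, used)        # option 1: skip this pair
        if a not in used and b not in used:              # option 2: take it
            score, chosen = solve(i + 1, used | {a, b})
            if score + p > best_score:
                best_score, best_set = score + p, [(a, b)] + chosen
        return best_score, best_set

    return solve(0, frozenset())


# Hypothetical probabilities for four candidate pairs.
pairs = {
    ("ann", "bob"): 0.9,
    ("ann", "cal"): 0.8,
    ("dee", "bob"): 0.7,
    ("dee", "cal"): 0.2,
}
greedy = distribute_greedy(pairs)
total, optimal = best_matching(pairs)
```

On this sample the greedy loop picks ann-bob first and is left with dee-cal, while the exhaustive search pairs ann-cal and dee-bob for a higher combined probability. That gap is the kind of "suboptimal distribution" an optimizer is meant to close.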
Match Distribution › Does it work?
Problems › The design is never finished
• Problem #1 – More data required
• Resolution – Build services to collect data in real time
• Problem #2 – Bandwidth limitations
• Resolution – Switch to protocol buffers
• Problem #3 – Can’t reprocess people fast enough due to database load
• Resolution – Switch to key-value store backed services
Rearchitecture › Services for everything
Rearchitecture › Service features
• RESTful data oriented design
• Single element
• GET – Return single element
• POST – Update single element
• PUT – Create single element
• DELETE – Delete single element
• Multiple element
• GET – Return list of elements
• Produces/Consumes JSON or Protobuf
• JAX-RS providers transparently convert between formats
• Accept/Content-Type: X-application-protobuf
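The format negotiation described above can be sketched outside JAX-RS as a lookup from the requested media type to an encoder. This is an illustration in Python, not the deck's implementation: `make_negotiator` and the encoder table are hypothetical, the Accept header is matched as a single exact value (real headers can list several types with weights), and the protobuf encoder is a stand-in so the example runs without the protobuf library.

```python
import json

PROTOBUF = "X-application-protobuf"  # media type exactly as written on the slide
JSON = "application/json"


def make_negotiator(encoders, default=JSON):
    """Return a function that picks an encoder from the Accept header value."""

    def respond(accept_header, payload):
        # Fall back to the default format when the requested one is unknown.
        media_type = accept_header if accept_header in encoders else default
        return media_type, encoders[media_type](payload)

    return respond


encoders = {
    JSON: lambda d: json.dumps(d, sort_keys=True).encode("utf-8"),
    # Stand-in only: a real service would serialize a generated protobuf message here.
    PROTOBUF: lambda d: repr(sorted(d.items())).encode("utf-8"),
}

respond = make_negotiator(encoders)
proto_type, proto_body = respond(PROTOBUF, {"id": 1})
json_type, json_body = respond("text/html", {"id": 1})
```

The resource code only ever produces the payload; the choice of wire format stays in one place, which is the property the slide attributes to the JAX-RS providers.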
Rearchitecture › Service Client features
• Generic client customized for each service
• Single element
• GET – Return single element
• POST – Update single element
• PUT – Create single element
• DELETE – Delete single element
• Multiple element
• GET – Return list of elements
• BATCH – Scatter gather implementation
• Protocol buffer based by default, falls back to JSON for older services
• Configurable retries for GET/PUT/DELETE
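A rough sketch of the two client features that carry the most logic, BATCH scatter-gather and configurable retries, is shown here. The names `call_with_retries` and `batch_get`, the chunk size, and the use of `ConnectionError` as the transient failure are assumptions for illustration; the deck does not show the client code. Retries are limited to GET/PUT/DELETE as on the slide, presumably because repeating those calls is safe.

```python
from concurrent.futures import ThreadPoolExecutor

RETRYABLE = {"GET", "PUT", "DELETE"}  # the methods the slide lists for retries


def call_with_retries(method, func, retries=2):
    """Call func, retrying transient failures only for retryable methods."""
    attempts = 1 + (retries if method in RETRYABLE else 0)
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as err:
            last_error = err
    raise last_error


def batch_get(fetch_one, ids, chunk_size=2, workers=4):
    """Scatter ids across workers in chunks, then gather one merged dict."""
    chunks = [ids[i:i + chunk_size] for i in range(0, len(ids), chunk_size)]

    def fetch_chunk(chunk):
        return {i: call_with_retries("GET", lambda i=i: fetch_one(i)) for i in chunk}

    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(fetch_chunk, chunks):
            merged.update(part)
    return merged


# Hypothetical backend that fails the first time each id is requested.
seen = set()


def flaky_fetch(user_id):
    if user_id not in seen:
        seen.add(user_id)
        raise ConnectionError("transient")
    return {"id": user_id}


users = batch_get(flaky_fetch, [1, 2, 3, 4, 5])
```

Every id fails once and succeeds on the retry, so the caller sees a complete result without handling the failures itself.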
Current Day › Matching User Service
Matching User Service is a data aggregation service that gathers data from various sources and stores it in a key-value store
•REST + Protocol buffer based
• /user-service/<version>/users/<user-id>
• Supports full and partial updates
• Supports single and batch gets
• 1000+ data attributes, ~4KB each uncompressed
•Key: Userid
•Value: UserProto
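The difference between full and partial updates, and between single and batch gets, can be sketched with an in-memory dictionary standing in for the key-value store. `UserStore` and its method names are hypothetical, and plain dicts replace the UserProto message so the example needs no protobuf library.

```python
class UserStore:
    """In-memory stand-in for a key-value store keyed by user id."""

    def __init__(self):
        self._data = {}

    def put(self, user_id, attributes):
        """Full update: replace the whole stored record."""
        self._data[user_id] = dict(attributes)

    def patch(self, user_id, attributes):
        """Partial update: overwrite only the supplied attributes."""
        self._data.setdefault(user_id, {}).update(attributes)

    def get(self, user_id):
        """Single get: return a copy of the record, or None if absent."""
        record = self._data.get(user_id)
        return None if record is None else dict(record)

    def batch_get(self, user_ids):
        """Batch get: return the records that exist, keyed by user id."""
        return {uid: dict(self._data[uid]) for uid in user_ids if uid in self._data}


store = UserStore()
store.put(1, {"age": 34, "city": "Los Angeles"})
store.patch(1, {"age": 35})          # city is kept
store.put(2, {"age": 29})
found = store.batch_get([1, 2, 3])   # user 3 does not exist
```

With 1000+ attributes per user, a partial update lets one data source refresh its own fields without resending or clobbering the rest of the record.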
Current Day › Pairing Service
Pairing Service is a data service that supports a specialized set of operations
•REST + Protocol buffer based
• GET/PUT/DELETE /pairings-service/<version>/pairings/<type>/users/<user-id>
• DELETE /pairings-service/<version>/pairings/<type>/users/<user-id>/candidates/<candidate-id>
• 4 data attributes per pairing
• 0 to tens of thousands of pairings per user
•Stores: 1 per type
•Key: Userid
•Value: PairingsProto
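As a sketch of the layout described above (one store per pairing type, keyed by user id, each value holding all of that user's pairings), the operations map onto nested dictionaries. `PairingStores`, its method names, the type name "matches", and the `score` attribute are hypothetical; the slide does not name the four attributes of a pairing.

```python
from collections import defaultdict


class PairingStores:
    """One in-memory store per pairing type: type -> user id -> pairings."""

    def __init__(self):
        self._stores = defaultdict(dict)

    def put(self, pairing_type, user_id, pairings):
        """PUT .../pairings/<type>/users/<user-id>: replace a user's pairings."""
        self._stores[pairing_type][user_id] = dict(pairings)

    def get(self, pairing_type, user_id):
        """GET .../pairings/<type>/users/<user-id>: copy of a user's pairings."""
        return dict(self._stores[pairing_type].get(user_id, {}))

    def delete(self, pairing_type, user_id):
        """DELETE .../users/<user-id>: drop all pairings for the user."""
        self._stores[pairing_type].pop(user_id, None)

    def delete_candidate(self, pairing_type, user_id, candidate_id):
        """DELETE .../users/<user-id>/candidates/<candidate-id>: drop one pairing."""
        self._stores[pairing_type].get(user_id, {}).pop(candidate_id, None)


stores = PairingStores()
stores.put("matches", 1, {7: {"score": 0.8}, 9: {"score": 0.6}})
stores.delete_candidate("matches", 1, 7)
remaining = stores.get("matches", 1)
```

Keeping every pairing for a user under one key means a user's full list comes back in a single read, at the cost of rewriting the value when one candidate is removed.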
Current Day › Scoring Service
Scoring Service is a stateless calculation service that supports JSON based models
•REST + Protocol buffer based
• GET /scoring-service/<version>/users/<user-id>/models/<modelname>/score
• POST /scoring-service/<version>/models/<modelname>/score
•Knows how to fetch data from data sources for some models
•All models slowly being centralized in one place
•Underlying library supports any protobuf or map
•Possible candidate for redesign?
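The two endpoints suggest two ways of supplying model inputs: the GET form names a user and lets the service fetch the data, while the POST form carries the data in the request. A minimal sketch of that split follows, assuming models are plain callables; `make_scoring_service`, the `is_over_30` model, and the sample user are hypothetical.

```python
def make_scoring_service(models, fetch_user):
    """Build the two scoring entry points; nothing is stored between calls."""

    def score_with_data(model_name, data):
        # POST-style call: the caller supplies all model inputs.
        return models[model_name](data)

    def score_by_id(user_id, model_name):
        # GET-style call: the service fetches the inputs itself.
        return score_with_data(model_name, fetch_user(user_id))

    return score_by_id, score_with_data


# Hypothetical model and data source, for illustration only.
models = {"is_over_30": lambda data: data["age"] > 30}
users = {42: {"age": 34}}

score_by_id, score_with_data = make_scoring_service(models, users.__getitem__)
by_id = score_by_id(42, "is_over_30")
by_data = score_with_data("is_over_30", {"age": 25})
```

Because both paths end in the same pure function call, any instance of the service can answer any request, which is what makes the service stateless.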
Current Day › Model Frameworks 3.0
Model Frameworks 3.0 is the core library behind all scoring
•JSON based model definitions
•Scala DSL implementation with bytecode generation
•Supports Protobufs (Message), ResultSet, Maps
•Examples
• "same_religion" : "{user.profile.religion} == {cand.profile.religion}"
• "bin_age_diff" : "bin(bins, {user.calculatedValues.age} - {cand.calculatedValues.age})"
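To show how definitions like the two examples could be evaluated, here is a small Python sketch. It is not the Scala DSL from the deck: it rewrites each `{path}` placeholder into an indexed lookup and compiles the result once with Python's built-in `compile`, loosely mirroring the bytecode-generation idea. The semantics of `bin` (count of bin boundaries less than or equal to the value), the boundary list, and the sample data are assumptions, and `eval` with empty builtins is not a security sandbox, so this only suits trusted definitions.

```python
import bisect
import re

PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def lookup(data, path):
    """Walk a dotted path such as 'user.profile.religion' through nested maps."""
    for key in path.split("."):
        data = data[key]
    return data


def bin_index(bins, value):
    """Assumed meaning of bin(): how many boundaries are <= value."""
    return bisect.bisect_right(bins, value)


def compile_model(expression, constants=None):
    """Turn a model expression into a callable that scores one data map."""
    paths = []

    def replace(match):
        paths.append(match.group(1))
        return f"_v[{len(paths) - 1}]"

    code = compile(PLACEHOLDER.sub(replace, expression), "<model>", "eval")
    env = {"__builtins__": {}, "bin": bin_index, **(constants or {})}

    def model(data):
        values = [lookup(data, p) for p in paths]
        return eval(code, env, {"_v": values})

    return model


same_religion = compile_model("{user.profile.religion} == {cand.profile.religion}")
bin_age_diff = compile_model(
    "bin(bins, {user.calculatedValues.age} - {cand.calculatedValues.age})",
    constants={"bins": [-5, 0, 5]},  # hypothetical boundaries
)

pair = {
    "user": {"profile": {"religion": "none"}, "calculatedValues": {"age": 34}},
    "cand": {"profile": {"religion": "none"}, "calculatedValues": {"age": 31}},
}
religion_match = same_religion(pair)
age_bin = bin_age_diff(pair)
```

Since the model is only a string, changing it is a configuration change and not a code release, which is the motivation the earlier slide gives for moving away from Java and Groovy models.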
Current Day › Offline Matching – Spring Conductor
Current Day › Offline Matching – Hadoop flow
linkedin.com/in/rbarker1

Dating with Models

  • 1. 1 Dating with ModelsDating with Models A How to Guide for Programmers and ArchitectsA How to Guide for Programmers and Architects Ryan BarkerRyan Barker
  • 2. The eHarmony Difference ›The eHarmony Difference › How are we different? • 30+ years as clinical psychologist and marriage counselor • Many failing marriages due to fundamental incompatibility Can we do better?
  • 3. The fundamental idea›The fundamental idea› 320 Questions › Personality › Values › Attitudes › Beliefs Compatibility Matching
  • 4. Compatibility Matching ›Compatibility Matching › Obstreperousness
  • 6. Compatibility Matching ›Compatibility Matching › 29 Dimensions®
  • 7. So lets build it! ›So lets build it! › Models as a stored procedure~2001
  • 8. Problems ›Problems › Stored procedures are awesome • Problem #1 – Thousands of users, very few matches. Entire company is at stake • Resolution – Line by line debugging of stored procedure finds an AND that should be an OR • Problem #2 – Database load increasing • Resolution – Optimize stored procedure? More hardware? Rewrite? • Problem #3 – Order by compatibility does not work • Resolution – Change stored procedure? Find a way to introduce models
  • 9. Match Distribution 3 Compatibility Matching 1 Affinity Matching 2 The eHarmony Difference ›The eHarmony Difference › Compatibility Matching System® Layers on Top of Compatibility Matching
  • 10. 61 21 3000 Affinity Matching ›Affinity Matching ›
  • 12. Affinity Matching ›Affinity Matching › Distance Prob( )
  • 13. Affinity Matching ›Affinity Matching › Distance
  • 14. Affinity Matching ›Affinity Matching › Height difference Prob( ) 4 - 8 in cm
  • 15. Affinity Matching ›Affinity Matching › “Attractiveness” Prob( )
  • 16. Redesign ›Redesign › Event based matching with Java/Groovy models
  • 17. Problems ›Problems › Better but still suboptimal • Problem #1 – Suboptimal distribution of matches • Resolution – Shuffle loop order each day? Introduce an optimizer! • Problem #2 – Nightly match run taking 27 hours, heavy database load • Resolution – Move to an offline process • Problem #3 – Java models require testing and new releases. Groovy models are too slow • Resolution – Change to configuration based models
  • 18. Compatibility Matching 1 Affinity Matching 2 Match Distribution 3 The eHarmony Difference ›The eHarmony Difference › Compatibility Matching System® Delivering the right matches at the right time to as many people as possible across the entire network.
  • 19. Match Distribution ›Match Distribution › Graph optimization 2 21Prob( | data)
  • 20. Match Distribution ›Match Distribution › Graph optimization 2 2Prob( | data)
  • 21. Match Distribution ›Match Distribution › Graph optimization 2 2Prob( | data)
  • 22. 23
  • 23. Match Distribution ›Match Distribution › Does it work?
  • 24. Problems ›Problems › The design is never finished • Problem #1 – More data required • Resolution – Build services to collect data in real time • Problem #2 – Bandwidth limitations • Resolution – Switch to protocol buffers • Problem #3 – Can’t reprocess people fast enough due to database load • Resolution – Switch to key value store backed services
  • 25. Rearchitecture ›Rearchitecture › Services for everything
  • 26. Rearchitecture ›Rearchitecture › Service features • RESTful data oriented design • Single element • GET – Return single element • POST – Update single element • PUT – Create single element • DELETE – Delete single element • Multiple element • GET – Return list of elements • Produces/Consumes JSON or Protobuf • JAX-RS providers transparently convert between formats • Accept/ContentType: X-application-protobuf
  • 27. Rearchitecture › Service Client features
      • Generic client customized for each service
          • Single element
              • GET – Return single element
              • POST – Update single element
              • PUT – Create single element
              • DELETE – Delete single element
          • Multiple element
              • GET – Return list of elements
              • BATCH – Scatter gather implementation
      • Protocol buffer based by default, falls back to JSON for older services
      • Configurable retries for GET/PUT/DELETE
  • 28. Current Day › Matching User Service
      Matching User Service is a data aggregation service that gathers data from various sources and stores it in a key value store
      • REST + Protocol buffer based
          • /user-service/<version>/users/<user-id>
          • Supports full and partial updates
          • Supports single and batch gets
          • 1000+ data attributes, ~4KB each uncompressed
      • Key: Userid
      • Value: UserProto
  • 29. Current Day › Matching User Service
  • 30. Current Day › Matching User Service
  • 31. Current Day › Matching User Service
  • 32. Current Day › Pairing Service
      Pairing Service is a data service that supports a specialized set of operations
      • REST + Protocol buffer based
          • GET/PUT/DELETE /pairings-service/<version>/pairings/<type>/users/<user-id>
          • DELETE /pairings-service/<version>/pairings/<type>/users/<user-id>/candidates/<candidate-id>
          • 4 data attributes per pairing
          • 0 to tens of thousands of pairings per user
      • Stores: 1 per type
      • Key: Userid
      • Value: PairingsProto
  • 33. Current Day › Scoring Service
      Scoring Service is a stateless calculation service that supports JSON based models
      • REST + Protocol buffer based
          • GET /scoring-service/<version>/users/<user-id>/models/<modelname>/score
          • POST /scoring-service/<version>/models/<modelname>/score
      • Knows how to fetch data from data sources for some models
      • All models slowly being centralized in one place
      • Underlying library supports any protobuf or map
      • Possible candidate for redesign?
  • 34. Current Day › Model Frameworks 3.0
      Model Frameworks 3.0 is the core library behind all scoring
      • JSON based model definitions
      • Scala DSL implementation with bytecode generation
      • Supports Protobufs (Message), ResultSet, Maps
      • Examples
          • "same_religion" : "{user.profile.religion} == {cand.profile.religion}"
          • "bin_age_diff" : "bin(bins, {user.calculatedValues.age} - {cand.calculatedValues.age})"
  • 35. Current Day › Offline Matching – Spring Conductor
  • 36. Current Day › Offline Matching – Hadoop flow
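
Slide 34 describes JSON based model definitions whose expressions read `{user...}` and `{cand...}` fields. A minimal Python sketch of how such definitions could be evaluated is given here. It is not the Scala DSL with bytecode generation named on the slide. The helper names `lookup` and `compile_model`, the sample bin edges, the sample records, and the meaning given to `bin` (the index of the interval that contains the value) are all assumptions made for illustration.

```python
import bisect
import json
import re

# The two example definitions from slide 34, written as one JSON document.
MODEL_JSON = r'''
{
  "same_religion": "{user.profile.religion} == {cand.profile.religion}",
  "bin_age_diff": "bin(bins, {user.calculatedValues.age} - {cand.calculatedValues.age})"
}
'''

PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def lookup(record, path):
    """Walk a dotted path such as 'profile.religion' through nested dicts."""
    value = record
    for key in path.split("."):
        value = value[key]
    return value


def compile_model(expression, constants=None):
    """Turn one model expression into a function score(user, cand)."""
    slots = []

    def to_slot(match):
        # Replace each {path} placeholder with a generated variable name.
        slots.append(match.group(1))
        return f"_v{len(slots) - 1}"

    code = compile(PLACEHOLDER.sub(to_slot, expression), "<model>", "eval")
    # Assumption: bin(edges, x) returns the index of the interval holding x.
    helpers = {"bin": lambda edges, x: bisect.bisect_right(edges, x)}
    helpers.update(constants or {})

    def score(user, cand):
        roots = {"user": user, "cand": cand}
        env = dict(helpers)
        for i, path in enumerate(slots):
            root, _, rest = path.partition(".")
            env[f"_v{i}"] = lookup(roots[root], rest)
        # Builtins are disabled so the expression only sees env.
        return eval(code, {"__builtins__": {}}, env)

    return score


# Hypothetical bin edges and sample records, for illustration only.
constants = {"bins": [-10, -5, 0, 5, 10]}
models = {name: compile_model(expr, constants)
          for name, expr in json.loads(MODEL_JSON).items()}
user = {"profile": {"religion": "none"}, "calculatedValues": {"age": 34}}
cand = {"profile": {"religion": "none"}, "calculatedValues": {"age": 29}}
scores = {name: model(user, cand) for name, model in models.items()}
print(scores)
```

Keeping the definitions as data means a model can change without a code release, which is the motivation given on slide 17. A production implementation would need a real parser rather than `eval`.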

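Slide 27 lists a scatter gather BATCH operation and configurable retries for GET/PUT/DELETE. One way those two client features could fit together is sketched below in Python, using only the standard library. The `ServiceClient` class, the `?ids=` query format, the `v1` version segment, and the in-memory `flaky_transport` stand-in are assumptions for illustration, not the eHarmony client.

```python
from concurrent.futures import ThreadPoolExecutor

# Verbs the slide lists as retryable; POST is deliberately excluded.
RETRYABLE = {"GET", "PUT", "DELETE"}


class ServiceClient:
    def __init__(self, transport, max_retries=2, batch_size=2, workers=4):
        self.transport = transport      # callable(method, path, body) -> dict
        self.max_retries = max_retries
        self.batch_size = batch_size
        self.workers = workers

    def request(self, method, path, body=None):
        """Send one request, retrying only the retryable verbs."""
        attempts = 1 + (self.max_retries if method in RETRYABLE else 0)
        for attempt in range(attempts):
            try:
                return self.transport(method, path, body)
            except ConnectionError:
                if attempt == attempts - 1:
                    raise

    def batch_get(self, base_path, ids):
        """Scatter ids into chunks, fetch them in parallel, gather one dict."""
        chunks = [ids[i:i + self.batch_size]
                  for i in range(0, len(ids), self.batch_size)]

        def fetch(chunk):
            # Hypothetical multi-get URL; the slides do not show the format.
            return self.request("GET", f"{base_path}?ids={','.join(chunk)}")

        merged = {}
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            for part in pool.map(fetch, chunks):
                merged.update(part)
        return merged


# In-memory stand-in for the network: the first call to each path fails.
USERS = {"1": {"age": 34}, "2": {"age": 29}, "3": {"age": 41}}
_seen_paths = set()


def flaky_transport(method, path, body=None):
    if path not in _seen_paths:
        _seen_paths.add(path)
        raise ConnectionError("simulated network failure")
    ids = path.split("ids=")[1].split(",")
    return {user_id: USERS[user_id] for user_id in ids}


client = ServiceClient(flaky_transport, max_retries=1, batch_size=2)
result = client.batch_get("/user-service/v1/users", ["1", "2", "3"])
print(result)
```

Each chunk fails once and succeeds on its single retry, so the gathered result still contains all three users. A POST through the same client would surface the first failure instead of retrying.
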
Editor's Notes

  1. Hello and thank you. Pleasure to be here. Today I am here to talk about what is happening behind the scenes at eHarmony. We were one of the first companies to apply sophisticated technology to the very old concept of matchmaking. eHarmony takes a very different approach from other online dating sites, … search-based. On those sites, you determine your preferences – and filter out … That’s one valid approach. But eHarmony is different.
  2. eHarmony was created to give people a better chance and a better way to find a great long-term relationship. Many of you may know from our old television commercials that eHarmony was founded by Dr. Neil Clark Warren. You may not know that he was a clinical psychologist and marriage counselor in Pasadena, California for more than 30 years. A lot of the couples Dr. Warren counseled were in failing marriages. Over the years, he realized that marriages often fall apart when the people in them are fundamentally incompatible. Dr. Warren believed that the best way to create happier marriages and reduce some of the negative effects of divorce was to give people a better chance of marrying the right person in the first place. That insight led to a lot of questions: What makes some couples more satisfied in their relationships over time than others? Can long-term relationship satisfaction be predicted? If so, can those qualities be used to match single people? Dr. Warren and the founding team at eHarmony began researching those questions by studying several thousand married couples. They discovered that there are common traits that distinguish the most satisfied married couples from others. Thus, in the late 90s, eHarmony was born.
  3. That is compatibility matching. Similarity on dimensions that don’t get discussed. When asked: “Are you happy with yourself?” Important, but not a pickup line. That’s why RQ. A very good snapshot of personality.
  4. Core traits and vital attributes. Core traits: [CLICK TO BUILD] Vital attributes. Initial eHarmony model.
  5. Here is the initial eHarmony model. Only pairs with a high chance to be very happy together are introduced.
  6. If no click, no communication. Compatibility and chemistry are two very different things. Interests provide something to talk about. His matches have to like him back. Affinity Matching is about …
  7. Every match eHarmony makes is compatible. That is from the personality perspective. However, not all matches end up talking to each other. Sometimes the age gap could be too big. Other times the users may live too far apart. There are too many reasons to count. We are trying to deliver as many matches as possible where both users are interested in each other, start communicating and get to know each other.
  8. That leads me to the last piece of our matching process, which we call match distribution. We need to make sure that we’re presenting the right matches… to the right users… at the right time… to as many people as possible across our entire network, every day. The network changes every day. Let me illustrate this.
  9. Now we’re not doing those joins on disk at all. For each potential match we want to process, we can load relevant user data on demand for each side from our Voldemort cache. This was loaded with user data by a previous MapReduce step. Now we’re joining in RAM, record by record, on demand. At the end of the evaluation we’ve actually thrown away most of the data we don’t need after we’ve used it. Did I mention this gave us a 10x speedup over conventional Hadoop joins? It’s worth repeating: we got an order of magnitude performance improvement by using this technique.
  10. It worked
  11. How does it work for Adam? Matches. Interested in Julia. Break the ice? Pick-up lines no good.
  12. Doing matchmaking well requires an innate understanding of your customers and the sophistication to use that data to deliver a valuable experience. All the advances in computing power and algorithms have recently opened up a lot of new possibilities and applications. I’m happy to talk with any of you further if you have questions about eHarmony or how to apply matchmaking to your own businesses. Thank you.
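
Note 9 describes replacing on-disk Hadoop joins with per-record lookups against a cache that an earlier step has already filled. The shape of that in-RAM join can be sketched in Python as follows. A plain dict stands in for the Voldemort cache, and the function name `score_pairs`, the sample records, and the toy age-gap score are assumptions for illustration.

```python
def score_pairs(candidate_pairs, cache, score):
    """Join each pair in memory by looking both users up in the cache.

    Only the small (user_id, cand_id, score) result is kept. The full user
    records are dropped as soon as the pair has been scored.
    """
    for user_id, cand_id in candidate_pairs:
        user = cache.get(user_id)
        cand = cache.get(cand_id)
        if user is None or cand is None:
            continue  # one side is missing from the cache, so skip the pair
        yield user_id, cand_id, score(user, cand)


# Stand-in for the cache that a previous step loaded with user data.
user_cache = {
    "u1": {"age": 34},
    "u2": {"age": 31},
    "u3": {"age": 52},
}


def age_gap(user, cand):
    # Toy scoring function; the real models are not shown in the notes.
    return abs(user["age"] - cand["age"])


pairs = [("u1", "u2"), ("u1", "u3"), ("u1", "u9")]
results = list(score_pairs(pairs, user_cache, age_gap))
print(results)
```

Because the generator yields one result at a time, memory use depends on the cache and the output, not on a materialized join of both sides.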