VIJAY VANGAPANDU
@vijayvangapandu
vijaykumarvangapandu
WHO ARE WE?
EHARMONY CREATES THE HAPPIEST, MOST PASSIONATE AND MOST FULFILLING RELATIONSHIPS*
*ACCORDING TO A RECENT STUDY
MARRIAGES PER DAY
150
questions
Personality
Values
Attributes
Beliefs
MATCHING SYSTEM
Compatibility Matching System®
COMPATIBILITY MATCHING
AFFINITY MATCHING
MATCH DISTRIBUTION
Intellect
Energy
Sociability
Ambition
Kindness
Curiosity
Humor
Spirituality
USER UPDATES
MATCH DELIVERY (V1)
MAP-SIDE JOINS
(TB) SCORING
VOLDEMORT
MATCH DATA SERVICE
MATCHING SYSTEM
30+ MILLION EVENTS
65+ MILLION USERS
30+ BILLION RECORDS
VOLDEMORT?
THAT NAME SOUNDS FAMILIAR
VOLDEMORT
AUTO PARTITIONING
PLUGGABLE SERIALIZATION
AUTO REPLICATION
KEY-VALUE DYNAMO
GOSSIP
USER UPDATES O(N^2)
*USER UPDATED THE ADDRESS
NEED FOR SCALABILITY
VOLDEMORT
30+ Million Match Events / Day
30+ Billion Match Records
Millions of user generated Events / Day
Low latency user requests
1.4 GB/MIN (14)
NEED FOR SCALABILITY
GET MATCHES RESPONSE TIMES
DATA STORE NEEDS
QUERIES
LOW LATENCY
CRUD OPERATIONS
FILTERING
THROUGHPUT
40+ MILLION WRITES
30+ BILLION RECORDS
DATA STORE NEEDS
EASY TO MAINTAIN
CONSISTENCY
AVAILABLE
PARTITION TOLERANCE
BREAKING CAP?
Consistency
Availability
Partition Tolerance
CA CP AP
MongoDB
HBase
Redis
Cassandra
DynamoDB
Riak
RDBMS
KAFKA
LAMBDA
• Robust and fault-tolerant system
• Serves a wide range of workloads and use cases
• Linearly scalable
• Layered architecture
  Batch Layer
  Query Layer
  Speed Layer
- Nathan Marz
LAMBDA
BATCH LAYER
QUERY LAYER
SPEED/SAVE LAYER
MAP-SIDE JOINS
(TB) SCORING
MATCHING SYSTEM
MESSAGE BROKER
BATCH STORAGE
SPEED STORAGE
MERGE
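The diagram on this slide ends in a MERGE step between batch storage and speed storage. As a rough illustration only, the sketch below shows one way a query layer could combine the two views for a single user. The class name, the record fields, and the newest-timestamp-wins rule are all assumptions made for this example; the slides do not describe how the merge is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: every name here is illustrative and not taken from the talk.
final class LambdaMergeSketch {

    /** Minimal stand-in for a match record (assumed fields). */
    static final class MatchRecord {
        final long matchId;
        final String status;
        final long updatedAtMillis;

        MatchRecord(long matchId, String status, long updatedAtMillis) {
            this.matchId = matchId;
            this.status = status;
            this.updatedAtMillis = updatedAtMillis;
        }
    }

    /**
     * Combines one user's batch view with the speed view, keyed by match id.
     * Assumed rule: the record with the newer timestamp wins, and ties go to the speed view.
     */
    static Map<Long, MatchRecord> merge(Map<Long, MatchRecord> batchView,
                                        Map<Long, MatchRecord> speedView) {
        Map<Long, MatchRecord> merged = new LinkedHashMap<>(batchView);
        for (MatchRecord recent : speedView.values()) {
            merged.merge(recent.matchId, recent,
                    (stored, fresh) -> fresh.updatedAtMillis >= stored.updatedAtMillis ? fresh : stored);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Long, MatchRecord> batch = new LinkedHashMap<>();
        batch.put(1L, new MatchRecord(1L, "NEW", 100L));
        batch.put(2L, new MatchRecord(2L, "NEW", 100L));

        Map<Long, MatchRecord> speed = new LinkedHashMap<>();
        speed.put(2L, new MatchRecord(2L, "CLOSED", 200L)); // newer than the batch copy
        speed.put(3L, new MatchRecord(3L, "NEW", 150L));    // not yet in the batch view

        for (MatchRecord r : merge(batch, speed).values()) {
            System.out.println(r.matchId + " " + r.status);
        }
    }
}
```

Resolving conflicts at read time like this would let the batch store lag behind the speed store without returning stale rows, at the cost of one extra lookup per query.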
DATA STORE EVALUATION
HBASE AS BATCH STORE
CRUD OPERATIONS
THROUGHPUT
40+ MILLION WRITES
30+ BILLION RECORDS
EASY TO MAINTAIN
CONSISTENCY
PARTITION TOLERANCE
AVAILABILITY
KAFKA AS BROKER
THROUGHPUT
40+ MILLION WRITES
CONSISTENCY
AVAILABLE
PARTITION TOLERANCE
REDIS AS SPEED STORAGE
CONSISTENCY
PARTITION TOLERANCE
AVAILABLE
LOW LATENCY
AS SQL LAYER
CRUD OPERATIONS
EASY TO MAINTAIN
QUERIES
INDEXING
TRANSACTIONS
MULTITENANCY
PHO LIBRARY
CRUD OPERATIONS
EASY TO MAINTAIN
QUERIES
FILTERING
PHO LIBRARY
CONFIGURATION
ANNOTATE THE ENTITY BEAN
    @Entity(value="user_matches")
    public class MatchDataFeedItemDto implements Serializable {
        @Embedded private MatchCommunicationElement communication;
        @Embedded private MatchElement match;
        @Property(value = "UID") private long storeUserIdKey;
        @Property(value = "MID") private long matchId;
    }
REGISTER THE BEAN
    <util:list id="entityPropertiesMappings">
        <value>com.eharmony.datastore.model.MatchDataFeedItemDto</value>
    </util:list>
    <bean id="entityPropertiesMappingContext" class="com.eharmony.datastore.mapper.EntityPropertiesMappingContext">
        <constructor-arg ref="entityPropertiesMappings"/>
    </bean>
    <bean id="entityPropertiesResolver" class="com.eharmony.datastore.mapper.EntityPropertiesResolver">
        <constructor-arg ref="entityPropertiesMappingContext"/>
    </bean>
    <bean id="phoenixHBaseQueryTranslator" class="com.eharmony.datastore.hbase.translator.PhoenixHBaseQueryTranslator">
        <constructor-arg name="propertyResolver" ref="entityPropertiesResolver"/>
    </bean>
    <bean id="phoenixHBaseQueryExecutor" class="com.eharmony.datastore.hbase.query.executor.PhoenixHBaseQueryExecutor">
        <constructor-arg name="queryTranslator" ref="phoenixHBaseQueryTranslator"/>
        <constructor-arg name="resultMapper" ref="phoenixProjectedResultMapper"/>
    </bean>
PHO LIBRARY
QUERY BUILDING
    Disjunction disjunction = new Disjunction();
    for (int statusFilter : statusFilters) {
        disjunction.add(Restrictions.eq("status", statusFilter));
    }
    QueryBuilder.builderFor(FeedItemDto.class).select()
        .add(Restrictions.eq("userId", userId))
        .add(Restrictions.gte("spotlightEnd", spotlightEndDate))
        .add(disjunction)
        .setReturnFields(projection)
        .addOrder(orderings)
        .setMaxResults(maxResults)
        .build();
http://eharmony.github.io/
LAMBDA
KAFKA
MAP-SIDE JOINS
(TB) SCORING
BATCH LAYER
QUERY LAYER
SPEED/SAVE LAYER
MATCHING SYSTEM
PERFORMANCE
HBASE CUTOVER
SAVE MATCH RESPONSE TIMES
50% 100%
GET MATCHES RESPONSE TIMES
HBASE CUTOVER 100%
SOME INSIGHT
CHALLENGES
HOT
REGION MIGRATION
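The slide lists hot regions among the challenges but does not say how they were handled. One common mitigation in HBase is to salt row keys so that writes spread across regions, and Apache Phoenix offers a built-in SALT_BUCKETS table option for this. The sketch below does not reproduce Phoenix's mechanism; it only illustrates the idea. The key layout, the separator, and the bucket count are assumptions for this example, not the schema used in the talk.

```java
import java.util.Locale;

// Hypothetical sketch: key layout and bucket count are assumptions, not the talk's schema.
final class SaltedRowKeySketch {

    /** Maps a user id to a salt bucket in the range [0, bucketCount). */
    static int saltBucket(long userId, int bucketCount) {
        return Math.floorMod(Long.hashCode(userId), bucketCount);
    }

    /** Builds "<salt>|<userId>|<matchId>" so one user's matches stay together in one bucket. */
    static String rowKey(long userId, long matchId, int bucketCount) {
        return String.format(Locale.ROOT, "%02d|%d|%d",
                saltBucket(userId, bucketCount), userId, matchId);
    }

    public static void main(String[] args) {
        System.out.println(rowKey(42L, 7L, 16)); // prints "10|42|7"
        System.out.println(rowKey(43L, 7L, 16)); // prints "11|43|7"
    }
}
```

Because the salt is derived from the user id alone, a reader can recompute it and still fetch all of a user's matches with a single prefix scan.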
MONITORING
metrics.codahale.com
http://www.eharmony.com/about/careers/
THANK YOU
QUESTIONS?
@vijayvangapandu

eHarmony @ Phoenix Con 2016


Editor's Notes

  • #19-#21, #24-#32: 60M+ multi-attribute queries daily across 250+ attributes