
Apache Avro and Messaging at Scale in LivePerson


This talk covers the challenges we tackled while building our new service-oriented system. It summarizes what we realized would be bad ideas, the better approaches to data consistency, how we used Apache Avro, and the supporting infrastructure we created to help us achieve a consistent yet flexible system.
Amihay Zer-Kavod is a Senior Software Architect at LivePerson.

Published in: Technology

Transcript of "Apache Avro and Messaging at Scale in LivePerson"

  1. Apache Avro in LivePerson
     Collecting and saving data is easy; keeping it consistent is tough
     DevCon Tlv, June 2014
     Amihay Zer-Kavod, Software Architect
  2. Who am I?
     Amihay Zer-Kavod
     Software Architect
     Been in software since 1989
  3. LivePerson Ecosystem
     (Diagram slide; the only surviving label is M/R.)
  4. Communication & Meaning
     ● Consistent but decoupled communication between services, such as:
        o Monitoring, Interaction
        o Predictive, Sentiment
        o Reporting & Analysis
        o History
     ● Consistent meaning over time
        o BigData Store (Hadoop)
        o Reporting
     event, evento, 事件, घटना, حدث, ארוע, событие
  5. What can’t we use?
     Don’t use direct APIs!
     They are completely wrong for this problem, since:
     • They produce too much coupling between services
     • APIs are synchronous by nature
     • They add irrelevant complexity to the called service
  6. So what is needed?
     The Message is the API!
     ● A unified event model (schema) for all reported events
     ● Management tools for the unified schema
     ● Tools for sending events over the wire
     ● Tools for reading/writing events in big data
     ● Backward and forward compatibility
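The "message is the API" idea from slides 5 and 6 can be illustrated with a minimal sketch. It assumes an in-process `queue.Queue` standing in for a real message broker, and the event type names (`visitor_arrived`, `chat_started`) are hypothetical; none of this is taken from the talk. The point is only that the producer knows the event format and nothing about who consumes it.

```python
import queue
import threading


def run_demo():
    """Publish events to a queue; a consumer thread reads them independently."""
    bus = queue.Queue()  # assumption: stands in for a real message broker
    received = []

    def consumer():
        # The consumer depends only on the event format, not on the producer.
        while True:
            event = bus.get()
            if event is None:  # sentinel: no more events
                break
            received.append(event["type"])

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer publishes and moves on; it never calls the consumer directly.
    for event_type in ("visitor_arrived", "chat_started"):  # hypothetical names
        bus.put({"type": event_type, "version": "1"})
    bus.put(None)
    worker.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

Adding a second consumer in this sketch would not require any change to the producer, which is the decoupling the slides argue direct APIs cannot give.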
  7. The Event model
     From generic to specific structure with:
     • Common header - all common data to all events
     • Logical Entities - common header to all logical entities (such as Visitor)
     • Dynamic Specific headers
     • Specific Event body
  8. Apache Avro to the rescue
     ● Avro - a schema-based serialization/deserialization framework
     ● Avro IDL - schema definition language
     ● Avro file - Hadoop integration
     ● Avro schema resolution
     ● Apache Avro was created by Doug Cutting
  9. Avro JSON schema sample
     {
       "type": "record",
       "name": "Event",
       "namespace": "com.liveperson.example",
       "doc": "Example event",
       "fields": [
         {"name": "version", "type": "string", "default": "1"},
         {"name": "id", "type": "string", "default": "Unknown"},
         {"name": "time", "type": "long", "default": -1},
         {"name": "body", "type": "string", "default": "no body"},
         {"name": "color",
          "type": {
            "type": "enum",
            "name": "Color",
            "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"]
          },
          "default": "NO_COLOR"}
       ]
     }
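To show what the `default` entries in the sample schema buy you, here is a small sketch that uses only Python's standard `json` module rather than the Avro library itself. The helper names `default_record` and `apply_defaults` are made up for this illustration. It mimics one aspect of Avro schema resolution: when a reader's schema has a field that the written data lacks, the field's default is used.

```python
import json

# The schema from slide 9, copied as-is.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Event",
  "namespace": "com.liveperson.example",
  "doc": "Example event",
  "fields": [
    {"name": "version", "type": "string", "default": "1"},
    {"name": "id", "type": "string", "default": "Unknown"},
    {"name": "time", "type": "long", "default": -1},
    {"name": "body", "type": "string", "default": "no body"},
    {"name": "color",
     "type": {"type": "enum", "name": "Color",
              "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"]},
     "default": "NO_COLOR"}
  ]
}
"""


def default_record(schema):
    """Build a record holding the default value of every field that has one."""
    return {f["name"]: f["default"] for f in schema["fields"] if "default" in f}


def apply_defaults(schema, partial):
    """Fill in fields missing from `partial` with the schema defaults."""
    record = default_record(schema)
    record.update(partial)
    return record


if __name__ == "__main__":
    schema = json.loads(SCHEMA_JSON)
    print(apply_defaults(schema, {"id": "abc"}))
```

This is only an approximation for illustration; real Avro resolution also handles type promotion, unions, and binary decoding.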
  10. Avro IDL - LivePerson Event
      /** Base for all LivePerson Events */
      @namespace("com.liveperson.global")
      record LPEvent {
        /** Common Header of the event */
        CommonHeader header = null;
        /** Logical entity details participating in this event - Visitor, Agent, etc... */
        array<Participant> participants = null;
        /** Holding specific platform info as node name (machine) cluster Id etc... */
        PlatformHeader platformSpecificHeader = null;
        /** Auditing Header, Optional - adds data for auditing of the events flow in the platform */
        union {null, AuditingHeader} auditingHeader = null;
        /** The event body */
        EventBody eventBody = null;
      }
  11. Backward & Forward Compatibility
      Avro schema evolution
      ● Avro supports resolution between two schemas (the writer's and the reader's)
      ● Need to follow a set of rules:
         ● Every field must have a default value
         ● A field can be added (make sure to give it a default value)
         ● Field types cannot be changed (add a new field instead)
         ● Enum symbols can be added but never removed
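The rules on slide 11 are mechanical enough to check automatically. The following is a rough sketch of such a check over two schemas given as parsed JSON dictionaries. The function name `check_evolution` and the message wording are assumptions made for this example; it is not LivePerson's tooling and not part of the Avro library. It covers only the rules listed on the slide and only flat records with inline enums (it says nothing about removed fields or nested records).

```python
import json


def enum_symbols(field_type):
    """Return the symbol list if the type is an inline enum, else None."""
    if isinstance(field_type, dict) and field_type.get("type") == "enum":
        return field_type["symbols"]
    return None


def type_signature(field_type):
    """Describe a type for comparison, ignoring an enum's symbol list."""
    if isinstance(field_type, dict) and field_type.get("type") == "enum":
        return ("enum", field_type.get("name"))
    return json.dumps(field_type, sort_keys=True)


def check_evolution(old, new):
    """Return a list of rule violations when moving from `old` to `new`."""
    problems = []
    old_fields = {f["name"]: f for f in old["fields"]}
    for field in new["fields"]:
        name = field["name"]
        # Rule: every field must have a default value.
        if "default" not in field:
            problems.append(f"{name}: missing default value")
        previous = old_fields.get(name)
        if previous is None:
            continue  # newly added field; nothing to compare against
        # Rule: field types cannot be changed.
        if type_signature(previous["type"]) != type_signature(field["type"]):
            problems.append(f"{name}: type changed")
            continue
        # Rule: enum symbols can be added but never removed.
        old_symbols = enum_symbols(previous["type"])
        if old_symbols is not None:
            new_symbols = enum_symbols(field["type"])
            removed = [s for s in old_symbols if s not in new_symbols]
            if removed:
                problems.append(f"{name}: enum symbols removed: {removed}")
    return problems
```

A check like this could run when a schema change is proposed, so that an incompatible change is caught before any service starts emitting events with it.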
  12. Is that enough?
      (Diagram slide; surviving labels: M/R, Migdalor.)
  13. How well does it work?
      ● Cyber Monday 2013 (one day)
         o More than 320,000 events per second
         o 7 Storm topologies consuming the events within seconds of real time
         o 2TB of data saved to Hadoop
      ● 2014 preparation:
         o Double the number of events per second, to ~640,000
  14. So how did we do it?
      1. Use an event driven system, don’t use direct APIs
      2. Create a unified schema for all events
      3. Use Avro to implement the schema
      4. Add some supporting infrastructure
  15. Questions?
      event, evento, 事件, घटना, حدث, ארוע, событие
  16. Amihay Zer-Kavod
      You can contact me at: amihayz@liveperson.com
      LivePerson is hiring!
  17. Thank You
