Apache Avro in LivePerson
Collecting and saving data is easy; keeping it consistent is tough
DevCon Tlv, June 2014
Amihay Zer-Kavod, Software Architect
Who am I?
Amihay Zer-Kavod
Software Architect
In software since 1989
LivePerson Ecosystem
[Diagram label: M/R]
● Consistent but decoupled communication between services, such as:
o Monitoring, Interaction
o Predictive, Sentiment
o Reporting & Analysis
o History
Communication & Meaning
event
evento
事件
घटना
‫حدث‬
‫ארוע‬
событие
● Consistent meaning over time
o BigData Store (Hadoop)
o Reporting
What can’t we use?
Don’t use direct APIs!
They are the wrong tool for this problem, because:
• They create too much coupling between services
• Direct API calls are synchronous by nature
• They add irrelevant complexity to the called service
So what is needed?
The Message is the API!
● A unified event model (schema) for all reported events
● Management tools for the unified schema
● Tools for sending events over the wire
● Tools for reading and writing events in the big data store
● Backward and forward compatibility
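A minimal sketch of "the message is the API", in Python, assuming an in-process bus as a stand-in for the real transport (the slides do not name one at this point). `EventBus`, the `chat_started` event type, and the field names are hypothetical, chosen only for illustration. The producer publishes an event and returns; each consumer drains its own queue later, so neither side calls the other directly.

```python
from collections import deque


class EventBus:
    """Hypothetical in-process stand-in for a real message transport."""

    def __init__(self):
        # Maps an event type to the list of subscriber queues for that type.
        self._queues = {}

    def subscribe(self, event_type):
        """Register a consumer and return the queue it will read from."""
        queue = deque()
        self._queues.setdefault(event_type, []).append(queue)
        return queue

    def publish(self, event):
        """Copy the event reference into every subscriber queue, then return."""
        # The producer knows only the event structure, not who consumes it.
        for queue in self._queues.get(event["type"], []):
            queue.append(event)


bus = EventBus()
reporting_queue = bus.subscribe("chat_started")  # e.g. a reporting service
history_queue = bus.subscribe("chat_started")    # e.g. a history service

# The producer publishes once; it never calls reporting or history directly.
bus.publish({"type": "chat_started", "visitor": "v-1"})

# Each consumer processes the event on its own schedule.
print(reporting_queue.popleft()["visitor"])
print(len(history_queue))
```

Adding a third consumer here means one more `subscribe` call and no change to the producer, which is the decoupling the slide argues for.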
The Event model
From generic to specific structure, with:
• Common header - data common to all events
• Logical entities - a header common to all logical entities (such as Visitor)
• Dynamic specific headers
• Specific event body
Apache Avro to the rescue
● Avro - a schema-based serialization/deserialization framework
● Avro IDL - a schema definition language
● Avro file - a container file format with Hadoop integration
● Avro schema resolution
● Apache Avro was created by Doug Cutting
Avro JSON schema sample
{
  "type": "record",
  "name": "Event",
  "namespace": "com.liveperson.example",
  "doc": "Example event",
  "fields": [
    { "name": "version", "type": "string", "default": "1" },
    { "name": "id", "type": "string", "default": "Unknown" },
    { "name": "time", "type": "long", "default": -1 },
    { "name": "body", "type": "string", "default": "no body" },
    { "name": "color",
      "type": { "type": "enum", "name": "Color",
                "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] },
      "default": "NO_COLOR" }
  ]
}
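To see what the defaults in this sample buy you, here is a small sketch in plain Python. It uses only the `json` module, not the Avro library, and `apply_defaults` is a hypothetical helper that imitates what a reader does when written data lacks a field. The input values are made up.

```python
import json

# The sample schema from the slide, embedded as JSON text.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Event",
  "namespace": "com.liveperson.example",
  "doc": "Example event",
  "fields": [
    {"name": "version", "type": "string", "default": "1"},
    {"name": "id", "type": "string", "default": "Unknown"},
    {"name": "time", "type": "long", "default": -1},
    {"name": "body", "type": "string", "default": "no body"},
    {"name": "color",
     "type": {"type": "enum", "name": "Color",
              "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"]},
     "default": "NO_COLOR"}
  ]
}
"""


def apply_defaults(schema, partial):
    """Hypothetical helper: fill fields missing from `partial` with schema defaults."""
    record = {}
    for field in schema["fields"]:
        name = field["name"]
        if name in partial:
            record[name] = partial[name]
        elif "default" in field:
            record[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    return record


schema = json.loads(SCHEMA_JSON)

# Only two of the five fields are supplied; the rest come from the schema.
event = apply_defaults(schema, {"id": "abc-123", "time": 1402000000000})
print(event)
```

Because every field in the sample carries a default, a partial record never fails here. That is the property the compatibility rules later in the deck rely on.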
Avro IDL - LivePerson Event
/** Base for all LivePerson Events */
@namespace("com.liveperson.global")
record LPEvent {
    /** Common header of the event */
    CommonHeader header = null;
    /** Logical entities participating in this event - Visitor, Agent, etc. */
    array<Participant> participants = null;
    /** Platform-specific info such as node name (machine), cluster ID, etc. */
    PlatformHeader platformSpecificHeader = null;
    /** Auditing header, optional - adds data for auditing the flow of events in the platform */
    union {null, AuditingHeader } auditingHeader = null;
    /** The event body */
    EventBody eventBody = null;
}
Backward & Forward Compatibility
Avro schema evolution
● Avro supports schema resolution between two schemas (the writer's and the reader's)
● To benefit from it, follow a set of rules:
o Every field must have a default value
o A field can be added (make sure to give it a default value)
o Field types cannot be changed (add a new field instead)
o Enum symbols can be added but never removed
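The rules above can be checked mechanically. Below is a rough sketch in plain Python (no Avro library) that compares two record schemas given as dictionaries. `check_evolution`, `is_enum`, and the example schemas are hypothetical, and only the rules listed on this slide are covered; it does not reproduce Avro's full schema resolution.

```python
def is_enum(avro_type):
    """True if the type is an inline Avro enum definition."""
    return isinstance(avro_type, dict) and avro_type.get("type") == "enum"


def check_evolution(old, new):
    """Return a list of rule violations when evolving record `old` into `new`."""
    problems = []
    old_fields = {f["name"]: f for f in old["fields"]}
    for field in new["fields"]:
        name = field["name"]
        # Rule: every field must have a default value.
        if "default" not in field:
            problems.append(f"{name}: no default value")
        previous = old_fields.get(name)
        if previous is None:
            continue  # Rule: adding a field is allowed (default checked above).
        old_type, new_type = previous["type"], field["type"]
        if is_enum(old_type) and is_enum(new_type):
            # Rule: enum symbols can be added but never removed.
            removed = set(old_type["symbols"]) - set(new_type["symbols"])
            if removed:
                problems.append(f"{name}: enum symbols removed: {sorted(removed)}")
        elif old_type != new_type:
            # Rule: field types cannot be changed.
            problems.append(f"{name}: type changed from {old_type!r} to {new_type!r}")
    return problems


old = {"type": "record", "name": "Event", "fields": [
    {"name": "version", "type": "string", "default": "1"},
    {"name": "color", "default": "NO_COLOR",
     "type": {"type": "enum", "name": "Color", "symbols": ["NO_COLOR", "BLUE"]}},
]}

# Follows the rules: one new enum symbol, one new field with a default.
good = {"type": "record", "name": "Event", "fields": [
    {"name": "version", "type": "string", "default": "1"},
    {"name": "color", "default": "NO_COLOR",
     "type": {"type": "enum", "name": "Color",
              "symbols": ["NO_COLOR", "BLUE", "PINK"]}},
    {"name": "source", "type": "string", "default": "Unknown"},
]}

# Breaks three rules: changed type, removed symbol, new field without default.
bad = {"type": "record", "name": "Event", "fields": [
    {"name": "version", "type": "int", "default": 1},
    {"name": "color", "default": "NO_COLOR",
     "type": {"type": "enum", "name": "Color", "symbols": ["NO_COLOR"]}},
    {"name": "source", "type": "string"},
]}

print(check_evolution(old, good))  # → []
for problem in check_evolution(old, bad):
    print(problem)
```

A check like this could run when a schema change is proposed, so that a rule violation is caught before any incompatible event is written.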
Is that enough?
[Diagram labels: M/R, Migdalor]
How well does it work?
● Cyber Monday 2013 (one day)
o More than 320,000 events per second
o 7 Storm topologies consuming the events within seconds of real time
o 2 TB of data saved to Hadoop
● 2014 preparation:
o Double the number of events per second, to ~640,000
So how did we do it?
1. Use an event-driven system; don’t use direct APIs
2. Create a unified schema for all events
3. Use Avro to implement the schema
4. Add some supporting infrastructure
Questions?
event
evento
事件
घटना
‫حدث‬
‫ארוע‬
событие
Amihay Zer-Kavod
You can contact me at:
amihayz@liveperson.com
LivePerson is hiring!
Thank You


Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 

Apache Avro and Messaging at Scale in LivePerson

  • 1. Apache Avro in LivePerson Collecting and saving data is easy keeping it consistent is tough DevCon Tlv, June 2014 Amihay Zer-Kavod, Software Architect
  • 2. Who am I? Amihay Zer-Kavod Software Architect Been in software Since 1989
  • 4. ● Consistent but decoupled communication between services, such as: o Monitoring, Interaction o Predictive, Sentiment o Reporting & Analysis o History Communication & Meaning event evento 事件 घटना ‫حدث‬ ‫ארוע‬ событие ● Consistent meaning over time o BigData Store (Hadoop) o Reporting
  • 5. What can’t we use? Don’t use direct APIs! They are the wrong fit for this problem, since: • They produce too much coupling between services • APIs are synchronous by nature • They add irrelevant complexity to the called service
  • 6. So what is needed? The Message is the API! ● A unified event model (schema) for all reported events ● Management tools for the unified schema ● Tools for sending events over the wire ● Tools for reading/writing events in big data ● Backward and forward compatibility
  • 7. The Event model From generic to specific structure with: • Common header - data common to all events • Logical Entities - a header common to all logical entities (such as Visitor) • Dynamic specific headers • Specific event body
  • 8. Apache Avro to the rescue ● Avro - a schema-based serialization/deserialization framework ● Avro IDL - schema definition language ● Avro file - Hadoop integration ● Avro schema resolution ● Apache Avro was created by Doug Cutting
  • 9. Avro JSON schema sample { "type": "record", "name": "Event", "namespace": "com.liveperson.example", "doc": "Example event", "fields":[{ "name": "version", "type": "string", "default": "1" }, { "name": "id", "type": "string", "default": "Unknown"}, {"name": "time","type": "long","default": -1}, {"name": "body","type": "string","default": "no body"}, {"name": "color","type": { "type": "enum", "name": "Color", "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] }, "default": "NO_COLOR" } ] }
  • 10. Avro IDL - LivePerson Event /** Base for all LivePerson Events */ @namespace("com.liveperson.global") record LPEvent { /** Common Header of the event */ CommonHeader header = null; /** Logical entity details participating in this event - Visitor, Agent, etc... */ array<Participant> participants = null; /** Holding specific platform info as node name (machine) cluster Id etc... */ PlatformHeader platformSpecificHeader = null; /** Auditing Header, Optional - adds data for auditing of the events flow in the platform*/ union {null, AuditingHeader } auditingHeader = null; /** The event body */ EventBody eventBody = null; }
  • 11. Backward & Forward Compatibility Avro schema evolution ● Avro supports resolution between two schemas (the writer's and the reader's) ● Need to follow a set of rules: ● Every field must have a default value ● A field can be added (make sure to give it a default value) ● Field types cannot be changed (add a new field instead) ● enum symbols can be added but never removed
  • 13. How well does it work? ● Cyber Monday 2013 (one day) o More than 320,000 events per second o 7 Storm topologies consuming the events within seconds of real time o 2TB of data saved to Hadoop ● 2014 preparation: o Double the number of events per second, to ~640,000
  • 14. So how did we do it? 1. Use an event driven system, don’t use direct APIs 2. Create a unified schema for all events 3. Use Avro to implement the schema 4. Add some supporting infrastructure
  • 16. Amihay Zer-Kavod You can contact me at: amihayz@liveperson.com LivePerson is hiring!
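
The compatibility rules from slide 11 can be sketched as a small checker over two versions of a schema. This is only an illustration in plain Python using the standard `json` module; it is not the Avro library's schema resolution, and the function names `check_evolution` and `is_enum` are hypothetical. It applies the slide's rules as stated (every field has a default, no type changes, no enum symbols removed), which is stricter than Avro itself, since Avro allows some type promotions such as int to long. Removed fields are not checked, because the slide does not give a rule for them. The old schema is the sample from slide 9.

```python
import copy
import json

# Sample schema from slide 9, treated here as the "old" version.
OLD_SCHEMA = json.loads("""
{
  "type": "record", "name": "Event", "namespace": "com.liveperson.example",
  "fields": [
    {"name": "version", "type": "string", "default": "1"},
    {"name": "id", "type": "string", "default": "Unknown"},
    {"name": "time", "type": "long", "default": -1},
    {"name": "body", "type": "string", "default": "no body"},
    {"name": "color",
     "type": {"type": "enum", "name": "Color",
              "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"]},
     "default": "NO_COLOR"}
  ]
}
""")


def is_enum(field_type):
    """True if a field type is an inline enum definition."""
    return isinstance(field_type, dict) and field_type.get("type") == "enum"


def check_evolution(old, new):
    """Return a list of slide-11 rule violations when evolving old into new."""
    problems = []
    old_fields = {f["name"]: f for f in old["fields"]}
    for field in new["fields"]:
        name = field["name"]
        # Rule: every field (existing or added) must have a default value.
        if "default" not in field:
            problems.append(f"{name}: missing default value")
        previous = old_fields.get(name)
        if previous is None:
            continue  # newly added field; only the default rule applies
        old_type, new_type = previous["type"], field["type"]
        if is_enum(old_type) and is_enum(new_type):
            # Rule: enum symbols can be added but never removed.
            removed = set(old_type["symbols"]) - set(new_type["symbols"])
            if removed:
                problems.append(f"{name}: enum symbols removed: {sorted(removed)}")
        elif old_type != new_type:
            # Rule: field types cannot be changed (add a new field instead).
            problems.append(f"{name}: type changed from {old_type!r} to {new_type!r}")
    return problems


# A change that follows the rules: new field with a default, new enum symbol.
good = copy.deepcopy(OLD_SCHEMA)
good["fields"].append({"name": "source", "type": "string", "default": "unknown"})
good["fields"][4]["type"]["symbols"].append("GREEN")

# A change that breaks all three rules.
bad = copy.deepcopy(OLD_SCHEMA)
bad["fields"][2]["type"] = "string"                  # type change
bad["fields"][4]["type"]["symbols"].remove("PINK")   # enum symbol removed
bad["fields"].append({"name": "source", "type": "string"})  # no default

print(check_evolution(OLD_SCHEMA, good))
for problem in check_evolution(OLD_SCHEMA, bad):
    print(problem)
```

A check like this could run wherever the unified schema is managed, so that a rule-breaking change is caught before any service starts writing events with it. The `source` field and the `GREEN` symbol are made up for the example.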