Mark Harrison
Event driven architectures with Kinesis
Justin Potter
3
● MONOLITH!
● Background
● Microservice spaghetti
● Microservice eventing
● Kinesis Overview
● (Soon to be) Open source Kinesis Driver
● Join Us
Agenda
4
The traditional Oracle backed monolith architecture
● Tight and ever-increasing coupling
● Difficult to scale with users and features
● Difficult to maintain
● Difficult to onboard new developers
● Lacked modularity
Long ago in a …...
5
Background
Journal (Tracking) - When a user enters a food, weight, or activity into Weight
Watchers, it is sent to Journal.
Program (Points Calculation) - When a user wishes to view their Weight Watchers
points, a call is made to Program to calculate and retrieve their point allocation.
Program depends on the Journal service for its food tracking.
6
Microservices!!
● Scala, Akka, Play, Cassandra
● REST based services
● Each service represents a single domain concept
○ User Profile, Entitlements, Program …
We needed something different!
7
8
It turns out magic bullets aren’t magic after all!!
● Features cross service boundaries, a LOT
● New features often increase requests between services
○ So one request now hits two services, that’s a 100% increase!
● Immediate consistency means reduced availability
○ I’m looking at you… REST
● Scaling out worked ok, just add more nodes!
● Broadcasting data to other teams results in a direct dependency
● Not enough emphasis on logging and monitoring
So… how’d that work out for you???
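To make the availability point concrete: when a request has to wait synchronously on every service in a chain, the individual availabilities multiply. The sketch below assumes independent failures and an illustrative 99.9% availability per service; neither figure comes from the talk.

```python
from math import prod


def chained_availability(availabilities):
    """Availability of a request that needs every service in the chain to answer.

    Assumes failures are independent, which is a simplification.
    """
    return prod(availabilities)


# Illustrative figures only: each service is assumed to be up 99.9% of the time.
one_service = chained_availability([0.999])
three_services = chained_availability([0.999, 0.999, 0.999])

print(f"1 service:  {one_service:.4%}")
print(f"3 services: {three_services:.4%}")
```

Every extra synchronous hop lowers the ceiling on availability, which is one reason to move cross-service communication onto events.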
9
10
Pros: Isolation
Pros: Individually Scalable
11
Pros: Domain-ish Driven Design
12
Pros: Easier to onboard developers
13
Pros: Scales Horizontally
14
Cons: Still Tightly Coupled
15
Cons: Convoluted JSON Responses
16
Cons: Higher Latency
17
Cons: No Back Pressure
18
Cons: Complicated Integration Testing
19
Cons: No way to broadcast events to other teams
20
Cons: Data Duplication Between Services
21
More “Reactive”
● Better monitoring
● Decouple the services
● More concise event payloads
● Services hold their own state
● Backpressure
Fix all the things!!!
22
Considerations...
● Accept that Eventual consistency is inevitable
● Some services do too many things, some should be merged together!
● The APIs will give the latest known state
● Deal with the fact that duplicates will happen
● Did I mention better monitoring??
But… How? What? Um...
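One common way to "deal with duplicates" is to make the consumer idempotent. The sketch below is a minimal illustration, not the approach described in the talk: the event shape (`id`, `points`) and the in-memory set of seen ids are assumptions, and a real service would keep the seen ids in durable storage.

```python
class IdempotentHandler:
    """Applies each event at most once, keyed by a unique event id."""

    def __init__(self):
        self.seen_ids = set()  # assumption: stands in for durable storage
        self.total_points = 0

    def handle(self, event):
        """Return True if the event was applied, False if it was a duplicate."""
        if event["id"] in self.seen_ids:
            return False
        self.seen_ids.add(event["id"])
        self.total_points += event["points"]
        return True


handler = IdempotentHandler()
events = [
    {"id": "evt-1", "points": 5},
    {"id": "evt-2", "points": 3},
    {"id": "evt-1", "points": 5},  # redelivered duplicate
]
results = [handler.handle(e) for e in events]
print(results, handler.total_points)
```

The redelivered event is recognised and skipped, so the running total is unaffected by the duplicate.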
23
24
Think Kafka, but not :)
● “Real-time” streaming platform
● Multiple applications can publish to and consume from the same stream
● Geared toward workloads that can tolerate higher latency
● Messages are consumed in batches
● Elastic - easy to scale up and down
● Some interesting constraints (more on that soon!)
Kinesis
25
● Stream - An ordered sequence of data records, each stream has a unique name
● Data Record - Unit of data stored in a Stream. Composed of a Sequence number, Partition
Key and Data Blob.
● Partition key - Used to control distribution of records
● Sequence Number - Each record has a sequence number. Sequence numbers for the same
partition key generally increase over time (non-sequentially).
● SubSequence Number - When aggregating records, multiple records in the batch will
share a sequence number. In this case, a SubSequence Number is used in combination with it to
uniquely identify each record.
Key concepts
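How a partition key controls distribution can be sketched as follows. Kinesis hashes the partition key with MD5 to a 128-bit integer and routes the record to the shard whose hash key range contains that value. The even split of the key space across shards below is an assumption made for illustration (ranges can differ after resharding), and the function names are hypothetical.

```python
import hashlib

HASH_KEY_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer using MD5."""
    return int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)


def shard_index(partition_key: str, shard_count: int) -> int:
    """Pick a shard, assuming the hash key space is split evenly across shards."""
    range_size = HASH_KEY_SPACE // shard_count
    # min() keeps keys in the leftover tail of the space on the last shard.
    return min(hash_key(partition_key) // range_size, shard_count - 1)


for key in ["user-1", "user-2", "user-3"]:
    print(key, "->", shard_index(key, shard_count=4))
```

Because the mapping is deterministic, all records for one partition key land on the same shard, which is what preserves their relative order.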
26
Even more key concepts
● Shard - A group of data records in a
stream. A stream has one or more Shards.
A Shard is a unit of throughput capacity
and therefore determines the throughput
of the Stream
● Producer - Puts messages onto a Shard
● Consumer - Gets data records from one
or more Shards. If multiple consumers
share a name, they therefore share a
checkpoint position.
● Checkpointing - The per consumer
process of tracking the latest consumed
record.
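Checkpointing can be pictured as a small table keyed by consumer (application) name and shard. The sketch below is an in-memory stand-in with hypothetical names; it uses small integers for sequence numbers, whereas real Kinesis sequence numbers are much larger values.

```python
class CheckpointStore:
    """In-memory stand-in for a checkpoint table keyed by (application, shard)."""

    def __init__(self):
        self._positions = {}

    def checkpoint(self, app_name, shard_id, sequence_number):
        self._positions[(app_name, shard_id)] = sequence_number

    def last_position(self, app_name, shard_id):
        return self._positions.get((app_name, shard_id))


def unprocessed(records, last_sequence_number):
    """Records after the checkpoint; everything if there is no checkpoint yet."""
    if last_sequence_number is None:
        return list(records)
    return [r for r in records if r["seq"] > last_sequence_number]


store = CheckpointStore()
shard_records = [
    {"seq": 101, "data": "a"},
    {"seq": 105, "data": "b"},
    {"seq": 109, "data": "c"},
]

# Consumers sharing the name "journal-app" share this position.
store.checkpoint("journal-app", "shard-0", 105)
remaining = unprocessed(shard_records, store.last_position("journal-app", "shard-0"))

# A differently named application has no checkpoint of its own yet.
other = unprocessed(shard_records, store.last_position("reporting-app", "shard-0"))
print(len(remaining), len(other))
```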
27
Constraints
Wait.. it’s not all sunshine and roses?
● Data can be persisted in Kinesis for up to 7 days, with an initial default of 1 day.
● A Shard is a unit of throughput capacity
○ Reads - up to 5 transactions per second, with a maximum total data read rate of 2 MB
per second
○ Writes - up to 1,000 records per second, up to a maximum total data write rate of 1 MB
per second (including partition keys)
● When one application has multiple consumers, thereby sharing one checkpoint position, you
must have at least one shard per instance
○ Think of a database table which tracks the current progress, in which the primary key is a
combination of the application name and shard id
● You are charged on a per shard basis
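Since billing is per shard, it helps to estimate how many shards a workload needs. The sketch below applies the per-shard limits quoted above; the workload figures are made up, 1 MB is taken as 1024 KB, and the limit of 5 read transactions per second is ignored for simplicity.

```python
from math import ceil

# Per-shard limits as quoted above.
WRITE_RECORDS_PER_SEC = 1_000
WRITE_MB_PER_SEC = 1
READ_MB_PER_SEC = 2


def shards_needed(records_per_sec, avg_record_kb, consumer_apps=1):
    """Smallest shard count that satisfies the write and read limits."""
    write_mb = records_per_sec * avg_record_kb / 1024
    read_mb = write_mb * consumer_apps  # each application reads every record
    return max(
        ceil(records_per_sec / WRITE_RECORDS_PER_SEC),
        ceil(write_mb / WRITE_MB_PER_SEC),
        ceil(read_mb / READ_MB_PER_SEC),
        1,
    )


write_bound = shards_needed(records_per_sec=3_000, avg_record_kb=2, consumer_apps=1)
read_bound = shards_needed(records_per_sec=500, avg_record_kb=1, consumer_apps=6)
print(write_bound, read_bound)
```

In the first case write volume is the limiting factor; in the second, the number of consuming applications drives the shard count even though the write rate is low.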
28
Interfacing with Kinesis
Out of the box, Amazon provides two libraries for programmatically interfacing with Kinesis
● KPL - Kinesis Producer Library
● KCL - Kinesis Client Library
Both are available in Java and handle a number of low level concerns
● Stream connection and disconnection
● Enumeration of shards
● Parallel processing of the stream: consuming from and producing to a number of shards
● Shard worker allocation and reallocation, balancing shards across workers
● Batching and aggregation of records
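The idea of balancing shards across workers can be illustrated with a simple round-robin assignment. This is a simplified sketch with hypothetical names, not the KCL's actual allocation algorithm.

```python
def assign_shards(shard_ids, worker_ids):
    """Spread shards across workers round-robin so counts differ by at most one."""
    assignment = {worker: [] for worker in worker_ids}
    for i, shard in enumerate(sorted(shard_ids)):
        worker = worker_ids[i % len(worker_ids)]
        assignment[worker].append(shard)
    return assignment


shards = ["shard-0", "shard-1", "shard-2", "shard-3", "shard-4"]
before = assign_shards(shards, ["worker-a", "worker-b", "worker-c"])

# worker-c goes away, so its shards are redistributed to the remaining workers.
after = assign_shards(shards, ["worker-a", "worker-b"])
print(before)
print(after)
```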
29
So what’s lacking???
Nobody’s perfect, right?
● Java only, usage involves some interesting use of inheritance
● Asynchronous & non-blocking processing on the consumer
● Foolproof and non-blocking checkpointing
● Throttling to reduce memory footprint
● Smarter per message checkpointing
● Hard to prevent the driver code becoming tangled with your
business logic
30
Introducing...
The Weight Watchers Kinesis client
<Insert cool logo here>
Coming to a GitHub repo near you soon…
31
Producer
Scala & (optionally) Akka based producer
● Wraps the KPL driver
● Choice of Scala Future or Akka based interface
● Scala interface
○ Returns a Future for each message
○ Completes when send (batch) is successful
● Actor interface
○ Fire and forget or callback messages
○ Optional throttling to limit the number of unsent
messages and therefore Futures
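The throttling idea, capping the number of unsent messages and therefore outstanding Futures, can be sketched as below. This is not the Weight Watchers client's API: `ThrottledProducer` and `try_send` are hypothetical names, nothing is actually sent, and acknowledgement is simulated by completing the Future by hand.

```python
from concurrent.futures import Future


class ThrottledProducer:
    """Caps the number of sends that have been accepted but not yet confirmed."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.outstanding = []  # futures for sends still in flight

    def try_send(self, message):
        """Return a Future for the send, or None if the limit is reached."""
        # Forget futures that have already completed.
        self.outstanding = [f for f in self.outstanding if not f.done()]
        if len(self.outstanding) >= self.max_outstanding:
            return None
        future = Future()  # a real producer would hand the message to the driver here
        self.outstanding.append(future)
        return future


producer = ThrottledProducer(max_outstanding=2)
first = producer.try_send("a")
second = producer.try_send("b")
rejected = producer.try_send("c")  # limit reached, caller must back off
first.set_result("ok")             # simulate the batch being acknowledged
accepted = producer.try_send("c")  # a slot is free again
print(rejected is None, accepted is not None)
```

Bounding the number of in-flight Futures keeps memory use predictable when the stream is slower than the code producing messages.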
32
33
Consumer
Scala & Akka based consumer
● Wraps the KCL library
● Provides foolproof checkpointing
○ Allows message failures within a configurable threshold
● Messages sent for processing to provided Actor
● Configurable retries
● Asynchronous processing and checkpointing
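Configurable retries combined with a failure threshold might look like the following. It is a synchronous sketch with hypothetical names and default values, whereas the client described above processes messages asynchronously through actors.

```python
def process_batch(records, handler, max_retries=2, failure_tolerance=0.25):
    """Process a batch; return (should_checkpoint, failed_records).

    A record is retried up to max_retries times. The batch is checkpointed only
    if the fraction of records that still failed is within failure_tolerance.
    """
    failed = []
    for record in records:
        for attempt in range(max_retries + 1):
            try:
                handler(record)
                break
            except Exception:
                if attempt == max_retries:
                    failed.append(record)
    should_checkpoint = len(failed) <= failure_tolerance * len(records)
    return should_checkpoint, failed


attempts = {}


def flaky_handler(record):
    """Test handler: 'bad' always fails, 'flaky' fails on its first attempt only."""
    attempts[record] = attempts.get(record, 0) + 1
    if record == "bad":
        raise ValueError("cannot process")
    if record == "flaky" and attempts[record] < 2:
        raise ValueError("transient failure")


ok, failed = process_batch(["a", "flaky", "b", "bad"], flaky_handler)
strict_ok, _ = process_batch(["a", "bad"], flaky_handler, failure_tolerance=0.0)
print(ok, failed, strict_ok)
```

With a tolerance of 25%, one permanently failing record out of four still allows the checkpoint to advance; with zero tolerance the same failure blocks it.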
34
35
36
37
38
Scala Producer
39
Akka Producer
40
Consumer Event Processor
41
Consumer Instantiation
42
Performance
The performance scales reasonably well with the number of shards,
with consistent increases as each new shard is added.
1 Shard - 5,000,000 messages:
Records/sec: 42016
Seconds elapsed: 119
2 Shards - 5,000,000 messages:
Records/sec: 74626
Seconds elapsed: 67
5 Shards - 10,000,000 messages:
Records/sec: 140845
Seconds elapsed: 71
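The figures above can be turned into per-shard throughput to see how close the scaling is to linear. The sketch below only re-derives numbers from the reported message counts and elapsed times; note that the 5-shard run used twice as many messages, so the runs are not strictly like-for-like.

```python
# (shards, messages, seconds elapsed) as reported above
results = [
    (1, 5_000_000, 119),
    (2, 5_000_000, 67),
    (5, 10_000_000, 71),
]


def throughput(messages, seconds):
    """Whole records per second."""
    return messages // seconds


baseline = throughput(results[0][1], results[0][2])
summary = []
for shards, messages, seconds in results:
    total = throughput(messages, seconds)
    per_shard = total // shards
    efficiency = total / (baseline * shards)  # 1.0 would be perfectly linear
    summary.append((shards, total, per_shard, round(efficiency, 2)))
    print(f"{shards} shard(s): {total} records/sec, {per_shard} per shard, "
          f"{efficiency:.0%} of linear scaling")
```

Total throughput rises with every added shard, while throughput per shard falls as the shard count grows.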
43
Mark Harrison
@markglh
Justin Potter
We’re Hiring!!
www.weightwatchers.com/us/corporate-careers
Or email: Joanna.mark@weightwatchers.com

More Related Content

What's hot

Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Amazon Web Services
 
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
Amazon Web Services
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
Amazon Web Services
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talk
Danny Yuan
 
Aws Kinesis
Aws KinesisAws Kinesis
Aws Kinesis
Szilveszter Molnár
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
Amazon Web Services
 
Real time sentiment analysis using twitter stream api &amp; aws kinesis
Real time sentiment analysis using twitter stream api &amp; aws kinesisReal time sentiment analysis using twitter stream api &amp; aws kinesis
Real time sentiment analysis using twitter stream api &amp; aws kinesis
Armando Padilla
 
Symantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in actionSymantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in action
DataStax Academy
 
AWS Real-Time Event Processing
AWS Real-Time Event ProcessingAWS Real-Time Event Processing
AWS Real-Time Event Processing
Amazon Web Services
 
Webinar : Nouveautés de MongoDB 3.2
Webinar : Nouveautés de MongoDB 3.2Webinar : Nouveautés de MongoDB 3.2
Webinar : Nouveautés de MongoDB 3.2
MongoDB
 
AWS Community Nordics Virtual Meetup
AWS Community Nordics Virtual MeetupAWS Community Nordics Virtual Meetup
AWS Community Nordics Virtual Meetup
Anahit Pogosova
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Chris Fregly
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
Amazon Web Services
 
Capital One: Using Cassandra In Building A Reporting Platform
Capital One: Using Cassandra In Building A Reporting PlatformCapital One: Using Cassandra In Building A Reporting Platform
Capital One: Using Cassandra In Building A Reporting Platform
DataStax Academy
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
Robbie Strickland
 
Azure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User StoreAzure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User Store
DataStax Academy
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
DataStax Academy
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
Hakka Labs
 
Python Awareness for Exploration and Production Students and Professionals
Python Awareness for Exploration and Production Students and ProfessionalsPython Awareness for Exploration and Production Students and Professionals
Python Awareness for Exploration and Production Students and Professionals
Yohanes Nuwara
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
Amazon Web Services
 

What's hot (20)

Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
Real-time Streaming and Querying with Amazon Kinesis and Amazon Elastic MapRe...
 
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
(SDD405) Amazon Kinesis Deep Dive | AWS re:Invent 2014
 
Streaming data for real time analysis
Streaming data for real time analysisStreaming data for real time analysis
Streaming data for real time analysis
 
netflix-real-time-data-strata-talk
netflix-real-time-data-strata-talknetflix-real-time-data-strata-talk
netflix-real-time-data-strata-talk
 
Aws Kinesis
Aws KinesisAws Kinesis
Aws Kinesis
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
 
Real time sentiment analysis using twitter stream api &amp; aws kinesis
Real time sentiment analysis using twitter stream api &amp; aws kinesisReal time sentiment analysis using twitter stream api &amp; aws kinesis
Real time sentiment analysis using twitter stream api &amp; aws kinesis
 
Symantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in actionSymantec: Cassandra Data Modelling techniques in action
Symantec: Cassandra Data Modelling techniques in action
 
AWS Real-Time Event Processing
AWS Real-Time Event ProcessingAWS Real-Time Event Processing
AWS Real-Time Event Processing
 
Webinar : Nouveautés de MongoDB 3.2
Webinar : Nouveautés de MongoDB 3.2Webinar : Nouveautés de MongoDB 3.2
Webinar : Nouveautés de MongoDB 3.2
 
AWS Community Nordics Virtual Meetup
AWS Community Nordics Virtual MeetupAWS Community Nordics Virtual Meetup
AWS Community Nordics Virtual Meetup
 
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
 
Capital One: Using Cassandra In Building A Reporting Platform
Capital One: Using Cassandra In Building A Reporting PlatformCapital One: Using Cassandra In Building A Reporting Platform
Capital One: Using Cassandra In Building A Reporting Platform
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
 
Azure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User StoreAzure + DataStax Enterprise Powers Office 365 Per User Store
Azure + DataStax Enterprise Powers Office 365 Per User Store
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
 
Python Awareness for Exploration and Production Students and Professionals
Python Awareness for Exploration and Production Students and ProfessionalsPython Awareness for Exploration and Production Students and Professionals
Python Awareness for Exploration and Production Students and Professionals
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
 

Similar to Event driven architectures with Kinesis

Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
Allen (Xiaozhong) Wang
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
Steven Wu
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
confluent
 
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Monal Daxini
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free Friday
Otávio Carvalho
 
Apache KAfka
Apache KAfkaApache KAfka
Apache KAfka
Pedro Alcantara
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
confluent
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
Apache Apex
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
José Román Martín Gil
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple Spaces
Srinath Perera
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
Nicolas Brousse
 
Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018
Otávio Carvalho
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notifications
Sérgio Nunes
 
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
Anna Ossowski
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxy
confluent
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Hernan Costante
 
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layerC. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
Uni Systems S.M.S.A.
 
Collecting 600M events/day
Collecting 600M events/dayCollecting 600M events/day
Collecting 600M events/day
Lars Marius Garshol
 
kafka
kafkakafka
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Apache Apex
 

Similar to Event driven architectures with Kinesis (20)

Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
 
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free Friday
 
Apache KAfka
Apache KAfkaApache KAfka
Apache KAfka
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Architectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark StreamingArchitectual Comparison of Apache Apex and Spark Streaming
Architectual Comparison of Apache Apex and Spark Streaming
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
BISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple SpacesBISSA: Empowering Web gadget Communication with Tuple Spaces
BISSA: Empowering Web gadget Communication with Tuple Spaces
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018Non-Kafkaesque Apache Kafka - Yottabyte 2018
Non-Kafkaesque Apache Kafka - Yottabyte 2018
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notifications
 
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
 
Uber: Kafka Consumer Proxy
Uber: Kafka Consumer ProxyUber: Kafka Consumer Proxy
Uber: Kafka Consumer Proxy
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layerC. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
C. Sotiriou, Vodafone Greece: Adopting Quarkus for the digital experience layer
 
Collecting 600M events/day
Collecting 600M events/dayCollecting 600M events/day
Collecting 600M events/day
 
kafka
kafkakafka
kafka
 
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
 

Recently uploaded

Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
Energy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina JonuziEnergy consumption of Database Management - Florina Jonuzi
Energy consumption of Database Management - Florina Jonuzi
Green Software Development
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
pavan998932
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
mz5nrf0n
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
Philip Schwarz
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
DDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systemsDDS-Security 1.2 - What's New? Stronger security for long-running systems
DDS-Security 1.2 - What's New? Stronger security for long-running systems
Gerardo Pardo-Castellote
 

Recently uploaded (20)

Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
Event driven architectures with Kinesis

  • 2. Event driven architectures with Kinesis (Mark Harrison, Justin Potter)
  • 3. Agenda
      ● MONOLITH!
      ● Background
      ● Microservice spaghetti
      ● Microservice eventing
      ● Kinesis Overview
      ● (Soon to be) Open source Kinesis Driver
      ● Join Us
  • 4. Long ago in a …...
      The traditional Oracle-backed monolith architecture
      ● Tight and ever-increasing coupling
      ● Difficult to scale with users and features
      ● Difficult to maintain
      ● Difficult to onboard new developers
      ● Lacked modularity
  • 5. Background
      Journal (Tracking) - When a user enters a food, weight, or activity into Weight Watchers, it is sent to Journal.
      Program (Points Calculation) - When a user wishes to view their Weight Watchers points, a call is made to Program to calculate and retrieve their point allocation. Program depends on the Journal service for its food tracking.
  • 6. We needed something different!
      Microservices!!
      ● Scala, Akka, Play, Cassandra
      ● REST based services
      ● Each service represents a single domain concept
        ○ User Profile, Entitlements, Program …
  • 8. So… how’d that work out for you???
      It turns out magic bullets aren’t magic after all!!
      ● Features cross service boundaries, a LOT
      ● New features often increase requests between services
        ○ So one request now hits two services, that’s a 100% increase!
      ● Immediate consistency means reduced availability
        ○ I’m looking at you… REST
      ● Scaling out worked ok, just add more nodes!
      ● Broadcasting data to other teams results in a direct dependency
      ● Not enough emphasis on logging and monitoring
  • 20. Cons: No way to broadcast events to other teams
  • 22. Fix all the things!!!
      More “Reactive”
      ● Better monitoring
      ● Decouple the services
      ● More concise event payloads
      ● Services hold their own state
      ● Backpressure
  • 23. But… How? What? Um...
      Considerations...
      ● Accept that eventual consistency is inevitable
      ● Some services do too many things, some should be merged together!
      ● The APIs will give the latest known state
      ● Deal with the fact that duplicates will happen
      ● Did I mention better monitoring??
  • 25. Kinesis
      Think Kafka, but not :)
      ● “Real-time” streaming platform
      ● Multiple applications can publish to and consume from the same stream
      ● Geared towards higher latency workloads
      ● Messages are consumed in batches
      ● Elastic - easy to scale up and down
      ● Some interesting constraints (more on that soon!)
  • 26. Key concepts
      ● Stream - An ordered sequence of data records. Each stream has a unique name.
      ● Data Record - Unit of data stored in a Stream. Composed of a Sequence Number, Partition Key and Data Blob.
      ● Partition Key - Used to control distribution of records
      ● Sequence Number - Each record has a sequence number. Sequence numbers for the same partition key generally increase over time (non-sequentially).
      ● SubSequence Number - When aggregating records, multiple records in the batch will share a sequence number. In this instance, a SubSequence Number is used in combination with it to uniquely identify records.
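The record identity described above can be sketched as follows. This is a minimal illustration, not the Kinesis API: the `DataRecord` class and `drop_duplicates` helper are hypothetical names, and sequence numbers are modelled as plain integers for simplicity. It shows how the (Sequence Number, SubSequence Number) pair can act as a key for discarding redelivered records.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class DataRecord:
    """Hypothetical model of a data record (not an AWS SDK type)."""
    partition_key: str
    sequence_number: int      # shared by every record aggregated into one batch
    sub_sequence_number: int  # position of the record inside that batch
    data: bytes

    @property
    def record_id(self) -> Tuple[int, int]:
        # The pair is what distinguishes records that share a sequence number.
        return (self.sequence_number, self.sub_sequence_number)


def drop_duplicates(records: Iterable[DataRecord]) -> List[DataRecord]:
    """Keep the first occurrence of each record id, preserving order."""
    seen: Set[Tuple[int, int]] = set()
    unique: List[DataRecord] = []
    for record in records:
        if record.record_id not in seen:
            seen.add(record.record_id)
            unique.append(record)
    return unique
```

A consumer that may see the same record twice could pass each batch through `drop_duplicates`, or keep the `seen` set alive across batches if redelivery can span them.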
  • 27. Even more key concepts
      ● Shard - A group of data records in a stream. A stream has one or more Shards. A Shard is a unit of throughput capacity and therefore determines the throughput of the Stream.
      ● Producer - Puts messages onto a Shard
      ● Consumer - Gets data records from one or more Shards. If multiple consumers share a name, they share a checkpoint position.
      ● Checkpointing - The per consumer process of tracking the latest consumed record.
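To make the link between partition keys and shards concrete: Kinesis maps each partition key to a 128-bit integer with MD5 and places the record on the shard whose hash key range contains that value. The sketch below assumes the hash space is split evenly between shards and that the digest is read big-endian; `even_hash_ranges` and `shard_for` are illustrative names, not SDK functions.

```python
import hashlib
from typing import List, Tuple

HASH_SPACE = 2 ** 128  # an MD5 digest is a 128-bit integer


def even_hash_ranges(shard_count: int) -> List[Tuple[int, int]]:
    """Split the hash space into contiguous inclusive ranges, one per shard.

    Assumption: shards own equal slices. Real streams can have uneven
    ranges after shards are split or merged.
    """
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the shard whose range contains the key's hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")  # byte order is an assumption
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key falls outside every shard range")
```

Because the mapping depends only on the key, every record for one user (for example, a partition key of `"user-42"`) lands on the same shard, which is what preserves per-key ordering.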
  • 28. Constraints
      Wait.. it’s not all sunshine and roses?
      ● Data can be persisted in Kinesis for up to 7 days, with an initial default of 1 day.
      ● A Shard is a unit of throughput capacity
        ○ Reads - up to 5 transactions per second, with a maximum total data read rate of 2 MB per second
        ○ Writes - up to 1,000 records per second, up to a maximum total data write rate of 1 MB per second (including partition keys)
      ● When one application has multiple consumers, thereby sharing one checkpoint position, you must have at least one shard per instance
        ○ Think of a database table which tracks the current progress, in which the primary key is a combination of the application name and shard id
      ● You are charged on a per shard basis
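Since billing is per shard and each shard has fixed limits, a rough shard count can be estimated from the figures on this slide. The helper below is a back-of-the-envelope sketch: `shards_needed` and its parameters are hypothetical, it assumes traffic is spread evenly across shards, and it ignores the 5 read transactions per second limit.

```python
import math

# Per-shard limits as quoted on the slide.
WRITE_RECORDS_PER_SEC = 1000
WRITE_MB_PER_SEC = 1.0
READ_MB_PER_SEC = 2.0


def shards_needed(records_per_sec: float,
                  write_mb_per_sec: float,
                  read_mb_per_sec: float,
                  consumer_instances: int = 1) -> int:
    """Smallest shard count that satisfies every limit at once.

    Assumes an even spread of partition keys; a hot key can still
    overload a single shard.
    """
    by_record_rate = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    by_write_bytes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC)
    by_read_bytes = math.ceil(read_mb_per_sec / READ_MB_PER_SEC)
    # At least one shard per consumer instance of the same application.
    return max(1, by_record_rate, by_write_bytes, by_read_bytes,
               consumer_instances)
```

For example, 2,500 small records per second needs 3 shards because of the record-rate limit, while 6 consumer instances of one application need 6 shards even if the traffic would fit in one.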
  • 29. Interfacing with Kinesis
      Out of the box, Amazon provides two libraries for programmatically interfacing with Kinesis
      ● KPL - Kinesis Producer Library
      ● KCL - Kinesis Client Library
      Both are available in Java and handle a number of low level concerns
      ● Stream connection and disconnection
      ● Enumeration of shards
      ● Parallel processing of the stream: consuming from and producing to a number of shards
      ● Shard worker allocation and reallocation, balancing shards across workers
      ● Batching and aggregation of records
  • 30. So what’s lacking???
      Nobody’s perfect, right?
      ● Java only, usage involves some interesting use of inheritance
      ● Asynchronous & non-blocking processing on the consumer
      ● Foolproof and non-blocking checkpointing
      ● Throttling to reduce memory footprint
      ● Smarter per message checkpointing
      ● Hard to prevent the driver code from becoming tangled with your business logic
  • 31. Introducing...
      The Weight Watchers Kinesis client
      <Insert cool logo here>
      Coming to a GitHub repo near you soon…..
  • 32. Producer
      Scala & (optionally) Akka based producer
      ● Wraps the KPL driver
      ● Choice of Scala Future or Akka based interface
      ● Scala interface
        ○ Returns a Future for each message
        ○ Completes when send (batch) is successful
      ● Actor interface
        ○ Fire and forget or callback messages
        ○ Optional throttling to limit the number of unsent messages and therefore Futures
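The idea of returning a future per message while capping the number of unsent messages can be sketched in a few lines. This is not the Weight Watchers client or the KPL: `ThrottledProducer`, `send`, and `max_outstanding` are hypothetical names, the sketch uses Python threads rather than Scala Futures or Akka actors, and the `send` callable stands in for whatever actually puts a record on the stream.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class ThrottledProducer:
    """Illustrative producer: one Future per message, bounded backlog."""

    def __init__(self, send: Callable[[bytes], None],
                 max_outstanding: int, workers: int = 4) -> None:
        self._send = send  # stand-in for the real driver call
        self._slots = threading.BoundedSemaphore(max_outstanding)
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def publish(self, payload: bytes) -> Future:
        # Blocks the caller once max_outstanding sends are unfinished,
        # which bounds the number of live Futures and buffered payloads.
        self._slots.acquire()
        try:
            future = self._pool.submit(self._send, payload)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot when the send finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def close(self) -> None:
        # Wait for every submitted send to finish before returning.
        self._pool.shutdown(wait=True)
```

Blocking the caller is the simplest form of throttling; an actor-based design would more likely reject or stash messages instead of blocking a thread.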
  • 34. Consumer
      Scala & Akka based consumer
      ● Wraps the KCL library
      ● Provides foolproof checkpointing
        ○ Allows message failures within a configurable threshold
      ● Messages sent for processing to provided Actor
      ● Configurable retries
      ● Asynchronous processing and checkpointing
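One way to read "checkpointing with a failure threshold" is: retry each message a configurable number of times, then advance the checkpoint only if the number of failed messages in the batch stays within the threshold. The sketch below follows that reading. It is an assumption about the behaviour, not the client's actual implementation; `CheckpointStore` and `process_batch` are hypothetical names, and the store mirrors the "table keyed by application name and shard id" analogy from the Constraints slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class CheckpointStore:
    """In-memory stand-in for a table keyed by (application name, shard id)."""
    positions: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def save(self, app: str, shard_id: str, sequence_number: int) -> None:
        self.positions[(app, shard_id)] = sequence_number

    def load(self, app: str, shard_id: str) -> Optional[int]:
        return self.positions.get((app, shard_id))


def process_batch(app: str, shard_id: str,
                  batch: List[Tuple[int, bytes]],
                  handler: Callable[[bytes], None],
                  store: CheckpointStore,
                  max_failures: int, retries: int = 0) -> bool:
    """Process (sequence_number, payload) pairs; checkpoint if failures are tolerable."""
    failures = 0
    for _sequence_number, payload in batch:
        succeeded = False
        for _attempt in range(retries + 1):
            try:
                handler(payload)
                succeeded = True
                break
            except Exception:
                continue  # try again until the retries are exhausted
        if not succeeded:
            failures += 1
    if batch and failures <= max_failures:
        # Advance to the last record of the batch; skipped failures are not replayed.
        store.save(app, shard_id, batch[-1][0])
        return True
    return False  # checkpoint untouched, so the batch will be seen again
```

Returning `False` leaves the checkpoint where it was, so a restart replays the batch; this is one reason consumers have to tolerate duplicates.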
  • 43. Performance
      The performance scales reasonably well with the number of shards, with consistent increases as each new shard is added.
      ● 1 Shard - 5,000,000 messages: Records/sec: 42016, Seconds elapsed: 119
      ● 2 Shards - 5,000,000 messages: Records/sec: 74626, Seconds elapsed: 67
      ● 5 Shards - 10,000,000 messages: Records/sec: 140845, Seconds elapsed: 71
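The quoted rates can be reproduced from the message counts and elapsed times, which also makes the scaling easier to judge. The snippet below only re-derives numbers from the slide; the `RUNS` table and `summarize` helper are illustrative names. Dividing each rate by the shard count gives roughly 42,016, 37,313 and 28,169 records/sec per shard, so the 2-shard and 5-shard runs reach about 89% and 67% of the single-shard per-shard rate.

```python
from typing import List, Tuple

# (shards, messages, seconds elapsed) as reported on the slide.
RUNS: List[Tuple[int, int, int]] = [
    (1, 5_000_000, 119),
    (2, 5_000_000, 67),
    (5, 10_000_000, 71),
]


def summarize(runs: List[Tuple[int, int, int]]) -> List[Tuple[int, int, int, float]]:
    """Return (shards, records/sec, records/sec per shard, efficiency vs first run)."""
    rows = []
    baseline_per_shard = None
    for shards, messages, seconds in runs:
        rate = messages // seconds   # truncated, matching the slide's figures
        per_shard = rate // shards
        if baseline_per_shard is None:
            baseline_per_shard = per_shard
        efficiency = round(per_shard / baseline_per_shard, 2)
        rows.append((shards, rate, per_shard, efficiency))
    return rows


for shards, rate, per_shard, efficiency in summarize(RUNS):
    print(f"{shards} shard(s): {rate} records/sec, "
          f"{per_shard} per shard, efficiency {efficiency}")
```

Total throughput grows with every added shard, but less than linearly, which is consistent with the slide's "reasonably well".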
  • 44. We’re Hiring!!
      Mark Harrison (@markglh), Justin Potter
      www.weightwatchers.com/us/corporate-careers
      Or email: Joanna.mark@weightwatchers.com