SlideShare a Scribd company logo
With Akka Streams & Kafka
Intro & Agenda Crawler Intro & Problem
Statements
Crawler Architecture
Infrastructure: Akka Streams,
Kafka, etc.
The Goodies
Crawl
Jobs
Job DB
Validate
URL
Cache
Downloa
d
Process
URLs
URLs
Timestamps
High-Level View
Requirements Ever-expanding # of URLs
Can’t crawl all URLs at once
Control over concurrent web GETs
Efficient resource usage
Resilient under high burst
Scales horizontally & vertically
Sizing the Crawl Job
Let:
i = Number of seed URLs in a job
n = Average number of links per page
d = The crawl depth
(how many layers to follow links)
u = The max number of URLs to process
Then:
u = ind
1.00E+00
1.00E+01
1.00E+02
1.00E+03
1.00E+04
1.00E+05
1.00E+06
1.00E+07
0 2 4 6 8 10 12
totalURLs vs depth
depth (initialURLs = 1, outLinks = 5)
1.00E+00
1.00E+01
1.00E+02
1.00E+03
1.00E+04
1.00E+05
1.00E+06
1.00E+07
1.00E+08
1.00E+09
1.00E+10
1.00E+11
1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07
totalURLs vs initialURLs
initialURLs (depth = 5, outLinks = 5)
The Reactive Manifesto
Responsive
Message Driven
Elastic Resilient
Why Does it Matter?
Respond in a deterministic, timely manner
Stays responsive in the face of failure – even cascading failures
Stays responsive under workload spikes
Basic building block for responsive, resilient, and elastic systems
Responsive
Resilient
Elastic
Message Driven
The Right Ingredients
• Kafka
• Huge persistent buffer for the bursts
• Load distribution to very large number of
processing nodes
• Enable horizontal scalability
• Akka streams
• High performance, highly efficient processing
pipeline
• Resilient with end-to-end back-pressure
• Fully asynchronous – utilizes mapAsyncUnordered
with Async HTTP client
• Async HTTP client
• Non-blocking and consumes no threads in waiting
• Integrates with Akka Streams for a high
parallelism, low resource solution
Efficient
Resilient
Scale
Akka
Stream
Async
HTTP
Reactive
Kafka
Crawl
Jobs
Job DB
Validate
URL
Cache
Downloa
d
Process
URLs
URLs
Timestamps
Adding Kafka & Akka Streams
URLs
Akka Streams
Akka Streams,
what???
High performance, pure async,
stream processing
Conforms to reactive streams
Simple, yet powerful GraphDSL
allows clear stream topology
declaration
Central point to understand
processing pipeline
Crawl Stream
Actual Stream Declaration in Code
prioritizeSource ~> crawlerFlow ~> bCast0 ~> result ~> bCast ~> outLinksFlow ~> outLinksSink
bCast ~> dataSinkFlow ~> kafkaDataSink
bCast ~> hdfsDataSink
bCast ~> graphFlow ~> merge ~> graphSink
bCast0 ~> maxPage ~> merge
bCast0 ~> retry ~> bCastRetry ~> retryFailed ~> merge
bCastRetry ~> errorSink
Prioritized
Source
Crawl
Result
MaxPageReached
Retry
OutLinks
Data
Graph
CheckFail
CheckErr
OutLinks
Sink
Kafka Data
Sink
HDFS Data
Sink
Graph
Sink
Error
Sink
Resulting Characteristics
Efficient
• Low thread count, controlled by Akka and pure non-blocking async HTTP
• High latency URLs do not block low latency URLs using MapAsyncUnordered
• Well-controlled download concurrency using MapAsyncUnordered
• Thread per concurrent crawl job
Resilient
• Processes only what can be processed – no resource overload
• Kafka as short-term, persistent queue
Scale
• Kafka feeds next batch of URLs to available client
• Pull model – only processes that have capacity will get the load
• Kafka distributes work to large number of processing nodes in cluster
Back-Pressure
0
20000
40000
60000
80000
100000
120000
0 100 200 300 400 500 600 700
Queue Size
Time (seconds)
0
200
400
URLs/sec
Time (seconds)
seedURLs : 100
parallelism : 1000
processTime : 1 – 5
s
outLinks : 0 - 10
depth : 5
totalCrawled :
312500
Challenges
Training
• Developers not used to E2E stream
definitions
• More familiar with deeply nested function
calls
Maturity of Infrastructure
• Kafka 0.9 use fetch as heartbeat
• Slow nodes cause timeout & rebalance
• Solved in 0.10.0.1
What it would
have been…
Bloated, ineffective concurrency
control
Lack of well-thought-out and visible
processing pipeline
Clumsy code, hard to manage &
understand
Low training cost, high project TCO
Dev / Support / Maintenance
Bottom Line
Standardized Reactive Platform
Efficiency & Resilience meets Standardization
• Monitoring
• Need to collect metrics, consistently
• Logging
• Correlation across services
• Uniformity in logs
• Security
• Need to apply standard security configuration
• Environment Resolution
• Staging, production, etc.
Consistency in the face of Heterogeneity
squbs is not… A framework by its own
A programming model – use Akka
Take all or none –
Components/patterns can mostly be
used independently
squbs
Akka for large
scale deployments
Bootstrap
Lifecycle management
Loosely-coupled module system
Integration hooks for logging,
monitoring, ops integration
squbs
Akka for large
scale deployments
JSON console
HttpClient with pluggable resolver and
monitoring/logging hooks
Test tools and interfaces
Goodies:
- Activators for Scala & Java
- Programming patterns and helpers for
Akka and Akka Stream Use cases…,
and growing
Performance Akka is principally designed for
great performance & scalability
Actor scheduling, dispatcher,
message batching
• throughput parameter
The job for squbs is adding ops
functionality without impacting
performance
squbs
Performance
for Akka Http
Pipeline (auth, logging, etc.) built as
Akka Streams BidiFlow
Pipeline supplied by infra or app
App code supplies Flow or Route
Pipeline and app flow baked into
single Request/Response Flow
Fully utilizes Akka-Stream fusing
Zero Overhead for Given Functionality
Application
Performance
Tips
Kafka – great firehose, with a bit
latency
Only convert byte ↔ char when
absolutely needed
Work with ByteString if you can
Need better facilities (ByteString
JSON parser, etc.)
Remember : App has biggest impact on performance, not tuning
Akka
Performance
Tuning
Parallelism Factor: 1.0 is optimal
• Minimizes context switches
Use other dispatcher for blocking
Test your throughput setting
Try and test different GC settings
The Goodies
PerpetualStream
• Provides a convenience trait to help
write streams controlled by system
lifecycle
• Minimal/no message losses
• Register PerpetualStream to make
stream start/stop
• Provides customization hooks –
especially for how to stop the stream
• Provides killSwitch (from Akka) to be
embedded into stream
• Implementers - just provide your
stream!
A non-stop stream; starts and stops with the system
class MyStream extends PerpetualStream[Future[Int]] {
def generator = Iterator.iterate(0) { p =>
if (p == Int.MaxValue) 0 else p + 1
}
val source = Source.fromIterator(generator _)
val ignoreSink = Sink.ignore[Int]
override def streamGraph = RunnableGraph.fromGraph(
GraphDSL.create(ignoreSink) { implicit builder =>
sink =>
import GraphDSL.Implicits._
source ~> killSwitch.flow[Int] ~> sink
ClosedShape
})
}
PersistentBuffer/BroadcastBuffer
• Data & indexes in rotating memory-mapped files
• Off-heap rotating file buffer – very large buffers
• Restarts gracefully with no or minimal message loss
• Not as durable as a remote data store, but much faster
• Does not back-pressure upstream beyond data/index writes
• Similar usage to Buffer and Broadcast
• BroadcastBuffer – a FanOutShape decouples each output port making each downstream
independent
• Useful if downstream stage blocked or unavailable
• Kafka is unavailable/rebalancing but system cannot backpressure/deny incoming
traffic
• Optional commit stage for at-least-once delivery semantics
• Implementation based on Chronicle Queue
A buffer of virtually unlimited size
Summary
• Kafka + Akka Streams + Async I/O = Ideal Architecture for High Bursts
& High Efficiency
• Akka Streams
• Clear view of stream topology
• Back-pressure & Kafka allows buffering load bursts
• Standardization
• Walk like a duck, quack like a duck, and manage it like a duck
• squbs: Have the cake, and eat it too
• Functionality without sacrificing performance
• Goodies like PerpetualStream, PersistentBuffer, & BroadcastBuffer
Q&A – Feedback Appreciated
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And Kafka

More Related Content

What's hot

Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Lightbend
 
A Tale of Two APIs: Using Spark Streaming In Production
A Tale of Two APIs: Using Spark Streaming In ProductionA Tale of Two APIs: Using Spark Streaming In Production
A Tale of Two APIs: Using Spark Streaming In Production
Lightbend
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Helena Edelson
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
DataStax Academy
 
Developing a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and SprayDeveloping a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and Spray
Jacob Park
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Legacy Typesafe (now Lightbend)
 
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Lightbend
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Amazon Web Services
 
Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stack
Anirvan Chakraborty
 
How to manage large amounts of data with akka streams
How to manage large amounts of data with akka streamsHow to manage large amounts of data with akka streams
How to manage large amounts of data with akka streams
Igor Mielientiev
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka Streams
Lightbend
 
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming EngineMoving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
Lightbend
 
Do's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in productionDo's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in production
jglobal
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Helena Edelson
 
Scaling Twitter with Cassandra
Scaling Twitter with CassandraScaling Twitter with Cassandra
Scaling Twitter with Cassandra
Ryan King
 
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the JobAkka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Lightbend
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive Streams
Lightbend
 
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Amazon Web Services
 
Akka Streams - From Zero to Kafka
Akka Streams - From Zero to KafkaAkka Streams - From Zero to Kafka
Akka Streams - From Zero to Kafka
Mark Harrison
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
confluent
 

What's hot (20)

Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
Akka Revealed: A JVM Architect's Journey From Resilient Actors To Scalable Cl...
 
A Tale of Two APIs: Using Spark Streaming In Production
A Tale of Two APIs: Using Spark Streaming In ProductionA Tale of Two APIs: Using Spark Streaming In Production
A Tale of Two APIs: Using Spark Streaming In Production
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
 
Developing a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and SprayDeveloping a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and Spray
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
 
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stack
 
How to manage large amounts of data with akka streams
How to manage large amounts of data with akka streamsHow to manage large amounts of data with akka streams
How to manage large amounts of data with akka streams
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka Streams
 
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming EngineMoving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
Moving from Big Data to Fast Data? Here's How To Pick The Right Streaming Engine
 
Do's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in productionDo's and don'ts when deploying akka in production
Do's and don'ts when deploying akka in production
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Scaling Twitter with Cassandra
Scaling Twitter with CassandraScaling Twitter with Cassandra
Scaling Twitter with Cassandra
 
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the JobAkka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive Streams
 
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
Maximizing Audience Engagement in Media Delivery (MED303) | AWS re:Invent 2013
 
Akka Streams - From Zero to Kafka
Akka Streams - From Zero to KafkaAkka Streams - From Zero to Kafka
Akka Streams - From Zero to Kafka
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 

Viewers also liked

Velocity2014 gvp
Velocity2014 gvpVelocity2014 gvp
Velocity2014 gvp
Almudena Vivanco
 
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
Lightbend
 
Akka streams kafka kinesis
Akka streams kafka kinesisAkka streams kafka kinesis
Akka streams kafka kinesis
Peter Vandenabeele
 
What's The Role Of Machine Learning In Fast Data And Streaming Applications?
What's The Role Of Machine Learning In Fast Data And Streaming Applications?What's The Role Of Machine Learning In Fast Data And Streaming Applications?
What's The Role Of Machine Learning In Fast Data And Streaming Applications?
Lightbend
 
Reactive integrations with Akka Streams
Reactive integrations with Akka StreamsReactive integrations with Akka Streams
Reactive integrations with Akka Streams
Konrad Malawski
 
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Lightbend
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
Konrad Malawski
 

Viewers also liked (7)

Velocity2014 gvp
Velocity2014 gvpVelocity2014 gvp
Velocity2014 gvp
 
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
Benefits Of The Actor Model For Cloud Computing: A Pragmatic Overview For Jav...
 
Akka streams kafka kinesis
Akka streams kafka kinesisAkka streams kafka kinesis
Akka streams kafka kinesis
 
What's The Role Of Machine Learning In Fast Data And Streaming Applications?
What's The Role Of Machine Learning In Fast Data And Streaming Applications?What's The Role Of Machine Learning In Fast Data And Streaming Applications?
What's The Role Of Machine Learning In Fast Data And Streaming Applications?
 
Reactive integrations with Akka Streams
Reactive integrations with Akka StreamsReactive integrations with Akka Streams
Reactive integrations with Akka Streams
 
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
 
Reactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka StreamsReactive Stream Processing with Akka Streams
Reactive Stream Processing with Akka Streams
 

Similar to Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And Kafka

Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Reactivesummit
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Guido Schmutz
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
Prateek Maheshwari
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Guido Schmutz
 
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Dan Halperin
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)
Yuval Itzchakov
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
confluent
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Jug - ecosystem
Jug -  ecosystemJug -  ecosystem
Jug - ecosystem
Florent Ramiere
 
Chti jug - 2018-06-26
Chti jug - 2018-06-26Chti jug - 2018-06-26
Chti jug - 2018-06-26
Florent Ramiere
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStack
Chiradeep Vittal
 
Scalable Stream Processing with Apache Samza
Scalable Stream Processing with Apache SamzaScalable Stream Processing with Apache Samza
Scalable Stream Processing with Apache Samza
Prateek Maheshwari
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
harendra_pathak
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
Monal Daxini
 
xPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, TachyonxPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, Tachyon
Claudiu Barbura
 
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
confluent
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
Asis Mohanty
 
Using the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data ProductUsing the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data Product
Evans Ye
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
confluent
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Legacy Typesafe (now Lightbend)
 

Similar to Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And Kafka (20)

Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel...
 
Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)Spark Streaming @ Scale (Clicktale)
Spark Streaming @ Scale (Clicktale)
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 
Jug - ecosystem
Jug -  ecosystemJug -  ecosystem
Jug - ecosystem
 
Chti jug - 2018-06-26
Chti jug - 2018-06-26Chti jug - 2018-06-26
Chti jug - 2018-06-26
 
StackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStackStackMate - CloudFormation for CloudStack
StackMate - CloudFormation for CloudStack
 
Scalable Stream Processing with Apache Samza
Scalable Stream Processing with Apache SamzaScalable Stream Processing with Apache Samza
Scalable Stream Processing with Apache Samza
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
 
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
AWS Re-Invent 2017 Netflix Keystone SPaaS - Monal Daxini - Abd320 2017
 
xPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, TachyonxPatterns on Spark, Shark, Mesos, Tachyon
xPatterns on Spark, Shark, Mesos, Tachyon
 
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 
Using the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data ProductUsing the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data Product
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
 

More from Lightbend

IoT 'Megaservices' - High Throughput Microservices with Akka
IoT 'Megaservices' - High Throughput Microservices with AkkaIoT 'Megaservices' - High Throughput Microservices with Akka
IoT 'Megaservices' - High Throughput Microservices with Akka
Lightbend
 
How Akka Cluster Works: Actors Living in a Cluster
How Akka Cluster Works: Actors Living in a ClusterHow Akka Cluster Works: Actors Living in a Cluster
How Akka Cluster Works: Actors Living in a Cluster
Lightbend
 
The Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsThe Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native Applications
Lightbend
 
Putting the 'I' in IoT - Building Digital Twins with Akka Microservices
Putting the 'I' in IoT - Building Digital Twins with Akka MicroservicesPutting the 'I' in IoT - Building Digital Twins with Akka Microservices
Putting the 'I' in IoT - Building Digital Twins with Akka Microservices
Lightbend
 
Akka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsAkka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed Applications
Lightbend
 
Digital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesDigital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and Microservices
Lightbend
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Lightbend
 
Cloudstate - Towards Stateful Serverless
Cloudstate - Towards Stateful ServerlessCloudstate - Towards Stateful Serverless
Cloudstate - Towards Stateful Serverless
Lightbend
 
Digital Transformation from Monoliths to Microservices to Serverless and Beyond
Digital Transformation from Monoliths to Microservices to Serverless and BeyondDigital Transformation from Monoliths to Microservices to Serverless and Beyond
Digital Transformation from Monoliths to Microservices to Serverless and Beyond
Lightbend
 
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
Lightbend
 
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
Lightbend
 
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
Lightbend
 
Microservices, Kubernetes, and Application Modernization Done Right
Microservices, Kubernetes, and Application Modernization Done RightMicroservices, Kubernetes, and Application Modernization Done Right
Microservices, Kubernetes, and Application Modernization Done Right
Lightbend
 
Full Stack Reactive In Practice
Full Stack Reactive In PracticeFull Stack Reactive In Practice
Full Stack Reactive In Practice
Lightbend
 
Akka and Kubernetes: A Symbiotic Love Story
Akka and Kubernetes: A Symbiotic Love StoryAkka and Kubernetes: A Symbiotic Love Story
Akka and Kubernetes: A Symbiotic Love Story
Lightbend
 
Scala 3 Is Coming: Martin Odersky Shares What To Know
Scala 3 Is Coming: Martin Odersky Shares What To KnowScala 3 Is Coming: Martin Odersky Shares What To Know
Scala 3 Is Coming: Martin Odersky Shares What To Know
Lightbend
 
Migrating From Java EE To Cloud-Native Reactive Systems
Migrating From Java EE To Cloud-Native Reactive SystemsMigrating From Java EE To Cloud-Native Reactive Systems
Migrating From Java EE To Cloud-Native Reactive Systems
Lightbend
 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Lightbend
 
Designing Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native WorldDesigning Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native World
Lightbend
 
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For ScalaScala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Lightbend
 

More from Lightbend (20)

IoT 'Megaservices' - High Throughput Microservices with Akka
IoT 'Megaservices' - High Throughput Microservices with AkkaIoT 'Megaservices' - High Throughput Microservices with Akka
IoT 'Megaservices' - High Throughput Microservices with Akka
 
How Akka Cluster Works: Actors Living in a Cluster
How Akka Cluster Works: Actors Living in a ClusterHow Akka Cluster Works: Actors Living in a Cluster
How Akka Cluster Works: Actors Living in a Cluster
 
The Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsThe Reactive Principles: Eight Tenets For Building Cloud Native Applications
The Reactive Principles: Eight Tenets For Building Cloud Native Applications
 
Putting the 'I' in IoT - Building Digital Twins with Akka Microservices
Putting the 'I' in IoT - Building Digital Twins with Akka MicroservicesPutting the 'I' in IoT - Building Digital Twins with Akka Microservices
Putting the 'I' in IoT - Building Digital Twins with Akka Microservices
 
Akka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed ApplicationsAkka at Enterprise Scale: Performance Tuning Distributed Applications
Akka at Enterprise Scale: Performance Tuning Distributed Applications
 
Digital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and MicroservicesDigital Transformation with Kubernetes, Containers, and Microservices
Digital Transformation with Kubernetes, Containers, and Microservices
 
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on KubernetesDetecting Real-Time Financial Fraud with Cloudflow on Kubernetes
Detecting Real-Time Financial Fraud with Cloudflow on Kubernetes
 
Cloudstate - Towards Stateful Serverless
Cloudstate - Towards Stateful ServerlessCloudstate - Towards Stateful Serverless
Cloudstate - Towards Stateful Serverless
 
Digital Transformation from Monoliths to Microservices to Serverless and Beyond
Digital Transformation from Monoliths to Microservices to Serverless and BeyondDigital Transformation from Monoliths to Microservices to Serverless and Beyond
Digital Transformation from Monoliths to Microservices to Serverless and Beyond
 
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
Akka Anti-Patterns, Goodbye: Six Features of Akka 2.6
 
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
Lessons From HPE: From Batch To Streaming For 20 Billion Sensors With Lightbe...
 
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
 
Microservices, Kubernetes, and Application Modernization Done Right
Microservices, Kubernetes, and Application Modernization Done RightMicroservices, Kubernetes, and Application Modernization Done Right
Microservices, Kubernetes, and Application Modernization Done Right
 
Full Stack Reactive In Practice
Full Stack Reactive In PracticeFull Stack Reactive In Practice
Full Stack Reactive In Practice
 
Akka and Kubernetes: A Symbiotic Love Story
Akka and Kubernetes: A Symbiotic Love StoryAkka and Kubernetes: A Symbiotic Love Story
Akka and Kubernetes: A Symbiotic Love Story
 
Scala 3 Is Coming: Martin Odersky Shares What To Know
Scala 3 Is Coming: Martin Odersky Shares What To KnowScala 3 Is Coming: Martin Odersky Shares What To Know
Scala 3 Is Coming: Martin Odersky Shares What To Know
 
Migrating From Java EE To Cloud-Native Reactive Systems
Migrating From Java EE To Cloud-Native Reactive SystemsMigrating From Java EE To Cloud-Native Reactive Systems
Migrating From Java EE To Cloud-Native Reactive Systems
 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
 
Designing Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native WorldDesigning Events-First Microservices For A Cloud Native World
Designing Events-First Microservices For A Cloud Native World
 
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For ScalaScala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
Scala Security: Eliminate 200+ Code-Level Threats With Fortify SCA For Scala
 

Recently uploaded

TheFutureIsDynamic-BoxLang-CFCamp2024.pdf
TheFutureIsDynamic-BoxLang-CFCamp2024.pdfTheFutureIsDynamic-BoxLang-CFCamp2024.pdf
TheFutureIsDynamic-BoxLang-CFCamp2024.pdf
Ortus Solutions, Corp
 
Microsoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptxMicrosoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptx
jrodriguezq3110
 
Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)
alowpalsadig
 
Going AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applicationsGoing AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applications
Alina Yurenko
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid
 
Boost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management AppsBoost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management Apps
Jhone kinadey
 
Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)
wonyong hwang
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
VictoriaMetrics
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
chandangoswami40933
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
KrishnaveniMohan1
 
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data PlatformAlluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio, Inc.
 
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Ortus Solutions, Corp
 
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
manji sharman06
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
Tier1 app
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
Upturn India Technologies - Web development company in Nashik
Upturn India Technologies - Web development company in NashikUpturn India Technologies - Web development company in Nashik
Upturn India Technologies - Web development company in Nashik
Upturn India Technologies
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
OnePlan Solutions
 

Recently uploaded (20)

TheFutureIsDynamic-BoxLang-CFCamp2024.pdf
TheFutureIsDynamic-BoxLang-CFCamp2024.pdfTheFutureIsDynamic-BoxLang-CFCamp2024.pdf
TheFutureIsDynamic-BoxLang-CFCamp2024.pdf
 
Microsoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptxMicrosoft-Power-Platform-Adoption-Planning.pptx
Microsoft-Power-Platform-Adoption-Planning.pptx
 
Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)Photoshop Tutorial for Beginners (2024 Edition)
Photoshop Tutorial for Beginners (2024 Edition)
 
Going AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applicationsGoing AOT: Everything you need to know about GraalVM for Java applications
Going AOT: Everything you need to know about GraalVM for Java applications
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Boost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management AppsBoost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management Apps
 
Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)Hyperledger Besu 빨리 따라하기 (Private Networks)
Hyperledger Besu 빨리 따라하기 (Private Networks)
 
What’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 UpdateWhat’s New in VictoriaLogs - Q2 2024 Update
What’s New in VictoriaLogs - Q2 2024 Update
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
 
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data PlatformAlluxio Webinar | 10x Faster Trino Queries on Your Data Platform
Alluxio Webinar | 10x Faster Trino Queries on Your Data Platform
 
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...
 
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
Call Girls Bangalore🔥7023059433🔥Best Profile Escorts in Bangalore Available 24/7
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
 
Upturn India Technologies - Web development company in Nashik
Upturn India Technologies - Web development company in NashikUpturn India Technologies - Web development company in Nashik
Upturn India Technologies - Web development company in Nashik
 
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical OperationsEnsuring Efficiency and Speed with Practical Solutions for Clinical Operations
Ensuring Efficiency and Speed with Practical Solutions for Clinical Operations
 

Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And Kafka

  • 2. Intro & Agenda Crawler Intro & Problem Statements Crawler Architecture Infrastructure: Akka Streams, Kafka, etc. The Goodies
  • 4. Requirements Ever-expanding # of URLs Can’t crawl all URLs at once Control over concurrent web GETs Efficient resource usage Resilient under high burst Scales horizontally & vertically
  • 5. Sizing the Crawl Job Let: i = Number of seed URLs in a job n = Average number of links per page d = The crawl depth (how many layers to follow links) u = The max number of URLs to process Then: u = ind 1.00E+00 1.00E+01 1.00E+02 1.00E+03 1.00E+04 1.00E+05 1.00E+06 1.00E+07 0 2 4 6 8 10 12 totalURLs vs depth depth (initialURLs = 1, outLinks = 5) 1.00E+00 1.00E+01 1.00E+02 1.00E+03 1.00E+04 1.00E+05 1.00E+06 1.00E+07 1.00E+08 1.00E+09 1.00E+10 1.00E+11 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 totalURLs vs initialURLs initialURLs (depth = 5, outLinks = 5)
  • 6. The Reactive Manifesto Responsive Message Driven Elastic Resilient
  • 7. Why Does it Matter? Respond in a deterministic, timely manner Stays responsive in the face of failure – even cascading failures Stays responsive under workload spikes Basic building block for responsive, resilient, and elastic systems Responsive Resilient Elastic Message Driven
  • 8. The Right Ingredients • Kafka • Huge persistent buffer for the bursts • Load distribution to very large number of processing nodes • Enable horizontal scalability • Akka streams • High performance, highly efficient processing pipeline • Resilient with end-to-end back-pressure • Fully asynchronous – utilizes mapAsyncUnordered with Async HTTP client • Async HTTP client • Non-blocking and consumes no threads in waiting • Integrates with Akka Streams for a high parallelism, low resource solution Efficient Resilient Scale Akka Stream Async HTTP Reactive Kafka
  • 10. Akka Streams, what??? High performance, pure async, stream processing Conforms to reactive streams Simple, yet powerful GraphDSL allows clear stream topology declaration Central point to understand processing pipeline
  • 11. Crawl Stream Actual Stream Declaration in Code prioritizeSource ~> crawlerFlow ~> bCast0 ~> result ~> bCast ~> outLinksFlow ~> outLinksSink bCast ~> dataSinkFlow ~> kafkaDataSink bCast ~> hdfsDataSink bCast ~> graphFlow ~> merge ~> graphSink bCast0 ~> maxPage ~> merge bCast0 ~> retry ~> bCastRetry ~> retryFailed ~> merge bCastRetry ~> errorSink Prioritized Source Crawl Result MaxPageReached Retry OutLinks Data Graph CheckFail CheckErr OutLinks Sink Kafka Data Sink HDFS Data Sink Graph Sink Error Sink
  • 12. Resulting Characteristics Efficient • Low thread count, controlled by Akka and pure non-blocking async HTTP • High latency URLs do not block low latency URLs using MapAsyncUnordered • Well-controlled download concurrency using MapAsyncUnordered • Thread per concurrent crawl job Resilient • Processes only what can be processed – no resource overload • Kafka as short-term, persistent queue Scale • Kafka feeds next batch of URLs to available client • Pull model – only processes that have capacity will get the load • Kafka distributes work to large number of processing nodes in cluster
  • 13. Back-Pressure 0 20000 40000 60000 80000 100000 120000 0 100 200 300 400 500 600 700 Queue Size Time (seconds) 0 200 400 URLs/sec Time (seconds) seedURLs : 100 parallelism : 1000 processTime : 1 – 5 s outLinks : 0 - 10 depth : 5 totalCrawled : 312500
  • 14. Challenges Training • Developers not used to E2E stream definitions • More familiar with deeply nested function calls Maturity of Infrastructure • Kafka 0.9 use fetch as heartbeat • Slow nodes cause timeout & rebalance • Solved in 0.10.0.1
  • 15. What it would have been… Bloated, ineffective concurrency control Lack of well-thought-out and visible processing pipeline Clumsy code, hard to manage & understand Low training cost, high project TCO Dev / Support / Maintenance
  • 18. Efficiency & Resilience meets Standardization • Monitoring • Need to collect metrics, consistently • Logging • Correlation across services • Uniformity in logs • Security • Need to apply standard security configuration • Environment Resolution • Staging, production, etc. Consistency in the face of Heterogeneity
  • 19. squbs is not… A framework by its own A programming model – use Akka Take all or none – Components/patterns can mostly be used independently
  • 20. squbs Akka for large scale deployments Bootstrap Lifecycle management Loosely-coupled module system Integration hooks for logging, monitoring, ops integration
  • 21. squbs Akka for large scale deployments JSON console HttpClient with pluggable resolver and monitoring/logging hooks Test tools and interfaces Goodies: - Activators for Scala & Java - Programming patterns and helpers for Akka and Akka Stream Use cases…, and growing
  • 22. Performance Akka is principally designed for great performance & scalability Actor scheduling, dispatcher, message batching • throughput parameter The job for squbs is adding ops functionality without impacting performance
  • 23. squbs Performance for Akka Http Pipeline (auth, logging, etc.) built as Akka Streams BidiFlow Pipeline supplied by infra or app App code supplies Flow or Route Pipeline and app flow baked into single Request/Response Flow Fully utilizes Akka-Stream fusing Zero Overhead for Given Functionality
  • 24. Application Performance Tips Kafka – great firehose, with a bit latency Only convert byte ↔ char when absolutely needed Work with ByteString if you can Need better facilities (ByteString JSON parser, etc.) Remember : App has biggest impact on performance, not tuning
  • 25. Akka Performance Tuning Parallelism Factor: 1.0 is optimal • Minimizes context switches Use other dispatcher for blocking Test your throughput setting Try and test different GC settings
  • 27. PerpetualStream • Provides a convenience trait to help write streams controlled by system lifecycle • Minimal/no message losses • Register PerpetualStream to make stream start/stop • Provides customization hooks – especially for how to stop the stream • Provides killSwitch (from Akka) to be embedded into stream • Implementers - just provide your stream! A non-stop stream; starts and stops with the system class MyStream extends PerpetualStream[Future[Int]] { def generator = Iterator.iterate(0) { p => if (p == Int.MaxValue) 0 else p + 1 } val source = Source.fromIterator(generator _) val ignoreSink = Sink.ignore[Int] override def streamGraph = RunnableGraph.fromGraph( GraphDSL.create(ignoreSink) { implicit builder => sink => import GraphDSL.Implicits._ source ~> killSwitch.flow[Int] ~> sink ClosedShape }) }
  • 28. PersistentBuffer/BroadcastBuffer • Data & indexes in rotating memory-mapped files • Off-heap rotating file buffer – very large buffers • Restarts gracefully with no or minimal message loss • Not as durable as a remote data store, but much faster • Does not back-pressure upstream beyond data/index writes • Similar usage to Buffer and Broadcast • BroadcastBuffer – a FanOutShape decouples each output port making each downstream independent • Useful if downstream stage blocked or unavailable • Kafka is unavailable/rebalancing but system cannot backpressure/deny incoming traffic • Optional commit stage for at-least-once delivery semantics • Implementation based on Chronicle Queue A buffer of virtually unlimited size
  • 29. Summary • Kafka + Akka Streams + Async I/O = Ideal Architecture for High Bursts & High Efficiency • Akka Streams • Clear view of stream topology • Back-pressure & Kafka allows buffering load bursts • Standardization • Walk like a duck, quack like a duck, and manage it like a duck • squbs: Have the cake, and eat it too • Functionality without sacrificing performance • Goodies like PerpetualStream, PersistentBuffer, & BroadcastBuffer
  • 30. Q&A – Feedback Appreciated