Apache Beam in the Google Cloud
Lessons learned from building and operating a serverless
streaming runtime
Reuven Lax, Google (@reuvenlax)
Sergei Sokolenko, Google (@datancoffee)
Common steps in Stream Analytics
Lessons we learned
Watermarks
Adaptive Scaling: Flow Control
Adaptive Scaling: Autoscaling
Separating compute state from storage
History Lesson
[Timeline, 2002 to 2015: Bespoke Streaming, Flume, Millwheel, Dataflow]
Lesson Learned: Watermarks
A pipeline stage S with a watermark value of t means that all future data that will be seen by S will have a timestamp later than t. In other words, all data older than t has already been processed.
Key use case: process windows once the watermark passes the end of the window, since we expect all data for that window to have arrived already.
What Triggers Output?
Traditional batch: query triggers output
[Diagram: Query, Data, Output]
Streaming: When to trigger?
● Standing Query
● Unbounded Data
[Diagram: Query, Data?]
Use Case: Anomaly Detection pipelines
An early Millwheel user was an anomaly detection pipeline
Built cubic-spline model for each key
Once a spline was calculated, it could not be
modified. No late data, trigger only when ready!
[Figure: example keys Bob and Sara]
First attempt: leading-edge watermark
Watermark = latest timestamp - δ
Graph shows that skew peaked at 10 minutes.
Set δ = 10 minutes to minimize data drops.
First attempt: leading-edge watermark
Too fast
Too often a lot of data was behind this watermark
Ended up with many gaps in output
Impacted quality of results
Too slow
Subtracting fixed delta puts lower bound on latency
Subtracted 10 minutes because the system is sometimes delayed by 10 minutes. However, most of the time the delay was under 1 minute!
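A minimal Python sketch of the fixed-δ leading-edge watermark described above, assuming timestamps are plain numbers in seconds. The class name `LeadingEdgeWatermark` and its methods are hypothetical and are not Millwheel or Dataflow APIs. It illustrates both problems: elements more than δ behind the newest timestamp are dropped, and the watermark always trails the newest element by the full δ.

```python
from dataclasses import dataclass


@dataclass
class LeadingEdgeWatermark:
    """Hypothetical illustration: watermark = latest timestamp seen - delta."""

    delta: float                   # fixed lookback, in seconds
    latest: float = float("-inf")  # newest event time observed so far
    dropped: int = 0               # elements that arrived behind the watermark

    def watermark(self) -> float:
        return self.latest - self.delta

    def observe(self, event_time: float) -> bool:
        """Return True if the element is accepted, False if it is dropped."""
        if event_time < self.watermark():
            # "Too fast": this element is already behind the watermark.
            self.dropped += 1
            return False
        self.latest = max(self.latest, event_time)
        return True
```

With `delta=600`, an element stamped 1000 puts the watermark at 400, so a straggler stamped 300 is dropped while output for recent data still waits the full 10 minutes.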
Second attempt: dynamic leading edge watermark
Leading edge watermark
Dynamic statistical models to compute how far the lookback should be
Second attempt: dynamic leading edge watermark
Still many gaps in output data
Input is too noisy
Many delays are unpredictable (e.g. a machine restarting)
Models take time to adapt, and during that time you are dropping data
Trailing edge watermark
Tracking the minimum event time instead generally solved the problem.
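A minimal Python sketch of a trailing-edge watermark, assuming the stage can register each pending element and mark it done. The class `TrailingEdgeWatermark`, its method names, and the `source_estimate` parameter (an upstream lower bound on future timestamps) are all hypothetical. The watermark is the minimum event time among elements still pending, so it cannot run ahead of unprocessed data. For simplicity this sketch does not enforce monotonicity if an older element is added later.

```python
import heapq
from collections import Counter


class TrailingEdgeWatermark:
    """Hypothetical illustration: watermark = minimum pending event time."""

    def __init__(self):
        self._heap = []            # min-heap of event times ever added
        self._pending = Counter()  # event time -> number of unfinished elements

    def add(self, event_time):
        heapq.heappush(self._heap, event_time)
        self._pending[event_time] += 1

    def done(self, event_time):
        if self._pending[event_time] <= 0:
            raise ValueError("no pending element with this event time")
        self._pending[event_time] -= 1

    def watermark(self, source_estimate=float("inf")):
        # Lazily discard heap entries whose elements have all finished.
        while self._heap and self._pending[self._heap[0]] == 0:
            heapq.heappop(self._heap)
        oldest = self._heap[0] if self._heap else float("inf")
        return min(oldest, source_estimate)
```

Unlike the fixed-δ approach, a slow element simply holds the watermark back until it is processed, rather than being dropped.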
Watermark: Definition
Given a node $N$ of a computation graph $G$, let $I_n$ be the sequence of input elements processed, in the order provided by an oracle. $t: I_n \to \mathbb{R}$ is a real-valued function on $I_n$ called the timestamp function. A watermark function is a real-valued function $W$ defined on prefixes of $I_n$ satisfying the following:
$\{W_n\} = \{W(\{I_1, \ldots, I_n\})\}$ is eventually increasing.
$\{W_n\}$ is monotonic.
$W$ is said to be a temporal watermark if it also satisfies $W_n < t(I_m)$ for $m \geq n$.
Load Variance
Streaming pipelines must keep up with input.
Load varies throughout the day and throughout the week, and spikes can happen at any time.
Load Variance: Hand Tuning
Every pipeline is different, and hand tuning is hard
Eventually tuning parameters go stale
Hand-tuned flags become cargo cult science
Must tune for worst case
● Tuning for the peak is wasteful
● If pipeline ever falls behind, must be able to catch up faster than real time.
○ An exactly-once streaming system is a batch system whenever it falls behind.
Techniques: Batching
Always process data in batches
Batch sizes are dynamic: small when caught up, large while catching up.
Lesson: be careful of putting arbitrary limits on batches.
● Don’t limit by event time ranges - event-time density changes.
● Don’t limit by windows - window policies change.
Batching limits will be especially painful while catching up a backlog.
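A minimal Python sketch of dynamic batch sizing, assuming pending work sits in an in-memory queue. The function `next_batch` and the "take half the backlog" heuristic are hypothetical choices for illustration, not the rule Dataflow uses. The point is that batch size follows the backlog instead of a fixed cap on event-time range or window count.

```python
from collections import deque


def next_batch(queue: deque, min_batch: int = 1) -> list:
    """Hypothetical heuristic: batch size grows with the backlog.

    A nearly empty queue yields small, low-latency batches; a large backlog
    yields large batches so the pipeline can catch up faster than real time.
    """
    size = max(min_batch, len(queue) // 2)
    size = min(size, len(queue))  # never take more than is available
    return [queue.popleft() for _ in range(size)]
```

With 1000 queued elements the first batch holds 500; with two queued elements it holds one.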
Techniques: Flow Control
A good adaptive backpressure system is critical
● Prevents workers from overloading and crashing
● Adapts to changing load.
● Reduces the need to perfectly tune the cluster.
Techniques: Flow Control
Soft resources: CPU.
Hard resources: Memory.
Signals:
● Queue length
● Memory usage
Eventually flow control will pause pulling from sources.
[Diagram: Workers 1, 2 and 3, each running stages A, B and C; one stream is marked "Flow controlled"]
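A minimal Python sketch of the two signals named above, queue length and memory usage, gating whether a worker keeps pulling input. The class `FlowControlledInbox`, its limits, and the use of payload byte length as a memory proxy are assumptions made for illustration.

```python
from collections import deque


class FlowControlledInbox:
    """Hypothetical worker inbox that applies backpressure to its senders."""

    def __init__(self, max_queue_len: int, max_bytes: int):
        self.max_queue_len = max_queue_len
        self.max_bytes = max_bytes
        self._queue = deque()
        self._bytes = 0  # rough proxy for memory held by pending deliveries

    def should_pull(self) -> bool:
        # Both signals must be healthy before more input is accepted.
        return len(self._queue) < self.max_queue_len and self._bytes < self.max_bytes

    def offer(self, payload: bytes) -> bool:
        """Accept a delivery, or return False to flow control the sender."""
        if not self.should_pull():
            return False
        self._queue.append(payload)
        self._bytes += len(payload)
        return True

    def take(self) -> bytes:
        payload = self._queue.popleft()
        self._bytes -= len(payload)
        return payload
```

A sender that sees `offer` return False stops sending and retries later; when that pressure reaches the first stage, the pipeline stops pulling from its sources.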
Techniques: Flow Control
What happens if all streams are flow controlled?
Deadlock!
● Every worker is holding onto memory for pending deliveries.
● Every worker is flow controlling its input streams.
[Diagram: Workers 1, 2 and 3, each running stages A, B and C; all three streams are marked "Flow controlled"]
Techniques: Flow Control
To avoid deadlock, workers must be able to release memory.
This might involve canceling in-flight work to be retried later.
Dataflow streaming workers can spill pending deliveries to disk to release memory. They are scanned back in later.
[Diagram: Workers 1, 2 and 3, each running stages A, B and C; all three streams are marked "Flow controlled"]
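A minimal Python sketch of releasing memory by spilling pending deliveries to disk and scanning them back in later. The class `SpillingBuffer`, the element-count threshold, and the use of `pickle` with a temporary file are assumptions for illustration; the slides do not describe how Dataflow workers implement spilling.

```python
import pickle
import tempfile


class SpillingBuffer:
    """Hypothetical buffer that moves pending deliveries to disk under pressure."""

    def __init__(self, max_in_memory: int):
        self.max_in_memory = max_in_memory
        self._memory = []
        self._file = tempfile.TemporaryFile()  # binary read/write temp file
        self.spilled = 0

    def add(self, delivery):
        self._memory.append(delivery)
        if len(self._memory) > self.max_in_memory:
            self._spill()

    def _spill(self):
        # Append everything held in memory to the spill file, then free it.
        self._file.seek(0, 2)
        for delivery in self._memory:
            pickle.dump(delivery, self._file)
        self.spilled += len(self._memory)
        self._memory.clear()

    def drain(self) -> list:
        """Scan spilled deliveries back in, followed by those still in memory."""
        self._file.seek(0)
        out = [pickle.load(self._file) for _ in range(self.spilled)]
        out.extend(self._memory)
        self._file.seek(0)
        self._file.truncate()
        self.spilled = 0
        self._memory.clear()
        return out
```

Because the worker can always free memory this way, it can keep accepting deliveries from its peers, which breaks the cycle in which every worker waits on every other.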
Techniques: Auto Scaling
Adaptive autoscaling allows elastic scaling with load.
Work must be dynamically load balanced to take advantage of autoscaling.
Techniques: Auto Scaling
Never assume fixed workers.
Work ownership can be moved at any time.
All keys are hash sharded, and hash ranges are distributed among workers.
Separate storage from compute
Adds a lot of complexity to exactly-once and consistency protocols!
[Diagram: workers worker23, worker32 and a third worker, each running stages A, B and C, with hash ranges [0, 3), [3, a) and [a, f)]
Range assignment:
[0, 3): 23
[3, a): 32
[a, f): 32
RPCs addressed to keys, not workers
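A minimal Python sketch of addressing RPCs to keys rather than workers, using the range table from the diagram. The class `RangeRouter`, the use of the first hex digit of a SHA-256 hash as the shard, and treating the last range as extending to the end of the hash space are assumptions for illustration only.

```python
import bisect
import hashlib


class RangeRouter:
    """Hypothetical routing table: hash range -> current owner."""

    def __init__(self, assignments):
        # assignments: list of (inclusive range start, worker name)
        ordered = sorted(assignments)
        self._starts = [start for start, _ in ordered]
        self._owners = [worker for _, worker in ordered]

    @staticmethod
    def shard_of(key: str) -> int:
        # First hex digit of a stable hash, so shards fall in 0x0..0xf.
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest()[0], 16)

    def owner_of_shard(self, shard: int) -> str:
        index = bisect.bisect_right(self._starts, shard) - 1
        return self._owners[index]

    def owner(self, key: str) -> str:
        return self.owner_of_shard(self.shard_of(key))

    def reassign(self, range_start: int, new_worker: str) -> None:
        # Ownership can move at any time; callers keep addressing the key.
        self._owners[self._starts.index(range_start)] = new_worker


router = RangeRouter([(0x0, "worker23"), (0x3, "worker32"), (0xA, "worker32")])
```

Senders call `router.owner(key)` on every request, so moving a range with `reassign` changes where the work lands without any sender needing to know which workers exist.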
Load Variance: Lesson
Dynamic control is key
No amount of static configuration works
Eventually the universe will outsmart your configuration
Separating compute from state storage to improve scalability
Sergei Sokolenko, Google (@datancoffee)
Streaming processing options in GCP
[Diagram components: End-user apps, IoT, Events, DBs, Cloud Pub/Sub, Dataflow Streaming, Dataflow Batch, BigQuery Streaming API, BigQuery, Bigtable, Cloud AI Platform, Cloud Composer; labels: Machine Learning, Data Warehousing, Action]
Motivating Example:
Spotify migrating the largest European Hadoop cluster to Dataflow
● Run 80,000+ Dataflow jobs / month
● 90% batch, 10% streaming
Use Dataflow for “everything”
● Music recommendations, Ads targeting
● AB testing, Behavioral analysis, Business metrics
Huge batch jobs:
● 26,000 CPUs, 166 TB RAM
● Processing 325 billion rows in 240 TB from Bigtable
Traditional Distributed Data Processing Architecture
● Jobs executed on clusters of VMs
● Job state stored on network-attached volumes
● Control plane orchestrates data plane
[Diagram: four VMs running user code, each with state storage; a network; a control plane VM]
Traditional Architecture works well ...
… except for Joins and Group By’s
[Diagram: pipeline with Filter, Join and Group stages between fs:// and Database endpoints]
Shuffling key-value pairs
● Starting with <K,V> pairs placed on different workers
● Goal: co-locate all pairs with the same Key
● Workers exchange <K,V>
● Until everything is sorted

Before the shuffle (one line per worker):
<key1, record> <key5, record> <key3, record> <key8, record> <key4, record> ...
<key5, record> <key5, record> <key2, record> <key3, record> <key8, record> ...
<key3, record> <key3, record> <key8, record> <key3, record> <key6, record> ...
<key2, record> <key1, record> <key5, record> <key8, record> <key4, record> ...

After the shuffle (one line per worker, labeled with its keys):
key1, key2: <key1, record> <key1, record> <key2, record> <key2, record> <key2, record> ...
key3, key4: <key3, record> <key3, record> <key3, record> <key3, record> <key3, record> <key4, record> ...
key5, key6: <key5, record> <key5, record> <key5, record> <key5, record> <key6, record> ...
key7, key8: <key7, record> <key8, record> <key8, record> <key8, record> ...
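A minimal single-process Python sketch of the shuffle steps above: each pair is sent to the partition that owns its key, and each partition ends up holding its keys in sorted order. The function names are hypothetical, and the sketch uses hash partitioning with `zlib.crc32` for simplicity, whereas the slide shows contiguous key ranges per worker.

```python
import zlib
from collections import defaultdict


def partition_of(key: str, num_partitions: int) -> int:
    # Stable hash so the same key always lands on the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(workers, num_partitions):
    """workers: one list of (key, record) pairs per source worker.

    Returns one dict per destination partition, mapping each key (in sorted
    order) to all records for that key, wherever they originally lived.
    """
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for pairs in workers:
        for key, record in pairs:
            partitions[partition_of(key, num_partitions)][key].append(record)
    return [dict(sorted(p.items())) for p in partitions]
```

In a real cluster the inner loop is a network exchange between every pair of workers, which is why joins and group-bys dominate cost and tuning effort at large data volumes.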
Traditional Architecture Requires Manual Tuning
● When data volumes exceed dozens of TBs
[Diagram: four VMs running user code, each with state storage; a network; a control plane VM]
Distributed in-memory Shuffle in batch Cloud Dataflow
No tuning required
[Diagram components: Compute (pipeline user code), petabit network, Shuffle proxy, Dataflow Shuffle (shuffling operations), distributed in-memory file system, distributed on-disk file system; a region with zones 'a', 'b' and 'c'; autozone placement]
Faster Processing
Dataflow Shuffle is usually faster than worker-based shuffle, including worker-based shuffles that use SSD-PD.
Better autoscaling keeps aggregate resource usage the same, but cuts processing time.
[Chart: Runtime of shuffle; axis: Runtime (mins)]
Supporting larger datasets
Dataflow Shuffle has been used to shuffle 300TB+ datasets.
[Chart: Dataset size of shuffle; axis: Dataset size (TB); annotation: Shuffle 300TB+]
What about streaming pipelines?
Streaming shuffle
● Just like in batch, need to group and join streams
● Distributed streaming shuffle
Storing state: window data elements
● Time window data aggregations need to be buffered until triggering conditions occur
Goal: Grouping by Event Time into Time Windows
[Figure: Input and Output shown against an event-time axis and a processing-time axis, each marked from 9:00 to 14:00]
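A minimal Python sketch of grouping by event time into fixed windows, buffering each window until the watermark passes its end. The class `FixedWindows` and its methods are hypothetical and are not the Apache Beam API; timestamps are assumed to be seconds since midnight.

```python
from collections import defaultdict


class FixedWindows:
    """Hypothetical buffer that groups values into fixed event-time windows."""

    def __init__(self, size: int):
        self.size = size
        self._buffers = defaultdict(list)  # window start -> buffered values

    def add(self, event_time: int, value) -> None:
        # Assign by event time, regardless of when the element arrives.
        start = event_time - (event_time % self.size)
        self._buffers[start].append(value)

    def advance_watermark(self, watermark: int) -> list:
        """Emit (window_start, values) for windows whose end <= watermark."""
        ready = sorted(s for s in self._buffers if s + self.size <= watermark)
        return [(s, self._buffers.pop(s)) for s in ready]
```

An element stamped 9:50 that arrives after one stamped 10:00 still lands in the 9:00 window, and that window is emitted only once the watermark reaches 10:00. Every open window is state that must be stored somewhere until then.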
Even more state to store on disks in streaming
Shuffle data elements
● Key ranges are assigned to workers
● Data elements of these keys are stored on Persistent Disks
Time window data
● Also assigned to workers
● When time windows close, data is processed on workers
[Diagram: four VMs running user code, each with its own state storage holding one key range: key 0000 … key 1234, key 1235 … key ABC2, key ABC3 … key DEF5, key DEF6 … key GHI2]
Dataflow Streaming Engine
Benefits
● Better supportability
● Fewer worker resources
● More efficient autoscaling
[Diagram: four workers running user code; the Streaming Engine holds window state storage and streaming shuffle]
Autoscaling: Even better with separate Compute and State Storage
[Diagram, Dataflow with Streaming Engine: two workers running user code; the Streaming Engine holds window state storage and streaming shuffle]
[Diagram, Dataflow without Streaming Engine: two VMs running user code, each with its own state storage and key range (key 0000 … key 1234, key 1235 … key ABC2)]
AB Tasty is using Dataflow Streaming Engine
● Personalization and experimentation platform
● Wanted things to work out-of-the-box
Significant data volumes:
● 25 million user sessions per day
● 2B events per day
Dataflow usage profile:
● Streaming Engine for worry-free autoscaling
● Batch processing with FlexRS for cost savings
Main Takeaways
Trailing edge watermarks provided a solution for triggering aggregations
The system must be elastic and adaptive
Separating compute from state storage helps make stream and batch processing scalable
Thank You!

More Related Content

What's hot

Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkFlink Forward
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Flink Forward
 
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...Flink Forward
 
Matthias J. Sax – A Tale of Squirrels and Storms
Matthias J. Sax – A Tale of Squirrels and StormsMatthias J. Sax – A Tale of Squirrels and Storms
Matthias J. Sax – A Tale of Squirrels and StormsFlink Forward
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinFlink Forward
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry confluent
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Robert Metzger
 
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...Flink Forward
 
Tech Talk @ Google on Flink Fault Tolerance and HA
Tech Talk @ Google on Flink Fault Tolerance and HATech Talk @ Google on Flink Fault Tolerance and HA
Tech Talk @ Google on Flink Fault Tolerance and HAParis Carbone
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...Flink Forward
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward
 
Flink Connector Development Tips & Tricks
Flink Connector Development Tips & TricksFlink Connector Development Tips & Tricks
Flink Connector Development Tips & TricksEron Wright
 
Christian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingChristian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingFlink Forward
 
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...Flink Forward
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward
 
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward
 
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...Flink Forward
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream ProcessingGyula Fóra
 
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache Flink
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache FlinkGelly-Stream: Single-Pass Graph Streaming Analytics with Apache Flink
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache FlinkVasia Kalavri
 

What's hot (20)

Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on FlinkTran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
Tran Nam-Luc – Stale Synchronous Parallel Iterations on Flink
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
 
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
Flink Forward Berlin 2017: Dongwon Kim - Predictive Maintenance with Apache F...
 
Matthias J. Sax – A Tale of Squirrels and Storms
Matthias J. Sax – A Tale of Squirrels and StormsMatthias J. Sax – A Tale of Squirrels and Storms
Matthias J. Sax – A Tale of Squirrels and Storms
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
 
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
Fundamentals of Stream Processing with Apache Beam, Tyler Akidau, Frances Perry
 
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
Architecture of Flink's Streaming Runtime @ ApacheCon EU 2015
 
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
 
Tech Talk @ Google on Flink Fault Tolerance and HA
Tech Talk @ Google on Flink Fault Tolerance and HATech Talk @ Google on Flink Fault Tolerance and HA
Tech Talk @ Google on Flink Fault Tolerance and HA
 
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck -  Pravega: Storage Rei...
Flink Forward SF 2017: Srikanth Satya & Tom Kaitchuck - Pravega: Storage Rei...
 
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...
 
Flink Connector Development Tips & Tricks
Flink Connector Development Tips & TricksFlink Connector Development Tips & Tricks
Flink Connector Development Tips & Tricks
 
Christian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream ProcessingChristian Kreuzfeld – Static vs Dynamic Stream Processing
Christian Kreuzfeld – Static vs Dynamic Stream Processing
 
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
Flink forward SF 2017: Elizabeth K. Joseph and Ravi Yadav - Flink meet DC/OS ...
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
 
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
Flink Forward Berlin 2017: Jörg Schad, Till Rohrmann - Apache Flink meets Apa...
 
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...
Flink Forward SF 2017: Stephan Ewen - Convergence of real-time analytics and ...
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream Processing
 
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache Flink
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache FlinkGelly-Stream: Single-Pass Graph Streaming Analytics with Apache Flink
Gelly-Stream: Single-Pass Graph Streaming Analytics with Apache Flink
 

Similar to Keynote: Building and Operating A Serverless Streaming Runtime for Apache Beam in The Google Cloud - Sergei Sokolenko & Reuven lax, Google

Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Data Con LA
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...Flink Forward
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesEd Hunter
 
Natural Laws of Software Performance
Natural Laws of Software PerformanceNatural Laws of Software Performance
Natural Laws of Software PerformanceGibraltar Software
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDoiT International
 
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...InfluxData
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...HostedbyConfluent
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Mantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemMantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemC4Media
 
BDW16 London - William Vambenepe, Google - 3rd Generation Data Platform
BDW16 London - William Vambenepe, Google - 3rd Generation Data PlatformBDW16 London - William Vambenepe, Google - 3rd Generation Data Platform
BDW16 London - William Vambenepe, Google - 3rd Generation Data PlatformBig Data Week
 
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache BeamMalo Denielou - No shard left behind: Dynamic work rebalancing in Apache Beam
Malo Denielou - No shard left behind: Dynamic work rebalancing in Apache BeamFlink Forward
 
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Marlon Dumas
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek PROIDEA
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackJakub Hajek
 

Similar to Keynote: Building and Operating A Serverless Streaming Runtime for Apache Beam in The Google Cloud - Sergei Sokolenko & Reuven lax, Google (20)

Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
 
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data ProcessingCloud Dataflow - A Unified Model for Batch and Streaming Data Processing
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing
 
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
William Vambenepe – Google Cloud Dataflow and Flink , Stream Processing by De...
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
 
Natural Laws of Software Performance
Natural Laws of Software PerformanceNatural Laws of Software Performance
Natural Laws of Software Performance
 
Dataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data ProcessingDataflow - A Unified Model for Batch and Streaming Data Processing
Dataflow - A Unified Model for Batch and Streaming Data Processing
 
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
Wayfair Storefront Performance Monitoring with InfluxEnterprise by Richard La...
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...





Keynote: Building and Operating A Serverless Streaming Runtime for Apache Beam in The Google Cloud - Sergei Sokolenko & Reuven Lax, Google

  • 1. Apache Beam in the Google Cloud: Lessons learned from building and operating a serverless streaming runtime. Reuven Lax, Google (@reuvenlax); Sergei Sokolenko, Google (@datancoffee)
  • 2. Lessons we learned: Watermarks; Adaptive Scaling: Flow Control; Adaptive Scaling: Autoscaling; Separating compute from state storage
  • 3. History Lesson. (Timeline figure spanning 2002 to 2015; labels: Bespoke Streaming, Flume, Millwheel, Dataflow.)
  • 4. Lesson Learned: Watermarks. A pipeline stage S with a watermark value of t means that all future data seen by S will have a timestamp later than t. In other words, all data older than t has already been processed. Key use case: process a window once the watermark passes the end of the window, since we expect all data for that window to have arrived by then.
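The key use case on slide 4 can be sketched as a small buffer that holds elements per fixed event-time window and releases a window only once the watermark has passed its end. This is a minimal illustration, not Dataflow's or Millwheel's implementation: `WindowBuffer`, `add`, and `advance_watermark` are hypothetical names, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict


class WindowBuffer:
    """Buffers values in fixed event-time windows (hypothetical helper)."""

    def __init__(self, size):
        self.size = size                   # window length in seconds
        self.buffers = defaultdict(list)   # window start -> buffered values

    def add(self, timestamp, value):
        # Assign the element to the window containing its event time.
        start = timestamp - timestamp % self.size
        self.buffers[start].append(value)

    def advance_watermark(self, watermark):
        # A window [start, start + size) is emitted only once the watermark
        # has reached its end; windows are emitted in event-time order.
        ready = sorted(s for s in self.buffers if s + self.size <= watermark)
        return [(s, self.buffers.pop(s)) for s in ready]


buf = WindowBuffer(size=60)
buf.add(5, "a")
buf.add(65, "b")    # belongs to the second window
buf.add(30, "c")    # out of order, but still ahead of the watermark
first = buf.advance_watermark(60)
second = buf.advance_watermark(100)
third = buf.advance_watermark(120)
```

Here the first window is emitted with both of its elements even though they arrived out of order, and the second window stays buffered until the watermark reaches 120.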
  • 5. What Triggers Output? Traditional batch: the query triggers output. Streaming: with a standing query over unbounded data, when should output be triggered?
  • 6. Use Case: Anomaly Detection pipelines. An early Millwheel user was an anomaly detection pipeline that built a cubic-spline model for each key (keys in the figure: Bob, Sara). Once a spline was calculated, it could not be modified. No late data, trigger only when ready!
  • 7. First attempt: leading-edge watermark. Watermark = latest timestamp - δ. The graph shows that skew peaked at 10 minutes, so set δ = 10 minutes to minimize data drops.
  • 8. First attempt: leading-edge watermark. Too fast: too often a lot of data was behind this watermark, which left many gaps in the output and hurt the quality of results. Too slow: subtracting a fixed delta puts a lower bound on latency. We subtracted 10 minutes because the system is sometimes delayed by 10 minutes, yet most of the time the delay was under 1 minute!
  • 9. Second attempt: dynamic leading-edge watermark. Still a leading-edge watermark, but with dynamic statistical models that compute how far the lookback should be.
  • 10. Second attempt: dynamic leading-edge watermark. Still many gaps in the output data. The input is too noisy, many delays are unpredictable (e.g. a machine restarting), and models take time to adapt, during which you are dropping data.
  • 11. Trailing-edge watermark. Tracking the minimum event time instead generally solved the problem.
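The difference between the two approaches on slides 7 to 11 can be sketched as follows. This is a simplified illustration under stated assumptions: event times are plain seconds, "pending" means elements that have been seen but not yet processed, and the function names and the fallback used when nothing is pending are hypothetical rather than taken from the talk.

```python
def leading_edge_watermark(seen_event_times, delta):
    """Latest timestamp seen so far minus a fixed delta (slide 7)."""
    return max(seen_event_times) - delta


def trailing_edge_watermark(pending_event_times, seen_event_times):
    """Minimum event time among elements still in flight (slide 11).

    Assumption: when nothing is pending, fall back to the latest
    timestamp seen so far.
    """
    if pending_event_times:
        return min(pending_event_times)
    return max(seen_event_times)


seen = [100, 700, 40, 690]   # event times observed so far, in seconds
pending = [40, 690]          # observed but not yet processed

leading = leading_edge_watermark(seen, delta=600)   # 700 - 600
trailing = trailing_edge_watermark(pending, seen)

# Elements already behind the leading-edge watermark would be dropped.
behind_leading = [t for t in pending if t < leading]
```

With these numbers the leading-edge watermark is already past the pending element at time 40, which is the kind of gap slide 8 describes, while the trailing-edge watermark is held back by that element.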
  • 12. Watermark: Definition. Given a node N of a computation graph G, let I_n be the sequence of input elements processed, in the order provided by an oracle. t: I_n -> R is a real-valued function on I_n called the timestamp function. A watermark function is a real-valued function W defined on prefixes of I_n such that the sequence {W_n} = {W({I_1, …, I_n})} is eventually increasing and monotonic. W is said to be a temporal watermark if it also satisfies W_n < t(I_m) for m >= n.
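Two of the conditions in the definition on slide 12 can be checked on a finite trace. The sketch below assumes `watermarks[i]` holds W_{i+1} and `timestamps[i]` holds t(I_{i+1}); the function names are hypothetical, "monotonic" is read as non-decreasing, and the "eventually increasing" condition is not checked because it cannot be decided from a finite prefix.

```python
import math


def is_monotonic(watermarks):
    """True if the watermark sequence never moves backwards."""
    return all(a <= b for a, b in zip(watermarks, watermarks[1:]))


def is_temporal(watermarks, timestamps):
    """True if W_n < t(I_m) holds for every m >= n on this trace."""
    if len(watermarks) != len(timestamps):
        raise ValueError("need one watermark per processed element")
    suffix_min = math.inf
    # Walk backwards so suffix_min is the smallest timestamp at or after n.
    for w, t in zip(reversed(watermarks), reversed(timestamps)):
        suffix_min = min(suffix_min, t)
        if not w < suffix_min:
            return False
    return True


timestamps = [5, 3, 8, 7]
good = [2, 2, 6, 6]   # always below every current and later timestamp
bad = [4, 4, 6, 6]    # W_1 = 4 is not below t(I_2) = 3
```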
  • 13. Load Variance. Streaming pipelines must keep up with input. Load varies throughout the day and throughout the week, and spikes can happen at any time.
  • 14. Load Variance: Hand Tuning. Every pipeline is different, and hand tuning is hard. Eventually tuning parameters go stale, and hand-tuned flags become cargo cult science. Must tune for the worst case: ● Tuning for the peak is wasteful. ● If the pipeline ever falls behind, it must be able to catch up faster than real time. ○ An exactly-once streaming system is a batch system whenever it falls behind.
  • 15. Techniques: Batching. Always process data in batches. Batch sizes are dynamic: small when caught up, large while catching up. Lesson: be careful of putting arbitrary limits on batches. ● Don’t limit by event-time ranges: event-time density changes. ● Don’t limit by windows: window policies change. Batching limits will be especially painful while catching up on a backlog.
  • 16. Techniques: Flow Control. A good adaptive backpressure system is critical. ● Prevents workers from overloading and crashing. ● Adapts to changing load. ● Reduces the need to perfectly tune the cluster.
  • 17. Techniques: Flow Control. Soft resources: CPU. Hard resources: memory. Signals: ● Queue length ● Memory usage. Eventually flow control will pause pulling from sources. (Figure: Workers 1 to 3, each running stages A, B, C; one stream is marked as flow controlled.)
  • 18. Techniques: Flow Control. What happens if all streams are flow controlled? Deadlock! ● Every worker is holding onto memory for pending deliveries. ● Every worker is flow controlling its input streams. (Figure: Workers 1 to 3, each running stages A, B, C; all streams are marked as flow controlled.)
  • 19. Techniques: Flow Control. To avoid deadlock, workers must be able to release memory. This might involve canceling in-flight work to be retried later. Dataflow streaming workers can spill pending deliveries to disk to release memory; they are scanned back in later. (Figure: Workers 1 to 3, each running stages A, B, C; all streams are marked as flow controlled.)
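The mechanism on slides 17 to 19 can be sketched with a toy worker that uses memory usage as its flow-control signal and releases memory by spilling pending deliveries. This is a hedged illustration only: `FlowControlledWorker` and its methods are hypothetical, the "disk" is an in-process queue, and real Dataflow workers are considerably more involved.

```python
from collections import deque


class FlowControlledWorker:
    """Toy worker with a memory budget (hypothetical sketch)."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit
        self.memory_used = 0
        self.pending = deque()   # in-memory pending deliveries
        self.spilled = deque()   # stand-in for deliveries spilled to disk

    def accepting(self):
        # Memory usage is the signal: at or over budget means "flow controlled".
        return self.memory_used < self.memory_limit

    def offer(self, item, size):
        # Upstream must hold the item and retry later if this returns False.
        if not self.accepting():
            return False
        self.pending.append((item, size))
        self.memory_used += size
        return True

    def spill_to_disk(self):
        # Release memory so the worker can accept input again, which is
        # what breaks the all-streams-flow-controlled deadlock.
        while self.pending:
            self.spilled.append(self.pending.popleft())
        self.memory_used = 0

    def reload(self):
        # Scan spilled deliveries back in while there is memory headroom.
        while self.spilled and self.accepting():
            item, size = self.spilled.popleft()
            self.pending.append((item, size))
            self.memory_used += size


worker = FlowControlledWorker(memory_limit=10)
accepted = [worker.offer("a", 6), worker.offer("b", 6), worker.offer("c", 1)]
worker.spill_to_disk()
accepting_after_spill = worker.accepting()
```

The third offer is refused because the worker is over budget; after spilling, it accepts input again.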
  • 20. Techniques: Auto Scaling. Adaptive autoscaling allows elastic scaling with load. Work must be dynamically load balanced to take advantage of autoscaling.
  • 21. Techniques: Auto Scaling. Never assume fixed workers. Work ownership can be moved at any time. All keys are hash sharded, and hash ranges are distributed among workers. Separate storage from compute. This adds a lot of complexity to exactly-once and consistency protocols! RPCs are addressed to keys, not workers. (Figure: workers running stages A, B, C, with hash ranges assigned as [0, 3): worker 23, [3, a): worker 32, [a, f): worker 32.)
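The routing idea on slide 21 can be sketched as a lookup table from hash ranges to owners, where callers address a key and never a worker. This is a minimal sketch under assumptions: `RangeRouter` is a hypothetical name, keys are hashed to a single hex digit for readability, and the last range from the figure is extended to cover the digit f so that every hash has an owner.

```python
import bisect
import hashlib


class RangeRouter:
    """Maps key hashes to owning workers via hash ranges (hypothetical)."""

    def __init__(self, upper_bounds, owners):
        # upper_bounds[i] is the exclusive upper end of range i.
        self.upper_bounds = list(upper_bounds)
        self.owners = list(owners)

    @staticmethod
    def hash_of(key):
        # Deterministic hash reduced to one hex digit (0..15) for the demo.
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest()[0], 16)

    def range_index(self, hash_value):
        return bisect.bisect_right(self.upper_bounds, hash_value)

    def owner_of_hash(self, hash_value):
        return self.owners[self.range_index(hash_value)]

    def owner_of_key(self, key):
        # Callers address the key; the router resolves the current owner.
        return self.owner_of_hash(self.hash_of(key))

    def reassign(self, index, new_owner):
        # Ownership can move at any time, e.g. when autoscaling adds a worker.
        self.owners[index] = new_owner


# Ranges from the slide: [0, 3) -> 23, [3, a) -> 32, [a, f] -> 32.
router = RangeRouter(upper_bounds=[0x3, 0xA, 0x10], owners=[23, 32, 32])
before = router.owner_of_hash(0xC)
router.reassign(2, 40)   # hypothetical new worker 40 takes the last range
after = router.owner_of_hash(0xC)
```

Because senders only know keys, moving a range changes where later requests land without any change on the sending side.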
  • 22. Load Variance: Lesson. Dynamic control is key. No amount of static configuration works. Eventually the universe will outsmart your configuration.
  • 23. Separating compute from state storage to improve scalability Sergei Sokolenko, Google (@datancoffee)
  • 24. Streaming processing options in GCP. (Architecture diagram; labels: End-user apps, IoT, Events, DBs, Cloud Pub/Sub, Dataflow Streaming, Dataflow Batch, Cloud Composer, BigQuery, BigQuery Streaming API, Bigtable, Cloud AI Platform, Action, Machine Learning, Data Warehousing.)
  • 25. Motivating Example: Spotify migrating the largest European Hadoop cluster to Dataflow. ● Run 80,000+ Dataflow jobs / month ● 90% batch, 10% streaming. Use Dataflow for “everything”: ● Music recommendations, ads targeting ● A/B testing, behavioral analysis, business metrics. Huge batch jobs: ● 26,000 CPUs, 166 TB RAM ● Processing 325 billion rows in 240 TB from Bigtable
  • 26. Traditional Distributed Data Processing Architecture. ● Jobs executed on clusters of VMs ● Job state stored on network-attached volumes ● Control plane orchestrates data plane. (Diagram labels: user code VMs, state storage, network, control plane.)
  • 27. Traditional Architecture works well ... except for Joins and Group By’s. (Diagram: a pipeline of Filter, Join and Group stages reading from and writing to fs:// and Database.)
  • 28. Shuffling key-value pairs. ● Starting with <K,V> pairs placed on different workers ● Goal: co-locate all pairs with the same Key. (Figure: four workers, each holding an unsorted mix of pairs, e.g. <key1, record> <key5, record> <key3, record> <key8, record> <key4, record> ...)
  • 29. Shuffling key-value pairs. ● Starting with <K,V> pairs placed on different workers ● Goal: co-locate all pairs with the same Key ● Workers exchange <K,V>. (Figure: the same four workers exchanging pairs.)
  • 30. Shuffling key-value pairs. ● Starting with <K,V> pairs placed on different workers ● Goal: co-locate all pairs with the same Key ● Workers exchange <K,V> ● Until everything is sorted. (Figure: the four workers now hold key1 and key2, key3 and key4, key5 and key6, key7 and key8 respectively.)
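The exchange on slides 28 to 30 can be sketched in a single process. This is a toy model, not Dataflow Shuffle: `partition` and `shuffle` are hypothetical names, and a CRC32-based partitioner is assumed here in place of the key-range assignment shown in the figure.

```python
import zlib
from collections import defaultdict


def partition(key, num_workers):
    """Deterministically pick the destination worker for a key."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle(worker_inputs):
    """Co-locate all <key, record> pairs that share a key.

    worker_inputs[i] is the list of pairs initially held by worker i.
    Returns one {key: [records]} dict per worker, with keys sorted.
    """
    num_workers = len(worker_inputs)
    outboxes = [defaultdict(list) for _ in range(num_workers)]
    for pairs in worker_inputs:
        for key, record in pairs:
            # "Workers exchange <K,V>": each pair goes to its key's owner.
            outboxes[partition(key, num_workers)][key].append(record)
    # "Until everything is sorted": order keys within each worker.
    return [dict(sorted(box.items())) for box in outboxes]


inputs = [
    [("key1", "r1"), ("key5", "r2"), ("key3", "r3")],
    [("key5", "r4"), ("key2", "r5"), ("key3", "r6")],
]
result = shuffle(inputs)
```

After the shuffle, every key lives on exactly one worker, which is what a downstream Join or Group By needs.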
  • 31. Traditional Architecture Requires Manual Tuning. ● When data volumes exceed dozens of TBs. (Diagram labels: user code VMs, state storage, network, control plane.)
  • 32. Distributed in-memory Shuffle in batch Cloud Dataflow. (Diagram labels: Compute, Pipeline user code, Petabit network, Dataflow Shuffle, Shuffle proxy, Shuffling Operations, Distributed in-memory file system, Distributed on-disk file system, Region, Zone ‘a’, Zone ‘b’, Zone ‘c’, Autozone placement.)
  • 33. Faster Processing. No tuning required. Dataflow Shuffle is usually faster than worker-based shuffle, including those using SSD-PD. Better autoscaling keeps aggregate resource usage the same but cuts processing time. (Chart: runtime of shuffle, in minutes.)
  • 34. Supporting larger datasets. Dataflow Shuffle has been used to shuffle 300TB+ datasets. (Chart: dataset size of shuffle, in TB.)
  • 35. What about streaming pipelines? Storing state. Streaming shuffle: just like in batch, streams need to be grouped and joined (distributed streaming shuffle). Window data elements: time-window aggregations need to be buffered until triggering conditions occur.
  • 36. Goal: Grouping by Event Time into Time Windows. (Figure: input and output shown against event-time and processing-time axes, each running from 9:00 to 14:00.)
  • 37. Even more state to store on disks in streaming. Shuffle data elements: ● Key ranges are assigned to workers ● Data elements of these keys are stored on Persistent Disks. Time window data: ● Also assigned to workers ● When time windows close, data is processed on workers. (Diagram: four user-code VMs, each with state storage holding one key range: key 0000 … key 1234, key 1235 … key ABC2, key ABC3 … key DEF5, key DEF6 … key GHI2.)
  • 38. Dataflow Streaming Engine. Benefits: ● Better supportability ● Fewer worker resources ● More efficient autoscaling. (Diagram labels: workers running user code; Streaming engine with window state storage and streaming shuffle.)
  • 39. Autoscaling: Even better with separate Compute and State Storage. (Diagram: Dataflow with Streaming Engine, where workers run user code and the Streaming engine holds window state storage and streaming shuffle, next to Dataflow without Streaming Engine, where each user-code VM has its own state storage for key ranges such as key 0000 … key 1234 and key 1235 … key ABC2.)
  • 40. Comparison figure: Dataflow with Streaming Engine vs. Dataflow without Streaming Engine.
  • 41. AB Tasty is using Dataflow Streaming Engine. ● Personalization and experimentation platform ● Wanted things to work out-of-the-box. Significant data volumes: ● 25 million user sessions per day ● 2B events per day. Dataflow usage profile: ● Streaming Engine for worry-free autoscaling ● Batch processing with FlexRS for cost savings
  • 42. Main Takeaways. Trailing-edge watermarks provided a solution for triggering aggregations. The system must be elastic and adaptive. Separating compute from state storage helps make stream and batch processing scalable.