SlideShare a Scribd company logo
1 of 18
Download to read offline
© King.com Ltd 2016 – Commercially confidential
RBea: Scalable Real-Time Analytics
at King
Gyula Fóra
Data Warehouse Engineer
Apache Flink PMC
Page 2
© King.com Ltd 2016 – Commercially confidential
We make awesome mobile games
463 million monthly active users
30 billion events per day
And a lot of data…
Page 3
About King
© King.com Ltd 2016 – Commercially confidential
From streaming perspective…
Page 4
DB
30 billion events / day
Complex processing logic
Terabytes of state
DB
DB
© King.com Ltd 2016 – Commercially confidential
How do we use Flink
Page 5
Standalone deployment
Few heavy streaming jobs
RocksDB state backend with caching
Lot of custom tooling
© King.com Ltd 2016 – Commercially confidential
The RBea platform
Page 6
Scripting on the live streams
Deploy from browser/python
Window aggregates
Stateful computations
Scalable + fault tolerant
© King.com Ltd 2016 – Commercially confidential
RBea architecture
Page 7
Events Output
REST API
RBEA web frontend
Libraries
http://hpc-asia.com/wp-content/uploads/2015/09/World-Class-Consultancy-Seeking-Data-Scientist-CA-Hobson-Associates-Matthew-Abel-Recruiter.jpg
Data Scientists
© King.com Ltd 2016 – Commercially confidential
RBea backend implementation
Page 8
One stateful Flink job / game
Stream events and scripts
Events are partitioned by user id
Scripts are broadcasted
Output/Aggregation happens downstream
S1 S2
S3
S4
S5
Add/Remove scripts
Event stream Loop over deployed
scripts and process
CoFlatMap
Output based
on API calls
© King.com Ltd 2016 – Commercially confidential
Dissecting the DSL
Page 9
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase,
Output out,
Aggregators agg) {
long amount = purchase.getAmount()
String curr = purchase.getCurrency()
out.writeToKafka("purchases", curr + "t" + amount)
Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
numPurchases.increment()
}
© King.com Ltd 2016 – Commercially confidential
Dissecting the DSL
Page 10
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase,
Output out,
Aggregators agg) {
long amount = purchase.getAmount()
String curr = purchase.getCurrency()
out.writeToKafka("purchases", curr + "t" + amount)
Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
numPurchases.increment()
}
Processing methods by annotation
Event filter conditions
Flexible argument list
Code-generate Java classes
=> void processEvent(Event e, Context ctx);
© King.com Ltd 2016 – Commercially confidential
Dissecting the DSL
Page 11
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase,
Output out,
Aggregators agg) {
long amount = purchase.getAmount()
String curr = purchase.getCurrency()
out.writeToKafka("purchases", curr + "t" + amount)
Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
numPurchases.increment()
}
Output calls create Output events
Output(KAFKA, “purchases”, “…” )
These events are filtered downstream and
sent to a Kafka sink
© King.com Ltd 2016 – Commercially confidential
Dissecting the DSL
Page 12
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase,
Output out,
Aggregators agg) {
long amount = purchase.getAmount()
String curr = purchase.getCurrency()
out.writeToKafka("purchases", curr + "t" + amount)
Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
numPurchases.increment()
}
Aggregator calls create Aggregate events
Aggr (MYSQL, 60000, “PurchaseCount”, 1)
Flink window operators do the aggregation
© King.com Ltd 2016 – Commercially confidential
Aggregators
Page 13
long size = aggregate.getWindowSize();
long start = timestamp - (timestamp % size);
long end = start + size;
TimeWindow tw = new TimeWindow(start, end);
Event time windows
Window size / aggregator
Script1
Script2
Window 1Window 2 NumGames
Revenue
W1: 8999
W2: 9001
W1: 200
W2: 300
MyAggregator
W1: 10
W2: 5
Dynamic window assignment
© King.com Ltd 2016 – Commercially confidential
User states
Page 14
RBea
Script
LAST_GAMESTART
LAST_PURCHASE
LAST_…
+ User defined states
Fields
byte[]
User State
byte[]byte[]
LRU Cache
Scripts can read all state
But write only their own
© King.com Ltd 2016 – Commercially confidential
Things you might wonder…
Page 15
Can slow scripts affect other scripts?
Yes, but we are working on it
Separate test/live environments
What does the backend know about the scripts?
Outputs produced
Failures (causes) => these are propagated
Runtime stats in the future
Is RBea useful compared to custom Flink jobs?
Writing + maintaining streaming jobs is hard
Especially with state, windowing, MySQL etc.
© King.com Ltd 2016 – Commercially confidential Page 16
RBea physical plan
© King.com Ltd 2016 – Commercially confidential
Wrap up
Page 17
RBea makes streaming accessible to every
data scientist at King
We leverage Flink’s stateful and windowed
processing capabilities
People love it because it’s simple and
powerful
Thank you!

More Related Content

What's hot

Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansSpark Summit
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsYaroslav Tkachenko
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream ProcessingGyula Fóra
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBOkbajda
 
Baymeetup-FlinkResearch
Baymeetup-FlinkResearchBaymeetup-FlinkResearch
Baymeetup-FlinkResearchFoo Sounds
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsTim Ysewyn
 
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processingYogi Devendra Vyavahare
 
Measure your app internals with InfluxDB and Symfony2
Measure your app internals with InfluxDB and Symfony2Measure your app internals with InfluxDB and Symfony2
Measure your app internals with InfluxDB and Symfony2Corley S.r.l.
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon
 
Kafka Streams - From the Ground Up to the Cloud
Kafka Streams - From the Ground Up to the CloudKafka Streams - From the Ground Up to the Cloud
Kafka Streams - From the Ground Up to the CloudVMware Tanzu
 
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...Flink Forward
 
Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb HBaseCon
 
How to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connectHow to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connectLoi Nguyen
 
Airstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbAirstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbJen Aman
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Timo Walther
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsTim Ysewyn
 
Stream Processing in Uber
Stream Processing in UberStream Processing in Uber
Stream Processing in UberC4Media
 
Beautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBBeautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBleesjensen
 

What's hot (20)

Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick EvansRealtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
Realtime Risk Management Using Kafka, Python, and Spark Streaming by Nick Evans
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your Analytics
 
Kafka at trivago
Kafka at trivagoKafka at trivago
Kafka at trivago
 
Stateful Distributed Stream Processing
Stateful Distributed Stream ProcessingStateful Distributed Stream Processing
Stateful Distributed Stream Processing
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
 
Baymeetup-FlinkResearch
Baymeetup-FlinkResearchBaymeetup-FlinkResearch
Baymeetup-FlinkResearch
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka Streams
 
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processing
 
Measure your app internals with InfluxDB and Symfony2
Measure your app internals with InfluxDB and Symfony2Measure your app internals with InfluxDB and Symfony2
Measure your app internals with InfluxDB and Symfony2
 
HBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnBHBaseCon2017 Data Product at AirBnB
HBaseCon2017 Data Product at AirBnB
 
Introduction to influx db
Introduction to influx dbIntroduction to influx db
Introduction to influx db
 
Kafka Streams - From the Ground Up to the Cloud
Kafka Streams - From the Ground Up to the CloudKafka Streams - From the Ground Up to the Cloud
Kafka Streams - From the Ground Up to the Cloud
 
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
Flink Forward SF 2017: Chinmay Soman - Real Time Analytics in the real World ...
 
Apache HBase at Airbnb
Apache HBase at Airbnb Apache HBase at Airbnb
Apache HBase at Airbnb
 
How to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connectHow to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connect
 
Airstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbAirstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At Airbnb
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
 
Stream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka StreamsStream Processing Live Traffic Data with Kafka Streams
Stream Processing Live Traffic Data with Kafka Streams
 
Stream Processing in Uber
Stream Processing in UberStream Processing in Uber
Stream Processing in Uber
 
Beautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBBeautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDB
 

Viewers also liked

Real-time Stream Processing with Apache Flink @ Hadoop Summit
Real-time Stream Processing with Apache Flink @ Hadoop SummitReal-time Stream Processing with Apache Flink @ Hadoop Summit
Real-time Stream Processing with Apache Flink @ Hadoop SummitGyula Fóra
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichReal Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichGuido Schmutz
 
KDD 2016 Streaming Analytics Tutorial
KDD 2016 Streaming Analytics TutorialKDD 2016 Streaming Analytics Tutorial
KDD 2016 Streaming Analytics TutorialNeera Agarwal
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop EcosystemLarge-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop EcosystemGyula Fóra
 
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinReal Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinGuido Schmutz
 
Data Streaming (in a Nutshell) ... and Spark's window operations
Data Streaming (in a Nutshell) ... and Spark's window operationsData Streaming (in a Nutshell) ... and Spark's window operations
Data Streaming (in a Nutshell) ... and Spark's window operationsVincenzo Gulisano
 
Stream Analytics in the Enterprise
Stream Analytics in the EnterpriseStream Analytics in the Enterprise
Stream Analytics in the EnterpriseJesus Rodriguez
 
Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?MapR Technologies
 
Reliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoTReliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoTGuido Schmutz
 
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016Monal Daxini
 
The end of polling : why and how to transform a REST API into a Data Streamin...
The end of polling : why and how to transform a REST API into a Data Streamin...The end of polling : why and how to transform a REST API into a Data Streamin...
The end of polling : why and how to transform a REST API into a Data Streamin...Audrey Neveu
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingGuido Schmutz
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Guido Schmutz
 
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...Amazon Web Services
 
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Gyula Fóra
 
Distributed Real-Time Stream Processing: Why and How 2.0
Distributed Real-Time Stream Processing:  Why and How 2.0Distributed Real-Time Stream Processing:  Why and How 2.0
Distributed Real-Time Stream Processing: Why and How 2.0Petr Zapletal
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicerconfluent
 

Viewers also liked (20)

Real-time Stream Processing with Apache Flink @ Hadoop Summit
Real-time Stream Processing with Apache Flink @ Hadoop SummitReal-time Stream Processing with Apache Flink @ Hadoop Summit
Real-time Stream Processing with Apache Flink @ Hadoop Summit
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day MunichReal Time Analytics with Apache Cassandra - Cassandra Day Munich
Real Time Analytics with Apache Cassandra - Cassandra Day Munich
 
KDD 2016 Streaming Analytics Tutorial
KDD 2016 Streaming Analytics TutorialKDD 2016 Streaming Analytics Tutorial
KDD 2016 Streaming Analytics Tutorial
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop EcosystemLarge-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem
 
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day BerlinReal Time Analytics with Apache Cassandra - Cassandra Day Berlin
Real Time Analytics with Apache Cassandra - Cassandra Day Berlin
 
Streaming Analytics
Streaming AnalyticsStreaming Analytics
Streaming Analytics
 
Data Streaming (in a Nutshell) ... and Spark's window operations
Data Streaming (in a Nutshell) ... and Spark's window operationsData Streaming (in a Nutshell) ... and Spark's window operations
Data Streaming (in a Nutshell) ... and Spark's window operations
 
Stream Analytics in the Enterprise
Stream Analytics in the EnterpriseStream Analytics in the Enterprise
Stream Analytics in the Enterprise
 
Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?Stream Processing Everywhere - What to use?
Stream Processing Everywhere - What to use?
 
Reliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoTReliable Data Intestion in BigData / IoT
Reliable Data Intestion in BigData / IoT
 
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016Netflix keystone   streaming data pipeline @scale in the cloud-dbtb-2016
Netflix keystone streaming data pipeline @scale in the cloud-dbtb-2016
 
The end of polling : why and how to transform a REST API into a Data Streamin...
The end of polling : why and how to transform a REST API into a Data Streamin...The end of polling : why and how to transform a REST API into a Data Streamin...
The end of polling : why and how to transform a REST API into a Data Streamin...
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream Processing
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016
 
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...
Amazon Kinesis: Real-time Streaming Big data Processing Applications (BDT311)...
 
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
 
Distributed Real-Time Stream Processing: Why and How 2.0
Distributed Real-Time Stream Processing:  Why and How 2.0Distributed Real-Time Stream Processing:  Why and How 2.0
Distributed Real-Time Stream Processing: Why and How 2.0
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
 

Similar to RBea: Scalable Real-Time Analytics at King

Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...
Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...
Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...Flink Forward
 
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Amazon Web Services
 
Workflow Engines & Event Streaming Brokers - Can They Work Together?
Workflow Engines & Event Streaming Brokers - Can They Work Together?Workflow Engines & Event Streaming Brokers - Can They Work Together?
Workflow Engines & Event Streaming Brokers - Can They Work Together?HostedbyConfluent
 
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...Workflow Engines & Event Streaming Brokers - Can they work together? [Current...
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...Natan Silnitsky
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming appsTimothy Spann
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Appsssuser73434e
 
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisBDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisAmazon Web Services
 
Streaming sql w kafka and flink
Streaming sql w  kafka and flinkStreaming sql w  kafka and flink
Streaming sql w kafka and flinkKenny Gorman
 
Big Data: Mejores prácticas en AWS
Big Data: Mejores prácticas en AWSBig Data: Mejores prácticas en AWS
Big Data: Mejores prácticas en AWSAmazon Web Services
 
Set Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO RoundtableSet Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO Roundtableconfluent
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
 
Couchbase@live person meetup july 22nd
Couchbase@live person meetup   july 22ndCouchbase@live person meetup   july 22nd
Couchbase@live person meetup july 22ndIdo Shilon
 
LendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGateLendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGateRajit Saha
 
Streaming Analytics for Financial Enterprises
Streaming Analytics for Financial EnterprisesStreaming Analytics for Financial Enterprises
Streaming Analytics for Financial EnterprisesDatabricks
 
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka
IBM Cloud Pak for Integration with Confluent Platform powered by Apache KafkaIBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka
IBM Cloud Pak for Integration with Confluent Platform powered by Apache KafkaKai Wähner
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
Bridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure WebinarBridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure Webinarconfluent
 
EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?confluent
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 

Similar to RBea: Scalable Real-Time Analytics at King (20)

Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...
Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...
Flink Forward Berlin 2017: Gyula Fora - Building and operating large-scale st...
 
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
 
Workflow Engines & Event Streaming Brokers - Can They Work Together?
Workflow Engines & Event Streaming Brokers - Can They Work Together?Workflow Engines & Event Streaming Brokers - Can They Work Together?
Workflow Engines & Event Streaming Brokers - Can They Work Together?
 
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...Workflow Engines & Event Streaming Brokers - Can they work together? [Current...
Workflow Engines & Event Streaming Brokers - Can they work together? [Current...
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
 
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon KinesisBDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
BDA403 How Netflix Monitors Applications in Real-time with Amazon Kinesis
 
Streaming sql w kafka and flink
Streaming sql w  kafka and flinkStreaming sql w  kafka and flink
Streaming sql w kafka and flink
 
Big Data: Mejores prácticas en AWS
Big Data: Mejores prácticas en AWSBig Data: Mejores prácticas en AWS
Big Data: Mejores prácticas en AWS
 
Set Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO RoundtableSet Your Data In Motion - CTO Roundtable
Set Your Data In Motion - CTO Roundtable
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Stream Processing in 2019
Stream Processing in 2019Stream Processing in 2019
Stream Processing in 2019
 
Couchbase@live person meetup july 22nd
Couchbase@live person meetup   july 22ndCouchbase@live person meetup   july 22nd
Couchbase@live person meetup july 22nd
 
LendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGateLendingClub RealTime BigData Platform with Oracle GoldenGate
LendingClub RealTime BigData Platform with Oracle GoldenGate
 
Streaming Analytics for Financial Enterprises
Streaming Analytics for Financial EnterprisesStreaming Analytics for Financial Enterprises
Streaming Analytics for Financial Enterprises
 
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka
IBM Cloud Pak for Integration with Confluent Platform powered by Apache KafkaIBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka
IBM Cloud Pak for Integration with Confluent Platform powered by Apache Kafka
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Bridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure WebinarBridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure Webinar
 
EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 

Recently uploaded

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 

Recently uploaded (20)

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 

RBea: Scalable Real-Time Analytics at King

  • 1.
  • 2. © King.com Ltd 2016 – Commercially confidential RBea: Scalable Real-Time Analytics at King Gyula Fóra Data Warehouse Engineer Apache Flink PMC Page 2
  • 3. © King.com Ltd 2016 – Commercially confidential We make awesome mobile games 463 million monthly active users 30 billion events per day And a lot of data… Page 3 About King
  • 4. © King.com Ltd 2016 – Commercially confidential From streaming perspective… Page 4 DB 30 billion events / day Complex processing logic Terabytes of state DB DB
  • 5. © King.com Ltd 2016 – Commercially confidential How do we use Flink Page 5 Standalone deployment Few heavy streaming jobs RocksDB state backend with caching Lot of custom tooling
  • 6. © King.com Ltd 2016 – Commercially confidential The RBea platform Page 6 Scripting on the live streams Deploy from browser/python Window aggregates Stateful computations Scalable + fault tolerant
  • 7. © King.com Ltd 2016 – Commercially confidential RBea architecture Page 7 Events Output REST API RBEA web frontend Libraries http://hpc-asia.com/wp-content/uploads/2015/09/World-Class-Consultancy-Seeking-Data-Scientist-CA-Hobson-Associates-Matthew-Abel-Recruiter.jpg Data Scientists
  • 8. © King.com Ltd 2016 – Commercially confidential RBea backend implementation Page 8 One stateful Flink job / game Stream events and scripts Events are partitioned by user id Scripts are broadcasted Output/Aggregation happens downstream S1 S2 S3 S4 S5 Add/Remove scripts Event stream Loop over deployed scripts and process CoFlatMap Output based on API calls
  • 9. © King.com Ltd 2016 – Commercially confidential Dissecting the DSL Page 9 @ProcessEvent(semanticClass=SCPurchase.class) def process(SCPurchase purchase, Output out, Aggregators agg) { long amount = purchase.getAmount() String curr = purchase.getCurrency() out.writeToKafka("purchases", curr + "t" + amount) Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10) numPurchases.increment() }
  • 10. © King.com Ltd 2016 – Commercially confidential Dissecting the DSL Page 10 @ProcessEvent(semanticClass=SCPurchase.class) def process(SCPurchase purchase, Output out, Aggregators agg) { long amount = purchase.getAmount() String curr = purchase.getCurrency() out.writeToKafka("purchases", curr + "t" + amount) Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10) numPurchases.increment() } Processing methods by annotation Event filter conditions Flexible argument list Code-generate Java classes => void processEvent(Event e, Context ctx);
  • 11. © King.com Ltd 2016 – Commercially confidential Dissecting the DSL Page 11 @ProcessEvent(semanticClass=SCPurchase.class) def process(SCPurchase purchase, Output out, Aggregators agg) { long amount = purchase.getAmount() String curr = purchase.getCurrency() out.writeToKafka("purchases", curr + "t" + amount) Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10) numPurchases.increment() } Output calls create Output events Output(KAFKA, “purchases”, “…” ) These events are filtered downstream and sent to a Kafka sink
  • 12. © King.com Ltd 2016 – Commercially confidential Dissecting the DSL Page 12 @ProcessEvent(semanticClass=SCPurchase.class) def process(SCPurchase purchase, Output out, Aggregators agg) { long amount = purchase.getAmount() String curr = purchase.getCurrency() out.writeToKafka("purchases", curr + "t" + amount) Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10) numPurchases.increment() } Aggregator calls create Aggregate events Aggr (MYSQL, 60000, “PurchaseCount”, 1) Flink window operators do the aggregation
  • 13. © King.com Ltd 2016 – Commercially confidential Aggregators Page 13 long size = aggregate.getWindowSize(); long start = timestamp - (timestamp % size); long end = start + size; TimeWindow tw = new TimeWindow(start, end); Event time windows Window size / aggregator Script1 Script2 Window 1Window 2 NumGames Revenue W1: 8999 W2: 9001 W1: 200 W2: 300 MyAggregator W1: 10 W2: 5 Dynamic window assignment
  • 14. © King.com Ltd 2016 – Commercially confidential User states Page 14 RBea Script LAST_GAMESTART LAST_PURCHASE LAST_… + User defined states Fields byte[] User State byte[]byte[] LRU Cache Scripts can read all state But write only their own
  • 15. © King.com Ltd 2016 – Commercially confidential Things you might wonder… Page 15 Can slow scripts affect other scripts? Yes, but we are working on it Separate test/live environments What does the backend know about the scripts? Outputs produced Failures (causes) => these are propagated Runtime stats in the future Is RBea useful compared to custom Flink jobs? Writing + maintaining streaming jobs is hard Especially with state, windowing, MySQL etc.
  • 16. © King.com Ltd 2016 – Commercially confidential Page 16 RBea physical plan
  • 17. © King.com Ltd 2016 – Commercially confidential Wrap up Page 17 RBea makes streaming accessible to every data scientist at King We leverage Flink’s stateful and windowed processing capabilities People love it because it’s simple and powerful