Real-time driving score service
using Flink
Dongwon Kim
SK telecom
My talks @FlinkForward
Flink Forward 2015
A Comparative Performance
Evaluation of Flink
Flink Forward 2017
Predictive Maintenance
with
Deep Learning and Flink
Flink Forward 2018
Real-time driving score service
using Flink
T map, a mobile navigation app by SK telecom
≈
Choose from
frequent locations
Enter an address
or
a place name
Waze
Google Maps
T map, a mobile navigation app by SK telecom
multiple route options
in driving mode
arriving at destination
Driving score service by T map
I scored 83
out of 100! yay!
Driving score
KB Insurance DB Insurance
10% discount 10% discount
Car insurance discount for safe drivers
If you drive safely with T map,
automobile insurance premiums
go down.
Driving score is based on three factors
My driving score
Rank : 970k
Speeding
Rapid
accel.
Rapid
decel.
great good good
Monthly chart
Apr May Jun Jul Aug
The three factors are calculated for each session
6/29 (Fri.)
min
min
SKT Network Operation Center
Yanghyeon Village
●speeding 0 ●rapid acc. 0 ●rapid decel. 0
●speeding 1 ●rapid acc. 1 ●rapid decel. 0
6/28 (Thu.)
min
min
SKT Network Operation Center
Yanghyeon Village
●speeding 1 ●rapid acc. 1 ●rapid decel. 0
●speeding 1 ●rapid acc. 1 ●rapid decel. 1
● ● ●
The three factors are calculated for each session
● ● ●
Speeding
0.2km
My speed : 90km/h
(Speed limit : 70km/h)
Rapid accel.
(within 3 sec)
Rapid decel.
(within 3 sec)
Current client-server architecture
A GPS trajectory is generated
for each driving session
…
GPS coord.
• latitude
• longitude
• altitude
T1
GPS coord.
• latitude
• longitude
• altitude
T2
GPS coord.
• latitude
• longitude
• altitude
TN
T map
GPS trajectory
driving score
(+1day)
Batch ETL jobs are executed
twice a day
to calculate three factors ●●●
from trajectories
The main drawback
Users cannot see today’s driving scores until tomorrow
T map service server
...
11min
SKT Network Operation Center
●speeding 1 ●rapid acc. 1 ●rapid decel. 1
Migration from batch ETL to streaming processing
... ...
Service
DB
Millions
of users
...
Batch processing
Real-time streaming
processing
Goal
Let users know driving scores ASAP
Why we chose to use Flink
https://flink.apache.org/introduction.html#features-why-flink
Exactly-once semantics
for stateful computations
stream processing and windowing
with event time semantics
flexible windowing
light-weight fault-tolerance
high throughput and low latency
Contents
• Dataflow design and trigger customization
• Instrumentation with Prometheus
Source
JSON
parser Sink Kafka Kafka Service DB User
key-based
Bounded
OutOfOrderness
TimestampExtractor
(BOOTE)
messages
USER1 to ...
USER2 to ...
USER3 to ...
......
user ID +
destination
Session window
with a custom trigger
Define metrics Collect metrics Plot metrics
A 12-minute drive with 720 GPS coordinates
T map T map service server
...
...
...
...
T map generates
a GPS coordinate
every second
T map sends 4 messages to the service server
1st periodic message
(300 coordinates for the first 5 mins)
2nd periodic message
(300 coordinates for the next 5 mins)
End message
(120 coordinates for the last 2 mins)
...
...
...
T map T map service server
...
Init
message
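As a point of reference, below is a minimal sketch of what such a message might look like as a Scala type, assembled only from the fields mentioned on these slides (user ID, destination, message type, and GPS coordinates with latitude, longitude, and altitude). The names are placeholders rather than T map's actual schema; later sketches in this deck reuse this hypothetical Message type.

// Hypothetical message shape (illustrative only, not T map's real schema).
case class GpsCoord(latitude: Double, longitude: Double, altitude: Double, timestampMillis: Long)

sealed trait MsgType
case object Init extends MsgType      // sent when a driving session starts
case object Periodic extends MsgType  // sent every 5 minutes with up to 300 coordinates
case object End extends MsgType       // sent at the destination with the remaining coordinates

case class Message(
  userId: String,          // first half of the session key
  destination: String,     // second half of the session key
  msgType: MsgType,
  eventTimeMillis: Long,   // event time used for watermarks (assumed field)
  coords: Seq[GpsCoord])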
Return scores right after receiving end messages
T map
driving score
7:20
T map service server
...
Init
a
7:08
Periodic
b
7:13
c
7:18
End
d
7:20
Messages
11min
SKT Network Operation Center
●speeding 1 ●rapid acc. 1 ●rapid decel. 1
Real-time driving score dataflow using Flink
Source
JSON
parser Sink Kafka Kafka Service DB User
key-based
Logical
dataflow
messages
USER1 to ...
USER2 to ...
USER3 to ...
......
user ID +
destination
Session window
with a custom trigger
Bounded
OutOfOrderness
TimestampExtractor
(BOOTE)
at-least-once Kafka producer
session gap : 1 hour
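Below is a minimal sketch of this logical dataflow in Flink's Scala DataStream API, stitching together the pieces named on this slide. Topic names, the broker address, the JSON parser, the per-session scoring function, and the hypothetical Message type from earlier are all placeholders; the early-result trigger used here is the simplified sketch given later in this deck, standing in for the author's EarlyResultEventTimeTrigger.

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.streaming.connectors.kafka.{FlinkKafkaConsumer011, FlinkKafkaProducer011}
import org.apache.flink.util.Collector

object DrivingScoreJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    val props = new Properties()
    props.setProperty("bootstrap.servers", "kafka:9092") // placeholder broker address
    props.setProperty("group.id", "driving-score")

    env
      .addSource(new FlinkKafkaConsumer011[String]("tmap-messages", new SimpleStringSchema(), props))
      .map(json => parseMessage(json))                              // JSON parser (placeholder)
      .assignTimestampsAndWatermarks(new MessageTimestampExtractor) // BOOTE, maxOutOfOrderness = 1 sec
      .keyBy(m => (m.userId, m.destination))                        // key: user ID + destination
      .window(EventTimeSessionWindows.withGap(Time.hours(1)))       // session gap: 1 hour
      .trigger(new EarlyResultTriggerSketch[Message](_.msgType == End)) // custom trigger (sketched later)
      .process(new DrivingScoreWindowFunction)                      // three factors per session
      .addSink(new FlinkKafkaProducer011[String]("driving-scores", new SimpleStringSchema(), props))
      // FlinkKafkaProducer011 delivers with at-least-once semantics by default

    env.execute("real-time driving score")
  }

  // Placeholder: turns a JSON string into the hypothetical Message type.
  def parseMessage(json: String): Message = ???

  // Placeholder window function: would compute speeding / rapid accel. / rapid decel.
  // for one session and emit the result as a string for the sink.
  class DrivingScoreWindowFunction
      extends ProcessWindowFunction[Message, String, (String, String), TimeWindow] {
    override def process(key: (String, String), context: Context,
                         messages: Iterable[Message], out: Collector[String]): Unit =
      out.collect(s"""{"user":"${key._1}","messages":${messages.size}}""") // factors omitted in this sketch
  }
}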
Real-time driving score dataflow using Flink
Source
JSON
parser Sink Kafka Kafka Service DB User
key-based
Logical
dataflow
Bounded
OutOfOrderness
TimestampExtractor
(BOOTE)
...
Source
Session window
with a custom trigger
p0
p1
p2
p19
20
partitions
20
tasks
256 tasks
...
...
...
p0
p1
p2
p19
20
partitions
Sink
...
several million
users
20
tasks
256
tasks
Service DB
...
User
Physical
dataflow
...
20
tasks
JSON
parser BOOTE
messages
USER1 to ...
USER2 to ...
USER3 to ...
......
user ID +
destination
messages
…
…
......
user ID +
destination
messages
…
…
......
user ID +
destination
messages
…
…
......
user ID +
destination
Session window
with a custom trigger
Session window (gap : 1 hour) with different triggers
The default
EventTimeTrigger
EarlyResultEventTimeTrigger
Time
Time
Session window (gap : 1 hour) with different triggers
8:13 8:18 8:20 8:08
abcd
7:13
b
Periodic
7:08
a
Init
7:18
c
Periodic
The default
EventTimeTrigger
● 1 ● 1 ● 1
EarlyResultEventTimeTrigger
7:20
d
End
fire
Time
Time
Watermark
Session window (gap : 1 hour) with different triggers
8:13 8:18 8:20 8:08 7:13
b
Periodic
7:08
a
Init
7:18
c
Periodic
7:20
d
End
8:13 8:18 8:20 8:08 7:13
b
Periodic
7:08
a
Init
7:18
c
Periodic
abcd
abcd ● 1 ● 1 ● 1
● 1 ● 1 ● 1
The default
EventTimeTrigger
EarlyResultEventTimeTrigger
7:20
d
End
early fire DO NOT fire
fire
(necessary in case of out-of-order messages)
Time
Time
Early timer
Slow for some reason
Out-of-order messages
...
Source
... ......
JSON
parser
p0
p1
p2
p19
...
p0
p1
p2
p19
Service
DB
... ...
a
Init
b
Periodic
cd
End
a
b
c
d
messages
……
……
......
user ID +
destination
messages
……
……
......
user ID +
destination
Session window w/
EarlyResultEventTimeTrigger
(session gap : 1 hour)
Sink
messages
user ID +
destination
……
……
Dongwon
to
SKT NOC
ab
Dongwon’s
iPhone
BOOTE
(maxOoO : 1 sec)
c d
How EarlyResultEventTimeTrigger deals with out-of-order messages
[Case 1] C arrives
before the early timer expires
c
[Case 2] C arrives
after the early timer expires
c
Time
Time
b
Periodic
a
Init
c
Periodic
abdc ● 1 ● 1 ● 1
d
End
early fire (perfect result) DO NOT fire
(no messages added after the last fire)
[Case 1] C arrives
before the early timer expires
c
[Case 2] C arrives
after the early timer expires
c
Time
Time
How EarlyResultEventTimeTrigger deals with out-of-order messages
b
Periodic
a
Init
c
Periodic
● 1 ● 1 ● 1
d
End
early fire (perfect result) DO NOT fire
(no messages added after the last fire)
b
Periodic
a
Init
c
Periodic
abd ● 0 ● 1 ● 1
d
End
early fire (incomplete result)
[Case 1] C arrives
before the early timer expires
c
[Case 2] C arrives
after the early timer expires
c
2nd fire (perfect result)
abc d ● 1 ● 1 ● 1
Time
Time
abdc
How EarlyResultEventTimeTrigger deals with out-of-order messages
EarlyResultEventTimeTrigger
[Constructor]
Get an evaluator to determine early firing
https://github.com/eastcirclek/flink-examples/blob/master/src/main/scala/com/github/eastcirclek/flink/trigger/EarlyResultEventTimeTrigger.scala
[onElement]
register an early timer
if the evaluator returns true
(e.g. when the end message comes in)
[onEventTime]
Fire if the early timer expires
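The actual implementation is at the link above. Purely as an illustration of the three callbacks described on this slide, here is a simplified sketch (not the author's exact code) written against Flink's Trigger API: the constructor takes the evaluator as a predicate, onElement registers an extra event-time timer when the evaluator recognizes an end message, and onEventTime fires only if new elements arrived since the last firing.

import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.streaming.api.windowing.triggers.{Trigger, TriggerResult}
import org.apache.flink.streaming.api.windowing.windows.TimeWindow

// Simplified sketch of an early-result event-time trigger for session windows.
class EarlyResultTriggerSketch[T](isEndMessage: T => Boolean) extends Trigger[T, TimeWindow] {

  // Per key/window flag: did any element arrive after the last firing?
  private val newSinceFire =
    new ValueStateDescriptor[java.lang.Boolean]("newSinceFire", classOf[java.lang.Boolean])

  override def onElement(element: T, timestamp: Long, window: TimeWindow,
                         ctx: Trigger.TriggerContext): TriggerResult = {
    ctx.getPartitionedState(newSinceFire).update(true)
    ctx.registerEventTimeTimer(window.maxTimestamp())  // regular end-of-window timer
    if (isEndMessage(element))
      ctx.registerEventTimeTimer(timestamp)            // early timer at the end message's timestamp
    TriggerResult.CONTINUE
  }

  override def onEventTime(time: Long, window: TimeWindow,
                           ctx: Trigger.TriggerContext): TriggerResult = {
    val state = ctx.getPartitionedState(newSinceFire)
    if (state.value() != null && state.value()) {
      state.update(false)
      TriggerResult.FIRE       // early fire (or a later fire when out-of-order data arrived)
    } else {
      TriggerResult.CONTINUE   // nothing new since the last fire: do not fire again
    }
  }

  override def onProcessingTime(time: Long, window: TimeWindow,
                                ctx: Trigger.TriggerContext): TriggerResult = TriggerResult.CONTINUE

  override def canMerge(): Boolean = true  // session windows merge as they grow

  override def onMerge(window: TimeWindow, ctx: Trigger.OnMergeContext): Unit =
    ctx.registerEventTimeTimer(window.maxTimestamp())

  override def clear(window: TimeWindow, ctx: Trigger.TriggerContext): Unit = {
    ctx.deleteEventTimeTimer(window.maxTimestamp())
    ctx.getPartitionedState(newSinceFire).clear()
  }
}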
Contents
• Dataflow design and trigger customization
• Instrumentation with Prometheus
Source
JSON
parser Sink Kafka Kafka Service DB User
key-based
Bounded
OutOfOrderness
TimestampExtractor
(BOOTE)
messages
USER1 to ...
USER2 to ...
USER3 to ...
......
user ID +
destination
Session window
with a custom trigger
Define metrics Collect metrics Plot metrics
Individual message statistics
N:1
Message stats.
extractor
Message stats.
sink
Source Sink Kafka Kafka
JSON
parser BOOTE
key-based
messages
……
……
......
user ID +
destination
Source
20 tasks
... ...
20 tasks
JSON
parser
...
Message stats.
extractor
Message stats.
sink
20 tasks
1 task
Logical
dataflow
Physical
dataflow
Session window
Service DB User
Individual message statistics
1K messages per second
100M messages per day
10s of MB per second
2 TB per day
N:1
Message stats.
extractor
Message stats.
sink
Source Sink Kafka Kafka
JSON
parser BOOTE
key-based
messages
……
……
......
user ID +
destination
Logical
dataflow
Session window
Service DB User
meter histogram histogram meter
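As an illustration of how these operators register their metrics, here is a hedged sketch of a message-statistics extractor that forwards each message unchanged while updating a meter (message rate) and a histogram. The Message type and its fields are the hypothetical ones from earlier, the metric names are made up, and the histogram wrapper requires the flink-metrics-dropwizard dependency.

import com.codahale.metrics.SlidingWindowReservoir
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper
import org.apache.flink.metrics.{Histogram, Meter, MeterView, SimpleCounter}

class MessageStatsExtractor extends RichMapFunction[Message, Message] {
  @transient private var msgRate: Meter = _
  @transient private var msgSize: Histogram = _

  override def open(parameters: Configuration): Unit = {
    val group = getRuntimeContext.getMetricGroup
    msgRate = group.meter("messagesPerSecond", new MeterView(new SimpleCounter(), 60))
    msgSize = group.histogram("coordinatesPerMessage",
      new DropwizardHistogramWrapper(
        new com.codahale.metrics.Histogram(new SlidingWindowReservoir(500))))
  }

  override def map(msg: Message): Message = {
    msgRate.markEvent()                 // one event per incoming message
    msgSize.update(msg.coords.size)     // distribution of coordinates per message
    msg                                 // pass the message through unchanged
  }
}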
Jitter (ingestion time – event time)
Source Sink Kafka Kafka
JSON
parser
Bounded
OutOfOrderness
TimestampExtractor
key-based
messages
……
……
......
user ID +
destination
Logical
dataflow
Session window
Service DB
event
time
ingestion
time
User
1 sec
Based on this observation,
we use 1 sec for maxOutOfOrderness
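In code, the BOOTE from the dataflow slides can then be as small as the sketch below, with the 1-second bound taken from this jitter measurement (the Message type and its timestamp field are the hypothetical ones sketched earlier).

import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.windowing.time.Time

// Watermarks trail the highest event time seen so far by 1 second,
// so messages may arrive up to 1 second out of order.
class MessageTimestampExtractor
    extends BoundedOutOfOrdernessTimestampExtractor[Message](Time.seconds(1)) {
  override def extractTimestamp(msg: Message): Long = msg.eventTimeMillis
}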
Session output statistics
N:1 N:1
Message stats.
extractor
Message stats.
sink
Session output stats.
extractor
Session output stats.
sink
Source Sink Kafka Kafka
JSON
parser BOOTE
key-based
messages
……
……
......
user ID +
destination
Source
20 tasks
... ...
20 tasks
JSON
parser
...
Message stats.
extractor
Message stats.
sink
20 tasks
1 task
messages
……
……
......
user ID +
destination
256 tasks
Session output stats.
extractor
Session output stats.
sink
256 tasks
1 task...
Session window
...
messages
……
……
......
user ID +
destination
messages
……
……
......
user ID +
destination
● ● ● ● ● ●
● ● ● ● ● ●
● ● ● ● ● ●
Logical
dataflow
Physical
dataflow
Session window
Service DB User
Session output statistics
N:1
Session output stats.
extractor
Session output stats.
sink
Source Sink Kafka Kafka
JSON
parser BOOTE
key-based
messages
……
……
......
user ID +
destination
Logical
dataflow
Session window
Service DB User
N:1
Message stats.
extractor
Message stats.
sink
meter histogram histogram meter
Our own definition of latency
ingestion time
of end messages
Session output stats.
extractor
Session output stats.
sink
Source
Sink
Kafka
Kafka
JSON
parser BOOTE
Session window
messages
user ID +
destination
……
……
Dongwon
to
SKT NOC
abcd
End
d
End
d
End
d
End
d
processing time of
the resultant session output
@extractor
● 1 ● 1 ● 1
● 1 ● 1 ● 1
Considering maxOutOfOrderness is 1 second,
Flink takes at most 250 milliseconds
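A hedged sketch of how this latency could be computed in the session-output stats extractor: keep the end message's ingestion time in the session output, subtract it from the processing time at which the result is observed, and feed the difference into a histogram like the one registered earlier. The field and metric names are assumptions.

import org.apache.flink.metrics.Histogram

// latency = processing time when the session result reaches the extractor
//           minus ingestion time of the end message (both in milliseconds)
def recordLatency(endMsgIngestionTimeMillis: Long, latencyHistogram: Histogram): Unit = {
  val latencyMillis = System.currentTimeMillis() - endMsgIngestionTimeMillis
  latencyHistogram.update(latencyMillis)
}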
N:1 N:1
Message stats.
extractor
Message stats.
sink
Session output stats.
extractor
Session output stats.
sink
Source Sink Kafka Kafka
JSON
parser BOOTE
key-based
messages
……
……
......
user ID +
destination
Service DB User
How to expose metrics to Prometheus?
Session window
Flink metric reporters
TaskManager #1 TaskManager #2JobManager
Push-model and pull-model
Ganglia
reporter
Ganglia
reporter
Ganglia
reporter
push push push
TaskManager #1 TaskManager #2JobManager
Push-model and pull-model
Ganglia
reporter
Graphite
reporter
Ganglia
reporter
Graphite
reporter
Ganglia
reporter
Graphite
reporter
push push push
pushed
TaskManager #1 TaskManager #2JobManager
Push-model and pull-model
Prometheus
reporter
(HTTP endpoint)
Ganglia
reporter
Graphite
reporter
Prometheus
reporter
(HTTP endpoint)
Ganglia
reporter
Graphite
reporter
Prometheus
reporter
(HTTP endpoint)
Ganglia
reporter
Graphite
reporter
pull
pushed pushed
Reporter configuration
Prometheus
reporter
Ganglia
reporter
Graphite
reporter
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#reporter
client
socket
client
socket
server
socket
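For reference, reporters are enabled in flink-conf.yaml. The sketch below shows plausible entries for the pull-based Prometheus reporter and a push-based Graphite reporter; reporter names, hosts, and the port range are examples, and the matching reporter jars must be on Flink's classpath.

# flink-conf.yaml (illustrative)
# pull model: each JM/TM opens an HTTP endpoint that Prometheus scrapes
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 5001-5100

# push model: each JM/TM pushes to a fixed address
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.grph.host: graphite.example.com
metrics.reporter.grph.port: 2003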
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
Q. Can we list the endpoint addresses
before YARN’s scheduling?
A. No, impossible
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
TM
Prom. endpoint
w1:5002
TM
Prom. endpoint
w2:5001
JM
Prom. endpoint
w2:5001 TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
Possible world #1
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JM
Prom. endpoint
w2:5001 TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
Possible world #2
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5001
JM
Prom. endpoint
w2:5001
TM
Prom. endpoint
w3:5002
TM
Prom. endpoint
w4:5001
Possible world #3
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5001
JM
Prom. endpoint
w2:5001
TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5002
Possible world #4
Node
Manager
w1
Node
Manager
w2
Node
Manager
w3
Node
Manager
w4
Resource
Manager
Endpoint addresses cannot be determined in advance
#!/bin/bash
# launch a Flink per-job cluster on YARN
flink run
--jobmanager yarn-cluster
--yarncontainer 4
...
# flink-conf.yaml
...
metrics.reporter.prom.port: 5001-5100
...
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JM
Prom. endpoint
w2:5001 TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
TM
Prom. endpoint
w1:5002
TM
Prom. endpoint
w2:5003
JM
Prom. endpoint
w3:5002
TM
Prom. endpoint
w3:5003
TM
Prom. endpoint
w4:5002
Where to scrape metrics from?
Endpoint addresses are available after a cluster is up
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JobManager
Prom. endpoint : w2:5001
TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
TM
Prom. endpoint
w1:5002
TM
Prom. endpoint
w2:5003
JobManager
Prom. endpoint : w3:5002
TM
Prom. endpoint
w3:5003
TM
Prom. endpoint
w4:5002
A per-job cluster
(YARN ID : application_1500000000000_0001)
Another per-job cluster
(YARN ID : application_1500000000000_0002)
File-based service discovery mechanism
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JobManager
Prom. endpoint : w2:5001
TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
TM
Prom. endpoint
w1:5002
TM
Prom. endpoint
w2:5003
JobManager
Prom. endpoint : w3:5002
TM
Prom. endpoint
w3:5003
TM
Prom. endpoint
w4:5002
A per-job cluster
(YARN ID : application_1500000000000_0001)
Another per-job cluster
(YARN ID : application_1500000000000_0002)
/etc/prometheus/flink-service-discovery/
[
  {
    "targets": ["w2:5001", "w1:5001", "w2:5002", "w3:5001", "w4:5001"]
  }
]
application_1528160315197_0001.json
[
  {
    "targets": ["w3:5002", "w1:5002", "w2:5003", "w3:5003", "w4:5002"]
  }
]
application_1528160315197_0002.json
Prometheus watches file names matching a given pattern
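On the Prometheus side this corresponds to a file_sd_configs entry in prometheus.yml pointing at that directory; a plausible sketch (the job name and refresh interval are examples):

# prometheus.yml (illustrative)
scrape_configs:
  - job_name: flink
    file_sd_configs:
      - files:
          - /etc/prometheus/flink-service-discovery/*.json
        refresh_interval: 30s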
File-based service discovery
scrape metrics
from known endpoints
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JobManager
Prom. endpoint : w2:5001
TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
TM
Prom. endpoint
w1:5002
TM
Prom. endpoint
w2:5003
JobManager
Prom. endpoint : w3:5002
TM
Prom. endpoint
w3:5003
TM
Prom. endpoint
w4:5002
A per-job cluster
(YARN ID : application_1500000000000_0001)
Another per-job cluster
(YARN ID : application_1500000000000_0002)
w2:5001, w1:5001, w2:5002, w3:5001, w4:5001
w3:5002, w1:5002, w2:5003, w3:5003, w4:5002
flink-service-discovery
https://github.com/eastcirclek/flink-service-discovery
YARN
Resource
Manager
discovery.py
param1) rmAddr
param2) targetDir
TM
Prom. endpoint
w1:5001
TM
Prom. endpoint
w2:5002
JobManager
Prom. endpoint : w2:5001
TM
Prom. endpoint
w3:5001
TM
Prom. endpoint
w4:5001
1) watch a new Flink cluster
2) get the address of JM
3) get all TM identifiers
4) identify all endpoints
by scraping JM/TM logs
[
  {
    "targets": ["w2:5001", "w1:5001", "w2:5002", "w3:5001", "w4:5001"]
  }
]
application_1528160315197_0001.json
/etc/prometheus/flink-service-discovery/
5) create a file
6) scrape metrics from JM and TMs
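An illustrative invocation, following the two parameters shown above (the YARN ResourceManager address and the Prometheus target directory); see the GitHub README for the exact options.

# example only; 8088 is YARN's default ResourceManager web port
python discovery.py resourcemanager:8088 /etc/prometheus/flink-service-discovery/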
Grafana dashboard
Overview & summary
• Dataflow design and trigger customization
• Instrumentation with Prometheus
Source
JSON
parser Sink Kafka Kafka Service DB User
key-based
Bounded
OutOfOrderness
TimestampExtractor
(BOOTE)
messages
USER1 to ...
USER2 to ...
USER3 to ...
......
user ID +
destination
Session window
with a custom trigger
Define metrics Collect metrics Plot metrics
THE END