Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber

Scalable Real-Time Complex Event Processing @Uber
Shuyi Chen
Uber Technology Inc.

6 continents, 70+ countries and 400+ cities
Transportation as reliable as running water, everywhere,
for everyone
Uber

Outline
● Motivation
● Architecture
● Limitations
● Challenges

Thousands of Kafka topics from micro-services

We can extract a lot of useful information from this
rich set of logs in real-time!

Multiple logins from the same IP in the last 10
minutes

Partner accepted a trip
→ partner calls rider through the Uber APP
→ rider cancels the trip

Partners reject the second pickup of a UberPOOL
trip

Multiple logins from the same IP in the last 10
minutes
Window Aggregation
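
As a rough illustration of window aggregation, the sketch below counts logins per IP over a sliding 10-minute window. It is a minimal in-memory example, not the production implementation: the class name `LoginWindowCounter` and the sample IPs are hypothetical, timestamps are event-time seconds, and events are assumed to arrive in order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10 * 60  # the 10-minute window from the example above


class LoginWindowCounter:
    """Counts logins per IP over a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        # ip -> timestamps of logins that are still inside the window
        self.timestamps = defaultdict(deque)

    def add(self, ip, timestamp):
        """Record one login and return the login count for this IP in the window."""
        window = self.timestamps[ip]
        window.append(timestamp)
        # Expire logins that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window)


counter = LoginWindowCounter()
counts = [
    counter.add("10.0.0.1", 0),
    counter.add("10.0.0.1", 120),
    counter.add("10.0.0.2", 130),
    counter.add("10.0.0.1", 700),  # the login at t=0 has expired by now
]
```

A count above 1 for an IP would correspond to the "multiple logins" condition.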

Partner accepted a trip
→ partner calls rider through the Uber APP
→ rider cancels the trip
Pattern detection
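
Pattern detection can be pictured as a small state machine kept per trip. The sketch below is an assumption-laden illustration only: the event names in `PATTERN` and the class `SequenceDetector` are hypothetical, and unrelated events between steps are simply ignored.

```python
# Hypothetical event names for: accepted -> partner calls rider -> rider cancels.
PATTERN = ("trip_accepted", "partner_called_rider", "rider_cancelled")


class SequenceDetector:
    """Tracks, per trip, how far along the pattern the trip has progressed."""

    def __init__(self, pattern=PATTERN):
        self.pattern = pattern
        self.progress = {}  # trip_id -> index of the next expected step

    def on_event(self, trip_id, event_type):
        """Return True when this event completes the pattern for the trip."""
        step = self.progress.get(trip_id, 0)
        if event_type == self.pattern[step]:
            step += 1
        # Events that do not match the next expected step leave progress unchanged.
        if step == len(self.pattern):
            self.progress.pop(trip_id, None)  # reset so the trip can match again
            return True
        self.progress[trip_id] = step
        return False


detector = SequenceDetector()
matched = [
    detector.on_event("trip-1", "trip_accepted"),
    detector.on_event("trip-2", "trip_accepted"),
    detector.on_event("trip-1", "partner_called_rider"),
    detector.on_event("trip-2", "rider_cancelled"),  # no call happened first
    detector.on_event("trip-1", "rider_cancelled"),  # completes the pattern
]
```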

Partners reject the second pickup of a UberPOOL
trip
Filter

Can we use a declarative language to specify this
stream processing logic?

Complex event processing
● Combines data from multiple sources to infer events or patterns that suggest
more complicated circumstances
● CEP is used across many industries for various use cases, including:
○ Finance: Trade analysis, fraud detection
○ Airlines: Operations monitoring
○ Healthcare: Claims processing, patient monitoring
○ Energy and Telecommunications: Outage detection
● CEP uses a declarative rule/query language to specify event processing logic

WSO2/Siddhi: Complex event processing engine
● Lightweight, extensible, open source, released as a Java library
● Features supported
○ Filter
○ Join
○ Aggregation
○ Group by
○ Window
○ Pattern processing
○ Sequence processing
○ Event tables
○ Event-time processing
○ UDF
○ Extensions
○ Declarative query language: SiddhiQL

How Siddhi works
● Specify processing logic declaratively with SiddhiQL

How Siddhi works
● The query is parsed at runtime into an execution plan runtime
● As events flow in, the execution plan runtime processes them inside the CEP
engine according to the query logic

How can we make it scalable at Uber scale?

Apache Samza
● A distributed stream processing framework
○ Distributed and Scalable
○ Built-in State management
○ Built-in fault tolerance
○ At-least-once message processing
○ Infrastructure support at Uber

How can we make the stream processing output
useful?

Actions
● Generalize a set of common action templates to make it easy for
micro-services and humans to harness the power of real-time stream
processing
● Currently we support
○ Make an RPC call
○ Invoke a Webhook endpoint
○ Index to ElasticSearch
○ Index to Cassandra
○ Kafka
○ Statsd
○ Chat service
○ Email
○ Push notification

Actions
Real-time Scalable Complex Event Processing

Partitioner
● Re-shuffle events based on key
● Support predicate pushdown through query analysis
● Support column pruning through query analysis (WIP)
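
The first two partitioner responsibilities can be sketched as follows. This is a simplified in-memory illustration, not Samza or Kafka code: `partition_for` and `repartition` are hypothetical helpers, CRC32 is just one possible stable hash, and the predicate stands in for a filter that query analysis has pushed down ahead of the shuffle.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable hash so that the same key always lands on the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def repartition(events, key_field, num_partitions, predicate=None):
    """Re-shuffle events by key, dropping filtered events before the shuffle."""
    partitions = [[] for _ in range(num_partitions)]
    for event in events:
        # Predicate pushdown: events the query would discard are never shuffled.
        if predicate is not None and not predicate(event):
            continue
        index = partition_for(str(event[key_field]), num_partitions)
        partitions[index].append(event)
    return partitions


events = [
    {"rider_id": "r1", "type": "login"},
    {"rider_id": "r2", "type": "logout"},
    {"rider_id": "r1", "type": "login"},
]
partitions = repartition(
    events, "rider_id", 4, predicate=lambda e: e["type"] == "login"
)
```

Filtering before the shuffle reduces the volume written to the intermediate topic, which matters when a heavy topic is re-shuffled.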

Query processor
● Parse Siddhi queries into execution plan runtime
● Process events in Siddhi execution plan runtime
● Checkpoint state regularly to ensure recovery upon crash/restart using
RocksDB

Action processor
● Execute actions upon the query processing output
● Support various kinds of actions for easy integration
● Implement action retry mechanism using RocksDB to provide at-least-once
delivery
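
A minimal sketch of the retry idea, under stated assumptions: a plain dict stands in for the RocksDB store, `ActionProcessor` and `flaky_webhook` are hypothetical names, and an action stays pending until it succeeds. At-least-once delivery means the downstream receiver may see duplicates and should tolerate them.

```python
class ActionProcessor:
    """Keeps each action in a pending store until it has executed successfully."""

    def __init__(self, action):
        self.action = action
        self.pending = {}  # stand-in for RocksDB: event_id -> payload

    def submit(self, event_id, payload):
        # Persist before executing so a crash cannot lose the action.
        self.pending[event_id] = payload

    def run_once(self):
        """Try every pending action once; failures stay pending for the next run."""
        for event_id, payload in list(self.pending.items()):
            try:
                self.action(payload)
            except Exception:
                continue  # keep it pending and retry later
            del self.pending[event_id]


delivered = []
calls = {"count": 0}


def flaky_webhook(payload):
    """Hypothetical action that fails on its first invocation."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated failure")
    delivered.append(payload)


processor = ActionProcessor(flaky_webhook)
processor.submit("evt-1", {"alert": "multiple logins"})
processor.run_once()  # first attempt fails, the action stays pending
processor.run_once()  # the retry succeeds
```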

How do we translate a query into a physical plan that
runs?

DAG (Directed Acyclic Graph) generation
● Analyze Siddhi query to automatically generate the stream processing DAG in
Samza using the processors
Filter, transformation

No stream processing logic is hard-coded in any of
the processors

REST API backend
● All queries and actions are stored externally in a database
● RESTful API for CRUD operations
● If query/action logic changes
○ Redeploy the Samza DAG if needed
○ Otherwise, the updated queries/actions are loaded at runtime without interruption

Unified management and monitoring
● Every use case
○ Shares the same set of processors
○ Uses queries and actions to describe its processing logic
● A single monitoring template can be reused across different use cases

Production status
● In production for >1.5 years
● 120+ production use cases
● 30+ billion messages processed per day

Applications
● Real-time fraud detection
● Real-time anomaly detection
● Real-time marketing campaign
● Real-time promotion
● Real-time monitoring
● Real-time feedback system
● Real-time analytics
● Real-time visualizations
● And more

Out-of-order event handling
● Not a big concern
○ Events of the same rider/partner are usually seconds apart
● K-slack extension in Siddhi for out-of-order event processing
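
The K-slack idea can be sketched as a reorder buffer: hold events back until the largest timestamp seen is at least K ahead of them, then release them in timestamp order. The code below is a simplified illustration with a fixed K and does not reproduce the Siddhi extension; `KSlackBuffer` is a hypothetical name.

```python
import heapq
import itertools


class KSlackBuffer:
    """Reorders events, tolerating up to k time units of lateness (fixed k)."""

    def __init__(self, k):
        self.k = k
        self.max_seen = float("-inf")
        self.heap = []
        self.sequence = itertools.count()  # tie-breaker for equal timestamps

    def add(self, timestamp, event):
        """Buffer one event and return the events that are now safe to emit."""
        heapq.heappush(self.heap, (timestamp, next(self.sequence), event))
        self.max_seen = max(self.max_seen, timestamp)
        released = []
        while self.heap and self.heap[0][0] <= self.max_seen - self.k:
            ts, _, ev = heapq.heappop(self.heap)
            released.append((ts, ev))
        return released

    def flush(self):
        """Emit everything still buffered, in timestamp order."""
        released = []
        while self.heap:
            ts, _, ev = heapq.heappop(self.heap)
            released.append((ts, ev))
        return released


buffer = KSlackBuffer(k=5)
ordered = []
for ts, name in [(10, "a"), (8, "b"), (14, "c"), (20, "d")]:
    ordered.extend(buffer.add(ts, name))
ordered.extend(buffer.flush())
```

The trade-off is latency: every event is delayed by up to K before it reaches the query, so a small K suits streams where events are only seconds apart.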

Auto-scaling
● Manually re-partition Kafka topics to increase parallelism
● Manually tune container memory if needed
● Future
○ Use CPU/memory/IO stats to automate the process

Large checkpointing state
● Samza uses Kafka to log state changes
● Siddhi engine snapshots can be large
● Kafka limits message size to 1 MB by default
● Solution: we built logic to slice the state into smaller pieces and checkpoint
them.
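
One way to picture the slicing step, as a sketch under assumptions rather than the actual implementation: split the serialized snapshot into chunks below a size budget, tag each chunk with its index and the total count, and reassemble on restore. `MAX_CHUNK_BYTES`, `slice_state`, and `assemble_state` are hypothetical names, and the 900,000-byte budget is an arbitrary value chosen to stay under the 1 MB default.

```python
MAX_CHUNK_BYTES = 900_000  # hypothetical budget below Kafka's 1 MB default limit


def slice_state(snapshot, max_chunk_bytes=MAX_CHUNK_BYTES):
    """Split a serialized snapshot into indexed chunks no larger than the budget."""
    total = max(1, -(-len(snapshot) // max_chunk_bytes))  # ceiling division
    return [
        {
            "index": i,
            "total": total,
            "data": snapshot[i * max_chunk_bytes:(i + 1) * max_chunk_bytes],
        }
        for i in range(total)
    ]


def assemble_state(chunks):
    """Rebuild the snapshot, refusing to restore from an incomplete set of chunks."""
    if not chunks:
        raise ValueError("no chunks to assemble")
    total = chunks[0]["total"]
    by_index = {chunk["index"]: chunk["data"] for chunk in chunks}
    if set(by_index) != set(range(total)):
        raise ValueError("incomplete checkpoint")
    return b"".join(by_index[i] for i in range(total))


snapshot = bytes(range(256)) * 10_000  # 2,560,000 bytes of sample state
chunks = slice_state(snapshot)
restored = assemble_state(list(reversed(chunks)))  # arrival order does not matter
```

Carrying the total count in every chunk lets recovery detect a checkpoint that was only partially written before a crash.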

Synchronous checkpointing
● Samza checkpointing is synchronous with message processing
● If state is large, checkpointing can take a long time and cause processing lag
● Incremental state checkpointing

Exactly once state processing?
● Cannot commit state and offset atomically
● No exactly-once state processing

Custom business logic
● Common logic implemented as Siddhi extensions
● Ad-hoc logic implemented as UDFs in JavaScript or Scala script, inline with the
query

Intermediate Kafka messages
● Samza uses Kafka as message queue for intermediate processing output
○ Each stage is independent of each other
○ This can create a large load on Kafka if a heavy topic is re-shuffled multiple times
■ Encode the intermediate messages to reduce footprint

Upgrading Samza jobs
● Upgrading Samza jobs requires a full restart, which can take minutes due to
○ Offset checkpointing topic too large → set retention to hours or enable compaction
○ Changelog topic too large → set retention or enable compaction in Kafka or host affinity
● To minimize the interruption during upgrade, it would be nice to have
○ Rolling restart
○ Per container restart

Our solution: non-interrupted handoff
● For critical jobs, we use replication during upgrade
○ Start a shadow job
○ Upgrade shadow
○ Switch primary and shadow
○ Upgrade primary
○ Switch back
● Downside: requires 2x capacity during upgrade
