Stateful Distributed Stream Processing

Gyula Fóra
gyfora@apache.org
@GyulaFora

This talk
§ Stateful processing by example
§ Definition and challenges
§ State in current open-source systems
§ State in Apache Flink
§ Closing
Apache Flink Meetup @ MapR, 2015-08-27

Stateful processing by example
§ Window aggregations
• Total number of customers
in the last 10 minutes
• State: Current aggregate
§ Machine learning
• Fitting trends to the evolving
stream
• State: Model
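
A minimal sketch of the window aggregation case: a running count of events seen in the last 10 minutes, where the state is the set of timestamps still inside the window. The class `SlidingCount` and its method names are illustrative assumptions, not part of any streaming system's API, and timestamps are assumed to arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: counts events seen in the last `windowMillis` milliseconds.
final class SlidingCount {
    private final long windowMillis;
    // State: timestamps still inside the window; its size is the current aggregate.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingCount(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Processes one event and returns the updated count.
    // Assumes timestamps arrive in non-decreasing order.
    int onEvent(long timestampMillis) {
        timestamps.addLast(timestampMillis);
        // Evict events that are at least one full window older than the newest event
        while (timestamps.peekFirst() <= timestampMillis - windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }

    public static void main(String[] args) {
        SlidingCount customers = new SlidingCount(10 * 60 * 1000L);
        System.out.println(customers.onEvent(0L));          // 1
        System.out.println(customers.onEvent(60_000L));     // 2
        System.out.println(customers.onEvent(660_000L));    // 1, both earlier events expired
    }
}
```

Keeping the raw timestamps is the simplest choice; a real system would typically keep a compact aggregate per window pane instead.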

Stateful processing by example
§ Pattern recognition
• Detect suspicious financial
activity
• State: Matched prefix
§ Stream-stream joins
• Match ad views and
impressions
• State: Elements in the window
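
A minimal sketch of the pattern recognition case, where the only state is the length of the matched prefix. The class `PrefixMatcher` and the event names are hypothetical. The sketch assumes a non-empty pattern whose elements are pairwise distinct, so a mismatch can simply restart the match.

```java
import java.util.List;

// Hypothetical matcher: reports when the events of `pattern` occur consecutively.
// Assumes a non-empty pattern with pairwise distinct elements.
final class PrefixMatcher {
    private final List<String> pattern;
    private int matched = 0; // State: length of the currently matched prefix

    PrefixMatcher(List<String> pattern) {
        this.pattern = pattern;
    }

    // Consumes one event; returns true when the full pattern has just been matched.
    boolean onEvent(String event) {
        if (event.equals(pattern.get(matched))) {
            matched++;
        } else {
            // Restart; the event may itself begin a new match
            matched = event.equals(pattern.get(0)) ? 1 : 0;
        }
        if (matched == pattern.size()) {
            matched = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        PrefixMatcher m = new PrefixMatcher(List.of("login", "change-password", "transfer"));
        for (String e : List.of("login", "login", "change-password", "transfer")) {
            System.out.println(e + " -> " + m.onEvent(e));
        }
    }
}
```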

Stateful operators
§ All these examples use a common processing
pattern
§ Stateful operator (in essence):
$f : (\mathit{in}, \mathit{state}) \longrightarrow (\mathit{out}, \mathit{state}')$
§ State hangs around and can be read and
modified as the stream evolves
§ Goal: Get as close as possible to this general form while
maintaining scalability and fault tolerance
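
The operator form above can be sketched in plain Java as a function that returns the output together with the next state. `StatefulOperatorSketch`, `Step`, and `run` are hypothetical names chosen for illustration. The sketch runs on a single thread and ignores distribution and fault tolerance entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch of f: (in, state) -> (out, state').
final class StatefulOperatorSketch {
    // One step of f returns both the output and the next state.
    static final class Step<O, S> {
        final O out;
        final S state;
        Step(O out, S state) {
            this.out = out;
            this.state = state;
        }
    }

    // Applies f to every input in order, threading the state through.
    static <I, O, S> List<O> run(List<I> inputs, S initial, BiFunction<I, S, Step<O, S>> f) {
        List<O> outputs = new ArrayList<>();
        S state = initial;
        for (I in : inputs) {
            Step<O, S> step = f.apply(in, state);
            outputs.add(step.out);
            state = step.state;
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Running sum: the state is the sum so far, the output is the new sum
        List<Integer> sums = run(List.of(3, 1, 4), 0,
                (in, sum) -> new Step<Integer, Integer>(sum + in, sum + in));
        System.out.println(sums); // [3, 4, 8]
    }
}
```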

State-of-the-art systems
§ Most systems allow developers to
implement stateful programs
§ Trick is to limit the scope of 𝒇 (state access)
while maintaining expressivity
§ Issues to tackle:
• Expressivity
• Exactly-once semantics
• Scalability to large inputs
• Scalability to large states

Storm (Trident)
§ States are available only in the Trident API
§ Dedicated operators for state updates and
queries
§ State access methods
• stateQuery(…)
• partitionPersist(…)
• persistentAggregate(…)
§ It’s very difficult to
implement transactional
states
Exactly-once guarantee

Storm Word Count

Spark Streaming
§ Stateless runtime by design
• No continuous operators
• UDFs are assumed to be stateless
§ State can be generated as a stream of
RDDs: updateStateByKey(…)
$f : (\mathrm{Seq}[\mathit{in}_k], \mathit{state}_k) \longrightarrow \mathit{state}'_k$
§ 𝒇 is scoped to a specific key
§ Exactly-once semantics
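
To make the key-scoped update concrete, here is a plain-Java emulation of one micro-batch step. It is not Spark's API: `KeyedBatchUpdate` and `updateByKey` are hypothetical names. The sketch assumes that every key with existing state is visited in each batch, even when the batch has no new values for it, and that an empty result removes the key.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

// Hypothetical plain-Java emulation of a per-key state update over one micro-batch.
final class KeyedBatchUpdate {
    // Builds the next state map from the previous one; the previous map is not modified.
    static <K, V, S> Map<K, S> updateByKey(
            Map<K, S> previous,
            Map<K, List<V>> batch,
            BiFunction<List<V>, Optional<S>, Optional<S>> f) {
        // Visit keys with existing state as well as keys that appear in this batch
        Map<K, List<V>> all = new HashMap<>();
        for (K key : previous.keySet()) {
            all.put(key, List.of());
        }
        all.putAll(batch);
        Map<K, S> next = new HashMap<>();
        for (Map.Entry<K, List<V>> e : all.entrySet()) {
            Optional<S> old = Optional.ofNullable(previous.get(e.getKey()));
            // An empty result drops the key from the state
            f.apply(e.getValue(), old).ifPresent(s -> next.put(e.getKey(), s));
        }
        return next;
    }

    public static void main(String[] args) {
        BiFunction<List<Integer>, Optional<Integer>, Optional<Integer>> count =
                (values, state) -> Optional.of(
                        state.orElse(0) + values.stream().mapToInt(Integer::intValue).sum());
        Map<String, Integer> empty = Map.of();
        Map<String, Integer> s1 =
                updateByKey(empty, Map.of("a", List.of(1, 1), "b", List.of(1)), count);
        Map<String, Integer> s2 = updateByKey(s1, Map.of("a", List.of(1)), count);
        System.out.println(s2.get("a") + " " + s2.get("b")); // 3 1
    }
}
```

Each step produces a whole new state map, so the cost of an update grows with the number of keys held rather than with the number of keys touched.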

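Spark Streaming Word Count
// Per-key update: add this batch's counts to the previous total
val updateFunc = (values: Seq[Int], state: Option[Int]) => {
  val currentCount = values.sum
  val previousCount = state.getOrElse(0)
  Some(currentCount + previousCount)
}
// Iterator form required by the updateStateByKey overload that takes a partitioner
val newUpdateFunc = (iterator: Iterator[(String, Seq[Int], Option[Int])]) =>
  iterator.flatMap(t => updateFunc(t._2, t._3).map(s => (t._1, s)))
val stateDstream = wordDstream.updateStateByKey[Int](
  newUpdateFunc,
  new HashPartitioner(ssc.sparkContext.defaultParallelism),
  true, // rememberPartitioner
  initialRDD)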

Samza
§ Stateful dataflow operators
(Any task can hold state)
§ State changes are stored
as a changelog in Kafka
§ Custom storage engines can
be plugged in to the log
§ 𝒇 is scoped to a specific task
§ At-least-once processing
semantics
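
A minimal sketch of log-based state, assuming an in-memory list in place of a durable Kafka topic. `ChangelogStore` and `Change` are hypothetical names and do not reflect Samza's actual classes. Every write is appended to the log before it is applied locally, and a restarted task rebuilds its local state by replaying the log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a key-value store whose writes are appended to a changelog.
// The in-memory list stands in for a durable log such as a Kafka topic.
final class ChangelogStore {
    // One logged write: the key and the value it was set to.
    static final class Change {
        final String key;
        final int value;
        Change(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, Integer> local = new HashMap<>();
    private final List<Change> changelog;

    ChangelogStore(List<Change> changelog) {
        this.changelog = changelog;
        // Recovery: replay the log in order so that the last write per key wins
        for (Change c : changelog) {
            local.put(c.key, c.value);
        }
    }

    Integer get(String key) {
        return local.get(key);
    }

    void put(String key, int value) {
        changelog.add(new Change(key, value)); // log the change first
        local.put(key, value);                 // then apply it locally
    }

    public static void main(String[] args) {
        List<Change> log = new ArrayList<>();
        ChangelogStore store = new ChangelogStore(log);
        store.put("flink", 1);
        store.put("flink", 2);
        // Simulate a task restart: the local map is lost, the log survives
        ChangelogStore restored = new ChangelogStore(log);
        System.out.println(restored.get("flink")); // 2
    }
}
```

The log restores the state but says nothing about which input messages produced it. If input that was already applied is replayed after a restart, a counter is incremented again, which is consistent with the at-least-once semantics listed above.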

Samza Word Count
public class WordCounter implements StreamTask, InitableTask {
    // Some omitted details…
    private KeyValueStore<String, Integer> store;
    public void process(IncomingMessageEnvelope envelope,
                        MessageCollector collector,
                        TaskCoordinator coordinator) {
        // Get the current count
        String word = (String) envelope.getKey();
        Integer count = store.get(word);
        if (count == null) count = 0;
        // Increment, store and send
        count += 1;
        store.put(word, count);
        collector.send(
            new OutgoingMessageEnvelope(OUTPUT_STREAM, word, count));
    }
}

What can we say so far?
§ Trident
+ Consistent state accessible from outside
– Only works well with idempotent states
– States are not part of the operators
§ Spark
+ Integrates well with the system guarantees
– Limited expressivity
– Immutability increases update complexity
§ Samza
+ Efficient log based state updates
+ States are well integrated with the operators
– Lack of exactly-once semantics
– State access is not fully transparent

Flink
§ Take what’s good, make it work, and add
some more
§ Clean and powerful abstractions
• Local (Task) state
• Partitioned (Key) state
§ Proper API integration
• Java: OperatorState interface
• Scala: mapWithState, flatMapWithState…
§ Exactly-once semantics by checkpointing

Flink Word Count
words.keyBy(x => x).mapWithState {
  (word, count: Option[Int]) => {
    val newCount = count.getOrElse(0) + 1
    val output = (word, newCount)
    (output, Some(newCount))
  }
}

Local State
§ Task scoped state access
§ Can be used to implement
custom access patterns
§ Typical usage:
• Source operators (offset)
• Machine learning models
• Use cyclic flows to simulate
global state access

Local State Example (Java)
public class MySource extends RichParallelSourceFunction {
    // Omitted details
    private OperatorState<Long> offset;
    @Override
    public void run(SourceContext ctx) {
        Object checkpointLock = ctx.getCheckpointLock();
        isRunning = true;
        while (isRunning) {
            // Update the offset and emit under the checkpoint lock
            synchronized (checkpointLock) {
                offset.update(offset.value() + 1);
                // ctx.collect(next);
            }
        }
    }
}

Partitioned State
§ Key scoped state access
§ Highly scalable
§ Allows for incremental
backup/restore
§ Typical usage:
• Any per-key operation
• Grouped aggregations
• Window buffers

Partitioned State Example (Scala)
// Compute the current average of each city's temperature
temps.keyBy("city").mapWithState {
  (in: Temp, state: Option[(Double, Long)]) => {
    val current = state.getOrElse((0.0, 0L))
    val updated = (current._1 + in.temp, current._2 + 1)
    val avg = Temp(in.city, updated._1 / updated._2)
    (avg, Some(updated))
  }
}
case class Temp(city: String, temp: Double)

Exactly-once semantics
§ Based on consistent global snapshots
§ Algorithm designed for stateful dataflows
Detailed mechanism

Exactly-once semantics
§ Low runtime overhead
§ Checkpointing logic is separated from
application logic
Blog post on streaming fault tolerance
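
The effect of checkpoint-based recovery can be illustrated with a single-task sketch. This is not the distributed snapshot algorithm itself: it only shows why storing the input offset together with the operator state yields exactly-once state updates after a failure. `CheckpointSketch`, `Snapshot`, and the `failAt` parameter are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-task sketch: a snapshot stores the source offset together
// with the operator state, so recovery rewinds both to the same point.
final class CheckpointSketch {
    // A consistent snapshot: input position plus a copy of the state at that position.
    static final class Snapshot {
        final int offset;
        final Map<String, Integer> counts;
        Snapshot(int offset, Map<String, Integer> counts) {
            this.offset = offset;
            this.counts = new HashMap<>(counts);
        }
    }

    // Counts words and takes a snapshot every `interval` records. If `failAt` >= 0,
    // the task fails once just before processing that record and recovers from
    // the latest snapshot. The counting logic itself knows nothing about snapshots.
    static Map<String, Integer> run(List<String> input, int interval, int failAt) {
        Map<String, Integer> counts = new HashMap<>();
        Snapshot latest = new Snapshot(0, counts);
        boolean failed = false;
        int offset = 0;
        while (offset < input.size()) {
            if (!failed && offset == failAt) {
                failed = true;
                counts = new HashMap<>(latest.counts); // restore the state
                offset = latest.offset;                // rewind the source
                continue;
            }
            counts.merge(input.get(offset), 1, Integer::sum);
            offset++;
            if (offset % interval == 0) {
                latest = new Snapshot(offset, counts);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a", "c", "a");
        System.out.println(run(words, 2, -1).equals(run(words, 2, 3))); // true
    }
}
```

Records between the snapshot and the failure are processed twice, but their first effect is discarded with the lost state, so each record is reflected in the final counts exactly once.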

Summary
§ State is essential to many applications
§ Fault-tolerant streaming state is a hard
problem
§ There is a trade-off between expressivity and
scalability/fault tolerance
§ Flink tries to hit the sweet spot by
• Providing very flexible abstractions
• Keeping good scalability and exactly-once
semantics
