Sudden spikes in load can spell disaster for stream processors. These spikes expose latent bottlenecks in otherwise well-balanced configurations and thereby introduce backpressure, increase latency, and reduce overall throughput. This problem is far from solved. While the prevailing solution of dynamic scaling (i.e., re-deploying the analytic job on an increased set of resources) offers relief in some cases, it requires free resources to be available during the spike and additionally risks violating existing SLAs during the re-deployment phase. As an alternative, we have implemented MERA. MERA prioritizes items based on their value to the result of an analytic job. When a job is at risk of creating backpressure, MERA sheds items of lower priority at selected locations in the job graph, ensuring that the job stays within its SLA. MERA relies on Flink's internal metrics to keep its overhead during normal operation to a minimum. In this talk we present MERA, a framework for trading precision for performance, and demonstrate it via a custom performance monitor for Apache Flink.
2. The Why: Compute function chains over large datasets.
Cost-efficient · Fast · Accurate
3. Backpressure
• Jobs are deployed on commodity hardware.
• Commodity hardware is failure-prone.
[Diagram: a task's input queue feeds N parallel UDFs; one blocked UDF slows down the UDFs upstream of it, producing backpressure.]
5. Prior Knowledge: Straggler Mitigation
• Learn over iterations to identify the best configuration options.
  Herodotou, Herodotos, et al. "Starfish: A self-tuning system for big data analytics." CIDR, Vol. 11, 2011.
• Use speculative execution, i.e. execute the same task multiple times in parallel, and use the result from the fastest task.
  Ananthanarayanan, Ganesh, et al. "Effective Straggler Mitigation: Attack of the Clones." NSDI, Vol. 13, 2013.
Does not work for streaming!
6. Data Streams are different
[Diagram: the same UDF pipeline receiving different load distributions over time.]
No perfect configuration!
7. Methods to deal with Backpressure
• Overprovision ❌
  Does not help against an unbalanced partitioning scheme
• Dynamically re-scale/re-partition ❌
  Requires state migration, incurs overhead
• Accept unbounded delays for requests ❌
  Consumers might require timely results
• Reduce accuracy in response to load peaks ✅
[Diagram: approximation taxonomy — Approximation branches into Aggregation (difficult for complex schemata) and Filtering, which covers Sampling and Loadshedding.]
8. Filter - Sampling
Generate a representative subset.
[Diagram: a log stream with ERROR/WARN/INFO/DEBUG levels is sampled to compute statistics.]
StreamApprox, presented by Pramod Bhatotia and Do Le Quoc at FlinkForward’17.
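To make the sampling idea concrete, here is a minimal reservoir-sampling sketch (Algorithm R) in Python. This is not StreamApprox itself, just an illustration of producing a representative subset of a stream of unknown length; the log-level data and function names are invented for this example.

```python
import random

def reservoir_sample(stream, k, seed=42):
    """Keep a uniform random sample of k items from a stream of unknown
    length (Algorithm R): every item ends up in the sample with
    probability k/n, so sample statistics approximate stream statistics."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randint(0, i)        # replace a slot with prob. k/(i+1)
            if j < k:
                sample[j] = item
    return sample

# Toy log stream: 90% DEBUG, 10% ERROR. Statistics on the sample
# (e.g. the ERROR fraction) approximate those of the full stream.
logs = [("DEBUG", i) for i in range(900)] + [("ERROR", i) for i in range(100)]
sample = reservoir_sample(logs, 50)
```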
9. Filter - Loadshedding
Generate a subset fulfilling a given property.
[Diagram: a scoring function assigns each item a priority score; the load shedder accepts items whose score is above a threshold (HIGH) and sheds those below it (LOW).]
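The scoring-function-plus-threshold pipeline can be sketched as follows. This is a toy illustration, not MERA's implementation: the scoring function, the score values, and the threshold are all invented here, whereas MERA derives its shedding decisions from Flink's internal metrics.

```python
from dataclasses import dataclass

@dataclass
class LoadShedder:
    """Forward items whose priority score meets a threshold, shed the rest."""
    threshold: float

    def process(self, item, score_fn):
        if score_fn(item) >= self.threshold:
            return item   # accept: forward downstream
        return None       # shed: drop the item

# Hypothetical scoring: log severity mapped to a priority in [0, 1].
score = lambda level: {"ERROR": 1.0, "WARN": 0.7, "INFO": 0.4, "DEBUG": 0.1}[level]

shedder = LoadShedder(threshold=0.5)
kept = [x for x in ["ERROR", "DEBUG", "WARN", "INFO"]
        if shedder.process(x, score) is not None]
# High-priority ERROR and WARN items survive; INFO and DEBUG are shed.
```

Raising the threshold under load sheds more aggressively; lowering it restores accuracy once the spike passes.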
12. Insights
• Dynamic approximation is faster than scaling: no provisioning & no state migration!
• Backpressure needs to be handled everywhere in the job!
• Filtering at the source is suboptimal!
• There is more to approximation than sampling!
14. Results
• No optimization vs. optimization
  • Bottleneck latency
  • Input rate
  • Output rate
  • Drop rate
  • Latency?
• Drop only at source vs. everywhere
• Bar charts
  • Bottleneck latency
  • Mean latency
  • Without optimization / only at source / full
16. Re-Read
• Load Shedding Techniques for Data Stream Systems
• StreamApprox: Approximate Stream Analytics in Apache Flink
• Approximation papers
Future: combine aggregation with the task. [Diagram: chained UDFs.]
17. Open issues
• Problem: The ILP will select only a single or only a few places to shed items, leading to a loss of high-scoring items.
  Fix: Include the distribution factor in the ILP.
• Problem: By shedding at reducers, the distribution of data will change in one direction.
  Fix: Shed only at mappers.
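The first open issue can be illustrated with a toy simulation (no ILP involved, all numbers invented): if the whole shedding budget is concentrated at one operator, that operator is forced to discard items with much higher priority scores than if the same budget were spread across operators.

```python
import random

rng = random.Random(0)
# Three operators, each observing 100 items with uniform priority scores.
ops = [[rng.random() for _ in range(100)] for _ in range(3)]

# Backpressure forces us to drop 90 of the 300 items in total.

# Strategy A: shed everything at a single operator. It must drop 90 of
# its 100 items, sacrificing scores up to its local 90th percentile.
concentrated_loss = max(sorted(ops[0])[:90])   # best score sacrificed

# Strategy B: distribute the budget, 30 items per operator. Each drops
# only its 30 lowest-scoring items.
distributed_loss = max(max(sorted(o)[:30]) for o in ops)

# Concentrated shedding discards far higher-scoring items, which is why
# the shed locations should be spread across the job graph.
```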