Extracting insights from continuously generated data requires a stream processor with powerful data analytics features, such as Apache Flink. A streaming data pipeline with Flink typically includes a storage component to ingest and serve the data. Pravega is a stream store that ingests and stores stream data permanently, making the data available for tail, catch-up, and historical reads. One important challenge for such pipelines is coping with variations in the workload: daily cycles and seasonal spikes may require the provisioning of the application to adapt accordingly. Pravega has a feature called stream scaling, which enables the capacity offered for the ingestion of events of a stream to grow and shrink over time according to the workload. This feature is useful when the downstream application is able to accommodate such changes and scale its own provisioning accordingly. In this presentation, we introduce stream scaling in Pravega and show how Flink jobs can leverage this feature to rescale stateful jobs as the workload varies.
6. Workload cycles and seasonal spikes
Daily cycles: NYC Yellow Taxi Trip Records, March 2015
(http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)
Seasonal spikes: https://www.slideshare.net/iwmw/building-highly-scalable-web-applications/7-Seasonal_Spikes
7. Workload cycles and spikes
[Charts: seasonal spikes by month, daily cycles by hour, weekly cycles, and unplanned spikes]
11. Event processing
[Diagram: a source appends events to an append-only log, read by Processor 1; colors represent event keys]
✓ Source rate increases
✓ New rate: 4 events/second
✓ Processor 1 still processes 3 events/second
✓ It can't keep up with the source rate
12-13. Event processing
[Diagram: the source appends to the append-only log, now read by both Processor 1 and Processor 2; colors represent event keys]
✓ Source rate increases
✓ New rate: 4 events/second
✓ Add a second processor
✓ Each processor processes 3 events/second
✓ Together they can keep up with the rate
Problem: per-key order is no longer guaranteed, since events with the same key may be processed by different processors
14-16. Event processing
[Diagram: the log is split into parallel parts, and each processor reads its own share; colors represent event keys]
✓ Source rate increases; new rate: 4 events/second
✓ Split the input and add a second processor
✓ Each processor processes 3 events/second, so together they can keep up with the rate
✓ Per-key order across the split point is handled by having Processor 2 only start once the earlier events have been processed
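To make the split concrete, here is a minimal sketch of key-based routing (a hypothetical helper, not Pravega's actual implementation; Pravega hashes a routing key into a segment's key range in a similar spirit):

// Each split owns a slice of the hash space [0.0, 1.0). All events with
// the same key land in the same split, so per-key order is preserved
// within it.
static int segmentFor(String routingKey, int numSegments) {
    double point = (routingKey.hashCode() & 0x7fffffff) / (double) (1L << 31);
    return (int) (point * numSegments); // index in [0, numSegments)
}

// With 2 splits: segmentFor("red", 2) always maps to the same split,
// no matter how many events carry that key.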
19. Pravega
• Storing data streams
• Young project, under active development
• Open source
http://pravega.io
http://github.com/pravega/pravega
24. Pravega aims to be a stream store able to:
• Store stream data permanently
• Preserve order
• Accommodate unbounded streams
• Adapt to varying workloads automatically
• Offer low latency from append to read
41. Daily Cycles
Peak rate is 10x higher than the lowest rate
[Chart: hourly trip rate over one day, with markers at 4:00 AM and 9:00 AM]
NYC Yellow Taxi Trip Records, March 2015 (http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)
45. How do I control scaling?
46. Scaling policies
• Configured on a per-stream basis
• Specifies a policy for the stream
• Policies (see the configuration sketch after this list):
  • Fixed
    • The set of segments is fixed
  • Bytes per second
    • Scales up and down according to the volume of data
    • Target data rate
  • Events per second
    • Scales up and down according to the volume of events
    • Target event rate
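As a concrete illustration, a minimal sketch of creating a stream with a scaling policy via the Pravega client API (the controller URI, scope, stream name, and rate values are illustrative, and exact signatures may differ across client versions):

import java.net.URI;
import io.pravega.client.admin.StreamManager;
import io.pravega.client.stream.ScalingPolicy;
import io.pravega.client.stream.StreamConfiguration;

// Target 100 events/second per segment, double the segment count on a
// scale-up (scale factor 2), and never drop below 1 segment.
StreamConfiguration config = StreamConfiguration.builder()
        .scalingPolicy(ScalingPolicy.byEventRate(100, 2, 1))
        // Alternatives: ScalingPolicy.fixed(4), or
        // ScalingPolicy.byDataRate(1024 /* KB/s */, 2, 1)
        .build();

try (StreamManager streamManager = StreamManager.create(URI.create("tcp://localhost:9090"))) {
    streamManager.createScope("examples");
    streamManager.createStream("examples", "sensor-events", config);
}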
47. Auto-Scaling: Triggering a scaling event
• Driven by byte and event rates
• Target rate T per segment
• Each segment reports its rates every 2 minutes:
  ✓ 2-min rate (2M)
  ✓ 5-min rate (5M)
  ✓ 10-min rate (10M)
  ✓ 20-min rate (20M)
• Scaling up, when any of:
  ∨ 2M > 5 × T
  ∨ 5M > 2 × T
  ∨ 10M > T
• Scaling down, when all of:
  ∧ 2M, 5M, 10M < T
  ∧ 20M < T / 2

Scale-up example (T = 50):

        x     x+2min  x+4min  x+6min
  2M    60    60      60      60
  5M    56    60      60      60
  10M   46    48      50      52

At x+6min, 10M = 52 > T, so the stream scales up.

Scale-down example (T = 50):

        x     x+2min  x+4min  x+6min
  2M    20    20      20      20
  5M    20    20      20      20
  10M   20    20      20      20
  20M   27    26      25      24

At x+6min, 20M = 24 < T/2 = 25 while 2M, 5M, and 10M are all below T, so the stream scales down.
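These trigger conditions translate directly into code; a minimal sketch (illustrative only, not Pravega's internal implementation):

// Scale up if a short-window rate far exceeds the target, or the
// 10-minute rate is above it: 2M > 5T, or 5M > 2T, or 10M > T.
static boolean shouldScaleUp(double m2, double m5, double m10, double t) {
    return m2 > 5 * t || m5 > 2 * t || m10 > t;
}

// Scale down only if every recent rate is below the target and the
// 20-minute rate has fallen below half of it.
static boolean shouldScaleDown(double m2, double m5, double m10, double m20, double t) {
    return m2 < t && m5 < t && m10 < t && m20 < t / 2;
}

// From the examples above: shouldScaleUp(60, 60, 52, 50) is true (10M > T),
// and shouldScaleDown(20, 20, 20, 24, 50) is true (20M < T/2 = 25).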
55. Scaling signals
[Diagram: a big source writes into Pravega, which feeds the downstream app; signals flow from Pravega to the app]
• Pravega won't scale the application downstream...
• ...but it can signal the application
  • E.g., more segments
  • E.g., the number of unread bytes is growing
56. Reader group notifier
• Listener API
  • Register a listener to react to changes
  • E.g., changes to the number of segments
ReaderGroupManager groupManager = new ReaderGroupManagerImpl(SCOPE, controller,
        clientFactory, connectionFactory);
ReaderGroup readerGroup = groupManager.createReaderGroup(GROUP_NAME,
        ReaderGroupConfig.builder().build(), Collections.singleton(STREAM));
readerGroup.getSegmentNotifier(executor).registerListener(segmentNotification -> {
    int numOfReaders = segmentNotification.getNumOfReaders();
    int segments = segmentNotification.getNumOfSegments();
    if (numOfReaders < segments) {
        // Scale up the number of readers, based on application capacity
    } else {
        // More readers than segments: time to shut some down
    }
});
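The comparison works because, within a reader group, a segment is read by at most one reader at a time: readers beyond the current segment count sit idle, so the segment count is the useful upper bound on the number of readers.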
57. Reader group: listener and metrics
• Listener API
  • Register a listener to react to changes
  • E.g., changes to the number of segments
• Metrics
  • Reports specific values of interest
  • E.g., number of unread bytes in a stream
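On the metrics side, a minimal sketch of polling the unread-bytes backlog as a scaling signal (BACKLOG_THRESHOLD and the polling period are hypothetical; unreadBytes() is exposed through the reader group's metrics interface, and readerGroup is the instance from the previous slide):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Poll the reader group's backlog and treat sustained growth as a signal
// to scale the application up.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> {
    long backlog = readerGroup.unreadBytes(); // bytes written but not yet read
    if (backlog > BACKLOG_THRESHOLD) {
        // Backlog is growing faster than we consume: add readers/capacity
    }
}, 0, 30, TimeUnit.SECONDS);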
71. Flink’s Revamped Distributed Architecture
• Motivation
  • Resource elasticity
  • Support for different deployments
  • REST interface for client-cluster communication
• Introduce generic building blocks
• Compose blocks for different scenarios
72. The Building Blocks
ResourceManager
• ClusterManager-specific
• May live across jobs
• Manages available Containers/TaskManagers
• Used to acquire / release resources
JobManager
• Single job only, started per job
• Thinks in terms of "task slots"
• Deploys and monitors job/task execution
TaskManager
• Registers at the ResourceManager
• Gets tasks from one or more JobManagers
Dispatcher
• Lives across jobs
• Touch-point for job submissions
• Spawns JobManagers