A Comparative Performance
Evaluation of Flink
Dongwon Kim
POSTECH
About Me
• Postdoctoral researcher @ POSTECH
• Research interest
• Design and implementation of distributed systems
• Performance optimization of big data processing engines
• Doctoral thesis
• MR2: Fault Tolerant MapReduce with the Push Model
• Personal blog
• http://eastcirclek.blogspot.kr
• Why I’m here
Outline
• TeraSort for various engines
• Experimental setup
• Results & analysis
• What else for better performance?
• Conclusion
TeraSort
• Hadoop MapReduce program for the annual terabyte sort competition
• TeraSort is essentially distributed sort (DS)
[Diagram — typical DS phases on two nodes: read → local sort → shuffling → local sort → write, with each node's local disk at both ends; the final partitions are in total order]
TeraSort for MapReduce
• Included in Hadoop distributions
• with TeraGen & TeraValidate
• Identity map & reduce functions
• Range partitioner built on sampling
• To guarantee a total order & to prevent partition skew
• Sampling to compute boundary points within a few seconds
[Diagram — DS phases mapped onto MapReduce: map task = read → map → sort, reduce task = shuffling → sort → reduce → write; boundary points split the record range into partitions 1 … r]
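The sampling-based range partitioning above can be sketched as follows. This is a toy model of the idea, not TeraSort's actual partitioner: boundary points are drawn from a small sample, and a binary search assigns each record to a partition so that partitions are globally ordered.

```python
import bisect
import random

def boundary_points(sample, num_partitions):
    """Pick num_partitions - 1 boundary keys from a sorted sample."""
    ordered = sorted(sample)
    step = len(ordered) / num_partitions
    return [ordered[int(step * i)] for i in range(1, num_partitions)]

def partition_of(key, boundaries):
    """Partition index via binary search over the boundary points."""
    return bisect.bisect_right(boundaries, key)

random.seed(0)
records = [random.randrange(1_000_000) for _ in range(100_000)]
# Sample a small fraction of the input to compute boundary points quickly
bounds = boundary_points(random.sample(records, 1_000), num_partitions=4)
parts = [[] for _ in range(4)]
for key in records:
    parts[partition_of(key, bounds)].append(key)

# Total order: sorting each partition independently yields a global sort
assert [k for p in parts for k in sorted(p)] == sorted(records)
```

Because the sample is small, computing the boundaries takes seconds even for terabyte inputs, and equal-width sample quantiles keep the partitions roughly balanced.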
TeraSort for Tez
• Tez can execute TeraSort for MapReduce without any modification
• mapreduce.framework.name = yarn-tez
• Tez DAG plan of TeraSort for MapReduce
[Diagram — initialmap vertex (map task: read → map → sort) feeds finalreduce vertex (reduce task: shuffling → sort → reduce → write), from input data to output data]
TeraSort for Spark & Flink
• My source code in GitHub:
• https://github.com/eastcirclek/terasort
• Sampling-based range partitioner from TeraSort for MapReduce
• Visit my personal blog for a detailed explanation
• http://eastcirclek.blogspot.kr
TeraSort for Spark
• Code
• Two RDDs
• RDD1 (via newAPIHadoopFile) : a new RDD reading from HDFS; # partitions = # blocks
• RDD2 (via repartitionAndSortWithinPartitions) : repartitions the parent RDD based on the user-specified partitioner and writes output to HDFS
[Diagram — Stage 0: shuffle-map task = read → sort; Stage 1: result task = shuffling → sort → write; DS phases: read → local sort → shuffling → local sort → write]
TeraSort for Flink
• Code
• A pipeline consisting of four operators
• DataSource : creates a dataset reading tuples from HDFS
• Partition : partitions tuples
• SortPartition : sorts tuples of each partition
• DataSink : writes output to HDFS
• No map-side sorting, due to pipelined execution
Importance of TeraSort
• Suitable for measuring the pure performance of big data engines
• No data transformation (like map, filter) with user-defined logic
• Basic facilities of each engine are used
• “Winning the sort benchmark” is a great means of PR
Outline
• TeraSort for various engines
• Experimental setup
• Machine specification
• Node configuration
• Results & analysis
• What else for better performance?
• Conclusion
Machine specification (42 identical machines)
• DELL PowerEdge R610
• CPU : two Intel Xeon X5650 processors (12 cores total)
• Memory : 24GB total
• Disk : 6 disks × 500GB/disk
• Network : 10 Gigabit Ethernet
• Comparison with the Spark team's machines:
• Processor : Intel Xeon X5650 (Q1 2010) vs. Intel Xeon E5-2670 (Q1 2012)
• Cores : 6 × 2 processors vs. 8 × 4 processors
• Memory : 24GB vs. 244GB
• Disks : 6 HDDs vs. 8 SSDs
• Results can be different on newer machines
Node configuration (24GB on each node)
• 2 GB for daemons on every node : NodeManager (1GB) + DataNode (1GB)
• MapReduce-2.7.1 / Tez-0.7.0 : up to 12 concurrent 1GB tasks (MapTask / ReduceTask) per node; the ShuffleService runs inside the NodeManager
• Spark-1.5.1 : one Executor (12GB) per node with 12 task slots backed by a thread pool; internal memory layout handled by various managers; Driver (1GB)
• Flink-0.9.1 : one TaskManager (12GB) per node with 12 task slots backed by task threads; internal memory layout handled by various managers; JobManager (1GB)
• 12 simultaneous tasks per node at most, for every engine
Outline
• TeraSort for various engines
• Experimental setup
• Results & analysis
• Flink is faster than other engines due to its pipelined execution
• What else for better performance?
• Conclusion
How to read a swimlane graph & throughput graphs
• Swimlane graph : each horizontal line is the duration of one task; the x-axis is time since the job starts; different patterns mark different stages
• Example : 6 waves of 1st-stage tasks, 1 wave of 2nd-stage tasks; the two stages are hardly overlapped
• Cluster network throughput (in / out) and cluster disk throughput (read / write) are plotted on the same time axis; e.g. no network traffic during the 1st stage
Result of sorting 80GB/node (3.2TB)
• Flink is the fastest due to its pipelined execution
• Tez and Spark do not overlap 1st and 2nd stages
• MapReduce is slow despite overlapping stages

Time (seconds):
• MapReduce in Hadoop-2.7.1 : 2157
• Tez-0.7.0 : 1887
• Spark-1.5.1 : 2171
• Flink-0.9.1 : 1480 (swimlanes: 1 DataSource, 2 Partition, 3 SortPartition, 4 DataSink)
* Map output compression turned on for Spark and Tez
Tez and Spark do not overlap 1st and 2nd stages
[Throughput graphs for Tez and Spark: (1) the 2nd stage starts, (2) only then is the output of the 1st stage sent over the network, (3) disk write to HDFS occurs after shuffling is done; the network is idle during the 1st stage]
[Throughput graphs for Flink: (1) network traffic occurs from the start, (2) write to HDFS occurs right after shuffling is done; operators: 1 DataSource, 2 Partition, 3 SortPartition, 4 DataSink]
Tez does not overlap 1st and 2nd stages
• Tez has parameters to control the degree of overlap
• tez.shuffle-vertex-manager.min-src-fraction : 0.2
• tez.shuffle-vertex-manager.max-src-fraction : 0.4
• However, the 2nd stage is scheduled early but launched late
[Swimlane graph: 2nd-stage tasks are scheduled early but launched only near the end of the 1st stage]
Spark does not overlap 1st and 2nd stages
• Spark cannot execute multiple stages simultaneously
• also mentioned in the following VLDB paper (2015):
“Spark doesn’t support the overlap between shuffle write and read stages. … Spark may want to support this overlap in the future to improve performance.”
• Experimental results of this paper:
• Spark is faster than MapReduce for WordCount, K-means, PageRank.
• MapReduce is faster than Spark for Sort.
MapReduce is slow despite overlapping stages
• mapreduce.job.reduce.slowstart.completedMaps : [0.0, 1.0]
• 0.05 (overlapping, default) : 2157 sec
• 0.95 (no overlapping) : 2385 sec
• Overlapping brings only a 10% improvement
• Wang’s attempt to overlap Spark stages
• Wang proposes to overlap stages to achieve better utilization
• Why do Spark & MapReduce improve by just 10%?
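The slowstart knob is a per-job MapReduce property, so the two settings above can be passed on the command line. A sketch (the example jar path and input/output paths are placeholders):

```shell
# Default: reducers start fetching once 5% of map tasks finish (overlapping)
hadoop jar hadoop-mapreduce-examples.jar terasort \
  -Dmapreduce.job.reduce.slowstart.completedMaps=0.05 /in /out-overlap

# No overlapping: reducers start only after 95% of map tasks finish
hadoop jar hadoop-mapreduce-examples.jar terasort \
  -Dmapreduce.job.reduce.slowstart.completedMaps=0.95 /in /out-no-overlap
```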
Data transfer between tasks of different stages
• Traditional pull model — used in MapReduce, Spark, Tez
• The producer task (1) writes its output file (partitions P1 … Pn) to disk; the shuffle server (2) receives a request for P1 and (3) sends P1 to the consumer task
• Extra disk access & simultaneous disk access
• Shuffling affects the performance of producers
• This is why overlapping stages leads to only a 10% improvement
• Pipelined data transfer — used in Flink
• Data is transferred from memory to memory
• Flink causes fewer disk accesses during shuffling
Flink causes fewer disk accesses during shuffling

                      MapReduce   Flink   diff.
Total disk write (TB)       9.9     6.5     3.4
Total disk read (TB)        8.1     6.9     1.2

• The difference comes from shuffling
• Shuffled data are sometimes read from the page cache
• The total amount of disk read/write equals the area of the blue/green region in the cluster disk throughput graphs (MapReduce vs. Flink)
Result of TeraSort with various data sizes

Time (seconds):
node data size (GB)   Flink   Spark   MapReduce    Tez
                10      157     387         259    277
                20      350     652         555    729
                40      741    1135        1085   1709
                80     1480    2171        2157   1887
               160     3127    4927        4796   3950

• The 80GB/node row is what we’ve seen so far (plotted on a log scale)
* Map output compression turned on for Spark and Tez
Result of HashJoin
• 10 slave nodes
• org.apache.tez.examples.JoinDataGen
• Small dataset : 256MB
• Large dataset : 240GB (24GB/node)
• Result : Flink is ~2x faster than Tez and ~4x faster than Spark
• Tez-0.7.0 : 770 sec, Spark-1.5.1 : 1538 sec, Flink-0.9.1 : 378 sec
• Visit my blog for a detailed explanation
* No map output compression for both Spark and Tez, unlike in TeraSort
Result of HashJoin with swimlane & throughput graphs
[Swimlane graphs: Tez and Spark show idle periods between stages; in Flink, the 2nd and 3rd operators (1 DataSource, 2 DataSource, 3 Join, 4 DataSink) overlap]
[Cluster network & disk throughput graphs, annotated with total traffic: 0.24 TB, 0.41 TB, 0.60 TB, 0.84 TB, 0.68 TB, 0.74 TB]
Flink’s shortcomings
• No support for map output compression
• Small data blocks are pipelined between operators
• Job-level fault tolerance only
• Shuffle data are not materialized
• Low disk throughput during the post-shuffling phase
Low disk throughput during the post-shuffling phase
• Possible reason : sorting records from small files
• Concurrent disk access to small files → too many disk seeks → low disk throughput
• Other engines merge records from larger files than Flink
• “Eager pipelining moves some of the sorting work from the mapper to the reducer”
• from MapReduce Online (NSDI 2010)
[Disk throughput graphs: Flink vs. Tez vs. MapReduce]
Outline
• TeraSort for various engines
• Experimental setup
• Results & analysis
• What else for better performance?
• Conclusion
MR2 – another MapReduce engine
• PhD thesis
• MR2: Fault Tolerant MapReduce with the Push Model
• developed for 3 years
• Provides the user interface of Hadoop MapReduce
• No DAG support
• No in-memory computation
• No iterative computation
• Characteristics
• Push model + fault tolerance
• Techniques to boost HDD throughput
• Prefetching for mappers
• Preloading for reducers
MR2 pipeline
• 7 types of components with memory buffers
1. Mappers & reducers : apply user-defined functions
2. Prefetcher & preloader : eliminate concurrent disk access
3. Sender & receiver & merger : implement MR2’s push model
• Various buffers : pass data between components w/o disk IOs
• Minimum disk access (2 disk reads R1, R2 & 2 disk writes W1, W2)
• +1 disk write (W3) for fault tolerance
Prefetcher & Mappers
• Prefetcher loads data for multiple mappers
• Mappers do not read input from disks
[Diagram — with 2 mappers on a node: in Hadoop MapReduce, Mapper1 and Mapper2 read Blk1 and Blk2 concurrently, lowering disk throughput and CPU utilization; in MR2, the prefetcher reads Blk1 … Blk4 sequentially and feeds them to the mappers, keeping disk throughput and CPU utilization high]
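The prefetcher idea can be sketched as a single reader thread feeding per-mapper in-memory buffers, so the disk sees only one sequential reader. A toy model of the structure (not MR2's actual code; block names and the round-robin dispatch are illustrative):

```python
import queue
import threading

def prefetcher(blocks, mapper_queues):
    """A single thread 'reads' blocks sequentially (no concurrent disk access)
    and hands them round-robin to the mappers' bounded in-memory buffers."""
    for i, blk in enumerate(blocks):
        mapper_queues[i % len(mapper_queues)].put(blk)
    for q in mapper_queues:
        q.put(None)                      # end-of-input marker

def mapper(in_queue, out):
    """Consumes blocks from its buffer instead of reading the disk itself."""
    while True:
        blk = in_queue.get()
        if blk is None:
            break
        out.append(blk.upper())          # stand-in for the user-defined map

queues = [queue.Queue(maxsize=2), queue.Queue(maxsize=2)]  # bounded buffers
outs = [[], []]
workers = [threading.Thread(target=mapper, args=(q, o))
           for q, o in zip(queues, outs)]
for w in workers:
    w.start()
prefetcher(["blk1", "blk2", "blk3", "blk4"], queues)
for w in workers:
    w.join()
assert sorted(outs[0] + outs[1]) == ["BLK1", "BLK2", "BLK3", "BLK4"]
```

The bounded queues give backpressure: the prefetcher stays only a couple of blocks ahead of the mappers, roughly mirroring the fixed memory buffers in the diagram.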
Push-model in MR2
• Node-to-node network connection for pushing data
• To reduce the number of network connections
• Data transfer from the memory buffer (similar to Flink’s pipelined execution)
• Mappers store spills in the send buffer
• Spills are pushed to the reducer side by the sender
• MR2 does local sorting before pushing data (similar to Spark)
• Fault tolerance (can be turned on/off)
• Input ranges of each spill are known to the master for reproduction
• Spills are stored on disk for fast recovery (extra disk write)
Receiver & merger & preloader & reducer
• Merger produces a single file from different partitions' data
• sorts each partition's data and then interleaves the sorted runs
• Preloader preloads each group into the reduce buffer
• one disk access loads a group covering all 4 partitions (P1 P2 P3 P4)
• Reducers do not read data directly from disks
• MR2 can eliminate concurrent disk reads from reducers thanks to the preloader
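The merger's interleaved layout can be sketched as follows. This is my toy illustration of the idea, assuming fixed-size groups: after sorting each partition's data, groups are laid out P1 P2 P3 P4 | P1 P2 P3 P4 | …, so one sequential read returns the next group of every partition at once.

```python
def merge_interleaved(partitions, group_size):
    """Sort each partition's records, then interleave fixed-size groups so
    that a single sequential read (one disk access) covers all partitions."""
    runs = [sorted(p) for p in partitions]
    layout = []
    offset = 0
    while any(offset < len(run) for run in runs):
        for run in runs:
            layout.append(run[offset:offset + group_size])  # may be empty at the tail
        offset += group_size
    return layout

groups = merge_interleaved([[3, 1], [4, 2], [9, 5], [6, 8]], group_size=1)
# One "disk access" (the first 4 slots) yields a group for every partition
assert groups[:4] == [[1], [2], [5], [6]]
assert groups[4:] == [[3], [4], [9], [8]]
```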
Result of sorting 80GB/node (3.2TB) with MR2

Engine                        Time (sec)   MR2 speedup
MapReduce in Hadoop-2.7.1           2157          2.42
Tez-0.7.0                           1887          2.12
Spark-1.5.1                         2171          2.44
Flink-0.9.1                         1480          1.66
MR2                                  890             -
Disk & network throughput (Flink vs. MR2)
1. DataSource / Mapping
• The prefetcher is effective: MR2 shows higher disk throughput
2. Partition / Shuffling
• Records to shuffle are generated faster in MR2
3. DataSink / Reducing
• The preloader is effective: almost 2x throughput
[Cluster network & disk throughput graphs for Flink and MR2, with phases 1–3 marked]
PUMA (PUrdue MApreduce benchmarks suite)
• Experimental results using 10 nodes
Outline
• TeraSort for various engines
• Experimental setup
• Experimental results & analysis
• What else for better performance?
• Conclusion
Conclusion
• Flink uses pipelined execution for both batch and stream processing
• It even beats other batch processing engines on TeraSort & HashJoin
• Shortcomings due to pipelined execution:
• No fine-grained fault tolerance
• No map output compression
• Low disk throughput during the post-shuffling phase
Thank you!
Any questions?
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Databricks
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Databricks
 
Scala like distributed collections - dumping time-series data with apache spark
Scala like distributed collections - dumping time-series data with apache sparkScala like distributed collections - dumping time-series data with apache spark
Scala like distributed collections - dumping time-series data with apache spark
Demi Ben-Ari
 
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
Codemotion
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013
Cloudera, Inc.
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2
Fabio Fumarola
 
Apache Spark Best Practices Meetup Talk
Apache Spark Best Practices Meetup TalkApache Spark Best Practices Meetup Talk
Apache Spark Best Practices Meetup Talk
Eren Avşaroğulları
 

Similar to Dongwon Kim – A Comparative Performance Evaluation of Flink (20)

Spark Overview and Performance Issues
Spark Overview and Performance IssuesSpark Overview and Performance Issues
Spark Overview and Performance Issues
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
 
Spark architechure.pptx
Spark architechure.pptxSpark architechure.pptx
Spark architechure.pptx
 
Apache Spark: What's under the hood
Apache Spark: What's under the hoodApache Spark: What's under the hood
Apache Spark: What's under the hood
 
Healthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkHealthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache Spark
 
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark ClustersFrom HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
 
[262] netflix 빅데이터 플랫폼
[262] netflix 빅데이터 플랫폼[262] netflix 빅데이터 플랫폼
[262] netflix 빅데이터 플랫폼
 
Apache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data ProcessingApache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data Processing
 
Spark on Yarn @ Netflix
Spark on Yarn @ NetflixSpark on Yarn @ Netflix
Spark on Yarn @ Netflix
 
Producing Spark on YARN for ETL
Producing Spark on YARN for ETLProducing Spark on YARN for ETL
Producing Spark on YARN for ETL
 
700 Queries Per Second with Updates: Spark As A Real-Time Web Service
700 Queries Per Second with Updates: Spark As A Real-Time Web Service700 Queries Per Second with Updates: Spark As A Real-Time Web Service
700 Queries Per Second with Updates: Spark As A Real-Time Web Service
 
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service700 Updatable Queries Per Second: Spark as a Real-Time Web Service
700 Updatable Queries Per Second: Spark as a Real-Time Web Service
 
Hadoop Network Performance profile
Hadoop Network Performance profileHadoop Network Performance profile
Hadoop Network Performance profile
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
 
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at RuntimeAdaptive Query Execution: Speeding Up Spark SQL at Runtime
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
 
Scala like distributed collections - dumping time-series data with apache spark
Scala like distributed collections - dumping time-series data with apache sparkScala like distributed collections - dumping time-series data with apache spark
Scala like distributed collections - dumping time-series data with apache spark
 
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...S3, Cassandra or Outer Space? Dumping Time Series Data using Spark  - Demi Be...
S3, Cassandra or Outer Space? Dumping Time Series Data using Spark - Demi Be...
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2
 
Apache Spark Best Practices Meetup Talk
Apache Spark Best Practices Meetup TalkApache Spark Best Practices Meetup Talk
Apache Spark Best Practices Meetup Talk
 

More from Flink Forward

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
Flink Forward
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
Flink Forward
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Flink Forward
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
Flink Forward
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Flink Forward
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
Flink Forward
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
Flink Forward
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
Flink Forward
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
Flink Forward
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Flink Forward
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
Flink Forward
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
Flink Forward
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
Flink Forward
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
Flink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
Flink Forward
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!
Flink Forward
 

More from Flink Forward (20)

Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
 
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 

Dongwon Kim – A Comparative Performance Evaluation of Flink

shuffling sort write
DS phases : read, local sort, shuffling, local sort, write
• Create a new RDD to read from HDFS (# partitions = # blocks)
• Repartition the parent RDD based on the user-specified partitioner
• Write output to HDFS
8
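The sampling-based range partitioner that the Spark and Flink jobs reuse from TeraSort for MapReduce can be illustrated in a few lines of plain Python. This is a self-contained sketch of the idea (sample, pick boundary points, partition, sort within each partition), not the Scala code from the GitHub repository:

```python
import random
from bisect import bisect_left

def sample_boundaries(records, num_partitions, sample_size=100):
    # Pick (num_partitions - 1) boundary keys from a small random sample,
    # mimicking TeraSort's sampling-based range partitioner.
    sample = sorted(random.sample(records, min(sample_size, len(records))))
    step = len(sample) / num_partitions
    return [sample[int(step * i)] for i in range(1, num_partitions)]

def range_partition_and_sort(records, boundaries):
    # Route each record to the partition chosen by the boundary points,
    # then sort within each partition. Because the partitions cover
    # disjoint, ordered key ranges, together they form a total order
    # with no partition skew (in expectation) -- exactly what TeraSort needs.
    parts = [[] for _ in range(len(boundaries) + 1)]
    for r in records:
        parts[bisect_left(boundaries, r)].append(r)
    return [sorted(p) for p in parts]

records = [random.randrange(10**6) for _ in range(10_000)]
bounds = sample_boundaries(records, num_partitions=4)
parts = range_partition_and_sort(records, bounds)

# Concatenating the sorted partitions yields a globally sorted sequence.
assert [x for p in parts for x in p] == sorted(records)
```

In Spark terms, the partitioning step corresponds to handing such boundary points to the partitioner used by `repartitionAndSortWithinPartitions`.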
  • 9. TeraSort for Flink • Code • Pipeline consisting of four operators:
DataSource – create a dataset to read tuples from HDFS
Partition – partition tuples
SortPartition – sort tuples of each partition
DataSink – write output to HDFS
DS phases : read, shuffling, local sort, write
• No map-side sorting due to pipelined execution
9
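The four-operator pipeline can be mimicked with Python generators to show the key point: records stream from the source into the partitioner without a map-side sort barrier, and each partition is sorted exactly once on the receiving side. Hash partitioning is used below for brevity; the real job uses the sampling-based range partitioner:

```python
def data_source(lines):
    # DataSource: stream records one at a time (no stage barrier,
    # no materialization of the whole input)
    for line in lines:
        yield line

def partition(stream, num_partitions):
    # Partition: route each record as it arrives -- note there is
    # no map-side sort, matching Flink's pipelined execution
    buckets = [[] for _ in range(num_partitions)]
    for rec in stream:
        buckets[hash(rec) % num_partitions].append(rec)
    return buckets

def sort_partition_and_sink(buckets):
    # SortPartition + DataSink: each partition is sorted once, then written
    return [sorted(b) for b in buckets]

lines = ["d", "a", "c", "b"] * 100
out = sort_partition_and_sink(partition(data_source(lines), num_partitions=2))

assert all(b == sorted(b) for b in out)            # each partition is sorted
assert sorted(sum(out, [])) == sorted(lines)        # no records lost
```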
  • 10. Importance of TeraSort • Suitable for measuring the pure performance of big data engines • No data transformation (like map, filter) with user-defined logic • Basic facilities of each engine are used • “Winning the sort benchmark” is a great means of PR 10
  • 11. Outline • TeraSort for various engines • Experimental setup • Machine specification • Node configuration • Results & analysis • What else for better performance? • Conclusion 11
  • 12. Machine specification (42 identical machines) • DELL PowerEdge R610 • CPU: two X5650 processors (12 cores in total) • Memory: 24GB in total • Disk: 6 disks × 500GB/disk • Network: 10 Gigabit Ethernet
My machine vs. the Spark team's machines:
• Processor: Intel Xeon X5650 (Q1, 2010) vs. Intel Xeon E5-2670 (Q1, 2012)
• Cores: 6 × 2 processors vs. 8 × 4 processors
• Memory: 24GB vs. 244GB
• Disks: 6 HDDs vs. 8 SSDs
Results can be different on newer machines 12
  • 13. Node configuration (24GB on each node) • 2 GB in total for daemons: NodeManager (1 GB) + DataNode (1 GB) • MapReduce-2.7.1 / Tez-0.7.0: up to 12 concurrent MapTask/ReduceTask containers (1GB each, 13 GB task region), with a ShuffleService in the NodeManager • Spark-1.5.1: one Executor JVM (12GB; internal memory layout, various managers, 12 task slots backed by a thread pool), plus a Driver (1GB) • Flink-0.9.1: one TaskManager JVM (12GB; internal memory layout, various managers, 12 task slots backed by task threads), plus a JobManager (1GB) • 12 simultaneous tasks at most per node 13
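The per-node memory budget above can be sanity-checked with a few lines of arithmetic (the numbers come from the slide; treating the remainder as JVM/OS headroom is an assumption):

```python
node_gb = 24
daemon_gb = 1 + 1          # NodeManager (1 GB) + DataNode (1 GB)
slots_per_node = 12        # 12 simultaneous tasks at most

# MapReduce / Tez: short-lived 1 GB task containers, at most 12 at a time
mr_tasks_gb = slots_per_node * 1
# Spark / Flink: one long-running 12 GB JVM holding 12 task slots
jvm_gb = 12

assert daemon_gb + mr_tasks_gb <= node_gb   # MR/Tez fits within 24 GB
assert daemon_gb + jvm_gb <= node_gb        # Spark/Flink fits within 24 GB
```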
  • 14. Outline • TeraSort for various engines • Experimental setup • Results & analysis • Flink is faster than other engines due to its pipelined execution • What else for better performance? • Conclusion 14
  • 15. How to read a swimlane graph & throughput graphs • Swimlane graph: tasks vs. time since job starts (seconds); each line is the duration of one task, with different patterns for different stages • Example: 6 waves of 1st-stage tasks and 1 wave of 2nd-stage tasks; the two stages barely overlap • Throughput graphs: cluster network throughput (In/Out) and cluster disk throughput (disk read/disk write); no network traffic during the 1st stage 15
  • 16. Result of sorting 80GB/node (3.2TB) • Flink is the fastest (1480 sec) due to its pipelined execution • Tez (1887 sec) and Spark (2171 sec) do not overlap 1st and 2nd stages • MapReduce in Hadoop-2.7.1 (2157 sec) is slow despite overlapping stages • Flink's swimlane stages: 1 DataSource, 2 Partition, 3 SortPartition, 4 DataSink 16 * Map output compression turned on for Spark and Tez
  • 17. Tez and Spark do not overlap 1st and 2nd stages • For both engines, the cluster network and disk throughput graphs show: (1) the 2nd stage starts, (2) the output of the 1st stage is sent, (3) disk write to HDFS occurs after shuffling is done; the cluster is idle in between • Flink (1 DataSource, 2 Partition, 3 SortPartition, 4 DataSink), by contrast: (1) network traffic occurs from the start, (2) write to HDFS occurs right after shuffling is done 17
  • 18. Tez does not overlap 1st and 2nd stages • Tez has parameters to control the degree of overlap • tez.shuffle-vertex-manager.min-src-fraction : 0.2 • tez.shuffle-vertex-manager.max-src-fraction : 0.4 • However, the 2nd stage is scheduled early but launched late 18
  • 19. Spark does not overlap 1st and 2nd stages • Spark cannot execute multiple stages simultaneously, as also noted in the following VLDB paper (2015): “Spark doesn’t support the overlap between shuffle write and read stages. … Spark may want to support this overlap in the future to improve performance.” • Experimental results of that paper: Spark is faster than MapReduce for WordCount, K-means, and PageRank; MapReduce is faster than Spark for Sort. 19
  • 20. MapReduce is slow despite overlapping stages • mapreduce.job.reduce.slowstart.completedMaps : [0.0, 1.0] • 0.05 (overlapping, default): 2157 sec vs. 0.95 (no overlapping): 2385 sec, i.e., overlapping yields only a 10% improvement • Wang likewise proposes to overlap Spark stages to achieve better utilization • 10%??? Why do Spark & MapReduce improve by just 10%? 20
  • 21. Data transfer between tasks of different stages • Traditional pull model (used in MapReduce, Spark, Tez): (1) the producer task writes its output file (partitions P1 … Pn) to disk, (2) each consumer task requests its partition from the shuffle server, (3) the shuffle server sends it • Extra disk accesses & simultaneous disk accesses; shuffling affects the performance of producers • This is why overlapping leads to only a 10% improvement • Pipelined data transfer (used in Flink): data transfer from memory to memory; Flink causes fewer disk accesses during shuffling 21
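The contrast between the two transfer models can be illustrated with a toy simulation (my own sketch; a dict stands in for the disk and the counters stand in for real I/O):

```python
from collections import defaultdict

class Disk:
    """Fake disk that counts read and write operations."""
    def __init__(self):
        self.store, self.writes, self.reads = {}, 0, 0
    def write(self, name, data):
        self.writes += 1
        self.store[name] = data
    def read(self, name):
        self.reads += 1
        return self.store[name]

def pull_model(producer_output, disk):
    # MapReduce/Spark/Tez style: the producer writes its whole output
    # file to disk first, then consumers pull their partitions back
    # through the shuffle server, costing extra disk accesses.
    disk.write("output", producer_output)
    shuffled = defaultdict(list)
    for part, rec in disk.read("output"):
        shuffled[part].append(rec)
    return dict(shuffled)

def pipelined_model(producer_output):
    # Flink style: records flow from memory to memory;
    # no intermediate file is written or read back.
    shuffled = defaultdict(list)
    for part, rec in producer_output:
        shuffled[part].append(rec)
    return dict(shuffled)
```

Both models produce the same shuffled result; the pull model just pays one extra write and one extra read of the entire intermediate data, which is the disk traffic Flink avoids.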
  • 22. Flink causes fewer disk accesses during shuffling • Total disk write (TB): MapReduce 9.9 vs. Flink 6.5 (diff. 3.4) • Total disk read (TB): MapReduce 8.1 vs. Flink 6.9 (diff. 1.2) • The difference comes from shuffling; shuffled data are sometimes read from the page cache • The total amount of disk read/write equals the area of the blue/green region in the cluster disk throughput graphs 22
  • 23. Result of TeraSort with various data sizes • Time (seconds) per node data size: 10GB: Flink 157, Spark 387, MapReduce 259, Tez 277 | 20GB: Flink 350, Spark 652, MapReduce 555, Tez 729 | 40GB: Flink 741, Spark 1135, MapReduce 1085, Tez 1709 | 80GB: Flink 1480, Spark 2171, MapReduce 2157, Tez 1887 (what we’ve seen) | 160GB: Flink 3127, Spark 4927, MapReduce 4796, Tez 3950 • Graph in log scale 23 * Map output compression turned on for Spark and Tez
  • 24. Result of HashJoin • 10 slave nodes • org.apache.tez.examples.JoinDataGen • Small dataset: 256MB • Large dataset: 240GB (24GB/node) • Result: Flink (378 sec) is ~2x faster than Tez (770 sec) and ~4x faster than Spark (1538 sec) • Visit my blog  24 * No map output compression for both Spark and Tez, unlike in TeraSort
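The join pattern being benchmarked (build a hash table on the small dataset, probe it while streaming the large one) can be sketched as follows (my own minimal version, not the Tez JoinExample code; names are mine):

```python
from collections import defaultdict

def hash_join(small, large):
    """Join two lists of (key, value) pairs on key.

    Build phase: hash the small side (the 256MB dataset in the
    benchmark), which is assumed to fit in memory.
    Probe phase: stream the large side (240GB in the benchmark)
    through the table, so it is read only once.
    """
    table = defaultdict(list)
    for k, v in small:                 # build phase
        table[k].append(v)
    out = []
    for k, w in large:                 # probe phase
        for v in table.get(k, ()):
            out.append((k, v, w))
    return out
```

This asymmetry is why the small-side size matters so much: as long as it fits in memory, the cost is dominated by a single streaming pass over the large side.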
  • 25. Result of HashJoin with swimlane & throughput graphs • Operators: 1 DataSource, 2 DataSource, 3 Join, 4 DataSink • Tez and Spark show idle periods between stages; in Flink the 2nd and 3rd operators overlap • Cluster network throughput (In/Out) and cluster disk throughput (disk read/disk write) graphs are annotated with totals of 0.24 TB, 0.41 TB, 0.60 TB, 0.84 TB, 0.68 TB, and 0.74 TB 25
  • 26. Flink’s shortcomings • No support for map output compression • Small data blocks are pipelined between operators • Only job-level fault tolerance • Shuffle data are not materialized • Low disk throughput during the post-shuffling phase 26
  • 27. Low disk throughput during the post-shuffling phase • Possible reason: sorting records from small files • Concurrent disk access to small files → too many disk seeks → low disk throughput • Other engines merge records from larger files than Flink does • “Eager pipelining moves some of the sorting work from the mapper to the reducer” (from MapReduce Online, NSDI 2010) 27
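The post-shuffle merging the slide refers to can be sketched with a standard k-way merge (my own illustration): the CPU cost is the same either way, but each sorted run is a separate file whose reads interleave on disk, so the number of concurrent runs k is what drives the number of disk seeks.

```python
import heapq

def merge_runs(runs):
    """k-way merge of sorted runs, as a post-shuffle merger would do.

    With many tiny runs (Flink's case), k is large and the disk must
    serve k interleaved read streams, causing many seeks; with fewer,
    larger runs (MapReduce/Tez after map-side merging), k is small
    and reads stay mostly sequential.
    """
    return list(heapq.merge(*runs))
```

Eager pipelining trades away the map-side merge work, which is exactly why the runs arriving at the Flink reducer side are smaller and more numerous.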
  • 28. Outline • TeraSort for various engines • Experimental setup • Results & analysis • What else for better performance? • Conclusion 28
  • 29. MR2 – another MapReduce engine • PhD thesis • MR2: Fault Tolerant MapReduce with the Push Model • developed over 3 years • Provides the user interface of Hadoop MapReduce • No DAG support • No in-memory computation • No iterative computation • Characteristics • Push model + fault tolerance • Techniques to boost HDD throughput • Prefetching for mappers • Preloading for reducers 29
  • 30. MR2 pipeline • 7 types of components with memory buffers 1. Mappers & reducers: to apply user-defined functions 2. Prefetcher & preloader: to eliminate concurrent disk access 3. Sender & receiver & merger: to implement MR2’s push model • Various buffers: to pass data between components w/o disk IOs • Minimum disk access (2 disk reads & 2 disk writes) • +1 disk write for fault tolerance 30
  • 31. Prefetcher & mappers • Prefetcher loads data for multiple mappers • Mappers do not read input from disks • <Hadoop MapReduce>: 2 mappers on a node process Blk1 and Blk2 concurrently, competing for the disk, which hurts disk throughput and CPU utilization • <MR2>: the prefetcher reads Blk1–Blk4 sequentially and hands each half-block to the 2 mappers, keeping both disk throughput and CPU utilization high 31
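A prefetcher of this kind is essentially double buffering: a background thread reads the next block while the mapper processes the current one. A minimal sketch (my own; function and parameter names are assumptions, not MR2's actual API):

```python
import queue
import threading

def prefetching_reader(read_block, num_blocks, depth=2):
    """Yield blocks while a background thread keeps reading ahead.

    'read_block(i)' stands in for a sequential disk read of block i;
    'depth' bounds how many blocks sit in the buffer, so the
    prefetcher stays at most 'depth' blocks ahead of the mapper.
    """
    buf = queue.Queue(maxsize=depth)

    def prefetch():
        for i in range(num_blocks):
            buf.put(read_block(i))   # blocks when the buffer is full
        buf.put(None)                # end-of-input marker

    threading.Thread(target=prefetch, daemon=True).start()
    while True:
        block = buf.get()
        if block is None:
            return
        yield block
```

Because only the prefetch thread touches the disk, reads stay sequential even when several mappers consume blocks concurrently, which is the effect the slide describes.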
  • 32. Push model in MR2 • Node-to-node network connection for pushing data • To reduce the number of network connections • Data transfer from memory buffer (similar to Flink’s pipelined execution) • Mappers store spills in the send buffer; MR2 does local sorting before pushing data (similar to Spark) • Spills are pushed to the reducer side by the sender • Fault tolerance (can be turned on/off) • Input ranges of each spill are known to the master for reproduction • Spills are also stored on disk for fast recovery (extra disk write) 32
  • 33. Receiver & merger & preloader & reducer • The merger produces a file from different partition data in the receiver’s managed memory: it sorts each partition’s data and then interleaves the partitions (groups of P1 P2 P3 P4) • The preloader preloads each group into the reduce buffer (1 disk access for 4 partitions) • Reducers do not read data directly from disks • MR2 eliminates concurrent disk reads from reducers thanks to the preloader 33
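The merger's file layout can be sketched as follows (my own simplified model; the real MR2 operates on byte buffers, not Python lists): each spill's partitions are sorted and written back to back as one group P1…Pn, so the preloader can fetch data for all partitions with a single sequential read.

```python
def build_interleaved_file(spills, num_partitions):
    """Each spill maps partition id -> list of records.

    Sort every partition's records, then lay each spill out as one
    group [P1, P2, ..., Pn]; groups follow each other in the file.
    """
    file = []
    for spill in spills:
        group = [sorted(spill.get(p, [])) for p in range(num_partitions)]
        file.append(group)        # one group == one preloader disk access
    return file

def preload_group(file, group_idx):
    # A single sequential read returns data for all partitions at once,
    # so reducers never issue their own (concurrent) disk reads.
    return file[group_idx]
```

Without the interleaving, each of the n reducers would seek to its own partition's offset, reintroducing exactly the concurrent small reads the design avoids.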
  • 34. Result of sorting 80GB/node (3.2TB) with MR2 • Time (sec): MapReduce in Hadoop-2.7.1 2157, Tez-0.7.0 1887, Spark-1.5.1 2171, Flink-0.9.1 1480, MR2 890 • MR2 speedup over the other engines: 2.42x, 2.12x, 2.44x, and 1.66x respectively 34
  • 35. Disk & network throughput (Flink vs. MR2) 1. DataSource / mapping: the prefetcher is effective; MR2 shows higher disk throughput 2. Partition / shuffling: records to shuffle are generated faster in MR2 3. DataSink / reducing: the preloader is effective; almost 2x throughput 35
  • 36. PUMA (PUrdue MApreduce benchmarks suite) • Experimental results using 10 nodes 36
  • 37. Outline • TeraSort for various engines • Experimental setup • Results & analysis • What else for better performance? • Conclusion 37
  • 38. Conclusion • Flink uses pipelined execution for both batch and stream processing • It is even faster than the other batch processing engines for TeraSort & HashJoin • Shortcomings due to pipelined execution: no fine-grained fault tolerance, no map output compression, low disk throughput during the post-shuffling phase 38