SlideShare a Scribd company logo
Steffen Zeuch / DFKI GmbH / 08/27/2019 1/14
Analyzing Efficient Stream
Processing on Modern Hardware
Steffen Zeuch, Bonaventura Del Monte, Jeyhun
Karimov, Clemens Lutz, Manuel Renz, Jonas Traub,
Sebastian Breß, Tilmann Rabl, Volker Markl
Steffen Zeuch / DFKI GmbH / 08/27/2019 2/14
What is this paper about?
This paper is about showing the potential of
hardware-tailored code compilation and data
ingestion at memory speed for a scale-up SPE.
This paper is not about
benchmarking existing SPEs.
Steffen Zeuch / DFKI GmbH / 08/27/2019 3/14
Is ingestion at memory-speed possible?
Network is not the bottleneck in the future.
Steffen Zeuch / DFKI GmbH / 08/27/2019 4/14
What is possible?
No SPE is yet ready for processing at memory speed.
Steffen Zeuch / DFKI GmbH / 08/27/2019 5/14
What did we do?
• Analyze state-of-the-art streaming systems and
identify sources of inefficiency.
• Investigate data-related and processing-related
design space.
• Derive design changes for streaming systems to
exploit modern hardware more efficiently.
Steffen Zeuch / DFKI GmbH / 08/27/2019 6/14
How do SPEs transfer data?
Queues are the major bottleneck for scale-up processing.
Steffen Zeuch / DFKI GmbH / 08/27/2019 7/14
How do SPEs parallelize a query?
For scale-up, there are alternatives to partitioning.
Steffen Zeuch / DFKI GmbH / 08/27/2019 8/14
How do SPEs execute a query?
All systems use an interpretation based approach.
All systems, except streambox, use a managed runtime.
Steffen Zeuch / DFKI GmbH / 08/27/2019 9/14
What’s the scale-up performance?
Yahoo Streaming
Benchmark
Linear Road Benchmark
(partly)
New York Taxi Query
Overhead for entire framework: up to 80x
Overhead for managed runtime: up to 56x
Steffen Zeuch / DFKI GmbH / 08/27/2019 10/14
What’s the scale-out performance?(reported)
An optimized scale-up solution outperforms even 10 node
cluster.
Steffen Zeuch / DFKI GmbH / 08/27/2019 11/14
How does Flink scale out?
Add new nodes to the system does not solve the problem.
Steffen Zeuch / DFKI GmbH / 08/27/2019 12/14
Why are current SPEs inefficient?
Large instruction footprints, virtual function calls, and suboptimal
access patterns reduce efficiency.
Steffen Zeuch / DFKI GmbH / 08/27/2019 13/14
What should we do to scale-up?
• Avoid managed runtimes
• Use a compilation-based approach to
produce hardware-tailored code
• Avoid queues and use operator fusion
• Use late merge instead of partitioning
– enables producer/consumer fusing
Steffen Zeuch / DFKI GmbH / 08/27/2019 14/14
Summary
• We explore the data-related and processing-related
design space.
• We show that an up to two orders of magnitude
performance improvement is possible.
• We derive design changes for streaming systems to
exploit modern hardware more efficiently.
https://git.io/fjAZg

More Related Content

Similar to Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentation)

New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...Paul Hofmann
 
Data Engine for NoSQL - IBM Power Systems
Data Engine for NoSQL - IBM Power SystemsData Engine for NoSQL - IBM Power Systems
Data Engine for NoSQL - IBM Power SystemsthinkASG
 
A Collaborative Research Proposal To The NSF Research Accelerator For Multip...
A Collaborative Research Proposal To The NSF  Research Accelerator For Multip...A Collaborative Research Proposal To The NSF  Research Accelerator For Multip...
A Collaborative Research Proposal To The NSF Research Accelerator For Multip...Scott Donald
 
Sebastian Bretschneider - Our way to Ceph
Sebastian Bretschneider - Our way to CephSebastian Bretschneider - Our way to Ceph
Sebastian Bretschneider - Our way to CephShapeBlue
 
Do modernizing the Mainframe for DevOps.
Do modernizing the Mainframe for DevOps.Do modernizing the Mainframe for DevOps.
Do modernizing the Mainframe for DevOps.Massimo Talia
 
A File-Based Approach for Recommender Systems in High-Performance Computing E...
A File-Based Approach for Recommender Systems in High-Performance Computing E...A File-Based Approach for Recommender Systems in High-Performance Computing E...
A File-Based Approach for Recommender Systems in High-Performance Computing E...Simon Dooms
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lakeEMC
 
Designing for Performance: Database Related Worst Practices
Designing for Performance: Database Related Worst PracticesDesigning for Performance: Database Related Worst Practices
Designing for Performance: Database Related Worst PracticesChristian Antognini
 
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA AcceleratorsFPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA AcceleratorsFlexTiles Team
 
Impact 2011 2899 - Designing high performance straight through processes usin...
Impact 2011 2899 - Designing high performance straight through processes usin...Impact 2011 2899 - Designing high performance straight through processes usin...
Impact 2011 2899 - Designing high performance straight through processes usin...Brian Petrini
 
Analytic hierarchy process for pif thomas fehlmann
Analytic hierarchy process for pif   thomas fehlmannAnalytic hierarchy process for pif   thomas fehlmann
Analytic hierarchy process for pif thomas fehlmannIWSM Mensura
 
Paremus service fabric
Paremus service fabricParemus service fabric
Paremus service fabricpjhInovex
 
Gridcomputing
GridcomputingGridcomputing
Gridcomputingpchengi
 
STG Update 24.11.11
STG Update 24.11.11STG Update 24.11.11
STG Update 24.11.11PatrickGWard
 
Composing and Scaling Data Platforms-2015
Composing and Scaling Data Platforms-2015Composing and Scaling Data Platforms-2015
Composing and Scaling Data Platforms-2015Rahul Kumar
 
Composing and scaling data platforms
Composing and scaling data platformsComposing and scaling data platforms
Composing and scaling data platformsSigmoid
 

Similar to Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentation) (20)

New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
New Business Applications Powered by In-Memory Technology @MIT Forum for Supp...
 
Data Engine for NoSQL - IBM Power Systems
Data Engine for NoSQL - IBM Power SystemsData Engine for NoSQL - IBM Power Systems
Data Engine for NoSQL - IBM Power Systems
 
A Collaborative Research Proposal To The NSF Research Accelerator For Multip...
A Collaborative Research Proposal To The NSF  Research Accelerator For Multip...A Collaborative Research Proposal To The NSF  Research Accelerator For Multip...
A Collaborative Research Proposal To The NSF Research Accelerator For Multip...
 
Sebastian Bretschneider - Our way to Ceph
Sebastian Bretschneider - Our way to CephSebastian Bretschneider - Our way to Ceph
Sebastian Bretschneider - Our way to Ceph
 
Do modernizing the Mainframe for DevOps.
Do modernizing the Mainframe for DevOps.Do modernizing the Mainframe for DevOps.
Do modernizing the Mainframe for DevOps.
 
Streaming is a Detail
Streaming is a DetailStreaming is a Detail
Streaming is a Detail
 
A File-Based Approach for Recommender Systems in High-Performance Computing E...
A File-Based Approach for Recommender Systems in High-Performance Computing E...A File-Based Approach for Recommender Systems in High-Performance Computing E...
A File-Based Approach for Recommender Systems in High-Performance Computing E...
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Designing for Performance: Database Related Worst Practices
Designing for Performance: Database Related Worst PracticesDesigning for Performance: Database Related Worst Practices
Designing for Performance: Database Related Worst Practices
 
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA AcceleratorsFPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators
FPL'2014 - FlexTiles Workshop - 6 - FlexTiles Embedded FPGA Accelerators
 
Impact 2011 2899 - Designing high performance straight through processes usin...
Impact 2011 2899 - Designing high performance straight through processes usin...Impact 2011 2899 - Designing high performance straight through processes usin...
Impact 2011 2899 - Designing high performance straight through processes usin...
 
Analytic hierarchy process for pif thomas fehlmann
Analytic hierarchy process for pif   thomas fehlmannAnalytic hierarchy process for pif   thomas fehlmann
Analytic hierarchy process for pif thomas fehlmann
 
OpenHPC Update
OpenHPC UpdateOpenHPC Update
OpenHPC Update
 
Data EcoSystem 2.0
Data EcoSystem 2.0Data EcoSystem 2.0
Data EcoSystem 2.0
 
Paremus service fabric
Paremus service fabricParemus service fabric
Paremus service fabric
 
191
191191
191
 
Gridcomputing
GridcomputingGridcomputing
Gridcomputing
 
STG Update 24.11.11
STG Update 24.11.11STG Update 24.11.11
STG Update 24.11.11
 
Composing and Scaling Data Platforms-2015
Composing and Scaling Data Platforms-2015Composing and Scaling Data Platforms-2015
Composing and Scaling Data Platforms-2015
 
Composing and scaling data platforms
Composing and scaling data platformsComposing and scaling data platforms
Composing and scaling data platforms
 

More from Jonas Traub

Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Jonas Traub
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Jonas Traub
 
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...Jonas Traub
 
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...Jonas Traub
 
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Jonas Traub
 
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)Jonas Traub
 
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Jonas Traub
 
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingJonas Traub
 
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingScotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingJonas Traub
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Jonas Traub
 
Efficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCLEfficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCLJonas Traub
 
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...Jonas Traub
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...Jonas Traub
 
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...Jonas Traub
 
I²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming DataI²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming DataJonas Traub
 
LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)Jonas Traub
 
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisLWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisJonas Traub
 

More from Jonas Traub (17)

Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
Definitely not Java! A Hands-on Introduction to Efficient Functional Programm...
 
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
Efficient Data Stream Processing in the Internet of Things - SoftwareCampus A...
 
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
code.talks 2019 - Scotty: Efficient Window Aggregation for your Stream Proces...
 
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
FlinkForward Berlin 2019 - Scotty: Efficient Window Aggregation with General ...
 
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
Database Research at TU Berlin DIMA and DFKI IAM - USA Excursion Slides 2019
 
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
Efficient Window Aggregation with General Stream Slicing (EDBT 2019, Best Paper)
 
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
Resense: Transparent Record and Replay of Sensor Data in the Internet of Thin...
 
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream SlicingFlink Forward 2018: Efficient Window Aggregation with Stream Slicing
Flink Forward 2018: Efficient Window Aggregation with Stream Slicing
 
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream ProcessingScotty: Efficient Window Aggregation for Out-of-Order Stream Processing
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
 
Efficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCLEfficient SIMD Vectorization for Hashing in OpenCL
Efficient SIMD Vectorization for Hashing in OpenCL
 
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
UZH Stream Reasoning Workshop 2018: Optimized On-Demand Data Streaming from S...
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
 
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
I²: Interactive Real-Time Visualization for Streaming Data with Apache Flink ...
 
I²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming DataI²: Interactive Real-Time Visualization for Streaming Data
I²: Interactive Real-Time Visualization for Streaming Data
 
LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)LWA 2015: The Apache Flink Platform (Poster)
LWA 2015: The Apache Flink Platform (Poster)
 
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisLWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
 

Recently uploaded

Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptsreddyrahul
 
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxPlasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxmuralinath2
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Sérgio Sacani
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPirithiRaju
 
Topography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalTopography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalMd Hasan Tareq
 
Structural annotation................pptx
Structural annotation................pptxStructural annotation................pptx
Structural annotation................pptxCherry
 
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Sérgio Sacani
 
Mitosis...............................pptx
Mitosis...............................pptxMitosis...............................pptx
Mitosis...............................pptxCherry
 
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCE
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCEPLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCE
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCETALAPATI ARUNA CHENNA VYDYANAD
 
METHODS OF TRANSCRIPTOME ANALYSIS....pptx
METHODS OF TRANSCRIPTOME ANALYSIS....pptxMETHODS OF TRANSCRIPTOME ANALYSIS....pptx
METHODS OF TRANSCRIPTOME ANALYSIS....pptxCherry
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Sérgio Sacani
 
National Biodiversity protection initiatives and Convention on Biological Di...
National Biodiversity protection initiatives and  Convention on Biological Di...National Biodiversity protection initiatives and  Convention on Biological Di...
National Biodiversity protection initiatives and Convention on Biological Di...PABOLU TEJASREE
 
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...Subhajit Sahu
 
A Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthA Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthSérgio Sacani
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanmuralinath2
 
Cell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptxCell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptxCherry
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureSérgio Sacani
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPirithiRaju
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
 
INSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere UniversityINSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere UniversitySteffi Friedrichs
 

Recently uploaded (20)

Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
 
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptxPlasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
Plasmapheresis - Dr. E. Muralinath - Kalyan . C.pptx
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
Topography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of BengalTopography and sediments of the floor of the Bay of Bengal
Topography and sediments of the floor of the Bay of Bengal
 
Structural annotation................pptx
Structural annotation................pptxStructural annotation................pptx
Structural annotation................pptx
 
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
Exomoons & Exorings with the Habitable Worlds Observatory I: On the Detection...
 
Mitosis...............................pptx
Mitosis...............................pptxMitosis...............................pptx
Mitosis...............................pptx
 
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCE
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCEPLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCE
PLANT DISEASE MANAGEMENT PRINCIPLES AND ITS IMPORTANCE
 
METHODS OF TRANSCRIPTOME ANALYSIS....pptx
METHODS OF TRANSCRIPTOME ANALYSIS....pptxMETHODS OF TRANSCRIPTOME ANALYSIS....pptx
METHODS OF TRANSCRIPTOME ANALYSIS....pptx
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
 
National Biodiversity protection initiatives and Convention on Biological Di...
National Biodiversity protection initiatives and  Convention on Biological Di...National Biodiversity protection initiatives and  Convention on Biological Di...
National Biodiversity protection initiatives and Convention on Biological Di...
 
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings o...
 
A Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on EarthA Giant Impact Origin for the First Subduction on Earth
A Giant Impact Origin for the First Subduction on Earth
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
 
Cell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptxCell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptx
 
Detectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a TechnosignatureDetectability of Solar Panels as a Technosignature
Detectability of Solar Panels as a Technosignature
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
 
Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...Multi-source connectivity as the driver of solar wind variability in the heli...
Multi-source connectivity as the driver of solar wind variability in the heli...
 
INSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere UniversityINSIGHT Partner Profile: Tampere University
INSIGHT Partner Profile: Tampere University
 

Analyzing Efficient Stream Processing on Modern Hardware (VLDB 2019 Presentation)

  • 1. Steffen Zeuch / DFKI GmbH / 08/27/2019 1/14 Analyzing Efficient Stream Processing on Modern Hardware Steffen Zeuch, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl
  • 2. Steffen Zeuch / DFKI GmbH / 08/27/2019 2/14 What is this paper about? This paper is about showing the potential of hardware-tailored code compilation and data ingestion at memory speed for a scale-up SPE. This paper is not about benchmarking existing SPEs.
  • 3. Steffen Zeuch / DFKI GmbH / 08/27/2019 3/14 Is ingestion at memory-speed possible? Network is not the bottleneck in the future.
  • 4. Steffen Zeuch / DFKI GmbH / 08/27/2019 4/14 What is possible? No SPE is yet ready for processing at memory speed.
  • 5. Steffen Zeuch / DFKI GmbH / 08/27/2019 5/14 What did we do? • Analyze state-of-the-art streaming systems and identify sources of inefficiency. • Investigate data-related and processing-related design space. • Derive design changes for streaming systems to exploit modern hardware more efficiently.
  • 6. Steffen Zeuch / DFKI GmbH / 08/27/2019 6/14 How do SPEs transfer data? Queues are the major bottleneck for scale-up processing.
  • 7. Steffen Zeuch / DFKI GmbH / 08/27/2019 7/14 How do SPEs parallelize a query? For scale-up, there are alternatives to partitioning.
  • 8. Steffen Zeuch / DFKI GmbH / 08/27/2019 8/14 How do SPEs execute a query? All systems use an interpretation based approach. All systems, except streambox, use a managed runtime.
  • 9. Steffen Zeuch / DFKI GmbH / 08/27/2019 9/14 What’s the scale-up performance? Yahoo Streaming Benchmark Linear Road Benchmark (partly) New York Taxi Query Overhead for entire framework: up to 80x Overhead for managed runtime: up to 56x
  • 10. Steffen Zeuch / DFKI GmbH / 08/27/2019 10/14 What’s the scale-out performance?(reported) An optimized scale-up solution outperforms even 10 node cluster.
  • 11. Steffen Zeuch / DFKI GmbH / 08/27/2019 11/14 How does Flink scale out? Add new nodes to the system does not solve the problem.
  • 12. Steffen Zeuch / DFKI GmbH / 08/27/2019 12/14 Why are current SPEs inefficient? Large instruction footprints, virtual function calls, and suboptimal access patterns reduce efficiency.
  • 13. Steffen Zeuch / DFKI GmbH / 08/27/2019 13/14 What should we do to scale-up? • Avoid managed runtimes • Use a compilation-based approach to produce hardware-tailored code • Avoid queues and use operator fusion • Use late merge instead of partitioning – enables producer/consumer fusing
  • 14. Steffen Zeuch / DFKI GmbH / 08/27/2019 14/14 Summary • We explore the data-related and processing-related design space. • We show that an up to two orders of magnitude performance improvement is possible. • We derive design changes for streaming systems to exploit modern hardware more efficiently. https://git.io/fjAZg