SlideShare a Scribd company logo
7-8 June 2018
Frankfurt Germany
Improving Business Performance Through Big Data Benchmarking
Todor Ivanov (todor@dbis.cs.uni-frankfurt.de)
Frankfurt Big Data Lab, Goethe University
DataBench Project - GA Nr 780966 2
Benchmarking
• The Term Benchmark:
A benchmark is a measured „best-in-class“ achievement recognised as the standard of excellence
for that business process. (APQC 1993)
• Business Performance Benchmarking – comparison of performance measures for the purpose
of determining how good one’s own company is compared to others.
• A software benchmark is a program used for comparison of software products/tools executing
on a pre-configured hardware environment.
8 June 2018 2
DataBench Project - GA Nr 780966 3
Performance Metrics
• Lord Kelvin defined KPIs as: “When you
can measure what are speaking about and
measure it in numbers, you know
something about it, when you cannot
express it in numbers, your knowledge is
of meager and unsatisfactory kind; it may
be the beginning of knowledge but you
have scarcely, in your thoughts advanced
to the stage of science.”
[Arora Amishi, Kaur Sukhbir. Performance assessment
model for management educators based on KRA/KPI. In:
International conference on technology and business
management March, vol. 23; 2015.]
• Key Performance Indicators (KPIs) - tell
you what to do to highly increase
performance.
8 June 2018 3
• Jim Gray describes the benchmarking as:
”This quantitative comparison starts with the
definition of a benchmark or workload. The
benchmark is run on several different
systems, and the performance and price of
each system is measured and recorded.
Performance is typically a throughput metric
(work/second) and price is typically a five-
year cost-of-ownership metric. Together,
they give a price/performance ratio.”
[Jim Gray (Ed.), The Benchmark Handbook for Database and
Transaction Systems, 1992]
• TPC defines complex performance metrics
for each standard benchmark:
• Transaction per second (tps)
• $/tps
DataBench Project - GA Nr 780966 4
Example Use Case
• Which system will have best price/performance for my application?
 Need to use a benchmark
• What is more important the minimal execution time or lower total cost?
 Depends on the business KPIs
• The company decide to introduce Machine Learning model to improve product
recommendations to customers. Different ML models have different Accuracy.
How important is the accuracy? Should the company invest in improving the
model accuracy?
 Business decision
8 June 2018 4
System A System B
4 Nodes (servers) 30 Nodes (servers)
4 hours (execution time) 4 minutes (execution time)
5500 Euro (total cost) 50 000 Euro (total cost)
DataBench Project - GA Nr 780966 5
Goals of DataBench
• … There is a lack of objective, evidence-based methods measuring the correlation between Big
Data Technology benchmarks and Business benchmarks as well as Big Data Technology
capability to impact the competitiveness and market success of organizations.
• Objective:
… The objective is to provide a model which correlates technical benchmarks to performance and
business needs of different sectors and domains. …
8 June 2018 5
Evidence Based Big Data Benchmarking
to Improve Business Performance
6
6© IDC
Building a bridge between technical and business benchmarking
Evaluating
business
performance and
benchmarks
Mapping and
assessing
technical
benchmarks
Develop a Benchmarking Toolbox
and Handbook
DataBench Project - GA Nr 780966 778 June 2018
8
DataBench ToolBox
8 June 2018
DataBench Project - GA Nr 780966 9
BDV – Big Data and Analytics/Machine Learning
Reference Model
Source: http://bdva.eu/sites/default/files/BDVA_SRIA_v4_Ed1.1.pdf
8 June 2018 9
Data types,
Semantic
interoperability
DataBench Project - GA Nr 780966 108 June 2018 10
Some of the benchmarks to integrate (I)
Year Name Type
2010 HiBench
Big data benchmark suite for evaluating different big data frameworks. 19
workloads including synthetic micro-benchmarks and real-world applications from 6
categories which are micro, machine learning, sql, graph, websearch and
streaming.
2015 SparkBench
System for benchmarking and simulating Spark jobs. Multiple workloads
organized in 4 categories.
2010
Yahoo! Cloud System
Benchmark (YSCB)
Evaluates performance of different “key-value” and “cloud” serving systems,
which do not support the ACID properties. The YCSB++ , an extension, includes
many additions such as multi-tester coordination for increased load and eventual
consistency measurement.
2017 TPCx-IoT
Based on YCSB, but with significant changes. Workloads of data ingestion and
concurrent queries simulating workloads on typical IoT Gateway systems. Dataset
with data from sensors from electric power station(s).
11
Micro-benchmarks: stress test very specific part of the Big Data technologies typically utilizing
synthetic data.
8 June 2018
Some of the benchmarks to integrate (II)
Year Name Type
2015
Yahoo Streaming
Benchmark (YSB)
The Yahoo Streaming Benchmark is a streaming application
benchmark simulating an advertisement analytics pipeline.
2013 BigBench/TPCx-BB
BigBench is an end-to-end, technology agnostic, application-level
benchmark that tests the analytical capabilities of a Big Data platform.
It is based on a fictional product retailer business model.
2017 BigBench V2
Similar to BigBench, BigBench V2 is an end-to-end, technology
agnostic, application-level benchmark that tests the analytical
capabilities of a Big Data platform.
2018
ABench (Work-in-
Progress)
New type of multi-purpose Big Data benchmark covering many big data
scenarios and implementations. Extends other benchmarks such as BigBench.
128 June 2018
Application-level benchmarks: cover a wider and more complex set of functionalities involving
multiple Big Data technologies and typically utilizing real scenario data.
DataBench Project - GA Nr 780966 13
ABench (Work-in-Progress)
• Growing number of new Big Data technologies and connectors in the Big Data Stacks
 Challenges for Solution Architects, Data Engineers, Data Scientist, Developers, etc.
• Missing benchmarks for each technology, connector or a combination of them
• Consequence  Increasing complexity in the Big Data Architecture Stacks
• Our approach  ABench: Big Data Architecture Stack Benchmark
8 June 2018 13
ABench: Big Data Architecture Stack Benchmark.
Todor Ivanov, Rekha Singhal:
Companion of the 2018 ACM/SPEC International Conference
on Performance Engineering, ICPE 2018,
Berlin, Germany, April 09-13, 2018.
DataBench Project - GA Nr 780966 14
ABench Features
• Benchmark Framework
 Data generators or plugins for custom data generators
 Include data generator or public data sets to simulate workload that stresses the architecture
• Reuse of existing benchmarks
 Case study using BigBench (on-going Streaming and Machine Learning extensions)
• Open source implementation and extendable design
• Easy to setup and extend
• Supporting and combining all four types of benchmarks in ABench
8 June 2018 14
DataBench Project - GA Nr 780966 15
Benchmarks Types (adapted from Andersen and Pettersen)
1. Generic Benchmarking: checks whether an implementation fulfills
given business requirements and specifications (Is the defined
business specification implemented accurately?).
2. Competitive Benchmarking: is a performance comparison between
the best tools on the platform layer that offer similar functionality
(e.g., throughput of MapReduce vs. Spark vs. Flink).
3. Functional Benchmarking is a functional comparison of the features
of the tool against technologies from the same area. (e.g., Spark
Streaming vs. Spark Structured Streaming vs. Flink Streaming).
4. Internal Benchmarking: comparing different implementations of a
functionality (e.g., Spark Scala vs. Java vs. R vs. PySpark)
8 June 2018 15
DataBench Project - GA Nr 780966 168 June 2018 16
DataBench Project - GA Nr 780966 17178 June 2018
DataBench Project - GA Nr 780966 18
DataBench Summary
• A framework for big data benchmarking for all Big Data practitioners.
• Will provide methodology and tools.
• Added value:
• An umbrella to access to multiple benchmarks
• Homogenized technical metrics
• Derived business KPIs
• Connecting the different benchmark communities
• Open initiative: Public-private partnership (PPP) projects, industrial partners
(BDVA and beyond) and benchmarking initiatives are welcomed to work with
us, either to use our framework or to add new benchmarks.
8 June 2018 18
DataBench Project - GA Nr 780966 19
Contacts
info@databench.eu
@DataBench_eu
DataBench
DataBench Project
8 June 2018 19

More Related Content

What's hot

Accelerating Big Data & Analytics Innovations through Public – Private Partne...
Accelerating Big Data & Analytics Innovations through Public – Private Partne...Accelerating Big Data & Analytics Innovations through Public – Private Partne...
Accelerating Big Data & Analytics Innovations through Public – Private Partne...
Prof. Dr. Alexander Maedche
 
PeterBarnumGenResume
PeterBarnumGenResumePeterBarnumGenResume
PeterBarnumGenResumePeter Barnum
 
Agile Testing Days 2017 Introducing AgileBI Sustainably
Agile Testing Days 2017 Introducing AgileBI SustainablyAgile Testing Days 2017 Introducing AgileBI Sustainably
Agile Testing Days 2017 Introducing AgileBI Sustainably
Raphael Branger
 
Bhagyashree Nayak Resume
Bhagyashree Nayak ResumeBhagyashree Nayak Resume
Bhagyashree Nayak ResumeBhagya Shree
 
Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017
Debraj GuhaThakurta
 
Novel tool to optimize complext industrial projects
Novel tool to optimize complext industrial projectsNovel tool to optimize complext industrial projects
Novel tool to optimize complext industrial projects
kphodel
 
SAP Big Data Innovation Lab at the University of Mannheim
SAP Big Data Innovation Lab at the University of MannheimSAP Big Data Innovation Lab at the University of Mannheim
SAP Big Data Innovation Lab at the University of Mannheim
Prof. Dr. Alexander Maedche
 
Power BI vs Tableau vs Cognos: A Data Analytics Research
Power BI vs Tableau vs Cognos: A Data Analytics ResearchPower BI vs Tableau vs Cognos: A Data Analytics Research
Power BI vs Tableau vs Cognos: A Data Analytics Research
Luciano Vilas Boas
 
Laura sebu icstcc_merging
Laura sebu icstcc_mergingLaura sebu icstcc_merging
Laura sebu icstcc_merging
laurasebu
 
Resume anh chu data analyst
Resume anh chu data analystResume anh chu data analyst
Resume anh chu data analyst
ANH CHU
 
Analogic Link08 Presentation
Analogic Link08 PresentationAnalogic Link08 Presentation
Analogic Link08 Presentation
rtgalv
 
Artificial Intelligence and BIM
Artificial Intelligence and BIMArtificial Intelligence and BIM
Artificial Intelligence and BIM
Alex Ferguson
 

What's hot (16)

Accelerating Big Data & Analytics Innovations through Public – Private Partne...
Accelerating Big Data & Analytics Innovations through Public – Private Partne...Accelerating Big Data & Analytics Innovations through Public – Private Partne...
Accelerating Big Data & Analytics Innovations through Public – Private Partne...
 
PeterBarnumGenResume
PeterBarnumGenResumePeterBarnumGenResume
PeterBarnumGenResume
 
Agile Testing Days 2017 Introducing AgileBI Sustainably
Agile Testing Days 2017 Introducing AgileBI SustainablyAgile Testing Days 2017 Introducing AgileBI Sustainably
Agile Testing Days 2017 Introducing AgileBI Sustainably
 
Resume
ResumeResume
Resume
 
Resume
ResumeResume
Resume
 
Chaitanya_updated resume
Chaitanya_updated resumeChaitanya_updated resume
Chaitanya_updated resume
 
Chaitanya_updated resume
Chaitanya_updated resumeChaitanya_updated resume
Chaitanya_updated resume
 
Bhagyashree Nayak Resume
Bhagyashree Nayak ResumeBhagyashree Nayak Resume
Bhagyashree Nayak Resume
 
Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017
 
Novel tool to optimize complext industrial projects
Novel tool to optimize complext industrial projectsNovel tool to optimize complext industrial projects
Novel tool to optimize complext industrial projects
 
SAP Big Data Innovation Lab at the University of Mannheim
SAP Big Data Innovation Lab at the University of MannheimSAP Big Data Innovation Lab at the University of Mannheim
SAP Big Data Innovation Lab at the University of Mannheim
 
Power BI vs Tableau vs Cognos: A Data Analytics Research
Power BI vs Tableau vs Cognos: A Data Analytics ResearchPower BI vs Tableau vs Cognos: A Data Analytics Research
Power BI vs Tableau vs Cognos: A Data Analytics Research
 
Laura sebu icstcc_merging
Laura sebu icstcc_mergingLaura sebu icstcc_merging
Laura sebu icstcc_merging
 
Resume anh chu data analyst
Resume anh chu data analystResume anh chu data analyst
Resume anh chu data analyst
 
Analogic Link08 Presentation
Analogic Link08 PresentationAnalogic Link08 Presentation
Analogic Link08 Presentation
 
Artificial Intelligence and BIM
Artificial Intelligence and BIMArtificial Intelligence and BIM
Artificial Intelligence and BIM
 

Similar to Improving Business Performance Through Big Data Benchmarking, Todor Ivanov, BDIC 2018, Frankfurt 7-8/06/2018

Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
DataBench
 
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
DataBench
 
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
DataBench
 
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
DataBench
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
Big Data Value Association
 
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018 Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
DataBench
 
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
DataBench
 
Success Stories on Big Data & Analytics
Success Stories on Big Data & AnalyticsSuccess Stories on Big Data & Analytics
Success Stories on Big Data & Analytics
DataBench
 
Demystifying Data Science
Demystifying Data ScienceDemystifying Data Science
Demystifying Data Science
Data Science Milan
 
TechEvent DWH Modernization
TechEvent DWH ModernizationTechEvent DWH Modernization
TechEvent DWH Modernization
Trivadis
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
IDC4EU
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
IDC4EU
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
DataBench
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
Big Data Value Association
 
James hall ch 14
James hall ch 14James hall ch 14
James hall ch 14
David Julian
 
Productivity Factors in Software Development for PC Platform
Productivity Factors in Software Development for PC PlatformProductivity Factors in Software Development for PC Platform
Productivity Factors in Software Development for PC Platform
IJERA Editor
 
Using Data Science to Build an End-to-End Recommendation System
Using Data Science to Build an End-to-End Recommendation SystemUsing Data Science to Build an End-to-End Recommendation System
Using Data Science to Build an End-to-End Recommendation System
VMware Tanzu
 
Data_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdfData_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdf
prevota
 

Similar to Improving Business Performance Through Big Data Benchmarking, Todor Ivanov, BDIC 2018, Frankfurt 7-8/06/2018 (20)

Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
Building the DataBench Workflow and Architecture, Todor Ivanov, Bench 2019 - ...
 
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
Big Data Benchmarking, Tomas Pariente Lobo, Open Expo Europe, 20/06/2019
 
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
 
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
Relating Big Data Business and Technical Performance Indicators, Barbara Pern...
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
 
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018 Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
 
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
DataBench Toolbox Demo, Ivan Martinez, Tomas Pariente Lobo, BDV Meet-Up Riga,...
 
Success Stories on Big Data & Analytics
Success Stories on Big Data & AnalyticsSuccess Stories on Big Data & Analytics
Success Stories on Big Data & Analytics
 
Demystifying Data Science
Demystifying Data ScienceDemystifying Data Science
Demystifying Data Science
 
Resume (1)
Resume (1)Resume (1)
Resume (1)
 
Resume (1)
Resume (1)Resume (1)
Resume (1)
 
TechEvent DWH Modernization
TechEvent DWH ModernizationTechEvent DWH Modernization
TechEvent DWH Modernization
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
 
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
Building a Bridge between Technical and Business Benchmarking, Gabriella Catt...
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T...
 
James hall ch 14
James hall ch 14James hall ch 14
James hall ch 14
 
Productivity Factors in Software Development for PC Platform
Productivity Factors in Software Development for PC PlatformProductivity Factors in Software Development for PC Platform
Productivity Factors in Software Development for PC Platform
 
Using Data Science to Build an End-to-End Recommendation System
Using Data Science to Build an End-to-End Recommendation SystemUsing Data Science to Build an End-to-End Recommendation System
Using Data Science to Build an End-to-End Recommendation System
 
Data_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdfData_and_Analytics_Industry_IESE_v3.pdf
Data_and_Analytics_Industry_IESE_v3.pdf
 

More from DataBench

Welcome to DataBench
Welcome to DataBenchWelcome to DataBench
Welcome to DataBench
DataBench
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
DataBench
 
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
DataBench
 
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
DataBench
 
Session 4 - A practical journey on how to use the DataBench Toolbox
Session 4 - A practical journey on how to use the DataBench ToolboxSession 4 - A practical journey on how to use the DataBench Toolbox
Session 4 - A practical journey on how to use the DataBench Toolbox
DataBench
 
CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core Operations
DataBench
 
DataBench Toolbox in a Nutshell
DataBench Toolbox in a NutshellDataBench Toolbox in a Nutshell
DataBench Toolbox in a Nutshell
DataBench
 
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
DataBench
 
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench
 
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench
 
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
DataBench
 
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
DataBench
 
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
DataBench
 
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
DataBench
 
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
DataBench
 
DataBench - Project fiche
DataBench - Project ficheDataBench - Project fiche
DataBench - Project fiche
DataBench
 

More from DataBench (16)

Welcome to DataBench
Welcome to DataBenchWelcome to DataBench
Welcome to DataBench
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
 
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
Session 2 - A Project Perspective on Big Data Architectural Pipelines and Ben...
 
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
Session 3 - The DataBench Framework: A compelling offering to measure the Imp...
 
Session 4 - A practical journey on how to use the DataBench Toolbox
Session 4 - A practical journey on how to use the DataBench ToolboxSession 4 - A practical journey on how to use the DataBench Toolbox
Session 4 - A practical journey on how to use the DataBench Toolbox
 
CoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core OperationsCoreBigBench: Benchmarking Big Data Core Operations
CoreBigBench: Benchmarking Big Data Core Operations
 
DataBench Toolbox in a Nutshell
DataBench Toolbox in a NutshellDataBench Toolbox in a Nutshell
DataBench Toolbox in a Nutshell
 
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
DataBench Virtual BenchLearning "Success storie on Big Data & Analytics use c...
 
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
 
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B...
 
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
DataBench session @ BDV Meet-Up Riga: The case of HOBBIT, 27/06/2019
 
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
DataBench in a Nutshell - The market: Assessing Industrial Needs, Richard Ste...
 
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
Impacts of data-driven AI in business sectors, Richard Stevens, ICT 2018, 05/...
 
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
Exploratory Analysis of Spark Structured Streaming, Todor Ivanov, Jason Taafe...
 
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
Adding Velocity to BigBench, Todor Ivanov, Patrick Bedué, Roberto Zicari, Ahm...
 
DataBench - Project fiche
DataBench - Project ficheDataBench - Project fiche
DataBench - Project fiche
 

Recently uploaded

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 

Recently uploaded (20)

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 

Improving Business Performance Through Big Data Benchmarking, Todor Ivanov, BDIC 2018, Frankfurt 7-8/06/2018

  • 1. 7-8 June 2018 Frankfurt Germany Improving Business Performance Through Big Data Benchmarking Todor Ivanov (todor@dbis.cs.uni-frankfurt.de) Frankfurt Big Data Lab, Goethe University
  • 2. DataBench Project - GA Nr 780966 2 Benchmarking • The Term Benchmark: A benchmark is a measured „best-in-class“ achievement recognised as the standard of excellence for that business process. (APQC 1993) • Business Performance Benchmarking – comparison of performance measures for the purpose of determining how good one’s own company is compared to others. • A software benchmark is a program used for comparison of software products/tools executing on a pre-configured hardware environment. 8 June 2018 2
  • 3. DataBench Project - GA Nr 780966 3 Performance Metrics • Lord Kelvin defined KPIs as: “When you can measure what are speaking about and measure it in numbers, you know something about it, when you cannot express it in numbers, your knowledge is of meager and unsatisfactory kind; it may be the beginning of knowledge but you have scarcely, in your thoughts advanced to the stage of science.” [Arora Amishi, Kaur Sukhbir. Performance assessment model for management educators based on KRA/KPI. In: International conference on technology and business management March, vol. 23; 2015.] • Key Performance Indicators (KPIs) - tell you what to do to highly increase performance. 8 June 2018 3 • Jim Gray describes the benchmarking as: ”This quantitative comparison starts with the definition of a benchmark or workload. The benchmark is run on several different systems, and the performance and price of each system is measured and recorded. Performance is typically a throughput metric (work/second) and price is typically a five- year cost-of-ownership metric. Together, they give a price/performance ratio.” [Jim Gray (Ed.), The Benchmark Handbook for Database and Transaction Systems, 1992] • TPC defines complex performance metrics for each standard benchmark: • Transaction per second (tps) • $/tps
  • 4. DataBench Project - GA Nr 780966 4 Example Use Case • Which system will have best price/performance for my application?  Need to use a benchmark • What is more important the minimal execution time or lower total cost?  Depends on the business KPIs • The company decide to introduce Machine Learning model to improve product recommendations to customers. Different ML models have different Accuracy. How important is the accuracy? Should the company invest in improving the model accuracy?  Business decision 8 June 2018 4 System A System B 4 Nodes (servers) 30 Nodes (servers) 4 hours (execution time) 4 minutes (execution time) 5500 Euro (total cost) 50 000 Euro (total cost)
  • 5. DataBench Project - GA Nr 780966 5 Goals of DataBench • … There is a lack of objective, evidence-based methods measuring the correlation between Big Data Technology benchmarks and Business benchmarks as well as Big Data Technology capability to impact the competitiveness and market success of organizations. • Objective: … The objective is to provide a model which correlates technical benchmarks to performance and business needs of different sectors and domains. … 8 June 2018 5
  • 6. Evidence Based Big Data Benchmarking to Improve Business Performance 6 6© IDC Building a bridge between technical and business benchmarking Evaluating business performance and benchmarks Mapping and assessing technical benchmarks Develop a Benchmarking Toolbox and Handbook
  • 7. DataBench Project - GA Nr 780966 778 June 2018
  • 9. DataBench Project - GA Nr 780966 9 BDV – Big Data and Analytics/Machine Learning Reference Model Source: http://bdva.eu/sites/default/files/BDVA_SRIA_v4_Ed1.1.pdf 8 June 2018 9 Data types, Semantic interoperability
  • 10. DataBench Project - GA Nr 780966 108 June 2018 10
  • 11. Some of the benchmarks to integrate (I) Year Name Type 2010 HiBench Big data benchmark suite for evaluating different big data frameworks. 19 workloads including synthetic micro-benchmarks and real-world applications from 6 categories which are micro, machine learning, sql, graph, websearch and streaming. 2015 SparkBench System for benchmarking and simulating Spark jobs. Multiple workloads organized in 4 categories. 2010 Yahoo! Cloud System Benchmark (YSCB) Evaluates performance of different “key-value” and “cloud” serving systems, which do not support the ACID properties. The YCSB++ , an extension, includes many additions such as multi-tester coordination for increased load and eventual consistency measurement. 2017 TPCx-IoT Based on YCSB, but with significant changes. Workloads of data ingestion and concurrent queries simulating workloads on typical IoT Gateway systems. Dataset with data from sensors from electric power station(s). 11 Micro-benchmarks: stress test very specific part of the Big Data technologies typically utilizing synthetic data. 8 June 2018
  • 12. Some of the benchmarks to integrate (II) Year Name Type 2015 Yahoo Streaming Benchmark (YSB) The Yahoo Streaming Benchmark is a streaming application benchmark simulating an advertisement analytics pipeline. 2013 BigBench/TPCx-BB BigBench is an end-to-end, technology agnostic, application-level benchmark that tests the analytical capabilities of a Big Data platform. It is based on a fictional product retailer business model. 2017 BigBench V2 Similar to BigBench, BigBench V2 is an end-to-end, technology agnostic, application-level benchmark that tests the analytical capabilities of a Big Data platform. 2018 ABench (Work-in- Progress) New type of multi-purpose Big Data benchmark covering many big data scenarios and implementations. Extends other benchmarks such as BigBench. 128 June 2018 Application-level benchmarks: cover a wider and more complex set of functionalities involving multiple Big Data technologies and typically utilizing real scenario data.
  • 13. DataBench Project - GA Nr 780966 13 ABench (Work-in-Progress) • Growing number of new Big Data technologies and connectors in the Big Data Stacks  Challenges for Solution Architects, Data Engineers, Data Scientist, Developers, etc. • Missing benchmarks for each technology, connector or a combination of them • Consequence  Increasing complexity in the Big Data Architecture Stacks • Our approach  ABench: Big Data Architecture Stack Benchmark 8 June 2018 13 ABench: Big Data Architecture Stack Benchmark. Todor Ivanov, Rekha Singhal: Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, ICPE 2018, Berlin, Germany, April 09-13, 2018.
  • 14. DataBench Project - GA Nr 780966 14 ABench Features • Benchmark Framework  Data generators or plugins for custom data generators  Include data generator or public data sets to simulate workload that stresses the architecture • Reuse of existing benchmarks  Case study using BigBench (on-going Streaming and Machine Learning extensions) • Open source implementation and extendable design • Easy to setup and extend • Supporting and combining all four types of benchmarks in ABench 8 June 2018 14
  • 15. DataBench Project - GA Nr 780966 15 Benchmarks Types (adapted from Andersen and Pettersen) 1. Generic Benchmarking: checks whether an implementation fulfills given business requirements and specifications (Is the defined business specification implemented accurately?). 2. Competitive Benchmarking: is a performance comparison between the best tools on the platform layer that offer similar functionality (e.g., throughput of MapReduce vs. Spark vs. Flink). 3. Functional Benchmarking is a functional comparison of the features of the tool against technologies from the same area. (e.g., Spark Streaming vs. Spark Structured Streaming vs. Flink Streaming). 4. Internal Benchmarking: comparing different implementations of a functionality (e.g., Spark Scala vs. Java vs. R vs. PySpark) 8 June 2018 15
  • 16. DataBench Project - GA Nr 780966 168 June 2018 16
  • 17. DataBench Project - GA Nr 780966 17178 June 2018
  • 18. DataBench Project - GA Nr 780966 18 DataBench Summary • A framework for big data benchmarking for all Big Data practitioners. • Will provide methodology and tools. • Added value: • An umbrella to access to multiple benchmarks • Homogenized technical metrics • Derived business KPIs • Connecting the different benchmark communities • Open initiative: Public-private partnership (PPP) projects, industrial partners (BDVA and beyond) and benchmarking initiatives are welcomed to work with us, either to use our framework or to add new benchmarks. 8 June 2018 18
  • 19. DataBench Project - GA Nr 780966 19 Contacts info@databench.eu @DataBench_eu DataBench DataBench Project 8 June 2018 19