Current Trends and Challenges in Big Data Benchmarking

Years ago, it was common to write a for-loop and call it a benchmark. Nowadays, benchmarks are complex pieces of software and specifications. This talk discusses the idea of benchmark engineering, trends in benchmarking research, and current efforts of the SPEC Research Group and the WBDB community focusing on Big Data. The way benchmarks are used has changed: traditionally they were mostly used for generating throughput numbers, whereas today they also serve, for example, as test frameworks to evaluate different aspects of systems such as scalability or performance. Since benchmarks provide standardized workloads and meaningful metrics, they are increasingly important for research.

The benchmark community is currently focusing on new trends such as cloud computing, big data, power consumption, and large-scale, highly distributed systems. For several of these trends, traditional benchmarking approaches fail: how can we benchmark a highly distributed system with thousands of nodes and data sources? What does a typical Big Data workload look like, and how does it scale? How can we benchmark a real-world setup in a realistic way on limited resources? What does performance mean in the context of Big Data? What is the right metric?

Speaker: Kai Sachs is a member of the Lifecycle & Cloud Management group at SAP AG. He received a joint Diploma degree in business administration and computer science as well as a PhD degree from Technische Universität Darmstadt. His PhD thesis received the SPEC Distinguished Dissertation Award 2011 for outstanding contributions in the area of performance evaluation and benchmarking. His research interests include software performance engineering, capacity planning, cloud computing, and benchmarking. He is a co-founder of the ACM/SPEC International Conference on Performance Engineering (ICPE) and has served as a member of several program and organization committees and as a reviewer for many conferences and journals. Among others, he was the PC Chair of the SPEC Benchmark Workshop 2010, Program Chair of the Workshop on Hot Topics in Cloud Services 2013, and the Industrial PC Chair of ICPE 2011. Kai Sachs is currently serving on the editorial board of the CSI Transactions on ICT, as vice-chair of the SPEC Research Group, as PC Co-Chair of the ACM/SPEC ICPE 2015, and as Co-Chair of the Workshop on Big Data Benchmarking 2014.

Current Trends and Challenges in Big Data Benchmarking

  1. Current Trends and Challenges in Big Data Benchmarking. Kai Sachs, SPEC Research Group, May 2014
  2. Benchmark Use Cases & Stakeholders • Hardware & software vendors: publish results & marketing. Example: 27,500 results submitted for the SPEC CPU2006 benchmarks alone. • Developers: analysis & product quality. Example: regression performance testing. • Consumers: compare different products. Example: find the best video card for gaming. • IT architects: cloud & hardware sizing. Example: choosing a configuration. • Researchers: Example: evaluate their own implementation using a standardized workload.
  3. Standard Performance Evaluation Corporation (SPEC): OSG (Open Systems Group), HPG (High Performance Group), GWPG (Graphics and Workstation Performance Group), RG (Research Group). More than 80 member organizations & associates; founded 1988.
  4. Standard Performance Evaluation Corporation: development of industry standard benchmarks. OSG (Open Systems Group: CPU, Java, Virtualization, Power, …), HPG (High Performance Group: OpenMP, MPI, …), GWPG (Graphics and Workstation Performance Group), RG (Research Group). More than 80 member organizations & associates; founded 1988.
  5. Standard Performance Evaluation Corporation: research platform. RG (Research Group: Cloud, Intrusion Detection Systems, Big Data), next to OSG (Open Systems Group), HPG (High Performance Group) and GWPG (Graphics and Workstation Performance Group). More than 80 member organizations & associates; founded 1988.
  6. SPEC Research Group Mission Statement • Provide a platform for collaborative research efforts in the areas of computer benchmarking and quantitative system analysis. • Portal for all kinds of benchmarking-related resources. • Provide research benchmarks, tools, metrics and scenarios.
  7. Performance, in a broad sense: • Classical performance metrics. Example: response time, throughput, scalability, efficiency, and elasticity. • Non-functional system properties, under the term dependability. Example: availability, reliability, and security.
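As an aside on the classical metrics above, the following minimal Python sketch shows how throughput and response-time statistics could be derived from logged request start/end timestamps. It is purely illustrative, not part of the talk or any SPEC tool, and all names in it are made up.

```python
from statistics import mean, quantiles

def summarize(requests):
    """Classical performance metrics from (start, end) timestamps in seconds.

    `requests` is a list of (start, end) tuples -- illustrative only, not a SPEC tool.
    """
    response_times = [end - start for start, end in requests]
    duration = max(end for _, end in requests) - min(start for start, _ in requests)
    return {
        "throughput_rps": len(requests) / duration,            # completed requests per second
        "mean_response_s": mean(response_times),               # average response time
        "p95_response_s": quantiles(response_times, n=20)[18], # 95th-percentile response time
    }

# Example: three requests observed over two seconds of wall-clock time
print(summarize([(0.0, 0.2), (0.5, 0.9), (1.5, 2.0)]))
```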
  8. Towards a Big Data Standard Benchmark • Big Data Benchmarking Community (BDBC): ‘incubator’ for Big Data standard benchmark(s) for industry; >200 members on the mailing list. • Workshop on Big Data Benchmarking series: 2012 in San Jose, CA & Pune, India; 2013 in San Jose, CA & Xi'an, China; 2014 in Potsdam, Germany. Post-proceedings published in LNCS. • BDBC is joining the SPEC Research Group: an RG working group focusing on Big Data is in preparation; working group chairs: Chaitan Baru, Tilmann Rabl. Reference: WBDB 2012 Report: Setting the Direction for Big Data Benchmark Standards. C. Baru, M. Bhandarkar, R. Nambiar, M. Poess, T. Rabl. TPCTC 2012, collocated with VLDB 2012.
  9. Other Benchmark Organizations • Transaction Processing Performance Council (TPC). Focus: transaction processing and database benchmarks. Most famous benchmarks: TPC-C (OLTP benchmark), TPC-E (OLTP benchmark), TPC-H (decision support benchmark). • Embedded Microprocessor Benchmark Consortium (EEMBC). Focus: hardware and software used in embedded systems. • Business Applications Performance Corporation (BAPCo). Focus: performance benchmarks for personal computers based on popular computer applications and industry standard operating systems.
  10. General Chairs: Chaitan Baru (UC San Diego), Tilmann Rabl (U Toronto), Kai Sachs (SAP). Local Arrangements: Matthias Uflacker (Hasso Plattner Institute). Publicity Chair: Henning Schmitz (SAP Innovations Center). Publication Chair: Meikel Poess (Oracle). Keynote Speakers: Umesh Dayal, Alexandru Iosup. Program Committee: Milind Bhandarkar (Pivotal), Anja Bog (SAP Labs), Dhruba Borthakur (Facebook), Joos-Hendrik Böse (Amazon), Tobias Bürger (Payback), Tyson Condie (UCLA), Kshitij Doshi (Intel), Pedro Furtado (U Coimbra), Bhaskar Gowda (Intel), Goetz Graefe (HP), Martin Grund (Exascale), Alfons Kemper (TU München), Donald Kossmann (ETH Zürich), Tim Kraska (Brown University), Wolfgang Lehner (TU Dresden), Christof Leng (UC Berkeley), Stefan Manegold (CWI), Raghu Nambiar (Cisco), Manoj K. Nambiar (TCS), Glenn Paulley (Conestoga Col.), Scott Pearson (CLDS Industry Fellow), Andreas Polze (HPI), Alexander Reinefeld (HU Berlin), Berni Schiefer (IBM Labs Toronto), Saptak Sen (Hortonworks), Florian Stegmaier (University of Passau), Till Westmann (Oracle Labs), Jianfeng Zhan (Chinese Academy of Sciences). Platinum and gold sponsors (logos). Submission: May 30, 2014 (6pm PDT); short versions of papers (4-8 LNCS pages).
  11. Benchmark Engineering
  12. Past & Present • Past: it was common to write a for-loop and call it a benchmark. • Present: benchmarks are complex pieces of software and specifications; benchmark development has turned into a complex team effort.
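To illustrate the "for-loop benchmark" of the past, here is a caricature in Python (an illustration of the idea, not code from the slides): it times arbitrary work and prints a single number, with no defined workload, no warm-up, no repetitions and no run rules, which is exactly what modern benchmark engineering moves away from.

```python
import time

def naive_benchmark(n=1_000_000):
    """The classic 'for-loop benchmark': time arbitrary work and report one number."""
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i * i          # arbitrary work, not a representative workload
    elapsed = time.perf_counter() - start
    # No warm-up, no repetitions, no metric definition, no run rules:
    # the single number below says little about any real system.
    print(f"{n} iterations in {elapsed:.3f}s -> {n / elapsed:,.0f} ops/s")

naive_benchmark()
```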
  13. The Whetstone Benchmark (1974; 284 lines). Curnow, H.J., Wichmann, B.A.: "A Synthetic Benchmark". Computer Journal, Volume 19, Issue 1, Feb. 1976, pp. 43-49.
  14. SPEC CPU Benchmark Suite – Lines of Code. Henning, J.: "SPEC CPU suite growth: an historical perspective". SIGARCH Comput. Archit. News 35, Issue 1, March 2007.
  15. Example Components of a Standard Benchmark: workload, reporter, run rules, implementation & framework (opt.), documentation, metrics. The workload specification is the most important part. References: Performance evaluation of message-oriented middleware using the SPECjms2007 benchmark. Kai Sachs, Samuel Kounev, Jean Bacon, Alejandro Buchmann. Performance Evaluation, 2009. Performance Modeling and Benchmarking of Event-Based Systems. Kai Sachs, PhD thesis, TU Darmstadt, 2010.
  16. Workload Requirements: representativeness, comprehensiveness, focus, scalability, configurability. Reference: Resilience Benchmarking. Marco Vieira, Henrique Madeira, Kai Sachs, Samuel Kounev. In: Resilience Assessment and Evaluation, Springer, 2012.
  17. Workload Description ‘Level’. Reference: From TPC-C to Big Data Benchmarks: A Functional Workload Model. Yanpei Chen, Francois Raab, and Randy Katz. In: Workshop on Big Data Benchmarks, 2012.
  18. Current Trends & Challenges in Big Data Benchmarking
  19.-20. Current Trends & Challenges in Benchmarking • Technology: virtualization; cloud; (Big) Data: MapReduce, mixed workload (OLAP/OLTP), data/event streaming, … • Benchmarking methodology: large scale systems. • Tools: data/workload generators; power consumption; simulation frameworks; generic benchmarking frameworks.
  21. Benchmark Methodology: System Under Test. Past & present: single node, multiple nodes; isolated systems.
  22. Benchmark Methodology: System Under Test. St. Peter's Square, 2005 vs. 2013 (photo: http://instagram.com/p/W2FCksR9-e/).
  23. Benchmark Methodology: System Under Test. Challenge: large scale systems • Isolation is not guaranteed (or impossible). • High number of nodes. • Data amount is very high. • Repeatability is an issue. How can we benchmark such systems?
  24.-25. “Big Data should be Interesting Data! There are various definitions of Big Data; most center around a number of V’s like volume, velocity, variety, veracity – in short: interesting data (interesting in at least one aspect). However, when you look into research papers on Big Data, in SIGMOD, VLDB, or ICDE, the data that you see here in experimental studies is utterly boring. Performance and scalability experiments are often based on the TPC-H benchmark: completely synthetic data with a synthetic workload that has been beaten to death for the last twenty years. Data quality, data cleaning, and data integration studies are often based on bibliographic data from DBLP, usually old versions with less than a million publications, prolific authors, and curated records. I doubt that this is a real challenge for tasks like entity linkage or data cleaning. So where’s the – interesting – data in Big Data research?” Gerhard Weikum: Where’s the Data in the Big Data Wave? SIGMOD Blog, March 2013.
  26. Big Data Benchmark: Issues and Challenges – ‘Big Data World’, communities, benchmark design (single benchmark vs. benchmark collection; component vs. end-to-end scenario; specification vs. implementation; metric), system under test, workload.
  27. Abstractions of the Big Data World (from WBDB) • Enterprise warehouse + agglomeration of other data: structured enterprise data warehouse, extended to incorporate data from other non-fully structured data sources (e.g. weblogs, text, streams). • Pool of data with a sequence of processing: enterprise data processing as a pipeline from data ingestion to transformation, extraction, subsetting, machine learning, predictive analytics; data from multiple structured and non-structured sources. Source: Introduction to the 4th Workshop on Big Data Benchmarking, Chaitan Baru.
  28. BigBench: A Big Data Analytics Benchmark – Data Model. Scenario: retail domain. Data: structured (based on TPC-DS), semi-structured (click streams), unstructured (product reviews); PDGF used to generate the data. Reference: BigBench: Towards an Industry Standard Benchmark for Big Data Analytics. A. Ghazal, Minqing Hu, T. Rabl, F. Raab, M. Poess, A. Crolotte, H. Jacobsen. SIGMOD 2013.
  29. BigBench: A Big Data Analytics Benchmark – Data Generation, Unstructured Data. Extended version of the Parallel Data Generation Framework (PDGF); separate review generator. Reference: BigBench: Towards an Industry Standard Benchmark for Big Data Analytics. A. Ghazal, Minqing Hu, T. Rabl, F. Raab, M. Poess, A. Crolotte, H. Jacobsen. SIGMOD 2013.
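For intuition only, the toy Python generator below mimics the kind of semi-structured click-stream events and unstructured review snippets a Big Data benchmark needs. It is not PDGF or the BigBench review generator; every name in it is hypothetical. The fixed random seed keeps the generated data repeatable, a property real benchmark data generators also require.

```python
import random

PRODUCTS = ["p-100", "p-101", "p-102"]
PHRASES = ["works great", "stopped after a week", "good value", "would not buy again"]

def clickstream(n_events, seed=42):
    """Yield toy semi-structured click events (illustrative only, not PDGF)."""
    rng = random.Random(seed)          # fixed seed -> repeatable data
    t = 0.0
    for _ in range(n_events):
        t += rng.expovariate(5.0)      # exponential inter-arrival times
        yield {
            "ts": round(t, 3),
            "user": rng.randrange(1000),
            "product": rng.choice(PRODUCTS),
            "action": rng.choice(["view", "add_to_cart", "buy"]),
        }

def review(rng):
    """Return a toy unstructured product review built from canned phrases."""
    return {
        "product": rng.choice(PRODUCTS),
        "stars": rng.randint(1, 5),
        "text": ", ".join(rng.sample(PHRASES, 2)),
    }

for event in clickstream(3):
    print(event)
print(review(random.Random(7)))
```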
  30. Deep Analytics Pipeline. An end-to-end data processing pipeline: data from multiple sources; loose, flexible schema; data requires structuring. Application characteristics: processing pipelines; running models with data. Source: Introduction to the 4th Workshop on Big Data Benchmarking, Chaitan Baru.
  31. Example of an Application: Determine User Interest Profile by Mining Activities. Reference: Scalable distributed inference of dynamic user interests for behavioral targeting. A. Ahmed, Y. Low, M. Aly, V. Josifovski, A.J. Smola. SIGKDD 2011.
  32. Composite Benchmark for Transactions and Reporting (CBTR): an OLTP & OLAP benchmark based on a current and real enterprise order-to-cash scenario; 18 tables with 5 to 327 columns, 2,316 columns in total. Variable workload mix: OLTP sub-workload share S_T ∈ [0, 1], OLAP sub-workload share S_A = 1 - S_T; within OLTP, read-only share S_rT ∈ [0, 1] and mixed share S_mT = 1 - S_rT (S: share; T: transactional, A: analytical; r: read-only, m: mixed). References: Benchmarking Composite Transaction and Analytical Processing Systems. Anja Bog, PhD thesis, University of Potsdam, 2012. Interactive Performance Monitoring of a Composite OLTP & OLAP Workload. Anja Bog, Kai Sachs, Hasso Plattner. SIGMOD 2012 (demo). Normalization in a Mixed OLTP and OLAP Workload Scenario. Anja Bog, Kai Sachs, Alexander Zeier, Hasso Plattner. TPCTC 2011, collocated with VLDB 2011.
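The variable workload mix can be read as a simple stochastic scheduler: with probability S_T the driver issues an OLTP request (read-only with probability S_rT, mixed otherwise), and with probability S_A = 1 - S_T it issues an OLAP query. The Python sketch below renders that idea; it is an illustration under these assumptions, not CBTR's actual driver, and all names are hypothetical.

```python
import random

def workload_mix(s_t=0.7, s_rt=0.8, n=10, seed=1):
    """Yield request types according to CBTR-style shares.

    s_t  : OLTP share S_T in [0, 1]; the OLAP share is S_A = 1 - s_t
    s_rt : read-only share S_rT within OLTP; the mixed share is S_mT = 1 - s_rt
    Illustrative only -- not the actual CBTR driver.
    """
    rng = random.Random(seed)
    for _ in range(n):
        if rng.random() < s_t:          # OLTP branch, probability S_T
            yield "OLTP/read-only" if rng.random() < s_rt else "OLTP/mixed"
        else:                           # OLAP branch, probability S_A
            yield "OLAP"

print(list(workload_mix()))
```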
  33.-34. Big Data & Cloud Benchmark Related Work – Virtualization Benchmarking.
  35. Other activities • TPC-BD: TPC announced a Big Data working group (November 2013). • Graph 500: driven by the HPC community; cooperating with the SPEC CPU group; Green Graph 500 list. • SPEC OSG: Big Data as part of a cloud benchmark. • Also: CloudSuite 2.0, CH-benCHmark, BigDataBench, HiBench, LinkBench, …
  36. SPEC RG – Big Data Working Group: Potential Topics • Target group: researchers & developers. • Data categories: structured, unstructured and semi-structured; events & streams; graphs; geospatial, retail, astronomy & genomic; … • Benchmark scenario & metrics: realistic use-cases & workload mixes; Big Data classification schema. • (Research) standard benchmarks: BigBench, Deep Analytics Pipeline, … • Data generation: real-world traces & synthetic data; tooling.
  37. Conclusions
  38. Conclusions • Benchmarking is more than throughput. • Meaningful workloads are most important.
  39. Conclusions • Benchmarking is more than throughput. • Meaningful workloads are most important. • More research is needed: benchmarking of large scale systems; “Big Data World” workloads & scenarios; benchmarks for Big Data. See: We Don’t Know Enough to make a Big Data Benchmark Suite. Yanpei Chen, WBDB 2012.
  40. Thank you. Contact information: Kai Sachs, email: Kai.Sachs@sap.com. Disclaimer: SPEC, the SPEC logo, the SPEC Research Group logo and the tool and benchmark names SERT, SPECjms2007, SPECpower_ssj2008, SPECweb2009 and SPECvirt_sc2010 are registered trademarks of the Standard Performance Evaluation Corporation (SPEC). Reprint with permission.
  41. General Chairs, Program Committee, keynote speakers, sponsors and submission information for WBDB 2014, as on slide 10. Submission: May 30, 2014 (6pm PDT); short versions of papers (4-8 LNCS pages).
