Sept 2017
Breaching the 100TB mark with
SQL over Hadoop
Analytics Performance
Highlights
 Big SQL's leadership in performance and scalability
 Partitioning options
 Memory and caching improvements
 Enhancements with the ORC file format
Simon Harris (siharris@au1.ibm.com)
IBM Research
Priya Tiruthani (ntiruth@us.ibm.com)
Big SQL Offering Manager
Acknowledgements and Disclaimers
Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which
IBM operates.
The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for
informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant.
While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without
warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this
presentation or any other materials. Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or
representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of
IBM software.
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have
achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to,
nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.
© Copyright IBM Corporation 2017. All rights reserved.
— U.S. Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with
IBM Corp.
IBM, the IBM logo, ibm.com, BigInsights, and Big SQL are trademarks or registered trademarks of International Business Machines Corporation in
the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a
trademark symbol (® or TM), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was
published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on
the Web at
“Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
TPC Benchmark, TPC-DS, and QphDS are trademarks of Transaction Processing Performance Council
Cloudera, the Cloudera logo, Cloudera Impala are trademarks of Cloudera.
Hortonworks, the Hortonworks logo and other Hortonworks trademarks are trademarks of Hortonworks Inc. in the United States and other
countries.
Other company, product, or service names may be trademarks or service marks of others.
What is IBM Big SQL?
Big SQL is the only SQL-on-Hadoop solution that understands SQL syntax from other vendors and products, including Oracle, IBM Db2 and Netezza. For this reason, Big SQL is the ultimate hybrid engine for optimizing EDW workloads on an open Hadoop platform.
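As an illustration of this dialect tolerance, a query written with Oracle/Netezza-style scalar functions can run unchanged in Big SQL. A minimal sketch, assuming a hypothetical SALES table; NVL and DECODE are the non-ANSI functions being exercised:

  -- Hypothetical SALES table. NVL (Oracle-style COALESCE) and
  -- DECODE (Oracle-style CASE) are accepted by Big SQL as-is.
  SELECT store_id,
         NVL(region, 'UNKNOWN')        AS region,
         DECODE(channel, 'W', 'Web',
                         'S', 'Store',
                              'Other') AS channel_name,
         SUM(amount)                   AS total_sales
  FROM   sales
  GROUP  BY store_id, region, channel;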
Do you have any of these challenges?
 Want to modernize your EDW without long and costly migration efforts
 Offloading historical data from Oracle, Db2 or Netezza because you are reaching capacity
 Operationalizing machine learning
 Need to query, optimize and integrate multiple data sources from one single endpoint
 Slow query performance for SQL workloads
 Require the skill set to migrate data from RDBMS to Hadoop / Hive
Here’s How Big SQL Addresses These Challenges
 Compatible with Oracle, Db2 & Netezza SQL syntax
 Modernizing EDW workloads on Hadoop has never been easier
 Application portability (e.g. Cognos, Tableau, MicroStrategy, …)
 Federates all your data behind a single SQL engine
 Query Hive, Spark and HBase data from a single endpoint
 Federate your Hadoop data using connectors to Teradata, Oracle, Db2 & more
 Query data sources that have Spark connectors
 Addresses the skill set gap involved in migrating technologies
 Delivers high performance & concurrency for BI workloads
 Unlock Hadoop data with analytics tools of choice
 Provides greater security while accessing data
 Robust role-based access control and Ranger integration
 Operationalize machine learning through integration with Spark
 Bi-directional integration with Spark exploits Spark’s connectors as well as ML
capabilities
IBM’s Big SQL Preserves Open Source Foundation
Leverages Hive metastore and storage formats.
No Lock-in. Data part of Hadoop, not Big SQL.
Architecture: applications connect to either of two SQL execution engines, Big SQL (IBM) or Hive (open source). Both engines share the same Hive storage model (CSV, Parquet, ORC, tab-delimited, others…) and the same Hive metastore (open source).
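A minimal sketch of what this shared storage model means in practice: a table created from Big SQL is an ordinary Hive-catalogued table that other engines can read (the table and column names here are hypothetical):

  -- CREATE HADOOP TABLE is Big SQL's DDL for Hive-managed tables.
  -- The Parquet files live in HDFS and stay readable by Hive, Spark,
  -- etc., so there is no lock-in to Big SQL.
  CREATE HADOOP TABLE web_clicks (
      click_ts  TIMESTAMP,
      user_id   BIGINT,
      url       VARCHAR(1024)
  )
  STORED AS PARQUET;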
Data Virtualization
Big SQL queries heterogeneous systems in a single query. It is the only SQL-on-Hadoop engine that virtualizes more than 10 different data sources: RDBMS, NoSQL, HDFS or Object Store.
Fluid Query (federation) sources include Oracle, Microsoft SQL Server, Teradata, DB2, Netezza (PDA), Informix, Hive, HBase, HDFS, Object Store (S3) and WebHDFS, all from within the Hortonworks Data Platform (HDP).
Big SQL allows query federation by virtualizing data sources and processing where the data resides.
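A sketch of how one of these federated sources is typically registered, using the Db2-style federation DDL that Big SQL inherits. The server name, TNS node, credentials and nickname are hypothetical, and the wrapper name varies by data source:

  -- Register a remote Oracle source and expose one of its tables
  -- as a local nickname.
  CREATE WRAPPER net8;
  CREATE SERVER ora_dw TYPE oracle VERSION 12 WRAPPER net8
      OPTIONS (NODE 'ORA_TNS_ENTRY');
  CREATE USER MAPPING FOR bigsql SERVER ora_dw
      OPTIONS (REMOTE_AUTHID 'dwuser', REMOTE_PASSWORD '***');
  CREATE NICKNAME ora_orders FOR ora_dw.DWUSER.ORDERS;

  -- Join the remote RDBMS data with the Hadoop-resident table from
  -- the earlier sketch; eligible work is pushed down to Oracle.
  SELECT o.order_id, c.click_ts
  FROM   ora_orders o
  JOIN   web_clicks c ON c.user_id = o.user_id;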
Data Offloading and Analytics
 Easy porting of enterprise applications: SQL syntax tolerance (ANSI SQL compliant) for Oracle SQL, DB2 SQL and Netezza SQL
 Works seamlessly with Business Intelligence tools like Cognos Analytics to gain insights
 Big SQL integrates with Information Governance Catalog by enabling easy shared imports to InfoSphere Metadata Asset Manager, which allows you to:
 Analyze assets
 Utilize assets in jobs
 Designate stewards for the assets
Big SQL is a synergetic SQL engine that offers SQL compatibility, portability and the collaborative ability to get composite analysis on data.
Data Security
Role Based Access Control enables separation of duties / audit (for example, BRANCH_A and BRANCH_B roles administered by a FINANCE security admin).
Big SQL offers row level and column level access control (RBAC) among other security settings.
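A sketch of these row and column level controls using the Db2-style syntax Big SQL inherits; the ACCOUNTS table, the roles and the predicates are hypothetical:

  -- Only members of role BRANCH_A may see branch 'A' rows.
  CREATE PERMISSION branch_a_rows ON accounts
      FOR ROWS WHERE VERIFY_ROLE_FOR_USER(SESSION_USER, 'BRANCH_A') = 1
                     AND branch = 'A'
      ENFORCED FOR ALL ACCESS
      ENABLE;

  -- Mask account numbers for everyone outside the FINANCE role.
  CREATE MASK acct_mask ON accounts FOR COLUMN acct_num
      RETURN CASE WHEN VERIFY_ROLE_FOR_USER(SESSION_USER, 'FINANCE') = 1
                  THEN acct_num
                  ELSE 'XXXX-XXXX'
             END
      ENABLE;

  ALTER TABLE accounts ACTIVATE ROW ACCESS CONTROL
                       ACTIVATE COLUMN ACCESS CONTROL;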
Themes
Integration: bringing together different components of the Hadoop ecosystem and making sure Big SQL offers enhanced capabilities and smooth interoperability.
Performance: execution of queries, simple or complex, needs to complete with low latency. Big SQL continues to focus on improving query execution for all open source file formats.
Usability & Serviceability: by simplifying the complexity of setup and troubleshooting that comes with the Hadoop ecosystem, users benefit from increased productivity in their use cases.
Enterprise, Governance & Security: enterprise needs are specific to application portability and data security. Big SQL has high application portability and makes strides to enhance it further; Big SQL also focuses on centralized governance and auditing.
Integration
Big SQL v5.0.x focuses on providing integration with the following. At a high level:
• Bi-directional Spark integration allows you to run Spark jobs from Big SQL (see the sketch below)
• Ranger integration provides centralized security
• YARN integration allows easy flex up/down of Big SQL workers
• HDP integration: Big SQL integrates with HDP v2.6.x
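Big SQL 5.0 documents a SYSHADOOP.EXEC_SPARK_JOB table function for the Spark direction of this integration. The sketch below is illustrative only; the exact argument names and values vary by release and should be checked against your installation:

  -- Invoke a Spark job from SQL and consume its result as a table.
  SELECT *
  FROM TABLE(SYSHADOOP.EXEC_SPARK_JOB(
           'class',  'DataSource',
           'format', 'json',
           'load',   'hdfs:///user/bigsql/demo.json'
       )) AS t;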
Performance
Constant performance upgrades help Big SQL lead in performance for complex queries. At a high level:
• Elastic Boost technology: launch multiple Big SQL workers on a node for SQL execution
• ORC enhancements: improved handling of the ORC file format has shown a marked increase in performance (on par with the Parquet file format)
• Performance benchmarks: a new benchmark shows great performance
Enterprise, Governance & Security
At a high level:
• Oracle compatibility for application portability
• Ranger integration provides centralized security
Big SQL Sandbox
A single-node sandbox to visualize data using Zeppelin. At a high level:
• UI-driven simple install with a few clicks
• Data loaded for immediate use
• Guided tutorial using Zeppelin
Right Tool for the Right Job
Not mutually exclusive: Hive, Big SQL & Spark SQL can co-exist and complement each other in a cluster.
Big SQL: federation, complex queries, high concurrency, enterprise ready, application portability, all open source file formats. Ideal tool for BI data analysts and production workloads.
Spark SQL: machine learning, data exploration, simpler SQL. Ideal tool for data scientists and discovery.
Hive: in-memory cache, geospatial analytics, ACID capabilities, fast ingest. Ideal tool for BI data analysts and production workloads.
To Summarize - Core Themes of Big SQL
SQL Compatibility: understands different SQL dialects; reuse skills and applications with little or no change.
Federation: connect to remote data sources with query pushdown; Spark connectors reach more data sources & ML models.
Performance: executes all 99 TPC-DS queries; scales linearly with increased concurrency.
Enterprise & Security: automatic memory management; role/column based data security.
Big SQL is the only SQL-on-Hadoop engine that:
• Is SQL compatible with these dialects: applications work as-is without any changes
• Federates to more than 10 data sources: RDBMS, NoSQL and/or Object Stores
• Integrates bi-directionally with Spark, like no other, and operationalizes ML models
• Exhibits high performance even when data scales up to 100TB with complex SQL
• Handles many concurrent users without relinquishing performance
• Secures data using SQL with roles, and integrates with Ranger for centralized management
Hadoop-DS @ 100TB
Breaching the 100TB mark: The Environment
F1 Cluster | Load | Single Stream | 4-Streams
About the Hadoop-DS Workload
Aim: To provide the fairest and most meaningful comparison of SQL
over Hadoop solutions
Hadoop-DS benchmark is based on the TPC-DS* benchmark.
Strives to follow latest (v2.3) TPC-DS specification whenever possible.
Key deviations:
 No data maintenance or data persistence phases - not possible across all vendors
 Uses a sub-set of queries that all solutions can successfully execute at that scale factor
 Queries are not cherry picked
Hadoop-DS is STILL the most complete TPC-DS-like benchmark executed so far
Includes database build, single stream run and multi-stream run
First published in Oct 2014, using Big SQL, Impala and Hive
This publication compares Big SQL v5.0 with Spark 2.1 and focuses on the 4-stream run
What is TPC-DS?
 TPC = Transaction Processing Performance Council
 Non-profit corporation (vendor independent)
 Defines various industry driven database benchmarks…. DS = Decision Support
 Models a multi-domain data warehouse environment for a hypothetical retailer
Subject areas: Retail Sales, Web Sales, Inventory, Demographics, Promotions
Multiple scale factors: 100GB, 300GB, 1TB, 3TB, 10TB, 30TB and 100TB
99 pre-defined queries in four classes: 1. Reporting, 2. Iterative OLAP, 3. Data Mining, 4. Ad Hoc
TPC-DS is now at Version 2.5 (http://www.tpc.org/tpcds/default.asp)
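For a flavor of the workload, here is a short query in the spirit of the TPC-DS reporting class. It is an illustrative sketch against the benchmark's store_sales/date_dim schema, not one of the 99 official queries:

  -- Yearly sales by store, restricted via the date dimension.
  SELECT d.d_year,
         s.ss_store_sk,
         SUM(s.ss_net_paid) AS total_net_paid
  FROM   store_sales s
  JOIN   date_dim d ON s.ss_sold_date_sk = d.d_date_sk
  WHERE  d.d_year BETWEEN 2000 AND 2002
  GROUP  BY d.d_year, s.ss_store_sk
  ORDER  BY d.d_year, total_net_paid DESC;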
SO WHAT DOES IT TAKE TO BREACH 100TB?
F1 CLUSTER: DESIGNED FOR SPARK
HARDWARE: 28 Lenovo x3650 M5 nodes; 100GbE Mellanox SN2700; 448 TB SSD PCIe Intel NVMe; 1008 Intel Broadwell cores; 42,000 GB RAM
SOFTWARE: IOP 4.2; RHEL 7.2; Spark 2.1; Big SQL 5.0
CLUSTER BANDWIDTH: 375 GB/s network; 480 GB/s disk IO (all 4 seasons of House of Cards plus Orange Is the New Black loaded into RAM in 1 sec)
DATA PREP: 7 hours to generate 100TB raw data; 39 hours to partition and load (Parquet); 39.7 TB on-HDFS size for the Parquet files
COMPRESSION: 60% space saved with Parquet
ENERGY USE PER NODE: 167 watts at stand-by; 560 watts at peak load; 475 watts load average
PEAK CPU USAGE: 96% total; 73.7% user; 22% sys; <1% iowait
HADOOP-DS @ 100TB: TUNING THE STACK
1. Low level machine tuning: file system optimization & mounts; network tuning; disable CPU scaling
2. HDFS Tuning: 5 properties, mainly for the NameNode
3. MR2 Tuning (for data load): 7 properties; map/reduce memory, Java heap size, io.sort.factor
4. Parquet <-> HDFS block alignment to reduce unnecessary io ops: block size = 128MB (see the sketch below)
5. HDFS Rebalance: balance data across nodes to reduce uneven load
6. YARN Tuning: 10 properties, mainly for container allocation
7. Spark SQL Tuning: 11 properties, plus 3 spark-submit properties
8. Big SQL Tuning (with Elastic Boost): 11 properties (5 are now used as the defaults)
Full details are in the backup slides.
Workflow: basic cpu, io & network throughput tests; generate data; load data; re-load data; rebalance; Spark SQL queries; Big SQL queries
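Item 4 above (Parquet/HDFS block alignment) can be expressed at table-definition time. A sketch, assuming the Hive-style TBLPROPERTIES clause and the common parquet.block.size property (128MB = 134217728 bytes); the table is hypothetical:

  -- Align Parquet row groups with the 128MB HDFS block size so a
  -- row group never straddles two HDFS blocks (avoids remote reads).
  CREATE HADOOP TABLE store_sales_aligned (
      ss_item_sk  BIGINT,
      ss_net_paid DECIMAL(7,2)
  )
  STORED AS PARQUET
  TBLPROPERTIES ('parquet.block.size' = '134217728');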
100TB Hadoop-DS is BIGdata
100TB Database Build
Parquet (with compression) was chosen as the storage
format for Big SQL and Spark SQL
Fact tables were partitioned (compliant) to
take advantage of new ‘partition elimination
thru join keys’ available in Big SQL v5
Both Big SQL and Spark SQL used exactly the same partitioned parquet files
Spark SQL did not require the Analyze & Stats View build stages
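The Analyze stage mentioned here is Big SQL's statistics collection for the cost-based optimizer; a minimal sketch on the benchmark's largest fact table (the choice of ALL COLUMNS is illustrative):

  -- Gather table and column statistics so the optimizer can cost
  -- join orders; Spark SQL had no equivalent build stage.
  ANALYZE TABLE store_sales COMPUTE STATISTICS FOR ALL COLUMNS;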
Load stage took ~39 hours
 STORE_SALES table is heavily skewed on the null partition (SS_SOLD_DATE_SK=NULL)
 Most of the time was spent loading this null partition (~570GB; other partitions are ~20GB). In LOAD this is done by a single reducer
 INSERT..SELECT.. using multiple logical Big SQL workers may be faster (we ran out of time before we could try it); see the sketch below
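That INSERT..SELECT alternative would look roughly as follows; a sketch with an abbreviated column list and a hypothetical staging table:

  -- Partitioned Parquet fact table, as used for the 100TB build.
  CREATE HADOOP TABLE store_sales (
      ss_item_sk     BIGINT,
      ss_customer_sk BIGINT,
      ss_net_paid    DECIMAL(7,2)
      -- ...remaining TPC-DS columns elided...
  )
  PARTITIONED BY (ss_sold_date_sk INT)
  STORED AS PARQUET;

  -- Repopulate from a staging copy. With several logical workers per
  -- node, many writers can share the skewed NULL partition, unlike
  -- the single reducer that LOAD dedicates to it.
  INSERT INTO store_sales
  SELECT ss_item_sk, ss_customer_sk, ss_net_paid, ss_sold_date_sk
  FROM   store_sales_stage;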
Query compliance through the scale factors
 Spark SQL has made impressive strides since v1.6 to
run all 99 TPC-DS compliant queries out of the box
 But this is only at the lower scale factors
 At 100TB, 16 of the 99 queries fail with runtime errors or timeout (> 10 hours)
 Big SQL has been successfully executing all 99 queries since Oct 2014
 IBM is the only vendor that has proven SQL compatibility at scale factors up to 100TB
 With compliant DDL and query syntax
For an apples-to-apples comparison, the 83 queries which Spark SQL could successfully execute were used.
Performance: Single Stream Run
 Single stream run represents a power run
 Interesting engineering exercise, but not representative of real life usage.
Big SQL is 3.8x faster than Spark 2.1 for single stream run
Total elapsed time for the Hadoop-DS workload @ 100TB, single stream (lower is better):
  Big SQL v5.0, 99 queries:   27,859 secs
  Big SQL v5.0, 83 queries:   18,145 secs
  Spark SQL v2.1, 83 queries: 68,735 secs
Performance: 4 concurrent streams
 Multi-stream query execution most closely represents real-life usage.
 Analysis focuses on the 4-stream runs
Big SQL is 3.2x faster than Spark 2.1 for 4 concurrent streams
Total elapsed time for the Hadoop-DS workload @ 100TB, 4 concurrent streams (lower is better):
  Big SQL v5.0, 99 queries x 4 streams = 396 queries:   81,329 secs
  Big SQL v5.0, 83 queries x 4 streams = 332 queries:   49,217 secs
  Spark SQL v2.1, 83 queries x 4 streams = 332 queries: 155,515 secs
CPU Profile for Big SQL vs. Spark SQL
Hadoop-DS @ 100TB, 4 Concurrent Streams
 Average CPU consumption across the 4-stream run is 76.4% for Big SQL compared to 88.2% for Spark SQL
 Spark SQL uses almost 3x more system CPU; these are wasted CPU cycles
 Very little io wait time for either engine (SSDs are fast)
 Since the Spark SQL run is 3.2x longer than Big SQL's, Spark SQL actually consumes more than 3x the CPU resources to complete the same workload
 Big SQL: some nodes have higher CPU consumption than others, showing imbalance in the distribution of work amongst the nodes
 Spark SQL: even distribution of CPU across nodes, indicating work is more evenly distributed
 Big SQL is much more efficient in how it uses the available CPU resources
Big SQL vs Spark SQL Memory Consumption
Hadoop-DS @ 100TB, 4 Concurrent Streams
 Big SQL is only ‘actively’ using approx. one third of the available memory
 Indicating more memory could be assigned to bufferpools, sort space, etc.
 So Big SQL could be even faster and/or support greater concurrency!
 Spark SQL does a better job of utilizing the available memory, but consequently has less room for improvement via tuning
[Chart: active / inactive / free memory for Big SQL and Spark SQL]
I/O Activity: 4-streams
 Spark SQL performs more I/O than Big SQL
 Since the Spark SQL run lasts 3.2x longer than Big SQL's, Spark SQL actually reads ~12x more data than Big SQL and writes ~30x more data, indicating greater efficiency within the mature Big SQL optimizer & execution engine
 Spark SQL needs to do more I/O to complete the workload, but when high I/O throughput is required, Big SQL can drive the SSDs harder than Spark SQL
HADOOP-DS @ 100TB: SUMMARY
PERFORMANCE: Big SQL 3.2x faster (4 concurrent query streams)
WORKING QUERIES: Big SQL 99, Spark SQL 83
CPU (vs Spark): Big SQL uses 3.7x less CPU; average CPU 76.4%
I/O (vs Spark): Big SQL reads 12x less data and writes 30x less data; max I/O throughput per node: read 4.4 GB/sec, write 2.8 GB/sec
COMPRESSION: 60% space saved with Parquet
Lessons Learned: General!
 Building a brand-new cluster from the ground up is tough!
 Full stack tuning required in order to get the cluster to a state capable of
handling 100TB
 Pay close attention to how the data is loaded, and think carefully about the partitioning scheme
 Be cognizant of data skew - especially on your partitioning scheme
Concurrency is much more difficult than single stream
Complex queries pose a significant problem for most SQL over Hadoop
solutions at scale
 Near-best performance is often achieved in the first 5-8 runs; the absolute best may take much longer (the 80:20 rule).
Lessons Learned: Spark SQL!
 Spark SQL has come a long way, very quickly. BUT…
 Success at lower scale factors does not guarantee success at higher scale
factors
 Significant effort required to tune failing queries
Spark SQL still relies heavily (almost entirely) on manual tuning…
 To get the best out of Spark SQL, the level of parallelism (num-executors) and the memory assigned to the executors (executor-memory) need to be tuned for each query, and the optimal values vary depending on how many other Spark queries are running in the cluster at that particular time
 Very difficult, if not impossible, to manage this in a production environment
Lessons Learned: Big SQL!
 4 query failures in early runs using product defaults:
 Quickly eliminated via product tuning
 Big SQL defaults changed as a result
 Focused on hardening “Elastic Boost” capability to gain maximum throughput
 Extensive development work in the Scheduler to evenly distribute work amongst the logical
workers
 Spare capacity (memory, cpu) could be better utilized
 Could have done better!
 Big SQL has unique tuning features to help with stubborn queries
 Only a limited set of these are allowed by the Hadoop-DS rules, but could be deployed in
production clusters
ORC performance evaluation
Big SQL v5.0.1 vs Hive v2.1 (LLAP on Tez), Hadoop-DS @ 10TB: Load, Single Stream, 6-Streams
HARDWARE: 17 Lenovo x3650 M4 nodes; 640 logical cores; 2,048 GB RAM; 288 TB disk space; 10Gb Ethernet
HADOOP-DS @ 10TB, BIG SQL V5.0.1 AND HIVE 2.1 (LLAP WITH TEZ) AT A GLANCE: 85 COMMON QUERIES
PERFORMANCE, 6-streams: Big SQL 2.3x faster
PERFORMANCE, 1-stream: Big SQL 1.8x faster
WORKLOAD: scale factor 10 TB; file format ORC (zlib); concurrency 6 streams; query subset 85 queries
RESOURCE UTILIZATION, 6-streams: Big SQL used 1.5x fewer CPU cycles
STACK: HDP 2.6.1; Big SQL 5.0.1; Hive 2.1 LLAP on Tez
INTERESTING FACTS:
 Fastest query: 5.4x faster (Big SQL: 1.5 sec, Hive: 8.1 sec)
 Slowest query (query 67): 1.7x faster (Big SQL: 6827 sec, Hive: 11830 sec)
 Big SQL faster for 80% of the queries run
WHY???
 Advanced autonomics: self-tuning memory manager; integrated workload manager
 World-class cost-based optimizer: query rewrite; advanced statistics
 Advanced partitioning
 Native row & columnar stores
 Hardened runtime
 Elastic Boost
 SQL compatibility
So, what does all this boil down to?
 Data scientists and business analysts can be 3-4 times more productive using Big SQL compared to Spark SQL.
 With Big SQL, users can focus on what they want to do, and not worry about how it is executed.
Proof points:
 Able to successfully run all 99 TPC-DS queries @ 100TB in 4 concurrent streams
 Performance leadership
 Uses fewer cluster resources
 Simpler configuration with mature self-tuning and workload management features
 Big SQL is the best SQL over Hadoop engine for complex analytical workloads
 No one else has published @ 100TB (or anywhere close)
Questions?
https://developer.ibm.com/hadoop/category/bigsql/
Thank you!
Backup slides
SQL over Hadoop use cases
But SQL can do so much more… The need is to balance the “best tool for the job” paradigm with maintainability and support costs.
 Adhoc data preparation for analytics
 Federation
 Transactional with fast lookups; fewer users
 Adhoc queries and discovery
 ELT and simple, large scale queries
 Complex SQL; many users; deep analytics
 Operational Data Store
Candidate engines per use case include combinations of Big SQL, Hive, Spark SQL, HBase and Phoenix.
Big SQL v5 + YARN Integration
Dynamic Allocation / Release of Resources
 The Big SQL head node uses a Slider client to request resources from the YARN Resource Manager & Scheduler; the Big SQL Slider package implements the Slider Client APIs
 A Big SQL Application Master (AM) manages Big SQL workers, which run in YARN containers on the Node Managers, alongside HDFS
 Stopped workers release memory to YARN for other jobs
Big SQL v5 Elastic Boost – Multiple Workers per Host
More Granular Elasticity
 Same Slider/YARN architecture as above, but each host now runs multiple Big SQL workers, each in its own container
 Starting or stopping an individual worker therefore shifts a much smaller share of cluster resources, giving more granular elasticity
Cluster Details (F1 Spark cluster)
Designed for Spark
 Totals across all Cluster Data nodes
 1,080 cores, 2,160 threads
 45TB memory
 100TB database storage replicated 3X plus temp, 480TB raw, 240 NVMe drives
 Hardware Details
 100TB SQL Database requires a 28-node cluster
• 2 management nodes (Lenovo x3650 M5), co-located with data nodes
• 28 data nodes (Lenovo x3650 M5)
• 2 racks, 20x 2U servers per rack (42U racks)
• 1 switch, 100GbE, 32 ports, 1U, (Mellanox SN2700)
 Each data node
 CPU: 2x E5-2697 v4 @ 2.3GHz (Broadwell) (18c) Passmark: 23,054 / 46,108
 Memory: 1.5TB per server 24x 64GB DDR4 2400MHz
 Flash Storage: 8x 2TB SSD PCIe NVMe (Intel DC P3700), 16TB per server
 Network: 100GbE adapter (Mellanox ConnectX-5 EN)
 IO Bandwidth per server: 16GB/s, Network bandwidth 12.5GB/s
 IO Bandwidth per cluster: 480GB/s, Network bandwidth 375GB/s
Query compliance through the scale factors (cont)
Almost half (7) of the 16 Spark SQL 2.1 queries which fail at 100TB can be classified as complex in nature
 No surprise, since Spark is a relatively immature technology
 In line with findings from the original Hadoop-DS work in 2014
Big SQL's RDBMS heritage is the key to providing Enterprise grade SQL for complex analytical workloads
Any query which does not complete, or which requires modification or tuning, impacts business productivity and wastes valuable human & machine resources
Hadoop Tuning

HDFS Setting | Default | 100TB
NameNode Java heap | 4G | 20G
NameNode new generation size | 512 | 2.5G
NameNode maximum new generation size | 512 | 2.5G
Hadoop maximum Java heap size | 4G | 10G
DataNode max data transfer threads (helps HDFS data rebalance) | 4096 | 16384

MR2 Settings (applicable to load operations) | Default | 100TB
MapReduce Framework map memory | 2G | 35G
MapReduce Framework reduce memory | 4G | 69G
MapReduce Sort Allocation Memory (helps with HDFS rebalancing) | 1G | 2G
MR Map Java Heap Size (MB) | 1638 | 28262
MR Reduce Java Heap Size (MB) | 7840 | 56524
mapreduce.jobhistory.intermediate-done-dir | /var | /data15/var
mapreduce.task.io.sort.factor | 100 | 1000

YARN Setting | Default | 100TB
Percentage of physical CPU allocated for all containers | 80% | 90%
Number of virtual cores | 57 (80%) | 72
Container - minimum container size | 44G | 20G
ResourceManager Java heap size | 1G | 8G
NodeManager Java heap size | 1G | 8G
AppTimelineServer Java heap size | 1G | 8G
YARN Java heap size | 1G | 2G
Advanced fault tolerance: yarn.resourcemanager.connect.retry-interval.ms | 30000 | 250
Advanced yarn-site: hadoop.registry.rm.enabled | False | True
Advanced yarn-site: yarn.client.nodemanager-connect.retry-interval-ms | 10000 | 250
SPARK Tuning

Spark Setting | Default | 10TB | 100TB
spark.rpc.askTimeout (s) | 120 | 1200 | 36000
spark.kryoserializer.buffer.max (mb) | 64 | 768 | 768
spark.yarn.executor.memoryOverhead (mb) | 384 | 1384 | 8192
spark.driver.maxResultSize | 1G | 8G | 40G
spark.local.dir | /tmp | /data[1-10]/tmp | /data[1-10]/tmp
spark.network.timeout | 120 | 1200 | 36000
spark.sql.broadcastTimeout | 120 | 1600 | 36000
spark.buffer.pageSize | computed | computed | 64m
spark.shuffle.file.buffer | computed | computed | 512k
spark.memory.fraction | 0.6 | 0.8 | 0.8
spark.scheduler.listenerbus.eventqueue.size | 10K | 120K | 600K
Big SQL Tuning

Big SQL Setting | Default | 100TB
Big SQL Workers per Node | 1 | 12
INSTANCE_MEMORY | 25% | 97%
DB2_CPU_BINDING | 25% | MACHINE_SHARE=94
DB2_EXT_TABLE_SETTINGS | DFSIO_MEM_RESERVE:20 | DFSIO_MEM_RESERVE:0
DFT_DEGREE | 8 | 4
SORTHEAP | Computed | 4.4 G
SHEAPTHRES_SHR | Computed | 70 G
BufferPool Size | Computed | 15 G
scheduler.cache.splits | false | true
scheduler.assignment.algorithm | GREEDY | MLN_RANDOMIZED
scheduler.dataLocationCount | Computed | max:28
scheduler.maxWorkerThreads | Computed | 8192

Five of these values are now used as the product defaults in v5.0.
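Big SQL inherits Db2's configuration machinery, so database-level values like SORTHEAP and SHEAPTHRES_SHR can be applied from SQL through the ADMIN_CMD procedure. A sketch; the database name BIGSQL and the page arithmetic are assumptions to verify against your installation:

  -- SORTHEAP / SHEAPTHRES_SHR are set in 4KB pages:
  -- 4.4GB ~= 1,153,000 pages; 70GB ~= 18,350,000 pages.
  CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG FOR BIGSQL USING SORTHEAP 1153000');
  CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG FOR BIGSQL USING SHEAPTHRES_SHR 18350000');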
What About Other SQL Hadoop TPC-DS Benchmarks?

Publisher | Date | Product | TPC-DS Queries | Data Vol
Cloudera | Sept 2016 | Impala 2.6 on AWS (claims 42% more performant than AWS Redshift) | 70 query subset | 3TB
Cloudera | August 2016 | Impala 2.6 (claims 22% faster for TPC-DS than previous version) | 17 queries referenced | Not specified
Cloudera | April 2016 | Impala 2.5 (claims 4.3x faster for TPC-DS than previous version) | 24 query subset | 15TB *2
Hortonworks | July 2016 | Hive 2.1 with LLAP (claims 25x faster for TPC-DS than Hive 1.2) | 15 query subset | 1TB
Radiant Advisors *1 | June 2016 | Impala 2.5 on CDH | 62 successful, 37 fail | 100GB / 1TB
Radiant Advisors *1 | June 2016 | Presto .141t on Teradata Hadoop Appliance (HDP, CDH) | 78 successful, 21 fail | 100GB / 1TB
Radiant Advisors *1 | June 2016 | Hive 1.2.1, Tez 0.7.0 on HDP | 63 successful, 35 fail | 100GB / 1TB

No other vendor has demonstrated the ability to execute all 99 TPC-DS queries - even at lower scale factors.


Editor's Notes

  • #2 *TPC Benchmark and TPC-DS are trademarks of the Transaction Processing Performance Council (TPC)
  • #5 Some of these challenges might sound familiar:
    – Modernizing your existing EDW without long and costly migration efforts
    – Querying across multiple types of data management systems; lacking the ability to query across traditional data warehouses and cloud-based data systems
    – Lacking the skill set to migrate data from an RDBMS to Hadoop/Hive
    – Optimizing and integrating external data sources with existing data sources
    – Offloading and porting workloads from Oracle, Db2 and Netezza; it is difficult to offload workloads from EDW platforms
    – Performance: the system slows down once too many users access it, and interactive query performance is unacceptable
    – Operational efficiency around older data warehouse environments: tools are needed for ease of use and for automation to manage workloads and schedule jobs
  • #6 Four things to remember:
    – Compatible with Oracle, Db2 and Netezza: modernizes EDW workloads on Hadoop with application portability
    – Federates data behind a single SQL engine, using connectors to Teradata, Oracle, Db2 and Netezza
    – Addresses the skills gap needed to migrate technologies: people hate migrations and rewriting code, which interrupts business processes and diverts resources, and there is a skills gap for doing this work; customers want an engine that is 100% compatible
    – With Big SQL, you can go into accounts running Netezza, Oracle or Db2: they can move part of their RDBMS data warehouse to Hadoop without having to change any code
    Ask your customer: do you want to optimize your Oracle, Netezza or Db2 workloads by moving them to Hive? Sick of Oracle or Netezza? We have a solution to help you move off them. Finally, Big SQL delivers high performance for complex SQL workloads.
  • #7 Big SQL is just an alternate execution engine that uses the same Hive storage model and integrates with the Hive metastore. In fact, Big SQL won’t work without the Hive metastore. Instead of MapReduce, Big SQL uses a native C/C++ MPP engine. Applications can choose to connect to Hive, or Big SQL – they both co-exist on the Hadoop platform. In the following slides, we’ll cover the benefits of using Big SQL’s execution engine over MapReduce.
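    To make the shared-metastore point concrete, a minimal sketch follows: a table defined through Hive becomes queryable from Big SQL once the catalogs are synchronized. The schema/table names are hypothetical; SYSHADOOP.HCAT_SYNC_OBJECTS is one documented Big SQL procedure for importing Hive definitions, though the exact sync mechanism varies by release.

        -- In Hive (e.g. via beeline):
        CREATE TABLE demo.t1 (id INT, name STRING) STORED AS PARQUET;

        -- In Big SQL (e.g. via JSqsh): import the Hive definition, then query it
        CALL SYSHADOOP.HCAT_SYNC_OBJECTS('demo', 't1', 'a', 'REPLACE', 'CONTINUE');
        SELECT COUNT(*) FROM demo.t1;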
  • #8 Federation consists of the following: a federated server, a federated database, data sources, and clients (users and applications) that access both the local database and the databases at the data sources. Federation is known for its strengths:
    – Transparent: correlate data from local tables and remote data sources, as if all the data were stored locally in the federated database
    – Extensible: update data in relational data sources, as if the data were stored in the federated database
    – Autonomous: move data to and from relational data sources without interruptions
    – High function: take advantage of the data sources' processing strengths by sending requests to the data sources for processing
    – High performance: compensate for SQL limitations at the data source by processing parts of a distributed request at the federated server
    Data sources supported by Big SQL (for the latest information, visit https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495):
    Data Source | Supported Versions | Notes
    DB2 for z/OS | 8.x, 9.x, 10.x |
    DB2 for LUW | 9.7, 9.8, 10.1, 10.5 |
    Oracle | 11g, 11gR1, 11gR2, 12c |
    Teradata | 12, 13, 14, 15 | Not supported on POWER systems
    Netezza | 4.6, 5.0, 6.0, 7.2 | Not supported on POWER systems
    Informix | 11.5 |
    Microsoft SQL Server | 2012, 2014 |
    ODBC | 3.0 or later |
    Big SQL now comes pre-bundled with DataDirect drivers from Progress, which eliminates the need to download drivers and enables easy setup. Spark connectors enhance Big SQL's connectivity to other data sources and also operationalize machine learning models.
  • #9 Federation overview and the supported data sources are the same as in #8. Big SQL is ANSI compliant; therefore it can run Oracle SQL, Db2 SQL and Netezza SQL. Big SQL tables can be created for data residing in HDFS, Hive, HBase, Object Store and WebHDFS. Big SQL is aware of remote indexes and table statistics, and uses them to optimize federated queries. Big SQL provides:
    – a unified view of all your tables, with federated query support to external databases
    – storage optimized for the expected workload, as Hive or HBase tables, or read with Spark
    – a single security model (including row/column security across all tables)
    – the ability to join across all datasets and all types of tables using standard ANSI SQL, with Oracle, Netezza and Db2 extensions if you prefer
    – a single database connection and driver
    No other SQL engine for Hadoop even comes close.
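    As a hedged illustration of what federation looks like in practice, here is classic Db2-style federation DDL as Big SQL might accept it. The server, node, credential and table names are hypothetical, and the exact wrapper/driver choice varies by release (Big SQL 5 bundles DataDirect drivers); this is a sketch of the mechanism, not the product's prescribed setup.

        -- register the Oracle wrapper and a remote server
        CREATE WRAPPER NET8;
        CREATE SERVER ora_svr TYPE ORACLE VERSION 12 WRAPPER NET8
            OPTIONS (NODE 'ora_tns_entry');

        -- map the local user to remote credentials
        CREATE USER MAPPING FOR bigsql SERVER ora_svr
            OPTIONS (REMOTE_AUTHID 'scott', REMOTE_PASSWORD '*****');

        -- expose a remote table locally and join it with a Hadoop table
        CREATE NICKNAME ora_sales FOR ora_svr."SCOTT"."SALES";
        SELECT h.cust_id, o.total
        FROM   hdfs_sales h JOIN ora_sales o ON h.cust_id = o.cust_id;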
  • #10 For Albert, from Branch A, only rows from his branch are listed, because he has access to records with the same branch name as his. Bonnie, from Branch B, can only view records from her branch and does have access to view salary. But Cindy, a security administrator, can view records from all branches and can also view all columns. When customers take the time to review the security capabilities of other SQL engines on Hadoop, they realize that a lot of the features they are used to are missing. Big SQL supports all of the important features. Separation of duties is an important feature if you really want to operationalize the environment: most enterprise customers don't like having a database "super user" who can manage the environment and also see all data (like root access for the SQL engine). Big SQL can define database administrator roles that allow administration of the environment without permitting access to the data, for example. Big SQL also provides row-level and column-level security, using row permissions (CREATE PERMISSION) or column masks (CREATE MASK) on the tables. Note that both Big SQL ROLES (the same concept as Db2 or Oracle roles) and OS groups (local and LDAP) are supported.
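    A minimal sketch of how the Albert/Bonnie/Cindy scenario might be expressed with Db2-style row permissions and column masks, which Big SQL inherits. The table, group and role names are hypothetical; the statements themselves (CREATE PERMISSION, CREATE MASK, ACTIVATE ... ACCESS CONTROL) are the standard Db2 syntax the note refers to.

        -- each user sees only their branch; the security admin sees everything
        CREATE PERMISSION branch_rows ON payroll.staff
           FOR ROWS WHERE (branch = 'A' AND VERIFY_GROUP_FOR_USER(SESSION_USER, 'BRANCH_A') = 1)
                       OR (branch = 'B' AND VERIFY_GROUP_FOR_USER(SESSION_USER, 'BRANCH_B') = 1)
                       OR VERIFY_ROLE_FOR_USER(SESSION_USER, 'SEC_ADMIN') = 1
           ENFORCED FOR ALL ACCESS
           ENABLE;

        -- hide salary from everyone except Branch B staff and the security admin
        CREATE MASK salary_mask ON payroll.staff FOR COLUMN salary
           RETURN CASE WHEN VERIFY_GROUP_FOR_USER(SESSION_USER, 'BRANCH_B') = 1
                         OR VERIFY_ROLE_FOR_USER(SESSION_USER, 'SEC_ADMIN') = 1
                       THEN salary
                       ELSE NULL
                  END
           ENABLE;

        ALTER TABLE payroll.staff ACTIVATE ROW ACCESS CONTROL
                                  ACTIVATE COLUMN ACCESS CONTROL;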
  • #19 *TPC Benchmark and TPC-DS are trademarks of the Transaction Processing Performance Council (TPC)
  • #20 There are four broad types of queries in the set of 99: reporting queries, ad-hoc queries, iterative OLAP queries, and data mining queries. A minimum of 4 concurrent users running all 99 queries is required to publish a valid result for an official TPC-DS benchmark. The results presented here are not an official TPC-DS benchmark result, but running the TPC-DS queries has become the de-facto benchmark for SQL-over-Hadoop engines.
  • #21 Data prep: generate 100TB of raw text (CSV) with the tool provided in the TPC-DS toolkit, then load the text into Parquet tables (compression on, with Snappy). A sketch of the load step follows.
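    A hedged sketch of the text-to-Parquet conversion in Big SQL DDL, assuming the flat files produced by the toolkit's dsdgen have already been copied into HDFS. The column list is abbreviated (the real store_sales table has many more columns), the HDFS path is hypothetical, and the compression property name is an assumption; Snappy is also commonly the default Parquet codec.

        -- external table over the pipe-delimited text produced by dsdgen
        CREATE EXTERNAL HADOOP TABLE store_sales_txt (
            ss_sold_date_sk  INT,
            ss_item_sk       INT,
            ss_ticket_number INT,
            ss_quantity      INT
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
        LOCATION '/benchmark/raw/store_sales';

        -- Parquet target table, populated with Snappy compression requested
        SET HADOOP PROPERTY 'parquet.compression' = 'SNAPPY';
        CREATE HADOOP TABLE store_sales_parquet (
            ss_sold_date_sk  INT,
            ss_item_sk       INT,
            ss_ticket_number INT,
            ss_quantity      INT
        )
        STORED AS PARQUETFILE;

        INSERT INTO store_sales_parquet SELECT * FROM store_sales_txt;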
  • #22 Summary of cluster tuning…
  • #24 Tables are partitioned as follows: the *_SALES tables are partitioned on *_SOLD_DATE_SK, and the *_RETURNS tables are partitioned on *_RETURN_DATE_SK (see the sketch below).
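    In Big SQL (as in Hive), the partitioning column is declared in the PARTITIONED BY clause rather than in the main column list. A minimal sketch with an abbreviated column list:

        CREATE HADOOP TABLE store_sales (
            ss_item_sk       INT,
            ss_ticket_number INT,
            ss_quantity      INT
        )
        PARTITIONED BY (ss_sold_date_sk INT)   -- one HDFS subdirectory per date key
        STORED AS PARQUETFILE;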
  • #25 Spark SQL made impressive strides in version 2.0, becoming able to run all 99 TPC-DS queries out of the box (with the allowed minor query modification, MQM, rewrites) at 1GB and 1TB. However, these tests show that as the volume of data grows beyond 1TB, Spark 2.1 struggles to complete all 99 queries. Spark SQL was able to complete the following 83 queries at 100TB: 1,2,3,4,5,7,8,9,10,11,12,13,17,18,19,21,22,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,55,56,57,58,59,60,61,62,63,65,66,67,68,69,70,71,72,73,76,78,79,80,81,82,83,85,86,87,88,89,90,91,92,93,96,97,99
  • #26 For an apples-to-apples comparison, the 83 queries which Spark could successfully complete were executed in a single-stream run on both Big SQL and Spark SQL: 1,2,3,4,5,7,8,9,10,11,12,13,17,18,19,21,22,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,55,56,57,58,59,60,61,62,63,65,66,67,68,69,70,71,72,73,76,78,79,80,81,82,83,85,86,87,88,89,90,91,92,93,96,97,99
  • #27 For an apples-to-apples comparison, the same 83 queries (listed in #26) were executed in 4 concurrent streams on both Big SQL and Spark SQL.
  • #28 First, let's look at efficiency. Big SQL actually used less average CPU than Spark SQL. User CPU time is time spent on the processor running your program's code (or code in libraries); system CPU time is time spent running code in the operating system kernel on behalf of your program. Another way to look at system CPU time: it is the time the processor spent on operating system functions connected to that specific program (e.g. forking a process, process management). In general, system CPU amounts to wasted CPU cycles, so you can see that Big SQL is roughly 3x more efficient in this regard.
  • #29 Chart shows the average memory consumption across each node in the cluster whilst the 4-stream workload was running.
  • #30 Charts show the average and maximum I/O throughput per node during the 4-stream test.
  • #31 Results of the comparison summarized in a single slide.
  • #34 There is still no substitute for product maturity, and the strength of Big SQL comes from its heritage. The core of Big SQL is built around IBM’s Relational Database technology which has been running in production systems for 25 plus years. As such, Big SQL has a hardened, mature, and very efficient SQL optimizer and runtime – this is the key to its world class scalability and performance. It’s this lineage which raises Big SQL above other vendors in the SQL over Hadoop space.
  • #36 Main points — this slide summarizes the results of the Big SQL and Hive evaluation:
    – Big SQL 5.0.1 completed the Hadoop-DS 6-stream concurrency test 2.3 times faster than Hive 2.1, with both engines using 85 queries.
    – Big SQL can run all 99 queries in the Hadoop-DS/TPC-DS workload, but for this exercise we compared Big SQL and Hive using a subset of 85 queries. 14 queries were dropped from the query set because we were unable to come up with compliant, working versions of them for Hive. By "compliant", we mean that any changes required to get a query to run adhere to the "minor query modifications" allowed by the TPC-DS specification; by "working", we mean the query completed successfully.
    – Scale factor was 10TB. The data was loaded into the ORC file format with zlib compression.
    – We installed HDP 2.6.1, which included Hive 2.1, and installed Big SQL 5.0.1 on top of that. We ran Hive 2.1 with LLAP on Tez.
    – For both Big SQL and Hive, the slowest query was query 67.
    – With 6 streams, the workload consisted of 85 x 6 = 510 total queries. For 80% of those 510 queries, Big SQL was faster.
    – Big SQL used 1.5 times fewer CPU cycles to complete the workload. Big SQL consumed a larger percentage of the available CPU during the runs, but since it finished sooner, it completed the workload using fewer total CPU cycles.
    – The evaluation was done on a 17-node cluster (Lenovo x3650 M4s) with 1 management node and 16 compute nodes. The figures for cores, memory, and disk space shown here are totals across the 16 compute nodes. The network was 10 Gigabit Ethernet.
    – Both the Hive and Big SQL tables were partitioned.
  • #37 (Same speaker note as #34: Big SQL's strength comes from its RDBMS heritage and 25-plus years of production maturity.)
  • #38 These proof points are demonstrated over the course of this presentation.
  • #42 The great thing about Hadoop is that there are so many great open source engines available. Rather than trying to re-invent storage options, Big SQL exploits what is already there. HBase is another engine that can provide rapid, scalable lookups with update and delete support.
  • #43 NM = YARN NodeManager. This is a high-level architectural view of how Big SQL integrates with YARN through Slider. The Slider client is a general-purpose client for applications like Big SQL to interoperate with YARN. The client defines a set of APIs that are implemented by the application (Big SQL), so the red cube/package represents a Slider package, included with Big SQL, that implements the Slider APIs. When we start up Big SQL, the Slider client negotiates resource allocation with YARN, and the Big SQL workers are then started in containers. [click] If we want to free up some resources back to YARN (when there is not much demand for Big SQL at that moment or time of day), we use Slider to stop a subset (or all) of the Big SQL workers. The memory formerly allocated to those workers becomes available to other workloads in Hadoop.
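    As a rough illustration of the mechanism only: the generic Apache Slider CLI can change a running application's component count, which is the kind of operation described above. The application name and component name below are hypothetical, not Big SQL's actual registered names.

        # shrink: stop a subset of workers by flexing the component count down
        slider flex bigsql --component BIGSQL_WORKER 8
        # grow: scale back up when demand returns
        slider flex bigsql --component BIGSQL_WORKER 16
        # or stop the whole application, freeing all containers back to YARN
        slider stop bigsql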
  • #44 Instead of running one (potentially very large memory) worker per host, elastic boost enables running multiple smaller Big SQL workers per host. Workers can be started and stopped fully independently of each other. Therefore, while the same maximum amount of memory/CPU is consumed when all workers are on, we get much finer-grained resource usage when scaling capacity up and down with more workers.
  • #45 The cluster was designed for Spark SQL; hence it has lots of memory, fast SSDs and a high-bandwidth network.
  • #46 Almost half of the failed Spark SQL queries can be classified as complex (those which do deep analytics), indicating that Spark SQL struggles most with complex queries at larger scale factors.
  • #48 Notes on the Spark settings:
    – At 100TB, some queries returned huge amounts of data (millions of rows, with no fetch-only clause), hence result sets of up to 40GB and the large spark.driver.maxResultSize.
    – spark.local.dir: spread the local dirs onto the data disks, comma-delimited.
    – Lots of tasks across lots of stages create a huge event queue, so its capacity (spark.scheduler.listenerbus.eventqueue.size) must be increased.
    – spark.memory.fraction 0.8: the fraction of (heap space - 300MB) used for execution and storage. The lower this is, the more frequently spills and cached-data eviction occur; raising it from 0.6 to 0.8 helps reduce spills.
  • #49 Chart shows the settings tuned for Big SQL. The default values for the properties highlighted in green will be changed in v4.3, meaning these 5 values will no longer require manual tuning. INSTANCE_MEMORY and DB2_CPU_BINDING are configured to give Big SQL 97% of the memory on each node (the other 3% being used by the OS and other Hadoop components, such as the HDFS NameNode) and 94% of the CPUs on each node; 94% was chosen because it leaves 2 cores reserved for the OS and the HDFS NameNode. DFT_DEGREE was dialled down because parallelism is achieved by defining multiple Big SQL workers per node, so the need for SMP parallelism is reduced. The actual sort-space settings are in 4K pages (SORTHEAP 1146880; SHEAPTHRES_SHR 18350080), and the buffer pool is in 32K pages (BP 491520). The MLN_RANDOMIZED scheduler allocation algorithm was specifically designed for environments with multiple Big SQL workers per node.
  • #50 Table shows TPC-DS based benchmarks published by SQL-over-Hadoop vendors during 2016:
    https://blog.cloudera.com/blog/2016/09/apache-impala-incubating-vs-amazon-redshift-s3-integration-elasticity-agility-and-cost-performance-benefits-on-aws/
    http://blog.cloudera.com/blog/2016/08/bi-and-sql-analytics-with-apache-impala-incubating-in-cdh-5-8-3x-faster-on-secure-clusters/
    https://blog.cloudera.com/blog/2016/04/apache-impala-incubating-in-cdh-5-7-4x-faster-for-bi-workloads-on-apache-hadoop/
    http://hortonworks.com/blog/announcing-apache-hive-2-1-25x-faster-queries-much/
    http://radiantadvisors.com/whitepapers/sqlonhadoopbenchmark2016/
    https://issuu.com/radiantadvisors/docs/radiantadvisors_sql-on-hadoop_bench
    *1 Radiant Advisors benchmark sponsored by Teradata.
    *2 Only one of the 24 queries tested references a fact table other than store_sales, so can this really be claimed as a 15TB benchmark when the vast majority of queries reference only about half of the data set?
    HDB/HAWQ are not in the chart because they published no new benchmarks in 2016. Vertica claimed to run 98 of the queries in a presentation last August at the HPE Big Data conference, but data volume, release levels, and configuration were not specified, so it is not included in the table.