Introduction to Kudu - StampedeCon 2016

© Cloudera, Inc. All rights reserved.
Michael Crutcher
Director, Product Management - Storage
Introduction to Kudu

Where Kudu fits in the Hadoop big data stack
Storage for fast (low latency) analytics on fast (high throughput) data
• Simplifies the architecture for building analytic applications on changing data
• Optimized for fast analytic performance
• Natively integrated with the Hadoop ecosystem of components
[Diagram: Hadoop stack]
• Data integration & storage: HDFS (filesystem), Kudu (relational), HBase (NoSQL)
• Ingest: Sqoop, Flume, Kafka
• Security: Sentry
• Resource management: YARN
• Unified data services: batch (Spark, Hive, Pig), stream (Spark), SQL (Impala), search (Solr), model (Spark), online (HBase)
• Workloads: data engineering, data discovery & analytics, data apps

Motivation and Design Goals

Previous storage landscape of the Hadoop ecosystem
HDFS (GFS) excels at:
• Batch ingest only (e.g. hourly)
• Efficiently scanning large amounts of data (analytics)
HBase (BigTable) excels at:
• Efficiently finding and writing individual rows
• Making data mutable
Gaps exist when these properties are needed simultaneously

Changing hardware landscape
• Spinning disk -> solid state storage
• NAND flash: up to 450k read / 250k write IOPS, about 2 GB/sec read and
1.5 GB/sec write throughput, at a price of less than $3/GB and dropping
• 3D XPoint memory (1000x faster than NAND, cheaper than RAM)
• RAM is cheaper and more abundant:
• 64->128->256GB over last few years
• Takeaway: The next bottleneck is CPU, and current storage systems weren’t
designed with CPU efficiency in mind.

Kudu design goals
• High throughput for big scans
Goal: Within 2x of Parquet
• Low latency for short accesses
Goal: 1ms read/write on SSD
• Database-like semantics
(initially single-row ACID)
• Relational data model
• SQL queries are easy
• “NoSQL” style scan/insert/update (Java/C++ client)

Kudu: Scalable and fast tabular storage
• Scalable
• Tested up to 275 nodes (~3PB cluster)
• Designed to scale to 1000s of nodes, tens of PBs
• Fast
• Millions of read/write operations per second across cluster
• Multiple GB/second read throughput per node
• Individual record-level access to 100+ billion row tables (Java/C++/Python APIs)
• Consistent via a Paxos-like quorum model
• Open source, now a top level Apache project
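
The "Paxos-like quorum model" bullet can be illustrated with a minimal sketch. It assumes a simple majority rule over a fixed replica set, which is the usual commit condition in consensus protocols of this family. The function names and the replica counts are illustrative only and are not taken from Kudu's code.

```python
def majority(replicas: int) -> int:
    """Smallest number of acknowledgements that forms a majority."""
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int) -> bool:
    """A write counts as committed once a majority of replicas acknowledge it."""
    return acks >= majority(replicas)


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be unavailable while a majority is still reachable."""
    return replicas - majority(replicas)
```

Under this rule a 3-replica group keeps accepting writes with one replica down, and a 5-replica group with two down.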

Kudu usage
• Table has a SQL-like schema
• Finite number of columns (unlike HBase/Cassandra)
• Types: BOOL, INT8, INT16, INT32, INT64, FLOAT, DOUBLE, STRING, BINARY,
TIMESTAMP
• Some subset of columns makes up a possibly-composite primary key
• Fast ALTER TABLE
• Java and C++ “NoSQL” style APIs
• Insert(), Update(), Delete(), Scan()
• Integrations with MapReduce, Spark, and Impala
• Apache Drill work-in-progress, many more planned
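
The table semantics listed above (a fixed, typed schema, a possibly-composite primary key, and insert/update/delete/scan operations) can be sketched with a toy in-memory model. This is not the Kudu client API: the class `ToyTable`, its methods, and the example column names are hypothetical and only mimic the behaviour the slide describes.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, Any]


class ToyTable:
    """Toy model: fixed typed columns, rows addressed by a primary key."""

    def __init__(self, columns: Dict[str, type], primary_key: List[str]):
        missing = [c for c in primary_key if c not in columns]
        if missing:
            raise ValueError(f"primary key columns not in schema: {missing}")
        self.columns = columns
        self.primary_key = primary_key
        self._rows: Dict[Tuple[Any, ...], Row] = {}

    def _key(self, row: Row) -> Tuple[Any, ...]:
        # The key is the tuple of primary key column values, in schema order.
        return tuple(row[c] for c in self.primary_key)

    def _check(self, row: Row) -> None:
        # Reject columns outside the finite schema and values of the wrong type.
        for name, value in row.items():
            if name not in self.columns:
                raise KeyError(f"unknown column: {name}")
            if not isinstance(value, self.columns[name]):
                raise TypeError(f"column {name} expects {self.columns[name].__name__}")

    def insert(self, row: Row) -> None:
        self._check(row)
        key = self._key(row)
        if key in self._rows:
            raise KeyError(f"duplicate primary key: {key}")
        self._rows[key] = dict(row)

    def update(self, row: Row) -> None:
        self._check(row)
        key = self._key(row)
        if key not in self._rows:
            raise KeyError(f"row not found: {key}")
        self._rows[key].update(row)

    def delete(self, key_values: Row) -> None:
        # Raises KeyError if no row has this primary key.
        del self._rows[self._key(key_values)]

    def scan(self, predicate: Optional[Callable[[Row], bool]] = None) -> Iterator[Row]:
        # Rows come back in primary key order, optionally filtered.
        for key in sorted(self._rows):
            row = self._rows[key]
            if predicate is None or predicate(row):
                yield dict(row)
```

A table such as `ToyTable({"host": str, "ts": int, "value": float}, primary_key=["host", "ts"])` would then accept `insert`, `update`, and `delete` calls keyed on `(host, ts)` and return rows from `scan()` in key order.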

What Kudu is *NOT*
• Not a SQL interface
• Just the storage layer
• “BYO SQL”
• Not a file system
• Data must have tabular structure
• Not an application that runs on HDFS
• An alternative, native Hadoop storage engine
• Not a replacement for HDFS or HBase
• Select the right storage for the right use case
• Cloudera will continue to support and invest in all three

Use Cases

Industry Examples
Financial Services
• Stream market data
• Real time fraud detection & prevention
• Risk monitoring
Retail
• Real time offers
• Location based targeting
Public Sector
• Geospatial monitoring
• Risk and threat detection (real time)

“Traditional” real-time analytics in Hadoop
Considerations:
● How do I handle failure during this process?
● How often do I reorganize incoming streaming data into a format appropriate for reporting?
● When reporting, how do I see data that has not yet been reorganized?
● How do I ensure that important jobs aren’t interrupted by maintenance?
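[Diagram: storage in HDFS]
• Incoming data (messaging system) is written to HBase as the new partition
• Decision point: have we accumulated enough data?
• If so, reorganize the HBase file into Parquet:
  • Wait for running operations to complete
  • Define new Impala partition referencing the newly written Parquet file
• Parquet files hold the most recent partition and the historical data
• Reporting requests read across the new partition, the most recent partition, and the historical data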

Lambda Architecture
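[Diagram: Lambda architecture]
• New data feeds both the batch layer and the speed layer
• Batch layer (Hadoop): data lake (HDFS), batch recompute, precompute views
• Speed layer (Storm/Spark): stream or micro batch, “real-time” increment, increment views
• Serving layer: merges the precomputed and incremental views for the data application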

Real-time analytics in Hadoop with Kudu
Improvements:
● One system to operate
● No cron jobs or background processes
● Handle late arrivals or data corrections with ease
● New data available immediately for analytics or operations
[Diagram: storage in Kudu]
• Incoming data (messaging system) is written directly to Kudu
• Kudu holds both historical and real-time data
• Reporting requests read from Kudu
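
The claim that late arrivals and corrections become easy can be sketched as follows, assuming rows are addressed by a primary key and can be written or overwritten in place. A plain dictionary stands in for the table; the key layout `(day, event_id)` and the function names are hypothetical, not part of any Kudu API.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (day, event_id): hypothetical composite primary key


def apply_event(store: Dict[Key, float], day: str, event_id: int, amount: float) -> None:
    """Insert a new row, or overwrite the row that already has this key."""
    store[(day, event_id)] = amount


def total_for_day(store: Dict[Key, float], day: str) -> float:
    """Report over whatever is currently stored for the given day."""
    return sum(amount for (d, _), amount in store.items() if d == day)
```

A late record for an earlier day and a correction to an existing record are both single keyed writes, and the next call to `total_for_day` reflects them without a separate reorganization step.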

Xiaomi use case
• World’s 4th largest smart-phone maker (most popular in China)
• Gather important RPC tracing events from mobile app and backend service.
• Service monitoring & troubleshooting tool.
High write throughput
• >5 Billion records/day and growing
Query latest data and quick response
• Identify and resolve issues quickly
Can search for individual records
• Easy for troubleshooting

Xiaomi big data analytics pipeline
Before Kudu
Large ETL pipeline delays
● High data visibility latency (from 1 hour up to 1 day)
● Data format conversion woes
Ordering issues
● Log arrival (storage) not exactly in correct order
● Must read 2 – 3 days of data to get all of the data points for a single day
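
The ordering issue can be made concrete with a small sketch. It assumes logs are stored in partitions by arrival day, and that a record arrives on its event day or up to a fixed number of days late; both the function name and the delay bound are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def arrival_partitions_to_scan(event_day: date, max_delay_days: int) -> List[date]:
    """Arrival-day partitions that may hold records for event_day."""
    return [event_day + timedelta(days=n) for n in range(max_delay_days + 1)]
```

With a maximum delay of two days, a report for a single event day has to scan three arrival-day partitions, which matches the "2 – 3 days of data" on the slide. Storage keyed by event time avoids this, because a day's records sit in one key range regardless of when they arrived.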

Xiaomi big data analytics pipeline
Simplified with Kudu
Low latency ETL pipeline
● ~10s data latency
● For apps that need to avoid direct backpressure or need ETL for record enrichment
Direct zero-latency path
● For apps that can tolerate backpressure and can use the NoSQL APIs
● Apps that don’t need ETL enrichment for storage / retrieval
[Diagram labels: OLAP scan, side table lookup, result store]

Benchmarks and Current Status

TPC-H (Analytics benchmark)
• 75 tablet servers (TS) + 1 master cluster
• 12 (spinning) disks each, enough RAM to fit dataset
• Using Kudu 0.5.0, Impala 2.2 with Kudu support, CDH 5.4
• TPC-H Scale Factor 100 (100GB)
• Example query:
  SELECT n_name, sum(l_extendedprice * (1 - l_discount)) AS revenue
  FROM customer, orders, lineitem, supplier, nation, region
  WHERE c_custkey = o_custkey AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey
    AND r_name = 'ASIA'
    AND o_orderdate >= date '1994-01-01' AND o_orderdate < date '1995-01-01'
  GROUP BY n_name
  ORDER BY revenue DESC;

• Kudu outperforms Parquet by 31% (geometric mean) for RAM-resident data
• Parquet is likely to outperform Kudu for HDD-resident data (larger IO requests)

What about Apache Phoenix?
• 10 node cluster (9 worker, 1 master)
• HBase 1.0, Phoenix 4.3
• TPC-H LINEITEM table only (6B rows)
[Chart: Time (sec), log scale from 0.01 to 10000]
Workload              Phoenix    Kudu    Parquet
Load                     2152    1918        155
TPCH Q1                   219    13.2        9.3
COUNT(*)                   76     1.7        1.4
COUNT(*) WHERE…           131     0.7        1.5
single-row lookup        0.04    0.15       1.37
Current Status
✔ Completed all components core to the architecture
✔ Java and C++ API
✔ Impala, MapReduce, and Spark integration
✔ Support for SSDs and spinning disk
✔ Open to beta customers
✔ Kudu 1.0 expected in early Fall

Getting Started
Users:
Install the Beta or try a VM:
getkudu.io
Get help:
kudu-user@googlegroups.com
Read the white paper:
getkudu.io/kudu.pdf
Developers:
Contribute:
github.com/cloudera/kudu (commits)
gerrit.cloudera.org (reviews)
issues.cloudera.org (JIRAs going back to 2013)
Join the Dev list:
kudu-dev@googlegroups.com
Contributions/participation are welcome and
encouraged!
