© Cloudera, Inc. All rights reserved.
Intro to Apache Kudu (incubating)
Hadoop Storage for Fast Analytics on Fast Data
Mike Percy, Kudu committer & PPMC member
mpercy@cloudera.com | @mike_percy
Agenda
• Why Kudu? Motivations and design goals
• Use cases
• Technical design and selected internals
• Performance
• How to get started
Previous storage landscape of the Hadoop ecosystem
HDFS (GFS) excels at:
• Batch ingest only (e.g., hourly)
• Efficiently scanning large amounts
of data (analytics)
HBase (BigTable) excels at:
• Efficiently finding and writing
individual rows
• Making data mutable
Gaps exist when these properties
are needed simultaneously
Kudu design goals
• High throughput for big scans
Goal: within 2x of Parquet
• Low latency for short accesses
Goal: 1 ms read/write on SSD
• Relational data model (tables, schemas)
• Enable easy integration with SQL engines
(including Impala, Hive, and Drill)
• Still provide “NoSQL”-style scan/insert/update APIs
(for client programs written in Java, C++, Python)
Where Kudu fits in the Hadoop big data stack
Storage for fast (low latency) analytics on fast (high throughput) data
• Simplifies the architecture for building
analytic applications on changing data
• Optimized for fast analytic performance
• Natively integrated with the Hadoop
ecosystem of components
Stack diagram (layers):
• Ingest – Sqoop, Flume, Kafka
• Data integration & storage – HDFS (filesystem), HBase (NoSQL), Kudu (relational)
• Security – Sentry
• Resource management – YARN
• Unified data services – batch (Spark, Hive, Pig), stream (Spark), SQL (Impala), search (Solr), model (Spark), online (HBase)
Kudu: Scalable and fast tabular storage
• Tabular
• Store tables like a “normal” database
• Individual record-level access to 100+ billion row tables
• Scalable
• Tested up to 275 nodes (~3PB cluster)
• Designed to scale to 1000s of nodes, tens of PBs
• Fast
• Millions of read/write operations per second across cluster
• Multiple GB/second read throughput per node
Kudu: Strong schema, NoSQL + SQL (via integrations)
• Each table has a SQL-like schema
• Finite number of columns (unlike HBase/Cassandra)
• Types: BOOL, INT8, INT16, INT32, INT64, FLOAT, DOUBLE, STRING, BINARY,
TIMESTAMP
• Some subset of columns makes up a possibly-composite primary key
• Fast ALTER TABLE
• Java, Python, and C++ “NoSQL” style APIs
• Insert(), Update(), Delete(), Scan()
• Integrations with MapReduce, Apache Spark, and Apache Impala (incubating)
• Apache Drill work-in-progress
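The schema rules above can be illustrated with a small conceptual sketch. This is not the real Kudu client API — just a hypothetical helper showing the constraints the slide describes: a fixed list of typed columns, with the primary key drawn from those columns.

```python
# Conceptual sketch only (not the Kudu client API): a table schema is a
# finite list of typed columns, and the primary key must be a subset of them.

ALLOWED_TYPES = {"BOOL", "INT8", "INT16", "INT32", "INT64",
                 "FLOAT", "DOUBLE", "STRING", "BINARY", "TIMESTAMP"}

def make_schema(columns, primary_key):
    """columns: list of (name, type) pairs; primary_key: list of column names."""
    names = [n for n, _ in columns]
    for n, t in columns:
        if t not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {t} for column {n}")
    for k in primary_key:
        if k not in names:
            raise ValueError(f"primary key column {k} not in schema")
    return {"columns": dict(columns), "primary_key": list(primary_key)}

schema = make_schema(
    [("tweet_id", "INT64"), ("user", "STRING"), ("ts", "TIMESTAMP")],
    primary_key=["tweet_id"])
```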
Kudu use cases
Kudu is best for use cases requiring a simultaneous combination of
sequential and random reads and writes
• Time series
• Examples: Streaming market data, fraud detection / prevention, risk monitoring
• Workload: Insert, updates, scans, lookups
• Machine data analytics
• Example: Network threat detection
• Workload: Inserts, scans, lookups
• Online reporting
• Example: Operational data store (ODS)
• Workload: Inserts, updates, scans, lookups
Xiaomi use case
• World’s 4th-largest smartphone maker (most popular in China)
• Gathers important RPC tracing events from mobile apps and backend services
• Serves as a service monitoring & troubleshooting tool
High write throughput
• >5 Billion records/day and growing
Query latest data and quick response
• Identify and resolve issues quickly
Can search for individual records
• Easy for troubleshooting
Xiaomi big data analytics pipeline
Before Kudu
Large ETL pipeline delays
● High data visibility latency
(from 1 hour up to 1 day)
● Data format conversion woes
Ordering issues
● Log arrival (storage) not
exactly in correct order
● Must read 2 – 3 days of data
to get all of the data points
for a single day
Xiaomi big data analytics pipeline
Simplified with Kudu
Low latency ETL pipeline
● ~10s data latency
● For apps that need to avoid
direct backpressure or need
ETL for record enrichment
Direct zero-latency path
● For apps that can tolerate
backpressure and can use the
NoSQL APIs
● Apps that don’t need ETL
enrichment for storage /
retrieval
OLAP scan
Side table lookup
Result store
Xiaomi benchmark
• Six real queries from an application trace analysis tool
• Q1: SELECT COUNT(*)
• Q2: SELECT hour, COUNT(*) WHERE module = 'foo' GROUP BY hour
• Q3: SELECT hour, COUNT(DISTINCT uid) WHERE module = 'foo' AND app = 'bar'
GROUP BY hour
• Q4: analytics on RPC success rate over all data for one app
• Q5: same as Q4, but filtered by time range
• Q6: SELECT * WHERE app = … AND uid = … ORDER BY ts LIMIT 30 OFFSET 30
Xiaomi benchmark results
Query latency (seconds):

Query   Kudu   Parquet
Q1      1.4    1.3
Q2      2.0    2.8
Q3      2.3    4.0
Q4      3.1    5.7
Q5      1.3    7.5
Q6      0.9    16.7
• HDFS parquet file replication = 3
• Kudu table replication = 3
• Each query run 5 times then averaged
How it works (whirlwind tour)
Kudu tables are partitioned into “tablets” (regions)
Partitioning based on
primary key (PK)
• Native support for
range partitioning
and/or hash
partitioning (salting)
• Hash example:
PRIMARY KEY (tweet_id)
DISTRIBUTE BY HASH (tweet_id) INTO 100 BUCKETS
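Conceptually, hash partitioning maps each primary key to one of a fixed number of buckets, spreading writes across tablets and avoiding a hot spot on sequential keys. A minimal sketch (Kudu uses its own internal hash function; hashlib here is only for illustration):

```python
import hashlib

NUM_BUCKETS = 100  # as in: DISTRIBUTE BY HASH (tweet_id) INTO 100 BUCKETS

def bucket_for(tweet_id: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Deterministically map a primary key to a bucket (tablet) index."""
    digest = hashlib.md5(str(tweet_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

# Consecutive keys land in many different buckets, so an insert-heavy
# workload with increasing keys is spread across the cluster:
buckets = {bucket_for(i) for i in range(1000)}
```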
Each tablet is fault tolerant via Raft consensus
• A single Tablet Server can
host many tablets
• Metadata is stored in an ordinary tablet of its own, but only Master
Server processes host that tablet
• Raft consensus:
• Strong consistency
• Leader election on failure
• Replication factor 3 or 5 is
typical
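The replication factors above follow from Raft's majority rule: a tablet stays available as long as a majority of its replicas are up, so replication factor n tolerates floor((n − 1) / 2) failed replicas. A back-of-envelope sketch:

```python
# Raft availability arithmetic: a write commits once a majority of
# replicas acknowledge it, so availability requires a live majority.

def majority(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """Replicas that can fail while a majority remains."""
    return n - majority(n)

# Replication 3 survives 1 failed replica; replication 5 survives 2.
```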
Kudu: Cluster metadata management
• Replicated master
• Acts as a tablet directory
• Acts as a catalog (which tables exist, etc.)
• Acts as a load balancer (tracks tablet server liveness, re-replicates
under-replicated tablets)
• Caches all metadata in RAM for high performance
• Under heavy load, 99.99th-percentile response times are still in microseconds (650 µs)
• Client configured with master addresses
• Client asks master for tablet locations as needed and caches them locally
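The client-side caching described above can be sketched as follows. The master lookup here is a stand-in dict, not the real RPC, and the class name is hypothetical — the point is simply that repeated lookups for the same tablet hit the local cache rather than the master:

```python
# Sketch of a client-side tablet location cache: ask the master once per
# tablet, then serve subsequent lookups locally.

class TabletLocationCache:
    def __init__(self, master_lookup):
        self._lookup = master_lookup   # callable: tablet_id -> server address
        self._cache = {}
        self.master_rpcs = 0           # how many times we hit the master

    def locate(self, tablet_id):
        if tablet_id not in self._cache:
            self.master_rpcs += 1
            self._cache[tablet_id] = self._lookup(tablet_id)
        return self._cache[tablet_id]

# Stand-in for the master's tablet directory:
directory = {"t1": "ts1.example.com", "t2": "ts2.example.com"}
cache = TabletLocationCache(directory.__getitem__)
cache.locate("t1"); cache.locate("t1"); cache.locate("t2")
# Three lookups, but only two master RPCs.
```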
Example usage: Spark & Impala
Spark DataSource integration (work-in-progress)
sqlContext.load("org.kududb.spark",
    Map("kudu.table" -> "foo",
        "kudu.master" -> "master.example.com"))
  .registerTempTable("mytable")

val df = sqlContext.sql(
  "select col_a, col_b from mytable " +
  "where col_c = 123")
Impala integration
• CREATE TABLE … DISTRIBUTE BY HASH(col1) INTO 16 BUCKETS
AS SELECT … FROM …
• INSERT/UPDATE/DELETE
• Optimizations such as predicate pushdown and scan parallelism, with
plans for more on the way
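Predicate pushdown means the filter is evaluated at the storage layer, so only matching rows cross the wire to the query engine. A toy illustration of the effect (synthetic data, not Kudu/Impala internals):

```python
# Toy comparison: filtering at the storage layer vs. in the query engine.
rows = [{"col1": i, "val": i * 10} for i in range(1000)]

def scan_without_pushdown(rows, predicate):
    """Ship the full table, then filter in the engine."""
    transferred = list(rows)
    return [r for r in transferred if predicate(r)], len(transferred)

def scan_with_pushdown(rows, predicate):
    """Filter at storage; only matching rows are transferred."""
    transferred = [r for r in rows if predicate(r)]
    return transferred, len(transferred)

pred = lambda r: r["col1"] < 10
_, shipped_all = scan_without_pushdown(rows, pred)   # 1000 rows shipped
_, shipped_few = scan_with_pushdown(rows, pred)      # 10 rows shipped
```

Same results either way; the pushdown path just moves far less data.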
Benchmarks
TPC-H (analytics benchmark)
• 75-node cluster
• 12 spinning disks each; enough RAM to fit the dataset
• TPC-H Scale Factor 100 (100 GB)
• Example query:
• SELECT n_name, sum(l_extendedprice * (1 - l_discount)) AS revenue
  FROM customer, orders, lineitem, supplier, nation, region
  WHERE c_custkey = o_custkey AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey
    AND r_name = 'ASIA' AND o_orderdate >= date '1994-01-01'
    AND o_orderdate < '1995-01-01'
  GROUP BY n_name ORDER BY revenue DESC;
• Kudu outperforms Parquet by 31% (geometric mean) for RAM-resident data
Versus other NoSQL storage
• Apache Phoenix: OLTP SQL engine built on HBase
• 10-node cluster (9 workers, 1 master)
• TPC-H LINEITEM table only (6B rows)
Time in seconds (log scale in the original chart):

Workload            Phoenix   Kudu    Parquet
Load                2152      1918    155
TPC-H Q1            219       13.2    9.3
COUNT(*)            76        1.7     1.4
COUNT(*) WHERE …    131       0.7     1.5
single-row lookup   0.04      0.15    1.37
Getting started with Kudu
Kudu: Project status
• First open source beta released by Cloudera in September ‘15
• Latest version, 0.6.0, released at the end of November
• Usable for many applications
• No data loss experienced; reasonably stable (almost no crashes reported)
• Still requires some expert assistance, and you’ll probably find some bugs
• Version 0.7 is planned for February
• Now a part of the Apache Software Foundation (ASF) Incubator
Getting started as a user
• Home page: http://getkudu.io
• User mailing list: user@kudu.incubator.apache.org
• Chat room: http://getkudu-slack.herokuapp.com/
• Technical whitepaper: http://getkudu.io/kudu.pdf
• Quickstart VM
• Easiest way to get started: Impala and Kudu in an easy-to-install VM
• Yum / APT repos
• CSD and Parcels
• For automated installation on a Cloudera Manager-managed cluster
Getting started as a developer
• Source code: http://github.com/cloudera/kudu
• All commits go upstream, nothing is held back
• Public gerrit: http://gerrit.cloudera.org
• All code reviews happen here
• Public JIRA: http://issues.cloudera.org
• Includes bugs going back to 2013. Come see our dirty laundry!
• Developer mailing list: dev@kudu.incubator.apache.org
• Apache 2.0 licensed open source and an ASF incubator project.
• Contributions are welcome and encouraged!
Questions?
http://getkudu.io
@ApacheKudu

Intro to Apache Kudu (short) - Big Data Application Meetup
