A brave new world in mutable big data relational storage (Strata NYC 2017)

The ever-increasing interest in running fast analytic scans on constantly updating data is stretching the capabilities of HDFS and NoSQL storage. Users want the fast online updates and serving of real-time data that NoSQL offers, as well as the fast scans, analytics, and processing of HDFS. Additionally, users are demanding that big data storage systems integrate natively with their existing BI and analytic technology investments, which typically use SQL as the standard query language of choice. This demand has led big data back to a familiar friend: relationally structured data storage systems.

Todd Lipcon explores the advantages of relational storage and reviews new developments, including Google Cloud Spanner and Apache Kudu, which provide a scalable relational solution for users who have too much data for a legacy high-performance analytic system. Todd explains how to address use cases that fall between HDFS and NoSQL with technologies like Apache Kudu or Google Cloud Spanner and how the combination of relational data models, SQL query support, and native API-based access enables the next generation of big data applications. Along the way, he also covers suggested architectures, the performance characteristics of Kudu and Spanner, and the deployment flexibility each option provides.

Published in: Technology

  1. A brave new world in mutable big data: Relational storage
     Todd Lipcon, Software Engineer at Cloudera, Apache Kudu founder and PMC chair
     © Cloudera, Inc. All rights reserved.
  2. Introduction
  3. About me
     • Engineer at Cloudera since 2009
     • Hadoop core (HDFS, MR1)
     • HBase stability and performance
     • Started Kudu project in 2012 (bias alert!)
     • My 9th Strata NYC!
     Feel free to tweet questions @tlipcon or find me on the Kudu Slack
  4. A brief history of databases: incomplete, distilled, and semi-accurate
  5. 1960s, 1970s
  6. 1960s, 1970s: ISAM / VSAM
     “A database system where an application developer directly uses an application programming interface to search indexes in order to locate records in data files.” - Wikipedia, “ISAM”
     • Files contain records (originally fixed-length, later variable-length)
     • Files stored on disks and applications directly access them (file-system locking)
     • Later added networked access (client-server model), hierarchical records
     • Still a simple API:
        • Seek by key, Read, Write, Insert, Delete
  7. Probably the only slide at Strata with COBOL on it
     Source: http://www.mainframes360.com/2010/03/ksds-files-random-processing.html
  8. Failings of ISAM/VSAM
     A Relational Model of Data for Large Shared Data Banks (Codd, 1970)
     • Applications and physical data layout are too tightly coupled
        • e.g., a database of parts might be originally ordered by part number and later changed
        • inventory app inadvertently depends on order (unexpected breaks)
     • Hard to make general-purpose programs that run against ISAM/VSAM datasets
     • Proposed a new model: relational databases
        • All entities modeled by peer tables with relationships between them
        • Programs use declarative access (DB decides physical operations necessary)
  9. Origins of SQL (1974)
     • Originally SEQUEL (Structured English QUEry Language)
     • Renamed to SQL due to trademark issues
     • Designed to be easy to write, read, and maintain
        • “is intended for users who are more comfortable with an English-keyword format than with the terse mathematical notation of SQUARE.”
     • Solves the coupling issue:
        • Application: specify what should be returned
        • Database: figure out how to return it
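
The division of labor on this slide (the application says what to return, the database decides how) can be illustrated with Python's built-in `sqlite3` module. This is only a sketch: the `parts` table, its rows, and the index name `parts_by_no` are hypothetical and not from the talk. The same query text returns the same rows before and after an index is added; only the plan chosen by the database changes.

```python
import sqlite3

# Hypothetical parts table, echoing the "database of parts" example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no INTEGER, name TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [(30, "bolt", 120), (10, "nut", 75), (20, "washer", 300)],
)

QUERY = "SELECT name, qty FROM parts WHERE part_no = ?"


def lookup(part_no):
    """The application only states WHAT it wants."""
    return conn.execute(QUERY, (part_no,)).fetchall()


def plan(part_no):
    """Ask the database HOW it intends to run the same query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (part_no,)).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " ".join(row[-1] for row in rows)


before_rows, before_plan = lookup(20), plan(20)

# Change the physical layout; the application code above is untouched.
conn.execute("CREATE INDEX parts_by_no ON parts (part_no)")
after_rows, after_plan = lookup(20), plan(20)

print(before_plan)  # a full table scan
print(after_plan)   # an index search
```

The application code never mentions the index, which is the decoupling that ISAM/VSAM programs lacked.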
  10. Explosion of SQL Popularity
     • IBM, Oracle, Microsoft, Informix, and others joined the party
     • ANSI standard in 1986
     • Ecosystem growth:
        • Business Intelligence tools
        • Object-Relational Mappers
        • Extract-Transform-Load tools (ETL)
     • Open source SQL databases
        • mSQL, MySQL, PostgreSQL, etc.
        • LAMP stack
  11. ORCL
  12. (no text on slide)
  13. All good things must come to an end?
  14. (no text on slide)
  15. The beginnings of the NoSQL “movement”
  16. (no text on slide)
  17. First ever NoSQL meetup (2009)
  18. Jay Kreps (Confluent). Me!
  19. I wasn’t keen on the NoSQL buzzword!
  20. NoSQL search interest over time
     What happened in Jan 2012???
  21. NoSQL complaints
     • Tool compatibility? BI? ETL? ORMs?
     • Consistency
        • denormalization is tough
        • hard to program against weak semantics
     • Access path sensitivity
        • Have to tightly couple applications with physical data model
     • No ad-hoc access
        • Complex application code to perform simple aggregations
     Some of these critiques sound awfully familiar... (1970s Database People)
  22. credit @jrecursive (2010): f***
  23. Not-Only SQL
     People wanted their SQL back, and NoSQL developers gave it!
     • Cassandra - CQL (late 2011)
     • HBase - Phoenix (Jan 2013)
     • HDFS - Hive (2009), Impala (2012), Drill (2012), Spark SQL (2014), Presto (2013)
  24. Meanwhile in RDBMS land
     Original complaints still relevant?
     • Most OLTP apps fit in 1TB of RAM and flash!
     • Shared-nothing OLAP available and works well now
     Maybe NoSQL and SQL have converged?
  25. “It is perhaps fair to say that from the perspective of many engineers working on the Google infrastructure, the SQL vs. NoSQL dichotomy may no longer be relevant.”
     Source: “Spanner: Becoming a SQL System”
  26. Part 2: Evaluating a Not-Only SQL Database
  27. What kind of application?
     • OLTP? OLAP? HTAP (Hybrid Transactional/Analytic Processing)
     • Next-gen data apps are all hybrid (streaming ingest, constant analytics)
     • “Combining OLTP, OLAP, and full-text search capabilities in a single system remains at the top of customer priorities.” - Spanner: Becoming a SQL System
  28. HTAP Application Architecture
     • Realtime ingest (high performance writes)
        • Throughput and latency both important
     • Concurrent SQL reads
        • BI apps demand interactive performance
     • Often a time-series component
        • IoT, transaction data, click logs, etc.
     • High Availability/Geo-redundancy
     [Diagram labels: Browser tracing, Web logs, Kafka, Kudu, Impala, JDBC access, Marketing Dept., Developers, Web-app]
  29. Evaluating an HTAP Data Store
     • SQL support
     • Semantics (eventual vs strict consistency, transactional support, features)
     • Performance (ingest with concurrent analytics)
     • Availability (multi-datacenter)
     • Deployment Model
     • Cost
  30. Narrowing the options
     Columns: store | original use case | deployment | semantics
     • HBase | Web indexing | Anywhere | single-row ACID
     • Cassandra | OLTP (web serving) | Anywhere | eventual
     • Cloud Spanner | OLTP | SaaS-only (GCE) | full ACID
     • HDFS | OLAP | Physical HW | bulk access only
     • Kudu | HTAP | Anywhere | single-row ACID
     Slide annotations:
     • Similar storage implementations (SSTable, Log-Structured-Merge)
     • Only store originally designed for HTAP
     • Let’s compare with Spanner since it’s shiny, new, and similar to Kudu!
  31. Not-Only-SQL in Depth: Comparing Cloud Spanner and Kudu+Impala
  32. Apache Kudu: Scalable and fast tabular storage
     Scalable
     • Tested up to 275 nodes (~3PB cluster)
     • Designed to scale to 1000s of nodes and tens of PBs
     Fast
     • Millions of read/write operations per second across cluster
     • Multiple GB/second read throughput per node
     Tabular
     • Represents data in structured tables like a relational database
        • Strict schema, finite column count, no BLOBs
     • Individual record-level access to 100+ billion row tables
  33. Cloud Spanner at a glance
  34. Kudu vs Spanner: Consistency and Availability
     Columns: criterion | Kudu | Spanner | winner?
     • Concurrency control | MVCC (with HybridTime) | MVCC (with TrueTime) | Spanner (but needs atomic clock hardware!)
     • Read-only (analytic) queries | Consistent Snapshot Isolation | Consistent Snapshot Isolation | Tie
     • Transactions | Single-row ACID | Multi-row ACID (small sets of rows only) | Spanner
     • Availability/Replication | Replicated log (Raft, 3 replicas) | Replicated log (Multi-Paxos, 3 replicas) | Tie
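
Both systems in this comparison use multi-version concurrency control (MVCC) to give read-only queries a consistent snapshot. The idea can be sketched in a few lines of Python. This is a toy, single-process model: the class `MVCCStore` and its integer counter are hypothetical stand-ins, and the counter only plays the role that HybridTime or TrueTime timestamps play in the real systems.

```python
import bisect


class MVCCStore:
    """Toy multi-version store: every write adds a new timestamped version."""

    def __init__(self):
        # key -> (sorted list of commit timestamps, parallel list of values)
        self._versions = {}
        # Logical counter; a stand-in for a real timestamp source.
        self._clock = 0

    def write(self, key, value):
        """Commit a new version of key and return its commit timestamp."""
        self._clock += 1
        timestamps, values = self._versions.setdefault(key, ([], []))
        timestamps.append(self._clock)
        values.append(value)
        return self._clock

    def snapshot(self):
        """Return a timestamp that identifies the current committed state."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        timestamps, values = self._versions.get(key, ([], []))
        idx = bisect.bisect_right(timestamps, snapshot_ts)
        return values[idx - 1] if idx else None


store = MVCCStore()
store.write("row1", "v1")
snap = store.snapshot()    # an analytic query starts here
store.write("row1", "v2")  # a concurrent writer commits afterwards

old_view = store.read("row1", snap)              # still sees "v1"
new_view = store.read("row1", store.snapshot())  # sees "v2"
```

Because old versions are kept rather than overwritten, a long scan never blocks writers and never sees a half-applied state.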
  35. Kudu vs Spanner: Data Access
     Columns: criterion | Kudu | Spanner | winner?
     • Programmatic APIs | Java, C++, Python | C#, Go, Java, Node, PHP, Python, Ruby | Spanner
     • Secondary indexes | not supported | supported | Spanner
     • SQL | via Impala or Spark (SQL 2003 w/ analytic extensions) | Built-in (simple ANSI99 queries only, no write support) | Kudu
     • Ecosystem integrations | Spark, Impala, Flume, Apex, StreamSets, et al. | ?? (very limited) | Kudu
  36. Kudu vs Spanner: operational factors
     Columns: criterion | Kudu | Spanner | winner?
     • Partitioning | Hash or range, explicit | Range only (automatic) | <it depends>
     • Load balancing | manual | automatic | Spanner
     • Deployment environment | on-prem or cloud | SaaS only (lock-in) | Kudu
     • Ops model | operate yourself | SaaS (no ops) | Spanner
     • Licensing | Apache License | closed source | Kudu
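
The partitioning row is marked "it depends", and a small sketch of the two schemes shows why. The functions below are illustrative only: CRC32 is a stand-in hash and the split points are made up; neither reflects how Kudu or Spanner actually assign rows to partitions.

```python
import bisect
import zlib
from typing import Sequence


def hash_partition(key: str, num_buckets: int) -> int:
    """Hash partitioning: spreads keys across buckets, ignoring key order."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def range_partition(key: str, split_points: Sequence[str]) -> int:
    """Range partitioning: partition i holds keys in [split[i-1], split[i])."""
    return bisect.bisect_right(split_points, key)


# Hypothetical split points giving four ranges: <g, [g,n), [n,t), >=t
splits = ["g", "n", "t"]

range_ids = [range_partition(k, splits) for k in ("apple", "grape", "n", "zebra")]
# Neighbouring keys stay together under range partitioning (good for scans)...
neighbours = {range_partition(k, splits) for k in ("a1", "a2", "a3")}
# ...while hashing deliberately ignores key order (good for spreading writes).
hash_ids = [hash_partition(k, 4) for k in ("a1", "a2", "a3")]
```

Range partitioning keeps adjacent keys in one partition, which helps ordered scans but can concentrate sequential inserts on a single partition; hash partitioning trades scan locality for an even write spread.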
  37. Checkpoint so far
     • Systems are really pretty similar
        • No accident - Kudu’s replication, partitioning, and data model inherit a lot from Spanner
     • Current feature gaps
        • Spanner ahead on transactional feature set (OLTP focus)
        • Kudu ahead on analytic feature set (OLAP focus)
     What about underlying storage and performance?
  38. Spanner Storage - SSTable / Log-Structured Merge
     • SSTable (sorted-string table)
        • same storage format as BigTable (inherited code)
     • row-oriented design
        • Each row <cola, colb, colc, ...> stored on disk in that format
        • Optimal for OLTP (read 1 row = 1 disk seek)
        • Inefficient for OLAP (high CPU on scans)
     • not schema-aware
        • little opportunity for type-specific compression techniques, etc.
     “SSTables have proven to be remarkably robust even when used for schematized data consisting largely of small values, often traversed by column. But they are ultimately a poor fit and leave a lot of performance on the table.”
  39. Kudu Storage - Columnar + Deltas
     • Stores most of its data in an internal columnar format
        • Each column stored, encoded, and compressed separately, in small chunks
     • Similar to Parquet, with enhancements:
        • Indexes allow fast seeking by key or by position (for low-latency read)
        • Delta Stores allow tracking of updated and deleted rows
     [Figure: base columnar data (columns c1 c2 c3 c4) + deltas (recently changed rows: d1, d2), combined at read time]
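
The "base columnar data + deltas, merged at read time" design on this slide can be sketched as a toy Python class. This is an illustration of the general idea under simplifying assumptions, not Kudu's implementation: the class name `ColumnarRowSet`, its methods, and the sample values are all hypothetical, and real delta stores are also encoded, flushed, and compacted.

```python
class ColumnarRowSet:
    """Toy model: immutable base columns plus deltas applied at read time."""

    def __init__(self, columns):
        # columns: dict of column name -> list of values (equal lengths)
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self.base = {name: list(values) for name, values in columns.items()}
        self.num_rows = lengths.pop() if lengths else 0
        self.updates = {}     # row position -> {column name: new value}
        self.deleted = set()  # row positions marked as deleted

    def _check(self, pos):
        if not 0 <= pos < self.num_rows:
            raise IndexError(pos)

    def update(self, pos, column, value):
        """Record a change in the delta store; base data is not rewritten."""
        self._check(pos)
        if column not in self.base:
            raise KeyError(column)
        self.updates.setdefault(pos, {})[column] = value

    def delete(self, pos):
        """Mark a row deleted in the delta store."""
        self._check(pos)
        self.deleted.add(pos)

    def scan(self, column):
        """Analytic read: touch one column only, applying deltas on the fly."""
        result = []
        for pos, value in enumerate(self.base[column]):
            if pos in self.deleted:
                continue
            result.append(self.updates.get(pos, {}).get(column, value))
        return result

    def get_row(self, pos):
        """Point read by position: gather one value per column, plus deltas."""
        self._check(pos)
        if pos in self.deleted:
            return None
        changed = self.updates.get(pos, {})
        return {name: changed.get(name, vals[pos]) for name, vals in self.base.items()}


rows = ColumnarRowSet({"id": [1, 2, 3], "word": ["hi", "cat", "bye"]})
rows.update(1, "word", "dog")  # recently changed row
rows.delete(2)                 # recently deleted row

words = rows.scan("word")      # deltas applied at read time
row1 = rows.get_row(1)
```

A scan of `word` never reads the `id` column, which is where the analytic advantage over a row-oriented layout comes from; the cost is that every read must consult the deltas.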
  40. So how much does it really matter? Analytics benchmarks
  41. Benchmark setup
     Cloud Spanner
     • 5 “nodes” (unknown specs)
     • us-central1 region (multi-zone)
     • Price: $0.90/node/hr * 5 nodes = $3240/month
     Kudu on GCE
     • 5 n1-standard-16 (16 vCPU, 60G RAM)
     • us-central1 region (multi-zone)
     • 500G Persistent SSD disk each
     • Price: $0.54/node/hr * 5 + 500GB * $0.17/GB/mo * 5 = $2366.80/month
        * drops to $1009 if preemptible is used!
        * factoring in sustained-use discount
     30% lower!
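
The price arithmetic on this slide can be reproduced with a short calculation. One assumption is needed: a 720-hour month (30 days × 24 hours), which is inferred because it reproduces the slide's $3240 figure and is not stated in the talk. The function name `monthly_cost` is illustrative.

```python
# Assumption: 30 days * 24 hours; inferred because it reproduces $3240.
HOURS_PER_MONTH = 720


def monthly_cost(nodes, hourly_rate, disk_gb_per_node=0, disk_rate_per_gb_month=0.0):
    """Monthly cost = compute (per node-hour) + storage (per GB-month)."""
    compute = nodes * hourly_rate * HOURS_PER_MONTH
    storage = nodes * disk_gb_per_node * disk_rate_per_gb_month
    return compute + storage


spanner = monthly_cost(nodes=5, hourly_rate=0.90)
kudu = monthly_cost(nodes=5, hourly_rate=0.54,
                    disk_gb_per_node=500, disk_rate_per_gb_month=0.17)
savings = 1 - kudu / spanner

print(f"Spanner: ${spanner:,.2f}/month")
print(f"Kudu:    ${kudu:,.2f}/month")
print(f"Kudu is {savings:.0%} cheaper")
```

With the rounded rates shown on the slide this gives about $2,369 for Kudu, slightly above the slide's $2366.80, presumably because $0.54 is itself a rounded rate. Either way the difference comes to roughly 27%, which the slide rounds to "30% lower".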
  42. Test 1: TPCH Data Loading
     • Used a separate node to load the TPC-H “LINEITEM” table
     • 600M rows, 75GB in CSV format
     • Multi-threaded Java program* to load, followed best practices
     * Loader available at https://github.com/toddlipcon/spanner-kudu-comparison
  43. Test 2: TPCH Queries
     • SELECT COUNT(*)
     • TPCH Q1, Q6: simple GROUP BY/SUM/COUNT which scan the whole table
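
To give a feel for the query shapes in this test, here is TPC-H Q6 adapted to SQLite syntax and run through Python's `sqlite3` module. This is a sketch, not the benchmark harness from the talk: the table is a cut-down LINEITEM with only the columns Q6 touches, the five rows are invented, dates are stored as ISO strings, and the filter constants are the commonly quoted Q6 parameter values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down LINEITEM: only the columns that Q6 reads.
conn.execute(
    """CREATE TABLE lineitem (
           l_quantity      REAL,
           l_extendedprice REAL,
           l_discount      REAL,
           l_shipdate      TEXT  -- ISO-8601 date string
       )"""
)
# Invented sample rows; only the first and last satisfy every predicate.
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?, ?, ?)",
    [
        (10, 1000.0, 0.06, "1994-03-15"),  # qualifies
        (30, 2000.0, 0.06, "1994-06-01"),  # quantity too high
        (5, 500.0, 0.02, "1994-07-04"),    # discount out of range
        (12, 3000.0, 0.06, "1995-02-01"),  # shipped outside the year
        (20, 400.0, 0.06, "1994-12-31"),   # qualifies
    ],
)

row_count = conn.execute("SELECT COUNT(*) FROM lineitem").fetchone()[0]

# TPC-H Q6 (forecasting revenue change), SQLite-flavoured date handling.
Q6 = """
SELECT SUM(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= '1994-01-01'
  AND l_shipdate <  '1995-01-01'
  AND l_discount BETWEEN 0.05 AND 0.07
  AND l_quantity < 24
"""
revenue = conn.execute(Q6).fetchone()[0]
print(row_count, revenue)
```

Q6 reads four columns of every row and returns a single number, so it mostly measures raw scan speed, which is where a columnar layout is expected to help.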
  44. Test 3: YCSB Loading
     • Standard YCSB benchmark
     • Configured as recommended in the cloudspanner/README file
     • Experienced many errors, timeouts, and multi-minute stalls loading Spanner
        • eventually succeeded on third try
        • so take these results with a grain of salt!
  45. YCSB Throughput (Load and random-read)
  46. YCSB Latencies (for read workload)
  47. YCSB Workload A (50/50 read/write mix)
     Kudu is not optimized for high update-rate scenarios. See KUDU-749
  48. Benchmark summary
     • Kudu ingests data at least 4x faster
        • Stability issues with Cloud Spanner ingestion (cause unknown)
     • Kudu performs simple analytic queries 10-100x faster
     • Spanner wins on high-percentile tail latencies
     • Kudu performance degrades significantly in 50/50 R/W mix workload
     • Reminders:
        • Kudu cluster has 30% lower cost, and can be run on any provider!
        • Kudu doesn’t have the same rich OLTP feature set as Spanner (indexes, multi-row transactions, etc.)
  49. Conclusions
  50. Conclusions
     • NoSQL and SQL are converging again
        • We now get “best of both worlds” from both communities!
     • Many different excellent choices are now available for building hybrid transactional/analytic applications
        • Understand the trade-offs before settling on an architecture
        • Seemingly small details can make orders-of-magnitude difference
        • Consider non-functional differences as well (licensing, deployment, lock-in, etc.)
  51. Acknowledgements
     • Spanner team for publishing papers, especially SIGMOD 2017 (“Spanner: Becoming a SQL System”)
     • Cloud Spanner team and developer advocates (Deepti Srivastava, Robert Kubis)
     • Siamak Tazari (YCSB binding for Cloud Spanner)
     • Cloudera (paying my GCE bill)
  52. kudu.apache.org
     @tlipcon | @ApacheKudu
