SF PostgreSQL User Group cstore presentation
The slides used at the San Francisco PostgreSQL User Group meetup (http://www.meetup.com/postgresql-1/events/178687982/). Learn about how we at Citus Data implemented a columnar store for PostgreSQL using foreign data wrappers. Features discussions on architecture and benchmark results.

  • Columnar store for PostgreSQL
    Ozgun, founder at Citus Data
    SF and Istanbul
    Hadi did the bulk of the work on the columnar store
    Have about 30 slides and a demo. I’ll put things into context with 2 slides on Citus
    Technical talk. If you have questions, please feel free to interrupt
    Speak slowly.
  • When I say "extends": we didn’t take a particular version of Postgres and fork from there. Instead, we moved from 8.4 to 9.0, and so on.
    We used the existing API and integration points; the query planner and executor hooks are an example.
  • Let’s take an example distributed table, and see how it’s spread across the worker nodes.
    The yellow boxes here are shards that make up the distributed table.
    Worker node extensions
    Master node extensions
    1 shard = 1 postgres table = 1 cstore table
  • Relative ease of use: PostgreSQL config could be much simpler
    HDFS: NameNode / DataNode, Hadoop: JobTracker / TaskTracker, Hive: metadata server (MySQL), etc.
    Uses the copy hook for loading in the data
  • TPC-H is an ad-hoc, decision support benchmark.
    Each table has between 10 and 20 columns, so it is not the best benchmark to demonstrate column store performance.
    Talk about what graphs are going to show
    m3.2xlarge (2 x 80G SSD, 30G ram, 4x3.25 ECU - 10G tests)
    m2.2xlarge (1 x 850G HDD, 34.2G ram, 4x3.25 ECU - 10G tests)
  • Representative queries
    Q6: 68s -> 25s (Q3: 85s -> 44s)
    1/ Reduces disk bottlenecks
    2/ If you’re deploying PB scale clusters, reduces the number of machines
  • Q6: 26s -> 14s (Q3: 37s -> 26s)
    1/ Reduces SSD storage costs
    2/ Query performance starts increasing with CitusDB (use of multiple cores)
  • Q6: 9GB -> 1.8GB -> 0.8GB
  • cstore is slightly faster. cstore with compression is slightly slower due to the compression’s CPU cost.
    Effective memory size increases
    1/ Compression (instead of fitting 1GB, users can now fit in 2-3GB)
    2/ If queries always select a subset of the columns, then only those columns occupy the working set
    3/ Ideally, skip indexes are always kept in memory (they get referenced on each query)
  • Bug fixes!
    Better cost estimates for join operations!
  • Questions?

Presentation Transcript

  • cstore_fdw – Columnar store for analytic workloads Hadi Moshayedi & Ozgun Erdogan
  • What is CitusDB?
    • CitusDB is a scalable analytics database that extends PostgreSQL
        – Citus shards your data and automatically parallelizes your queries
        – Citus isn’t a fork of Postgres. Rather, it hooks onto the planner and executor for distributed query execution.
        – Always rebased to newest Postgres version
        – Natively supports new data types and extensions
  • [Architecture diagram: a master node (extended PostgreSQL) holds shard and shard placement metadata; worker nodes #1, #2, #3, … (extended PostgreSQL) hold the shards (A, C, D, G, …). 1 shard = 1 Postgres table.]
  • Talk Overview
    1. Why customers want columnar stores
    2. Live demo
    3. Optimized Row Columnar (ORC) format
    4. PostgreSQL benefits
    5. New benchmark numbers
  • [Figure: example wide table with columns Id, Sz, Ln, Ht, …; 30M rows, 700 columns]
  • Example SQL query: SELECT id, AVG(price), MAX(price) FROM items WHERE quantity > 100 AND last_stock_date < '2013-10-01' GROUP BY weight;
  • Row-oriented store
    Id | … | price | … | quant | … | last_stm | … | weight
    1  | … | 3.90  | … | 31    | … | 2013-…   | … | 0.6
    2  | … | 13    | … | 70    | … | 2010-…   | … | 0.8
    3  | … | 4.25  | … | 432   | … | 2013-…   | … | 1
    4  | … | 4     | … | 45    | … | 2013-…   | … | 6
    …
    4… | … | 95    | … | 37    | … | 2013-…   | … | 0.6
    4… | … | 59    | … | 90    | … | 2012-…   | … | 1.5
  • Cost of row storage
    • Read 700 columns instead of 5
    • >39 GB of unnecessary I/O
    Input Type | Estimated Input Rate | Cost to query performance
    Memory     | 10 GB/s              | 3.9 seconds
    SSD        | 600 MB/s             | >60 seconds
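The cost figures on this slide follow from dividing the unnecessary I/O by the input rate. A minimal Python sketch of that arithmetic, assuming exactly 39 GB (the slide says ">39 GB", so these are lower bounds) and treating 600 MB/s as 0.6 GB/s; the names `UNNECESSARY_IO_GB` and `scan_seconds` are made up for illustration:

```python
# Back-of-the-envelope check of the slide's numbers.
# Assumption: 39 GB of unnecessary I/O (the slide says ">39 GB"),
# and 1 GB taken as 1000 MB when converting the SSD rate.
UNNECESSARY_IO_GB = 39.0

INPUT_RATES_GB_PER_S = {
    "Memory": 10.0,  # 10 GB/s
    "SSD": 0.6,      # 600 MB/s
}


def scan_seconds(io_gb: float, rate_gb_per_s: float) -> float:
    """Time spent reading io_gb of data at the given sequential rate."""
    return io_gb / rate_gb_per_s


costs = {name: scan_seconds(UNNECESSARY_IO_GB, rate)
         for name, rate in INPUT_RATES_GB_PER_S.items()}

for name, seconds in costs.items():
    print(f"{name}: {seconds:.1f} s")
```

With these inputs the memory case comes out at 3.9 seconds and the SSD case at a little over a minute, in line with the table.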
  • Example SQL query: SELECT id, AVG(price), MAX(price) FROM items WHERE quantity > 100 AND last_stock_date < '2013-10-01' GROUP BY weight;
  • Column-oriented store
    Id | sz | price | … | quant | … | last_stm | … | weight
    1  | 4  | 3.90  | … | 31    | … | 2013-…   | … | 0.6
    2  | 3  | 13    | … | 70    | … | 2010-…   | … | 0.8
    3  | 2  | 4.25  | … | 432   | … | 2013-…   | … | 1
    4  | 4  | 4     | … | 45    | … | 2013-…   | … | 6
    …
    4… | 19 | 95    | … | 37    | … | 2013-…   | … | 0.6
    4… | 2  | 59    | … | 90    | … | 2012-…   | … | 1.5
  • Columnar Store Motivation
    • Read subset of columns to reduce I/O
    • Better compression
        – Less disk usage
        – Less disk I/O
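To make the "read subset of columns" point concrete, here is a small Python sketch that pivots a few rows into per-column lists and counts how many values each layout has to touch. It is a toy model, not cstore_fdw code: the sample values come from the slides' example table (the date column is left out because its values are elided there), and the helper names are hypothetical.

```python
# Toy model: count values touched by a row layout vs. a column layout.
ROWS = [
    {"id": 1, "price": 3.90, "quantity": 31, "weight": 0.6},
    {"id": 2, "price": 13.0, "quantity": 70, "weight": 0.8},
    {"id": 3, "price": 4.25, "quantity": 432, "weight": 1.0},
    {"id": 4, "price": 4.0, "quantity": 45, "weight": 6.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_touched_row_store(rows):
    """A row layout reads every field of every row, needed or not."""
    return sum(len(row) for row in rows)


def values_touched_column_store(columns, needed):
    """A column layout reads only the columns the query references."""
    return sum(len(columns[name]) for name in needed)


columns = to_columns(ROWS)
needed = ["price", "quantity"]
row_cost = values_touched_row_store(ROWS)
col_cost = values_touched_column_store(columns, needed)
print(row_cost, col_cost)
```

The gap grows with table width: with 700 columns and a query that needs 5, the row layout touches 140 times as many values.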
  • State of the Columnar Store
    1. Fork a popular database, swap in your storage engine, and never look back
    2. Develop an open columnar store format for the Hadoop Distributed Filesystem (HDFS)
    3. Use PostgreSQL extension machinery for in-memory stores / external databases
  • Columnar Store Specs
    • Record Columnar File (RCFile)
        – Facebook, OSU, and Chinese Academy of Sciences
        – First horizontally-partition, then vertically-partition
    • ORC (Optimized RCFile)
        – Second generation. Developed by Hortonworks and Facebook
        – Lightweight indexes stored within the file
        – Different compression methods within the same file
  • ORC File Layout benefits
    1. Columnar layout – reads only the columns related to the query
    2. Compression – groups column values (10K) together and compresses them
    3. Skip indexes – applies predicate filtering to skip over unrelated values
  • [File layout diagram: groups of 150K rows (configurable), each divided into column blocks (Block 1 … Block 7) of 10K column values (configurable) per block]
  • Compression
    • Current compression method is PG_LZ from PostgreSQL core
    • Easy to add new compression methods depending on the CPU / disk trade-off
    • cstore_fdw enables using different compression methods at the column block level
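Block-level compression can be sketched in a few lines of Python. This is an illustration only: `zlib` stands in for the PG_LZ method named on the slide, the 8-byte integer packing is an arbitrary choice, and `compress_column` / `decompress_block` are hypothetical helpers rather than cstore_fdw functions.

```python
import struct
import zlib

BLOCK_ROW_COUNT = 10_000  # column values per block, as on the slide


def compress_column(values, block_row_count=BLOCK_ROW_COUNT):
    """Split one column into blocks and compress each block on its own.

    zlib is a stand-in codec; it is NOT the PG_LZ method the slide names.
    Values are packed as 8-byte signed integers purely for illustration.
    """
    blocks = []
    for start in range(0, len(values), block_row_count):
        chunk = values[start:start + block_row_count]
        raw = struct.pack(f"<{len(chunk)}q", *chunk)
        blocks.append((len(chunk), zlib.compress(raw)))
    return blocks


def decompress_block(block):
    """Recover the column values stored in one compressed block."""
    count, payload = block
    return list(struct.unpack(f"<{count}q", zlib.decompress(payload)))


quantity = [i % 500 for i in range(25_000)]
blocks = compress_column(quantity)
restored = [v for block in blocks for v in decompress_block(block)]
compressed_bytes = sum(len(payload) for _, payload in blocks)
print(len(blocks), compressed_bytes)
```

Because each block is compressed independently, a reader can decompress only the blocks it needs, and in principle each block could use a different codec.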
  • Table sizes normalized to 1.0
  • Skip Indexes
    • For each column block (10K), cstore_fdw also records min/max values in a skip index.
    • When the user runs a query, we extract all filter clauses from the query.
    • For example, the query specifies quantity > 100 AND last_stock_date < ‘2013-10-01’.
  • Skip Indexes
    • We then use Postgres’ constraint exclusion mechanism to decide whether to skip over 10K rows.
    • For each filter clause, we create and apply a constraint. The awesome thing about using PostgreSQL is that we don’t need to write any code.
    • If input data has an inherent time dimension, that helps. Sorting input data also helps with skip indexes.
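The min/max idea behind skip indexes can be sketched in Python as follows. This is a simplified stand-in, not the cstore_fdw implementation: it hard-codes a single `value > threshold` filter instead of using Postgres' constraint exclusion, and `BlockStats`, `build_skip_index`, and `scan_greater_than` are hypothetical names.

```python
from dataclasses import dataclass

BLOCK_ROW_COUNT = 10_000  # column values per block, as on the slide


@dataclass
class BlockStats:
    start: int    # index of the first row in the block
    end: int      # one past the last row in the block
    minimum: int  # smallest value in the block
    maximum: int  # largest value in the block


def build_skip_index(values, block_row_count=BLOCK_ROW_COUNT):
    """Record min/max for each block of one column."""
    index = []
    for start in range(0, len(values), block_row_count):
        chunk = values[start:start + block_row_count]
        index.append(
            BlockStats(start, start + len(chunk), min(chunk), max(chunk)))
    return index


def scan_greater_than(values, index, threshold):
    """Return (matching values, blocks read) for `value > threshold`."""
    matches, blocks_read = [], 0
    for stats in index:
        if stats.maximum <= threshold:
            continue  # no value in this block can satisfy the filter
        blocks_read += 1
        matches.extend(
            v for v in values[stats.start:stats.end] if v > threshold)
    return matches, blocks_read


# Sorted input keeps block ranges disjoint, so more blocks can be skipped.
quantity = list(range(30_000))
skip_index = build_skip_index(quantity)
matches, blocks_read = scan_greater_than(quantity, skip_index, 25_000)
print(len(matches), blocks_read)
```

Here only the last of the three blocks is read. If the same values were shuffled, every block's range would likely span the threshold and nothing could be skipped, which is why sorted or time-ordered input helps.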
  • Drawbacks to ORC
    • Support for only eight data types. Each data type further needs to have a separate code path for min/max value collection and constraint exclusion.
    • Gathering statistics from the data and table JOINs are an afterthought.
  • 1. Simply use PostgreSQL data types’ datum representation. 2. Avoid deserialization overhead. 3. Support user-defined types as well.
  • Statistics Collection
    • FDWs provide an API to collect random samples from data. Users need to manually run ANALYZE.
    • Postgres then constructs histograms, most common value frequencies, and other stats.
    • cstore_fdw estimates query costs for different access paths based on these statistics.
    • Informed resource usage. Better join order and join method selection.
  • Recent Benchmark Results
    • TPC-H is a standard benchmark
    • Performed in-memory, SSD, and HDD tests on 10 GB of data
    • Used m2.2xlarge and m3.2xlarge on EC2
    • Compared vanilla PostgreSQL, CStore, CStore with compression
  • 10GB of uncached data on m2.2xlarge
  • 10GB of uncached data on m3.2xlarge
  • Total issued disk I/O measured with iotop
  • 10GB of cached data on m2/m3.2xlarge
  • Future Work
    • CStore is an open source project actively in development: github.com/citusdata/cstore_fdw
        – Improve memory usage
        – Automatically determine paths for data files
        – Native Delete / Insert / Update support
        – Improve read query performance (vectorized execution)
        – Different compression codecs
        – Many more; contribute to the discussion on GitHub!
  • Summary
    • CStore: Open source columnar store FDW for Postgres
    • Data layout is based on ORC
        1. Columnar data layout per stripe
        2. Supports different compression codecs
        3. Skip indexes enable predicate filtering
    • Uses foreign data wrapper APIs
        1. Supports all PostgreSQL data types
        2. Statistics collection for better query plans
        3. Load extension. Create Table. Copy
  • cstore_fdw – Columnar Store for Analytic Workloads Hadi Moshayedi – hadi@citusdata.com Ozgun Erdogan – ozgun@citusdata.com