Hypertable Berlin Buzzwords

Hypertable. Doug Judd, CEO, Hypertable, Inc.

High Performance, Open Source Scalable Database: modeled after Bigtable; high performance implementation (C++); project started in March 2007; Thrift interface for all popular languages (Java, PHP, Ruby, Python, Perl, etc.)

Bigtable: the infrastructure that Google is built on. YouTube, Blogger, Google Earth, Google Maps, Orkut (social network), Gmail, Google Analytics, Google Book Search, Google Code, Crawl Database, … plus 90 other Google services …

Functionality: massive sparse tables of information; single primary key index; cells can have multiple timestamped versions; not relational; no joins (not yet); no secondary indexes (not yet); not a transaction system (not yet)
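
The data model on this slide (sparse rows, a single primary key, timestamped cell versions) can be illustrated with a small in-memory sketch. This is only an illustration under assumptions: the class name `SparseTable`, its methods, and the column names are hypothetical and are not Hypertable's API.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table: row key -> column -> versions, newest first."""

    def __init__(self):
        # Only cells that were actually written take up space (sparse).
        self._rows = defaultdict(dict)

    def put(self, row, column, value, timestamp):
        versions = self._rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version at the front of the list.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        return self._rows.get(row, {}).get(column, [])[:max_versions]

    def scan(self, start_row, end_row):
        """Yield (row, {column: newest value}) for start_row <= row < end_row."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                yield row, {c: v[0][1] for c, v in self._rows[row].items()}
```

The only index is the sorted row key used by `scan`; there are no joins or secondary indexes, which matches the limitations listed on the slide.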

Auto-Sharding: MongoDB, AsterData, Greenplum

Dynamo-based Hash Table Architectures: Cassandra, Project Voldemort, Riak

Order Preserving Partitioner (Cassandra): (www.recipezaar.com [1091721999…629750272] + www.ribbonprinters.com [1091721999…965293103]) / 2 = www.rgb????i?pQdp ?.??? [1091721999…297521687]
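
The arithmetic on this slide can be reproduced with a short sketch: treat each key as a big-endian integer, average the two integers, and convert the result back to bytes. The zero-padding rule and the function names are assumptions made for illustration; this is not Cassandra's actual partitioner code.

```python
def key_to_int(key: bytes, width: int) -> int:
    # Right-pad with zero bytes so both keys have the same width and the
    # integer order matches the byte-wise (lexicographic) key order.
    return int.from_bytes(key.ljust(width, b"\x00"), "big")


def midpoint(a: bytes, b: bytes) -> bytes:
    """Return the key halfway between a and b in integer space."""
    width = max(len(a), len(b))
    mid = (key_to_int(a, width) + key_to_int(b, width)) // 2
    return mid.to_bytes(width, "big")


low = b"www.recipezaar.com"
high = b"www.ribbonprinters.com"
mid = midpoint(low, high)
print(mid)
```

The midpoint starts with `www.rgb`, as on the slide, and then continues with bytes outside the printable ASCII range, which is probably what the `?` characters on the slide stand for. A split point computed this way sorts correctly between the two keys but is not a meaningful key itself.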

Order Preserving Partitioner Balance Problem

Log Structured Merge (LSM) Tree: eliminates random I/O on writes; converts random I/O to sequential I/O. Write path: commit log on disk (DFS); in-memory map; in-memory map gets “compacted” to disk; disk files periodically get merged.
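
The write path listed on this slide can be sketched as a toy LSM tree. This is a minimal sketch under assumptions: plain Python lists stand in for the commit log and the on-disk files, and the class name `TinyLSM` and the `memtable_limit` parameter are hypothetical.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.commit_log = []   # stands in for an append-only log on the DFS
        self.memtable = {}     # in-memory map of recent updates
        self.runs = []         # immutable sorted runs ("disk files"), oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))  # sequential append, no random I/O
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.compact()

    def compact(self):
        """Write the in-memory map out as one sorted run and start afresh."""
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []   # those updates now live in the run

    def merge(self):
        """Merge all runs into one; newer runs overwrite older ones."""
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):        # newest run first
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Every write is an append to the log plus an in-memory update; data reaches the "disk" only as whole sorted runs, which is how random writes turn into sequential ones.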

Range Server: manages ranges of table data. CellCache: in-memory map containing recent updates. CellStore: on-disk (DFS) file containing “compacted” cell cache.

Range Server: CellStore. Sequence of 65K blocks of compressed key/value pairs.
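
A block layout of this kind can be sketched as follows. The length-prefixed pair encoding, the index of first keys per block, and the use of zlib are illustrative assumptions; the slide does not describe the actual CellStore file format.

```python
import zlib

BLOCK_SIZE = 65536  # target uncompressed size per block ("65K" on the slide)


def encode_pair(key: bytes, value: bytes) -> bytes:
    # 4-byte big-endian length prefix before the key and before the value.
    return len(key).to_bytes(4, "big") + key + len(value).to_bytes(4, "big") + value


def build_blocks(sorted_pairs, block_size=BLOCK_SIZE):
    """Pack sorted key/value pairs into compressed blocks.

    Returns (blocks, index) where index[i] is the first key in blocks[i].
    """
    blocks, index = [], []
    buf, first_key = bytearray(), None
    for key, value in sorted_pairs:
        if first_key is None:
            first_key = key
        buf += encode_pair(key, value)
        if len(buf) >= block_size:
            blocks.append(zlib.compress(bytes(buf)))
            index.append(first_key)
            buf, first_key = bytearray(), None
    if buf:  # flush the final, partially filled block
        blocks.append(zlib.compress(bytes(buf)))
        index.append(first_key)
    return blocks, index


def decode_block(block: bytes):
    """Decompress one block and return its list of (key, value) pairs."""
    data, pos, pairs = zlib.decompress(block), 0, []
    while pos < len(data):
        klen = int.from_bytes(data[pos:pos + 4], "big")
        key = data[pos + 4:pos + 4 + klen]
        pos += 4 + klen
        vlen = int.from_bytes(data[pos:pos + 4], "big")
        pairs.append((key, data[pos + 4:pos + 4 + vlen]))
        pos += 4 + vlen
    return pairs
```

Because the pairs are sorted, a reader can binary-search the small index to find the one block that may hold a key and decompress only that block.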

Compression: Cell Store blocks are compressed; Commit Log updates are compressed. Supported compression schemes: zlib (--best and --fast), lzo, quicklz, bmz, none.

Bloom Filter: probabilistic data structure associated with every CellStore; indicates if key is not present.
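
A Bloom filter of the kind described here can be sketched in a few lines. The bit-array size, the number of hash functions, and the use of SHA-256 are arbitrary choices for illustration, not Hypertable's implementation.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting the key with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def may_contain(self, key: bytes) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))
```

A `False` answer lets a reader skip a CellStore without touching the disk; a `True` answer may be a false positive, so the file still has to be checked.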

Caching. Block Cache: caches CellStore blocks; blocks are cached uncompressed; dynamically adjusted size based on workload. Query Cache: caches query results.
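
The block cache can be sketched as a byte-budgeted cache that stores blocks uncompressed and can be resized at run time. The slide does not name an eviction policy, so the least-recently-used policy below is an assumption, as are the class name `BlockCache` and the `load_compressed` callback.

```python
import zlib
from collections import OrderedDict


class BlockCache:
    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used = 0
        self.hits = 0
        self.misses = 0
        self._blocks = OrderedDict()  # block id -> uncompressed bytes, LRU first

    def get(self, block_id, load_compressed):
        """Return the uncompressed block, loading and decompressing on a miss."""
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self._blocks[block_id]
        self.misses += 1
        data = zlib.decompress(load_compressed(block_id))
        self._blocks[block_id] = data
        self.used += len(data)
        self._evict()
        return data

    def resize(self, capacity_bytes):
        """Adjust the budget, for example when the workload shifts."""
        self.capacity_bytes = capacity_bytes
        self._evict()

    def _evict(self):
        # Drop least recently used blocks until the cache fits its budget,
        # always keeping the block that was just inserted.
        while self.used > self.capacity_bytes and len(self._blocks) > 1:
            _, evicted = self._blocks.popitem(last=False)
            self.used -= len(evicted)
```

Caching the uncompressed form trades memory for CPU: a hit avoids both the disk read and the decompression step.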

Performance Evaluation: Hypertable vs. HBase

Test Setup: Hypertable v0.9.3.2 (not yet released); HBase 0.20.3; HDFS 0.20.2; 10 machines; 3 Hyperspace / Zookeeper replicas; 1 Master / 4 Tablet Servers (5GB RAM); 1 Test Dispatcher / 4 Test Clients. Machine profile: 1 X 1.8 GHz Dual-core Opteron; 10 GB RAM; 3 X 250 GB SATA drives.

Random Write / Sequential Read

Project Resources: Twitter: hypertable; www.hypertable.org
