Hypertable - massively scalable nosql database

Hypertable

An Open Source,
High Performance,
Massively Scalable Database

Doug Judd
CEO Hypertable Inc.

Three Reasons to
Choose Hypertable
•  High Performance
•  Open Source
•  Future Direction SQL

Highlights
•  Modeled after Google’s Bigtable database
•  High Performance Implementation (C++)
•  Apache Thrift interface for all popular languages
(Java, PHP, Ruby, Python, Perl, etc)
•  Broad Hadoop distribution support
o  Apache 2
o  Cloudera CDH3, CDH4, CDH5
o  IBM BigInsights 3
o  Hortonworks HDP2
o  MapR
•  Actively developed for 8 years

Open Source
•  Licensed under the GPL
•  Hosted on GitHub
o  git://github.com/hypertable/hypertable.git
o  https://github.com/hypertable/hypertable.git
•  Online source documentation
•  Mailing Lists
o  groups.google.com/group/hypertable-user
o  groups.google.com/group/hypertable-dev

Bigtable
•  Google’s most successful scalable database
•  Bigtable underpins 100+ Google services
•  YouTube, Blogger, Google Earth, Google Maps,
Orkut, Gmail, Google Analytics, Google Book
Search, Google Code, Crawl Database …
•  Data is physically ordered by primary key – it’s not a
distributed hash table
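
Because rows are kept in sorted order, a contiguous key range maps to a small set of adjacent servers instead of being scattered by a hash. A minimal sketch of that lookup, assuming hypothetical split points and server names (none of these values come from Hypertable itself):

```python
import bisect

# Hypothetical layout: each range is identified by its inclusive end row,
# and the list of end rows is kept sorted. "\uffff" stands in for "end of table".
range_ends = ["g", "n", "t", "\uffff"]
range_servers = ["rs1", "rs2", "rs3", "rs4"]


def locate(row_key):
    """Return the server holding the range that contains row_key."""
    return range_servers[bisect.bisect_left(range_ends, row_key)]


def servers_for_scan(start_row, end_row):
    """Return the adjacent servers that a scan over [start_row, end_row] touches."""
    first = bisect.bisect_left(range_ends, start_row)
    last = bisect.bisect_left(range_ends, end_row)
    return range_servers[first:last + 1]
```

A scan from "h" to "p" touches only the two neighbouring ranges; with a hash-based layout the same scan would have to contact every server.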

How Hypertable Differs From
A Traditional RDBMS
•  Horizontally Scalable
•  Sparse Table Structure
o  Variable number of columns per-row
o  Rows can have billions of columns
•  Cells can have multiple time stamped versions

Database Model
•  Sparse, two-dimensional tables
•  Cells can have multiple versions
•  Cells addressed by 4-part key
o  Row
o  Column family
o  Column qualifier
o  Timestamp
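
The 4-part addressing can be pictured with a toy in-memory model. This is only a sketch of the data model; the class and method names are made up for illustration and are not part of any Hypertable API.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a sparse table whose cells are addressed by a 4-part key."""

    def __init__(self):
        # (row, column family, column qualifier) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, family, qualifier, timestamp, value):
        self._cells[(row, family, qualifier)][timestamp] = value

    def get(self, row, family, qualifier, timestamp=None):
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self._cells.get((row, family, qualifier))
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def versions(self, row, family, qualifier):
        """Return (timestamp, value) pairs, newest first."""
        found = self._cells.get((row, family, qualifier), {})
        return sorted(found.items(), reverse=True)


table = SparseTable()
table.put("com.example/index.html", "info", "title", 100, "Old title")
table.put("com.example/index.html", "info", "title", 200, "New title")
```

Rows that never receive a given column simply have no entry for it, which is what makes the table sparse.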

Conceptual Table
Representation

Actual Table
Representation

Anatomy of a Key
•  Column Family is 8-bit
•  Timestamp and Revision are 64-bit integer
nanoseconds since Epoch
•  Simple byte-wise comparison
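
One way to make a plain byte-wise comparison sort keys correctly is to serialize the parts in priority order. The layout below is an illustrative assumption, not Hypertable's actual on-disk format: NUL-terminated row and qualifier, a one-byte column family id, and a big-endian timestamp that is inverted so newer versions sort first. The revision field is left out for brevity.

```python
import struct

MAX_U64 = 2**64 - 1


def encode_key(row, family_id, qualifier, timestamp_ns):
    """Serialize a key so that sorting the raw bytes orders by row, family,
    qualifier, then newest timestamp first (layout is a sketch only)."""
    if not 0 <= family_id <= 0xFF:
        raise ValueError("column family id must fit in 8 bits")
    if b"\x00" in row or b"\x00" in qualifier:
        raise ValueError("NUL is reserved as a terminator in this sketch")
    inverted = MAX_U64 - timestamp_ns  # larger timestamp -> smaller bytes
    return (row + b"\x00" + bytes([family_id]) + qualifier + b"\x00"
            + struct.pack(">Q", inverted))


keys = [
    encode_key(b"row2", 1, b"a", 100),
    encode_key(b"row1", 2, b"a", 100),
    encode_key(b"row1", 1, b"b", 100),
    encode_key(b"row1", 1, b"a", 100),
    encode_key(b"row1", 1, b"a", 200),
]
ordered = sorted(keys)  # plain byte-wise comparison, no custom comparator
```

After sorting, the two versions of row1/family 1/qualifier "a" are adjacent, with the timestamp-200 version ahead of the timestamp-100 version.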

RangeServer
Insert Handling

RangeServer
Query Handling

Cluster Task
Automation Tool
•  ht_cluster
•  Modeled after Capistrano
•  Role
o  Designates a function or service and the set of machines that will perform
that function or service
o  Examples: Hyperspace, Master, Slave (RangeServer), ThriftBroker
o  Machines can belong to one or more roles
•  Task
o  Script written for specific roles and used to manage the associated
function or service
o  Examples: start_hyperspace, stop_hyperspace

cluster.def
INSTALL_PREFIX=/opt/hypertable
HYPERTABLE_VERSION=0.9.8.2
PACKAGE_FILE=/tmp/hypertable-0.9.8.2-linux-x86_64.tar.gz
FS=hadoop
HADOOP_DISTRO=cdh4
ORIGIN_CONFIG_FILE=/root/hypertable.cfg
PROMPT_CLEAN=true
role: source test00
role: master test[00-02]
role: hyperspace test[00-02]
role: slave test[03-99] - test37
role: thriftbroker
role: spare
include: "core.tasks"
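
The role lines above use a compact host notation: `test[03-99] - test37` reads as hosts test03 through test99, minus test37. The sketch below expands that notation for illustration only. It handles just the subset shown on this slide and is not the parser that ht_cluster uses; the function names are hypothetical.

```python
import re

_RANGE = re.compile(r"^(?P<prefix>[^\[\]]*)\[(?P<lo>\d+)-(?P<hi>\d+)\](?P<suffix>.*)$")


def expand_pattern(pattern):
    """Expand 'test[00-02]' into ['test00', 'test01', 'test02']."""
    m = _RANGE.match(pattern)
    if not m:
        return [pattern]  # plain host name, nothing to expand
    width = len(m.group("lo"))  # keep the zero padding used in the pattern
    lo, hi = int(m.group("lo")), int(m.group("hi"))
    return [f"{m.group('prefix')}{n:0{width}d}{m.group('suffix')}"
            for n in range(lo, hi + 1)]


def expand_role(spec):
    """Expand a role's host spec; hosts listed after ' - ' are subtracted."""
    include, _, exclude = spec.partition(" - ")
    excluded = {h for tok in exclude.split() for h in expand_pattern(tok)}
    hosts = [h for tok in include.split() for h in expand_pattern(tok)]
    return [h for h in hosts if h not in excluded]


slaves = expand_role("test[03-99] - test37")
```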

Common Tasks
ht cluster start
ht cluster stop
ht cluster push_config
ht cluster install_package
ht cluster upgrade

Thrift Broker Metrics
Metric             Units
Connections        count
Requests           requests/s
Errors             errors/s
Virtual Memory     GB
Resident Memory    GB
Heap Size          GB
Heap Slack Bytes   GB
CPU user           percentage
CPU sys            percentage
Version            string

Range Server Metrics
Metric                Units
Scans                 scans/s
Updates               updates/s
Bytes Returned        bytes/s
Bytes Scanned         bytes/s
Byte Scan Yield       percentage
Bytes Written         bytes/s
Cells Returned        cells/s
Cells Scanned         cells/s
Cell Scan Yield       percentage
Outstanding Scanners  count
Request Backlog       count
Major Compactions     count
Minor Compactions     count
Merging Compactions   count
GC Compactions        count
Virtual Memory        GB
Resident Memory       GB
Heap Size             GB
Heap Slack Bytes      GB
Tracked Memory        GB
CPU user              percentage
CPU sys               percentage

Range Server Metrics
Metric                Units
Ranges                count
CellStores            count
Block Cache Hits      percentage
Block Cache Memory    GB
Block Cache Fill      GB
Query Cache Hits      percentage
Query Cache Memory    GB
Query Cache Fill      GB
Version               string

FS Broker Metrics
Metric             Units
Read Throughput    MB/s
Write Throughput   MB/s
Syncs              syncs/s
Sync Latency       milliseconds
Errors             count
JVM GCs            count
JVM GC Time        milliseconds
JVM Heap Size      GB
Virtual Memory     GB
Resident Memory    GB
Heap Size          GB
Heap Slack Bytes   GB
CPU user           percentage
CPU sys            percentage
Version            string

Master and Hyperspace
Metrics
Master:
Metric             Units
Operations         operations/s
Virtual Memory     GB
Resident Memory    GB
Heap Size          GB
Heap Slack Bytes   GB
CPU user           percentage
CPU sys            percentage
Version            string
Hyperspace:
Metric             Units
Requests           requests/s
Virtual Memory     GB
Resident Memory    GB
Heap Size          GB
Heap Slack Bytes   GB
CPU user           percentage
CPU sys            percentage
Version            string

Slow Query Log
•  ThriftBroker feature
•  Logs queries that
take longer than 10
seconds
•  Log line format
o  End time (seconds)
o  Start time (seconds)
o  Function called
o  Client IP/port
o  Latency (milliseconds)
o  Sub-scanner count
o  Bytes Returned
o  Bytes Scanned
o  Disk read
o  Servers contacted
o  Namespace
o  HQL representation of query

Namespaces
USE '/';
CREATE NAMESPACE foo;
USE foo;
CREATE NAMESPACE bar;
CREATE TABLE mytable (a, b, c);
GET LISTING;
(bar) namespace
mytable

Atomic Counters
•  Column option:
CREATE TABLE counts (
url COUNTER
);
•  Modified via existing API using specially
formatted values:
Value Format Description
[+]n Increment counter by n
-n Decrement counter by n
=n Reset counter to n
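
The value formats in the table can be read as a small update rule. The helper below is a sketch of those semantics only; it is a hypothetical function, not Hypertable's server-side implementation.

```python
def apply_counter_update(current, value):
    """Apply one specially formatted counter value to the current count.

    "[+]n" increments by n, "-n" decrements by n, "=n" resets to n.
    """
    value = value.strip()
    if value.startswith("="):
        return int(value[1:])
    # int() accepts an optional leading '+' or '-', covering both other forms.
    return current + int(value)


count = 0
for update in ["+5", "2", "-3"]:
    count = apply_counter_update(count, update)
reset = apply_counter_update(count, "=10")
```

Because the client sends a delta and never reads the current value, many writers can bump the same counter without a read-modify-write round trip.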

Secondary Indexes
Total Cells Inserted:              1 billion
Total Time Taken:                  45 minutes
Aggregate Throughput (inserts/s):  372,362
Aggregate Throughput (bytes/s):    14,763,300
•  Six test machines
o  Dual Six-core Opteron HE Processors
o  24 GB RAM
o  4X 2TB SATA drives
•  Single Indexed column
o  Key: randomly generated 20-byte integer
o  Value: two randomly chosen words from /usr/share/dict/words

Secondary Indexes (HQL)
CREATE TABLE products (
title,
section,
info,
category,
INDEX section,
INDEX info,
QUALIFIER INDEX info,
QUALIFIER INDEX category
);
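
The two index kinds declared above answer different questions: INDEX supports lookups by cell value, while QUALIFIER INDEX supports lookups by the presence of a column qualifier. The toy model below illustrates that distinction under the assumption of simple in-memory maps. The slides do not describe how Hypertable stores its indexes, so the class and its methods are hypothetical.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table that maintains a value index and a qualifier index."""

    def __init__(self, value_indexed=(), qualifier_indexed=()):
        self.cells = {}  # (row, family, qualifier) -> value
        self.value_indexed = set(value_indexed)
        self.qualifier_indexed = set(qualifier_indexed)
        self.value_index = defaultdict(set)      # (family, qualifier, value) -> rows
        self.qualifier_index = defaultdict(set)  # (family, qualifier) -> rows

    def insert(self, row, family, qualifier, value):
        # Note: overwriting a cell does not remove its old index entry here.
        self.cells[(row, family, qualifier)] = value
        if family in self.value_indexed:
            self.value_index[(family, qualifier, value)].add(row)
        if family in self.qualifier_indexed:
            self.qualifier_index[(family, qualifier)].add(row)

    def rows_where_equal(self, family, qualifier, value):
        """Rows matching WHERE family:qualifier = value."""
        return sorted(self.value_index.get((family, qualifier, value), ()))

    def rows_where_exists(self, family, qualifier):
        """Rows matching WHERE Exists(family:qualifier)."""
        return sorted(self.qualifier_index.get((family, qualifier), ()))


products = IndexedTable(value_indexed={"section", "info"},
                        qualifier_indexed={"info", "category"})
products.insert("B00002VWE0", "info", "actor", "Jack Nicholson")
products.insert("B002VWNIDG", "info", "actor", "Jack Nicholson")
products.insert("0307743659", "info", "author", "Stephen King")
```

Both lookups return row keys without scanning the table, which is the point of declaring the index up front.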

Secondary Indexes
SELECT title
FROM products
WHERE info:actor = "Jack Nicholson";
B00002VWE0 title Five Easy Pieces (1970)
B002VWNIDG title The Shining (1980)

Secondary Indexes
SELECT title, info:author
FROM products
WHERE info:author =~ /^Stephen [PK]/;
0307743659 title The Shining Mass Market Paperback
0307743659 info:author Stephen King
0321776402 title C++ Primer Plus (6th Edition)
(Developer's Library)
0321776402 info:author Stephen Prata

Secondary Indexes
SELECT title
FROM products
WHERE Exists(info:studio);
B000Q66J1M title 2001: A Space Odyssey [Blu-ray]

Secondary Indexes
SELECT title
FROM products
WHERE info:author =~ /^Stephen P/ OR
info:publisher =~ /^Anchor/;

Secondary Indexes
SELECT title
FROM products
WHERE info:author =~ /^Stephen [PK]/ AND
info:publisher =~ /^Anchor/;

Secondary Indexes
SELECT title
FROM products
WHERE ROW =^ 'B' AND
info:actor = 'Jack Nicholson';

Regex Filtering
•  Google’s RE2 regular expression engine
o  Extremely fast (up to 50X Java regex)
o  Searches run in time linear in the size of the
input
o  Searches constrained to a fixed amount of
memory
•  Supported Searches:
o  Row key
o  Column qualifier
o  Value

Regex Filtering
SELECT info:/^a/ FROM products;
0307743659 info:author Stephen King
0321321928 info:author Stephen C. Dewhurst
0321776402 info:author Stephen Prata
B00002VWE0 info:actor Karen Black
B00002VWE0 info:actor Jack Nicholson
B000Q66J1M info:actor Gary Lockwood
B000Q66J1M info:actor Keir Dullea
B002VWNIDG info:actor Shelley Duvall
B002VWNIDG info:actor Jack Nicholson
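
The qualifier filter `info:/^a/` keeps cells in the info column family whose qualifier matches the regex. The sketch below imitates that selection with Python's `re` module purely to show the matching rule. Python's engine is a backtracking matcher, not RE2, so it does not carry the linear-time guarantee described earlier. The publisher value is a hypothetical cell added for contrast.

```python
import re

# (row, column family, column qualifier, value)
cells = [
    ("0307743659", "info", "author", "Stephen King"),
    ("0307743659", "info", "publisher", "Anchor"),  # hypothetical value
    ("0307743659", "title", "", "The Shining Mass Market Paperback"),
    ("B00002VWE0", "info", "actor", "Jack Nicholson"),
]


def select_qualifier_regex(cells, family, pattern):
    """Keep cells in `family` whose qualifier matches `pattern`."""
    rx = re.compile(pattern)
    return [c for c in cells if c[1] == family and rx.search(c[2])]


matches = select_qualifier_regex(cells, "info", r"^a")
```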

Regex Filtering
SELECT title
FROM products
WHERE ROW REGEXP "2";
0321321928 title C++ Common Knowledge: Essential
Intermediate Programming [Paperback]

Regex Filtering
SELECT title
FROM products
WHERE VALUE REGEXP "(";

Hadoop MapReduce
•  MapReduce Input/Output formats
o  Normal (mapreduce)
o  Streaming (mapred)

Hive Integration
•  Load data from HT to Hive and vice-versa
•  Use Hive types
•  Use Hive QL (joins, aggregations)
•  Low latency data warehousing
•  Uses Hypertable’s native MapReduce Input/Output
format

Column Family Options
•  TTL=<t>
o  “time to live”
o  Remove cells that are older than <t>
•  MAX_VERSIONS=<n>
o  Keep only most recent <n> cell versions
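
The two options combine naturally: drop versions past their time to live, then cap what remains at the newest n. A minimal sketch of that pruning rule, assuming timestamps and TTL share the same unit (the function is hypothetical and not taken from Hypertable's code):

```python
def prune_versions(versions, now, ttl=None, max_versions=None):
    """Return the cell versions that survive TTL and MAX_VERSIONS, newest first.

    versions: list of (timestamp, value) pairs for a single cell.
    """
    kept = sorted(versions, key=lambda v: v[0], reverse=True)
    if ttl is not None:
        # Remove versions older than ttl.
        kept = [(ts, val) for ts, val in kept if now - ts <= ttl]
    if max_versions is not None:
        # Keep only the most recent max_versions versions.
        kept = kept[:max_versions]
    return kept


history = [(100, "a"), (160, "b"), (190, "c")]
```

With `now=200`, a TTL of 50 drops the version written at 100, and `max_versions=1` keeps only the version written at 190.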

Access Groups
CREATE TABLE User (
name,
address,
photo,
profile,
ACCESS GROUP default (name, address, photo),
ACCESS GROUP profile (profile)
);

Adaptive
Memory Allocation

Group Commit
•  Supports highly concurrent updates
•  Trades average latency for better throughput
•  By default, commit log writes are auto-coalesced
•  Commit log write interval can be statically
configured per-table:
CREATE TABLE counts (
url,
domain
) GROUP_COMMIT_INTERVAL=100;
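
Group commit can be pictured as batching: updates that arrive within one interval share a single commit log write. The simulation below is a sketch under the assumption that the interval is expressed in milliseconds. The class is hypothetical and uses explicit timestamps in place of a real clock so the behaviour is deterministic.

```python
class GroupCommitLog:
    """Toy commit log that coalesces updates arriving within one interval."""

    def __init__(self, interval_ms):
        self.interval_ms = interval_ms
        self.pending = []          # updates waiting for the next log write
        self.window_start = None   # arrival time of the first pending update
        self.log_writes = []       # each entry is one batch written to the log

    def submit(self, now_ms, update):
        if self.window_start is None:
            self.window_start = now_ms
        elif now_ms - self.window_start >= self.interval_ms:
            self.flush()           # interval elapsed: write the batch
            self.window_start = now_ms
        self.pending.append(update)

    def flush(self):
        if self.pending:
            self.log_writes.append(self.pending)
            self.pending = []
        self.window_start = None


log = GroupCommitLog(interval_ms=100)
for arrival, update in [(0, "u1"), (10, "u2"), (99, "u3"), (100, "u4"), (150, "u5")]:
    log.submit(arrival, update)
log.flush()
```

Five updates end up in two log writes. Each update may wait up to one interval before it is durable, which is the latency-for-throughput trade described above.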

Caching
•  Block Cache
o  Caches CellStore blocks
o  Can be configured to store blocks compressed or
uncompressed (default = compressed)
o  Dynamically adjusted size based on workload
•  Query Cache
o  Caches query results
o  Caches single row queries only

Compression
•  Cell Store blocks are compressed
•  Commit Log updates are compressed
•  Supported Compression Schemes:
bmz, lzo, quicklz, snappy, zlib, none
•  Quicklz performance numbers:
Language   Compression Speed (MB/s)   Decompression Speed (MB/s)
C++        308                        358
Java       127                        95
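
Block compression is a lossless round trip: a block is compressed before it is written and decompressed when read. The sketch below uses zlib, one of the schemes listed above, through Python's standard library. The block contents are a made-up stand-in for a CellStore block, and the snippet says nothing about Hypertable's actual block format or its measured speeds.

```python
import zlib

# Hypothetical block: repetitive key/value text, as sorted cells tend to be.
block = b"B00002VWE0\tinfo:actor\tJack Nicholson\n" * 1000

compressed = zlib.compress(block)
restored = zlib.decompress(compressed)

# Fraction of the original size that remains after compression.
ratio = len(compressed) / len(block)
```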

Hypertable vs. HBase
•  Modeled after test described in Bigtable paper
•  Hypertable 0.9.5.5 vs. HBase 0.90.4
•  16-node Cluster
o  CPU: 2X AMD C32 Six-core model 4170 HE 2.1GHz
o  RAM: 24GB
o  Disk: 4X 2TB SATA
•  Tests Run
o  Random Write
o  Scan
o  Random Read Zipfian
o  Random Read Uniform

Case Study:
Noah System
•  Operational Data Store
•  System metrics
o  CPU
o  Memory
o  IO
o  Network
•  Application metrics
o  Web
o  DB
o  Caches
•  Business metrics
o  Usage
o  Revenue

System
Requirements
•  Storage Capacity
o  Up to 100TB
o  Up to 1 trillion records
•  Automatic Sharding
o  Irregular data growth patterns
•  Heavy Writes
o  ~30K inserts/s
•  Fast Reads of Recent Data
•  Table Scans

Case Study:
Rediff
•  2nd Largest Indian Internet Portal
•  Rediffmail
o  One of the world’s largest email services
o  Over 100 Million registered users
•  Active Deployments
o  Rediffmail
o  Email SPAM classification
o  News Crawl Database
o  Recommendation System

Summary
•  High Performance
•  Open Source
•  Future Direction SQL
