High-Performance Storage Services with HailDB and Java

High-Performance
Storage Services
With Java and HailDB

Sunny Gleason
April 14, 2011

whoami
• Sunny Gleason, human
• passion: distributed systems engineering
• previous...
Ning : custom social networks
Amazon.com : infra & web services
• now...
building cloud infrastructure

whereami

• twitter : twitter.com/sunnygleason
• github : github.com/sunnygleason
• linkedin : linkedin.com/in/sunnygleason
• slideshare : slideshare.net/sunnygleason

what’s in this presentation?
• MySQL & NoSQL as Inspiration
• HailDB & InnoDB
• JNA: Integration with Java
• St8 : A REST-Enabled Data Store
• A Handful of Nifty Applications
• Results & Next Steps

prior art
• Mad props to:

• MySQL & InnoDB teams for creating InnoDB
and Embedded InnoDB

• Stewart Smith & Drizzle folks for leading the
HailDB charge and encouraging plugin apis

• Nokia & Percona for publishing results of their
Voldemort / MySQL integration

• Basho for publishing Riak / InnoStore integration

MySQL & InnoDB
• Super-Efficient Database Server
• Tried & True Replication
• Bulletproof Durability (when configured
correctly)
• Fantastic Stability, Predictability & Insight
into Operation

motivation

• database on 1 box : ok
• database with master/slave replication : ok
• database on cluster : tricky
• database on SAN : scary

NoSQL

• “Not Only” SQL
• What’s the point?
• Proponent: “reaching next level of scale”
• Cynic: “cloud is hype, ops nightmare”

what does it gain?

• Higher performance, scalability, availability
• More robust fault-tolerance
• Simplified systems design
• Easier operations

what does it lose?
• Reduced / simplified programming model
• No ad-hoc queries, no joins, no transactions
• Not ACID: weakened Atomicity /
Consistency / Isolation / Durability
• Operations / management is still evolving
• Challenging to quantify health of system
• Fewer domain experts

NoSQL Map
• KV Stores (volatile): Memcached, Redis
• KV Stores (durable): Dynamo, Voldemort, Riak
• Document Store: CouchDB, MongoDB
• Column Store: Cassandra, BigTable, HBase
• Graph Store: Neo4J

durable vs. volatile

• RAM is ridiculously fast (ns), but not durable
• Disk is persistent but slow (3-7ms)
• RAID eases the pain a bit (4-8x throughput)
• SSD shows good promise (100-300us)
• FusionIO is redefining the space (30-100us)

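performance & operational complexity*

[Figure: conceptual plot of operational complexity (vertical axis) against aggregate operations / sec (horizontal axis, 1K to 1M). Labeled points: Memcached, MySQL, Voldemort, +Cluster, +SSD, +FusionIO, +Sharding.]
* This is not a real graph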

just a thought...

What if we could use the highly optimized &
durable ‘guts’ of MySQL without having to go
through JDBC & SQL?

enter HailDB
• use case: Voldemort Storage Engine
• let’s evaluate relative to other NoSQL
options
• focus on stability & predictability of
performance
• Graphs are throughput (ops/sec) vs. time

Voldemort schema

_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_key, _version)

experimental setup
• OS X: 8-Core Xeon, 32GB RAM, 200GB
OWC SSD
• Faban Benchmark : PUT 64-byte key,
1024-byte value
• Scenarios: 1, 2, 4, 8 threads
• 512M Java Heap

BDB-JE

• Log-Structured B-Tree
• Fast Storage When Mostly Cached
• Configured without fsync() by default -
writes are batched and flushed periodically

Krati

• Fast Hash-Oriented Storage
• Uses memory-mapped files for speed
• Configured without fsync() by default -
writes are batched and flushed periodically

HailDB & Java
• g414-haildb : where the magic happens
• Open Source on GitHub
• uses JNA: Java Native Access
• dynamic binding to libhaildb shared library
• auto-generate initial Java class from .h file
(w/ JNAerator)
• Pointer classes & other shenanigans

implementation gotchas
• InnoDB API-level usage is unclear
• Synchronization & locking are unclear
• Therefore... I learned to love reading C
• Error handling is *nasty*
• Native library installation is a bit of a pain
(need to configure LD_LIBRARY_PATH)

kinder, friendlier APIs
• Level 0: JNA bindings
int err = ib_dostuff();
• Level 1: Object-Oriented
Transaction t = db.openTransaction();
t.commit();
• Level 2: Templated
dbt.inTransaction() { dbt.insert(value); }
• Level 3: Functional
Maps, Iteration, Filters, Apply
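
The layering above can be sketched without the native library. In this sketch, `fakeNativeInsert` is a hypothetical stand-in for a JNA-bound function that returns an integer error code (0 is assumed to mean success), and `Transaction` and `inTransaction` are illustrative names, not the g414-haildb API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Illustrative layering of API levels 0-2; none of these names come from g414-haildb. */
final class ApiLevelsSketch {
    /** Level 0 stand-in: a "native" call that reports failure through an int code. */
    static int fakeNativeInsert(List<String> table, String value) {
        if (value == null) return 1;   // hypothetical non-zero error code
        table.add(value);
        return 0;                      // assumed success code
    }

    /** Turns an error code into an exception so callers cannot ignore it. */
    static void check(int err) {
        if (err != 0) throw new IllegalStateException("native error code " + err);
    }

    /** Level 1: object-oriented transaction that buffers writes until commit. */
    static final class Transaction {
        private final List<String> table;
        private final List<String> pending = new ArrayList<>();

        Transaction(List<String> table) { this.table = table; }

        void insert(String value) {
            if (value == null) throw new IllegalArgumentException("null value");
            pending.add(value);
        }

        void commit() {
            for (String v : pending) check(fakeNativeInsert(table, v));
            pending.clear();
        }

        void rollback() { pending.clear(); }
    }

    /** Level 2: template that commits on success and rolls back on any exception. */
    static <T> T inTransaction(List<String> table, Function<Transaction, T> work) {
        Transaction t = new Transaction(table);
        try {
            T result = work.apply(t);
            t.commit();
            return result;
        } catch (RuntimeException e) {
            t.rollback();
            throw e;
        }
    }
}
```

The point of the higher levels is that callers never see raw error codes and cannot forget to commit or roll back.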

St8 Server
• HTTP-enabled Access to HailDB
• PUT /1.0/t/mytable
{
"columns":[
  {"name":"a","type":"INT","length":4},
  {"name":"b","type":"INT","length":8},
  {"name":"c","type":"BLOB","length":0}
],
"indexes":[
  {
   "name":"P",
   "clustered":true,"unique":true,
   "indexColumns":[{"name":"a"}]
  }
]
}

rest-enabled access

• GET /1.0/d/mytable;a=0
• POST /1.0/d/mytable;a=1;b=42;c=xyz
• PUT /1.0/d/mytable;a=1;b=43;c=abc
• DELETE /1.0/d/mytable;a=0
*This is matrix-param style, can also use form
data style for specifying data
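
As a rough illustration of the matrix-param style, the sketch below splits a path such as `/1.0/d/mytable;a=1;b=42;c=xyz` into a table name and column values. The class and method names are hypothetical and are not taken from St8. Percent-decoding is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for matrix-style parameters on the last path segment. */
final class MatrixParamsSketch {
    /** Returns the last path segment without its ";name=value" suffixes. */
    static String tableName(String path) {
        String last = path.substring(path.lastIndexOf('/') + 1);
        int semi = last.indexOf(';');
        return semi < 0 ? last : last.substring(0, semi);
    }

    /** Returns the ";name=value" pairs of the last path segment, in order. */
    static Map<String, String> params(String path) {
        String last = path.substring(path.lastIndexOf('/') + 1);
        Map<String, String> out = new LinkedHashMap<>();
        String[] parts = last.split(";");
        for (int i = 1; i < parts.length; i++) {   // parts[0] is the table name
            int eq = parts[i].indexOf('=');
            if (eq <= 0) continue;                 // skip empty or malformed segments
            out.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
        return out;
    }
}
```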

cursors & iterators
• GET /1.0/i/mytable.P?q=a+ge+4
• GET /1.0/i/mytable.SecIndex?q=b+le+4
• GET /1.0/i/mytable.SecIndex?q=b+le+4&s=abce1212121ceeee2120911

• “s” value is opaque index key of next page
of results - way better than LIMIT/OFFSET!
(since HailDB can seek directly to the row)
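
The idea behind the "s" token can be sketched with a sorted map standing in for the index. This is an assumption-laden illustration: a `TreeMap` plays the role of the B-tree, and the continuation token is simply the next key rather than an opaque encoded value. The names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Keyset ("seek") pagination over a sorted map standing in for an index. */
final class KeysetPagingSketch {
    /** One page of keys plus the key to resume from (null when exhausted). */
    static final class Page {
        final List<Integer> keys;
        final Integer next;

        Page(List<Integer> keys, Integer next) {
            this.keys = keys;
            this.next = next;
        }
    }

    /** Seeks directly to 'from' and reads at most pageSize keys. */
    static Page page(NavigableMap<Integer, String> index, int from, int pageSize) {
        List<Integer> keys = new ArrayList<>();
        Integer next = null;
        for (Integer k : index.tailMap(from, true).keySet()) {
            if (keys.size() == pageSize) {
                next = k;          // first key of the following page
                break;
            }
            keys.add(k);
        }
        return new Page(keys, next);
    }
}
```

Each page costs one seek plus `pageSize` reads, whereas LIMIT/OFFSET has to walk past every skipped row before it returns anything.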

result
• REST API provides fun, straightforward
access from Ruby, Python, Java, command
line...
• very easy benchmarking with HTTP-based
performance tools
• range query support, and more efficient
iteration model for large result sets than
MySQL provides

high-performance counts

• GET /1.0/counts/mykey
0
• POST /1.0/counts/mykey[?inc=1]
1
• POST /1.0/counts/mykey?inc=42
43
• DELETE /1.0/counts/mykey
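
The request/response pairs above imply simple semantics: a missing key reads as 0, POST adds `inc` (default 1) and returns the new value, and DELETE removes the key. A minimal in-memory sketch of those semantics follows. It is not durable and does not use HailDB, and the class name is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for the counts resource; illustrates semantics only. */
final class CountsSketch {
    private final ConcurrentMap<String, Long> counts = new ConcurrentHashMap<>();

    /** GET: a key that was never incremented reads as 0. */
    long get(String key) {
        return counts.getOrDefault(key, 0L);
    }

    /** POST with the default increment of 1. */
    long increment(String key) {
        return increment(key, 1L);
    }

    /** POST ?inc=n: atomically adds n and returns the new value. */
    long increment(String key, long inc) {
        return counts.merge(key, inc, Long::sum);
    }

    /** DELETE: removes the key so that it reads as 0 again. */
    void delete(String key) {
        counts.remove(key);
    }
}
```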

counts schema
• HailDB count service schema
_id int 8-byte unsigned,
_key_hash int 8-byte unsigned,
_key varchar(80),
_count int 8-byte unsigned

primary key (“_id”)
unique key (“_key_hash”, “_key”)

operation: graph store
• Social networks, recommendations, any
relation you can think of
• Which would you prefer?
• SQL adjacency list, stored procedure,
custom storage engine, external
(Memcached), ...
• Graph-aware HailDB application in Java

nifty graph store 1
[Figure: example graph with nodes 1, 2, 3, 4, 5, 6 and 8; the edges are not recoverable from this transcript]

GET /1.0/graph/bfs?a=1&maxDepth=3
=> [[1, 0], [2, 1], [3, 2], [4, 3], [5, 3]]
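
A depth-limited breadth-first search that returns [node, depth] pairs could look like the sketch below. The adjacency-map representation and the class name are assumptions, and any edge set passed in is hypothetical because the slide's figure did not survive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Depth-limited BFS over an adjacency map; returns [node, depth] pairs. */
final class BfsSketch {
    static List<List<Integer>> bfs(Map<Integer, List<Integer>> edges, int start, int maxDepth) {
        List<List<Integer>> out = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        Queue<int[]> queue = new ArrayDeque<>();   // entries are {node, depth}
        seen.add(start);
        queue.add(new int[] {start, 0});
        while (!queue.isEmpty()) {
            int[] cur = queue.remove();
            out.add(List.of(cur[0], cur[1]));
            if (cur[1] == maxDepth) continue;      // do not expand past the limit
            for (int next : edges.getOrDefault(cur[0], List.of())) {
                if (seen.add(next)) queue.add(new int[] {next, cur[1] + 1});
            }
        }
        return out;
    }
}
```

With the hypothetical edges 1→2, 2→3, 3→4, 3→5, 5→6, 6→8, calling `bfs(edges, 1, 3)` yields the same pairs as the response shown above.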

nifty graph store 2
[Figure: example graph with nodes 1, 2, 3, 4, 5, 6 and 8; the edges are not recoverable from this transcript]

GET /1.0/graph/topo?a=1&a=5&a=8
=> [8, 6, 4, 3, 2, 5, 1]
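
One way to produce an ordering like this is a depth-first post-order, which emits every node after the nodes it points to. The sketch below assumes an acyclic graph and does no cycle detection. Whether St8 uses this algorithm is not stated in the slides, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Topological ordering via depth-first post-order (assumes no cycles). */
final class TopoSketch {
    static List<Integer> topo(Map<Integer, List<Integer>> edges, List<Integer> roots) {
        List<Integer> out = new ArrayList<>();
        Set<Integer> done = new HashSet<>();
        for (int root : roots) visit(root, edges, done, out);
        return out;
    }

    private static void visit(int node, Map<Integer, List<Integer>> edges,
                              Set<Integer> done, List<Integer> out) {
        if (!done.add(node)) return;               // already visited
        for (int next : edges.getOrDefault(node, List.of())) {
            visit(next, edges, done, out);
        }
        out.add(node);                             // emitted after all successors
    }
}
```

With the hypothetical edges 1→2, 1→5, 2→3, 3→4, 4→6, 5→6, 6→8 and roots 1, 5, 8, this produces the same list as the response shown above.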

nifty recovery tool
(Just an idea)

• for recovery: shut down mysql server
• run HailDB-enabled recovery tool
• export as JSON or whatever

wrap-up
• HailDB & InnoDB are phenomenal
• With g414-haildb, can be integrated directly
into applications running on the JVM
• All the InnoDB tuning tricks apply
• Opens up new applications that are tricky
with a traditional SQL database

resources

• github.com/sunnygleason/g414-st8
github.com/sunnygleason/g414-haildb
• haildb.com
• jna.dev.java.net

bonus material!

• we probably didn’t get this far in the live
presentation; the following material is here
for eager, brave & interested folks...

future work
• Improve Packaging / Installation
• Codify schema refinements & perf
enhancements
• Online backup/export with XtraBackup
• JNI Bindings
• PBXT explorations

InnoDB tuning
• Skinny columns, skinny rows! (esp. Primary Key)
• VARCHAR for enumerated values ‘bad’; ENUM, INT or SMALLINT ‘good’
• fixed-width rows allow in-place updates
• Use covering indexes strategically
• More data per page means faster index scans,
more efficient buffer pool utilization
• You only get so many transactions (read & write) on a given
CPU/RAM configuration - benchmark this!
• Strategically offload reads to Memcached/Redis

HailDB schema

_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_key, _version)

refined schema
_id BIGINT (auto increment)
_key_hash BIGINT
_key VARBINARY(200)
_value BLOB
PRIMARY KEY(_id)
KEY(_key_hash)
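
Because KEY(_key_hash) is not unique, a lookup by hash has to compare the full _key to resolve collisions. The sketch below illustrates that read path in memory. The 64-bit FNV-1a hash is an assumption (the slides do not name a hash function), and the class is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

/** In-memory illustration of lookup through a non-unique hash index. */
final class KeyHashLookupSketch {
    private static final class Row {
        final long id;        // stands in for the auto-increment _id
        final byte[] key;     // _key
        byte[] value;         // _value

        Row(long id, byte[] key, byte[] value) {
            this.id = id;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<Long, List<Row>> byHash = new HashMap<>();   // KEY(_key_hash)
    private final ToLongFunction<byte[]> hash;
    private long nextId = 1;

    KeyHashLookupSketch(ToLongFunction<byte[]> hash) {
        this.hash = hash;
    }

    /** 64-bit FNV-1a; an assumed choice of hash function. */
    static long fnv1a64(byte[] data) {
        long h = 0xcbf29ce484222325L;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Inserts or overwrites; returns the row id. */
    long put(byte[] key, byte[] value) {
        List<Row> bucket = byHash.computeIfAbsent(hash.applyAsLong(key), h -> new ArrayList<>());
        for (Row r : bucket) {
            if (Arrays.equals(r.key, key)) {       // same hash and same key
                r.value = value;
                return r.id;
            }
        }
        Row row = new Row(nextId++, key, value);
        bucket.add(row);
        return row.id;
    }

    /** Returns the value for key, or null if absent. */
    byte[] get(byte[] key) {
        for (Row r : byHash.getOrDefault(hash.applyAsLong(key), List.of())) {
            if (Arrays.equals(r.key, key)) return r.value;   // resolve collisions by full key
        }
        return null;
    }
}
```

Passing `KeyHashLookupSketch::fnv1a64` to the constructor gives the normal behaviour; passing a constant function forces every key to collide, which is a convenient way to exercise the comparison step.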

online backup

• hot backup of data to other machine /
destination
• test Percona Xtrabackup with HailDB
• next step: backup/export to Hadoop/HDFS
(similar to Cloudera Sqoop tool)

JNI bindings

• JNI can get 2-5x perf boost vs. JNA
• ... at the expense of nasty code
• Will go for schema optimizations and
InnoDB tuning tips *first*
