High-Performance
Storage Services
   With Java and HailDB


      Sunny Gleason
      April 14, 2011
whoami
• Sunny Gleason, human
• passion: distributed systems engineering
• previous...
   Ning : custom social networks
   Amazon.com : infra & web services
• now...
   building cloud infrastructure
whereami

• twitter : twitter.com/sunnygleason
• github : github.com/sunnygleason
• linkedin : linkedin.com/in/sunnygleason
• slideshare : slideshare.net/sunnygleason
what’s in this presentation?
  • MySQL & NoSQL as Inspiration
  • HailDB & InnoDB
  • JNA: Integration with Java
  • St8 : A REST-Enabled Data Store
  • A Handful of Nifty Applications
  • Results & Next Steps
prior art
•   Mad props to:

    •   MySQL & InnoDB teams for creating InnoDB
        and Embedded InnoDB

    •   Stewart Smith & Drizzle folks for leading the
        HailDB charge and encouraging plugin apis

    •   Nokia & Percona for publishing results of their
        Voldemort / MySQL integration

    •   Basho for publishing Riak / InnoStore integration
MySQL & InnoDB
• Super-Efficient Database Server
• Tried & True Replication
• Bulletproof Durability (when configured
  correctly)
• Fantastic Stability, Predictability & Insight
  into Operation
motivation

• database on 1 box : ok
• database with master/slave replication : ok
• database on cluster : tricky
• database on SAN : scary
NoSQL

• “Not Only” SQL
• What’s the point?
• Proponent: “reaching next level of scale”
• Cynic: “cloud is hype, ops nightmare”
what does it gain?

• Higher performance, scalability, availability
• More robust fault-tolerance
• Simplified systems design
• Easier operations
what does it lose?
• Reduced / simplified programming model
• No ad-hoc queries, no joins, no txns
• Not ACID: Weakened Atomicity /
  Consistency / Isolation / Durability
• Operations / management is still evolving
• Challenging to quantify health of system
• Fewer domain experts
NoSQL Map
• Key-Value Store
   KV Stores (volatile) : Memcached, Redis
   KV Stores (durable) : Dynamo, Voldemort, Riak
• Document Store : CouchDB, MongoDB
• Column Store : Cassandra, BigTable, HBase
• Graph Store : Neo4J
durable vs. volatile

• RAM is ridiculously fast (ns) but not durable
• Disk is persistent but slow (3-7ms)
• RAID eases the pain a bit (4-8x throughput)
• SSD shows good promise (100-300us)
• FusionIO is redefining the space (30-100us)
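
As a back-of-the-envelope sketch of what those latencies imply, the snippet below converts a per-operation latency into the maximum number of strictly serial operations per second (1 / latency). The class and method names are illustrative, and the inputs are values picked from the ranges quoted above, not measurements.

```java
// Back-of-the-envelope: serial ops/sec implied by a per-operation latency.
// Names are illustrative; latencies are picked from the ranges on the slide.
final class LatencyMath {
    /** Max strictly serial operations per second for a latency in microseconds. */
    static double opsPerSecond(double latencyMicros) {
        return 1_000_000.0 / latencyMicros;
    }

    public static void main(String[] args) {
        System.out.printf("disk 5 ms      : %.0f ops/sec%n", opsPerSecond(5_000));
        System.out.printf("SSD 200 us     : %.0f ops/sec%n", opsPerSecond(200));
        System.out.printf("FusionIO 50 us : %.0f ops/sec%n", opsPerSecond(50));
    }
}
```

Concurrency, batching and caching all raise real throughput above this serial bound, which is why the figures are only a starting point for comparison.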
performance & operational complexity*

(Figure: sketch of Complexity (y-axis) vs. Aggregate Operations / Sec (x-axis,
1K to 1M), placing Memcached, MySQL, Voldemort, +SSD, +FusionIO, +Cluster and
+Sharding)

* This is not a real graph
just a thought...


What if we could use the highly optimized &
durable ‘guts’ of MySQL without having to go
through JDBC & SQL?
enter HailDB
• use case: Voldemort Storage Engine
• let’s evaluate relative to other NoSQL
  options
• focus on stability & predictability of
  performance
• Graphs are throughput (ops/sec) vs. time
Voldemort schema

_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_key, _version)
experimental setup
• OS X: 8-Core Xeon, 32GB RAM, 200GB
  OWC SSD
• Faban Benchmark : PUT 64-byte key,
  1024-byte value
• Scenarios: 1, 2, 4, 8 threads
• 512M Java Heap
BDB-JE

• Log-Structured B-Tree
• Fast Storage When Mostly Cached
• Configured without fsync() by default -
  writes are batched and flushed periodically
Perf: BDB Put 100%
Krati

• Fast Hash-Oriented Storage
• Uses memory-mapped files for speed
• Configured without fsync() by default -
  writes are batched and flushed periodically
Perf: Krati Put 100%
Perf: HailDB Put 100%
HailDB & Java
• g414-haildb : where the magic happens
• Open Source on GitHub
• uses JNA: Java Native Access
• dynamic binding to libhaildb shared library
• auto-generate initial Java class from .h file
  (w/ JNAerator)
• Pointer classes & other shenanigans
implementation gotchas
• InnoDB API-level usage is unclear
• Synchronization & locking is unclear
• Therefore... I learned to love reading C
• Error handling is *nasty*
• Native library installation a bit of a pain
  (need to configure LD_LIBRARY_PATH)
kinder, friendlier APIs
• Level 0: JNA bindings
    int err = ib_dostuff();
• Level 1: Object-Oriented
   Transaction t = db.openTransaction();
   t.commit();
• Level 2: Templated
    dbt.inTransaction() { dbt.insert(value); }
• Level 3: Functional
    Maps, Iteration, Filters, Apply
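
A minimal sketch of how levels 0 to 2 can stack, assuming a hypothetical in-memory `FakeEngine` in place of the native library. None of the names below are the real g414-haildb API; the point is only that Level 1 turns int error codes into exceptions and Level 2 wraps commit/rollback around a callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of API levels 0-2. FakeEngine stands in for the
// native library; none of these names are the real g414-haildb API.
final class ApiLevels {
    static final int OK = 0;          // assumed success code
    static final int ERR_CLOSED = 1;  // assumed failure code

    /** Level 0: C-style calls that report failure via an int code. */
    static final class FakeEngine {
        final List<String> committed = new ArrayList<>();

        int insert(List<String> pending, String value, boolean open) {
            if (!open) return ERR_CLOSED;
            pending.add(value);
            return OK;
        }
    }

    /** Level 1: object-oriented wrapper that turns error codes into exceptions. */
    static final class Transaction {
        private final FakeEngine engine;
        private final List<String> pending = new ArrayList<>();
        private boolean open = true;

        Transaction(FakeEngine engine) { this.engine = engine; }

        void insert(String value) {
            int err = engine.insert(pending, value, open);
            if (err != OK) throw new IllegalStateException("engine error " + err);
        }

        void commit()   { engine.committed.addAll(pending); close(); }
        void rollback() { close(); }
        private void close() { pending.clear(); open = false; }
    }

    /** Level 2: template that commits on success and rolls back on failure. */
    static void inTransaction(FakeEngine engine, Consumer<Transaction> work) {
        Transaction t = new Transaction(engine);
        try {
            work.accept(t);
            t.commit();
        } catch (RuntimeException e) {
            t.rollback();
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeEngine engine = new FakeEngine();
        inTransaction(engine, t -> t.insert("value"));
        System.out.println(engine.committed); // [value]
    }
}
```

With the template in place, callers never see an error code or an unclosed transaction, which is the main payoff of moving up the levels.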
St8 Server
• HTTP-enabled Access to HailDB
• PUT /1.0/t/mytable
  {

  "columns":[
    {"name":"a","type":"INT","length":4},
    {"name":"b","type":"INT","length":8},
    {"name":"c","type":"BLOB","length":0}
  ],
  "indexes":[
    {
     "name":"P",
     "clustered":true,"unique":true,
     "indexColumns":[{"name":"a"}]
    }
  ]
  }
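
For illustration, here is a sketch of building that table-creation request from Java 11+ with the standard `java.net.http` classes. The base URL (`http://localhost:8080`) and the `Content-Type` header are assumptions; only the path and the JSON shape come from the slide. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: build (but do not send) the table-creation request.
// The base URL and Content-Type are assumptions; path and body follow the slide.
final class CreateTableRequest {
    static HttpRequest build(String baseUrl, String table, String schemaJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/1.0/t/" + table))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(schemaJson))
                .build();
    }

    public static void main(String[] args) {
        String schema = "{\"columns\":[{\"name\":\"a\",\"type\":\"INT\",\"length\":4}],"
                + "\"indexes\":[{\"name\":\"P\",\"clustered\":true,\"unique\":true,"
                + "\"indexColumns\":[{\"name\":\"a\"}]}]}";
        HttpRequest request = build("http://localhost:8080", "mytable", schema);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running server.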
rest-enabled access

  • GET /1.0/d/mytable;a=0
  • POST /1.0/d/mytable;a=1;b=42;c=xyz
  • PUT /1.0/d/mytable;a=1;b=43;c=abc
  • DELETE /1.0/d/mytable;a=0
* This is matrix-param style; form data style can also be used
  for specifying data
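
A small sketch of assembling such matrix-param paths safely from Java. The helper name and the encoding choice are assumptions; the `/1.0/d/<table>;col=value` shape is taken from the examples above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: build a matrix-param path such as /1.0/d/mytable;a=1;b=42;c=xyz.
// The helper name and the choice of encoder are illustrative assumptions.
final class MatrixPath {
    static String dataPath(String table, Map<String, String> columns) {
        StringBuilder path = new StringBuilder("/1.0/d/").append(table);
        for (Map.Entry<String, String> e : columns.entrySet()) {
            path.append(';').append(encode(e.getKey()))
                .append('=').append(encode(e.getValue()));
        }
        return path.toString();
    }

    // URLEncoder targets form encoding, so a space becomes '+'; swap it for %20
    // to keep the value safe inside a path segment.
    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("a", "1");
        row.put("b", "42");
        row.put("c", "xyz");
        System.out.println(dataPath("mytable", row)); // /1.0/d/mytable;a=1;b=42;c=xyz
    }
}
```

Encoding matters here because `;` and `=` are the delimiters: a value containing either would otherwise be parsed as an extra column.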
cursors & iterators
• GET /1.0/i/mytable.P?q=a+ge+4
• GET /1.0/i/mytable.SecIndex?q=b+le+4
• GET /1.0/i/mytable.SecIndex?q=b+le+4
  &s=abce1212121ceeee2120911


• “s” value is opaque index key of next page
  of results - way better than LIMIT/OFFSET!
  (since HailDB can seek directly to the row)
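
The idea behind the "s" parameter can be sketched with a sorted map standing in for a HailDB index: each page seeks straight to its start key instead of skipping OFFSET rows. The `Page` type and method names are illustrative assumptions, and the real "s" token is opaque rather than a plain integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of cursor-style paging over a sorted index. A TreeMap stands in for
// a HailDB index; the Page type and method names are illustrative assumptions.
final class KeysetPaging {
    /** One page of values plus the key to resume from (null when exhausted). */
    static final class Page {
        final List<String> rows = new ArrayList<>();
        Integer next;
    }

    /** Returns up to pageSize rows with key >= start, seeking directly to start. */
    static Page page(NavigableMap<Integer, String> index, int start, int pageSize) {
        Page page = new Page();
        for (Map.Entry<Integer, String> e : index.tailMap(start, true).entrySet()) {
            if (page.rows.size() == pageSize) {
                page.next = e.getKey(); // first key of the following page
                break;
            }
            page.rows.add(e.getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int i = 1; i <= 7; i++) index.put(i, "row" + i);

        Page first = page(index, 4, 2);           // like q=a+ge+4
        Page second = page(index, first.next, 2); // like adding s=<next key>
        System.out.println(first.rows + " next=" + first.next);   // [row4, row5] next=6
        System.out.println(second.rows + " next=" + second.next); // [row6, row7] next=null
    }
}
```

The cost of fetching a page stays proportional to the page size, whereas LIMIT/OFFSET has to walk past every skipped row.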
result
• REST API provides fun, straightforward
  access from Ruby, Python, Java,
  Command-line...
• very easy benchmarking with HTTP-based
  performance tools
• range query support, and more efficient
  iteration model for large result sets than
  MySQL provides
high-performance counts

• GET /1.0/counts/mykey
  0
• POST /1.0/counts/mykey[?inc=1]
  1
• POST /1.0/counts/mykey?inc=42
  43
• DELETE /1.0/counts/mykey
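
The semantics of those four calls can be sketched in memory as follows. A `ConcurrentHashMap` stands in for the HailDB table, the class and method names are assumptions, and reading 0 after a DELETE is assumed from the fact that a fresh key reads as 0.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory sketch of the counter semantics shown above. A ConcurrentHashMap
// stands in for the HailDB table; class and method names are assumptions.
final class CountService {
    private final ConcurrentMap<String, Long> counts = new ConcurrentHashMap<>();

    /** GET: a missing key reads as 0. */
    long get(String key) { return counts.getOrDefault(key, 0L); }

    /** POST: atomically add inc and return the new value. */
    long increment(String key, long inc) { return counts.merge(key, inc, Long::sum); }

    /** POST without ?inc: the increment defaults to 1. */
    long increment(String key) { return increment(key, 1L); }

    /** DELETE: drop the counter so it reads as 0 again (assumed behavior). */
    void delete(String key) { counts.remove(key); }

    public static void main(String[] args) {
        CountService svc = new CountService();
        System.out.println(svc.get("mykey"));           // 0
        System.out.println(svc.increment("mykey"));     // 1
        System.out.println(svc.increment("mykey", 42)); // 43
        svc.delete("mykey");
        System.out.println(svc.get("mykey"));           // 0
    }
}
```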
counts schema
• HailDB count service schema
   _id int 8-byte unsigned,
   _key_hash int 8-byte unsigned,
   _key varchar(80),
   _count int 8-byte unsigned

   primary key (“_id”)
   unique key (“_key_hash”, “_key”)
raid0 put counts
ssd put counts
raid0 put/get
ssd put/get
operation: graph store
• Social networks, recommendations, any
  relation you can think of
• Which would you prefer?
 • SQL adjacency list, stored procedure,
    custom storage engine, external
    (Memcached), ...
 • Graph-aware HailDB application in Java
nifty graph store 1
(Figure: example graph with nodes 1, 2, 3, 4, 5, 6 and 8)

GET /1.0/graph/bfs?a=1&maxDepth=3
 => [[1, 0], [2, 1], [3, 2], [4, 3], [5, 3]]
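
A sketch of the depth-limited breadth-first search behind such a call, returning [node, depth] pairs. The edge set below is an assumption, chosen only so that the output is consistent with the response shown; it is not taken from the slide's diagram.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of depth-limited breadth-first search returning [node, depth] pairs.
// The adjacency map is an assumed edge set, not the slide's actual graph.
final class GraphBfs {
    static List<int[]> bfs(Map<Integer, List<Integer>> edges, int start, int maxDepth) {
        List<int[]> result = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        Deque<int[]> queue = new ArrayDeque<>();
        queue.add(new int[] {start, 0});
        seen.add(start);
        while (!queue.isEmpty()) {
            int[] current = queue.remove();
            result.add(current);
            if (current[1] == maxDepth) continue; // do not expand past the limit
            for (int next : edges.getOrDefault(current[0], List.of())) {
                if (seen.add(next)) queue.add(new int[] {next, current[1] + 1});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> edges = Map.of(
                1, List.of(2),
                2, List.of(3),
                3, List.of(4, 5),
                4, List.of(6),
                6, List.of(8));
        StringBuilder out = new StringBuilder();
        for (int[] pair : bfs(edges, 1, 3)) {
            out.append("[" + pair[0] + ", " + pair[1] + "] ");
        }
        System.out.println(out.toString().trim());
        // [1, 0] [2, 1] [3, 2] [4, 3] [5, 3]
    }
}
```

In a HailDB-backed version, the adjacency lookup would presumably be an index range scan on the source node rather than a map lookup.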
nifty graph store 2
(Figure: example graph with nodes 1, 2, 3, 4, 5, 6 and 8)

GET /1.0/graph/topo?a=1&a=5&a=8
       => [8, 6, 4, 3, 2, 5, 1]
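
One way to produce such an ordering is a depth-first post-order walk from the requested start nodes, so that each node appears after every node it points to. This is a sketch of that technique, not necessarily how St8 implements it, and the edge set is an assumption chosen to be consistent with the response shown.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of a topological ordering via depth-first post-order: a node is
// emitted only after everything reachable from it. The edge set is assumed
// (not the slide's actual graph) and must be acyclic.
final class GraphTopo {
    static List<Integer> topo(Map<Integer, List<Integer>> edges, List<Integer> roots) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        for (int root : roots) visit(root, edges, visited, order);
        return order;
    }

    private static void visit(int node, Map<Integer, List<Integer>> edges,
                              Set<Integer> visited, List<Integer> order) {
        if (!visited.add(node)) return; // already visited via another path
        for (int next : edges.getOrDefault(node, List.of())) {
            visit(next, edges, visited, order);
        }
        order.add(node);
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> edges = Map.of(
                1, List.of(2, 5),
                2, List.of(3),
                3, List.of(4),
                4, List.of(6),
                5, List.of(6),
                6, List.of(8));
        System.out.println(topo(edges, List.of(1, 5, 8))); // [8, 6, 4, 3, 2, 5, 1]
    }
}
```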
nifty recovery tool
                (Just an idea)


• for recovery: shut down mysql server
• run HailDB-enabled recovery tool
• export as JSON or whatever
wrap-up
• HailDB & InnoDB are phenomenal
• With g414-haildb, can be integrated directly
  into applications running on the JVM
• All the InnoDB tuning tricks apply
• Opens up new applications that are tricky
  with a traditional SQL database
resources

• github.com/sunnygleason/g414-st8
  github.com/sunnygleason/g414-haildb
• haildb.com
• jna.dev.java.net
Questions? Thank You!
bonus material!


• we probably didn’t get this far in the live
  presentation; the following material is here
  for eager, brave & interested folks...
future work
• Improve Packaging / Installation
• Codify schema refinements & perf
  enhancements
• Online backup/export with XtraBackup
• JNI Bindings
• PBXT explorations
InnoDB tuning
• Skinny columns, skinny rows! (esp. Primary Key)
 • VARCHAR used as an enum ‘bad’; ENUM, INT or SMALLINT ‘good’
 • fixed-width rows allow in-place updates
• Use covering indexes strategically
• More data per page means faster index scans,
  more efficient buffer pool utilization
• You only get so many transactions (read & write) on a given
  CPU/RAM configuration - benchmark this!
• Strategically offload reads to Memcached/Redis
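
To make "more data per page" concrete, here is a rough arithmetic sketch. 16 KB is InnoDB's default page size; the estimate ignores page headers, per-row overhead and fill factor, and the row sizes are made-up examples.

```java
// Rough arithmetic for "more data per page": how many rows fit in one page.
// 16 KB is InnoDB's default page size; the estimate ignores page headers,
// per-row overhead and fill factor, and the row sizes are made-up examples.
final class PageMath {
    static final int PAGE_BYTES = 16 * 1024;

    static int rowsPerPage(int rowBytes) {
        return PAGE_BYTES / rowBytes;
    }

    public static void main(String[] args) {
        System.out.println(rowsPerPage(400)); // 40 (e.g. a wide key of up to 400 bytes)
        System.out.println(rowsPerPage(16));  // 1024 (e.g. two 8-byte integers)
    }
}
```

Fewer pages per scan also means more of the working set fits in the buffer pool.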
HailDB schema

_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_key, _version)
refined schema
_id BIGINT (auto increment)
_key_hash BIGINT
_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_id)
KEY(_key_hash)
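
Because KEY(_key_hash) is not unique, a lookup has to find candidate rows by hash and then compare the full key. The sketch below shows that pattern in memory. FNV-1a is an assumed hash function (the slides do not say which one is used), the class and field names are illustrative, and `_version` is left out for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of lookup through a non-unique _key_hash index: find candidates by
// hash, then compare the full key. FNV-1a is an assumed hash function; the
// source does not say which one is used.
final class HashedKeyIndex {
    static final class Row {
        final long id;
        final String key;
        final String value;

        Row(long id, String key, String value) {
            this.id = id;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<Long, List<Row>> byHash = new HashMap<>(); // KEY(_key_hash)
    private long nextId = 1;                                      // _id auto increment

    /** 64-bit FNV-1a over the UTF-8 bytes of the key. */
    static long hash(String key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    void put(String key, String value) {
        byHash.computeIfAbsent(hash(key), k -> new ArrayList<>())
              .add(new Row(nextId++, key, value));
    }

    /** Returns the most recently inserted value for key, or null; collisions are filtered out. */
    String get(String key) {
        String found = null;
        for (Row row : byHash.getOrDefault(hash(key), List.of())) {
            if (row.key.equals(key)) found = row.value;
        }
        return found;
    }

    public static void main(String[] args) {
        HashedKeyIndex index = new HashedKeyIndex();
        index.put("user:1", "alice");
        index.put("user:2", "bob");
        System.out.println(index.get("user:1")); // alice
        System.out.println(index.get("user:3")); // null
    }
}
```

The payoff matches the tuning advice above: the primary key shrinks from up to 400 bytes to 8, and the secondary index holds fixed-width 8-byte entries.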
online backup

• hot backup of data to other machine /
  destination
• test Percona Xtrabackup with HailDB
• next step: backup/export to Hadoop/HDFS
  (similar to Cloudera Sqoop tool)
JNI bindings

• JNI can get 2-5x perf boost vs. JNA
• ... at the expense of nasty code
• Will go for schema optimizations and
  InnoDB tuning tips *first*
Thank You!

 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 

High-Performance Storage Services with HailDB and Java

  • 1. High-Performance Storage Services With Java and HailDB Sunny Gleason April 14, 2011
  • 2. whoami • Sunny Gleason, human • passion: distributed systems engineering • previous... Ning : custom social networks Amazon.com : infra & web services • now... building cloud infrastructure
  • 3. whereami • twitter : twitter.com/sunnygleason • github : github.com/sunnygleason • linkedin : linkedin.com/in/sunnygleason • slideshare : slideshare.net/sunnygleason
  • 4. what’s in this presentation? • MySQL & NoSQL as Inspiration • HailDB & InnoDB • JNA: Integration with Java • St8 : A REST-Enabled Data Store • A Handful of Nifty Applications • Results & Next Steps
  • 5. prior art • Mad props to: • MySQL & InnoDB teams for creating InnoDB and Embedded InnoDB • Stewart Smith & Drizzle folks for leading the HailDB charge and encouraging plugin apis • Nokia & Percona for publishing results of their Voldemort / MySQL integration • Basho for publishing Riak / InnoStore integration
  • 6. MySQL & InnoDB • Super-Efficient Database Server • Tried & True Replication • Bulletproof Durability (when configured correctly) • Fantastic Stability, Predictability & Insight into Operation
  • 7. motivation • database on 1 box : ok • database with master/slave replication : ok • database on cluster : tricky • database on SAN : scary
  • 8. NoSQL • “Not Only” SQL • What’s the point? • Proponent: “reaching next level of scale” • Cynic: “cloud is hype, ops nightmare”
  • 9. what does it gain? • Higher performance, scalability, availability • More robust fault-tolerance • Simplified systems design • Easier operations
  • 10. what does it lose? • Reduced / simplified programming model • No ad-hoc queries, no joins, no txns • Not ACID: Weakened Atomicity / Consistency / Isolation / Durability • Operations / management is still evolving • Challenging to quantify health of system • Fewer domain experts
  • 11. NoSQL Map • KV Stores (volatile): Memcached, Redis • KV Stores (durable): Dynamo, Voldemort, Riak • Document Store: CouchDB, MongoDB • Column Store: Cassandra, BigTable, HBase • Graph Store: Neo4J
  • 12. durable vs. volatile • RAM is ridiculous speed (ns), not durable • Disk is persistent and slow (3-7ms) • RAID eases the pain a bit (4-8x throughput) • SSD is providing good promise (100-300us) • FusionIO is redefining the space (30-100us)
  • 13. performance & operational complexity* • [Figure: complexity vs. aggregate operations / sec (axis from 1K to 1M); labeled points: MySQL, Voldemort, Memcached, +Cluster, +SSD, +FusionIO, +Sharding] • * This is not a real graph
  • 14. just a thought... What if we could use the highly optimized & durable ‘guts’ of MySQL without having to go through JDBC & SQL?
  • 15. enter HailDB • use case: Voldemort Storage Engine • let’s evaluate relative to other NoSQL options • focus on stability & predictability of performance • Graphs are throughput (ops/sec) vs. time
  • 16. Voldemort schema • _key VARBINARY(200) • _version VARBINARY(200) • _value BLOB • PRIMARY KEY(_key, _version)
  • 17. experimental setup • OS X: 8-Core Xeon, 32GB RAM, 200GB OWC SSD • Faban Benchmark : PUT 64-byte key, 1024-byte value • Scenarios: 1, 2, 4, 8 threads • 512M Java Heap
  • 18. BDB-JE • Log-Structured B-Tree • Fast Storage When Mostly Cached • Configured without fsync() by default - writes are batched and flushed periodically
  • 20. Krati • Fast Hash-Oriented Storage • Uses memory-mapped files for speed • Configured without fsync() by default - writes are batched and flushed periodically
  • 23. HailDB & Java • g414-haildb : where the magic happens • Open Source on GitHub • uses JNA: Java Native Access • dynamic binding to libhaildb shared library • auto-generate initial Java class from .h file (w/ JNAerator) • Pointer classes & other shenanigans
  • 24. implementation gotchas • InnoDB API-level usage is unclear • Synchronization & locking is unclear • Therefore... I learned to love reading C • Error handling is *nasty* • Native library installation a bit of a pain (need to configure LD_LIBRARY_PATH)
  • 25. kinder, friendlier APIs • Level 0: JNA bindings int err = ib_dostuff(); • Level 1: Object-Oriented Transaction t = db.openTransaction(); t.commit(); • Level 2: Templated dbt.inTransaction() { dbt.insert(value); } • Level 3: Functional Maps, Iteration, Filters, Apply
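The "Level 2: Templated" idea above can be sketched in plain Java. This is a minimal illustration, not the g414-haildb API: the `Transaction` and `Database` interfaces, the `inTransaction` helper, and the in-memory stand-in are all hypothetical names invented here. The point is only that the template owns commit and rollback, so calling code cannot forget either.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TxTemplateSketch {
    // Hypothetical Level 1 style interfaces (not the real g414-haildb types).
    interface Transaction {
        void insert(String value);
        void commit();
        void rollback();
    }

    interface Database {
        Transaction openTransaction();
    }

    // Level 2: run the work inside a transaction; commit on success,
    // roll back and rethrow on any runtime failure.
    static <T> T inTransaction(Database db, Function<Transaction, T> work) {
        Transaction t = db.openTransaction();
        try {
            T result = work.apply(t);
            t.commit();
            return result;
        } catch (RuntimeException e) {
            t.rollback();
            throw e;
        }
    }

    // In-memory stand-in so the sketch runs without a native library.
    static final class InMemoryDatabase implements Database {
        final List<String> committed = new ArrayList<>();

        @Override
        public Transaction openTransaction() {
            final List<String> pending = new ArrayList<>();
            return new Transaction() {
                @Override public void insert(String value) { pending.add(value); }
                @Override public void commit() { committed.addAll(pending); pending.clear(); }
                @Override public void rollback() { pending.clear(); }
            };
        }
    }
}
```

A caller would write something like `TxTemplateSketch.inTransaction(db, t -> { t.insert("x"); return 1; })` and never touch `commit()` directly.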
  • 26. St8 Server • HTTP-enabled Access to HailDB • PUT /1.0/t/mytable { "columns":[ {"name":"a","type":"INT","length":4}, {"name":"b","type":"INT","length":8}, {"name":"c","type":"BLOB","length":0} ], "indexes":[ { "name":"P", "clustered":true, "unique":true, "indexColumns":[{"name":"a"}] } ] }
  • 27. rest-enabled access • GET /1.0/d/mytable;a=0 • POST /1.0/d/mytable;a=1;b=42;c=xyz • PUT /1.0/d/mytable;a=1;b=43;c=abc • DELETE /1.0/d/mytable;a=0 *This is matrix-param style, can also use form data style for specifying data
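To make the matrix-param style concrete, here is a small sketch of splitting a path segment such as `mytable;a=1;b=42;c=xyz` into a table name and column values. It is an assumption about how such a segment could be parsed, not St8's actual code; the class and method names are hypothetical, and URL decoding is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MatrixParamSketch {
    // Table name is everything before the first ';'.
    static String tableOf(String segment) {
        int semi = segment.indexOf(';');
        return semi < 0 ? segment : segment.substring(0, semi);
    }

    // Each ';name=value' pair becomes one map entry, in request order.
    static Map<String, String> paramsOf(String segment) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = segment.split(";");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed pairs
            }
            params.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
        return params;
    }
}
```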
  • 28. cursors & iterators • GET /1.0/i/mytable.P?q=a+ge+4 • GET /1.0/i/mytable.SecIndex?q=b+le+4 • GET /1.0/i/mytable.SecIndex?q=b+le+4&s=abce1212121ceeee2120911 • “s” value is opaque index key of next page of results - way better than LIMIT/OFFSET! (since HailDB can seek directly to the row)
  • 29. result • REST API provides fun, straightforward access from Ruby, Python, Java, Command-line... • very easy benchmarking with HTTP-based performance tools • range query support, and more efficient iteration model for large result sets than MySQL provides
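The "seek to the next page" idea can be illustrated with a sorted map standing in for an index. This is a sketch under stated assumptions: a `TreeMap` plays the role of the HailDB index, the `Page` type and `fetch` method are hypothetical, and the continuation token is simply the next key rather than an opaque encoded value. Each page starts with a seek to the token, so the cost does not grow with how deep into the result set the caller is, unlike LIMIT/OFFSET.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

final class SeekPagingSketch {
    static final class Page {
        final List<Long> keys; // keys returned on this page
        final Long next;       // key to resume from, or null when exhausted

        Page(List<Long> keys, Long next) {
            this.keys = keys;
            this.next = next;
        }
    }

    // Seek directly to 'from' (inclusive) and read at most pageSize keys.
    static Page fetch(NavigableMap<Long, String> index, long from, int pageSize) {
        List<Long> keys = new ArrayList<>();
        Long next = null;
        for (Long k : index.tailMap(from, true).keySet()) {
            if (keys.size() == pageSize) {
                next = k; // first key of the following page
                break;
            }
            keys.add(k);
        }
        return new Page(keys, next);
    }
}
```

A client keeps calling `fetch` with the previous page's `next` value until it comes back `null`.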
  • 30. high-performance counts • GET /1.0/counts/mykey => 0 • POST /1.0/counts/mykey[?inc=1] => 1 • POST /1.0/counts/mykey?inc=42 => 43 • DELETE /1.0/counts/mykey
  • 31. counts schema • HailDB count service schema • _id int 8-byte unsigned • _key_hash int 8-byte unsigned • _key varchar(80) • _count int 8-byte unsigned • primary key (“_id”) • unique key (“_key_hash”, “_key”)
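The semantics of the counts endpoints can be sketched with an in-memory map. This is only a model of the request/response behaviour on the slide (a missing key reads as 0, POST adds `inc` and returns the new value, DELETE removes the key); the class name is hypothetical and nothing here is durable, whereas the real service stores counts in HailDB.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class CountsSketch {
    private final ConcurrentMap<String, Long> counts = new ConcurrentHashMap<>();

    // GET: a key that was never incremented reads as 0.
    long get(String key) {
        return counts.getOrDefault(key, 0L);
    }

    // POST: atomically add 'inc' and return the updated count.
    long increment(String key, long inc) {
        return counts.merge(key, inc, Long::sum);
    }

    // DELETE: forget the key entirely.
    void delete(String key) {
        counts.remove(key);
    }
}
```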
  • 36. operation: graph store • Social networks, recommendations, any relation you can think of • Which would you prefer? • SQL adjacency list, stored procedure, custom storage engine, external (Memcached), ... • Graph-aware HailDB application in Java
  • 37. nifty graph store 1 • [Figure: directed graph over nodes 1, 2, 3, 4, 5, 6, 8] • GET /1.0/graph/bfs?a=1&maxDepth=3 => [[1, 0], [2, 1], [3, 2], [4, 3], [5, 3]]
  • 38. nifty graph store 2 • [Figure: directed graph over nodes 1, 2, 3, 4, 5, 6, 8] • GET /1.0/graph/topo?a=1&a=5&a=8 => [8, 6, 4, 3, 2, 5, 1]
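A depth-limited breadth-first search that returns `[node, depth]` pairs, as in the response above, might look like the following sketch. The edge list in the slide's figure did not survive in this transcript, so the adjacency map passed in is the caller's own; the class and method names are hypothetical and the graph is held in memory rather than read from HailDB.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BfsSketch {
    // Returns {node, depth} pairs in visit order, stopping at maxDepth.
    static List<long[]> bfs(Map<Long, List<Long>> edges, long start, int maxDepth) {
        List<long[]> out = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        Deque<long[]> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(new long[] {start, 0});
        while (!queue.isEmpty()) {
            long[] cur = queue.poll();
            out.add(cur);
            if (cur[1] == maxDepth) {
                continue; // do not expand nodes at the depth limit
            }
            List<Long> neighbors = edges.getOrDefault(cur[0], Collections.emptyList());
            for (Long next : neighbors) {
                if (seen.add(next)) { // true only the first time a node is seen
                    queue.add(new long[] {next, cur[1] + 1});
                }
            }
        }
        return out;
    }
}
```

With an assumed edge set of 1→2, 2→3, 3→4, 3→5, 5→6, 6→8, a search from node 1 with `maxDepth` 3 visits nodes 1 through 5 and leaves out 6 and 8, which sit beyond the depth limit.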
  • 39. nifty recovery tool (Just an idea) • for recovery: shut down mysql server • run HailDB-enabled recovery tool • export as JSON or whatever
  • 40. wrap-up • HailDB & InnoDB are phenomenal • With g414-haildb, can be integrated directly into applications running on the JVM • All the InnoDB tuning tricks apply • Opens up new applications that are tricky with a traditional SQL database
  • 41. resources • github.com/sunnygleason/g414-st8 • github.com/sunnygleason/g414-haildb • haildb.com • jna.dev.java.net
  • 43. bonus material! • we probably didn’t get this far in the live presentation; the following material is here for eager, brave & interested folks...
  • 44. future work • Improve Packaging / Installation • Codify schema refinements & perf enhancements • Online backup/export with XtraBackup • JNI Bindings • PBXT explorations
  • 45. InnoDB tuning • Skinny columns, skinny rows! (esp. Primary Key) • Varchar enum ‘bad’, enum, int or smallint ‘good’ • fixed-width rows allow in-place updates • Use covering indexes strategically • More data per page means faster index scans, more efficient buffer pool utilization • You only get so many trx’s (read & write) on given CPU/RAM configuration - benchmark this! • Strategically offload reads to Memcached/Redis
  • 46. HailDB schema • _key VARBINARY(200) • _version VARBINARY(200) • _value BLOB • PRIMARY KEY(_key, _version)
  • 47. refined schema • _id BIGINT (auto increment) • _key_hash BIGINT • _key VARBINARY(200) • _version VARBINARY(200) • _value BLOB • PRIMARY KEY(_id) • KEY(_key_hash)
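The refined schema swaps a wide primary key for a skinny `_id` plus an 8-byte `_key_hash` index. The slides do not say which hash function is used, so the sketch below assumes 64-bit FNV-1a purely for illustration; the `Row` type and `lookup` helper are hypothetical too. Because `KEY(_key_hash)` is not unique, a lookup has to compare the full `_key` after the hash match to rule out collisions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class KeyHashSketch {
    static final class Row {
        final byte[] key;
        final byte[] value;

        Row(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    // 64-bit FNV-1a (assumed here; the real hash function is not stated).
    static long fnv1a64(byte[] key) {
        long h = 0xcbf29ce484222325L; // FNV offset basis
        for (byte b : key) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;      // FNV prime
        }
        return h;
    }

    // Probe the hash "index", then confirm the full key to handle collisions.
    static byte[] lookup(Map<Long, List<Row>> byHash, byte[] key) {
        List<Row> candidates = byHash.getOrDefault(fnv1a64(key), Collections.emptyList());
        for (Row row : candidates) {
            if (Arrays.equals(row.key, key)) {
                return row.value;
            }
        }
        return null; // no row with this exact key
    }
}
```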
  • 48. online backup • hot backup of data to other machine / destination • test Percona Xtrabackup with HailDB • next step: backup/export to Hadoop/HDFS (similar to Cloudera Sqoop tool)
  • 49. JNI bindings • JNI can get 2-5x perf boost vs. JNA • ... at the expense of nasty code • Will go for schema optimizations and InnoDB tuning tips *first*