HBase and Hadoop at Urban Airship
April 25, 2012
                           Dave Revell
                 dave@urbanairship.com
                          @dave_revell
Who are we?

•   Who am I?
     •   Airshipper for 10 months, Hadoop user for 1.5 years
     •   Database Engineer on Core Data team: we collect
         events from mobile devices and create reports
•   What is Urban Airship?
     •   SaaS for mobile developers. Features that devs
         shouldn’t build themselves.
     •   Mostly push notifications
     •   No airships :(
Goals
•   “Near real time” reporting
      •   Counters: messages sent and received, app opens, in
          various time slices
      •   More complex analyses: time-in-app, uniques,
          conversions
•   Scale
      •   Billions of “events” per month, ~100 bytes each
      •   40 billion events so far, looking exponential.
      •   Event arrival rate varies wildly, ~10K/sec (?)
Enter Hadoop

•   An Apache project with HDFS, MapReduce, and Common
      •   Open source, Apache license
•   In common usage: platform, framework, ecosystem
      •   HBase, Hive, Pig, ZooKeeper, Mahout, Oozie ....
•   It’s in Java
•   History: early 2000s, originally a clone of Google’s GFS and
    MapReduce
Enter HBase

•   HBase is a database that uses HDFS for storage
•   Based on Google’s BigTable. Not relational or SQL.
•   Solves the problem “how do I query my Hadoop data?”
      •   Operations typically take a few milliseconds
      •   MapReduce is not suitable for real time queries
•   Scales well by adding servers (if you do everything right)
•   Not highly available or multi-datacenter
UA’s basic architecture
   Events in:    Mobile devices → Queue (Kafka) → HBase (on HDFS)
   Reports out:  Reports user ← Web service ← HBase


   (not shown: analysis code that reads events from HBase
   and puts derived data back into HBase)
Analyzing events

•   Queue of incoming events
      •   Absorbs traffic spikes
      •   Partially decouples database from internet
      •   Pub/sub, groups of consumers share work
•   UA proprietary Java code
      •   Consumes event queue
      •   Does simple streaming analyses (counters); sketch below
      •   Stages data in HBase tables for more
          complex analyses that come later
•   Incremental batch jobs
      •   Calculations that are difficult or inefficient to
          compute as data streams through
      •   Read from HBase, write back to HBase
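
The simple streaming analyses amount to one atomic HBase increment per event. A minimal sketch, assuming a hypothetical Event class, table, and column family; the real consumer (queue plumbing, batching, error handling) is UA-proprietary and not shown:

import java.util.List;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class StreamingCounterSketch {
  // Hypothetical event shape: which app, what happened, and when
  static class Event { String appId; String type; long timestampSecs; }

  // Called for each batch of events pulled off the queue
  static void countEvents(HTable counters, List<Event> batch) throws Exception {
    byte[] colFam = Bytes.toBytes("c");                      // hypothetical column family
    for (Event e : batch) {
      long hourBucket = (e.timestampSecs / 3600) * 3600;     // time slice: round down to the hour
      byte[] rowKey = Bytes.toBytes(e.appId + ":" + hourBucket);
      byte[] qualifier = Bytes.toBytes(e.type + "_COUNT");   // e.g. OPENS_COUNT, SENDS_COUNT
      // Atomic server-side increment: no client-side read-modify-write needed
      counters.incrementColumnValue(rowKey, colFam, qualifier, 1L);
    }
  }
}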
HBase data model

•   The abstraction offered by HBase for reading and writing
•   As useful as possible without limiting scalability too much
•   Data is in rows, rows are in tables, ordered by row key


      myApp:1335139200       OPENS_COUNT: 3987 SENDS_COUNT: 28832

      myApp:1335142800       OPENS_COUNT: 4230 SENDS_COUNT: 38990



       (not shown: column families)
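
Because rows are sorted by key, a time slice of counters for one app is a short scan over adjacent rows. A minimal sketch of reading the two rows above, assuming a hypothetical table name ("counters") and column family ("c"), not UA's actual schema:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeSliceScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable counters = new HTable(conf, "counters");            // hypothetical table name
    byte[] colFam = Bytes.toBytes("c");                         // hypothetical column family
    // Row keys sort lexicographically, so all hour buckets for "myApp" are adjacent
    Scan scan = new Scan(Bytes.toBytes("myApp:1335139200"),     // start row, inclusive
                         Bytes.toBytes("myApp:1335142801"));    // stop row, exclusive
    ResultScanner scanner = counters.getScanner(scan);
    for (Result row : scanner) {
      // Values written by HBase increments are stored as 8-byte longs
      long opens = Bytes.toLong(row.getValue(colFam, Bytes.toBytes("OPENS_COUNT")));
      long sends = Bytes.toLong(row.getValue(colFam, Bytes.toBytes("SENDS_COUNT")));
      System.out.println(Bytes.toString(row.getRow()) + " opens=" + opens + " sends=" + sends);
    }
    scanner.close();
    counters.close();
  }
}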
The HBase data model, cont.

•   This is a nested map/dictionary
•   Scannable in lexicographic key order
•   Interface is very simple:
      •   get, put, delete, scan, increment
•   Bytes only

      {“myRowKey1”: {
         “myColFam”: {
            “myQualifierX”: “foo”,
            “myQualifierY”: “bar”}},
       “rowKey2”: {
         “myColFam”: {
            “myQualifierA”: “baz”,
            “myQualifierB”: “”}}}
HBase API example

byte[] firstNameQualifier = "fname".getBytes();
byte[] lastNameQualifier = "lname".getBytes();
byte[] personalInfoColFam = "personalInfo".getBytes();

HTable hTable = new HTable("users");
Put put = new Put("dave".getBytes());
put.add(personalInfoColFam, firstNameQualifier, "Dave".getBytes());
put.add(personalInfoColFam, lastNameQualifier, "Revell".getBytes());
hTable.put(put);
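
Reading the row back is symmetric. A small follow-on sketch that continues the snippet above (same hTable, column family, and qualifiers, using the same old-style HTable API):

Get get = new Get("dave".getBytes());
get.addFamily(personalInfoColFam);
Result result = hTable.get(get);
// getValue returns the raw bytes stored under (family, qualifier)
String firstName = new String(result.getValue(personalInfoColFam, firstNameQualifier));
String lastName = new String(result.getValue(personalInfoColFam, lastNameQualifier));
// firstName is "Dave", lastName is "Revell"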
How to not fail at HBase

•   Things you should have done initially, but now it’s too late
    and you’re irretrievably screwed
      •   Keep table count and column family count low
      •   Keep rows narrow, use compound keys (key sketch after this list)
      •   Scale by adding more rows
      •   Tune your flush threshold and memstore sizes
      •   It’s OK to store complex objects as Protobuf/Thrift/etc.
      •   Always try for sequential IO over random IO
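
One way to build the compound keys mentioned above is to concatenate fixed-width components with Bytes.add, so that the byte-wise sort order matches the scan order you want. A sketch with hypothetical key components, not UA's actual key scheme:

import org.apache.hadoop.hbase.util.Bytes;

public class CompoundKeySketch {
  // One narrow row per (app, hour bucket) instead of one wide row per app:
  // scale comes from adding rows, and one app's rows stay contiguous on disk.
  // Fixed-width big-endian encodings keep lexicographic order equal to numeric
  // order (for non-negative values), so range scans over hour buckets are sequential IO.
  static byte[] makeRowKey(int appId, long hourBucket) {
    return Bytes.add(Bytes.toBytes(appId),        // 4 bytes
                     Bytes.toBytes(hourBucket));  // 8 bytes
  }
}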
MapReduce, briefly
•   The original use case for Hadoop
•   Mappers take in a large data set and send (key, value) pairs to
    reducers. Reducers aggregate input pairs and generate output.
    (Skeleton sketch below.)

                (diagram: input data items fan out to mappers; mapper output
                 pairs are grouped by key and fed to reducers, which write
                 the final output)
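
For reference, the skeleton of a MapReduce job in Java looks roughly like this: a generic sketch that counts events per app from text input, not one of UA's actual jobs (the driver class that configures and submits the job is omitted):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class EventCountSketch {
  // Map: emit one (appId, 1) pair per input line, assumed to look like "appId,eventType,timestamp"
  public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String appId = line.toString().split(",")[0];
      ctx.write(new Text(appId), ONE);
    }
  }

  // Reduce: the framework groups pairs by key, so just sum the counts for each appId
  public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    protected void reduce(Text appId, Iterable<LongWritable> counts, Context ctx)
        throws IOException, InterruptedException {
      long total = 0;
      for (LongWritable c : counts) {
        total += c.get();
      }
      ctx.write(appId, new LongWritable(total));
    }
  }
}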
MapReduce issues

•   Hard to process incrementally (efficiently)
•   Hard to achieve low latency
•   Can’t have too many jobs
•   Requires elaborate workflow automation


•   Urban Airship uses MapReduce over HBase data for:
      •   Ad-hoc analysis
      •   Monthly billing
Live demo




 (Jump to web browser for HBase and MR status pages)
Batch processing at UA

•   Quartz scheduler, distributed over 3 nodes
      •   Time-in-app, audience count, conversions


•   General pattern
      •   Arriving events set a low water mark for their app
      •   Batch jobs reprocess events starting at the low water
          mark
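
A rough sketch of the low-water-mark pattern, with hypothetical table and column names rather than UA's actual schema: the streaming path records the earliest unprocessed event time per app, and the batch job later reprocesses from that point forward before advancing the mark.

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class LowWaterMarkSketch {
  static final byte[] FAM = Bytes.toBytes("m");                 // hypothetical column family
  static final byte[] LWM = Bytes.toBytes("low_water_mark");    // hypothetical qualifier

  // Streaming path: remember the earliest event time not yet covered by a batch run.
  // (A real implementation would use checkAndPut to make this race-free.)
  static void onEventArrived(HTable marks, String appId, long eventTimeSecs) throws Exception {
    Result r = marks.get(new Get(Bytes.toBytes(appId)));
    byte[] current = r.getValue(FAM, LWM);
    if (current == null || eventTimeSecs < Bytes.toLong(current)) {
      Put put = new Put(Bytes.toBytes(appId));
      put.add(FAM, LWM, Bytes.toBytes(eventTimeSecs));
      marks.put(put);
    }
  }

  // Batch path: recompute derived data from the mark forward, then advance the mark
  static void runBatchForApp(HTable marks, String appId, long nowSecs) throws Exception {
    Result r = marks.get(new Get(Bytes.toBytes(appId)));
    byte[] mark = r.getValue(FAM, LWM);
    if (mark == null) return;                                   // nothing new since the last run
    long from = Bytes.toLong(mark);
    // ... scan the event rows for appId from 'from' up to 'nowSecs' and rebuild derived data ...
    Put put = new Put(Bytes.toBytes(appId));
    put.add(FAM, LWM, Bytes.toBytes(nowSecs));
    marks.put(put);
  }
}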
Strengths

•   Uptime
      •   We know all the ways to crash by now
•   Schema design, throughput, and scaling
      •   There are many subtle mistakes to avoid
•   Writing custom tools (statshtable, hbackup, gclogtailer)
•   “Real time most of the time”
Weaknesses of our design


•   Shipping features quickly
•   Hardware efficiency
•   Infrastructure automation
•   Writing custom tools, getting bogged down at low levels,
    leaky abstractions
•   Serious operational Java skills required
Reading



•   Hadoop: The Definitive Guide by Tom White
•   HBase: The Definitive Guide by Lars George
•   http://hbase.apache.org/book.html
Questions?




•   #hbase on Freenode
•   hbase-dev, hbase-user Apache mailing lists
