SlideShare a Scribd company logo
1 of 37
Apache HBase Features for the USE PUBLICLY
                          DO NOT
Enterprise                 PRIOR TO 10/23/12
Headline Goes Here
Jonathan Hsieh | @jmhsieh
Speaker Name or Subhead Goes Here
Software Engineer at Cloudera / HBase PMC Member
October 2012
Who Am I?

                •   Cloudera:
                      • Software Engineer
                      • Apache HBase committer / PMC
                      • Apache Flume founder / PMC
                      • Apache Sqoop committer / PMC
                •   U of Washington:
                      •   Research in Distributed Systems


2               10/25/12 Strata Hadoop World 2012
What is Apache HBase?

                          Apache HBase is an open
        App   MR
                          source, distributed, scala
                             ble, consistent, low
                           latency, random access
         ZK   HDFS
                           non-relational database
                          built on Apache Hadoop
3                    10/25/12 Strata Hadoop World 2012
HBase provides Low-latency Random Access
    •   Writes:
           •   1-3ms, 1k-10k writes/sec per node                     0000000000




    •   Reads:                                                   4   1111111111




           • 0-3ms cached, 10-30ms disk                          1   2222222222




           • 10-40k reads / second / node from                       3333333333




             cache                                               5   4444444444




    •   Cell size:                                                   5555555555




                                                                     6666666666



           •   0-3MB preferred                                   2   7777777777




    •   Read, write and insert data anywhere in                  3
        the table
           •   No sequential write limitations

4                                         9/23/12 Strangeloop 2012
HBase On a Cluster
        HDFS NameNodes     ZooKeeper          Slave Boxes (DN + RS)
        HBase Masters       Quorum
         Rack 1



           Name
           node

         Rack 2



           Name
           node

5                        10/25/12 Strata Hadoop World 2012
Production Apache HBase Applications
    • Inbox
    • Storage
    • Web
    • Search
    • Analytics
    • Monitoring



         More Case Studies at http://www.hbasecon.com/agenda/
6                            10/25/12 Strata Hadoop World 2012
Production Systems Need to Avoid Risk
    •   Unfortunately, all things can fail.

    •   Enterprises need to minimize risk.
           • Understand potential data loss scenarios
           • Understand potential unavailability scenarios
           • Must have a disaster recovery story
    •   Downtime, data loss == risk

    •   Let’s talk about how HBase deals with:
           •   Risks from within the cluster
           •   Risks from outside the cluster
           •   Risks posed by Users

    •   Goal: Remove or reduce negative impact of potential risks

7                                          10/25/12 Strata Hadoop World 2012
Risks from within the cluster
Hosts and Services
Causes of HBase Downtime within the cluster

     • Unplanned Maintenance                • Planned Maintenance
        • Hardware failures                        • Upgrades
        • Software errors                          • Migrations
        • Human error


               Goal: Reduce downtime from hours
                      to minutes to seconds.

10                        10/25/12 Strata Hadoop World 2012
Unplanned Downtime
                                   Service realizes there is a
                Failure Event        problem starts fixing.

                            Detection Time                   Recovery Time
         time
                                   Service still                           Service is
                                thinks we are ok                           restored.

     •   Two sources of unavailability
            • Detection time
            • Recovery time

11                                     10/25/12 Strata Hadoop World 2012
Reduce downtime by speeding up recovery time
                                   Service realizes there is a
                Failure Event        problem starts fixing.

                            Detection Time
         time
                                   Service still                           Service is
                                thinks we are ok                           restored.
     • Distributed log splitting (0.92)
     • Automated metadata repairs with hbck (0.92)
     • Enable of writes while recovering from failure (0.96)

12                                     10/25/12 Strata Hadoop World 2012
Reduce downtime by speeding up detection
                                       Service realizes there is a
                 Failure Event           problem starts fixing.



         time
                                      Service still                                Service is
                                   thinks we are ok                                restored.
     •   Proactively notify to recover from process failures quickly
          0.92/0.94        All Master Failure Detection 180s     Some Region Server Failure detection 180s

          0.96             Master process failure detection 0-   Region Server Failure detection
                           1s                                    0-1s


13                                          10/25/12 Strata Hadoop World 2012
Manual Problem detection: Metrics
                      • Goal: Pinpoint root causes of problems
                        faster
                      • Take a baseline of your system in steady-
                        state
                      • Anomalies like spikes or dips from baseline
                        can indicate problems
                            •   Ex: Slow Query Logging
                      •   Integrates with existing infrastructure via
                          JMX or use with Ganglia, Cloudera Manager

14                   10/25/12 Strata Hadoop World 2012
Metrics from all levels of the system
     •   HBase Region Servers                           Host                       JVM
           •   Operations / sec                                                            Master

           •   Get / put latencies (0.92)                 CPUs           Memory   Disks              Network
           •   Per CF metrics (0.94)
           •   Per Region metrics (0.94)                Host

     •   HBase Master                                      JVM                    JVM
                                                               Region server
           •   RIT metrics                                         Region
                                                                                            HDFS
                                                                                          DataNode
           •   Replication Metrics                               CF       CF


     •   System/JVM                                       CPUs           Memory                      Network
                                                                                  Disks
           •   GC, RPC metrics

15                                   10/25/12 Strata Hadoop World 2012
16   10/25/12 Strata Hadoop World 2012
Eliminate HBase downtime
                Maintenance         Service remains online,           Full service
                  Begins               slightly degraded              is restored.

                                Maintenance Time
         time


     •   Highly Available stack: HDFS (2.0) / ZK / HBASE
     •   Client Cross-version wire compatibility (0.96)
     •   Rolling Restarts
     •   Online Schema Change (experimental)

17                                10/25/12 Strata Hadoop World 2012
High Availability: HBase + HDFS + ZK
         HDFS NameNodes     ZooKeeper          Slave Boxes (DN + RS)
         HBase Masters       Quorum
          Rack 1



            Name
            node

          Rack 2



            Name
            node


18                        10/25/12 Strata Hadoop World 2012
Wire Compatibility
 •   Reduces downtime due to planned maintenance
 •   Wire compatibility + Extensible Data formats
       •   Allow for forwards and backwards compatibility
       •   Older clients can talk to newer servers and visa            App   MR
           versa
 •   Rolling upgrade
       •   Upgrade a single node at a time while system runs

 •   Allows API and changes while guaranteeing wire                    ZK    HDFS
     compatibility between different minor versions
 •   HDFS client-server compatibility between Major
     Versions

19                                 10/25/12 Strata Hadoop World 2012
Risks from outside the cluster
Datacenters and Disaster Recovery
External Risks




21                    10/25/12 Strata Hadoop World 2012
Geographically separated copies of data




22                    10/25/12 Strata Hadoop World 2012
Strategy: HBase-Supported Batch Backups
     •   Export / Dist CP / Import                                                   Import
                                                       Export            Dist CP
           •   3 batch MR jobs                         MR Job            MR Job      MR Job
           •   Several extra copies of
               data
           •   High latency (hours)

     •   Copy Table                                                     Copy Table
           •   1 MR Job                                                  MR Job
           •   Single copy of data
           •   High Latency (hours)
           •   Incremental table copies

23                                  10/25/12 Strata Hadoop World 2012
Strategy: Custom Application-managed Replication

     •   Application writes to two instances of HBase
           • Low Latency
           • Adds complexity
           • Inefficient

                            App                               App




24                                10/25/12 Strata Hadoop World 2012
Strategy: HBase replication (0.92+)
     •   HBase Asynchronously copy edit logs to other clusters.
           • Replication lag measured in seconds
           • Automatically catch up from failures.
           • Eventually consistent
           • Efficient batching
     • Master-slave† (0.90)                               logs
                                                             logs
                                                                logs                 logs


     • Master-master (0.92)                                            Replication




25                              10/25/12 Strata Hadoop World 2012
Master-Master Replication

                                             logs


           logs        logs




                  Replicating data reduces chances of data loss.

26                            10/25/12 Strata Hadoop World 2012
Risks from Users
“Problem exists between keyboard and chair.”
Oops… User Error
                User Error:                                        Service is restored,
                drop ‘table’                                         major data loss


                               Recovery Time
         time
                               Service is down!


     • How do we prevent user error?
     • How do we recover from user error?


28                             10/25/12 Strata Hadoop World 2012
Prevent user mistakes: User-level Security
                User Error:                 Operation
                drop ‘table’           rejected, insufficient
                                           permissions.


         time


     •   Authentication:
            •   Ensure the identity of the services or users that are communicating
     •   Access Control:
            •   Ensure user has permission to execute table data operations

29                                   10/25/12 Strata Hadoop World 2012
HBase User-level Security
     •   Based on Kerberos for HBase, HDFS and Zookeeper
           • Grant privileges to users
           • Revoke privileges from users.
           • Column Family and Table granularity


     •   Confidentiality:
           •   Ensure information is only seen by intended users.
     •   Audit Trails:
           •   Track which users performed particular operations

30                                10/25/12 Strata Hadoop World 2012
Recovering from User Mistakes: Table Snapshots
                                                                            Service is
                User Error:            Service is down!               restored, minor data
                drop ‘table’
                                                                              loss

                                                            restore
         time       Periodic
                   snapshot


     • Snapshot the state of a table at a certain moment in time
     • Restore it or Clone it later, creating a new read write table
     • Export it to another cluster with minimal impact on HBase

31                             10/25/12 Strata Hadoop World 2012
Table Snapshots (0.96+)
     • Under development, slated for HBase 0.96
     • Multiple snapshot flavors planned
           •   Offline snapshots
           •   Online Snapshots
     •   Snapshot uses
           •   Recover from application or user error.
           •   Application experimentation (no need to
               spin up another cluster for replication)
           •   Use MR directly on snapshot files

32                                 10/25/12 Strata Hadoop World 2012
Conclusions
HBase for the enterprise.
Feature Summary by Category
                                     Avoid Down Time
                                     • Rolling Restart
                                     • Online backups
                                     • Table Replication
                                     • Security Access Controls (0.92)
                                     • Wire Compatibility (0.96)
                                     • Snapshots (0.96)

                            Detection Time                         Recovery Time
      time
        Reduce Detection Time                                  Reduce Recovery Time
        • Improved metrics                                     • Distributed Log Splitting (0.92)
        • Proactive notification of HMaster failure (0.96)     • Improvements to HBCK (0.94)
        • Proactive notification of RegionServer failure       • Allow writes during recovery (0.96)
        (0.96)                                                 •Snapshots (0.96)


34                                         10/25/12 Strata Hadoop World 2012
Feature Summary by Version
     0.90                            0.92                            0.94                            0.96 (Upcoming Release)
     •Metrics                        •Metrics                        •CF+Region Granularity          •CF +Region Granularity
                                                                     Metrics                         Metrics
                                                                                                     •Improved failure detection
                                                                                                     time
     •Distributed log splitting*     •Distributed log splitting      •Distributed log splitting      •Distributed log splitting
     •HBCK improvements*             •HBCK improvements              •HBCK improvements              •HBCK improvements
     •Copy Table / Import / Export   •Copy Table / Import / Export   •Copy Table / Import / Export   •Copy Table / Import / Export
     •Master-Slave Replication†      •Master-Master Replication      •Master-Master Replication      •Master-Master Replication
                                                                                                     •Client Wire compatibility

                                     •Authentication and             •Authentication and             •Authentication and
                                     Authorization                   Authorization                   Authorization
                                                                                                     •(Snapshots)

     Recovery in Hours               Recovery in Minutes             Recovery in Minutes             (Recovery in Seconds)
                                                                                                                     † experimental
                                                                                                                     (in progress)
                                                                                                                     *backported in CDH

35                                                10/25/12 Strata Hadoop World 2012
Thank You!
Jonathan Hsieh | @jmhsieh
Software Engineer, Cloudera
Apache HBase committer / PMC member
Strata + Hadoop World 2012: Apache HBase Features for the Enterprise

More Related Content

What's hot

[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systexJames Chen
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesDataWorks Summit
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java DevelopersRichard McDougall
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...ivmaykov
 
Inside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldInside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldRichard McDougall
 
From distributed caches to in-memory data grids
From distributed caches to in-memory data gridsFrom distributed caches to in-memory data grids
From distributed caches to in-memory data gridsMax Alexejev
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guidelarsgeorge
 
Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014EDB
 
Share point disaster avoidance architecture for large scale enterprises
Share point disaster avoidance architecture for large scale enterprisesShare point disaster avoidance architecture for large scale enterprises
Share point disaster avoidance architecture for large scale enterprisesSentri
 
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...elliando dias
 
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Michael Noel
 
High Availability in YARN
High Availability in YARNHigh Availability in YARN
High Availability in YARNArinto Murdopo
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 

What's hot (20)

Hadoop on Virtual Machines
Hadoop on Virtual MachinesHadoop on Virtual Machines
Hadoop on Virtual Machines
 
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
 
Hadoop on VMware
Hadoop on VMwareHadoop on VMware
Hadoop on VMware
 
HBase operations
HBase operationsHBase operations
HBase operations
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual Machines
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java Developers
 
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...Cassandra nyc 2011   ilya maykov - ooyala - scaling video analytics with apac...
Cassandra nyc 2011 ilya maykov - ooyala - scaling video analytics with apac...
 
Inside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldInside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworld
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
From distributed caches to in-memory data grids
From distributed caches to in-memory data gridsFrom distributed caches to in-memory data grids
From distributed caches to in-memory data grids
 
Cassandra Silicon Valley
Cassandra Silicon ValleyCassandra Silicon Valley
Cassandra Silicon Valley
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014Implementing Parallelism in PostgreSQL - PGCon 2014
Implementing Parallelism in PostgreSQL - PGCon 2014
 
Share point disaster avoidance architecture for large scale enterprises
Share point disaster avoidance architecture for large scale enterprisesShare point disaster avoidance architecture for large scale enterprises
Share point disaster avoidance architecture for large scale enterprises
 
Delphix
DelphixDelphix
Delphix
 
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...
Distributed Caching Using the JCACHE API and ehcache, Including a Case Study ...
 
hadoop_module6
hadoop_module6hadoop_module6
hadoop_module6
 
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
Ultimate SharePoint Infrastructure Best Practices Session - Live360 Orlando 2012
 
High Availability in YARN
High Availability in YARNHigh Availability in YARN
High Availability in YARN
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 

Viewers also liked

Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Jeremy Walsh
 
HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015Avinash Ramineni
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars GeorgeJAX London
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack
 
HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)Nick Dimiduk
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars GeorgeJAX London
 
HBaseConEast2016: Practical Kerberos with Apache HBase
HBaseConEast2016: Practical Kerberos with Apache HBaseHBaseConEast2016: Practical Kerberos with Apache HBase
HBaseConEast2016: Practical Kerberos with Apache HBaseMichael Stack
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandJosh Elser
 
Apache Hadoop and HBase
Apache Hadoop and HBaseApache Hadoop and HBase
Apache Hadoop and HBaseCloudera, Inc.
 
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for ArchitectsNick Dimiduk
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceCloudera, Inc.
 
Apache HBase Low Latency
Apache HBase Low LatencyApache HBase Low Latency
Apache HBase Low LatencyNick Dimiduk
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0enissoz
 

Viewers also liked (17)

Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14Introduction to HBase - Phoenix HUG 5/14
Introduction to HBase - Phoenix HUG 5/14
 
HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015HBase from the Trenches - Phoenix Data Conference 2015
HBase from the Trenches - Phoenix Data Conference 2015
 
Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)HBase Client APIs (for webapps?)
HBase Client APIs (for webapps?)
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars George
 
HBaseConEast2016: Practical Kerberos with Apache HBase
HBaseConEast2016: Practical Kerberos with Apache HBaseHBaseConEast2016: Practical Kerberos with Apache HBase
HBaseConEast2016: Practical Kerberos with Apache HBase
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
Apache Hadoop and HBase
Apache Hadoop and HBaseApache Hadoop and HBase
Apache Hadoop and HBase
 
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for Architects
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
 
Apache HBase Low Latency
Apache HBase Low LatencyApache HBase Low Latency
Apache HBase Low Latency
 
Spark + HBase
Spark + HBase Spark + HBase
Spark + HBase
 
HBase Storage Internals
HBase Storage InternalsHBase Storage Internals
HBase Storage Internals
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0
 

Similar to Strata + Hadoop World 2012: Apache HBase Features for the Enterprise

HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaHBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaCloudera, Inc.
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around HadoopDataWorks Summit
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInDataWorks Summit
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersAmal G Jose
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)mundlapudi
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopfann wu
 
[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfsNAVER D2
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsEsther Kundin
 
Improving h base availability and repair
Improving h base availability and repairImproving h base availability and repair
Improving h base availability and repairDataWorks Summit
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsEsther Kundin
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBaseCon
 
H2O on Hadoop Dec 12
H2O on Hadoop Dec 12 H2O on Hadoop Dec 12
H2O on Hadoop Dec 12 Sri Ambati
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationCeph Community
 
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learned
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learnedTom Kraljevic presents H2O on Hadoop- how it works and what we've learned
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learnedSri Ambati
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconYiwei Ma
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统yongboy
 

Similar to Strata + Hadoop World 2012: Apache HBase Features for the Enterprise (20)

HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - ClouderaHBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
HBaseCon 2012 | Base Metrics: What They Mean to You - Cloudera
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
HBase Low Latency
HBase Low LatencyHBase Low Latency
HBase Low Latency
 
Infrastructure Around Hadoop
Infrastructure Around HadoopInfrastructure Around Hadoop
Infrastructure Around Hadoop
 
Scaling Hadoop at LinkedIn
Scaling Hadoop at LinkedInScaling Hadoop at LinkedIn
Scaling Hadoop at LinkedIn
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop Clusters
 
Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)Hadoop - Disk Fail In Place (DFIP)
Hadoop - Disk Fail In Place (DFIP)
 
Hadoop, Taming Elephants
Hadoop, Taming ElephantsHadoop, Taming Elephants
Hadoop, Taming Elephants
 
Facing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoopFacing enterprise specific challenges – utility programming in hadoop
Facing enterprise specific challenges – utility programming in hadoop
 
[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfs
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Improving h base availability and repair
Improving h base availability and repairImproving h base availability and repair
Improving h base availability and repair
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
HBase: Where Online Meets Low Latency
HBase: Where Online Meets Low LatencyHBase: Where Online Meets Low Latency
HBase: Where Online Meets Low Latency
 
H2O on Hadoop Dec 12
H2O on Hadoop Dec 12 H2O on Hadoop Dec 12
H2O on Hadoop Dec 12
 
End of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph ReplicationEnd of RAID as we know it with Ceph Replication
End of RAID as we know it with Ceph Replication
 
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learned
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learnedTom Kraljevic presents H2O on Hadoop- how it works and what we've learned
Tom Kraljevic presents H2O on Hadoop- how it works and what we've learned
 
Facebook keynote-nicolas-qcon
Facebook keynote-nicolas-qconFacebook keynote-nicolas-qcon
Facebook keynote-nicolas-qcon
 
支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统支撑Facebook消息处理的h base存储系统
支撑Facebook消息处理的h base存储系统
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Strata + Hadoop World 2012: Apache HBase Features for the Enterprise

  • 1. Apache HBase Features for the USE PUBLICLY DO NOT Enterprise PRIOR TO 10/23/12 Headline Goes Here Jonathan Hsieh | @jmhsieh Speaker Name or Subhead Goes Here Software Engineer at Cloudera / HBase PMC Member October 2012
  • 2. Who Am I? • Cloudera: • Software Engineer • Apache HBase committer / PMC • Apache Flume founder / PMC • Apache Sqoop committer / PMC • U of Washington: • Research in Distributed Systems 2 10/25/12 Strata Hadoop World 2012
  • 3. What is Apache HBase? Apache HBase is an open App MR source, distributed, scala ble, consistent, low latency, random access ZK HDFS non-relational database built on Apache Hadoop 3 10/25/12 Strata Hadoop World 2012
  • 4. HBase provides Low-latency Random Access • Writes: • 1-3ms, 1k-10k writes/sec per node 0000000000 • Reads: 4 1111111111 • 0-3ms cached, 10-30ms disk 1 2222222222 • 10-40k reads / second / node from 3333333333 cache 5 4444444444 • Cell size: 5555555555 6666666666 • 0-3MB preferred 2 7777777777 • Read, write and insert data anywhere in 3 the table • No sequential write limitations 4 9/23/12 Strangeloop 2012
  • 5. HBase On a Cluster HDFS NameNodes ZooKeeper Slave Boxes (DN + RS) HBase Masters Quorum Rack 1 Name node Rack 2 Name node 5 10/25/12 Strata Hadoop World 2012
  • 6. Production Apache HBase Applications • Inbox • Storage • Web • Search • Analytics • Monitoring More Case Studies at http://www.hbasecon.com/agenda/ 6 10/25/12 Strata Hadoop World 2012
  • 7. Production Systems Need to Avoid Risk • Unfortunately, all things can fail. • Enterprises need to minimize risk. • Understand potential data loss scenarios • Understand potential unavailability scenarios • Must have a disaster recovery story • Downtime, data loss == risk • Let’s talk about how HBase deals with: • Risks from within the cluster • Risks from outside the cluster • Risks posed by Users • Goal: Remove or reduce negative impact of potential risks 7 10/25/12 Strata Hadoop World 2012
  • 8.
  • 9. Risks from within the cluster Hosts and Services
  • 10. Causes of HBase Downtime within the cluster • Unplanned Maintenance • Planned Maintenance • Hardware failures • Upgrades • Software errors • Migrations • Human error Goal: Reduce downtime from hours to minutes to seconds. 10 10/25/12 Strata Hadoop World 2012
  • 11. Unplanned Downtime Service realizes there is a Failure Event problem starts fixing. Detection Time Recovery Time time Service still Service is thinks we are ok restored. • Two sources of unavailability • Detection time • Recovery time 11 10/25/12 Strata Hadoop World 2012
  • 12. Reduce downtime by speeding up recovery time Service realizes there is a Failure Event problem starts fixing. Detection Time time Service still Service is thinks we are ok restored. • Distributed log splitting (0.92) • Automated metadata repairs with hbck (0.92) • Enable of writes while recovering from failure (0.96) 12 10/25/12 Strata Hadoop World 2012
  • 13. Reduce downtime by speeding up detection Service realizes there is a Failure Event problem starts fixing. time Service still Service is thinks we are ok restored. • Proactively notify to recover from process failures quickly 0.92/0.94 All Master Failure Detection 180s Some Region Server Failure detection 180s 0.96 Master process failure detection 0- Region Server Failure detection 1s 0-1s 13 10/25/12 Strata Hadoop World 2012
  • 14. Manual Problem detection: Metrics • Goal: Pinpoint root causes of problems faster • Take a baseline of your system in steady- state • Anomalies like spikes or dips from baseline can indicate problems • Ex: Slow Query Logging • Integrates with existing infrastructure via JMX or use with Ganglia, Cloudera Manager 14 10/25/12 Strata Hadoop World 2012
  • 15. Metrics from all levels of the system • HBase Region Servers Host JVM • Operations / sec Master • Get / put latencies (0.92) CPUs Memory Disks Network • Per CF metrics (0.94) • Per Region metrics (0.94) Host • HBase Master JVM JVM Region server • RIT metrics Region HDFS DataNode • Replication Metrics CF CF • System/JVM CPUs Memory Network Disks • GC, RPC metrics 15 10/25/12 Strata Hadoop World 2012
  • 16. 16 10/25/12 Strata Hadoop World 2012
  • 17. Eliminate HBase downtime Maintenance Service remains online, Full service Begins slightly degraded is restored. Maintenance Time time • Highly Available stack: HDFS (2.0) / ZK / HBASE • Client Cross-version wire compatibility (0.96) • Rolling Restarts • Online Schema Change (experimental) 17 10/25/12 Strata Hadoop World 2012
  • 18. High Availability: HBase + HDFS + ZK HDFS NameNodes ZooKeeper Slave Boxes (DN + RS) HBase Masters Quorum Rack 1 Name node Rack 2 Name node 18 10/25/12 Strata Hadoop World 2012
  • 19. Wire Compatibility • Reduces downtime due to planned maintenance • Wire compatibility + Extensible Data formats • Allow for forwards and backwards compatibility • Older clients can talk to newer servers and visa App MR versa • Rolling upgrade • Upgrade a single node at a time while system runs • Allows API and changes while guaranteeing wire ZK HDFS compatibility between different minor versions • HDFS client-server compatibility between Major Versions 19 10/25/12 Strata Hadoop World 2012
  • 20. Risks from outside the cluster Datacenters and Disaster Recovery
  • 21. External Risks 21 10/25/12 Strata Hadoop World 2012
  • 22. Geographically separated copies of data 22 10/25/12 Strata Hadoop World 2012
  • 23. Strategy: HBase-Supported Batch Backups • Export / Dist CP / Import Import Export Dist CP • 3 batch MR jobs MR Job MR Job MR Job • Several extra copies of data • High latency (hours) • Copy Table Copy Table • 1 MR Job MR Job • Single copy of data • High Latency (hours) • Incremental table copies 23 10/25/12 Strata Hadoop World 2012
  • 24. Strategy: Custom Application-managed Replication • Application writes to two instances of HBase • Low Latency • Adds complexity • Inefficient App App 24 10/25/12 Strata Hadoop World 2012
  • 25. Strategy: HBase replication (0.92+) • HBase Asynchronously copy edit logs to other clusters. • Replication lag measured in seconds • Automatically catch up from failures. • Eventually consistent • Efficient batching • Master-slave† (0.90) logs logs logs logs • Master-master (0.92) Replication 25 10/25/12 Strata Hadoop World 2012
  • 26. Master-Master Replication logs logs logs Replicating data reduces chances of data loss. 26 10/25/12 Strata Hadoop World 2012
  • 27. Risks from Users “Problem exists between keyboard and chair.”
  • 28. Oops… User Error User Error: Service is restored, drop ‘table’ major data loss Recovery Time time Service is down! • How do we prevent user error? • How do we recover from user error? 28 10/25/12 Strata Hadoop World 2012
  • 29. Prevent user mistakes: User-level Security User Error: Operation drop ‘table’ rejected, insufficient permissions. time • Authentication: • Ensure the identity of the services or users that are communicating • Access Control: • Ensure user has permission to execute table data operations 29 10/25/12 Strata Hadoop World 2012
  • 30. HBase User-level Security • Based on Kerberos for HBase, HDFS and Zookeeper • Grant privileges to users • Revoke privileges from users. • Column Family and Table granularity • Confidentiality: • Ensure information is only seen by intended users. • Audit Trails: • Track which users performed particular operations 30 10/25/12 Strata Hadoop World 2012
  • 31. Recovering from User Mistakes: Table Snapshots Service is User Error: Service is down! restored, minor data drop ‘table’ loss restore time Periodic snapshot • Snapshot the state of a table at a certain moment in time • Restore it or Clone it later, creating a new read write table • Export it to another cluster with minimal impact on HBase 31 10/25/12 Strata Hadoop World 2012
  • 32. Table Snapshots (0.96+) • Under development, slated for HBase 0.96 • Multiple snapshot flavors planned • Offline snapshots • Online Snapshots • Snapshot uses • Recover from application or user error. • Application experimentation (no need to spin up another cluster for replication) • Use MR directly on snapshot files 32 10/25/12 Strata Hadoop World 2012
  • 34. Feature Summary by Category Avoid Down Time • Rolling Restart • Online backups • Table Replication • Security Access Controls (0.92) • Wire Compatibility (0.96) • Snapshots (0.96) Detection Time Recovery Time time Reduce Detection Time Reduce Recovery Time • Improved metrics • Distributed Log Splitting (0.92) • Proactive notification of HMaster failure (0.96) • Improvements to HBCK (0.94) • Proactive notification of RegionServer failure • Allow writes during recovery (0.96) (0.96) •Snapshots (0.96) 34 10/25/12 Strata Hadoop World 2012
  • 35. Feature Summary by Version 0.90 0.92 0.94 0.96 (Upcoming Release) •Metrics •Metrics •CF+Region Granularity •CF +Region Granularity Metrics Metrics •Improved failure detection time •Distributed log splitting* •Distributed log splitting •Distributed log splitting •Distributed log splitting •HBCK improvements* •HBCK improvements •HBCK improvements •HBCK improvements •Copy Table / Import / Export •Copy Table / Import / Export •Copy Table / Import / Export •Copy Table / Import / Export •Master-Slave Replication† •Master-Master Replication •Master-Master Replication •Master-Master Replication •Client Wire compatibility •Authentication and •Authentication and •Authentication and Authorization Authorization Authorization •(Snapshots) Recovery in Hours Recovery in Minutes Recovery in Minutes (Recovery in Seconds) † experimental (in progress) *backported in CDH 35 10/25/12 Strata Hadoop World 2012
  • 36. Thank You! Jonathan Hsieh | @jmhsieh Software Engineer, Cloudera Apache HBase committer / PMC member

Editor's Notes

  1. Hbase is a project that solves this problem. In a sentence, Hbase is an open source, distributed, sorted map modeled after Google’s BigTable.Open-source: Apache HBase is an open source project with an Apache 2.0 license.Distributed: HBase is designed to use multiple machines to store and serve data.Sorted Map: HBase stores data as a map, and guarantees that adjacent keys will be stored next to each other on disk.HBase is modeled after BigTable, a system that is used for hundreds of applications at Google.
  2. Tested under HBase
  3. 2 generals about detection time.
  4. Most metrics reside in the region serverMetrics for various categories (e.g. Stores, compactions, flushes)Metrics per operation type (e.g. get, put, delete)Remember HDFS,JVM, and host metrics as well
  5. Tested under HBase