April 2013 HUG: HBase as a Service at Yahoo!



HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. Yahoo! has used HBase for a long time in isolated, one-off deployments. A multi-tenant platform now makes HBase capabilities available to all of our grid customers. We will give a brief overview of HBase and how it works (several of you asked for back-to-basics talks), and then spend most of our time on multi-tenancy with HBase.


Francis Christopher Liu, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog

Vandana Ayyalasomayajula, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog

Published in: Spiritual, Technology



  • 1. HBase as a Service at Yahoo!
    Bay Area HUG Presentation
    Francis Liu, Vandana Ayyalasomayajula
    April 17, 2013
  • 2. HBase Overview
    Yahoo! Presentation, Confidential
    Apache HBase is an open source, Bigtable-like, distributed, scalable, consistent, random access, key-value store built on Apache Hadoop.
    HBase Data Model
    Column Family - Info:
        Rowkey   Email   Age   Password
        Alice            23
        Bob              25    Iambob
        Eve              30    nice1pass
    Table is lexicographically sorted on rowkeys.
    Each cell has multiple versions represented by timestamp, where ts2 > ts1 (ts1 = 1, ts2 = 2).
    Cell values shown in the diagram: trickedyou, newpassword.
    Identify your data (cell value) in the HBase table by [1] rowkey, [2] column family, [3] column qualifier, [4] timestamp/version.
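The four-part cell address on slide 2 can be illustrated with a small in-memory model. This is only a sketch of the data model, not the HBase API: the class `VersionedTable` and its methods are hypothetical names, and the second password value is made up for the example.

```python
from collections import defaultdict


class VersionedTable:
    """Toy model of the HBase data model (hypothetical, not the HBase client API)."""

    def __init__(self):
        # rowkey -> (family, qualifier) -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, rowkey, family, qualifier, timestamp, value):
        # A put never overwrites older versions; it adds one at `timestamp`.
        self._rows[rowkey][(family, qualifier)][timestamp] = value

    def get(self, rowkey, family, qualifier, timestamp=None):
        # .get() avoids creating empty rows as a side effect of reading.
        versions = self._rows.get(rowkey, {}).get((family, qualifier), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # newest version wins by default
        return versions.get(timestamp)

    def rowkeys(self):
        # Rows are kept in lexicographic rowkey order.
        return sorted(self._rows)


table = VersionedTable()
table.put("Eve", "Info", "Age", 1, "30")
table.put("Bob", "Info", "Password", 1, "Iambob")
table.put("Bob", "Info", "Password", 2, "example-new-password")  # made-up value
table.put("Alice", "Info", "Age", 1, "23")
```

Reading `("Bob", "Info", "Password")` without a timestamp returns the version written at ts2, while passing `timestamp=1` still returns the older value.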
  • 3. HBase Distributed Mode
    Table T1 (rowkey, role): Andy Arch; Brad Arch; Dheeraj Ops; Eleanor PgM; Francis Dev; Govind Dev; Rajiv Ops; Sumeet PM; Vandana Dev
    Table T1 is split into three regions R1, R2, R3.
    Each region is served by a RegionServer collocated with the DataNode.
    Example: find Sumeet's role with HBase.
      § Client contacts ZooKeeper, a separate cluster of ZK nodes
      § Retrieve the RS hosting the -ROOT- region (Row/ Meta region)
      § Query the .META. server (Row/ table region) that has the row key "Sumeet"
    Diagram labels: Client; ZooKeeper; -ROOT-; M1, M2; RS1 hosts T1R1; RS2 hosts T1R2, T1R3; RS3
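The last hop of the lookup on slide 3, finding which region holds a rowkey, amounts to a search over sorted region start keys. The sketch below assumes the split points "D" and "R", which the slide does not give; only the region-to-server mapping (T1R1 on RS1, T1R2 and T1R3 on RS2) comes from the diagram. `locate_region` is a hypothetical helper, not an HBase call.

```python
from bisect import bisect_right

# (start_key, region, server), sorted by start key.
# The split points "D" and "R" are assumptions made for this example.
META = [
    ("", "T1R1", "RS1"),
    ("D", "T1R2", "RS2"),
    ("R", "T1R3", "RS2"),
]


def locate_region(meta, rowkey):
    """Return (region, server) for the region whose key range contains rowkey."""
    start_keys = [start for start, _, _ in meta]
    # The owning region is the last one whose start key is <= rowkey.
    index = bisect_right(start_keys, rowkey) - 1
    _, region, server = meta[index]
    return region, server


region, server = locate_region(META, "Sumeet")
```

With these assumed splits, "Sumeet" falls in T1R3 on RS2, and the client would then send its get to that RegionServer.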
  • 4. HBase High-level Architecture
    [Architecture diagram] Source:
  • 5. HBase Operations
      § get()
      § put()
      § scan()
      § checkAndDelete()
      § checkAndPut()
      § increment()
      … check HTable class for further details on operations
    Caution:
      § No queries
      § No secondary indexes
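The semantics of a few of the operations on slide 5 can be sketched with a single-threaded toy table. `ToyTable` and its method names are hypothetical stand-ins, not the HTable API. In HBase the check-and-mutate and increment operations are atomic per row, which this sketch does not attempt to model.

```python
class ToyTable:
    """Single-version, in-memory stand-in used only to illustrate semantics."""

    def __init__(self):
        self._cells = {}  # (rowkey, column) -> value

    def put(self, rowkey, column, value):
        self._cells[(rowkey, column)] = value

    def get(self, rowkey, column):
        return self._cells.get((rowkey, column))

    def scan(self, start_row="", stop_row=None):
        # Rows come back in rowkey order; start is inclusive, stop is exclusive.
        for rowkey, column in sorted(self._cells):
            if rowkey < start_row:
                continue
            if stop_row is not None and rowkey >= stop_row:
                break
            yield rowkey, column, self._cells[(rowkey, column)]

    def check_and_put(self, rowkey, column, expected, value):
        # Write only if the current value matches; expected=None means "absent".
        if self.get(rowkey, column) != expected:
            return False
        self.put(rowkey, column, value)
        return True

    def increment(self, rowkey, column, amount=1):
        new_value = (self.get(rowkey, column) or 0) + amount
        self.put(rowkey, column, new_value)
        return new_value
```

The "no secondary indexes" caution shows up directly here: finding rows by a cell value, rather than by rowkey, would require a full `scan()`.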
  • 6. Multi-tenancy Motivation
      § Successful Deployments
          § C.O.R.E
              o Personalization Engine
          § Web Crawl Cache
          § etc…
      § Off-stage processing
      § Mutable data
      § Random read/write
  • 7. Metrics/Analytics Use Cases
    Diagram labels: Collector (x3); Ingestion; HBase; Query Server
  • 8. Dimension Store Use Case
    Diagram labels: HBase; HDFS; MapReduce; Hive; Pig; Clickstream; Ad Campaign
  • 9. Incremental Processing Use Cases
    Diagram labels: HBase; MapReduce; Storm; HDFS; Collector; Slow; Fast; On-stage; Off-stage processing; Collection; Serving; Store; Search; Events; Files
  • 10. Hadoop at Yahoo!
      § Hosted Multi-tenant Service
      § Security
      § Job Queues
      § HDFS Quota
  • 11. HBase at Yahoo!
      § Hosted Multi-tenant Service
      § Security
      § Isolated Deployment
      § Region Server Group
      § Namespace
  • 12. Security
      § Authentication
          § Kerberos (users, processes)
          § Delegation Token (MapReduce, YARN, etc)
      § Authorization
          § HBase ACLs (Read, Write, Create, Admin)
          § Grant permissions to User or Unix Group
          § ACL for Table, Column Family or Column
          § Only Global Admin can create/drop tables
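The authorization rules on slide 12 can be read as a scope match: a grant names a principal, a set of actions, and an optional table, column family, and column. The following is a minimal sketch under that reading. `Grant`, `is_allowed`, and `can_create_table` are hypothetical names, not HBase classes; the "@" prefix for groups follows the HBase shell convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    principal: str                   # user name, or "@group" for a Unix group
    actions: frozenset               # subset of {"READ", "WRITE", "CREATE", "ADMIN"}
    table: Optional[str] = None      # None means global scope
    family: Optional[str] = None     # None means every family in the table
    qualifier: Optional[str] = None  # None means every column in the family


def _principals(user, groups):
    return {user} | {"@" + group for group in groups}


def is_allowed(grants, user, groups, action, table, family=None, qualifier=None):
    """True if some grant covers the requested action at the requested scope."""
    for grant in grants:
        if grant.principal not in _principals(user, groups):
            continue
        if action not in grant.actions:
            continue
        # A narrower grant (e.g. one family) does not cover a broader request.
        if grant.table not in (None, table):
            continue
        if grant.family not in (None, family):
            continue
        if grant.qualifier not in (None, qualifier):
            continue
        return True
    return False


def can_create_table(grants, user, groups):
    """Per the slide, only a global ADMIN grant allows create/drop."""
    return any(
        g.principal in _principals(user, groups)
        and "ADMIN" in g.actions
        and g.table is None
        for g in grants
    )
```

A family-level READ grant therefore permits reads of any column in that family but not of other families, and a table-level ADMIN grant does not permit creating new tables.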
  • 13. Isolated Deployment
    Diagram labels:
      § Gateway/Launcher: HBase Client, MR Client
      § Compute Cluster: JobTracker, Namenode, TaskTracker + DataNode nodes running M/R Tasks (each with an HBase Client)
      § HBase Cluster: HBase Master, ZooKeeper Quorum, Namenode, RegionServer + DataNode nodes
  • 14. Region Server Groups
      § Member Region Servers
      § Member Tables
      § Resource Isolation
      § Flexibility with configuration
    Diagram:
      Group Foo: Region Server 1…4; Table1, Table2 (RS1 to RS4 each serve Table1 and Table2)
      Group Bar: Region Server 5…8; Table3, Table4 (RS5 to RS8 each serve Table3 and Table4)
  • 15. Region Server Groups
      § group_add
      § group_remove
      § group_move_servers
      § group_move_tables
      § create … { … CONFIGURATION=>{''=>'my_group'}}
  • 16. Region Server Groups
    Diagram labels: HMaster; LoadBalancer; GroupBasedLoadBalancer; FilterByGroup (foo, bar); GroupAdminEndpoint; GroupMasterObserver; GroupInfoManager; Group Table; Group ZNode
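The isolation idea behind slides 14 to 16 is that a table's regions may only land on servers belonging to the table's group. The sketch below filters candidate servers by group and then assigns round-robin. The round-robin step and the name `assign_regions` are assumptions made for illustration; the slides do not say how regions are balanced within a group.

```python
def assign_regions(regions_by_table, table_group, group_servers):
    """Assign each region to a server from its table's group (round-robin)."""
    assignment = {}
    next_slot = {group: 0 for group in group_servers}
    for table in sorted(regions_by_table):
        group = table_group[table]
        servers = group_servers[group]  # filter by group: other servers are never candidates
        for region in regions_by_table[table]:
            assignment[region] = servers[next_slot[group] % len(servers)]
            next_slot[group] += 1
    return assignment


# Group layout taken from slide 14; region names are made up for the example.
group_servers = {
    "Foo": ["RS1", "RS2", "RS3", "RS4"],
    "Bar": ["RS5", "RS6", "RS7", "RS8"],
}
table_group = {"Table1": "Foo", "Table2": "Foo", "Table3": "Bar", "Table4": "Bar"}
regions_by_table = {t: [f"{t}-r{i}" for i in range(3)] for t in table_group}
assignment = assign_regions(regions_by_table, table_group, group_servers)
```

Because the server list is restricted before any placement decision is made, load from Table3 or Table4 cannot spill onto RS1 through RS4.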
  • 17. Namespace
      § Analogous to Database
      § Table Name: <table namespace>.<table qualifier>
          § e.g. my_ns.my_table
      § Reserved namespaces
          § Default: tables with no explicit namespace
          § System: tables are guaranteed to be assigned prior to user tables
      § Table Path: /<hbaseRoot>/data/<namespace>/<tableName>
          § /hbase/data/my_ns/my_ns.my_table
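The naming and path rules on slide 17 can be written as two small functions. This sketch follows the slide's `<namespace>.<qualifier>` form. The helper names are hypothetical, and the lowercase directory name "default" for tables without a namespace is an assumption, since the slide only names the reserved namespace.

```python
import posixpath

DEFAULT_NAMESPACE = "default"  # assumed spelling of the reserved Default namespace


def split_table_name(name):
    """Split 'my_ns.my_table' into (namespace, qualifier)."""
    namespace, sep, qualifier = name.partition(".")
    if not sep:
        # No explicit namespace: the table belongs to the default one.
        return DEFAULT_NAMESPACE, name
    return namespace, qualifier


def table_path(hbase_root, name):
    """Build /<hbaseRoot>/data/<namespace>/<tableName> as shown on the slide."""
    namespace, _ = split_table_name(name)
    return posixpath.join(hbase_root, "data", namespace, name)
```

For the slide's own example, `table_path("/hbase", "my_ns.my_table")` yields `/hbase/data/my_ns/my_ns.my_table`.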
  • 18. Namespace + Security + Group + Quota
      § Tables
      § Namespace ACL
      § Default Region Server Group
      § Quota
          § Max Tables
          § Max Regions
    Diagram labels: Namespace; Group; Tables; Quota; ACL
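The two quota limits on slide 18 suggest a check that runs before a table is created in a namespace. This is a minimal sketch assuming the check is a simple comparison against current counts. `NamespaceQuota`, `QuotaExceeded`, and `check_create_table` are hypothetical names, not part of HBase.

```python
from dataclasses import dataclass


@dataclass
class NamespaceQuota:
    max_tables: int
    max_regions: int


class QuotaExceeded(Exception):
    """Raised when a request would push a namespace past its quota."""


def check_create_table(quota, current_tables, current_regions, new_regions=1):
    """Reject a table creation that would exceed either namespace limit."""
    if current_tables + 1 > quota.max_tables:
        raise QuotaExceeded(f"table limit of {quota.max_tables} reached")
    if current_regions + new_regions > quota.max_regions:
        raise QuotaExceeded(f"region limit of {quota.max_regions} would be exceeded")
```

The same region check would also apply when an existing table splits, since a split adds regions to the namespace's total.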
  • 19. Namespace + Quota
    Diagram labels: HMaster; TableNamespaceManager; Namespace Table; Namespace ZNodes; NamespaceController; ZKNamespaceManager; MasterCPHost; RegionCPHost
  • 20. Conclusion
      § HBase enables new processing paradigms (vs HDFS)
      § Namespaces provide tenants with a project space
      § Region Server Groups guarantee isolation
      § Namespace Quota limits use of shared resources
      § Namespace ACLs help project-level administration
  • 21. References
      § Region Server Group (HBASE-6721)
      § Namespace (HBASE-8015)