April 2013 HUG: HBase as a Service at Yahoo!

HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. Yahoo! has long used HBase in isolated, one-off deployments. A multi-tenant platform now makes HBase capabilities available to all of our grid customers. We will give a brief overview of HBase and how it works (several of you asked for back-to-basics talks), then spend most of our time on multi-tenancy with HBase.

Presenter(s):

Francis Christopher Liu, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog

Vandana Ayyalasomayajula, Software Engineer, Yahoo! and PPMC Member, Apache HCatalog

    April 2013 HUG: HBase as a Service at Yahoo! Presentation Transcript

    • HBase as a Service at Yahoo!
      Bay Area HUG Presentation
      Francis Liu, Vandana Ayyalasomayajula
      April 17, 2013
    • HBase Overview (Yahoo! Presentation, Confidential)
      Apache HBase is an open source, Bigtable-like, distributed, scalable, consistent, random access, key-value store built on Apache Hadoop.
      HBase Data Model
      Column family: Info
      Rowkey | Email                  | Age | Password
      Alice  | alice@wonderland.com   | 23  |
      Bob    | bob@myworld.com        | 25  | Iambob
      Eve    | hithere@getintouch.com | 30  | nice1pass
      • The table is lexicographically sorted on rowkeys
      • Each cell has multiple versions represented by timestamp, where ts2 > ts1 (ts1 = 1, ts2 = 2); cell versions shown in the figure: trickedyou, newpassword
      • Identify your data (cell value) in the HBase table by [1] rowkey, [2] column family, [3] column qualifier, [4] timestamp/version
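The four-part cell address on this slide can be illustrated with a toy in-memory model. This is only a sketch: `ToyTable` and its methods are hypothetical names for illustration, not the HBase client API, and the values `v1`/`v2` are placeholders.

```python
from bisect import insort


class ToyTable:
    """Toy in-memory stand-in for one HBase table (illustrative only)."""

    def __init__(self):
        self._rowkeys = []  # kept sorted, mimicking lexicographic rowkey order
        self._cells = {}    # (rowkey, family, qualifier) -> {timestamp: value}

    def put(self, rowkey, family, qualifier, value, ts):
        """Store one version of a cell under the given timestamp."""
        if rowkey not in self._rowkeys:
            insort(self._rowkeys, rowkey)
        self._cells.setdefault((rowkey, family, qualifier), {})[ts] = value

    def get(self, rowkey, family, qualifier, ts=None):
        """Return the version at ts, or the newest version when ts is None."""
        versions = self._cells.get((rowkey, family, qualifier), {})
        if not versions:
            return None
        if ts is None:
            ts = max(versions)
        return versions.get(ts)

    def scan(self):
        """Return rowkeys in sorted order."""
        return list(self._rowkeys)


table = ToyTable()
for name, age in [("Eve", "30"), ("Alice", "23"), ("Bob", "25")]:
    table.put(name, "Info", "Age", age, ts=1)
table.put("Alice", "Info", "Password", "v1", ts=1)
table.put("Alice", "Info", "Password", "v2", ts=2)
```

Here `table.scan()` returns the rowkeys in sorted order regardless of insertion order, and `table.get("Alice", "Info", "Password")` returns the value stored at the highest timestamp.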
    • HBase Distributed Mode
      Example table T1 (rowkey, role): Andy Arch; Brad Arch; Dheeraj Ops; Eleanor PgM; Francis Dev; Govind Dev; Rajiv Ops; Sumeet PM; Vandana Dev
      • Table T1 is split into three regions: R1, R2, R3
      • Each region is served by a RegionServer collocated with the DataNode
      Example lookup: find Sumeet's role with HBase
      • Client contacts ZooKeeper, a separate cluster of ZK nodes
      • Retrieve the RS hosting the -ROOT- region (row / meta region)
      • Query the .META. server that has the row key "Sumeet" (row / table region)
      [Diagram: Client, ZooKeeper, -ROOT-, meta regions M1 and M2, RegionServers RS1, RS2, RS3; RS1 hosts T1R1, RS2 hosts T1R2 and T1R3]
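The final hop of the lookup above, from rowkey to region to RegionServer, can be sketched as a search over sorted region start keys. The split points below are hypothetical (the slide does not say where T1 is split), the region-to-server mapping is as read from the slide's diagram, and the ZooKeeper and -ROOT- hops are left out.

```python
from bisect import bisect_right

# Hypothetical split points: the slide does not say where T1 is split.
REGION_START_KEYS = ["", "E", "R"]            # start keys of T1R1, T1R2, T1R3
REGION_NAMES = ["T1R1", "T1R2", "T1R3"]

# Region -> RegionServer mapping, as read from the slide's diagram.
META = {"T1R1": "RS1", "T1R2": "RS2", "T1R3": "RS2"}


def locate(rowkey):
    """Return (region, region_server) for the region whose key range holds rowkey."""
    # The owning region is the last one whose start key is <= rowkey.
    idx = bisect_right(REGION_START_KEYS, rowkey) - 1
    region = REGION_NAMES[idx]
    return region, META[region]


sumeet_location = locate("Sumeet")
```

Under these assumed split points, `"Sumeet"` sorts after `"R"`, so the lookup lands in the third region. A real client also caches region locations so that most requests skip the lookup entirely.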
    • HBase High-level Architecture
      Source: http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
    • HBase Operations
      • get()
      • put()
      • scan()
      • checkAndDelete()
      • checkAndPut()
      • increment()
      • … check the HTable class for further details on operations
      Caution:
      • No queries
      • No secondary indexes
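The conditional and counter operations listed above are atomic per row. A minimal sketch of those semantics, using a lock-protected toy row, is shown below. `ToyRow` is a hypothetical class for illustration and does not reproduce the `HTable` method signatures.

```python
import threading


class ToyRow:
    """Toy single-row model of atomic check-and-put and increment (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cols = {}  # column name -> current value

    def put(self, column, value):
        with self._lock:
            self._cols[column] = value

    def get(self, column):
        with self._lock:
            return self._cols.get(column)

    def check_and_put(self, check_column, expected, column, value):
        """Write value only if check_column currently equals expected.

        Passing expected=None means "only if check_column is absent".
        The check and the write happen under one lock, so they are atomic.
        """
        with self._lock:
            if self._cols.get(check_column) != expected:
                return False
            self._cols[column] = value
            return True

    def increment(self, column, amount=1):
        """Atomically add amount to a counter column and return the new value."""
        with self._lock:
            new_value = self._cols.get(column, 0) + amount
            self._cols[column] = new_value
            return new_value


row = ToyRow()
first = row.check_and_put("owner", None, "owner", "alice")   # succeeds: column absent
second = row.check_and_put("owner", None, "owner", "bob")    # fails: already set
hits = row.increment("hits", 5)
```

The point of the sketch is that the comparison and the write cannot be interleaved with another writer, which is what makes these calls usable for simple coordination without general-purpose queries.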
    • Multi-tenancy Motivation
      • Successful Deployments
        • C.O.R.E (Personalization Engine)
        • Web Crawl Cache
        • etc…
      • Off-stage processing
      • Mutable data
      • Random read/write
    • Metrics/Analytics Use Cases
      [Diagram: Collectors, Ingestion, HBase, Query Server]
    • Dimension Store Use Case
      [Diagram: HBase, HDFS, MapReduce, Hive, Pig, Clickstream, Ad Campaign]
    • Incremental Processing Use Cases
      [Diagram: HBase, MapReduce, Storm, HDFS, Collector; paths labeled Slow and Fast; stages labeled On-stage and Off-stage processing; Collection, Serving, Store, Search, Events, Files]
    • Hadoop at Yahoo!
      • Hosted Multi-tenant Service
      • Security
      • Job Queues
      • HDFS Quota
    • HBase at Yahoo!
      • Hosted Multi-tenant Service
      • Security
      • Isolated Deployment
      • Region Server Group
      • Namespace
    • Security
      • Authentication
        • Kerberos (users, processes)
        • Delegation Token (MapReduce, YARN, etc.)
      • Authorization
        • HBase ACLs (Read, Write, Create, Admin)
        • Grant permissions to User or Unix Group
        • ACL for Table, Column Family or Column
        • Only Global Admin can create/drop tables
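The authorization rules above (permissions granted to a user or group, scoped to a table, column family, or column) can be sketched as a prefix match over scopes. Everything named here is hypothetical: the grants, the users, the one-letter permission codes, and the `@` prefix used in this sketch to mark group principals. It is a model of the idea, not the HBase access controller.

```python
# Each grant: (principal, scope, permissions).
# A scope is (table,), (table, family) or (table, family, qualifier).
GRANTS = [
    ("alice", ("t1",), {"R", "W"}),          # user grant on a whole table
    ("@analysts", ("t1", "info"), {"R"}),    # group grant on one column family
]

# Hypothetical user -> Unix group membership.
USER_GROUPS = {"bob": {"analysts"}}


def is_allowed(user, action, table, family=None, qualifier=None):
    """Return True if any grant for the user or their groups covers the target."""
    principals = {user} | {"@" + g for g in USER_GROUPS.get(user, set())}
    target = (table,)
    if family is not None:
        target += (family,)
        if qualifier is not None:
            target += (qualifier,)
    for principal, scope, perms in GRANTS:
        # A grant covers the target when its scope is a prefix of the target.
        if principal in principals and action in perms and target[:len(scope)] == scope:
            return True
    return False


alice_can_write = is_allowed("alice", "W", "t1", "info", "email")
bob_can_read_family = is_allowed("bob", "R", "t1", "info")
bob_can_read_table = is_allowed("bob", "R", "t1")
```

A table-level grant covers every family and column beneath it, while a family-level grant does not extend upward to the whole table.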
    • Isolated Deployment
      [Diagram: a Compute Cluster (JobTracker, Namenode, TaskTracker/DataNode nodes running M/R tasks with HBase clients) and a separate HBase Cluster (Namenode, HBase Master, ZooKeeper Quorum, RegionServer/DataNode nodes), plus a Gateway/Launcher with HBase Client and MR Client]
    • Region Server Groups§  Member Region Servers§  Member Tables§  Resource Isolation§  Flexibility with configuration14Group BarRegion Server 5…8Table3Table4Group FooRegion Server 1…4Table1Table2RS1Table1Table2RS2Table1Table2RS3Table1Table2RS4 RS5Table3Table4RS6Table3Table4RS7Table3Table4RS8
    • Region Server Groups
      • group_add
      • group_remove
      • group_move_servers
      • group_move_tables
      • create … { … CONFIGURATION=>{'hbase.rsgroup.name'=>'my_group'}}
    • Region Server Groups
      [Diagram: HMaster with LoadBalancer, GroupBasedLoadBalancer (FilterByGroup: foo, bar), GroupAdminEndpoint, GroupMasterObserver, GroupInfoManager, Group Table, Group ZNode]
    • Namespace§  Analogous to Database§  Table Name: <table namespace>.<table qualifier>§  i.e. my_ns.my_table§  Reserved namespaces§  Default – tables with no explicit namespace§  System – tables are guaranteed to be assigned prior to user tables§  Table Path: /<hbaseRoot>/data/<namespace>/<tableName>§  /hbase/data/my_ns/my_ns.my_table17
    • Namespace + Security + Group + Quota
      • Tables
      • Namespace ACL
      • Default Region Server Group
      • Quota
        • Max Tables
        • Max Regions
      [Diagram: a Namespace containing Group, Tables, Quota, ACL]
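The two quota limits on this slide, maximum tables and maximum regions per namespace, amount to a check at table-creation time. A minimal sketch, with hypothetical class and exception names and no claim about how the actual implementation enforces the limits:

```python
class QuotaExceeded(Exception):
    """Raised when a namespace would exceed one of its limits."""


class ToyNamespace:
    """Toy namespace tracking table and region counts (illustrative only)."""

    def __init__(self, name, max_tables, max_regions):
        self.name = name
        self.max_tables = max_tables
        self.max_regions = max_regions
        self.tables = {}  # table name -> number of regions

    def create_table(self, table, regions=1):
        """Register a table if both the table and region quotas allow it."""
        if table in self.tables:
            raise ValueError(f"table {table!r} already exists")
        if len(self.tables) + 1 > self.max_tables:
            raise QuotaExceeded(f"{self.name}: max tables ({self.max_tables}) reached")
        if sum(self.tables.values()) + regions > self.max_regions:
            raise QuotaExceeded(f"{self.name}: max regions ({self.max_regions}) reached")
        self.tables[table] = regions


ns = ToyNamespace("my_ns", max_tables=2, max_regions=10)
ns.create_table("my_ns.t1", regions=4)
ns.create_table("my_ns.t2", regions=6)
```

At this point the namespace is at both limits, so a further `create_table` call raises `QuotaExceeded`. A fuller model would also re-check the region limit when an existing table splits.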
    • Namespace + Quota
      [Diagram: HMaster with TableNamespaceManager, Namespace Table, Namespace ZNodes, NamespaceController, ZKNamespaceManager, MasterCPHost, RegionCPHost]
    • Conclusion
      • HBase enables new processing paradigms (vs. HDFS)
      • Namespaces provide tenants with a project space
      • Region Server Groups guarantee isolation
      • Namespace Quotas limit use of shared resources
      • Namespace ACLs help project-level administration
    • References
      • http://hbase.apache.org/book/book.html
      • Region Server Group (HBASE-6721)
      • Namespace (HBASE-8015)