Intro to HBase Internals & Schema Design (for HBase Users)
                     Alex Baranau, Sematext International, 2012




Monday, July 9, 12
About Me


                     Software Engineer at Sematext International

                     http://blog.sematext.com/author/abaranau

                     @abaranau

                     http://github.com/sematext (abaranau)




Agenda


                     Logical view

                     Physical view

                     Schema design

                     Other/Advanced topics




Why?
                     Why should I, as an HBase user, care about
                     HBase internals?

                       HBase will not automatically tune cluster
                       settings to match usage patterns

                       Schema design, table settings
                       (defined upon creation), etc.
                       depend on how HBase is implemented


Logical View




Logical View: Regions
                     HBase cluster serves multiple tables, distinguished by
                     name

                     Each table consists of rows

                     Each row contains cells:
                     (row key, column family, column, timestamp) -> value

                     Table is split into Regions (table shards, each
                     contains full rows), defined by start and end row keys
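To make the region idea concrete, here is a minimal sketch of how a row key maps to the region whose [start key, end key) range contains it. The region names and boundaries are hypothetical, and plain Python strings stand in for the byte-array keys HBase actually compares; this is an illustration of the lookup, not HBase client code.

```python
import bisect

# Hypothetical table split into three regions. Each region covers
# [start_key, next_start_key); an empty start key means "start of table".
REGION_START_KEYS = ["", "g", "p"]
REGION_NAMES = ["region-1", "region-2", "region-3"]


def find_region(row_key: str) -> str:
    """Return the name of the region whose key range contains row_key."""
    # bisect_right gives the index of the first start key greater than
    # row_key; the region just before that index owns the row.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_NAMES[idx]


# A cell is addressed by (row key, column family, column, timestamp).
cell_key = ("alex", "d", "user", 1341792000000)
cells = {cell_key: b"some value"}

owner = find_region(cell_key[0])
```

Because all cells of a row share the row key, the whole row always lands in one region, which matches "each contains full rows" above.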




Logical View: Regions are Shards
                     Regions are “atoms of distribution”

                     Each region assigned to single RegionServer
                     (HBase cluster slave)

                       Rows of particular Region served by single
                       RS (cluster slave)

                       Regions are distributed evenly across RSs

                       Region has configurable max size

                     When region reaches max size (or on request)
                     it is split into two smaller regions, which
                     can be assigned to different RSs



Logical View: Regions on Cluster
                     [Diagram: a client; a ZooKeeper ensemble (three nodes); two HMaster
                     instances; several RegionServers, each serving multiple Regions]


Logical View: Regions Load

                     It is essential for Regions under load
                     to be evenly distributed across
                     the cluster

                     It is HBase user’s job to make sure
                     the above is true. Note: even
                     distribution of Regions over cluster
                     doesn’t imply that the load is evenly
                     distributed



Logical View: Regions Load

                     Take into account that rows are stored in ordered
                     manner

                     Avoid writing rows with sequential keys, to
                     prevent RS hotspotting*
                        When writing data with monotonically increasing/decreasing
                        keys, all writes go to one RS at a time

                     Use pre-splitting of the table upon creation
                        Starting with single region means using one RS for some time

                     In general, splitting can be expensive

                     Increase max region size

      * see https://github.com/sematext/HBaseWD
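One common way to avoid writing sequential keys is to prefix each key with a small "salt" bucket and pre-split the table on the bucket boundaries. The sketch below illustrates that idea only; the bucket count, key format, and function names are assumptions made for this example and are not the API of HBaseWD or HBase.

```python
import hashlib

NUM_BUCKETS = 4  # assumed number of salt buckets (and of pre-split regions)


def salted_key(original_key: str, buckets: int = NUM_BUCKETS) -> str:
    """Prefix the key with a bucket derived from a hash of the key itself."""
    digest = hashlib.md5(original_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    # The same original key always maps to the same bucket,
    # so single-row reads can recompute the salted key.
    return f"{bucket:02d}_{original_key}"


def split_points(buckets: int = NUM_BUCKETS) -> list:
    """Region boundaries to pre-split on: one region per bucket."""
    return [f"{b:02d}_" for b in range(1, buckets)]


# Sequential keys that would otherwise all hit the same region.
sequential = [f"login_{i:06d}" for i in range(100)]
buckets_used = {salted_key(k)[:2] for k in sequential}
```

The trade-off is on the read side: a range scan over the original key order has to be run once per bucket and the results merged.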



Logical View: Slow RSs
                      When load is distributed evenly, watch for
                      slowest RSs (HBase slaves)

                        Since every region served by single RS,
                        one slow RS can slow down cluster
                        performance e.g. when:

                           data is written into multiple RSs at
                           even pace (random value-based row keys)

                           data is being read from many RSs when
                           doing scan



Physical View




Physical View: Write/Read Flow
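                     [Diagram: write path: client -> HTable (client-side buffer) ->
                     Region on a RegionServer -> Store (per CF) -> MemStore -> flush ->
                     HFiles; writes are also recorded in the Write Ahead Log.
                     Read path: client -> HTable -> Region -> Store (per CF), which
                     holds a MemStore and HFiles. HFiles and the Write Ahead Log are
                     stored in HDFS.]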
Physical: Speed up Writing

                     Enabling & increasing client-side buffer reduces the
                     number of RPC operations

                       warn: possible loss of buffered data

                          in case of client failure; design for failover

                          in case of write failure (networking/server-
                          side issues); can be handled on client

                     Disabling WAL increases write speed

                       warn: possible data loss in case of RS failure

                     Use bulk import functionality (writes HFiles
                     directly, which can be later added to HBase)
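The effect of a client-side buffer can be sketched without any HBase dependency. The `BufferedWriter` class and its `send_batch` callback below are hypothetical names for illustration; the point is only that seven puts become three batched calls, and that whatever is still buffered is lost if the client dies before flushing.

```python
class BufferedWriter:
    """Toy client-side write buffer (hypothetical, not the HBase API)."""

    def __init__(self, send_batch, max_buffered=3):
        self._send_batch = send_batch      # stands in for one RPC call
        self._max_buffered = max_buffered  # flush threshold, in puts
        self._buffer = []

    def put(self, row_key, value):
        self._buffer.append((row_key, value))
        if len(self._buffer) >= self._max_buffered:
            self.flush()

    def flush(self):
        # Send everything buffered so far as a single batch.
        if self._buffer:
            self._send_batch(list(self._buffer))
            self._buffer.clear()


rpc_calls = []  # each element is one batch, i.e. one simulated RPC
writer = BufferedWriter(rpc_calls.append, max_buffered=3)
for i in range(7):
    writer.put(f"row{i}", b"v")

pending = len(writer._buffer)  # puts that would be lost on client failure
writer.flush()                 # explicit flush sends the remainder
```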




Physical: Memstore Flushes
                     When memstore is flushed N HFiles are created (one per
                     CF)

                     Memstore size which causes flushing is configured on
                     two levels:

                        per RS: % of heap occupied by memstores

                        per table: size in MB of single memstore (per CF)
                        of Region

                     When a Region's memstore is flushed, the memstores of all
                     its CFs are flushed

                        Uneven amounts of data across CFs cause too many
                        flushes & creation of too many HFiles (one per CF
                        every time)

                        In most cases having one CF is the best design
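A small simulation shows why uneven CFs hurt under the flush behavior described above (all CFs of a Region flush together). The threshold, CF names, and write sizes are made-up values for illustration, and the model ignores everything except memstore sizes.

```python
def simulate(writes, flush_threshold):
    """Return the list of (cf, size) HFiles produced by a stream of writes.

    Assumed model: when any single CF memstore reaches flush_threshold,
    every non-empty memstore of the region is flushed to its own HFile.
    """
    memstores = {}
    hfiles = []
    for cf, size in writes:
        memstores[cf] = memstores.get(cf, 0) + size
        if memstores[cf] >= flush_threshold:
            for name, used in memstores.items():
                if used > 0:
                    hfiles.append((name, used))
            memstores = {name: 0 for name in memstores}
    return hfiles


# One CF receives 100 bytes per write, the other only 1 byte per write.
writes = []
for _ in range(100):
    writes.append(("big", 100))
    writes.append(("small", 1))

hfiles = simulate(writes, flush_threshold=1000)
small_files = [size for cf, size in hfiles if cf == "small"]
big_files = [size for cf, size in hfiles if cf == "big"]
```

In this run the "small" CF ends up with as many HFiles as the "big" one, each only a few bytes long, which is the "too many HFiles" effect the slide warns about.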

Physical: Memstore Flushes
                     Important: there are Memstore size
                     thresholds which cause writes to be blocked,
                     so slow memstore flushes and overuse of
                     memory by memstore can cause write perf
                     degradation

                       Hint: watch for flush queue size metric on
                       RSs

                     At the same time the more memory memstore
                     uses the better for writing/reading perf
                     (unless it reaches those “write blocking”
                     thresholds)



Physical: Memstore Flushes

                        [Figure: example of good situation*]

                * http://sematext.com/spm/index.html

Physical: HFiles Compaction
                     HFiles are periodically compacted into bigger
                     HFiles containing same data

                       Reading from fewer HFiles is faster

                     Important: there’s a configured max number of files
                     in a Store which, when reached, causes writes to block

                       Hint: watch for compaction queue size metric on
                       RSs


                     [Diagram: a read against a Store (per CF) consults the MemStore
                     and each HFile]
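Compaction can be pictured as a merge of already-sorted files into one sorted file with the same cells. The sketch below assumes each HFile is a list of (row key, timestamp, value) tuples sorted by row key and then by newest timestamp first; the cell layout and function names are simplifications for illustration, not the real HFile format.

```python
import heapq


def sort_key(cell):
    """Order cells by row key, then newest timestamp first."""
    row_key, timestamp, _value = cell
    return (row_key, -timestamp)


def compact(hfiles):
    """Merge several sorted HFiles into one sorted HFile with the same data."""
    return list(heapq.merge(*hfiles, key=sort_key))


hfile_old = [("a", 1, "a1"), ("c", 1, "c1")]
hfile_new = [("a", 2, "a2"), ("b", 1, "b1")]

before = [hfile_old, hfile_new]  # a read may have to check both files
after = [compact(before)]        # after compaction there is one file to check
```

No cell is dropped here; the gain is that a read consults one file instead of two.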



Physical: Data Locality
                       RSs are usually collocated with HDFS DataNodes

                       [Diagram: two slave nodes, each running a RegionServer (HBase),
                       a TaskTracker (MapReduce), and a DataNode (HDFS)]




Physical: Data Locality
                     HBase tries to assign Regions to RSs so that
                     Region data is stored physically on the same node,
                     but this sometimes fails

                       after a Region splits there’s no guarantee that
                       there’s a node that has all blocks (HDFS level)
                       of the new Region, and

                       no guarantee that HBase will not re-assign this
                       Region to a different RS in the future (even
                       distribution of Regions takes precedence over
                       data locality)

                     There is ongoing work towards better preserving
                     data locality



Physical: Data Locality
                     Also, data locality can break when:

                        Adding new slaves to cluster

                        Removing slaves from cluster

                            Incl. node failures

                     Hint: look at networking IO between slaves when writing/reading
                     data, it should be minimal

                     Important:

                        make sure HDFS is well balanced (use balancer tool)

                        try to rebalance Regions in HBase cluster if possible (HBase
                        Master restart will do that) to regain data locality

                        Pre-split table on creation to limit (ideally avoid) splits
                        and region movement; managing splits manually sometimes helps




Schema Design
                       (very briefly)




Schema: row keys
                     Using a row key (or a range of keys) is the most
                     efficient way to retrieve data from HBase

                       Row key design is major part of schema design

                       Note: no secondary indices available out of
                       the box


                                  Row Key                  Data
                      ‘login_2012-03-01.00:09:17’   d:{‘user’:‘alex’}
                                    ...                    ...
                      ‘login_2012-03-01.23:59:35’   d:{‘user’:‘otis’}
                      ‘login_2012-03-02.00:00:21’   d:{‘user’:‘david’}
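Because rows are stored sorted by key, "all logins on 2012-03-01" becomes a range scan between two keys. The sketch below reproduces that with a sorted list and binary search; the `scan` helper is a hypothetical stand-in for an HBase scan with start and stop rows, where the stop row is exclusive.

```python
import bisect

# Rows from the table above, kept sorted by row key as HBase would store them.
rows = sorted(
    [
        ("login_2012-03-01.00:09:17", {"user": "alex"}),
        ("login_2012-03-01.23:59:35", {"user": "otis"}),
        ("login_2012-03-02.00:00:21", {"user": "david"}),
    ],
    key=lambda row: row[0],
)


def scan(sorted_rows, start_key, stop_key):
    """Return rows with start_key <= row key < stop_key."""
    keys = [key for key, _data in sorted_rows]
    lo = bisect.bisect_left(keys, start_key)
    hi = bisect.bisect_left(keys, stop_key)
    return sorted_rows[lo:hi]


logins_on_march_1 = scan(rows, "login_2012-03-01", "login_2012-03-02")
users = [data["user"] for _key, data in logins_on_march_1]
```

The same key layout cannot answer "all actions by alex" without reading every row, which is why row key design carries so much of the schema design.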




Schema: row keys
                     Redundancy is OK!
                        warn: changing two rows in HBase is not an atomic operation



                                  Row Key                       Data
                     ‘login_2010-01-01.00:09:17’          d:{‘user’:‘alex’}
                                    ...                          ...
                     ‘login_2012-03-01.23:59:35’          d:{‘user’:‘otis’}
                     ‘alex_2010-01-01.00:09:17’         d:{‘action’:‘login’}
                                    ...                          ...
                     ‘otis_2012-03-01.23:59:35’         d:{‘action’:‘login’}
                     ‘alex_login_2010-01-01.00:09:17’     d:{‘device’:‘pc’}
                                    ...                          ...
                     ‘otis_login_2012-03-01.23:59:35’   d:{‘device’:‘mobile’}
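The redundancy shown in the table amounts to writing one event under several row keys, one per access pattern. The helper below is a hypothetical sketch of generating those rows; in a real client each row would be a separate write, which is why the non-atomicity warning matters.

```python
def index_rows(user, action, timestamp, device):
    """Build the redundant rows for one event, one per access pattern."""
    return [
        # "what happened at a given time"
        (f"{action}_{timestamp}", {"user": user}),
        # "what did this user do"
        (f"{user}_{timestamp}", {"action": action}),
        # "details of this user's actions of one kind"
        (f"{user}_{action}_{timestamp}", {"device": device}),
    ]


rows_for_event = index_rows("alex", "login", "2010-01-01.00:09:17", "pc")
row_keys = [key for key, _data in rows_for_event]
```

If the client fails between the first and the last write, some of these rows exist and others do not, so readers should tolerate partially written events.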




Schema: Relations
                        Not relational

                        No joins

                        Denormalization is OK! Use ‘nested entities’

                     [Diagram: student * --- * professor (many-to-many relation)]

                     Row Key               Data
                     ‘student_abaranau’    d:{
                                             student_firstname:Alex,
                                             student_lastname:Baranau,
                                             professor_math_firstname:David,
                                             professor_math_lastname:Smart,
                                             professor_cs_firstname:Jack,
                                             professor_cs_lastname:Weird,
                                           }
                     ‘prof_dsmart’         d:{...}


Schema: row key/CF/qual size

                     HBase stores cells individually

                        great for “sparse” data

                        row key, CF name and column name stored with each
                        cell which may affect data amount to be stored and
                        managed

                           keep them short

                           serialize and store many values into single cell

                             Row Key                     Data
                                          d:{
                                          s:Alex#Baranau#cs#2009,
                           ‘s_abaranau’   p_math:David#Smart,
                                          p_cs:Jack#Weird,
                                          }
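The saving from short names and packed values can be estimated with a rough model in which every cell repeats the row key, CF name, and column name. The sketch below uses that model only; it ignores timestamps and other per-cell overhead, and the `#` delimiter assumes that values never contain `#`.

```python
def pack(values):
    """Serialize several values into one cell value, '#'-delimited."""
    return "#".join(values).encode("utf-8")


def unpack(cell_value):
    """Inverse of pack; only valid if no value contains '#'."""
    return cell_value.decode("utf-8").split("#")


def stored_bytes(row_key, cf, cells):
    """Rough size estimate: each cell repeats row key, CF and column name."""
    return sum(
        len(row_key) + len(cf) + len(column) + len(value)
        for column, value in cells.items()
    )


verbose = stored_bytes(
    "student_abaranau",
    "d",
    {"student_firstname": b"Alex", "student_lastname": b"Baranau"},
)
compact = stored_bytes(
    "s_abaranau",
    "d",
    {"s": pack(["Alex", "Baranau"])},
)
```

The price of packing is that individual values can no longer be read or updated on their own; the whole cell is read and rewritten.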



Other/Advanced Topics




Advanced: Co-Processors
                     Coprocessors API (HBase 0.92.0+) allows you to:

                       execute (querying/aggregation/etc.)
                       logic on server side (you may think of
                       it as stored procedures in RDBMS)

                       perform auditing of actions performed on
                       server-side (you may think of it as
                       triggers in RDBMS)

                       apply security rules for data access

                       and much more



Other: Use Compression
                      Using compression:

                         reduces data amount to be stored on disks

                         reduces data amount to be transferred when an RS reads data
                         from a non-local replica

                         increases amount of CPU used, but CPU isn’t usually a bottleneck

                      Favor compression speed over compression ratio

                         SNAPPY is good

                      Use wisely:

                         e.g. avoid wasting CPU cycles on compressing images

                             compression can be configured on per CF basis, so storing
                             non-compressible data in separate CF sometimes helps

                         data blocks are held uncompressed in memory; make sure this
                         doesn’t cause an OOME

                         note: when scanning (seeking data to return for scan) many data
                         blocks can be uncompressed even if none of the data will be
                         returned from those blocks
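The "use wisely" advice can be checked with a quick experiment: repetitive, text-like data compresses well, while data that is already compressed or random (such as images) barely shrinks, so the CPU spent on it is wasted. The sketch uses `zlib` from the Python standard library as a stand-in, since Snappy is not part of it; the sample payloads are invented for this example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive, log-like payload: typical of what compresses well.
text_like = b"login_2012-03-01 user=alex action=login device=pc\n" * 200

# Random bytes stand in for already-compressed content such as images.
random_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
random_ratio = compression_ratio(random_like)
```

Since compression is configured per CF, a result like this is an argument for keeping incompressible payloads in their own CF with compression turned off.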


Other: Use Monitoring



                        TBD

                        Ganglia, Cacti, others*. Just use it!




                * http://sematext.com/spm/index.html

Qs?




                     Sematext is hiring!
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 

Recently uploaded (20)

How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 

Intro to HBase Internals & Schema Design (for HBase users)

  • 1. Intro to HBase Internals & Schema Design (for HBase Users) Alex Baranau, Sematext International, 2012 Monday, July 9, 12
  • 2. About Me: Software Engineer at Sematext International; http://blog.sematext.com/author/abaranau; @abaranau; http://github.com/sematext (abaranau)
  • 3. Agenda: Logical view; Physical view; Schema design; Other/Advanced topics
  • 4. Why? Why should I (an HBase user) care about HBase internals? HBase will not automatically tune cluster settings to be optimal for your usage patterns. Schema design, table settings (defined upon creation), etc. depend on aspects of the HBase implementation.
  • 6. Logical View: Regions. An HBase cluster serves multiple tables, distinguished by name. Each table consists of rows. Each row contains cells: (row key, column family, column, timestamp) -> value. A table is split into Regions (table shards, each containing full rows), defined by start and end row keys.
  • 7. Logical View: Regions are Shards. Regions are “atoms of distribution”. Each Region is assigned to a single RegionServer (RS, an HBase cluster slave), so the rows of a particular Region are served by a single RS. Regions are distributed evenly across RSs. A Region has a configurable max size. When a Region reaches its max size (or on request) it is split into two smaller Regions, which can be assigned to different RSs.
  • 8. Logical View: Regions on Cluster. [Diagram: a client, a ZooKeeper ensemble, HMasters, and RegionServers, each RegionServer hosting several Regions]
  • 9. Logical View: Regions Load. It is essential for Regions under load to be evenly distributed across the cluster. It is the HBase user’s job to make sure this is true. Note: even distribution of Regions over the cluster doesn’t imply that the load is evenly distributed.
  • 10. Logical View: Regions Load. Take into account that rows are stored in an ordered manner. Avoid writing rows with sequential keys, to prevent RS hotspotting*: when writing data with monotonically increasing/decreasing keys, data is written to one RS at a time. Pre-split the table upon creation: starting with a single Region means using one RS for some time. In general, splitting can be expensive; increase the max Region size. * see https://github.com/sematext/HBaseWD
  • 11. Logical View: Slow RSs. When load is distributed evenly, watch for the slowest RSs (HBase slaves). Since every Region is served by a single RS, one slow RS can slow down the whole cluster, e.g. when data is written into multiple RSs at an even pace (random value-based row keys), or when data is being read from many RSs during a scan.
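The hotspotting advice on slide 10 can be sketched in a few lines. This is a minimal illustration, not the HBaseWD API: the bucket count, the one-byte prefix, and the names `salted_key`, `split_points` and `region_index` are assumptions made for the example. Sequential keys get a hash-derived bucket prefix, and the table is assumed to be pre-split on those prefixes so that each bucket lands in its own Region.

```python
import bisect
import zlib

NUM_BUCKETS = 8  # assumed number of buckets / pre-split Regions


def salted_key(row_key: bytes, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with one bucket byte derived from a hash of the key."""
    bucket = zlib.crc32(row_key) % buckets
    return bytes([bucket]) + row_key


def split_points(buckets: int = NUM_BUCKETS) -> list:
    """Split keys that pre-split a table into one Region per bucket."""
    return [bytes([b]) for b in range(1, buckets)]


def region_index(key: bytes, splits: list) -> int:
    """Index (in key order) of the Region whose key range contains the key."""
    return bisect.bisect_right(splits, key)


# Sequential keys all fall into one Region; salted keys spread out.
keys = [b"login_%06d" % i for i in range(1000)]
splits = split_points()
plain_regions = {region_index(k, splits) for k in keys}
salted_regions = {region_index(salted_key(k), splits) for k in keys}
print(len(plain_regions), len(salted_regions))
```

The trade-off in a scheme like this is on the read side: a range scan over the original keys has to be issued once per bucket and the results merged.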
  • 13. Physical View: Write/Read Flow. [Diagram: the client’s HTable, with its client buffer, writes to and reads from a RegionServer; each Region holds one Store per CF; each Store has a MemStore that is flushed to HFiles; HFiles and the Write Ahead Log are kept in HDFS]
  • 14. Physical: Speed up Writing. Enabling and enlarging the client-side buffer reduces the number of RPC operations. warn: buffered data can be lost if the client fails; design for failover in case of write failure (networking/server-side issues), which can be handled on the client. Disabling the WAL increases write speed. warn: possible data loss in case of RS failure. Use the bulk import functionality (it writes HFiles directly, which can later be added to HBase).
  • 15. Physical: Memstore Flushes. When a memstore is flushed, N HFiles are created (one per CF). The memstore size that triggers flushing is configured on two levels: per RS (% of heap occupied by memstores) and per table (size in MB of a single memstore (per CF) of a Region). When a Region’s memstores flush, the memstores of all CFs are flushed. An uneven amount of data between CFs causes too many flushes and the creation of too many HFiles (one per CF every time). In most cases having one CF is the best design.
  • 16. Physical: Memstore Flushes. Important: there are memstore size thresholds which cause writes to be blocked, so slow memstore flushes and overuse of memory by memstores can degrade write performance. Hint: watch the flush queue size metric on RSs. At the same time, the more memory the memstore uses, the better for write/read performance (unless it reaches those “write blocking” thresholds).
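The buffer trade-off on slide 14 can be illustrated with a toy model. This is a sketch under stated assumptions, not the HBase client API: the `BufferedWriter` class, its fields, and the 100-byte puts are all hypothetical. It counts simulated RPCs and shows which puts are still sitting in the client when the loop ends.

```python
class BufferedWriter:
    """Toy model of a client-side write buffer (not the HBase client API)."""

    def __init__(self, buffer_limit_bytes: int):
        self.buffer_limit_bytes = buffer_limit_bytes
        self.pending = []       # puts not yet sent to the server
        self.pending_bytes = 0
        self.rpc_count = 0      # number of simulated round trips
        self.sent = []          # puts the simulated server has received

    def put(self, row_key: bytes, value: bytes) -> None:
        self.pending.append((row_key, value))
        self.pending_bytes += len(row_key) + len(value)
        if self.pending_bytes >= self.buffer_limit_bytes:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        self.sent.extend(self.pending)  # one RPC carries the whole batch
        self.rpc_count += 1
        self.pending = []
        self.pending_bytes = 0


def write_rows(buffer_limit_bytes: int, rows: int = 10) -> BufferedWriter:
    writer = BufferedWriter(buffer_limit_bytes)
    for i in range(rows):
        writer.put(b"row%02d" % i, b"x" * 95)  # 5 + 95 = 100 bytes per put
    return writer


unbuffered = write_rows(buffer_limit_bytes=1)  # every put is sent immediately
buffered = write_rows(buffer_limit_bytes=300)  # three puts per RPC
print(unbuffered.rpc_count, buffered.rpc_count, len(buffered.pending))
```

In the buffered run, the put still in `pending` is what would be lost if the client process died before an explicit `flush()`.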
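The effect of uneven CFs on slide 15 can be shown with a small simulation. The flush rule here is a deliberate simplification (any CF memstore reaching the threshold flushes all CFs of the Region), and the function name, CF names and sizes are assumptions chosen for the example rather than real HBase settings.

```python
def simulate_flushes(rows: int, bytes_per_cf: dict, flush_threshold: int) -> list:
    """Return the HFiles, as (cf, size) pairs, created while writing rows.

    Simplified rule: when any CF memstore reaches the threshold, the
    memstores of all CFs in the Region are flushed, one HFile per CF.
    """
    memstores = {cf: 0 for cf in bytes_per_cf}
    hfiles = []
    for _ in range(rows):
        for cf, size in bytes_per_cf.items():
            memstores[cf] += size
        if max(memstores.values()) >= flush_threshold:
            for cf in memstores:
                if memstores[cf] > 0:
                    hfiles.append((cf, memstores[cf]))
                memstores[cf] = 0
    return hfiles


# Same 11 bytes per row, either split unevenly over two CFs or kept in one.
two_cfs = simulate_flushes(1000, {"big": 10, "small": 1}, flush_threshold=100)
one_cf = simulate_flushes(1000, {"d": 11}, flush_threshold=100)
small_files = [size for cf, size in two_cfs if cf == "small"]
print(len(two_cfs), len(one_cf), max(small_files))
```

With two uneven CFs the same data ends up in twice as many HFiles, and half of them are tiny files from the small CF, which is the extra compaction work the slide warns about.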
  • 17. Physical: Memstore Flushes. Example of a good situation*. * http://sematext.com/spm/index.html
  • 18. Physical: HFiles Compaction. HFiles are periodically compacted into bigger HFiles containing the same data. Reading from fewer HFiles is faster. Important: there is a configured max number of files in a Store which, when reached, causes writes to block. Hint: watch the compaction queue size metric on RSs. [Diagram: a read consults the Store’s MemStore (per CF) and its HFiles]
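A minimal sketch of why compaction helps reads, assuming each HFile can be modeled as a sorted list of (row key, timestamp, value) cells. The `compact` and `get` helpers are hypothetical, and the model ignores the indexes and filters a real store may use to skip files.

```python
import heapq

# Each HFile is modeled as a list of (row_key, timestamp, value), sorted by key.
hfile_1 = [("row1", 1, "a"), ("row4", 1, "d")]
hfile_2 = [("row2", 2, "b"), ("row5", 2, "e")]
hfile_3 = [("row3", 3, "c"), ("row4", 3, "d2")]


def compact(hfiles: list) -> list:
    """Merge sorted HFiles into one sorted HFile holding the same cells."""
    return list(heapq.merge(*hfiles))


def get(row_key: str, hfiles: list) -> tuple:
    """Return (newest value, number of files consulted) for a row key."""
    newest = None
    for hfile in hfiles:  # in this model any file may hold a version of the row
        for key, ts, value in hfile:
            if key == row_key and (newest is None or ts > newest[0]):
                newest = (ts, value)
    return (newest[1] if newest else None, len(hfiles))


compacted = compact([hfile_1, hfile_2, hfile_3])
before = get("row4", [hfile_1, hfile_2, hfile_3])
after = get("row4", [compacted])
print(before, after)
```

The read returns the same newest value either way; only the number of files it has to consult changes.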
  • 19. Physical: Data Locality. RSs are usually collocated with HDFS DataNodes. [Diagram: each slave node runs an HBase RegionServer, a MapReduce TaskTracker and an HDFS DataNode]
  • 20. Physical: Data Locality. HBase tries to assign Regions to RSs so that Region data is stored physically on the same node, but sometimes fails: after Region splits there’s no guarantee that there’s a node that has all blocks (HDFS level) of the new Region, and no guarantee that HBase will not re-assign this Region to a different RS in the future (even distribution of Regions takes precedence over data locality). There is ongoing work towards better preserving data locality.
  • 21. Physical: Data Locality. Also, data locality can break when adding new slaves to the cluster or removing slaves from the cluster (incl. node failures). Hint: look at network IO between slaves when writing/reading data; it should be minimal. Important: make sure HDFS is well balanced (use the balancer tool); try to rebalance Regions in the HBase cluster if possible (an HBase Master restart will do that) to regain data locality. Pre-split the table on creation to limit (ideally avoid) splits and Region movement; managing splits manually sometimes helps.
  • 22. Schema Design (very briefly)
  • 23. Schema: row keys. Using a row key (or a range of keys) is the most efficient way to retrieve data from HBase. Row key design is a major part of schema design. Note: no secondary indices are available out of the box.
      Row Key                        Data
      ‘login_2012-03-01.00:09:17’    d:{‘user’:‘alex’}
      ...                            ...
      ‘login_2012-03-01.23:59:35’    d:{‘user’:‘otis’}
      ‘login_2012-03-02.00:00:21’    d:{‘user’:‘david’}
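Why key ranges are cheap to read can be sketched with a sorted list and binary search. This is only a model of ordered storage using the sample rows from slide 23; the `scan` helper and its start-inclusive, stop-exclusive convention are assumptions of the example, not HBase client code.

```python
import bisect

# Sample rows from slide 23, kept sorted by row key.
rows = sorted(
    [
        ("login_2012-03-01.00:09:17", {"user": "alex"}),
        ("login_2012-03-01.23:59:35", {"user": "otis"}),
        ("login_2012-03-02.00:00:21", {"user": "david"}),
    ],
    key=lambda row: row[0],
)
row_keys = [key for key, _ in rows]


def scan(start_row: str, stop_row: str) -> list:
    """Return rows with start_row <= key < stop_row using binary search."""
    lo = bisect.bisect_left(row_keys, start_row)
    hi = bisect.bisect_left(row_keys, stop_row)
    return rows[lo:hi]


# All logins on 2012-03-01 form one contiguous slice of the sorted keys.
march_first = scan("login_2012-03-01", "login_2012-03-02")
print([data["user"] for _, data in march_first])
```

Because the date is part of the key, "all logins on a given day" becomes a contiguous key range instead of a filter over every row.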
  • 24. Schema: row keys. Redundancy is OK! warn: changing two rows in HBase is not an atomic operation.
      Row Key                             Data
      ‘login_2010-01-01.00:09:17’         d:{‘user’:‘alex’}
      ...                                 ...
      ‘login_2012-03-01.23:59:35’         d:{‘user’:‘otis’}
      ‘alex_2010-01-01.00:09:17’          d:{‘action’:‘login’}
      ...                                 ...
      ‘otis_2012-03-01.23:59:35’          d:{‘action’:‘login’}
      ‘alex_login_2010-01-01.00:09:17’    d:{‘device’:‘pc’}
      ...                                 ...
      ‘otis_login_2012-03-01.23:59:35’    d:{‘device’:‘mobile’}
  • 25. Schema: Relations. Not relational. No joins. Denormalization is OK! Use ‘nested entities’. [Diagram: student * to * professor relation]
      Row Key               Data
      ‘student_abaranau’    d:{ student_firstname:Alex, student_lastname:Baranau,
                                professor_math_firstname:David, professor_math_lastname:Smart,
                                professor_cs_firstname:Jack, professor_cs_lastname:Weird }
      ‘prof_dsmart’         d:{...}
  • 26. Schema: row key/CF/qual size. HBase stores cells individually: great for “sparse” data, but the row key, CF name and column name are stored with each cell, which may affect the amount of data to be stored and managed. Keep them short; serialize and store many values in a single cell.
      Row Key         Data
      ‘s_abaranau’    d:{ s:Alex#Baranau#cs#2009, p_math:David#Smart, p_cs:Jack#Weird }
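A rough back-of-the-envelope sketch of the per-cell overhead described on slide 26. The fixed overhead below assumes the classic KeyValue layout (length fields, timestamp and type byte); treat the exact byte counts as an assumption of this example, and note that compression or block encoding would change the on-disk numbers. The `cell_size` helper is hypothetical.

```python
# Assumed fixed per-cell overhead of the KeyValue layout, in bytes:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1).
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Estimated uncompressed size of one stored cell."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


# Long names, one value per cell (slide 25 style).
verbose = [
    (b"student_abaranau", b"d", b"student_firstname", b"Alex"),
    (b"student_abaranau", b"d", b"student_lastname", b"Baranau"),
]
# Short names, values packed into one cell (slide 26 style).
packed = [(b"s_abaranau", b"d", b"s", b"Alex#Baranau")]

verbose_bytes = sum(cell_size(*cell) for cell in verbose)
packed_bytes = sum(cell_size(*cell) for cell in packed)
print(verbose_bytes, packed_bytes)
```

The 11 bytes of actual values are the same in both layouts; the difference comes from repeating the row key, names and fixed overhead for every cell.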
  • 27. Other/Advanced Topics
  • 28. Advanced: Co-Processors. The CoProcessors API (HBase 0.92.0+) allows you to: execute (querying/aggregation/etc.) logic on the server side (think of it as stored procedures in an RDBMS); perform auditing of actions performed on the server side (think of it as triggers in an RDBMS); apply security rules for data access; and much more.
  • 29. Other: Use Compression. Using compression: reduces the amount of data stored on disk; reduces the amount of data transferred when an RS reads data from a non-local replica; increases CPU usage, but CPU usually isn’t the bottleneck. Favor compression speed over compression ratio; SNAPPY is good. Use wisely: e.g. avoid wasting CPU cycles on compressing images; compression can be configured on a per-CF basis, so storing non-compressible data in a separate CF sometimes helps; data blocks are uncompressed in memory, so make sure this doesn’t cause an OOME; note: when scanning (seeking data to return for the scan) many data blocks can be uncompressed even if none of the data from those blocks is returned.
  • 30. Other: Use Monitoring. TBD. Ganglia, Cacti, other*. Just use it! * http://sematext.com/spm/index.html
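The "use wisely" point on slide 29 can be checked with a quick experiment. This sketch uses `zlib` from the Python standard library as a stand-in for Snappy, and seeded random bytes as a stand-in for already-compressed data such as images; both substitutions, and the sample payload, are assumptions of the example.

```python
import random
import zlib

rng = random.Random(0)  # seeded so the run is repeatable

# Repetitive, text-like cell values compress well.
text_like = b"user=alex;action=login;device=pc;" * 300
# Random bytes stand in for already-compressed data such as images.
image_like = bytes(rng.getrandbits(8) for _ in range(len(text_like)))


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 1)) / len(data)  # level 1 favors speed


print(round(ratio(text_like), 3), round(ratio(image_like), 3))
```

The text-like payload shrinks to a small fraction of its size, while the random payload does not shrink at all, so the CPU spent compressing it is wasted. That is the case for keeping such data in a separate, uncompressed CF.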
  • 31. Qs? Sematext is hiring!