Apache HBase Internals you Hoped you Never Needed to Understand

Covers numerous internal features, concepts, and implementations of Apache HBase. The focus is operational: each component is investigated enough to understand its role in Apache HBase and the generic problem it is trying to solve. Topics range from HBase’s RPC system to the new Procedure v2 framework, to filesystem and ZooKeeper use, to backup and replication features, to region assignment and row locks. Each topic is covered at a high level, distilling the often complicated details down to the most salient information.

Published in: Software

  1. Apache HBase Internals you Hoped you Never Needed to Understand
     Josh Elser, Future of Data, NYC, 2016/10/11
  2. Engineer at Hortonworks, Member of the Apache Software Foundation
     Top-Level Projects
       • Apache Accumulo®
       • Apache Calcite™
       • Apache Commons™
       • Apache HBase®
       • Apache Phoenix™
     ASF Incubator
       • Apache Fluo™
       • Apache Gossip™
       • Apache Pirk™
       • Apache Rya™
       • Apache Slider™
     These Apache project names are trademarks or registered trademarks of the Apache Software Foundation.
     © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  3. Apache HBase for storing your data!
     CC BY 3.0 US: http://hbase.apache.org/
  4. What happens when things go wrong?
     CC BY-ND 2.0: https://www.flickr.com/photos/widnr/6588151679
  5. The BigTable Architecture
     ▪ BigTable’s architecture is simple
     ▪ Debugging a distributed system is not simple
     ▪ How can we break down a complex system?
     ▪ How do we write resilient software?
       • Log-Structured Merge Tree
       • Write-Ahead Logs
       • Distributed Coordination
       • Row-based, Auto-Sharding
       • Strong Consistency
       • Read Isolation
       • Coprocessors
       • Security (AuthN/AuthZ)
       • Backups
  6. Naming Conventions
     ▪ Servers
         – Hostname, Port, and Timestamp
         – RegionServer: r01n01.domain.com,16201,1475691463147
         – Master: r02n01.domain.com,16000,1475691462616
     ▪ Regions
         – Table, Start RowKey, Region ID (timestamp), Replica ID, Encoded name
         – T1,\x04\x00\x00,1470324608597.c04d94cd4ee9797da2fb906b4dcd2e3c.
         – Or simply c04d94cd4ee9797da2fb906b4dcd2e3c
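The naming conventions above are regular enough to parse mechanically, which helps when grepping logs. The following is a minimal sketch, not HBase's own parser: the class and function names are hypothetical, and the handling of an optional `_<replica>` suffix on the region ID (read here as hexadecimal) is an assumption that the slide's example does not exercise.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    hostname: str
    port: int
    start_code: int  # the "timestamp" component of the server name


class RegionName(NamedTuple):
    table: str
    start_key: str
    region_id: int
    replica_id: int
    encoded_name: str


def parse_server_name(name: str) -> ServerName:
    """Split 'hostname,port,timestamp' into its three parts."""
    host, port, start_code = name.split(",")
    return ServerName(host, int(port), int(start_code))


def parse_region_name(name: str) -> RegionName:
    """Split 'table,startkey,regionid.encodedname.' into its parts."""
    # The table name ends at the first comma. The start key may itself
    # contain commas, so the region ID is whatever follows the last comma.
    table, rest = name.split(",", 1)
    start_key, tail = rest.rsplit(",", 1)
    id_part, encoded = tail.rstrip(".").split(".", 1)
    # Assumption: a replica, if present, appears as '<regionid>_<replica>'.
    region_id, _, replica = id_part.partition("_")
    return RegionName(table, start_key, int(region_id),
                      int(replica, 16) if replica else 0, encoded)
```

For example, `parse_server_name("r01n01.domain.com,16201,1475691463147").port` gives `16201`.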
  7. Regions
     ▪ A sorted “shard” of a table
     ▪ At least one “column family”
         – Physical partitions
     ▪ Each family can have zero to many files
     ▪ Hosted by at most one RegionServer
         – Can have many hosting RegionServers for reads
     ▪ In-memory locks for certain intra-row operations
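Because regions are sorted, non-overlapping shards, finding the region that owns a row key reduces to a binary search over region start keys. A minimal sketch of that idea, assuming the start keys are already held in a sorted list whose first entry is the empty key (the function name is hypothetical, and this is not how the HBase client is implemented):

```python
import bisect
from typing import Sequence


def find_region(start_keys: Sequence[bytes], row_key: bytes) -> int:
    """Return the index of the region whose key range contains row_key.

    start_keys must be sorted, and start_keys[0] must be b"" so that
    every possible row key falls into some region.
    """
    # bisect_right counts the start keys <= row_key; the owning region
    # is the last of those.
    return bisect.bisect_right(start_keys, row_key) - 1
```

With start keys `[b"", b"g", b"n", b"t"]`, the row key `b"apple"` maps to region 0 and `b"g"` maps to region 1, since a region's start key is inclusive.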
  8. Region Assignment
     ▪ Coordinated by the HBase Master
     ▪ A Region must only be hosted by one RegionServer
     ▪ State tracked in hbase:meta
         – hbck to fix issues
     ▪ Region splits/merges make a hard problem even harder
     ▪ Moving towards ProcedureV2
     Diagram (Normal Region Assignment States): Closed, Offline, Opening, Open, Pending Open
  9. The File System
     ▪ HDFS “Compatible”
         – Distributed, durable, “write leases”
     ▪ Physical storage of HBase Tables (HFiles)
     ▪ Write-ahead logs
     ▪ A parent directory in that FileSystem (hbase.rootdir)
  10. The File System: Physical Separation by HBase Namespace
        /hbase/data/
        /hbase/data/default/<table1>
        /hbase/data/default/.tabledesc/.tableinfo…
        /hbase/data/default/<table2>/<region_id1>
        /hbase/data/default/<table2>/<region_id2>
        /hbase/data/my_custom_ns/<table3>/…
        /hbase/data/hbase/meta/…
        /hbase/archive/…
        /hbase/WALs/<regionserver_name>/…
        /hbase/oldWALs/…
        /hbase/corrupt/…
  11. The File System for one Region
        /hbase/data/default/<table2>/<region_id1>
        …/.regioninfo
        …/.tmp
        …/<family1>/<hfile>
        …/<family1>/<hfile>
        …/<family2>/<hfile>
        …/<family3>/<hfile>
        …/recovered.edits/<number>.seqid
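When debugging, it is often useful to compute where a region's files should live. The sketch below only joins path components in the order shown on the two file-system slides; the helper names are hypothetical and `/hbase` stands in for whatever `hbase.rootdir` is set to.

```python
from pathlib import PurePosixPath


def region_dir(root: str, namespace: str, table: str, region: str) -> PurePosixPath:
    """<root>/data/<namespace>/<table>/<region>"""
    return PurePosixPath(root) / "data" / namespace / table / region


def family_dir(root: str, namespace: str, table: str, region: str,
               family: str) -> PurePosixPath:
    """Directory that holds the HFiles of one column family of one region."""
    return region_dir(root, namespace, table, region) / family


def wal_dir(root: str, regionserver_name: str) -> PurePosixPath:
    """<root>/WALs/<regionserver_name>"""
    return PurePosixPath(root) / "WALs" / regionserver_name
```

For instance, `family_dir("/hbase", "default", "table2", "c04d94cd4ee9797da2fb906b4dcd2e3c", "family1")` yields `/hbase/data/default/table2/c04d94cd4ee9797da2fb906b4dcd2e3c/family1`.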
  12. Writes into HBase
      ▪ Mutations inserted into sorted in-memory structure and WAL
          – Fast lookups of recent data
          – Append-only log for durability and speed
      ▪ Mutations are collected by destination Region
      ▪ Beware of hot-spotting
      ▪ Data in memory eventually flushed into sorted (H)files
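The write path above can be illustrated with a toy model: append to a log first, then update an in-memory map, and flush to a sorted file once enough data accumulates. This is a sketch under simplifying assumptions, not HBase's implementation: `ToyRegion` and `flush_threshold` are hypothetical names, the log and files are plain Python lists, and the in-memory map is sorted only at flush time rather than kept sorted.

```python
class ToyRegion:
    """Toy model of one region's write path (single family, no versions)."""

    def __init__(self, flush_threshold: int = 3):
        self.wal = []        # append-only log of every mutation
        self.memstore = {}   # recent data, held in memory
        self.hfiles = []     # flushed files, oldest first; each is sorted
        self.flush_threshold = flush_threshold

    def put(self, key: str, value: str) -> None:
        # Log first, so the mutation could be replayed after a crash.
        self.wal.append((key, value))
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Write the in-memory data out as one sorted, immutable file."""
        if not self.memstore:
            return
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}
```

After `put("b", "1")` and `put("a", "2")` on a `ToyRegion(flush_threshold=2)`, the region holds one file, `[("a", "2"), ("b", "1")]`, and an empty in-memory map, while the log still lists both mutations in arrival order.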
  13. Compactions and Flushes
      ▪ Flush: Taking Key-Values from the In-Memory map and creating an HFile
      ▪ Minor Compaction: Rewriting a subset of HFiles for a Region into one HFile
      ▪ Major Compaction: Rewriting all HFiles for a Region into one HFile
      ▪ Compactions balance improved query performance with cost of rewriting data
          – Compactions are good!
          – Must understand SLAs to properly tune compactions
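The difference between minor and major compaction is only in how many files are rewritten. A minimal sketch, assuming each file is a sorted list of `(key, value)` pairs and files are ordered oldest to newest; it ignores cell versions, timestamps, and delete markers, and the function names are hypothetical.

```python
def compact(hfiles):
    """Rewrite several sorted files into one, keeping the newest value per key."""
    merged = {}
    for hfile in hfiles:      # oldest first, so newer files overwrite older ones
        merged.update(hfile)
    return sorted(merged.items())


def minor_compact(hfiles, start, end):
    """Rewrite only hfiles[start:end] into one file; the rest are untouched."""
    return hfiles[:start] + [compact(hfiles[start:end])] + hfiles[end:]


def major_compact(hfiles):
    """Rewrite every file of the region into a single file."""
    return [compact(hfiles)]
```

Fewer files means fewer streams to merge on every read, which is the query-performance side of the trade-off; the cost is that every compacted key-value is written again.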
  14. Reads into HBase
      ▪ Merge-Sort over multiple streams of data
          – Memory
          – Disk (many files)
      ▪ hbase:meta is the definitive source of where to find Regions
      Diagram: RowKey, Region, hbase:meta, RegionServer, ZooKeeper
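The merge-sort read can be sketched with `heapq.merge` from the Python standard library: one sorted stream for memory, one per file, merged lazily, with the newest source winning when a key appears more than once. This is an illustration under simplifying assumptions (one value per key, no timestamps or deletes), and `scan` is a hypothetical name rather than an HBase API.

```python
import heapq


def scan(memstore, hfiles):
    """Yield (key, value) in key order from memory plus all files.

    memstore is a dict; hfiles are sorted lists of (key, value) pairs,
    ordered oldest to newest.
    """
    # Tag every entry with the age of its source: 0 is memory (newest),
    # higher numbers are progressively older files.
    streams = [[(k, 0, v) for k, v in sorted(memstore.items())]]
    for age, hfile in enumerate(reversed(hfiles), start=1):
        streams.append([(k, age, v) for k, v in hfile])

    last_key = object()  # sentinel that never equals a real key
    # For equal keys the entry with the smallest age sorts first, so the
    # newest value is seen first and older duplicates are skipped.
    for key, _age, value in heapq.merge(*streams):
        if key != last_key:
            yield key, value
            last_key = key
```

The more files a region has, the more streams take part in this merge, which is why compactions improve read performance.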
  15. Apache ZooKeeper™
      ▪ Distributed coordination is really hard
      ▪ Obvious use cases
          – Service Discovery
          – Cluster Membership
          – “Root Table”
      ▪ Non-obvious use cases
          – Assignment (sometimes)
          – Region Recovery
          – WAL Splitting
          – Cluster Replication
          – Distributed Procedures
          – HBase Snapshots
      Apache ZooKeeper is a trademark of the Apache Software Foundation
  16. Apache ZooKeeper™
      ▪ Discovery/Leader ZNodes
          – /hbase/rs/…
          – /hbase/master/…
          – /hbase/backup-masters/…
      ▪ Consensus
          – /hbase/splitWAL/…
          – /hbase/flush-table-proc/...
          – /hbase/table-lock/...
          – /hbase/region-in-transition/...
          – /hbase/recovering-regions/...
  17. Distributed Procedures
      ▪ Resiliency in an unreliable system
          – How do we create a table?
      ▪ “Procedure V2”
          – Resilient, finite state machine
      ▪ HBase operations represented as “procedures”
      ▪ Clients are agnostic of Master state
          – Clients track procedure state
      https://issues.apache.org/jira/secure/attachment/12679960/ProcedureV2.pdf
  18. Distributed Procedures
      ▪ Procedures are durable via Write-Ahead Log
          – /hbase/MasterProcWALs/…
      ▪ Procedures only executed by the active HBase Master
      ▪ Reusable framework for the future
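The core idea of a resilient finite state machine backed by a write-ahead log can be shown in a few lines: record each completed step durably, and on restart resume after the last recorded step. The sketch below is illustrative only. The step names, `ProcedureStore`, and `run_procedure` are hypothetical and are not the Procedure V2 API; the "log" is an in-memory list of JSON lines standing in for durable storage, and `fail_at` simulates a Master crash.

```python
import json

# Hypothetical steps for creating a table; not HBase's actual state names.
CREATE_TABLE_STEPS = ["WRITE_FS_LAYOUT", "ADD_TO_META", "ASSIGN_REGIONS", "DONE"]


class ProcedureStore:
    """Stand-in for a durable procedure log."""

    def __init__(self):
        self.log = []

    def append(self, proc_id, state):
        self.log.append(json.dumps({"id": proc_id, "state": state}))

    def last_state(self, proc_id):
        """Replay the log and return the last completed step, or None."""
        state = None
        for line in self.log:
            entry = json.loads(line)
            if entry["id"] == proc_id:
                state = entry["state"]
        return state


def run_procedure(store, proc_id, steps, fail_at=None):
    """Run the remaining steps of a procedure, logging each as it completes."""
    done = store.last_state(proc_id)
    start = steps.index(done) + 1 if done is not None else 0
    executed = []
    for step in steps[start:]:
        if step == fail_at:
            raise RuntimeError(f"simulated crash before {step}")
        executed.append(step)          # the step's real work would happen here
        store.append(proc_id, step)    # record completion before moving on
    return executed
```

If a first run crashes before `ASSIGN_REGIONS`, a second call with the same store and procedure ID executes only `ASSIGN_REGIONS` and `DONE`. A real system also needs each step to be safe to retry, since a crash can land between doing the work and logging it.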
  19. HBase RPCs
      Overview
      ▪ Internal and External HBase Communication
      ▪ Half-Sync/Half-Async Model
      ▪ Many knobs to tweak
      Components
      ▪ Listener
      ▪ Readers
      ▪ Scheduler
      ▪ Call Queues
      ▪ Call Runners/Handlers
  20. HBase RPCs: Request to Execution
      Diagram: Listener -> Readers (four shown) -> Scheduler -> Call Queues (Priority, Read, Write, Replication) -> Handlers
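The back half of that pipeline (scheduler, call queues, handlers) can be mimicked with standard-library queues and threads. This is a toy sketch, not HBase's RPC server: there is no Listener or Reader stage because no sockets are involved, the class and method names are hypothetical, and the "work" a handler does is simply upper-casing the payload.

```python
import queue
import threading


class ToyRpcServer:
    """Scheduler -> call queues -> handler threads, with no networking."""

    QUEUE_NAMES = ("priority", "read", "write", "replication")

    def __init__(self, handlers_per_queue: int = 2):
        self.handlers_per_queue = handlers_per_queue
        self.queues = {name: queue.Queue() for name in self.QUEUE_NAMES}
        self.results = queue.Queue()
        self.threads = []
        for name, call_queue in self.queues.items():
            for _ in range(handlers_per_queue):
                thread = threading.Thread(
                    target=self._handle, args=(name, call_queue), daemon=True)
                thread.start()
                self.threads.append(thread)

    def dispatch(self, kind: str, payload: str) -> None:
        """Scheduler step: place an already-decoded call on its queue."""
        self.queues[kind].put(payload)

    def _handle(self, name: str, call_queue: queue.Queue) -> None:
        """Handler loop: take calls off one queue until told to stop."""
        while True:
            payload = call_queue.get()
            if payload is None:        # shutdown sentinel
                break
            self.results.put((name, payload.upper()))

    def shutdown(self) -> None:
        """Send one sentinel per handler, then wait for all handlers to exit."""
        for call_queue in self.queues.values():
            for _ in range(self.handlers_per_queue):
                call_queue.put(None)
        for thread in self.threads:
            thread.join()
```

Separate queues mean a flood of one kind of call (say, writes) does not starve handlers reserved for another kind, which is one reason the number of queues and handlers are among the "knobs to tweak".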
  21. Disaster Recovery
      ▪ Multiple tools to ensure copies of data in the face of catastrophic failure
      ▪ CopyTable
          – MapReduce job which reads all data from a source, writing to destination
      ▪ Snapshots
          – A collection of Regions, their HFiles, and metadata
      ▪ Backup & Restore
          – HBASE-7912, currently targeted for HBase-2.0.0
          – Incremental and full backup/restore
  22. Kerberos
      ▪ Strong authentication for untrusted networks
      ▪ “Standard” across Apache Hadoop and friends
      ▪ Requirements:
          – Forward/Reverse DNS
          – Unlimited Strength Java Cryptography Extension
      ▪ SASL used to build RPC systems
      ▪ “Practical Kerberos with Apache HBase” https://goo.gl/y0d9ZO
  23. Finding a Hypothesis
      ▪ Logs logs logs
      ▪ Application and System
      ▪ Metrics exposed by JMX
      ▪ Graphing solutions
          – Ambari Metrics Server + Grafana
  24. Thank You
      jelser@hortonworks.com / elserj@apache.org