HBaseCon 2013: Apache HBase Operations at Pinterest

4,841
-1

Published on

Presented by: Jeremy Carroll, Pinterest

Published in: Technology

HBaseCon 2013: Apache HBase Operations at Pinterest

  1. 1. Operations Jeremy Carroll Operations Engineer HBaseCon 2013
  2. 2. We help people discover things they love and inspire them to do those things…
  3. 3. HBase in Production Overview • All running on Amazon Web Services • 5 production clusters and growing • Mix SSD and SATA clusters • Billions of page views per month
  4. 4. With lots of patches Designing for EC2 • CDH 4.2.x • HDFS-3912 • HBase 0.94.7 • HBASE-8284 • One zone per cluster / no rack locality • RegionServers - Ephemeral disk only • Redundant clusters for availability • HDFS-4721 • HDFS-3703 • HDFS-9503 • HBASE-8389• HBASE-8434 • HBASE-7878
  5. 5. Configuration Cluster Setup • Managed splitting w/pre split tables • Bloom filters for pretty much everything • Manual / Rolling major compactions • Reverse DNS on EC2 • 3 ZooKeepers in quorum • 1 NameNode / Sec-NameNode / Master • 1 EBS volume for fsImage / 1 Elastic IP • 10-50 nodes per cluster
  6. 6. Fact-driven “Fry” method using Puppet Provisioning • User-data passed in to drive config management • Repackaged modifications to HDFS / HBase • Ubuntu .deb packages created with FPM • Synced to S3, nodes configured with s3-apt plugin • Mount + format ephemerals on boot • Ext4 / nodiratime / nodealloc / lazy_itable_init
  7. 7. ---- HBASE MODULE ---- class { 'hbase': cluster => 'feeds_e', namenode => 'ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com', zookeeper_quorum => 'zk1,zk2,zk3', hbase_site_opts => { 'hbase.replication' => true, 'hbase.snapshot.enabled' => true, 'hbase.snapshot.region.timeout' => '35000', 'replication.sink.client.ops.timeout' => '20000', 'replication.sink.client.retries.number' => '3', 'replication.source.size.capacity' => '4194304', 'replication.source.nb.capacity' => '100', ... } } ---- FACT BASED VARIABLES ---- $hbase_heap_size = $ec2_instance_type ? { 'hi1.4xlarge' => '24000', 'm2.2xlarge' => '24000', 'm2.xlarge' => '11480', 'm1.xlarge' => '11480', 'm1.large' => '6500', ... } Puppet Module
  8. 8. Designed for EC2 Service Monitoring • Wounded (dying) vs Operational • High value metrics first • Overall health • Alive / dead nodes • Service up/down • Fsck / Blocks / % Space • Replication status • Regions needing splits • fsImage checkpoint • Zookeeper quorum • Synthetic transactions (get / put) • Queues (flush / compaction / rpc) • Latency (client / filesystem)
  9. 9. Designed for EC2 Service Monitoring
  10. 10. Instrumentation Metrics • OpenTSDB for high cardinality metrics • Per region stats collection • tCollector • RegionServer HTTP JMX • HBase REST • GangliaContext for hadoop-metrics
  11. 11. OpenTSDB Table RegionServer Region Slicing and Dicing
  12. 12. Using R Tables Regions StoreFiles
  13. 13. Tuning Performance Compaction +Logs
  14. 14. Operational Intelligence Dashboards
  15. 15. S3 + HBase Snapshots Backups • Full NameNode backup every 60 mins • EBS Volume as an name.dir for crash recovery • HBase snapshots + ExportSnapShot
  16. 16. Additional Tuning Solid State Clusters • Lower block size down from 32k • Something a lot smaller. 8-16k • Placement groups for 10Gb networking • Increase DFSBandwidthPerSec • Kernel tuning for TCP • Compaction threads • Disk elevator to noop
  17. 17. Process Planning for Launch • Pyres queue asynchronous reads / writes • Allows for tuning a system before it goes live • Tuning • Schema • Hot spots • Compaction • Canary roll out to new users • 10% -> 30% -> 80% -> 100%
  18. 18. Pssstt. We’re hiring

×