2. We help people discover things they love
and inspire them to do those things…
3. HBase in Production
Overview
• All running on Amazon Web Services
• 5 production clusters and growing
• Mix SSD and SATA clusters
• Billions of page views per month
4. With lots of patches
Designing for EC2
• CDH 4.2.x
• HDFS-3912
• HBase 0.94.7
• HBASE-8284
• One zone per cluster / no rack locality
• RegionServers - Ephemeral disk only
• Redundant clusters for availability
• HDFS-4721 • HDFS-3703 • HDFS-9503
• HBASE-8389 • HBASE-8434 • HBASE-7878
5. Configuration
Cluster Setup
• Managed splitting w/ pre-split tables
• Bloom filters for pretty much everything
• Manual / Rolling major compactions
• Reverse DNS on EC2
• 3 ZooKeepers in quorum
• 1 NameNode / Sec-NameNode / Master
• 1 EBS volume for fsImage / 1 Elastic IP
• 10-50 nodes per cluster
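Managed splitting with pre-split tables means choosing the region boundaries up front. The following is a minimal sketch, assuming row keys begin with a fixed-width lowercase hex hash prefix (an assumption; the slides do not describe the key schema). The function name `hex_split_points` is hypothetical.

```python
def hex_split_points(num_regions, width=4):
    """Return num_regions - 1 evenly spaced hex split keys.

    Assumes row keys start with a `width`-character lowercase hex prefix.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    keyspace = 16 ** width
    if num_regions > keyspace:
        raise ValueError("more regions than distinct prefixes")
    # Region i covers [i * keyspace / n, (i + 1) * keyspace / n).
    return [format(i * keyspace // num_regions, "0{}x".format(width))
            for i in range(1, num_regions)]
```

The resulting list can be supplied as the SPLITS argument when creating the table in the HBase shell. Evenly spaced boundaries only balance load if the key prefix is itself uniformly distributed.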
6. Fact-driven “Fry” method using Puppet
Provisioning
• User-data passed in to drive config management
• Repackaged modifications to HDFS / HBase
• Ubuntu .deb packages created with FPM
• Synced to S3, nodes configured with s3-apt plugin
• Mount + format ephemerals on boot
• Ext4 / nodiratime / nodelalloc / lazy_itable_init
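The boot-time format and mount step could look roughly like the sketch below. It only builds the command lines; the device name, mount point, and function name are hypothetical, and it uses the ext4 option name `nodelalloc` (disable delayed allocation). How the original provisioning scripts invoked these tools is not shown in the slides.

```python
def ephemeral_disk_commands(device, mount_point):
    """Build the mkfs/mount command lines for one ephemeral disk."""
    return [
        # lazy_itable_init defers inode-table zeroing so mkfs returns quickly.
        ["mkfs.ext4", "-E", "lazy_itable_init=1", device],
        ["mkdir", "-p", mount_point],
        # nodiratime skips directory access-time updates;
        # nodelalloc disables delayed allocation.
        ["mount", "-t", "ext4", "-o", "nodiratime,nodelalloc",
         device, mount_point],
    ]
```

Each list can be handed to `subprocess.check_call` on the node; keeping command construction separate from execution makes the step easy to inspect before it formats a disk.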
8. Designed for EC2
Service Monitoring
• Wounded (dying) vs Operational
• High value metrics first
• Overall health
• Alive / dead nodes
• Service up/down
• Fsck / Blocks / % Space
• Replication status
• Regions needing splits
• fsImage checkpoint
• Zookeeper quorum
• Synthetic transactions (get / put)
• Queues (flush / compaction / rpc)
• Latency (client / filesystem)
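A synthetic transaction check that separates "wounded" from "operational" might be sketched as follows. The `put` and `get` callables are hypothetical wrappers around whatever HBase client is in use, and the 250 ms threshold and the state names returned are assumptions for illustration, not values from the slides.

```python
import time


def synthetic_check(put, get, row="canary-row", slow_ms=250.0,
                    clock=time.monotonic):
    """Run one put/get round trip and classify the cluster's state.

    `put(row, value)` and `get(row)` are caller-supplied callables.
    Returns (state, elapsed_ms); elapsed_ms is None if a call raised.
    """
    value = str(clock())
    start = clock()
    try:
        put(row, value)
        ok = get(row) == value
    except Exception:
        return "down", None
    elapsed_ms = (clock() - start) * 1000.0
    if not ok:
        return "down", elapsed_ms
    # Answering, but slowly: report as wounded rather than healthy.
    state = "wounded" if elapsed_ms > slow_ms else "operational"
    return state, elapsed_ms
```

Injecting `clock` keeps the check deterministic under test; in production the default monotonic clock is enough.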
15. S3 + HBase Snapshots
Backups
• Full NameNode backup every 60 mins
• EBS volume as a name.dir for crash recovery
• HBase snapshots + ExportSnapshot
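Exporting a snapshot off the cluster is done with HBase's ExportSnapshot tool. The sketch below only assembles the command line; the helper name, the mapper count, and any destination URL passed in are hypothetical, and the slides do not say which S3 filesystem scheme or options were used.

```python
def export_snapshot_command(snapshot, dest_url, mappers=4):
    """Build the argv for HBase's ExportSnapshot MapReduce tool."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,      # name of an existing snapshot
        "-copy-to", dest_url,       # destination filesystem URL
        "-mappers", str(mappers),   # parallel copy tasks
    ]
```

For example, `export_snapshot_command("users-20130601", "s3n://example-bucket/hbase")` yields a list suitable for `subprocess.check_call`; both the snapshot name and the bucket are made up.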
16. Additional Tuning
Solid State Clusters
• Lower the block size from 32k
• Something a lot smaller: 8-16k
• Placement groups for 10Gb networking
• Increase DFSBandwidthPerSec
• Kernel tuning for TCP
• Compaction threads
• Disk elevator to noop
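Setting the disk elevator to noop is done through sysfs on Linux. A minimal sketch, assuming the usual `/sys/block/<device>/queue/scheduler` layout, where the active scheduler is shown in square brackets; the function names are hypothetical, and newer multi-queue kernels expose `none` rather than `noop`.

```python
import re


def active_scheduler(sysfs_text):
    """Return the scheduler shown in brackets, e.g. 'noop deadline [cfq]'."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def ensure_scheduler(device, wanted="noop", sysfs_root="/sys/block"):
    """Switch `device` to `wanted` if it is not already active (needs root).

    Returns the scheduler that was active before the call.
    """
    path = "{}/{}/queue/scheduler".format(sysfs_root, device)
    with open(path) as handle:
        current = active_scheduler(handle.read())
    if current != wanted:
        with open(path, "w") as handle:
            handle.write(wanted)
    return current
```

The change does not survive a reboot, so it belongs in the same boot-time provisioning step that mounts the disks.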
17. Process
Planning for Launch
• Pyres queue asynchronous reads / writes
• Allows for tuning a system before it goes live
• Tuning
• Schema
• Hot spots
• Compaction
• Canary roll out to new users
• 10% -> 30% -> 80% -> 100%
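The staged canary percentages can be implemented with stable hash bucketing, so that a user admitted at 10% stays admitted at 30% and beyond. This is a sketch of one common approach, not necessarily the mechanism used for this launch; the salt and function name are hypothetical.

```python
import hashlib

ROLLOUT_STAGES = (10, 30, 80, 100)  # percentages from the slide


def in_rollout(user_id, percent, salt="hbase-launch"):
    """Return True if this user falls inside the first `percent` buckets."""
    key = "{}:{}".format(salt, user_id).encode("utf-8")
    # A stable hash gives every user the same bucket in [0, 100) on every call.
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return bucket < percent
```

Because the bucket depends only on the salt and the user id, raising the percentage only ever adds users. Changing the salt reshuffles who is in the canary group.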