Machines
•Industry Standard
•No RAID controller (JBOD on the slaves)
•Homogeneous environment is not necessary
•Cores, Spindles, and RAM
•Different configurations for different uses
Network
•Leverage the existing infrastructure
•No fancy equipment, no Infiniband
•Redundancy is key, no SPOF
•TOR vs Core
•1Gb, 10Gb, and 40Gb
•Bonding, VIPs, other such complexities
Distributed Database (HBase)
•Distributed Hash table
•Get, put, delete, scan, and CAS (compare-and-swap)
•Denormalization is necessary
•Not a parallel database, just distributed
•Write-ahead log / data durability
•Master/slave replication
•ACID compliance
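To make the operation list above concrete, here is a minimal sketch of the five operations on a single sorted key-value table. It is a toy in-memory model, not the HBase client API: the class name ToyTable, its method names, and the row-key layout are all assumptions made for illustration.

```python
import bisect


class ToyTable:
    """In-memory stand-in for one sorted key-value table (not the HBase client API)."""

    def __init__(self):
        self._keys = []  # row keys, kept sorted so range scans are cheap
        self._rows = {}  # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)

    def delete(self, key):
        if key in self._rows:
            del self._rows[key]
            self._keys.remove(key)

    def scan(self, start, stop):
        """Return (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]

    def check_and_put(self, key, expected, value):
        """Compare-and-swap: write only if the current value equals expected."""
        if self._rows.get(key) != expected:
            return False
        self.put(key, value)
        return True
```

With a denormalized key such as "user1#order001" (a hypothetical layout), all of one user's orders sit next to each other, so table.scan("user1#", "user1$") returns them in one range read without a join.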
Automation
How fast can you:
•Change an OS configuration on 100 machines?
•Kill one process on said machines?
•Reboot all your machines?
•Reboot all your machines one by one, with some added configuration changes?
•Add 10 new fully configured nodes?
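As one way to think about the "one by one" question above, here is a rough sketch of a rolling rollout driven over ssh. The function names, host names, and command strings are hypothetical, and a real deployment would more likely use a configuration management tool; the point is only that the order and the stop-on-failure rule are written down once and reused.

```python
import subprocess


def rolling_plan(hosts, commands):
    """Return (host, command) pairs in the order a one-by-one rollout runs them."""
    return [(host, cmd) for host in hosts for cmd in commands]


def run_plan(plan, runner=subprocess.run):
    """Run each step over ssh; stop at the first failure so a bad change does not spread."""
    done = []
    for host, cmd in plan:
        result = runner(["ssh", host, cmd])
        if result.returncode != 0:
            break
        done.append((host, cmd))
    return done
```

Passing a different runner makes it possible to rehearse the plan without touching any machine.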
Backup - Offline
If you can afford to take your cluster offline for up to an hour:
1.Shut down HBase
2.distcp to another cluster/separate folder
3.Restart HBase
* You can run a first distcp before shutting down; if you do, make sure you run distcp -update -delete for the second step.
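The offline procedure can be sketched as an ordered command list. This is a minimal sketch under assumptions: the function name and the HDFS paths are hypothetical, stop-hbase.sh and start-hbase.sh are assumed to be on the PATH of the machine running the steps, and the optional pre-copy also uses -update so that both copies lay files out the same way in the destination.

```python
def offline_backup_commands(src, dst, precopy=True):
    """Build the commands for an offline HBase backup, in execution order."""
    steps = []
    if precopy:
        # Bulk copy while HBase is still serving, to shorten the downtime window.
        steps.append(["hadoop", "distcp", "-update", src, dst])
    steps.append(["stop-hbase.sh"])
    # Final sync: copy what changed and delete files that no longer exist in src.
    steps.append(["hadoop", "distcp", "-update", "-delete", src, dst])
    steps.append(["start-hbase.sh"])
    return steps
```

For example, offline_backup_commands("hdfs://nn1/hbase", "hdfs://nn2/backup/hbase") yields four commands that can be reviewed before anything is run.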
Backup - Replication
1.Create another HBase cluster (can be remote)
2.Alter the families that need replication
3.Make sure the same tables exist on the slave cluster
* Replication isn't done inline with the inserts in the master cluster
* See "Apache HBase Replication" with Chris Trezzo at 5:20PM
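Step 2 of the replication setup can be sketched as a small generator of HBase shell statements. The function name and the table and family names are hypothetical; the sketch assumes the table is disabled around the alter, which older HBase versions require, and it does not cover registering the slave cluster as a peer or creating the tables there.

```python
def replication_shell_script(table, families):
    """Return HBase shell statements that turn on replication for the given families."""
    lines = [f"disable '{table}'"]
    for cf in families:
        # REPLICATION_SCOPE => 1 marks the column family for replication.
        lines.append(f"alter '{table}', {{NAME => '{cf}', REPLICATION_SCOPE => 1}}")
    lines.append(f"enable '{table}'")
    return "\n".join(lines)
```

The returned text can be pasted into the HBase shell on the master cluster after review.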
Backup - Snapshot
•Doesn't require copying data
•Runs in less than 60 seconds
•Minimal impact on performance
* See the slides from "Apache HBase Table Snapshots" with Jonathan Hsieh & pals