Hortonworks HBase Meetup Presentation


Published in: Technology

  1. HBase Dev Meetup
     09/11/2012
     Enis Soztutar
     enis [@] apache [dot] org
     #enissoz
     © Hortonworks Inc. 2011
  2. Integration Testing
     • HBase integration tests plan (HBASE-6201)
       1. Create the hbase-it module.
       2. Add the ability to run integration tests on a given cluster:
          – mvn verify
          – bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver
       3. Port candidate tests.
       4. Add more long-running and fault-injection tests.
       5. (Most of the) integration tests should be able to run over a mini cluster or a 100+ node cluster.
       6. Integration tests should run in mini mode as nightlies (Apache Jenkins).
       7. Our plan is to run the whole suite of integration tests nightly/weekly on medium-sized clusters. Every organization should be able to do this easily.
     Architecting the Future of Big Data
  3. Current Status
     • hbase-it module committed (HBASE-6203)
     • HBaseCluster interface for interacting with the cluster from system tests (HBASE-6241)
       – Patch close to being committed
     • Subtask JIRAs for porting each test or adding a new test
     • HBase book documentation available in the patch for HBASE-6302
     • Candidate tests to be converted:
       – LoadTestTool
       – BigTop TestLoadAndVerify
       – Goraci (https://github.com/keith-turner/goraci)
       – TestAcidGuarantees / TestAtomicOperation
       – TestRegionBalancing (HBASE-6053)
       – TestFullLogReconstruction
       – TestMultiVersions / TestKeepDeletes
       – TestFromClientSide
       – TestShell and src/test/ruby
       – TestRollingRestart
       – TestMasterFailover
       – TestImportExport
       – Test**OnCluster
       – Balancer tests
  4. HBASE-6241 Overview
     • New JUnit @Category(IntegrationTests.class)
     • Introduces HBaseCluster
       – public abstract ClusterStatus getClusterStatus();
       – public abstract void startRegionServer(String hostname);
       – Start/stop/kill region server (RS) / Master
       – MiniHBaseCluster and DistributedHBaseCluster implement HBaseCluster
     • Introduces DistributedHBaseCluster
       – Keeps state about a distributed cluster
       – Does not know about the environment; uses ClusterManager to interact with the cluster
     • ClusterManager is an API to manage services
       – public abstract void start(ServiceType service, String hostname)
       – Pluggable, with specific implementations for every environment
     • HBaseClusterManager extends ClusterManager
       – Knows about the environment (HBASE_HOME, HBASE_CONF_DIR, HBASE_SSH_OPTS)
       – Uses SSH to remotely execute ps, kill -9, and bin/hbase-daemon.sh start/stop
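The split between an environment-agnostic cluster object and a pluggable, environment-specific manager can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HBASE-6241 code: only the `start(ServiceType, String)` and `startRegionServer(String)` signatures come from the slide, while `stop`, `RecordingClusterManager`, `DistributedClusterSketch`, the enum values, and the host names are hypothetical. The stand-in manager records commands instead of running them over SSH.

```java
import java.util.ArrayList;
import java.util.List;

public class ClusterManagerSketch {

    /** Illustrative subset of service kinds; the names are assumptions. */
    public enum ServiceType { HBASE_MASTER, HBASE_REGIONSERVER }

    /** Environment-specific layer: knows how to start or stop a service on a host. */
    public abstract static class ClusterManager {
        public abstract void start(ServiceType service, String hostname);

        // stop() is an assumed counterpart; the slide only shows start().
        public abstract void stop(ServiceType service, String hostname);
    }

    /** Hypothetical manager that records commands instead of running them over SSH. */
    public static class RecordingClusterManager extends ClusterManager {
        private final List<String> commands = new ArrayList<>();

        @Override
        public void start(ServiceType service, String hostname) {
            commands.add("start " + service + " on " + hostname);
        }

        @Override
        public void stop(ServiceType service, String hostname) {
            commands.add("stop " + service + " on " + hostname);
        }

        public List<String> commands() {
            return commands;
        }
    }

    /** Environment-agnostic layer: tracks cluster state and delegates to the manager. */
    public static class DistributedClusterSketch {
        private final ClusterManager manager;
        private final List<String> liveRegionServers = new ArrayList<>();

        public DistributedClusterSketch(ClusterManager manager) {
            this.manager = manager;
        }

        public void startRegionServer(String hostname) {
            manager.start(ServiceType.HBASE_REGIONSERVER, hostname);
            liveRegionServers.add(hostname);
        }

        public void stopRegionServer(String hostname) {
            manager.stop(ServiceType.HBASE_REGIONSERVER, hostname);
            liveRegionServers.remove(hostname);
        }

        public List<String> liveRegionServers() {
            return liveRegionServers;
        }
    }

    public static void main(String[] args) {
        RecordingClusterManager manager = new RecordingClusterManager();
        DistributedClusterSketch cluster = new DistributedClusterSketch(manager);
        cluster.startRegionServer("rs1.example.com");
        cluster.stopRegionServer("rs1.example.com");
        manager.commands().forEach(System.out::println);
    }
}
```

Because the cluster object only talks to the abstract manager, an environment without SSH could presumably supply its own manager without touching the tests.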
  5. HBASE-6241 Overview (cont.)
     • Introduces HBaseIntegrationTestingUtility, which extends HBaseTestingUtility
       – Integration tests use this
     • Introduces 2 ways to run integration tests
       – bin/hbase o.a.h.h.IntegrationTestsDriver -config <hbase_conf_dir>
       – mvn verify (runs tests using a mini cluster)
       – mvn verify -Dhbase.test.cluster.distributed=true -Dhbase.conf.dir=<hbase_conf_dir> (further patch)
     • Introduces ChaosMonkey
       – A utility to inject faults into a running cluster
       – Does not know about the environment; uses HBaseIntegrationTestingUtility
       – Actions (RestartActiveMaster, RestartRandomRs, RestartRsHoldingMeta, BatchRestartRs, RollingBatchRestartRs, etc.)
       – Policies (PeriodicRandomActionPolicy), predefined named policies (EVERY_MINUTE_RANDOM_ACTION_POLICY)
     • Introduces IntegrationTestDataIngestWithChaosMonkey
       – Runs LoadTestTool and ChaosMonkey for a configurable running time
       – Adjusts data load based on the cluster size (also works with a mini cluster)
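The action/policy split described for ChaosMonkey could look roughly like the sketch below: a policy picks one action at random per period and performs it. This is an illustration only, not the ChaosMonkey implementation. The `Action` interface, `LoggingAction`, the `run(int rounds)` method, and the seeded `Random` are assumptions made for the example; only the action and policy names echo the slide, and the actions here log their names rather than restart anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class ChaosMonkeySketch {

    /** A single fault to inject, for example restarting a region server. */
    public interface Action {
        void perform();
    }

    /** Hypothetical action that only logs its name instead of touching a cluster. */
    public static class LoggingAction implements Action {
        private final String name;
        private final List<String> log;

        public LoggingAction(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void perform() {
            log.add(name);
        }
    }

    /** Picks one action at random each period, for a fixed number of rounds. */
    public static class PeriodicRandomActionPolicySketch {
        private final long periodMillis;
        private final List<Action> actions;
        private final Random random;

        public PeriodicRandomActionPolicySketch(long periodMillis, List<Action> actions, Random random) {
            this.periodMillis = periodMillis;
            this.actions = actions;
            this.random = random;
        }

        public void run(int rounds) {
            for (int i = 0; i < rounds; i++) {
                actions.get(random.nextInt(actions.size())).perform();
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop injecting faults.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Action> actions = Arrays.asList(
                new LoggingAction("RestartActiveMaster", log),
                new LoggingAction("RestartRandomRs", log),
                new LoggingAction("RestartRsHoldingMeta", log));
        // A short period keeps the demo fast; a real policy would wait much longer.
        new PeriodicRandomActionPolicySketch(10, actions, new Random(42)).run(5);
        log.forEach(System.out::println);
    }
}
```

Keeping actions free of scheduling logic means new faults and new schedules can be combined without changing each other.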
  6. Open Issues
     • 0.94 porting?
     • Hadoop NN HA testing
     • BIGTOP-635
     • Community interest
     • Different environments (no SSH, etc.)
     • ???
  7. HBase on Windows
     • WHY?
       – Because we can :)
       – More platforms = more users + more development
     • WHO?
       – MS + HW
     • WHEN?
       – Patches will start rolling shortly
     • WHERE?
       – Apache HBase trunk
       – 0.94? 0.92?
     • WHAT?
       – Hadoop on Windows (HADOOP-8079)
       – Test fixes, file path name related fixes, timing issues, network related changes, build changes, bin scripts, etc.
  8. Thanks
     Go bananas!
     Taken from http://99designs.com/t-shirt-design/contests/design-chaos-monkey-t-shirt-70909/entries?entriespage=1#contest-header