006 Performance Tuning & Cluster Administration

Transcript

  • 1. PERFORMANCE TUNING & CLUSTER ADMINISTRATION
      2012/8/2, Scott Miao
  • 2. AGENDA
      - Course Credit
      - Performance Tuning
        - More…
      - Cluster Administration
        - More…
  • 3. COURSE CREDIT
      - Show up: 30 points
      - Ask a question: each question earns 5 points
      - Hands-on: 40 points
      - 70 points will pass this course
      - Course credit is calculated once for each course finished
      - The course credit will be sent to you and your supervisor by mail
  • 4. PERFORMANCE TUNING
      - Garbage Collection Tuning
      - MSLAB
      - Compression
      - Optimizing Splits and Compactions
      - Load Balancing
      - Merging Regions
      - Client API: Best Practices
      - Configuration
      - Load Tests
  • 5. GARBAGE COLLECTION TUNING
      - The process of rewriting the heap generation in question is called a garbage collection (GC)
      - GC parameters only need to be added to the region servers
      - The JRE comes with basic assumptions
        - Regarding what your programs are doing, how they create objects, how they allocate the heap to handle data, and so on
      - These assumptions work well in a lot of cases
      - But they do NOT work well for HBase workloads…
        - Especially write-heavy ones
        - You cannot safely rely on the JRE assumptions alone
  • 6. https://service.ithome.com.tw/20120720Java/index3.html#3
  • 7. (figure only; no text on this slide)
  • 8. GARBAGE COLLECTION TUNING – WRITE-HEAVY USE CASES (1/2)
      - The memstore flushes data at the configured minimum flush size, hbase.hregion.memstore.flush.size
      - This leaves holes of different sizes in the heap
        - Data resides in different locations in the generational architecture of the Java heap, depending on how long the data was in memory
      - Young generation (new generation)
        - The space can be reclaimed quickly and no harm is done
      - Old generation (tenured generation)
        - Data is promoted to this location if it stays in memory for a longer period of time
  • 9. GARBAGE COLLECTION TUNING – WRITE-HEAVY USE CASES (2/2)
      - The JVM reuses the holes created by data that has been written to disk
        - When a request needs a block of heap that does not fit into one of those holes, the JVM needs to compact the fragmented heap
      - Young to Old
        - The promotion of longer-living objects from the young to the old generation
      - Old to Stop-The-World
        - When, because of fragmentation, there is no longer enough space for a young-generation promotion, the JVM falls back to the stop-the-world garbage collector
        - It rewrites the entire heap space and compacts it down to the remaining active objects
      - If this fails, you will see a promotion failure in your garbage collection logs
  • 10. What does the heap look like? (figure)
  • 11. GARBAGE COLLECTION TUNING – SPECIFY THE YOUNG GENERATION SIZE
      - The young generation is between 128 MB and 512 MB
        - The old generation holds the remaining available heap, which is usually many gigabytes of memory
      - Using 128 MB is a good starting point
        - Further observation of the JVM metrics should be conducted
      - Specify the young generation size like so:
        -XX:MaxNewSize=128m -XX:NewSize=128m
      - One convenient shorthand option:
        -Xmn128m
  • 12. GARBAGE COLLECTION TUNING – GC OPTIONS SETTING
      - GC option settings for HBase go in the hbase-env.sh configuration file
        - HBASE_OPTS variable for all HBase processes
        - HBASE_REGIONSERVER_OPTS variable for all region servers
      - Enable the JRE’s log output for garbage collection details:
        -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log
      - Monitor the log for occurrences of "concurrent mode failure" or "promotion failed" messages
  • 13. GARBAGE COLLECTION TUNING – GC STRATEGY FOR YOUNG GENERATION
      - Recommended value for the young generation: -XX:+UseParNewGC
        - Use the Parallel New Collector
      - It stops the entire Java process to clean up the young generation heap
        - Since the young generation is small in comparison, the pause is usually less than a few hundred milliseconds
  • 14. GARBAGE COLLECTION TUNING – GC STRATEGY FOR OLD GENERATION
      - Recommended value for the old generation: -XX:+UseConcMarkSweepGC
        - Use the Concurrent Mark-Sweep Collector (CMS)
      - It tries to do as much work concurrently as possible, without stopping the Java process
        - This takes extra effort and an increased CPU load
        - It avoids the required stops to rewrite a fragmented old generation heap
      - If you hit the promotion error, it falls back to stop-the-world again
  • 15. GARBAGE COLLECTION TUNING – GC STRATEGY FOR OLD GENERATION
      - A switch for CMS: -XX:CMSInitiatingOccupancyFraction=70
        - A percentage that specifies when the background process starts
      - It avoids the concurrent mode failure
        - The failure occurs when the background process to mark and sweep the heap for collection is still running as the heap runs out of usable space
        - In that case the JVM falls back to stop-the-world again
      - Why an initiating occupancy fraction of 70%?
        - 20% block cache + 40% memstore limits = 60%, by default
        - 70% starts the background process at an appropriate time: early enough, but not too early
  • 16. GARBAGE COLLECTION TUNING – SUMMARY
      - Recommended GC options:
        export HBASE_REGIONSERVER_OPTS="-Xmx8g -Xms8g -Xmn128m -XX:+UseParNewGC \
          -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc \
          -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
          -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
      - Alex Su’s GC options:
        -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
          -Xloggc:<%= hbase_log_path %>/hbase-regionserver-gc-`date +%F-%H-%M-%S`.log \
          -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled \
          -XX:CMSInitiatingOccupancyFraction=70 -XX:PrintFLSStatistics=1 \
          -XX:+HeapDumpOnOutOfMemoryError \
          -XX:HeapDumpPath=<%= hbase_log_path %>/hbase-regionserver.hprof
      - GC options reference: http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
  • 17. MSLAB – QUESTION
      - How do we solve the stop-the-world issue?
      - The key to reducing these compacting collections is to reduce fragmentation
        - If only objects of exactly the same size were allocated from the heap, subsequent allocations of new objects of the exact same size would always reuse the holes
        - Then there is no promotion error, and therefore no stop-the-world compacting collection is required
  • 18. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (1/3)
      - MSLABs are buffers of fixed sizes containing KeyValue instances of varying sizes
        1. When a buffer cannot completely fit a newly added KeyValue, it is considered full
        2. A new buffer is then created, once again of the given fixed size
      - Enabled by default in version 0.92 of HBase, disabled in version 0.90
        - hbase.hregion.memstore.mslab.enabled property
      - It is recommended that you test your setup with this feature
  • 19. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (2/3)
      - The size of each allocated, fixed-size buffer
        - hbase.hregion.memstore.mslab.chunksize property, default 2 MB
        - Based on your KeyValue instances, you may have to adjust this value
          - E.g., for cells around 100 KB in size, you need to increase the MSLAB size to fit more than just a few cells
      - An upper boundary for what is stored in the buffers
        - hbase.hregion.memstore.mslab.max.allocation property, default 256 KB
        - Any cell (KeyValue) that is larger will be allocated directly in the Java heap
  • 20. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (3/3)
      - MSLABs do not come without a cost
        - They are more wasteful in regard to heap usage: you will most likely not fill every buffer to the last byte
      - A trade-off
        - Use MSLABs and benefit from better garbage collection, but incur the extra space that is required
        - Do NOT use MSLABs and benefit from better memory efficiency, but deal with the problems caused by garbage collection pauses
          - You could plan to restart the servers every few days, or weeks, before the pause happens
      - The buffers require an additional byte array copy operation and are therefore slightly slower
        - Measure the impact on your workload
  • 21. COMPRESSION
      - A number of compression algorithms can be enabled at the column family level
      - Recommendation: enable compression unless you have a reason not to do so
        - For example, when storing already compressed content, such as JPEG images
      - Compression usually yields better overall performance
        - The CPU overhead of performing the compression/decompression is less than what is required to read more data from disk
  • 22. COMPRESSION – AVAILABLE CODECS
      - Recommended: Snappy/Zippy (used in Bigtable)
        - Released by Google under the BSD License
        - HBase 0.92 ships with the required JNI libraries to be able to use it
        - You must install the native binary library on all region servers
      - LZO (Lempel-Ziv-Oberhumer)
        - A lossless data compression algorithm that is focused on decompression speed, written in ANSI C
        - HBase cannot ship with LZO because of licensing issues: the GNU General Public License (GPL) is incompatible
        - The LZO installation needs to be performed separately, after HBase has been installed
      - http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
  • 23. COMPRESSION – COMPRESSION TEST TOOL
      - Use the command:
        hbase org.apache.hadoop.hbase.util.CompressionTest <path> <none|gz|lzo|snappy>
      - Example:
        ./bin/hbase org.apache.hadoop.hbase.util.CompressionTest /user/larsgeorge/test.gz gz
      - It returns a result based on the test
        - On success:
          …
          SUCCESS
        - On failure:
          Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec
          …
  • 24. COMPRESSION – STARTUP CHECK
      - A fast-failing setup notices missing libraries at startup, instead of running into issues later
      - For example, to check the Snappy and LZO compression libraries:
        <property>
          <name>hbase.regionserver.codecs</name>
          <value>snappy,lzo</value>
        </property>
      - The server will abort at startup with an IOException stating "Compression codec <codec-name> not supported, aborting RS construction"
      - Copy the changed configuration file to all region servers and restart them afterward
  • 25. COMPRESSION – ENABLING COMPRESSION
      - Install the JNI libraries and the native compression libraries
      - Specify the chosen algorithm in the column family schema
        - In the HBase shell:
          create 'testtable', { NAME => 'colfam1', COMPRESSION => 'GZ' }
        - In the API: HColumnDescriptor.setCompressionType(…), as in the sketch below
          - Refer to ppt#003, p#11
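      A minimal sketch of enabling compression through the Java API, assuming the
      0.92-era client classes the slides reference; the table and family names are
      placeholders for illustration:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.HColumnDescriptor;
        import org.apache.hadoop.hbase.HTableDescriptor;
        import org.apache.hadoop.hbase.client.HBaseAdmin;
        import org.apache.hadoop.hbase.io.hfile.Compression;

        public class CreateCompressedTable {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            // "testtable" and "colfam1" are placeholder names
            HTableDescriptor table = new HTableDescriptor("testtable");
            HColumnDescriptor colfam = new HColumnDescriptor("colfam1");
            // Enable GZIP compression for this column family
            colfam.setCompressionType(Compression.Algorithm.GZ);
            table.addFamily(colfam);
            admin.createTable(table);
          }
        }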
  • 26. OPTIMIZING SPLITS AND COMPACTIONS – SPLIT/COMPACTION STORMS
      - If your regions all grow at roughly the same rate, eventually they all need to be split at about the same time
      - This causes a large spike in disk I/O because of the required compactions to rewrite the split regions
      - Refer to ppt#004, p#13
  • 27. OPTIMIZING SPLITS AND COMPACTIONS – MANAGED SPLITTING (1/2)
      - You can turn automatic splitting off and manually invoke the split and major_compact commands
      - Setting the region maximum file size
        - hbase.hregion.max.filesize property for the entire cluster
        - At the table level via the API: HTableDescriptor.setMaxFileSize(…)
          - Refer to ppt#003, p#7
        - Set it to a very high number
          - Better to set this value to a reasonable upper boundary, such as 100 GB
          - Long.MAX_VALUE is not recommended, in case the manual splits fail to run
      - Then you can time-control the splits
        - Running them staggered across all regions spreads the I/O load as much as possible, avoiding any split/compaction storm
        - Use the HBase shell + cron, or write your own code against the HBase Admin API (see the sketch below)
          - Refer to ppt#003, p#21
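      A minimal sketch of such time-controlled splitting with the Admin API,
      assuming the 0.92-era HBaseAdmin methods; the table name and the sleep
      interval are placeholders:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HBaseAdmin;

        public class ManagedSplit {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            // Ask the servers to split the table's regions, then major-compact;
            // pausing in between staggers the I/O instead of creating a storm
            admin.split("testtable");
            Thread.sleep(60 * 1000L);        // give the split time to settle
            admin.majorCompact("testtable"); // rewrite the daughter regions
          }
        }

      Run from cron at a quiet time of day, and staggered per table or region, this
      spreads out the I/O that automated splits would otherwise cluster together.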
  • 28. OPTIMIZING SPLITS AND COMPACTIONS – MANAGED SPLITTING (2/2)
      - RegionSplitter class (added in version 0.90.2)
        - Another way to split existing regions
        - Rolling split feature: splits the existing regions while waiting long enough for the involved compactions to complete
        - See the API docs
      - An additional advantage: you have better control over which regions are available at any time
        - In rare cases, you need to do very low-level debugging
        - With automated splits, this is hard to debug, because a region may have been split into two daughter regions in the meantime
  • 29. OPTIMIZING SPLITS AND COMPACTIONS – REGION HOTSPOTTING
      - You may be dealing with a write pattern that is causing a specific region to run hot
        - Use the region server metrics to observe this; refer to ppt#005, p#12
      - Key design approaches
        - Salted keys, random keys, etc.; refer to ppt#004, p#52
      - The only other way to alleviate the situation
        - Manually split a hot region into one or more new regions, at exact boundaries
          - You can specify any row key within the specific region, so you are able to generate halves that are completely different in size
          - Refer to ppt#003, p#21
        - This cannot deal with completely sequential key ranges: those are always going to hit one region for a considerable amount of time
  • 30. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (1/3)
      - When managing splits manually, it is useful to start with a larger number of regions right from the table creation
      - That means creating a table with the required number of regions; three ways:
        - HBase shell create command; refer to ppt#003, p#37
        - API HBaseAdmin.createTable(…); refer to ppt#003, p#16 (see the sketch below)
        - RegionSplitter class:
          /bin/hbase org.apache.hadoop.hbase.util.RegionSplitter
          usage: RegionSplitter <TABLE>
          - By default, the MD5StringSplit class partitions the row keys into ranges
          - Use -D split.algorithm=<your-algorithm-class> for another implementation
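      A minimal sketch of presplitting through HBaseAdmin.createTable(…), assuming
      the 0.92-era API; the table name, family name, key range, and region count
      are placeholders:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.HColumnDescriptor;
        import org.apache.hadoop.hbase.HTableDescriptor;
        import org.apache.hadoop.hbase.client.HBaseAdmin;
        import org.apache.hadoop.hbase.util.Bytes;

        public class PresplitTable {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            HTableDescriptor table = new HTableDescriptor("testtable");
            table.addFamily(new HColumnDescriptor("colfam1"));
            // Create the table presplit into 10 regions, spread evenly
            // between the given start and end row keys
            admin.createTable(table, Bytes.toBytes("00000000"),
                Bytes.toBytes("ffffffff"), 10);
          }
        }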
  • 31. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (2/3)
      - RegionSplitter with MD5StringSplit, sample output:
        testtable,,1309766006467.c0937d09f1da31f2a6c2950537a61093.
        testtable,0ccccccc,1309766006467.83a0a6a949a6150c5680f39695450d8a.
        testtable,19999998,1309766006467.1eba79c27eb9d5c2f89c3571f0d87a92.
        testtable,26666664,1309766006467.7882cd50eb22652849491c08a6180258.
        testtable,33333330,1309766006467.cef2853e36bd250c1b9324bac03e4bc9.
        testtable,3ffffffc,1309766006467.00365940761359fee14d41db6a73ffc5.
  • 32. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (3/3)
      - How many presplit regions?
        - Start low, with 10 presplit regions per server, and watch as data grows over time
        - It is better to err on the side of too few regions and use a rolling split later
      - If the presplit regions are too thin
        - Increase the hbase.hregion.majorcompaction property; refer to ppt#004, p#19
      - If the data size grows too large
        - Use the RegionSplitter utility to perform a rolling split of all regions
      - The main objective is to avoid a split/compaction storm
  • 33. LOAD BALANCING – BALANCER (1/3)
      - The master has a built-in feature called the balancer
      - By default, it runs every five minutes
        - hbase.balancer.period property
      - It attempts to equal out the number of assigned regions per region server
        - To within one region of the average number per server
      - It determines a new assignment plan, describing which regions should be moved where
        - Then it starts the process of moving the regions by calling the unassign() method
        - Refer to ppt#003, p#22
  • 34. LOAD BALANCING – BALANCER (2/3)
      - The balancer has an upper limit on how long it is allowed to run
        - hbase.balancer.max.balancing property
        - Defaults to half of the balancer period value, i.e., 2.5 minutes
      - The balancer switch toggles the balancer between enabled and disabled
        - HBase shell balance_switch command; refer to ppt#003, p#39
        - balanceSwitch() API method; refer to ppt#003, p#22
  • 35. LOAD BALANCING – BALANCER (3/3)
      - The balancer can be explicitly started
        - HBase shell balancer command; refer to ppt#003, p#39
        - balancer() API method; refer to ppt#003, p#22
      - Returns true if any work has been done
      - Returns false when:
        - The balancer is switched off
        - There is no work to be done
        - The balancer was not able to run
          - If there is a region currently in transition, the balancer run will be skipped
  • 36. LOAD BALANCING – MOVE
      - You can also use move to assign regions to other servers
        - HBase shell move command; refer to ppt#003, p#39
        - move() API method; refer to ppt#003, p#22 (see the sketch below)
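      A minimal sketch of driving the balancer and move from the Java Admin API,
      assuming the 0.92-era methods; the encoded region name and destination
      server name are placeholders (the destination format is
      "hostname,port,startcode", and passing null lets the master pick a server):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HBaseAdmin;
        import org.apache.hadoop.hbase.util.Bytes;

        public class BalanceAndMove {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            // Disable the balancer while moving regions by hand
            boolean wasEnabled = admin.balanceSwitch(false);
            // Move one region to an explicit region server
            admin.move(Bytes.toBytes("c0937d09f1da31f2a6c2950537a61093"),
                Bytes.toBytes("host11.foo.com,60020,1309766006467"));
            // Restore the previous state and trigger a balancer run
            admin.balanceSwitch(wasEnabled);
            boolean ran = admin.balancer();
            System.out.println("Balancer ran: " + ran);
          }
        }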
  • 37. MERGING REGIONS
      - Sometimes you may need to merge regions
        - For example, after you have removed a large amount of data and you want to reduce the number of regions hosted by each server
      - HBase allows you to merge two adjacent regions with the Merge tool:
        /bin/hbase org.apache.hadoop.hbase.util.Merge
        Usage: bin/hbase merge <table-name> <region-1> <region-2>
      - The HBase cluster must be offline, but HDFS must still be running
  • 38. CLIENT API: BEST PRACTICES (1/3)
      - Disable auto-flush when performing a lot of put operations (see the write-path sketch below)
        - Refer to ppt#002, p#9
      - Use scanner caching
        - Set the Scan.setCaching() value to something greater than the default of 1 if needed
        - Refer to ppt#002, p#26
      - Limit the scan scope
        - If only a small number of the available columns are to be processed, only those should be specified in the input scan
        - For example, use the Scan.addFamily() method
        - Refer to ppt#002, p#24
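      A minimal write-path sketch combining these settings, assuming the 0.92-era
      client API; the table, family, and qualifier names, the buffer size, and the
      row count are placeholders:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Put;
        import org.apache.hadoop.hbase.util.Bytes;

        public class BatchedPuts {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "testtable");
            table.setAutoFlush(false);                  // buffer puts client-side
            table.setWriteBufferSize(4 * 1024 * 1024);  // 4 MB write buffer
            for (int i = 0; i < 100000; i++) {
              Put put = new Put(Bytes.toBytes(String.format("row-%08d", i)));
              put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual1"),
                  Bytes.toBytes("value-" + i));
              table.put(put); // sent in batches as the buffer fills
            }
            table.flushCommits(); // push out the remaining buffered puts
            table.close();
          }
        }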
  • 39. CLIENT API: BEST PRACTICES (2/3)
      - Close ResultScanners to avoid performance problems
        - Not closing them may cause problems on the region servers
        - Refer to ppt#002, p#25
      - Block cache usage
        - Scan instances can be set to use the block cache in the region server via the setCacheBlocks() method
        - It is true by default, so the default settings of the table and family are used; see the API docs
        - For the server-side block cache settings, refer to ppt#003, p#12
  • 40. CLIENT API: BEST PRACTICES (3/3)
      - Optimal loading of row keys
        - When performing a table scan where only the row keys are needed, use a FilterList with a MUST_PASS_ALL operator + FirstKeyOnlyFilter + KeyOnlyFilter (see the read-path sketch below)
        - Refer to ppt#002, p#43 & 46
      - Turn off the WAL on puts
        - Calling writeToWAL(false) increases throughput on puts, but there might be data loss
        - Consider using the bulk loading techniques instead
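      A minimal row-key-only scan sketch combining these practices, assuming the
      0.92-era client API; the table and family names and the caching value are
      placeholders:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.hbase.HBaseConfiguration;
        import org.apache.hadoop.hbase.client.HTable;
        import org.apache.hadoop.hbase.client.Result;
        import org.apache.hadoop.hbase.client.ResultScanner;
        import org.apache.hadoop.hbase.client.Scan;
        import org.apache.hadoop.hbase.filter.FilterList;
        import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
        import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
        import org.apache.hadoop.hbase.util.Bytes;

        public class RowKeyScan {
          public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "testtable");
            Scan scan = new Scan();
            scan.setCaching(500);                      // rows fetched per RPC
            scan.addFamily(Bytes.toBytes("colfam1"));  // limit the scan scope
            FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
            filters.addFilter(new FirstKeyOnlyFilter()); // one KeyValue per row
            filters.addFilter(new KeyOnlyFilter());      // drop the values
            scan.setFilter(filters);
            ResultScanner scanner = table.getScanner(scan);
            try {
              for (Result result : scanner) {
                System.out.println(Bytes.toString(result.getRow()));
              }
            } finally {
              scanner.close(); // always close scanners to free server resources
            }
            table.close();
          }
        }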
  • 41. CONFIGURATION (1/6)
      - Advanced options you can consider adjusting based on your use case
        - Most properties are configured in hbase-site.xml; others are in hbase-env.sh
      - Decrease the ZooKeeper timeout
        - The default timeout between a region server and the ZooKeeper quorum is three minutes
        - Tune the timeout down to a minute, or even less, so the master notices failures sooner
        - zookeeper.session.timeout property
        - Be careful of the "Juliet Pause"
  • 42. CONFIGURATION (2/6)
      - Increase handlers
        - The number of threads that are kept open to answer incoming requests to user tables; the default is 10
        - hbase.regionserver.handler.count property
        - Keep this number low when the payload per request approaches megabytes, and high when the payload is small
      - Increase heap settings
        - HBASE_HEAPSIZE setting in the hbase-env.sh file
        - Consider using HBASE_REGIONSERVER_OPTS instead of changing the global HBASE_HEAPSIZE
          - Region servers may need more memory than the master
  • 43. CONFIGURATION (3/6)
      - Enable data compression
        - You should enable compression for the storage files; in most cases it boosts performance
      - Increase the region size
        - Consider going to larger regions to cut down on the total number of regions on your cluster
        - Fewer regions to manage makes for a smoother-running cluster
  • 44. CONFIGURATION (4/6)
      - Adjust the block cache size
        - The amount of heap used for the block cache is specified as a percentage; it defaults to 20%
        - hfile.block.cache.size property
        - Increasing it is good if you have mainly reading workloads
      - Adjust the memstore limits
        - hbase.regionserver.global.memstore.upperLimit property, defaults to 40%
        - hbase.regionserver.global.memstore.lowerLimit property, defaults to 35%
          - Controls the amount of flushing that will take place once the server is required to free heap space
        - For mainly read-oriented workloads, consider reducing both limits to make more room for the block cache
        - When handling many writes, increase the memstore limits to reduce the excessive amount of I/O that flushing causes
  • 45. CONFIGURATION (5/6)
      - Increase blocking store files
        - The region servers block further updates from clients to give compactions time to reduce the number of files
        - hbase.hstore.blockingStoreFiles property, default is seven files
      - Increase the block multiplier
        - A safety latch that blocks any further updates from clients when the memstores exceed the multiplier * flush size limit
        - hbase.hregion.memstore.block.multiplier property, defaults to 2
        - If you have enough memory, you can increase this value to handle spikes more gracefully
        - Refer to ppt#003, p#8
  • 46. CONFIGURATION (6/6)
      - Decrease the maximum number of logfiles
        - Controls how often flushes occur, based on the number of WAL files on disk
        - hbase.regionserver.maxlogs property, default is 32
        - The number can be high in a write-heavy use case; lower it to force the servers to flush data to disk more often
  • 47. LOAD TESTS
      - It is advisable to run performance tests to verify the functionality of your cluster
      - These tests give you a baseline which you can refer to
        - After making changes to the configuration of the cluster, or to the schemas of your tables
      - Doing a burn-in of your cluster shows you how much you can gain from it
        - But this does not replace a test with the load expected from your use case
  • 48. LOAD TESTS – PERFORMANCE EVALUATION (1/2)
      - HBase ships with its own tool to execute a performance evaluation: Performance Evaluation (PE)
      - Wiki: http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation
      - Usage:
        /bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
        Usage: java org.apache.hadoop.hbase.PerformanceEvaluation [--miniCluster] [--nomapred] [--rows=ROWS] <command> <nclients>
  • 49. LOAD TESTS – PERFORMANCE EVALUATION (2/2)
      - Example:
        /bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
        11/07/03 13:18:34 INFO hbase.PerformanceEvaluation: Start class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset 0 for 1048576 rows
        ...
        11/07/03 13:18:41 INFO hbase.PerformanceEvaluation: 0/104857/1048576
        ...
        11/07/03 13:18:45 INFO hbase.PerformanceEvaluation: 0/209714/1048576
        ...
        11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: 0/1048570/1048576
        11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: Finished class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest in 89062ms at offset 0 for 1048576 rows
  • 50. LOAD TESTS – YCSB (1/2)
      - Yahoo! Cloud Serving Benchmark (YCSB)
        - A suite of tools that can be used to run comparable workloads against different storage systems
        - Also a reasonable tool for performing an HBase cluster burn-in, or performance test
      - Using YCSB is preferred over the HBase-supplied Performance Evaluation
        - It offers more options and can combine read and write workloads
      - Home page: http://research.yahoo.com/Web_Information_Management/YCSB
  • 51. LOAD TESTS – YCSB (2/2)
      - Use the HBase shell:
        create 'usertable', 'family'
      - git pull
      - cd ${GIT_HOME}/hbase-training/006/ycsb
      - Run the command:
        java -cp "${HBASE_CONF_DIR}:core-0.1.4.jar:hbase-binding-0.1.4.jar" com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p recordcount=1000 -s > ycsb-load.log
      - Then you can see the performance metrics in the ycsb-load.log file
  • 52. CLUSTER ADMINISTRATION
      - Operational Tasks
        - Node Decommission
        - Rolling Restarts
        - Adding a Backup Master
        - Adding a Region Server
      - Data Tasks
        - Export
        - Import
        - CopyTable Tool
        - Bulk Import
      - Troubleshooting
        - HBase Fsck
        - Analyzing the Logs
  • 53. OPERATIONAL TASKS – NODE DECOMMISSION (1/2)
      - Use the following scripts
        - In the normal HBase distribution:
          ${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver
        - In the tm distribution:
          ${TM_PUPPET_HOME}/bin/services/shutdown-regionservers.sh [<host> ...]
      - Disable the load balancer before decommissioning a node
        - In the HBase shell: balance_switch false
      - Regions could be offline for a good period of time
        - With many regions on the server, all regions close before the master notices the region server’s ZooKeeper znode being removed
  • 54. OPERATIONAL TASKS – NODE DECOMMISSION (2/2)
      - Stop a region server gradually
        - The node gradually sheds its load and then shuts itself down
        - Available from HBase 0.90.2: ${HBASE_HOME}/bin/graceful_stop.sh
      - Example:
        ${HBASE_HOME}/bin/graceful_stop.sh HOSTNAME
        - Check the HOSTNAME on your HBase master UI; refer to ppt#003, p#41
        - IP addresses are NOT supported at present
  • 55. OPERATIONAL TASKS – ROLLING RESTARTS
      - Also uses graceful_stop.sh
      - Steps:
        1. Ensure the cluster is consistent; fix it if inconsistent:
           hbase hbck
           hbase hbck -fix
        2. Restart the master:
           ${HBASE_HOME}/bin/hbase-daemon.sh stop master; ${HBASE_HOME}/bin/hbase-daemon.sh start master
        3. Disable the region balancer:
           echo "balance_switch false" | ${HBASE_HOME}/bin/hbase shell
        4. Run the graceful_stop.sh script per region server:
           for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
        5. Restart the master again
           - This clears out the dead servers list; then reenable the balancer
        6. Run hbck to ensure the cluster is consistent
  • 56. OPERATIONAL TASKS – ADDING A BACKUP MASTER (1/2)
      - To prevent a single point of failure
        - If the machine currently hosting the active master fails, the system can fall back to a backup master
      - Underlying operations:
        1. A dedicated ZooKeeper znode, /hbase/master
        2. All master processes race to create it at startup, and the first one to create it wins (and becomes the current master)
        3. All other master processes simply loop around the znode check and wait for it to disappear
           - Its disappearance triggers the race again
  • 57. OPERATIONAL TASKS – ADDING A BACKUP MASTER (2/2)
      - How to start multiple backup master processes
        - The original way to start a master process:
          ${HBASE_HOME}/bin/hbase-daemon.sh start master
        - In the tm distribution:
          ${TM_PUPPET_HOME}/bin/services/startup-hmaster.sh [<host> ...]
        - Specifically start a backup master process:
          ${HBASE_HOME}/bin/hbase-daemon.sh start master --backup
  • 58. OPERATIONAL TASKS – ADDING A REGION SERVER
      - In the normal HBase distribution
        - Edit ${HBASE_HOME}/conf/regionservers to add the newly added region server’s host name
        - Two scripts can be used:
          - ${HBASE_HOME}/bin/start-hbase.sh
            - It skips the already running region servers and starts the newly added region server listed in the regionservers file
          - ${HBASE_HOME}/bin/hbase-daemon.sh start regionserver
            - Must be executed on the newly added region server
      - In the tm distribution
        - A new feature; not covered here
  • 59. DATA TASKS
      - You may be required to move the data as a whole or in parts
        - Archive data for backup purposes, or bootstrap another cluster
      - Tools shipped in the HBase JAR:
        hadoop jar ${HBASE_HOME}/hbase-0.91.0-SNAPSHOT.jar
        An example program must be given as the first argument.
        Valid program names are:
        …
          completebulkload: Complete a bulk data load.
          copytable: Export a table from local cluster to peer cluster
          export: Write table data to HDFS.
          import: Import data written by Export.
          importtsv: Import data in TSV format.
        …
      - http://hbase.apache.org/book/ops_mgt.html
  • 60. DATA TASKS – EXPORT (1/3)
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export
      Usage: Export [-D <property=value>]* <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
  • 61. DATA TASKS – EXPORT (2/3)
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export testtable /user/larsgeorge/backup-testtable
      11/06/25 15:58:29 INFO mapred.JobClient: Running job: job_201106251558_0001
      11/06/25 15:58:30 INFO mapred.JobClient: map 0% reduce 0%
      …
      11/06/25 15:59:40 INFO mapred.JobClient: map 100% reduce 0%
      11/06/25 15:59:42 INFO mapred.JobClient: Job complete: job_201106251558_0001
      11/06/25 15:59:42 INFO mapred.JobClient: Counters: 6
      11/06/25 15:59:42 INFO mapred.JobClient:   Job Counters
      11/06/25 15:59:42 INFO mapred.JobClient:     Rack-local map tasks=32
      11/06/25 15:59:42 INFO mapred.JobClient:     Launched map tasks=32
      11/06/25 15:59:42 INFO mapred.JobClient:   FileSystemCounters
      11/06/25 15:59:42 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=3648
      11/06/25 15:59:42 INFO mapred.JobClient:   Map-Reduce Framework
      11/06/25 15:59:42 INFO mapred.JobClient:     Map input records=0
      11/06/25 15:59:42 INFO mapred.JobClient:     Spilled Records=0
      11/06/25 15:59:42 INFO mapred.JobClient:     Map output records=0
  • 62. DATA TASKS – EXPORT (3/3)
      - Each part-m-nnnnn file contains a piece of the exported data; together they form the full backup of the table
      - Use the hadoop distcp command to move the directory from one cluster to another, and perform the import there
        hadoop dfs -lsr /user/larsgeorge/backup-testtable
        drwxr-xr-x - ... 0 2011-06-25 15:58 _logs
        -rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00000
        -rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00001
        …
        -rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00030
        -rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00031
  • 63. DATA TASKS – IMPORT (1/2)
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import
      Usage: Import <tablename> <inputdir>

      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import testtable /user/larsgeorge/backup-testtable
      11/06/25 17:09:48 INFO mapreduce.TableOutputFormat: Created table instance for testtable
      11/06/25 17:09:48 INFO input.FileInputFormat: Total input paths to process : 32
      11/06/25 17:09:49 INFO mapred.JobClient: Running job: job_201106251558_0003
      11/06/25 17:09:50 INFO mapred.JobClient: map 0% reduce 0%
      11/06/25 17:10:04 INFO mapred.JobClient: map 6% reduce 0%
      …
      11/06/25 17:10:51 INFO mapred.JobClient:   Job Counters
      11/06/25 17:10:51 INFO mapred.JobClient:     Launched map tasks=32
      11/06/25 17:10:51 INFO mapred.JobClient:     Data-local map tasks=32
      11/06/25 17:10:51 INFO mapred.JobClient:   FileSystemCounters
      11/06/25 17:10:51 INFO mapred.JobClient:     HDFS_BYTES_READ=3648
      11/06/25 17:10:51 INFO mapred.JobClient:   Map-Reduce Framework
      11/06/25 17:10:51 INFO mapred.JobClient:     Map input records=0
      11/06/25 17:10:51 INFO mapred.JobClient:     Spilled Records=0
      11/06/25 17:10:51 INFO mapred.JobClient:     Map output records=0
  • 64. DATA TASKS – IMPORT (2/2)
      - Use the Import job to store the data in a different table, as long as it has the same schema
      - Both the export and import commands are per-table only
      - Using the hadoop distcp command to copy the entire /hbase directory in HDFS is not recommended
        - It may copy store files that are halfway through a memstore flush operation
  • 65. DATA TASKS – COPYTABLE TOOL (1/2)
      - Designed to bootstrap cluster replication
        - Make a copy of an existing table from the master cluster to the slave cluster
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable
      Usage: CopyTable [--rs.class=CLASS] [--rs.impl=IMPL] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
  • 66. DATA TASKS – COPYTABLE TOOL (2/2)
      - The copy of the table can also be stored on the same cluster:
        hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable --new.name=testtable3 testtable
        11/06/26 15:20:07 INFO mapreduce.TableOutputFormat: Created table instance for testtable3
        11/06/26 15:20:07 INFO mapred.JobClient: Running job: job_201106261454_0003
        11/06/26 15:20:08 INFO mapred.JobClient: map 0% reduce 0%
        11/06/26 15:20:19 INFO mapred.JobClient: map 6% reduce 0%
        …
        11/06/26 15:21:04 INFO mapred.JobClient: map 100% reduce 0%
        11/06/26 15:21:06 INFO mapred.JobClient: Job complete: job_201106261454_0003
        11/06/26 15:21:06 INFO mapred.JobClient: Counters: 5
        11/06/26 15:21:06 INFO mapred.JobClient:   Job Counters
        11/06/26 15:21:06 INFO mapred.JobClient:     Launched map tasks=32
        11/06/26 15:21:06 INFO mapred.JobClient:     Data-local map tasks=32
        11/06/26 15:21:06 INFO mapred.JobClient:   Map-Reduce Framework
        11/06/26 15:21:06 INFO mapred.JobClient:     Map input records=0
        11/06/26 15:21:06 INFO mapred.JobClient:     Spilled Records=0
        11/06/26 15:21:06 INFO mapred.JobClient:     Map output records=0
  • 67. DATA TASKS – BULK IMPORT (1/2)
      - The importtsv tool
        - Takes files containing data in tab-separated value (TSV) format
        - By default, it uses the HBase put() API to insert data into HBase one row at a time
        - By setting the importtsv.bulk.output option, it generates files using HFileOutputFormat
          - These can subsequently be bulk-loaded into HBase by the completebulkload tool
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar importtsv
      Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
  • 68. DATA TASKS – BULK IMPORT (2/2)
      - The completebulkload tool
        - Used to import the data into the running cluster after a data import has been prepared
          - Either by the importtsv tool with the importtsv.bulk.output option
          - Or by some other MapReduce job using the HFileOutputFormat
      hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar completebulkload -conf ~/my-hbase-site.xml /user/larsgeorge/myoutput mytable
  • 69. TROUBLESHOOTING – HBASE FSCK (1/4)
      - Shell command: ${HBASE_HOME}/bin/hbase hbck
      - Once started, it:
        - Scans the .META. table to gather all the pertinent information it holds
        - Scans the HDFS root directory HBase is configured to use
        - Compares the collected details to report on inconsistencies and integrity issues
      - Consistency check
        - Whether the region is listed in .META. and exists in HDFS, and is also assigned to exactly one region server
      - Integrity check
        - Compares the regions with the table details to find missing regions, or those that have holes or overlaps in their row key ranges
  • 70. TROUBLESHOOTING – HBASE FSCK (2/4)
      ${HBASE_HOME}/bin/hbase hbck -h
      Usage: fsck [opts]
      where [opts] are:
        -details Display full report of all regions.
        -timelag {timeInSeconds} Process only regions that have not experienced any metadata updates in the last {timeInSeconds} seconds.
        -fix Try to fix some of the errors.
        -sleepBeforeRerun {timeInSeconds} Sleep this many seconds before checking if the fix worked, if run with -fix
        -summary Print only summary of the tables and status.
  • 71. TROUBLESHOOTING – HBASE FSCK (3/4)
      - Invoking it with no option at all produces the normal output detail:
        ${HBASE_HOME}/bin/hbase hbck
        Number of Tables: 40
        Number of live region servers: 19
        Number of dead region servers: 0
        Number of empty REGIONINFO_QUALIFIER rows in .META.: 0
        Summary:
        ...
        testtable2 is okay.
          Number of regions: 1
          Deployed on: host11.foo.com:60020
        0 inconsistencies detected.
        Status: OK
  • 72. TROUBLESHOOTING – HBASE FSCK (4/4)
      - ${HBASE_HOME}/bin/hbase hbck -fix repairs the following issues:
        - Assigns .META. to a single new server if it is unassigned
        - Reassigns .META. to a single new server if it is assigned to multiple servers
        - Assigns a user table region to a new server if it is unassigned
        - Reassigns a user table region to a single new server if it is assigned to multiple servers
        - Reassigns a user table region to a new server if the current server does not match what the .META. table refers to
      - hbck may report inconsistencies that are temporal, or transitional only
        - Rerun the tool a few times to confirm a permanent problem
  • 73. TROUBLESHOOTING – ANALYZING THE LOGS (1/2)

      Server type         Default logfile                                              tm settings
      HBase Master        $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log         /var/log/hbase/hbase-<user>-master-<hostname>.log
      HBase RegionServer  $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log   /var/log/hbase/hbase-<user>-regionserver-<hostname>.log
      ZooKeeper           Console log output only                                     /var/log/hbase/hbase-<user>-zookeeper-<hostname>.log
      NameNode            $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log     /var/log/hadoop/hadoop-<user>-namenode-<hostname>.log
      DataNode            $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log     /var/log/hadoop/hadoop-<user>-datanode-<hostname>.log
      JobTracker          $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log   /var/log/hadoop/hadoop-<user>-jobtracker-<hostname>.log
      TaskTracker         $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log  /var/log/hadoop/hadoop-<user>-tasktracker-<hostname>.log
  • 74. TROUBLESHOOTING – ANALYZING THE LOGS (2/2)
      - It is useful to begin with the master logfile first
        - The master acts as the coordinator service of the entire cluster
      - Find where the processes began logging ERROR level messages
        - Being able to identify the root cause matters: a lot of subsequent messages are often side effects of the original problem
      - Recommended: use the error log event metric under the System Event Metrics group
        - It gives you a graph showing where the server(s) started logging an increasing number of error messages in the logfiles
      - If you find an error message: Google it!
        - Use the online resources to search for the message in the public mailing lists, e.g., Search Hadoop
  • 75. HANDS-ON – USE YCSB
      - New VM list
        - Because more VMs are not affordable at present :p
      - ${YOUR_HOME}=${GIT_HOME}/hbase-training/006/hands-on/${YOUR_NAME}
        - mkdir ${YOUR_HOME}
        - cd ${YOUR_HOME}; cp -rf ../../ycsb/* .
      - Use the HBase shell:
        create '<YOUR_NAMED_TABLE>', 'family'
      - Run YCSB with a record count of 5000, and output the ycsb-load.log file
      - Hands-on result
        - Put the ycsb-load.log file under ${YOUR_HOME}