HBaseCon 2013: Apache HBase on Flash

Presented by: Matt Kennedy, Fusion-io


  1. HBase on Flash. Matt Kennedy. HBaseCon, June 13, 2013. Fusion-io Confidential. Copyright © 2013 Fusion-io, Inc. All rights reserved.
  2. “Switch your databases to flash storage now. Or you’re doing it wrong.” Brian Bulkowski, Aerospike CTO and co-founder. http://highscalability.com/blog/2012/12/10/switch-your-databases-to-flash-storage-now-or-youre-doing-it.html
  3. NAND Flash + (July 8, 2013)
  4. NAND Flash Memory
  5. NAND Flash Memory. Flash is a persistent memory technology invented by Dr. Fujio Masuoka at Toshiba in 1980. (Diagram: floating-gate NAND cell showing the bit line, source line, word line, control gate, floating gate, and NPN structure.)
  6. Flash in Servers
  7. Direct Cut-Through Architecture. (Diagram: in the legacy approach, I/O flows from the application through the OS and a RAID controller over SAS to the drive’s controller, DRAM, and NAND. In the Fusion direct approach, the host CPU reaches the NAND directly over PCIe. SC = super capacitors.) The goal of every I/O operation is to move data to/from DRAM and flash.
  8. NAND Flash +
  9. HBase Options: 1. What do we do today? 2. What does HBase look like on flash? 3. What if we cannot go all-flash?
  10. Conventional HBase Node. Key design principle: working set < DRAM.
  11. Working Set Getting Bigger. Key design principle: working set < DRAM.
  12. EEP. Key design principle: working set < DRAM.
  13. Cost of DRAM Modules. (Chart: dollar cost of 4 GB, 8 GB, 16 GB, and 32 GB DRAM modules, rising steeply with capacity.)
  14. HBase Server. A typical server: 32 CPU cores with hyper-threading, 128 GB of memory. Is your working set larger than 128 GB?
  15. HBase Cluster. With NoSQL databases, we tend to scale out for DRAM. Combined resources: 96 CPU cores, 384 GB of memory. More cores than needed to serve reads and writes.
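The scale-out math on this slide can be sketched as below. The per-node specs come from the previous slide’s typical server; the three-node count is inferred from the combined totals, and the 350 GB working set is a made-up number used only for illustration.

```python
import math

# Per-node specs from the "typical server" slide; the node count is an
# assumption inferred from the combined totals (96 cores, 384 GB).
cores_per_node = 32      # with hyper-threading
dram_per_node_gb = 128
nodes = 3

combined_cores = nodes * cores_per_node      # 96 cores
combined_dram_gb = nodes * dram_per_node_gb  # 384 GB

# Scaling out for DRAM: nodes are added until the working set fits in
# memory, regardless of how many cores that brings along.
working_set_gb = 350  # hypothetical working set
nodes_for_dram = math.ceil(working_set_gb / dram_per_node_gb)

print(combined_cores, combined_dram_gb, nodes_for_dram)  # 96 384 3
```

This is the point of the slide: the node count is driven by DRAM capacity, so the cluster ends up with more CPU than the read/write load needs.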
  16. The All-Flash Option. (Diagram: a conventional HBase node runs the DataNode daemon, the RegionServer daemon, and the memory store with storage directories on disk; the all-flash node keeps the same daemons but puts the storage directories on flash.)
  17. Short-Circuit Reads Matter on Flash! (Chart: read ops/sec over time, comparing reads through the DataNode with short-circuit reads.)
  18. Short-Circuit Reads (DataNode Bypass)

      In hdfs-site.xml:

      <property>
        <name>dfs.block.local-path-access.user</name>
        <value>hbase</value>
      </property>

      In hbase-site.xml:

      <property>
        <name>dfs.client.read.shortcircuit</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.regionserver.checksum.verify</name>
        <value>true</value>
      </property>
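As an aside, Hadoop-style configuration files like the ones above can be generated programmatically. The sketch below is a hypothetical helper, not part of HBase or Hadoop; it emits one of the properties from this slide.

```python
import xml.etree.ElementTree as ET

def make_property(name: str, value: str) -> ET.Element:
    """Build one Hadoop-style <property><name/><value/></property> element."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop

# Assemble a minimal <configuration> block with one property from the slide.
conf = ET.Element("configuration")
conf.append(make_property("dfs.block.local-path-access.user", "hbase"))
xml_text = ET.tostring(conf, encoding="unicode")
print(xml_text)
```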
  19. YCSB Suite, Uniform Distribution. (Chart: mixed-workload operations/sec over time for 50/50 read/write, 95/5 read/write, and read-only workloads.)
  20. YCSB Suite Latency, Uniform Distribution

      Workload                      Average    95th Pctl   99th Pctl
      50/50 R/W, update latency     81 µs      0 ms        0 ms
      50/50 R/W, read latency       13.5 ms    34 ms       128 ms
      95/5 R/W, update latency      69.3 µs    0 ms        0 ms
      95/5 R/W, read latency        8.5 ms     26 ms       39 ms
      Read-only                     9.2 ms     26 ms       38 ms
  21. Write Amplification. Amplification factor = physical bytes written / workload bytes written.

      Workload Type                                    Amplification Factor
      Bulk load                                        14.8
      Normal operations (80/20 update/insert split)    4.2
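The amplification factor is a simple ratio. In the sketch below the byte counts are made up for illustration; only the resulting factors (14.8 for bulk load, 4.2 for normal operations) come from the slide.

```python
def amplification_factor(physical_bytes_written: float,
                         workload_bytes_written: float) -> float:
    """Write amplification: physical bytes written / workload bytes written."""
    return physical_bytes_written / workload_bytes_written

# e.g. a bulk load that wrote 14.8 GB to flash for every 1 GB of workload data
print(amplification_factor(14.8e9, 1.0e9))  # 14.8
print(amplification_factor(4.2e9, 1.0e9))   # 4.2
```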
  22. The HBase BucketCache (HBASE-7404). Committed to HBase trunk; it will be in the 0.96 release, and a backport patch for 0.94 is available. https://issues.apache.org/jira/browse/HBASE-7404
  23. BucketCache Configuration

      In hbase-site.xml:

      <property>
        <name>hbase.bucketcache.ioengine</name>
        <value>file:/path/to/bucketcache.dat</value>
      </property>
      <property>
        <name>hbase.bucketcache.size</name>
        <!-- 2 TB; the unit is MB -->
        <value>2097152</value>
      </property>
  24. BucketCache Warm-up. (Chart: read ops/sec during cache warm-up, climbing as the cache fills.)
  25. BucketCache Steady-State. (Chart: read ops/sec at steady state.) Average latency: 5.2 ms; 95th percentile: 21 ms; 99th percentile: 24 ms.
  26. BucketCache, 50% Read / 50% Update. (Chart: mixed-workload operations/sec.) Update latency: 11.7 µs average, 0 ms at the 95th and 99th percentiles. Read latency: 7.9 ms average, 34 ms at the 95th percentile, 65 ms at the 99th.
  27. BucketCache During Compaction. (Chart: read ops/sec while compaction runs.) Average latency: 7.8 ms; 95th percentile: 37 ms; 99th percentile: 61 ms.
  28. What Next? 1. Can we do something about that write amplification? 2. There is minimal penalty for in-place updates in flash on modern FTLs; can we devise a way to do this that is compatible with HDFS? 3. Does HDFS need to be more aware of different storage technologies (DRAM, flash, PCM)?
  29. THANK YOU. fusionio.com | REDEFINE WHAT’S POSSIBLE
  30. Performance
  31. Performance
