Matt Kennedy
(@mattmorefaster)
October 17, 2013

#CassandraEU
Copyright © 2013 Fusion-io, Inc. All rights reserved.


C* Summit EU 2013: Cassandra on Flash: Performance & Efficiency Lessons Learned


Speaker: Matt Kennedy, Solution Architect: Big Data at Fusion.io
YouTube: http://www.youtube.com/watch?v=xu_4TAQlY2U&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=21
Flash Memory technology, deployed as server-side PCIe or solid state disks (SSDs), is emerging as a critical tool for performance and efficiency in data centers of all scales. This presentation will discuss how the use of Flash impacts Cassandra deployments in terms of configuration, DRAM requirements and performance expectations. Ideas on leveraging C*'s cutting-edge data-center awareness to blend flash and disk storage nodes for cost and workload efficiency will also be shared. Flash media itself will be examined from a physical perspective to understand endurance issues. Data on write amplification under bulk-load and operational workload conditions will be presented to explain the impact to Flash of C*'s Log Structured Merge Tree architecture and the associated compactions. Finally, we will examine strategies to make Cassandra more Flash-aware using both conventional techniques as well as emerging Non-volatile memory (NVM) programming capabilities. Lessons learned from real-world customer deployments will be shared to complete this presentation.

Published in: Technology
  • D’oh! Indeed.
  • Tuning: turn off compaction throttling
  • C* Summit EU 2013: Cassandra on Flash: Performance & Efficiency Lessons Learned

    1. Cassandra: No Moving Parts
       Cassandra on Flash Memory
       Matt Kennedy (@mattmorefaster), October 17, 2013
       #CassandraEU. Copyright © 2013 Fusion-io, Inc. All rights reserved.
    2. What is this talk about?
       ▸ Efficiency
         • Definition: noun 1. The state or quality of being efficient.
       ▸ Efficient
         • Definition: adjective 1. (especially of a system or machine) achieving maximum productivity with minimum wasted effort or expense.
    3. Flash vs Disk Cost Efficiency
       ▸ Capacity: 3TB, 4TB
       ▸ IOPS: 150, 200,000
       ▸ Cost per IOP: $$$$, ¢¢¢¢
    4. What is flash?
    5. NAND Flash Memory
       [Diagram of a NAND flash cell. Labels: Source Line, Word Line, Control Gate, Bit Line, Float Gate, N, P, N]
       Flash is a persistent memory technology invented by Dr. Fujio Masuoka at Toshiba in 1980.
    6. Consumer Volume Drives Economics
    7. Flash in Servers
    8. Direct Cut Through Architecture
       [Diagram: FUSION DIRECT APPROACH vs LEGACY APPROACH. Labels: Host CPU, App, OS, DRAM, PCIe, Data path Controller, NAND, RAID Controller, SAS, SC, Super Capacitors]
       Goal of every I/O operation: move data to/from DRAM and flash.
    9. [No text on slide]
    10. Cassandra I/O - Writes
        http://www.datastax.com/docs/1.2/dml/about_writes
    11. Cassandra I/O - Reads
        http://www.datastax.com/docs/1.2/dml/about_reads
    12. DRAM Dictates Cassandra Scaling
        ▸ Key Design Principle:
        ▸ Working Set < DRAM
    13. Cost of DRAM Modules
        [Chart: module price in DOLLARS (y-axis, 0 to 1600) by module size (4GB, 8GB, 16GB, 32GB); bars annotated $, $$, $$$, $$$$$$]
    14. When do we scale out?
        ▸ A typical server…
          CPU Cores: 32 with HT
          Memory: 128 GB
        …is your working set > 128GB?
    15. Is there a better way?
        ▸ With NoSQL databases, we tend to scale out for DRAM
        Combined Resources
          CPU Cores: 192
          Memory: 768 GB
        • Low CPU utilization
        • High utility cost
    16. Flash Offers A New Architectural Choice
        [Diagram: storage tiers CPU Cache, DRAM, Server-based Flash, Disk Drives, shown against a latency scale of Nanoseconds (10^-9), Microseconds (10^-6), Milliseconds (10^-3)]
    17. How can we use flash in Cassandra?
    18. Four Deployment Options
        1. All Flash
        2. Data Placement (CASSANDRA-2749)
        3. Use Logical Data Centers
        4. Cache Layer
    19. Cassandra with All-Flash Storage
        Step 1: Mount ioMemory at /var/lib/cassandra
        Step 2:
    20. Data Placement
        ▸ https://issues.apache.org/jira/browse/CASSANDRA-2749
          • Thanks Marcus!
        ▸ Takes advantage of filesystem hierarchy
        ▸ Use mount points to pin Keyspaces or Column Families to flash:
          • /var/lib/cassandra/data/{Keyspace}/{CF}
        ▸ Use flash for high performance needs, disk for capacity needs
    21. Data Centers for Storage Control
        Cassandra cluster:
          DC1 (Interactive requests)
          DC2 (Hadoop MR Jobs)
          DC3 (High density replicas)
        [Diagram scales: PERFORMANCE and CAPACITY / NODE, with labels HIGH, MEDIUM, LOW, HIGH]
    22. Flash Caching
        ▸ Use Flash to cache blocks from spinning disk
          • Larger, cheaper caches than DRAM
          • Helps stabilize performance during compaction
        ▸ Open-Source & Commercial options:
          • Flashcache: Facebook-developed write-through/back/around cache
            ▸ Kernel patch
            ▸ https://github.com/facebook/flashcache/
          • bcache: write-through/back/around cache
            ▸ Kernel patch
            ▸ http://bcache.evilpiepirate.org/
          • Fusion ioTurbine: write-through, commercially supported
    23. The Numbers
    24. YCSB Testing Setup
        150 million 1KB records, RF=3: ~120GB SSTables/node
        [Diagram labels: YCSB Load Generator, x4, x1, 10GB, 16-cores, 24GB DRAM]
        Workloads use uniform random key selection instead of Zipfian.
    25. 50/50 R/W Uniform distribution, 10hrs
        [Chart: YCSB mixed ops/sec (y-axis, 0 to 70000); x-axis ticks 10 to 35531]
        Update Latency: Average 511 µs, 95th Pctl 1 ms, 99th Pctl 2 ms
        Read Latency: Average 7.0 ms, 95th Pctl 18 ms, 99th Pctl 42 ms
    26. 95/5 R/W Uniform distribution
        [Chart: mixed ops/sec (y-axis, 0 to 80000); x-axis ticks 0 to 690; phases marked 75 threads, 200 threads, 300 threads]
        # threads   Avg Lat.      95th pctl   99th pctl
        75          1.4/0.22 ms   2/0 ms      5/0 ms
        200         3.1/0.19 ms   7/0 ms      13/0 ms
        300         4.4/2.2 ms    11/0 ms     19/0 ms
    27. Consolidation
    28. http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
    29. Real-World Cassandra on Fusion
        • 3-4x consolidation factor
        • 3-6x reduction in latency
        • 2.2x ROI
    30. Efficiency: Performance or Consolidation?
        Cassandra @ ~100,000 ops/sec (mixed workload)
        [Diagram: Memory/Disk (eight groups labeled "x 4") vs. ioMemory (two groups labeled "x 4")]
        http://www.fusionio.com/white-papers/accelerate-cassandra-without-the-cluster-crawl/
    31. Thank You
        @mattmorefaster
        fusionio.com
        SAME PLANET. DIFFERENT WORLD.
    32. Cassandra: ioDrive2 vs 10 disk RAID-0
        #Cassandra13 Novemb
    33. 50/50 R/W Uniform distribution
        [Chart: YCSB mixed ops/sec (y-axis, up to 120000); x-axis ticks 0 to 550]
        Read Latency: Average 8.2 ms, 95th Pctl 20 ms, 99th Pctl 62 ms
        Update Latency: Average 311 µs, 95th Pctl 0 ms, 99th Pctl 1 ms
    34. YCSB: Bulk Load (CL=ALL)
        [Chart: YCSB inserts/sec (y-axis, 0 to 70000); x-axis ticks 10 to 2830]
        Avg Latency: 0.9 ms, 95th Percentile: 1 ms, 99th Percentile: 4 ms
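
Slide 3 contrasts the two media by cost per IOP but gives no prices. The arithmetic behind that comparison can be sketched roughly as follows. This assumes the 150 IOPS figure belongs to the disk and the 200,000 figure to the flash device (the transcript does not label them), and both device prices are hypothetical placeholders, not figures from the talk.

```python
def cost_per_iop(device_price_usd: float, iops: int) -> float:
    """Return the purchase price divided by the device's IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return device_price_usd / iops


# IOPS figures are from slide 3; which device each belongs to is an assumption.
DISK_IOPS = 150
FLASH_IOPS = 200_000

# Hypothetical prices, chosen only to make the arithmetic concrete.
HYPOTHETICAL_DISK_PRICE_USD = 300.0
HYPOTHETICAL_FLASH_PRICE_USD = 10_000.0

disk_cost = cost_per_iop(HYPOTHETICAL_DISK_PRICE_USD, DISK_IOPS)
flash_cost = cost_per_iop(HYPOTHETICAL_FLASH_PRICE_USD, FLASH_IOPS)

print(f"disk:  ${disk_cost:.2f} per IOP")
print(f"flash: ${flash_cost:.2f} per IOP")
print(f"IOPS ratio (flash/disk): {FLASH_IOPS / DISK_IOPS:.0f}x")
```

With these placeholder prices the disk works out to dollars per IOP and the flash device to cents per IOP, which is the shape of the slide's "$$$$ vs ¢¢¢¢" comparison. The real ratio depends on actual prices.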
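
Slides 12, 14 and 15 argue that clusters grow because the working set must fit in DRAM. A minimal sketch of that sizing rule follows, using the per-server figures from slide 14. The function name is made up for illustration, and the calculation ignores JVM heap and OS overhead.

```python
import math


def nodes_for_working_set(working_set_gb: float, dram_per_node_gb: float) -> int:
    """Smallest node count whose combined DRAM can hold the working set."""
    if dram_per_node_gb <= 0:
        raise ValueError("dram_per_node_gb must be positive")
    return max(1, math.ceil(working_set_gb / dram_per_node_gb))


DRAM_PER_NODE_GB = 128  # slide 14
CORES_PER_NODE = 32     # slide 14 (with HT)

# A 768 GB working set matches the "Combined Resources" on slide 15.
nodes = nodes_for_working_set(768, DRAM_PER_NODE_GB)
total_cores = nodes * CORES_PER_NODE

print(f"nodes: {nodes}, combined cores: {total_cores}")
```

Six servers bought for their memory also bring 192 cores, which is where the "Low CPU utilization" bullet on slide 15 comes from.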
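
Slide 20 describes pinning a keyspace or column family to flash by mounting a device at its data directory. The sketch below shows one way to check which mount point a given column family directory lives on. The directory layout is the one shown on the slide. The helper names, keyspace and column family are hypothetical.

```python
import os

CASSANDRA_DATA_DIR = "/var/lib/cassandra/data"  # layout from slide 20


def column_family_dir(keyspace: str, column_family: str,
                      data_dir: str = CASSANDRA_DATA_DIR) -> str:
    """Build {data_dir}/{Keyspace}/{CF} as described on the slide."""
    return os.path.join(data_dir, keyspace, column_family)


def nearest_mount_point(path: str) -> str:
    """Walk up from path until a directory that is a mount point is found."""
    path = os.path.abspath(path)
    while not os.path.ismount(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the top without finding a mount point
            break
        path = parent
    return path


# Hypothetical keyspace and column family names.
cf_dir = column_family_dir("Keyspace1", "Users")
print(cf_dir, "is served from mount point", nearest_mount_point(cf_dir))
```

If the flash device is mounted at the column family directory, the reported mount point is that directory. Otherwise the data falls through to whatever filesystem holds the parent path.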
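
Slide 21 splits one cluster into three logical data centers with different storage profiles. One way this could be expressed is a keyspace that uses NetworkTopologyStrategy with a replica count per data center. The sketch below only builds the CQL statement text. The keyspace name and the replica counts are assumptions, not values from the talk.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with per-data-center replica counts."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{', '.join(options)}}};"


# Data center names are from slide 21; the replica counts are hypothetical.
statement = create_keyspace_cql("demo", {"DC1": 3, "DC2": 1, "DC3": 1})
print(statement)
```

The data center names must match what the cluster's snitch reports for each node, so DC1 could hold the flash-backed nodes and DC3 the high-density disk nodes.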
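
Slide 22 lists caches that support write-through, write-back and write-around modes. The toy model below illustrates how the three policies differ in where a write lands. It is a teaching sketch with dictionaries standing in for the flash cache and the disk, not how Flashcache, bcache or ioTurbine are implemented.

```python
class ToyBlockCache:
    """Toy flash cache in front of a slower disk (dicts stand in for both)."""

    POLICIES = ("write-through", "write-back", "write-around")

    def __init__(self, policy: str):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.cache = {}     # block number -> data held on flash
        self.disk = {}      # block number -> data held on spinning disk
        self.dirty = set()  # cached blocks not yet written to disk

    def write(self, block: int, data: bytes) -> None:
        if self.policy == "write-through":
            # Write to cache and disk together; the disk is never stale.
            self.cache[block] = data
            self.disk[block] = data
        elif self.policy == "write-back":
            # Write to cache only; the disk is updated later by flush().
            self.cache[block] = data
            self.dirty.add(block)
        else:
            # write-around: bypass the cache and drop any stale cached copy.
            self.disk[block] = data
            self.cache.pop(block, None)

    def read(self, block: int) -> bytes:
        if block not in self.cache:
            self.cache[block] = self.disk[block]  # miss: fill from disk
        return self.cache[block]

    def flush(self) -> None:
        """Write all dirty blocks back to disk (write-back policy)."""
        for block in self.dirty:
            self.disk[block] = self.cache[block]
        self.dirty.clear()


for policy in ToyBlockCache.POLICIES:
    c = ToyBlockCache(policy)
    c.write(7, b"row")
    print(f"{policy}: in cache={7 in c.cache}, on disk={7 in c.disk}")
```

Write-back gives the lowest write latency but leaves data only on the cache device until it is flushed, which is one reason a write-through product is the conservative choice.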
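
Slide 24 states the data set as 150 million 1KB records at RF=3, or roughly 120GB of SSTables per node. A back-of-the-envelope check of that figure follows. It assumes "1KB" means 1000 bytes and that the "x4" label refers to four Cassandra nodes; neither is stated outright in the transcript.

```python
RECORDS = 150_000_000     # slide 24
RECORD_BYTES = 1_000      # assumption: "1KB" taken as 1000 bytes
REPLICATION_FACTOR = 3    # slide 24
ASSUMED_NODES = 4         # assumption: read from the "x4" label
DRAM_PER_NODE_GB = 24     # slide 24

raw_gb = RECORDS * RECORD_BYTES / 1e9
replicated_gb = raw_gb * REPLICATION_FACTOR
per_node_gb = replicated_gb / ASSUMED_NODES

print(f"raw data:        {raw_gb:.1f} GB")
print(f"with RF=3:       {replicated_gb:.1f} GB")
print(f"per node:        {per_node_gb:.1f} GB")
print(f"data/DRAM ratio: {per_node_gb / DRAM_PER_NODE_GB:.1f}x")
```

Under these assumptions the payload alone comes to about 112 GB per node, which is consistent with the slide's ~120GB once on-disk overhead is added. Either way, the per-node data is several times larger than the 24GB of DRAM, so the test deliberately breaks the "Working Set < DRAM" rule from slide 12.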