
Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Presenter: Michael Nelson, Development Manager at FamilySearch

A recent research project at FamilySearch.org pushed Cassandra to very high scale and performance limits in AWS using a real application. Come see how we achieved 250K reads/sec with latencies under 5 milliseconds on a 400-core cluster holding 6 TB of data while maintaining transactional consistency for users. We'll cover tuning of Cassandra's caches, other server-side settings, client driver, AWS cluster placement and instance types, and the tradeoffs between regular & SSD storage.

Published in: Technology

  1. Performance Tuning Cassandra In AWS. Cassandra Summit 2014. Michael Nelson. © 2014 by Intellectual Reserve, Inc. All rights reserved.
  2. Outline: • The App: FamilySearch Family Tree • The Test: Borland Silk Performer • The Findings: • Row Cache • Token Aware Driver • Networking Issues • Etc.
  3. What Is FamilySearch? • FamilySearch.org Website • Very Large Single Pedigree (Family Tree) • Largest Collection of Free Genealogical Records • Largest Genealogical Library • The Church of Jesus Christ of Latter-day Saints (Mormons)
  4. Why does FamilySearch exist? Visit http://mormon.org/family-history/
  5. Family Tree Data. Family Tree: • 900M+ Person Records, Open-Edit • 500M+ Relationships, Open-Edit • 8.4B Change Log Entries, ~1M / day • 7TB in Cassandra (13TB in Oracle) • Dynamic OLTP system • Data-dependent performance issues
  6. Family Tree: Example 9 Gen Pedigree. Up to 511 person slots. Dynamic content.
  7. Family Tree: Example Pedigree App. 31+ persons per section. Dynamic content.
  8. Family Tree: Example Ancestor Page. 10+ persons in families. 100-1000+ changes. Dynamic content.
  9. Cassandra Reimplementation: • Event-Sourced Data Model – journal / views • New Data Model – no indexes • New Consistency Model – satisfies consistency. [Diagram labels: P1, JE #8, P1 Views A B; P2, JE #6, P2 Views A B]
  10. 77% Reads / 23% Writes. Reads: • LOCAL_ONE • Simple Queries. Writes: • LOCAL_QUORUM • Atomic Batches • Multiple Tables • Multiple Rows • Business Logic
  11. A Little Optimization Goes A Long Way. 28 Node Cluster: • 250,000 op/sec • Optimized App. 8 Node Cluster: • 200,000 op/sec • Optimized App • Row Cache • Token Aware Driver
  12. Test System. Cassandra (Community Ed. 2.0.5), 8 hi1.4xlarge: • 16 CPU • 61 GB RAM • 2 TB SSD • 10 Gb net. Family Tree App Servers (Datastax 2.0.0), 60 m2.2xlarge: • 4 CPU • 34 GB RAM • “moderate” net. Silk Performer Load Agents, 25 m2.xlarge: • 2 CPU • 17 GB RAM • “moderate” net
  13. 2x Throughput Increase. [Bar chart: op/sec from 0 to 200,000, split into Reads and Writes, for Defaults, Row Cache, Token Aware, concurrent_reads]
  14. Row Cache = 35% More Throughput. Default Key Cache: • Cached Disk Location • Data From Disk Cache • ~11ms Reads. Row Cache: • Cached Row Contents • ~7ms Reads
  15. Configuring Row Cache. cassandra.yaml: # Maximum size of the row cache in memory. # Default value is 0, to disable row caching. row_cache_size_in_mb: 32768. Enable For Each Table Explicitly: ALTER TABLE person_view WITH caching = 'ALL';
  16. 90% Row Cache Hit Rate
  17. Token Aware = 50% More Throughput. Default Round Robin: • Coordinator Middleman • Adds Network Hops • Load On Multiple Nodes • ~7ms. Token Aware: • Reads From Replicas • No Network Hops • ~2ms
  18. Configuring Token Aware. Default Load Balancing Policy: new RoundRobinPolicy(). Better: new TokenAwarePolicy(new RoundRobinPolicy())
  19. concurrent_reads = 5% More Throughput. Defaults: concurrent_reads: 32, concurrent_writes: 32, native_transport_max_threads: 128. Improved: concurrent_reads: 256, concurrent_writes: 256, native_transport_max_threads: 256
  20. Now Where’s The Bottleneck? • 181,000 reads/sec; 21,000 writes/sec • CPU = 80% • Network = 10% • Disk < 5%
  21. Network Mystery: C* ≤ 800Mb. C* Never Exceeded 800Mb On 10Gb Network
  22. Network Mystery: Cyclic Net Queues. • About 5 Second Cycle of Net Queues Backing Up • Client Machines Seemed OK • Tweaking Network Stack Had No Impact: • net.core.wmem_max • net.core.rmem_max • net.ipv4.tcp_wmem • net.ipv4.tcp_rmem • net.core.somaxconn • net.core.netdev_max_backlog • net.ipv4.tcp_tw_recycle • net.ipv4.tcp_max_syn_backlog • net.ipv4.ip_local_port_range • txqueuelen
  23. Network Mystery: Cyclic Net Queues. Send-Qs Backup
  24. Network Mystery: Cyclic Net Queues. Recv-Qs Backup
  25. Network Mystery: Cyclic Net Queues. Somewhat Normal – Then Starts Again
  26. 2x Throughput Increase. [Bar chart: op/sec from 0 to 200,000, split into Reads and Writes, for Defaults, Row Cache, Token Aware, concurrent_reads]
  27. Contact Info. Michael Nelson, Development Manager, nelsonmi@familysearch.org. Thanks to FamilySearch team! Thanks to the awesome presenters & organizers at #CassandraSummit!
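
Slide 17 attributes the token-aware gain to removing the coordinator "middleman". As a rough illustration that is not taken from the talk, the sketch below counts how often a round-robin choice of coordinator lands on a node holding no replica of the requested row, so that a LOCAL_ONE read has to be forwarded. It is a toy model: the class name `CoordinatorHops`, the replication factor of 3, the uniform spread of keys, and the rule that replicas are the primary node plus the next `rf - 1` nodes on the ring are all assumptions. Only the 8-node cluster size comes from slide 12.

```java
/** Toy model of coordinator selection; this is not Cassandra or driver code. */
final class CoordinatorHops {

    /** True if coordinator is one of the rf consecutive ring positions starting at primary. */
    static boolean isReplica(int coordinator, int primary, int nodes, int rf) {
        int distance = Math.floorMod(coordinator - primary, nodes);
        return distance < rf;
    }

    /**
     * Fraction of requests that a round-robin policy sends to a non-replica
     * coordinator, enumerated over every (primary, coordinator) pair with equal weight.
     */
    static double roundRobinForwardFraction(int nodes, int rf) {
        int forwarded = 0;
        for (int primary = 0; primary < nodes; primary++) {
            for (int coordinator = 0; coordinator < nodes; coordinator++) {
                if (!isReplica(coordinator, primary, nodes, rf)) {
                    forwarded++;
                }
            }
        }
        return (double) forwarded / ((double) nodes * nodes);
    }

    /**
     * Same count for a token-aware policy, modelled here as always choosing
     * the primary replica as coordinator.
     */
    static double tokenAwareForwardFraction(int nodes, int rf) {
        int forwarded = 0;
        for (int primary = 0; primary < nodes; primary++) {
            int coordinator = primary; // token-aware: route straight to a replica
            if (!isReplica(coordinator, primary, nodes, rf)) {
                forwarded++;
            }
        }
        return (double) forwarded / nodes;
    }

    public static void main(String[] args) {
        int nodes = 8; // Cassandra nodes in the test system (slide 12)
        int rf = 3;    // ASSUMPTION: the replication factor is not stated in the slides
        System.out.printf("round robin forwards %.1f%% of reads%n",
                100 * roundRobinForwardFraction(nodes, rf));
        System.out.printf("token aware forwards %.1f%% of reads%n",
                100 * tokenAwareForwardFraction(nodes, rf));
    }
}
```

Under these assumptions a majority of round-robin reads pay for an extra hop, which is at least consistent in direction with the latency drop from ~7ms to ~2ms reported on slide 17. The model ignores everything else that affects latency, so it should not be read as an explanation of the measured numbers.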

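The per-step gains on slides 14, 17 and 19 (35%, 50%, 5%) can be checked against the "2x Throughput Increase" headline on slides 13 and 26. The sketch below multiplies them together, assuming each percentage was measured relative to the previous step rather than relative to the defaults; the slides do not say which, and the class name `ThroughputGains` is illustrative.

```java
/** Compounds successive fractional throughput gains into one speedup factor. */
final class ThroughputGains {

    /** Multiplies (1 + gain) for each gain, where 0.35 means "35% more". */
    static double compound(double... gains) {
        double factor = 1.0;
        for (double gain : gains) {
            factor *= 1.0 + gain;
        }
        return factor;
    }

    public static void main(String[] args) {
        // Row cache, token-aware driver, concurrent_reads (slides 14, 17, 19).
        // ASSUMPTION: each gain is relative to the step before it.
        double factor = compound(0.35, 0.50, 0.05);
        System.out.printf("combined speedup: %.2fx%n", factor);
    }
}
```

With that assumption the three steps compound to a little over 2.1x, which agrees with the rounded "2x" in the slide title.
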
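
The bottleneck figures on slides 20 and 21 can be cross-checked with a few lines of arithmetic. This sketch assumes the 181,000 reads/sec and 21,000 writes/sec were measured on the 8-node cluster from slide 12, and that "800Mb" and "10Gb" mean megabits and gigabits per second; neither is stated outright, and the class name `BottleneckMath` is illustrative.

```java
/** Back-of-the-envelope checks on the slide 20 and 21 figures. */
final class BottleneckMath {

    /** Total operations per second across reads and writes. */
    static long totalOps(long readsPerSec, long writesPerSec) {
        return readsPerSec + writesPerSec;
    }

    /** Average operations per second handled by each node. */
    static double perNode(long totalOpsPerSec, int nodes) {
        return (double) totalOpsPerSec / nodes;
    }

    /** Link utilization as a percentage of capacity (same unit for both arguments). */
    static double utilizationPercent(double used, double capacity) {
        return 100.0 * used / capacity;
    }

    public static void main(String[] args) {
        long total = totalOps(181_000, 21_000);             // slide 20
        double eachNode = perNode(total, 8);                // ASSUMPTION: 8 nodes (slide 12)
        double network = utilizationPercent(800, 10_000);   // ASSUMPTION: Mb/s on a 10 Gb/s link
        System.out.println("total op/sec:    " + total);
        System.out.println("op/sec per node: " + eachNode);
        System.out.println("network use (%): " + network);
    }
}
```

The total lands on the 200,000 op/sec quoted for the 8-node cluster on slide 11, and the computed network utilization is in line with the "Network = 10%" bullet, so the 800Mb ceiling and the 10% figure appear to describe the same observation.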