Cassandra Summit 2014: Deploying Cassandra for Call of Duty
Presenters: Seán O Sullivan, Service Reliability Engineer & Tim Czerniak, Software Engineer at Demonware
This presentation covers the eight-month evaluation process we underwent to migrate some of Call of Duty’s core services from MySQL to Cassandra. We will outline our requirements, the process we followed for the evaluation, decisions we made around our schema, configuration and hardware, and some issues we encountered.


  1. DEMONWARE: Deploying Cassandra for Call of Duty. #CassandraSummit
  2. Tim Czerniak, Software Engineer, DemonWare; Seán O Sullivan, Operations Engineer, DemonWare
  3. DEMON-WHO? DemonWare is a subsidiary of Activision-Blizzard. We write, deploy and maintain client and server applications for Activision and Blizzard games.
  4. SERVICES
     • Matchmaking
     • Leaderboards
     • Chat
     • File Storage
     • Leagues
     • Social Network Integration
     • etc…
  5. TECHNOLOGIES
     • Client: C++, HTTP
     • Server: Python, Erlang, MySQL, CentOS, Puppet
  6. OUR UNUSUAL USE CASE: Release, First weekend, Christmas, Peak
  7. “By failing to prepare, you are preparing to fail.” – Benjamin Franklin
  8. OUR PREDICAMENT: Needed to share data cross-DC… …but MySQL isn’t so good at that.
  9. SERVICES
     • Progress store
       • High write, low read
       • File size ~4KB
       • Persistent
     • Presence
       • High write, high read
       • Data size minimal
       • Transient
     • Messaging
       • Low write, low read
       • Transient
  10. REQUIREMENTS
     • Cross DC
     • Ease of consolidation and expansion
     • Manageability for the operations teams
     • Throughput
       • Storage: 1,500,000 reqs/min
       • Presence: 250,000 reqs/min
       • Messaging: 850,000 reqs/min
  11. EVALUATION
     • Shortlisted suitable options
       • Riak
       • Cassandra
     • Re-wrote our application backend, twice
  12. LOAD TESTING
     • Two clusters
       • Single CPU, SSD and average memory
       • Dual CPU, Spindles and high memory
     • Used realistic user profiles
     • Included peaks and troughs during testing
     • Ran a soak test
  13. THE WINNER???
     • Initially Riak was a slam-dunk
     • Erlang-based (we know Erlang)
     • Tooling is excellent
     • Performed well
     • Previously evaluated
  14. THE WINNER
     • Cassandra won in the end
     • Write performance
     • Richer feature set
     • Maturity of codebase and tooling
     • Testing continued 24/7 until launch
  15. SCHEMA
     • Progress store
       • A perfect fit!
     • Presence
       • More relational
       • High throughput (Tombstones!)
       • TTLs
     • Messaging
       • Time-series data, well suited
       • Tombstones!
  16. SCHEMA: LESSONS LEARNED
     • Keep it simple
     • It’s not a relational DB
     • Get your partition keys and clustering keys right.
     • C* will do what it does best
  17. SCHEMA: LESSONS LEARNED
     • Don’t ignore CAP theorem
     • Cassandra has tuneable consistency, but there will be trade-offs
     • Load test with real numbers
     • Some issues aren’t evident in unit-tests
  18. CONFIG
     • Default settings, probably not what you want
     • Changed many settings off the bat
     • Reverted some (oops)
  19. HARDWARE
     • 2x Intel Xeon E5-2620 @ 2GHz
     • 2x 480GB SSD (RAID-1)
     • 32GB
     • 1Gb non-dedicated network
  20. MONITORING
     • Graphite
     • Nagios
     • Jolokia
  21. GOTCHAS
     • Vnodes and rack awareness
     • Load balancers
     • Dev differs from production (of course...)
     • Launching in a DC we didn't load test in
  22. LAUNCH
     • Request to simulate a node failure
     • Two nodes died over Christmas
     • Expanding to other titles
  23. QUESTIONS?
  24. APPENDIX
     cassandra.conf:
       auto_bootstrap: false
       hinted_handoff_throttle_in_kb: 1024
       max_hints_delivery_threads: 2
       trickle_fsync: true
       rpc_server_type: hsha
       <% if virtual == "physical" -%>
       concurrent_reads: 128
       <% else -%>
       concurrent_reads: 32
       <% end -%>
       concurrent_writes: <%= processorcount.to_i * 8 -%>
       multithreaded_compaction: false
       <% if virtual == "physical" -%>
       compaction_throughput_mb_per_sec: 0
       <% else -%>
       compaction_throughput_mb_per_sec: 16
       <% end -%>
     cassandra-env.sh:
       <% if virtual == "physical" -%>
       JVM_OPTS="$JVM_OPTS -Xss180k"
       <% else -%>
       JVM_OPTS="$JVM_OPTS -Xss228k"
       <% end -%>
       JVM_EXTRA_OPTS="$JVM_EXTRA_OPTS -javaagent:/usr/share/java/graphite-reporter-agent.jar -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8080,host=<%= hostname %>"
       EXTRA_CLASSPATH="/usr/share/java/metrics-graphite-2.0.3.jar"
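The appendix template switches a handful of settings on whether the host is physical and scales write concurrency with the processor count. As a rough illustration of what those conditionals evaluate to, here is a minimal Python sketch. The function name `render_settings`, the dictionary keys, and the example hosts (a 24-processor physical machine and a 4-processor virtual one) are assumptions made for this sketch; only the setting values themselves come from the appendix.

```python
def render_settings(virtual: str, processorcount: str) -> dict:
    """Mirror the appendix template's conditionals in plain Python.

    `virtual` and `processorcount` stand in for the template variables of
    the same names. Note that int() raises on non-numeric input, whereas
    Ruby's to_i would return 0.
    """
    physical = virtual == "physical"
    return {
        # cassandra.conf values from the appendix
        "concurrent_reads": 128 if physical else 32,
        "concurrent_writes": int(processorcount) * 8,
        # 0 disables compaction throttling in Cassandra
        "compaction_throughput_mb_per_sec": 0 if physical else 16,
        # cassandra-env.sh thread stack size flag from the appendix
        "jvm_thread_stack": "-Xss180k" if physical else "-Xss228k",
    }


if __name__ == "__main__":
    # Hypothetical hosts, chosen only to exercise both branches.
    for virtual, cpus in [("physical", "24"), ("vmware", "4")]:
        print(virtual, cpus, render_settings(virtual, cpus))
```

Evaluating the branches outside the template like this can make it easier to sanity-check values for a given host class before rolling a change out.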

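The throughput targets on the REQUIREMENTS slide are quoted per minute, while load-testing tools and dashboards often work per second. The short Python sketch below converts them. The per-service figures are taken from the slide; the combined total assumes all three targets apply at the same time, which the slides do not state.

```python
# Per-minute targets as listed on the REQUIREMENTS slide.
REQS_PER_MIN = {
    "Storage": 1_500_000,
    "Presence": 250_000,
    "Messaging": 850_000,
}


def per_second(reqs_per_min: int) -> float:
    """Convert a requests-per-minute figure to requests per second."""
    return reqs_per_min / 60


if __name__ == "__main__":
    for service, rpm in REQS_PER_MIN.items():
        print(f"{service}: {per_second(rpm):,.0f} reqs/s")
    # Assumption: the three peaks coincide, so the targets simply add up.
    total = sum(REQS_PER_MIN.values())
    print(f"Total: {total:,} reqs/min = {per_second(total):,.0f} reqs/s")
```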
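The "tuneable consistency, but there will be trade-offs" point from the lessons-learned slide can be made concrete with the usual replica-overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The Python sketch below checks a few combinations. The replication factor of 3 is an assumption for illustration, since the slides do not say what DemonWare used, and the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when every read set must intersect every write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, not taken from the slides
    q = quorum(rf)
    combos = {
        "read ONE / write ONE": (1, 1),
        "read QUORUM / write QUORUM": (q, q),
        "read ONE / write ALL": (1, rf),
    }
    for name, (r, w) in combos.items():
        print(f"{name}: overlap guaranteed = {overlaps(r, w, rf)}")
```

Lower consistency levels touch fewer replicas per request, which helps latency and availability, at the cost of losing the overlap guarantee.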