C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman

Abstract: A brief intro to how Barracuda Networks uses Cassandra and how they are replacing their MySQL infrastructure with it. This presentation covers the lessons they learned during the migration.

Published in: Technology

  1. Hindsight is 20/20: MySQL to Cassandra
     Michael Kjellman (@mkjellman), Barracuda Networks
     #cassandra13
  2. What I Do
     • Build and maintain “real-time” Spam detection and Web Filter classification
     • Java/Perl/C (and bits of everything else)
     • Author perlcassa (Perl C* client)
     • Frontend? Backend? Customer? Internal? Broken RAID Card? Bad Disk? I touch it all.
  3. Our C* Cluster
     • In production for ~2 years since 0.8
     • Running 1.2.5 + minor patches
     • 24 nodes in 2 datacenters
     • (2) 2TB Hard Drives (no RAID)
     • (1) Small SSD for small hot CFs
     • 64GB of RAM
     • Puppet for management
     • Cobbler for deployment
     • Target max load at 600GB per node
  4. What is “real-time” exactly?
  6. Our Rewrite by the Numbers

     Metric                                       Cassandra Based  MySQL Based
     Average Application Latency                  2.41 ms          5.0 ms
     Elements in Database                         32,836,767       3,946,713
     Elements Application Handles                 32,836,767       314,974
     Element Seen Prior to Tracking               1st request      Various thresholds
     Datacenters                                  2                1
     Average Latency of Automated Classification  3 seconds        8 minutes
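
The improvement factors implied by slide 6 can be worked out directly. This is a small sketch that uses only the figures on the slide; the dictionary keys and the `ratio` helper are illustrative names, not anything from the talk.

```python
# Figures copied from the "Our Rewrite by the Numbers" slide.
cassandra = {"latency_ms": 2.41, "elements_handled": 32_836_767, "classification_s": 3}
mysql = {"latency_ms": 5.0, "elements_handled": 314_974, "classification_s": 8 * 60}


def ratio(larger, smaller):
    """Return how many times `larger` exceeds `smaller`."""
    return larger / smaller


latency_speedup = ratio(mysql["latency_ms"], cassandra["latency_ms"])
classification_speedup = ratio(mysql["classification_s"], cassandra["classification_s"])
scale_increase = ratio(cassandra["elements_handled"], mysql["elements_handled"])

print(f"application latency: {latency_speedup:.1f}x lower")
print(f"automated classification: {classification_speedup:.0f}x faster")
print(f"elements handled by the application: {scale_increase:.0f}x more")
```

The largest change is in classification latency (minutes to seconds), which matches the deck's emphasis on “real-time” response.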
  7. Should you Rewrite?
     • How To Survive a Ground-Up Rewrite Without Losing Your Sanity [1] – Joel Spolsky
     • Past engineering decisions preventing implementation of new business requirements
     • New threats smarter and more targeted
     [1] http://onstartups.com/tabid/3339/bid/97052/How-To-Survive-a-Ground-Up-Rewrite-Without-Losing-Your-Sanity.aspx
  8. Evolving Legacy Systems
     • Even good developers can write sloppy code
     • Too much duct tape
       – Most layers applied around the database
  9. Hitting the Reset Button
     • Plan for continuous failure
     • Easily Scalable
     • No Single Point of Failure – that you know of
     • Many smaller boxes vs. one monolithic box
  10. Whiteboard to Reality
      • Get technical buy-in from all parties
      • Migrate and rewrite in stages
        – Business requirements forced hybrid period with the old and new systems operated in parallel
  12. Cassandra is Not…
      1. Direct MySQL replacement
      2. Magic bullet to solve everything
  13. Migrating
      • Painful
      • Painful
      • Painful
      • Tons of rewriting
      • Tons of regressions
      • Did I say painful?
  14. So Why Migrate?
      • C* is the best option for persistence tier
      • Business success motivation
      • Don't let your database hold you back
  15. Lessons Learned (the good)
      • Carefully defining data model up front
      • Creating a flexible systems architecture that adapts well to changes during implementation
      • Seriously – “Measure twice, cut once.”
  16. Lessons Learned (the bad)
      • Consider migration and delivery requirements from the very beginning
      • Adjust expectations – didn't expect relying on legacy systems for so long
      • Make syncing data between systems a priority
  17. Tips
      1. Define requirements early
      2. Start with the queries
      3. Think differently regarding reads
      4. Syncing and migrating data
      5. Don't use C* as a queue
      6. Estimate capacity
      7. Automate, Automate, Automate
      8. Some maintenance required
  18. 1. Define Requirements Early
      • What kind of queries will your application make?
      • Do you need ordered results for all of your rows?
      • What is your read load? Write load?
  19. 2. Start with the Queries
      • C* != “#dontneedtothinkaboutmyschema”
      • Counters and Composites
      • Optimize for use case
        – Don't be afraid of writes. Storage is cheap.
        – Optimize to reduce the number of tombstones
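
One way to read “start with the queries” is to choose the partition key and the composite column order from the query itself, so the read becomes a single ordered slice. The sketch below is a toy in-memory model, not Cassandra and not any client API; `WideRowTable`, the domain partition key, and the (date, time) composite are hypothetical names chosen for illustration.

```python
import bisect
from collections import defaultdict


class WideRowTable:
    """Toy column family: partition key -> columns sorted by composite name."""

    def __init__(self):
        self._rows = defaultdict(list)  # partition key -> sorted [(composite, value)]

    def insert(self, partition_key, composite, value):
        row = self._rows[partition_key]
        names = [name for name, _ in row]
        i = bisect.bisect_left(names, composite)
        if i < len(row) and row[i][0] == composite:
            row[i] = (composite, value)  # same column name: overwrite
        else:
            row.insert(i, (composite, value))

    def slice(self, partition_key, prefix, limit=100):
        """Return up to `limit` columns whose composite name starts with `prefix`."""
        row = self._rows.get(partition_key, [])
        names = [name for name, _ in row]
        start = bisect.bisect_left(names, prefix)
        out = []
        for name, value in row[start:]:
            if name[:len(prefix)] != prefix or len(out) >= limit:
                break
            out.append((name, value))
        return out


# Hypothetical query: "paths seen for a domain on a given day, in time order".
paths_by_domain = WideRowTable()
paths_by_domain.insert("example.com", ("2013-06-11", "09:15:02"), "/login")
paths_by_domain.insert("example.com", ("2013-06-11", "08:01:40"), "/")
paths_by_domain.insert("example.com", ("2013-06-12", "10:00:00"), "/about")

june_11 = paths_by_domain.slice("example.com", ("2013-06-11",))
```

Because columns are kept sorted at write time, the read is a contiguous range with no filtering or sorting afterwards, which is the trade the slide describes: spend writes and storage to make the known query cheap.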
  20. 3. Think Differently Regarding Reads
      • Do you really need all that data at once?
      • mysql> SELECT * FROM mysupercooltable WHERE foo = 'bar';
        – Slow, but eventually will work
      • cqlsh> SELECT * FROM myreallybigcf WHERE foo = 'bar';
        – Won't work. Expect RPC timeout exceptions on reads, generally after ~10,000 rows, even with paging
      • Our solutions:
        – ElasticSearch
        – Hadoop/Pig
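
The slides do not say how ElasticSearch is wired in, so the following is only one common pattern, sketched under that assumption: an external index answers the `foo = 'bar'` question with a bounded list of row keys, and the main store is then read by key only. `RowStore` and `FieldIndex` are in-memory stand-ins with hypothetical names, not real client classes.

```python
from collections import defaultdict
from itertools import islice


class RowStore:
    """Stand-in for a C* column family: cheap only when read by row key."""

    def __init__(self):
        self._rows = {}

    def put(self, key, row):
        self._rows[key] = dict(row)

    def get(self, key):
        return self._rows.get(key)


class FieldIndex:
    """Stand-in for an external search index: (field, value) -> row keys."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, key, row):
        # Note: this toy never removes stale postings when a row is updated.
        for field, value in row.items():
            self._postings[(field, value)].add(key)

    def keys_for(self, field, value):
        return sorted(self._postings.get((field, value), ()))


def write(store, index, key, row):
    """Write the row and index it in the same step."""
    store.put(key, row)
    index.add(key, row)


def find(store, index, field, value, limit=100):
    """Resolve matches via the index, then fetch at most `limit` rows by key."""
    keys = islice(index.keys_for(field, value), limit)
    return [store.get(key) for key in keys]


store, index = RowStore(), FieldIndex()
write(store, index, "k1", {"foo": "bar", "size": 1})
write(store, index, "k2", {"foo": "baz", "size": 2})
write(store, index, "k3", {"foo": "bar", "size": 3})

matches = find(store, index, "foo", "bar")
```

The explicit `limit` is the point: the caller states how much data it needs instead of asking for every matching row at once.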
  21. 4. Syncing and Migrating Data
      • Sync and migration scripts – take them more seriously than production code
      • Design sync to be continuous, with both systems running in parallel during migration
      • Prioritize the sync
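
A continuous sync is easier to trust when each pass is incremental and safe to replay. The sketch below assumes the legacy table carries a monotonically increasing `version` column (an assumption, not something the slides state); SQLite stands in for MySQL and a plain dict stands in for the new store. The table, column, and function names are hypothetical.

```python
import sqlite3


def sync_once(legacy, target, watermark):
    """Copy rows with version > watermark into target; return the new watermark."""
    rows = legacy.execute(
        "SELECT id, payload, version FROM items WHERE version > ? ORDER BY version",
        (watermark,),
    ).fetchall()
    for row_id, payload, version in rows:
        target[row_id] = payload  # upsert by key, so replaying a pass is harmless
        watermark = max(watermark, version)
    return watermark


# Legacy side: an in-memory SQLite database standing in for MySQL.
legacy = sqlite3.connect(":memory:")
legacy.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT, version INTEGER)")
legacy.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("a", "v1", 1), ("b", "v1", 2)],
)

target = {}  # new side: a dict standing in for the Cassandra-backed store
watermark = sync_once(legacy, target, 0)

# The legacy system keeps taking writes while both systems run in parallel.
legacy.execute("UPDATE items SET payload = 'v2', version = 3 WHERE id = 'a'")
watermark = sync_once(legacy, target, watermark)
```

Running `sync_once` again with the same watermark copies nothing, and rerunning it from an older watermark only rewrites rows with the same values, so the job can be restarted after a failure without special handling.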
  22. 5. Don't use C* as a Queue
      • Cassandra anti-patterns: Queues and queue-like datasets [2] – Aleksey Yeschenko
      • Tombstones + read performance
      • Our solution:
        – Kafka (multiple publisher, multiple consumer durable queue)
      [2] http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
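
The “tombstones + read performance” bullet can be illustrated with a toy model. This is a simplified simulation, not Cassandra code: it assumes a queue kept in one wide row where a delete leaves a marker that reads must step over until compaction purges it. `TombstoneQueue` is a hypothetical name.

```python
TOMBSTONE = object()  # marker left behind by a delete


class TombstoneQueue:
    """Toy model of a queue stored in a single wide row."""

    def __init__(self):
        self._cells = []  # cells in column order; deleted cells stay as tombstones

    def push(self, payload):
        self._cells.append(payload)

    def pop(self):
        """Return (payload, cells_scanned) for the oldest live cell.

        The read starts at the head of the row and steps over every
        tombstone that has not been purged yet.
        """
        for scanned, cell in enumerate(self._cells, start=1):
            if cell is not TOMBSTONE:
                self._cells[scanned - 1] = TOMBSTONE  # a delete is another write
                return cell, scanned
        return None, len(self._cells)


queue = TombstoneQueue()
for n in range(1000):
    queue.push(f"job-{n}")

scans = [queue.pop()[1] for _ in range(1000)]
print(scans[0], scans[-1], sum(scans))
```

In this model each dequeue costs more than the one before it, because the head of the row fills with deleted cells. A log-structured queue such as Kafka avoids this by advancing a consumer offset instead of deleting from the head.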
  23. 6. Estimate Capacity
      • Don't forget the Java heap (8GB Max)
      • Plan capacity – today and future
      • Stress Tool – profile node and multiply
      • MySQL hardware != Cassandra hardware
      • New bottlenecks thanks to C* being so awesome?
      • I/O still an important concern with C*
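
“Profile node and multiply” comes down to simple arithmetic. The sketch below uses the 24 nodes and 600GB per-node target from the cluster slide; the replication factor of 3 is an assumption made only for the example, since the slides do not state it, and the function names are illustrative.

```python
import math


def cluster_load_ceiling_gb(nodes, target_load_gb_per_node):
    """Total on-disk data the cluster can hold at the per-node target."""
    return nodes * target_load_gb_per_node


def nodes_needed(unique_data_gb, replication_factor, target_load_gb_per_node):
    """Nodes required to hold every replica without exceeding the per-node target."""
    replicated_gb = unique_data_gb * replication_factor
    return math.ceil(replicated_gb / target_load_gb_per_node)


# Figures from the cluster slide: 24 nodes, 600GB target max load per node.
ceiling_gb = cluster_load_ceiling_gb(24, 600)

# ASSUMPTION: replication factor 3 in total; not stated in the slides.
assumed_rf = 3
unique_gb_at_ceiling = ceiling_gb / assumed_rf

print(f"cluster ceiling: {ceiling_gb} GB on disk")
print(f"unique data at RF={assumed_rf}: {unique_gb_at_ceiling:.0f} GB")
print(f"nodes for 6,000 GB unique data: {nodes_needed(6000, assumed_rf, 600)}")
```

Running the same calculation against projected growth gives the “today and future” numbers; disk is only one axis, and heap and I/O limits from the same slide need their own estimates.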
  24. 7. Automate, Automate, Automate
      • Love your inner Ops self. Distributed systems move complexity to operations.
      • Puppet or something similar (really)
      • Learn CCM earlier rather than later
        – www.github.com/pcmanus/ccm
  25. 8. Some Maintenance Required
      • Repairs & Cleanup ops
        – automate and run frequently
      • Rolling restart, meet rolling repair
      • Learn jconsole
      • Solution:
        – Jolokia (JMX via HTTP)
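
Jolokia exposes JMX attributes as JSON over HTTP, which makes them easy to script. The sketch below builds a read URL for the standard JVM `java.lang:type=Memory` MBean and parses a response of the shape Jolokia returns. The base URL and port are assumptions about a local agent, the sample response values are made up for illustration, and no network call is made.

```python
import json
from urllib.parse import quote


def jolokia_read_url(base_url, mbean, attribute):
    """Build a Jolokia GET URL of the form <base>/read/<mbean>/<attribute>."""
    return f"{base_url.rstrip('/')}/read/{quote(mbean, safe=':=,')}/{quote(attribute)}"


def heap_used_fraction(response_text):
    """Return used/max heap from a Jolokia read of HeapMemoryUsage."""
    body = json.loads(response_text)
    if body.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {body.get('error')}")
    usage = body["value"]
    return usage["used"] / usage["max"]


# ASSUMPTION: a Jolokia agent reachable at this address on the node.
url = jolokia_read_url("http://localhost:8778/jolokia", "java.lang:type=Memory", "HeapMemoryUsage")

# Made-up sample response: 4 GiB used of an 8 GiB heap.
sample_response = json.dumps({
    "request": {"mbean": "java.lang:type=Memory", "attribute": "HeapMemoryUsage", "type": "read"},
    "value": {"init": 8589934592, "committed": 8589934592, "max": 8589934592, "used": 4294967296},
    "timestamp": 1371000000,
    "status": 200,
})

print(url)
print(f"heap in use: {heap_used_fraction(sample_response):.0%}")
```

Fetching `url` with any HTTP client and passing the body to `heap_used_fraction` gives a check that can run from cron or a monitoring agent without attaching jconsole.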
  26. Where is Barracuda Today?
      • 2 years in production with Cassandra
      • Definitely the right choice for our persistence tier
      • 2 product lines on C* based system and another major product in beta
      • Achieved “real-time” response
  27. 2.0 and Beyond
      • Thrift -> CQL
      • CQL helps the MySQL to C* migration
        – Easier to comprehend / grasp
      • Everyone understands SELECT * FROM cf WHERE key = 'foo';
      • CAS and other 2.0 features make C* an even better replacement option for MySQL
  28. C* Community
      • Supercalifragilisticexpialidocious community!
      • Riak, HBase, Oracle are other options. How is their dev community?
      • Great client support. Great people. Great motivated developers.
      • IRC: #cassandra on freenode
      • Mailing List: user@cassandra.apache.org
