Cassandra @ Sony: The good, the bad, and the ugly part 1

This talk covers scaling Cassandra to a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.

Published in: Technology

  1. Cassandra @ Sony: The good, the bad, and the ugly. Isaias Formacio-Serna, Engineering Manager
  2. Cassandra @ Sony • The San Francisco Office • What we have built so far • Challenges and Solutions
  3. The San Francisco Office
  4. PlayStation Store
  5. What’s New
  6. Live Details
  7. Friend Finder
  8. Communities
  9. Live from PlayStation
  10. Verified Accounts
  11. Social Network and Apps • Spotify • Facebook • Twitter • YouTube • Ustream • Nico nico • …
  12. The San Francisco Office • The majority of our services use Cassandra • We started with 0 customers; currently 60 million are active across PS3 and PS4 • Growth rate over Christmas was 20% on PS4! • All of our clusters have a double-digit number of nodes
  13. Some Stats • Dozens of Gb per second in data transfer • 100s of TB of raw data • Millions of reads and writes per second • Complex functionality on APIs
  14. What’s New
  15. Activity Feed • Started with 3 developers • 60+ Column Families • Thrift / Astyanax • What’s New • Players Met • Recent Activities • Profile • Live Details • Title news feed • In game posts
  16. Activity Feed. Challenges • Data Distribution • Volatile Data • Performance • Real time privacy • Data Retention • Unnecessary reads. Solutions • Optimize for data size transfer • Avoid tombstone hell, adjust gc_period and compaction threshold • Test Compaction Strategy, consistency level, etc. • Optimize for reads • Design with TTL in mind • Avoid denormalization
  17. Why aggregate? • A single read for any user returns all of that user's stories • Condensed stories • Paging + Real time privacy = Blocks
  18. Vnodes • Very unstable when we launched • Flapping when adding new nodes • Easy to manage • Our current strategy • Stabilized over time
  19. Communities
  20. Communities • 4 developers • 20+ Tables • CQL / DataStax • Community Wall • Now Playing • Community Members
  21. Communities. Challenges • IN clause & Astyanax • Small dataset could kill the cluster • Volatile data • Multi-level reads • Counters. Solutions (in the same order) • Use DataStax • Use something else to store small datasets • Adjust gc_period • Don’t do that! • Use them when they can be inaccurate
  22. Cassandra + Cache = $$$ • Communities was the first strong user of Redis • Most features offload significant work from Cassandra to ehCache and Redis • Activity caches stories on Redis
  23. Cassandra Challenges • Data changes often • Small datasets • Transactions (e.g. Counters) • Relational Data
  24. Q&A
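
Slide 17 compresses the activity-feed design into one line: aggregate stories so that a single read returns a user's feed, and serve it in blocks so that paging and real-time privacy can coexist. The Python sketch below is one way to picture that, not the implementation used at Sony. The `Story` fields, the `can_view` check, the block size, and the use of a plain list in place of an aggregated Cassandra row are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Story:
    story_id: int
    author: str
    text: str


def read_block(feed: List[Story], start: int, block_size: int) -> List[Story]:
    """Hypothetical stand-in for one storage read of a contiguous block."""
    return feed[start:start + block_size]


def visible_page(
    feed: List[Story],
    viewer: str,
    can_view: Callable[[str, Story], bool],
    cursor: int = 0,
    page_size: int = 2,
    block_size: int = 3,
) -> Tuple[List[Story], Optional[int]]:
    """Return up to page_size stories the viewer may see, plus the next cursor.

    Privacy is checked at read time, so a rule that changed a moment ago
    already applies. Hidden stories do not count toward the page; more
    blocks are read until the page is full or the feed ends.
    """
    page: List[Story] = []
    position = cursor
    while len(page) < page_size and position < len(feed):
        for story in read_block(feed, position, block_size):
            position += 1  # this story is consumed whether shown or hidden
            if can_view(viewer, story):
                page.append(story)
                if len(page) == page_size:
                    break
    next_cursor = position if position < len(feed) else None
    return page, next_cursor


# Illustrative data: "viewer1" has blocked the author "bo".
feed = [
    Story(0, "ana", "earned a trophy"),
    Story(1, "bo", "started a broadcast"),
    Story(2, "ana", "joined a community"),
    Story(3, "cy", "posted a screenshot"),
    Story(4, "bo", "bought a game"),
]
blocked = {"viewer1": {"bo"}}


def can_view(viewer: str, story: Story) -> bool:
    return story.author not in blocked.get(viewer, set())


first_page, cursor = visible_page(feed, "viewer1", can_view)
```

The cursor points at a position in the stored feed rather than at a count of visible stories, so a privacy change between two page requests cannot shift the stories the viewer has already paged past.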

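
Slides 21 and 23 both single out counters, with the advice to use them only where the value can be inaccurate. One well-known reason is that a counter increment is not idempotent: if the write is applied but its acknowledgement is lost, a client retry applies it a second time. The toy simulation below illustrates that effect. `FlakyCounterStore` is a hypothetical in-memory stand-in rather than a Cassandra client, and the lost acknowledgement is scripted instead of random so the outcome is repeatable.

```python
class WriteTimeout(Exception):
    """Raised when the (simulated) acknowledgement never reaches the client."""


class FlakyCounterStore:
    """Hypothetical counter store that applies every write but loses some acks."""

    def __init__(self, lose_ack_on_calls):
        self.values = {}
        self.calls = 0
        self.lose_ack_on_calls = set(lose_ack_on_calls)

    def increment(self, key, delta=1):
        self.calls += 1
        # The write is applied first...
        self.values[key] = self.values.get(key, 0) + delta
        # ...and only then does the acknowledgement get lost.
        if self.calls in self.lose_ack_on_calls:
            raise WriteTimeout(f"no ack for call {self.calls}")


def increment_with_retry(store, key, retries=1):
    """Naive client: retry on timeout, unaware the write already landed."""
    for attempt in range(retries + 1):
        try:
            store.increment(key)
            return
        except WriteTimeout:
            if attempt == retries:
                raise


store = FlakyCounterStore(lose_ack_on_calls={2})
for _ in range(3):  # three logical increments
    increment_with_retry(store, "community:42:members")

print(store.values["community:42:members"])  # prints 4, although 3 was intended
```

Whether that drift is acceptable depends on the feature: an approximate member count can tolerate it, while anything resembling a balance or a transaction cannot.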
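
Slide 22 describes offloading work from Cassandra to ehCache and Redis, but the slides do not say how those caches are filled or invalidated. The sketch below assumes the common cache-aside pattern with a time-to-live: read the cache first, fall back to the store on a miss, and keep the result for a fixed period. Plain Python objects stand in for both Redis and Cassandra, and every name here is hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Read-through helper: serve from cache, load from the store on a miss."""

    def __init__(
        self,
        load_from_store: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_store      # stand-in for a Cassandra read
        self._ttl = ttl_seconds
        self._clock = clock               # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)
        self.store_reads = 0              # how many reads reached the store

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]               # fresh cache hit, store untouched
        self.store_reads += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example right after writing to the store."""
        self._entries.pop(key, None)


# Illustrative use with a dictionary as the backing store.
backing = {"wall:1": ["hello"]}
cache = CacheAside(lambda key: list(backing[key]), ttl_seconds=30)
cache.get("wall:1")
cache.get("wall:1")  # second call is served from the cache
```

A short TTL bounds how stale a cached story can become; for data with real-time privacy requirements an explicit `invalidate` on write is the safer complement.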