Relational Scaling and the Temple of Gloom (from Cassandra Summit 2015)

You're building the next big thing. It will attract hundreds of thousands of users and make so much cash, Gordon Gekko would blush. You just know that if you build it, they will come. But what happens when all those users do show up? Will you spend your time adding the new features they're clamoring for, or will you be scrambling to make sure your relational database doesn't die hard? In this talk, we'll take a look at some of the risky business we undertake to scale our relational databases and the problems we run into. Then we'll talk about how Cassandra is different and some of the knobs you control to turn things up to 11. If you're new to Cassandra and are looking for an introduction, come from a relational database background, or you just want to see how many 80s movie references we can cover in 40 minutes, then don your favorite fedora and come for an excellent adventure.

Published in: Technology

  1. 1. Luke Tillman, Technical Evangelist at DataStax (@LukeTillman)
  2. 2. Who are you?! • Evangelist with a focus on Developers • Long-time Developer on RDBMS (lots of .NET)
  3. 3. The Good ol' Relational Database • First proposed in 1970 • © 2015. All Rights Reserved.
  4. 4. The Relational Database Makes us Feel Good • SQL is ubiquitous and allows flexible querying • Data modeling is well understood (3NF or higher) • ACID guarantees make us feel good
  5. 5. Building and Scaling our Applications
  7. 7. Building and Scaling our Applications • I'm getting too old for this…
  8. 8. Scaling Up
  9. 9. All these JOINs are Killing Us
         SELECT array_agg(players), player_teams
         FROM (
           SELECT DISTINCT t1.t1player AS players, t1.player_teams
           FROM (
             SELECT p.playerid AS t1id,
                    concat(p.playerid, ':', p.playername, ' ') AS t1player,
                    array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
             FROM player p
             LEFT JOIN plays pl ON p.playerid = pl.playerid
             GROUP BY p.playerid, p.playername
           ) t1
           INNER JOIN (
             SELECT p.playerid AS t2id,
                    array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
             FROM player p
             LEFT JOIN plays pl ON p.playerid = pl.playerid
             GROUP BY p.playerid, p.playername
           ) t2 ON t1.player_teams = t2.player_teams AND t1.t1id <> t2.t2id
         ) innerQuery
  10. 10. All these JOINs are Killing Us • SELECT * FROM denormalized_view • Let's Denormalize!
  11. 11. All these JOINs are Killing Us • But I thought data modeling was 3NF or higher?! • There can be only one!
  12. 12. Read Replication • [Diagram: Client, Users Data, Primary, Replica 1, Replica 2, Write Requests, Read Requests, Replication Lag] • Consistent results? Nope, now eventually consistent • Replication speed is limited by the speed of light
  13. 13. Sharding • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z] • Queries that aren't on the shard key require scatter-gather • Resharding can be a painful, manual process
  14. 14. Replication for Availability • [Diagram: Client, Users Data, Failover Process, Monitor, Failover] • Failover takes time. How long are you offline while it's happening?
  15. 15. And while you're offline...
  16. 16. Putting it All Together • [Diagram: Client, Users Data]
  17. 17. Putting it All Together • [Diagram: Client, Users Data, shards A-F, G-M, N-T, U-Z] • Sharding
  18. 18. Putting it All Together • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z] • Sharding
  19. 19. Putting it All Together • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z] • Sharding and Replication (and probably Denormalization)
  20. 20. Putting it All Together • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z, Failover Process] • Sharding and Replication (and probably Denormalization)
  21. 21. Putting it All Together • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z, Failover Process, Monitor, Failover] • Sharding and Replication (and probably Denormalization)
  23. 23. Putting it All Together • [Diagram: Client, Router, Users Data, shards A-F, G-M, N-T, U-Z, Failover Process, Monitor, Failover, Replication Lag] • Sharding and Replication (and probably Denormalization)
  27. 27. What is Cassandra? A linearly scaling and fault tolerant distributed database
  28. 28. What is Cassandra? A linearly scaling and fault tolerant distributed database • Data spread over many nodes • All nodes participate in a cluster • All nodes are equal • No SPOF (shared nothing) • Run on commodity hardware
  29. 29. What is Cassandra? A linearly scaling and fault tolerant distributed database • Have more data? Add more nodes. • Need more throughput? Add more nodes.
  30. 30. What is Cassandra? A linearly scaling and fault tolerant distributed database • Nodes down != Database Down • Datacenter down != Database Down • No middle of the night phone calls
  31. 31. Multi Datacenter with Cassandra • [Diagram: Client, America, Zamunda]
  32. 32. Fault Tolerance in Cassandra • You have the power to control fault tolerance in Cassandra
  33. 33. Replication Factor • How many copies of the data should exist? • [Diagram: Client, Write "Beetlejuice", RF=3, three copies of "Beetlejuice"]
  34. 34. Consistency Level • How many replicas do we need to hear from before we acknowledge? • [Diagram: Client, CL=ONE, Copy #1, Copy #2, Copy #3]
  35. 35. Consistency Level • How many replicas do we need to hear from before we acknowledge? • [Diagram: Client, CL=QUORUM, Copy #1, Copy #2, Copy #3]
  36. 36. Consistency Levels and Speed • Use a lower consistency level like ONE to get faster reads and writes
  37. 37. Consistency Levels and Eventual Consistency • Use a higher consistency level like QUORUM if you don’t want to be surprised by data from the past (stale data)
  38. 38. Before you get too excited...
  39. 39. Cassandra is not... • A Data Ocean, Lake, or Pond • An In-Memory Database • A Queue • A magical database luck dragon that will solve all your database use cases and problems
  40. 40. How bad of an idea? • Actually a 90's movie • Actually a 70's movie • Why Arnold?! Why?
  41. 41. Cassandra is good when... • Uptime is a top priority • You have unpredictable or high scaling requirements • The workload is transactional (i.e. OLTP not OLAP) • You are willing to put the time and effort into understanding how it works and how to use it
  42. 42. Movie References (in order of appearance) • Leap of Faith (1992) • Patton (1970) • The Aristocats (1970) • When Harry Met Sally (1989) • Beverly Hills Cop (1984) • Lethal Weapon (1987) • Big (1988) • Trading Places (1983) • Highlander (1986)
  43. 43. Spaceballs (1987) • Rain Man (1988) • Ghostbusters (1984) • Gremlins (1984) • Star Trek II: The Wrath of Khan (1982) • Star Wars Episode VI: Return of the Jedi (1983) • Weekend at Bernie's (1989) • Coming to America (1988) • Masters of the Universe (1987) • Beetlejuice (1988) • The Goonies (1985) • Top Gun (1986)
  44. 44. Back to the Future (1985) • Footloose (1984) • The NeverEnding Story (1984) • Batman and Robin (1997) • Star Wars (1977) • Twins (1988) • The Karate Kid (1984) • Find me on Twitter: @LukeTillman
  45. 45. Thank You
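
The Replication Factor and Consistency Level slides (33 to 37) describe two knobs: how many copies exist, and how many replicas must answer before a request is acknowledged. A minimal Python sketch of that arithmetic follows. The function names are hypothetical and are not part of any Cassandra driver, and the sketch assumes a single datacenter so that RF is the total number of replicas. It uses the standard majority rule (QUORUM = floor(RF / 2) + 1) and the overlap rule that a read is guaranteed to touch at least one replica holding the latest acknowledged write when read replicas + write replicas > RF.

```python
def quorum(replication_factor: int) -> int:
    """Majority of the replicas: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """How many replicas must respond before the request is acknowledged."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[consistency_level.upper()]


def reads_see_latest_write(read_cl: str, write_cl: str, replication_factor: int) -> bool:
    """True when the read set and the write set must share at least one replica."""
    r = replicas_required(read_cl, replication_factor)
    w = replicas_required(write_cl, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # RF=3, as on the Replication Factor slide
    for cl in ("ONE", "QUORUM", "ALL"):
        print(cl, "needs", replicas_required(cl, rf), "of", rf, "replicas")
    print("ONE/ONE overlap:", reads_see_latest_write("ONE", "ONE", rf))
    print("QUORUM/QUORUM overlap:", reads_see_latest_write("QUORUM", "QUORUM", rf))
```

With RF=3, reading and writing at ONE touches 1 + 1 = 2 replicas, which is not more than 3, so a read may return stale data. Reading and writing at QUORUM touches 2 + 2 = 4, so the two sets always overlap. That is the speed versus staleness trade-off the slides point at.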

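The Sharding slide (13) notes that queries that aren't on the shard key require scatter-gather. The sketch below illustrates why, using the A-F, G-M, N-T, U-Z ranges from the slide. Everything else is an assumption made for illustration: the `ShardRouter` class, the `username` shard key, the `email` field, and the in-memory dictionaries that stand in for database servers.

```python
from typing import Dict, List, Optional

# Shard ranges taken from the slide; each range is inclusive on both ends.
SHARD_RANGES = [("A", "F"), ("G", "M"), ("N", "T"), ("U", "Z")]


class ShardRouter:
    """Hypothetical router that shards users by the first letter of username."""

    def __init__(self) -> None:
        # One in-memory dict per shard stands in for a database server.
        self.shards: List[Dict[str, dict]] = [{} for _ in SHARD_RANGES]
        self.shards_touched = 0  # counts how many shards queries have visited

    def shard_index(self, username: str) -> int:
        first = username[:1].upper()
        for i, (low, high) in enumerate(SHARD_RANGES):
            if low <= first <= high:
                return i
        raise ValueError(f"no shard covers username {username!r}")

    def insert(self, user: dict) -> None:
        self.shards[self.shard_index(user["username"])][user["username"]] = user

    def find_by_username(self, username: str) -> Optional[dict]:
        # Query on the shard key: the router can pick exactly one shard.
        self.shards_touched += 1
        return self.shards[self.shard_index(username)].get(username)

    def find_by_email(self, email: str) -> List[dict]:
        # Query not on the shard key: every shard must be asked (scatter),
        # and the partial results merged (gather).
        matches: List[dict] = []
        for shard in self.shards:
            self.shards_touched += 1
            matches.extend(u for u in shard.values() if u["email"] == email)
        return matches


if __name__ == "__main__":
    router = ShardRouter()
    router.insert({"username": "marty", "email": "marty@example.com"})
    router.insert({"username": "biff", "email": "biff@example.com"})
    print(router.find_by_username("marty"), router.shards_touched)
    print(router.find_by_email("biff@example.com"), router.shards_touched)
```

A lookup by username visits one shard, while the lookup by email visits all four, so its cost grows with the number of shards rather than with the size of the result.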