Scaling Twitter with Cassandra

Speaker Notes

  • storage team; personal background
  • began working on this problem last June; complexity had grown unmanageable; multiple internal customers; error domain grows as data size and complexity grow
  • every master db is a SPOF (failover is hard to pull off without strong coordination); SPOFs lead to expensive hardware; app-managed hosts is tight coupling
  • our application is already tolerant of eventual consistency (actually more tolerant...); in addition to scale, we want more flexibility than relational data models give us
  • keyspace: database; CF: table; column: attribute; SC: collection of attributes (see the data-model sketch after these notes)
  • [insert diagrams of ring + tokens] nodes are arranged on a ring; keys are mapped to the ring and written to the next N machines; partitioners map keys to the ring (see the ring sketch after these notes)
  • [flow chart of how updates happen]
  • if OPP, rows are ordered; columns are ordered [diagram of range and slice]
  • insert to mysql; insert into memcache; replicate to slave; update mysql; insert into memcache fails; replication to slave fails (see the cache/replication sketch after these notes)
  • Launching is shifting from roll back to roll forward
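
The data-model note above maps Cassandra terms onto relational ones: keyspace ≈ database, column family ≈ table, column ≈ attribute, super column ≈ collection of attributes. A minimal sketch of that mapping using plain Python dictionaries; the keyspace, column families, row keys, and column names are invented for illustration, not taken from Twitter's actual schema.

```python
# Nested dicts standing in for the data model (circa Cassandra 0.6):
# keyspace -> column family -> row key -> column name -> value.
# All names below are hypothetical.
twitter_keyspace = {
    "Statuses": {                         # column family (~ table)
        "status:12345": {                 # row key
            "user_id": "40981798",        # column (~ attribute)
            "text": "trying out cassandra",
            "created_at": "1271282460",
        },
    },
    "UserTimelines": {                    # column family using super columns
        "user:40981798": {
            "2010-04": {                  # super column (~ collection of attributes)
                "12344": "",              # one subcolumn per status id in the bucket
                "12345": "",
            },
        },
    },
}

# "Which statuses did this user post in April 2010?"
print(list(twitter_keyspace["UserTimelines"]["user:40981798"]["2010-04"]))
```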
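
The ring note (nodes arranged on a ring, keys written to the next N machines, partitioners mapping keys onto the ring) is consistent hashing. A rough sketch assuming a random, MD5-based partitioner and made-up node tokens; it uses the simple walk-the-ring placement and ignores rack or datacenter awareness.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 128            # md5 output space used by a random partitioner
# Hypothetical node tokens; each node owns the arc ending at its token.
TOKENS = sorted([
    (int(RING_SIZE * 0.10), "node-a"),
    (int(RING_SIZE * 0.35), "node-b"),
    (int(RING_SIZE * 0.60), "node-c"),
    (int(RING_SIZE * 0.85), "node-d"),
])
TOKEN_VALUES = [token for token, _ in TOKENS]

def token_for(key: str) -> int:
    """Map a row key onto the ring the way a random partitioner would."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def replicas(key: str, n: int = 3) -> list:
    """First node whose token >= the key's token, then the next n-1 clockwise."""
    start = bisect_left(TOKEN_VALUES, token_for(key)) % len(TOKENS)
    return [TOKENS[(start + i) % len(TOKENS)][1] for i in range(n)]

print(replicas("status:12345"))   # three adjacent nodes, clockwise from the key
```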
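
The "insert to mysql ... replication to slave fails" note is the potential-inconsistency timeline behind slides 23-25: with write-through caching and master-slave replication, a failed cache write or failed replication quietly leaves readers seeing different values depending on where they read. A toy version with every store faked as a dict.

```python
# Toy stand-ins for the legacy stack; names are made up for illustration.
master, slave, memcache = {}, {}, {}

def write(key, value, cache_ok=True, replication_ok=True):
    master[key] = value            # 1. insert/update on the MySQL master
    if cache_ok:
        memcache[key] = value      # 2. write-through to memcached
    if replication_ok:
        slave[key] = value         # 3. replicate to the slave

write("status:1", "v1")                                        # everything agrees
write("status:1", "v2", cache_ok=False, replication_ok=False)  # both steps fail

# Readers now get different answers depending on where they look:
print(master.get("status:1"))    # 'v2'
print(memcache.get("status:1"))  # 'v1'  (stale cache)
print(slave.get("status:1"))     # 'v1'  (stale replica)
```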

Scaling Twitter with Cassandra Presentation Transcript

  • 1. Scaling Twitter with Cassandra Ryan King Storage Team
  • 2. bit.ly/chirpcassandra ryan@twitter.com @rk
  • 3. Legacy • vertically & horizontally partitioned mysql • memcached (rows, indexes and fragments) • application managed
  • 4. Legacy Drawbacks • many single-points-of-failure • hardware-intensive • manpower-intensive • tight coupling
  • 5. Apache Cassandra • Apache top level project • originally developed at Facebook • Rackspace, Digg, SimpleGeo, Twitter, etc.
  • 6. Why Cassandra? • highly available • consistent, eventually • decentralized • fault tolerant • elastic • flexible schema • high write throughput
  • 7. What is Cassandra? • distributed database • Google's BigTable's data model • Amazon's Dynamo's infrastructure
  • 8. Cassandra Data Model • keyspaces • column families • columns • super columns
  • 9. Cassandra Infrastructure • partitioners • storage • querying
  • 10. Partitioners • order-preserving • random • custom
  • 11. Storage • commit log • memtables • sstables • compaction • bloom filters • indexes • key cache • row cache
  • 12. Querying • get • multiget • range • slice
  • 13. Consistency
  • 14. Consistency • N, R, W
  • 15. Consistency • N, R, W • N = number of replicas
  • 16. Consistency • N, R, W • N = number of replicas • R = read replicas
  • 17. Consistency • N, R, W • N = number of replicas • R = read replicas • W = write replicas
  • 18. Consistency • N, R, W • N = number of replicas • R = read replicas • W = write replicas • send request, wait for specified number
  • 19. Consistency • N, R, W • N = number of replicas • R = read replicas • W = write replicas • send request, wait for specified number • wait for others in background and perform read-repair
  • 20. Consistency Levels • ZERO • ONE • QUORUM • ALL
  • 21. Strong Consistency • If W + R > N, you will have consistency • W=1, R=N • W=N, R=1 • W=Q, R=Q where Q = N / 2 + 1
  • 22. Eventuality • Hinted Handoff • Read Repair • Proactive Repair (Merkle trees)
  • 23. Potential Consistency
  • 24. Potential Consistency • causes • write-through caching • master-slave replication failures
  • 25. Example
  • 26. Read Repair • send read to all replicas • if they differ, resolve conflicts and update (in background)
  • 27. Hinted Handoff • A wants to write to B • B is down • A tells C, "when B is back, send them this update"
  • 28. Proactive Repair • use Merkle trees to find inconsistencies • resolve conflicts • send repaired data • triggered manually
  • 29. Parallel Deployment
  • 30. How are we moving? • parallel deployments • incremental traffic shifting
  • 31. Parallel Deployment 1. build new implementation 2. integrate it alongside existing 3. ...with switches to dynamically move/mirror traffic 4. turn up traffic 5. break something 6. fix it 7. GOTO 4
  • 32. ?
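
Slide 11 lists the storage pieces. The write path they form, as generally described for Cassandra, is: append to the commit log for durability, apply the mutation to an in-memory memtable, and flush full memtables to immutable sstables that compaction later merges. A rough sketch with an invented class name, ignoring compaction, bloom filters, indexes, and the caches.

```python
import json

class ToyColumnFamilyStore:
    """Illustrative only: commit log -> memtable -> sstable flush."""

    def __init__(self, commit_log_path, memtable_limit=3):
        self.commit_log = open(commit_log_path, "a")
        self.memtable = {}               # in-memory, sorted when flushed
        self.sstables = []               # immutable, written oldest-first
        self.memtable_limit = memtable_limit

    def write(self, row_key, columns):
        # 1. Durability first: append the mutation to the commit log.
        self.commit_log.write(json.dumps([row_key, columns]) + "\n")
        self.commit_log.flush()
        # 2. Apply it to the memtable (newer columns overwrite older ones).
        self.memtable.setdefault(row_key, {}).update(columns)
        # 3. When the memtable is "full", flush it to a new sstable.
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def read(self, row_key):
        # Merge sstables oldest-to-newest, then the memtable,
        # so the most recent value for each column wins.
        merged = {}
        for sstable in self.sstables:
            merged.update(sstable.get(row_key, {}))
        merged.update(self.memtable.get(row_key, {}))
        return merged

store = ToyColumnFamilyStore("commitlog.jsonl")
store.write("status:1", {"text": "hello"})
store.write("status:1", {"text": "hello, edited"})
print(store.read("status:1"))   # {'text': 'hello, edited'}
```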
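
Slide 12's four query shapes, read together with the speaker note that rows are ordered only under the order-preserving partitioner while columns are always ordered, amount to: one row (get), several rows by exact key (multiget), a contiguous run of row keys (range), and a contiguous run of columns within one row (slice). A dictionary-based sketch of those semantics; this is not a real client API, and the rows are made up.

```python
# rows: row_key -> {column_name: value}; keys and columns are kept sorted to
# mimic an order-preserving partitioner and Cassandra's ordered columns.
rows = {
    "user:ev":   {"bio": "...", "name": "Evan"},
    "user:jack": {"bio": "...", "name": "Jack"},
    "user:rk":   {"bio": "...", "name": "Ryan", "team": "storage"},
}

def get(key):                        # one row
    return rows[key]

def multiget(keys):                  # several rows, by exact key
    return {k: rows[k] for k in keys if k in rows}

def get_range(start, finish):        # contiguous keys (meaningful only with OPP)
    return {k: v for k, v in sorted(rows.items()) if start <= k <= finish}

def get_slice(key, column_start, column_finish):  # contiguous columns in one row
    return {c: v for c, v in sorted(rows[key].items())
            if column_start <= c <= column_finish}

print(get("user:rk"))                       # the whole row
print(multiget(["user:ev", "user:rk"]))     # both rows, by exact key
print(get_range("user:a", "user:f"))        # only 'user:ev' in this toy data
print(get_slice("user:rk", "bio", "name"))  # the 'bio' and 'name' columns only
```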
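
Slides 14-21 build up the N/R/W model: with N replicas, a write waits for W acknowledgements and a read for R responses, and if W + R > N the read set must overlap the latest write, which is why W=1/R=N, W=N/R=1, and W=R=quorum all give strong consistency. A small checker using the slides' own formula.

```python
def quorum(n: int) -> int:
    """Quorum as on slide 21: Q = N / 2 + 1 (more than half the replicas)."""
    return n // 2 + 1

def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """W + R > N guarantees every read overlaps the latest successful write."""
    return w + r > n

n = 3
print(quorum(n))                                            # 2
print(is_strongly_consistent(n, w=1, r=n))                  # True  (W=1, R=N)
print(is_strongly_consistent(n, w=n, r=1))                  # True  (W=N, R=1)
print(is_strongly_consistent(n, w=quorum(n), r=quorum(n)))  # True  (QUORUM both ways)
print(is_strongly_consistent(n, w=1, r=1))                  # False (eventual only)
```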
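
Slide 26 describes read repair: send the read to all replicas, and if the answers differ, resolve the conflict and update in the background. A sketch where each replica is a dict of key -> (value, timestamp) and the latest timestamp wins; the timestamp rule matches Cassandra's column-timestamp resolution, but the data and shapes here are invented.

```python
# Each replica maps key -> (value, timestamp); contents are made up.
replicas = [
    {"status:1": ("v2", 200)},   # up to date
    {"status:1": ("v1", 100)},   # stale
    {"status:1": ("v2", 200)},
]

def read_with_repair(key):
    responses = [r[key] for r in replicas if key in r]
    # Resolve the conflict: latest timestamp wins.
    winner = max(responses, key=lambda pair: pair[1])
    # Repair: write the winning value back to any replica that disagreed
    # (Cassandra does this asynchronously, after answering the client).
    for r in replicas:
        if r.get(key) != winner:
            r[key] = winner
    return winner[0]

print(read_with_repair("status:1"))  # 'v2', and the stale replica gets fixed
print(replicas[1]["status:1"])       # ('v2', 200)
```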
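
Slide 27's hinted handoff in miniature: A wants to write to B, B is down, so A tells C to hold the update and deliver it when B comes back. The node names are the slide's own A/B/C; the storage and hint bookkeeping are invented for the sketch.

```python
data = {"A": {}, "B": {}, "C": {}}       # per-node storage
up = {"A": True, "B": False, "C": True}  # B is down, as in the slide
hints_on_c = []                          # hints C is holding for down nodes

def write_replica(target, key, value):
    """A (the coordinator) tries to write to one replica."""
    if up[target]:
        data[target][key] = value
    else:
        # Target is down: A tells C, "when B is back, send them this update".
        hints_on_c.append((target, key, value))

def replay_hints():
    """C runs this once it notices a node has come back up."""
    still_pending = []
    for target, key, value in hints_on_c:
        if up[target]:
            data[target][key] = value
        else:
            still_pending.append((target, key, value))
    hints_on_c[:] = still_pending

write_replica("B", "status:1", "v1")   # B is down, so the hint lands on C
up["B"] = True                         # B recovers
replay_hints()
print(data["B"])                       # {'status:1': 'v1'}
```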
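
Slide 28's proactive (manually triggered) repair uses Merkle trees to find inconsistencies without comparing every row: each replica hashes its key ranges into a tree, the trees are compared, and only the differing ranges are re-sent. A compact sketch that, for brevity, assumes a power-of-two number of leaf ranges and compares leaf hashes directly instead of walking the trees top-down as a real repair would.

```python
import hashlib

def h(value: bytes) -> bytes:
    return hashlib.sha1(value).digest()

def merkle_tree(leaves):
    """Build a list of levels, leaf hashes first, root last."""
    level = [h(leaf) for leaf in leaves]
    tree = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        tree.append(level)
    return tree

def differing_leaves(tree_a, tree_b):
    """Indexes of leaf ranges whose hashes disagree between two replicas."""
    return [i for i, (a, b) in enumerate(zip(tree_a[0], tree_b[0])) if a != b]

# Two replicas of the same 4-range slice of the ring; replica B missed an update.
replica_a = [b"v2", b"v1", b"v7", b"v3"]
replica_b = [b"v2", b"v1", b"v6", b"v3"]
ta, tb = merkle_tree(replica_a), merkle_tree(replica_b)

if ta[-1] != tb[-1]:                 # roots differ, so something is out of sync
    print(differing_leaves(ta, tb))  # [2]: only that range needs to be re-sent
```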
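
Slide 31's roll-forward loop hinges on runtime switches that can mirror or shift traffic to the new store without a redeploy (the "roll back to roll forward" speaker note). One way such a switch could look; the class name, knobs, and behaviour are hypothetical, not Twitter's actual mechanism.

```python
import random

class TrafficSwitch:
    """Hypothetical runtime switch for a parallel (dark-launch) rollout."""

    def __init__(self, mirror_fraction=0.0, read_from_new=False):
        self.mirror_fraction = mirror_fraction  # 0.0..1.0, tunable without a deploy
        self.read_from_new = read_from_new      # flip once the new store is trusted

    def write(self, key, value, legacy_store, new_store):
        legacy_store[key] = value               # the old path stays canonical
        if random.random() < self.mirror_fraction:
            try:
                new_store[key] = value          # mirrored write to the new path
            except Exception:
                pass    # mirrored traffic must never break the existing path

    def read(self, key, legacy_store, new_store):
        store = new_store if self.read_from_new else legacy_store
        return store.get(key)

legacy, cassandra = {}, {}
switch = TrafficSwitch(mirror_fraction=1.0)   # fully mirrored for this demo;
switch.write("status:1", "v1", legacy, cassandra)  # step 4 turns this up gradually
print(legacy, cassandra)                      # both stores received the write
```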