Svccg nosql 2011_sri-cassandra

silicon valley cloud computing group, netflix, cassandra talk

Published in: Technology

  1. 1. Cassandra, HowStuffWorks version. SriSatish Ambati, engineer, DataStax, @srisatish
  2. 2. Bigtable, 2006; Dynamo, 2007; OSS, 2008; Incubator, 2009; TLP, 2010
  3. 3. Cassandra in production
       • Digital Reasoning: NLP + entity analytics
       • OpenWave: enterprise messaging
       • OpenX: largest publisher-side ad network in the world
       • Cloudkick: performance data & aggregation
       • SimpleGEO: location-as-API
       • Ooyala: video analytics and business intelligence
       • ngmoco: massively multiplayer game worlds
  4. 4. Furiously fast writes
       • Append-only writes
       • Sequential disk access
       • No locks in critical path
       • Key-based atomicity
     Write path (diagram labels): client issues write; n1; partitioner; commit log; apply to memory; n2; find node; n3
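The write path on this slide can be sketched roughly as follows. This is a toy model, not Cassandra's code: the names `token`, `find_node`, and `Node`, the MD5-based token, and the `flush_threshold` parameter are all assumptions made for illustration. The flush to a sorted run is also an assumption, borrowed from the sorted files shown on the compaction slides.

```python
import hashlib
from bisect import bisect_left


def token(key: str) -> int:
    """Hypothetical partitioner: map a key to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def find_node(key: str, ring):
    """ring is a sorted list of (token, node_name); pick the first node
    whose token is >= the key's token, wrapping around at the end."""
    tokens = [t for t, _ in ring]
    return ring[bisect_left(tokens, token(key)) % len(ring)][1]


class Node:
    """Toy single-node write path: append to the log, then apply to memory."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []   # append-only, stands in for a sequential log file
        self.memtable = {}     # in-memory key -> value
        self.sstables = []     # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. sequential append, no seek
        self.memtable[key] = value            # 2. apply to memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted run and start fresh."""
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log.clear()
```

Nothing in `write` reads from disk or takes a lock, which is the property the slide's bullets point at.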
  5. 5. Tuneable reads
  6. 6. Read Internals @r39132 - #netflixcloud
  7. 7. A feather in the CAP
       • Eventual Consistency
       • R + W > N
           • N is RF
           • T is total nodes
       • ex: rdbms with backup
       • R=1, W=2, N=2, T=2
     Read Performance
       • R=1, 100s of nodes
       • R=1, W=N (consistency)
     Write Performance
       • W=1, R=N
       • Quorum (fast writes!)
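The R + W > N rule on this slide is simple arithmetic: with N replicas (the replication factor, RF), a read that contacts R of them must overlap a write acknowledged by W of them whenever R + W exceeds N. A minimal sketch, with `quorum` and `read_overlaps_write` as hypothetical helper names:

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_overlaps_write(r: int, w: int, n: int) -> bool:
    """True when every read set of size r must share at least one
    replica with every write set of size w, i.e. R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

The slide's "rdbms with backup" example (R=1, W=2, N=2) satisfies the rule, as does quorum on both sides (for N=3, 2 + 2 > 3). R=1, W=1 with N=3 does not, which is the eventually consistent case.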
  8. 8. Client Marshal Arts
       • Roll your own, C
       • Thrift
       • pycassa, phpcassa
       • Ruby, Scala
       • Ready made, Java: Hector, Pelops
     Common Patterns of Doom:
       • Death by a million gets
       • Turn off Nagle
       • Manage your connections
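Two of the patterns above translate directly into client code. The sketch below uses only Python's standard `socket` module to disable Nagle's algorithm, plus a hypothetical `batches` helper that groups keys so a client can issue one multi-key request per group instead of one request per key. Neither function is taken from any Cassandra client library.

```python
import socket
from typing import Iterator, List, Sequence


def open_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm turned off, so small
    requests are sent immediately instead of being coalesced."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def batches(keys: Sequence[str], size: int) -> Iterator[List[str]]:
    """Split keys into groups of at most `size`, one round trip per group."""
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])
```

With a batch size of 100, a million keys cost ten thousand round trips rather than a million, assuming the client library in use offers a multi-key read.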
  9. 9. Adding Nodes
       • New nodes
           • Add themselves to the busiest node
           • And then split its range
       • Busy node starts transmitting to the new node
       • Bootstrap logic initiated from any node, cli, web
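The "split its range" step can be sketched as picking the midpoint of the busiest node's token range. This is an illustration under stated assumptions: `bootstrap_token`, the `load` mapping, and the small `ring_size` are hypothetical, and a node is assumed to own the range from the previous token (exclusive) up to its own token (inclusive).

```python
def bootstrap_token(ring, load, ring_size):
    """Choose a token for a new node by halving the busiest node's range.

    ring      -- sorted list of existing node tokens
    load      -- dict mapping token -> amount of data on that node
    ring_size -- total number of positions on the ring
    """
    busiest = max(ring, key=lambda t: load[t])
    prev = ring[ring.index(busiest) - 1]      # index -1 wraps to the last node
    span = (busiest - prev) % ring_size or ring_size  # single node owns it all
    return (prev + span // 2) % ring_size
```

For a ring of tokens 0, 25, 50, 75 on a 100-position ring where the node at 25 holds the most data, the new node lands at 12 and takes over roughly half of that node's range.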
  10. 10. Cassandra on EC2 cloud
  11. 11. Cassandra on EC2 cloud *Corey Hulen, EC2
  12. 12. inter-node comm
       • Gossip Protocol
           • It’s exponential
           • (epidemic algorithm)
       • Failure Detector
           • Accrual rate phi
       • Anti-Entropy
           • Bringing replicas up to date
       • UDP for control messages
       • TCP for request routing
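The "accrual rate phi" bullet refers to a failure detector that outputs a suspicion level rather than a yes/no answer. A minimal sketch follows, assuming heartbeat gaps are exponentially distributed, so that phi = -log10(P(gap > t)) = t / (mean * ln 10). The class name, the method names, and the threshold of 8 are illustrative assumptions, not a copy of Cassandra's implementation.

```python
import math


class AccrualDetector:
    """Toy phi accrual failure detector over heartbeat arrival times."""

    def __init__(self, threshold: float = 8.0):
        self.threshold = threshold  # assumed conviction level
        self.intervals = []         # observed gaps between heartbeats
        self.last = None            # arrival time of the latest heartbeat

    def heartbeat(self, now: float) -> None:
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now: float) -> float:
        """Suspicion level: grows linearly with silence, scaled by the
        mean heartbeat interval seen so far."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        return (now - self.last) / (mean * math.log(10))

    def is_suspect(self, now: float) -> bool:
        return self.phi(now) > self.threshold
```

With heartbeats arriving once per second, one second of silence gives phi of about 0.43, while twenty seconds of silence pushes phi past 8.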
  13. 13. Compactions
       • Input file 1 (sorted): K1, K2, K3, each with < Serialized data >
       • Input file 2 (sorted): K2, K10, K30, each with < Serialized data >
       • Input file 3 (sorted): K4, K5, K10, each with < Serialized data >
       • Loaded in memory, MERGE SORT
       • Data File (sorted): K1, K2, K3, K4, K5, K10, K30, each with < Serialized data >
       • Index File: K1 Offset, K5 Offset, K30 Offset
       • Bloom Filter
  14. 14. Compactions (same diagram as slide 13, with the label DELETED added)
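The merge sort on the compaction slides can be sketched with the standard library's `heapq.merge`. This is a toy version under two assumptions: the input files are listed oldest first, and when a key appears in more than one file the newest copy wins. Integer keys stand in for the slide's K1, K2, ... so that they sort numerically. The index file and Bloom filter are left out.

```python
import heapq


def compact(sstables):
    """Merge sorted runs of (key, value) into one sorted run.

    sstables -- list of sorted lists, oldest first; keys are unique
                within a single run.
    """
    # Tag rows with a negated age so that, for equal keys, the row from
    # the newest run sorts first and is the one that gets kept.
    tagged = [
        [(key, -age, value) for key, value in table]
        for age, table in enumerate(sstables)
    ]
    merged = []
    last_key = object()  # sentinel that equals no real key
    for key, _, value in heapq.merge(*tagged):
        if key == last_key:
            continue     # older duplicate of a key already emitted
        last_key = key
        merged.append((key, value))
    return merged
```

Run on the three input files from the slide, the result holds keys 1, 2, 3, 4, 5, 10, 30 in order, with one row each for the duplicated keys 2 and 10.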
  15. 16. Availability in Action (ring diagram labels: A, L, T, W, F, P, Y, U; Key “C”)
  16. 17. Availability in Action (ring diagram labels: A, L, T, W, F, P, Y, U; Key “C”; X; hint)
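The "X" and "hint" labels appear to illustrate hinted handoff: when a replica is down, the write is still accepted and a hint is kept so the missed write can be delivered later. A minimal sketch under that reading, with `Replica`, `Coordinator`, and the replica names all hypothetical:

```python
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """Toy coordinator that stores hints for replicas that are down."""

    def __init__(self):
        self.hints = []  # pending (replica, key, value) writes

    def write(self, replicas, key, value) -> int:
        """Write to every live replica; keep a hint for each dead one.
        Returns the number of replicas that acknowledged."""
        acks = 0
        for replica in replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                self.hints.append((replica, key, value))
        return acks

    def replay_hints(self) -> None:
        """Deliver stored hints to replicas that have come back."""
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining
```

The write stays available while one replica is down, and the replica catches up once `replay_hints` runs after it recovers.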
  17. 18. JMX
  18. 19. OpsCenter
  19. 20. OpsCenter
  20. 21. OpsCenter
