Cassandra: a NoSQL storage system

Published in: Technology

  1. Cassandra - A Decentralized Structured Storage System
     Avinash Lakshman, Facebook
     Prashant Malik, Facebook
     Iván Carballo icf1e11@ecs.soton.ac.uk
  2. Content
     • Introduction
     • Data Model
     • API
     • System architecture
     • Performance example
     • Conclusion
  3. Introduction
     • Cassandra is a NoSQL database.
     • Originally developed by Facebook.
     • Cassandra was designed to:
       • Manage large amounts of structured data.
       • Run on top of a system of hundreds of nodes.
       • Handle high write throughput.
       • Run on cheap hardware (scale out).
       • Provide high scalability, reliability and performance.
  4. Data Model
     Source: http://www.inmensia.com/blog/20100327/desmitificando_a_cassandra.html
  5. Data Model (Example)
     Source: http://www.divconq.com/2010/how-to-add-and-retrieve-data-from-a-cassandra-database/
  6. API (Application Programming Interface)
     • Three simple methods:
       • insert(table, key, rowMutation)
       • get(table, key, columnName)
       • delete(table, key, columnName)
  7. System architecture
     • Partitioning
       • Data is partitioned dynamically over a set of nodes.
       • Uses consistent hashing.
     • Replication
       • Each data item is replicated at N hosts.
     • Failure detection
       • Every node knows whether each of the other nodes in the system is up or down.
  8. Performance example
     • Facebook Inbox Search
       • More than 50 TB of data
       • 150 nodes
       • Different datacentres (west and east coast)
     • Read performance:

       Latency   Search interactions   Term search
       Min       7.69 ms               7.78 ms
       Median    15.69 ms              18.27 ms
       Max       26.13 ms              44.41 ms
  9. Conclusion
     • They successfully implemented a system which provides:
       • Scalability
       • High performance
       • Wide applicability
  10. Paper
     A. Lakshman and P. Malik, “Cassandra: a decentralized structured storage system”, ACM SIGOPS Operating Systems Review, vol. 44, no. 2, pp. 35-40, April 2010.
  11. Thank you. Any questions?
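
The API slide (insert, get, delete) and the partitioning and replication bullets of the System architecture slide can be illustrated together with a small in-memory toy. This is only a sketch under stated assumptions, not how Cassandra itself is implemented: the class name `ToyCluster`, the node names, the use of MD5 for ring positions, and the choice to read from the first replica only are all hypothetical. Each node gets a position on a hash ring, a key is owned by the first node clockwise from the key's position, and the item is copied to the next N - 1 nodes on the ring.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice for this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyCluster:
    """Hypothetical in-memory model: consistent hashing plus N-way replication."""

    def __init__(self, node_names, n_replicas=3):
        # Node names are assumed to be unique; never ask for more replicas than nodes.
        self.n_replicas = min(n_replicas, len(node_names))
        # The ring is the list of (position, node) pairs sorted by position.
        self.ring = sorted((ring_position(name), name) for name in node_names)
        self.positions = [pos for pos, _ in self.ring]
        # Per-node storage: node -> table -> key -> {column name: value}.
        self.storage = {name: {} for name in node_names}

    def replicas_for(self, key):
        """Return the N nodes responsible for a key, walking clockwise on the ring."""
        start = bisect.bisect_right(self.positions, ring_position(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.n_replicas)]

    def insert(self, table, key, row_mutation):
        """Apply a dict of column updates to the row on every replica."""
        for node in self.replicas_for(key):
            row = self.storage[node].setdefault(table, {}).setdefault(key, {})
            row.update(row_mutation)

    def get(self, table, key, column_name):
        """Read one column from the first replica (a simplification of this sketch)."""
        node = self.replicas_for(key)[0]
        return self.storage[node].get(table, {}).get(key, {}).get(column_name)

    def delete(self, table, key, column_name):
        """Remove one column from the row on every replica."""
        for node in self.replicas_for(key):
            self.storage[node].get(table, {}).get(key, {}).pop(column_name, None)


if __name__ == "__main__":
    cluster = ToyCluster(["node-a", "node-b", "node-c", "node-d", "node-e"], n_replicas=3)
    cluster.insert("Inbox", "user42", {"term:hello": "msg-1"})
    print(cluster.replicas_for("user42"))                # three distinct node names
    print(cluster.get("Inbox", "user42", "term:hello"))  # msg-1
    cluster.delete("Inbox", "user42", "term:hello")
    print(cluster.get("Inbox", "user42", "term:hello"))  # None
```

Because only the ring positions decide ownership, adding or removing a node in this toy changes the replica set only for keys near that node's position, which is the property the "Uses consistent hashing" bullet refers to. Failure detection is not modelled here.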