20110515 cassandra linuxfb
Published in: Technology, Education

Transcript

  • 1. Cassandra and NoSQL Database
      Wang Xu, gnawux@gmail.com
      May 2011
  • 2. Outline
        1. Cassandra in Greek Mythology
        2. Brief History of Cassandra Project
        3. NoSQL and Big Data
        4. Eventual Consistency
        5. Bigtable and Dynamo
        6. Some Highlight Detail
        7. Pieces about the Book and Translation
  • 3. Cassandra in Greek Mythology
      She Could Foresee the Future
        • Daughter of King Priam and Queen Hecuba of Troy.
        • Apollo gave her the ability to see the future.
        • No one would believe her.
      Some related...
        • Delphi, Oracle
        • Hector
  • 4. Brief History of Cassandra Project
      The important players in the Cassandra community
        • Facebook created Cassandra for its inbox search.
        • Facebook donated Cassandra to the Apache Software Foundation.
        • Rackspace became a leading contributor in the community.
        • Twitter set off the discussion around Cassandra, but...
        • Digg also actively participates in the development.
      And the releases
        • 0.7 introduces runtime schema modification.
        • 0.8 (Beta) introduces a query language named CQL.
  • 5. No-SQL or Not-Only-SQL
      NoSQL has bloomed over the past decade
        • Columnar: HBase in the Hadoop community follows the design of Google Bigtable.
        • Document-based: MongoDB is used at Foursquare and other popular sites.
        • Key-value: Redis is dramatically fast.
        • Graph: Neo4j and other graph databases are suited to social networks and the Semantic Web.
      Quotes:
        "... the term “Big Data” to highlight the fact that this family of nonrelational databases is not defined by what they’re not (implementations of SQL), but rather by what they do (handle huge data loads)."
  • 6. Eventual Consistency and Brewer’s CAP Theorem
      Brewer’s CAP Theorem
        [Figure: Databases in the CAP continuum]
  • 7. Bigtable: Column Family based Data Model
      Bigtable and the Column Family based Data Model (in Cassandra)
        • Bigtable is Google's columnar database, built on GFS.
        • Both HBase and Cassandra follow Bigtable’s data model.
        • Keyspace vs. database, column family vs. table, ...
        • The table is sparse: every column is a name/value pair rather than a single value.
        • Columns are sorted, so a range of columns can be queried.
        • Inserting and updating a column for a row key are the same operation.
        • Cassandra has the “Super Column”.
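The data model on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions, not Cassandra's implementation: the `ColumnFamily` class, the `insert` and `get_slice` method names, and the sample "Users" data are all hypothetical. It shows a sparse row whose columns are name/value pairs kept sorted by name, a write that serves as both insert and update (the newer timestamp wins), and a range query over column names.

```python
import bisect
import time


class ColumnFamily:
    """Toy sparse table: row key -> columns kept sorted by column name."""

    def __init__(self):
        # row key -> (sorted list of column names, {name: (value, timestamp)})
        self.rows = {}

    def insert(self, row_key, name, value, timestamp=None):
        # Insert and update are the same operation: the newest timestamp wins.
        ts = time.time() if timestamp is None else timestamp
        names, cols = self.rows.setdefault(row_key, ([], {}))
        if name not in cols:
            bisect.insort(names, name)
            cols[name] = (value, ts)
        elif ts >= cols[name][1]:
            cols[name] = (value, ts)

    def get_slice(self, row_key, start, finish):
        """Return columns with start <= name <= finish, in name order."""
        names, cols = self.rows.get(row_key, ([], {}))
        lo = bisect.bisect_left(names, start)
        hi = bisect.bisect_right(names, finish)
        return [(n, cols[n][0]) for n in names[lo:hi]]


# A keyspace groups column families, much as a database groups tables.
keyspace = {"Users": ColumnFamily()}
users = keyspace["Users"]
users.insert("alice", "email", "a@example.com", timestamp=1)
users.insert("alice", "city", "Beijing", timestamp=1)
users.insert("alice", "email", "alice@example.com", timestamp=2)  # an "update"
users.insert("bob", "phone", "555-0100", timestamp=1)  # rows need not share columns
print(users.get_slice("alice", "a", "f"))
```

Because each row carries its own set of column names, "alice" and "bob" share no columns at all, which is what makes the table sparse.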
  • 8. Dynamo: DHT Based Decentralized Storage
      It’s a DHT Ring
        • Dynamo was designed for Amazon’s “Shopping Cart”.
        • Cassandra is based on Dynamo’s decentralized architecture.
        • Dynamo is fully decentralized; in other words, it is a structured P2P system.
        • Routing information is maintained by “Gossip”.
        • Data is repaired while reading (read repair).
        • Vector clock vs. timestamp
        • Anti-entropy and Merkle trees
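The DHT ring idea can be illustrated with a minimal consistent-hashing sketch. The assumptions are stated loudly: the `Ring` class, the `token` helper, and the node names are hypothetical, node positions are derived here by hashing the node name (a real cluster assigns tokens differently), and replicas are simply the next nodes clockwise from the key's position.

```python
import bisect
import hashlib


def token(key):
    # Map a string to a position on the ring (assumption: MD5 as the hash).
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: every node owns the arc that ends at its token."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        self.tokens = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        """Walk clockwise from the key's token and take the next rf nodes."""
        positions = [t for t, _ in self.tokens]
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(positions, token(key)) % len(self.tokens)
        count = min(self.rf, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring(nodes, replication_factor=3)
print(ring.replicas("user:42"))
```

Any node can compute the same replica list from the same ring view, so no central directory is needed; keeping that view current is the job the slide assigns to gossip.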
  • 9. Memtable, Commit-log, and SSTable
      Append write vs. random write
        • Writes go into the Memtable (in memory) and the commit log (on disk, append only).
        • The Memtable is flushed to an SSTable.
        • SSTables are compacted periodically or when triggered by nodetool.
        • The commit log is read only during recovery.
      Bloom Filter
        • Disk access is expensive.
        • A Bloom filter is used to accelerate element search.
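The write path and the Bloom filter on this slide can be sketched together as a toy model. Everything here is an assumption made for illustration: `ToyStore`, `BloomFilter`, the in-memory lists standing in for on-disk files, the flush threshold, and the choice of SHA-256 for the filter's hash functions. It is not how Cassandra lays data out on disk.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size=256, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


class ToyStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []  # stands in for an append-only file on disk
        self.memtable = {}    # recent writes, held in memory
        self.sstables = []    # immutable (BloomFilter, dict) pairs, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # sequential append, no random I/O
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def _build_sstable(self, items):
        bloom = BloomFilter()
        for key in items:
            bloom.add(key)
        return bloom, dict(sorted(items.items()))

    def flush(self):
        # The memtable becomes a sorted, immutable SSTable; the log entries it
        # covered are no longer needed for recovery.
        self.sstables.append(self._build_sstable(self.memtable))
        self.memtable = {}
        self.commit_log = []

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for bloom, table in reversed(self.sstables):  # newest first
            if bloom.might_contain(key) and key in table:
                return table[key]
        return None

    def compact(self):
        merged = {}
        for _, table in self.sstables:  # oldest first, so newer values win
            merged.update(table)
        self.sstables = [self._build_sstable(merged)]


store = ToyStore(memtable_limit=2)
for k, v in [("k1", "v1"), ("k2", "v2"), ("k1", "v1b"), ("k3", "v3"), ("k4", "v4")]:
    store.write(k, v)
print(store.read("k1"), len(store.sstables))
```

In this sketch the filter check lets a read skip an SSTable without touching it when the key is definitely absent, which is the saving the slide attributes to Bloom filters when disk access is expensive.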
  • 10. Trade-off between Availability and Consistency
      Different consistency levels
        • CL.ZERO
        • CL.ANY (Hinted Hand-off)
        • CL.ONE
        • CL.QUORUM
        • CL.ALL
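The trade-off behind these levels can be made concrete with the usual overlap rule: a read is guaranteed to see the latest acknowledged write when the replicas written plus the replicas read exceed the replication factor (W + R > N). The sketch below is a hedged illustration; the function names are hypothetical, and treating ZERO and ANY as guaranteeing zero replica acknowledgements (because a hint alone can satisfy ANY) is a modelling assumption.

```python
def guaranteed_replica_acks(level, replication_factor):
    """Replicas certain to hold the data once an operation at this level succeeds."""
    table = {
        "ZERO": 0,                             # fire and forget
        "ANY": 0,                              # a stored hint alone may satisfy it
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return table[level]


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when the write set and the read set must overlap (W + R > N)."""
    w = guaranteed_replica_acks(write_level, replication_factor)
    r = guaranteed_replica_acks(read_level, replication_factor)
    return w + r > replication_factor


for w, r in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "ALL"), ("ANY", "ALL")]:
    print(w, r, read_sees_latest_write(w, r, replication_factor=3))
```

Lower levels need fewer live replicas and so stay available through more failures, at the price of possibly stale reads; higher levels reverse that.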
  • 11. SEDA for Performance
      Thread pools and I/O
        • Operations are separated into stages.
        • Stages are assigned to specific resources such as CPU and I/O.
        • Stages are driven by executors (thread pools).
        • Stages can be observed through JMX.
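A staged, event-driven design can be sketched as stages that each own a queue and a small pool of worker threads and hand results to the next stage. This is a minimal Python illustration under assumptions: the `Stage` class, the stage names, and the handlers are hypothetical, and the `processed` counter merely stands in for the kind of per-stage metric that could be exposed for monitoring.

```python
import queue
import threading


class Stage:
    """One stage: an inbox queue drained by its own pool of worker threads."""

    def __init__(self, name, handler, workers, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()
        self.processed = 0  # simple per-stage counter
        self._lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, event):
        self.inbox.put(event)

    def _run(self):
        while True:
            event = self.inbox.get()
            try:
                result = self.handler(event)
                with self._lock:
                    self.processed += 1
                if self.next_stage is not None:
                    self.next_stage.submit(result)  # hand off; do not block on it
            finally:
                self.inbox.task_done()


results = []
results_lock = threading.Lock()


def store(event):
    with results_lock:
        results.append(event)
    return event


# Two stages with independently sized pools: CPU-style work, then I/O-style work.
write_stage = Stage("write", store, workers=1)
parse_stage = Stage("parse", str.upper, workers=2, next_stage=write_stage)

for word in ["a", "b", "c"]:
    parse_stage.submit(word)

parse_stage.inbox.join()  # every event has been parsed and handed off
write_stage.inbox.join()  # every handed-off event has been stored
print(sorted(results))
```

Sizing each pool separately is the point of the design: a slow stage backs up only its own queue rather than stalling the threads of the others.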
  • 12. Pieces about the Book and Translation
      How about the book?
        • The only one focused on Cassandra.
        • Gives you the big picture of NoSQL and Cassandra.
        • Not excellent, but still useful.
        • Some repeated content and code.
      Is it a fun job?
        • It took me about 3 months.
        • More than half of it was finished in the last month.
        • Now I feel well and do not want to translate another one soon.
  • 13. Q&A
