20110515 cassandra linuxfb
    Presentation Transcript

    • Cassandra and NoSQL Database
      Wang Xu, gnawux@gmail.com
      May, 2011
    • Outline
        1. Cassandra in Greek Mythology
        2. Brief History of Cassandra Project
        3. NoSQL and Big Data
        4. Eventual Consistency
        5. Bigtable and Dynamo
        6. Some Highlight Detail
        7. Pieces about the Book and Translation
    • Cassandra in Greek Mythology
      She could foresee the future
        • Daughter of King Priam and Queen Hecuba of Troy
        • Apollo gave her the ability to see the future.
        • No one would believe her.
      Some related...
        • Delphi, Oracle
        • Hector
    • Brief History of Cassandra Project
      The important players in the Cassandra community
        • Facebook created Cassandra for its inbox search.
        • Facebook donated Cassandra to the Apache Software Foundation.
        • Rackspace became a leading contributor in the community.
        • Twitter set off the Cassandra discussion, but...
        • Digg also actively participates in the development.
      And the releases
        • 0.7 introduces runtime schema modification.
        • 0.8 (Beta) introduces a query language named CQL.
    • No-SQL or Not-Only-SQL
      NoSQL has bloomed over the past decade
        • Columnar: HBase in the Hadoop community follows the design of Google Bigtable.
        • Document-based: MongoDB is used at Foursquare and other popular sites.
        • Key-value: Redis is dramatically fast.
        • Graph: Neo4j and other graph databases are suited to social networks and the Semantic Web.
      Quote:
        ... the term “Big Data” to highlight the fact that this family of nonrelational databases is not defined by what they’re not (implementations of SQL), but rather by what they do (handle huge data loads).
    • Eventual Consistency and Brewer’s CAP Theorem
      Figure: Databases in the CAP continuum
    • Bigtable: Column Family based Data Model
      Bigtable and the column family based data model (in Cassandra)
        • Bigtable is Google's columnar database, built on top of GFS.
        • Both HBase and Cassandra follow Bigtable’s data model.
        • Keyspace vs. database, column family vs. table, ...
        • The table is sparse: every column is a name/value pair rather than a single value.
        • Columns are sorted, so a range of columns can be queried.
        • Inserting and updating a column for a row key are the same operation.
        • Cassandra has the “Super Column”.
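      The data model on this slide can be illustrated with a minimal Python sketch. It is a toy, in-memory model and not Cassandra's API: the class `ColumnFamily` and its methods `insert` and `get_slice` are hypothetical names chosen for illustration. It shows sparse rows, columns kept sorted by name, a range query over column names, and insert and update sharing one code path.

```python
from bisect import bisect_left, bisect_right


class ColumnFamily:
    """Toy column family: each row key maps to its own sorted set of columns."""

    def __init__(self):
        # row_key -> (sorted list of column names, dict of name -> value)
        self._rows = {}

    def insert(self, row_key, name, value):
        """Insert or update a column; both cases take the same path."""
        names, values = self._rows.setdefault(row_key, ([], {}))
        if name not in values:
            # Keep column names sorted so range queries stay cheap.
            names.insert(bisect_left(names, name), name)
        values[name] = value

    def get_slice(self, row_key, start, finish):
        """Return the columns whose names fall in [start, finish]."""
        names, values = self._rows.get(row_key, ([], {}))
        lo = bisect_left(names, start)
        hi = bisect_right(names, finish)
        return [(n, values[n]) for n in names[lo:hi]]


cf = ColumnFamily()
cf.insert("user1", "email", "a@example.com")
cf.insert("user1", "name", "Alice")
cf.insert("user1", "city", "Beijing")
cf.insert("user2", "name", "Bob")        # sparse: user2 has only one column
cf.insert("user1", "name", "Alice W.")   # an update is just another insert

print(cf.get_slice("user1", "city", "email"))
```

      Each row holds only the columns that were actually written, which is what makes the table sparse.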
    • Dynamo: DHT Based Decentralized Storage
      It’s a DHT ring
        • Dynamo was designed for Amazon’s “Shopping Cart”.
        • Cassandra is based on Dynamo’s decentralized architecture.
        • Dynamo is fully decentralized, that is, a structured P2P system.
        • Routing information is maintained by “Gossip”.
        • Data is repaired while reading.
        • Vector clock vs. timestamp
        • Anti-entropy and Merkle trees
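      The DHT ring can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Dynamo's or Cassandra's actual code: the names `token`, `Ring`, and `replicas` are hypothetical, MD5 is used only as a convenient hash, and each node owns a single token. A key is hashed onto the ring and stored on the next nodes found walking clockwise.

```python
import hashlib
from bisect import bisect_right


def token(key: str) -> int:
    """Map a string onto the ring as a large integer (MD5 is an assumption)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    """Toy hash ring: every node owns one token, no virtual nodes."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, key):
        """Walk clockwise from the key's token and pick the next rf nodes."""
        start = bisect_right(self._tokens, token(key)) % len(self._ring)
        count = min(self.rf, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring(nodes, replication_factor=3)
print(ring.replicas("row-42"))
```

      Because any node can compute the same replica list from the ring membership alone, no central coordinator is needed to route a request.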
    • Memtable, Commit-log, and SSTable
      Append write vs. random write
        • Writes go into the Memtable (in memory) and the commit log (on disk, append only).
        • The Memtable is flushed to an SSTable.
        • SSTables are compacted periodically or when triggered by nodetool.
        • The commit log is read only during recovery.
      Bloom filter
        • Disk access is expensive.
        • A Bloom filter is used to accelerate element search.
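      The write path and the role of the Bloom filter can be sketched together in Python. Everything here is a simplified assumption for illustration: `BloomFilter`, `SSTable`, and `Store` are hypothetical classes, plain Python lists and dicts stand in for on-disk files, and compaction is left out. Writes append to a commit log and update the memtable; a full memtable is flushed to an immutable SSTable; reads consult each SSTable's Bloom filter before touching its data.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        return all(self.bits[p] for p in self._positions(key))


class SSTable:
    """Immutable, sorted snapshot of a flushed memtable (in memory here)."""

    def __init__(self, items):
        self.data = dict(sorted(items.items()))
        self.bloom = BloomFilter()
        for key in self.data:
            self.bloom.add(key)
        self.lookups = 0  # counts the simulated "disk" accesses

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None  # definitely absent: skip the expensive lookup
        self.lookups += 1
        return self.data.get(key)


class Store:
    def __init__(self, memtable_limit=2):
        self.commit_log = []  # stand-in for an append-only file on disk
        self.memtable = {}
        self.sstables = []
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # sequential append first
        self.memtable[key] = value            # then the in-memory update
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        self.sstables.append(SSTable(self.memtable))
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed data needs no replay

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            value = table.get(key)
            if value is not None:
                return value
        return None


store = Store(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)  # memtable is full, so it is flushed to an SSTable
store.write("c", 3)  # stays in the memtable and the commit log
print(store.read("a"), store.read("c"), store.read("missing"))
```

      No write in this sketch ever modifies existing data in place, which is the point of the "append write vs. random write" contrast.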
    • Trade-off between Availability and Consistency
      Different consistency levels
        • CL.ZERO
        • CL.ANY (Hinted Hand-off)
        • CL.ONE
        • CL.QUORUM
        • CL.ALL
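      The trade-off between these levels can be made concrete with a short Python sketch. The functions `guaranteed_replica_acks` and `reads_see_latest_write` are hypothetical helpers, not Cassandra API, and the mapping assumes the usual meaning of each level: ZERO waits for nothing, ANY may be satisfied by a stored hint alone, ONE needs one replica, QUORUM needs a majority, and ALL needs every replica. A read is guaranteed to overlap the latest write when the read and write replica counts together exceed the replication factor.

```python
def guaranteed_replica_acks(level: str, replication_factor: int) -> int:
    """Replicas guaranteed to hold the data once the operation returns."""
    table = {
        "ZERO": 0,    # fire and forget: nothing is awaited
        "ANY": 0,     # a hint may satisfy the write, so no replica is guaranteed
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return table[level]


def reads_see_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when the read set and write set must share at least one replica."""
    r = guaranteed_replica_acks(read_level, rf)
    w = guaranteed_replica_acks(write_level, rf)
    return r + w > rf


rf = 3
for write_level in ("ANY", "ONE", "QUORUM", "ALL"):
    for read_level in ("ONE", "QUORUM", "ALL"):
        ok = reads_see_latest_write(read_level, write_level, rf)
        print(f"write={write_level:<6} read={read_level:<6} overlap={ok}")
```

      Lower levels return sooner and tolerate more failed nodes, while higher levels buy consistency at the cost of availability.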
    • SEDA for Performance
      Thread pools and I/O
        • Operations are separated into stages.
        • Stages are dedicated to specific resources such as CPU and I/O.
        • Stages are driven by executors (thread pools).
        • Stages can be observed through JMX.
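      The staged design can be sketched in Python with one thread pool per stage. This is an illustration under assumptions rather than Cassandra's implementation, which is written in Java: the `Stage` class, the two stages, and the `handle_request` helper are hypothetical, and the `completed` counter is only a loose stand-in for the kind of per-stage metric that JMX exposes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Stage:
    """One SEDA-style stage: a named handler driven by its own thread pool."""

    def __init__(self, name, workers, handler):
        self.name = name
        self.handler = handler
        self.executor = ThreadPoolExecutor(max_workers=workers,
                                           thread_name_prefix=name)
        self.completed = 0  # simple observable metric for this stage
        self._lock = threading.Lock()

    def submit(self, event):
        return self.executor.submit(self._run, event)

    def _run(self, event):
        result = self.handler(event)
        with self._lock:
            self.completed += 1
        return result

    def shutdown(self):
        self.executor.shutdown(wait=True)


storage = {}


def parse(raw):
    """CPU-bound step: turn 'key=value' text into a pair."""
    key, value = raw.strip().split("=", 1)
    return key, value


def write(pair):
    """I/O-bound step (simulated): store the pair and return its key."""
    key, value = pair
    storage[key] = value
    return key


parse_stage = Stage("parse", workers=2, handler=parse)
write_stage = Stage("write", workers=1, handler=write)


def handle_request(raw):
    """Pass one request through both stages in order."""
    parsed = parse_stage.submit(raw).result()
    return write_stage.submit(parsed).result()


results = [handle_request(r) for r in ("a=1", "b=2", "c=3")]
print(results, parse_stage.completed, write_stage.completed)

parse_stage.shutdown()
write_stage.shutdown()
```

      Giving each stage its own pool lets the pool size be tuned to the resource that stage depends on.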
    • Pieces about the Book and Translation
      How about the book
        • The only book that focuses on Cassandra
        • Gives you the big picture of NoSQL and Cassandra
        • Not excellent, but still useful
        • Some repeated content and code
      Is it a fun job?
        • It took me about 3 months.
        • More than half of it was finished in the last month.
        • Now I feel fine and do not want to translate another one soon.
    • Q&A