Your SlideShare is downloading. ×
0
Apache Cassandra    Present and Future    Jonathan EllisMonday, October 24, 2011
History   Bigtable, 2006                                    Dynamo, 2007                                             OSS, ...
Why people choose Cassandra    ✤    Multi-master, multi-DC    ✤    Linearly scalable    ✤    Larger-than-memory datasets  ...
Cassandra users    ✤    Financial    ✤    Social Media    ✤    Advertising    ✤    Entertainment    ✤    Energy    ✤    E-...
Road to 1.0    ✤    Storage engine improvements    ✤    Specialized replication modes    ✤    Performance and other real-w...
Compaction                           Size-Tiered                           LeveledMonday, October 24, 2011
Differences in Cassandra’s leveling    ✤    ColumnFamily instead of key/value    ✤    Multithreading (experimental)    ✤  ...
Column (“secondary”) indexingcqlsh> CREATE INDEX state_idx ON users(state);cqlsh> INSERT INTO users (uname, state, birth_d...
Queryingcqlsh> SELECT * FROM users                WHERE state=UT AND birth_date > 1970                ORDER BY tokenize(ke...
More sophisticated indexing?    ✤    Want to:          ✤   Support more operators          ✤   Support user-defined orderin...
Other storage engine improvements        ✤    Compression        ✤    Expiring columns        ✤    Bounded worst-case read...
Eventually-consistent counters    ✤    Counter is partitioned by replica; each replica is “master”         of its own part...
13Monday, October 24, 2011
14Monday, October 24, 2011
15Monday, October 24, 2011
16Monday, October 24, 2011
The interesting parts    ✤    Tuneable consistency    ✤    Avoiding contention on local increments          ✤   Store incr...
(What about version vectors?)    ✤    Version vectors allow detecting conflict, but do not give         enough information ...
Performance     A single four-core machine; one million inserts + one million updatesMonday, October 24, 2011
Dealing with the JVM    ✤    JNA          ✤   mlockall()          ✤   posix_fadvise()          ✤   link()    ✤    Memory  ...
The Cassandra ecosystem    ✤    Replication into Cassandra          ✤   Gigaspaces          ✤   Drizzle    ✤    Solandra: ...
DataStax EnterpriseMonday, October 24, 2011
Operations    ✤    “Vanilla” Hadoop          ✤   8+ services to setup, monitor, backup, and recover              (NameNode...
What’s next    ✤    Ease Of Use    ✤    CQL: “Native” transport, prepared statements    ✤    Triggers    ✤    Entity group...
Questions?    ✤    jbellis@datastax.com    ✤    DataStax is hiring!          ✤   ~15 engineers (35 employees), want to dou...
Upcoming SlideShare
Loading in...5
×

Cassandra at High Performance Transaction Systems 2011

4,458

Published on

Published in: Technology, Business
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
4,458
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
55
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

Transcript of "Cassandra at High Performance Transaction Systems 2011"

  1. 1. Apache Cassandra Present and Future Jonathan EllisMonday, October 24, 2011
  2. 2. History Bigtable, 2006 Dynamo, 2007 OSS, 2008 TLP, 2010 Incubator, 2009 1.0, October 2011Monday, October 24, 2011
  3. 3. Why people choose Cassandra ✤ Multi-master, multi-DC ✤ Linearly scalable ✤ Larger-than-memory datasets ✤ High performance ✤ Full durability ✤ Integrated caching ✤ Tuneable consistencyMonday, October 24, 2011
  4. 4. Cassandra users ✤ Financial ✤ Social Media ✤ Advertising ✤ Entertainment ✤ Energy ✤ E-tail ✤ Health care ✤ GovernmentMonday, October 24, 2011
  5. 5. Road to 1.0 ✤ Storage engine improvements ✤ Specialized replication modes ✤ Performance and other real-world considerations ✤ Building an ecosystemMonday, October 24, 2011
  6. 6. Compaction Size-Tiered LeveledMonday, October 24, 2011
  7. 7. Differences in Cassandra’s leveling ✤ ColumnFamily instead of key/value ✤ Multithreading (experimental) ✤ Optional throttling (16MB/s by default) ✤ Per-sstable bloom filter for row keys ✤ Larger data files (5MB by default) ✤ Does not block writes if compaction falls behindMonday, October 24, 2011
  8. 8. Column (“secondary”) indexingcqlsh> CREATE INDEX state_idx ON users(state);cqlsh> INSERT INTO users (uname, state, birth_date) VALUES (‘bsanderson’, ‘UT’, 1975) users state_idx state birth_date bsanderson UT 1975 UT bsanderson htayler prothfuss WI 1973 WI prothfuss htayler UT 1968Monday, October 24, 2011
  9. 9. Queryingcqlsh> SELECT * FROM users WHERE state=UT AND birth_date > 1970 ORDER BY tokenize(key);Monday, October 24, 2011
  10. 10. More sophisticated indexing? ✤ Want to: ✤ Support more operators ✤ Support user-defined ordering ✤ Support high-cardinality values ✤ But: ✤ Scatter/gather scales poorly ✤ If it’s not node local, we can’t guarantee atomicity ✤ If we can’t guarantee atomicity, doublechecking across nodes is a huge penaltyMonday, October 24, 2011
  11. 11. Other storage engine improvements ✤ Compression ✤ Expiring columns ✤ Bounded worst-case reads by re-writing fragmented rowsMonday, October 24, 2011
  12. 12. Eventually-consistent counters ✤ Counter is partitioned by replica; each replica is “master” of its own partitionMonday, October 24, 2011
  13. 13. 13Monday, October 24, 2011
  14. 14. 14Monday, October 24, 2011
  15. 15. 15Monday, October 24, 2011
  16. 16. 16Monday, October 24, 2011
  17. 17. The interesting parts ✤ Tuneable consistency ✤ Avoiding contention on local increments ✤ Store increments, not full value; merge on read + compaction ✤ Renewing counter id to deal with data loss ✤ Interaction with tombstonesMonday, October 24, 2011
  18. 18. (What about version vectors?) ✤ Version vectors allow detecting conflict, but do not give enough information to resolve it except in special cases ✤ In the counters case, we’d need to hold onto all previous versions until we can be sure no new conflict with them can occur ✤ Jeff Darcy has a longer discussion at http://pl.atyp.us/ wordpress/?p=2601Monday, October 24, 2011
  19. 19. Performance A single four-core machine; one million inserts + one million updatesMonday, October 24, 2011
  20. 20. Dealing with the JVM ✤ JNA ✤ mlockall() ✤ posix_fadvise() ✤ link() ✤ Memory ✤ Move cache off-heap ✤ In-heap arena allocation for memtables, bloom filters ✤ Move compaction to a separate process?Monday, October 24, 2011
  21. 21. The Cassandra ecosystem ✤ Replication into Cassandra ✤ Gigaspaces ✤ Drizzle ✤ Solandra: Cassandra + Solr search ✤ DataStax Enterprise: Cassandra + Hadoop analyticsMonday, October 24, 2011
  22. 22. DataStax EnterpriseMonday, October 24, 2011
  23. 23. Operations ✤ “Vanilla” Hadoop ✤ 8+ services to setup, monitor, backup, and recover (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, Zookeeper, Metastore, ...) ✤ Single points of failure ✤ Cant separate online and offline processing ✤ DataStax Enterprise ✤ Single, simplified component ✤ Peer to peer ✤ JobTracker failover ✤ No additional cassandra configMonday, October 24, 2011
  24. 24. What’s next ✤ Ease Of Use ✤ CQL: “Native” transport, prepared statements ✤ Triggers ✤ Entity groups ✤ Smarter range queries enabling Hive predicate push- down ✤ Blue sky: streaming / CEPMonday, October 24, 2011
  25. 25. Questions? ✤ jbellis@datastax.com ✤ DataStax is hiring! ✤ ~15 engineers (35 employees), want to double in 2012 ✤ Austin, Burlingame, NYC, France, Japan, Belarus ✤ 100+ customers ✤ $11M Series B fundingMonday, October 24, 2011
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×