Your SlideShare is downloading. ×
0
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Cassandra at High Performance Transaction Systems 2011

4,428

Published on

Published in: Technology, Business
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
4,428
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
55
Comments
0
Likes
4
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Apache Cassandra Present and Future Jonathan EllisMonday, October 24, 2011
  • 2. History Bigtable, 2006 Dynamo, 2007 OSS, 2008 TLP, 2010 Incubator, 2009 1.0, October 2011Monday, October 24, 2011
  • 3. Why people choose Cassandra ✤ Multi-master, multi-DC ✤ Linearly scalable ✤ Larger-than-memory datasets ✤ High performance ✤ Full durability ✤ Integrated caching ✤ Tuneable consistencyMonday, October 24, 2011
  • 4. Cassandra users ✤ Financial ✤ Social Media ✤ Advertising ✤ Entertainment ✤ Energy ✤ E-tail ✤ Health care ✤ GovernmentMonday, October 24, 2011
  • 5. Road to 1.0 ✤ Storage engine improvements ✤ Specialized replication modes ✤ Performance and other real-world considerations ✤ Building an ecosystemMonday, October 24, 2011
  • 6. Compaction Size-Tiered LeveledMonday, October 24, 2011
  • 7. Differences in Cassandra’s leveling ✤ ColumnFamily instead of key/value ✤ Multithreading (experimental) ✤ Optional throttling (16MB/s by default) ✤ Per-sstable bloom filter for row keys ✤ Larger data files (5MB by default) ✤ Does not block writes if compaction falls behindMonday, October 24, 2011
  • 8. Column (“secondary”) indexingcqlsh> CREATE INDEX state_idx ON users(state);cqlsh> INSERT INTO users (uname, state, birth_date) VALUES (‘bsanderson’, ‘UT’, 1975) users state_idx state birth_date bsanderson UT 1975 UT bsanderson htayler prothfuss WI 1973 WI prothfuss htayler UT 1968Monday, October 24, 2011
  • 9. Queryingcqlsh> SELECT * FROM users WHERE state=UT AND birth_date > 1970 ORDER BY tokenize(key);Monday, October 24, 2011
  • 10. More sophisticated indexing? ✤ Want to: ✤ Support more operators ✤ Support user-defined ordering ✤ Support high-cardinality values ✤ But: ✤ Scatter/gather scales poorly ✤ If it’s not node local, we can’t guarantee atomicity ✤ If we can’t guarantee atomicity, doublechecking across nodes is a huge penaltyMonday, October 24, 2011
  • 11. Other storage engine improvements ✤ Compression ✤ Expiring columns ✤ Bounded worst-case reads by re-writing fragmented rowsMonday, October 24, 2011
  • 12. Eventually-consistent counters ✤ Counter is partitioned by replica; each replica is “master” of its own partitionMonday, October 24, 2011
  • 13. 13Monday, October 24, 2011
  • 14. 14Monday, October 24, 2011
  • 15. 15Monday, October 24, 2011
  • 16. 16Monday, October 24, 2011
  • 17. The interesting parts ✤ Tuneable consistency ✤ Avoiding contention on local increments ✤ Store increments, not full value; merge on read + compaction ✤ Renewing counter id to deal with data loss ✤ Interaction with tombstonesMonday, October 24, 2011
  • 18. (What about version vectors?) ✤ Version vectors allow detecting conflict, but do not give enough information to resolve it except in special cases ✤ In the counters case, we’d need to hold onto all previous versions until we can be sure no new conflict with them can occur ✤ Jeff Darcy has a longer discussion at http://pl.atyp.us/ wordpress/?p=2601Monday, October 24, 2011
  • 19. Performance A single four-core machine; one million inserts + one million updatesMonday, October 24, 2011
  • 20. Dealing with the JVM ✤ JNA ✤ mlockall() ✤ posix_fadvise() ✤ link() ✤ Memory ✤ Move cache off-heap ✤ In-heap arena allocation for memtables, bloom filters ✤ Move compaction to a separate process?Monday, October 24, 2011
  • 21. The Cassandra ecosystem ✤ Replication into Cassandra ✤ Gigaspaces ✤ Drizzle ✤ Solandra: Cassandra + Solr search ✤ DataStax Enterprise: Cassandra + Hadoop analyticsMonday, October 24, 2011
  • 22. DataStax EnterpriseMonday, October 24, 2011
  • 23. Operations ✤ “Vanilla” Hadoop ✤ 8+ services to setup, monitor, backup, and recover (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, Zookeeper, Metastore, ...) ✤ Single points of failure ✤ Cant separate online and offline processing ✤ DataStax Enterprise ✤ Single, simplified component ✤ Peer to peer ✤ JobTracker failover ✤ No additional cassandra configMonday, October 24, 2011
  • 24. What’s next ✤ Ease Of Use ✤ CQL: “Native” transport, prepared statements ✤ Triggers ✤ Entity groups ✤ Smarter range queries enabling Hive predicate push- down ✤ Blue sky: streaming / CEPMonday, October 24, 2011
  • 25. Questions? ✤ jbellis@datastax.com ✤ DataStax is hiring! ✤ ~15 engineers (35 employees), want to double in 2012 ✤ Austin, Burlingame, NYC, France, Japan, Belarus ✤ 100+ customers ✤ $11M Series B fundingMonday, October 24, 2011

×