State of Cassandra, August 2010
Upcoming SlideShare
Loading in...5
×
 

State of Cassandra, August 2010

on

  • 7,722 views

Keynote from the 2010 Cassandra Summit

Keynote from the 2010 Cassandra Summit

Statistics

Views

Total Views
7,722
Views on SlideShare
6,691
Embed Views
1,031

Actions

Likes
15
Downloads
220
Comments
0

16 Embeds 1,031

http://www.dbthink.com 341
http://log.medcl.net 277
http://www.datastax.com 165
http://www.riptano.com 142
http://www.nosqldatabases.com 78
http://www.wumii.com 5
https://coral.corp.apple.com 4
http://xianguo.com 4
http://confluence.corp.apple.com 4
http://www.slideshare.net 3
http://reader.youdao.com 2
http://coral.corp.apple.com 2
http://translate.googleusercontent.com 1
http://web.archive.org 1
http://static.slidesharecdn.com 1
http://cache.baidu.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

State of Cassandra, August 2010 State of Cassandra, August 2010 Presentation Transcript

  • Professional Cassandra support and services Tuesday, August 10, 2010
  • Cassandra: Present & Future Jonathan Ellis @spyced Tuesday, August 10, 2010
  • Cassandra 0.6 & 0.7 Jonathan Ellis @spyced Tuesday, August 10, 2010
  • Quiet change of policy • 0.5.1 was bug fixes only • Too early to be strict about bugfix-only policy in stable branch, especially w/ 0.7 being longer/more break-y • Maybe after 1.0? Tuesday, August 10, 2010
  • 1500 mails sent 1125 750 375 0 Jan Feb Apr May Jun Jul (0.5) (0.5.1) Mar (0.6, 0.6.1) (0.6.2) (0.6.3) (0.6.4) Tuesday, August 10, 2010
  • Lots of bug fixes • 85 issues marked Resolved/Fixed in 0.6 branch after 0.6 released Tuesday, August 10, 2010
  • Runtime configuration • concurrent reads, writes (0.6.2) • making it easier to bandage your foot after you shoot it • PhiConvictThreshold (0.6.2) Tuesday, August 10, 2010
  • Performance • JVM GC defaults (0.6.2) • Faster commitlog (0.6.2) • Faster range slice, Hadoop jobs (0.6.1, 2) • Better parallelization of multiget (0.6.4) • UTF8Type, UUIDType optimizations (0.6.5) Tuesday, August 10, 2010
  • Bulletproofing • HH disable (0.6.2) • compaction priority (0.6.3) • HH hourly scan (0.6.3) • JMX metrics for row-level bloom filters (0.6.3) • Flow control (0.6.4, 5) • HH paging (0.6.5) • Dynamic snitch (0.6.5) Tuesday, August 10, 2010
  • Hinted Handoff • 0.6.0: send hints to natural replicas • 0.6.0: fix row-level concurrency bottleneck • 0.6.2: option to disable entirely • 0.6.3: remove hourly scan • 0.6.4: lower priority • 0.6.5: paging of large hinted rows • 0.7.0: large rows Tuesday, August 10, 2010
  • Why keep HH around? https://www.cloudkick.com/blog/2010/jan/12/visual-ec2-latency/ Tuesday, August 10, 2010
  • Compaction priority -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1 Extended to HH in 0.6.4 Tuesday, August 10, 2010
  • http://www.javamex.com/tutorials/threads/priority_what.shtml Tuesday, August 10, 2010
  • JMX for bloom filters • o.a.c.db:ColumnFamilyStores • getBloomFilterFalsePositives • [not in nodetool yet] Tuesday, August 10, 2010
  • Flow control in 0.5 • Why backpressure doesn’t fit Cassandra Tuesday, August 10, 2010
  • Flow Control in 0.6.4 • Replica nodes drop hopeless requests on the floor • Coordinator node is unaffected • TimedOutException signals client to back off • Requires enough memory to buffer RPCTimeout’s worth of requests • (In the short term, you’re still screwed) Tuesday, August 10, 2010
  • Flow Control, 0.6.4 IncomingTcpConnection Message Deserializer Uncapped Read Mutation Capped at 4096 Tuesday, August 10, 2010
  • IncomingTcpConnection Message Deserializer Read Gossip Mutation Tuesday, August 10, 2010
  • Flow Control, 0.6.5 IncomingTcpConnection Read Gossip Mutation Uncapped Tuesday, August 10, 2010
  • Dynamic snitch • sortByProximity Tuesday, August 10, 2010
  • Open problems • Linux/mmap/swap unholy trio (0.6.5) • Memory fragmentation (0.6.5?) • Compaction effect on caches (0.7.1?) Tuesday, August 10, 2010
  • mmap and swap • The problem • Mitigations • mmap_index_only • swappiness=0 • turn off swap • mlockall at startup (Xms=Xmx) Tuesday, August 10, 2010
  • GC Fragmentation • Culprit of infamous CASSANDRA-1014? • Mitigation: tune with much larger new generation / tenuring threshold? Tuesday, August 10, 2010
  • Compaction and caches • Compactions wrecks the OS fs cache • Wrecks Cassandra key cache, too • (but not row cache) Tuesday, August 10, 2010
  • 0.7 Tuesday, August 10, 2010
  • New in 0.7 • live schema changes • large rows • secondary indexes • efficient Streaming • DatacenterStrategy Tuesday, August 10, 2010
  • Live schema changes • Details: http://www.riptano.com/blog/live- schema-updates-cassandra-07 Tuesday, August 10, 2010
  • Large rows • 0.6: smaller of {2GB, memory limit} • 0.7: in_memory_compaction_limit_in_mb Tuesday, August 10, 2010
  • Secondary indexes Tuesday, August 10, 2010
  • Streaming in 0.6 W A F (A-L] T L Tuesday, August 10, 2010
  • W A F (A-F] (A-F] T (F-L] L Tuesday, August 10, 2010
  • W A F Data T L Index Filter Tuesday, August 10, 2010
  • Streaming in 0.7 W A F T L Index Filter Tuesday, August 10, 2010
  • DatacenterStrategy • RackAwareStrategy is tuned for 3 replicas and 2 data centers • DS allows configuring replicas per data center, per Keyspace Tuesday, August 10, 2010
  • Minor features in 0.7 • read_repair_chance • per-keyspace request scheduling • Hadoop OutputFormat • Per CF what used to be global (gc_grace_seconds, memtable thresholds) Tuesday, August 10, 2010
  • 0.7 API changes • String keys become byte[] • Thrift keyspace argument moved to set_keyspace • i64 timestamp becomes Clock • SlicePredicate for _count methods Tuesday, August 10, 2010
  • 0.7 performance • Reads roughly 100% faster, thanks largely to removing String creation • Row-cached reads up to 8x faster after optimizations by tjake and jbellis • Optimizations for reads of large rows • 0.7.1? ~20% improvement everywhere from Thrift optimizations Tuesday, August 10, 2010
  • Thrift • OOMs on malformed packets • Python Unicode string issues • PHP support is buggy and maintainerless Tuesday, August 10, 2010
  • After 0.7.0 • IndexOperator.GT • Triggers / plugins • Avro? • On-disk data format improvements (Compression, heirarchical data?) • Auth Tuesday, August 10, 2010
  • Questions Tuesday, August 10, 2010