Progressive NOSQL: Cassandra

Tom Wilkie's talk at the Progressive NOSQL conference in London on 11/05/12.

Published in: Technology, Business


  1. A tunably consistent, highly-available, distributed database. Tom Wilkie (@tom_wilkie), Founder & VP Engineering, Acunu
  2. • Overview • Distribution • Storage • Datamodel • Usecases
  3. • Overview • Distribution • Storage • Datamodel • Usecases
  4. • A distributed database for Big Data • Scale out on commodity servers • Best of breed performance • Multi-master architecture, no SPOF • Powerful multi data centre support
  5. (no text)
  6. BigTable, 2006 • Dynamo, 2007 • Open sourced, 2008 • Incubator, 2009 • TLP, 2010 • v1.0, 2011
  7. BigTable: ... • Simple but powerful datamodel • Write-optimised storage system • Consistent, available but not partition tolerant • Master-slave distribution system, SPOF • http://goo.gl/7T1Ej
  8. Dynamo: ... • Sophisticated distribution system with tradable consistency and availability • Over-simple datamodel • http://goo.gl/Q80b4
  9. • Overview • Distribution • Storage • Datamodel • Usecases
  10. Distribution: Consistent Hashing → r1, c1, v1 → r2, c2, v2 → r3, c3, v3
  11. Distribution: Scaling
  12. Distribution: Scaling
  13. Distribution: Scaling
  14. Distribution: Scaling
  15. Distribution: Scaling
  16. Distribution: Replication → r1, c1, v1
  17. Distribution: Replication
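The consistent hashing and replication slides above are diagram-only, so here is a minimal sketch of the idea in Python. It assumes one token per node, MD5 as the hash function, and replicas placed on the next nodes clockwise around the ring; the `Ring` class and the node names are hypothetical and are not taken from the talk or from Cassandra's code.

```python
import bisect
import hashlib


class Ring:
    """Toy consistent-hash ring: one token per node, replicas on successor nodes."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sort (token, node) pairs so lookups can binary-search the ring.
        self.tokens = sorted((self._token(node), node) for node in nodes)

    @staticmethod
    def _token(value):
        # MD5 is used only as a well-spread hash here (an assumption), not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, row_key):
        """Return the nodes holding row_key: the owner plus its clockwise successors."""
        start = bisect.bisect_left(self.tokens, (self._token(row_key), ""))
        count = min(self.replication_factor, len(self.tokens))
        # The modulo wraps around the end of the sorted list, closing the ring.
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("r1")  # three distinct nodes for row key "r1"
```

Adding a node to such a ring only moves the keys that fall in the new node's token range; every other key keeps its previous owner, which is what makes scaling out incremental.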
  18. Distribution: Consistency • Tuneable, per-operation consistency • Timestamped values • R + W > N for consistent reads (N replicas, R replicas read, W replicas written)
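With N replicas, a read that contacts R of them is guaranteed to see the latest write acknowledged by W of them only when R + W > N, because only then must every read set share at least one replica with every write set. The sketch below checks that overlap property by brute force; the function name `quorums_always_overlap` is hypothetical and exists only for illustration.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """Check exhaustively that every read set of size r meets every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# With three replicas, quorum reads and quorum writes always intersect...
quorum_ok = quorums_always_overlap(3, 2, 2)
# ...while reading one replica and writing one replica can miss each other.
one_one_ok = quorums_always_overlap(3, 1, 1)
```

Choosing R and W per operation is what makes the consistency tuneable: smaller values trade the overlap guarantee for lower latency and higher availability.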
  19. Distribution: Read Repair
  20. Distribution: Read Repair
  21. Distribution: Read Repair
  22. Distribution: Read Repair
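The read repair slides are diagram-only. As a rough sketch of the idea, assuming each replica is modelled as a plain dict mapping a key to a `(timestamp, value)` pair, a read can compare the versions it gets back, return the newest one, and write it to any replica that was stale or missing the key. The function `read_with_repair` is hypothetical and much simpler than what a real node does.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and push it to stale replicas.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    responses = [(replica, replica.get(key)) for replica in replicas]
    versions = [version for _, version in responses if version is not None]
    if not versions:
        return None
    # The highest timestamp wins, matching the "timestamped values" rule.
    newest = max(versions, key=lambda version: version[0])
    for replica, version in responses:
        if version is None or version[0] < newest[0]:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]


node_a = {"r1": (1, "old")}
node_b = {"r1": (2, "new")}
node_c = {}
value = read_with_repair([node_a, node_b, node_c], "r1")
```

After the call all three dicts hold the same `(timestamp, value)` pair for `"r1"`, so later reads agree regardless of which replica they reach.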
  23. • Overview • Distribution • Storage • Datamodel • Usecases
  24. Writing to Cassandra: Row Key • Column • Column • Column • Column
  25. Writing to Cassandra. In the JVM: Memtable (Row, Column, Column, Column, Column). On disk: Commit log
  26. Writing to Cassandra. In the JVM: Full Memtable. On disk: Commit log
  27. Writing to Cassandra. In the JVM: New Memtable. On disk: Commit log, SSTable
  28. Writing to Cassandra. On disk: Commit log, SSTable, SSTable, SSTable, SSTable, SSTable, SSTable
  29. Writing to Cassandra. On disk: Commit log, SSTable
  30. Reading from Cassandra
  31. Reading from Cassandra: (1) Memtable, in the JVM • (2) Row cache, off-heap (no GC) • (3) Bloom filter • (4) Key cache • (5) SSTable index • (6) SSTable, on disk • Commit log, on disk
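The write and read slides above describe a log-structured design: writes go to a commit log and an in-memory memtable, a full memtable is flushed to an immutable SSTable, and SSTables are later merged. The sketch below is a toy, in-memory model of that flow under several assumptions: the `ToyStore` class is hypothetical, the "disk" structures are plain Python lists, the memtable limit is a key count, and the row cache, Bloom filter, and key cache from the read path are left out.

```python
class ToyStore:
    """Toy log-structured store: commit log + memtable + immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.commit_log = []  # append-only; stands in for the on-disk commit log
        self.memtable = {}    # in-memory writes, newest value per key
        self.sstables = []    # immutable sorted tuples of (key, value), oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A full memtable becomes a new sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed writes no longer need replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            for stored_key, value in sstable:
                if stored_key == key:
                    return value
        return None

    def compact(self):
        merged = {}
        for sstable in self.sstables:  # oldest first, so newer values overwrite
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]


store = ToyStore()
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]:
    store.write(key, value)
latest_a = store.read("a")  # served from the newer of the two SSTables
store.compact()             # the two SSTables are merged into one
```

Writes never modify an SSTable in place, which is why the storage is write-optimised; the cost is that a read may have to consult several SSTables until compaction merges them.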
  32. • Overview • Distribution • Storage • Datamodel • Usecases
  33. SQL vs Cassandra: Database ↔ Keyspace • Table ↔ Column Family (rows of row/key, col_1, col_2, ...)
  34. Sparse table with columns col1 to col7 and rows row1 to row7; each row fills only some columns (row1: 3, row2: 5, row3: 5, row4: 4, row5: 4, row6: 1, row7: 3)
  35. alice: { m2: { Sender: bob, Subject: ‘paper!’, ... } } bob: { m1: { Sender: alice, Subject: ‘rock?’, ... } } charlie: { m1: { Sender: alice, Subject: ‘rock?’, ... }, m2: { Sender: bob, Subject: ‘paper!’, ... } }
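The mailbox example above can be read as one row per recipient, keyed by user name, with one entry per message id. A minimal Python sketch of that shape follows, assuming a nested dict stands in for a column family and that each message is copied into every recipient's row; the `deliver` helper is hypothetical.

```python
from collections import defaultdict

# Column family modelled as: row key -> {column name -> value}.
# Rows are sparse: each row holds only the columns written to it.
mailboxes = defaultdict(dict)


def deliver(recipient, message_id, sender, subject):
    """Store a copy of the message in the recipient's row."""
    mailboxes[recipient][message_id] = {"Sender": sender, "Subject": subject}


# m1 is sent by alice to bob and charlie; m2 is sent by bob to alice and charlie.
deliver("bob", "m1", "alice", "rock?")
deliver("charlie", "m1", "alice", "rock?")
deliver("alice", "m2", "bob", "paper!")
deliver("charlie", "m2", "bob", "paper!")

charlie_inbox = sorted(mailboxes["charlie"])  # message ids in charlie's row
```

Storing each message once per recipient means reading an inbox touches a single row, at the cost of duplicating the message on write.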
  36. • Overview • Distribution • Storage • Datamodelling • Usecases
  37. Perfect for high velocity data: Web, SCM, Retail • Location Services • Cloud Monitoring • Social Gaming • Social Media • Ad Marketplaces • Fraud Detection • Smart Metering • Oil/Gas Sensors
  38. Not Covered... • Distribution: Hinted Handoff, Anti-entropy repair, Counter distribution • Storage: Counter storage, different compaction strategies, partitioning etc • Datamodel: de-normalisation, TTLs, secondary indexes, CQL, super-columns, schema optional • Operations: backup, nodetool, performance tuning • Integration: Hadoop, Client Libraries etc
  39. • Distributed, scalable database • Open source, widely used • Tunably consistent • Highly-available • Partition tolerant • Write-optimised • Schema-optional
  40. Data Platform
  41. Data Platform: Data driven applications • Web UI • Acunu Analytics • Control Center • Apache Cassandra • Acunu Storage Engine • Configured and tuned OS • Commodity Hardware
  42. Control Center: “I've had the EC2 instance running for a little while and I have to say, I'm impressed. You guys have done well with this product.” - Lloyd, JustDevelopIt
  43. Control Center: “The new UI has been critical in helping us work out what is wrong in our code” - Matt, TellyBug
  44. Castle: Built for Big Data • Storage engine optimized for large slow disks, many cores, Big Data workloads • Enterprise density on commodity hardware • Lightning disk rebuilds: 10x faster than RAID • Opensource (GPLv2, MIT for user libraries) • http://bitbucket.org/acunu • Loadable Kernel Module, targeting CentOS’s 2.6.18 • http://goo.gl/gzihe • http://www.acunu.com/blogs/andy-twigg/why- • [Architecture diagram: userspace interface (shared memory interface, streaming interface, async shared memory ring, shared buffers); kernelspace interface (insert, get, range queries, buffered insert, buffered get); doubling array mapping layer (Bloom filters, insert queues, arrays, merges, arrays management); modlist btree mapping layer; version tree; block mapping and cacheing layer (cache, prefetcher, flusher); "Extent" layer (extent allocator, freespace manager)]
  45. (no text)
  46. Rebuild time [bar chart: Rebuild Time (Hours), axis 0 to 5, for RAID10 with 8 disks, RAID5 with 8 disks, RDA with 8 disks, RDA with 15 disks]
  47. Analytics: click stream events, sensor data etc → Acunu Analytics → counter updates • Simple, real-time, incremental analytics • Push processing into ingest phase
  48. Questions? tom@acunu.com • @tom_wilkie • www.acunu.com
  49. Introduction: Live & historical aggregates...
  50. Realtime trends...
  51. Drill downs and roll ups
  52. Solution / Con: Scalability $$$ • Not realtime • Inefficient • Recomputation • Spartan query semantics => complex, DIY solutions