
Cassandra and IoT

Slides from a talk I gave at PubNub's IoT Stream Conference

Published in: Software

  1. Company Confidential. © 2015 Aeris Communications, Inc. All Rights Reserved. © 2015 DataStax, All Rights Reserved. “ideal to store time series data” “Apache Cassandra has never failed us.”
  2. Performance, Availability, Scale
  3. Startup Program, ToastrBox, Analytics, Search, In-memory, Visual Admin, Security, Certified Cassandra
  4. IoT requires performance and reliability
  5. IoT requires performance and reliability. Your System: Send Heating Coil Repair Man
  6. IoT requires performance and reliability. Your System: Send Heating Coil Repair Man; Send 10% Off Bread Coupon
  7. IoT requires performance and reliability. Your System: Send Heating Coil Repair Man; Send 10% Off Bread Coupon; Offer Upgrade Suggestions
  8. IoT requires performance and reliability. Your System: Send Heating Coil Repair Man; Send 10% Off Bread Coupon; Offer Upgrade Suggestions; Integrate with your SaaS (Spread as a Service)
  9. IoT requires performance and reliability. Your System: FAULT. App Down, Customers Lose Interest
  10. IoT requires performance and reliability. Your System: SLOW
 Send Heating Coil Repair Man: three months after they get a competitor's toaster
 Offer Upgrade Suggestions: that are already out of date
 Send 10% Off Bread Coupon: they've already restocked on bread
 Integrate with your SaaS (Spread as a Service): toast got spread a long time ago
  11. IoT requires performance and reliability. Send Heating Coil Repair Man; Send 10% Off Bread Coupon; Offer Upgrade Suggestions; Integrate with your SaaS (Spread as a Service)
  12. Scale [Chart: cluster throughput vs. number of nodes, in writes per second per the linked Netflix benchmark] 50 nodes: 174,373; 100 nodes: 366,828; 150 nodes: 537,172; 300 nodes: 1,099,837. Source: http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
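As a quick check of the linear-scaling claim behind the chart on slide 12, each data point can be divided by its node count. This is a minimal sketch: the pairing of each figure with a node-count label is read off the slide text, and `ScaleCheck` and `perNode` are names made up for illustration.

```java
// Per-node throughput for the four data points on the Scale slide.
// Assumption: each throughput figure belongs to the node-count label beside it.
class ScaleCheck {
    // Average throughput contributed by a single node.
    static double perNode(long clusterThroughput, int nodes) {
        return (double) clusterThroughput / nodes;
    }

    public static void main(String[] args) {
        int[] nodes = {50, 100, 150, 300};
        long[] throughput = {174_373L, 366_828L, 537_172L, 1_099_837L};
        for (int i = 0; i < nodes.length; i++) {
            System.out.printf("%d nodes: %.0f per node%n",
                    nodes[i], perNode(throughput[i], nodes[i]));
        }
    }
}
```

Under that pairing, per-node throughput stays at roughly 3,500 to 3,700 for every cluster size, which is what near-linear horizontal scaling looks like.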
  13. Horizontal scale. Token Range. Mapping Data To Nodes. Ring Architecture. Peer to Peer Communication. No Masters, No Slaves. [Ring diagram: nodes A, B]
  14. Horizontal scale. Token Range. Mapping Data To Nodes. Ring Architecture. Peer to Peer Communication. No Masters, No Slaves. [Ring diagram: nodes A, B, C, D]
  15. Availability. "During Hurricane Sandy, we lost an entire data center. Completely. Lost. It. Our data in Cassandra never went offline."
  16. Peer-to-peer architecture. Client has a holistic view. Cluster cluster = Cluster.builder().addContactPoint("192.168.0.1").build(); [Diagram: client and a Cassandra cluster of nodes A, B, C, D]
  17. Client has a holistic view. Partition keys are hashed to a token range (DeviceID: 102349). Data responsibility is divided across the cluster. [Diagram: client and nodes A, B, C, D]
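The hashing step on slide 17 can be sketched as a sorted map from token to node. This is an illustration only, not the driver's or the server's code: `TokenRing`, the four token values, and the MD5-based `tokenFor` are assumptions (Cassandra's default partitioner uses Murmur3, which is not in the Java standard library).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical token ring: each node owns the tokens up to and including its own.
class TokenRing {
    private final TreeMap<Long, String> nodeByToken = new TreeMap<>();

    void addNode(String name, long token) {
        nodeByToken.put(token, name);
    }

    // Stand-in hash: the first 8 bytes of an MD5 digest, read as a signed long.
    static long tokenFor(String partitionKey) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            long token = 0;
            for (int i = 0; i < 8; i++) {
                token = (token << 8) | (digest[i] & 0xff);
            }
            return token;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required on every Java platform", e);
        }
    }

    // The owner is the first node at or after the token, wrapping around the ring.
    String ownerOfToken(long token) {
        Map.Entry<Long, String> owner = nodeByToken.ceilingEntry(token);
        if (owner == null) {
            owner = nodeByToken.firstEntry();
        }
        return owner.getValue();
    }

    String ownerOf(String partitionKey) {
        return ownerOfToken(tokenFor(partitionKey));
    }

    public static void main(String[] args) {
        TokenRing ring = new TokenRing();
        ring.addNode("A", -6_000_000_000_000_000_000L);
        ring.addNode("B", -2_000_000_000_000_000_000L);
        ring.addNode("C", 2_000_000_000_000_000_000L);
        ring.addNode("D", 6_000_000_000_000_000_000L);
        System.out.println("DeviceID 102349 is owned by node " + ring.ownerOf("102349"));
    }
}
```

Because every client can compute the same token for the same key, any client can work out which node is responsible for a device without asking a master.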
  18. Controlling fault tolerance: replication factor. Server - Replication: How many copies of the data should exist in the cluster? ReplicationFactor=3. Replication strategies can span data centers! Survive a whole AWS region failure! [Diagram: client and nodes A, B, C, D, each holding three of the four token ranges: ACD, ABC, ABD, BCD]
  19. Controlling fault tolerance: replication factor. Server - Replication: How many copies of the data should exist in the cluster? ReplicationFactor=3. [Diagram: data centers US-West and US-East, each with nodes holding ACD, ABC, ABD, BCD]
  20. Controlling fault tolerance: replication factor. Server - Replication: How many copies of the data should exist in the cluster? ReplicationFactor=3. [Diagram: Cassandra cluster with data centers US-West and US-East, each with nodes holding ACD, ABC, ABD, BCD]
  21. Controlling fault tolerance: replication factor. Server - Replication: How many copies of the data should exist in the cluster? ReplicationFactor=3. [Diagram: data centers US-West and US-East, each with nodes holding ACD, ABC, ABD, BCD]
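The node labels on the replication slides (ACD, ABD, ABC, BCD) fall out of a simple placement rule: the range owned by a node is also copied to the next nodes clockwise on the ring until ReplicationFactor copies exist. The sketch below assumes that single-data-center rule; `ReplicaPlacement` and its methods are hypothetical names, and placement across data centers involves more than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: "owner plus the next RF-1 nodes clockwise" placement.
class ReplicaPlacement {
    // Nodes that store the range owned by ring.get(ownerIndex).
    static List<String> replicasFor(List<String> ring, int ownerIndex, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        int copies = Math.min(replicationFactor, ring.size());
        for (int i = 0; i < copies; i++) {
            replicas.add(ring.get((ownerIndex + i) % ring.size()));
        }
        return replicas;
    }

    // Concatenated names of every range a given node ends up storing.
    static String rangesHeldBy(List<String> ring, String node, int replicationFactor) {
        StringBuilder held = new StringBuilder();
        for (int owner = 0; owner < ring.size(); owner++) {
            if (replicasFor(ring, owner, replicationFactor).contains(node)) {
                held.append(ring.get(owner));
            }
        }
        return held.toString();
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C", "D");
        for (String node : ring) {
            System.out.println("Node " + node + " holds ranges " + rangesHeldBy(ring, node, 3));
        }
    }
}
```

With four nodes and ReplicationFactor=3, every node stores three of the four ranges, so any single node can be lost without losing data.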
  22. Controlling fault tolerance: tunable consistency. Client - Consistency Level: How many replicas should we check before acknowledgement? CL = One. Client: "Successful Toast Made!" [Diagram: nodes A, B, C, D holding ACD, ABC, ABD, BCD in two data centers]
  23. Controlling fault tolerance: tunable consistency. Client - Consistency Level: How many replicas should we check before acknowledgement? CL = Quorum. Client: "Toaster Burst Into Flames!" Higher consistency levels let us make sure events are persisted. [Diagram: nodes A, B, C, D holding ACD, ABC, ABD, BCD in two data centers]
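How many acknowledgements each consistency level waits for is simple arithmetic. The sketch below assumes the usual definition of a quorum as a majority of replicas, floor(RF / 2) + 1; the class and method names are illustrative, and the levels are modeled as plain strings rather than the driver's own types.

```java
// Hypothetical helper: acknowledgements required per consistency level.
class ConsistencyLevels {
    // A quorum is a majority of the replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    static int requiredAcks(String level, int replicationFactor) {
        switch (level) {
            case "ONE":
                return 1;
            case "QUORUM":
                return quorum(replicationFactor);
            case "ALL":
                return replicationFactor;
            default:
                throw new IllegalArgumentException("Level not modeled in this sketch: " + level);
        }
    }

    // A write is acknowledged once enough replicas have responded.
    static boolean writeSucceeds(String level, int replicationFactor, int replicasResponding) {
        return replicasResponding >= requiredAcks(level, replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3;
        System.out.println("ONE needs " + requiredAcks("ONE", rf) + " ack(s)");
        System.out.println("QUORUM needs " + requiredAcks("QUORUM", rf) + " ack(s)");
        System.out.println("QUORUM with one replica up: " + writeSucceeds("QUORUM", rf, 1));
    }
}
```

The trade-off mirrors the two slides: a routine "toast made" event can be written at ONE and stay available with a single replica up, while a critical event can demand QUORUM so that a majority of replicas have it before the client moves on.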
  24. Performance [Chart: y-axis 0 to 160,000; x-axis 1, 2, 4, 8] Source: http://www.datastax.com/apache-cassandra-leads-nosql-benchmark
  25. Unparalleled durable performance [Diagram labels: Memory, Commit Log, Memtable, Disk]
  26. Unparalleled durable performance [Diagram labels: Memory, Commit Log, Memtable, Disk, SSTable; memtables are flushed to SSTables]
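The write path in these two diagrams can be sketched in a few lines: append to the commit log, update the in-memory memtable, and flush the memtable to an immutable SSTable once it is large enough. This is a toy model under stated assumptions: `WritePath`, the row-count flush threshold, and the in-memory lists standing in for files on disk are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the write path: commit log append, memtable update, flush to SSTable.
class WritePath {
    final List<String> commitLog = new ArrayList<>();                 // append-only, for durability
    TreeMap<String, String> memtable = new TreeMap<>();               // sorted, in memory
    final List<TreeMap<String, String>> sstables = new ArrayList<>(); // sorted, never modified
    private final int flushThreshold;                                 // assumption: flush by row count

    WritePath(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // sequential append, no seek and no read before write
        memtable.put(key, value);
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    // Hand the full memtable over as a new SSTable and start an empty one.
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(memtable);
        memtable = new TreeMap<>();
        commitLog.clear(); // the flushed data no longer needs replaying after a crash
    }

    public static void main(String[] args) {
        WritePath node = new WritePath(2);
        node.write("t1", "Idle");
        node.write("t2", "Toasting");
        node.write("t3", "Toast Success!");
        System.out.println("SSTables: " + node.sstables.size() + ", memtable rows: " + node.memtable.size());
    }
}
```

Nothing on this path reads from disk or rewrites existing files, which is why writes stay fast while still being durable.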
  27. Reading data is fast but limited by disk IO [Diagram labels: Memory, Commit Log, Memtable, Disk, SSTable, Flushed, Replica]
  28. Reading data is fast but limited by disk IO [Diagram labels: Memory, Commit Log, Memtable, Disk, SSTable, Flushed, Replica]
  29. Reading data is fast but limited by disk IO [Diagram labels: Memory, Commit Log, Memtable, Disk, SSTable, Flushed, Replica, LWW (last write wins)]
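The LWW label on slide 29 stands for last write wins: a read may find versions of the same data in the memtable and in several SSTables, and the version with the newest write timestamp is returned. The sketch below is a simplified model of that merge; `ReadPath`, `Cell`, and the use of plain maps for the memtable and SSTables are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified read path: check every source and keep the newest version.
class ReadPath {
    // A stored value together with the timestamp of the write that produced it.
    static final class Cell {
        final String value;
        final long writeTimestamp;

        Cell(String value, long writeTimestamp) {
            this.value = value;
            this.writeTimestamp = writeTimestamp;
        }
    }

    // sources = the memtable plus every SSTable that may contain the key.
    static Cell read(String key, List<Map<String, Cell>> sources) {
        Cell newest = null;
        for (Map<String, Cell> source : sources) {
            Cell candidate = source.get(key);
            if (candidate != null
                    && (newest == null || candidate.writeTimestamp > newest.writeTimestamp)) {
                newest = candidate; // last write wins
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        Map<String, Cell> olderSSTable = new HashMap<>();
        olderSSTable.put("toaster-1", new Cell("Idle", 100L));
        Map<String, Cell> newerSSTable = new HashMap<>();
        newerSSTable.put("toaster-1", new Cell("Toasting", 200L));
        Map<String, Cell> memtable = new HashMap<>();

        List<Map<String, Cell>> sources = new ArrayList<>();
        sources.add(memtable);
        sources.add(olderSSTable);
        sources.add(newerSSTable);
        System.out.println("Current state: " + read("toaster-1", sources).value);
    }
}
```

Each SSTable consulted can cost a disk read, which is the sense in which reads are fast but bounded by disk IO.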
  30. Data modeling for time series. Things Generating Events
  31. Data modeling for time series. Things Generating Events. Store events ordered by TimeUUID: t1 t2 t3 t4 t5 t6 t7 t8 t9
  32. Data modeling for time series. Things Generating Events. Store events ordered by TimeUUID: t1 t2 t3 t4 t5 t6 t7 t8 t9. SSTables: t1 to t10, t11 to t20. Data ends up stored sequentially by time on disk. Additional tables hold rollups, aggregates, etc. With data stored sequentially by time, time-based queries become extremely fast!
  33. Cassandra data modeling. CREATE TABLE example ( toasterID UUID, eventTime TIMEUUID, event TEXT, PRIMARY KEY (toasterID, eventTime) ); Partition key: toasterID. Clustering key: eventTime. Whole partition available on each replica. Data ordered within the partition by clustering key.
 Example partition: 12:00 Idle, 12:01 Toasting, 12:02 Toasting, 12:03 Toast Success!, 12:04 Idle. Stored as multiple SSTables, each internally ordered. Easy to search ranges of the clustering key. Difficult to search ranges of the partition key.
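Why clustering-key ranges are easy and partition-key ranges are hard can be seen in a small in-memory model: a hash map of partitions, each holding rows sorted by clustering key. The sketch is an analogy, not Cassandra's storage engine; `ToasterEvents` is a hypothetical class, and it uses a `String` toaster ID and minutes since midnight in place of the table's UUID and TIMEUUID columns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory analogy: partition key -> rows kept sorted by clustering key.
class ToasterEvents {
    private final Map<String, NavigableMap<Long, String>> partitions = new HashMap<>();

    void insert(String toasterId, long eventTime, String event) {
        partitions.computeIfAbsent(toasterId, id -> new TreeMap<>()).put(eventTime, event);
    }

    // Range query on the clustering key within one partition (bounds inclusive).
    // One lookup finds the partition, then the sorted rows are sliced.
    SortedMap<Long, String> eventsBetween(String toasterId, long from, long to) {
        NavigableMap<Long, String> rows = partitions.get(toasterId);
        if (rows == null) {
            return new TreeMap<>();
        }
        return rows.subMap(from, true, to, true);
    }

    public static void main(String[] args) {
        ToasterEvents table = new ToasterEvents();
        // Minutes since midnight: 720 is 12:00, 724 is 12:04.
        table.insert("toaster-1", 720, "Idle");
        table.insert("toaster-1", 721, "Toasting");
        table.insert("toaster-1", 722, "Toasting");
        table.insert("toaster-1", 723, "Toast Success!");
        table.insert("toaster-1", 724, "Idle");
        System.out.println(table.eventsBetween("toaster-1", 721, 723));
    }
}
```

The outer map is unordered, just as partitions are scattered around the ring by hash, so a range over toaster IDs would have to visit every partition; the inner map is ordered, so a time range is a contiguous slice.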
  34. DataStax Spark-Cassandra connector [Diagram labels: Receiver, DStream, Events, Batch, RDD] https://github.com/datastax/spark-cassandra-connector
  35. Streaming data direct to Cassandra. It's easier than ever to connect your incoming event data with Cassandra
  36. Start free Apache Cassandra training at DataStax Academy
