Strata west 2012_java_cassandra

Published in: Technology, Education

  1. Building Applications With Apache Cassandra. NATE MCCALL, Sr. Software Developer. CONFIDENTIAL | 1
  2. What We’ll Cover: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  3. Getting Started: Requirements
     – JDK 1.6 or greater
     – Apache Maven 3.0.2 or greater
     – Apache Cassandra 1.0.7 (DataStax community edition): http://www.datastax.com/download/community/versions
     – An IDE such as Eclipse or IntelliJ will be helpful but is not necessary
     – Several thumb drives are available (please share)
     – All source is on GitHub: https://github.com/zznate/strata-west-2012
  4. How We’ll Cover It: Learning by doing
     – Looking at and writing code
     – Examples are constructed explicitly to show off certain concepts
     – Move ahead if it gets slow; just start hacking
     – You must be comfortable writing and debugging software
  5. Getting Down To It: It does not have to be hard.
  6. Getting Down To It
  7. Getting Down To It: It does not have to be mysterious.
  8. Getting Down To It
  9. Getting Down To It: You can leverage a mature language with stable clients against a proven, best-of-breed solution in use in high-traffic production environments right now.
  10. What We’ll Cover: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  11. Scale Out. But Really Though. Best of Breed:
      – Linear scaling
      – Real multi-datacenter support
      – “Fix it on Monday” fault tolerance
  12. Static Column Family
      GOOG   Price=589.55   Name=Google
      APPL   Price=401.76   Name=Apple
      NFLX   Price=78.73    Name=Netflix
      NOK    Price=6.90     Name=Nokia     Exchange=NYSE
      – Schema optional
      – Not all columns are required
  13. Dynamic Column Family
      GOOG   10/25/11=583.16   10/24/11=596.42   10/23/11=590.49
      APPL   10/25/11=397.77   10/24/11=405.77   10/23/11=392.87
      NFLX   10/25/11=77.37    10/24/11=118.14   10/23/11=117.23
      NOK    10/25/11=6.71     10/24/11=6.76     10/23/11=6.61
      – Prematerialized queries
      – Store it how you read it
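
One way to picture the two layouts on slides 12 and 13 is as a map from row key to a sorted map of columns. The sketch below uses plain Java collections rather than any Cassandra client; the class name `ColumnFamilySketch` and its methods are hypothetical, and the dates are written in ISO form (an assumption, not the slide's format) so that sorted order matches chronological order.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory model: row key -> (column name -> value), columns kept sorted by name.
final class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new TreeMap<>();

    // Adds or overwrites one column in one row; rows are created on first write.
    void insert(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns the columns of a row, or an empty map if the row was never written.
    SortedMap<String, String> row(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    public static void main(String[] args) {
        // Static layout: a small, known set of column names; not every row has every column.
        ColumnFamilySketch stocks = new ColumnFamilySketch();
        stocks.insert("GOOG", "Price", "589.55");
        stocks.insert("GOOG", "Name", "Google");
        stocks.insert("NOK", "Price", "6.90");
        stocks.insert("NOK", "Name", "Nokia");
        stocks.insert("NOK", "Exchange", "NYSE");

        // Dynamic layout: the column name itself carries data (here, the trading date).
        ColumnFamilySketch history = new ColumnFamilySketch();
        history.insert("GOOG", "2011-10-25", "583.16");
        history.insert("GOOG", "2011-10-24", "596.42");
        history.insert("GOOG", "2011-10-23", "590.49");

        System.out.println(stocks.row("NOK"));
        System.out.println(history.row("GOOG"));
    }
}
```

Because the columns are stored sorted, reading a date range from the dynamic row is a contiguous scan, which is the point of "store it how you read it".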
  14. The API. Agenda: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  15. Common API Usage: Starting up
      – If you didn’t look beforehand: http://www.datastax.com/docs/1.0/getting_started/index
      – We want to run the Cassandra process in the foreground to see what’s going on:
          cd $CASSANDRA_HOME
          bin/cassandra -f
  16. Common API Usage: DataStax OpsCenter
      – If you are not sure why you should have monitoring, have this running at all times.
      – http://www.datastax.com/docs/opscenter/index
  17. Common API Usage: Static Column Families. See org.apache.tutorial.BasicUsageExample
  18. Common API Usage: Dynamic Column Families. See org.apache.tutorial.TimeseriesInserter
      – A Cassandra row can hold up to 2 billion columns
  19. Common API Usage: Dynamic Column Families. See org.apache.tutorial.TimeseriesIterationQuery
      – Encapsulate paging in iteration for easier traversal of wide rows
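
The idea of hiding paging behind an iterator can be sketched without a running cluster. In the sketch below a `NavigableMap` stands in for a wide row, and each call to `fetchNextPage` plays the role of one slice query that starts after the last column already seen. The class `PagedColumnIterator` and the `pagesFetched` counter are hypothetical and are not taken from the tutorial code.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical sketch: walks a wide row one page at a time behind a plain Iterator.
final class PagedColumnIterator implements Iterator<Map.Entry<String, String>> {
    private final NavigableMap<String, String> row; // stands in for the remote wide row
    private final int pageSize;
    private final Deque<Map.Entry<String, String>> page = new ArrayDeque<>();
    private String lastSeen;   // name of the last column fetched so far
    private boolean exhausted; // true once a page came back short
    int pagesFetched;          // exposed only to make the paging visible

    PagedColumnIterator(NavigableMap<String, String> row, int pageSize) {
        if (pageSize < 1) throw new IllegalArgumentException("pageSize must be >= 1");
        this.row = row;
        this.pageSize = pageSize;
    }

    // Simulates one slice query: up to pageSize columns strictly after lastSeen.
    private void fetchNextPage() {
        NavigableMap<String, String> rest =
                (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        for (Map.Entry<String, String> e : rest.entrySet()) {
            if (page.size() == pageSize) break;
            page.add(new AbstractMap.SimpleImmutableEntry<>(e));
            lastSeen = e.getKey();
        }
        pagesFetched++;
        if (page.size() < pageSize) exhausted = true;
    }

    @Override
    public boolean hasNext() {
        if (page.isEmpty() && !exhausted) fetchNextPage();
        return !page.isEmpty();
    }

    @Override
    public Map.Entry<String, String> next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.poll();
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 7; i++) row.put(String.format("col%02d", i), "v" + i);

        PagedColumnIterator it = new PagedColumnIterator(row, 3);
        while (it.hasNext()) System.out.println(it.next().getKey());
        System.out.println("pages fetched: " + it.pagesFetched);
    }
}
```

The caller only sees `hasNext()` and `next()`; the page size becomes a tuning knob that can change without touching the calling code.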
  20. Common API Usage: Using CQL. See comments in class files as we go
      – Use cqlsh for queries and some administration tasks
      – Caveat: no composite or super column support
  21. Common API Usage: JdbcTemplate. Some compiling required
      – Not quite there on the typing support
      – Pooling library needs work
      – Give this a try if you want: https://github.com/riptano/jdbc-conn-pool
      – Specifically: https://github.com/riptano/jdbc-conn-pool/tree/master/portfolio-example
  22. Common API Usage: JdbcTemplate. Configuration via ResourceRef
  23. Common API Usage: JdbcTemplate. Configuration via Context
  24. Common API Usage: JdbcTemplate. Insertion
  25. Common API Usage: JdbcTemplate. Selection
  26. Storage and On-Disk Structure. Agenda: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  27. Merge-On-Read: Benefits
      – On-disk structure is immutable
      – No read-before-write
      – Highest timestamp wins
      – Delete markers (“tombstones”) thrown out on merge
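
The merge rule on this slide can be shown with a minimal sketch: each SSTable contributes the versions of a row's columns that it holds, the version with the highest timestamp wins, and columns whose winning version is a tombstone are dropped from the result. The `Cell` type and the use of a null value as the tombstone marker are assumptions made for illustration; ties between equal timestamps simply keep the first version seen, which is a simplification.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of merge-on-read for a single row spread over several SSTables.
final class MergeOnReadSketch {
    // One stored version of a column; a null value marks a tombstone (assumption).
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    // Merges the same row as read from several immutable SSTables.
    static Map<String, String> merge(List<Map<String, Cell>> sstables) {
        Map<String, Cell> newest = new TreeMap<>();
        for (Map<String, Cell> sstable : sstables) {
            for (Map.Entry<String, Cell> e : sstable.entrySet()) {
                Cell seen = newest.get(e.getKey());
                // Highest timestamp wins, regardless of which SSTable it came from.
                if (seen == null || e.getValue().timestamp > seen.timestamp) {
                    newest.put(e.getKey(), e.getValue());
                }
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, Cell> e : newest.entrySet()) {
            // Tombstones shadow older values and are then thrown out.
            if (!e.getValue().isTombstone()) result.put(e.getKey(), e.getValue().value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Cell> older = new HashMap<>();
        older.put("Price", new Cell("589.55", 1L));
        older.put("Name", new Cell("Google", 1L));

        Map<String, Cell> newer = new HashMap<>();
        newer.put("Price", new Cell("583.16", 2L)); // overwrite, no read-before-write needed
        newer.put("Name", new Cell(null, 3L));      // delete written as a tombstone

        System.out.println(merge(Arrays.asList(older, newer)));
    }
}
```

Neither input map is modified: a write or delete only ever adds a newer version somewhere else, which is why the on-disk files can stay immutable.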
  28. Compaction: Benefits
      – Merge SSTables
      – Keeps SSTable count down
      – Makes the merge-on-read process more efficient
      – Groups rows into a single SSTable
      – Strategy can vary by workload: size-tiered compaction or leveled compaction
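
As a rough illustration of the size-tiered idea, the sketch below groups SSTables of similar size into buckets and reports the buckets that are large enough to be merged. It is a simplification, not Cassandra's implementation: the class `SizeTieredSketch`, the 0.5x to 1.5x similarity window, and the minimum of four SSTables per bucket are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of size-tiered bucketing; thresholds are illustrative assumptions.
final class SizeTieredSketch {
    static final int MIN_THRESHOLD = 4; // assumed: merge once a bucket holds this many SSTables

    // Groups SSTable sizes so that each bucket holds sizes close to the bucket's average.
    static List<List<Long>> buckets(List<Long> sstableSizes) {
        List<Long> sorted = new ArrayList<>(sstableSizes);
        Collections.sort(sorted);
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sorted) {
            List<Long> last = buckets.isEmpty() ? null : buckets.get(buckets.size() - 1);
            if (last != null && similar(size, average(last))) {
                last.add(size);
            } else {
                List<Long> fresh = new ArrayList<>();
                fresh.add(size);
                buckets.add(fresh);
            }
        }
        return buckets;
    }

    // Assumed similarity window: within 0.5x to 1.5x of the bucket average.
    static boolean similar(long size, double avg) {
        return size >= 0.5 * avg && size <= 1.5 * avg;
    }

    static double average(List<Long> sizes) {
        long sum = 0;
        for (long s : sizes) sum += s;
        return (double) sum / sizes.size();
    }

    // Buckets with enough similarly sized SSTables to be worth merging into one.
    static List<List<Long>> compactionCandidates(List<Long> sstableSizes) {
        List<List<Long>> candidates = new ArrayList<>();
        for (List<Long> bucket : buckets(sstableSizes)) {
            if (bucket.size() >= MIN_THRESHOLD) candidates.add(bucket);
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<Long> sizesInMb = Arrays.asList(10L, 11L, 12L, 9L, 100L, 400L);
        System.out.println("buckets:    " + buckets(sizesInMb));
        System.out.println("candidates: " + compactionCandidates(sizesInMb));
    }
}
```

Merging a candidate bucket produces one larger SSTable, which then competes in the next size tier; that is how the SSTable count stays down as data grows.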
  29. Common API Usage: Indexing Techniques. See org.apache.tutorial.CompositeDataLoader
      – Store a static index in a single row
  30. Common API Usage: Indexing Techniques. See org.apache.tutorial.CompositeQuery
      – Use a slice of composites to narrow in on a query
  31. Common API Usage: Indexing Techniques. See org.apache.tutorial.CompositeQuery
      – Let’s add another level to the composite
  32. Common API Usage: Indexing Techniques. See org.apache.tutorial.CompositeQuery
      – Add a third level to the composite to narrow the search to “cities in California starting with ‘Ag’”
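
The composite slicing described on slides 29 to 32 can be sketched with sorted Java collections. Here a `TreeSet` stands in for the single index row, and a small `Composite` type compares its components left to right, so a range over (country, state, city prefix) selects a contiguous run of columns. The class names, the three-level layout, and the sample cities are assumptions for illustration and are not taken from the tutorial classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch: a static index stored as sorted composite column names in one row.
final class CompositeSliceSketch {
    // Composite column name whose components compare left to right.
    static final class Composite implements Comparable<Composite> {
        final List<String> parts;

        Composite(String... parts) {
            this.parts = Arrays.asList(parts);
        }

        @Override
        public int compareTo(Composite other) {
            int n = Math.min(parts.size(), other.parts.size());
            for (int i = 0; i < n; i++) {
                int c = parts.get(i).compareTo(other.parts.get(i));
                if (c != 0) return c;
            }
            return Integer.compare(parts.size(), other.parts.size());
        }

        @Override
        public String toString() {
            return String.join(":", parts);
        }
    }

    // Slices the index row to the cities of one state whose names start with the prefix.
    static List<String> citiesStartingWith(NavigableSet<Composite> indexRow,
                                           String country, String state, String prefix) {
        Composite start = new Composite(country, state, prefix);
        // Appending the largest char gives an upper bound just past every name with this prefix.
        Composite end = new Composite(country, state, prefix + Character.MAX_VALUE);
        List<String> cities = new ArrayList<>();
        for (Composite c : indexRow.subSet(start, true, end, false)) {
            cities.add(c.parts.get(2));
        }
        return cities;
    }

    public static void main(String[] args) {
        NavigableSet<Composite> indexRow = new TreeSet<>();
        indexRow.add(new Composite("US", "CA", "Agoura Hills"));
        indexRow.add(new Composite("US", "CA", "Aguanga"));
        indexRow.add(new Composite("US", "CA", "Alameda"));
        indexRow.add(new Composite("US", "CA", "Sacramento"));
        indexRow.add(new Composite("US", "TX", "Agua Dulce"));

        System.out.println(citiesStartingWith(indexRow, "US", "CA", "Ag"));
    }
}
```

Each extra component in the composite narrows the slice further without a second lookup, because the sort order already groups matching entries together.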
  33. Common API Usage: Revisiting the Time Series Example. See org.apache.tutorial.BucketingTimeSeriesInserter
      – Uses buckets for granularity
      – Every minute gets a distinct row, e.g. 2012_02_28_13_30
  34. Storage Model: Revisiting the Time Series Example. See org.apache.tutorial.BucketingTimeSeriesQuery
      – More advanced slicing examples
      – Keys can be rebuilt for any time window
      – Keep rows grouped tightly on disk
      – Example query: “I need the 30 minutes between 3 and 4pm for every day last week”
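
Because each minute has a predictable row key, the keys for any window can be computed on the client instead of being looked up. The sketch below rebuilds keys in the `2012_02_28_13_30` shape shown on slide 33; the class `MinuteBucketKeys`, its method names, and the use of `java.time` (Java 8 or later) are assumptions and do not come from the tutorial code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rebuilds per-minute row keys for an arbitrary time window.
final class MinuteBucketKeys {
    // Matches the key shape shown on the slide, e.g. 2012_02_28_13_30.
    private static final DateTimeFormatter KEY_FORMAT =
            DateTimeFormatter.ofPattern("yyyy_MM_dd_HH_mm");

    static String keyFor(LocalDateTime time) {
        return time.format(KEY_FORMAT);
    }

    // Row keys for every minute bucket from start (inclusive) to end (exclusive).
    static List<String> keysBetween(LocalDateTime start, LocalDateTime end) {
        List<String> keys = new ArrayList<>();
        LocalDateTime t = start.withSecond(0).withNano(0); // snap to the bucket containing start
        while (t.isBefore(end)) {
            keys.add(keyFor(t));
            t = t.plusMinutes(1);
        }
        return keys;
    }

    public static void main(String[] args) {
        // The same half-hour window on seven consecutive days.
        LocalDateTime firstDay = LocalDateTime.of(2012, 2, 20, 15, 0);
        List<String> keys = new ArrayList<>();
        for (int day = 0; day < 7; day++) {
            LocalDateTime start = firstDay.plusDays(day);
            keys.addAll(keysBetween(start, start.plusMinutes(30)));
        }
        System.out.println(keys.size() + " row keys, from "
                + keys.get(0) + " to " + keys.get(keys.size() - 1));
    }
}
```

The resulting list of keys is the kind of input a multi-row read would take, so a window such as the one quoted on the slide becomes a key computation followed by a fetch.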
  35. Storage Model: Tombstones. See org.apache.tutorial.TombstoneDemoInserter and TombstoneDemoQuery
  36. Tombstone: Output before deletion
  37. Tombstone: Output after deletion
  38. Understanding the Ring and Consistency. Agenda: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  39. The Ring: Token Distribution. Distributed Hashing
      – Lexicographically similar keys are hashed to (very) different token values
      – Provides for shared knowledge of key location
      – The actual token range is from 0 to 2^127
      – The token is created by converting an MD5 hash of the key to a java.lang.BigInteger
      (Diagram: key “fon” at token 0, key “foo” at token 100)
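
The key-to-token conversion on this slide can be sketched with the JDK alone. The version below hashes the key's UTF-8 bytes with MD5 and takes the absolute value of the resulting signed `BigInteger` so the token is never negative; the class name `TokenSketch`, the UTF-8 encoding, and the `abs()` step are assumptions about the details, not a copy of Cassandra's partitioner.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: derives a non-negative token from the MD5 hash of a row key.
final class TokenSketch {
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // The 16 hash bytes are read as a signed number; abs() keeps the token non-negative.
            return new BigInteger(hash).abs();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform is required to provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Keys that differ by one letter land far apart on the ring.
        System.out.println("fon -> " + token("fon"));
        System.out.println("foo -> " + token("foo"));
    }
}
```

Since the token depends only on the key, every client and node that applies the same function agrees on where a key lives without consulting a directory.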
  40. The Ring: Token Distribution as a Ring. Wrapping Ranges
      – The next token after the highest possible value is the lowest possible value.
      (Diagram labels: 100, 1, foo, fon)
  41. The Ring: 4 Node Token Distribution. Simplified Ring Example
      – Nodes distribute ownership via token ranges
      – A node owns its token and the range immediately before
      – Nodes continuously “gossip” ring ownership
      – Any node can act as a coordinator to service requests for any other node
      (Diagram: Node 1 at token 0, Node 2 at token 25, Node 3 at token 50, Node 4 at token 75, with key “foo” placed on the ring)
  42. The Ring: Inclusive token ranges for a four node cluster
      Node     Initial Token   First Token   Last Token
      Node 1   0               76            0
      Node 2   25              1             25
      Node 3   50              26            50
      Node 4   75              51            75
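
The ownership table can be reproduced with a sorted map: the owner of a key's token is the node with the smallest token greater than or equal to it, wrapping around to the lowest token when none exists. This sketch uses the simplified 0 to 100 token space from the slides; the class `RingSketch` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of ring ownership lookup in the simplified 0..100 token space.
final class RingSketch {
    private final TreeMap<Integer, String> nodesByToken = new TreeMap<>();

    void addNode(String name, int token) {
        nodesByToken.put(token, name);
    }

    // A node owns its own token and the range immediately before it.
    String ownerOf(int keyToken) {
        if (nodesByToken.isEmpty()) throw new IllegalStateException("ring has no nodes");
        Map.Entry<Integer, String> owner = nodesByToken.ceilingEntry(keyToken);
        // Past the highest node token, ownership wraps to the node with the lowest token.
        return (owner != null ? owner : nodesByToken.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        ring.addNode("Node 1", 0);
        ring.addNode("Node 2", 25);
        ring.addNode("Node 3", 50);
        ring.addNode("Node 4", 75);

        for (int token : new int[] {0, 1, 25, 26, 75, 76, 100}) {
            System.out.println("token " + token + " is owned by " + ring.ownerOf(token));
        }
    }
}
```

Tokens 76 through 100 fall past Node 4 and wrap to Node 1, which matches the first row of the table.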
  43. Integrating with Web Applications. Agenda: Cassandra Basics, Common API Usage, Storage Model, Ring Overview, Web Application Integration
  44. Web Application Integration: Using Spring. AccountController and AccountDao
      – Similar to JDBC example for wiring
  45. Web Application Integration: Probably as far as we’ll get…
      – DataStax Documentation: http://www.datastax.com/docs/1.0/index
      – Apache Cassandra project wiki: http://wiki.apache.org/cassandra/
      – “The Dynamo Paper”: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
      – P. Helland. Building on Quicksand: http://arxiv.org/pdf/0909.1788
      – P. Helland. Life Beyond Distributed Transactions: http://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf
      – “The Megastore Paper”: http://research.google.com/pubs/archive/36971.pdf
      – The Hector Client: http://hector-client.org
  46. Web Application Integration: Developer Resources
      – CQL Documentation: http://www.datastax.com/docs/1.0/dml/using_cql
      – Hector Documentation: http://hector-client.org
      – Cassandra Maven Plugin: http://mojo.codehaus.org/cassandra-maven-plugin/
      – CCM localhost cassandra cluster: https://github.com/pcmanus/ccm
      – OpsCenter: http://www.datastax.com/products/opscenter
      – Cassandra AMIs: https://github.com/riptano/CassandraClusterAMI
      – Cassandra Launcher: https://github.com/joaquincasares/cassandralauncher
