Apache Cassandra in the Real World

Given at the NoSQL Roadshow in London.

Published in: Technology

  1. Apache Cassandra in the Real World
     Jeremy Hanna, Support Engineer
     ©2013 DataStax Confidential. Do not distribute without consent.
  2. Cassandra Design
     • Massive scalability
     • Multi-datacenter
     • High performance
     • Reliability/availability
     • No SPOF, no special roles
  3. Multi-DC Replication
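
As a rough illustration of what multi-datacenter replication means for reads and writes, the sketch below works out quorum sizes for the replication factors used in the keyspace example on the CQL3 slide ('eu': 3, 'us-east': 2). The helper name `quorum` and the variable names are made up for this example; the formula (a majority of replicas, floor(n / 2) + 1) is the standard quorum definition.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas`: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


# Replication factor per datacenter, taken from the CREATE KEYSPACE example.
replication = {"eu": 3, "us-east": 2}

# A cluster-wide quorum counts replicas across all datacenters.
total_replicas = sum(replication.values())
cluster_quorum = quorum(total_replicas)

# A local quorum only counts replicas in the coordinator's datacenter.
local_quorums = {dc: quorum(rf) for dc, rf in replication.items()}

print(total_replicas, cluster_quorum, local_quorums)
```

With these numbers a cluster-wide quorum needs 3 of 5 replicas, while a local quorum needs 2 replicas in either datacenter. Note that with only 2 replicas in 'us-east', a local quorum there requires both of them to respond.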
  4. Ops Friendly
     • Simple design
     • No special role, no single point of failure
     • Lots of exposed metrics via JMX
     • Nodes and entire datacenters can go down with no loss of service
     • DataStax OpsCenter
     • Visual monitoring tool
     • REST interface to metric data
     • Free version
     • Hands-off services
  5. Developer friendly
     • CQL3
     • Collections (Set, Map, List)
     • Cassandra native drivers
     • Native paging
     • Tracing
     • DataStax DevCenter tool
     • Atomic batches
     • Lightweight transactions
     • Triggers
  6. CQL3 examples
     CREATE USER bombadil WITH PASSWORD 'goldberry4ever' SUPERUSER;
     CREATE KEYSPACE shire WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'eu': 3, 'us-east': 2};
     GRANT ALTER ON KEYSPACE shire TO gandalf;
     SELECT * FROM emp WHERE empID IN (130,104) ORDER BY deptID DESC;
     INSERT INTO excelsior.clicks (userid, url, date, name)
       VALUES (3715e600-2eb0-11e2-81c1-0800200c9a66, 'http://cassandra.apache.org', '2013-10-09', 'Mary')
       USING TTL 86400;
     UPDATE users SET email = 'charlie@wonka.com' WHERE login = 'cbucket64' IF email = 'cbucket@wonka.com';
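
The final UPDATE above is a lightweight transaction: the write is applied only if the IF condition holds. The sketch below models just that compare-and-set behavior in a single Python process. The function `update_if` and the in-memory `users` dict are hypothetical stand-ins; Cassandra itself coordinates the check across replicas (using Paxos), which this sketch does not attempt to show.

```python
def update_if(rows, key, column, expected, new_value):
    """Set rows[key][column] to new_value only if it currently equals expected.

    Returns True when the update was applied, False otherwise.
    """
    row = rows.get(key)
    if row is None or row.get(column) != expected:
        return False
    row[column] = new_value
    return True


# Hypothetical in-memory stand-in for the users table.
users = {"cbucket64": {"email": "cbucket@wonka.com"}}

# First attempt: the condition matches, so the email changes.
first = update_if(users, "cbucket64", "email",
                  "cbucket@wonka.com", "charlie@wonka.com")

# Second attempt with the same condition: the email has already
# changed, so the condition fails and nothing is written.
second = update_if(users, "cbucket64", "email",
                   "cbucket@wonka.com", "charlie@wonka.com")

print(first, second, users["cbucket64"]["email"])
```

The second call failing is the point: two clients racing with the same expected value cannot both succeed.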
  7. Some C* Users
  8. Netflix
     • 50 clusters, 750 nodes
     • Nearly all data in Cassandra
     • Film metadata
     • User ratings
     • Recommendations
     • Interesting use case because:
     • Sheer size and how much they depend on it
     • Multi-region (effectively multi-datacenter) within AWS
     • Highly available (through various AWS outages)
     See also: http://planetcassandra.org/blog/post/case-study-netflix
  9. La Poste
     • Use case: parcel distribution metadata
     • From MySQL to Cassandra
     • Holiday load doubles
     • 4 million parcels/day
     • Average day for one of 70,000 postmen
     • Scan parcels
     • Print parcel list
     • Deliver parcels
     • Scans remaining, held up to 15 days (TTL)
     See also: http://www.slideshare.net/planetcassandra/c-summit-eu-2013-delivering-christmas-gifts-in-france-since-2012
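
To make the 15-day TTL concrete, here is a minimal sketch of time-to-live behavior: a value written with a TTL stops being readable once that many seconds have passed. The class `ExpiringStore`, its method names, and the key "parcel-1" are invented for this illustration, and the clock is passed in explicitly so the behavior is easy to follow. It is a toy model, not how Cassandra stores or purges expired data.

```python
SECONDS_PER_DAY = 86_400
SCAN_TTL = 15 * SECONDS_PER_DAY  # 15 days expressed in seconds


class ExpiringStore:
    """Toy key-value store where each entry carries an expiry time."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value, ttl, now):
        # Remember the value together with the moment it expires.
        self._rows[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._rows.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        # Entries at or past their expiry time are treated as gone.
        return None if now >= expires_at else value


store = ExpiringStore()
store.put("parcel-1", "scanned", ttl=SCAN_TTL, now=0)

after_14_days = store.get("parcel-1", now=14 * SECONDS_PER_DAY)
after_16_days = store.get("parcel-1", now=16 * SECONDS_PER_DAY)
print(SCAN_TTL, after_14_days, after_16_days)
```

In CQL the same 15 days would be written as a TTL in seconds (1296000), in the same way the INSERT example on the CQL3 slide uses 86400 for one day.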
  10. Rackspace
     • Use case: multi-tenant cloud monitoring services
     • Common time series use case
     • Raw metric data at varying intervals
     • Raw data expires using TTLs
     • Supports
     • Ingestion through modular sources
     • Rollups
     • Servicing queries at various resolutions
     • Currently ingests 120 million metrics/hour
     • See Blueflood.io for project details
     See also: http://www.slideshare.net/gdusbabek/blueflood-open-source-metrics-processing-at-cassandraeu-2013
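
The rollup idea mentioned on this slide can be sketched as follows: raw (timestamp, value) points are grouped into fixed-width time buckets and each bucket is summarized, so queries at a coarser resolution read far fewer rows. The function `rollup`, the 300-second bucket width, and the sample data are assumptions made for this illustration; this is not Blueflood's implementation.

```python
from collections import defaultdict


def rollup(points, bucket_seconds):
    """Summarize (timestamp, value) points per fixed-width time bucket.

    Returns a dict mapping each bucket's start timestamp to a summary
    with min, max, avg, and count.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align the timestamp down to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical raw samples: (seconds since start, metric value).
raw = [(0, 1.0), (30, 3.0), (299, 2.0), (300, 10.0), (330, 20.0)]
five_minute = rollup(raw, 300)
print(five_minute)
```

For scale, 120 million metrics per hour averages out to roughly 33,000 metrics per second (120,000,000 / 3,600).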
  11. Spotify
     • Use case began with playlist storage
     • Grew significantly beyond that
     • Some playlist details
     • Essentially a version control system
     • More than 1 billion playlists
     • >40,000 requests/second at peak
     • Offline mode (both access and changes)
     • Concurrent changes
     See also: http://www.slideshare.net/planetcassandra/c-summit-eu-2013-playlists-at-spotify-using-cassandra-to-store-version-controlled-objects
  12. Questions?
     • @jeromatron on Twitter and #cassandra on IRC
     • More real world cases
     • http://planetcassandra.org/FiveMinuteInterviews
     • DataStax
     • Free online training
     • Free developer tools
