Using Apache Cassandra: What is this thing, and how do I use it?

This is the presentation I gave at the Reflections | Projections conference at UIUC (http://www.acm.uiuc.edu/conference/2013/). It introduces the basics of Apache Cassandra and shows how to get it up and running on your development machine. It then uses the DataStax Python Driver and the Cassandra Query Language (CQL) to create tables, write data to them, and read it back out.

Published in: Technology

Transcript

  • 1. Using Apache Cassandra for Big Data: What is this thing, and how do I use it? Jeremiah Jordan, Lead Software Engineer/Support, @zanson. ©2013 DataStax. Do not distribute without consent. Monday, October 14, 2013
  • 2. Who I am
      • Jeremiah Jordan
      • Lead Software Engineer in Support at DataStax
      • Previously Senior Architect at Morningstar, Inc.
      • Using Cassandra since 0.6
      • Before that, wrote code for the F22
  • 3. Cassandra - An introduction
  • 4. Cassandra - Intro
      • Based on Amazon Dynamo and Google BigTable papers
      • Shared nothing
      • Distributed
      • Data safe as possible
      • Predictable scaling
  • 5. Cassandra - More than one server
      • All nodes participate in a cluster
      • Shared nothing
      • Add or remove as needed
      • More capacity? Add a server
      • Each node owns a number of tokens
      • Tokens denote a range of keys
      • 4 nodes? -> Key range/4
      • Each node owns 1/4 the data
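The token ownership described on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names are hypothetical, the ring size is loosely modeled on a 2^127 token space, and the MD5-based `token_for` helper is a stand-in, not Cassandra's actual partitioner code.

```python
from bisect import bisect_left
from hashlib import md5

# Hypothetical 4-node ring. Each node owns the keys whose token falls in
# the range (previous node's token, its own token].
RING_SIZE = 2 ** 127  # assumed token space, for illustration only
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names
TOKENS = [(i + 1) * RING_SIZE // len(NODES) for i in range(len(NODES))]


def token_for(key):
    """Hash a key onto the ring (illustrative, not Cassandra's exact hashing)."""
    return int(md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def owner_of(key):
    """Return the node whose token range contains the key's token."""
    index = bisect_left(TOKENS, token_for(key))
    return NODES[index % len(NODES)]


# Count how many of 10,000 sample keys land on each node.
counts = {node: 0 for node in NODES}
for i in range(10000):
    counts[owner_of("key%d" % i)] += 1
print(counts)
```

With evenly spaced tokens, each of the four nodes should end up owning roughly a quarter of the sample keys, which is the "Key range/4" point from the slide.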
  • 6. Cassandra - Locally Distributed
      • Client writes to any node
      • Node coordinates with others
      • Data replicated in parallel
      • Replication factor (RF): How many copies of your data?
      • RF = 3 here: each of the 4 nodes stores 3/4 of the cluster's total data
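The "3/4 of the data" figure from slide 6 can be checked with a small sketch. It assumes a hypothetical 4-node ring and a SimpleStrategy-style placement, where each range is stored on its owning node plus the next RF - 1 nodes clockwise; the function names are made up for this example.

```python
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def replicas_for(primary_index, replication_factor, nodes=NODES):
    """The owning node plus the next RF - 1 nodes clockwise on the ring."""
    return [nodes[(primary_index + i) % len(nodes)]
            for i in range(replication_factor)]


def fraction_stored_per_node(replication_factor, nodes=NODES):
    """Fraction of the token ranges that each node holds a copy of."""
    held = {node: 0 for node in nodes}
    for primary_index in range(len(nodes)):
        for node in replicas_for(primary_index, replication_factor, nodes):
            held[node] += 1
    return {node: count / len(nodes) for node, count in held.items()}


print(fraction_stored_per_node(3))  # every node maps to 0.75
```

With RF = 1 the same function gives 0.25 per node, matching the "each node owns 1/4 the data" case.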
  • 7. Cassandra - Geographically Distributed
      • Client writes local
      • Data syncs across WAN
      • Replication Factor per DC
      • (Diagram label: Single coordinator)
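A replication factor per data center is set on the keyspace using the NetworkTopologyStrategy replication class. The helper below is a minimal sketch that only builds the CQL string; the function name and the data center names `DC1` and `DC2` are assumptions for illustration, and real names have to match what your cluster reports.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with one replication factor per DC."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, rf in sorted(replicas_per_dc.items()):
        options.append("'%s': '%d'" % (dc_name, rf))
    return "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = { %s }" % (
        keyspace, ", ".join(options))


# Hypothetical data center names: 3 copies in DC1, 2 copies in DC2.
print(create_keyspace_cql("testkeyspace", {"DC1": 3, "DC2": 2}))
```

The resulting string could be passed to `session.execute()` in place of a SimpleStrategy statement.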
  • 8. Cassandra - Consistency
      • Consistency Level (CL)
      • Client specifies per read or write
      • ALL = All replicas ack
      • QUORUM = A majority (more than 50%) of replicas ack
      • LOCAL_QUORUM = A majority (more than 50%) of replicas in the local DC ack
      • ONE = Only one replica acks
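As a rough sketch of what these levels mean in numbers: a quorum is a strict majority of the replicas, that is floor(RF / 2) + 1. The function name below is made up for this example, and for LOCAL_QUORUM the replication factor passed in is assumed to be that of the local data center.

```python
def required_acks(consistency_level, replication_factor):
    """Number of replica acknowledgements the coordinator waits for.

    For LOCAL_QUORUM, pass the replication factor of the local data center.
    """
    if consistency_level == "ALL":
        return replication_factor
    if consistency_level in ("QUORUM", "LOCAL_QUORUM"):
        return replication_factor // 2 + 1  # strict majority
    if consistency_level == "ONE":
        return 1
    raise ValueError("unsupported consistency level: %r" % consistency_level)


for rf in (1, 2, 3, 5):
    print(rf, [required_acks(cl, rf) for cl in ("ONE", "QUORUM", "ALL")])
```

With RF = 3, QUORUM waits for 2 replicas; with RF = 5 it waits for 3.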
  • 9. Cassandra - Transparent to the application
      • A single node failure shouldn't bring failure
      • Replication Factor + Consistency Level = Success
      • This example: RF = 3, CL = QUORUM
      • With one node down, a majority of replicas (2 of 3) still ack, so we are good!
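The arithmetic behind this slide can be written out as a tiny check: a request succeeds as long as the number of live replicas is at least the number of acknowledgements the consistency level needs. The function names are hypothetical, and this is a model of the rule, not driver code.

```python
def quorum(replication_factor):
    """Strict majority of the replicas."""
    return replication_factor // 2 + 1


def request_succeeds(replication_factor, acks_needed, replicas_down):
    """True if enough replicas are still alive to acknowledge the request."""
    alive = replication_factor - replicas_down
    return alive >= acks_needed


RF = 3
print(request_succeeds(RF, quorum(RF), replicas_down=1))  # True
print(request_succeeds(RF, quorum(RF), replicas_down=2))  # False
print(request_succeeds(RF, RF, replicas_down=1))          # False: ALL needs every replica
```

This is why RF = 3 with QUORUM tolerates a single node failure, while CL = ALL does not.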
  • 10. Application Example - Layout
      • Active-Active
      • Service based DNS routing
      • (Diagram label: Cassandra Replication)
  • 11. Application Example - Uptime
      • Normal server maintenance
      • Application is unaware
      • (Diagram label: Cassandra Replication)
  • 12. Application Example - Failure
      • Data center failure
      • Data is safe. Route traffic.
      • (Diagram label: Another happy user!)
  • 13. Five Years of Cassandra: 0.1 Jul-08, ..., 0.3 Jul-09, 0.6 May-10, 0.7 Feb-11, 1.0 Dec-11, DSE, 1.2 Oct-12, 2.0 Jul-13
  • 14. Cassandra 2.0 - Big new features
  • 15. Lightweight transactions: the problem
      Session 1:
        SELECT * FROM users WHERE username = 'jbellis'
        [empty resultset]
        INSERT INTO users (username, password) VALUES ('jbellis', 'xdg44hh')
      Session 2:
        SELECT * FROM users WHERE username = 'jbellis'
        [empty resultset]
        INSERT INTO users (username, password) VALUES ('jbellis', '8dhh43k')
      It's a race! Who wins?
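The race on this slide can be reproduced with a purely in-memory model. This is a sketch, not Cassandra itself: a plain dictionary stands in for the `users` table, and the two sessions are interleaved by hand so the outcome is deterministic.

```python
users = {}  # stand-in for the users table, keyed by username


def select_user(username):
    """Return the stored password, or None for an empty result set."""
    return users.get(username)


def insert_user(username, password):
    """A plain INSERT behaves like an upsert: the last write wins."""
    users[username] = password


# Both sessions check first, and both see an empty result.
session1_sees = select_user("jbellis")
session2_sees = select_user("jbellis")

# Both sessions then believe the username is free and insert.
if session1_sees is None:
    insert_user("jbellis", "xdg44hh")
if session2_sees is None:
    insert_user("jbellis", "8dhh43k")

print(users["jbellis"])  # 8dhh43k: session 1's row was silently overwritten
```

Neither session gets an error, which is the problem: checking and then inserting are two separate steps, so nothing stops the second write.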
  • 16. LWT: details
      • 4 round trips vs 1 for normal updates
      • Paxos - Paxos made easy
      • Immediate consistency with no leader election or failover
      • For reads, ConsistencyLevel.SERIAL
      • http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
  • 17. Using LWT
      • Don't overwrite an existing record
          INSERT INTO USERS (username, email, ...)
          VALUES ('jbellis', 'jbellis@datastax.com', ... )
          IF NOT EXISTS;
      • Only update record if condition is met
          UPDATE USERS
          SET email = 'jonathan@datastax.com', ...
          WHERE username = 'jbellis'
          IF email = 'jbellis@datastax.com';
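From the Python driver, a conditional insert might be wrapped as in the sketch below. Several things here are assumptions: the `create_user` helper is made up for this example, the `users` table is taken to have `username` and `password` columns as on slide 15, and the code relies on the result containing one row whose first column is the `[applied]` flag, read by position so it works with tuple-like rows.

```python
def create_user(session, username, password):
    """Insert the user only if the username is free; return True on success.

    `session` is expected to be a connected driver session (or anything
    with a compatible execute() method).
    """
    result = session.execute(
        "INSERT INTO users (username, password) VALUES (%s, %s) IF NOT EXISTS",
        (username, password))
    first_row = next(iter(result))
    # Assumption: the first column of the returned row is the [applied] flag.
    return bool(first_row[0])
```

A caller would then branch on the return value, for example reporting "username already taken" when it is False, instead of doing a separate SELECT first.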
  • 18. LWT: Use with caution
      • Great for 1% of your application
      • Eventual consistency is your friend
      • http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
  • 19. Installing Cassandra
  • 20. Download Cassandra
  • 21. Download Cassandra
  • 22. Download Cassandra
  • 23. Extract Cassandra
  • 24. Setup Data and Log Directories
  • 25. Start Cassandra
  • 26. Start Cassandra
  • 27. Installing Cassandra Python Driver
  • 28. Python Cassandra Driver
  • 29. Install Python Cassandra Driver
  • 30. Connect and Create a Keyspace
      import logging
      from cassandra.cluster import Cluster

      # Log to the console so the log.info() calls are visible.
      logging.basicConfig(level=logging.INFO)
      log = logging.getLogger(__name__)

      # Connect to the node running on the local machine.
      cluster = Cluster(['127.0.0.1'])
      session = cluster.connect()

      log.info("creating keyspace...")
      KEYSPACE = "testkeyspace"
      session.execute("""
          CREATE KEYSPACE IF NOT EXISTS %s
          WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }
          """ % KEYSPACE)
  • 31. Create a Table
      log.info("setting keyspace...")
      session.set_keyspace(KEYSPACE)

      log.info("creating table...")
      session.execute("""
          CREATE TABLE IF NOT EXISTS mytable (
              thekey text,
              col1 text,
              col2 text,
              PRIMARY KEY (thekey, col1)
          )
          """)
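  • 32. Insert a Row
      from cassandra import ConsistencyLevel
      from cassandra.query import SimpleStatement

      # A simple (unprepared) statement with an explicit consistency level.
      query = SimpleStatement("""
          INSERT INTO mytable (thekey, col1, col2)
          VALUES ('key1', 'a', 'b')
          """, consistency_level=ConsistencyLevel.ONE)

      log.info("inserting row")
      session.execute(query)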
  • 33. Insert Rows (Prepared Statement)
      # Prepare once, then bind new values for each row.
      prepared = session.prepare("""
          INSERT INTO mytable (thekey, col1, col2)
          VALUES (?, ?, ?)
          """)

      for i in range(10):
          log.info("inserting row %d" % i)
          bound = prepared.bind(("key%d" % i, "b%d" % i, "c%d" % i))
          session.execute(bound)
  • 34. Query Results
      # Run the query asynchronously, then block until the rows arrive.
      future = session.execute_async("""
          SELECT * FROM mytable WHERE thekey='key1'
          """)
      rows = future.result()

      # Print a tab-separated header followed by one line per row.
      log.info("key\tcol1\tcol2")
      log.info("---\t----\t----")
      for row in rows:
          log.info("\t".join(row))
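To see what this query should return after the inserts on slides 32 and 33, here is an in-memory sketch of the table. It is a model under stated assumptions, not driver code: a dictionary keyed by the full primary key `(thekey, col1)` stands in for `mytable`, the table is assumed to start empty, and rows for one `thekey` are returned ordered by `col1`.

```python
table = {}  # stand-in for mytable, keyed by the primary key (thekey, col1)


def insert(thekey, col1, col2):
    # Rows are identified by the full primary key (thekey, col1);
    # inserting the same pair again overwrites col2.
    table[(thekey, col1)] = col2


def select_by_key(thekey):
    # Rows sharing one thekey come back ordered by col1.
    return [(k, c1, c2) for (k, c1), c2 in sorted(table.items()) if k == thekey]


# The single insert from slide 32, then the ten inserts from slide 33.
insert("key1", "a", "b")
for i in range(10):
    insert("key%d" % i, "b%d" % i, "c%d" % i)

for row in select_by_key("key1"):
    print("\t".join(row))
```

The model returns two rows for `key1`: `('key1', 'a', 'b')` from the simple statement and `('key1', 'b1', 'c1')` from the prepared statement, because they differ in `col1` and so are distinct rows.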
  • 35. Run It
  • 36. Cassandra Applications - Drivers
      • DataStax Drivers for Cassandra
      • Java
      • C#
      • Python
      • more on the way
  • 37. Find Out More
      • Cassandra: http://cassandra.apache.org
      • DataStax Drivers: https://github.com/datastax
      • Documentation: http://www.datastax.com/docs
      • Getting Started: http://www.datastax.com/documentation/gettingstarted/index.html
      • Developer Blog: http://www.datastax.com/dev/blog
      • Cassandra Community Site: http://planetcassandra.org
      • Download: http://planetcassandra.org/Download/DataStaxCommunityEdition
      • Webinars: http://planetcassandra.org/Learn/CassandraCommunityWebinars
      • Cassandra Summit Talks: http://planetcassandra.org/Learn/CassandraSummit
  • 38. ©2013 DataStax Confidential. Do not distribute without consent.
