Cassandra for Developers 
DataStax Drivers in Practice 
Michaël Figuière 
Drivers & Developer Tools Architect 
@mfiguiere
Paris Cassandra Meetup - Cassandra for Developers

Published in: Engineering
  1. Cassandra for Developers: DataStax Drivers in Practice. Michaël Figuière, Drivers & Developer Tools Architect, @mfiguiere
  2. Cassandra Peer to Peer Architecture. © 2014 DataStax, All Rights Reserved.
      [Diagram: six nodes.]
      Every node has the same role; there is no master or slave. Each node contains a replica of some partitions of tables.
  3. Cassandra Peer to Peer Architecture
      [Diagram: nodes, three of them marked Replica.]
      Each partition is stored in several replicas to ensure durability and high availability.
  4. Client / Server Communication
      [Diagram: four clients, a coordinator node, and three replicas.]
      Coordinator node: forwards all read/write requests to the corresponding replicas.
  5. Tunable Consistency
      [Diagram: 3 replicas, each holding the value A, over time.]
  6. Tunable Consistency
      Write 'B' and wait for an acknowledgement from one node.
      [Diagram: replicas over time: A A A, then B A A.]
  7. Tunable Consistency
      Write 'B' and wait for an acknowledgement from one node.
      [Diagram: replicas over time: A A A, then B A A.]
  8. Tunable Consistency: R + W < N
      Write and wait for an acknowledgement from one node. Read waiting for one node to answer.
      [Diagram: replicas over time: A A A, then B A A.]
  9. Tunable Consistency: R + W = N
      Write and wait for acknowledgements from two nodes. Read waiting for one node to answer.
      [Diagram: replicas over time: A A A, then B B A.]
  10. Tunable Consistency: R + W > N
      Write and wait for acknowledgements from two nodes. Read waiting for two nodes to answer.
      [Diagram: replicas over time: A A A, then B B A.]
  11. Tunable Consistency: R = W = QUORUM
      [Diagram: replicas over time: A A A, then B B A.]
      QUORUM = (N / 2) + 1, using integer division.
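The arithmetic on the consistency slides can be checked with a short sketch. The function names below are illustrative and are not part of any driver; N is the number of replicas, W the number of write acknowledgements waited for, and R the number of replicas a read waits for.

```python
def quorum(n: int) -> int:
    """QUORUM = (N / 2) + 1, with integer division."""
    return n // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """When R + W > N, every read set overlaps every acknowledged write set,
    so at least one replica answering the read holds the latest write."""
    return r + w > n


n = 3
for r, w in [(1, 1), (1, 2), (2, 2)]:
    label = "overlap guaranteed" if is_strongly_consistent(r, w, n) else "stale read possible"
    print(f"N={n} R={r} W={w}: {label}")

# Reading and writing at QUORUM always satisfies R + W > N.
print(quorum(3), quorum(5))  # 2 3
```

With three replicas, R = W = QUORUM means two acknowledgements per write and two answers per read, which is the R + W > N case from slide 10.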
  12. Cassandra Query Language (CQL)
      • Similar to SQL, mostly a subset
      • Without joins, sub-queries, and aggregations
      • Primary Key contains:
          • A Partition Key, used to select the partition that will store the Row
          • Some Clustering Columns, used to define how Rows should be grouped and sorted on the disk
      • Supports Collections
      • Supports User Defined Types (UDT)
  13. CQL: Create Table
          CREATE TABLE users (
            login text,
            name text,
            age int,
            …
            PRIMARY KEY (login));
      Just like in SQL! login is the partition key: it will be hashed, and rows will be spread over the cluster on different partitions.
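How hashing a partition key spreads rows over nodes can be sketched in a few lines. This is a toy model under stated assumptions: MD5 from the standard library stands in for the cluster's partitioner, each node gets a single evenly spaced token, and `build_ring` and `owner` are hypothetical helpers, not Cassandra or driver APIs.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

RING_SIZE = 2 ** 64


def token(partition_key: str) -> int:
    """Hash a partition key to a position on the ring (MD5 is a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(node_names):
    """Give each node one evenly spaced token (an assumption of this sketch)."""
    step = RING_SIZE // len(node_names)
    return [(i * step, name) for i, name in enumerate(node_names)]


def owner(ring, partition_key):
    """The owner is the first node whose token is >= the key's token,
    wrapping around to the first node at the end of the ring."""
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token(partition_key)) % len(ring)
    return ring[index][1]


ring = build_ring(["node1", "node2", "node3"])
logins = [f"user{i}" for i in range(3000)]
spread = Counter(owner(ring, login) for login in logins)
print(spread)  # roughly a third of the logins per node
```

The same login always hashes to the same token, so every read and write for that login lands on the same partition.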
  14. CQL: Clustered Table
          CREATE TABLE mailbox (
            login text,
            message_id timeuuid,
            interlocutor text,
            message text,
            PRIMARY KEY((login), message_id)
          );
      A TimeUUID is a UUID that can be sorted chronologically. message_id is a clustering column: all the rows with the same login will be grouped together and sorted by message_id on the disk.
  15. CQL: Queries
      Get message by user and message_id (date):
          SELECT * FROM mailbox
          WHERE login = 'jdoe'
          AND message_id = '2014-09-25 16:00:00';
      Get messages by user and date interval:
          SELECT * FROM mailbox
          WHERE login = 'jdoe'
          AND message_id <= '2014-09-25 16:00:00'
          AND message_id >= '2014-09-20 16:00:00';
      WHERE clauses can only be constraints on the primary key, and range queries are not possible on the partition key.
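The reason a range on the clustering column is cheap can be illustrated with an in-memory model of the mailbox table. This is only a sketch: a `datetime` stands in for the timeuuid, and `insert` and `select_range` are hypothetical helpers that mimic the two queries above, not driver calls.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from datetime import datetime

# partition key (login) -> rows kept sorted by the clustering value
mailbox = defaultdict(list)


def insert(login, sent_at, message):
    """Rows of one partition stay sorted by the clustering column."""
    insort(mailbox[login], (sent_at, message))


def select_range(login, start, end):
    """Pick one partition, then slice a contiguous, already sorted run of rows."""
    rows = mailbox[login]
    keys = [sent_at for sent_at, _ in rows]
    return rows[bisect_left(keys, start):bisect_right(keys, end)]


insert("jdoe", datetime(2014, 9, 25, 16, 0), "third")
insert("jdoe", datetime(2014, 9, 18, 9, 0), "first")
insert("jdoe", datetime(2014, 9, 22, 12, 30), "second")
insert("asmith", datetime(2014, 9, 21, 8, 0), "other mailbox")

recent = select_range("jdoe", datetime(2014, 9, 20, 16, 0), datetime(2014, 9, 25, 16, 0))
for sent_at, message in recent:
    print(sent_at.isoformat(), message)
```

A range over logins, by contrast, would have to visit partitions scattered across the cluster, which is why the slide rules out range queries on the partition key.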
  16. CQL: Collections
          CREATE TABLE users (
            login text,
            name text,
            age int,
            friends set<text>,
            hobbies list<text>,
            languages map<int, text>,
            …
            PRIMARY KEY (login)
          );
      set and list have similar semantics as in Java. It's not possible to use nested collections… yet.
  17. Cassandra 2.1: User Defined Type (UDT)
      Flat columns:
          CREATE TABLE users (
            login text,
            …
            street_number int,
            street_name text,
            postcode int,
            country text,
            …
            PRIMARY KEY(login));
      With a UDT:
          CREATE TYPE address (
            street_number int,
            street_name text,
            postcode int,
            country text
          );
          CREATE TABLE users (
            login text,
            …
            location frozen<address>,
            …
            PRIMARY KEY(login)
          );
  18. Cassandra 2.1: UDT Insert / Update
          INSERT INTO users(login, name, location)
          VALUES ('jdoe', 'John DOE',
            {
              'street_number': 124,
              'street_name': 'Congress Avenue',
              'postcode': 95054,
              'country': 'USA'
            });
          UPDATE users
          SET location =
            {
              'street_number': 125,
              'street_name': 'Congress Avenue',
              'postcode': 95054,
              'country': 'USA'
            }
          WHERE login = 'jdoe';
  19. Client / Server Communication
      [Diagram: four clients, a coordinator node, and three replicas.]
      Coordinator node: forwards all read/write requests to the corresponding replicas.
  20. Request Pipelining
      [Diagram: client and Cassandra exchanging messages, without and with request pipelining.]
  21. Notifications
      [Diagram: a client and three nodes, without and with notifications.]
  22. Asynchronous Driver Architecture
      [Diagram: three client threads, the driver, and four nodes.]
  23. Asynchronous Driver Architecture
      [Diagram: three client threads, the driver, and four nodes, with steps numbered 1 to 6.]
  24. Failover
      [Diagram: three client threads, the driver, and four nodes, with steps numbered 1 to 7.]
  25. DataStax Drivers Highlights
      • Asynchronous architecture using non-blocking IO
      • Prepared Statements support
      • Automatic failover
      • Node discovery
      • Tunable load balancing
          • Round Robin, Latency Awareness, Multi Data Centers, Replica Awareness
      • Cassandra tracing support
      • Compression & SSL
  26. DataCenter Aware Balancing
      [Diagram: Datacenter A and Datacenter B, each with nodes and clients.]
      Local nodes are queried first; if none are available, the request could be sent to a remote node.
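The "local first, remote as a fallback" rule combined with failover can be sketched as follows. Everything here is illustrative: `query_plan`, `execute_with_failover`, and the node dictionaries are hypothetical, and real drivers expose this behaviour through configurable load balancing policies rather than these functions.

```python
def query_plan(nodes, local_dc):
    """Order live nodes so that nodes in the local datacenter are tried first."""
    live = [node for node in nodes if node["up"]]
    local = [node for node in live if node["dc"] == local_dc]
    remote = [node for node in live if node["dc"] != local_dc]
    return local + remote


def execute_with_failover(plan, send):
    """Try each node of the plan in turn until one of them answers."""
    last_error = None
    for node in plan:
        try:
            return send(node)
        except ConnectionError as error:
            last_error = error  # move on to the next node in the plan
    raise RuntimeError("no host available") from last_error


nodes = [
    {"name": "a1", "dc": "A", "up": True},
    {"name": "a2", "dc": "A", "up": False},
    {"name": "b1", "dc": "B", "up": True},
    {"name": "a3", "dc": "A", "up": True},
]


def send(node):
    # Hypothetical transport: a1 looks up but drops the connection.
    if node["name"] == "a1":
        raise ConnectionError("connection reset")
    return f"answered by {node['name']}"


plan = query_plan(nodes, local_dc="A")
print([node["name"] for node in plan])    # ['a1', 'a3', 'b1']
print(execute_with_failover(plan, send))  # answered by a3
```

The remote node b1 is reached only after every live local node has failed.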
  27. Token Aware Balancing
      [Diagram: a client, nodes, and three replicas.]
      Nodes that own a replica of the PK being read or written by the query will be contacted first. The partition key will be inferred from Prepared Statements metadata.
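Token-aware routing can be sketched by computing the replicas for a key and putting them at the head of the query plan. The assumptions are loud ones: MD5 stands in for the real partitioner, tokens are fixed by hand, replicas are taken as the next nodes clockwise on the ring, and `replicas_for` and `token_aware_plan` are hypothetical names rather than driver APIs.

```python
import hashlib
from bisect import bisect_left


def token(partition_key: str) -> int:
    """Hash a partition key to a ring position (MD5 is a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replicas_for(ring, partition_key, replication_factor):
    """Primary owner plus the next nodes clockwise (an assumption of this sketch)."""
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token(partition_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def token_aware_plan(ring, partition_key, replication_factor):
    """Contact nodes that own a replica first, then the remaining nodes."""
    owners = replicas_for(ring, partition_key, replication_factor)
    others = [name for _, name in ring if name not in owners]
    return owners + others


# Four nodes with hand-picked, evenly spaced tokens on a 2**64 ring.
ring = [(0, "n1"), (2 ** 62, "n2"), (2 ** 63, "n3"), (3 * 2 ** 62, "n4")]
plan = token_aware_plan(ring, "jdoe", replication_factor=3)
print(plan)
```

Sending the request straight to a replica lets that node act as coordinator for data it already holds, which avoids one network hop.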
  28. State of DataStax Drivers
      Driver    Cassandra 1.2   Cassandra 2.0   Cassandra 2.1
      Java      1.0 - 2.1       2.0 - 2.1       2.1
      Python    1.0 - 2.1       2.0 - 2.1       2.1
      C#        1.0 - 2.1       2.0 - 2.1       2.1
      Node.js   1.0             1.0             Later
      C++       1.0-beta4       1.0-beta4       Later
      Ruby      1.0-beta3       1.0-beta3       Later
      Later versions of Cassandra can use earlier drivers, but some features won't be supported.
  29. DataStax Driver in Practice
      Java:
          <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-core</artifactId>
            <version>2.1.0</version>
          </dependency>
      Python:
          $ pip install cassandra-driver
      C#:
          PM> Install-Package CassandraCSharpDriver
      Ruby:
          $ gem install cassandra-driver --pre
      Node.js:
          $ npm install cassandra-driver
  30. Connect and Write
          Cluster cluster = Cluster.builder()
              .addContactPoints("10.1.2.5", "cassandra_node3")
              .build();
          Session session = cluster.connect("my_keyspace");
          session.execute(
              "INSERT INTO user (user_id, name, email) VALUES (12345, 'johndoe', 'john@doe.com')"
          );
      The rest of the nodes will be discovered by the driver. A keyspace is just like a schema in the SQL world.
  31. Read
          ResultSet resultSet = session.execute(
              "SELECT * FROM user WHERE user_id IN (1,8,13)"
          );
          List<Row> rows = resultSet.all();
          for (Row row : rows) {
              int userId = row.getInt("user_id");  // assumes user_id is an int column
              String name = row.getString("name");
              String email = row.getString("email");
          }
      Session is a thread-safe object. A singleton should be instantiated at startup. ResultSet also implements Iterable<Row>.
  32. Write with Prepared Statements
          PreparedStatement insertUser = session.prepare(
              "INSERT INTO user (user_id, name, email) VALUES (?, ?, ?)"
          );
          BoundStatement statement = insertUser.bind(12345, "johndoe", "john@doe.com");
          // Set separately: setConsistencyLevel() is declared on Statement
          statement.setConsistencyLevel(ConsistencyLevel.QUORUM);
          session.execute(statement);
      PreparedStatement objects are also thread-safe; just create a singleton at startup. Parameters can be named as well. BoundStatement is a stateful, NON-thread-safe object. The consistency level can be set for each statement.
  33. Asynchronous Read
          ResultSetFuture future = session.executeAsync(
              "SELECT * FROM user WHERE user_id IN (1,2,3)"
          );
          ResultSet resultSet = future.get();
          List<Row> rows = resultSet.all();
          for (Row row : rows) {
              int userId = row.getInt("user_id");  // assumes user_id is an int column
              String name = row.getString("name");
              String email = row.getString("email");
          }
      executeAsync() will not block and returns immediately, unless all the connections are busy. future.get() will block until the result is available.
  34. Asynchronous Read with Callbacks
          ResultSetFuture future = session.executeAsync(
              "SELECT * FROM user WHERE user_id IN (1,2,3)"
          );
          future.addListener(new Runnable() {
              public void run() {
                  // Process the results here
              }
          }, executor);
      ResultSetFuture implements Guava's ListenableFuture. Two choices for the executor:
          executor = Executors.newCachedThreadPool();
      …or any thread pool that you prefer.
          executor = MoreExecutors.sameThreadExecutor();
      Only if your listener code is trivial and non-blocking, as it will be executed in the IO thread.
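The difference between blocking on a future and registering a listener on it can be shown with Python's standard `concurrent.futures`, used here purely as an analogy for the driver's futures. `fetch_users` is a hypothetical stand-in for a query and returns made-up rows.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_users(user_ids):
    """Hypothetical stand-in for a query; returns fake rows."""
    return [{"user_id": uid, "name": f"user{uid}"} for uid in user_ids]


collected = []


def on_done(future):
    # Runs in whichever thread completes the future, so keep it short
    # and non-blocking, as the slide advises for listener code.
    collected.extend(future.result())


with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(fetch_users, [1, 2, 3])  # returns immediately
    future.add_done_callback(on_done)             # listener style
    rows = future.result()                        # blocking style

print(rows)
print(collected == rows)
```

Leaving the `with` block waits for the worker threads, so the callback has finished by the time the results are printed.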
  35. Query Builder
          import static com.datastax.driver.core.querybuilder.QueryBuilder.*;

          Statement selectAll =
              select().all().from("user").where(eq("user_id", userId));
          session.execute(selectAll);

          Statement insert = insertInto("user")
              .value("user_id", 2)
              .value("name", "johndoe")
              .value("email", "john@doe.com");
          session.execute(insert);
      A static import of QueryBuilder is required in order to use the DSL.
  36. Python
          import logging
          from cassandra.cluster import Cluster

          log = logging.getLogger(__name__)

          cluster = Cluster(['10.1.1.3', '10.1.1.4', '10.1.1.5'])
          session = cluster.connect('mykeyspace')

          def handle_success(rows):
              user = rows[0]
              try:
                  process_user(user.name, user.age, user.id)
              except Exception:
                  log.error("Failed to process user %s", user.id)
                  # don't re-raise errors in the callback

          def handle_error(exception):
              log.error("Failed to fetch user info: %s", exception)

          future = session.execute_async("SELECT * FROM users WHERE user_id=3")
          future.add_callbacks(handle_success, handle_error)
      It's also possible to retrieve the result from the future object synchronously.
  37. C#
          var cluster = Cluster.Builder()
              .AddContactPoints("host1", "host2", "host3")
              .Build();
          var session = cluster.Connect("sample_keyspace");
          var task = session.ExecuteAsync(statement);
          task.ContinueWith((t) =>
          {
              var rs = t.Result;
              foreach (var row in rs)
              {
                  // Get the values from each row
              }
          }, TaskContinuationOptions.OnlyOnRanToCompletion);
      Asynchronously execute a query using the TPL.
  38. C / C++
          CassString query = cass_string_init(
              "SELECT keyspace_name FROM system.schema_keyspaces;");
          CassStatement* statement = cass_statement_new(query, 0);
          CassFuture* result_future = cass_session_execute(session, statement);
          if (cass_future_error_code(result_future) == CASS_OK) {
              const CassResult* result = cass_future_get_result(result_future);
              CassIterator* rows = cass_iterator_from_result(result);
              while (cass_iterator_next(rows)) {
                  // Process results
              }
              cass_result_free(result);
              cass_iterator_free(rows);
          }
          cass_future_free(result_future);
      Each structure must be freed with the appropriate function.
  39. Node.js
          var assert = require('assert');
          var cassandra = require('cassandra-driver');
          var client = new cassandra.Client({
              contactPoints: ['host1', 'h2'],
              keyspace: 'ks1'
          });
          var query = 'SELECT email, last_name FROM user_profiles WHERE key=?';
          client.execute(query, ['guy'], function(err, result) {
              assert.ifError(err);
              console.log('got user profile with email ' + result.rows[0].email);
          });
      Here we're using a Parameterized Statement, which is not prepared but still allows parameters.
  40. Ruby
          require 'cassandra'

          cluster = Cassandra.cluster
          session = cluster.connect('system')
          future = session.execute_async('SELECT * FROM schema_columnfamilies')
          future.on_success do |rows|
            rows.each do |row|
              puts "The keyspace #{row['keyspace_name']} has a table called #{row['columnfamily_name']}"
            end
          end
          future.join
      on_success registers a listener on the future, which will be called when results are available.
  41. Object Mapper
      • Avoid boilerplate for common use cases
      • Map Objects to Statements and ResultSets to Objects
      • Do NOT hide Cassandra from the developer
      • No "clever tricks" à la Hibernate
      • Not JPA compatible, but JPA-ish API
  42. Object Mapper in Practice
          <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-mapping</artifactId>
            <version>2.1.0</version>
          </dependency>
      Additional artifact for object mapping. Available from Driver 2.1.0.
  43. Basic Object Mapping
          CREATE TYPE address (
            street text,
            city text,
            zip int
          );
          CREATE TABLE users (
            email text PRIMARY KEY,
            address frozen<address>
          );

          @UDT(keyspace = "ks", name = "address")
          public class Address {
              private String street;
              private String city;
              private int zip;
              // getters and setters omitted...
          }

          @Table(keyspace = "ks", name = "users")
          public class User {
              @PartitionKey
              private String email;
              private Address address;
              // getters and setters omitted...
          }
  44. Basic Object Mapping
          MappingManager manager = new MappingManager(session);
          Mapper<User> mapper = manager.mapper(User.class);
          User myProfile = mapper.get("xyz@example.com");
          ListenableFuture<Void> saveFuture = mapper.saveAsync(anotherProfile);
          mapper.delete("xyz@example.com");
      Mapper, just like Session, is a thread-safe object. Create a singleton at startup. get() returns a mapped row for the given primary key. saveAsync() returns a ListenableFuture from Guava, completed when the write is acknowledged.
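What an object mapper does with a row can be sketched with plain dataclasses. This is a conceptual illustration only, not the DataStax mapper: `map_row` and `to_row` are hypothetical helpers, and a row is modelled as a dictionary with the UDT value as a nested dictionary.

```python
from dataclasses import asdict, dataclass


@dataclass
class Address:
    street: str
    city: str
    zip: int


@dataclass
class User:
    email: str
    address: Address


def map_row(row: dict) -> User:
    """Row -> object: the nested address value becomes an Address instance."""
    return User(email=row["email"], address=Address(**row["address"]))


def to_row(user: User) -> dict:
    """Object -> row: asdict() also converts the nested dataclass."""
    return asdict(user)


row = {
    "email": "xyz@example.com",
    "address": {"street": "124 Congress Avenue", "city": "Austin", "zip": 78701},
}
user = map_row(row)
print(user.address.zip)
print(to_row(user) == row)
```

The mapping stays a thin, explicit translation in both directions, in the spirit of the slide's "do NOT hide Cassandra from the developer".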
  45. Accessors
          @Accessor
          interface UserAccessor {
              @Query("SELECT * FROM user_profiles LIMIT :max")
              Result<User> firstN(@Param("max") int limit);
          }

          UserAccessor accessor = manager.createAccessor(UserAccessor.class);
          Result<User> users = accessor.firstN(10);
          for (User user : users) {
              System.out.println(user.getAddress().getZip());
          }
      Result is like ResultSet but specialized for a mapped class, so we iterate over it just like we would with a ResultSet.
  46. We're Hiring!
      Cassandra Tech Day - Paris, November 4th
      Cassandra Summit Europe - London, December 3-4th
      @mfiguiere
