Cassandra Drivers And Tools 
DuyHai DOAN, Technical Advocate 
@doanduyhai
Agenda! 
Drivers 
• architecture, policies, Java driver API 
DevCenter (live coding demo!) 
Cassandra Unit (+ live coding demo!) 
Object Mapper module (+ live coding demo!) 
Achilles Object Mapper (+ live coding demo!)
Cassandra Drivers Architecture! 
Architecture! 
Policies! 
Java driver API!
Drivers list! 
• Java 
• C# 
• Python 
• Node.js 
• Ruby (1.0.0.rc1) 
• C++ (beta) 
• ODBC (beta) 
• Clojure (community) 
• Go (community) 
• PHP (to be announced)
Connection pooling! 
[Diagram: client threads 1 to 3 share one Driver, which keeps one connection pool per node (Pool1, Pool2, Pool3 for n2, n3, n4); steps numbered 1 to 6]
Request Pipelining! 
[Diagram: Client and Cassandra exchange several requests over one connection; each request and its response carry a StreamID]
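The stream ID idea can be sketched in plain Java. This is a minimal illustration, not the driver's code: the class `PipelinedConnectionSketch` and its methods are hypothetical names, and the network is left out. Each request gets a stream ID, and a response is matched to its request by that ID, so responses may arrive in any order on the same connection.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of request pipelining; not the driver's real connection class.
class PipelinedConnectionSketch {

    /** A request that has been sent and is waiting for its response. */
    static final class Pending {
        final int streamId;
        final CompletableFuture<String> response = new CompletableFuture<>();

        Pending(int streamId) {
            this.streamId = streamId;
        }
    }

    private final AtomicInteger nextStreamId = new AtomicInteger();
    private final Map<Integer, Pending> inFlight = new ConcurrentHashMap<>();

    /** Tags the request with a fresh stream ID and returns without waiting for earlier responses. */
    Pending send(String query) {
        Pending pending = new Pending(nextStreamId.getAndIncrement());
        inFlight.put(pending.streamId, pending);
        // A real connection would write the frame (stream ID + query) to the socket here.
        return pending;
    }

    /** Called when a response frame arrives; the stream ID says which request it answers. */
    void onResponse(int streamId, String payload) {
        Pending pending = inFlight.remove(streamId);
        if (pending != null) {
            pending.response.complete(payload);
        }
    }
}
```

Calling `send` twice and then `onResponse` for the second stream ID completes the second request while the first is still pending, which is the point of pipelining.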
Nodes Discovery! 
[Diagram: the Driver opens a control connection to n1 and learns about the other nodes, n2 to n8]
Round Robin Load Balancing! 
[Diagram: ring of nodes n1 to n8; the client sends requests 1 to 4 to successive nodes]
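Round robin selection fits in a few lines. The sketch below is an illustration only, with a hypothetical class name and node names as plain strings; the driver's own policy also tracks node state, which is omitted here.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of round robin coordinator selection.
class RoundRobinSketch {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSketch(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    /** Returns the next node in turn; floorMod keeps the index valid if the counter overflows. */
    String nextCoordinator() {
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```

With three nodes, four successive calls return the first, second and third node and then the first again.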
DC Aware Load Balancing! 
[Diagram: Client1 sends its requests to DC1 and Client2 to DC2; requests to the other datacenter are crossed out (⤫)]
Token Aware Load Balancing! 
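[Diagram: ring of nodes n1 to n8; the client's request goes straight to a replica (steps 1 to 3), and the detour through a non-replica coordinator is crossed out (⤫)]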
Combining Load Balancing Policies! 
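Load Balancing Policy
• Round Robin: extends Load Balancing Policy
• DC Aware Round Robin: extends Load Balancing Policy
• Token Aware: wraps another Load Balancing Policy
Default config: Token Aware wraps DC Aware Round Robin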
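The "wraps" relationship is a decorator: the token-aware policy puts the replicas of the partition first and delegates the rest of the query plan to a child policy. The sketch below assumes hypothetical names (`LoadBalancingSketch`, `FixedOrderSketch`, `TokenAwareSketch`) and a precomputed key-to-replicas map standing in for the token ring; it is a simplification, not the driver's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical policy interface: returns the nodes to try, in order, for a partition key.
interface LoadBalancingSketch {
    List<String> queryPlan(String partitionKey);
}

// Child policy stand-in: always proposes the nodes in the same order.
class FixedOrderSketch implements LoadBalancingSketch {
    private final List<String> nodes;

    FixedOrderSketch(List<String> nodes) {
        this.nodes = List.copyOf(nodes);
    }

    public List<String> queryPlan(String partitionKey) {
        return nodes;
    }
}

// Wrapper: replicas of the key come first, then whatever the child policy proposes.
class TokenAwareSketch implements LoadBalancingSketch {
    private final LoadBalancingSketch child;
    private final Map<String, List<String>> replicasByKey; // assumption: stands in for the token ring

    TokenAwareSketch(LoadBalancingSketch child, Map<String, List<String>> replicasByKey) {
        this.child = child;
        this.replicasByKey = replicasByKey;
    }

    public List<String> queryPlan(String partitionKey) {
        List<String> plan = new ArrayList<>(replicasByKey.getOrDefault(partitionKey, List.of()));
        for (String node : child.queryPlan(partitionKey)) {
            if (!plan.contains(node)) {
                plan.add(node); // non-replica nodes remain as fallbacks
            }
        }
        return plan;
    }
}
```

Because the wrapper only depends on the interface, any child policy can be plugged in, which is what makes the combination on this slide possible.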
Automatic Failover! 
[Diagram: a client thread's request fails on one node (⤫) and the Driver sends it again through another node's pool; Pool1 to Pool3 serve n2, n3, n4; steps numbered 1 to 8]
Other policies! 
Retry policy 
• write/read timeout 
• node unavailable 
Reconnection policy 
• constant schedule 
• exponential schedule
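The two reconnection schedules differ only in how the delay grows between attempts. The sketch below is an illustration with hypothetical method names and parameters (delays in milliseconds, attempts counted from 0); it is not the driver's code.

```java
// Hypothetical sketch of the two reconnection schedules.
class ReconnectionScheduleSketch {

    /** Constant schedule: the same delay before every attempt. */
    static long constantDelayMs(long delayMs, int attempt) {
        return delayMs;
    }

    /** Exponential schedule: the delay doubles at each attempt, up to a maximum. */
    static long exponentialDelayMs(long baseMs, long maxMs, int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 20); // cap the shift to avoid overflow
        return Math.min(maxMs, baseMs << shift);
    }
}
```

With a base of 1000 ms and a maximum of 60000 ms, the exponential schedule gives 1000, 2000, 4000, 8000 ms and so on until it reaches the cap.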
Statements! 
Plain statement 
• convenient, one-off query 
• plain string ☞ parsing overhead 
INSERT INTO user(login, name, age) VALUES('jdoe', 'John DOE', 33);
Statements! 
Prepared statements 
• avoid parsing overhead 
• query structure should be known ahead of time 
• bound values 
• named parameters 
INSERT INTO user(login, name, age) VALUES(?, ?, ?);
INSERT INTO user(login, name, age) VALUES(:login, :name, :age);
Statements! 
Parameterized statements 
• same as plain statement 
• pass bound values as bytes ☞ avoid ser/deser of values 
INSERT INTO user(login, name, age) VALUES(?, ?, ?);
Java Driver! 
Reference implementation 
Based on the asynchronous Netty library
Configurable policies 
Query tracing support 
Client-node compression & SSL
Maven dependency! 
Available on Maven Central
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.1.3</version>
</dependency>
Depends on Netty, Guava, Metrics
Connect and Write! 
Cluster cluster = Cluster.builder()
    .addContactPoints("127.0.0.1", "another-host") // seed nodes (IP or DNS name)
    .build();
Session session = cluster.connect("my_keyspace");
session.execute("INSERT INTO user (user_id, name, email) "
    + "VALUES (12345, 'johndoe', 'john_doe@fiction.com')");
Configuration! 
Cluster cluster = Cluster.builder()
    .addContactPoints("127.0.0.1", "another-host")
    .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("DC1"))
    .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
    .withReconnectionPolicy(new ConstantReconnectionPolicy(1000))
    .build();
Read! 
ResultSet resultSet = session.execute("SELECT * FROM user");
List<Row> rows = resultSet.all();
for (Row row : rows) {
    Long userId = row.getLong("user_id");
    String name = row.getString("name");
    String email = row.getString("email");
}
Stateless ☞ thread-safe
Asynchronous Read! 
ResultSetFuture future = session.executeAsync("SELECT * FROM user");
ResultSet resultSet = future.get(); // blocking call
List<Row> rows = resultSet.all();
for (Row row : rows) {
    Long userId = row.getLong("user_id");
    String name = row.getString("name");
    String email = row.getString("email");
}
Asynchronous Read with CallBack! 
ResultSetFuture future = session.executeAsync("SELECT * FROM user");
future.addListener(new Runnable() {
    public void run() {
        // Process the results here
    }
}, executor);
executor = Executors.newCachedThreadPool();
or
executor = MoreExecutors.sameThreadExecutor(); // from Guava
Query Builder! 
// Query was renamed to Statement in the 2.x driver
Statement query = QueryBuilder
    .select()
    .all()
    .from("my_keyspace", "user")
    .where(eq("login", "jdoe"));
query.setConsistencyLevel(ConsistencyLevel.ONE);
ResultSet rs = session.execute(query);
Old manual paging! 
Sometimes you need to fetch the whole content of a table
Manual paging: 
SELECT * FROM users WHERE token(login) >= token(<last_fetched_login>) 
LIMIT 100;
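The manual paging loop can be sketched without a cluster. In the illustration below, a `TreeMap` ordered by token stands in for the table, and `fetchPage` stands in for the token-range query with a LIMIT; all names are hypothetical. The sketch uses a strict comparison (token greater than the last fetched token) so that the last row of a page is not returned again at the start of the next one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical sketch of client-side manual paging over a token-ordered table.
class ManualPagingSketch {

    /** Stand-in for: SELECT ... WHERE token(login) > lastToken LIMIT pageSize. */
    static List<Map.Entry<Long, String>> fetchPage(
            NavigableMap<Long, String> table, Long lastToken, int pageSize) {
        NavigableMap<Long, String> rest =
                (lastToken == null) ? table : table.tailMap(lastToken, false);
        List<Map.Entry<Long, String>> page = new ArrayList<>();
        for (Map.Entry<Long, String> row : rest.entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(row);
        }
        return page;
    }

    /** Fetches every login, one page at a time; pageSize must be positive. */
    static List<String> fetchAll(NavigableMap<Long, String> table, int pageSize) {
        List<String> logins = new ArrayList<>();
        Long lastToken = null;
        while (true) {
            List<Map.Entry<Long, String>> page = fetchPage(table, lastToken, pageSize);
            for (Map.Entry<Long, String> row : page) {
                logins.add(row.getValue());
            }
            if (page.size() < pageSize) {
                return logins; // a short page means the table is exhausted
            }
            lastToken = page.get(page.size() - 1).getKey();
        }
    }
}
```

The caller has to remember the last token between pages, which is the bookkeeping that automatic paging removes.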
New automatic paging! 
[Diagram: ring of nodes n1 to n8; the Driver sends the query to n1 and receives page 1 with paging state 1, then page 2 with paging state 2]
Paging state ≈ stateless cookie 
Resilient to node failure
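The "stateless cookie" idea can be illustrated as follows. In this sketch the paging state is simply an offset into a list, and the names `PagingStateSketch`, `Page` and `fetch` are hypothetical; the real paging state is an opaque value produced by Cassandra. What matters is that the server side keeps no cursor: every call receives everything it needs to resume, so any node could answer the next page.

```java
import java.util.List;

// Hypothetical sketch: the paging state carries all the information needed to resume.
class PagingStateSketch {

    /** One page of rows plus the state to send back for the next page (null when done). */
    static final class Page {
        final List<String> rows;
        final Integer nextState;

        Page(List<String> rows, Integer nextState) {
            this.rows = rows;
            this.nextState = nextState;
        }
    }

    /** Stateless fetch: nothing is remembered between calls except what the caller passes in. */
    static Page fetch(List<String> table, Integer pagingState, int fetchSize) {
        int from = (pagingState == null) ? 0 : pagingState;
        int to = Math.min(table.size(), from + fetchSize);
        Integer next = (to < table.size()) ? to : null;
        return new Page(List.copyOf(table.subList(from, to)), next);
    }
}
```

Passing `nextState` from one page into the next call continues where the previous page stopped, and a `null` state marks the last page.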
Paging during node failure! 
[Diagram: ring of nodes n1 to n8; the Driver receives page 1 with paging state 1, a node then fails (⤫), and the Driver still obtains page 2 with paging state 2]
Q & A
Dev Center 
Demo
Cassandra Unit! 
• start an embedded Cassandra server 
• useful for unit testing 
• mature project, created on August 5th, 2011
• designed around Thrift & Hector
• provides a JUnit rule for CQL
Cassandra Unit 
Demo
Java Driver Object Mapper! 
• simple mapper 
• KISS 
• annotations à-la JPA (but no JPA dependencies) 
• templating system à-la Spring Data
Java Driver Object Mapper! 
Mapping 
@Table(keyspace = "mapper_module", name = "users")
public class User {
    @PartitionKey
    private String login;
    private String name;
    // getters and setters omitted...
}
Java Driver Object Mapper! 
Usage 
MappingManager manager = new MappingManager(session);
Mapper<User> mapper = manager.mapper(User.class);
User user = mapper.get("jdoe@fiction.com");
mapper.saveAsync(new User("hsue@fiction.com"));
mapper.delete("jdoe@fiction.com");
Java Driver Object Mapper! 
Accessors (SpringData template-like) definition 
@Accessor
interface UserAccessor {
    @Query("SELECT * FROM users LIMIT :max")
    Result<User> firstNUsers(@Param("max") int limit);
}
Java Driver Object Mapper! 
Accessors usage
UserAccessor accessor = manager.createAccessor(UserAccessor.class);
List<User> users = accessor.firstNUsers(10).all();
for (User user : users) {
    System.out.println(user.getName());
}
Java Driver Object Mapper 
Demo
Achilles! 
Why ? 
• started in late 2012, when the mapper module did not exist
• more involved and more features than the mapper module 
• different annotations set (may converge)
Achilles 
Demo
Dirty Checking! 
Dirty checking, why is it important ? 
• 1 user ≈ 8 mutable fields 
• × n denormalizations = n update combinations 
• and that is not even counting multi-field updates …
Dirty Checking! 
• Are you going to manually generate n prepared statements for all possible updates?
• Or just use dynamic plain string statements and accept a performance penalty?
Dirty Checking! 
//No read-before-write 
ContactEntity proxy = manager.forUpdate(ContactEntity.class, contactId); 
proxy.setFirstName(…); 
proxy.setLastName(…); //type-safe updates 
proxy.setAddress(…); 
manager.update(proxy);
Dirty Checking! 
[Diagram: a Proxy intercepts setter calls and records them in a DirtyMap; it wraps an empty Entity that holds only the PrimaryKey]
Dirty Checking! 
• Dynamic statement generation 
UPDATE contacts SET firstname=?, lastname=?, address=?
WHERE contact_id=?
prepared statements are cached, of course
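Setter interception and dynamic statement generation can be sketched with the JDK alone. The example below uses `java.lang.reflect.Proxy` as a stand-in; how Achilles builds its proxies is not shown on the slides, and the names `DirtyCheckingSketch`, `Contact`, `forUpdate` and `updateStatement` are hypothetical. Only the columns whose setters were called end up in the generated UPDATE.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of dirty checking with a JDK dynamic proxy (not Achilles' implementation).
class DirtyCheckingSketch {

    /** Setters of the entity; the proxy implements this interface. */
    interface Contact {
        void setFirstName(String value);
        void setLastName(String value);
        void setAddress(String value);
    }

    /** Column name -> new value, in the order the setters were called. */
    final Map<String, Object> dirtyMap = new LinkedHashMap<>();

    /** Returns a proxy that records every setter call instead of reading the row first. */
    Contact forUpdate() {
        return (Contact) Proxy.newProxyInstance(
                Contact.class.getClassLoader(),
                new Class<?>[] {Contact.class},
                (proxy, method, args) -> {
                    String name = method.getName();
                    if (name.startsWith("set") && args != null && args.length == 1) {
                        // Assumption: column name = lower-cased property name
                        dirtyMap.put(name.substring(3).toLowerCase(Locale.ROOT), args[0]);
                    }
                    return null;
                });
    }

    /** Builds an UPDATE that touches only the dirty columns. */
    String updateStatement() {
        List<String> assignments = new ArrayList<>();
        for (String column : dirtyMap.keySet()) {
            assignments.add(column + "=?");
        }
        return "UPDATE contacts SET " + String.join(", ", assignments) + " WHERE contact_id=?";
    }
}
```

The generated string depends only on which setters were called, so it can serve as a cache key for the corresponding prepared statement.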
Main API! 
manager.insert(entity) 
manager.update(entity) 
manager.remove(entity) 
manager.find(Entity.class, primaryKey)
Advanced Features! 
Counter 
Batch mode 
Strategies (insert, naming) 
Options 
Asynchronous
Documentation! 
Comprehensive Github WIKI 
Twitter-clone demo app (demo.achilles.io) 
Versioned documentation (HTML & PDF) 
JavaDoc
RoadMap! 
C* 2.1 user defined types (UDT) 
Query-templates à-la Spring Data 
Reactive ? (RxJava) 
ElasticSearch integration (@olivierbourgain)
Q & A
Thank You 
@doanduyhai 
duy_hai.doan@datastax.com 
https://academy.datastax.com/
