Cassandra Drivers
instaclustr.com
@Instaclustr
Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud
• Currently in AWS, Azure and IBM Softlayer
• We currently manage 400+ nodes
What this talk will cover
• Driver basics
• Sync vs Async
• Driver connection policies and tuning
The driver
• The Cassandra driver contains the logic for connecting to
Cassandra and running queries in a fast and efficient manner
• Focus on the DataStax open source drivers:
The driver
• Java
• .NET (C#)
• C/C++
• Python
• Node.js
• Ruby
• PHP
Cassandra Drivers
• All have a similar architecture that consists of:
• Session & pool management
• Chainable policies for managing failure and performance
• Sync vs Async queries
• Failover & Retry
• Tracing
Cassandra Drivers
A basic example in Java:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

Cluster cluster = Cluster.builder()
    .addContactPoints("52.89.183.67")
    .withPort(9042)
    .build();
Session session = cluster.newSession();
session.execute("SELECT * FROM foo…");
Cassandra Drivers
A basic example in Python:
from cassandra.cluster import Cluster

cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
A basic example in Ruby:
require 'cassandra'

cluster = Cassandra.cluster(
    :hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"],
    :datacenter => 'AWS_VPC_US_WEST_2'
)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
Cassandra Drivers
• Architecture makes the driver similar across languages
• What happens under the hood?
• Cluster object creates configuration (auth, load balancing, contact
points).
• Session object holds the thread pool and manages connections.
• Session object authenticates and maintains connections.
• Session can be shared and is threadsafe!
Different ways of querying
• Synchronous:
session.execute("SELECT * FROM foo..");
• Asynchronous:
ResultSetFuture result = session.executeAsync("SELECT * FROM foo..");
result.get();
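To make the sync/async difference concrete, here is a minimal sketch in plain Python. It does not talk to Cassandra: `fake_query` is a hypothetical stand-in for one network round trip, and a thread pool stands in for the driver's own asynchronous machinery (real drivers multiplex requests over connections rather than using a thread per request). The point is only that the async style keeps several requests in flight at once.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_query(i, latency_s=0.01):
    """Hypothetical stand-in for one round trip to the cluster (not a driver call)."""
    time.sleep(latency_s)
    return i * 2


def run_sync(n):
    # One request at a time: each call blocks until its result arrives.
    return [fake_query(i) for i in range(n)]


def run_async(n, in_flight=8):
    # Submit everything first, then collect: up to `in_flight` requests overlap.
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        futures = [pool.submit(fake_query, i) for i in range(n)]
        return [f.result() for f in futures]


sync_results = run_sync(16)
async_results = run_async(16)
```

Both functions return the same rows; the async version simply spends less wall-clock time waiting, which is where the extra op/s comes from.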
How do these perform?
[Chart: operations per second (Op/s, axis 0 to 30000) for Read Sync, Write Sync, Read Async, Write Async]
How do these perform?
[Chart: latency in ms (axis 0 to 80) for Read Sync, Write Sync, Read Async, Write Async]
Different ways of querying
Prepared Statements:
PreparedStatement statement = getSession().prepare(
      "INSERT INTO simplex.songs " +
      "(id, title, album, artist, tags) " +
      "VALUES (?, ?, ?, ?, ?);");
Different ways of querying
BoundStatement boundStatement = new BoundStatement(statement);
getSession().execute(boundStatement.bind(
      UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
      UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
      "La Petite Tonkinoise",
      "Bye Bye Blackbird",
      "Joséphine Baker")
);
Drivers and consistency
• Within the different ways of querying Cassandra you can also adjust
the consistency level per query.
• Let's have a quick consistency refresher
A brief intro to tuneable consistency
• Cassandra is considered to be a db that favours Availability and
Partition Tolerance.
• Lets you change those characteristics per query to suit your
application requirements
Two consistency levers
• Consistency level - How many acknowledgements/responses from
replicas before a query is considered a success.
• Replication Factor (RF) - How many copies of a record do I store.
Two consistency levers
• Consistency level - Chosen by the client at query time
• Replication Factor (RF) - Determined by the client at schema definition
Consistency Levels
• ALL - Every replica
• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)
• Numbered - (ONE, TWO, THREE, LOCAL_ONE)
• *SERIAL - (SERIAL, LOCAL_SERIAL)
• ANY
What does it all mean
• At the client level (your application) you have total control
• Define implicit and explicit failure handling
• Isolate queries to a single geography
• Trade consistency for latency (a decision is better than no
decision)
How does it all work?
[Diagram, four animation frames: a write for partition_key: a at CL:QUORUM with RF:3, on a ring of four nodes (1 to 4)]
How does CL impact Op/s?
[Chart: operations (axis 0 to 30000) for Read Sync, Write Sync, Read Async, Write Async at consistency levels ONE, QUORUM, ALL]
How does CL impact latency?
[Chart: latency (axis 0 to 120) for Read Sync, Write Sync, Read Async, Write Async at consistency levels ONE, QUORUM, ALL]
What happens when something goes wrong?
[Diagram, animation frames: a write for partition_key: b at CL:QUORUM with RF:3, on a ring of four nodes (1 to 4); two replicas acknowledge (✓✓)]
Required responses:
floor(3 * 0.5) + 1 = 2
Success!
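The arithmetic on the slide can be sketched as a small function. This is a simplified, single data center model, assuming one of the three replicas is down; the function names are made up for illustration and are not part of any driver.

```python
def required_responses(consistency_level, replication_factor):
    """Replica acknowledgements needed before a query counts as a success.

    Simplified single data center model (illustrative names, not a driver API).
    """
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if consistency_level in fixed:
        return fixed[consistency_level]
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1  # floor(RF * 0.5) + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("level not covered by this sketch: " + consistency_level)


def write_succeeds(consistency_level, replication_factor, replicas_up):
    # The write succeeds if enough replicas are alive to acknowledge it.
    return replicas_up >= required_responses(consistency_level, replication_factor)


# RF 3 with one replica down (assumption): which levels still succeed?
outcomes = {cl: write_succeeds(cl, 3, 2) for cl in ("ONE", "QUORUM", "ALL")}
```

With two of three replicas up, ONE and QUORUM still succeed while ALL fails, which is the trade-off the consistency level lets the client pick per query.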
How does an outage impact Op/s?
[Chart: operations (axis 0 to 30000) for Read Sync, Write Sync, Read Async, Write Async at consistency levels ONE, QUORUM, ALL]
How does an outage impact latency?
[Chart: latency (axis 0 to 100) for Read Sync, Write Sync, Read Async, Write Async at consistency levels ONE, QUORUM, ALL]
We now have a replica that is not consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff - let's cover this quickly
• Read repair
What is hinted handoff?
• A performance optimisation for “catching up” nodes that missed
writes.
What isn’t hinted handoff?
• A consistent distribution mechanism
How does it all work?
[Diagram, two animation frames: a write for partition_key: b at CL:QUORUM with RF:3, on a ring of four nodes (1 to 4)]
How does hinted handoff work?
[Diagram, four animation frames for partition_key: b on a ring of four nodes (1 to 4), with this table:]
host / key   A   B
1            ✔   ✔
2            ?
3            ✔   ✔
…            ✔
Gossip: 2 is now UP
Node 1: I have stored hints for 2
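The sequence in these frames can be sketched as a toy model. This is not how Cassandra implements hints: `ToyCluster` and its methods are hypothetical, the hint store is a single in-memory dict rather than per-coordinator storage, and the hint window is ignored. It only shows the shape of the mechanism: a write to a down replica is parked as a hint and replayed once the node is reported up again.

```python
from collections import defaultdict


class ToyCluster:
    """Toy model of hinted handoff (hypothetical, not Cassandra's implementation)."""

    def __init__(self, nodes):
        self.data = {node: {} for node in nodes}  # node -> stored key/value pairs
        self.up = set(nodes)
        self.hints = defaultdict(list)  # down node -> [(key, value), ...]

    def mark_down(self, node):
        self.up.discard(node)

    def write(self, key, value, replicas):
        """Write to every replica that is up; store a hint for each that is down."""
        acks = 0
        for node in replicas:
            if node in self.up:
                self.data[node][key] = value
                acks += 1
            else:
                self.hints[node].append((key, value))
        return acks

    def mark_up(self, node):
        """Node is seen as up again: replay and discard its stored hints."""
        self.up.add(node)
        for key, value in self.hints.pop(node, []):
            self.data[node][key] = value


cluster = ToyCluster([1, 2, 3, 4])
cluster.mark_down(2)
acks = cluster.write("b", "v1", replicas=[1, 2, 3])
missing_before_replay = "b" not in cluster.data[2]
cluster.mark_up(2)
```

The write returns two acknowledgements, enough for QUORUM at RF 3, and node 2 only receives the value after it comes back and the hint is replayed.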
Some things to keep in mind
• Cassandra will only store hints for a certain period of time, set by
max_hint_window_in_ms (3 hours by default)
• Hints are not a reliable delivery mechanism
• Hint replay will cause counters to overcount
• A CL of ANY will cause a hint to be stored even if no replicas are
available. Sometimes called extreme availability… also called "who
knows where and if your data is safe?"
Hinted handoff performance
• Causes the same volume of writes to occur in a cluster with
reduced capacity (local write amplification on the coordinator
node)
• Hints are written to system.hints; each replica's hints are stored in a
single partition.
• Hints use TTLs and tombstones... the hint table is actually a queue!
• When Cassandra starts compacting or throwing tombstone
warnings on the system.hints table… things are bad
Hinted handoff performance
• Rewritten in Cassandra 3.0 (in beta now)
• Takes a commitlog approach:
• No compaction
• No TTL
• No tombstones
• No memtables
How does this relate to the driver?
• With a node outage the “latency” on the down node becomes
hours/days until it becomes consistent
• Cassandra itself takes over the client portion of ensuring the write
makes it to the node that was down.
• You can control whether C* handles this (via repair, HH etc) or
whether your application controls this (have your client receive an
exception instead).
Driver policies
• Cassandra driver policies allow you to control failure
• Cassandra driver policies allow you to control how the driver routes
requests
• This can reduce your latency and/or increase op/s (in some cases)
Retry Policy
• Default Retry Policy
• Downgrading Consistency Retry Policy
• Fallthrough Retry Policy
• Logging Retry Policy
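To show how two of these policies differ in spirit, here is a simplified decision sketch. The function names and the `("retry", level)` / `("rethrow", None)` return shape are invented for illustration; real driver policies have richer inputs and rules, so treat this as a model of the idea rather than the drivers' behaviour.

```python
def downgraded_level(alive_replicas):
    """Highest numbered consistency level the alive replicas could still meet."""
    if alive_replicas >= 3:
        return "THREE"
    if alive_replicas == 2:
        return "TWO"
    if alive_replicas == 1:
        return "ONE"
    return None


def downgrading_on_unavailable(alive_replicas, already_retried):
    """Downgrading idea: retry once at a weaker level instead of failing."""
    if already_retried:
        return ("rethrow", None)
    level = downgraded_level(alive_replicas)
    if level is None:
        return ("rethrow", None)
    return ("retry", level)


def fallthrough_on_unavailable(alive_replicas, already_retried):
    """Fallthrough idea: never retry, always hand the error to the application."""
    return ("rethrow", None)


# A QUORUM request at RF 3 with only one replica alive (assumed scenario):
downgrading_decision = downgrading_on_unavailable(1, already_retried=False)
fallthrough_decision = fallthrough_on_unavailable(1, already_retried=False)
```

The first trades consistency for availability without the application noticing; the second makes the application decide, which matches the earlier point about having your client receive an exception instead.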
Load Balancing Policy
• Round Robin
• DC Aware
• Token Aware
• Latency Aware
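A rough sketch of why token awareness can cut latency: round robin may pick a coordinator that does not own the data and has to forward the request, while a token-aware choice goes straight to a replica. Everything here is an assumption for illustration: the node names, the md5 hash (a stand-in, not Cassandra's partitioner), and the simple "next RF nodes on the ring" placement.

```python
import hashlib
import itertools

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical four-node ring
RF = 3


def token_for(key):
    # md5 is only a stand-in hash for this sketch, not Cassandra's partitioner.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas_for(key):
    """The RF nodes that hold `key`: the owner plus the next nodes on the ring."""
    start = token_for(key) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]


_ring_cycle = itertools.cycle(NODES)


def pick_round_robin(key):
    # Ignores the key: the next node in turn becomes the coordinator.
    return next(_ring_cycle)


def pick_token_aware(key):
    # Always sends the request to a node that holds the data.
    return replicas_for(key)[0]


keys = ["a", "b", "c", "d", "e", "f", "g", "h"]
token_aware_hits = sum(pick_token_aware(k) in replicas_for(k) for k in keys)
round_robin_hits = sum(pick_round_robin(k) in replicas_for(k) for k in keys)
```

`token_aware_hits` always equals the number of keys, whereas `round_robin_hits` depends on where the cycle happens to be, so some requests pay for an extra hop.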
How do driver policies impact latency?
[Chart: latency (axis 0 to 1.2) for Read Sync and Write Sync under Round Robin, Token Aware, Latency Aware]
Last but not least
• Use one Cluster instance per (physical) cluster (per application
lifetime)
• Use at most one Session per keyspace, or use a single Session and
explicitly specify the keyspace in your queries
• If you execute a statement more than once, consider using a
PreparedStatement
• You can reduce the number of network roundtrips and also have
atomic operations by using Batches
Thank you!
Questions?

Cassandra Driver Performance Tuning

  • 2. Who am I and what do I do? • Ben Bromhead • Co-founder and CTO of Instaclustr -> www.instaclustr.com • Instaclustr provides Cassandra-as-a-Service in the cloud • Currently in AWS, Azure and IBM Softlayer • We currently manage 400+ nodes
  • 3. What this talk will cover • Driver basics • Sync vs Async • Driver connection policies and tuning
  • 4. The driver • The Cassandra driver contains the logic for connecting to Cassandra and running queries in a fast and efficient manner • Focus on the Datastax Open Source drivers:
  • 5. The driver • Java • .NET (C#) • C/C++ • Python • Node.js • Ruby • PHP
  • 6. Cassandra Drivers • All have a similar architecture that consists of: • Session & pool management • Chainable policies for managing failure and performance • Sync vs Async queries • Failover & Retry • Tracing
  • 7. Cassandra Drivers A basic example in Java:
        import com.datastax.driver.core.Cluster;
        import com.datastax.driver.core.Session;

        Cluster cluster = Cluster.builder()
            .addContactPoints("52.89.183.67")
            .withPort(9042)
            .build();
        Session session = cluster.newSession();
        session.execute("SELECT * FROM foo…");
  • 8. Cassandra Drivers A basic example in Python:
        from cassandra.cluster import Cluster

        cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
        session = cluster.connect()
        rows = session.execute("SELECT name, age, email FROM users")
  • 9. Cassandra Drivers A basic example in Ruby:
        require 'cassandra'

        cluster = Cassandra.cluster(
            :hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"],
            :datacenter => 'AWS_VPC_US_WEST_2'
        )
        session = cluster.connect()
        rows = session.execute("SELECT name, age, email FROM users")
  • 10. Cassandra Drivers • Architecture makes the driver similar across languages • What happens under the hood? • Cluster object creates configuration (auth, load balancing, contact points). • Session object holds the thread pool and manages connections. • Session object authenticates and maintains connections. • Session can be shared and is threadsafe!
  • 11. Different ways of querying
      • Synchronous:
            session.execute("SELECT * FROM foo..");
      • Asynchronous:
            ResultSetFuture result = session.executeAsync("SELECT * FROM foo..");
            result.get();  // blocks until the result is available
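The difference between the two styles can be sketched without a cluster. The snippet below is a minimal simulation, not driver code: `fake_query` is a hypothetical stand-in for a driver call, the 20 ms latency is an assumed value, and a thread pool is used only to emulate having several requests in flight at once.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_S = 0.02  # assumed round-trip time, for illustration only


def fake_query(i):
    """Hypothetical stand-in for a driver call: blocks for one round trip."""
    time.sleep(SIMULATED_LATENCY_S)
    return i


def run_sync(n):
    # Each request waits for the previous one: total time is roughly n * latency.
    start = time.perf_counter()
    results = [fake_query(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_async(n, in_flight=10):
    # Up to `in_flight` requests overlap, so the waiting time is shared.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        futures = [pool.submit(fake_query, i) for i in range(n)]
        # Collect results only after everything is submitted,
        # similar in spirit to calling result.get() on each future.
        results = [f.result() for f in futures]
    return results, time.perf_counter() - start


sync_results, sync_elapsed = run_sync(20)
async_results, async_elapsed = run_async(20)
print(f"sync:  {sync_elapsed:.3f}s")
print(f"async: {async_elapsed:.3f}s")
```

Both runs return the same results; only the time spent waiting differs, because the asynchronous version does not pay each round trip in sequence.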
  • 12. How do these perform? [Chart: throughput in Op/s (axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async]
  • 13. How do these perform? [Chart: latency in ms (axis 0 to 80) for Read Sync, Write Sync, Read Async and Write Async]
  • 14. Different ways of querying Prepared Statements:
        PreparedStatement statement = getSession().prepare(
            "INSERT INTO simplex.songs " +
            "(id, title, album, artist, tags) " +
            "VALUES (?, ?, ?, ?, ?);");
  • 15. Different ways of querying
        BoundStatement boundStatement = new BoundStatement(statement);
        getSession().execute(boundStatement.bind(
            UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
            UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
            "La Petite Tonkinoise",
            "Bye Bye Blackbird",
            "Joséphine Baker"));
  • 16. Drivers and consistency • Within the different ways of querying Cassandra you can also adjust the consistency level per query. • Let's have a quick consistency refresher
  • 17. A brief intro to tuneable consistency • Cassandra is considered to be a database that favours Availability and Partition Tolerance. • It lets you change those characteristics per query to suit your application's requirements
  • 18. Two consistency levers • Consistency level - How many acknowledgements/responses from replicas are required before a query is considered a success. • Replication Factor (RF) - How many copies of a record are stored.
  • 19. Two consistency levers • Consistency level - Chosen by the client at query time • Replication Factor (RF) - Determined by the client at schema definition
  • 20. Consistency Levels • ALL - Every replica • *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM) • Numbered - (ONE, TWO, THREE, LOCAL_ONE) • *SERIAL - (SERIAL, LOCAL_SERIAL) • ANY
  • 21. What does it all mean • At the client level (your application) you have total control • Define implicit and explicit failure handling • Isolate queries to a single geography • Trade consistency for latency (a decision is better than no decision)
  • 22. How does it all work?
  • 27. How does CL impact Op/s? [Chart: operations (axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async at CL ONE, QUORUM and ALL]
  • 28. How does CL impact latency? [Chart: latency (axis 0 to 120) for Read Sync, Write Sync, Read Async and Write Async at CL ONE, QUORUM and ALL]
  • 29. What happens when something goes wrong?
  • 33. How does it all work? [Diagram: write to partition_key: b with CL: QUORUM and RF: 3 across nodes 1 to 4; two replicas acknowledge (✓✓)] Required responses: floor(3 * 0.5) + 1 = 2
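The quorum arithmetic on this slide generalises to any replication factor. The sketch below uses illustrative function names (not driver API) and also checks the usual overlap rule: a read is guaranteed to reach at least one replica that acknowledged the latest write when write acknowledgements plus read responses exceed RF.

```python
def quorum(replication_factor):
    """Responses needed for QUORUM: floor(RF * 0.5) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(write_acks, read_responses, replication_factor):
    """True when every set of read replicas must intersect every set of write replicas."""
    return write_acks + read_responses > replication_factor


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: QUORUM needs {quorum(rf)} responses")

# QUORUM writes plus QUORUM reads at RF=3: 2 + 2 > 3, so the sets overlap.
print(read_overlaps_write(quorum(3), quorum(3), 3))
# ONE writes plus ONE reads at RF=3: 1 + 1 <= 3, so a read may miss the write.
print(read_overlaps_write(1, 1, 3))
```

This is the trade described earlier in the talk: lower consistency levels need fewer responses (lower latency), but give up the guaranteed overlap.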
  • 35. How does an outage impact Op/s? [Chart: operations (axis 0 to 30,000) for Read Sync, Write Sync, Read Async and Write Async at CL ONE, QUORUM and ALL]
  • 36. How does an outage impact latency? [Chart: latency (axis 0 to 100) for Read Sync, Write Sync, Read Async and Write Async at CL ONE, QUORUM and ALL]
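Whether a request can still succeed during an outage comes down to comparing live replicas with the responses the consistency level requires. A minimal sketch, assuming RF 3 and covering only ONE, QUORUM and ALL (the function names are illustrative, not driver API):

```python
def required_responses(consistency_level, replication_factor):
    """Replica responses a coordinator needs for a given consistency level."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def can_succeed(consistency_level, replication_factor, replicas_down):
    """True if enough replicas are still up to satisfy the consistency level."""
    live = replication_factor - replicas_down
    return live >= required_responses(consistency_level, replication_factor)


for cl in ("ONE", "QUORUM", "ALL"):
    status = "succeeds" if can_succeed(cl, 3, 1) else "fails"
    print(f"RF=3, one replica down, CL={cl}: {status}")
```

With one of three replicas down, ONE and QUORUM can still be satisfied while ALL cannot, so requests at ALL fail until the replica returns.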
  • 37. We now have a replica that is not consistent • Anti-entropy repair (only guaranteed way to make things consistent) • Hinted handoff • Read repair
  • 38. We now have a replica that is not consistent • Anti-entropy repair (only guaranteed way to make things consistent) • Hinted handoff - let's cover this quickly • Read repair
  • 39. What is hinted handoff? • A performance optimisation for “catching up” nodes that missed writes.
  • 40. What isn’t hinted handoff ? • A consistent distribution mechanism
  • 43. How does hinted handoff work? [Diagram: nodes 1 to 4, with a host / key table for keys A and B: host 1 ✔ ✔, host 2 ?, host 3 ✔ ✔, … ✔]
  • 44. How does hinted handoff work? [Diagram: partition_key: b, nodes 1 to 4]
  • 45. How does hinted handoff work? [Diagram: partition_key: b, nodes 1 to 4] Gossip: 2 is now UP. Node 1: I have stored hints for 2.
  • 46. How does hinted handoff work? [Diagram: partition_key: b, nodes 1 to 4]
  • 47. Some things to keep in mind • Cassandra will only store hints for a certain period of time, set by max_hint_window_in_ms (3 hours by default) • Hints are not a reliable delivery mechanism • Hint replay will cause counters to overcount • A CL of ANY will cause a hint to be stored even if no replicas are available. Sometimes called extreme availability… also called who knows where and if your data is safe?
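The effect of the hint window can be estimated with a little arithmetic. This is a simplified model, assuming a constant write rate and that hints stop being stored once a node has been down longer than the window; `hinted_fraction` is an illustrative name, not a Cassandra setting.

```python
MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000  # 3 hours, the default cited above
HOUR_MS = 60 * 60 * 1000


def hinted_fraction(downtime_ms, window_ms=MAX_HINT_WINDOW_IN_MS):
    """Fraction of writes missed during an outage that hints could cover.

    Assumes a constant write rate (an assumption made for this sketch).
    """
    if downtime_ms <= 0:
        return 1.0
    return min(downtime_ms, window_ms) / downtime_ms


for hours in (1, 3, 6, 12):
    covered = hinted_fraction(hours * HOUR_MS)
    print(f"down for {hours}h: hints cover at most {covered:.0%} of missed writes")
```

Anything outside the covered fraction has to be fixed by read repair or anti-entropy repair, which is one reason hints are not a reliable delivery mechanism.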
  • 48. Hinted handoff performance • Causes the same volume of writes to occur in a cluster with reduced capacity (local write amplification on the coordinator node) • Hints are written to system.hints; each replica has its hints stored in a single partition. • Hints use TTLs and tombstones... the hint table is actually a queue! • When Cassandra starts compacting or throwing tombstone warnings on the system.hints table… things are bad
  • 49. Hinted handoff performance • Rewritten in Cassandra 3.0 (in beta now) • Takes a commitlog approach: • No compaction • No TTL • No tombstones • No memtables
  • 50. How does this relate to the driver? • With a node outage the “latency” on the down node becomes hours/days until it becomes consistent • Cassandra itself takes over the client portion of ensuring the write makes it to the node that was down. • You can control whether C* handles this (via repair, HH etc) or whether your application controls this (have your client receive an exception instead).
  • 51. Driver policies • Cassandra driver policies allow you to control failure • Cassandra driver policies allow you to control how the driver routes requests • This can reduce your latency and/or increase op/s (in some cases)
  • 52. Retry Policy • Default Retry Policy • Downgrading Consistency Retry Policy • Fallthrough Retry Policy • Logging Retry Policy
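To make the idea of a retry policy concrete, here is a simplified model of two of the policies listed above, for the case where too few replicas are alive. It is loosely modelled on the policy names and is not the driver's implementation: the real policies also handle read and write timeouts, and their exact rules may differ. The function names and the `("RETRY", level)` return shape are assumptions made for this sketch.

```python
def fallthrough_policy(requested_cl, alive_replicas):
    """Never retry: surface the failure so the application decides what to do."""
    return ("RETHROW", None)


def downgrading_policy(requested_cl, alive_replicas):
    """Retry at the highest numbered level the live replicas can still meet.

    `requested_cl` is unused in this simplified model; it is kept so both
    policies share one signature.
    """
    if alive_replicas >= 3:
        return ("RETRY", "THREE")
    if alive_replicas == 2:
        return ("RETRY", "TWO")
    if alive_replicas == 1:
        return ("RETRY", "ONE")
    return ("RETHROW", None)


# A QUORUM request at RF=3 with only one replica alive cannot be met as asked.
print(fallthrough_policy("QUORUM", alive_replicas=1))
print(downgrading_policy("QUORUM", alive_replicas=1))
```

The choice is the one described earlier: either the application receives the exception and handles it, or the request succeeds at a weaker consistency level and Cassandra repairs the gap later.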
  • 53. Load Balancing Policy • Round Robin • DC Aware • Token Aware • Latency Aware
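The gap between round robin and token aware routing can be illustrated with a toy ring. Everything here is an assumption for the sketch: four nodes, RF 3, CRC32 as a stand-in hash (Cassandra's default partitioner uses Murmur3), and a naive "next nodes on the ring" replica placement.

```python
import itertools
import zlib

NODES = [1, 2, 3, 4]
RF = 3


def replicas_for(key):
    """Toy placement: the hash picks the first replica, the next RF-1 nodes follow."""
    first = zlib.crc32(key.encode()) % len(NODES)
    return [NODES[(first + i) % len(NODES)] for i in range(RF)]


def token_aware_coordinator(key):
    """Send the request straight to a node that owns the key."""
    return replicas_for(key)[0]


keys = [f"key-{i}" for i in range(1000)]

# Round robin ignores the key and simply cycles through every node.
round_robin = itertools.cycle(NODES)
rr_hits = sum(next(round_robin) in replicas_for(k) for k in keys)
ta_hits = sum(token_aware_coordinator(k) in replicas_for(k) for k in keys)

print(f"round robin: coordinator is a replica for {rr_hits} of {len(keys)} requests")
print(f"token aware: coordinator is a replica for {ta_hits} of {len(keys)} requests")
```

When the coordinator is not a replica it has to forward the request to one, which adds a network hop; token aware routing avoids that hop by construction.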
  • 54. Driver policies impact latency? [Chart: latency (axis 0 to 1.2) for Read Sync and Write Sync under Round Robin, Token Aware and Latency Aware]
  • 55. Last but not least • Use one Cluster instance per (physical) cluster (per application lifetime) • Use at most one Session per keyspace, or use a single Session and explicitly specify the keyspace in your queries • If you execute a statement more than once, consider using a PreparedStatement • You can reduce the number of network round trips and also have atomic operations by using Batches