SlideShare a Scribd company logo
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra python driver
Benchmarking concurrency for nyt⨍aбrik
Michael.Laing@nytimes.com
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
A Global Mesh with a Memory
Message-based: WebSocket, AMQP, SockJS
If in doubt:
• Resend
• Reconnect
• Reread
Idempotent:
• Replicating
• Racy
• Resolving
Classes of service:
• Gold: replicate/race
• Silver: prioritize
• Bronze: queueable
Millions of users
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Message: an event with data
CREATE TABLE source_data (
hash_key int, -- real ones are more complex
message_id timeuuid,
body blob, -- whatever
metadata text, -- JSON
PRIMARY KEY (hash_key, message_id)
);
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
1-10kb
1-10kb
Ack
Ack
Push
1kb
1kb
10-150kb
10-150kb
Pull
Synchronous:
C* Thrift or
CQL Native
Concurrent
Degree = 3
(using the
Libev event
Loop)
Asynchronous:
CQL Native only
More Concurrency
Can also try:
• DC Aware
• Token Aware
• Subprocessing
Build one
def build_message(self):
message = {
"message_id": str(uuid.uuid1()),
"hash_key": randint(0, self._hash_key_range), # int(e ** 8)
"app_id": self._app_id,
"timestamp": datetime.utcnow().isoformat() + 'Z',
"content_type": "application/binary",
"body": os.urandom(randint(1, self._body_range)) # int(e ** 9)
}
Kick-off
def push_message(self):
if self._submitted_count.next() < self._message_count:
message = self.build_message()
self.submit_query(message)
def push_initial_data(self):
self._start_time = time()
try:
with self._lock:
for i in range(
0, min(CONCURRENCY, self._message_count)
):
self.push_message()
Put it in the pipeline
def submit_query(self, message):
body = message.pop('body')
substitution_args = (
json.dumps(message, **JSON_DUMPS_ARGS),
body,
message['hash_key'],
uuid.UUID(message['message_id'])
)
future = self._cql_session.execute_async(
self._query, substitution_args
)
future.add_callback(self.push_or_finish)
future.add_errback(self.note_error)
Maintain concurrency or finish
def push_or_finish(self, _):
try:
if (
self._unfinished and
self._confirmed_count.next() < self._message_count
):
with self._lock:
self.push_message()
else:
self.finish()
1-10kb
1-10kb
Ack
Ack
Push
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Push some messages
usage: bm_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-d LOCAL_DC]
[--remote-dc-hosts REMOTE_DC_HOSTS] [-p PREFETCH_COUNT]
[-w WORKER_COUNT] [-a] [-t]
[-n {ONE, TWO, THREE, QUORUM, ALL, LOCAL_QUORUM,
EACH_QUORUM, SERIAL, LOCAL_SERIAL, LOCAL_ONE}]
[-r] [-j] [-l
{CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]
Push messages from a RabbitMQ queue into a Cassandra table.
Push messages many times
usage: run_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-i ITERATIONS]
[-d LOCAL_DC] [-w [worker_count [worker_count ...]]]
[-p [prefetch_count [prefetch_count ...]]]
[-n [level [level ...]]] [-a] [-t] [-m MESSAGE_EXPONENT]
[-b BODY_EXPONENT]
[-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]
Run multiple test cases based upon the product of worker_counts,
prefetch_counts, and consistency_levels. Each test case may be run with up to
4 variations reflecting the use or not of the dc_aware and token_aware
policies. The results are output to stdout as a JSON object.
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
1kb
1kb
10-150kb
10-150kb
Pull
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform

More Related Content

What's hot

A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTiger
MongoDB
 
Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링
JANGWONSEO4
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
DataWorks Summit/Hadoop Summit
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
Guozhang Wang
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
iammutex
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Henning Jacobs
 
Scaling for Performance
Scaling for PerformanceScaling for Performance
Scaling for Performance
ScyllaDB
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
Jiangjie Qin
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
Databricks
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
JamieAlquiza
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
Jiangjie Qin
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
Cloudera, Inc.
 
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
HostedbyConfluent
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Databricks
 
Monitoring Microservices
Monitoring MicroservicesMonitoring Microservices
Monitoring Microservices
Weaveworks
 
Building and running cloud native cassandra
Building and running cloud native cassandraBuilding and running cloud native cassandra
Building and running cloud native cassandra
Vinay Kumar Chella
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
confluent
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
YugabyteDB
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your Needs
ScyllaDB
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
Christian Johannsen
 

What's hot (20)

A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTiger
 
Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
Scaling for Performance
Scaling for PerformanceScaling for Performance
Scaling for Performance
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Thoughts on kafka capacity planning
Thoughts on kafka capacity planningThoughts on kafka capacity planning
Thoughts on kafka capacity planning
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
Chill, Distill, No Overkill: Best Practices to Stress Test Kafka with Siva Ku...
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
Monitoring Microservices
Monitoring MicroservicesMonitoring Microservices
Monitoring Microservices
 
Building and running cloud native cassandra
Building and running cloud native cassandraBuilding and running cloud native cassandra
Building and running cloud native cassandra
 
Getting Started with Confluent Schema Registry
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your Needs
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 

Similar to Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform

Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
confluent
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
ScyllaDB
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Yaroslav Tkachenko
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
srisatish ambati
 
2016-tcpkali-websocket
2016-tcpkali-websocket2016-tcpkali-websocket
2016-tcpkali-websocket
Lev Walkin
 
Cassandra
CassandraCassandra
Cassandra
오석 한
 
Cassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL MeetupCassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL Meetup
Michael Wynholds
 
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoopJava one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
srisatish ambati
 
Making KVS 10x Scalable
Making KVS 10x ScalableMaking KVS 10x Scalable
Making KVS 10x Scalable
Sadayuki Furuhashi
 
Cassandra
CassandraCassandra
Cassandra
robjk
 
Toward 10,000 Containers on OpenStack
Toward 10,000 Containers on OpenStackToward 10,000 Containers on OpenStack
Toward 10,000 Containers on OpenStack
Ton Ngo
 
Cassandra & Spark for IoT
Cassandra & Spark for IoTCassandra & Spark for IoT
Cassandra & Spark for IoT
Matthias Niehoff
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
srisatish ambati
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
confluent
 
Cassandra at no_sql
Cassandra at no_sqlCassandra at no_sql
Cassandra at no_sql
srisatish ambati
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
Cliff Gilmore
 
Making (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with CachingMaking (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with Caching
Amazon Web Services
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
Murat Çakal
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
Stu Hood
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
Roger Xia
 

Similar to Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform (20)

Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
 
2016-tcpkali-websocket
2016-tcpkali-websocket2016-tcpkali-websocket
2016-tcpkali-websocket
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL MeetupCassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL Meetup
 
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoopJava one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
Java one2011 brisk-and_high_order_bits_from_cassandra_and_hadoop
 
Making KVS 10x Scalable
Making KVS 10x ScalableMaking KVS 10x Scalable
Making KVS 10x Scalable
 
Cassandra
CassandraCassandra
Cassandra
 
Toward 10,000 Containers on OpenStack
Toward 10,000 Containers on OpenStackToward 10,000 Containers on OpenStack
Toward 10,000 Containers on OpenStack
 
Cassandra & Spark for IoT
Cassandra & Spark for IoTCassandra & Spark for IoT
Cassandra & Spark for IoT
 
High order bits from cassandra & hadoop
High order bits from cassandra & hadoopHigh order bits from cassandra & hadoop
High order bits from cassandra & hadoop
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
Building Scalable and Extendable Data Pipeline for Call of Duty Games (Yarosl...
 
Cassandra at no_sql
Cassandra at no_sqlCassandra at no_sql
Cassandra at no_sql
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Making (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with CachingMaking (Almost) Any Database Faster and Cheaper with Caching
Making (Almost) Any Database Faster and Cheaper with Caching
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 

More from DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
DataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
DataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
DataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
DataStax Academy
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
DataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
DataStax Academy
 

More from DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 

Recently uploaded

Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
Safe Software
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
bhumivarma35300
 
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
alexjohnson7307
 
The Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF GuideThe Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF Guide
Shiv Technolabs
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
DianaGray10
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
Ivanti
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
Priyanka Aash
 
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
Priyanka Aash
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
alexjohnson7307
 
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
Priyanka Aash
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Muhammad Ali
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
Baishakhi Ray
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
ankush9927
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
313mohammedarshad
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
David Wilson
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
SynapseIndia
 
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
Priyanka Aash
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
Priyanka Aash
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 

Recently uploaded (20)

Data Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining DataData Integration Basics: Merging & Joining Data
Data Integration Basics: Merging & Joining Data
 
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
High Profile Girls call Service Pune 000XX00000 Provide Best And Top Girl Ser...
 
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
 
The Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF GuideThe Role of IoT in Australian Mobile App Development - PDF Guide
The Role of IoT in Australian Mobile App Development - PDF Guide
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
 
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
 
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
(CISOPlatform Summit & SACON 2024) Gen AI & Deepfake In Overall Security.pdf
 
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
 
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
(CISOPlatform Summit & SACON 2024) Cyber Insurance & Risk Quantification.pdf
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 

Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍aбrik Platform

  • 2. Cassandra python driver Benchmarking concurrency for nyt⨍aбrik Michael.Laing@nytimes.com
  • 12. A Global Mesh with a Memory Message-based: WebSocket, AMQP, SockJS If in doubt: • Resend • Reconnect • Reread Idempotent: • Replicating • Racy • Resolving Classes of service: • Gold: replicate/race • Silver: prioritize • Bronze: queueable Millions of users
  • 14. Message: an event with data CREATE TABLE source_data ( hash_key int, -- real ones are more complex message_id timeuuid, body blob, -- whatever metadata text, -- JSON PRIMARY KEY (hash_key, message_id) );
  • 20. Concurrent Degree = 3 (using the Libev event Loop) Asynchronous: CQL Native only
  • 21. More Concurrency Can also try: • DC Aware • Token Aware • Subprocessing
  • 22. Build one def build_message(self): message = { "message_id": str(uuid.uuid1()), "hash_key": randint(0, self._hash_key_range), # int(e ** 8) "app_id": self._app_id, "timestamp": datetime.utcnow().isoformat() + 'Z', "content_type": "application/binary", "body": os.urandom(randint(1, self._body_range)) # int(e ** 9) }
  • 23. Kick-off def push_message(self): if self._submitted_count.next() < self._message_count: message = self.build_message() self.submit_query(message) def push_initial_data(self): self._start_time = time() try: with self._lock: for i in range( 0, min(CONCURRENCY, self._message_count) ): self.push_message()
  • 24. Put it in the pipeline def submit_query(self, message): body = message.pop('body') substitution_args = ( json.dumps(message, **JSON_DUMPS_ARGS), body, message['hash_key'], uuid.UUID(message['message_id']) ) future = self._cql_session.execute_async( self._query, substitution_args ) future.add_callback(self.push_or_finish) future.add_errback(self.note_error)
  • 25. Maintain concurrency or finish def push_or_finish(self, _): try: if ( self._unfinished and self._confirmed_count.next() < self._message_count ): with self._lock: self.push_message() else: self.finish()
  • 39. Push some messages usage: bm_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-d LOCAL_DC] [--remote-dc-hosts REMOTE_DC_HOSTS] [-p PREFETCH_COUNT] [-w WORKER_COUNT] [-a] [-t] [-n {ONE, TWO, THREE, QUORUM, ALL, LOCAL_QUORUM, EACH_QUORUM, SERIAL, LOCAL_SERIAL, LOCAL_ONE}] [-r] [-j] [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}] Push messages from a RabbitMQ queue into a Cassandra table.
  • 40. Push messages many times usage: run_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-i ITERATIONS] [-d LOCAL_DC] [-w [worker_count [worker_count ...]]] [-p [prefetch_count [prefetch_count ...]]] [-n [level [level ...]]] [-a] [-t] [-m MESSAGE_EXPONENT] [-b BODY_EXPONENT] [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}] Run multiple test cases based upon the product of worker_counts, prefetch_counts, and consistency_levels. Each test case may be run with up to 4 variations reflecting the use or not of the dc_aware and token_aware policies. The results are output to stdout as a JSON object.