SlideShare a Scribd company logo
1 of 28
Download to read offline
©2013 DataStax Confidential. Do not distribute without consent.
@rustyrazorblade
Jon Haddad

Technical Evangelist, DataStax
Python & Cassandra
1
This should be boring
• Talking to a database should not
be any of the following:
• Exciting
• "AH HA!"
• Confusing
git@github.com:rustyrazorblade/python-presentation.git
Agenda
• Go over driver basic concepts
• Connecting
• Perform queries
• Introduce object mapper
(cqlengine)
• Application integration
DataStax Native Python Driver
• Talks to Cassandra
• Connection pooling
• Aware of cluster topology
• Automatic retries / failure
management
• Load balancing
• Will include object mapper
(cqlengine) in next release
• Fully Open Source (Apache
License)
Connect to Cassandra
• Import and create a Cluster instance
• Cluster takes options such as load balancing policy, reconnect policy, retry
policy
• On connection, driver discovers entire cluster automatically
Executing queries
• CQL: Similar to SQL
• session.execute()
• Create tables, insert, selects
• Can accept simple strings
• Not token aware
Prepared Statements
• Use for all queries (inserts / updates / deletes)
• Decrease server load
• Increase security
• Allows for token aware queries
Async Queries
• Prepared statements required!
• Much faster than sync
• Utilize the entire cluster
• Driver can help us here
• We can use futures
1 statement = """INSERT INTO sensor
2 (sensor_id, name, created_at)
3 VALUES (?, ?, ?)"""
4
5 insert_sensor = session.prepare(statement)
6
7 def create_sensor_entries_callback(response, sensor_id):
8 print "CALLBACK"
9
10 for x in range(10):
11 sensor_data = (uuid.uuid4(), "sensor %d" % x, datetime.now())
12 future = session.execute_async(insert_sensor, sensor_data)
13 future.add_callback(create_sensor_entries_callback, sensor_id)
14
Async Queries w/ Callbacks
callback function
add callback
1 from cassandra.concurrent import execute_concurrent_with_args
2
3 stmt = """SELECT * FROM sensor_data WHERE sensor_id=?
4 ORDER BY created_at DESC LIMIT 1""")
5
6 select_statement = session.prepare(stmt)
7
8 sensor_ids = [["f472d5ff-0c76-404a-8044-038db416685e"],
9 ["940cb741-d5b5-4c5d-82f5-bf1aa61c6d47"],
10 ["497d4b2c-cba2-4d0f-bd80-42de612690fd"],
11 ["1bdeac75-7e12-43ba-80b5-2d38405f9843"]
12
13 result = execute_concurrent_with_args(session, select_statement, sensor_ids)
Async Queries (managed)
prepared statement
automatically manages concurrency
Performance Considerations
• Like SQL, CQL features IN() but in
general, it's terrible for
performance
• Results in more GC & perf
problems
• BATCH has the same issue
• Failure to get a single result
causes entire IN() or batch to retry
Object Mapper
Defining Models
• Each model maps to a single table
• Every model inherits from cassandra.cqlengine.models.Model
• Define fields in your table programatically
• Collections map to native Python types (lists, sets, dict)
• Table management included (no need to write ALTER)
Model with Collections
• Sets & Maps are most useful
• Use to denormalize
• Lists can have performance issues if misused
1 class Message(Model):
2 message_id = TimeUUID(primary_key=True, default=uuid1)
3 subject = Text()
4 body = Text()
5 addressed_to = Set(UUID)
6
7 class Photo(Model):
8 photo_id = UUID(primary_key=True, default=uuid4)
9 title = Text()
10 likes = Map<UUID, Text>
Clustering Keys
• Automatically determined by
ordering in model
• First primary key is partition key
• The rest are clustering keys
1 class UsersInGroup(Model):
2 group_id = UUID(primary_key=True)
3 user_id = UUID(primary_key=True)
4 is_admin = Boolean()
5
6
1 class UsersInGroupByState(Model):
2 group_id = UUID(primary_key=True, partition_key=True)
3 state = Text(primary_key=True, partition_key=True
4 user_id = UUID(primary_key=True)
5 is_admin = Boolean(default=False)
Inserting Data
• Model.create(**kwargs)
• Performs validation
• Supports custom validation
• Supports TTLs
Lightweight Transactions
• Uses paxos for consensus
• IF NOT EXISTS for INSERT
• IF FIELD=VALUE for UPDATE
• Use sparingly - requires
several round trips
Batches
• Use only to maintain multiple views (for consistency purposes)
1 class User(Model):
2 name = Text(primary_key=True)
3 twitter = Text()
4 email = Text()
5
6 class TwitterToUser(Model):
7 twitter = Text(primary_key=True)
8 name = Text()
9
10 (twitter, name) = ("rustyrazorblade", "jon")
11
12 with BatchQuery() as b:
13 User.batch(b).create(name=name, twitter=twitter)
14 EmailToUser.batch(b).create(twitter=twitter, name=name)
Fetching a Row
• Model.get() can be used to
fetch a single row
• Will throw a DoesNotExist
exception if not found
Fetching Many Rows
• Model.objects() accepts any filter acceptable to Cassandra
Table Properties
• Every table option supported
• Compaction
• gc_grace_seconds
• read repair chance
• caching
Table Inheritance
• Multiple tables with similar fields
• Query Pattern: filtering
Table Polymorphism
• Similar to inheritance
• Uses a single table
• Query pattern: select all types
Application Development
Virtual Environments
• virtualenv is your friend!
• mkvirtualenv also your friend!
• pip install mkvirtualenv
Flask==0.10.1
blist==1.3.6
cassandra-driver==2.1.2
Flask==0.9.0
rednose==0.4.1
ipdb==0.7
ipdbplugin==1.2
ipython==2.3.1
mock==1.0.1
nose==1.3.4
All sandboxed environments
Integrations
• Django
• django-cassandra-engine
• Integrates with manage.py
• Flask
• use @app.before_first_request
• General rule: connect post-fork
Go build stuff!
©2013 DataStax Confidential. Do not distribute without consent. 28

More Related Content

What's hot

Capturing, Analyzing, and Optimizing your SQL
Capturing, Analyzing, and Optimizing your SQLCapturing, Analyzing, and Optimizing your SQL
Capturing, Analyzing, and Optimizing your SQL
Padraig O'Sullivan
 
Hello click click boom
Hello click click boomHello click click boom
Hello click click boom
symbian_mgl
 
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูลบทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
Priew Chakrit
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
webhostingguy
 

What's hot (19)

50 new features of Java EE 7 in 50 minutes
50 new features of Java EE 7 in 50 minutes50 new features of Java EE 7 in 50 minutes
50 new features of Java EE 7 in 50 minutes
 
Core Data Performance Guide Line
Core Data Performance Guide LineCore Data Performance Guide Line
Core Data Performance Guide Line
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query Tuning
 
Configure the dbase using em in oracle 11g
Configure the dbase using em in oracle 11gConfigure the dbase using em in oracle 11g
Configure the dbase using em in oracle 11g
 
Drools
DroolsDrools
Drools
 
Learning Java 4 – Swing, SQL, and Security API
Learning Java 4 – Swing, SQL, and Security APILearning Java 4 – Swing, SQL, and Security API
Learning Java 4 – Swing, SQL, and Security API
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for Profiling
 
Capturing, Analyzing, and Optimizing your SQL
Capturing, Analyzing, and Optimizing your SQLCapturing, Analyzing, and Optimizing your SQL
Capturing, Analyzing, and Optimizing your SQL
 
Hello click click boom
Hello click click boomHello click click boom
Hello click click boom
 
MySql:Basics
MySql:BasicsMySql:Basics
MySql:Basics
 
State of entity framework
State of entity frameworkState of entity framework
State of entity framework
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
What's new in Cassandra 2.0
What's new in Cassandra 2.0What's new in Cassandra 2.0
What's new in Cassandra 2.0
 
Load Data Fast!
Load Data Fast!Load Data Fast!
Load Data Fast!
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demo
 
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูลบทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
บทที่ 4 การเพิ่มข้อมูลลงฐานข้อมูล
 
Insert
InsertInsert
Insert
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
 
Selenium Webdriver with data driven framework
Selenium Webdriver with data driven frameworkSelenium Webdriver with data driven framework
Selenium Webdriver with data driven framework
 

Similar to Cassandra Day Atlanta 2015: Python & Cassandra

Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net DriverGetting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
DataStax Academy
 

Similar to Cassandra Day Atlanta 2015: Python & Cassandra (20)

DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
 
DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
 
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net DriverGetting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
 
PostgreSQL Performance Problems: Monitoring and Alerting
PostgreSQL Performance Problems: Monitoring and AlertingPostgreSQL Performance Problems: Monitoring and Alerting
PostgreSQL Performance Problems: Monitoring and Alerting
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disqus
 
Cassandra Day NY 2014: Getting Started with the DataStax C# Driver
Cassandra Day NY 2014: Getting Started with the DataStax C# DriverCassandra Day NY 2014: Getting Started with the DataStax C# Driver
Cassandra Day NY 2014: Getting Started with the DataStax C# Driver
 
Building and Scaling Node.js Applications
Building and Scaling Node.js ApplicationsBuilding and Scaling Node.js Applications
Building and Scaling Node.js Applications
 
Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4
 
Staying Ahead of the Curve with Spring and Cassandra 4 (SpringOne 2020)
Staying Ahead of the Curve with Spring and Cassandra 4 (SpringOne 2020)Staying Ahead of the Curve with Spring and Cassandra 4 (SpringOne 2020)
Staying Ahead of the Curve with Spring and Cassandra 4 (SpringOne 2020)
 
Planet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance EnhancementPlanet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance Enhancement
 
(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++
 
Performance Test Driven Development with Oracle Coherence
Performance Test Driven Development with Oracle CoherencePerformance Test Driven Development with Oracle Coherence
Performance Test Driven Development with Oracle Coherence
 
Built-in query caching for all PHP MySQL extensions/APIs
Built-in query caching for all PHP MySQL extensions/APIsBuilt-in query caching for all PHP MySQL extensions/APIs
Built-in query caching for all PHP MySQL extensions/APIs
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in action
 
Introduction to Apache Mesos
Introduction to Apache MesosIntroduction to Apache Mesos
Introduction to Apache Mesos
 
Where the wild things are - Benchmarking and Micro-Optimisations
Where the wild things are - Benchmarking and Micro-OptimisationsWhere the wild things are - Benchmarking and Micro-Optimisations
Where the wild things are - Benchmarking and Micro-Optimisations
 
Rails Security
Rails SecurityRails Security
Rails Security
 
Query Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksQuery Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New Tricks
 
Anirudh Koul. 30 Golden Rules of Deep Learning Performance
Anirudh Koul. 30 Golden Rules of Deep Learning PerformanceAnirudh Koul. 30 Golden Rules of Deep Learning Performance
Anirudh Koul. 30 Golden Rules of Deep Learning Performance
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 

More from DataStax Academy

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
DataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
DataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
DataStax Academy
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
DataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
DataStax Academy
 

More from DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Enabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax EnterpriseEnabling Search in your Cassandra Application with DataStax Enterprise
Enabling Search in your Cassandra Application with DataStax Enterprise
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Cassandra Day Atlanta 2015: Python & Cassandra

  • 1. ©2013 DataStax Confidential. Do not distribute without consent. @rustyrazorblade Jon Haddad
 Technical Evangelist, DataStax Python & Cassandra 1
  • 2. This should be boring • Talking to a database should not be any of the following: • Exciting • "AH HA!" • Confusing git@github.com:rustyrazorblade/python-presentation.git
  • 3. Agenda • Go over driver basic concepts • Connecting • Perform queries • Introduce object mapper (cqlengine) • Application integration
  • 4. DataStax Native Python Driver • Talks to Cassandra • Connection pooling • Aware of cluster topology • Automatic retries / failure management • Load balancing • Will include object mapper (cqlengine) in next release • Fully Open Source (Apache License)
  • 5. Connect to Cassandra • Import and create a Cluster instance • Cluster takes options such as load balancing policy, reconnect policy, retry policy • On connection, driver discovers entire cluster automatically
  • 6. Executing queries • CQL: Similar to SQL • session.execute() • Create tables, insert, selects • Can accept simple strings • Not token aware
  • 7. Prepared Statements • Use for all queries (inserts / updates / deletes) • Decrease server load • Increase security • Allows for token aware queries
  • 8. Async Queries • Prepared statements required! • Much faster than sync • Utilize the entire cluster • Driver can help us here • We can use futures
  • 9. 1 statement = """INSERT INTO sensor 2 (sensor_id, name, created_at) 3 VALUES (?, ?, ?)""" 4 5 insert_sensor = session.prepare(statement) 6 7 def create_sensor_entries_callback(response, sensor_id): 8 print "CALLBACK" 9 10 for x in range(10): 11 sensor_data = (uuid.uuid4(), "sensor %d" % x, datetime.now()) 12 future = session.execute_async(insert_sensor, sensor_data) 13 future.add_callback(create_sensor_entries_callback, sensor_id) 14 Async Queries w/ Callbacks callback function add callback
  • 10. 1 from cassandra.concurrent import execute_concurrent_with_args 2 3 stmt = """SELECT * FROM sensor_data WHERE sensor_id=? 4 ORDER BY created_at DESC LIMIT 1""") 5 6 select_statement = session.prepare(stmt) 7 8 sensor_ids = [["f472d5ff-0c76-404a-8044-038db416685e"], 9 ["940cb741-d5b5-4c5d-82f5-bf1aa61c6d47"], 10 ["497d4b2c-cba2-4d0f-bd80-42de612690fd"], 11 ["1bdeac75-7e12-43ba-80b5-2d38405f9843"] 12 13 result = execute_concurrent_with_args(session, select_statement, sensor_ids) Async Queries (managed) prepared statement automatically manages concurrency
  • 11. Performance Considerations • Like SQL, CQL features IN() but in general, it's terrible for performance • Results in more GC & perf problems • BATCH has the same issue • Failure to get a single result causes entire IN() or batch to retry
  • 13. Defining Models • Each model maps to a single table • Every model inherits from cassandra.cqlengine.models.Model • Define fields in your table programatically • Collections map to native Python types (lists, sets, dict) • Table management included (no need to write ALTER)
  • 14. Model with Collections • Sets & Maps are most useful • Use to denormalize • Lists can have performance issues if misused 1 class Message(Model): 2 message_id = TimeUUID(primary_key=True, default=uuid1) 3 subject = Text() 4 body = Text() 5 addressed_to = Set(UUID) 6 7 class Photo(Model): 8 photo_id = UUID(primary_key=True, default=uuid4) 9 title = Text() 10 likes = Map<UUID, Text>
  • 15. Clustering Keys • Automatically determined by ordering in model • First primary key is partition key • The rest are clustering keys 1 class UsersInGroup(Model): 2 group_id = UUID(primary_key=True) 3 user_id = UUID(primary_key=True) 4 is_admin = Boolean() 5 6 1 class UsersInGroupByState(Model): 2 group_id = UUID(primary_key=True, partition_key=True) 3 state = Text(primary_key=True, partition_key=True 4 user_id = UUID(primary_key=True) 5 is_admin = Boolean(default=False)
  • 16. Inserting Data • Model.create(**kwargs) • Performs validation • Supports custom validation • Supports TTLs
  • 17. Lightweight Transactions • Uses paxos for consensus • IF NOT EXISTS for INSERT • IF FIELD=VALUE for UPDATE • Use sparingly - requires several round trips
  • 18. Batches • Use only to maintain multiple views (for consistency purposes) 1 class User(Model): 2 name = Text(primary_key=True) 3 twitter = Text() 4 email = Text() 5 6 class TwitterToUser(Model): 7 twitter = Text(primary_key=True) 8 name = Text() 9 10 (twitter, name) = ("rustyrazorblade", "jon") 11 12 with BatchQuery() as b: 13 User.batch(b).create(name=name, twitter=twitter) 14 EmailToUser.batch(b).create(twitter=twitter, name=name)
  • 19. Fetching a Row • Model.get() can be used to fetch a single row • Will throw a DoesNotExist exception if not found
  • 20. Fetching Many Rows • Model.objects() accepts any filter acceptable to Cassandra
  • 21. Table Properties • Every table option supported • Compaction • gc_grace_seconds • read repair chance • caching
  • 22. Table Inheritance • Multiple tables with similar fields • Query Pattern: filtering
  • 23. Table Polymorphism • Similar to inheritance • Uses a single table • Query pattern: select all types
  • 25. Virtual Environments • virtualenv is your friend! • mkvirtualenv also your friend! • pip install mkvirtualenv Flask==0.10.1 blist==1.3.6 cassandra-driver==2.1.2 Flask==0.9.0 rednose==0.4.1 ipdb==0.7 ipdbplugin==1.2 ipython==2.3.1 mock==1.0.1 nose==1.3.4 All sandboxed environments
  • 26. Integrations • Django • django-cassandra-engine • Integrates with manage.py • Flask • use @app.before_first_request • General rule: connect post-fork
  • 28. ©2013 DataStax Confidential. Do not distribute without consent. 28