Webinar | Building Apps with the Cassandra Python Driver

With the new Python driver for Cassandra, it is easy to build integrations and apps that use Cassandra seamlessly as a back end. This session explores what it takes to build such an app and the features available in the new Python driver.

Published in: Technology

  1. 1. Building Apps with the Cassandra Python Driver. Eddie Satterly, CTO Big Data & Analytics at CSC. Dial In: 1-877-668-4493, Access Code: 807 224 168
  2. 2. Where Is the Driver? https://github.com/datastax/python-driver
  3. 3. Key Features. The driver is a connection handler that sits between your app and the Cassandra cluster and exposes a low-level API. The key features that really helped simplify the Python code from the earlier version of the app are: Connection Pooling & Node Discovery: You can connect to the whole cluster while providing only the seed nodes in your list. With my old driver you had to provide the list of all nodes and make the Python code decide how to connect. Now you give the driver a set of nodes, such as 192.168.1.1 and 192.168.1.2, and it makes a connection and automatically discovers all other nodes in the cluster.
  4. 4. Key Features Cont. Cluster Attributes: There are several cluster object attributes you can set. Some of the key ones are the default keyspace, set via cluster.connect('mykeyspace'); cql_version, for clusters that run in mixed mode because their data models were built at different times; and metrics_enabled, which controls metrics collection. ssl_options: This attribute is called out separately because of its high value in environments where client-to-node communication needs to be encrypted and that feature is turned on cluster side. It is not turned on by default in my app, but many of the customers using the app need it. Load balancing: This is a great added feature that helps avoid the hotspot nodes seen with the older driver approach. You now set the policy in an attribute (round robin is the default) and the driver controls the connections. In early tests with the old driver, even though the code was supposed to pick a pseudo-random node, affinity seemed to develop and create hotspot nodes for queries.
  5. 5. Key Features Cont. default_timeout: Setting a timeout is key so that the app can detect failures and respond without leaving the client hanging. row_factory: This lets you determine what format the results are returned in. This is very valuable for making sure your app gets the data in the optimal form for analysis and manipulation. There were over 50 lines of code in my old Python scripts to handle one-offs that are now gone since this feature exists. The driver's built-in options include tuple_factory, named_tuple_factory, dict_factory, and ordered_dict_factory. execute_async(): This is one of the best features in the new driver and makes the processing time for requests much faster from the client's point of view. There is a method you can call to block until the results arrive if needed, but in most cases doing other work while waiting on results speeds up response times by many milliseconds.
  6. 6. Take a Look at the Docs. There are many other features I did not call out, so take a look at: http://datastax.github.io/python-driver/index.html and http://datastax.github.io/python-driver/api/index.html. For high-throughput operations like remote lookups, I highly suggest using the multiprocessing module instead of multithreading, but make sure you understand the implications for object passing: objects sent between processes are serialized and copied rather than shared.
  7. 7. How I Use It. Take a look at my GitHub in a couple of weeks: the new version of the app, which uses this driver, will be there once all the final testing is done. The current version there uses the old driver and approach, so look for v2.0 at https://github.com/esatterly/splunk-cassandra. Build your own playgrounds and figure out the right options and configuration settings to return data and do analysis and manipulation on it. I will be putting out two other apps in the next few months for non-Splunk use cases as well, so stay tuned.
  8. 8. Thank You. Questions?
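
The cluster and session settings covered on the Key Features slides might be wired together roughly as below. This is a minimal sketch, not the presenter's code: it assumes the cassandra-driver package is installed and a cluster is reachable, it uses the driver's legacy (non-execution-profile) settings, and the helper names open_session and fetch_rows are hypothetical. The seed addresses and keyspace name are the ones quoted on the slides.

```python
# Sketch only: assumes `pip install cassandra-driver` and a reachable cluster.
SEED_NODES = ["192.168.1.1", "192.168.1.2"]  # seed addresses quoted on the slides
KEYSPACE = "mykeyspace"                      # keyspace name quoted on the slides


def open_session(seed_nodes=SEED_NODES, keyspace=KEYSPACE, timeout=5.0):
    """Hypothetical helper: connect through the seed nodes and configure the session."""
    # Imported lazily so this module still loads where the driver is absent.
    from cassandra.cluster import Cluster
    from cassandra.policies import RoundRobinPolicy
    from cassandra.query import dict_factory

    cluster = Cluster(
        contact_points=list(seed_nodes),           # remaining nodes are discovered
        load_balancing_policy=RoundRobinPolicy(),  # set explicitly; defaults vary by version
    )
    session = cluster.connect(keyspace)  # keyspace becomes the session default
    session.default_timeout = timeout    # seconds to wait before a request times out
    session.row_factory = dict_factory   # each row comes back as a dict
    return cluster, session


def fetch_rows(session, query, params=None):
    """Hypothetical helper: issue a query asynchronously, then block for the rows."""
    future = session.execute_async(query, params)
    # Other work could run here while the request is in flight.
    return list(future.result())  # result() blocks until rows arrive or an error is raised
```

A caller would run something like `cluster, session = open_session()`, pass the session to `fetch_rows` with a CQL string, and call `cluster.shutdown()` when finished. In newer driver releases these settings may be expressed through execution profiles instead, so check the API docs for the version you install.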
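
The object-passing caveat that comes with the multiprocessing suggestion can be seen with the standard library alone. The sketch below is an illustration under assumptions, not code from the webinar: lookup is a hypothetical stand-in for a remote lookup, and worker_copy and picklable are hypothetical helpers that mimic what happens to an argument on its way to a worker process.

```python
import pickle
from multiprocessing import Pool


def lookup(record):
    """Hypothetical stand-in for a remote lookup; runs inside a worker process."""
    record["seen"] = True          # mutates only the copy the worker received
    return record["key"].upper()


def worker_copy(obj):
    """Mimic the pickle round trip an argument goes through on its way to a worker."""
    return pickle.loads(pickle.dumps(obj))


def picklable(obj):
    """Return True if obj can be serialized, and so passed to a worker process."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def run_lookups(records, processes=2):
    """Fan the lookups out over a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(lookup, records)


if __name__ == "__main__":
    records = [{"key": "a"}, {"key": "b"}]
    print(run_lookups(records))
    print(records)  # the parent's dicts do not gain a "seen" key
```

Two consequences follow. Changes a worker makes to its arguments never reach the parent, so results must be returned explicitly. Objects that cannot be pickled, such as lambdas or open connections, cannot be handed to a worker, so a resource like a database session would typically be created inside each worker process rather than passed in.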
