Webinar: Getting Started with Apache Cassandra

Would you like to learn how to use Cassandra but don’t know where to begin? Want to get your feet wet but you’re lost in the desert? Longing for a cluster when you don’t even know how to set up a node? Then look no further! Rebecca Mills, Junior Evangelist at Datastax, will guide you in the webinar “Getting Started with Apache Cassandra...”

You'll get an overview of Planet Cassandra’s resources to get you started quickly and easily. Rebecca will take you down the path that's right for you, whether you are a developer or administrator. Join if you are interested in getting Cassandra up and working in the way that suits you best.



Usage Rights

© All Rights Reserved

  • Part of my job is to help make Cassandra more approachable for everyone. A lot of people claim that other databases are faster and easier to get up and running with than Cassandra. I consider it my mission to guide people through the challenges of getting started.
  • Maybe you don’t have loads of free time to spend learning how to use a new database. Sometimes it can be hard navigating your way through tangly docs when you really just want a quick taste of what it’s like to use the database. Today I’m going to give you a brief overview of what it takes, the bare minimum steps, to get up and running with Cassandra. I’m not saying you’ll have your own 100-node cluster going by the end of all this, but at least you’ll have a concept of what it’s like. So sit back, relax, and let’s go.
  • As a Junior Evangelist, I try to create awareness for open source Cassandra. I develop Cassandra-themed content like blog posts, video tutorials, and webinars, and I also have my Twitter account. Part of my job is also to step into the shoes of a ‘newbie’ to try to determine what kind of problems people just being introduced to Cassandra might encounter, which may not be obvious to an expert.
  • So if you haven’t already, head to Planet Cassandra and go to the Downloads section. There you can choose your operating system and the type of DSC download that you want. On the downloads page there are also guides on how to install DSC once you have it.
  • Alright, well, let’s get going with our instance.
  • But before you can fire up your instance, there are a few things that we need to tinker with. Otherwise Cassandra may not work properly, or may not even start up at all! If we were starting up a cluster, this list would get a little longer, as we would have to tell the nodes how to share information. But for now we are just worried about our single instance. The two things we are concerned about are checking our version of Java and making sure we have access to our data files when they get saved.
  • Firstly, you need to make sure you have the latest version of Java, JDK 7, installed on all your nodes.
  • You’re going to want to change the location of the data, commit log, and saved caches directories. If you leave them as default, you’re going to have to run Cassandra as root in order for it to start, which isn’t ideal of course. Probably the easiest way to deal with this problem is to set the save location in your home directory. The location for the saves is configured in the cassandra.yaml file in the conf directory.
  • Instead of using the default directory paths, we’ll change them all to use our home directory. This will guarantee that we have the correct permissions.
  • We’re going to run through this list of 5 things you should be able to do quickly when you start up a Cassandra instance.
  • So assuming you downloaded the tarball, just go to your install location and run cassandra from the bin directory.
  • Once we get our instance started, we can run CQL shell. CQL is the Cassandra Query Language. Syntactically it’s pretty similar to SQL, so it shouldn’t be too hard if you have a relational database background. When you run CQL shell, you’ll get a prompt and then you can start communicating with your database.
  • So keyspaces hold our data in Cassandra. They have tables, which are made up of rows and columns. A row represents a single data entry. Here I’m showing the creation of a keyspace in CQL. Never mind the class and replication factor for now; that’s outside the scope of this webinar. And then I created a “users” table within that keyspace, where I assign each column a name and data type.
  • Next, we can populate the rows in our table using the INSERT command. If I ran these 3 INSERT commands, it would insert 3 rows of user information into the “users” table I made.
  • If we wanted to query our database, a “SELECT * FROM users” would return all the rows from the table. Using a WHERE clause and a specific last name (which we set to be our primary key) would return the user associated with that last name. The PRIMARY KEY (which is also the partition key in this case) identifies the partition on disk where the data is located.
  • These are examples of what an update and delete look like in CQL. As you can see, it’s pretty familiar-looking syntax. It’s just that simple!
  • Two really great tools you can use with Cassandra are OpsCenter and DevCenter.
  • DevCenter is a free tool you can download from the DataStax website. It’s a cool alternative to CQL shell, if you’d prefer a GUI. You can connect to a local server or remote clusters.
  • This is what DevCenter looks like. You can type most of the same commands here as you would in CQL shell. It has almost the same functionality, and has a nice visual interface.
  • In the connection center, you can save a new connection if you intend to use it frequently, instead of reconnecting each time that you use it. Here I’m connecting to an instance on my local machine.
  • I’m running the same commands here as I was in CQL shell earlier, creating that same demo keyspace.
  • Creating that same users table. Notice the nice syntax highlighting. Also notice the schema window in the upper right corner showing all our keyspaces.
  • Insert new records into the database.
  • Then select those records and get a nice table view of the data.
  • OpsCenter is there to help you manage a Cassandra cluster, because managing a lot of machines can be a challenge sometimes.
  • It’s easy to make cluster-wide configuration changes with OpsCenter, instead of digging through configuration files on the command line.
  • You can also diagnose problems with your cluster using OpsCenter. You can set up graphs to track write latency, read latency, hinted handoff, etc. These may give you a good indication of the source of a problem.
  • So what about multi data center? Of course OpsCenter does multi data center! Because it’s Cassandra!
  • You can create a Cassandra instance or cluster in the cloud using the AWS AMI. You spin these up through OpsCenter. In the new cluster section, select the cloud option, which only appears if you’re running OpsCenter on an EC2 instance.
  • Adding a cluster can be done from a single image and configuration file. You give the DataStax credentials sent to you by email, as well as the credentials of each node.
  • You use your own AWS credentials to create a cluster and configure things like security groups on the fly.
  • So DataStax has drivers for Java, Python, C# and C++. There are a lot of other open source drivers though. Check out the Client Drivers section of Planet Cassandra and you’ll probably find one in the language you’re looking for.
  • Connecting to your cluster using Java is really easy. First create a cluster object. Use the builder method to connect to the cluster. That’s it! It’s just that easy.
  • Here is a simple program that will connect to your database. Just a few lines of code and you are ready to insert and select data from Cassandra.
  • Here’s the same situation in Python. I wish I had more to say about this, but it’s essentially the same, very simple. Create a cluster object and use the connect method. That’s it.
  • Once you have a session, you can use the execute method to run CQL commands.
  • So if you’re looking for great resources on Apache Cassandra, you should definitely check out Planet Cassandra. You’ll find everything you need there: webinars, blog posts, use cases, tutorials. While you’re there, check out the Try Cassandra section, which I created all the content for.
  • Try Cassandra has quick 10-minute tutorials for developers and administrators, and some walkthrough videos that I made to help you guys out.
  • Thank you everyone! Are there any questions?

Webinar: Getting Started with Apache Cassandra Presentation Transcript

  • Getting Started with Apache Cassandra • Rebecca Mills, Junior Evangelist, DataStax, @rebccamills • ©2013 DataStax Confidential. Do not
  • Don’t want to spend an exorbitant amount of time and energy learning a new database? • Then you’ve come to the right place! • To learn some important basics of Cassandra without ever having to leave your couch
  • What do I do? • Try to create awareness for open source Cassandra • Develop content to get people interested in trying • Identify problems newcomers might be encountering • Develop strategies and material to help with that
  • Where can you download Cassandra? • The easiest way is to head straight to Planet Cassandra • http://planetcassandra.or • Go to the “Downloads” section, choose your operating system and the version of DSC that you’d like • Get crackin’!
  • Let’s get started
  • 2 things you should do to get going: 1. Check your version of Java 2. Edit your cassandra.yaml file to point your Cassandra instance towards your home directory
  • 1. Check your version of Java • To check what version of Java you are using, at the prompt type % java -version • Be sure to use the latest version (JDK 7) on all nodes
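The Java check on this slide can also be scripted. The sketch below is a hypothetical helper, not something that ships with Cassandra. It assumes the `java -version` banner contains a quoted version string, either in the older style ("1.7.0_55") or the newer style ("11.0.2"), and the function names are made up for illustration.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Old-style strings look like "1.7.0_55": the major version is the second field.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and parse its banner (the JVM prints it on stderr)."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

Calling `installed_java_major()` on each node and comparing the result against the version you expect (7 for the release shown in this deck) is one way to catch a mismatched node before starting Cassandra.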
  • 2. Change default location to save data • Don’t run Cassandra as root • With the default directories, a non-root user will not be able to start Cassandra or access the directories where the data is being saved • Access the cassandra.yaml file through the Cassandra conf directory
  • The 3 lines you should change in the cassandra.yaml file:
        data_file_directories:
            - $HOME/cassandra/data                             # default: /var/lib/cassandra/data
        commitlog_directory: $HOME/cassandra/commitlog         # default: /var/lib/cassandra/commitlog
        saved_caches_directory: $HOME/cassandra/saved_caches   # default: /var/lib/cassandra/saved_caches
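For anyone who would rather script the three cassandra.yaml changes than edit by hand, here is a minimal sketch. It is a plain text rewrite, not a YAML parser, and the function names are hypothetical. It writes an absolute path rather than the literal `$HOME`, on the assumption that cassandra.yaml does not expand shell variables.

```python
import os


def default_base_dir():
    """Return <home>/cassandra, the location suggested on the slide."""
    return os.path.join(os.path.expanduser("~"), "cassandra")


def relocate_cassandra_dirs(yaml_text, base_dir):
    """Point the three storage settings at sub-directories of base_dir."""
    out = []
    in_data_dirs = False  # True while skipping the old data_file_directories list
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("data_file_directories:"):
            in_data_dirs = True
            out.append("data_file_directories:")
            out.append("    - " + base_dir + "/data")
            continue
        if in_data_dirs and stripped.startswith("- "):
            continue  # drop the old list entries
        in_data_dirs = False
        if stripped.startswith("commitlog_directory:"):
            line = "commitlog_directory: " + base_dir + "/commitlog"
        elif stripped.startswith("saved_caches_directory:"):
            line = "saved_caches_directory: " + base_dir + "/saved_caches"
        out.append(line)
    return "\n".join(out) + "\n"
```

A typical use would be to read conf/cassandra.yaml, pass its text and `default_base_dir()` to `relocate_cassandra_dirs`, and write the result back after keeping a copy of the original file. Commented-out settings are left alone because they start with `#`.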
  • 5 things you can do quickly: 1. Start up an instance 2. Create a schema with CQL 3. Inject some data into our instance 4. Run a query against our database 5. Make a change to your data
  • 1. Start up an instance • It’s very simple! Just go to your install location and start it from the bin directory as such:
        $ cd install_location
        $ bin/cassandra
  • 2. Create a schema with CQL • From within your installation directory, start up your CQL shell from the bin directory:
        $ cd install_directory
        $ bin/cqlsh
    • You should see the cqlsh command prompt as such:
        Connected to Test Cluster at localhost:9160.
        [cqlsh 4.1.1 | Cassandra 2.0.8 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
        Use HELP for help.
        cqlsh>
  • 2. Create a schema with CQL • A keyspace is a container for our data. Here we are creating a demo keyspace and a users table within. A table consists of rows and columns.
        CREATE KEYSPACE demo WITH REPLICATION = {'class':'SimpleStrategy','replication_factor':1};
        USE demo;
        CREATE TABLE users (
            firstname text,
            lastname text,
            age int,
            email text,
            city text,
            PRIMARY KEY (lastname)
        );
  • 3. Inject some data into your instance • Nothing sadder than an empty database. Here we are populating our “users” table with rows of data using the INSERT command.
        INSERT INTO users (firstname, lastname, age, email, city)
            VALUES ('John', 'Smith', 46, 'johnsmith@email.com', 'Sacramento');
        INSERT INTO users (firstname, lastname, age, email, city)
            VALUES ('Jane', 'Doe', 36, 'janedoe@email.com', 'Beverly Hills');
        INSERT INTO users (firstname, lastname, age, email, city)
            VALUES ('Rob', 'Byrne', 24, 'robbyrne@email.com', 'San Diego');
  • 4. Make a query against your database
        SELECT * FROM users;
         lastname | age | city          | email               | firstname
        ----------+-----+---------------+---------------------+-----------
         Doe      |  36 | Beverly Hills | janedoe@email.com   | Jane
         Byrne    |  24 | San Diego     | robbyrne@email.com  | Rob
         Smith    |  46 | Sacramento    | johnsmith@email.com | John
        SELECT * FROM users WHERE lastname='Doe';
         lastname | age | city          | email             | firstname
        ----------+-----+---------------+-------------------+-----------
         Doe      |  36 | Beverly Hills | janedoe@email.com | Jane
  • 5. Make a change to your data
        UPDATE users SET city='San Jose' WHERE lastname='Doe';
        SELECT * FROM users WHERE lastname='Doe';
         lastname | age | city     | email             | firstname
        ----------+-----+----------+-------------------+-----------
         Doe      |  36 | San Jose | janedoe@email.com | Jane
        DELETE FROM users WHERE lastname='Doe';
        SELECT * FROM users;
         lastname | age | city          | email               | firstname
        ----------+-----+---------------+---------------------+-----------
         Byrne    |  24 | San Diego     | robbyrne@email.com  | Rob
         Smith    |  46 | Sacramento    | johnsmith@email.com | John
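The insert, update and delete statements from slides 3 and 5 can also be issued from application code with bound parameters instead of values pasted into the CQL text. The sketch below assumes a session object from the DataStax Python driver, whose `execute` method accepts a query string with `%s` placeholders plus a tuple of values. The helper names are hypothetical, and the `demo.users` table is the one created on slide 2.

```python
# CQL templates with %s placeholders; the driver binds the values for us.
INSERT_USER = (
    "INSERT INTO demo.users (firstname, lastname, age, email, city) "
    "VALUES (%s, %s, %s, %s, %s)"
)
UPDATE_CITY = "UPDATE demo.users SET city = %s WHERE lastname = %s"
DELETE_USER = "DELETE FROM demo.users WHERE lastname = %s"


def insert_users(session, users):
    """Insert (firstname, lastname, age, email, city) tuples; return the count."""
    for user in users:
        session.execute(INSERT_USER, tuple(user))
    return len(users)


def update_city(session, lastname, city):
    """Change the city of the row whose primary key is lastname."""
    session.execute(UPDATE_CITY, (city, lastname))


def delete_user(session, lastname):
    """Delete the row whose primary key is lastname."""
    session.execute(DELETE_USER, (lastname,))
```

Because the helpers only rely on `session.execute`, they can be exercised against a stand-in object in tests and against a real session when a node is running.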
  • Two really neat tools: 1. OpsCenter 2. DevCenter
  • DevCenter • Try out your CQL in an easy-to-use tool • Has most of the same functionality as cqlsh with a few exceptions • Quickly connect to your cluster and keyspace. GO!
  • OpsCenter • OpsCenter makes it easy to manage and configure your cluster!
  • Change configurations • Just a couple clicks and you can reconfigure an entire cluster.
  • Metrics • Diagnose problems with your cluster
  • How about multi datacenter? Of course!
  • You can run an AWS AMI from OpsCenter! • Run a Cassandra instance/cluster in the cloud! • Using Amazon Web Services EC2 Management Console • Quickly deploy a Cassandra cluster within a single availability zone through OpsCenter • Check out http://www.datastax.com/documentation/cassa
  • What about the drivers? • DataStax provides drivers for Java, Python, C#, and C++ • There are also many open source community drivers, including Clojure, Go, Node.js and many, many more.
  • Connect to your instance with Java
    • Create a new Java class, com.example.cassandra.SimpleClient for example
    • Add an instance field to hold the cluster reference
        private Cluster cluster;
    • Add an instance method, connect, to your new class. Here you can add your contact point, the IP address of your node.
        public void connect(String node) {
            cluster = Cluster.builder()
                    .addContactPoint(node)
                    .build();
        }
    • Add an instance method, close, to shut down the cluster once you are finished
  • Connect to your instance with Java
    • In your main method, create a SimpleClient object, call connect, and close it
        public static void main(String[] args) {
            SimpleClient client = new SimpleClient();
            client.connect("<ip_address>");
            client.close();
        }
    • Select some data (using a session obtained from cluster.connect())
        session.execute("SELECT * FROM demo.users");
  • Connect to your instance in Python
        from cassandra.cluster import Cluster
        cluster = Cluster()
    • This will attempt to connect to a cluster on your local machine. You could also give it a list of IP addresses and it will connect to those.
        cluster = Cluster(['<ip_address>'])
    • To connect to a node and begin actually running queries against our instance, we need a session, which is created by calling Cluster.connect()
        cluster = Cluster()
        session = cluster.connect()
    • You can even connect to a particular keyspace
        cluster = Cluster()
        session = cluster.connect('demo')
  • Connect to your instance in Python
    • Select some data
        results = session.execute("""SELECT * FROM demo.users""")
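Putting the Python pieces together, a minimal end-to-end sketch might look like the following. It assumes the DataStax Python driver is installed (`pip install cassandra-driver`), a node is listening on 127.0.0.1, and the `demo.users` table from the earlier slides exists. The function names are hypothetical, and reading columns as attributes (`row.firstname`) relies on the driver's default row format.

```python
def open_session(contact_points=("127.0.0.1",), keyspace="demo"):
    """Connect with the DataStax Python driver and return (cluster, session)."""
    # Imported here so the formatting helper below can be used and tested
    # on a machine that does not have the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(list(contact_points))
    session = cluster.connect(keyspace)
    return cluster, session


def describe_users(rows):
    """Format result rows; assumes each row exposes its columns as attributes."""
    return ["{} {} ({})".format(r.firstname, r.lastname, r.city) for r in rows]


def main():
    """Print every user in demo.users, then release the connection."""
    cluster, session = open_session()
    try:
        for line in describe_users(session.execute("SELECT * FROM demo.users")):
            print(line)
    finally:
        cluster.shutdown()
```

`main()` is deliberately not called here, since it needs a running node. Shutting the cluster down in a `finally` block mirrors the close step from the Java example.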
  • Get familiar • Visit http://planetcassandra.org • Your #1 destination for NoSQL Apache Cassandra resources • Downloads, webinars, presentations, blog posts, and much, much more!
  • Try Cassandra
  • Thank you!! Any Questions?