Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Transcript

  • 1. Cassandra: Say Goodbye to the Relational Database. Twin Cities PHP User Group, May 6, 2010. Chris Barber, CB1, INC. http://www.cb1inc.com/
  • 2. About Me
      ● Chris Barber
      ● Open source hacker
      ● Software consultant
      ● JavaScript, C++, PHP
      ● http://www.cb1inc.com/
      ● http://twitter.com/cb1inc
      ● http://twitter.com/cb1kenobi
      ● http://slideshare.net/cb1kenobi
    Minnesota PHP User Group | 5.6.2010 | Chris Barber | CB1, INC. | http://www.cb1inc.com/
  • 3. What is Cassandra?
  • 4. A highly scalable, eventually consistent, distributed, structured key-value store.
  • 5. About Cassandra
      ● Started by Facebook
      ● Open source
      ● Apache project
      ● Apache License 2.0
      ● Written in Java
      ● Multi-platform
      ● Current version: 0.6.1
      ● http://cassandra.apache.org/
  • 6. Who's using Cassandra?
  • 7. [Image-only slide; no text in the transcript]
  • 8. Cassandra Internals
  • 9. Cassandra Overview
      ● Like a big hash table of hash tables
      ● Column database (schemaless)
      ● Highly scalable
      ● Add nodes in minutes
      ● Fault tolerant
      ● Distributed
      ● Tunable
  • 10. Dynamo + BigTable = Cassandra
      ● Amazon Dynamo
          ● Cluster management
          ● Replication
          ● Fault tolerance
      ● Google BigTable
          ● Sparse
          ● Columnar data model
          ● Storage architecture
  • 11. Pros & Cons
      ● Pros
          ● Easy to scale
          ● No single point of failure
          ● High write throughput
          ● Handles lots of data
          ● Durable
          ● No more SQL injection
      ● Cons
          ● No joins
          ● Index & sort keys only
          ● Not good for large blobs
          ● Rows must fit in memory
          ● Built on Thrift
  • 12. CAP Theorem
      ● CAP Theorem
          ● Consistency
          ● Availability
          ● Partition tolerance
      ● You can only have 2
      ● Cassandra favors availability and partition tolerance
          ● Eventually consistent
              – The consistency level can be chosen on a per-request basis
  • 13. Consistency
      ● Specified for each operation
          ● Zero
          ● One
          ● Quorum (a majority of replicas: N/2 + 1, rounded down, where N is the replication factor)
          ● All
          ● Any
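To make the quorum level concrete, here is a minimal Python sketch. It assumes the usual majority definition (floor(N/2) + 1 for replication factor N); the function names are hypothetical and are not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority for the given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set of r replicas overlaps every write set of w replicas.

    n is the replication factor, r the replicas consulted per read,
    w the replicas acknowledged per write.
    """
    return r + w > n


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"N={n}: quorum={quorum(n)}")
```

With N = 3, quorum reads plus quorum writes touch 2 + 2 = 4 > 3 replicas, so the two sets always overlap; ONE reads plus ONE writes (1 + 1 = 2) do not, which is where eventual consistency comes in.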
  • 14. Replication Ring
      ● Ring of servers
      ● Talk to each other using "gossip"
      ● Data distributed between nodes
      ● Uses "tokens" to partition data
          ● Must be unique per node
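The token ring can be sketched in a few lines of Python. This is an illustrative model only, not Cassandra's implementation. It assumes each node owns the range ending at its own token, and that extra replicas go to the next nodes clockwise on the ring (the rack-unaware behaviour). The class and method names are hypothetical.

```python
import bisect
from typing import Dict, List


class TokenRing:
    """Toy model of a ring of nodes, each identified by a unique token."""

    def __init__(self, node_tokens: Dict[str, int]) -> None:
        if len(set(node_tokens.values())) != len(node_tokens):
            raise ValueError("tokens must be unique per node")
        # Sort nodes by token so the ring can be searched with bisect.
        pairs = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in pairs]
        self._nodes = [node for _, node in pairs]

    def node_for(self, key_token: int) -> str:
        """First node whose token is >= key_token, wrapping around the ring."""
        i = bisect.bisect_left(self._tokens, key_token)
        return self._nodes[i % len(self._nodes)]

    def replicas_for(self, key_token: int, replication_factor: int) -> List[str]:
        """Primary node plus the next nodes clockwise, without duplicates."""
        start = bisect.bisect_left(self._tokens, key_token)
        count = min(replication_factor, len(self._nodes))
        return [self._nodes[(start + k) % len(self._nodes)] for k in range(count)]


if __name__ == "__main__":
    ring = TokenRing({"node-a": 0, "node-b": 100, "node-c": 200})
    print(ring.node_for(50))
    print(ring.replicas_for(150, 2))
```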
  • 15. Partitioning
      ● RandomPartitioner
          ● Inefficient range queries
          ● Doesn't sort properly
      ● OrderPreservingPartitioner
          ● Can cause unevenly distributed data
          ● Stores data sorted
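The trade-off between the two partitioners can be illustrated with a short Python sketch. It assumes a hash-based partitioner derives the token from an MD5 digest of the key, while an order-preserving one uses the key itself. The helper names and sample keys are made up for illustration.

```python
import hashlib
from typing import List


def random_token(key: str) -> int:
    """Token derived from an MD5 digest: spreads keys evenly but loses key order."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_preserving_token(key: str) -> str:
    """Token is the key itself: keeps keys sorted, so range scans are contiguous."""
    return key


def ring_order(keys: List[str], token_fn) -> List[str]:
    """Order in which keys would be laid out around the ring for a given partitioner."""
    return sorted(keys, key=token_fn)


if __name__ == "__main__":
    keys = ["user:alice", "user:bob", "user:carol", "user:dave"]
    print("hash-based:      ", ring_order(keys, random_token))
    print("order-preserving:", ring_order(keys, order_preserving_token))
```

With the order-preserving function, neighbouring keys stay next to each other on the ring, which is what makes range queries cheap. It is also why a skewed key distribution can overload a few nodes.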
  • 16. Replica Placement Strategy
      ● Rack-unaware
          ● Default
      ● Rack-aware
          ● Place one replica in a different datacenter, and the others on different racks in the same one
  • 17. Data Model
      ● Keyspace
      ● Column Family (standard or super)
      ● Columns & Super Columns
      ● Keys and column names

        Keyspace1: {
          users: {
            "cb1kenobi": {
              "FirstName": "chris",
              "LastName": "barber"
            }
          }
        }
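The "hash table of hash tables" shape can be mimicked with nested Python dictionaries. This is a simplified sketch: it ignores column timestamps, super columns, and the fact that column families are declared in the server configuration. The `insert` and `get` helpers are hypothetical and only echo the shape of the client calls.

```python
from typing import Dict

Row = Dict[str, str]             # column name -> value
ColumnFamily = Dict[str, Row]    # row key -> row
Keyspace = Dict[str, ColumnFamily]  # column family name -> column family

keyspace1: Keyspace = {
    "users": {
        "cb1kenobi": {"FirstName": "chris", "LastName": "barber"},
    }
}


def insert(keyspace: Keyspace, column_family: str, key: str, column: str, value: str) -> None:
    """Set one column, creating the row on first write (rows are schemaless)."""
    keyspace.setdefault(column_family, {}).setdefault(key, {})[column] = value


def get(keyspace: Keyspace, column_family: str, key: str, column: str) -> str:
    """Look up a single column by keyspace -> column family -> row key -> column name."""
    return keyspace[column_family][key][column]


if __name__ == "__main__":
    insert(keyspace1, "users", "cb1kenobi", "Twitter", "cb1kenobi")
    print(get(keyspace1, "users", "cb1kenobi", "FirstName"))
    print(keyspace1["users"]["cb1kenobi"])
```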
  • 18. Installing & Deploying Cassandra
  • 19. Getting Cassandra
      ● http://cassandra.apache.org/download/
      ● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-bin.tar.gz
      ● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz
      ● svn checkout https://svn.apache.org/repos/asf/cassandra/trunk cassandra
      ● git clone git://git.apache.org/cassandra.git
  • 21. Installing Cassandra

        # become root, then download and unpack the source release
        su
        cd /usr/local
        wget http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz
        tar xzf apache-cassandra-0.6.1-src.tar.gz
        # create the log and data directories
        mkdir -p /var/log/cassandra
        chown -R `whoami` /var/log/cassandra
        mkdir -p /var/lib/cassandra
        chown -R `whoami` /var/lib/cassandra
        # build with ant
        cd apache-cassandra-0.6.1-src
        ant
  • 22. Configuration
      ● Main config file
          ● conf/storage-conf.xml
      ● Keyspaces
      ● Partitioner
      ● AutoBootstrap
      ● Authentication method
      ● Buffer sizes
      ● Timeouts
  • 23. Automatically Start Cassandra

        # create the cassandra group and user
        # (useradd -G fails if the group does not exist yet)
        groupadd cassandra
        useradd -g cassandra cassandra
        <editor of choice> /etc/init.d/cassandra
        # paste contents of next slide
        chmod +x /etc/init.d/cassandra
        # Ubuntu/Debian method:
        update-rc.d -f cassandra defaults
        # Red Hat/Fedora method: use chkconfig
  • 24. Automatically Start Cassandra

        #!/bin/bash
        # NOTE: JAVA_HOME normally points at the Java install directory,
        # not the java binary; adjust for your system.
        export JAVA_HOME=/usr/bin/java
        export CASSANDRA_HOME=/usr/local/apache-cassandra-0.6.1-src
        export CASSANDRA_INCLUDE=$CASSANDRA_HOME/bin/cassandra.in.sh
        export CASSANDRA_CONF=$CASSANDRA_HOME/conf/storage-conf.xml
        export CASSANDRA_OWNR=cassandra
        export PATH=$PATH:$CASSANDRA_HOME/bin
        log_file=/var/log/cassandra/stdout
        pid_file=/var/run/cassandra/pid_file

        # bail out if Cassandra is not installed where expected
        if [ ! -f $CASSANDRA_HOME/bin/cassandra -o ! -d $CASSANDRA_HOME ]
        then
            echo "Cassandra startup: cannot start"
            exit 1
        fi

        # make sure the pid directory exists and is owned by the cassandra user
        mkdir -p /var/run/cassandra
        chown cassandra:cassandra /var/run/cassandra

        case "$1" in
            start)
                # Cassandra startup
                echo -n "Starting Cassandra: "
                su $CASSANDRA_OWNR -c "$CASSANDRA_HOME/bin/cassandra -p $pid_file" > $log_file 2>&1
                echo "OK"
                ;;
            stop)
                # Cassandra shutdown
                echo -n "Shutdown Cassandra: "
                su $CASSANDRA_OWNR -c "kill `cat $pid_file`"
                echo "OK"
                ;;
            reload|restart)
                $0 stop
                $0 start
                ;;
            status)
                ;;
            *)
                echo "Usage: `basename $0` start|stop|restart|reload"
                exit 1
        esac
        exit 0
  • 25. Running Cassandra
      ● Manually start
          ● bin/cassandra -f
      ● Command line app
          ● bin/cassandra-cli --host localhost --port 9160
      ● Nodetool
          ● bin/nodetool -h localhost info

            20146078924586773365182178806181105130
            Load             : 274.66 KB
            Generation No    : 1273183803
            Uptime (seconds) : 121
            Heap Memory (MB) : 51.84 / 1023.88

          ● Many more commands: ring, cleanup, cfstats, etc.
  • 26. PHP Clients
  • 27. PHP Clients
      ● Thrift
      ● Pandra (LGPL)
      ● PHP Cassa – pycassa port
      ● Simple Cassie (New BSD License)
      ● Prophet (PHP License)
  • 28. PHP Thrift Client
      ● Thrift files
          ● Thrift.php
          ● protocol/TBinaryProtocol.php
          ● protocol/TProtocol.php
          ● transport/TBufferedTransport.php
          ● transport/TFramedTransport.php
          ● transport/THttpClient.php
          ● transport/TMemoryBuffer.php
          ● transport/TNullTransport.php
          ● transport/TPhpStream.php
          ● transport/TSocket.php
          ● transport/TSocketPool.php
          ● transport/TTransport.php
      ● Thrift generated PHP files
          ● thrift --gen php cassandra.thrift
              – cassandra_constants.php
              – Cassandra.php
              – cassandra_types.php
      ● Use thrift_protocol native PHP extension
  • 29. PHP Thrift Client Example

        <?php
        // load the Thrift runtime and the generated Cassandra bindings
        $GLOBALS['THRIFT_ROOT'] = './thrift';
        require $GLOBALS['THRIFT_ROOT'] . '/Thrift.php';
        require $GLOBALS['THRIFT_ROOT'] . '/transport/TSocket.php';
        require $GLOBALS['THRIFT_ROOT'] . '/transport/TBufferedTransport.php';
        require $GLOBALS['THRIFT_ROOT'] . '/protocol/TBinaryProtocol.php';
        require $GLOBALS['THRIFT_ROOT'] . '/packages/cassandra/Cassandra.php';

        // connect to the local node on the Thrift port
        $socket = new TSocket('127.0.0.1', 9160);
        $transport = new TBufferedTransport($socket, 1024, 1024);
        $protocol = new TBinaryProtocolAccelerated($transport);
        $client = new CassandraClient($protocol);
        $transport->open();

        // address the "firstname" column in the Standard1 column family
        $columnPath = new cassandra_ColumnPath();
        $columnPath->column_family = 'Standard1';
        $columnPath->super_column = null;
        $columnPath->column = 'firstname';

        // write the column, then read it back
        $client->insert('Keyspace1', 'mykey', $columnPath, 'Chris', time(), cassandra_ConsistencyLevel::ONE);
        $name = $client->get('Keyspace1', 'mykey', $columnPath, cassandra_ConsistencyLevel::ONE);
        var_dump($name);

        $transport->close();
  • 30. Prophet PHP Extension
      ● C++ PHP Extension
      ● Built on top of Thrift C library
      ● Very, very, very far from usable/working/complete
      ● Goals
          ● Speed!
          ● Full API support
          ● CRUD/ORM magic
          ● Serialization helper
      ● Developed for PHP 5.3, Linux, non-threaded (i.e. FastCGI)
      ● http://github.com/cb1kenobi/prophet
  • 31. Cassandra Roadmap
  • 32. Roadmap 0.7 & Beyond
      ● SSTable compression
      ● Live keyspace & column family changes
      ● Vector clock support
      ● Truncate support
      ● Range delete
      ● byte[] keys
      ● Memory efficient compactions
      ● Apache Avro
      ● Multi-tenant support
    * Taken from other presentations
  • 33. Resources
      ● Cassandra Wiki
          ● http://wiki.apache.org/cassandra/
      ● IRC
          ● #cassandra on irc.freenode.net
      ● Cassandra Users Mailing List
          ● user-subscribe@cassandra.apache.org
      ● Follow people on Twitter
          ● @cassandra
          ● @jericevans
          ● @spyced
          ● @riptano
          ● @b6n
  • 34. Getting Help
      ● CB1, INC
      ● http://www.cb1inc.com/
      ● Web Applications
      ● Open Source Solutions
  • 35. Thanks! Questions?
      ● http://www.cb1inc.com/
      ● http://twitter.com/cb1inc
      ● http://slideshare.net/cb1kenobi
      ● http://twitter.com/cb1kenobi