Cassandra and Rails at LA NoSQL Meetup
This presentation introduces people to Cassandra and Column Family Datastores in general. I will discuss what Cassandra is, how and when it is useful, and how it integrates with Rails. I will also go into lessons learned during our 3-month project, and the useful patterns that emerged. The discussion will be very technical, but targeted at developers who are not familiar with Cassandra or have not done a project with it.

Published in: Technology
  • Transcript

    • 1. Cassandra and Rails: What we learned on our first project
      • Mike Wynholds, Carbon Five, @mwynholds
    • 2. Agenda
      • Introduction
      • What is Cassandra?
      • Cassandra and Rails
      • Emergent data patterns
      • Deployment
    • 3. What is Cassandra?
      • Column Family database
      • Distributed design - “eventually consistent”
      • Open sourced by Facebook, now Apache
      • Used by Facebook, Twitter, Digg, Rackspace...
      • Largest cluster: 100 TB, 150 nodes
      • Still very immature - at version 0.6.2
    • 4. Relational vs Column Family
        Relational              Column Family
        Schema-ful              Schema-less (mostly)
        Row-based               Column-based
        Robust SQL queries      No query language
        Transactional           Eventually consistent *
        ODBC/JDBC               Thrift *
        Fast - 300/350 ms       Blazing - 0.12/15 ms **
        * Cassandra-specific
        ** MySQL vs Cassandra, > 50GB data, write/read
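    • 5. Relational vs Column Based
      • Relational: Database containing Table 1 (col1, col2, col3, col4, col5), Table 2, Table 3
      • Column Based: Keyspace containing Column Families and Super Column Families
      • Column Family rows:
        key1: col1:val col3:val
        key2: col1:val1 col2:val col3:val
        key3: col3:val
      • Super Column Family rows: each key (key1, key2) holds super columns (col1, col2), and each super column holds its own col:val pairs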
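One way to picture the mapping on slide 5 is as nested hashes. The sketch below is plain Ruby with no Cassandra client involved; the constant `KEYSPACE`, the helper `fetch_column`, and the family names `:Posts` and `:PostRelationships` are illustrative assumptions, not part of the talk's diagram.

```ruby
# Illustrative only: a keyspace modelled as nested Ruby hashes.
# Column family:       row key => { column name => value }
# Super column family: row key => { super column name => { column name => value } }
KEYSPACE = {
  :Posts => {                          # column family (roughly a table)
    "key1" => { "col1" => "val", "col3" => "val" },
    "key2" => { "col1" => "val1", "col2" => "val", "col3" => "val" },
    "key3" => { "col3" => "val" }      # rows need not share the same columns
  },
  :PostRelationships => {              # super column family: one more level of nesting
    "key1" => {
      "col1" => { "col1" => "val" },
      "col2" => { "col1" => "val", "col2" => "val" }
    },
    "key2" => {}
  }
}

# Reading a value means walking the path: family -> row key -> column names.
# Returns nil when any step of the path is missing.
def fetch_column(keyspace, family, key, *path)
  row = keyspace.fetch(family).fetch(key, {})
  path.inject(row) { |node, name| node && node[name] }
end

puts fetch_column(KEYSPACE, :Posts, "key2", "col1")
puts fetch_column(KEYSPACE, :PostRelationships, "key1", "col2", "col2")
puts fetch_column(KEYSPACE, :Posts, "key3", "col1").inspect
```

Note that "key3" simply has no col1 entry; unlike a relational row, there is no NULL placeholder stored for it.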
    • 6. What about ORM?
      • Cassandra is NOT a relational database
      • No ActiveRecord support (currently)
      • No Hibernate support (currently)
      • OCFM?
      • Lots of room for jars/gems
    • 7. An example in Rails
      • bitchroom - a place to bitch and whine
      • Twitter-like features - user post timelines
      • Digg-like features - up/down, fav, reply
    • 8. Custom Rails config
        -- /config/cassandra.yml --
        development:
          servers: "127.0.0.1:9160"
          keyspace: "bitchroom_development"
          timeout: 3
          retries: 2

        -- /initializers/initialize_app.rb --
        require 'cassandra'
        env = ENV['RAILS_ENV'] || 'development'
        # Double quotes are required so that #{RAILS_ROOT} is interpolated.
        cfg = YAML.load_file("#{RAILS_ROOT}/config/cassandra.yml")[env]
        thrift_options = { :timeout => cfg['timeout'], :retries => cfg['retries'] }
        $cassandra = Cassandra.new(cfg['keyspace'], cfg['servers'], thrift_options)
        $cassandra.disable_node_auto_discovery!
    • 9. Cassandra API (version 0.6.x only)
        get(keyspace, key, column_path, level)
        get_slice(keyspace, key, column_parent, predicate, level)
        multiget_slice(keyspace, keys, column_parent, predicate, level)
        get_count(keyspace, key, column_parent, level)
        get_range_slices(keyspace, column_parent, predicate, range, level)
        insert(keyspace, key, column_path, value, timestamp, level)
        batch_mutate(keyspace, mutation_map, level)
        remove(keyspace, key, column_path, timestamp, level)
        describe_*(...)
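    • 10. Sample save
        def save
          # A time-based UUID doubles as the post id and the timeline column name.
          uuid = SimpleUUID::UUID.new(Time.now)
          @id = uuid.to_guid
          post_hash = self.to_cassandra_hash
          cassandra.insert(:Posts, @id, post_hash)
          # Each timeline row stores a pointer: TimeUUID column name => post id.
          pointer = {uuid => @id}
          cassandra.insert(:Timelines, "main", pointer)
          cassandra.insert(:Timelines, "user-#{@author_id}", pointer)
          return self
        end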
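The save on slide 10 writes three times: once for the post itself and once for each timeline that should list it. A minimal sketch of that fan-out follows, runnable without a cluster. `FakeStore` is a hypothetical in-memory stand-in for the client, `TimelinePost` is a hypothetical model, and the zero-padded timestamp string is only an assumed substitute for a real TimeUUID.

```ruby
# Hypothetical in-memory stand-in for the Cassandra client. It only mimics
# insert(column_family, key, columns) by merging columns into a row.
class FakeStore
  attr_reader :data

  def initialize
    @data = Hash.new do |families, cf|
      families[cf] = Hash.new { |rows, key| rows[key] = {} }
    end
  end

  def insert(column_family, key, columns)
    @data[column_family][key].merge!(columns)
  end
end

# Hypothetical model following the write pattern from the slide.
class TimelinePost
  attr_reader :id, :author_id, :body

  def initialize(store, author_id, body)
    @store, @author_id, @body = store, author_id, body
  end

  def to_cassandra_hash
    { "author_id" => @author_id.to_s, "body" => @body }
  end

  def save(now = Time.now)
    # Stand-in for a TimeUUID: a zero-padded microsecond timestamp sorts
    # chronologically as a string. A real TimeUUID is also unique per write.
    time_id = format("%020d", (now.to_f * 1_000_000).to_i)
    @id = "post-#{time_id}"
    @store.insert(:Posts, @id, to_cassandra_hash)             # 1. the object itself
    pointer = { time_id => @id }
    @store.insert(:Timelines, "main", pointer)                # 2. global timeline
    @store.insert(:Timelines, "user-#{@author_id}", pointer)  # 3. author's timeline
    self
  end
end

store = FakeStore.new
TimelinePost.new(store, 42, "hello").save
p store.data[:Timelines].keys
```

The extra writes are the price of having no query language: every view you want to read later has to be written explicitly at save time.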
    • 11. Emergent data patterns
      • Simple object map
      • Object relationship map
      • Timeline <-- this one is the key
    • 12. Simple Object Map
      • Row id = object id (primary key)
      • Attribute column names
      • String column values
        ids = [ 1, 2, 3 ]
        posts = cassandra.multi_get(:Posts, ids, { })
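The simple object map can be sketched as a pair of conversions between an object and a row of string columns. The class `PostRecord`, its attribute list, and `from_cassandra` are hypothetical names; the talk only shows a `to_cassandra_hash` call without its body, so this is one plausible shape rather than the author's implementation.

```ruby
# Hypothetical model: each attribute becomes one column whose name is the
# attribute name and whose value is a string.
class PostRecord
  ATTRIBUTES = [:author_id, :title, :body].freeze
  attr_accessor :id, *ATTRIBUTES

  def initialize(id, attrs = {})
    @id = id
    ATTRIBUTES.each { |name| send("#{name}=", attrs[name]) }
  end

  # Object -> row: nil attributes are skipped, so no column is stored for them.
  def to_cassandra_hash
    ATTRIBUTES.each_with_object({}) do |name, columns|
      value = send(name)
      columns[name.to_s] = value.to_s unless value.nil?
    end
  end

  # Row -> object: the row key is the object id, the columns are attributes.
  def self.from_cassandra(id, columns)
    attrs = ATTRIBUTES.each_with_object({}) { |name, h| h[name] = columns[name.to_s] }
    new(id, attrs)
  end
end

# Assumed shape of a multi_get-style result: { row key => { column name => value } }.
rows = { "1" => { "title" => "First", "author_id" => "7" } }
posts = rows.map { |id, columns| PostRecord.from_cassandra(id, columns) }
puts posts.first.title
```

Because every value is stored as a string, type information is lost on the way in: an integer author_id comes back as "7" and has to be converted by the model if it matters.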
    • 13. Object Relationship Map
      • Row id = object id (primary key)
      • Relationship attribute column names
      • External ID super-column values
        ids = [ 1, 2, 3 ]
        prs = cassandra.multi_get(:PostRelationships, ids, { })
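A row in the relationship map might look like the nested hash below, with one super column per relationship and the related objects' ids as its values. This is a sketch under assumptions: the slide does not say what the sub-column names are, so a sortable sequence string is used here, and the relationship names ("replies", "favorited_by") and both helper functions are invented for illustration.

```ruby
# Shape of one row in a hypothetical :PostRelationships super column family:
#   row key (post id) => { relationship name => { sub-column name => external id } }
# Assumption: sub-column names are zero-padded sequence strings so they sort
# in insertion order.
def add_relationship(rows, post_id, relationship, external_id)
  row = (rows[post_id] ||= {})
  related = (row[relationship] ||= {})
  related[format("%010d", related.size + 1)] = external_id
  rows
end

# Returns the external ids for one relationship, ordered by sub-column name.
def related_ids(rows, post_id, relationship)
  related = rows.fetch(post_id, {}).fetch(relationship, {})
  related.sort.map { |_name, external_id| external_id }
end

rows = {}
add_relationship(rows, "1", "replies", "17")
add_relationship(rows, "1", "replies", "23")
add_relationship(rows, "1", "favorited_by", "user-5")

p related_ids(rows, "1", "replies")
p related_ids(rows, "2", "replies")
```

Keeping relationships in a separate row from the object's own attributes means a post can be loaded without dragging along every reply and favorite pointer.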
    • 14. Timeline
      • TimeUUID column names
      • External ID column values
        result = []
        options = { :count => 20, :reverse => true }
        timelines = [ 'user-1', 'user-2', 'user-3' ]
        multi_get_result = cassandra.multi_get(:Timelines, timelines, options)
        multi_get_result.values.each do |timeline|
          timeline.each { |uuid, id| result << [uuid, id] }
        end
        # Newest first across all timelines, trimmed to the requested count.
        result = result.sort { |a, b| b[0] <=> a[0] }.slice(0, options[:count])
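The merge step of the timeline pattern can be tried without a cluster. In the sketch below, `merge_timelines` is a hypothetical helper and the short strings stand in for TimeUUID column names; the input hash imitates the shape that the slide's `multi_get` result is iterated as.

```ruby
# Merge the newest `count` entries across several timeline rows.
# `rows` is assumed to look like:
#   { timeline key => { time-ordered column name => post id } }
def merge_timelines(rows, count)
  entries = rows.values.flat_map(&:to_a)   # [[time_id, post_id], ...]
  entries.sort_by { |time_id, _post_id| time_id }.reverse.first(count)
end

rows = {
  "user-1" => { "0003" => "post-c", "0001" => "post-a" },
  "user-2" => { "0004" => "post-d" },
  "user-3" => { "0002" => "post-b" }
}

merge_timelines(rows, 3).each do |time_id, post_id|
  puts "#{time_id} #{post_id}"
end
```

Asking each row for only its newest `count` columns is sufficient, because the merged top `count` can never contain more than `count` entries from a single row. The post ids in the result can then be loaded in one further multi_get against the object map.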
    • 15. Deployment
      • Load balancing + failover
      • [Diagram: haproxy (:80), nginx (:81), mongrel (:5000), and cassandra nodes (:9160)]
    • 16. Resources
      • http://cassandra.apache.org/
      • http://wiki.apache.org/cassandra/API
      • http://github.com/fauna/cassandra
      • http://github.com/ryanking/simple_uuid
