Persistence Smoothie: Blending SQL and NoSQL
Michael Bleigh, Intridea, Inc.
Thursday, March 11, 2010
Talk given at Confoo 2010 in Montreal describing how to integrate NoSQL systems with SQL in Ruby applications.

Persistence Smoothie

  1. Persistence Smoothie: Blending SQL and NoSQL. Michael Bleigh, Intridea, Inc. Photo by Nikki L. via Flickr. Thursday, March 11, 2010
  2. (no text)
  3. (no text)
  4. present.ly
  5. tweetstream, hashie, acts-as-taggable-on, subdomain-fu, seed-fu, mustache_json. github.com/intridea
  6. @mbleigh
  7. You’ve (probably) heard a lot about NoSQL
  8. NoSQL is a new way to think about persistence
  9. Atomicity, Consistency, Isolation, Durability
  10. Denormalization, Eventual Consistency, Schema-Free, Horizontal Scale
  11. NoSQL tries to scale (more) simply
  12. NoSQL is going mainstream
  13. New York Times, Business Insider, BBC, ShopWiki, GitHub, Meebo, Disqus, SourceForge, Sony, Digg
  14. ...but not THAT mainstream.
  15. A word of caution...
  16. NoSQL can divide by zero
  17. NoSQL doesn’t sleep, it waits. NoSQL can divide by zero. NoSQL counted to infinity, twice.
  18. NoSQL is a (growing) collection of tools, not a new way of life
  19. Key-Value Stores, Document Databases, Column Stores, Graph Databases
  20. Key-Value Stores
  21. Redis
      • Key-value store + datatypes
      • Lists, (Scored) Sets, Hashes
      • Cache-like functions (expiration)
      • (Mostly) In-Memory
  22. Riak
      • Combo key-value store and document database
      • HTTP REST interface
      • “Link walking”
      • Map-Reduce
  23. Map/Reduce
      • Massively parallel way to process large datasets
      • First you scour data and “map” a new set of data
      • Then you “reduce” the data down to a salient result
  24. map = function() {
        this.tags.forEach(function(tag) {
          emit(tag, {count: 1});
        });
      }

      reduce = function(key, values) {
        var total = 0;
        for (var i = 0; i < values.length; i++) {
          total += values[i].count;
        }
        return {count: total};
      }
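
The map and reduce functions on slide 24 can be traced by hand in plain Ruby. This is a minimal sketch, assuming a small in-memory array of posts with a `tags` field. The sample `posts` data and the `map_post` and `reduce_counts` helper names are made up for illustration and are not part of any database API.

```ruby
# Hypothetical sample data standing in for a collection of documents.
posts = [
  { title: "First post",  tags: ["nosql", "redis"] },
  { title: "Second post", tags: ["nosql", "mongodb"] },
  { title: "Third post",  tags: ["sql"] }
]

# Map step: emit one [key, value] pair per tag, like the JavaScript map.
def map_post(post)
  post[:tags].map { |tag| [tag, { count: 1 }] }
end

# Reduce step: collapse all values emitted for one key into a single value.
def reduce_counts(_key, values)
  { count: values.sum { |value| value[:count] } }
end

emitted = posts.flat_map { |post| map_post(post) }

# Group the emitted pairs by key, then reduce each group.
tag_counts = emitted
  .group_by { |tag, _value| tag }
  .map { |tag, pairs| [tag, reduce_counts(tag, pairs.map { |_tag, value| value })] }
  .to_h

puts tag_counts.inspect
```

A real engine would run the map step in parallel across many nodes; the grouping and reducing shown here are the same idea on one machine.
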
  25. Tokyo Cabinet, Dynomite, MemcachedDB, Voldemort
  26. Document Databases
  27. MongoDB
      • Document store that speaks BSON (Binary JSON)
      • Indexing, simple query syntax
      • GridFS
      • Deliberate MapReduce
  28. CouchDB
      • JSON Document Store
      • HTTP REST Interface
      • Incremental MapReduce
      • Intelligent Replication
  29. Column-Oriented Datastores
  30. Cassandra
      • Built by Facebook, used by Twitter
      • Pure horizontal scalability
      • Schemaless
  31. Graph Databases
  32. Neo4J
  33. When should I use this stuff?
  34. Complex, slow joins for “activity stream”
  35. Complex, slow joins for “activity stream”: Denormalize, use Key-Value Store
  36. Variable schema, vertical interaction
  37. Variable schema, vertical interaction: Document Database or Column Store
  38. Modeling multi-step relationships
  39. Modeling multi-step relationships: Graph Database
  40. NoSQL solves real scalability and data design issues
  41. Ben Scofield, bit.ly/state-of-nosql
  42. Ready to go?
  43. Just one problem...
  44. Your data is already in a SQL database
  45. We CAN all just get along.
  46. Three Ways
  47. The Hard(ish) Way
  48. The Easy Way
  49. A Better Way?
  50. The Hard Way: Do it by hand.
  51. class Post
        include MongoMapper::Document

        key :title, String
        key :body, String
        key :tags, Array
        key :user_id, Integer

        def user
          User.find_by_id(self.user_id)
        end

        def user=(some_user)
          self.user_id = some_user.id
        end
      end

      class User < ActiveRecord::Base
        def posts(options = {})
          Post.all({:conditions => {:user_id => self.id}}.merge(options))
        end
      end
  52. Pros & Cons
      • Simple, maps to your domain
      • Works for small, simple ORM intersections
      • MUCH simpler in Rails 3
      • Complex relationships are a mess
      • Makes your models fat
      • As DRY as the ocean
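
To see the shape of the hand-rolled approach without installing MongoMapper or ActiveRecord, here is a minimal sketch in plain Ruby. The `SqlUser` and `DocPost` classes and their in-memory `STORE` constants are hypothetical stand-ins for an ActiveRecord model and a MongoMapper document. Only the pattern (store the foreign id, look up the other side by hand) mirrors slide 51.

```ruby
# Stand-in for an ActiveRecord model backed by SQL (hypothetical).
class SqlUser
  STORE = {} # id => SqlUser, plays the role of the users table

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
    STORE[id] = self
  end

  def self.find_by_id(id)
    STORE[id]
  end

  # Hand-written "has many": query the other store by foreign key.
  def posts
    DocPost.all(user_id: id)
  end
end

# Stand-in for a MongoMapper document (hypothetical).
class DocPost
  STORE = [] # plays the role of the posts collection

  attr_accessor :title, :user_id

  def initialize(title)
    @title = title
    STORE << self
  end

  def self.all(conditions = {})
    STORE.select do |post|
      conditions.all? { |key, value| post.public_send(key) == value }
    end
  end

  # Hand-written "belongs to": only the id crosses the store boundary.
  def user
    SqlUser.find_by_id(user_id)
  end

  def user=(some_user)
    self.user_id = some_user.id
  end
end

alice = SqlUser.new(1, "Alice")
post = DocPost.new("Persistence Smoothie")
post.user = alice

puts post.user.name
puts alice.posts.map(&:title).inspect
```

Every new relationship needs another pair of methods like these, which is the "fat models" and repetition cost listed on slide 52.
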
  53. The Easy Way: DataMapper
  54. DataMapper
      • Generic, relational ORM
      • Speaks pretty much everything you’ve ever heard of
      • Implements Identity Map
      • Module-based inclusion
  55. DataMapper.setup(:default, "mysql://localhost")
      DataMapper.setup(:mongodb, "mongo://localhost/posts")

      class Post
        include DataMapper::Resource
        def self.default_repository_name; :mongodb; end

        property :title, String
        property :body, String
        property :tags, Array

        belongs_to :user
      end

      class User
        include DataMapper::Resource

        property :email, String
        property :name, String

        has n, :posts
      end
  56. Pros & Cons
      • The ultimate Polyglot ORM
      • Simple relationships between persistence engines are easy
      • Jack of all trades, master of none
      • Perpetuates (sometimes) false assumptions
      • Legacy stuff is in ActiveRecord anyway
  57. Is there a better way?
  58. Maybe.
  59. Gloo: Cross-ORM Relationship Mapper. github.com/intridea/gloo
  60. 0.0.0.prealpha.1
  61. Can’t we just sit down and talk to each other?
  62. class Post
        include MongoMapper::Document

        key :title, String
        key :body, String
        key :tags, Array

        gloo :active_record do
          belongs_to :user
        end
      end

      class User < ActiveRecord::Base
        gloo :mongo_mapper do
          many :posts
        end
      end
  63. Goals/Status
      • Be able to define relationships on the terms of any ORM from any class, ORM or not
      • Right Now: Partially working ActiveRecord relationships
      • Doing it wrong? Maybe
  64. Code Time: Schema4Less
  65. Social Storefront
      • Dummy application of a store that lets others “follow” your purchases (a less creepy Blippy?)
      • Four requirements:
        • users
        • purchasing
        • listings
        • social graph
  66. Users
      • I already have an authentication system
      • I’m happy with it
      • It’s Devise and ActiveRecord
      • Stick with SQL
  67. Purchasing
      • Users need to be able to purchase items from my storefront
      • I can’t lose their transactions
      • I need full ACID
      • I’ll use MySQL
  68. Social Graph
      • I want activity streams and one and two way relationships
      • I need speed
      • I don’t need consistency
      • I’ll use Redis
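
One way to picture the Redis-backed social graph is as follower sets plus one capped list per user's activity stream, written at purchase time (denormalized) so that reads need no joins. The sketch below is plain Ruby, with a `Set` and an `Array` standing in for Redis sets and lists. The `SocialGraph` class, the key names in the comments, and the stream cap of 100 are assumptions for illustration; the comments name the Redis commands each step would roughly correspond to.

```ruby
require "set"

class SocialGraph
  STREAM_LIMIT = 100 # assumed cap on stored events per stream

  def initialize
    @followers = Hash.new { |hash, key| hash[key] = Set.new }
    @streams = Hash.new { |hash, key| hash[key] = [] }
  end

  # One-way relationship: roughly SADD followers:<followed> <follower>.
  def follow(follower, followed)
    @followers[followed].add(follower)
  end

  # A two-way relationship is just two one-way follows.
  def befriend(a, b)
    follow(a, b)
    follow(b, a)
  end

  # Fan out on write: push the event onto each follower's stream
  # (roughly LPUSH stream:<follower> <event>, then LTRIM to cap it).
  def record_purchase(user, item)
    event = "#{user} bought #{item}"
    @followers[user].each do |follower|
      stream = @streams[follower]
      stream.unshift(event)
      stream.pop while stream.size > STREAM_LIMIT
    end
    event
  end

  # Reading a stream is a single lookup (roughly LRANGE stream:<user> 0 n-1).
  def stream_for(user, count = 10)
    @streams[user].first(count)
  end
end

graph = SocialGraph.new
graph.follow("bob", "alice")
graph.befriend("alice", "carol")
graph.record_purchase("alice", "a book")
graph.record_purchase("carol", "a movie")

puts graph.stream_for("bob").inspect
puts graph.stream_for("alice").inspect
```

The trade-off matches the slide: each purchase is copied into many streams and a follower may briefly see stale data, in exchange for fast reads.
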
  69. Product Listings
      • I am selling both movies and books
      • They have very different properties
      • Products are relatively non-relational
      • I’ll use MongoDB
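
The "very different properties" point can be illustrated with plain Ruby hashes playing the role of documents in a single collection. This is only a sketch: the product fields, the sample values, and the `where` helper are invented for illustration and do not reflect MongoDB's query API.

```ruby
# Hypothetical documents: one collection, no shared schema beyond a few keys.
products = [
  { type: "movie", title: "Example Movie", price: 9.99,
    director: "A. Director", runtime_minutes: 120 },
  { type: "book", title: "Example Book", price: 7.50,
    author: "A. Writer", pages: 412 }
]

# Illustrative helper: keep documents whose fields match every condition.
def where(documents, conditions)
  documents.select do |doc|
    conditions.all? { |key, value| doc[key] == value }
  end
end

books = where(products, type: "book")
puts books.map { |doc| doc[:title] }.inspect

# Fields that only some documents have are simply absent on the others.
puts products.map { |doc| doc.fetch(:pages, "n/a") }.inspect
```

In a SQL table the same data would need nullable columns for every product type or a separate table per type, which is the mismatch the slide points at.
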
  70. Demo and Walkthrough
  71. (no text)
  72. Wrapping Up
  73. These systems can (and should) live and work together
  74. Most important step is to actually think about data design
  75. When you have a whole bag of tools, things stop looking like nails
  76. Questions?