Practical Ruby Projects with MongoDB - Ruby Midwest

10,216 views
10,086 views

Published on

Published in: Technology
0 Comments
28 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
10,216
On SlideShare
0
From Embeds
0
Number of Embeds
2,821
Actions
Shares
0
Downloads
182
Comments
0
Likes
28
Embeds 0
No embeds

No notes for slide

Practical Ruby Projects with MongoDB - Ruby Midwest

  1. 1. PRACTICAL PROJECTS WITH Saturday, July 17, 2010
  2. 2. WHO AM I? Alex Sharp Lead Developer at OptimisDev @ajsharp (on the twitters) alexjsharp.com (on the ‘tubes) github.com/ajsharp (on the ‘hubs) Saturday, July 17, 2010
  3. 3. MAJOR POINTS • Brief intro to MongoDB • Practical Projects • Making the case for Mongo (time permitting) Saturday, July 17, 2010
  4. 4. BRIEF INTRO Saturday, July 17, 2010
  5. 5. WHAT IS MONGODB? mongodb is a high-performance, schema-less, document-oriented database Saturday, July 17, 2010
  6. 6. STRENGTHS Saturday, July 17, 2010
  7. 7. SCHEMA-LESS Saturday, July 17, 2010
  8. 8. SCHEMA-LESS • Great for rapid, iterative, agile development Saturday, July 17, 2010
  9. 9. SCHEMA-LESS • Great for rapid, iterative, agile development • Makes things possible that in an rdbms are either a. impossibly hard b. virtually impossible c. way harder than they should be Saturday, July 17, 2010
  10. 10. DOCUMENT-ORIENTED Saturday, July 17, 2010
  11. 11. DOCUMENT-ORIENTED • Mongo stores documents, not rows • documents are stored as binary json Saturday, July 17, 2010
  12. 12. DOCUMENT-ORIENTED • Mongo stores documents, not rows • documents are stored as binary json • Allows for a rich query syntax Saturday, July 17, 2010
  13. 13. DOCUMENT-ORIENTED • Mongo stores documents, not rows • documents are stored as binary json • Allows for a rich query syntax • Embedded documents eliminate many modeling headaches Saturday, July 17, 2010
  14. 14. BIG DATA Saturday, July 17, 2010
  15. 15. BIG DATA • Built to scale horizontally into the cloudz Saturday, July 17, 2010
  16. 16. BIG DATA • Built to scale horizontally into the cloudz • Auto-sharding • set a shard-key, a cluster of nodes, go • still in alpha, but production-ready soon Saturday, July 17, 2010
  17. 17. FAST WRITES Saturday, July 17, 2010
  18. 18. FAST WRITES • Mongo writes are “fire-and-forget” • you can get a response with getLastError() Saturday, July 17, 2010
  19. 19. FAST WRITES • Mongo writes are “fire-and-forget” • you can get a response with getLastError() • In-place updates Saturday, July 17, 2010
  20. 20. FAST WRITES • Mongo writes are “fire-and-forget” • you can get a response with getLastError() • In-place updates • Lot’s of db commands for fast updates Saturday, July 17, 2010
  21. 21. AGGREGATION Map/reduce for aggregation Saturday, July 17, 2010
  22. 22. READS FAST TOO Optimize reads with indexes, just like an rdbms Saturday, July 17, 2010
  23. 23. WEAKNESSES Saturday, July 17, 2010
  24. 24. JOINS nope. Saturday, July 17, 2010
  25. 25. MULTI-DOCUMENT TRANSACTIONS no-sir-ee. Saturday, July 17, 2010
  26. 26. RELATIONAL INTEGRITY not a chance. Saturday, July 17, 2010
  27. 27. PRACTICAL PROJECTS Saturday, July 17, 2010
  28. 28. 1. Accounting Application 2. Capped Collection Logging 3. Blogging app 4. Reporting app Saturday, July 17, 2010
  29. 29. GENERAL LEDGER ACCOUNTING APPLICATION Saturday, July 17, 2010
  30. 30. THE OBJECT MODEL ledger * transactions * entries Saturday, July 17, 2010
  31. 31. THE OBJECT MODEL # Credits Debits Line item { 1 { :account => “Cash”, :amount => 100.00 } { :account => “Notes Pay.”, :amount => 100.00 } Ledger Entries Line item { 2 { :account => “A/R”, :amount => 25.00 } { :account => “Gross Revenue”, :amount => 25.00 } Saturday, July 17, 2010
  32. 32. THE OBJECT MODEL Saturday, July 17, 2010
  33. 33. THE OBJECT MODEL • Each ledger line item belongs to a ledger Saturday, July 17, 2010
  34. 34. THE OBJECT MODEL • Each ledger line item belongs to a ledger • Each ledger line item has two ledger entries • must have at least one credit and one debit • credits and debits must balance Saturday, July 17, 2010
  35. 35. THE OBJECT MODEL • Each ledger line item belongs to a ledger • Each ledger line item has two ledger entries • must have at least one credit and one debit • credits and debits must balance • These objects must be transactional Saturday, July 17, 2010
  36. 36. SQL-BASED OBJECT MODEL @debit = Entry.new(:account => 'Cash', :amount => 100.00, :type => 'credit') @credit = Entry.new(:account => 'Notes Pay.', :amount => 100.00, :type => 'debit') @line_item = LineItem.new :ledger_id => 1, :entries => [@debit, @credit] Saturday, July 17, 2010
  37. 37. SQL-BASED OBJECT MODEL @debit = Entry.new(:account => 'Cash', :amount => 100.00, :type => 'credit') @credit = Entry.new(:account => 'Notes Pay.', :amount => 100.00, :type => 'debit') @line_item = LineItem.new :ledger_id => 1, :entries => [@debit, @credit] In a SQL schema, we need a database transaction to ensure multi-row atomicity Saturday, July 17, 2010
  38. 38. SQL-BASED OBJECT MODEL @debit = Entry.new(:account => 'Cash', :amount => 100.00, :type => 'credit') @credit = Entry.new(:account => 'Notes Pay.', :amount => 100.00, :type => 'debit') @line_item = LineItem.new :ledger_id => 1, :entries => [@debit, @credit] Remember: we have to create 3 new records in the database here Saturday, July 17, 2010
  39. 39. MONGO-FIED MODELING @line_item = LineItem.new(:ledger_id => 1, :entries => [ {:account => 'Cash', :type => 'debit', :amount => 100}, {:account => 'Notes pay.', :type => 'debit', :amount => 100} ] ) Saturday, July 17, 2010
  40. 40. MONGO-FIED MODELING @line_item = LineItem.new(:ledger_id => 1, :entries => [ {:account => 'Cash', :type => 'debit', :amount => 100}, {:account => 'Notes pay.', :type => 'debit', :amount => 100} ] ) This is the perfect case for embedded documents. Saturday, July 17, 2010
  41. 41. MONGO-FIED MODELING @line_item = LineItem.new(:ledger_id => 1, :entries => [ {:account => 'Cash', :type => 'debit', :amount => 100}, {:account => 'Notes pay.', :type => 'debit', :amount => 100} ] ) We will never have more than a few entries (usually two) Saturday, July 17, 2010
  42. 42. MONGO-FIED MODELING @line_item = LineItem.new(:ledger_id => 1, :entries => [ {:account => 'Cash', :type => 'debit', :amount => 100}, {:account => 'Notes pay.', :type => 'debit', :amount => 100} ] ) Embedded docs are perfect for “contains” many relationships Saturday, July 17, 2010
  43. 43. MONGO-FIED MODELING @line_item = LineItem.new(:ledger_id => 1, :entries => [ {:account => 'Cash', :type => 'debit', :amount => 100}, {:account => 'Notes pay.', :type => 'debit', :amount => 100} ] ) DB-level transactions no-longer needed. Mongo has single document atomicity. Saturday, July 17, 2010
  44. 44. LESSONS Saturday, July 17, 2010
  45. 45. LESSONS • Simplified, de-normalized data model • Objects modeled differently in Mongo than SQL Saturday, July 17, 2010
  46. 46. LESSONS • Simplified, de-normalized data model • Objects modeled differently in Mongo than SQL • Simpler data model + no transactions = WIN Saturday, July 17, 2010
  47. 47. LOGGING WITH CAPPED COLLECTIONS Saturday, July 17, 2010
  48. 48. BASELESS STATISTIC 95% of the time logs are NEVER used. Saturday, July 17, 2010
  49. 49. MOST LOGS ARE ONLY USED WHEN SOMETHING GOES WRONG Saturday, July 17, 2010
  50. 50. CAPPED COLLECTIONS • Fixed-sized, limitedoperation, auto age-out collections (kinda like memcached, except persistent) • Fixed insertion order • Super fast (faster than normal writes) • Ideal for logging and caching Saturday, July 17, 2010
  51. 51. GREAT. WHAT’S THAT MEAN? We can log additional pieces of arbitrary data effortlessly due to Mongo’s schemaless nature. Saturday, July 17, 2010
  52. 52. THIS IS AWESOME Saturday, July 17, 2010
  53. 53. QUERY. ANALYZE. PROFIT. Saturday, July 17, 2010
  54. 54. ALSO A REALLY HANDY TROUBLESHOOTING TOOL Saturday, July 17, 2010
  55. 55. User-reported errors are difficult to diagnose Saturday, July 17, 2010
  56. 56. Wouldn’t it be useful to see the complete click- path? Saturday, July 17, 2010
  57. 57. BUNYAN http://github.com/ajsharp/bunyan Thin ruby layer around a MongoDB capped collection Saturday, July 17, 2010
  58. 58. BUNYAN http://github.com/ajsharp/bunyan Still needs a middleware api. Want to contribute? Saturday, July 17, 2010
  59. 59. require 'bunyan' Bunyan::Logger.configure do |c| c.database 'my_bunyan_db' c.collection "#{Rails.env}_bunyan_log" c.size 1073741824 # 1.gigabyte end Saturday, July 17, 2010
  60. 60. PAPERMILL http://github.com/ajsharp/papermill SINATRA FRONT-END TO BUNYAN Saturday, July 17, 2010
  61. 61. BLOGGING APPLICATION Saturday, July 17, 2010
  62. 62. BLOGGING APPLICATION Much easier to model with Mongo than a relational database Saturday, July 17, 2010
  63. 63. BLOGGING APPLICATION A post has an author Saturday, July 17, 2010
  64. 64. BLOGGING APPLICATION A post has an author A post has many tags Saturday, July 17, 2010
  65. 65. BLOGGING APPLICATION A post has an author A post has many tags A post has many comments Saturday, July 17, 2010
  66. 66. BLOGGING APPLICATION A post has an author A post has many tags A post has many comments Instead of JOINing separate tables, we can use embedded documents. Saturday, July 17, 2010
  67. 67. MONGOMAPPER Saturday, July 17, 2010
  68. 68. MONGOMAPPER •MongoDB “ORM” developed by John Nunemaker Saturday, July 17, 2010
  69. 69. MONGOMAPPER •MongoDB “ORM” developed by John Nunemaker •Very similar syntax to DataMapper Saturday, July 17, 2010
  70. 70. MONGOMAPPER •MongoDB “ORM” developed by John Nunemaker •Very similar syntax to DataMapper •Declarative rather than inheritance-based Saturday, July 17, 2010
  71. 71. MONGOMAPPER •MongoDB “ORM” developed by John Nunemaker •Very similar syntax to DataMapper •Declarative rather than inheritance-based •Very easy to drop into rails Saturday, July 17, 2010
  72. 72. MONGOMAPPER require 'mongo' connection = Mongo::Connection.new.db("#{Rails.env}_app") class Post include MongoMapper::Document belongs_to :author, :class_name => "User" key :title, String, :reuqired => true key :body, String, :required => true key :author_id, Integer, :required => true key :published_at, Time key :published, Boolean, :default => false key :tags, Array, :default => [] timestamps! end Saturday, July 17, 2010
  73. 73. REPORTING APP Saturday, July 17, 2010
  74. 74. Saturday, July 17, 2010
  75. 75. REPORTING IS HARD • Synchronization • Data, application API, schema • Schema synchronization is the hard one Saturday, July 17, 2010
  76. 76. WHY MONGO? • Aggregation support with Mongo’s map/reduce support • Mongo is good (and getting better) at handling big data sets • Schemaless is perfect for a flattened data set Saturday, July 17, 2010
  77. 77. Amazon EC2 Rackspace Transform MongoDB Server Reporting App WAN Main App MySQL Saturday, July 17, 2010
  78. 78. CLIENT REQUESTS Amazon EC2 MongoDB Server map/reduce Reporting App User requests Client request GET /reports/some_report Saturday, July 17, 2010
  79. 79. TRANFORMATION Data needs to extracted from the the main app, transormed (i.e. flattened) into some other form, and loaded into the reporting app for aggregation Saturday, July 17, 2010
  80. 80. TRANSFORMATION Rackspace Amazon EC2 1. GET /object_dependencies Transform MongoDB Server { :visit => [ :insurances, :patients ] } 2. read from active record MySQL 2. write to mongo Reporting App Main App Saturday, July 17, 2010
  81. 81. MAKING THE CASE Saturday, July 17, 2010
  82. 82. THE PROBLEMS OF SQL Saturday, July 17, 2010
  83. 83. A BRIEF HISTORY OF SQL • Developed at IBM in early 1970’s. • Designed to manipulate and retrieve data stored in relational databases Saturday, July 17, 2010
  84. 84. A BRIEF HISTORY OF SQL • Developed at IBM in early 1970’s. • Designed to manipulate and retrieve data stored in relational databases this is a problem Saturday, July 17, 2010
  85. 85. WHY?? Saturday, July 17, 2010
  86. 86. WE DON’T WORK WITH DATA... Saturday, July 17, 2010
  87. 87. WE WORK WITH OBJECTS Saturday, July 17, 2010
  88. 88. WE DON’T CARE ABOUT STORING DATA Saturday, July 17, 2010
  89. 89. WE CARE ABOUT PERSISTING STATE Saturday, July 17, 2010
  90. 90. DATA != OBJECTS Saturday, July 17, 2010
  91. 91. THEREFORE... Saturday, July 17, 2010
  92. 92. RELATIONAL DBS ARE AN ANTIQUATED TOOL FOR OUR NEEDS Saturday, July 17, 2010
  93. 93. OK, SO WHAT? SQL schemas are designed for storing and querying data, not persisting objects. Saturday, July 17, 2010
  94. 94. To reconcile this mismatch, we have ORM’s Saturday, July 17, 2010
  95. 95. Object Relational Mappers Saturday, July 17, 2010
  96. 96. OK, SO WHAT? We need ORMs to bridge the gap (map) between our native Ruby object model and a SQL schema Saturday, July 17, 2010
  97. 97. We’re forced to create relationships for data when what we really want is properties for our objects. Saturday, July 17, 2010
  98. 98. THIS IS NOT EFFICIENT Saturday, July 17, 2010
  99. 99. @alex = Person.new( :name => "alex", :stalkings => [ Friend.new("Jim"), Friend.new("Bob")] ) Saturday, July 17, 2010
  100. 100. Native ruby object <Person:0x10017d030 @name="alex", @stalkings= [#<Friend:0x10017d0a8 @name="Jim">, #<Friend:0x10017d058 @name="Bob"> ]> Saturday, July 17, 2010
  101. 101. JSON/Mongo Representation @alex.to_json { name: "alex", stalkings: [{ name: "Jim" }, { name: "Bob" }] } Saturday, July 17, 2010
  102. 102. SQL Schema Representation people: - name stalkings: - name - stalker_id Saturday, July 17, 2010
  103. 103. SQL Schema Representation people: - name stalkings: What!? - name - stalker_id Saturday, July 17, 2010
  104. 104. SQL Schema Representation people: - name stalkings: what’s a stalker_id? - name - stalker_id Saturday, July 17, 2010
  105. 105. SQL Schema Representation people: - name stalkings: this is our sql mapping - name - stalker_id Saturday, July 17, 2010
  106. 106. SQL Schema Representation people: - name stalkings: this is the - name “pointless” join - stalker_id Saturday, July 17, 2010
  107. 107. Ruby -> JSON -> SQL <Person:0x10017d030 @name="alex", @stalkings= Ruby [#<Friend:0x10017d0a8 @name="Jim">, #<Friend:0x10017d058 @name="Bob"> ]> @alex.to_json JSON { name: "alex", stalkings: [{ name: "Jim" }, { name: "Bob" }] } people: - name SQL stalkings: - name - stalker_id Saturday, July 17, 2010
  108. 108. Ruby -> JSON -> SQL <Person:0x10017d030 @name="alex", @stalkings= Ruby [#<Friend:0x10017d0a8 @name="Jim">, #<Friend:0x10017d058 @name="Bob"> ]> @alex.to_json JSON { name: "alex", stalkings: [{ name: "Jim" }, { name: "Bob" }] } people: - name Feels like we’re having SQL stalkings: to work too hard here - name - stalker_id Saturday, July 17, 2010
  109. 109. You’re probably thinking... “Listen GUY, SQL isn’t that bad” Saturday, July 17, 2010
  110. 110. Maybe. Saturday, July 17, 2010
  111. 111. But we can do much, much better. Saturday, July 17, 2010
  112. 112. NEEDS FOR A PERSISTENCE LAYER: Saturday, July 17, 2010
  113. 113. NEEDS FOR A PERSISTENCE LAYER: 1. To persist the native state of our objects Saturday, July 17, 2010
  114. 114. NEEDS FOR A PERSISTENCE LAYER: 1. To persist the native state of our objects 2. Should NOT interfere w/ application development Saturday, July 17, 2010
  115. 115. NEEDS FOR A PERSISTENCE LAYER: 1. To persist the native state of our objects 2. Should NOT interfere w/ application development 3. Provides features necessary to build modern web apps Saturday, July 17, 2010
  116. 116. NEXT STEPS Saturday, July 17, 2010
  117. 117. USING MONGO W/ RUBY • Ruby mongo driver • MongoMapper (github.com/jnunemaker/mongomapper) • MongoID (github.com/durran/mongoid) • Many other ORM/ODM’s • mongo-sequel (github.com/jeremyevans/sequel-mongo) • moneta (github.com/wycats/moneta) Saturday, July 17, 2010
  118. 118. MONGOMAPPER • DataMapper-like API • good for moving from a sql schema to mongo • easy to drop into rails • works with rails 3 Saturday, July 17, 2010
  119. 119. MONGOID • DataMapper-like API • uses ActiveModel, so familiar validation syntax • more opinionated embedded docs • easy to drop into rails • rails 2 - mongoid 1.x • rails 3 - mongoid 2.x Saturday, July 17, 2010
  120. 120. QUESTIONS? Saturday, July 17, 2010
  121. 121. THANKS! Saturday, July 17, 2010

×