Benefits of using MongoDB: Reduce Complexity & Adapt to Changes
  • Hi everyone. It’s my pleasure to be here today. I’m going to talk about MongoDB, one of the most popular NoSQL databases.
  • Hi, my name is Alex. I’m a co-founder of Vinova, a Ruby on Rails and mobile app development shop in Singapore. We’ve been doing Rails for 5 years. We are growing and looking for projects. If you need our expertise, feel free to contact us.
  • I love SQL. I’ve done a lot of projects using MySQL, PostgreSQL ... I just found a better tool.
  • What’s MongoDB? MongoDB is an open source, document-oriented database that wants to be the best database for web apps (not for everything).
  • Document-oriented looks like this. Think of a document as a Hash in Ruby or an Object in JavaScript. You can store anything in a document: ids, strings, numbers, arrays, and other documents (embedded documents).
  • In a relational database we have tables and rows. In MongoDB we have collections and documents. You can think of collections as tables and documents as rows.
  • MongoDB tries to be as fast and scalable as key/value stores without losing functionality.
  • MongoDB has a lot of great features, such as a rich query interface and atomic, in-place updates.
  • My experience shows that ...
  • Why does MongoDB reduce complexity?
  • Because by using MongoDB we can get rid of migrations.
  • Get rid of relationships. Data that is not shared among objects, or that is small enough, can simply be stored as nested documents or arrays. So many 1-1 and 1-n relationships are not really necessary.
  • MongoDB helps reduce the number of database requests because the data is already pre-joined: 1-1 and 1-n relational data is stored as arrays or nested documents.
  • Because Mongo knows JSON, we don’t have to convert data to JSON format. We can pull JSON from Mongo and push it to the client as it is.
  • Atomic, in-place updates are a very powerful way to modify data. I’ll show you in one of the case studies later.
  • Feed MongoDB enough hardware resources to keep it running fast. When you need to scale your DB to multiple boxes, you just do it. If your target is to build the next Google or Facebook, you may need Hadoop, HBase, Hive or Cassandra. For most use cases, I think MongoDB is good enough for scaling.
  • A common use case I have met is storing information crawled from various third-party websites. Later we may want to add more sources, and the sources may change their data format in the future.
  • Normally, when using SQL, I have to create an additional table for each source. With MongoDB, I just push the object itself as an embedded document, like this.
  • Then later, any change in the data structure, like adding a new field,
  • or adding a new source, I just push to the product object. No migration, no new table to create.
  • And I can query that information later using dot notation.
  • Another problem that can take advantage of both MongoDB documents and the ability to index everything is product listing. I built an online catalogue application to show products, and a product can be listed in multiple categories in certain months.
  • In SQL I need an extra table to express which product is listed in which category and in which month. The listings table is not really a join table, since product_id and category_id can be duplicated.
  • To query products listed in a specific category and month, I need to join the products table with the listings table and run the query.
  • When using MongoDB we don’t need a listings table. We store listings as an array of value pairs [category_id, month].
  • We can index the listings array to speed up the query.
  • Instead of [category_id, month] pairs, we can store listings as an array of objects, so that people know explicitly which value is the category id and which is the month. But that requires more storage for the field names. I don’t recommend it for a simple data structure like listings.
  • Another example that shows the power of Mongo queries is finding a unique slug. We have many books with the same title “Ruby” but in different categories. In SQL we need n queries to find a unique slug for each of them. The algorithm is simple: init the slug from the book’s title and set a counter to zero. Check whether the slug is already in use; if yes, increase the counter, modify the slug, and continue until we find a unique one.
  • In Mongo, we don’t have to write the while loop, thanks to regular expression matching. First we init the original slug and a slug pattern that matches the original slug and its variants. Then we use regular expression matching to find the variant with the max counter value. If one is found, we extract the max counter value and increase it by one to create the unique slug. If the original slug and its variants are not in use, we return the original slug. And don’t forget to index the slug field to speed up your query.
  • The last case study is voting. By solving this problem in both SQL and Mongo, I will show you how flexible and powerful Mongo is at avoiding a join table and reducing the number of database requests. The problem is like this. In a forum, a user can vote for each post only once. Each vote can be an up vote or a down vote. Up votes and down votes have different vote points: +2 for an up vote and -1 for a down vote, for example. We need to cache votes_count and votes_point in the post so that we can query and sort by votes_count and votes_point later.
  • In SQL, we need a join table to store vote data.
  • Here is the algorithm to do voting in SQL. Check that the user has not voted for the post. Create the vote. Retrieve the post to get votes_point and votes_count. Update votes_point and votes_count and save the updated values to the database.
  • As you see, we need four database requests to do a vote in SQL.
  • Same for unvote.
  • When using Mongo, we can avoid the join table by storing votes as an embedded document in the post object itself. The votes.up array stores the ids of users who gave up votes, and the votes.down array stores the ids of users who gave down votes. votes.count and votes.point are there for querying and ordering purposes.
  • Here is the voting algorithm in Mongo. Given a post_id and a user_id, the query part finds the post and makes sure the user has not voted for the post yet. The update part puts the user id into the votes.up or votes.down array, depending on the vote value, and updates votes.point and votes.count.
  • By using Mongo’s find_and_modify command, I can query the post, do validation, update votes and return the updated data in just ONE database request.
  • Same for unvote.
  • I extracted the voting solution from one of our projects and released it as a gem. You can install it and check the source code on GitHub. Comments and contributions are welcome.
  • In summary, MongoDB is flexible, powerful and fun. Flexible: it is schema-less and document-oriented. Powerful: Mongo is fast, scalable, and has rich queries. Fun: you don’t have to think inside the SQL box (tables, columns, joins ...).
  • In case you want to know more about MongoDB, there are some selected slides in the References section covering MongoDB, schema design, indexing and query optimization.

Presentation Transcript

  • RedDotRubyConf 2011
    Benefits of MongoDB: Reduce Complexity & Adapt to Changes
    Vinova Pte Ltd
  • About me
    • Alex Nguyen
    • Co-founder at Vinova
    • http://vinova.sg/
    • https://github.com/vinova/
  • Agenda
    • What’s MongoDB?
    • Why does MongoDB reduce complexity?
    • Why does MongoDB adapt to changes better?
    • Case studies
  • I don’t hate SQL
    Just found a better tool for most of my use cases
  • What’s MongoDB?
    “MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database”
    mongodb.org
  • What’s MongoDB?
    http://www.slideshare.net/kbanker/mongodb-schema-design-mongo-chicago
  • What’s MongoDB?
    • Collections ~ Tables
    • Documents ~ Rows
  • MongoDB Philosophy
    • Reduce transactional semantics for performance
    • Non-relational is the best way to scale horizontally
    mongodb.org
  • MongoDB Features
    • JSON style documents
    • Index on any attribute
    • Rich queries
    • In-place update
    • Auto-sharding
    • Map / Reduce
    • GridFS to store files
    • Server-side JavaScript
    • Capped collections
    • Full-text-search (coming soon)
  • MongoDB’s flexible data structure, ability to index & query data, and auto-sharding make it a strong tool that adapts to changes well. It also helps reduce complexity compared to a traditional RDBMS.
  • Why does MongoDB reduce complexity?
    • Get rid of migrations
    • Get rid of relationships (most of them)
    • Reduce number of database requests
    • JSON (client, server, and database)
  • Get rid of migrations
    • No create table
    • No alter column
    • No add column
    • No change column
  • Get rid of relationships
    • Many one-to-one and one-to-many relationships are not necessary
      • User :has_one :setting
      • User :has_many :addresses
      • User :has_many :roles
      • Post :has_many :tags
  • Reduce number of database requests
    • Pre-joined
    • Rich queries
    • Atomic, in-place updates
  • JSON
    • MongoDB knows JSON
    • Don’t have to convert data from / to JSON
  • Adapt to changes
    • Changes in schema
    • Changes in data & algorithms
    • Changes for performance & scaling
  • Changes in schema
    • In modern apps, the schema changes quite often (weekly, monthly ...)
    • ALTER TABLE operations are expensive in an RDBMS
    • Dynamic-schema documents make those changes seamless
  • Changes in data & algorithms
    • Atomic, in-place updates are very powerful to modify data
      $inc, $set, $unset, $push, $pop, $rename, $bit
    • Rich queries and aggregators
      $all, $exists, $in, $size, $type, regexp
      count(), size(), distinct(), min(), max()
    • Map/Reduce
  • Changes for performance & scaling
    • Very fast & ready to scale =>
      • Don’t have to use additional tools (memcached ...)
      • Don’t have to change platforms
  • Case Studies
    • Store crawled info as embedded documents
    • Product listing
    • Find unique slug
    • Voting
  • Store crawled info as embedded documents
    • Data from 3rd party sources
    • Sources and data formats can be changed in the future
  • Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., .... }
    };
  • Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., "shipping_cost" : ..., ... }
    };
  • Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., "shipping_cost" : ..., .... },
      "walmart" : { "price" : ..., ... }
    };
  • Store crawled info as embedded documents
    def Product.find_by_asin(asin)
      # Dot notation reaches into the embedded "amazon" document;
      # the key must be a quoted string
      Product.where("amazon.asin" => asin).first
    end
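The dot-notation lookup above can be tried without a database. The following is only a sketch: it assumes a hypothetical in-memory array `PRODUCTS` standing in for the products collection, and the helpers `value_at` and `find_by` are made-up names for illustration, not part of Mongoid or MongoDB. It shows how a path such as "amazon.asin" walks into an embedded document, and how a document that lacks a source simply does not match.

```ruby
# Hypothetical in-memory stand-in for the products collection.
PRODUCTS = [
  { "name" => "Product ABC", "amazon" => { "asin" => "A1", "price" => 10 } },
  { "name" => "Product XYZ", "amazon" => { "asin" => "B2", "price" => 25 },
    "walmart" => { "price" => 24 } }
].freeze

# Resolve a dotted path such as "amazon.asin" against a nested Hash.
# A missing key anywhere along the path yields nil.
def value_at(document, dotted_path)
  document.dig(*dotted_path.split("."))
end

# Return the first document whose value at `path` equals `expected`.
def find_by(path, expected)
  PRODUCTS.find { |doc| value_at(doc, path) == expected }
end
```

For example, `find_by("walmart.price", 24)` skips the first product, which has no "walmart" key at all, without any schema change being needed.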
  • Product listing
    • A product can be listed in multiple categories in certain months
  • Product listing (SQL)
    • Need an extra table to express which product is listed in which category and in which month
    product_id  category_id  month
    1           2            2011-03
    1           2            2011-04
  • Product listing (SQL)
    • To query products listed in category 2 and month ‘2011-04’
    Product.joins(:listings).where("listings.category_id = ? AND listings.month = ?", 2, "2011-04")
  • Product listing (Mongo)
    • Store listings in product itself
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "listings" : [ [1, "2011-01"], [1, "2011-04"], [3, "2011-01"] ]
    };
    • Query is simpler
    Product.where("listings" => [1, "2011-04"])
    • Can index listings array
    db.products.ensureIndex({"listings" : 1 });
  • Simplify product listing (Mongo)
    • Clearer but more data storage
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "listings" : [
        {"category_id" : 1, "month" : "2011-01" },
        {"category_id" : 1, "month" : "2011-04" },
        {"category_id" : 3, "month" : "2011-01" }
      ]
    };
    db.products.find({"listings" : {"category_id" : 1, "month" : "2011-04" }})
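To make the array-of-pairs idea concrete, here is a small sketch of the matching rule in plain Ruby. It assumes a hypothetical in-memory `CATALOG` array in place of the products collection, and `listed_in` is an illustrative helper, not a driver or Mongoid call. A product matches when its listings array contains an element equal to the whole [category_id, month] pair.

```ruby
# Hypothetical in-memory stand-in for the products collection.
CATALOG = [
  { "name" => "Product ABC",
    "listings" => [[1, "2011-01"], [1, "2011-04"], [3, "2011-01"]] },
  { "name" => "Product DEF",
    "listings" => [[2, "2011-04"]] }
].freeze

# Select products whose listings contain the exact [category_id, month] pair.
def listed_in(category_id, month)
  CATALOG.select { |doc| doc["listings"].include?([category_id, month]) }
end
```

The month has to be passed as a string. In Ruby an unquoted `2011-04` is integer subtraction, so `listed_in(1, 2011-04)` looks for a different pair and finds nothing.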
  • Find unique slug (SQL)
    • book1 = #<Book id: .., title => “Ruby”, ... >
    • book2 = #<Book id: .., title => “Ruby”, ... >
    • book2.uniq_slug => /books/ruby-1
    • Need n queries to find a unique slug
    def uniq_slug
      slug = original_slug = title.to_slug
      counter = 0
      # One query per candidate slug until a free one is found
      while (Book.where(:slug => slug).count > 0)
        counter += 1
        slug = "#{original_slug}-#{counter}"
      end
      slug
    end
  • Find unique slug (Mongo)
    • Need one query using regexp matching
    def find_uniq_slug
      original_slug = title.to_slug
      # Matches the original slug and its numbered variants ("ruby", "ruby-1", ...)
      slug_pattern = /^#{original_slug}(-\d+)?$/
      book = Book.where(:slug => slug_pattern).
               order(:slug.desc).limit(1).first
      if book
        # A bare original slug has no counter, which counts as 0
        max_counter = book.slug[/-(\d+)$/, 1].to_i
        "#{original_slug}-#{max_counter + 1}"
      else
        original_slug
      end
    end
    db.books.ensureIndex({"slug" : -1 })
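The single-pass idea behind the regexp approach can be sketched in plain Ruby. This is an illustration under assumptions, not the slide's implementation: `next_unique_slug` is a hypothetical helper, and the existing slugs are passed in as an array instead of being queried. Unlike sorting slugs as strings, where "ruby-9" sorts after "ruby-10", this sketch compares the counters numerically.

```ruby
# Return a slug that is not in existing_slugs, assuming variants are named
# "<slug>-1", "<slug>-2", and so on.
def next_unique_slug(original_slug, existing_slugs)
  # Escape the slug so characters such as "." are matched literally.
  pattern = /\A#{Regexp.escape(original_slug)}(?:-(\d+))?\z/

  # Collect the counter of every matching slug; the bare slug counts as 0.
  counters = existing_slugs.map do |slug|
    match = pattern.match(slug)
    match && match[1].to_i
  end.compact

  return original_slug if counters.empty?

  "#{original_slug}-#{counters.max + 1}"
end
```

For example, with "ruby", "ruby-9" and "ruby-10" already taken, the sketch picks the numerically largest counter and returns "ruby-11".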
  • Voting
    • A user can vote for each post only once
    • Up / down votes have different points
    • Cache votes_count and votes_point in the post for sorting and querying
      • Post.max(:votes_point)
      • Post.order_by(:votes_count.desc)
  • Voting (SQL)
    • Use an extra votes table to store vote data
  • Voting (SQL)
    def vote(user_id, post_id, value)
      # Validate
      not_voted = Vote.where(:user_id => user_id, :post_id => post_id).count == 0
      if not_voted
        # Create a new vote
        Vote.create(:user_id => user_id, :post_id => post_id, :value => value)
        # Get post
        post = Post.find(post_id)
        # Update votes_point & votes_count
        post.votes_point += POINT[value]
        post.votes_count += 1
        post.save
      end
    end
    • 4 requests
  • Voting (SQL)
    def unvote(user_id, post_id)
      # Get current vote
      vote = Vote.where(:user_id => user_id, :post_id => post_id).first
      # Check if voted
      if vote
        # Destroy vote
        vote.destroy
        # Get post
        post = Post.find(post_id)
        # Update votes_point & votes_count
        post.votes_point -= POINT[vote.value]
        post.votes_count -= 1
        post.save
      end
    end
    • 4 requests
  • Voting (Mongo)
    • Embed votes data in the post
    • Use arrays to store who voted up and who voted down
    post = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "title" : "Post ABC",
      ....
      "votes" : {
        "up" : [ user_id_1 ],
        "down" : [ user_id_2 ],
        "count" : 2,
        "point" : -1
      }
    };
  • Voting (Mongo)
    def vote(user_id, post_id, value)
      # Find post with post_id that was not up voted or down voted by user_id
      query = {
        :_id => post_id,
        "votes.up" => { "$ne" => user_id },
        "votes.down" => { "$ne" => user_id }
      }
      # Push user_id to votes.up if vote up or votes.down if vote down
      # and update votes.point and votes.count
      update = {
        "$push" => { (value == :up ? "votes.up" : "votes.down") => user_id },
        "$inc" => { "votes.point" => POINT[value], "votes.count" => +1 }
      }
      # Validate, update and get result
      post = Post.collection.find_and_modify(
        :query => query,
        :update => update,
        :new => true # return post after update votes data
      )
    end
    • one request
  • Voting (Mongo)
    # value: the vote being removed (:up or :down), supplied by the caller
    def unvote(user_id, post_id, value)
      # Find post with post_id that was up voted or down voted by user_id
      query = {
        :_id => post_id,
        "$or" => [ { "votes.up" => user_id }, { "votes.down" => user_id } ]
      }
      # Pull user_id from both votes.up and votes.down
      # and update votes.point and votes.count
      update = {
        "$pull" => { "votes.up" => user_id, "votes.down" => user_id },
        "$inc" => { "votes.point" => -POINT[value], "votes.count" => -1 }
      }
      # Validate, update and get result
      post = Post.collection.find_and_modify(
        :query => query,
        :update => update,
        :new => true # return post after update votes data
      )
    end
    • one request
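The guard-then-update shape of the Mongo vote and unvote calls can be mimicked in plain Ruby to check the bookkeeping. This is only an in-memory sketch under assumptions: `new_post`, `vote` and `unvote` here are hypothetical helpers that work on a Hash, the weights in `POINT` are the example values from the talk (+2 up, -1 down), and nothing in it is atomic. The atomicity in the slides comes from the database performing the query and the update as one operation.

```ruby
# Example weights from the talk: +2 for an up vote, -1 for a down vote.
POINT = { up: 2, down: -1 }.freeze

# Build a post with an empty embedded votes document.
def new_post
  { "votes" => { "up" => [], "down" => [], "count" => 0, "point" => 0 } }
end

# Apply the update only if the "query" condition holds: the user is in
# neither array. Returns the post, or nil when nothing was changed.
def vote(post, user_id, value)
  points = POINT.fetch(value) # raises KeyError for anything but :up / :down
  votes = post["votes"]
  return nil if votes["up"].include?(user_id) || votes["down"].include?(user_id)

  votes[value.to_s] << user_id # like $push
  votes["point"] += points     # like $inc
  votes["count"] += 1
  post
end

# Remove a previous vote, working out its value from the array it is in.
def unvote(post, user_id)
  votes = post["votes"]
  value = %i[up down].find { |v| votes[v.to_s].include?(user_id) }
  return nil unless value

  votes[value.to_s].delete(user_id) # like $pull
  votes["point"] -= POINT.fetch(value)
  votes["count"] -= 1
  post
end
```

A second vote by the same user returns nil and leaves the counters untouched, which is the role the `$ne` conditions play in the query part of the slides.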
  • Voting
    • For a complete solution:
    • gem install voteable_mongoid
    • visit https://github.com/vinova/voteable_mongoid
  • Summary
    • MongoDB is
      • Flexible
      • Powerful
      • Fun
  • Thank you
    Alex Nguyen
    @tiendungalex@vinova.sg
  • References
    Introduction to MongoDB
    • http://scribd.com/doc/26506063/Introduction-To-MongoDB
    • http://slideshare.net/jnunemaker/why-mongodb-is-awesome
    Schema Design
    • http://slideshare.net/kbanker/mongodb-schema-design-mongo-chicago
    Indexing & Query Optimization
    • http://slideshare.net/mongodb/indexing-with-mongodb
    • http://slideshare.net/mongodb/mongodb-indexing-the-details