The Emerging World of MongoDB (csp)

The Emerging World of MongoDB (csp): Presentation Transcript

  • 1. The Emerging World of MongoDB
  • 2. Agenda: Reflection; What NoSQL means; We're thinking about changes; What MongoDB is like; Let's get started with MongoDB; What's going wrong?; Document structure; Basic operations (CRUD); Index, explain, hint; Data model: be quiet, the answer; Sharding and scaling
  • 3. { "Name" : "Carlos Sánchez Pérez", "Love" : "Web Dev", "Title" : "Rough Boy", "Twitter" : "@carlossanchezp", "Blog" : "carlossanchezperez.wordpress.com", "Job" : "ASPgems", "Github" : "carlossanchezp" }
  • 4. REFLECTION: Through all this time one thing has stayed constant: relational databases store the data, and that decision is almost always implied.
  • 5. What does NoSQL mean?
  • 6. The true spirit of “NoSQL” lies not in the way data is queried but in the way data is stored. NoSQL is all about data storage. “NoSQL” should be called “SQL with alternative storage models”.
  • 7. We are thinking about changes... Wait, let me show you.
  • 8. ● How will we add new machines? ● Are there any single points of failure? ● Do the writes scale as well? ● How much administration will the system require? ● If it's open source, is there a healthy community? ● How much time and effort would we have to expend to deploy and integrate it? ● Does it use technology which we know we can work with?
  • 9. What is MongoDB like?
  • 10. MongoDB is a powerful, flexible, and scalable general-purpose database
  • 11. Ease of Use: MongoDB is a document-oriented database, not a relational one. One of the reasons for moving away from the relational model is to make scaling out easier.
  • 12. Ease of Use A document-oriented database replaces the concept “row” with a more flexible model: “document”. By allowing embedded documents and arrays, the document-oriented approach makes it possible to represent complex hierarchical relationships with a single record.
  • 13. Ease of Use Without a fixed schema, adding or removing fields as needed becomes easier. This makes development faster as developers can quickly iterate. It is also easier to experiment. Developers can try a lot of models for the data and then choose the best one.
  • 14. Easy Scaling: Data set sizes for applications are growing at an incredible pace. As the amount of data that developers need to store grows, developers face a difficult decision: how should they scale their databases? Scaling a database comes down to the choice between scaling up (getting a bigger machine) and scaling out (partitioning data across more machines).
  • 15. Let’s Get Started
  • 16. Let's see some of the basic concepts of MongoDB: • A document is the basic unit of data for MongoDB and is equivalent to a row in a relational database. • A collection can be thought of as a table with a dynamic schema. • A single instance of MongoDB can host multiple independent databases, each of which can have its own collections. • Every document has a special key, "_id", that is unique within a collection. • MongoDB comes with a powerful JavaScript shell, which is useful for administration and data manipulation.
  • 17. Schemaless dynamics: MongoDB is "schemaless", but that doesn't mean you don't need to think about designing your schema! "Dynamic schema" means there is no ALTER TABLE and no schema migration.
  • 18. At the core of MongoDB is the document: an ordered set of keys with associated values. In JavaScript, for example, documents are represented as objects: {"name" : "Hello, MongoDB world!"} {"name" : "Hello, MongoDB world!", "foo" : 3} {"name" : "Hello, MongoDB world!", "foo" : 3, "fruit": ["pear","apple"]}
  • 19. let's start working
  • 20. What's going wrong?
  • 21. The keys in a document are strings. Any UTF-8 character is allowed in a key, with a few notable exceptions: • Keys must not contain the character \0 (the null character). This character is used to signify the end of a key. • The . and $ characters have some special properties and should be used only in certain circumstances; in general, they should be considered reserved.
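The key rules on this slide can be sketched as a small check in plain JavaScript. The function name `isSafeKey` is hypothetical (it is not part of MongoDB or any driver), and it treats `.` and `$` as fully reserved, which is an assumption that is stricter than the slide's "only in certain circumstances".

```javascript
// Hypothetical helper: not a MongoDB API, just the slide's rules written as code.
function isSafeKey(key) {
  if (typeof key !== "string") return false;          // keys are strings
  if (key.includes("\0")) return false;               // the null character ends a key
  if (key.includes(".") || key.includes("$")) return false; // treated as reserved here
  return true;
}

console.log(isSafeKey("title"));     // true
console.log(isSafeKey("image.url")); // false
console.log(isSafeKey("$set"));      // false
```

A check like this would run in the application before documents are built; it does not replace whatever validation the server itself performs.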
  • 22. SQL terms/concepts → MongoDB terms/concepts: database → database; table → collection; row → document or BSON document; column → field; index → index; foreign key (joins) → embedded documents and linking; primary key → primary key, automatically set to the _id field.
  • 23. MongoDB is type-sensitive and case-sensitive. For example, these documents are distinct: {"foo" : "3"} {"foo" : 3} {"Foo" : 3} A document in MongoDB cannot contain duplicate keys. For example, this is not a legal document: {"name" : "Hello, world!", "name" : "Hello, MongoDB!"}
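Both points can be tried in plain JavaScript without a database. This is only a sketch of how JavaScript object literals behave, not of what the server does: the variable names are made up for illustration.

```javascript
// Three distinct documents: the value type and the key case both matter.
const a = { foo: "3" };
const b = { foo: 3 };
const c = { Foo: 3 };

console.log(typeof a.foo, typeof b.foo); // string number
console.log("foo" in c);                 // false

// A JavaScript object literal silently keeps only the last duplicate key,
// so the first "name" value is lost before anything is sent anywhere.
const doc = { name: "Hello, world!", name: "Hello, MongoDB!" };
console.log(Object.keys(doc).length, doc.name); // 1 Hello, MongoDB!
```

The same silent overwrite applies to query documents written as object literals, which is why two conditions on one field belong inside a single sub-document.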
  • 24. WAIT!! NO JOIN
  • 25. Document Structure
  • 26. References
  • 27. Embedded Data
  • 28. Data: DATABASE: use db_name. COLLECTION: db.blog methods_mongodb. DOCUMENT: {…..}. SUBDOCUMENT: {...}. FIELD: name: type. ARRAY: [….]. Document containing an array of subdocuments: {…..[{...}]......}
  • 29. DOCUMENT: the _id is an ObjectId by default, or one of my own (for example _id: 1); tags is an array, image is a subdocument, and comments is an array of subdocuments. { _id: ObjectId('4bd9e8e17cefd644108961bb'), title: 'My own Adventures in Databases', url: 'http://example.com/exampledatabases.txt', author: 'csp', vote_count: 5, tags: ['databases', 'mongodb', 'indexing'], image: { url: 'http://example.com/db.jpg', caption: '', type: 'jpg', size: 75381, data: "Binary" }, comments: [ { user: 'abc', text: 'Nice article!'}, { user: 'jkl', text: 'Another related article is at http://example.com/db/mydb.txt'} ] }
  • 30. Collections: a collection is like a SQL table, and a document is like a SQL row. {"title" : "First", "edge" : 34} {"title" : "First", "edge" : 34} {"title" : "First", "edge" : 34} A collection is a group of documents. If a document is the MongoDB analog of a row in a relational database, then a collection can be thought of as the analog to a table.
  • 31. Dynamic Schemas: Collections have dynamic schemas. This means that the documents within a single collection can have any number of different “shapes.” For example, both of these documents could be stored in a single collection: {"greeting" : "Hello, mongoDB world!"} {"foo" : 23}
  • 32. Subcollections: One convention for organizing collections is to use namespaced subcollections separated by the “.” character. For example, a blog application might have a collection named blog.posts and a separate collection named blog.authors.
  • 33. Basic Operations CRUD
  • 34. Basic Operations CREATE: The insert function adds a document to a collection. > post = {"title" : "My first post in my blog", ... "content" : "Here's my blog post.", ... "date" : new Date()} > db.blog.insert(post) > db.blog.find() { "_id" : ObjectId("5037ee4a1084eb3ffeef7228"), "title" : "My first post in my blog", "content" : "Here's my blog post.", "date" : ISODate("2013-10-05T16:13:42.181Z") }
  • 35. this.insertEntry = function (title, body, tags, author, callback) {
          "use strict";
          console.log("inserting blog entry " + title + " " + body);
          // fix up the permalink to not include whitespace
          var permalink = title.replace(/\s/g, '_');
          // then drop every remaining non-word character
          permalink = permalink.replace(/\W/g, '');
          // Build a new post
          var post = {"title": title, "author": author, "body": body, "permalink": permalink, "tags": tags, "comments": [], "date": new Date()};
          // now insert the post
          posts.insert(post, function (err, post) {
              if (!err) {
                  console.log("Inserted new post");
                  console.dir("Successfully inserted: " + JSON.stringify(post));
                  return callback(null, post);
              }
              return callback(err, null);
          });
      };
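The permalink step can be tried on its own, without a database. The sketch below pulls it into a helper named `makePermalink` (a hypothetical name, not part of the original code) and uses the `\s` and `\W` character classes.

```javascript
// Hypothetical helper extracted from the insert code for illustration.
function makePermalink(title) {
  // Replace each whitespace character with an underscore...
  const underscored = title.replace(/\s/g, "_");
  // ...then drop everything that is not a letter, digit, or underscore.
  return underscored.replace(/\W/g, "");
}

console.log(makePermalink("My first post!")); // My_first_post
```

Without the backslashes, `/s/g` and `/W/g` would match the literal letters "s" and "W" instead, which mangles titles rather than cleaning them.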
  • 36. Basic Operations FIND: find and findOne can be used to query a collection. > db.blog.findOne() { "_id" : ObjectId("5037ee4a1084eb3ffeef7228"), "title" : "My first post in my blog", "content" : "Here's my blog post.", "date" : ISODate("2012-08-24T21:12:09.982Z") }
  • 37. this.getPostsByTag = function(tag, num, callback) {
          "use strict";
          // newest posts first, at most num of them
          posts.find({ tags : tag }).sort('date', -1).limit(num).toArray(function(err, items) {
              if (err) return callback(err, null);
              console.log("Found " + items.length + " posts");
              callback(err, items);
          });
      };
      this.getPostByPermalink = function(permalink, callback) {
          "use strict";
          posts.findOne({'permalink': permalink}, function(err, post) {
              if (err) return callback(err, null);
              callback(err, post);
          });
      };
  • 38. Basic Operations UPDATE: If we would like to modify our post, we can use update. update takes (at least) two parameters: the first is the criteria to find which document to update, and the second is the new document. > post.comments = [] > db.blog.update({title : "My first post in my blog"}, post) > db.blog.find() { "_id" : ObjectId("5037ee4a1084eb3ffeef7228"), "title" : "My first post in my blog", "content" : "Here's my blog post.", "date" : ISODate("2013-10-05T16:13:42.181Z"), "comments" : [ ] }
  • 39. this.addComment = function(permalink, name, email, body, callback) {
          "use strict";
          var comment = {'author': name, 'body': body};
          // only store the email when one was given
          if (email != "") {
              comment['email'] = email;
          }
          // append the comment to the post's comments array
          posts.update({'permalink': permalink}, { $push: { "comments": comment } }, {safe: true}, function (err, comment) {
              if (!err) {
                  console.log("Inserted new comment");
                  console.log(comment);
                  return callback(null, comment);
              }
              return callback(err, null);
          });
      };
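To see what the `$push` modifier does to the stored post, here is a minimal in-memory sketch. `applyPush` is a hypothetical stand-in written for this example; it is not driver code and it only covers the simple case of pushing one value per field.

```javascript
// Hypothetical in-memory stand-in for the effect of
// { $push: { <field>: <value> } } on a single document.
function applyPush(doc, pushSpec) {
  for (const [field, value] of Object.entries(pushSpec)) {
    if (doc[field] === undefined) doc[field] = []; // a missing array is created
    doc[field].push(value);                        // the rest of the document is untouched
  }
  return doc;
}

const post = { permalink: "My_first_post", title: "My first post", comments: [] };
applyPush(post, { comments: { author: "abc", body: "Nice article!" } });

console.log(post.comments.length); // 1
console.log(post.title);           // My first post
```

The point of the modifier is that only the comments array changes; the other fields of the post do not have to be sent again.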
  • 40. Basic Operations DELETE: remove permanently deletes documents from the database. Called with no parameters, it removes all documents from a collection. It can also take a document specifying criteria for removal. > db.blog.remove({title : "My first post in my blog"}) > db.blog.find() { "_id" : ObjectId("5037ee4a1084eb3ffeef7228"), "title" : "My second post in my blog", "content" : "Here's my second blog post.", "date" : ISODate("2013-10-05T16:13:42.181Z"), "comments" : [ ] }
  • 41. Comparison operators (name: description)
      $gt: Matches values that are greater than the value specified in the query.
      $gte: Matches values that are equal to or greater than the value specified in the query.
      $in: Matches any of the values that exist in an array specified in the query.
      $lt: Matches values that are less than the value specified in the query.
      $lte: Matches values that are less than or equal to the value specified in the query.
      $ne: Matches all values that are not equal to the value specified in the query.
      $nin: Matches values that do not exist in an array specified in the query.
      Logical operators (name: description)
      $or: Joins query clauses with a logical OR and returns all documents that match the conditions of either clause.
      $and: Joins query clauses with a logical AND and returns all documents that match the conditions of both clauses.
      $not: Inverts the effect of a query expression and returns documents that do not match the query expression.
      $nor: Joins query clauses with a logical NOR and returns all documents that fail to match both clauses.
  • 42. Examples db.scores.find( { score : { $gt : 50, $lt : 60 } } ); db.scores.find( { $or : [ { score : { $lt : 50 } }, { score : { $gt : 90 } } ] } ); db.users.find( { name : { $regex : "q" }, email : { $exists : true } } ); db.users.find( { friends : { $all : [ "Joe" , "Bob" ] }, favorites : { $in : [ "running" , "pickles" ] } } )
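As a rough illustration of how these operators combine, the sketch below evaluates a query document against plain JavaScript objects. `matches` and the `comparison` table are hypothetical names written for this example; they cover only a subset of the operators and ignore MongoDB's real rules for arrays, missing fields, and mixed types.

```javascript
// Hypothetical matcher for a subset of the operators in the table.
const comparison = {
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $ne: (v, x) => v !== x,
  $in: (v, xs) => xs.includes(v),
  $nin: (v, xs) => !xs.includes(v),
};

function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((q) => matches(doc, q));
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (key === "$nor") return !cond.some((q) => matches(doc, q));
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // Every operator listed for the same field must hold.
      return Object.entries(cond).every(([op, x]) => comparison[op](value, x));
    }
    return value === cond; // plain equality
  });
}

const scores = [{ score: 45 }, { score: 55 }, { score: 95 }];
const between = scores.filter((d) => matches(d, { score: { $gt: 50, $lt: 60 } }));
const extremes = scores.filter((d) =>
  matches(d, { $or: [{ score: { $lt: 50 } }, { score: { $gt: 90 } }] })
);

console.log(between);  // [ { score: 55 } ]
console.log(extremes); // [ { score: 45 }, { score: 95 } ]
```

Note that the range query puts both `$gt` and `$lt` inside one sub-document for `score`, so both conditions apply to the same field.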
  • 43. Index Explain Hint
  • 44. > for (i=0; i<1000000; i++) { ... ... db.users.insert( ... { ... "i" : i, ... "username" : "user"+i, ... "age" : Math.floor(Math.random()*120), ... "created" : new Date() ... } ... ); ... } > db.users.count() 1000000 > db.users.find() { "_id" : ObjectId("526403c77c1042777e4dd7f1"), "i" : 0, "username" : "user0", "age" : 80, "created" : ISODate("2013-10-20T16:24:39.780Z") } { "_id" : ObjectId("526403c77c1042777e4dd7f2"), "i" : 1, "username" : "user1", "age" : 62, "created" : ISODate("2013-10-20T16:24:39.826Z") } { "_id" : ObjectId("526403c77c1042777e4dd7f3"), "i" : 2, "username" : "user2", "age" : 5, "created" : ISODate("2013-10-20T16:24:39.826Z") } { "_id" : ObjectId("526403c77c1042777e4dd7f4"), "i" : 3, "username" : "user3", "age" : 69, "created" : ISODate("2013-10-20T16:24:39.826Z") } { "_id" : ObjectId("526403c77c1042777e4dd7f5"), "i" : 4, "username" : "user4", "age" : 93, "created" : ISODate("2013-10-20T16:24:39.826Z") }
  • 45. > db.users.find({username: "user999999"}).explain() { "cursor" : "BasicCursor", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 1000000, "nscanned" : 1000000, "nscannedObjectsAllPlans" : 1000000, "nscannedAllPlans" : 1000000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 1, "nChunkSkips" : 0, "millis" : 392, "indexBounds" : { }, "server" : "desarrollo:27017" } In other words: a query could return 5 documents (n), having scanned 9 documents from the index (nscanned) and then read 5 full documents from the collection (nscannedObjects).
  • 46. The results of explain() describe the details of how MongoDB executes the query. Some of the relevant fields are: cursor: A result of BasicCursor indicates a non-indexed query. If we had used an indexed query, the cursor would have a type of BtreeCursor. nscanned and nscannedObjects: The difference between these two similar fields is subtle but important. nscannedObjects is the total number of documents scanned by the query. nscanned is the number of documents and index entries scanned. Depending on the query, it's possible for nscanned to be greater than nscannedObjects. n: The number of matching objects. millis: Query execution duration in milliseconds.
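These fields can be read mechanically. The sketch below takes an explain() result in the legacy format shown in this deck and summarizes it; `summarizeExplain` is a hypothetical helper, and the two input objects reuse figures from the deck's explain() outputs for the non-indexed and indexed username queries.

```javascript
// Hypothetical helper that reads the legacy explain() fields named on the slide.
function summarizeExplain(plan) {
  return {
    usedIndex: plan.cursor.startsWith("BtreeCursor"),
    // Documents examined per document returned; 1 is the ideal.
    scanRatio: plan.n > 0 ? plan.nscannedObjects / plan.n : null,
    millis: plan.millis,
  };
}

const noIndex = { cursor: "BasicCursor", n: 1, nscannedObjects: 1000000, nscanned: 1000000, millis: 392 };
const withIndex = { cursor: "BtreeCursor username_1", n: 1, nscannedObjects: 1, nscanned: 1, millis: 40 };

console.log(summarizeExplain(noIndex));   // usedIndex is false, scanRatio is 1000000
console.log(summarizeExplain(withIndex)); // usedIndex is true, scanRatio is 1
```

A scan ratio far above 1 together with a BasicCursor is the usual sign that the query is walking the whole collection.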
  • 47. > db.users.ensureIndex({"username" : 1}) > db.users.find({username: "user101"}).limit(1).explain() { "cursor" : "BtreeCursor username_1", "isMultiKey" : false, "n" : 1, "nscannedObjects" : 1, "nscanned" : 1, "nscannedObjectsAllPlans" : 1, "nscannedAllPlans" : 1, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 40, "indexBounds" : { "username" : [ [ "user101", "user101" ] ] }, "server" : "desarrollo:27017" }
  • 48. > db.users.ensureIndex({"age" : 1, "username" : 1}) > db.users.find({"age" : {"$gte" : 41, "$lte" : 60}}). ... sort({"username" : 1}). ... limit(1000). ... hint({"age" : 1, "username" : 1}) > db.users.find({"age" : {"$gte" : 41, "$lte" : 60}}). ... sort({"username" : 1}). ... limit(1000). ... hint({"username" : 1, "age" : 1})
  • 49. Data Model
  • 50. Where are the answers? 1) Embedded or linked 2) 1:1 3) 1:many 4) 1:few 5) many:many 6) few:few So gimme just a minute and I'll tell you why
  • 51. Because any document can be put into any collection, I wonder: “Why do we need separate collections at all?” With no need for separate schemas for different kinds of documents, why should we use more than one collection?
  • 52. POST {_id:1, title:____, body:___, author:___, date:___} COMMENTS { _id:1, post_id:____, author:___, author_email:_, order:___} TAGS { _id:___, tag:____, post_id: 1 } Notes: 1) Embedding is limited by the 16 MB document size. 2) Living without constraints: in MongoDB, consistency depends on you. 3) No JOINs.
  • 53. Model One-to-One Relationships with Embedded Documents { _id: "csp", name: "Carlos Sánchez Pérez" } { patron_id: "csp", street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 } 2 collections. Things to consider: 1) Frequency of access: think about memory, because all the information is loaded. 2) Growth in the size of the items: whether data is written separately or embedded. 3) Documents larger than 16 MB. 4) Atomicity of data. If the address data is frequently retrieved with the name information, then with referencing, your application needs to issue multiple queries to resolve the reference.
  • 54. The better data model would be to embed the address data in the patron data: { _id: "csp", name: "Carlos Sánchez Pérez", address: { street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 } } { _id: "csp2", name: "Carlos SP3", address: { street: "666 Aravaca", city: "Madrid", Number: "75 3º A", zip: 12345 } } { _id: "csp1", name: "Carlos1 Sánchez1", address: { street: "1 Aravaca", city: "Madrid", Number: "95 3º A", zip: 12345 } } { _id: "csp", name: "Carlos SN", address: { street: "777 Aravaca", city: "Madrid", Number: "45 3º A", zip: 12345 } }
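The difference between the two models can be sketched with plain arrays standing in for collections. Everything here is an in-memory illustration under that assumption: `loadReferenced` and `loadEmbedded` are hypothetical names, and each `find` call stands for one query the application would issue.

```javascript
// Referenced model: two in-memory "collections" linked by patron_id.
const patrons = [{ _id: "csp", name: "Carlos Sánchez Pérez" }];
const addresses = [{ patron_id: "csp", street: "123 Aravaca", city: "Madrid", zip: 12345 }];

// Two lookups, one per collection.
function loadReferenced(id) {
  const patron = patrons.find((p) => p._id === id);
  const address = addresses.find((a) => a.patron_id === id);
  return { name: patron.name, city: address.city };
}

// Embedded model: one "collection", the address lives inside the patron.
const patronsEmbedded = [
  { _id: "csp", name: "Carlos Sánchez Pérez", address: { street: "123 Aravaca", city: "Madrid", zip: 12345 } },
];

// A single lookup returns everything.
function loadEmbedded(id) {
  const patron = patronsEmbedded.find((p) => p._id === id);
  return { name: patron.name, city: patron.address.city };
}

console.log(loadReferenced("csp"));
console.log(loadEmbedded("csp"));
```

Both functions return the same data; the embedded version simply gets there with one read instead of two.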
  • 55. Model One-to-Many Relationships with Embedded Documents { _id: "csp", name: "Carlos Sánchez Pérez" } { patron_id: "csp", street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 } { patron_id: "csp", street: "456 Aravaca", city: "Madrid", Number: "55 1º B", zip: 12345 } 3 collections. Things to consider: 1) Frequency of access 2) Growth in size 3) Documents larger than 16 MB (or MiB) 4) Atomicity of data
  • 56. Model One-to-Many Relationships with Embedded Documents { _id: "csp", name: "Carlos Sánchez Pérez", addresses: [ { street: "123 Aravaca", city: "Madrid", Number: "25 3º A", zip: 12345 }, { street: "456 Aravaca", city: "Madrid", Number: "55 1º B", zip: 12345 } ] }
  • 57. Model One-to-Many Relationships with Document References People { _id: "csp", name: "Carlos Sánchez Pérez", City: "MD", …................. } City { _id: "MD", Name: ….. …................. } If you are thinking of embedding: it is not a good solution in this case. Why? Redundant information.
  • 58. Model One-to-Many Relationships with Document References BLOG: embedding is a good solution in this case (one-to-few). post { title: "My first title", author : "Carlos Sánchez Pérez", date : "19/08/2013", comments : [ {name: "Antonio López", comment : "my comment" }, { .... } ], tags : ["tag1","tag2","tag3"] } author { _id : "Carlos Sánchez Pérez", password: "", …….. }
  • 59. Model Many-to-Many Relationships: Books and Authors Books { _id: 12, title: "MongoDB: My Definitive easy Guide", author: [32], …........................ } Authors { _id: 32, author: "Peter Garden", books: [12,34,5,6,78,65,99], …........................ } MODEL: few-to-few. Embedding books is not a good solution in this case.
  • 60. Benefits of Embedding: better read performance. Be careful: if the document changes a lot, writes become slow.
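With references in both directions and no JOIN, the application resolves the links itself. The sketch below does that over in-memory arrays; the second book and the `titlesByAuthor` helper are hypothetical, added only so the example has something to resolve.

```javascript
// In-memory stand-ins for the two collections, linked by arrays of _id values.
const books = [
  { _id: 12, title: "MongoDB: My Definitive easy Guide", author: [32] },
  { _id: 34, title: "Another book", author: [32] }, // hypothetical second book
];
const authors = [{ _id: 32, author: "Peter Garden", books: [12, 34] }];

// Resolve an author's book titles: first read the author, then the linked books.
function titlesByAuthor(authorId) {
  const author = authors.find((a) => a._id === authorId);
  return books.filter((b) => author.books.includes(b._id)).map((b) => b.title);
}

console.log(titlesByAuthor(32));
```

Against a real database, the second step would typically be a single query using `$in` on the list of book ids.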
  • 61. Sharding, horizontal scaling
  • 62. Sharding Vertical scaling adds more CPU and storage resources to increase capacity. Scaling by adding capacity has limitations: high performance systems with large numbers of CPUs and large amount of RAM are disproportionately more expensive than smaller systems. Additionally, cloud-based providers may only allow users to provision smaller instances. As a result there is a practical maximum capability for vertical scaling. Sharding, or horizontal scaling, by contrast, divides the data set and distributes the data over multiple servers, or shards. Each shard is an independent database, and collectively, the shards make up a single logical database.
  • 63. Diagram: APP, mongos (router), config server, and shards s1 (r1, r2, r3), s2, s3, s4, s5.
  • 64. COLLECTIONS: user0 ... user99999. FIRST: when a collection is first sharded, it can be thought of as a single chunk from the smallest value of the shard key to the largest. Sharding then splits the collection into many chunks based on shard key ranges: $minKey | user100 | user300 | user600 | user900 | user1200 | user1500 | $maxKey
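Range-based chunking can be sketched as a lookup over sorted split points. This is an illustration only: it assumes a numeric shard key (so that ordering is unambiguous), the split points are hypothetical, and `chunkFor` is not a MongoDB function.

```javascript
// Hypothetical split points on a numeric shard key. Each chunk covers
// [lower, upper), with -Infinity and Infinity standing in for $minKey and $maxKey.
const splitPoints = [100, 300, 600, 900, 1200, 1500];

function chunkFor(key) {
  const bounds = [-Infinity, ...splitPoints, Infinity];
  for (let i = 0; i < bounds.length - 1; i++) {
    if (key >= bounds[i] && key < bounds[i + 1]) {
      return { index: i, min: bounds[i], max: bounds[i + 1] };
    }
  }
  return null; // only reachable for NaN or Infinity
}

console.log(chunkFor(50));   // { index: 0, min: -Infinity, max: 100 }
console.log(chunkFor(300));  // { index: 2, min: 300, max: 600 }
console.log(chunkFor(2000)); // { index: 6, min: 1500, max: Infinity }
```

With six split points there are seven chunks, and every key value falls into exactly one of them.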
  • 65. at the end......
  • 66. Any questions? ….. I'll shoot it to you straight and look you in the eye. So gimme just a minute and I'll tell you why I'm a rough boy.
  • 67. That's all, folks! Thanks a lot for your attention. I'm coming back soon...