Introduction to MongoDB


The major disruptions: IMS vs. RDBMS vs. NoSQL
About MongoDB
The document-oriented data model
A JSON document
MongoDB data types
The BSON (Binary JSON) format
Embed vs. Reference
Insert, Update, Delete
Atomic Modifiers
Query Language
Aggregation and Map/Reduce
Capped Collections
Server-Side Scripting
Replication: Master/Slave and Replica Sets
Sharding Architecture
Auto-Sharding + Replication
Support and Training
Who is using it?

Published in: Technology

  • The first major disruption: IMS vs. RDBMS (the invention of the relational model)
  • The second disruption: RDBMS vs. NoSQL
  • MongoDB is a powerful, flexible, and scalable data store. It combines the ability to scale out with many of the most useful features of relational databases, such as secondary indexes, range queries, and sorting. It also offers built-in support for MapReduce-style aggregation and geospatial indexes. The name comes from "humongous" (slang): extraordinarily large.
  • MongoDB basic concepts: a document is the basic unit of data, roughly equivalent to a row in an RDBMS; a collection can be thought of as the schema-free equivalent of a table; a single instance of MongoDB can host multiple independent databases, each of which can have its own collections and permissions. Document: an ordered set of keys with associated values (comparable to a map, hash, or dictionary).
  • With Mongo, you do less "normalization" than you would when designing a relational schema, because there are no server-side joins. Generally, you will want one database collection for each of your top-level objects.
  • JSON-style documents with dynamic schemas offer simplicity and power.
  • MongoDB supports a wide range of data types as values in documents: null, boolean, integer, float, string, date, regexp, code, binary, array. Documents in MongoDB can be thought of as "JSON-like" in that they are conceptually similar to objects in JavaScript.
  • BSON (Binary JSON) is a lightweight binary format capable of representing any MongoDB document as a string of bytes. It is a binary-encoded serialization of JSON-like documents; the database understands BSON, and BSON is the format in which documents are saved to disk. BSON contains extensions that allow representation of data types that are not part of the JSON spec (e.g. Date and BinData). BSON characteristics: lightweight, traversable, and efficient.
  • The key question in Mongo schema design is "does this object merit its own collection, or should it rather be embedded in objects in other collections?" In relational databases, each sub-item of interest typically becomes a separate table (unless denormalizing for performance). In Mongo, this is not recommended; embedding objects is much more efficient. A DBRef is an embedded document, just like any other embedded document in MongoDB, but it has specific keys that must be present ($ref and $id, plus an optional $db).
  • An upsert is a special type of update. If no document is found that matches the update criteria, a new document will be created by combining the criteria and update documents. If a matching document is found, it will be updated normally.
  • MongoDB supports atomic, in-place updates as well as more traditional updates that replace an entire document. A plain update() replaces the document matching the criteria entirely with the new object. If you only want to modify some fields, use the atomic modifiers. Fast in-place updates: atomic modifiers for contention-free performance.
  • Rich, document-based queries.
  • Database indexes are similar to a book's index: instead of looking through the whole book, the database takes a shortcut and just looks in the index, allowing it to do queries orders of magnitude faster. Once it finds the entry in the index, it can jump right to the location of the desired document. You can index any attribute, just like you're used to. MongoDB also provides a special type of index for coordinate plane queries, called a geospatial index.
  • Everything described with count, distinct, and group can be done with MapReduce, and more. It is a method of aggregation that can be easily parallelized across multiple servers. It splits up a problem, sends chunks of it to different machines, and lets each machine solve its part of the problem. When all of the machines are finished, they merge all of the pieces of the solution back into a full solution.
  • A capped collection is created in advance and is fixed in size. It behaves like a circular queue: if it is out of space, the oldest document is deleted and the new one takes its place. This means that capped collections automatically age out the oldest documents as new documents are inserted.
  • GridFS is a mechanism for storing large binary files in MongoDB. There is no need for a separate file storage architecture. Getting failover and scale-out for file storage is easy, as MongoDB has replication and autosharding.
  • JavaScript can be executed on the server using the db.eval function. It can also be stored in the database and is used in some database commands. MongoDB has a special collection for each database called system.js, which can store JavaScript variables.
  • Replication & High Availability: mirror across LANs and WANs for scale and peace of mind. Master-slave replication is the most general replication mode supported by MongoDB. This mode is very flexible and can be used for backup, failover, read scaling, and more. MongoDB allows developers to enforce guarantees about how up-to-date replication is (the "w" parameter): the write will block until at least N servers have replicated the last write operation.
  • A replica set is basically a master-slave cluster with automatic failover. The biggest difference between a master-slave cluster and a replica set is that a replica set does not have a fixed master: one is elected by the cluster and may change to another node if the current master goes down. Otherwise they look very similar: at any given time a replica set has a single master node (called a primary) and one or more slaves (called secondaries).
  • Sharding is the process of splitting data up and storing different portions of the data on different machines. Sharding is MongoDB's approach to scaling out. Sharding allows you to add more machines to handle increasing load and data size without affecting your application. Scale horizontally without compromising functionality.
  • MongoDB supports autosharding, which eliminates some of the administrative headaches of manual sharding. The cluster handles splitting up data and rebalancing automatically. You can start with a nonsharded setup and convert it to a sharded one, if and when you need to. Sharding involves three components working together: the shard, a container that holds a subset of a collection's data; mongos, which routes requests and aggregates responses; and the config server, which stores the configuration of the cluster (which data is on which shard).
  • From the application's point of view, a sharded setup looks just like a nonsharded setup. There is no need to change application code when you need to scale. To set up sharding with no single point of failure, you'll need the following: multiple config servers, multiple mongos servers, replica sets for each shard, and w set correctly.
  • "MongoDB: The Definitive Guide" by Kristina Chodorow and Mike Dirolf; "The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing" by Peter Membrey; "Scaling MongoDB" by Kristina Chodorow; "MongoDB in Action" by Kyle Banker; "MongoDB for Web Development" by Mitch Pirtle.
  • Production Deployments:
  • Pros: "...the best features of key/values stores, document databases and relational databases in one." (John Nunemaker); fast, scalable, highly available. Cons: no joins; single-master approach.
  • Introduction to MongoDB
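
The upsert note above (create a document from the criteria and update when nothing matches, otherwise update the match) can be sketched in plain JavaScript. This is only an in-memory model, not MongoDB driver code: `upsert`, `matches`, and the `pageViews` array are hypothetical names, criteria are limited to exact field equality, and the update is modeled as a simple field merge.

```javascript
// In-memory model of upsert semantics; no MongoDB server is involved.
// `matches` and `upsert` are hypothetical helpers for illustration only.
function matches(doc, criteria) {
  // Only exact equality on top-level fields is modeled here.
  return Object.keys(criteria).every((key) => doc[key] === criteria[key]);
}

function upsert(collection, criteria, update) {
  const existing = collection.find((doc) => matches(doc, criteria));
  if (existing) {
    // A matching document was found: update it (modeled as a field merge).
    Object.assign(existing, update);
    return "updated";
  }
  // No match: create a new document by combining criteria and update.
  collection.push({ ...criteria, ...update });
  return "inserted";
}

const pageViews = [];
const first = upsert(pageViews, { url: "/blog" }, { hits: 1 });
const second = upsert(pageViews, { url: "/blog" }, { hits: 2 });
console.log(first, second, pageViews);
```

The first call inserts because the array is empty; the second finds the document created by the first and updates it, so the collection never holds two documents for the same URL.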
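
To make the atomic-modifiers note above more concrete, here is a rough sketch of what $inc, $set, $unset, and $push do to a single document. `applyModifiers` is a hypothetical helper, not part of MongoDB; it only models the field-level effect of each modifier and says nothing about how the server guarantees atomicity.

```javascript
// Models the effect of four update modifiers on one plain object.
// `applyModifiers` is a hypothetical name; atomicity itself is not modeled.
function applyModifiers(doc, modifiers) {
  for (const [field, amount] of Object.entries(modifiers.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // a missing field starts from 0
  }
  for (const [field, value] of Object.entries(modifiers.$set || {})) {
    doc[field] = value;
  }
  for (const field of Object.keys(modifiers.$unset || {})) {
    delete doc[field];
  }
  for (const [field, value] of Object.entries(modifiers.$push || {})) {
    if (!Array.isArray(doc[field])) doc[field] = [];
    doc[field].push(value);
  }
  return doc;
}

const post = { title: "A blog post", hits: 41 };
applyModifiers(post, { $inc: { hits: 1 } });
applyModifiers(post, { $push: { comments: { name: "joe", content: "nice post." } } });
const stripped = applyModifiers({ title: "Old post", hits: 7 }, { $unset: { hits: 1 } });
console.log(post, stripped);
```

Unlike a whole-document replacement, each modifier touches only the named fields, which is why the untouched `title` survives every call.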
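
The MapReduce note above describes splitting a problem, solving the parts, and merging the results. A single-process sketch of that idea, counting tags across posts, might look like the following. The `mapReduce` function and the sample `posts` are assumptions for illustration; MongoDB's own map functions receive the document as `this` and call a global `emit`, whereas this sketch passes both explicitly and does not distribute any work.

```javascript
// Single-process sketch of the map/reduce idea; nothing is parallelized here.
// `mapReduce` is a hypothetical helper, not the MongoDB command.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: every document may emit any number of (key, value) pairs.
  for (const doc of docs) map(doc, emit);
  // Reduce phase: collapse the values collected under each key.
  const result = {};
  for (const [key, values] of groups) result[key] = reduce(key, values);
  return result;
}

const posts = [
  { title: "Intro", tags: ["databases", "MongoDB"] },
  { title: "Sharding", tags: ["MongoDB", "nosql"] },
];

const tagCounts = mapReduce(
  posts,
  (doc, emit) => doc.tags.forEach((tag) => emit(tag, 1)),
  (key, values) => values.reduce((sum, n) => sum + n, 0)
);
console.log(tagCounts);
```

Because the reduce step only needs the values for one key, groups of keys could in principle be handed to different machines, which is the parallelism the note refers to.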
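
The circular-queue behavior of capped collections described above can be illustrated with a small class. `CappedList` is a hypothetical name, and it caps by document count to keep the example short; real capped collections are sized in bytes.

```javascript
// Toy model of a capped collection as a circular queue.
// Assumption: the cap is a document count here, not a size in bytes.
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    // When full, the oldest document is dropped to make room for the new one.
    if (this.docs.length === this.maxDocs) this.docs.shift();
    this.docs.push(doc);
  }

  find() {
    // Documents come back in insertion ("natural") order.
    return [...this.docs];
  }
}

const log = new CappedList(3);
for (let seq = 1; seq <= 5; seq++) log.insert({ seq });
const remaining = log.find().map((doc) => doc.seq);
console.log(remaining);
```

After five inserts into a list capped at three, only the three most recent entries remain, which is the age-out behavior that makes capped collections a natural fit for logs.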
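
The sharding notes above say that mongos routes requests and that the config servers know which data is on which shard. A heavily simplified sketch of that lookup follows, assuming range-based chunks on a string shard key. The `chunks` table, the shard names, and `routeToShard` are all hypothetical; a real cluster also splits and rebalances chunks automatically, which is not modeled here.

```javascript
// Hypothetical chunk table, standing in for what the config servers store.
// Each chunk covers shard-key values in the half-open range [min, max).
const chunks = [
  { min: "", max: "h", shard: "shardA" },
  { min: "h", max: "q", shard: "shardB" },
  { min: "q", max: "\uffff", shard: "shardC" },
];

// Stand-in for the routing decision made by mongos.
function routeToShard(shardKey) {
  const chunk = chunks.find((c) => shardKey >= c.min && shardKey < c.max);
  if (!chunk) throw new Error("no chunk covers key: " + shardKey);
  return chunk.shard;
}

const routes = ["ana", "joe", "xunda"].map((nome) => routeToShard(nome));
console.log(routes);
```

The application only supplies the key; which shard answers is decided entirely by the routing table, which is why application code does not change when shards are added.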

    1. 1. Introduction to MongoDB. Rodrigo Hjort [email_address]
    2. 4. "MongoDB (from "humongous") is a scalable, high-performance, open source, powerful, document-oriented database written in C++."
    3. 5. The data model: Relational (Tabular) vs. Document-Oriented
    4. 6. Relational Model
    5. 7. Document-Oriented Model
    6. 8. A JSON document
        {
          _id : ObjectId("5ebf5e0fec5fab7db2b9b40e"),
          title : "Introdução ao MongoDB",
          slug : "introducao-ao-mongodb",
          body : "Este é o texto do post...",
          published : true,
          created : "Jun 28 2011 13:48:22 GMT-0400 (AMT)",
          updated : "Jun 28 2011 17:01:15 GMT-0400 (AMT)",
          comments : [
            { author : "Xunda", email : "", body : "Caramba!", created : "Jun 28 2011 15:01:30 GMT-0300 (BRT)" }
          ],
          tags : [ "databases", "MongoDB", "nosql" ]
        }
        Slide labels: Array, Object ID, Embedded Document
    7. 9. Data Types: null, regexp, array, string, code, float, date, boolean, binary, integer
    8. 10. BSON: lightweight, traversable, efficient
    9. 11. Embed vs. Reference: Embedded, DBRef
    10. 12. Insert, Update, Delete
        INSERT INTO usuarios (nome, idade, ativo) VALUES ('xunda', 32, true)
        > db.usuarios.insert({nome: "xunda", idade: 32, ativo: true})
        UPDATE usuarios SET idade = 28 WHERE nome = 'xunda'
        > db.usuarios.update({nome: "xunda"}, {nome: "xunda", idade: 28, ativo: true})
        DELETE FROM usuarios WHERE nome = 'xunda'
        > db.usuarios.remove({nome: "xunda"})
        DELETE FROM usuarios
        > db.usuarios.remove({})
    11. 13. Atomic Modifiers
        UPDATE posts SET hits = hits + 1 WHERE id = 999
        > db.posts.update({ _id : new ObjectId("4c041e...30c093") }, { $inc : { "hits" : 1 }})
        UPDATE posts SET hits = 0
        > db.posts.update({}, { $set : { "hits" : 0 }}, false, true)
        (the last two arguments are upsert = false and multi = true; without multi, only the first matching document is updated)
        > db.posts.update({}, { $unset : { "hits" : 1 }})
        > db.posts.update({title: "A blog post"}, { $push : { comments : { name: "joe", email: "", content: "nice post." }}})
    12. 14. Query Language
        SELECT * FROM usuarios
        > db.usuarios.find()
        SELECT nome FROM usuarios
        > db.usuarios.find({}, {"nome" : 1})
        SELECT * FROM usuarios WHERE idade = 29
        > db.usuarios.find({"idade" : 29})
        SELECT * FROM usuarios WHERE idade = 29 AND ativo = true
        > db.usuarios.find({"idade" : 29, "ativo" : true})
        SELECT * FROM usuarios WHERE idade >= 18 AND idade <= 30
        > db.usuarios.find({"idade" : {"$gte" : 18, "$lte" : 30}})
        SELECT * FROM usuarios WHERE nome LIKE '%admin%'
        > db.usuarios.find({"nome" : /admin/i})
    13. 15. Query Language
        SELECT * FROM usuarios ORDER BY nome
        > db.usuarios.find().sort({"nome" : 1})
        SELECT * FROM usuarios ORDER BY idade DESC, nome
        > db.usuarios.find().sort({"idade" : -1, "nome" : 1})
        SELECT * FROM usuarios LIMIT 3
        > db.usuarios.find().limit(3)
        SELECT * FROM usuarios OFFSET 5
        > db.usuarios.find().skip(5)
        SELECT * FROM usuarios LIMIT 3 OFFSET 5
        > db.usuarios.find().limit(3).skip(5)
        SELECT * FROM usuarios ORDER BY nome LIMIT 3
        > db.usuarios.find().sort({"nome" : 1}).limit(3)
    14. 16. Indexing: ensureIndex(), explain(), hint(), geospatial index. Yes!!!
    15. 17. Aggregation and Map/Reduce: count(), group(), distinct(), Map/Reduce
    16. 18. Capped Collections: natural sort, tailable cursors, tail -f, circular queue
    17. 19. GridFS: mongofiles
    18. 20. Server-Side Scripting: db.eval(), system.js, JavaScript
    19. 21. Replication: Master/Slave
    20. 22. Replication: Replica Sets
    21. 23. Sharding
    22. 24. Sharding Architecture
    23. 25. Auto-Sharding + Replication
    24. 26. Support and Training
    25. 27. Literature
    26. 28. Who is using it?
    27. 29. Rodrigo Hjort [email_address] "If I had asked people what they wanted, they would have said faster horses." – Henry Ford
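
The query documents on the Query Language slides (14 and 15) can be read as small predicates over a document. The sketch below evaluates a few of them against plain objects to show how they are interpreted. `matchesQuery` is a hypothetical helper, not MongoDB's query engine, and it covers only exact equality, $gte, $lte, and regular expressions; the sample `usuarios` data is made up for illustration.

```javascript
// Toy evaluator for a small subset of query documents.
// `matchesQuery` is a hypothetical helper; it is not how the server works.
function matchesQuery(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition instanceof RegExp) {
      return typeof value === "string" && condition.test(value);
    }
    if (condition !== null && typeof condition === "object") {
      // An operator document such as { $gte: 18, $lte: 30 }.
      return Object.entries(condition).every(([op, bound]) => {
        if (op === "$gte") return value >= bound;
        if (op === "$lte") return value <= bound;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === condition; // plain equality
  });
}

// Made-up sample data, mirroring the field names used on the slides.
const usuarios = [
  { nome: "xunda", idade: 32, ativo: true },
  { nome: "sysadmin", idade: 29, ativo: true },
  { nome: "ana", idade: 17, ativo: false },
];

const inRange = usuarios.filter((u) => matchesQuery(u, { idade: { $gte: 18, $lte: 30 } }));
const admins = usuarios.filter((u) => matchesQuery(u, { nome: /admin/i }));
const activeAt29 = usuarios.filter((u) => matchesQuery(u, { idade: 29, ativo: true }));
console.log(inRange, admins, activeAt29);
```

Listing several fields in one query document behaves like SQL's AND, and the regular expression plays the role of LIKE '%admin%'.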