A gentle, friendly overview. Antonio Pintus, CRS4, 08/09/2011

MongoDB: a gentle, friendly overview


My talk @ CRS4 about a MongoDB overview


  1. A gentle, friendly overview. Antonio Pintus, CRS4, 08/09/2011
  2. NOSQL /1
     • MongoDB belongs to the NoSQL database family:
       • non-relational
       • document-oriented
       • no predefined, rigid database schemas
       • no joins
       • horizontal scalability
  3. NOSQL /2
     • The NoSQL DB family includes several DB types:
       • document-oriented: MongoDB, CouchDB, ...
       • key-value / tuple store: Redis, ...
       • graph databases: Neo4j, ...
       • ...
  4. MongoDB
     • Performant: written in C++
     • Schema-free
     • Full index support
     • No transactions
     • Scalable: replication + sharding
     • Document-based queries
     • Map/Reduce
     • GridFS
     • A JavaScript interactive shell
  5. SCHEMA-FREE
     • Schema-free collections = NO TABLES!
     • A Mongo deployment (server) holds a set of databases
       • A database holds a set of collections
       • A collection holds a set of documents
       • A document is a set of fields: key-value pairs (BSON)
       • A key is a name (string); a value is a basic type (string, integer, float, timestamp, binary, etc.), a document, or an array of values
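To make the containment hierarchy on this slide concrete, here is a minimal sketch that models it with plain Python dicts and lists. It does not talk to MongoDB at all, and the names `deployment`, `mydb`, `mycollection`, and `field_names` are illustrative assumptions, not part of any MongoDB API.

```python
# Toy model of the hierarchy: deployment -> databases -> collections -> documents.
# Plain Python only; this is NOT how MongoDB stores data internally.
deployment = {
    "mydb": {                      # a database holds a set of collections
        "mycollection": [          # a collection holds a set of documents
            {"shortname": "SampleSensor", "policy": "PUBLIC"},
            # Same collection, different fields: no shared schema is enforced.
            {"shortname": "Arduino", "tags": ["board", "diy"],
             "owner": {"name": "User001"}},   # a value can itself be a document
        ]
    }
}


def field_names(collection):
    """Return the sorted field names of each document in the collection."""
    return [sorted(doc.keys()) for doc in collection]


shapes = field_names(deployment["mydb"]["mycollection"])
print(shapes)
```

The two documents report different field lists, which is the practical meaning of "schema-free" here.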
  6. DATA FORMAT
     • document-oriented
     • stores JSON-style documents as BSON (Binary JSON):
       • JSON + other data types, e.g., a Date type and a BinData type
       • can reference other documents
     • lightweight, traversable, efficient
  7. BSON
     {
       "_id" : ObjectId("4dcec9a0af391a0d53000003"),
       "servicetype" : "sensor",
       "description" : "it’s only rock’n’roll but I like it",
       "policy" : "PUBLIC",
       "owner" : "User001",
       "date_created" : "2011-05-02 17:11:28.874086",
       "shortname" : "SampleSensor",
       "content-type" : "text/plain",
       "icon" : "http://myserver.com/images/sens.png"
     }
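The "JSON + other data types" point can be illustrated without a database. The sketch below uses only the Python standard library to show that plain JSON has no date type, which is why the example document stores `date_created` as a string; with BSON, a driver such as PyMongo can store a `datetime` value as a native Date instead. The helper name `json_can_encode` is an assumption made for this illustration.

```python
import json
from datetime import datetime

# The same timestamp as in the example document, as a real datetime value.
doc = {
    "shortname": "SampleSensor",
    "date_created": datetime(2011, 5, 2, 17, 11, 28, 874086),
}


def json_can_encode(value):
    """Return True if the standard json module can serialize the value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


# Plain JSON rejects the datetime: it has no date type.
plain_json_ok = json_can_encode(doc)

# Workaround for plain JSON: flatten the date to text, losing its type.
as_text = dict(doc, date_created=doc["date_created"].isoformat(sep=" "))
text_json_ok = json_can_encode(as_text)

print(plain_json_ok, text_json_ok, as_text["date_created"])
```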
  8. COLLECTIONS
     • More or less the same concept as a “table”, but dynamic and schema-free
     • a collection of BSON documents
     • documents in the same collection can have heterogeneous data structures
  9. QUERIES
     • query by documents
     • Examples (using the interactive shell):
       • db.mycollection.find( {"policy" : "PUBLIC"} );
       • db.mycollection.findOne( {"policy" : "PUBLIC", "owner" : "User001"} );
       • db.mycollection.find( {"policy" : "PUBLIC", "owner" : "User001"} ).limit(2);
       • db.mycollection.find( {"policy" : "PUBLIC"}, {"shortname" : 1} );
       • db.mycollection.find( {"counter" : {$gt : 2}} );
     • conditional ops: $lt, $lte, $gt, $gte (for <, <=, >, >=), $and, $in, $or, $nor, ...
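"Query by documents" means the query is itself a document whose fields describe what to match. As a rough, in-memory sketch of that idea (not MongoDB's actual matching engine), the following pure-Python functions evaluate a small subset of query documents: equality plus `$gt`, `$gte`, `$lt`, `$lte`, and `$in`. The function names `matches` and `find` and the sample data are assumptions for illustration.

```python
# Comparison operators supported by this toy matcher.
OPS = {
    "$gt": lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$lt": lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$in": lambda a, b: a in b,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for key, cond in query.items():
        if key not in doc:
            return False
        value = doc[key]
        is_operator_doc = (isinstance(cond, dict) and cond
                           and all(k.startswith("$") for k in cond))
        if is_operator_doc:
            # e.g. {"counter": {"$gt": 2}}: every operator must hold
            if not all(OPS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            # plain value: exact equality
            return False
    return True


def find(collection, query, limit=None):
    """Return matching documents, optionally truncated to `limit` results."""
    hits = [d for d in collection if matches(d, query)]
    return hits if limit is None else hits[:limit]


docs = [
    {"shortname": "A", "policy": "PUBLIC", "owner": "User001", "counter": 1},
    {"shortname": "B", "policy": "PUBLIC", "owner": "User002", "counter": 3},
    {"shortname": "C", "policy": "PRIVATE", "owner": "User001", "counter": 5},
]
public = find(docs, {"policy": "PUBLIC"})
busy = find(docs, {"counter": {"$gt": 2}})
first_public = find(docs, {"policy": "PUBLIC"}, limit=1)
```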
  10. INDEXES
     • Full index support: index on any attribute (including compound indexes on multiple attributes)
     • indexes increase query performance
     • indexes are implemented as “B-Tree” indexes
     • indexes add overhead to inserts and deletes, so don’t overuse them!
       • db.mycollection.ensureIndex( {"servicetype" : 1} );
       • db.mycollection.ensureIndex( {"servicetype" : 1, "owner" : -1} );
       • db.mycollection.getIndexes()
       • db.system.indexes.find()
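The trade-off on this slide (faster lookups, extra work on every write) can be sketched in a few lines. The example below is a stand-in, not a B-Tree: it keeps a sorted list of (key, position) pairs and uses binary search from the standard `bisect` module, which shows the same idea of ordered lookup. The class name `SortedIndex` and the sample documents are assumptions.

```python
import bisect


class SortedIndex:
    """Toy single-field index: a sorted list of (key, position) pairs."""

    def __init__(self, docs, field):
        self.field = field
        self.entries = sorted(
            (d[field], i) for i, d in enumerate(docs) if field in d
        )

    def lookup(self, value):
        """Return positions of documents whose field equals value."""
        # Positions are >= 0, so (value, -1) sorts before every real entry.
        i = bisect.bisect_left(self.entries, (value, -1))
        positions = []
        while i < len(self.entries) and self.entries[i][0] == value:
            positions.append(self.entries[i][1])
            i += 1
        return positions

    def add(self, value, position):
        """Index maintenance: the extra cost paid on every insert."""
        bisect.insort(self.entries, (value, position))


docs = [
    {"servicetype": "sensor"},
    {"servicetype": "actuator"},
    {"servicetype": "sensor"},
]
index = SortedIndex(docs, "servicetype")
sensors = index.lookup("sensor")

# Inserting a document means updating the index as well.
docs.append({"servicetype": "sensor"})
index.add("sensor", len(docs) - 1)
sensors_after_insert = index.lookup("sensor")
```

Each additional index repeats the `add` step on every insert, which is the overhead the slide warns about.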
  11. INSERTS
     • Simplicity
     • db.mycollection.insert({"a" : "abc", ...})
     • var doc = {"name" : "mongodb", ...};
     • db.mycollection.insert(doc);
  12. UPDATES
     Two kinds of update:
       1. replace the entire document
       2. atomic, in-place updates
     • db.collection.update( criteria, objNew, upsert, multi )
       • criteria: the query selecting the document(s) to update
       • objNew: the updated object, or $ operators (e.g., $inc, $set) which manipulate the object
       • upsert: if no record matches, insert one
       • multi: whether all documents matching criteria should be updated (rather than only the first)
     • db.collection.save(...): single-object update with upsert
  13. UPDATES /2
     • atomic, in-place updates = highly efficient
     • provides special operators
     • db.mycollection.update( { "shortname" : "Arduino" }, { $inc : { n : 1 } } );
     • db.mycollection.update( { "shortname" : "Arduino" }, { $set : { "shortname" : "OldArduino" } } );
     • other atomic ops: $unset, $push, $pushAll, $addToSet, $pop, $pull, $rename, ...
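The meaning of the four `update` arguments from the two UPDATES slides can be mimicked on an in-memory list. This is a simplified sketch under stated assumptions: only equality criteria, only `$inc` and `$set`, and none of MongoDB's atomicity guarantees. The helper names `apply_update` and `update` are hypothetical.

```python
def apply_update(doc, obj_new):
    """Apply $inc/$set in place, or replace every field except _id."""
    if any(key.startswith("$") for key in obj_new):
        for field, amount in obj_new.get("$inc", {}).items():
            doc[field] = doc.get(field, 0) + amount
        for field, value in obj_new.get("$set", {}).items():
            doc[field] = value
    else:
        # Whole-document replacement keeps the original _id, if any.
        kept_id = doc.get("_id")
        doc.clear()
        doc.update(obj_new)
        if kept_id is not None:
            doc["_id"] = kept_id
    return doc


def update(collection, criteria, obj_new, upsert=False, multi=False):
    """Toy update(criteria, objNew, upsert, multi); returns docs affected."""
    matched = [d for d in collection
               if all(d.get(k) == v for k, v in criteria.items())]
    if not matched:
        if upsert:
            # Nothing matched: insert a new document seeded from the criteria.
            collection.append(apply_update(dict(criteria), obj_new))
            return 1
        return 0
    targets = matched if multi else matched[:1]   # default: first match only
    for d in targets:
        apply_update(d, obj_new)
    return len(targets)


coll = [{"shortname": "Arduino", "n": 1}, {"shortname": "Arduino", "n": 5}]
single = update(coll, {"shortname": "Arduino"}, {"$inc": {"n": 1}})
after_single = [d["n"] for d in coll]
every = update(coll, {"shortname": "Arduino"}, {"$inc": {"n": 1}}, multi=True)
after_multi = [d["n"] for d in coll]
upserted = update(coll, {"shortname": "Other"}, {"$set": {"n": 0}}, upsert=True)
```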
  14. Mongo DISTRIBUTION
     • Available for Mac, Linux, Solaris, Windows
     • mongod: the database server
       • By default, port=27017 and data path=/data/db
       • Override with the --dbpath and --port command-line options
     • mongo: the interactive JavaScript shell
     • mongos: the sharding controller server
  15. MISCELLANEOUS: REST
     • mongod provides a basic REST interface
     • launch it with the --rest option: default port=28017
     • http://localhost:28017/mydb/mycollection/
     • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino
     • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino&limit=10
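The URL pattern on this slide (a `filter_<field>` parameter per condition, plus an optional `limit`) is easy to generate programmatically. The helper below only builds the URL string with the standard library; it does not contact a server, and the function name `rest_url` and its parameters are assumptions for this sketch.

```python
from urllib.parse import urlencode


def rest_url(db, collection, filters=None, limit=None,
             host="localhost", port=28017):
    """Build a URL in the style shown on the slide (string only, no request)."""
    params = [("filter_" + field, value)
              for field, value in (filters or {}).items()]
    if limit is not None:
        params.append(("limit", limit))
    base = "http://%s:%d/%s/%s/" % (host, port, db, collection)
    return base + ("?" + urlencode(params) if params else "")


everything = rest_url("mydb", "mycollection")
arduinos = rest_url("mydb", "mycollection", {"shortname": "Arduino"}, limit=10)
print(arduinos)
```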
  16. GOOD FOR
     • event logging
     • high-performance small reads/writes
     • Web: real-time inserts, updates, and queries. Auto-sharding (scalability) and replication are provided.
     • Real-time stats/analytics
  17. LESS GOOD FOR
     • Systems with a heavily transactional nature
     • Traditional Business Intelligence
     • (obviously) Systems and problems requiring SQL
  18. SHARDING /1
     • Horizontal scalability: MongoDB auto-sharding
       • partitioning by keys
       • auto-balancing
       • easy addition of new servers
       • no single points of failure
       • automatic failover / replica sets
  19. SHARDING /2
     [Diagram: a Client talks to mongos routers; the mongos processes use the Config servers (mongod) and route to the Shards, each made up of several mongod processes]
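"Partitioning by keys" can be pictured as a routing table from shard-key ranges to shards. The sketch below is a toy model of that routing step only, with no balancing, replication, or config servers. The class name `RangeRouter`, the split points, and the shard names are all assumptions chosen for illustration.

```python
import bisect


class RangeRouter:
    """Toy router: maps shard-key ranges to shard names."""

    def __init__(self, split_points, shards):
        # N sorted split points define N + 1 key ranges, one per shard.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        """Return the shard responsible for the given shard-key value."""
        return self.shards[bisect.bisect_right(self.split_points, key)]


# Ranges: keys < "h" -> shard0, "h" <= keys < "p" -> shard1, rest -> shard2.
router = RangeRouter(["h", "p"], ["shard0", "shard1", "shard2"])
placement = {name: router.shard_for(name)
             for name in ["arduino", "mongo", "sensor"]}
print(placement)
```

Adding a server in this picture amounts to adding a split point and a shard name; moving the existing data is the part that real auto-balancing handles and this sketch leaves out.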
  20. DRIVERS
     • C# and .NET
     • C, C++
     • Erlang, Perl
     • Haskell
     • Java, JavaScript
     • PHP
     • Python, Ruby, Delphi
     • Scala
     • Clojure
     • Go, Objective-C
     • Smalltalk
     • ...
  21. PyMongo
     • The recommended MongoDB driver for the Python language
     • An easy way to install it (Mac, Linux):
       • easy_install pymongo
       • easy_install -U pymongo (to upgrade an existing installation)
  22. QUICK-START: INSERT
     • (obviously) mongod must be running ;-)

         import pymongo
         from pymongo import Connection
         # Connection is the 2011-era API; newer PyMongo releases use MongoClient instead

         conn = Connection()          # default localhost:27017; or Connection(myhost, 9999)
         db = conn["test_db"]         # gets the database
         test_coll = db["testcoll"]   # gets the desired collection
         doc = {"name": "slides.txt", "author": "Antonio", "type": "text",
                "tags": ["mongodb", "python", "slides"]}   # a dict
         test_coll.insert(doc)        # inserts the document into the collection

     • lazy creation: collections and databases are created when the first document is inserted into them
  23. QUICK-START: QUERY

         res = test_coll.find_one()          # gets one document
         query = {"author": "Antonio"}       # a query document
         res = test_coll.find_one(query)     # searches for one matching document
         for doc in test_coll.find(query):   # using Cursors on multiple docs
             print(doc)
             ...
         test_coll.count()                   # counts the docs in the collection
  24. NOT COVERED (HERE)
     • GridFS: binary data storage is limited to 16MB per document in the DB, so GridFS transparently splits large files among multiple documents
     • MapReduce: batch processing of data and aggregation operations
     • GeoSpatial Indexing: two-dimensional indexing for location-based queries (e.g., retrieve the n closest restaurants to my location)
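The GridFS idea of splitting a large file among multiple documents can be sketched in pure Python. This is an illustration of the chunking principle only, not the GridFS implementation: the field names `files_id`, `n`, and `data` follow the GridFS chunk-document convention, while the function names and the tiny `chunk_size` are assumptions chosen to keep the example readable.

```python
def split_into_chunks(file_id, data, chunk_size=4):
    """Split bytes into numbered chunk documents of at most chunk_size bytes."""
    return [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


payload = b"hello mongodb"                 # 13 bytes
chunks = split_into_chunks("file-1", payload)
restored = reassemble(list(reversed(chunks)))   # storage order does not matter
print(len(chunks), restored)
```

Because every chunk carries its sequence number, the pieces can be stored and fetched as ordinary documents and still be put back together in the right order.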
  25. IN PRODUCTION (some...)
  26. (image-only slide; no text in the transcript)
  27. Paraimpu LOVES MongoDB
     • MongoDB powers Paraimpu, our Social Web of Things tool
     • great data heterogeneity
     • thousands of small, real-time data inserts/queries
     • performance
     • horizontal scalability
     • ease of use, development is fun!
  28. REFERENCES
     • http://www.mongodb.org/
     • http://www.mongodb.org/display/DOCS/Manual
     • http://www.mongodb.org/display/DOCS/Slides+and+Video
     • pymongo: http://api.mongodb.org/python/
     • Paraimpu: http://paraimpu.crs4.it
  29. THANK YOU
     Antonio Pintus
     email: pintux@crs4.it
     twitter: @apintux