MongoDB: a gentle, friendly overview

My talk @ CRS4 about a MongoDB overview

Transcript

  • 1. A gentle, friendly overview
      Antonio Pintus, CRS4, 08/09/2011
  • 2. NOSQL /1
      • MongoDB belongs to the NoSQL database family:
          • non-relational
          • document-oriented
          • no predefined, rigid database schemas
          • no joins
          • horizontal scalability
  • 3. NOSQL /2
      • The NoSQL DB family includes several DB types:
          • document-oriented: MongoDB, CouchDB, ...
          • key-value / tuple store: Redis, ...
          • graph databases: Neo4j, ...
          • ...
  • 4. MongoDB
      • Performant: C++
      • Schema-free
      • Full index support
      • No transactions
      • Scalable: replication + sharding
      • document-based queries
      • Map/Reduce
      • GridFS
      • a JavaScript interactive shell
  • 5. SCHEMA-FREE
      • Schema-free collections = NO TABLES!
      • A Mongo deployment (server) holds a set of databases
          • A database holds a set of collections
          • A collection holds a set of documents
          • A document is a set of fields: key-value pairs (BSON)
          • A key is a name (string); a value is a basic type (string, integer, float, timestamp, binary, etc.), a document, or an array of values
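The containment hierarchy on slide 5 can be pictured with plain Python containers. This is only a mental model, not MongoDB code, and every name in it ("test_db", "sensors", the field names) is made up for illustration.

```python
# Plain-Python model of the containment hierarchy (not MongoDB code).
# All names below ("test_db", "sensors", ...) are hypothetical.
deployment = {
    "test_db": {                      # a database holds collections
        "sensors": [                  # a collection holds documents
            {"shortname": "SampleSensor", "policy": "PUBLIC"},
            # same collection, different fields: no fixed schema
            {"shortname": "Arduino", "tags": ["board", "diy"], "n": 3},
        ],
    },
}

def field_names(collection):
    """Return the set of keys used by each document in a collection."""
    return [set(doc) for doc in collection]

shapes = field_names(deployment["test_db"]["sensors"])
print(shapes[0] == shapes[1])  # False: the two documents have different fields
```

The point of the sketch is that nothing forces the two documents in "sensors" to share a field list, which is what "NO TABLES" means in practice.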
  • 6. DATA FORMAT
      • document-oriented
      • stores JSON-style documents as BSON (Binary JSON):
          • JSON + other data types, e.g., a Date type and a BinData type
          • can reference other documents
      • lightweight, traversable, efficient
  • 7. BSON
      {
        "_id" : ObjectId("4dcec9a0af391a0d53000003"),
        "servicetype" : "sensor",
        "description" : "it’s only rock’n’roll but I like it",
        "policy" : "PUBLIC",
        "owner" : "User001",
        "date_created" : "2011-05-02 17:11:28.874086",
        "shortname" : "SampleSensor",
        "content-type" : "text/plain",
        "icon" : "http://myserver.com/images/sens.png"
      }
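Two details of the sample document can be explored in plain Python. An ObjectId carries its creation time in its first four bytes, and the sample stores date_created as a string although BSON has a native Date type. The sketch below assumes one wanted to recover both as Python datetime values (the type PyMongo stores as a BSON Date); the helper name objectid_timestamp is hypothetical, not a driver function.

```python
from datetime import datetime, timezone

def objectid_timestamp(hex_id):
    """Decode the creation time from the first 4 bytes (8 hex digits) of an ObjectId."""
    seconds = int(hex_id[:8], 16)          # big-endian seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Values copied from the sample document on slide 7.
created_from_id = objectid_timestamp("4dcec9a0af391a0d53000003")
date_created = datetime.strptime("2011-05-02 17:11:28.874086", "%Y-%m-%d %H:%M:%S.%f")

print(created_from_id.isoformat())
print(date_created.isoformat())
```

Storing a real date rather than a string is what makes range queries on the field compare by time rather than by text.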
  • 8. COLLECTIONS
      • More or less the same concept as a “table”, but dynamic and schema-free
      • a collection of BSON documents
      • documents in the same collection can have heterogeneous data structures
  • 9. QUERIES
      • query by documents
      • Examples (using the interactive shell):
          • db.mycollection.find( {"policy" : "PUBLIC"} );
          • db.mycollection.findOne( {"policy" : "PUBLIC", "owner" : "User001"} );
          • db.mycollection.find( {"policy" : "PUBLIC", "owner" : "User001"} ).limit(2);
          • db.mycollection.find( {"policy" : "PUBLIC"}, {"shortname" : 1} );
          • db.mycollection.find( {"counter" : {$gt : 2}} );
      • conditional ops: $lt, $lte, $gt, $gte (for <, <=, >, >=), $and, $in, $or, $nor, ...
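To make "query by documents" concrete, here is a toy matcher in plain Python. It is not MongoDB's query engine and covers only equality, $gt, $lt and $in; the function name matches and the sample documents are assumptions for illustration.

```python
def matches(doc, query):
    """Toy query-by-document matcher: equality plus $gt, $lt and $in."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):               # operator form, e.g. {"$gt": 2}
            for op, arg in cond.items():
                if op == "$gt":
                    ok = value is not None and value > arg
                elif op == "$lt":
                    ok = value is not None and value < arg
                elif op == "$in":
                    ok = value in arg
                else:
                    raise ValueError("operator not covered by this sketch: " + op)
                if not ok:
                    return False
        elif value != cond:                      # plain equality, e.g. {"policy": "PUBLIC"}
            return False
    return True

docs = [
    {"shortname": "A", "policy": "PUBLIC", "owner": "User001", "counter": 5},
    {"shortname": "B", "policy": "PRIVATE", "owner": "User001", "counter": 1},
    {"shortname": "C", "policy": "PUBLIC", "owner": "User002"},
]

public = [d["shortname"] for d in docs if matches(d, {"policy": "PUBLIC"})]
busy = [d["shortname"] for d in docs if matches(d, {"counter": {"$gt": 2}})]
print(public, busy)
```

Each key in the query document is an extra condition, so adding "owner" to the query narrows the result in the same way as the findOne example on the slide.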
  • 10. INDEXES
      • Full index support: index on any attribute (including compound indexes on multiple attributes)
      • indexes increase query performance
      • indexes are implemented as “B-Tree” indexes
      • indexes add overhead to inserts and deletes, so don’t overuse them!
          • db.mycollection.ensureIndex( {"servicetype" : 1} );
          • db.mycollection.ensureIndex( {"servicetype" : 1, "owner" : -1} );
          • db.mycollection.getIndexes()
          • db.system.indexes.find()
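The trade-off on slide 10 (faster lookups, slower writes) can be sketched with a sorted list standing in for the B-tree. This is a simplification, not how MongoDB stores indexes; the class name SortedIndex and the sample documents are hypothetical.

```python
import bisect

class SortedIndex:
    """Toy single-field index: a sorted list of (key, position) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, key, position):
        # Every insert pays to keep the index ordered: the write overhead.
        bisect.insort(self.entries, (key, position))

    def find(self, key):
        # Binary search to the first entry for key, then read matching neighbours.
        start = bisect.bisect_left(self.entries, (key,))
        out = []
        for k, pos in self.entries[start:]:
            if k != key:
                break
            out.append(pos)
        return out

docs = [{"servicetype": "sensor"}, {"servicetype": "actuator"}, {"servicetype": "sensor"}]
index = SortedIndex()
for position, doc in enumerate(docs):
    index.add(doc["servicetype"], position)

print(index.find("sensor"))  # [0, 2]
```

Without the index, finding all "sensor" documents means checking every document; with it, the lookup jumps straight to the matching entries, at the price of extra work on each add.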
  • 11. INSERTS
      • Simplicity
      • db.mycollection.insert({"a" : "abc", ...})
      • var doc = {"name" : "mongodb", ...};
      • db.mycollection.insert(doc);
  • 12. UPDATES
      1. replace entire document
      2. atomic, in-place updates
      • db.collection.update( criteria, objNew, upsert, multi )
          • criteria: the query
          • objNew: updated object or $ operators (e.g., $inc, $set) which manipulate the object
          • upsert: if no matching record exists, insert one
          • multi: whether all documents matching criteria should be updated
      • db.collection.save(...): single object update with upsert
  • 13. UPDATES /2
      • atomic, in-place updates = highly efficient
      • provides special operators
      • db.mycollection.update( { "shortname" : "Arduino" }, { $inc : { n : 1 } } );
      • db.mycollection.update( { "shortname" : "Arduino" }, { $set : { "shortname" : "OldArduino" } } );
      • other atomic ops: $unset, $push, $pushAll, $addToSet, $pop, $pull, $rename, ...
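What the two update calls on slide 13 do to a document can be mimicked on a Python dict. This is a toy model of $inc and $set only, with no atomicity or server involved; the function name apply_update is an assumption for illustration.

```python
def apply_update(doc, update):
    """Apply a toy subset of update operators ($inc, $set) to a dict in place."""
    for op, changes in update.items():
        for field, arg in changes.items():
            if op == "$inc":
                doc[field] = doc.get(field, 0) + arg   # a missing field starts at 0
            elif op == "$set":
                doc[field] = arg
            else:
                raise ValueError("operator not covered by this sketch: " + op)
    return doc

doc = {"shortname": "Arduino", "n": 1}
apply_update(doc, {"$inc": {"n": 1}})
apply_update(doc, {"$set": {"shortname": "OldArduino"}})
print(doc)  # {'shortname': 'OldArduino', 'n': 2}
```

The operators touch only the named fields and leave the rest of the document alone, which is the difference from option 1 on slide 12 (replacing the entire document).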
  • 14. Mongo DISTRIBUTION
      • Mac, Linux, Solaris, Win
      • mongod: database server
          • by default, port=27017, store path=/data/db
          • override with the --dbpath and --port command options
      • mongo: interactive JavaScript shell
      • mongos: sharding controller server
  • 15. MISCELLANEOUS: REST
      • mongod provides a basic REST interface
      • launch it with the --rest option: default port=28017
      • http://localhost:28017/mydb/mycollection/
      • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino
      • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino&limit=10
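The URL pattern on slide 15 (a filter_ prefix per field, plus an optional limit) can be assembled with the standard library. The helper name rest_url and its parameters are hypothetical; the sketch only builds the string and does not contact a server.

```python
from urllib.parse import urlencode

def rest_url(db, collection, filters=None, limit=None, host="localhost", port=28017):
    """Build a URL in the style shown on the slide: filter_<field>=<value> plus optional limit."""
    params = {"filter_" + field: value for field, value in (filters or {}).items()}
    if limit is not None:
        params["limit"] = limit
    url = "http://{}:{}/{}/{}/".format(host, port, db, collection)
    return url + ("?" + urlencode(params) if params else "")

print(rest_url("mydb", "mycollection", {"shortname": "Arduino"}, limit=10))
```

Using urlencode rather than string concatenation keeps values with spaces or special characters safe in the query string.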
  • 16. GOOD FOR
      • event logging
      • high-performance small reads/writes
      • Web: real-time inserts, updates, and queries. Auto-sharding (scalability) and replication are provided.
      • real-time stats/analytics
  • 17. LESS GOOD FOR
      • systems with a heavily transactional nature
      • traditional Business Intelligence
      • (obviously) systems and problems requiring SQL
  • 18. SHARDING /1
      • Horizontal scalability: MongoDB auto-sharding
          • partitioning by keys
          • auto-balancing
          • easy addition of new servers
          • no single points of failure
          • automatic failover/replica sets
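"Partitioning by keys" can be illustrated with a small router that sends each key to the shard owning its range. This sketch assumes range-based partitioning on a string key; the shard names and split points are invented, and balancing, chunk splitting and failover are not modeled.

```python
import bisect

# Hypothetical layout: three shards, each owning one contiguous key range.
# split_points[i] is the first key that belongs to shards[i + 1].
split_points = ["h", "q"]
shards = ["shard0", "shard1", "shard2"]

def route(shard_key):
    """Return the shard whose range contains shard_key."""
    return shards[bisect.bisect_right(split_points, shard_key)]

for owner in ["alice", "mario", "zoe"]:
    print(owner, "->", route(owner))
```

Adding a server in this model means adding a split point and a shard name, which hints at why new servers are easy to introduce once the routing layer owns the key ranges.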
  • 19. SHARDING /2
      [Architecture diagram: shards (each a group of mongod processes), config servers (mongod), mongos routers, and the client]
  • 20. DRIVERS
      • C# and .NET
      • C, C++
      • Erlang, Perl
      • Haskell
      • Java, JavaScript
      • PHP
      • Python, Ruby, Delphi
      • Scala
      • Clojure
      • Go, Objective-C
      • Smalltalk
      • ...
  • 21. PyMongo
      • Recommended MongoDB driver for the Python language
      • An easy way to install it (Mac, Linux):
          • easy_install pymongo
          • easy_install -U pymongo
  • 22. QUICK-START: INSERT
      • (obviously) mongod must be running ;-)

      import pymongo
      from pymongo import Connection  # 2011-era class; PyMongo 3+ uses MongoClient instead

      conn = Connection()         # default localhost:27017; or conn = Connection(myhost, 9999)
      db = conn["test_db"]        # gets the database
      test_coll = db["testcoll"]  # gets the desired collection
      doc = {"name": "slides.txt", "author": "Antonio", "type": "text",
             "tags": ["mongodb", "python", "slides"]}  # a dict
      test_coll.insert(doc)       # inserts the document into the collection

      • lazy creation: collections and databases are created when the first document is inserted into them
  • 23. QUICK-START: QUERY

      res = test_coll.find_one()         # gets one document
      query = {"author": "Antonio"}      # a query document
      res = test_coll.find_one(query)    # searches for one document
      for doc in test_coll.find(query):  # using Cursors on multiple docs
          print(doc)
          ...
      test_coll.count()                  # counts the docs in the collection
  • 24. NOT COVERED (HERE)
      • GridFS: a single document is limited to 16MB, so GridFS transparently splits large files among multiple documents
      • MapReduce: batch processing of data and aggregation operations
      • GeoSpatial Indexing: two-dimensional indexing for location-based queries (e.g., retrieve the n closest restaurants to my location)
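The GridFS idea on slide 24, splitting a large file among multiple documents, can be sketched in a few lines. The chunk documents below use the files_id, n and data field names that GridFS chunks carry, but the helper names, the "file-1" identifier and the tiny chunk size are assumptions chosen for the demo; a real driver handles all of this internally.

```python
def split_into_chunks(files_id, data, chunk_size):
    """Split bytes into chunk documents shaped like GridFS chunks: files_id, n, data."""
    return [
        {"files_id": files_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]

def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))

payload = b"x" * 1000
chunks = split_into_chunks("file-1", payload, chunk_size=256)  # 256 is an arbitrary demo size
print(len(chunks), reassemble(chunks) == payload)  # 4 True
```

Because each chunk is an ordinary small document, no single document comes near the size limit, and the sequence number n is enough to put the file back together.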
  • 25. IN PRODUCTION (some...)
  • 27. Paraimpu LOVES MongoDB
      • MongoDB powers Paraimpu, our Social Web of Things tool
      • great data heterogeneity
      • thousands of small, real-time data inserts/queries
      • performance
      • horizontal scalability
      • ease of use, development is fun!
  • 28. REFERENCES
      • http://www.mongodb.org/
      • http://www.mongodb.org/display/DOCS/Manual
      • http://www.mongodb.org/display/DOCS/Slides+and+Video
      • pymongo: http://api.mongodb.org/python/
      • Paraimpu: http://paraimpu.crs4.it
  • 29. THANK YOU
      Antonio Pintus
      email: pintux@crs4.it
      twitter: @apintux