MongoDB: a gentle, friendly overview


My talk at CRS4: an overview of MongoDB


Transcript

  • 1. A gentle, friendly overview. Antonio Pintus, CRS4, 08/09/2011
  • 2. NOSQL /1
      • MongoDB belongs to the NoSQL database family:
          • non-relational
          • document-oriented
          • no predefined, rigid database schemas
          • no joins
          • horizontal scalability
  • 3. NOSQL /2
      • The NoSQL DB family includes several DB types:
          • document-oriented: MongoDB, CouchDB, ...
          • key-value / tuple store: Redis, ...
          • graph databases: Neo4j, ...
          • ...
  • 4. MongoDB
      • Performant: C++
      • Schema-free
      • Full index support
      • No transactions
      • Scalable: replication + sharding
      • document-based queries
      • Map/Reduce
      • GridFS
      • a JavaScript interactive shell
  • 5. SCHEMA-FREE
      • Schema-free collections = NO TABLES!
      • A Mongo deployment (server) holds a set of databases
          • A database holds a set of collections
          • A collection holds a set of documents
          • A document is a set of fields: key-value pairs (BSON)
          • A key is a name (string); a value is a basic type (string, integer, float, timestamp, binary, etc.), a document, or an array of values
  • 6. DATA FORMAT
      • document-oriented
      • stores JSON-style documents: BSON (Binary JSON):
          • JSON + other data types, e.g., a Date type and a BinData type.
          • Can reference other documents
      • lightweight, traversable, efficient
  • 7. BSON
      {
        "_id" : ObjectId("4dcec9a0af391a0d53000003"),
        "servicetype" : "sensor",
        "description" : "it’s only rock’n’roll but I like it",
        "policy" : "PUBLIC",
        "owner" : "User001",
        "date_created" : "2011-05-02 17:11:28.874086",
        "shortname" : "SampleSensor",
        "content-type" : "text/plain",
        "icon" : "http://myserver.com/images/sens.png"
      }
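The remark that BSON is "JSON + other data types" can be seen from the JSON side alone. The sketch below uses only Python's standard library and does not touch MongoDB; the `to_json` helper and the sample document are illustrative assumptions. It shows that plain JSON has no date type, so a `datetime` has to be turned into a string before serializing, which is the gap a native Date type fills.

```python
import json
from datetime import datetime

# Illustrative document; the field names are borrowed from the BSON slide.
doc = {"shortname": "SampleSensor",
       "date_created": datetime(2011, 5, 2, 17, 11, 28)}


def to_json(document):
    """Serialize to plain JSON, turning datetimes into ISO 8601 strings."""
    def fallback(value):
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError("not JSON serializable: " + type(value).__name__)
    return json.dumps(document, default=fallback, sort_keys=True)


try:
    json.dumps(doc)            # plain JSON cannot represent a datetime
    plain_json_ok = True
except TypeError:
    plain_json_ok = False

encoded = to_json(doc)         # works, but the date is now just a string
```

After the round trip through JSON the date is an ordinary string, so the reader has to know which fields to parse back into dates.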
  • 8. COLLECTIONS
      • More or less the same concept as a “table”, but dynamic and schema-free
      • collection of BSON documents
      • documents can have heterogeneous data structures in the same collection
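To picture heterogeneous documents living in one collection, here is a toy model in plain Python: a list of dicts stands in for a collection. This is only an analogy, not how MongoDB stores data, and `field_usage` is a hypothetical helper written for this example.

```python
# A toy "collection": a plain list of dicts with different shapes.
collection = [
    {"shortname": "SampleSensor", "servicetype": "sensor", "policy": "PUBLIC"},
    {"shortname": "Arduino", "owner": "User001", "tags": ["board", "diy"]},
    {"name": "slides.txt", "author": "Antonio"},
]


def field_usage(docs):
    """Count how many documents carry each field name."""
    usage = {}
    for doc in docs:
        for key in doc:
            usage[key] = usage.get(key, 0) + 1
    return usage


usage = field_usage(collection)
```

No field is shared by all three documents, which is exactly what a fixed table schema would forbid.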
  • 9. QUERIES
      • query by documents
      • Examples (using the interactive shell):
          • db.mycollection.find( {"policy" : "PUBLIC"} );
          • db.mycollection.findOne({"policy" : "PUBLIC", "owner" : "User001"});
          • db.mycollection.find({"policy" : "PUBLIC", "owner" : "User001"}).limit(2);
          • db.mycollection.find( {"policy" : "PUBLIC"}, {"shortname" : 1} );
          • db.mycollection.find({"counter" : {$gt : 2}});
      • conditional operators: $lt, $lte, $gt, $gte (for <, <=, >, >=), $and, $in, $or, $nor, ...
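"Query by documents" means the query itself is a document whose fields must all match. The following is a minimal sketch of that idea in pure Python, assuming a tiny subset of operators; `matches` and `find` are hypothetical helpers for illustration, not MongoDB's or PyMongo's implementation.

```python
def matches(doc, query):
    """Return True if doc satisfies a small subset of MongoDB-style queries.

    Supported: plain equality and the operators $gt, $gte, $lt, $lte, $in.
    """
    ops = {
        "$gt": lambda value, arg: value > arg,
        "$gte": lambda value, arg: value >= arg,
        "$lt": lambda value, arg: value < arg,
        "$lte": lambda value, arg: value <= arg,
        "$in": lambda value, arg: value in arg,
    }
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Every operator listed for this field must hold.
            if not all(ops[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            return False
    return True


def find(docs, query, limit=None):
    """Return matching documents, optionally capped like .limit(n)."""
    result = [doc for doc in docs if matches(doc, query)]
    return result if limit is None else result[:limit]


docs = [
    {"shortname": "a", "policy": "PUBLIC", "owner": "User001", "counter": 1},
    {"shortname": "b", "policy": "PUBLIC", "owner": "User002", "counter": 3},
    {"shortname": "c", "policy": "PRIVATE", "owner": "User001", "counter": 5},
]
public = find(docs, {"policy": "PUBLIC"})
busy = find(docs, {"counter": {"$gt": 2}})
```

Listing several fields in one query document acts as an implicit AND, which is why the two-field examples on the slide need no explicit operator.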
  • 10. INDEXES
      • Full index support: index on any attribute (including compound indexes on multiple attributes)
      • increase query performance
      • indexes are implemented as “B-Tree” indexes
      • indexes add overhead to inserts and deletes, so don’t overuse them!
          • db.mycollection.ensureIndex( {"servicetype" : 1} );
          • db.mycollection.ensureIndex( {"servicetype" : 1, "owner" : -1} );
          • db.mycollection.getIndexes()
          • db.system.indexes.find()
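Why an index speeds up queries but costs something on writes can be sketched with a sorted list and binary search. This is a deliberately simplified stand-in: a sorted Python list is not a B-tree, and `build_index` and `lookup` are hypothetical names for this example only.

```python
import bisect

docs = [
    {"servicetype": "sensor", "owner": "User002"},
    {"servicetype": "actuator", "owner": "User001"},
    {"servicetype": "sensor", "owner": "User001"},
    {"servicetype": "display", "owner": "User003"},
]


def build_index(documents, field):
    """Return (sorted keys, matching positions) for one field."""
    entries = sorted((doc[field], pos) for pos, doc in enumerate(documents))
    keys = [key for key, _ in entries]
    positions = [pos for _, pos in entries]
    return keys, positions


def lookup(index, documents, value):
    """Find documents whose indexed field equals value by binary search."""
    keys, positions = index
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [documents[pos] for pos in positions[lo:hi]]


index = build_index(docs, "servicetype")
sensors = lookup(index, docs, "sensor")
```

The lookup avoids scanning every document, but each insert or delete would also have to update the sorted structure, which is the overhead the slide warns about.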
  • 11. INSERTS
      • Simplicity
      • db.mycollection.insert({"a" : "abc", ...})
      • var doc = {"name" : "mongodb", ...};
      • db.mycollection.insert(doc);
  • 12. UPDATES
      1. replace entire document
      2. atomic, in-place updates
      • db.collection.update( criteria, objNew, upsert, multi )
          • criteria: the query
          • objNew: updated object or $ operators (e.g., $inc, $set) which manipulate the object
          • upsert: if the record(s) do not exist, insert one.
          • multi: whether all documents matching criteria should be updated
      • db.collection.save(...): single object update with upsert
  • 13. UPDATES /2
      • atomic, in-place updates = highly efficient
      • provides special operators
      • db.mycollection.update( { "shortname" : "Arduino" }, { $inc : { n : 1 } } );
      • db.mycollection.update( { "shortname" : "Arduino" }, { $set : { "shortname" : "OldArduino" } } );
      • other atomic ops: $unset, $push, $pushAll, $addToSet, $pop, $pull, $rename, ...
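The four arguments of `update( criteria, objNew, upsert, multi )` and the difference between replacing a document and applying $ operators can be modelled on a plain list of dicts. This is a toy sketch under stated assumptions: only equality criteria and the $inc, $set and $unset operators are handled, nothing here is atomic, and `apply_update` and `update` are hypothetical helpers, not the server's logic.

```python
import copy


def apply_update(doc, obj_new):
    """Return a new document after applying obj_new to doc.

    With $ operators only the named fields change; otherwise obj_new
    replaces the whole document (the _id, if any, is kept).
    """
    if any(key.startswith("$") for key in obj_new):
        updated = copy.deepcopy(doc)
        for field, amount in obj_new.get("$inc", {}).items():
            updated[field] = updated.get(field, 0) + amount
        for field, value in obj_new.get("$set", {}).items():
            updated[field] = value
        for field in obj_new.get("$unset", {}):
            updated.pop(field, None)
        return updated
    replaced = copy.deepcopy(obj_new)
    if "_id" in doc:
        replaced["_id"] = doc["_id"]
    return replaced


def update(docs, criteria, obj_new, upsert=False, multi=False):
    """Toy update on a list; returns the number of documents written."""
    written = 0
    for i, doc in enumerate(docs):
        if all(doc.get(k) == v for k, v in criteria.items()):
            docs[i] = apply_update(doc, obj_new)
            written += 1
            if not multi:
                break              # without multi, only the first match changes
    if written == 0 and upsert:
        # Nothing matched: insert a document built from the criteria.
        docs.append(apply_update(dict(criteria), obj_new))
        written = 1
    return written


things = [{"shortname": "Arduino", "n": 1}, {"shortname": "Arduino", "n": 5}]
update(things, {"shortname": "Arduino"}, {"$inc": {"n": 1}})
```

After the last call only the first matching document has its counter incremented; passing `multi=True` would touch both.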
  • 14. Mongo DISTRIBUTION
      • Mac, Linux, Solaris, Win
      • mongod: database server.
          • By default, port=27017, store path=/data/db.
          • Override with --dbpath, --port command options
      • mongo: interactive JavaScript shell
      • mongos: sharding controller server
  • 15. MISCELLANEOUS: REST
      • mongod provides a basic REST interface
      • launch it with --rest option: default port=28017
      • http://localhost:28017/mydb/mycollection/
      • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino
      • http://localhost:28017/mydb/mycollection/?filter_shortname=Arduino&limit=10
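The URL pattern on the REST slide (database, collection, then `filter_<field>` and `limit` parameters) can be assembled programmatically. The sketch below only builds the URL string with the standard library; `rest_url` is a hypothetical helper, and it makes no request, so it does not need a running mongod.

```python
from urllib.parse import urlencode


def rest_url(db, collection, filters=None, limit=None,
             host="localhost", port=28017):
    """Build a URL in the style shown on the REST slide."""
    params = [("filter_" + field, value)
              for field, value in (filters or {}).items()]
    if limit is not None:
        params.append(("limit", limit))
    url = "http://{}:{}/{}/{}/".format(host, port, db, collection)
    return url + ("?" + urlencode(params) if params else "")


url = rest_url("mydb", "mycollection", {"shortname": "Arduino"}, limit=10)
```

Using `urlencode` rather than string concatenation keeps filter values with spaces or special characters correctly escaped.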
  • 16. GOOD FOR
      • event logging
      • high-performance small reads/writes
      • Web: real-time inserts, updates, and queries. Auto-sharding (scalability) and replication are provided.
      • Real-time stats/analytics
  • 17. LESS GOOD FOR
      • Systems with a heavy transactional nature
      • Traditional Business Intelligence
      • (obviously) Systems and problems requiring SQL
  • 18. SHARDING /1
      • Horizontal scalability: MongoDB auto-sharding
          • partitioning by keys
          • auto-balancing
          • easy addition of new servers
          • no single points-of-failure
          • automatic failover/replica-sets
  • 19. SHARDING /2
      [Diagram: Client, mongos routers, mongod Config servers, and Shards, each shard made of several mongod processes]
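"Partitioning by keys" can be pictured as a lookup from a shard key value to the shard that owns its range. The boundaries and shard names below are invented for illustration, and the real routing and auto-balancing done by mongos and the config servers are far more involved than this sketch.

```python
import bisect

# Hypothetical split points on a string shard key, one shard per range:
#   key < "g"        -> shard0
#   "g" <= key < "p" -> shard1
#   key >= "p"       -> shard2
BOUNDARIES = ["g", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def route(shard_key_value):
    """Return the name of the shard whose key range contains the value."""
    return SHARDS[bisect.bisect_right(BOUNDARIES, shard_key_value)]


targets = {name: route(name) for name in ["arduino", "mongo", "sensor"]}
```

Adding a server in this picture means splitting one range and moving part of its data, which is the work auto-balancing takes off the operator's hands.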
  • 20. DRIVERS
      • C# and .NET
      • C, C++
      • Erlang, Perl
      • Haskell
      • Java, Javascript
      • PHP
      • Python, Ruby, Delphi
      • Scala
      • Clojure
      • Go, Objective C
      • Smalltalk
      • ...
  • 21. PyMongo
      • Recommended MongoDB driver for the Python language
      • An easy way to install it (Mac, Linux):
          • easy_install pymongo
          • easy_install -U pymongo
  • 22. QUICK-START: INSERT
      • (obviously) mongod must be running ;-)

      from pymongo import MongoClient  # MongoClient replaces the older Connection class

      conn = MongoClient()  # default localhost:27017; or MongoClient(myhost, 9999)
      db = conn["test_db"]  # gets the database
      test_coll = db["testcoll"]  # gets the desired collection
      doc = {"name": "slides.txt", "author": "Antonio", "type": "text",
             "tags": ["mongodb", "python", "slides"]}  # a dict
      test_coll.insert_one(doc)  # inserts the document into the collection

      • lazy creation: collections and databases are created when the first document is inserted into them
  • 23. QUICK-START: QUERY

      res = test_coll.find_one()  # gets one document
      query = {"author": "Antonio"}  # a query document
      res = test_coll.find_one(query)  # searches for one document
      for doc in test_coll.find(query):  # iterates a cursor over multiple docs
          print(doc)
      ...
      test_coll.count_documents({})  # counts the docs in the collection
  • 24. NOT COVERED (HERE)
      • GridFS: binary data stored in a single document is limited to 16 MB, so GridFS transparently splits large files among multiple documents
      • MapReduce: batch processing of data and aggregation operations
      • GeoSpatial Indexing: two-dimensional indexing for location-based queries (e.g., retrieve the n closest restaurants to my location)
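The GridFS idea of splitting a large file among multiple documents can be sketched in a few lines. This is a pure-Python illustration, not the GridFS implementation: the chunk size is an arbitrary small number chosen for the example, and `split_into_chunks` and `reassemble` are hypothetical helpers.

```python
def split_into_chunks(data, files_id, chunk_size):
    """Split bytes into numbered chunk documents of at most chunk_size bytes."""
    return [
        {"files_id": files_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Rebuild the original bytes from chunks given in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


payload = bytes(range(256)) * 10                      # 2560 bytes of sample data
chunks = split_into_chunks(payload, files_id="demo", chunk_size=1000)
restored = reassemble(list(reversed(chunks)))
```

Because every chunk carries its sequence number, the pieces can be stored and fetched independently and still be put back in order.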
  • 25. IN PRODUCTION (some...)
  • 26. (no text on this slide)
  • 27. Paraimpu LOVES MongoDB
      • MongoDB powers Paraimpu, our Social Web of Things tool
      • great data heterogeneity
      • thousands of small real-time inserts/queries
      • performance
      • horizontal scalability
      • ease of use, development is fun!
  • 28. REFERENCES
      • http://www.mongodb.org/
      • http://www.mongodb.org/display/DOCS/Manual
      • http://www.mongodb.org/display/DOCS/Slides+and+Video
      • pymongo: http://api.mongodb.org/python/
      • Paraimpu: http://paraimpu.crs4.it
  • 29. THANK YOU
      • Antonio Pintus, email: pintux@crs4.it, twitter: @apintux