MongoDB EuroPython 2009

"MongoDB in the Real World" talk from EuroPython 2009 in Birmingham.

Published in: Art & Photos, Technology

  1. open-source, high-performance, schema-free, document-oriented database
  2. RDBMS • Great for many applications • Shortcomings: scalability, flexibility
  3. CAP Theorem • Consistency • Availability • Tolerance to network Partitions • Pick two http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  4. ACID vs BASE • ACID: Atomicity, Consistency, Isolation, Durability • BASE: Basically Available, Soft state, Eventually consistent
  5. Schema-free • Loosening constraints - added flexibility • Dynamically typed languages • Migrations
  6. BigTable • Single master node • Row / Column hybrid • Versioned
  7. BigTable • Open-source clones: • HBase • Hypertable
  8. Dynamo • Simple Key/Value store • No master node • Write to any (many) nodes • Read from one or more nodes (balance speed vs. consistency) • Read repair
  9. Dynamo • Open-source clones • Project Voldemort • Cassandra - data model more like BigTable • Dynomite
  10. memcached • Used as a caching layer • Essentially a key/value store • RAM only - fast • Does away with ACID
  11. Redis • Like memcached • Differences: values can be strings, lists or sets; non-volatile
  12. Tokyo Cabinet + Tyrant • Key/value store with focus on speed • Some more advanced queries: sorting, range or prefix matching • Multiple storage engines: hash, B-tree, fixed-length and table
  13. CouchDB • A lot in common with MongoDB: • Document-oriented • Schema-free • JSON-style documents
  14. CouchDB • Differences: • MVCC based • Replication as path to scalability • Query through predefined views • ACID • REST
  15. MongoDB • Focus on performance • Rich dynamic queries • Secondary indexes • Replication / failover • Auto-sharding • Many platforms / languages supported
  16. Good at • The web • Caching • High volume / low value • Scalability
  17. Less good at • Highly transactional • Ad-hoc business intelligence • Problems that require SQL
  18. PyMongo • Python driver for MongoDB • Pure Python, with optional C extension • Installation (setuptools): easy_install pymongo
  19. Document • Unit of storage (think row) • Just a dictionary • Can store many Python types: • None, bool, int, float, string / unicode, dict, datetime.datetime, compiled re • Some special types: • SON, Binary, ObjectId, DBRef
  20. Collection • Schema-free equivalent of a table • Logical groups of documents • Indexes are per-collection
  21. _id • Special key • Present in all documents • Unique across a Collection • Any type you want
  22. Blog back-end
  23. Post {"author": "mike", "date": datetime.datetime.utcnow(), "text": "my blog post...", "tags": ["mongodb", "python"]}
  24. Comment {"author": "eliot", "date": datetime.datetime.utcnow(), "text": "great post!"}
  25. New post post = {"author": "mike", "date": datetime.datetime.utcnow(), "text": "my blog post...", "tags": ["mongodb", "python"]} post_id = db.posts.save(post)
  26. Embedding a comment c = {"author": "eliot", "date": datetime.datetime.utcnow(), "text": "great post!"} db.posts.update({"_id": post_id}, {"$push": {"comments": c}})
  27. Last 10 posts query = db.posts.find().sort("date", DESCENDING).limit(10) for post in query: print(post["text"])
  28. Posts by author db.posts.find({"author": "mike"})
  29. Posts in the last week last_week = datetime.datetime.utcnow() + datetime.timedelta(days=-7) db.posts.find({"date": {"$gt": last_week}})
  30. Posts ending with ‘Python’ db.posts.find({"text": re.compile("Python$")})
  31. Posts with a tag db.posts.find({"tags": "mongodb"}) ... and fast db.posts.create_index("tags", ASCENDING)
  32. Counting posts db.posts.count() db.posts.find({"author": "mike"}).count()
  33. Basic paging page = 2 page_size = 15 db.posts.find().limit(page_size).skip(page * page_size)
  34. Migration: adding titles • Easy - just start adding them: post = {"author": "mike", "date": datetime.datetime.utcnow(), "text": "another blog post...", "tags": ["meetup", "python"], "title": "Document Oriented Dbs"} post_id = db.posts.save(post)
  35. Advanced queries • $gt, $lt, $gte, $lte, $ne, $all, $in, $nin • where() db.posts.find().where("this.author == 'mike'") • group()
  36. Other cool stuff • Capped collections • Unique indexes • Mongo shell • GridFS • MongoKit (on pypi)
  37. • Download MongoDB http://www.mongodb.org • Install PyMongo • Try it out!
  38. • http://www.mongodb.org • irc.freenode.net#mongodb • mongodb-user on google groups • @mongodb, @mdirolf • mike@10gen.com • http://www.slideshare.net/mdirolf
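
Slide 8 describes Dynamo's trade-off between read speed and consistency, plus read repair. The toy model below is only a sketch of that idea in plain Python: the `Replica` class, the `write` and `read` helpers and the version numbers are all hypothetical and do not come from Dynamo or any of its clones. It assumes N = 3 replicas and shows that a read touching R replicas is guaranteed to see the latest write to W replicas only when R + W > N.

```python
# Toy model of Dynamo-style replication (hypothetical helpers, not a real client).

class Replica:
    def __init__(self):
        self.store = {}  # key -> (version, value)


def write(replicas, key, value, version, w):
    """Write to the first w replicas only; the remaining ones stay stale."""
    for replica in replicas[:w]:
        replica.store[key] = (version, value)


def read(replicas, key, r):
    """Read from the last r replicas, return the newest value seen,
    and copy it back to any contacted replica that was behind (read repair)."""
    contacted = replicas[-r:]
    answers = [rep.store.get(key, (0, None)) for rep in contacted]
    newest = max(answers, key=lambda answer: answer[0])
    for rep in contacted:
        if rep.store.get(key, (0, None))[0] < newest[0]:
            rep.store[key] = newest
    return newest[1]


replicas = [Replica() for _ in range(3)]      # N = 3
write(replicas, "k", "v1", version=1, w=3)    # every replica has v1
write(replicas, "k", "v2", version=2, w=2)    # the third replica is now stale

fast_read = read(replicas, "k", r=1)      # R + W = 3, not > N: can be stale
quorum_read = read(replicas, "k", r=2)    # R + W = 4 > N: overlaps a fresh copy
```

Reading from a single replica is the fast option but here returns the old value; reading from two returns the new value and repairs the stale replica as a side effect.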
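
Slide 26 embeds a comment in a post with a `$push` update. To show what that does to the stored document without needing a running server, the sketch below mimics the effect on a plain dictionary. The `push` helper is a hypothetical stand-in written for this illustration, not part of PyMongo.

```python
import datetime


def push(document, field, value):
    """Mimic the effect of {"$push": {field: value}} on a single document:
    append to the array, creating it first if the field is missing."""
    document.setdefault(field, []).append(value)
    return document


post = {"author": "mike",
        "date": datetime.datetime(2009, 7, 1),
        "text": "my blog post...",
        "tags": ["mongodb", "python"]}

comment = {"author": "eliot",
           "date": datetime.datetime(2009, 7, 2),
           "text": "great post!"}

# The post had no "comments" key; after the push it holds a one-element list.
push(post, "comments", comment)
```

The comment ends up nested inside the post, so one read of the post returns its comments too.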
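
Slide 30 passes a compiled Python regular expression as a query value. As a rough guide to which texts such a pattern selects, the sketch below applies the same pattern locally with `re.search`, on the assumption that the server matches the pattern anywhere in the string unless it is anchored. The sample texts are made up.

```python
import re

texts = ["I love Python", "Python is great", "python"]

# "$" anchors the match at the end of the string.
pattern = re.compile("Python$")
ending = [t for t in texts if pattern.search(t)]

# The same pattern, case-insensitive.
pattern_ci = re.compile("Python$", re.IGNORECASE)
ending_ci = [t for t in texts if pattern_ci.search(t)]
```

Only the first text matches the case-sensitive pattern; the case-insensitive one also picks up the lowercase "python".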
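
The paging arithmetic on slide 33 is easy to check on an ordinary list. The sketch below uses a hypothetical `page_slice` helper that is the list equivalent of `.skip(page * page_size).limit(page_size)`; it makes visible that `page` is counted from zero, so `page = 2` skips 30 documents and returns the third batch of 15.

```python
def page_slice(items, page, page_size):
    """List equivalent of .skip(page * page_size).limit(page_size)."""
    start = page * page_size
    return items[start:start + page_size]


posts = list(range(100))      # stand-ins for 100 posts, already sorted

first_page = page_slice(posts, 0, 15)
third_page = page_slice(posts, 2, 15)   # page = 2, as on the slide
last_page = page_slice(posts, 6, 15)    # only 10 items are left
```

If pages are numbered from one in a user interface, subtract one before multiplying.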
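
Slide 34 says a schema-free migration is just a matter of starting to write the new key. The other half of that bargain is that reading code has to cope with older documents that lack it. A minimal sketch, assuming a hypothetical `display_title` helper and a made-up placeholder string:

```python
def display_title(post):
    """Return the post's title, or a placeholder for posts saved
    before titles were introduced."""
    return post.get("title", "(untitled)")


old_post = {"author": "mike", "text": "my blog post...",
            "tags": ["mongodb", "python"]}
new_post = {"author": "mike", "text": "another blog post...",
            "tags": ["meetup", "python"],
            "title": "Document Oriented Dbs"}

titles = [display_title(p) for p in (old_post, new_post)]
```

No existing document has to be rewritten; the default is applied at read time.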
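
The query documents on slides 28, 29, 31 and 35 can be read as small predicates. The sketch below is a simplified in-memory matcher that illustrates their meaning; `matches`, `find` and the sample posts are hypothetical, and it is not how MongoDB evaluates queries. Two simplifications are assumed: operators are applied to the field value as a whole, and a document missing a queried key never matches, which may differ from the server's behaviour for operators such as `$ne` and `$nin`.

```python
import datetime

OPERATORS = {
    "$gt": lambda value, arg: value > arg,
    "$lt": lambda value, arg: value < arg,
    "$gte": lambda value, arg: value >= arg,
    "$lte": lambda value, arg: value <= arg,
    "$ne": lambda value, arg: value != arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
    "$all": lambda value, arg: all(item in value for item in arg),
}


def matches(document, query):
    """Return True if every key in the query is satisfied by the document."""
    for key, condition in query.items():
        if key not in document:
            return False                      # simplification, see lead-in
        value = document[key]
        if isinstance(condition, dict):       # {"$gt": ...} style operators
            if not all(OPERATORS[op](value, arg)
                       for op, arg in condition.items()):
                return False
        elif isinstance(value, list):         # array field: any element may match
            if value != condition and condition not in value:
                return False
        elif value != condition:              # plain equality
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


now = datetime.datetime(2009, 7, 1)
posts = [
    {"author": "mike", "date": now - datetime.timedelta(days=1),
     "tags": ["mongodb", "python"]},
    {"author": "eliot", "date": now - datetime.timedelta(days=3),
     "tags": ["mongodb"]},
    {"author": "mike", "date": now - datetime.timedelta(days=30),
     "tags": ["python"]},
]

last_week = now - datetime.timedelta(days=7)
recent = find(posts, {"date": {"$gt": last_week}})
by_mike = find(posts, {"author": "mike"})
tagged = find(posts, {"tags": "mongodb"})
both_tags = find(posts, {"tags": {"$all": ["mongodb", "python"]}})
not_mike = find(posts, {"author": {"$ne": "mike"}})
```

Note that the tag query names a single value while the stored field is a list; the match succeeds when any element of the list equals that value.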
