MongoDB EuroPython 2009

MongoDB in the "Real World" talk from EuroPython 2009 in Birmingham.

Published in: Art & Photos, Technology

  1. open-source, high-performance, schema-free, document-oriented database
  2. RDBMS • Great for many applications • Shortcomings • Scalability • Flexibility
  3. CAP Theorem • Consistency • Availability • Tolerance to network Partitions • Pick two http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  4. ACID vs BASE • ACID: Atomicity, Consistency, Isolation, Durability • BASE: Basically Available, Soft state, Eventually consistent
  5. Schema-free • Loosening constraints - added flexibility • Dynamically typed languages • Migrations
  6. BigTable • Single master node • Row / Column hybrid • Versioned
  7. BigTable • Open-source clones: • HBase • Hypertable
  8. Dynamo • Simple Key/Value store • No master node • Write to any (many) nodes • Read from one or more nodes (balance speed vs. consistency) • Read repair
  9. Dynamo • Open-source clones • Project Voldemort • Cassandra - data model more like BigTable • Dynomite
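
The "balance speed vs. consistency" trade-off on the Dynamo slides can be sketched with the usual quorum rule. The names `n`, `r` and `w` (replica count, read quorum, write quorum) are assumptions borrowed from the Dynamo literature and do not appear on the slides; this is a minimal illustration, not how any particular store implements it.

```python
def read_overlaps_latest_write(n, r, w):
    """Return True when every read quorum must share a node with every write quorum.

    n: number of replicas holding a key (assumed notation)
    r: nodes consulted on a read
    w: nodes that must acknowledge a write
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must both be between 1 and n")
    # If r + w > n, any r nodes and any w nodes have at least one node in common.
    return r + w > n


# Reading from a single node is fast but may miss the latest write.
fast_read = read_overlaps_latest_write(n=3, r=1, w=1)
# Reading from two of three nodes is slower but overlaps every two-node write.
safe_read = read_overlaps_latest_write(n=3, r=2, w=2)
```

Lowering `r` makes reads cheaper at the cost of possibly stale results, which is the gap that read repair is meant to close over time.
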
  10. memcached • Used as a caching layer • Essentially a key/value store • RAM only - fast • Does away with ACID
  11. Redis • Like memcached • Differences: • Values can be strings, lists, sets • Non-volatile
  12. Tokyo Cabinet + Tyrant • Key/value store with focus on speed • Some more advanced queries • Sorting, range or prefix matching • Multiple storage engines • Hash, B-Tree, Fixed length and Table
  13. CouchDB • A lot in common with MongoDB: • Document-oriented • Schema-free • JSON-style documents
  14. CouchDB • Differences • MVCC based • Replication as path to scalability • Query through predefined views • ACID • REST
  15. MongoDB • Focus on performance • Rich dynamic queries • Secondary indexes • Replication / failover • Auto-sharding • Many platforms / languages supported
  16. Good at • The web • Caching • High volume / low value • Scalability
  17. Less good at • Highly transactional • Ad-hoc business intelligence • Problems that require SQL
  18. PyMongo • Python driver for MongoDB • Pure Python, with optional C extension • Installation (setuptools): easy_install pymongo
  19. Document • Unit of storage (think row) • Just a dictionary • Can store many Python types: • None, bool, int, float, string / unicode, dict, datetime.datetime, compiled re • Some special types: • SON, Binary, ObjectId, DBRef
  20. Collection • Schema-free equivalent of a table • Logical groups of documents • Indexes are per-collection
  21. _id • Special key • Present in all documents • Unique across a Collection • Any type you want
  22. Blog back-end
  23. Post
      {"author": "mike",
       "date": datetime.datetime.utcnow(),
       "text": "my blog post...",
       "tags": ["mongodb", "python"]}
  24. Comment
      {"author": "eliot",
       "date": datetime.datetime.utcnow(),
       "text": "great post!"}
  25. New post
      post = {"author": "mike",
              "date": datetime.datetime.utcnow(),
              "text": "my blog post...",
              "tags": ["mongodb", "python"]}
      post_id = db.posts.save(post)
  26. Embedding a comment
      c = {"author": "eliot",
           "date": datetime.datetime.utcnow(),
           "text": "great post!"}
      db.posts.update({"_id": post_id},
                      {"$push": {"comments": c}})
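
To make the effect of `$push` concrete, here is a rough in-memory sketch of what the update above does to the stored post. `apply_push` is a hypothetical helper written for this illustration, not a PyMongo function, and it only approximates the server's behaviour for the simple case shown on the slide. Later PyMongo releases replaced `update` with `update_one` and `update_many`, but the update document itself has the same shape.

```python
import copy
import datetime


def apply_push(document, push_spec):
    """Return a copy of `document` with each value in `push_spec` appended to its array field.

    Mirrors the basic $push behaviour: a missing field is created as a new list.
    """
    updated = copy.deepcopy(document)
    for field, value in push_spec.items():
        target = updated.setdefault(field, [])
        if not isinstance(target, list):
            raise TypeError("cannot push onto non-array field %r" % field)
        target.append(value)
    return updated


post = {"author": "mike", "text": "my blog post...", "tags": ["mongodb", "python"]}
comment = {"author": "eliot",
           "date": datetime.datetime.now(datetime.timezone.utc),
           "text": "great post!"}

# The post had no "comments" key; after the push it holds a one-element list.
post_with_comment = apply_push(post, {"comments": comment})
```
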
  27. Last 10 posts
      from pymongo import DESCENDING
      query = db.posts.find().sort("date", DESCENDING).limit(10)
      for post in query:
          print(post["text"])
  28. Posts by author
      db.posts.find({"author": "mike"})
  29. Posts in the last week
      last_week = datetime.datetime.utcnow() + datetime.timedelta(days=-7)
      db.posts.find({"date": {"$gt": last_week}})
  30. Posts ending with ‘Python’
      import re
      db.posts.find({"text": re.compile("Python$")})
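
The regular-expression query can be tried locally before sending it to the server. This sketch assumes the server treats the pattern as an unanchored search (like `re.search`, not `re.match`); `text_matches` is a hypothetical helper used only for illustration.

```python
import re

# Same pattern as on the slide: "Python" at the end of the text, case-sensitive.
ends_with_python = re.compile("Python$")


def text_matches(pattern, text):
    """Return True if `pattern` is found anywhere in `text` (search semantics)."""
    return pattern.search(text) is not None


sample_posts = [
    {"text": "Why I like Python"},
    {"text": "Python is great"},
    {"text": "why i like python"},
]
matching = [p for p in sample_posts if text_matches(ends_with_python, p["text"])]
```

Only the first sample post is selected: the second does not end with the word, and the third differs in case.
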
  31. Posts with a tag
      from pymongo import ASCENDING
      db.posts.find({"tags": "mongodb"})
      ... and fast
      db.posts.create_index("tags", ASCENDING)
  32. Counting posts
      db.posts.count()
      db.posts.find({"author": "mike"}).count()
  33. Basic paging
      page = 2  # zero-based: page 0 is the first page
      page_size = 15
      db.posts.find().limit(page_size).skip(page * page_size)
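
The skip/limit arithmetic is easy to get off by one, so here is a small sketch of it applied to a plain list. `page_window` and `get_page` are hypothetical helpers for illustration; pages are assumed to be zero-based, as in the slide's `page * page_size`.

```python
def page_window(page, page_size):
    """Return (skip, limit) for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return page * page_size, page_size


def get_page(items, page, page_size):
    """Slice `items` the way skip() and limit() would restrict a result set."""
    skip, limit = page_window(page, page_size)
    return items[skip:skip + limit]


all_posts = list(range(100))          # stand-in for 100 posts
third_page = get_page(all_posts, page=2, page_size=15)
last_page = get_page(all_posts, page=6, page_size=15)
```

With 100 items and a page size of 15, page 2 covers items 30 through 44, and page 6 is a short final page of 10 items.
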
  34. Migration: adding titles • Easy - just start adding them:
      post = {"author": "mike",
              "date": datetime.datetime.utcnow(),
              "text": "another blog post...",
              "tags": ["meetup", "python"],
              "title": "Document Oriented Dbs"}
      post_id = db.posts.save(post)
  35. Advanced queries
      • $gt, $lt, $gte, $lte, $ne, $all, $in, $nin
      • where()
        db.posts.find().where("this.author == 'mike'")
      • group()
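
As a way to read the operator list, here is a rough pure-Python approximation of how a query document is matched against one stored document. `matches` and `_apply_operator` are hypothetical helpers, not part of PyMongo, and they deliberately ignore many server details (nested fields, type ordering, operators applied element-wise to arrays). The sketch also covers the plain `{"tags": "mongodb"}` case, where a value matches an array field that contains it.

```python
import datetime
import operator

_ORDERING = {"$gt": operator.gt, "$lt": operator.lt,
             "$gte": operator.ge, "$lte": operator.le}


def _apply_operator(op, value, operand):
    """Evaluate one operator clause against a field value (None means missing)."""
    if op in _ORDERING:
        return value is not None and _ORDERING[op](value, operand)
    if op == "$ne":
        return value != operand
    if op == "$in":
        return value in operand
    if op == "$nin":
        return value not in operand
    if op == "$all":
        return isinstance(value, list) and all(item in value for item in operand)
    raise ValueError("unsupported operator: %s" % op)


def matches(document, query):
    """Return True if `document` satisfies every clause of `query`."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            if not all(_apply_operator(op, value, operand)
                       for op, operand in condition.items()):
                return False
        elif isinstance(value, list):
            # A plain value matches an array field that contains it.
            if condition != value and condition not in value:
                return False
        elif value != condition:
            return False
    return True


posted = datetime.datetime(2009, 7, 1)
post = {"author": "mike", "date": posted, "tags": ["mongodb", "python"]}
last_week = posted - datetime.timedelta(days=7)
recent_by_mike = matches(post, {"author": "mike", "date": {"$gt": last_week}})
```
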
  36. Other cool stuff • Capped collections • Unique indexes • Mongo shell • GridFS • MongoKit (on pypi)
  37. • Download MongoDB http://www.mongodb.org • Install PyMongo • Try it out!
  38. • http://www.mongodb.org • irc.freenode.net#mongodb • mongodb-user on google groups • @mongodb, @mdirolf • mike@10gen.com • http://www.slideshare.net/mdirolf