MongoDB + Pylons FTW: Scalable Web apps with Python & NoSQL

A talk by Catch.com's Director of Platform Engineering on their production deployment of Pylons and MongoDB. Comparison with MySQL and other NoSQL databases, some common gotchas, other tips and tricks.

MongoDB + Pylons FTW: Scalable Web apps with Python & NoSQL Presentation Transcript

  • MongoDB + Pylons FTW: Scalable Web apps with Python & NoSQL
    • Niall O'Higgins
  • MongoDB ??
    • Non-relational (NoSQL) document-oriented database
    • Rich query language
    • Flexible data model (JSON)
    • Sharding and replication with automatic failover
    • CAP write quorum support
  • MongoDB vs MySQL vs BDB
    • Feature Comparison
    Feature                      MongoDB  MySQL  BDB
    Master-Slave Replication     Yes      Yes    No
    Automatic master failover    Yes      No     No
    Ad-hoc Schemas               Yes      No     No
    Manual index specification   Yes      Yes    Yes
    Rich query language          Yes      Yes    No
    Joins                        No       Yes    No
  • MongoDB: NoSQL?
    • Closer to MySQL than to BDB/Tyrant/Redis
    • Less emphasis on Google-scale scalability at this point compared to e.g. Cassandra/HBase
    • Today, MongoDB scales slightly better than MySQL IMHO.
    • Main advantage over RDBMS is ease of development!
  • NoORM
    • A MongoDB document is a JSON object.
    • The PyMongo driver exposes these as dictionaries.
    • E.g. db.mycollection.insert(dict(myprop='foo!'))
    • You can even embed objects within a document:
    • E.g. db.mycollection.insert(
          dict(emails=[dict(email_lower='niallo@pywebsf.org')]))
  • NoORM II
    • You can query across properties, embedded or otherwise, easily.
    • E.g.
          r = db.mycollection.find_one(
              {'emails.email_lower': 'niallo@pywebsf.org'})
          print(r['emails'][0]['email_lower'])
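The dotted key in the query above has to be a quoted string in Python; written bare, `emails.email_lower` would be parsed as attribute access on an undefined name. As a rough illustration of how such a path reaches into a list of embedded documents, here is a pure-Python sketch. `resolve_path` and `matches` are hypothetical helpers written for this example, not part of PyMongo, and they only approximate the server's matching rules.

```python
def resolve_path(doc, dotted_key):
    """Collect every value reachable from doc along a dotted path.

    Lists of embedded documents are fanned out, so 'emails.email_lower'
    yields the email_lower of each entry in doc['emails'].
    """
    values = [doc]
    for part in dotted_key.split('.'):
        next_values = []
        for value in values:
            # Fan out over lists so each embedded document is inspected.
            candidates = value if isinstance(value, list) else [value]
            for candidate in candidates:
                if isinstance(candidate, dict) and part in candidate:
                    next_values.append(candidate[part])
        values = next_values
    return values


def matches(doc, query):
    """True if every dotted key in query has at least one equal value."""
    return all(expected in resolve_path(doc, key)
               for key, expected in query.items())


doc = dict(emails=[dict(email_lower='niallo@pywebsf.org')])
query = {'emails.email_lower': 'niallo@pywebsf.org'}  # note the quoted key
found = matches(doc, query)
```

Against a real server the same `query` dictionary would simply be passed to `find_one`; the helper only exists to make the path traversal visible.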
  • NoORM III
    • SQL queries can get very complicated (huge joins, aggregate queries, etc.)
    • Marshalling SQL to/from objects is painful and tedious.
    • Cross-RDBMS portability issues
    • -> ORMs can make sense with an RDBMS
    • At least in Python, I see absolutely no need for an ORM with MongoDB.
  • More on querying MongoDB
    • I alluded to a rich query language:
    • $lt, $lte, $gt, $gte == <, <=, >, >=
    • $ne == <>, !=
    • $in == IN
    • $or == OR
    • sort() == ORDER BY
    • limit() == LIMIT
    • skip() == OFFSET
    • group() == GROUP BY
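To make the operator-to-SQL mapping concrete, the sketch below evaluates a small subset of these operators against plain Python dictionaries. It is an approximation for illustration only: `matches`, `COMPARISONS` and the sample `people` data are made up for this example, no MongoDB server is involved, and corner cases (such as `$ne` on a missing field) may differ from the real server.

```python
import operator

# Comparison operators from the slide, mapped to Python equivalents.
COMPARISONS = {
    '$lt': operator.lt,
    '$lte': operator.le,
    '$gt': operator.gt,
    '$gte': operator.ge,
    '$ne': operator.ne,
    '$in': lambda value, options: value in options,
}


def matches(doc, query):
    """Evaluate a small subset of the query language against one dict."""
    for key, condition in query.items():
        if key == '$or':
            # '$or' takes a list of sub-queries; any one may match.
            if not any(matches(doc, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Operator document such as {'$gte': 18, '$lt': 65}.
            if key not in doc:
                return False
            if not all(COMPARISONS[op](doc[key], operand)
                       for op, operand in condition.items()):
                return False
        elif doc.get(key) != condition:
            # A plain value means equality.
            return False
    return True


people = [
    {'name': 'ann', 'age': 31, 'city': 'SF'},
    {'name': 'bob', 'age': 17, 'city': 'NYC'},
    {'name': 'cat', 'age': 70, 'city': 'LA'},
]

# Roughly: WHERE age >= 18 AND (city IN ('SF', 'LA') OR name = 'bob')
query = {
    'age': {'$gte': 18},
    '$or': [{'city': {'$in': ['SF', 'LA']}}, {'name': 'bob'}],
}

# sorted(...) and slicing stand in for sort(), skip() and limit().
selected = sorted((p for p in people if matches(p, query)),
                  key=lambda p: p['age'], reverse=True)
page = selected[0:1]  # skip 0, limit 1
```

The `query` dictionary has the same shape that would be handed to `find()`; only the evaluation is simulated here.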
  • Geo Features
      Geospatial querying out of the box:
    • $near (x, y): Return documents sorted by distance from point.
    • $within, $box/$center: Return documents within bounds of a rectangle/circle. Earth is flat in 1.6.x and lower. Spherical model in 1.7.0 and up.
    • Other limits: Only one geo index per collection. No sharding support.
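Why the flat versus spherical distinction matters can be shown with a little arithmetic. The sketch below compares a planar distance in degrees with a great-circle (haversine) distance for the same one-degree step in longitude at two latitudes. It illustrates the general difference between the two models and is not a description of MongoDB's internal implementation; the function names and the Earth radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def flat_distance(p, q):
    """Planar distance in degrees, treating (lat, lon) as plain x/y."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def spherical_distance_km(p, q):
    """Great-circle distance in km between (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude at the equator and at 60 degrees north.
equator = ((0.0, 0.0), (0.0, 1.0))
north = ((60.0, 0.0), (60.0, 1.0))

flat_equator = flat_distance(*equator)
flat_north = flat_distance(*north)
km_equator = spherical_distance_km(*equator)
km_north = spherical_distance_km(*north)
```

The flat model reports both steps as the same distance, while on a sphere the step at 60 degrees north covers roughly half the ground of the one at the equator, so "nearest" results can be ordered differently under the two models.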
  • More on querying MongoDB
    • Sample rich query:
    • db.foo.find({'$or': [{'owner_account_id': Id}, {'shared_to': Id}], 'location': {'$near': [37.783, -122.393]}}).skip(offset).limit(PAGE_SIZE)
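The sample query leaves `Id`, `offset` and `PAGE_SIZE` undefined. One way they might be produced is sketched below; `build_query`, `page_offset`, the page size of 20 and the `'account-42'` identifier are all hypothetical, chosen only to show how a 1-based page number turns into a `skip()` offset.

```python
PAGE_SIZE = 20  # hypothetical page size


def build_query(account_id, point):
    """Build the filter from the slide as a plain dictionary."""
    return {
        '$or': [{'owner_account_id': account_id},
                {'shared_to': account_id}],
        'location': {'$near': list(point)},
    }


def page_offset(page_number, page_size=PAGE_SIZE):
    """Offset to pass to skip() for a 1-based page number."""
    if page_number < 1:
        raise ValueError('page_number starts at 1')
    return (page_number - 1) * page_size


query = build_query('account-42', (37.783, -122.393))
offset = page_offset(3)
# With a PyMongo collection this would be used as:
#   db.foo.find(query).skip(offset).limit(PAGE_SIZE)
```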
  • MongoDB Gotchas
    • Obviously not as mature as any RDBMS on the market. You can find bugs, but developers are extremely responsive.
    • 4MB limit per document.
    • Cap on number of indexes per collection.
    • No transactions, although there is some imperfect support for atomic operations. Races are possible, e.g. when deleting a single embedded object from a list property.
    • -> Not the right fit for every system!
    • -> Think through your schema!
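The race mentioned above is easiest to see when two clients each read a document, edit the embedded list locally, and write the whole list back. The following sketch simulates that interleaving with plain dictionaries; no database is involved, and the addresses and helper names are invented for the example. MongoDB's `$pull` update operator is one candidate for doing this kind of removal on the server side, though the slides do not say whether it covers the case the speaker had in mind.

```python
import copy

# Server-side state, standing in for one stored document.
stored = {'emails': [{'email_lower': 'a@example.com'},
                     {'email_lower': 'b@example.com'},
                     {'email_lower': 'c@example.com'}]}


def without(emails, address):
    """Return the list minus the entry for address."""
    return [e for e in emails if e['email_lower'] != address]


# Read-modify-write: both clients read before either one writes.
client_1 = copy.deepcopy(stored)
client_2 = copy.deepcopy(stored)
client_1['emails'] = without(client_1['emails'], 'a@example.com')
client_2['emails'] = without(client_2['emails'], 'b@example.com')
racy = copy.deepcopy(stored)
racy['emails'] = client_1['emails']  # first write
racy['emails'] = client_2['emails']  # second write clobbers the first
racy_result = [e['email_lower'] for e in racy['emails']]

# Per-element removal applied to the current state, one after the other,
# which is the behaviour an atomic server-side update aims for.
safe = copy.deepcopy(stored)
for address in ('a@example.com', 'b@example.com'):
    safe['emails'] = without(safe['emails'], address)
safe_result = [e['email_lower'] for e in safe['emails']]
```

In the racy case the address removed by the first client reappears after the second write, which is the kind of lost update that schema design has to account for when there are no transactions.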
  • Pylons
    • Elegant, lightweight but fully-featured framework.
    • Perfect fit for the mixed API and Web rendering environment we have at Catch.com
    • Threaded design, easy to program. No callbacks needed. Works fine with synchronous libraries.
    • Threads in Python suck due to the GIL.
    • Multi-core systems necessitate multi-process operation.
    • Options: Apache+mod_wsgi or multiple Pasters.
  • mod_wsgi vs multiple Pasters
    • Apache+mod_wsgi is kind of a pain to set up and configure.
    • Past production experiences with mod_wsgi include memory leaks, CPU spinning.
    • Paster is simple, built into Pylons. One Paster server maps cleanly to one Python interpreter process.
    • LOLapps serve high-volume production traffic with Paster just fine.
  • Catch.com Paster
    • Multiple Pasters per machine, each listening on a separate port.
    • Nginx load balances between them all.
    • Nginx also terminates SSL.
    • No issues whatsoever handling the load today.
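A layout like this could be scripted along the following lines. This is a sketch under several assumptions that do not come from the slides: the config file is called `production.ini`, its `[server:main]` section reads the port as `port = %(http_port)s`, the ports start at 5000, and there are four workers. `paster serve` does accept `var=value` arguments after the config file, which is what the command list relies on.

```python
def paster_commands(config_file, base_port, count):
    """Build one 'paster serve' command line per worker process."""
    return [
        ['paster', 'serve', config_file, 'http_port=%d' % (base_port + i)]
        for i in range(count)
    ]


def upstream_servers(host, base_port, count):
    """Backend addresses a load balancer such as Nginx would point at."""
    return ['%s:%d' % (host, base_port + i) for i in range(count)]


commands = paster_commands('production.ini', 5000, 4)
backends = upstream_servers('127.0.0.1', 5000, 4)
# Each command could be started with subprocess.Popen(command) or, more
# typically, handed to a process supervisor.
```

The `backends` list corresponds to the servers that would be listed in the load balancer's upstream configuration, one entry per Paster process.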
  • Paster Debugging Tip
    • Configure smtp_server
    • Point it at a Google Mail list
    • Set up a Gmail label to filter messages into
    • Now you have nicely searchable, real-time stack traces aggregated from each Paster. Very useful for debugging distributed systems!
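For this tip to work, each Paster's .ini file needs its error-mail settings filled in. The sketch below checks an .ini excerpt for them before deployment. The setting names `email_to`, `smtp_server` and `error_email_from` follow the stock Pylons project template as far as I am aware; the addresses, the per-process sender address and the `missing_error_mail_settings` helper are hypothetical.

```python
import configparser

# Hypothetical excerpt of a Pylons .ini file; addresses are placeholders.
INI_TEXT = """
[DEFAULT]
debug = false
email_to = stacktraces@example.com
smtp_server = localhost
error_email_from = paster-5000@example.com
"""

REQUIRED = ('email_to', 'smtp_server', 'error_email_from')


def missing_error_mail_settings(ini_text):
    """Return the error-mail settings that are absent or empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    defaults = parser.defaults()
    return [name for name in REQUIRED if not defaults.get(name, '').strip()]


missing = missing_error_mail_settings(INI_TEXT)
```

Giving each process its own `error_email_from` value is one possible way to tell the Pasters apart once their stack traces land under the same mail label.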