Scaling Django

  • Transcript

    • 1. Scaling Django Web Apps • Mike Malone • EuroDjangoCon 2009
    • 2. Hi, I’m Mike.
    • 3. http://www.flickr.com/photos/kveton/2910536252/
    • 4. Pownce • Large scale • Hundreds of requests/sec • Thousands of DB operations/sec • Millions of user relationships • Millions of notes • Terabytes of static data
    • 5. Pownce • Encountered and eliminated many common scaling bottlenecks • Real-world example of scaling a Django app • Django provides a lot for free • I’ll be focusing on what you have to build yourself, and the rare places where Django got in the way
    • 6. Scalability
    • 7. Scalability: Scalability is NOT • Speed / performance • Generally affected by language choice • Achieved by adopting a particular technology
    • 8. A Scalable Application
          import time

          def application(environ, start_response):
              time.sleep(10)
              start_response('200 OK', [('content-type', 'text/plain')])
              return ('Hello, world!',)
    • 9. A High Performance Application
          def application(environ, start_response):
              remote_addr = environ['REMOTE_ADDR']
              f = open('access-log', 'a+')
              f.write(remote_addr + "\n")
              f.flush()
              f.seek(0)
              hits = sum(1 for l in f.xreadlines() if l.strip() == remote_addr)
              f.close()
              start_response('200 OK', [('content-type', 'text/plain')])
              return (str(hits),)
    • 10. Scalability: A scalable system doesn’t need to change when the size of the problem changes.
    • 11. Scalability • Accommodate increased usage • Accommodate increased data • Maintainable
    • 12. Scalability • Two kinds of scalability • Vertical scalability: buying more powerful hardware, replacing what you already own • Horizontal scalability: buying additional hardware, supplementing what you already own
    • 13. Vertical Scalability • Costs don’t scale linearly (a server that’s twice as fast costs more than twice as much) • Inherently limited by current technology • But it’s easy! If you can get away with it, good for you.
    • 14. Vertical Scalability: “Skyscrapers are special. Normal buildings don’t need 10-floor foundations. Just build!” - Cal Henderson
    • 15. Horizontal Scalability: The ability to increase a system’s capacity by adding more processing units (servers)
    • 16. Horizontal Scalability: It’s how large apps are scaled.
    • 17. Horizontal Scalability • A lot more work to design, build, and maintain • Requires some planning, but you don’t have to do all the work up front • You can scale progressively... • Rest of the presentation is roughly in order
    • 18. Caching
    • 19. Caching • Several levels of caching available in Django • Per-site cache: caches every page that doesn’t have GET or POST parameters • Per-view cache: caches output of an individual view • Template fragment cache: caches fragments of a template • None of these are that useful if pages are heavily personalized
    • 20. Caching • Low-level Cache API • Much more flexible, allows you to cache at any granularity • At Pownce we typically cached • Individual objects • Lists of object IDs • Hard part is invalidation
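The "individual objects plus lists of object IDs" pattern can be sketched without Django. The following is only an illustration under stated assumptions: `DictCache`, `NOTES`, `query_note_ids`, `query_notes`, and the key names are all hypothetical stand-ins, not Pownce's code or Django's cache API. The idea it shows is that the ID list and each object live under separate keys, so invalidating one object does not throw away the whole list.

```python
class DictCache:
    """In-memory stand-in for a cache client (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def get_many(self, keys):
        return {k: self._data[k] for k in keys if k in self._data}


# Hypothetical "database" of notes, plus a log of the queries run against it.
NOTES = {
    1: {"id": 1, "author": 7, "body": "first"},
    2: {"id": 2, "author": 7, "body": "second"},
    3: {"id": 3, "author": 8, "body": "other"},
}
db_queries = []


def query_note_ids(author_id):
    db_queries.append(("ids", author_id))
    return [pk for pk, row in sorted(NOTES.items()) if row["author"] == author_id]


def query_notes(ids):
    db_queries.append(("rows", tuple(ids)))
    return {pk: NOTES[pk] for pk in ids}


def notes_for_author(cache, author_id):
    # 1. The list of IDs is cached under one key.
    list_key = "note_ids_for_%s" % author_id
    ids = cache.get(list_key)
    if ids is None:
        ids = query_note_ids(author_id)
        cache.set(list_key, ids)
    # 2. Each object is cached under its own key; fetch only the misses.
    found = cache.get_many(["note_%s" % pk for pk in ids])
    missing = [pk for pk in ids if "note_%s" % pk not in found]
    if missing:
        for pk, row in query_notes(missing).items():
            cache.set("note_%s" % pk, row)
            found["note_%s" % pk] = row
    return [found["note_%s" % pk] for pk in ids]
```

With this layout, editing note 2 only requires deleting `note_2`; the next read refetches that single row and reuses everything else.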
    • 21. Caching • Cache backends: • Memcached • Database caching • Filesystem caching
    • 22. Caching: Use Memcache.
    • 23. Sessions: Use Memcache.
    • 24. Sessions: Or Tokyo Cabinet http://github.com/ericflo/django-tokyo-sessions/ Thanks @ericflo
    • 25. Caching: Basic caching comes free with Django:
          from django.core.cache import cache
          from django.db import models

          class UserProfile(models.Model):
              ...
              def get_social_network_profiles(self):
                  cache_key = 'networks_for_%s' % self.user.id
                  profiles = cache.get(cache_key)
                  if profiles is None:
                      profiles = self.user.social_network_profiles.all()
                      cache.set(cache_key, profiles)
                  return profiles
    • 26. Caching: Invalidate when a model is saved or deleted:
          from django.core.cache import cache
          from django.db.models import signals

          def nuke_social_network_cache(sender, instance, **kwargs):
              # Signal receivers are called with sender and instance; there is no self.
              cache_key = 'networks_for_%s' % instance.user_id
              cache.delete(cache_key)

          signals.post_save.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
          signals.post_delete.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
    • 27. Caching • Invalidate post_save, not pre_save • Still a small race condition • Simple solution, worked for Pownce: • Instead of deleting, set the cache key to None for a short period of time • Instead of using set to cache objects, use add, which fails if there’s already something stored for the key
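The race in question: a reader loads the old row, a writer then saves and invalidates, and the reader finally writes its stale copy back into the cache. A minimal sketch of the "set to None briefly, then add" fix follows. `FakeCache` is a hypothetical in-memory cache with a manual clock so the behaviour is deterministic; `TOMBSTONE_SECONDS`, `invalidate`, and `get_or_load` are illustrative names, and the five-second window is an assumption, not a figure from the talk.

```python
class FakeCache:
    """In-memory cache with memcached-like add semantics and a manual clock."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)
        self.now = 0.0   # advanced by hand to simulate time passing

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        if entry[1] <= self.now:
            del self._data[key]
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

    def set(self, key, value, timeout):
        self._data[key] = (value, self.now + timeout)

    def add(self, key, value, timeout):
        # Fails if anything, including a None placeholder, is stored for the key.
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


TOMBSTONE_SECONDS = 5  # assumed to exceed the length of the race window


def invalidate(cache, key):
    # Store a short-lived None instead of deleting the key.
    cache.set(key, None, TOMBSTONE_SECONDS)


def get_or_load(cache, key, loader, timeout=300):
    value = cache.get(key)
    if value is None:
        value = loader()
        # add is a no-op while the placeholder exists, so a stale value
        # loaded just before the write cannot be cached.
        cache.add(key, value, timeout)
    return value
```

While the placeholder lives, readers fall through to the database and simply do not cache what they read; once it expires, the first `add` wins.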
    • 28. Advanced Caching • Memcached’s atomic increment and decrement operations are useful for maintaining counts • But they’re not available in Django 1.0 • Added in 1.1 by ticket #6464
    • 29. Advanced Caching • You can still use them if you poke at the internals of the cache object a bit • cache._cache is the underlying cache object
          try:
              result = cache._cache.incr(cache_key, delta)
          except ValueError:  # nonexistent key raises ValueError
              # Do it the hard way, store the result.
          return result
    • 30. Advanced Caching • Other missing cache API • delete_multi & set_multi • append: add data to existing key after existing data • prepend: add data to existing key before existing data • cas: store this data, but only if no one has edited it since I fetched it
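To make the cas bullet concrete, here is a small check-and-set retry loop. `CasCache` is a hypothetical in-memory stand-in whose `gets`/`cas` methods only mimic the shape of memcached's operations (real clients differ in how they hand back the version token), and `append_id` is an illustrative helper.

```python
class CasCache:
    """In-memory stand-in that versions every key (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def set(self, key, value):
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def gets(self, key):
        # Return the value together with the version it was read at.
        return self._data.get(key, (None, 0))

    def cas(self, key, value, version):
        # Store only if nobody has written the key since `version` was read.
        if self._data.get(key, (None, 0))[1] != version:
            return False
        self._data[key] = (value, version + 1)
        return True


def append_id(cache, key, new_id, retries=5):
    """Append to a cached list of IDs without losing concurrent updates."""
    for _ in range(retries):
        ids, version = cache.gets(key)
        updated = (ids or []) + [new_id]
        if cache.cas(key, updated, version):
            return updated
    raise RuntimeError("too much contention on %r" % key)
```

A plain get followed by set would silently overwrite another writer's change; with cas the loser notices and retries against the fresh value.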
    • 31. Advanced Caching • It’s often useful to cache objects ‘forever’ (i.e., until you explicitly invalidate them) • User and UserProfile • fetched almost every request • rarely change • But Django won’t let you • IMO, this is a bug :(
    • 32. The Memcache Backend
          class CacheClass(BaseCache):
              def __init__(self, server, params):
                  BaseCache.__init__(self, params)
                  self._cache = memcache.Client(server.split(';'))

              def add(self, key, value, timeout=0):
                  if isinstance(value, unicode):
                      value = value.encode('utf-8')
                  # timeout=0 (memcached's "never expire") is falsy, so it is
                  # silently replaced by default_timeout here.
                  return self._cache.add(smart_str(key), value, timeout or self.default_timeout)
    • 33. The Memcache Backend
          class CacheClass(BaseCache):
              def __init__(self, server, params):
                  BaseCache.__init__(self, params)
                  self._cache = memcache.Client(server.split(';'))

              def add(self, key, value, timeout=None):
                  if isinstance(value, unicode):
                      value = value.encode('utf-8')
                  # Only fall back to the default when no timeout was given,
                  # so an explicit 0 reaches memcached.
                  if timeout is None:
                      timeout = self.default_timeout
                  return self._cache.add(smart_str(key), value, timeout)
    • 34. Advanced Caching • Typical setup has memcached running on web servers • Pownce web servers were I/O and memory bound, not CPU bound • Since we had some spare CPU cycles, we compressed large objects before caching them • The Python memcache library can do this automatically, but the API is not exposed
    • 35. Monkey Patching core.cache
          from django.core.cache import cache
          from django.utils.encoding import smart_str
          import inspect as i

          if 'min_compress_len' in i.getargspec(cache._cache.set)[0]:
              class CacheClass(cache.__class__):
                  def set(self, key, value, timeout=None, min_compress_len=150000):
                      if isinstance(value, unicode):
                          value = value.encode('utf-8')
                      if timeout is None:
                          timeout = self.default_timeout
                      return self._cache.set(smart_str(key), value, timeout, min_compress_len)
              cache.__class__ = CacheClass
    • 36. Advanced Caching • Useful tool: automagic single object cache • Use a manager to check the cache prior to any single object get by pk • Invalidate assets on save and delete • Eliminated several hundred QPS at Pownce
    • 37. Advanced Caching: All this and more at: http://github.com/mmalone/django-caching/
    • 38. Advanced Caching • Consistent hashing: hashes cached objects in such a way that most objects map to the same node after a node is added or removed. http://www.flickr.com/photos/deepfrozen/2191036528/
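A compact way to see consistent hashing is a hash ring with virtual points. The sketch below is one common construction, not necessarily what any particular memcached client implements; `HashRing`, the node names, the MD5 hash, and the `replicas=100` default are all illustrative assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual points per node."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node owns many points so load spreads evenly.
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (node, i))
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (node, i))
            del self._owner[point]
            self._points.remove(point)

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # A key belongs to the first point clockwise from its own hash.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

Adding a fourth node to a three-node ring moves roughly a quarter of the keys, and every key that moves lands on the new node. With naive `hash(key) % num_nodes` placement, most keys would change nodes and the cache would effectively be flushed.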
    • 39. Caching: Now that you’ve made life easier for your DB server, the next thing to fall over is your app server.
    • 40. Load Balancing
    • 41. Load Balancing • Out of the box, Django uses a shared nothing architecture • App servers have no single point of contention • Responsibility pushed down the stack (to DB) • This makes scaling the app layer trivial: just add another server
    • 42. Load Balancing: Spread work between multiple nodes in a cluster using a load balancer. • Hardware or software • Layer 7 or Layer 4 [Diagram: Load Balancer, App Servers, Database]
    • 43. Load Balancing • Hardware load balancers • Expensive, like $35,000 each, plus maintenance contracts • Need two for failover / high availability • Software load balancers • Cheap and easy, but more difficult to eliminate as a single point of failure • Lots of options: Perlbal, Pound, HAProxy, Varnish, Nginx
    • 44. Load Balancing • Most of these are layer 7 proxies, and some software balancers do cool things • Caching • Re-proxying • Authentication • URL rewriting
    • 45. Load Balancing: A common setup for large operations is to use redundant layer 4 hardware balancers in front of a pool of layer 7 software balancers. [Diagram: Hardware Balancers, Software Balancers, App Servers]
    • 46. Load Balancing • At Pownce, we used a single Perlbal balancer • Easily handled all of our traffic (hundreds of simultaneous connections) • A SPOF, but we didn’t have $100,000 for black box solutions, and weren’t worried about service guarantees beyond three or four nines • Plus there were some neat features that we took advantage of
    • 47. Perlbal Reproxying: Perlbal reproxying is a really cool, and really poorly documented, feature.
    • 48. Perlbal Reproxying 1. Perlbal receives request 2. Redirects to app server (a. App server checks auth, etc. b. Returns HTTP 200 with X-Reproxy-URL header set to internal file server URL) 3. File served from file server via Perlbal
    • 49. Perlbal Reproxying • Completely transparent to end user • Doesn’t keep large app server instance around to serve file • Users can’t access files directly (like they could with a 302)
    • 50. Perlbal Reproxying: Plus, it’s really easy:
          from django.http import HttpResponse

          def download(request, filename):
              # Check auth, do your thing
              response = HttpResponse()
              response['X-REPROXY-URL'] = '%s/%s' % (FILE_SERVER, filename)
              return response
    • 51. Load Balancing: Best way to reduce load on your app servers: don’t use them to do hard stuff.
    • 52. Queuing
    • 53. Queuing • A queue is simply a bucket that holds messages until they are removed for processing by clients • Many expensive operations can be queued and performed asynchronously • User experience doesn’t have to suffer • Tell the user that you’re running the job in the background (e.g., transcoding) • Make it look like the job was done real-time (e.g., note distribution)
    • 54. Queuing • Lots of open source options for queuing • Ghetto Queue (MySQL + Cron) • this is the official name. • Gearman • TheSchwartz • RabbitMQ • Apache ActiveMQ • ZeroMQ
    • 55. Queuing • Lots of fancy features: brokers, exchanges, routing keys, bindings... • Don’t let that crap get you down, this is really simple stuff • Biggest decision: persistence • Does your queue need to be durable and persistent, able to survive a crash? • This requires logging to disk which slows things down, so don’t do it unless you have to
    • 56. Queuing • Pownce used a simple ghetto queue built on MySQL / cron • Problematic if you have multiple consumers pulling jobs from the queue • No point in reinventing the wheel, there are dozens of battle-tested open source queues to choose from
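The multiple-consumer problem with a database-backed queue is that two workers can select the same pending row. One common workaround is to claim a job with a conditional UPDATE and check how many rows it changed. The sketch below uses the standard-library `sqlite3` module purely as a stand-in for MySQL; the `jobs` table, its columns, and the function names are hypothetical and are not a description of Pownce's queue.

```python
import sqlite3


def make_queue(conn):
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, payload TEXT, claimed_by TEXT)"
    )


def enqueue(conn, payload):
    with conn:  # commits on success
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim_job(conn, worker):
    """Return (id, payload) for a job this worker now owns, or None."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        job_id, payload = row
        with conn:
            # The extra "claimed_by IS NULL" makes the claim conditional:
            # if another worker got there first, no row is updated.
            cur = conn.execute(
                "UPDATE jobs SET claimed_by = ? "
                "WHERE id = ? AND claimed_by IS NULL",
                (worker, job_id),
            )
        if cur.rowcount == 1:
            return job_id, payload
        # Lost the race for this job; loop and try the next one.
```

This keeps two consumers from running the same job, but it still polls the table, which is part of why a purpose-built queue tends to be the better choice.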
    • 57. Django Standalone Scripts: Consumers need to set up the Django environment
          from django.core.management import setup_environ
          from mysite import settings
          setup_environ(settings)
    • 58. THE DATABASE!
    • 59. The Database • Until now we’ve been talking about • Shared nothing • Pushing problems down the stack • But we have to store a persistent and consistent view of our application’s state somewhere • Enter, the database...
    • 60. CAP Theorem • Three properties of a shared-data system • Consistency: all clients see the same data • Availability: all clients can see some version of the data • Partition Tolerance: system properties hold even when the system is partitioned & messages are lost • But you can only have two
    • 61. CAP Theorem • Big long proof... here’s my version. • Empirically, seems to make sense. • Eric Brewer • Professor at University of California, Berkeley • Co-founder and Chief Scientist of Inktomi • Probably smarter than me
    • 62. CAP Theorem • The relational database systems we all use were built with consistency as their primary goal • But at scale our system needs to have high availability and must be partitionable • The RDBMS’s consistency requirements get in our way • Most sharding / federation schemes are kludges that trade consistency for availability & partition tolerance
    • 63. The Database • There are lots of non-relational databases coming onto the scene • CouchDB • Cassandra • Tokyo Cabinet • But they’re not that mature, and they aren’t easy to use with Django
    • 64. The Database • Django has no support for • Non-relational databases like CouchDB • Multiple databases (coming soon?) • If you’re looking for a project, plz fix this. • Only advice: don’t get too caught up in trying to duplicate the existing ORM API
    • 65. I Want a Pony • Save always saves every field of a model • Causes unnecessary contention and more data transfer • A better way: • Use descriptors to determine what’s dirty • Only update dirty fields when an object is saved
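The descriptor idea can be sketched in plain Python. This is an illustration of the technique only, not Django's ORM: `TrackedField`, `Profile`, the `profile` table name, and `build_update` are hypothetical, and a real implementation would hook into model fields and the save path.

```python
_MISSING = object()


class TrackedField:
    """Descriptor that records which attributes changed since the last save."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # Mark the field dirty only when the value actually changes.
        if instance.__dict__.get(self.name, _MISSING) != value:
            instance.__dict__.setdefault("_dirty", set()).add(self.name)
        instance.__dict__[self.name] = value


class Profile:
    table = "profile"  # hypothetical table name
    name = TrackedField()
    bio = TrackedField()

    def __init__(self, pk, name, bio):
        self.pk = pk
        self.name = name
        self.bio = bio
        self._dirty = set()  # a freshly loaded object has nothing to save

    def build_update(self):
        """Return (sql, params) covering only the dirty columns, or None."""
        if not self._dirty:
            return None
        cols = sorted(self._dirty)
        assignments = ", ".join("%s = ?" % c for c in cols)
        sql = "UPDATE %s SET %s WHERE id = ?" % (self.table, assignments)
        params = [getattr(self, c) for c in cols] + [self.pk]
        self._dirty.clear()
        return sql, params
```

Changing only `bio` produces an UPDATE that touches only `bio`, so a concurrent change to `name` by another process is not overwritten.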
    • 66. Denormalization
    • 67. Denormalization • Django encourages normalized data, which is usually good • But at scale you need to denormalize • Corollary: joins are evil • Django makes it really easy to do joins using the ORM, so pay attention
    • 68. Denormalization • Start with a normalized database • Selectively denormalize things as they become bottlenecks • Denormalized counts, copied fields, etc. can be updated in signal handlers
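A denormalized count kept up to date by signal handlers might look roughly like the following. To stay runnable without Django, the sketch defines a tiny `Signal` class that only imitates the connect/send shape of Django's signals; `User`, `Note`, `note_count`, and the handler names are hypothetical.

```python
class Signal:
    """Tiny stand-in for a signal: receivers are called on send()."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        for receiver in self._receivers:
            receiver(sender=sender, **kwargs)


post_save = Signal()
post_delete = Signal()


class User:
    def __init__(self, name):
        self.name = name
        self.note_count = 0  # denormalized: avoids COUNT(*) over notes


class Note:
    def __init__(self, author, body):
        self.author = author
        self.body = body
        self.saved = False

    def save(self):
        created = not self.saved
        self.saved = True
        post_save.send(sender=Note, instance=self, created=created)

    def delete(self):
        self.saved = False
        post_delete.send(sender=Note, instance=self)


def bump_note_count(sender, instance, created=False, **kwargs):
    # Only a newly created note changes the count; edits do not.
    if created:
        instance.author.note_count += 1


def drop_note_count(sender, instance, **kwargs):
    instance.author.note_count -= 1


post_save.connect(bump_note_count)
post_delete.connect(drop_note_count)
```

In a real database the increment should be done atomically in SQL rather than by reading and rewriting the value in Python, since two app servers may handle saves for the same user at once.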
    • 69. Replication
    • 70. Replication • Typical web app is 80 to 90% reads • Adding read capacity will get you a long way • MySQL Master-Slave replication [Diagram: master handles reads & writes; slaves are read only]
    • 71. Replication • Django doesn’t make it easy to use multiple database connections, but it is possible • Some caveats • Slave lag interacts with caching in weird ways • You can only save to your primary DB (the one you configure in settings.py) • Unless you get really clever...
    • 72. Replication: 1. Create a custom database wrapper by subclassing DatabaseWrapper
          class SlaveDatabaseWrapper(DatabaseWrapper):
              def _cursor(self, settings):
                  if not self._valid_connection():
                      kwargs = {
                          'conv': django_conversions,
                          'charset': 'utf8',
                          'use_unicode': True,
                      }
                      # Merge the chosen slave's settings into the defaults above
                      # (assumes pick_random_slave returns a dict of connection parameters).
                      kwargs.update(pick_random_slave(settings.SLAVE_DATABASES))
                      self.connection = Database.connect(**kwargs)
                      ...
                  cursor = CursorWrapper(self.connection.cursor())
                  return cursor
    • 73. Replication: 2. Custom QuerySet that uses primary DB for writes
          class MultiDBQuerySet(QuerySet):
              ...
              def update(self, **kwargs):
                  slave_conn = self.query.connection
                  self.query.connection = default_connection
                  try:
                      return super(MultiDBQuerySet, self).update(**kwargs)
                  finally:
                      # Switch back to the slave connection even if the update fails.
                      self.query.connection = slave_conn
    • 74. Replication: 3. Custom Manager that uses your custom QuerySet
          class SlaveDatabaseManager(db.models.Manager):
              def get_query_set(self):
                  return MultiDBQuerySet(self.model, query=self.create_query())

              def create_query(self):
                  return db.models.sql.Query(self.model, connection)
    • 75. Replication: Example on github: http://github.com/mmalone/django-multidb/
    • 76. Replication • Goal: • Read-what-you-write consistency for writer • Eventual consistency for everyone else • Slave lag screws things up
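One simple way to approximate read-what-you-write consistency is to pin a user's reads to the primary for a short window after that user writes. The sketch below is an assumption-laden illustration, not the django-multidb approach: `ReplicaRouter`, the connection names, and the five-second `pin_seconds` (which would need to exceed typical slave lag) are all hypothetical.

```python
import random
import time


class ReplicaRouter:
    """Send a user's reads to the primary shortly after that user writes."""

    def __init__(self, primary, replicas, pin_seconds=5.0, clock=time.monotonic):
        self.primary = primary
        self.replicas = list(replicas)
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # user_id -> time of that user's last write

    def connection_for_write(self, user_id):
        self._last_write[user_id] = self._clock()
        return self.primary

    def connection_for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            # The replicas may not have this user's write yet.
            return self.primary
        # Everyone else gets eventual consistency from a replica.
        return random.choice(self.replicas)
```

In a multi-server deployment the "last write" timestamp would have to live somewhere shared, such as the user's session or memcached, rather than in process memory.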
    • 77. Replication: What happens when you become write saturated?
    • 78. Federation
    • 79. Federation • Start with Vertical Partitioning: split tables that aren’t joined across database servers • Actually pretty easy • Except not with Django
    • 80. Federation: django.db.models.base FAIL!
    • 81. Federation: If the Django pony gets kicked every time someone uses {% endifnotequal %} I don’t want to know what happens every time django.db.connection is imported. http://www.flickr.com/photos/captainmidnight/811458621/
    • 82. Federation • At some point you’ll need to split a single table across databases (e.g., user table) • Now auto-increment won’t work • But Django uses auto-increment for PKs • ugh • Pluggable UUID backend?
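Once a table is split across databases, primary keys have to be unique without a shared auto-increment counter. A minimal sketch of the UUID idea follows; the shard names and the modulo placement are illustrative assumptions, not a scheme described in the talk.

```python
import uuid

# Hypothetical database aliases, one per shard of the user table.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def new_user_id():
    # Random UUIDs can be generated on any app server with no coordination.
    return uuid.uuid4().hex


def shard_for(user_id, shards=SHARDS):
    # The ID itself decides where the row lives, so no lookup table is needed.
    return shards[int(user_id, 16) % len(shards)]
```

Plain modulo placement means changing the number of shards relocates most rows, so a directory table or consistent hashing is usually layered on top when shards are expected to be added.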
    • 83. Profiling, Monitoring & Measuring
    • 84. Know your SQL
          >>> Article.objects.filter(pk=3).query.as_sql()
          ('SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article" WHERE "app_article"."id" = %s ', (3,))
    • 85. Know your SQL
          >>> import sqlparse
          >>> def pp_query(qs):
          ...     t = qs.query.as_sql()
          ...     sql = t[0] % t[1]
          ...     print sqlparse.format(sql, reindent=True, keyword_case='upper')
          ...
          >>> pp_query(Article.objects.filter(pk=3))
          SELECT "app_article"."id",
                 "app_article"."name",
                 "app_article"."author_id"
          FROM "app_article"
          WHERE "app_article"."id" = 3
    • 86. Know your SQL
          >>> from django.db import connection
          >>> connection.queries
          [{'time': '0.001', 'sql': u'SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article"'}]
    • 87. Know your SQL • It’d be nice if a lightweight stacktrace could be done in QuerySet.__init__ • Stick the result in connection.queries • Now we know where the query originated
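The "lightweight stacktrace" idea can be tried outside Django with the standard `traceback` module. In this sketch `queries` is a plain list standing in for connection.queries, and `record_query` and `load_article` are hypothetical names; nothing here hooks into QuerySet itself.

```python
import traceback

queries = []  # stand-in for connection.queries


def record_query(sql, depth=3):
    # extract_stack() lists frames oldest first; the last one is this
    # function, so drop it and keep the nearest `depth` callers.
    stack = traceback.extract_stack()[:-1]
    origin = ["%s:%d in %s" % (f.filename, f.lineno, f.name) for f in stack[-depth:]]
    queries.append({"sql": sql, "origin": origin})


def load_article(pk):
    # Hypothetical data-access function that issues one query.
    record_query('SELECT "id", "name" FROM "app_article" WHERE "id" = %d' % pk)
```

After calling `load_article(3)`, the last entry in `queries` carries the SQL plus the file, line, and function names of the code that triggered it. Capturing a stack on every query has a cost, so this is the kind of thing to enable only while debugging.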
    • 88. Measuring: Django Debug Toolbar http://github.com/robhudson/django-debug-toolbar/
    • 89. Monitoring: You can’t improve what you don’t measure. • Ganglia • Munin
    • 90. Measuring & Monitoring • Measure • Server load, CPU usage, I/O • Database QPS • Memcache QPS, hit rate, evictions • Queue lengths • Anything else interesting
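As one example of turning these into numbers, a cache hit rate can be tracked by wrapping the cache client. This is a hypothetical application-side sketch (`InstrumentedCache` wraps a plain dict here); memcached also reports its own hit, miss, and eviction counters through its stats command, which is what a tool like Ganglia or Munin would typically graph.

```python
class InstrumentedCache:
    """Wrap a dict-like backend and count cache hits and misses."""

    def __init__(self, backend):
        self._backend = backend
        self.hits = 0
        self.misses = 0

    def get(self, key):
        value = self._backend.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def set(self, key, value):
        self._backend[key] = value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A falling hit rate after a deploy is often the first visible sign that a cache key changed or that invalidation has become too aggressive.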
    • 91. All done... Questions?
    • 92. Contact Me • Mike Malone • mjmalone@gmail.com • twitter.com/mjmalone
