Scaling Django

by Mike Malone, presented at EuroDjangoCon

    Scaling Django Presentation Transcript

    • Scaling Django Web Apps Mike Malone euro con 2009 Tuesday, May 5, 2009
    • Hi, I’m Mike.
    • http://www.flickr.com/photos/kveton/2910536252/
    • Pownce
        • Large scale
        • Hundreds of requests/sec
        • Thousands of DB operations/sec
        • Millions of user relationships
        • Millions of notes
        • Terabytes of static data
    • Pownce
        • Encountered and eliminated many common scaling bottlenecks
        • Real world example of scaling a Django app
        • Django provides a lot for free
        • I’ll be focusing on what you have to build yourself, and the rare places where Django got in the way
    • Scalability
    • Scalability
        Scalability is NOT:
        • Speed / Performance
        • Generally affected by language choice
        • Achieved by adopting a particular technology
    • A Scalable Application

        import time

        def application(environ, start_response):
            time.sleep(10)
            start_response('200 OK', [('content-type', 'text/plain')])
            return ('Hello, world!',)

    • A High Performance Application

        def application(environ, start_response):
            remote_addr = environ['REMOTE_ADDR']
            f = open('access-log', 'a+')
            f.write(remote_addr + "\n")
            f.flush()
            f.seek(0)
            hits = sum(1 for l in f.xreadlines() if l.strip() == remote_addr)
            f.close()
            start_response('200 OK', [('content-type', 'text/plain')])
            return (str(hits),)

    • Scalability
        A scalable system doesn’t need to change when the size of the problem changes.
    • Scalability
        • Accommodate increased usage
        • Accommodate increased data
        • Maintainable
    • Scalability
        • Two kinds of scalability
            • Vertical scalability: buying more powerful hardware, replacing what you already own
            • Horizontal scalability: buying additional hardware, supplementing what you already own
    • Vertical Scalability
        • Costs don’t scale linearly (a server that’s twice as fast costs more than twice as much)
        • Inherently limited by current technology
        • But it’s easy! If you can get away with it, good for you.
    • Vertical Scalability
        “Skyscrapers are special. Normal buildings don’t need 10 floor foundations. Just build!” - Cal Henderson
    • Horizontal Scalability
        The ability to increase a system’s capacity by adding more processing units (servers)
    • Horizontal Scalability
        It’s how large apps are scaled.
    • Horizontal Scalability
        • A lot more work to design, build, and maintain
        • Requires some planning, but you don’t have to do all the work up front
        • You can scale progressively...
        • Rest of the presentation is roughly in order
    • Caching
    • Caching
        • Several levels of caching available in Django
            • Per-site cache: caches every page that doesn’t have GET or POST parameters
            • Per-view cache: caches output of an individual view
            • Template fragment cache: caches fragments of a template
        • None of these are that useful if pages are heavily personalized
    • Caching
        • Low-level Cache API
            • Much more flexible, allows you to cache at any granularity
        • At Pownce we typically cached
            • Individual objects
            • Lists of object IDs
        • Hard part is invalidation
    • Caching
        • Cache backends:
            • Memcached
            • Database caching
            • Filesystem caching
    • Caching
        Use Memcache.
    • Sessions
        Use Memcache.
    • Sessions
        Or Tokyo Cabinet
        http://github.com/ericflo/django-tokyo-sessions/
        Thanks @ericflo
    • Caching
        Basic caching comes free with Django:

        from django.core.cache import cache
        from django.db import models

        class UserProfile(models.Model):
            ...
            def get_social_network_profiles(self):
                cache_key = 'networks_for_%s' % self.user.id
                profiles = cache.get(cache_key)
                if profiles is None:
                    profiles = self.user.social_network_profiles.all()
                    cache.set(cache_key, profiles)
                return profiles

    • Caching
        Invalidate when a model is saved or deleted:

        from django.core.cache import cache
        from django.db.models import signals

        # Signal receivers are called with the sender class and the affected instance.
        def nuke_social_network_cache(sender, instance, **kwargs):
            cache_key = 'networks_for_%s' % instance.user_id
            cache.delete(cache_key)

        signals.post_save.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
        signals.post_delete.connect(nuke_social_network_cache, sender=SocialNetworkProfile)

    • Caching
        • Invalidate post_save, not pre_save
        • Still a small race condition
        • Simple solution, worked for Pownce:
            • Instead of deleting, set the cache key to None for a short period of time
            • Instead of using set to cache objects, use add, which fails if there’s already something stored for the key
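The "set to None, then add" trick above can be sketched roughly as follows. This is a minimal Python 3 illustration, not Pownce's code: `TinyCache` is a hypothetical in-memory stand-in for a memcached-style client, and `TOMBSTONE_SECONDS`, `invalidate`, and `fetch` are names invented for the example.

```python
import time


class TinyCache:
    """Hypothetical in-memory stand-in exposing get/set/add like a memcached client."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key):
        # Return the stored (value, expires_at) pair, or None if missing or expired.
        item = self._data.get(key)
        if item is None:
            return None
        if item[1] <= time.monotonic():
            del self._data[key]
            return None
        return item

    def get(self, key):
        item = self._live(key)
        return item[0] if item else None

    def set(self, key, value, timeout):
        self._data[key] = (value, time.monotonic() + timeout)

    def add(self, key, value, timeout):
        # Store only if no live entry exists; a stored None still counts as an entry.
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


TOMBSTONE_SECONDS = 5  # assumed window; should exceed the time a stale reader needs


def invalidate(cache, key):
    # Instead of deleting, park a short-lived None so late writers cannot re-add stale data.
    cache.set(key, None, TOMBSTONE_SECONDS)


def fetch(cache, key, load, timeout=300):
    value = cache.get(key)
    if value is None:
        value = load()
        cache.add(key, value, timeout)  # silently fails while the tombstone is live
    return value
```

While the tombstone is live every reader falls through to the database, so the window should be kept short.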
    • Advanced Caching
        • Memcached’s atomic increment and decrement operations are useful for maintaining counts
        • But they’re not available in Django 1.0
            • Added in 1.1 by ticket #6464
    • Advanced Caching
        • You can still use them if you poke at the internals of the cache object a bit
        • cache._cache is the underlying cache object

        try:
            result = cache._cache.incr(cache_key, delta)
        except ValueError:  # nonexistent key raises ValueError
            # Do it the hard way, store the result.
        return result

    • Advanced Caching
        • Other missing cache API
            • delete_multi & set_multi
            • append: add data to existing key after existing data
            • prepend: add data to existing key before existing data
            • cas: store this data, but only if no one has edited it since I fetched it
    • Advanced Caching
        • It’s often useful to cache objects ‘forever’ (i.e., until you explicitly invalidate them)
            • User and UserProfile
                • fetched almost every request
                • rarely change
        • But Django won’t let you
        • IMO, this is a bug :(
    • The Memcache Backend
        As shipped, a timeout of 0 is falsy, so it is replaced by the default timeout:

        class CacheClass(BaseCache):
            def __init__(self, server, params):
                BaseCache.__init__(self, params)
                self._cache = memcache.Client(server.split(';'))

            def add(self, key, value, timeout=0):
                if isinstance(value, unicode):
                    value = value.encode('utf-8')
                return self._cache.add(smart_str(key), value,
                                       timeout or self.default_timeout)

    • The Memcache Backend
        Falling back to the default only when timeout is None lets 0 reach memcached, where it means no expiry:

        class CacheClass(BaseCache):
            def __init__(self, server, params):
                BaseCache.__init__(self, params)
                self._cache = memcache.Client(server.split(';'))

            def add(self, key, value, timeout=None):
                if isinstance(value, unicode):
                    value = value.encode('utf-8')
                if timeout is None:
                    timeout = self.default_timeout
                return self._cache.add(smart_str(key), value, timeout)

    • Advanced Caching
        • Typical setup has memcached running on web servers
        • Pownce web servers were I/O and memory bound, not CPU bound
        • Since we had some spare CPU cycles, we compressed large objects before caching them
        • The Python memcache library can do this automatically, but the API is not exposed
    • Monkey Patching core.cache

        from django.core.cache import cache
        from django.utils.encoding import smart_str
        import inspect as i

        if 'min_compress_len' in i.getargspec(cache._cache.set)[0]:
            class CacheClass(cache.__class__):
                def set(self, key, value, timeout=None, min_compress_len=150000):
                    if isinstance(value, unicode):
                        value = value.encode('utf-8')
                    if timeout is None:
                        timeout = self.default_timeout
                    return self._cache.set(smart_str(key), value, timeout, min_compress_len)
            cache.__class__ = CacheClass

    • Advanced Caching
        • Useful tool: automagic single object cache
            • Use a manager to check the cache prior to any single object get by pk
            • Invalidate assets on save and delete
            • Eliminated several hundred QPS at Pownce
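The single-object cache idea can be illustrated without Django. The sketch below is a Python 3 approximation under stated assumptions: `SingleObjectCache` is a hypothetical class standing in for a custom manager, a plain dict stands in for memcached, and `load_from_db` is whatever callable performs the real lookup by primary key. In a Django project the `invalidate` call would presumably be wired to `post_save` and `post_delete` handlers.

```python
class SingleObjectCache:
    """Hypothetical stand-in for a manager that caches get-by-pk lookups."""

    def __init__(self, load_from_db, cache=None, prefix='obj'):
        self.load_from_db = load_from_db              # callable: pk -> object
        self.cache = {} if cache is None else cache   # dict standing in for memcached
        self.prefix = prefix

    def _key(self, pk):
        return '%s:%s' % (self.prefix, pk)

    def get(self, pk):
        # Check the cache first; only hit the database on a miss.
        key = self._key(pk)
        obj = self.cache.get(key)
        if obj is None:
            obj = self.load_from_db(pk)
            self.cache[key] = obj
        return obj

    def invalidate(self, pk):
        # Call this whenever the object is saved or deleted.
        self.cache.pop(self._key(pk), None)
```

Because every lookup by primary key goes through one place, invalidation only has to know the key format, which is what keeps this pattern manageable.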
    • Advanced Caching
        All this and more at: http://github.com/mmalone/django-caching/
    • Advanced Caching
        • Consistent hashing: hashes cached objects in such a way that most objects map to the same node after a node is added or removed.
        http://www.flickr.com/photos/deepfrozen/2191036528/
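A consistent hash ring is small enough to sketch in full. The version below is a generic Python 3 illustration, not the implementation used by any particular memcached client; the node names, the `replicas` count, and the choice of MD5 for placing points on the ring are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes (replicas)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._positions = []  # sorted hash positions on the ring
        self._owner = {}      # position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode('utf-8')).hexdigest(), 16)

    def add_node(self, node):
        # Each node owns many points on the ring, which evens out the key distribution.
        for i in range(self.replicas):
            pos = self._hash('%s#%d' % (node, i))
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash('%s#%d' % (node, i))
            del self._owner[pos]
            self._positions.remove(pos)

    def get_node(self, key):
        if not self._positions:
            raise LookupError('ring is empty')
        # The first ring position clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._positions, self._hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]
```

Adding a node only takes over the keys that now fall just before its points on the ring; every other key keeps its old node, so most of the cache stays warm.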
    • Caching
        Now you’ve made life easier for your DB server, next thing to fall over: your app server.
    • Load Balancing
    • Load Balancing
        • Out of the box, Django uses a shared nothing architecture
            • App servers have no single point of contention
            • Responsibility pushed down the stack (to DB)
        • This makes scaling the app layer trivial: just add another server
    • Load Balancing
        Spread work between multiple nodes in a cluster using a load balancer.
        • Hardware or software
        • Layer 7 or Layer 4
        [Diagram: Load Balancer -> App Servers -> Database]
    • Load Balancing
        • Hardware load balancers
            • Expensive, like $35,000 each, plus maintenance contracts
            • Need two for failover / high availability
        • Software load balancers
            • Cheap and easy, but more difficult to eliminate as a single point of failure
            • Lots of options: Perlbal, Pound, HAProxy, Varnish, Nginx
    • Load Balancing
        • Most of these are layer 7 proxies, and some software balancers do cool things
            • Caching
            • Re-proxying
            • Authentication
            • URL rewriting
    • Load Balancing
        A common setup for large operations is to use redundant layer 4 hardware balancers in front of a pool of layer 7 software balancers.
        [Diagram: Hardware Balancers -> Software Balancers -> App Servers]
    • Load Balancing
        • At Pownce, we used a single Perlbal balancer
            • Easily handled all of our traffic (hundreds of simultaneous connections)
            • A SPOF, but we didn’t have $100,000 for black box solutions, and weren’t worried about service guarantees beyond three or four nines
            • Plus there were some neat features that we took advantage of
    • Perlbal Reproxying
        Perlbal reproxying is a really cool, and really poorly documented feature.
    • Perlbal Reproxying
        1. Perlbal receives request
        2. Redirects to App Server
            1. App server checks auth (etc.)
            2. Returns HTTP 200 with X-Reproxy-URL header set to internal file server URL
        3. File served from file server via Perlbal
    • Perlbal Reproxying
        • Completely transparent to end user
        • Doesn’t keep large app server instance around to serve file
        • Users can’t access files directly (like they could with a 302)
    • Perlbal Reproxying
        Plus, it’s really easy:

        from django.http import HttpResponse

        def download(request, filename):
            # Check auth, do your thing
            response = HttpResponse()
            response['X-REPROXY-URL'] = '%s/%s' % (FILE_SERVER, filename)
            return response

    • Load Balancing
        Best way to reduce load on your app servers: don’t use them to do hard stuff.
    • Queuing
    • Queuing
        • A queue is simply a bucket that holds messages until they are removed for processing by clients
        • Many expensive operations can be queued and performed asynchronously
        • User experience doesn’t have to suffer
            • Tell the user that you’re running the job in the background (e.g., transcoding)
            • Make it look like the job was done real-time (e.g., note distribution)
    • Queuing
        • Lots of open source options for queuing
            • Ghetto Queue (MySQL + Cron)
                • this is the official name.
            • Gearman
            • TheSchwartz
            • RabbitMQ
            • Apache ActiveMQ
            • ZeroMQ
    • Queuing
        • Lots of fancy features: brokers, exchanges, routing keys, bindings...
            • Don’t let that crap get you down, this is really simple stuff
        • Biggest decision: persistence
            • Does your queue need to be durable and persistent, able to survive a crash?
            • This requires logging to disk which slows things down, so don’t do it unless you have to
    • Queuing
        • Pownce used a simple ghetto queue built on MySQL / cron
            • Problematic if you have multiple consumers pulling jobs from the queue
        • No point in reinventing the wheel, there are dozens of battle-tested open source queues to choose from
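The multiple-consumer problem mentioned above comes down to making sure each job is handed to exactly one worker. The Python 3 sketch below shows that property in miniature using the standard library's thread-safe `queue.Queue`; it is an in-process illustration only, and `run_workers` and its parameters are names invented here. A real deployment would put one of the queue servers listed above between producers and consumers.

```python
import queue
import threading


def run_workers(jobs, handler, num_workers=3):
    """Process every job exactly once using several consumer threads."""
    q = queue.Queue()
    for job in jobs:
        q.put(job)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                # get_nowait() is atomic: no two workers can receive the same job.
                job = q.get_nowait()
            except queue.Empty:
                return
            outcome = handler(job)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A cron job polling a MySQL table has to build the same "claim a job atomically" guarantee by hand, which is where the trouble with multiple consumers usually starts.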
    • Django Standalone Scripts
        Consumers need to set up the Django environment:

        from django.core.management import setup_environ
        from mysite import settings

        setup_environ(settings)

    • THE DATABASE!
    • The Database
        • Til now we’ve been talking about
            • Shared nothing
            • Pushing problems down the stack
        • But we have to store a persistent and consistent view of our application’s state somewhere
        • Enter, the database...
    • CAP Theorem
        • Three properties of a shared-data system
            • Consistency: all clients see the same data
            • Availability: all clients can see some version of the data
            • Partition Tolerance: system properties hold even when the system is partitioned & messages are lost
        • But you can only have two
    • CAP Theorem
        • Big long proof... here’s my version.
            • Empirically, seems to make sense.
        • Eric Brewer
            • Professor at University of California, Berkeley
            • Co-founder and Chief Scientist of Inktomi
            • Probably smarter than me
    • CAP Theorem
        • The relational database systems we all use were built with consistency as their primary goal
        • But at scale our system needs to have high availability and must be partitionable
            • The RDBMS’s consistency requirements get in our way
            • Most sharding / federation schemes are kludges that trade consistency for availability & partition tolerance
    • The Database
        • There are lots of non-relational databases coming onto the scene
            • CouchDB
            • Cassandra
            • Tokyo Cabinet
        • But they’re not that mature, and they aren’t easy to use with Django
    • The Database
        • Django has no support for
            • Non-relational databases like CouchDB
            • Multiple databases (coming soon?)
        • If you’re looking for a project, plz fix this.
        • Only advice: don’t get too caught up in trying to duplicate the existing ORM API
    • I Want a Pony
        • Save always saves every field of a model
            • Causes unnecessary contention and more data transfer
        • A better way:
            • Use descriptors to determine what’s dirty
            • Only update dirty fields when an object is saved
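The descriptor idea can be tried out in plain Python 3 without touching the ORM. In the sketch below, `TrackedField` and `Note` are hypothetical names, and `save()` simply returns the changed fields where a real model would issue an UPDATE limited to those columns.

```python
class TrackedField:
    """Descriptor that records which attributes changed since the last save."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # Only mark the field dirty when the value actually changes.
        if obj.__dict__.get(self.name) != value:
            obj._dirty.add(self.name)
        obj.__dict__[self.name] = value


class Note:
    title = TrackedField()
    body = TrackedField()

    def __init__(self, title, body):
        self._dirty = set()
        self.title = title
        self.body = body
        self._dirty.clear()  # a freshly loaded object has nothing to write

    def save(self):
        # Stand-in for "UPDATE ... SET <dirty columns only>": return what would be written.
        changes = {name: getattr(self, name) for name in sorted(self._dirty)}
        self._dirty.clear()
        return changes
```

For example, after `note.title = 'new'` a call to `note.save()` reports only the title, which is the smaller write the slide is asking for.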
    • Denormalization
    • Denormalization
        • Django encourages normalized data, which is usually good
        • But at scale you need to denormalize
            • Corollary: joins are evil
        • Django makes it really easy to do joins using the ORM, so pay attention
    • Denormalization
        • Start with a normalized database
        • Selectively denormalize things as they become bottlenecks
        • Denormalized counts, copied fields, etc. can be updated in signal handlers
    • Replication
    • Replication
        • Typical web app is 80 to 90% reads
        • Adding read capacity will get you a long way
        • MySQL Master-Slave replication
        [Diagram: master handles reads & writes; slaves are read only]
    • Replication
        • Django doesn’t make it easy to use multiple database connections, but it is possible
        • Some caveats
            • Slave lag interacts with caching in weird ways
            • You can only save to your primary DB (the one you configure in settings.py)
            • Unless you get really clever...
    • Replication
        1. Create a custom database wrapper by subclassing DatabaseWrapper

        class SlaveDatabaseWrapper(DatabaseWrapper):
            def _cursor(self, settings):
                if not self._valid_connection():
                    kwargs = {
                        'conv': django_conversions,
                        'charset': 'utf8',
                        'use_unicode': True,
                    }
                    kwargs = pick_random_slave(settings.SLAVE_DATABASES)
                    self.connection = Database.connect(**kwargs)
                    ...
                cursor = CursorWrapper(self.connection.cursor())
                return cursor

    • Replication
        2. Custom QuerySet that uses primary DB for writes

        class MultiDBQuerySet(QuerySet):
            ...
            def update(self, **kwargs):
                slave_conn = self.query.connection
                self.query.connection = default_connection
                super(MultiDBQuerySet, self).update(**kwargs)
                self.query.connection = slave_conn

    • Replication
        3. Custom Manager that uses your custom QuerySet

        class SlaveDatabaseManager(db.models.Manager):
            def get_query_set(self):
                return MultiDBQuerySet(self.model, query=self.create_query())

            def create_query(self):
                return db.models.sql.Query(self.model, connection)

    • Replication
        Example on github: http://github.com/mmalone/django-multidb/
    • Replication
        • Goal:
            • Read-what-you-write consistency for writer
            • Eventual consistency for everyone else
        • Slave lag screws things up
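One common way to approximate read-what-you-write consistency is to pin a user's reads to the primary for a short window after they write. The talk does not say how Pownce handled this, so the Python 3 sketch below is only one possible approach: `ReadRouter`, `pin_seconds`, and the `'primary'` / `'slave'` labels are assumptions made for the example.

```python
import time


class ReadRouter:
    """Send a user's reads to the primary for a short time after they write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # should exceed the slave lag you expect
        self.clock = clock
        self._last_write = {}           # user_id -> time of most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def db_for_read(self, user_id):
        wrote_at = self._last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return 'primary'  # the writer sees their own change immediately
        return 'slave'        # everyone else gets eventual consistency
```

In a multi-server setup the "last write" timestamp would have to live somewhere shared, such as the user's session or memcached, rather than in process memory.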
    • Replication
        What happens when you become write saturated?
    • Federation
    • Federation
        • Start with Vertical Partitioning: split tables that aren’t joined across database servers
            • Actually pretty easy
            • Except not with Django
    • Federation
        django.db.models.base
        FAIL!
    • Federation
        If the Django pony gets kicked every time someone uses {% endifnotequal %} I don’t want to know what happens every time django.db.connection is imported.
        http://www.flickr.com/photos/captainmidnight/811458621/
    • Federation
        • At some point you’ll need to split a single table across databases (e.g., user table)
        • Now auto-increment won’t work
            • But Django uses auto-increment for PKs
            • ugh
        • Pluggable UUID backend?
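To make the UUID idea concrete: once keys no longer come from a single auto-increment column, any app server can mint an ID and work out which database holds the row. The Python 3 sketch below is an illustration under stated assumptions; the shard names in `SHARDS` are hypothetical, and the simple modulo mapping is the easiest scheme to show, not a recommendation from the talk.

```python
import uuid
import zlib

# Hypothetical shard names; in practice these would be database aliases or hosts.
SHARDS = ['users_db_0', 'users_db_1', 'users_db_2', 'users_db_3']


def new_user_id():
    # Generated in the application, so no database has to hand out the next number.
    return uuid.uuid4().hex


def shard_for(user_id, shards=SHARDS):
    # A stable hash of the key picks the shard; every app server computes the same answer.
    return shards[zlib.crc32(user_id.encode('ascii')) % len(shards)]
```

Note that a plain modulo remaps most keys when the number of shards changes, so schemes like this are usually paired with a lookup table or a hash ring.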
    • Profiling, Monitoring & Measuring
    • Know your SQL

        >>> Article.objects.filter(pk=3).query.as_sql()
        ('SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article" WHERE "app_article"."id" = %s ', (3,))

    • Know your SQL

        >>> import sqlparse
        >>> def pp_query(qs):
        ...     t = qs.query.as_sql()
        ...     sql = t[0] % t[1]
        ...     print sqlparse.format(sql, reindent=True, keyword_case='upper')
        ...
        >>> pp_query(Article.objects.filter(pk=3))
        SELECT "app_article"."id",
               "app_article"."name",
               "app_article"."author_id"
        FROM "app_article"
        WHERE "app_article"."id" = 3

    • Know your SQL

        >>> from django.db import connection
        >>> connection.queries
        [{'time': '0.001', 'sql': u'SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article"'}]

    • Know your SQL
        • It’d be nice if a lightweight stacktrace could be done in QuerySet.__init__
            • Stick the result in connection.queries
            • Now we know where the query originated
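The stack-capture wish above is easy to prototype outside Django. In the Python 3 sketch below, `QUERY_LOG` is a stand-in for `connection.queries` and `record_query` is a hypothetical hook; nothing here is an existing Django API.

```python
import traceback

QUERY_LOG = []  # stand-in for connection.queries


def record_query(sql, depth=3):
    """Log a query together with the last few frames that led to it."""
    # Drop the final frame (record_query itself) and keep `depth` callers.
    stack = traceback.extract_stack()[:-1][-depth:]
    QUERY_LOG.append({
        'sql': sql,
        'origin': ['%s:%d in %s' % (f.filename, f.lineno, f.name) for f in stack],
    })


def list_articles():
    # Pretend this is a view that triggers a query.
    record_query('SELECT id FROM app_article')
```

After calling `list_articles()`, the last entry's `origin` list ends with the frame for `list_articles`, which is enough to trace a slow query back to the code that issued it. Keeping `depth` small keeps the overhead low.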
    • Measuring
        Django Debug Toolbar
        http://github.com/robhudson/django-debug-toolbar/
    • Monitoring
        You can’t improve what you don’t measure.
        • Ganglia
        • Munin
    • Measuring & Monitoring
        • Measure
            • Server load, CPU usage, I/O
            • Database QPS
            • Memcache QPS, hit rate, evictions
            • Queue lengths
            • Anything else interesting
    • All done... Questions?
    • Contact Me
        Mike Malone
        mjmalone@gmail.com
        twitter.com/mjmalone