Scaling Django for X Factor - DJUGL Oct 2012

Talk at the Django User Group London meeting, October 2012

Published in: Technology

  1. SCALING DJANGO FOR X FACTOR: Malcolm Box, DJUGL October 2012
  2. WHAT I’M TALKING ABOUT: Scaling Django to >10K requests/s; Caching, Counting and Cassandra; Toolbox
  3. ME: Malcolm Box, CTO & Co-Founder; @malcolmbox; malcolm@tellybug.com; http://tellybug.com
  4. Making TV more entertaining: Live interaction; Highly social; Unique content
  5. WHO ARE YOU? Technical? Running Django? Scale?
  6. THE CHALLENGE
  7. THE CHALLENGE: Millions of people watch the shows we work with
  8. THE CHALLENGE: Millions of people watch the shows we work with; TV tells them to buzz/clap/score....
  9. THE CHALLENGE: Millions of people watch the shows we work with; TV tells them to buzz/clap/score....; A giant DDOS is launched against our servers
  10. HOW BIG? Peak loads of 10,000 requests/s; Read/write mix: write-heavy workload, lots of user interactions
  11. HOW BIG? 10K REQUESTS/S IS 25,920,000,000 REQUESTS/MONTH
  12. ARCHITECTURE (diagram). Components shown: The Internet; Static assets; HAProxy layer; Web layer; Cache; Cassandra Cluster; Task Server; RDS MySQL; Monitor; Chef; Logs, backups on Amazon S3; all within Amazon AWS eu-west-1. Notes: Entirely cloud based; Nodes come and go - frequently!; Automatic deployment direct from Github via Chef
  13. CACHING: Cache as speedup or cache as mission-critical?; Use Django cache framework; Pylibmc - consistent hashing and server death patches; Problems as you scale up...
  14. CACHE PROBLEMS: Cache miss behaviour; Thundering herds are bad; Key overload; Server overload; Dualcache - https://gist.github.com/953524

          value = cache.get(key)
          if value is None:
              lock = False
              try:
                  # cache.add() only stores the key if it is absent,
                  # so only one caller acquires the lock
                  lock = cache.add(lock_key(key), 1)
                  if lock:
                      # Do something expensive
                      new_value = calculate_new_value()
                      cache.set(key, new_value)
                      return new_value
              finally:
                  if lock:
                      cache.delete(lock_key(key))
          return value
  15. COUNTING: Hard to count a few things very fast; And have real-time access to the latest result; Things we tried: memcache, Cassandra counters; Final solution: Sharded counters
  16. SHARDED COUNTERS: Implemented in about 350 lines of Python; To provide two basic operations! incr() and get(); Uses a combination of two layers of memcache and Cassandra to provide real-time, scalable counters
  17. CASSANDRA: Core piece of our infrastructure; Highly write-scalable; Reads scaled from cache; Using Acunu Cassandra for virtual nodes; “Fake” Django ORM classes to make it feel more natural; But no automatic join support
  18. TOOLBOX: Development: Django Extensions, Celery, Piston (heavily forked), iPython, pycassa, Tsung (load testing tool); Deployment: Fabric, Chef, Boto; Operations: Sentry, Gargoyle
  19. THINGS THAT STILL SUCK: Monitoring
  20. Q&A. AND YES, WE’RE HIRING, SO IF YOU’RE INTERESTED IN BUILDING EXTREMELY LARGE DJANGO SITES THEN GET IN TOUCH: MALCOLM@TELLYBUG.COM
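
The cache-miss pattern on the CACHE PROBLEMS slide can be tried out without a memcached server. The following is a minimal sketch, not the speaker's code: `LocalCache` is a hypothetical in-memory stand-in that mimics only the `get`/`set`/`add`/`delete` calls the slide uses, and `get_or_compute` is an illustrative helper name. It assumes, as the slide does, that a caller who loses the lock simply gets `None` back and decides for itself whether to retry or serve a fallback.

```python
import threading


class LocalCache:
    """Tiny in-memory stand-in for a memcached-style cache (illustrative only)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def get(self, key):
        with self._mutex:
            return self._data.get(key)

    def set(self, key, value):
        with self._mutex:
            self._data[key] = value

    def add(self, key, value):
        # Store only if the key is absent; True means this caller won the race.
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)


def lock_key(key):
    """Derive the name of the lock entry that guards `key` (hypothetical scheme)."""
    return "lock:" + key


def get_or_compute(cache, key, compute):
    """Return the cached value, letting only the lock holder recompute it."""
    value = cache.get(key)
    if value is not None:
        return value
    if not cache.add(lock_key(key), 1):
        # Another caller is already recomputing; do not join the herd.
        return None
    try:
        value = compute()
        cache.set(key, value)
        return value
    finally:
        # Always release the lock, even if compute() raised.
        cache.delete(lock_key(key))
```

With this shape, a burst of simultaneous misses triggers one expensive recalculation instead of one per request, which is the "thundering herd" the slide warns about. A production version would also want a timeout on the lock entry so that a crashed worker cannot hold it forever.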

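
The SHARDED COUNTERS slide names only the two operations, `incr()` and `get()`, and does not show the 350-line implementation. The sketch below illustrates the general sharding idea under stated assumptions and is not Tellybug's design: a plain dict stands in for the memcache and Cassandra layers, and the class name, shard count and key format are all hypothetical.

```python
import random


class ShardedCounter:
    """Spread increments for one logical counter across several shard keys."""

    def __init__(self, name, num_shards=16, store=None, rng=None):
        self.name = name
        self.num_shards = num_shards
        # A dict stands in for the real backing store (an assumption).
        self.store = store if store is not None else {}
        self.rng = rng if rng is not None else random.Random()

    def _shard_key(self, index):
        return "%s:%d" % (self.name, index)

    def incr(self, amount=1):
        # Each write lands on a randomly chosen shard, so no single key is hot.
        key = self._shard_key(self.rng.randrange(self.num_shards))
        self.store[key] = self.store.get(key, 0) + amount

    def get(self):
        # Reads pay the cost instead: sum every shard to get the total.
        return sum(
            self.store.get(self._shard_key(i), 0)
            for i in range(self.num_shards)
        )
```

Usage is `counter = ShardedCounter("votes"); counter.incr(); counter.get()`. The trade-off is that writes become cheap and contention-free while each read touches every shard, which is presumably why the talk pairs the counters with cache layers for reads.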