Super-sizing YouTube with Python

Mike Solomon
mike@youtube.com

See more scalability tales at:
http://rapd.wordpress.com

  1. Super-sizing YouTube with Python
     Mike Solomon, mike@youtube.com
  2. Welcome
     - this is about scaling a web application
     - there are a lot of things left out - mostly mistakes and implementation details
     - this may generate more questions than it answers
     - my goal is to give you ideas for solving your own problems
  3. Architecture
     - this is the core of scalability
     - systems change over time, so will your architecture
     - impossible to predict the optimal approach
     - start simple
     - aim for local maxima
     - python enables flexibility
  4. YouTube's Early Days
     - web boxes do everything
     - servlets, images, thumbnails, search
     - shoehorn everything into Apache, MySQL
     - very simple
     - this survives longer than you'd think
  5. Early Web Stack (circa January '06)
     [diagram labels: hw load balancer, httpd, mod_python, db objects, search, thumbnails, biz logic, servlets, templates, db master, db replicas]
  6. Early Key Factors in Engineering
     - really small team
       - we ♥ python
       - logical separation in code
       - discipline and honor - not linguistically enforced (don’t waste time writing code to restrict people)*
     - grown by systematically removing bottlenecks
     - easy to know when something is a `win`
  7. Running Without Tripping
     - user demand can grow 50% in a day
     - removing one bottleneck can immediately reveal another (usually more heinous)
     - replace and migrate components as they become problems
     - good (python) components make this easy
     - obviously, pick your battles
  8. Good Components (Hypothetical)
     - minimize dependencies*
     - accept some latency
     - localize failures - don’t let them spread
     - you are only down if it looks like you are
     - applies to both systems and software
  9. Balance Machine Resources
     - more efficient resource utilization via specialized deployment
     - balance based on CPU, RAM, network and disk usage patterns
     - overlay orthogonal loads
     - disjoint tasks running on the same physical hardware
  10. Migratory Patterns of the Norwegian Blue
      - move from mod_python to mod_fastcgi
      - move thumbnails to their own machines
      - make search a remote service running on separate machines
      - run transcoder processes on video servers
      - do more with the same hardware
  11. Serenity Now
      - Can you spot where we turned on transcoding processes?
  12. SQL Shenanigans
      - if you have a relational database, it will be abused
        - difficult to track the true source
      - series of object proxies for DB-API
      - enable logging
      - encode a portion of call stack as a query comment* (more about this later)
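The query-comment idea can be sketched roughly as follows. This is a minimal illustration, not YouTube's actual proxy: `TaggingCursor`, `caller_tag`, the `video` table and the comment layout are all hypothetical, and `sqlite3` stands in for MySQL only so the sketch runs on its own.

```python
import sqlite3
import traceback


def caller_tag(depth=3, skip=2):
    """Summarize the innermost application frames as 'function:line' pieces."""
    # Drop the last `skip` frames (this helper and the proxy method calling it).
    frames = traceback.extract_stack()[:-skip]
    parts = [f"{frame.name}:{frame.lineno}" for frame in frames[-depth:]]
    # Strip '*/' so the tag can never terminate the SQL comment early.
    return " < ".join(reversed(parts)).replace("*/", "")


class TaggingCursor:
    """Hypothetical proxy around any DB-API cursor (an assumption, not the real class)."""

    def __init__(self, cursor, log):
        self._cursor = cursor
        self._log = log

    def execute(self, sql, params=()):
        # Append part of the call stack as a trailing SQL comment and log it.
        tagged = f"{sql} /* {caller_tag()} */"
        self._log.append(tagged)
        return self._cursor.execute(tagged, params)

    def __getattr__(self, name):
        # Everything else (fetchone, fetchall, ...) passes straight through.
        return getattr(self._cursor, name)


def load_video_title(cursor, video_id):
    cursor.execute("SELECT title FROM video WHERE id = ?", (video_id,))
    return cursor.fetchone()[0]


query_log = []
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO video VALUES (1, 'first upload')")
cursor = TaggingCursor(conn.cursor(), query_log)
title = load_video_title(cursor, 1)
```

Because the comment travels with the statement, it would also show up in whatever the database server logs, which is what makes an abusive query traceable back to the code that issued it.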
  13. Object Caching
      - take pressure off of relational db
      - can save additional resources if your objects require significant computation to set up
      - memcached makes a good home for this
      - need good client to make this into a truly useful service ‡ pools and better failure handling
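A rough sketch of the object-caching pattern with failure handling, under stated assumptions: `DictCache` and `BrokenCache` are in-process stand-ins for a memcached client (a real client would talk to a server and manage a connection pool), and `cached_object` is a hypothetical helper name.

```python
import pickle


class DictCache:
    """In-process stand-in for a memcached client (get/set on bytes values)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class BrokenCache:
    """Simulates an unreachable cache server."""

    def get(self, key):
        raise ConnectionError("cache down")

    def set(self, key, value):
        raise ConnectionError("cache down")


def cached_object(cache, key, build):
    """Return the object for `key`, building and storing it on a miss.

    Cache errors are treated as misses, so a cache outage costs
    performance instead of failing the request.
    """
    try:
        raw = cache.get(key)
    except ConnectionError:
        raw = None
    if raw is not None:
        return pickle.loads(raw)
    obj = build()  # the expensive part: DB queries plus object setup
    try:
        cache.set(key, pickle.dumps(obj))
    except ConnectionError:
        pass  # failing to cache is not a reason to fail the request
    return obj


build_calls = []


def build_profile():
    build_calls.append(1)
    return {"username": "mike", "video_count": 3}


cache = DictCache()
first = cached_object(cache, "user:mike", build_profile)    # miss: builds
second = cached_object(cache, "user:mike", build_profile)   # hit: no build
fallback = cached_object(BrokenCache(), "user:mike", build_profile)  # cache down: builds
```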
  14. Software Optimization
      - fast vs fast enough
      - strive for machine efficiency - don't obsess
      - be scientific - collect data and understand it
      - can yield some surprising results
      - don't assume code optimization techniques from another language are relevant
      - just like carpentry, measure twice cut once
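One way to "collect data" before optimizing is the standard-library profiler. The sketch below is only an illustration: `render_page` and `slow_join` are made-up functions standing in for a real request handler and its hot spot.

```python
import cProfile
import io
import pstats


def slow_join(n):
    # Deliberately naive string building, so it shows up in the profile.
    text = ""
    for i in range(n):
        text += str(i)
    return text


def render_page():
    return slow_join(20000)


# Profile one call, then summarize where the time went.
profiler = cProfile.Profile()
profiler.enable()
page = render_page()
profiler.disable()

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
summary = report.getvalue()
```

Printing `summary` lists the functions by cumulative time, which is the kind of evidence to gather before deciding what, if anything, is worth rewriting.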
  15. Python Optimization
      - pure python HMAC was 40% of web cpu
        - write a few lines of C
      - threaded comments fiasco
        - overly complex algorithm to compute the display object tree
        - simplify query, simplify algorithm
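The deck does not show the threaded-comments code, so the following is only a guess at what "simplify query, simplify algorithm" could look like: fetch every comment for a video as flat rows in one query, then link them into a tree in two passes. The row layout `(comment_id, parent_id, text)` and the function name are assumptions.

```python
def build_comment_tree(rows):
    """Turn flat (comment_id, parent_id, text) rows into nested dicts.

    parent_id is None for top-level comments. Rows whose parent is
    missing are shown at the top level rather than dropped.
    """
    rows = list(rows)
    nodes = {}
    # First pass: one node per row.
    for comment_id, _parent_id, text in rows:
        nodes[comment_id] = {"id": comment_id, "text": text, "replies": []}
    # Second pass: attach each node to its parent, preserving row order.
    roots = []
    for comment_id, parent_id, _text in rows:
        parent = nodes.get(parent_id)
        if parent is None:
            roots.append(nodes[comment_id])
        else:
            parent["replies"].append(nodes[comment_id])
    return roots


rows = [
    (1, None, "first"),
    (2, 1, "reply to first"),
    (3, None, "second"),
    (4, 2, "nested reply"),
]
tree = build_comment_tree(rows)
```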
  16. Python Optimization
      - psyco - specializing compiler for Python
        - 'hot' functions are psyco-ized
        - there is a 'context switch' penalty so you need to experiment to see if it helps
      - previous threaded comments algorithm: -closure +psyco = 400% boost
  17. Reasonable Efficiency
      - pruned all the obvious leaf services
      - dynamic web requests are one `service`
      - web service is easy to scale, so it stresses out other resources - probably a DB
      - DBs are hard(er) to scale
      - tricks of escalating cleverness‡
      - eventually, no cards left to play
  18. Scaling MySQL
      - pretty much have to go horizontal
      - choose your partition plan carefully
      - understand your data access patterns
      - what queries do you run most often?
      - do you have joins?
      - do you need transactional consistency? why?
      - does an 'entity' emerge?
  19. Partition By Entity
      - entities are 'transactional'
      - allow joins across properties of an entity
      - entities are migratory
      - cross entity is more complicated
      - weaken guarantees to make it easier
      - minimize activity by design
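A small sketch of why "entities are migratory" suggests a lookup table rather than a fixed hash. Everything here is hypothetical (the class name, the modulo default, the shard names); it only illustrates that an entity's location can be changed without touching callers.

```python
class LookupService:
    """Hypothetical directory mapping an entity id to the shard that owns it."""

    def __init__(self, default_shards):
        self._shards = list(default_shards)
        self._overrides = {}  # entities that have been moved explicitly

    def shard_for(self, entity_id):
        # An explicit override wins; otherwise fall back to a default placement.
        if entity_id in self._overrides:
            return self._overrides[entity_id]
        return self._shards[entity_id % len(self._shards)]

    def migrate(self, entity_id, new_shard):
        # After the entity's rows are copied, repoint the lookup.
        self._overrides[entity_id] = new_shard


lookup = LookupService(["shard_a", "shard_b"])
before = lookup.shard_for(7)
lookup.migrate(7, "shard_c")
after = lookup.shard_for(7)
```

Since all rows belonging to one entity live on the shard the lookup returns, joins across that entity's properties stay local to a single database.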
  20. EMD, a TLA not an ORM!
      - connection and transaction management
      - lookup service
      - query factory
      - minimalist table abstraction
      - ORM can be (is?) evil
      - make common behaviors simple, while leaving some transparency to the actual database
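The deck gives no EMD code, so this is only a guess at the flavor of a "query factory" over a "minimalist table abstraction": the `Table` class and `select_by` method are invented for illustration. The point it tries to show is that the generated SQL stays a plain, visible string (here with the `%s` placeholder style used by MySQL drivers) instead of hiding behind an ORM.

```python
class Table:
    """Hypothetical minimalist table description: a name and its columns."""

    def __init__(self, name, columns):
        self.name = name
        self.columns = tuple(columns)

    def select_by(self, column):
        """Build a parameterized SELECT; the SQL stays visible to the caller."""
        if column not in self.columns:
            raise ValueError(f"{self.name} has no column {column!r}")
        column_list = ", ".join(self.columns)
        return f"SELECT {column_list} FROM {self.name} WHERE {column} = %s"


video = Table("video", ["id", "user_id", "title"])
sql = video.select_by("user_id")
```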
  21. Seismic Retrofit
      - apply this fundamental change to a large and growing site
      - make it relatively painless with python
        - multiple inheritance
        - decorators
        - AST plugins for validation and testing
  22. Resulting API
      - all the scale-aware code nicely opaque to application developers
      - base use cases are painless
        - User.select_by_username(db_context, username)
        - Video.select_by_id(db_context, video_id)
        - Video.select_by_user_id(db_context, user_id)
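To make the call signatures above concrete, here is a toy sketch in which the shard choice is hidden inside the context object. Only the method names `select_by_username` and `select_by_user_id` come from the slide; `DbContext`, its attributes, the `insert` helpers, the modulo placement and the use of in-memory `sqlite3` databases as shards are all assumptions made so the example runs.

```python
import sqlite3


class DbContext:
    """Hypothetical context: owns shard connections and the entity lookup."""

    def __init__(self, shard_count):
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT)")
            conn.execute(
                "CREATE TABLE video (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
            )
        self.user_id_by_username = {}  # stand-in for a lookup service

    def shard_for_user(self, user_id):
        return self.shards[user_id % len(self.shards)]


class User:
    @staticmethod
    def insert(db_context, user_id, username):
        db_context.user_id_by_username[username] = user_id
        conn = db_context.shard_for_user(user_id)
        conn.execute("INSERT INTO user VALUES (?, ?)", (user_id, username))

    @staticmethod
    def select_by_username(db_context, username):
        user_id = db_context.user_id_by_username.get(username)
        if user_id is None:
            return None
        conn = db_context.shard_for_user(user_id)
        return conn.execute(
            "SELECT id, username FROM user WHERE id = ?", (user_id,)
        ).fetchone()


class Video:
    @staticmethod
    def insert(db_context, video_id, user_id, title):
        # A user's videos live on the same shard as the user (the 'entity').
        conn = db_context.shard_for_user(user_id)
        conn.execute("INSERT INTO video VALUES (?, ?, ?)", (video_id, user_id, title))

    @staticmethod
    def select_by_user_id(db_context, user_id):
        conn = db_context.shard_for_user(user_id)
        return conn.execute(
            "SELECT id, title FROM video WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()


db_context = DbContext(shard_count=2)
User.insert(db_context, 7, "mike")
Video.insert(db_context, 100, 7, "first upload")
Video.insert(db_context, 101, 7, "second upload")
```

Application code never names a shard: it passes `db_context` and an identifier, and the partitioning decision stays inside the data layer.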
  23. Bulk Entity Migration
      - hijack mysql replication to partition on the fly while the live site is running
      - all DML gets tagged with an entity id
      - read master binlog and selectively replay it into a set of new mini-masters
      - update lookup service to point to new resources
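The selective replay step can be sketched in miniature as below. This is a toy model, not real binlog handling: the binlog is a plain list of SQL strings, the `/* entity_id=N */` tag format is an assumption, and each "mini-master" is just a list that collects the statements it would apply.

```python
import re

# Assumed tag format; the deck only says DML is tagged with an entity id.
ENTITY_TAG = re.compile(r"/\*\s*entity_id=(\d+)\s*\*/")


def entity_id_of(statement):
    match = ENTITY_TAG.search(statement)
    return int(match.group(1)) if match else None


def replay(binlog, mini_masters, plan):
    """Route each tagged statement to the mini-master chosen by `plan`.

    binlog: iterable of SQL strings in commit order.
    mini_masters: dict of name -> list acting as that server's applied log.
    plan: function mapping entity_id -> mini-master name.
    Untagged statements are returned for separate handling.
    """
    untagged = []
    for statement in binlog:
        entity_id = entity_id_of(statement)
        if entity_id is None:
            untagged.append(statement)
            continue
        mini_masters[plan(entity_id)].append(statement)
    return untagged


binlog = [
    "UPDATE user SET bio = 'hi' WHERE id = 7 /* entity_id=7 */",
    "INSERT INTO video (user_id, title) VALUES (8, 'cats') /* entity_id=8 */",
    "DELETE FROM video WHERE id = 100 /* entity_id=7 */",
    "FLUSH TABLES",
]
mini_masters = {"mini_0": [], "mini_1": []}
untagged = replay(binlog, mini_masters, lambda entity_id: f"mini_{entity_id % 2}")
```

Statement order is preserved per mini-master, so each entity's changes are replayed in the order they were committed on the original master.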
  24. Recurring Themes
      - the elegance of simplicity
      - take reliable open software and customize it
      - `pythonic veneer`
      - DIY - filing a ticket for a bugfix doesn’t give me a warm feeling - take matters into your own hands*
  25. Questions?