“Web 2.0 sucks (for scaling).”
Joe Stump, Lead Architect, Digg.com
Users want access to all of their crap at
all times. I, personally, don’t find your dog
funny or cute, but I’ll be damned if I’m
the one who’ll stand in the way of you
posting it and others consuming it.
Backend Considerations
Language considerations
Scaling out
Caching strategies
Content storage and delivery
Parallel data requests
Near time data processing
Partitioning data
Frontend Considerations
Reduce HTTP requests
Avoid inline JavaScript and CSS
Compression and Minification
Learn to love HTTP/1.1
“PHP doesn’t scale.”
Cal Henderson, Director of Development, Flickr.com
Languages don’t scale
Bytecode caching (PHP, Python, etc.)
Robust library & driver support
Active developer communities
How do I scale easily?
1. Caching
2. Caching
3. Caching!
What are my options?
Disk-based caching (e.g. Cache_Lite)
In-memory caching (e.g. APC, Memcached)
Cloud caching (e.g. MogileFS, S3)
Disk-based caching
Stupid simple
Cheap
Fairly easy to scale out
Dynamic images
Slower than others
Use fast disks!
RAM disks are faster
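
A minimal sketch of the disk-based approach, in Python rather than PHP: each key is serialized into its own file, and a stale file counts as a miss. The `DiskCache` class, its method names, and the five-minute default TTL are assumptions made for illustration; this is not how Cache_Lite itself is implemented.

```python
import hashlib
import os
import pickle
import tempfile
import time


class DiskCache:
    """Tiny file-per-key cache. Names and layout are illustrative only."""

    def __init__(self, directory, ttl_seconds=300):
        self.directory = directory
        self.ttl_seconds = ttl_seconds
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string maps to a safe file name.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".cache")

    def set(self, key, value):
        # Write to a temp file, then rename it into place, so a reader
        # never sees a half-written cache file.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(value, handle)
        os.replace(tmp_path, self._path(key))

    def get(self, key, default=None):
        path = self._path(key)
        try:
            # The file's modification time doubles as the entry's timestamp.
            age = time.time() - os.path.getmtime(path)
            if age > self.ttl_seconds:
                return default
            with open(path, "rb") as handle:
                return pickle.load(handle)
        except OSError:
            # Missing or unreadable file: treat it as a cache miss.
            return default
```

Pointing `directory` at a RAM disk mount is one way to act on the "RAM disks are faster" bullet without changing any code.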
APC (PHP)
Bytecode caching
In-memory user cache
Insanely fast
Not centralized or shared
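
To make "in memory" and "not centralized or shared" concrete, here is a rough Python analogue of a per-process user cache. The `LocalCache` class and its `store`/`fetch` names are hypothetical and only loosely modeled on the idea of a user cache; the point is that the data lives inside one process, so every web server (and every worker) ends up with its own private copy.

```python
import time


class LocalCache:
    """Per-process cache with expiry. Nothing here is shared across servers."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        self._entries = {}

    def store(self, key, value, ttl_seconds):
        # Keep the value alongside the moment it stops being valid.
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def fetch(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return default
        return value
```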
Memcached
If you’re not using this you’re crazy
Easy to set up and use
Insanely fast over the network
Scales to insane heights
Failover, widely supported, etc.
Centralized and shared across site
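
The usual way to put a shared cache in front of a database is the read-through ("cache-aside") pattern: check the cache, fall back to the database on a miss, then populate the cache for the next request. The sketch below assumes a client object with simple `get` and `set` methods. `FakeSharedCache`, `fetch_user`, and the `user:<id>` key format are hypothetical stand-ins so the example runs without a server; a real Memcached client would take the fake's place.

```python
class FakeSharedCache:
    """In-memory stand-in for a shared cache client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss, as many cache clients do.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def fetch_user(cache, user_id, load_from_db):
    """Return a user record, hitting the database only on a cache miss."""
    key = "user:%d" % user_id
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Miss: do the expensive lookup once, then share the result.
    user = load_from_db(user_id)
    cache.set(key, user)
    return user
```

Because the cache is centralized, a record loaded by one web server is a hit for every other server, which is the advantage over a per-process cache.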
MogileFS
File and data store
Runs over WebDAV
Scales out infinitely (in theory)
Serialize data, store in file
Centralized and shared across site
Amazon S3
File and data store
Runs over HTTP
Scales out infinitely (in theory)
Serialize data, store in file
Centralized and shared across site
Costs money
Widely supported in all languages
Check out ThruDB
Content Storage/Delivery
What are your storage needs?
Is it critical YOU store them?
How costly is it to store in-house?
Can you do it for free? (YAY! Mooching!)
i can has free storage?
YouTube for video
Scribd for documents
Flickr for images
Cloud Services (S3)
Simple to get up and running
No hardware maintenance
Costs money, but not as much as you think
NFS
Simple to set up and get running
Costs money, requires colocation, etc.
Does. Not. Scale.
Did I mention it doesn’t scale?
Stopgap solution at best
MogileFS
Somewhat complicated to set up
Costs money, requires colocation, etc.
Scales exceptionally well
Used at Digg, LiveJournal, others
Check out File_Mogile by Digg (PEAR)
Roll Your Own
File storage IS your business
Highly specialized and customized
Costs money, requires colocation, etc.
Last resort
Parallel Data Requests
Access your data in parallel
Make data access asynchronous (WHAT?!)
Loosely couple your data access layer
All for the low, low price of FREE!*
*Offer only available for hardcore nerds looking for street cred.
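
One way to read "access your data in parallel": if a page needs three independent pieces of data, request all three at once and wait for the slowest, instead of paying for each lookup in sequence. The sketch below uses Python's standard thread pool. The three `fetch_*` functions are hypothetical placeholders that sleep to simulate a slow backend; they are not taken from any real data layer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_profile(user_id):
    time.sleep(0.05)  # Simulated backend latency.
    return {"id": user_id}


def fetch_friends(user_id):
    time.sleep(0.05)
    return [user_id + 1, user_id + 2]


def fetch_stories(user_id):
    time.sleep(0.05)
    return ["story-a", "story-b"]


def load_page_data(user_id):
    """Issue every independent request at once, then collect the results."""
    sources = {
        "profile": fetch_profile,
        "friends": fetch_friends,
        "stories": fetch_stories,
    }
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit everything before waiting on anything.
        futures = {name: pool.submit(fn, user_id) for name, fn in sources.items()}
        return {name: future.result() for name, future in futures.items()}
```

Keeping the page code dependent only on `load_page_data` is a small instance of the loose coupling bullet: the caller does not know or care whether the data came from threads, a job queue, or a cache.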
Resources
High Performance Web Sites: Essential Knowledge for Front-End Engineers
by Steve Souders
Serving JavaScript Fast
by Cal Henderson, Director of Development, Flickr.com
http://www.thinkvitamin.com/features/webapps/serving-javascript-fast