Distributed “Web Scale” Systems Ricardo Vice Santos @ricardovice
Who am I?
• I’m Ricardo!
• Lead Engineer at Spotify
• ricardovice on twitter, spotify, about.me, kiva, slideshare, github, bitbucket, delicious…
• Portuguese
• Previously working in the video streaming industry
• (only) Discovered Spotify late 2009
• Joined in 2010
spotifiera (spo·ti·fie·ra), verb: to use Spotify; to provide a service free of cost.
What’s Spotify all about?
• A big catalogue, tons of music
• Available everywhere
• Great user experience
• More convenient than piracy
• Reliable, high availability
• Scalable for many, many users
But what really got me hooked:
• Free, legal ad-supported service
• Very fast
The importance of being fast
• High latency can be a problem, not only in First Person Shooters
• Slow performance is a major user experience killer
• At Velocity 2009, Eric Schurman (Bing) and Jake Brutlag (Google Search) showed that increased latency directly hurt usage and revenue per user
• Latency leads to users leaving, and many won’t ever come back
• Users will share their experience with friends
http://radar.oreilly.com/2009/07/velocity-making-your-site-fast.html
So how fast is Spotify?
• We monitor playback latency on the client side
• Current median latency to play any track is 265ms
• On average, the human notion of “instant” is anything under 200ms
• Due to disk lookup, at times it’s actually faster to start playing a track from the network than from disk
• Fewer than 1% of playbacks experience stutter
“Spotify is fast due to P2P”
• This is something I read a lot around the web
• P2P does play a crucial role in the picture, but…
• Experience at Spotify showed me that most latency issues are directly linked to backend problems
• It’s a mistake to think that we could be this fast without a smart and scalable backend architecture
So let’s give credit where credit is due.
Going web scale!!1
“Scaling Twitter”, Blaine Cook, 2007
http://www.slideshare.net/Blaine/scaling-twitter
Handling growth
Things to keep in mind:
• Scaling is not an exact science
• There is no such thing as a magic formula
• Usage patterns differ
• There is always a limit to what you can handle
• Fail gracefully
• Continuous evolution process
Scaling horizontally
• You can always add more machines!
• Stateless services
• Several processes can share memcached
• Possible to run in “the cloud” (EC2, Rackspace)
• Need some kind of load balancer
• Data sharing/synchronization can be hard
• Complexity: many pieces, maybe hidden SPOFs
• Fundamental to the application’s design
Usage patterns
Typically, some services are more demanding than others. This can be due to:
• Higher popularity
• Higher complexity
• Low latency expectation
• All combined
Decoupling
• Divide and conquer!
• The Unix way
• Resources assigned individually
• Using the right tools to address each problem
• Organization and delegation
• Problems are isolated
• Easier to handle growth
Read-only services
• The easiest to scale
• Stateless
• Use indices, large read-optimized data containers
• Each node has its local copy
• Data structured according to service
• Updated periodically, during off-peak hours
• Take advantage of OS page cache
Read-write services
• User generated content, e.g. playlists
• Hard to ensure consistency of data across instances
Solutions:
• Eventual consistency:
  • Reads of just written data not guaranteed to be up-to-date
• Locking, atomic operations:
  • Creating globally unique keys, e.g. usernames
  • Transactions, e.g. billing
Finding a service via DNS
Each service has an SRV DNS record:
• One record with same name for each service instance
• Clients (AP) resolve to find servers providing that service
• Lowest priority record is chosen with weighted shuffle
• Clients retry other instances in case of failures
Example SRV record:
_frobnicator._http.example.com. 3600 SRV 10 50 8081 frob1.example.com.
(fields: name, TTL, type, prio, weight, port, host)
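The selection rule above (lowest priority first, weighted shuffle within it, retry on failure) could look roughly like the sketch below. It assumes the SRV records have already been resolved into tuples; `SrvRecord` and `order_candidates` are hypothetical names, and adding 1 to each weight is a simplification of the RFC 2782 handling of zero weights, not necessarily what Spotify's clients do.

```python
import random
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    host: str


def order_candidates(records: List[SrvRecord], rng=random) -> List[SrvRecord]:
    """Return the lowest-priority records in weighted-shuffled order.

    The caller tries the first entry and moves on to the next one on failure.
    """
    if not records:
        return []
    lowest = min(r.priority for r in records)
    pool = [r for r in records if r.priority == lowest]
    ordered = []
    while pool:
        # Assumption: +1 keeps zero-weight records selectable.
        weights = [r.weight + 1 for r in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        pool.remove(pick)
        ordered.append(pick)
    return ordered


if __name__ == "__main__":
    records = [
        SrvRecord(10, 50, 8081, "frob1.example.com."),
        SrvRecord(10, 50, 8081, "frob2.example.com."),
        SrvRecord(20, 50, 8081, "frob3.example.com."),  # higher prio value, skipped
    ]
    for candidate in order_candidates(records):
        print(candidate.host, candidate.port)
```

Returning the whole shuffled list, rather than a single pick, gives the client its retry order for free.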
Request assignment
• Hardware load balancers
• Round-robin DNS
• Proxy servers
• Sharding:
  • Each server/instance responsible for subset of data
  • Directs client to instance that has its data
  • Easy if nothing is shared
  • Hard if you require replication
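The "easy if nothing is shared" case of sharding can be sketched as hashing the request key and taking it modulo the number of instances. This is only an illustration: `shard_for`, the use of MD5, and the host names are assumptions, not Spotify's implementation.

```python
import hashlib
from typing import List


def shard_for(key: str, shards: List[str]) -> str:
    """Map a key to one instance; the same key always maps to the same instance."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    shards = ["frob1.example.com", "frob2.example.com", "frob3.example.com"]
    print(shard_for("user:ricardovice", shards))
```

With this scheme, adding or removing an instance changes `len(shards)` and remaps most keys, which is part of why sharding gets harder once data has to move or be replicated.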
Sharding using a DHT
Some Spotify services use Dynamo-inspired DHTs:
• Each request has a key
• Each service node is responsible for a range of hash keys
• Data is distributed among service nodes
• Redundancy is ensured by re-hashing and writing to replica node
• Data must be transitioned when ring changes
http://dl.acm.org/citation.cfm?id=1294281
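A minimal sketch of such a ring, under stated assumptions: each node owns the hash range ending at its token, keys are hashed with MD5 (128 bits), and replicas are found by hashing the previous digest again. The `Ring` class, its method names, and the fallback to ring order when re-hashing keeps hitting the same node are all hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List


class Ring:
    def __init__(self, tokens: Dict[str, int]):
        """tokens maps node name -> last hash key (inclusive) of its segment."""
        if not tokens:
            raise ValueError("ring needs at least one node")
        pairs = sorted((tok, node) for node, tok in tokens.items())
        self._tokens = [tok for tok, _ in pairs]
        self._nodes = [node for _, node in pairs]

    def owner(self, hash_key: int) -> str:
        """Node whose segment contains hash_key, wrapping past the last token."""
        i = bisect.bisect_left(self._tokens, hash_key)
        return self._nodes[i % len(self._nodes)]

    def nodes_for(self, key: str, replicas: int = 0, max_rehash: int = 32) -> List[str]:
        """Primary node first, then up to `replicas` distinct replica nodes."""
        wanted = min(replicas + 1, len(self._nodes))
        digest = hashlib.md5(key.encode("utf-8")).digest()
        chosen = [self.owner(int.from_bytes(digest, "big"))]
        for _ in range(max_rehash):
            if len(chosen) == wanted:
                break
            digest = hashlib.md5(digest).digest()  # re-hash to locate a replica
            node = self.owner(int.from_bytes(digest, "big"))
            if node not in chosen:
                chosen.append(node)
        for node in self._nodes:  # assumption: fall back to ring order
            if len(chosen) == wanted:
                break
            if node not in chosen:
                chosen.append(node)
        return chosen


if __name__ == "__main__":
    ring = Ring({"frob1": 2**126, "frob2": 2**127, "frob3": 2**128 - 1})
    print(ring.nodes_for("playlist:42", replicas=2))
```

Because ownership is defined by token ranges, changing a token or adding a node only moves the keys in the affected segment, but that data still has to be transitioned to its new owner.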
Spotify’s DNS powered DHT
Configuration of DHT:
config._frobnicator._http.example.com. 3600 TXT "slaves=0"
(fields: config.srv_name., TTL, type; no replication)
config._frobnicator._http.example.com. 3600 TXT "slaves=2 redundancy=host"
(fields: config.srv_name., TTL, type; three replicas on separate hosts)
Ring segment, one per node:
tokens.8081.frob1.example.com. 3600 TXT "00112233445566778899aabbccddeeff"
(fields: tokens.port.host., TTL, type, last key)
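Reading these records on the client side might look like the sketch below. It only parses TXT payloads that have already been fetched; the function names are hypothetical, and the interpretation of `slaves=N` as "one primary plus N copies" is inferred from the "three replicas" note for `slaves=2`.

```python
from typing import Dict


def config_name(srv_name: str) -> str:
    """Name of the config TXT record for a service's SRV name."""
    return "config." + srv_name


def tokens_name(port: int, host: str) -> str:
    """Name of the TXT record holding one node's ring token."""
    return "tokens.{}.{}".format(port, host)


def parse_config(txt: str) -> Dict[str, str]:
    """Parse a payload such as 'slaves=2 redundancy=host' into a dict."""
    return dict(item.split("=", 1) for item in txt.split())


def replica_count(config: Dict[str, str]) -> int:
    """Total copies of each item: the primary plus its slaves (assumption)."""
    return 1 + int(config.get("slaves", "0"))


def parse_token(txt: str) -> int:
    """Convert a hex token (the last key of a ring segment) to an integer."""
    return int(txt, 16)


if __name__ == "__main__":
    print(config_name("_frobnicator._http.example.com."))
    print(tokens_name(8081, "frob1.example.com."))
    print(replica_count(parse_config("slaves=2 redundancy=host")))
    print(hex(parse_token("00112233445566778899aabbccddeeff")))
```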
And if none of this works for you
Remember, /dev/null is web scale!!
http://www.xtranormal.com/watch/6995033/
Questions? get in touch! @ricardovice firstname.lastname@example.org