Scaling at Showyou: Operations

  • 904 views
Uploaded on

Architecture/operations slides from the Scaling at Showyou talk. …

Architecture/operations slides from the Scaling at Showyou talk.

From the same talk, John's Riak backend, Mecha: http://www.slideshare.net/jmuellerleile/scaling-with-riak-at-showyou

More in: Technology , Business
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
  • Later in the talk, John Mullerleile introduced his new Riak backend: http://www.slideshare.net/jmuellerleile/scaling-with-riak-at-showyou
    Are you sure you want to
    Your message goes here
    Be the first to like this
No Downloads

Views

Total Views
904
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
5
Comments
1
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Introduction Storage Processing Monitoring Review Scaling at Showyou Operations September 26, 2011
  • 2. Introduction Storage Processing Monitoring Review I’m Kyle Kingsbury Handle aphyr Code http://github.com/aphyr Email kyle@remixation.com Focus Backend, API, ops
  • 3. Introduction Storage Processing Monitoring Review What the hell is Showyou?
  • 4. Introduction Storage Processing Monitoring Review Nontrivial complexity
  • 5. Introduction Storage Processing Monitoring Review Challenges ˆ Scanning social networks ˆ Feeds ˆ Search ˆ Trends ˆ Responsive client experience
  • 6. Introduction Storage Processing Monitoring Review Challenges ˆ Scanning social networks ˆ Feeds ˆ Search ˆ Trends ˆ Responsive client experience ˆ Everything fails all the time
  • 7. Introduction Storage Processing Monitoring Review Storage
  • 8. Introduction Storage Processing Monitoring Review We left MySQL
  • 9. Introduction Storage Processing Monitoring Review We left MySQL ˆ Changing the schema requires downtime ˆ Crashes ˆ Master-slave lag ˆ Slow restarts ˆ Node replacements difficult ˆ Fully normalized queries complex, slow
  • 10. Introduction Storage Processing Monitoring Review MySQL does scale But there are tradeoffs
  • 11. Introduction Storage Processing Monitoring Review Riak ˆ Key/value store ˆ Homogenous ˆ Scales linearly with nodes ˆ Excellent durability/recoverability ˆ Eventually consistent
  • 12. Introduction Storage Processing Monitoring Review We use Riak as our durable datastore ˆ Users, feeds, videos, etc ˆ Highly denormalized ˆ Limited MR queries (feeds, etc) ˆ Latency-bounded MR jobs are Erlang ˆ Hot-deployable ˆ Extensive use of conflict resolution ˆ Made possible by Risky
  • 13. Introduction Storage Processing Monitoring Review Riak at Showyou ˆ 51 million keys (153 M replicated) ˆ 100 GB of data (300 GB replicated) ˆ 260 gets/sec (baseline) ˆ 75 puts/sec (baseline) ˆ Capable of over 3000 ops/sec
  • 14. Introduction Storage Processing Monitoring Review SSDs are amazing WD 7200RPM Micron RealSSD P300 ˆ 100 ops/sec ˆ 1000+ ops/sec ˆ 95%: 100-300ms ˆ 95%: 3-5ms
  • 15. Introduction Storage Processing Monitoring Review When Riak fails, ˆ Another node takes up the slack ˆ Clients connected to that node reconnect to others ˆ Typically no service interruption ˆ However, latencies may rise ˆ Especially for MR jobs
  • 16. Introduction Storage Processing Monitoring Review Riak has downsides ˆ Difficult to debug ˆ Membership changes are dangerous ˆ Significantly slower than MySQL ˆ (Bitcask) All keys must fit in memory ˆ Mapreduce is only appropriate for known keys ˆ List-keys can take down your cluster Long story short: it’s only a KV store
  • 17. Introduction Storage Processing Monitoring Review +Redis
  • 18. Introduction Storage Processing Monitoring Review We use Redis for fast, temporary state ˆ List of users ˆ List of videos ˆ Counters ˆ Queues Incredibly fast, excellent primitives
  • 19. Introduction Storage Processing Monitoring Review When Redis fails, ˆ Daemons using those indexes pause ˆ Frontend service continues ˆ Bitcask scanners and incremental updaters repair any lost data
  • 20. Introduction Storage Processing Monitoring Review When Redis fails, ˆ Daemons using those indexes pause ˆ Frontend service continues ˆ Bitcask scanners and incremental updaters repair any lost data Eventually consistent.
  • 21. Introduction Storage Processing Monitoring Review We also use SOLR extensively ˆ Supplements Riak ˆ Complex indices ˆ Full-text search ˆ Analytics More on that later. . .
  • 22. Introduction Storage Processing Monitoring Review Processing
  • 23. Introduction Storage Processing Monitoring Review Do one thing well Lots of small processes handling well-defined tasks ˆ Easier to debug ˆ Easier to test ˆ Simplifies parallelism ˆ Simplifies error handling ˆ Less likely to cause total system failure
  • 24. Introduction Storage Processing Monitoring Review Minimize Shared State ˆ Vector clocks for concurrent modification ˆ Queues for message passing ˆ Riak for durable storage ˆ Redis for fast synchronous state
  • 25. Introduction Storage Processing Monitoring Review Crash by Default ˆ Someone else will take your work ˆ Repair constantly ˆ Assume everybody is out to kill you
  • 26. Introduction Storage Processing Monitoring Review Distribute ˆ Multiple threads, processes, hosts ˆ Failover IPs with Heartbeat ˆ Rolling restarts mean frequent deploys and nobody notices ˆ Losing a node is no big deal ˆ Scaling out is easy
  • 27. Introduction Storage Processing Monitoring Review Monitoring
  • 28. Introduction Storage Processing Monitoring Review UState: A state aggregator
  • 29. Introduction Storage Processing Monitoring Review Receive states over protobufs Host backend1.showyou.com Service feed merger rate Time unix epoch seconds State ok Metric 12.5 Description 12.5 feed items/sec
  • 30. Introduction Storage Processing Monitoring Review Query states ˆ state = "warning" or state = "critical" ˆ service =∼ "api %" and host != null
  • 31. Introduction Storage Processing Monitoring Review ˆ Combine states together (sum, average, . . . ) ˆ Send email on changes ˆ Forward to another UState server ˆ Forward to Graphite ˆ Dashboard
  • 32. Introduction Storage Processing Monitoring Review Understand application behavior
  • 33. Introduction Storage Processing Monitoring Review When can we. . . ?
  • 34. Introduction Storage Processing Monitoring Review It’s 23:15 PST.
  • 35. Introduction Storage Processing Monitoring Review It’s 23:15 PST. Do you know where YOUR database is?
  • 36. Introduction Storage Processing Monitoring Review http://github.com/aphyr/ustate
  • 37. Introduction Storage Processing Monitoring Review Recap ˆ Robust, discrete components ˆ Highly distributed ˆ Message passing ˆ Eventual consistency ˆ Comprehensive monitoring
  • 38. Introduction Storage Processing Monitoring Review Thanks! ˆ Basho (esp. Pharkmillups!) ˆ Formspring ˆ Bump