Scaling at Showyou: Operations

Architecture/operations slides from the Scaling at Showyou talk.

From the same talk: John Muellerleile's Riak backend, Mecha: http://www.slideshare.net/jmuellerleile/scaling-with-riak-at-showyou

  1. Scaling at Showyou: Operations
      September 26, 2011
      Sections: Introduction, Storage, Processing, Monitoring, Review
  2. I’m Kyle Kingsbury
      • Handle: aphyr
      • Code: http://github.com/aphyr
      • Email: kyle@remixation.com
      • Focus: Backend, API, ops
  3. What the hell is Showyou?
  4. Nontrivial complexity
  5. Challenges
      • Scanning social networks
      • Feeds
      • Search
      • Trends
      • Responsive client experience
  6. Challenges
      • Scanning social networks
      • Feeds
      • Search
      • Trends
      • Responsive client experience
      • Everything fails all the time
  7. Storage
  8. We left MySQL
  9. We left MySQL
      • Changing the schema requires downtime
      • Crashes
      • Master-slave lag
      • Slow restarts
      • Node replacements difficult
      • Fully normalized queries complex, slow
  10. MySQL does scale
      But there are tradeoffs
  11. Riak
      • Key/value store
      • Homogeneous
      • Scales linearly with nodes
      • Excellent durability/recoverability
      • Eventually consistent
  12. We use Riak as our durable datastore
      • Users, feeds, videos, etc.
      • Highly denormalized
      • Limited MR queries (feeds, etc.)
      • Latency-bounded MR jobs are Erlang
      • Hot-deployable
      • Extensive use of conflict resolution
      • Made possible by Risky
  13. Riak at Showyou
      • 51 million keys (153 M replicated)
      • 100 GB of data (300 GB replicated)
      • 260 gets/sec (baseline)
      • 75 puts/sec (baseline)
      • Capable of over 3000 ops/sec
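The replicated figures on slide 13 are exactly three times the logical ones, which is consistent with three copies of each object. The slide does not state a replication factor, so treat that as an inference. A quick check of the arithmetic, with variable names chosen here purely for illustration:

```python
# Figures copied from slide 13.
keys_millions, keys_replicated_millions = 51, 153
data_gb, data_replicated_gb = 100, 300

# Inferred, not stated in the talk: copies kept of each object.
factor_from_keys = keys_replicated_millions / keys_millions
factor_from_data = data_replicated_gb / data_gb

# Baseline load compared with the quoted "over 3000 ops/sec" capacity.
baseline_ops = 260 + 75
headroom = 3000 / baseline_ops

print(factor_from_keys, factor_from_data, baseline_ops, round(headroom, 1))
```

On these numbers the cluster runs at roughly a ninth of its quoted capacity at baseline.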
  14. SSDs are amazing
      Drive                  Throughput       95%
      WD 7200RPM             100 ops/sec      100-300 ms
      Micron RealSSD P300    1000+ ops/sec    3-5 ms
  15. When Riak fails,
      • Another node takes up the slack
      • Clients connected to that node reconnect to others
      • Typically no service interruption
      • However, latencies may rise
      • Especially for MR jobs
  16. Riak has downsides
      • Difficult to debug
      • Membership changes are dangerous
      • Significantly slower than MySQL
      • (Bitcask) All keys must fit in memory
      • MapReduce is only appropriate for known keys
      • List-keys can take down your cluster
      Long story short: it’s only a KV store
  17. +Redis
  18. We use Redis for fast, temporary state
      • List of users
      • List of videos
      • Counters
      • Queues
      Incredibly fast, excellent primitives
  19. When Redis fails,
      • Daemons using those indexes pause
      • Frontend service continues
      • Bitcask scanners and incremental updaters repair any lost data
  20. When Redis fails,
      • Daemons using those indexes pause
      • Frontend service continues
      • Bitcask scanners and incremental updaters repair any lost data
      Eventually consistent.
  21. We also use SOLR extensively
      • Supplements Riak
      • Complex indices
      • Full-text search
      • Analytics
      More on that later...
  22. Processing
  23. Do one thing well
      Lots of small processes handling well-defined tasks
      • Easier to debug
      • Easier to test
      • Simplifies parallelism
      • Simplifies error handling
      • Less likely to cause total system failure
  24. Minimize Shared State
      • Vector clocks for concurrent modification
      • Queues for message passing
      • Riak for durable storage
      • Redis for fast synchronous state
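The "vector clocks for concurrent modification" bullet can be illustrated with a generic vector clock. This is a minimal textbook sketch, not Riak's or Risky's implementation; the actor names (`api1`, `scanner2`) and all function names are hypothetical.

```python
from typing import Dict

VectorClock = Dict[str, int]  # actor id -> logical counter


def increment(clock: VectorClock, actor: str) -> VectorClock:
    """Return a copy of clock with actor's counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if a has seen every event that b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """Neither clock descends from the other: the writes conflict."""
    return not descends(a, b) and not descends(b, a)


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock for a value that resolves both conflicting versions."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in set(a) | set(b)}


base = increment({}, "api1")          # first write
left = increment(base, "api1")        # api1 writes again
right = increment(base, "scanner2")   # scanner2 writes from the same base

print(concurrent(left, right), merge(left, right))
```

When two versions are concurrent, the application has to reconcile the values themselves (for example by unioning two lists) and store the result under the merged clock.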
  25. Crash by Default
      • Someone else will take your work
      • Repair constantly
      • Assume everybody is out to kill you
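One way to read "someone else will take your work" is a shared queue where a job goes back on the queue whenever the worker handling it crashes. The sketch below uses an in-process deque as a stand-in for a real queue; `drain`, `broken_worker`, and `reliable_worker` are hypothetical names, and the talk does not describe Showyou's actual mechanism at this level.

```python
from collections import deque


def drain(jobs, workers, max_attempts=100):
    """Hand jobs to workers in turn; requeue a job if its worker crashes."""
    queue = deque(jobs)
    done, crashes = [], 0
    for attempt in range(max_attempts):
        if not queue:
            break
        job = queue.popleft()
        worker = workers[attempt % len(workers)]
        try:
            done.append(worker(job))
        except Exception:
            # Crash by default: no local recovery, just give the job back
            # so another worker can pick it up.
            crashes += 1
            queue.append(job)
    return done, crashes


def broken_worker(job):
    raise RuntimeError("worker crashed on " + job)


def reliable_worker(job):
    return job.upper()


done, crashes = drain(["a", "b", "c"], [broken_worker, reliable_worker])
print(sorted(done), crashes)
```

Every job completes even though one of the two workers fails on every call. The `max_attempts` bound is only there to keep the sketch from looping forever if every worker is broken.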
  26. Distribute
      • Multiple threads, processes, hosts
      • Failover IPs with Heartbeat
      • Rolling restarts mean frequent deploys and nobody notices
      • Losing a node is no big deal
      • Scaling out is easy
  27. Monitoring
  28. UState: A state aggregator
  29. Receive states over protobufs
      Host:        backend1.showyou.com
      Service:     feed merger rate
      Time:        unix epoch seconds
      State:       ok
      Metric:      12.5
      Description: 12.5 feed items/sec
  30. Query states
      • state = "warning" or state = "critical"
      • service =~ "api %" and host != null
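To make the two example queries concrete, here is a small sketch that models a state with the fields from slide 29 and expresses each query as a Python predicate. It is not UState's API or query parser. The `State` class, the `like` helper, the reading of `%` as "any run of characters", the timestamp, and the two `api` states are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    host: Optional[str]
    service: str
    time: int          # unix epoch seconds
    state: str
    metric: float
    description: str


def like(value: str, pattern: str) -> bool:
    """Assumed semantics for =~ : '%' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None


states = [
    # Field values from slide 29; the timestamp is illustrative.
    State("backend1.showyou.com", "feed merger rate", 1317000000,
          "ok", 12.5, "12.5 feed items/sec"),
    # Hypothetical states, added only to exercise the queries.
    State("api1.example.com", "api latency", 1317000000,
          "warning", 0.8, "hypothetical"),
    State(None, "api errors", 1317000000,
          "critical", 4.0, "hypothetical"),
]

# state = "warning" or state = "critical"
problems = [s for s in states if s.state == "warning" or s.state == "critical"]

# service =~ "api %" and host != null
api_with_host = [s for s in states
                 if like(s.service, "api %") and s.host is not None]

print([s.service for s in problems], [s.service for s in api_with_host])
```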
  31. • Combine states together (sum, average, ...)
      • Send email on changes
      • Forward to another UState server
      • Forward to Graphite
      • Dashboard
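"Combine states together (sum, average, ...)" could look like the following: fold the metrics of several per-host states into one aggregate state. This is a sketch under assumptions, not UState's code. The severity ordering, the rule that the aggregate takes the worst input state, the `combine` function, and the `backend2` figures are all hypothetical.

```python
# Assumed ordering from least to most severe.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def combine(states, service, fold):
    """Fold several states into one; the result carries the worst state."""
    metrics = [s["metric"] for s in states]
    worst = max((s["state"] for s in states), key=SEVERITY.__getitem__)
    return {"service": service, "state": worst, "metric": fold(metrics)}


def average(values):
    return sum(values) / len(values)


per_host = [
    {"host": "backend1", "service": "feed merger rate",
     "state": "ok", "metric": 12.5},
    # Hypothetical second host.
    {"host": "backend2", "service": "feed merger rate",
     "state": "warning", "metric": 7.5},
]

total = combine(per_host, "feed merger rate total", sum)
mean = combine(per_host, "feed merger rate mean", average)
print(total, mean)
```

The combined state can then be treated like any other: queried, emailed on change, or forwarded.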
  32. Understand application behavior
  33. When can we...?
  34. It’s 23:15 PST.
  35. It’s 23:15 PST.
      Do you know where YOUR database is?
  36. http://github.com/aphyr/ustate
  37. Recap
      • Robust, discrete components
      • Highly distributed
      • Message passing
      • Eventual consistency
      • Comprehensive monitoring
  38. Thanks!
      • Basho (esp. Pharkmillups!)
      • Formspring
      • Bump
