More Related Content


Scalable Service Architectures

  1. Scalable Service Architectures Lessons learned Zoltán Németh Engineering Manager, Core Systems
  2. Agenda  Our scalability experience  What is Scalability?  Requirements in detail  Tips and tools  Extras, Closing remarks
  3. Our experience
  4. Streaming stack
  5. Defining scalability Scalability is the ability to handle increased workload by repeatedly applying a costeffective strategy for extending a system’s capacity. (CMU paper, 2006) How well a solution to some problem will work when the size of the problem increases. When the size decreases, the solution must fit. ( and Theo Schlossnagle, 2006)
  6. Self-contained service  Explicitly declare and isolate dependencies  Isolation from the outside system  Static linking  Do not rely on system packages
  7. Disposability  Maximize robustness with fast startup and graceful shutdown  Disposable processes  Graceful shutdown on SIGTERM  Handling sudden death: robust queue backend
  8. Startup and Shutdown  Automate all the things  Chef  Docker  Gold image based deployment  Immutable  Handling tasks before shutdown
  9. Backing Services  Treat backing services as attached resources  No distinction between local and third party services  Easily swap out resources  Export services via port binding  Become the backing service for another app
  10. Processes, concurrency  Stateless processes (not even sticky sessions)  Process types by work type  We <3 linux process  Shared-nothing  adding concurrency is safe  Process distribution spanning machines
  11. Statelessness  Store everything in a datastore  Aggregate data  Chandra  Scalable datastores  Redis  Cassandra  Aerospike  Handling user sessions
  12. Monitoring  Application state and metrics  Dashboards  Alerting  Health  Remove failing nodes  Capacity  Act on trends
  13. Monitoring  Metrics collecting  Graphite, New Relic  Self-aware checks  Cluster state  Zookeeper, Consul  Scaling decision types  Capacity amount  Graph derivative  App requests
  14. Load Balance and Resource Allocation  Load Balance: distribute tasks  Utilize machines efficiently  VM compatible apps  Flexibility  Adapting to available resources
  15. Load Balance  DNS or API  App level balance  Uniform entry point or proxy  Balance decisions  Load  Zookeeper state  Resource policies
  16. Service Separation  Failure is inevitable  Protect from failing components  Cascading failure  Fail fast  Decoupling  Asynchronous operations  Message queues
  17. Service Separation  Rate limiting  Circuit Breaker pattern  Stop cascading failure, allow recovery  Hystrix  Fail fast, fail silent  Service decoupling
  18. Extras  Debugging features  Logs  Clojure / JS consoles  Runtime configuration via env  Scaling API  Integrating several cloud providers  Automatic start / stop
  19. Reading  Scalable Internet Architectures by Theo Schlossnagle  The 12-factor App:  Carnegie Mellon Paper:  Circuit Breaker:  Release It! by Michael T. Nygard
  20. Questions

Editor's Notes

  1. A bit of Ustream intro
  2. Definition Requirements coming from 12-factor, and some added by us Some more detail and tools on selected requirements
  3. 30 day viewer graph. Clear peaks -> need for scaling
  4. Quick description of the streaming stack, roles of components, how they require scaling - Transcontroller/transcoder scaling - UMS scaling
  5. Quick description of the streaming stack, roles of components, how they require scaling - Transcontroller/transcoder scaling - UMS scaling
  6. Carnegie Mellon University paper by Charles B. Weinstock, John B. Goodenough: On System Scalability LINFO: The Linux Information Project
  7. Example: calling imagemagick or curl from code – they might be there or might not be Bundle everything into the app instead
  8. Disposable process: they can be started or stopped at a moment’s notice For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost. For a worker process, graceful shutdown is achieved by returning the current job to the work queue.
  9. Docker: build images from dockerfile, deploy from repository Tasks before shutdown: moving jobs, log collection, sleep
  10. A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as MySQL or CouchDB), messaging/queueing systems (such as RabbitMQ or Beanstalkd), SMTP services for outbound email (such as Postfix), and caching systems (such as Memcached). Put a resource locator in the config only – environment variables Example: Easily swap out a local mysql to a remote service The app does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port. One app can become the backing service for another app, by providing the URL to the backing app as a resource handle in the config for the consuming app
  11. Handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks handled by a worker process An individual VM can only grow so large (vertical scale), so the application must also be able to span multiple processes running on multiple physical machines.
  12. Aggregate everything within the app and write it out in bulk – careful about write frequency, must not lose too many data on a crash Redis: scales reads, write problematic Cassandra: quick scaling questionable Aerospike: scales reads and writes, working together with their eng team User sessions: persistent connection, NIO+
  13. Alerting -> openduty Two important groups: Health vs capacity
  14. Report everything to graphite, constantly check graph trends automatically Apps are self-aware, they know their health App instances report into Zookeeper and thus know about each other Central logic can request resource based on capacity or graph, app can request based on self-check or zookeeper Zookeeper, Consul: miért, mik az előnyei
  15. load balancing distributes workloads across multiple computing resources Flexibility: can increase or decrease its own size, example: Threadpools Adapting to CPU, RAM, disk, network
  16. App level: transcontroller selects transcoder App level balance with proxy can be SPOF, careful Resource policies: even distribution, keep large chunks free for possible large tasks (transcoder use case), group requests together on some attribute (pro, etc)
  17. Failure inevitable because: large numbers, hw issues, independent network Decoupling: serving one request should not wait on others
  18. Hystrix by Netflix 2011/12 Circuit Breaker: Martin Fowler post from 2014 Service decoupling example: inserting layers between DB and UMS -> RGW. Then another layer between RGW and UMS -> Queue
  19. Logs: logs as stream / stdout (factor #9), collect / transport / process Scaling API: Other considerations: price, network line to the cloud provider, instance type (spot vs normal)