
Scalable service architectures @ BWS16





  1. Scalable Service Architectures: Lessons learned. Zoltán Németh, Engineering Manager, Core Systems, an IBM company
  2. Agenda: Our scalability experience • What is scalability? • Requirements in detail • Tips and tools • Extras, closing remarks
  3. Our experience
  4. Streaming stack
  5. Defining scalability. “Scalability is the ability to handle increased workload by repeatedly applying a cost-effective strategy for extending a system’s capacity.” (CMU paper, 2006) “How well a solution to some problem will work when the size of the problem increases. When the size decreases, the solution must fit.” (LINFO and Theo Schlossnagle, 2006)
  6. Self-contained service: Explicitly declare and isolate dependencies • Isolation from the outside system • Static linking • Do not rely on system packages
  7. Disposability: Maximize robustness with fast startup and graceful shutdown • Disposable processes • Graceful shutdown on SIGTERM • Handling sudden death with a robust queue backend
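The SIGTERM handling on this slide could be sketched roughly as follows; this is a minimal illustration, not code from the talk, and all names (`worker_loop`, the doubling "work") are made up. The handler only flips a flag, and the worker finishes its current job before stopping.

```python
import os
import queue
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    # Flip a flag only; the worker loop decides when it is safe to stop.
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

def worker_loop(jobs, results):
    """Process jobs until the queue is empty or a SIGTERM arrives."""
    while not shutting_down:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        results.append(job * 2)  # stand-in for real work

jobs = queue.Queue()
for i in range(3):
    jobs.put(i)

results = []
worker_loop(jobs, results)            # drains all three jobs
os.kill(os.getpid(), signal.SIGTERM)  # simulate the platform's stop signal
```

A job interrupted mid-flight would be returned to the queue backend, which is why the slide pairs this with a robust queue for handling sudden death.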
  8. Startup and Shutdown: Automate all the things • Chef • Docker • Gold-image-based deployment • Immutable • Handling tasks before shutdown
  9. Backing Services: Treat backing services as attached resources • No distinction between local and third-party services • Easily swap out resources • Export services via port binding • Become the backing service for another app
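Treating a backing service as an attached resource means the app only knows a resource locator taken from the environment, so swapping a local MySQL for a remote one is a config change rather than a code change. A minimal sketch (the `DATABASE_URL` variable name and default value are illustrative, 12-factor style):

```python
import os
from urllib.parse import urlparse

# The resource locator lives in the environment, not in the code.
url = urlparse(os.environ.get("DATABASE_URL",
                              "mysql://app:secret@localhost:3306/app"))

db_config = {
    "host": url.hostname,
    "port": url.port,
    "user": url.username,
    "password": url.password,
    "database": url.path.lstrip("/"),
}
```

The same mechanism lets one app become the backing service for another: the consuming app simply gets the producer's URL as a resource handle in its config.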
  10. Processes, concurrency: Stateless processes (not even sticky sessions) • Process types by work type • We <3 Linux processes • Shared-nothing, so adding concurrency is safe • Process distribution spanning machines
  11. Statelessness: Store everything in a datastore • Aggregate data • Chandra • Aggregator / map & reduce • Scalable datastores • Handling user sessions
  12. Monitoring: Application state and metrics • Dashboards • Alerting • Health: remove failing nodes • Capacity: act on trends
  13. Monitoring: Metrics collecting (Graphite, New Relic) • Self-aware checks • Cluster state (Zookeeper, Consul) • Scaling decision types: capacity amount, graph derivative, app requests
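The "graph derivative" scaling decision amounts to acting on the trend of a metric series rather than its absolute value. A sketch of the idea, with made-up thresholds (real ones would come from capacity planning, not from this example):

```python
def scale_decision(samples, up_slope=5.0, down_slope=-5.0):
    """Decide on scaling from the trend (derivative) of a metric series.

    `samples` are metric values (e.g. concurrent viewers) collected at a
    fixed interval; the slope is the average change per interval.
    """
    if len(samples) < 2:
        return "hold"
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    if slope >= up_slope:
        return "scale_up"
    if slope <= down_slope:
        return "scale_down"
    return "hold"
```

A central scaler could feed this from Graphite data, while app instances make their own requests based on self-checks or Zookeeper state, matching the decision types listed on the slide.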
  14. Load Balance and Resource Allocation: Load balancing distributes tasks • Utilize machines efficiently • VM-compatible apps • Flexibility • Adapting to available resources
  15. Load Balance: DNS or API • App-level balancing • Uniform entry point or proxy • Balance decisions: load, Zookeeper state, resource policies
  16. Service Separation: Failure is inevitable • Protect from failing components • Cascading failure • Fail fast • Decoupling • Asynchronous operations • Message queues
  17. Service Separation: Rate limiting • Circuit Breaker pattern: stop cascading failure, allow recovery (Hystrix) • Fail fast, fail silent • Service decoupling
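The circuit breaker's job is to stop a cascading failure by failing fast once a dependency is clearly down, then probing it again after a cooldown. A minimal sketch of the pattern (illustrative only, not the Hystrix implementation; all defaults are made up):

```python
import time

class CircuitBreaker:
    """After `max_failures` consecutive failures the circuit opens and
    calls fail fast; once `reset_timeout` seconds pass, one trial call
    is allowed through (the "half-open" state)."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

Failing fast here protects the caller's own thread pool and gives the failing dependency room to recover, which is the "allow recovery" half of the slide.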
  18. Extras: Debugging features • Logs • Clojure / JS consoles • Runtime configuration via env • Scaling API • Integrating several cloud providers • Automatic start / stop
  19. Reading: Scalable Internet Architectures by Theo Schlossnagle • The 12-factor App • The Carnegie Mellon paper • Circuit Breaker • Release It! by Michael T. Nygard
  20. Questions

Editor's Notes

  • A bit of Ustream intro
  • Definition
    Requirements coming from 12-factor, and some added by us
    Some more detail and tools on selected requirements
  • 30 day viewer graph. Clear peaks -> need for scaling
  • Quick description of the streaming stack, roles of components, how they require scaling
    - Transcontroller/transcoder scaling
    - UMS scaling
  • Carnegie Mellon University paper by Charles B. Weinstock, John B. Goodenough: On System Scalability
    LINFO: The Linux Information Project

    Next: principles
  • Example: calling imagemagick or curl from code – they might be there or might not be
    Bundle everything into the app instead
  • Disposable process: they can be started or stopped at a moment’s notice

    For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost.
    For a worker process, graceful shutdown is achieved by returning the current job to the work queue.
  • Docker: build images from dockerfile, deploy from repository
    Tasks before shutdown: moving jobs, log collection, sleep
  • A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as MySQL or CouchDB), messaging/queueing systems (such as RabbitMQ or Beanstalkd), SMTP services for outbound email (such as Postfix), and caching systems (such as Memcached).

    Put a resource locator in the config only – environment variables
    Example: Easily swap out a local mysql to a remote service

    The app does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.

    One app can become the backing service for another app, by providing the URL to the backing app as a resource handle in the config for the consuming app
  • Handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks handled by a worker process

    An individual VM can only grow so large (vertical scale), so the application must also be able to span multiple processes running on multiple physical machines.
  • Aggregate everything within the app and write it out in bulk; be careful about write frequency, as we must not lose too much data in a crash
    Aggregator map-reduce

    Redis: scales reads, write problematic
    Cassandra: quick scaling questionable
    Aerospike: scales reads and writes, working together with their eng team
    User sessions: persistent connection, NIO+
  • Alerting -> openduty
    Two important groups: Health vs capacity
  • Report everything to graphite, constantly check graph trends automatically
    Apps are self-aware, they know their health
    App instances report into Zookeeper and thus know about each other
    Central logic can request resource based on capacity or graph, app can request based on self-check or zookeeper

    Zookeeper, Consul: why we use them and what their advantages are
  • load balancing distributes workloads across multiple computing resources

    Flexibility: can increase or decrease its own size, example: Threadpools
    Adapting to CPU, RAM, disk, network
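The threadpool flexibility mentioned in this note, a pool that grows and shrinks with the observed workload, could be sketched like this; the sizing formula and all bounds are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def pool_size_for(backlog, min_workers=2, max_workers=16, jobs_per_worker=10):
    """Pick a worker count from the queue backlog, within fixed bounds.

    The point is only that the pool adapts to load instead of being
    fixed at startup; the constants here are illustrative.
    """
    wanted = min_workers + backlog // jobs_per_worker
    return max(min_workers, min(max_workers, wanted))

# A pool sized for a hypothetical backlog of 35 queued jobs:
pool = ThreadPoolExecutor(max_workers=pool_size_for(35))
```

The same idea extends to the other resources listed: an app that knows its CPU, RAM, disk, and network headroom can size its own internal concurrency accordingly.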
  • App level: transcontroller selects transcoder
    App level balance with proxy can be SPOF, careful

    Resource policies: even distribution, keep large chunks free for possible large tasks (transcoder use case), group requests together on some attribute (pro, etc)
  • Failure is inevitable because of large numbers, hardware issues, and independent networks
    Decoupling: serving one request should not wait on others
  • Hystrix by Netflix 2011/12
    Circuit Breaker: Martin Fowler post from 2014
    Service decoupling example: inserting layers between DB and UMS -> RGW. Then another layer between RGW and UMS -> Queue
  • Logs: logs as stream / stdout (factor #9), collect / transport / process
    Scaling API: Other considerations: price, network line to the cloud provider, instance type (spot vs normal)
    Openstack, Ganeti