Scalable service architectures @ VDB16


  1. Scalable Service Architectures: Lessons Learned
     Zoltán Németh, Sr Engineering Manager, Streaming & Player
     An IBM company
  2.  Founded in 2007, acquired by IBM in 2016
      Live streaming and VOD
      Freemium / Pro / Demand / Align
  3. Streaming flow
  4. Now back to 2009...
  5. Earthquake in Japan
     Protests in Ukraine, Egypt, Syria
     Asteroid approach
     SpaceX launch
     El Clásico
  6. We must scale
  7. Defining scalability
      "Scalability is the ability to handle increased workload by repeatedly applying a cost-effective strategy for extending a system's capacity." (CMU paper, 2006)
      "How well a solution to some problem will work when the size of the problem increases. When the size decreases, the solution must fit." (dictionary.com and Theo Schlossnagle, 2006)
  8. Self-contained service
      Explicitly declare and isolate dependencies
      Isolation from the outside system
      Static linking
      Pay attention to the GPL
      Do not rely on system packages
  9. Disposability
      Maximize robustness with fast startup and graceful shutdown
      Disposable processes
      Graceful shutdown on SIGTERM
      Handling sudden death: robust queue backend
  10. Backing Services
      Treat backing services as attached resources
      No distinction between local and third-party services
      Easily swap out resources
      Export services via port binding
      Become the backing service for another app
     Drawing source: 12factor.net
  11. Processes, concurrency
      Stateless processes (not even sticky sessions)
      Process types by work type
      We <3 Linux processes
      Container > VM
      Shared-nothing: adding concurrency is safe
      Process distribution spanning machines
  12. Statelessness
      Store everything in a datastore
      Aggregate data: aggregator / map & reduce, CQEngine, Chandra
      Scalable datastores
      Handling user sessions
  13. Microservices
      Self-contained
      Disposable
      Stateless
      Shared-nothing
      API communication
      Dependency management moved outside the service
      Be warned!
     Image credits: christofcoetzee.co.za, techblog.netflix.com
  14. Monitoring
      Metrics collection: Graphite, New Relic
      Self-aware applications
      Cluster state: Zookeeper, Consul
      Scaling decisions: capacity amount, graph derivative, app requests
  15. Load Balancing
      DNS or API
      App-level balancing
      Uniform entry point or proxy
      Balancing decisions: load, Zookeeper state, resource policies
  16. Service Separation
      Rate limiting
      Failure is inevitable
      Circuit Breaker pattern: stop cascading failure, allow recovery
      Fail fast, fail silent
      Hystrix
      Service decoupling
      Asynchronous operations
  17. Deployment
      Automate all the things
      Chef & VMs
      Docker
      Immutable deployment: Docker / Kubernetes / Rancher
      Handling tasks before shutdown
  18. Extras
      Debugging features
      Log processing: Logstash, Kibana
      Clojure / JS consoles
      Runtime configuration via env
      Scaling API
      Cloud providers
      Automatic start / stop
  19. Reading
      Scalable Internet Architectures by Theo Schlossnagle
      The 12-factor App: http://12factor.net/
      Carnegie Mellon paper: http://www.sei.cmu.edu/reports/06tn012.pdf
      Circuit Breaker: http://martinfowler.com/bliki/CircuitBreaker.html
      Release It! by Michael T. Nygard
      Netflix Tech Blog: http://techblog.netflix.com/
  20. Questions? syntaxerror@hu.ibm.com

Editor's Notes

  • A bit of Ustream intro
  • Quick description of the streaming stack, roles of components, how they require scaling
    - Transcontroller/transcoder scaling
    - UMS scaling
  • 30 day viewer graph. Clear peaks -> need for scaling
  • Scaling delivery (CDN, UCDN) is covered in another talk
    Here: scaling applications!
    Now comes some scaling theory
  • Carnegie Mellon University paper by Charles B. Weinstock, John B. Goodenough: On System Scalability
    LINFO: The Linux Information Project http://www.linfo.org/

    Next: principles
  • Example: calling imagemagick or curl from code; they may or may not be present on the system
    Bundle everything into the app instead
  • Disposable process: they can be started or stopped at a moment’s notice

    For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost.
    For a worker process, graceful shutdown is achieved by returning the current job to the work queue.
  • A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as MySQL or CouchDB), messaging/queueing systems (such as RabbitMQ or Beanstalkd), SMTP services for outbound email (such as Postfix), and caching systems (such as Memcached).

    Put a resource locator in the config only – environment variables
    Example: Easily swap out a local mysql to a remote service

    The app does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.

    One app can become the backing service for another app, by providing the URL to the backing app as a resource handle in the config for the consuming app
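Reading the resource locator from the environment might look like this in Python; the `DATABASE_URL` variable name and the default URL follow the 12-factor convention and are assumptions, not something fixed by the talk:

```python
import os
from urllib.parse import urlparse

def database_config(default="mysql://localhost:3306/app"):
    """Resolve the backing datastore from the environment only, so swapping
    a local MySQL for a remote service is a config change, not a code change."""
    url = urlparse(os.environ.get("DATABASE_URL", default))
    return {"host": url.hostname, "port": url.port, "db": url.path.lstrip("/")}
```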
  • Handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks handled by a worker process

    An individual VM can only grow so large (vertical scale), so the application must also be able to span multiple processes running on multiple physical machines.
  • Aggregate everything within the app and write it out in bulk; be careful about write frequency, since a crash must not lose too much data
    Aggregator map-reduce

    Redis: scales reads, writes are problematic
    Cassandra: quick scaling questionable
    Aerospike: scales reads and writes; working together with their eng team
    User sessions: persistent connection, NIO+
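A minimal sketch of the aggregate-then-bulk-write idea in Python; `max_items` and `max_age` are assumed tuning knobs that bound how much data a crash can lose:

```python
import time

class Aggregator:
    """Buffer writes in-process and flush them to the datastore in bulk.
    The thresholds bound crash loss: at most max_items records or max_age
    seconds of data can be in the buffer at any time."""
    def __init__(self, sink, max_items=100, max_age=5.0):
        self.sink, self.max_items, self.max_age = sink, max_items, max_age
        self.buffer, self.last_flush = [], time.monotonic()

    def add(self, item):
        self.buffer.append(item)
        too_full = len(self.buffer) >= self.max_items
        too_old = time.monotonic() - self.last_flush >= self.max_age
        if too_full or too_old:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)   # one bulk write instead of many small ones
            self.buffer = []
        self.last_flush = time.monotonic()
```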
  • Report everything to graphite, constantly check graph trends automatically
    Apps are self-aware, they know their health
    App instances report into Zookeeper and thus know about each other
    Central logic can request resource based on capacity or graph, app can request based on self-check or zookeeper

    Zookeeper, Consul: why, and what their advantages are
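Reporting a datapoint to Graphite can be as simple as Carbon's plaintext protocol: one `<metric path> <value> <timestamp>` line over TCP (port 2003 by default). The host name below is an assumed internal address:

```python
import socket
import time

def carbon_line(name, value, timestamp):
    # Carbon plaintext protocol: '<metric path> <value> <timestamp>\n'
    return "%s %f %d\n" % (name, value, timestamp)

def report_metric(name, value, host="graphite.internal", port=2003):
    """Send one datapoint to a Carbon listener (host is an assumption)."""
    line = carbon_line(name, value, int(time.time()))
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))
```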
  • Load balancing distributes workloads across multiple computing resources
    Flexibility: can increase or decrease its own size, example: Threadpools
    Adapting to CPU, RAM, disk, network

    App level: transcontroller selects transcoder
    App level balance with proxy can be SPOF, careful

    Resource policies: even distribution, keep large chunks free for possible large tasks (transcoder use case), group requests together on some attribute (pro, etc)
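The "keep large chunks free" policy amounts to best-fit placement; a Python sketch with assumed capacity units:

```python
def pick_node(nodes, task_size):
    """Best-fit placement: among nodes with enough free capacity, pick the
    one with the LEAST free capacity, so large contiguous chunks of free
    capacity stay intact for big tasks (the transcoder use case).
    nodes maps node name -> free capacity."""
    candidates = {name: free for name, free in nodes.items() if free >= task_size}
    if not candidates:
        return None   # nothing fits: a signal to scale out instead
    return min(candidates, key=candidates.get)
```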
  • Failure is inevitable because of large numbers, hardware issues, and independent networks
    Hystrix by Netflix 2011/12
    Circuit Breaker: Martin Fowler post from 2014
    Decoupling: serving one request should not wait on others
    Service decoupling example: inserting layers between DB and UMS -> RGW. Then another layer between RGW and UMS -> Queue
    Antipattern example: connection limit, if filled up, new connections are kept waiting until a resource frees up
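A minimal circuit breaker in the spirit of Fowler's pattern; this is a sketch, not the Hystrix API:

```python
import time

class CircuitBreaker:
    """Wrap a call to a remote dependency: open after max_failures
    consecutive errors, fail fast while open, and allow one trial call
    through after reset_timeout seconds (the half-open state)."""
    def __init__(self, call, max_failures=3, reset_timeout=30.0):
        self.call, self.max_failures, self.reset_timeout = call, max_failures, reset_timeout
        self.failures, self.opened_at = 0, None

    def __call__(self, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = self.call(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()   # trip the breaker
            raise
        self.failures = 0                # any success closes the circuit
        return result
```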
  • Docker: build images from dockerfile, deploy from repository
    Tasks before shutdown: moving jobs, log collection, sleep
  • There is an environment, and we put a Kubernetes cluster into it. It runs on N machines; of those, 1-3 are the Kubernetes masters and the rest are worker nodes, where the users' applications run. The environments are well separated and do not affect each other, but they can find each other when needed.
    Pods are sets of containers (one or more, usually just one). For example, UHS and a chunkserver on the same machine form a single pod, because the chunkserver reads the files written by Ingest. That is two containers, but still one pod.
    An app consists of services and replication controllers; the replication controller supervises that the right number of instances of a given pod (the parent pod) are running.
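The pod/replication-controller shape described above might look like this as a manifest; the names, image tags, and replica count are assumptions for illustration:

```yaml
# A two-container pod (UHS + chunkserver share a machine so the
# chunkserver can read the files Ingest writes), kept at 3 copies
# by a replication controller.
apiVersion: v1
kind: ReplicationController
metadata:
  name: uhs
spec:
  replicas: 3
  template:
    metadata:
      labels: {app: uhs}
    spec:
      containers:
        - name: uhs
          image: registry.internal/uhs:latest
        - name: chunkserver
          image: registry.internal/chunkserver:latest
```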
  • Logs: logs as stream / stdout (factor #9), collect / transport / process
    Scaling API: Other considerations: price, network line to the cloud provider, instance type (spot vs normal)
    Openstack, Ganeti