Scalable Service Architectures
Lessons learned
Zoltán Németh
Engineering Manager, Core Systems
An IBM company
Agenda
• Our scalability experience
• What is scalability?
• Requirements in detail
• Tips and tools
• Extras, closing remarks
Our experience
Streaming stack
Defining scalability
Scalability is the ability to handle increased workload by repeatedly applying a cost-effective strategy for extending a system’s capacity. (CMU paper, 2006)
How well a solution to some problem will work when the size of the problem increases. When the size decreases, the solution must fit. (dictionary.com and Theo Schlossnagle, 2006)
Self-contained service
• Explicitly declare and isolate dependencies
• Isolation from the outside system
• Static linking
• Do not rely on system packages
Disposability
• Maximize robustness with fast startup and graceful shutdown
• Disposable processes
• Graceful shutdown on SIGTERM
• Handling sudden death: robust queue backend
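
A minimal sketch of graceful shutdown on SIGTERM for a worker process, assuming an in-process `queue.Queue` as a stand-in for the robust queue backend. The names `jobs`, `process`, `run_worker` and `install_handlers` are hypothetical and not taken from the talk. The handler only sets a flag; an interrupted job goes back to the queue instead of being lost.

```python
import queue
import signal
import threading

# Hypothetical in-process stand-in for a robust queue backend.
jobs = queue.Queue()
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: only set a flag and let the worker loop exit cleanly."""
    shutdown_requested.set()


def install_handlers():
    """Register the SIGTERM handler (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, request_shutdown)


def process(job, steps=3):
    """Pretend to work in small steps; return False if interrupted."""
    for _ in range(steps):
        if shutdown_requested.is_set():
            return False
        # One unit of real work would happen here.
    return True


def run_worker():
    """Consume jobs until the queue is empty or shutdown is requested."""
    done = []
    while not shutdown_requested.is_set():
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        if process(job):
            done.append(job)
        else:
            jobs.put(job)  # return the unfinished job to the work queue
    return done
```

A real worker would call `install_handlers()` once at startup and block on the queue rather than exit when it is empty.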
Startup and Shutdown
• Automate all the things
• Chef
• Docker
• Gold image based deployment
• Immutable
• Handling tasks before shutdown
Backing Services
• Treat backing services as attached resources
• No distinction between local and third party services
• Easily swap out resources
• Export services via port binding
• Become the backing service for another app
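
One way to treat a backing service as an attached resource is to keep only a resource locator in the environment and parse it at startup. The sketch below assumes a hypothetical `DATABASE_URL` variable and a hypothetical helper `backing_service`; swapping a local MySQL for a remote one then only means changing the variable, not the code.

```python
import os
from urllib.parse import urlparse


def backing_service(name, default):
    """Read a resource locator from the environment and split it into parts.

    `name` is the environment variable (an assumption of this sketch),
    `default` is the locator used when the variable is not set.
    """
    url = urlparse(os.environ.get(name, default))
    return {
        "scheme": url.scheme,
        "host": url.hostname,
        "port": url.port,
        "path": url.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Falls back to a local instance when DATABASE_URL is not set.
    print(backing_service("DATABASE_URL", "mysql://localhost:3306/app"))
```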
Processes, concurrency
• Stateless processes (not even sticky sessions)
• Process types by work type
• We <3 Linux processes
• Shared-nothing -> adding concurrency is safe
• Process distribution spanning machines
Statelessness
• Store everything in a datastore
• Aggregate data
• Chandra
• Aggregator / map & reduce
• Scalable datastores
• Handling user sessions
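
The "aggregate data" point can be illustrated with a small in-app aggregator that counts events in memory and writes them out in bulk. This is a sketch under assumptions, not the implementation used in the talk: the class name `Aggregator`, the `sink` callable and the `max_pending` threshold are all hypothetical. The threshold bounds how much data a crash can lose.

```python
from collections import Counter


class Aggregator:
    """Collect counters in memory and hand them to a sink in batches."""

    def __init__(self, sink, max_pending=1000):
        self._sink = sink                # callable that writes one batch to the datastore
        self._max_pending = max_pending  # flush after this many recorded events
        self._counts = Counter()
        self._pending = 0

    def record(self, key, amount=1):
        """Add one event; flush automatically when the threshold is reached."""
        self._counts[key] += amount
        self._pending += 1
        if self._pending >= self._max_pending:
            self.flush()

    def flush(self):
        """Write the current batch out and start a new one."""
        if not self._counts:
            return
        batch = dict(self._counts)
        self._counts = Counter()
        self._pending = 0
        self._sink(batch)
```

A lower `max_pending` means more frequent writes and less data at risk; a production version would usually also flush on a timer and on shutdown.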
Monitoring
• Application state and metrics
• Dashboards
• Alerting
• Health
• Remove failing nodes
• Capacity
• Act on trends
Monitoring
• Metrics collecting
• Graphite, New Relic
• Self-aware checks
• Cluster state
• Zookeeper, Consul
• Scaling decision types
• Capacity amount
• Graph derivative
• App requests
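
The first two scaling decision types (capacity amount and graph derivative) can be sketched as a single function over recent metric samples. Everything here is an assumption made for illustration: the function names, the 80% / 30% thresholds and the look-ahead `horizon` are not from the talk, and the samples are assumed to be evenly spaced.

```python
def slope(samples):
    """Least-squares slope of evenly spaced samples (units per interval)."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def scaling_decision(samples, capacity, high=0.8, low=0.3, horizon=5):
    """Decide on scaling from the current load and its projected trend.

    Scale up when the current or projected load crosses the high mark,
    scale down only when both stay under the low mark.
    """
    current = samples[-1]
    projected = current + slope(samples) * horizon
    if current >= high * capacity or projected >= high * capacity:
        return "scale_up"
    if current <= low * capacity and projected <= low * capacity:
        return "scale_down"
    return "hold"
```

Using the trend lets the system start new instances before the capacity limit is actually reached, which matters when startup is not instant.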
Load Balance and Resource Allocation
• Load Balance: distribute tasks
• Utilize machines efficiently
• VM compatible apps
• Flexibility
• Adapting to available resources
Load Balance
• DNS or API
• App level balance
• Uniform entry point or proxy
• Balance decisions
• Load
• Zookeeper state
• Resource policies
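
Two possible resource policies, shown as a sketch: spreading tasks evenly, and packing tasks tightly so that large chunks of capacity stay free for large tasks. The node names and the notion of "free slots" are hypothetical; the talk does not say how its balancer represents capacity.

```python
def pick_even(free, need):
    """Even distribution: choose the node with the most free capacity."""
    candidates = {node: slots for node, slots in free.items() if slots >= need}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def pick_best_fit(free, need):
    """Keep large chunks free: choose the tightest node that still fits."""
    candidates = {node: slots for node, slots in free.items() if slots >= need}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    free_slots = {"t1": 8, "t2": 3, "t3": 5}  # hypothetical transcoder nodes
    print(pick_even(free_slots, 2))      # spreads load onto the emptiest node
    print(pick_best_fit(free_slots, 2))  # leaves the emptiest node untouched
```

Both functions return `None` when no node can take the task, which is the point where a scaling request would be raised.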
Service Separation
• Failure is inevitable
• Protect from failing components
• Cascading failure
• Fail fast
• Decoupling
• Asynchronous operations
• Message queues
Service Separation
• Rate limiting
• Circuit Breaker pattern
• Stop cascading failure, allow recovery
• Hystrix
• Fail fast, fail silent
• Service decoupling
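
A minimal sketch of the Circuit Breaker pattern in plain Python. It is not the Hystrix API (Hystrix is a Java library); the class, its parameters and the injected `clock` are assumptions made for illustration. After `max_failures` consecutive errors the breaker opens and fails fast; once `reset_timeout` has passed it lets one trial call through so the protected service can recover.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout elapsed: half-open, allow a single trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Callers can catch `CircuitOpenError` and return a fallback or nothing at all, which corresponds to the "fail silent" option.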
Extras
• Debugging features
• Logs
• Clojure / JS consoles
• Runtime configuration via env
• Scaling API
• Integrating several cloud providers
• Automatic start / stop
Reading
• Scalable Internet Architectures by Theo Schlossnagle
• The 12-factor App: http://12factor.net/
• Carnegie Mellon Paper: http://www.sei.cmu.edu/reports/06tn012.pdf
• Circuit Breaker: http://martinfowler.com/bliki/CircuitBreaker.html
• Release It! by Michael T. Nygard
Questions
syntaxerror@ustream.tv

Scalable service architectures @ BWS16


Editor's Notes

  1. A bit of Ustream intro
  2. Definition. Requirements coming from 12-factor, plus some added by us. Some more detail and tools on selected requirements.
  3. 30 day viewer graph. Clear peaks -> need for scaling
  4. Quick description of the streaming stack, roles of components, how they require scaling - Transcontroller/transcoder scaling - UMS scaling
  5. Quick description of the streaming stack, roles of components, how they require scaling - Transcontroller/transcoder scaling - UMS scaling
  6. Carnegie Mellon University paper by Charles B. Weinstock and John B. Goodenough: On System Scalability. LINFO: The Linux Information Project (http://www.linfo.org/). Next: principles.
  7. Example: calling imagemagick or curl from code. They may or may not be installed on the host, so bundle everything into the app instead.
  8. Disposable processes can be started or stopped at a moment’s notice. For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost. For a worker process, graceful shutdown is achieved by returning the current job to the work queue.
  9. Docker: build images from a Dockerfile, deploy from a repository. Tasks before shutdown: moving jobs, log collection, sleep.
  10. A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as MySQL or CouchDB), messaging/queueing systems (such as RabbitMQ or Beanstalkd), SMTP services for outbound email (such as Postfix), and caching systems (such as Memcached). Put only a resource locator in the config (environment variables). Example: easily swap a local MySQL for a remote service. The app does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port and listening to requests coming in on that port. One app can become the backing service for another app by providing the URL to the backing app as a resource handle in the config for the consuming app.
  11. Handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks by a worker process. An individual VM can only grow so large (vertical scale), so the application must also be able to span multiple processes running on multiple physical machines.
  12. Aggregate everything within the app and write it out in bulk. Be careful about write frequency: a crash must not lose too much data. Aggregator: map-reduce. Redis: scales reads, writes are problematic. Cassandra: quick scaling is questionable. Aerospike: scales reads and writes; working together with their engineering team. User sessions: persistent connection, NIO+.
  13. Alerting -> openduty. Two important groups: health vs. capacity.
  14. Report everything to Graphite and constantly check graph trends automatically. Apps are self-aware: they know their own health. App instances report into Zookeeper and thus know about each other. Central logic can request resources based on capacity or graph; an app can request them based on a self-check or Zookeeper state. Zookeeper, Consul: why, and what are the advantages.
  15. Load balancing distributes workloads across multiple computing resources. Flexibility: an app can increase or decrease its own size, for example with thread pools. Adapting to CPU, RAM, disk, network.
  16. App level: the transcontroller selects the transcoder. App-level balancing through a proxy can be a SPOF, so be careful. Resource policies: even distribution; keep large chunks free for possible large tasks (transcoder use case); group requests together on some attribute (pro, etc).
  17. Failure is inevitable because of large numbers, hardware issues, and an independent network. Decoupling: serving one request should not wait on others.
  18. Hystrix by Netflix, 2011/12. Circuit Breaker: Martin Fowler post from 2014. Service decoupling example: inserting layers between DB and UMS -> RGW, then another layer between RGW and UMS -> Queue.
  19. Logs: logs as a stream / stdout (factor #11 of the 12-factor app); collect / transport / process. Scaling API. Other considerations: price, network line to the cloud provider, instance type (spot vs. normal). Openstack, Ganeti.