SlideShare a Scribd company logo
1 of 37
Service Architectures at Scale
Lessons from Google and eBay
Randy Shoup
@randyshoup
linkedin.com/in/randyshoup
Architecture
Evolution
• eBay
• 5th generation today
• Monolithic Perl  Monolithic C++  Java  microservices
• Twitter
• 3rd generation today
• Monolithic Rails  JS / Rails / Scala  microservices
• Amazon
• Nth generation today
• Monolithic C++  Java / Scala  microservices
Service Architectures
at Scale
• Ecosystem of Services
• Building a Service
• Operating a Service
• Service Anti-Patterns
Service Architectures
at Scale
• Ecosystem of Services
• Building a Service
• Operating a Service
• Service Anti-Patterns
Ecosystem
of Services
• Hundreds to thousands of
independent services
• Many layers of dependencies,
no strict tiers
• Graph of relationships, not a
hierarchy
C
B
A E
F
G
D
Evolution,
not Intelligent Design
• No centralized, top-down design of the system
• Variation and Natural selection
o Create / extract new services when needed to solve a problem
o Deprecate services when no longer used
o Services justify their existence through usage
• Appearance of clean layering is an emergent
property
Google
Service Layering
• Cloud Datastore: NoSQL service
o Highly scalable and resilient
o Strong transactional consistency
o SQL-like rich query capabilities
• Megastore: geo-scale structured
database
o Multi-row transactions
o Synchronous cross-datacenter replication
• Bigtable: cluster-level structured storage
o (row, column, timestamp) -> cell contents
• Colossus: next-generation clustered file
system
o Block distribution and replication
• Borg: cluster management infrastructure
o Task scheduling, machine assignment
Cloud Datastore
Megastore
Bigtable
Colossus
Borg
Architecture without an
Architect?
• No “Architect” title / role
• (+) No central approval for technology decisions
o Most technology decisions made locally instead of globally
o Better decisions in the field
• (-) eBay Architecture Review Board
o Central approval body for large-scale projects
o Usually far too late in the process to be valuable
o Experienced engineers saying “no” after the fact vs. encoding knowledge
in a reusable library, tool, or service
Standardization
• Standardized communication
o Network protocols
o Data formats
o Interface schema / specification
• Standardized infrastructure
o Source control
o Configuration management
o Cluster management
o Monitoring, alerting, diagnosing, etc.
Standards become standards by
being better than the alternatives!
“Enforcing”
Standardization
• Encouraged via
o Libraries
o Support in underlying services
o Code reviews
o Searchable code
The easiest way to encourage best
practices is with *code*!
Make it really easy to do the right
thing, and harder to do the wrong
thing!
Service
Independence
• No standardization of service internals
o Programming languages
o Frameworks
o Persistence mechanisms
In a mature ecosystem of services,
we standardize the arcs of the
graph, not the nodes!
Creating
New Services
• Spinning out a new service
o Almost always built for particular use-case first
o If successful and appropriate, form a team and generalize for multiple
use-cases
• Pragmatism wins
• Examples
o Google File System
o Bigtable
o Megastore
o Google App Engine
o Gmail
Deprecating
Old Services
• What if a service is a failure?
o Repurpose technologies for other uses
o Redeploy people to other teams
• Examples
o Google Wave -> Google Apps
o Multiple generations of core services
“Every service at Google is either
deprecated or not ready yet.”
-- Google engineering proverb
Service Architectures
at Scale
• Ecosystem of Services
• Building a Service
• Operating a Service
• Service Anti-Patterns
Characteristics of an
Effective Service
• Single-purpose
• Simple, well-defined interface
• Modular and independent
• Isolated persistence (!)
A
C D E
B
Goals of a
Service Owner
• Meet the needs of my clients …
• Functionality
• Quality
• Performance
• Stability and reliability
• Constant improvement over time
• … at minimum cost and effort
• Leverage common tools and infrastructure
• Leverage other services
• Automate building, deploying, and operating my service
• Optimize for efficient use of resources
Responsibilities of a
Service Owner
• End-to-end Ownership
o Team owns service from design to deployment to retirement
o No separate maintenance or sustaining engineering team
o DevOps philosophy of “You build it, you run it”
• Autonomy and Accountability
o Freedom to choose technology, methodology, working environment
o Responsibility for the results of those choices
Service as
Bounded Context
• Primary focus on my service
o Clients which depend on my service
o Services which my service depends on
o Cognitive load is very bounded
• Very little worry about
o The complete ecosystem
o The underlying infrastructure
•  Small, nimble service teams
Service
Client
A
Client
B
Client
C
Service-Service
Relationships
• Vendor – Customer Relationship
o Friendly and cooperative, but structured
o Clear ownership and division of responsibility
o Customer can choose to use service or not (!)
• Service-Level Agreement (SLA)
o Promise of service levels by the provider
o Customer needs to be able to rely on the service, like a utility
Service-Service
Relationships
• Charging and Cost Allocation
o Charge customers for *usage* of the service
o Aligns economic incentives of customer and provider
o Motivates both sides to optimize for efficiency
o (+) Pre- / post-allocation at Google
Maintaining
Service Quality
• Small incremental changes
o Easy to reason about and understand
o Risk of code change is nonlinear in the size of the change
o (-) Initial memcache service submission
• Solid Development Practices
o Code reviews before submission
o Automated tests for everything
• Google build and test system
o Uses production cluster manager
o Runs millions of tests per day in parallel
o All acceptance tests run before code is accepted into source control
Maintaining
Interface Stability
• Backward / forward compatibility of interfaces
o Can *never* break your clients’ code
o Often multiple interface versions
o Sometimes multiple deployments
o Majority of changes don’t impact the interface in any way
• Explicit deprecation policy
o Strong incentive to wean customers off old versions (!)
Service Architectures
at Scale
• Ecosystem of Services
• Building a Service
• Operating a Service
• Service Anti-Patterns
Predictable
Performance
• Services at scale highly exposed to performance
variability
• Imagine an operation …
o 1ms median latency, but 1 second latency at 99.99%ile (1 in 10,000)
o Service using one machine  0.01% slow
o Service using 5,000 machines  50% slow
• Predictability trumps average performance
o Low latency + inconsistent performance != low latency
o Far easier to program to consistent performance
o Tail latencies are *much* more important than average latencies
Google App Engine
Memcache Service
• Periodic “hiccups” in latency at 99.99%ile and
beyond
• Very difficult to detect and diagnose
•  Slab memory allocation
Service
Reliability
• Systems at scale highly exposed to failure
o Software, hardware, service failures
o Sharks and backhoes
o Operator “oops”
• Resilience in depth
o Redundancy for machine / cluster / data center failures
o Load-balancing and flow control for service invocations
o Rapid rollback for “oops”
Service Reliability:
Deployment
• Incremental Deployment
o Canary systems
o Staged rollouts
o Rapid rollback
• eBay “Feature Flags”
o Decouple code deployment from feature deployment
o Rapidly turn on / off features without redeploying code
o Typically deploy with feature turned off, then turn on as a separate step
Service Reliability:
Monitoring
• Instrumentation
o Common monitoring service
o Machine / instance statistics: CPU, memory, I/O
o Request statistics: request rate, error rate, latency distribution
o Application / service statistics
o Downstream service invocations
• Diagnosability
o In-process web server with current statistics
o Distributed tracing of requests through multiple service invocations
You can have too much alerting,
but you can never have too much
monitoring!
Service Architectures
at Scale
• Ecosystem of Services
• Building a Service
• Operating a Service
• Service Anti-Patterns
Service
Anti-Patterns
• The “Mega-Service”
o Overbroad area of responsibility is difficult to reason about, change
o Leads to more upstream / downstream dependencies
• Shared persistence
o Breaks encapsulation, encourages “backdoor” interface violations
o Unhealthy and near-invisible coupling of services
o (-) Initial eBay SOA efforts
Thank You!
• @randyshoup
• linkedin.com/in/randyshoup
• Slides will be at slideshare.net/randyshoup

More Related Content

What's hot

DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps Transition
DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps TransitionDOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps Transition
DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps TransitionGene Kim
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsRandy Shoup
 
Scaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermScaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermRandy Shoup
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine LearningRandy Shoup
 
One Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterOne Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterRandy Shoup
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015Randy Shoup
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesRandy Shoup
 
Being Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the CloudBeing Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the CloudRandy Shoup
 
Learning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsLearning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsRandy Shoup
 
Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Randy Shoup
 
Best Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBayBest Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBayRandy Shoup
 
DOES16 London - Better Faster Cheaper .. How?
DOES16 London - Better Faster Cheaper .. How? DOES16 London - Better Faster Cheaper .. How?
DOES16 London - Better Faster Cheaper .. How? John Willis
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at ScaleRandy Shoup
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsRandy Shoup
 
All daydevops 2016 - Turning Human Capital into High Performance Organizati...
All daydevops   2016 - Turning Human Capital into High Performance Organizati...All daydevops   2016 - Turning Human Capital into High Performance Organizati...
All daydevops 2016 - Turning Human Capital into High Performance Organizati...John Willis
 
Turning Human Capital into High Performance Organizational Capital
Turning Human Capital into High Performance Organizational CapitalTurning Human Capital into High Performance Organizational Capital
Turning Human Capital into High Performance Organizational CapitalJohn Willis
 
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Randy Shoup
 
Immutable Service Delivery Shenzhen 2016
Immutable Service Delivery   Shenzhen 2016Immutable Service Delivery   Shenzhen 2016
Immutable Service Delivery Shenzhen 2016John Willis
 
Road to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comAviran Mordo
 
Dockercon USA 2016 - Immutable Awesomeness
Dockercon USA 2016 - Immutable Awesomeness Dockercon USA 2016 - Immutable Awesomeness
Dockercon USA 2016 - Immutable Awesomeness John Willis
 

What's hot (20)

DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps Transition
DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps TransitionDOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps Transition
DOES15 - Randy Shoup - Ten (Hard-Won) Lessons of the DevOps Transition
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of Operations
 
Scaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermScaling Your Architecture for the Long Term
Scaling Your Architecture for the Long Term
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine Learning
 
One Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterOne Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us Better
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and Microservices
 
Being Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the CloudBeing Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the Cloud
 
Learning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsLearning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three Incidents
 
Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020
 
Best Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBayBest Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBay
 
DOES16 London - Better Faster Cheaper .. How?
DOES16 London - Better Faster Cheaper .. How? DOES16 London - Better Faster Cheaper .. How?
DOES16 London - Better Faster Cheaper .. How?
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at Scale
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and Events
 
All daydevops 2016 - Turning Human Capital into High Performance Organizati...
All daydevops   2016 - Turning Human Capital into High Performance Organizati...All daydevops   2016 - Turning Human Capital into High Performance Organizati...
All daydevops 2016 - Turning Human Capital into High Performance Organizati...
 
Turning Human Capital into High Performance Organizational Capital
Turning Human Capital into High Performance Organizational CapitalTurning Human Capital into High Performance Organizational Capital
Turning Human Capital into High Performance Organizational Capital
 
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
 
Immutable Service Delivery Shenzhen 2016
Immutable Service Delivery   Shenzhen 2016Immutable Service Delivery   Shenzhen 2016
Immutable Service Delivery Shenzhen 2016
 
Road to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.comRoad to Continuous Delivery - Wix.com
Road to Continuous Delivery - Wix.com
 
Dockercon USA 2016 - Immutable Awesomeness
Dockercon USA 2016 - Immutable Awesomeness Dockercon USA 2016 - Immutable Awesomeness
Dockercon USA 2016 - Immutable Awesomeness
 

Viewers also liked

Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-ServicesRandy Shoup
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudRandy Shoup
 
Teaching Machines to Fish -- How eBay Improves Itself
Teaching Machines to Fish -- How eBay Improves ItselfTeaching Machines to Fish -- How eBay Improves Itself
Teaching Machines to Fish -- How eBay Improves ItselfRandy Shoup
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...Randy Shoup
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYERandy Shoup
 
The eBay Architecture: Striking a Balance between Site Stability, Feature Ve...
The eBay Architecture:  Striking a Balance between Site Stability, Feature Ve...The eBay Architecture:  Striking a Balance between Site Stability, Feature Ve...
The eBay Architecture: Striking a Balance between Site Stability, Feature Ve...Randy Shoup
 
Developing functional domain models with event sourcing (sbtb, sbtb2015)
Developing functional domain models with event sourcing (sbtb, sbtb2015)Developing functional domain models with event sourcing (sbtb, sbtb2015)
Developing functional domain models with event sourcing (sbtb, sbtb2015)Chris Richardson
 
eBay Architecture
eBay Architecture eBay Architecture
eBay Architecture Tony Ng
 

Viewers also liked (8)

Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-Services
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the Cloud
 
Teaching Machines to Fish -- How eBay Improves Itself
Teaching Machines to Fish -- How eBay Improves ItselfTeaching Machines to Fish -- How eBay Improves Itself
Teaching Machines to Fish -- How eBay Improves Itself
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
 
The eBay Architecture: Striking a Balance between Site Stability, Feature Ve...
The eBay Architecture:  Striking a Balance between Site Stability, Feature Ve...The eBay Architecture:  Striking a Balance between Site Stability, Feature Ve...
The eBay Architecture: Striking a Balance between Site Stability, Feature Ve...
 
Developing functional domain models with event sourcing (sbtb, sbtb2015)
Developing functional domain models with event sourcing (sbtb, sbtb2015)Developing functional domain models with event sourcing (sbtb, sbtb2015)
Developing functional domain models with event sourcing (sbtb, sbtb2015)
 
eBay Architecture
eBay Architecture eBay Architecture
eBay Architecture
 

Similar to Service Architectures At Scale - QCon London 2015

Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...
Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...
Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...Ambassador Labs
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityRandy Shoup
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldRandy Shoup
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesIvo Andreev
 
Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled ArchitecturesGet Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled ArchitecturesDeborah Schalm
 
Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled Architectures Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled Architectures DevOps.com
 
Minimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughMinimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughRandy Shoup
 
Integration strategies best practices- Mulesoft meetup April 2018
Integration strategies   best practices- Mulesoft meetup April 2018Integration strategies   best practices- Mulesoft meetup April 2018
Integration strategies best practices- Mulesoft meetup April 2018Rohan Rasane
 
Lessions Learned - Service Oriented Architecture
Lessions Learned - Service Oriented Architecture Lessions Learned - Service Oriented Architecture
Lessions Learned - Service Oriented Architecture Helge Olav Aarstein
 
Iot cloud service v2.0
Iot cloud service v2.0Iot cloud service v2.0
Iot cloud service v2.0Vinod Wilson
 
Going Global with Itoc and AWS
Going Global with Itoc and AWS Going Global with Itoc and AWS
Going Global with Itoc and AWS Mark Promnitz
 
Neotys PAC - Ian Molyneaux
Neotys PAC - Ian MolyneauxNeotys PAC - Ian Molyneaux
Neotys PAC - Ian MolyneauxNeotys_Partner
 
Customer-centric Metrics
Customer-centric MetricsCustomer-centric Metrics
Customer-centric MetricsMark McBride
 
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...Startupfest
 
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!Richard Robinson
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableComsysto Reply GmbH
 
Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15Derek Ashmore
 
ConFoo: Moving web performance testing to the left
ConFoo: Moving web performance testing to the leftConFoo: Moving web performance testing to the left
ConFoo: Moving web performance testing to the leftTom Chavez
 

Similar to Service Architectures At Scale - QCon London 2015 (20)

Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...
Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...
Microservices Practitioner Summit Jan '15 - Microservice Ecosystems At Scale ...
 
Microservices Architecture
Microservices ArchitectureMicroservices Architecture
Microservices Architecture
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric World
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
 
Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled ArchitecturesGet Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled Architectures
 
Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled Architectures Get Loose! Microservices and Loosely Coupled Architectures
Get Loose! Microservices and Loosely Coupled Architectures
 
Minimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughMinimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good Enough
 
Integration strategies best practices- Mulesoft meetup April 2018
Integration strategies   best practices- Mulesoft meetup April 2018Integration strategies   best practices- Mulesoft meetup April 2018
Integration strategies best practices- Mulesoft meetup April 2018
 
Lessions Learned - Service Oriented Architecture
Lessions Learned - Service Oriented Architecture Lessions Learned - Service Oriented Architecture
Lessions Learned - Service Oriented Architecture
 
Beyond The Rails Way
Beyond The Rails WayBeyond The Rails Way
Beyond The Rails Way
 
Iot cloud service v2.0
Iot cloud service v2.0Iot cloud service v2.0
Iot cloud service v2.0
 
Going Global with Itoc and AWS
Going Global with Itoc and AWS Going Global with Itoc and AWS
Going Global with Itoc and AWS
 
Neotys PAC - Ian Molyneaux
Neotys PAC - Ian MolyneauxNeotys PAC - Ian Molyneaux
Neotys PAC - Ian Molyneaux
 
Customer-centric Metrics
Customer-centric MetricsCustomer-centric Metrics
Customer-centric Metrics
 
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...
Jeremy Edberg (MinOps ) - How to build a solid infrastructure for a startup t...
 
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
SCRIMPS-STD: Test Automation Design Principles - and asking the right questions!
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuable
 
Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15
 
ConFoo: Moving web performance testing to the left
ConFoo: Moving web performance testing to the leftConFoo: Moving web performance testing to the left
ConFoo: Moving web performance testing to the left
 

Recently uploaded

5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 

Recently uploaded (20)

5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 

Service Architectures At Scale - QCon London 2015

  • 1. Service Architectures at Scale Lessons from Google and eBay Randy Shoup @randyshoup linkedin.com/in/randyshoup
  • 2. Architecture Evolution • eBay • 5th generation today • Monolithic Perl  Monolithic C++  Java  microservices • Twitter • 3rd generation today • Monolithic Rails  JS / Rails / Scala  microservices • Amazon • Nth generation today • Monolithic C++  Java / Scala  microservices
  • 3. Service Architectures at Scale • Ecosystem of Services • Building a Service • Operating a Service • Service Anti-Patterns
  • 4. Service Architectures at Scale • Ecosystem of Services • Building a Service • Operating a Service • Service Anti-Patterns
  • 5. Ecosystem of Services • Hundreds to thousands of independent services • Many layers of dependencies, no strict tiers • Graph of relationships, not a hierarchy C B A E F G D
  • 6. Evolution, not Intelligent Design • No centralized, top-down design of the system • Variation and Natural selection o Create / extract new services when needed to solve a problem o Deprecate services when no longer used o Services justify their existence through usage • Appearance of clean layering is an emergent property
  • 7. Google Service Layering • Cloud Datastore: NoSQL service o Highly scalable and resilient o Strong transactional consistency o SQL-like rich query capabilities • Megastore: geo-scale structured database o Multi-row transactions o Synchronous cross-datacenter replication • Bigtable: cluster-level structured storage o (row, column, timestamp) -> cell contents • Colossus: next-generation clustered file system o Block distribution and replication • Borg: cluster management infrastructure o Task scheduling, machine assignment Cloud Datastore Megastore Bigtable Colossus Borg
  • 8. Architecture without an Architect? • No “Architect” title / role • (+) No central approval for technology decisions o Most technology decisions made locally instead of globally o Better decisions in the field • (-) eBay Architecture Review Board o Central approval body for large-scale projects o Usually far too late in the process to be valuable o Experienced engineers saying “no” after the fact vs. encoding knowledge in a reusable library, tool, or service
  • 9. Standardization • Standardized communication o Network protocols o Data formats o Interface schema / specification • Standardized infrastructure o Source control o Configuration management o Cluster management o Monitoring, alerting, diagnosing, etc.
  • 10. Standards become standards by being better than the alternatives!
  • 11. “Enforcing” Standardization • Encouraged via o Libraries o Support in underlying services o Code reviews o Searchable code
  • 12. The easiest way to encourage best practices is with *code*!
  • 13. Make it really easy to do the right thing, and harder to do the wrong thing!
  • 14. Service Independence • No standardization of service internals o Programming languages o Frameworks o Persistence mechanisms
  • 15. In a mature ecosystem of services, we standardize the arcs of the graph, not the nodes!
  • 16. Creating New Services • Spinning out a new service o Almost always built for particular use-case first o If successful and appropriate, form a team and generalize for multiple use-cases • Pragmatism wins • Examples o Google File System o Bigtable o Megastore o Google App Engine o Gmail
  • 17. Deprecating Old Services • What if a service is a failure? o Repurpose technologies for other uses o Redeploy people to other teams • Examples o Google Wave -> Google Apps o Multiple generations of core services
  • 18. “Every service at Google is either deprecated or not ready yet.” -- Google engineering proverb
  • 19. Service Architectures at Scale • Ecosystem of Services • Building a Service • Operating a Service • Service Anti-Patterns
  • 20. Characteristics of an Effective Service • Single-purpose • Simple, well-defined interface • Modular and independent • Isolated persistence (!) A C D E B
  • 21. Goals of a Service Owner • Meet the needs of my clients … • Functionality • Quality • Performance • Stability and reliability • Constant improvement over time • … at minimum cost and effort • Leverage common tools and infrastructure • Leverage other services • Automate building, deploying, and operating my service • Optimize for efficient use of resources
  • 22. Responsibilities of a Service Owner • End-to-end Ownership o Team owns service from design to deployment to retirement o No separate maintenance or sustaining engineering team o DevOps philosophy of “You build it, you run it” • Autonomy and Accountability o Freedom to choose technology, methodology, working environment o Responsibility for the results of those choices
  • 23. Service as Bounded Context • Primary focus on my service o Clients which depend on my service o Services which my service depends on o Cognitive load is very bounded • Very little worry about o The complete ecosystem o The underlying infrastructure •  Small, nimble service teams Service Client A Client B Client C
  • 24. Service-Service Relationships • Vendor – Customer Relationship o Friendly and cooperative, but structured o Clear ownership and division of responsibility o Customer can choose to use service or not (!) • Service-Level Agreement (SLA) o Promise of service levels by the provider o Customer needs to be able to rely on the service, like a utility
  • 25. Service-Service Relationships • Charging and Cost Allocation o Charge customers for *usage* of the service o Aligns economic incentives of customer and provider o Motivates both sides to optimize for efficiency o (+) Pre- / post-allocation at Google
  • 26. Maintaining Service Quality • Small incremental changes o Easy to reason about and understand o Risk of code change is nonlinear in the size of the change o (-) Initial memcache service submission • Solid Development Practices o Code reviews before submission o Automated tests for everything • Google build and test system o Uses production cluster manager o Runs millions of tests per day in parallel o All acceptance tests run before code is accepted into source control
  • 27. Maintaining Interface Stability • Backward / forward compatibility of interfaces o Can *never* break your clients’ code o Often multiple interface versions o Sometimes multiple deployments o Majority of changes don’t impact the interface in any way • Explicit deprecation policy o Strong incentive to wean customers off old versions (!)
  • 28. Service Architectures at Scale • Ecosystem of Services • Building a Service • Operating a Service • Service Anti-Patterns
  • 29. Predictable Performance • Services at scale highly exposed to performance variability • Imagine an operation … o 1ms median latency, but 1 second latency at 99.99%ile (1 in 10,000) o Service using one machine  0.01% slow o Service using 5,000 machines  50% slow • Predictability trumps average performance o Low latency + inconsistent performance != low latency o Far easier to program to consistent performance o Tail latencies are *much* more important than average latencies
  • 30. Google App Engine Memcache Service • Periodic “hiccups” in latency at 99.99%ile and beyond • Very difficult to detect and diagnose •  Slab memory allocation
  • 31. Service Reliability • Systems at scale highly exposed to failure o Software, hardware, service failures o Sharks and backhoes o Operator “oops” • Resilience in depth o Redundancy for machine / cluster / data center failures o Load-balancing and flow control for service invocations o Rapid rollback for “oops”
  • 32. Service Reliability: Deployment • Incremental Deployment o Canary systems o Staged rollouts o Rapid rollback • eBay “Feature Flags” o Decouple code deployment from feature deployment o Rapidly turn on / off features without redeploying code o Typically deploy with feature turned off, then turn on as a separate step
  • 33. Service Reliability: Monitoring • Instrumentation o Common monitoring service o Machine / instance statistics: CPU, memory, I/O o Request statistics: request rate, error rate, latency distribution o Application / service statistics o Downstream service invocations • Diagnosability o In-process web server with current statistics o Distributed tracing of requests through multiple service invocations
  • 34. You can have too much alerting, but you can never have too much monitoring!
  • 35. Service Architectures at Scale • Ecosystem of Services • Building a Service • Operating a Service • Service Anti-Patterns
  • 36. Service Anti-Patterns • The “Mega-Service” o Overbroad area of responsibility is difficult to reason about, change o Leads to more upstream / downstream dependencies • Shared persistence o Breaks encapsulation, encourages “backdoor” interface violations o Unhealthy and near-invisible coupling of services o (-) Initial eBay SOA efforts
  • 37. Thank You! • @randyshoup • linkedin.com/in/randyshoup • Slides will be at slideshare.net/randyshoup