SlideShare a Scribd company logo
1 of 51
Download to read offline
Large-Scale Architecture
The Unreasonable
Effectiveness of Simplicity
Randy Shoup
@randyshoup
Background
@randyshoup
@randyshoup
• 1995 Monolithic Perl
• 1996-2002 v2
o Monolithic C++ ISAPI DLL
o 3.4M lines of code
o Compiler limits on number of methods per class
• 2002-2006 v3 Migration
o Java “mini-applications”
o Shared databases
• 2012 v4 Microservices
• 2017 v5 Microservices
@randyshoup
eBay Architecture
• 1995-2001 Obidos
o Monolithic Perl / Mason frontend over C
backend
o ~4GB application in a 4GB address space
o Regularly breaking Gnu linker
o Restarting every 100-200 requests for memory
leaks
o Releasing once per quarter
• 2001-2005 Service Migration
o Services in C++, Java, etc.
o No shared databases
• 2006 AWS launches
@randyshoup
Amazon Architecture
No one starts with microservices
…
Past a certain scale, everyone ends
up with microservices
@randyshoup
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
“There are two methods in
software design: One is to make
the program so simple, there are
obviously no errors. The other is
to make it so complicated, there
are no obvious errors.”
-- Tony Hoare
@randyshoup
Modular Services
• Service boundaries match the problem
domain
• Service boundaries encapsulate
business logic and data
o All interactions through published service interface
o Interface hides internal implementation details
o No back doors
• Service boundaries encapsulate
architectural -ilities
o Fault isolation
o Performance optimization
o Security boundary
@randyshoup
Orthogonal Domain Logic
• Stateless domain logic
o Ideally stateless pure function
o Matches domain problem as directly as possible
o Deterministic and testable in isolation
o Robust to change over time
• “Straight-line processing”
o Straightforward, synchronous, minimal branching
• Separate domain logic from I/O
o Hexagonal architecture, Ports and Adapters
o Functional core, imperative shell
@randyshoup
Sharding
• Shards partition the service’s “data space”
o Units for distribution, replication, processing, storage
o Hidden as internal implementation detail
• Shards encapsulate architectural -ilities
o Resource isolation
o Fault isolation
o Availability
o Performance
• Shards are autoscaled
o Divide or scale out as processing or data needs increase
o E.g., DynamoDB partitions, Aurora segments, Bigtable
tablets
@randyshoup
Service Graph
• Common services provide and
abstract widely-used capabilities
• Service topology
o Services call others, which call others, etc.
o Graph, not a strict layering
• Simplifies concerns for service team
o Concentrate on business logic
o Abstract away details of dependencies
o Focus on services you need, capabilities you
provide
@randyshoup
Common Platform
• “Paved Road”
o Shared infrastructure
o Standard frameworks
o Developer experience
o E.g., Netflix, Google
• Separation of Concerns
o Reduce cognitive load on stream-aligned
teams
o Bound decisions through enabling constraints
@randyshoup
Large-scale organizations often
invest more than 50% of
engineering effort in platform
capabilities
@randyshoup
Service Ecosystem
@randyshoup
Evolving Services
• Variation and Natural Selection
o Create / extract new services when needed to
solve a problem
o Services justify their continued existence through
usage
o Deprecate services when they are no longer used
• Services grow and evolve over time
o Factor out common libraries and services as
needed
o Teams and services split like “cellular mitosis”
@randyshoup
“Every service at
Google is either
deprecated or not
ready yet.”
@randyshoup
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
“Everything should
be made as simple
as possible, but not
simpler.”
@randyshoup
Event-Driven
• Communicate state changes as
stream of events
o Statement that some interesting thing
occurred
o Ideally represents a semantic domain event
• Decouples domains and teams
o Abstracted through a well-defined
interface
o Asynchronous from one another
• Simplifies component
implementation
@randyshoup
Immutable Log
• Store state as immutable log of events
o Event Sourcing
• Often matches domain
o E.g., Stitch Fix order processing / delivery state
• Log encapsulates architectural –ilities
o Durable
o Traceable and auditable
o Replayable
o Explicit and comprehensible
• Compact snapshots for efficiency
@randyshoup
Immutable Log
• Stitch Fix order states
Request a fix fix_scheduled
Assign fix to warehouse fix_hizzy_assigned
Assign fix to a stylist fix_stylist_assigned
Style the fix fix_styled
Pick the items for the fix fix_picked
Pack the items into a box fix_packed
Ship the fix via a carrier fix_shipped
Fix travels to customer fix_delivered
Customer decides, pays fix_checked_out
Embrace Asynchrony
• Decouples operations in time
o Decoupled availability
o Independent scalability
o Allows more complex processing, more
processing in parallel
o Safer to make independent changes
• Simplifies component
implementation
@randyshoup
Embrace Asynchrony
• Invert from synchronous call
graph to async dataflow
o Exploit asymmetry between writes and
reads
o Can be orders of magnitude less
resource intensive
@randyshoup
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
“A complex system that works is
invariably found to have evolved
from a simple system that
worked.”
-- Gall’s Law
@randyshoup
Incremental Change
• Decompose every large change into small
incremental steps
• Each step maintains backward / forward
compatibility of data and interfaces
• Multiple service versions commonly coexist
o Every change is a rolling upgrade
o Transitional states are normal, not exceptional
Continuous Testing
• Tests help us go faster
o Tests are “solid ground”
o Tests are the safety net
• Tests make better code
o Confidence to break things
o Courage to refactor mercilessly
• Tests make better systems
o Catch bugs earlier, fail faster
@randyshoup
Developer Productivity
• 75% reading
existing code
• 20% modifying
existing code
• 5% writing new
code
https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
@randyshoup
Developer Productivity
• 75% reading
existing code
• 20% modifying
existing code
• 5% writing new
code
https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
@randyshoup
Continuous Testing
• Tests make better designs
o Modularity
o Separation of Concerns
o Encapsulation
@randyshoup
“There’s a deep synergy between
testability and good design. All of the
pain that we feel when writing unit tests
points at underlying design problems.”
@randyshoup
-- Michael Feathers
Test-Driven Development
• è Basically no bug tracking system (!)
o “Inbox Zero” for bugs
o Bugs are fixed as they come up
o Backlog contains features we want to build
o Backlog contains technical debt we want to repay
@randyshoup
Canary Deployments
• Staged rollout
o Go slowly at first; go faster when you gain confidence
• Automated rollout / rollback
o Automatically monitor changes to metrics
o If metrics look good, keep going; if metrics look bad, roll back
• Make deployments routine and boring
@randyshoup
Feature Flags
• Configuration “flag” to enable / disable a feature
for a particular set of users
o Independently discovered at eBay, Facebook, Google, etc.
• More solid systems
o Decouple feature delivery from code delivery
o Rapid on and off
o Separate experiment and control groups
o Develop / test / verify in production
@randyshoup
Continuous Delivery
• Deploy services multiple times per day
o Robust build, test, deploy pipeline
o SLO monitoring
o Synthetic monitoring
• More solid systems
o Release smaller, simpler units of work
o Smaller changes to roll back or roll forward
o Faster to repair, easier to understand, simpler to diagnose
o Increase rate of change and reduce risk of change
@randyshoup
• Cross-company Velocity Initiative to
improve software delivery
o Think Big, Start Small, Learn Fast
o Iteratively identify and remove bottlenecks for teams
o “What would it take to deploy your application every
day?”
• Doubled engineering productivity
o 5x faster deployment frequency
o 5x faster lead time
o 3x lower change failure rate
o 3x lower mean-time-to-restore
• Prerequisite for large-scale architecture
changes
@randyshoup
Continuous Delivery
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
System of Record
• Single System of Record
o Every piece of data is owned by a single service
o That service is the canonical system of record for that data
• Every other copy is a read-only, non-authoritative cache
customer-service
styling-service
customer-search
billing-service
@randyshoup
Shared Data
Option 1: Synchronous Lookup
o Customer service owns customer data
o Fulfillment service calls customer service in real time
fulfillment-service
customer-service
@randyshoup
Shared Data
Option 2: Async event + local cache
o Customer service owns customer data
o Customer service sends address-updated event when customer address
changes
o Fulfillment service caches current customer address
fulfillment-service
customer-service
@randyshoup
Joins
Option 1: Join in Client Service
o Get a single customer from customer-service
o Query matching orders for that customer from order-service
Customers
Orders
order-history-page
customer-service order-service
@randyshoup
Joins
Option 2: Service that “Materializes the View”
o Listen to events from customer-service, events from order-service
o Maintain denormalized join of customer data and orders together in local
storage
Customer Orders
customer-order-service
customer-service
order-service
@randyshoup
Netflix Viewing History
• Store and process member’s playback
data
o 1M requests per second
o Used for viewing history, personalization,
recommendations, analytics, etc.
• Original synchronous architecture
o Synchronously write to persistent storage and lookup
cache
o Availability and data loss from backpressure at high
load
• Asynchronous rearchitecture
o Write to durable queue
o Async pipeline to enrich, process, store, serve
o Materialize views to serve reads
@randyshoup Sharma Podila, 2021, Microservices to Async Processing Migration at Scale, QConPlus 2021.
Walmart Item Availability
• Is this item available to ship to this customer?
o Customer SLO 99.98% uptime in 300ms
• Complex logic involving many teams and
domains
o Inventory, reservations, backorders, eligibility, sales caps, etc.
• Original synchronous architecture
o Graph of 23 nested synchronous service calls in hot path
o Any component failure invalidates results
o Service SLOs 99.999% uptime with 50ms marginal latency
o Extremely expensive to build and operate
@randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
Walmart Item Availability
@randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
Walmart Item Availability
• Invert each service to use async events
o Event-driven “dataflow”
o Idempotent processing
o Event-sourced immutable log
o Materialized view of data from upstream dependencies
• Asynchronous rearchitecture
o 2 services in synchronous hot path
o Async service SLOs 99.9% uptime with latency in seconds
or minutes
o More resilient to delays and outages
o Orders of magnitude simpler to build and operate
@randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
Walmart Item Availability
@randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
Large-Scale Architecture
•Simple Components
•Simple Interactions
•Simple Changes
•Putting It All Together
Thank you!
@randyshoup
linkedin.com/in/randyshoup
medium.com/@randyshoup

More Related Content

What's hot

Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Fazli Amin
 
Distributed Transaction
Distributed TransactionDistributed Transaction
Distributed TransactionPratik Tambekar
 
Minimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughMinimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughRandy Shoup
 
Spm ap-network model-
Spm ap-network model-Spm ap-network model-
Spm ap-network model-Kanchana Devi
 
Introduction to GitHub Actions
Introduction to GitHub ActionsIntroduction to GitHub Actions
Introduction to GitHub ActionsKnoldus Inc.
 
Software engineering model
Software engineering modelSoftware engineering model
Software engineering modelManish Chaurasia
 
Ds objects and models
Ds objects and modelsDs objects and models
Ds objects and modelsMayank Jain
 
Software Process Models
Software Process ModelsSoftware Process Models
Software Process ModelsAtul Karmyal
 
Introducing Saga Pattern in Microservices with Spring Statemachine
Introducing Saga Pattern in Microservices with Spring StatemachineIntroducing Saga Pattern in Microservices with Spring Statemachine
Introducing Saga Pattern in Microservices with Spring StatemachineVMware Tanzu
 
Code Generation idioms with Xtend
Code Generation idioms with XtendCode Generation idioms with Xtend
Code Generation idioms with XtendHolger Schill
 
Microservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactMicroservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactAraf Karsh Hamid
 
gRPC Design and Implementation
gRPC Design and ImplementationgRPC Design and Implementation
gRPC Design and ImplementationVarun Talwar
 
Distributing Transactions using MassTransit
Distributing Transactions using MassTransitDistributing Transactions using MassTransit
Distributing Transactions using MassTransitChris Patterson
 
Organization and team structures
Organization and team structuresOrganization and team structures
Organization and team structuresNur Islam
 
Taming Complex Domains with Domain Driven Design
Taming Complex Domains with Domain Driven DesignTaming Complex Domains with Domain Driven Design
Taming Complex Domains with Domain Driven DesignAlberto Brandolini
 
Ethics of software project management
Ethics of software project managementEthics of software project management
Ethics of software project managementPeica Ionela
 
Git - Basic Crash Course
Git - Basic Crash CourseGit - Basic Crash Course
Git - Basic Crash CourseNilay Binjola
 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systemsmridul mishra
 

What's hot (20)

Lecture 1 (distributed systems)
Lecture 1 (distributed systems)Lecture 1 (distributed systems)
Lecture 1 (distributed systems)
 
Distributed Transaction
Distributed TransactionDistributed Transaction
Distributed Transaction
 
Minimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good EnoughMinimum Viable Architecture - Good Enough is Good Enough
Minimum Viable Architecture - Good Enough is Good Enough
 
Spm ap-network model-
Spm ap-network model-Spm ap-network model-
Spm ap-network model-
 
Introduction to GitHub Actions
Introduction to GitHub ActionsIntroduction to GitHub Actions
Introduction to GitHub Actions
 
Software engineering model
Software engineering modelSoftware engineering model
Software engineering model
 
Ds objects and models
Ds objects and modelsDs objects and models
Ds objects and models
 
Avro
AvroAvro
Avro
 
Software Process Models
Software Process ModelsSoftware Process Models
Software Process Models
 
Introducing Saga Pattern in Microservices with Spring Statemachine
Introducing Saga Pattern in Microservices with Spring StatemachineIntroducing Saga Pattern in Microservices with Spring Statemachine
Introducing Saga Pattern in Microservices with Spring Statemachine
 
Code Generation idioms with Xtend
Code Generation idioms with XtendCode Generation idioms with Xtend
Code Generation idioms with Xtend
 
Microservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito PactMicroservices Testing Strategies JUnit Cucumber Mockito Pact
Microservices Testing Strategies JUnit Cucumber Mockito Pact
 
gRPC Design and Implementation
gRPC Design and ImplementationgRPC Design and Implementation
gRPC Design and Implementation
 
Distributing Transactions using MassTransit
Distributing Transactions using MassTransitDistributing Transactions using MassTransit
Distributing Transactions using MassTransit
 
Message passing in Distributed Computing Systems
Message passing in Distributed Computing SystemsMessage passing in Distributed Computing Systems
Message passing in Distributed Computing Systems
 
Organization and team structures
Organization and team structuresOrganization and team structures
Organization and team structures
 
Taming Complex Domains with Domain Driven Design
Taming Complex Domains with Domain Driven DesignTaming Complex Domains with Domain Driven Design
Taming Complex Domains with Domain Driven Design
 
Ethics of software project management
Ethics of software project managementEthics of software project management
Ethics of software project management
 
Git - Basic Crash Course
Git - Basic Crash CourseGit - Basic Crash Course
Git - Basic Crash Course
 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systems
 

Similar to Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity

Scaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermScaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermRandy Shoup
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldRandy Shoup
 
Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Randy Shoup
 
Service Architectures at Scale
Service Architectures at ScaleService Architectures at Scale
Service Architectures at ScaleRandy Shoup
 
Evolving Architecture and Organization - Lessons from Google and eBay
Evolving Architecture and Organization - Lessons from Google and eBayEvolving Architecture and Organization - Lessons from Google and eBay
Evolving Architecture and Organization - Lessons from Google and eBayRandy Shoup
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesRandy Shoup
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Don Demcsak
 
Moving Fast At Scale
Moving Fast At ScaleMoving Fast At Scale
Moving Fast At ScaleRandy Shoup
 
Converting Your Legacy Data to S1000D
Converting Your Legacy Data to S1000DConverting Your Legacy Data to S1000D
Converting Your Legacy Data to S1000Ddclsocialmedia
 
Art of refactoring - Code Smells and Microservices Antipatterns
Art of refactoring - Code Smells and Microservices AntipatternsArt of refactoring - Code Smells and Microservices Antipatterns
Art of refactoring - Code Smells and Microservices AntipatternsEl Mahdi Benzekri
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesIvo Andreev
 
Scaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsScaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsRandy Shoup
 
Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Randy Shoup
 
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000DDeveloping and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000Ddclsocialmedia
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015Randy Shoup
 
DevOps - It's About How We Work
DevOps - It's About How We WorkDevOps - It's About How We Work
DevOps - It's About How We WorkRandy Shoup
 
Beyond The Rails Way
Beyond The Rails WayBeyond The Rails Way
Beyond The Rails WayAndrzej Krzywda
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...PerformanceVision (previously SecurActive)
 
Cloud-native Data: Every Microservice Needs a Cache
Cloud-native Data: Every Microservice Needs a CacheCloud-native Data: Every Microservice Needs a Cache
Cloud-native Data: Every Microservice Needs a Cachecornelia davis
 
5 Steps on the Way to Continuous Delivery
5 Steps on the Way to Continuous Delivery5 Steps on the Way to Continuous Delivery
5 Steps on the Way to Continuous DeliveryXebiaLabs
 

Similar to Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity (20)

Scaling Your Architecture for the Long Term
Scaling Your Architecture for the Long TermScaling Your Architecture for the Long Term
Scaling Your Architecture for the Long Term
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric World
 
Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015
 
Service Architectures at Scale
Service Architectures at ScaleService Architectures at Scale
Service Architectures at Scale
 
Evolving Architecture and Organization - Lessons from Google and eBay
Evolving Architecture and Organization - Lessons from Google and eBayEvolving Architecture and Organization - Lessons from Google and eBay
Evolving Architecture and Organization - Lessons from Google and eBay
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and Microservices
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)
 
Moving Fast At Scale
Moving Fast At ScaleMoving Fast At Scale
Moving Fast At Scale
 
Converting Your Legacy Data to S1000D
Converting Your Legacy Data to S1000DConverting Your Legacy Data to S1000D
Converting Your Legacy Data to S1000D
 
Art of refactoring - Code Smells and Microservices Antipatterns
Art of refactoring - Code Smells and Microservices AntipatternsArt of refactoring - Code Smells and Microservices Antipatterns
Art of refactoring - Code Smells and Microservices Antipatterns
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
 
Scaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsScaling Your Architecture with Services and Events
Scaling Your Architecture with Services and Events
 
Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020Minimal Viable Architecture - Silicon Slopes 2020
Minimal Viable Architecture - Silicon Slopes 2020
 
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000DDeveloping and Implementing a QA Plan During Your Legacy Data to S1000D
Developing and Implementing a QA Plan During Your Legacy Data to S1000D
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015
 
DevOps - It's About How We Work
DevOps - It's About How We WorkDevOps - It's About How We Work
DevOps - It's About How We Work
 
Beyond The Rails Way
Beyond The Rails WayBeyond The Rails Way
Beyond The Rails Way
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...
 
Cloud-native Data: Every Microservice Needs a Cache
Cloud-native Data: Every Microservice Needs a CacheCloud-native Data: Every Microservice Needs a Cache
Cloud-native Data: Every Microservice Needs a Cache
 
5 Steps on the Way to Continuous Delivery
5 Steps on the Way to Continuous Delivery5 Steps on the Way to Continuous Delivery
5 Steps on the Way to Continuous Delivery
 

More from Randy Shoup

Anatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsAnatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsRandy Shoup
 
One Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterOne Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterRandy Shoup
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine LearningRandy Shoup
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at ScaleRandy Shoup
 
Breaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsBreaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsRandy Shoup
 
Learning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsLearning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsRandy Shoup
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsRandy Shoup
 
Ten Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionTen Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionRandy Shoup
 
Managing Data in Microservices
Managing Data in MicroservicesManaging Data in Microservices
Managing Data in MicroservicesRandy Shoup
 
Pragmatic Microservices
Pragmatic MicroservicesPragmatic Microservices
Pragmatic MicroservicesRandy Shoup
 
A CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsA CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsRandy Shoup
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-ServicesRandy Shoup
 
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupMinimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupRandy Shoup
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudRandy Shoup
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsRandy Shoup
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYERandy Shoup
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...Randy Shoup
 
The Importance of Culture: Building and Sustaining Effective Engineering Org...
The Importance of Culture:  Building and Sustaining Effective Engineering Org...The Importance of Culture:  Building and Sustaining Effective Engineering Org...
The Importance of Culture: Building and Sustaining Effective Engineering Org...Randy Shoup
 

More from Randy Shoup (18)

Anatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsAnatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and Lessons
 
One Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us BetterOne Terrible Day at Google, and How It Made Us Better
One Terrible Day at Google, and How It Made Us Better
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine Learning
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at Scale
 
Breaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsBreaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building Teams
 
Learning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three IncidentsLearning from Learnings: Anatomy of Three Incidents
Learning from Learnings: Anatomy of Three Incidents
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and Events
 
Ten Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionTen Lessons of the DevOps Transition
Ten Lessons of the DevOps Transition
 
Managing Data in Microservices
Managing Data in MicroservicesManaging Data in Microservices
Managing Data in Microservices
 
Pragmatic Microservices
Pragmatic MicroservicesPragmatic Microservices
Pragmatic Microservices
 
A CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsA CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling Organizations
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-Services
 
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupMinimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the Cloud
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of Operations
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
 
The Importance of Culture: Building and Sustaining Effective Engineering Org...
The Importance of Culture:  Building and Sustaining Effective Engineering Org...The Importance of Culture:  Building and Sustaining Effective Engineering Org...
The Importance of Culture: Building and Sustaining Effective Engineering Org...
 

Recently uploaded

Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noidabntitsolutionsrishis
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy LĂłpez
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
2.pdf Ejercicios de programaciĂłn competitiva
2.pdf Ejercicios de programaciĂłn competitiva2.pdf Ejercicios de programaciĂłn competitiva
2.pdf Ejercicios de programaciĂłn competitivaDiego IvĂĄn Oliveros Acosta
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 

Recently uploaded (20)

Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
2.pdf Ejercicios de programaciĂłn competitiva
2.pdf Ejercicios de programaciĂłn competitiva2.pdf Ejercicios de programaciĂłn competitiva
2.pdf Ejercicios de programaciĂłn competitiva
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 

Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity

  • 1. Large-Scale Architecture The Unreasonable Effectiveness of Simplicity Randy Shoup @randyshoup
  • 4. • 1995 Monolithic Perl • 1996-2002 v2 o Monolithic C++ ISAPI DLL o 3.4M lines of code o Compiler limits on number of methods per class • 2002-2006 v3 Migration o Java “mini-applications” o Shared databases • 2012 v4 Microservices • 2017 v5 Microservices @randyshoup eBay Architecture
  • 5. • 1995-2001 Obidos o Monolithic Perl / Mason frontend over C backend o ~4GB application in a 4GB address space o Regularly breaking Gnu linker o Restarting every 100-200 requests for memory leaks o Releasing once per quarter • 2001-2005 Service Migration o Services in C++, Java, etc. o No shared databases • 2006 AWS launches @randyshoup Amazon Architecture
  • 6. No one starts with microservices … Past a certain scale, everyone ends up with microservices @randyshoup
  • 7. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together
  • 8. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together
  • 9. “There are two methods in software design: One is to make the program so simple, there are obviously no errors. The other is to make it so complicated, there are no obvious errors.” -- Tony Hoare @randyshoup
  • 10. Modular Services • Service boundaries match the problem domain • Service boundaries encapsulate business logic and data o All interactions through published service interface o Interface hides internal implementation details o No back doors • Service boundaries encapsulate architectural -ilities o Fault isolation o Performance optimization o Security boundary @randyshoup
  • 11. Orthogonal Domain Logic • Stateless domain logic o Ideally stateless pure function o Matches domain problem as directly as possible o Deterministic and testable in isolation o Robust to change over time • “Straight-line processing” o Straightforward, synchronous, minimal branching • Separate domain logic from I/O o Hexagonal architecture, Ports and Adapters o Functional core, imperative shell @randyshoup
  • 12. Sharding • Shards partition the service’s “data space” o Units for distribution, replication, processing, storage o Hidden as internal implementation detail • Shards encapsulate architectural -ilities o Resource isolation o Fault isolation o Availability o Performance • Shards are autoscaled o Divide or scale out as processing or data needs increase o E.g., DynamoDB partitions, Aurora segments, Bigtable tablets @randyshoup
  • 13. Service Graph • Common services provide and abstract widely-used capabilities • Service topology o Services call others, which call others, etc. o Graph, not a strict layering • Simplifies concerns for service team o Concentrate on business logic o Abstract away details of dependencies o Focus on services you need, capabilities you provide @randyshoup
  • 14. Common Platform • “Paved Road” o Shared infrastructure o Standard frameworks o Developer experience o E.g., Netflix, Google • Separation of Concerns o Reduce cognitive load on stream-aligned teams o Bound decisions through enabling constraints @randyshoup
  • 15. Large-scale organizations often invest more than 50% of engineering effort in platform capabilities @randyshoup
  • 17. Evolving Services • Variation and Natural Selection o Create / extract new services when needed to solve a problem o Services justify their continued existence through usage o Deprecate services when they are no longer used • Services grow and evolve over time o Factor out common libraries and services as needed o Teams and services split like “cellular mitosis” @randyshoup
  • 18. “Every service at Google is either deprecated or not ready yet.” @randyshoup
  • 19. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together
  • 20. “Everything should be made as simple as possible, but not simpler.” @randyshoup
  • 21. Event-Driven • Communicate state changes as stream of events o Statement that some interesting thing occurred o Ideally represents a semantic domain event • Decouples domains and teams o Abstracted through a well-defined interface o Asynchronous from one another • Simplifies component implementation @randyshoup
  • 22. Immutable Log • Store state as immutable log of events o Event Sourcing • Often matches domain o E.g., Stitch Fix order processing / delivery state • Log encapsulates architectural –ilities o Durable o Traceable and auditable o Replayable o Explicit and comprehensible • Compact snapshots for efficiency @randyshoup
  • 23. Immutable Log • Stitch Fix order states Request a fix fix_scheduled Assign fix to warehouse fix_hizzy_assigned Assign fix to a stylist fix_stylist_assigned Style the fix fix_styled Pick the items for the fix fix_picked Pack the items into a box fix_packed Ship the fix via a carrier fix_shipped Fix travels to customer fix_delivered Customer decides, pays fix_checked_out
  • 24. Embrace Asynchrony • Decouples operations in time o Decoupled availability o Independent scalability o Allows more complex processing, more processing in parallel o Safer to make independent changes • Simplifies component implementation @randyshoup
  • 25. Embrace Asynchrony • Invert from synchronous call graph to async dataflow o Exploit asymmetry between writes and reads o Can be orders of magnitude less resource intensive @randyshoup
  • 26. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together
  • 27. “A complex system that works is invariably found to have evolved from a simple system that worked.” -- Gall’s Law @randyshoup
  • 28. Incremental Change • Decompose every large change into small incremental steps • Each step maintains backward / forward compatibility of data and interfaces • Multiple service versions commonly coexist o Every change is a rolling upgrade o Transitional states are normal, not exceptional
  • 29. Continuous Testing • Tests help us go faster o Tests are “solid ground” o Tests are the safety net • Tests make better code o Confidence to break things o Courage to refactor mercilessly • Tests make better systems o Catch bugs earlier, fail faster @randyshoup
  • 30. Developer Productivity • 75% reading existing code • 20% modifying existing code • 5% writing new code https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/ @randyshoup
  • 31. Developer Productivity • 75% reading existing code • 20% modifying existing code • 5% writing new code https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/ @randyshoup
  • 32. Continuous Testing • Tests make better designs o Modularity o Separation of Concerns o Encapsulation @randyshoup
  • 33. “There’s a deep synergy between testability and good design. All of the pain that we feel when writing unit tests points at underlying design problems.” @randyshoup -- Michael Feathers
  • 34. Test-Driven Development • è Basically no bug tracking system (!) o “Inbox Zero” for bugs o Bugs are fixed as they come up o Backlog contains features we want to build o Backlog contains technical debt we want to repay @randyshoup
  • 35. Canary Deployments • Staged rollout o Go slowly at first; go faster when you gain confidence • Automated rollout / rollback o Automatically monitor changes to metrics o If metrics look good, keep going; if metrics look bad, roll back • Make deployments routine and boring @randyshoup
  • 36. Feature Flags • Configuration “flag” to enable / disable a feature for a particular set of users o Independently discovered at eBay, Facebook, Google, etc. • More solid systems o Decouple feature delivery from code delivery o Rapid on and off o Separate experiment and control groups o Develop / test / verify in production @randyshoup
  • 37. Continuous Delivery • Deploy services multiple times per day o Robust build, test, deploy pipeline o SLO monitoring o Synthetic monitoring • More solid systems o Release smaller, simpler units of work o Smaller changes to roll back or roll forward o Faster to repair, easier to understand, simpler to diagnose o Increase rate of change and reduce risk of change @randyshoup
  • 38. • Cross-company Velocity Initiative to improve software delivery o Think Big, Start Small, Learn Fast o Iteratively identify and remove bottlenecks for teams o “What would it take to deploy your application every day?” • Doubled engineering productivity o 5x faster deployment frequency o 5x faster lead time o 3x lower change failure rate o 3x lower mean-time-to-restore • Prerequisite for large-scale architecture changes @randyshoup Continuous Delivery
  • 39. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together
  • 40. System of Record • Single System of Record o Every piece of data is owned by a single service o That service is the canonical system of record for that data • Every other copy is a read-only, non-authoritative cache customer-service styling-service customer-search billing-service @randyshoup
  • 41. Shared Data Option 1: Synchronous Lookup o Customer service owns customer data o Fulfillment service calls customer service in real time fulfillment-service customer-service @randyshoup
  • 42. Shared Data Option 2: Async event + local cache o Customer service owns customer data o Customer service sends address-updated event when customer address changes o Fulfillment service caches current customer address fulfillment-service customer-service @randyshoup
  • 43. Joins Option 1: Join in Client Service o Get a single customer from customer-service o Query matching orders for that customer from order-service Customers Orders order-history-page customer-service order-service @randyshoup
  • 44. Joins Option 2: Service that “Materializes the View” o Listen to events from customer-service, events from order-service o Maintain denormalized join of customer data and orders together in local storage Customer Orders customer-order-service customer-service order-service @randyshoup
  • 45. Netflix Viewing History • Store and process member’s playback data o 1M requests per second o Used for viewing history, personalization, recommendations, analytics, etc. • Original synchronous architecture o Synchronously write to persistent storage and lookup cache o Availability and data loss from backpressure at high load • Asynchronous rearchitecture o Write to durable queue o Async pipeline to enrich, process, store, serve o Materialize views to serve reads @randyshoup Sharma Podila, 2021, Microservices to Async Processing Migration at Scale, QConPlus 2021.
  • 46. Walmart Item Availability • Is this item available to ship to this customer? o Customer SLO 99.98% uptime in 300ms • Complex logic involving many teams and domains o Inventory, reservations, backorders, eligibility, sales caps, etc. • Original synchronous architecture o Graph of 23 nested synchronous service calls in hot path o Any component failure invalidates results o Service SLOs 99.999% uptime with 50ms marginal latency o Extremely expensive to build and operate @randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
  • 47. Walmart Item Availability @randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
  • 48. Walmart Item Availability • Invert each service to use async events o Event-driven “dataflow” o Idempotent processing o Event-sourced immutable log o Materialized view of data from upstream dependencies • Asynchronous rearchitecture o 2 services in synchronous hot path o Async service SLOs 99.9% uptime with latency in seconds or minutes o More resilient to delays and outages o Orders of magnitude simpler to build and operate @randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
  • 49. Walmart Item Availability @randyshoup Scott Havens, 2019, Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals, DOES 2019.
  • 50. Large-Scale Architecture •Simple Components •Simple Interactions •Simple Changes •Putting It All Together