SlideShare a Scribd company logo
1 of 68
Download to read offline
Moving Fast
at Scale
Randy Shoup
@randyshoup
linkedin.com/in/randyshoup
Background
• VP Engineering at Stitch Fix
o Using technology and data science to revolutionize clothing retail
• Consulting “CTO as a service”
o Helping companies move fast at scale 
• Director of Engineering for Google App Engine
o World’s largest Platform-as-a-Service
• Chief Engineer at eBay
o Evolving multiple generations of eBay’s infrastructure
Stitch Fix
@randyshoup linkedin.com/in/randyshoup
Stitch Fix
@randyshoup linkedin.com/in/randyshoup
Stitch Fix
@randyshoup linkedin.com/in/randyshoup
Stitch Fix
@randyshoup linkedin.com/in/randyshoup
Personalized
Recommendations
Inventory
Algorithmic
recommendations
Machine learning
@randyshoup linkedin.com/in/randyshoup
Expert Human
Curation
Human
curation
Algorithmic
recommendations
@randyshoup linkedin.com/in/randyshoup
Humans + Data Science
• 1:1 Ratio of Data Science to Engineering
o More than 100 software engineers
o ~80 data scientists and algorithm developers
o Unique ratio in our industry
• Apply intelligence to *every* part of the business
o Buying
o Inventory management
o Logistics optimization
o Styling recommendations
o Demand prediction
• Humans and machines augmenting each other
@randyshoup linkedin.com/in/randyshoup
Faster is Better
Willingness to go fast
Ability to go fast
+
Lack of Fear
Capability
+
High-Performing
Organizations
• Multiple deploys per day vs. one per month
• Commit to deploy in less than 1 hour vs. one week
• Recover from failure in less than 1 hour vs. one day
• Change failure rate of 0-15% vs. 31-45%
@randyshoup linkedin.com/in/randyshoup
https://puppet.com/resources/whitepaper/state-of-devops-report
High-Performing
Organizations
2.5x more likely to exceed
business goals
o Profitability
o Market share
o Productivity
@randyshoup linkedin.com/in/randyshoup
https://puppet.com/resources/whitepaper/state-of-devops-report
Speed vs. Stability?
Faster is Better
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
Conway’s Law
• Organization determines architecture
o Design of a system will be a reflection of the communication paths within the
organization
• Modular system requires modular organization
o Small, independent teams lead to more flexible, composable systems
o Larger, interdependent teams lead to larger systems
• We can engineer the system we want by engineering the
organization
@randyshoup linkedin.com/in/randyshoup
Small
“Service” Teams
• Full-Stack, “2 Pizza” Teams
o No team should be larger than can be fed by 2 large pizzas
o Typically 4-6 people
o All disciplines required for the team to function
• Aligned to Business Domains
o Clear, well-defined area of responsibility
o Single service or set of related services
o Deep understanding of business problems
• Growth through “cellular mitosis”
@randyshoup linkedin.com/in/randyshoup
Ideally, 80% of project work
should be within a team
boundary.
eBay “Train Seats”
• eBay’s development process (circa 2006)
o Design and estimate project
(“Train Seat” == 2 engineer-weeks)
o Assign engineers from common pool to implement tasks
o Designer does not implement; implementers do not design (or test, or operate)
•  Dysfunctional engineering culture
o (-) Engineers treated as interchangeable “cogs”
o (-) No regard for skill, interest, experience
o (-) No pride of ownership in task implementation
o (-) No long-term ownership of codebase
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
“Building the wrong thing is
the biggest waste in software
development.”
-- Mary and Tom Poppendieck,
Lean Software Development
What problem are
you trying to solve?
“A problem well-stated is a
problem half-solved.”
-- Charles Kettering, former head of research
for General Motors
What Problem Are You
Trying to Solve?
• Focus on what is important for your business
• Problem might be solved without any technology at all
o Redefine the problem
o Change the business process
o Implement manually for a while before automating in an application
@randyshoup linkedin.com/in/randyshoup
Buy, Not Build
• Use Cloud Infrastructure
o Faster, cheaper, better than we can do ourselves
o Stitch Fix has no owned physical infrastructure anywhere in the world
• Prefer Open Source
o Kubernetes, Docker, Istio
o MySQL, Postgres, Redis, Elastic Search
o Machine learning models
o Etc.
o Usually better than the commercial alternatives (!)
@randyshoup linkedin.com/in/randyshoup
Buy, Not Build
• Third-Party Services
o Stitch Fix uses >50 third party services
o Logging, monitoring, alerting
o Project management, bug tracking
o Billing, fraud detection
o Etc.
• Focus on our core competency
o Use services for everything else (!)
@randyshoup linkedin.com/in/randyshoup
Soon it will be just as common
to run your own data center as
it is to run your own electrical
power generation.
Experimental
Discipline
• State your hypothesis
o What metrics do you expect to move and why
o Understand your baseline
• Run a real A | B test
o Sample size
o Isolated treatment and control groups
o No peeking or quitting early!
• Obsessively log and measure
o Understand customer and system behavior
o Understand why this experiment worked or did not
Experimental
Discipline
• Listen to the data
o Data trumps hope and intuition
o Develop insights for next experiment
• Thinking of the experiment is art; evaluating it is science
• Rinse and Repeat
o This is a journey, not a single step
eBay Machine-Learned
Ranking
• Ranking function for search results
o Which item should appear 1st, 10th, 100th, 1000th
o Before: Small number of hand-tuned factors
o Goal: Thousands of factors
• Incremental Experimentation
o Predictive models: query->view, view->purchase, etc.
o Hundreds of parallel A | B tests
o Full year of steady, incremental improvements
 2% increase in eBay revenue (~$120M / year)
eBay
Site Speed
• Reduce user-experienced latency for search results
• Iterative Process
o Implement a potential improvement
o Release to the site in an A | B test
o Monitor metrics –time to first byte, time to click, click rate, purchase rate
 2% increase in eBay revenue (~$120M / year)
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
Prioritization
• Scarce resources require prioritization
o We always have more to do than resources to do it
o Opportunity cost -- deciding to do X means deciding not to do Y
o Every decision is a tradeoff
• Priority ← Return on Investment
o Impact / Effort
• Prioritization is a business decision, not a technical
decision
@randyshoup linkedin.com/in/randyshoup
Fewer Things,
More Done
Fewer Things,
More Done
• Maximize resources applied to
o Priority 1, then
o Priority 2
o etc.
• Incremental Delivery
o Deliver increments along the way instead of everything at the end
• Deliver Value Faster
o Time Value of Money
o Benefit now is worth more than benefit in the future
@randyshoup linkedin.com/in/randyshoup
“When you solve problem
one, problem two gets a
promotion.”
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
Microservices
• Single-purpose
• Simple, well-defined interface
• Modular and composable
• Independently deployable
A
C D E
B
Evolution to
Microservices
• eBay
• 5th generation today
• Monolithic Perl  Monolithic C++  Java  microservices
• Twitter
• 3rd generation today
• Monolithic Rails  JS / Rails / Scala  microservices
• Amazon
• Nth generation today
• Monolithic Perl / C++  Java / Scala  microservices
No one starts with microservices
…
Past a certain scale, everyone
ends up with microservices
If you don’t end up regretting
your early technology
decisions, you probably over-
engineered.
Quality
Discipline
• Quality and Reliability are “Priority-0 features”
o Equally important to users as product features and engaging user experience
• Developers responsible for
o Features
o Quality
o Performance
o Reliability
o Manageability
Test-Driven
Development
• Tests help you go faster
o Tests “have your back”
o Development velocity
• Tests make better code
o Confidence to break things
o Courage to refactor mercilessly
• Tests make better systems
o Catch bugs earlier, fail faster
@randyshoup linkedin.com/in/randyshoup
Optimizing
Developer Effort
@randyshoup linkedin.com/in/randyshoup
• 75% reading
existing code
• 20% modifying
existing code
• 5% writing new
code
https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
Optimizing
Developer Effort
@randyshoup linkedin.com/in/randyshoup
• 75% reading
existing code
• 20% modifying
existing code
• 5% writing new
code
https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
“Do you have time to do it
twice?”
“We don’t have time to do it
right!”
The more constrained you are
on time or resources, the more
important it is to build it right
the first time.
Build It Right (Enough)
The First Time
• Build one great thing instead of two half-finished things
• Right ≠ Perfect (80 / 20 Rule)
•  Basically no bug tracking system (!)
o Bugs are fixed as they come up
o Backlog contains features we want to build
o Backlog contains technical debt we want to repay
@randyshoup linkedin.com/in/randyshoup
Continuous
Integration
• Small incremental changes
o Faster feedback loop
o Easy to understand, code-review, test, roll back
• Large changes are risky
o Risk of code change is nonlinear in the size of the change
o Ex. Initial memcache service submission
• Main code branch always shippable
o Increased rate of change while reducing risk of change
Vicious Cycle
of Technical Debt
Technical
Debt
“No time
to do it
right”
Quick-
and-dirty
Virtuous Cycle
of Investment
Solid
Foundation
Confidence
Faster and
Better
Testing
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
You Build It, You Run It.
-- Werner Vogels
Continuous
Delivery
• Repeatable Deployment Pipeline
o Low-risk, push-button deployment
o Rapid release cadence
o Rapid rollback and recovery
• Most applications deployed multiple times per day
• More solid systems
o Release smaller units of work
o Smaller changes to roll back or roll forward
o Faster to repair, easier to understand, simpler to diagnose
@randyshoup linkedin.com/in/randyshoup
Observability
• Strong practice of detailed, end-to-end monitoring of
production systems
• Ability to detect and alert on issues anywhere in the
system
• Sufficient monitoring to be able to do remote runtime
diagnosis
If you ever have to ssh into a
machine, your monitoring has
failed.
Feature
Flags
• Configuration “flag” to enable / disable a feature for a
particular set of users
o Independently discovered at eBay, Yahoo, Facebook, Google, etc.
• More solid systems
o Decouple feature delivery from code delivery
o Rapid on and off
o Develop / test / verify in production
o Dark launches
• Enables experimentation
Blameless
Post-Mortems
• Post-mortem After Every Incident
o Document exactly what happened
o What went right
o What went wrong
• Open and Honest Discussion
o What contributed to the incident?
o What could we have done better?
Engineers compete to take personal responsibility (!)
@randyshoup linkedin.com/in/randyshoup
“Finally we can prioritize
fixing that broken system!”
Blameless
Post-Mortems
• Action Items
o How will we change process, technology, documentation, etc.
o How could we have automated the problems away?
o How could we have diagnosed more quickly?
o How could we have restored service more quickly?
• Follow up (!)
@randyshoup linkedin.com/in/randyshoup
Failure is not falling down,
but refusing to get back up.
-- Theodore Roosevelt
Moving Fast
at Scale
•Organizing for Speed
•What to Build / What NOT to Build
•When to Build
•How to Build
•Delivering and Operating
High-Performing
Organizations
2.5x more likely to exceed
business goals
o Profitability
o Market share
o Productivity
@randyshoup linkedin.com/in/randyshoup
https://puppet.com/resources/whitepaper/state-of-devops-report
Faster is Better
Thank You!
• @randyshoup
• linkedin.com/in/randyshoup

More Related Content

What's hot

A CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsA CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsRandy Shoup
 
Anatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsAnatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsRandy Shoup
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine LearningRandy Shoup
 
Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Randy Shoup
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at ScaleRandy Shoup
 
Service Architectures at Scale
Service Architectures at ScaleService Architectures at Scale
Service Architectures at ScaleRandy Shoup
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesRandy Shoup
 
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupMinimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupRandy Shoup
 
Pragmatic Microservices
Pragmatic MicroservicesPragmatic Microservices
Pragmatic MicroservicesRandy Shoup
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudRandy Shoup
 
The Importance of Culture: Building and Sustaining Effective Engineering Org...
The Importance of Culture:  Building and Sustaining Effective Engineering Org...The Importance of Culture:  Building and Sustaining Effective Engineering Org...
The Importance of Culture: Building and Sustaining Effective Engineering Org...Randy Shoup
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsRandy Shoup
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsRandy Shoup
 
Scaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsScaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsRandy Shoup
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldRandy Shoup
 
Enterprise DevOps fact or fiction - DevOps Summit 2014
Enterprise DevOps fact or fiction - DevOps Summit 2014Enterprise DevOps fact or fiction - DevOps Summit 2014
Enterprise DevOps fact or fiction - DevOps Summit 2014Chris Riley ☁
 
Tales from the Platform Trade
Tales from the Platform TradeTales from the Platform Trade
Tales from the Platform TradeWilliam Grosso
 
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Randy Shoup
 
Software devops engineer in test (SDET)
Software devops engineer in test (SDET)Software devops engineer in test (SDET)
Software devops engineer in test (SDET)Sriram Angajala
 
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...Bert Jan Schrijver
 

What's hot (20)

A CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling OrganizationsA CTO's Guide to Scaling Organizations
A CTO's Guide to Scaling Organizations
 
Anatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and LessonsAnatomy of Three Incidents -- Commonalities and Lessons
Anatomy of Three Incidents -- Commonalities and Lessons
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine Learning
 
Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015Service Architectures At Scale - QCon London 2015
Service Architectures At Scale - QCon London 2015
 
Moving Fast at Scale
Moving Fast at ScaleMoving Fast at Scale
Moving Fast at Scale
 
Service Architectures at Scale
Service Architectures at ScaleService Architectures at Scale
Service Architectures at Scale
 
Monoliths, Migrations, and Microservices
Monoliths, Migrations, and MicroservicesMonoliths, Migrations, and Microservices
Monoliths, Migrations, and Microservices
 
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a StartupMinimum Viable Architecture -- Good Enough is Good Enough in a Startup
Minimum Viable Architecture -- Good Enough is Good Enough in a Startup
 
Pragmatic Microservices
Pragmatic MicroservicesPragmatic Microservices
Pragmatic Microservices
 
Why Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the CloudWhy Enterprises Are Embracing the Cloud
Why Enterprises Are Embracing the Cloud
 
The Importance of Culture: Building and Sustaining Effective Engineering Org...
The Importance of Culture:  Building and Sustaining Effective Engineering Org...The Importance of Culture:  Building and Sustaining Effective Engineering Org...
The Importance of Culture: Building and Sustaining Effective Engineering Org...
 
Managing Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and EventsManaging Data at Scale - Microservices and Events
Managing Data at Scale - Microservices and Events
 
DevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of OperationsDevOpsDays Silicon Valley 2014 - The Game of Operations
DevOpsDays Silicon Valley 2014 - The Game of Operations
 
Scaling Your Architecture with Services and Events
Scaling Your Architecture with Services and EventsScaling Your Architecture with Services and Events
Scaling Your Architecture with Services and Events
 
Effective Microservices In a Data-centric World
Effective Microservices In a Data-centric WorldEffective Microservices In a Data-centric World
Effective Microservices In a Data-centric World
 
Enterprise DevOps fact or fiction - DevOps Summit 2014
Enterprise DevOps fact or fiction - DevOps Summit 2014Enterprise DevOps fact or fiction - DevOps Summit 2014
Enterprise DevOps fact or fiction - DevOps Summit 2014
 
Tales from the Platform Trade
Tales from the Platform TradeTales from the Platform Trade
Tales from the Platform Trade
 
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
Flowcon2013 - Virtuous Cycles of Velocity: What I Learned About Going Fast at...
 
Software devops engineer in test (SDET)
Software devops engineer in test (SDET)Software devops engineer in test (SDET)
Software devops engineer in test (SDET)
 
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...
CodeOne 2018 - Better software, faster: principles of Continuous Delivery and...
 

Similar to How to Move Fast at Scale While Ensuring Stability

Dashlane Mission Teams
Dashlane Mission TeamsDashlane Mission Teams
Dashlane Mission TeamsDashlane
 
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation ProjectsAmazon Web Services
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityRandy Shoup
 
sitHH16 - The Implications of Becoming Agile
sitHH16 - The Implications of Becoming AgilesitHH16 - The Implications of Becoming Agile
sitHH16 - The Implications of Becoming AgileMarkus Theilen
 
Customer Development Fast Protyping
Customer Development Fast ProtypingCustomer Development Fast Protyping
Customer Development Fast ProtypingSerdar Temiz
 
Atlassian Executive Business Forum - LinkedIn HQ
Atlassian Executive Business Forum - LinkedIn HQAtlassian Executive Business Forum - LinkedIn HQ
Atlassian Executive Business Forum - LinkedIn HQServiceRocket
 
DevOps, Performance Optimization and the Green Life with Magento
DevOps, Performance Optimization and the Green Life with MagentoDevOps, Performance Optimization and the Green Life with Magento
DevOps, Performance Optimization and the Green Life with MagentoLuis Tineo
 
How to create awesome customer experiences
How to create awesome customer experiencesHow to create awesome customer experiences
How to create awesome customer experiencesMorgan Simonsen
 
Product Management for Startup Founders, CEOs, and CTOs
Product Management for Startup Founders, CEOs, and CTOsProduct Management for Startup Founders, CEOs, and CTOs
Product Management for Startup Founders, CEOs, and CTOsChris Cera
 
Introduction to Agile Hardware
Introduction to Agile Hardware Introduction to Agile Hardware
Introduction to Agile Hardware Cprime
 
Holistic Product Development
Holistic Product DevelopmentHolistic Product Development
Holistic Product DevelopmentGary Pedretti
 
Building enterprise platforms - off the beaten path - SharePoint User Group U...
Building enterprise platforms - off the beaten path - SharePoint User Group U...Building enterprise platforms - off the beaten path - SharePoint User Group U...
Building enterprise platforms - off the beaten path - SharePoint User Group U...Andy Talbot
 
Building SharePoint Enterprise Platforms - Off the beaten path
Building SharePoint Enterprise Platforms - Off the beaten pathBuilding SharePoint Enterprise Platforms - Off the beaten path
Building SharePoint Enterprise Platforms - Off the beaten pathAndy Talbot
 
10 steps to salvation: Creating digital governance that works
10 steps to salvation: Creating digital governance that works10 steps to salvation: Creating digital governance that works
10 steps to salvation: Creating digital governance that worksKate Thomas
 
Practical agile TechExeter
Practical agile TechExeterPractical agile TechExeter
Practical agile TechExeterIan Ames
 
Practical Agile. Lessons learned the hard way on our journey building digita...
Practical Agile.  Lessons learned the hard way on our journey building digita...Practical Agile.  Lessons learned the hard way on our journey building digita...
Practical Agile. Lessons learned the hard way on our journey building digita...TechExeter
 
Big Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil GamesBig Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil GamesRob Winters
 
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...DianaGray10
 

Similar to How to Move Fast at Scale While Ensuring Stability (20)

Dashlane Mission Teams
Dashlane Mission TeamsDashlane Mission Teams
Dashlane Mission Teams
 
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
(SPOT205) 5 Lessons for Managing Massive IT Transformation Projects
 
SAFe and DevOps - better together
SAFe and DevOps - better togetherSAFe and DevOps - better together
SAFe and DevOps - better together
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
 
sitHH16 - The Implications of Becoming Agile
sitHH16 - The Implications of Becoming AgilesitHH16 - The Implications of Becoming Agile
sitHH16 - The Implications of Becoming Agile
 
Customer Development Fast Protyping
Customer Development Fast ProtypingCustomer Development Fast Protyping
Customer Development Fast Protyping
 
Atlassian Executive Business Forum - LinkedIn HQ
Atlassian Executive Business Forum - LinkedIn HQAtlassian Executive Business Forum - LinkedIn HQ
Atlassian Executive Business Forum - LinkedIn HQ
 
DevOps, Performance Optimization and the Green Life with Magento
DevOps, Performance Optimization and the Green Life with MagentoDevOps, Performance Optimization and the Green Life with Magento
DevOps, Performance Optimization and the Green Life with Magento
 
How to create awesome customer experiences
How to create awesome customer experiencesHow to create awesome customer experiences
How to create awesome customer experiences
 
Lean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science teamLean Analytics: How to get more out of your data science team
Lean Analytics: How to get more out of your data science team
 
Product Management for Startup Founders, CEOs, and CTOs
Product Management for Startup Founders, CEOs, and CTOsProduct Management for Startup Founders, CEOs, and CTOs
Product Management for Startup Founders, CEOs, and CTOs
 
Introduction to Agile Hardware
Introduction to Agile Hardware Introduction to Agile Hardware
Introduction to Agile Hardware
 
Holistic Product Development
Holistic Product DevelopmentHolistic Product Development
Holistic Product Development
 
Building enterprise platforms - off the beaten path - SharePoint User Group U...
Building enterprise platforms - off the beaten path - SharePoint User Group U...Building enterprise platforms - off the beaten path - SharePoint User Group U...
Building enterprise platforms - off the beaten path - SharePoint User Group U...
 
Building SharePoint Enterprise Platforms - Off the beaten path
Building SharePoint Enterprise Platforms - Off the beaten pathBuilding SharePoint Enterprise Platforms - Off the beaten path
Building SharePoint Enterprise Platforms - Off the beaten path
 
10 steps to salvation: Creating digital governance that works
10 steps to salvation: Creating digital governance that works10 steps to salvation: Creating digital governance that works
10 steps to salvation: Creating digital governance that works
 
Practical agile TechExeter
Practical agile TechExeterPractical agile TechExeter
Practical agile TechExeter
 
Practical Agile. Lessons learned the hard way on our journey building digita...
Practical Agile.  Lessons learned the hard way on our journey building digita...Practical Agile.  Lessons learned the hard way on our journey building digita...
Practical Agile. Lessons learned the hard way on our journey building digita...
 
Big Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil GamesBig Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil Games
 
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
 

More from Randy Shoup

Breaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsBreaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsRandy Shoup
 
Ten Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionTen Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionRandy Shoup
 
Managing Data in Microservices
Managing Data in MicroservicesManaging Data in Microservices
Managing Data in MicroservicesRandy Shoup
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015Randy Shoup
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-ServicesRandy Shoup
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYERandy Shoup
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...Randy Shoup
 

More from Randy Shoup (7)

Breaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building TeamsBreaking Codes, Designing Jets, and Building Teams
Breaking Codes, Designing Jets, and Building Teams
 
Ten Lessons of the DevOps Transition
Ten Lessons of the DevOps TransitionTen Lessons of the DevOps Transition
Ten Lessons of the DevOps Transition
 
Managing Data in Microservices
Managing Data in MicroservicesManaging Data in Microservices
Managing Data in Microservices
 
From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015From the Monolith to Microservices - CraftConf 2015
From the Monolith to Microservices - CraftConf 2015
 
Concurrency at Scale: Evolution to Micro-Services
Concurrency at Scale:  Evolution to Micro-ServicesConcurrency at Scale:  Evolution to Micro-Services
Concurrency at Scale: Evolution to Micro-Services
 
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYEQCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
QCon New York 2014 - Scalable, Reliable Analytics Infrastructure at KIXEYE
 
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
QCon Tokyo 2014 - Virtuous Cycles of Velocity: What I Learned About Going Fas...
 

Recently uploaded

Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 

Recently uploaded (20)

Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 

How to Move Fast at Scale While Ensuring Stability

  • 1. Moving Fast at Scale Randy Shoup @randyshoup linkedin.com/in/randyshoup
  • 2. Background • VP Engineering at Stitch Fix o Using technology and data science to revolutionize clothing retail • Consulting “CTO as a service” o Helping companies move fast at scale  • Director of Engineering for Google App Engine o World’s largest Platform-as-a-Service • Chief Engineer at eBay o Evolving multiple generations of eBay’s infrastructure
  • 9. Humans + Data Science • 1:1 Ratio of Data Science to Engineering o More than 100 software engineers o ~80 data scientists and algorithm developers o Unique ratio in our industry • Apply intelligence to *every* part of the business o Buying o Inventory management o Logistics optimization o Styling recommendations o Demand prediction • Humans and machines augmenting each other @randyshoup linkedin.com/in/randyshoup
  • 11. Willingness to go fast Ability to go fast +
  • 13. High-Performing Organizations • Multiple deploys per day vs. one per month • Commit to deploy in less than 1 hour vs. one week • Recover from failure in less than 1 hour vs. one day • Change failure rate of 0-15% vs. 31-45% @randyshoup linkedin.com/in/randyshoup https://puppet.com/resources/whitepaper/state-of-devops-report
  • 14. High-Performing Organizations 2.5x more likely to exceed business goals o Profitability o Market share o Productivity @randyshoup linkedin.com/in/randyshoup https://puppet.com/resources/whitepaper/state-of-devops-report
  • 17. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 18. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 19. Conway’s Law • Organization determines architecture o Design of a system will be a reflection of the communication paths within the organization • Modular system requires modular organization o Small, independent teams lead to more flexible, composable systems o Larger, interdependent teams lead to larger systems • We can engineer the system we want by engineering the organization @randyshoup linkedin.com/in/randyshoup
  • 20. Small “Service” Teams • Full-Stack, “2 Pizza” Teams o No team should be larger than can be fed by 2 large pizzas o Typically 4-6 people o All disciplines required for the team to function • Aligned to Business Domains o Clear, well-defined area of responsibility o Single service or set of related services o Deep understanding of business problems • Growth through “cellular mitosis” @randyshoup linkedin.com/in/randyshoup
  • 21. Ideally, 80% of project work should be within a team boundary.
  • 22. eBay “Train Seats” • eBay’s development process (circa 2006) o Design and estimate project (“Train Seat” == 2 engineer-weeks) o Assign engineers from common pool to implement tasks o Designer does not implement; implementers do not design (or test, or operate) •  Dysfunctional engineering culture o (-) Engineers treated as interchangeable “cogs” o (-) No regard for skill, interest, experience o (-) No pride of ownership in task implementation o (-) No long-term ownership of codebase
  • 23. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 24. “Building the wrong thing is the biggest waste in software development.” -- Mary and Tom Poppendieck, Lean Software Development
  • 25. What problem are you trying to solve?
  • 26. “A problem well-stated is a problem half-solved.” -- Charles Kettering, former head of research for General Motors
  • 27. What Problem Are You Trying to Solve? • Focus on what is important for your business • Problem might be solved without any technology at all o Redefine the problem o Change the business process o Implement manually for a while before automating in an application @randyshoup linkedin.com/in/randyshoup
  • 28. Buy, Not Build • Use Cloud Infrastructure o Faster, cheaper, better than we can do ourselves o Stitch Fix has no owned physical infrastructure anywhere in the world • Prefer Open Source o Kubernetes, Docker, Istio o MySQL, Postgres, Redis, Elastic Search o Machine learning models o Etc. o Usually better than the commercial alternatives (!) @randyshoup linkedin.com/in/randyshoup
  • 29. Buy, Not Build • Third-Party Services o Stitch Fix uses >50 third party services o Logging, monitoring, alerting o Project management, bug tracking o Billing, fraud detection o Etc. • Focus on our core competency o Use services for everything else (!) @randyshoup linkedin.com/in/randyshoup
  • 30. Soon it will be just as common to run your own data center as it is to run your own electrical power generation.
  • 31. Experimental Discipline • State your hypothesis o What metrics do you expect to move and why o Understand your baseline • Run a real A | B test o Sample size o Isolated treatment and control groups o No peeking or quitting early! • Obsessively log and measure o Understand customer and system behavior o Understand why this experiment worked or did not
  • 32. Experimental Discipline • Listen to the data o Data trumps hope and intuition o Develop insights for next experiment • Thinking of the experiment is art; evaluating it is science • Rinse and Repeat o This is a journey, not a single step
  • 33. eBay Machine-Learned Ranking • Ranking function for search results o Which item should appear 1st, 10th, 100th, 1000th o Before: Small number of hand-tuned factors o Goal: Thousands of factors • Incremental Experimentation o Predictive models: query->view, view->purchase, etc. o Hundreds of parallel A | B tests o Full year of steady, incremental improvements  2% increase in eBay revenue (~$120M / year)
  • 34. eBay Site Speed • Reduce user-experienced latency for search results • Iterative Process o Implement a potential improvement o Release to the site in an A | B test o Monitor metrics –time to first byte, time to click, click rate, purchase rate  2% increase in eBay revenue (~$120M / year)
  • 35. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 36. Prioritization • Scarce resources require prioritization o We always have more to do than resources to do it o Opportunity cost -- deciding to do X means deciding not to do Y o Every decision is a tradeoff • Priority ← Return on Investment o Impact / Effort • Prioritization is a business decision, not a technical decision @randyshoup linkedin.com/in/randyshoup
  • 38. Fewer Things, More Done • Maximize resources applied to o Priority 1, then o Priority 2 o etc. • Incremental Delivery o Deliver increments along the way instead of everything at the end • Deliver Value Faster o Time Value of Money o Benefit now is worth more than benefit in the future @randyshoup linkedin.com/in/randyshoup
  • 39. “When you solve problem one, problem two gets a promotion.”
  • 40. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 41. Microservices • Single-purpose • Simple, well-defined interface • Modular and composable • Independently deployable A C D E B
  • 42. Evolution to Microservices • eBay • 5th generation today • Monolithic Perl  Monolithic C++  Java  microservices • Twitter • 3rd generation today • Monolithic Rails  JS / Rails / Scala  microservices • Amazon • Nth generation today • Monolithic Perl / C++  Java / Scala  microservices
  • 43. No one starts with microservices … Past a certain scale, everyone ends up with microservices
  • 44. If you don’t end up regretting your early technology decisions, you probably over- engineered.
  • 45. Quality Discipline • Quality and Reliability are “Priority-0 features” o Equally important to users as product features and engaging user experience • Developers responsible for o Features o Quality o Performance o Reliability o Manageability
  • 46. Test-Driven Development • Tests help you go faster o Tests “have your back” o Development velocity • Tests make better code o Confidence to break things o Courage to refactor mercilessly • Tests make better systems o Catch bugs earlier, fail faster @randyshoup linkedin.com/in/randyshoup
  • 47. Optimizing Developer Effort @randyshoup linkedin.com/in/randyshoup • 75% reading existing code • 20% modifying existing code • 5% writing new code https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
  • 48. Optimizing Developer Effort @randyshoup linkedin.com/in/randyshoup • 75% reading existing code • 20% modifying existing code • 5% writing new code https://blogs.msdn.microsoft.com/peterhal/2006/01/04/what-do-programmers-really-do-anyway-aka-part-2-of-the-yardstick-saga/
  • 49. “Do you have time to do it twice?” “We don’t have time to do it right!”
  • 50. The more constrained you are on time or resources, the more important it is to build it right the first time.
  • 51. Build It Right (Enough) The First Time • Build one great thing instead of two half-finished things • Right ≠ Perfect (80 / 20 Rule) •  Basically no bug tracking system (!) o Bugs are fixed as they come up o Backlog contains features we want to build o Backlog contains technical debt we want to repay @randyshoup linkedin.com/in/randyshoup
  • 52. Continuous Integration • Small incremental changes o Faster feedback loop o Easy to understand, code-review, test, roll back • Large changes are risky o Risk of code change is nonlinear in the size of the change o Ex. Initial memcache service submission • Main code branch always shippable o Increased rate of change while reducing risk of change
  • 53. Vicious Cycle of Technical Debt Technical Debt “No time to do it right” Quick- and-dirty
  • 55. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 56. You Build It, You Run It. -- Werner Vogels
  • 57. Continuous Delivery • Repeatable Deployment Pipeline o Low-risk, push-button deployment o Rapid release cadence o Rapid rollback and recovery • Most applications deployed multiple times per day • More solid systems o Release smaller units of work o Smaller changes to roll back or roll forward o Faster to repair, easier to understand, simpler to diagnose @randyshoup linkedin.com/in/randyshoup
  • 58. Observability • Strong practice of detailed, end-to-end monitoring of production systems • Ability to detect and alert on issues anywhere in the system • Sufficient monitoring to be able to do remote runtime diagnosis
  • 59. If you ever have to ssh into a machine, your monitoring has failed.
  • 60. Feature Flags • Configuration “flag” to enable / disable a feature for a particular set of users o Independently discovered at eBay, Yahoo, Facebook, Google, etc. • More solid systems o Decouple feature delivery from code delivery o Rapid on and off o Develop / test / verify in production o Dark launches • Enables experimentation
  • 61. Blameless Post-Mortems • Post-mortem After Every Incident o Document exactly what happened o What went right o What went wrong • Open and Honest Discussion o What contributed to the incident? o What could we have done better? Engineers compete to take personal responsibility (!) @randyshoup linkedin.com/in/randyshoup
  • 62. “Finally we can prioritize fixing that broken system!”
  • 63. Blameless Post-Mortems • Action Items o How will we change process, technology, documentation, etc. o How could we have automated the problems away? o How could we have diagnosed more quickly? o How could we have restored service more quickly? • Follow up (!) @randyshoup linkedin.com/in/randyshoup
  • 64. Failure is not falling down, but refusing to get back up. -- Theodore Roosevelt
  • 65. Moving Fast at Scale •Organizing for Speed •What to Build / What NOT to Build •When to Build •How to Build •Delivering and Operating
  • 66. High-Performing Organizations 2.5x more likely to exceed business goals o Profitability o Market share o Productivity @randyshoup linkedin.com/in/randyshoup https://puppet.com/resources/whitepaper/state-of-devops-report
  • 68. Thank You! • @randyshoup • linkedin.com/in/randyshoup