JEFF
NICKOLOFF
All in Geek Consulting
Retiring Service
Interfaces: A
Retrospective on
Two 10+ Year Old
Services
@allingeek
jeff@allingeek.com
Who am I?
• Former Amazonian
• Author of Docker in Action
• Independent Software Engineer
• Blogger
• Containerization and AWS consulting
Why should I care?

(about service retirement)
• There are a surprising number of ways to change a
service interface
• Most of those changes will require some retirement
campaign
Background and
Declarations
Microservices enable implementation iteration in isolation
while hindering interface iteration.
Amazon runs an amazing number of microservices.
My former team of eight owned hundreds of services.
Amazon has amazing tooling for service owners

(but it won’t save you).
What follows is an account of a process that surely happens
all the time, everywhere that different people’s code
establishes runtime service dependencies.
A Long Story Short
• We retired two 10+ year old service interfaces each
having hundreds of unique consumers
• We did so in about 7 months
• We did not go out of business
• Other similar deprecation campaigns had been
attempted within the five years prior
What and Why
Two services that predated all of AWS:
• Ol’ McCruftyface suffered from nonsensical UX, an
RPC-style interface, mutable entities, an overly broad
ID space, “variadic” function definitions, and no
authentication.
• Blobby Cleartext used fake crypto, weak (never
rotated) keys, crappy write-through caching, file
re-streaming, and no authentication.
The Plan
• Identify clients that will need to migrate
• Identify active use-cases
• Document migration paths and timeline
• Open communication with clients and present plan
• Follow up regularly and monitor migration efforts
• Increase migration pressure
• Shut it down… eventually
Client Discovery
• Amazon has one package manager to rule them all
• Service dependencies are modeled
• Dependent packages have known owners
• Amazon has strong hardware ownership… Inspect
our service logs for IP addresses
• Additional challenges?

Mixed ownership, old software, and infrequent usage
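
The log-inspection step can be sketched as a tally of requests per source IP, joined against a host-ownership inventory. This is a minimal sketch, not the tooling used in the talk: the log line format (client IP as the first whitespace-separated field) and the `owners` mapping are assumptions made for illustration.

```python
from collections import Counter


def count_clients(log_lines):
    """Tally requests per source IP; assumes the IP is the first field."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts


def group_by_owner(counts, owners):
    """Sum request counts per owning team; unmapped IPs land under 'unknown'."""
    by_owner = Counter()
    for ip, n in counts.items():
        by_owner[owners.get(ip, "unknown")] += n
    return by_owner


# Hypothetical sample data: four log lines and a partial ownership inventory.
sample = [
    "10.0.0.5 GET /entity/42",
    "10.0.0.5 GET /entity/43",
    "10.0.0.9 PUT /blob/abc",
    "10.0.0.7 GET /entity/1",
]
owners = {"10.0.0.5": "team-a", "10.0.0.9": "team-b"}
by_owner = group_by_owner(count_clients(sample), owners)
```

Whatever ends up under "unknown" is the set of clients that the later slides deal with through advertisements and scream tests.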
Analysis: Client
Composition
[Two bar charts, axes 0 to 100: "% Identifiable" (Identified, Partial, Unknown) and "% Grouped by Age" (> 10 yrs, > 5 yrs, > 1 yr, < 1 yr). Bar values not recoverable.]
Wait…
“Why didn’t you just make all of the client changes yourself?”
Hundreds of clients, unique release cycles, repositories,
permissions issues, and politics.
Migration Path and
Assistance
• Analyze existing API and usage patterns from logs
• Document an internal client migration
• Prepare a migration matrix (before and after
mapping for discovered use-cases)
• Organize a migration assistance on-call rotation
• Establish lightweight procedures for assistance
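
A migration matrix can be kept as plain data: one row per discovered use-case, mapping the old call to its replacement. The sketch below is illustrative only; every service and operation name in it is hypothetical, not taken from the retired services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationRow:
    use_case: str   # what the client is trying to do
    old_call: str   # how it is done against the retiring interface
    new_call: str   # the documented replacement
    notes: str = ""


# Hypothetical rows; real ones would come from log analysis.
MATRIX = [
    MigrationRow("read entity by id", "OldService.get(id)",
                 "NewService.get_entity(id)"),
    MigrationRow("update entity in place", "OldService.set(id, fields)",
                 "NewService.create_version(id, fields)",
                 "entities become immutable"),
]


def unmapped_use_cases(observed, matrix):
    """Return observed use-cases that have no documented migration path."""
    documented = {row.use_case for row in matrix}
    return sorted(set(observed) - documented)


gaps = unmapped_use_cases(
    ["read entity by id", "bulk export", "update entity in place"], MATRIX
)
```

Checking observed use-cases against the matrix surfaces gaps before clients find them during their own migrations.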
Timeline and
Communication
• Make your sales pitch (carrots)
• Provide a complete but concise explanation,
documentation, assistance options, timeline, and
milestones
• Establish clear rules for regular communication

(frequency, medium, heartbeat, etc.)
• Highlight escalation paths and consequences of
compliance failure (sticks)
Carrots
• Better availability
• Clearer API UX
• Enhanced features
• Immutability
• Tighter latency guarantees
• Real data protection
“forgiveness > permission”

— Larry Wall
Sticks and Secret Sticks
• Secretly reducing service redundancy and
throughput capacity (make the services worse)
• Failure to communicate - pageable
• Missed milestones - pageable
You’re supposed to own these clients, so own them.
Following Up… with Sticks
• Half your customers will comply quickly <3
• A quarter will talk to you and never act

(missing milestones - page ’em)
• About a quarter will ignore you

(until later - page ’em regularly)
• Some small percentage will never respond at all

(wait for the hammer)
More Client Discovery…
Sticks
• We identified that some clients were not identifiable
• Fell back to weekly advertisements in org-wide
meetings
• Unscheduled, inconvenient scream tests
Scream tests have to be painful or they’ll be ignored.
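
A scream test can be sketched as a gate that rejects every request inside an outage window and records who was turned away, so the callers that scream can be matched to traffic. This is an assumption about one possible mechanism, not a description of how the tests in the talk were run; the class, client IDs, and times are hypothetical.

```python
from datetime import datetime, timedelta


class ScreamTest:
    """Rejects every request that arrives inside a configured outage window."""

    def __init__(self, start, duration):
        self.start = start
        self.end = start + duration
        self.rejected = []  # (client_id, timestamp) pairs for follow-up

    def allow(self, client_id, now):
        """Return False (and record the caller) while the window is open."""
        if self.start <= now < self.end:
            self.rejected.append((client_id, now))
            return False
        return True


# Arbitrary illustrative window: 45 minutes starting at 10:30.
start = datetime(2020, 1, 1, 10, 30)
test = ScreamTest(start, timedelta(minutes=45))
before = test.allow("client-a", start - timedelta(minutes=1))
during = test.allow("client-b", start + timedelta(minutes=5))
after = test.allow("client-b", start + timedelta(minutes=45))
```

Passing `now` in explicitly keeps the gate testable; a real deployment would read the clock at request time.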
Prioritizing Empathy
1. I acknowledge that the short-term gains of ignoring
service debt are tempting.
2. I also acknowledge that your team’s needs are
important.
3. However, I propose that massive risks like this one
are more important, and that our customers share my
opinion.
4. I’m always going to prioritize customer needs over
your comfort.
A Hammer
I could always just turn them off, release the
hardware, and delete the service definitions.
“… but you can’t do that.”
“Are you sure?”
A Few Anecdotes
• Muddled ownership problems
• Reorg problems
• Unowned clients
• “This person just doesn’t want to work” problems
• “The painful test is painful” problems
Swinging the Hammer
It’s 10:32 am PST, we’ve only got a trickle of traffic,
the remaining known clients have been unresponsive for
months, we’ve extended the shutdown date by a month, and
we’re wearing “deal with it” sunglasses inside.
Hit it.
Dousing the Final Fires
• Swinging the hammer will light a few fires. It
may take a few days for people to notice them.
• Deal with them and resist all urges to turn the
service back on. Salt the earth where it stood.
Say, “This is your life now.”
Lessons Learned

… or suspicions confirmed
• Service adoption (really all dependency) is debt
• Don’t transfer ownership of “complete” services
• Services that “just work” suffer the most drastic
knowledge rot
• Greater success will bring greater pain
Things I’d do again…
• Structured planning and communication
• Provide strong positive incentives
• Use operational pain as leverage
• Scream tests were very successful
• Swing the hammer
Things I’d do Next Time
• Improve communication consistency
• Increase awareness of scream test risk

… but not of individual tests
• Escalate more quickly
JEFF
NICKOLOFF
All in Geek Consulting
Questions about
services or
Docker? Come
talk in the hall!
@allingeek
jeff@allingeek.com
