SpaceX's triumph in the space industry reveals essential principles that can guide our DevOps journey: optimizing for the whole system, embracing failure, and tightening feedback loops. Unfortunately, in the world of DevOps, these principles are often lost in a rush to adopt the latest technologies, such as Kubernetes, without understanding their true value.
This talk will explore how many organizations are caught up in the hype, sacrificing the core of DevOps for superficial implementations. Drawing parallels with SpaceX's development and the audacity of their mission, we'll investigate how to bring DevOps back to its roots.
Prepare for a potentially controversial examination of common practices that may be slowing you down, like mandated pull requests and feature branching. By aligning with the principles that powered SpaceX's success, we'll redefine what DevOps means and how it can transform the way we work.
From SpaceX Launch Pads to Rapid Deployments
1. By Victor Szoltysek
October 12th / 2023
10x Your DevOps with First Principles
2. Measuring DevOps Success
Lead Time to Change
How quickly can you go from code
commit to production?
Good: Hours to days
Bad: Months
3. DevOps:
SpaceX Lens
Traditional DevOps mirrors the
early rocket industry: costly,
slow, massively broken.
SpaceX thrived by
questioning norms.
Time to rethink our DevOps.
4. Genesis of SpaceX
Musk's frustration with Mars
exploration costs led to
SpaceX and Starship, the
mightiest rocket.
Starship - SpaceX's masterpiece
A height surpassing even the Statue of Liberty
5. Record-Breaking
Launch
Starship broke a 50-year
record, doubling Saturn V's
thrust, redefining space
travel.
SpaceX Starship - April 20th 2023
First Integrated Flight Test
6. Unprecedented
Savings
Falcon 9 cut costs by 20x
compared to the Space
Shuttle; Starship aims for up
to 200x, unlocking new
space markets.
*Graph is Logarithmic scale.
Prior to SpaceX, launch costs were stagnant or rising.
7. Reviving a
Stagnant Space
Industry
How did SpaceX reignite
rapid space progress?
Neil Armstrong on the Moon
A giant leap in 1969, then a long wait for the next big step in space exploration
10. The Real Reason:
First Principles
Thinking
Utilizing a top-down
approach to understand and
address fundamental truths,
focusing on the underlying
value rather than getting
swayed by technologies.
LIFT
GRAVITY
THRUST
DRAG
Rocket Science Simplified: A cylinder filled with fuel and ignited - Thrust, Gravity, Drag, and Lift
The fundamental principles that SpaceX harnessed to innovate space travel,
exemplified in the Starship's daring 'belly flop' descent maneuver
11. Measuring SpaceX Success
Cost per Pound to
Orbit
Reflects the monetary
efficiency of space launches.
Good: ~$50/lb
(Starship estimated)
Bad: ~$10,000/lb
(Space Shuttle estimated)
12. Principle #1:
Target Outcomes
View the whole picture and
remove barriers between
teams, aligning everyone
towards the shared outcome
of minimizing cost per
pound to orbit.
SpaceX Starbase at Boca Chica
Factory, launch pad, and testing sites united in one location
13. Siloed Setbacks
The FAA's siloed structure
fostered misaligned
objectives and rubber
stamping, causing delays in
SpaceX's Mars mission.
Ethiopian Airlines Flight 302 - Second 737-MAX Crash - March 10, 2019 - 157 deaths
Highlighted the FAA's conflicting interests which allowed Boeing to self-certify the 737-MAX
No individual was ever found liable
14. Relentless Pursuit
of Simplicity
Focusing on simplifying
parts and processes
minimizes components,
enhancing functionality and
reducing failure points.
SpaceX Raptor Engine V1 vs V2
“The best part is no part. The best process is no process.
It weighs nothing. Costs nothing. Can't go wrong.” - Elon Musk
15. Principle #2:
Tighten Feedback
Promote real-world testing
and continuous integration
through rapid prototyping to
quickly reduce errors and
accelerate progress, moving
beyond theoretical diagrams.
Starship (23,24,25) at Starbase - May 22, 2022
Evolution through feedback
“It's very important to have a feedback loop, where you're constantly thinking
about what you've done and how you could be doing it better.” - Elon Musk
16. Steady Releases
with Progressive
Improvements
Embrace an early Minimum
Viable Product (MVP), mature
it through endless iterations
and regular releases, and
avoid the big bang's race
for perfection upfront.
SpaceX Falcon 1 - September 28, 2008
The small, low-risk design start that led to the creation of Starship
“If a design is taking too long, the design is wrong” - Elon Musk
17. Success through
Continuous
Experimentation
Encourages flexible pivoting
and responsiveness to
feedback, promoting a
willingness to explore
different approaches.
Starship's Unconventional Hot Staging Technique Pivot (2023)
Igniting the upper stage immediately after disconnecting while still
touching the lower stage, aiming for a 10% fuel efficiency gain
18. Principle #3:
Embrace Failures
Emphasizes that failures are
stepping stones, nurturing
exploration over a risk-averse
culture that stifles
progress.
Starship “Unscheduled Disassembly” Celebration - April 20th 2023
“If things are not failing, you’re not innovating enough” - Elon Musk
19. Stability through
Failure Readiness
Advocates for resilience
over perfection, emphasizing
a proactive approach
towards rapid recovery
instead of an unattainable
perfect system.
Starship Booster Engine Failure - April 20th 2023
Redundant design allowed other engines to compensate
20. Fast Failures,
Safer Successes
Underscores the paradox
that striving for quicker,
more frequent failures
actually paves the way for a
safer, more reliable system.
In 2022, Falcon 9's 60 successful launches set a
new record, earning it the title of the world's safest,
most launched, and most reliable rocket.
21. DevOps: Untapped
Potential
Massive potential lies in
adhering to long-established
best practices, rather than
chasing technologies.
Jez Humble and David Farley's 2010 book 'Continuous Delivery'
remains relevant. Its core principles, now more than a decade old, are still largely untapped,
mirroring the vast potential seen in space exploration
22. DevOps Goals
Deployment Frequency: How often code is deployed to production.
Good: Multiple daily. Bad: Less than monthly.
Lead Time for Changes: Time from code commit to deploy.
Good: Hours to days. Bad: Weeks to months.
Change Failure Rate: Percentage of failed deployments.
Good: < 15%. Bad: > 30%.
Mean Time to Recovery: Time taken to restore service post-failure.
Good: < 1 hour. Bad: Days.
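The four metrics above can be computed from a simple log of deployments. The following is a minimal sketch, not a standard tool: the `Deployment` record and its field names are assumptions made for illustration, and a real pipeline would pull these timestamps from its CI/CD and incident systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarise the four metrics for deployments made within period_days."""
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        # Deployment Frequency
        "deployments_per_day": len(deployments) / period_days,
        # Lead Time for Changes (median is less skewed by outliers than the mean)
        "median_lead_time": median(lead_times) if lead_times else None,
        # Change Failure Rate
        "change_failure_rate": len(failures) / len(deployments) if deployments else 0.0,
        # Mean Time to Recovery
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }
```

Tracking these numbers over time, rather than as a one-off snapshot, is what shows whether a process change actually helped.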
23. Principle #1:
Target Outcomes
Embracing the full software
lifecycle and sometimes making
unconventional choices better
align with overarching goals. It's
not about chasing technologies,
but optimizing the entire process
to excel in DORA metrics.
Starship's Methane(CH4) fuel choice
Unconventional, more expensive and less dense
but making sense in the context of the Mars vision
24. Ownership over
Handoffs
Foster accountability and
rapid iteration with
self-sufficient teams that align
with business needs,
reducing siloed
inefficiencies.
Houston, We Have a Problem: Mission Control in Texas, launch in Florida
A legacy political move that induces team silos and hampers communication
25. Ownership over Handoffs
Do Less: 'Throw it over the wall, it's an Ops problem now' mentality
Do More: Developers owning the application in production
Do Less: Separate QA, DevOps, DB teams
Do More: Embedded Team Members
Do Less: Team interactions with tickets
Do More: Team interactions with APIs
26. Simplicity over
Complexity
Minimize complexity to
enhance efficiency, reduce
bugs, and optimize the
process. The best code is
the one unwritten.
Starship booster caught mid-air by tower 'chopsticks'
Ditching landing legs showcases SpaceX's dedication to simple, efficient design.
27. Simplicity over Complexity
Do Less: Reinventing the Wheel (including K8s platforms)
Do More: Defaulting to easier solutions (like PaaS and Serverless)
Do Less: Overuse of Containers and Microservices
Do More: Value-focused appropriate use of technology
Do Less: Manual and complicated pipelines, environment setup, and deploys
Do More: Automated and simple pipelines, environment setup, and deploys
28. Principle #2:
Tighten Feedback
Through Continuous
Integration, frequent
integration at every stage
ensures constant feedback
and smoother deployments.
Falcon 9 Grid Fins
Constantly adjusting hundreds of times per second for precise steering
29. Trust over Control
Rigid control and oversight
stifle creativity and slow
down integration, and are often
rooted in organizational
insecurities and a need for
control.
European Space Agency (ESA) - Ariane 5 Rocket - 22 Flags, 22 Bottlenecks
A recognized barrier to rapid innovation
30. Trust over Control
Do Less: Mandated Pull Requests
Do More: Pair Programming, Static Code Analysis, Async Reviews, Targeted Pull Requests, Trust in Developers
Do Less: Gated Approval Steps
Do More: Automated Verification, Async Verification, Easy Rollbacks, Trust in Teams
Do Less: Lengthy Approval Processes for Resources
Do More: Self-Service Provisioning, Trust in Teams
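Replacing a gated approval step with automated verification and an easy rollback can be sketched as below. This is an illustration under stated assumptions: `release` and `healthy` are hypothetical callables standing in for whatever the deployment platform and health checks actually provide.

```python
from typing import Callable


def deploy_with_rollback(
    new_version: str,
    current_version: str,
    release: Callable[[str], None],  # hypothetical: puts a version into production
    healthy: Callable[[], bool],     # hypothetical: automated post-deploy check
    checks: int = 3,
) -> str:
    """Release new_version, verify it automatically, roll back on failure.

    Returns the version left running in production.
    """
    release(new_version)
    for _ in range(checks):
        if not healthy():
            # Easy rollback: re-release the last known good version.
            release(current_version)
            return current_version
    return new_version
```

The point of the design is that no human sign-off sits in the deploy path: the check runs after release, and a failed check costs one rollback rather than a queue of approvals.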
31. Integration Over
Isolation
Continuous integration,
including end-to-end testing,
is crucial to avoid 'merge
hell' and ensure cohesive
functioning.
ESA Ariane 5 Rocket Failure - June 4, 1996
One of the Most Expensive Software Bugs of All Time, Costing $370 Million
Ariane 4 code was reused without proper integration testing on Ariane 5, and converting a 64-bit floating-point value to a 16-bit signed integer overflowed
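The class of bug behind this failure, narrowing a wide value into a 16-bit signed integer, can be illustrated in a few lines. This is only a sketch of the arithmetic, not a reconstruction of the flight software; the function names are made up for the example.

```python
import ctypes

INT16_MIN, INT16_MAX = -32768, 32767


def unchecked_to_int16(value: float) -> int:
    """Truncate to an integer and keep only the low 16 bits (no range check)."""
    return ctypes.c_int16(int(value)).value


def checked_to_int16(value: float) -> int:
    """Convert only if the truncated value fits in a signed 16-bit integer."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result
```

A value that was always small enough on the old system passes both functions unchanged; a larger value from a new flight profile either wraps silently or raises, and only an integrated test with realistic inputs reveals which.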
32. Integration Over Isolation
Do Less: Long Lived Feature Branching
Do More: Trunk Based Development with Feature Toggling (implicit and explicit)
Do Less: Blind Faith in Unit Tests, Infrequent Integration
Do More: Daily Fully Integrated End-to-End Snapshot Builds and Deployments, Eating your own dog food
Do Less: As-needed Pulls from Master
Do More: Daily Pulls from Master
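An explicit feature toggle, which lets unfinished work live on trunk without being visible to users, might look like the sketch below. The `FEATURE_` environment-variable convention, the `bulk_discount` flag, and the `checkout_total` function are all assumptions for illustration; real projects often use a dedicated flag service instead.

```python
import os


def is_enabled(flag: str, default: bool = False) -> bool:
    """Read a toggle from an environment variable such as FEATURE_BULK_DISCOUNT."""
    raw = os.environ.get(f"FEATURE_{flag.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "on", "yes"}


def checkout_total(prices: list[float]) -> float:
    """Sum the basket, applying an in-progress discount only when toggled on."""
    total = sum(prices)
    if is_enabled("bulk_discount") and len(prices) >= 3:
        # Work in progress: merged to trunk daily, dark in production by default.
        total *= 0.9
    return round(total, 2)
```

Because the new path is integrated and exercised on every build, there is no long-lived branch to merge later; turning the feature on is a configuration change, not a code merge.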
33. Principle #3:
Embrace Failures
Value failures as lessons to
detect and resolve issues
swiftly, fostering a culture of
openness and continuous
growth.
Starship SN10 - March 3, 2021
Successfully landed only to explode minutes later
Elon Musk celebrates: 'Starship SN10 landed in one piece!'
34. Resiliency over
Perfection
Design for automatic
detection and recovery,
focusing on resilience over
chasing an unattainable
perfection.
Mars Exploration Rover (with watchdog timer mechanism)
Self-recovery allows uninterrupted exploration despite software crashes
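The watchdog idea translates directly to software: the application must "kick" the timer regularly, and if it stops doing so a supervisor triggers recovery. The sketch below is a software analogy with hypothetical names, not the rover's mechanism; the clock is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable


class Watchdog:
    """Triggers a recovery action when no kick arrives within timeout_s."""

    def __init__(
        self,
        timeout_s: float,
        recover: Callable[[], None],                # hypothetical restart action
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._recover = recover
        self._clock = clock
        self._last_kick = clock()
        self.resets = 0

    def kick(self) -> None:
        """Called by the healthy application to signal that it is still alive."""
        self._last_kick = self._clock()

    def poll(self) -> bool:
        """Called periodically by a supervisor; returns True if recovery ran."""
        if self._clock() - self._last_kick > self._timeout_s:
            self._recover()
            self.resets += 1
            self._last_kick = self._clock()  # give the restarted app a fresh window
            return True
        return False
```

The application never has to diagnose its own hang; detection and recovery are automatic, which is the resilience-over-perfection trade described above.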
35. Resiliency over Perfection
Do Less: Manual Weekend Deploys following Lengthy Delivery Plans
Do More: Zero-downtime Automated Business Hour Deploys with instant rollbacks
Do Less: Striving for Perfect, Unchanging Systems
Do More: Automatic Self Healing Applications
Do Less: Prolonged Manual QA, Extensive Pre-Release Validation
Do More: Immediate Alerting, Early Error Detection and Resolution
36. Openness over
Fear
A safe workspace fosters
creativity and collaboration.
Tools like Post-Mortems and
Retrospectives turn failures
into growth opportunities.
Challenger Disaster - January 28, 1986 - First Space Shuttle Loss - 7 deaths
Open communication and heeding engineers' warnings could have prevented the disaster
No individual was ever found liable
37. Openness over Fear
Do Less: 'That's how we do things' mentality
Do More: Questioning, trying unproven solutions
Do Less: Sacred Cows, fearing opposition
Do More: Retrospectives, Post-Mortems, constructive feedback
Do Less: Fear of open discussion
Do More: Open dialogue, anonymous feedback
38. Recap
Target Outcomes: Focus on achieving overarching goals, not on adhering to the status quo or tools.
Tighten Feedback: Foster rapid iterations, immediate responses, and continuous integration to evolve swiftly.
Embrace Failures: Learn and innovate from missteps, turning setbacks into stepping stones.
39. The Challenge
Ahead
Ignite a culture of questioning
everything and building from first
principles.
Embrace experimentation, and steer
your focus towards achieving stellar
DORA metrics, rather than getting
entangled in technology fads.
Wernher von Braun - The Father of Modern Rocketry
A controversial hire that paid off (Operation Paperclip)
40. Trial
Optional pull requests, especially for low-risk items
WIP commits of isolated features
Alternative methods of code reviews
Alerting on uncaught errors via webhooks
Automated daily deploys of latest code
Simpler alternatives to K8s (e.g. PaaS)
Retrospectives with anonymous feedback
Map out end-to-end path to production value stream
A culture of psychological safety and experimentation
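Of the items above, alerting on uncaught errors via webhooks is one of the cheapest to try. The following is a minimal sketch using only the Python standard library; the webhook URL and the JSON payload shape are assumptions, since each chat or incident tool defines its own format.

```python
import json
import sys
import traceback
import urllib.request

# Hypothetical endpoint: replace with the webhook URL of your alerting tool.
WEBHOOK_URL = "https://example.invalid/hooks/alerts"


def build_alert(exc_type, exc, tb) -> bytes:
    """Serialise an uncaught exception into a JSON payload (assumed shape)."""
    payload = {
        "text": f"Uncaught {exc_type.__name__}: {exc}",
        "traceback": "".join(traceback.format_exception(exc_type, exc, tb)),
    }
    return json.dumps(payload).encode("utf-8")


def send_alert(body: bytes, url: str = WEBHOOK_URL) -> None:
    """POST the payload to the webhook."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    urllib.request.urlopen(request, timeout=5)


def alerting_excepthook(exc_type, exc, tb) -> None:
    """Send an alert, then fall back to Python's default error reporting."""
    try:
        send_alert(build_alert(exc_type, exc, tb))
    except OSError:
        pass  # never let a failed alert hide the original error
    sys.__excepthook__(exc_type, exc, tb)


def install() -> None:
    """Route all uncaught exceptions in this process through the alerting hook."""
    sys.excepthook = alerting_excepthook
```

Calling `install()` once at application startup is enough; from then on any exception that escapes the program produces an alert as well as the usual traceback.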