There are many reasons to adopt DevOps, but just as many ways in which organizational and cultural factors can slow progress. Hear Just Eat's recommendations for building DevOps and Site Reliability teams at scale.
4. What makes us special?
UK/Ukraine/Australia/Canada, 500+ people in Tech
22.8m active customers
30+ teams
450+ services
2,700+ orders/min
1,500+ AWS instances in production
1.6M+ metrics/min
1.5TB+ logs/day
500+ releases/week
45% Revenue Growth (FY17)
FTSE100 >£5bn Market Cap
5. What is SRE at Just Eat?
Site Reliability Engineering operates on 5 key principles:
1 - Relentlessly protect site availability
2 - Enable change to be delivered fast, but with quality
3 - Optimise the use of our infrastructure and resources
4 - Innovate to stay ahead
5 - Foster the right culture at Just Eat
We believe that Dev teams own their product - full stop!
6. How do we structure it?
Our customers are 30+ Dev Teams in multiple countries (these numbers vary)
Central Reliability Engineering department
- 24/7 Service Operations Centre (SOC)
- Development team
- Hosting/Platform
- Delivery Automation (CI/CD)
- Observability
- Service Management
Daily production standups
Weekly risk meeting
Monthly Engineering all-hands
First-class citizen in various architecture/project groups
7. What tools/processes do we own?
The Central vs. Distributed debate
At one extreme, SRE owns all tools and processes
+ economies of scale
+ faster decisions
- limits innovation
- slows down development teams
At the other extreme, Dev teams own all tools and processes
+ maximum flexibility for development teams
- tooling sprawl
- wasted time reinventing the wheel
- support problems
Our solution
+ central support for a range of tooling
+ ability for dev teams to interact via an open-source approach
+ freedom for dev teams to deviate
+ survival-of-the-fittest approach
13. A formula for managing chaos?
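if (ReliabilityScore() < DesiredReliability())
{
    // Below target: SRE steps in to support the team
    LetUsHelpYou();
}
else
{
    // At or above target: recognise the team and grant more autonomy
    LetUsHighlightYou();
    Freedom++;
}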
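The formula above can be sketched as a small runnable example. This is only an illustration in Python, not Just Eat's implementation: the `Team` record, its field names, the team names, and the sample scores are all hypothetical, and the way a reliability score is measured is left open, as it is on the slide.

```python
from dataclasses import dataclass


@dataclass
class Team:
    """Hypothetical record for a dev team; field names are illustrative."""
    name: str
    reliability_score: float    # e.g. measured availability, 0.0 to 1.0
    desired_reliability: float  # e.g. the agreed target, 0.0 to 1.0
    freedom: int = 0            # stand-in for the slide's Freedom counter


def review(team: Team) -> str:
    """Apply the slide's rule once and return the action taken."""
    if team.reliability_score < team.desired_reliability:
        # Below target: SRE engages with the team (LetUsHelpYou).
        return "help"
    # At or above target: highlight the team and extend its freedom.
    team.freedom += 1
    return "highlight"


if __name__ == "__main__":
    # Made-up teams and numbers, purely for demonstration.
    teams = [
        Team("checkout", reliability_score=0.995, desired_reliability=0.999),
        Team("search", reliability_score=0.9995, desired_reliability=0.999),
    ]
    for team in teams:
        print(team.name, review(team), team.freedom)
```

As on the slide, a team that exactly meets its target falls into the "highlight" branch, because the comparison is a strict less-than.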
14. What’s next? The FUTURE!
Automation of observability. A step change from simple time series metrics.
The dream of incident resolution automation. The robots talking to the robots.
15. Questions?
Want to contact us?
richard.haigh@just-eat.com
bennie.johnston@just-eat.com
Want to read more about us?
Our tech blog: https://tech.just-eat.com
Want to work for us? ;)
Our Careers site: https://careers.just-eat.com