In the process of migrating our architecture to microservices we suffered from a lack of alignment across our applications. This had several painful effects on our organisation, such as a lack of homogeneous procedures, problems in managing services at scale and a lot of duplicated work across development teams.
Almost one year ago we started to develop a common internal framework based on Spring Boot to tackle problems at different levels. We began with monitoring, logging and metrics and we quickly added several other features.
One of the main benefits we achieved was to streamline and improve consistency in the way we deploy and support our services from an ops perspective.
In this talk we describe the journey that brought us to what we have at the moment: a project inherited by more than 40 different microservices, which provides a lot of common features and a clean space where our new contribution model can take place.
3. A tech company to the core
● tech department: 300+ people
● applications: ~150
● database: ~180 database schemas and ~50TB of data
● servers: 1400 VMs, 300 physical machines
● locations: Chiasso, Milan, Madrid, London, Bengaluru
4. At the beginning DEVs created the Monolith
credits https://www.flickr.com/photos/southtopia/5702790189
7. The first process of extraction
1. identify a feature and a business opportunity
○ new UX on checkout page
2. extract and wrap into a service
3. measure improvements
(Agile procedure to prepare the new development)
8. Monolith: the micro-problems
● high coupling from
○ a functional point of view
○ an architectural point of view
● configuration hell
○ Tomcat
○ env variables, properties, etc.
● dependency hell
○ different libraries/versions
9. Spring Boot at the core
● well known inside our company
● simplify development
● bounded configuration
○ application.yaml
○ java-opts.yaml
● self-contained deployability
○ Tomcat now inside the jar
10. … and it worked!
credits https://www.pexels.com/photo/hand-thumbs-up-thumb-black-and-white-8252/
12. .. ready to follow the micro-service way!
credits https://www.pexels.com/photo/women-standing-on-race-track-while-preparing-for-a-run-race-during-daytime-28554/
13. Issues (1): low-level
● LACK OF alignment across micro-services
○ solutions
○ libraries
○ logging/tracing
○ monitoring/alerting
● LACK OF awareness about OPS
● LACK OF alignment across environments
14. Issues (2): higher perspective
● reinventing the wheel
● different tools for same issues
● weak contribution model
○ many silos
15. Our objectives
● hexagonal architecture at the core
● software alignment
● centralized monitoring/logging, with alerts
● zero downtime deployment
● automation everywhere
16. A year-long endeavour
● build the first company framework
● build a new, modern infrastructure
● migrate the search (flight/hotel) product there
... without:
● impacting the business
17. Two new teams: platform and devops
credits https://www.pexels.com/photo/blue-lego-toy-beside-orange-and-white-lego-toy-standing-during-daytime-105822/
18. Team objectives
● Platform team
○ servant leadership (no ivory tower)
○ develop the framework and common services
○ support adoption
● Devops team
○ focus on automation
■ building pipelines (CI + CD)
19. Hexagonal Architecture (Port and Adapters)
● designed for testability
● independent from tech (framework, DB, client for external services)
● independent from UI
20. Tailored service template (cit. Thoughtworks)
[..] which can be used to quickly seed new services,
pre-configured to operate within that organization's
production environment [..] This is a very useful technique
for encouraging collaborative evolution while retaining
lightweight governance
21. Platform framework: a thin layer
● common logging format
● common metrics (monitoring/alerting)
● common tracing format
● centralized dependency management
● smart HTTP Client
● Kubernetes lifecycle (graceful startup/shutdown)
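The slides do not show how the graceful startup/shutdown is implemented. As a minimal sketch of the general idea, assuming a hypothetical `GracefulLifecycle` helper (the class, its method names and the polling interval are illustrative, not part of the framework): the service only accepts requests once it has been marked as started, and on shutdown it stops accepting new requests and waits for in-flight ones to drain.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not taken from the framework described in the talk.
final class GracefulLifecycle {
    private final AtomicBoolean acceptingTraffic = new AtomicBoolean(false);
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Call once startup has finished; until then every request is rejected. */
    void markStarted() {
        acceptingTraffic.set(true);
    }

    /**
     * Registers a request. The counter is incremented before the flag is
     * checked, so a request that passes the check is always visible to shutdown().
     */
    boolean tryBeginRequest() {
        inFlight.incrementAndGet();
        if (!acceptingTraffic.get()) {
            inFlight.decrementAndGet();
            return false;
        }
        return true;
    }

    /** Call when a request accepted by tryBeginRequest() has completed. */
    void endRequest() {
        inFlight.decrementAndGet();
    }

    /**
     * Stops accepting new requests, then polls until in-flight requests have
     * drained or the timeout expires. Returns true if the drain completed.
     */
    boolean shutdown(long timeoutMillis) {
        acceptingTraffic.set(false);
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (inFlight.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10); // assumed polling interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In a real service, `shutdown` would typically be triggered when the orchestrator sends the termination signal, with a timeout shorter than the pod's termination grace period.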
22. Platform framework: based on Spring Boot
● self-contained uber jar
● easier to debug
● configuration is now inside the artifact
● run from command line
● pragmatic test approach, also for integration tests
23. Complexity in the new infrastructure
[Diagram: cluster of nodes (NODE 1 ... NODE 70) hosting pods for APP2-PRODUCTION and APP3-PRODUCTION; layers shown: BASE OS, JAVA SRE, START/STOP, JAR APP]
24. Healthy use case: self-healing
"When a container is dead I will restart it"
"When a container is ready I will forward traffic to it"
25. A common solution for all the micro-services
/liveness
● when tomcat container is up
● when ratio active/max threads < threshold
/readiness
● all the startup jobs have run
.. ongoing never-ending research ..
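The two checks above can be sketched in plain Java. This is an illustration under stated assumptions, not the framework's code: the class names, the way the server status and thread counts are supplied, and the threshold value are all hypothetical. In practice the results would be exposed over HTTP at /liveness and /readiness.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

/** Liveness: the embedded server is up and its worker threads are not saturated. */
final class LivenessCheck {
    private final BooleanSupplier serverUp;   // hypothetical hook into the embedded server
    private final IntSupplier activeThreads;  // hypothetical hook into its thread pool
    private final int maxThreads;
    private final double threshold;           // assumed value, e.g. 0.9

    LivenessCheck(BooleanSupplier serverUp, IntSupplier activeThreads,
                  int maxThreads, double threshold) {
        if (maxThreads <= 0) {
            throw new IllegalArgumentException("maxThreads must be positive");
        }
        this.serverUp = serverUp;
        this.activeThreads = activeThreads;
        this.maxThreads = maxThreads;
        this.threshold = threshold;
    }

    /** True while the server is up and active/max threads stays below the threshold. */
    boolean isAlive() {
        double ratio = (double) activeThreads.getAsInt() / maxThreads;
        return serverUp.getAsBoolean() && ratio < threshold;
    }
}

/** Readiness: becomes true once every startup job has reported completion. */
final class ReadinessCheck {
    private final AtomicInteger pendingJobs;

    ReadinessCheck(int startupJobs) {
        this.pendingJobs = new AtomicInteger(startupJobs);
    }

    /** Each startup job calls this exactly once when it finishes. */
    void jobCompleted() {
        pendingJobs.decrementAndGet();
    }

    boolean isReady() {
        return pendingJobs.get() <= 0;
    }
}
```

Keeping the two checks separate matches the two statements on the previous slide: a failing liveness check leads to a restart, while a failing readiness check only keeps traffic away from the container.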
28. Platform framework from a team perspective (1)
● focus on application domain
● focus on features to develop
● better understanding of what happens inside
29. Platform framework from a team perspective (2)
● always aligned with latest technologies
● lightweight governance
● a platform that encourages contribution
30. Platform framework from platform engineering perspective
● where do the problems lie?
○ not anymore in the smaller components
○ now at the border
● Check how different micro-services work together
○ (Only in case of centralised logging and monitoring)
31. Innovation: KeyValueStorage port example
public interface KeyValueStorageService
{
    boolean put(ValueStored data);
    ValueStored get(String key);
    void delete(String key);
    Date findLastUpdateById(String key);
}
KeyValueStorageService.java
32. Innovation
● KeyValueStorage
○ Mysql adapter
○ Google BigTable adapter (in the cloud)
● easy to change the tech part while keeping the domain part unchanged
● PlatformFw
○ initializes application (main partition)
○ is the platform adapter
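To illustrate how an adapter plugs into the KeyValueStorage port, here is a minimal sketch of a third, in-memory adapter that could stand in for the MySQL or BigTable one in tests. The port is repeated from the earlier slide (package-private so the sketch fits in one file). The `ValueStored` fields and the `InMemoryKeyValueStorage` class are assumptions made for this example; the slides do not show them.

```java
import java.util.Date;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Port, repeated from the slide.
interface KeyValueStorageService {
    boolean put(ValueStored data);
    ValueStored get(String key);
    void delete(String key);
    Date findLastUpdateById(String key);
}

// Hypothetical shape: the slides do not show what ValueStored contains.
final class ValueStored {
    private final String key;
    private final String value;

    ValueStored(String key, String value) {
        this.key = key;
        this.value = value;
    }

    String getKey() { return key; }
    String getValue() { return value; }
}

// Hypothetical adapter backed by in-process maps, intended for tests only.
// The two maps are not updated atomically together, which is acceptable here.
final class InMemoryKeyValueStorage implements KeyValueStorageService {
    private final Map<String, ValueStored> values = new ConcurrentHashMap<>();
    private final Map<String, Date> lastUpdates = new ConcurrentHashMap<>();

    @Override
    public boolean put(ValueStored data) {
        if (data == null || data.getKey() == null) {
            return false; // nothing stored
        }
        values.put(data.getKey(), data);
        lastUpdates.put(data.getKey(), new Date());
        return true;
    }

    @Override
    public ValueStored get(String key) {
        return key == null ? null : values.get(key);
    }

    @Override
    public void delete(String key) {
        if (key != null) {
            values.remove(key);
            lastUpdates.remove(key);
        }
    }

    @Override
    public Date findLastUpdateById(String key) {
        return key == null ? null : lastUpdates.get(key);
    }
}
```

Domain code depends only on `KeyValueStorageService`, so swapping this adapter for a database-backed one requires no change on the domain side.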
34. Platform framework contribution model
● initially perceived as a foreign body
○ .. but it really solved problems
● scale
○ on common efforts
○ sharing solutions across organisation
○ in a more structured way
● sandbox model before platform framework
37. Give me the numbers!
● app migration in 1-2 weeks, with 1-2 people
● 40 micro-services migrated in 6 months
● 5K req/sec generate a flow of 1.5M metrics/minute
● whole pipeline runs in 16 minutes