10. Why microservices
• organized around business domains
• decentralized development process
• decentralized data management
• automate everything
• design for failure
11. There was once a little app…
[Diagram labels: WEB; Customer, Basket, Product, Order]
14. Scaling models
• Scaling large problem domain using one Unified Model
• => very hard!
• Scaling many smaller models
• => not as hard!
• Scaling organizations, not code
15. Bounded Context
The system scales better
[Diagram labels: WEB, SALES, AUTH; Bounded Context; Team A, Team B, Team C]
19. What happened?
• Decompose model into components
• Expose components as service
• Components reference each other
• Where are these references in our model?
20. What happened?
• Design mistakes are easy
• Accidental complexity
• Business processes > 1 service
• Failure scenarios
21. The complexity increases
[Diagram labels: A.3, A.4, C.1, C.2, C.4; 100% cpu; 503 Unavailable; retry!; crash]
• Service overloads
• Upstream services retry
• Queue fills up
• “Thundering herd” problem
26. Data-centric approach
• Data being communicated is important
• Rather than REST API with interface…
• Use event data with a schema
• Asynchronous
• Loosely coupled
27. Event sourcing / CQRS
• Append-only event log
• Record things that happen
• Not what they are
• Create aggregates based on this event log
• a.k.a. Materialised Views
• a.k.a. “Aggregate Roots”
• Write and read models differ
• CQRS, “Command & Query Responsibility Segregation”
• Point-In-Time / Projections
(I believe) data = language of the system -> more common
-> data engineering to become more and more important;
done right, a data-driven system architecture -> empower other people deep into the data, no bottleneck
This talk -> how these things fit. Talk a lot about architecture -> important to lay the foundation.
Better understanding => better decisions.
End: “good stuff”: event sourcing & cqrs, -> implement yourself!
CTO, lead engineer, systems engineer?
I love data. Data as the driving force behind system architecture.
Let’s start with a question. What is a microservice?
[questions]
It’s like this whole system of cogs and wheels, that runs together smoothly.
Microservices are a modularization approach.
Applying microservices means to compose an application out of independent services running in separate processes.
Therefore microservices can be independently deployed.
Within a service you can use any technology and infrastructure.
But there’s a myth that microservices are there to achieve scalability.
But I can scale a monolith just fine.
Actually, a monolith is much easier to manage from an operational point of view.
So clearly there is more to them. Everybody is doing them.
(next slide) What do some real world microservice infrastructures look like?
Coined: Death star architecture!
And for the sake of comparison, here we have an actual death star. Notice the resemblance.
So, clearly these things can get out of hand.
But it’s important for us to understand why people build microservices. We saw that we can scale monolith applications just fine… wouldn’t a single application then just be easier?
What problems are they actually solving? -> I looked it up!
- Organized around business domains
- Decentralized development process
- Decentralized data management
- Automate everything
- Design for failure
Disaster! Let’s look at how this can go wrong.
Team issues upgrade -> Migration ->
Schema changes!
As you try to scale a model to a larger domain, it becomes difficult to describe everything in a single model.
Rather than scaling a single model, scale many smaller models.
Explicit boundaries make it easier to scale organizations; the code functions as a “contract” between different teams.
So, just to show what that looks like: here each service wraps its own database; they no longer share one. Each service can run its migrations itself, without risking any conflict.
Such a wrapped service, we call a “bounded context”. (Domain Driven Design)
Once again, to reiterate, when services can do migrations independently, what this actually means is that different teams can operate independently.
Real world example
How do we let these services talk with each other?
Everybody already uses REST + HTTP
Service boundaries act as a “contract” that teams can have between each other
We can test contracts: http integration testing.
Jeff Bezos was right!
But… remember those death star architectures?
Let’s see how things can go wrong.
Our process is as follows: we decompose our model into different components. For example, our overall system can have billing information, but we don’t care about that in our core backend. So billing becomes a standalone component.
We expose these components as services, microservices. Our billing service might have its own database that keeps track of the balance and maps it to an account id.
These components have to reference each other! The billing API needs the core API to figure out what bill to send.
But where are these references in our model? In SQL you have foreign keys… but this information is lost! It is wrapped in logic.
It’s very easy to make design mistakes early that can haunt you later.
These things cause a ton of accidental complexity. What starts out as something simple and elegant, suddenly becomes a complete mess of intertwined logic.
How do you write business processes that span more than 1 service? There are no distributed transactions. How do you cope with ACID requirements?
Failure scenarios – numerous ways that things can fail.
Service overloads
Services correctly propagate a “service unavailable” back to their callers
Upstream services retry
Queue fills up
Once the service recovers, all processes are racing to get access to that single resource -> “thundering herd” problem. -> Lots of real-life scenarios, AWS S3 down, etc.
-> Can cause counterintuitive behavior
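One common way to soften the retry storm described above is to make upstream callers back off exponentially and add random jitter, so that they do not all retry at the same moment once the service recovers. The sketch below is a minimal illustration, not something from the talk: `ServiceUnavailable`, `backoff_delays`, and `call_with_retry` are hypothetical names, and the default delays are arbitrary assumptions.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 response from a downstream service."""


def backoff_delays(max_retries, base=0.1, cap=5.0, rng=random.random):
    """Yield one randomized delay per retry ("full jitter").

    The upper bound doubles on every attempt and is capped at `cap` seconds;
    the actual delay is a random fraction of that bound.
    """
    for attempt in range(max_retries):
        upper = min(cap, base * (2 ** attempt))
        yield rng() * upper


def call_with_retry(call, max_retries=5, sleep=time.sleep, **backoff):
    """Call `call()`; on ServiceUnavailable, wait a jittered delay and retry."""
    for delay in backoff_delays(max_retries, **backoff):
        try:
            return call()
        except ServiceUnavailable:
            sleep(delay)
    return call()  # final attempt; an error here propagates to the caller
```

Because each caller draws its own random delay, the retries spread out over time instead of arriving as one synchronized burst.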
Meanwhile, Silicon Valley was trying to figure out how to handle unprecedented volumes of cat pictures, and was asking: is there a better solution to this? As a matter of fact, yes there is!
Data being communicated is important – “data is the language of the system”. Data is simple, unambiguous, auditable; the rest is logic, and just noise.
Rather than a REST API with an interface…
Event data with schema!
Async
Loosely coupled
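To make "event data with a schema" concrete, here is a minimal sketch in which the producer validates an event against a declared schema before publishing it as JSON. Everything in it is an assumption for illustration: the event type, the field names in `ADDED_TO_BASKET_SCHEMA`, and the `encode_event` helper. A real system would more likely use a schema format such as Avro, Protobuf, or JSON Schema.

```python
import json

# Hypothetical schema: field name -> required Python type.
ADDED_TO_BASKET_SCHEMA = {"user_id": int, "product_id": int, "timestamp": str}


def encode_event(event_type, payload, schema):
    """Validate `payload` against `schema` and serialize it as a JSON message."""
    if set(payload) != set(schema):
        raise ValueError(f"fields {sorted(payload)} do not match {sorted(schema)}")
    for field, expected in schema.items():
        if not isinstance(payload[field], expected):
            raise TypeError(f"{field} must be of type {expected.__name__}")
    # The message carries only data; consumers decide what to do with it.
    return json.dumps({"type": event_type, "data": payload}, sort_keys=True)
```

The producer does not know who consumes the message, which is what keeps the services loosely coupled: the schema is the only thing they share.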
Append-only event log – event sourcing ensures that all changes to application state are stored as a sequence of events.
Record things that happen, not what they are – record only facts! Example: when creating a new user, do not record that “command” (“create user”), but record the fact (“user with id 1234 created”).
-> “User with id X clicked add to basket for product Y at timestamp Z”
-> Facts always remain true
Create aggregates based on this event log – based on this permanent record of all facts that ever happened, you can rebuild any state you want (e.g. “users” table).
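The idea of rebuilding state from the log can be sketched in a few lines. This is an in-memory illustration under assumed names: `EventLog`, `build_users`, and the `user_created` / `user_renamed` event types are hypothetical, and a real log would be durable storage.

```python
class EventLog:
    """In-memory append-only log; a real system would use durable storage."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # The only write operation: events are never updated or deleted.
        self._events.append(event)

    def __iter__(self):
        # Iterate over a snapshot so readers cannot mutate the log.
        return iter(tuple(self._events))


def build_users(events):
    """Fold the event log into a "users" table keyed by user id."""
    users = {}
    for event in events:
        if event["type"] == "user_created":
            users[event["user_id"]] = {"name": event["name"]}
        elif event["type"] == "user_renamed":
            users[event["user_id"]]["name"] = event["name"]
    return users
```

The "users" table is disposable here: it can be dropped and rebuilt at any time by replaying the log from the start.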
Write and read models differ
The key insight from CQRS is that the model you use to read data does not have to be the same as the model you use to write it.
-> Write facts
-> Read shopping cart / users
-> Optimize Read models for how they are queried, rather than how they are stored
Projections – not only query the data, but also reconstruct state at a point in time (PIT), cope automatically with retroactive changes, and get an audit log for free
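A point-in-time read model might look like the sketch below: the write side stores only facts, and the read side projects a shopping cart as it was at a given moment. The event types, field names, integer timestamps, and the `basket_at` function are assumptions made for this example.

```python
from collections import Counter


def basket_at(events, user_id, as_of):
    """Project one user's basket from the facts recorded up to `as_of`."""
    basket = Counter()
    for event in events:
        # Skip other users' events and anything after the requested time.
        if event["user_id"] != user_id or event["timestamp"] > as_of:
            continue
        if event["type"] == "added_to_basket":
            basket[event["product_id"]] += 1
        elif event["type"] == "removed_from_basket":
            basket[event["product_id"]] -= 1
    # Only products with a positive quantity are in the basket.
    return {product: count for product, count in basket.items() if count > 0}
```

The same log answers both "what is in the basket now?" and "what was in it last Tuesday?", simply by changing `as_of`.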
Create User – command coming in from web browser
Validation – does the user already exist? Check against the existing user aggregate store.
Validated – we can create it: emit a “user created” event; this is now a fact.
Merge – the “User” aggregate is constantly streaming from the event log, and merges the new user information into its aggregate.
Many more aggregates, all decoupled.
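The four steps above (command, validation, event, merge) can be sketched as follows. This is a single-process illustration with hypothetical names (`UserAggregate`, `handle_create_user`) and a plain list standing in for the event log; it ignores concurrency between validation and append, which a real implementation would have to handle.

```python
class UserAggregate:
    """Read-side view of users, kept up to date by consuming the log."""

    def __init__(self):
        self.users = {}
        self.offset = 0  # position of the next unread event in the log

    def catch_up(self, log):
        """Merge every event that has not been seen yet into the aggregate."""
        for event in log[self.offset:]:
            if event["type"] == "user_created":
                self.users[event["user_id"]] = {"name": event["name"]}
        self.offset = len(log)


def handle_create_user(command, log, aggregate):
    """Validate a "create user" command; on success append the fact."""
    aggregate.catch_up(log)
    if command["user_id"] in aggregate.users:
        return False  # rejected: nothing is written to the log
    log.append({
        "type": "user_created",
        "user_id": command["user_id"],
        "name": command["name"],
    })
    return True
```

The command handler only appends to the log; the aggregate updates itself the next time it catches up. Other aggregates can consume the same log with their own offsets, without knowing about each other.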