This is a summary of what 15 years of implementing distributed systems have taught me. Many of the mistakes that I see people make with microservices are essentially a repetition of the mistakes that were made more than 10 years ago with layered SOA. We will look at the challenges experienced with poorly designed service boundaries and bad UI integration, and at what a solution to these challenges may look like in the form of logical services aligned with business capabilities, autonomous components and composite UIs.
8. @jeppec
https://our.cool.bank-portal
Customer 🧩
Name: Peter Hansen
Address: Some Road 1
Some Zip Some City
…
Customer relations 🧩
Spouse: Mette Hansen
Children: Jacob Hansen
Ditte Hansen
Accounts 🧩
Account Number Account Name Balance
9128-21892982981 Salary Account 214,59
9128-89812981189 Budget Account -345,75
9128-12217698156 Savings Account 17.230,15
Account details: 9128-21892982981 🧩
Account Name:
Interest rate: 3% 🔧
Salary Account
9. @jeppec
Learnings
The good
• Smaller dedicated UI pieces allow many developers to work independently
on the same portal page
• Small views only used by a single portlet were easy to change as they have fewer
couplings
• Reactive UI was cool. Click a part in a portlet and through a UI event one or
more portlets would react and change their content. Decoupling
The bad
• Portlet technology was very heavy and very inflexible
10. @jeppec
SOA For The Win
Build tiny services that could be combined to larger services
16. @jeppec
What’s wrong with distributed transactions?
• Transactions lock resources while active
• Services are autonomous
• Can’t be expected to finish within a certain time interval
• Locking keeps other transactions from completing their job
• Locking doesn’t scale
• X Phase Commit is fragile by design
17. @jeppec
Essential complexity of 2 way integration
Component C   Component B   Component A
UI
Service Service
B:Service()
call C:Service()
call A:Service()
commit()
Service
Local transaction between
Component A, B and C
19. @jeppec
Synchronous calls lower our tolerance for faults
• When you get an IO error
• When servers crash or restart
• When databases are down
• When deadlocks occur in our databases
• Do you retry?
With synchronous style Service interaction we can lose business data if there’s no automatic retry
Or we risk creating data more than once if the operation isn’t idempotent*
Client Server
Duplicated Response
Duplicated Request
Processing
Response
Request Processing
The same message can be
processed more than once
*Idempotence describes the quality of an operation
in which result and state do not change if the
operation is performed more than once
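The duplicated-request scenario above can be made safe with an idempotent receiver. The following is a minimal sketch, assuming every request carries a unique message id; `IdempotentAccountService` and its methods are hypothetical names chosen for illustration and do not come from the original system.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical service used only for illustration.
class IdempotentAccountService {
    // Ids of the messages that have already been applied
    private final Set<String> processedMessageIds = new HashSet<>();
    private long balanceInCents = 0;

    // Returns true if the deposit was applied, false if the message was a duplicate.
    boolean deposit(String messageId, long amountInCents) {
        // Set.add returns false when the id is already present
        if (!processedMessageIds.add(messageId)) {
            return false;
        }
        balanceInCents += amountInCents;
        return true;
    }

    long balanceInCents() {
        return balanceInCents;
    }
}
```

In a real service the processed ids and the state change would have to be persisted in the same local transaction; otherwise a crash between the two steps reintroduces the duplicate.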
20. @jeppec
B:Service()
call C:Service()
call A:Service()
if (A:Call-Failed:Too-Busy?)
Wait-A-While()
call A:Service()
if (A:Call-Failed:Too-Busy?)
Wait-A-Little-While-Longer()
call A:Service()
if (A:Call-Failed:IO-Error?)
Save-We-Need-To-Check-If-Call-A-Succeeded-After-All
AND We-Need-To-Retry call C:Service and call B:Service
AND Tell-Customer-That-This-Operation-Perhaps-Went-Well
if (A:Call-Went-Well?)
commit()
Accidental complexity from distributed service integration
Component C   Component B   System A
UI
Service Service Service
Local transaction between
Component B and C
22. @jeppec
Learnings
The bad
• Task/Activity services need to perform UPDATES across multiple data/entity
services. Requires distributed transactions
• The more synchronous request-response remote calls you have to make the
more it hurts Performance.
• Robustness is lower. If one data/entity service is down it can take down
many other services.
• Coupling is higher. Multiple task/activity services manage the same CRUD
data/entity services
• Cohesion is likely lower as multiple task/activity services need to replicate
entity related logic
27. @jeppec
Learnings
Good
• Very early in the project a full end to end correlation logging framework and
application was built, so you knew exactly which services were being called
by whom, where the time was spent
Bad
• When it takes > 10 min to perform the primary use case you have a serious
performance overhead
• When you have circular Service calls you have a big boundary issue
• When a single service changes its interface and it takes days to adjust other
services before everything works again, then you’re far from decoupled
28. @jeppec
ESBs to the Rescue?
SOA Web of Mess hidden by ESB
Inventory
Purchase
Shipping
Portal
ESB
30. @jeppec
ESB Learnings
• Very often the ESB’s only role is to decouple teams
• You’re effectively introducing a middle management layer with a thin promise
of increased security, monitoring and decoupling
• Often most services exposed are 1-1 with the original interface. Any
decoupling is imaginary and the ESB is a bottleneck.
• Or even worse they follow the layered SOA approach
• Monitoring and security are end-to-end concerns
• Putting some magic technology in the middle doesn’t solve the real problem
36. @jeppec
Learnings
The good:
• Each section on the page is componentized.
• The page knows NOTHING about how information is fetched
• Very low horizontal coupling
• Each component is responsible for ensuring availability and stability.
You build it, you run it, you fix it
• Individual component/service failure doesn’t bring down the entire
page
The bad:
• The microservices tended to be overly small and overlapped ➡️ code
duplication and shared databases
41. @jeppec
One way Master Data duplication
System A Master
Query
Query result
1. Read Ids of all Entities that match criteria
2. For each Id load the Entity
3. Result: 1+n problem
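The three steps above can be sketched as follows. This is a simplified illustration in which a counter stands in for network round trips; `CustomerMaster`, `CustomerQueries` and the `loadAll` batch operation are hypothetical and only serve to contrast 1+n calls with a batched alternative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical remote master system that counts how often it is called.
class CustomerMaster {
    private final Map<Integer, String> customers = new LinkedHashMap<>();
    int roundTrips = 0;

    CustomerMaster() {
        customers.put(1, "Peter Hansen");
        customers.put(2, "Mette Hansen");
        customers.put(3, "Jacob Hansen");
    }

    // Step 1: read the ids of all matching entities (here: all of them)
    List<Integer> findIds() {
        roundTrips++;
        return new ArrayList<>(customers.keySet());
    }

    // Step 2: load a single entity by id
    String load(int id) {
        roundTrips++;
        return customers.get(id);
    }

    // Batched alternative: load many entities in a single call
    List<String> loadAll(List<Integer> ids) {
        roundTrips++;
        List<String> result = new ArrayList<>();
        for (int id : ids) {
            result.add(customers.get(id));
        }
        return result;
    }
}

class CustomerQueries {
    // 1 call for the ids plus n calls for the entities
    static List<String> onePlusN(CustomerMaster master) {
        List<String> names = new ArrayList<>();
        for (int id : master.findIds()) {
            names.add(master.load(id));
        }
        return names;
    }

    // 2 calls in total, independent of n
    static List<String> batched(CustomerMaster master) {
        return master.loadAll(master.findIds());
    }
}
```

With three customers the first variant makes four round trips and the second makes two; the gap grows linearly with the number of matching entities.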
44. @jeppec
System A
Update Customer
Contact Info
Update
Customer
ContactInfo
Process
”UpdateCustomerContactInfo”
command
Event Publisher
”CustomerContactInfoUpdated” Event
Customer View
Customer
local/cached replica
System D
BI/And other parties
System B
Customer Master
45. @jeppec
Back-end Data
System A System D
OSB
Private Customer
Service Proxy
ESB Event
Receiver
CustomerCore
Private Customer
Service
Event
Queue
Event Topic
Command
Internal Event
External Event
External
PrivateCustomer
EventReceiver
Internal
PrivateCustomer
EventReceiver
Internal Services
System A System D
Event Topic
With SSN
Without SSN ?
Contains:
Social Security Number (SSN), Who’s
allowed to see SSN and who the Event
is relevant for
External Services
Security control
46. @jeppec
Learnings
The Good:
• We were able to allow all the external source systems to continue with their existing domain
models
• We gained a centralized overview of all customers and their memberships
• The external systems rarely needed to contact the Customer Service
• Breakdowns in communication paths or systems didn’t bring down all other systems
• Use of Optimistic Concurrency combined with event sourcing made for easy coordination of (rare)
out of sync updates
• CQRS proved to be a good fit for an event sourced solution
• Resilience and stability
The Bad:
• Use of ESB again proved to be a big time waster
• Full resilience, e.g. during Queue/Topic data loss or coding errors in external systems, required
additional Reconciliation service operations (to fetch missing events)
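Optimistic concurrency on an event stream, as mentioned in the list above, can be illustrated with an append that only succeeds when the stream is still at the version the writer last read. This is a minimal in-memory sketch; `InMemoryEventStore` and `ConcurrencyConflict` are hypothetical names and not the event store used in the project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Thrown when another writer has appended to the stream in the meantime.
class ConcurrencyConflict extends RuntimeException {
    ConcurrencyConflict(String message) {
        super(message);
    }
}

// Hypothetical in-memory event store; a stream's version is its number of events.
class InMemoryEventStore {
    private final Map<String, List<String>> streams = new HashMap<>();

    synchronized List<String> load(String aggregateId) {
        List<String> stream = streams.get(aggregateId);
        return stream == null ? new ArrayList<>() : new ArrayList<>(stream);
    }

    // Appends only if the stream is still at the version the writer based its decision on.
    synchronized void append(String aggregateId, int expectedVersion, String event) {
        List<String> stream = streams.computeIfAbsent(aggregateId, key -> new ArrayList<>());
        if (stream.size() != expectedVersion) {
            throw new ConcurrencyConflict(
                "Expected version " + expectedVersion + " but stream is at " + stream.size());
        }
        stream.add(event);
    }
}
```

A writer that loses the race reloads the stream, re-evaluates its command against the newer events and tries again, which is why rare out-of-sync updates are easy to coordinate.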
48. @jeppec
"I consider 'getting the boundaries right'
the single design decision with the most
significant impact over the entire life of a
software project."
@ziobrando
49. @jeppec
Monoliths can be well designed
Unfortunately many monoliths experience high degrees of coupling, resulting in them
earning a bad name
53. Business Capability and
Bounded Contexts are
in many ways similar
• What we want to achieve is
• problem domain and
• solution domain
• ALIGNMENT
54. @jeppec
Business Capability alignment
“The advantage of business capabilities is their remarkable level of
stability. If we take a typical insurance organisation, it will likely
have sales, marketing, policy administration, claims management,
risk assessment, billing, payments, customer service, human
resource management, rate management, document
management, channel management, commissions management,
compliance, IT support and human task management capabilities.
In fact, any insurance organisation will very likely have many of
these capabilities.”
See http://bill-poole.blogspot.dk/2008/07/business-capabilities.html
55. @jeppec
Business – IT alignment
• We want the Business and IT to speak the same Ubiquitous language
• We want our architecture to be aligned with the business capabilities
• Because these capabilities are stable
56. @jeppec
Many perspectives on data
Online Retail System
Product
Unit Price
Promotional Price
Promotion End Date
Stock Keeping Unit (SKU)
Quantity On Hand (QOH)
Location Code
Price
Quantity Ordered
Name
The lifecycle of the data is VERY important!
Customer
Pricing
Inventory
Sales
Management Reporting
57. @jeppec
Smaller models & clear data ownership
Retail System
Pricing
Product
ProductID
Unit Price
Promotional Price
…
Pricing
Inventory
Product
ProductID
SKU
QOH
Location Code
…
Inventory
Sales
Product
ProductID
Name
Description
Quantity Ordered
…
Sales
Shared Entity identity
DDD:
Bounded
Context
Business
Capability
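The split above can be sketched in code as three small Product models, one per bounded context, that share nothing but the ProductID. The field names follow the slide; the class names, field types and constructors are illustrative assumptions.

```java
import java.math.BigDecimal;

// Pricing's view of a product
final class PricingProduct {
    final String productId;
    final BigDecimal unitPrice;
    final BigDecimal promotionalPrice;

    PricingProduct(String productId, BigDecimal unitPrice, BigDecimal promotionalPrice) {
        this.productId = productId;
        this.unitPrice = unitPrice;
        this.promotionalPrice = promotionalPrice;
    }
}

// Inventory's view of the same product
final class InventoryProduct {
    final String productId;
    final String sku;
    final int quantityOnHand;
    final String locationCode;

    InventoryProduct(String productId, String sku, int quantityOnHand, String locationCode) {
        this.productId = productId;
        this.sku = sku;
        this.quantityOnHand = quantityOnHand;
        this.locationCode = locationCode;
    }
}

// Sales' view of the same product
final class SalesProduct {
    final String productId;
    final String name;
    final String description;
    final int quantityOrdered;

    SalesProduct(String productId, String name, String description, int quantityOrdered) {
        this.productId = productId;
        this.name = name;
        this.description = description;
        this.quantityOrdered = quantityOrdered;
    }
}
```

Each model can change on its own schedule because no context reads or writes another context's fields; the shared identity is the only contract between them.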
58. @jeppec
Bounded Contexts and Aggregates
Sales
Product
Customer
customerId
…
Order
orderId
customerId
…
OrderLine
orderId
productId
quantity
timestamp
…
ProductCategory
productCategoryId
…
Pricing
Product
productId
productCategoryId
name
tag
...
Product-Price
productId
normalPrice
discountPeriods
…
59. @jeppec
Using Business Events to drive Business Processes
Sales Service
Shipping
Billing
Sales
Customers
MessageChannel
Online Ordering System
Web Shop
(Composite UI)
Billing Service
Shipping Service
<<External>>
Order Accepted
AcceptOrder
The sales
fulfillment
processing can
now begin…
Cmd Handler
Order Accepted
Apply
60. @jeppec
Choreographed Event Driven Processes
Sales Service
Order
Accepted
Billing Service
Order Fulfilment
(Saga/
Process-Manager)
Shipping Service
Online Ordering System
MessageChannel (e.g. a Topic)
Order
Accepted
Order
Accepted
Customer
Billed
Customer
Billed
Order
Approved
Order
Approved
Works as a Finite
State Machine
(WorkFlow)
handling the life
cycle of Shipping and
thereby forms a very
central new
Aggregate in the
System
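A process manager that works as a finite state machine can be sketched as below. The event names follow the slide; the exact transitions are an assumption made for illustration, namely that the order is approved once both Order Accepted and Customer Billed have been seen, in either order. `OrderFulfilmentProcess` is a hypothetical class, not the original implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical process manager handling the life cycle of one order.
class OrderFulfilmentProcess {
    enum State { STARTED, ORDER_ACCEPTED, CUSTOMER_BILLED, ORDER_APPROVED }

    private State state = State.STARTED;
    // Events this process has published (stands in for a message channel)
    private final List<String> published = new ArrayList<>();

    void on(String event) {
        switch (state) {
            case STARTED:
                if (event.equals("OrderAccepted")) {
                    state = State.ORDER_ACCEPTED;
                } else if (event.equals("CustomerBilled")) {
                    state = State.CUSTOMER_BILLED;
                }
                break;
            case ORDER_ACCEPTED:
                if (event.equals("CustomerBilled")) {
                    approve();
                }
                break;
            case CUSTOMER_BILLED:
                if (event.equals("OrderAccepted")) {
                    approve();
                }
                break;
            case ORDER_APPROVED:
                // Terminal state: redelivered events are ignored
                break;
        }
    }

    private void approve() {
        state = State.ORDER_APPROVED;
        published.add("OrderApproved");
    }

    State state() {
        return state;
    }

    List<String> published() {
        return published;
    }
}
```

Because every transition depends on the current state, a redelivered event either has no effect or is ignored, so the process publishes Order Approved at most once.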
61. @jeppec
If we align the problem domain with the solution domain
Bounded Context 1   Bounded Context 2   Bounded Context 3
UI
BL
DAO
UI
BL
DAO
UI
BL
DAO
Vertical coupling
is
unavoidable
We want to avoid
horizontal coupling
64. @jeppec
A Service is
• The technical authority for a given bounded context/business-capability
• It is the owner of all the data and business rules that support this
bounded context – everywhere
• It forms a single source of truth for that bounded context
http://udidahan.com/2010/11/15/the-known-unknowns-of-soa/
69. @jeppec
Service and deployment
• A Service represents a logical responsibility boundary
• Logical responsibility and physical deployment of a Service
DOES NOT have to be 1-to-1
• It’s too constraining
• We need more degrees of freedom
• Philippe Kruchten’s 4+1 views of architecture: Logical and Physical
designs should be independent of each other
A service needs to be deployed everywhere its data is needed
72. @jeppec
Service Microservices
1..*
Is implemented by
Service vs Microservices
Microservices are a division of Services along Transactional boundaries (a transaction stays within the
boundary of a Microservice)
Microservices are the individual deployable units of a Service with their own Endpoints. Could e.g. be the
split between Read and Write models (CQRS) - each would be their own Microservices
74. @jeppec
Services are the cornerstone
• We talk in terms of Services capabilities and the processes/use-cases
they support
• Microservices are an implementation detail
• They are much less stable (which is a good thing – it means they’re easier to
replace)
75. @jeppec
There’s cost in deploying 1000’s of microservices
(or 100.000’s of serverless functions)
77. @jeppec
This means they CAN, but they don’t HAVE to be deployed
individually.
Design for Distribution
But take advantage of locality
78. @jeppec
Let’s be even more pragmatic
In which case we allow other services
to call them using local calls
79. @jeppec
AC in code
public class PricingEngineAc extends AutonomousComponent {
    public static AutonomousComponentId SERVICE_AC_ID = SALES_SERVICE_ID.ac("pricing_engine_ac");
    …
    public PricingEngineAc(CurrencyConverter currencyConverter) {
        this.currencyConverter = currencyConverter;
    }

    @Override
    public void onInitialize(IConfigureACEnvironment acSetup) {
        acSetup.withAutonomousComponentId(SERVICE_AC_ID)
               .usingServiceDataSource()
               .withBusConfiguration(cfg -> {
                   // Publish this AC's internal pricing events on a replayable topic
                   bus.registerAxonReplayableTopicPublisher(InternalPricingEvents.TOPIC_NAME,
                                                            replayFromAggregate(Pricing.class)
                                                                .dispatchAggregateEventsOfType(InternalPricingEvents.class));
                   // Subscribe to the external inventory events topic
                   bus.subscribeTopic(SERVICE_AC_ID.topicSubscriber("InventoryEvents"),
                                      ExternalInventoryEvents.TOPIC_NAME,
                                      new InventoryTopicSubscription(bus));
               })
               .runOnBusStartup((bus, axonContext) -> {
               });
    }
}
80. @jeppec
Service deployment
• Many services can be deployed to the same physical server
• Many services can be deployed in the same application
• Application boundary is a Process boundary which is a physical boundary
• A Service is a logical boundary
• Service deployment is not restricted to tiers either
• Part of service A and B can be deployed to the Web tier
• Another part of Service A and B can be deployed to the backend/app-service tier of the same
application
• The same service can be deployed to multiple tiers / multiple applications
• i.e. applications and services are not the same and do not share the same boundaries
• Multiple services can be “deployed” to the same UI page (service mashup)
81. @jeppec
An Application is the plate where Components are
co-deployed
Sales service components
Inventory service components
…
83. @jeppec
iOS Home banking
Customer information
Legal and contract information
Accounts
Credit card
Mortgage loans
Web banking portal
Bank Back-office application
84. @jeppec
Service A Service B Service C
IT-OPS
Web shop
Warehouse
Back office
App logic
(Layered, CQRS,…)
Warehouse
Storage
UI Components
A Service owns its UI Components
88. @jeppec
A Service represents a logical boundary
Service
Autonomous
Component
Autonomous
Component
Adapter
UI component
89. @jeppec
Autonomous Component
• Can be deployed alone or co-located, together with one or more adapters from the same service
• Works transparently in a clustered environment
Core logic
(use case controllers/aggregates)
Primary/Driving Adapter
GUI / API / …
Primary/Driving Port
Secondary/Driven Adapter
Database/Notifications/…
Secondary/Driven Port
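The ports and adapters layout listed above can be sketched as follows: the core logic depends only on port interfaces, and adapters are plugged in from the outside. All type names here (`ProductDetails`, `ProductRepository`, `ProductDetailsUseCase`, `InMemoryProductRepository`, `ProductDetailsTextApi`) are hypothetical and chosen only to illustrate the roles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Primary/driving port: what the outside world can ask of the core logic
interface ProductDetails {
    String describe(String productId);
}

// Secondary/driven port: what the core logic needs from the outside world
interface ProductRepository {
    Optional<String> findName(String productId);
}

// Core logic: knows the ports, never the adapters
class ProductDetailsUseCase implements ProductDetails {
    private final ProductRepository repository;

    ProductDetailsUseCase(ProductRepository repository) {
        this.repository = repository;
    }

    @Override
    public String describe(String productId) {
        return repository.findName(productId).orElse("Unknown product");
    }
}

// Secondary/driven adapter: an in-memory stand-in for a database
class InMemoryProductRepository implements ProductRepository {
    private final Map<String, String> names = new HashMap<>();

    void save(String productId, String name) {
        names.put(productId, name);
    }

    @Override
    public Optional<String> findName(String productId) {
        return Optional.ofNullable(names.get(productId));
    }
}

// Primary/driving adapter: a plain-text API in front of the driving port
class ProductDetailsTextApi {
    private final ProductDetails productDetails;

    ProductDetailsTextApi(ProductDetails productDetails) {
        this.productDetails = productDetails;
    }

    String handleGet(String productId) {
        return "product=" + productDetails.describe(productId);
    }
}
```

Since the core only sees the two interfaces, the same component can be co-located with different adapters per deployment without touching the use case code.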
90. @jeppec
Autonomous Components can be co-deployed together
with Application backends
web_shop (Spring Boot fat-jar)
pricing_engine_ac
discounts_ac
webshop_basket_adapters
api_products
product_details_ac
product_details_adapters
frontend
api_pricing
app
libs
products
pricing
branding
91. @jeppec
Application in code
@Configuration
@ComponentScan(basePackages = { "com.mycoolshop.webshop",
                                "com.mycoolshop.adapters",
                                "com.mycoolshop.itops.spring" })
public class Application extends IAmASpringBootApplication {
    @Override
    protected ApplicationIdentifier getApplicationIdentifier() {
        return ApplicationIdentifier.from("WebShop");
    }

    @Override
    protected Collection<AutonomousComponent> getAutonomousComponentsHostedInThisApplication() {
        // The currency AC is created first because other ACs share its converter
        CurrencyExchangeRateAc currencyExchangeRateAc = new CurrencyExchangeRateAc();
        return list(
            new PricingEngineAc(currencyExchangeRateAc.getCurrencyConverter()),
            new DiscountsAc(currencyExchangeRateAc.getCurrencyConverter()),
            new InventoryAc(),
            new CustomersAc(),
            currencyExchangeRateAc
        );
    }

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
93. @jeppec
Autonomous Component
[Diagram: four instances each of AC - 1, AC - 2, AC - 3 and AC - 4, connected by a Federated Bus]
95. @jeppec
Topics
Bus features
AC - 4
AC - 1
AC - 3
bus.registerAxonReplayableTopicPublisher(InternalPricingEvents.TOPIC_NAME,
                                         replayFromAggregate(Pricing.class)
                                             .dispatchAggregateEventsOfType(InternalPricingEvents.class));
bus.subscribeTopic(SERVICE_AC_ID.topicSubscriber("Pricing"),
                   InternalPricingEvents.TOPIC_NAME,
                   new PricingTopicSubscription(bus));
98. @jeppec
Client handled subscriptions
• Highly resilient pattern for an Event Driven Architecture that’s backed by
Event-Sourced AC’s
• In this model the publisher of the Events is responsible for the durability of
all its Events, typically to an EventStore/EventLog.
• Each client (subscriber) maintains durable information of the last event it has
received from each publisher.
• Whenever the client starts up, it makes a subscription to the publisher
where it states from which point in time it wants events published/streamed
to it.
• This effectively means that publisher can remain simple and the client
(subscriber) can remain simple and we don’t need additional sophisticated
broker infrastructure such as Kafka+ZooKeeper.
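The pattern described above can be sketched with a publisher that owns an append-only event log and a subscriber that remembers its own resume point. This is a minimal in-memory illustration: lists and fields stand in for the EventStore and the subscriber's local storage, the resume point is modelled as a position in the log rather than a point in time, and `EventLogPublisher` and `TrackingSubscriber` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// The publisher is responsible for the durability of all its events.
class EventLogPublisher {
    private final List<String> eventLog = new ArrayList<>();

    void publish(String event) {
        eventLog.add(event);
    }

    // Returns every event from the given position in the log onwards
    List<String> eventsFrom(int position) {
        return new ArrayList<>(eventLog.subList(position, eventLog.size()));
    }
}

// The subscriber tracks the last event it has received from the publisher.
class TrackingSubscriber {
    private int nextPosition = 0; // would be durable in a real subscriber
    private final List<String> handled = new ArrayList<>();

    // Called on startup and whenever the subscriber wants to catch up
    void catchUp(EventLogPublisher publisher) {
        for (String event : publisher.eventsFrom(nextPosition)) {
            handled.add(event);
            nextPosition++;
        }
    }

    List<String> handled() {
        return handled;
    }
}
```

The publisher keeps no per-subscriber state at all, which is what lets both sides stay simple.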
99. @jeppec
Client handled subscriptions
Publisher
Subscriber A
Local storage
EventStore
Subscriber B
Local storage
Topic
Subscription
Topic
Subscription
TopicSubscriptionHandler
TopicSubscriptionHandler
Event Event
Event Event
EventBus
Event
Event
Distributed Event Bus,
which ensures that
live events published
on an AC node in the
cluster can be seen
by all AC’s of the
same type
Single Instance
Subscriber, which
ensures that only
one instance of
Subscriber B has
an active
subscription(s).
Other instances of
the same
subscriber are
hot-standby
<<Topic Subscriber>>
Customer_Service:Customer_Agreements_Ac:OrderEvents
<<Topic Publisher>>
Sales_Service:OrderEvents
100. @jeppec
Topics
Bus features
Features:
• The Bus provides automatic and durable handling of Redelivery in case of message handling failure
through a Redelivery Policy
• Exponential Backoff
• Max retries
• Dead letter/Error Queue
• Support for resubscription at any point in the timeline of an Event Stream
• Automatic tracking of resubscription points - aka. resubscribe at last tracked point
AC - 4
AC - 1
AC - 3
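A Redelivery Policy with exponential backoff, a maximum number of retries and a dead letter queue could look roughly like the sketch below. It is not the API of the bus described here: `RedeliveryPolicy` and `RedeliveringConsumer` are hypothetical, and the delays are only recorded instead of actually waited for so that the behaviour is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical policy: delay = initialDelay * multiplier^(retry - 1)
class RedeliveryPolicy {
    final long initialDelayMillis;
    final double multiplier;
    final int maxRetries;

    RedeliveryPolicy(long initialDelayMillis, double multiplier, int maxRetries) {
        this.initialDelayMillis = initialDelayMillis;
        this.multiplier = multiplier;
        this.maxRetries = maxRetries;
    }

    // Delay before the given retry, where the first retry has number 1
    long delayBeforeRetry(int retry) {
        return (long) (initialDelayMillis * Math.pow(multiplier, retry - 1));
    }
}

class RedeliveringConsumer {
    final List<String> deadLetterQueue = new ArrayList<>();
    final List<Long> scheduledDelays = new ArrayList<>();

    void deliver(String message, Consumer<String> handler, RedeliveryPolicy policy) {
        // One initial attempt plus at most maxRetries redeliveries
        for (int attempt = 0; attempt <= policy.maxRetries; attempt++) {
            if (attempt > 0) {
                // A real bus would wait this long; the sketch only records it
                scheduledDelays.add(policy.delayBeforeRetry(attempt));
            }
            try {
                handler.accept(message);
                return; // handled successfully
            } catch (RuntimeException e) {
                // Handling failed: fall through to the next attempt
            }
        }
        // All attempts failed: park the message for manual inspection
        deadLetterQueue.add(message);
    }
}
```

With an initial delay of 100 ms, a multiplier of 2 and three retries, a message whose handler always fails is retried after 100, 200 and 400 ms and then lands on the dead letter queue.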
101. @jeppec
Bus features
Notifications:
• Durable notifications with failover support
• bus.notify(notificationQueue, notificationMessage)
Notifications
Distributed
Broadcast
Broadcast:
• Broadcast a Message to all AC’s in the cluster
• Broadcast a Message to a UI client (all, per user, per privilege)
102. @jeppec
Bus features
Features:
• Support for sending a message to a single consumer
• Default pattern is Competing Consumers
• The bus provides durability for all messages sent on a Queue
• The Bus provides automatic and durable handling of Redelivery in case of message handling failure
through a Redelivery Policy
• Exponential Backoff
• Max retries
• Dead letter/Error Queue
Durable Queues
103. @jeppec
At least once delivery
In a distributed system, message delivery can and will fail!
Therefore, everything that can handle messages is built with
idempotency in mind:
• We always check for the existence of an Aggregate before creating it
• All business methods in an aggregate check if the operation/side effect
has already taken place
• View Repositories support automatic reordering of messages that
come out of order due to redelivery (if this is required by the View)
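The first two rules above can be sketched as a command handler that only creates an aggregate when it does not yet exist, and an aggregate whose business method is a no-op when its side effect has already taken place. `Order` and `OrderCommandHandler` are hypothetical names, and a map stands in for the aggregate repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregate that records the events it has produced.
class Order {
    final String orderId;
    final List<String> events = new ArrayList<>();
    private boolean accepted = false;

    Order(String orderId) {
        this.orderId = orderId;
        events.add("OrderCreated");
    }

    void accept() {
        if (accepted) {
            return; // the side effect has already taken place
        }
        accepted = true;
        events.add("OrderAccepted");
    }
}

class OrderCommandHandler {
    private final Map<String, Order> repository = new HashMap<>();

    // Creates the aggregate only if it does not already exist
    Order handleCreate(String orderId) {
        return repository.computeIfAbsent(orderId, Order::new);
    }

    void handleAccept(String orderId) {
        Order order = repository.get(orderId);
        if (order != null) {
            order.accept();
        }
    }
}
```

Handling the same create or accept message twice leaves exactly one Order with one OrderCreated and one OrderAccepted event, which is what at least once delivery requires of its handlers.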
104. @jeppec
Bus features
Single Instance Task:
• Ensures that only one instance of a Task is active in the cluster at any one time
• Other tasks of the same type are in hot standby
• Used to e.g. group multiple subscribers, to ensure that all subscribers are either all
active or standby.
• Used by our ViewRepositories
Distributed
SingleInstanceTask
bus.createClusterSingleInstanceTask("MyTask",
                                    new MyTask()); // Where MyTask implements Lifecycle
105. @jeppec
Bus features
Process Manager:
• Defines durable business processes
as a flow of Events
Process Manager
Sales Service
Order
Accepted
Billing Service
Order Fulfilment
(Saga/
Process-Manager)
Shipping Service
MessageChannel (e.g. a Topic)
Order
Accepted
Order
Accepted
Customer
Billed
Customer
Billed
Order
Approved
Order
Approved
106. @jeppec
Application features
Workflow Manager:
• Defines the Tasks that require human intervention, such as:
• Approvals (e.g. Contract approval)
• Assistance/Help with a business problem
• Incident handling (e.g. a technical problem identified by a developer)
• Authorization (e.g. request manager approval)
• Reminders
• Common tasks supported: Claiming Tasks, Escalating Tasks, Completing Tasks
Workflow Manager
107. @jeppec
Application features
Identity Management:
• Common handling of Users, Privileges, Roles
• Automatic enforcement of privilege requirements
• Support for Multitenancy (Contractual level, Service Agreement Level)
Identity Management
108. @jeppec
Correlation logging
Our infrastructure automatically captures information about an API call
or a Message delivery in the form of a CallContext:
• Message Id - the Id of the message/call being handled.
• Correlation Id – an Id that binds API calls and message handlings across
multiple services/AC’s together
• When – time of the “call”
• Who – which user performed the “call”
• Meta Data: Which Event or Command caused this “call”
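A CallContext with the fields listed above could be modelled as in the following sketch. The field set follows the slide; the class layout, the `startNew` and `followUp` methods and the use of random UUIDs are assumptions made for illustration and are not the original infrastructure code.

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical immutable CallContext.
final class CallContext {
    final String messageId;     // id of the message/call being handled
    final String correlationId; // binds related calls and messages together
    final Instant when;         // time of the "call"
    final String who;           // user that performed the "call"
    final String causedBy;      // Event or Command that caused this "call"

    private CallContext(String messageId, String correlationId, Instant when,
                        String who, String causedBy) {
        this.messageId = messageId;
        this.correlationId = correlationId;
        this.when = when;
        this.who = who;
        this.causedBy = causedBy;
    }

    // Starts a new chain of calls with a fresh correlation id
    static CallContext startNew(String who, String causedBy) {
        return new CallContext(UUID.randomUUID().toString(), UUID.randomUUID().toString(),
                               Instant.now(), who, causedBy);
    }

    // A follow-up call or message gets its own message id but keeps the correlation id
    CallContext followUp(String causedBy) {
        return new CallContext(UUID.randomUUID().toString(), correlationId,
                               Instant.now(), who, causedBy);
    }
}
```

Logging the correlation id with every entry makes it possible to reassemble one user action from the logs of all the services and AC’s it touched.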