2. The problem with microservices: distributed
data management
• When developing microservices you must tackle the
problem of distributed data management. Each
microservice has its own private database, sometimes
a SQL and sometimes a NoSQL database.
• Developing business transactions that update entities
that are owned by multiple services is a challenge, as
is implementing queries that retrieve data from
multiple services.
3. Event-driven architecture to the rescue
• For most applications, the way to make
microservices work and to manage
distributed data successfully is to adopt an
event-driven architecture. In an event-driven
architecture, a service publishes events when
something notable happens, such as when it
updates a business object. Other services
subscribe to those events. In response to an
event a service typically updates its own state.
It might also publish more events, which then
get consumed by other services.
• You can use an event-driven approach to
implement eventually consistent transactions
and to maintain materialized views.
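The publish/subscribe flow described above can be sketched in a few lines. This is only an in-process toy, assuming a hypothetical `EventBus` class in place of a real message broker; the `CustomerOrderViewUpdated` event name and the `orders_by_customer` view are illustrative and do not come from the slides.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Toy in-process stand-in for a message broker (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Deliver the event to every handler registered for its type.
        for handler in list(self._subscribers[event_type]):
            handler(payload)


bus = EventBus()
orders_by_customer: Dict[str, List[int]] = defaultdict(list)  # materialized view
notifications: List[str] = []


def update_view(event: dict) -> None:
    # The subscriber updates its own state in response to the event...
    orders_by_customer[event["customer_id"]].append(event["order_id"])
    # ...and may publish a follow-on event for other services.
    bus.publish("CustomerOrderViewUpdated", {"customer_id": event["customer_id"]})


bus.subscribe("OrderCreated", update_view)
bus.subscribe("CustomerOrderViewUpdated", lambda e: notifications.append(e["customer_id"]))

bus.publish("OrderCreated", {"order_id": 1, "customer_id": "c1"})
bus.publish("OrderCreated", {"order_id": 2, "customer_id": "c1"})
```

A real broker delivers events asynchronously and across processes; the synchronous dispatch here only illustrates who publishes and who reacts.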
4. Using event-driven eventually consistent
transactions
You can use events to implement eventually consistent business
transactions that span multiple services. ACID transactions are
replaced by multi-step, event-driven eventually consistent
workflows. At each step, a service updates its data and then
publishes an event that triggers the next step.
NO ACID
5. Challenges
As you can see, an event-driven architecture solves the distributed
data management problems inherent in a microservice architecture.
However, implementing an event-driven architecture is not easy.
• This pattern has the following benefit:
– It enables an application to maintain data consistency across multiple services
without using distributed transactions
• This solution has the following drawback:
– The programming model is more complex
6. Issues to be addressed
In order to be reliable, an application must atomically update its
database and publish an event. It cannot use the traditional
mechanism of a distributed transaction that spans the database and
the message broker. Instead, it must use one of the patterns listed
below:
• The Database per Service pattern creates the need for this pattern
• The following patterns are ways to atomically update state and
publish events:
– Event sourcing
– Application events
– Database triggers
– Transaction log tailing
7. Example
An e-commerce application that uses this approach would work as follows:
1. The Order Service creates an Order in a pending state and publishes an OrderCreated event.
2. The Customer Service receives the event and attempts to reserve credit for that Order. It then publishes either a CreditReserved event or a CreditLimitExceeded event.
3. The Order Service receives the event from the Customer Service and changes the state of the order to either approved or cancelled.
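[Diagram: the Order Service creates the order in the Pending state and sends an OrderCreated event to the Customer Service. The Customer Service replies with a CreditReserved event or a CreditLimitExceeded event, and the order moves to the Approved or Cancelled state.]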
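The three steps of the e-commerce example can be traced with a small simulation. This is a sketch under stated assumptions: a `deque` stands in for the message broker, both services live in one process, and the function names, the customer "alice", and the credit amounts are made up for illustration.

```python
from collections import deque

queue = deque()          # stands in for the message broker
orders = {}              # Order Service's private data
credit = {"alice": 100}  # Customer Service's private data (available credit)


def create_order(order_id, customer, total):
    # Step 1: create the order as PENDING and publish OrderCreated.
    orders[order_id] = {"customer": customer, "total": total, "state": "PENDING"}
    queue.append(("OrderCreated", {"order_id": order_id, "customer": customer, "total": total}))


def customer_service(event_type, data):
    # Step 2: try to reserve credit, then publish the outcome.
    if event_type != "OrderCreated":
        return
    if credit.get(data["customer"], 0) >= data["total"]:
        credit[data["customer"]] -= data["total"]
        queue.append(("CreditReserved", {"order_id": data["order_id"]}))
    else:
        queue.append(("CreditLimitExceeded", {"order_id": data["order_id"]}))


def order_service(event_type, data):
    # Step 3: approve or cancel the order based on the Customer Service's event.
    if event_type == "CreditReserved":
        orders[data["order_id"]]["state"] = "APPROVED"
    elif event_type == "CreditLimitExceeded":
        orders[data["order_id"]]["state"] = "CANCELLED"


def run():
    # Drain the queue, offering each event to both services.
    while queue:
        event_type, data = queue.popleft()
        customer_service(event_type, data)
        order_service(event_type, data)


create_order(1, "alice", 60)
create_order(2, "alice", 60)
run()
```

Between step 1 and step 3 each order is visibly PENDING, which is the "eventually consistent" part: no single transaction spans both services.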
10. Achieving Atomicity
In an event-driven architecture there is also the problem of atomically
updating the database and publishing an event. For example, the
Order Service must insert a row into the ORDER table and publish an
OrderCreated event. It is essential that these two operations are done
atomically. If the service crashes after updating the database but
before publishing the event, the system becomes inconsistent. The
standard way to ensure atomicity is to use a distributed transaction
involving the database and the Message Broker. However, for
reasons such as the CAP theorem (see 13. CAP theorem), this is exactly
what we do not want to do.
11. Publishing Events Using Local Transactions
One way to achieve atomicity is for the
application to publish events using a
multi-step process involving only local
transactions. The trick is to have an EVENT
table, which functions as a message queue, in
the database that stores the state of the
business entities. The application begins a
(local) database transaction, updates the state
of the business entities, inserts an event into
the EVENT table, and commits the transaction.
A separate application thread or process
queries the EVENT table, publishes the events
to the Message Broker, and then uses a local
transaction to mark the events as published.
The following diagram shows the design.
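The EVENT-table approach can be sketched with SQLite from the Python standard library. The table layout, the `published` flag, and the `relay_once` and `send` names are assumptions chosen for illustration; a production relay would also need error handling and polling.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, state TEXT NOT NULL);
    CREATE TABLE event (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def create_order(order_id):
    # One local transaction covers both the business update and the event row,
    # so either both are committed or neither is.
    with db:
        db.execute("INSERT INTO orders (id, state) VALUES (?, 'PENDING')", (order_id,))
        db.execute(
            "INSERT INTO event (type, payload) VALUES (?, ?)",
            ("OrderCreated", json.dumps({"order_id": order_id})),
        )


def relay_once(send):
    """Publish unpublished events in insertion order, then mark each as published."""
    rows = db.execute(
        "SELECT id, type, payload FROM event WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        send(event_type, json.loads(payload))  # hypothetical broker call
        with db:  # separate local transaction to record the publication
            db.execute("UPDATE event SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)


sent = []
create_order(1)
create_order(2)
relay_once(lambda event_type, payload: sent.append((event_type, payload)))
```

If the relay crashes after `send` but before the `UPDATE`, the event is sent again on the next pass. Delivery is therefore at-least-once, and consumers should tolerate duplicates.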
12. Event Store
Events persist in an Event Store, which is a database of events. The
store has an API for adding and retrieving an entity’s events. The
Event Store also behaves like the Message Broker in the architectures
we described previously. It provides an API that enables services to
subscribe to events. The Event Store delivers all events to all
interested subscribers. The Event Store is the backbone of an
event-driven microservices architecture.
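The three responsibilities named above (adding events, retrieving an entity's events, and letting services subscribe) might look like this in miniature. `InMemoryEventStore`, its method names, and the `OrderApproved` and `OrderCancelled` event names are hypothetical; a real Event Store would persist events durably.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

Event = Tuple[str, dict]  # (event type, payload)


class InMemoryEventStore:
    """Toy event store: append, retrieve per entity, and subscribe (hypothetical API)."""

    def __init__(self) -> None:
        self._events: Dict[str, List[Event]] = defaultdict(list)
        self._subscribers: List[Callable[[str, Event], None]] = []

    def append(self, entity_id: str, event_type: str, payload: dict) -> None:
        event = (event_type, payload)
        self._events[entity_id].append(event)
        # Acting as the broker: deliver the new event to every subscriber.
        for notify in list(self._subscribers):
            notify(entity_id, event)

    def events_for(self, entity_id: str) -> List[Event]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._events.get(entity_id, []))

    def subscribe(self, handler: Callable[[str, Event], None]) -> None:
        self._subscribers.append(handler)


def current_state(events: List[Event]) -> Optional[str]:
    # Rebuild an order's state by replaying its events in order.
    transitions = {
        "OrderCreated": "PENDING",
        "OrderApproved": "APPROVED",
        "OrderCancelled": "CANCELLED",
    }
    state = None
    for event_type, _payload in events:
        state = transitions.get(event_type, state)
    return state


store = InMemoryEventStore()
seen = []
store.subscribe(lambda entity_id, event: seen.append((entity_id, event[0])))
store.append("order-1", "OrderCreated", {"total": 60})
store.append("order-1", "OrderApproved", {})
```

Because the events are the only thing written, storing an event and publishing it are the same operation, which is how this design avoids the separate "update, then publish" step.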
13. CAP theorem
In theoretical computer science, the CAP theorem, also named
Brewer's theorem after computer scientist Eric Brewer, states that it is
impossible for a distributed computer system to simultaneously
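provide more than two out of three of the following guarantees:
• Consistency: every read receives the most recent write or an error
• Availability: every request receives a non-error response, without a
guarantee that it contains the most recent write
• Partition tolerance: the system continues to operate even when
messages between nodes are dropped or delayed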
In other words, the CAP Theorem states that in the
presence of a network partition, one has to choose
between consistency and availability. Note that
consistency as defined in the CAP Theorem is quite
different from the consistency guaranteed in ACID
database transactions.
14. ZooKeeper
Apache ZooKeeper is a software project of the Apache
Software Foundation. It is essentially a distributed
hierarchical key-value store, which is used to provide a
distributed configuration service, synchronization
service, and naming registry for large distributed
systems. ZooKeeper was a sub-project of Hadoop but is
now a top-level project in its own right.
ZooKeeper's architecture supports high availability
through redundant services. The clients can thus ask
another ZooKeeper server if the first fails to answer.
ZooKeeper nodes store their data in a hierarchical name
space, much like a file system or a tree data structure.
Clients can read from and write to the nodes and in this
way have a shared configuration service. Updates are
totally ordered.
ZooKeeper is used by companies including Rackspace,
Yahoo!, Odnoklassniki, Reddit and eBay as well as open
source enterprise search systems like Solr.