2. 2
In this presentation, I would like to discuss how you can
organize messaging in loosely coupled groups. Many solutions have
already been proposed and implemented to deal with this problem, but
my approach may give you a somewhat new view of this very old
problem.
First, let’s model the problem. Imagine that you need to
organise communication between a number of people. They are all
related to the information that you will pass. Some of them are
sources, or “writers”. Some of them are consumers of information,
or “members”, and they only read. Some of them deliver
information to the “members”; let’s call them “agitators”. And some
of them are responsible for transferring information between “writers”
and “agitators”. Let’s call them “a network”.
[Diagram: Network, Writer, and two Members]
3. 3
Most solutions propose either a fixed topology, where everyone knows everyone; or
some form of “star” network, where information is passed through a central core consisting
of one or more elements; or some sort of “mesh” or “peer-to-peer” communication, where
participants somehow magically find each other and the infrastructure topology gets widely
propagated.
Many of those approaches suffer from one or more shortcomings. Let’s review
some of them:
❏ A mesh network requires a complicated mechanism for distributing topology information
and keeping it up to date.
❏ A “peer-to-peer” network complicates information transfer, as it requires some
mechanism for detecting whether a piece of information has already been delivered to
a specific member. Topology maintenance is still a drag as well.
❏ Star topologies suffer from inflexibility. If the core of the network is compromised,
the information flow will be severely impacted. “Self-healing” in a centralised topology is
also somewhat difficult to implement.
4. 4
Oftentimes in real life, information passes through a very complicated, multi-layered
network before it reaches everyone, or at least most members. But how can we be
sure that our information distribution operates efficiently, tolerates some faults,
provides the means for passing information, doesn’t require disclosure of all writers and
members at the network level, isolates the different elements of the topology from each
other, and provides some assurance that our network, while loosely connected, is
performing its duties?
We can take as a model the mechanism used by undercover networks of past and
present. In those organisations, writers (or other sources of information) are decoupled from the
members. Neither writers nor members know the whole distribution network. They only
know a small part of it, “a cell”. And a “cell” is generally not aware of other “cells” that may
exist. Only a few members of a specific “cell” know how to get information from
other cells anonymously, without revealing themselves.
5. 5
My name is Vladimir Ulogov and I’ve been around a while. On several occasions, I
got myself involved in the creation of all kinds of communication networks and messaging
platforms, beginning with grabbing a stream of data from a passing aerial vehicle and
evolving into the creation of a messaging platform for passing metrics, or an HA messaging
platform for communication between real-time traffic monitors and security scanners. I’ve
seen and parsed information from serial lines, X.25 endpoints and the CAMAC bus to
TCP/UDP sockets, UDP broadcasts and IP multicasts. And of course, over the years
I’ve tried some of the known messaging platforms and written a couple of them myself.
This work is not an architectural design that is ready for implementation, but rather
a way to provoke thoughts and agitate the “little grey cells”.
6. 6
[Diagram: two loosely connected rings, numbered 1 and 2, with nodes labelled A to J]
Here is the idea that I would like to
discuss. The messaging platform consists of loosely
connected rings. Information is presented in the form
of finite-size messages, and those messages are
passed along the ring from one node to another. Either
members or writers can connect to any node in the
ring network, and they may or may not know the
full list of the ring’s members or of the
existence of any other rings. A selected member of
another ring (ring number 2) can connect to any known
node of the ring (number 1) and receive messages,
but it may or may not know the other members of
ring 1. Naturally, neither writers nor members are
aware of the existence of other rings or their topology.
When a message is received, it is marked as “seen”
and passed along down the ring. Rings are directional,
and the user must be able to choose the direction of the
ring.
7. 7
First, we shall review the anatomy of the
ring. A set of nodes, no fewer than three, makes up the ring.
During the ring initialization phase, a node may not
have a full list of all the nodes of the ring, but it shall
at least know its “uplink” or “bootstrap” node. For
example, if the ring is “right-handed”, i.e.
counterclockwise, node A shall know about node B,
but not necessarily about C. B shall know about C, and C
about A.
The key component of ring message
passing is a set of PUB-SUB protocols, complemented
by one PUSH-PULL protocol. We will
review them shortly.
[Diagram: ring of three nodes A, B and C]
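The uplink-only knowledge described on this slide can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not part of the design: the `uplinks` dictionary and the `walk_ring` helper are hypothetical names, and the node labels follow the example above.

```python
# Each node starts out knowing only its own uplink (assumed representation).
uplinks = {"A": "B", "B": "C", "C": "A"}


def walk_ring(start, uplinks):
    """Follow uplinks from `start` until the walk returns to it.

    Returns the visited nodes in order, or raises ValueError if the
    chain of uplinks does not close into a ring.
    """
    path = [start]
    current = uplinks.get(start)
    while current != start:
        if current is None or current in path:
            raise ValueError("uplinks do not form a closed ring")
        path.append(current)
        current = uplinks.get(current)
    return path
```

No single node holds this whole dictionary at start-up; each one holds just its own entry, and the full picture is only assembled later, as tokens circulate.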
8. 8
The first PUB-SUB protocol is the “Ring Sync
Protocol” (RSP). If you have ever heard of “Token Ring”
or any other ring protocol, you may say: “Hey, that’s
something familiar”, but it is not. RSP is not
responsible for ring synchronization, as in Token
Ring. It is rather intended to be a ring assurance and
topology control protocol. On the previous slide, I
mentioned that during the initialization phase, ring
members may not know the full ring topology. But each
node knows its “uplink”, connects a SUB socket to
the PUB service of the uplink, and waits for the token.
Each node waits for a random number of seconds and,
if no token is received, assumes the primary role,
generates a token and sends it to its PUB service.
[Diagram: ring of nodes A, B and C. A assumes the primary role and generates the token.]
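To make the random wait concrete, here is a minimal sketch of the ideal case, where the node whose timer fires first becomes primary and everyone else receives its token in time. The function name `elect_primary` and the `max_wait` parameter are my own assumptions for illustration; a real ring may well see more than one node time out, which is what the serial number and timestamp rules are for.

```python
import random


def elect_primary(nodes, max_wait, rng):
    """Draw a random wait per node; the shortest wait generates the token.

    `rng` is a random.Random instance so the draw can be reproduced.
    Returns (primary_node, waits_by_node).
    """
    waits = {node: rng.uniform(0, max_wait) for node in nodes}
    primary = min(waits, key=waits.get)
    return primary, waits
```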
9. 9
Because A is the uplink of C, C receives the
token, checks its timestamp and path, verifies the topology
information and certificates passed along with the
token, registers itself in the token path and passes the
token to its PUB service.
Since C is the uplink of node B, B receives
the token, checks the timestamp, path and certificates,
registers itself in the path and passes the token to its PUB service.
And because B is the uplink of node A, node A
receives this token from B. Since A is already
registered in the token path, it is clear that the token
has made a full circle. Therefore A generates a new one,
registers itself in the path and passes it along to its PUB service.
The idea behind passing a token along the
ring is to provide a simple assurance and topology
propagation mechanism. By receiving the token, each
node knows that the ring is working, that messages will be
delivered, and the topology… We will talk about topology
in a second.
[Diagram: ring of nodes A, B and C. The token is received and verified by node C, then received and verified by node B.]
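The token walk above can be sketched as follows. This is a simplified illustration, not a reference implementation: the `Token` fields and the `handle_token` function are hypothetical names, and the timestamp, topology and certificate checks are left out to keep the full-circle logic visible.

```python
from dataclasses import dataclass, field
import time


@dataclass
class Token:
    serial: int
    timestamp: float
    path: list = field(default_factory=list)


def handle_token(node, token, now=None):
    """Process a token arriving at `node` and return the token to publish."""
    now = time.time() if now is None else now
    if node in token.path:
        # Full circle: the node finds itself in the path, so it issues a
        # fresh token with the next serial number.
        return Token(serial=token.serial + 1, timestamp=now, path=[node])
    # Otherwise register in the path and pass the same token along.
    return Token(token.serial, token.timestamp, token.path + [node])


# Token flow on the slide: A generates, then C, then B, then back to A.
token = Token(serial=1, timestamp=0.0, path=["A"])
for node in ["C", "B", "A"]:
    token = handle_token(node, token, now=10.0)
```

After the loop, `token` is the fresh token that A generated once the first one came back to it.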
10. 10
One of the purposes of the token, and of RSP
altogether, is to convert an inconsistent ring into a
consistent one. This is achieved when each node adds
its locally known information about the ring topology to the
token. Eventually, after the first pass at the
earliest, node A will have a consistent view, and after the second
pass of the token, all members of the ring will be
aware of the ring topology.
Since we do not know when and how
timed-out nodes will generate tokens, we will consider
the first few (N) passes of the token as “sync-up”: nodes
receive and pass tokens and learn the ring
topology.
[Diagram: ring of nodes A, B and C. Labels: A -> B; C -> A -> B; C -> A -> B -> C]
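A small simulation shows why two passes are enough for three nodes. It is a sketch under my own assumptions: topology is modelled as a dictionary of node-to-uplink edges, the regenerated token is assumed to carry the accumulated topology forward, and `circulate` is a hypothetical helper name.

```python
def circulate(order, local_knowledge, passes):
    """Pass a token around the ring and merge topology at every hop.

    `order` lists nodes in token-flow order, starting with the generator.
    `local_knowledge` maps node -> {node: uplink} edges known at start.
    Returns, per completed pass, how many edges each node knows.
    """
    known = {n: dict(e) for n, e in local_knowledge.items()}
    generator, rest = order[0], order[1:]
    token_topology = dict(known[generator])   # the token leaves the generator
    history = []
    for _ in range(passes):
        for node in rest + [generator]:       # a pass ends back at the generator
            token_topology.update(known[node])    # node adds what it knows
            known[node].update(token_topology)    # node learns from the token
        history.append({n: len(e) for n, e in known.items()})
    return history


# Initially every node knows only its own uplink.
start = {"A": {"A": "B"}, "C": {"C": "A"}, "B": {"B": "C"}}
# Token flow from the earlier slides: A -> C -> B -> back to A.
history = circulate(["A", "C", "B"], start, passes=2)
```

In this run node A is complete after the first pass while C still misses one edge, and every node knows all three edges after the second pass.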
11. 11
How do we detect and combat a “Token Death
March”? A “Token Death March” is a condition on the
ring when two or more tokens have been generated. This
destroys synchronization and defeats the whole
purpose of RSP. It is very important that there is
only one token in the ring at a time. How will we
detect a “Token Death March”?
Each token has a serial number.
Every time a token is regenerated, the serial number
is increased. Each node remembers which serial
number has passed through it or has been generated by
it. If a node receives a token and has never had any
token before, life is easy: the node remembers the token’s
serial number, does the checks, updates the path and passes
the token along.
If a node already has a token registered, or a
token has been generated on this node, any token
received from the uplink with a higher serial number
takes precedence.
[Diagram: ring of three nodes A, B and C]
12. 12
If a node receives a token from its uplink
with the same serial number as the one generated on
that node, and the token path doesn’t contain
that node, we have just received a bogus
token, generated by some timed-out node in the ring.
Our token is still out there, so we can safely
discard this token and wait until our token has made a
full round. But before we discard this token, we must
also check its timestamp. If the received token’s timestamp is
older than the timestamp of my generated token, and the
received token has not timed out yet, then it was
me who generated the bogus token, and the received token
takes precedence. But if this token is younger than
mine, yep, I’ll kill it.
During ring synchronization, it is okay to
generate and receive bogus tokens. The serial number and
the timestamp will help to eventually eliminate them.
[Diagram: ring of three nodes A, B and C]
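The precedence rules for duplicate tokens can be written down as one decision function. Treat this as a sketch of my reading of the rules, not a specification: `judge`, `ttl` and the `Token` fields are hypothetical names, and two cases the slides do not spell out (a lower serial number, and equal timestamps) are resolved here as “discard” purely by assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Token:
    serial: int
    timestamp: float
    path: list = field(default_factory=list)


def judge(node, mine: Optional[Token], received: Token, now, ttl):
    """Return "accept" or "discard" for a token received from the uplink.

    `mine` is the token this node last generated (None if it never held one).
    """
    if mine is None:
        return "accept"                  # first token ever seen
    if received.serial > mine.serial:
        return "accept"                  # higher serial has precedence
    if received.serial == mine.serial and node not in received.path:
        expired = now - received.timestamp > ttl
        if received.timestamp < mine.timestamp and not expired:
            return "accept"              # ours was the bogus one
        return "discard"                 # the received token is the bogus one
    if received.serial < mine.serial:
        return "discard"                 # assumption: a stale serial loses
    return "accept"                      # our own token came full circle
```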
13. 13
In a unidirectional ring, where each node is
both publisher and subscriber, the token plays an
important role in controlling the state of the ring.
Publishers are generally not aware of subscribers.
Do they exist? Are they connected? Are they passing
messages down the ring? How are we going to
detect that our ring is broken and node A (for
example) is no longer functioning?
The token. Each node expects to receive a
token within a configurable interval. If our ring is in a healthy
state, token handling takes no effort: receive, check the
timestamp, serial number, path and topology, then pass along
or regenerate. Update the status: “Ring is working”. As
long as all nodes behave that way, each of them
will know the true status of the ring at all times.
[Diagram: ring of nodes A, B and C, each with a PUB and a SUB socket]
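The “token within a configurable interval” check amounts to a watchdog timer. Here is a minimal sketch with the clock passed in explicitly, so it can be tested without sleeping; the class name `TokenWatchdog` and the status string for the unhealthy case are my own assumptions.

```python
class TokenWatchdog:
    """Track ring health from token arrivals (the interval is configurable)."""

    def __init__(self, interval, now):
        self.interval = interval
        self.last_seen = now

    def token_received(self, now):
        # Called after the token passed its checks and was forwarded.
        self.last_seen = now

    def status(self, now):
        if now - self.last_seen <= self.interval:
            return "Ring is working"
        return "Token timeout"
```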
14. 14
But what if one (or more) of the nodes in our
ring gets compromised? It has stopped serving
messages. We are receiving neither messages nor the
token. A token timeout occurs. Hopefully, that happens
when our ring was stable and the information about ring
topology had been propagated. When we detect a token
timeout, we first determine the “next good uplink”. For
node C, it will be node B, so C will sever its
connection with A, connect to B, and update the
topology to reflect that fact. Next, the nodes wait and
begin the process of ring synchronization as
described earlier, but with the updated topology.
While the ring is not synchronized, we do not
serve any incoming or outgoing messages until
normality is restored.
[Diagram: ring of nodes A, B and C, each with a PUB and a SUB socket]
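Finding the “next good uplink” is a walk along the known uplink chain that skips dead nodes. The sketch below assumes the node already holds the full topology as a node-to-uplink dictionary; `next_good_uplink` and the `failed` set are hypothetical names.

```python
def next_good_uplink(node, topology, failed):
    """Skip failed nodes along the uplink chain; return the first live one.

    `topology` maps node -> uplink; `failed` is the set of nodes considered
    dead. Returns None if no other live node can be reached.
    """
    candidate = topology[node]
    seen = set()
    while candidate in failed:
        if candidate in seen:
            return None                  # walked in a circle of dead nodes
        seen.add(candidate)
        candidate = topology[candidate]
    return None if candidate == node else candidate


topology = {"A": "B", "B": "C", "C": "A"}
# A stops responding: C drops A and reconnects to A's own uplink, B.
new_uplink = next_good_uplink("C", topology, failed={"A"})
```

With that result, C would record B as its uplink and the ring would re-run synchronization over the two surviving nodes.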
15. 15
Here is a summary of the first of the
few protocols that we are reviewing.
❏ RSP validates the ring. If you get the
token, the ring passes messages.
❏ RSP distributes topology information.
❏ RSP is essential for detecting
problems on the ring and, in general, for “in-ring High
Availability”.
❏ RSP is essential for ring security.
Certificates are delivered as part of the token.
And this will be enough for the basics of
RSP.
[Diagram: ring of three nodes A, B and C]
16. 16
Another PUB-SUB protocol is the “Message
Transmit Protocol”, or MTP. The only purpose of this
protocol is to transmit messages down the ring.
Each node of the ring connects a SUB socket to the PUB
socket of its uplink and, as soon as the ring
becomes synchronized, begins to receive messages.
Once a message is received, the node checks its timestamp
and path. If the message has timed out,
or if the current node is already in the message’s path,
the message is marked in the “Messages Database” (MDB) as
“delivered” and discarded. Otherwise, the node
publishes the message on its own PUB socket, so
downstream nodes can receive it and pass it along.
[Diagram: ring of nodes A, B and C, each with a PUB and a SUB socket]
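The per-message check that MTP performs on each node might look like this. It is a sketch only: the `Message` fields, the `relay` function, the `ttl` parameter and the use of a plain dictionary as the MDB are all my own assumptions, as is the step where the node appends itself to the path before publishing.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str
    timestamp: float
    body: str
    path: list = field(default_factory=list)


def relay(node, message, mdb, now, ttl):
    """Apply the MTP checks; return the message to publish, or None."""
    expired = now - message.timestamp > ttl
    if expired or node in message.path:
        mdb[message.msg_id] = "delivered"    # record it and drop it
        return None
    # Assumption: the node registers itself in the path before publishing.
    return Message(message.msg_id, message.timestamp, message.body,
                   message.path + [node])
```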
17. 17
All members of this ring wishing to
receive messages shall open a SUB socket to the PUB
MTP interface of a known node of the ring. Another
SUB connection must be established with the PUB
interface of the RSP protocol. All members of the ring
will receive the state of the ring and its topology.
A “member” behaves about the same as a
ring node. If the node to which the member is connected
becomes “stale”, a token timeout occurs.
Hopefully, by that time, the member has received a few
tokens and at least one of them says that the ring
is synchronized. This means that the member is
aware of the topology of the ring and can reconnect to
another ring node in the same way ring nodes do.
[Diagram: ring of nodes A, B and C with PUB and SUB sockets, plus a member connected through a SUB socket]
18. 18
When the ring is recovering from a
node failure, as soon as synchronization is
reached, every node can recirculate some of the
messages that it holds in its MDB, so if
there is any gap in the messages delivered to
members, they have a chance to actually get them.
There will be duplicates, but they will be
discarded either on the nodes or once recirculation is
finished.
[Diagram: ring of nodes A, B and C with PUB and SUB sockets, plus a member connected through a SUB socket]
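Discarding the duplicates that recirculation produces only needs a set of message identifiers on the receiving side. The sketch below is an assumption about how that could be done, since the slides do not define a message identifier; `recirculate`, `msg_id` and `already_received` are hypothetical names.

```python
def recirculate(mdb_messages, already_received):
    """Replay messages held in a node's MDB; the receiver drops duplicates.

    `mdb_messages` is a list of (msg_id, body) pairs; `already_received`
    is the set of ids the member has seen so far (updated in place).
    Returns only the messages that fill a gap.
    """
    fresh = []
    for msg_id, body in mdb_messages:
        if msg_id in already_received:
            continue                     # duplicate, discard
        already_received.add(msg_id)
        fresh.append((msg_id, body))
    return fresh
```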
19. 19
The third protocol that we will discuss is not
PUB-SUB. It is rather PUSH-PULL, and it is used by a
“writer”, a client of the ring network that submits
new messages. To do so, the writer connects a PUSH socket
to the PULL interface of its uplink node and
begins to produce messages, which are accepted by the
ring node and passed down the ring.
Just like a member, which receives
messages from the ring, the writer must also connect
a SUB socket to the PUB interface of RSP. Once the ring
is synchronized, the writer will
have full access to the ring topology and status.
[Diagram: ring of nodes A, B and C with PUB and SUB sockets, plus a writer whose PUSH socket connects to a node’s PULL interface]
20. 20
One of the writer’s requirements is to
hold a small MDB with a message cache, in case of
problems on the ring. If the writer learns
through RSP that the ring is re-synchronizing, then once
recovery is achieved, the writer shall resend whatever
it believes was missed. There will be duplicates, but
that’s alright.
[Diagram: ring of nodes A, B and C with PUB and SUB sockets, plus a writer whose PUSH socket connects to a node’s PULL interface]
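A writer-side cache can be as small as a bounded queue. In this sketch the writer simply resends everything still in the cache after a resync, which is a simplification of “whatever it believes was missed”; `WriterCache`, the `size` default and the `push` callable (standing in for the PUSH socket) are my own assumptions.

```python
from collections import deque


class WriterCache:
    """Small message cache kept by a writer (the size is an assumption)."""

    def __init__(self, size=100):
        self.recent = deque(maxlen=size)     # oldest entries fall out first

    def send(self, message, push):
        self.recent.append(message)
        push(message)

    def resend_after_resync(self, push):
        # Duplicates are acceptable; ring nodes discard them.
        for message in list(self.recent):
            push(message)
        return len(self.recent)
```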
21. 21
The third PUB-SUB protocol for ring
communication is the “Shadow Ring Sync Protocol”, or
SRSP. This protocol is used to propagate the ring token
outside our ring. Why can we not just use RSP for that
purpose?
To answer this question, we shall
reiterate a very important property of our
architecture. Within a ring, we strive to be as
reliable as possible, but outside our ring, we do not
keep such tight control.
Another reason for this protocol is
isolation of the topologies. Ring topology will not
cross SRSP. Very few pieces of information
about one ring will propagate to the members of another
ring. Only selected nodes of the other ring will know how
to communicate with the “outside world”.
[Diagram: nodes A and B connected by a PUB-SUB link]
22. 22
And because all messages hold some
information about the ring topology (the path and some
other fields), we do not want to expose this to another
ring. For retransmitting
messages with the ring information stripped, we have the
PUB-SUB “Shadow Message Transfer Protocol”, or
SHMTP. Node B of one ring subscribes to the
SHMTP PUB interface on node A, a member of another
ring. And let’s not forget that node B is communicating
with the SRSP PUB interface of node A as well.
[Diagram: nodes A and B connected by a PUB-SUB link]
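Stripping ring information before a message leaves the ring can be pictured as filtering a message’s fields. The field names below (`path`, `topology`) and the dictionary representation are assumptions for illustration; the slides only say that the path “and some others” must not cross over.

```python
def strip_ring_info(message, ring_fields=("path", "topology")):
    """Return a copy of `message` without ring-internal fields."""
    return {key: value for key, value in message.items()
            if key not in ring_fields}


inside = {"id": "m1", "body": "hello", "path": ["A", "C", "B"]}
outside = strip_ring_info(inside)    # what would go out over SHMTP
```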
23. 23
Communication between rings is always
unidirectional. All you have is the SRSP and SHMTP
PUB-SUB protocols. And sometimes that is really all you
need. If you want to establish bidirectional
communication between rings, you have to instruct
your first ring to use SRSP and SHMTP from the
second ring.
[Diagram: nodes A and B connected by two PUB-SUB links, one in each direction]
24. 24
By combining how you configure and
connect your rings, you can achieve pretty tight
isolation and control the proper distribution of
messages. The architecture discussed here doesn’t give you
an instant propagation mechanism. It rather gives you
tools and an architecture. You have to architect and
define your network, and this network will do the rest.
[Diagram: example network of connected rings; nodes labelled A, B, C, D, E, F, G, H, K, L, M, N]