Modern Distributed Messaging and RPC

LIGHTWEIGHT MESSAGING AND
RPC IN DISTRIBUTED SYSTEMS
Max A. Alexejev
11.10.2012

Messaging System
Message (not packet/byte/…) as a minimal
transmission unit.
The whole system unifies
• Underlying protocol (TCP, UDP)
• UNICAST or MULTICAST
• Data format (message types & structure)
Tied with
• Serialization format (text or binary)
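
To make "message as a minimal transmission unit" concrete, here is a minimal sketch of message framing over a byte stream such as TCP. The 4-byte length prefix and the JSON text serialization are assumptions chosen for illustration, and the function names are hypothetical, not taken from any library discussed here.

```python
import json
import struct

# Assumption: every message is prefixed with a 4-byte big-endian length.
HEADER = struct.Struct("!I")


def encode_message(msg: dict) -> bytes:
    """Serialize a message as JSON text and prepend its length."""
    body = json.dumps(msg, sort_keys=True).encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_messages(buffer: bytes):
    """Split a byte buffer into whole messages.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # the body is incomplete; wait for more bytes
        messages.append(json.loads(buffer[start:end].decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

The receiver keeps the leftover bytes and prepends them to the next chunk it reads, so the application only ever sees whole messages.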

Typical peer-to-peer messaging
Producer [host, port] → Consumer [host, port]

Typical broker-based messaging
Producer [bhost, bport] → Broker ← Consumer [bhost, bport]
• Broker is an indirection layer between
producer and consumer.
• Producer PUSHes messages to broker.
• Consumer PULLs messages from broker.
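
The push/pull indirection can be sketched with an in-memory stand-in for a broker. The class and method names below are hypothetical, and a real broker would sit behind a network address ([bhost, bport]) rather than inside the same process.

```python
from collections import deque


class InMemoryBroker:
    """Toy broker: producers push, consumers pull, neither knows the other."""

    def __init__(self):
        self._queues = {}  # topic name -> deque of pending messages

    def push(self, topic, message):
        """Called by a producer; returns as soon as the message is queued."""
        self._queues.setdefault(topic, deque()).append(message)

    def pull(self, topic):
        """Called by a consumer; returns the oldest message or None."""
        queue = self._queues.get(topic)
        if not queue:
            return None
        return queue.popleft()
```

The producer only needs the broker's address and a topic name; it never learns who pulls the message, or when.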

The trick is…
Producer [bhost, bport] → Broker ← Consumer [bhost, bport]
• Producers and consumers are logical units.
• Both P and C may be launched in multiple
instances.
• The terms p2p and pubsub are defined in terms of
these logical (!) units.
• Even the broker may be a distributed or replicated
entity.
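
A small sketch of what "logical units" means in practice: every logical consumer sees each published message (pubsub between logical units), while the instances of one logical consumer share the work. The round-robin choice of instance and all names here are assumptions for illustration.

```python
from collections import defaultdict
from itertools import cycle


class LogicalConsumer:
    """One logical consumer that is launched as several instances."""

    def __init__(self, name, instance_count):
        self.name = name
        self.received = defaultdict(list)  # instance index -> messages
        self._next_instance = cycle(range(instance_count))

    def deliver(self, message):
        # Exactly one instance of this logical consumer gets the message.
        self.received[next(self._next_instance)].append(message)


def publish(message, logical_consumers):
    """Pubsub at the logical level: each logical consumer gets a copy."""
    for consumer in logical_consumers:
        consumer.deliver(message)
```

With a two-instance consumer and a one-instance consumer subscribed, three published messages are split between the two instances of the first and all arrive at the single instance of the second.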

Generic SOA picture
[Diagram: interconnected services S1, S2, S3, S4, S5, S6]

In a generic case
• A service may be both a consumer for many
producers and a producer to many
consumers

Characteristics and Features
• Topology (1-1, 1-N, N-N)
• Retries
• Service discovery
• Guaranteed delivery (if yes: at-least-once or exactly-once)
• Ordering
• Acknowledgements
• Disconnect detection
• Transactions support (can participate in distributed transactions)
• Persistence
• Portability (one or many languages and platforms)
• Distributed or not
• Highly available or not
• Type (p2p or broker-based)
• Load balancing strategy for consumers
• Client backoff strategy for producers
• Tracing support
• Library or standalone software

Main classes
• ESBs (Enterprise service buses)
– Slow, but the most feature-rich. Mule ESB, JBoss ESB,
Apache Camel, and many commercial products.
• JMS implementations
– ActiveMQ, JBoss Messaging, GlassFish, etc.
• AMQP implementations
– RabbitMQ, Qpid, HornetQ, etc.
• Lightweight modern stuff - unstandardized
– ZeroMQ, Finagle, Kafka, Beanstalkd, etc.

Messaging Performance
As usual, it's about throughput and latency…
Major throughput factors:
– Network hardware used
– UNICAST vs MULTICAST (for fan-out)
Major latency factors:
– Persistence (batched or per-message; sequential
or random disk writes)
– Transactions
– Broker replication
– Delivery guarantees (at-least-once & exactly-once)

Guaranteed delivery
Involves additional logic on the Producer, the
Consumer, and the Broker (if any)!
This is at-least-once delivery:
• Producer needs to be acked by the Broker
• Consumer needs to track the high-watermark of
messages received from the Broker
Exactly-once delivery requires more work and is even
more expensive. Typically implemented as a two-phase
commit.
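
The at-least-once recipe above can be sketched as follows. This is a simplified in-process model, not any real broker's protocol: the lost ack is simulated with a flag, and the high-watermark check assumes messages carry increasing sequence numbers and arrive in order. All names are hypothetical.

```python
class Broker:
    """Toy broker that stores every message it is sent."""

    def __init__(self, drop_first_ack=False):
        self.log = []
        self._drop_next_ack = drop_first_ack

    def append(self, message):
        self.log.append(message)  # stored even when the ack is lost
        if self._drop_next_ack:
            self._drop_next_ack = False
            return False  # simulate an ack that never reached the producer
        return True


def send_with_retry(broker, seq, payload, max_attempts=3):
    """Producer side: resend until the broker acks (may create duplicates)."""
    for _ in range(max_attempts):
        if broker.append((seq, payload)):
            return True
    return False


class Consumer:
    """Consumer side: drop anything at or below the high-watermark."""

    def __init__(self):
        self.high_watermark = -1
        self.processed = []

    def on_message(self, seq, payload):
        if seq <= self.high_watermark:
            return  # duplicate caused by a producer retry
        self.processed.append(payload)
        self.high_watermark = seq
```

The retry makes delivery at-least-once; the watermark check on the consumer hides the resulting duplicates from the application.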

Ordering (distributed broker scenario)
No ordering
• Consumers receive messages in any order. Very cheap.
Partitioned ordering
• Messages are ordered within a single data partition, such
as a stock symbol or account number. A well-performing
distributed broker implementation is possible.
Global (fair) ordering
• All incoming messages are fairly ordered. Scalability and
performance are limited.
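
Partitioned ordering can be sketched by hashing a partition key (for example a stock symbol) to pick a partition. The CRC32 hash and the function names are assumptions for illustration; real brokers choose their own partitioning function.

```python
import zlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition; the same key always maps to the same one."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


def route(messages, partition_count):
    """Append (key, payload) pairs to their partitions in arrival order."""
    partitions = [[] for _ in range(partition_count)]
    for key, payload in messages:
        partitions[partition_for(key, partition_count)].append((key, payload))
    return partitions
```

Because all messages for one key land in one partition in arrival order, ordering holds per key, while different partitions can live on different brokers and be processed in parallel.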

Remote procedure calls
Inherently built on top of some form of messaging.
A method call is the minimal unit, with 3 possible
outcomes: it may succeed (optionally returning a value),
throw an exception, or time out.
Adds some RPC-specific characteristics & features:
• Sync or async
• Distributed stack traces for exceptions
• Interface and struct declarations (possibly via
some DSL), which often come with a serialization library
• May support schema evolution
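
The three possible outcomes of a remote call can be sketched with a thread pool standing in for the network. `call_remote` and the sample functions `add`, `boom`, and `slow` are hypothetical; no real RPC library is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_remote(executor, func, *args, timeout=1.0):
    """Run func elsewhere and classify the result as one of 3 outcomes."""
    future = executor.submit(func, *args)
    try:
        return ("ok", future.result(timeout=timeout))
    except FutureTimeout:
        return ("timeout", None)  # no answer arrived in time
    except Exception as exc:
        return ("error", exc)  # the callee raised an exception


def add(a, b):
    return a + b


def boom():
    raise ValueError("bad input")


def slow():
    time.sleep(0.3)
    return "late"
```

Note that a timeout tells the caller nothing about whether the call actually ran on the other side, which is one reason RPC needs retries and delivery semantics of its own.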

Serialization libraries
Currently, there are 4 clear winners:
1. Google Protocol Buffers (with ProtoStuff)
2. Apache Thrift
3. Avro
4. MessagePack
All provide DSLs and schema evolution.
They differ in wire format and in the form of the DSL
compiler (a program written in C, one written in Java,
or no compilation step at all).
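
Schema evolution can be illustrated with numbered field tags, the general approach used by Protocol Buffers and Thrift. The sketch below is not the wire format of any of these libraries: the list-of-pairs encoding, the schema layout, and the function names are assumptions for illustration.

```python
def encode(record: dict, schema: dict) -> list:
    """Encode a record as [tag, value] pairs.

    schema maps a numeric tag to (field_name, default_value).
    """
    return [
        [tag, record[name]]
        for tag, (name, _default) in sorted(schema.items())
        if name in record
    ]


def decode(wire: list, schema: dict) -> dict:
    """Decode with the reader's schema: skip unknown tags, fill defaults."""
    result = {name: default for name, default in schema.values()}
    for tag, value in wire:
        if tag in schema:
            result[schema[tag][0]] = value
    return result


# Version 2 of the schema adds a field under a new tag.
SCHEMA_V1 = {1: ("symbol", ""), 2: ("price", 0.0)}
SCHEMA_V2 = {1: ("symbol", ""), 2: ("price", 0.0), 3: ("currency", "USD")}
```

An old reader ignores tag 3 from a new writer, and a new reader fills in the default currency when an old writer omits it, so services can be upgraded independently.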

Messaging vs RPC
Messaging
• In the broker-enabled case:
Producers are decoupled
from Consumers. Just
push a message and don’t
care who pulls it.
• Natively matches
messages to events in
event-sourcing
architectures.
RPC
• Need to know the destination
(i.e., service A must know
service B and its call
signature).
Messaging and RPC dictate different programming
models. RPC requires higher coupling between
interacting services.

Today’s Overview
• Broker[less] peer-to-peer messaging
ZeroMQ
• Broker-enabled persistent distributed
pubsub
Apache Kafka
• Multi-paradigm and feature-rich RPC in Scala
Twitter Finagle

ZeroMQ
“It's sockets on steroids. It's like mailboxes with
routing. It's fast!
Things just become simpler. Complexity goes
away. It opens the mind. Others try to explain
by comparison. It's smaller, simpler, but still
looks familiar.”
@ ZeroMQ 2.2 Guide

ZeroMQ - features
• Topology – all, very flexible.
• Retries – no.
• Service discovery – no.
• Guaranteed delivery – no.
• Acknowledge – no.
• Disconnect detection – no.
• Transactions support (can participate in distributed transactions) – no.
• Persistence – kind of.
• Portability (one or many languages and platforms) – yes, there are many bindings. However,
library itself is written in C, so there’s only one “native” binding.
• Distributed – yes.
• Highly available or not – no.
• Type (p2p or broker-based) – mostly p2p. For an N-N topology, a broker is needed in the form of
a ZMQ “Device” with ROUTER/DEALER type sockets.
• Load balancing strategy for consumers – yes (???).
• Client backoff strategy for producers – no.
• Tracing support – no.
• Library or standalone software – platform-native library + language bindings.

ZeroMQ – features explained
Aren’t there too many “no”s?
Yes and no. Most of the features are not
provided out of the box, but may be
implemented manually in the client and/or server.
Some features are easy to implement
(heartbeats, acks, retries, …); some are very
complex (guaranteed delivery, persistence, high
availability).
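
As an example of an "easy" feature, heartbeat-based disconnect detection can be sketched in a few lines. This is plain Python and does not use the ZeroMQ API; the class name and the explicit `now` parameter (instead of reading a clock) are assumptions that keep the sketch deterministic.

```python
class HeartbeatMonitor:
    """Tracks when each peer was last heard from."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._last_seen = {}  # peer id -> time of the last heartbeat

    def beat(self, peer, now):
        """Record a heartbeat (or any other message) from a peer."""
        self._last_seen[peer] = now

    def dead_peers(self, now):
        """Peers that have been silent for longer than the timeout."""
        return sorted(
            peer
            for peer, seen in self._last_seen.items()
            if now - seen > self.timeout
        )
```

In a real client or server, `beat` would be called whenever a heartbeat message arrives on the socket, and `dead_peers` would be polled on a timer to trigger reconnects.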

ZeroMQ – what’s bad about it
• First of all, the name.
Think of ZMQ as a sockets library and you’re happy.
Consider it messaging middleware and you get frustrated
just reading the guide.
• Complex implementation for multithreaded
clients and servers.
• There were issues with services going down due
to corrupted packets (so, may not be suitable for
WAN).
• Some mess with the development process: the initial
ZMQ developers forked ZMQ as Crossroads.io.

ZeroMQ – what’s good
• Huge list of supported platforms.
• MULTICAST support for fan-out (1-N)
topology.
• High raw performance.
• Fluent connect/disconnect/reconnect
behavior – it really feels the way it should.
• Wants to be part of Linux kernel.

ZeroMQ – verdict
• Good for unreliable high-performance
communication, when delivery semantics are
not strict. Example – ngx-zeromq module for
NGINX.
• Good if you can invest sufficient effort in
building a custom messaging platform on top
of ZMQ as a network library. Example –
ZeroRPC lib by DotCloud.
• Bad for any other purpose.

Apache Kafka
“We have built a novel messaging system for log
processing called Kafka that combines the benefits
of traditional log aggregators and messaging
systems. On the one hand, Kafka is distributed and
scalable, and offers high throughput. On the other
hand, Kafka provides an API similar to a messaging
system and allows applications to consume log
events in real time.”
@ Kafka: a Distributed Messaging System for Log Processing,
LinkedIn

Kafka - features
• Topology – all.
• Retries – no.
• Service discovery – yes (ZooKeeper).
• Guaranteed delivery – no (at-least-once in the normal case).
• Disconnect detection – yes (ZooKeeper).
• Persistence – yes.
• Portability (one or many languages and platforms) – no.
• Highly available or not – no (work in progress).
• Type (p2p or broker-based) – broker-enabled with distributed broker.
• Load balancing strategy for consumers – yes.
• Client backoff strategy for producers – yes.
• Tracing support – no.
• Library or standalone software – standalone + client libraries in Java.

Kafka - Internals
• Fast writes
– Configurable batching
– All writes are sequential, so no random disk access is needed
(i.e., works well on commodity SATA/SAS disks in RAID
arrays)
• Fast reads
– O(1) disk search
– Extensive use of sendfile()
– No in-memory data caching inside Kafka – fully relies on
OS file system’s page cache
• Elastic horizontal scalability
– ZooKeeper is used for broker and consumer discovery
– Pubsub topics are distributed among brokers

Kafka - conclusion
• Good for event-sourcing architectures
(especially when they add HA support for
brokers).
• Good to decouple incoming stream and
processing to withstand request spikes.
• Very good for log aggregation and
monitoring data collection.
• Bad for transactional messaging with rich
delivery semantics (exactly-once, etc.).

Twitter Finagle
“Finagle is a protocol-agnostic, asynchronous
RPC system for the JVM that makes it easy to
build robust clients and servers in Java, Scala,
or any JVM-hosted language.
Finagle supports a wide variety of
request/response-oriented RPC protocols and
many classes of streaming protocols.”
@ Twitter Engineering Blog

Finagle - features
• Topology – all, very flexible.
• Retries – yes.
• Service discovery – yes (ZooKeeper).
• Guaranteed delivery – no.
• Disconnect detection – yes.
• Persistence – no.
• Portability (one or many languages and platforms) – JVM only.
• Highly available – yes.
• Type (p2p or broker-based) – p2p.
• Load balancing strategy for consumers – yes (least connections etc).
• Client backoff strategy for producers – yes (limited exponential).
• Tracing support – yes (Zipkin).
• Library or standalone software – Scala library.
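
A "limited exponential" client backoff can be sketched as below. This is a generic illustration and not Finagle's API; the function names, the default base, factor, and cap, and the injectable `sleep` parameter are assumptions.

```python
import time


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=6):
    """Yield delays that grow exponentially but never exceed the cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def call_with_backoff(operation, delays, sleep=time.sleep):
    """Try the operation once per delay, sleeping after each failure."""
    last_error = None
    for delay in delays:
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            sleep(delay)  # back off before the next attempt
    if last_error is None:
        raise RuntimeError("no attempts were made")
    raise last_error
```

The cap keeps a long outage from turning into arbitrarily long waits, while the exponential growth stops many clients from hammering a struggling server.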

Finagle – from authors
Finagle provides a robust implementation of:
• connection pools, with throttling to avoid TCP connection churn;
• failure detectors, to identify slow or crashed hosts;
• failover strategies, to direct traffic away from unhealthy hosts;
• load-balancers, including “least-connections” and other strategies;
• back-pressure techniques, to defend servers against abusive
clients and dogpiling.
Additionally, Finagle makes it easier to build and deploy a service that
• publishes standard statistics, logs, and exception reports;
• supports distributed tracing (a la Dapper) across protocols;
• optionally uses ZooKeeper for cluster management; and
• supports common sharding strategies.
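
The "least-connections" strategy mentioned above can be sketched as follows. This is a generic illustration, not Finagle's implementation; the class name and the tie-break by host name are assumptions.

```python
class LeastConnectionsBalancer:
    """Send each new request to the host with the fewest active requests."""

    def __init__(self, hosts):
        self.active = {host: 0 for host in hosts}

    def acquire(self):
        """Pick a host and count the request as in flight."""
        # Ties are broken by host name so the choice is deterministic.
        host = min(self.active, key=lambda h: (self.active[h], h))
        self.active[host] += 1
        return host

    def release(self, host):
        """Call when the request to this host has finished."""
        self.active[host] -= 1
```

A slow host accumulates in-flight requests and therefore receives fewer new ones, which gives a simple form of load-aware routing without explicit health checks.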

Finagle – Layered architecture

Finagle - conclusion
• Good for complex JVM-based RPC
architectures.
• Very good for Scala, worse experience with
Java (but yes, they have some utility classes).
• Works well with Thrift and HTTP (plus trivial
protocols), but lacks support for Protobuf and
other popular stuff.
• Active developer community (Google
group), but project infrastructure (Maven
repo, versioning, etc.) is still being improved.

Resources
• Moscow Big Systems / Big Data group
http://www.meetup.com/bigmoscow/
• http://www.zeromq.org
• http://zerorpc.dotcloud.com
• http://kafka.apache.org
• http://twitter.github.io/finagle/

QUESTIONS?
AND CONTACTS
• HTTP://MAKSIMALEKSEEV.MOIKRUG.RU/
• HTTP://RU.LINKEDIN.COM/PUB/MAX-ALEXEJEV/51/820/AB9
• HTTP://WWW.SLIDESHARE.NET/MAXALEXEJEV
• MALEXEJEV@GMAIL.COM
• SKYPE: MALEXEJEV
