Message-based Architectures
Week 4
Classifying systems by communication pattern
• Distributed systems (microservice-based systems among them)
need well-defined and efficient methods of communication.
• There are many different architectural patterns which specify the
communications between the components of a distributed system.
• They can be classified into two broad categories:
• Event-based
• Message passing
Event-based systems
• The control flow is determined by events like user actions, sensory
input or requests from other systems.
• Usually there is a main loop which listens for events and triggers
the appropriate callback function.
• Used predominantly for GUI-centered applications.
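The main-loop idea can be sketched in a few lines of Ruby. This is only an illustration and is not taken from any GUI toolkit: the `EventLoop` class and the event names `:click` and `:key` are made up for the example.

```ruby
# Minimal event loop: handlers are registered per event type and the
# loop dispatches queued events to them. All names are illustrative.
class EventLoop
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
    @queue = []
  end

  # Register a callback for an event type.
  def on(type, &callback)
    @handlers[type] << callback
  end

  # Enqueue an event; nothing runs until the loop processes it.
  def emit(type, payload = nil)
    @queue << [type, payload]
  end

  # The main loop: take events in arrival order and trigger callbacks.
  def run
    until @queue.empty?
      type, payload = @queue.shift
      @handlers[type].each { |callback| callback.call(payload) }
    end
  end
end

log = []
event_loop = EventLoop.new
event_loop.on(:click) { |pos| log << "clicked at #{pos.inspect}" }
event_loop.on(:key)   { |key| log << "pressed #{key}" }

event_loop.emit(:click, [10, 20])
event_loop.emit(:key, "q")
event_loop.run
puts log
```

The emitter never calls a handler directly; the loop decides when and in which order callbacks run, which is what makes the control flow event-driven.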
Event-based systems (cont.)
Is the MVC pattern an event-based pattern?
Are web applications event-based?
Is the MVC pattern an event-based pattern?
Short answer: Yes
Is the MVC pattern an event-based pattern?
Credit: MIT
Are web applications event-based?
Short answer: Yes, nearly all of them
Pros and Cons of event-based systems
• Pros
• Very reliable when the producer doesn't need to know who the consumer is,
or vice versa.
• Simpler communication patterns.
• Systems can be dynamically composable (e.g. via plugins)
at runtime without strict contracts.
• Cons
• Become very complex in large-scale applications.
• Can end up in 'callback hell'.
Interesting project: P Language
Source: https://www.microsoft.com/en-us/research/publication/p-safe-asynchronous-event-driven-programming/
P Language
• A system consists of a set of interacting state machines.
• Compiled and verifiable with model checking.
• Liveness checks ensure that events cannot be deferred forever.
• The new USB device driver in Windows 8 is written in P.
• Open-sourced for IoT and embedded applications.
Message passing based systems
• Based on messages being sent from one process to another.
• The recipient process can decide how to respond to the message.
• It separates invocation from implementation, where encapsulation plays a
fundamental role.
• Synchronous vs asynchronous
• Usually implemented via actors or channels (π-calculus).
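A rough sketch of the actor idea, using a Ruby thread as the process and a `Queue` as its mailbox. The `CounterActor` class and its `tell`/`ask` methods are hypothetical names chosen for this example; they illustrate asynchronous versus synchronous sends and are not a real actor library.

```ruby
# An actor-like process: a thread that owns its state and a mailbox.
# Other threads interact with it only by sending messages.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @thread = Thread.new { run }
  end

  # Asynchronous send: returns immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Synchronous request: sends a message and blocks until the reply arrives.
  def ask(message)
    reply_box = Queue.new
    @mailbox << [message, reply_box]
    reply_box.pop
  end

  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  # The recipient decides how to respond to each message.
  def run
    count = 0
    loop do
      message, reply_box = @mailbox.pop
      case message
      when :increment then count += 1
      when :get       then reply_box << count
      when :stop      then break
      end
    end
  end
end

counter = CounterActor.new
3.times { counter.tell(:increment) }   # fire and forget
total = counter.ask(:get)              # request and wait
counter.stop
puts total
```

The counter's state is never shared: callers see only the messages they send and the replies they get back, which is the encapsulation the slide refers to.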
Message passing based systems (cont.)
• Prominent actor-based languages:
• Erlang
• Scala
• Smalltalk
• Prominent channel-based languages:
• Go
Pros and Cons of message-based systems
• Pros
• Very suitable for systems where producers and consumers are well known
to each other.
• It adds security, e.g. if you write a credit-card processing system, you do want
to know whom you are talking to.
• Cons
• Requires more thorough design; semantics need to be specified in advance.
• May require schema versioning.
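One possible way to handle schema versioning, sketched under assumptions: every message carries a `:version` field, and the consumer upgrades old messages before processing them. The field names and the default currency `"EUR"` are invented for the example.

```ruby
# Each message carries a schema version; the consumer upgrades old
# versions to the current shape before processing. Field names are
# hypothetical.
CURRENT_VERSION = 2

# v1 messages had a bare :amount; v2 added an explicit :currency.
def upgrade(message)
  case message.fetch(:version)
  when 1 then message.merge(version: CURRENT_VERSION, currency: "EUR")
  when CURRENT_VERSION then message
  else raise ArgumentError, "unsupported schema version #{message[:version]}"
  end
end

def handle(message)
  current = upgrade(message)
  "withdraw #{current[:amount]} #{current[:currency]}"
end

old_result = handle({ version: 1, amount: 10 })
new_result = handle({ version: 2, amount: 10, currency: "SEK" })
puts old_result, new_result
```

Old producers keep working while new ones adopt the extended schema, at the cost of maintaining one upgrade step per retired version.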
Message Exchange Pattern Types
There are many variations of message exchange patterns (MEPs). These are:
• In-Only
• Robust In-Only
• In-Out
• In-Optional-Out
• Out-Only
• Robust Out-Only
• Out-In
• Out-Optional-In
In-Only
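Sender (ATM) -> Recipient (Account): account_withdraw, 10
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Amount} ->
            {_Res, _Balance} = withdraw(Amount) % from db; no reply is sent
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
Account = spawn(account, check_balance, []).
ATM = fun(Pid, Amount) -> Pid ! {account_withdraw, Amount} end.
ATM(Account, 10). % fire and forget: no reply comes back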
Robust In-Only
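Sender (ATM) -> Recipient (Account): account_withdraw, 10
Recipient (Account) -> Sender (ATM): ok | error
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Sender, Amount} ->
            {Res, _Balance} = withdraw(Amount), % from db
            Sender ! Res                        % status only: ok | error
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
Account = spawn(account, check_balance, []).
ATM = fun(Pid, Amount) ->
          Pid ! {account_withdraw, self(), Amount},
          receive Status -> Status end
      end.
ATM(Account, 10). % returns ok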
In-Out
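Sender (ATM) -> Recipient (Account): account_withdraw, 10
Recipient (Account) -> Sender (ATM): { ok, 2990 }
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Sender, Amount} ->
            {Res, Balance} = withdraw(Amount), % from db
            Sender ! {Res, Balance}            % always reply with the result
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
Account = spawn(account, check_balance, []).
ATM = fun(Pid, Amount) ->
          Pid ! {account_withdraw, self(), Amount},
          receive Reply -> Reply end
      end.
ATM(Account, 10). % returns {ok, 2990}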
In-Optional-Out
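Sender (ATM) -> Recipient (Account): account_withdraw, 100000
Recipient (Account) -> Sender (ATM): { ok, 2990 } (optional: only sent on success, e.g. for a withdrawal of 10)
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Sender, Amount} ->
            {Res, Balance} = withdraw(Amount), % from db
            case Res of
                ok ->
                    Sender ! {Res, Balance};
                error ->
                    io:format("let's forget it :-)~n") % no reply on failure
            end
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
Account = spawn(account, check_balance, []).
ATM = fun(Pid, Amount) ->
          Pid ! {account_withdraw, self(), Amount},
          receive Reply -> Reply after 1000 -> no_reply end
      end.
ATM(Account, 100000). % no reply arrives, so no_reply is returned after 1 s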
Out-Only
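Vattenfall (Account) -> Recipient (Account): account_withdraw, 10
Recipient (Account) -> Consumer (Mobile Banking App): bill_withdraw
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Amount, AppId} ->
            {_Res, _Balance} = withdraw(Amount), % from db
            AppId ! bill_withdraw                % notify the app, expect nothing back
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
IPhone = spawn(fun() -> receive Msg -> io:format("~p~n", [Msg]) end end).
Account = spawn(account, check_balance, []).
Vattenfall = fun(Pid, Amount) -> Pid ! {account_withdraw, Amount, IPhone} end.
Vattenfall(Account, 10). % IPhone prints bill_withdraw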
Robust Out-Only
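Vattenfall (Account) -> Recipient (Account): account_withdraw, 100000
Recipient (Account) -> Consumer (Mobile Banking App): err
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Amount, AppId} ->
            {Res, _Balance} = withdraw(Amount), % from db
            case Res of
                ok ->
                    AppId ! bill_withdraw;
                error ->
                    AppId ! error % the failure is reported as well
            end
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
IPhone = spawn(fun() -> receive Msg -> io:format("~p~n", [Msg]) end end).
Account = spawn(account, check_balance, []).
Vattenfall = fun(Pid, Amount) -> Pid ! {account_withdraw, Amount, IPhone} end.
Vattenfall(Account, 100000). % IPhone prints error; with 10 it prints bill_withdraw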
Out-In
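Vattenfall (Account) -> Recipient (Account): account_withdraw, 150
Recipient (Account) -> Consumer (Mobile Banking App): bill_withdraw, 150
Consumer (Mobile Banking App) -> Recipient (Account): need_to_cut_waste
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Amount, AppId} ->
            {Res, _Balance} = withdraw(Amount), % from db
            case Res of
                ok ->
                    AppId ! {bill_withdraw, self(), Amount},
                    %% Out-In: wait for the app's answer
                    receive Reply -> io:format("app replied: ~p~n", [Reply]) end;
                error ->
                    AppId ! error
            end
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
IPhone = spawn(fun() ->
    receive
        {bill_withdraw, From, Amount} ->
            io:format("bill_withdraw, ~p~n", [Amount]),
            From ! need_to_cut_waste;
        error ->
            io:format("error~n")
    end
end).
Account = spawn(account, check_balance, []).
Vattenfall = fun(Pid, Amount) -> Pid ! {account_withdraw, Amount, IPhone} end.
Vattenfall(Account, 150). % IPhone prints bill_withdraw, 150 and replies need_to_cut_waste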
Out-Optional-In
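Vattenfall (Account) -> Recipient (Account): account_withdraw, 150
Recipient (Account) -> Consumer (Mobile Banking App): bill_withdraw, 150
Consumer (Mobile Banking App) -> Recipient (Account): response (optional)
%%% account.erl
-module(account).
-export([check_balance/0]).
check_balance() ->
    receive
        {account_withdraw, Amount, AppId} ->
            {Res, _Balance} = withdraw(Amount), % from db
            case Res of
                ok ->
                    AppId ! {bill_withdraw, self(), Amount},
                    receive
                        Reply -> io:format("app replied: ~p~n", [Reply])
                    after 1000 ->
                        ok % the reply is optional: carry on without it
                    end;
                error ->
                    AppId ! error
            end
    end.
%% Stub for the db call, assuming a starting balance of 3000.
withdraw(Amount) when Amount =< 3000 -> {ok, 3000 - Amount};
withdraw(_Amount) -> {error, 3000}.
%%% main (in the Erlang shell, after c(account).)
IPhone = spawn(fun() ->
    receive
        {bill_withdraw, From, Amount} ->
            io:format("bill_withdraw, ~p~n", [Amount]),
            From ! response; % the app may also stay silent
        error ->
            io:format("error~n")
    end
end).
Account = spawn(account, check_balance, []).
Vattenfall = fun(Pid, Amount) -> Pid ! {account_withdraw, Amount, IPhone} end.
Vattenfall(Account, 150). % IPhone prints bill_withdraw, 150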
Enough! Anything ready to use?
• ØMQ
• RabbitMQ
• ActiveMQ
• Kafka
Easiest Start - ØMQ
• A socket on steroids
• Super fast
• Available in a multitude of languages
• Supports MEP and even more
• Request – Reply
• Publish – Subscribe
• Push – Pull
• Exclusive Pair
ØMQ Request - Reply
require 'rubygems'
require 'ffi-rzmq'

context = ZMQ::Context.new(1)
puts "Starting Hello World server…"

# Socket to listen for clients
socket = context.socket(ZMQ::REP)
socket.bind("tcp://*:5555")

loop do
  # Wait for the next request from a client
  request = ''
  rc = socket.recv_string(request)
  puts "Received request. Data: #{request.inspect}"
  # Do some 'work'
  sleep 1
  # Send the reply back to the client
  socket.send_string("world")
end
ØMQ Push - Pull
• Ideal for delegating workload to worker threads.
• It can simplify asynchronous computation.
• Essentially a simple DAG (directed acyclic graph).
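The shape of the push-pull pattern can be shown without ØMQ itself. The sketch below uses plain Ruby threads and queues as stand-ins for PUSH and PULL sockets; the `:done` marker and the squaring "work" are assumptions made for the example.

```ruby
# Push-pull (pipeline) pattern with plain Ruby threads and queues
# standing in for PUSH and PULL sockets.
tasks   = Queue.new   # ventilator -> workers
results = Queue.new   # workers -> sink

workers = 3.times.map do
  Thread.new do
    # Each worker pulls tasks until it sees the :done marker.
    while (n = tasks.pop) != :done
      results << n * n
    end
  end
end

# The ventilator pushes work without knowing which worker takes it.
(1..10).each { |n| tasks << n }
workers.size.times { tasks << :done }
workers.each(&:join)

# The sink collects results; arrival order depends on scheduling.
squares = []
squares << results.pop until results.empty?
puts squares.sort.inspect
```

Data flows in one direction only (ventilator, workers, sink), which is why the topology forms a small DAG.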
ØMQ Exclusive Pair
• It can ensure the correct order of execution on a batch of data.
• Easier to ensure correctness of execution and validity of data.
• Can become overly complicated easily.
• Difficult to scale.
How far can we go with ØMQ?
• Really far!
• OpenStack uses it for messaging between cluster nodes.
• It has been scaled up to 168,000 cloud instances.
How far can we go with ØMQ? (cont.)
• It is also very popular in high-frequency trading (HFT)
applications, where it has been optimized heavily.
Kafka
Not Franz Kafka, Apache Kafka
What is Kafka?
• A distributed message queue.
• A high-throughput, low-latency platform handling real-time data
feeds.
• A message bus.
Message Bus Pattern
'just' a smart pipe
LinkedIn's perception of Kafka
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
Kafka Topics / Logs
• The most fundamental abstraction in Kafka: an append-only, totally
ordered sequence of records, ordered by time.
Distributed Topics
Credit: Cloudera
Log-structured data flow
• Multiple readers can subscribe to a log and receive the same
information, creating a small Message Bus.
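A toy model of a log with several independent readers, to make the idea concrete. The `Log` and `Reader` classes are illustrative only and are not Kafka's client API; the record strings are invented.

```ruby
# An append-only log with independent readers, loosely modelled on a
# single-partition topic. Class and method names are illustrative.
class Log
  def initialize
    @records = []
  end

  # Appending returns the offset of the new record.
  def append(record)
    @records << record
    @records.size - 1
  end

  # Reading does not remove anything: the log is unchanged.
  def read(offset, max = 10)
    @records[offset, max] || []
  end
end

class Reader
  attr_reader :seen

  def initialize(log)
    @log = log
    @offset = 0
    @seen = []
  end

  # Each reader tracks its own position in the log.
  def poll
    batch = @log.read(@offset)
    @offset += batch.size
    @seen.concat(batch)
    batch
  end
end

log = Log.new
search  = Reader.new(log)
metrics = Reader.new(log)

log.append("page_view:/home")
log.append("page_view:/jobs")
search.poll                      # search is up to date
log.append("click:apply")
search.poll                      # picks up only the new record
metrics.poll                     # reads all three from the start
puts search.seen == metrics.seen
```

Because reading never consumes a record, a slow or late reader sees exactly the same sequence as a fast one.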
Flows can be very extensive and form DAGs
DAGs are the basis for flow-based programming
What is flow-based programming (FBP)?
• A programming paradigm that uses a "data factory" metaphor for
designing and building applications.
• FBP defines applications as networks of "black box" processes, which
exchange data across predefined connections by message passing,
where the connections are specified externally to the processes.
• These black box processes can be reconnected endlessly to form
different applications without having to be changed internally. FBP is
thus naturally component-oriented.
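A minimal sketch of the FBP idea in Ruby, assuming queues as connections and threads as processes. The `component` helper and the `DONE` end-of-stream marker are hypothetical names introduced for this example.

```ruby
# Flow-based sketch: components are black boxes that read from an input
# queue and write to an output queue. The wiring lives outside them.
DONE = :done

# Build a component from a block; it knows nothing about its neighbours.
def component(input, output, &transform)
  Thread.new do
    while (packet = input.pop) != DONE
      result = transform.call(packet)
      output << result unless result.nil?
    end
    output << DONE   # propagate end-of-stream downstream
  end
end

# Connections are defined here, externally to the components.
source, a_to_b, sink = Queue.new, Queue.new, Queue.new
threads = [
  component(source, a_to_b) { |word| word.strip },
  component(a_to_b, sink)   { |word| word.empty? ? nil : word.upcase }
]

[" kafka ", "  ", "zmq"].each { |packet| source << packet }
source << DONE
threads.each(&:join)

output = []
while (packet = sink.pop) != DONE
  output << packet
end
puts output.inspect
```

Rewiring the application means changing only the queue assignments; the two blocks stay untouched.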
NoFlo
NoFlo (cont.)
Cloud Dataflow
System responsibility can be easily distributed
Advantages of the message bus
• Decoupling. Producers and consumers are independent and can evolve and innovate separately
at their own rate.
• Redundancy. Queues can persist messages until they are fully processed.
• Scalability. Scaling is achieved simply by adding more queue processors.
• Elasticity & Spikability. Queues soak up load until more resources can be brought online.
• Resiliency. Decoupling implies that failures are not linked. Messages can still be queued even if
there's a problem on the consumer side.
• Delivery Guarantees. Queues make sure a message will be consumed eventually and can even
implement higher-level properties such as at-most-once delivery.
• Ordering Guarantees. Coupled with publish and subscribe mechanisms, queues can be used
to provide message-ordering guarantees to consumers.
• Buffering. A queue acts as a buffer between writers and readers. Writers can write faster than
readers can read, which helps control the flow of processing through the entire system.
• Understanding Data Flow. By looking at the rate at which messages are processed you can
identify areas where performance may be improved.
• Asynchronous Communication. Writers and readers are independent of each other, so writers
can just fire and forget while readers process work at their own pace.
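The buffering and ordering points can be illustrated with a bounded in-process queue. This is a sketch with Ruby's `SizedQueue`, not a real message bus; the capacity of 5 and the artificial delay in the reader are arbitrary choices.

```ruby
# A bounded queue between a fast writer and a slow reader. The writer
# runs ahead until the buffer is full, then blocks until space frees up.
buffer = SizedQueue.new(5)
consumed = []

reader = Thread.new do
  while (item = buffer.pop) != :done
    sleep 0.001          # simulate slow processing
    consumed << item
  end
end

max_backlog = 0
20.times do |i|
  buffer << i            # blocks while 5 items are already waiting
  max_backlog = [max_backlog, buffer.size].max
end
buffer << :done
reader.join

puts "consumed #{consumed.size} items, backlog never exceeded #{max_backlog}"
```

The writer never needs to know how fast the reader is; the queue absorbs the difference and preserves the order in which items were written.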
A Showcase: Kafka at LinkedIn
• The company's cornerstone for transferring high-velocity data.
• In total, the Kafka ecosystem at LinkedIn receives over 800
billion messages per day, which amounts to over 175 terabytes of data.
• Over 650 terabytes of messages are then consumed daily.
• At the busiest times of day, over 13 million messages per second are
received, or 2.75 gigabytes of data per second.
• To handle all these messages, LinkedIn runs over 1100 Kafka brokers
organized into more than 60 clusters.
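A few derived figures help put these numbers in perspective. The calculation below is a back-of-the-envelope sketch that treats the quoted "over" values as exact and assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes).

```ruby
# Back-of-the-envelope figures derived from the numbers above,
# assuming decimal units and treating lower bounds as exact values.
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day    = 800e9
bytes_in_per_day    = 175e12
bytes_out_per_day   = 650e12
peak_messages_per_s = 13e6
peak_bytes_per_s    = 2.75e9

avg_messages_per_s = messages_per_day / SECONDS_PER_DAY
avg_message_bytes  = bytes_in_per_day / messages_per_day
peak_message_bytes = peak_bytes_per_s / peak_messages_per_s
read_fan_out       = bytes_out_per_day / bytes_in_per_day

puts format("average rate:      %.2f million messages/s", avg_messages_per_s / 1e6)
puts format("average message:   %.1f bytes", avg_message_bytes)
puts format("message at peak:   %.1f bytes", peak_message_bytes)
puts format("read fan-out:      %.2fx", read_fan_out)
```

Under these assumptions the average message is a little over 200 bytes, the peak rate is roughly 1.4 times the daily average, and each byte written is read between three and four times.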
LinkedIn's data platform before and after
Software Architectures, Week 4 - Message-based Architectures, Message Bus
