Alvaro Videla, Building a Distributed Data Ingestion System with RabbitMQ
In this talk I am going to show how to build a system that can ingest data produced in geographically separate locations (think AWS and its many regions) and replicate it to a central cluster where it can be further processed and analysed. I will present an example of how to build such a system by using RabbitMQ Federation to replicate data across AWS regions and RabbitMQ's support for many protocols to produce and consume data.
To help with scalability, I am going to show an interesting way to implement sharded queues with RabbitMQ by using the Consistent Hash Exchange.

Published in: Technology

  1. Alvaro Videla • Developer Advocate at Pivotal / RabbitMQ • Co-Author of RabbitMQ in Action • Creator of the RabbitMQ Simulator • Blogs about RabbitMQ Internals: http://videlalvaro.github.io/internals.html • @old_sound, alvaro@rabbitmq.com, github.com/videlalvaro
  2. About Me • Co-authored RabbitMQ in Action: http://bit.ly/rabbitmq
  3. About this Talk • Exploratory Talk • A 'what could be done' talk instead of 'this is how you do it'
  4. Agenda • Intro to RabbitMQ • The Problem • Solution Proposal • Improvements
  5. https://twitter.com/spacemanaki/status/514590885523505153
  6. What is RabbitMQ
  7. RabbitMQ
  8. RabbitMQ
  9. RabbitMQ • Multi Protocol Messaging Server
  10. RabbitMQ • Multi Protocol Messaging Server • Open Source (MPL)
  11. RabbitMQ • Multi Protocol Messaging Server • Open Source (MPL) • Polyglot
  12. RabbitMQ • Multi Protocol Messaging Server • Open Source (MPL) • Polyglot • Written in Erlang/OTP
  13. Multi Protocol http://bit.ly/rmq-protocols
  14. Community Plugins http://www.rabbitmq.com/community-plugins.html
  15. Polyglot
  16. Polyglot
  17. Polyglot • Java
  18. Polyglot • Java • node.js
  19. Polyglot • Java • node.js • Erlang
  20. Polyglot • Java • node.js • Erlang • PHP
  21. Polyglot • Java • node.js • Erlang • PHP • Ruby
  22. Polyglot • Java • node.js • Erlang • PHP • Ruby • .Net
  23. Polyglot • Java • node.js • Erlang • PHP • Ruby • .Net • Haskell • Python
  24. Polyglot • Even COBOL!!!11
  25. Some users of RabbitMQ
  26. Some users of RabbitMQ • Instagram
  27. Some users of RabbitMQ • Instagram • Indeed.com
  28. Some users of RabbitMQ • Instagram • Indeed.com • Telefonica
  29. Some users of RabbitMQ • Instagram • Indeed.com • Telefonica • Mercado Libre
  30. Some users of RabbitMQ • Instagram • Indeed.com • Telefonica • Mercado Libre • NHS
  31. Some users of RabbitMQ • Instagram • Indeed.com • Telefonica • Mercado Libre • NHS • Mozilla
  32. The New York Times on RabbitMQ: "This architecture - Fabrik - has dozens of RabbitMQ instances spread across 6 AWS zones in Oregon and Dublin. Upon launch today, the system autoscaled to ~500,000 users. Connection times remained flat at ~200ms." http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2014-January/032943.html
  33. http://www.rabbitmq.com/download.html Unix - Mac - Windows
  34. Messaging with RabbitMQ • A demo with the RabbitMQ Simulator https://github.com/RabbitMQSimulator/RabbitMQSimulator
  35. http://tryrabbitmq.com
  36. RabbitMQ Simulator
  37. The Problem
  38. Distributed Application • App • App • App • App
  39. Distributed Application • App • App • App • App
  40. Data Producer: Obtain a Channel
      {ok, Connection} =
          amqp_connection:start(#amqp_params_network{host = "localhost"}),
      {ok, Channel} = amqp_connection:open_channel(Connection),
  41. Data Producer: Declare an Exchange
      amqp_channel:call(Channel, #'exchange.declare'{exchange = <<"events">>,
                                                     type = <<"direct">>}),
  42. Data Producer: Publish a message
      %% delivery_mode = 2 marks the message as persistent
      amqp_channel:cast(Channel,
                        #'basic.publish'{exchange = <<"events">>},
                        #amqp_msg{props = #'P_basic'{delivery_mode = 2},
                                  payload = <<"Hello Federation">>}),
  43. Data Consumer: Obtain a Channel
      {ok, Connection} =
          amqp_connection:start(#amqp_params_network{host = "localhost"}),
      {ok, Channel} = amqp_connection:open_channel(Connection),
  44. Data Consumer: Declare Queue and bind it
      amqp_channel:call(Channel, #'exchange.declare'{exchange = <<"events">>,
                                                     type = <<"direct">>}),
      %% an exclusive queue with a server-generated name
      #'queue.declare_ok'{queue = Queue} =
          amqp_channel:call(Channel, #'queue.declare'{exclusive = true}),
      amqp_channel:call(Channel, #'queue.bind'{exchange = <<"events">>,
                                               queue = Queue}),
  45. Data Consumer: Start a consumer
      amqp_channel:subscribe(Channel, #'basic.consume'{queue = Queue,
                                                       no_ack = true}, self()),
      receive
          #'basic.consume_ok'{} -> ok
      end,
      loop(Channel).
  46. Data Consumer: Process messages
      loop(Channel) ->
          receive
              {#'basic.deliver'{}, #amqp_msg{payload = Body}} ->
                  io:format(" [x] ~p~n", [Body]),
                  loop(Channel)
          end.
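
The producer and consumer slides rely on how a direct exchange routes messages: a message reaches every queue whose binding key equals its routing key (both are empty in the slides). The following is a minimal in-memory sketch of that rule in Python. It is not a RabbitMQ client, and the class name `DirectExchange` and the two consumer lists are hypothetical names used only for illustration.

```python
from collections import defaultdict


class DirectExchange:
    """Toy in-memory model of a direct exchange (not a RabbitMQ client)."""

    def __init__(self, name):
        self.name = name
        # binding key -> list of bound queues (each queue is a plain list)
        self.bindings = defaultdict(list)

    def bind(self, queue, binding_key=""):
        self.bindings[binding_key].append(queue)

    def publish(self, payload, routing_key=""):
        # Deliver to every queue whose binding key equals the routing key.
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            queue.append(payload)
        return len(targets)


events = DirectExchange("events")
consumer_a, consumer_b = [], []
events.bind(consumer_a)   # empty binding key, as in the slides
events.bind(consumer_b)
delivered = events.publish(b"Hello Federation")
print(delivered, consumer_a, consumer_b)
```

Because each consumer in the slides declares its own exclusive queue and binds it with the same (empty) key, every consumer gets its own copy of each message.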
  47. Ad-hoc solution
  48. A process that replicates data to the remote server
  49. Possible issues
  50. Possible issues • Remote server is offline
  51. Possible issues • Remote server is offline • Prevent unbounded local buffers
  52. Possible issues • Remote server is offline • Prevent unbounded local buffers • Prevent message loss
  53. Possible issues • Remote server is offline • Prevent unbounded local buffers • Prevent message loss • Prevent unnecessary message replication
  54. Possible issues • Remote server is offline • Prevent unbounded local buffers • Prevent message loss • Prevent unnecessary message replication • No need for those messages on remote server
  55. Possible issues • Remote server is offline • Prevent unbounded local buffers • Prevent message loss • Prevent unnecessary message replication • No need for those messages on remote server • Messages that became stale
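
To make the issues listed above concrete, here is a rough Python sketch of what an ad-hoc replicating process would have to handle by itself: a bounded local buffer, an offline remote, and stale messages. Everything in it is an assumption for illustration (the class `BoundedReplicator`, the `send` callable, the limits); it is not how RabbitMQ Federation is implemented.

```python
import time
from collections import deque


class BoundedReplicator:
    """Hypothetical ad-hoc forwarder: buffer locally, flush when the remote is up."""

    def __init__(self, send, max_buffered=1000, max_age_seconds=60.0,
                 clock=time.monotonic):
        # send(payload) is assumed to raise ConnectionError when the remote is down.
        self.send = send
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        # A deque with maxlen drops its oldest entry when full,
        # which keeps the local buffer bounded.
        self.buffer = deque(maxlen=max_buffered)

    def enqueue(self, payload):
        self.buffer.append((self.clock(), payload))

    def flush(self):
        sent = 0
        while self.buffer:
            enqueued_at, payload = self.buffer[0]
            if self.clock() - enqueued_at > self.max_age_seconds:
                self.buffer.popleft()   # stale: not worth replicating
                continue
            try:
                self.send(payload)
            except ConnectionError:
                break                   # remote offline: keep the message, retry later
            self.buffer.popleft()       # remove only after a successful send
            sent += 1
        return sent
```

Even this small sketch trades one issue against another: bounding the buffer means the oldest messages are lost when the remote stays offline for too long.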
  56. Can we do better?
  57. RabbitMQ Federation
  58. RabbitMQ Federation
  59. RabbitMQ Federation • Supports replication across different administrative domains
  60. RabbitMQ Federation • Supports replication across different administrative domains • Supports mix of Erlang and RabbitMQ versions
  61. RabbitMQ Federation • Supports replication across different administrative domains • Supports mix of Erlang and RabbitMQ versions • Supports Network Partitions
  62. RabbitMQ Federation • Supports replication across different administrative domains • Supports mix of Erlang and RabbitMQ versions • Supports Network Partitions • Specificity - not everything has to be federated
  63. RabbitMQ Federation
  64. RabbitMQ Federation
  65. RabbitMQ Federation
  66. RabbitMQ Federation • It's a RabbitMQ Plugin
  67. RabbitMQ Federation • It's a RabbitMQ Plugin • Internally uses Queue and Exchange Decorators
  68. RabbitMQ Federation • It's a RabbitMQ Plugin • Internally uses Queue and Exchange Decorators • Managed using Parameters and Policies
  69. Enabling the Plugin
      rabbitmq-plugins enable rabbitmq_federation
  70. Enabling the Plugin
      rabbitmq-plugins enable rabbitmq_federation
      rabbitmq-plugins enable rabbitmq_federation_management
  71. Federating an Exchange
      rabbitmqctl set_parameter federation-upstream my-upstream '{"uri":"amqp://server-name","expires":3600000}'
  72. Federating an Exchange
      rabbitmqctl set_parameter federation-upstream my-upstream '{"uri":"amqp://server-name","expires":3600000}'
      rabbitmqctl set_policy --apply-to exchanges federate-me "^amq." '{"federation-upstream-set":"all"}'
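
The policy's third argument is a regular expression matched against exchange names, so it decides which exchanges get federated. The short Python check below illustrates the matching; the exchange names are made up, and Python's `re` module stands in for the broker's regex engine (an assumption that holds for a pattern this simple). It also shows that an unescaped `.` matches any character, while `\.` matches only a literal dot.

```python
import re

# Hypothetical exchange names, used only to exercise the patterns.
exchanges = ["amq.direct", "amq.topic", "amqX", "events", "my.amq.copy"]

loose = re.compile(r"^amq.")    # pattern as written on the slide
strict = re.compile(r"^amq\.")  # dot escaped: literal "." only

matched_loose = [name for name in exchanges if loose.search(name)]
matched_strict = [name for name in exchanges if strict.search(name)]

print(matched_loose)   # ['amq.direct', 'amq.topic', 'amqX']
print(matched_strict)  # ['amq.direct', 'amq.topic']
```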
  73. Federating an Exchange
  74. Configuring Federation
  75. Config Options
      rabbitmqctl set_parameter federation-upstream name 'json-object'
  76. Config Options
      rabbitmqctl set_parameter federation-upstream name 'json-object'
      json-object: {"uri": "amqp://server-name/", "prefetch-count": 1000, "reconnect-delay": 1, "ack-mode": "on-confirm"}
      http://www.rabbitmq.com/federation-reference.html
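
Since the upstream definition must be valid JSON and survive shell quoting, it can help to generate it rather than type it by hand. This is a small Python sketch under that assumption: the keys and values are the ones from the slide, `my-upstream` is the example upstream name from the earlier slides, and the script only prints the command rather than running it.

```python
import json
import shlex

# Keys and values mirror the slide above.
upstream = {
    "uri": "amqp://server-name/",
    "prefetch-count": 1000,
    "reconnect-delay": 1,
    "ack-mode": "on-confirm",
}

definition = json.dumps(upstream)
command = " ".join(
    ["rabbitmqctl", "set_parameter", "federation-upstream", "my-upstream",
     shlex.quote(definition)]  # single-quote the JSON for the shell
)
print(command)
```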
  77. Prevent unbounded buffers • expires: N (ms) • message-ttl: N (ms)
  78. Prevent message forwarding • max-hops: N
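
To see why a hop limit matters, consider a deliberately simplified model: brokers arranged in a ring, each forwarding to the next, with a message discarded once it has crossed `max_hops` links. The function below is a Python sketch of that model only. The ring topology and the function name are assumptions, and it ignores anything else the federation plugin may do to avoid loops.

```python
def forwarded_to(ring_size, origin, max_hops):
    """Brokers a message is forwarded to, in order, in the simplified ring model.

    Broker i forwards to broker (i + 1) % ring_size; the message is
    discarded after it has crossed max_hops links.
    """
    visited = []
    current = origin
    for _ in range(max_hops):
        current = (current + 1) % ring_size
        visited.append(current)
    return visited


print(forwarded_to(ring_size=3, origin=0, max_hops=2))  # [1, 2]
print(forwarded_to(ring_size=3, origin=0, max_hops=4))  # [1, 2, 0, 1]
```

With three brokers, two hops are enough to reach every other broker; a larger limit in this model only sends the message back to brokers that already have it.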
  79. Speed vs No Message Loss • ack-mode: on-confirm • ack-mode: on-publish • ack-mode: no-ack
  80. AMQP URI: amqp://user:pass@host:10000/vhost http://www.rabbitmq.com/uri-spec.html
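
The AMQP URI packs credentials, host, port, and virtual host into one string. As a quick illustration, Python's standard `urllib.parse` can split the example URI into those parts; the dictionary keys are names chosen here for readability, not RabbitMQ terminology.

```python
from urllib.parse import unquote, urlsplit

parts = urlsplit("amqp://user:pass@host:10000/vhost")
settings = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    # The path minus its leading slash names the virtual host;
    # percent-encoded characters are decoded.
    "vhost": unquote(parts.path[1:]),
}
print(settings)
```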
  81. Config can be applied via • CLI using rabbitmqctl • HTTP API • RabbitMQ Management Interface
  82. RabbitMQ Federation
  83. Some Queueing Theory http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/
  84. RabbitMQ BasicQos Simulator
  85. Prevent Unbounded Buffers • λ = mean arrival rate • μ = mean service rate • If λ > μ, what happens? https://www.rabbitmq.com/blog/2014/01/23/preventing-unbounded-buffers-with-rabbitmq/
  86. Prevent Unbounded Buffers • λ = mean arrival rate • μ = mean service rate • If λ > μ, what happens? Queue length goes to infinity over time. https://www.rabbitmq.com/blog/2014/01/23/preventing-unbounded-buffers-with-rabbitmq/
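
The claim that the queue grows without bound can be checked with a tiny deterministic simulation. The Python sketch below treats λ and μ as constant rates in messages per second (real traffic is random, so this is only the average behaviour); the function name and the example rates are illustrative.

```python
def backlog_over_time(arrival_rate, service_rate, seconds, start=0):
    """Queue length after each one-second tick under constant rates."""
    lengths = []
    queued = start
    for _ in range(seconds):
        # Messages arrive, messages are served; the queue cannot go negative.
        queued = max(0, queued + arrival_rate - service_rate)
        lengths.append(queued)
    return lengths


print(backlog_over_time(arrival_rate=120, service_rate=100, seconds=5))
# [20, 40, 60, 80, 100]
print(backlog_over_time(arrival_rate=80, service_rate=100, seconds=5))
# [0, 0, 0, 0, 0]
```

When arrivals outpace service, the backlog grows by λ − μ every second, which is why settings such as `expires` and `message-ttl` are needed to cap it.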
  87. Recommended Reading • Performance Modeling and Design of Computer Systems: Queueing Theory in Action
  88. Scaling the Setup
  89. The Problem
  90. The Problem • Queue contents live on the node where the queue was declared
  91. The Problem • Queue contents live on the node where the queue was declared • A cluster can access the queue from every connected node
  92. The Problem • Queue contents live on the node where the queue was declared • A cluster can access the queue from every connected node • Each queue is an Erlang process (tied to one core)
  93. The Problem • Queue contents live on the node where the queue was declared • A cluster can access the queue from every connected node • Each queue is an Erlang process (tied to one core) • Adding more nodes doesn't really help
  94. Enter Sharded Queues
  95. Enter Sharded Queues
  96. Pieces of the Puzzle • modulo hash exchange (consistent hash works as well) • good ol' queues
  97. Sharded Queues
  98. Sharded Queues
  99. Sharded Queues
  100. Sharded Queues
  101. Sharded Queues • Declare Queues with name: nodename.queuename.index
  102. Sharded Queues • Declare Queues with name: nodename.queuename.index • Bind the queues to a consistent hash exchange
  103. Sharded Queues • Declare Queues with name: nodename.queuename.index • Bind the queues to a partitioner exchange • Transparent to the consumer (virtual queue name)
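
The sharding idea can be sketched in a few lines: generate queue names following the `nodename.queuename.index` scheme and pick one per message with a modulo hash of the routing key. The Python below is only an illustration of that idea. The helper names, the node name `rabbit@node1`, and the use of CRC32 as the hash are all assumptions; the slides do not say which hash function the exchange uses.

```python
import zlib


def shard_queues(node, queue, shards):
    """Queue names following the nodename.queuename.index scheme."""
    return [f"{node}.{queue}.{index}" for index in range(shards)]


def route(routing_key, queues):
    # CRC32 is a stand-in hash chosen because it is deterministic
    # across runs; the real exchange may hash differently.
    bucket = zlib.crc32(routing_key.encode("utf-8")) % len(queues)
    return queues[bucket]


queues = shard_queues("rabbit@node1", "images", 4)
print(queues)
print(route("user-42", queues))
```

The same routing key always lands on the same shard, so per-key ordering is kept within a shard while the load is spread over several queue processes.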
  104. We need more scale!
  105. Federated Queues
  106. Federated Queues • Load-balance messages across federated queues • Only moves messages when needed
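
"Only moves messages when needed" can be read as: local messages are consumed first, and messages are pulled from the upstream queue only to cover consumer demand that the local queue cannot satisfy. The Python function below is a toy model of that reading, with plain lists standing in for queues; the function name and the exact policy are assumptions, not the plugin's implementation.

```python
def transfer_on_demand(upstream, downstream, consumer_demand):
    """Serve consumer_demand messages at the downstream queue.

    Returns the delivered messages and how many had to be moved
    from the upstream queue.
    """
    delivered = []
    moved = 0
    while len(delivered) < consumer_demand:
        if downstream:
            delivered.append(downstream.pop(0))   # local message: no transfer
        elif upstream:
            delivered.append(upstream.pop(0))     # pull from upstream on demand
            moved += 1
        else:
            break                                 # nothing left anywhere
    return delivered, moved


upstream_queue = ["a", "b", "c"]
downstream_queue = ["x"]
print(transfer_on_demand(upstream_queue, downstream_queue, consumer_demand=3))
# (['x', 'a', 'b'], 2)
```

With no consumer demand downstream, nothing moves at all, which is the difference from a federated exchange that copies every matching message.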
  107. Federating a Queue
      rabbitmqctl set_parameter federation-upstream my-upstream '{"uri":"amqp://server-name","expires":3600000}'
  108. Federating a Queue
      rabbitmqctl set_parameter federation-upstream my-upstream '{"uri":"amqp://server-name","expires":3600000}'
      rabbitmqctl set_policy --apply-to queues federate-me "^images." '{"federation-upstream-set":"all"}'
  109. With RabbitMQ we can
  110. With RabbitMQ we can • Ingest data using various protocols: AMQP, MQTT and STOMP
  111. With RabbitMQ we can • Ingest data using various protocols: AMQP, MQTT and STOMP • Distribute that data globally using Federation
  112. With RabbitMQ we can • Ingest data using various protocols: AMQP, MQTT and STOMP • Distribute that data globally using Federation • Scale up using Sharding
  113. With RabbitMQ we can • Ingest data using various protocols: AMQP, MQTT and STOMP • Distribute that data globally using Federation • Scale up using Sharding • Load balance consumers with Federated Queues
  114. Credits • world map: wikipedia.org • federation diagrams: rabbitmq.com
  115. Questions?
  116. Thanks • Alvaro Videla - @old_sound
