Scaling RabbitMQ @ Goldman Sachs
RabbitMQ Summit 2018
Jonathan Skrzypek

Keynote: Scaling RabbitMQ at Goldman Sachs - Jonathan Skrzypek

Watch full lecture on YouTube: https://www.youtube.com/watch?v=D9H7i6Ye_to&list=PLDUzG2yLXrU4Lz33ZzSdHyfqdHJ8Zum5A

Goldman Sachs runs hundreds of applications that communicate with each other. The Data Management and Distribution group provides messaging middleware services to the firm’s ecosystem. This talk covers why and how we adopted RabbitMQ as a first-class citizen in our messaging product portfolio. A significant proportion of application teams at Goldman Sachs were used to traditional guaranteed messaging systems, so moving to RabbitMQ was, and still is, a paradigm shift in how applications interact with a messaging layer. We will touch on the challenges of delivering RabbitMQ as a service at enterprise scale, including deployment model, monitoring and telemetry, achieving data consistency, and developer awareness.
--

The first RabbitMQ Summit connected RabbitMQ users and developers from around the world in London on November 12, 2018. Learn what's happening in and around RabbitMQ, and how top companies utilize RabbitMQ to power their services.

https://www.rabbitmqsummit.com

RabbitMQ Summit was organized by:
- Erlang Solutions, offering world-leading RabbitMQ Consultancy, Support, Health Checks & Tuning solutions https://www.erlang-solutions.com/
- CloudAMQP, offering fully managed RabbitMQ clusters https://www.cloudamqp.com

RabbitMQ Summit 2018 was sponsored by the following companies.

Platinum sponsors:
Pivotal
LShift

Gold sponsors:
Trifork
AWS

Silver sponsor:
Cogin Queue Explorer

Published in: Technology


  1. Scaling RabbitMQ @ Goldman Sachs, RabbitMQ Summit 2018, Jonathan Skrzypek
  2. Messaging Engineering
     • Messaging infrastructure services for applications across the firm
     • Portfolio of products to accommodate various use cases
       • log shipping
       • trade booking
       • payments
       • regulatory flows
  3. A bit of history
  4. Back in 2013
     • Reliable multicast messaging
     • Fire-and-forget firehose
     • JMS brokers for guaranteed messaging
     • XA transactions with synchronous storage replication
  5. Back in 2013
  6. Back in 2013 (diagram labels: Guaranteed Delivery; Reliable Delivery)
  7. What then? Resilient messaging
  8. Why RabbitMQ
     • Highly available
     • Many messaging patterns
     • Many language bindings
     • Observability
     • Manageability
  9. How?
  10. Building an offering
  11. No one-size-fits-all BUT...
  12. Deployment model
     • 3-node clusters
     • DNS round-robin
     • Cluster(s) per application
     • AMQP 0-9-1
     • Consistency across nodes
     • Partition handling: Pause Minority
     • Mirrored queue with automatic synchronization
  13. Flexibility
     • Programmatically define your topology
  14. Topology ownership
     • Applications to maintain their topology
     • Ability to reconstruct topology from scratch
  15. Namespacing
     • Message Domain concept
     • Virtual Host
     • Prefix for queues and exchanges
     • Mirroring
     • Queue Synchronization Mode
     • Entitled Users
  16. Namespacing
     • A given user can access one or multiple domains
     • Domain access READ/WRITE/CONFIGURE
     • Leverages regular expressions WARNINGS.* | ERRORS.*
  17. Developer awareness
  18. Developer awareness
     • Write smarter applications
     • Idempotent consumers
     • No more “one blocking call and you’re done”
     • Asynchronous design
     • Use publish confirms if you care about your data
  19. Developer awareness
     • Publishers need to keep track
     “The client currently does not perform any internal buffering of such outgoing messages. It is an application developer's responsibility to keep track of such messages and republish them”
  20. Developer awareness
     • Implement listeners!
     • Confirm listener
     • Return listener for unroutable messages
     • Shutdown listener
     • Recovery listener
  21. Deploy and Manage
  22. Automated provisioning (diagram labels: Resource Manager; Run this; Which cluster?; What config?)
  23. Automated provisioning
  24. Configuration management
     • Central Inventory
     • Minimize deviations
     • Inventory vs Actual consistency checks before any change
  25. Telemetry is your friend
     • Don’t fly blind!
     • Collecting metrics from management API
     • Automated dashboard generation
  26. Telemetry
  27. When things go wrong
     • Single queue workloads
     • Large amounts of pending messages
     • Memory usage
     • Swing effect with unpredictable performance
     • Shutdown/Start-up order
  28. Where are we now
     • 225 clusters
     • 170 applications
     • Mix of 3.6.6 and 3.6.16
  29. What’s next?
     • Fully self-service
     • Replication
     • Federation? Something else?
     • Peer discovery
  30. Visit us online at: goldmansachs.com/careers
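
Slides 18 to 20 ask publishers to use publish confirms and to keep track of unconfirmed messages themselves. The bookkeeping this implies might look like the sketch below, written in plain Python with no client library involved. The `ConfirmTracker` class and its method names are hypothetical, and the broker's acks and nacks are simulated by direct calls. It assumes the usual AMQP 0-9-1 confirm behaviour: each published message gets an increasing sequence number, and an ack or nack with `multiple` set covers every outstanding number up to and including the given one.

```python
from typing import Dict, List


class ConfirmTracker:
    """Hypothetical helper: remembers published messages until the broker confirms them."""

    def __init__(self) -> None:
        self._next_seq = 1                         # confirm sequence numbers start at 1
        self._outstanding: Dict[int, bytes] = {}   # seq -> body still awaiting a confirm
        self.to_republish: List[bytes] = []        # bodies the broker nacked

    def on_publish(self, body: bytes) -> int:
        """Record a message just handed to the client; return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._outstanding[seq] = body
        return seq

    def _settle(self, seq: int, multiple: bool) -> List[bytes]:
        # With multiple=True the confirm covers every outstanding number <= seq.
        if multiple:
            seqs = [s for s in sorted(self._outstanding) if s <= seq]
        else:
            seqs = [seq]
        return [self._outstanding.pop(s) for s in seqs if s in self._outstanding]

    def on_ack(self, seq: int, multiple: bool = False) -> None:
        """Broker took responsibility for the message(s): forget them."""
        self._settle(seq, multiple)

    def on_nack(self, seq: int, multiple: bool = False) -> None:
        """Broker could not handle the message(s): queue them for republishing."""
        self.to_republish.extend(self._settle(seq, multiple))

    def unconfirmed(self) -> List[bytes]:
        """Bodies published but neither acked nor nacked yet, in publish order."""
        return [self._outstanding[s] for s in sorted(self._outstanding)]


if __name__ == "__main__":
    tracker = ConfirmTracker()
    for body in (b"a", b"b", b"c", b"d"):
        tracker.on_publish(body)
    tracker.on_ack(2, multiple=True)   # confirms sequence numbers 1 and 2
    tracker.on_nack(3)                 # broker rejected message 3
    print(tracker.to_republish)        # [b'c']
    print(tracker.unconfirmed())       # [b'd']
```

When a shutdown or recovery listener fires, everything returned by `unconfirmed()` is a candidate for republishing. Some of those messages may already have reached the broker, which is one reason the same slide asks for idempotent consumers.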

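
Slide 25 collects metrics from the management API, and slide 27 lists large amounts of pending messages and memory usage among the things that go wrong. A check over such metrics could be sketched as follows. It works on an already-decoded response in the shape returned by the management plugin's `GET /api/queues` endpoint: a JSON array of queue objects with fields such as `name`, `vhost`, `messages_ready`, `messages_unacknowledged` and `memory`. The function name, the thresholds and the sample data are made up for illustration, and nothing here reflects the dashboards or limits actually used at Goldman Sachs.

```python
from typing import Any, Dict, Iterable, List


def find_hot_queues(
    queues: Iterable[Dict[str, Any]],
    max_pending: int = 100_000,                   # illustrative threshold, not from the talk
    max_memory_bytes: int = 512 * 1024 * 1024,    # illustrative threshold, not from the talk
) -> List[str]:
    """Return one human-readable finding per queue that exceeds a threshold."""
    findings = []
    for q in queues:
        # Pending = waiting for delivery + delivered but not yet acknowledged.
        pending = q.get("messages_ready", 0) + q.get("messages_unacknowledged", 0)
        reasons = []
        if pending > max_pending:
            reasons.append(f"{pending} pending messages")
        if q.get("memory", 0) > max_memory_bytes:
            reasons.append(f"{q['memory']} bytes of memory")
        if reasons:
            findings.append(f"{q['vhost']}/{q['name']}: " + ", ".join(reasons))
    return findings


if __name__ == "__main__":
    # Hand-written sample standing in for a decoded /api/queues response.
    sample = [
        {"name": "orders", "vhost": "trading", "messages_ready": 150_000,
         "messages_unacknowledged": 20, "memory": 1024},
        {"name": "audit", "vhost": "trading", "messages_ready": 3,
         "messages_unacknowledged": 0, "memory": 2048},
    ]
    for line in find_hot_queues(sample):
        print(line)   # trading/orders: 150020 pending messages
```

Keeping the check a pure function over decoded JSON leaves the HTTP call, authentication and scheduling to whatever collector is already in place, and makes the thresholds easy to test against recorded responses.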