Scalable Persistent Message Brokering
with WSO2 Message Broker
Srinath Perera
Senior Software Architect
WSO2 Inc.
Outline
• Understanding Messaging
• Scalable Messaging
• WSO2 MB Architecture
   •   Distributed Pub/Sub architecture
   •   Distributed Queues architecture
• Use cases
• Conclusion




Photo by John Trainor on Flickr, http://www.flickr.com/photos/trainor/2902023575/, licensed under CC
What is Messaging?
• We often program and design
  distributed systems with RPC-style
  communication (e.g. Web
  Services, Thrift, REST)
• RPC communication is
   •   Request/Response (there is
       always a response)
   •   Synchronous (the client waits for
       the response)
   •   Non-persistent (the message is lost if
       something fails)
• But there are 7 other possibilities
  (the other combinations of these three properties)
Messaging Systems in the Real World
• Sensor networks
• Monitoring/
  Surveillance
• Business Activity
  Monitoring
• Job Scheduling
  Systems
• Social Networks

  Photos: http://www.flickr.com/photos/imuttoo/4257813689/ by Ian Muttoo,
  http://www.flickr.com/photos/eastcapital/4554220770/,
  http://www.flickr.com/photos/patdavid/4619331472/ by Pat David (licensed under CC),
  http://www.fotopedia.com/items/flickr-2548697541
Why Messaging?




Messaging Systems

    • Message Broker(s) as
      the middlemen
    • There are two main
      models
          •    Queues
          •    Publish/Subscribe




http://www.geograph.org.uk/photo/2639458 and http://www.geograph.org.uk/photo/1138150
Distributed Queues



• A queue in the “Network”
• API Operations
   •   Put(M) – put a message
   •   Get() – get a message (dequeue)
   •   Subscribe() – send me a message when there is one
• E.g. Amazon SQS (Simple Queue Service)
• Use cases: job queues, store and process
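The three operations above can be sketched as a small in-memory class. This is only an illustration of the API shape: the class name `SimpleQueue`, the round-robin choice among subscribers, and the callback signature are assumptions, not the interface of SQS or WSO2 MB.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class SimpleQueue:
    """In-memory stand-in for a broker-hosted queue (illustrative only)."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()
        self._subscribers: List[Callable[[str], None]] = []
        self._next = 0  # index used for round-robin delivery

    def put(self, message: str) -> None:
        """Put(M): hand the message to one subscriber if any, else store it."""
        if self._subscribers:
            # Queue semantics: each message goes to exactly one consumer.
            callback = self._subscribers[self._next % len(self._subscribers)]
            self._next += 1
            callback(message)
        else:
            self._messages.append(message)

    def get(self) -> Optional[str]:
        """Get(): dequeue the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

    def subscribe(self, callback: Callable[[str], None]) -> None:
        """Subscribe(): push stored and future messages to the callback."""
        self._subscribers.append(callback)
        while self._messages:
            callback(self._messages.popleft())
```

Get() is the pull style (the consumer asks), while Subscribe() is the push style (the queue calls the consumer when a message arrives).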
Publish/ Subscribe




• There is a topic space based on interest groups
• Publishers send messages to brokers
• Subscribers register their interest
• Brokers match events (messages) and deliver them to all
  interested parties
• Use cases: surveillance, monitoring
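A minimal sketch of the matching step described above, assuming exact topic-name matching and in-process callbacks. The `TopicBroker` class and its method names are hypothetical and are not the WSO2 MB or JMS API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class TopicBroker:
    """Toy broker: delivers each published message to every subscriber of its topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to all interested parties; return how many received it."""
        interested = self._subscribers.get(topic, [])
        for callback in interested:
            callback(message)
        return len(interested)
```

Unlike a queue, where each message goes to one consumer, every subscriber of the topic receives its own delivery of the message.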
Messaging APIs and Message Formats and Standards




Scaling Message Brokers
• There are several dimensions of scale
   •   Number of messages
   •   Number of queues
   •   Size of messages
• Scaling Pub/Sub is relatively easy
   •   E.g. Narada Broker, Padres
• Scaling Distributed Queues is harder



Scaling Distributed Queues
Scaling Distributed Queues (Contd.)
Topology                Pros                       Cons                        Supporting Systems

Master-Slave            Supports HA                No scalability              Qpid, ActiveMQ, RabbitMQ
Queue Distribution      Scales to a large number   Does not scale to a large   RabbitMQ
                        of queues                  number of messages in a
                                                   single queue
Cluster Connections     Supports HA                Might not support in-order  HornetQ
                                                   delivery; logic runs on
                                                   the client side and takes
                                                   local decisions
Broker/Queue Networks   Load balancing and         Fair load balancing is      ActiveMQ
                        distribution               hard

MB2 Messaging Architecture




WSO2 MB




Cassandra and ZooKeeper
• Cassandra
  •   NoSQL store with a column-family data model
  •   Highly scalable (multiple nodes), highly available, with no single point of
      failure
  •   SQL-like query language (CQL, from 0.8) and search through secondary
      indexes (but no JOINs, GROUP BY, etc.)
  •   Tunable consistency and replication
  •   Very high write throughput and good read throughput
• ZooKeeper
  •   Scalable, fault-tolerant distributed coordination framework
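As a worked example of what tunable consistency means: with N replicas, a write acknowledged by W replicas and a read that consults R replicas must overlap on at least one replica whenever R + W > N. The helper names below are hypothetical, and this is plain arithmetic rather than a Cassandra API call.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when every read set of size r must intersect every write set of size w."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    print(f"N={n}: quorum is {q}")
    print("quorum write + quorum read overlap:", read_overlaps_write(n, q, q))
    print("write ONE + read ONE overlap:", read_overlaps_write(n, 1, 1))
```

Lower values of W and R trade this overlap guarantee for lower latency, which is the tuning knob the bullet refers to.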
How Distributed Queues Work
How Distributed Queues Work (Contd.)
How Distributed Queues Work (Contd.)
• Users can publish to any node (to a topic)
• When a message is published, the node writes it to a queue in
  Cassandra called the “global queue”
• Each node has a queue in Cassandra called the node queue
• A worker running in a node reads messages from the global queue
  and writes them to a node queue that has a subscription for
  that topic
• A worker in each node reads messages from its node queue and
  delivers them to the subscriber for that queue
• A node deletes a message only after the subscriber has acked the
  delivery
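The flow above can be sketched with in-memory deques standing in for the Cassandra-backed queues. Everything here is an assumption made for illustration: the class `ClusterSketch`, the message tuple layout, and the policy of picking the first node that has a matching subscription. It is not WSO2 MB code.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Set, Tuple

Message = Tuple[int, str, str]  # (message id, topic, payload)


class ClusterSketch:
    """Global queue -> node queue -> subscriber, deleting only after an ack."""

    def __init__(self, node_ids: List[str]) -> None:
        self.global_queue: Deque[Message] = deque()
        self.node_queues: Dict[str, Deque[Message]] = {n: deque() for n in node_ids}
        self.subscriptions: Dict[str, Set[str]] = {n: set() for n in node_ids}
        self._next_id = 0

    def subscribe(self, node_id: str, topic: str) -> None:
        """Record that a subscriber for this topic is attached to the node."""
        self.subscriptions[node_id].add(topic)

    def publish(self, topic: str, payload: str) -> int:
        """Any node may accept a publish; the message goes to the global queue."""
        self._next_id += 1
        self.global_queue.append((self._next_id, topic, payload))
        return self._next_id

    def run_global_worker(self) -> None:
        """Move messages from the global queue to a node queue with a matching subscription."""
        while self.global_queue:
            message = self.global_queue[0]
            targets = [n for n, topics in self.subscriptions.items() if message[1] in topics]
            if not targets:
                break  # nobody subscribed yet: leave the message stored
            self.node_queues[targets[0]].append(message)  # write first, then remove
            self.global_queue.popleft()

    def deliver(self, node_id: str, subscriber: Callable[[Message], bool]) -> int:
        """Deliver from the node queue; a message is deleted only when the subscriber acks."""
        queue = self.node_queues[node_id]
        delivered = 0
        while queue:
            if not subscriber(queue[0]):
                break  # no ack: keep the message for redelivery
            queue.popleft()
            delivered += 1
        return delivered
```

In both workers the message is written or delivered before it is removed from its source, so a crash between the two steps leaves a copy behind rather than losing the message.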
How Pub/Sub Works
How Pub/Sub Works (Contd.)
Fault Tolerance
• We write the message to Cassandra as soon as we receive it
• We always read, process, and only then delete messages
  (e.g. at client delivery, after receiving the ack)
• If nodes fail, the worst case is that there will be duplicates
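Because the worst case is duplicate delivery, a subscriber may want to make its processing idempotent. The sketch below shows one common consumer-side approach, remembering the ids of messages already handled. It assumes each message carries a stable id and that the seen-set fits in memory; the class name is hypothetical and this is not something the broker is stated to provide.

```python
from typing import Callable, Set


class IdempotentConsumer:
    """Wraps a handler so that redelivered messages are acked but not reprocessed."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()

    def on_message(self, message_id: str, payload: str) -> bool:
        """Process the message at most once; return True to ack it."""
        if message_id in self._seen:
            return True  # duplicate: ack so the broker can delete it
        self._handler(payload)
        self._seen.add(message_id)  # mark as done only after successful processing
        return True
```

If the handler raises, the id is not recorded and no ack is returned, so a later redelivery gets another chance to process the message.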

Photo: http://www.fotopedia.com/items/flickr-6206406047, CC license
JMS support for MB2

Feature                   Supported
Pub/Sub                   Yes
Durable Subscriptions     Yes
Hierarchical Topics       Yes
Queues                    Yes
Message Selectors         No, planned for 3.0
Transactions              No, planned for 3.0



How does MB2 Make a difference?
• Scales up in all 3 dimensions
• Creates only one copy of a message during delivery
• High availability and fault tolerance
• Large message transfers in pub/sub (asynchronous style)
• Lets users choose between strict and best-effort message delivery
• Replication of stored messages in the storage
Photo: http://www.flickr.com/photos/flickcoolpix/3566848458/sizes/m/in/photostream/
Use case 1: Store and Process
Use case 2: Message Bus
Future Work and Roadmap
• WSO2 MB (2013 Q4)
   •   Support for in-memory delivery through Hazelcast
       (www.hazelcast.com)
   •   AMQP 1.0 support
   •   Transaction support
• If you have thoughts, please chat with me or join us at
  architecture@wso2.org

Photo: http://www.flickr.com/photos/24071429@N08/2314391179/
Conclusion
• WSO2 MB provides an alternative architecture for
  scalable message brokers using
  Cassandra and ZooKeeper
• It provides
    •    A publish/subscribe model that does not
         need any coordination between broker nodes
    •    A strict mode for distributed queues that
         provides in-order delivery
    •    A best-effort mode for distributed queues

Photo: http://ambr0.deviantart.com/art/Looking-Back-Wolf-310857819
Questions?
