Ultra-scalable Architectures for
Telecommunications and Web 2.0
Services



Mauricio Arango
Bernd Kaponig

Sun Microsystems


October 27, 2009
Web 2.0 Impact on Communications Applications
• Information-sharing and social media services such
  as Facebook and Twitter are experiencing fast,
  enormous growth
• Growth is handled via horizontal rather than
  vertical scaling: no forklift upgrades, more agile,
  and lower cost
• Horizontal scaling means telecommunications service
  software must handle the parallelism challenge

Impact of New Communications Applications - cont.
• Telecom service providers are launching Web 2.0-like
  services
  – Nokia Ovi
  – Vodafone 360
• Expansion of event-driven services based on telecom
  network advantages
  – Location-based
  – M2M
  – Metered real-time billing
  – SLA management

Parallel Application Architecture Model (server side)
●   Parallelize – break up into pieces
    –   Application
    –   Data
●   Requires middleware
    –   Coordination (communication)
    –   Data management


Price of Horizontal Scaling
• Requires software that can be distributed and has a high degree
  of parallelism
   – Applications broken into components
   – Components run in parallel in a distributed processing infrastructure

[Figure: monolithic application vs. parallel application]




Good News – Event-driven apps have high parallelism
• Web and telecommunications applications are
  event-driven
  – User-generated events
  – Monitoring and sensing events
  – Network events
• Data stream-driven
• Event-based coordination pattern




Other Parallel Application Design Patterns
• Master-worker
• Pipeline
• Hybrid

[Figure: the three patterns. Master-worker: a Master connected to
Worker 1, Worker 2, ... Worker N. Pipeline: Stage 1, Stage 2, ...
Stage N in sequence. Hybrid: a Master connected to several pipelines,
each running Stage 1, Stage 2, ... Stage N.]
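The hybrid pattern can be illustrated with a minimal Python sketch, assuming a thread pool stands in for the distributed workers. The stage functions, the `user:text` event format, and the worker count are hypothetical choices made for illustration; they are not from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def stage_parse(event: str) -> dict:
    """Stage 1 (hypothetical): split a raw 'user:text' event into a record."""
    user, _, text = event.partition(":")
    return {"user": user, "text": text}


def stage_enrich(record: dict) -> dict:
    """Stage 2 (hypothetical): add a derived field to the record."""
    return {**record, "length": len(record["text"])}


# Pipeline: each event passes through the stages in order.
STAGES = [stage_parse, stage_enrich]


def run_pipeline(event):
    result = event
    for stage in STAGES:
        result = stage(result)
    return result


def master(events, workers=4):
    """Master-worker: hand each event to a pool of workers.

    Every worker runs the whole pipeline on its event, which makes this
    the hybrid pattern (a master feeding several pipelines).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input events.
        return list(pool.map(run_pipeline, events))


results = master(["alice:hello", "bob:hi there"])
```

In a real deployment the workers would be separate processes or machines, and the hand-off would go through the communications middleware described on the following slides.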
Communications Middleware
• Coordination – interaction required when an
  application is split into multiple components

Direct communication
• Request-response, notification
• Tightly coupled – every node needs to know the
  other nodes' addresses and functions, which is
  non-scalable complexity for the developer
• Poor at handling node failures

Indirect communication
• A node only needs to know about the intermediate
  entity
• Space decoupling
• Time decoupling
• Three mechanisms
  – Tuple Space
  – Message Queue
  – Publish/Subscribe
Tuple Space
• Invented by David Gelernter in early 80s
• Simple functionality – via four operations
   – out() - deposits tuple
   – in() - removes matching tuple
   – read() - reads matching tuple
   – notify() - registers for notification
     on matching tuple
• Space & time decoupling across components
• Shared associative object (tuple) store
• Inherent load-balancing
• Event processing framework
• Natural fit with business processes

[Figure: Components 1 to 4 interacting through a shared Tuple Space
using out(), in(), and read()]
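The four operations can be sketched as a small in-process Python class. This is a minimal illustration under stated assumptions, not any product's API: the `ANY` wildcard, the `in_` spelling (`in` is a Python keyword), the `timeout` parameter, and the callback-style `notify` are choices made for this sketch.

```python
import threading

ANY = object()  # wildcard for template fields (an assumption of this sketch)


def matches(template, tup):
    """A template matches a tuple of equal length, field by field."""
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._listeners = []  # (template, callback) pairs
        self._cond = threading.Condition()

    def _find(self, template):
        for tup in self._tuples:
            if matches(template, tup):
                return tup
        return None

    def out(self, *tup):
        """Deposit a tuple and wake up blocked readers and listeners."""
        with self._cond:
            self._tuples.append(tup)
            callbacks = [cb for tpl, cb in self._listeners if matches(tpl, tup)]
            self._cond.notify_all()
        for cb in callbacks:  # run callbacks outside the lock
            cb(tup)

    def in_(self, *template, timeout=None):
        """Remove and return a matching tuple; None if the wait times out."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                return None
            tup = self._find(template)
            self._tuples.remove(tup)
            return tup

    def read(self, *template, timeout=None):
        """Return a matching tuple without removing it."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            return self._find(template) if found else None

    def notify(self, template, callback):
        """Register a callback for future tuples matching the template."""
        with self._cond:
            self._listeners.append((tuple(template), callback))


space = TupleSpace()
seen = []
space.notify(("order", ANY), seen.append)
space.out("order", 42)
peeked = space.read("order", ANY)
taken = space.in_("order", 42)
gone = space.in_("order", ANY, timeout=0.01)
```

Because `in_` removes the tuple it returns, several consumers blocked on the same template each receive different tuples, which is one way to read the "inherent load-balancing" bullet.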
Message Queue
• Intermediate entity: set of queues
• Communicating components only need to know queue names
• Basic functions:
   – sendMsg(queueName, message)
   – getMsg(queueName)
• Can be mapped as a subset of Tuple Space using:
   – out(queueName, message) for sendMsg
   – in(queueName) for getMsg



[Figure: Component 1 calls sendMsg() on Message queue 1 and Message
queue 2; Component 2 calls getMsg(queue-1) and Component 3 calls
getMsg(queue-2)]
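The mapping of sendMsg and getMsg onto out() and in() can be sketched in Python as follows. `MiniSpace` is a hypothetical, deliberately tiny stand-in for a tuple space. The first-in, first-out order shown here comes from its list-based storage and is an assumption of this sketch; a tuple space in general does not promise any ordering.

```python
import threading


class MiniSpace:
    """Tiny stand-in tuple space with only out() and a blocking in_()."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def in_(self, key, timeout=None):
        """Remove and return the oldest tuple whose first field equals key."""

        def find():
            return next((t for t in self._tuples if t[0] == key), None)

        with self._cond:
            if not self._cond.wait_for(lambda: find() is not None, timeout):
                return None
            tup = find()
            self._tuples.remove(tup)
            return tup


space = MiniSpace()


def send_msg(queue_name, message):
    # sendMsg(queueName, message) maps to out(queueName, message)
    space.out(queue_name, message)


def get_msg(queue_name, timeout=None):
    # getMsg(queueName) maps to in(queueName)
    tup = space.in_(queue_name, timeout=timeout)
    return None if tup is None else tup[1]


send_msg("queue-1", "a")
send_msg("queue-2", "b")
send_msg("queue-1", "c")
first = get_msg("queue-1")
second = get_msg("queue-2")
third = get_msg("queue-1")
empty = get_msg("queue-1", timeout=0.01)
```

The sender and receiver share only the queue name, which is the space decoupling the slides describe; the message waits in the space until a receiver asks for it, which is the time decoupling.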


Publish/Subscribe
• Intermediate entity: broker
• Basic functions:
  – subscribe(template)
  – publish(message)
• Can be mapped as a subset of Tuple Space
  using:
  – out(tuple) for publish()
  – notify(template) for subscribe()
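A minimal Python sketch of this mapping follows. The `Broker` class, the `ANY` wildcard, and the callback argument to `subscribe` are assumptions made for illustration. The sketch is single-threaded and keeps every published tuple, which a real broker may or may not do.

```python
ANY = object()  # wildcard for template fields (an assumption of this sketch)


def matches(template, tup):
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class Broker:
    """Intermediate entity exposing only the out() and notify() operations."""

    def __init__(self):
        self._tuples = []
        self._subscriptions = []  # (template, callback) pairs

    def notify(self, template, callback):
        self._subscriptions.append((tuple(template), callback))

    def out(self, *tup):
        self._tuples.append(tup)
        for template, callback in self._subscriptions:
            if matches(template, tup):
                callback(tup)


broker = Broker()


def subscribe(template, callback):
    # subscribe(template) maps to notify(template)
    broker.notify(template, callback)


def publish(*message):
    # publish(message) maps to out(tuple)
    broker.out(*message)


location_events = []
all_events = []
subscribe(("location", ANY), location_events.append)
subscribe((ANY, ANY), all_events.append)
publish("location", "cell-17")
publish("billing", 0.25)
```

Publishers never learn who the subscribers are; both sides know only the broker and the shape of the tuples.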




Surge of Indirect Messaging Middleware
●   Blitz (JavaSpaces)
●   GigaSpaces
●   Rinda
●   Gruple
●   SemiSpace
●   Open source JMS
     –   Sun Open Message Queue
     –   Apache ActiveMQ
●   Amazon Simple Queue Service (SQS)
●   RabbitMQ – based on the AMQP standard
●   Gearman
●   XMPP Pub/Sub
●   PubSubHubbub
Data Management Middleware
• Scaling via parallelization and caching
• Parallelization
   – Replication – for read-mostly data
   – Partitioning/sharding – for both writes and reads
• Distributed caching
   – Shares scaling techniques with
     communications middleware
   – Use of Distributed Hash Tables (DHT)
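One common way to partition keys across cache nodes in the spirit of a DHT is consistent hashing. The slides do not say which scheme is used, so the Python sketch below is an illustration under that assumption; the node names, the MD5-based hash, and the number of virtual nodes are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes for a more even key spread."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(200)]
placement = {key: ring.node_for(key) for key in keys}

# Removing a node only moves the keys that lived on it.
smaller = HashRing(["cache-a", "cache-b"])
moved = [
    key
    for key in keys
    if placement[key] != "cache-c" and smaller.node_for(key) != placement[key]
]
```

The point of the ring is that adding or removing a node relocates only that node's share of the keys, so the cache can grow horizontally without reshuffling everything.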




Use Case – Twitter-like, many-to-many Messaging System
               • Process new message
                  – out(message_list_update, follower,
                    message_publisher, message_content)
               • Update message list
                  – in(message_list_update, ?follower,
                    ?publisher, ?content)
                  – put(new message)
               • Retrieve message list
                  – get(messages)
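The three steps can be traced in a small single-process Python sketch. The assumptions are stated plainly: a `deque` stands in for the tuple space, a dictionary stands in for the distributed cache behind put and get, and the follower graph and function names are hypothetical.

```python
from collections import defaultdict, deque

space = deque()                    # stand-in for the tuple space
message_lists = defaultdict(list)  # stand-in for the distributed cache

# Hypothetical follower graph: publisher -> list of followers.
followers = {"alice": ["bob", "carol"], "bob": ["alice"]}


def process_new_message(publisher, content):
    """Process new message: one out() per follower of the publisher."""
    for follower in followers.get(publisher, []):
        space.append(("message_list_update", follower, publisher, content))


def update_message_lists():
    """Update message list: in() each update tuple, then put() it."""
    while space:
        _tag, follower, publisher, content = space.popleft()
        message_lists[follower].append((publisher, content))


def get_messages(user):
    """Retrieve message list: get() the user's cached list."""
    return list(message_lists.get(user, []))


process_new_message("alice", "hello")
update_message_lists()
bob_timeline = get_messages("bob")
carol_timeline = get_messages("carol")
alice_timeline = get_messages("alice")
```

Because each update tuple is independent, any number of workers could run the update step in parallel against a real tuple space, each removing a different tuple.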




Conclusions
• Horizontal scalability requires parallel application
  architectures from day one
• Parallelization relies on simple communications
  middleware models and simple APIs
  – Tuple Space is the simplest model and a superset
    of the others
• Distributed caching is a scalable foundation for both
  communications and data management middleware
• Parallel application architectures are key to the
  evolution of telecommunications


Additional information
●   Paper (includes references):
    http://blogs.sun.com/arango/resource/ICIN-09-Ultra
●   Blog: http://blogs.sun.com/arango/




