Your systems. Working as one.

Architect Scalable Real-Time Systems

David Barnett
About RTI
• World leader in communications software for
  real-time systems
  – 350,000+ deployed copies
  – 500+ unique designs
• Standards leader
  – Participate in 15+ standards
    organizations
  – Authored DDS (OMG)
Real-Time Systems
Characteristics of Real-Time Systems
•   Time sensitive
•   Low latency
•   High throughput
•   Non-stop availability
•   Dynamic, ad hoc
•   Autonomous, no sys admin
•   Components run outside
    data center
     –   Decentralized
     –   Processing distributed
     –   Few or no central servers
     –   Embedded, resource constrained
     –   Disadvantaged networks
Challenge: Increasing Scale
• More CPUs, nodes, apps,
  data, orgs, developers…
• Systems of systems

[Figure: system of systems]

Existing architectures and infrastructure break under the load
Traditional Integration
• Point-to-point
• Custom, e.g., using sockets
• RPC, RMI
• Complex
• Costly – O(n²)
• Poor reuse
• Limits information sharing
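
The O(n²) cost can be checked with a little arithmetic. The sketch below assumes every application may need a one-way data flow to every other application, which is the worst case the speaker notes describe; the function names are illustrative only.

```python
def point_to_point_flows(n: int) -> int:
    """Worst-case one-way data flows when each of n apps talks directly to the other n - 1."""
    return n * (n - 1)


def bus_attachments(n: int) -> int:
    """With a shared bus, each app needs only its own attachment to the bus."""
    return n


for n in (2, 5, 10, 25, 50, 100):
    print(f"{n:>3} apps: {point_to_point_flows(n):>5} direct flows, "
          f"{bus_attachments(n):>3} bus attachments")
```

For 100 applications this gives 9,900 potential direct flows against 100 bus attachments.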
Cost Constrains Integration

[Chart: time and cost of integration, maintenance and upgrades (vertical axis) versus system scale and age (horizontal axis)]
Six Keys to Scalability
1. Publish/Subscribe

[Diagram: two Sensors, a Control App, a Display App and an Actuator attached to a shared bus, exchanging Sensor Data and Commands]

• Simplifies integration – O(n)
• Improves information sharing
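
A minimal in-process sketch of the pattern, assuming a toy `Bus` class that is not part of any DDS API. The topic names come from the slide; the threshold and the `close_valve` command are made up for illustration. Applications know only topic names, never each other.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class Bus:
    """Toy publish/subscribe bus (illustrative only, not a DDS API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, sample: Any) -> None:
        # Deliver to whoever subscribed; the publisher never names a receiver.
        for callback in list(self._subscribers[topic]):
            callback(sample)


bus = Bus()
display_inbox: List[Any] = []
actuator_inbox: List[Any] = []


def control_app(sample: dict) -> None:
    """Consumes sensor data and publishes a command (hypothetical threshold)."""
    if sample["value"] > 100:
        bus.publish("Commands", {"action": "close_valve"})


bus.subscribe("Sensor Data", control_app)
bus.subscribe("Sensor Data", display_inbox.append)   # added without touching the sensors
bus.subscribe("Commands", actuator_inbox.append)

bus.publish("Sensor Data", {"source": "sensor1", "value": 120})
bus.publish("Sensor Data", {"source": "sensor2", "value": 80})
print(len(display_inbox), actuator_inbox)
```

Adding the display required one `subscribe` call and no change to either sensor or to the control application.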
2. Explicit, Well-Defined Data Model
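
One way to picture an explicit data model is a single shared type definition with a declared key. The sketch below is a Python stand-in only; in DDS the definition would normally be written in IDL or XML. The field names follow the position table used elsewhere in this deck, and the range checks assume latitude and longitude are in degrees, which the slides do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Shared definition of a position sample; `source` acts as the key."""
    source: str
    latitude: float    # assumed degrees, -90 to 90
    longitude: float   # assumed degrees, -180 to 180
    altitude: float    # units not stated in the slides

    def __post_init__(self) -> None:
        if not self.source:
            raise ValueError("source (the key) must be non-empty")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")

    @property
    def key(self) -> str:
        return self.source


sample = Position("RADAR1", 37.4, -122.0, 500.0)
print(sample.key, sample.latitude)
```

Because every producer and consumer builds on the same definition, a malformed sample is rejected where it is created rather than misread somewhere downstream.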
3. Maintain State in Infrastructure

Source (Key)   Latitude   Longitude   Altitude
RADAR1             37.4      -122.0      500.0
UAV2               40.7       -74.0      250.0
LPD3               50.2        -0.7        0.0

•   External to applications
•   System-wide single version of truth
•   Temporal decoupling
•   Simplifies system development and integration
•   Essential for robust, dynamic systems
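
A rough sketch of state held outside the applications, assuming a toy `DataSpace` class that is hypothetical and far simpler than real middleware. It keeps one current row per key and hands a late-joining subscriber the current snapshot before any new updates, which is the temporal decoupling the bullets describe. The updated UAV2 values are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[float, float, float]          # (latitude, longitude, altitude)
Listener = Callable[[str, Row], None]


class DataSpace:
    """Toy keyed state store standing in for infrastructure-held state."""

    def __init__(self) -> None:
        self._rows: Dict[str, Row] = {}
        self._listeners: List[Listener] = []

    def write(self, source: str, latitude: float, longitude: float, altitude: float) -> None:
        """Create or update the row for this key, then push it to current subscribers."""
        row = (latitude, longitude, altitude)
        self._rows[source] = row
        for listener in list(self._listeners):
            listener(source, row)

    def subscribe(self, listener: Listener) -> None:
        """Deliver the current snapshot first, then all future updates."""
        for source, row in self._rows.items():
            listener(source, row)
        self._listeners.append(listener)


space = DataSpace()
space.write("RADAR1", 37.4, -122.0, 500.0)
space.write("UAV2", 40.7, -74.0, 250.0)
space.write("UAV2", 40.8, -74.0, 300.0)   # update replaces the earlier UAV2 row

display: Dict[str, Row] = {}


def on_update(source: str, row: Row) -> None:
    display[source] = row


space.subscribe(on_update)                # late joiner: receives one row per key
space.write("LPD3", 50.2, -0.7, 0.0)      # pushed as it happens
print(display)
```

The late joiner never sees the superseded UAV2 row; it only needs the current state, not the message history.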
Data Centric vs. Message Centric

Data Centric

Application operations: Create, Read, Update, Delete

Source (Key)   Latitude   Longitude   Altitude
RADAR1             37.4      -122.0      500.0
UAV2               40.7       -74.0      250.0
LPD3               50.2        -0.7        0.0

• Infrastructure manages data and lifecycle
• Standard data operations (CRUD)
• Automatic state synchronization when applications join
• Robust, scalable

Message Centric

Application logic: Send, Receive

Message: Src=RADAR1, Msg=Update, X=37.4, Y=37.4
Message: Src=RADAR1, Msg=Create, X=37.4, Y=37.4, Z=500

• Applications must manage data
• Application layer protocol required for lifecycle management
• State synchronization requires persisting every message or application protocol
• Brittle, inefficient
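4. Explicit Quality of Service Contracts

[Diagram: two Sensors, a Control App, a Display App and an Actuator attached to a shared bus, exchanging Sensor Data and Commands, each annotated with its QoS]

Component           Reliability   Rate
Sensor (primary)    Reliable      1000 Hz
Sensor (backup)     Reliable      1000 Hz
Control App         Reliable      100 Hz
Display App         Best effort   1 Hz
Actuator            Reliable      100 Hz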
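
A QoS contract can be read as "what the writer offers must meet or exceed what the reader requests". The sketch below is a simplified illustration of that idea, not the DDS matching algorithm: it treats the per-component labels in the diagram as offers and requests (an assumption) and models timing as a plain rate in Hz rather than a deadline period.

```python
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


@dataclass(frozen=True)
class Qos:
    reliability: Reliability
    rate_hz: float   # offered: rate the writer commits to; requested: rate the reader needs


def compatible(offered: Qos, requested: Qos) -> bool:
    """True when the offer is at least as strong as the request on every policy."""
    return (offered.reliability >= requested.reliability
            and offered.rate_hz >= requested.rate_hz)


sensor = Qos(Reliability.RELIABLE, 1000.0)
control_app = Qos(Reliability.RELIABLE, 100.0)
display_app = Qos(Reliability.BEST_EFFORT, 1.0)

print(compatible(sensor, control_app))   # reliable 1000 Hz offer vs reliable 100 Hz request
print(compatible(sensor, display_app))   # a reliable offer also satisfies best effort
print(compatible(Qos(Reliability.BEST_EFFORT, 1000.0), control_app))
```

Making the contract explicit means a mismatch, such as a best-effort writer paired with a reader that needs reliable delivery, is detected when the two are matched rather than discovered later as missing data.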
Sample Real-Time QoS

• Volatility
   – Durability
   – History
   – Lifespan
• Delivery
   – Reliability
   – Time-based filter
   – Content filter
   – Deadline
• High availability
   – Liveliness
   – Ownership
   – Ownership strength
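
As one concrete example from this list, a time-based filter lets a slow consumer ask for at most one sample per interval. The sketch below is a hypothetical stand-alone class, not middleware code; it uses integer millisecond timestamps to keep the arithmetic exact and thins a 1000 Hz stream down to 1 Hz, as a display might.

```python
from typing import Optional


class TimeBasedFilter:
    """Accepts a sample only if enough time has passed since the last accepted one."""

    def __init__(self, minimum_separation_ms: int) -> None:
        self.minimum_separation_ms = minimum_separation_ms
        self._last_accepted_ms: Optional[int] = None

    def accept(self, timestamp_ms: int) -> bool:
        if (self._last_accepted_ms is not None
                and timestamp_ms - self._last_accepted_ms < self.minimum_separation_ms):
            return False
        self._last_accepted_ms = timestamp_ms
        return True


display_filter = TimeBasedFilter(minimum_separation_ms=1000)

# Three seconds of a 1000 Hz stream: one sample per millisecond.
accepted = [t for t in range(3000) if display_filter.accept(t)]
print(accepted)
```

Out of 3,000 samples the display handles three, and the publisher needs no knowledge of the display's rate.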
5. Govern Interactions with Middleware

[Diagram: bus]

• Instantiates data model
• Provides
   – Syntactic interoperability
   – Discovery
   – Access control
• Independent of:
   – Programming language
   – Operating system
   – CPU type
   – Physical location
   – Transport protocol
   – Network type
Example: Data Distribution Service
• Standard means to define data
  – IDL, XML, dynamically via API
  – Generated from UML
• Wire protocol for interoperability
  – Real-Time Publish-Subscribe Wire Protocol (RTPS)
• API for portability
  – DDS API

[Diagram: data definition and DDS API (portability) on top of the middleware, which sits on the RTPS wire protocol (interoperability)]

RTPS also standardized as IEC 61148
6. Decentralized Physical Architecture

[Diagram: components attached directly to the network]

• No central ESBs, brokers, database or servers
• Peer-to-peer communication
   – Components, services, applications, devices,
     subsystems, systems
• Bus is virtual
Comparison

Decentralized: Real Time
• Library only, easy to embed
• Low latency; good determinism
• Highly scalable
• No single point of failure: non-stop availability

Centralized: Traditional IT
• Server-based
• Administration heavy
• Assume high-bandwidth, reliable network (TCP)
• Poor latency, scalability
• Slow failover
Relative JMS Performance

[Bar chart: 1 KB messages per second, on a scale of 0 to 100,000, for RTI, Tibco, Sonic, ActiveMQ, Sun and JBoss]
P2P Publish/Subscribe over Multicast

[Diagram: one Publisher sends through a Switch to three Subscribers]

• Switch performs:
   – Replication
   – Filtering
• Minimizes:
   – CPU overhead
   – Network overhead
   – Latency
• Improves determinism
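
The effect described in the speaker notes can be sketched without a network. The model below is a simplification with hypothetical host names and group addresses (taken from the administratively scoped 239.0.0.0/8 range): the publisher transmits once per sample under multicast, and the switch forwards a sample only to hosts that joined the topic's group.

```python
from typing import Dict, List, Set

# Hypothetical mapping of topics to multicast groups.
GROUPS: Dict[str, str] = {
    "Sensor Data": "239.255.0.1",
    "Commands": "239.255.0.2",
}

# Hypothetical group memberships, as a switch would learn them from joins.
MEMBERSHIPS: Dict[str, Set[str]] = {
    "control": {"239.255.0.1"},
    "display": {"239.255.0.1"},
    "actuator": {"239.255.0.2"},
}


def publisher_sends(subscribers: int, mode: str) -> int:
    """How many times the publisher transmits one sample."""
    if mode == "unicast":
        return subscribers        # one copy per subscriber
    if mode == "multicast":
        return 1                  # the switch does the replication
    raise ValueError(f"unknown mode: {mode}")


def switch_delivers(topic: str) -> List[str]:
    """Hosts that receive a sample: only those that joined the topic's group."""
    group = GROUPS[topic]
    return sorted(host for host, joined in MEMBERSHIPS.items() if group in joined)


print(publisher_sends(1000, "unicast"), publisher_sends(1000, "multicast"))
print(switch_delivers("Sensor Data"), switch_delivers("Commands"))
```

In this model the publisher's cost stays constant as subscribers are added, and uninterested hosts never see the traffic.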
Multicast Scalability

[Chart: messages per second (0 to 600,000) versus number of subscribers (0 to 1,000), Gigabit Ethernet]
Summary:
Architecting Scalable Real-Time Systems
1.   Publish/subscribe
2.   Explicit data model
3.   Maintain state in infrastructure
4.   Explicit QoS contracts
5.   Govern interactions with middleware
6.   Decentralized physical architecture
RTI’s Solution

Product              Target                           Interface
Connext Micro        Small device apps                Pub/Sub API (DDS subset)
Connext DDS          DDS apps                         Pub/Sub API (full DDS)
Connext Messaging    General-purpose real-time apps   Messaging API (DDS++ & JMS)
Connext Integrator   Discrete apps/systems            Adapters

RTI DataBus™

Common Tools and Infrastructure Services
• Administration, Monitoring, Logging
• Recording, Replay, Visualization
• Federation, Transformation, Persistence
Designed for Real-Time Systems
• Fully embeddable
    –   Applications are self-contained
    –   Deterministic resource utilization
    –   Support embedded and real-time OS
    –   C, C++, C#, Java and Ada APIs; no Java dependence
• Autonomous operation
    –   Plug-and-play via automatic discovery
    –   No sys admin
    –   Supports highly dynamic and ad hoc systems
    –   Self healing
• Real-time Quality of Service (QoS)
    – Control and visibility over timing and resources
    – Built-in filtering by time and content
• Disadvantaged network support
    – E.g.: wireless, radio, satellite, WAN
    – No TCP or IP dependence
Learn More

• Contact RTI
• Downloads
  – www.rti.com/downloads
  – Interactive “Shapes” demo
  – Free trial with comprehensive
    tutorial
  – Free licenses for internally-funded
    IR&D
• Videos, webinars and
  whitepaper
  – www.rti.com/resources
Download Connext Free Trial NOW

www.rti.com/downloads
Thank You!

Don't Architect a Real-Time System that Can't Scale


Editor's Notes

  • #2 Don't Architect a Real-Time System that Can't Scale
    As distributed systems scale up, so do their integration time and cost. This integration challenge is particularly acute for real-time and intelligent systems: increased connectivity cannot come at the expense of performance, reliability or resource consumption.
    Adopting an inherently scalable architecture is the secret to agilely and affordably building systems that encompass ever more applications, nodes and real-time data. This webinar will review how you can apply proven integration techniques, such as loose coupling and service orientation, to demanding real-time systems. Unlike approaches designed for conventional business applications, the architecture we'll introduce is appropriate for systems that span embedded, high performance and IT applications.
    This webinar targets software architects, chief engineers and development leads in all industries that design real-time and intelligent systems. This includes defense, industrial, transportation, medical and aerospace applications.
    Better integration is increasingly the key to competitive advantage. It provides end-users with higher situational awareness, responsiveness and resource utilization. Don't let your architecture hold you back.
  • #5 Real-time systems have unique technical requirements that must be taken into account in both the high-level architecture and the technologies used to integrate that architecture. Architectural approaches and integration technologies typically used for enterprise business applications often can’t be applied to real-time systems.
    Non-stop availability: mission critical, business critical, safety critical.
  • #7 In the past, most systems were developed using what I call point-to-point or application-centric integration. With this approach, applications or components directly communicate with each other. This communication could be implemented using a custom protocol, often built on top of TCP/IP sockets. It could also be implemented using client/server approaches such as Remote Procedure Calls or Remote Method Invocations.
    There are a number of problems with this approach:
    First off, it results in complexity. Communication paths are embedded in individual applications. This makes them tightly coupled. Over time, as systems grow in scale, they become stovepiped and brittle. This makes maintenance extremely costly.
    Integration and upgrades are also costly. The addition of new applications requires changes to existing applications so that they can communicate. This is an order n-squared problem: n components can have up to n × (n − 1) one-way connections. If there are ten components in a system, there are potentially 90 connections. If there are fifty components, there can be 2,450 connections. If there are a hundred components, there can be 9,900 connections.
    Because components are tied to the interfaces of other components in their system, reuse is also difficult.
    Finally, because data must be explicitly exchanged, there is very poor information sharing. This limitation is compounded by the cost of integrating applications, which discourages sharing of information that may be useful but isn’t absolutely essential. This constrains situational awareness.
  • #8 This chart illustrates the impact of system scale on the time and cost required for integration. Integration time grows roughly with the square of the number of applications and system components. With two applications, there are only up to two data flows to handle, from A to B and B to A. With five applications, there are 20 potential connections in a system. With 25 applications, there are 600 potential connections. With 100 applications, which is not unusual in a system of systems, there are nearly 10,000 potential connections. That’s why integration and upgrades can take years.
    Costs also increase over time as the systems themselves become larger, more complex and less maintainable.
  • #10 The first key is to use publish/subscribe as the underlying communication paradigm.
    – Eliminates complexity
    – Applications require no knowledge of each other, only of the data they produce or consume
    – Easy to add applications that you didn’t anticipate when initially designing a system
    – Fosters information sharing
    – Data easily discovered, accessed
    ---
    Publish/subscribe overcomes these problems. It essentially provides an integration bus that decouples individual applications. Applications require no knowledge of each other, only of the “topics” or “subjects” of data they process. Applications simply publish the data they produce and subscribe to the data they consume. They do not require any hardcoded knowledge of other applications.
    For example, a Sensor just has to publish its data to a “Sensor Data” topic. Likewise, the controller for an actuator can publish its status to an “Actuator Status” topic. A supervisory control application can subscribe to both “Sensor Data” and “Actuator Status” and issue commands based on those inputs. The actuator’s controller can then subscribe to those commands.
    <click>
    The elegance of this approach is that no changes are required to existing applications when new components are added, for example, another sensor.
    <click>
    Likewise if you add a new display application. It can just subscribe to the “Sensor Data” and “Actuator Status” topics that are already available on the bus.
    <click>
    This greatly simplifies integration. The effort to integrate each application is independent of the number of other applications, so total effort grows only linearly. It also promotes information sharing, since subscribing to information is essentially free.
    Publish/subscribe is particularly well-suited for edge applications because they are inherently data and event driven and require many-to-many communication.
  • #11 Helps ensure interoperability when scaling development across multiple individuals, teams, projects and organizations. Everyone has a common understanding of the semantics of the data that is being exchanged.
    Keeps the data exchange independent of how data structures are defined or implemented in any specific application or programming language. For example, if data were defined in Java, it would be hard to access from a C application.
  • #12 Also similar to REST.
    Essential to dynamic distributed systems.
    Perhaps our most powerful capability is support for data centricity. Data centricity extends the traditional publish/subscribe paradigm by allowing applications to directly read and write data objects instead of just sending and receiving messages about data. In effect, applications can communicate as if they share a database, with each topic being a table and each object being a row in the table. However, there are two significant differences from a traditional database:
    The first is that updates are automatically pushed to subscribers for event-driven and real-time processing.
    The second difference is that data objects are cached within each publisher and subscriber, not centrally. Because the solution is completely decentralized, we call it a data space and not a database. For cases in which it is desirable to persist data separately (for example, when applications may be connected over a low bandwidth network) RTI does provide an optional Persistence Service. Persistence Services do not have to run centrally and can be distributed across a network for load balancing or high availability purposes.
    Data centricity is particularly valuable for edge systems because it provides a single source of truth regarding the state of a system. Because systems can be highly dynamic, there needs to be a way for late joining or returning applications to receive a current snapshot of the state of the world. With RTI’s middleware, late joining applications can automatically receive the most recent values of subscribed data objects, including any required historic data. This ensures that state is consistent across even a very large scale and dynamic system.
    ---
    Relate to caching.
    Mention Persistence Service option.
  • #16 [Use Middleware that Instantiates the Data Model and Governs Interactions]
    A complement to having a standard data model. Provides a means to physically and unambiguously exchange data defined in the model.
    In addition to having a standard and implementation-independent data model, it is also important to have a standard and implementation-independent means to physically exchange data.
    A well-defined data model gives developers or integrators the knowledge they need to interpret data.
    – Explicit data model required for semantic interoperability
    – Abstract: independent of how data structures are defined or implemented in any specific application. Applications can change without affecting other applications.
    – Formally defined and discoverable, so an application or service doesn’t need to know anything about how other applications or services are implemented
    ---
    Just like Allstate is “The Good Hands People” we’re the “Good Architecture People”.
    – Plug and play
    – Content and time aware
    To “plug in” a module to a “software bus” implies that we have crisply defined *everything* that module needs to communicate with other modules. This is the secret sauce of integration. DDS defines all these interfaces, then enforces connection contracts.
    – Add or replace modules without changing the system
    – Orders of magnitude easier integration
    – Much higher performance and parallelism
    – Distributed, re-locatable, modular services
    – No central bottleneck or failure
    – Efficient, scalable distribution
    – Interoperable, standards compliant
    – Flexible evolution
    – Interoperable
    – Standard protocol
    – Transparent connectivity
    – C, C++, Java, .NET, more
    – Windows, Linux, Unix, embedded, real time
  • #21 Another benefit of publish/subscribe is that it is well-suited to the use of multicast. Multicast provides very efficient and scalable broad data distribution. With multicast, the network switch automatically replicates messages for each interested subscriber. The publisher only has to send data once, regardless of how many subscribers there are. In contrast, with unicast, data has to be sent separately to each subscriber. With broadcast, it is sent to all subscribers even if they aren’t interested. Each subscriber has to filter out data it isn’t interested in.
    Multicast provides several scalability advantages.
    – Low overhead on the publisher. The publisher only has to send data once regardless of how many subscribers there are.
    – Latency is constant to each subscriber. Latency doesn’t degrade as the number of subscribers increases.
    – By assigning different data flows to different multicast addresses, the switch can be used to filter data at wire speed.
  • #22 This chart illustrates how scalable multicast can be. 200-byte messages were sent reliably to up to nearly 1,000 subscribers. As you can see, there was very little impact on per-application throughput as the number of subscribers increased. At the largest number of subscribers, nearly 441 million messages per second were being delivered across the system!
    ---
    These benchmarks were conducted with the following configuration:
    – RTI Data Distribution Service 4.5d
    – CentOS 5.5, 64-bit, kernel version 2.6.18-194.11.3.el5
    – Intel Core i7 Extreme 980X 6-Core @ 3.33GHz
    – UDP over IPv4
    – Network adapters: Intel 82574L Gigabit Ethernet, Juniper Ex 4200 Series switch
    – Voltaire InfiniBand HCA HCA 600Ex2-Q-1 with Voltaire Messaging Accelerator (VMA version 4.5.12.0), Voltaire 4036 switch
    – Reliable messaging with ordered delivery
  • #24 This is an architectural view of the product line. Again, all of the products are built around the RTI DataBus.
    Another beauty of a data-centric architecture is that services can be plugged into the bus in the same way that a customer’s application can be. So, in addition to providing the core messaging capability, we can also offer value-added services and tools that customers can take advantage of in both their development environments and their deployed systems.