UAV Data Link Design for Dependable Real-Time Communications
Communication over the kinds of data links used for unmanned vehicles presents important challenges due to the low bandwidth, intermittent connectivity, and lower reliability of these links. Classic network protocols such as TCP do not operate well in this environment, forcing application developers to implement their own reliability and session management. This presentation describes the issues and alternatives.

Published in: Technology

Transcript

  • 1. UAV Data Link Design for Dependable Real-Time Communications Avionics 2009, Amsterdam Gerardo Pardo-Castellote, Ph.D. Chief Technology Officer Real-Time Innovations, Inc. www.rti.com gerardo@rti.com
  • 2. Outline UAV Communication Requirements Why TCP-based solutions do not work Implementing your own data-link protocol Using middleware (DDS) for the Data-Link Conclusions
  • 3. UAVs part of larger integrated network Vehicle LAN Data Link Ground Station LAN Avionics Net Centric GIG Tactical Backbone Real-Time Ground Station Backend WAN
  • 4. Characteristics of UAV Communications In-Vehicle comm. Data Link comm. Ground Station comm. Net-Centric Backbone comm.
  • 5. Inside Vehicle Communications Deeply-embedded, low-power – Limited CPU speed – General Purpose Processors and FPGAs Memory constrained devices – Limited RAM – Flash filesystem or none Dedicated IPC transports – Back plane Certification requirements – DO-178B Operating Systems Challenging environment to operate in!
  • 6. Data-Link Communications Multiple traffic types: – Sensor data streams – Command & Control data – Status, Intelligence, Mission, Supervisory Different traffic requirements for each type: – Urgency, Priority, Reliability, Volume – Stealth operations Challenging communications channel: – Large latency, low throughput channel – Lossy links – Disconnections – Asymmetric bandwidth (downlink vs uplink)
  • 7. Data Link Types & Requirements Low Throughput High reliability High integrity Aggregate Performance for HRDL + HCDL High Throughput (streaming data) Moderate Throughput High Avail. High Integrity Reqs C2 and Status data transfer in emergency Relay xfer: High Altitude Platform or UAV Sensor Data C2 Data Status Data (position attitude) Use Back Up Beyond Line of Sight High Capacity (HCDL) High Reliability (HRDL)
  • 8. Ground Station Communications Modularity Multiple vendors Multiple languages Evolution Failover Handover Redundancy
  • 9. Ground Station Characteristics Heterogeneous system – RT storage & processing of sensor data – Integration with display – Integration with C2 / supervision systems – Integration with net-centric back end Multiple programming languages: C/C++/Java/.NET Multi-Platform: Linux/Windows/Embedded Modular, reconfigurable Varying assignments of Ground-Station to UAV
  • 10. Ground Station Requirements Be able to handle and adapt to a variety of: – CPUs / Computer platforms – Traffic flows – Programming Languages – Operating Systems Provide a modular framework – Support reconfiguration – Support evolution, extensibility – Support SOA tenets Support operational use-cases – Link fallback – Multi-station handoff
  • 11. Net-Centric Communications Multiple vendors Highly Heterogeneous: – computer platforms – operating systems – programming languages Integration with disparate technologies – Databases – REST/HTTP, Web-Services – ESBs – Middlewares: DDS, JMS, CORBA, MOMs Multiple architectural concerns – SOA – Event-Driven – Publish-Subscribe Security Auditing, Recording requirements Cross Domain Solution requirements
  • 12. Net-Centric Communications DDS
  • 13. Outline UAV Communication Requirements Why TCP-based solutions do not work Implementing your own data-link protocol Using middleware (DDS) for the Data-Link Conclusions
  • 14. TCP-based solutions do not work for the Data Link TCP has fundamental problems in the Data-Link… – Un-tunable timers & congestion control algorithm – Bad behavior on lossy networks & networks with dropouts – Bad behavior on large latency links The consequences are: – Protocol problems: head-of-line blocking; brittle connection-oriented model; byte-oriented, lacks prioritization; inflexible reliability model; not stealthy – Performance problems: slow connect; low link utilization
  • 15. TCP problem: Head-of-line blocking TCP funnels all traffic over a single reliable stream A byte cannot be delivered until all previous bytes have been received – A lost packet will “block” all future traffic until that packet is repaired – A large message will “block” all future traffic until it is completely delivered – IMAGE: Broken Bicycle blocking race car – IMAGE: Large Tractor blocking race car TCP’s “stream-oriented” reliability model is not suitable for the Data Link
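The blocking rule on this slide can be illustrated with a small simulation. This is only a sketch of in-order stream delivery, not of any real TCP stack; the function name and the packet numbering are assumptions made for the example.

```python
def in_order_delivery(arrivals):
    """Return, for each arrival, the sequence numbers released to the application.

    `arrivals` is the order in which packets reach the receiver. A packet is
    released only once every earlier sequence number has been released,
    which is the stream-style rule described on the slide.
    """
    pending = set()
    next_seq = 1
    released_per_step = []
    for seq in arrivals:
        pending.add(seq)
        released = []
        # Release as many consecutive packets as are now available.
        while next_seq in pending:
            pending.remove(next_seq)
            released.append(next_seq)
            next_seq += 1
        released_per_step.append(released)
    return released_per_step


# Packet 2 is lost at first and only arrives (as a repair) after 3, 4 and 5.
for step, released in enumerate(in_order_delivery([1, 3, 4, 5, 2]), start=1):
    print(step, released)
```

Packets 3, 4 and 5 sit in the receiver while the repair of packet 2 is in flight, so nothing reaches the application during that time even though most of the data has already arrived.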
  • 16. TCP issues: Brittle connections TCP relies on hard-coded timers to establish connections – SYN messages must be answered before the timeout – SYN timer is 3 secs with doubling exponential backoff: 3s, 6s, 12s,… – Implementations give up after a fixed number of attempts – Large latencies (> 60 sec) cause every TCP connection attempt to fail – Some TCP implementations fail sooner: e.g. 9 sec for Windows TCP is bad at detecting disconnections – To detect connection liveliness, applications must use the KEEP_ALIVE option; the system-wide timeout defaults to 2 hours – Common solution is periodic application messaging; detection time is non-deterministic, on the order of minutes TCP connection failure is drastic – All state is lost – No knowledge of which messages were delivered – Applications must do their own message framing, sequence numbering and acknowledgment to be able to continue upon re-connect
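The doubling SYN timer can be tabulated to see when a connection attempt is abandoned. The sketch below assumes the 3-second initial timer from the slide; the number of attempts varies by implementation, so `max_attempts` is a free parameter here, and `handshake_can_complete` is a deliberately simplified, hypothetical check.

```python
def syn_retry_schedule(initial_timeout_s=3.0, max_attempts=5):
    """Return (send_times, give_up_time) for a doubling-backoff SYN timer."""
    send_times = []
    t = 0.0
    timeout = initial_timeout_s
    for _ in range(max_attempts):
        send_times.append(t)   # a SYN is (re)sent at time t
        t += timeout           # wait for the current timer to expire
        timeout *= 2           # exponential backoff: 3s, 6s, 12s, ...
    return send_times, t


def handshake_can_complete(rtt_s, initial_timeout_s=3.0, max_attempts=5):
    """Simplified model: the reply to the first SYN must arrive before give-up."""
    _, give_up = syn_retry_schedule(initial_timeout_s, max_attempts)
    return rtt_s < give_up


print(syn_retry_schedule(3.0, 2))
print(handshake_can_complete(rtt_s=12.0, max_attempts=2))
```

With only two attempts the sender gives up 9 seconds after the first SYN, so in this model any link whose round trip exceeds that can never complete a handshake, however reliable it is otherwise.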
  • 17. TCP issue: Low bandwidth use ‘Perfect storm’ for TCP protocol: – TCP slow start: ramp-up time ~ RTT*log(BW), where RTT is the round-trip time (2 x latency) and BW is the bandwidth – Insufficient TCP buffer size given large RTT: to utilize a given BW, TCP needs buffer size ~ BW*RTT. For 10 Mbps and 500 ms, buffer size ~ 640KB! Typical available/configured TCP buffer size is far smaller! – TCP congestion-control algorithm misinterprets packet loss as a sign of congestion End result: long ramp-up times, low and/or unstable bandwidth use
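The ramp-up estimate can be made concrete with a rough calculation. This sketch assumes an idealized slow start that doubles the window once per round trip with no losses; the 1460-byte segment size and the one-segment initial window are illustrative assumptions, not values from the slides.

```python
import math


def slow_start_ramp_s(bandwidth_bps, rtt_s, mss_bytes=1460, initial_segments=1):
    """Rough time for slow start to open the window to one bandwidth-delay product."""
    # Bandwidth-delay product expressed in segments.
    bdp_segments = bandwidth_bps * rtt_s / 8 / mss_bytes
    if bdp_segments <= initial_segments:
        return 0.0
    # The window doubles each round trip, so the count of round trips is logarithmic.
    round_trips = math.ceil(math.log2(bdp_segments / initial_segments))
    return round_trips * rtt_s


print(slow_start_ramp_s(10e6, 0.5))
```

For the 10 Mbps, 500 ms link mentioned on the slide this comes to nine round trips, several seconds during which the link is mostly idle, and any loss in that period restarts the climb.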
  • 18. Details: Insufficient buffer size TCP flow control is based on a send "window size" – The send window determines how much data can be outstanding (i.e., unacknowledged) in the network. Long-delay networks require large send windows to hold a large amount of “in flight” data without blocking the sender – DataInTransit ~ bandwidth × delay Operating systems limit/hardcode the window size. – The TCP standard limits the window to 64 KB (in practice 32 KB in some implementations due to signed arithmetic) – Required windows are much larger: RTT 0.8 s, BW 1.54 Mbps requires 154 KB The "large-window" TCP extension (TCP-LW) allows windows up to 2^30 bytes (about 1 GB) – But that makes the slow-start problem bigger…
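The window arithmetic on this slide can be checked directly. The two helpers below are a sketch with hypothetical names: one computes the bandwidth-delay product, the other the best-case throughput a fixed window allows.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Best-case throughput when at most one window can be sent per round trip."""
    return window_bytes * 8 / rtt_s


# The slide's example: RTT 0.8 s at 1.54 Mbps.
print(round(bdp_bytes(1.54e6, 0.8)))
# Throughput ceiling of a 64 KB window on the same link.
print(round(window_limited_throughput_bps(64 * 1024, 0.8)))
```

The first figure reproduces the 154 KB requirement. The second shows that a 64 KB window caps the same link at roughly 0.66 Mbps, well under half of its capacity. For the earlier 10 Mbps, 500 ms example the product is 625,000 bytes, which the slide rounds to about 640KB.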
  • 19. Details: Congestion control TCP congestion avoidance is bad for lossy or long latency links: – Mistakenly interprets packet loss as congestion – Excessively long ramp-up for new connections RED (random early detection) gateways require each gateway to monitor its own queue length. When imminent congestion is detected, the TCP sender is notified. By dropping a packet earlier than it normally would, RED sends an implicit notification of congestion. The sender is effectively notified by the timeout of this packet. The principle behind the RED approach is that a few earlier-than-usual drops may help avoid more packet drops later on. The TCP sender can then reduce its window before serious congestion occurs. In TCP Vegas, the TCP sender predicts when congestion is about to occur and reduces its transmission window before intermediate routers drop packets – TCP can keep track of the minimum round trip time seen during a transfer and use the most recently observed round trip time to compute the data queued in the network. – TCP can also keep track of the throughput before and after the congestion window changes to estimate the network congestion level. – If estimates indicate that the number of packets queued in the network is rising, it reduces the congestion window. As it observes the number decreasing, it increases the congestion window. Although neither approach has been widely adopted, both hold promise for satellite networks. As we mentioned earlier, TCP congestion control responds to congestion slowly because of latency. If such congestion can be avoided before it happens, it is a big win for high-speed and long-delay networks.
  • 20. TCP issue: reliability & congestion control TCP acknowledgment is non-selective & blunt – If a segment is lost, TCP will retransmit all data starting from the lost segment without regard to the successful transmission of later segments. TCP congestion control is fooled by lossy networks – TCP treats the lost segment as an indication of congestion and cuts its window size in half
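The effect of halving the window on every loss can be seen in a toy additive-increase, multiplicative-decrease model. This is a sketch under strong simplifications (one window update per round trip, a window measured in segments, losses supplied by the caller); it does not model any specific TCP variant.

```python
def aimd_window(loss_per_rtt, start=1.0, increase=1.0, floor=1.0):
    """Window size after each round trip: halve on loss, otherwise grow by `increase`."""
    window = start
    history = []
    for lost in loss_per_rtt:
        if lost:
            window = max(floor, window / 2)   # loss is treated as congestion
        else:
            window += increase                # cautious linear growth
        history.append(window)
    return history


# One loss in seven round trips on an otherwise idle link.
print(aimd_window([False, False, False, False, True, False, False], start=8))
```

A single loss caused by radio noise costs several round trips of growth, and on a link with a long RTT each of those round trips is expensive.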
  • 21. TCP issue: chatty reliability protocol TCP reliability requires constant ACKs from the receiver – Even if all messages are received… ACK traffic consumes power and bandwidth ACK traffic prevents stealth operations (it can reveal the position of the ACKer) Other protocols (best effort or NACK-only) may be better suited…
  • 22. Summary: TCP protocol problems TCP is inflexible The TCP protocol is not well suited for the Data Link – Low performance – Incorrect behavior NASA and others have tried to spearhead efforts to modify TCP… – Research on “delay tolerant” networks – Research on TCP: HACK, SACK, Trunk protocols These efforts remain in the research domain TCP’s “one size fits all” QoS is not suitable for the Data Link
  • 23. Outline UAV Communication Requirements Why TCP-based solutions do not work Implementing your own data-link protocol Using middleware (DDS) for the Data-Link Conclusions
  • 24. Implementing your own data-link protocol Session management Data stream management Buffering Traffic Prioritization/Shaping Fragmentation / Reassembly Reliability Redundant links/failover
  • 25. General Architecture To solve the reliability, flow control, and disconnection issues we need: – Data buffers at both ends – a reliable comm. protocol sends the data from the send buffer to the receive buffer Sender Application Receiver Application Reliability Protocol Send Buffer Receive Buffer
  • 26. General Architecture (2) To avoid head-of-line blocking we need – Separate buffers for each traffic type – Separate reliable data streams for each traffic type, each should have its own separate session Sender Application Receiver ApplicationEach traffic type has its own session Send Buffer Receive Buffer
  • 27. Reliable Protocol At a minimum the reliability protocol must – Identify each message with a sessionId and a sequence number – Send periodic Heartbeats announcing which sequence numbers should have been received – Accept ACKs to record the delivered messages and clear them from the send buffer – Accept NACKs for sequence numbers and send the requested repairs
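The four minimum requirements map naturally onto a small sender-side data structure. The class below is a sketch only: its name, method names and tuple layouts are assumptions for illustration and do not correspond to any particular wire protocol or product API.

```python
from dataclasses import dataclass, field


@dataclass
class ReliableSender:
    session_id: int
    next_seq: int = 1
    send_buffer: dict = field(default_factory=dict)  # seq -> payload awaiting ACK

    def write(self, payload):
        """Tag the message with (session_id, seq) and keep it until acknowledged."""
        seq = self.next_seq
        self.send_buffer[seq] = payload
        self.next_seq += 1
        return (self.session_id, seq, payload)

    def heartbeat(self):
        """Announce the range of sequence numbers the receiver should have."""
        first = min(self.send_buffer, default=self.next_seq)
        return (self.session_id, first, self.next_seq - 1)

    def on_ack(self, up_to_seq):
        """Everything up to `up_to_seq` was received: free the buffer space."""
        for seq in [s for s in self.send_buffer if s <= up_to_seq]:
            del self.send_buffer[seq]

    def on_nack(self, missing):
        """Return repair messages for the requested sequence numbers."""
        return [(self.session_id, s, self.send_buffer[s])
                for s in missing if s in self.send_buffer]


sender = ReliableSender(session_id=7)
for payload in ["a", "b", "c", "d"]:
    sender.write(payload)
print(sender.heartbeat())
print(sender.on_nack([3]))
```

When the buffer is empty, this sketch reports a first sequence number one greater than the last, which a receiver can read as "nothing outstanding".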
  • 28. Confirmed Reliability (TCP style) No packet loss 01 02 03 04 01 02 03 04, HB 01 02 03 04 ACK 1-4 05 06 07 08 05 06 07 08, HB 05 06 07 08 ACK 1-8
  • 29. Confirmed Reliability (TCP Style) Some packet loss 01 02 03 04 01 02 03 04, HB 01 02 X ACK 1-2, NACK 3 05 06 07 08 05 06 07 08, HB 06 07 08 ACK 1-8 03 04 05 X X Packets 04 and 05 are received but the protocol drops them because a prior packet 03 is missing. This wastes valuable bandwidth
  • 30. Reliable Protocol (II) For performance the protocol should – Accept received messages out of order and cache them on the receiver buffer while the missing messages are repaired – Send selective NACKs (SACKs) for just the sequence numbers that are missed To handle large sensor data (e.g images) – Fragment & re-assemble large messages – Handle reliability on message fragments as well To handle small updates – Bundle small updates into batches – Flush batches based on max delay or packet size
  • 31. Confirmed Reliability (Reader Cache + SACK) No packet loss 01 02 03 04 01 02 03 04, HB 01 02 03 04 ACK 1-4 05 06 07 08 05 06 07 08, HB 05 06 07 08 ACK 1-8
  • 32. Confirmed Reliability (Reader Cache + SACK) Some packet loss 01 02 03 04 01 02 03 04, HB 01 02 X 04 ACK 1-2, SACK 3 05 06 07 08 05 06 07 08, HB 05 06 07 08 ACK 1-8 03 Packets 04 and 05 are received and cached waiting for the repair of 03. No bandwidth is wasted.
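The reader-cache behaviour in the two exchanges above can be sketched as a receiver that keeps out-of-order packets and, on each heartbeat, reports both a cumulative ACK and the specific gaps. Class and method names are hypothetical, and the feedback is returned as plain values rather than encoded messages.

```python
class CachingReceiver:
    def __init__(self):
        self.next_expected = 1
        self.cache = {}       # out-of-order packets waiting for a gap to be repaired
        self.delivered = []   # payloads handed to the application, in order

    def on_data(self, seq, payload):
        if seq < self.next_expected or seq in self.cache:
            return  # duplicate: already delivered or already cached
        self.cache[seq] = payload
        # Deliver every packet that is now contiguous with what was delivered.
        while self.next_expected in self.cache:
            self.delivered.append(self.cache.pop(self.next_expected))
            self.next_expected += 1

    def on_heartbeat(self, last_seq):
        """Return (cumulative ACK, list of missing sequence numbers)."""
        ack_up_to = self.next_expected - 1
        missing = [s for s in range(self.next_expected, last_seq + 1)
                   if s not in self.cache]
        return ack_up_to, missing


rx = CachingReceiver()
for seq in (1, 2, 4):            # packet 3 is lost on the link
    rx.on_data(seq, f"{seq:02d}")
print(rx.on_heartbeat(4))        # acknowledge 1-2, request only 3
rx.on_data(3, "03")              # the repair arrives
print(rx.delivered)
```

Only packet 03 is retransmitted; packet 04 is kept in the cache and released as soon as the gap closes.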
  • 33. Reliable Protocol (III) For performance on a wide variety of links the protocol must – Allow configuration of timers and buffer sizes – Maintain liveliness of the link via KeepAlive messages – Allow sessions and buffers to survive link disconnection – Perform output shaping with rate limits – Support prioritization between sessions/traffic types – Support differential shaping for each traffic type – …
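Two of the items above, output shaping with rate limits and prioritization between traffic types, can be combined in one small scheduler. The sketch below puts a strict-priority queue in front of a token bucket; the class name, the byte-based accounting and the explicit `tick` clock are assumptions chosen to keep the example deterministic.

```python
import heapq


class ShapedLink:
    """Strict-priority queue feeding a token-bucket rate limiter."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.queue = []      # heap of (priority, arrival order, message)
        self.counter = 0

    def enqueue(self, priority, message: bytes):
        """Lower `priority` values are sent first; ties keep arrival order."""
        heapq.heappush(self.queue, (priority, self.counter, message))
        self.counter += 1

    def tick(self, elapsed_s):
        """Refill tokens for `elapsed_s` seconds and return the messages sent."""
        self.tokens = min(self.burst, self.tokens + self.rate * elapsed_s)
        sent = []
        # Note: a message larger than `burst` would never be sent in this sketch.
        while self.queue and len(self.queue[0][2]) <= self.tokens:
            _, _, message = heapq.heappop(self.queue)
            self.tokens -= len(message)
            sent.append(message)
        return sent


link = ShapedLink(rate_bytes_per_s=100, burst_bytes=100)
link.enqueue(5, b"s" * 80)   # bulk sensor data, queued first
link.enqueue(0, b"c" * 40)   # command & control, queued later but more urgent
print(link.tick(0))
print(link.tick(0.25))
```

The command message overtakes the sensor data even though it was queued later, and the sensor data waits until the bucket has refilled. Per-traffic-type shaping could be approximated by giving each traffic type its own bucket.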
  • 34. Redundancy and Failover Data-Link may deliver duplicate packets Data might arrive from redundant transports Failover requires multiple sources of the same information How does protocol identify/filter these duplicates? – Needs VirtualSessionId identifying session independent of data-link or source – Reader queue must be 2-level. Second level organized by VirtualSessionId filters-out duplicates
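The duplicate-filtering idea can be sketched by keying the filter on the virtual session rather than on the physical link a packet arrived on. The names below are hypothetical, and the unbounded set of seen sequence numbers is a simplification; a real implementation would bound it, for example with a sliding window.

```python
class DuplicateFilter:
    def __init__(self):
        # virtual_session_id -> sequence numbers already passed to the reader
        self.seen = {}

    def accept(self, virtual_session_id, seq):
        """Return True the first time a (virtual session, seq) pair is seen."""
        seen = self.seen.setdefault(virtual_session_id, set())
        if seq in seen:
            return False
        seen.add(seq)
        return True


flt = DuplicateFilter()
# The same stream (virtual session 42) arrives over two redundant links.
arrivals = [("radio", 42, 1), ("satcom", 42, 1), ("satcom", 42, 2), ("radio", 42, 2)]
print([(link, seq) for link, vsid, seq in arrivals if flt.accept(vsid, seq)])
```

Because the link name is never consulted, whichever copy arrives first is delivered and the later copy is dropped, regardless of which transport carried it.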
  • 35. Stealth Reliability should be tunable: – Best-effort mode: no ACK traffic. Sacrifices reliability but still ensures order & no duplicates – NACK-only mode: limits backward traffic, but requires smarter buffer management – Full reliability: both ACKs and NACKs. Ensures delivery to the receiving application
  • 36. Example (best effort with packet loss) 01 02 03 04 01 02 03 04, HB 01 02 X 04 05 06 07 08 05 06 07 08, HB 05 06 07 08 Packet 03 is permanently lost. A repair request would compromise stealth. The application is notified of the packet loss.
  • 37. Stealth Reliability (no packet loss) 01 02 03 04 01 02 03 04, HB 01 02 03 04 05 06 07 08 05 06 07 08, HB 05 06 07 08 Stealth not compromised under Normal operating conditions.
  • 38. Stealth Reliability (some packet loss) 01 02 03 04 01 02 03 04, HB 01 02 X 04 NACK 3 05 06 07 08 05 06 07 08, HB 05 06 07 08 03 Stealth minimally compromised, only when some message is lost
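The three reliability modes differ only in what the receiver is allowed to transmit back. A minimal sketch, with hypothetical names for the modes and for the feedback tuples:

```python
from enum import Enum


class Reliability(Enum):
    BEST_EFFORT = "best_effort"   # receiver never transmits
    NACK_ONLY = "nack_only"       # receiver transmits only to request repairs
    FULL = "full"                 # receiver sends ACKs and NACKs


def feedback_on_heartbeat(mode, ack_up_to, missing):
    """Return the feedback messages a receiver would send in the given mode."""
    if mode is Reliability.BEST_EFFORT:
        return []
    messages = []
    if mode is Reliability.FULL:
        messages.append(("ACK", ack_up_to))
    if missing:
        messages.append(("NACK", list(missing)))
    return messages


for mode in Reliability:
    print(mode.name, feedback_on_heartbeat(mode, ack_up_to=2, missing=[3]))
```

In NACK-only mode the receiver stays silent whenever nothing is missing, which matches the "stealth not compromised under normal operating conditions" case in the examples.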
  • 39. Message Batching [Diagram: write() → sender send queue → receiver receive queue, shown without and with batching] Without batching, each message is sent separately. For small messages, protocol headers might be bigger than the payload. With batching, messages are held briefly and combined into larger batches, maximizing throughput and minimizing CPU use. Transparent: the receiver still sees individual messages
  • 40. Reliability with Batching Reliability must work even when messages are batched ACK or NACK of individual samples would negate some of the benefits of batching… => Protocol must be batch aware so that it can ACK/NACK complete batches! B3 B2 B1 B3 B2 B1 ACK(B3), NACK(B2) Repair B2 B3 B2 B1 write() sender receiver
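A batch-aware sender can be sketched as follows: messages accumulate until a size or delay limit is reached, and each flushed batch receives its own sequence number so that ACKs and NACKs can refer to whole batches. The class name, the limits and the explicit `now` timestamps are assumptions made to keep the example deterministic.

```python
class Batcher:
    def __init__(self, max_bytes, max_delay_s):
        self.max_bytes = max_bytes
        self.max_delay_s = max_delay_s
        self.pending = []
        self.pending_bytes = 0
        self.oldest = None          # time the oldest pending message was written
        self.next_batch_seq = 1

    def write(self, message: bytes, now: float):
        """Queue a message; return any batch that had to be flushed to make room."""
        ready = []
        if self.pending and self.pending_bytes + len(message) > self.max_bytes:
            ready.append(self._flush())
        if not self.pending:
            self.oldest = now
        self.pending.append(message)
        self.pending_bytes += len(message)
        return ready

    def poll(self, now: float):
        """Flush the pending batch if its oldest message has waited long enough."""
        if self.pending and now - self.oldest >= self.max_delay_s:
            return [self._flush()]
        return []

    def _flush(self):
        batch = (self.next_batch_seq, self.pending)   # reliability works per batch
        self.next_batch_seq += 1
        self.pending, self.pending_bytes, self.oldest = [], 0, None
        return batch


b = Batcher(max_bytes=10, max_delay_s=0.05)
b.write(b"aaaa", now=0.00)
b.write(b"bbbb", now=0.01)
print(b.write(b"cccc", now=0.02))   # size limit reached: first batch goes out
print(b.poll(now=0.08))             # delay limit reached: second batch goes out
```

The receiver would unpack each batch and hand the individual messages to the application, which keeps batching transparent as described on the previous slide.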
  • 41. Batching is hard but it pays! RTI DDS 4.3b perftest results [Chart: throughput (Mbps, 0 to 1000) vs. sample size (bytes, 0 to 5000); series: Linux Baseline, Linux 10Kb Batch] Test platform: Intel Core2Duo, single-CPU dual-core 2.4 GHz, 4 MB cache, 32-bit CentOS 5 (RHEL 5), 2 GB memory, Intel E1000 NIC
  • 42. Other considerations Resource management: – During disconnected operation buffers might fill or overflow… – Solution is smart caching: purge by age; filter by frequency; keep “one of each” (requires additional insight into the data, i.e. some object identifier such as a track Id); filter by content
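Two of the caching strategies, "keep one of each" and "purge by age", can be combined in a small keyed cache used while the link is down. This is a sketch with hypothetical names; the key stands in for an object identifier such as a track Id.

```python
class KeyedCache:
    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        self.latest = {}   # key -> (timestamp, sample); only the newest sample is kept

    def store(self, key, sample, now):
        """Overwrite any older sample for the same object ("one of each")."""
        self.latest[key] = (now, sample)

    def drain(self, now):
        """On reconnect, return the samples still young enough to be worth sending."""
        fresh = [(key, sample) for key, (t, sample) in self.latest.items()
                 if now - t <= self.max_age_s]
        self.latest.clear()
        return fresh


cache = KeyedCache(max_age_s=4.5)
cache.store("track-1", {"x": 10}, now=0.0)
cache.store("track-2", {"x": 50}, now=0.0)
cache.store("track-1", {"x": 12}, now=1.0)   # replaces the older track-1 sample
print(cache.drain(now=5.0))
```

Memory use is bounded by the number of distinct objects rather than by the length of the outage, which is what makes this approach attractive for disconnected operation.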
  • 43. This is HARD!!
  • 44. Outline UAV Communication Requirements Why TCP-based solutions do not work Implementing your own data-link protocol Using middleware (DDS) for the Data-Link Conclusions
  • 45. Ethernet Wireless Radio Shared Memory cPCI 1553 Using a Network Middleware Network middleware: A library between the operating system and the application It insulates application from the raw network Implements reliability, caching, … Hardware (e.g. Radio) Network stack (e.g. IP) Middleware Application Middleware Application Application Application Application Application
  • 46. Which middleware to use? Standards based Configurable via QoS Not based on TCP Manages Sessions/Fragmentation/Reliability… Failover/handover support Efficient use of bandwidth Multi-platform Embeddable, Certifiable… Integration with net-centric back end
  • 47. DDS mandated for data-distribution DISR (formerly JTA) – DoD Information Technology Standards Registry US Navy Open Architecture FCS SOSCOE – Future Combat System – System of System Common Operating Environment SPAWAR NESI – Net-centric Enterprise Solutions for Interoperability – Mandates DDS for Pub-Sub SOA
  • 48. DDS Adoption European Air Traffic Control RETF (USA) Train Communications Tokyo Japan Traffic Control Boeing Army Future Combat System Boeing AWACS program US Navy, DD(X) LCS, LPD-17 SeaSlice and 13 other Navies
  • 49. Insitu Unmanned Air Vehicle “…we have seen a 30% increase in productivity based on not having to handle data communication issues.” Gary Viviani, VP of Engineering Insitu is a recognized leader in the exploding UAV space The next generation of UAV’s including the Scan Eagle and newer platforms Challenge is to have a successful UAV mission which requires impressive autonomy and reliable ground control DDS enables an information flow that is much more orchestrated and flexible allowing seamless switch control between multiple ground stations while connecting reliably over unreliable links
  • 50. © 2008 Real-Time Innovations, Inc. Advanced Cockpit Ground Control Station Defense General Atomics Aeronautical Systems developed advanced cockpit ground control stations (GCSs) for unmanned aircraft systems Required real-time data distribution for acquisition, analysis, and response of remote controlled aircraft DDS selected for proven software & services. Application built in under 14 months, significantly less time than with alternative software or building their own
  • 51. CLIP Mediator Bridge Transportation • Common Link Integration Processing (CLIP): U.S. Air Force and Navy joint project to build Tactical Data Link (TDL) aggregator • Enables information exchange between platforms with incompatible tactical data links • Challenge: existing system had poor integration with platform mission systems • With Northrop Grumman, RTI helped architect, design, develop & test mediator bridge between platform systems and CLIP – RTI Services built a ‘mediator’ bridge between Air Force, Navy, NGC, B1, B52 – First NESI DDS Compliant Product Defense “Working with RTI has been both effective and productive.” – Jim Miller, CLIP Program Manager
  • 52. BASE 10 Systems Land Vehicles 5 different subsystems on the data bus and communication link between RCC and RoboScout RS Next release of RoboScout will implement DDS in vehicular platform and outside services (radio and satellite data-links). Defense
  • 53. Autonomous vehicle in the 2005 DARPA Grand Challenge race Unique characteristic of FireFox: adaptive vision system – vehicle “learns” through example Complex network of control and vision systems, sensors, processors, operating systems DDS integrates all kinds of data sources, shares data with minimal latency DARPA Flying Fox - Autonomous Vehicle Systems Unmanned Vehicles
  • 54. DDS B-1B Tactical Systems Upgrade DDS is being used to seamlessly integrate legacy flight control systems with a new open architecture tactical communication and control system. Adding new command & control and communications capabilities that need to work with legacy control system Need architecture that is open & modular for future extensions and upgrades DDS is open and scalable, reducing integration risk, standards-based ensuring supportability Defense
  • 55. AWACS Radar System Upgrade Airborne control system for surveillance, command & control and battle management Upgrading system to be open, supportable, less expensive to maintain and extend DDS is standards-based, open and extensible, reducing integration risk DDS is a proven COTS solution, reducing total cost of ownership over in-house development
  • 56. CAE SimXXI Flight Simulation State-of-the-art full-flight simulator from CAE Challenge is communication between subsystems (over IEEE 1394) with low-latency data transfer DDS chosen because it excels in real- time performance and is simple to use and integrate
  • 57. INDRA: Air Traffic Management Finance Air traffic service needed to control flow of traffic through busy metropolitan air space Reliability is critical − hardware or software failures mean flight delays and substantial costs DDS high performance permits the fast addition, updating and removal of system nodes without disrupting the data flow DDS engaged to integrate, extend and design to the customer’s specific needs Transportation
  • 58. Next-generation of the U.S. Navy Aegis Weapon System Challenge to share time-critical data across highly distributed system including radar, weapons, displays and controls Need to maximize future scalability and flexibility DDS provides real-time communication infrastructure. Standards-based & extensible for future system enhancements Lockheed Martin US Navy Aegis Open Architecture Weapon System
  • 59. Navy Open Architecture Ship Self Defense System (SSDS) Project to employ standards throughout ship systems (frameworks, OS, etc.) Goal: Reduce total cost of ownership, ease system upgrades, reduce interoperability issues DDS selected as middleware: its extensibility enables an open architecture throughout Navy! DDS Services provided advanced integration, support & consulting Defense
  • 60. Sample EU project using DDS ESO Extremely Large Telescope (E-ELT) – 43m diameter (see vehicles on picture!) – 30.000 sensors send data on the bus – RTI DDS used as middleware for critical data communication and integration INDRA i-TEC e-FDP ATM program – European leader in Air Traffic Mgmt applications – ATM integration for UK, Spain and Germany – RTI Used as integration solution for Flight Data Management and Distribution EADS Euro Hawk UAV program – EADS selected RTI for European UAV program – RTI is used as embedded middleware in UAV versatile payload
  • 61. Sample EU project using DDS PLATH (Hamburg, Germany) – Radio signal analysis experts – Has decided to use RTI on a large scale for key middleware services Volkswagen R&D – After thorough evaluation VW has selected RTI as a middleware for their next generation vehicular R&D platform, – AUTOSAR, ECS, ECU context. MBDA France & UK – They have been using RTI for 2 years – Vertical launch missile program « MOUV »
  • 62. Sample EU project using DDS BASE 10 RoboScout Technology Reference System (TRS) – BTSE is a German project focused company specialized in the defense market within NATO. They are experts in robotics integrating systems engineering, system qualification, manufacturing and long term support. – Base 10 has been working with RTI for 1 year – We delivered Quick-Start training and an architecture study on how to implement RTI on the vehicular platform data flows – There are 5 different subsystems on the data bus and communication link between RCC and RoboScout RS – RTI is now implemented in the RCC (bottom left picture) – Next release of RoboScout will implement RTI in vehicular platform and outside services (radio and satellite data-links).
  • 63. Many others
  • 64. Dissecting Messaging Technologies The alternatives: Standards based: – Web-Service/SOAP Based (WS-Eventing, WS-Notification…) – JMS – CORBA – Real-Time Data-Distribution Service Vendor-proprietary: – ESBs – IBM WebSphere MQ, – TIBCO, – 29West, – Gigaspaces Custom build Architecture Quality of Service Performance & Scalability
  • 65. Best-of-breed RT-Messaging: DDS Data Distribution Service (DDS) – High performance real-time data distribution – Object Management Group (OMG) DDS Standard API (v1.2) – Specifies user-visible API – Ensures application portability – Adopted in June 2003, revised June 2005, 2006 DDS Standard Wire Protocol (v 2.1) – Real-Time Publish-Subscribe (RTPS) – Ensures application interoperability – Adopted in June 2006, revised July 2007, 2008 Real-time Publish-Subscribe (RTPS) Wire Protocol DDS Middleware Data Distribution Service API Standards-based services for application developers Standard protocol for interoperability
  • 66. Message bus architectures Centralized Clustered Federated Peer to Peer DDS JMS IBM TIBCO IBM
  • 67. Message Quality of Service (columns: QoS, Purpose, Example Use Cases) – Flow Control. Purpose: Control how much load and bandwidth a particular sender can inject into the network. Control the peak load, average load, and size of a burst. Example use cases: Prevent a single source from overwhelming the network. Prevent large low-urgency data (e.g., file downloads) from compromising the performance of critical data (e.g., alarms and critical news updates). Provide dedicated bandwidth to the most critical data. – Latency Budget. Purpose: Specify the relative importance of different messages and the maximum acceptable delay between the time the message is sent and the time it’s delivered to the reader(s). Example use cases: Prioritize real-time flows like live audio over traffic that may be buffered (e.g., video replay). Prioritize critical control information (e.g., live radar tracks) over non-time-critical information such as aircraft schedule changes. – Reliability. Purpose: Let the application decide whether messages should be confirmed and retried when missed, or else sent as best effort. Example use cases: Send live voice or video data. Send sensor data (e.g., radar tracks), traffic readings, CPU/network statistics and readings.
  • 68. Message Quality of Service [Same table as slide 67, overlaid with technology labels: DDS; JMS* (partial); DDS; DDS; WS-* (partial); Proprietary; Proprietary]
  • 69. Message Quality of Service (Cont.) (columns: QoS, Purpose, Example Use Cases) – Transport Priority / Multicast. Purpose: Controls the traffic class used for the underlying network transport. Takes advantage of network multicast infrastructure. Example use cases: Allow exploiting the differential service capabilities of the network infrastructure. Configure the network infrastructure to prioritize messages ahead of others. – History. Purpose: Control how many related messages (e.g., successive updates to a stock value or successive readings of a sensor) must be maintained by the middleware and delivered to readers. Example use cases: Prevent a rapidly changing source from using a lot of resources and starving other less-active sources. Some applications may only be interested in the last 100 events for each server regardless of the time interval when they occurred. – Lifespan. Purpose: Control how long the data must be kept by the middleware to be delivered to readers. Old data may be of little value; delivering it wastes bandwidth and gets in the way of the more recent data. Example use cases: Prevent data that loses value with age (e.g., old stock values, old news, old sensor readings) from using valuable system resources, while ensuring that needed historic information is kept (e.g., transaction records).
  • 70. Message Quality of Service (Cont.) [Same table as slide 69, overlaid with technology labels: DDS; JMS; DDS; DDS; Proprietary]
  • 71. Message Quality of Service (Cont.) (columns: QoS, Purpose, Example Use Cases) – Content Filtering. Purpose: Provide an application only the data it needs. Filter messages based on content as requested by the consuming application. Example use cases: Allow consumers with slow CPU or network (e.g. wireless). Filter data at the source or in the infrastructure. Avoid wasting CPU and bandwidth delivering data that is not of interest. Monitor aircraft in your airspace, alarms in the immediate vicinity, stocks that cross a threshold or in the industries of interest… – Persistence. Purpose: Externalize message history so that it survives beyond the life of the application that generates it. Deliver messages reliably in the presence of application failures and restarts. Example use cases: Prevent data loss when an application crashes. Allow short-lived applications (e.g. cgi scripts) to generate messages that are received reliably even by applications that join the network later.
  • 72. Message Quality of Service (Cont.) [Same table as slide 71, overlaid with technology labels: DDS; JMS; DDS; Proprietary; Proprietary]
  • 73. Data Distribution Service spans a very wide spectrum of application needs. [Chart: Messaging Technologies and Standards placed along a spectrum from Non-real-time through Soft real-time and Hard real-time to Extreme real-time: Java/RMI, Java/JMS, CORBA, MPI, Java, RTSJ (soft RT), RTSJ (hard RT), Web Services, RT CORBA, Data Distribution Service / DDS.] Adapted from NSWC-DD OA Documentation
  • 74. Top reasons to use DDS Flexibility and Power of the data-centric model Performance & Scalability Rich set of built-in services Interoperability across platforms and Languages Provides/integrates Pub-Sub into SOA
  • 75. #1 DDS Data-Centric Model [Diagram: Data Writers and Data Readers around a “Global Data Space” of data objects; later builds of the slide highlight a Data Object, a Topic, and a Key (subject)] “Global Data Space” generalizes Subject-Based Addressing – Data objects addressed by DomainId, Topic and Key – Domains provide a level of isolation – Topic groups homogeneous subjects (same data-type & meaning) – Key is a generalization of subject: a Key can be any set of fields, not limited to a “x.y.z …” formatted string
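The addressing scheme of the data-centric model (DomainId, Topic, Key) can be sketched as a dictionary keyed by that triple. This is a toy model under stated assumptions, not the DDS API: `GlobalDataSpace` and the `Track` topic with its `source` and `track_id` key fields are hypothetical.

```python
class GlobalDataSpace:
    """Toy model: data objects addressed by (domain_id, topic, key)."""

    def __init__(self):
        self._objects = {}  # (domain_id, topic, key) -> latest sample

    def write(self, domain_id, topic, key_fields, sample):
        """Store sample; the key is built from any chosen set of its fields."""
        key = tuple(sample[f] for f in key_fields)
        self._objects[(domain_id, topic, key)] = sample
        return key

    def read(self, domain_id, topic, key):
        """Return the latest sample for the addressed data object, or None."""
        return self._objects.get((domain_id, topic, key))

    def instances(self, domain_id, topic):
        """List the keys (subjects) currently known for a topic in a domain."""
        return sorted(k for (d, t, k) in self._objects
                      if d == domain_id and t == topic)
```

Two writes with the same key fields update one data object, while the same key in a different domain is invisible, which mirrors the isolation that domains are said to provide.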
  • 79. Subscriptions: By Topic, Subject, Content. Topics shown: “Market Data” (fields Source, Symbol, Type, Exchange, Payload; payload fields Volume, Bid, Ask, …) and “Order Entry” (fields Symbol, OrderKind, Stop, Limit, OrderNumber, …).
      By Topic: Source = *, Symbol = *, Type = *, Exchange = *, Payload = *
      Subject Filter (for a Reader): Symbol = *, Type = *, Exchange = NYSE, Payload = *
      Subject Filter plus Payload Filter (for a Reader): Source = REUTERS, Symbol = *, Type = EQ, Exchange = NYSE, Payload: Volume > x, Ask < y
  • 80. DDS Demo: Concepts. Topics – Square, Circle, Triangle – Attributes. Data types (schemas) – Shape (color, x, y, size); Color is instance Key – Attributes; Shape & color used for key. QoS – Deadline, Liveliness – Reliability, Durability – History, Partition – Ownership. Control Area: Allows selection of objects and QoS. Display Area: Shows state of objects.
  • 81. QoS: Quality of Service. QoS Policies: DURABILITY, HISTORY (per subject), READER DATA LIFECYCLE, WRITER DATA LIFECYCLE, RESOURCE LIMITS, RELIABILITY, TIME BASED FILTER, DEADLINE, ENTITY FACTORY, LIFESPAN, CONTENT FILTERS, USER DATA, TOPIC DATA, GROUP DATA, PARTITION, OWNERSHIP, OWNERSHIP STRENGTH, LIVELINESS, LATENCY BUDGET, DESTINATION ORDER, PRESENTATION, TRANSPORT PRIORITY
  • 82. Tunable Reliability Protocol. Reliable: guaranteed ordered delivery; “best effort” also supported. Configurable AckNack reply times to eliminate storms. Fully configurable to bound latency and overhead – Heartbeats, delays, buffer sizes. Performance can be tracked by senders and recipients – Configurable high/low watermark, Buffer full. Flexible handling of slow recipients – Dynamically remove slow receivers. [Diagram: Producer / Writer sending samples S1 to S7 to a Consumer / Reader]
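The heartbeat and AckNack exchange named on this slide can be illustrated with a simplified model. The sketch below is not the RTPS wire protocol: `ReliableWriter`, `ReliableReader` and their methods are hypothetical, and timing, buffer limits and watermarks are left out. It shows only how a heartbeat lets the reader discover a gap, request a repair, and still deliver in order.

```python
class ReliableWriter:
    """Keeps sent samples so that missing ones can be repaired on request."""

    def __init__(self):
        self.history = {}  # sequence number -> payload
        self.next_seq = 1

    def write(self, payload):
        """Assign the next sequence number and return (seq, payload) to send."""
        seq = self.next_seq
        self.history[seq] = payload
        self.next_seq += 1
        return seq, payload

    def heartbeat(self):
        """Announce the highest sequence number written so far."""
        return self.next_seq - 1

    def repair(self, missing):
        """Return the samples a reader reported as missing."""
        return [(s, self.history[s]) for s in missing]


class ReliableReader:
    """Buffers out-of-order samples and delivers them in sequence order."""

    def __init__(self):
        self.received = {}   # seq -> payload waiting for earlier samples
        self.delivered = []  # payloads handed to the application, in order
        self._next = 1       # next sequence number the application expects

    def on_data(self, seq, payload):
        self.received[seq] = payload
        while self._next in self.received:
            self.delivered.append(self.received.pop(self._next))
            self._next += 1

    def on_heartbeat(self, last_seq):
        """Return the sequence numbers to request in a negative acknowledgement."""
        return [s for s in range(self._next, last_seq + 1)
                if s not in self.received]
```

If S2 is lost, S3 is held back until the heartbeat reveals the gap and the repair arrives, so the application sees S1, S2, S3 in order.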
  • 83. High-Throughput via Aggregation. Increases throughput by aggregating smaller messages into larger network packets. User tunable – # packets to aggregate for delivery – Aggregate packet size – Max elapsed time before data is sent – Manual flush at any time. [Diagram: write() calls fill a batch that is sent when full or on timeout]
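The three tunables and the manual flush listed for aggregation can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `send` function and explicit timestamps; `Batcher`, `max_messages`, `max_bytes` and `max_delay_s` are hypothetical names, not RTI configuration properties.

```python
class Batcher:
    """Collects small messages and sends them as one unit when full or timed out."""

    def __init__(self, send, max_messages, max_bytes, max_delay_s):
        self._send = send                 # called with a list of payloads
        self.max_messages = max_messages  # number of messages per batch
        self.max_bytes = max_bytes        # aggregate size limit in bytes
        self.max_delay_s = max_delay_s    # max elapsed time before data is sent
        self._pending = []
        self._size = 0
        self._oldest = None               # arrival time of the oldest pending message

    def write(self, payload, now):
        """Queue a payload; flush immediately if a count or size limit is hit."""
        if not self._pending:
            self._oldest = now
        self._pending.append(payload)
        self._size += len(payload)
        if len(self._pending) >= self.max_messages or self._size >= self.max_bytes:
            self.flush()

    def poll(self, now):
        """Called periodically; flush if the oldest message has waited too long."""
        if self._pending and now - self._oldest >= self.max_delay_s:
            self.flush()

    def flush(self):
        """Manual flush: send whatever is pending, if anything."""
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()
            self._size = 0
```

The delay bound is what keeps batching usable for real-time traffic: throughput improves under load, while a lone message is never held longer than `max_delay_s` plus the polling interval.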
  • 84. Demo: Quality of Service (QoS). Writers and readers state their needs; RTI DDS delivers. Topics – Square, Circle, Triangle – Attributes. Data types (schemas) – Shape (color, x, y, size); Color is instance Key – Attributes; Shape & color used for key. QoS – Deadline, Liveliness – Reliability, Durability – History, Partition – Ownership.
  • 85. #2 Performance & Scalability. DDS was designed to support high performance. RTI DDS was developed to maximize performance and minimize jitter. Advanced techniques employed: – Pre-allocation of memory: never allocate/free memory in the critical path – Use dedicated threads per receive port: minimize thread switching; avoid expensive operating system calls (e.g. select()) – Maximize concurrency: carefully design critical sections; patented concurrent mutex-free thread-safe data structures – Employ high-performance data-access APIs: read data by array (no additional copies); scatter/gather APIs to access transport; buffer loaning for zero-copy access
  • 86. Latency – (Linear Scale) [Chart: DDS/GSOAP/JMS/Notification Service Comparison - Latency; horizontal axis: Message Size (bytes), 4 to 16384, also labeled Message Length (samples); series: DDS, JMS, Notification Service] Adapted from Vanderbilt presentation at July 2006 OMG Workshop on RT Systems
  • 87. Jitter – (Linear Scale) [Chart: DDS/JMS/CORBA Notification Service Comparison - Jitter; horizontal axis: Message Size (bytes), 4 to 16384, also labeled Message Length (samples); vertical axis: Standard Deviation (usecs), 0 to 2000; series: DDS, JMS, Notification Service] [Chart: DDS/CORBA Notification Service Comparison - Jitter; same axes with the vertical scale limited to 0 to 100 usecs] Source: Vanderbilt presentation at July 2006 OMG Workshop on RT Systems
  • 88. Performance: Looking under the hood…
      Columns: Performance & Scalability Technique / Description / Why It Matters
      Multicast – Description: Internet technology that allows a single UDP message to be delivered to many receivers. Why it matters: Provides the most efficient way to send messages to multiple receivers. Reduces bandwidth, reduces overhead on the sender, and minimizes latency and jitter.
      Message Batching – Description: Middleware technology that combines multiple messages into a single unit. Why it matters: Greatly increases the throughput for small messages. Reduces bandwidth and processor utilization for small messages.
      Message Fragmentation – Description: Middleware technology that breaks a large message into smaller units, delivers them separately, and then reassembles them prior to delivery to the application. Why it matters: Enables multicast use for larger (greater than 64KB) messages. Prevents “Head of Line” blocking where a high-priority message is queued behind a large message. Reduces jitter. Provides better performance in less reliable networks (wireless / WANs).
      Asynchronous Writes – Description: Middleware technology that allows a write operation to be processed by a separate thread and not block the application thread that performed the write. Why it matters: Decouples sender and receiver, providing more predictable performance for the writer and reducing latency jitter. Allows multiple write operations to be performed concurrently over multiple channels, batched, or optimized in other ways.
      Zero Copy – Description: Operating system network-stack technology that allows an application to put and get data from the network buffers “by reference,” without performing extra copy operations. Why it matters: Increases performance for large messages, resulting in higher throughput and lower latency. Reduces CPU consumption on both sender and receiver. Makes performance scale better with message size.
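Message fragmentation, as described in the table above, splits a large message into smaller units and reassembles them before delivery. The functions below are a minimal sketch with hypothetical names (`fragment`, `reassemble`) and a simple (index, total, data) tuple standing in for a real fragment header; loss recovery and timeouts are out of scope.

```python
def fragment(message, max_fragment):
    """Split message (bytes) into (index, total, data) fragments of bounded size."""
    total = (len(message) + max_fragment - 1) // max_fragment
    return [(i, total, message[i * max_fragment:(i + 1) * max_fragment])
            for i in range(total)]


def reassemble(fragments):
    """Return the original message once every fragment is present, else None."""
    if not fragments:
        return None
    total = fragments[0][1]
    parts = {index: data for (index, _, data) in fragments}
    if len(parts) != total:
        return None  # at least one fragment is still missing
    return b"".join(parts[i] for i in range(total))
```

Because each fragment carries its own index, fragments may arrive in any order, and a high-priority message can be interleaved between them rather than waiting behind the whole large message.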
  • 89. Performance: RTI DDS Low Latency and Jitter [Chart: Latency and Jitter on Unloaded Network without Message Batching; horizontal axis: Message/Data Size (bytes), 32 to 8192; vertical axis: Latency (microseconds), 0 to 400; series: Maximum, 99.99%, 99%, Median, Minimum] Reliable, ordered delivery over Gigabit Ethernet between 2.0 GHz Opteron processors running 32-bit Red Hat Enterprise Linux 4.0
  • 90. Performance: RTI DDS Latency at high Throughput. Half Million msg/sec at less than 300usec latency [Chart: horizontal axis: Throughput (Messages per Second), 0 to 600,000; vertical axis: Average Latency (Microseconds), 0 to 500; series by Number of Subscribers: 1 (1 per CPU and NIC), 20 (1 per CPU and NIC), 40 (1 per core, 2 per NIC)] © 2008 Real-Time Innovations, Inc. - May 1, 2008
  • 91. Scalability: RTI DDS Reliable Multicast Performance. Throughput with batching, in Messages per Second [Chart values, in the order they appear on the slide: 1 subscriber: 563,498; 20 subscribers (1 per CPU and NIC): 556,896; 40 subscribers (1 per core, 2 per NIC): 535,883; 72 subscribers (1 per core, 2-8 per NIC): 365,760] Test conditions: 200 Byte messages; GBit Ethernet; Single publishing thread; All data subscribed; No message loss – throttled to slowest subscriber; CentOS 5, 32-bit. CPUs – 2.4 GHz Intel Core 2 Duo E6600 – 2.4 GHz Intel Core 2 Quad Q6600 – 2.33 GHz Intel Xeon E5345 – 2.4 GHz AMD Opteron 8216. NICs – Intel PRO/1000 – Broadcom NetXtreme II
  • 92. Performance: RTI DDS High Performance across all Languages [Chart: Throughput: Megabits per Second with batching; horizontal axis: Message Size (bytes), 32 to 16384; vertical axis: Megabits per Second, 0 to 1,000; series: Native C++, .NET (C#), Java] Test conditions: Windows XP Pro SP2 32-bit; Reliable multicast; Gigabit Ethernet; 2.4 GHz Intel Core 2 Quad Q6600; Single Intel PRO/1000 NIC; Four producer and consumer threads
  • 93. #3 Powerful Services & Tools – High-Availability – Persistent Data – Recording service – Relational Database bridge – Development & Monitoring Tools
  • 94. DDS High Availability via Redundancy. Owner determined per subject. Only the extant writer with the highest strength can publish a subject (or topic for non-keyed topics). Automatic failover when the highest-strength writer: – Loses liveliness – Misses a deadline – Stops writing the subject. Shared Ownership allows any writer to update the subject. [Diagram: Topic T1 with instances I1 and I2; Producers / Writers with strength=10, strength=5 and strength=1; roles labeled I1 Primary, I1 Backup, I2 Primary, I2 Backup]
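The ownership rule on this slide (the live writer with the highest strength owns a subject, with automatic failover) can be sketched as below. This models a single subject and only the loss-of-liveliness trigger; `ExclusiveOwnership` and its methods are hypothetical names, not DDS entities, and deadline misses would be handled the same way.

```python
class ExclusiveOwnership:
    """Tracks which writer currently owns one subject (instance)."""

    def __init__(self):
        self.writers = {}  # writer name -> (strength, alive)

    def register(self, name, strength):
        self.writers[name] = (strength, True)

    def lost_liveliness(self, name):
        """Mark a writer as no longer alive; ownership may fail over."""
        strength, _ = self.writers[name]
        self.writers[name] = (strength, False)

    def owner(self):
        """Return the live writer with the highest strength, or None."""
        alive = [(strength, name)
                 for name, (strength, is_alive) in self.writers.items()
                 if is_alive]
        return max(alive)[1] if alive else None

    def accept(self, name):
        """A reader accepts an update only if it comes from the current owner."""
        return self.owner() == name
```

While the primary is alive the backup's updates are ignored; once the primary loses liveliness the backup becomes the owner without any application-level failover code.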
  • 95. DDS Data Persistence. A standalone service that persists data outside of the context of a DataWriter. Can be configured for: • Redundancy • Load balancing. [Diagram: Data Writers and Data Readers on the Global Data Space, with two Persistence Services each backed by Permanent Storage] Demo: 1. PersistenceService 2. ShapesDemo 3. Application failure 4. Application (ShapesDemo) re-start 5. Persistence Svc failure 6. Application re-start. Cleanup database
  • 96. DDS Real-Time Recording Service. Record high-rate data arriving in real-time. Non-intrusive – multicast reception. Applications: – Future analysis and debugging – Post-mortem – Compliance checking – Replay for testing and simulation purposes. Demo: 1. Start RecorderService 2. Start ShapesDemo 3. See output files 4. Convert to: HTML, XML, CSV 5. View Data: HTML, XML, CSV
  • 97. DDS Relational Database Integration. Topic T1 (instances I1, I2, I3) maps to Table T1 (rows I1, I2, I3). Messaging Actions and their Relational Actions: Write() corresponds to UPDATE & INSERT; Read() & Take() correspond to SELECT; Dispose() corresponds to DELETE; Wait() & Listener are event driven. Event driven – The fastest way to observe database changes!
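The mapping between messaging actions and relational actions can be tried out with an in-memory SQLite table. This is a sketch of the idea only, not the RTI database bridge: the table layout (`instance_id`, `value`) and the function names are assumptions made for illustration.

```python
import sqlite3

# In-memory table standing in for "Table T1"; the columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (instance_id TEXT PRIMARY KEY, value REAL)")


def on_write(instance_id, value):
    """Write(): insert the row for a new instance, or update the existing one."""
    conn.execute("INSERT OR REPLACE INTO T1 VALUES (?, ?)", (instance_id, value))


def on_dispose(instance_id):
    """Dispose(): delete the row that represents the instance."""
    conn.execute("DELETE FROM T1 WHERE instance_id = ?", (instance_id,))


def read_all():
    """Read(): select the current state of every instance."""
    return conn.execute(
        "SELECT instance_id, value FROM T1 ORDER BY instance_id").fetchall()
```

Each topic instance is one row keyed by its instance identifier, so repeated writes to I1 update a single row rather than accumulating a log.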
  • 98. DDS Enables Event Processing. CEP: programmable engines used to transform “data” into “information”. CEP engines are programmed using a derivative of SQL. CEP engines save time: they can implement a lot of the application logic: – Classification, Correlation, Aggregation, Filter, Cleansing, Pattern Detection, etc. DDS is the perfect ‘data’ and ‘information’ pipe for CEP engines, which: – Use high-speed data streams (1,000-1,000,000 msg/sec) – Require latency measured in sub-milliseconds – Demand access to events from heterogeneous systems. [Diagram: Market Data, Trades and Low Latency Messages flow through the RTI Global Data Space to a CEP Engine, which feeds Dashboards, Applications and Alerts]
  • 99. Tools provide insight into a distributed system. RTI Analyzer – Understand connections and data flow – Tune QoS properties without changing code. RTI Scope – Capture and monitor packet payloads – Collect time histories of Topic values. RTI Protocol Analyzer – Sniff the wire and analyze traffic
  • 100. #4 Interoperability between platforms & languages. Data accessible to all interested applications: – Data distribution (publishers and subscribers): DDS – Data management (storage, retrieval, queries): SQL – ESB Integration, Business process integration: WSDL – Legacy Java Integration: JMS. [Diagram: Distributed Nodes and DBMSs connected through the Global Data Space via SQL, JMS, DDS and WSDL interfaces]
  • 101. DDS: Multi-Architecture Support • Same API for all platforms • Language Independence: C, C++, Java™, C#, .NET, Ada • Enterprise and Embedded Support: VxWorks®, INTEGRITY®, LynxOS®, Linux, Solaris, Windows • Prototype on any platform. [Diagram: RTI DDS running on Linux, Windows, INTEGRITY and VxWorks]
  • 102. RTI DDS: Pluggable Transports • Enables non-IP centric transports (e.g. InfiniBand) • Allows for multiple transports on same node • Provides high-performance (zero-copy interface) • Saves bandwidth (compact messages & encapsulation). [Diagram: Real-time Applications on RTI DDS over pluggable transports: UDP over IPv4 & IPv6 on a standard IP network (Ethernet, Wifi, etc.), Shared Memory, InfiniBand, Custom (e.g. Radio)]
  • 103. #5 Provides Real-Time Pub-Sub in SOA. [Diagram: Real-Time Pub-Sub/Caching/Messaging layer connecting Real-Time Devices, Fault Tolerance, Auditing & Recording, Tools & Visualization, Database, Event Processing, and SOA & Real-Time Web Services (WS-DDS)]
  • 104. Real-Time SOA Architecture/Implementation. RT Architecture/Technology (DDS Data Bus): High Performance; Event-Driven/Publish-Subscribe; Small footprint; Quality of Service; Support for embedded environments; Support for unreliable & low-bandwidth networks. Traditional Enterprise: Low Performance; Client-Server; Centralized (Server-based); TCP based
  • 105. Conclusions. Implementing your own Data-Link Protocol is HARD. The simplest, most flexible solution is to use middleware to handle the reliability, caching, failover… Middleware must have special features to support the specialized needs of a Data Link: robustness to packet loss and disconnects, good use of bandwidth, etc. DDS is the best choice today – It is a mature international standard from OMG: platform neutral (operating systems and programming languages); deployed worldwide in military systems and other demanding real-time applications – It is mandated by US DoD for publish-subscribe and data-distribution applications – It is ideally suited to UAVs: highly tunable via Quality of Service (QoS); flexible reliability model overcomes TCP problems; can accommodate unreliable & high-latency transports; uses bandwidth efficiently; rich services (persistence, filtering, high-availability)
  • 106. Thank you Gerardo Pardo-Castellote, Ph.D. gerardo.pardo@rti.com www.rti.com
