OPS Forum Migration of ESA Missions to TCP/IP and SLE 15.06.2007

This seminar gives an overview of the migration of all ESA missions controlled by ESOC from X.25 to TCP/IP, and from proprietary protocols between the Mission Control System and the Ground Station to the CCSDS Space Link Extension (SLE) protocol.


  1. 1. OPS-G FORUM, 15 June 2007: Migration of ESA Missions to TCP/IP and SLE. Presented by: M. Bertelsmeier (OPS-ECT), E. M. Soerensen (OPS-ONV), G. Kerr (TOS-GDA), R. P. Bonilla (OPS-OAX)
  2. 2. Agenda
       • Communications Network Infrastructure (M. Bertelsmeier)
       • Overall mission overview (E. Soerensen)
       • XMM Challenge (G. Kerr)
       • XMM case: problems encountered and solutions (R. Pérez Bonilla)
       • Lessons learned
  3. 3. Migration of ESA Ground Station Networking to Internet Protocol Presented by M. Bertelsmeier, OPS-ECT
  4. 4. Migration Drivers
       • ESTRACK/OPSNET strategies: single protocol, use of COTS
       • IP is the world-wide de facto standard
       • IP support is an integral part of COTS TTC building blocks (whereas X.25 is an extra or exception with an unknown future)
       • Control Center internal support via LAN, TCP/IP
       • MOC, SOC, SSC, SDC links supported via routers, TCP/IP
       • IP standardised for SLE support
       • Current packet-switched WAN nearing the point of a complete overhaul
       • Future of X.25 parts and support
  5. 5. Migration Strategy
       • Boundary conditions at start of project (late 2000)
           • no operational impact on missions in orbit or immediately before LEOP (at the time: ERS, XMM, Cluster, ENVISAT, Integral)
           • new system to support future missions (success-oriented: Rosetta, LEOP planned for 2003)
       • Concept
           • upgrade systems so that they can support the current and the future mode concurrently, subject to dynamic reconfigurations
           • implement “dual protocol support capability” on OPSNET and OPSNET subscribers
       • Context
           • maximum alignment with the New Norcia Deep Space Station implementation, the Maspalomas upgrade and the ESTRACK stations back-end modernisations
  6. 6. Phases
       • Phase 1: Preparation and Verification (start late 2000)
           • software adaptations, testbed, end-to-end proof of concept, testing in Rosetta scenario and MEX scenario (high speed TM), including LAN roll-out in stations subject to back-end upgrades
       • Phase 2: Field deployment of dual capability (2001-2005)
           • completion of control center and station upgrades
       • Phase 3: Mission migrations (2002 ff)
           • migrate operations support from X.25 to IP, adapted to mission / station use profiles
           • natural pace, done at windows of opportunity
       • Phase 4: Completion (2006 and ongoing)
           • withdrawal of OPSNET packet switching equipment
  7. 7. Protocol and System Features [Diagram: source-to-sink path across the network, with protocol layers L-1 Physical, L-2 Data Link, L-3 Network, L-4 Transport and L-5,6,7]
  8. 8. ESTRACK OPSNET Links Before and After Migration [Diagram: "Before" shows OCC and station ISS nodes (X25) joined by leased lines (prime / backup ISDN), carrying TM, TC, [RNG] and STC between TMP, TCE, RNG, STC server at the station and NCTRS A/B, MCS A/B, STC client, Sim A/B/D at the OCC. "After (Target)" shows Routers A/B at both ends, OCC NetCore LAN, OPSLAN, Station LAN, M&C LAN, Sim LAN and Ref. Stn LAN, with firewalls to ESA internal LANs, point-to-point Extranet links and the Internet, all inside the ESTRACK Security Perimeter. Routers are marked as requiring change.]
  9. 9. Topologies During Migration
       • Aim: no additional line rental cost for support to two protocols
       • Topologies
           • “Overlay”: for WAN links of poor capacity/price ratio (e.g. Kourou, Santiago, Maspalomas, Malindi); IP-OPSNET as frame relay overlay over X-OPSNET
           • “Side-by-side”: for WAN links of 2 Mbps (KIR, NNO, PER, ESAC); IP-OPSNET and X-OPSNET side by side, using multiplexing interface converters between the WAN line and the switches / routers
  10. 10. Operations Scenarios During Migration [Diagram: conventional and hybrid operations over the WAN, for the overlay and side-by-side topologies]
  11. 11. Requirements on IP-OPSNET
       • Services
           • WAN: digital voice OCC - Stations (ca. 10 to 12 kbit/s)
           • WAN: data OCC - Stations (up to a few hundred kbit/s): TM, TC, STC client/server, orbital data, GPS, auxiliary data, service management, network management
           • LAN: data transit to / from OCC; all remaining data exchanges inside the station, incl. M&C, UPS, BMS, FM (e.g. NNO)
           • Security
       • Capabilities
           • near “non-stop” availability → reliability, redundancy, resilience
           • capacity → performance, modularity, scalability
           • throughput → performance, prioritisation, congestion management
       • Environment
           • WAN circuits with delay and errors (benchmark: 400 ms delay one way, BER 10^-7 both ways)
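To give a feel for what the benchmark bit error rate means at packet level, the following minimal sketch converts a BER into the probability that a packet of a given size arrives corrupted. It assumes independent bit errors, which is a simplification (burst errors behave differently), and the packet sizes are illustrative choices that do not come from the slides.

```python
def packet_error_probability(ber: float, packet_bytes: int) -> float:
    """Probability that at least one bit in the packet is corrupted.

    Assumes independent, uniformly distributed bit errors (a simplification).
    """
    bits = packet_bytes * 8
    return 1.0 - (1.0 - ber) ** bits


if __name__ == "__main__":
    benchmark_ber = 1e-7  # benchmark value quoted on the slide
    # Packet sizes below are illustrative assumptions, not from the slides.
    for size in (64, 576, 1500):
        p = packet_error_probability(benchmark_ber, size)
        print(f"{size:5d} bytes -> P(corrupted) = {p:.6f}")
```

Under these assumptions a 1500-byte packet is corrupted roughly 0.12% of the time, so on a long-delay link even the benchmark BER produces occasional retransmissions.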
  12. 12. Implemented Features / Performances
       • Communications Systems
           • Automatic rerouting in case of line drops and equipment failures (distributed dynamic routing algorithm, Hot-Standby Routing Protocol (HSRP))
           • Throughput maximisation: tuned Frame Relay interface between Cisco routers and Netrix nodes (during overlay phase)
           • Hierarchical bandwidth reservations and prioritisation, better than X.25 (“Quality of Service” system, feasible for on-line and off-line)
           • Provisions for Voice over IP integration
       • Subscribers
           • Feasible UNIX system configurations under Sun Solaris 2.6 and above
           • Tuned TCP stacks to cope with high-delay, high-BER environments
       • End-to-End Connections
           • Stable performance for real-time telemetry at rates up to 256 Kbps with an RTT of 800 ms and a BER of 10^-7
           • Delta-DOR throughput over a load-sharing pair of E1 lines: 95% of wire speed
  13. 13. Present Status
       • SVA, CEB: X.25 never deployed
       • KIR, MSP, KRU, RED, MAL: X.25 idle or already de-installed
       • NNO, PER, VIL: X.25 equipment to be freed of voice support (VIL scheduled next week)
       • AGO: X.25 still in use. The current leased line (128k) is insufficient for XMM retransmission needs and is awaiting cancellation; a new leased line is not planned because of the AGO use prediction. An alternate link concept is under discussion.
       • “IP”-OPSNET is now the “OPSNET”
       • OPSNET is SLE-ready (except AGO)
  14. 14. Papers
       • EXCITE: The Migration of the ESA TTC Network to TCP/IP, TTC 2001
       • The Evolution of ESA Ground Station Communications to Internet Protocol, SpaceOps 2002
       • Network Security and SLE / IP Internetworking for Inter-Agency Cooperation, SpaceOps 2004
       • A Novel Approach for Ground Stations Communications within the ESTRACK Network of ESA, DASIA 2005
       • A Novel and Cost Effective Communications Platform for the ESA Stations Network, RCSGSO 2005
       • Information Technology Solutions for Delta-DOR Large Volume Data Transfers, SpaceOps 2006
       • New Communications Solutions for ESA Ground Stations, ESA Bulletin February 2006
  15. 15. Migration of Missions to SLE: Overall perspective Presented by E. M. Soerensen OPS-ONV
  16. 16. Scope of work
       • Strategy: only SLE will be used in the future (longer term)
       • 9 missions needed to be migrated to SLE
       • NCTRS upgraded to support SLE
       • 13 stations to be upgraded, as of 2000
           • Some stations have TMTCS installed and some have CORTEX; both support SLE
       • A total of 28 configurations (mission/station combinations) had to be implemented and validated
  17. 17. Mapping of Missions and Stations (2004)
  18. 18. Challenge
       • The SCOS-1 missions (ERS-2, ENVISAT and CLUSTER) were a special challenge because they use VMS, and SLE is not supported on VMS
       • Solution: migrate to SUN-based NCTRS for these missions
       • Successfully done
           • N.B. these missions were the first at ESOC to use SLE fully
  19. 19. Status Summary (Station: Status)
       • Cebreros: IP-OPSNET, X25 removed
       • Kiruna: IP-OPSNET, X25 removed
       • Kourou: IP-OPSNET, X25 removed
       • Maspalomas: IP-OPSNET, X25 removed
       • New Norcia: IP-OPSNET, X25 removed
       • Perth: IP-OPSNET, X25 removed
       • Redu: IP-OPSNET, X25 removed
       • Vilspa: IP-OPSNET, X25 removed
       • Santiago: X25 still in use; plan: service contract providing SLE services
  20. 20. SLE Service Providers [Map]
       • Sites shown: Goldstone, CA, U.S.; Madrid, Spain; Kourou, French Guiana; Cebreros, Spain; Villafranca, Spain; Maspalomas, Gran Canaria Island; Redu, Belgium; Kiruna, Sweden; Svalbard, Norway; Weilheim, Germany; Malindi, Kenya; Perth, Australia; New Norcia, Australia; Tromsø, Norway; Esrange, Sweden; St-Hubert, Canada; Kerguelen, France; Hartebeesthoek, Republic of South Africa; Kourou, French Guiana; Aussaguel, France; Kiruna, Sweden; Canberra, Australia
       • Operators/Networks: ESA/ESTRACK, NASA/JPL/DSN, DLR, NSC/KSAT, SSC/Prioranet, CSA, CNES, China
  21. 21. SLE Service Users [Map]
       • NASA/JPL, Pasadena, CA, U.S.
       • Lockheed Martin, Denver, CO, U.S.
       • JHU/APL, Laurel, MD, U.S.
       • NASA/GSFC, Greenbelt, MD, U.S.
       • ESA/ESOC, Darmstadt, Germany
       • DLR/GSOC, Oberpfaffenhofen, Germany
       • JAXA/ISAS, Sagamihara City, Japan
       • CNES, Toulouse, France
       • China
  22. 22. XMM Challenge Presented by G. Kerr, TOS-GDA
  23. 23. LINK TO MAIN Ground Station KOUROU
  24. 24. INITIAL CONSIDERATIONS
       • XMM commands mainly in real time (thousands of commands per hour)
       • Implicit timing constraints on commanding are embedded in the database
       • Commands sent via X.25 received G/S confirmation in 2-3 s from Kourou: OK
       • Commands sent via TCP/IP received G/S confirmation in 6-10 s from Kourou: NOT acceptable
       • INTEGRAL changed TCP/IP buffer sizes on the TMP at Redu; not an option for XMM at Kourou (multi-mission)
       • We concentrated on the NCTRS (TCP/IP negotiates between computers)
  25. 25. PROBLEMS ENCOUNTERED (1)
       • Underlying cause of delays not initially clear
       • Which TCP/IP parameters could or should be modified, and how to get TCP/IP expertise?
       • Confusing and contradictory documentation, mainly aimed at maximising bandwidth utilisation (we have guaranteed bandwidth)
       • Different TCP/IP parameter sets on TCE and NCTRS, so an equivalence is difficult to establish
       • No root privileges to change anything anyway; strong (and understandable) opposition to changing TCP/IP on OPSLAN
  26. 26. PROBLEMS ENCOUNTERED (2)
       • No useful analysis tools available to us: sniffer/snoop output was difficult to interpret, and the requested TCPTRACE / TCPDUMP were not allowed on OPSLAN
       • Initial testing on the ESOC Reference Station was impractical and not representative (satellite link, frame relay over part of the link, router delays, etc.)
       • G/S operator support was needed to help set up the CLCW path from PSS to TCE; setup time was often some hours
  27. 27. XMM case Problems and Solutions Presented by R.P.Bonilla, OPS-OAX
  28. 28. X25 versus TCP/IP
       • X25
           • connection-oriented protocol
           • record based: data is organised in blocks and transmitted one at a time
           • creates packets containing info for reliability; no packet loss, and ensures delivery in order
           • no buffers
           • data flow doesn’t use algorithms
       • TCP/IP
           • TCP is connection-oriented; the underlying IP layer is connectionless
           • stream based: data is organised as a stream of bytes, much like a file
           • creates segments containing info for reliability; lost segments are retransmitted, and delivery in order is ensured
           • buffers at each end point store data to be transmitted before the other side is prepared to read it
           • data flow is based on tuneable algorithms that manage buffers and coordinate traffic
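The record-based versus stream-based distinction matters to the application: TCP does not preserve message boundaries, so a receiver that needs discrete records has to recover them itself. The sketch below illustrates one common way to do this, a length prefix in front of each record. It is a generic illustration, not a description of how NCTRS, TCE or SLE actually frame their data; the 4-byte header and the names `frame` and `Deframer` are assumptions made for the example.

```python
import struct
from typing import List

# 4-byte big-endian length prefix; an illustrative choice, not from the slides.
HEADER = struct.Struct("!I")


def frame(record: bytes) -> bytes:
    """Prefix a record with its length so it can be recovered from a byte stream."""
    return HEADER.pack(len(record)) + record


class Deframer:
    """Rebuilds whole records from arbitrarily sized chunks of a byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every record that is now complete."""
        self._buf.extend(chunk)
        records = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # record not yet fully received
            records.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return records
```

Feeding the deframer three bytes at a time still yields the original records intact, which mimics TCP delivering data in pieces unrelated to how the sender wrote it.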
  29. 29. TCP protocol stack [Diagram: S2K NCTRS at ESOC and TCE / TMP at the Ground Station, each acting as sender / receiver with Application, Transport (TCP socket buffer, output queue, receive queue) and IPv4 (re-assembly buffers) layers. The application calls write() and read(); TCP segments travel as MTU-sized IP packets through routers over the physical link, and TCP ACK packets return the other way.]
  30. 30. Definition of Delay used for analysis
       • Delay: the time the command takes to travel from the NCTRS to the TCE, plus the time the ‘acknowledgement message’ generated by the TCE takes to reach the NCTRS.
       • [Diagram: XMCS and XNCTRS buffers at ESOC, TCE and TMP buffers at the Ground Station; application write() into the output queue, MTU packets through routers over a satellite or terrestrial link, receive queue and read() at the far end, with TCP acknowledgements returning.]
  31. 31. TCP/IP vs X25 (KRU PSS) delay with default TCP/IP
  32. 32. Telemetry and Commanding transfer flow (1): Buffers
       • Transfer window size: buffer B free space
       • Latency
       • Receive buffer B ≥ RTT * Bandwidth
       • [Diagram: on the sender (TMP), the application writes at rate R_p into TCP send buffer A, which holds data while waiting for ACK; segments pass through IP and the NIC as packets at rate R_n to the receiver (NCTRS), where TCP receive buffer B holds data to be read by the application.]
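The sizing rule on this slide, receive buffer ≥ RTT * Bandwidth, is the bandwidth-delay product. As a rough worked example, the sketch below evaluates it for the figures quoted earlier in the deck (256 Kbps telemetry, 800 ms RTT), assuming that 256 Kbps means 256,000 bit/s. The second helper shows the converse: the throughput ceiling imposed by a given window. The 8192-byte window is a hypothetical value chosen for illustration.

```python
def bandwidth_delay_product_bytes(rate_bit_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this rate and RTT full."""
    return rate_bit_s * rtt_s / 8.0


def window_limited_throughput_bit_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8.0 / rtt_s


if __name__ == "__main__":
    rtt = 0.8  # seconds, the RTT quoted for the end-to-end connections
    print(bandwidth_delay_product_bytes(256_000, rtt))  # buffer needed, in bytes
    # Hypothetical 8 KiB window: throughput cannot exceed this many bit/s.
    print(window_limited_throughput_bit_s(8192, rtt))
```

With these numbers the receive buffer needs to hold about 25,600 bytes, and a window of only 8192 bytes would cap the flow at roughly 82 kbit/s regardless of line capacity.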
  33. 33. TCP data flow (1): Buffers and Windows
       • Sender parameters
           • Congestion window = amount of data injected into the network at a particular time.
           • Congestion window max = determined by the link capacity (tuneable), and/or adjusted to the receiver buffer capacity.
           • Data allowed to be sent = min [congestion window, window offered by receiver]
           • Timeout timer = interval waited before retransmitting when an ACK has not been received.
       • [Plot: congestion window (Kbytes) against time (s), showing slow start]
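The rule "data allowed to be sent = min [congestion window, window offered by receiver]" and the slow-start ramp can be illustrated with a small model. This is a textbook simplification (the window doubles once per round trip up to a threshold, then grows by one segment per round trip), not the behaviour of any particular TCP stack. The function name and all numeric values are assumptions chosen for the example.

```python
from typing import List


def slow_start_trace(mss: int, rwnd: int, ssthresh: int, rounds: int) -> List[int]:
    """Bytes the sender may have in flight in each round trip.

    mss      -- segment size in bytes (hypothetical value supplied by the caller)
    rwnd     -- window offered by the receiver, in bytes
    ssthresh -- slow-start threshold, in bytes
    """
    cwnd = mss
    trace = []
    for _ in range(rounds):
        # The sender is limited by the smaller of the two windows.
        trace.append(min(cwnd, rwnd))
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
        else:
            cwnd += mss  # congestion avoidance: linear growth
    return trace


if __name__ == "__main__":
    print(slow_start_trace(mss=1000, rwnd=16000, ssthresh=8000, rounds=6))
    print(slow_start_trace(mss=1000, rwnd=3000, ssthresh=8000, rounds=4))
```

The second call shows a small receiver window capping the flow long before the congestion window does, which is why the receiver buffer size matters on a long-delay link.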
  34. 34. TCP algorithms behaviour, XMM specific
       • R_p = data rate delivered from TMP to the TCP layer (70 kbs).
       • R_n = data rate delivered by TCP to the network.
       • R_line = physical capacity of the comms link.
       • [Plot: rate (kbs) against time (s), showing slow start, nominal operations, packet loss and recovery]
       • Undesirable TCP/IP behaviour in the presence of packet loss
  35. 35. TCP data flow (2): Timers for ACK control → system performance
       • Receiver: tcp_deferred_acks_max (1 -> 8 segments), the maximum number of TCP segments received before forcing out an ACK.
       • tcp_deferred_ack_interval (0.1 s): time interval the sender waits to receive an ACK.
       • Sender: timeout timer initial, timeout timer min and timeout timer max (the diagram is labelled with 0.4 s, 3 s and 60 s).
       • [Diagram: data, ACK and retransmitted data ("rexmit data") exchanged between sender and receiver, with the sender timer reset on ACK]
       • Timers not optimised for XMM latency
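To show how a retransmission timeout bounded by an initial and a maximum value behaves when ACKs fail to arrive, here is a minimal sketch of exponential backoff. Doubling on every unanswered retransmission is the classic TCP approach. Reading the slide's figures as an initial timeout of 3 s and a maximum of 60 s is an assumption about the garbled diagram labels, and the function name is hypothetical.

```python
from typing import List


def backoff_schedule(rto_initial_s: float, rto_max_s: float, attempts: int) -> List[float]:
    """Timeout applied before each successive retransmission of the same segment."""
    rto = rto_initial_s
    schedule = []
    for _ in range(attempts):
        schedule.append(rto)
        rto = min(rto * 2, rto_max_s)  # double the timeout, clamp at the maximum
    return schedule


if __name__ == "__main__":
    # Assumed reading of the slide: initial 3 s, maximum 60 s.
    print(backoff_schedule(3.0, 60.0, 6))
```

On a link with an 800 ms round trip, a timeout floor shorter than the real round-trip time would trigger retransmissions without any loss, which is one way timers can end up "not optimised" for the latency.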
  36. 36. XMM problem (1)
       • Undesirable behaviour: buffering of commands at NCTRS (TCP level), and of ACKs at TCE (TCP level)
       • In our test setup, commands were released from NCTRS to TCP every 2 seconds, but release from TCP to the network layer was much slower
           → buffering of commands at the sender side in order to fill the MTU size
       • Acknowledgements were NOT released from the TCE back to the NCTRS as soon as a segment was received
           → buffering of ACKs at the receiver side
       • Not possible to achieve a reasonable end-to-end delay (MCS to S/C), for which the maximum is 5 s
       • A TCP segment is equal neither to the Max Transfer Unit (MTU) nor to the longest command length
           → TCP transmits the data as a stream of bytes, unrelated to application coding
  37. 37. TCP interval between successive ack’s KRU TCP/IP default set-up Acknowledgements not synchronous with command release
  38. 38. XMM Solution (1)
       • Set the Max Transfer Unit (MTU) to approximately one command of the longest length.
       • Encapsulation of the MTU size between NCTRS and TCE.
       • The number of segments received before forcing an ACK was set to 1 (only on NCTRS; default value = 8). An equivalent parameter was not found on TCE.
       • Telemetry and ACKs separated into 2 different data streams (applying independent Quality of Service to each one on the GS routers).
       • TMP set to deliver data every 1 second, instead of the default value of 2 seconds.
  39. 39. TCP interval between successive ack’s ( tuned parameters on NCTRS ) Desired behaviour after tuning: ACKs synchronised with commands
  40. 40. TCP vs X25 tuned parameters on NCTRS After tuning, TCP/IP now behaves as well as X25.
  41. 41. XMM problem (2)
       • Unnecessary retransmissions at TCP level. Detection of packet loss causes a decrease of the send window, so the system starts to slow down. Packet loss is due to a hit on the link, or to the intrinsically high BER of the satellite link.
       • Telemetry packets received up to 4 minutes late at the MCS, sometimes causing loss of data (‘FIFO buffer full’), due to burst errors on the satellite link.
       • [Diagram: on the sender (TMP), the application writes at rate R_p into TCP send buffer A, which drains to the network at rate R_n; on the receiver (NCTRS), the TCP read buffer is empty. Labels: 2MB, 70kbs.]
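One plausible reading of the "up to 4 minutes late" figure is simple queueing arithmetic: if the sender-side buffer fills during a burst error, newly written telemetry has to wait for everything ahead of it to drain. The sketch below computes that drain time. It assumes the "2MB" label is the size of the sender buffer and that "70kbs" means 70 kbit/s; both are interpretations of the diagram labels rather than statements from the slides.

```python
def drain_time_s(buffer_bytes: float, rate_bit_s: float) -> float:
    """Seconds needed to empty a full buffer at a constant output rate."""
    return buffer_bytes * 8.0 / rate_bit_s


if __name__ == "__main__":
    rate = 70_000  # bit/s, assumed meaning of "70kbs"
    for label, size in (("2 MB (10^6 bytes)", 2 * 10**6), ("2 MiB (2^20 bytes)", 2 * 2**20)):
        t = drain_time_s(size, rate)
        print(f"{label}: {t:.0f} s, about {t / 60:.1f} min")
```

Either interpretation of 2 MB gives a delay close to four minutes, consistent with the lateness observed at the MCS.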
  42. 42. XMM problem (2), graphical [Plot: throughput over time, with markers a = nominal rate, b = bit error, c = burst error]
  43. 43. XMM Solution (2)
       • Increase the guaranteed bandwidth above the theoretical requirement for telemetry.
       • Retransmission timers should be tuned at both the sender and the receiver side. Because of the transmission latency and the low speed of the link, packets are continuously retransmitted, even without errors on the link.
       • Use Selective Acknowledgement (SACK) at the TMP and TCE.
  44. 44. SLE [Diagram: SLE NCTRS at ESOC and SLE TMTCS at the Ground Station, each acting as sender / receiver with Application, Transport (TCP, output queue, receive queue) and IP layers. The application calls write() and read(); TCP segments travel as MTU-sized IP packets through routers over the physical link, and TCP ACK packets return the other way.]
  45. 45. LESSONS LEARNED (1)
       • TCP/IP should be tuned, and tuning is a very complex exercise.
       • Each mission should have one person with final responsibility for ensuring an appropriate comms setup, and with full root authority on all related computers, end-to-end.
       • Ensure that system-level expertise in TCP/IP concepts is strengthened and maintained.
       • The XMM TCP/IP migration effort was radically underestimated.
  46. 46. LESSONS LEARNED (2)
       • The effort of rolling out a new system involving network infrastructure and multiple missions is considerable and was underestimated.
       • When introducing new protocols (TCP/IP, SLE), adequate access to stations and network for operations validation on each mission is critical and must be taken into account.
       • The decision to take SLE as the single supported protocol for ESA or third-party missions was correct.
  47. 47. LESSONS LEARNED (3)
       • Dual-protocol-capable networks and platforms were a very good concept: they allowed migrations at windows of best opportunity, with schedule flexibility and independence from constraints such as changes to the mission model and ESTRACK load.
       • Network design as overlay or side by side on standard, economical leased lines avoided the extra cost of two networks.
       • The TCP/IP protocol suite standards continue to evolve; X.25 does not.
       • The communications network is just that. It can offer different classes of throughput and priority, but control of the load that the “user” system offers to the network has to reside in the “user” system.
       • The design of an e2e communications service has to be understood by all vertical layers involved in source-destination relations.
