SCTP Programmer's Guide
HP-UX 11i v2




 HP Part Number: 5992-0620
 Published: September 2007
 Edition: 1
© Copyright 2007 Hewlett-Packard Development Company, L.P.
Confidential computer software. Valid license from HP required ...
1.    Host A sends a Synchronize (SYN) packet to Host B.
       2.    Upon receiving the SYN packet, Host B allocates reso...
COOKIE-ECHO packet. As a result, the conversation ends without the server
           allocating any resources for the conn...
An SCTP packet contains a common header, and one or more chunks. The SCTP
       common header contains the following info...
Table 1-1 Chunk Types (continued)
                     Chunk                        Definition

                     Heart...
Congestion Control in SCTP
       SCTP uses various congestion control algorithms to effectively handle network failures
 ...
Partial Bytes Acknowledged               Used to adjust the cwnd parameter.
       (partial_byte_acked)
       In an SCTP conn...
Table 1-2 Comparison Between SCTP, TCP, and UDP (continued)
        Feature                                               ...
Figure 1-5 A Single-Homed Connection




In Figure 1-5, Host A contains a single network interface (NIA1) and Host B conta...
may occur because of continued failure to send DATA to the primary address. As a
       result, all DATA chunks are transm...
Figure 1-7 illustrates how multi-streaming works in an SCTP association.

      Figure 1-7 Multistreaming in an SCTP Assoc...
before completing the shutdown process. When an immediate shutdown is required,
       SCTP sends an ABORT message to an e...
the delay between SACKs. The frequency of sending SACKs increases to one per
          received packet if gaps are detecte...
under-utilization of the network link. Depending on the severity of the error, the sender
       can remain in a state of ...
that provides congestion signal to the sender. This is because ECN does not contain
      mechanisms to avoid network elem...
The communication failure detection and protection capabilities of reliable SCTP data
       traffic are also applicable t...
•   Refrain from sending a FORWARD TSN chunk at any time during the lifetime of
          an association.
      •   Check...
SCTP also uses the four SACK rule to avoid retransmission caused by normal
       occurrences, such as packets received ou...
information, together with a valid lifetime and a signature for authentication, and sends
       these back in the INIT AC...
2 SCTP Socket APIs
     This chapter discusses the different SCTP socket API types, their call flow sequence,
     SCTP ev...
However, because of the unique features of SCTP, such as multistreaming and
       multihoming, the existing socket APIs e...
Basic One-to-One Call Flow Sequence
       A one-to-one style SCTP application uses the following system call sequence to ...
SOCK_STREAM            Indicates the creation of a one-to-one style socket.
       IPPROTO_SCTP           Specifies the ty...
The listen() Socket API
        Applications use listen() to prepare the SCTP endpoint for accepting inbound
        assoc...
If the application does not call the bind() API before calling connect(), SCTP picks
       a transient port and chooses ...
descriptor open, so that the receiving endpoint can receive data that SCTP was unable
       to deliver.

The sendmsg() an...
NOTE: A sendmsg() API does not fail if it contains an invalid SCTP stream identifier
       but an error is returned on al...
© Copyright 2007 Hewlett-Packard Development Company, L.P.

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

UNIX is a registered trademark of The Open Group.
Table of Contents

About This Document
     Intended Audience
     Document Organization
     Typographical Conventions
     Related Information
     HP Encourages Your Comments
1 Introduction
     SCTP Overview
     Limitations of TCP and UDP
          Limitations of TCP
          Limitations of UDP
     SCTP Architecture
          SCTP in the IP Stack
          Connection Setup in SCTP
          SCTP Packet
          Congestion Control in SCTP
               Slow Start and Congestion Avoidance Algorithms
               Fast Retransmit and Fast Recovery
     SCTP Features
          Multihoming
          Multistreaming
          Conservation of Data Boundaries
          SCTP Graceful Shutdown Feature
          SCTP Support for IPv4 and IPv6 Addresses
          SCTP Data Exchange Features
          Support for Dynamic Address Reconfiguration
          Reporting Packet Drops to an Endpoint
          Support for ECN-Nonces in SCTP
          SCTP Support for Partially Reliable Data Transmission
     Error Handling in SCTP
          Retransmission of DATA Chunks
          HEARTBEATs to Identify Path Failures
          HEARTBEATs to Identify Endpoint Failure
     SCTP Security
          Cookie Mechanism
          Verification Tag
2 SCTP Socket APIs
     Overview
     Socket API Versus SCTP Socket API
     Different Socket API Styles
          One-to-One Socket APIs
          Basic One-to-One Call Flow Sequence
               The socket() Socket API
               The bind() Socket API
               The listen() Socket API
               The accept() Socket API
               The connect() Socket API
               The close() Socket API
               The shutdown() Socket API
               The sendmsg() and recvmsg() Socket APIs
               The getpeername() Socket API
          One-to-Many Socket APIs
          Basic One-to-Many Call Flow Sequence
               The socket() Socket API
               The bind() Socket API
               The listen() Socket API
               The sendmsg() and recvmsg() Socket APIs
               The close() Socket API
               The connect() Socket API
     API Options to Modify Socket Behavior
     Common Socket Calls
          The send(), sendto(), recv(), and recvfrom() Socket Calls
          The setsockopt() and getsockopt() Socket Calls
          The read() and write() Socket Calls
          The getsockname() Socket Call
     SCTP Events and Notifications
     SCTP Ancillary Data Structures
          SCTP Initiation Structure (SCTP_INIT)
          SCTP Header Information (SCTP_SNDRCV)
     SCTP-Specific Socket APIs
          The sctp_bindx() SCTP Socket API
          The sctp_peeloff() SCTP Socket API
          The sctp_getpaddrs() SCTP Socket API
          The sctp_freepaddrs() SCTP Socket API
          The sctp_getladdrs() SCTP Socket API
          The sctp_freeladdrs() SCTP Socket API
          The sctp_sendmsg() SCTP Socket API
          The sctp_recvmsg() SCTP Socket API
          The sctp_connectx() SCTP Socket API
          The sctp_send() SCTP Socket API
          The sctp_sendx() SCTP Socket API
3 Compiling and Running Applications that Use the SCTP Socket APIs
     Compiling Applications that Use the SCTP APIs
     Running Sample Applications that Use the SCTP APIs
4 Migrating TCP Applications to SCTP
A SCTP Sample Programs
     Sample Server Programs
          One-to-One Server Program
          One-to-Many Server Program
     Sample Client Programs
          One-to-One Client Program
          One-to-Many Client Program
Glossary
Index
List of Figures

1-1   The Internet Protocol Stack
1-2   Three-Way Handshake in TCP
1-3   Four-Way Handshake in SCTP
1-4   SCTP Packet Format
1-5   A Single-Homed Connection
1-6   A Multihomed Connection
1-7   Multistreaming in an SCTP Association
1-8   Shutdown in TCP and SCTP

List of Tables

1-1   Chunk Types
1-2   Comparison Between SCTP, TCP, and UDP
2-1   Data Structures in the recvmsg() and sendmsg() Calls

List of Examples

3-1   Sample Commands to Compile the Server and Client Programs
3-2   Sample Command to Run the Server Application
3-3   Sample Command to Run the Client Application
About This Document

This document describes how to write, compile, and run applications using Stream Control Transmission Protocol (SCTP) socket APIs on systems running HP-UX 11i v2. HP's implementation of SCTP conforms to the RFCs and RFC drafts listed in “Related Information” (page 14).

The document printing date and part number indicate the document’s current edition. The printing date will change when a new edition is printed. Minor changes may be made at reprint without changing the printing date. The document part number will change when extensive changes are made.

The latest version of the document will be available at:

http://www.docs.hp.com

Document updates can be issued between editions to correct errors or document product changes. To ensure that you receive the updated or new edition, subscribe to the appropriate support service. Contact your HP sales representative for details.

Intended Audience

This document is intended for application developers who write programs using SCTP socket APIs. Application developers are expected to be familiar with SCTP, C, UNIX®, TCP, UDP, networking concepts, and operating system concepts. Application developers should read the relevant SCTP RFCs for detailed information on SCTP. This document is not a tutorial.

Document Organization

The SCTP Programmer's Guide is organized as follows:

Chapter 1     Chapter 1 (page 17) introduces SCTP. It also discusses the SCTP architecture, the message format, congestion control, fault management, SCTP security, and error handling.
Chapter 2     Chapter 2 (page 41) describes the different socket API styles, SCTP events and notifications, common socket options, common socket calls, SCTP ancillary data structures, and the new SCTP-specific socket APIs.
Chapter 3     Chapter 3 (page 69) describes how to compile and run applications that use the SCTP APIs.
Chapter 4     Chapter 4 (page 73) describes how to migrate existing TCP applications to SCTP. It also discusses the benefits of migrating TCP applications to SCTP.

Typographical Conventions

This document uses the following typographical conventions:
audit(5)        An HP-UX manpage. The name of the manpage is audit and 5 is the section in the HP-UX Reference. On the web and on the Instant Information CD, it may be a link to the manpage itself. From the HP-UX command line, you can enter “man audit” or “man 5 audit” to view the manpage. See man(1).
Book Title      The title of a book. On the web and on the Instant Information CD, it may be a link to the book itself.
KeyCap          The name of a keyboard key. Note that Return and Enter both refer to the same key.
Emphasis        Text that is emphasized.
Emphasis        Text that is strongly emphasized.
Term            The defined use of an important word or phrase.
ComputerOut     Text displayed by the computer.
UserInput       Commands and other text that you type.
Command         A command name or qualified command phrase.
Variable        The name of a variable that you may replace in a command or function, or information in a display that represents several possible values.
[]              The contents are optional in formats and command descriptions.
{}              The contents are required in formats and command descriptions. If the contents are a list separated by |, you must choose one of the items.
...             The preceding element may be repeated an arbitrary number of times.
|               Separates items in a list of choices.

Related Information

The following related documents are available for the SCTP product:

•   SCTP Administrator's Guide at: http://docs.hp.com/en/netcom.html
•   SCTP Release Notes at: http://docs.hp.com/en/netcom.html
•   Request for Comments (RFC) documents:
    — RFC 2960 (Stream Control Transmission Protocol) at: http://www.ietf.org/rfc/rfc2960.txt?number=2960
    — RFC 3286 (An Introduction to the Stream Control Transmission Protocol (SCTP)) at: http://www.ietf.org/rfc/rfc3286.txt?number=3286
    — RFC 3873 (Stream Control Transmission Protocol (SCTP) Management Information Base (MIB)) at: http://www.ietf.org/rfc/rfc3873.txt?number=3873
    — RFC 3309 (Stream Control Transmission Protocol (SCTP) Checksum Change) at: http://www.ietf.org/rfc/rfc3309.txt?number=3309
    — RFC 3758 (Stream Control Transmission Protocol (SCTP) Partial Reliability Extension) at: http://www.ietf.org/rfc/rfc3758.txt?number=3758
    — RFC 4460 (Stream Control Transmission Protocol (SCTP) Specification Errata and Issues) at: http://www.ietf.org/rfc/rfc4460.txt?number=4460
•   Draft RFCs:
    — draft-ietf-tsvwg-sctpsocket-10.txt at: http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-sctpsocket/draft-ietf-tsvwg-sctpsocket-10.txt
    — draft-ietf-tsvwg-addip-sctp-10.txt (Stream Control Transmission Protocol (SCTP) Dynamic Address Reconfiguration) at: http://tools.ietf.org/wg/tsvwg/draft-ietf-tsvwg-addip-sctp/draft-ietf-tsvwg-addip-sctp-10.txt
    — draft-stewart-sctp-pktdrprep-02.txt (Stream Control Transmission Protocol (SCTP) Packet Drop Reporting) at: http://tools.ietf.org/html/draft-stewart-sctp-pktdrprep-02
    — draft-ladha-sctp-nonce-01.txt (ECN Nonces for Stream Control Transmission Protocol (SCTP)) at: http://tools.ietf.org/html/draft-ladha-sctp-nonce-05

HP Encourages Your Comments

HP encourages your comments concerning this document. We are committed to providing documentation that meets your needs. Send any errors found, suggestions for improvement, or compliments to:

feedback@fc.hp.com

Include the document title, manufacturing part number, and any comment, error found, or suggestion for improvement you have concerning this document.
1 Introduction

This chapter introduces Stream Control Transmission Protocol (SCTP). It also discusses the SCTP architecture, the features that SCTP supports, the security features that SCTP offers, and error handling.

This chapter addresses the following topics:

•   “SCTP Overview” (page 17)
•   “Limitations of TCP and UDP” (page 18)
•   “SCTP Architecture” (page 19)
•   “SCTP Features” (page 27)
•   “Error Handling in SCTP” (page 37)
•   “SCTP Security” (page 38)

SCTP Overview

SCTP is a connection-oriented transport layer protocol that enables reliable transfer of data over IP-based networks. In an IP stack, it exists at a level equivalent to that of Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). SCTP offers all the features that are supported by TCP and UDP. It also overcomes certain limitations in TCP and adopts the beneficial features of UDP.

SCTP offers the following features:

•   Network-level fault tolerance through support for multihoming
•   Minimized delay in data delivery by sending data in multiple streams
•   Acknowledged, error-free non-duplicated transfer of data
•   Data fragmentation to conform to discovered maximum transmission unit (MTU) size
•   Sequenced delivery of user messages within multiple streams
•   Optional bundling of multiple user messages into an SCTP packet
•   Improved SYN-flood protection
•   Preservation of message boundaries

SCTP also includes mechanisms, such as checksums, sequence numbers, and selective retransmission of data, to detect data corruption, loss of data, and duplication of data. In addition, it contains different congestion control algorithms to minimize data loss in an unstable network. SCTP supports improved error handling methods to avoid unnecessary retransmission of data. The security methods implemented in SCTP enable the endpoints of an association to avoid SYN-flooding, and to identify stale or unwanted data packets.

Initially, the features of SCTP were designed to transport telephone signaling messages over IP networks. Other applications that require similar features can also use SCTP.
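The checksum mechanism mentioned above can be illustrated with a short sketch. RFC 3309, one of the RFCs listed for this implementation, specifies CRC32c as the SCTP checksum algorithm. The C function below is a minimal bit-at-a-time illustration of CRC32c, offered as an assumption for teaching purposes rather than as HP's implementation; the name sctp_crc32c_sketch is hypothetical. A real stack computes the value over the whole SCTP packet with the checksum field set to zero and handles the byte order of the stored value as the RFC describes; neither detail is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an HP-UX API): bitwise CRC32c (Castagnoli)
 * over a byte buffer. Uses the reflected polynomial 0x82F63B78, an
 * all-ones initial value, and a final inversion of all bits. */
uint32_t sctp_crc32c_sketch(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift out one bit; fold in the polynomial if that bit was 1. */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0x82F63B78u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

For the nine ASCII bytes "123456789", the widely published CRC32c check value is 0xE3069283, which is a convenient sanity check for any implementation. Production code would normally use a table-driven or hardware-assisted variant instead of this bitwise loop.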
NOTE: In SCTP, the term “stream” refers to a sequence of user messages that are delivered in sequence, with respect to other messages within the same stream. In TCP, “stream” refers to a sequence of bytes.

HP's implementation of SCTP conforms to the following RFCs and draft RFCs:

•   RFC 3286 (An Introduction to the Stream Control Transmission Protocol (SCTP))
•   RFC 2960 (Stream Control Transmission Protocol)
•   RFC 3873 (Stream Control Transmission Protocol (SCTP) Management Information Base (MIB))
•   RFC 4460 (Stream Control Transmission Protocol (SCTP) Specification Errata and Issues)
•   RFC 3309 (Stream Control Transmission Protocol (SCTP) Checksum Change)
•   RFC 3758 (Stream Control Transmission Protocol (SCTP) Partial Reliability Extension)
•   draft-ladha-sctp-nonce-01.txt (ECN Nonces for Stream Control Transmission Protocol (SCTP))
•   draft-ietf-tsvwg-addip-sctp-10.txt (Stream Control Transmission Protocol (SCTP) Dynamic Address Reconfiguration)
•   draft-stewart-sctp-pktdrprep-02.txt (Stream Control Transmission Protocol (SCTP) Packet Drop Reporting)

Limitations of TCP and UDP

TCP and UDP are the most widely used transport layer protocols. However, the data transfer services offered by these protocols are inadequate to meet the requirements of a wide range of commercial applications, such as real-time multimedia and telecommunication applications. These applications require a robust protocol that provides the flexibility of UDP and the reliability of TCP for transferring data between two endpoints. This section discusses the limitations of TCP and UDP that led to the development of SCTP.

This section addresses the following topics:

•   “Limitations of TCP” (page 18)
•   “Limitations of UDP” (page 19)

Limitations of TCP

Following are the limitations of TCP:

•   TCP provides reliable data transfer, but it delivers data only in strict sequence. Some applications need reliable data transfer, though not necessarily in a strict sequence. These applications prefer partial ordering of data, wherein ordering is maintained only within subflows of data. The strict sequence maintenance in TCP not only makes partial ordering of data impossible, it also causes unnecessary delay in the overall data delivery. Moreover, if a single packet is lost, delivery of subsequent packets is blocked until the lost TCP packet is retransmitted and delivered. This is known as head-of-line (HOL) blocking.
•   TCP transmits data as a byte stream. This requires that applications add their own record marking to delineate their messages. Applications must use the PUSH flag in the TCP header to ensure that a complete message is transferred in reasonable time.
•   A TCP connection is bound to a single network interface on each host, and the connection is established between those two interfaces. As a result, if the connection breaks because of a path failure, data becomes unavailable until the connection is re-established.
•   TCP is vulnerable to denial of service (DoS) attacks, such as SYN flood attacks. A SYN flood occurs when a malicious host forges IP packets with fake source addresses and sends a large number of TCP SYN messages to the victim host. Each time the TCP stack on the victim host receives a new SYN message, it allocates kernel resources to service that message. When the TCP stack is flooded with SYN messages, the victim host can run out of resources and fail to service new legitimate SYN messages.

Limitations of UDP

Following are the limitations of UDP:

•   UDP is a connectionless protocol that does not acknowledge or retransmit data, so the transfer of data is unreliable. An application that uses UDP cannot verify whether a packet has reached the destination.
•   UDP does not contain a built-in congestion control mechanism to detect path congestion. As a result, more data may be injected into an already congested network. This results in data loss.
•   If stringent rules for reliable data transfer are implemented in applications that use UDP, the implementation causes additional overhead and complexity in the applications.
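The record-marking burden described under “Limitations of TCP” can be sketched as follows. Because TCP delivers an undifferentiated byte stream, an application typically frames each message itself, for example with a length prefix. The helper names frame_encode and frame_decode, and the choice of a 4-byte big-endian length prefix, are illustrative assumptions and are not part of any HP-UX or SCTP API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_HDR_LEN 4u  /* assumed 4-byte big-endian length prefix */

/* Hypothetical helper: writes [length][payload] into out.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t frame_encode(const uint8_t *msg, uint32_t len, uint8_t *out, size_t cap)
{
    assert(out != NULL && (msg != NULL || len == 0));
    if (cap < FRAME_HDR_LEN || cap - FRAME_HDR_LEN < len)
        return 0;
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    if (len > 0)
        memcpy(out + FRAME_HDR_LEN, msg, len);
    return FRAME_HDR_LEN + (size_t)len;
}

/* Hypothetical helper: tries to extract one message from the bytes
 * received so far. Returns the number of bytes consumed, or 0 if the
 * frame is still incomplete and more stream data is needed. */
size_t frame_decode(const uint8_t *buf, size_t avail,
                    const uint8_t **msg, uint32_t *len)
{
    assert(buf != NULL && msg != NULL && len != NULL);
    if (avail < FRAME_HDR_LEN)
        return 0;
    uint32_t n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (avail - FRAME_HDR_LEN < n)
        return 0;
    *msg = buf + FRAME_HDR_LEN;  /* payload starts right after the prefix */
    *len = n;
    return FRAME_HDR_LEN + (size_t)n;
}
```

A TCP receiver would call frame_decode each time more bytes arrive and wait for further data whenever it returns 0. Because SCTP preserves message boundaries itself, an SCTP application does not need a framing layer of this kind.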
SCTP Architecture SCTP is designed to address the shortcomings in TCP. It uses mechanisms, such as four-way handshake to prevent DoS attacks. The SCTP architecture defines packet format that contains additional fields, such as cookie and verification tag, to avoid SYN flooding. The SCTP architecture includes improved congestion control algorithms that are effective in controlling congestion in unstable networks. This section addresses the following topics: • “SCTP in the IP Stack” (page 20) • “Connection Setup in SCTP” (page 21) SCTP Architecture 19
• “SCTP Packet”
• “Congestion Control in SCTP”

SCTP in the IP Stack
Figure 1-1 illustrates a typical IP stack and denotes the layer in which SCTP is located.

Figure 1-1 The Internet Protocol Stack

An Internet protocol stack contains several layers and each layer provides a specific functionality. Following are the layers in an IP stack and their functionalities:
• The physical layer defines the physical means of sending data over network devices.
• The data link layer transfers data between network entities, and detects and corrects errors that can occur in the physical layer.
• The network layer routes data packets from the sender to the receiver in the network. The most common network layer protocol is IP.
• The transport layer enables transfer of data between endpoints using the services of the network layer. This layer has two primary protocols, the Transmission
Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP supports reliable and sequential packet delivery through error recovery and flow control mechanisms. UDP is a simpler, message-based, connectionless protocol. SCTP is another transport layer protocol that application developers can use to transmit data between endpoints.
• The socket layer provides the transport layer with an interface to interact with the application layer. The socket layer contains a set of APIs through which the transport layer interfaces with the application layer.
• The application layer provides application programs with an interface to communicate and transfer data across the network. All application layer protocols use the socket layer as their interface to the transport layer protocol.

Connection Setup in SCTP
This section discusses the connection setup between two endpoints in TCP and SCTP. It also discusses how the connection setup in SCTP helps prevent DoS attacks.
Both TCP and SCTP initiate a new connection with a packet handshake. TCP uses a three-way handshake to set up a new connection, whereas SCTP uses a four-way handshake.
Figure 1-2 illustrates the three-way handshake in TCP.

Figure 1-2 Three-Way Handshake in TCP

The following steps describe the three-way handshake in TCP:
1. Host A sends a Synchronize (SYN) packet to Host B.
2. Upon receiving the SYN packet, Host B allocates resources for the connection and sends a Synchronize-Acknowledge (SYN-ACK) packet to Host A.
3. Host A sends an ACK packet to confirm the receipt of the SYN-ACK packet. The connection is set up between Host A and Host B, and Host A can now start sending data to Host B.

Figure 1-3 illustrates the four-way handshake in SCTP.

Figure 1-3 Four-Way Handshake in SCTP

The following steps describe the four-way handshake in SCTP:
1. Host A initiates an association by sending an INIT chunk to Host B.
2. Host B responds with an INIT-ACK chunk that contains the following fields:
   • A verification tag
   • A cookie
   The TCP SYN-ACK packet does not contain these fields. The cookie contains the state information that Host B (the server) later needs to allocate resources for the association. The cookie field includes a signature for authenticity and a timestamp to prevent replay attacks that use old cookies. Unlike TCP, Host B in SCTP does not allocate resources at this point in the connection setup. The verification tag provides a key that enables Host A to verify that an SCTP packet belongs to the current association.
3. Host A sends the COOKIE-ECHO chunk to Host B. If Host A has a forged IP address, it never receives the INIT-ACK chunk. This prevents Host A from sending the
COOKIE-ECHO chunk. As a result, the conversation ends without the server allocating any resources for the connection.
4. Host B responds with a COOKIE-ACK chunk and allocates resources for the connection. The connection is now established between Host A and Host B. Host A can now start sending data to Host B.

In SCTP, the transfer of data may be delayed because of the additional handshake, so the four-way handshake may seem less efficient than a three-way handshake. To overcome this delay, SCTP permits DATA chunks to be bundled with the COOKIE-ECHO and COOKIE-ACK chunks.

SCTP Packet
SCTP transmits data in the form of messages and each message contains one or more packets. Figure 1-4 illustrates an SCTP packet format.

Figure 1-4 SCTP Packet Format
An SCTP packet contains a common header, and one or more chunks. The SCTP common header contains the following information:
• Source and destination port numbers to enable multiplexing of different SCTP associations at the same address.
• A 32-bit verification tag that guards against the insertion of an out-of-date or false message into the SCTP association.
• A 32-bit checksum for error detection. The checksum can be either a 32-bit CRC (CRC32C) checksum or an Adler-32 checksum.

A chunk can be either a control chunk or a DATA chunk. A control chunk incorporates different flags and parameters, depending on the chunk type. The DATA chunk incorporates flags to control segmentation and reassembly, and parameters for the Transmission Sequence Number (TSN), Stream Identifier (SID), Stream Sequence Number (SSN), and a Payload Protocol ID. The DATA chunk contains the actual data payload.
Each control and DATA chunk in the SCTP packet contains the following information:

Chunk Type
This field identifies the type of information contained in the Chunk Data field. The value of the Chunk Type field ranges from 0 to 254. The value 255 is reserved for future use, as an extension field. SCTP defines one DATA chunk type and 12 control chunk types. Table 1-1 lists the definitions of the different chunk types.

Table 1-1 Chunk Types
Chunk                                       Definition
Payload Data (DATA)                         Used for data transfer.
Initiation (INIT)                           Initiates an SCTP association between two endpoints.
Initiation Acknowledgement (INIT ACK)       Acknowledges the receipt of an INIT chunk. The INIT ACK chunk carries the cookie; the association is established only after the COOKIE ECHO and COOKIE ACK chunks are exchanged.
Selective Acknowledgement (SACK)            Acknowledges the receipt of the DATA chunks and also reports gaps in the data.
Cookie Echo (COOKIE ECHO)                   Used during the initiation process. The endpoint initiating the association sends the COOKIE ECHO chunk to the peer endpoint.
Cookie Acknowledgement (COOKIE ACK)         Acknowledges the receipt of the COOKIE ECHO chunk. The COOKIE ACK chunk must take precedence over any DATA chunk or SACK chunk sent in the association. The COOKIE ACK chunk can be bundled with DATA chunks or SACK chunks.
Heartbeat Request (HEARTBEAT)               Tests the connectivity of a specific destination address in the association.
Heartbeat Acknowledgement (HEARTBEAT ACK)   Acknowledges the receipt of the HEARTBEAT chunk.
Abort Association (ABORT)                   Informs the peer endpoint to close the association. The ABORT chunk also informs the receiver of the reason for aborting the association.
Operation Error (ERROR)                     Reports error conditions. The ERROR chunk contains parameters that determine the type of error.
Shutdown Association (SHUTDOWN)             Triggers a graceful shutdown of an association with a peer endpoint.
Shutdown Acknowledgement (SHUTDOWN ACK)     Acknowledges the receipt of the SHUTDOWN chunk at the end of the shutdown process.
Shutdown Complete (SHUTDOWN COMPLETE)       Concludes the shutdown procedure.

Chunk Flag
This field contains flags, such as U (unordered bit), B (beginning fragment bit), and E (ending fragment bit). Usage of this field depends on the chunk type specified in the Chunk Type field. Unless otherwise specified, SCTP sets this field to 0 while transmitting the packet and ignores the chunk flag on receipt of the packet.

Chunk Length
This field represents the combined size, in bytes, of the Chunk Type, Chunk Flag, Chunk Length, and Chunk Data fields.

Chunk Data
This field contains the actual information to be transferred in the chunk. The usage and format of this field depend on the chunk type.

The number of chunks in an SCTP packet is determined by the MTU size of the transmission path. Multiple chunks can be bundled into one SCTP packet, except the INIT, INIT ACK, and SHUTDOWN COMPLETE chunks. The SCTP packet size must not be more than the MTU size.
The SCTP packet format supports bundling of multiple DATA and control chunks into a single packet, to improve transport efficiency. An application can control bundling, to avoid bundling during initial transmission.
Bundling still occurs on retransmission of DATA chunks, to reduce the possibility of congestion. If the user data does not fit into one packet, SCTP fragments the data into multiple DATA chunks.
For more information on the SCTP packet format, see RFC 2960 (Stream Control Transmission Protocol).
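The packet layout described in this section (a 12-byte common header followed by bundled chunks, each with a type, flag, and length field) can be sketched as a small parser. This is an illustrative sketch only: the struct and function names are hypothetical and do not come from the HP-UX headers, and the rule that each chunk is padded to a 4-byte boundary is taken from RFC 2960 rather than from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COMMON_HDR_LEN 12u   /* ports (2+2), verification tag (4), checksum (4) */
#define CHUNK_HDR_LEN   4u   /* type (1), flags (1), length (2) */

/* Hypothetical in-memory views of the on-wire fields (host byte order). */
struct common_hdr { uint16_t src_port, dst_port; uint32_t vtag, checksum; };
struct chunk_view { uint8_t type, flags; uint16_t length; const uint8_t *value; };

static uint16_t rd16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the common header. Returns 0 on success, -1 if the packet is too short. */
int parse_common_hdr(const uint8_t *pkt, size_t len, struct common_hdr *h)
{
    assert(pkt != NULL && h != NULL);
    if (len < COMMON_HDR_LEN)
        return -1;
    h->src_port = rd16(pkt);
    h->dst_port = rd16(pkt + 2);
    h->vtag     = rd32(pkt + 4);
    h->checksum = rd32(pkt + 8);
    return 0;
}

/* Decode the chunk at offset off (start with COMMON_HDR_LEN).
 * Returns the offset of the following chunk, or 0 if no well-formed chunk
 * starts at off. */
size_t next_chunk(const uint8_t *pkt, size_t len, size_t off, struct chunk_view *c)
{
    size_t padded;

    assert(pkt != NULL && c != NULL);
    if (off > len || len - off < CHUNK_HDR_LEN)
        return 0;
    c->type   = pkt[off];
    c->flags  = pkt[off + 1];
    c->length = rd16(pkt + off + 2);          /* includes the 4 header bytes */
    if (c->length < CHUNK_HDR_LEN || c->length > len - off)
        return 0;
    c->value = pkt + off + CHUNK_HDR_LEN;
    padded = ((size_t)c->length + 3u) & ~(size_t)3u;   /* 4-byte alignment */
    return (off + padded > len) ? len : off + padded;
}
```

A caller would invoke parse_common_hdr once and then loop on next_chunk until it returns 0. The sketch does not verify the checksum or the verification tag, which a real receiver must do before acting on any chunk.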

Congestion Control in SCTP
SCTP uses various congestion control algorithms to effectively handle network failures or unexpected traffic surges, and to ensure quick recovery from data congestion. SCTP and TCP support the same set of congestion control algorithms. Following are the congestion control algorithms supported by SCTP:
• Slow Start and Congestion Avoidance
• Fast Retransmit and Fast Recovery
However, in SCTP, the congestion control algorithms are modified to suit the protocol-specific requirements. For information on the TCP congestion control algorithms, see RFC 2581 (TCP Congestion Control).
This section addresses the following topics:
• “Slow Start and Congestion Avoidance Algorithms”
• “Fast Retransmit and Fast Recovery”

Slow Start and Congestion Avoidance Algorithms
The slow start and congestion avoidance algorithms are used to control the amount of outstanding data being injected into the network. SCTP uses the slow start algorithm at the beginning of the transmission, when the network condition is unknown, and also when repairing loss detected by the retransmission timer. SCTP slowly probes the network to determine its available capacity and to avoid congesting it. If SCTP detects congestion in the network, it switches to the congestion avoidance algorithm to manage the congestion.
The slow start and congestion avoidance algorithms use the following congestion control variables:

Congestion window (cwnd)            Specifies the limit on the amount of data the sender can transmit through the network, before receiving an acknowledgement. This variable is maintained for each destination address.
Receiver window (rwnd)              Specifies the receiver’s limit on the amount of outstanding data.
                                    NOTE: The smaller of the cwnd and rwnd values determines the amount of data that can be transmitted.
Slow start threshold (ssthresh)     Determines whether the slow start or congestion avoidance algorithm must be used to control data transmission.
Partial Bytes Acknowledged (partial_bytes_acked)     Used to adjust the cwnd parameter.

In an SCTP connection, the sender uses the slow start algorithm if the value of cwnd is less than the ssthresh value. If the value of cwnd is greater than the ssthresh value, the sender uses the congestion avoidance algorithm. If the values of cwnd and ssthresh are the same, the sender can use either the slow start or the congestion avoidance algorithm.
Unlike TCP, an SCTP sender must store the cwnd, ssthresh, and partial_bytes_acked congestion control variables for each destination address of the peer. However, the sender needs to store only one rwnd value for the whole association, irrespective of whether the peer is multihomed or contains only one address.

Fast Retransmit and Fast Recovery
The fast retransmit congestion control algorithm is used to intelligently retransmit missing segments of information in an SCTP association. When a receiver in an SCTP connection receives a DATA chunk out of sequence, the receiver sends a SACK chunk that reports the out-of-sequence TSN to the sender. The fast retransmit algorithm treats four SACK chunks that report the same missing data as an indication of data loss, and retransmits the DATA chunks without waiting for the retransmission timer to expire. After the fast retransmit algorithm sends the DATA chunks that appear to be missing, the fast recovery algorithm controls the transmission of new data until all the lost segments are retransmitted.

SCTP Features
The Signaling Transport (SIGTRAN) Working Group in IETF developed SCTP to address the limitations in TCP and UDP. Though the development of SCTP was directly motivated by the need to transfer Public Switched Telephone Network (PSTN) signaling messages across the IP network, SIGTRAN ensured that the design also meets the needs of other applications with similar requirements.
Table 1-2 compares features of SCTP, TCP, and UDP.

Table 1-2 Comparison Between SCTP, TCP, and UDP
Feature                                         SCTP   TCP      UDP
State required at each endpoint                 yes    yes      no (1)
Reliable data transfer                          yes    yes      no
Congestion control and avoidance                yes    yes      no
Message boundary conservation                   yes    no (2)   yes
Path MTU discovery and message fragmentation    yes    yes (2)  no
Message bundling                                yes    yes (2)  no
Multi-homed hosts support                       yes    no       no
Multi-stream support                            yes    no       no
Unordered data delivery                         yes    no       yes
Security cookie against SYN flood attack        yes    no       no
Built-in heartbeat (reachability check)         yes    no (3)   no

(1) In UDP, a node can communicate with another node without going through a setup procedure, or without changing any state information. However, each UDP packet contains the required state information to form a connection, so that an ongoing state need not be maintained at each endpoint.
(2) TCP does not preserve any message boundaries. It treats all the data passed from its upper layer as a formatless stream of data bytes. However, because TCP transfers data as a sequence of bytes, it can automatically resize all the data into new TCP segments that are suitable for the Path MTU, before transmitting them.
(3) TCP implements a keep-alive mechanism, which is similar to the SCTP HEARTBEAT chunk. In TCP, however, the keep-alive interval is, by default, set to two hours for state cleanup. In SCTP, the HEARTBEAT chunk is used to facilitate fast failover.

This section addresses the following topics:
• “Multihoming”
• “Multistreaming”
• “Conservation of Data Boundaries”
• “SCTP Graceful Shutdown Feature”
• “SCTP Support for IPv4 and IPv6 Addresses”
• “SCTP Data Exchange Features”
• “Support for Dynamic Address Reconfiguration”
• “Reporting Packet Drops to an Endpoint”
• “Support for ECN-Nonces in SCTP”
• “SCTP Support for Partially Reliable Data Transmission”

Multihoming
Multihoming is the ability of a single SCTP endpoint to contain multiple interfaces with different IP addresses. In a single-homed connection, an endpoint contains only one network interface and one IP address. Figure 1-5 illustrates the single-homed connection in TCP.

Figure 1-5 A Single-Homed Connection

In Figure 1-5, Host A contains a single network interface (NIA1) and Host B contains a single network interface (NIB1). NIA1 is the only interface for Host A to interact with Host B. When a network or path failure occurs, the endpoint is completely isolated from the network. Compared to TCP, multihoming in SCTP gives an association a better chance of surviving a network failure.
The built-in support for multihomed hosts in SCTP enables a single SCTP association to run across multiple links or paths, to achieve link or path redundancy. This enables an SCTP association to achieve faster failover from one link or path to another, with minimum interruption in the data transfer service.
Figure 1-6 illustrates the multihomed connection in SCTP.

Figure 1-6 A Multihomed Connection

In this figure, Host A contains multiple network interfaces to interact with Host B, which also has multiple interfaces. SCTP selects a single address as the "primary" address and uses it as the destination for all DATA chunks for normal transmission. All the other addresses are considered as alternate IP addresses. SCTP uses these alternate IP addresses to retransmit DATA chunks and to improve the probability of reaching the remote endpoint. Retransmission
may occur because of continued failure to send DATA to the primary address. As a result, all DATA chunks are transmitted to the alternate address until the HEARTBEAT chunks have re-established contact with the primary address.
During the initiation of an association, the SCTP endpoints exchange the list of IP addresses, so that each endpoint can receive messages from any of the addresses associated with the remote endpoint. For security reasons, SCTP sends response messages to the source address in the message that prompted the response.
An endpoint can receive messages that are out of sequence or with different address pairs, because multihoming supports multiple IP addresses. To overcome this problem, SCTP incorporates procedures to resolve parallel initiation attempts into a single association.

Multistreaming
Multistreaming enables data to be sent in multiple, independent streams in parallel, so that data loss in one stream does not affect or stop the delivery of data in other streams. An SCTP association uses two sets of sequence numbers, namely a Transmission Sequence Number (TSN) that governs the transmission of messages and the detection of message loss, and the Stream ID/Stream Sequence Number (SID/SSN) pair that determines the sequence of delivery of the received data.
TCP transmits data sequentially in the form of bytes in a single stream and ensures that all the bytes are delivered in a particular order. Therefore, a byte is delivered to the receiving application only after all the preceding bytes have arrived. The sequential delivery of data causes delay when a message loss or sequence error occurs within the network. An additional delay occurs because TCP stops delivering data until the correct sequencing is restored, either upon receiving an out-of-sequence message or by retransmitting a lost message.
The strict preservation of message sequence in TCP poses a limitation for certain applications.
These applications require sequencing only of messages that affect the same resource (such as the same call or the same channel). Other messages are only loosely correlated and can be delivered without maintaining the overall sequence integrity.
The multistreaming feature in SCTP, in which reliable data transmission and data delivery are independent of each other, overcomes this problem. This feature also avoids HOL blocking. This independence improves the flexibility of an application, by allowing it to define semantically different streams of data inside the overall SCTP message flow, and by enforcing message ordering only within each of the streams. As a result, message loss in one particular stream does not affect the delivery of messages in a different stream.
The receiver can immediately determine if there is a gap in the transmission sequence (for example, caused by message loss), and can also determine whether messages received following the gap are within the affected stream. If SCTP receives a message that belongs to the affected stream, a corresponding gap occurs in the SSN. The receiver can continue to deliver messages in the unaffected streams while buffering messages in the affected stream until retransmission occurs.
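The per-stream ordering described above can be sketched as follows. This is a simplified model for illustration, not the HP-UX implementation: the names stream_rx, assoc_rx, and on_message, the limit of four streams, and the eight-message reordering window are all assumptions. Each stream tracks the next SSN it may deliver, so a gap in one stream holds back only that stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STREAMS 4   /* hypothetical number of inbound streams */
#define WINDOW      8   /* out-of-order messages this sketch can hold per stream */

struct stream_rx {
    uint16_t next_ssn;        /* next SSN to hand to the application */
    uint8_t  held[WINDOW];    /* held[k] != 0: SSN next_ssn + k has arrived */
};

struct assoc_rx {
    struct stream_rx s[MAX_STREAMS];
};

/* Record the arrival of a message with the given SID and SSN.
 * Returns the number of messages of that stream that become deliverable
 * in order, or -1 if the message falls outside what the sketch can track. */
int on_message(struct assoc_rx *a, uint16_t sid, uint16_t ssn)
{
    struct stream_rx *st;
    uint16_t ahead;
    int delivered = 0;

    assert(a != NULL);
    if (sid >= MAX_STREAMS)
        return -1;
    st = &a->s[sid];
    ahead = (uint16_t)(ssn - st->next_ssn);   /* distance modulo 65536 */
    if (ahead >= WINDOW)
        return -1;                            /* duplicate, or too far ahead */
    st->held[ahead] = 1;

    /* Deliver every message that is now contiguous with next_ssn. */
    while (st->held[0]) {
        memmove(st->held, st->held + 1, WINDOW - 1);
        st->held[WINDOW - 1] = 0;
        st->next_ssn++;
        delivered++;
    }
    return delivered;
}
```

With a zero-initialized assoc_rx, a message with SSN 1 on stream 1 is held back while SSN 0 of that stream is missing, yet messages on stream 0 continue to be delivered. When the retransmitted SSN 0 arrives, the buffered messages of stream 1 are released together.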

Figure 1-7 illustrates how multistreaming works in an SCTP association.

Figure 1-7 Multistreaming in an SCTP Association

NOTE: By default, SCTP contains two streams. SCTP uses stream 0 as the default stream to transmit data. Applications can modify the number of streams through which SCTP transmits data.

Conservation of Data Boundaries
In SCTP, a sending application can construct a message out of a block of data bytes and instruct SCTP to transport the message to a receiving application. SCTP guarantees the delivery of this message (data block) in its entirety. It also indicates to the receiver both the beginning and the end of the data block. This is called conservation of message boundaries.
TCP does not conserve data boundaries. It treats all the data passed to it from the sending application as a sequence or stream of data bytes. It delivers all the data bytes to the receiver in the same sequential order as they were passed from the application. TCP does not conserve data boundaries when packets arrive out of sequence. As a result, the receiver cannot rearrange the packets. It has to wait until the packets arrive in sequence, starting from the last unreceived packet to the received out-of-sequence packet.

SCTP Graceful Shutdown Feature
SCTP does not support a "half-open" connection, which can occur in TCP. In a half-open connection, even though an endpoint indicates that it has no more data to send, the other endpoint can continue to send data indefinitely. SCTP, on the other hand, assumes that when the shutdown procedure begins, both the endpoints will stop sending new data across the association. It also assumes that it needs only to clear up acknowledgements of the previously sent data.
The SCTP shutdown feature uses a three-message procedure to gracefully shut down the association, in which each endpoint has confirmed the receipt of the DATA chunks
before completing the shutdown process. When an immediate shutdown is required, SCTP sends an ABORT message to an endpoint.
Figure 1-8 illustrates graceful shutdown in SCTP and the half-closed state in TCP.

Figure 1-8 Shutdown in TCP and SCTP

SCTP Support for IPv4 and IPv6 Addresses
SCTP supports both IPv4 and IPv6 address parameters in an SCTP packet, as defined in RFC 2960 (Stream Control Transmission Protocol). When an association is set up, the SCTP endpoints exchange the list of addresses of the endpoints in the INIT and INIT-ACK chunks. The address of the endpoint is represented by the following parameters: an IPv4 address parameter with type value 5 and an IPv6 address parameter with type value 6. The INIT chunks can contain multiple addresses, each of which can be an IPv4 or IPv6 address.

SCTP Data Exchange Features
This section discusses the enhanced features in SCTP that ensure reliable data exchange between endpoints.
Following are the data exchange features in SCTP:
• In SCTP, data is transmitted in the form of packets. Each packet can carry DATA chunks and control chunks. An SCTP endpoint acknowledges the receipt of a DATA chunk by sending a SACK chunk to the other endpoint. The SACK chunk indicates the range of cumulative TSNs and non-cumulative TSNs, if any. The non-cumulative TSNs indicate gaps in the received TSN sequence. When SCTP identifies gaps in the TSN sequence, it resends the missing DATA chunks to the other endpoint. SCTP uses the “delayed ack” method to send the SACK chunks. In this method, a SACK is sent for every second packet, but with an upper limit on
the delay between SACKs. The frequency of sending SACKs increases to one per received packet if gaps are detected in the TSN sequence. For information on an SCTP packet, see “SCTP Packet”.
• SCTP contains various congestion control algorithms, such as slow start, congestion avoidance, fast recovery, and fast retransmit, to control the flow and retransmission of data. For information on these congestion control algorithms, see “Congestion Control in SCTP”. In these algorithms, the receiver advertises the receive window and the sender maintains a per-path congestion window to handle congestion. The receive window indicates the buffer occupancy of the receiver. The per-path congestion window manages the packets in flight. The congestion control algorithms in SCTP are similar to those of TCP, except that the endpoints in an SCTP connection manage the conversion between bytes sent and received, and TSNs sent and received. This is because a TSN is attached only to a chunk.
• An HP-UX application can specify a lifetime for the data to be transmitted. If the lifetime of the data has expired and the data has not been transmitted, the data, such as time-sensitive signaling messages, can be discarded. If the lifetime of the data has expired and the data has been transmitted, the data must be delivered to avoid a hole in the TSN sequence.

Support for Dynamic Address Reconfiguration
SCTP enables an endpoint to reconfigure the IP address information dynamically for an existing association. Because the endpoints otherwise exchange address information only during association startup, this feature improves the usability of SCTP without modifying the base SCTP protocol. This feature is useful in computational and networking applications that add or remove physical interface cards dynamically and need the IP address of the interface to be changed dynamically.
This feature also enables an endpoint to set the primary destination address of a remote peer so that when the primary address of an endpoint is deleted, the remote peer is informed of the address to which the data must be sent.
To enable SCTP to reconfigure IP addresses dynamically, SCTP defines the following chunk types:

Address Configuration Change Chunk (ASCONF)          The ASCONF chunk communicates the configuration change requests that must be acknowledged, to the remote endpoint.
Address Configuration Acknowledgment (ASCONF-ACK)    The ASCONF-ACK chunk is used by the receiver of an ASCONF chunk to acknowledge the reception of the ASCONF chunk.

Reporting Packet Drops to an Endpoint
When a packet drop occurs because of an error other than congestion, an endpoint can mistakenly interpret the packet drop as an indication of congestion in the network. The misinterpretation can cause an SCTP sender to stop sending packets. This results in
under-utilization of the network link. Depending on the severity of the error, the sender can remain in a state of congestion, which affects the performance of the association.
SCTP contains the PKTDROP chunk, which reports packets that are dropped because of errors other than congestion. Using the PKTDROP chunk, an SCTP endpoint can inform its peer that it has received an SCTP packet with an incorrect CRC32C or Adler-32 checksum. The peer can then retransmit the SCTP packet without modifying the congestion window.
For information on packet drop scenarios, see draft-stewart-sctp-pktdrprep-02.txt (Stream Control Transmission Protocol (SCTP) Packet Drop Reporting) at:
http://tools.ietf.org/html/draft-stewart-sctp-pktdrprep-02

Support for ECN-Nonces in SCTP
With the increased deployment of real-time applications and transport services that are sensitive to the delay and loss of packets, relying on packet loss alone as an indication of congestion is not sufficient. SCTP's congestion management algorithms have built-in techniques, such as Fast Retransmit and Fast Recovery, to minimize the impact of losses. These mechanisms consider the network as a black box and continue to send packets until packets are dropped because of congestion. However, these mechanisms are not intended to help applications that are sensitive to the delay or loss of one or more individual packets.
With the inclusion of active queue management techniques in the Internet infrastructure, routers can assist in managing congestion. When congestion occurs and the sender continues to send packets, the number of packets in the queue in the router increases and causes a bottleneck in the router. In such a case, the router marks the packets with congestion experienced (CE) bits and sends them to the receiver to indicate congestion, instead of dropping the packets. Explicit Congestion Notification (ECN) is a congestion management algorithm that uses a similar method to handle congestion.
ECN uses the ECN field in the IP header to mark the packets. The data sender sets an ECN-Capable Transport (ECT) code point in this field to indicate that the endpoints are ECN-capable. The router sets the congestion experienced (CE) code point to indicate congestion. The two ECT code points are 10, ECT(0), and 01, ECT(1). The value 00 marks a packet that is not ECN-capable, and 11 is the CE code point. Senders use the ECT(0) or ECT(1) code point to indicate ECT for each packet.
ECN uses the following information to provide congestion notifications:
• Negotiation between the endpoints during connection setup to determine whether they are both ECN-capable.
• An ECN-Echo (ECNE) flag in the transport protocol header (the TCP header, in the case of TCP), which enables the data receiver to inform the data sender when a CE packet is received.
• A congestion window reduced (cwr) flag in the transport protocol header, which enables the data sender to inform the data receiver that the congestion window has been reduced.

The drawback in ECN is that a poorly implemented receiver or an intermediate network element, such as a router, firewall, or intrusion detection system, can erase the ECNE flag
that provides the congestion signal to the sender. This is because ECN does not contain mechanisms to prevent network elements from clearing the ECNE flag. Moreover, ECN requires the cooperation of the receiver to return congestion experienced signals to the sender. If the receiver erases the congestion signals to conceal congestion and does not send these signals to the sender, the receiver's connection gains a performance advantage at the expense of competing connections that respond to congestion correctly.
SCTP supports the ECN method and is therefore exposed to misbehaving receivers that conceal congestion signals. The misbehavior includes concealment of ECNE signals, which may cause an SCTP sender to be aggressive and unfair to compliant flows. SCTP supports ECN-nonce to prevent misbehaving receivers from concealing congestion signals. ECN-nonce also protects senders from other forms of misbehavior, such as optimistic acknowledgements and false duplicate TSN notifications.
The ECN-nonce is a modification of the ECN signaling mechanism. It improves congestion control by preventing receivers from exploiting ECN to gain an unfair share of network bandwidth. ECN-nonce improves the robustness of ECN by preventing receivers from concealing marked or dropped packets. Like ECN, ECN-nonce uses the ECT(0) and ECT(1) code points in the IP header, and the cwr and ECNE signals.
The ECN-nonce uses the two bits of the IP header called the ECT bits. The sender randomly generates a single-bit nonce and encodes it in the ECT code points, ECT(0) or ECT(1). To indicate congestion in the network, routers overwrite the ECT code point with the CE code point, which erases the nonce. The nonce sum (NS) is a cumulative one-bit addition of the nonces that the receiver has received. The receiver calculates the nonce sum and returns it in the NS flag of the SACK chunk. The sender verifies the value of the NS flag in the SACK chunk.
An incorrect nonce sum implies that one or more nonces are missing at the receiver, because all the nonces are required to calculate the correct nonce sum. If an incorrect nonce sum is received by the sender without ECNE signals, the sender can infer that the receiver is concealing congestion notifications.
The ECN-nonce support in SCTP includes the following:
• A single nonce-supported parameter in the INIT or INIT-ACK chunk that is exchanged during the association establishment, to indicate to the peer whether ECN-nonce is supported at both endpoints.
• A single-bit flag in the SACK chunk called the Nonce Sum (NS).

SCTP Support for Partially Reliable Data Transmission
SCTP supports a partially reliable data transmission service (PR-SCTP) that enables an SCTP sender to signal the receiver that it must not expect certain data from the SCTP sender. PR-SCTP enables an ordered and unreliable data transfer service between endpoints, in addition to unordered and unreliable data transfer (similar to UDP). PR-SCTP employs the same congestion control and congestion avoidance algorithms as SCTP, for both reliable and partially reliable data traffic.
The communication failure detection and protection capabilities of reliable SCTP data traffic are also applicable to partially reliable data traffic. PR-SCTP enables an endpoint to quickly detect a failed destination address and to fail over to an alternate destination address. It also provides notification when the destination address becomes unreachable.

The chunk bundling capability in SCTP enables reliable and unreliable messages to be multiplexed over a single PR-SCTP association. Multiplexing enables a single protocol (that is, SCTP) to be used to transmit different types of messages, instead of using separate protocols.

SCTP includes the following parameter and chunk to support the partially reliable data transmission service:

The Forward-TSN-Supported parameter
    This is an optional parameter in the INIT and INIT ACK chunks. When an association is initialized, the SCTP sender must include this parameter in the INIT or INIT ACK chunk to inform its peer that it supports partially reliable data service.

The Forward Cumulative TSN (FORWARD TSN) chunk
    The receiver sends this chunk to a sender to indicate its support for PR-SCTP. An SCTP sender uses this chunk to inform the receiver to move its cumulative received TSN forward, because the missing TSNs are associated with data chunks that must not be transmitted or retransmitted by the sender.

The timed-reliability service is an example of a partially reliable service that SCTP provides to the upper layer using PR-SCTP. This service enables the service user to indicate a limit on the duration of time that the sender must try to transmit or retransmit the message.

If an SCTP endpoint supports the FORWARD TSN chunk, it can include the Forward-TSN-Supported parameter in the INIT chunk to indicate support for the FORWARD TSN chunk to its peer. If an endpoint chooses not to include the Forward-TSN-Supported parameter, it cannot send or process a FORWARD TSN chunk at any time during the lifetime of an association.
Instead, it must behave as if it does not support the FORWARD TSN chunk and return an error to the peer upon the receipt of any FORWARD TSN chunk.

When a receiver of an INIT or INIT ACK chunk detects a Forward-TSN-Supported parameter and does not support the FORWARD TSN chunk type, the receiver may optionally respond with the Unsupported Parameters parameter, as defined in Section 3.3.3 of RFC 2960.

A receiver can perform the following tasks if it receives an INIT chunk that does not contain the Forward-TSN-Supported parameter:
• Include the Forward-TSN-Supported parameter in the INIT ACK chunk.
• Record the information that the peer does not support the FORWARD TSN chunk.
• Refrain from sending a FORWARD TSN chunk at any time during the lifetime of an association.
• Check with the upper layer if it has requested a notification on whether the peer endpoint supports the Forward-TSN-Supported parameter.

Error Handling in SCTP

The network traffic in the Internet is unpredictable. Sudden network failures and traffic surges can occur, which can make an endpoint unreachable. Such a network is error prone, and a sending application must be cautious while transmitting or retransmitting data, because the receiving endpoint may be unavailable to receive data. The unavailability of the endpoint is caused either by a path failure or an endpoint failure.

SCTP offers appropriate error handling methods to overcome this problem. Before transmitting data, SCTP sends chunks of information to verify whether a destination is active. Before using a different path to reach a destination or closing an association, SCTP ensures that the destination address is unreachable or inactive.

SCTP uses the following error handling methods:
• Retransmission of DATA chunks
• HEARTBEATs to identify path failures
• HEARTBEATs to identify endpoint failures

This section addresses the following topics:
• “Retransmission of DATA Chunks” (page 37)
• “HEARTBEATs to Identify Path Failures” (page 38)
• “HEARTBEATs to Identify Endpoint Failure” (page 38)

Retransmission of DATA Chunks

SCTP uses DATA chunks to exchange information between two addresses. Upon receiving a DATA chunk, the receiving address sends an acknowledgement to the sending address. If the receiving address does not receive the DATA chunk properly, it sends a SACK packet that triggers the sending address to retransmit the DATA chunk. The sending address also retransmits the DATA chunk when the retransmission timer times out.

SCTP limits the rate of retransmission of DATA chunks, to reduce chances of congestion.
It modifies the retransmission timeout (RTO) value based on estimates of the round-trip delay, and reduces the transmission rate exponentially when message loss increases.

In an active SCTP association with constant DATA transmission, SACKs are more likely to cause retransmission than the retransmission timeout. To reduce unnecessary retransmission of data, SCTP uses the four SACK rule: SCTP retransmits a DATA chunk only after receiving the fourth SACK that indicates a missing DATA chunk.
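The four SACK rule can be illustrated with a small counter kept for each outstanding DATA chunk. The following C sketch is an assumption-laden illustration, not the SCTP stack's code: struct outstanding_chunk, on_missing_report(), and MISSING_REPORT_THRESHOLD are hypothetical names, and the threshold of four simply mirrors the rule as stated in this guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Threshold taken from the "four SACK rule" stated in the text. */
#define MISSING_REPORT_THRESHOLD 4

/* Hypothetical per-chunk bookkeeping kept by the sender. */
struct outstanding_chunk {
    uint32_t tsn;              /* TSN of the DATA chunk */
    unsigned missing_reports;  /* SACKs so far that reported this TSN missing */
};

/*
 * Record one SACK that reports the chunk as missing.
 * Returns 1 when the chunk should be retransmitted now, 0 otherwise.
 */
int on_missing_report(struct outstanding_chunk *c)
{
    assert(c != NULL);
    c->missing_reports++;
    if (c->missing_reports >= MISSING_REPORT_THRESHOLD) {
        c->missing_reports = 0;   /* start counting again after retransmitting */
        return 1;
    }
    return 0;
}
```

With this sketch, the first three SACKs that report a TSN as missing return 0, and only the fourth returns 1.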
SCTP also uses the four SACK rule to avoid retransmission caused by normal occurrences, such as packets received out of sequence.

HEARTBEATs to Identify Path Failures

SCTP periodically sends HEARTBEAT chunks to idle destinations or alternate addresses, to identify a path failure. SCTP maintains a counter to store the number of heartbeats that are sent to the inactive destination without receiving a corresponding HEARTBEAT ACK chunk. When the counter reaches the specified maximum value, SCTP declares the destination address as inactive. SCTP notifies the application about the inactive destination address and starts using an alternate address for sending the DATA chunks. However, SCTP continues to send heartbeats to the inactive destination address until it receives a HEARTBEAT ACK chunk. On receipt of the HEARTBEAT ACK chunk, SCTP considers the destination address as active again.

The rate at which SCTP sends heartbeats depends on the sum of the RTO value and the delay parameter, which allows heartbeat traffic to be tailored to the needs of the user application.

HEARTBEATs to Identify Endpoint Failure

SCTP identifies an endpoint failure in a way that is similar to the path failure detection discussed in “HEARTBEATs to Identify Path Failures” (page 38). SCTP maintains a counter across all destination addresses, to store the number of retransmissions or heartbeats sent to the remote endpoint without a successful ACK. When the value of the counter exceeds a preconfigured maximum value, SCTP declares the endpoint as unreachable and closes the association.

SCTP Security

SCTP uses the following methods to provide security:
• Cookie Mechanism
• Verification Tag

This section addresses the following topics:
• “Cookie Mechanism” (page 38)
• “Verification Tag” (page 39)

Cookie Mechanism

A cookie mechanism is employed during the initialization of an association, to provide protection against security attacks.
The cookie mechanism uses a four-way handshake, and the last two legs of the handshake are allowed to carry user data for fast setup. The cookie mechanism guards against a blind attacker generating INIT chunks to overload the resources of an SCTP server by causing the server to use memory and resources to handle new INIT requests. Instead of allocating memory for a Transmission Control Block (TCB), the server creates a cookie parameter with the TCB information, together with a valid lifetime and a signature for authentication, and sends these back in the INIT ACK chunk. The blind attacker cannot obtain the cookie, because the INIT ACK always goes back to the source address of the INIT. A valid SCTP client gets the cookie and returns it in the COOKIE ECHO chunk, where the SCTP server can validate the cookie and use it to rebuild the TCB. The cookie is created by the server, and the cookie format and secret key remain with the server. The server does not exchange these details with the client.

Verification Tag

A verification tag is a 32-bit unsigned integer that is randomly generated to verify whether the SCTP packet belongs to the current association, or is a stale packet from a previous association. SCTP discards packets received without the expected verification tag value, to protect against blind masquerade attacks and also from receiving stale SCTP packets from a previous association. The verification tag rules apply when sending or receiving SCTP packets that do not contain an INIT, SHUTDOWN COMPLETE, COOKIE ECHO, ABORT, or a SHUTDOWN ACK chunk.

While sending an SCTP packet, the endpoint must fill in the verification tag field of the outbound packet with the tag value in the Initiate Tag parameter of the INIT or INIT ACK received from its peer. After receiving an SCTP packet, the endpoint must ensure that the value in the verification tag field of the received SCTP packet matches its own tag. If the received verification tag value does not match the receiver's own tag value, the receiver silently discards the packet and does not process it any further. The verification tag value is chosen by each endpoint of the association during association startup.
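The two verification tag rules above (stamp outbound packets with the peer's tag, accept inbound packets only if they carry the local tag) can be sketched in C as follows. This is a simplified illustration under stated assumptions: struct assoc, struct packet_hdr, stamp_outbound(), and accept_inbound() are hypothetical names, tags are kept in host byte order, and the special handling of packets that carry INIT, SHUTDOWN COMPLETE, COOKIE ECHO, ABORT, or SHUTDOWN ACK chunks is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an association. */
struct assoc {
    uint32_t my_vtag;    /* tag this endpoint chose at association startup */
    uint32_t peer_vtag;  /* Initiate Tag received in the peer's INIT or INIT ACK */
};

/* Hypothetical, simplified packet header (host byte order in this sketch). */
struct packet_hdr {
    uint32_t verification_tag;
};

/* Fill the verification tag of an outbound packet with the peer's tag. */
void stamp_outbound(const struct assoc *a, struct packet_hdr *out)
{
    assert(a != NULL && out != NULL);
    out->verification_tag = a->peer_vtag;
}

/*
 * Returns 1 if an inbound packet carries this endpoint's own tag,
 * 0 if the packet must be silently discarded.
 */
int accept_inbound(const struct assoc *a, const struct packet_hdr *in)
{
    assert(a != NULL && in != NULL);
    return in->verification_tag == a->my_vtag;
}
```

A packet left over from an earlier association, or forged by an attacker who cannot see the traffic, would carry a different tag and fail the accept_inbound() check.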
2 SCTP Socket APIs

This chapter discusses the different SCTP socket API types, their call flow sequence, SCTP events and notifications, socket options, common socket calls, and the SCTP ancillary data structures.

This chapter addresses the following topics:
• “Overview” (page 41)
• “Socket API Versus SCTP Socket API” (page 41)
• “Different Socket API Styles” (page 42)
• “API Options to Modify Socket Behavior” (page 52)
• “Common Socket Calls” (page 54)
• “SCTP Events and Notifications” (page 57)
• “SCTP Ancillary Data Structures” (page 58)
• “SCTP-Specific Socket APIs” (page 61)

Overview

The socket layer in an IP stack contains socket APIs that enable the transport layer to interface with the application layer. The socket APIs make the various protocol-specific features available to an application.

SCTP contains the existing socket APIs and the SCTP-specific APIs. Both these APIs enable SCTP to interface with the application layer. These APIs are also compatible with TCP applications, which can be migrated to SCTP with minimum changes.

Following are the design objectives of the SCTP socket APIs:
• Maintain consistency and ensure compatibility with the existing sockets APIs
• Define socket mapping for SCTP that is consistent with other socket API protocols, such as UDP, TCP, IPv4, and IPv6
• Support a one-to-many style interface
• Support a one-to-one style interface

The following sections discuss the differences between the socket API and the SCTP socket APIs, the different SCTP socket API styles, data structures that enable applications to control an association, and socket APIs to modify the socket options.

Socket API Versus SCTP Socket API

The SCTP APIs use the existing socket APIs to perform operations that are similar to the operating behavior of the socket APIs. For example, in both the existing socket APIs and the SCTP socket APIs, an application can call the bind() API only once and can specify only a single address in the bind() API.
However, because of the unique features of SCTP, such as multistreaming and multihoming, the existing socket APIs either do not work on an SCTP socket, or the semantics of the socket APIs need modification. For example, because of the multihoming feature supported in SCTP, the socket APIs getsockname() and getpeername() do not work on an SCTP socket if a given association is bound to multiple local addresses and the association has multiple peer addresses. Applications must use the sctp_getpaddrs() SCTP socket API to obtain the peer addresses in an association.

Unlike the existing socket APIs, the SCTP socket APIs disclose many features of the SCTP protocol and association status to the application, to enable applications to gain better control over the SCTP protocol. For example, an application can specify some of the association setup parameters, such as the number of desired outbound streams and the maximum number of inbound streams, to control an association.

Different Socket API Styles

This section discusses the different socket API styles and the basic call flow sequence of each socket API style. Following are the different socket API styles:
• One-to-one socket APIs
• One-to-many socket APIs

The one-to-one style API is similar to the existing socket APIs for a connection-oriented protocol, such as TCP. The one-to-many style API facilitates simultaneous associations with multiple peers using one endpoint (that is, it associates with multiple peers using one socket file descriptor simultaneously).

These socket API styles share common data structures and operations. However, each socket API style requires a different application programming style. You can use these socket APIs to implement all the SCTP features. You can also select the API style depending on the type of association you need in the application.
This section addresses the following topics:
• “One-to-One Socket APIs” (page 42)
• “Basic One-to-One Call Flow Sequence” (page 43)
• “One-to-Many Socket APIs” (page 48)
• “Basic One-to-Many Call Flow Sequence” (page 48)

One-to-One Socket APIs

The one-to-one style socket APIs are designed to enable the existing TCP applications to migrate to SCTP with minimal changes. The sequence of socket calls made by the client and server of a one-to-one style SCTP application is similar to the sequence of socket calls made by a TCP application. A one-to-one style SCTP application can control only one association using one file descriptor.
Basic One-to-One Call Flow Sequence

A one-to-one style SCTP application uses the following system call sequence to prepare an SCTP endpoint for servicing requests:
1. socket()
2. bind() or sctp_bindx()
3. sctp_getladdrs()
4. sctp_freeladdrs()
5. listen()
6. accept()
   When a client sends a connection request to the server, the accept() call returns with a new socket descriptor. The server then uses the new socket descriptor to communicate with the client, using recv() and send() calls to receive requests and send responses.
7. sctp_getpaddrs()
8. sctp_freepaddrs()
9. recv() or recvmsg()
10. send() or sctp_sendx() or sctp_send()
11. close() terminates the association.

An SCTP client uses the following system call sequence to set up an association with a server to request services:
1. socket()
2. connect() or sctp_connectx()
   After returning from connect(), the client uses send() and recv() calls to send out requests and receive responses from the server.
3. The client calls close() to terminate this association when done.

For more information about the one-to-one style socket calls, see “Common Socket Calls” (page 54).

The socket() Socket API

Applications call socket() to create a socket descriptor that represents an SCTP endpoint. Following is the syntax for the socket() socket API:
    int socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
or
    int socket(PF_INET6, SOCK_STREAM, IPPROTO_SCTP);
where:
PF_INET         Specifies the IPv4 domain.
PF_INET6        Specifies the IPv6 domain.
SOCK_STREAM     Indicates the creation of a one-to-one style socket.
IPPROTO_SCTP    Specifies the type of the protocol.

The first syntax of the socket() socket API creates an endpoint that can use only IPv4 addresses, while the second syntax creates an endpoint that can use both IPv6 and IPv4 addresses.

The bind() Socket API

Applications use bind() to specify the local address with which an SCTP endpoint must associate. These addresses, associated with a socket, are eligible transport addresses for the endpoint to send and receive data. The endpoint also presents these addresses to its peers during the association initialization process.

To accept new associations on the socket, the endpoint must call listen() after calling bind(). For information on listen(), see “The listen() Socket API” (page 45).

Following is the syntax for the bind() API:
    ret = bind(int sd, struct sockaddr *addr, socklen_t addrlen);
where:
sd          Represents the socket descriptor returned by the socket() call.
addr        Represents the address structure (struct sockaddr_in or struct sockaddr_in6).
addrlen     Represents the size of the address structure.

If sd is an IPv4 socket, the address passed must be an IPv4 address. If sd is an IPv6 socket, the address passed can either be an IPv4 or an IPv6 address.

Applications cannot call bind() multiple times to associate multiple addresses to an endpoint. After the first call to bind(), all the subsequent calls return an error.

If addr is specified as a wildcard (INADDR_ANY for an IPv4 address, or IN6ADDR_ANY_INIT or in6addr_any for an IPv6 address), the operating system associates the endpoint with an optimal address set of the available interfaces.

If bind() is not called before a sendmsg() call that initiates a new association, the endpoint picks a transient port and chooses an address set that is equivalent to binding with a wildcard address. One of the addresses in the address set serves as the primary address for the association.
Thus, when an application calls bind() with the INADDR_ANY or the IN6ADDR_ANY_INIT wildcard address, the multihoming feature is enabled in SCTP.

The completion of the bind() process alone does not prepare the SCTP endpoint to accept inbound SCTP association requests. Until a listen() system call is performed on the socket, the SCTP endpoint promptly rejects an inbound INIT request using an ABORT.
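The socket(), bind(), and listen() steps described so far can be combined into a short C sketch. It is a minimal illustration, not a reference implementation: the helper names wildcard_addr() and open_one_to_one_listener() are hypothetical, only IPv4 is handled, and the fallback definition of IPPROTO_SCTP (132, the assigned IP protocol number for SCTP) is an assumption for systems whose headers do not define it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132   /* assumed fallback: IP protocol number of SCTP */
#endif

/* Build an IPv4 wildcard address (INADDR_ANY) for the given port. */
struct sockaddr_in wildcard_addr(unsigned short port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    return sa;
}

/*
 * socket() -> bind() -> listen() for a one-to-one style endpoint.
 * Returns the listening descriptor, or -1 with errno set (for example,
 * when the system has no SCTP support).
 */
int open_one_to_one_listener(unsigned short port, int backlog)
{
    struct sockaddr_in sa = wildcard_addr(port);
    int sd, saved;

    assert(backlog >= 0);
    sd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (sd < 0)
        return -1;
    if (bind(sd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(sd, backlog) < 0) {
        saved = errno;      /* keep the original error across close() */
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

Binding to the wildcard address lets the system choose the address set, which is what enables multihoming in the description above. A caller would typically follow a successful return with accept() on the returned descriptor.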
The listen() Socket API

Applications use listen() to prepare the SCTP endpoint for accepting inbound associations. Following is the syntax for the listen() socket API:
    int listen(int sd, int backlog);
where:
sd          Represents the socket descriptor of the SCTP endpoint.
backlog     Represents the maximum number of outstanding associations allowed in the accept queue of the socket. These associations have completed the four-way initiation handshake and are in the ESTABLISHED state. A backlog of 0 (zero) indicates that the caller no longer wants to receive new associations.

The accept() Socket API

Applications use the accept() call to remove an established SCTP association from the accept queue. The accept() API returns a new socket descriptor to represent the newly formed association. Following is the syntax for the accept() socket API:
    new_sd = accept(int sd, struct sockaddr *addr, socklen_t *addrlen);
where:
new_sd      Represents the socket descriptor for the newly formed association.
sd          Represents the listening socket descriptor.
addr        Contains the primary address of the peer endpoint.
addrlen     Specifies the size of addr.

The connect() Socket API

Applications use connect() to initiate an association with a peer. Following is the syntax for the connect() socket API:
    int connect(int sd, const struct sockaddr *addr, socklen_t addrlen);
where:
sd          Represents the socket descriptor of the endpoint.
addr        Represents the address of the peer.
addrlen     Represents the size of the address.

By default, the newly created association has only one outbound stream. Applications must use the SCTP_INITMSG option before connecting to the server, to change the number of outbound streams. You can set and get the SCTP_INITMSG socket option using the setsockopt() and getsockopt() APIs.
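The client side of the one-to-one call flow (socket() followed by connect()) can be sketched in C as follows. This is a hedged illustration only: fill_peer_addr() and connect_one_to_one() are hypothetical helper names, only IPv4 is handled, the association keeps the default number of outbound streams because the SCTP_INITMSG option is not set, and the fallback definition of IPPROTO_SCTP is an assumption for systems whose headers do not define it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132   /* assumed fallback: IP protocol number of SCTP */
#endif

/*
 * Fill `peer` from a dotted-decimal IPv4 string and a port.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address.
 */
int fill_peer_addr(const char *ip, unsigned short port,
                   struct sockaddr_in *peer)
{
    assert(ip != NULL && peer != NULL);
    memset(peer, 0, sizeof(*peer));
    peer->sin_family = AF_INET;
    peer->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &peer->sin_addr) == 1 ? 0 : -1;
}

/*
 * socket() -> connect() for a one-to-one style client.
 * Returns the connected descriptor, or -1 with errno set.
 */
int connect_one_to_one(const char *ip, unsigned short port)
{
    struct sockaddr_in peer;
    int sd, saved;

    if (fill_peer_addr(ip, port, &peer) < 0) {
        errno = EINVAL;
        return -1;
    }
    sd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (sd < 0)
        return -1;
    if (connect(sd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        saved = errno;      /* keep the original error across close() */
        close(sd);
        errno = saved;
        return -1;
    }
    return sd;
}
```

After a successful return, the client would use send() and recv() on the descriptor and call close() when it has finished.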
If the application does not call the bind() API before calling connect(), SCTP picks a transient port and chooses an address set that is equivalent to binding with INADDR_ANY and IN6ADDR_ANY for IPv4 and IPv6 sockets, respectively. One of these addresses serves as the primary address for the association. When an application calls bind() with the INADDR_ANY or the IN6ADDR_ANY_INIT wildcard address, the multihoming feature is enabled in SCTP.

The close() Socket API

Applications use close() to gracefully close down an association. Following is the syntax for the close() socket API:
    int close(int sd);
where:
sd          Represents the socket descriptor of the association to be closed.

After an application calls close() on a socket descriptor, no further socket operations succeed on that descriptor.

The shutdown() Socket API

Applications use the shutdown() socket API to disable send or receive operations at an endpoint. The effect of the shutdown() call is different in SCTP and TCP. In TCP, a connection is in the half-closed state even after an application calls shutdown(). In the half-closed state, an application at the sending endpoint continues to send data even if an application at the receiving endpoint has stopped receiving data. In SCTP, shutdown() completely disables applications at both the endpoints from sending or receiving data.

NOTE: Applications can use the SCTP streams feature to achieve the half-closed state in SCTP.

Following is the syntax for the shutdown() socket call:
    int shutdown(int sd, int how);
where:
sd          Specifies the socket descriptor of the association that needs to be closed.
how         Specifies the type of shutdown. The values are as follows:
            SHUT_RD      Disables further receive operations
            SHUT_WR      Disables further send operations and initiates the SCTP shutdown sequence
            SHUT_RDWR    Disables further send and receive operations, and initiates the SCTP shutdown sequence

In SCTP, SHUT_WR initiates an immediate and full protocol shutdown.
In TCP, SHUT_WR causes TCP to enter a half-closed state. The SHUT_RD value behaves in the same way for SCTP and TCP. SHUT_WR closes the SCTP association while leaving the socket descriptor open, so that the receiving endpoint can receive data that SCTP was unable to deliver.

The sendmsg() and recvmsg() Socket APIs

Applications use the sendmsg() and recvmsg() socket APIs to transmit data to and receive data from their peers. Following is the syntax for the sendmsg() and recvmsg() socket APIs:
    ssize_t sendmsg(int sd, const struct msghdr *message, int flags);
    ssize_t recvmsg(int sd, struct msghdr *message, int flags);
where:
sd          Represents the socket descriptor of the endpoint.
message     Specifies the pointer to the msghdr structure that contains a single user message and the ancillary data. Following is the definition of the msghdr structure:
                struct msghdr {
                    void *msg_name;
                    socklen_t msg_namelen;
                    struct iovec *msg_iov;
                    size_t msg_iovlen;
                    void *msg_control;
                    socklen_t msg_controllen;
                    int msg_flags;
                };
            where:
            msg_name          Specifies the pointer to the socket address structure.
            msg_namelen       Specifies the size of the socket address structure.
            msg_iov           Includes an array of message buffers.
            msg_iovlen        Specifies the number of elements in the msg_iov structure.
            msg_control       Specifies the ancillary data.
            msg_controllen    Specifies the length of the ancillary data buffer.
            msg_flags         Specifies the flags on the received message.
            For more information on the msghdr structure, see RFC 2292 (Advanced Sockets API for IPv6).
flags       Contains flags that affect the messages being sent or received.
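The way the msghdr fields fit together can be shown with a pair of small C helpers that send and receive one user message held in a single buffer. This is a sketch under stated assumptions: send_one_message() and recv_one_message() are hypothetical names, no ancillary data is attached (msg_control stays empty), and nothing in the helpers is specific to SCTP, so they work on any socket type that supports sendmsg() and recvmsg().

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Send one user message. `to` may be NULL for a connected socket.
 * Returns the number of bytes sent, or -1 with errno set.
 */
ssize_t send_one_message(int sd, const void *buf, size_t len,
                         const struct sockaddr *to, socklen_t tolen)
{
    struct iovec iov;
    struct msghdr msg;

    assert(buf != NULL);
    iov.iov_base = (void *)buf;   /* sendmsg() does not modify the buffer */
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)to;          /* destination address, if any */
    msg.msg_namelen = to ? tolen : 0;
    msg.msg_iov = &iov;                 /* one message buffer */
    msg.msg_iovlen = 1;
    /* msg_control and msg_controllen stay zero: no ancillary data */

    return sendmsg(sd, &msg, 0);
}

/*
 * Receive one user message into `buf`. On success, the flags reported
 * for the received message are stored in *msg_flags.
 */
ssize_t recv_one_message(int sd, void *buf, size_t cap, int *msg_flags)
{
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    assert(buf != NULL && msg_flags != NULL);
    iov.iov_base = buf;
    iov.iov_len = cap;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    n = recvmsg(sd, &msg, 0);
    if (n >= 0)
        *msg_flags = msg.msg_flags;
    return n;
}
```

An SCTP application that needs per-message information would additionally point msg_control at an ancillary data buffer, as covered in “SCTP Ancillary Data Structures” (page 58).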
NOTE: A sendmsg() call does not fail if it specifies an invalid SCTP stream identifier, but an error is returned on all subsequent calls on the file descriptor.

The getpeername() Socket API

Applications use the getpeername() socket API to retrieve the primary socket address of the peer. Following is the syntax for the getpeername() socket API:
    int getpeername(int sd, struct sockaddr *address, socklen_t *len);
where:
sd          Specifies the socket descriptor to be queried.
address     Contains the primary peer address. If the socket is an IPv4 socket, the address will be an IPv4 address. If the socket is an IPv6 socket, the address will be either an IPv6 or an IPv4 address.
len         Specifies the length of the address.

If the actual length of the address is greater than the length of the supplied sockaddr structure, SCTP truncates the stored address.

NOTE: The getpeername() socket API is available only for TCP compatibility. It must not be used for the multihoming feature in SCTP, because this socket API does not work with one-to-many style sockets.

One-to-Many Socket APIs

The one-to-many style APIs are designed to enable applications to control many associations from a single endpoint, using a single file descriptor. Similar to the APIs in UDP, one-to-many style APIs in SCTP enable a single socket file descriptor to connect to multiple remote endpoints. A one-to-many style socket can send and receive data without connecting to an endpoint. Unlike UDP, however, SCTP always has a valid association with the specified endpoints, because SCTP is a connection-oriented protocol.

Basic One-to-Many Call Flow Sequence

A server in the one-to-many style uses the following socket call sequence to prepare an endpoint for servicing requests:
1. socket()
2. bind() or sctp_bindx()
3. sctp_getladdrs()
4. sctp_freeladdrs()
5. listen()
6. sctp_getpaddrs()
