
MPTCP Upstreaming

Presentation on upstreaming MPTCP to the Linux kernel, by Mat Martineau (Intel) and Matthieu Baerts (Tessares).
Agenda:
- MPTCP: definition, use cases, current implementation and protocol overview
- First patch set roadmap
- Advanced features roadmap
- Next steps

Any questions? Please send an email to contact@tessares.net

Published in: Software

  1. Multipath TCP Upstreaming, by Mat Martineau (Intel) and Matthieu Baerts (Tessares)
  2. Plan ● Multipath TCP Overview ● First Patch Set Upstreaming Roadmap ● Advanced Features Roadmap ● Conclusion and links
  3. What is MPTCP?
  4. Multipath TCP (MPTCP) ● Exchange data for a single connection over different paths, simultaneously ● RFC 6824, supported by the IETF Multipath TCP (MPTCP) working group [Image credits: smartphone and WiFi icons by Blurred203 and Antü Plasma under CC-by-sa, others from Tango project, public domain]
  5. Multipath TCP (MPTCP) ● Exchange data for a single connection over different paths, simultaneously ● RFC 6824, supported by the IETF Multipath TCP (MPTCP) working group ● More bandwidth
  6. Multipath TCP (MPTCP) ● Exchange data for a single connection over different paths, simultaneously ● RFC 6824, supported by the IETF Multipath TCP (MPTCP) working group ● More mobility (walk-out)
  10. Multipath TCP Use Cases ● Smartphones (Apple, Samsung, LG, others) ○ Support failover / “walk-out” scenario ○ More bandwidth ● Residential gateways (LTE + DSL, for example) ○ More bandwidth ● Multipath TCP is part of 5G standardisation ○ Access Traffic Steering, Switching and Splitting (ATSSS)
  11. Multipath TCP Use Cases: ATSSS ● Steering: best network selection (5G OR WiFi) ● Switching: seamless handover (from 5G to WiFi and vice versa) ● Splitting: network aggregation (5G AND WiFi), improved end-user experience ● Defined in 3GPP Release 16, ATSSS is a core network function in 5G networks, playing a key role in managing data traffic between 3GPP (5G, 4G) networks and non-3GPP (Wi-Fi) networks
  12. Existing Linux implementation ● First implementation for the Linux kernel in March 2009 ○ Latest out-of-tree MPTCP Linux kernel version is v0.95 ○ Generally used as a client / server in current deployments, for millions of users ● But not upstreamable ○ Built to support experiments and rapid changes, but not generic enough ○ Special-purpose implementation of MPTCP
  13. Guidelines for upstream ● The new implementation cannot affect the existing TCP stack: ○ No performance regressions, and no code size change if CONFIG_MPTCP=n ○ Maintainable and configurable ○ Can be used in a variety of deployments ● Multipath TCP will be "opt-in" ● Proceed in steps: ○ Minimal feature set first ○ Optimisations and advanced features later
  14. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps [Diagram: Socket Layer, MPTCP, two TCP subflows, IP Layer]
  15. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps ● Subflow mapping: Data Sequence Number [Diagram: Socket Layer, MPTCP, two TCP subflows, IP Layer; the data "ABCDEF" is split across the two subflows. First subflow: Dseq=0, seq=123, “A”; Dseq=1, seq=124, “B”; Dseq=4, seq=125, “E”; Dseq=5, seq=126, “F”. Second subflow: Dseq=2, seq=456, “C”; Dseq=3, seq=457, “D”] [Image credits: smartphone icon by Blurred203 under CC-by-sa, others from Tango project, public domain]
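
The Data Sequence Number mapping on this slide can be illustrated with a small sketch. It is not the kernel's data path: the `mapped_byte` record and the `reassemble()` helper are hypothetical names introduced here, bytes are handled one at a time for clarity, and the second subflow's sequence numbers are assumed to advance by one per byte (456, 457). The point is only that the receiver orders data by Dseq, regardless of which subflow delivered it or in what order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one received byte (illustration only). */
struct mapped_byte {
    uint32_t dseq; /* connection-level Data Sequence Number */
    uint32_t seq;  /* subflow-level TCP sequence number */
    char     data; /* payload byte */
};

/*
 * Place each byte at offset (dseq - first_dseq) in out, ignoring the
 * subflow-level seq and the arrival order.
 * Returns the number of bytes placed, or 0 if a mapping falls outside out.
 */
size_t reassemble(const struct mapped_byte *rx, size_t n,
                  uint32_t first_dseq, char *out, size_t out_len)
{
    assert(rx != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction also copes with sequence wrap-around. */
        uint32_t off = rx[i].dseq - first_dseq;

        if (off >= out_len)
            return 0;
        out[off] = rx[i].data;
    }
    return n;
}
```

Feeding the six mappings from the slide in any interleaved arrival order, with `first_dseq` set to 0, yields the bytes back in connection order.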
  19. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps ● Subflow mapping: Data Sequence Number + ACK
  20. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps ● Subflow mapping: Data Sequence Number + ACK ● MP_CAPABLE, MP_JOIN, DATA_FIN (in TCP options)
  21. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps ● Subflow mapping: Data Sequence Number + ACK ● MP_CAPABLE, MP_JOIN, DATA_FIN ● Signaling: Add/Remove Addresses, Fast Close (via TCP ACK)
  22. Protocol Overview: RFC 6824 ● Looks like TCP on the wire, similar usage for apps ● Subflow mapping: Data Sequence Number + ACK ● MP_CAPABLE, MP_JOIN, DATA_FIN ● Signaling: Add/Remove Addresses, Fast Close ● Coupled receive windows across TCP subflows
  23. Multiple versions of MPTCP ● RFC 6824: Experimental ○ All known implementations support this version, and only this version ● RFC 6824 bis: Standard ○ Submitted to IESG for publication ○ Behavioral changes: MPTCP v0 → MPTCP v1 ○ Some parts easier to implement ○ Selected by 3GPP for 5G
  24. First Patch Set Roadmap
  25. MPTCP Socket architecture [Diagram: Socket Layer, MPTCP, two TCP subflows each carrying SKB 1, SKB 2, ..., IP] ● Proto: struct proto ● TCP ULP: struct tcp_ulp_ops ● We start from: tcp_request_sock_ops ● SKB extension: struct mptcp_ext ○ To store Data Sequence Signal (25 bytes)
  26. Userspace API ● MPTCP selected when creating the socket: socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP);
  27. Userspace API ● MPTCP selected when creating the socket: socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP); ○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */
  28. Userspace API ● MPTCP selected when creating the socket: socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP); ○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */ ● getsockopt() / setsockopt() with MPTCP socket or its TCP subflows?
  29. Userspace API ● MPTCP selected when creating the socket: socket(AF_INET(6), SOCK_STREAM, IPPROTO_MPTCP); ○ IPPROTO_MPTCP = IPPROTO_TCP | 0x100; /* = 262 */ ● getsockopt() / setsockopt() with MPTCP socket or its TCP subflows? ● Security: who can create MPTCP sockets? ○ Initial implementation will not be hardened by broad use yet (syzkaller, etc.) ○ sysctl per network namespace, MPTCP disabled by default: is it enough?
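
As a rough illustration of the socket() call above, the sketch below opens an MPTCP socket and, if the kernel refuses, retries with plain TCP. The fallback policy and the `open_stream_socket()` helper are assumptions made for this example, not part of the slides; the value 262 is the one given on the slide and is only defined here when the libc headers do not already provide it.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Slide value: IPPROTO_TCP | 0x100 = 262. Older libc headers may lack it. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Hypothetical helper: prefer MPTCP, fall back to TCP when the kernel
 * does not support MPTCP or has it disabled (for example via sysctl).
 * family is AF_INET or AF_INET6. *is_mptcp is set to 1 or 0 to report
 * which protocol was requested successfully.
 * Returns a socket file descriptor, or -1 if both attempts fail.
 */
int open_stream_socket(int family, int *is_mptcp)
{
    assert(is_mptcp != NULL);

    int fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);

    *is_mptcp = (fd >= 0);
    if (fd < 0)
        fd = socket(family, SOCK_STREAM, IPPROTO_TCP);
    return fd;
}
```

After this call the descriptor is used like any TCP socket (connect(), send(), recv(), close()), which matches the "similar usage for apps" goal stated in the protocol overview.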
  30. Diagnostics ● MPTCP will have a collection of counters for diagnostic and debug purposes ● Per-socket data will be shared with userspace via sock_diag(7) ○ TCP ULP framework has been extended to enable diag ● Some TCP counters are also found in /proc ○ Should MPTCP add to these as well?
  31. Tests ● Kernel Self Tests ○ Between multiple namespaces (veth) ○ MPTCP ⇔ MPTCP, MPTCP ⇔ TCP, TCP ⇔ MPTCP ○ Various conditions including packet loss, reordering, and variations in routing ● Packetdrill ○ Background project ongoing to add MPTCP support ○ An out-of-tree Packetdrill with MPTCP support exists, but it is old and limited
  32. Initial use case ● Server role is a good place to start ● Simpler path management ○ Client side handles multiple interfaces (like cellular + Wi-Fi) ● Common server configuration uses one public interface for clients ○ Advertising additional interfaces not required ● Client features all build on what’s needed for servers [Diagram: a client with addresses IP2 and IP3, each behind a NAT, reaches a server with address IP1 over the Internet]
  33. Code already merged upstream ● SKB extensions ○ Needed to carry MPTCP options that are tied to the data payload ○ Also used to remove sp (sec_path) and nf_bridge pointers from struct sk_buff ○ Suitable for data that can’t fit in sk_buff and justifies memory overhead ● Add inet_diag_ulp_info to socket diag format and ULP get_info hook
  34. Change in TCP Code ● Git Stat:
        include/linux/skbuff.h          | 11 ++
        include/linux/tcp.h             | 51 ++++++++++
        include/net/sock.h              |  6 +-
        include/net/tcp.h               | 20 ++++
        include/trace/events/sock.h     |  5 +-
        include/uapi/linux/in.h         |  2 +
        net/Kconfig                     |  1 +
        net/Makefile                    |  1 +
        net/ax25/af_ax25.c              |  2 +-
        net/core/skbuff.c               |  7 ++
        net/decnet/af_decnet.c          |  2 +-
        net/ipv4/inet_connection_sock.c |  2 +
        net/ipv4/tcp.c                  |  8 +-
        net/ipv4/tcp_input.c            | 29 +++++-
        net/ipv4/tcp_ipv4.c             |  4 +-
        net/ipv4/tcp_minisocks.c        |  6 ++
        net/ipv4/tcp_output.c           | 62 +++++++++++-
        net/ipv4/tcp_ulp.c              | 12 +++
  35. Changes to TCP code ● tcp_ulp_clone() ● Export two low-level TCP functions and one struct ● SKBs with MPTCP extensions can’t be coalesced or collapsed ● MPTCP option parsing and writing ● is_mptcp flag in tcp_sock and tcp_request_sock
  36. Changes to TCP code, continued ● One MPTCP-specific branch in TCP minisocks ● Call out to MPTCP from tcp_data_queue to add SKB extension and process ACKs ● Additional members in struct tcp_options_received ● Subflow receive window sharing will introduce changes too
  37. Advanced Features Roadmap
  38. Path Manager ● Which path to create/remove? ● Which address to announce?
  39. Userspace Path Manager ● Peers share ADD_ADDR and REMOVE_ADDR signals to advertise available addresses for each MPTCP connection ● Path manager runs in userspace and uses generic netlink to track address and local interface updates and request subflow changes ● Can be customized with different policies ● Multipath TCP Daemon alpha release is available at github.com/intel/mptcpd
  40. Packet Scheduler ● On which available path will packets be sent? ● Reinject packets on another path?
  41. Packet scheduling ● Different connections may optimize for throughput, latency, or redundancy ● Peers can set a ‘backup’ flag on each subflow to limit transmission on that flow ● Include basic scheduler options in the kernel ● Consider eBPF to define custom schedulers, instead of kernel modules
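
To make the scheduling question concrete, here is a minimal sketch of one possible latency-oriented policy: send on the usable subflow with the lowest smoothed RTT, and touch subflows flagged as ‘backup’ only when no regular subflow is usable. The `subflow_info` structure, its fields and `pick_subflow()` are hypothetical and exist only for this example; the slides do not specify which policies the in-kernel scheduler will ship with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-subflow state (illustration only). */
struct subflow_info {
    uint32_t srtt_us; /* smoothed round-trip time, in microseconds */
    bool     backup;  /* peer asked to limit transmission on this subflow */
    bool     usable;  /* established and has room to send */
};

/*
 * Return the index of the subflow to send on, or -1 if none is usable.
 * Regular subflows always win over backup subflows; within each group
 * the lowest RTT wins.
 */
int pick_subflow(const struct subflow_info *sf, size_t n)
{
    int best = -1;
    int best_backup = -1;

    assert(sf != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!sf[i].usable)
            continue;
        if (sf[i].backup) {
            if (best_backup < 0 || sf[i].srtt_us < sf[best_backup].srtt_us)
                best_backup = (int)i;
        } else if (best < 0 || sf[i].srtt_us < sf[best].srtt_us) {
            best = (int)i;
        }
    }
    return best >= 0 ? best : best_backup;
}
```

A throughput- or redundancy-oriented scheduler would replace the selection rule (for example, by returning several subflows), which is the kind of variation the eBPF idea on the slide is meant to allow without new kernel modules.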
  42. Using MPTCP with unmodified binaries ● Some organizations want to take advantage of MPTCP without recompiling their userspace ● Can add BPF_CGROUP_SOCKET to attach an eBPF program that rewrites the protocol number passed to socket() ● Similar attachment points exist for bind() and connect()
  43. MPTCP Performance optimizations ● Initial emphasis is on correctness and reasonable MPTCP performance ○ While not disrupting TCP’s optimizations! ● Target performance optimizations based on data ● Protocol optimizations ○ Example: changing scheduler behavior for reinjection of data on different subflows ● TCP Fast Open support
  44. Break-before-make ● MPTCP can keep a connection active even with zero subflows connected ○ Allows the session to continue by adding a subflow with MP_JOIN ● Can be useful to switch between access points ● Will add this capability if there’s demand for it
  45. Subflow socket options ● One MPTCP socket manages a set of in-kernel subflow sockets ● Socket options that use TCP option space or change data flow could interfere ● The MPTCP socket can act as an intermediary for subflow options ● Will need to whitelist specific known-safe options ● Could expose file descriptors only good for getsockopt()/setsockopt()
  46. Kernel TLS and MPTCP ● kTLS is built on top of TCP using ULP framework ● An MPTCP socket is not a TCP socket, so it doesn’t have ULP ● TLS needs to operate on the MPTCP data stream, not subflow streams ○ TLS records could be split across subflows ○ MPTCP DSS mappings are specific to TCP sequence numbers ● TLS_SW appears feasible but would need work to integrate with an MPTCP socket type
  47. Conclusion
  48. Conclusion ● Build around TCP as much as we can ● We are close to having an initial patch set ready ● This project is open to everybody: ○ Wiki: https://is.gd/mptcp_upstream ○ Mailing list: https://lists.01.org/mailman/listinfo/mptcp ○ Git repository: https://github.com/multipath-tcp/mptcp_net-next ○ Paper: https://linuxplumbersconf.org/event/4/contributions/435/ ○ Contacts: mathew.j.martineau@linux.intel.com, matthieu.baerts@tessares.net
  49. Backup slides
  50. Protocol challenges ● Relations between structures [Figure used with Christoph Paasch’s permission]
  51. kTLS record [Figure from: https://netdevconf.org/1.2/papers/netdevconf-TLS.pdf]
  52. Protocol challenges ● Coupled receive windows across TCP subflows [Figure used with Sébastien Barré’s permission]
  53. Multipath TCP (MPTCP) ● Hybrid access network use-case (BBF TR-348 by Tessares - SwissCom - OVH) [Diagram labels: Telco Cloud, DSL CPE, 4G CPE, TCP, MPTCP, copper (long line), xDSL network, 4G/LTE network, fiber, cloud native HAG, multi-platform Agent, available capacity needed for 4G/LTE connectivity; images from Tessares]
  54. Protocol challenges
  55. Protocol challenges ● Data sequence numbers and mappings [Figure: data segments AB, C]
  56. Protocol challenges ● Data sequence numbers and mappings [Figure: data segments AB, CD, EF]
  57. Protocol challenges ● Data sequence numbers and mappings [Figure: data segments AB, C, D, EF]
  58. Protocol challenges ● Data sequence numbers and mappings [Figure: data segments AB, C, D, EF. First subflow: Dseq=0, seq=123, “A”; Dseq=1, seq=124, “B”; Dseq=4, seq=125, “E”; Dseq=5, seq=126, “F”. Second subflow: Dseq=2, seq=456, “C”; Dseq=3, seq=457, “D”]
  59. Protocol challenges ● Data sequence numbers and mappings [Diagram: Socket Layer, mptcp_sock, tcp_sock, IP Layer] ● App sends data
  60. Protocol challenges ● Data sequence numbers and mappings [Diagram: Socket Layer, mptcp_sock, tcp_sock, IP Layer] ● Select the TCP subflow
  61. Protocol challenges ● Data sequence numbers and mappings [Diagram: Socket Layer, mptcp_sock, tcp_sock, IP Layer] ● TCP header: What DSS to set?
  62. Protocol challenges ● Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer] ● Notification: one iface is down
  63. Protocol challenges ● Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer] ● Select the subflow
  64. Protocol challenges ● Sending of ACKs to signal options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer] ● Sending an ACK not from the TCP stack
  65. Protocol challenges ● Reception of ACKs with signaling options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer] ● TCP ACK received
  66. Protocol challenges ● Reception of ACKs with signaling options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer] ● TCP ACK is not dropped
  67. Protocol challenges ● Reception of ACKs with signaling options, e.g. REMOVE_ADDR in a TCP ACK [Diagram: Socket Layer, mptcp_sock, two tcp_sock each with Subflow ops, IP Layer]
  68. Protocol challenges ● Signaling with MPTCP, and the segments that carry each signal: ○ MP_CAPABLE: SYN ○ MP_JOIN: SYN ○ DSEQ / DACK: ALL ○ FAST_CLOSE: ACK followed by RST ○ ADD_ADDR: ACK ○ REMOVE_ADDR: ACK
