Fault Management Mechanism in MPTCP Session

Huawei
jiao_kang2022@163.com

Huawei
No. 207, Jiufeng 3rd Road, East Lake High-tech Development Zone
Wuhan
China
+86 18651640216
liangqiandeng@huawei.com

Huawei
D2-03, Huawei Industrial Base
Shenzhen
China
dengshangling@huawei.com

Transport Area
TCP Maintenance and Minor Extensions
mptcp

This document presents a mechanism for fault management during an MPTCP session. The mechanism conveys subflow failure information from client to server over another subflow that is operating normally. It includes: 1) a new Fault Announce option that describes a subflow failure, and 2) the implementation and interoperability of this option during an MPTCP session when one subflow fails. Using the fault information that multiple clients report for their connections, the server can locate network problems more accurately.

Introduction

During data transmission in an MPTCP session, subflows may encounter problems such as a port failure on one endpoint, a network failure, or a malfunctioning middlebox. The current MPTCP protocol defines no exchange between client and server when a fault occurs on a subflow, so such a fault can cause transmission failure or delay.

RFC8684 introduces the TCP RST Reason (MP_TCPRST) option to signal the reason for sending a RST on a subflow, which can help an implementation decide whether to attempt reconnection later. However, MP_TCPRST only reports the reason for a subflow that the sender has already decided to close. It does not cover the abnormal termination of a subflow that is still in use.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

Fault Announce Exchanges

This document proposes a fault announce mechanism with a new option that delivers failure information about an abnormal subflow between client and server via another subflow in the same MPTCP session that is working properly. The flow is illustrated in Figure 1.

The Fault Announce option is carried on SYN, ACK or data packets.

The client can determine the fault cause in any of the following ways.

The client may detect a local fault, for example a local port or network card failure, or an error in local protocol processing. In this case the client determines the fault cause directly.

The client may actively probe the subflow with a detection task to determine the fault cause. For example, the client may run a detection task based on Bidirectional Forwarding Detection (BFD) to determine whether the subflow is faulty.

The client may send an ICMP request to the server and infer a fault from the response time. Specifically, if the client does not receive a response within a preset time, the subflow is not working properly.

The client may receive an ICMP error report from a third device (e.g., a middlebox on the faulty subflow), which indicates the fault cause.

Fault Announce option

A new Fault Announce option is defined to describe in detail a fault occurring on one subflow. If it is set, the faulty subflow is identified by its source address ID (SrcAddressID) and destination address ID (DestAddressID). The mapping between IP addresses and address IDs should be created on both client and server through the ADD_ADDR process defined in RFC8684 and RFC6824.

Option format

The format of the Fault Announce option (FAULT_ANNOUNCE) is depicted in Figure 2:

A new subtype should be allocated to indicate the Fault Announce option.

"Cause" is an 8-bit field that carries the reason code describing why the subflow malfunctions. The client detects the fault and determines the cause.
The following values (partially mapped to the codes carried in ICMP error reports) are defined in this document:
0x00-0x09 are reserved. This range is compatible with the "Reason" values defined in RFC8684.
Network is unreachable (code 0x0A).
Host is unreachable (code 0x0B).
Routing failed (code 0x0C).
Server Suppression (code 0x0D).
TTL equals zero (IP loops may occur) (code 0x0E).
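
As an illustration only, the Cause values listed above could be represented in an implementation as follows. This is a minimal sketch in Python, not part of the specification: the names FaultCause, RESERVED_RANGE and classify_cause are hypothetical, and labelling values above 0x0E as "unassigned" is an assumption, since this document does not say how such values are handled.

```python
from enum import IntEnum


class FaultCause(IntEnum):
    """Cause codes defined in this document (names are hypothetical)."""
    NETWORK_UNREACHABLE = 0x0A
    HOST_UNREACHABLE = 0x0B
    ROUTING_FAILED = 0x0C
    SERVER_SUPPRESSION = 0x0D
    TTL_EQUALS_ZERO = 0x0E


# 0x00-0x09 are reserved for compatibility with the MP_TCPRST "Reason" values.
RESERVED_RANGE = range(0x00, 0x0A)


def classify_cause(value: int) -> str:
    """Return a label for an 8-bit Cause value.

    Assumption: values that are neither reserved nor defined above
    are reported as "unassigned".
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("Cause is an 8-bit field")
    if value in RESERVED_RANGE:
        return "reserved"
    try:
        return FaultCause(value).name
    except ValueError:
        return "unassigned"
```

A receiver could call classify_cause on the Cause byte before acting on it, so that reserved or unknown values are not mistaken for one of the defined fault causes.
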

"SrcAddressID" identifies the source address ID of the faulty subflow.

"DestAddressID" identifies the destination address ID of the faulty subflow.

Additional requirements to be considered

Scenario of middlebox failure

In some deployments, a middlebox failure is what blocks a subflow. The client should therefore report information about the faulty middlebox to the server in the Fault Announce option so that the server can locate it quickly. The information about a faulty middlebox may include:

Middlebox IP address: The IP address of the faulty middlebox.

IP protocol version: The IP protocol version used by the faulty middlebox, i.e. IPv4 or IPv6. The server can use it to parse the "Middlebox IP address" field.

Flag "A": If "Middlebox IP address" is optional, this flag should be defined to indicate whether the "Middlebox IP address" field is carried in the Fault Announce option.

Scenario of distinguishing fault types

In some implementations, faults are classified as transient or non-transient. A "fault type" field may therefore be added to identify the type (transient or non-transient) for subsequent processing.

IANA Considerations

IANA is requested to assign an MPTCP option subtype for the Fault Announce option.

Security Considerations

The Fault Announce option is neither encrypted nor authenticated, so on-path attackers and middleboxes could remove, add or modify this option on observed Multipath TCP connections.

References

Normative References

[RFC2119] Key words for use in RFCs to Indicate Requirement Levels, BCP 14, RFC 2119.

[RFC0793] Transmission Control Protocol, RFC 793.

[RFC6824] TCP Extensions for Multipath Operation with Multiple Addresses, RFC 6824.

[RFC8684] TCP Extensions for Multipath Operation with Multiple Addresses, RFC 8684.