Internet-Draft | Careful resumption of congestion control | July 2022 |
Kuhn, et al. | Expires 11 January 2023 |
This document discusses careful resumption of congestion control parameters in QUIC, using a cautious method that enables faster startup of new connections.¶
The method uses a set of computed congestion control parameters that are based on the previously observed path characteristics, such as the bottleneck bandwidth, available capacity, or the RTT. These parameters are stored and can then be used to modify the congestion control behaviour of a subsequent connection. The draft discusses assumptions around how a server ought to utilise these parameters to provide opportunities for a new connection to more quickly get up to speed (i.e., utilise available capacity). It discusses how these changes impact the capacity at a shared network bottleneck and the response that is needed after any indication that the new rate is inappropriate.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 11 January 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
All Internet transports are required to use a Congestion Control (CC) method. In 2010, RFC 5783 provided a survey of alternative CC methods, and noted that there are challenges when a CC method operates across an Internet path with a high and/or variable bandwidth-delay product (BDP) [RFC5783].¶
A CC algorithm typically takes time to ramp up the packet rate, during a period called the "slow-start phase", informally known as the time to "get up to speed". The slow-start phase is a period in which a sender deliberately uses less capacity than might be available, to avoid overshooting the actual capacity at a bottleneck, which would result in increased queueing (latency/jitter) and/or congestion-related packet loss. An overshoot of the capacity can have a detrimental effect on other flows sharing a common bottleneck. In the extreme case, persistent congestion can result in unwanted starvation of other flows [RFC8867] (i.e., preventing other flows from successfully sharing a common bottleneck).¶
In Reno, the slow-start phase consists of a sequence of increases in the congestion window (cwnd), starting from the Initial Window (IW). Each step lasts approximately one path RTT, until the sender estimates that the capacity at the bottleneck for the path has been reached (or is close to being reached).¶
To fully utilise the capacity along a path with a certain RTT, the transport needs to determine an appropriate volume of bytes in flight, based on the product of the available capacity and the path RTT. [RFC6349] defines the BDP as follows: "Derived from Round-Trip Time (RTT) and network Bottleneck Bandwidth (BB), the Bandwidth-Delay Product (BDP) determines the Send and Received Socket buffer sizes required to achieve the maximum TCP Throughput." The BDP estimated by a server includes all buffering experienced along a network path. Various approaches are possible to determine the BDP, based on measurements of the path characteristics. [RFC6349] specifies one procedure for TCP. CC for QUIC is specified in [RFC9002], which does not mandate a method to measure the BDP, allowing the sender to implement an appropriate method.¶
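To illustrate the relationship described above, the following minimal sketch (in Python, with hypothetical names; not part of any specification) derives a BDP estimate from a bottleneck bandwidth estimate and an RTT sample:¶

   # Illustrative only: derive a BDP estimate (in bytes) from a
   # bottleneck bandwidth estimate (in bytes per second) and an RTT
   # sample (in seconds).
   def estimate_bdp(bottleneck_bandwidth, rtt):
       return int(bottleneck_bandwidth * rtt)

   # Example: 10 Mb/s (1.25e6 bytes/s) and a 600 ms RTT give a BDP
   # of 750,000 bytes.
   print(estimate_bdp(1.25e6, 0.6))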
The specification for the QUIC transport protocol [RFC9000] notes "Generally, implementations are advised to be cautious when using previous values on a new path." The method uses a set of computed CC parameters that are based on the previously observed path characteristics, such as the bottleneck bandwidth, available capacity, or the Round Trip Time (RTT). These parameters are stored and can then be used to modify the CC behaviour of a subsequent connection.¶
This document specifies a method that can improve throughput by reducing the time to get up to speed, and hence the total duration of a transfer. It introduces an alternative method to select initial CC parameters, including a way to more rapidly and safely grow the cwnd.¶
There are scenarios where temporal sharing of parameters relating to previously observed path characteristics, such as the bottleneck bandwidth or RTT, can help to save round-trip times at the start of a new connection. For example:¶
In these cases, specific characteristics of the path may have been learned, including CC information. This information might be expected to be similar when a new connection is made between the same local and remote endpoints.¶
While a server could take optimization decisions without considering the client's preference, in some cases a client could have information that is not available at the server. A client may provide hints, for example: (1) an indication that the path/local interface has changed; (2) information related to current hardware limitations of the client; or (3) an understanding about the capacity needs of other concurrent flows that would compete for shared capacity. As a result, a client could explicitly request tuning of the slow start when the application continues transmission, or request that tuning be inhibited. This is discussed further later in the document.¶
There are also cases where using the parameters of a previous connection is not appropriate, and there is a need to evaluate the potential for malicious use of the method.¶
The remainder of this document:¶
This section provides a brief summary of key terms and the requirements language that is used. The document uses language drawn from a range of IETF RFCs.¶
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document defines current and saved values for a set of CC parameters:¶
Congestion controllers, such as CUBIC or Reno, could estimate the saved_bb and current_bb values by utilizing a combination of the cwnd/flight_size and the minimum RTT. A different method could be used to estimate the same values when using a rate-based congestion controller, such as BBR [I-D.cardwell-iccrg-bbr-congestion-control]. It is important to consider whether the methods could result in over-estimating the bottleneck bandwidth, and the preserved values ought to be used with caution.¶
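As an illustration of this kind of estimate, a minimal sketch for a window-based controller, assuming the flight size (or cwnd) is expressed in bytes and the minimum RTT in seconds; the names and any filtering strategy are not defined by this document:¶

   # Illustrative estimator for a window-based controller (e.g.,
   # CUBIC or Reno): approximate the bottleneck bandwidth as the
   # ratio of the bytes in flight (or cwnd) to the minimum observed
   # RTT. A real implementation would filter samples and guard
   # against over-estimation, as noted above.
   def estimate_bottleneck_bandwidth(flight_size_bytes, min_rtt):
       if min_rtt <= 0:
           raise ValueError("RTT sample must be positive")
       return flight_size_bytes / min_rtt   # bytes per second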
QUIC introduces the concept of transport parameters (section 4 of [RFC9000]). This document notes that a new connection can utilise a set of key transport parameters from a previous connection to reduce the completion time for a transfer with a size much larger than the IW over paths where the available capacity is also significantly larger than the IW. This benefit is particularly evident for a path where the RTT is much larger than for typical Internet paths.¶
For example, over a satellite access network, a 5.3 MB transfer takes up to 9 seconds using standard congestion control, whereas using the specified method this could reduce to 4 seconds [IJSCN]; the time to complete a 1 MB transfer could be reduced by 62% [MAPRG111]. Benefits are also expected for other sizes of transfer and for different path characteristics that also result in a higher BDP.¶
A transport protocol is not able to assume that the path characteristics remain the same. Variation can arise from a combination of various factors:¶
Although a transport protocol can have information about a previously used path, the path characteristics can change, and previous information may not be appropriate when a new connection uses the path.¶
In some cases (e.g., after a change in the interface used by the local endpoint), a client may be aware of such a change, and might be able to infer that a previously available path has again become available. However, to safely utilise the previous information, the client would need assurance that the path was to the same endpoint, and that the characteristics have not significantly changed from those previously measured. When the path is expected to be the same, there is then an opportunity to reduce the time to get up to speed by utilising saved CC information for the path.¶
There can be benefit in sharing transport information across multiple concurrent connections. [RFC9040] considers the sharing of transport parameters between TCP connections that originate from a host. The proposal in this document has the advantage of storing server-generated information at the client and not requiring the server to retain additional state for each client.¶
In the previously detailed scenarios, the application data transfer was unidirectional towards the client, i.e., the main flow of data was from a server to a client (e.g., downloading a file or web page). This is the focus of the current version of the document.¶
In a different example, the application data transfer can be unidirectional towards the server, e.g., uploading an image/video to a server.¶
There are also use cases where a client initiates a connection for a bidirectional service in which both endpoints send data to each other, such as a remotely executing application or a video conference call.¶
In general, the guidelines proposed in this document apply when a congestion controller is sending data to a remote peer and that remote endpoint resumes the connection. Both endpoints can assume the role of a client or a server.¶
This document defines a series of different phases through which the CC algorithm moves as a connection gets up to speed. The phases are labelled as follows:¶
Unvalidated: Utilise the saved path characteristics to send at a rate higher than allowed by slow start. The convergence towards the previous rate is expected to be faster than when using traditional slow-start mechanisms, but should not be instantaneous, to avoid adding congestion to an already congested bottleneck.¶
This section introduces the rationale behind the safety guidelines related to the usage of previous values on a new path: variable network conditions and malicious clients.¶
The "variable network conditions" related to the fact that previously measured values may not remaib relevant and should be exploited cautiously by a CC algorithm.¶
The "malicious client" relates to the fact that a malicious client could try to send malicious information to a server. Three approaches are then introduced and compared : either (1) all the information related to previous connections is stored at the server and never send to a client ("Local storage"), (2) some information is transmited to a client that can use it when reconnecting but the client cannot read the information received from the server ("NEW TOKEN"), or (3) some information is transmitted to a client that can use it when reconnecting and the client can read it to accept or not the exploitation of previous congestion information (a.k.a. "BDP extension").¶
NOTE: The sender ought not to re-utilise all the capacity it previously used, to avoid starving other flows that started or increased their capacity after the last measurement. How strong should this be stated: ... MUST or SHOULD ... What safety factor is appropriate for the resuming sender? If using slow-start it would anyway double the rate on the next RTT, so is capacity/2 appropriate to initially try?¶
A new connection MUST NOT use the previously measured saved_rtt and saved_bb to simply initialise a new flow to resume sending at the same rate.¶
The server MUST check the validity of any received saved_rtt and saved_bb parameters, whether these are sent by a client or are stored at the server. The following events indicate cases where the use of these parameters is inappropriate:¶
There are several solutions to mitigate the impact of changes in network conditions:¶
Section 6 describes various approaches for Rationale #1 - Solution #2.¶
The server MUST check the integrity of the saved_rtt and saved_bb parameters received from a client.¶
There are several solutions to avoid attacks by malicious clients:¶
Appendix A describes various implementation approaches for each of these solutions: local storage (Appendix A.2, for Rationale #2 - Solution #1), the NEW_TOKEN Frame (Appendix A.3, for Rationale #2 - Solution #2), and the BDP extension Frame (Appendix A.4, for Rationale #2 - Solution #3).¶
This section provides a description of several implementation options and discusses their respective advantages and drawbacks.¶
While there is room for discussion regarding the solutions for Rationale #2, the server MUST follow Rationale #1 - Solution #2 and avoid Rationale #1 - Solution #1: the server MUST implement a safety check to assess whether the saved BDP parameters (i.e., saved_rtt and saved_bb) remain relevant, or check that their usage would not cause excessive congestion over the path.¶
Security considerations are discussed in Section 9.¶
A server that stores a resumption ticket for each client, to protect against replay from a third-party IP address, could also store the IP address (i.e., saved_client_ip) and the BDP parameters (i.e., saved_rtt and saved_bb) of a previous connection.¶
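For illustration, one possible shape for such server-side state is sketched below; the record layout and the keying by resumption ticket are assumptions, not part of this document:¶

   # Illustrative per-client record for the "Local storage" approach.
   # The field names follow the terminology of this document; the
   # keying and lookup strategy are implementation-specific.
   from dataclasses import dataclass

   @dataclass
   class SavedPathInfo:
       saved_client_ip: str    # client address observed previously
       saved_rtt: float        # seconds
       saved_bb: float         # bottleneck bandwidth, bytes/second

   saved_paths = {}            # e.g., keyed by resumption ticket

   def remember(ticket_id, info):
       saved_paths[ticket_id] = info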
When the BDP Frame extension is used, locally stored BDP parameters at the server can provide a cross-check of the BDP parameters sent by a client. The server can still enable a safe jump without the BDP Frame extension. However, using the extension enables a client to choose whether to request this or not, enabling it to utilize local knowledge of the network conditions, connectivity, or connection requirements.¶
XXX-Editor-note: Text to be improved: Storing local values related to the BDP would help improve the ingress for new connections; however, not using a BDP Frame extension could reduce the interest of the approach, where (1) the client knows the BDP estimation at the server, (2) the client decides to accept or reject ingress optimization, and (3) the client tunes application-level requests.¶
Local storage of values can be secure, while the BDP Frame extension provides more information to the client and more interoperability. Figure 1 provides a summary of the advantages and drawbacks of each approach.¶
The following safety guidelines refer to the labelling defined in Section 4.¶
The safety guidelines are designed to mitigate the risk that a server adds excessive congestion to an already congested path. The following mechanisms help in fulfilling this objective:¶
The proposed mechanisms SHOULD be limited by any rate-limitation mechanisms of QUIC, such as flow control mechanisms or amplification attack prevention. In particular, it may be necessary to issue proactive MAX_DATA frames to increase the flow control limits of a connection. In addition, the maximum number of packets that can be sent without acknowledgements needs to be chosen to avoid creating or increasing congestion on the path.¶
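A minimal sketch of one way an endpoint might size such a proactive MAX_DATA value, assuming saved_bb is stored as a rate in bytes per second (if it is stored as a volume of bytes, it can be used directly); the sizing rule of one BDP's worth of data is an illustrative assumption, not a normative requirement:¶

   # Illustrative sizing rule for a proactive MAX_DATA value: ensure
   # the connection flow-control limit is at least one saved BDP, so
   # that flow control does not prevent the resumed sending rate.
   # The rule itself is an assumption for illustration only.
   def desired_max_data(saved_bb, saved_rtt, current_limit):
       bdp_bytes = int(saved_bb * saved_rtt)   # saved_bb in bytes/s
       return max(current_limit, bdp_bytes)    # value for MAX_DATA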
This extension MUST NOT provide an opportunity for the current connection to be a vector of an amplification attack. The address validation process, used to prevent amplification attacks, SHOULD be performed [RFC9000].¶
XXX-Editor-note: This probably should be a range rather than an inequality (current_rtt < 1.2*saved_rtt).¶
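As an illustration of this kind of consistency check expressed as a range, as suggested by the note above, a minimal sketch; the symmetric 20% tolerance is an assumed example value, not a bound defined by this document:¶

   # Illustrative consistency check: accept the saved parameters
   # only if the currently measured RTT is close to the saved RTT.
   # The symmetric 20 % tolerance is an assumed example value, not a
   # bound defined by this document.
   def rtt_is_consistent(current_rtt, saved_rtt, tolerance=0.2):
       lower = (1 - tolerance) * saved_rtt
       upper = (1 + tolerance) * saved_rtt
       return lower <= current_rtt <= upper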
The following mechanisms could be implemented:¶
Exploit a standard IW:¶
Identify a relevant pacing rhythm:¶
The server estimates the pacing rhythm using the saved_rtt and saved_bb (a sketch follows this list). The Inter-packet Transmission Time (ITT) is determined by the ratio between the current Maximum Message Size (MMS) and the ratio between the saved_bb and the saved_rtt. A tunable safety margin can avoid sending more than a recommended maximum IW (recom_iw):¶
Tune slow-start mechanisms: After the transport parameters are set based on a previously estimated bottleneck bandwidth, continuing the slow-start mechanism can lead the sender to overshoot the bottleneck capacity. This can occur even when using the safety check described in this section.¶
This follows the idea presented in [RFC4782], [I-D.irtf-iccrg-sallantin-initial-spreading] and [CONEXT15].¶
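The pacing computation for the "Identify a relevant pacing rhythm" mechanism above could be sketched as follows. This sketch assumes saved_bb denotes a volume of bytes (a BDP estimate), so that saved_bb/saved_rtt yields a rate; if saved_bb is stored as a rate in bytes per second, the division by saved_rtt is unnecessary. The safety margin value and the recom_iw derivation are illustrative assumptions only:¶

   # Illustrative pacing computation for the "pacing rhythm"
   # mechanism above. Assumes saved_bb is a volume of bytes (a BDP
   # estimate), so saved_bb / saved_rtt is a rate in bytes/second;
   # if saved_bb is stored as a rate, the division is unnecessary.
   def inter_packet_time(mms_bytes, saved_bb, saved_rtt,
                         safety_margin=0.5):
       rate = safety_margin * (saved_bb / saved_rtt)
       return mms_bytes / rate      # ITT, in seconds

   def recom_iw(saved_bb, safety_margin=0.5):
       # Recommended maximum IW, capped by the same safety margin.
       return int(safety_margin * saved_bb)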
The authors would like to thank Gabriel Montenegro, Patrick McManus, Ian Swett, Igor Lubashev, Robin Marx, Roland Bless and Franklin Simo for their fruitful comments on earlier versions of this document.¶
TBD: Text is required to register the BDP Frame and the enable_bdp transport parameter. Parameters are registered using the procedure defined in [RFC9000].¶
Security considerations for QUIC are discussed in Section 6.¶
The client can send information related to the saved_rtt and saved_bb to the server with the BDP Frame extension, using either Rationale #2 - Solution #2 or Rationale #2 - Solution #3. However, the server SHOULD NOT trust the client. Even though 0-RTT packets containing the BDP Frame are encrypted, a client could modify the values within the extension before encrypting the 0-RTT packet, and authentication mechanisms might not guarantee that the values are safe to use. Although it is not an easy operation for a client to modify authenticated or encrypted data without this being detected by a server, such modification could be attempted by malicious clients. One way to avoid this is for a server to also store the saved_rtt and saved_bb parameters.¶
A malicious client might modify the saved_bb parameter to convince the server to use a larger cwnd than appropriate. Using the algorithms proposed in Section 6, the server can reduce any intended harm and can check that the information provided by the client is valid.¶
Storing the BDP parameters locally at the server reduces the associated risks: even if a malicious client attempts to break the encryption mechanism protecting the information it had received and transmits modified information related to the BDP of the path, the server can rely on its locally stored values.¶
The NewSessionTicket message of TLS can offer a solution. The proposal is to add a 'bdp_metadata' field in the NewSessionTicket, which the client is able to read. The only extension currently defined in TLS 1.3 that can be seen by the client is max_early_data_size (see Section 4.6.1 of [RFC8446]). However, in the general design of QUIC, TLS sessions are managed by a TLS stack.¶
Three distinct approaches are presented: sending an opaque blob to the client that the client may return to the server when establishing a future new connection (see Appendix A.3), enabling local storage of the BDP information (see Appendix A.2), and a BDP Frame extension (see Appendix A.4).¶
This approach independently lets both a client and a server store their BDP parameters:¶
During the 0-RTT session, the local endpoint waits for the first RTT measurement from the remote endpoint IP address. This is used to verify that the current_rtt has not significantly changed from the saved_rtt (used as an indication that the BDP information is appropriate for the current path).¶
If this RTT is confirmed, the endpoint also verifies that an IW of data has been acknowledged without requiring retransmission or resulting in an ECN CE-mark. This second check detects whether a path is experiencing significant congestion (i.e., where it would not be safe to update the cwnd based on the saved_bb). In practice, this could be realized by a proportional increase in the cwnd, where the increase is (saved_bb/IW)*proportion_of_IW_currently_ACKed.¶
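For illustration, the proportional increase described above might be sketched as follows, assuming saved_bb is expressed here as a volume of bytes and IW is the initial window in bytes; the exact update rule is left to the implementation:¶

   # Illustrative proportional update during the unvalidated phase:
   # as acknowledgements for the initial window (IW) arrive without
   # loss or ECN CE-marks, scale the cwnd towards the saved value,
   # i.e. an increase of (saved_bb/IW) * bytes_of_IW_acked.
   # Assumes saved_bb is expressed as a volume of bytes.
   def unvalidated_cwnd(saved_bb, iw_bytes, iw_bytes_acked):
       proportion_acked = min(iw_bytes_acked, iw_bytes) / iw_bytes
       return max(iw_bytes, int(saved_bb * proportion_acked))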
This solution does not allow a client to request that the server not use the BDP parameters. If the server does not want to store the metrics from previous connections, an equivalent of tcp_no_metrics_save for QUIC may be necessary. An option could be negotiated that allows a client to choose whether the saved information is used.¶
A server can send a NEW_TOKEN Frame to the client. The token is an opaque (encrypted) blob whose content the client cannot read (see Section 19.7 of [RFC9000]). The client sends the received token in the header of an Initial packet of a later connection.¶
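As an illustration only, a server might construct such an opaque blob as sketched below; the encoding and the use of a symmetric cipher (Fernet, from the Python 'cryptography' package) are assumptions of this sketch, since [RFC9000] only requires that the token be opaque to the client:¶

   # Illustrative server-side construction of an opaque token that
   # carries the saved parameters. The encoding and cipher choice
   # are assumptions; QUIC only requires the token to be opaque to
   # the client (Section 19.7 of [RFC9000]).
   import struct
   from cryptography.fernet import Fernet

   token_key = Fernet(Fernet.generate_key())   # server-local secret

   def make_token(saved_rtt, saved_bb):
       return token_key.encrypt(struct.pack("!dd", saved_rtt, saved_bb))

   def read_token(token):
       saved_rtt, saved_bb = struct.unpack("!dd", token_key.decrypt(token))
       return saved_rtt, saved_bb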
Using BDP Frames, the server could send information relating to the path characteristics to the client. The use of the BDP Frame is negotiated with the client. The client can read its content. If the client agrees with the usage of previous parameters, it can send the BDP Frame back to the server in an Initial packet of a later connection.¶