<?xml version='1.0' encoding='UTF-8'?> <!DOCTYPE rfc [ <!ENTITY nbsp "&#160;"> <!ENTITY zwsp "&#8203;"> <!ENTITY nbhy "&#8209;"> <!ENTITY wj "&#8288;"> ]> <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="exp" docName="draft-ietf-lsr-isis-fast-flooding-11" number="9681" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="2" consensus="true" symRefs="true" sortRefs="true" version="3"> <front> <title abbrev="IS-IS Fast Flooding">IS-IS Fast Flooding</title> <seriesInfo name="RFC" value="9681"/> <author fullname="Bruno Decraene" initials="B." surname="Decraene"> <organization>Orange</organization> <address> <email>bruno.decraene@orange.com</email> </address> </author> <author fullname="Les Ginsberg" initials="L" surname="Ginsberg"> <organization>Cisco Systems</organization> <address> <postal> <street>821 Alder Drive</street> <city>Milpitas</city> <code>95035</code> <region>CA</region> <country>United States of America</country> </postal> <email>ginsberg@cisco.com</email> </address> </author> <author fullname="Tony Li" initials="T." surname="Li"> <organization>Juniper Networks, Inc.</organization> <address> <email>tony.li@tony.li</email> </address> </author> <author fullname="Guillaume Solignac" initials="G."
surname="Solignac"> <address> <email>gsoligna@protonmail.com</email> </address> </author> <author fullname="Marek Karasek" initials="M" surname="Karasek"> <organization>Cisco Systems</organization> <address> <postal> <street>Pujmanove 1753/10a, Prague 4 - Nusle</street> <city>Prague</city> <code>10 14000</code> <country>Czech Republic</country> </postal> <email>mkarasek@cisco.com</email> </address> </author> <author initials="G." surname="Van de Velde" fullname="Gunter Van de Velde"> <organization>Nokia</organization> <address> <postal> <street>Copernicuslaan 50</street> <city>Antwerp</city> <code>2018</code> <country>Belgium</country> </postal> <email>gunter.van_de_velde@nokia.com</email> </address> </author> <author fullname="Tony Przygienda" initials="T" surname="Przygienda"> <organization>Juniper</organization> <address> <postal> <street>1137 Innovation Way</street> <city>Sunnyvale</city> <region>CA</region> <country>United States of America</country> </postal> <email>prz@juniper.net</email> </address> </author> <date month="October" year="2024"/> <area>RTG</area> <workgroup>lsr</workgroup> <!-- [rfced] Please insert any keywords (beyond those that appear in the title) for use on https://www.rfc-editor.org/search. --> <keyword>example</keyword> <abstract> <t>Current Link State PDU flooding rates are much slower than what modern networks can support. The use of IS-IS at larger scale requires faster flooding rates to achieve desired convergence goals. This document discusses the need for faster flooding, the issues around faster flooding, and some example approaches to achieve faster flooding. It also defines protocol extensions relevant to faster flooding. 
</t> </abstract> </front> <middle> <section numbered="true" toc="default"> <name>Introduction</name> <t>Link state IGPs such as Intermediate System to Intermediate System (IS-IS) depend upon having consistent Link State Databases (LSDBs) on all Intermediate Systems (ISs) in the network in order to provide correct forwarding of data packets. When topology changes occur, new/updated Link State PDUs (LSPs) are propagated network-wide. The speed of propagation is a key contributor to convergence time.</t> <t>The IS-IS base specification <xref target="ISO10589" format="default"/> uses static flooding rates rather than flow or congestion control. Historically, flooding rates have been conservative, on the order of tens of LSPs per second. This is the result of guidance in the base specification and early deployments when the CPU and interface speeds were much slower and the area scale was much smaller than they are today.</t> <t>As IS-IS is deployed in greater scale both in the number of nodes in an area and in the number of neighbors per node, the impact of the historic flooding rates becomes more significant. Consider the bring-up or failure of a node with 1000 neighbors. This will result in a minimum of 1000 LSP updates. At typical LSP flooding rates used today (33 LSPs per second), it would take more than 30 seconds simply to send the updated LSPs to a given neighbor. 
Depending on the diameter of the network, achieving a consistent LSDB on all nodes in the network could easily take a minute or more.</t> <t>Therefore, increasing the LSP flooding rate becomes an essential element of supporting greater network scale.</t> <t>Improving the LSP flooding rate is complementary to protocol extensions that reduce LSP flooding traffic by reducing the flooding topology, such as Mesh Groups <xref target="RFC2973" format="default"/> or Dynamic Flooding <xref target="I-D.ietf-lsr-dynamic-flooding" format="default"/>. Reduction of the flooding topology does not alter the number of LSPs required to be exchanged between two nodes, so increasing the overall flooding speed is still beneficial when such extensions are in use. It is also possible that the flooding topology can be reduced in ways that prefer the use of neighbors that support improved flooding performance.</t> <t>With the goal of supporting faster flooding, this document introduces the signaling of additional flooding-related parameters (<xref target="FloodingTLV" format="default"/>), specifies some performance improvements on the receiver (<xref target="Receiver" format="default"/>), and introduces the use of flow and/or congestion control (<xref target="Control" format="default"/>).</t> </section> <section anchor="Language" numbered="true" toc="default"> <name>Requirements Language</name> <t> The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", 
and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here. </t> </section> <section anchor="HISTORY" numbered="true" toc="default"> <name>Historical Behavior</name> <t>The base specification for IS-IS <xref target="ISO10589" format="default"/> was first published in 1992 and updated in 2002. The update made no changes with regard to suggested timer values. Convergence targets at the time were on the order of seconds, and the specified timer values reflect that. Here are some examples:</t> <dl spacing="normal" newline="false"> <dt>minimumLSPGenerationInterval:</dt> <dd><t>This is the minimum time interval between generation of Link State PDUs. A source Intermediate system shall wait at least this long before regenerating one of its own Link State PDUs.</t> <t>The recommended value is 30 seconds.</t></dd> <dt>minimumLSPTransmissionInterval:</dt> <dd><t>This is the amount of time an Intermediate system shall wait before further propagating another Link State PDU from the same source system.</t> <t>The recommended value is 5 seconds.</t></dd> <dt>partialSNPInterval:</dt> <dd><t>This is the amount of time between periodic action for transmission of Partial Sequence Number PDUs. It shall be less than minimumLSPTransmissionInterval.</t> <t>The recommended value is 2 seconds.</t></dd> </dl>
<t>Most relevant to a discussion of the LSP flooding rate is the recommended interval between the transmission of two different LSPs on a given interface.</t> <t>For broadcast interfaces, <xref target="ISO10589" format="default"/> defined:</t> <blockquote><t>minimumBroadcastLSPTransmissionInterval - the minimum interval between PDU arrivals which can be processed by the slowest Intermediate System on the LAN.</t></blockquote> <t> The default value was defined as 33 milliseconds. It is permitted to send multiple LSPs back to back as a burst, but this was limited to 10 LSPs in a one-second period. </t> <t> <!-- [rfced] Is the following referring to a section from [ISO10589]? If so, we recommend updating the text to cite the document. Current: In fact, Section 12.1.2.4.3 states... Perhaps: In fact, Section 12.1.2.4.3 of [ISO10589] states... --> Although this value was specific to LAN interfaces, implementations have commonly applied it to all interfaces, though that was not the original intent of the base specification. 
In fact, Section 12.1.2.4.3 of <xref target="ISO10589" format="default"/> states:</t> <blockquote><t>On point-to-point links the peak rate of arrival is limited only by the speed of the data link and the other traffic flowing on that link.</t></blockquote> <t>Although modern implementations have not strictly adhered to the 33-millisecond interval, it is commonplace for implementations to limit the flooding rate to the same order of magnitude: tens of milliseconds, and not the single digits or fractions of milliseconds that are needed today.</t> <t>In the past 20 years, significant work on achieving faster convergence, more specifically sub-second convergence, has resulted in implementations modifying a number of the above timers in order to support faster signaling of topology changes. For example, minimumLSPGenerationInterval has been modified to support millisecond intervals, often with a backoff algorithm applied to prevent LSP generation storms in the event of rapid successive oscillations.</t> <t>However, the flooding rate has not been fundamentally altered.</t> </section> <section anchor="FloodingTLV" numbered="true" toc="default"> <name>Flooding Parameters TLV</name> <t>This document defines a new Type-Length-Value (TLV) tuple called the "Flooding Parameters TLV" that may be included in IS-IS Hellos (IIHs) or Partial Sequence Number PDUs (PSNPs). It allows IS-IS implementations to advertise flooding-related parameters and capabilities that may be used by the peer to support faster flooding.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>21</dd> <dt>Length:</dt> <dd>variable; the size in octets of the Value field</dd> <dt>Value:</dt> <dd>one or more sub-TLVs</dd> </dl> <t>Several sub-TLVs are defined in this document. 
The support of any sub-TLV is <bcp14>OPTIONAL</bcp14>.</t> <t> For a given IS-IS adjacency, the Flooding Parameters TLV does not need to be advertised in each IIH or PSNP. An IS uses the latest received value for each parameter until a new value is advertised by the peer. However, as IIHs and PSNPs are not reliably exchanged and may never be received, parameters <bcp14>SHOULD</bcp14> be sent even if there is no change in value since the last transmission. For a parameter that has never been advertised, an IS uses its local default value. That value <bcp14>SHOULD</bcp14> be configurable on a per-node basis and <bcp14>MAY</bcp14> be configurable on a per-interface basis. </t> <section anchor="LSPBurstSize" numbered="true" toc="default"> <name>LSP Burst Size Sub-TLV</name> <t>The LSP Burst Size sub-TLV advertises the maximum number of LSPs that the node can receive without an intervening delay between LSP transmissions.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>1</dd> <dt>Length:</dt> <dd>4 octets</dd> <dt>Value:</dt> <dd>number of LSPs that can be received back to back</dd> </dl> </section> <section anchor="InterfaceLSPTransmissionInterval" numbered="true" toc="default"> <name>LSP Transmission Interval Sub-TLV</name> <t>The LSP Transmission Interval sub-TLV advertises the minimum interval, in microseconds, between LSP arrivals that can be sustained on this receiving interface.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>2</dd> <dt>Length:</dt> <dd>4 octets</dd> <dt>Value:</dt> <dd>minimum interval, in microseconds, between two consecutive LSPs received after LSP Burst Size LSPs have been received</dd> </dl> <t>The LSP Transmission Interval is an advertisement of the receiver's sustainable LSP 
reception rate. This rate may be safely used by a sender that does not support the flow control or congestion algorithm. It may also be used as the minimal safe rate by flow control or congestion algorithms in unexpected cases, e.g., when the receiver is not acknowledging LSPs anymore. </t> </section> <section anchor="LPP" numbered="true" toc="default"> <name>LSPs per PSNP Sub-TLV</name> <t>The LSPs per PSNP (LPP) sub-TLV advertises the number of received LSPs that triggers the immediate sending of a PSNP to acknowledge them.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>3</dd> <dt>Length:</dt> <dd>2 octets</dd> <dt>Value:</dt> <dd>number of LSPs acknowledged per PSNP</dd> </dl> <t>A node advertising this sub-TLV with a value for LPP <bcp14>MUST</bcp14> send a PSNP once LPP LSPs have been received and need to be acknowledged.</t> </section> <section anchor="Flags" numbered="true" toc="default"> <name>Flags Sub-TLV</name> <t>The Flags sub-TLV advertises a set of flags.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>4</dd> <dt>Length:</dt> <dd>Indicates the length in octets (1-8) of the Value field. The length <bcp14>SHOULD</bcp14> be the minimum required to send all bits that are set.</dd> <dt>Value:</dt> <dd><t>list of flags</t> <artwork align="left" name="" type="" alt=""><![CDATA[
  0 1 2 3 4 5 6 7 ...
 +-+-+-+-+-+-+-+-+...
 |O|              ...
 +-+-+-+-+-+-+-+-+...]]></artwork> </dd></dl> <t>An LSP receiver sets the O-flag (Ordered acknowledgment) to indicate to the LSP sender that it will acknowledge the LSPs in the order in which they were received. A PSNP acknowledging N LSPs is acknowledging the N oldest LSPs received. The order inside the PSNP is meaningless. 
If the sender keeps track of the order of LSPs sent, this indication allows for fast detection of the loss of an LSP. This <bcp14>MUST NOT</bcp14> be used to alter the retransmission timer for any LSP. This <bcp14>MAY</bcp14> be used to trigger a congestion signal.</t> </section> <section anchor="partialSNPI" numbered="true" toc="default"> <name>Partial SNP Interval Sub-TLV</name> <!--[rfced] "Partial SNP" vs. "PNSP" Regarding the Partial SNP Interval sub-TLV, we note that "PNSP" is used in another description in the same registry ("LSPs per PSNP"). Would you like to leave the name as is, or change it to use "PSNP"? Current: 4.5. Partial SNP Interval Sub-TLV Perhaps: 4.5. PSNP Interval Sub-TLV If the latter, we will ask IANA to update the registry (https://www.iana.org/assignments/isis-tlv-codepoints/ isis-tlv-codepoints.xhtml#isis-sub-tlvs-for-flooding-parameters-tlv) accordingly. --> <t>The Partial SNP Interval sub-TLV advertises the amount of time in milliseconds between periodic action for transmission of PSNPs. This time will trigger the sending of a PSNP even if the number of unacknowledged LSPs received on a given interface does not exceed LPP (<xref target="LPP" format="default"/>). 
The time is measured from the reception of the first unacknowledged LSP.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>5</dd> <dt>Length:</dt> <dd>2 octets</dd> <dt>Value:</dt> <dd>partialSNPInterval in milliseconds</dd> </dl> <t>A node advertising this sub-TLV <bcp14>SHOULD</bcp14> send a PSNP at least once per Partial SNP Interval if one or more unacknowledged LSPs have been received on a given interface.</t> </section> <section anchor="RWIN" numbered="true" toc="default"> <name>Receive Window Sub-TLV</name> <t>The Receive Window (RWIN) sub-TLV advertises the maximum number of unacknowledged LSPs that the node can receive for a given adjacency.</t> <dl newline="false" spacing="compact"> <dt>Type:</dt> <dd>6</dd> <dt>Length:</dt> <dd>2 octets</dd> <dt>Value:</dt> <dd>maximum number of unacknowledged LSPs</dd> </dl> </section> <section anchor="TLVoperationLAN" numbered="true" toc="default"> <name>Operation on a LAN Interface</name> <t>On a LAN interface, all LSPs are link-level multicasts. Each LSP sent will be received by all ISs on the LAN, and each IS will receive LSPs from all transmitters. In this section, we clarify how the flooding parameters should be interpreted in the context of a LAN.</t> <t>An LSP receiver on a LAN will communicate its desired flooding parameters using a single Flooding Parameters TLV, which will be received by all LSP transmitters. The flooding parameters sent by the LSP receiver <bcp14>MUST</bcp14> be understood as instructions from the LSP receiver to each LSP transmitter about the desired maximum transmit characteristics of each transmitter. The receiver is aware that there are multiple transmitters that can send LSPs to the receiver LAN interface. 
The receiver might want to take that into account by advertising more conservative values, e.g., a higher LSP Transmission Interval. When the transmitters receive the LSP Transmission Interval value advertised by an LSP receiver, the transmitters should rate-limit LSPs according to the advertised flooding parameters. They should not apply any further interpretation to the flooding parameters advertised by the receiver.</t> <t>A given LSP transmitter will receive multiple flooding parameter advertisements from different receivers that may include different flooding parameter values. A given transmitter <bcp14>SHOULD</bcp14> use the most conservative value on a per-parameter basis. For example, if the transmitter receives multiple LSP Burst Size values, it should use the smallest value.</t> <t>The Designated Intermediate System (DIS) plays a special role in the operation of flooding on the LAN as it is responsible for responding to PSNPs sent on the LAN circuit that are used to request LSPs that the sender of the PSNP does not have. If the DIS does not support faster flooding, this will impact the maximum flooding speed that could occur on a LAN. Use of LAN priority to prefer a node that supports faster flooding in the DIS election may be useful.</t> <aside><t>Note: The work used to develop the example algorithms discussed later in this document focused on operation over point-to-point interfaces. 
A full discussion of how best to do faster flooding on a LAN interface is therefore out of scope for this document.</t></aside> </section> </section> <section anchor="Receiver" numbered="true" toc="default"> <name>Performance Improvement on the Receiver</name> <t>This section defines two behaviors that <bcp14>SHOULD</bcp14> be implemented on the receiver.</t> <section anchor="LSPACKRate" numbered="true" toc="default"> <name>Rate of LSP Acknowledgments</name> <t>On point-to-point networks, PSNPs provide acknowledgments for received LSPs. <xref target="ISO10589" format="default"/> suggests using some delay when sending PSNPs. This provides some optimization as multiple LSPs can be acknowledged by a single PSNP.</t> <t>Faster LSP flooding benefits from a faster feedback loop. This requires a reduction in the delay in sending PSNPs. </t> <t>For the generation of PSNPs, the receiver <bcp14>SHOULD</bcp14> use a partialSNPInterval smaller than the one defined in <xref target="ISO10589" format="default"/>. The choice of this lower value is a local matter. It may depend on the available processing power of the node, the number of adjacencies, and the requirement to synchronize the LSDB more quickly. 200 ms seems to be a reasonable value.</t> <t>In addition to the timer-based partialSNPInterval, the receiver <bcp14>SHOULD</bcp14> keep track of the number of unacknowledged LSPs per circuit and level. When this number exceeds a preset threshold of LSPs per PSNP (LPP), the receiver <bcp14>SHOULD</bcp14> immediately send a PSNP without waiting for the PSNP timer to expire. In the case of a burst of LSPs, this allows for more frequent PSNPs, giving faster feedback to the sender. 
Outside of the burst case, the usual time-based PSNP approach comes into effect.</t> <t>The smaller the LPP is, the faster the feedback to the sender and possibly the higher the rate if the rate is limited by the end-to-end RTT (link RTT + time to acknowledge). This may result in an increase in the number of PSNPs sent, which may increase CPU and I/O load on both the sender and receiver. The LPP should be less than or equal to 90 as this is the maximum number of LSPs that can be acknowledged in a PSNP at common MTU sizes; hence, waiting longer would not reduce the number of PSNPs sent but would delay the acknowledgments. LPP should not be chosen too high as the congestion control starts with a congestion window of LPP + 1. Based on experimental evidence, 15 unacknowledged LSPs is a good value, assuming that the Receive Window is at least 30. More frequent PSNPs give the transmitter more feedback on receiver progress, allowing the transmitter to continue transmitting while not burdening the receiver with undue overhead. </t> <t>By deploying both the time-based and the threshold-based PSNP approaches, the receiver can be adaptive to both LSP bursts and infrequent LSP updates. 
</t> <t>As PSNPs also consume link bandwidth, packet-queue space, and protocol-processing time on receipt, the increased sending of PSNPs should be taken into account when considering the rate at which LSPs can be sent on an interface.</t> </section> <section anchor="PKTPRI" numbered="true" toc="default"> <name>Packet Prioritization on Receive</name> <t>There are three classes of PDUs sent by IS-IS:</t> <ul spacing="normal"> <li> <t>Hellos</t> </li> <li> <t>LSPs</t> </li> <li> <t>Complete Sequence Number PDUs (CSNPs) and PSNPs</t> </li> </ul> <t>Implementations today may prioritize the reception of Hellos over LSPs and Sequence Number PDUs (SNPs) in order to prevent a burst of LSP updates from triggering an adjacency timeout, which in turn would require additional LSPs to be updated.</t> <t>CSNPs and PSNPs serve to trigger or acknowledge the transmission of specified LSPs. On a point-to-point link, PSNPs acknowledge the receipt of one or more LSPs. For this reason, <xref target="ISO10589" format="default"/> specifies a delay (partialSNPInterval) before sending a PSNP so that the number of PSNPs required to be sent is reduced. On receipt of a PSNP, the set of LSPs acknowledged by that PSNP can be marked so that they do not need to be retransmitted.</t> <t>If a PSNP is dropped on reception, the set of LSPs advertised in the PSNP cannot be marked as acknowledged, and this results in needless retransmissions that further delay transmission of other LSPs that are yet to be transmitted. It may also make it more likely that a receiver becomes overwhelmed by LSP transmissions.</t> <t>Therefore, implementations <bcp14>SHOULD</bcp14> prioritize IS-IS PDUs on the way from the incoming interface to the IS-IS process. The relative priority of packets in decreasing order <bcp14>SHOULD</bcp14> be: Hellos, SNPs, and LSPs. 
Implementations <bcp14>MAY</bcp14> also prioritize IS-IS packets over other protocols, which are less critical for the router or network, less sensitive to delay, or more bursty (e.g., BGP).</t> </section> </section> <section anchor="Control" numbered="true" toc="default"> <name>Congestion and Flow Control</name> <section anchor="Overview" numbered="true" toc="default"> <name>Overview</name> <t>Ensuring the goodput between two entities is a Layer 4 responsibility as per the OSI model. A typical example is TCP, defined in <xref target="RFC9293" format="default"/>, which provides flow control, congestion control, and reliability. </t> <t>Flow control creates a control loop between a transmitter and a receiver so that the transmitter does not overwhelm the receiver. TCP provides a means for the receiver to govern the amount of data sent by the sender through the use of a sliding window.</t> <t> Congestion control prevents the set of transmitters from overwhelming the path of the packets between two IS-IS implementations. This path typically includes a point-to-point link between two IS-IS neighbors, which is usually oversized compared to the capability of the IS-IS speakers, but potentially also includes some internal elements inside each neighbor such as switching fabric, line card CPU, and forwarding plane buffers that may experience congestion. These resources may be shared across multiple IS-IS adjacencies for the system, and it is the responsibility of congestion control to ensure that these are shared reasonably.</t> <t>Reliability provides loss detection and recovery. IS-IS already has mechanisms to ensure the reliable transmission of LSPs. This is not changed by this document.</t> <!-- [rfced] In Section 6.1, is this sentence referring to Sections 6.2 and 6.3? If so, may we update it as follows for precision? 
Current: The following two sections provide two Flow and/or Congestion control algorithms that may be implemented by taking advantage of the extensions defined in this document. Perhaps: Sections 6.2 and 6.3 provide two flow and/or congestion control algorithms that may be implemented by taking advantage of the extensions defined in this document. --> <t>The following two sections provide two flow and/or congestion control algorithms that may be implemented by taking advantage of the extensions defined in this document. The signal that these IS-IS extensions (defined in Sections <xref target="FloodingTLV" format="counter"/> and <xref target="Receiver" format="counter"/>) provide is generic and is designed to support different sender-side algorithms. A sender can unilaterally choose a different algorithm to use.</t> </section> <section anchor="RWIN-Algo" numbered="true" toc="default"> <name>Congestion and Flow Control Algorithm</name> <section anchor="FlowControl" numbered="true" toc="default"> <name>Flow Control</name> <!-- [rfced] Will the meaning of "a single instance of a transmitter" be clear to the reader? Is it intended to contrast with "a single receiver"? Original: A flow control mechanism creates a control loop between a single instance of a transmitter and a single receiver. --> <t> A flow control mechanism creates a control loop between a single instance of a transmitter and a single receiver. This section uses a mechanism similar to the TCP receive window to allow the receiver to govern the amount of data sent by the sender. This receive window ('rwin') indicates an allowed number of LSPs that the sender may transmit before waiting for an acknowledgment. The size of the receive window, in units of LSPs, is initialized with the value advertised by the receiver in the Receive Window sub-TLV. <!-- [rfced] What does "this neighbor" refer to? 
The sender, the receiver, or otherwise? (There is no mention of a 'neighbor' in the preceding text of this paragraph.) Current: If no value is advertised, the transmitter should initialize rwin with its locally configured value for this neighbor. --> If no value is advertised, the transmitter should initialize rwin with its locally configured value for this neighbor. </t> <t> When the transmitter sends a set of LSPs to the receiver, it subtracts the number of LSPs sent from rwin. If the transmitter receives a PSNP, then rwin is incremented for each acknowledged LSP. The transmitter must ensure that the value of rwin never goes negative. </t> <t>The RWIN value is of importance when the RTT is the limiting factor for the throughput. In this case, the optimal size is the desired LSP rate multiplied by the RTT. The RTT is the sum of the link RTT and the time taken by the receiver to acknowledge the first received LSP in its PSNP. An RWIN of 50 or 100 may be a reasonable default. <!--[rfced] Please clarify this sentence; specifically, is "and limits" outside of the "assuming" phrase? Original: As an example, a RWIN of 100 requires a control plane input buffer of 150 kbytes per neighbor assuming an IS-IS MTU of 1500 octets and limits the throughput to 10000 LSPs per second and per neighbor for a link RTT of 10 ms. Perhaps (adding parentheses): As an example, an RWIN of 100 requires a control plane input buffer of 150 kbytes per neighbor (assuming an IS-IS MTU of 1500 octets) and limits the throughput to 10000 LSPs per second and per neighbor for a link RTT of 10 ms. --> As an example, an RWIN of 100 requires a control plane input buffer of 150 kbytes per neighbor (assuming an IS-IS MTU of 1500 octets) and limits the throughput to 10000 LSPs per second per neighbor for a link RTT of 10 ms. With the same RWIN, the throughput limitation is 2000 LSPs per second when the RTT is 50 ms. 
That is the maximum throughput assuming no other limitations such as CPU limitations.</t> <t>Equally, RTT is of importance for the performance. That is why the performance improvements on the receiver specified in <xref target="Receiver" format="default"/> are important to achieve good throughput. If the receiver does not support those performance improvements, in the worst case (small RWIN and high RTT) the throughput will be limited by the LSP Transmission Interval as defined in <xref target="InterfaceLSPTransmissionInterval" format="default"/>.</t> <section anchor="TLVoperationP2P" numbered="true" toc="default"> <name>Operation on a Point-to-Point Interface</name> <t>By sending the Receive Window sub-TLV, a node advertises to its neighbor its ability to receive that many unacknowledged LSPs from the neighbor. This is akin to a receive window or sliding window in flow control. In some implementations, this value should reflect the IS-IS socket buffer size. Special care must be taken to leave space for CSNPs, PSNPs, and IIHs if they share the same input queue. In this case, this document suggests advertising an LSP Receive Window corresponding to half the size of the IS-IS input queue. </t> <t>By advertising an LSP Transmission Interval sub-TLV, a node advertises its ability to receive LSPs separated by at least the advertised value, outside of LSP bursts.</t> <t>By advertising an LSP Burst Size sub-TLV, a node advertises its ability to receive that number of LSPs back to back.</t> <t>The LSP transmitter <bcp14>MUST NOT</bcp14> exceed these parameters. After having sent a full burst of LSPs, it <bcp14>MUST</bcp14> send the subsequent LSPs with a minimum of LSP Transmission Interval between LSP transmissions. 
For CPU scheduling reasons, this rate <bcp14>MAY</bcp14> be averaged over a small period, e.g., 10-30 ms.</t> <t>If either the LSP transmitter or receiver does not adhere to these parameters, for example, because of transient conditions, this doesn't result in a fatal condition for IS-IS operation. In the worst case, an LSP is lost at the receiver, and this situation is already remedied by mechanisms in <xref target="ISO10589" format="default"/>. After a few seconds, neighbors will exchange PSNPs (for point-to-point interfaces) or CSNPs (for broadcast interfaces) and recover from the lost LSPs. This worst case should be avoided as those additional seconds impact convergence time since the LSDB is not fully synchronized. Hence, it is better to err on the conservative side and to under-run the receiver rather than over-run it.</t> </section> <section numbered="true" toc="default"> <name>Operation on a Broadcast LAN Interface</name> <t>Flow and congestion control on a LAN interface is out of scope for this document.</t> </section> </section> <section anchor="CongestionControl" numbered="true" toc="default"> <name>Congestion Control</name> <t>Whereas flow control prevents the sender from overwhelming the receiver, congestion control prevents senders from overwhelming the network. For an IS-IS adjacency, the network between two IS-IS neighbors is relatively limited in scope and includes a single link that is typically oversized compared to the capability of the IS-IS speakers. In situations where the probability of LSP drop is low, flow control (<xref target="FlowControl" format="default"/>) is expected to give good results, without the need to implement congestion control.
Otherwise, adding congestion control will help handle congestion of LSPs in the receiver.</t> <t>This section describes one sender-side congestion control algorithm largely inspired by the TCP congestion control algorithm <xref target="RFC5681" format="default"/>.</t> <t>The proposed algorithm uses a variable congestion window 'cwin'. It plays a role similar to the receive window described above. The main difference is that cwin is adjusted dynamically according to various events described below.</t> <section anchor="CC1Core" numbered="true" toc="default"> <name>Core Algorithm</name> <t>In its simplest form, the congestion control algorithm looks like the following:</t> <figure anchor="cc1_core_algo"> <artwork name="" type="" align="left" alt=""><![CDATA[
   +---------------+
   |               |
   |               v
   |   +----------------------+
   |   | Congestion avoidance |
   |   + ---------------------+
   |               |
   |               | Congestion signal
   ----------------+
]]></artwork> </figure> <t>The algorithm starts with cwin = cwin0 = LPP + 1. In the congestion avoidance phase, cwin increases as LSPs are acked: for every acked LSP, cwin += 1 / cwin without exceeding RWIN. When LSPs are exchanged, cwin LSPs will be acknowledged in 1 RTT, meaning cwin(t) = t/RTT + cwin0. Since the RTT is low in many IS-IS deployments, the sending rate can reach fast rates in short periods of time.</t> <t>When updating cwin, it must not become higher than the number of LSPs waiting to be sent; otherwise, the sending will not be paced by the receiving of acks. Said differently, transmission pressure is needed to maintain and increase cwin.</t> <t>When the congestion signal is triggered, cwin is set back to its initial value, and the congestion avoidance phase starts again.</t> </section> <section anchor="CC1CongestionSignals" numbered="true" toc="default"> <name>Congestion Signals</name> <t>The congestion signal can take various forms.
The more reactive the congestion signals, the fewer LSPs will be lost due to congestion. However, overly aggressive congestion signals will cause a sender to keep a very low sending rate even without actual congestion on the path.</t> <t>Two practical signals are given below.</t> <ol spacing="normal" type="1"> <li><t>Delay: When receiving acknowledgments, a sender estimates the acknowledgment time of the receiver. Based on this estimation, it can infer that a packet was lost and that congestion is on the path.</t> <t>There can be a timer per LSP, but this can become costly for implementations. It is possible to use only a single timer t1 for all LSPs: during t1, sent LSPs are recorded in a list, list_1. Once the RTT is over, list_1 is kept and another list, list_2, is used to store the next LSPs. LSPs are removed from the lists when acked. At the end of the second t1 period, every LSP in list_1 should have been acked, so list_1 is checked to be empty. list_1 can then be reused for the next RTT.</t> <t>There are multiple strategies to set the timeout value t1. It should be based on measurements of the maximum acknowledgment time (MAT) of each PSNP.
The simplest one is to use three times the RTT. Alternatively, an exponential moving average of the MATs can be used, as in <xref target="RFC6298" format="default"/>. A more elaborate one is to take a running maximum of the MATs over a period of a few seconds. This value should include a margin of error to avoid false positives (e.g., estimated MAT measure variance), which would have a significant impact on performance.</t></li> <li><t>Loss: if the receiver has signaled the O-flag (see <xref target="Flags" format="default"/>), a sender <bcp14>MAY</bcp14> record its sending order and check that acknowledgments arrive in the same order. If not, some LSPs are missing, and this <bcp14>MAY</bcp14> be used to trigger a congestion signal.</t></li> </ol> </section> <section anchor="CC1Refinement" numbered="true" toc="default"> <name>Refinement</name> <t>With the algorithm presented above, if congestion is detected, cwin goes back to its initial value and does not use the information gathered in previous congestion avoidance phases.</t> <t>It is possible to use a fast recovery phase once congestion is detected and to avoid going through this linear rate of growth from scratch. When congestion is detected, a fast recovery threshold frthresh is set to frthresh = cwin / 2. In this fast recovery phase, for every acked LSP, cwin += 1.
Once cwin reaches frthresh, the algorithm goes back to the congestion avoidance phase.</t> <figure anchor="cc1_algo_refinement_1"> <artwork name="" type="" align="left" alt=""><![CDATA[
   +---------------+
   |               |
   |               v
   |   +----------------------+
   |   | Congestion avoidance |
   |   + ---------------------+
   |               |
   |               | Congestion signal
   |               |
   |   +----------------------+
   |   |     Fast recovery    |
   |   +----------------------+
   |               |
   |               | frthresh reached
   ----------------+
]]></artwork> </figure> </section> <section anchor="cc_remarks" numbered="true" toc="default"> <name>Remarks</name> <t> This algorithm's performance is dependent on the LPP value. Indeed, the smaller the LPP is, the more information is available for the congestion control algorithm to perform well. However, it also increases the resources spent on sending PSNPs, so a trade-off must be made. This document recommends using an LPP of 15 or less. If a Receive Window is advertised, LPP <bcp14>SHOULD</bcp14> be lower, and the best performance is achieved when LPP is an integer fraction of the Receive Window. </t> <t>Note that this congestion control algorithm benefits from the extensions proposed in this document. The advertisement of a receive window from the receiver (<xref target="FlowControl" format="default"/>) avoids the use of an arbitrary maximum value by the sender. The faster acknowledgment of LSPs (<xref target="LSPACKRate" format="default"/>) allows for a faster control loop and hence a faster increase of the congestion window in the absence of congestion.
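</t>
<t>The congestion avoidance and fast recovery phases described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the class CongestionWindow and its method names are hypothetical, cwin is assumed to return to cwin0 when the congestion signal fires (as in the core algorithm) before fast recovery begins, and the cap by the number of pending LSPs is assumed to stop growth rather than shrink cwin.</t>

```python
class CongestionWindow:
    """Sender-side congestion window with the fast recovery refinement."""

    def __init__(self, lpp: int, rwin: int) -> None:
        self.rwin = rwin                # advertised Receive Window (upper bound)
        self.cwin0 = float(lpp + 1)     # initial window: cwin0 = LPP + 1
        self.cwin = self.cwin0
        self.frthresh = 0.0             # fast recovery threshold
        self.fast_recovery = False

    def on_lsp_acked(self, pending_lsps: int) -> None:
        """Grow cwin for one acknowledged LSP."""
        if self.fast_recovery:
            grown = self.cwin + 1.0                 # fast recovery: cwin += 1
        else:
            grown = self.cwin + 1.0 / self.cwin     # avoidance: cwin += 1 / cwin
        # Growth stops at RWIN and at the number of LSPs waiting to be sent.
        # Assumption: this cap never shrinks cwin; only a congestion signal does.
        limit = max(self.cwin, float(min(self.rwin, pending_lsps)))
        self.cwin = min(grown, limit)
        if self.fast_recovery and self.cwin >= self.frthresh:
            self.fast_recovery = False              # frthresh reached

    def on_congestion_signal(self) -> None:
        """React to a delay-based or loss-based congestion signal."""
        self.frthresh = self.cwin / 2.0
        self.cwin = self.cwin0
        # Skip fast recovery when the threshold is not above the initial window.
        self.fast_recovery = self.frthresh > self.cwin0
```

<t>With an LPP of 15 and a Receive Window of 60, for example, such a controller would start at cwin = 16, and a congestion signal received at cwin = 40 would set frthresh to 20, so four acknowledged LSPs would bring the sender back to the congestion avoidance phase.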
</t> </section> </section> <section anchor="Pacing" numbered="true" toc="default"> <name>Pacing</name> <t>As discussed in <xref target="RFC9002" sectionFormat="comma" section="7.7"/>, a sender <bcp14>SHOULD</bcp14> pace sending of all in-flight LSPs based on input from the congestion controller.</t> <t>Sending multiple packets without any delay between them creates a packet burst that might cause short-term congestion and losses. Senders <bcp14>MUST</bcp14> either use pacing or limit such bursts. Senders <bcp14>SHOULD</bcp14> limit bursts to LSP Burst Size.</t> <t>Senders can implement pacing as they choose. A perfectly paced sender spreads packets evenly over time. For a window-based congestion controller, such as the one in this section, that rate can be computed by averaging the congestion window over the RTT. Expressed as an inter-packet interval in units of time:</t> <t indent="3">interval = (SRTT / cwin) / N</t> <t>SRTT is the Smoothed Round-Trip Time <xref target="RFC6298" format="default"/>.</t> <t>Using a value for N that is small, but at least 1 (for example, 1.25), ensures that variations in RTT do not result in underutilization of the congestion window.</t> <t>Practical considerations, such as scheduling delays and computational efficiency, can cause a sender to deviate from this rate over time periods that are much shorter than an RTT.</t> <t>One possible implementation strategy for pacing uses a leaky bucket algorithm, where the capacity of the "bucket" is limited to the maximum burst size, and the rate that the "bucket" fills is determined by the above function.</t> </section> <section anchor="sec_determining_values" numbered="true" toc="default"> <name>Determining Values to be Advertised in the Flooding Parameters TLV</name> <t>The values that a receiver advertises do not need to be perfect.
If the values are too low, then the transmitter will not use the full bandwidth or available CPU resources. If the values are too high, then the receiver may drop some LSPs during the first RTT; this loss will reduce the usable receive window, and the protocol mechanisms will allow the adjacency to recover. Flooding slower than both nodes can support will hurt performance, as will consistently overloading the receiver.</t> <section anchor="sec_determining_values_static" numbered="true" toc="default"> <name>Static Values</name> <t>The values advertised need not be dynamic, as feedback is provided by the acknowledgment of LSPs in SNP messages. Acknowledgments provide a feedback loop on how fast the LSPs are processed by the receiver. They also signal that the LSPs can be removed from the receive window, explicitly signaling to the sender that more LSPs may be sent. By advertising relatively static parameters, we expect to produce overall flooding behavior similar to what might be achieved by manually configuring per-interface LSP rate-limiting on all interfaces in the network. The advertised values could be based, for example, on offline tests of the overall LSP-processing speed for a particular set of hardware and the number of interfaces configured for IS-IS. With such a formula, the values advertised in the Flooding Parameters TLV would only change when additional IS-IS interfaces are configured.</t> <t>Static values are dependent on the CPU generation, class of router, and network scaling, typically the number of adjacent neighbors. Examples at the time of publication are provided below. The LSP Burst Size could be in the range 5 to 20. From a router perspective, this value typically depends on the queue(s) size(s) on the I/O path from the packet forwarding engine to the control plane, which is very platform-dependent.
It also depends upon how many IS-IS neighbors share this I/O path, as typically all neighbors will send the same LSPs at the same time. It may also depend on other incoming control plane traffic that is sharing that I/O path, how bursty that traffic is, and how many incoming IS-IS packets are prioritized over other incoming control plane traffic. As indicated in <xref target="HISTORY" format="default"/>, the historical behavior from <xref target="ISO10589" format="default"/> allows a value of 10; hence, 10 seems conservative. From a network operation perspective, it would be beneficial for the burst size to be equal to or higher than the number of LSPs that may be originated by a single failure. For a node failure, this is equal to the number of IS-IS neighbors of the failed node. The LSP Transmission Interval could be in the range of 1 ms to 33 ms. As indicated in <xref target="HISTORY" format="default"/>, the historical behavior from <xref target="ISO10589" format="default"/> is 33 ms; hence, 33 ms is conservative. The LSP Transmission Interval is an advertisement of the receiver's sustainable LSP reception rate taking into account all aspects and particularly the control plane CPU and the I/O bandwidth. It's expected to improve (hence, decrease) as hardware and software naturally improve over time. It should be chosen conservatively, as this rate may be used by the sender in all conditions, including the worst conditions. It's also not a bottleneck as the flow control algorithm may use a higher rate in good conditions, particularly when the receiver acknowledges quickly, and the receive window is large enough compared to the RTT. LPP could be in the range of 5 to 90 with a proposed value of 15. A smaller value provides faster feedback at the cost of the small overhead of more PSNP messages.
PartialSNPInterval could be in the range 50 to 500 ms with a proposed value of 200 ms. One may distinguish the value used locally from the value signaled to the sender. The value used locally benefits from being small but is not expected to be the main parameter to improve performance. It depends on how fast the IS-IS flooding process may be scheduled by the CPU. Even when the receiver CPU is busy, it's safe because it will naturally delay its acknowledgments, which provides a negative feedback loop. The value advertised to the sender should be conservative (high enough) as this value could be used by the sender to send some LSPs rather than keep waiting for acknowledgments. Receive Window could be in the range of 30 to 200 with a proposed value of 60. In general, the larger the window, the better the performance on links with high RTT. The higher that number and the higher the number of IS-IS neighbors, the higher the use of control plane memory, so it's mostly dependent on the amount of memory that may be dedicated to IS-IS flooding and on the number of IS-IS neighbors. From a memory usage perspective (a priori), one could use the same value as the TCP receive window, but the value advertised should not be higher than the buffer of the "socket" used.</t> </section> <section anchor="sec_determining_values_dynamic" numbered="true" toc="default"> <name>Dynamic Values</name> <t>To reflect the relative change of load on the receiver, the values may be updated dynamically by improving the values when the receiver load is getting lower and by degrading the values when the receiver load is getting higher. For example, if LSPs are regularly dropped, or if the queue regularly comes close to being filled, then the values may be too high.
On the other hand, if the queue is barely used (by IS-IS), then the values may be too low.</t> <t>The values may also be absolute values that reflect the relevant average hardware resources that are monitored, typically the amount of buffer space used by incoming LSPs. In this case, care must be taken when choosing the parameters influencing the values in order to avoid undesirable or unstable feedback loops. For example, it would be undesirable to use a formula that depends on an active measurement of the instantaneous CPU load to modify the values advertised in the Flooding Parameters TLV.
This could introduce feedback into the IGP flooding process that could produce unexpected behavior.</t> </section> </section> <section anchor="OPS_Considerations" numbered="true" toc="default"> <name>Operation Considerations</name> <t>As discussed in <xref target="TLVoperationLAN" format="default"/>, the solution is more effective on point-to-point adjacencies. Hence, a broadcast interface (e.g., Ethernet) only shared by two IS-IS neighbors should be configured as point-to-point in order to have more effective flooding.</t> </section> </section> <section anchor="TxSide" numbered="true" toc="default"> <name>Transmitter-Based Congestion Control Approach</name> <t>This section describes an approach to the congestion control algorithm based on performance measured by the transmitter without dependence on signaling from the receiver.</t> <section anchor="Router-arch" numbered="true" toc="default"> <name>Router Architecture Discussion</name> <t>Note that the following description is an abstraction; implementation details vary.</t> <t>Existing router architectures may utilize multiple input queues. On a given line card, IS-IS PDUs from multiple interfaces may be placed in a rate-limited input queue. This queue may be dedicated to IS-IS PDUs or may be shared with other routing related packets.</t> <t>The input queue may then pass IS-IS PDUs to a "punt queue," which is used to pass PDUs from the data plane to the control plane.
The punt queue typically also has controls on its size and the rate at which packets will be punted.</t> <t>An input queue in the control plane may then be used to assemble PDUs from multiple line cards, separate the IS-IS PDUs from other types of packets, and place the IS-IS PDUs in an input queue dedicated to the IS-IS protocol.</t> <t>The IS-IS input queue then separates the IS-IS PDUs and directs them to an instance-specific processing queue. The instance-specific processing queue may then further separate the IS-IS PDUs by type (IIHs, SNPs, and LSPs) so that separate processing threads with varying priorities may be employed to process the incoming PDUs.</t> <t>In such an architecture, it may be difficult for IS-IS in the control plane to determine what value should be advertised as a receive window.</t> <t>The following section describes an approach to congestion control based on performance measured by the transmitter without dependence on signaling from the receiver.</t> </section> <section anchor="Ex2-tx" numbered="true" toc="default"> <name>Guidelines for Transmitter-Side Congestion Controls</name> <t>The approach described in this section does not depend upon direct signaling from the receiver. Instead, it adapts the transmission rate based on measurement of the actual rate of acknowledgments received.</t> <t>Flow control is not used by this approach. When congestion control is necessary, it can be implemented based on knowledge of the current flooding rate and the current acknowledgment rate. The algorithm used is a local matter. There is no requirement to standardize it, but there are a number of aspects that serve as guidelines that can be described. Algorithms based on this approach should follow the recommendations described below. </t> <t>A maximum LSP transmission rate (LSPTxMax) should be configurable.
This represents the fastest LSP transmission rate that will be attempted. This value should be applicable to all interfaces and should be consistent network-wide.</t> <t>When the current rate of LSP transmission (LSPTxRate) exceeds the capabilities of the receiver, the congestion control algorithm needs to quickly and aggressively reduce the LSPTxRate. Slower responsiveness is likely to result in a larger number of retransmissions, which can introduce much longer delays in convergence.</t> <t>Dynamic increase of the rate of LSP transmission (LSPTxRate), i.e., making the rate faster, should be done less aggressively and only be done when the neighbor has demonstrated its ability to sustain the current LSPTxRate.</t> <t>The congestion control algorithm should not assume that the receive performance of a neighbor is static, i.e., it should handle transient conditions that result in a slower or faster receive rate on the part of a neighbor.</t> <t>The congestion control algorithm should consider the expected delay time in receiving an acknowledgment. Therefore, it incorporates the neighbor partialSNPInterval (<xref target="partialSNPI" format="default"/>) to help determine whether acknowledgments are keeping pace with the rate of LSPs transmitted. In the absence of an advertisement of partialSNPInterval, a locally configured value can be used.</t> </section> </section> </section> <section anchor="IANA_Consideration" numbered="true" toc="default"> <name>IANA Considerations</name> <section anchor="IANA_Consideration1" numbered="true" toc="default"> <name>Flooding Parameters TLV</name> <t>IANA has made the following allocation in the "IS-IS Top-Level TLV Codepoints" registry.
</t>
<table align="center">
<thead>
<tr> <th>Value</th> <th>Name</th> <th>IIH</th> <th>LSP</th> <th>SNP</th> <th>Purge</th> </tr>
</thead>
<tbody>
<tr> <td align="center">21</td> <td>Flooding Parameters TLV</td> <td>y</td> <td>n</td> <td>y</td> <td>n</td> </tr>
</tbody>
</table>
</section>
<section anchor="IANA_Consideration2" numbered="true" toc="default"> <name>Registry: IS-IS Sub-TLV for Flooding Parameters TLV</name>
<t>IANA has created the following sub-TLV registry in the "IS-IS TLV Codepoints" registry group.</t>
<dl newline="false" spacing="compact">
<dt>Name:</dt> <dd>IS-IS Sub-TLVs for Flooding Parameters TLV</dd>
<dt>Registration Procedure(s):</dt> <dd>Expert Review (as defined in <xref target="RFC8126"/>)</dd>
<dt>Description:</dt> <dd>This registry defines sub-TLVs for the Flooding Parameters TLV (21).</dd>
<dt>Reference:</dt> <dd>RFC 9681</dd>
</dl>
<table anchor="Registry_Flooding" align="center"> <name>Initial Sub-TLV Allocations for Flooding Parameters TLV</name>
<thead>
<tr> <th>Type</th> <th>Description</th> </tr>
</thead>
<tbody>
<tr> <td align="center">0</td> <td>Reserved</td> </tr>
<tr> <td align="center">1</td> <td>LSP Burst Size</td> </tr>
<tr> <td align="center">2</td> <td>LSP Transmission Interval</td> </tr>
<tr> <td align="center">3</td> <td>LSPs per PSNP</td> </tr>
<tr> <td align="center">4</td> <td>Flags</td> </tr>
<tr> <td align="center">5</td> <td>Partial SNP Interval</td> </tr>
<tr> <td align="center">6</td> <td>Receive Window</td> </tr>
<tr> <td align="center">7-255</td> <td>Unassigned</td> </tr>
</tbody>
</table>
</section>
<section anchor="IANA_Consideration3" numbered="true" toc="default"> <name>Registry: IS-IS Bit Values for Flooding Parameters Flags Sub-TLV</name>
<t>IANA has created a new registry, in the "IS-IS TLV Codepoints" registry group, for assigning Flag bits advertised in the Flags sub-TLV.</t>
<dl newline="false" spacing="compact">
<dt>Name:</dt> <dd>IS-IS Bit Values for Flooding Parameters Flags Sub-TLV</dd>
<dt>Registration Procedure:</dt> <dd>Expert Review</dd>
<dt>Description:</dt> <dd><t>This registry defines bit values for the Flags sub-TLV (4) advertised in the Flooding Parameters TLV (21).</t></dd>
<dt>Note:</dt><dd><t>In order to minimize encoding space, a new allocation should pick the smallest available value.</t></dd>
<dt>Reference:</dt> <dd>RFC 9681</dd>
</dl>
<table anchor="Registry_Flags" align="center"> <name>Initial Bit Allocations for Flags Sub-TLV</name>
<thead>
<tr> <th>Bit #</th> <th>Description</th> </tr>
</thead>
<tbody>
<tr> <td>0</td> <td>Ordered acknowledgment (O-flag)</td> </tr>
<tr> <td>1-63</td>
<td>Unassigned</td> </tr> </tbody> </table> </section> </section> <section anchor="Security" toc="default" numbered="true"> <name>Security Considerations</name> <t>Security concerns for IS-IS are addressed in <xref target="ISO10589" format="default"/>, <xref target="RFC5304" format="default"/>, and <xref target="RFC5310" format="default"/>. These documents describe mechanisms that provide for the authentication and integrity of IS-IS PDUs, including SNPs and IIHs. These authentication mechanisms are not altered by this document.</t> <t>With the cryptographic mechanisms described in <xref target="RFC5304" format="default"/> and <xref target="RFC5310" format="default"/>, an attacker wanting to advertise an incorrect Flooding Parameters TLV would have to first defeat these mechanisms.</t> <t>In the absence of cryptographic authentication, as IS-IS does not run over IP but directly over the link layer, it's considered difficult to inject a false SNP or IIH without having access to the link layer.</t> <t>If a false SNP or IIH is sent with a Flooding Parameters TLV set to conservative values, the attacker can reduce the flooding speed between the two adjacent neighbors, which can result in LSDB inconsistencies and transient forwarding loops. However, it is not significantly different than filtering or altering LSPs, which would also be possible with access to the link layer.
In addition, if the downstream flooding neighbor has multiple IGPneighbors, whichneighbors (which is typically the case for reliability or topologicalreasons,reasons), it would receive LSPs at a regular speed from its other neighbors and hence would maintain LSDB consistency.</t> <t>If a falseSNP/IIHSNP or IIH is sent with a Flooding Parameters TLV set to aggressive values, the attacker can increase the floodingspeedspeed, which can either overload a node or more likelygeneratecause loss of LSPs. However, it is not significantly different than sending manyLSPsLSPs, which would also be possible with access to the link layer, even with cryptographic authentication enabled. In addition, IS-IS has procedures to detect the loss of LSPs and recover.</t> <t>This TLV advertisement is not flooded across the network but only sent between adjacent IS-IS neighbors. This would limit the consequences in case of forgedmessages,messages and alsolimitslimit the dissemination of such information.</t> </section><section anchor="Contributors" title="Contributors"> <t>The following people gave a substantial contribution to the content of this document and should be considered as coauthors:<list style="symbols"> <t>Jayesh J, Ciena, jayesh.ietf@gmail.com</t> <t>Chris Bowers, Juniper Networks, cbowers@juniper.net</t> <t>Peter Psenak, Cisco Systems, ppsenak@cisco.com</t> </list></t> </section> <section anchor="Acknowledgments" title="Acknowledgments"> <t>The authors would like to thank Henk Smit, Sarah Chen, Xuesong Geng, Pierre Francois, Hannes Gredler, Acee Lindem, Mirja Kuhlewind, Zaheduzzaman Sarker and John Scudder for their reviews, comments and suggestions.</t> <t>The authors would like to thank David Jacquet, Sarah Chen, and Qiangzhou Gao for the tests performed on commercial implementations and their identification of some limiting factors.</t> </section></middle> <back><references title="Normative References"> <?rfc include="reference.RFC.2119"?> <?rfc 
include="reference.RFC.8174"?> <?rfc include="reference.RFC.5304"?> <?rfc include="reference.RFC.5310"?> <?rfc include="reference.RFC.6298"?><displayreference target="I-D.ietf-lsr-dynamic-flooding" to="DYNAMIC-FLOODING"/> <references> <name>References</name> <references> <name>Normative References</name> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5304.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5310.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6298.xml"/> <referenceanchor="ISO10589">anchor="ISO10589" target="https://www.iso.org/standard/30932.html"> <front><title>Intermediate<title>Information technology - Telecommunications and information exchange between systems - Intermediate system to Intermediate system intra-domain routeing information exchange protocol for use in conjunction with the protocol for providing the connectionless-modeNetwork Servicenetwork service (ISO 8473)</title> <author> <organizationabbrev="ISO">Internationalabbrev="ISO/IEC">International Organization forStandardization</organization>Standardization/International Electrotechnical Commission</organization> </author> <date month="Nov" year="2002"/> </front> <seriesInfo name="ISO/IEC"value="10589:2002, Second Edition"/>value="10589:2002"/> <refcontent>Second Edition</refcontent> </reference> </references><references title="Informative References"> <?rfc include="reference.I-D.ietf-lsr-dynamic-flooding"?> <?rfc include="reference.RFC.9293"?> <?rfc include="reference.RFC.9002"?> <?rfc include="reference.RFC.2973"?> <?rfc include="reference.RFC.5681"?><references> <name>Informative References</name> <!-- [I-D.ietf-lsr-dynamic-flooding] IESG state: RFC Ed Queue as of 05/14/24 --> <xi:include 
href="https://datatracker.ietf.org/doc/bibxml3/draft-ietf-lsr-dynamic-flooding.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9293.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9002.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2973.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5681.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8126.xml"/>
</references>
</references>
<section anchor="Acknowledgments" numbered="false" toc="default">
<name>Acknowledgments</name>
<t>The authors would like to thank <contact fullname="Henk Smit"/>, <contact fullname="Sarah Chen"/>, <contact fullname="Xuesong Geng"/>, <contact fullname="Pierre Francois"/>, <contact fullname="Hannes Gredler"/>, <contact fullname="Acee Lindem"/>, <contact fullname="Mirja Kuehlewind"/>, <contact fullname="Zaheduzzaman Sarker"/>, and <contact fullname="John Scudder"/> for their reviews, comments, and suggestions.</t>
<t>The authors would like to thank <contact fullname="David Jacquet"/>, <contact fullname="Sarah Chen"/>, and <contact fullname="Qiangzhou Gao"/> for the tests performed on commercial implementations and for their identification of some limiting factors.</t>
</section>
<section anchor="Contributors" numbered="false" toc="default">
<name>Contributors</name>
<t>The following people gave substantial contributions to the content of this document and should be considered as coauthors:</t>
<contact fullname="Jayesh J">
<organization>Ciena</organization>
<address>
<email>jayesh.ietf@gmail.com</email>
</address>
</contact>
<contact fullname="Chris Bowers">
<organization>Juniper Networks</organization>
<address>
<email>cbowers@juniper.net</email>
</address>
</contact>
<contact fullname="Peter Psenak">
<organization>Cisco Systems</organization>
<address>
<email>ppsenak@cisco.com</email>
</address>
</contact>
</section>
<!-- [rfced] Terminology

A) RWIN vs. rwin
Please review; both forms appear in Sections 6.2.1 and 6.2.2.1. If these refer to the same concept, should these be made consistent? (Perhaps all caps "RWIN" is meant to refer to the value held in the Receive Window sub-TLV, whereas "rwin" is for the receive window size itself?)

B) acknowledgement vs. acknowledgment
FYI, usage was mixed; we have updated to the latter form (no 'e'). We note that form was used slightly more in this document, and it is used more commonly in the normative references of this document. We will ask IANA to update to "Ordered acknowledgment (O-flag)" in the registry on https://www.iana.org/assignments/isis-tlv-codepoints.

C) Per vs. per
We suggest lowercase 'per' be used consistently in "LSP per PSNP" (singular) and "LSPs per PSNP" (plural), even though the acronym "LPP" is used. FYI, we have updated the document in that manner. If you agree, we will ask IANA to update the capitalization in the registry (https://www.iana.org/assignments/isis-tlv-codepoints) accordingly.

Current:
LSP per PSNP (LPP) sub-TLV
LSPs per PSNP (LPP)
-->
<!-- [rfced] FYI, we updated the following abbreviations to expand upon first occurrence and to use them as abbreviations for the remainder of the document.

LSDB - Link State Databases
CSNPs - Complete Sequence Number PDUs
SNPs - Sequence Number PDUs
PSNPs - Partial Sequence Number PDUs
(There is a separate question re: Partial SNP Interval sub-TLV.)
-->
<!-- [rfced] FYI, the final paragraph of Section 4.7 has been put in an <aside> element. Please review.
Also, please let us know whether any of the other notes in this document should be in the <aside> element. It is defined as "a container for content that is semantically less important or tangential to the content that surrounds it" (https://authors.ietf.org/rfcxml-vocabulary#aside).

Current:
| Note: The focus of work used to develop the example algorithms
| discussed later in this document focused on operation over
| point-to-point interfaces. A full discussion of how best to do
| faster flooding on a LAN interface is therefore out of scope
| for this document.
-->
<!-- [rfced] Please review the "Inclusive Language" portion of the online Style Guide <https://www.rfc-editor.org/styleguide/part2/#inclusive_language> and let us know if any changes are needed. Updates of this nature typically result in more precise language, which is helpful for readers.

Note that our script did not flag any words in particular, but this should still be reviewed as a best practice.
-->
<!-- [rfced] For Tony Przygienda, would you like to include the ZIP code in your postal address (seemingly 94089), or is it intentional to not include it?

Current:
Tony Przygienda
Juniper
1137 Innovation Way
Sunnyvale, CA
United States of America
Email: prz@juniper.net
-->
</back>
</rfc>