Improving the Robustness of Stateless Address Autoconfiguration (SLAAC) to Flash Renumbering Events
SI6 Networks
Segurola y Habana 4310, 7mo Piso
Villa Devoto
Ciudad Autonoma de Buenos Aires
Argentina
fgont@si6networks.com
https://www.si6networks.com
6connect
jan@connect.com
Sky UK
richard.patterson@sky.uk
Internet
IPv6 Maintenance (6man) Working Group
In renumbering scenarios where an IPv6 prefix suddenly becomes invalid, hosts on the local network will continue using stale prefixes for an unacceptably long period of time, thus resulting in connectivity problems. This document improves the reaction of IPv6 Stateless Address Autoconfiguration to such renumbering scenarios.
IPv6 network renumbering is usually assumed to take place in a planned manner, with old/stale prefixes being phased out via reduced prefix lifetimes while new prefixes (with normal lifetimes) are introduced. However, there are a number of scenarios that may lead to so-called "flash-renumbering" events, where the prefix being employed on a network suddenly becomes invalid and is replaced by a new prefix . In such scenarios, hosts on the local network will continue using stale prefixes for an unacceptably long period of time, thus resulting in connectivity problems. discusses this problem in detail.
In some scenarios, the local router triggering the network renumbering event may try to deprecate the currently-employed prefixes (by explicitly signaling the network about the renumbering event), whereas in other scenarios the renumbering event may happen inadvertently, without the router explicitly signaling the scenario to local hosts.
From the perspective of a Stateless Address Autoconfiguration (SLAAC) host, there are two different (but related) problems to be solved:
Avoiding the use of stale addresses for new communication instances
Performing "garbage collection" for stale prefixes and related network configuration information
Clearly, if a host has both working and stale addresses, it is paramount that it employs working addresses for new communication instances. Additionally, a host should also perform garbage collection for the stale prefixes/addresses, since they not only unnecessarily tie up system resources, but also prevent communication with the new "owners" of the stale prefixes.
The term "globally reachable" is used in this document as defined in .
The term "Global Unicast Address" (or its acronym "GUA") is used throughout this document to refer to "globally reachable" addresses. That is, when used throughout this document, GUAs do NOT include Unique Local Addresses (ULAs) . Similarly, the term "Global Unicast prefix" (or "GUA prefix") is employed throughout this document to refer to network prefixes that specify GUAs, and does NOT include the ULA prefix (FC00::/7) .
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 when, and only when, they
appear in all capitals, as shown here.
As noted in , in some scenarios the router triggering the renumbering event may be able to explicitly signal this event, while in other scenarios the renumbered hosts may need to infer a renumbering event is taking place. The following subsections analyze specific considerations for each of these scenarios.
In the absence of explicit signalling from SLAAC routers (such as sending Prefix Information Options (PIOs) with small lifetimes to deprecate stale prefixes), stale prefixes will remain preferred and valid according to the Preferred Lifetime and Valid Lifetime parameters (respectively) of the last received PIO. specifies the following default values for PIOs:
Preferred Lifetime (AdvPreferredLifetime): 604800 seconds (7 days)
Valid Lifetime (AdvValidLifetime): 2592000 seconds (30 days)
This means that, in the absence of explicit signaling by a SLAAC router to deprecate a prefix, it will take a host 7 days (one week) to deprecate the corresponding addresses, and 30 days (one month) to eventually remove any addresses configured for the stale prefix. Clearly, for any practical purposes, employing such long default values is generally unacceptable for most deployment scenarios that may experience flash-renumbering events.
provides an operational recommendation for Customer Edge (CE) routers to override the standard default Preferred Lifetime (AdvPreferredLifetime) and Valid Lifetime (AdvValidLifetime) to 2700 seconds (45 minutes) and 5400 seconds (90 minutes), respectively, thus improving the state of affairs for CE router scenarios.
Use of more appropriate timers in Router Advertisement messages can help limit the amount of time that hosts will maintain stale configuration information. Additionally, hosts may normally be in a position to infer whether a prefix has become stale -- for example, if a router ceases to advertise an existing prefix while it continues to advertise other prefixes.
formally updates to use more appropriate (i.e., shorter) default lifetimes for PIOs, while specifies a local policy that SLAAC hosts may implement to infer that network configuration information has changed, such that stale configuration information can be phased out.
In scenarios where a local router is aware of the renumbering event, it may try to phase out the stale network configuration information. In these scenarios, there are two aspects to be considered:
The amount of time during which the router should continue trying to deprecate the stale network configuration information
The ability of SLAAC hosts to phase out stale configuration in a timelier manner.
Since the network could become partitioned at any arbitrary time and for an arbitrarily long period of time, routers need to contemplate the possible scenario where hosts receive an RA message, and the network subsequently becomes partitioned. This means that in order to reliably deprecate stale information, a router should try to deprecate a prefix for a period of time equal to the "Preferred Lifetime" used when advertising the prefix, and try to invalidate the prefix for a period of time equal to the "Valid Lifetime" (see Section 12 of ) used when advertising the prefix.
Once the number of seconds in the original "Preferred Lifetime" has elapsed, all hosts would have deprecated the corresponding addresses anyway, while once the number of seconds in the "Valid Lifetime" has elapsed, the corresponding addresses would be invalidated and removed.
Thus, use of more appropriate default lifetimes for PIOs, as proposed in , would reduce the amount of time a stale prefix would need to be announced as such by a router in order to ensure that it is deprecated/invalidated.
In scenarios where a router has positive knowledge that a prefix has become invalid and thus could signal this condition to local hosts, the current specifications will prevent SLAAC hosts from fully recovering from such stale information: Item "e)" of Section 5.5.3 of specifies that an RA may never reduce the "RemainingLifetime" to less than two hours. Additionally, if the RemainingLifetime of an address is smaller than 2 hours, then a Valid Lifetime smaller than 2 hours will be ignored. The inability to invalidate a stale prefix may prevent communication with the new "owners" of a prefix, and thus is highly undesirable. On the other hand, the Preferred Lifetime of an address *may* be reduced to any value to avoid the use of a stale prefix for new communications.
formally updates to remove this restriction, such that hosts may react to the advertised "Valid Lifetime" even if it is smaller than 2 hours.
Finally, recommends that routers disseminate network configuration information when a network interface is initialized, such that new configuration information propagates in a timelier manner.
The following subsections update and , such that the problem discussed in this document is mitigated. The updates in the following subsections are mostly orthogonal, and mitigate different aspects of SLAAC that prevent a timely reaction to flash renumbering events.
Reduce the default Valid Lifetime and Preferred Lifetime of PIOs ():
This helps limit the amount of time a host may employ stale information, and also limits the amount of time a router needs to try to deprecate stale information.
Honor PIOs with small Valid Lifetimes ():
This allows routers to invalidate stale prefixes, since otherwise would prevent hosts from honoring PIOs with a Valid Lifetime smaller than two hours.
Recommend routers to retransmit configuration information upon interface initialization/reinitialization ():
This helps spread the new information in a timelier manner, and also deprecate stale information via host-side heuristics (see ).
Recommend routers to always send all options (i.e. the complete configuration information) in RA messages, and in the smallest possible number of packets ():
This helps propagate the same information to all hosts, and also allows hosts to better infer that information missing in RA messages has become stale (see ).
Infer stale network configuration information from received RAs ():
This allows hosts to deprecate stale network configuration information, even in the absence of explicit signaling.
This document formally updates to modify the default values of the Preferred Lifetime and the Valid Lifetime of PIOs as follows:
AdvPreferredLifetime: ND_DEFAULT_PREFERRED_LIFETIME
AdvValidLifetime: ND_DEFAULT_VALID_LIFETIME
This document formally updates to specify the default Route Lifetime of Route Information Options (RIOs) as follows:
Route Lifetime: Default: ND_DEFAULT_PREFERRED_LIFETIME
This document formally updates to modify the default Lifetime of Recursive DNS Server Options as:
Lifetime: Default: ND_DEFAULT_PREFERRED_LIFETIME
Additionally, this document formally updates to modify the default Lifetime of DNS Search List Options as:
Lifetime: Default: ND_DEFAULT_PREFERRED_LIFETIME
This document defines the following variables to be employed for the default lifetimes of Neighbor Discovery options (please see above):
ND_DEFAULT_PREFERRED_LIFETIME: max(AdvDefaultLifetime, 3 * MaxRtrAdvInterval)
ND_DEFAULT_VALID_LIFETIME: 2 * ND_DEFAULT_PREFERRED_LIFETIME
where:
AdvDefaultLifetime: Router configuration variable specified in , which specifies the value to be placed in the Router Lifetime field of Router Advertisements sent from the interface, in seconds.
MaxRtrAdvInterval: Router configuration variable specified in , which specifies the maximum time allowed between sending unsolicited multicast Router Advertisements from the interface, in seconds.
max(): A function that computes the maximum of its arguments.
The expression above computes the maximum of AdvDefaultLifetime and "3 * MaxRtrAdvInterval" (the default value of AdvDefaultLifetime, as per ) to accommodate the case where an operator might simply want to disable one local router for maintenance, while still having the router advertise SLAAC configuration information.
specifies the default value of MaxRtrAdvInterval as 600 seconds, and the default value of AdvDefaultLifetime as 3 * MaxRtrAdvInterval. Therefore, when employing default values for MaxRtrAdvInterval and AdvDefaultLifetime, the default values of ND_DEFAULT_PREFERRED_LIFETIME and ND_DEFAULT_VALID_LIFETIME become 1800 seconds (30 minutes) and 3600 seconds (1 hour), respectively. We note that when implementing BCP202 , AdvDefaultLifetime will typically be in the range of 45-90 minutes, and therefore the value of ND_DEFAULT_PREFERRED_LIFETIME will be in the range of 45-90 minutes, while the value of ND_DEFAULT_VALID_LIFETIME will be in the range of 90-180 minutes.
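The computation above can be illustrated with a short sketch. The function name and its keyword arguments are chosen here for illustration only; the formulas and the 600-second default for MaxRtrAdvInterval are the ones stated in this section.

```python
def nd_default_lifetimes(max_rtr_adv_interval=600, adv_default_lifetime=None):
    """Return (ND_DEFAULT_PREFERRED_LIFETIME, ND_DEFAULT_VALID_LIFETIME), in seconds.

    Hypothetical helper: the name and signature are not part of any specification.
    """
    if adv_default_lifetime is None:
        # Default value of AdvDefaultLifetime: 3 * MaxRtrAdvInterval
        adv_default_lifetime = 3 * max_rtr_adv_interval
    preferred = max(adv_default_lifetime, 3 * max_rtr_adv_interval)
    valid = 2 * preferred
    return preferred, valid


# All defaults: 30 minutes preferred, 1 hour valid.
print(nd_default_lifetimes())                            # (1800, 3600)
# Router Lifetime set to 0 (router not used as a default router):
# the max() keeps the option lifetimes at their usual values.
print(nd_default_lifetimes(adv_default_lifetime=0))      # (1800, 3600)
# AdvDefaultLifetime of 45 minutes, the low end of the range noted above.
print(nd_default_lifetimes(adv_default_lifetime=2700))   # (2700, 5400)
```

The second call shows why max() is used: setting AdvDefaultLifetime to 0 to take a router out of service as a default router does not collapse the lifetimes of the SLAAC information it still advertises.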
The entire item "e)" (pp. 19-20) from Section 5.5.3 of is replaced with the following text:
e) If the advertised prefix is equal to the prefix of an address
configured by stateless autoconfiguration in the list, the
valid lifetime and the preferred lifetime of the address should
be updated by processing the Valid Lifetime and the Preferred
Lifetime (respectively) in the received advertisement.
This change allows hosts to react to the signal provided by a router that has positive knowledge that a prefix has become invalid.
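The effect of this change on a host can be sketched by comparing the two behaviours side by side. The first function paraphrases the two-hour rule of the item "e)" being replaced, as summarized earlier in this document, and omits its exception for authenticated RAs; the second reflects the replacement text. Function names are illustrative only.

```python
TWO_HOURS = 7200  # seconds


def valid_lifetime_before_update(remaining, advertised):
    """Valid lifetime of an address under the two-hour rule being replaced.

    `remaining` is the RemainingLifetime of the address and `advertised`
    is the Valid Lifetime in the received PIO, both in seconds.
    """
    if advertised > TWO_HOURS or advertised > remaining:
        return advertised          # lifetimes may always be extended
    if remaining <= TWO_HOURS:
        return remaining           # the advertised Valid Lifetime is ignored
    return TWO_HOURS               # never reduced below two hours


def valid_lifetime_after_update(remaining, advertised):
    """Valid lifetime of an address under the replacement text."""
    return advertised              # the advertised value is simply honored


# A router signals that a prefix with 30 days remaining is now invalid.
print(valid_lifetime_before_update(2592000, 0))  # 7200
print(valid_lifetime_after_update(2592000, 0))   # 0
```

In the first case the stale address lingers for two more hours despite the explicit signal; in the second it is removed immediately.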
The behavior described in had been incorporated during
the revision of the original IPv6 Stateless Address
Autoconfiguration specification (). At the time, the
IPNG working group decided to mitigate the attack vector
represented by Prefix Information Options with very short
lifetimes, on the premise that these packets represented a bigger
risk than other ND-based attack vectors .
While reconsidering the trade-offs represented by such a
decision, we conclude that the drawbacks of the aforementioned mitigation outweigh the possible benefits.
In scenarios where RA-based attacks are of concern, proper
mitigations such as RA-Guard or SEND should be
implemented.
When an interface is initialized, it is paramount that network configuration information is spread on the corresponding network (particularly in scenarios where an interface has been re-initialized, and the conveyed information has changed). Thus, this document replaces the following text from Section 6.2.4 of :
In such cases, the router MAY transmit up to
MAX_INITIAL_RTR_ADVERTISEMENTS unsolicited advertisements, using the
same rules as when an interface becomes an advertising interface.
with:
In such cases, the router SHOULD transmit
MAX_INITIAL_RTR_ADVERTISEMENTS unsolicited advertisements, using the
same rules as when an interface becomes an advertising interface.
Use of stale information can lead to interoperability problems. Therefore, it is important that new configuration information propagates in a timelier manner to all hosts.
specifies recommendations for CPE routers to signal any stale network configuration information.
Intentionally omitting information in Router Advertisements may prevent the propagation of such information, and may represent a challenge for hosts that need to infer whether they have received a complete set of SLAAC configuration information. As a result, this section recommends that, to the extent that is possible, RA messages contain a complete set of SLAAC information.
This document replaces the following text from Section 6.2.3 of :
A router MAY choose not to include some or all options when sending
unsolicited Router Advertisements. For example, if prefix lifetimes
are much longer than AdvDefaultLifetime, including them every few
advertisements may be sufficient. However, when responding to a
Router Solicitation or while sending the first few initial
unsolicited advertisements, a router SHOULD include all options so
that all information (e.g., prefixes) is propagated quickly during
system initialization.
If including all options causes the size of an advertisement to
exceed the link MTU, multiple advertisements can be sent, each
containing a subset of the options.
with:
When sending Router Advertisements, a router SHOULD include all
options.
If including all options would cause the size of an advertisement
to exceed the link MTU, multiple advertisements can be sent, each
containing a subset of the options. In all cases, routers SHOULD
convey all information using the smallest possible number of
packets, and convey options of the same type in the same
packet to the extent possible.
Sending information in the smallest possible number of packets was somewhat already implied by the original text in . Including all options when sending RAs leads to simpler code (as opposed to dealing with special cases where specific information is intentionally omitted), and also helps hosts infer when they have received a complete set of SLAAC configuration information. Note that while allowed some RAs to omit some options, to the best of the authors' knowledge, all SLAAC router implementations always send all options in the smallest possible number of packets. Therefore, this section simply aligns the protocol specifications with existing implementation practice.
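One way a router might approximate the recommendation above is a simple greedy packer that keeps options of the same type together and opens a new RA only when needed. This is a hedged sketch, not a description of any existing implementation: the function name, the (type, size) representation, and the byte budget are assumptions made for illustration, and a greedy pass is not guaranteed to produce the minimum number of packets in every case.

```python
from itertools import groupby


def pack_options(options, budget):
    """Distribute RA options over as few RAs as a greedy pass allows.

    `options` is a list of (option_type, size_in_bytes) tuples and `budget`
    is the number of bytes available for options in a single RA (both are
    illustrative inputs). Returns a list of RAs, each a list of options.
    """
    packets = [[]]
    free = budget
    ordered = sorted(options, key=lambda opt: opt[0])  # bring same-type options together
    for _, group in groupby(ordered, key=lambda opt: opt[0]):
        group = list(group)
        group_size = sum(size for _, size in group)
        # If the whole group fits in an empty RA but not in the current one,
        # start a new RA so that options of this type stay in the same packet.
        if free < group_size <= budget and packets[-1]:
            packets.append([])
            free = budget
        for opt in group:
            if opt[1] > budget:
                raise ValueError("option larger than the per-RA budget")
            if opt[1] > free:
                packets.append([])
                free = budget
            packets[-1].append(opt)
            free -= opt[1]
    return packets


# Illustrative sizes only; a deliberately tiny budget forces a split.
opts = [("PIO", 32), ("RDNSS", 24), ("PIO", 32), ("RIO", 24)]
for ra in pack_options(opts, budget=64):
    print(ra)
```

With a realistic budget derived from the link MTU, all four options above would fit in a single RA, which is the common case this section describes.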
This document has no actions for IANA.
[NOTE: This section is to be removed by the RFC-Editor before this document is published as an RFC.]
This section summarizes the implementation status of the updates proposed in this document. In some cases, they correspond to variants of the mitigations proposed in this document (e.g., use of reduced default lifetimes for PIOs, albeit using different values than those recommended in this document). In such cases, we believe these implementations signal the intent to deal with the problems described in while lacking any guidance on the best possible approach to do it.
We have produced a patch for OpenBSD's rad(8) that employs the default lifetimes recommended in this document, although it has not yet been committed to the tree. The patch is available at: <https://www.gont.com.ar/code/fgont-patch-rad-pio-lifetimes.txt>.
The radvd(8) daemon , normally employed by Linux-based router implementations, currently employs different default lifetimes than those recommended in . radvd(8) employs the following default values :
Preferred Lifetime: 14400 seconds (4 hours)
Valid Lifetime: 86400 seconds (1 day)
These values do not follow the specific recommendation in this document, but they already deviate from the current standards.
A Linux kernel implementation of this document has been committed to the net-next tree. The implementation was produced in April 2020 by Fernando Gont <fgont@si6networks.com>. The corresponding patch can be found at: <https://patchwork.ozlabs.org/project/netdev/patch/20200419122457.GA971@archlinux-current.localdomain/>
NetworkManager processes RA messages with a Valid Lifetime smaller than two hours as recommended in this document.
We know of no implementation that splits network configuration information into multiple RA messages.
The dhcpcd(8) daemon , a user-space SLAAC implementation employed by some Linux-based and BSD-derived operating systems, will set the Preferred Lifetime of addresses corresponding to a given prefix to 0 when a single RA from the router that previously advertised the prefix fails to advertise the corresponding prefix. However, it does not affect the corresponding Valid Lifetime. Therefore, it can be considered a partial implementation of this feature.
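The behaviour described above can be sketched as follows. This is not dhcpcd's code; it is a minimal illustration of the described heuristic, and the function name and the dictionary-based data model are assumptions.

```python
def process_ra(known, router, advertised):
    """Apply one received RA to the host's per-router prefix state.

    `known` maps (router, prefix) to {"preferred": int, "valid": int};
    `advertised` maps each prefix carried in this RA to a
    (preferred_lifetime, valid_lifetime) tuple. Hypothetical data model.
    """
    # Prefixes previously learned from this router but missing from this RA
    # are deprecated; their Valid Lifetime is left untouched.
    for (rtr, prefix), lifetimes in known.items():
        if rtr == router and prefix not in advertised:
            lifetimes["preferred"] = 0
    # Prefixes present in the RA are (re)learned with the advertised lifetimes.
    for prefix, (preferred, valid) in advertised.items():
        known[(router, prefix)] = {"preferred": preferred, "valid": valid}


state = {}
process_ra(state, "fe80::1", {"2001:db8:1::/64": (1800, 3600)})
# After a flash renumbering, the same router advertises only a new prefix.
process_ra(state, "fe80::1", {"2001:db8:2::/64": (1800, 3600)})
print(state[("fe80::1", "2001:db8:1::/64")])  # {'preferred': 0, 'valid': 3600}
```

The old prefix is no longer used for new communications, but the addresses remain configured until the Valid Lifetime expires, which is why this is described as a partial implementation.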
is a Customer Edge Router that tries to deprecate stale prefixes by advertising stale prefixes with a Preferred Lifetime of 0, and a Valid Lifetime of 2 hours (or less). There are two things to note with respect to this implementation:
Rather than recording prefixes on stable storage (as recommended in ), this implementation checks the source address of IPv6 packets, and assumes that usage of any address that does not correspond to a prefix currently-advertised by the Customer Edge Router is the result of stale network configuration information. Hence, upon receipt of a packet that employs a source address that does not correspond to a currently-advertised prefix, this implementation will start advertising the corresponding prefix with small lifetimes, with the intent of deprecating it.
Possibly as a result of item "e)" (pp. 19-20) from Section 5.5.3 of (discussed in of this document), upon first occurrence of a stale prefix, this implementation will employ a decreasing Valid Lifetime, starting from 2 hours (7200 seconds), as opposed to a Valid Lifetime of 0.
The protocol update in could allow an on-link
attacker to perform a Denial of Service attack against local hosts,
by sending a forged RA with a PIO with a Valid Lifetime of 0.
Upon receipt of that packet, local hosts would invalidate the
corresponding prefix, and therefore remove any addresses
configured for that prefix, possibly terminating e.g. associated TCP
connections. However, an attacker may
achieve similar effects via a number of other Neighbor Discovery (ND) attack
vectors, such as directing traffic to a non-existing node until
ongoing TCP connections time out,
or performing an ND-based man-in-the-middle (MITM) attack and
subsequently forging TCP RST segments to cause on-going TCP
connections to be reset. Thus, for all practical purposes,
this attack vector does not really represent any greater risk
than other ND attack vectors. As noted in , in scenarios where RA-based attacks
are of concern, proper
mitigations such as RA-Guard or SEND should be
implemented.
The authors would like to thank (in alphabetical order) Mikael Abrahamsson, Tore Anderson, Luis Balbinot, Brian Carpenter, Lorenzo Colitti, Owen DeLong, Gert Doering, Thomas Haller, Nick Hilliard, Bob Hinden, Philip Homburg, Lee Howard, Christian Huitema, Tatuya Jinmei, Erik Kline, Ted Lemon, Jen Linkova, Albert Manfredi, Roy Marples, Florian Obser, Jordi Palet Martinez, Michael Richardson, Hiroki Sato, Mark Smith, Hannes Frederic Sowa, Dave Thaler, Tarko Tikan, Ole Troan, Eduard Vasilenko, and Loganaden Velvindron, for providing valuable comments on earlier versions of this document.
The algorithm specified in is the result of mailing-list discussions over previous versions of this document with Philip Homburg.
Fernando would like to thank Alejandro D'Egidio and Sander Steffann for a discussion of these issues, which led to the publication of , and eventually to this document.
Fernando would also like to thank Brian Carpenter who, over the years, has answered many questions and provided valuable comments that have benefited his protocol-related work.
IPNG working group (ipngwg) Meeting Minutes
IETF
dhcpcd - a DHCP client
Linux IPv6 Router Advertisement Daemon (radvd)
NetworkManager web site
systemd web site
OpenBSD Router Advertisement Daemon - rad(8)
OpenBSD SLAAC Daemon - slaacd(8)
radvd.conf - configuration file of the router advertisement daemon
Quiz: Weird IPv6 Traffic on the Local Network (updated with solution)
SI6 Networks
Segurola y Habana 4310, 7mo Piso
Villa Devoto
Ciudad Autonoma de Buenos Aires
Argentina
+54 11 4650 8472
fgont@si6networks.com
https://www.si6networks.com
[This section is to be removed before publication of this document as an RFC].
During the discussion of this document, some alternative workarounds were suggested on the 6man mailing-list. The following subsections analyze these suggested workarounds, in the hopes of avoiding rehashing the same discussions.
It has been suggested that if configured addresses become stale, a CPE enforcing ingress/egress filtering (BCP38) () could send ICMPv6 Type 1 (Destination Unreachable) Code 5 (Source address failed ingress/egress policy) error messages to the sending node, and that, upon receipt of such error messages, the sending node could perform heuristics that might help to mitigate the problem discussed in this document.
The aforementioned proposal has a number of drawbacks and limitations:
It assumes that the CPE routers enforce ingress/egress filtering . While this is desirable behaviour, it cannot be relied upon.
It assumes that if the CPE enforces ingress/egress filtering, the CPE will signal the packet drops to the sending node with ICMPv6 Type 1 (Destination Unreachable) Code 5 (Source address failed ingress/egress policy) error messages. While this may be desirable, does not suggest signaling the packet drops with ICMPv6 error messages, let alone the use of specific error messages (such as Type 1 Code 5) as suggested.
ICMPv6 Type 1 Code 5 could be interpreted as the employed address being stale, but also as a selected route being inappropriate/suboptimal. If the latter, deprecating addresses or invalidating addresses upon receipt of these error messages would be inappropriate.
Reacting to these error messages would create a new attack vector that could be exploited from remote networks. This is of particular concern since ICMP-based attacks do not even require that the Source Address of the attack packets be spoofed .
specifies source address selection (SAS) for IPv6. Conceptually, it sorts the candidate set of source addresses for a given destination, based on a number of pair-wise comparison rules that must be successively applied until there is a "winning" address.
An implementation might improve source address selection, and prefer the most-recently advertised information. In order to incorporate the "freshness" of information in source address selection, an implementation would be updated as follows:
The node is assumed to maintain a timer/counter that is updated at least once per second. For example, the time(2) function from Unix-like systems could be employed for this purpose.
The local information associated with each prefix advertised via RAs on the local network is augmented with a "LastAdvertised" timestamp value. Whenever an RA with a PIO with the "A" bit set for such prefix is received, the "LastAdvertised" timestamp is updated with the current value of the timer/counter.
is updated such that this rule is incorporated:
If one of the two source addresses corresponds to a prefix that has been more recently advertised, say LastAdvertised(SA) > LastAdvertised(SB), then prefer that address (SA in our case).
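The steps above could be sketched as follows. The class and function names are hypothetical, and the comparison function covers only the additional "freshness" rule, not the rest of the source address selection algorithm.

```python
import time
from dataclasses import dataclass


@dataclass
class PrefixState:
    """Per-prefix state augmented with a "LastAdvertised" timestamp."""
    prefix: str
    last_advertised: float = 0.0


def on_pio_received(state, a_flag, now=None):
    """Refresh the timestamp when a PIO with the "A" bit set is received."""
    if a_flag:
        state.last_advertised = time.time() if now is None else now


def prefer_fresher(first, second):
    """Return the candidate whose prefix was advertised more recently.

    Returns None on a tie, in which case the remaining selection rules
    would decide.
    """
    if first.last_advertised > second.last_advertised:
        return first
    if second.last_advertised > first.last_advertised:
        return second
    return None


old = PrefixState("2001:db8:1::/64")
new = PrefixState("2001:db8:2::/64")
on_pio_received(old, a_flag=True, now=1000.0)
on_pio_received(new, a_flag=True, now=1600.0)
print(prefer_fresher(old, new).prefix)  # 2001:db8:2::/64
```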
A clear benefit of this approach is that a host will normally prefer "fresh" addresses over possibly stale addresses.
However, there are a number of drawbacks associated with this approach:
In scenarios where multiple prefixes are being advertised on the same LAN segment, the new SAS rule is *guaranteed* to result in non-deterministic behaviour, with hosts frequently changing the default source address. This is certainly not desirable from a troubleshooting perspective.
Since the rule must be incorporated before "Rule 8: Use longest matching prefix" from , it may lead to suboptimal paths.
This new rule may help to improve the selection of a source address, but it does not help with the housekeeping (garbage collection) of configured information:
If the stale prefix is re-used in another network, nodes employing stale addresses and routes for this prefix will be unable to communicate with the new "owner" of the prefix, since the stale prefix will most likely be considered "on-link".
Given that the currently recommended default value for the "Valid Lifetime" of PIOs is 2592000 seconds (30 days), it would take too long for hosts to remove the configured addresses and routes for the stale prefix. While the proposed update in of this document would mitigate this problem, the lifetimes advertised by the local SLAAC router are not under the control of hosts.
As a result, updating IPv6 source address selection does not relieve nodes from improving their SLAAC implementations as specified in , if at all desirable. On the other hand, the algorithm specified in would result in Rule 3 of employing fresh addresses, without leading to non-deterministic behaviour.