A Framework for Enhanced Virtual Private Network (VPN+)

Huawei, jie.dong@huawei.com
University of Surrey, stewart.bryant@gmail.com
China Mobile, lizhenqiang@chinamobile.com
KDDI Corporation, ta-miyasaka@kddi.com
Samsung, younglee.tx@gmail.com

TEAS Working Group

This document describes the framework for Enhanced Virtual Private
Network (VPN+). The purpose of VPN+ is to support the needs of new
applications (e.g., low latency and bounded jitter) by utilizing an
approach that is based on the VPN and Traffic Engineering (TE)
technologies, and adds characteristics that specific services require
beyond those provided by existing VPNs. Typically, VPN+ will be used to
underpin network slicing, but could also be of use in its own right
providing enhanced connectivity services between customer sites. This
document also provides an overview of relevant technologies in different
network layers, and identifies some areas for potential new work.

Virtual Private Networks (VPNs) have served the industry well as a
means of providing different groups of users with logically isolated
connectivity over a common network. The common (base) network that is
used to provide the VPNs is often referred to as the underlay, and the
VPN is often called an overlay.

Customers of a network operator may request connectivity services
with advanced characteristics such as low latency guarantees, bounded
jitter, or isolation from other services or customers so that changes in
some other services (e.g., changes in network load, or events such as
congestion or outages) have no effect, or only an acceptable effect, on the
throughput or latency of the services provided to the customer. These
services are referred to as "enhanced VPNs" (known as VPN+) in that they
are similar to VPN services providing the customer with the required
connectivity, but in addition they have enhanced characteristics.

The concept of network slicing has gained traction driven largely by
needs surfacing from 5G. A 5G end-to-end network slice consists of three
major types of network segments: Radio Access Network (RAN), Transport
Network (TN), and Mobile Core Network (CN). The transport network
provides the connectivity between different entities in RAN and CN
segments of a 5G end-to-end network slice, with specific performance
commitment. The IETF has defined the terminology and characteristics of
IETF Network Slices, together with the general framework, components,
and interfaces for requesting and operating them. An IETF Network Slice
Service enables connectivity between a set of Service Demarcation Points
(SDPs) with specific Service Level Objectives (SLOs) and Service Level
Expectations (SLEs) over a common underlay network. An IETF Network
Slice can be realized as a logical network connecting a number of
endpoints and is associated with a set of shared or dedicated network
resources that are used to satisfy the SLO and SLE requirements. In this document
(which is solely about IETF technologies) we refer to an "IETF Network
Slice" simply as a "network slice": a network slice is considered as one
typical use case of VPN+.

A network slice may involve multiple technologies (e.g., IP or
Optical), and may span multiple administrative domains. Depending on the
customer's requirements, a network slice could be isolated from other
network slices in terms of data plane, control plane, and management
plane resources.

Network slicing can build on the concepts of resource management,
network virtualization, and abstraction to provide performance
assurance, flexibility, programmability, and modularity. It may use
techniques such as Software Defined Networking (SDN), network abstraction, and
Network Function Virtualization (NFV) to create multiple logical (virtual) networks, each
tailored for use by a set of services or by a particular tenant or a
group of tenants that share the same or similar requirements. These
logical networks are created on top of a common underlay network. How
the network slices are engineered is deployment-specific.

The requirements of VPN+ services cannot simply be met by overlay
networks, as these services require tighter coordination and integration
between the overlay and the underlay networks.

In the overlay network, VPN has been defined as the network construct
to provide the required connectivity for different services or
customers. In the underlay network, this document introduces the concept
of a Virtual Transport Network (VTN). A VTN is a virtual underlay network
that is associated with a network topology and is allocated a set
of dedicated or shared resources from the physical underlay network.

A VPN+ service is realized by integrating a VPN in the overlay and a
VTN in the underlay. With this, a VPN+ service can provide enhanced
properties such as guaranteed resources and assured or predictable
performance. A VPN+ service may also include a set of service functions.
These enhanced properties have general applicability, and are also of
interest as part of a network slicing solution. Hence VPN+ techniques
can be used to instantiate a network slice service, and they can also be
of use in general cases to provide enhanced connectivity services
between customer sites or service end points. The concept of a
Network Resource Partition (NRP) has been introduced as a collection of resources in
the underlay network that can reliably support specific IETF Network
Slice SLAs. An NRP can be associated with a network topology to select
or specify the set of links and nodes involved. VTN and NRP are
considered as similar concepts, and NRP can be seen as an instantiation
of VTN in the context of network slicing.

It is not envisaged that VPN+ services will replace VPN services. VPN
services will continue to be delivered using existing mechanisms and can
co-exist with VPN+ services.

This document describes a framework for using existing, modified, and
potential new technologies as components to provide VPN+ services.
Specifically, this document provides:

*  The functional requirements and service characteristics of a VPN+
   service.
*  The design of the data plane for VPN+.
*  The necessary control and management protocols in both the
   underlay and the overlay of VPN+.
*  The mechanisms to achieve integration between overlay and
   underlay.
*  The necessary Operations, Administration, and Maintenance (OAM)
   methods to instrument a VPN+ to make sure that the required Service
   Level Agreement (SLA) between the customer and the network operator
   is met, and to take any corrective action (such as switching traffic
   to an alternate path) to avoid SLA violation.

The required layered network structure to achieve these objectives is
described later in this document.

In this document, the relationship between the four terms "VPN", "VPN+",
"VTN", and "Network Slice" is as follows:

*  A Virtual Private Network (VPN) refers to the overlay network
   service that provides connectivity between different customer sites,
   and that maintains traffic separation between different customers.
   The typical VPN technologies are IPVPN, L2VPN, L3VPN, and EVPN.
*  An enhanced VPN (VPN+) service is an evolution of the VPN service
   that makes additional service-specific commitments. An enhanced VPN
   is made by integrating a VPN with a set of network resources
   allocated in the underlay network.
*  A Virtual Transport Network (VTN) is a virtual underlay network
   that is associated with a logical network topology and is
   allocated a set of dedicated or shared network resources from
   the physical underlay network. A VTN has the capability of
   delivering the network resources and performance characteristics
   required by the VPN+ customers.
*  A network slice service could be delivered by provisioning a VPN+
   service in the network. Other mechanisms for delivering network
   slices may exist but are not in scope for this document.

The term "tenant" is used in this document to refer to the customers
and all of their associated VPN+ services.

The following terms are also used in this document. Some of them are
newly defined; others reference existing definitions.

SLA:     Service Level Agreement.
SLO:     Service Level Objective.
SLE:     Service Level Expectation.
ACTN:    Abstraction and Control of Traffic Engineered Networks.
DetNet:  Deterministic Networking.
FlexE:   Flexible Ethernet.
TSN:     Time Sensitive Networking.
VN:      Virtual Network.
VTP:     Virtual Transport Path. A VTP is a path through
         the VTN which provides the required connectivity and performance
         between two or more customer sites.

This section provides an overview of the requirements of a VPN+
service.

Performance guarantees are made by network operators to their
customers in relation to the services provided to the customers. They
are usually expressed in SLAs as a set of SLOs.

There are several kinds of performance guarantee, including
guaranteed maximum packet loss, guaranteed maximum delay, and
guaranteed delay variation. Note that these guarantees apply to
conforming traffic; out-of-profile traffic will be handled according
to a separate agreement with the customer (see, for example, section
3.6 of ).

Guaranteed maximum packet loss is usually addressed by setting
packet priorities, queue size, and discard policy. However, this
becomes more difficult when the requirement is combined with latency
requirements. The limiting case is zero congestion loss, and that is
the goal of DetNet and TSN. In modern optical networks, loss due to transmission
errors already approaches zero, but there is the possibility of
failure of the interface or the fiber itself. This type of fault can
be addressed by some form of signal duplication and transmission over
diverse paths.

Guaranteed maximum latency is required by a number of applications,
particularly real-time control applications and some types of virtual
reality applications. DetNet is relevant;
however, additional methods of enhancing the underlay to better support
the delay guarantees may be needed, and these methods will need to be
integrated with the overall service provisioning mechanisms.

Guaranteed maximum delay variation is a performance guarantee that
may also be needed. A number of use cases need this guarantee, for
example in electrical utilities.
Time transfer is an example service that needs a performance
guarantee, although it is in the nature of time that the service might
be delivered by the underlay as a shared service and not provided
through different VPN+s. Alternatively, a dedicated VPN+ might be used
to provide this as a shared service.

This suggests that a spectrum of service guarantees needs to be
considered when deploying a VPN+. As a guide to understanding the
design requirements we can consider four types of service:

*  Best effort
*  Assured bandwidth
*  Guaranteed latency
*  Enhanced delivery

The best effort service is the basic connectivity service that can
be provided by current VPNs.

An assured bandwidth service is a connectivity service in which the
bandwidth over some period of time is assured. This could be achieved
either simply based on a best effort service with over-capacity
provisioning, or it can be based on MPLS traffic-engineered label
switched paths (TE-LSPs) with bandwidth reservations. Depending on
the technique used, however, the bandwidth is not necessarily assured
at any instant. Providing assured bandwidth to VPNs, for example by
using per-VPN TE-LSPs, is not widely deployed at least partially due
to scalability concerns. The more common approach of aggregating
multiple VPNs onto common TE-LSPs results in shared bandwidth and so
may reduce the assurance of bandwidth to any one service. VPN+ aims to
provide a more scalable approach for such services.

A guaranteed latency service has an upper bound to edge-to-edge
latency. Assuring the upper bound is sometimes more important than
minimizing latency. There are several new technologies that provide
some assistance with this performance guarantee. Firstly, the IEEE TSN
project introduces the concept of scheduling of delay- and
loss-sensitive packets. FlexE is also useful to help provide a
guaranteed upper bound to latency. DetNet is also of relevance in
assuring an upper bound of end-to-end packet latency at the network
layer. The use of these technologies to
deliver VPN+ services needs to be considered when a guaranteed latency
service is required.

An enhanced delivery service is a connectivity service in which the
underlay network (at layer 3) needs to eliminate packet loss
in the event of equipment or media failures. This may be achieved by
delivering a copy of the packet through multiple paths. Such a
mechanism may need to be used for VPN+ service.

One element of the SLA demanded for VPN+ service may be a guarantee
that the service offered to the customer will not be affected by any
other traffic flows in the network. This is termed "isolation" in
section 3.8 of , and a customer may express
the requirement for isolation as an SLE for network slice service.

One way for a network operator to meet the requirement for
isolation is by setting and conforming to all the SLOs. For example,
traffic congestion (interference from other services) might impact on
the latency experienced by a VPN+ customer. Thus, in this example,
conformance to a latency SLO would be the primary requirement for
delivery of the VPN+ service, and isolation from other services might
be only a means to that end.

Another way for a service provider to meet the isolation
requirement is to control the degree to which traffic from one service
is isolated from other services in the network.

There is a fine distinction between how isolation is requested by a
customer and how it is delivered by the service provider. In general,
the customer is interested in service performance and not how it is
delivered. Thus, for example, the customer wants specific quality
guarantees and is not concerned about how the service provider
delivers them. However, it should be noted that some aspects of
isolation might be directly measurable by a customer if they have
information about the traffic patterns on a number of services supported
by the same service provider. Furthermore, a customer may be nervous
about disruption caused by other services, contamination by other
traffic, or delivery of their traffic to the wrong destinations. In
this way, the customer may want to specify (and pay for) the level of
isolation provided by the service provider.

Isolation is achieved in the realization of a VPN+ through existing
technologies that may be supplemented by new mechanisms. The service
provider chooses which processes to use to meet this SLE just as they
choose how to meet all other SLOs and SLEs. Isolation may be achieved
in the network by various forms of resource partitioning ranging from
simple separation of service traffic on delivery (ensuring that
traffic is not delivered to the wrong customer), through sharing of
resources with some form of safeguards, to dedicated allocation of
resources for a specific VPN+ service. For example, interference
avoidance may be achieved by network capacity planning, allocating
dedicated network resources, traffic policing or shaping, or prioritized
use of shared network resources.

The terms hard and soft isolation are used to indicate different
levels of isolation within the underlay network. A service has soft
isolation if the traffic of one service cannot be received by the
customers of another service. The existing IP and MPLS VPNs are
examples of services with soft isolation: the network delivers the
traffic only to the required customer endpoints. However, with soft
isolation, as the network resources are shared, traffic from some
services may congest the network, resulting in packet loss and delay
for other services. The ability for a service or a group of services
to be sheltered from this effect is called hard isolation. Hard
isolation may be needed so that applications with exacting
requirements can function correctly, despite other demands (perhaps a
burst of traffic in another service) competing for the underlying
resources. A customer may request different degrees of isolation
ranging from soft isolation to hard isolation. In practice isolation
may be delivered on a spectrum between soft and hard, and in some
cases soft and hard isolation may be used in a hierarchical manner
with one VPN+ service being built on another.

To provide the required level of isolation, resources may need to
be reserved in the data plane of the underlay network and dedicated to
traffic from a specific VPN+ service or a specific group of VPN+
services. This may introduce scalability concerns both in the
implementation (as each VPN+ would need to be tracked in the network)
and in the amount of resources that need to be reserved and that may
be under-used. Thus, some trade-off needs to be
considered to provide the isolation between VPN+ services while still
allowing reasonable resource sharing.

An optical underlay can offer a high degree of isolation, at the
cost of allocating resources on a long-term and end-to-end basis. On
the other hand, where adequate isolation can be achieved at the packet
layer, this permits the resources to be shared amongst a group of
services and only dedicated to a service on a temporary basis.

The next section explores a pragmatic approach to isolation in
packet networks.

A key question is whether it is possible to achieve hard
isolation in packet networks that were designed to provide
statistical multiplexing through sharing of data plane resources, a
significant economic advantage when compared to a dedicated or a
Time Division Multiplexing (TDM) network. Clearly, there is no need
for a customer to request more isolation than their applications
require, and no need for a service provider to provide more
isolation than requested by their customer. An approximation to full
hard isolation is sufficient in most cases when hard isolation is
requested. For example, pseudowires emulate
services that would have had hard isolation in their native
form.

The isolation that may be delivered by a network forms a spectrum.
At one end of the spectrum, we see
statistical multiplexing technologies that support current VPNs.
This is a service type that has served the industry well and will
continue to do so. At the opposite end of the spectrum, we have the
absolute isolation provided by dedicated transport networks. The
goal of VPN+ is "pragmatic isolation". This is isolation that is
better than what is obtainable from pure statistical multiplexing,
more cost effective and flexible than a dedicated network, but is a
practical solution that is good enough for the majority of
applications. Mechanisms for both soft isolation and hard isolation
are needed to meet different levels of service requirements.

The way to achieve the characteristics demanded of a VPN+ service
(such as guaranteed or predictable performance) is by integrating the
overlay VPN with a particular set of resources in the underlay network
which are allocated to meet the service requirements. This needs to be
done in a flexible and scalable way so that it can be widely deployed
in operators' networks to support a good number of VPN+ services.

Taking mobile networks and in particular 5G into consideration, the
integration of the network with service functions is likely a
requirement. The IETF's work on service function chaining (SFC)
provides a foundation for this. Service functions can
be considered as part of the VPN+ services. The detailed mechanisms
for the integration between service functions and VPN+ are out of
the scope of this document.

Integration of the overlay VPN and the underlay network resources
and functions does not always need to be a direct mapping.
Abstraction is the process of
applying policy to a set of information about a traffic engineered
(TE) network to produce selective information that represents the
potential ability to connect across the network. The process of
abstraction presents the connectivity graph in a way that is
independent of the underlying network technologies, capabilities,
and topology so that the graph can be used to plan and deliver
network services in a uniform way.

With the approach of abstraction, VPN+ may be built on top of an
abstracted topology that represents the connectivity capabilities of
the underlay TE-based network as described in the framework for
Abstraction and Control of TE Networks (ACTN).

VPN+s need to be created, modified, and removed from the network
according to service demands. A VPN+ that requires hard isolation
must not be disrupted by the
instantiation or modification of another VPN+ service. Determining
whether modification of a VPN+ can be disruptive to that VPN+, and
whether the traffic in flight will be disrupted, can be a difficult
problem.

Dynamic changes both to the VPN+ and to the underlay network need
to be managed to avoid disruption to services that are sensitive to
changes in network performance.

In addition to non-disruptively managing the network during changes
such as the inclusion of a new VPN+ service endpoint or a change to a
link, VPN+ traffic might need to be moved because of changes to
traffic patterns and volumes. This means that during the lifetime of a
VPN+ service, closed-loop optimization is needed so that the delivered
service always matches the ordered service SLA.

The data plane, control plane, and management plane aspects of this
problem are discussed further elsewhere in this document.

In many cases, VPN+ services are delivered to customers
without exposing information about the underlying VTNs. However,
depending on the agreement between the operator and the customer, in
some cases the customer may also be provided with some information
about the underlying VTNs. Such information can be filtered or
aggregated according to the operator's policy. This allows the
customer of the VPN+ to have some visibility and even control over how
the underlying topology and resources of the VTN are used. For
example, the customers may be able to specify the path or path
constraints within the VTN for specific traffic flows of their VPN+
service. Depending on the requirements, a VPN+ customer may have
its own network controller, which may be provided with an
interface to the control or management system run by the network
operator. Note that such control is within the scope of the customer's
VPN+ service; any additional changes beyond this would require some
intervention by the network operator.

The control plane and management plane aspects of this feature are
discussed further elsewhere in this document.

The concept of VPN+ can be applied to any existing and future
multi-tenancy overlay technologies including, but not limited to:

*  Layer-2 point-to-point services such as pseudowires
*  Layer-2 VPNs
*  Ethernet VPNs
*  Layer-3 VPNs

Where such VPN service types need enhanced isolation and delivery
characteristics, the technologies discussed in this document
can be used to adapt the underlay so as to provide the required
enhanced performance.

In some scenarios, a VPN+ service may span multiple network
domains. A domain is considered to be any collection of network
elements under the responsibility of the same administrative entity,
for example, an Autonomous System (AS). In some domains the network
operator may manage a multi-layered network, for example, a packet
network over an optical network. When VPN+ services are provisioned in
such network scenarios, the technologies used in different network
planes (data plane, control plane, and management plane) need to
provide mechanisms to support multi-domain and multi-layer
coordination and integration, so as to provide the required service
characteristics for different VPN+ services, and improve network
efficiency and operational simplicity.

A number of VPN+ services will typically be provided by a common
network infrastructure. Each VPN+ service is provisioned with an overlay
VPN and mapped to a corresponding VTN, which has a specific set of
network resources and functions allocated in the underlay to satisfy the
needs of the customer. One VTN may support one or more VPN+ services.
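
The mapping between VPN+ services and VTNs can be pictured with a minimal Python sketch. It is illustrative only: the record types, field names, identifiers (such as "vtn-default"), and the first-fit selection rule are all assumptions made for this example and are not defined by this document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Vtn:
    """Hypothetical record for a Virtual Transport Network."""
    vtn_id: str
    max_latency_ms: float      # latency bound the VTN is engineered for
    dedicated_resources: bool  # True if resources are not shared with other VTNs


@dataclass(frozen=True)
class VpnPlusService:
    """Hypothetical record for a VPN+ service request."""
    name: str
    max_latency_ms: float
    needs_dedicated_resources: bool


def select_vtn(service, vtns):
    """Return the first VTN that satisfies the service, or None (first fit)."""
    for vtn in vtns:
        latency_ok = vtn.max_latency_ms <= service.max_latency_ms
        isolation_ok = vtn.dedicated_resources or not service.needs_dedicated_resources
        if latency_ok and isolation_ok:
            return vtn
    return None


def map_services(services, vtns):
    """Group VPN+ service names by the identifier of the VTN they map to."""
    mapping = defaultdict(list)
    for service in services:
        vtn = select_vtn(service, vtns)
        if vtn is None:
            raise ValueError(f"no VTN satisfies {service.name}")
        mapping[vtn.vtn_id].append(service.name)
    return dict(mapping)


# VTNs are listed from least to most constrained, so first fit picks the
# least demanding VTN that still meets the service requirements.
vtns = [
    Vtn("vtn-default", max_latency_ms=50.0, dedicated_resources=False),
    Vtn("vtn-low-latency", max_latency_ms=5.0, dedicated_resources=True),
]
services = [
    VpnPlusService("factory-control", 10.0, True),
    VpnPlusService("video-backhaul", 60.0, False),
    VpnPlusService("office-vpn", 100.0, False),
]
service_map = map_services(services, vtns)
```

In this example three VPN+ services end up on two VTNs, so the underlay would only need to track two sets of resources rather than one per service.
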
The integration between the overlay connectivity and the underlay
resources ensures the required isolation between different VPN+
services, and achieves the guaranteed performance for different
customers.

The VPN+ architecture needs to be designed with consideration given
to:

*  An enhanced data plane.
*  A control plane to create VPN+, making use of the data plane
   isolation and performance guarantee techniques.
*  A management plane for VPN+ service life-cycle management.

These topics are expanded below.

*  The enhanced data plane:

   -  Provides the required packet latency and jitter
      characteristics.
   -  Provides the required packet loss characteristics.
   -  Provides the required resource isolation capability, e.g.,
      bandwidth guarantee.
   -  Provides the mechanism to associate a packet with the set of
      resources allocated to the VTN to which the VPN+ service packet
      is mapped.

*  The control plane:

   -  Collects information about the underlying network topology
      and network resources, and exports this to network nodes and/or
      a centralized controller as required.
   -  Creates VTNs with the network resource and topology
      properties needed by the VPN+ services.
   -  Distributes the attributes of VTNs to network nodes which
      participate in the VTNs and/or a centralized controller.
   -  Computes and sets up network paths in each VTN.
   -  Maps VPN+ services to an appropriate VTN.
   -  Determines the risk of SLA violation and takes appropriate
      avoidance or corrective actions.
   -  Considers the right balance of per-packet and per-node state
      according to the needs of the VPN+ services to scale to the
      required size.

*  The management plane:

   -  Provides an interface between the VPN+ service provider
      (e.g., operator's network management system) and the VPN+
      customer (e.g., an organization or a service with VPN+
      requirements) such that the operation requests and the related
      parameters can be exchanged without the awareness of other VPN+
      customers.
   -  Provides an interface between the VPN+ service provider and
      the VPN+ customers to expose the network capability information
      toward the customer.
   -  Provides the service life-cycle management and operation of
      VPN+ services (e.g., creation, modification,
      assurance/monitoring, and decommissioning).

*  Operations, Administration, and Maintenance (OAM):

   -  Provides the tools to verify the connectivity and monitor the
      performance of the VPN+ service.
   -  Provides the tools to verify whether the underlay network
      resources are correctly allocated and operating properly.

*  Telemetry:

   -  Provides the mechanisms to collect network information about
      the operation of the data plane, control plane, and management
      plane. More specifically, telemetry provides the mechanisms to
      collect network data:

      o  from the underlay network for overall performance
         evaluation and for planning VPN+ services.
      o  from each VPN+ service for monitoring and analytics of
         the characteristics and SLA fulfillment of the VPN+
         services.

The layered architecture of VPN+ is described below.

Underpinning everything is the physical network infrastructure
layer, which provides the underlying resources used to provision the
separated VTNs. This layer is responsible for the partitioning of link
and/or node resources for different VTNs. Each subset of link or node
resource can be considered as a virtual link or virtual node used to
build the VTNs.

Various components and techniques discussed in this document can be
used to enable resource partitioning of the
physical network infrastructure, such as FlexE, TSN, and dedicated
queues. These partitions may be physical or virtual so long as the SLA
required by the higher layers is met.

Based on the set of network resource partitions provided by the
physical network infrastructure, multiple VTNs can be created, each
with a set of dedicated or shared network resources allocated from the
physical underlay network, and each can be associated with a
customized logical network topology, so as to meet the requirements of
different VPN+ services or different groups of VPN+ services.
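
The idea of building VTNs from partitions of link resources can be sketched in Python as follows. This is a simplified illustration and not a mechanism specified by this document: the `LinkPartitioner` class, the link names, and the bandwidth figures are hypothetical, and the sketch models only dedicated bandwidth reservations (shared resources and node resources are not modelled).

```python
class LinkPartitioner:
    """Tracks bandwidth reserved per VTN on each physical link (hypothetical model)."""

    def __init__(self, link_capacity_mbps):
        # link_capacity_mbps maps a link identifier to its capacity in Mbit/s.
        self.capacity = dict(link_capacity_mbps)
        # reserved[link][vtn_id] is the bandwidth dedicated to that VTN on the link.
        self.reserved = {link: {} for link in self.capacity}

    def available(self, link):
        """Return the bandwidth on a link not yet dedicated to any VTN."""
        return self.capacity[link] - sum(self.reserved[link].values())

    def create_vtn(self, vtn_id, topology_links, bandwidth_mbps):
        """Reserve bandwidth for a VTN on every link of its logical topology.

        The reservation is all-or-nothing: if any link lacks capacity,
        nothing is reserved and False is returned.
        """
        if any(self.available(link) < bandwidth_mbps for link in topology_links):
            return False
        for link in topology_links:
            self.reserved[link][vtn_id] = bandwidth_mbps
        return True


partitioner = LinkPartitioner({"A-B": 10_000, "B-C": 10_000, "A-C": 1_000})
ok_1 = partitioner.create_vtn("vtn-1", ["A-B", "B-C"], 6_000)
ok_2 = partitioner.create_vtn("vtn-2", ["A-B", "A-C"], 1_000)
# The third request asks for more than remains on link A-B and is refused.
ok_3 = partitioner.create_vtn("vtn-3", ["A-B", "B-C"], 5_000)
```

Each VTN in the sketch has its own logical topology (a subset of the physical links) and its own share of each link, which mirrors the combination of topology and resources described above.
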
According to the associated logical network topology, each VTN needs
to be instantiated on a set of network nodes and links which are
involved in the logical topology. On each node or link, each VTN
is associated with a set of local resources which are allocated for
the processing of traffic in the VTN. The VTN provides the integration
between the logical network topology and the required underlying
network resources.

According to the service requirements on connectivity, performance,
and isolation, VPN services can be mapped to the appropriate
VTNs in the network. Different VPN services can be mapped to different
VTNs, while it is also possible that multiple VPNs are mapped to the
same VTN. Thus the VTN is an essential scaling technique, as it has
the potential of eliminating per-path state from the network. In
addition, when a group of VPN+ services are mapped to a single VTN,
only the network state of the single VTN needs to be maintained in the
network.

The network controller is responsible for creating a VTN,
instructing the involved network nodes to allocate network resources
to the VTN, and provisioning the VPN services on the VTN. A
distributed control plane may be used for distributing the VTN
resource and topology attributes among nodes in the VTN.

The process used to create VTNs and to allocate network resources
for use by the VTNs needs to take a holistic view of the needs of all
of the service provider's customers and to partition the resources
accordingly. However, within a VTN these resources can, if required,
be managed via a dynamic control plane. This provides the required
scalability and isolation with some flexibility.

At the VPN service level, the required connectivity for an MP2MP
VPN service is usually full or partial mesh. To support such VPN
services, the corresponding VTN also needs to provide MP2MP
connectivity among the end points.

Other service requirements may be expressed at different
granularities, some of which can be applicable to the whole service,
while some others may only be applicable to some pairs of end points.
For example, when a particular level of performance guarantee is
required, the point-to-point path through the underlying VTN of the
VPN+ service may need to be specifically engineered to meet the
required performance guarantee.

Although a lot of the traffic that will be carried over VPN+ will
likely be IP based, the design must be capable of carrying other
traffic types, in particular Ethernet traffic. This is easily
accomplished through the various pseudowire (PW) techniques. Where the
underlay is MPLS, Ethernet traffic can be carried over VPN+ using the
pseudowire encapsulation defined for Ethernet over MPLS. Where the
underlay is IP, Layer Two Tunneling Protocol - Version 3 (L2TPv3) can
be used, with Ethernet traffic carried using the Ethernet encapsulation
defined for L2TPv3. Encapsulations have been defined for most of the
common layer-2 types for both PW over MPLS and for L2TPv3.

VPNs are instantiated as overlays on top of an operator's network
and offered as services to the operator's customers. An important
feature of overlays is that they can deliver services without placing
per-service state in the core of the underlay network.

VPN+ services may need to install some additional state within the network
to achieve the features that they require. Solutions must consider
minimizing and controlling the scale of such state, and deployment
architectures should constrain the number of VPN+ services so that the
additional state introduced to the network is acceptable and under
control. It is expected that the number of VPN+ services will be small
at the beginning, and even in the future the number of VPN+ services
will be fewer than traditional VPNs because existing VPN techniques
are good enough to meet the needs of most existing VPN-type
services.

In general, it is not required that the state in the network be
maintained in a 1:1 relationship with the VPN+ services. It will
usually be possible to aggregate a set or group of VPN+ services so
that they share the same VTN and the same set of network resources
(much in the same way that current VPNs are aggregated over transport
tunnels) so that collections of VPN+ services that require the same
behavior from the network in terms of resource reservation, latency
bounds, resiliency, etc. can be grouped together. This is an important
feature to assist with the scaling characteristics of VPN+
deployments. provides more
details of the scalability considerations for the network resource
partitions used to instantiate VTNs.

A VPN is a virtual network created by applying a demultiplexing
technique to the underlying network (the underlay) to distinguish the
traffic of one VPN from that of another. The connections of a VPN are
supported by a set of underlay paths. A path that travels by other than
the shortest path through the underlay normally requires state to
specify that path. The state of the paths could be applied to the
underlay through the use of the RSVP-TE signaling protocol, or directly
through the use of an SDN controller. Based on Segment Routing, state
could be maintained at the ingress node of the path, and carried in the
data packet. Other techniques may emerge as this problem is studied.
This state gets harder to manage as the number of paths increases.
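As a rough illustration of this growth, the following sketch counts the unidirectional point-to-point paths needed for a full mesh among a set of edge nodes. It assumes, purely for illustration, that every service requiring its own paths gets a separate full mesh; the function name and the node and service counts are hypothetical and are not taken from this document.

```python
def full_mesh_paths(num_edge_nodes: int, num_services: int = 1) -> int:
    """Count unidirectional P2P paths for a full mesh among edge nodes.

    Assumption (illustrative only): each of the `num_services` services
    needs its own full mesh of paths, so nothing is shared between them.
    """
    if num_edge_nodes < 2 or num_services < 1:
        return 0
    # Each ordered pair of distinct edge nodes needs one path.
    return num_edge_nodes * (num_edge_nodes - 1) * num_services


if __name__ == "__main__":
    # Hypothetical network sizes, with 1 and 20 unshared services.
    for nodes in (10, 50, 100):
        print(nodes, full_mesh_paths(nodes), full_mesh_paths(nodes, 20))
```

The count grows with the square of the number of edge nodes and linearly with the number of services that cannot share paths, which is why aggregation onto shared underlay constructs matters for scaling.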
Furthermore, as we increase the coupling between the underlay and the
overlay to support the VPN+ service, this state is likely to increase
further. We cannot, for example, share the paths and network resources
between VPN+ services which require hard isolation.

A VTN can be used to provide a group of virtual underlay paths (VTPs)
with a common set of network resources. Through the use of VTNs, a
subset of underlay network resources can be either dedicated to a
particular VPN+ service or shared among a group of VPN+ services. This
section describes the candidate technologies in different network planes
which can be used to build VTNs.

Several candidate layer-2 packet- or frame-based forwarding plane
mechanisms which can provide the required resource isolation and
performance guarantees are described in the following sections.

FlexE provides the ability to multiplex
channels over an Ethernet link to create point-to-point
fixed-bandwidth connections in a way that provides hard isolation.
FlexE also supports bonding links to create larger links out of
multiple low capacity links.

However, FlexE is only a link level technology. When packets are
received by the downstream node, they need to be processed in a way
that preserves that isolation in the downstream node. This in turn
requires a queuing and forwarding implementation that preserves the
end-to-end isolation.

If different FlexE channels are used for different services, then
no sharing is possible between the FlexE channels. This means that
it may be difficult to dynamically redistribute unused bandwidth to
lower priority services in another FlexE channel. If one FlexE
channel is used by one customer, the customer can use some methods
to manage the relative priority of their own traffic in the FlexE
channel.

DiffServ based queuing systems are described in and . This approach is
not sufficient to provide isolation for VPN+ services because
DiffServ does not provide enough markers to differentiate between
traffic of a large number of VPN+ services. Nor does DiffServ offer
the range of service classes that each VPN+ service needs to provide
to its tenants. This problem is particularly acute with an MPLS
underlay, because MPLS only provides eight traffic classes.

In addition, DiffServ, as currently implemented, mainly provides
per-hop priority-based scheduling, and it is difficult to use it to
achieve quantitative resource reservation for different VPN+
services.

To address these problems and to reduce the potential
interference between VPN+ services, it would be necessary to steer
traffic to dedicated input and output queues per VPN+ service or per
group of VPN+ services: some routers have a large number of queues
and sophisticated queuing systems which could support this, while
some routers may struggle to provide the granularity and level of
isolation required by the applications of VPN+.

Time Sensitive Networking (TSN) is an IEEE
project to provide a method of carrying time sensitive information
over Ethernet. It introduces the concept of packet scheduling where
a packet stream may be given a time slot guaranteeing that it
experiences no queuing delay or increase in latency beyond the very
small scheduling delay. The mechanisms defined in TSN can be used to
meet the requirements of time sensitive traffic flows of VPN+
service.

Ethernet can be emulated over a layer-3 network using an IP or
MPLS pseudowire. However, a TSN Ethernet payload would be opaque to
the underlay and thus not treated specifically as time sensitive
data. The preferred method of carrying TSN over a layer-3 network is
through the use of deterministic networking as explained in .

This section considers the problem of VPN+ service differentiation
and the representation of underlying network resources in the network
layer. More specifically, it describes the possible data plane
mechanisms to determine the network resources and the logical network
topology or paths associated with a VTN.

Deterministic Networking (DetNet) is a
technique being developed in the IETF to enhance the ability of
layer-3 networks to deliver packets more reliably and with greater
control over the delay. The design cannot use re-transmission
techniques such as TCP since that can exceed the delay tolerated by
the applications. DetNet pre-emptively sends copies of the packet
over various paths to minimize the chance of all copies of a packet
being lost. It also seeks to set an upper bound on latency, but the
goal is not to minimize latency. DetNet can be realized over the IP data
plane or the MPLS data plane, and may be used to provide Virtual Transport
Paths (VTPs) for VPN+ services.

MPLS-TE
introduces the concept of reserving end-to-end bandwidth for a
TE-LSP, which can be used to provide a point-to-point Virtual
Transport Path (VTP) across the underlay network to support VPN
services. VPN traffic can be carried over dedicated TE-LSPs to
provide reserved bandwidth for each specific connection in a VPN,
and VPNs with similar behavior requirements may be multiplexed onto
the same TE-LSPs. Some network operators have concerns about the
scalability and management overhead of MPLS-TE systems, especially
with regard to those systems that use an active control plane, and
this has led them to consider other solutions for traffic
engineering in their networks.

Segment Routing (SR) is a method that
prepends instructions to packets at the head-end of a path. These
instructions are used to specify the nodes and links to be
traversed, and allow the packets to be routed on paths other than
the shortest path. By encoding the state in the packet, per-path
state is transitioned out of the network. SR can be instantiated
using MPLS data plane (SR-MPLS) or IPv6 data plane (SRv6).

An SR traffic engineered path operates with a granularity of a
link. Hints about priority are provided using the Traffic Class (TC)
field in the packet header. However, to achieve the performance and
isolation characteristics that are sought by VPN+ customers, it will
be necessary to steer packets through specific virtual links and/or
queues on the same link and direct them to use specific resources.
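One way to picture this kind of steering is a segment list in which each segment identifies not only a link but also the resources set aside on that link for a given virtual network. The sketch below is a minimal illustration under stated assumptions: the SID values, link names, queue names, and the `SID_TABLE` structure are all hypothetical and do not represent any defined encoding or protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSid:
    """A hypothetical segment that names a link and the queue reserved on it."""
    sid: int
    link: str
    queue: str


# Hypothetical allocation: one SID per (link, virtual network) pair, so the
# same link is represented by different SIDs for different virtual networks.
SID_TABLE = {
    ("A-B", "vtn-1"): ResourceSid(1001, "A-B", "queue-1"),
    ("B-C", "vtn-1"): ResourceSid(1002, "B-C", "queue-1"),
    ("A-B", "vtn-2"): ResourceSid(2001, "A-B", "queue-2"),
    ("B-C", "vtn-2"): ResourceSid(2002, "B-C", "queue-2"),
}


def build_segment_list(links, vtn):
    """Return the ordered segments that steer a packet over `links` for `vtn`."""
    segments = []
    for link in links:
        key = (link, vtn)
        if key not in SID_TABLE:
            raise ValueError(f"no resources allocated on {link} for {vtn}")
        segments.append(SID_TABLE[key])
    return segments


if __name__ == "__main__":
    for segment in build_segment_list(["A-B", "B-C"], "vtn-1"):
        print(segment.sid, segment.link, segment.queue)
```

In this sketch two virtual networks follow the same links but carry different segment lists, so each hop can select the queue associated with the segment it processes.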
With SR, it is possible to introduce such fine-grained packet
steering by specifying the queues and the associated resources
through an SR instruction list.

Note that the concept of a queue is a useful abstraction for
different types of underlay mechanism that may be used to provide
enhanced isolation and performance support. How the queue satisfies
the requirement is implementation specific and is transparent to the
layer-3 data plane and control plane mechanisms used.

With Segment Routing, the SR instruction list could be used to
build a P2P path, and a group of SR SIDs could also be used to
represent an MP2MP network. Thus, the SR based mechanism could be
used to provide both a Virtual Transport Path (VTP) and a Virtual
Transport Network (VTN) for VPN+ services.

Non-packet underlay data plane technologies often have TE
properties and behaviors, and meet many of the key requirements in
particular for bandwidth guarantees, traffic isolation (with physical
isolation often being an integral part of the technology), highly
predictable latency and jitter characteristics, measurable loss
characteristics, and ease of identification of flows. The cost is that
the resources are allocated on a long-term and end-to-end basis. Such
an arrangement means that the full cost of the resources has to be
borne by the client to which the resources are allocated. When a VTN
built with this data plane is used to support multiple VPN+
services, the cost can be distributed among that group of
services.

The control plane of VPN+ would likely be based on a hybrid control
mechanism that takes advantage of a logically centralized controller
for on-demand provisioning and global optimization, whilst still
relying on a distributed control plane to provide scalability, high
reliability, fast reaction, automatic failure recovery, etc. Extensions
to and optimization of the centralized and distributed control planes
are needed to support the enhanced properties of VPN+.

As described in section 4, the VPN+ control plane needs to provide
the following functions:

Collect information about the underlying network topology and
network resources, and export this to network nodes and/or a
centralized controller as required.

Create VTNs with the network resource and topology properties
needed by the VPN+ services.

Distribute the attributes of VTNs to network nodes which
participate in the VTNs and/or the centralized controller.

Map VPN+ services to an appropriate VTN.

Compute and set up VTPs in each VTN to meet VPN+ service
requirements.

The collection of underlying network topology and resource
information can be done using the existing IGP and BGP-LS based
mechanisms. The creation of VTNs and the distribution of VTN attributes
may need further control protocol extensions. The computation of VTPs
based on the attributes and constraints of the VTN can be performed
either by the headend node of the path or a centralized Path
Computation Element (PCE).

There are two candidate mechanisms for the setup of VTPs in the
VTN: RSVP-TE and Segment Routing (SR).

RSVP-TE provides the signaling
mechanism for establishing a TE-LSP in an MPLS network with
end-to-end resource reservation. This can be seen as an approach
of providing a Virtual Transport Path (VTP) which could be used to
bind the VPN to specific network resources allocated within the
underlay, but there remain scalability concerns as mentioned in .

The SR control plane does not have the
capability of signaling resource reservations along the path. On
the other hand, the SR approach provides a potential way of
binding the underlay network resource and the VTNs without
requiring per-path state to be maintained in the network. A
centralized controller can perform resource planning and
reservation for VTNs, and it needs to instruct the network nodes
to ensure that resources are correctly allocated for the VTN. The
controller could provision the SR paths based on the mechanism in
to the headend nodes of the paths.

According to the service requirements for connectivity, performance
and isolation, one VPN+ service may be mapped to a dedicated VTN, or a
group of VPN+ services may be mapped to the same VTN. The mapping of
VPN+ services to VTNs can be achieved using existing control mechanisms
with possible extensions, and it can be based on either the
characteristics of the data packet or the attributes of the VPN
service routes.

The management plane provides the interface between the VPN+
service provider and the customers for life-cycle management of the
VPN+ service (i.e., creation, modification, assurance/monitoring, and
decommissioning). It relies on a set of service data models for the
description of the information and operations needed on the
interface.

As an example, in the context of 5G end-to-end network slicing,
the management of the transport network
segment of the 5G end-to-end network slice can be realized with the
management plane of VPN+. The 3GPP management system may provide the
connectivity and performance related parameters as requirements to the
management plane of the transport network. It may also require the
transport network to expose the capabilities and status of the network
slice. Thus, an interface between the VPN+ management plane and the 5G
network slice management system, and relevant service data models are
needed for the coordination of 5G end-to-end network slice
management.

The management plane interface and data models for VPN+ services
can be based on the service models described in .

It is important that the management life-cycle supports in-place
modification of VPN+ services. That is, it should be possible to add
and remove end points, as well as to change the requested
characteristics of the service that is delivered. The management
system needs to be able to assess the revised VPN+ requests and
determine whether they can be provided by the existing VTNs or whether
changes must be made, and it will additionally need to determine
whether those changes to the VTN are possible. If not, then the
customer's modification request may be rejected.

When the modification of a VPN+ service is possible, the management
system must make every effort to make the changes in a non-disruptive
way. That is, the modification of the VPN+ service or the underlying
VTN must not perturb traffic on the VPN+ service in a way that
causes the service level to drop below the agreed levels. Furthermore,
in the spirit of isolation, changes to one VPN+ service should not
cause disruption to other VPN+ services.

The network operator for the underlay network (i.e., the provider
of the VPN+ service) may delegate some operational aspects of the
overlay VPN and the underlying VTN to the customer. In this way, the
VPN+ is presented to the customer as a virtual network, and the
customer can choose how to use that network. Some mechanisms in the
operator's network are needed, so that a customer cannot exceed the
capabilities of the virtual links and nodes, but can decide how to
load traffic onto the network, for example, by assigning different
metrics to the virtual links so that the customer can control how
traffic is routed through the virtual network. This approach requires
a management system for the virtual network, but does not necessarily
require any coordination between the management systems of the virtual
network and the physical network, except that the virtual network
management system might notice when the VTN is close to capacity or
considerably under-used and automatically request changes in the
service provided by the underlay network.

This section describes the applicability of the existing and
in-progress service data models to VPN+.
describes the scope and purpose of service models and shows where a
service model might fit into an SDN based network management
architecture. New service models may also be introduced for some of
the required management functions.

Service data models are used to represent, monitor, and manage the
virtual networks and services enabled by VPN+. The VPN customer
service models (e.g., the Layer-3 VPN Service Model (L3SM) and the
Layer-2 VPN Service Model (L2SM)), or the ACTN Virtual Network (VN)
model, are service models which can
provide the customer's view of the VPN+ service. The Layer-3 VPN
Network Model (L3NM) and the Layer-2 VPN Network
Model (L2NM) provide the
operator's view of the managed infrastructure as a set of virtual
networks and the associated resources. The NRP model further provides the management of the
virtual underlay network topology and resources both in the controller
and in the network devices to instantiate the VTNs needed for the VPN+
services.

One of the typical use cases of VPN+ is to deliver the IETF Network Slice
Service. This section describes the applicability of VPN+ in network
slice realization.

In order to provide IETF network slices to customers, a
technology-agnostic network slice service model is needed for the
customers to communicate the requirements of IETF network slices (end
points, connectivity, SLOs, and SLEs). These requirements may be
realized using technology specified in this document to instruct the
network to instantiate a VPN+ service to meet the requirements of the
IETF network slice customers.

According to the network operators' network resource planning
policy, or based on the requirement of one or a group of customers or
services, a VTN may need to be created to support the requested VPN+
services. One of the basic requirements for a VTN is to provide a set
of dedicated network resources to avoid unexpected interference from
other services in the same network. Other possible requirements may
include the required topology and connectivity, bandwidth, latency,
reliability, etc.

A centralized network controller can be responsible for calculating
a subset of the underlay network topology (which is called a logical
topology) to support the VTN requirement. And on the network nodes and
links within the logical topology, the set of network resources to be
allocated to the VTN can also be determined by the controller.
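The kind of calculation described here can be sketched as follows. This is a simplified illustration, not a defined algorithm: it assumes a single bandwidth requirement applied uniformly to every link, an undirected underlay described by a hypothetical dictionary of available bandwidth per link, and a non-empty list of end points. All names and figures are assumptions made for the example.

```python
from collections import deque


def plan_vtn(available, end_points, bandwidth):
    """Pick a logical topology and per-link allocation for a new VTN.

    available:  dict mapping an undirected link (u, v) to free bandwidth.
    end_points: non-empty list of nodes the VTN must connect.
    bandwidth:  bandwidth to reserve on every link of the logical topology.

    Returns a dict {link: reserved bandwidth}, or None if the end points
    cannot be connected using links with enough free bandwidth.
    """
    # Keep only links that can still offer the requested bandwidth.
    eligible = {link for link, free in available.items() if free >= bandwidth}

    adjacency = {}
    for u, v in eligible:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Breadth-first search from one end point over the eligible links.
    start = end_points[0]
    reachable = {start}
    pending = deque([start])
    while pending:
        node = pending.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in reachable:
                reachable.add(neighbor)
                pending.append(neighbor)

    if not all(node in reachable for node in end_points):
        return None
    # The logical topology is the connected set of eligible links.
    return {link: bandwidth for link in eligible if link[0] in reachable}


if __name__ == "__main__":
    underlay = {("A", "B"): 100, ("B", "C"): 40, ("A", "C"): 10, ("C", "D"): 100}
    print(plan_vtn(underlay, ["A", "C"], 30))
    print(plan_vtn(underlay, ["A", "D"], 50))
```

A real controller would also account for latency, reliability, and the resources already committed to existing VTNs; the sketch only shows the basic step of deriving a logical topology from connectivity and available resources.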
Normally such calculation needs to take the underlay network
connectivity information and the available network resource
information of the underlay network into consideration. The network
controller may also take the status of the existing VTNs into
consideration in the planning and calculation of a new VTN.

According to the result of the VTN planning, the network nodes and
links involved in the logical topology of the VTN are instructed to
allocate the required set of network resources for the VTN. One or
multiple mechanisms as specified in section 5.1 can be used to
partition the forwarding plane network resources and allocate
different subsets of resources to different VTNs. In addition, the
data plane identifiers which are used to identify the set of network
resources allocated to the VTN are also provisioned on the network
nodes. Depending on the data plane technologies used, the set of network
resources of a VTN can be identified using either resource-aware SR
segments as specified in , or a dedicated VTN
resource ID as specified in . The
network nodes involved in a VTN may distribute the logical topology
information, the VTN specific network resource information and the VTN
resource identifiers using the control plane. Such information could
be used by the controller and the network nodes to compute the TE or
shortest paths within the VTN, and install the VTN specific forwarding
entries to network nodes.

According to the connectivity requirements of an IETF network slice
service, an overlay VPN can be created using the existing or future
multi-tenancy overlay technologies as described in .

Then, according to the SLO and SLE requirements of the network
slice, the overlay VPN is mapped to an appropriate VTN as the virtual
underlay. The integration of the overlay VPN and the underlay VTN
provides a VPN+ service which can meet the network slice
service requirements.

At the edge of the operator's network, traffic of IETF network
slices can be classified based on the rules defined by the operator's
policy, so that the traffic is treated as a specific VPN+ service,
which is further mapped to an underlay VTN. Packets belonging to the
VPN+ service will be processed and forwarded by network nodes based
on the TE or shortest path forwarding entries and the set of network
resources of the corresponding VTN.

VPN+ provides performance guaranteed services in packet networks, but
with the potential cost of introducing additional state into the
network. There are at least three ways that this additional state might
be brought into the network:

Introduce the complete state into the packet, as is done in SR.
This allows the controller to specify the detailed series of
forwarding and processing instructions for the packet as it transits
the network. The cost of this is an increase in the packet header
size. The cost is also that systems will have capabilities enabled
in case they are called upon by a service. This is a type of latent
state, and increases as the path and resources that need to be
exclusively available to a VPN are specified more precisely.

Introduce the state to the network. This is normally done by
creating a path using signaling such as RSVP-TE. This could be
extended to include any element that needs to be specified along the
path, for example explicitly specifying queuing policy. It is also
possible to use other methods to introduce path state, such as via
an SDN controller, or possibly by modifying a routing protocol. With
this approach there is per-path state that
needs to be maintained over the life of the path. This is more
network state than is needed using SR, but the packets are usually
shorter.

Provide a hybrid approach. One example is based on using binding
SIDs to represent path fragments, and bind
them together with SR. Dynamic creation of a VPN service path using
SR requires less state maintenance in the network core at the
expense of larger packet headers. The packet size can be lower if a
form of loose source routing is used (using a few nodal SIDs), and
it will be lower if no specific functions or resources on the
routers are specified.

Reducing the state in the network is important to VPN+, as it
requires the overlay to be more closely integrated with the underlay
than with traditional VPNs. This tighter coupling would normally mean
that more state needs to be created and maintained in the network, as
the state about fine granularity processing would need to be loaded and
maintained in the routers. Aggregation is a well-established approach to
reduce the amount of state and improve scaling, and the VTN is considered to be
the network construct that aggregates the state of VPN+ services. In
addition, an SR approach allows much of the state to be spread amongst
the network ingress nodes, and transiently carried in the packets as
SIDs.

The following sections describe some of the scalability concerns that
need to be considered. Further discussion of the scalability
considerations of the underlying network construct of VPN+ can be found
in .

One of the challenges with SR is the stack depth that nodes are
able to impose on packets. This leads to a
difficult balance between adding state to the network and minimizing
stack depth, or minimizing state and increasing the stack depth.

The established method of creating a resource allocated path
through an MPLS network is to use the RSVP-TE protocol. However, there
have been concerns that this requires significant continuous state
maintenance in the network. Work to improve the scalability of RSVP-TE
LSPs in the control plane can be found in .

There is also concern about the scalability of the forwarder footprint
of RSVP-TE as the number of paths through a label switching router
(LSR) grows. addresses this by employing SR
within a tunnel established by RSVP-TE.

The centralized approach of SDN requires state to be stored in the
network, but does not have the overhead of also requiring control
plane state to be maintained. Each individual network node may need to
maintain a communication channel with the SDN controller, but that
compares favorably with the need for a control plane to maintain
communication with all neighbors.

However, SDN may transfer some of the scalability concerns from the
network to the centralized controller. In particular, there may be a
heavy processing burden at the controller, and a heavy load in the
network surrounding the controller. A centralized controller also
presents a single point of failure within the network.

Operations, Administration, and Maintenance (OAM) are necessary
functionalities for network and service management. The OAM mechanisms
can be further classified as OAM and telemetry. This section describes
the considerations of both classes of OAM mechanisms.

The design of OAM for VPN+ services needs to consider the following
requirements:

Instrumentation of the underlay so that the network operator
can be sure that the resources committed to a customer are
operating correctly and delivering the required performance.

Instrumentation of the overlay by the customer. This is likely
to be transparent to the network operator and to use existing
methods. Particular consideration needs to be given to the need to
verify the isolation and the various committed performance
characteristics.

Instrumentation of the overlay by the network provider to
proactively demonstrate that the committed performance is being
delivered. This needs to be done in a non-intrusive manner,
particularly when the tenant is deploying a performance sensitive
application.

Verification of the conformity of the path to the service
requirement. This may need to be done as part of a commissioning
test.

A study of OAM in SR networks has been documented in .

Network visibility is essential for network operation. Network
telemetry has been considered as an ideal means to gain sufficient
network visibility with better flexibility, scalability, accuracy,
coverage, and performance than conventional OAM technologies.

As defined in , the objective of Network
Telemetry is to acquire network data remotely for network monitoring
and operation. It is a general term for a large set of network
visibility techniques and protocols. Network telemetry addresses the
current network operation issues and enables smooth evolution toward
intent-driven autonomous networks. Telemetry can be applied on the
forwarding plane, the control plane, and the management plane in a
network.

How the telemetry mechanisms could be used or extended for the VPN+
service is out of the scope of this document.

Each VPN+ service has a life cycle, and may need modification during
deployment as the needs of its tenant change. This is discussed in . Additionally, as the network evolves, there
may need to be garbage collection performed to consolidate resources
into usable quanta.

Systems in which the path is imposed, such as SR or some form of
explicit routing, tend to do well in these applications, because it is
possible to perform an atomic transition from one path to another. That
is, a single action by the head-end that changes the path without the
need for coordinated action by the routers along the path. However,
implementations and the monitoring protocols need to make sure that the
new path is operational and meets the required SLA before traffic is
transitioned to it. It is possible for deadlocks to arise as a result of
the network becoming fragmented over time, such that it is impossible to
create a new path or to modify an existing path without impacting the
SLA of other paths. Resolution of this situation is as much a commercial
issue as it is a technical issue and is outside the scope of this
document.

There are, however, two manifestations of the latency problem that
are for further study in any of these approaches:

The problem of packets overtaking one another if a path latency
reduces during a transition.

The problem of transient variation in latency in either direction
as a path migrates.

There is also the matter of what happens during failure in the
underlay infrastructure. Fast reroute is one approach, but that still
produces a transient loss with a normal goal of rectifying this within
50 ms. An alternative is some form of N+1
delivery such as has been used for many years to support protection from
service disruption. This may be taken to a different level using the
techniques of DetNet with multiple in-network replication and the
culling of later packets.

In addition to the approach used to protect high priority packets,
consideration should be given to the impact of best effort traffic on
the high priority packets during a transition. Specifically, if a
conventional re-convergence process is used there will inevitably be
micro-loops and whilst some form of explicit routing will protect the
high priority traffic, lower priority traffic on best effort shortest
paths will micro-loop without the use of a loop prevention technology.
To provide the highest quality of service to high priority traffic,
either this traffic must be shielded from the micro-loops, or
micro-loops must be prevented completely.

It is likely that VPN+ services will be introduced in networks which
already have VPN services deployed. Depending on service requirements,
the tenants or the operator may choose to use a VPN or a VPN+ to fulfill
a service requirement. The information and parameters to assist such a
decision need to be reflected on the management interface between the
tenant and the operator.

All types of virtual network require special consideration to be
given to the isolation of traffic belonging to different tenants. That
is, traffic belonging to one VPN must not be delivered to end points
outside that VPN. In this regard VPN+ neither introduces, nor experiences,
greater security risks than other VPNs.

However, in a VPN+ service the additional service requirements need
to be considered. For example, if a service requires a specific upper
bound to latency then it can be damaged by simply delaying the packets
through the activities of another tenant, i.e., by introducing bursts of
traffic for other services. In some respects this makes the VPN+ more
susceptible to attacks since the SLA may be broken. But another view is
that the operator must, in any case, perform monitoring of the VPN+ to
ensure that the SLA is met, and this means that the operator may be more
likely to spot the early onset of a security attack and be able to take
pre-emptive protective action.

The measures to address these dynamic security risks must be
specified as part of the specific solution, and form part of the
isolation requirements of a service.

While a VPN+ service may be sold as offering encryption and other
security features as part of the service, customers would be well
advised to take responsibility for their own security requirements
themselves, possibly by encrypting traffic before handing it off to the
service provider.

The privacy of VPN+ service customers must be preserved. It should
not be possible for one customer to discover the existence of another
customer, nor should the sites that are members of a VPN+ be externally
visible.

A VPN+ service (even one with hard isolation requirements) does not
provide any additional guarantees of privacy for customer traffic
compared to regular VPNs: the traffic within the network may be
intercepted and errors may lead to mis-delivery. Users who wish to
ensure the privacy of their traffic must take their own precautions
including end-to-end encryption.

There are no requested IANA actions.

The authors would like to thank Charlie Perkins, James N Guichard,
John E Drake, Shunsuke Homma, and Luis M. Contreras for their review and
valuable comments.

This work was supported in part by the European Commission funded
H2020-ICT-2016-2 METRO-HAUL project (G.A. 761727).

3GPP TS23.501
3GPP TS28.530
NGMN NS Concept
Flex Ethernet Implementation Agreement
Time-Sensitive Networking
Deterministic Networking
Service Function Chaining