Internet-Draft semantic-sdn-mom August 2022
Bellavista, et al. Expires 4 March 2023
Workgroup: Individual Submission
Internet-Draft: draft-bellavista-semantic-sdn-mom-01
Published: August 2022
Intended Status: Informational
Expires: 4 March 2023
Authors:
P. Bellavista
University of Bologna
L. Foschini
University of Bologna
L. Patera
University of Bologna
M. Fogli
University of Ferrara
C. Giannelli
University of Ferrara
C. Stefanelli
University of Ferrara
D. Z. Lou
Huawei

A Framework for QoS-Enabled Semantic Routing in Industrial Networks

Abstract

Industrial networks pose unique challenges in realizing a communication substrate on the shop floor. Such challenges are due to strict Quality of Service (QoS) requirements, a wide range of protocols for data exchange, and highly heterogeneous network infrastructures. In this regard, this document proposes a framework for QoS-enabled semantic routing in industrial networks. Such a framework aims at providing loosely-coupled, asynchronous communications, fine-grained traffic management (delivery semantics and flow priorities), and in-network traffic optimization.

Discussion Venues

This note is to be removed before publishing as an RFC.

Source for this draft and an issue tracker can be found at https://github.com/fglmtt/draft-bellavista-semantic-sdn-mom.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 March 2023.

1. Introduction

This Internet Draft defines a framework for Quality of Service (QoS)-enabled semantic routing in industrial networks. The term "semantic routing" refers to a form of routing based on additional semantics other than mere IP addresses [I-D.draft-farrel-irtf-introduction-to-semantic-routing-03]. Along with the semantics carried in packet headers, such routing may also depend on policy coded in, configured at, or signaled to network devices. A network device is an element that receives/transmits packets and performs network functions on them, such as forwarding, dropping, filtering, and packet header (or payload) manipulation, among others. Network devices may operate in, above, and below the network layer.

The framework described in this draft uses overlay networking to provide a semantic routing substrate that operates at both the application and the network level.

At the application level, the framework consists of Message-Oriented Middleware (MOM) and Application Gateways (AGWs). The MOM decouples senders and receivers, sorts messages into topics of interest, and provides delivery semantics (e.g., at most once, at least once, and exactly once). The AGWs sit near industrial machines that are not natively compliant with the protocols the framework relies on. For example, some legacy industrial machines may not even support IP-based communications. It is worth mentioning that the typical lifetime of industrial equipment is 10 to 15 years (sometimes even longer), and in many cases the software cannot be updated due to manufacturers' policies. Accordingly, AGWs translate the plethora of (proprietary) protocols that coexist on the shop floor into the one(s) used by the framework.
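The decoupling and delivery semantics mentioned above can be illustrated with a minimal, in-memory sketch. It is not the MOM used by the framework: the `Broker` class, its method names, and the retry limit are assumptions introduced only for illustration. A handler returns True to acknowledge a message; under at-least-once delivery the broker retries until it gets an acknowledgment, which is why a receiver may see duplicates.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[bytes], bool]  # returns True to acknowledge the message


class Broker:
    """Hypothetical topic-based broker; senders never address receivers directly."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes,
                at_least_once: bool = False, max_attempts: int = 3) -> int:
        """Deliver payload to each subscriber of topic; return the number of delivery attempts."""
        attempts = 0
        for handler in self._subscribers[topic]:
            # At most once: a single attempt. At least once: retry until acknowledged.
            tries = max_attempts if at_least_once else 1
            for _ in range(tries):
                attempts += 1
                if handler(payload):
                    break
        return attempts
```

Exactly-once delivery is deliberately left out of the sketch, since it typically needs deduplication state on the receiver side.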

At the network level, the framework combines two paradigms: Software-Defined Networking (SDN) [RFC7426] and In-Network Processing (INP) [ZILBERMAN2019], [PORTS2019]. Although the MOM enables critical features in message dispatching, it does not control how packets flow through network devices along routing paths. This is where SDN comes in. Specifically, the SDN controller computes optimal routes to meet the QoS requirements and configures network devices accordingly. The term INP refers to executing end-host programs within network devices. Such INP-enabled network devices operate at line rate, processing packets as they traverse them without increasing the overall network load. Given that the SDN controller holds a network-wide view, it also knows which network devices support INP and which do not. The SDN controller may redirect flows towards target INP-enabled network devices based on the processing functions they provide.

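The objectives that the framework targets are the following:

  • Loosely-coupled, asynchronous communications.
  • Fine-grained traffic management (delivery semantics and flow priorities).
  • In-network traffic optimization.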

The remainder of this draft is structured as follows. First, Section 2 details the target scenario. Then, Section 3 provides the requirements of the target scenario. Next, Section 4 presents the principles and design guidelines of the framework. Lastly, Section 5 depicts the architecture of the framework and Section 6 proposes protocols to support it.

2. Target Scenario

Traditionally, a shop floor includes industrial machines, Programmable Logic Controllers (PLCs), and Human-Machine Interfaces (HMIs). Typically, industrial machines are equipped with sensors and actuators, PLCs control manufacturing processes, and human operators interact with and receive feedback from industrial machines through HMIs. In such legacy industrial networks, message dispatching was primarily oriented to monitoring operational and safety-related machine parameters.

Nowadays, the shop floor has become more articulated due to the advent of the Industrial Internet of Things (IIoT). On the one hand, IIoT devices enable business-critical services (e.g., predictive maintenance) cost-effectively. On the other hand, they dramatically increase overall network traffic volume, infrastructure heterogeneity, and cyber security threats.

The heterogeneity concerns not only the industrial equipment itself but also how such equipment disseminates information. The plethora of (proprietary) protocols that machines use to exchange data makes machine-to-machine communications challenging.

Additionally, the shop floor may include dynamic industrial equipment (e.g., automated guided vehicles) that communicates on the move. Such dynamic equipment may abruptly migrate communications across different access points according to its physical location at a given time.

Therefore, modern industrial environments stress the network infrastructure more than traditional ones, where network traffic was fairly limited to mission-critical information generated by fixed network equipment.

To fulfill current industrial guidelines for cyber security (e.g., IEC 62443 [IEC62443]), the industrial topology should consist of several shop floor subnets and a control room subnet. Figure 1 depicts an industrial topology compliant with such guidelines.

 Control Room Subnet
+---------------------------------------------------------------+
|                                                               |
|  +------------+ +------------+ +------------+ +------------+  |
|  |    SDN     | |    AGW     | |    MOM     | |    INP     |  |
|  | Controller | | Controller | | Controller | | Controller |  |
|  +-----+------+ +-----+------+ +------+-----+ +------+-----+  |
|        |              |               |              |        |
|        +--------------+-------+-------+--------------+        |
|                               |                               |
|                           +---+---+                           |
|                           |  SGW  |                           |
+---------------------------+---+---+---------------------------+
                                |
----------------+---------------+----------------+---------------
                |                                |
+-----------+---+---+----------+ +-----------+---+---+----------+
|           |  SGW  |          | |           |  SGW  |          |
|           +---+---+          | |           +---+---+          |
|               |              | |               |              |
|  +------------+-----------+  | |  +----------+ |              |
|  |           AGW          |  | |  |   AGW    +-+--+------+    |
|  +-+------+------+------+-+  | |  +-+------+-+    |      |    |
|    |      |      |      |    | |    |      |      |      |    |
|  +-+------+------+------+-+  | |  +-+------+-+  +-+------+-+  |
|  |        Machines        |  | |  | Machines |  | Machines |  |
|  +------------------------+  | |  +----------+  +----------+  |
|                              | |                              |
+------------------------------+ +------------------------------+
 Shop Floor Subnet 1              Shop Floor Subnet N
Figure 1: Target network topology

Note that:

The network devices interconnecting the subnets form the industrial network backbone. The outcome is a multihop multipath topology providing point-to-point connections with differentiated performance.

The framework described in this document targets the scenario depicted in Figure 1. The framework components (i.e., MOM, AGW, SDN, and INP controllers) run within the control room subnet. Note that other services may also run in the control room subnet along with them. Typical examples are the Manufacturing Execution System (MES) and the Enterprise Resource Planning (ERP) system.

3. Requirements

The transition from traditional to modern industrial environments raised the critical communication challenges described in Section 2. In this regard, it is worth remarking that industrial machines typically have long lifetimes (decades), high costs (millions of USD), and restrictive manufacturers' policies in place (e.g., to prevent firmware updates). Accordingly, the communication substrate should address such challenges by fulfilling additional requirements.

First, non-mission-critical and mission-critical traffic should be distinguished. Typically, non-mission-critical flows (e.g., monitoring of vibrations) are far more voluminous than mission-critical ones (e.g., alerting human operators about dangerous events); thus, the former may easily consume network resources at the expense of the latter. This requires per-flow traffic management, ranging from flow prioritization (mission-critical flows go first, then non-mission-critical ones) to data aggregation and filtering to reduce the traffic traversing the network. Since industrial control typically runs cyclically at the millisecond level, control traffic, and mission-critical traffic in particular, demands high QoS in terms of low latency, low jitter, and an extremely low packet loss ratio.
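The flow prioritization described above (mission-critical first, then non-mission-critical) can be sketched as a strict-priority queue. This is only an illustration under stated assumptions: the two priority values, the `PriorityScheduler` name, and the use of strings as packets are hypothetical and not part of the framework.

```python
import heapq
import itertools
from typing import List, Tuple

# Assumed encoding: a lower value is served first.
MISSION_CRITICAL = 0
NON_MISSION_CRITICAL = 1


class PriorityScheduler:
    """Strict-priority queue that keeps FIFO order within each traffic class."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._sequence = itertools.count()  # tie-breaker preserving arrival order

    def enqueue(self, priority: int, packet: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._sequence), packet))

    def dequeue(self) -> str:
        """Return the next packet to transmit; raises IndexError if the queue is empty."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

Strict priority can starve the lower class under sustained load, so a real deployment would likely pair it with rate limiting or a weighted discipline.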

Second, industrial communication demands high reliability. The telecommunication equipment deployed in the Internet typically guarantees 99.99% reliability. Industrial systems, however, need to be much more reliable, from 99.9999% to 99.99999%, in order to reduce the downtime of the production line. This requires equipping the industrial network with extra measures to support such reliability levels.
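To make the gap between these figures concrete, the short calculation below converts each percentage into expected downtime per year. It assumes that the percentages can be read as availability figures over a 365-day year, which is our interpretation for the sake of the example; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year


def yearly_downtime_seconds(availability_percent: float) -> float:
    """Expected unavailable time per year for a given availability percentage."""
    return (1.0 - availability_percent / 100.0) * SECONDS_PER_YEAR


for level in (99.99, 99.9999, 99.99999):
    print(f"{level}% -> {yearly_downtime_seconds(level):.1f} s/year")
```

Under this reading, 99.99% corresponds to roughly 52.6 minutes of downtime per year, whereas 99.9999% and 99.99999% correspond to about 31.5 seconds and 3.2 seconds, respectively.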

Third, machine-to-machine communications should be enabled straightforwardly, notwithstanding the plethora of (proprietary) dialects that coexist at the shop floor level, so that different shop floor devices can interoperate. This requires connectors to translate such dialects into a common one and metadata to express the semantics. Intermediate nodes may use semantics to process packet payloads according to the information they carry. For example, an intermediate node may average a given number of consecutive temperature values (data aggregation) or drop values of little application interest (data filtering).
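A connector of the kind described above could look as follows. This is a sketch under several assumptions: the register number, the scaling divisor, the JSON field names, and the unit label are all hypothetical, and the common format is taken to be JSON purely for illustration.

```python
import json
from typing import Dict, Tuple

# Hypothetical register map for one machine dialect:
# register number -> (quantity, unit, divisor applied to the raw integer value)
REGISTER_MAP: Dict[int, Tuple[str, str, int]] = {
    40001: ("temperature", "Cel", 10),
}


def translate(register: int, raw_value: int, machine_id: int) -> str:
    """Turn a raw register reading into a self-describing message in the common format."""
    quantity, unit, divisor = REGISTER_MAP[register]
    message = {
        "machineId": machine_id,
        "quantity": quantity,  # metadata that lets intermediate nodes interpret the payload
        "unit": unit,
        "value": raw_value / divisor,
    }
    return json.dumps(message, sort_keys=True)
```

For example, `translate(40001, 215, 42)` yields a message whose `value` is 21.5 and whose metadata states that it is a temperature, which is what an intermediate node would need in order to aggregate or filter it.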

Lastly, machines should keep communicating on the move without affecting overall performance. For example, an automated guided vehicle may move from one shop floor subnet to another. By doing so, the vehicle changes the Wi-Fi access point (i.e., SGW) used to access the network. As a result, the flows sent out by such a vehicle need to be rescheduled accordingly. This requires not only reconfiguring network devices dynamically but also doing so in a way that is compatible with the other flows already in place.

In this context, edge computing plays a crucial role in enabling the design and implementation of novel distributed control functions, parts of which are hosted on edge nodes located on the production plant premises and close to the controlled sensors/actuators, primarily to increase reliability and decrease latency. In the following, we discuss a framework for QoS-enabled semantic routing in industrial networks capable of synchronizing several entities in a simplified manner via a single logical configuration interface (the "Northbound interface").

4. Principles and Design Guidelines

Future industrial networks will be characterized by an unprecedented degree of heterogeneity and complexity. Traditional solutions, mainly based on the direct interconnection of machines to one another and to the control room, cannot provide the required degree of flexibility. This leads to exploring novel solutions to manage the deluge of data generated by IIoT devices and provide QoS-driven network (re)configuration.

Given the momentum of MOM as an enabler of the Industry 4.0 vision, we believe it will become a pillar of future industrial ecosystems. Although MOM enables critical features that facilitate message dispatching independently of the actual machine location, it does not control how packets flow through intermediate network devices along the routing path. In fact, once a message is sent from a broker to a consumer (or vice versa, from a producer to a broker), the path the message traverses is beyond the MOM's control. However, the ability to dictate the behavior of intermediate network devices is essential to satisfy stringent QoS requirements. This is where the SDN paradigm comes in.

The SDN controller eases configuration and management of network devices, which act as the (distributed) communication substrate between the machines and the MOM. In addition, the SDN controller provides network-wide abstractions to define and enforce fine-grained network policies.

At the top level, the MOM identifies the destination nodes to which a message should be dispatched, along with the delivery semantics (e.g., at most once, at least once, or exactly once) to be applied. At the bottom level, AGWs deployed close to machines act as intermediaries between the machines (and the plethora of protocols they speak) and the MOM. At the middle level, the SDN controller exploits its network-wide view to (re)configure the network devices according to the QoS requirements.
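As an illustration of how a controller with a network-wide view might derive a route from QoS requirements, the sketch below runs Dijkstra's algorithm on latency while ignoring links whose available bandwidth is below the requested minimum. The graph encoding, the function name, and the choice of latency and bandwidth as the only metrics are assumptions made for this example; the draft does not prescribe a route computation algorithm.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

# node -> neighbor -> (link latency in ms, available bandwidth in Mbit/s)
Graph = Dict[str, Dict[str, Tuple[float, float]]]


def qos_route(graph: Graph, src: str, dst: str,
              min_bandwidth: float) -> Optional[Tuple[float, List[str]]]:
    """Return (latency, path) of the lowest-latency path whose links all offer min_bandwidth."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, src, [src])]
    visited: Set[str] = set()
    while queue:
        latency, node, path = heapq.heappop(queue)
        if node == dst:
            return latency, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, (link_latency, link_bandwidth) in graph.get(node, {}).items():
            # Links that cannot carry the flow are pruned before relaxation.
            if neighbor not in visited and link_bandwidth >= min_bandwidth:
                heapq.heappush(queue, (latency + link_latency, neighbor, path + [neighbor]))
    return None  # no path satisfies the bandwidth requirement
```

The returned path would then be translated into forwarding rules for each device along it, a step that depends on the southbound protocol and is outside this sketch.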

Based on the MOM-SDN interplay, network devices can be properly configured:

For example, considering two traffic flows between the MOM broker and a machine, proper routing table management allows forwarding traffic flows tagged as "mission-critical" via a large-bandwidth, low-latency path (if available). In contrast, traffic flows tagged as "not-urgent" may be delayed, where the magnitude of the imposed delay may also depend on the current level of network saturation. Finally, an INP-enabled network device may exploit the semantics of the carried data to provide content-based message management. For instance, it is possible to forward packets only if they satisfy a given rule (e.g., if they carry temperature values greater than a given threshold) or to apply functions that send pre-processed values (e.g., sending only one packet with the average temperature resulting from a series of received temperature values). Note that content-based message management enables decisions based on what is carried within packet payloads rather than only on packet headers (mere forwarding). However, since payload inspection and manipulation may introduce additional delays, content-based message management should be applied as much as possible but without burdening mission-critical traffic flows.
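The two content-based rules mentioned above, threshold filtering and averaging, can be expressed as the following functions. They are written in Python only to show the logic; an actual INP-enabled device would implement them in its own data plane language, and the function names and the handling of incomplete windows are assumptions.

```python
from typing import List


def threshold_filter(values: List[float], threshold: float) -> List[float]:
    """Forward only the readings strictly greater than the threshold."""
    return [value for value in values if value > threshold]


def window_average(values: List[float], window: int) -> List[float]:
    """Emit one averaged value per full window of consecutive readings.

    Readings left over in an incomplete trailing window are not emitted.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        averages.append(sum(chunk) / window)
    return averages
```

With a window of N readings, the averaging rule reduces the number of packets leaving the device by a factor of N, which is the traffic reduction the text refers to.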

From a functional point of view, the INP level sits atop the data forwarding level. As in the case of SDN deployment, we do not argue that all the network devices should be INP-enabled. Instead, we promote a pragmatic approach where legacy and novel solutions cooperate effectively. Since the SDN controller holds a network-wide view, it knows which network devices offer INP and which do not. Therefore, traffic can be optimally handled by maximizing INP (e.g., routing of packets carrying values that can be averaged towards network devices providing that aggregation function) while ensuring QoS requirements.
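One possible way to combine the two goals stated above, maximizing INP while ensuring QoS, is sketched below: among the candidate paths that respect a latency budget, prefer those traversing a device that offers the wanted function, and fall back to plain forwarding otherwise. The data structures, the function name, and the use of latency as the only QoS constraint are illustrative assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

Path = Tuple[float, List[str]]  # (end-to-end latency in ms, ordered device names)


def pick_path(paths: List[Path], inp_functions: Dict[str, Set[str]],
              wanted_function: str, max_latency: float) -> Optional[Path]:
    """Choose a path within the latency budget, preferring one with the wanted INP function."""
    feasible = [path for path in paths if path[0] <= max_latency]
    if not feasible:
        return None  # the QoS requirement cannot be met at all
    with_inp = [
        path for path in feasible
        if any(wanted_function in inp_functions.get(device, set()) for device in path[1])
    ]
    # QoS is a hard constraint; INP support is a preference among feasible paths.
    return min(with_inp or feasible, key=lambda path: path[0])
```

Devices missing from `inp_functions` are treated as legacy devices without INP support, which matches the mixed deployment the text argues for.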

5. Architecture

The proposed architecture, mostly working at the application layer, adopts the typical SDN approach by identifying two main areas: Control Plane and Data Plane. In the Control Plane, the following components are deployed: the MOM controller, interacting with the MOM broker; the In-Network Processing (INP) controller, managing the INP units; the SDN controller, controlling network elements; and the Gateway controller, managing the many application gateways deployed in the environment. The Data Plane consists of the implementation of the MOM, the INP units, the SDN-enabled network elements, and the Gateway components.

                               PROTO E
      +-------------+-------------+---------------------+     NORTHBOUND
      |             |             |                     |          IFACE
      v             v             v                     v
+-----+-----+ +-----+-----+ +-----+-----+         +-----+-----+
|  GATEWAY  | |    SDN    | |    INP    |         |    MOM    |  CONTROL
|CONTROLLER | |CONTROLLER | |CONTROLLER |         |CONTROLLER |    PLANE
+-----+-----+ +-----+-----+ +-----+-----+         +-----+-----+
      ^             ^             ^                     ^
      |PROTO A      |PROTO B      |PROTO C       PROTO D|     SOUTHBOUND
      |             |             |                     |         IFACES
      v             |             v                     v
+-----+-----+       |       +-----+-----+         +-----+-----+
|  GATEWAY  |       |       | INP UNIT  |         |    MOM    |
+-----------+       v       +-----------+         +-----------+     DATA
+-------------------+-----------------------------------------+    PLANE
|                                SDN                          |
+-------------------------------------------------------------+
Figure 2: Functional/layered view of the SDN-MOM distributed architecture.

Each component has different duties and responsibilities:

Figure 2 depicts a schematic of the entire infrastructure. The paths between controller entities in the control plane (Protocol E) and between the control and data planes (Protocols A, B, C, and D) represent management/configuration data exchanges that are logically separate from data flows. Data flows start from the Gateways (connected to the machinery via machine-specific protocols) and are sent through the SDN component, which spans the entire platform.

The proposed platform can be seen as an integration of several software architectures into a single system capable of interacting with them in a uniform and controlled way. In this draft, we omit our specific implementation of Protocols D and E, and we invite the community to propose implementations capable of satisfying the requirements of each step. Certain interfaces can be readily implemented using de facto standard protocols. For instance, Protocol B can be OpenFlow (Open Networking Foundation, "OpenFlow Switch Specification", Version 1.5.1, October 2015, https://opennetworking.org/wp-content/uploads/2014/10/openflow-switch-v1.5.1.pdf), and Protocol C can be P4 (The P4 Language Consortium, "P4_16 Language Specification", Version 1.2.1, June 2020, https://p4.org/p4-spec/docs/P4-16-v1.2.1.html). The other interfaces remain open issues and must currently be implemented as ad hoc solutions.

6. Protocol Proposal

Section 5 provided an overview of the architectural components and their links. This section proposes a custom message header (see Section 6.1) that gateways should attach to messages sent by machines that are not natively compliant with the protocols the framework relies on and describes the protocol between the Gateways and the Gateway Controller (see Section 6.2).

6.1. Data Header

 DataHeader
 +-----------------------+
 | flowId: int16         |
 | machineId: int16      |
 | machineSerial: string |
 +-----------------------+
  • flowId: A 16-bit user-defined integer that identifies the flow in a unique way in the system.
  • machineId: A 16-bit machine identifier. It can be set logically based on user needs.
  • machineSerial: A string containing the machine serial number. It can be read dynamically from machine registers or set manually on the gateways.

Each component that receives a DataHeader it has not seen before sends it to its controller and waits for the routing/processing/flow rules to be set.
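The draft does not define a wire encoding for the DataHeader, so the sketch below picks one purely for illustration: network byte order, two signed 16-bit integers, and the serial number as a UTF-8 string prefixed by a one-byte length. These choices, together with the `needs_controller` helper that models the "unseen header" check, are assumptions and not part of the proposal.

```python
import struct
from dataclasses import dataclass
from typing import Set

_FIXED = struct.Struct("!hhB")  # flowId, machineId, serial length (assumed layout)


@dataclass(frozen=True)
class DataHeader:
    flow_id: int
    machine_id: int
    machine_serial: str

    def encode(self) -> bytes:
        serial = self.machine_serial.encode("utf-8")
        if len(serial) > 255:
            raise ValueError("serial does not fit the assumed one-byte length prefix")
        return _FIXED.pack(self.flow_id, self.machine_id, len(serial)) + serial

    @classmethod
    def decode(cls, data: bytes) -> "DataHeader":
        flow_id, machine_id, length = _FIXED.unpack_from(data, 0)
        start = _FIXED.size
        serial = data[start:start + length].decode("utf-8")
        return cls(flow_id, machine_id, serial)


def needs_controller(header: DataHeader, known_flows: Set[int]) -> bool:
    """True if no rules are installed yet for this flow, so the controller must be asked."""
    return header.flow_id not in known_flows
```

A component would call `needs_controller` on each decoded header and hold the message until its controller has installed the corresponding rules.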

6.2. Protocol A - From Gateway Controller to Gateways

 Protocol A - From Gateway Controller to Gateways
 +-----------------------------+
 | header: DataHeader          |
 +-----------------------------+
 | crud: 2bit                  |
 | ttl: uint32                 |
 | ipFrom: ipAddr              |
 | ipTo: ipAddr                |
 | destTopic: string           |
 | semanticDelivery: 3bit      |
 | machineProtocol: string     |
 | machineUrl: string          |
 | pollingInterval: int8       |
 | geoPosition: geoURI-RFC5870 |
 | applicationType: string     |
 +-----------------------------+
  • header: see Section 6.1.
  • crud: 2-bit flag identifying the Create, Read, Update, or Delete operation on a new or existing configuration.
  • ttl: time to live [ms] of the configuration. After that time, the gateway stops sending out new messages. If set to 0, the configuration is permanent.
  • ipFrom: IP address of the gateway interface from which messages are sent out through the platform.
  • ipTo: IP address of the destination Message-Oriented Middleware.
  • destTopic: destination topic of the messages.
  • semanticDelivery: 3-bit field identifying the delivery semantics of the flow in the MOM. Typical values are at-most-once, at-least-once, and exactly-once, but others can be defined based on the specific implementation of the MOM.
  • machineProtocol: protocol for extracting data from the machine. Examples are MODBUS, Profibus, EtherCAT, and OPC-UA.
  • machineUrl: URL of the machine supervised by the gateway.
  • pollingInterval: interval [ms] for polling-based data extraction.
  • geoPosition: position in space of the machine supervised by the gateway, expressed in compliance with [RFC5870].
  • applicationType: application type header attached to the message body by the gateway. Examples can be 'application/json' or 'application/xml'.
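The field list above fixes the widths of crud (2 bits) and semanticDelivery (3 bits) but not their code points or bit positions. The sketch below therefore assumes both: the enumeration values and the placement of the two fields in the low five bits of one byte are hypothetical. It also shows the ttl rule, under the assumption that a configuration expires once the elapsed time reaches the ttl.

```python
from enum import IntEnum
from typing import Tuple


class Crud(IntEnum):  # assumed code points for the 2-bit crud field
    CREATE = 0
    READ = 1
    UPDATE = 2
    DELETE = 3


class Delivery(IntEnum):  # assumed code points for the 3-bit semanticDelivery field
    AT_MOST_ONCE = 0
    AT_LEAST_ONCE = 1
    EXACTLY_ONCE = 2


def pack_flags(crud: Crud, delivery: Delivery) -> int:
    """Pack both fields into one byte: crud in bits 4-3, delivery in bits 2-0."""
    return (int(crud) << 3) | int(delivery)


def unpack_flags(flags: int) -> Tuple[Crud, Delivery]:
    return Crud((flags >> 3) & 0b11), Delivery(flags & 0b111)


def is_expired(ttl_ms: int, elapsed_ms: int) -> bool:
    """A ttl of 0 marks a permanent configuration, which never expires."""
    return ttl_ms != 0 and elapsed_ms >= ttl_ms
```

Five of the eight values of the 3-bit field remain unassigned in this sketch, leaving room for the implementation-specific semantics the text allows.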

7. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

8. Security Considerations

While this Internet-Draft is not primarily focused on security, some security considerations are essential. Since the proposed solution targets industrial environments, security threats may affect not only the IT domain (e.g., service unavailability and data leaks) but also the OT domain, with potential impact on the safety of human operators. For this reason, we strongly encourage the adoption of best practices for the security and safety of industrial environments, and we advise applying the IEC 62443 family of standards [IEC62443] as a prerequisite for deploying the proposed solution. In addition, while the proposed solution is designed to maximize the QoS of higher-priority industrial applications, this should not come to the total detriment of lower-priority industrial applications, whose packets should still be delivered.

9. IANA Considerations

This document has no IANA actions.

10. References

10.1. Normative References

[I-D.draft-farrel-irtf-introduction-to-semantic-routing-03]
Farrel, A. and D. King, "An Introduction to Semantic Routing", Work in Progress, Internet-Draft, draft-farrel-irtf-introduction-to-semantic-routing-03, <https://datatracker.ietf.org/doc/html/draft-farrel-irtf-introduction-to-semantic-routing-03>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC7426]
Haleplidis, E., Ed., Pentikousis, K., Ed., Denazis, S., Hadi Salim, J., Meyer, D., and O. Koufopavlou, "Software-Defined Networking (SDN): Layers and Architecture Terminology", RFC 7426, DOI 10.17487/RFC7426, <https://www.rfc-editor.org/rfc/rfc7426>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.

10.2. Informative References

[IEC62443]
International Electrotechnical Commission, "IEC 62443: Industrial network and system security".
[PORTS2019]
Ports, D. and J. Nelson, "When Should The Network Be The Computer?", Proceedings of the Workshop on Hot Topics in Operating Systems, DOI 10.1145/3317550.3321439, <https://doi.org/10.1145/3317550.3321439>.
[RFC5870]
Mayrhofer, A. and C. Spanring, "A Uniform Resource Identifier for Geographic Locations ('geo' URI)", RFC 5870, DOI 10.17487/RFC5870, <https://doi.org/10.17487/rfc5870>.
[ZILBERMAN2019]
Zilberman, N., "In-Network Computing", <https://www.sigarch.org/in-network-computing-draft/>.

Acknowledgments

TODO acknowledge.

Authors' Addresses

Paolo Bellavista
University of Bologna
Luca Foschini
University of Bologna
Lorenzo Patera
University of Bologna
Mattia Fogli
University of Ferrara
Carlo Giannelli
University of Ferrara
Cesare Stefanelli
University of Ferrara
David Zhe Lou
Huawei