tram                                                             W. Zeng
Internet-Draft                                                 Ant Group
Intended status: Informational                                9 May 2022
Expires: 10 November 2022

    TURN Cluster: Scale out TURN cluster by routable transaction id
                       draft-zeng-turn-cluster-03

Abstract

The TURN protocol is designed to solve the connectivity problem of Peer-to-Peer Communication when NAT devices exist, by allowing each peer to establish a data channel on TURN servers. Since there are some specific requirements in the use of TURN, such as RTP/RTCP connection pairs must be sent to the same TURN server, it is not easy to scale a single TURN server into a TURN cluster. In addition, a TURN service cluster also needs to consider how to achieve good load balancing and how to protect internal information security. Based on these demands, this specification provides several standard means to implement a functional and secure TURN cluster, and this specification also provides an overview and rationale of the cluster architecture.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 10 November 2022.

Copyright Notice

Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.
Zeng                    Expires 10 November 2022                [Page 1]
Internet-Draft                TURN-Cluster                      May 2022

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
  1.1.  Terminology
  1.2.  Notation
2.  Overview of an TURN ICE process
3.  Architectural and Interactive Process
  3.1.  Overview of the Architectural
  3.2.  Overview of interaction process
    3.2.1.  ClientA Behavior
    3.2.2.  ClientB Behavior
    3.2.3.  TURN Cluster Behavior
4.  Routing Mechanism
  4.1.  Server Generate ENCRYPTED-RELAYED-ADDRESS
    4.1.1.  Preparation Phase
    4.1.2.  Obfuscation Phase
    4.1.3.  Encryption Phase
  4.2.  Generation of Routable Transaction ID
    4.2.1.  Arbitrary Mode
    4.2.2.  Specific Server Mode
    4.2.3.  Specific Address Mode
    4.2.4.  Uniqueness of Transaction ID
  4.3.  TURN LB Process Transaction ID
  4.4.  ENCRYPTED-PEER-ADDRESS
  4.5.  TLS Consideration
5.  Security Consideration
  5.1.  DoS Against TURN Cluster
  5.2.  DoS Against a Single TURN Server
6.  IANA Consideration
7.  Contributors
8.  References
  8.1.  Normative References
  8.2.  Informative References
Author's Address

1. Introduction

Interactive Connectivity Establishment (ICE), described in [RFC8445], gives peers a standard way to exchange information and establish a data channel between each other. While the channel is being established, a peer located behind a NAT may be unable to communicate directly with other peers. [RFC8656] defines the TURN protocol to solve this problem by offering a standard way to establish a relayed channel between peers. TURN and ICE are widely used, and the most typical scenario is WebRTC (described in [RFC7478]).

Imagine a WebRTC scenario with a large number of users. When most users need the relay service, a single TURN server becomes the bottleneck of the system. Placing network load-balancing equipment in front of a group of TURN servers, so that each request is forwarded to one member of the group, is an efficient way to scale, and it allows near-linear performance improvement.
However, TURN servers behind simple network load-balancing equipment are not enough to build a fully functional cluster, since a TURN cluster must still meet these requirements:

* For RTP/RTCP connection pairs and TCP relay, client requests with different source addresses must be forwarded to the same server. A TURN cluster SHOULD satisfy this condition.

* The recommended ICE candidate priority calculation formula is designed for all clients connected to the same TURN server. When clients are connected to different TURN servers in the cluster, the relayed channel may contain one more hop between TURN servers, and the formula becomes unreliable. A TURN cluster SHOULD avoid this problem.

* A TURN cluster SHOULD achieve good load balancing for all members of the cluster.

[TURN-Load-balance] gives some suggestions for solving these problems: (1) DNS-based load balancing; (2) using ALTERNATE-SERVER (defined in Section 10 of [RFC8489]) to redirect requests to the right server. However, DNS-based load balancing is unreliable and the ALTERNATE-SERVER mechanism is inefficient. Moreover, these solutions are expensive and insecure, and are not suitable for large-scale deployment in Internet Data Center (IDC) environments, because they require that each TURN server in the cluster MUST have its own public network IP address and expose a considerable number of ports to the outside network.

In general, a TURN cluster SHOULD meet the following requirements:

* Meet the basic requirements for the use of all TURN protocols, including specific scenarios such as RTP/RTCP connection pairs.

* The size of the cluster is easy to scale in and out.

* The cluster SHOULD have a unified access portal, and the internal network information MUST be hidden.

* Network security policies to defend against potential attacks are easy to set up.
This specification provides an architecture and a corresponding interaction process for easily building a TURN cluster that meets all of the above requirements. Since TURN is always used in ICE, this specification introduces the related processes based on ICE for better illustration.

The remainder of this document is organized as follows: Section 2 briefly introduces how the relayed channel is established in the ICE process; Section 3 gives an overview of the architecture and the interaction process between client and TURN cluster; Section 4 introduces the generation and processing of the routing message, including: (1) how a TURN server transmits the routing message in a secure manner; (2) how a client generates a routable transaction ID from the routing message; (3) how the TURN cluster handles the transaction ID and the corresponding packet.

1.1. Terminology

Although this document is not an IETF Standards Track publication, it adopts the conventions for normative language to provide clarity of instructions to the implementer. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

The following terms are used in this document:

concat(x0, ..., xN): Concatenation of byte strings. "concat(0x01, 0x0203, 0x040506) = 0x010203040506".

1.2. Notation

All wire formats will be depicted using the notation defined in Section 1.3 of [RFC9000].
There is one addition: the function len() refers to the length of a field, which can serve as a limit on a different field, so that the lengths of two fields can be concisely defined as limited to a sum. For example:

x(A..B)
y(C..B-len(x))

indicates that x can be of any length between A and B, and y can be of any length between C and B provided that (len(x) + len(y)) does not exceed B.

The example below illustrates the basic framework:

Example Structure {
  One-bit Field (1),
  7-bit Field with Fixed Value (7) = 61,
  Field with Variable-Length Integer (i),
  Arbitrary-Length Field (..),
  Variable-Length Field (8..24),
  Field With Minimum Length (16..),
  Field With Maximum Length (..128),
  [Optional Field (64)],
  Repeated Field (8) ...,
}

Figure 1: Example Format

2. Overview of an TURN ICE process

This section uses an example to illustrate how clients set up a relayed channel through ICE and TURN. In the example, clientA and clientB are both behind symmetric NAT devices; their network topology is shown in the figure below:

                        +-------------+
                        | Turn Server |
                        +-------------+
                       10.11.252.43:3478
                           ^       ^
                           |       |
          +----------------+       +-----------------+
          |                                          |
 10.243.22.200:23768                        10.243.21.133:12371
  +---------------+                          +---------------+
  | Symmetric NAT |                          | Symmetric NAT |
  +---------------+                          +---------------+
          ^                                          ^
          |                                          |
  192.168.1.0:6677                         192.168.110.121:11202
     +---------+                                +---------+
     | clientA |                                | clientB |
     +---------+                                +---------+

Figure 2: Example network topology

Although in this example the P2P data channel built on the STUN protocol cannot be used because of the symmetric NATs, this document does not omit the STUN process of ICE, so that readers can more clearly understand the whole ICE process. A simplified establishment process for a TURN ICE relayed channel is depicted in Figure 3.
clientA                  TURN server                  clientB
|                         |                         |
|------STUN/TURN Req----->|                         |
|                         |                         |
|<-----STUN/TURN Resp-----|                         |
|                         |                         |
|--ClientA ICE Candidate Info---------------------->|
|                         |                         |
|                         |<-----STUN/TURN Req------|
|                         |                         |
|                         |------STUN/TURN Resp---->|
|                         |                         |
|<----------------------ClientB ICE Candidate Info--|
|                         |                         |
|<--Connectivity Checks-->|<--Connectivity Checks-->|
|                         |                         |
|<---------Data---------->|<--------Data----------->|
|                         |                         |

Figure 3: Example relayed channel establishment

The related behaviors in Figure 3 are explained as follows:

STUN/TURN Req: The STUN requests sent by clientA/clientB to the TURN server, which SHOULD be an Allocate request (defined in Section 7 of [RFC8656]) or a Binding request (defined in Section 2 of [RFC8489]).

STUN/TURN Resp: The STUN responses returned by the TURN server, which SHOULD include the following information: (1) XOR-RELAYED-ADDRESS (defined in Section 18.5 of [RFC8656]); (2) XOR-MAPPED-ADDRESS (defined in Section 14.2 of [RFC8489]).

ClientA/ClientB ICE Candidate Info: The ICE candidate information (defined in Section 5.3 of [RFC8445]) gathered by a client, which the client synchronizes to its peer through the signaling server (defined in [RFC8445]).

Connectivity Checks: The connectivity check process, which is defined in Section 2 of [RFC8445].
Take clientA for example. ClientA first attempts to connect directly to clientB through XOR-MAPPED-ADDRESS. Because clientA and clientB are both behind symmetric NAT devices, this attempt fails, and clientA then tries the relayed channel. If clientA and clientB can successfully bind to the XOR-RELAYED-ADDRESS of the peer, there are 3 available channels:

* srflxA2relayB: The channel from the server-reflexive address of clientA to the relayed address of clientB, shown below:

XOR-RELAYED-ADDRESS      +-------------+
allocated for clientB    | Turn Server |    10.11.252.43:3478
10.11.252.43:55555       +-------------+
        ^                                           ^
        |                                           |
        v                                           v
+---------------+                           +---------------+
| Symmetric NAT |                           | Symmetric NAT |
+---------------+                           +---------------+
        ^                                           ^
        |                                           |
        v                                           v
   +---------+                                 +---------+
   | clientA |                                 | clientB |
   +---------+                                 +---------+

Figure 4: Established srflxA2relayB Data Channel

* relayA2srflxB: The channel from the relayed address of clientA to the server-reflexive address of clientB, shown below:

                         +-------------+    XOR-RELAYED-ADDRESS
10.11.252.43:3478        | Turn Server |    allocated for clientA
                         +-------------+    10.11.252.43:55666
        ^                                           ^
        |                                           |
        v                                           v
+---------------+                           +---------------+
| Symmetric NAT |                           | Symmetric NAT |
+---------------+                           +---------------+
        ^                                           ^
        |                                           |
        v                                           v
   +---------+                                 +---------+
   | clientA |                                 | clientB |
   +---------+                                 +---------+

Figure 5: Established relayA2srflxB Data Channel

* relayA2relayB: The channel from the relayed address of clientA to the relayed address of clientB, shown below:

 XOR-RELAYED-ADDRESS                     XOR-RELAYED-ADDRESS
allocated for clientA    <-------->    allocated for clientB
 10.11.252.43:55555                      10.11.252.43:55666
   +-------------+                         +-------------+
   | Turn Server |                         | Turn Server |
   +-------------+                         +-------------+
  10.11.252.43:3478                       10.11.252.43:3478
          ^                                       ^
          |                                       |
          v                                       v
  +---------------+                       +---------------+
  | Symmetric NAT |                       | Symmetric NAT |
  +---------------+                       +---------------+
          ^                                       ^
          |                                       |
          v                                       v
     +---------+                             +---------+
     | clientA |                             | clientB |
     +---------+                             +---------+

Figure 6: Established relayA2relayB Data Channel

ICE calculates a priority for each of the 3 channels, and the channel that is finally selected depends on the calculation results.

For a client, using a TURN cluster SHOULD be like using a single TURN server, which means that the above 3 channels MUST still be establishable through the TURN cluster. Moreover, all requests from the peers of one P2P connection SHOULD be forwarded to the same TURN server in the cluster; otherwise the priority calculation formula becomes unreliable because of the potential extra hop between TURN servers.

3. Architectural and Interactive Process

A single TURN server always serves on a default port (e.g., 3478 for UDP/TCP, 5349 for TLS) and allocates ports for client relay. In order to be compatible with existing TURN implementations, a TURN server in a cluster SHOULD work in a similar way. In addition, the TURN server requires that all allocated ports can be accessed by the client directly. Since it is hard and insecure for a cluster to expose a large number of ports for each server in the cluster, the TURN cluster described in this document provides all services on the default port and ensures the correct routing of packets through the routable transaction ID (described in Section 4.2).

This section describes the architecture of the TURN cluster and introduces the interaction process between client and cluster.

3.1. Overview of the Architectural

The structure of the TURN cluster is not complicated: it has a front-end load balancer, "TURN LB", which acts as the gateway and forwards client requests to the corresponding TURN server, and the TURN servers, which are the equipment that actually provides the service.
As described in Section 1 of [RFC8656], a client using TURN must have some way to communicate the relayed information to its peers and to learn each peer's relay information. Here the "signaling server" described in [RFC8445] represents this component. The network topology (including the internal architecture of the TURN cluster) is depicted in the figure below:

                  +------------------+
          +------>| signaling server |<-------+
          |       +------------------+        |
     +----------+                        +----------+
     | client A |                        | client B |
     +----------+                        +----------+
 10.243.22.200:23768                 10.69.127.39:32102
          |                                   |
          +-------------+       +-------------+
                        |       |
+-----------------------|-------|-----------------------+
| TURN cluster          |       |                       |
|                       v       v                       |
|                   10.11.252.43:3478                   |
|                      +---------+                      |
|                      | TURN LB |                      |
|                      +---------+                      |
|                         |   |                         |
|          +--------------+   |                         |
|          |                  |                         |
|          v                  +----+                    |
|  192.168.1.2:3478                |                    |
| +-------------+                  v    +-------------+ |
| |TURN server 1| 192.168.1.2:61002  ...|TURN server n| |
| +-------------+                       +-------------+ |
+-------------------------------------------------------+

Figure 7: Example Topology of Client and TURN Cluster

The functions of each component are as follows:

Client A/B: All peers of one P2P relay connection.

Signaling server: A server through which clients exchange TURN information with their peers. This specification does not cover its specific process and implementation; implementers can refer to the "signaling server" defined in [RFC8445].

TURN LB: A device that performs two functions: (1) ensures the load balance of all servers in the cluster; (2) ensures that data from all peers of a P2P connection can be routed to an appropriate TURN server.

TURN server: The real TURN service provider.

The core of the architecture design is:

* Provide TURN services through a unified access portal.

* Use TURN LB and the mechanism described in Section 4 to ensure that all packets are routed to the appropriate backend TURN server.

* Each TURN server in the cluster works like a single TURN server. The difference is that the TURN server MUST use ENCRYPTED-RELAYED-ADDRESS (defined in Section 4.1) instead of XOR-RELAYED-ADDRESS to transmit allocation information, in order to avoid exposing internal network information. In addition, since the address information is encrypted in ENCRYPTED-RELAYED-ADDRESS and the client cannot extract it directly, the client MUST use ENCRYPTED-PEER-ADDRESS (defined in Section 4.4) instead of XOR-PEER-ADDRESS to specify the address information of the peer.

3.2. Overview of interaction process

Since the TURN server in the cluster MUST transmit allocation information through ENCRYPTED-RELAYED-ADDRESS to protect the cluster's internal network information, the client cannot obtain the allocated address directly, and srflxA2relayB and relayA2srflxB cannot be established in the usual way. As depicted in Figure 7, all requests can only be sent to the unified access portal of the cluster. To ensure that requests are forwarded correctly, a routing message MUST be carried in each request; when TURN LB receives a request, it MUST extract and parse the routing message and forward the request accordingly.
The overall interaction process is shown in the following figure. The address information comes from Figure 7, and ERA in the figure stands for ENCRYPTED-RELAYED-ADDRESS (defined in Section 4.1):

clientA                    TURN cluster                    clientB
|                             |                             |
|----------TURN Req---------->|                             |
|   (to 10.11.252.43:3478)    |                             |
|                             |                             |
|<---------TURN Resp----------|                             |
|    (carry routing-info-A    |                             |
|           in ERA)           |                             |
|                             |                             |
|--ClientA ICE Candidate Info------------------------------>|
|                             |                             |
|                             |         extract routing-info-A
|                             |         from clientA's ERA
|                             |                             |
|                             |<---------TURN Req-----------|
|                             |   (to 10.11.252.43:3478,    |
|                             |    with routing-info-A)     |
|                             |                             |
|                             |----------TURN Resp--------->|
|                             |    (carry routing-info-B    |
|                             |           in ERA)           |
|                             |                             |
|<-----------------------------clientB ICE Candidate Info---|
|                             |                             |
extract routing-info-B        |                             |
from clientB's ERA            |                             |
|                             |                             |
|<----Connectivity Checks---->|<----Connectivity Checks---->|
|   (to 10.11.252.43:3478,    |   (to 10.11.252.43:3478,    |
|    with routing-info-B)     |    with routing-info-A)     |
|                             |                             |
|<------------Data----------->|<-----------Data------------>|
| (from/to 10.11.252.43:3478) | (from/to 10.11.252.43:3478) |

Figure 8: Interaction Process Between Client and TURN Cluster

3.2.1. ClientA Behavior

When clientA starts an ICE process, it first sends a STUN/TURN request as usual. Since clientA does not yet have any information about the server or clientB, clientA MUST use "Arbitrary-mode" defined in Section 4.2 to generate the transaction ID for these requests. After receiving the Allocate success response, clientA extracts ENCRYPTED-RELAYED-ADDRESS from the response and sends it to clientB in its Candidate Information.

Later clientA receives Candidate Information from clientB, which includes clientB's ENCRYPTED-RELAYED-ADDRESS. ClientA MUST extract routing-info-B from it and start connectivity checks.
To establish the "srflxA2relayB" data channel, the Binding request of clientA SHOULD be sent to the relayed address that clientB obtained from the server, so clientA MUST use "Specific-address-mode" to generate the transaction ID for the Binding request. To establish the "relayA2srflxB" and "relayA2relayB" data channels, the related requests SHOULD be sent to the TURN server that clientA accessed before, so clientA MUST use "Specific-server-mode" to generate the transaction ID for these requests.

Each of the 3 relayed data channels has its own way to transmit application data. For "srflxA2relayB", clientA can simply send UDP datagrams to the unified access portal of the cluster, and the routing records left by the previous Binding request ensure that they are forwarded correctly. For "relayA2srflxB" and "relayA2relayB", there are 2 mechanisms for clientA to send application data to clientB: (1) Send indications (defined in Section 11 of [RFC8656]); (2) binding a channel and sending ChannelData messages (defined in Section 12 of [RFC8656]). For both mechanisms, the client MUST use "Specific-server-mode" to generate the transaction ID of the indication (defined in Section 11 of [RFC8656]); in addition, the client MUST use ENCRYPTED-PEER-ADDRESS (described in Section 4.4) instead of XOR-PEER-ADDRESS to specify the address of the peer. For the Channel mechanism, after a channel has been successfully built by the Binding request, later ChannelData messages are routed by the routing records left by that Binding request.

3.2.2. ClientB Behavior

The behavior of clientB is similar to that of clientA. The difference is that when clientB sends STUN/TURN requests for the first time, it already knows which server it should access from the routing-info-A received from clientA, so clientB MUST use "Specific-server-mode" to generate the transaction ID for these requests.

3.2.3. TURN Cluster Behavior

A TURN service cluster consists of 2 components, TURN LB and TURN server. The TURN LB forwards all packets to the right TURN server, and the TURN server is the actual TURN service provider.

3.2.3.1. TURN LB Behavior

TURN LB forwards packets using two elements:

* A self-maintained routing-map, whose key is concat(client source IP address, client source port) and whose value is concat(upstream TURN server IP address, upstream TURN server port).

* Routing information in the transaction ID.

When a packet arrives, a TURN LB SHOULD parse and process it as follows:

* TURN LB first determines whether the packet is in STUN format. If so, TURN LB extracts the transaction ID from the packet and processes the packet as described in Section 4.3.

* If the packet is not in STUN format, TURN LB extracts the source IP address and port of the packet to form the key and looks up the upstream TURN server IP address and port in the routing-map. If the lookup succeeds, TURN LB forwards the packet to the upstream TURN server directly and refreshes the expiration time of the corresponding routing record. If the lookup fails, TURN LB drops the packet silently.

Moreover, TURN LB SHOULD NOT modify the source IP address and port of the packet, because a TURN cluster MAY still provide STUN service.

3.2.3.2. TURN Server Behavior

For most STUN/TURN messages, the TURN server processes them as defined in [RFC8656], but there are some special requirements for XOR-RELAYED-ADDRESS and XOR-PEER-ADDRESS. Instead of transmitting allocation information in XOR-RELAYED-ADDRESS, the TURN server MUST use ENCRYPTED-RELAYED-ADDRESS described in Section 4.1 to protect internal network information. When the TURN server receives an ENCRYPTED-PEER-ADDRESS attribute, it MUST process it as described in Section 4.4.
In addition, since a TURN server in the cluster MAY also provide STUN service, it SHOULD avoid carrying any attributes (e.g., RESPONSE-ORIGIN, RESPONSE-PORT defined in [RFC5780]) that expose internal network information in the STUN response.

4. Routing Mechanism

This section defines the conventions by which the components in Figure 7 securely generate and transmit routing information. It describes: (1) how the TURN server generates ENCRYPTED-RELAYED-ADDRESS to securely carry routing information; (2) how the client generates a routable transaction ID from ENCRYPTED-RELAYED-ADDRESS and specifies the address of the peer with ENCRYPTED-PEER-ADDRESS; (3) how the TURN LB processes the routable transaction ID and forwards packets.

4.1. Server Generate ENCRYPTED-RELAYED-ADDRESS

ENCRYPTED-RELAYED-ADDRESS is a new STUN attribute defined in this specification, whose attribute type value is TBD1 (IANA is requested to assign TBD1 a value in the range 0x000e-0x000f). The generation of ENCRYPTED-RELAYED-ADDRESS is divided into 3 phases: (1) Preparation phase; (2) Obfuscation phase; (3) Encryption phase.

4.1.1. Preparation Phase

The preparation phase is triggered when the cluster is being established or when the members of the cluster are updated. In the preparation phase, the maintainer of the cluster generates a configuration and synchronizes it to TURN LB and to each TURN server inside the cluster. The configuration consists of 4 parts:

(1) A 2-bit Configuration-ID, which uniquely identifies the configuration when the cluster configuration rotates;

(2) An arbitrary nonnegative integer "divisor", which is used in the obfuscation calculation; "divisor" MUST be larger than the number of TURN servers;

(3) A set of "modulus" values, each of which uniquely identifies one server in the cluster;

(4) A 16-byte "key", which is used in the encryption phase.
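As a rough illustration of the configuration described above, the following Python sketch models its 4 parts and checks the constraints stated in the text. The class name, the field names, and the representation of the "modulus" set as a mapping from server address to modulus are hypothetical. The check that every modulus is smaller than the divisor is an assumption that this document does not state explicitly; it is added here so that a modulus can later be recovered as a remainder.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ClusterConfig:
    """Hypothetical container for one cluster configuration."""
    config_id: int                     # 2-bit Configuration-ID (0..3)
    divisor: int                       # "divisor" used in the obfuscation phase
    modulus_by_server: Dict[str, int]  # assumed form: server address -> "modulus"
    key: bytes                         # 16-byte "key" for the encryption phase

    def __post_init__(self) -> None:
        if not 0 <= self.config_id <= 3:
            raise ValueError("Configuration-ID must fit in 2 bits")
        if len(self.key) != 16:
            raise ValueError("key must be 16 bytes")
        if self.divisor <= len(self.modulus_by_server):
            raise ValueError("divisor must be larger than the number of servers")
        moduli = list(self.modulus_by_server.values())
        if len(set(moduli)) != len(moduli):
            raise ValueError("each TURN server needs its own modulus")
        # Assumption (not stated in the text): 0 <= modulus < divisor.
        if any(not 0 <= m < self.divisor for m in moduli):
            raise ValueError("every modulus must be in [0, divisor)")
```

A configuration that violates one of these constraints is rejected when it is constructed, before it could be synchronized to TURN LB or to a TURN server.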
The maintainer of the cluster MUST perform the following operations in the preparation phase:

* Select a configuration ID for the configuration. The maintainer SHOULD ensure that no clients are still using the configuration corresponding to the selected ID.

* Generate the "divisor", the "modulus" set and the "key" defined in the configuration as required.

* If the cluster currently has a configuration in use, set its state to "wait to be offline".

* Synchronize the new configuration ID, "divisor" and "key" to TURN LB and each TURN server, then assign each TURN server its own "modulus", and synchronize the mapping between each "modulus" and TURN server IP address to TURN LB.

* Set the state of the new configuration to "active".

Note that there MUST be only one configuration in the "active" state. A TURN server MUST NOT generate a new ENCRYPTED-RELAYED-ADDRESS using an old configuration after receiving a new one.

4.1.2. Obfuscation Phase

When a TURN server begins to generate ENCRYPTED-RELAYED-ADDRESS for an Allocate success response, it starts the Obfuscation phase. In the Obfuscation phase, the TURN server uses the divisor and its modulus from the currently used configuration to generate the Obfuscated-address. The structure of the Obfuscated-address is depicted below:

Obfuscated-address {
  Configuration-ID(2),
  Obfuscated-value(30)
}

Obfuscated-value is calculated by adding an arbitrary nonnegative integer multiple of the "divisor" to the server's "modulus", without exceeding the maximum integer value 2^30.

4.1.3. Encryption Phase

After obtaining the Obfuscated-address, the TURN server starts the Encryption phase. It first left-pads the magic cookie with zeros to a 16-byte string and encrypts it with the "key" obtained in the preparation phase. Encryption in the algorithms below uses the AES-128-ECB cipher, and the encryption result is recorded as "mask".
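The Obfuscation phase of Section 4.1.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the Configuration-ID is assumed to occupy the 2 most significant bits of the 32-bit Obfuscated-address, the modulus is assumed to be smaller than the divisor, and the Obfuscated-value is kept strictly below 2^30 so that it fits the 30-bit field.

```python
import secrets

MAX_OBFUSCATED = 1 << 30  # Obfuscated-value is a 30-bit field


def obfuscated_address(config_id: int, divisor: int, modulus: int) -> int:
    """Return Configuration-ID(2) || Obfuscated-value(30) as a 32-bit integer."""
    if not 0 <= config_id <= 3:
        raise ValueError("Configuration-ID must fit in 2 bits")
    if not 0 < divisor <= MAX_OBFUSCATED:
        raise ValueError("divisor out of range")
    if not 0 <= modulus < divisor:  # assumption: modulus < divisor
        raise ValueError("modulus must be in [0, divisor)")
    # Largest multiple k for which modulus + k * divisor still fits in 30 bits.
    max_k = (MAX_OBFUSCATED - 1 - modulus) // divisor
    k = secrets.randbelow(max_k + 1)
    value = modulus + k * divisor
    return (config_id << 30) | value


def split_obfuscated_address(address: int, divisor: int) -> tuple:
    """Recover (Configuration-ID, modulus) from an Obfuscated-address."""
    config_id = address >> 30
    value = address & (MAX_OBFUSCATED - 1)
    # The multiple of the divisor vanishes when the remainder is taken.
    return config_id, value % divisor
```

Because Obfuscated-value equals the modulus plus a multiple of the divisor, taking the remainder modulo the divisor returns the modulus regardless of which multiple was chosen; the random multiple only changes how the value looks on the wire.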
Then the TURN server generates ENCRYPTED-RELAYED-ADDRESS with the "mask". The structure of ENCRYPTED-RELAYED-ADDRESS is shown below:

ENCRYPTED-RELAYED-ADDRESS {
  Attribute-Type(8),
  Reserve-bit(2),
  Encoded-Check-bit(6),
  Encoded-Port(16),
  Encoded-Obfuscated-Address(32)
}

ENCRYPTED-RELAYED-ADDRESS has the following fields:

Attribute-Type: IANA is requested to assign a value for it.

Reserve-bit: A 2-bit value reserved for two special purposes.

The Encoded-Check-bit, Encoded-Obfuscated-Address, Encoded-configuration-ID and Encoded-Port are calculated by the functions defined below:

Encoded-Check-bit = mask[0:6] ^ plaintext-check-bit
Encoded-Port = mask[6:22] ^ allocate-port
Encoded-Obfuscated-Address = mask[22:54] ^ Obfuscated-Address

Here, plaintext-check-bit is a 6-bit value with all bits set to '1', and allocate-port is the 16-bit port value allocated by the TURN server.

4.2. Generation of Routable Transaction ID

As described in [RFC8489], the transaction ID is a 96-bit identifier generated by the client to uniquely identify STUN transactions, and it is always a uniformly and randomly chosen value. In practice, 96 bits is more than is needed for uniqueness, so the transaction ID can be structured so that it not only remains unique but also securely carries some routing information and check information. The structure of a Routable Transaction ID is shown below:

Routable Transaction ID {
  Mode-bit (2),
  Routing-info (6..54),
  Random-bit (40..88),
}

The Mode-bit corresponds to 3 routing modes, each with its own routing information. The 3 modes are:

* Arbitrary-mode: The request can be sent to the default port of any TURN server in the cluster.

* Specific-server-mode: The request MUST be sent to the default port of the specific TURN server.

* Specific-address-mode: The request MUST be sent to the specified port of the specific TURN server.

4.2.1. Arbitrary Mode

The typical scenario for "Arbitrary-mode" is a client sending its first STUN/TURN request to the cluster at the beginning of the ICE process, when it does not have any information about a TURN server. The client SHOULD set the Mode-bit to "00", and the routing information of the transaction ID is just the 6-bit check-bit with all bits set to '1', as depicted below:

Routing-info {
  Check-bit(6)
}

After that, the client generates an 88-bit random string as the Random-bit.

4.2.2. Specific Server Mode

Specific-server-mode is suitable when the client has received ENCRYPTED-RELAYED-ADDRESS from the TURN server or a peer and expects to send a request to the TURN server corresponding to that ENCRYPTED-RELAYED-ADDRESS. For example: (1) the client has established an RTP relay connection on a TURN server and wants to establish an RTP/RTCP connection pair on the same TURN server; (2) the client has received ENCRYPTED-RELAYED-ADDRESS in the peer's Candidate Information and expects to apply for a relay port on the same TURN server.

In this mode, the client MUST set Mode-bit to "01", and the Routing-info structure is depicted below:

Routing-info {
  Encoded-Check-bit(6),
  Encoded-Obfuscated-Address(32),
}

Encoded-Check-bit and Encoded-Obfuscated-Address are obtained directly from ENCRYPTED-RELAYED-ADDRESS. The remaining 56 bits of the transaction ID MUST be a cryptographically random value.

4.2.3. Specific Address Mode

In Specific-address-mode, the client MUST have received ENCRYPTED-RELAYED-ADDRESS and expects to send a request to the specific port of the specific TURN server. A typical scenario is: the client has received ENCRYPTED-RELAYED-ADDRESS in the peer's Candidate Information and expects to send a Binding request to the address carried in that ENCRYPTED-RELAYED-ADDRESS.
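The transaction ID layouts of Sections 4.2.1 and 4.2.2 can be sketched in Python as follows. The function and constant names are hypothetical, and the sketch assumes that the fields are packed most-significant-bit first into the 96-bit (12-byte) transaction ID, which this document does not state explicitly. The encoded fields are taken as integers already extracted from an ENCRYPTED-RELAYED-ADDRESS.

```python
import secrets

MODE_ARBITRARY = 0b00
MODE_SPECIFIC_SERVER = 0b01
CHECK_BITS_ALL_ONES = 0b111111


def arbitrary_transaction_id() -> bytes:
    """Mode-bit(2) = 00 | Check-bit(6) = 111111 | Random-bit(88)."""
    value = (
        (MODE_ARBITRARY << 94)
        | (CHECK_BITS_ALL_ONES << 88)
        | secrets.randbits(88)
    )
    return value.to_bytes(12, "big")


def specific_server_transaction_id(encoded_check_bit: int,
                                   encoded_obfuscated_address: int) -> bytes:
    """Mode-bit(2) = 01 | Encoded-Check-bit(6) |
    Encoded-Obfuscated-Address(32) | Random-bit(56)."""
    if not 0 <= encoded_check_bit < (1 << 6):
        raise ValueError("Encoded-Check-bit must fit in 6 bits")
    if not 0 <= encoded_obfuscated_address < (1 << 32):
        raise ValueError("Encoded-Obfuscated-Address must fit in 32 bits")
    value = (
        (MODE_SPECIFIC_SERVER << 94)
        | (encoded_check_bit << 88)
        | (encoded_obfuscated_address << 56)
        | secrets.randbits(56)
    )
    return value.to_bytes(12, "big")
```

With this packing, the Mode-bit and the check bits share the first byte of the transaction ID, and in Specific-server-mode the next four bytes carry the Encoded-Obfuscated-Address unchanged, so a load balancer can read the routing information from fixed offsets.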
In this mode, the client MUST set Mode-bit to "10", and the Routing-info structure is depicted below:

   Routing-info {
     Encoded-Check-bit(6),
     Encoded-Obfuscated-Address(32),
     Encoded-Port(16),
   }

The client extracts Encoded-Check-bit, Encoded-Port and Encoded-Obfuscated-Address from ENCRYPTED-XOR-RELAY-ADDRESS and copies them into the transaction ID, then generates a 40-bit random string to fill the rest of the transaction ID.

4.2.4. Uniqueness of Transaction ID

This section gives a simple analysis of the uniqueness of the routable transaction ID. The routable transaction ID still depends on a large enough value range and on random selection to ensure uniqueness. The routing part reduces the value range of the transaction ID. To keep the value range from becoming too small, this specification suggests the obfuscated way of encoding the address. The value range is then determined by two factors: the length of the random bits and the number of machines N in the cluster. The value range of the routable transaction ID under the three modes is shown in the table below:

   +-------------------------+------------------+
   | mode                    | value range      |
   +-------------------------+------------------+
   | Arbitrary Mode          | 0 - 2^88         |
   +-------------------------+------------------+
   | Specific Server Mode    | 0 - (2^88)/N     |
   +-------------------------+------------------+
   | Specific Address Mode   | 0 - (2^72)/N     |
   +-------------------------+------------------+

The exponents follow from the field widths: 32 address bits plus 56 random bits in Specific Server Mode, and 32 address bits plus 40 random bits in Specific Address Mode.

In a production environment, the number of machines in a TURN cluster is not particularly large, so the value range of Arbitrary Mode and Specific Server Mode is sufficient for most scenarios. In Specific Address Mode, only related peers use this mode to access the same address, so it works well without a particularly large value range.

4.3. TURN LB Process Transaction ID

When a TURN LB receives a TURN packet, it first extracts the first 2 bits of the transaction ID. If the first 2 bits are "11", the TURN LB drops the packet silently. Otherwise, the TURN LB determines the client's mode from these 2 bits.

For Arbitrary Mode requests, the TURN LB checks whether the next 6 bits are all '1'. If not, the TURN LB SHOULD drop the packet silently. If so, the TURN LB forwards the packet to the default port of a backend TURN server chosen according to each server's load.

For Specific Server Mode and Specific Address Mode requests, the TURN LB first generates the "mask" as defined in the encryption phase of Section 4.1, and calculates plaintext-check-bit and Obfuscated-Address as below:

   plaintext-check-bit = mask[0:6] ^ Encoded-Check-bit
   Obfuscated-Address = mask[22:54] ^ Encoded-Obfuscated-Address

The TURN LB then checks whether all bits of plaintext-check-bit are '1'. If the check fails, the TURN LB drops the packet silently. If it succeeds, the TURN LB SHOULD perform the following steps:

1. Extract the configuration ID and Obfuscated-value from Obfuscated-Address, and get the configuration corresponding to the configuration ID.

2. Express Obfuscated-value as an unsigned integer, and divide it by the "divisor" to get the "modulus" of the request.

3. Use the "modulus" to look up the TURN server IP address in the map maintained by the TURN LB. If the "modulus" cannot be mapped to any TURN server, drop the packet silently.

4. If the TURN server selected in step 3 is offline because of configuration rotation, the TURN LB SHOULD send an error response to the client with ERR_CODE set to TBD3 (IANA is requested to assign a "4xx" error code for this value, indicating that the request failed because of a configuration problem).

5. If the TURN server selected in step 3 is working, the TURN LB forwards the packet according to the mode. For Specific Server Mode, the TURN LB forwards the packet to the default port of the TURN server. For Specific Address Mode, the TURN LB forwards the packet to the specific port of the TURN server.

The specific port in step 5 is calculated as below:

   allocate-port = mask[6:22] ^ Encoded-Port

4.4. ENCRYPTED-PEER-ADDRESS

ENCRYPTED-PEER-ADDRESS is a new STUN attribute defined in this specification. Its attribute type is TBD2 (IANA is requested to assign TBD2 a value in the range 0x000e-0x000f). Similar to XOR-PEER-ADDRESS, ENCRYPTED-PEER-ADDRESS tells the server the address and port of the peer; it applies when the peer's address and port are contained in an ENCRYPTED-RELAYED-ADDRESS. ENCRYPTED-PEER-ADDRESS has the same structure as ENCRYPTED-RELAYED-ADDRESS.

The TURN server MUST perform the following steps to process the ENCRYPTED-PEER-ADDRESS attribute:

1. Calculate plaintext-check-bit, allocate-port and Obfuscated-Address using the mask and formulas defined in Section 4.1.

2. Check whether all bits of plaintext-check-bit are '1'. If the check fails, the TURN server SHOULD drop the packet silently.

3. Extract the configuration ID and Obfuscated-value from Obfuscated-Address, and get the "divisor" and "modulus" of the server by configuration ID.

4. Express Obfuscated-value as an unsigned integer, and divide it by the "divisor" to get the "modulus" of the request.
   Check whether the "modulus" of the request equals the "modulus" of the server. If not, the TURN server SHOULD send an error response to the client with ERR_CODE set to TBD4 (IANA is requested to assign a "4xx" error code for this value, indicating that the request failed because it reached an inappropriate server). If they are equal, the TURN server sends the packet to the corresponding address.

The check in step 4 is based on the following consideration: since the cluster provides routing mechanisms, all peers of a relayed channel SHOULD be connected to the same server to avoid extra hops in the network.

4.5. TLS Consideration

The TURN LB forwards most STUN/TURN requests based on the transaction ID. If these messages are transmitted over DTLS-over-UDP or TLS-over-TCP, the TURN LB cannot see the transaction ID directly. In these cases, the TURN LB MUST also act as a TLS offload device to obtain the plaintext transaction ID.

5. Security Consideration

This document describes an architectural framework for building large-scale TURN clusters. Since an attacker cannot obtain network information about a TURN server inside the cluster, attacks based on source address forgery (e.g., the TURN loop attack) can be effectively prevented. However, a TURN cluster is still exposed to most attacks against a single TURN server. This section discusses possible attacks on a TURN cluster. For attacks discussed in Section 21 of [RFC8656] that are not mentioned in this section, the analysis given there remains valid for a TURN cluster.

5.1. DoS Against TURN Cluster

An attacker might generate a large number of legitimate allocation requests and flood the cluster with them to exhaust the available ports of all TURN servers in the cluster. Since all requests are legitimate, the attack cannot be prevented directly.
The maintainer of the TURN cluster can mitigate this attack with custom address-based rules that limit the number of allocation requests from the same source address.

5.2. DoS Against a Single TURN Server

Since the routing information in the transaction ID is encrypted and checked, it is hard for an attacker to construct a large number of legitimate TURN requests to attack a single TURN server. However, ChannelData messages are routed by address, so an attacker might obtain a ChannelData message and flood the corresponding channel with traffic. This attack is mitigated by having the server limit the amount of bandwidth it will relay for a given username, or by using (D)TLS to prevent forgery of legitimate ChannelData messages.

6. IANA Consideration

IANA is requested to assign the type values for the attributes ENCRYPTED-RELAYED-ADDRESS (defined in Section 4.1) and ENCRYPTED-PEER-ADDRESS (defined in Section 4.4).

   +----------------+---------------------------+-----------------+
   | attribute type | description               | reference       |
   +----------------+---------------------------+-----------------+
   |                | value for                 |                 |
   | TBD1           | ENCRYPTED-RELAYED-ADDRESS,| this RFC        |
   |                | used to carry relayed     |                 |
   |                | address safely            |                 |
   +----------------+---------------------------+-----------------+
   |                | value for                 |                 |
   | TBD2           | ENCRYPTED-PEER-ADDRESS,   | this RFC        |
   |                | used to carry peer address|                 |
   |                | safely                    |                 |
   +----------------+---------------------------+-----------------+

IANA is requested to assign the error codes TBD3 (defined in Section 4.3) and TBD4 (defined in Section 4.4) depicted below:

   +----------+------------------------+-----------------+
   | err code | description            | reference       |
   +----------+------------------------+-----------------+
   |          | request failed due to  |                 |
   | TBD3     | server configuration   | this RFC        |
   |          | rotation               |                 |
   +----------+------------------------+-----------------+
   |          | request failed because |                 |
   | TBD4     | the client accessed an | this RFC        |
   |          | inappropriate server   |                 |
   +----------+------------------------+-----------------+

7. Contributors

The authors would like to thank HongQuan.Z (hongquan.zhq@antgroup.com), jim (jim.pj@alibaba-inc.com), Y.Chen (cy119846@antgroup.com), Han.X (han.xiao@antgroup.com), Bin.Y (yb261973@antgroup.com), XiaoKang.Q (xiaokang.qxk@antgroup.com), and LingTao.K (lingtao.klt@antgroup.com) for their contributions to this document.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC5780] MacDonald, D. and B. Lowekamp, "NAT Behavior Discovery Using Session Traversal Utilities for NAT (STUN)", RFC 5780, DOI 10.17487/RFC5780, May 2010.

[RFC7478] Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real-Time Communication Use Cases and Requirements", RFC 7478, DOI 10.17487/RFC7478, March 2015.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8445] Keranen, A., Holmberg, C., and J. Rosenberg, "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal", RFC 8445, DOI 10.17487/RFC8445, July 2018.

[RFC8489] Petit-Huguenin, M., Salgueiro, G., Rosenberg, J., Wing, D., Mahy, R., and P. Matthews, "Session Traversal Utilities for NAT (STUN)", RFC 8489, DOI 10.17487/RFC8489, February 2020.

[RFC8656] Reddy, T., Ed., Johnston, A., Ed., Matthews, P., and J. Rosenberg, "Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN)", RFC 8656, DOI 10.17487/RFC8656, February 2020.

[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021.

8.2. Informative References

[TURN-Load-balance] "TURN Performance and Load Balance", n.d.

Author's Address

William Zeng
Ant Group
Email: william.zk@antfin.com