```
```Denmark

```
```
janus@steinwurf.com

```
```

```
```
Code On Network Coding LLC
Cambridge
```
```USA

```
```
xshi@alum.mit.edu

```
```

```
```
Code On Network Coding LLC
Cambridge
```
```USA

```
```
fouli@codeontechnologies.com

```
```

```
```
Code On Network Coding LLC
Cambridge
```
```USA

```
```
muriel.medard@codeontechnologies.com

```
```

```
```
Inmarsat PLC
London
```
```United Kingdom

```
```
Vince.Chook@inmarsat.com

```
```

```
```
IRTF
NWCRG
This document describes a symbol representation for Random Linear
Network Coding (RLNC) schemes used for reliable data transfer.
Specifically, the following features are discussed and incorporated:
both block RLNC and a sliding window RLNC, varying data frame sizes, and
one or multiple symbols associated with a single symbol representation
header.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 .

```
```
Symbol representation specifies the format of the symbol-carrying
data unit that is to be used in network coding operations, including
header format and symbol concatenation. This document describes a symbol
representation format intended to be used for Network Coding in general,
and for Random Linear Network Coding (RLNC) in particular .
Owing to its dynamic structure, network coding has requirements that
are distinct from conventional point-to-point codes, leading to a highly
reconfigurable symbol set. Consequently, the design choices related to
symbol representation are particularly important in network coding as
they have a direct impact on the viability of network protocols,
topologies, and architecture . For
example, recoding requires the
coefficients to be accessible at the recoding nodes. Hence,
architectures and protocols requiring recoding must specify coefficient
location in their symbol representation.
In addition to providing background on RLNC, argues that careful design and specification
of a symbol representation is a requirement for any viable network
coding protocol, architecture, or topology.
This section provides a symbol representation design for implementing
RLNC-based erasure correction schemes. In the decribed symbol
representation design, multiple symbols are concatenated and associated
with a single symbol representation header.
The symbol representation design is provided for constructing a data
payload portion of a data packet for a protocol that utilizes a
generation-based or sliding-window RLNC, where recoding can be used at
intermediate nodes. A data packet data payload comprises one or more
symbol representations. Each symbol representation in turn comprises one
or more symbols that can be systematic, coded or recoded. The use of
this symbol representation design is not limited by transmission
schemes. It can be applied to unicast, multiple-unicast, multicast,
multi-path, and multi-source settings and the like.
Coding coefficient vectors must be implicitly or explicitly
transmitted from the sender to the receiver, generally along with the
coded data for successful decoding of the original data. One option is
to attach each coding coefficient vector to the corresponding coded
symbol as a header, thus also enabling recoding at intermediate nodes.
Another option is to attach the current state of a pseudo-random
generator for generating the coding coefficient vector, to reduce the
size of the header. Adding a header to each symbol may result in a high
overhead when the symbol size is small or when generation or sliding
window size is large. Adding a joint header to the beginning of each
generation may also cause synchronization to be re-initiated only at the
beginning of each generation instead of every symbol. In what follows, a
symbol representation is provided that allow for both of these options
such that both a general representation with coding coefficients and a
compact representation with a seed for generating the coding
coefficients can be used, in order to reduce the header overhead.
This section specifies a symbol representation that enables both a
general form with coding vectors attached, and a compact form where a
seed is attached instead for the first symbol in the symbol
representation, and where subsequent coding vectors are deduced from
the first one. Different maximum generation and window size are
supported for RLNC encoding, recoding, and decoding.
To encode over a set of data symbols, a coding vector as described
in Section 1.1 is first generated, comprising a number of finite field
elements as specified by a generation/window size variable. In the
case of systematic codes, systematic symbols correspond to unit coding
vectors.
Figure 1 illustrates the general symbol representation design. Four
header fields precede the symbol data: TYPE flag (T), SYMBOLS, ENCODER
RANK, and SEED or CODING COEFFICIENTS. The TYPE Flag (T) indicates if
the symbol is systematic, coded, or recoded. SYMBOLS indicates the
number of symbols in the “Symbol(s) Data” field. ENCODER
RANK represents the current rank of the encoder, which is the number
of symbols being linearly combined. SEED is used to generate the
coding coefficient vector(s) using a pseudo-random number generator,
for a compact form of the symbol representation. The CODING
COEFFICIENTS field is a list of SYMBOLS number of coding vectors used
to generate the ensuing SYMBOL(S) DATA.
The TYPE Flag (T) indicates if the symbol is systematic, coded, or
recoded, and has the following properties:
2 bits long.
If the TYPE flag is ‘1’, all symbols included in
this symbol representation are systematic or uncoded, with symbol
index starting from ENCODER RANK. This option allows for efficient
representation of systematic symbols.
If the TYPE is '2', all symbols included in this symbol
representation are coded, with coding vectors generated using the
included SEED and the ENCODER RANK. This option allows for compact
and efficient representation of coded symbols, which may also
subsequently be recoded.
If the TYPE is '3', all symbols included in this symbol
representation are either coded or recoded, with the coding
coefficients included constituting ENCODER RANK coefficients
each.

SYMBOLS indicates the number of symbols in the “Symbol(s)
Data” field, and has the following properties:
4 bits long. A maximum number of 15 symbols are concatenated
within each symbol representation.
The special case of SYMBOLS = 0 indicates that zero symbols are
included, and consequently the size of ‘Symbol(s)
Data’ is 0 bytes. This can, for example, be used to
implement a flush functionality or ensure that protocol operations
do not stop in certain case for purely event-driven protocols.

ENCODER RANK represents the current rank of the encoder, and has
the following properties:
MUST be no larger than generation/window size.
If TYPE flag is '1', ENCODER RANK is the symbol index of the
first data symbol in this symbol representation.
If TYPE flag is '2' or '3', ENCODER RANK is the number of data
symbols over which coding was performed for all coded symbols in
this symbol representation.
Coded symbols can be generated before a generation or window is
filled. ENCODER RANK describes the number of original symbols
included in the coded symbol(s).

SEED is used to generate the coding coefficient vector(s) using a
pseudo-random number generator, for a compact form of the symbol
representation, and has the following properties:
The SEED field is only present when TYPE flag is '2'. If TYPE
is '1' or '3', this field is absent.
The pseudo-random generator MUST be seeded with this value and
all coding coefficient vectors are produced by the same generator.
For example, if ENCODER RANK is 12, then coding vector for the
first symbol in this symbol representation is coefficients 0
through 11 generated by the pseudo-random generator seeded by
SEED, and coding vector for the second symbol in this symbol
representation is coefficients 12 through 23 generated by the
pseudo-random generator seeded by SEED. If generation/window size
is larger than ENCODER RANK, the remaining coefficients in the
coding vector are zero.
To ensure that SEED can be interpreted correctly at the
receiver, the same pseudo-random number generator MUST be used by
the sender and a recoding or receiving node. Otherwise, more than
one SEED field would need to be used.
8 bits long. Thus, 256 different seed values can be served. One
SEED is used per symbol representation, each of which can contain
up to 15 symbols, all derived using the same SEED. For distinct
ENCODER RANKs, different coding vectors would be generated from
the same SEED, since only an ENCODER RANK number of coefficients
from the random generator is grouped as a coding coefficient
vector, before progressing to the next coding vector for the next
symbol in the symbol representation. Consequently, the maximal
number of coded symbols that can be generated for a generation is
|SEED| * |SYMBOLS| * |ENCODER RANK| which in the best case is
(2^8)*(2^4-1)*(2^10) ~ 2^22, which for all practical
considerations can be considered as an infinite number of coded
symbols. If all coded symbols that can be represented using a SEED
is exhausted, symbols where the coding vectors is included can be
sent instead.
In the case where no random number generator is available, or
where its use is not desired, the coding coefficients can be
produced by other means, such as functions of the data, state of
the network, or the like, and transmitted explicitly by setting
the TYPE flag to ‘3’.

CODING COEFFICIENTS field is a list of SYMBOLS number of coding
vectors used to generate the ensuing SYMBOL(S) DATA, and has the
following properties:
The CODING COEFFICIENT field is only present when TYPE flag is
'3'. If TYPE is '1' or '2', this field is absent.
Each coding vector comprises ENCODER RANK number of coding
coefficients, each coding coefficient having a predetermined field
size.

In a first small encoding window symbol representation, ENCODER
RANK is 10 bits long, and the maximum generation/window size is
2^10.
Figures 2 to 4 below illustrate systematic, coded, and recoded
symbol representations within an encoding window of size 2^10.
Systematic symbols are uncoded. Coded symbols are compact in form and
comprise a seed for coding coefficient generation. Recoded symbols are
general in form and comprise the coding coefficient vectors
explicitly.
The following examples show different symbol representations for
an illustrative case where the symbol size is 2 bytes,
generation/window size is 8, and field size is 2^8.
Example 1: Three systematic symbols with ID 0, 1 and 2. As the
TYPE flag is '1' , SEED/CODING COEFFICIENTS is absent, and ENCODER
RANK is the symbol index of the first data symbol with ID 0 in this
compact symbol representation.
Example 2: Two coded symbols using a compact representation. In
this example, TYPE is '2', the SEED to the pseudo-random number
generator shared by the sender and receiver is 4. The coding vector
for Symbol A is coefficients 0 to 7 generated by the pseudo-random
number generator, the coding vector for symbol B is coefficients 8
to 15 generated by the pseudo-random number generator.
Example 3: Two recoded symbols. Coefficients A0 to A7 constitute
the coding vector for Symbol A, coefficients B0 to B7 constitute the
coding vector for symbol B. In practical implementations, symbol
sizes are much larger than 2, leading to amortization of the coding
coefficient overheads.
In a second large encoding window symbol representation, ENCODER
RANK is 18-bit long, and the maximum generation/window size is
2^18.
Figures 8 to 10 below illustrate systematic, coded, and recoded
symbol representations within an encoding window of size 2^18.
Systematic symbols are uncoded. Coded symbols are compact in form and
comprise a seed for coding coefficient generation. Recoded symbols are
general in form and comprise the coding coefficient vectors explicitly
(CODING COEFFICIENTS or CODING COEFFS).
This document does not present new security considerations.
This document has no actions for IANA.
Random Linear Network Coding (RLNC): Background and Practical
Considerations
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```

```
```

```
```

```
```
The Benefits of Coding over Routing in a Randomized
Setting
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```
```
```

```
```

```
```

```
```

```
```

```
```

```
```

```
```

```
```