rfc9804.original | rfc9804.txt | |||
---|---|---|---|---|
Network Working Group R. Rivest | Internet Engineering Task Force (IETF) R. Rivest | |||
Internet-Draft MIT CSAIL | Request for Comments: 9804 MIT CSAIL | |||
Intended status: Informational D. Eastlake | Category: Informational D. Eastlake | |||
Expires: 13 July 2025 Independent | ISSN: 2070-1721 Independent | |||
9 January 2025 | June 2025 | |||
Simple Public Key Infrastructure (SPKI) S-Expressions | Simple Public Key Infrastructure (SPKI) S-Expressions | |||
draft-rivest-sexp-13 | ||||
Abstract | Abstract | |||
This memo specifies the data structure representation that was | This memo specifies the data structure representation that was | |||
devised to support Simple Public Key Infrastructure (SPKI, RFC 2692) | devised to support Simple Public Key Infrastructure (SPKI) | |||
certificates and with the intent that it be more widely applicable. | certificates, as detailed in RFC 2692, with the intent that it be | |||
It has been and is being used elsewhere. There are multiple | more widely applicable. It has been and is being used elsewhere. | |||
implementations in a variety of programming languages. Uses of this | There are multiple implementations in a variety of programming | |||
representation are referred to in this document as "S-expressions". | languages. Uses of this representation are referred to in this | |||
This memo makes precise the encodings of these SPKI S-expressions: it | document as "S-expressions". This memo makes precise the encodings | |||
gives a "canonical form" for them, describes two "transport" | of these SPKI S-expressions: It gives a "canonical form" for them, | |||
representations, and also describes an "advanced" format for display | describes two "transport" representations, and also describes an | |||
to people. | "advanced" format for display to people. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This document is not an Internet Standards Track specification; it is | |||
provisions of BCP 78 and BCP 79. | published for informational purposes. | |||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Not all documents | |||
approved by the IESG are candidates for any level of Internet | ||||
Standard; see Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 13 July 2025. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9804. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2025 IETF Trust and the persons identified as the | Copyright (c) 2025 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | ||||
Please review these documents carefully, as they describe your rights | carefully, as they describe your rights and restrictions with respect | |||
and restrictions with respect to this document. Code Components | to this document. Code Components extracted from this document must | |||
extracted from this document must include Revised BSD License text as | include Revised BSD License text as described in Section 4.e of the | |||
described in Section 4.e of the Trust Legal Provisions and are | Trust Legal Provisions and are provided without warranty as described | |||
provided without warranty as described in the Revised BSD License. | in the Revised BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction | |||
1.1. Uses of S-Expressions . . . . . . . . . . . . . . . . . . 4 | 1.1. Uses of S-Expressions | |||
1.2. Formalization . . . . . . . . . . . . . . . . . . . . . . 5 | 1.2. Formalization | |||
1.3. Historical Note . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Historical Note | |||
1.4. Conventions Used in This Document . . . . . . . . . . . . 5 | 1.4. Conventions Used in This Document | |||
2. S-expressions -- informal introduction . . . . . . . . . . . 5 | 2. S-expressions -- Informal Introduction | |||
3. Character set . . . . . . . . . . . . . . . . . . . . . . . . 6 | 3. Character Set | |||
4. Octet-string representation types . . . . . . . . . . . . . . 7 | 4. Octet-String Representation Types | |||
4.1. Verbatim representation . . . . . . . . . . . . . . . . . 7 | 4.1. Verbatim Representation | |||
4.2. Quoted-string representation . . . . . . . . . . . . . . 8 | 4.2. Quoted-String Representation | |||
4.3. Token representation . . . . . . . . . . . . . . . . . . 9 | 4.3. Token Representation | |||
4.4. Hexadecimal representation . . . . . . . . . . . . . . . 10 | 4.4. Hexadecimal Representation | |||
4.5. Base-64 representation of octet-strings . . . . . . . . . 10 | 4.5. Base-64 Representation of Octet-Strings | |||
4.6. Display-Hints and Internationalization . . . . . . . . . 11 | 4.6. Display-Hints and Internationalization | |||
4.7. Comparison of octet-strings . . . . . . . . . . . . . . . 12 | 4.7. Comparison of Octet-Strings | |||
5. Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 | 5. Lists | |||
6. S-expression representation types . . . . . . . . . . . . . . 13 | 6. S-Expression Representation Types | |||
6.1. Base-64 representation of S-expressions . . . . . . . . . 14 | 6.1. Base-64 Representation of S-Expressions | |||
6.2. Canonical representation . . . . . . . . . . . . . . . . 14 | 6.2. Canonical Representation | |||
6.3. Basic transport representation . . . . . . . . . . . . . 15 | 6.3. Basic Transport Representation | |||
6.4. Advanced transport representation . . . . . . . . . . . . 15 | 6.4. Advanced Transport Representation | |||
7. ABNF of the syntax . . . . . . . . . . . . . . . . . . . . . 15 | 7. ABNF of the Syntax | |||
7.1. ABNF for advanced transport . . . . . . . . . . . . . . . 16 | 7.1. ABNF for Advanced Transport | |||
7.2. ABNF for canonical . . . . . . . . . . . . . . . . . . . 17 | 7.2. ABNF for Canonical | |||
7.3. ABNF for basic transport . . . . . . . . . . . . . . . . 17 | 7.3. ABNF for Basic Transport | |||
8. Restricted S-expressions . . . . . . . . . . . . . . . . . . 17 | 8. Restricted S-Expressions | |||
9. In-memory representations . . . . . . . . . . . . . . . . . . 18 | 9. In-Memory Representations | |||
9.1. List-structure memory representation . . . . . . . . . . 18 | 9.1. List-Structure Memory Representation | |||
9.2. Array-layout memory representation . . . . . . . . . . . 18 | 9.2. Array-Layout Memory Representation | |||
9.2.1. Octet-string . . . . . . . . . . . . . . . . . . . . 19 | 9.2.1. Octet-String | |||
9.2.2. Octet-string with display-hint . . . . . . . . . . . 19 | 9.2.2. Octet-String with Display-Hint | |||
9.2.3. List . . . . . . . . . . . . . . . . . . . . . . . . 19 | 9.2.3. List | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | 10. Security Considerations | |||
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | 11. IANA Considerations | |||
12. Normative References . . . . . . . . . . . . . . . . . . . . 20 | 12. References | |||
13. Informative References . . . . . . . . . . . . . . . . . . . 21 | 12.1. Normative References | |||
Appendix A. Implementations . . . . . . . . . . . . . . . . . . 23 | 12.2. Informative References | |||
Appendix B. Change History . . . . . . . . . . . . . . . . . . . 24 | Appendix A. Implementations | |||
B.1. -00 Changes . . . . . . . . . . . . . . . . . . . . . . . 24 | Acknowledgements | |||
B.2. Changes from -00 to -01 . . . . . . . . . . . . . . . . . 24 | Contributors | |||
B.3. Changes from -01 to -02 . . . . . . . . . . . . . . . . . 25 | Authors' Addresses | |||
B.4. Changes from -02 to -03 . . . . . . . . . . . . . . . . . 25 | ||||
B.5. Changes from -03 to -04 . . . . . . . . . . . . . . . . . 25 | ||||
B.6. Changes from -04 to -05 . . . . . . . . . . . . . . . . . 25 | ||||
B.7. Changes from -05 to -06 . . . . . . . . . . . . . . . . . 25 | ||||
B.8. Changes from -06 to -07 . . . . . . . . . . . . . . . . . 26 | ||||
B.9. Changes from -07 to -08 . . . . . . . . . . . . . . . . . 26 | ||||
B.10. Changes from -08 to -09 . . . . . . . . . . . . . . . . . 27 | ||||
B.11. Changes from -09 to -10 . . . . . . . . . . . . . . . . . 27 | ||||
B.12. Changes from -10 to -12 . . . . . . . . . . . . . . . . . 27 | ||||
B.13. Changes from -12 to -13 . . . . . . . . . . . . . . . . . 27 | ||||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 27 | ||||
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . 27 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 27 | ||||
1. Introduction | 1. Introduction | |||
This memo specifies the data structure representation that was | This memo specifies the data structure representation that was | |||
devised to support Simple Public Key Infrastructure (SPKI) [RFC2692] | devised to support Simple Public Key Infrastructure (SPKI) | |||
certificates and with the intent that it be more widely applicable | certificates [RFC2692], with the intent that it be more widely | |||
(see Section 1.3, History). It is suitable for representing | applicable (see Section 1.3, "Historical Note"). It is suitable for | |||
arbitrary, complex data structures and has been and is being used | representing arbitrary, complex data structures and has been and is | |||
elsewhere. Uses of this representation herein are referred to as | being used elsewhere. Uses of this representation herein are | |||
"S-expressions". | referred to as "S-expressions". | |||
This memo makes precise the encodings of these SPKI S-expressions: it | This memo makes precise the encodings of these SPKI S-expressions: It | |||
gives a "canonical form" for them, describes two "transport" | gives a "canonical form" for them, describes two "transport" | |||
representations, and also describe an "advanced" format for display | representations, and also describes an "advanced" format for display | |||
to people. There are multiple implementations of S-expressions in a | to people. There are multiple implementations of S-expressions in a | |||
variety of programming languages including Python, Ruby, and C (see | variety of programming languages including Python, Ruby, and C (see | |||
Appendix A). | Appendix A). | |||
These S-expressions are either byte-strings ("octet-strings") or | These S-expressions are either byte-strings ("octet-strings") or | |||
lists of simpler S-expressions. Here is a sample S-expression: | lists of simpler S-expressions. Here is a sample S-expression: | |||
(snicker "abc" (#03# |YWJj|)) | (snicker "abc" (#03# |YWJj|)) | |||
It is a list of length three containing the following: | It is a list of length three containing the following: | |||
* the octet-string "snicker" | * the octet-string "snicker" | |||
* the octet-string "abc" | * the octet-string "abc" | |||
* a sub-list containing two elements: the hexadecimal constant #03# | * a sub-list containing two elements: The hexadecimal constant #03# | |||
(which represents a one octet long octet-string with the value of | (which represents a one-octet-long octet-string with the value of | |||
that octet being 0x03) and the base-64 constant |YWJj| (which | that octet being 0x03) and the base-64 constant |YWJj| (which | |||
represents the same octet-string as "abc") | represents the same octet-string as "abc") | |||
This document specifies how to construct and use these S-expressions. | This document specifies how to construct and use these S-expressions. | |||
The design goals for S-expressions were as follows: | The design goals for S-expressions were as follows: | |||
generality: S-expressions should be good at representing arbitrary | * Generality: S-expressions should be good at representing arbitrary | |||
data. | data. | |||
readability: It should be easy for someone to examine and understand | * Readability: It should be easy for someone to examine and | |||
the structure of an S-expression. | understand the structure of an S-expression. | |||
economy: S-expressions should represent data compactly. | * Economy: S-expressions should represent data compactly. | |||
transportability: S-expressions should be easy to transport over | * Transportability: S-expressions should be easy to transport over | |||
communication media (such as email) that are known to be less than | communication media (such as email) that are known to be less than | |||
perfect. | perfect. | |||
flexibility: S-expressions should make it relatively simple to | * Flexibility: S-expressions should make it relatively simple to | |||
modify and extend data structures. | modify and extend data structures. | |||
canonicalization: It should be easy to produce a unique "canonical" | * Canonicalization: It should be easy to produce a unique | |||
form of an S-expression, for digital signature purposes. | "canonical" form of an S-expression, for digital signature | |||
purposes. | ||||
efficiency: S-expressions should admit in-memory representations | * Efficiency: S-expressions should admit in-memory representations | |||
that allow efficient processing. | that allow efficient processing. | |||
For implementors of new applications and protocols other technologies | For implementors of new applications and protocols other technologies | |||
also worthy of consideration include the following: [XML], CBOR | also worthy of consideration include the following: XML [XML], CBOR | |||
[RFC8949], and JSON [RFC7159]. | [RFC8949], and JSON [RFC8259]. | |||
1.1. Uses of S-Expressions | 1.1. Uses of S-Expressions | |||
The S-expressions specified herein are in active use today between | The S-expressions specified herein are in active use today between | |||
GnuPG [GnuPG] and Ribose's RNP [Ribose]. Ribose has implemented C++ | GnuPG [GnuPG] and Ribose's RNP [Ribose]. Ribose has implemented C++ | |||
software to compose and parse these S-expressions [RNPGP_SEXPP]. The | software to compose and parse these S-expressions [RNPGP_SEXPP]. The | |||
GNU software is here [Libgcrypt] and there are other implementations | GNU software is the Libgcrypt library [Libgcrypt], and there are | |||
(see Appendix A). | other implementations (see Appendix A). | |||
They are used or referenced in the following RFCs: | They are used or referenced in the following RFCs: | |||
* [RFC2693] for [SPKI] | * [RFC2693] for [SPKI] | |||
* [RFC3275] XML-Signature Syntax and Processing | * [RFC3275] XML-Signature Syntax and Processing | |||
In addition, S-Expressions are the inspiration for the encodings in | In addition, S-expressions are the inspiration for the encodings in | |||
other protocols. For example, [RFC3259] or Section 6 of | other protocols. For example, [RFC3259] or Section 6 of | |||
[CDDLfreezer]. | [CDDL-freezer]. | |||
1.2. Formalization | 1.2. Formalization | |||
An Internet Draft [formal] has been posted showing a formal model of | [Formal] is an Internet-Draft that shows a formal model of SPKI | |||
SPKI S-Expressions and which formally demonstrates that the examples | S-expressions and formally demonstrates that the examples and ABNF in | |||
and ABNF in this document are correct. | this document are correct. | |||
1.3. Historical Note | 1.3. Historical Note | |||
The S-expressions described here were originally developed for "SDSI" | The S-expressions described here were originally developed for "SDSI" | |||
(the Simple Distributed Security Infrastructure by Lampson and Rivest | (the Simple Distributed Security Infrastructure by Lampson and Rivest | |||
[SDSI]) in 1996, although their origins date back to McCarthy's | [SDSI]) in 1996, although their origins date back to McCarthy's | |||
[LISP] programming language. They were further refined and improved | [LISP] programming language. They were further refined and improved | |||
during the merger of SDSI and SPKI [SPKI] [RFC2692] [RFC2693] during | during the merger of SDSI and SPKI [SPKI] [RFC2692] [RFC2693] during | |||
the first half of 1997. S-expressions are more readable and flexible | the first half of 1997. S-expressions are more readable and flexible | |||
than, Bernstein's "net-strings" [BERN], which were developed | than Bernstein's "netstrings" [BERN], which were developed | |||
contemporaneously. | contemporaneously. | |||
| Although a specification was made publicly available as a file | | Although a specification was made publicly available as a file | |||
| named draft-rivest-sexp-00.txt on 4 May 1997, that file was | | named draft-rivest-sexp-00.txt on 4 May 1997, that file was | |||
| never actually submitted to the IETF. This document is a | | never actually submitted to the IETF. This document is a | |||
| clarified and modernized version of that document. | | clarified and modernized version of that document. | |||
1.4. Conventions Used in This Document | 1.4. Conventions Used in This Document | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
2. S-expressions -- informal introduction | 2. S-expressions -- Informal Introduction | |||
Informally, an S-expression is either: | Informally, an S-expression is either: | |||
* an octet-string, or | * an octet-string, or | |||
* a finite list of simpler S-expressions. | * a finite list of simpler S-expressions. | |||
An octet-string is a finite sequence of eight-bit octets. An octet- | An octet-string is a finite sequence of eight-bit octets. An octet- | |||
string may be zero length. There may be many different but | string may be zero length. There may be many different but | |||
equivalent ways of representing an octet-string | equivalent ways of representing an octet-string | |||
skipping to change at page 6, line 23 | skipping to change at line 245 | |||
(abc (de #6667#) "ghi jkl") | (abc (de #6667#) "ghi jkl") | |||
As can be seen, there is variability possible in the encoding of an | As can be seen, there is variability possible in the encoding of an | |||
S-expression. In some applications, it is desirable to standardize | S-expression. In some applications, it is desirable to standardize | |||
or restrict the encodings; in other cases, it is desirable to have no | or restrict the encodings; in other cases, it is desirable to have no | |||
restrictions. The following are the target cases these S-expressions | restrictions. The following are the target cases these S-expressions | |||
aim to handle: | aim to handle: | |||
* a "transport" or "basic" encoding for transporting the | * a "transport" or "basic" encoding for transporting the | |||
S-expression between computers. | S-expression between computers | |||
* a "canonical" encoding, used when signing the S-expression. | * a "canonical" encoding, used when signing the S-expression | |||
* an "advanced" encoding used for input/output to people. | * an "advanced" encoding used for input/output to people | |||
* an "in-memory" encoding used for processing the S-expression in | * an "in-memory" encoding used for processing the S-expression in | |||
the computer. | the computer | |||
In this document, related encoding techniques for each of these uses | In this document, related encoding techniques for each of these uses | |||
are provided. | are provided. | |||
3. Character set | 3. Character Set | |||
This document specifies encodings of S-expressions. Except when | This document specifies encodings of S-expressions. Except when | |||
giving "verbatim" encodings, the character set used is limited to the | giving "verbatim" encodings, the character set used is limited to the | |||
following characters in ASCII [RFC0020]: | following characters in ASCII [RFC0020]: | |||
Alphabetic: | Alphabetic: | |||
A B ... Z a b ... z | A B ... Z a b ... z | |||
Numeric: | Numeric: | |||
0 1 ... 9 | 0 1 ... 9 | |||
Whitespace: | Whitespace: | |||
skipping to change at page 7, line 38 | skipping to change at line 309 | |||
% percent | % percent | |||
^ circumflex | ^ circumflex | |||
~ tilde | ~ tilde | |||
; semicolon | ; semicolon | |||
' single-quote (apostrophe) | ' single-quote (apostrophe) | |||
, comma | , comma | |||
< less than | < less than | |||
> greater than | > greater than | |||
? question mark | ? question mark | |||
4. Octet-string representation types | 4. Octet-String Representation Types | |||
This section describes in detail the ways in which an octet-string | This section describes in detail the ways in which an octet-string | |||
may be represented. | may be represented. | |||
Recall that an octet-string is any finite sequence of octets, and | Recall that an octet-string is any finite sequence of octets and that | |||
that an octet-string may have length zero. | an octet-string may have length zero. | |||
4.1. Verbatim representation | 4.1. Verbatim Representation | |||
A verbatim encoding of an octet-string consists of three parts: | A verbatim encoding of an octet-string consists of three parts: | |||
* the length (number of octets) of the octet-string, given in | * the length (number of octets) of the octet-string, given in | |||
decimal, most significant digit first, with no leading zeros. | decimal, most significant digit first, with no leading zeros | |||
* a colon ":" | * a colon ":" | |||
* the octet-string itself, verbatim. | * the octet-string itself, verbatim | |||
There are no blanks or whitespace separating the parts. No "escape | There are no blanks or whitespace separating the parts. No "escape | |||
sequences" are interpreted in the octet-string. This encoding is | sequences" are interpreted in the octet-string. This encoding is | |||
also called a "binary" or "raw" encoding. | also called a "binary" or "raw" encoding. | |||
Here are some sample verbatim encodings: | Here are some sample verbatim encodings: | |||
3:abc | 3:abc | |||
7:subject | 7:subject | |||
4:::": | 4:::": | |||
12:hello world! | 12:hello world! | |||
10:abcdefghij | 10:abcdefghij | |||
0: | 0: | |||
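The verbatim rules above (decimal length with no leading zeros, a colon, then the raw octets) can be illustrated with a short Python sketch. The function names encode_verbatim and decode_verbatim are hypothetical and are not defined by the memo; this is one possible reading of the text, not a reference implementation.

```python
def encode_verbatim(data: bytes) -> bytes:
    """Return the verbatim form: decimal length, a colon, then the raw octets."""
    return str(len(data)).encode("ascii") + b":" + data


def decode_verbatim(buf: bytes) -> tuple:
    """Parse one verbatim octet-string from the start of buf.

    Returns (octet_string, remaining_bytes). No escape sequences are
    interpreted: the payload is copied as-is.
    """
    colon = buf.find(b":")
    if colon < 1:
        raise ValueError("missing length or colon")
    digits = buf[:colon]
    if not digits.isdigit():
        raise ValueError("length must be decimal digits")
    if len(digits) > 1 and digits.startswith(b"0"):
        raise ValueError("length must not have leading zeros")
    start = colon + 1
    end = start + int(digits)
    if end > len(buf):
        raise ValueError("input shorter than the declared length")
    return buf[start:end], buf[end:]


# Usage, taken from the sample encodings above:
print(encode_verbatim(b"hello world!"))   # b'12:hello world!'
print(decode_verbatim(b'4:::":'))         # (b'::":', b'')
```

Because the length is read first, the payload may itself contain colons, quotes, or any other octet without ambiguity.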
4.2. Quoted-string representation | 4.2. Quoted-String Representation | |||
The quoted-string representation of an octet-string consists of: | The quoted-string representation of an octet-string consists of: | |||
* an optional decimal length field | * an optional decimal length field | |||
* an initial double-quote (") | * an initial double-quote (") | |||
* the octet-string with the "C" programming language [C] escape | * the octet-string with the C programming language [C] escape | |||
conventions (\n, etc.) | conventions (\n, etc.) | |||
* a final double-quote (") | * a final double-quote (") | |||
The specified length is the length of the resulting string after any | The specified length is the length of the resulting string after any | |||
backslash escape sequences have been converted to the octet value | backslash escape sequences have been converted to the octet value | |||
they denote. The string does not have any "terminating NULL" that | they denote. The string does not have any "terminating NULL" that | |||
[C] includes, and the length does not count such an octet. | [C] includes, and the length does not count such an octet. | |||
The length is optional. | The length is optional. | |||
The escape conventions within the quoted string are as follows (these | The escape conventions within the quoted string are as follows (these | |||
follow the "C" [C] programming language conventions, with an | follow the C programming language [C] conventions, with an extension | |||
extension for ignoring line terminators of just CR, LF, CRLF, or LFCR | for ignoring line terminators of just CR, LF, CRLF, or LFCR and more | |||
and more restrictive octal and hexadecimal value formats): | restrictive octal and hexadecimal value formats): | |||
\a -- audible alert (bell) | \a -- audible alert (bell) | |||
\b -- backspace | \b -- backspace | |||
\t -- horizontal tab | \t -- horizontal tab | |||
\v -- vertical tab | \v -- vertical tab | |||
\n -- new-line | \n -- new-line | |||
\f -- form-feed | \f -- form-feed | |||
\r -- carriage-return | \r -- carriage-return | |||
\" -- double-quote | \" -- double-quote | |||
\' -- single-quote | \' -- single-quote | |||
\? -- question mark | \? -- question mark | |||
\\ -- back-slash | \\ -- back-slash | |||
\ooo -- character with octal value ooo (all three | \ooo -- character with octal value ooo (all three | |||
digits MUST be present) | digits MUST be present) | |||
\xhh -- character with hexadecimal value hh (both | \xhh -- character with hexadecimal value hh (both | |||
digits MUST be present) | digits MUST be present) | |||
\<carriage-return> -- causes carriage-return | \<carriage-return> -- causes carriage-return to be ignored. | |||
to be ignored. | \<line-feed> -- causes line-feed to be ignored. | |||
\<line-feed> -- causes linefeed to be | ||||
ignored. | ||||
\<carriage-return><line-feed> -- causes | \<carriage-return><line-feed> -- causes | |||
CRLF to be ignored. | CRLF to be ignored. | |||
\<line-feed><carriage-return> -- causes | \<line-feed><carriage-return> -- causes | |||
LFCR to be ignored. | LFCR to be ignored. | |||
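The escape conventions listed above can be sketched in Python as follows. The names unescape_quoted and decode_quoted are hypothetical, and rejecting octal values above 255 is an assumption of this sketch (the text only requires that all three digits be present).

```python
SIMPLE_ESCAPES = {
    ord("a"): 0x07, ord("b"): 0x08, ord("t"): 0x09, ord("v"): 0x0B,
    ord("n"): 0x0A, ord("f"): 0x0C, ord("r"): 0x0D,
    ord('"'): 0x22, ord("'"): 0x27, ord("?"): 0x3F, ord("\\"): 0x5C,
}
CR, LF, BACKSLASH = 0x0D, 0x0A, 0x5C
OCT = b"01234567"
HEX = b"0123456789abcdefABCDEF"


def _number(body: bytes, start: int, count: int, alphabet: bytes, base: int) -> int:
    """Read exactly `count` digits drawn from `alphabet`, or raise."""
    chunk = body[start:start + count]
    if len(chunk) != count or any(b not in alphabet for b in chunk):
        raise ValueError("malformed numeric escape")
    return int(chunk.decode("ascii"), base)


def unescape_quoted(body: bytes) -> bytes:
    """Decode the octets found between the two double-quotes."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] != BACKSLASH:          # ordinary octet, copied as-is
            out.append(body[i])
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        e = body[i + 1]
        if e in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[e])
            i += 2
        elif e == ord("x"):               # \xhh, both digits required
            out.append(_number(body, i + 2, 2, HEX, 16))
            i += 4
        elif e in OCT:                    # \ooo, all three digits required
            value = _number(body, i + 1, 3, OCT, 8)
            if value > 255:               # assumption: must fit in one octet
                raise ValueError("octal escape out of range")
            out.append(value)
            i += 4
        elif e in (CR, LF):               # ignored line terminator
            i += 2
            # Swallow the second half of a CRLF or LFCR pair.
            if i < len(body) and body[i] in (CR, LF) and body[i] != e:
                i += 1
        else:
            raise ValueError("unknown escape")
    return bytes(out)


def decode_quoted(rep: bytes) -> bytes:
    """Decode a whole quoted-string, checking the optional length prefix."""
    quote = rep.find(b'"')
    if quote < 0 or len(rep) < quote + 2 or not rep.endswith(b'"'):
        raise ValueError("missing double-quote")
    value = unescape_quoted(rep[quote + 1:-1])
    prefix = rep[:quote]
    if prefix and (not prefix.isdigit() or int(prefix) != len(value)):
        raise ValueError("length prefix does not match decoded length")
    return value
```

Note that the length prefix is compared against the decoded octets, so 3"\n\n\n" is consistent even though six characters appear between the quotes.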
Here are some examples of quoted-string encodings: | Here are some examples of quoted-string encodings: | |||
"subject" | "subject" | |||
"hi there" | "hi there" | |||
7"subject" | 7"subject" | |||
"\xFE is the same octet as \376" | "\xFE is the same octet as \376" | |||
3"\n\n\n" | 3"\n\n\n" | |||
"This has\n two lines." | "This has\n two lines." | |||
"This has \ | "This has \ | |||
one line." | one line." | |||
"" | "" | |||
4.3. Token representation | 4.3. Token Representation | |||
An octet-string that meets the following conditions may be given | An octet-string that meets the following conditions may be given | |||
directly as a "token": | directly as a "token": | |||
* it does not begin with a digit; | * it does not begin with a digit; | |||
* it contains only characters that are: alphabetic (upper or lower | * it contains only characters that are: alphabetic (upper or lower | |||
case), numeric, or one of the following eight "pseudo-alphabetic" | case), numeric, or one of the following eight "pseudo-alphabetic" | |||
punctuation marks: - . / _ : * + = | punctuation marks: | |||
- . / _ : * + = | ||||
* it is length 1 or greater. | * it is length 1 or greater. | |||
Note: Upper and lower case are not equivalent. A token may begin | Note: Upper and lower case are not equivalent. A token may begin | |||
with punctuation, including ":". | with punctuation, including ":". | |||
Here are some examples of token representations: | Here are some examples of token representations: | |||
subject | subject | |||
not-before | not-before | |||
:=.. | :=.. | |||
class-of-1997 | class-of-1997 | |||
//example.net/names/smith | //example.net/names/smith | |||
* | * | |||
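The three token conditions above translate directly into a small predicate. The following Python sketch uses a hypothetical function name, is_token, and simply restates the conditions; it is not taken from any of the implementations the memo cites.

```python
import string

# The eight "pseudo-alphabetic" punctuation marks listed above.
TOKEN_PUNCT = "-./_:*+="
TOKEN_CHARS = set((string.ascii_letters + string.digits + TOKEN_PUNCT).encode("ascii"))
DIGITS = set(string.digits.encode("ascii"))


def is_token(data: bytes) -> bool:
    """Return True if the octet-string may be written directly as a token."""
    return (
        len(data) >= 1                          # length 1 or greater
        and data[0] not in DIGITS               # does not begin with a digit
        and all(b in TOKEN_CHARS for b in data) # only permitted characters
    )


# Usage:
print(is_token(b"class-of-1997"))   # True
print(is_token(b"1997-class"))      # False (begins with a digit)
print(is_token(b"hi there"))        # False (contains a space)
```

An encoder could use such a check to choose between the token form and a quoted-string or verbatim form for a given octet-string.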
4.4. Hexadecimal representation | 4.4. Hexadecimal Representation | |||
An octet-string may be represented with a hexadecimal encoding | An octet-string may be represented with a hexadecimal encoding | |||
consisting of: | consisting of: | |||
* an (optional) decimal length of the octet-string | * an (optional) decimal length of the octet-string | |||
* a sharp-sign "#" | * a sharp-sign "#" | |||
* a hexadecimal encoding of the octet-string, with each octet | * a hexadecimal encoding of the octet-string, with each octet | |||
represented with two hexadecimal digits, most significant digit | represented with two hexadecimal digits, most significant digit | |||
skipping to change at page 10, line 44 | skipping to change at line 454 | |||
characters other than whitespace and hexadecimal digits. | characters other than whitespace and hexadecimal digits. | |||
Here are some examples of hexadecimal encodings: | Here are some examples of hexadecimal encodings: | |||
#616263# -- represents "abc" | #616263# -- represents "abc" | |||
3#616263# -- also represents "abc" | 3#616263# -- also represents "abc" | |||
# 616 | # 616 | |||
263 # -- also represents "abc" | 263 # -- also represents "abc" | |||
## -- represents the zero-length string | ## -- represents the zero-length string | |||
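A decoder for the hexadecimal form might look like the Python sketch below. The function name decode_hex is hypothetical. Two points are assumptions of this sketch because the relevant text is not shown in this excerpt: the whitespace set (space, HT, VT, FF, CR, LF) and the rejection of an odd number of hexadecimal digits.

```python
import string

HEX_DIGITS = set(string.hexdigits.encode("ascii"))
WHITESPACE = set(b" \t\v\f\r\n")   # assumed whitespace set


def decode_hex(rep: bytes) -> bytes:
    """Decode a hexadecimal representation such as b'3#616263#'."""
    first = rep.find(b"#")
    if first < 0 or len(rep) < first + 2 or not rep.endswith(b"#"):
        raise ValueError("missing sharp-sign delimiter")
    digits = bytearray()
    for b in rep[first + 1:-1]:
        if b in WHITESPACE:
            continue                # whitespace between digits is ignored
        if b not in HEX_DIGITS:
            raise ValueError("only whitespace and hexadecimal digits are allowed")
        digits.append(b)
    if len(digits) % 2:
        raise ValueError("each octet needs two hexadecimal digits")  # assumption
    value = bytes.fromhex(digits.decode("ascii"))
    prefix = rep[:first]
    if prefix and (not prefix.isdigit() or int(prefix) != len(value)):
        raise ValueError("length prefix does not match decoded length")
    return value


# Usage, taken from the examples above:
print(decode_hex(b"#616263#"))        # b'abc'
print(decode_hex(b"# 616\n 263 #"))   # b'abc'
print(decode_hex(b"##"))              # b''
```

The optional decimal prefix counts decoded octets, not hexadecimal digits, which is why 3#616263# is consistent.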
4.5. Base-64 representation of octet-strings | 4.5. Base-64 Representation of Octet-Strings | |||
An octet-string may be represented in a base-64 encoding [RFC4648] | An octet-string may be represented in a base-64 encoding [RFC4648] | |||
consisting of: | consisting of: | |||
* an (optional) decimal length of the octet-string | * an (optional) decimal length of the octet-string | |||
* a vertical bar "|" | * a vertical bar "|" | |||
* the base-64 [RFC4648] encoding of the octet-string. | * the base-64 [RFC4648] encoding of the octet-string. | |||
* a final vertical bar "|" | * a final vertical bar "|" | |||
Base-64 encoding produces four characters of output for each three | Base-64 encoding produces four characters of output for each three | |||
octets of input. If the length of the input divided by three leaves | octets of input. If the length of the input divided by three leaves | |||
a remainder of one or two, it produces an output block of length four | a remainder of one or two, it produces an output block of length four | |||
ending in two or one equals signs, respectively. These equals signs | ending in two or one equals signs, respectively. These equals signs | |||
MUST be included on output but input routines MAY accept inputs where | MUST be included on output, but input routines MAY accept inputs | |||
one or two equals signs are dropped. | where one or two equals signs are dropped. | |||
Whitespace inserted in the midst of the base-64 encoding is ignored. | Whitespace inserted in the midst of the base-64 encoding is ignored. | |||
It is an error to have characters other than whitespace and base-64 | It is an error to have characters other than whitespace and base-64 | |||
characters. | characters. | |||
Here are some examples of base-64 encodings: | Here are some examples of base-64 encodings: | |||
|YWJj| -- represents "abc" | |YWJj| -- represents "abc" | |||
| Y W | | Y W | |||
J j | -- also represents "abc" | J j | -- also represents "abc" | |||
skipping to change at page 11, line 36 | skipping to change at line 495 | |||
|YWJjZA| -- also represents "abcd" | |YWJjZA| -- also represents "abcd" | |||
|| -- represents the zero-length string | || -- represents the zero-length string | |||
Note the difference between this base-64 encoding of an octet-string | Note the difference between this base-64 encoding of an octet-string | |||
using vertical bars ("| |") and the base-64 encoding of an | using vertical bars ("| |") and the base-64 encoding of an | |||
S-expression using curly braces ("{ }") in Section 6.1. | S-expression using curly braces ("{ }") in Section 6.1. | |||
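The decoding rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `decode_base64_octet_string` is hypothetical, and rejecting a declared length that does not match the decoded length is an assumption of this sketch rather than something stated in the text above.

```python
import base64
import re

# Whitespace characters as listed in the ABNF: SP, HTAB, VT, CR, LF, FF.
_WS = r"[ \t\v\r\n\f]"


def decode_base64_octet_string(text: str) -> bytes:
    """Decode an octet-string written as [length]|base-64|."""
    match = re.fullmatch(r"([0-9]*)\|([^|]*)\|", text)
    if match is None:
        raise ValueError("not a base-64 octet-string representation")
    length_text, body = match.groups()
    # Whitespace in the midst of the encoding is ignored.
    body = re.sub(_WS, "", body)
    # Input routines MAY accept dropped equals signs: strip any that are
    # present, then restore the padding that the decoder expects.
    body = body.rstrip("=")
    if re.fullmatch(r"[A-Za-z0-9+/]*", body) is None:
        raise ValueError("illegal character in base-64 encoding")
    body += "=" * (-len(body) % 4)
    data = base64.b64decode(body, validate=True)
    # Assumption of this sketch: a mismatched length prefix is an error.
    if length_text and int(length_text) != len(data):
        raise ValueError("declared length does not match decoded length")
    return data
```

An output routine would do the reverse with `base64.b64encode`, which always emits the trailing equals signs that are required on output.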
4.6. Display-Hints and Internationalization | 4.6. Display-Hints and Internationalization | |||
An octet-string can contain any type of data representable by a | An octet-string can contain any type of data representable by a | |||
finite octet-string, for example text, a fixed or variable length | finite octet-string, e.g., text, a fixed or variable-length integer, | |||
integer, or an image. Normally the application producing / consuming | or an image. Normally, the application producing and/or consuming | |||
S-expressions will understand their structure and the data type and | S-expressions will understand their structure, the data type, and the | |||
encoding of the octet-strings within the S-expressions used by that | encoding of the octet-strings within the S-expressions used by that | |||
application. If the octet-string consists of text, use of UTF-8 | application. If the octet-string consists of text, use of UTF-8 | |||
encoding is RECOMMENDED [RFC2130] [RFC3629]. | encoding is RECOMMENDED [RFC2130] [RFC3629]. | |||
The purposes of a display-hint is to provide information on how to | The purpose of a display-hint is to provide information on how to | |||
display an octet-string to a user. It has no other function. Many | display an octet-string to a user. It has no other function. Many | |||
of the MIME [RFC2046] types work here. | of the MIME [RFC2046] types work here. | |||
A display-hint is an octet-string representation surrounded by square | A display-hint is an octet-string representation surrounded by square | |||
brackets. There may be whitespace separating the display hint octet- | brackets. There may be whitespace separating the display hint octet- | |||
string from the surrounding brackets. Any of the legal octet-string | string from the surrounding brackets. Any of the legal octet-string | |||
representations may be used for the display-hint string but a | representations may be used for the display-hint string, but a | |||
display-hint may not be applied to a display-hint string, that is, | display-hint may not be applied to a display-hint string -- that is, | |||
display-hints may not be nested. | display-hints may not be nested. | |||
A display-hint that can be used for UTF-8 encoded text is shown in | A display-hint that can be used for UTF-8-encoded text is shown in | |||
the following example where the octet-string is text saying "bob", | the following example where the octet-string is "böb☺", i.e., "bob" | |||
with an umlaut over the central "o", followed by a smilie emoji. | with an umlaut over the "o", followed by WHITE SMILING FACE (U+263A). | |||
["text/plain; charset=utf-8"]"b\xC3\xB6b\xE2\x98\xBA" | ["text/plain; charset=utf-8"]"b\xC3\xB6b\xE2\x98\xBA" | |||
Every octet-string representation is either preceded by a single | Every octet-string representation is either preceded by a single | |||
display-hint or not so preceded. There may be whitespace between the | display-hint or not so preceded. There may be whitespace between the | |||
close square bracket and the octet-string to which the hint applies. | close square bracket and the octet-string to which the hint applies. | |||
Here are some other examples of display-hints: | Here are some other examples of display-hints: | |||
[image/gif] | [image/gif] | |||
skipping to change at page 12, line 38 ¶ | skipping to change at line 539 ¶ | |||
[application/postscript] | [application/postscript] | |||
[audio/basic] | [audio/basic] | |||
["http://example.com/display-types/funky.html"] | ["http://example.com/display-types/funky.html"] | |||
An octet-string that has no display-hint may be considered to have a | An octet-string that has no display-hint may be considered to have a | |||
MIME [RFC2046] type specified by the application or use. In the | MIME [RFC2046] type specified by the application or use. In the | |||
absence of such a specification, the default is as follows: | absence of such a specification, the default is as follows: | |||
[application/octet-stream] | [application/octet-stream] | |||
When an S-Expression is being encoded in one of the representations | When an S-expression is being encoded in one of the representations | |||
described in Section 6, any display-hint present is included. If a | described in Section 6, any display-hint present is included. If a | |||
display-hint is the default, it is not suppressed, nor is the default | display-hint is the default, it is not suppressed, nor is the default | |||
display-hint included in the representation for an octet-string | display-hint included in the representation for an octet-string | |||
without a display-hint. | without a display-hint. | |||
4.7. Comparison of octet-strings | 4.7. Comparison of Octet-Strings | |||
It is RECOMMENDED that two octet-strings be considered equivalent for | It is RECOMMENDED that two octet-strings be considered equivalent for | |||
most computational and algorithmic purposes if and only if they have | most computational and algorithmic purposes if and only if they have | |||
the same display-hint and the same data octet-strings. However, a | the same display-hint and the same data octet-strings. However, a | |||
particular application might need a different criterion. For | particular application might need a different criterion. For | |||
example, it might ignore the display hint on comparisons. | example, it might ignore the display hint on comparisons. | |||
Note that octet-strings are "case-sensitive"; the octet-string "abc" | Note that octet-strings are "case-sensitive"; the octet-string "abc" | |||
is not equal to the octet-string "ABC". | is not equal to the octet-string "ABC". | |||
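The recommended comparison can be sketched as follows. The `OctetString` class and the `equal_ignoring_hint` helper are hypothetical names introduced for illustration. Treating an absent display-hint as distinct from any explicit hint is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OctetString:
    """Hypothetical model of an octet-string with an optional display-hint."""

    data: bytes
    display_hint: Optional[bytes] = None  # None means no display-hint present


def equal_ignoring_hint(a: OctetString, b: OctetString) -> bool:
    """Application-specific criterion that ignores the display-hint."""
    return a.data == b.data


# The generated __eq__ compares both fields, which matches the recommended
# rule: same display-hint and same data octets. Comparison is on raw
# octets, so it is case-sensitive.
```

With this model, `OctetString(b"abc") == OctetString(b"ABC")` is false, while `equal_ignoring_hint` treats a hinted and an unhinted string with the same data as equal.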
skipping to change at page 13, line 24 ¶ | skipping to change at line 572 ¶ | |||
5. Lists | 5. Lists | |||
Just as with octet-strings, there are variations in representing a | Just as with octet-strings, there are variations in representing a | |||
list. Whitespace may be used to separate list elements, but it is | list. Whitespace may be used to separate list elements, but it is | |||
only required to separate two octet-strings when otherwise the two | only required to separate two octet-strings when otherwise the two | |||
octet-strings might be interpreted as one, as when one token follows | octet-strings might be interpreted as one, as when one token follows | |||
another. To be precise, an octet-string represented as a token | another. To be precise, an octet-string represented as a token | |||
(Section 4.3) MUST be separated by whitespace from a following token, | (Section 4.3) MUST be separated by whitespace from a following token, | |||
verbatim representation, or any of the following if they are prefixed | verbatim representation, or any of the following if they are prefixed | |||
with a length: quoted-string, hexadecimal, or base-64 representation. | with a length: quoted-string, hexadecimal, or base-64 representation. | |||
Also, whitespace may follow the initial left parenthesis, or precede | Also, whitespace may follow the initial left parenthesis or precede | |||
the final right parenthesis of a list. | the final right parenthesis of a list. | |||
Here are some examples of encodings of lists: | Here are some examples of encodings of lists: | |||
(a bob c) | (a bob c) | |||
( a ( bob c ) ( ( d e ) ( e f ) ) ) | ( a ( bob c ) ( ( d e ) ( e f ) ) ) | |||
(11:certificate(6:issuer3:bob)(7:subject5:alice)) | (11:certificate(6:issuer3:bob)(7:subject5:alice)) | |||
(|ODpFeGFtcGxlIQ==| "1997" murphy 3:XC+) | (|ODpFeGFtcGxlIQ==| "1997" murphy 3:XC+) | |||
() | () | |||
6. S-expression representation types | 6. S-Expression Representation Types | |||
There are three "types" of representation: | There are three "types" of representation: | |||
* canonical | * canonical | |||
* basic transport | * basic transport | |||
* advanced transport | * advanced transport | |||
The first two MUST be supported by any implementation; the last is | The first two MUST be supported by any implementation; the last is | |||
OPTIONAL. As part of basic representation, the base-64 [RFC4648] | OPTIONAL. As part of basic representation, the base-64 [RFC4648] | |||
representation of an S-expression may be used as described in | representation of an S-expression may be used as described in | |||
Section 6.1. | Section 6.1. | |||
6.1. Base-64 representation of S-expressions | 6.1. Base-64 Representation of S-Expressions | |||
An S-expression may be represented in a base-64 encoding [RFC4648] | An S-expression may be represented in a base-64 encoding [RFC4648] | |||
consisting of: | consisting of: | |||
* an opening curly brace "{" | * an opening curly brace "{" | |||
* the base-64 [RFC4648] encoding of the S-expression. | * the base-64 [RFC4648] encoding of the S-expression | |||
* a final closing curly brace "}" | * a final closing curly brace "}" | |||
Base-64 encoding produces four characters of output for each three | Base-64 encoding produces four characters of output for each three | |||
octets of input. If the length of the input divided by three leaves | octets of input. If the length of the input divided by three leaves | |||
a remainder of one or two, it produces an output block of length four | a remainder of one or two, it produces an output block of length four | |||
ending in two or one equals signs, respectively. These equals signs | ending in two or one equals signs, respectively. These equals signs | |||
MUST be included on output but input routines MAY accept inputs where | MUST be included on output, but input routines MAY accept inputs | |||
one or two equals signs are dropped. | where one or two equals signs are dropped. | |||
Whitespace inserted in the midst of the base-64 encoding, after the | Whitespace inserted in the midst of the base-64 encoding, after the | |||
opening curly brace, or before the closing curly brace is ignored. | opening curly brace, or before the closing curly brace is ignored. | |||
It is an error to have characters other than whitespace and base-64 | It is an error to have characters other than whitespace and base-64 | |||
characters. | characters. | |||
Note the difference between this base-64 encoding of an S-expression | Note the difference between this base-64 encoding of an S-expression | |||
using curly braces ("{ }") and the base-64 encoding of an octet- | using curly braces ("{ }") and the base-64 encoding of an octet- | |||
string using vertical bars ("| |") in Section 4.5. | string using vertical bars ("| |") in Section 4.5. | |||
6.2. Canonical representation | 6.2. Canonical Representation | |||
This canonical representation is used for digital signature purposes | This canonical representation is used for digital signature purposes | |||
and transport over channels not sensitive to specific octet values. | and transport over channels not sensitive to specific octet values. | |||
It is uniquely defined for each S-expression. It is not particularly | It is uniquely defined for each S-expression. It is not particularly | |||
readable, but that is not the point. It is intended to be very easy | readable, but that is not the point. It is intended to be very easy | |||
to parse, to be reasonably economical, and to be unique for any | to parse, reasonably economical, and unique for any S-expression. | |||
S-expression. (See [CANON].) | See [CANON1] and [CANON2]. | |||
The "canonical" form of an S-expression represents each octet-string | The "canonical" form of an S-expression represents each octet-string | |||
in verbatim mode, and represents each list with no blanks separating | in verbatim mode, and represents each list with no blanks separating | |||
elements from each other or from the surrounding parentheses (see | elements from each other or from the surrounding parentheses. See | |||
also Section 7.2). | also Section 7.2. | |||
Here are some examples of canonical representations of S-expressions: | Here are some examples of canonical representations of S-expressions: | |||
(6:issuer3:bob) | (6:issuer3:bob) | |||
(4:icon[12:image/bitmap]9:xxxxxxxxx) | (4:icon[12:image/bitmap]9:xxxxxxxxx) | |||
(7:subject(3:ref5:alice6:mother)) | (7:subject(3:ref5:alice6:mother)) | |||
10:foo)]}>bar | 10:foo)]}>bar | |||
0: | 0: | |||
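A canonical encoder is short enough to sketch in full. The in-memory model used here is an assumption made for illustration only: `bytes` for an octet-string, a `(hint, data)` tuple for an octet-string with a display-hint, and a Python `list` for a list. The function names are hypothetical.

```python
from typing import Tuple, Union

# Hypothetical in-memory model (not defined by the document):
#   bytes            -> octet-string
#   (hint, data)     -> octet-string with display-hint
#   list             -> list of S-expressions
Sexp = Union[bytes, Tuple[bytes, bytes], list]


def verbatim(data: bytes) -> bytes:
    """Verbatim form: decimal length, a colon, then the raw octets."""
    return str(len(data)).encode("ascii") + b":" + data


def to_canonical(value: Sexp) -> bytes:
    """Encode a value in canonical form, with no whitespace anywhere."""
    if isinstance(value, bytes):
        return verbatim(value)
    if isinstance(value, tuple):
        hint, data = value
        return b"[" + verbatim(hint) + b"]" + verbatim(data)
    if isinstance(value, list):
        return b"(" + b"".join(to_canonical(item) for item in value) + b")"
    raise TypeError("unsupported S-expression node: %r" % (value,))
```

For instance, `to_canonical([b"issuer", b"bob"])` yields `b"(6:issuer3:bob)"`, matching the first example above. Because every octet-string carries its length, the encoding is the same no matter how the input was originally written.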
6.3. Basic transport representation | 6.3. Basic Transport Representation | |||
There are two forms of the "basic transport" representation: | There are two forms of the "basic transport" representation: | |||
* the canonical representation | 1. The canonical representation | |||
* an [RFC4648] base-64 representation of the canonical | 2. A base-64 [RFC4648] representation of the canonical | |||
representation, surrounded by braces (see Section 6.1). | representation, surrounded by braces (see Section 6.1) | |||
The basic transport representations (see Section 7.3) are intended to | The basic transport representations (see Section 7.3) are intended to | |||
provide a universal means of representing S-expressions for transport | provide a universal means of representing S-expressions for transport | |||
from one machine to another. The base-64 encoding would be | from one machine to another. The base-64 encoding would be | |||
appropriate if the channel over which the S-expression is being sent | appropriate if the channel over which the S-expression is being sent | |||
might be sensitive to octets of some special values, such as an octet | might be sensitive to octets of some special values, such as an octet | |||
of all zero bits (NULL) or an octet of all one bits (DEL), or the | of all zero bits (NULL) or an octet of all one bits (DEL), or if the | |||
channel is sensitive to "line length" such that occasional line | channel is sensitive to "line length" such that occasional line | |||
terminating whitespace is needed. | terminating whitespace is needed. | |||
Here are two examples of an S-expression represented in basic | Here are two examples of an S-expression represented in basic | |||
transport mode: | transport mode: | |||
(1:a1:b1:c) | (1:a1:b1:c) | |||
{KDE6YTE6YjE | {KDE6YTE6YjE | |||
6Yyk= } | 6Yyk= } | |||
The second example above is the same S-expression as the first | The second example above is the same S-expression as the first | |||
encoded in base-64. | encoded in base-64. | |||
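The relationship between the two basic transport forms can be sketched as follows. The function names and the `line_width` parameter are hypothetical. The check that the decoded payload is at least two octets long follows the ABNF comment in Section 7.3. Validating that the payload is a well-formed canonical S-expression is left out of this sketch.

```python
import base64
import re

_WS = " \t\v\r\n\f"  # SP, HTAB, VT, CR, LF, FF


def to_base64_transport(canonical: bytes, line_width: int = 0) -> str:
    """Wrap a canonical S-expression as {base-64}, optionally line-broken."""
    encoded = base64.b64encode(canonical).decode("ascii")
    if line_width > 0:
        encoded = "\n".join(
            encoded[i:i + line_width] for i in range(0, len(encoded), line_width)
        )
    return "{" + encoded + "}"


def from_base64_transport(text: str) -> bytes:
    """Recover the canonical octets from a {base-64} representation."""
    text = text.strip(_WS)
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("missing curly braces")
    # Whitespace after "{", before "}", and inside the encoding is ignored.
    body = re.sub("[" + _WS + "]", "", text[1:-1])
    body = body.rstrip("=")  # dropped equals signs MAY be accepted
    if re.fullmatch(r"[A-Za-z0-9+/]*", body) is None:
        raise ValueError("illegal character in base-64 encoding")
    body += "=" * (-len(body) % 4)
    data = base64.b64decode(body, validate=True)
    if len(data) < 2:
        raise ValueError("a canonical S-expression is at least 2 octets")
    return data
```

Decoding the second example above with `from_base64_transport` gives back the octets of the first example, `(1:a1:b1:c)`.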
6.4. Advanced transport representation | 6.4. Advanced Transport Representation | |||
The "advanced transport" representation is intended to provide more | The "advanced transport" representation is intended to provide more | |||
flexible and readable notations for documentation, design, debugging, | flexible and readable notations for documentation, design, debugging, | |||
and (in some cases) user interface. | and (in some cases) user interface. | |||
The advanced transport representation allows all of the octet-string | The advanced transport representation allows all of the octet-string | |||
representation forms described above in Section 4: quoted strings, | representation forms described above in Section 4: quoted strings, | |||
base-64, hexadecimal, tokens, representations of strings with omitted | base-64, hexadecimal, tokens, representations of strings with omitted | |||
lengths, and so on. (See Section 7.1). | lengths, and so on. See Section 7.1. | |||
7. ABNF of the syntax | 7. ABNF of the Syntax | |||
ABNF is the Augmented Backus-Naur Form for syntax specifications as | ABNF is the Augmented Backus-Naur Form for syntax specifications as | |||
defined in [RFC5234]. The ABNF for advanced representation of | defined in [RFC5234]. The ABNF for advanced representation of | |||
S-expressions is given first and the basic and canonical forms | S-expressions is given first, and the basic and canonical forms are | |||
derived therefrom. The rule names below in all capital letters are | derived therefrom. The rule names below in all capital letters are | |||
defined in Appendix B.1 of [RFC5234]. | defined in Appendix B.1 of [RFC5234]. | |||
7.1. ABNF for advanced transport | 7.1. ABNF for Advanced Transport | |||
sexp = *whitespace value *whitespace | sexp = *whitespace value *whitespace | |||
whitespace = SP / HTAB / vtab / CR / LF / ff | whitespace = SP / HTAB / vtab / CR / LF / ff | |||
vtab = %x0B ; vertical tab | vtab = %x0B ; vertical tab | |||
ff = %x0C ; form feed | ff = %x0C ; form feed | |||
value = string / ("(" *(value / whitespace) ")") | value = string / ("(" *(value / whitespace) ")") | |||
skipping to change at page 17, line 4 ¶ | skipping to change at line 746 ¶ | |||
backslash = %x5C | backslash = %x5C | |||
quote = %x27 ; single quote | quote = %x27 ; single quote | |||
token = (ALPHA / simple-punc) *(ALPHA / DIGIT / | token = (ALPHA / simple-punc) *(ALPHA / DIGIT / | |||
simple-punc) | simple-punc) | |||
simple-punc = "-" / "." / "/" / "_" / ":" / "*" / "+" / "=" | simple-punc = "-" / "." / "/" / "_" / ":" / "*" / "+" / "=" | |||
hexadecimal = [decimal] "#" *whitespace *hexadecimals "#" | hexadecimal = [decimal] "#" *whitespace *hexadecimals "#" | |||
hexadecimals = 2(HEXDIG *whitespace) | hexadecimals = 2(HEXDIG *whitespace) | |||
base-64 = [decimal] "|" *whitespace *base-64-chars | base-64 = [decimal] "|" *whitespace *base-64-chars | |||
[base-64-end] "|" | [base-64-end] "|" | |||
base-64-chars = 4(base-64-char *whitespace) | base-64-chars = 4(base-64-char *whitespace) | |||
base-64-char = ALPHA / DIGIT / "+" / "/" | base-64-char = ALPHA / DIGIT / "+" / "/" | |||
base-64-end = base-64-chars / | base-64-end = base-64-chars / | |||
3(base-64-char *whitespace) ["=" *whitespace] / | 3(base-64-char *whitespace) ["=" *whitespace] / | |||
2(base-64-char *whitespace) *2("=" *whitespace) | 2(base-64-char *whitespace) *2("=" *whitespace) | |||
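As an illustration of how the `token` and `simple-punc` rules translate into a check, the following sketch tests whether an octet-string may be written as a token. The names `TOKEN_RE` and `is_token` are hypothetical. The sketch covers only the two rules shown above and says nothing about the other rules of the grammar.

```python
import re

# token       = (ALPHA / simple-punc) *(ALPHA / DIGIT / simple-punc)
# simple-punc = "-" / "." / "/" / "_" / ":" / "*" / "+" / "="
TOKEN_RE = re.compile(rb"[A-Za-z\-./_:*+=][A-Za-z0-9\-./_:*+=]*")


def is_token(octets: bytes) -> bool:
    """Return True if the octets match the token rule exactly."""
    return TOKEN_RE.fullmatch(octets) is not None
```

A writer for the advanced representation could use such a check to decide between the token form and a quoted, hexadecimal, base-64, or verbatim form. An octet-string such as `1997` begins with a digit, so it cannot be written as a token.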
7.2. ABNF for canonical | 7.2. ABNF for Canonical | |||
c-sexp = c-string / ("(" *c-sexp ")") | c-sexp = c-string / ("(" *c-sexp ")") | |||
c-string = [ "[" verbatim "]" ] verbatim | c-string = [ "[" verbatim "]" ] verbatim | |||
7.3. ABNF for basic transport | 7.3. ABNF for Basic Transport | |||
b-sexp = c-sexp / b-base-64 | b-sexp = c-sexp / b-base-64 | |||
b-base-64 = "{" *whitespace *base-64-chars base-64-end "}" | b-base-64 = "{" *whitespace *base-64-chars base-64-end "}" | |||
; encodes a c-sexp, which has a minimum | ; encodes a c-sexp, which has a minimum | |||
; length of 2 | ; length of 2 | |||
8. Restricted S-expressions | 8. Restricted S-Expressions | |||
This document has described S-expressions in general form. | This document has described S-expressions in general form. | |||
Applications may wish to restrict their use of S-expressions in | Applications may wish to restrict their use of S-expressions in | |||
various ways as well as to specify a different default display-hint. | various ways as well as to specify a different default display-hint. | |||
Here are some possible restrictions that might be considered: | Here are some possible restrictions that might be considered: | |||
* no advanced representations (only canonical and basic) | * no advanced representations (only canonical and basic) | |||
* no display-hints | * no display-hints | |||
skipping to change at page 18, line 4 ¶ | skipping to change at line 794 ¶ | |||
* no lengths on hexadecimal, quoted-strings, or base-64 encodings | * no lengths on hexadecimal, quoted-strings, or base-64 encodings | |||
* no empty lists | * no empty lists | |||
* no empty octet-strings | * no empty octet-strings | |||
* no lists having another list as its first element | * no lists having another list as its first element | |||
* no base-64 or hexadecimal encodings | * no base-64 or hexadecimal encodings | |||
* fixed limits on the size of octet-strings | * fixed limits on the size of octet-strings | |||
As provided in Section 6, conformant implementations will support | As provided in Section 6, conformant implementations will support | |||
canonical and basic representation but support for advanced | canonical and basic representation, but support for advanced | |||
representation is not generally required. Thus advanced | representation is not generally required. Thus, advanced | |||
representation can only be used in applications which mandate its | representation can only be used in applications that mandate its | |||
support or where a capability discovery mechanism indicates support. | support or where a capability discovery mechanism indicates support. | |||
9. In-memory representations | 9. In-Memory Representations | |||
For processing, the S-expression would typically be parsed and | For processing, the S-expression would typically be parsed and | |||
represented in memory in a way that is more amenable to efficient | represented in memory in a way that is more amenable to efficient | |||
processing. This document suggests two alternatives: | processing. This document suggests two alternatives: | |||
* "list-structure" | * "list-structure" | |||
* "array-layout" | * "array-layout" | |||
These are only sketched here, as they are only suggestive. The | These are only sketched here, as they are only suggestive. The code | |||
[SexpCode] code illustrates these styles in more detail. | in [SexpCode] illustrates these styles in more detail. | |||
9.1. List-structure memory representation | 9.1. List-Structure Memory Representation | |||
Here there are separate records for simple-strings, strings, and | Here there are separate records for simple-strings, strings, and | |||
lists or list nodes. An S-expression of the form ("abc" "de") could | lists or list nodes. An S-expression of the form ("abc" "de") could | |||
be encoded as two records for the simple-strings, two for the | be encoded as two records for the simple-strings, two for the | |||
strings, and two for the list elements, where a record is a | strings, and two for the list elements where a record is a relatively | |||
relatively small block of memory and, except for simple-string, might | small block of memory and, except for simple-string, might have | |||
have pointers in it to other records. This is a fairly conventional | pointers in it to other records. This is a fairly conventional | |||
representation as discussed in Section 4 of [LISP2]. | representation as discussed in Section 4 of [LISP2]. | |||
9.2. Array-layout memory representation | 9.2. Array-Layout Memory Representation | |||
Here each S-expression is represented as a contiguous array of | Here each S-expression is represented as a contiguous array of | |||
octets. The first octet codes the "type" of the S-expression: | octets. The first octet codes the "type" of the S-expression: | |||
01 octet-string | 01 octet-string | |||
02 octet-string with display-hint | 02 octet-string with display-hint | |||
03 beginning of list (and 00 is used for "end of list") | 03 beginning of list (and 00 is used for "end of list") | |||
Each of the three types is immediately followed by a k-octet integer | Each of the three types is immediately followed by a k-octet integer | |||
indicating the size (in octets) of the following representation. | indicating the size (in octets) of the following representation. | |||
Here k is an integer that depends on the implementation, it might be | Here, k is an integer that depends on the implementation. It might | |||
anywhere from 2 to 8, but would be fixed for a given implementation; | be anywhere from 2 to 8, but it would be fixed for a given | |||
it determines the size of the objects that can be handled. The | implementation; it determines the size of the objects that can be | |||
transport and canonical representations are independent of the choice | handled. The transport and canonical representations are independent | |||
of k made by the implementation. | of the choice of k made by the implementation. | |||
Although the lengths of lists are not given in the usual S-expression | Although the lengths of lists are not given in the usual S-expression | |||
notations, it is easy to fill them in when parsing; when you reach a | notations, it is easy to fill them in when parsing; when you reach a | |||
right-parenthesis you know how long the list representation was, and | right parenthesis, you know how long the list representation was and | |||
where to go back to fill in the missing length. | where to go back to fill in the missing length. | |||
9.2.1. Octet-string | 9.2.1. Octet-String | |||
This is represented as follows: | This is represented as follows: | |||
01 <length> <octet-string> | 01 <length> <octet-string> | |||
For example (here k = 2) | For example (here, k = 2): | |||
01 0003 a b c | 01 0003 a b c | |||
9.2.2. Octet-string with display-hint | 9.2.2. Octet-String with Display-Hint | |||
This is represented as follows: | This is represented as follows: | |||
02 <length> | 02 <length> | |||
01 <length> <octet-string> /* for display-type */ | 01 <length> <octet-string> /* for display-type */ | |||
01 <length> <octet-string> /* for octet-string */ | 01 <length> <octet-string> /* for octet-string */ | |||
For example, the S-expression | For example, the S-expression: | |||
[gif] #61626364# | [gif] #61626364# | |||
would be represented as (with k = 2) | would be represented as (with k = 2): | |||
02 000d | 02 000d | |||
01 0003 g i f | 01 0003 g i f | |||
01 0004 61 62 63 64 | 01 0004 61 62 63 64 | |||
9.2.3. List | 9.2.3. List | |||
This is represented as | This is represented as: | |||
03 <length> <item1> <item2> <item3> ... <item> 00 | 03 <length> <item1> <item2> <item3> ... <item> 00 | |||
For example, the list (abc [d]ef (g)) is represented in memory as | For example, the list (abc [d]ef (g)) is represented in memory as | |||
(with k = 2) | (with k = 2): | |||
03 001b | 03 001b | |||
01 0003 a b c | 01 0003 a b c | |||
02 0009 | 02 0009 | |||
01 0001 d | 01 0001 d | |||
01 0002 e f | 01 0002 e f | |||
03 0005 | 03 0005 | |||
01 0001 g | 01 0001 g | |||
00 | 00 | |||
00 | 00 | |||
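The length fields in the array-layout examples can be checked with a short sketch. It assumes k = 2 and big-endian size fields (the byte order is an assumption; the examples only show the hexadecimal value). It reuses a hypothetical model in which `bytes` is an octet-string, a `(hint, data)` tuple is an octet-string with a display-hint, and a `list` is a list.

```python
K = 2  # size-field width in octets; implementation-dependent


def _size(n: int) -> bytes:
    """Encode a size as a K-octet integer (big-endian is an assumption)."""
    return n.to_bytes(K, "big")  # raises OverflowError if n does not fit


def array_layout(value) -> bytes:
    """Build the contiguous array-layout form of a value."""
    if isinstance(value, bytes):
        return b"\x01" + _size(len(value)) + value
    if isinstance(value, tuple):
        hint, data = value
        body = array_layout(hint) + array_layout(data)
        return b"\x02" + _size(len(body)) + body
    if isinstance(value, list):
        # The size of a list covers its items and the closing 00 octet.
        body = b"".join(array_layout(item) for item in value) + b"\x00"
        return b"\x03" + _size(len(body)) + body
    raise TypeError("unsupported S-expression node: %r" % (value,))
```

For the list `(abc [d]ef (g))`, the items occupy 6 + 12 + 8 = 26 octets, and the closing 00 brings the total to 27, which is the 001b shown above.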
skipping to change at page 20, line 33 ¶ | skipping to change at line 912 ¶ | |||
is provided in Section 6.2. | is provided in Section 6.2. | |||
The default display-hint (see Section 4.6) can be specified for an | The default display-hint (see Section 4.6) can be specified for an | |||
application. Note that if S-expressions containing untyped octet- | application. Note that if S-expressions containing untyped octet- | |||
strings represented for that application are processed by a different | strings represented for that application are processed by a different | |||
application, those untyped octet-strings may be treated as if they had | application, those untyped octet-strings may be treated as if they had | |||
a different display-hint. | a different display-hint. | |||
11. IANA Considerations | 11. IANA Considerations | |||
This document requires no IANA actions. | This document has no IANA actions. | |||
12. Normative References | 12. References | |||
12.1. Normative References | ||||
[C] Kernighan, B. and D. Ritchie, "The C Programming | [C] Kernighan, B. and D. Ritchie, "The C Programming | |||
Language", ISBN 0-13-110370-9, 1988. | Language", ISBN 0-13-110370-9, 1988. | |||
[RFC0020] Cerf, V., "ASCII format for network interchange", STD 80, | [RFC0020] Cerf, V., "ASCII format for network interchange", STD 80, | |||
RFC 20, DOI 10.17487/RFC0020, October 1969, | RFC 20, DOI 10.17487/RFC0020, October 1969, | |||
<https://www.rfc-editor.org/info/rfc20>. | <https://www.rfc-editor.org/info/rfc20>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
skipping to change at page 21, line 18 ¶ | skipping to change at line 947 ¶ | |||
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | |||
Specifications: ABNF", STD 68, RFC 5234, | Specifications: ABNF", STD 68, RFC 5234, | |||
DOI 10.17487/RFC5234, January 2008, | DOI 10.17487/RFC5234, January 2008, | |||
<https://www.rfc-editor.org/info/rfc5234>. | <https://www.rfc-editor.org/info/rfc5234>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
13. Informative References | 12.2. Informative References | |||
[BERN] Bernstein, D., "Netstrings", Work in progress, 1 February | [BERN] Bernstein, D. J., "Netstrings", Work in Progress, | |||
1997, <https://www.ietf.org/archive/id/draft-bernstein- | Internet-Draft, draft-bernstein-netstrings-02, 1 January | |||
netstrings-02.txt>. | 1997, <https://datatracker.ietf.org/doc/html/draft- | |||
bernstein-netstrings-02>. | ||||
[CANON] Wikipedia, "Canonical S-expressions", | [CANON1] Wikipedia, "Canonical S-expressions", | |||
<https://en.wikipedia.org/wiki/Canonical_S-expressions>. | <https://en.wikipedia.org/wiki/Canonical_S-expressions>. | |||
Grinberg, R., "Csexp - Canonical S-expressions", 24 March | [CANON2] Grinberg, R., "Csexp - Canonical S-expressions", 24 March | |||
2023, <https://github.com/ocaml-dune/csexp>. | 2023, <https://github.com/ocaml-dune/csexp>. | |||
[CDDLfreezer] | [CDDL-freezer] | |||
Bormann, C., "A feature freezer for the Concise Data | Bormann, C., "A feature freezer for the Concise Data | |||
Definition Language (CDDL)", work in progress, 12 | Definition Language (CDDL)", Work in Progress, Internet- | |||
September 2023, <https://datatracker.ietf.org/doc/draft- | Draft, draft-bormann-cbor-cddl-freezer-15, 28 February | |||
bormann-cbor-cddl-freezer/>. | 2025, <https://datatracker.ietf.org/doc/html/draft- | |||
bormann-cbor-cddl-freezer-15>. | ||||
[formal] Petit-Huguenin, M., "A Formalization of Symbolic | [Formal] Petit-Huguenin, M., "A Formalization of Symbolic | |||
Expressions", work in progress, 24 May 2024, | Expressions", Work in Progress, Internet-Draft, draft- | |||
petithuguenin-ufmrg-formal-sexpr-06, 4 May 2025, | ||||
<https://datatracker.ietf.org/doc/html/draft- | <https://datatracker.ietf.org/doc/html/draft- | |||
petithuguenin-ufmrg-formal-sexpr-04>. | petithuguenin-ufmrg-formal-sexpr-06>. | |||
[GnuPG] Free Software Foundation, Inc., "The GNU Privacy Guard", | [GnuPG] GnuPG, "The GNU Privacy Guard", <https://www.gnupg.org/>. | |||
<https://www.gnupg.org/>. | ||||
[Inferno] Uriel, "Inferno S-expressions", | [Inferno] "Inferno S-expressions", Inferno Manual Page, | |||
<http://man.cat-v.org/inferno/6/sexprs>. | <https://man.cat-v.org/inferno/6/sexprs>. | |||
[Libgcrypt] | [Libgcrypt] | |||
GnuPG, "The Libgcrypt Library", Libgcrypt version 1.10.2, | GnuPG, "The Libgcrypt Library", Libgcrypt version 1.10.2, | |||
6 April 2023, | 6 April 2023, | |||
<https://www.gnupg.org/documentation/manuals/gcrypt/>. | <https://www.gnupg.org/documentation/manuals/gcrypt/>. | |||
[LISP] Levin, M. and J. McCarthy, "LISP 1.5 Programmer's Manual", | [LISP] McCarthy, J., Abrahams, P. W., Edwards, D. J., Hart, T. | |||
P., and M. Levin, "LISP 1.5 Programmer's Manual", | ||||
ISBN-13 978-0-262-12011-0, ISBN-10 0262130114, 15 August | ISBN-13 978-0-262-12011-0, ISBN-10 0262130114, 15 August | |||
1962. | 1962. | |||
[LISP2] McCarthy, J., "Recursive Functions of Symbolic Expressions | [LISP2] McCarthy, J., "Recursive Functions of Symbolic Expressions | |||
and Their Computation by Machine, Part I", April 1960, | and Their Computation by Machine, Part I", April 1960, | |||
<https://people.cs.umass.edu/~emery/classes/cmpsci691st/ | <https://people.cs.umass.edu/~emery/classes/cmpsci691st/ | |||
readings/PL/LISP.pdf>. | readings/PL/LISP.pdf>. | |||
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part Two: Media Types", RFC 2046, | Extensions (MIME) Part Two: Media Types", RFC 2046, | |||
skipping to change at page 22, line 43 ¶ | skipping to change at line 1022 ¶ | |||
[RFC3259] Ott, J., Perkins, C., and D. Kutscher, "A Message Bus for | [RFC3259] Ott, J., Perkins, C., and D. Kutscher, "A Message Bus for | |||
Local Coordination", RFC 3259, DOI 10.17487/RFC3259, April | Local Coordination", RFC 3259, DOI 10.17487/RFC3259, April | |||
2002, <https://www.rfc-editor.org/info/rfc3259>. | 2002, <https://www.rfc-editor.org/info/rfc3259>. | |||
[RFC3275] Eastlake 3rd, D., Reagle, J., and D. Solo, "(Extensible | [RFC3275] Eastlake 3rd, D., Reagle, J., and D. Solo, "(Extensible | |||
Markup Language) XML-Signature Syntax and Processing", | Markup Language) XML-Signature Syntax and Processing", | |||
RFC 3275, DOI 10.17487/RFC3275, March 2002, | RFC 3275, DOI 10.17487/RFC3275, March 2002, | |||
<https://www.rfc-editor.org/info/rfc3275>. | <https://www.rfc-editor.org/info/rfc3275>. | |||
[RFC7159] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | [RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data | |||
Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March | Interchange Format", STD 90, RFC 8259, | |||
2014, <https://www.rfc-editor.org/info/rfc7159>. | DOI 10.17487/RFC8259, December 2017, | |||
<https://www.rfc-editor.org/info/rfc8259>. | ||||
[RFC8949] Bormann, C. and P. Hoffman, "Concise Binary Object | [RFC8949] Bormann, C. and P. Hoffman, "Concise Binary Object | |||
Representation (CBOR)", STD 94, RFC 8949, | Representation (CBOR)", STD 94, RFC 8949, | |||
DOI 10.17487/RFC8949, December 2020, | DOI 10.17487/RFC8949, December 2020, | |||
<https://www.rfc-editor.org/info/rfc8949>. | <https://www.rfc-editor.org/info/rfc8949>. | |||
[Ribose] Ribose Group Inc., "Open-source projects for developers | [Ribose] Ribose Group Inc., "Open-source projects for developers | |||
and designers", 13 April 2023, <https://open.ribose.com/>. | and designers", <https://open.ribose.com/>. | |||
[RNPGP_SEXPP] | [RNPGP_SEXPP] | |||
RNP, R., "S-Expressions parser and generator library in | "S-Expressions parser and generator library in C++ (SEXP | |||
C++ (SEXP in C++)", version 0.8.7, 28 June 2023, | in C++)", Version 0.9.2, commit 249c6e3, 22 March 2025, | |||
<https://github.com/rnpgp/sexpp>. | <https://github.com/rnpgp/sexpp>. | |||
[SDSI] Rivest, R. and B. Lampson, "A Simple Distributed Security | [SDSI] Rivest, R. and B. Lampson, "A Simple Distributed Security | |||
Architecture", working document, SDSI version 1.1, 2 | Architecture", Working document for SDSI version 1.1, 2 | |||
October 1996, <https://people.csail.mit.edu/rivest/pubs/ | October 1996, <https://people.csail.mit.edu/rivest/pubs/ | |||
RL96.ver-1.1.html>. | RL96.ver-1.1.html>. | |||
[SexpCode] Malkiewicz, J., "SEXP---(S-expressions)", 10 June 2015, | [SexpCode] "SEXP---(S-expressions)", commit 4aa7c36, 10 June 2015, | |||
<https://github.com/jpmalkiewicz/rivest-sexp>. | <https://github.com/jpmalkiewicz/rivest-sexp>. | |||
[SEXPP] Davis, R., "SexpProcessor", 10 June 2015, | [SEXPP] "SexpProcessor", commit a90f90f, 11 April 2025, | |||
<https://github.com/seattlerb/sexp_processor>. | <https://github.com/seattlerb/sexp_processor>. | |||
[SFEXP] Sottile, M., "Small Fast X-Expression Library", 24 March | [SFEXP] "Small Fast X-Expression Library", commit b7d3bea, 24 | |||
2023, <https://github.com/mjsottile/sfsexp>. | March 2023, <https://github.com/mjsottile/sfsexp>. | |||
[SPKI] Rivest, R., "SPKI/SDSI 2.0 A Simple Distributed Security | [SPKI] Rivest, R., "SPKI/SDSI 2.0 A Simple Distributed Security | |||
Infrastructure", | Infrastructure", | |||
<https://people.csail.mit.edu/rivest/pubs/RL96.slides- | <https://people.csail.mit.edu/rivest/pubs/RL96.slides- | |||
maryland.pdf>. | maryland.pdf>. | |||
[XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., | [XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., | |||
and F. Yergeau, "Extensible Markup Language (XML) 1.0", 26 | and F. Yergeau, "Extensible Markup Language (XML) 1.0", | |||
November 2008, <https://www.w3.org/TR/REC-xml/>. | W3C Recommendation, 26 November 2008, | |||
<https://www.w3.org/TR/2008/REC-xml-20081126/>. Latest | ||||
version available at <https://www.w3.org/TR/REC-xml/>. | ||||
Appendix A. Implementations | Appendix A. Implementations | |||
At this time there are multiple implementations, many open source, | At this time there are multiple implementations, many open source, | |||
available that are intended to read and parse some or all of the | available that are intended to read and parse some or all of the | |||
various S-expression formats specified here. In particular, see the | various S-expression formats specified here. In particular, see the | |||
following likely incomplete list: | following -- likely incomplete -- list: | |||
* Project GNU's [Libgcrypt]. | ||||
* Ribose's RNP [RNPGP_SEXPP] in C++. | ||||
* Github project of J. P. Malkiewicz [SexpCode] in C. | ||||
* The Inferno implementation [Inferno]. | ||||
* Small Fast X-Expression Library [SFEXP]. | ||||
* S-expression Processor [SEXPP] in Ruby. | ||||
* Canonical S-expressions [CANON] (OCAML). | ||||
Appendix B. Change History | ||||
RFC Editor Note: Please delete this section before publication. | ||||
B.1. -00 Changes | ||||
This sub-section summarizes significant changes between the original | ||||
1997 -00 version of this document and the 2023 -00 version submitted | ||||
to the IETF. | ||||
1. Convert to XML v3. | ||||
2. Update Ron Rivest author information and, with his permission, | ||||
add Donald Eastlake as an author. | ||||
3. Add minimal "IANA Considerations" and "Security Considerations" | ||||
sections. | ||||
4. Since implementation requirements terminology is used, add the | ||||
usual paragraph about it as a sub-section of Section 1 and add | ||||
references to [RFC2119] and [RFC8174]. | ||||
5. Divide references into Normative and Informational and update | ||||
base-64 reference to be to [RFC4648]. | ||||
6. Add a couple of sentences to the "Historical note" section about | ||||
the history of -00 versions of the draft. | ||||
B.2. Changes from -00 to -01 | ||||
1. Fix glitches and errors in the BNF. | ||||
2. Add Acknowledgements section to list Marc Petit-Huguenin (who | ||||
provided BNF improvements) and John Klensin. | ||||
3. Update code references in Appendix A and add to Informative | ||||
References section. Note: The code in the Malkiewicz github | ||||
repository may be the code that was originally at | ||||
http://theory.lcs.mit.edu/~rivest/sexp.html | ||||
4. Add this Change History Appendix. | ||||
5. Move "Historical Notes" which were formerly a separate section at | ||||
the end of the document up to be a sub-section of Section 1. | ||||
6. Add references to [LISP], [RFC2692], and [RFC2693]. | ||||
7. Add simple security considerations. | ||||
8. Minor editorial fixes/improvements. | ||||
B.3. Changes from -01 to -02 | ||||
1. Change default MIME Type in Section 4.6 to have charset=utf-8 | ||||
[RFC4648]. | ||||
2. Change BNF to ABNF and add reference to [RFC5234]. | ||||
3. Move Marc Petit-Huguenin to a Contributors section for his work | ||||
on the ABNF. | ||||
B.4. Changes from -02 to -03 | ||||
1. Add current S-expression usage Section 1.2. | ||||
2. Add the white book [C] as a reference. | ||||
3. Add reference to the Ribose RNP code [RNPGP_SEXPP]. | ||||
4. Minor editorial improvements. | ||||
B.5. Changes from -03 to -04 | ||||
Trivial keep-alive update. | ||||
B.6. Changes from -04 to -05 | ||||
1. Add reference to [Inferno] implementation. | ||||
2. Eliminate remaining references to being a "proposal". | ||||
3. Emphasize that a particular application can specify a different | ||||
default display-hint. | ||||
4. Add reference to [RFC0020] for ASCII. | ||||
5. Minor editorial improvements. | ||||
B.7. Changes from -05 to -06 | ||||
1. Move implementations list to Appendix A. Add numerous | ||||
implementations. | ||||
2. Change default display-hint to "application/octet-stream". | ||||
3. Expand Abstract and include most of Abstract in the Introduction. | ||||
4. Use different tokens for the top-level rule in the three ABNF | ||||
encodings so that the rules would not collide if all were used. | ||||
Fix ABNF for "printable". | ||||
5. Add an illustration of list-structure memory representation. | ||||
6. Editorial improvements. | ||||
B.8. Changes from -06 to -07 | ||||
1. Re-order some top-level sections. | ||||
2. Replace "list-structure" memory figure with explanation and | ||||
[LISP2] reference. | ||||
3. Re-organize ABNF to give full ABNF for advanced transport first | ||||
and then mostly derive canonical and basic from advanced. | ||||
4. Correct reference to [RFC5234] to be to Appendix B.1, not | ||||
Appendix A. | ||||
5. Attempt to clarify the difference between canonicalization and | ||||
equality. | ||||
6. Add the explicit Section 6.1 on base-64 representation of | ||||
S-expressions. | ||||
7. Globally hyphenate "octet-string" and "display-hint", generally | ||||
replace "byte" with "octet". | ||||
8. Add some more examples here and there. | ||||
9. Fix typos. Other editorial improvements. | ||||
B.9. Changes from -07 to -08 | ||||
1. A variety of minor fixes and more precise wording. | ||||
2. Give exact circumstances under which a space is needed to | ||||
separate successive octet-string representations in a list. | ||||
3. Additional editorial improvements. | ||||
B.10. Changes from -08 to -09 | ||||
1. Add mention of and reference to [formal]. | ||||
2. Add mention in the text that whitespace can appear just after the | ||||
opening curly brace and just before the closing curly | ||||
brace of base-64 encoding (the ABNF was correct). | ||||
3. Minor editorial improvements. | ||||
B.11. Changes from -09 to -10 | * Project GNU's [Libgcrypt] | |||
1. Revert default display hint to more closely follow the original | * Ribose's RNP [RNPGP_SEXPP] in C++ | |||
SPKI S-expressions. | ||||
2. Editorial improvements. | * Github project of J. P. Malkiewicz [SexpCode] in C | |||
B.12. Changes from -10 to -12 | * The Inferno implementation [Inferno] | |||
Minor ABNF fixes and editorial changes. | * Small Fast X-Expression Library [SFEXP] | |||
B.13. Changes from -12 to -13 | * S-expression Processor [SEXPP] in Ruby | |||
Added recommendation and references for using UTF-8 to support | * Canonical S-expressions [CANON2] (OCAML) | |||
Internationalization for text octet-strings. Minor other updates | ||||
based on IESG reviews. | ||||
Acknowledgements | Acknowledgements | |||
Special thanks to Daniel K. Gillmore for his extensive comments. | Special thanks to Daniel K. Gillmore for his extensive comments. | |||
The comments and suggestions of the following are gratefully | The comments and suggestions of the following are gratefully | |||
acknowledged: John Klensin and Caleb Malchik. | acknowledged: John Klensin and Caleb Malchik. | |||
Contributors | Contributors | |||
Special thanks to Marc Petit-Huguenin, particularly for his extensive | Special thanks to Marc Petit-Huguenin, particularly for his extensive | |||
work and advice on the ABNF and on locating and fixing unclear parts | work and advice on the ABNF and on locating and fixing unclear parts | |||
of earlier versions of this document: | of earlier draft versions of this document: | |||
Marc Petit-Huguenin | Marc Petit-Huguenin | |||
Impedance Mismatch LLC | Impedance Mismatch LLC | |||
Email: marc@petit-huguenin.org | Email: marc@petit-huguenin.org | |||
Authors' Addresses | Authors' Addresses | |||
Ronald L. Rivest | Ronald L. Rivest | |||
MIT CSAIL | MIT CSAIL | |||
32 Vassar Street, Room 32-G692 | 32 Vassar Street, Room 32-G692 | |||
Cambridge, Massachusetts 02139 | Cambridge, Massachusetts 02139 | |||
United States of America | United States of America | |||
Email: rivest@mit.edu | Email: rivest@mit.edu | |||
URI: https://www.csail.mit.edu/person/ronald-l-rivest | URI: https://www.csail.mit.edu/person/ronald-l-rivest | |||
Donald E. Eastlake 3rd | Donald E. Eastlake 3rd | |||
Independent | Independent | |||
End of changes. 125 change blocks. | ||||
419 lines changed or deleted | 250 lines changed or added | |||