CURRENT_MEETING_REPORT_

Reported by Eric Miller/Online Computer Library Center and Ed Levinson/Accurate Information Systems

Minutes of the MIME Content-Type for SGML Documents Working Group (MIMESGML)

Minutes reported by Ed Levinson from notes compiled by Eric Miller.

Ed Levinson chaired the meeting. After the introductions and administrivia, progress against the milestones was reviewed.

A draft was available for review and comment by the SGML Open Consortium Technical Committee. No formal report was received, but informal conversations indicate that SGML Open viewed favorably the convergence of ideas being discussed on the MIMESGML mail list and the incorporation of James Clark's ideas.

An Internet-Draft was not available, as scheduled, for review at this IETF meeting. A preliminary draft was distributed to the mail list. It was reported that James Clark had implemented many of the ideas in the draft document.

Ed Levinson reported that the MIMESGML work comprises three documents:

   o  ``Encapsulating SGML Documents Using the Multipart/Related Content Type'', which was the draft distributed.

   o  ``Multipart/Related Content Type'', which is an Internet-Draft (draft-levinson-multipart-related-00.txt).

   o  ``Access-Type Content-ID'', which has not yet been distributed.

The ``Encapsulating SGML...'' draft presents the whole picture and uses Multipart/Related to wrap the document's SGML entities as a group of MIME body parts. The content-id access type permits an SGML entity to reference another body part of the message, eliminating the need to include duplicate data. A subdoc will be its own Multipart/Related MIME entity; this is a recursive structure which provides each subdoc with its own namespace.

Ed Levinson presented the Content-SGML-Entity (cse) header that the ``Encapsulating SGML...'' draft proposes. That header is intended to allow the unpacker to avoid parsing the SGML document. The information from the cse header can be supplied to the SGML entity manager.
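The packaging idea described above can be sketched roughly as follows. This is only an illustration using Python's standard email package, not the mechanism defined in the drafts. The function names package_sgml_entities and resolve_content_id, the Content-ID values, and the value placed in the Content-SGML-Entity header are all hypothetical; these minutes do not give the header's syntax, so a placeholder string stands in for it.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def package_sgml_entities(entities):
    """Wrap SGML entities as body parts of one multipart/related entity.

    `entities` is a list of (content_id, data_bytes, cse_value) tuples.
    The cse_value is a placeholder; the real Content-SGML-Entity syntax
    is defined by the draft and is not reproduced here.
    """
    root = MIMEMultipart("related")
    for content_id, data, cse_value in entities:
        part = MIMEApplication(data, _subtype="sgml")
        part["Content-ID"] = "<%s>" % content_id
        part["Content-SGML-Entity"] = cse_value  # hypothetical value syntax
        root.attach(part)
    return root


def resolve_content_id(root, content_id):
    """Return the body part with the given Content-ID, or None.

    This mimics one entity referencing another body part of the same
    message instead of carrying a duplicate copy of the data.
    """
    wanted = "<%s>" % content_id
    for part in root.get_payload():
        if part["Content-ID"] == wanted:
            return part
    return None


if __name__ == "__main__":
    message = package_sgml_entities([
        ("doc@example.org", b"<!DOCTYPE memo SYSTEM>", "placeholder"),
        ("chapter1@example.org", b"<chapter>text</chapter>", "placeholder"),
    ])
    found = resolve_content_id(message, "chapter1@example.org")
    print(message.get_content_type())
    print(found.get_payload(decode=True))
```

Because a subdoc would itself be a Multipart/Related entity, the same function could in principle be applied recursively, with the resulting entity attached as a body part of its parent.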
Notation declarations will also be described in Content-SGML-Entity headers. A discussion ensued on passing scripts as part of the notation information. There was concern about the security issues that would present. Ed Levinson recapped the recent relevant discussion on the mail list, and several people suggested that scripting be deferred and addressed in a follow-on document. The consensus at the meeting was to use a very simple mechanism for exposing the notation declaration at the MIME level and to defer the more complicated mechanism.

There was a lengthy discussion of character set related issues stemming from James Clark's message to the mail list.

   o  Non-US-ASCII character sets in the Content-SGML-Entity header. RFC 1522 provides a mechanism for encoding them, =?charset?X?encoded word?=, where charset is a MIME charset, X is Q for quoted-printable encoding or B for base64, and encoded word is the non-US-ASCII word as encoded. When presented to the unpacker, the word will be decoded.

   o  What is the character set of the SGML declaration? It was proposed to use the charset parameter, with a default of us-ascii.

   o  How to deal with non-8bit character encodings. One option was to specify an encoding scheme (charset). It was noted that an application/* body is treated as an octet stream, so the appropriate place to specify the character set is the SGML declaration.

   o  How will Record Start and Record End (RS/RE) be indicated? This was rephrased as: given an octet stream, how do you know the record boundaries? Should an entity be shipped with its proprietary line ends, or should it be converted to an SGML entity with defined line ends (RS/RE), as with text/sgml? A number of RS/RE conventions were mentioned (all combinations of CR and/or LF). It was noted that the SGML declaration defines the RS/RE characters and that is what you should expect to see in application/sgml. Meeting consensus: use the local end-of-line for text/sgml SGML entities; otherwise follow the SGML specification.
It was agreed that this SGML item appears to be broken, and it was suggested that an Informational RFC be written to propose that it be resolved by the SGML standards group. David Duran volunteered to write a draft.

An extensive and energetic discussion of character sets followed. A similar discussion is also taking place in the HTML Working Group. The chair undertook to align the work of the MIMESGML and HTML Working Groups. The participants agreed to the following definitions:

   o  A character set is a mapping from characters to integers.

   o  A character encoding is an algorithm mapping a stream of octets to a stream of integers.

   o  The MIME charset is a character encoding.

[Note: In a meeting the following day with MIME, MIMESGML, and HTML participants, it was agreed that the MIME charset maps octets to characters, not integers. Dan Connolly will write an RFC giving formal definitions of these and related terms; it will reference the corresponding ISO definitions. It was also agreed to pronounce charset with the ch as in charcoal, to avoid confusion with character set. --Ed Levinson]

The final two agenda items were dealt with quickly.

For partial documents, the entities will be shipped as simple MIME body parts. One example of their use is a request for part of a document.

SDIF will be included in the ``Encapsulating SGML...'' draft. It should only be a paragraph or two.

A liaison statement from ISO/IEC JTC1/SC18/WG8 was read to the group. It offered to ballot the MIME encoding of SDIF for inclusion in the International Standard as a conforming encoding. It was agreed to send SC18/WG8 a copy of the Internet-Drafts and to solicit their comments.
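As a rough illustration of the RFC 1522 encoded-word form and the charset definitions discussed earlier in these minutes, the sketch below decodes an encoded word with Python's standard email.header module. The helper name decode_encoded_words and the sample header values are hypothetical and do not come from the drafts. The MIME charset is used to turn octets into characters, matching the definition agreed in the follow-up meeting; Python's ord() then shows the integer assigned to a character, here by Unicode, which agrees with ISO 8859-1 for this character.

```python
from email.header import decode_header


def decode_encoded_words(value):
    """Decode RFC 1522 encoded words (=?charset?X?encoded word?=).

    decode_header() splits the value into (data, charset) pairs. Data is
    bytes for encoded words and may be str for plain text; the charset
    is what maps the decoded octets to characters.
    """
    pieces = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            pieces.append(data.decode(charset or "us-ascii"))
        else:
            pieces.append(data)
    return "".join(pieces)


if __name__ == "__main__":
    # The same word in Q (quoted-printable style) and B (base64) form.
    q_form = "=?iso-8859-1?Q?caf=E9?="
    b_form = "=?iso-8859-1?B?Y2Fm6Q==?="
    word = decode_encoded_words(q_form)
    print(word == decode_encoded_words(b_form))
    # Character set view: each character corresponds to an integer.
    print([ord(ch) for ch in word])
```

Both forms carry the same four octets; only the transfer encoding named by the X field differs.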