Minutes of the Audio/Video Transport Working Group
Reported by Colin Perkins.

The audio/video transport working group held two full meetings in Minneapolis and, in addition, a sub-group met to discuss the transport of MPEG-4 in a telephone conference with the MPEG committee meeting in Korea.

Since the last meeting the group has published RFC2508 (IP/UDP/RTP header compression). The group does not currently have any drafts awaiting IESG approval for publication although some are now ready for working group last call.

The revised RTP specification (draft-ietf-avt-rtp-new-03.txt) was presented by Steve Casner. Recent changes include a clarification that the payload type may change during a session, a review of sections 6.2 and 6.3 which resulted in several minor corrections and a removal of the requirement that inactive participant state is retained for 30 minutes (obsoleted by the reconsideration rules), and several corrections to the code in appendix A.7. The specification is now believed to be complete: all members of the working group are encouraged to read, check and comment on this document.

The SSRC sampling draft (draft-ietf-avt-rtpsample-02.txt) is now complete except that the wording of the IPR statement has to be updated to match the guidelines in RFC2026. Once this is done it is intended to hold last call on this document for experimental status.

The new RTCP conformance testing draft (draft-ietf-avt-rtcptest-00.txt) was presented by Jonathan Rosenberg. This describes several tests which may be performed on an RTP implementation to determine if it correctly implements the RTCP send/receive rules. Reaction to this draft was favourable, and a number of additional tests were noted for possible inclusion in a future version: check that SSRC identifiers are randomly allocated and check response to an SSRC collision. It was also noted that the draft assumes the default RTCP sender/receiver bandwidth fractions, and should be made more general.
This document was adopted as an AVT work item, for eventual progression as an informational RFC. This draft also specifies the behavior of a software "test instrument" to be used in performing the tests; it would be a tremendous service to the working group if one or more members decided to implement this test instrument.

Recent changes to the RTP A/V profile (draft-ietf-avt-profile-new-05.txt) were presented by Steve Casner. The use of MUST, SHOULD, MAY, etc, is now complete throughout the document, the possible use of a non-default RTCP bandwidth fraction is noted, a "changes from RFC1890" section has been added and the GSM-HR, GSM-EFR, QCELP, BT656, H263-1998 and BMPEG codecs have been added.

The use of non-default RTCP bandwidth fractions requires the definition of appropriate SDP bandwidth modifiers. For example:

    b=RS: RTCP sender bandwidth
    b=RR: RTCP receiver bandwidth

However, it is noted that SDP allows for bandwidth modifiers to be expressed in kb/s as an integer only. The SDP specification must be updated either to allow bandwidth modifiers to be specified as being in b/s or to accept fractional values. Mark Handley expressed a preference for the former choice and no counter-arguments were given. It was agreed that the specification of these modifiers could simply be added to the RTP A/V profile, although since they may be of use in other profiles it may be better to start a new draft to record additions and changes to SDP.

The new GSM payload formats are defined by reference to an ETSI document only, yet it is unclear whether this is acceptable. Concern was raised that a definition by reference made it difficult to find the information, yet copying the relevant tables into the profile raises the potential for inconsistency. A compromise was suggested whereby a copy of the format is included in the profile as an appendix for convenience, but it is noted that the ETSI document is authoritative.
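Returning to the RTCP bandwidth modifiers discussed above, the following is a minimal sketch of why integer kb/s values are too coarse. It assumes the default fractions from the RTP specification (RTCP limited to 5% of the session bandwidth, with 25% of that share for senders) and assumes, as the meeting preferred, that b=RS and b=RR values are expressed in b/s. The function names and the sample session description are hypothetical and are not taken from any draft.

```python
def default_rtcp_split(session_bw_bps):
    """Return (sender_bps, receiver_bps) using the default RTCP fractions.

    Assumes RTCP is limited to 5% of the session bandwidth and that
    senders get 25% of that share, with the rest going to receivers.
    """
    total = session_bw_bps * 5 // 100
    sender = total // 4
    receiver = total - sender
    return sender, receiver


def parse_rtcp_bandwidth(sdp_text):
    """Collect b=RS: and b=RR: values from a session description.

    Values are read as integers and assumed (hypothetically) to be in b/s.
    Other bandwidth modifiers, such as b=AS:, are ignored.
    """
    result = {}
    for line in sdp_text.splitlines():
        line = line.strip()
        if not line.startswith("b="):
            continue
        modifier, sep, value = line[2:].partition(":")
        if sep and modifier in ("RS", "RR"):
            result[modifier] = int(value)
    return result


# Hypothetical session description for a 64 kb/s audio stream.
EXAMPLE_SDP = """\
m=audio 49170 RTP/AVP 0
b=AS:64
b=RS:800
b=RR:2400
"""

if __name__ == "__main__":
    print(default_rtcp_split(64000))
    print(parse_rtcp_bandwidth(EXAMPLE_SDP))
```

For a 64 kb/s session the default shares work out to 0.8 kb/s and 2.4 kb/s, neither of which can be written as an integer number of kb/s, which is the limitation noted in the discussion.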
A first draft document registering the RTP codec names in the MIME namespace has been produced by Philipp Hoschka (draft-hoschka-rtp-mime-00.txt). This draft defines procedures for registration of codecs in the MIME namespace with "encoding considerations" specifying how they are transported in RTP; it also registers the existing codecs in the A/V profile. At present it's a rough draft needing completion. This draft is separate from the profile so that it may define procedures for carrying the codec data via more traditional MIME transports in addition to RTP; for example, draft-alvestrand-audio-l16-01.txt on the L16 codec should be merged into this draft. This draft is referenced by the revised profile draft, but if advancement of the profile to Draft Standard status would be blocked by a reference to this separate draft at Proposed Standard status, we may consider merging the new draft into the profile. Open issue: what to do with vnd.wave and vnd.avi types defined in RFC 2361?

The A/V profile is now believed to be complete; once again, careful review by the working group is requested.

A revision to the PureVoice (QCELP) payload format has been produced in response to last call comments (draft-mckay-qcelp-02.txt). A new working group last call is now in progress on this draft - comments are solicited.

The guidelines for writers of RTP payload format specifications draft (draft-ietf-avt-rtp-format-guidelines-01.txt) is now complete. Working group last call for BCP will be issued shortly.

The RTP MIB (draft-ietf-rtp-mib-04.txt) was presented by Bill Strahm. Changes include: clarification of the difference between monitor and host implementations, explicit allowance of non-consecutive indexes into the rtpSessionEntry table, use of 32 bit rather than 16 bit indexes into this table, the removal of rtpSessionIfAddr and type change of rtpSessionIfIndex into InterfaceIndex rather than InterfaceIndexOrZero.
There are two known issues with the current specification: the references need updating to match the most recent SNMP documents, and compatibility with IPv6 needs to be checked. With these two exceptions, the document is believed to be ready for last call for proposed standard. Comments from the working group are solicited.

The generic FEC draft (draft-ietf-fec-05.txt) was presented by Jonathan Rosenberg. This revision clarifies usage with the RFC2198 payload format and defines SDP attributes for FEC protected media. Since the FEC data is sent as a separate stream from the media, it is represented in SDP by an additional "m=" line, with "a=fmtp" lines linking it to the media stream via "a=tag" directives, for example:

    m=audio 49170 RTP/AVP 0          -+
    c=IN IP4 224.2.17.12/127          | Stream protected by FEC
    a=tag:1                          -+
    m=video 50274 RTP/AVP 31
    m=audio 47182 RTP/AVP 121        -+
    c=IN IP4 224.2.17.13/127          | FEC stream
    a=rtpmap:121 parityfec/8000       |
    a=fmtp:121 1                     -+

It was suggested that this usage is unclear, since the FEC is really a content transfer encoding, rather than a new media type; a better solution may be to specify multiple "c=" lines for the media stream.

Furthermore, it is unclear how to register parity FEC as a MIME type, since it can apply to both audio and video. One possibility is to register it as both "audio/parityfec" and "video/parityfec", another may be to define a new top level MIME type and register, say, "encoding/parityfec". If the solution of using multiple "c=" directives in the session description is chosen, the problem may be avoided since the MIME type will be that of the media stream, and the parity FEC becomes a MIME content-transfer-encoding instead.

Finally, it was noted that the parity FEC work may be subject to a patent owned by 3Com. The meeting received an assurance that 3Com would "license the patent in accordance with [rfc]2026", and it is expected that a formal IPR statement will be forthcoming.
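The idea behind parity FEC can be illustrated with a short sketch: a parity packet is the bytewise XOR of a group of media payloads, so any single lost payload in the group can be rebuilt from the parity and the payloads that did arrive. The code below is a simplified illustration only. It is not the packet format defined in the draft: it ignores RTP and FEC headers, and it assumes the length of the lost payload is known to the receiver (the function names and the lost_length parameter are hypothetical).

```python
def xor_parity(payloads):
    """Return the bytewise XOR of all payloads, zero-padded to the longest."""
    length = max(len(p) for p in payloads)
    parity = bytearray(length)
    for payload in payloads:
        for i, byte in enumerate(payload):
            parity[i] ^= byte
    return bytes(parity)


def recover_lost(received, parity, lost_length):
    """Rebuild the single missing payload of a protected group.

    received    -- the payloads of the group that did arrive
    parity      -- the XOR parity computed over the whole group
    lost_length -- length of the missing payload (assumed known here)
    """
    missing = bytearray(parity)
    for payload in received:
        for i, byte in enumerate(payload):
            missing[i] ^= byte
    return bytes(missing[:lost_length])


if __name__ == "__main__":
    group = [b"alpha", b"be", b"gamma!"]
    parity = xor_parity(group)
    # Pretend the second payload was lost in transit.
    print(recover_lost([group[0], group[2]], parity, len(group[1])))
```

Because the parity is carried as its own stream, a receiver that does not understand it can simply ignore it, which is consistent with the separate "m=" line in the session description example.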
The parity FEC draft will be modified to note these issues.

A more loss-tolerant payload format for MPEG (1 or 2) layer III audio (draft-finlayson-rtp-mp3-00.txt) was presented by Ross Finlayson. The existing payload format, RFC2250, is fine for layer I or II audio but is not optimal for layer III (.mp3) since frames are not ADUs in MP3 and are not independently decodable, and hence such a stream is not very loss tolerant. This new payload format is a data-preserving rearrangement of the original stream, such that each packet contains complete ADUs, not codec frames. This makes the stream more error resilient, although the implementation needs more knowledge of MPEG audio to perform the encoding. It was decided to make this new payload format a work item of AVT.

A new payload format for DV format video (draft-kobayashi-dv-video-00.txt) was presented by Akimichi Ogawa. This format is straightforward, with multiple blocks of the codec output (DIF blocks) being packed into an RTP packet with no format specific header. Audio and video are typically bundled (for a data rate of around 30Mbps), but may be transmitted separately if desired. Again, this will become an AVT work item. Open issues with this draft include the handling of the 12 bit sampling option for DV audio (a new payload format for 12 bit audio may be defined and referenced from the DV draft) and the definition of SDP attributes to describe DV sessions.

A proposal to include location information in RTCP was presented by Jon Crowcroft (draft-crowcroft-rtcp-latlong-00.txt). This suggests defining an RTCP APP packet to include the real (or virtual) position of an RTP session participant in a media independent manner. Reaction to this was favourable, and it was suggested that the DNS LOC field has a definition for the format of a location description which could be reused. Also, it was suggested that this could be a new SDES packet type, rather than an APP packet.
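As an illustration of the APP packet approach described above, the sketch below packs a position into an RTCP APP packet (packet type 204) using the generic APP layout from the RTP specification: version, subtype, packet type, length in 32-bit words minus one, SSRC, a four-character name, then application data. The name "LLOC" and the encoding of latitude and longitude as signed 32-bit microdegrees are assumptions made purely for this example; the encoding in the draft itself may well differ.

```python
import struct

RTCP_APP = 204       # RTCP packet type for APP packets
RTP_VERSION = 2


def build_location_app(ssrc, lat_deg, lon_deg, name=b"LLOC", subtype=0):
    """Build an RTCP APP packet carrying a position.

    The 4-byte name and the microdegree encoding are hypothetical.
    """
    lat = round(lat_deg * 1_000_000)
    lon = round(lon_deg * 1_000_000)
    body = struct.pack("!I4sii", ssrc, name, lat, lon)
    # Length field: total packet size in 32-bit words, minus one.
    length_words = (4 + len(body)) // 4 - 1
    first_byte = (RTP_VERSION << 6) | (subtype & 0x1F)   # padding bit is 0
    header = struct.pack("!BBH", first_byte, RTCP_APP, length_words)
    return header + body


def parse_location_app(packet):
    """Return (ssrc, name, lat_deg, lon_deg) from a packet built above."""
    first_byte, ptype, _length = struct.unpack("!BBH", packet[:4])
    if first_byte >> 6 != RTP_VERSION or ptype != RTCP_APP:
        raise ValueError("not an RTCP APP packet")
    ssrc, name, lat, lon = struct.unpack("!I4sii", packet[4:20])
    return ssrc, name, lat / 1_000_000, lon / 1_000_000


if __name__ == "__main__":
    pkt = build_location_app(0x12345678, 44.98, -93.27)
    print(len(pkt), parse_location_app(pkt))
```

Carrying the same fields as an SDES item instead, as was also suggested, would change the framing but not the basic idea.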
There was also concern expressed that RTCP would not be sufficiently timely to convey location information for fast moving sources - a solution more along the lines of the MPEG BIFS concept may be more appropriate there. A revised draft with more details is expected in time for the Oslo meeting.

The RTP payload format for MPEG-4 streams (draft-ietf-avt-rtp-mpeg4-01.txt) was presented by Reha Civanlar and is a result of collaboration between AVT and the MPEG committee. The payload format maps MPEG-4 SL packets onto RTP in an efficient manner: those bits of the SL header which have a direct analogue in the RTP header are mapped onto the RTP header, whilst a payload header carries the additional features of the SL header.

Open issues are the mapping between RTP streams and MPEG elementary stream identifiers (ESs), which will probably require definition of additional SDP attributes, and the transport of the initial object descriptor (IOD). It was suggested that a possible alternative for conveying the mapping between ESs and RTP streams would be via an RTCP SDES item, since conveying it in the session description will only work if the selection of SSRCs can be controlled. It was also noted that some applications do not send RTCP reports, so an initial out-of-band mapping may be needed. It is possible that a combination of the two approaches may make most sense.

Concerns were also expressed that the transport of the IOD as part of the initial session description may be inappropriate. The IOD may be large, in which case it is wasteful to include it in a SAP announcement, and a URL may be more appropriate; in a SIP invitation it may be sensible to include the complete IOD as a MIME multipart response or in the SDP response; a reference to a BIFS stream may also be possible.
It may be that we cannot provide a single solution for communicating the IOD, and that the draft should give a list of examples of how this can be done, rather than defining a single solution, since it is clearly scenario dependent.

Finally, transport of MPEG-4 streams requires a multiplexing solution (for reasons outlined in the draft). It was noted that the GeRM proposal (draft-ietf-avt-germ-00.txt) provides a good fit to these needs.

Following the discussion of MPEG-4 transport, the group revisited the need for an RTP multiplexing scheme. The questions asked to focus this discussion were: should AVT standardize a multiplexing scheme? If yes, more than one scheme? Which one(s)? In addition, the chairs made a strawman proposal to standardize one scheme (GeRM) and recommend the use of Tmux (RFC1692) for other situations, noting that applications for which neither of these are satisfactory may specify their own multiplexing schemes, but that these would not need to be standardized by AVT.

There was considerable discussion on the merits of this proposal, initially much of it regarding Tmux. Many people were concerned that Tmux is a new IP protocol which is not well supported and cannot be (easily) implemented at the application level (these are similar arguments to those for why RTP was not given its own protocol number). It was also noted that multiplexing gateways are likely to be dedicated systems, so IP protocol numbers are less of an issue there. Concern was also expressed that Tmux does not compress headers although it does save on IP headers and reduce the packet count, whereas the other multiplexing proposals also include some form of header compression. Overall, the feeling of the meeting was that Tmux is not generally applicable for RTP multiplexing, and should not be one of our recommended solutions.

There was considerably less dissent regarding the idea of using GeRM as our multiplexing protocol.
It was noted that GeRM does not include the UDP port numbers, and that we may need to extend it to include these since they are needed in some cases to distinguish multiple streams. It was noted that we may wish to allow reduced complexity decoders which can only decode a subset of the GeRM functionality (for example, which can multiplex but not compress). This may also require SDP attributes to signal which subset of GeRM was being used by the sender. This may result in an additional section being added to the GeRM specification which describes the range of solutions, references Tmux, and includes the non-compressed GeRM and SDP extensions.

A number of comments were made that GeRM is too complex. These may be offset if it becomes clear that "profiles" of GeRM whereby applications may implement only a subset are acceptable. It was noted that compressed RTP (RFC2508) may also be applicable: if all that is required is to save the header overhead, then this will achieve the desired effect without the problems caused by multiplexing.

The consensus of the group was that we should work with the current GeRM proposal as our starting point. GeRM solves the transport problem, we would prefer one solution, and the holes in the other proposals were worse than in GeRM. We should try implementing GeRM for the proposed scenarios and see whether there are any serious limitations that can't be fixed and would justify creating another solution.

The next agenda item was the generic payload format. Michael Speer noted that there is work underway to revise this format as discussed in the Los Angeles meeting, and it is expected that this will be complete before the Oslo meeting.

Finally, the group discussed scaling RTCP for large groups. Concern has been expressed about the amount of router state used by RTCP in very large groups and it may be time to consider methods, other than SSRC sampling, and write some additional profiles to specify these other modes of operation.
For example, the group has previously discussed sampling which receivers respond to probes by sending RTCP packets (sliding key scheme, as proposed by Bolot/Turletti/Wakeman), summarisation and aggregation of RTCP reports, and unicast reporting to the source which can then forward as multicast. Proposals for such extensions were solicited from the group. Interested parties should write new RTP profile drafts which may reference the existing A/V profile and just specify the differences.

Mark Handley noted that there is work in the area of reliable multicast congestion control which may be applicable to RTP. The group is urged to follow this work, and to consider adding congestion control to their implementations.

Anders Klemets noted an additional problem with RTCP reception reports where multiple unicast clients are reporting to a server. It is desired to have a way by which the server can cause the clients to report less often than their minimum interval to avoid being overloaded by the multiple reports (from unicast clients which aren't aware of each other). It was noted that this can be done by using an RTCP bandwidth modifier in the session description, but this is done at startup only and cannot vary dynamically.

It was briefly noted, as the meeting closed, that the security area advisory has noted that DES encryption is insufficient, yet the RTP specification recommends the use of DES as a default. The group should consider changing this recommendation. The revised RTP spec already does refer to use of IPSEC, but defines the DES-based scheme for backward compatibility with existing implementations based on RFC 1889.
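To make the RTCP scaling discussion above more concrete, here is a simplified sketch of how the reporting interval depends on group size and RTCP bandwidth. It follows the basic rule in the RTP specification (interval proportional to the number of members times the average RTCP packet size, divided by the RTCP bandwidth, with a 5 second minimum) but deliberately leaves out the randomisation, the sender/receiver split and the reconsideration rules. The function name and the example figures are hypothetical.

```python
def rtcp_interval(members, avg_packet_bytes, rtcp_bw_bps, min_interval=5.0):
    """Return a deterministic RTCP reporting interval in seconds.

    members          -- number of session members the participant knows of
    avg_packet_bytes -- average size of an RTCP packet in bytes
    rtcp_bw_bps      -- RTCP bandwidth available to this group, in b/s
    min_interval     -- lower bound on the interval (5 seconds by default)
    """
    if members < 1 or rtcp_bw_bps <= 0:
        raise ValueError("members and rtcp_bw_bps must be positive")
    interval = members * avg_packet_bytes * 8 / rtcp_bw_bps
    return max(min_interval, interval)


if __name__ == "__main__":
    # A multicast receiver that can see 1000 members backs off by itself.
    print(rtcp_interval(1000, 100, 2400))
    # A unicast client sees only itself and the server, so it stays at the
    # minimum interval however many other clients the server is handling.
    print(rtcp_interval(2, 100, 2400))
    # Advertising a much smaller receiver bandwidth to each client is one
    # way for the server to stretch that interval at session startup.
    print(rtcp_interval(2, 100, 2400 / 1000))
```

The second case is the problem raised by Anders Klemets: each unicast client computes its interval from a membership of two, so the aggregate report rate at the server grows linearly with the number of clients unless the advertised bandwidth is reduced.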