Editor's Note: Minutes received 12/21/92

CURRENT_MEETING_REPORT_

Reported by Steve Casner/USC-ISI

Minutes of Audio/Video Transport Working Group (AVT)

The AVT Working Group met for three sessions. In the first two, we
reviewed the draft specification for the Real-time Transport Protocol
(RTP); the third session was an ``implementors agreement'' session
focusing on software video encoding.

Presentations of Draft RTP Specification

We are indebted to Henning Schulzrinne for his efforts in writing a
summary of the discussion from the Working Group meeting at the July
IETF, and subsequently developing that into a concise RTP specification
and a separate rationale document comparing the design tradeoffs
considered by the Working Group.

We began with a presentation by Henning on the draft protocol
specification. In brief, RTP supports the following functions:

   o Transfer of media data.
   o Demultiplexing of multiple flows.
   o Content identification.
   o Synchronization and sequencing.
   o Options for simple control functions such as identification of
     participants.

RTP consists primarily of a protocol header for real-time data packets.
In the typical case, the RTP header is just 8 octets long and composed
of the following fields (this includes some changes since the meeting):

   o Protocol version (2 bits, value 1)
   o Flow identifier (6 bits)
   o Option present bit
   o Synchronization bit (marks end of synchronization unit)
   o Content type index (6 bits)
   o Packet sequence number (16 bits)
   o Timestamp, middle 32 bits of NTP-format timestamp

The slides are not included here, but full details on the protocol are
available in the Internet-Drafts just released (see section 4).

Discussion of the Specification

In seeking ``rough consensus'', many issues were discussed, but no
roadblocks were identified. Some items simply required additional
explanation in the text. All items were resolved sufficiently for the
editor to produce the next draft.
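
As an illustration of the 8-octet header listed in the presentation
summary above, the following is a minimal sketch in Python. It assumes
the fields are packed in the order listed, most significant bit first
and in network byte order; the minutes do not state the bit layout, so
that ordering is an assumption, and the function names are hypothetical.
It also assumes that the ``middle 32 bits'' of a 64-bit NTP timestamp
means the low 16 bits of the seconds word followed by the high 16 bits
of the fraction word.

```python
import struct

RTP_VERSION = 1  # "value 1" per the field list in the minutes


def ntp_middle32(seconds: int, fraction: int) -> int:
    """Middle 32 bits of a 64-bit NTP timestamp (assumed interpretation):
    low 16 bits of the seconds word, high 16 bits of the fraction word."""
    return ((seconds & 0xFFFF) << 16) | ((fraction >> 16) & 0xFFFF)


def pack_header(flow: int, option_present: bool, end_of_sync: bool,
                content: int, seq: int, timestamp: int) -> bytes:
    """Pack the fixed header fields into 8 octets (assumed bit order)."""
    if not 0 <= flow < 64:
        raise ValueError("flow identifier must fit in 6 bits")
    if not 0 <= content < 64:
        raise ValueError("content type index must fit in 6 bits")
    # Octet 0: 2-bit version, 6-bit flow identifier.
    byte0 = (RTP_VERSION << 6) | flow
    # Octet 1: option-present bit, end-of-sync bit, 6-bit content index.
    byte1 = (int(option_present) << 7) | (int(end_of_sync) << 6) | content
    # Octets 2-3: sequence number; octets 4-7: timestamp.
    return struct.pack("!BBHI", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF)


def unpack_header(data: bytes) -> dict:
    """Inverse of pack_header for the first 8 octets of a packet."""
    byte0, byte1, seq, timestamp = struct.unpack("!BBHI", data[:8])
    return {
        "version": byte0 >> 6,
        "flow": byte0 & 0x3F,
        "option_present": bool(byte1 & 0x80),
        "end_of_sync": bool(byte1 & 0x40),
        "content": byte1 & 0x3F,
        "seq": seq,
        "timestamp": timestamp,
    }
```

The field widths add up to 2 + 6 + 1 + 1 + 6 + 16 + 32 = 64 bits, which
matches the 8-octet header size given above. Any options that follow
when the option-present bit is set are not modeled here.
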
The following items are expanded in the text below:

   o Framing of data units
   o End-of-synchronization-unit flag
   o Conference announcement protocol as a separate document
   o Timestamp mechanisms
   o Encoding/flow descriptors
   o Backchannel information, including QoS measurement
   o Profiles and mapping to port numbers

Framing is required when using RTP over a stream-oriented protocol
layer, but we discussed here that it is also needed to allow multiple
data units (e.g., from different media) in one packet. To allow
alignment and to avoid length constraints, the frame length field was
increased to 32 bits.

There was no objection to the change of the header flag from start- to
end-of-synchronization-unit. This gives a few advantages with only a
slight addition in complexity.

In the protocol draft sent out just before the meeting, a Conference
Announcement Protocol (CAP) was added. CAP is intended as one near-term
method of simple conference control until more sophisticated control
protocols are developed. However, this protocol was deemed by some to
be outside the scope of the Working Group, and in any case the Working
Group agreed it should be specified separately from RTP. It was also
agreed that no specific references to audio or video encoding should be
made in the RTP specification because it should be usable for other
applications as well.

Unlike the previous two meetings, there was relatively little discussion
of timestamp formats. The Group has settled on a real-time timestamp,
rather than a timestamp based on the media sample clock, to allow the
timestamp to be independent of the content type and to aid inter-media
synchronization. However, we need implementation experience to validate
this choice.
We discussed the need to clarify the wording in the specification to say
that globally synchronized time is not required if it is not available
(and inter-media synchronization is not required); also to specify that
timestamps within a synchronization unit should be derived from media
timing.

The topic receiving the most discussion was the encoding/flow (EF) field
and the EF description (EFDESC). The idea was that the value of the EF
field would be used as an index into a table describing both the flow
(or sequence state space) and the encoding (renamed content), which is
opaque to the RTP layer. Since the meeting, this combined-function
field has been found difficult to implement, and it has been separated
into two fields by sacrificing the ``option length'' field and replacing
it with just an ``option present'' bit. This requires parsing of all
the options to determine where the data starts, but that may not be a
disadvantage if all options must be processed before the data anyway.

Another topic receiving substantial discussion was the need to provide a
backchannel from receivers to the sender. The draft contained a
``quality of service measurement'' option that could be multicast by
receivers with or without their own data, but there may also be a need
to unicast encoding control information back to the sender for error
control or flow control. There is a need to identify to which flow from
the sender the backchannel information pertains.

A new idea was that RTP may be used in various ways for different
applications, and that we need to define and indicate those modes of
use. The term ``profile'' is used for this purpose. A profile might
indicate that one or more options are always present in a specified
order, effectively increasing the fixed size of the header. The profile
would also specify how content types are defined (statically in the
profile, or dynamically through some higher-level control protocol).
It is expected that use of RTP with a particular profile may be
identified by a registered port number for IP multicast service. Since
unicast service may require dynamically assigned port numbers, the
profile will have to be identified (perhaps by the registered port
number) in the control protocol that communicates the dynamic port
numbers between the endpoints. More work is needed on this topic.

It was suggested that we need a model for ``entity addressing'' covering
both the multicast and unicast cases. This touches on the use of IP
multicast addresses, port numbers, flow identifiers, and identification
of multiple sources within one host. Should the model of a flow be
unidirectional or bidirectional? These questions were not answered.

In addition to the topics listed above, we discussed the need to address
security measures (authentication, confidentiality, integrity) before
this protocol draft can become an RFC. However, we did not define those
measures yet.

``Implementors Agreement'' Session

The real-time transport protocol should be independent of the media
encoding algorithms and formats that belong to the next higher layer.
However, several members of the Working Group are developing packages
for software video compression, so we devoted the third Working Group
session to an ``implementors agreement'' discussion to promote
convergence and interoperation among these packages.

We heard presentations by Thierry Turletti on the INRIA ``IVS'' system
implementing software H.261 encoding; by Richard Cogger from Cornell on
the CUSeeMe package for Macintosh; and a short description was given
remotely by Ron Frederick at Xerox PARC on the ``nv'' package. Paul
Milazzo, who has previously made a presentation on the BBN ``DVC''
system, and Bob Clements of BBN, also participated in the discussion
remotely over the packet audio channel. Oliver Jones from PictureTel
made a presentation on coding standards applicable to this effort.
We found there was much in common among these systems, and several of
the implementors agreed to work together toward convergence and
interoperation. A first step is for a description of each of these
systems to be posted to the mailing list rem-conf@es.net (some
information has already been posted).

There was some discussion of defining an API for the software
compression algorithms so they could be plugged into application
frameworks on different platforms. However, Paul Milazzo pointed out
that it may be necessary to interleave compression operations into the
acquisition process to reduce processing time, so it may be infeasible
or at least premature to define an API between the two steps.

We also determined there were no conflicts between the draft RTP
protocol and the requirements of these packages. We will need to define
an enumeration of these experimental encodings to allow systems to
process multiple formats.

Further Working Group Activities

Subsequent to this meeting, an updated set of Internet-Drafts on RTP was
issued on December 18th to incorporate the changes discussed at the
meeting. These are:

   o draft-ietf-avt-rtp-00.txt
   o draft-ietf-avt-encoding-00.txt
   o draft-ietf-avt-profile-00.txt
   o draft-ietf-avt-issues-00.ps, .txt

The first draft is the specification of the real-time transport protocol
itself. The second and third drafts define a set of media encodings and
a sample profile for use of those encodings to implement audio and video
multiparticipant conferences with minimal control. The last draft is an
updated discussion of the issues and decisions involved in the design of
the protocol.

Before these drafts are issued as RFCs, it is important that we obtain
sufficient implementation and operational experience to validate or
revise the protocol. Our goal should be to implement the protocol for
both audio and video, experiment with it, and have implementations ready
for use to multicast the next IETF meeting in March.
Assuming success in this process, the drafts should then be submitted to
become RFCs after review at the March meeting.

Attendees

Vikas Aggarwal         vikas@jvnc.net
Brian Bataille         bataillebc@afotec.af.mil
Lou Berger             lberger@bbn.com
Dean Blackketter       deanb@apple.com
Rita Brennan           brennan@apple.com
Stephen Casner         casner@isi.edu
Kay Chang              chang@chang.austin.ibm.com
Wo Chang               wchang@nist.gov
Richard Cogger         rhx@cornell.cit.bitnet
Robert Cole            rgc@qsun.att.com
Hans Eriksson          hans@sics.se
Jerry Friesen          jafries@sandia.llnl.gov
James Geddes           wk05020@worldlink.com
Robert Gilligan        Bob.Gilligan@eng.sun.com
Mark Green             markg@apple.com
Thomas Hacker          hacker@citi.umich.edu
Don Hoffman            don.hoffman@eng.sun.com
Christian Huitema      christian.huitema@sophia.inria.fr
Oliver Jones           oj@pictel.com
Phil Karn              karn@qualcomm.com
Charley Kline          cvk@uiuc.edu
Jim Knowles            jknowles@binky.arc.nasa.gov
Christopher Kolb       kolb@psi.com
Fong-Ching Liaw        fong@eng.sun.com
Louis Mamakos          louie@ni.umd.edu
Donald Merritt         don@brl.mil
Greg Minshall          minshall@wc.novell.com
Mitra                  mitra@pandora.sf.ca.us
Kathleen Nichols       nichols@apple.com
Ari Ollikainen         ari@es.net
Michael Patton         map@bbn.com
Jim Perchik            perchik@athena.mit.edu
Mike Petry             petry@ni.umd.edu
Joe Ragland            jrr@concert.net
Bala Rajagopalan       braja@qsun.att.com
Allan Rubens           acr@merit.edu
Tom Sandoski           tom@concert.net
Eve Schooler           schooler@isi.edu
Dallas Scott           scott@fluky.mitre.org
Lansing Sloan          ljsloan@llnl.gov
Frank Solensky         solensky@andr.ub.com
Joo Young Song         jysong@ring.kotel.co.kr
Terrance Sullivan      terrys@newbridge.com
Tang Tang              tt@virginia.edu
Morton Taragin         vsmorty@weizmann.weizmann.ac.il
Sally Tarquinio        sallyt@gateway.mitre.org
Claudio Topolcic       topolcic@cnri.reston.va.us
Thierry Turletti       turletti@sophia.inria.fr
Zheng Wang             z.wang@cs.ucl.ac.uk
Von Welch              vwelch@ncsa.uiuc.edu
Peter Will             will@isi.edu
Kirk Williams          kirk@sbctri.sbc.com
Jeff Young             young@alw.nih.gov
Paul Zawada            Zawada@ncsa.uiuc.edu