Diff: rfc8845.original.v2v3.xml

	rfc8845.original.v2v3.xml	rfc8845.form.xml
	<?xml version='1.0' encoding='utf-8'?>	<?xml version='1.0' encoding='utf-8'?>
	<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">	<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">

	<rfc xmlns:xi="http://www.w3.org/2001/XInclude" submissionType="IETF"	<rfc xmlns:xi="http://www.w3.org/2001/XInclude" submissionType="IETF"
	category="std" consensus="yes" number="XXXX" obsoletes="" updates=""	category="std" consensus="yes" number="0000" obsoletes="" updates=""
	xml:lang="en" sortRefs="true" symRefs="true" tocInclude="true" version="3">	xml:lang="en" sortRefs="true" symRefs="true" tocInclude="true"
		version="3" ipr="trust200902" docName="draft-ietf-clue-framework-25">
	<!-- xml2rfc v2v3 conversion 2.45.2 -->	<!-- xml2rfc v2v3 conversion 2.45.2 -->
	<front>	<front>
	<title abbrev="CLUE Framework">Framework for Telepresence Multi-Streams</title>	<title abbrev="CLUE Framework">Framework for Telepresence Multi-Streams</title>

	<seriesInfo name="RFC" value="XXXX"/>	<seriesInfo name="RFC" value="0000"/>
	<!--[rfced] As the short title in the running header for this document
	was "CLUE Telepresence Framework", might the document title be better as:


	Controlling Multiple Streams for Telepresence (CLUE) Framework	<!--[rfced] As the short title in the running header for this document was
		"CLUE Telepresence Framework", might the document title be better as follows?

		Original:
		Framework for Telepresence Multi-Streams


		Perhaps:
		Controlling Multiple Streams for Telepresence (CLUE) Framework
	-->	-->
	<author fullname="Mark Duckworth" initials="M." role="editor" surname="Duckworth">	<author fullname="Mark Duckworth" initials="M." role="editor" surname="Duckworth">

	<organization>Polycom</organization>	<organization/>
	<address>	<address>
	<postal>	<postal>

	<street>Andover, MA 01810</street>	<city>Andover</city><region>MA</region><code>01810</code>
	<street>United States of America</street>	<country>United States of America</country>
	</postal>	</postal>

	<email>mark.duckworth@polycom.com</email>	<email>mrducky73@outlook.com</email>
	</address>	</address>
	</author>	</author>

	<!-- [rfced] We have received a bounce message for
	mark.duckworth@polycom.com. Please let us know how Mark's contact information
	should be updated.

	Original:
	Mark Duckworth (editor)
	Polycom
	Andover, MA 01810
	USA

	Email: mark.duckworth@polycom.com


		<!-- [rfced] Mark, we have updated your email address as requested.
		Please let us know if there are any other updates to your contact
		information.
	-->	-->

	<author fullname="Andrew Pepperell" initials="A." surname="Pepperell">	<author fullname="Andrew Pepperell" initials="A." surname="Pepperell">
	<organization>Acano</organization>	<organization>Acano</organization>
	<address>	<address>
	<postal>	<postal>

	<street>Uxbridge, England</street>	<city>Uxbridge</city>
	<street>United Kingdom</street>	<country>United Kingdom</country>
	</postal>	</postal>
	<email>apeppere@gmail.com</email>	<email>apeppere@gmail.com</email>
	</address>	</address>
	</author>	</author>
	<author fullname="Stephan Wenger" initials="S." surname="Wenger">	<author fullname="Stephan Wenger" initials="S." surname="Wenger">
	<organization abbrev="Vidyo">Vidyo, Inc.</organization>	<organization abbrev="Vidyo">Vidyo, Inc.</organization>
	<address>	<address>
	<postal>	<postal>
	<street>433 Hackensack Ave.</street>	<street>433 Hackensack Ave.</street>

	<street>Hackensack, N.J. 07601</street>	<city>Hackensack</city><region>NJ</region><code>07601</code>
	<street>United States of America</street>	<country>United States of America</country>
	</postal>	</postal>
	<email>stewe@stewe.org</email>	<email>stewe@stewe.org</email>
	</address>	</address>
	</author>	</author>

	<date month="November" year="2017"/>	<date month="June" year="2020"/>
	<workgroup>CLUE WG</workgroup>	<area>ART</area>
		<workgroup>CLUE</workgroup>
	<!-- [rfced] Please insert any keywords (beyond those that appear in	<!-- [rfced] Please insert any keywords (beyond those that appear in
	the title) for use on https://www.rfc-editor.org/search.	the title) for use on https://www.rfc-editor.org/search.
	-->	-->

	<keyword>example</keyword>	<keyword>example</keyword>
	<abstract>	<abstract>
	<t>	<t>
	This document defines a framework for a protocol to enable devices	This document defines a framework for a protocol to enable devices
	in a telepresence conference to interoperate. The protocol enables	in a telepresence conference to interoperate. The protocol enables
	communication of information about multiple media streams so a	communication of information about multiple media streams so a
	sending system and receiving system can make reasonable decisions	sending system and receiving system can make reasonable decisions
	about transmitting, selecting, and rendering the media streams.	about transmitting, selecting, and rendering the media streams.
	This protocol is used in addition to SIP signaling and Session Description Protocol (SDP)	This protocol is used in addition to SIP signaling and Session Description Protocol (SDP)
	negotiation for setting up a telepresence session.</t>	negotiation for setting up a telepresence session.</t>
	</abstract>	</abstract>
	</front>	</front>
	<middle>	<middle>

	<section anchor="section-1" numbered="true" toc="default">	<section anchor="s-1" numbered="true" toc="default">
	<name>Introduction</name>	<name>Introduction</name>
	<t>	<t>
	Current telepresence systems, though based on open standards such	Current telepresence systems, though based on open standards such
	as RTP <xref target="RFC3550" format="default"/> and SIP <xref target="RFC3261" format="default"/>, cannot easily interoperate with	as RTP <xref target="RFC3550" format="default"/> and SIP <xref target="RFC3261" format="default"/>, cannot easily interoperate with
	each other. A major factor limiting the interoperability of	each other. A major factor limiting the interoperability of
	telepresence systems is the lack of a standardized way to describe	telepresence systems is the lack of a standardized way to describe
	and negotiate the use of multiple audio and video streams	and negotiate the use of multiple audio and video streams
	comprising the media flows. This document provides a framework for	comprising the media flows. This document provides a framework for
	protocols to enable interoperability by handling multiple streams	protocols to enable interoperability by handling multiple streams
	in a standardized way. The framework is intended to support the	in a standardized way. The framework is intended to support the

	skipping to change at line 103 ¶	skipping to change at line 104 ¶
	The basic session setup for the use cases is based on SIP <xref target="RFC3261" format="default"/>	The basic session setup for the use cases is based on SIP <xref target="RFC3261" format="default"/>
	and SDP offer/answer <xref target="RFC3264" format="default"/>. In addition to basic SIP &amp; SDP	and SDP offer/answer <xref target="RFC3264" format="default"/>. In addition to basic SIP &amp; SDP
	offer/answer, signaling that is ControLling mUltiple streams for	offer/answer, signaling that is ControLling mUltiple streams for
	tElepresence (CLUE) specific is required to exchange the	tElepresence (CLUE) specific is required to exchange the
	information describing the multiple media streams. The motivation	information describing the multiple media streams. The motivation
	for this framework, an overview of the signaling, and the information	for this framework, an overview of the signaling, and the information
	required to be exchanged are described in subsequent sections of	required to be exchanged are described in subsequent sections of
	this document. Companion documents describe the signaling details	this document. Companion documents describe the signaling details
	<xref target="RFCYYY3" format="default"/>, the data model <xref target="RFCYYY1" format="default"/>, and the protocol <xref target="RFCYYY2" format="default"/>.</t>	<xref target="RFCYYY3" format="default"/>, the data model <xref target="RFCYYY1" format="default"/>, and the protocol <xref target="RFCYYY2" format="default"/>.</t>
	</section>	</section>

	<section anchor="section-2" numbered="true" toc="default">	<section anchor="s-2" numbered="true" toc="default">
	<name>Terminology</name>	<name>Terminology</name>

	<section anchor="section-2.1" numbered="true" toc="default">	<section anchor="s-2.1" numbered="true" toc="default">
	<name>Requirements Language</name>	<name>Requirements Language</name>
	<t>	<t>
	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
	NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",	NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
	"MAY", and "OPTIONAL" in this document are to be interpreted as	"MAY", and "OPTIONAL" in this document are to be interpreted as
	described in BCP 14 <xref target="RFC2119" format="default"/> <xref target="RFC8174" format="default"/>	described in BCP 14 <xref target="RFC2119" format="default"/> <xref target="RFC8174" format="default"/>
	when, and only when, they appear in all capitals, as shown here.	when, and only when, they appear in all capitals, as shown here.
	</t>	</t>
	</section>	</section>

	<section anchor="section-2.2" numbered="true" toc="default">	<section anchor="s-2.2" numbered="true" toc="default">
	<name>Definitions</name>	<name>Definitions</name>
	<t>	<t>
	This document occasionally refers to the term "CLUE". CLUE is an acronym for "ControLling mUltiple streams for	This document occasionally refers to the term "CLUE". CLUE is an acronym for "ControLling mUltiple streams for
	tElepresence", which is the name of the IETF working group in which	tElepresence", which is the name of the IETF working group in which
	this document and certain companion documents have been developed.	this document and certain companion documents have been developed.
	Often, CLUE-* refers to something that has been designed by	Often, CLUE-* refers to something that has been designed by
	the CLUE working group; for example, this document may be called	the CLUE working group; for example, this document may be called
	the CLUE-framework.	the CLUE-framework.
	</t>	</t>
	<t>	<t>
	The terms defined below are used throughout this document and	The terms defined below are used throughout this document and
	in companion documents. Capitalization is used in order to easily identify a defined term.</t>	in companion documents. Capitalization is used in order to easily identify a defined term.</t>

	<!--[rfced] We had the following questions about the "Definitions"	<!--[rfced] We had the following questions about the "Definitions"
	section:	section:

	a) We see the following text:	a) We see the following text:

	"Capitalization is used in order to easily identify a defined term."	"Capitalization is used in order to easily identify a defined term."

	However, we see a number of cases throughout where a term from the	However, we see a number of cases throughout where a term from the
	"Definitions" section (e.g., media stream) is used in lowercase form.	"Definitions" section (e.g., media stream) is used in lowercase form.
	As this is true for nearly every term in the list, we ask that you	As this is true for nearly every term in the list, we ask that you
	please review the use of capitalization of each of these terms	please review the use of capitalization of each of these terms
	throughout the document and let us know if/how we may make them	throughout the document and let us know if/how we may make them

	skipping to change at line 306 ¶	skipping to change at line 308 ¶
	<dt>Video Capture (VC):</dt>	<dt>Video Capture (VC):</dt>
	<dd>Media Capture for video. Denoted as VCn in the	<dd>Media Capture for video. Denoted as VCn in the
	example cases in this document.</dd>	example cases in this document.</dd>
	<dt>Video Composite:</dt>	<dt>Video Composite:</dt>
	<dd>A single image that is formed, normally by an RTP	<dd>A single image that is formed, normally by an RTP
	mixer inside an MCU, by combining visual elements from separate	mixer inside an MCU, by combining visual elements from separate
	sources.</dd>	sources.</dd>
	</dl>	</dl>
	</section>	</section>
	</section>	</section>

	<section anchor="section-4" numbered="true" toc="default">	<section anchor="s-4" numbered="true" toc="default">
	<name>Overview and Motivation</name>	<name>Overview and Motivation</name>
	<t>	<t>
	This section provides an overview of the functional elements	This section provides an overview of the functional elements
	defined in this document to represent a telepresence or	defined in this document to represent a telepresence or
	multistream system. The motivations for the framework described	multistream system. The motivations for the framework described
	in this document are also provided.</t>	in this document are also provided.</t>
	<t>	<t>
	Two key concepts introduced in this document are the terms "Media Provider" and "Media Consumer". A Media Provider represents the	Two key concepts introduced in this document are the terms "Media Provider" and "Media Consumer". A Media Provider represents the
	entity that sends the media and a Media Consumer represents the	entity that sends the media and a Media Consumer represents the
	entity that receives the media. A Media Provider provides Media in	entity that receives the media. A Media Provider provides Media in

	skipping to change at line 337 ¶	skipping to change at line 339 ¶
	telepresence devices, such as Endpoints and MCUs, would perform as	telepresence devices, such as Endpoints and MCUs, would perform as
	both Media Providers and Media Consumers, the former being	both Media Providers and Media Consumers, the former being
	concerned with those devices' transmitted media and the latter	concerned with those devices' transmitted media and the latter
	with those devices' received media. In a few circumstances, a	with those devices' received media. In a few circumstances, a
	CLUE-capable device includes only Consumer or Provider	CLUE-capable device includes only Consumer or Provider
	functionality, such as recorder-type Consumers or webcam-type	functionality, such as recorder-type Consumers or webcam-type
	Providers.</t>	Providers.</t>
	<t>	<t>
	The motivations for the framework outlined in this document	The motivations for the framework outlined in this document
	include the following:</t>	include the following:</t>

	<ol spacing="normal" type="1">	<ol spacing="normal" type="(%d)">
	<li>Endpoints in telepresence systems typically have multiple Media	<li>Endpoints in telepresence systems typically have multiple Media
	Capture and Media Render devices, e.g., multiple cameras and	Capture and Media Render devices, e.g., multiple cameras and
	screens. While previous system designs were able to set up calls	screens. While previous system designs were able to set up calls
	that would capture media using all cameras and display media on all	that would capture media using all cameras and display media on all
	screens, for example, there was no mechanism that could associate	screens, for example, there was no mechanism that could associate
	these Media Captures with each other in space and time, in a cross-vendor interoperable way.</li>	these Media Captures with each other in space and time, in a cross-vendor interoperable way.</li>
	<!--[rfced] Please note the discrepancy in how devices are treated in	<!--[rfced] Please note the discrepancy in how devices are treated in
	these two snippets. Should these types of uses be made more	these two snippets. Should these types of uses be made more
	uniform? If so, please let us know how to update.	uniform? If so, please let us know how to update.


	skipping to change at line 371 ¶	skipping to change at line 373 ¶
	-->	-->
	<li>The mere fact that there are multiple capturing and rendering	<li>The mere fact that there are multiple capturing and rendering
	devices, each of which may be configurable in aspects such as zoom,	devices, each of which may be configurable in aspects such as zoom,
	leads to the difficulty that a variable number of such devices can	leads to the difficulty that a variable number of such devices can
	be used to capture different aspects of a region. The Capture	be used to capture different aspects of a region. The Capture
	Scene concept allows for the description of multiple setups for	Scene concept allows for the description of multiple setups for
	those multiple capture devices that could represent sensible	those multiple capture devices that could represent sensible
	operation points of the physical capture devices in a room, chosen	operation points of the physical capture devices in a room, chosen
	by the operator. A Consumer can pick and choose from those	by the operator. A Consumer can pick and choose from those
	configurations based on its rendering abilities and then inform the	configurations based on its rendering abilities and then inform the

	Provider about its choices. Details are provided in <xref target="section-7" format="default"/>.</li>	Provider about its choices. Details are provided in <xref target="s-7" format="default"/>.</li>
	<li>In some cases, physical limitations or other reasons disallow	<li>In some cases, physical limitations or other reasons disallow
	the concurrent use of a device in more than one setup. For	the concurrent use of a device in more than one setup. For
	example, the center camera in a typical three-camera conference	example, the center camera in a typical three-camera conference
	room can set its zoom objective to capture either the middle	room can set its zoom objective to capture either the middle
	few seats only or all seats of a room, but not both concurrently. The	few seats only or all seats of a room, but not both concurrently. The
	Simultaneous Transmission Set concept allows a Provider to signal	Simultaneous Transmission Set concept allows a Provider to signal
	such limitations. Simultaneous Transmission Sets are part of the	such limitations. Simultaneous Transmission Sets are part of the

	Capture Scene description and are discussed in <xref target="section-8" format="default"/>.</li>	Capture Scene description and are discussed in <xref target="s-8" format="default"/>.</li>
	<li>Often, the devices in a room do not have the computational	<li>Often, the devices in a room do not have the computational
	complexity or connectivity to deal with multiple encoding options	complexity or connectivity to deal with multiple encoding options
	simultaneously, even if each of these options is sensible in	simultaneously, even if each of these options is sensible in
	certain scenarios, and even if the simultaneous transmission is	certain scenarios, and even if the simultaneous transmission is
	also sensible (i.e., in case of multicast media distribution to	also sensible (i.e., in case of multicast media distribution to
	multiple endpoints). Such constraints can be expressed by the	multiple endpoints). Such constraints can be expressed by the

	Provider using the Encoding Group concept, which is described in <xref target="section-9" format="default"/>.</li>	Provider using the Encoding Group concept, which is described in <xref target="s-9" format="default"/>.</li>
	<li>Due to the potentially large number of RTP streams required for	<li>Due to the potentially large number of RTP streams required for
	a Multimedia Conference involving potentially many Endpoints, each	a Multimedia Conference involving potentially many Endpoints, each
	of which can have many Media Captures and media renderers, it has	of which can have many Media Captures and media renderers, it has
	become common to multiplex multiple RTP streams onto the same	become common to multiplex multiple RTP streams onto the same
	transport address, so as to avoid using the port number as a	transport address, so as to avoid using the port number as a
	multiplexing point and the associated shortcomings such as	multiplexing point and the associated shortcomings such as
	NAT/firewall traversal. The large number of possible permutations	NAT/firewall traversal. The large number of possible permutations
	of sensible options a Media Provider can make available to a Media	of sensible options a Media Provider can make available to a Media
	Consumer makes a mechanism desirable that allows it to narrow down	Consumer makes a mechanism desirable that allows it to narrow down
	the number of possible options that a SIP offer/answer exchange has	the number of possible options that a SIP offer/answer exchange has

	skipping to change at line 438 ¶	skipping to change at line 440 ¶
	-->	-->
	The	The
	Media Provider and Media Consumer may use information in CLUE	Media Provider and Media Consumer may use information in CLUE
	messages to reduce the complexity of SIP offer/answer messages.	messages to reduce the complexity of SIP offer/answer messages.
	Also, there are aspects of the control of both Endpoints and MCUs	Also, there are aspects of the control of both Endpoints and MCUs
	that dynamically change during the progress of a call, such as	that dynamically change during the progress of a call, such as
	audio-level-based screen switching, layout changes, and so on,	audio-level-based screen switching, layout changes, and so on,
	which need to be conveyed. Note that these control aspects are	which need to be conveyed. Note that these control aspects are
	complementary to those specified in traditional SIP-based	complementary to those specified in traditional SIP-based
	conference management, such as Binary Floor Control Protocol (BFCP). An exemplary call flow can be	conference management, such as Binary Floor Control Protocol (BFCP). An exemplary call flow can be

	found in <xref target="section-5" format="default"/>.</li>	found in <xref target="s-5" format="default"/>.</li>
	</ol>	</ol>
	<t>	<t>
	Finally, all this information needs to be conveyed, and the notion	Finally, all this information needs to be conveyed, and the notion
	of support for it needs to be established. This is done by the	of support for it needs to be established. This is done by the
	negotiation of a "CLUE channel", a data channel negotiated early	negotiation of a "CLUE channel", a data channel negotiated early
	during the initiation of a call. An Endpoint or MCU that rejects	during the initiation of a call. An Endpoint or MCU that rejects
	the establishment of this data channel, by definition, does not	the establishment of this data channel, by definition, does not
	support CLUE-based mechanisms, whereas an Endpoint or MCU that	support CLUE-based mechanisms, whereas an Endpoint or MCU that
	accepts it is indicating support for CLUE as specified in this	accepts it is indicating support for CLUE as specified in this
	document and its companion documents.</t>	document and its companion documents.</t>
	</section>	</section>

	<section anchor="section-5" numbered="true" toc="default">	<section anchor="s-5" numbered="true" toc="default">
	<name>Description of the Framework/Model</name>	<name>Description of the Framework/Model</name>
	<t>	<t>
	The CLUE framework specifies how multiple media streams are to be	The CLUE framework specifies how multiple media streams are to be
	handled in a telepresence conference.</t>	handled in a telepresence conference.</t>
	<t>	<t>
	A Media Provider (transmitting Endpoint or MCU) describes specific	A Media Provider (transmitting Endpoint or MCU) describes specific
	aspects of the content of the media and the media stream encodings	aspects of the content of the media and the media stream encodings
	it can send in an Advertisement; and the Media Consumer responds to	it can send in an Advertisement; and the Media Consumer responds to
	the Media Provider by specifying which content and media streams it	the Media Provider by specifying which content and media streams it
	wants to receive in a Configure message. The Provider then	wants to receive in a Configure message. The Provider then

	skipping to change at line 669 ¶	skipping to change at line 671 ¶
	not use CLUE, then the CLUE-capable device falls back to behavior	not use CLUE, then the CLUE-capable device falls back to behavior
	that does not require CLUE.</t>	that does not require CLUE.</t>
	<t>	<t>
	As for the media, Provider and Consumer have an end-to-end	As for the media, Provider and Consumer have an end-to-end
	communication relationship with respect to (RTP-transported) media;	communication relationship with respect to (RTP-transported) media;
	and the mechanisms described herein and in companion documents do	and the mechanisms described herein and in companion documents do
	not change the aspects of setting up those RTP flows and sessions.	not change the aspects of setting up those RTP flows and sessions.
	In other words, the RTP media sessions conform to the negotiated	In other words, the RTP media sessions conform to the negotiated
	SDP whether or not CLUE is used.</t>	SDP whether or not CLUE is used.</t>
	</section>	</section>

	<section anchor="section-6" numbered="true" toc="default">	<section anchor="s-6" numbered="true" toc="default">
	<name>Spatial Relationships</name>	<name>Spatial Relationships</name>
	<t>	<t>
	In order for a Consumer to perform a proper rendering, it is often	In order for a Consumer to perform a proper rendering, it is often
	necessary (or at least helpful) for the Consumer to have received	necessary (or at least helpful) for the Consumer to have received
	spatial information about the streams it is receiving. CLUE	spatial information about the streams it is receiving. CLUE
	defines a coordinate system that allows Media Providers to describe	defines a coordinate system that allows Media Providers to describe
	the spatial relationships of their Media Captures to enable proper	the spatial relationships of their Media Captures to enable proper
	scaling and spatially sensible rendering of their streams. The	scaling and spatially sensible rendering of their streams. The
	coordinate system is based on a few principles:</t>	coordinate system is based on a few principles:</t>
	<ul spacing="normal">	<ul spacing="normal">

	skipping to change at line 728 ¶	skipping to change at line 730 ¶
	Y increases from the front of the room to the back of the room;	Y increases from the front of the room to the back of the room;
	Z increases from low to high (i.e., floor to ceiling).</t>	Z increases from low to high (i.e., floor to ceiling).</t>
	<t>	<t>
	Cameras in a scene typically point in the direction of increasing	Cameras in a scene typically point in the direction of increasing
	Y, from front to back. But there could be multiple cameras	Y, from front to back. But there could be multiple cameras
	pointing in different directions. If the physical space does not	pointing in different directions. If the physical space does not
	have a well-defined front and back, the provider chooses any	have a well-defined front and back, the provider chooses any
	direction for X, Y, and Z consistent with right-handed	direction for X, Y, and Z consistent with right-handed
	coordinates.</t>	coordinates.</t>
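
The right-handedness constraint stated above can be checked numerically. The following Python sketch is illustrative only and is not part of the CLUE framework; the helper names (cross, dot, is_right_handed) are hypothetical. It assumes the three axis directions are given as mutually orthogonal 3-tuples and relies on the standard property that, in a right-handed system, the cross product of the X and Y directions points along Z.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-tuples."""
    return sum(p * q for p, q in zip(a, b))


def is_right_handed(x_axis, y_axis, z_axis, tol=1e-9):
    """Return True if (x_axis, y_axis, z_axis) form a right-handed triad.

    Assumption of this sketch: the axes are mutually orthogonal, so the
    sign of (x cross y) dot z alone decides the handedness.
    """
    return dot(cross(x_axis, y_axis), z_axis) > tol


# A provider choosing axes for a room without a clear front and back
# could validate its choice before advertising coordinates.
print(is_right_handed((1, 0, 0), (0, 1, 0), (0, 0, 1)))
print(is_right_handed((1, 0, 0), (0, 1, 0), (0, 0, -1)))
```

Flipping any single axis turns a right-handed triad into a left-handed one, which is why the second call reports a mismatch.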
	</section>	</section>

	<section anchor="section-7" numbered="true" toc="default">	<section anchor="s-7" numbered="true" toc="default">
	<name>Media Captures and Capture Scenes</name>	<name>Media Captures and Capture Scenes</name>
	<t>	<t>
	This section describes how Providers can describe the content of	This section describes how Providers can describe the content of
	media to Consumers.</t>	media to Consumers.</t>

	<section anchor="section-7.1" numbered="true" toc="default">	<section anchor="s-7.1" numbered="true" toc="default">
	<name>Media Captures</name>	<name>Media Captures</name>
	<t>	<t>
	Media Captures are the fundamental representations of streams that	Media Captures are the fundamental representations of streams that
	a device can transmit. What a Media Capture actually represents is	a device can transmit. What a Media Capture actually represents is
	flexible:</t>	flexible:</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>It can represent the immediate output of a physical source (e.g.,	<li>It can represent the immediate output of a physical source (e.g.,
	camera, microphone) or 'synthetic' source (e.g., laptop computer, DVD player).</li>	camera, microphone) or 'synthetic' source (e.g., laptop computer, DVD player).</li>
	<li>It can represent the output of an audio mixer or video composer.</li>	<li>It can represent the output of an audio mixer or video composer.</li>
	<li>It can represent a concept such as 'the loudest speaker'.</li>	<li>It can represent a concept such as 'the loudest speaker'.</li>

	skipping to change at line 782 ¶	skipping to change at line 784 ¶
	Advertisement unique identity. The identity may be referenced	Advertisement unique identity. The identity may be referenced
	outside the Capture Scene that defines it through an MCC.</li>	outside the Capture Scene that defines it through an MCC.</li>
	<li>A Media Capture may be associated with one or more CSVs.</li>	<li>A Media Capture may be associated with one or more CSVs.</li>
	<li>A Media Capture has exactly one set of spatial information.</li>	<li>A Media Capture has exactly one set of spatial information.</li>
	<li>A Media Capture can be the source of at most one Capture	<li>A Media Capture can be the source of at most one Capture
	Encoding.</li>	Encoding.</li>
	</ul>	</ul>
	<t>	<t>
	Each Media Capture can be associated with attributes to describe	Each Media Capture can be associated with attributes to describe
	what it represents.</t>	what it represents.</t>

	<section anchor="section-7.1.1" numbered="true" toc="default">	<section anchor="s-7.1.1" numbered="true" toc="default">
	<name>Media Capture Attributes</name>	<name>Media Capture Attributes</name>
	<t>	<t>
	Media Capture Attributes describe information about the Captures.	Media Capture Attributes describe information about the Captures.
	A Provider can use the Media Capture Attributes to describe the	A Provider can use the Media Capture Attributes to describe the
	Captures for the benefit of the Consumer of the Advertisement	Captures for the benefit of the Consumer of the Advertisement
	message. All these attributes are optional. Media Capture	message. All these attributes are optional. Media Capture
	Attributes include:	Attributes include:

	</t>	</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>Spatial information, such as point of capture, point on line	<li>Spatial information, such as point of capture, point on line
	of capture, and area of capture, (all of which, in combination,	of capture, and area of capture, (all of which, in combination,
	define the capture field of, for example, a camera).</li>	define the capture field of, for example, a camera).</li>
	<li>Other descriptive information to help the Consumer choose	<li>Other descriptive information to help the Consumer choose
	between captures (e.g., description, presentation, view,	between captures (e.g., description, presentation, view,
	priority, language, person information, and type).</li>	priority, language, person information, and type).</li>
	</ul>	</ul>
	<t>	<t>
	The subsections below define the Capture attributes.</t>	The subsections below define the Capture attributes.</t>

	<section anchor="section-7.1.1.1" numbered="true" toc="default">	<section anchor="s-7.1.1.1" numbered="true" toc="default">
	<name>Point of Capture</name>	<name>Point of Capture</name>
	<t>	<t>
	The Point of Capture attribute is a field with a single Cartesian	The Point of Capture attribute is a field with a single Cartesian
	(X, Y, Z) point value that describes the spatial location of the	(X, Y, Z) point value that describes the spatial location of the
	capturing device (such as a camera). For an Audio Capture with	capturing device (such as a camera). For an Audio Capture with
	multiple microphones, the Point of Capture defines the nominal midpoint of the microphones.</t>	multiple microphones, the Point of Capture defines the nominal midpoint of the microphones.</t>
	</section>	</section>

	<section anchor="section-7.1.1.2" numbered="true" toc="default">	<section anchor="s-7.1.1.2" numbered="true" toc="default">
	<name>Point on Line of Capture</name>	<name>Point on Line of Capture</name>
	<t>	<t>
	The Point on Line of Capture attribute is a field with a single	The Point on Line of Capture attribute is a field with a single
	Cartesian (X, Y, Z) point value that describes a position in space	Cartesian (X, Y, Z) point value that describes a position in space
	of a second point on the axis of the capturing device, toward the	of a second point on the axis of the capturing device, toward the
	direction it is pointing; the first point being the Point of	direction it is pointing; the first point being the Point of
	Capture (see above).</t>	Capture (see above).</t>
	<t>	<t>
	Together, the Point of Capture and Point on Line of Capture define	Together, the Point of Capture and Point on Line of Capture define
	the direction and axis of the capturing device, for example, the	the direction and axis of the capturing device, for example, the

	skipping to change at line 837 ¶	skipping to change at line 839 ¶
	picked up by the microphone providing this specific audio capture.	picked up by the microphone providing this specific audio capture.
	If the Consumer wants to associate an Audio Capture with a Video	If the Consumer wants to associate an Audio Capture with a Video
	Capture, it can compare this volume with the area of capture for	Capture, it can compare this volume with the area of capture for
	video media to provide a check on whether the audio capture is	video media to provide a check on whether the audio capture is
	indeed spatially associated with the video capture. For example, a	indeed spatially associated with the video capture. For example, a
	video area of capture that fails to intersect at all with the audio	video area of capture that fails to intersect at all with the audio
	volume of capture, or is at such a long radial distance from the	volume of capture, or is at such a long radial distance from the
	microphone point of capture that the audio level would be very low,	microphone point of capture that the audio level would be very low,
	would be inappropriate.</t>	would be inappropriate.</t>
	</section>	</section>

	<section anchor="section-7.1.1.3" numbered="true" toc="default">	<section anchor="s-7.1.1.3" numbered="true" toc="default">
	<name>Area of Capture</name>	<name>Area of Capture</name>
	<t>	<t>
	The Area of Capture is a field with a set of four (X, Y, Z) points	The Area of Capture is a field with a set of four (X, Y, Z) points
	as a value that describes the spatial location of what is being	as a value that describes the spatial location of what is being
	"captured". This attribute applies only to video captures, not	"captured". This attribute applies only to video captures, not
	other types of media. By comparing the Area of Capture for	other types of media. By comparing the Area of Capture for
	different Video Captures within the same Capture Scene, a Consumer	different Video Captures within the same Capture Scene, a Consumer
	can determine the spatial relationships between them and render	can determine the spatial relationships between them and render
	them correctly.</t>	them correctly.</t>
	<t>	<t>
	The four points MUST be co-planar, forming a quadrilateral, which	The four points MUST be co-planar, forming a quadrilateral, which
	defines the Plane of Interest for the particular Media Capture.</t>	defines the Plane of Interest for the particular Media Capture.</t>
	<t>	<t>
	If the Area of Capture is not specified, it means the Video Capture	If the Area of Capture is not specified, it means the Video Capture
	might be spatially related to other Captures in the same Scene, but	might be spatially related to other Captures in the same Scene, but
	there is no detailed information on the relationship. For a switched	there is no detailed information on the relationship. For a switched
	Capture that switches between different sections within a larger	Capture that switches between different sections within a larger
	area, the area of capture MUST use coordinates for the larger	area, the area of capture MUST use coordinates for the larger
	potential area.</t>	potential area.</t>
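	<t>The co-planarity requirement above can be checked mechanically. The following is a minimal sketch, not part of the framework: the function name, the tuple representation of points, and the tolerance are assumptions made for illustration. It tests whether the scalar triple product of the three edge vectors from the first point is zero.</t>

```python
from typing import Sequence, Tuple

# Hypothetical representation: an (X, Y, Z) point as a tuple of numbers.
Point = Tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(u: Point, v: Point) -> Point:
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u: Point, v: Point) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def is_coplanar(points: Sequence[Point], tol: float = 1e-9) -> bool:
    """Return True if the four Area of Capture points lie in one plane.

    The tolerance `tol` is an illustrative assumption; the framework
    does not define one.
    """
    if len(points) != 4:
        raise ValueError("an Area of Capture has exactly four points")
    p0, p1, p2, p3 = points
    # Scalar triple product: zero when the three edge vectors are coplanar.
    triple = _dot(_sub(p1, p0), _cross(_sub(p2, p0), _sub(p3, p0)))
    return abs(triple) <= tol
```

	<t>For instance, the four points (0,0,0), (9,0,0), (0,0,9), and (9,0,9) all have Y = 0 and therefore pass this check.</t>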
	</section>	</section>

	<section anchor="section-7.1.1.4" numbered="true" toc="default">	<section anchor="s-7.1.1.4" numbered="true" toc="default">
	<name>Mobility of Capture</name>	<name>Mobility of Capture</name>
	<t>	<t>
	The Mobility of Capture attribute indicates whether or not the	The Mobility of Capture attribute indicates whether or not the
	point of capture, point on line of capture, and area of capture	point of capture, point on line of capture, and area of capture
	values stay the same over time, or are expected to change	values stay the same over time, or are expected to change
	(potentially frequently). Possible values are static, dynamic, and	(potentially frequently). Possible values are static, dynamic, and
	highly dynamic.</t>	highly dynamic.</t>
	<t>	<t>
	An example for "dynamic" is a camera mounted on a stand that is	An example for "dynamic" is a camera mounted on a stand that is
	occasionally hand-carried and placed at different positions in	occasionally hand-carried and placed at different positions in

	skipping to change at line 884 ¶	skipping to change at line 886 ¶
	The capture point of a static Capture MUST NOT move for the life of	The capture point of a static Capture MUST NOT move for the life of
	the CLUE session. The capture point of dynamic Captures is	the CLUE session. The capture point of dynamic Captures is
	characterized by a change in position followed by a reasonable period	characterized by a change in position followed by a reasonable period
	of stability -- on the order of minutes. Highly	of stability -- on the order of minutes. Highly
	dynamic captures are characterized by a capture point that is	dynamic captures are characterized by a capture point that is
	constantly moving. If the "area of capture", "capture point", and	constantly moving. If the "area of capture", "capture point", and
	"line of capture" attributes are included with dynamic or highly	"line of capture" attributes are included with dynamic or highly
	dynamic Captures, they indicate spatial information at the time of	dynamic Captures, they indicate spatial information at the time of
	the Advertisement.</t>	the Advertisement.</t>
	</section>	</section>

	<section anchor="section-7.1.1.5" numbered="true" toc="default">	<section anchor="s-7.1.1.5" numbered="true" toc="default">
	<name>Audio Capture Sensitivity Pattern</name>	<name>Audio Capture Sensitivity Pattern</name>
	<t>	<t>
	The Audio Capture Sensitivity Pattern attribute applies only to	The Audio Capture Sensitivity Pattern attribute applies only to
	audio captures. This attribute gives information about the nominal	audio captures. This attribute gives information about the nominal
	sensitivity pattern of the microphone that is the source of the	sensitivity pattern of the microphone that is the source of the
	Capture. Possible values include patterns such as omni, shotgun,	Capture. Possible values include patterns such as omni, shotgun,
	cardioid, and hyper-cardioid.</t>	cardioid, and hyper-cardioid.</t>
	</section>	</section>

	<section anchor="section-7.1.1.6" numbered="true" toc="default">	<section anchor="s-7.1.1.6" numbered="true" toc="default">
	<name>Description</name>	<name>Description</name>
	<t>	<t>
	The Description attribute is a human-readable description (which	The Description attribute is a human-readable description (which
	could be in multiple languages) of the Capture.</t>	could be in multiple languages) of the Capture.</t>
	</section>	</section>

	<section anchor="section-7.1.1.7" numbered="true" toc="default">	<section anchor="s-7.1.1.7" numbered="true" toc="default">
	<name>Presentation</name>	<name>Presentation</name>
	<t>	<t>
	The Presentation attribute indicates that the capture originates	The Presentation attribute indicates that the capture originates
	from a presentation device, that is, one that provides supplementary	from a presentation device, that is, one that provides supplementary
	information to a conference through slides, video, still images,	information to a conference through slides, video, still images,
	data, etc. Where more information is known about the capture, it MAY	data, etc. Where more information is known about the capture, it MAY
	be expanded hierarchically to indicate the different types of	be expanded hierarchically to indicate the different types of
	presentation media, e.g., presentation.slides, presentation.image,	presentation media, e.g., presentation.slides, presentation.image,
	etc.</t>	etc.</t>
	<t>	<t>
	Note: It is expected that a number of keywords will be defined that	Note: It is expected that a number of keywords will be defined that
	provide more detail on the type of presentation. Refer to <xref target="RFCYYY1" format="default"/> for how to extend the model.</t>	provide more detail on the type of presentation. Refer to <xref target="RFCYYY1" format="default"/> for how to extend the model.</t>
	</section>	</section>

	<section anchor="section-7.1.1.8" numbered="true" toc="default">	<section anchor="s-7.1.1.8" numbered="true" toc="default">
	<name>View</name>	<name>View</name>
	<t>	<t>
	The View attribute is a field with enumerated values, indicating	The View attribute is a field with enumerated values, indicating
	what type of view the Capture relates to. The Consumer can use	what type of view the Capture relates to. The Consumer can use
	this information to help choose which Media Captures it wishes to	this information to help choose which Media Captures it wishes to
	receive. Possible values are as follows:</t>	receive. Possible values are as follows:</t>
	<dl newline="false" spacing="normal" indent="12">	<dl newline="false" spacing="normal" indent="12">
	<dt>Room:</dt>	<dt>Room:</dt>
	<dd>Captures the entire scene	<dd>Captures the entire scene
	</dd>	</dd>

	skipping to change at line 938 ¶	skipping to change at line 940 ¶
	<dd>Captures an individual person</dd>	<dd>Captures an individual person</dd>
	<dt>Lectern:</dt>	<dt>Lectern:</dt>
	<dd>Captures the region of the lectern including the	<dd>Captures the region of the lectern including the
	presenter, for example, in a classroom-style conference room	presenter, for example, in a classroom-style conference room
	</dd>	</dd>
	<dt>Audience:</dt>	<dt>Audience:</dt>
	<dd>Captures a region showing the audience in a classroom-style conference room	<dd>Captures a region showing the audience in a classroom-style conference room
	</dd>	</dd>
	</dl>	</dl>
	</section>	</section>

	<section anchor="section-7.1.1.9" numbered="true" toc="default">	<section anchor="s-7.1.1.9" numbered="true" toc="default">
	<name>Language</name>	<name>Language</name>
	<t>	<t>
	The Language attribute indicates one or more languages used in the	The Language attribute indicates one or more languages used in the
	content of the Media Capture. Captures MAY be offered in different	content of the Media Capture. Captures MAY be offered in different
	languages in case of multilingual and/or accessible conferences. A	languages in case of multilingual and/or accessible conferences. A
	Consumer can use this attribute to differentiate between them and	Consumer can use this attribute to differentiate between them and
	pick the appropriate one.</t>	pick the appropriate one.</t>
	<t>	<t>
	Note that the Language attribute is defined and meaningful both for	Note that the Language attribute is defined and meaningful both for
	audio and video captures. In the case of audio captures, the meaning	audio and video captures. In the case of audio captures, the meaning
	is obvious. For a video capture, "Language" could, for example, be	is obvious. For a video capture, "Language" could, for example, be
	sign interpretation or text.</t>	sign interpretation or text.</t>
	<t>	<t>
	The Language attribute is coded per <xref target="RFC5646" format="default"/>.</t>	The Language attribute is coded per <xref target="RFC5646" format="default"/>.</t>
	</section>	</section>

	<section anchor="section-7.1.1.10" numbered="true" toc="default">	<section anchor="s-7.1.1.10" numbered="true" toc="default">
	<name>Person Information</name>	<name>Person Information</name>
	<t>	<t>
	The Person Information attribute allows a Provider to provide	The Person Information attribute allows a Provider to provide
	specific information regarding the people in a Capture (regardless	specific information regarding the people in a Capture (regardless
	of whether or not the capture has a Presentation attribute). The	of whether or not the capture has a Presentation attribute). The
	Provider may gather the information automatically or manually from	Provider may gather the information automatically or manually from
	a variety of sources; however, the xCard <xref target="RFC6351" format="default"/> format is used to	a variety of sources; however, the xCard <xref target="RFC6351" format="default"/> format is used to
	convey the information. This allows various information, such as	convey the information. This allows various information, such as

	Identification information (Section 6.2 of <xref target="RFC6350" for	Identification information (<xref section="6.2" sectionFormat="of" ta
	mat="default"/>), Communication	rget="RFC6350" format="default"/>), Communication
	Information (Section 6.4 of <xref target="RFC6350" format="default"/>	Information (<xref section="6.4" sectionFormat="of" target="RFC6350"
	), and Organizational information	format="default"/>), and Organizational information
	(Section 6.6 of <xref target="RFC6350" format="default"/>), to be com	(<xref section="6.6" sectionFormat="of" target="RFC6350" format="defa
	municated. A Consumer may then	ult"/>), to be communicated. A Consumer may then
	automatically (i.e., via a policy) or manually select Captures	automatically (i.e., via a policy) or manually select Captures
	based on information about who is in a Capture. It also allows a	based on information about who is in a Capture. It also allows a
	Consumer to render information regarding the people participating	Consumer to render information regarding the people participating
	in the conference or to use it for further processing.</t>	in the conference or to use it for further processing.</t>
	<t>	<t>
	The Provider may supply a minimal set of information or a larger	The Provider may supply a minimal set of information or a larger
	set of information. However, it MUST be compliant with <xref target="RFC6350" format="default"/> and	set of information. However, it MUST be compliant with <xref target="RFC6350" format="default"/> and
	supply a "VERSION" and "FN" property. A Provider may supply	supply a "VERSION" and "FN" property. A Provider may supply

	multiple xCards per Capture of any KIND (Section 6.1.4 of <xref target="RFC6350" format="default"/>).</t>	multiple xCards per Capture of any KIND (<xref section="6.1.4" sectionFormat="of" target="RFC6350" format="default"/>).</t>
	<t>	<t>
	In order to keep CLUE messages compact, the Provider SHOULD use a	In order to keep CLUE messages compact, the Provider SHOULD use a
	URI to point to any LOGO, PHOTO, or SOUND contained in the xCard	URI to point to any LOGO, PHOTO, or SOUND contained in the xCard
	rather than transmitting the LOGO, PHOTO, or SOUND data in a CLUE	rather than transmitting the LOGO, PHOTO, or SOUND data in a CLUE
	message.</t>	message.</t>
	</section>	</section>

	<section anchor="section-7.1.1.11" numbered="true" toc="default">	<section anchor="s-7.1.1.11" numbered="true" toc="default">
	<name>Person Type</name>	<name>Person Type</name>
	<t>	<t>
	The Person Type attribute indicates the type of people contained in	The Person Type attribute indicates the type of people contained in
	the capture with respect to the meeting agenda (regardless of	the capture with respect to the meeting agenda (regardless of
	whether or not the capture has a Presentation attribute). As a	whether or not the capture has a Presentation attribute). As a
	capture may include multiple people, the attribute may contain	capture may include multiple people, the attribute may contain
	multiple values. However, values MUST NOT be repeated within the	multiple values. However, values MUST NOT be repeated within the
	attribute.</t>	attribute.</t>
	<t>	<t>
	An Advertiser associates the person type with an individual capture	An Advertiser associates the person type with an individual capture

	skipping to change at line 1033 ¶	skipping to change at line 1035 ¶
	or commentary in the meeting.</dd>	or commentary in the meeting.</dd>
	<dt>Timekeeper:</dt>	<dt>Timekeeper:</dt>
	<dd>the person responsible for maintaining the	<dd>the person responsible for maintaining the
	meeting schedule.</dd>	meeting schedule.</dd>
	</dl>	</dl>
	<t>	<t>
	Furthermore, the person type attribute may contain one or more	Furthermore, the person type attribute may contain one or more
	strings allowing the Provider to indicate custom meeting-specific	strings allowing the Provider to indicate custom meeting-specific
	types.</t>	types.</t>
	</section>	</section>

	<section anchor="section-7.1.1.12" numbered="true" toc="default">	<section anchor="s-7.1.1.12" numbered="true" toc="default">
	<name>Priority</name>	<name>Priority</name>
	<t>	<t>
	The Priority attribute indicates a relative priority between	The Priority attribute indicates a relative priority between
	different Media Captures. The Provider sets this priority, and the	different Media Captures. The Provider sets this priority, and the
	Consumer MAY use the priority to help decide which Captures it	Consumer MAY use the priority to help decide which Captures it
	wishes to receive.</t>	wishes to receive.</t>
	<t>	<t>
	The "priority" attribute is an integer that indicates a relative	The "priority" attribute is an integer that indicates a relative
	priority between Captures. For example, it is possible to assign a	priority between Captures. For example, it is possible to assign a
	priority between two presentation Captures that would allow a	priority between two presentation Captures that would allow a
	remote Endpoint to determine which presentation is more important.	remote Endpoint to determine which presentation is more important.
	Priority is assigned at the individual Capture level. It represents	Priority is assigned at the individual Capture level. It represents
	the Provider's view of the relative priority between Captures with	the Provider's view of the relative priority between Captures with
	a priority. The same priority number MAY be used across multiple	a priority. The same priority number MAY be used across multiple
	Captures. It indicates that they are equally important. If no priority	Captures. It indicates that they are equally important. If no priority
	is assigned, no assumptions regarding relative importance of the	is assigned, no assumptions regarding relative importance of the
	Capture can be made.</t>	Capture can be made.</t>
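	<t>The grouping rules in the paragraph above can be sketched as follows. This is an illustration only: the function name and the mapping from capture identity to an optional integer are assumptions, and the sketch deliberately does not assume whether larger or smaller integers denote greater importance, since this section does not say.</t>

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def group_by_priority(
    captures: Dict[str, Optional[int]],
) -> Tuple[Dict[int, List[str]], List[str]]:
    """Split Captures into equal-priority groups and an unprioritized list.

    `captures` maps a hypothetical capture identity (e.g. "VC1") to its
    priority, or to None when the Provider assigned no priority.
    """
    ranked: Dict[int, List[str]] = defaultdict(list)
    unranked: List[str] = []
    for capture_id, priority in captures.items():
        if priority is None:
            # No priority: nothing can be inferred about relative importance.
            unranked.append(capture_id)
        else:
            # Same number: the Provider considers these equally important.
            ranked[priority].append(capture_id)
    return dict(ranked), unranked
```

	<t>A Consumer could apply its own selection policy to the ranked groups while treating the unprioritized Captures separately.</t>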
	</section>	</section>

	<section anchor="section-7.1.1.13" numbered="true" toc="default">	<section anchor="s-7.1.1.13" numbered="true" toc="default">
	<name>Embedded Text</name>	<name>Embedded Text</name>
	<t>	<t>
	The Embedded Text attribute indicates that a Capture provides	The Embedded Text attribute indicates that a Capture provides
	embedded textual information. For example, the video Capture may	embedded textual information. For example, the video Capture may
	contain speech-to-text information composed with the video image.</t>	contain speech-to-text information composed with the video image.</t>
	</section>	</section>

	<section anchor="section-7.1.1.14" numbered="true" toc="default">	<section anchor="s-7.1.1.14" numbered="true" toc="default">
	<name>Related To</name>	<name>Related To</name>
	<t>	<t>
	The Related To attribute indicates the Capture contains additional	The Related To attribute indicates the Capture contains additional
	complementary information related to another Capture. The value	complementary information related to another Capture. The value
	indicates the identity of the other Capture to which this Capture	indicates the identity of the other Capture to which this Capture
	is providing additional information.</t>	is providing additional information.</t>
	<t>	<t>
	For example, a conference can utilize translators or facilitators	For example, a conference can utilize translators or facilitators
	that provide an additional audio stream (i.e., a translation or	that provide an additional audio stream (i.e., a translation or
	description or commentary of the conference). Where multiple	description or commentary of the conference). Where multiple
	captures are available, it may be advantageous for a Consumer to	captures are available, it may be advantageous for a Consumer to
	select a complementary Capture instead of or in addition to a	select a complementary Capture instead of or in addition to a
	Capture it relates to.</t>	Capture it relates to.</t>
	</section>	</section>
	</section>	</section>
	</section>	</section>

	<section anchor="section-7.2" numbered="true" toc="default">	<section anchor="s-7.2" numbered="true" toc="default">
	<name>Multiple Content Capture</name>	<name>Multiple Content Capture</name>
	<t>	<t>
	The MCC indicates that one or more Single Media Captures are	The MCC indicates that one or more Single Media Captures are
	multiplexed (temporally and/or spatially) or mixed in one Media	multiplexed (temporally and/or spatially) or mixed in one Media
	Capture. Only one Capture type (e.g., audio or video) is	Capture. Only one Capture type (e.g., audio or video) is
	allowed in each MCC instance. The MCC may contain a reference to	allowed in each MCC instance. The MCC may contain a reference to
	the Single Media Captures (which may have their own attributes) as	the Single Media Captures (which may have their own attributes) as
	well as attributes associated with the MCC itself. An MCC may also	well as attributes associated with the MCC itself. An MCC may also
	contain other MCCs. The MCC MAY reference Captures from within the	contain other MCCs. The MCC MAY reference Captures from within the
	Capture Scene that defines it or from other Capture Scenes. No	Capture Scene that defines it or from other Capture Scenes. No

	skipping to change at line 1099 ¶	skipping to change at line 1101 ¶
	the MCC contains content from multiple sources, but no information	the MCC contains content from multiple sources, but no information
	regarding those sources is given. MCCs either contain the	regarding those sources is given. MCCs either contain the
	referenced Captures and no others or have no referenced captures	referenced Captures and no others or have no referenced captures
	and, therefore, may contain any Capture.</t>	and, therefore, may contain any Capture.</t>
	<t>	<t>
	One or more MCCs may also be specified in a CSV. This allows an	One or more MCCs may also be specified in a CSV. This allows an
	Advertiser to indicate that several MCC captures are used to	Advertiser to indicate that several MCC captures are used to
	represent a capture scene. <xref target="ref-advertisement-sent-to-endpoint-f-two-encodings" format="default"/> provides an example of this	represent a capture scene. <xref target="ref-advertisement-sent-to-endpoint-f-two-encodings" format="default"/> provides an example of this
	case.</t>	case.</t>
	<t>	<t>

	As outlined in <xref target="section-7.1" format="default"/>, each instance of the MCC has its own	As outlined in <xref target="s-7.1" format="default"/>, each instance of the MCC has its own
	Capture identity, e.g., MCC1. It allows all the individual captures	Capture identity, e.g., MCC1. It allows all the individual captures
	contained in the MCC to be referenced by a single MCC identity.</t>	contained in the MCC to be referenced by a single MCC identity.</t>
	<t>The example below shows the use of a Multiple Content Capture:</t>	<t>The example below shows the use of a Multiple Content Capture:</t>
	<table anchor="ref-multiple-content-capture-concept" align="center">	<table anchor="ref-multiple-content-capture-concept" align="center">
	<name>Multiple Content Capture Concept</name>	<name>Multiple Content Capture Concept</name>
	<thead>	<thead>
	<tr>	<tr>
	<th align="left"> Capture Scene #1</th>	<th align="left"> Capture Scene #1</th>
	<th align="left"> </th>	<th align="left"> </th>
	</tr>	</tr>

	skipping to change at line 1137 ¶	skipping to change at line 1139 ¶
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(MCC1)</td>	<td align="left">CSV(MCC1)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t>	<t>
	This indicates that MCC1 is a single capture that contains the	This indicates that MCC1 is a single capture that contains the
	Captures VC1, VC2, and VC3, according to any MCC1 attributes.</t>	Captures VC1, VC2, and VC3, according to any MCC1 attributes.</t>

	<section anchor="section-7.2.1" numbered="true" toc="default">	<section anchor="s-7.2.1" numbered="true" toc="default">
	<name>MCC Attributes</name>	<name>MCC Attributes</name>
	<t>	<t>
	Media Capture Attributes may be associated with the MCC instance	Media Capture Attributes may be associated with the MCC instance
	and the Single Media Captures that the MCC references. A Provider	and the Single Media Captures that the MCC references. A Provider
	should avoid providing conflicting attribute values between the MCC	should avoid providing conflicting attribute values between the MCC
	and Single Media Captures. Where there is a conflict, the attributes	and Single Media Captures. Where there is a conflict, the attributes
	of the MCC override any that may be present in the individual	of the MCC override any that may be present in the individual
	Captures.</t>	Captures.</t>
	<t>	<t>
	A Provider MAY include as much or as little of the original source	A Provider MAY include as much or as little of the original source
	Capture information as it requires.</t>	Capture information as it requires.</t>
	<t>	<t>
	There are MCC-specific attributes that MUST only be used with	There are MCC-specific attributes that MUST only be used with
	Multiple Content Captures. These are described in the sections	Multiple Content Captures. These are described in the sections

	below. The attributes described in <xref target="section-7.1.1" format="default"/> MAY also be used	below. The attributes described in <xref target="s-7.1.1" format="default"/> MAY also be used
	with MCCs.</t>	with MCCs.</t>
	<t>	<t>
	The spatial-related attributes of an MCC indicate its area of	The spatial-related attributes of an MCC indicate its area of
	capture and point of capture within the scene, just like any other	capture and point of capture within the scene, just like any other
	media capture. The spatial information does not imply anything	media capture. The spatial information does not imply anything
	about how other captures are composed within an MCC.</t>	about how other captures are composed within an MCC.</t>
	<t>For example, a virtual scene could be constructed for the MCC	<t>For example, a virtual scene could be constructed for the MCC
	capture with two Video Captures with a "MaxCaptures" attribute set	capture with two Video Captures with a "MaxCaptures" attribute set
	to 2 and an "Area of Capture" attribute provided with an overall	to 2 and an "Area of Capture" attribute provided with an overall
	area. Each of the individual Captures could then also include an	area. Each of the individual Captures could then also include an
	"Area of Capture" attribute with a subset of the overall area.	"Area of Capture" attribute with a subset of the overall area.
	The Consumer would then know how each capture is related to others	The Consumer would then know how each capture is related to others
	within the scene, but not the relative position of the individual	within the scene, but not the relative position of the individual
	captures within the composed capture.	captures within the composed capture.


	<!--[rfced] Please note that some "Tables" have been updated to
	"Figures" as they contain text or alignment that cannot yet be
	handled using a <texttable> in XML. Please let us know any
	objections.

	</t>	</t>

	<figure anchor="table_2">
	<name>Example of MCC and Single Media Capture Attributes</na	<table anchor="table_2">
	me>	<name>Example of MCC and Single Media Capture Attributes</name>
	<artwork name="" type="" align="left" alt=""><![CDATA[	<thead>
	+-----------------------+---------------------------------+	<tr><th align="left">Capture Scene #1</th><th/></tr>
	\| Capture Scene #1 \| \|	</thead>
	+-----------------------\|---------------------------------+	<tbody>
	\| VC1 \| AreaofCapture=(0,0,0)(9,0,0) \|	<tr>
	\| \| (0,0,9)(9,0,9) \|	<td>VC1</td>
	\| VC2 \| AreaofCapture=(10,0,0)(19,0,0) \|	<td align="left">
	\| \| (10,0,9)(19,0,9) \|	<artwork align="left">
	\| MCC1(VC1,VC2) \| MaxCaptures=2 \|	AreaofCapture=(0,0,0)(9,0,0)
	\| \| AreaofCapture=(0,0,0)(19,0,0) \|	(0,0,9)(9,0,9)
	\| \| (0,0,9)(19,0,9) \|	</artwork>
	\| CSV(MCC1) \| \|	</td>
	+---------------------------------------------------------+	</tr>
	]]></artwork>	<tr>
	</figure>	<td>VC2</td>
		<td align="left">
		<artwork align="left">
		AreaofCapture=(10,0,0)(19,0,0)
		(10,0,9)(19,0,9)
		</artwork>
		</td>
		</tr>
		<tr>
		<td>MCC1(VC1,VC2)</td>
		<td align="left"><artwork align="left">
		MaxCaptures=2
		AreaofCapture=(0,0,0)(19,0,0)
		(0,0,9)(19,0,9)
		</artwork>
		</td>
		</tr>
		<tr>
		<td>CSV(MCC1)</td>
		<td/>
		</tr>
		</tbody>
		</table>
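	<t>As a rough consistency check on the example above, a Consumer or test tool could confirm that each individual Area of Capture falls within the MCC's overall area. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it compares axis-aligned bounding boxes, which is exact for the axis-aligned areas in this example but only an approximation for arbitrarily oriented quadrilaterals.</t>

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]


def bounding_box(area: Sequence[Point]) -> Box:
    """Return the (min corner, max corner) of the points in an area."""
    xs, ys, zs = zip(*area)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def within(inner: Sequence[Point], outer: Sequence[Point]) -> bool:
    """True if inner's bounding box lies inside outer's bounding box."""
    (ilo, ihi), (olo, ohi) = bounding_box(inner), bounding_box(outer)
    return all(o <= i for o, i in zip(olo, ilo)) and all(
        i <= o for i, o in zip(ihi, ohi)
    )


# Coordinates taken from the example table above.
VC1 = [(0, 0, 0), (9, 0, 0), (0, 0, 9), (9, 0, 9)]
VC2 = [(10, 0, 0), (19, 0, 0), (10, 0, 9), (19, 0, 9)]
MCC1 = [(0, 0, 0), (19, 0, 0), (0, 0, 9), (19, 0, 9)]
```

	<t>With these values, both VC1 and VC2 lie within the MCC1 area. This says nothing about where each Capture appears inside the composed image.</t>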

	<t>	<t>
	The subsections below describe the MCC-only attributes.</t>	The subsections below describe the MCC-only attributes.</t>

	<section anchor="section-7.2.1.1" numbered="true" toc="default">	<section anchor="s-7.2.1.1" numbered="true" toc="default">
	<name>Maximum Number of Captures within an MCC</name>	<name>Maximum Number of Captures within an MCC</name>
	<t>	<t>
	The Maximum Number of Captures MCC attribute indicates the maximum	The Maximum Number of Captures MCC attribute indicates the maximum
	number of individual Captures that may appear in a Capture Encoding	number of individual Captures that may appear in a Capture Encoding
	at a time. The actual number at any given time can be less than or	at a time. The actual number at any given time can be less than or
	equal to this maximum. It may be used to derive how the Single	equal to this maximum. It may be used to derive how the Single
	Media Captures within the MCC are composed/switched with regard	Media Captures within the MCC are composed/switched with regard
	to space and time.</t>	to space and time.</t>
	<!--[rfced] The relationship between "MaxCaptures" and "Maximum Number	<!--[rfced] The relationship between "MaxCaptures" and "Maximum Number
	of Captures MCC attribute" is not made clear. Will the reader	of Captures MCC attribute" is not made clear. Will the reader

	skipping to change at line 1272 ¶	skipping to change at line 1292 ¶
	<t>	<t>
	If this attribute is not set, then as a default, it is assumed that all	If this attribute is not set, then as a default, it is assumed that all
	source media capture content can appear concurrently in the Capture	source media capture content can appear concurrently in the Capture
	Encoding associated with the MCC.</t>	Encoding associated with the MCC.</t>
	<t>	<t>
	For example, the use of MaxCaptures equal to 1 on an MCC with three	For example, the use of MaxCaptures equal to 1 on an MCC with three
	Video Captures, VC1, VC2, and VC3, would indicate that the Advertiser	Video Captures, VC1, VC2, and VC3, would indicate that the Advertiser
	would switch between VC1, VC2, and VC3 in the Capture Encoding, as	would switch between VC1, VC2, and VC3 in the Capture Encoding, as
	at most one Capture may appear at a time.</t>	at most one Capture may appear at a time.</t>
	</section>	</section>
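The MaxCaptures semantics described above can be sketched as follows. This is a minimal illustration, not part of the CLUE data model: the `MCC` class, its field names, and the helper functions are hypothetical, and clamping the limit to the number of referenced Captures is an assumption made here for convenience.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MCC:
    """Hypothetical model of a Multiple Content Capture (illustration only)."""
    capture_ids: Sequence[str]
    max_captures: Optional[int] = None  # None models "attribute not set"


def concurrent_limit(mcc: MCC) -> int:
    """Upper bound on Captures that may appear in the Capture Encoding at a time."""
    if mcc.max_captures is None:
        # Default: all source content can appear concurrently.
        return len(mcc.capture_ids)
    # Assumption: a limit above the number of sources has no further effect.
    return min(mcc.max_captures, len(mcc.capture_ids))


def is_switched(mcc: MCC) -> bool:
    """True when only one of several sources can appear at a time."""
    return concurrent_limit(mcc) == 1 and len(mcc.capture_ids) > 1
```

With `MCC(["VC1", "VC2", "VC3"], max_captures=1)`, the sketch reports a limit of one and a switched MCC, matching the three-Video-Capture example in the text.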

	<section anchor="section-7.2.1.2" numbered="true" toc="default">	<section anchor="s-7.2.1.2" numbered="true" toc="default">
	<name>Policy</name>	<name>Policy</name>
	<t>	<t>
	The Policy MCC Attribute indicates the criteria that the Provider	The Policy MCC Attribute indicates the criteria that the Provider
	uses to determine when and/or where media content appears in the	uses to determine when and/or where media content appears in the
	Capture Encoding related to the MCC.</t>	Capture Encoding related to the MCC.</t>
	<t>	<t>
	The attribute is in the form of a token that indicates the policy	The attribute is in the form of a token that indicates the policy
	and an index representing an instance of the policy. The same	and an index representing an instance of the policy. The same
	index value can be used for multiple MCCs.</t>	index value can be used for multiple MCCs.</t>
	<t>	<t>

	skipping to change at line 1325 ¶	skipping to change at line 1345 ¶
	<tr>	<tr>
	<td align="left">VC1</td>	<td align="left">VC1</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">VC2</td>	<td align="left">VC2</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">MCC1(VC1,VC2)</td>	<td align="left">MCC1(VC1,VC2)</td>

	<td align="left">Policy=SoundLevel:0</td>	<td align="left">Policy=SoundLevel:0<br/>
	</tr>	MaxCaptures=1</td>
	<tr>
	<td align="left"/>
	<td align="left">MaxCaptures=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">MCC2(VC1,VC2)</td>	<td align="left">MCC2(VC1,VC2)</td>

	<td align="left">Policy=SoundLevel:1</td>	<td align="left">Policy=SoundLevel:1<br/>
	</tr>	MaxCaptures=1</td>
	<tr>
	<td align="left"/>
	<td align="left">MaxCaptures=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(MCC1,MCC2)</td>	<td align="left">CSV(MCC1,MCC2)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	</section>	</section>

	<section anchor="section-7.2.1.3" numbered="true" toc="default">	<section anchor="s-7.2.1.3" numbered="true" toc="default">
	<name>Synchronization Identity</name>	<name>Synchronization Identity</name>
	<t>	<t>
	The Synchronization Identity MCC attribute indicates how the	The Synchronization Identity MCC attribute indicates how the
	individual Captures in multiple MCC Captures are synchronized. To	individual Captures in multiple MCC Captures are synchronized. To
	indicate that the Capture Encodings associated with MCCs contain	indicate that the Capture Encodings associated with MCCs contain
	Captures from the same source at the same time, a Provider should	Captures from the same source at the same time, a Provider should
	set the same Synchronization Identity on each of the concerned	set the same Synchronization Identity on each of the concerned
	MCCs. It is the Provider that determines what the source for the	MCCs. It is the Provider that determines what the source for the
	Captures is, so a Provider can choose how to group together Single	Captures is, so a Provider can choose how to group together Single
	Media Captures into a combined "source" for the purpose of	Media Captures into a combined "source" for the purpose of

	skipping to change at line 1368 ¶	skipping to change at line 1383 ¶
	SynchronizationID attribute. For example, when the Provider is in	SynchronizationID attribute. For example, when the Provider is in
	an MCU, it may determine that each separate CLUE Endpoint is a	an MCU, it may determine that each separate CLUE Endpoint is a
	remote source of media. The Synchronization Identity may be used	remote source of media. The Synchronization Identity may be used
	across media types, i.e., to synchronize audio- and video-related	across media types, i.e., to synchronize audio- and video-related
	MCCs.</t>	MCCs.</t>
	<t>	<t>
	Without this attribute it is assumed that multiple MCCs may provide	Without this attribute it is assumed that multiple MCCs may provide
	content from different sources at any particular point in time.</t>	content from different sources at any particular point in time.</t>
	<t>For example:	<t>For example:
	</t>	</t>

	<figure anchor="table_4">
		<table anchor="table_4">
	<name>Example Synchronization Identity MCC Attribute Usage</name>	<name>Example Synchronization Identity MCC Attribute Usage</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[
	+=======================+=================================+	<tbody>
	\| Capture Scene #1 \| \|	<tr><th>Capture Scene #1</th> <th/></tr>
	+-----------------------\|---------------------------------+	<tr><td>VC1</td> <td>Description=Left</td></tr>
	\| VC1 \| Description=Left \|
	\| VC2 \| Description=Center \|	<tr><td>VC2</td> <td>Description=Center</td></tr>
	\| VC3 \| Description=Right \|
	\| AC1 \| Description=Room \|	<tr><td>VC3</td> <td>Description=Right</td></tr>
	\| CSV(VC1,VC2,VC3) \| \|
	\| CSV(AC1) \| \|	<tr><td>AC1</td> <td>Description=Room</td></tr>
	+=======================+=================================+
	\| Capture Scene #2 \| \|	<tr><td>CSV(VC1,VC2,VC3)</td> <td/></tr>
	+-----------------------\|---------------------------------+	<tr><td>CSV(AC1)</td> <td/></tr>
	\| VC4 \| Description=Left \|	</tbody>
	\| VC5 \| Description=Center \|
	\| VC6 \| Description=Right \|	<tbody>
	\| AC2 \| Description=Room \|	<tr><th>Capture Scene #2</th> <th/></tr>
	\| CSV(VC4,VC5,VC6) \| \|
	\| CSV(AC2) \| \|	<tr><td>VC4</td> <td>Description=Left</td></tr>
	+=======================+=================================+
	\| Capture Scene #3 \| \|	<tr><td>VC5</td> <td>Description=Center</td></tr>
	+-----------------------\|---------------------------------+
	\| VC7 \| \|	<tr><td>VC6</td> <td>Description=Right</td></tr>
	\| AC3 \| \|
	+=======================+=================================+	<tr><td>AC2</td> <td>Description=Room</td></tr>
	\| Capture Scene #4 \| \|
	+-----------------------\|---------------------------------+	<tr><td>CSV(VC4,VC5,VC6)</td> <td/></tr>
	\| VC8 \| \|	<tr><td>CSV(AC2)</td> <td/></tr>
	\| AC4 \| \|	</tbody>
	+=======================+=================================+
	\| Capture Scene #5 \| \|	<tbody>
	+-----------------------\|---------------------------------+	<tr><th>Capture Scene #3</th> <th/></tr>
	\| MCC1(VC1,VC4,VC7) \| SynchronizationID=1 \|
	\| \| MaxCaptures=1 \|	<tr><td>VC7</td> <td/></tr>
	\| MCC2(VC2,VC5,VC8) \| SynchronizationID=1 \|	<tr><td>AC3</td> <td/></tr>
	\| \| MaxCaptures=1 \|
	\| MCC3(VC3,VC6) \| MaxCaptures=1 \|	</tbody>
	\| MCC4(AC1,AC2,AC3,AC4) \| SynchronizationID=1 \|
	\| \| MaxCaptures=1 \|	<tbody>
	\| CSV(MCC1,MCC2,MCC3) \| \|	<tr><th>Capture Scene #4</th> <th/></tr>
	\| CSV(MCC4) \| \|
	+=======================+=================================+	<tr><td>VC8</td> <td/></tr>
	]]></artwork>	<tr><td>AC4</td> <td/></tr>
	</figure>	</tbody>

		<tbody>
		<tr><th>Capture Scene #5</th> <th/></tr>

		<tr><td>MCC1(VC1,VC4,VC7)</td> <td>SynchronizationID=1<br/>MaxCaptures=1</td></tr>
		<tr><td>MCC2(VC2,VC5,VC8)</td> <td>SynchronizationID=1<br/>MaxCaptures=1</td></tr>
		<tr><td>MCC3(VC3,VC6)</td> <td>MaxCaptures=1</td></tr>
		<tr><td>MCC4(AC1,AC2,AC3,AC4)</td> <td>SynchronizationID=1<br/>MaxCaptures=1</td></tr>
		<tr><td>CSV(MCC1,MCC2,MCC3)</td> <td/></tr>
		<tr><td>CSV(MCC4)</td> <td/></tr>
		</tbody>
		</table>

	<t>	<t>
	The above Advertisement would indicate that MCC1, MCC2, MCC3, and	The above Advertisement would indicate that MCC1, MCC2, MCC3, and
	MCC4 make up a Capture Scene. There would be four Capture	MCC4 make up a Capture Scene. There would be four Capture
	Encodings (one for each MCC). Because MCC1 and MCC2 have the same	Encodings (one for each MCC). Because MCC1 and MCC2 have the same
	SynchronizationID, each Encoding from MCC1 and MCC2, respectively,	SynchronizationID, each Encoding from MCC1 and MCC2, respectively,
	would together have content from only Capture Scene 1 or only	would together have content from only Capture Scene 1 or only
	Capture Scene 2 or the combination of VC7 and VC8 at a particular	Capture Scene 2 or the combination of VC7 and VC8 at a particular
	point in time. In this case, the Provider has decided the sources	point in time. In this case, the Provider has decided the sources
	to be synchronized are Scene #1, Scene #2, and Scene #3 and #4	to be synchronized are Scene #1, Scene #2, and Scene #3 and #4
	together. The Encoding from MCC3 would not be synchronized with	together. The Encoding from MCC3 would not be synchronized with
	MCC1 or MCC2. As MCC4 also has the same Synchronization Identity	MCC1 or MCC2. As MCC4 also has the same Synchronization Identity
	as MCC1 and MCC2, the content of the audio Encoding will be	as MCC1 and MCC2, the content of the audio Encoding will be
	synchronized with the video content.</t>	synchronized with the video content.</t>
	</section>	</section>
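As an illustration of the Synchronization Identity example, the sketch below groups the MCCs of Capture Scene #5 by their SynchronizationID value. The dictionary layout and function names are hypothetical and chosen only for this example; an MCC without the attribute is modeled as `None` and is treated as unsynchronized.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Capture Scene #5 from the example table; None models "attribute not set".
SYNC_IDS: Dict[str, Optional[int]] = {
    "MCC1": 1,
    "MCC2": 1,
    "MCC3": None,
    "MCC4": 1,
}


def synchronized_groups(sync_ids: Dict[str, Optional[int]]) -> Dict[int, List[str]]:
    """Group MCC names that share the same SynchronizationID."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for name, sync_id in sync_ids.items():
        if sync_id is not None:
            groups[sync_id].append(name)
    return dict(groups)


def are_synchronized(sync_ids: Dict[str, Optional[int]], a: str, b: str) -> bool:
    """True when both MCCs carry the same SynchronizationID."""
    return sync_ids[a] is not None and sync_ids[a] == sync_ids[b]
```

Applied to `SYNC_IDS`, the grouping places MCC1, MCC2, and MCC4 together and leaves MCC3 out, which mirrors the explanation that the Encoding from MCC3 is not synchronized with MCC1 or MCC2.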

	<section anchor="section-7.2.1.4" numbered="true" toc="default">	<section anchor="s-7.2.1.4" numbered="true" toc="default">
	<name>Allow Subset Choice</name>	<name>Allow Subset Choice</name>
	<t>	<t>
	The Allow Subset Choice MCC attribute is a boolean value,	The Allow Subset Choice MCC attribute is a boolean value,
	indicating whether or not the Provider allows the Consumer to	indicating whether or not the Provider allows the Consumer to
	choose a specific subset of the Captures referenced by the MCC.	choose a specific subset of the Captures referenced by the MCC.
	If this attribute is true, and the MCC references other Captures,	If this attribute is true, and the MCC references other Captures,
	then the Consumer MAY select (in a Configure message) a specific	then the Consumer MAY select (in a Configure message) a specific
	subset of those Captures to be included in the MCC, and the	subset of those Captures to be included in the MCC, and the
	Provider MUST then include only that subset. If this attribute is	Provider MUST then include only that subset. If this attribute is
	false, or the MCC does not reference other Captures, then the	false, or the MCC does not reference other Captures, then the
	Consumer MUST NOT select a subset.</t>	Consumer MUST NOT select a subset.</t>
	</section>	</section>
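The Allow Subset Choice rule can be sketched as a simple validation step. This is an assumption-laden illustration: the function name and parameters are hypothetical, and an empty `chosen` collection is used here to model a Configure message that selects no subset at all.

```python
from typing import Iterable, Sequence


def subset_choice_permitted(
    allow_subset_choice: bool,
    referenced: Sequence[str],
    chosen: Iterable[str],
) -> bool:
    """Check a Consumer's subset selection against the Allow Subset Choice rule."""
    chosen_set = set(chosen)
    if not allow_subset_choice or not referenced:
        # The Consumer MUST NOT select a subset in this case.
        return not chosen_set
    # A permitted selection only names Captures referenced by the MCC.
    return chosen_set.issubset(referenced)
```

For an MCC referencing VC1, VC2, and VC3 with the attribute set to true, selecting VC1 and VC3 passes the check; the same selection fails when the attribute is false.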
	</section>	</section>
	</section>	</section>

	<section anchor="section-7.3" numbered="true" toc="default">	<section anchor="s-7.3" numbered="true" toc="default">
	<name>Capture Scene</name>	<name>Capture Scene</name>
	<t>	<t>
	In order for a Provider's individual Captures to be used	In order for a Provider's individual Captures to be used
	effectively by a Consumer, the Provider organizes the Captures into	effectively by a Consumer, the Provider organizes the Captures into
	one or more Capture Scenes, with the structure and contents of	one or more Capture Scenes, with the structure and contents of
	these Capture Scenes being sent from the Provider to the Consumer	these Capture Scenes being sent from the Provider to the Consumer
	in the Advertisement.</t>	in the Advertisement.</t>
	<t>	<t>
	A Capture Scene is a structure representing a spatial region	A Capture Scene is a structure representing a spatial region
	containing one or more Capture Devices, each capturing media	containing one or more Capture Devices, each capturing media

	skipping to change at line 1484 ¶	skipping to change at line 1505 ¶
	A Capture Scene MAY (and typically will) include more than one type	A Capture Scene MAY (and typically will) include more than one type
	of media. For example, a Capture Scene can include several Capture	of media. For example, a Capture Scene can include several Capture
	Scene Views for Video Captures and several Capture Scene Views for	Scene Views for Video Captures and several Capture Scene Views for
	Audio Captures. A particular Capture MAY be included in more than	Audio Captures. A particular Capture MAY be included in more than
	one Capture Scene View.</t>	one Capture Scene View.</t>
	<t>	<t>
	A Provider MAY express spatial relationships between Captures that	A Provider MAY express spatial relationships between Captures that
	are included in the same Capture Scene. However, there is no	are included in the same Capture Scene. However, there is no
	spatial relationship between Media Captures from different Capture	spatial relationship between Media Captures from different Capture
	Scenes. In other words, Capture Scenes each use their own spatial	Scenes. In other words, Capture Scenes each use their own spatial

	measurement system as outlined in <xref target="section-6" format="default"/>.</t>	measurement system as outlined in <xref target="s-6" format="default"/>.</t>
	<t>	<t>
	A Provider arranges Captures in a Capture Scene to help the	A Provider arranges Captures in a Capture Scene to help the
	Consumer choose which captures it wants to render. The Capture	Consumer choose which captures it wants to render. The Capture
	Scene Views in a Capture Scene are different alternatives the	Scene Views in a Capture Scene are different alternatives the
	Provider is suggesting for representing the Capture Scene. Each	Provider is suggesting for representing the Capture Scene. Each
	Capture Scene View is given an advertisement-unique identity. The	Capture Scene View is given an advertisement-unique identity. The
	order of Capture Scene Views within a Capture Scene has no	order of Capture Scene Views within a Capture Scene has no
	significance. The Media Consumer can choose to receive all Media	significance. The Media Consumer can choose to receive all Media
	Captures from one Capture Scene View for each media type (e.g.,	Captures from one Capture Scene View for each media type (e.g.,
	audio and video), or it can pick and choose Media Captures	audio and video), or it can pick and choose Media Captures
	regardless of how the Provider arranges them in Capture Scene	regardless of how the Provider arranges them in Capture Scene
	Views. Different Capture Scene Views of the same media type are	Views. Different Capture Scene Views of the same media type are
	not necessarily mutually exclusive alternatives. Also note that	not necessarily mutually exclusive alternatives. Also note that
	the presence of multiple Capture Scene Views (with potentially	the presence of multiple Capture Scene Views (with potentially
	multiple encoding options in each view) in a given Capture Scene	multiple encoding options in each view) in a given Capture Scene
	does not necessarily imply that a Provider is able to serve all the	does not necessarily imply that a Provider is able to serve all the
	associated media simultaneously (although the construction of such	associated media simultaneously (although the construction of such
	an over-rich Capture Scene is probably not sensible in many cases).	an over-rich Capture Scene is probably not sensible in many cases).
	What a Provider can send simultaneously is determined through the	What a Provider can send simultaneously is determined through the

	Simultaneous Transmission Set mechanism, described in <xref target="section-8" format="default"/>.</t>	Simultaneous Transmission Set mechanism, described in <xref target="s-8" format="default"/>.</t>
	<t>	<t>
	Captures within the same Capture Scene View MUST be of the same	Captures within the same Capture Scene View MUST be of the same
	media type -- it is not possible to mix audio and video captures in	media type -- it is not possible to mix audio and video captures in
	the same Capture Scene View, for instance. The Provider MUST be	the same Capture Scene View, for instance. The Provider MUST be
	capable of encoding and sending all Captures (that have an encoding	capable of encoding and sending all Captures (that have an encoding
	group) in a single Capture Scene View simultaneously. The order of	group) in a single Capture Scene View simultaneously. The order of
	Captures within a Capture Scene View has no significance. A	Captures within a Capture Scene View has no significance. A
	Consumer can decide to receive all the Captures in a single Capture	Consumer can decide to receive all the Captures in a single Capture
	Scene View, but a Consumer could also decide to receive just a	Scene View, but a Consumer could also decide to receive just a
	subset of those captures. A Consumer can also decide to receive	subset of those captures. A Consumer can also decide to receive
	Captures from different Capture Scene Views, all subject to the	Captures from different Capture Scene Views, all subject to the
	constraints set by Simultaneous Transmission Sets, as discussed in	constraints set by Simultaneous Transmission Sets, as discussed in

	<xref target="section-8" format="default"/>.</t>	<xref target="s-8" format="default"/>.</t>
	<t>	<t>
	When a Provider advertises a Capture Scene with multiple CSVs, it	When a Provider advertises a Capture Scene with multiple CSVs, it
	is essentially signaling that there are multiple representations of	is essentially signaling that there are multiple representations of
	the same Capture Scene available. In some cases, these multiple	the same Capture Scene available. In some cases, these multiple
	views would be used simultaneously (for instance, a "video view" and	views would be used simultaneously (for instance, a "video view" and
	an "audio view"). In some cases, the views would conceptually be	an "audio view"). In some cases, the views would conceptually be
	alternatives (for instance, a view consisting of three Video	alternatives (for instance, a view consisting of three Video
	Captures covering the whole room versus a view consisting of just a	Captures covering the whole room versus a view consisting of just a
	single Video Capture covering only the center of a room). In this	single Video Capture covering only the center of a room). In this
	latter example, one sensible choice for a Consumer would be to	latter example, one sensible choice for a Consumer would be to

	skipping to change at line 1555 ¶	skipping to change at line 1576 ¶
	rendering purposes is accomplished through use of their Area of	rendering purposes is accomplished through use of their Area of
	Capture attributes. The second view (MCC3) and the third view	Capture attributes. The second view (MCC3) and the third view
	(VC4) are alternative representations of the same room's video,	(VC4) are alternative representations of the same room's video,
	which might be better suited to some Consumers' rendering	which might be better suited to some Consumers' rendering
	capabilities. The inclusion of the Audio Capture in the same	capabilities. The inclusion of the Audio Capture in the same
	Capture Scene indicates that AC0 is associated with all of those	Capture Scene indicates that AC0 is associated with all of those
	Video Captures, meaning it comes from the same spatial region.	Video Captures, meaning it comes from the same spatial region.
	Therefore, if audio were to be rendered at all, this audio would be	Therefore, if audio were to be rendered at all, this audio would be
	the correct choice, irrespective of which Video Captures were	the correct choice, irrespective of which Video Captures were
	chosen.</t>	chosen.</t>

	<section anchor="section-7.3.1" numbered="true" toc="default">	<section anchor="s-7.3.1" numbered="true" toc="default">
	<name>Capture Scene Attributes</name>	<name>Capture Scene Attributes</name>
	<t>	<t>
	Capture Scene Attributes can be applied to Capture Scenes as well	Capture Scene Attributes can be applied to Capture Scenes as well
	as to individual media captures. Attributes specified at this	as to individual media captures. Attributes specified at this
	level apply to all constituent Captures. Capture Scene attributes	level apply to all constituent Captures. Capture Scene attributes
	include the following:</t>	include the following:</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>Human-readable description of the Capture Scene, which could	<li>Human-readable description of the Capture Scene, which could
	be in multiple languages;</li>	be in multiple languages;</li>
	<li>xCard scene information</li>	<li>xCard scene information</li>
	<li>Scale information ("Millimeters", "Unknown Scale", "No Scale"), as	<li>Scale information ("Millimeters", "Unknown Scale", "No Scale"), as

	described in <xref target="section-6" format="default"/>.</li>	described in <xref target="s-6" format="default"/>.</li>
	</ul>	</ul>

	<section anchor="section-7.3.1.1" numbered="true" toc="default">	<section anchor="s-7.3.1.1" numbered="true" toc="default">
	<name>Scene Information</name>	<name>Scene Information</name>
	<t>	<t>
	The Scene information attribute provides information regarding the	The Scene information attribute provides information regarding the
	Capture Scene rather than individual participants. The Provider	Capture Scene rather than individual participants. The Provider
	may gather the information automatically or manually from a	may gather the information automatically or manually from a
	variety of sources. The scene information attribute allows a	variety of sources. The scene information attribute allows a
	Provider to indicate information such as organizational or	Provider to indicate information such as organizational or
	geographic information allowing a Consumer to determine which	geographic information allowing a Consumer to determine which
	Capture Scenes are of interest in order to then perform Capture	Capture Scenes are of interest in order to then perform Capture
	selection. It also allows a Consumer to render information	selection. It also allows a Consumer to render information
	regarding the Scene or to use it for further processing.</t>	regarding the Scene or to use it for further processing.</t>
	<t>	<t>

	As per <xref target="section-7.1.1.10" format="default"/>, the xCard format is used to convey this	As per <xref target="s-7.1.1.10" format="default"/>, the xCard format is used to convey this
	information and the Provider may supply a minimal set of	information and the Provider may supply a minimal set of
	information or a larger set of information.</t>	information or a larger set of information.</t>
	<t>	<t>
	In order to keep CLUE messages compact the Provider SHOULD use a	In order to keep CLUE messages compact the Provider SHOULD use a
	URI to point to any LOGO, PHOTO, or SOUND contained in the xCARD	URI to point to any LOGO, PHOTO, or SOUND contained in the xCARD
	rather than transmitting the LOGO, PHOTO, or SOUND data in a CLUE	rather than transmitting the LOGO, PHOTO, or SOUND data in a CLUE
	message.</t>	message.</t>
	</section>	</section>
	</section>	</section>

	<section anchor="section-7.3.2" numbered="true" toc="default">	<section anchor="s-7.3.2" numbered="true" toc="default">
	<name>Capture Scene View Attributes</name>	<name>Capture Scene View Attributes</name>
	<t>	<t>
	A Capture Scene can include one or more Capture Scene Views in	A Capture Scene can include one or more Capture Scene Views in
	addition to the Capture-Scene-wide attributes described above.	addition to the Capture-Scene-wide attributes described above.
	Capture Scene View attributes apply to the Capture Scene View as a	Capture Scene View attributes apply to the Capture Scene View as a
	whole, i.e., to all Captures that are part of the Capture Scene	whole, i.e., to all Captures that are part of the Capture Scene
	View.</t>	View.</t>
	<t>Capture Scene View attributes include the following:	<t>Capture Scene View attributes include the following:
	</t>	</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>A human-readable description (which could be in multiple	<li>A human-readable description (which could be in multiple
	languages) of the Capture Scene View.</li>	languages) of the Capture Scene View.</li>
	</ul>	</ul>
	</section>	</section>
	</section>	</section>

	<section anchor="section-7.4" numbered="true" toc="default">	<section anchor="s-7.4" numbered="true" toc="default">
	<name>Global View List</name>	<name>Global View List</name>
	<t>	<t>
	An Advertisement can include an optional Global View list. Each	An Advertisement can include an optional Global View list. Each
	item in this list is a Global View. The Provider can include	item in this list is a Global View. The Provider can include
	multiple Global Views, to allow a Consumer to choose sets of	multiple Global Views, to allow a Consumer to choose sets of
	captures appropriate to its capabilities or application. The	captures appropriate to its capabilities or application. The
	choice of how to make these suggestions in the Global View list	choice of how to make these suggestions in the Global View list
	for what represents all the scenes for which the Provider can send	for what represents all the scenes for which the Provider can send
	media is up to the Provider. This is very similar to how each CSV	media is up to the Provider. This is very similar to how each CSV
	represents a particular scene.</t>	represents a particular scene.</t>

	skipping to change at line 1681 ¶	skipping to change at line 1702 ¶
	. \| <---------' \| .	. \| <---------' \| .
	. \| \| \| (v) = video .	. \| \| \| (v) = video .
	. \| CSV6 (a)<-----------' (a) = audio .	. \| CSV6 (a)<-----------' (a) = audio .
	. \| \| .	. \| \| .
	. +--------------+ .	. +--------------+ .
	`......................................................'	`......................................................'
	]]></artwork>	]]></artwork>
	</figure>	</figure>
	</section>	</section>
	</section>	</section>

	<section anchor="section-8" numbered="true" toc="default">	<section anchor="s-8" numbered="true" toc="default">
	<name>Simultaneous Transmission Set Constraints</name>	<name>Simultaneous Transmission Set Constraints</name>
	<t>	<t>
	In many practical cases, a Provider has constraints or limitations	In many practical cases, a Provider has constraints or limitations
	on its ability to send Captures simultaneously. One type of	on its ability to send Captures simultaneously. One type of
	limitation is caused by the physical limitations of capture	limitation is caused by the physical limitations of capture
	mechanisms; these constraints are represented by a Simultaneous	mechanisms; these constraints are represented by a Simultaneous
	Transmission Set. The second type of limitation reflects the	Transmission Set. The second type of limitation reflects the
	encoding resources available, such as bandwidth or video encoding	encoding resources available, such as bandwidth or video encoding
	throughput (macroblocks/second). This type of constraint is	throughput (macroblocks/second). This type of constraint is
	captured by Individual Encodings and Encoding Groups, discussed	captured by Individual Encodings and Encoding Groups, discussed

	skipping to change at line 1734 ¶	skipping to change at line 1755 ¶
	<t>	<t>
	In this example, the two Simultaneous Transmission Sets are shown in	In this example, the two Simultaneous Transmission Sets are shown in
	<xref target="ref-two-simultaneous-transmission-sets" format="default"/>. If a Provider advertises one or more mutually exclusive	<xref target="ref-two-simultaneous-transmission-sets" format="default"/>. If a Provider advertises one or more mutually exclusive
	Simultaneous Transmission Sets, then, for each media type, the	Simultaneous Transmission Sets, then, for each media type, the
	Consumer MUST ensure that it chooses Media Captures that lie wholly	Consumer MUST ensure that it chooses Media Captures that lie wholly
	within one of those Simultaneous Transmission Sets.</t>	within one of those Simultaneous Transmission Sets.</t>
	<table anchor="ref-two-simultaneous-transmission-sets" align="center">	<table anchor="ref-two-simultaneous-transmission-sets" align="center">
	<name>Two Simultaneous Transmission Sets</name>	<name>Two Simultaneous Transmission Sets</name>
	<thead>	<thead>
	<tr>	<tr>

	<th align="left"> Simultaneous Sets</th>	<th align="left">Simultaneous Sets</th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">{VC0, VC1, VC2}</td>	<td align="left">{VC0, VC1, VC2}</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">{VC0, VC3, VC2}</td>	<td align="left">{VC0, VC3, VC2}</td>
	</tr>	</tr>
	</tbody>	</tbody>

	skipping to change at line 1778 ¶	skipping to change at line 1799 ¶
	Scene. Likewise, if there are no Simultaneous Transmission Sets	Scene. Likewise, if there are no Simultaneous Transmission Sets
	and there is a Global View list, then the Provider MUST be able to	and there is a Global View list, then the Provider MUST be able to
	simultaneously provide all the Captures from any particular Global	simultaneously provide all the Captures from any particular Global
	View (of each media type) from the Global View list.</t>	View (of each media type) from the Global View list.</t>
	<t>	<t>
	If an Advertisement includes multiple Capture Scene Views in a	If an Advertisement includes multiple Capture Scene Views in a
	Capture Scene, then the Consumer MAY choose one Capture Scene View	Capture Scene, then the Consumer MAY choose one Capture Scene View
	for each media type, or it MAY choose individual Captures based on the	for each media type, or it MAY choose individual Captures based on the
	Simultaneous Transmission Sets.</t>	Simultaneous Transmission Sets.</t>
	</section>	</section>
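The requirement that a Consumer's choices lie wholly within one Simultaneous Transmission Set can be sketched as a subset test. The names below are hypothetical, the sets are taken from the two-set example table, and the caller is assumed to run the check once per media type. The case of an Advertisement without any Simultaneous Transmission Sets is deliberately left out of this sketch.

```python
from typing import Iterable, Sequence, Set

# The two Simultaneous Transmission Sets from the example table.
SIMULTANEOUS_SETS = [
    {"VC0", "VC1", "VC2"},
    {"VC0", "VC3", "VC2"},
]


def selection_allowed(
    chosen: Iterable[str],
    simultaneous_sets: Sequence[Set[str]],
) -> bool:
    """True when every chosen Capture falls within a single advertised set."""
    if not simultaneous_sets:
        # Out of scope here: different rules apply when no sets are advertised.
        raise ValueError("sketch only covers Advertisements with Simultaneous Transmission Sets")
    chosen_set = set(chosen)
    return any(chosen_set.issubset(allowed) for allowed in simultaneous_sets)
```

With the example sets, choosing VC0 and VC1 is allowed, while choosing VC1 and VC3 together is not, because no single set contains both.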

	<section anchor="section-9" numbered="true" toc="default">	<section anchor="s-9" numbered="true" toc="default">
	<name>Encodings</name>	<name>Encodings</name>
	<t>	<t>
	Individual encodings and encoding groups are CLUE's mechanisms	Individual encodings and encoding groups are CLUE's mechanisms
	allowing a Provider to signal its limitations for sending Captures,	allowing a Provider to signal its limitations for sending Captures,
	or combinations of Captures, to a Consumer. Consumers can map the	or combinations of Captures, to a Consumer. Consumers can map the
	Captures they want to receive onto the Encodings, with the encoding	Captures they want to receive onto the Encodings, with the encoding
	parameters they want. As for the relationship between the CLUE-specified mechanisms based on Encodings and the SIP offer/answer	parameters they want. As for the relationship between the CLUE-specified mechanisms based on Encodings and the SIP offer/answer

	exchange, please refer to <xref target="section-5" format="default"/>	exchange, please refer to <xref target="s-5" format="default"/>.</t>
	.</t>	<section anchor="s-9.1" numbered="true" toc="default">
	<section anchor="section-9.1" numbered="true" toc="default">
	<name>Individual Encodings</name>	<name>Individual Encodings</name>
	<t>	<t>
	An Individual Encoding represents a way to encode a Media Capture	An Individual Encoding represents a way to encode a Media Capture
	as a Capture Encoding, to be sent as an encoded media stream from	as a Capture Encoding, to be sent as an encoded media stream from
	the Provider to the Consumer. An Individual Encoding has a set of	the Provider to the Consumer. An Individual Encoding has a set of
	parameters characterizing how the media is encoded.</t>	parameters characterizing how the media is encoded.</t>
	<t>	<t>
	Different media types have different parameters, and different	Different media types have different parameters, and different
	encoding algorithms may have different parameters. An Individual	encoding algorithms may have different parameters. An Individual
	Encoding can be assigned to at most one Capture Encoding at any	Encoding can be assigned to at most one Capture Encoding at any

	skipping to change at line 1816 ¶	skipping to change at line 1837 ¶
	<ul spacing="compact">	<ul spacing="compact">
	<li>Maximum bandwidth;</li>	<li>Maximum bandwidth;</li>
	<li>Maximum picture size in pixels;</li>	<li>Maximum picture size in pixels;</li>
	<li>Maximum number of pixels to be processed per second;</li>	<li>Maximum number of pixels to be processed per second;</li>
	</ul>	</ul>
	<t>	<t>
	The bandwidth parameter is the only one that specifically relates	The bandwidth parameter is the only one that specifically relates
	to a CLUE Advertisement, as it can be further constrained by the	to a CLUE Advertisement, as it can be further constrained by the
	maximum group bandwidth in an Encoding Group.</t>	maximum group bandwidth in an Encoding Group.</t>
	</section>	</section>

	<section anchor="section-9.2" numbered="true" toc="default">	<section anchor="s-9.2" numbered="true" toc="default">
	<name>Encoding Group</name>	<name>Encoding Group</name>
	<t>	<t>
	An Encoding Group includes a set of one or more Individual	An Encoding Group includes a set of one or more Individual
	Encodings, and parameters that apply to the group as a whole. By	Encodings, and parameters that apply to the group as a whole. By
	grouping multiple individual Encodings together, an Encoding Group	grouping multiple individual Encodings together, an Encoding Group
	describes additional constraints on bandwidth for the group. A	describes additional constraints on bandwidth for the group. A
	single Encoding Group MAY refer to Encodings for different media	single Encoding Group MAY refer to Encodings for different media
	types.</t>	types.</t>
	<t>The Encoding Group data structure contains:	<t>The Encoding Group data structure contains:


	skipping to change at line 1886 ¶	skipping to change at line 1907 ¶
	-->	-->
	<t>While a typical three-codec/display system might have one Encoding	<t>While a typical three-codec/display system might have one Encoding
	Group per "codec box" (physical codec, connected to one camera and	Group per "codec box" (physical codec, connected to one camera and
	one screen), there are many possibilities for the number of	one screen), there are many possibilities for the number of
	Encoding Groups a Provider may be able to offer and for the	Encoding Groups a Provider may be able to offer and for the
	encoding values in each Encoding Group.</t>	encoding values in each Encoding Group.</t>
	<t>	<t>
	There is no requirement for all Encodings within an Encoding Group	There is no requirement for all Encodings within an Encoding Group
	to be instantiated at the same time.</t>	to be instantiated at the same time.</t>
	</section>	</section>

	<section anchor="section-9.3" numbered="true" toc="default">	<section anchor="s-9.3" numbered="true" toc="default">
	<name>Associating Captures with Encoding Groups</name>	<name>Associating Captures with Encoding Groups</name>
	<t>	<t>
	Each Media Capture, including MCCs, MAY be associated with one	Each Media Capture, including MCCs, MAY be associated with one
	Encoding Group. To be eligible for configuration, a Media Capture	Encoding Group. To be eligible for configuration, a Media Capture
	MUST be associated with one Encoding Group, which is used to	MUST be associated with one Encoding Group, which is used to
	instantiate that Capture into a Capture Encoding. When an MCC is	instantiate that Capture into a Capture Encoding. When an MCC is
	configured, all the Media Captures referenced by the MCC will appear	configured, all the Media Captures referenced by the MCC will appear
	in the Capture Encoding according to the attributes of the chosen	in the Capture Encoding according to the attributes of the chosen
	encoding of the MCC. This allows an Advertiser to specify encoding	encoding of the MCC. This allows an Advertiser to specify encoding
	attributes associated with the Media Captures without the need to	attributes associated with the Media Captures without the need to

	skipping to change at line 1908 ¶	skipping to change at line 1929 ¶
	<t>	<t>
	If an Encoding Group is assigned to a Media Capture referenced by	If an Encoding Group is assigned to a Media Capture referenced by
	the MCC, it indicates that this Capture may also have an individual	the MCC, it indicates that this Capture may also have an individual
	Capture Encoding.</t>	Capture Encoding.</t>
	<t>For example:	<t>For example:
	</t>	</t>
	<table anchor="ref-example-usage-of-encoding-with-mcc-and-source-captures" align="center">	<table anchor="ref-example-usage-of-encoding-with-mcc-and-source-captures" align="center">
	<name>Example Usage of Encoding with MCC and Source Captures</name>	<name>Example Usage of Encoding with MCC and Source Captures</name>
	<thead>	<thead>
	<tr>	<tr>

	<th align="left"> Capture Scene #1</th>	<th align="left">Capture Scene #1</th>
	<th align="left"> </th>	<th align="left"> </th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">VC1</td>	<td align="left">VC1</td>
	<td align="left">EncodeGroupID=1</td>	<td align="left">EncodeGroupID=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">VC2</td>	<td align="left">VC2</td>

	skipping to change at line 1954 ¶	skipping to change at line 1975 ¶
	individual Encodings in the group. The actual number of Capture	individual Encodings in the group. The actual number of Capture
	Encodings used at any time MAY be less than this maximum. Any of	Encodings used at any time MAY be less than this maximum. Any of
	the Captures that use a particular Encoding Group can be encoded	the Captures that use a particular Encoding Group can be encoded
	according to any of the Individual Encodings in the group.</t>	according to any of the Individual Encodings in the group.</t>
	<t>	<t>
	It is a protocol conformance requirement that the Encoding Groups	It is a protocol conformance requirement that the Encoding Groups
	MUST allow all the Captures in a particular Capture Scene View to	MUST allow all the Captures in a particular Capture Scene View to
	be used simultaneously.</t>	be used simultaneously.</t>
	</section>	</section>
	</section>	</section>

	<section anchor="section-10" numbered="true" toc="default">	<section anchor="s-10" numbered="true" toc="default">
	<name>Consumer's Choice of Streams to Receive from the Provider</name>	<name>Consumer's Choice of Streams to Receive from the Provider</name>
	<t>	<t>
	After receiving the Provider's Advertisement message (which includes	After receiving the Provider's Advertisement message (which includes
	media captures and associated constraints), the Consumer composes	media captures and associated constraints), the Consumer composes
	its reply to the Provider in the form of a Configure message. The	its reply to the Provider in the form of a Configure message. The
	Consumer is free to use the information in the Advertisement as it	Consumer is free to use the information in the Advertisement as it
	chooses, but there are a few obviously sensible design choices,	chooses, but there are a few obviously sensible design choices,
	which are outlined below.</t>	which are outlined below.</t>
	<t>	<t>
	If multiple Providers connect to the same Consumer (i.e., in an	If multiple Providers connect to the same Consumer (i.e., in an

	skipping to change at line 2054 ¶	skipping to change at line 2075 ¶
	recently received Advertisement. The Consumer can send a Configure	recently received Advertisement. The Consumer can send a Configure
	either in response to a new Advertisement from the Provider or on	either in response to a new Advertisement from the Provider or on
	its own, for example, because of a local change in conditions	its own, for example, because of a local change in conditions
	(people leaving the room, connectivity changes, multipoint-related	(people leaving the room, connectivity changes, multipoint-related
	considerations).</t>	considerations).</t>
	<t>	<t>
	When choosing which Media Streams to receive from the Provider, and	When choosing which Media Streams to receive from the Provider, and
	the encoding characteristics of those Media Streams, the Consumer	the encoding characteristics of those Media Streams, the Consumer
	advantageously takes several things into account: its local	advantageously takes several things into account: its local
	preference, simultaneity restrictions, and encoding limits.</t>	preference, simultaneity restrictions, and encoding limits.</t>

	<section anchor="section-10.1" numbered="true" toc="default">	<section anchor="s-10.1" numbered="true" toc="default">
	<name>Local Preference</name>	<name>Local Preference</name>
	<t>	<t>
	A variety of local factors influence the Consumer's choice of	A variety of local factors influence the Consumer's choice of
	Media Streams to be received from the Provider:</t>	Media Streams to be received from the Provider:</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>If the Consumer is an Endpoint, it is likely that it would	<li>If the Consumer is an Endpoint, it is likely that it would
	choose, where possible, to receive video and audio Captures that	choose, where possible, to receive video and audio Captures that
	match the number of display devices and audio system it has.</li>	match the number of display devices and audio system it has.</li>
	<li>If the Consumer is an MCU, it may choose to receive loudest	<li>If the Consumer is an MCU, it may choose to receive loudest
	speaker streams (in order to perform its own media composition)	speaker streams (in order to perform its own media composition)
	and avoid pre-composed video Captures.</li>	and avoid pre-composed video Captures.</li>
	<li>User choice (for instance, selection of a new layout) may result	<li>User choice (for instance, selection of a new layout) may result
	in a different set of Captures, or different encoding	in a different set of Captures, or different encoding
	characteristics, being required by the Consumer.</li>	characteristics, being required by the Consumer.</li>
	</ul>	</ul>
	</section>	</section>

	<section anchor="section-10.2" numbered="true" toc="default">	<section anchor="s-10.2" numbered="true" toc="default">
	<name>Physical Simultaneity Restrictions</name>	<name>Physical Simultaneity Restrictions</name>
	<t>	<t>
	Often there are physical simultaneity constraints of the Provider	Often there are physical simultaneity constraints of the Provider
	that affect the Provider's ability to simultaneously send all of	that affect the Provider's ability to simultaneously send all of
	the captures the Consumer would wish to receive. For instance, an	the captures the Consumer would wish to receive. For instance, an
	MCU, when connected to a multi-camera room system, might prefer to	MCU, when connected to a multi-camera room system, might prefer to
	receive both individual video streams of the people present in the	receive both individual video streams of the people present in the
	room and an overall view of the room from a single camera. Some	room and an overall view of the room from a single camera. Some
	Endpoint systems might be able to provide both of these sets of	Endpoint systems might be able to provide both of these sets of
	streams simultaneously, whereas others might not (if the overall	streams simultaneously, whereas others might not (if the overall
	room view were produced by changing the optical zoom level on the	room view were produced by changing the optical zoom level on the
	center camera, for instance).</t>	center camera, for instance).</t>
	</section>	</section>

	<section anchor="section-10.3" numbered="true" toc="default">	<section anchor="s-10.3" numbered="true" toc="default">
	<name>Encoding and Encoding Group Limits</name>	<name>Encoding and Encoding Group Limits</name>
	<t>	<t>
	Each of the Provider's encoding groups has limits on bandwidth,	Each of the Provider's encoding groups has limits on bandwidth,
	and the constituent potential encodings have limits on the	and the constituent potential encodings have limits on the
	bandwidth, computational complexity, video frame rate, and	bandwidth, computational complexity, video frame rate, and
	resolution that can be provided. When choosing the Captures to be	resolution that can be provided. When choosing the Captures to be
	received from a Provider, a Consumer device MUST ensure that the	received from a Provider, a Consumer device MUST ensure that the
	encoding characteristics requested for each individual Capture	encoding characteristics requested for each individual Capture
	fit within the capability of the encoding it is being configured	fit within the capability of the encoding it is being configured
	to use, as well as ensuring that the combined encoding	to use, as well as ensuring that the combined encoding
	characteristics for Captures fit within the capabilities of their	characteristics for Captures fit within the capabilities of their
	associated encoding groups. In some cases, this could cause an	associated encoding groups. In some cases, this could cause an
	otherwise "preferred" choice of capture encodings to be passed	otherwise "preferred" choice of capture encodings to be passed
	over in favor of different Capture Encodings -- for instance, if a	over in favor of different Capture Encodings -- for instance, if a
	set of three Captures could only be provided at a low resolution,	set of three Captures could only be provided at a low resolution,
	then a three-screen device could switch to favoring a single,	then a three-screen device could switch to favoring a single,
	higher-quality Capture Encoding.</t>	higher-quality Capture Encoding.</t>
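The two checks described above (each requested Capture Encoding fits its Individual Encoding, and the combined request fits the Encoding Group) can be sketched as follows. This is a minimal illustration, not part of the framework: the class names, the `check_request` helper, and the choice to check bandwidth only are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Encoding:
    encode_id: str
    max_bandwidth: int  # bits per second


@dataclass
class EncodingGroup:
    group_id: str
    max_group_bandwidth: int  # bits per second
    encodings: dict = field(default_factory=dict)  # encode_id -> Encoding


def check_request(group, requested):
    """Return a list of problems with the requested bandwidths.

    `requested` maps encode_id -> bandwidth the Consumer wants to configure.
    An empty list means the request fits the advertised limits.
    (Hypothetical helper; only bandwidth is checked in this sketch.)
    """
    problems = []
    for encode_id, bandwidth in requested.items():
        encoding = group.encodings.get(encode_id)
        if encoding is None:
            problems.append(f"{encode_id}: not in group {group.group_id}")
        elif bandwidth > encoding.max_bandwidth:
            problems.append(f"{encode_id}: exceeds maxBandwidth")
    # The group-wide limit applies to the sum of all configured encodings.
    if sum(requested.values()) > group.max_group_bandwidth:
        problems.append(f"{group.group_id}: exceeds maxGroupBandwidth")
    return problems


eg0 = EncodingGroup("EG0", 6_000_000, {
    "VIDENC0": Encoding("VIDENC0", 4_000_000),
    "VIDENC1": Encoding("VIDENC1", 4_000_000),
})
too_much = check_request(eg0, {"VIDENC0": 4_000_000, "VIDENC1": 4_000_000})
fits = check_request(eg0, {"VIDENC0": 3_000_000, "VIDENC1": 3_000_000})
```

With these illustrative numbers, each encoding is individually within its limit in both requests, but only the second request also fits within the group limit, so a Consumer would have to lower at least one bandwidth or drop a Capture.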
	</section>	</section>
	</section>	</section>

	<section anchor="section-11" numbered="true" toc="default">	<section anchor="s-11" numbered="true" toc="default">
	<name>Extensibility</name>	<name>Extensibility</name>
	<t>	<t>
	One important characteristic of the Framework is its	One important characteristic of the Framework is its
	extensibility. The standard for interoperability and handling	extensibility. The standard for interoperability and handling
	multiple streams must be future-proof. The framework itself is	multiple streams must be future-proof. The framework itself is
	inherently extensible through expanding the data model types. For	inherently extensible through expanding the data model types. For
	example:</t>	example:</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>Adding more types of media, such as telemetry, can be done by	<li>Adding more types of media, such as telemetry, can be done by
	defining additional types of Captures in addition to audio and	defining additional types of Captures in addition to audio and
	video.</li>	video.</li>
	<li>Adding new functionalities, such as 3-D video Captures, may	<li>Adding new functionalities, such as 3-D video Captures, may
	require additional attributes describing the Captures.</li>	require additional attributes describing the Captures.</li>
	</ul>	</ul>
	<t>	<t>
	The infrastructure is designed to be extended rather than	The infrastructure is designed to be extended rather than
	requiring new infrastructure elements. Extension comes through	requiring new infrastructure elements. Extension comes through
	adding to defined types.</t>	adding to defined types.</t>
	</section>	</section>

	<section anchor="section-12" numbered="true" toc="default">	<section anchor="s-12" numbered="true" toc="default">
	<name>Examples - Using the Framework (Informative)</name>	<name>Examples - Using the Framework (Informative)</name>
	<t>	<t>
	This section gives some examples, first from the point of view of	This section gives some examples, first from the point of view of
	the Provider, then the Consumer, then some multipoint scenarios.</t>	the Provider, then the Consumer, then some multipoint scenarios.</t>

	<section anchor="section-12.1" numbered="true" toc="default">	<section anchor="s-12.1" numbered="true" toc="default">
	<name>Provider Behavior</name>	<name>Provider Behavior</name>
	<t>	<t>
	This section shows some examples in more detail of how a Provider	This section shows some examples in more detail of how a Provider
	can use the framework to represent a typical case for telepresence	can use the framework to represent a typical case for telepresence
	rooms. First, an endpoint is illustrated, then an MCU case is	rooms. First, an endpoint is illustrated, then an MCU case is
	shown.</t>	shown.</t>

	<section anchor="section-12.1.1" numbered="true" toc="default">	<section anchor="s-12.1.1" numbered="true" toc="default">
	<name>Three Screen Endpoint Provider</name>	<name>Three Screen Endpoint Provider</name>
	<t>	<t>
	Consider an Endpoint with the following description:</t>	Consider an Endpoint with the following description:</t>
	<t>	<t>
	Three cameras, three displays, and a six-person table</t>	Three cameras, three displays, and a six-person table</t>
	<ul spacing="normal">	<ul spacing="normal">
	<li>Each camera can provide one Capture for each 1/3-section of the	<li>Each camera can provide one Capture for each 1/3-section of the
	table.</li>	table.</li>
	<li>A single Capture representing the active speaker can be provided	<li>A single Capture representing the active speaker can be provided
	(voice-activity-based camera selection to a given encoder input	(voice-activity-based camera selection to a given encoder input
	port implemented locally in the Endpoint).</li>	port implemented locally in the Endpoint).</li>
	<li>A single Capture representing the active speaker with the other	<li>A single Capture representing the active speaker with the other
	two Captures shown picture in picture (PiP) within the stream can	two Captures shown picture in picture (PiP) within the stream can
	be provided (again, implemented inside the endpoint).</li>	be provided (again, implemented inside the endpoint).</li>
	<li>A Capture showing a zoomed-out view of all six seats in the room	<li>A Capture showing a zoomed-out view of all six seats in the room
	can be provided.</li>	can be provided.</li>
	</ul>	</ul>
	<t>	<t>
	The video and audio Captures for this Endpoint can be described as	The video and audio Captures for this Endpoint can be described as
	follows.</t>	follows.</t>

	<!--[rfced]

	<t>	<t>
	Video Captures:	Video Captures:
	</t>	</t>
	<dl newline="false" spacing="normal" indent="6">	<dl newline="false" spacing="normal" indent="6">
	<dt>VC0</dt>	<dt>VC0</dt>
	<dd>(the left camera stream), encoding group=EG0, view=table</dd>	<dd>(the left camera stream), encoding group=EG0, view=table</dd>
	<dt>VC1</dt>	<dt>VC1</dt>
	<dd>(the center camera stream), encoding group=EG1, view=table</dd>	<dd>(the center camera stream), encoding group=EG1, view=table</dd>
	<dt>VC2</dt>	<dt>VC2</dt>
	<dd>(the right camera stream), encoding group=EG2, view=table</dd>	<dd>(the right camera stream), encoding group=EG2, view=table</dd>

	skipping to change at line 2358 ¶	skipping to change at line 2377 ¶
	]]></artwork>	]]></artwork>
	</figure>	</figure>
	<t>	<t>
	Capture Scenes:</t>	Capture Scenes:</t>
	<t>	<t>
	The following table represents the Capture Scenes for this	The following table represents the Capture Scenes for this
	Provider. Recall that a Capture Scene is composed of alternative	Provider. Recall that a Capture Scene is composed of alternative
	Capture Scene Views covering the same spatial region. Capture	Capture Scene Views covering the same spatial region. Capture
	Scene #1 is for the main people captures, and Capture Scene #2 is	Scene #1 is for the main people captures, and Capture Scene #2 is
	for presentation.</t>	for presentation.</t>

	<t>Each row in the table is a separate Capture Scene View</t>	<t>Each row in the table is a separate Capture Scene View.</t>

	<table align="center">	<table align="center">

	<thead>	<name>Example Capture Scene Views</name>
		<tbody>
	<tr>	<tr>
	<th align="left"> Capture Scene #1</th>	<th align="left"> Capture Scene #1</th>
	</tr>	</tr>

	</thead>
	<tbody>
	<tr>	<tr>
	<td align="left">VC0, VC1, VC2</td>	<td align="left">VC0, VC1, VC2</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">MCC3</td>	<td align="left">MCC3</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">MCC4</td>	<td align="left">MCC4</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">VC5</td>	<td align="left">VC5</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">AC0, AC1, AC2</td>	<td align="left">AC0, AC1, AC2</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">AC3</td>	<td align="left">AC3</td>
	</tr>	</tr>
	</tbody>	</tbody>

	</table>
	<table anchor="Table7" align="center">	<tbody>
	<name>Example Capture Scene Views</name>
	<thead>
	<tr>	<tr>
	<th align="left"> Capture Scene #2</th>	<th align="left"> Capture Scene #2</th>
	</tr>	</tr>

	</thead>
	<tbody>
	<tr>	<tr>
	<td align="left">VC6</td>	<td align="left">VC6</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">AC4</td>	<td align="left">AC4</td>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t>	<t>
	Different Capture Scenes are distinct from each other and do not	Different Capture Scenes are distinct from each other and do not

	skipping to change at line 2432 ¶	skipping to change at line 2447 ¶
	to receive, partially based on how many streams it can simultaneously	to receive, partially based on how many streams it can simultaneously
	receive. A consumer that can receive three video streams would	receive. A consumer that can receive three video streams would
	probably prefer to receive the first view of Capture Scene #1	probably prefer to receive the first view of Capture Scene #1
	(VC0, VC1, and VC2) and not receive the other views. A consumer that	(VC0, VC1, and VC2) and not receive the other views. A consumer that
	can receive only one video stream would probably choose one of the	can receive only one video stream would probably choose one of the
	other views.</t>	other views.</t>
	<t>	<t>
	If the consumer can receive a presentation stream too, it would	If the consumer can receive a presentation stream too, it would
	also choose to receive the only view from Capture Scene #2 (VC6).</t>	also choose to receive the only view from Capture Scene #2 (VC6).</t>
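The view selection described above can be sketched as a small function. This is only an illustration: the `choose_view` helper and its rule (take the view with the most captures that the consumer can receive, preferring the earliest advertised view on ties) are assumptions, since the framework leaves the choice to the consumer.

```python
# Video views of Capture Scene #1 and #2 from the example, in advertised order.
SCENE_1_VIDEO_VIEWS = [
    ["VC0", "VC1", "VC2"],
    ["MCC3"],
    ["MCC4"],
    ["VC5"],
]
SCENE_2_VIDEO_VIEWS = [["VC6"]]


def choose_view(views, max_streams):
    """Pick the view with the most captures that fits in max_streams.

    Ties go to the earliest advertised view. Returns None if nothing fits.
    (Hypothetical policy; a real consumer may weigh other preferences.)
    """
    fitting = [view for view in views if max_streams >= len(view)]
    if not fitting:
        return None
    # max() keeps the first element among equally long views.
    return max(fitting, key=len)


three_screen = choose_view(SCENE_1_VIDEO_VIEWS, 3)
one_screen = choose_view(SCENE_1_VIDEO_VIEWS, 1)
presentation = choose_view(SCENE_2_VIDEO_VIEWS, 1)
```

Under this policy a three-stream consumer selects VC0, VC1, and VC2, while a one-stream consumer falls back to the first single-capture view; a consumer with user-facing layout controls might instead offer MCC3, MCC4, and VC5 as alternatives.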
	</section>	</section>

	<section anchor="section-12.1.2" numbered="true" toc="default">	<section anchor="s-12.1.2" numbered="true" toc="default">
	<name>Encoding Group Example</name>	<name>Encoding Group Example</name>
	<t>	<t>
	This is an example of an Encoding Group to illustrate how it can	This is an example of an Encoding Group to illustrate how it can
	express dependencies between Encodings. The information below	express dependencies between Encodings. The information below
	about Encodings is a summary of what would be conveyed in SDP, not	about Encodings is a summary of what would be conveyed in SDP, not
	directly in the CLUE Advertisement.</t>	directly in the CLUE Advertisement.</t>
	<artwork name="" type="" align="left" alt=""><![CDATA[	<artwork name="" type="" align="left" alt=""><![CDATA[
	encodeGroupID=EG0 maxGroupBandwidth=6000000	encodeGroupID=EG0 maxGroupBandwidth=6000000
	encodeID=VIDENC0, maxWidth=1920, maxHeight=1088,	encodeID=VIDENC0, maxWidth=1920, maxHeight=1088,
	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000

	skipping to change at line 2476 ¶	skipping to change at line 2491 ¶
	encodeID=VIDENC0, maxWidth=1920, maxHeight=1088,	encodeID=VIDENC0, maxWidth=1920, maxHeight=1088,
	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000
	encodeID=VIDENC1, maxWidth=1920, maxHeight=1088,	encodeID=VIDENC1, maxWidth=1920, maxHeight=1088,
	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000	maxFrameRate=60, maxPps=62208000, maxBandwidth=4000000
	encodeGroupID=EG1 maxGroupBandwidth=500000	encodeGroupID=EG1 maxGroupBandwidth=500000
	encodeID=AUDENC0, maxBandwidth=96000	encodeID=AUDENC0, maxBandwidth=96000
	encodeID=AUDENC1, maxBandwidth=96000	encodeID=AUDENC1, maxBandwidth=96000
	encodeID=AUDENC2, maxBandwidth=96000	encodeID=AUDENC2, maxBandwidth=96000
	]]></artwork>	]]></artwork>
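The limits in the listings above interact. As a rough worked example, the sketch below computes the frame rate that VIDENC0 could sustain at a given picture size. It assumes that maxPps is the product of width, height, and frame rate, and that each limit acts as an independent ceiling; the function name `usable_frame_rate` is hypothetical.

```python
def usable_frame_rate(width, height, max_pps, max_frame_rate):
    """Highest frame rate at width x height under both the maxPps and
    maxFrameRate ceilings (assumes pixels/second = width * height * fps)."""
    return min(max_frame_rate, max_pps / (width * height))


# Values copied from the VIDENC0 line of the example.
VIDENC0 = {"maxWidth": 1920, "maxHeight": 1088,
           "maxFrameRate": 60, "maxPps": 62208000}

# At the full 1920x1088 picture size, maxPps is the binding limit.
full_size_fps = usable_frame_rate(
    VIDENC0["maxWidth"], VIDENC0["maxHeight"],
    VIDENC0["maxPps"], VIDENC0["maxFrameRate"])

# At 1280x720, maxPps would allow 67.5 fps, so maxFrameRate binds instead.
hd720_fps = usable_frame_rate(
    1280, 720, VIDENC0["maxPps"], VIDENC0["maxFrameRate"])

# Audio group EG1: all three encodings at their maximum stay within the group.
audio_total = 3 * 96000
audio_fits = 500000 >= audio_total
```

Under these assumptions the encoder cannot deliver its maximum picture size and its maximum frame rate at the same time: 62208000 pixels per second corresponds to about 30 frames per second at full size. The audio group, by contrast, can run all three encodings at their maximum (288000 of 500000 bits per second).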
	</section>	</section>

	<section anchor="section-12.1.3" numbered="true" toc="default">	<section anchor="s-12.1.3" numbered="true" toc="default">
	<name>The MCU Case</name>	<name>The MCU Case</name>
	<t>	<t>
	This section shows how an MCU might express its Capture Scenes,	This section shows how an MCU might express its Capture Scenes,
	intending to offer different choices for consumers that can handle	intending to offer different choices for consumers that can handle
	different numbers of streams. Each MCC is for video. A single	different numbers of streams. Each MCC is for video. A single
	Audio Capture is provided for all single and multi-screen	Audio Capture is provided for all single and multi-screen
	configurations that can be associated (e.g., lip-synced) with any	configurations that can be associated (e.g., lip-synced) with any
	combination of Video Captures (the MCCs) at the consumer.</t>	combination of Video Captures (the MCCs) at the consumer.</t>
	<table anchor="ref-mcu-main-capture-scenes" align="center">	<table anchor="ref-mcu-main-capture-scenes" align="center">
	<name>MCU Main Capture Scenes</name>	<name>MCU Main Capture Scenes</name>
	<thead>	<thead>
	<tr>	<tr>

	<th align="left"> Capture Scene #1</th>	<th align="left">Capture Scene #1</th>
	<th align="left"/>	<th align="left"/>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">MCC</td>	<td align="left">MCC</td>
	<td align="left">for a single-screen consumer</td>	<td align="left">for a single-screen consumer</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">MCC1, MCC2</td>	<td align="left">MCC1, MCC2</td>

	skipping to change at line 2543 ¶	skipping to change at line 2558 ¶
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t>	<t>
	If/when a presentation stream becomes active within the conference,	If/when a presentation stream becomes active within the conference,
	the MCU might re-advertise the available media as:</t>	the MCU might re-advertise the available media as:</t>
	<table anchor="ref-mcu-presentation-capture-scene" align="center">	<table anchor="ref-mcu-presentation-capture-scene" align="center">
	<name>MCU Presentation Capture Scene</name>	<name>MCU Presentation Capture Scene</name>
	<thead>	<thead>
	<tr>	<tr>

	<th align="left"> Capture Scene #2</th>	<th align="left">Capture Scene #2</th>
	<th align="left"> Note</th>	<th align="left">Note</th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">VC10</td>	<td align="left">VC10</td>
	<td align="left">Video capture for presentation</td>	<td align="left">Video capture for presentation</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">AC1</td>	<td align="left">AC1</td>
	<td align="left">Presentation audio to accompany VC10</td>	<td align="left">Presentation audio to accompany VC10</td>

	skipping to change at line 2568 ¶	skipping to change at line 2583 ¶
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(AC1)</td>	<td align="left">CSV(AC1)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	</section>	</section>
	</section>	</section>

	<section anchor="section-12.2" numbered="true" toc="default">	<section anchor="s-12.2" numbered="true" toc="default">
	<name>Media Consumer Behavior</name>	<name>Media Consumer Behavior</name>
	<t>	<t>
	This section gives an example of how a Media Consumer might behave	This section gives an example of how a Media Consumer might behave
	when deciding how to request streams from the three-screen	when deciding how to request streams from the three-screen
	endpoint described in the previous section.</t>	endpoint described in the previous section.</t>
	<t>	<t>
	The receive side of a call needs to balance its requirements	The receive side of a call needs to balance its requirements
	(based on number of screens and speakers), its decoding capabilities,	(based on number of screens and speakers), its decoding capabilities,
	available bandwidth, and the provider's capabilities in order	available bandwidth, and the provider's capabilities in order
	to optimally configure the provider's streams. Typically, it would	to optimally configure the provider's streams. Typically, it would

	skipping to change at line 2597 ¶	skipping to change at line 2612 ¶
	alternative views in the video Capture Scenes based either on	alternative views in the video Capture Scenes based either on
	hard-coded preferences or on user choice. Once this choice has been	hard-coded preferences or on user choice. Once this choice has been
	made, the consumer would then decide how to configure the	made, the consumer would then decide how to configure the
	provider's encoding groups in order to make best use of the	provider's encoding groups in order to make best use of the
	available network bandwidth and its own decoding capabilities.</t>	available network bandwidth and its own decoding capabilities.</t>
	<!--[rfced] We note the use of both "single-screen" and	<!--[rfced] We note the use of both "single-screen" and
	"one-screen". May we update to use the latter consistently to	"one-screen". May we update to use the latter consistently to
	match "two-screen" and the like?	match "two-screen" and the like?

	-->	-->

	<section anchor="section-12.2.1" numbered="true" toc="default">	<section anchor="s-12.2.1" numbered="true" toc="default">
	<name>One-Screen Media Consumer</name>	<name>One-Screen Media Consumer</name>
	<t>	<t>
	MCC3, MCC4, and VC5 are all different views by themselves, not	MCC3, MCC4, and VC5 are all different views by themselves, not
	grouped together in a single view; so, the receiving device should	grouped together in a single view; so, the receiving device should
	choose one of those. The choice would come down to	choose one of those. The choice would come down to
	whether to see the greatest number of participants simultaneously	whether to see the greatest number of participants simultaneously
	at roughly equal precedence (VC5), a switched view of just the	at roughly equal precedence (VC5), a switched view of just the
	loudest region (MCC3), or a switched view with PiPs (MCC4). An	loudest region (MCC3), or a switched view with PiPs (MCC4). An
	endpoint device with a small amount of knowledge of these	endpoint device with a small amount of knowledge of these
	differences could offer a dynamic choice of these options, in-call, to the user.</t>	differences could offer a dynamic choice of these options, in-call, to the user.</t>
	</section>	</section>

	<section anchor="section-12.2.2" numbered="true" toc="default">	<section anchor="s-12.2.2" numbered="true" toc="default">
	<name>Two-Screen Media Consumer Configuring the Example</name>	<name>Two-Screen Media Consumer Configuring the Example</name>
	<t>	<t>
	Mixing systems with an even number of screens, "2n", and those	Mixing systems with an even number of screens, "2n", and those
	with "2n+1" cameras (and vice versa) is always likely to be the	with "2n+1" cameras (and vice versa) is always likely to be the
	problematic case. In this instance, the behavior is likely to be	problematic case. In this instance, the behavior is likely to be
	determined by whether a "two-screen" system is really a "two-decoder"	determined by whether a "two-screen" system is really a "two-decoder"
	system, i.e., whether only one received stream can be displayed	system, i.e., whether only one received stream can be displayed
	per screen or whether more than two streams can be received and	per screen or whether more than two streams can be received and
	spread across the available screen area. To enumerate three possible	spread across the available screen area. To enumerate three possible
	behaviors here for the two-screen system when it learns that the far	behaviors here for the two-screen system when it learns that the far

	skipping to change at line 2644 ¶	skipping to change at line 2659 ¶
	<li>Receive three streams, decode all three, and use control information	<li>Receive three streams, decode all three, and use control information
	indicating which was the most active to switch between showing	indicating which was the most active to switch between showing
	the left and center streams (one per screen) and the center and	the left and center streams (one per screen) and the center and
	right streams.</li>	right streams.</li>
	</ol>	</ol>
	<t>	<t>
	For an endpoint capable of all three methods of working described	For an endpoint capable of all three methods of working described
	above, again it might be appropriate to offer the user the choice	above, again it might be appropriate to offer the user the choice
	of display mode.</t>	of display mode.</t>
	</section>	</section>

	<section anchor="section-12.2.3" numbered="true" toc="default">	<section anchor="s-12.2.3" numbered="true" toc="default">
	<name>Three-Screen Media Consumer Configuring the Example</name>	<name>Three-Screen Media Consumer Configuring the Example</name>
	<t>	<t>
	This is the most straightforward case: the Media Consumer would	This is the most straightforward case: the Media Consumer would
	look to identify a set of streams to receive that best matched its	look to identify a set of streams to receive that best matched its
	available screens; so, the VC0 plus VC1 plus VC2 should match	available screens; so, the VC0 plus VC1 plus VC2 should match
	optimally. The spatial ordering would give sufficient information	optimally. The spatial ordering would give sufficient information
	for the correct Video Capture to be shown on the correct screen.	for the correct Video Capture to be shown on the correct screen.

	<!--[rfced] The use of "either" in this sentence seems odd. Also, the length of this sentence makes it difficult to follow. If our suggested edits do not convey your intended meaning, please let us know how we may rephrase.	<!--[rfced] The use of "either" in this sentence seems odd. Also, the length of this sentence makes it difficult to follow. If our suggested edits do not convey your intended meaning, please let us know how we may rephrase.


	skipping to change at line 2687 ¶	skipping to change at line 2702 ¶
	-->	-->
	The consumer would either need to divide a single encoding	The consumer would either need to divide a single encoding
	group's capability by 3 to determine what resolution and frame	group's capability by 3 to determine what resolution and frame
	rate to configure the provider with or to configure the individual	rate to configure the provider with or to configure the individual
	Video Captures' Encoding Groups with what makes most sense (taking	Video Captures' Encoding Groups with what makes most sense (taking
	into account the receive side decode capabilities, overall call	into account the receive side decode capabilities, overall call
	bandwidth, the resolution of the screens plus any user preferences	bandwidth, the resolution of the screens plus any user preferences
	such as motion vs. sharpness).</t>	such as motion vs. sharpness).</t>
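	<t>
	As a rough illustration of the first option above, the sketch below
	splits a single encoding group's limits evenly across three Video
	Captures. The field names (max_bandwidth_bps, max_macroblocks_per_sec)
	and the numeric values are assumptions made for this illustration; they
	are not taken from the example Advertisement.</t>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupLimits:
    """Hypothetical limits advertised for one encoding group."""
    max_bandwidth_bps: int
    max_macroblocks_per_sec: int


def split_evenly(group: GroupLimits, captures: list[str]) -> dict[str, GroupLimits]:
    """Give each capture an equal share of the group's limits (integer division)."""
    if not captures:
        raise ValueError("at least one capture is required")
    n = len(captures)
    share = GroupLimits(
        max_bandwidth_bps=group.max_bandwidth_bps // n,
        max_macroblocks_per_sec=group.max_macroblocks_per_sec // n,
    )
    return {capture: share for capture in captures}


# Illustrative numbers only; a real consumer would read them from the Advertisement.
group = GroupLimits(max_bandwidth_bps=6_000_000, max_macroblocks_per_sec=735_000)
shares = split_evenly(group, ["VC0", "VC1", "VC2"])
```

	<t>
	A consumer taking the second option would replace the even split with
	per-capture values chosen from its decode capabilities, the call
	bandwidth, and the screen resolutions.</t>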
	</section>	</section>
	</section>	</section>

	<section anchor="section-12.3" numbered="true" toc="default">	<section anchor="s-12.3" numbered="true" toc="default">
	<name>Multipoint Conference Utilizing Multiple Content Captures</name>	<name>Multipoint Conference Utilizing Multiple Content Captures</name>
	<t>	<t>
	The use of MCCs allows the MCU to construct outgoing Advertisements	The use of MCCs allows the MCU to construct outgoing Advertisements
	describing complex media switching and composition scenarios. The	describing complex media switching and composition scenarios. The
	following sections provide several examples.</t>	following sections provide several examples.</t>
	<t>	<t>
	Note: in the examples the identities of the CLUE elements (e.g.,	Note: in the examples the identities of the CLUE elements (e.g.,
	Captures, Capture Scene) in the incoming Advertisements overlap.	Captures, Capture Scene) in the incoming Advertisements overlap.
	This is because there is no coordination between the endpoints.	This is because there is no coordination between the endpoints.
	The MCU is responsible for making these unique in the outgoing	The MCU is responsible for making these unique in the outgoing
	advertisement.</t>	advertisement.</t>

	<section anchor="section-12.3.1" numbered="true" toc="default">	<section anchor="s-12.3.1" numbered="true" toc="default">
	<name>Single Media Captures and MCC in the Same Advertisement</name>	<name>Single Media Captures and MCC in the Same Advertisement</name>
	<t>	<t>
	Four endpoints are involved in a Conference where CLUE is used. An	Four endpoints are involved in a Conference where CLUE is used. An
	MCU acts as a middlebox between the endpoints with a CLUE channel	MCU acts as a middlebox between the endpoints with a CLUE channel
	between each endpoint and the MCU. The MCU receives the following	between each endpoint and the MCU. The MCU receives the following
	Advertisements.</t>	Advertisements.</t>
	<table anchor="ref-advertisement-received-from-endpoint-a" align="center">	<table anchor="ref-advertisement-received-from-endpoint-a" align="center">
	<name>Advertisement Received from Endpoint A</name>	<name>Advertisement Received from Endpoint A</name>
	<thead>	<thead>
	<tr>	<tr>
	<th align="left"> Capture Scene #1</th>	<th align="left"> Capture Scene #1</th>
	<th align="left"> Description=AustralianConfRoom</th>	<th align="left"> Description=AustralianConfRoom</th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">VC1</td>	<td align="left">VC1</td>

	<td align="left">Description=Audience</td>	<td align="left">Description=Audience<br/>EncodeGroupID=1</td>
	</tr>
	<tr>
	<td align="left"/>
	<td align="left">EncodeGroupID=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(VC1)</td>	<td align="left">CSV(VC1)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<table anchor="ref-advertisement-received-from-endpoint-b" align="center">	<table anchor="ref-advertisement-received-from-endpoint-b" align="center">
	<name>Advertisement Received from Endpoint B</name>	<name>Advertisement Received from Endpoint B</name>
	<thead>	<thead>
	<tr>	<tr>
	<th align="left"> Capture Scene #1</th>	<th align="left"> Capture Scene #1</th>
	<th align="left"> Description=ChinaConfRoom</th>	<th align="left"> Description=ChinaConfRoom</th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">VC1</td>	<td align="left">VC1</td>

	<td align="left">Description=Speaker</td>	<td align="left">Description=Speaker<br/>EncodeGroupID=1</td>
	</tr>
	<tr>
	<td align="left"/>
	<td align="left">EncodeGroupID=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">VC2</td>	<td align="left">VC2</td>

	<td align="left">Description=Audience</td>	<td align="left">Description=Audience<br/>EncodeGroupID=1</td>
	</tr>
	<tr>
	<td align="left"/>
	<td align="left">EncodeGroupID=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(VC1, VC2)</td>	<td align="left">CSV(VC1, VC2)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t keepWithPrevious="true">Note: Endpoint B indicates that it sends two streams.</t>	<t keepWithPrevious="true">Note: Endpoint B indicates that it sends two streams.</t>
	<table anchor="ref-advertisement-received-from-endpoint-c" align="center">	<table anchor="ref-advertisement-received-from-endpoint-c" align="center">
	<name>Advertisement Received from Endpoint C</name>	<name>Advertisement Received from Endpoint C</name>
	<thead>	<thead>
	<tr>	<tr>
	<th align="left"> Capture Scene #1</th>	<th align="left"> Capture Scene #1</th>
	<th align="left"> Description=USAConfRoom</th>	<th align="left"> Description=USAConfRoom</th>
	</tr>	</tr>
	</thead>	</thead>
	<tbody>	<tbody>
	<tr>	<tr>
	<td align="left">VC1</td>	<td align="left">VC1</td>

	<td align="left">Description=Audience</td>	<td align="left">Description=Audience<br/>EncodeGroupID=1</td>
	</tr>
	<tr>
	<td align="left"/>
	<td align="left">EncodeGroupID=1</td>
	</tr>	</tr>
	<tr>	<tr>
	<td align="left">CSV(VC1)</td>	<td align="left">CSV(VC1)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t>	<t>

	If the MCU wanted to provide a Multiple Content Capture containing	If the MCU wanted to provide a Multiple Content Capture containing
	a round-robin switched view of the audience from the three endpoints	a round-robin switched view of the audience from the three endpoints
	and the speaker, it could construct the following advertisement:</t>	and the speaker, it could construct the following advertisement:</t>

	<figure anchor="ref-advertisement-sent-to-endpoint-f-one-encoding">	<table anchor="ref-advertisement-sent-to-endpoint-f-one-encoding">
	<name>Advertisement Sent to Endpoint F - One Encoding</name>	<name>Advertisement Sent to Endpoint F - One Encoding</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<tbody>
	+=======================+=================================+	<tr>
	\| Capture Scene #1 \| Description=AustralianConfRoom \|	<th>Capture Scene #1</th> <th>Description=AustralianConfRoom</th>
	+-----------------------\|---------------------------------+
	\| VC1 \| Description=Audience \|	</tr>
	\| CSV(VC1) \| \|	<tr>
	+=======================+=================================+	<td>VC1</td> <td>Description=Audience</td>
	\| Capture Scene #2 \| Description=ChinaConfRoom \|
	+-----------------------\|---------------------------------+	</tr>
	\| VC2 \| Description=Speaker \|	<tr>
	\| VC3 \| Description=Audience \|	<td>CSV(VC1)</td> <td/>
	\| CSV(VC2, VC3) \| \|	</tr>
	+=======================+=================================+	</tbody>
	\| Capture Scene #3 \| Description=USAConfRoom \|
	+-----------------------\|---------------------------------+	<tbody>
	\| VC4 \| Description=Audience \|	<tr>
	\| CSV(VC4) \| \|	<th>Capture Scene #2</th> <th>Description=ChinaConfRoom</th>
	+=======================+=================================+
	\| Capture Scene #4 \| \|	</tr>
	+-----------------------\|---------------------------------+	<tr>
	\| MCC1(VC1,VC2,VC3,VC4) \| Policy=RoundRobin:1 \|	<td>VC2</td> <td>Description=Speaker</td>
	\| \| MaxCaptures=1 \|
	\| \| EncodingGroup=1 \|	</tr>
	\| CSV(MCC1) \| \|	<tr>
	+=======================+=================================+	<td>VC3</td> <td>Description=Audience</td>
	]]></artwork>
	</figure>	</tr>
		<tr>
		<td>CSV(VC2, VC3)</td> <td/>
		</tr>
		</tbody>
		<tbody>
		<tr>
		<th>Capture Scene #3</th> <th>Description=USAConfRoom</th>
		</tr>
		<tr>
		<td>VC4</td> <td>Description=Audience</td>
		</tr>
		<tr>
		<td>CSV(VC4)</td> <td/>
		</tr>
		</tbody>
		<tbody>
		<tr><th>Capture Scene #4</th> <th/></tr>

		<tr>
		<td>MCC1(VC1,VC2,VC3,VC4)</td>

		<td>Policy=RoundRobin:1<br/>
		MaxCaptures=1<br/>
		EncodingGroup=1</td>
		</tr>
		<tr>
		<td>CSV(MCC1)</td> <td/>
		</tr>
		</tbody>
		</table>

	<t>	<t>

	Alternatively, if the MCU wanted to provide the speaker as one media	Alternatively, if the MCU wanted to provide the speaker as one media
	stream and the audiences as another, it could assign an encoding	stream and the audiences as another, it could assign an encoding
	group to VC2 in Capture Scene 2 and provide a CSV in Capture Scene	group to VC2 in Capture Scene 2 and provide a CSV in Capture Scene
	#4 as per the example below.</t>	#4 as per the example below.</t>

	<figure anchor="ref-advertisement-sent-to-endpoint-f-two-encodings">	<table anchor="ref-advertisement-sent-to-endpoint-f-two-encodings">
	<name>Advertisement Sent to Endpoint F - Two Encodings</name>	<name>Advertisement Sent to Endpoint F - Two Encodings</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<tbody>
	+=======================+=================================+	<tr>
	\| Capture Scene #1 \| Description=AustralianConfRoom \|	<th align="left"> Capture Scene #1</th>
	+-----------------------\|---------------------------------+	<th align="left"> Description=AustralianConfRoom</th>
	\| VC1 \| Description=Audience \|	</tr>
	\| CSV(VC1) \| \|
	+=======================+=================================+	<tr><td>VC1</td> <td>Description=Audience</td>
	\| Capture Scene #2 \| Description=ChinaConfRoom \|	</tr>
	+-----------------------\|---------------------------------+	<tr><td>CSV(VC1)</td> <td/>
	\| VC2 \| Description=Speaker \|	</tr>
	\| \| EncodingGroup=1 \|	</tbody>
	\| VC3 \| Description=Audience \|
	\| CSV(VC2, VC3) \| \|	<tbody>
	+=======================+=================================+	<tr><th>Capture Scene #2</th> <th>Description=ChinaConfRoom</th>
	\| Capture Scene #3 \| Description=USAConfRoom \|
	+-----------------------\|---------------------------------+	</tr>
	\| VC4 \| Description=Audience \|	<tr><td>VC2</td> <td>Description=Speaker
	\| CSV(VC4) \| \|	<br/>EncodingGroup=1</td>
	+=======================+=================================+	</tr>
	\| Capture Scene #4 \| \|	<tr><td>VC3</td> <td>Description=Audience</td>
	+-----------------------\|---------------------------------+	</tr>
	\| MCC1(VC1,VC3,VC4) \| Policy=RoundRobin:1 \|	<tr><td>CSV(VC2, VC3)</td> <td/>
	\| \| MaxCaptures=1 \|	</tr>
	\| \| EncodingGroup=1 \|	</tbody>
	\| \| AllowSubset=True \|
	\| MCC2(VC2) \| MaxCaptures=1 \|	<tbody>
	\| \| EncodingGroup=1 \|	<tr><th>Capture Scene #3</th> <th>Description=USAConfRoom</th>
	\| CSV2(MCC1,MCC2) \| \|	</tr>
	+=======================+=================================+	<tr><td>VC4</td> <td>Description=Audience</td>
	]]></artwork>	</tr>
	</figure>	<tr><td>CSV(VC4)</td> <td/>
		</tr>
		</tbody>

		<tbody>
		<tr><th>Capture Scene #4</th> <th/>
		</tr>
		<tr><td>MCC1(VC1,VC3,VC4)</td> <td>Policy=RoundRobin:1
		<br/>MaxCaptures=1
		<br/>EncodingGroup=1
		<br/>AllowSubset=True</td>
		</tr>
		<tr><td>MCC2(VC2)</td> <td>MaxCaptures=1
		<br/>EncodingGroup=1</td>
		</tr>
		<tr><td>CSV2(MCC1,MCC2)</td> <td/>
		</tr>
		</tbody>
		</table>
	<t>	<t>
	Therefore, a Consumer could choose whether or not to have a separate	Therefore, a Consumer could choose whether or not to have a separate
	speaker-related stream and could choose which endpoints to see. If	speaker-related stream and could choose which endpoints to see. If
	it wanted the second stream but not the Australian conference room,	it wanted the second stream but not the Australian conference room,
	it could indicate the following captures in the Configure message:</t>	it could indicate the following captures in the Configure message:</t>

	<figure anchor="table_15">	<table anchor="table_15">
	<name>MCU Case: Consumer Response</name>	<name>MCU Case: Consumer Response</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<tbody>
	+-----------------------+---------------------------------+	<tr><td>MCC1(VC3,VC4)</td> <td>Encoding</td></tr>
	\| MCC1(VC3,VC4) \| Encoding \|	<tr><td>VC2</td> <td>Encoding</td></tr>
	\| VC2 \| Encoding \|	</tbody>
	+-----------------------\|---------------------------------+	</table>
	]]></artwork>
	</figure>
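	<t>
	The renumbering that the MCU performs in this example can be sketched
	as follows. This is an illustration only: the function name and the
	data layout are hypothetical, and a real MCU is free to choose any
	scheme that yields unique identities in the outgoing Advertisement.</t>

```python
def renumber_captures(adverts):
    """Assign conference-wide unique capture IDs.

    adverts: list of (scene_description, [capture_id, ...]) in the order
    the Advertisements were received. Returns a list of
    (scene_description, {old_id: new_id}) in the same order.
    """
    next_index = 1
    renumbered = []
    for description, capture_ids in adverts:
        mapping = {}
        for old_id in capture_ids:
            # Every endpoint numbers from VC1, so allocate a fresh ID here.
            mapping[old_id] = f"VC{next_index}"
            next_index += 1
        renumbered.append((description, mapping))
    return renumbered


# Capture IDs as received from Endpoints A, B, and C in this example.
received = [
    ("AustralianConfRoom", ["VC1"]),
    ("ChinaConfRoom", ["VC1", "VC2"]),
    ("USAConfRoom", ["VC1"]),
]
outgoing = renumber_captures(received)
```

	<t>
	With the three received Advertisements as input, the mapping reproduces
	the VC1 to VC4 numbering used in the Advertisements sent to Endpoint F.</t>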
	</section>	</section>

	<section anchor="section-12.3.2" numbered="true" toc="default">	<section anchor="s-12.3.2" numbered="true" toc="default">
	<name>Several MCCs in the Same Advertisement</name>	<name>Several MCCs in the Same Advertisement</name>
	<t>	<t>

	<!--[rfced] Is the use of "Multiple MCCs" redundant (as MCC is	<!--[rfced] Is the use of "Multiple MCCs" redundant (as MCC is
	Multiple Content Capture)? Same with "MCC Capture"?	Multiple Content Capture)? Same with "MCC Capture"?

	Originals:	Originals:
	The same index value can be used for multiple MCCs.	The same index value can be used for multiple MCCs.
	...	...
	The Synchronisation Identity MCC attribute indicates how the	The Synchronisation Identity MCC attribute indicates how the

	skipping to change at line 2979 ¶	skipping to change at line 3020 ¶
	<td align="left">CSV(VC1,VC2,VC3)</td>	<td align="left">CSV(VC1,VC2,VC3)</td>
	<td align="left"/>	<td align="left"/>
	</tr>	</tr>
	</tbody>	</tbody>
	</table>	</table>
	<t>	<t>
	The MCU wants to offer Endpoint F three Capture Encodings. Each	The MCU wants to offer Endpoint F three Capture Encodings. Each
	Capture Encoding would contain all the Captures from either	Capture Encoding would contain all the Captures from either
	Endpoint D or Endpoint E, depending on the active speaker.	Endpoint D or Endpoint E, depending on the active speaker.
	The MCU sends the following Advertisement:</t>	The MCU sends the following Advertisement:</t>

	<figure anchor="ref-advertisement-sent-to-endpoint-f">	<table anchor="ref-advertisement-sent-to-endpoint-f">
	<name>Advertisement Sent to Endpoint F</name>	<name>Advertisement Sent to Endpoint F</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<tbody>
	+=======================+=================================+	<tr>
	\| Capture Scene #1 \| Description=AustralianConfRoom \|	<th>Capture Scene #1</th><th>Description=AustralianConfRoom</th>
	+-----------------------\|---------------------------------+
	\| VC1 \| \|	</tr>
	\| VC2 \| \|
	\| VC3 \| \|	<tr><td>VC1</td> <td/></tr>
	\| CSV(VC1,VC2,VC3) \| \|	<tr><td>VC2</td> <td/></tr>
	+=======================+=================================+	<tr><td>VC3</td> <td/></tr>
	\| Capture Scene #2 \| Description=ChinaConfRoom \|	<tr><td>CSV(VC1,VC2,VC3)</td> <td/></tr>
	+-----------------------\|---------------------------------+	</tbody>
	\| VC4 \| \|
	\| VC5 \| \|	<tbody>
	\| VC6 \| \|	<tr><th>Capture Scene #2</th> <th>Description=ChinaConfRoom</th></tr>
	\| CSV(VC4,VC5,VC6) \| \|
	+=======================+=================================+
	\| Capture Scene #3 \| \|	<tr><td>VC4</td> <td/></tr>
	+-----------------------\|---------------------------------+	<tr><td>VC5</td> <td/></tr>
	\| MCC1(VC1,VC4) \| CaptureArea=Left \|	<tr><td>VC6</td> <td/></tr>
	\| \| MaxCaptures=1 \|	<tr><td>CSV(VC4,VC5,VC6)</td> <td/></tr>
	\| \| SynchronizationID=1 \|	</tbody>
	\| \| EncodingGroup=1 \|	<tbody>
	\| MCC2(VC2,VC5) \| CaptureArea=Center \|
	\| \| MaxCaptures=1 \|	<tr><th>Capture Scene #3</th> <th/></tr>
	\| \| SynchronizationID=1 \|
	\| \| EncodingGroup=1 \|	<tr><td>MCC1(VC1,VC4)</td> <td>CaptureArea=Left
	\| MCC3(VC3,VC6) \| CaptureArea=Right \|	<br/>MaxCaptures=1
	\| \| MaxCaptures=1 \|	<br/>SynchronizationID=1
	\| \| SynchronizationID=1 \|	<br/>EncodingGroup=1
	\| \| EncodingGroup=1 \|	</td>
	\| CSV(MCC1,MCC2,MCC3) \| \|	</tr>
	+=======================+=================================+	<tr><td>MCC2(VC2,VC5)</td> <td>CaptureArea=Center
	]]></artwork>	<br/>MaxCaptures=1
	</figure>	<br/>SynchronizationID=1
		<br/>EncodingGroup=1
		</td>
		</tr>
		<tr><td>MCC3(VC3,VC6)</td> <td>CaptureArea=Right
		<br/>MaxCaptures=1
		<br/>SynchronizationID=1
		<br/>EncodingGroup=1
		</td>
		</tr>
		<tr><td>CSV(MCC1,MCC2,MCC3)</td> <td/></tr>
		</tbody>
		</table>
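	<t>
	The effect of a shared SynchronizationID on the three MCCs above can be
	sketched as follows. The scene and MCC membership is copied from the
	Advertisement; the function name and the way the active scene is
	passed in are assumptions made for this illustration.</t>

```python
# Scene membership and MCC sources as listed in the Advertisement above.
SCENES = {
    "AustralianConfRoom": ["VC1", "VC2", "VC3"],
    "ChinaConfRoom": ["VC4", "VC5", "VC6"],
}
MCCS = {
    "MCC1": ["VC1", "VC4"],  # CaptureArea=Left
    "MCC2": ["VC2", "VC5"],  # CaptureArea=Center
    "MCC3": ["VC3", "VC6"],  # CaptureArea=Right
}


def select_sources(active_scene):
    """Pick, for every MCC, the one source that belongs to the active scene.

    Because all three MCCs share one SynchronizationID, they switch
    together, so the result always comes from a single scene.
    """
    active = set(SCENES[active_scene])
    selection = {}
    for mcc, sources in MCCS.items():
        matches = [vc for vc in sources if vc in active]
        if len(matches) != 1:
            raise ValueError(f"{mcc} has no unique source in {active_scene}")
        selection[mcc] = matches[0]  # MaxCaptures=1: exactly one capture
    return selection


current = select_sources("ChinaConfRoom")
```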

	</section>	</section>

	<section anchor="section-12.3.3" numbered="true" toc="default">	<section anchor="s-12.3.3" numbered="true" toc="default">
	<name>Heterogeneous Conference with Switching and Composition</name>	<name>Heterogeneous Conference with Switching and Composition</name>
	<t>	<t>
	Consider a conference between endpoints with the following	Consider a conference between endpoints with the following
	characteristics:</t>	characteristics:</t>

	<dl newline="false" spacing="normal" indent="3">	<dl newline="false" spacing="normal">
	<dt/>	<dt>Endpoint A -</dt>
	<dd>	<dd>4 screens, 3 cameras</dd>
	Endpoint A - 4 screens, 3 cameras</dd>
	</dl>	<dt>Endpoint B -</dt>
	<dl newline="false" spacing="normal" indent="3">	<dd>3 screens, 3 cameras</dd>
	<dt/>
	<dd>	<dt>Endpoint C -</dt>
	Endpoint B - 3 screens, 3 cameras</dd>	<dd>3 screens, 3 cameras</dd>
	</dl>
	<dl newline="false" spacing="normal" indent="3">	<dt>Endpoint D -</dt>
	<dt/>	<dd>3 screens, 3 cameras</dd>
	<dd>
	Endpoint C - 3 screens, 3 cameras</dd>	<dt>Endpoint E -</dt>
	</dl>	<dd>1 screen, 1 camera</dd>
	<dl newline="false" spacing="normal" indent="3">
	<dt/>	<dt>Endpoint F -</dt>
	<dd>	<dd>2 screens, 1 camera</dd>
	Endpoint D - 3 screens, 3 cameras</dd>
	</dl>	<dt>Endpoint G -</dt>
	<dl newline="false" spacing="normal" indent="3">	<dd>1 screen, 1 camera</dd>
	<dt/>
	<dd>
	Endpoint E - 1 screen, 1 camera</dd>
	</dl>
	<dl newline="false" spacing="normal" indent="3">
	<dt/>
	<dd>
	Endpoint F - 2 screens, 1 camera</dd>
	</dl>
	<dl newline="false" spacing="normal" indent="3">
	<dt/>
	<dd>
	Endpoint G - 1 screen, 1 camera</dd>
	</dl>	</dl>
	<t>	<t>
	This example focuses on what the user in one of the three-camera	This example focuses on what the user in one of the three-camera
	multi-screen endpoints sees. Call this person User A, at Endpoint	multi-screen endpoints sees. Call this person User A, at Endpoint
	A. There are four large display screens at Endpoint A. Whenever	A. There are four large display screens at Endpoint A. Whenever
	somebody at another site is speaking, all the video captures from	somebody at another site is speaking, all the video captures from
	that endpoint are shown on the large screens. If the talker is at	that endpoint are shown on the large screens. If the talker is at
	a three-camera site, then the video from those three cameras fills three of	a three-camera site, then the video from those three cameras fills three of
	the screens. If the person speaking is at a single-camera site, then video	the screens. If the person speaking is at a single-camera site, then video
	from that camera fills one of the screens, while the other screens	from that camera fills one of the screens, while the other screens

	skipping to change at line 3200 ¶	skipping to change at line 3239 ¶
	<t>	<t>
	As Endpoints A to D each advertise that three Captures make up a	As Endpoints A to D each advertise that three Captures make up a
	Capture Scene, the MCU offers these in a "site switching" mode.	Capture Scene, the MCU offers these in a "site switching" mode.
	That is, there are three Multiple Content Captures (and	That is, there are three Multiple Content Captures (and
	Capture Encodings) each switching between Endpoints. The MCU	Capture Encodings) each switching between Endpoints. The MCU
	switches in the applicable media into the stream based on voice	switches in the applicable media into the stream based on voice
	activity. Endpoint A will not see a capture from itself.</t>	activity. Endpoint A will not see a capture from itself.</t>
	<t>	<t>
	Using the MCC concept, the MCU would send the following	Using the MCC concept, the MCU would send the following
	Advertisement to Endpoint A:</t>	Advertisement to Endpoint A:</t>

	<figure anchor="ref-advertisement-sent-to-endpoint-a-source-part">	<table anchor="ref-advertisement-sent-to-endpoint-a-source-part">
	<name>Advertisement Sent to Endpoint A - Source Part</name>	<name>Advertisement Sent to Endpoint A - Source Part</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<tbody>
	+=======================+=================================+	<tr>
	\| Capture Scene #1 \| Description=Endpoint B \|	<th>Capture Scene #1</th><th>Description=Endpoint B</th>
	+-----------------------\|---------------------------------+	</tr>
	\| VC4 \| CaptureArea=Left \|
	\| VC5 \| CaptureArea=Center \|	<tr><td>VC4</td> <td>CaptureArea=Left</td></tr>
	\| VC6 \| CaptureArea=Right \|	<tr><td>VC5</td> <td>CaptureArea=Center</td></tr>
	\| AC1 \| \|	<tr><td>VC6</td> <td>CaptureArea=Right</td></tr>
	\| CSV(VC4,VC5,VC6) \| \|	<tr><td>AC1</td> <td/></tr>
	\| CSV(AC1) \| \|	<tr><td>CSV(VC4,VC5,VC6)</td> <td/></tr>
	+=======================+=================================+	<tr><td>CSV(AC1)</td> <td/></tr>
	\| Capture Scene #2 \| Description=Endpoint C \|	</tbody>
	+-----------------------\|---------------------------------+	<tbody>
	\| VC7 \| CaptureArea=Left \|	<tr>
	\| VC8 \| CaptureArea=Center \|	<th>Capture Scene #2</th><th>Description=Endpoint C</th>
	\| VC9 \| CaptureArea=Right \|	</tr>
	\| AC2 \| \|	<tr><td>VC7</td> <td>CaptureArea=Left</td></tr>
	\| CSV(VC7,VC8,VC9) \| \|	<tr><td>VC8</td> <td>CaptureArea=Center</td></tr>
	\| CSV(AC2) \| \|	<tr><td>VC9</td> <td>CaptureArea=Right</td></tr>
	+=======================+=================================+	<tr><td>AC2</td> <td/></tr>
	\| Capture Scene #3 \| Description=Endpoint D \|	<tr><td>CSV(VC7,VC8,VC9)</td> <td/></tr>
	+-----------------------\|---------------------------------+	<tr><td>CSV(AC2)</td> <td/></tr>
	\| VC10 \| CaptureArea=Left \|	</tbody>
	\| VC11 \| CaptureArea=Center \|	<tbody>
	\| VC12 \| CaptureArea=Right \|	<tr>
	\| AC3 \| \|	<th>Capture Scene #3</th><th>Description=Endpoint D</th>
	\| CSV(VC10,VC11,VC12) \| \|	</tr>
	\| CSV(AC3) \| \|
	+=======================+=================================+	<tr><td>VC10</td> <td>CaptureArea=Left</td></tr>
	\| Capture Scene #4 \| Description=Endpoint E \|	<tr><td>VC11</td> <td>CaptureArea=Center</td></tr>
	+-----------------------\|---------------------------------+	<tr><td>VC12</td> <td>CaptureArea=Right</td></tr>
	\| VC13 \| \|	<tr><td>AC3</td> <td/></tr>
	\| AC4 \| \|	<tr><td>CSV(VC10,VC11,VC12)</td> <td/></tr>
	\| CSV(VC13) \| \|	<tr><td>CSV(AC3)</td> <td/></tr>
	\| CSV(AC4) \| \|	</tbody>
	+=======================+=================================+	<tbody>
	\| Capture Scene #5 \| Description=Endpoint F \|	<tr>
	+-----------------------\|---------------------------------+	<th>Capture Scene #4</th><th>Description=Endpoint E</th>
	\| VC14 \| \|	</tr>
	\| AC5 \| \|
	\| CSV(VC14) \| \|	<tr><td>VC13</td> <td/></tr>
	\| CSV(AC5) \| \|	<tr><td>AC4</td> <td/></tr>
	+=======================+=================================+	<tr><td>CSV(VC13)</td> <td/></tr>
	\| Capture Scene #6 \| Description=Endpoint G \|	<tr><td>CSV(AC4)</td> <td/></tr>
	+-----------------------\|---------------------------------+	</tbody>
	\| VC15 \| \|	<tbody>
	\| AC6 \| \|	<tr>
	\| CSV(VC15) \| \|	<th>Capture Scene #5</th><th>Description=Endpoint F</th>
	\| CSV(AC6) \| \|	</tr>
	+=======================+=================================+
	]]></artwork>	<tr><td>VC14</td> <td/></tr>
	</figure>	<tr><td>AC5</td> <td/></tr>
		<tr><td>CSV(VC14)</td> <td/></tr>
		<tr><td>CSV(AC5)</td> <td/></tr>
		</tbody>
		<tbody>
		<tr>
		<th>Capture Scene #6</th><th>Description=Endpoint G</th>
		</tr>

		<tr><td>VC15</td> <td/></tr>
		<tr><td>AC6</td> <td/></tr>
		<tr><td>CSV(VC15)</td> <td/></tr>
		<tr><td>CSV(AC6)</td> <td/></tr>

		</tbody>
		</table>
	<t>	<t>
	The above part of the Advertisement presents information about the	The above part of the Advertisement presents information about the
	sources to the MCC. The information is effectively the same as the	sources to the MCC. The information is effectively the same as the
	received Advertisements, except that there are no Capture Encodings	received Advertisements, except that there are no Capture Encodings
	associated with them and the identities have been renumbered.</t>	associated with them and the identities have been renumbered.</t>
	<t>	<t>
	In addition to the source Capture information, the MCU advertises	In addition to the source Capture information, the MCU advertises
	site switching of Endpoints B to G in three streams.</t>	site switching of Endpoints B to G in three streams.</t>

	<figure anchor="table_22">	<table anchor="table_22">
	<name>Advertisement Sent to Endpoint A - Switching Part</name>	<name>Advertisement Sent to Endpoint A - Switching Part</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<thead>
	+=======================+=================================+	<tr>
	\| Capture Scene #7 \| Description=Output3streammix \|	<th>Capture Scene #7</th><th>Description=Output3streammix</th>
	+-----------------------\|---------------------------------+
	\| MCC1(VC4,VC7,VC10, \| CaptureArea=Left \|	</tr>
	\| VC13) \| MaxCaptures=1 \|	</thead>
	\| \| SynchronizationID=1 \|	<tbody>
	\| \| Policy=SoundLevel:0 \|
	\| \| EncodingGroup=1 \|	<tr>
	\| \| \|	<td>MCC1(VC4,VC7,VC10,&zwsp;VC13)</td> <td>CaptureArea=Left
	\| MCC2(VC5,VC8,VC11, \| CaptureArea=Center \|	<br/>MaxCaptures=1
	\| VC14) \| MaxCaptures=1 \|	<br/>SynchronizationID=1
	\| \| SynchronizationID=1 \|	<br/>Policy=SoundLevel:0
	\| \| Policy=SoundLevel:0 \|	<br/>EncodingGroup=1</td>
	\| \| EncodingGroup=1 \|	</tr>
	\| \| \|
	\| MCC3(VC6,VC9,VC12, \| CaptureArea=Right \|	<tr>
	\| VC15) \| MaxCaptures=1 \|	<td>MCC2(VC5,VC8,VC11,&zwsp;VC14)</td> <td>CaptureArea=Center
	\| \| SynchronizationID=1 \|	<br/>MaxCaptures=1
	\| \| Policy=SoundLevel:0 \|	<br/>SynchronizationID=1
	\| \| EncodingGroup=1 \|	<br/>Policy=SoundLevel:0
	\| \| \|	<br/>EncodingGroup=1</td>
	\| MCC4() (for audio) \| CaptureArea=whole scene \|	</tr>
	\| \| MaxCaptures=1 \|
	\| \| Policy=SoundLevel:0 \|	<tr>
	\| \| EncodingGroup=2 \|	<td>MCC3(VC6,VC9,VC12,&zwsp;VC15)</td> <td>CaptureArea=Right
	\| \| \|	<br/>MaxCaptures=1
	\| MCC5() (for audio) \| CaptureArea=whole scene \|	<br/>SynchronizationID=1
	\| \| MaxCaptures=1 \|	<br/>Policy=SoundLevel:0
	\| \| Policy=SoundLevel:1 \|	<br/>EncodingGroup=1</td>
	\| \| EncodingGroup=2 \|	</tr>
	\| \| \|	<tr>
	\| MCC6() (for audio) \| CaptureArea=whole scene \|	<td>MCC4() (for audio)</td> <td>CaptureArea=whole scene
	\| \| MaxCaptures=1 \|	<br/>MaxCaptures=1
	\| \| Policy=SoundLevel:2 \|	<br/>Policy=SoundLevel:0
	\| \| EncodingGroup=2 \|	<br/>EncodingGroup=2</td>
	\| \| \|	</tr>
	\| MCC7() (for audio) \| CaptureArea=whole scene \|	<tr>
	\| \| MaxCaptures=1 \|	<td>MCC5() (for audio)</td> <td>CaptureArea=whole scene
	\| \| Policy=SoundLevel:3 \|	<br/>MaxCaptures=1
	\| \| EncodingGroup=2 \|	<br/>Policy=SoundLevel:1
	\| \| \|	<br/>EncodingGroup=2</td>
	\| CSV(MCC1,MCC2,MCC3) \| \|	</tr>
	\| CSV(MCC4,MCC5,MCC6, \| \|	<tr>
	\| MCC7) \| \|	<td>MCC6() (for audio)</td> <td>CaptureArea=whole scene
	+=======================+=================================+	<br/>MaxCaptures=1
	]]></artwork>	<br/>Policy=SoundLevel:2
	</figure>	<br/>EncodingGroup=2</td>
		</tr>

		<tr>
		<td>MCC7() (for audio)</td> <td>CaptureArea=whole scene
		<br/>MaxCaptures=1
		<br/>Policy=SoundLevel:3
		<br/>EncodingGroup=2</td>
		</tr>

		<tr>
		<td>CSV(MCC1,MCC2,MCC3)</td> <td/></tr>

		<tr>
		<td>CSV(MCC4,MCC5,MCC6,&zwsp;MCC7)</td> <td/></tr>

		</tbody></table>
	<t>	<t>
	The above part describes the three main switched streams that relate to	The above part describes the three main switched streams that relate to
	site switching. MaxCaptures=1 indicates that only one Capture from	site switching. MaxCaptures=1 indicates that only one Capture from
	the MCC is sent at a particular time. SynchronizationID=1 indicates	the MCC is sent at a particular time. SynchronizationID=1 indicates
	that the source sending is synchronized. The provider can choose to	that the source sending is synchronized. The provider can choose to
	group together VC13, VC14, and VC15 for the purpose of switching	group together VC13, VC14, and VC15 for the purpose of switching
	according to the SynchronizationID. Therefore, when the provider	according to the SynchronizationID. Therefore, when the provider
	switches one of them into an MCC, it can also switch the others	switches one of them into an MCC, it can also switch the others
	even though they are not part of the same Capture Scene.</t>	even though they are not part of the same Capture Scene.</t>
	<t>	<t>
	All the audio for the conference is included in Scene #7.	All the audio for the conference is included in Scene #7.
	There isn't necessarily a one-to-one relation between any audio	There isn't necessarily a one-to-one relation between any audio
	capture and video capture in this scene. Typically, a change in	capture and video capture in this scene. Typically, a change in
	the loudest talker will cause the MCU to switch the audio streams more	the loudest talker will cause the MCU to switch the audio streams more
	quickly than switching video streams.</t>	quickly than switching video streams.</t>
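	<t>
	One way to read the four audio MCCs in Scene #7 is sketched below. It
	assumes that the index in Policy=SoundLevel:n selects the n-th loudest
	source, with 0 being the loudest; that reading, the function name, and
	the level values are assumptions made for this illustration.</t>

```python
def assign_by_sound_level(levels, mcc_ranks):
    """Map each MCC to the audio capture at its SoundLevel rank.

    levels: {capture_id: measured level}, where a higher value is louder.
    mcc_ranks: {mcc_id: n} taken from Policy=SoundLevel:n.
    MCCs whose rank exceeds the number of sources are left unassigned.
    """
    ranked = sorted(levels, key=lambda capture: levels[capture], reverse=True)
    return {
        mcc: ranked[rank]
        for mcc, rank in mcc_ranks.items()
        if rank < len(ranked)
    }


# Hypothetical levels in dBFS for the six audio captures AC1 to AC6.
levels = {"AC1": -30.0, "AC2": -12.5, "AC3": -45.0,
          "AC4": -20.0, "AC5": -60.0, "AC6": -50.0}
ranks = {"MCC4": 0, "MCC5": 1, "MCC6": 2, "MCC7": 3}
audio = assign_by_sound_level(levels, ranks)
```

	<t>
	Rerunning the assignment whenever the measured levels change gives the
	kind of audio switching described above, independent of how quickly the
	video MCCs switch.</t>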
	<t>	<t>
	The MCU can also supply nine media streams showing the active and	The MCU can also supply nine media streams showing the active and
	previous eight speakers. It includes the following in the	previous eight speakers. It includes the following in the
	Advertisement:</t>	Advertisement:</t>

	<figure anchor="table_23">
		<table anchor="table_23">
	<name>Advertisement Sent to Endpoint A - 9 Switched Part</name>	<name>Advertisement Sent to Endpoint A - 9 Switched Part</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<thead>
	+=======================+=================================+	<tr>
	\| Capture Scene #8 \| Description=Output9stream \|	<th>Capture Scene #8</th><th>Description=Output9stream</t
	+-----------------------\|---------------------------------+	h>
	\| MCC8(VC4,VC5,VC6,VC7, \| MaxCaptures=1 \|	</tr>
	\| VC8,VC9,VC10,VC11, \| Policy=SoundLevel:0 \|	</thead>
	\| VC12,VC13,VC14,VC15)\| EncodingGroup=1 \|	<tbody>
	\| \| \|	<tr>
	\| MCC9(VC4,VC5,VC6,VC7, \| MaxCaptures=1 \|	<td align="right">MCC8(VC4,VC5,VC6,VC7,
	\| VC8,VC9,VC10,VC11, \| Policy=SoundLevel:1 \|	<br/>VC8,VC9,VC10,VC11,
	\| VC12,VC13,VC14,VC15)\| EncodingGroup=1 \|	<br/>VC12,VC13,VC14,VC15)</td>
	\| \| \|
	to to \|	<td>MaxCaptures=1
	\| \| \|	<br/>Policy=SoundLevel:0
	\| MCC16(VC4,VC5,VC6,VC7,\| MaxCaptures=1 \|	<br/>EncodingGroup=1</td>
	\| VC8,VC9,VC10,VC11, \| Policy=SoundLevel:8 \|	</tr><tr>
	\| VC12,VC13,VC14,VC15)\| EncodingGroup=1 \|
	\| \| \|	<td align="right">MCC9(VC4,VC5,VC6,VC7,
	\| CSV(MCC8,MCC9,MCC10, \| \|	<br/>VC8,VC9,VC10,VC11,
	\| MCC11,MCC12,MCC13,\| \|	<br/>VC12,VC13,VC14,VC15)
	\| MCC14,MCC15,MCC16)\| \|	</td>
	+=======================+=================================+
	]]></artwork>	<td>MaxCaptures=1
	</figure>	<br/>Policy=SoundLevel:1
		<br/>EncodingGroup=1</td>
		</tr><tr>
		<th align="center">to</th><th align="center">to</th>
		</tr><tr>
		<td align="right">MCC16(VC4,VC5,VC6,VC7,
		<br/>VC8,VC9,VC10,VC11,
		<br/>VC12,VC13,VC14,VC15)</td>

		<td>MaxCaptures=1
		<br/>Policy=SoundLevel:8
		<br/>EncodingGroup=1</td>
		</tr><tr>

		<td align="right">CSV(MCC8,MCC9,MCC10,
		<br/>MCC11,MCC12,MCC13,
		<br/>MCC14,MCC15,MCC16)</td>
		<td/>
		</tr>
		</tbody>
		</table>
	<t>	<t>
	The above part indicates that there are nine capture encodings. Each	The above part indicates that there are nine capture encodings. Each
	of the Capture Encodings may contain any captures from any source	of the Capture Encodings may contain any captures from any source
	site with a maximum of one Capture at a time. Which Capture is	site with a maximum of one Capture at a time. Which Capture is
	present is determined by the policy. The MCCs in this scene do not	present is determined by the policy. The MCCs in this scene do not
	have any spatial attributes.</t>	have any spatial attributes.</t>
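	<t>
	As a rough illustration of the policy-driven selection described above,
	the following Python sketch keeps an ordered history of the most recent
	active speakers and maps it onto MCC8 through MCC16. It is not part of
	the CLUE specification: the function names (update_speaker_history,
	assign_mccs), the site labels, and the reading of SoundLevel:N as "the
	Nth most recent active speaker" are assumptions made for illustration.</t>

```python
def update_speaker_history(history, new_speaker, depth=9):
    """Return a new history with new_speaker first.

    Assumption: index N of the history corresponds to Policy=SoundLevel:N,
    so index 0 is the current active speaker and higher indexes are
    progressively older speakers. At most `depth` entries are kept.
    """
    updated = [new_speaker] + [s for s in history if s != new_speaker]
    return updated[:depth]


def assign_mccs(history, first_mcc=8):
    """Map hypothetical MCC names (MCC8, MCC9, ...) to the source each carries."""
    return {f"MCC{first_mcc + n}": source for n, source in enumerate(history)}


# Hypothetical sequence of active-speaker changes reported by the MCU.
history = []
for speaker in ["siteB", "siteC", "siteB", "siteD"]:
    history = update_speaker_history(history, speaker)

print(assign_mccs(history))
```

	<t>
	Each MCC carries at most one source at a time, which matches
	MaxCaptures=1 in the table; which source that is follows only from the
	position in the history.</t>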
	<t>	<t>
	Note: The Provider alternatively could provide each of the MCCs	Note: The Provider alternatively could provide each of the MCCs
	above in its own Capture Scene.</t>	above in its own Capture Scene.</t>
	<t>	<t>
	If the MCU wanted to provide a composed Capture Encoding containing	If the MCU wanted to provide a composed Capture Encoding containing
	all of the nine captures, it could advertise in addition:</t>	all of the nine captures, it could advertise in addition:</t>

	<figure anchor="ref-advertisement-sent-to-endpoint-a-9-composed-part">	<table anchor="ref-advertisement-sent-to-endpoint-a-9-composed-part">
	<name>Advertisement Sent to Endpoint A - 9 Composed Part</name>	<name>Advertisement Sent to Endpoint A - 9 Composed Part</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<thead>
	+=======================+=================================+	<tr>
	\| Capture Scene #9 \| Description=NineTiles \|	<th>Capture Scene #9</th><th>Description=NineTiles</th>
	+-----------------------\|---------------------------------+	</tr>
	\| MCC13(MCC8,MCC9,MCC10,\| MaxCaptures=9 \|	</thead>
	\| MCC11,MCC12,MCC13,\| EncodingGroup=1 \|	<tbody>
	\| MCC14,MCC15,MCC16)\| \|	<tr>
	\| \| \|	<td align="right">MCC13(MCC8,MCC9,MCC10,<br/>
	\| CSV(MCC13) \| \|	MCC11,MCC12,MCC13,<br/>
	+=======================+=================================+	MCC14,MCC15,MCC16)</td>
	]]></artwork>
	</figure>	<td>MaxCaptures=9<br/>
		EncodingGroup=1</td>
		</tr>
		<tr>
		<td>CSV(MCC13)</td><td/>
		</tr>
		</tbody>
		</table>

	<t>	<t>
	As MaxCaptures is 9, it indicates that the capture encoding contains	As MaxCaptures is 9, it indicates that the capture encoding contains
	information from nine sources at a time.</t>	information from nine sources at a time.</t>
	<t>	<t>
	The Advertisement to Endpoint B is identical to the above, other	The Advertisement to Endpoint B is identical to the above, other
	than the fact that captures from Endpoint A would be added and the captures	than the fact that captures from Endpoint A would be added and the captures
	from Endpoint B would be removed. Whether the Captures are rendered	from Endpoint B would be removed. Whether the Captures are rendered
	on a four-screen display or a three-screen display is up to the	on a four-screen display or a three-screen display is up to the
	Consumer to determine. The Consumer wants to place video captures	Consumer to determine. The Consumer wants to place video captures
	from the same original source endpoint together, in the correct	from the same original source endpoint together, in the correct
	spatial order, but the MCCs do not have spatial attributes. So, the	spatial order, but the MCCs do not have spatial attributes. So, the
	Consumer needs to associate incoming media packets with the	Consumer needs to associate incoming media packets with the
	original individual captures in the advertisement (such as VC4,	original individual captures in the advertisement (such as VC4,
	VC5, and VC6) in order to know the spatial information it needs for	VC5, and VC6) in order to know the spatial information it needs for
	correct placement on the screens. The Provider can use the RTP Control Protocol (RTCP)	correct placement on the screens. The Provider can use the RTP Control Protocol (RTCP)
	CaptureId source description (SDES) item and associated RTP header extension, as	CaptureId source description (SDES) item and associated RTP header extension, as
	described in <xref target="RFCYYY4" format="default"/>, to convey this	described in <xref target="RFCYYY4" format="default"/>, to convey this
	information to the Consumer.</t>	information to the Consumer.</t>
	</section>	</section>

	<section anchor="section-12.3.4" numbered="true" toc="default">	<section anchor="s-12.3.4" numbered="true" toc="default">
	<name>Heterogeneous Conference with Voice-Activated Switching</name>	<name>Heterogeneous Conference with Voice-Activated Switching</name>
	<t>	<t>
	This example illustrates how multipoint "voice-activated switching"	This example illustrates how multipoint "voice-activated switching"
	behavior can be realized, with an endpoint making its own decision	behavior can be realized, with an endpoint making its own decision
	about which of its outgoing video streams is considered the "active talker" from that endpoint. Then, an MCU can decide which is the	about which of its outgoing video streams is considered the "active talker" from that endpoint. Then, an MCU can decide which is the
	active talker among the whole conference.</t>	active talker among the whole conference.</t>
	<t>	<t>
	Consider a conference between endpoints with the following	Consider a conference between endpoints with the following
	characteristics:</t>	characteristics:</t>

	<dl newline="false" spacing="normal" indent="3">	<dl newline="false" spacing="normal">
	<dt/>	<dt>Endpoint A -</dt>
	<dd>	<dd>3 screens, 3 cameras</dd>
	Endpoint A - 3 screens, 3 cameras</dd>
	</dl>	<dt>Endpoint B -</dt>
	<dl newline="false" spacing="normal" indent="3">	<dd>3 screens, 3 cameras</dd>
	<dt/>
	<dd>	<dt>Endpoint C -</dt>
	Endpoint B - 3 screens, 3 cameras</dd>	<dd>1 screen, 1 camera</dd>
	</dl>
	<dl newline="false" spacing="normal" indent="3">
	<dt/>
	<dd>
	Endpoint C - 1 screen, 1 camera</dd>
	</dl>	</dl>
	<t>	<t>
	This example focuses on what the user at Endpoint C sees. The	This example focuses on what the user at Endpoint C sees. The
	user would like to see the video capture of the current talker,	user would like to see the video capture of the current talker,
	without composing it with any other video capture. In this	without composing it with any other video capture. In this
	example, Endpoint C is capable of receiving only a single video	example, Endpoint C is capable of receiving only a single video
	stream. The following tables describe advertisements from Endpoints A and B	stream. The following tables describe advertisements from Endpoints A and B
	to the MCU, and from the MCU to Endpoint C, that can be used to accomplish	to the MCU, and from the MCU to Endpoint C, that can be used to accomplish
	this.</t>	this.</t>

	<figure anchor="ref-advertisement-received-at-the-mcu-from-endpoints-a-and-b">	<table anchor="ref-advertisement-received-at-the-mcu-from-endpoints-a-and-b">
	<name>Advertisement Received at the MCU from Endpoints A and B</name>	<name>Advertisement Received at the MCU from Endpoints A and B</name>

	<artwork name="" type="" align="left" alt=""><![CDATA[	<thead>
	+-----------------------+---------------------------------+	<tr>
	\| Capture Scene #1 \| Description=Endpoint x \|	<th>Capture Scene #1</th><th>Description=Endpoint x</th>
	+-----------------------\|---------------------------------+	</tr>
	\| VC1 \| CaptureArea=Left \|	</thead>
	\| \| EncodingGroup=1 \|	<tbody>
	\| VC2 \| CaptureArea=Center \|	<tr>
	\| \| EncodingGroup=1 \|	<td>VC1</td> <td>CaptureArea=Left
	\| VC3 \| CaptureArea=Right \|	<br/>EncodingGroup=1</td>
	\| \| EncodingGroup=1 \|	</tr>
	\| MCC1(VC1,VC2,VC3) \| MaxCaptures=1 \|	<tr>
	\| \| CaptureArea=whole scene \|	<td>VC2</td> <td>CaptureArea=Center
	\| \| Policy=SoundLevel:0 \|	<br/>EncodingGroup=1</td>
	\| \| EncodingGroup=1 \|	</tr>
	\| AC1 \| CaptureArea=whole scene \|	<tr>
	\| \| EncodingGroup=2 \|	<td>VC3</td> <td>CaptureArea=Right
	\| CSV1(VC1, VC2, VC3) \| \|	<br/>EncodingGroup=1</td>
	\| CSV2(MCC1) \| \|	</tr>
	\| CSV3(AC1) \| \|	<tr>
	+---------------------------------------------------------+	<td>MCC1(VC1,VC2,VC3)</td> <td>MaxCaptures=1
	]]></artwork>	<br/>CaptureArea=whole scene
	</figure>	<br/>Policy=SoundLevel:0
		<br/>EncodingGroup=1</td>
		</tr>
		<tr>
		<td>AC1</td> <td>CaptureArea=whole scene
		<br/>EncodingGroup=2</td>
		</tr>

		<tr>
		<td>CSV1(VC1, VC2, VC3)</td><td/>
		</tr>
		<tr>
		<td>CSV2(MCC1)</td><td/>
		</tr>
		<tr>
		<td>CSV3(AC1)</td><td/>
		</tr></tbody>
		</table>
	<t>	<t>
	Endpoints A and B are advertising each individual video capture,	Endpoints A and B are advertising each individual video capture,
	and also a switched capture MCC1 that switches between the other	and also a switched capture MCC1 that switches between the other
	three based on who is the active talker. These endpoints do not	three based on who is the active talker. These endpoints do not
	advertise distinct audio captures associated with each individual	advertise distinct audio captures associated with each individual
	video capture, so it would be impossible for the MCU (as a media	video capture, so it would be impossible for the MCU (as a media
	consumer) to make its own determination of which video capture is	consumer) to make its own determination of which video capture is
	the active talker based just on information in the audio streams.</t>	the active talker based just on information in the audio streams.</t>

	<figure anchor="ref-advertisement-sent-from-the-mcu-to-c">	<table anchor="ref-advertisement-sent-from-the-mcu-to-c">
	<name>Advertisement Sent from the MCU to Endpoint C</name>	<name>Advertisement Sent from the MCU to Endpoint C</na
	<artwork name="" type="" align="left" alt=""><![CDATA[	me>
	+-----------------------+---------------------------------+
	\| Capture Scene #1 \| Description=conference \|	<thead>
	+-----------------------\|---------------------------------+	<tr><th>Capture Scene #1</th><th>Description=conference</th
	\| MCC1() \| CaptureArea=Left \|	>
	\| \| MaxCaptures=1 \|	</tr>
	\| \| SynchronizationID=1 \|	</thead>
	\| \| Policy=SoundLevel:0 \|	<tbody>
	\| \| EncodingGroup=1 \|	<tr>
	\| \| \|	<td>MCC1()</td>
	\| MCC2() \| CaptureArea=Center \|	<td>CaptureArea=Left
	\| \| MaxCaptures=1 \|	<br/>MaxCaptures=1
	\| \| SynchronizationID=1 \|	<br/>SynchronizationID=1
	\| \| Policy=SoundLevel:0 \|	<br/>Policy=SoundLevel:0
	\| \| EncodingGroup=1 \|	<br/>EncodingGroup=1
	\| \| \|	</td>
	\| MCC3() \| CaptureArea=Right \|	</tr>
	\| \| MaxCaptures=1 \|	<tr>
	\| \| SynchronizationID=1 \|	<td>MCC2()</td><td>CaptureArea=Center
	\| \| Policy=SoundLevel:0 \|	<br/>MaxCaptures=1
	\| \| EncodingGroup=1 \|	<br/>SynchronizationID=1
	\| \| \|	<br/>Policy=SoundLevel:0
	\| MCC4() \| CaptureArea=whole scene \|	<br/>EncodingGroup=1
	\| \| MaxCaptures=1 \|	</td>
	\| \| Policy=SoundLevel:0 \|	</tr>
	\| \| EncodingGroup=1 \|	<tr>
	\| \| \|	<td>MCC3()</td><td>CaptureArea=Right
	\| MCC5() (for audio) \| CaptureArea=whole scene \|	<br/>MaxCaptures=1
	\| \| MaxCaptures=1 \|	<br/>SynchronizationID=1
	\| \| Policy=SoundLevel:0 \|	<br/>Policy=SoundLevel:0
	\| \| EncodingGroup=2 \|	<br/>EncodingGroup=1
	\| \| \|	</td>
	\| MCC6() (for audio) \| CaptureArea=whole scene \|	</tr>
	\| \| MaxCaptures=1 \|	<tr>
	\| \| Policy=SoundLevel:1 \|	<td>MCC4()</td><td>CaptureArea=whole scene
	\| \| EncodingGroup=2 \|	<br/>MaxCaptures=1
	\| CSV1(MCC1,MCC2,MCC3 \| \|	<br/>Policy=SoundLevel:0
	\| CSV2(MCC4) \| \|	<br/>EncodingGroup=1
	\| CSV3(MCC5,MCC6) \| \|	</td>
	+---------------------------------------------------------+	</tr>
	]]></artwork>	<tr>
	</figure>	<td>MCC5() (for audio)</td><td>CaptureArea=whole scene
		<br/>MaxCaptures=1
		<br/>Policy=SoundLevel:0
		<br/>EncodingGroup=2
		</td>
		</tr>
		<tr>
		<td>MCC6() (for audio)</td><td>CaptureArea=whole scene
		<br/>MaxCaptures=1
		<br/>Policy=SoundLevel:1
		<br/>EncodingGroup=2
		</td>
		</tr>
		<tr><td>CSV1(MCC1,MCC2,MCC3)</td><td/></tr>
		<tr><td>CSV2(MCC4)</td><td/></tr>
		<tr><td>CSV3(MCC5,MCC6)</td><td/></tr>
		</tbody>
		</table>

		<!-- [rfced] FYI, in Table 26, we add a closing parenthesis here.
		Please let us know if this is not correct.

		Original: CSV1(MCC1,MCC2,MCC3
		Current: CSV1(MCC1,MCC2,MCC3)
		-->
	<t>	<t>
	The MCU advertises one scene, with four video MCCs. Three of them	The MCU advertises one scene, with four video MCCs. Three of them
	in CSV1 give a left, center, and right view of the conference, with	in CSV1 give a left, center, and right view of the conference, with
	site switching. MCC4 provides a single video capture	site switching. MCC4 provides a single video capture
	representing a view of the whole conference. The MCU intends for	representing a view of the whole conference. The MCU intends for
	MCC4 to be switched between all the other original source	MCC4 to be switched between all the other original source
	captures. In this example, the MCU's advertisement does not give all	captures. In this example, the MCU's advertisement does not give all
	the information about all the other endpoints' scenes and which of	the information about all the other endpoints' scenes and which of
	those captures are included in the MCCs. The MCU could include all	those captures are included in the MCCs. The MCU could include all
	that if it wants to give the consumers more	that if it wants to give the consumers more

	skipping to change at line 3541 ¶	skipping to change at line 3673 ¶
	the MCU to get the information it needs to construct MCC4, it has	the MCU to get the information it needs to construct MCC4, it has
	to send configure messages to Endpoints A and B asking to receive MCC1 from	to send configure messages to Endpoints A and B asking to receive MCC1 from
	each of them, along with their AC1 audio. Now the MCU can use	each of them, along with their AC1 audio. Now the MCU can use
	audio energy information from the two incoming audio streams from	audio energy information from the two incoming audio streams from
	Endpoints A and B to determine which of those alternatives is the current	Endpoints A and B to determine which of those alternatives is the current
	talker. Based on that, the MCU uses either MCC1 from A or MCC1	talker. Based on that, the MCU uses either MCC1 from A or MCC1
	from B as the source of MCC4 to send to Endpoint C.</t>	from B as the source of MCC4 to send to Endpoint C.</t>
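	<t>
	The selection step described above can be sketched in Python as follows.
	This is only an illustration under stated assumptions: the function
	names, the dictionary of per-endpoint audio energy values, and the use
	of a simple "highest energy wins" rule are hypothetical and are not
	defined by CLUE; a real MCU would likely smooth the measurements over
	time to avoid rapid switching.</t>

```python
def pick_mcc4_source(audio_energy):
    """Return the endpoint whose AC1 audio currently has the highest energy.

    `audio_energy` maps an endpoint label to a measured level (for example
    in dBFS). Both the mapping and the units are assumptions of this sketch.
    Returns None when no measurements are available.
    """
    if not audio_energy:
        return None
    return max(audio_energy, key=audio_energy.get)


def mcc4_source_label(audio_energy):
    """Describe which incoming MCC1 the MCU would forward as MCC4."""
    endpoint = pick_mcc4_source(audio_energy)
    if endpoint is None:
        return None
    return f"MCC1 from Endpoint {endpoint}"


# Hypothetical measurements taken from the AC1 streams of Endpoints A and B.
print(mcc4_source_label({"A": -32.0, "B": -18.5}))
```

	<t>
	The endpoints have already chosen their own active talker inside MCC1,
	so the MCU only has to compare one audio stream per endpoint.</t>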
	</section>	</section>
	</section>	</section>
	</section>	</section>

	<section anchor="section-14" numbered="true" toc="default">	<section anchor="s-14" numbered="true" toc="default">
	<name>IANA Considerations</name>	<name>IANA Considerations</name>
	<t>	<t>
	This document does not require any IANA actions.	This document does not require any IANA actions.
	</t>	</t>
	</section>	</section>

	<section anchor="section-15" numbered="true" toc="default">	<section anchor="s-15" numbered="true" toc="default">
	<name>Security Considerations</name>	<name>Security Considerations</name>
	<t>	<t>
	There are several potential attacks related to telepresence,	There are several potential attacks related to telepresence,
	specifically the protocols used by CLUE. This is the case due to	specifically the protocols used by CLUE. This is the case due to
	conferencing sessions, the natural involvement of multiple	conferencing sessions, the natural involvement of multiple
	endpoints, and the many, often user-invoked, capabilities provided	endpoints, and the many, often user-invoked, capabilities provided
	by the systems.</t>	by the systems.</t>
	<t>	<t>
	An MCU involved in a CLUE session can experience many of the same	An MCU involved in a CLUE session can experience many of the same
	attacks as a conferencing system such as the one enabled by	attacks as a conferencing system such as the one enabled by

	skipping to change at line 3583 ¶	skipping to change at line 3715 ¶
	implementing the protocols necessary to support CLUE, follow the	implementing the protocols necessary to support CLUE, follow the
	security recommendations specified in the conference control	security recommendations specified in the conference control
	protocol documents.	protocol documents.

	-->	-->
	In the case of CLUE, SIP is the conferencing	In the case of CLUE, SIP is the conferencing
	protocol, thus the security considerations in <xref target="RFC4579" format="default"/> MUST be	protocol, thus the security considerations in <xref target="RFC4579" format="default"/> MUST be
	followed. Other security issues related to MCUs are discussed in	followed. Other security issues related to MCUs are discussed in
	the XCON framework <xref target="RFC5239" format="default"/>. The use of xCard with potentially	the XCON framework <xref target="RFC5239" format="default"/>. The use of xCard with potentially
	sensitive information provides another reason to implement	sensitive information provides another reason to implement

	recommendations of Section 11 in <xref target="RFC5239" format="default"/>.</t>	recommendations in <xref section="11" sectionFormat="of" target="RFC5239" format="default"/>.</t>
	<t>	<t>
	One primary security concern, surrounding the CLUE framework	One primary security concern, surrounding the CLUE framework
	introduced in this document, involves securing the actual	introduced in this document, involves securing the actual
	protocols and the associated authorization mechanisms. These	protocols and the associated authorization mechanisms. These
	concerns apply to endpoint-to-endpoint sessions as well as	concerns apply to endpoint-to-endpoint sessions as well as
	sessions involving multiple endpoints and MCUs. <xref target="ref-basic-information-flow" format="default"/> in	sessions involving multiple endpoints and MCUs. <xref target="ref-basic-information-flow" format="default"/> in

	<xref target="section-5" format="default"/> provides a basic flow of information exchange for CLUE	<xref target="s-5" format="default"/> provides a basic flow of information exchange for CLUE
	and the protocols involved.</t>	and the protocols involved.</t>
	<t>	<t>

	As described in <xref target="section-5" format="default"/>, CLUE uses SIP/SDP to	As described in <xref target="s-5" format="default"/>, CLUE uses SIP/SDP to
	establish the session prior to exchanging any CLUE-specific	establish the session prior to exchanging any CLUE-specific
	information. Thus, the security mechanisms recommended for SIP	information. Thus, the security mechanisms recommended for SIP
	<xref target="RFC3261" format="default"/>, including user authentication and	<xref target="RFC3261" format="default"/>, including user authentication and
	authorization, MUST be supported. In addition, the media MUST be	authorization, MUST be supported. In addition, the media MUST be
	secured. Datagram Transport Layer Security / Secure Real-time	secured. Datagram Transport Layer Security / Secure Real-time
	Transport Protocol MUST be supported and SHOULD be used unless the	Transport Protocol MUST be supported and SHOULD be used unless the
	media, which is based on RTP, is secured by other means (see <xref target="RFC7201" format="default"/> <xref target="RFC7202" format="default"/>). Media security is	media, which is based on RTP, is secured by other means (see <xref target="RFC7201" format="default"/> <xref target="RFC7202" format="default"/>). Media security is
	also discussed in <xref target="RFCYYY3" format="default"/> and <xref target="RFCYYY4" format="default"/>. Note that SIP call setup is done before any	also discussed in <xref target="RFCYYY3" format="default"/> and <xref target="RFCYYY4" format="default"/>. Note that SIP call setup is done before any
	CLUE-specific information is available, so the authentication and	CLUE-specific information is available, so the authentication and
	authorization are based on the SIP mechanisms. The entity that will	authorization are based on the SIP mechanisms. The entity that will
	be authenticated may use the Endpoint identity or the endpoint user	be authenticated may use the Endpoint identity or the endpoint user
	identity; this is an application issue and not a CLUE-specific	identity; this is an application issue and not a CLUE-specific
	issue.</t>	issue.</t>
	<t>	<t>
	A separate data channel is established to transport the CLUE	A separate data channel is established to transport the CLUE
	protocol messages. The contents of the CLUE protocol messages are	protocol messages. The contents of the CLUE protocol messages are
	based on information introduced in this document. The CLUE data	based on information introduced in this document. The CLUE data
	model <xref target="RFCYYY1" format="default"/> defines, through an XML	model <xref target="RFCYYY1" format="default"/> defines, through an XML
	schema, the syntax to be used. One type of information that could	schema, the syntax to be used. One type of information that could
	possibly introduce privacy concerns is the xCard information, as	possibly introduce privacy concerns is the xCard information, as

	described in <xref target="section-7.1.1.10" format="default"/>. The decision about which xCard	described in <xref target="s-7.1.1.10" format="default"/>. The decision about which xCard
	information to send in the CLUE channel is an application policy	information to send in the CLUE channel is an application policy
	for point-to-point and multipoint calls based on the authenticated	for point-to-point and multipoint calls based on the authenticated
	identity that can be the endpoint identity or the user of the	identity that can be the endpoint identity or the user of the
	endpoint. For example, the telepresence multipoint application can	endpoint. For example, the telepresence multipoint application can
	authenticate a user before starting a CLUE exchange with the	authenticate a user before starting a CLUE exchange with the
	telepresence system and have a policy per user.</t>	telepresence system and have a policy per user.</t>
	<t>	<t>
	In addition, the (text) description field in the Media Capture	In addition, the (text) description field in the Media Capture

	attribute (<xref target="section-7.1.1.6" format="default"/>) could possibly reveal sensitive	attribute (<xref target="s-7.1.1.6" format="default"/>) could possibly reveal sensitive
	information or specific identities. The same would be true for the	information or specific identities. The same would be true for the

	descriptions in the Capture Scene (<xref target="section-7.3.1" format="default"/>) and Capture	descriptions in the Capture Scene (<xref target="s-7.3.1" format="default"/>) and Capture
	Scene View (<xref target="section-7.3.2" format="default"/>) attributes. An implementation SHOULD give users	Scene View (<xref target="s-7.3.2" format="default"/>) attributes. An implementation SHOULD give users
	control over what sensitive information is sent in an	control over what sensitive information is sent in an
	Advertisement. One other important consideration for the	Advertisement. One other important consideration for the
	information in the xCard as well as the description field in the	information in the xCard as well as the description field in the
	Media Capture and Capture Scene View attributes is that while the	Media Capture and Capture Scene View attributes is that while the
	endpoints involved in the session have been authenticated, there	endpoints involved in the session have been authenticated, there
	is no assurance that the information in the xCard or description	is no assurance that the information in the xCard or description
	fields is authentic. Thus, this information MUST NOT be used to	fields is authentic. Thus, this information MUST NOT be used to
	make any authorization decisions.</t>	make any authorization decisions.</t>
	<t>	<t>
	While other information in the CLUE protocol messages does not	While other information in the CLUE protocol messages does not

	skipping to change at line 3661 ¶	skipping to change at line 3793 ¶
	However, the policies and security associated with these actions	However, the policies and security associated with these actions
	are outside the scope of this document and the overall CLUE	are outside the scope of this document and the overall CLUE
	solution.</t>	solution.</t>
	</section>	</section>
	</middle>	</middle>
	<back>	<back>
	<references>	<references>
	<name>References</name>	<name>References</name>
	<references>	<references>
	<name>Normative References</name>	<name>Normative References</name>

	<!--[rfced] PQ: Please review companion document references as many
	of these were not in queue when this document was edited (i.e., check
	for title changes etc.).


	<!-- &I-D.ietf-clue-datachannel; Will be a companion doc - but as of 11/16/17 Waiting for AD Go-Ahead;-->	<!-- &I-D.ietf-clue-datachannel; Will be a companion doc - but as of 11/16/17 Waiting for AD Go-Ahead;-->

	<reference anchor="RFCYYYY" target="http://www.rfc-editor.org/info/rfcYYYY">	<reference anchor="RFCYYYY" target="http://www.rfc-editor.org/info/rfcYYYY">
	<front>	<front>
	<title>CLUE Protocol Data Channel</title>	<title>CLUE Protocol Data Channel</title>
	<seriesInfo name="DOI" value="10.17487/RFCYYYY"/>	<seriesInfo name="DOI" value="10.17487/RFCYYYY"/>
	<seriesInfo name="RFC" value="YYYY"/>	<seriesInfo name="RFC" value="YYYY"/>
	<author initials="C" surname="Holmberg" fullname="Christer Holmberg">	<author initials="C" surname="Holmberg" fullname="Christer Holmberg">
	<organization/>	<organization/>

	skipping to change at line 3744 ¶	skipping to change at line 3873 ¶
	</author>	</author>
	<author initials="R" surname="Hansen" fullname="Robert Hansen">	<author initials="R" surname="Hansen" fullname="Robert Hansen">
	<organization/>	<organization/>
	</author>	</author>
	<date month="August" day="20" year="2017"/>	<date month="August" day="20" year="2017"/>
	<abstract>	<abstract>
	<t>This document specifies how CLUE-specific signaling such as the CLUE protocol and the CLUE data channel are used in conjunction with each other and with existing signaling mechanisms such as SIP and SDP to produce a telepresence call.</t>	<t>This document specifies how CLUE-specific signaling such as the CLUE protocol and the CLUE data channel are used in conjunction with each other and with existing signaling mechanisms such as SIP and SDP to produce a telepresence call.</t>
	</abstract>	</abstract>
	</front>	</front>
	</reference>	</reference>

	<reference anchor="RFC2119" target="https://www.rfc-editor.org/i
	nfo/rfc2119" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/	<xi:include
	reference.RFC.2119.xml">	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<front>	nce.RFC.2119.xml"/>
	<title>Key words for use in RFCs to Indicate Requirement Lev	<xi:include
	els</title>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<seriesInfo name="DOI" value="10.17487/RFC2119"/>	nce.RFC.3261.xml"/>
	<seriesInfo name="RFC" value="2119"/>	<xi:include
	<seriesInfo name="BCP" value="14"/>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<author initials="S." surname="Bradner" fullname="S. Bradner	nce.RFC.3264.xml"/>
	">	<xi:include
	<organization/>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	</author>	nce.RFC.3550.xml"/>
	<date year="1997" month="March"/>	<xi:include
	<abstract>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<t>In many standards track documents several words are use	nce.RFC.4566.xml"/>
	d to signify the requirements in the specification. These words are oft	<xi:include
	en capitalized. This document defines these words as they should be inte	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	rpreted in IETF documents. This document specifies an Internet Best Cur	nce.RFC.4579.xml"/>
	rent Practices for the Internet Community, and requests discussion and s	<xi:include
	uggestions for improvements.</t>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	</abstract>	nce.RFC.5239.xml"/>
	</front>	<xi:include
	</reference>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<reference anchor="RFC3261" target="https://www.rfc-editor.org/i	nce.RFC.5646.xml"/>
	nfo/rfc3261" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/	<xi:include
	reference.RFC.3261.xml">	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<front>	nce.RFC.6350.xml"/>
	<title>SIP: Session Initiation Protocol</title>	<xi:include
	<seriesInfo name="DOI" value="10.17487/RFC3261"/>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<seriesInfo name="RFC" value="3261"/>	nce.RFC.6351.xml"/>
	<author initials="J." surname="Rosenberg" fullname="J. Rosen	<xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibx
	berg">	ml/reference.RFC.8174.xml"/>
	<organization/>
	</author>
	<author initials="H." surname="Schulzrinne" fullname="H. Sch
	ulzrinne">
	<organization/>
	</author>
	<author initials="G." surname="Camarillo" fullname="G. Camar
	illo">
	<organization/>
	</author>
	<author initials="A." surname="Johnston" fullname="A. Johnst
	on">
	<organization/>
	</author>
	<author initials="J." surname="Peterson" fullname="J. Peters
	on">
	<organization/>
	</author>
	<author initials="R." surname="Sparks" fullname="R. Sparks">
	<organization/>
	</author>
	<author initials="M." surname="Handley" fullname="M. Handley
	">
	<organization/>
	</author>
	<author initials="E." surname="Schooler" fullname="E. School
	er">
	<organization/>
	</author>
	<date year="2002" month="June"/>
	<abstract>
	<t>This document describes Session Initiation Protocol (SI
	P), an application-layer control (signaling) protocol for creating, modi
	fying, and terminating sessions with one or more participants. These se
	ssions include Internet telephone calls, multimedia distribution, and mu
	ltimedia conferences. [STANDARDS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC3264" target="https://www.rfc-editor.org/i
	nfo/rfc3264" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.3264.xml">
	<front>
	<title>An Offer/Answer Model with Session Description Protoc
	ol (SDP)</title>
	<seriesInfo name="DOI" value="10.17487/RFC3264"/>
	<seriesInfo name="RFC" value="3264"/>
	<author initials="J." surname="Rosenberg" fullname="J. Rosen
	berg">
	<organization/>
	</author>
	<author initials="H." surname="Schulzrinne" fullname="H. Sch
	ulzrinne">
	<organization/>
	</author>
	<date year="2002" month="June"/>
	<abstract>
	<t>This document defines a mechanism by which two entities
	can make use of the Session Description Protocol (SDP) to arrive at a c
	ommon view of a multimedia session between them. In the model, one part
	icipant offers the other a description of the desired session from their
	perspective, and the other participant answers with the desired session
	from their perspective. This offer/answer model is most useful in unic
	ast sessions where information from both participants is needed for the
	complete view of the session. The offer/answer model is used by protoco
	ls like the Session Initiation Protocol (SIP). [STANDARDS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC3550" target="https://www.rfc-editor.org/i
	nfo/rfc3550" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.3550.xml">
	<front>
	<title>RTP: A Transport Protocol for Real-Time Applications<
	/title>
	<seriesInfo name="DOI" value="10.17487/RFC3550"/>
	<seriesInfo name="RFC" value="3550"/>
	<seriesInfo name="STD" value="64"/>
	<author initials="H." surname="Schulzrinne" fullname="H. Sch
	ulzrinne">
	<organization/>
	</author>
	<author initials="S." surname="Casner" fullname="S. Casner">
	<organization/>
	</author>
	<author initials="R." surname="Frederick" fullname="R. Frede
	rick">
	<organization/>
	</author>
	<author initials="V." surname="Jacobson" fullname="V. Jacobs
	on">
	<organization/>
	</author>
	<date year="2003" month="July"/>
	<abstract>
	<t>This memorandum describes RTP, the real-time transport
	protocol. RTP provides end-to-end network transport functions suitable
	for applications transmitting real-time data, such as audio, video or si
	mulation data, over multicast or unicast network services. RTP does not
	address resource reservation and does not guarantee quality-of-service
	for real-time services. The data transport is augmented by a control p
	rotocol (RTCP) to allow monitoring of the data delivery in a manner scal
	able to large multicast networks, and to provide minimal control and ide
	ntification functionality. RTP and RTCP are designed to be independent
	of the underlying transport and network layers. The protocol supports t
	he use of RTP-level translators and mixers. Most of the text in this mem
	orandum is identical to RFC 1889 which it obsoletes. There are no chang
	es in the packet formats on the wire, only changes to the rules and algo
	rithms governing how the protocol is used. The biggest change is an enha
	ncement to the scalable timer algorithm for calculating when to send RTC
	P packets in order to minimize transmission in excess of the intended ra
	te when many participants join a session simultaneously. [STANDARDS-TRA
	CK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC4566" target="https://www.rfc-editor.org/i
	nfo/rfc4566" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.4566.xml">
	<front>
	<title>SDP: Session Description Protocol</title>
	<seriesInfo name="DOI" value="10.17487/RFC4566"/>
	<seriesInfo name="RFC" value="4566"/>
	<author initials="M." surname="Handley" fullname="M. Handley
	">
	<organization/>
	</author>
	<author initials="V." surname="Jacobson" fullname="V. Jacobs
	on">
	<organization/>
	</author>
	<author initials="C." surname="Perkins" fullname="C. Perkins
	">
	<organization/>
	</author>
	<date year="2006" month="July"/>
	<abstract>
	<t>This memo defines the Session Description Protocol (SDP
	). SDP is intended for describing multimedia sessions for the purposes
	of session announcement, session invitation, and other forms of multimed
	ia session initiation. [STANDARDS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC4579" target="https://www.rfc-editor.org/i
	nfo/rfc4579" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.4579.xml">
	<front>
	<title>Session Initiation Protocol (SIP) Call Control - Conf
	erencing for User Agents</title>
	<seriesInfo name="DOI" value="10.17487/RFC4579"/>
	<seriesInfo name="RFC" value="4579"/>
	<seriesInfo name="BCP" value="119"/>
	<author initials="A." surname="Johnston" fullname="A. Johnst
	on">
	<organization/>
	</author>
	<author initials="O." surname="Levin" fullname="O. Levin">
	<organization/>
	</author>
	<date year="2006" month="August"/>
	<abstract>
	<t>This specification defines conferencing call control fe
	atures for the Session Initiation Protocol (SIP). This document builds
	on the Conferencing Requirements and Framework documents to define how a
	tightly coupled SIP conference works. The approach is explored from th
	e perspective of different user agent (UA) types: conference-unaware, co
	nference-aware, and focus UAs. The use of Uniform Resource Identifiers
	(URIs) in conferencing, OPTIONS for capabilities discovery, and call con
	trol using REFER are covered in detail with example call flow diagrams.
	The usage of the isfocus feature tag is defined. This document specifi
	es an Internet Best Current Practices for the Internet Community, and re
	quests discussion and suggestions for improvements.</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC5239" target="https://www.rfc-editor.org/i
	nfo/rfc5239" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.5239.xml">
	<front>
	<title>A Framework for Centralized Conferencing</title>
	<seriesInfo name="DOI" value="10.17487/RFC5239"/>
	<seriesInfo name="RFC" value="5239"/>
	<author initials="M." surname="Barnes" fullname="M. Barnes">
	<organization/>
	</author>
	<author initials="C." surname="Boulton" fullname="C. Boulton
	">
	<organization/>
	</author>
	<author initials="O." surname="Levin" fullname="O. Levin">
	<organization/>
	</author>
	<date year="2008" month="June"/>
	<abstract>
	<t>This document defines the framework for Centralized Con
	ferencing. The framework allows participants using various call signalin
	g protocols, such as SIP, H.323, Jabber, Q.931 or ISDN User Part (ISUP),
	to exchange media in a centralized unicast conference. The Centralized
	Conferencing Framework defines logical entities and naming conventions.
	The framework also outlines a set of conferencing protocols, which are
	complementary to the call signaling protocols, for building advanced co
	nferencing applications. The framework binds all the defined components
	together for the benefit of builders of conferencing systems. [STANDAR
	DS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC5646" target="https://www.rfc-editor.org/i
	nfo/rfc5646" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.5646.xml">
	<front>
	<title>Tags for Identifying Languages</title>
	<seriesInfo name="DOI" value="10.17487/RFC5646"/>
	<seriesInfo name="RFC" value="5646"/>
	<seriesInfo name="BCP" value="47"/>
	<author initials="A." surname="Phillips" fullname="A. Philli
	ps" role="editor">
	<organization/>
	</author>
	<author initials="M." surname="Davis" fullname="M. Davis" ro
	le="editor">
	<organization/>
	</author>
	<date year="2009" month="September"/>
	<abstract>
	<t>This document describes the structure, content, constru
	ction, and semantics of language tags for use in cases where it is desir
	able to indicate the language used in an information object. It also de
	scribes how to register values for use in language tags and the creation
	of user-defined extensions for private interchange. This document spe
	cifies an Internet Best Current Practices for the Internet Community, an
	d requests discussion and suggestions for improvements.</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC6350" target="https://www.rfc-editor.org/i
	nfo/rfc6350" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.6350.xml">
	<front>
	<title>vCard Format Specification</title>
	<seriesInfo name="DOI" value="10.17487/RFC6350"/>
	<seriesInfo name="RFC" value="6350"/>
	<author initials="S." surname="Perreault" fullname="S. Perre
	ault">
	<organization/>
	</author>
	<date year="2011" month="August"/>
	<abstract>
	<t>This document defines the vCard data format for represe
	nting and exchanging a variety of information about individuals and othe
	r entities (e.g., formatted and structured name and delivery addresses,
	email address, multiple telephone numbers, photograph, logo, audio clips
	, etc.). This document obsoletes RFCs 2425, 2426, and 4770, and updates
	RFC 2739. [STANDARDS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC6351" target="https://www.rfc-editor.org/i
	nfo/rfc6351" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.6351.xml">
	<front>
	<title>xCard: vCard XML Representation</title>
	<seriesInfo name="DOI" value="10.17487/RFC6351"/>
	<seriesInfo name="RFC" value="6351"/>
	<author initials="S." surname="Perreault" fullname="S. Perre
	ault">
	<organization/>
	</author>
	<date year="2011" month="August"/>
	<abstract>
	<t>This document defines the XML schema of the vCard data
	format. [STANDARDS-TRACK]</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC8174" target="https://www.rfc-editor.org/i
	nfo/rfc8174">
	<front>
	<title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key W
	ords</title>
	<seriesInfo name="DOI" value="10.17487/RFC8174"/>
	<seriesInfo name="RFC" value="8174"/>
	<seriesInfo name="BCP" value="14"/>
	<author initials="B." surname="Leiba" fullname="B. Leiba">
	<organization/>
	</author>
	<date year="2017" month="May"/>
	<abstract>
	<t>RFC 2119 specifies common key words that may be used in
	protocol specifications. This document aims to reduce the ambiguity b
	y clarifying that only UPPERCASE usage of the key words have the define
	d special meanings.</t>
	</abstract>
	</front>
	</reference>
	</references>	</references>
	<references>	<references>
	<name>Informative References</name>	<name>Informative References</name>
	<!-- &I-D.ietf-clue-rtp-mapping; MISSREF-->	<!-- &I-D.ietf-clue-rtp-mapping; MISSREF-->

	<reference anchor="RFCYYY4" target="http://www.rfc-editor.org/info/rfcYYY4">	<reference anchor="RFCYYY4" target="http://www.rfc-editor.org/info/rfcYYY4">
	<front>	<front>
	<title>Mapping RTP Streams to Controlling Multiple Streams for Telepresence (CLUE) Media Captures</title>	<title>Mapping RTP Streams to Controlling Multiple Streams for Telepresence (CLUE) Media Captures</title>
	<seriesInfo name="DOI" value="10.17487/RFCYYY4"/>	<seriesInfo name="DOI" value="10.17487/RFCYYY4"/>
	<seriesInfo name="RFC" value="YYY4"/>	<seriesInfo name="RFC" value="YYY4"/>

	skipping to change at line 3976 ¶	skipping to change at line 3918 ¶
	</author>	</author>
	<author initials="J" surname="Lennox" fullname="Jonathan Lennox">	<author initials="J" surname="Lennox" fullname="Jonathan Lennox">
	<organization/>	<organization/>
	</author>	</author>
	<date month="February" day="27" year="2017"/>	<date month="February" day="27" year="2017"/>
	<abstract>	<abstract>
	<t>This document describes how the Real Time transport Protocol (RTP) is used in the context of the CLUE protocol (ControLling mUltiple streams for tElepresence). It also describes the mechanisms and recommended practice for mapping RTP media streams defined in Session Description Protocol (SDP) to CLUE Media Captures and defines a new RTP header extension (CaptureId).</t>	<t>This document describes how the Real Time transport Protocol (RTP) is used in the context of the CLUE protocol (ControLling mUltiple streams for tElepresence). It also describes the mechanisms and recommended practice for mapping RTP media streams defined in Session Description Protocol (SDP) to CLUE Media Captures and defines a new RTP header extension (CaptureId).</t>
	</abstract>	</abstract>
	</front>	</front>
	</reference>	</reference>

	<reference anchor="RFC4353" target="https://www.rfc-editor.org/i
	nfo/rfc4353" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.4353.xml">
	<front>
	<title>A Framework for Conferencing with the Session Initiat
	ion Protocol (SIP)</title>
	<seriesInfo name="DOI" value="10.17487/RFC4353"/>
	<seriesInfo name="RFC" value="4353"/>
	<author initials="J." surname="Rosenberg" fullname="J. Rosen
	berg">
	<organization/>
	</author>
	<date year="2006" month="February"/>
	<abstract>
	<t>The Session Initiation Protocol (SIP) supports the init
	iation, modification, and termination of media sessions between user age
	nts. These sessions are managed by SIP dialogs, which represent a SIP re
	lationship between a pair of user agents. Because dialogs are between p
	airs of user agents, SIP's usage for two-party communications (such as a
	phone call), is obvious. Communications sessions with multiple partici
	pants, generally known as conferencing, are more complicated. This docu
	ment defines a framework for how such conferencing can occur. This fram
	ework describes the overall architecture, terminology, and protocol comp
	onents needed for multi-party conferencing. This memo provides informat
	ion for the Internet community.</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC7667" target="https://www.rfc-editor.org/i
	nfo/rfc7667" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.7667.xml">
	<front>
	<title>RTP Topologies</title>
	<seriesInfo name="DOI" value="10.17487/RFC7667"/>
	<seriesInfo name="RFC" value="7667"/>
	<author initials="M." surname="Westerlund" fullname="M. West
	erlund">
	<organization/>
	</author>
	<author initials="S." surname="Wenger" fullname="S. Wenger">
	<organization/>
	</author>
	<date year="2015" month="November"/>
	<abstract>
	<t>This document discusses point-to-point and multi-endpoi
	nt topologies used in environments based on the Real-time Transport Prot
	ocol (RTP). In particular, centralized topologies commonly employed in t
	he video conferencing industry are mapped to the RTP terminology.</t>
	</abstract>
	</front>
	</reference>
	<!-- [rfced] The following RFC has been obsoleted. We have upda
	ted this reference as follows. Please let us know any objections.


	RFC 5117 has become RFC 7667	<xi:include
		href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
		nce.RFC.4353.xml"/>
		<xi:include
		href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
		nce.RFC.7667.xml"/>


		<!-- [rfced] The following RFC has been obsoleted. We have updated
		this reference as follows. Please let us know any objections.

		RFC 5117 has been obsoleted by RFC 7667.
	-->	-->


	<reference anchor="RFC7201" target="https://www.rfc-editor.org/in	<xi:include
	fo/rfc7201" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/r	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	eference.RFC.7201.xml">	nce.RFC.7201.xml"/>
	<front>	<xi:include
	<title>Options for Securing RTP Sessions</title>	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	<seriesInfo name="DOI" value="10.17487/RFC7201"/>	nce.RFC.7202.xml"/>
	<seriesInfo name="RFC" value="7201"/>	<xi:include
	<author initials="M." surname="Westerlund" fullname="M. West	href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/refere
	erlund">	nce.RFC.7205.xml"/>
	<organization/>	<xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibx
	</author>	ml/reference.RFC.7262.xml"/>
	<author initials="C." surname="Perkins" fullname="C. Perkins
	">
	<organization/>
	</author>
	<date year="2014" month="April"/>
	<abstract>
	<t>The Real-time Transport Protocol (RTP) is used in a lar
	ge number of different application domains and environments. This heter
	ogeneity implies that different security mechanisms are needed to provid
	e services such as confidentiality, integrity, and source authentication
	of RTP and RTP Control Protocol (RTCP) packets suitable for the various
	environments. The range of solutions makes it difficult for RTP-based
	application developers to pick the most suitable mechanism. This docume
	nt provides an overview of a number of security solutions for RTP and gi
	ves guidance for developers on how to choose the appropriate security me
	chanism.</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC7202" target="https://www.rfc-editor.org/i
	nfo/rfc7202" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.7202.xml">
	<front>
	<title>Securing the RTP Framework: Why RTP Does Not Mandate
	a Single Media Security Solution</title>
	<seriesInfo name="DOI" value="10.17487/RFC7202"/>
	<seriesInfo name="RFC" value="7202"/>
	<author initials="C." surname="Perkins" fullname="C. Perkins
	">
	<organization/>
	</author>
	<author initials="M." surname="Westerlund" fullname="M. West
	erlund">
	<organization/>
	</author>
	<date year="2014" month="April"/>
	<abstract>
	<t>This memo discusses the problem of securing real-time m
	ultimedia sessions. It also explains why the Real-time Transport Protoc
	ol (RTP) and the associated RTP Control Protocol (RTCP) do not mandate a
	single media security mechanism. This is relevant for designers and re
	viewers of future RTP extensions to ensure that appropriate security mec
	hanisms are mandated and that any such mechanisms are specified in a man
	ner that conforms with the RTP architecture.</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC7205" target="https://www.rfc-editor.org/i
	nfo/rfc7205" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.7205.xml">
	<front>
	<title>Use Cases for Telepresence Multistreams</title>
	<seriesInfo name="DOI" value="10.17487/RFC7205"/>
	<seriesInfo name="RFC" value="7205"/>
	<author initials="A." surname="Romanow" fullname="A. Romanow
	">
	<organization/>
	</author>
	<author initials="S." surname="Botzko" fullname="S. Botzko">
	<organization/>
	</author>
	<author initials="M." surname="Duckworth" fullname="M. Duckw
	orth">
	<organization/>
	</author>
	<author initials="R." surname="Even" fullname="R. Even" role
	="editor">
	<organization/>
	</author>
	<date year="2014" month="April"/>
	<abstract>
	<t>Telepresence conferencing systems seek to create an env
	ironment that gives users (or user groups) that are not co-located a fee
	ling of co-located presence through multimedia communication that includ
	es at least audio and video signals of high fidelity. A number of techn
	iques for handling audio and video streams are used to create this exper
	ience. When these techniques are not similar, interoperability between
	different systems is difficult at best, and often not possible. Conveyi
	ng information about the relationships between multiple streams of media
	would enable senders and receivers to make choices to allow telepresenc
	e systems to interwork. This memo describes the most typical and import
	ant use cases for sending multiple streams in a telepresence conference.
	</t>
	</abstract>
	</front>
	</reference>
	<reference anchor="RFC7262" target="https://www.rfc-editor.org/i
	nfo/rfc7262" xml:base="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/
	reference.RFC.7262.xml">
	<front>
	<title>Requirements for Telepresence Multistreams</title>
	<seriesInfo name="DOI" value="10.17487/RFC7262"/>
	<seriesInfo name="RFC" value="7262"/>
	<author initials="A." surname="Romanow" fullname="A. Romanow
	">
	<organization/>
	</author>
	<author initials="S." surname="Botzko" fullname="S. Botzko">
	<organization/>
	</author>
	<author initials="M." surname="Barnes" fullname="M. Barnes">
	<organization/>
	</author>
	<date year="2014" month="June"/>
	<abstract>
	<t>This memo discusses the requirements for specifications
	that enable telepresence interoperability by describing behaviors and p
	rotocols for Controlling Multiple Streams for Telepresence (CLUE). In a
	ddition, the problem statement and related definitions are also covered
	herein.</t>
	</abstract>
	</front>
	</reference>
	</references>	</references>
	</references>	</references>
	<section anchor="acks" numbered="false" toc="default">	<section anchor="acks" numbered="false" toc="default">
	<name>Acknowledgements</name>	<name>Acknowledgements</name>
	<t>	<t>
	Allyn Romanow and Brian Baldino were authors of early versions.	Allyn Romanow and Brian Baldino were authors of early versions.
	Mark Gorzynski also contributed much to the initial approach.	Mark Gorzynski also contributed much to the initial approach.
	Many others also contributed, including Christian Groves, Jonathan	Many others also contributed, including Christian Groves, Jonathan
	Lennox, Paul Kyzivat, Rob Hansen, Roni Even, Christer Holmberg,	Lennox, Paul Kyzivat, Rob Hansen, Roni Even, Christer Holmberg,
	Stephen Botzko, Mary Barnes, John Leslie, Paul Coverdale.</t>	Stephen Botzko, Mary Barnes, John Leslie, Paul Coverdale.</t>

End of changes. 142 change blocks.
	1113 lines changed or deleted	821 lines changed or added