4.4 Rendering Schemes, Operation Points and Media Profiles

3GPP TS 26.118, Release 17: Virtual Reality (VR) profiles for streaming applications

The present document provides several interoperability points that may be referenced by external specifications. These are:

– Media profiles: providing DASH, file format and elementary stream constraints for a single media type.

– Operation Points: a collection of discrete combinations of different content formats including spatial and temporal resolutions, colour mapping, transfer functions, rendering metadata and the encoding format.

– Bitstream: a video bitstream that conforms to a video encoding format and a certain Operation Point, including VR Rendering Metadata.

– Rendering Scheme: post-decoder processing of decoder output signals together with rendering metadata.

Note that this applies to both media types, audio and video. For audio, the 3GPP VR Rendering Scheme interoperability point serves as an informative output of the File Decoder. The 3GPP VR viewport interoperability point serves as the output of the entire file decoding process.

Both features provide clear interoperability requirements for receivers. Figure 4.4-1 provides an overview.

Figure 4.4-1: Interoperability aspects for 3GPP VR Profiles

A media profile for timed media is defined as a set of requirements and constraints for a set of one or more 3GPP VR tracks of a single media type. The conformance of a set of one or more 3GPP VR tracks to a media profile is specified as a combination of:

– Specification of which sample entry type(s) are allowed, and which constraints and extensions are required in addition to those imposed by the sample entry type(s).

– Constraints on the samples of the tracks, typically expressed as constraints on the elementary stream contained within the samples of the tracks.

The elementary stream constraints of a media profile may be indicated by a requirement to comply with a certain profile and level of the media coding specification, possibly including additional constraints and extensions, such as a requirement of the presence of certain information for rendering and presentation.
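The two-part conformance rule above can be illustrated with a minimal sketch. It assumes a simplified model in which a media profile is reduced to a set of allowed sample entry types plus a codec profile and a maximum level. All class names, field names and example values are hypothetical and are not taken from the present document; real media profiles carry additional constraints and extensions that this model omits.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class MediaProfile:
    """Hypothetical, simplified model of a media profile."""
    name: str
    media_type: str                     # "video" or "audio"
    allowed_sample_entries: FrozenSet[str]
    codec_profile: str                  # profile of the media coding specification
    max_level: float                    # highest permitted level


@dataclass(frozen=True)
class Track:
    """Hypothetical summary of one track."""
    media_type: str
    sample_entry: str
    codec_profile: str
    level: float


def track_conforms(track: Track, profile: MediaProfile) -> bool:
    """Check the sample entry constraint and the elementary stream constraint."""
    sample_entry_ok = track.sample_entry in profile.allowed_sample_entries
    elementary_stream_ok = (
        track.codec_profile == profile.codec_profile
        and track.level <= profile.max_level
    )
    return (
        track.media_type == profile.media_type
        and sample_entry_ok
        and elementary_stream_ok
    )


def track_set_conforms(tracks: Sequence[Track], profile: MediaProfile) -> bool:
    """A set of one or more tracks conforms only if every track conforms."""
    return len(tracks) >= 1 and all(track_conforms(t, profile) for t in tracks)
```

The check is deliberately split into a sample entry part and an elementary stream part to mirror the two bullets above; a real validator would inspect the file and the bitstream rather than a pre-computed summary.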

Each media profile specified in the present document includes a file decoding process such that all file decoders that conform to the media profile will produce:

– for video: numerically identical cropped decoded pictures when invoking the file decoding process associated with that video media profile for a set of 3GPP VR tracks conforming to the video media profile. A bitstream that conforms to the elementary stream constraints specified for the video media profile is reconstructed as an intermediate product of the file decoding process. Output of the file decoding process consists of all of the following:

– a list of decoded pictures with associated presentation times;

– for projected omnidirectional video, VR rendering metadata.

– for audio: a set of audio signals when invoking the file decoding process associated with that audio media profile for a VR Track conforming to the audio media profile. A bitstream that conforms to the elementary stream constraints specified for the audio media profile is reconstructed as an intermediate product of the file decoding process. Output of the file decoding process consists of all of the following:

– a sequence of audio samples with associated presentation times;

– audio VR rendering metadata.

A file decoder conforms to the file decoding process requirements of the present document when it complies with both of the following:

– for video:

– The file decoder includes a conforming decoder that produces numerically identical cropped decoded pictures to those produced by the file decoding process specified for the video media profile in clause 5 (with the correct output order or output timing, as specified in the video coding specification of the video media profile, respectively).

– The file decoder outputs rendering metadata that is equivalent to that produced by the file decoding process specified for the video media profile in clause 5 (with the correct association of the rendering metadata to particular cropped decoded pictures, as specified in the present document).

– for audio:

– The file decoder includes a conforming decoder that produces a sequence of audio samples with associated presentation times as defined in clause 6.

– The file decoder outputs audio rendering metadata that is equivalent to that produced by the file decoding process specified for the audio media profile in clause 6 (with the correct association of the rendering metadata to particular audio samples).
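As an illustration of the video requirements above, the following sketch compares the output of a file decoder under test against a reference output. It assumes that each output item carries a presentation time, the raw sample values of the cropped decoded picture, and its associated rendering metadata. The names `DecodedPicture` and `video_outputs_match` are hypothetical, and treating "equivalent" rendering metadata as exact dictionary equality is a simplifying assumption, not a rule stated in the present document.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence


@dataclass
class DecodedPicture:
    """Hypothetical output item of a video file decoding process."""
    presentation_time: int                  # in timescale ticks
    samples: bytes                          # cropped decoded picture, raw sample values
    rendering_metadata: Dict[str, object] = field(default_factory=dict)


def video_outputs_match(
    reference: Sequence[DecodedPicture],
    candidate: Sequence[DecodedPicture],
) -> bool:
    """Return True if the candidate output matches the reference output.

    Pictures must be numerically identical and appear in the same order with
    the same timing; metadata must be attached to the same pictures.
    """
    if len(reference) != len(candidate):
        return False
    for ref, cand in zip(reference, candidate):
        if ref.presentation_time != cand.presentation_time:
            return False                    # wrong output order or timing
        if ref.samples != cand.samples:
            return False                    # not numerically identical
        if ref.rendering_metadata != cand.rendering_metadata:
            return False                    # metadata differs or is misassociated
    return True
```

Comparing item by item, rather than comparing the pictures and the metadata as two separate lists, keeps the association between rendering metadata and particular pictures part of the check.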

A player claiming conformance to a media profile shall include a file decoder complying with the file decoding process of that media profile as specified above. While the player operation, with the exception of the file decoding process, is not specified normatively in the present document, the specification of a media profile may include an informative clause on expectations of player operation, for example including recommendations for rendering.

In addition to track-level interoperability, DASH-level interoperability is defined for each media profile. This interoperability covers signalling and content generation, such that a conforming 3GPP VR Track for the media profile may be obtained through dynamic switching based on network constraints or sensor input.
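Dynamic switching driven by network constraints can be sketched as a simple selection among alternatives that all belong to the same media profile. The sketch below is only one possible client policy and is not specified in the present document: it picks the highest-bandwidth alternative that fits the measured throughput and falls back to the lowest-bandwidth one otherwise. The `Representation` class, its fields and the function name are hypothetical, and switching based on sensor input is not modelled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Representation:
    """Hypothetical summary of one switchable alternative."""
    rep_id: str
    media_profile: str      # identifier of the media profile it conforms to
    bandwidth_bps: int      # declared bandwidth in bits per second


def select_representation(
    representations: Sequence[Representation],
    media_profile: str,
    throughput_bps: int,
) -> Optional[Representation]:
    """Pick an alternative of the given media profile for the current throughput."""
    # Only alternatives of the requested media profile are candidates, so the
    # track obtained after switching still conforms to that profile.
    candidates = [r for r in representations if r.media_profile == media_profile]
    if not candidates:
        return None
    affordable = [r for r in candidates if r.bandwidth_bps <= throughput_bps]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    # Nothing fits: fall back to the cheapest candidate.
    return min(candidates, key=lambda r: r.bandwidth_bps)
```

Restricting the candidates to a single media profile before applying any rate logic reflects the point of the paragraph above: whichever alternative is chosen, the result remains a conforming track for that profile.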