10 Media codecs and formats

3GPP TS 26.346, Multimedia Broadcast/Multicast Service (MBMS); Protocols and codecs, Release 17

10.1 General

The set of media decoders that the MBMS Client supports for a particular media type is defined below. Speech, Audio, Video, Timed Text and Scene description media decoders are relevant for both MBMS Download and Streaming delivery. Other media decoders are only relevant for MBMS Download delivery.

10.2 Speech

If speech is supported, the AMR decoder, as specified in 3GPP TS 26.071 [48], 3GPP TS 26.090 [49], 3GPP TS 26.073 [50] and 3GPP TS 26.107 [51], shall be supported for narrow-band speech. The AMR wideband speech decoder, as specified in 3GPP TS 26.171 [52], 3GPP TS 26.190 [53], 3GPP TS 26.173 [54] and 3GPP TS 26.204 [55], shall be supported when wideband speech working at 16 kHz sampling frequency is supported.

10.3 Audio

If audio is supported, then the following two audio decoders should be supported:

– Enhanced aacPlus, as specified in 3GPP TS 26.401 [28], 3GPP TS 26.410 [29] and 3GPP TS 26.411 [30].

– Extended AMR-WB, as specified in 3GPP TS 26.290 [24], 3GPP TS 26.304 [25] and 3GPP TS 26.273 [26].

Specifically, based on the audio codec selection test results, Extended AMR-WB is strong for the scenarios marked with blue, Enhanced aacPlus is strong for the scenarios marked with orange, and both are strong for the scenarios marked with green colour in table 1.

Table 1

Content type: Music, Speech over Music, Speech between Music, Speech

Bit rate: 14 kbps mono, 18 kbps stereo, 24 kbps stereo, 24 kbps mono, 32 kbps stereo, 48 kbps stereo

More recent information on codec performance, based on later versions of the codecs, can be found in TR 26.936 [86].

10.4 Synthetic audio

If synthetic audio is supported, the Scalable Polyphony MIDI (SP-MIDI) content format defined in Scalable Polyphony MIDI Specification [56] and the device requirements defined in Scalable Polyphony MIDI Device 5-to-24 Note Profile for 3GPP [57] should be supported.

SP-MIDI content is delivered in the structure specified in Standard MIDI Files 1.0 [58], either in format 0 or format 1.

In addition, the Mobile DLS instrument format defined in [59] and the Mobile XMF content format defined in [60] should be supported.

An MBMS client supporting Mobile DLS shall meet the minimum device requirements defined in [59] in section 1.3 and the requirements for the common part of the synthesizer voice as defined in [59] in section 1.2.1.2. If Mobile DLS is supported, wavetables encoded with the G.711 A-law codec (wFormatTag value 0x0006, as defined in [59]) shall also be supported. The optional group of processing blocks as defined in [59] may be supported. Mobile DLS resources are delivered either in the file format defined in [59], or within Mobile XMF as defined in [60]. For Mobile DLS files delivered outside of Mobile XMF, the loading application should unload Mobile DLS instruments so that the sound bank required by the SP-MIDI profile [57] is not persistently altered by temporary loadings of Mobile DLS files.

Content that pairs Mobile DLS and SP-MIDI resources is delivered in the structure specified in Mobile XMF [60]. As defined in [60], a Mobile XMF file shall contain one SP-MIDI SMF file and no more than one Mobile DLS file. MBMS clients supporting Mobile XMF must not support any other resource types in the Mobile XMF file. Media handling behaviours for the SP-MIDI SMF and Mobile DLS resources contained within Mobile XMF are defined in [60].

10.5 Video

10.5.1 General video decoder requirements

If video is supported, the following applies:

– H.264 (AVC) Progressive High Profile Level 3.1 decoder [43] shall be supported wherein the maximum VCL Bit Rate is constrained to be 14 Mbps with cpbBrVclFactor and cpbBrNalFactor being fixed to be 1000 and 1200, respectively.

– H.265 (HEVC) Main Profile, Main Tier, Level 3.1 decoder [112] should be supported.
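As an informative illustration of the H.264 (AVC) bit rate constraint above, the following sketch derives the VCL and NAL bit rate limits from the two fixed factors. The MaxBR value of 14 000 for Level 3.1 is taken from Table A-1 of ITU-T Recommendation H.264 [43] and is an assumption of this sketch, not a value stated in this clause; the constant names are illustrative only.

```python
# MaxBR for H.264 (AVC) Level 3.1, in units of cpbBrVclFactor or
# cpbBrNalFactor bits per second (assumed from Table A-1 of H.264).
MAX_BR_LEVEL_3_1 = 14_000

# Factors fixed by the requirement in this clause.
CPB_BR_VCL_FACTOR = 1000
CPB_BR_NAL_FACTOR = 1200

# Resulting limits in bits per second.
max_vcl_bit_rate = MAX_BR_LEVEL_3_1 * CPB_BR_VCL_FACTOR
max_nal_bit_rate = MAX_BR_LEVEL_3_1 * CPB_BR_NAL_FACTOR

print(max_vcl_bit_rate)  # 14000000
print(max_nal_bit_rate)  # 16800000
```

Under these assumptions the VCL limit works out to the 14 Mbps stated in the requirement, and the corresponding NAL limit to 16.8 Mbps.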

When the H.265 (HEVC) Main Profile decoder is supported, the client is only required to process H.265 (HEVC) Main Profile bitstreams that have general_progressive_source_flag equal to 1, general_interlaced_source_flag equal to 0, general_non_packed_constraint_flag equal to 1, and general_frame_only_constraint_flag equal to 1.
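The flag combination above can be expressed as a simple check. The following is a minimal sketch, assuming the four flags have already been parsed from the bitstream by some other component; the GeneralFlags container and the function name are hypothetical and are not defined by this specification or by H.265 (HEVC).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralFlags:
    """Hypothetical container for four already-parsed H.265 (HEVC) flags."""
    general_progressive_source_flag: int
    general_interlaced_source_flag: int
    general_non_packed_constraint_flag: int
    general_frame_only_constraint_flag: int


# The only combination the client is required to process.
REQUIRED_FLAGS = GeneralFlags(
    general_progressive_source_flag=1,
    general_interlaced_source_flag=0,
    general_non_packed_constraint_flag=1,
    general_frame_only_constraint_flag=1,
)


def client_is_required_to_process(flags: GeneralFlags) -> bool:
    """Return True when the bitstream flags match the required combination."""
    return flags == REQUIRED_FLAGS
```

A client may still choose to process bitstreams for which this check returns False; the check only identifies the bitstreams it is required to handle.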

NOTE: An H.264 (AVC) High Profile decoder is able to decode an H.264 (AVC) Main Profile stream that is progressively encoded.

10.5.2 Stereoscopic 3D Video

If an MBMS client supports stereoscopic 3D video, it should support frame-packed stereoscopic 3D video with the following characteristics:

– The bitstream conforms to H.264 (AVC) Constrained Baseline Profile Level 1.3 decoder [43], or conforms to H.264 (AVC) Progressive High Profile Level 3.1 decoder [43]. The Maximum VCL Bit Rate shall be constrained to be 14 Mbps with cpbBrVclFactor and cpbBrNalFactor being fixed to be 1000 and 1200, respectively.

– Frame packing type is indicated by the frame packing arrangement SEI messages of H.264 (AVC) [43] as follows:

– The syntax element frame_packing_arrangement_type has one of the defined values: 3 for Side-by-Side, 4 for Top-and-Bottom;

– The syntax element quincunx_sampling_flag is equal to 0;

– The syntax element content_interpretation_type is equal to 1;

– The syntax element spatial_flipping_flag is equal to 0;

– The syntax element field_views_flag is equal to 0;

– The syntax element current_frame_is_frame0_flag is equal to 0.

– When an access unit contains a frame packing arrangement SEI message A and the access unit is neither an IDR access unit nor an access unit containing a recovery point SEI message, the following two constraints apply:

– There shall be another access unit that precedes the access unit in both decoding order and output order and that contains a frame packing arrangement SEI message B.

– The two frame packing arrangement SEI messages A and B shall have the same value for the syntax element frame_packing_arrangement_type.

If an MBMS client supports frame-packed stereoscopic 3D video, it shall support parsing of frame packing arrangement SEI messages as specified in H.264 (AVC) [43].
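The per-message syntax element constraints listed above can be checked once a frame packing arrangement SEI message has been parsed. The following is a minimal sketch under that assumption; the FramePackingSei container and the constraint_violations function are hypothetical names, and the sketch does not cover the constraint that relates SEI messages A and B across access units.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FramePackingSei:
    """Hypothetical container for already-parsed SEI syntax elements."""
    frame_packing_arrangement_type: int
    quincunx_sampling_flag: int
    content_interpretation_type: int
    spatial_flipping_flag: int
    field_views_flag: int
    current_frame_is_frame0_flag: int


# Arrangement types permitted by this clause.
ARRANGEMENT_NAMES = {3: "Side-by-Side", 4: "Top-and-Bottom"}

# Fixed values required by this clause for the remaining syntax elements.
REQUIRED_VALUES = {
    "quincunx_sampling_flag": 0,
    "content_interpretation_type": 1,
    "spatial_flipping_flag": 0,
    "field_views_flag": 0,
    "current_frame_is_frame0_flag": 0,
}


def constraint_violations(sei: FramePackingSei) -> List[str]:
    """Return the names of syntax elements that break the constraints."""
    problems = []
    if sei.frame_packing_arrangement_type not in ARRANGEMENT_NAMES:
        problems.append("frame_packing_arrangement_type")
    for name, required in REQUIRED_VALUES.items():
        if getattr(sei, name) != required:
            problems.append(name)
    return problems
```

An empty list indicates that the message satisfies all per-message constraints, for example constraint_violations(FramePackingSei(3, 0, 1, 0, 0, 0)).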

10.5.3 Decoder parameter sets

Note that MBMS does not offer dynamic negotiation of media codecs.

When H.264 (AVC) is in use in the MBMS streaming delivery method, it is recommended to transmit H.264 (AVC) parameter sets within the SDP description of a stream (using the sprop-parameter-sets MIME/SDP parameter [35]), and it is not recommended to transmit parameter sets within the RTP stream. Moreover, it is recommended not to reuse any parameter set identifier value that appeared previously in the SDP description or in the RTP stream. However, if a sequence parameter set is taken into use or updated within the RTP stream, it shall be contained at least in each IDR access unit and each access unit including a recovery point SEI message in which the sequence parameter set is used in the decoding process. If a picture parameter set is taken into use or updated within the RTP stream, it shall be contained at the latest in the first access unit in each entry sequence that uses the picture parameter set in the decoding process. An entry sequence is defined as the access units between an IDR access unit or an access unit containing a recovery point SEI message, inclusive, and the next access unit, exclusive, in decoding order, which is either an IDR access unit or contains a recovery point SEI message.
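As an informative illustration of carrying H.264 (AVC) parameter sets in SDP, the following sketch extracts the sprop-parameter-sets value from an a=fmtp line and classifies each decoded NAL unit. It assumes the RFC 6184 encoding (a comma-separated list of base64-encoded NAL units) and that the NAL unit type is held in the low five bits of the first byte (7 for a sequence parameter set, 8 for a picture parameter set). The function name and the sample line are illustrative only.

```python
import base64
from typing import List, Tuple

# NAL unit types of interest (H.264 nal_unit_type values).
NAL_TYPE_NAMES = {7: "SPS", 8: "PPS"}


def parse_sprop_parameter_sets(fmtp_line: str) -> List[Tuple[str, bytes]]:
    """Return (kind, raw NAL unit) pairs found in an a=fmtp line."""
    # Drop the "a=fmtp:<payload type>" prefix and keep the parameter list.
    _, _, params = fmtp_line.partition(" ")
    for param in params.split(";"):
        # Split on the first "=" only, because base64 padding also uses "=".
        name, _, value = param.strip().partition("=")
        if name != "sprop-parameter-sets":
            continue
        result = []
        for encoded in value.split(","):
            nal = base64.b64decode(encoded)
            nal_type = nal[0] & 0x1F
            result.append((NAL_TYPE_NAMES.get(nal_type, "other"), nal))
        return result
    return []


# Illustrative fmtp line; the payload type and values are examples only.
SAMPLE_FMTP = (
    "a=fmtp:96 packetization-mode=1; profile-level-id=42000a; "
    "sprop-parameter-sets=Z0IACpZTBYmI,aMljiA=="
)

for kind, nal in parse_sprop_parameter_sets(SAMPLE_FMTP):
    print(kind, nal.hex())
```

A receiver that obtains the parameter sets this way can initialise the decoder before the first RTP packet arrives, which is the motivation for the recommendation above.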

When H.265 (HEVC) is in use in the MBMS streaming delivery method, it is recommended to transmit H.265 (HEVC) parameter sets within the SDP description of a stream (using the sprop-vps, sprop-sps, and sprop-pps MIME/SDP parameters [113]), and it is not recommended to transmit parameter sets within the RTP stream. Moreover, it is recommended not to reuse any parameter set identifier value that appeared previously in the SDP description or in the RTP stream, i.e., it is recommended that no_parameter_set_update_flag, if present, for each CVS in the stream is equal to 1. Also, it is required that self_contained_cvs_flag, if present, for each CVS in the stream is equal to 1, i.e., each parameter set that is (directly or indirectly) referenced by any VCL NAL unit of a CVS that is not a VCL NAL unit of a RASL picture is present within the CVS at a position that precedes, in decoding order, any NAL unit that (directly or indirectly) references the parameter set.

10.5.4 Decoder timing

There are no requirements on output timing conformance for MBMS clients, either for H.264 (AVC) decoding (annex C of ITU-T Recommendation H.264 [43]) or for H.265 (HEVC) decoding (annex C of [112]).

The H.264 (AVC) decoder in an MBMS client shall start decoding immediately when it receives data (even if the stream does not start with an IDR access unit) or alternatively no later than when it receives the next IDR access unit or the next recovery point SEI message, whichever is earlier in decoding order. Note that when the interleaved packetization mode of H.264 (AVC) is in use, de-interleaving is normally done before starting the decoding process. The decoding process for a stream not starting with an IDR access unit shall be the same as for a valid H.264 (AVC) bitstream. However, the client shall be aware that such a stream may contain references to pictures not available in the decoded picture buffer.
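The start-of-decoding rule above can be sketched as follows. The AccessUnit record and the function name are hypothetical, and the sketch assumes that the access units are listed in decoding order (that is, after any de-interleaving) and that the two properties of each access unit have already been determined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AccessUnit:
    """Hypothetical summary of one received H.264 (AVC) access unit."""
    is_idr: bool
    has_recovery_point_sei: bool


def latest_decoding_start(received: List[AccessUnit]) -> Optional[int]:
    """Return the index of the latest permitted decoding start point.

    Starting at index 0 (immediately) is always permitted. Otherwise the
    decoder starts no later than the first access unit that is an IDR
    access unit or carries a recovery point SEI message. None means that
    no such access unit has been received yet.
    """
    for index, access_unit in enumerate(received):
        if access_unit.is_idr or access_unit.has_recovery_point_sei:
            return index
    return None
```

A client that starts before the returned index has to tolerate references to pictures that are not in the decoded picture buffer, as noted above.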

10.5.5 Television services

If the 3GPP MBMS client supports Television (TV) over 3GPP Services, it shall comply with

– the H.264/AVC 720p HD Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.4.2.6.

If the 3GPP MBMS client supports Television (TV) over 3GPP Services, it should comply with

– H.264/AVC Full HD Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.4.3.6,

– H.265/HEVC 720p HD Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.5.2.7,

– H.265/HEVC Full HD Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.5.3.7,

– H.265/HEVC UHD Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.5.4.7,

– H.265/HEVC Full HD HDR Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.5.5.8, and

– H.265/HEVC UHD HDR Operation Point Receiver requirements as specified in TS 26.116 [125], clause 4.5.6.8.

10.6 Still images

If still images are supported, ISO/IEC JPEG [61] together with JFIF [62] decoders shall be supported. The support for ISO/IEC JPEG only applies to the following two modes:

– baseline DCT, non-differential, Huffman coding, as defined in table B.1, symbol ‘SOF0’ in ISO/IEC JPEG [61];

– progressive DCT, non-differential, Huffman coding, as defined in table B.1, symbol ‘SOF2’ in ISO/IEC JPEG [61].
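The two supported modes can be recognised from the start-of-frame marker in a JPEG file. The following is a minimal sketch, assuming the marker codes 0xFFC0 (SOF0) and 0xFFC2 (SOF2) and the usual segment layout of a two-byte marker followed by a two-byte big-endian length. It does not handle standalone markers without a length field before the frame header, and the function names are illustrative only.

```python
import struct
from typing import Optional

# Start-of-frame markers for the two supported modes.
SUPPORTED_SOF = {0xC0: "SOF0 (baseline DCT)", 0xC2: "SOF2 (progressive DCT)"}

# Codes in the 0xC0..0xCF range that are not start-of-frame markers.
NOT_SOF = {0xC4, 0xC8, 0xCC}


def find_sof_marker(data: bytes) -> Optional[int]:
    """Return the second byte of the first SOF marker, or None."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if 0xC0 <= marker <= 0xCF and marker not in NOT_SOF:
            return marker
        if marker == 0xDA:  # start of scan reached without a frame header
            return None
        # Skip this segment: the length field counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return None


def is_supported_jpeg_mode(data: bytes) -> bool:
    """True when the frame header is SOF0 or SOF2."""
    return find_sof_marker(data) in SUPPORTED_SOF
```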

10.7 Bitmap graphics

If bitmap graphics is supported, the following bitmap graphics decoders should be supported:

– GIF87a, [63];

– GIF89a, [64];

– PNG, [65].
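The three bitmap formats can be told apart by their leading signature bytes. The following is a minimal sketch, assuming the six-byte ASCII signatures "GIF87a" and "GIF89a" and the eight-byte PNG signature; the function name is illustrative only.

```python
from typing import Optional

# Eight-byte signature at the start of every PNG datastream.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_bitmap_format(data: bytes) -> Optional[str]:
    """Return "GIF87a", "GIF89a" or "PNG" based on the leading bytes."""
    if data.startswith(b"GIF87a"):
        return "GIF87a"
    if data.startswith(b"GIF89a"):
        return "GIF89a"
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    return None
```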

10.8 Vector graphics

If vector graphics is supported, SVG Tiny 1.2 [66], [67] and ECMAScript [68] shall be supported.

NOTE 1: The compression format for SVG content is GZIP [42], in accordance with the SVG specification [66].

NOTE 2: Content creators of SVG Tiny 1.2 are strongly recommended to follow the content creation guidelines provided in annex L of 3GPP TS 26.234 [47].
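As an informative illustration of NOTE 1, the following sketch loads SVG content that may or may not be GZIP-compressed. It assumes that compressed content can be recognised by the two-byte GZIP magic number and that the document text is UTF-8 encoded; both are assumptions of this sketch, and the function name is illustrative only.

```python
import gzip

# First two bytes of every GZIP member.
GZIP_MAGIC = b"\x1f\x8b"


def load_svg_text(data: bytes) -> str:
    """Return SVG markup as text, decompressing GZIP content if present."""
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    # Assumption: the SVG document is UTF-8 encoded.
    return data.decode("utf-8")
```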

10.9 Text

Text is supported as part of the HTML5 support [124].

The following character coding formats shall be supported:

– UTF-8, [71];

– UCS-2, [70].
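The two character coding formats can be decoded as sketched below. UCS-2 is treated here as a fixed-width 16-bit encoding without surrogate pairs; big-endian byte order is the default and is an assumption of this sketch, as are the function names.

```python
def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 text, raising UnicodeDecodeError on malformed input."""
    return data.decode("utf-8")


def decode_ucs2(data: bytes, big_endian: bool = True) -> str:
    """Decode UCS-2 text, one 16-bit code unit per character."""
    if len(data) % 2:
        raise ValueError("UCS-2 data must have an even number of bytes")
    order = "big" if big_endian else "little"
    chars = []
    for i in range(0, len(data), 2):
        code = int.from_bytes(data[i:i + 2], order)
        # Surrogate code units belong to UTF-16, not to UCS-2.
        if 0xD800 <= code <= 0xDFFF:
            raise ValueError("surrogate code units are not valid in UCS-2")
        chars.append(chr(code))
    return "".join(chars)
```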

10.10 Timed text

If timed text is supported, MBMS clients shall support 3GPP TS 26.245 [72]. Timed text may be transported over RTP or downloaded in 3GP files using the Basic profile.

NOTE: When an MBMS client supports timed text, it needs to be able to receive and parse 3GP files containing the text streams. This does not imply a requirement on MBMS clients to be able to render other continuous media types contained in 3GP files, e.g. AMR, if such media types are included in a presentation together with timed text. Audio and video are instead streamed to the client using RTP.

10.11 3GPP file format

An MBMS client shall support the Basic profile and the Extended presentation profile of the 3GPP file format 3GPP TS 26.244 [32].

For delivery of 3GP-DASH formatted segments over MBMS download (see clause 5.6), the MBMS client shall support the 3GPP file format and segments for Dynamic Adaptive Streaming over HTTP as specified in 3GPP TS 26.247 [98] and in 3GPP TS 26.244 [32].

10.12 Scene Description

If scene description is supported, MBMS clients shall support the 3GPP HTML5 profile [124] as the scene description format.

The HTML5 presentation shall be identified by the presence of an appService element in the USD with the attribute appServiceDescriptionURI set to point to the HTML5 document with the MIME type "text/html". The resources referenced from the HTML5 document should be delivered on the same FLUTE session as the HTML5 document itself. Alternatively, the HTML5 document and related static resources may be delivered as part of the USD as metadata fragments.

The HTML5 document may reference an MPD [98] or a progressive download file [98] as a source location for media playback using an HTML5 media element.

Scene updates are possible through the delivery of a scene update file that is referenced by a JavaScript script that periodically checks for updates. The script should detect new versions of the scene update file by checking the file version (provided, e.g., as the Content-MD5 over FLUTE).
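The version check described above amounts to comparing the current Content-MD5 value with the last one seen. The following sketch shows that logic in Python for illustration only; in a deployment it would run as script in the HTML5 document. It assumes that Content-MD5 is the base64 encoding of the 128-bit MD5 digest of the file, and the class and function names are hypothetical.

```python
import base64
import hashlib
from typing import Optional


def content_md5(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of a file's content."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


class SceneUpdateTracker:
    """Remembers the last seen Content-MD5 of the scene update file."""

    def __init__(self) -> None:
        self._last_seen: Optional[str] = None

    def is_new_version(self, md5_value: str) -> bool:
        """Return True when md5_value differs from the last one seen."""
        if md5_value == self._last_seen:
            return False
        self._last_seen = md5_value
        return True
```

Each periodic check passes the Content-MD5 value of the currently available scene update file to is_new_version and applies the update only when it returns True.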

10.13 Timed graphics

If timed graphics is supported, MBMS clients shall support 3GPP TS 26.430 [95].

10.14 360 video and 3D audio for VR (Virtual Reality)

10.14.1 Video

10.14.1.1 Operation Points

If the MBMS client supports 360 VR video, it shall include a receiver that complies with

– the H.264/AVC Basic Operation Point Receiver requirements as specified in TS 26.118 [121], clause 5.1.4.

If the MBMS client supports 360 VR video, it should include a receiver that complies with

– the Main H.265/HEVC Operation Point Receiver requirements as specified in TS 26.118 [121], clause 5.1.5.

10.14.1.2 DASH-over-MBMS

Only a single Adaptation Set for each media type should be offered in an MPD for DASH-over-MBMS.

If the MBMS client supports 360 VR video for DASH-over-MBMS services, it shall include a receiver that complies with

– the Basic Video Media Profile Receiver requirements for DASH as specified in TS 26.118 [121], clause 5.2.2.

If the MBMS client supports 360 VR video for DASH-over-MBMS services, it should include a receiver that complies with

– the Main Video Media Profile Receiver requirements for DASH as specified in TS 26.118 [121], clause 5.2.3.

NOTE: The receiver is not expected to handle Media Adaptation Set Ensembles for Viewport-Optimized offerings as defined in TS 26.118 [121], clause 5.2.3.3.4.

10.14.2 Audio

10.14.2.1 Operation Points

If the MBMS client supports 3D/VR audio, it should include a receiver that complies with

– the 3GPP MPEG-H Audio Operation Point Receiver requirements as specified in TS 26.118 [121], clause 6.1.4.

10.14.2.2 DASH-over-MBMS

If the MBMS client supports 3D/VR audio for DASH-over-MBMS services, it should include a receiver that complies with

– the OMAF 3D Audio Baseline Media Profile Receiver requirements for DASH as specified in TS 26.118 [121], clause 6.2.2.3.