4.5 Audio Rendering

3GPP TS 26.118, Release 17: Virtual Reality (VR) profiles for streaming applications

4.5.1 Audio Renderer Definitions

4.5.1.1 Reference Renderer

The purpose of the Reference Renderer is to provide a documented audio rendering solution in 3GPP for its corresponding media profile. A Reference Renderer:

a) Is specified along with the media profile.

b) Supports binaural and loudspeaker-based rendering.

c) Has a standardized implementable description documented either in 3GPP or in an external SDO.

d) Supports diegetic and non-diegetic content.

e) Has a Motion to Sound latency characterized according to the method defined in TS 26.260 [15].

f) Has a Loudness characterized according to the method defined in TS 26.260 [15].

g) Provides a suitable subjective quality level characterized by the Rendering Test (see clause 4.5.1.6).

h) Provides an interface to specify the set of HRTFs used for binaural rendering.

NOTE: The Reference Renderer could be an external renderer following the properties defined above.

4.5.1.2 Common Informative Binaural Renderer (CIBR)

The CIBR is a binaural renderer defined for the purposes of the Rendering Test in TS 26.259 [16] (see clause 4.5.1.6). The CIBR:

a) Supports binaural rendering.

b) Supports diegetic and non-diegetic content.

c) Has a Motion to Sound latency characterized according to the method defined in TS 26.260 [15].

d) Has a Loudness characterized according to the method defined in TS 26.260 [15].

e) Is intended to provide a quality comparison point for the Reference Renderer (see clause 4.5.1.1) and any External Renderer (see clause 4.5.1.3).

The CIBR consists of four components. The first three are currently available as VST audio plugins:

1) The "ESD to HOA" component, which receives a set of audio input signals in an Equivalent Spatial Domain (ESD) representation and converts them into a set of audio output signals in the HOA domain (ACN/SN3D format). The ESD representation corresponds to the immersive audio content rendered at a set of predetermined virtual loudspeaker locations (Fliege Points). The ESD to HOA conversion is accomplished using the "AmbiX Decoder" plugin (https://github.com/kronihias/ambix) with the appropriate conversion matrices specified in TS 26.260 [15] clause 4.1 (Fliege Points).

2) The "Sound Field Rotation" component, which rotates the sound field in the HOA domain. The rotation is accomplished with the "AmbiX Soundfield Rotator" plugin (https://github.com/kronihias/ambix), driven by a connected head-tracking device.

3) The "HOA to Binaural" component, which performs the binaural rendering of the HOA signals. The HOA to binaural conversion is accomplished with the "Google Resonance Monitoring" plugin (https://github.com/resonance-audio/resonance-audio-daw-tools), which supports up to 3rd order HOA.

4) A "Diegetic/Non-Diegetic content mixer". The non-diegetic signals are directly mixed at the headphone output.
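The four components above can be sketched for a single sample as follows. This is a minimal illustration under stated assumptions, not the CIBR itself: it uses first-order HOA only, four hypothetical horizontal virtual loudspeakers instead of the Fliege Points and conversion matrices of TS 26.260 [15], yaw-only head tracking, and a simple two-cardioid stereo decode standing in for the HRTF-based binaural stage. All function names are illustrative.

```python
import math

def encode_foa(azimuth_deg, elevation_deg=0.0):
    """First-order ACN/SN3D gains [W, Y, Z, X] for a given direction."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return [1.0,
            math.sin(az) * math.cos(el),
            math.sin(el),
            math.cos(az) * math.cos(el)]

def esd_to_hoa(esd_sample, speaker_dirs):
    """Component 1 (illustrative): weight each virtual-loudspeaker sample
    by the encoding gains of its direction and sum into the HOA channels.
    The real conversion matrices come from TS 26.260, not from this function."""
    hoa = [0.0] * 4
    for s, (az, el) in zip(esd_sample, speaker_dirs):
        for k, g in enumerate(encode_foa(az, el)):
            hoa[k] += g * s
    return hoa

def rotate_yaw(hoa, head_yaw_deg):
    """Component 2: counter-rotate the sound field by the head yaw
    (assumption: positive yaw means the head turns to the left)."""
    a = math.radians(-head_yaw_deg)
    w, y, z, x = hoa
    return [w,
            x * math.sin(a) + y * math.cos(a),
            z,
            x * math.cos(a) - y * math.sin(a)]

def hoa_to_stereo_standin(hoa):
    """Component 3 stand-in: two virtual cardioids facing left and right.
    This is NOT an HRTF-based binaural decode."""
    w, y, _z, _x = hoa
    return (0.5 * (w + y), 0.5 * (w - y))

def mix(diegetic_lr, non_diegetic_lr):
    """Component 4: add the non-diegetic stereo signal directly
    at the headphone output, unaffected by head rotation."""
    return (diegetic_lr[0] + non_diegetic_lr[0],
            diegetic_lr[1] + non_diegetic_lr[1])

def render_sample(esd_sample, speaker_dirs, head_yaw_deg, non_diegetic_lr):
    hoa = esd_to_hoa(esd_sample, speaker_dirs)
    hoa = rotate_yaw(hoa, head_yaw_deg)
    return mix(hoa_to_stereo_standin(hoa), non_diegetic_lr)
```

With only the front virtual loudspeaker active, turning the head to the left moves the diegetic content toward the right ear, while the non-diegetic contribution stays the same in both ears regardless of head orientation.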

Note that the CIBR may introduce spatial and/or timbral quality changes to the rendered objects and channel-based audio signals (ESD loudspeaker inputs).

Figure 4.5-1: Block diagram of Common Informative Binaural Renderer

4.5.1.3 External Renderer

The primary purpose of the External Renderer is to enable alternatives to the Reference Renderer. There may be several External Renderers for a given media profile.

An External Renderer:

a) Supports binaural and/or loudspeaker-based rendering.

b) Can be the Reference Renderer associated with the Audio Media Profile.

c) Does not require a standardized implementable description.

d) Exposes an External Renderer API (see clause 4.5.1.5) and/or the Common Renderer API (see clause 4.5.1.4) for connecting it to an audio decoder.

e) Supports diegetic and non-diegetic content.

f) Has a Motion to Sound latency documented according to the method defined in TS 26.260 [15].

g) Has a Loudness documented according to the method defined in TS 26.260 [15].

h) Provides a suitable subjective quality level characterized according to the Rendering Test (see clause 4.5.1.6) with additional comparison with the Reference Renderer.

i) Provides an interface to specify the set of HRTFs used for binaural rendering if applicable.
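The list above can be read as a checklist. The following sketch, which is not part of any 3GPP API, shows one way to record what a renderer claims and to report which items of the list remain unmet. The class, its field names, and the function are hypothetical. The sketch also assumes that "if applicable" in item i) means "when binaural rendering is offered", and it treats items b) and c) as permissions rather than requirements.

```python
from dataclasses import dataclass

@dataclass
class RendererClaims:
    """Hypothetical self-declaration for a renderer (illustrative field names)."""
    binaural: bool
    loudspeaker: bool
    external_renderer_api: bool
    common_renderer_api: bool
    diegetic_and_non_diegetic: bool
    latency_documented: bool           # per the method in TS 26.260 [15]
    loudness_documented: bool          # per the method in TS 26.260 [15]
    rendering_test_vs_reference: bool  # Rendering Test incl. Reference Renderer comparison
    hrtf_interface: bool

def unmet_external_renderer_items(c):
    """Return the list letters of the External Renderer items not satisfied."""
    unmet = []
    if not (c.binaural or c.loudspeaker):
        unmet.append("a")
    # b) and c) are permissions, so there is nothing to check for them.
    if not (c.external_renderer_api or c.common_renderer_api):
        unmet.append("d")
    if not c.diegetic_and_non_diegetic:
        unmet.append("e")
    if not c.latency_documented:
        unmet.append("f")
    if not c.loudness_documented:
        unmet.append("g")
    if not c.rendering_test_vs_reference:
        unmet.append("h")
    # i) is assumed to apply only when binaural rendering is offered.
    if c.binaural and not c.hrtf_interface:
        unmet.append("i")
    return unmet
```

For example, a loudspeaker-only renderer that exposes just the Common Renderer API and has no HRTF interface would pass this check, whereas a binaural renderer without an HRTF interface would be flagged on item i).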

4.5.1.4 Common Renderer API

The purpose of the Common Renderer API is to enable the use of an External Renderer that can support all VRStream media profiles. The Common Renderer API:

a) Is normative.

b) Has a standardized implementable description in 3GPP technical specifications or by reference.

4.5.1.5 External Renderer API

The purpose of the External Renderer API is to enable the use of an External Renderer. The External Renderer API:

a) Has a standardized implementable description in 3GPP technical specifications or by reference.

b) Provides the necessary information to connect a VRStream media profile with an External Renderer.

4.5.1.6 Rendering Test

The purpose of the Rendering Test is to characterize the Quality of Experience (QoE) when using the Reference Renderer or an External Renderer.

The Rendering Test:

a) Is defined in TS 26.259 [16] clause 6.

b) Characterizes media profile performance with the Reference Renderer or an External Renderer.

c) Assesses performance for multiple audio quality attributes and overall quality.