B.2 Interfaces

3GPP TS 26.118, Release 17: Virtual Reality (VR) profiles for streaming applications

B.2.1 Interface for Audio Data and Metadata

The example external binaural renderer has an interface for the input of un-rendered channels, objects, and HOA content and associated metadata. The syntax of this input interface follows the specification of the External Renderer Interface for MPEG-H 3D Audio to output un-rendered channels, objects, and HOA content and associated metadata according to clause 6.1.4.3.6.5.

The input PCM data of the channels and objects interfaces is provided through an input PCM buffer. The buffer first contains the signals carrying the PCM data of the channel content, followed by the signals carrying the PCM data of the un-rendered objects. Additional signals then carry the HOA data; their number is indicated in the HOA metadata via the HOA order (e.g. 16 signals for HOA order 3). The HOA audio data in the HOA interface is provided in the equivalent spatial domain (ESD) representation. The conversion from the HOA domain into the ESD representation and vice versa is described in ISO/IEC 23008-3 [19], Annex C.5.1.
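As an informative illustration of the signal ordering described above, the following minimal sketch computes which signal indices of the input PCM buffer belong to channels, objects, and HOA content. The names hoa_signal_count, BufferLayout, and pcm_buffer_layout are hypothetical and are not part of any specification. The sketch assumes a full 3D HOA representation with (N + 1)^2 signals for order N, which is consistent with the example of 16 signals for HOA order 3.

```python
from dataclasses import dataclass
from typing import Optional


def hoa_signal_count(hoa_order: int) -> int:
    """Signals of a full 3D HOA representation of order N: (N + 1)^2 (assumption)."""
    if hoa_order < 0:
        raise ValueError("HOA order must be non-negative")
    return (hoa_order + 1) ** 2


@dataclass(frozen=True)
class BufferLayout:
    """Hypothetical description of the signal index ranges in the PCM buffer."""
    channels: range
    objects: range
    hoa: range

    @property
    def total_signals(self) -> int:
        return len(self.channels) + len(self.objects) + len(self.hoa)


def pcm_buffer_layout(num_channels: int, num_objects: int,
                      hoa_order: Optional[int] = None) -> BufferLayout:
    """Channels come first, then un-rendered objects, then HOA signals."""
    hoa_count = 0 if hoa_order is None else hoa_signal_count(hoa_order)
    first_object = num_channels
    first_hoa = first_object + num_objects
    return BufferLayout(
        channels=range(0, first_object),
        objects=range(first_object, first_hoa),
        hoa=range(first_hoa, first_hoa + hoa_count),
    )


# Example: 2 channel signals, 3 object signals, HOA order 3.
layout = pcm_buffer_layout(num_channels=2, num_objects=3, hoa_order=3)
```

In this example the channel signals occupy indices 0 to 1, the object signals indices 2 to 4, and the HOA signals indices 5 to 20.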

The metadata for channels, objects, and HOA is received via the input interface once per frame. Its syntax is specified in mpegh3da_getChannelMetadata(), mpegh3da_getObjectAudioAndMetadata(), and mpegh3da_getHoaMetadata() respectively, see ISO/IEC 23008-3 [19], clause 17.10. The metadata and PCM data will be aligned to match each metadata element with the respective PCM frame.

B.2.2 Head Tracking Interface

The external binaural renderer receives scene displacement values (yaw, pitch, and roll), e.g. from an external head tracking device, via the head tracking interface. The syntax is specified in mpegh3daSceneDisplacementData() as defined in ISO/IEC 23008-3 [19], clause 17.9.3.
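As an informative illustration of how a renderer might use the received yaw, pitch, and roll values, the following minimal sketch builds a rotation matrix and applies its inverse to a source direction. The rotation order (yaw about z, then pitch about y, then roll about x), the axis convention (x forward, y left, z up), and the use of radians are assumptions made for this sketch only. The normative definition of the scene displacement data is given in ISO/IEC 23008-3 [19]. The function names are hypothetical.

```python
import math


def rotation_matrix(yaw, pitch, roll):
    """R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians (assumed convention)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def transpose(matrix):
    """Transpose of a 3x3 matrix; equal to the inverse for a rotation matrix."""
    return [list(column) for column in zip(*matrix)]


def rotate(matrix, vector):
    """Matrix-vector product for a 3x3 matrix and a 3-element vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Head turned by a yaw of 90 degrees (towards +y under the assumed axes).
head = rotation_matrix(math.pi / 2, 0.0, 0.0)

# A source straight ahead in the scene is compensated with the inverse rotation,
# so it ends up on the -y side relative to the turned head.
source_relative_to_head = rotate(transpose(head), [1.0, 0.0, 0.0])
```

Applying the inverse rotation to the source directions, rather than rotating the listener, keeps the HRIR set fixed in the head-related coordinate system.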

B.2.3 Interface for Head-Related Impulse Responses

An interface is provided to specify the set of HRIRs used for the binaural rendering. These directional FIR filters will be input using the SOFA (Spatially Oriented Format for Acoustics) file format according to AES-69 [21]. The SimpleFreeFieldHRIR convention will be used, where binaural filters are indexed by polar coordinates (azimuth φ in radians, elevation ϕ in radians, and radius r in meters) relative to the listener.
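As an informative illustration of indexing filters by polar coordinates, the following minimal sketch selects the measured direction closest to a requested direction. The present document does not specify how the renderer selects or interpolates HRIRs; nearest-neighbour selection is an assumption made for this sketch. The list of directions is a hypothetical stand-in for the positions read from a SOFA file, with azimuth and elevation in radians as stated above, and the function names are hypothetical.

```python
import math


def to_cartesian(azimuth, elevation, radius=1.0):
    """Convert polar coordinates (radians, meters) to Cartesian x, y, z."""
    return (
        radius * math.cos(elevation) * math.cos(azimuth),
        radius * math.cos(elevation) * math.sin(azimuth),
        radius * math.sin(elevation),
    )


def nearest_direction_index(directions, azimuth, elevation):
    """Index of the (azimuth, elevation) pair with the smallest angular distance."""
    target = to_cartesian(azimuth, elevation)

    def cos_angle(direction):
        # Dot product of unit vectors equals the cosine of the angle between them.
        x, y, z = to_cartesian(direction[0], direction[1])
        return x * target[0] + y * target[1] + z * target[2]

    return max(range(len(directions)), key=lambda i: cos_angle(directions[i]))


# Hypothetical measurement grid: front, left, back, right, above.
directions = [
    (0.0, 0.0),
    (math.pi / 2, 0.0),
    (math.pi, 0.0),
    (-math.pi / 2, 0.0),
    (0.0, math.pi / 2),
]

# A requested direction slightly in front of and above the left-hand position.
index = nearest_direction_index(directions, azimuth=1.4, elevation=0.1)
```

Comparing unit vectors by their dot product avoids the wrap-around problems that arise when azimuth values are compared directly.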