6 Test Methodologies for Immersive Audio Systems of TS 26.118 (Renderer Comparison Test)

3GPP TS 26.259, Release 17: Subjective test methodologies for the evaluation of immersive audio systems


6.1 Introduction

This clause specifies the Renderer Comparison Test for the audio profiles in TS 26.118. The Renderer Comparison Test is loosely inspired by the Comparison Category Rating test paradigm described in [5] Annex E.

6.2 Experimental Design

In the Renderer Comparison Test, the assessors compare a Test Condition against Anchor Conditions on four audio quality Attributes. The presentation of the Test and Anchor Conditions is binaural using head-tracking. For each trial, the Test Condition is compared to one of the Anchor Conditions as an A versus B comparison. To control for possible presentation-order bias, the Test Condition shall be presented to the assessors as sample A in exactly half of the trials. The test shall be conducted with 12 Test Materials and two Anchors for a total of 24 trials (comparisons).

The test shall be divided into two sessions. The first session compares the Test Condition against the first Anchor and the second session compares the Test Condition against the second Anchor.
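The trial structure described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the methodology: the function name `build_trial_plan`, the anchor labels and the material names are hypothetical. Randomizing the material order within a session and balancing the A/B assignment within each session (rather than only across all 24 trials) are assumptions made here.

```python
import random

def build_trial_plan(materials, anchors=("anchor_1", "anchor_2"), seed=0):
    """Return one list of trials per session, one session per Anchor.

    Assumption: the A/B balance is enforced within each session, which also
    guarantees that the Test Condition is sample A in exactly half of all trials.
    """
    if len(materials) % 2 != 0:
        raise ValueError("an even number of Test Materials is needed for an exact half split")
    rng = random.Random(seed)
    sessions = []
    for anchor in anchors:
        half = len(materials) // 2
        test_is_a = [True] * half + [False] * half
        rng.shuffle(test_is_a)   # which trials present the Test Condition as A
        order = list(materials)
        rng.shuffle(order)       # assumed: randomized material order per session
        trials = []
        for material, flag in zip(order, test_is_a):
            sample_a, sample_b = ("test", anchor) if flag else (anchor, "test")
            trials.append({"material": material, "anchor": anchor,
                           "A": sample_a, "B": sample_b})
        sessions.append(trials)
    return sessions

materials = [f"material_{i:02d}" for i in range(1, 13)]  # 12 hypothetical names
plan = build_trial_plan(materials)
total_trials = sum(len(session) for session in plan)
test_as_a = sum(t["A"] == "test" for session in plan for t in session)
```

With 12 Test Materials and two Anchors, `total_trials` is 24 and `test_as_a` is 12.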

6.3 Selection of Assessors

The selection of assessors shall follow the guidelines in [2] clause 4.1. Only experienced assessors shall participate in the experiment and the test administrator shall employ pre-screening according to [2] clause 4.1. The final test results shall include assessments from at least 12 experienced assessors that have passed pre-screening.

NOTE 1: Post-screening methods for this test are for further study. In the event post-screening is performed, the test report will describe the method adopted.


6.4 Test Materials

The Renderer Comparison Test shall use critical audio materials representing typical virtual reality content, with a duration longer than 6 s and no longer than 12 s. The Renderer Comparison Test shall include 4 channel-based, 4 object-based and 4 scene-based Test Materials. In the event a Test Material is a hybrid format, the primary category to which the Test Material belongs (channel-based, object-based or scene-based) shall be indicated in the test report.
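A test administrator could check a candidate set of Test Materials against these constraints with a small script. The sketch below is illustrative only: the function name `check_materials`, the tuple layout and the category labels `channel`, `object` and `scene` are assumptions, and hybrid materials are represented by their primary category.

```python
from collections import Counter

# Required number of Test Materials per primary category (from the text above).
REQUIRED_PER_CATEGORY = {"channel": 4, "object": 4, "scene": 4}

def check_materials(materials):
    """Return a list of problems for (name, primary_category, duration_s) tuples."""
    problems = []
    for name, category, duration_s in materials:
        # Duration must be longer than 6 s and no longer than 12 s.
        if not (6.0 < duration_s <= 12.0):
            problems.append(f"{name}: duration {duration_s} s is outside (6, 12] s")
        if category not in REQUIRED_PER_CATEGORY:
            problems.append(f"{name}: unknown primary category '{category}'")
    counts = Counter(category for _, category, _ in materials)
    for category, required in REQUIRED_PER_CATEGORY.items():
        if counts.get(category, 0) != required:
            problems.append(
                f"{category}: found {counts.get(category, 0)} materials, need {required}"
            )
    return problems

# Hypothetical material set: 4 items per category, all 10 s long.
example_set = [(f"{cat}_{i}", cat, 10.0)
               for cat in REQUIRED_PER_CATEGORY for i in range(4)]
example_problems = check_materials(example_set)
```

An empty list means the set satisfies the count and duration constraints; whether the materials are "critical" still has to be judged by the test administrator.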

6.5 Content Presentation

The Test Administration Platform shall employ a Graphical User Interface (GUI) to present the Test and Reference Conditions to the assessors as A/B samples within trials. The following are constraints on the GUI design:

1) The GUI shall have "A" and "B" switch buttons which allow the assessor to seamlessly switch the audio presentation between the A and B samples for comparison.

2) The GUI shall have a "Play" button which enables Time-Synchronized Playback of the A and B samples. Within a trial, one of the samples is a bit-stream for the Test Condition and the other sample is one of the Anchor Conditions.

3) The GUI shall have a "Stop" button which enables stopping the Time-Synchronized Playback of the A and B samples.

4) The GUI shall present four Audio Quality Attributes for assessment: Timbre (TIM), Spatial (SPA), Artefacts (ART) and Basic Audio Quality (BAQ). In addition, the GUI shall present the possibility of comparing the Loudness (LOUD) of the A and B samples through an additional loudness scale.

5) The GUI shall have a "Loop" button which enables looping the Time-Synchronized playback of the A and B samples.

6) The GUI shall have a "Next" button which enables the assessor to proceed to the next trial in the experiment. For each trial, the GUI shall enable the "Next" button only after the assessments of TIM, SPA and BAQ have been completed. Because all source Test Materials are normalized for Listening Level according to clause 6.8 and the highest operating point.
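The enabling rule for the "Next" button in item 6 can be expressed as a simple predicate. This is a minimal sketch that takes the list of required Attributes literally as written in item 6 (TIM, SPA and BAQ). The function name `next_enabled`, the dictionary layout and the numeric scores are hypothetical, since the rating scale is not defined in this clause.

```python
# Attributes that item 6 requires before "Next" is enabled.
REQUIRED_FOR_NEXT = ("TIM", "SPA", "BAQ")

def next_enabled(ratings):
    """Return True once every required Attribute has a rating.

    `ratings` maps an Attribute name to a score, or to None if the assessor
    has not rated it yet. A score of 0 counts as a completed rating.
    """
    return all(ratings.get(name) is not None for name in REQUIRED_FOR_NEXT)

# Hypothetical state of one trial: SPA has not been rated yet.
in_progress = {"TIM": 1, "SPA": None, "ART": 0, "BAQ": 0, "LOUD": None}
completed = {"TIM": 1, "SPA": -1, "ART": None, "BAQ": 0, "LOUD": None}
```

Here `next_enabled(in_progress)` is False and `next_enabled(completed)` is True.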

In addition, the Test Administration Platform shall support a real-time implementation of the Audio Profile Renderer under test as well as a real-time implementation of the Anchor Conditions (see clause 6.9) with support for head-tracking.

Figure 1: Example of possible GUI for Renderer Comparison Test

6.6 Listening Environment

For each octave band, the maximum sound pressure level of the listening environment shall not exceed the levels in Table 1 (corresponding to an NR20 noise rating curve):

Table 1: Maximum Sound Pressure Level for Listening Environment

Octave Band centre frequency	31.5 Hz	62.5 Hz	125 Hz	250 Hz	500 Hz	1 kHz	2 kHz	4 kHz	8 kHz
Maximum Sound Pressure Level (dBSPL)	69	51	39	31	24	20	17	14	13
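Checking a measured background-noise spectrum against Table 1 is a band-by-band comparison. The following sketch assumes the measurement is available as one level per octave band. The function name `bands_exceeding_limit` and the measured values are hypothetical; the limits are copied from Table 1.

```python
# Table 1 limits: octave-band centre frequency (Hz) -> maximum level (dB SPL).
MAX_LEVEL_DB_SPL = {
    31.5: 69, 62.5: 51, 125: 39, 250: 31, 500: 24,
    1000: 20, 2000: 17, 4000: 14, 8000: 13,
}

def bands_exceeding_limit(measured_db_spl):
    """Return {band_hz: (measured, limit)} for every band above its limit.

    Raises KeyError if a band listed in Table 1 has not been measured.
    """
    return {
        band: (measured_db_spl[band], limit)
        for band, limit in MAX_LEVEL_DB_SPL.items()
        if measured_db_spl[band] > limit
    }

# Hypothetical room measurement: compliant except for the 1 kHz band.
measured = dict(MAX_LEVEL_DB_SPL)
measured[1000] = 22
violations = bands_exceeding_limit(measured)
```

A level equal to the limit passes, because the requirement is that the limit shall not be exceeded.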

6.7 Listening System

The listening system shall be headphone-based with head-tracking. Both the Test Conditions and Anchor/Reference Conditions shall be binauralized using a common HRTF set. The binauralization shall use either individualized HRTFs or HRTFs based on a head and torso simulator (HATS). The choice of HRTF set shall be indicated in the test report. The headphones shall be equalized. If individualized HRTFs are used, the headphones shall have individualized equalization. If HATS HRTFs are used, the headphones shall be equalized for the same make/model of HATS.

6.8 Listening Level

The listening level is set according to [2] clause 8. The listening level is adjusted using channel-based content.

6.9 Anchor/Reference Conditions

All Renderer Comparison Tests shall include two Anchor/Reference Conditions. The two Anchors correspond to two configurations of a Common Informative Binaural Rendering (CIBR) scheme (1st order and 3rd order). The CIBR:

1) Receives as an input a virtual loudspeaker representation, obtained using a Documented Loudspeaker Renderer, with speaker locations positioned according to an Equivalent Spatial Domain representation (ESD). The definition of Equivalent Spatial Domain can be found in TS 26.260 clause 4.1.1.

2) Converts the ESD representation to a 1st order or 3rd order B-format representation.

3) Performs rotation of the sound field, according to a motion sensor signal.

4) Binauralizes the audio signal for presentation.

NOTE: The Documented Loudspeaker Renderer is Vector Based Amplitude Panning (VBAP) (Pulkki).
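Step 3 of the CIBR can be illustrated for the simplest case, a yaw-only rotation of a 1st order B-format signal. This is a sketch under stated assumptions, not the CIBR implementation: it assumes the common axis convention (X front, Y left, Z up, azimuth positive counter-clockwise), handles head yaw only, and processes one sample at a time. The function name `rotate_foa_yaw` is hypothetical. Pitch, roll and the 3rd order case require full spherical-harmonic rotation matrices, which are not shown.

```python
import math

def rotate_foa_yaw(w, x, y, z, head_yaw_rad):
    """Counter-rotate one 1st order B-format sample for a given head yaw.

    Positive `head_yaw_rad` means the head turns to the left. The sound field
    is rotated by the opposite angle so that sources stay fixed in the room.
    W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    """
    c = math.cos(-head_yaw_rad)
    s = math.sin(-head_yaw_rad)
    return w, c * x - s * y, s * x + c * y, z

# A source directly to the left (X = 0, Y = 1) with the head turned 90 degrees
# to the left should end up in front of the listener (X close to 1, Y close to 0).
w_r, x_r, y_r, z_r = rotate_foa_yaw(1.0, 0.0, 1.0, 0.0, math.pi / 2)
```

Because a yaw rotation only mixes X and Y, the result does not depend on the normalization convention used for W.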

A block diagram of the rendering systems for Anchor Conditions is illustrated in Figure 2.

Figure 2: Block Diagram for Anchor Conditions

6.10 Test Conditions

The Renderer Comparison Test shall assess only one Test Condition per experiment. For this Test Condition, the Audio Profile shall be configured for an Operating Point that provides transparent quality for all Test Materials. In addition, the Audio Profile shall be configured to operate with its Reference Renderer. For all Test Materials, the Test Condition shall be assessed against the two Anchor Conditions.

6.11 Attributes

The Renderer Comparison Test shall assess the four Audio Quality Attributes: Timbre (TIM), Spatial (SPA), Artefacts (ART) and Basic Audio Quality (BAQ). In addition, the Renderer Comparison Test compares any residual Loudness (LOUD) difference between A and B samples through an additional loudness scale.

6.12 Test Report and Presentation of Results

The Test Report shall provide the Mean and 95 % Confidence Intervals (t-distribution) for the Test Condition against each of the Anchor Conditions. All results provided shall be post-screened results.
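The Mean and 95 % Confidence Interval based on the t-distribution can be computed as mean ± t × s / √n, where s is the sample standard deviation, n the number of scores and t the two-sided 95 % critical value for n - 1 degrees of freedom. The sketch below is illustrative: the function name `mean_ci95` and the example scores are hypothetical, and the grouping of scores (for example one score per assessor for a given Attribute and Anchor) is an assumption. Only a few critical values are tabulated here; a statistics library can supply the value for any sample size.

```python
import math
import statistics

# Two-sided 95 % critical values of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131}

def mean_ci95(scores):
    """Return (mean, lower, upper) for a 95 % t-based confidence interval."""
    n = len(scores)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no tabulated t value for {df} degrees of freedom")
    mean = statistics.mean(scores)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    half_width = T_CRIT_95[df] * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Hypothetical post-screened scores from 12 assessors for one Attribute
# and one Anchor Condition.
scores = [1, 0, 1, -1, 0, 1, 2, 0, 1, 0, -1, 1]
mean, lower, upper = mean_ci95(scores)
```

If the interval excludes zero, the mean difference between the Test Condition and that Anchor is significant at the 5 % level for that group of scores.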