A.1 Decoder Test

3GPP TS 26.444, Codec for Enhanced Voice Services (EVS); Test sequences (Release 17)

A.1.1 General Considerations

The reference PCM signals are taken from the decoded floating-point test sequences of this specification. The PCM signals under test are obtained by running the floating-point bit-streams included in this specification through the Decoder under Test (Figure A.1). The reference decoder is the floating-point code of TS 26.443 [8].

Figure A.1: Flow diagram for the decoder test using signal-based metrics

All metrics are calculated on the reference PCM signal and the PCM signal under test using 20 ms frames. The frames of the two signals are time aligned; this means that the delay compensation in the EVS encoder and decoder remains ON (the default configuration). Furthermore, the frame processing is aligned with the encoded frames by adding the decoder delay. Table A.1 shows the delay values used for the different sampling frequencies.

Table A.1: Delay used for alignment of processing frames with encoded frames

| Sampling frequency | 8000 Hz | 16000 Hz | 32000 Hz | 48000 Hz |
|--------------------|---------|----------|----------|----------|
| Delay (samples)    | 10      | 37       | 74       | 111      |

The number of samples for a 20 ms frame is defined by $N = 0.02 \cdot f_s$, where $f_s$ represents the sampling rate in Hz.

The PCM signals $x(n)$ (reference) and $y(n)$ (under test) should be scaled between -1 and 1.
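As a rough illustration of the framing described above, the following Python sketch scales 16-bit PCM samples to the range [-1, 1) and cuts a signal into 20 ms frames offset by the delay of Table A.1. The function names, the division by 32768, and the choice to align frames by skipping the first delay samples are assumptions made for illustration; they are not taken from this specification.

```python
# Delay values from Table A.1, keyed by sampling frequency in Hz.
DELAY_SAMPLES = {8000: 10, 16000: 37, 32000: 74, 48000: 111}


def frame_length(fs_hz):
    """Number of samples in a 20 ms frame at sampling rate fs_hz."""
    return fs_hz * 20 // 1000


def scale_pcm16(samples):
    """Map 16-bit integer PCM samples to floats in [-1, 1) (assumed scaling)."""
    return [s / 32768.0 for s in samples]


def split_frames(signal, fs_hz):
    """Cut a signal into whole 20 ms frames, starting after the Table A.1 delay.

    Assumption: alignment is done by skipping the first `delay` samples;
    a trailing partial frame is dropped.
    """
    n = frame_length(fs_hz)
    start = DELAY_SAMPLES[fs_hz]
    return [signal[i:i + n] for i in range(start, len(signal) - n + 1, n)]


# Usage: one second of silence at 16 kHz.
frames = split_frames(scale_pcm16([0] * 16000), 16000)
print(frame_length(16000), len(frames))
```

The same two helpers would be applied to both the reference signal and the signal under test, so that frame i of one signal is compared with frame i of the other.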

A.1.2 Metrics

A.1.2.1 RMS Error Threshold

The RMS method is derived from the decoder conformance used in ISO/IEC 14496-26 [10]. The RMS error is calculated for each 20ms frame and compared to a threshold according to:

The RMS error threshold is chosen to correspond to a change in the last bit of the audio signal:

with
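The per-frame RMS check can be sketched in Python as follows. This is a minimal sketch assuming the usual definition of the RMS error (square root of the mean squared sample difference) on signals scaled to [-1, 1]. The function names are illustrative, and the threshold is passed in as a parameter rather than hard-coded.

```python
import math


def rms_error(ref_frame, test_frame):
    """RMS of the sample-wise difference between two equal-length frames."""
    if len(ref_frame) != len(test_frame) or not ref_frame:
        raise ValueError("frames must be non-empty and of equal length")
    sq_sum = sum((x - y) ** 2 for x, y in zip(ref_frame, test_frame))
    return math.sqrt(sq_sum / len(ref_frame))


def passes_rms(ref_frame, test_frame, threshold):
    """True if the frame's RMS error does not exceed the given threshold."""
    return rms_error(ref_frame, test_frame) <= threshold


# Usage: every sample differs by one 16-bit LSB (1/32768 on the [-1, 1] scale).
lsb = 1.0 / 32768.0
ref = [0.25] * 320
test = [0.25 + lsb] * 320
print(rms_error(ref, test))
```

In this example the RMS error equals one LSB, which gives a feel for the order of magnitude of a last-bit change on the normalized scale.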

A.1.2.2 Signal to Noise Ratio (SNR)

The segmental SNR method is derived from the decoder conformance used in ISO/IEC 14496-26 [10]. For each 20 ms segment, the following values need to be calculated:

Energy of reference signal:

Noise energy:

Signal to noise ratio with

As EVS is a switched codec containing an LPC-based speech coder and an MDCT-based transform coder, the SNR values vary significantly depending on the coding mode used. A constant SNR threshold is therefore not suitable; instead, a reference value per frame and test vector should be specified. The SNR should be compared against the thresholds by

$\mathrm{SNR}(i,j) \ge T_{snr}(i,j) - \mathrm{SNRHEADROOM}$

where $i$ is a 20 ms frame index and $j$ is the test vector index.

The set of SNR reference values is included in the zip file. This set was obtained using the reference implementations listed in clause A.4.
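The segmental SNR check for one frame can be sketched in Python as follows. The sketch assumes that the noise is the sample-wise difference between the reference signal and the signal under test, that the SNR is expressed in dB, and that a frame passes when its SNR is no more than SNRHEADROOM (Table A.3) below the per-frame reference value. The handling of zero energies and all function names are illustrative assumptions.

```python
import math

SNR_HEADROOM_DB = 3.0  # SNRHEADROOM from Table A.3


def frame_snr_db(ref_frame, test_frame):
    """SNR in dB of one frame: reference energy over energy of the difference."""
    signal_energy = sum(x * x for x in ref_frame)
    noise_energy = sum((x - y) ** 2 for x, y in zip(ref_frame, test_frame))
    if noise_energy == 0.0:
        return math.inf   # identical frames (assumed convention)
    if signal_energy == 0.0:
        return -math.inf  # silent reference with non-zero noise (assumed convention)
    return 10.0 * math.log10(signal_energy / noise_energy)


def passes_snr(snr_db, reference_snr_db, headroom_db=SNR_HEADROOM_DB):
    """Assumed comparison: SNR may fall below the reference by at most the headroom."""
    return snr_db >= reference_snr_db - headroom_db


# Usage with a hypothetical reference value of 40 dB for this frame.
ref = [0.5, -0.5, 0.5, -0.5]
test = [0.505, -0.5, 0.5, -0.5]
snr = frame_snr_db(ref, test)
print(round(snr, 2), passes_snr(snr, 40.0))
```

In practice the reference value for each frame and test vector would be read from the set of SNR reference values supplied with the test sequences.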

A.1.2.3 Spectral Distortion

The spectral distortion method can be conducted on a 20 ms frame basis by the following steps:

Calculate the absolute FFT spectrum of the reference signal $x(n)$ and the signal under test $y(n)$ using a Hanning window

with

The factor 32768 is due to MATLAB scaling and aligns the values with the 16-bit PCM C code. This scaling depends on the input value range.

For all spectral bins the distortion d is calculated according to the following pseudo code:

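% X(k), Y(k): magnitude spectra of the reference signal and the signal under test
cnt = 0;
d = 0;
for k = 1..N/2-1
    if (X(k)==0 && Y(k)==0)
        % both bins are empty: no distortion
        X_Y = 1;
        Y_X = 1;
    else
        if (X(k)==0)
            X_Y = 0;
            Y_X = 2;
        else if (Y(k)==0)
            X_Y = 2;
            Y_X = 0;
        else
            % power ratios of the two spectra
            X_Y = (X(k) * X(k)) / (Y(k) * Y(k));
            Y_X = (Y(k) * Y(k)) / (X(k) * X(k));
        end
    end
    COSH = (X_Y + Y_X - 2)/2;
    d = d + COSH;
    cnt = cnt + 1;
end
d = d/cnt;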
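The pseudo code above can be mirrored in Python as follows. The function cosh_distortion follows the pseudo code bin by bin. The helper magnitude_spectrum is only a stand-in for the spectrum calculation: it assumes a symmetric Hann window, a scaling of the windowed samples by 32768, and uses a naive DFT to stay free of external libraries. Both function names are illustrative.

```python
import cmath
import math


def magnitude_spectrum(frame, scale=32768.0):
    """Magnitude of the DFT of a Hann-windowed, scaled frame (naive O(N^2) DFT).

    Assumptions: symmetric Hann window and scaling by 32768 before the transform.
    """
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    windowed = [scale * w * s for w, s in zip(window, frame)]
    spectrum = []
    for k in range(n // 2):
        acc = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(windowed))
        spectrum.append(abs(acc))
    return spectrum


def cosh_distortion(spec_x, spec_y, n):
    """Mean COSH distortion over bins k = 1 .. n/2 - 1, following the pseudo code."""
    total = 0.0
    count = 0
    for k in range(1, n // 2):
        x, y = spec_x[k], spec_y[k]
        if x == 0 and y == 0:
            x_y, y_x = 1.0, 1.0
        elif x == 0:
            x_y, y_x = 0.0, 2.0
        elif y == 0:
            x_y, y_x = 2.0, 0.0
        else:
            x_y = (x * x) / (y * y)
            y_x = (y * y) / (x * x)
        total += (x_y + y_x - 2.0) / 2.0
        count += 1
    return total / count


# Usage: a short 16-sample frame compared with a slightly attenuated copy.
ref = [math.sin(2.0 * math.pi * 3 * i / 16) for i in range(16)]
test = [0.9 * s for s in ref]
d = cosh_distortion(magnitude_spectrum(ref), magnitude_spectrum(test), len(ref))
print(d)
```

Identical spectra give a distortion of zero, and a pure gain change gives the same COSH value in every bin, which makes both cases convenient sanity checks.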

The distortion value $d$ is compared against a threshold $T_{sd}$. The frame is considered as passed if

with

A.1.3 Analysis Flow and Reporting

The three metrics are computed in a specific order, as shown in Figure A.2. Once a frame passes a metric, processing of that frame stops and the next frame is analysed. The SNR metric is computed only on the frames failing the RMS error criterion. Similarly, the Spectral Distortion metric is computed only on the frames failing the SNR criterion.

Figure A.2: Flow chart for decoder tool

In a given file, one or two frames may slightly exceed the threshold. To avoid relaxing the threshold, a constraint on the number of failing frames per file has been added as an additional criterion.

If number_of_frames_failing <= THRESH_GOOD_FRAMES_TO_PASS * number_of_frames_in_file, the test signal will be considered equivalent to the reference signal.

All the test sequences need to pass for the implementation to be conformant.
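The cascade and the file-level criterion described above can be sketched in Python as follows. The per-metric checks are passed in as callables so that later metrics are evaluated only when earlier ones fail. The function names and the stand-in checks in the usage example are hypothetical; only the value of THRESH_GOOD_FRAMES_TO_PASS is taken from Table A.3.

```python
THRESH_GOOD_FRAMES_TO_PASS = 0.005  # from Table A.3


def frame_passes(frame, metric_checks):
    """Run the metric checks in order and stop at the first one that passes.

    `metric_checks` is an ordered list of (name, check) pairs, where `check`
    is a hypothetical callable returning True when the frame passes that metric.
    Returns the name of the passing metric, or None if every metric fails.
    """
    for name, check in metric_checks:
        if check(frame):
            return name
    return None


def file_passes(num_failing, num_frames, factor=THRESH_GOOD_FRAMES_TO_PASS):
    """File-level criterion: failing frames may not exceed factor * total frames."""
    return num_failing <= factor * num_frames


# Usage with stand-in checks; real checks would compute RMS, SNR and distortion.
checks = [
    ("RMS", lambda f: f["rms_ok"]),
    ("SNR", lambda f: f["snr_ok"]),
    ("SD", lambda f: f["sd_ok"]),
]
frames = [
    {"rms_ok": True, "snr_ok": False, "sd_ok": False},
    {"rms_ok": False, "snr_ok": True, "sd_ok": False},
    {"rms_ok": False, "snr_ok": False, "sd_ok": False},
]
results = [frame_passes(f, checks) for f in frames]
num_failing = results.count(None)
print(results, file_passes(num_failing, len(frames)))
```

With a factor of 0.005, a file of 1000 frames tolerates at most five failing frames, while the three-frame toy file above fails as soon as one frame fails all metrics.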

In addition to the number of fail/pass test sequences, the statistics from the three methods should be displayed. Table A.2 shows an example of reporting.

Table A.2: Template for result presentation

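|                          | RMS | WSNR | Spectral Distortion |
|--------------------------|-----|------|---------------------|
| Number of frames tested  |     |      |                     |
| Number of frames passing |     |      |                     |
| Number of frames failing |     |      |                     |
| Ratio of frames passing  |     |      |                     |
| Ratio of frames failing  |     |      |                     |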

As part of the conformance criteria, thresholds are set for the ratio of frames passing the RMS and WSNR tests (Ratio_RMSframespassing and RatioWSNRframespassing respectively).

The thresholds used in the decoder test are summarized in Table A.3.

Table A.3: List of thresholds

| Threshold                  | Description                                                          | Value |
|----------------------------|----------------------------------------------------------------------|-------|
| SNRHEADROOM                | Headroom compared to the Tsnr threshold                              | 3 dB  |
| CDSNRMAX                   | Limit of SNR for the spectral distortion test                        | 0 dB  |
| CDSNRHEADROOM              | Headroom compared to the Tsnr threshold for the spectral distortion test | 10 dB |
| Tsd                        | Threshold for the spectral distance                                  | 6.6   |
| THRESH_GOOD_FRAMES_TO_PASS | Factor for the number of failing frames per file                     | 0.005 |
| Ratio_RMSframespassing     | Minimal percentage of frames passing the RMS error test              | 47%   |
| RatioWSNRframespassing     | Minimal percentage of frames passing the WSNR test                   | 95%   |
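The reporting of Table A.2 and the ratio thresholds of Table A.3 can be sketched in Python as follows. The sketch assumes that each ratio is computed over the frames tested with that metric and that a ratio meets its criterion when it is at least the minimal percentage. The function names and the frame counts in the usage example are made up for illustration.

```python
RATIO_RMS_FRAMES_PASSING = 0.47   # Ratio_RMSframespassing (Table A.3)
RATIO_WSNR_FRAMES_PASSING = 0.95  # RatioWSNRframespassing (Table A.3)


def metric_column(tested, passing):
    """Counts and ratios for one metric column of the Table A.2 template."""
    failing = tested - passing
    ratio_passing = passing / tested if tested else 0.0
    ratio_failing = failing / tested if tested else 0.0
    return {
        "tested": tested,
        "passing": passing,
        "failing": failing,
        "ratio_passing": ratio_passing,
        "ratio_failing": ratio_failing,
    }


def meets_ratio(column, minimum_ratio):
    """Assumed check: the passing ratio must reach the minimum from Table A.3."""
    return column["ratio_passing"] >= minimum_ratio


# Usage with made-up frame counts (not measured results).
rms = metric_column(tested=1000, passing=600)
wsnr = metric_column(tested=400, passing=390)
print(meets_ratio(rms, RATIO_RMS_FRAMES_PASSING),
      meets_ratio(wsnr, RATIO_WSNR_FRAMES_PASSING))
```

Here the WSNR column is filled with the 400 frames that failed the RMS test, which reflects the cascade order of clause A.1.3 under the stated assumption.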