5.4.5 Guided concealment and recovery

3GPP TS 26.447, Release 17: Codec for Enhanced Voice Services (EVS); Error concealment of lost packets

5.4.5.1 Transmission of the synthesis class

Instead of performing the signal classification in the decoder (see clause 5.1.2 Signal classification), the synthesis class is transmitted in the bitstream for the rates 48, 96 and 128 kbps.

5.4.5.2 Transmission of the LTP pitch lag

Although no LTP post-processing is performed for the rates 96 and 128 kbps, the LTP pitch lag is transmitted in the bitstream so that the decoder concealment modules that depend on this lag can operate reliably.

5.4.5.3 Transmission of a voicing indicator

A flag is used in the decoder for the concealment method described in clause 5.4.3.6, to adapt several parameters (pitch search range, sinusoid selection, noise level to be re-injected).

At the encoder, the flag is set to 1 when the current frame is classified as GENERIC or VOICED, otherwise the flag is set to 0 (for all other signal classes: UNVOICED, TRANSIENT, INACTIVE, AUDIO).
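The encoder-side rule above can be sketched as follows. This is a minimal illustration, not the codec's own code: the `SignalClass` enum and the `voicing_flag` function are hypothetical names introduced here, and only the class names and the 1/0 mapping come from the text.

```python
from enum import Enum


class SignalClass(Enum):
    """Signal classes named in this clause (enum itself is hypothetical)."""
    UNVOICED = "UNVOICED"
    VOICED = "VOICED"
    GENERIC = "GENERIC"
    TRANSIENT = "TRANSIENT"
    INACTIVE = "INACTIVE"
    AUDIO = "AUDIO"


def voicing_flag(signal_class: SignalClass) -> int:
    """Return 1 for GENERIC or VOICED frames, 0 for every other class."""
    if signal_class in (SignalClass.GENERIC, SignalClass.VOICED):
        return 1
    return 0
```

For example, `voicing_flag(SignalClass.TRANSIENT)` evaluates to 0, so the decoder would adapt its pitch search range, sinusoid selection and noise level for a non-voiced frame.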

5.4.5.3a Transmission of a tonality flag

A flag indicating whether the frame is tonal (1) or non-tonal (0) is transmitted in the bitstream for the rates 48, 96 and 128 kbps. It is used in the decision criterion for selecting non-tonal concealment with waveform adjustment.

5.4.5.4 ACELP to MDCT mode recovery

For the ACELP concealment at 9.6, 16.4 and 24.4 kbps, as well as for the TCX time domain concealment, an additional segment (half frame) of signal is generated by predictive decoding from the previous frame and stored in a temporary buffer. This additional segment is used to recover from the loss of a transition frame from ACELP to MDCT (HQ MDCT or non-transition TCX 20). It is generated in advance, before the next frame is received and therefore without knowing whether the transition frame will be lost. The extra complexity of generating this additional segment has been found to lie outside the critical complexity path of the EVS decoder, so this extra processing does not affect worst-case decoding complexity.

To create this additional segment, the parameters used to generate the half frame of signal are predicted based on the parameters in the previous frame. The bit indicating the sampling frequency of the ACELP core is implicitly repeated; the excitation decoding done in the previous ACELP frame is extended in the additional segment.

When the current frame is MDCT (either HQ MDCT or non-transition TCX 20), the additional half-frame segment is overlap-added to the MDCT frame decoded in the current frame, using a symmetric sine window of length 8.75 milliseconds.
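The overlap-add step can be illustrated with a short sketch. Several points here are assumptions rather than statements from this clause: the function names, the alignment of the overlap region with the start of the frame, and the use of amplitude-complementary squared-sine weights for the crossfade. The text specifies only a symmetric sine window of 8.75 milliseconds; the normative windowing is defined by the codec itself.

```python
import math


def overlap_samples(duration_ms: float, sample_rate_hz: int) -> int:
    """Convert a window duration in milliseconds to a sample count."""
    return round(duration_ms * sample_rate_hz / 1000)


def sine_crossfade(extra_segment, mdct_frame, overlap_len):
    """Crossfade from the predicted segment into the decoded MDCT frame.

    Assumption: the incoming weight is sin^2 and the outgoing weight is
    1 - sin^2, so the two weights sum to one at every sample.
    """
    if overlap_len > len(extra_segment) or overlap_len > len(mdct_frame):
        raise ValueError("overlap region exceeds the available samples")
    out = list(mdct_frame)
    for n in range(overlap_len):
        w_in = math.sin(math.pi * (n + 0.5) / (2 * overlap_len)) ** 2
        out[n] = (1.0 - w_in) * extra_segment[n] + w_in * mdct_frame[n]
    return out
```

Under these assumptions, an 8.75 ms window corresponds to 140 samples at a 16 kHz output rate and 420 samples at 48 kHz. Samples after the overlap region are taken from the MDCT frame unchanged.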

For the recovery in the TCX20 transition frame (TCX20 after lost ACELP frame):

– If the ACELP PLC was used, the same ACELP to TCX transition is applied as if the previous ACELP frame had been received, but with the samples of the past frame replaced by those of the concealed frame.

– If the TD TCX PLC was used, the additional half frame constructed in the TD TCX PLC is overlap-added to the transition TCX frame, using HALF_OVERLAP (the symmetric sine window of length 3.75 milliseconds).

5.4.5.5 Recovery after TCX MDCT concealment

During recovery after TCX MDCT concealment that applies the background level fading described in clause 5.4.6.1.3.1, the overlap-add buffer is rescaled by multiplying each of its elements by the latest target background noise level (see equation 109).
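The rescaling amounts to an element-wise multiplication, as in the sketch below. The function and parameter names are hypothetical, and the target background noise level is taken as a given input; its computation (equation 109) is not reproduced here.

```python
def rescale_ola_buffer(ola_buffer, target_bg_level):
    """Scale every overlap-add sample by the latest target background level."""
    return [sample * target_bg_level for sample in ola_buffer]
```

For instance, a target level of 0.5 halves every sample held in the overlap-add buffer before it is combined with the first correctly received frame.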