5 Functions on the transmit (TX) side

3GPP TS 26.092, Release 17: Adaptive Multi-Rate (AMR) speech codec; Comfort noise aspects; Mandatory speech CODEC speech processing functions

The comfort noise evaluation algorithm uses the following parameters of the AMR speech encoder, defined in [2]:

– the unquantized Linear Prediction (LP) parameters, using the Line Spectral Pair (LSP) representation, where the unquantized Line Spectral Frequency (LSF) vector is given by $\mathbf{f}^{T} = [f_1\; f_2\; \ldots\; f_{10}]$;

– the unquantized LSF vector for the 12.2 kbit/s mode is given by the second set of LSF parameters in the frame.

The algorithm computes the following parameters to assist in comfort noise generation:

– the averaged LSF parameter vector (average of the LSF parameters of the eight most recent frames);

– the averaged logarithmic frame energy (average of the logarithmic energy of the eight most recent frames).

These parameters give information on the level (the averaged logarithmic frame energy) and the spectrum (the averaged LSF parameter vector) of the background noise.
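The two averages listed above can be sketched as a sliding window over the eight most recent frames. This is a minimal illustration, not the reference implementation: the class name `ComfortNoiseAverager` and its methods are hypothetical, plain floating-point arithmetic is assumed instead of the codec's fixed-point arithmetic, and the LSF vectors and logarithmic frame energies are assumed to be supplied by the encoder.

```python
from collections import deque

WINDOW = 8  # eight most recent frames, as stated in the text


class ComfortNoiseAverager:
    """Hypothetical helper: averages LSFs and log energies over a sliding window."""

    def __init__(self, window=WINDOW):
        # deque(maxlen=...) silently drops the oldest frame once the window is full
        self.lsf_history = deque(maxlen=window)
        self.energy_history = deque(maxlen=window)

    def push(self, lsf, log_energy):
        """Store the unquantized LSF vector and log energy of one frame."""
        self.lsf_history.append(list(lsf))
        self.energy_history.append(float(log_energy))

    def averaged_lsf(self):
        """Element-wise mean of the stored LSF vectors."""
        if not self.lsf_history:
            raise ValueError("no frames stored yet")
        n = len(self.lsf_history)
        return [sum(column) / n for column in zip(*self.lsf_history)]

    def averaged_log_energy(self):
        """Mean of the stored logarithmic frame energies."""
        if not self.energy_history:
            raise ValueError("no frames stored yet")
        return sum(self.energy_history) / len(self.energy_history)
```

Once more than eight frames have been pushed, only the last eight contribute to either average, which matches the window length given in the list above.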

The evaluated comfort noise parameters (the averaged LSF parameter vector and the averaged logarithmic frame energy) are encoded into a special frame, called a Silence Descriptor (SID) frame, for transmission to the RX side.

Hangover logic is used to enhance the quality of the silence descriptor frames. A hangover of seven frames is added to the VAD flag, so that the coder delays the switch from active to inactive mode by seven frames. During that time the decoder can compute a silence descriptor frame from the quantized LSFs and the logarithmic frame energy of the decoded speech signal. Therefore, no comfort noise description is transmitted in the first SID frame after active speech. If the background noise contains transients that cause the coder to switch to active mode and then back to inactive mode within a very short time period, no hangover is used. Instead, the previously used comfort noise frames are used for comfort noise generation.
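The hangover behaviour described above can be sketched as a small per-frame state machine. This is an illustration under stated assumptions only: the function `apply_hangover` is hypothetical, and so is the parameter `short_burst_frames`, which stands in for the "very short time period" that this clause does not quantify. The sketch decides on burst length alone; the normative criterion may differ.

```python
HANGOVER_FRAMES = 7  # hangover length stated in the text


def apply_hangover(vad_flags, short_burst_frames, hangover=HANGOVER_FRAMES):
    """Return, per frame, True if the coder stays in active (speech) mode.

    vad_flags          -- iterable of per-frame VAD decisions (truthy = active)
    short_burst_frames -- hypothetical threshold: bursts of at most this many
                          frames are treated as transients and earn no hangover
    """
    active = []
    burst_len = 0   # length of the current run of VAD-active frames
    remaining = 0   # hangover frames still to be coded as speech
    for vad in vad_flags:
        if vad:
            burst_len += 1
            # Only a sufficiently long burst arms the hangover.
            if burst_len > short_burst_frames:
                remaining = hangover
            active.append(True)
        else:
            burst_len = 0
            if remaining > 0:
                # VAD says inactive, but the hangover keeps the coder active.
                remaining -= 1
                active.append(True)
            else:
                active.append(False)
    return active
```

For example, with `short_burst_frames=5`, a ten-frame burst is followed by seven further active frames, while a two-frame transient is followed immediately by inactive mode, in which case the receiver would reuse the previously used comfort noise frames.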

The first SID frame also serves to initiate the comfort noise generation on the receive side, as a first SID frame is always sent at the end of a speech burst, i.e., before the transmission is terminated.

The scheduling of SID or speech frames on the network path is described in [4].