6.1.3 Post-processing of Mono Low-Band signal
3GPP TS 26.290, Release 17: Audio codec processing functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions
In the low-frequency pitch enhancement, two-band decomposition is used and adaptive filtering is applied only to the lower band. This results in a total post-processing that is mostly targeted at frequencies near the first harmonics of the synthesized speech signal.
Figure 14: Block diagram of the low frequency pitch enhancer
Figure 14 shows the block diagram of the two-band pitch enhancer. In the higher branch, the decoded signal is filtered by a high-pass filter to produce the higher band signal $s_H$. In the lower branch, the decoded signal is first processed through an adaptive pitch enhancer and then filtered through a low-pass filter to obtain the lower band post-processed signal $s_{LEF}$. The post-processed decoded signal is obtained by adding the lower band post-processed signal and the higher band signal. The object of the pitch enhancer is to reduce the inter-harmonic noise in the decoded signal, which is achieved here by a time-varying linear filter with a transfer function
$$H_E(z) = (1-\alpha) + \frac{\alpha}{2}\,z^{T} + \frac{\alpha}{2}\,z^{-T}$$
and described by the following equation:
$$s_{LE}(n) = (1-\alpha)\,\hat{s}(n) + \frac{\alpha}{2}\,\hat{s}(n-T) + \frac{\alpha}{2}\,\hat{s}(n+T) \qquad (1)$$
where $\alpha$ is a coefficient that controls the inter-harmonic attenuation, T is the pitch period of the input signal $\hat{s}(n)$, and $s_{LE}(n)$ is the output signal of the pitch enhancer. Parameters T and $\alpha$ vary with time and are given by the pitch tracking module. The gain of the filter described by Equation (1) at frequency f is $1 - \alpha + \alpha\cos(2\pi f T)$. With a value of $\alpha = 0.5$, this gain is exactly 0 at frequencies 1/(2T), 3/(2T), 5/(2T), etc.; i.e. at the mid-points between the harmonic frequencies 1/T, 2/T, 3/T, etc. When $\alpha$ approaches 0, the attenuation between the harmonics produced by the filter of Equation (1) decreases.
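The behaviour of the pitch enhancer can be checked numerically. The following is a minimal Python sketch, assuming the symmetric three-tap form s_LE(n) = (1 - α)ŝ(n) + (α/2)[ŝ(n - T) + ŝ(n + T)] for Equation (1). The function names and the zero-padding at the buffer edges are illustrative assumptions, not part of the codec. Under this assumed form, the gain at a harmonic is 1 for any α, and the null midway between harmonics is reached at α = 0.5.

```python
import math


def pitch_enhance(s_hat, T, alpha):
    """Return s_LE(n) = (1 - alpha)*s(n) + (alpha/2)*(s(n - T) + s(n + T)).

    Samples outside the buffer are taken as zero. That boundary rule is an
    assumption of this sketch; a decoder would use history and look-ahead.
    """
    N = len(s_hat)

    def sample(i):
        return s_hat[i] if 0 <= i < N else 0.0

    return [
        (1.0 - alpha) * s_hat[n] + 0.5 * alpha * (sample(n - T) + sample(n + T))
        for n in range(N)
    ]


def enhancer_gain(f, T, alpha):
    """Real-valued gain of the filter at normalized frequency f (cycles/sample)."""
    return (1.0 - alpha) + alpha * math.cos(2.0 * math.pi * f * T)


T = 40
# Gain at the first harmonic (1/T) and midway between harmonics (1/(2T)).
gain_harmonic = enhancer_gain(1.0 / T, T, 0.5)
gain_midpoint = enhancer_gain(1.0 / (2 * T), T, 0.5)
```

A signal that is exactly periodic with period T passes through `pitch_enhance` unchanged away from the buffer edges, since ŝ(n - T) = ŝ(n + T) = ŝ(n) there.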
To confine the post-processing to the low-frequency region, the enhanced signal $s_{LE}$ is low-pass filtered to produce the signal $s_{LEF}$, which is added to the high-pass filtered signal $s_H$ to obtain the post-processed synthesis signal $s_E$.
A configuration equivalent to the one in Figure 14 is used here, which eliminates the need for high-pass filtering. This is explained as follows.
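Let $h_{LP}(n)$ be the impulse response of the low-pass filter and $h_{HP}(n)$ be the impulse response of the complementary high-pass filter, so that $h_{HP}(n) = \delta(n) - h_{LP}(n)$. The post-processed signal $s_E(n)$ is given by
$$\begin{aligned} s_E(n) &= \hat{s}(n) * h_{HP}(n) + s_{LE}(n) * h_{LP}(n) \\ &= \hat{s}(n) - \hat{s}(n) * h_{LP}(n) + s_{LE}(n) * h_{LP}(n) \\ &= \hat{s}(n) - \left(\hat{s}(n) - s_{LE}(n)\right) * h_{LP}(n) \\ &= \hat{s}(n) - \alpha\, e_{LT}(n) * h_{LP}(n) \end{aligned}$$
where $*$ denotes convolution and $e_{LT}(n) = \hat{s}(n) - 0.5\,\hat{s}(n-T) - 0.5\,\hat{s}(n+T)$ is the long-term prediction error signal. Thus, the post-processing is equivalent to subtracting the scaled low-pass filtered long-term error signal from the synthesis signal $\hat{s}(n)$. The transfer function of the long-term prediction filter is given by
$$P_{LT}(z) = 1 - 0.5\,z^{T} - 0.5\,z^{-T}$$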
The alternative post-processing configuration is depicted in Figure 15.
Figure 15: Implemented post-processing configuration
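The equivalence of the two configurations can be verified numerically. The sketch below is illustrative only: it assumes the enhancer has the form s_LE(n) = (1 - α)ŝ(n) + (α/2)[ŝ(n - T) + ŝ(n + T)], treats the filters as zero-phase (indexed about their centre tap), and uses a placeholder 25-tap low-pass that is not the codec's filter. All names are hypothetical.

```python
import math


def convolve_centered(x, h):
    """Zero-phase convolution: h has odd length and is indexed about its centre."""
    half = len(h) // 2
    N = len(x)
    y = []
    for n in range(N):
        acc = 0.0
        for k in range(-half, half + 1):
            if 0 <= n - k < N:
                acc += h[k + half] * x[n - k]
        y.append(acc)
    return y


def shifted(x, d):
    """Return x(n - d), with zeros outside the buffer."""
    N = len(x)
    return [x[n - d] if 0 <= n - d < N else 0.0 for n in range(N)]


# Placeholder symmetric low-pass (normalized Hann taps), NOT the codec's filter.
taps = [0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / 26) for k in range(25)]
h_lp = [t / sum(taps) for t in taps]
h_hp = [-t for t in h_lp]
h_hp[12] += 1.0  # complementary high-pass: delta(n) - h_LP(n)

T, alpha = 30, 0.4
s_hat = [math.sin(0.21 * n) + 0.3 * math.sin(1.7 * n + 0.5) for n in range(200)]
back, ahead = shifted(s_hat, T), shifted(s_hat, -T)

# Two-band form: high-passed synthesis plus low-passed enhanced signal.
s_le = [(1 - alpha) * s + 0.5 * alpha * (b + a)
        for s, b, a in zip(s_hat, back, ahead)]
two_band = [h + l for h, l in zip(convolve_centered(s_hat, h_hp),
                                  convolve_centered(s_le, h_lp))]

# Single-branch form: subtract the scaled, low-passed long-term error.
e_lt = [s - 0.5 * b - 0.5 * a for s, b, a in zip(s_hat, back, ahead)]
scaled_err = convolve_centered([alpha * e for e in e_lt], h_lp)
single_branch = [s - e for s, e in zip(s_hat, scaled_err)]

max_diff = max(abs(p - q) for p, q in zip(two_band, single_branch))
```

The two outputs agree to within floating-point rounding, so the single-branch form needs only one filtering operation (the low-pass) per sample block instead of two.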
The value T is given by the received closed-loop pitch lag in each subframe (the fractional pitch lag rounded to the nearest integer). A simple tracking procedure checks for pitch doubling: if the normalized pitch correlation at delay T/2 is larger than 0.95, then the value T/2 is used as the new pitch lag for post-processing.
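The pitch-doubling check can be sketched as follows. This is a hedged illustration: the text does not define the correlation window or how an odd T is halved, so the block length, the integer halving `T // 2`, and the function names are assumptions.

```python
import math


def normalized_correlation(syn, start, length, delay):
    """Normalized correlation between syn[n] and syn[n - delay] over one block.

    The caller must ensure start >= delay so that the delayed samples exist.
    """
    num = den_a = den_b = 0.0
    for n in range(start, start + length):
        a, b = syn[n], syn[n - delay]
        num += a * b
        den_a += a * a
        den_b += b * b
    den = math.sqrt(den_a * den_b)
    return num / den if den > 0.0 else 0.0


def postfilter_lag(syn, start, length, T, threshold=0.95):
    """Return T/2 when the half lag correlates strongly, otherwise T."""
    half = T // 2  # integer halving is an assumption of this sketch
    if half > 0 and normalized_correlation(syn, start, length, half) > threshold:
        return half
    return T
```

For example, if the decoder reports T = 40 for a signal whose true period is 20 samples, the correlation at delay 20 is close to 1 and the function returns 20. For a signal whose true period is 40, the half-lag correlation is low and T is kept.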
The factor $\alpha$ is given by
$$\alpha = 0.5\,g_p$$
constrained to $0 \le \alpha \le 0.5$,
where $g_p$ is the decoded pitch gain. Note that in TCX mode the value of $\alpha$ is set to zero.
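The computation of the factor can be sketched in a few lines. The scaling of the pitch gain by 0.5 and the clamping range [0, 0.5] are assumptions of this sketch, as are the function name and the `tcx_mode` flag.

```python
def enhancer_factor(pitch_gain, tcx_mode=False):
    """Return the inter-harmonic attenuation factor for one subframe.

    Assumes alpha = 0.5 * pitch_gain clamped to [0, 0.5]; in TCX mode the
    factor is zero, which disables the pitch enhancement.
    """
    if tcx_mode:
        return 0.0
    return min(max(0.5 * pitch_gain, 0.0), 0.5)
```

With a factor of zero the post-processed signal equals the synthesis signal, so TCX frames pass through the post-processing stage unmodified apart from any filter delay compensation.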
A linear phase FIR low-pass filter with 25 coefficients is used, with a cut-off frequency at 5Fs/256 kHz (the filter delay is 12 samples).
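The stated delay follows from the filter length: a symmetric (linear-phase) FIR filter with N taps delays the signal by (N - 1)/2 samples, which gives 12 samples for N = 25. The sketch below illustrates this with a Hamming-windowed sinc design. The design method is an assumption, since the codec's actual coefficients are not given here. The cut-off is passed in cycles per sample, and the value 5/256 assumes that Fs in the text is the sampling rate of the filtered signal.

```python
import math


def design_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is in cycles per sample (0..0.5)."""
    centre = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - centre
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]  # normalize to unity gain at DC


def group_delay_samples(taps):
    """Delay of a symmetric odd-length FIR filter: (N - 1) / 2 samples."""
    return (len(taps) - 1) // 2


h_lp = design_lowpass(25, 5.0 / 256.0)
delay = group_delay_samples(h_lp)
```

Because the delay is an integer number of samples, the unfiltered synthesis signal can be aligned with the low-pass output by a plain 12-sample shift before the subtraction.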