6 Example ECU/BFH Solution
3GPP TS 26.191, Release 17: Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Error concealment of erroneous or lost frames
6.1 State Machine
This example solution for substitution and muting is based on a state machine with seven states (Figure 1).
The system starts in state 0. Each time a bad frame is detected, the state counter is incremented by one and is saturated when it reaches 6. Each time a good speech frame is detected, the state counter is right-shifted by one. The state indicates the quality of the channel: the larger the value of the state counter, the worse the channel quality is. The control flow of the state machine can be described by the following C code (BFI = bad frame indicator, State = state variable):
if (BFI != 0) {
    State = State + 1;
    if (State > 6)
        State = 6;
} else {
    State = State >> 1;
}
In addition to this state machine, the Bad Frame Flag from the previous frame is checked (prevBFI). The processing depends on the value of the State-variable. In states 0 and 6, the processing depends on the BFI flag.
The procedure can be described as follows:
Figure 1: State machine for controlling the bad frame substitution
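The state update of clause 6.1 can be wrapped in a small function to trace how the state evolves over a sequence of frames. This is a minimal sketch; the function name `update_state` is illustrative and not taken from the specification.

```c
/* Update the concealment state for one received frame.
 * bfi != 0 marks a bad frame. The state saturates at 6 on bad frames
 * and is halved (right shift by one) on good frames. */
static int update_state(int state, int bfi)
{
    if (bfi != 0) {
        state = state + 1;
        if (state > 6) {
            state = 6;
        }
    } else {
        state = state >> 1;
    }
    return state;
}
```

Starting from state 0, eight consecutive bad frames give the states 1, 2, 3, 4, 5, 6, 6, 6. Three good frames afterwards give 3, 1, 0, so recovery from the worst state takes three good frames.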
6.2 Substitution and muting of erroneous/lost speech frames
6.2.1 BFI = 0, prevBFI = 0, State = 0 or 1
No error is detected in the received or in the previous received speech frame. The received speech parameters are used normally in the speech synthesis. The current frame of speech parameters is saved.
6.2.2 BFI = 0, prevBFI = 1, State = 0 to 3
No error is detected in the received speech frame, but the previous received speech frame was bad. The LTP gain is used normally in the speech synthesis, and the fixed codebook gain is limited so that it does not exceed the value used for the last received good subframe:
(1)
where
= current decoded fixed codebook-gain
= fixed codebook gain used for the last good subframe (BFI = 0)
= fixed codebook gain to be used for the current frame.
The rest of the received speech parameters are used normally in the speech synthesis. The current frame of speech parameters is saved.
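The limiting described in this clause can be sketched as follows, reading "limited below" as "not allowed to exceed". The function and parameter names are illustrative assumptions and do not come from the specification.

```c
/* Limit the decoded fixed codebook gain for a good frame that follows a
 * bad one. Assumption: the gain actually used is the smaller of the
 * decoded gain and the gain used for the last good subframe. */
static float limit_fixed_cb_gain(float decoded_gain, float last_good_gain)
{
    return (decoded_gain > last_good_gain) ? last_good_gain : decoded_gain;
}
```

A decoded gain of 2.0 with a last good gain of 1.5 is limited to 1.5, while a decoded gain of 1.0 passes through unchanged.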
6.2.3 BFI = 1, prevBFI = 0 or 1, State = 1…6
An error is detected in the received speech frame and the substitution and muting procedure is started.
6.2.3.1 LTP gain & fixed codebook gain concealment when RX_FRAMETYPE = SPEECH_BAD
The LTP gain and fixed codebook gain are replaced by attenuated values from the previous subframes:
(2)
(3)
where:
= current decoded LTP gain,
= current decoded fixed codebook gain,
= LTP gains used for the last 5 subframes,
= fixed codebook gains used for the last 5 subframes,
median5() = 5-point median operation,
Pp(state) = attenuation factor (Pp(1) = 0.98, Pp(2) = 0.96, Pp(3) = 0.75, Pp(4) = 0.23, Pp(5) = 0.05, Pp(6) = 0.01),
Pc(state) = attenuation factor (Pc(1) = 0.98, Pc(2) = 0.98, Pc(3) = 0.98, Pc(4) = 0.98, Pc(5) = 0.98, Pc(6) = 0.70),
state = state number {0..6},
VAD_HIST = number of consecutive VAD = 0 decisions.
The higher the state value is, the more the gains are attenuated. Also the memory of the predictive fixed codebook gain is updated by using the average value of the past four values in the memory:
(4)
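The building blocks named in this clause, the state-dependent attenuation factors and the 5-point median, can be sketched as below. This is a hedged illustration only: which reference gain is scaled (the last gain or the median of the last five) is determined by equations (2) and (3), so the sketch takes the reference gain as an argument. The table names, the function names and the unused entry at index 0 are assumptions.

```c
#include <string.h>

/* Attenuation factors for RX_FRAMETYPE = SPEECH_BAD, indexed by state 1..6.
 * Index 0 is a placeholder (assumption) so the state can index directly. */
static const float PP_BAD[7] = {1.0f, 0.98f, 0.96f, 0.75f, 0.23f, 0.05f, 0.01f};
static const float PC_BAD[7] = {1.0f, 0.98f, 0.98f, 0.98f, 0.98f, 0.98f, 0.70f};

/* 5-point median: copy the input, insertion-sort it, return the middle. */
static float median5(const float x[5])
{
    float s[5];
    memcpy(s, x, sizeof s);
    for (int i = 1; i < 5; i++) {
        float v = s[i];
        int j = i - 1;
        while (j >= 0 && s[j] > v) {
            s[j + 1] = s[j];
            j--;
        }
        s[j + 1] = v;
    }
    return s[2];
}

/* Scale a reference gain by the factor for the given state (1..6). */
static float attenuate(const float table[7], int state, float reference_gain)
{
    return table[state] * reference_gain;
}
```

With these tables the LTP gain factor drops sharply from state 4 onwards, whereas the fixed codebook gain factor stays at 0.98 until state 6.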
6.2.3.2 LTP gain & fixed codebook gain concealment when RX_FRAMETYPE = SPEECH_LOST
The LTP gain and fixed codebook gain are replaced by attenuated values from the previous subframes:
(5)
(6)
where:
= current decoded LTP gain,
= current decoded fixed codebook gain,
= LTP gains used for the last 5 subframes,
= fixed codebook gains used for the last 5 subframes,
median5() = 5-point median operation,
Pp(state) = attenuation factor (Pp(1) = 0.95, Pp(2) = 0.90, Pp(3) = 0.75, Pp(4) = 0.23, Pp(5) = 0.05, Pp(6) = 0.01),
Pc(state) = attenuation factor (Pc(1) = 0.50, Pc(2) = 0.25, Pc(3) = 0.25, Pc(4) = 0.25, Pc(5) = 0.15, Pc(6) = 0.01),
state = state number {0..6},
VAD_HIST = number of consecutive VAD = 0 decisions.
The higher the state value is, the more the gains are attenuated. Also the memory of the predictive fixed codebook gain is updated by using the average value of the past four values in the memory:
(7)
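The update of the predictive fixed codebook gain memory with the average of its past four values can be sketched as follows. The memory layout (index 0 is the most recent entry) and the shift-then-insert order are assumptions made for illustration; the function name is hypothetical.

```c
/* Replace the newest entry of the gain predictor memory by the average of
 * the four stored values, discarding the oldest one.
 * Assumed layout: mem[0] is the most recent value, mem[3] the oldest. */
static void update_gain_predictor_memory(float mem[4])
{
    float avg = 0.25f * (mem[0] + mem[1] + mem[2] + mem[3]);
    mem[3] = mem[2];
    mem[2] = mem[1];
    mem[1] = mem[0];
    mem[0] = avg;
}
```

For a memory holding 4, 8, 12 and 16, the average is 10, and the memory becomes 10, 4, 8, 12 after the update.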
6.2.3.3 ISF concealment
The past ISFs are shifted towards their partly adaptive mean:
i = 0..16 (8)
where
= 0.9,
is ISF-vector for a current frame,
is ISF-vector from the previous frame,
vector is combination of adaptive mean and constant mean ISF-vectors in the following manner:
, i = 0..16 (9)
where
= 0.75,
and is updated whenever BFI =0.
is a vector containing long time average of ISF-vectors.
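The shift of the past ISFs towards their mean can be sketched as a weighted sum of the previous-frame ISF vector and the mean vector. This sketch assumes that the factor 0.9 weights the previous-frame ISFs and that the remaining weight is applied to the mean. The function name, the parameter names and the explicit vector length `n` are illustrative assumptions.

```c
#include <stddef.h>

/* Move each ISF of the previous frame a fraction (1 - alpha) of the way
 * towards the corresponding mean value. With alpha = 0.9 the concealed
 * ISFs stay close to those of the previous frame. */
static void conceal_isf(float *isf_q, const float *past_isf_q,
                        const float *isf_mean, size_t n, float alpha)
{
    for (size_t i = 0; i < n; i++) {
        isf_q[i] = alpha * past_isf_q[i] + (1.0f - alpha) * isf_mean[i];
    }
}
```

For a previous value of 100 and a mean of 200, one concealed frame gives approximately 110, so repeated bad frames converge gradually towards the mean.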
6.2.3.4 LTP-lag concealment
The histories of the last five good LTP lags and LTP gains are used to find the best update method.
6.2.3.4.1 LTP-lag concealment when RX_FRAMETYPE = SPEECH_BAD
The usability of the received LTP lag is defined as follows (it predicts whether the received lag is most probably very close to the one that was sent, so that using it should not introduce bad artifacts):
(10)
where:
is LTP lag from the previous good frame,
,
,
,
is received lag,
,
is LTP gain of the current frame,
(-1) is LTP gain of the previous good frame,
(-2) is LTP gain of the frame before previous good frame,
The LTP lag value for the current frame is defined as follows:
(11)
where:
,
is second largest value in ,
is second largest value in ,
is random value generated to range
6.2.3.4.2 LTP-lag concealment when RX_FRAMETYPE = SPEECH_LOST
The usability of the LTP lag from the last good frame is defined as follows (it predicts whether that lag is most probably very close to the one that was sent, so that using it should not introduce bad artifacts):
(12)
where:
,
(n-1) is LTP gain of the previous good frame,
(n-2) is LTP gain of the frame before previous good frame
The LTP lag value for the current frame is defined as follows:
(13)
where:
is LTP lag from the previous good frame,
,
is second largest value in ,
is second largest value in ,
is random value generated to range
6.2.4 Innovation sequence
When RX_FRAMETYPE = SPEECH_BAD, the received fixed codebook innovation pulses from the erroneous frame are used as they are received.
When RX_FRAMETYPE = SPEECH_LOST, the received fixed codebook innovation pulses from the erroneous frame are not used and the fixed codebook innovation vector is filled with a random signal (values limited to the range [-1, +1]).
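Filling the innovation vector with a random signal in the range [-1, +1] for a lost frame can be sketched as follows. The use of the C library `rand()` as the noise source is an assumption for illustration, as are the function and parameter names; the example solution does not prescribe a particular generator.

```c
#include <stdlib.h>

/* Fill the innovation vector with pseudo-random values in [-1, +1].
 * rand() returns a value in [0, RAND_MAX], which is scaled to [0, 1]
 * and then mapped linearly onto [-1, +1]. */
static void fill_random_innovation(float *code, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        code[i] = 2.0f * ((float)rand() / (float)RAND_MAX) - 1.0f;
    }
}
```

A caller would typically invoke this once per subframe on a buffer of the subframe length.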
6.2.5 High-band gain (for 23.85 kbit/s mode)
When RX_FRAMETYPE = SPEECH_BAD or RX_FRAMETYPE = SPEECH_LOST, the received high-band energy parameter of the frame is not used, and the estimated high-band gain is used instead. This means that for bad or lost speech frames, the high-band reconstruction operates in the same way in all modes.
6.3 Substitution and muting of lost SID frames
In the speech decoder, a single frame classified as SID_BAD shall be substituted by the last valid SID frame information, and the procedure for valid SID frames shall be applied. If the time between SID information updates (updates are specified by SID_UPDATE arrivals and occasionally by SID_FIRST arrivals) is greater than one second, this shall lead to attenuation.
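The one-second rule can be tracked with a frame counter, since AMR-WB frames are 20 ms long and one second therefore corresponds to 50 frames. The following is a minimal sketch; the structure, the function name and the way an update is signalled are hypothetical, and the attenuation itself is not shown.

```c
/* One second of 20 ms frames. */
#define FRAMES_PER_SECOND 50

/* Hypothetical tracker for the time since the last SID information update. */
typedef struct {
    int frames_since_update;
} SidTracker;

/* Call once per frame. sid_update_received is nonzero when valid SID
 * information arrived in this frame. Returns nonzero when more than one
 * second has passed without an update, i.e. when attenuation applies. */
static int sid_tracker_step(SidTracker *t, int sid_update_received)
{
    if (sid_update_received) {
        t->frames_since_update = 0;
    } else {
        t->frames_since_update++;
    }
    return t->frames_since_update > FRAMES_PER_SECOND;
}
```

In this sketch the 50th frame without an update does not yet trigger attenuation. The 51st does, and a new update resets the counter.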
Annex A (informative):
Change history
Date | TSG SA# | TSG Doc. | CR | Rev | Subject/Comment | Old | New
03-2001 | 11 | SP-010086 | | | Version 2.0.0 produced for approval | | 5.0.0
03-2002 | 15 | SP-020083 | 001 | | Error concealment of high band gain in 23.85 kbit/s mode | 5.0.0 | 5.1.0
12-2004 | 26 | | | | Version for Release 6 | 5.1.0 | 6.0.0
06-2007 | 36 | | | | Version for Release 7 | 6.0.0 | 7.0.0
12-2008 | 42 | | | | Version for Release 8 | 7.0.0 | 8.0.0
12-2009 | 46 | | | | Version for Release 9 | 8.0.0 | 9.0.0
03-2011 | 51 | | | | Version for Release 10 | 9.0.0 | 10.0.0
09-2012 | 57 | | | | Version for Release 11 | 10.0.0 | 11.0.0
09-2014 | 65 | | | | Version for Release 12 | 11.0.0 | 12.0.0
12-2015 | 70 | | | | Version for Release 13 | 12.0.0 | 13.0.0

Date | Meeting | TDoc | CR | Rev | Cat | Subject/Comment | New version
2017-03 | 75 | | | | | Version for Release 14 | 14.0.0
2018-06 | 80 | | | | | Version for Release 15 | 15.0.0
2020-07 | – | – | – | – | – | Update to Rel-16 version (MCC) | 16.0.0
2022-04 | – | – | – | – | – | Update to Rel-17 version (MCC) | 17.0.0