5.4 Switching of Coding Modes

3GPP TS 26.445, Codec for Enhanced Voice Services (EVS); Detailed algorithmic description (Release 15)

5.4.1 General description

As described in subclauses 5.2 and 5.3, the EVS codec supports a CELP coding mode as well as an MDCT coding mode. The transitions between the two within the same bit rate and audio bandwidth are described in 5.4.2 and 5.4.3.

The CELP or LP-based coding mode can operate at different sampling rates depending on the frame configuration. The handling of sampling rate changes during the encoding process is described in 5.4.4.

The switching between primary and AMR-WB IO modes is described in 5.4.5.

The handling of transitions in the context of bit rate switches is described in 5.4.6.

5.4.2 MDCT coding mode to CELP coding mode

When a CELP encoded frame is preceded by an MDCT based encoded frame, the memories of the CELP encoded frame have to be updated before starting the encoding of the CELP frame. These memories include:

  • Adaptive codebook memory
  • LPC synthesis filter memory
  • Weighting filter denominator memory (used to compute the target signal)
  • The factor of the tilt part of the innovative codebook pre-filter
  • De-emphasis filter memory
  • MA/AR prediction memories used in end-frame LSF quantization
  • Previous quantized end-frame LSP (for quantized LPC interpolation)
  • Previous quantized end-frame LSF (for mid-frame LSF quantization)

The CELP memories update is performed depending on the bitrate and the previous encoding mode (either MDCT based TCX or HQ MDCT). In general, three different MDCT to CELP (MC1-3) transition methods are supported. The following table lists the different cases depending on MDCT mode and bit rate.

Table 149: MDCT to CELP transition modes

Switching from    Switching to    Bitrate (kbps)    Transition mode
HQ MDCT           CELP            7.2               MC1
HQ MDCT           CELP            8                 MC1
TCX               CELP            9.6               MC2
TCX or HQ MDCT    CELP            13.2              MC1
TCX               CELP            16.4              MC2
HQ MDCT           CELP            16.4              MC3
TCX               CELP            24.4              MC2
HQ MDCT           CELP            24.4              MC3
TCX or HQ MDCT    CELP            32                MC1
HQ MDCT           CELP            64                MC1

In the following subclauses, the MDCT to CELP transitions are described in detail. Note that this description only considers switching cases within the same bit rate.
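As an illustration only, Table 149 can be held as a small lookup keyed by the previous MDCT mode and the bit rate. The names `TABLE_149` and `mdct_to_celp_transition`, and the mode labels `"HQ"` and `"TCX"`, are hypothetical and not part of the specification.

```python
# Table 149 as a lookup: (previous MDCT mode, bit rate in kbps) -> transition mode.
# Mode labels and identifiers are illustrative assumptions.
TABLE_149 = {
    ("HQ", 7.2): "MC1",
    ("HQ", 8.0): "MC1",
    ("TCX", 9.6): "MC2",
    ("TCX", 13.2): "MC1",
    ("HQ", 13.2): "MC1",
    ("TCX", 16.4): "MC2",
    ("HQ", 16.4): "MC3",
    ("TCX", 24.4): "MC2",
    ("HQ", 24.4): "MC3",
    ("TCX", 32.0): "MC1",
    ("HQ", 32.0): "MC1",
    ("HQ", 64.0): "MC1",
}


def mdct_to_celp_transition(prev_mode, bitrate_kbps):
    """Return 'MC1', 'MC2' or 'MC3' for a same-bit-rate MDCT to CELP switch.

    Raises KeyError for combinations that Table 149 does not list.
    """
    return TABLE_149[(prev_mode, float(bitrate_kbps))]
```

A combination such as TCX at 7.2 kbps is deliberately left out of the lookup, because the table only covers switching within the same bit rate.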

5.4.2.1 MDCT to CELP transition 1 (MC1)

MC1 is used when the previous frame was coded with HQ MDCT and the current frame is coded with CELP. In this case, the CELP state variables are reset in the current frame to predetermined (fixed) values. In particular the following memories are reset to 0 in the CELP encoder:

  • Resampling memories of the CELP synthesis signal
  • Pre-emphasis and de-emphasis memories
  • LPC synthesis memories
  • Past excitation (adaptive codebook memory)

The old LPC coefficients and associated representations (LSP, LSF) and CELP gain quantization memories are reset to predetermined (fixed) values. Since the past excitation is not available, the CELP coder in the current frame is forced to operate in Transition coding (TC), i.e. without any adaptive codebook. The LPC coefficients from the previous frame are not available; therefore only one set of LPC coefficients, corresponding to the end of the frame, is coded and used for all subframes of the current frame.

5.4.2.2 MDCT to CELP transition 2 (MC2)

MC2 is designed for CELP transitions coming from the MDCT based TCX mode. The TCX shares the same LPC analysis and quantization as in CELP, as described in subclause 5.3.3.2.1. The MA/AR/LSP/LSF memories are consequently updated during MDCT based TCX encoding, as it is done in CELP encoding. The only exception is at 9.6kbps, where the weighted LPC are quantized (instead of the unweighted LPC as in CELP) as described in subclause 5.3.3.2.1.1.1. In that case, the MA/AR/LSP/LSF memories are re-computed in the unweighted domain as described in subclause 5.3.3.2.1.1.2.

Moreover, MDCT based TCX includes an internal decoder which generates a decoded time-domain signal at the CELP sampling rate as described in subclause 5.3.3.2.12.1. This signal and the quantized LPC are then used to update CELP memories as described in 5.3.3.2.12.2, including the adaptive codebook memory, the LPC synthesis filter memory, the weighting filter denominator memory and the de-emphasis filter memory.

5.4.2.3 MDCT to CELP transition 3 (MC3)

Similarly to MC1, the CELP state variables are reset to predetermined values (in general 0), with the following exceptions:

  • The factor of the tilt part of the innovative codebook pre-filter is set to 0.3.
  • The LSF quantization is run in safety-net mode, such that no prediction is used and the MA/AR memories are not used.
  • The previous quantized end-frame LSP/LSF and the quantized mid-frame LSP/LSF are set to the current quantized end-frame LSP/LSF.

5.4.3 CELP coding mode to MDCT coding mode

When an MDCT encoded frame is preceded by a CELP encoded frame, a beginning portion of the MDCT encoded frame cannot be reconstructed properly due to the aliasing introduced by the missing previous MDCT encoded frame. To solve this problem, two approaches are used depending on the MDCT based coding mode (either MDCT based TCX or HQ MDCT). These approaches are described in detail in the following subclauses.

5.4.3.1 CELP coding mode to MDCT based TCX coding mode

When the previous frame is CELP and the current frame is MDCT based TCX, the MDCT length is increased, the left folding point is moved towards the past and the left overlap length is reduced such that the current MDCT based TCX can reconstruct the whole 20ms frame, without the need for the previous (and missing in this case) MDCT based frame. This is illustrated in the figure below.

Figure 85: CELP to MDCT based TCX transition window (right part is here ALDO)

The right part of the transition window is not changed, such that it can be used in the next MDCT based frame as if it was a normal (non-transition) MDCT based frame. Similarly to the non-transition case, the right part of the transition window can have different shapes like ALDO, HALF or MINIMAL as described in subclause 5.3.2.3.

The left folding point is moved towards the past at 0.625ms before the transition border. This is equivalent to increasing the MDCT length from 20ms to 25ms. The corresponding numbers of MDCT bins are given in subclause 5.3.3.1.1.

The left part of the MDCT window is modified such that the window segment with weight 1 covers the whole 20ms frame until the transition border, and the window segment before the transition border is a sine window with length 1.25ms

$$w(n) = \sin\left(\frac{\pi\,(n + 0.5)}{2L}\right), \quad n = 0, \ldots, L-1 \qquad (1284)$$

where $L = 0.00125 \cdot sr$ is the window length in samples and $sr$ is the sampling rate of the time-domain signal.
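The left part of the transition window can be sketched as follows. The sketch assumes a conventional half-sine rise sampled at half-sample offsets, which is an assumption about the exact window formula. The function name `transition_left_window` and its parameters are hypothetical.

```python
import math


def transition_left_window(sr_hz, frame_ms=20.0, rise_ms=1.25):
    """Left part of the CELP to TCX transition window.

    Returns a 1.25 ms sine rise followed by weight 1 over the 20 ms frame.
    The half-sample offset in the sine argument is an assumption.
    """
    rise_len = int(round(sr_hz * rise_ms / 1000.0))
    frame_len = int(round(sr_hz * frame_ms / 1000.0))
    rise = [math.sin(math.pi * (n + 0.5) / (2.0 * rise_len))
            for n in range(rise_len)]
    return rise + [1.0] * frame_len
```

At a sampling rate of 12.8 kHz this gives a rise of 16 samples followed by 256 samples of weight 1.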

After windowing and MDCT, the MDCT-based TCX frame is encoded as described in subclause 5.3.3.

5.4.3.2 CELP coding mode to HQ MDCT coding mode

When the previous frame is CELP and the current frame is to be coded by HQ MDCT, the current frame is a transition frame in which two types of coding are used:

  • Constrained CELP coding and (when required) simplified time-domain BWE coding
  • HQ MDCT coding with a modified window

Constrained CELP coding means here that CELP is restricted to cover only the first subframe of the current frame, to code only a subset of CELP parameters, and to reuse parameters (LPC coefficients) from the previous CELP frame. These constraints are set to minimize the bit budget taken by continuing CELP coding in the current transition frame, this bit budget being taken out of HQ MDCT coding.

5.4.3.2.1 Constrained CELP coding and simplified BWE coding

The bit budget for CELP and BWE in the current (transition) frame is determined depending on the CELP coder used in the previous frame (12.8 kHz or 16 kHz) and coded audio bandwidth in the current frame. The following pseudo-code describes how this bit budget is subtracted from the total bit budget for the current frame (total_budget):

    num_bits = total_budget

    if CELP 12.8 kHz was used in the previous frame
        cbrate = min(core_bitrate, ACELP_24k40)
        if cbrate ≥ ACELP_11k60, num_bits = num_bits - 1, end      (pitch filter flag)
        num_bits = num_bits - table_ACB_bits[cbrate, GENERIC]
        num_bits = num_bits - table_gain_bits[cbrate, TRANSITION]
        num_bits = num_bits - table_FCB_bits[cbrate, GENERIC]
    else (CELP 16 kHz was used in the previous frame)
        if core_bitrate ≤ ACELP_8k00, cbrate = ACELP_8k00
        else if core_bitrate ≤ ACELP_14k80, cbrate = ACELP_14k80
        else cbrate = min(core_bitrate, ACELP_22k60)
        end
        if cbrate ≥ ACELP_11k60, num_bits = num_bits - 1, end      (pitch filter flag)
        num_bits = num_bits - table_ACB_bits_16kHz[cbrate, GENERIC]
        num_bits = num_bits - table_gain_bits_16kHz[cbrate, GENERIC]
        num_bits = num_bits - table_FCB_bits_16kHz[cbrate, GENERIC]
    end

    if bandwidth is not NB and not (bandwidth is WB and CELP 16 kHz was used in the previous frame)
        num_bits = num_bits - (6 + 6)                              (BWE gain and pitch index)
    end

The bit rate for CELP coding is in any case limited to the minimum of the HQ MDCT coding bit-rate and a predetermined bit-rate value (24.4 kbit/s or 22.6 kbit/s depending on whether the CELP core is at 12.8 or 16 kHz). The number of bits allocated to CELP coding is then subtracted, and the remaining bit budget (denoted ‘num_bits’) is reserved for HQ MDCT coding normally operating at a bit rate ‘core_bitrate’ in the current frame. CELP coding in the extra subframe is configured to operate as if the current frame was CELP at a bit-rate denoted ‘cbrate’; this CELP bit-rate depends on the CELP coder used in the previous frame (12.8 kHz or 16 kHz).
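The budget computation can be sketched in runnable form as below. This is a sketch under stated assumptions, not the normative procedure. The comparisons against ACELP_8k00 and ACELP_14k80 are taken as "less than or equal", the comparison against ACELP_11k60 as "greater than or equal", and the BWE condition is read as "not NB, and not WB with a 16 kHz core", in line with the rule that BWE applies when the coded bandwidth exceeds the core bandwidth. The bit allocation tables are passed in as hypothetical callables, since their contents are not given in this subclause.

```python
# Bit rates in bit/s; the constant names mirror the pseudo-code.
ACELP_8k00, ACELP_11k60, ACELP_14k80 = 8000, 11600, 14800
ACELP_22k60, ACELP_24k40 = 22600, 24400


def transition_frame_budget(total_budget, core_bitrate, prev_celp_16k, bandwidth,
                            acb_bits, gain_bits, fcb_bits):
    """Return (num_bits, cbrate) for the HQ MDCT part of a transition frame.

    acb_bits, gain_bits and fcb_bits are hypothetical callables
    (cbrate, coding_type) -> number of bits, standing in for the real tables.
    """
    num_bits = total_budget
    if not prev_celp_16k:                      # CELP 12.8 kHz in the previous frame
        cbrate = min(core_bitrate, ACELP_24k40)
        gain_mode = "TRANSITION"
    else:                                      # CELP 16 kHz in the previous frame
        if core_bitrate <= ACELP_8k00:         # assumed comparison: <=
            cbrate = ACELP_8k00
        elif core_bitrate <= ACELP_14k80:      # assumed comparison: <=
            cbrate = ACELP_14k80
        else:
            cbrate = min(core_bitrate, ACELP_22k60)
        gain_mode = "GENERIC"
    if cbrate >= ACELP_11k60:                  # assumed comparison: >=
        num_bits -= 1                          # pitch filter flag
    num_bits -= acb_bits(cbrate, "GENERIC")
    num_bits -= gain_bits(cbrate, gain_mode)
    num_bits -= fcb_bits(cbrate, "GENERIC")
    # Assumed reading of the BWE condition: coded bandwidth above the core bandwidth.
    if bandwidth != "NB" and not (bandwidth == "WB" and prev_celp_16k):
        num_bits -= 6 + 6                      # BWE gain (6 bits) + pitch index (6 bits)
    return num_bits, cbrate
```

With constant placeholder tables of 8, 5 and 20 bits, a 480-bit SWB frame following a 12.8 kHz CELP frame at a core bit rate of 24400 bit/s leaves 434 bits for HQ MDCT coding.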

The coded CELP parameters in this extra subframe are: pitch filter flag (1 bit) if the CELP bit-rate is 11.6 kbit/s, pitch index for the adaptive codebook (ACB), codebook gains, fixed codebook (FCB) index. The bit allocation tables from CELP coding in Generic or Transition coding at 12.8 and 16 kHz are reused.

Besides, 12 bits are used for BWE in the extra subframe to code one gain (6 bits) and one pitch index (6 bits) for the high band above CELP synthesis.

Note that the current frame being a transition frame, one bit is used to indicate the type of CELP coding (12.8 kHz or 16 kHz) used in the extra CELP subframe; this bit is necessary to be able to decode correctly the transition frame in case of frame erasures.

LPC coefficients from the end of the previous frame are reused to code the extra subframe; constrained CELP coding reuses the subframe excitation coding with the same CELP core coder (12.8 kHz or 16 kHz) as in the previous frame, and this subframe coding is adapted from the procedure described in clause 5.2.3.1.

When the coded audio bandwidth is higher than the bandwidth of the core CELP coder, simplified BWE coding is applied. The previous and current input frames are high-pass FIR filtered to obtain the high-band; the cutoff frequency (6.4 or 8 kHz) depends on the core CELP coder. Then, pitch search based on correlation in the high band provides an estimated pitch lag and gain which are coded with 6 bits each.

5.4.3.2.2 HQ MDCT coding with a modified analysis window

HQ MDCT coding in the transition frame is identical to clause 5.3.4, except the MDCT analysis window is modified and the bit budget for HQ MDCT coding in the current frame is decreased as described in clause 5.4.3.2.1.

Figure 85a: Modified MDCT window in transition frame (CELP to MDCT transition)

The modified MDCT window is designed to avoid aliasing in the first part of the frame as shown in Figure 85a. Its shape also allows cross-fading between the synthesis from constrained CELP and simplified BWE and the synthesis from HQ MDCT as described in clause 6.3.3.2.3. Note that the frames labeled CELP and MDCT in Figure 85a represent the new frames of signal (20 ms) entering in the encoder; the actual coded frame is delayed by the encoder lookahead.

5.4.4 Internal sampling rate switching

The LP-based coding within EVS operates at two internal sampling rates, 12.8 kHz and 16 kHz. In active frames the 12.8 kHz internal sampling is employed at lower bit-rates (≤ 13.2 kbps) while the 16 kHz internal sampling is employed at higher bit-rates (≥ 16.4 kbps). Further in LP-based CNG, the 12.8 kHz internal sampling is employed at bit-rates ≤ 8.0 kbps while 16 kHz internal sampling is employed at bit-rates ≥ 9.6 kbps. Consequently a CELP internal sampling rate switching can happen either 1) in case of bit-rate switching or 2) in case of switching between active segments and LP-based CNG segments at 9.6 kbps and 13.2 kbps.
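The rule above can be written as a small selection function. The function names are hypothetical, and the sketch simply applies the stated thresholds; it does not check that the given bit rate is a valid EVS rate.

```python
def celp_internal_rate_hz(bitrate_kbps, is_cng=False):
    """Internal sampling rate of the LP-based coder for a given bit rate.

    Active frames: 12.8 kHz up to 13.2 kbps, 16 kHz from 16.4 kbps.
    LP-based CNG:  12.8 kHz up to 8.0 kbps,  16 kHz from 9.6 kbps.
    """
    threshold = 8.0 if is_cng else 13.2
    return 12800 if bitrate_kbps <= threshold else 16000


def rate_switch_on_cng(bitrate_kbps):
    """True if entering or leaving LP-based CNG changes the internal rate."""
    return (celp_internal_rate_hz(bitrate_kbps, is_cng=False)
            != celp_internal_rate_hz(bitrate_kbps, is_cng=True))
```

Evaluating `rate_switch_on_cng` at the EVS bit rates reproduces the statement that a switch between active and CNG segments changes the internal sampling rate only at 9.6 and 13.2 kbps.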

The MDCT-based TCX operates at 4 different internal sampling rates: 12.8, 16, 25.6 and 32 kHz. The MDCT-based TCX internal sampling rate corresponds to the rate used for computing and transmitting its LP filter, which is employed for shaping the quantization noise in the frequency domain. The same internal sampling rate is used for generating the low rate decoded signal computed at both encoder and decoder sides for updating the memories of a possible subsequent CELP frame. Sampling-rate switching in MDCT-based TCX can only happen either 1) in case of bit-rate switching or 2) in case of switching from CELP at 13.2 kbps or switching from LP-based CNG segments at 9.6 kbps.

When changing the internal sampling rate, a number of memory and buffer updates needs to be done. These are described in subsequent subclauses.

5.4.4.1 Reset of LPC memory

In case of internal sampling rate switching, the LSF quantization is run in safety-net mode, such that no prediction is used and the MA/AR memories are not used.

5.4.4.2 Conversion of LP filter between 12.8 and 16 kHz internal sampling rates

When switching between internal sampling rates of 12.8 kHz and 16 kHz, the previous LP filter needs to be converted both at the encoder and the decoder between these two sampling rates in order to determine the interpolated LP parameters of the current frame. For this purpose, the LP filter of the previous frame could be recomputed at the current sampling rate based on the past synthesis signal that is already available. However, this would require a complete LP analysis and resampling of the past synthesis signal both at the encoder and the decoder. A less complex method is used here, based on re-estimating the LP filter from its power spectrum, modified to correspond to the current sampling frequency. The autocorrelation is computed from this modified power spectrum for solving the parameters of the converted LP filter with the Levinson-Durbin algorithm. The converted LP filter is finally transformed to its line spectrum frequency representation for interpolation with the corresponding parameters of the current frame.

The computation and modification of the power spectrum as well as the computation of the autocorrelation are described in the following subclauses. The Levinson-Durbin algorithm is described in subclause 5.1.9.4 and the determination of the line spectrum frequencies in subclause 5.1.9.5.

Note that only the quantized LP filter is converted. Although the perceptual weighting filter uses the unquantized LP filter at the encoder, it is sufficient to use the converted quantized LP filter for interpolation when switching between internal sampling rates. This approximation avoids an additional conversion procedure for the unquantized LP filter.

5.5.4.1.1 Modification of the Power Spectrum

When switching the internal sampling rate down to 12.8 kHz from 16 kHz, the converted LP filter models the power spectrum of the LP filter originally estimated at a sampling rate of 16 kHz up to the new cut-off frequency. This is accomplished by computing the power spectrum of the LP filter at frequency points equispaced in $[0, 4\pi/5]$, where $4\pi/5$ corresponds to the Nyquist frequency at 12.8 kHz sampling rate. This frequency range of the power spectrum is then mapped onto $[0, \pi]$ when computing the autocorrelation for solving the parameters of the converted LP filter.

Conversely, when switching the internal sampling rate up to 16 kHz from 12.8 kHz, the power spectrum of the LP filter originally estimated at a sampling rate of 12.8 kHz is computed at frequencies equispaced in $[0, \pi]$. This frequency range of the power spectrum is mapped onto $[0, 4\pi/5]$ for autocorrelation computation. The power spectrum unknown at frequencies $(4\pi/5, \pi]$ is approximated by extending the power spectrum value at $4\pi/5$ over this range for autocorrelation computation. This procedure thus re-estimates the LP filter at sampling frequency 16 kHz with an approximated, extended upper band.

By choosing , all power spectrum and autocorrelation computations can be accomplished on two equispaced frequency grids, one including the frequency points and one the points . Converting down to 12.8 kHz is hence equivalent to dropping 10 last values away from the power spectrum of the original LP filter. Similarly, converting up to 16 kHz translates to adding 10 approximated values to the power spectrum of the original LP filter.
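The truncation and extension of the power spectrum can be sketched as follows. The function name `modify_power_spectrum` is hypothetical. The sketch assumes the added values simply repeat the last known value, and the grid lengths used in any call are illustrative.

```python
def modify_power_spectrum(ps, to_16k, n_delta=10):
    """Adapt power spectrum samples to the other internal sampling rate.

    ps holds samples on an equispaced grid from 0 up to the old Nyquist
    frequency. Converting up appends n_delta copies of the last value
    (assumed form of the approximation); converting down drops the last
    n_delta values.
    """
    if to_16k:
        return list(ps) + [ps[-1]] * n_delta
    return list(ps[:-n_delta])
```

Because both grids share the same frequency spacing, no interpolation between grid points is needed in either direction.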

The same procedure is applied identically both at the encoder and the decoder.

5.5.4.1.2 Computation of the Power Spectrum

The LP filter is converted to another sampling rate by truncating or extending its power spectrum

, , ()

to the frequency range that corresponds to the new sampling rate and re-estimating linear prediction coefficients from the autocorrelation computed from this modified spectrum. For reduced complexity, the power spectrum is expressed on the real axis by utilizing the line spectrum frequency decomposition

, ()

where the polynomials and are defined as in subclause 5.1.9.5. The zeros of these two polynomials give the line spectrum frequencies of .

Because of the symmetry properties of the polynomials and , they can be expressed as the polynomials and of . It can be shown that for an LP filter of an even order,

, . ()

The use of this expression is motivated by the observation that the polynomials and can be evaluated efficiently for a given with Horner’s method [34]. The value of these polynomials is needed on frequency grids described in subclause 5.5.4.1.1. The expression (1287) obtains a particularly simple form at that can be utilized in computation.

The polynomials and can be derived by substituting the explicit forms of the Chebyshev polynomials

,     ()

to the Chebyshev series representation of and [34]. This same representation is employed in subclause 5.1.9.5 for the determination of the line spectrum frequencies. The explicit forms of can readily be written by using the recursion

()

By definition the zeros of and are respectively the cosines of the even and odd line spectrum frequencies. The coefficients of these polynomials are thus readily obtained when the line spectrum frequencies of are known. Given that the order of is 16, the polynomial is of order 8 and can be expressed as

()

The leading coefficient is constant. This relation yields a simple recursion for solving the coefficients of from the even line spectrum pairs for . The coefficients of are obtained correspondingly from the odd line spectrum pairs.
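A minimal sketch of the root-based recursion and of Horner's method is given below. The function names are hypothetical. The leading coefficient is exposed as a parameter `lead` with a placeholder default of 1.0, because its actual constant value is not reproduced here.

```python
import math


def poly_from_roots(roots, lead=1.0):
    """Coefficients (highest power first) of lead * prod(x - r) over roots.

    Each root multiplies the running polynomial by (x - r), which is a
    one-line recursion on the coefficients.
    """
    coeffs = [lead]
    for r in roots:
        coeffs = coeffs + [0.0]
        for j in range(len(coeffs) - 1, 0, -1):
            coeffs[j] -= r * coeffs[j - 1]
    return coeffs


def horner(coeffs, x):
    """Evaluate a polynomial (highest power first) at x with Horner's method."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def poly_from_lsf(lsf_subset, lead=1.0):
    """Polynomial in x = cos(w) whose zeros are the cosines of the given LSFs."""
    return poly_from_roots([math.cos(w) for w in lsf_subset], lead)
```

For an LP filter of order 16, each of the two subsets holds 8 line spectrum frequencies, so each polynomial has 9 coefficients and costs 8 multiply-add operations per evaluation point.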

If no line spectrum frequency representation is available for, one can alternatively first compute the coefficients of the polynomials and from those of and then solve the coefficients of and from a set of equations that relate the two representations. These relating equations can be derived by substituting the explicit forms of the Chebyshev polynomials to the Chebyshev series representation of and . This approach is employed when switching from the AMR-WB IO mode, which uses the immittance spectrum frequency representation instead of line spectrum frequencies.

5.5.4.1.3 Computation of the Autocorrelation

The autocorrelation of the LP filter is obtained by the inverse Fourier Transform of the modified power spectrum. Since the power spectrum is real and symmetric, the relation between the autocorrelation and the power spectrum can be expressed through the integral

,           k = 0, 1, … ()

Because the power spectrum is real and symmetric, , it suffices to evaluate the integral over only the upper half of the unit circle. Due to this symmetry, the rectangle rule for approximating the integral can be expressed as

,       k = 0, 1, … ()

where $\Omega$ is the set of frequencies equispaced in $[0, \pi]$, but excluding 0 and $\pi$ to avoid double counting. The number of these frequencies is hereafter assumed odd.

Note that in sequel the factor is omitted for simplicity from the approximation of the autocorrelation. Namely, the autocorrelation can be scaled for convenience, because the resulting linear prediction coefficients are invariant to this scaling.

The expression of autocorrelation can be rewritten equivalently for more efficient computation by utilizing the symmetries of the cosine term relative to through the following trigonometric identities:

()

where k = 0, 1, … and . The operator $\lfloor \cdot \rfloor$ rounds to the nearest integer towards minus infinity. The term simply generates the sequence 2, 0, 2, 0, 2, 0, 2, 0, 2, … and is hence readily implementable.

By employing the two trigonometric identities given above, the autocorrelation can be rewritten as

()

where $\Omega$ is the set of frequencies equispaced in but excluding 0 and . The cosine term of the autocorrelation is evaluated using the recursion

()

starting from and . The value of is stored in a table that holds all the entries needed for .

When switching the internal sampling rate from 16 kHz down to 12.8 kHz, a grid of equispaced frequency points is used. Switching from 12.8 kHz up to 16 kHz uses a grid of points, see subclause 5.5.4.1.1.
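The autocorrelation computation can be sketched as below. This simplified version uses only the symmetry of the power spectrum about 0 and does not reproduce the further folding about the middle of the frequency range. The scaling factor is omitted, as in the text. The function name and the grid convention (samples at i·π/(N−1), end points included) are assumptions of the sketch.

```python
import math


def autocorr_from_power_spectrum(ps, order=16):
    """Unscaled autocorrelation r[0..order] from power spectrum samples.

    ps[i] is the power spectrum at w_i = i * pi / (N - 1), i = 0..N-1.
    The end points 0 and pi are counted once, interior points twice.
    cos(k*w) is generated with cos(k*w) = 2*cos(w)*cos((k-1)*w) - cos((k-2)*w).
    """
    n = len(ps)
    # Contributions of w = 0 and w = pi: cos(0) = 1, cos(k*pi) = (-1)^k.
    r = [ps[0] + (ps[-1] if k % 2 == 0 else -ps[-1]) for k in range(order + 1)]
    for i in range(1, n - 1):
        x = math.cos(i * math.pi / (n - 1))
        c_prev, c_cur = 1.0, x              # cos(0*w), cos(1*w)
        r[0] += 2.0 * ps[i]
        for k in range(1, order + 1):
            r[k] += 2.0 * ps[i] * c_cur
            c_prev, c_cur = c_cur, 2.0 * x * c_cur - c_prev
    return r
```

For a flat power spectrum the result is nonzero only at lag 0, which is a convenient sanity check for an implementation.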

5.4.4.3 Extrapolation of LP filter

In case of sampling rate switching involving at least a sampling rate being neither 12.8 nor 16 kHz, the previous LP filter is not converted. Instead, the previous quantized end-frame LSP/LSF and the quantized mid-frame LSP/LSF are set to the current quantized end-frame LSP/LSF.

The LP filter is also extrapolated when the conversion of the LP filter between 12.8 and 16 kHz, described in subclause 5.4.4.2, does not produce a stable filter. The stability of the filter is detected during the Levinson-Durbin algorithm described in subclause 5.1.9.4. A filter is detected as unstable when at least one of the reflection coefficients has an absolute value greater than 0.99945.
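The stability check can be illustrated with a textbook Levinson-Durbin recursion that stops as soon as a reflection coefficient exceeds the threshold. This is a generic sketch, not the implementation of subclause 5.1.9.4. The function name, the sign convention A(z) = 1 + Σ a_j z^{-j} and the early return are assumptions, and r[0] is assumed to be positive.

```python
def levinson_durbin(r, order=16, max_refl=0.99945):
    """Solve for LP coefficients a[0..order] (a[0] = 1) from autocorrelation r.

    Returns (a, stable). stable is False as soon as a reflection
    coefficient has an absolute value greater than max_refl; in that case
    the coefficients computed so far are returned and should not be used.
    """
    a = [1.0] + [0.0] * order
    err = r[0]                                   # prediction error energy
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                           # reflection coefficient
        if abs(k) > max_refl:
            return a, False
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, True
```

When the function reports an unstable filter, the caller would fall back to the extrapolation described in this subclause instead of using the converted filter.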

5.4.4.4 Buffer resampling with linear interpolation

Switching the internal sampling rate requires several memories to be resampled. In order to reduce the complexity of the resampling processing, a simple linear interpolation is used most of the time instead of a conventional low-pass filtering method.

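The basic operation for interpolating a point is done as follows:

$$x_{new}(n) = x_{old}(i) + (pos - i)\,\bigl(x_{old}(i+1) - x_{old}(i)\bigr)$$

where $x_{new}$ is the new resampled buffer of size $L_{new}$ and $x_{old}$ is the old buffer of size $L_{old}$. The index $n$ is initialized to 0 while index $i$ is equal to the integer part of the position $pos$, $i = \lfloor pos \rfloor$. The position is initialized to:

$$pos = 0.5 \cdot inc - 0.5$$

where $inc = L_{old} / L_{new}$ is the increment of the position for each unit increment of the index n.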
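A minimal sketch of buffer resampling by linear interpolation is given below. The starting position (half an increment minus half a sample, which aligns the centres of the old and new sample grids) and the clamping of the index at the buffer edges are assumptions of the sketch. The function name `resample_linear` is hypothetical.

```python
import math


def resample_linear(old, new_size):
    """Resample the list `old` (at least 2 samples) to `new_size` samples.

    The position advances by inc = len(old) / new_size per output sample.
    At the buffer edges the nearest pair of samples is extrapolated.
    """
    old_size = len(old)
    inc = old_size / new_size
    pos = 0.5 * inc - 0.5                       # assumed initial position
    out = []
    for _ in range(new_size):
        i = min(max(int(math.floor(pos)), 0), old_size - 2)
        frac = pos - i                          # may fall outside [0, 1] at the edges
        out.append(old[i] + frac * (old[i + 1] - old[i]))
        pos += inc
    return out
```

For example, a 16-sample buffer at 12.8 kHz becomes a 20-sample buffer at 16 kHz with an increment of 0.8, and resampling to the same size returns the input unchanged.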

5.4.4.5 Update of CELP input signal memories

The pre-emphasized input signal defined in subclause 5.1.4 is updated at both CELP internal sampling rate 12.8 and 16 kHz at any bit-rates. No specific processing is then needed in case of sampling rate switching.

The weighted synthesis filter state is updated in three different ways in case of sampling rate switching:

  • It is set to zero if a sampling rate different from 12.8 and 16 kHz is involved.
  • At bit-rates ≤ 8 kbps and at 13.2, 32 and 64 kbps, the memory state is updated in the previous frame as usual and the difference in sampling rate between the previous and the current frame is simply ignored.
  • Otherwise, the state is recomputed by filtering the resampled LPC synthesis filter state obtained in subclause 5.4.4.7 through the weighting filter and by computing the error between the obtained signal and the input weighted signal. The memory state is used for computing the target signal of the next frame as defined in subclause 5.2.3.1.2.

5.4.4.6 Update of MDCT-based TCX input signal memories

The past of the input signal and the past of the weighted signal are needed for MDCT-based TCX, and both past signals are resampled as in subclause 5.4.4.4.

5.4.4.7 Update of CELP synthesis memories

For being able to make a seamless transition from CELP to CELP or from MDCT-based TCX to CELP, the following memory states have to be maintained:

  • The adaptive codebook state
  • The LPC synthesis filter state
  • The de-emphasis state

The three memory states are maintained at both encoder and decoder side in CELP mode and in MDCT-based TCX coding mode. The following specific processing is performed in case of internal sampling rate switching.

The adaptive codebook state covers at the encoder side a frame, i.e. 20 ms. In case of internal sampling rate switching between 12.8 and 16 kHz, the adaptive codebook is resampled with the method described in subclause 5.4.4.4. If it involves at least a sampling rate different from 12.8 and 16 kHz, the adaptive codebook is reset with zeros.

The LPC synthesis filter state does not cover a fixed time duration but a fixed number of samples equal to the order of the LPC. This order is always 16. To be able to resample this state at any sampling rate between 12.8 and 48 kHz, the memory of the LPC synthesis filter state is extended from 16 to 60 samples, which represents 1.25ms at 48 kHz. The memory resampling from sampling rate Hz to sampling rate Hz can be summarized as:

()

where is the function resampling the input buffer x from to samples as described in subclause 5.4.4.4. L_SYN_MEM is the largest size in samples that the memory can cover and is equal to 60 samples. At any sampling rate and at any time, mem_syn_r is updated with the last L_SYN_MEM output samples and is then eventually resampled in case of internal sampling rate switching at the beginning of the next frame.
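The bookkeeping of the extended synthesis memory can be sketched as follows. The sketch assumes that the 1.25 ms of signal at the old rate is held at the end of the 60-sample buffer and that the resampled 1.25 ms at the new rate is written back at the end, leaving older entries untouched. The resampling function is passed in as a hypothetical callable `resample(buffer, new_size)`.

```python
L_SYN_MEM = 60  # largest memory size in samples: 1.25 ms at 48 kHz


def syn_mem_len(fs_hz):
    """Number of samples covering 1.25 ms at sampling rate fs_hz."""
    return int(1.25 * fs_hz / 1000)


def resample_syn_mem(mem_syn_r, fs_old, fs_new, resample):
    """Resample the last 1.25 ms of mem_syn_r from fs_old to fs_new.

    The buffer layout (most recent samples at the end) is an assumption.
    """
    n_old, n_new = syn_mem_len(fs_old), syn_mem_len(fs_new)
    tail = resample(mem_syn_r[L_SYN_MEM - n_old:], n_new)
    out = list(mem_syn_r)
    out[L_SYN_MEM - n_new:] = tail
    return out
```

At 12.8 kHz and 16 kHz the 1.25 ms span corresponds to 16 and 20 samples, so the LPC order of 16 is always covered.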

The de-emphasis has a fixed order of 1, which also represents a different time duration at different sampling rates. However, the resampling stage is not performed and the memory update is done as usual even in case of internal sampling rate switching.

5.4.5 EVS primary and AMR-WB IO

The codec supports seamless switching between primary and AMR-WB IO modes. While most of the memories and past buffers are shared between the two modes, there are some particularities that need to be properly handled. The following scenarios can happen:

5.4.5.1 Switching from primary modes to AMR-WB IO

When a CELP based AMR-WB IO encoded frame is preceded by a primary mode encoded frame, the memories of the CELP AMR-WB IO encoded frame have to be updated or converted before starting the encoding. These include:

  • Convert previous quantized end-frame LSFs to ISFs
  • Convert previous quantized end-frame LSPs to ISPs
  • Set previous un-quantized end-frame ISPs to converted quantized end-frame ISPs
  • Convert previous CNG quantized end-frame LSPs to ISPs
  • Limit index of last encoded CNG energy to 63
  • Reset the gain quantization memory to -14.0.
  • In case the switching happens in the CNG segment, force SID frame
  • Reset AR model LP quantizer memory

In case the AMR-WB IO frame is preceded by CELP primary frame at 16 kHz internal sampling rate, the processing described in subclause 5.4.4 is performed.

Finally in case the AMR-WB IO frame is preceded by MDCT primary frame, the processing described in subclause 5.4.2 is performed.

5.4.5.2 Switching from AMR-WB IO mode to primary modes

When a primary mode encoded frame is preceded by a CELP based AMR-WB IO encoded frame, the memories of the primary mode encoded frame have to be updated or converted before starting the encoding. These include:

  • First three ACELP frames are processed using safety-net LP quantizer
  • Convert previous quantized end-frame ISFs to LSFs
  • Convert previous quantized end-frame ISPs to LSPs
  • Convert previous CNG quantized end-frame ISPs to LSPs
  • Reset BWE past buffers
  • Reset the unvoiced/audio signal improvement memories

In case a CELP primary mode frame at 16 kHz internal sampling rate is preceded by an AMR-WB IO frame, the processing described in subclause 5.4.4 is performed.

Finally in case the MDCT primary frame is preceded by AMR-WB IO frame, the processing described in subclause 5.4.3 is performed.

5.4.6 Rate switching

A seamless switching between all EVS primary rates is supported in the codec. Since most of the states and memories are shared and maintained at any bit-rates, a complete re-initialization is not needed. The coding tools are able to be reconfigured at the beginning of any frame. The different bit-rate dependent setups of each tool are described in each corresponding subclause. Rate switching doesn’t require any specific handling, except in the following scenarios.

5.4.6.1 Rate switching along with internal sampling rate switching

In case the internal sampling rate changes when switching the bit-rate, the processing described in subclause 5.4.4 is performed at first.

5.4.6.2 Rate switching along with coding mode switching

In case the coding mode changes when switching the bit-rate, the processing described in subclause 5.4.3 is performed. Additional transitions from MDCT to CELP mode are possible during rate switching compared to Table 149. Table 150 lists all the cases achievable during rate switching, depending on the MDCT mode and the new CELP bit rate.

Table 150: MDCT to CELP transition modes in case of rate switching

Switching from    Switching to    CELP Bitrate (kbps)    Transition mode
HQ MDCT or TCX    CELP            7.2                    MC1
HQ MDCT or TCX    CELP            8                      MC1
TCX               CELP            9.6                    MC2
HQ MDCT           CELP            9.6                    MC3
HQ MDCT or TCX    CELP            13.2                   MC1
TCX               CELP            16.4                   MC2
HQ MDCT           CELP            16.4                   MC3
TCX               CELP            24.4                   MC2
HQ MDCT           CELP            24.4                   MC3
HQ MDCT or TCX    CELP            32                     MC1
HQ MDCT or TCX    CELP            64                     MC1

If the internal sampling rate is also changing, the processing of subclause 5.4.4 is performed beforehand.