5.1 SBR tools overview

3GPP TS 26.404, Release 17: General audio codec audio processing functions; Enhanced aacPlus general audio codec; Enhanced aacPlus encoder Spectral Band Replication (SBR) part

The encoder part of the SBR tool estimates several parameters used by the high frequency reconstruction method on the decoder side. To synchronise the SBR bitstream data with the AAC codec, two modes of operation have to be considered: normal aacPlus operation and aacPlus parametric stereo operation. In the normal case, the AAC encoder is responsible for downsampling the input PCM signal, while the SBR encoder works in parallel at twice the sampling frequency of the downsampled signal. When using parametric stereo aacPlus, the SBR tool is also responsible for downsampling the signal fed to the AAC coder. The two modes are outlined in the following sections and illustrated in Figure 1 and Figure 2.

Figure 1 aacPlus block diagram

Figure 2 Parametric stereo aacPlus block diagram

5.1.1 Enhanced aacPlus synchronisation without parametric stereo

The time domain input PCM signal is assumed to be stored in a buffer x, where 2048 new samples are added to the end of the buffer every frame. Before adding new samples, all samples in the buffer must be left-shifted by 2048 samples. The buffer size amounts to 576 + 2048 + t_inputDelay samples, where t_inputDelay equals the total AAC delay, i.e. the delay for the entire encoder-decoder chain, plus the SBR decoder buffer delay minus the SBR encoder buffer delay. All delays are expressed in samples at the SBR input sampling rate.
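The buffer update described above can be sketched as follows. This is a minimal illustration rather than the reference implementation: the function names are hypothetical, and the input delay is passed in as a parameter because its value depends on the delays of the surrounding codec chain.

```python
FRAME_SIZE = 2048   # new PCM samples per frame (from the text)
HISTORY = 576       # extra samples kept in the buffer (from the text)


def make_pcm_buffer(t_input_delay):
    """Allocate a zeroed buffer x of 576 + 2048 + t_inputDelay samples.

    t_input_delay is an assumed, caller-supplied value in samples at the
    SBR input sampling rate.
    """
    return [0.0] * (HISTORY + FRAME_SIZE + t_input_delay)


def push_frame(x, new_samples):
    """Left-shift x by 2048 samples, then append the new frame at the end."""
    if len(new_samples) != FRAME_SIZE:
        raise ValueError("a frame must contain exactly 2048 samples")
    # Left shift: the oldest 2048 samples fall out of the buffer.
    x[:-FRAME_SIZE] = x[FRAME_SIZE:]
    # The new frame occupies the last 2048 positions.
    x[-FRAME_SIZE:] = new_samples
```

A production encoder would more likely use a ring buffer or a pointer offset to avoid copying; the explicit shift is kept here because it mirrors the wording of the text.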

The PCM buffer x is fed to the analysis QMF bank, where subband filtering is performed. The window stride of the QMF bank is illustrated in Figure 3a, which shows that the first window is applied from sample 0 to sample 639 on the PCM buffer. The output from the analysis QMF bank, 32 subband samples of 64 frequency channels each, is stored in the matrix X (Figure 3b) as

A delay of qmfWriteOffset subband samples is hence introduced, making

The algorithmic buffer delay in the decoder is 6 subband samples, giving

The total AAC delay is the sum of the delay introduced by the 1024-point MDCT, the window switching look-ahead and the delay introduced by the downsampling filter. Any other delays that are introduced must also be accounted for.
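The relation between the window positions, the 576 + 2048 term of the buffer size and delays counted in subband samples can be checked with a short sketch. It assumes a window stride of 64 samples, which is the usual hop for a 64-channel QMF bank but is not stated explicitly in the text above; the function names are hypothetical.

```python
NUM_QMF_CHANNELS = 64     # analysis QMF channels
WINDOW_LENGTH = 640       # first window covers samples 0..639 (from the text)
SUBBAND_SAMPLES = 32      # subband samples produced per frame
STRIDE = 64               # assumed hop between consecutive windows


def qmf_window_spans():
    """Return (first, last) PCM sample index for each of the 32 windows."""
    return [(k * STRIDE, k * STRIDE + WINDOW_LENGTH - 1)
            for k in range(SUBBAND_SAMPLES)]


def subband_samples_to_pcm(n_subband_samples):
    """Convert a delay in subband samples to samples at the SBR input rate."""
    return n_subband_samples * NUM_QMF_CHANNELS


spans = qmf_window_spans()
print(spans[0])                     # first window
print(spans[-1])                    # last window of the frame
print(subband_samples_to_pcm(6))    # decoder buffer delay of 6 subband samples
```

Under this assumption the last window ends at sample 2623, so the 32 windows together span exactly 576 + 2048 samples, and a delay of 6 subband samples corresponds to 384 samples at the SBR input sampling rate.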

Figure 3 aacPlus encoder buffers and synchronisation

5.1.2 Enhanced aacPlus synchronisation with parametric stereo

The time domain input PCM signal is assumed to be stored in a buffer x, where 2048 new samples are added to the end of the buffer every frame. Before adding new samples, all samples in the buffer must be left-shifted by 2048 samples. The buffer size amounts to 576 + 2048 samples. Note that two buffers are needed for stereo signals.

The PCM buffer is fed to the analysis QMF bank, where subband filtering is performed. The window stride of the QMF bank is illustrated in Figure 4a, which shows that the first window is applied from sample 0 to sample 639 on the PCM buffer. The output from the analysis QMF bank, 32 subband samples of 64 frequency channels each, is stored in the matrix H (Figure 4b) as

Two buffers are needed for stereo operation. The subband samples in the matrix H are fed to the hybrid filter bank (see [5]), which introduces a delay of 6 subband samples. Parametric stereo parameters are extracted from the output of the hybrid filter bank, and downmixing of the stereo signal is performed. Subsequently, hybrid synthesis filtering is applied to the modified hybrid subband samples.

The downmixed subband samples are fed to the subband matrix X (Figure 4c) as

whereafter "normal" SBR operation is undertaken. The subband samples are fed in parallel to the 32-channel synthesis filter bank. The stride for the synthesis windowing is illustrated in Figure 4d. The output from the filter bank, which has a sampling frequency half that of the SBR sampling frequency, is forwarded to the AAC encoder.

After SBR processing of the current frame, an additional delay of one frame has to be introduced by delaying the SBR frame data (Figure 4e).
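The one-frame delay of the SBR frame data can be pictured as a single-slot delay line. The sketch below is illustrative only; the class name and the use of None for the not-yet-available first output are assumptions, not part of the specification.

```python
class OneFrameDelay:
    """Hold SBR frame data for exactly one frame before releasing it."""

    def __init__(self):
        # Nothing is stored before the first frame has been processed.
        self._held = None

    def push(self, frame_data):
        """Store the current frame's data and return the previous frame's.

        Returns None on the first call, when no earlier frame exists.
        """
        delayed, self._held = self._held, frame_data
        return delayed


delay = OneFrameDelay()
first = delay.push("SBR data for frame 0")    # nothing to emit yet
second = delay.push("SBR data for frame 1")   # emits the data of frame 0
```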

To achieve synchronisation, the total AAC codec delay must be 3200 samples, expressed in the SBR input sampling frequency.
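As a small worked example, the 3200-sample figure can be converted to the AAC core rate and to milliseconds. The factor of two follows from the statement above that the signal forwarded to the AAC encoder has half the SBR sampling frequency; the 32 kHz input rate is only an example value, and the function names are hypothetical.

```python
AAC_CODEC_DELAY_SBR_RATE = 3200   # samples at the SBR input rate (from the text)


def sbr_to_core_samples(n_samples_sbr_rate):
    """Convert a sample count at the SBR input rate to the AAC core rate."""
    # The AAC core runs at half the SBR input sampling frequency.
    return n_samples_sbr_rate // 2


def delay_in_ms(n_samples, sampling_rate_hz):
    """Convert a delay in samples to milliseconds at the given rate."""
    return 1000.0 * n_samples / sampling_rate_hz


core_delay = sbr_to_core_samples(AAC_CODEC_DELAY_SBR_RATE)    # 1600 samples
example_ms = delay_in_ms(AAC_CODEC_DELAY_SBR_RATE, 32000)     # 100.0 ms
```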

Figure 4 Enhanced aacPlus stereo synchronisation

5.1.3 SBR encoder modules overview

The modules of the encoder part of the SBR tool are illustrated in the block diagram of Figure 5. The SBR tool operates on discrete mono signals in general, but some of the modules in Figure 5 need simultaneous access to both the left and right signal in case of stereo signals.

– As outlined in 5.1.1 and 5.1.2, the time domain signal is first filtered by the 64-channel complex QMF bank (section 5.2). The output from the analysis QMF bank, 32 subband samples of 64 frequency channels each, is stored in the matrix X as

Several modules use the output from the QMF bank:

– The transient detector operates on the matrix X starting at subband sample 0.

– The frame splitter operates on the matrix X starting at subband sample 0.

– The output from the transient detector and frame splitter is fed to the frame generator, where the time and frequency resolutions for the current frame are determined.

– The Tonality detector operates on the matrix X starting at subband sample qmfWriteOffset.

– The control data from the Tonality detector and the current time and frequency grid are forwarded to the unit for Additional control parameters. In this module, the levels of the adaptive noise, inverse filtering and additional sines are determined.

– The Envelope energy formatter operates on the matrix X starting from subband sample 0. The unit needs the time frequency grid and the additional control data as inputs.

– The formatted envelope data is subsequently quantised and Huffman coded, before being fed to the Bitstream multiplexer, where all SBR data is formatted and packed into an SBR frame. The SBR frame is transmitted as a fill-element in the bitstream multiplex together with the AAC channel element for the current frame. In case of a Parametric stereo SBR element, the current SBR frame is delayed one frame before entering the bitstream multiplexer (section 5.1.2).
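The data flow in the list above can be summarised as a dependency graph, from which one valid processing order follows. This is a sketch of the structure only: the module identifiers are hypothetical names chosen for illustration, the edges are limited to the dependencies named in the list, and qmf_write_offset is left as a parameter.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules whose output it needs.
DEPENDENCIES = {
    "qmf_analysis": set(),
    "transient_detector": {"qmf_analysis"},
    "frame_splitter": {"qmf_analysis"},
    "frame_generator": {"transient_detector", "frame_splitter"},
    "tonality_detector": {"qmf_analysis"},
    "additional_control_parameters": {"tonality_detector", "frame_generator"},
    "envelope_energy_formatter": {
        "qmf_analysis", "frame_generator", "additional_control_parameters",
    },
    "quantisation_and_huffman_coding": {"envelope_energy_formatter"},
    "bitstream_multiplexer": {"quantisation_and_huffman_coding"},
}


def start_subband_sample(module, qmf_write_offset):
    """Return the subband sample in X at which a module starts reading."""
    starts = {
        "transient_detector": 0,
        "frame_splitter": 0,
        "tonality_detector": qmf_write_offset,
        "envelope_energy_formatter": 0,
    }
    return starts[module]


def processing_order():
    """Return one order in which every module runs after its inputs."""
    return list(TopologicalSorter(DEPENDENCIES).static_order())


print(processing_order())
```

Only the tonality detector starts at subband sample qmfWriteOffset; the other modules that read X start at subband sample 0.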

Figure 5 SBR encoder overview