5.6.5 Additional sines estimation
3GPP TS 26.404, Release 17: General audio codec audio processing functions; Enhanced aacPlus general audio codec; Enhanced aacPlus encoder Spectral Band Replication (SBR) part
The additional sines estimation module estimates the frequency bands in which a strong sinusoidal component will be missing after high frequency reconstruction in the decoder. The detection result may include a new sinusoidal component only if the current frame contains a transient, as defined by the transient detector, or if the previous frame contained a transient positioned less than nine QMF slots from the trailing border of that frame. Any other new detection is removed.
The detection algorithm first calculates the input data on which the detection is done, based on the T and Tsbr values.
The detection system is based on using guide-vectors holding information on previous detections. There are two different guide-vectors:
– guideVectorDiff (has the frequency resolution of the scalefactor bands)
– guideVectorOrig (has the frequency resolution of the QMF)
For every frame two tonality estimates in time are available, and hence two estimates in time of the diff, sfm and sfmSbr parameters are available as well. For every estimate a detection is done using the guide-vectors from the previous detection. The results of the separate detections are finally merged into one decision reflecting the current frame.
The detection algorithm is applied for every estimate, using guide-vectors from the previous detection and producing a detection vector and new guide-vectors. The algorithm is outlined below for tonality estimate l0.
Firstly, for every scalefactor band the difference signal is compared to a threshold thresTemp. The threshold is calculated based on the guide-vectors and a decay-factor according to:
thresTemp = guideVectorDiff[i][l0] ?
            max(decayGuideDiff*guideVectorDiff[i][l0], thresHoldDiffGuide) :
            thresHoldDiff;
thresTemp = min(thresTemp, thresHoldDiff);
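To illustrate how the decay lowers the threshold for a band with a previous detection, the computation above can be wrapped in a small C helper. This is a sketch only: the numeric values of thresHoldDiff, thresHoldDiffGuide and decayGuideDiff are placeholders chosen for the example and are not taken from this specification, and the guide value is assumed to be non-negative.

```c
#include <assert.h>

/* Placeholder values for illustration only; not taken from the specification. */
#define THRES_HOLD_DIFF        25.0f
#define THRES_HOLD_DIFF_GUIDE   2.0f
#define DECAY_GUIDE_DIFF        0.5f

/* Threshold for one scalefactor band. guideDiff is the guide value left by
   the previous detection; zero means there was no previous detection. */
float diff_threshold(float guideDiff)
{
    float thresTemp;

    assert(guideDiff >= 0.0f);   /* assumption of this sketch */
    if (guideDiff != 0.0f) {
        float decayed = DECAY_GUIDE_DIFF * guideDiff;
        thresTemp = (decayed > THRES_HOLD_DIFF_GUIDE) ? decayed
                                                      : THRES_HOLD_DIFF_GUIDE;
    } else {
        thresTemp = THRES_HOLD_DIFF;
    }
    /* The guided threshold never exceeds the unguided one. */
    return (thresTemp < THRES_HOLD_DIFF) ? thresTemp : THRES_HOLD_DIFF;
}
```

With these placeholder values, a band without a previous detection uses 25, a guide value of 10 gives 5, and a guide value of 2 gives the floor of 2, since the decayed value 1 lies below thresHoldDiffGuide.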
If the difference diff for a scalefactor band is higher than the threshold, the detection vector is set to one for this scalefactor band, and the new guide vector is given the current difference value for that band. If the difference is lower than the threshold, but the guide vector indicates that the scalefactor band had a detected missing sine for the previous tonality estimate, the guide vector guideVectorOrig is assigned the thresHoldToneGuide value, in order to track the decay of the original tone instead of the difference signal. This is outlined for scalefactor band i in the following pseudo-code:
if(diff[i][l0] > thresTemp){
    detVec[i][l0] = 1;
    guideVectorDiff[i][l0+1] = diff[i][l0];
}
else{
    if(guideVectorDiff[i][l0]){
        guideVectorOrig[i][l0] = thresHoldToneGuide;
    }
}
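The first detection step can be sketched in C for all scalefactor bands of one estimate. The sketch makes several assumptions that are not stated in the text: the band count N_SFB and the value of thresHoldToneGuide are placeholders, the per-band thresholds are passed in precomputed, the two-dimensional guide-vectors are split into a "current" and a "next" array, and the detection vector and next guide vector are cleared before use.

```c
#include <assert.h>

#define N_SFB 4                      /* hypothetical number of scalefactor bands */
#define THRES_HOLD_TONE_GUIDE 2.0f   /* placeholder value, not from the text */

/* First detection for one estimate. guideDiff and guideOrig belong to the
   current estimate; nextGuideDiff receives the values for the next one. */
void first_detection(const float diff[N_SFB], const float thres[N_SFB],
                     const float guideDiff[N_SFB], float guideOrig[N_SFB],
                     float nextGuideDiff[N_SFB], int detVec[N_SFB])
{
    assert(N_SFB > 0);
    for (int i = 0; i < N_SFB; i++) {
        detVec[i] = 0;               /* assumption: outputs start cleared */
        nextGuideDiff[i] = 0.0f;
        if (diff[i] > thres[i]) {
            detVec[i] = 1;
            nextGuideDiff[i] = diff[i];
        } else if (guideDiff[i] != 0.0f) {
            /* Previously detected band: start tracking the original tone. */
            guideOrig[i] = THRES_HOLD_TONE_GUIDE;
        }
    }
}
```

A band whose difference exceeds its threshold is flagged and seeds the next guide value; a band that falls below the threshold but was detected before is handed over to guideVectorOrig.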
A second detection is done for all scalefactor bands where guideVectorOrig is not zero. The threshold used is calculated according to:
thresOrig = max(guideVectorOrig[i][l0]*decayGuideOrig,thresHoldToneGuide);
thresOrig = min(thresOrig,thresHoldTone);
If the tonality value in T for any QMF subband within a scalefactor band is above the threshold, the detection vector element for this scalefactor band is set to one, and the new guide vector is given the tonality value. The following pseudo-code outlines the second round of detection for scalefactor band i, where ll and lu are the lower and upper QMF subband borders of the present scalefactor band:
if(guideVectorOrig[i][l0]){
    for(j = ll; j < lu; j++){
        if(T[j][l0] > thresOrig){
            detVec[i][l0] = 1;
            guideVectorOrig[i][l0+1] = T[j][l0];
        }
    }
}
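The second round, including its threshold, can be sketched in C for a single scalefactor band. The constants are placeholders rather than values from this specification, and T is simplified to a one-dimensional array holding the tonality of each QMF subband for the current estimate.

```c
#include <assert.h>

/* Placeholder values for illustration only; not taken from the specification. */
#define THRES_HOLD_TONE        15.0f
#define THRES_HOLD_TONE_GUIDE   2.0f
#define DECAY_GUIDE_ORIG        0.5f

/* Second detection round for the band covering QMF subbands ll..lu-1.
   Returns 1 on detection and writes the new guide value to *nextGuideOrig. */
int second_detection(const float *T, int ll, int lu,
                     float guideOrig, float *nextGuideOrig)
{
    float thresOrig;
    int detected = 0;

    assert(ll < lu);
    if (guideOrig == 0.0f)
        return 0;                    /* band is not being tracked */

    /* Decayed guide value, kept between the guide floor and the tone threshold. */
    thresOrig = guideOrig * DECAY_GUIDE_ORIG;
    if (thresOrig < THRES_HOLD_TONE_GUIDE) thresOrig = THRES_HOLD_TONE_GUIDE;
    if (thresOrig > THRES_HOLD_TONE)       thresOrig = THRES_HOLD_TONE;

    for (int j = ll; j < lu; j++) {
        if (T[j] > thresOrig) {
            detected = 1;
            *nextGuideOrig = T[j];   /* as in the pseudo-code, the last hit wins */
        }
    }
    return detected;
}
```

Because the loop mirrors the pseudo-code, the new guide value is the tonality of the last subband above the threshold, not necessarily the largest one in the band.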
Finally, for every scalefactor band, a detection is done in order to make sure that a single strong sinusoid in the original signal is not replaced (by patching) by several strong sinusoids in the SBR signal. For all scalefactor bands larger than one QMF subband, the values of sfm and sfmSbr are compared. This is done according to:
for(j = ll; j < lu; j++){
    if(T[j][l0] > thresOrig &&
       (sfmSbr[i][l0] > sfmThresSbr && sfm[i][l0] < sfmThresOrig)){
        detVec[i][l0] = 1;
        guideVectorOrig[i][l0+1] = T[j][l0];
    }
}
However, for scalefactor bands containing only one QMF subband, the detection is instead done according to:
if(T[ll][l0] > thresHoldTone &&
   (diff[i+1][l0] < 1/thresHoldTone ||
    diff[i-1][l0] < 1/thresHoldTone)){
    detVec[i][l0] = 1;
    guideVectorOrig[i][l0+1] = T[ll][l0];
}
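The two variants of this last check can be placed side by side in C. This is a sketch under assumptions: all threshold values are placeholders, T is reduced to one tonality value per QMF subband, and the neighbouring difference values diff[i-1] and diff[i+1] are passed in directly, so the handling of the first and last scalefactor band is left to the caller.

```c
#include <assert.h>

/* Placeholder values for illustration only; not taken from the specification. */
#define THRES_HOLD_TONE  15.0f
#define SFM_THRES_SBR     0.3f
#define SFM_THRES_ORIG    0.1f

/* Band spanning several QMF subbands (ll..lu-1): flatness-based test. */
int detect_wide_band(const float *T, int ll, int lu, float thresOrig,
                     float sfm, float sfmSbr, float *nextGuideOrig)
{
    int detected = 0;

    assert(lu - ll > 1);
    for (int j = ll; j < lu; j++) {
        if (T[j] > thresOrig && sfmSbr > SFM_THRES_SBR && sfm < SFM_THRES_ORIG) {
            detected = 1;
            *nextGuideOrig = T[j];
        }
    }
    return detected;
}

/* Band holding a single QMF subband: test against the difference values of
   the neighbouring bands (diffPrev is diff[i-1], diffNext is diff[i+1]). */
int detect_narrow_band(float tonality, float diffPrev, float diffNext,
                       float *nextGuideOrig)
{
    if (tonality > THRES_HOLD_TONE &&
        (diffNext < 1.0f / THRES_HOLD_TONE || diffPrev < 1.0f / THRES_HOLD_TONE)) {
        *nextGuideOrig = tonality;
        return 1;
    }
    return 0;
}
```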
The above is applied for every estimate, i.e. twice per frame. If a new detection is allowed, e.g. there is a transient present in the frame, the following additional algorithmic step is performed:
– Identify adjacent scalefactor bands where detection of a missing sine is done in both bands
– Find the QMF subband within each scalefactor band that has the highest tonality
– If the two QMF subbands with the highest tonality values are adjacent, remove the detection for the scalefactor band with the lower tonality.
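The three steps above can be sketched in C as follows. The names sfbBorders and max_tonality_pos are hypothetical, T is reduced to one tonality value per QMF subband, and two choices are assumptions of the sketch rather than requirements of the text: band pairs are processed from low to high frequency, and on equal tonality the lower band is kept.

```c
#include <assert.h>

/* Index of the QMF subband with the highest tonality in [ll, lu). */
static int max_tonality_pos(const float *T, int ll, int lu)
{
    int pos = ll;
    for (int j = ll + 1; j < lu; j++)
        if (T[j] > T[pos])
            pos = j;
    return pos;
}

/* sfbBorders has nSfb + 1 entries; band i covers the QMF subbands
   sfbBorders[i] .. sfbBorders[i+1]-1. */
void remove_adjacent_detections(const float *T, const int *sfbBorders,
                                int nSfb, int *detVec)
{
    assert(nSfb > 0);
    for (int i = 0; i + 1 < nSfb; i++) {
        if (!detVec[i] || !detVec[i + 1])
            continue;                /* need a detection in both bands */
        int lo = max_tonality_pos(T, sfbBorders[i], sfbBorders[i + 1]);
        int hi = max_tonality_pos(T, sfbBorders[i + 1], sfbBorders[i + 2]);
        if (hi - lo == 1) {
            /* Peaks sit on both sides of the shared border: keep the stronger. */
            if (T[lo] < T[hi])
                detVec[i] = 0;
            else
                detVec[i + 1] = 0;
        }
    }
}
```

For three bands of two subbands each with tonalities {1, 9, 8, 1, 1, 7} and detections in all bands, the peaks of bands 0 and 1 are adjacent (subbands 1 and 2), so the detection in band 1 is removed while bands 0 and 2 keep theirs.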
Finally the detection decisions from the different detections are merged together, according to:
for(i = 0; i < nSfb; i++){
    for(est = start; est < totNoEst; est++){
        bs_add_harmonic[i] = bs_add_harmonic[i] || detVec[i][est];
    }
}
Here start equals two if the newDetectionAllowed flag is set, otherwise it is set to zero.
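The merge, including the choice of start, can be written as a self-contained C function. The value of TOT_NO_EST is a placeholder, and clearing bs_add_harmonic before the merge is an assumption of this sketch; the pseudo-code itself only shows the OR operation.

```c
#include <assert.h>

#define TOT_NO_EST 4   /* hypothetical total number of estimates */

/* Merge the per-estimate detections into one decision per scalefactor band. */
void merge_detections(int detVec[][TOT_NO_EST], int nSfb,
                      int newDetectionAllowed, int *bs_add_harmonic)
{
    int start = newDetectionAllowed ? 2 : 0;

    assert(nSfb >= 0);
    for (int i = 0; i < nSfb; i++) {
        bs_add_harmonic[i] = 0;   /* assumption: the result starts out cleared */
        for (int est = start; est < TOT_NO_EST; est++)
            bs_add_harmonic[i] = bs_add_harmonic[i] || detVec[i][est];
    }
}
```

With new detections allowed, a band detected only in estimate 0 or 1 does not contribute to the merged decision, whereas it does when start is zero.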
If the newDetectionAllowed flag is not set, detections that were not present before are removed, according to:
if(!newDetectionAllowed){
    for(i = 0; i < nSfb; i++){
        if(bs_add_harmonic[i] - prev_bs_add_harmonic[i] > 0)
            bs_add_harmonic[i] = 0;
    }
}
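A runnable C version of this pruning step is given below as a sketch; the function name and the flat integer arrays are choices made for the example.

```c
#include <assert.h>

/* When no new detection is allowed, drop every detection that was not
   already present in the previous frame. */
void prune_new_detections(int *bs_add_harmonic, const int *prev_bs_add_harmonic,
                          int nSfb, int newDetectionAllowed)
{
    assert(nSfb >= 0);
    if (newDetectionAllowed)
        return;                      /* nothing to prune */
    for (int i = 0; i < nSfb; i++) {
        if (bs_add_harmonic[i] - prev_bs_add_harmonic[i] > 0)
            bs_add_harmonic[i] = 0;
    }
}
```

For a previous frame {1, 0, 0, 1} and a current decision {1, 1, 0, 0} with new detections disallowed, the result is {1, 0, 0, 0}: the detection in band 1 is new and is removed, while a detection that disappears (band 3) is left as it is.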
Apart from detecting the scalefactor bands in which a sinusoid should be added, the module also calculates an energy compensation vector. This is used in the envelope estimation module.
For every scalefactor band where a missing sine has been detected, the maximum tonality value in the T matrix is found; its position is indicated by maxPosF (the subband) and maxPosT (the QMF slot). If maxPosF coincides with a scalefactor band border and a detection was not done for the adjacent scalefactor band, a compensation value is calculated according to (here outlined for the case where maxPosF coincides with the lower scalefactor band border):
compValue = (int) (fabs(ILOG2*log(diff[i-1][maxPosT] + EPS)) + 0.5f);
if (compValue > maxComp)
    compValue = maxComp;
if(!pAddHarmonicsScaleFactorBands[i-1]) {
    if(tonality[maxPosF-1][maxPosT] > tonalityQuota*tonality[maxPosF][maxPosT]){
        compVec[i-1] = -1*compValue;
    }
}
Finally, the detection algorithm compensates for the case where a strong sinusoid is present in the patched SBR signal although there was no strong sinusoid in the original, while at the same time a sinusoid is missing in the adjacent scalefactor band. This is done for all scalefactor bands where a sine is missing (except for the first and the last scalefactor band), according to the following:
compValue = (int) (fabs(ILOG2*log(diff[i-1][maxPosT] + EPS)) + 0.5f);
if (compValue > maxComp)
    compValue = maxComp;
if(1/diff[i-1][maxPosT] > diffQuota*diff[i][maxPosT]){
    compVec[i-1] = -1*compValue;
}
compValue = (int) (fabs(ILOG2*log(diff[i+1][maxPosT] + EPS)) + 0.5f);
if (compValue > maxComp)
    compValue = maxComp;
if(1/diff[i+1][maxPosT] > diffQuota*diff[i][maxPosT]){
    compVec[i+1] = compValue;
}
The bitstream element bs_add_harmonic_flag is set to one if any element of bs_add_harmonic is non-zero; otherwise it is set to zero.