Introduction
3GPP TS 46.008, Release 17: Half rate speech; Performance Characterization of the GSM Half Rate speech codec
During five years of activity, the Traffic CHannel Half rate Speech (TCH-HS) Experts Group has produced a number of test plans and experiments to assess the performance of the candidate algorithms submitted for the GSM half rate standardization. This work drew on a large knowledge base made available from previous CCITT (now ITU-T) and ETSI activities on codec assessment (see annex A references 1) to 5)) and on recommendations in the field (see annex A references 6) to 8)).
This report covers three phases of the standardization of the GSM half rate codec: Characterization Phase 1, Characterization Phase 2 and the Verification phase. The selection of the codec candidate for the GSM half rate traffic channel was based on the results of Characterization Phase 1. The test results reported hereafter are based on version 3.3 of the GSM half rate codec.
Characterization Phase 1 (Experiments 1 to 5): For characterization Phase 1, C-simulations of the candidate codecs were used as hardware implementations were not available at that time. The simulations were produced by MOTOROLA (USA) and Ericsson (Sweden) with support by MATRA (France). The following experiments were carried out:
– Experiment 1: Quality under error conditions (A-law, IRS);
– Experiment 2: Quality under error conditions (UPCM, No IRS);
– Experiment 3: Quality under tandeming conditions;
– Experiment 4: Quality under background noise conditions (ACR);
– Experiment 5: Quality under background noise conditions (DCR).
Characterization Phase 2 (Experiments 6 to 9): During Characterization Phase 2, a hardware implementation of the candidate algorithm was employed, provided by ANT (Germany). The following experiments were carried out:
– Experiment 6: Assessment of equivalent qdu;
– Experiment 7: Effect of tandeming with other standards;
– Experiment 8: Talker Dependency;
– Experiment 9: Assessment of DTX algorithm.
Verification phase: Further tests accompanied characterization Phases 1 and 2 to obtain a better knowledge of the characteristics of the GSM half rate codec and its performance under different operational conditions:
– Special background noise;
– Channel activity in DTX mode;
– Performance with DTMF tones;
– Performance with signalling tones;
– Delay;
– Frequency response;
– Complexity.
For the characterization tests, a practical "indirect" method of performance comparison between different codecs was adopted. It uses the Modulated Noise Reference Unit (MNRU) (see annex A reference 7)) as a reference degradation in a subjective experiment that also includes the codecs under test.
NOTE: The MNRU is a device designed for producing speech correlated noise that sounds subjectively like the quantizing noise produced by log-companded PCM codecs. The device is subjectively calibrated for Mean Opinion Scores (MOS) against Q dB (where Q is the ratio of the speech to speech-correlated noise power). The "Equivalent Q" of the codecs under test can then be found from the corresponding MOS on the calibration curve of the MNRU.
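The lookup described in the NOTE can be sketched as a linear interpolation on the MNRU calibration curve. This is a minimal illustration only: the function name `equivalent_q`, the clamping at the ends of the curve and the calibration points are assumptions made for the example and are not taken from the test plans or results.

```python
def equivalent_q(mos, calibration):
    """Map a MOS to an Equivalent Q (dB) on an MNRU calibration curve.

    calibration: list of (q_db, mos) pairs measured for the MNRU reference
    conditions in the same experiment. The curve is assumed to rise with Q.
    Values outside the measured range are clamped (an assumption of this
    sketch, since extrapolation is not reliable).
    """
    points = sorted(calibration)  # order the reference conditions by Q
    if mos <= points[0][1]:
        return points[0][0]
    if mos >= points[-1][1]:
        return points[-1][0]
    # Find the segment that brackets the MOS and interpolate linearly.
    for (q_lo, m_lo), (q_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= mos <= m_hi:
            if m_hi == m_lo:
                return q_lo
            return q_lo + (q_hi - q_lo) * (mos - m_lo) / (m_hi - m_lo)
    raise ValueError("calibration curve does not bracket the MOS")


# Illustrative calibration points only, not measured data.
EXAMPLE_CALIBRATION = [(5, 1.2), (10, 1.8), (15, 2.6), (20, 3.4), (25, 4.0), (30, 4.2)]
```

With these illustrative points, a codec condition scoring a MOS of 3.0 falls halfway between the 15 dB and 20 dB reference conditions, so its Equivalent Q is about 17.5 dB.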
It is well known that this procedure works as long as the reference degradation sounds similar to the degradation under test.
The MNRU provides the additional function of normalization across laboratories carrying out the same experiment, i.e. all MOS are converted to Equivalent Q (dB) and the results can be analysed statistically for differences between laboratories. An appropriate analysis of variance (ANOVA) was identified to evaluate the statistical significance of the experimental factors.
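As an illustration of testing for differences between laboratories, the sketch below computes the F statistic of a one-way ANOVA with "laboratory" as the only factor, applied to Equivalent Q values. The ANOVA design actually used by the expert group is not described here and also covered other experimental factors, so this single-factor form and the function name `one_way_anova_f` are assumptions.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: one list of Equivalent Q values (dB) per laboratory.
    """
    k = len(groups)                       # number of laboratories
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the laboratory means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own laboratory mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The F value would then be compared with the critical value of the F distribution for the two degrees of freedom at the chosen significance level.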
The aim was to show that the subjective performance of the GSM half rate algorithm is at least as good as that of the full rate codec over a selected set of conditions. To allow for experimental error, the half rate candidate was required to fall less than 1 dB below the performance of the full rate codec for the overall figure of merit, and less than 3 dB below the performance of the full rate codec for each individual test condition.
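The two margins can be expressed as a simple check. The sketch below assumes that both codecs are compared on an Equivalent Q scale in dB; the function name `meets_requirement`, its arguments and the condition names used in any call are hypothetical.

```python
def meets_requirement(hr_overall_q, fr_overall_q, per_condition,
                      overall_margin_db=1.0, condition_margin_db=3.0):
    """Check the half rate (HR) codec against the full rate (FR) codec.

    per_condition: dict mapping a condition name to (hr_q_db, fr_q_db).
    Returns (passed, failing_conditions).
    """
    # Overall figure of merit: HR must be better than FR minus 1 dB.
    overall_ok = hr_overall_q > fr_overall_q - overall_margin_db

    # Individual conditions: HR must be better than FR minus 3 dB.
    failing = [
        name
        for name, (hr_q, fr_q) in per_condition.items()
        if not hr_q > fr_q - condition_margin_db
    ]
    return overall_ok and not failing, failing
```

Returning the list of failing conditions, rather than a bare pass or fail, makes it easy to see which individual condition exceeded the 3 dB margin.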
To model its use in a network, the half rate candidate codec was placed between either an ITU-T Recommendation G.711 [1] PCM coder and decoder or a Uniform PCM coder and decoder, which provided the necessary A/D and D/A conversions. Source files of speech, produced either by using an "average" telephone set (called IRS, the Intermediate Reference System) or a microphone with a "flat" sending frequency characteristic (No IRS or "flat"), were then processed through the different experimental conditions for presentation to subjects in listening experiments. The experimental conditions included error conditions at different input levels under both the IRS A-law PCM and the No-IRS linear PCM audio parts, tandeming conditions for different error patterns, and background noise conditions. During all phases of testing, the host laboratory functions for the processing were provided by Aachen University of Technology (RWTH at Aachen, Germany).
The whole set of "individual" and "global" data collected in Experiments 1 to 9 was extensively analysed and discussed within the TCH-HS expert group. For each condition, the MOS (or DMOS for Experiment 5) was computed separately for male and female speech, as well as averaged together, and the effects of the different factors and their interactions were subjected to analysis of variance (ANOVA). Within characterization Phase 1, the results were converted to Q values and weighted averages were calculated for the whole set of results, to verify that the global figure of merit of the GSM half rate algorithm met the quality requirement.
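The per-condition scoring and the weighted figure of merit can be sketched as follows. The weights applied to the conditions are not given in this clause, so the weights, the function names and the choice to pool all votes for the combined score are assumptions for illustration.

```python
from statistics import mean


def mos_by_talker_sex(votes):
    """Compute MOS (or DMOS) for one condition.

    votes: list of (sex, score) pairs, with sex being "male" or "female".
    The combined score pools all votes (an assumption of this sketch).
    """
    male = [score for sex, score in votes if sex == "male"]
    female = [score for sex, score in votes if sex == "female"]
    return {
        "male": mean(male),
        "female": mean(female),
        "combined": mean(male + female),
    }


def weighted_figure_of_merit(q_by_condition, weights):
    """Weighted average of Equivalent Q values (dB) over the conditions."""
    total_weight = sum(weights[c] for c in q_by_condition)
    weighted_sum = sum(q * weights[c] for c, q in q_by_condition.items())
    return weighted_sum / total_weight
```

For example, two conditions with Equivalent Q values of 20 dB and 10 dB and hypothetical weights of 3 and 1 give a figure of merit of 17.5 dB.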