7.12 Send speech quality and noise intrusiveness in the presence of ambient noise
3GPP TS 26.132, Release 18: Speech and video telephony terminal acoustic test specification
7.12.1 Handset UE
The speech quality in sending for narrowband systems is tested based on ETSI TS 103 106 [34]. This test method leads to three MOS-LQOn quality numbers:
N-MOS-LQOn: Transmission quality of the background noise
S-MOS-LQOn: Transmission quality of the speech
G-MOS-LQOn: Overall transmission quality
The test arrangement is given in clause 5.1.5. The measurement is conducted for the 8 noise conditions described in Table 2d. All measurements should be made within one and the same dedicated call. The noise types shall be presented in the order specified in Table 2d.
Table 2d: Noise conditions used for ambient noise simulation in handset mode as specified in ES 202 396-1 [35]
Description | File name | Duration | Level | Type |
Recording in pub | Pub_Noise_binaural_V2 | 30 s | L: 75,0 dB(A) R: 73,0 dB(A) | Binaural |
Recording at pavement | Outside_Traffic_Road_binaural | 30 s | L: 74,9 dB(A) R: 73,9 dB(A) | Binaural |
Recording at pavement | Outside_Traffic_Crossroads_binaural | 20 s | L: 69,1 dB(A) R: 69,6 dB(A) | Binaural |
Recording at departure platform | Train_Station_binaural | 30 s | L: 68,2 dB(A) R: 69,8 dB(A) | Binaural |
Recording at the drivers position | Fullsize_Car1_130Kmh_binaural | 30 s | L: 69,1 dB(A) R: 68,1 dB(A) | Binaural |
Recording at sales counter | Cafeteria_Noise_binaural | 30 s | L: 68,4 dB(A) R: 67,3 dB(A) | Binaural |
Recording in a cafeteria | Mensa_binaural | 22 s | L: 63,4 dB(A) R: 61,9 dB(A) | Binaural |
Recording in business office | Work_Noise_Office_Callcenter_binaural | 30 s | L: 56,6 dB(A) R: 57,8 dB(A) | Binaural |
1) Before starting the measurements, a proper conditioning sequence shall be used. The conditioning sequence shall comprise the four additional sentences 1 to 4 described in ETSI TS 103 106 [34], applied at the beginning of the 16-sentence test sequence.
NOTE: The sequence of speech samples concatenated for the test signal, consisting of alternating talkers in the sending direction, reduces the overall test time but may represent an unrealistic behaviour for certain voice enhancement technologies. Alternative concatenations are for further study.
2) The send speech signal consists of the 16 sentences of speech as described in ETSI TS 103 106 [34]. The test signal level is -1,7 dBPa at the MRP, measured as the active speech level according to ITU-T P.56 [37]. Three signals are required for the tests:
– The clean speech signal is used as the undisturbed reference (see ETSI TS 103 106 [34], ETSI EG 202 396‑3 [36]).
– The speech plus undisturbed background noise signal is recorded at the terminal’s microphone position using an omnidirectional measurement microphone with a linear frequency response between 50 Hz and 12 kHz.
– The send signal is recorded at the POI.
3) N-MOS-LQOn, S-MOS-LQOn and G-MOS-LQOn are calculated as described in ETSI TS 103 106 [34] on a per sentence basis and averaged over all 16 sentences. The results shall be reported as average and standard deviation.
4) The measurement is repeated for each ambient noise condition described in Table 2d.
5) The average of the results derived from all ambient noise types is calculated.
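Steps 3) to 5) amount to a two-stage aggregation, which can be sketched as follows. This is a minimal illustration, not part of the test method: the function names and the dictionary layout are hypothetical, the per-sentence scores are assumed to have been computed already according to ETSI TS 103 106 [34], and the sample standard deviation is assumed because the text does not state which estimator is meant.

```python
from statistics import mean, stdev

SENTENCES_PER_CONDITION = 16  # the 16-sentence test sequence


def summarise_condition(per_sentence_scores):
    """Return (average, standard deviation) for one noise condition.

    Assumption: sample standard deviation (statistics.stdev); use
    statistics.pstdev instead if the population estimator is wanted.
    """
    if len(per_sentence_scores) != SENTENCES_PER_CONDITION:
        raise ValueError("expected one score per sentence (16 in total)")
    return mean(per_sentence_scores), stdev(per_sentence_scores)


def summarise_all(scores_by_condition):
    """Aggregate one MOS-LQOn type (N, S or G) over all noise conditions.

    scores_by_condition maps a noise name to its 16 per-sentence scores.
    Returns the per-condition (average, std) pairs and the average of the
    per-condition averages over all ambient noise types.
    """
    per_condition = {
        name: summarise_condition(scores)
        for name, scores in scores_by_condition.items()
    }
    overall = mean(avg for avg, _ in per_condition.values())
    return per_condition, overall
```

The same routine would be called three times, once each for the N-MOS-LQOn, S-MOS-LQOn and G-MOS-LQOn values.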
7.12.2 Hand-held hands-free UE
The speech quality in sending for narrowband systems is tested based on ETSI TS 103 106 [34]. This test method leads to three MOS-LQOn quality numbers:
N-MOS-LQOn: Transmission quality of the background noise
S-MOS-LQOn: Transmission quality of the speech
G-MOS-LQOn: Overall transmission quality
The test arrangement is given in clause 5.1.5.
When using the simulation method described in ETSI TS 103 224 [43], the measurement is conducted for the 5 noise conditions described in Table 2d2. When using the method of ETSI ES 202 396-1 [35], the equivalent binaurally recorded noises described in Table 2d2, which are available in the source file directory of ETSI TS 103 224 [43], are used.
Table 2d2: Noise conditions used for ambient noise simulation in hand-held hands-free mode as specified in TS 103 224 [43], A-weighted
Name | Description | Length | Hands-free Levels | Binaural L | Binaural R |
Full-size car 130 km/h (FullSizeCar_130) | HATS and microphone array at co-drivers position | 30 s | 1: 69,5 dB 2: 68,6 dB 3: 68,6 dB 4: 68,7 dB 5: 68,8 dB 6: 68,8 dB 7: 69,2 dB 8: 69,7 dB | 68,7 dB | 70,7 dB |
Crossroadnoise (Crossroadnoise) | HATS and microphone array standing outside near a crossroad | 30 s | 1: 69,9 dB 2: 69,6 dB 3: 69,6 dB 4: 69,9 dB 5: 69,6 dB 6: 69,5 dB 7: 69,6 dB 8: 69,7 dB | 70,8 dB | 71,6 dB |
Cafeteria (Cafeteria) | HATS and microphone array inside a cafeteria | 30 s | 1: 69,0 dB 2: 69,7 dB 3: 69,6 dB 4: 69,8 dB 5: 69,5 dB 6: 69,5 dB 7: 69,7 dB 8: 70,0 dB | 69,8 dB | 70,3 dB |
Sales Counter (SalesCounter) | HATS and microphone array in a supermarket | 30 s | 1: 65,5 dB 2: 65,3 dB 3: 65,2 dB 4: 65,5 dB 5: 65,6 dB 6: 65,3 dB 7: 65,2 dB 8: 65,3 dB | 66,7 dB | 66,6 dB |
Callcenter 2 (Callcenter) | HATS and microphone array in business office | 30 s | 1: 59,3 dB 2: 59,3 dB 3: 59,5 dB 4: 59,6 dB 5: 59,4 dB 6: 59,3 dB 7: 59,3 dB 8: 59,5 dB | 60,2 dB | 60,0 dB |
1) Before starting the measurements, a proper conditioning sequence shall be used. The conditioning sequence shall comprise the four additional sentences 1 to 4 described in ETSI TS 103 106 [34], applied at the beginning of the 16-sentence test sequence. The conditioning signal level is +1,3 dBPa at the MRP, measured as the active speech level according to ITU-T P.56 [37].
NOTE: The sequence of speech samples concatenated for the test signal, consisting of alternating talkers in the sending direction, reduces the overall test time but may represent an unrealistic behaviour for certain voice enhancement technologies. Alternative concatenations are for further study.
2) The send speech signal consists of the 16 sentences of speech as described in ETSI TS 103 106 [34]. The test signal level is +1,3 dBPa at the MRP, measured as the active speech level according to ITU-T P.56 [37]. Three signals are required for the tests:
– The clean speech signal is used as the undisturbed reference (see ETSI TS 103 106 [34], ETSI EG 202 396‑3 [36]).
– The speech plus undisturbed background noise signal is recorded at the terminal’s microphone position using an omnidirectional measurement microphone with a linear frequency response between 50 Hz and 12 kHz.
– The send signal is recorded at the POI.
3) N-MOS-LQOn, S-MOS-LQOn and G-MOS-LQOn are calculated as described in ETSI TS 103 106 [34] on a per sentence basis and averaged over all 16 sentences. The results shall be reported as average and standard deviation.
4) The measurement is repeated for each ambient noise condition described in Table 2d2.
5) The average of the results derived from all ambient noise types is calculated.
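The speech levels in clauses 7.12.1 and 7.12.2 are given in dBPa. For comparison with the A-weighted noise levels in the tables, it can help to express them as sound pressure levels. The sketch below only applies the standard reference pressure of 20 µPa, so that 1 Pa corresponds to roughly 94 dB SPL. The function name is hypothetical, and the conversion says nothing about the active speech level measurement of ITU-T P.56 [37] itself.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference for dB SPL

# Offset between dBPa (re 1 Pa) and dB SPL (re 20 uPa), about 93.98 dB
DBPA_TO_DB_SPL_OFFSET = 20.0 * math.log10(1.0 / REFERENCE_PRESSURE_PA)


def dbpa_to_db_spl(level_dbpa):
    """Convert a level in dBPa to dB SPL."""
    return level_dbpa + DBPA_TO_DB_SPL_OFFSET


handset_level = dbpa_to_db_spl(-1.7)     # handset test signal level
hands_free_level = dbpa_to_db_spl(1.3)   # hand-held hands-free level
print(f"handset: {handset_level:.1f} dB SPL")
print(f"hand-held hands-free: {hands_free_level:.1f} dB SPL")
print(f"difference: {hands_free_level - handset_level:.1f} dB")
```

The hand-held hands-free test signal is therefore 3 dB higher at the MRP than the handset test signal.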
7.12.3 Electrical interface UE
The speech quality in sending for narrowband systems is tested based on ETSI TS 103 106 [34]. This test method leads to three MOS-LQOn quality numbers:
N-MOS-LQOn: Transmission quality of the background noise
S-MOS-LQOn: Transmission quality of the speech
G-MOS-LQOn: Overall transmission quality
For measurements on an electrical interface UE, pre-recorded noisy speech signals according to Annex B of Recommendation ITU-T P.381 [53] shall be used. These noisy test sequences are available for the eight noise types described in Table 2d and were captured at the electrical output of a representative analogue headset. The corresponding speech level at the MRP was calibrated to -1,7 dBPa, as described in clause 7.12.1. All test signals also include the proper conditioning sequence described in ETSI TS 103 106 [34], which is applied at the beginning of the 16-sentence test sequence.
Annex B of Recommendation ITU-T P.381 [53] also provides the corresponding unprocessed reference speech signals, which are necessary for the calculation of S-MOS, N-MOS and G-MOS according to ETSI TS 103 106 [34]. These signals were recorded with an omnidirectional measurement microphone close to the input microphone of the representative headset.
1) The test arrangement is given in clause 5.1.6. For analogue interfaces, the noisy test sequences according to Annex B of Recommendation ITU-T P.381 [53] shall be calibrated such that -26 dBov corresponds to -60 dBV. For digital interfaces, -26 dBov shall correspond to -16 dBm0.
2) The noisy test sequence is fed into the electrical interface UE, and the send signal is recorded at the POI.
3) N-MOS-LQOn, S-MOS-LQOn and G-MOS-LQOn are calculated as described in ETSI TS 103 106 [34] (narrowband mode) on a per sentence basis and averaged over all 16 sentences. The results shall be reported as average and standard deviation. Three signals are required for the tests:
– The clean speech signal is used as the undisturbed reference (see ETSI TS 103 106 [34], ETSI EG 202 396‑3 [36]).
– The speech plus undisturbed background noise signal. For each noisy test signal, a corresponding signal is available in Annex B of Recommendation ITU‑T P.381 [53] as well.
– The send signal is recorded at the POI.
4) The measurement is repeated for each ambient noise condition described in Table 2d. For each of these noise types, a corresponding test signal is available in Annex B of Recommendation ITU‑T P.381 [53].
5) The average of the results derived from all ambient noise types is calculated.
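The calibration points in step 1) fix a constant offset between the digital file level and the interface level. The sketch below shows that arithmetic only. It assumes the calibration is a single fixed gain, so the same offset applies at every level, and all names are hypothetical.

```python
# Calibration anchors taken from step 1)
ANCHOR_DBOV = -26.0
ANALOGUE_ANCHOR_DBV = -60.0   # analogue interfaces
DIGITAL_ANCHOR_DBM0 = -16.0   # digital interfaces

# Fixed offsets implied by the anchors (assumption: pure gain scaling)
DBOV_TO_DBV_OFFSET = ANALOGUE_ANCHOR_DBV - ANCHOR_DBOV    # -34 dB
DBOV_TO_DBM0_OFFSET = DIGITAL_ANCHOR_DBM0 - ANCHOR_DBOV   # +10 dB


def dbov_to_dbv(level_dbov):
    """Map a file level in dBov to the analogue interface level in dBV."""
    return level_dbov + DBOV_TO_DBV_OFFSET


def dbov_to_dbm0(level_dbov):
    """Map a file level in dBov to the digital interface level in dBm0."""
    return level_dbov + DBOV_TO_DBM0_OFFSET


def dbv_to_volts_rms(level_dbv):
    """Convert a level in dBV to an RMS voltage in volts (0 dBV = 1 V)."""
    return 10.0 ** (level_dbv / 20.0)


# A file at -26 dBov maps to -60 dBV, i.e. an RMS voltage of 1 mV
print(dbv_to_volts_rms(dbov_to_dbv(-26.0)))
```

For a measured file level other than -26 dBov, the same offsets give the level expected at the interface, which can serve as a plausibility check on the playback chain.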