8 Sequences for finding the 20 ms framing of the GSM enhanced full rate speech encoder

3GPP TS 46.054, Release 17: Test sequences for the GSM Enhanced Full Rate (EFR) speech codec

When testing the decoder, the test sequences are aligned to the decoder framing by the air interface (testing of the MS), or the alignment can easily be reached on the Abis interface (testing on the network side).

When testing the encoder, there is usually no information available about where the encoder starts its 20 ms segments of the speech input.

In the following, a procedure is described to find the 20 ms framing of the encoder using special synchronisation sequences. This procedure can be used for the MS as well as for the network side.

Synchronisation is achieved in two steps: first bit synchronisation is found, then frame synchronisation is determined. The procedure takes advantage of the codec homing feature of the enhanced full rate codec, which puts the codec into a defined home state on reception of the first homing frame. On reception of further homing frames, the output of the codec is predefined and can therefore be used as a trigger.

8.1 Bit synchronisation

The input to the speech encoder is a series of 13 bit words (104 kbit/s, 13 bit linear PCM). When testing of the speech encoder starts, nothing is known about bit synchronisation, i.e. where the encoder expects its least significant bits and where it expects its most significant bits.

The encoder homing frame consists of 160 samples, each with all bits set to zero except the least significant bit, which is set to one (0 0000 0000 0001 binary, or 0x0008 hex if written left justified into 16 bit words). If two such encoder homing frames are input to the encoder consecutively, the decoder homing frame is expected at the output in response to the second encoder homing frame.
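As an illustration, the encoder homing frame described above can be built as follows. This is a minimal sketch: the function names and the byte order used when packing the 16 bit words are assumptions made for the example and are not specified in this clause.

```python
import struct

FRAME_SAMPLES = 160           # samples per 20 ms frame
HOMING_SAMPLE = 0x0001 << 3   # 13 bit value 1, left justified in a 16 bit word -> 0x0008


def encoder_homing_frame():
    """Return one encoder homing frame as a list of 160 16-bit words."""
    return [HOMING_SAMPLE] * FRAME_SAMPLES


def pack_words(samples, little_endian=True):
    """Pack 16-bit words into bytes.

    The byte order is an assumption of this sketch; it is not stated in the text.
    """
    fmt = ("<" if little_endian else ">") + "%dH" % len(samples)
    return struct.pack(fmt, *samples)


frame_bytes = pack_words(encoder_homing_frame())
```

Each packed frame occupies 160 * 2 = 320 bytes, which agrees with the per-frame factor used in the size formulas of clause 8.3.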

Since there are only 13 possibilities for bit synchronisation, after a maximum of 13 trials bit synchronisation can be reached. In each trial three consecutive encoder homing frames are input to the encoder. If the decoder homing frame is not detected at the output, the relative bit position of the three input frames is shifted by one and another trial is performed. As soon as the decoder homing frame is detected at the output, bit synchronisation is found, and the first step can be terminated.
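The trial procedure can be sketched as a simple search loop. The callback `trial` is hypothetical: it stands for whatever test equipment inputs three consecutive encoder homing frames at the given bit offset and reports whether the decoder homing frame was detected at the output. How the bit shift is applied on a real interface is not modelled here.

```python
BIT_POSITIONS = 13  # one trial per possible alignment of the 13 bit word


def find_bit_sync(trial):
    """Return the first bit offset for which trial(offset) is true.

    trial(offset) is a hypothetical callback that inputs three encoder
    homing frames shifted by `offset` bits and returns True if the
    decoder homing frame is detected at the encoder output.
    Returns None if none of the 13 trials succeeds.
    """
    for offset in range(BIT_POSITIONS):
        if trial(offset):
            return offset  # bit synchronisation found, stop the first step
    return None


# Toy stand-in for a device under test that expects an offset of 5 bits.
found = find_bit_sync(lambda offset: offset == 5)
```

With the toy stand-in, the search stops at the sixth trial; on real equipment the callback would replace the lambda.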

The reason why three consecutive encoder homing frames are needed is that frame synchronisation is not known at this stage. To be sure that the encoder reads two complete homing frames, three frames have to be input. Wherever the encoder has its 20 ms segmentation, it will always read at least two complete encoder homing frames.
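The argument can be checked numerically. The sketch below counts, for every possible position of the encoder framing, how many encoder frames fall entirely inside the block of homing samples. The name `complete_frames` and the definition of `offset` (samples between the start of the input and the next encoder frame boundary) are introduced only for this example.

```python
FRAME = 160  # samples per 20 ms frame


def complete_frames(offset, n_input_frames=3):
    """Count encoder frames lying entirely inside the homing input.

    offset: samples (0..159) from the start of the homing input to the
    next encoder frame boundary.
    """
    total = n_input_frames * FRAME
    return (total - offset) // FRAME


# Worst case over all 160 possible framings.
worst_with_three = min(complete_frames(d, 3) for d in range(FRAME))
worst_with_two = min(complete_frames(d, 2) for d in range(FRAME))
```

With three input frames the worst case is two complete homing frames; with only two input frames it drops to one, which would reset the encoder but not produce the decoder homing frame at the output.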

An example of the 13 different frame triplets is given in sequence BITSYNC.INP (see table 7).

8.2 Frame synchronisation

Once bit synchronisation is found, frame synchronisation can be found by inputting one special frame that delivers 160 different output frames, depending on the 160 different positions that this frame can possibly have with respect to the encoder framing.

This special synchronisation frame was found by taking one input frame and shifting it through the positions 0 to 159. The corresponding 160 encoded speech frames were calculated and it was verified that all 160 output frames were different. When shifting the input synchronisation frame, the samples at the beginning were set to 0x0008 hex, which corresponds to the samples of the encoder homing frame.
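One way to picture the shifting is sketched below. It assumes that a delay of `delay` samples means the encoder frame starts with `delay` homing samples (0x0008) followed by the first `160 - delay` samples of the synchronisation frame, and that the remaining samples fall into the next encoder frame. This is an interpretation of the description above; the function name is hypothetical.

```python
FRAME = 160
HOMING_SAMPLE = 0x0008  # sample value of the encoder homing frame


def shifted_frame(sync_frame, delay):
    """Return the 160 samples seen in one encoder frame when the
    synchronisation frame is delayed by `delay` samples (0..159).

    Assumption of this sketch: the leading gap is filled with homing
    samples and the tail of the synchronisation frame is cut off.
    """
    if len(sync_frame) != FRAME:
        raise ValueError("synchronisation frame must contain 160 samples")
    if not 0 <= delay < FRAME:
        raise ValueError("delay must be in the range 0..159")
    return [HOMING_SAMPLE] * delay + list(sync_frame[:FRAME - delay])


# Placeholder data only; the real synchronisation frame is in SEQSYNC.INP.
placeholder = [(i << 3) & 0xFFF8 for i in range(FRAME)]
variants = [shifted_frame(placeholder, d) for d in range(FRAME)]
```

Encoding each of the 160 variants and checking that the outputs are pairwise different corresponds to the verification described in the text.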

Before this special synchronisation frame is input, the encoder again has to be reset by one encoder homing frame. A second encoder homing frame is needed to provoke a decoder homing frame at the output, which can be used as a trigger. Since the framing of the encoder is not known at that stage, three encoder homing frames have to precede the special synchronisation frame. This ensures that the encoder reads at least two complete homing frames and that at least one decoder homing frame is produced at the output, serving as a trigger for recording.

The special synchronisation frame, preceded by the three encoder homing frames, is given in SEQSYNC.INP. The corresponding 160 different output frames are given in SYNC000.COD through SYNC159.COD. The three-digit number in the filename indicates the number of samples by which the input was retarded with respect to the encoder framing. A corresponding shift in the opposite direction aligns the input with the encoder framing.
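The lookup against the 160 reference frames can be sketched as follows. It assumes the frame recorded after the trigger has already been captured in the same format as the SYNCXXX.COD files, so that a byte-wise comparison is meaningful; the function names are hypothetical.

```python
def reference_name(delay):
    """Return the reference file name for a delay of `delay` samples."""
    return "SYNC%03d.COD" % delay


def find_frame_delay(recorded, references):
    """Return the index of the reference frame equal to `recorded`.

    recorded:   bytes of the encoded frame recorded after the trigger
    references: sequence of 160 bytes objects, indexed by delay
                (contents of SYNC000.COD ... SYNC159.COD)
    Returns None if no reference frame matches.
    """
    for delay, reference in enumerate(references):
        if recorded == reference:
            return delay
    return None


# Placeholder references, used only to exercise the lookup.
fake_references = [bytes([d]) * 4 for d in range(160)]
delay = find_frame_delay(fake_references[42], fake_references)
matching_file = reference_name(delay)
```

The returned delay is the number of samples by which the input has to be shifted in the opposite direction to align it with the encoder framing.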

8.3 Formats and sizes of the synchronisation sequences

BITSYNC.INP:

This sequence consists of 13 frame triplets. It has the format of the speech encoder input test sequences (13 bit left justified with the three least significant bits set to zero).

The size of it is therefore:

SIZE (BITSYNC.INP) = 13 * 3 * 160 * 2 bytes = 12480 bytes

SEQSYNC.INP:

This sequence consists of 3 encoder homing (reset) frames and the special synchronisation frame. It has the format of the speech encoder input test sequences (13 bit left justified with the three least significant bits set to zero).

The size of it is therefore:

SIZE (SEQSYNC.INP) = 4 * 160 * 2 bytes = 1280 bytes

SYNCXXX.COD:

These sequences consist of 1 encoder output frame each. They have the format of the speech encoder output test sequences (16 bit words right justified). The values of the VAD and SP flags are set to one in these files.

The size of them is therefore:

SIZE (SYNCXXX.COD) = (244 + 2) * 2 bytes = 492 bytes
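The three size formulas can be checked with a few lines of arithmetic. Reading the term 244 as the encoded bits of one EFR frame (12.2 kbit/s over 20 ms), each stored in one 16 bit word, and the term 2 as the VAD and SP flag words is an interpretation of the formula above; the constant names are chosen for this example.

```python
BYTES_PER_WORD = 2      # all sequences are stored as 16 bit words
FRAME_SAMPLES = 160     # input samples per 20 ms frame
ENCODED_WORDS = 244     # assumed: one word per encoded bit of an EFR frame
FLAG_WORDS = 2          # assumed: one word each for the VAD and SP flags

# 13 bit positions, three homing frames per trial
bitsync_size = 13 * 3 * FRAME_SAMPLES * BYTES_PER_WORD

# three homing frames plus the special synchronisation frame
seqsync_size = (3 + 1) * FRAME_SAMPLES * BYTES_PER_WORD

# one encoder output frame per file
sync_cod_size = (ENCODED_WORDS + FLAG_WORDS) * BYTES_PER_WORD

print(bitsync_size, seqsync_size, sync_cod_size)
```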

Table 7 summarises this information.

Table 7: Location, size and justification of synchronisation sequences

Disk No. | Purpose of Sequence            | Name of Sequence | No. of Frames | Size in Bytes | Justification
---------+--------------------------------+------------------+---------------+---------------+--------------
3/8      | Bit Synchronisation            | BITSYNC.INP      | 39            | 12480         | Left
3/8      | Frame Synchronisation (input)  | SEQSYNC.INP      | 4             | 1280          | Left
3/8      | Frame Synchronisation (output) | SYNC000.COD      | 1             | 492           | Right
3/8      |                                | SYNC001.COD      | 1             | 492           | Right
3/8      |                                | SYNC002.COD      | 1             | 492           | Right
...      |                                | ...              | ...           | ...           | ...
3/8      |                                | SYNC159.COD      | 1             | 492           | Right