12 Inter-working

3GPP TS 26.114: IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction (Release 18)


12.1 General

To support inter-working between different networks, it is desirable that common codecs can be found for the connection. Requirements for different networks are described in this clause. In some cases, functionality is also needed in the network to make the inter-working possible (e.g. MGCF and MGW).

NOTE: The term MTSI MGW (or MTSI Media gateway) is used in a broad sense, as it is outside the scope of the current specification to make the distinction whether certain functionality should be implemented in the MGW or in the MGCF.

12.2 3G-324M

12.2.1 General

Inter-working functions are required between IMS and CS. There are separate functions, in e.g. an MGCF, for control-plane inter-working (see TS 29.163 [65]) and, in e.g. an IM-MGW, for user-plane inter-working. Control-plane inter-working includes for instance SIP ⬄ BICC and SIP ⬄ H.245 protocol translations, whereas user-plane inter-working requires transport protocol translations and possibly transcoding.

12.2.2 Codec usage

12.2.2.1 General

An interoperable set of speech, video and real-time text codecs is specified for 3G-324M and MTSI. Both video codec level and maximum bitrate can be specified as part of the call setup negotiation (see clause 12.2.5). Thus, it may be possible that the MTSI client in terminal and a CS UE agree on a common codec end-to-end without the need for MGW transcoding.

If a common codec is not found and the MTSI MGW does not support transcoding between any of the supported codecs, then the controlling MGCF may drop the unsupported media component. If the speech part cannot be supported, then the connection should not be set up.

12.2.2.2 Text

A channel for real-time text is specified in ITU-T H.324. Presentation and coding is specified according to ITU-T Recommendation T.140, which is also used for MTSI clients (see clause 7.4.4). Inter-working is a matter of establishing the text transport channels and moving the text contents between the two transport levels.

12.2.3 Payload format

See clause 7.4 of the present document.

12.2.4 MTSI media gateway trans-packetization

12.2.4.1 General

The MTSI MGW shall offer conversion between H.223 as used in 3G-324M on the CS side and RTP as used in IMS. This clause contains a list of inter-working functionalities that should be included.

12.2.4.2 Speech de-jitter buffer

The MTSI MGW should use a speech de-jitter buffer in the direction IMS to CS with sufficient performance to meet the 10 milliseconds maximum jitter requirement in clause 6.7.2 of ITU-T Recommendation H.324. H.324 specifies that transmission of each speech AL-SDU at the H.223 multiplex shall commence no later than 10 milliseconds after a whole multiple of the speech frame interval, measured from transmission of the first speech frame.
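The timing rule above can be illustrated with a minimal sketch. It assumes a 20 ms speech frame interval and send times given in milliseconds; the function name and the interpretation of the rule as "offset modulo frame interval" are illustrative assumptions, not part of the present document.

```python
def meets_h324_jitter_limit(send_times_ms, frame_interval_ms=20.0, max_jitter_ms=10.0):
    """Check that each speech AL-SDU starts no later than max_jitter_ms after
    a whole multiple of the frame interval, measured from the first AL-SDU."""
    if not send_times_ms:
        return True
    t0 = send_times_ms[0]
    for t in send_times_ms:
        # Delay relative to the most recent whole multiple of the frame interval.
        offset = (t - t0) % frame_interval_ms
        if offset > max_jitter_ms:
            return False
    return True
```

A de-jitter buffer would be dimensioned so that this check holds for the AL-SDU send times on the CS side.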

12.2.4.3 Video bitrate equalization

Temporary video rate variations can occur on the IMS side for example due to congestion. The video rate on the CS side, in contrast, is under full control of the CS side UE and the MGCF.

During session setup, the MGCF shall negotiate a video bitrate on the IMS side that allows all video bits to be conveyed to/from the CS link.

A buffer shall be maintained at the IM-MGW in the direction from the IMS to the CS side. The size of the buffer should be kept small enough to allow for a low end-to-end delay, yet large enough to conceal most network jitter on the IMS side. Temporary uneven traffic on the IMS side, beyond the handling capability of the buffer, should be handled as follows: if the buffer overflows, RTP packets should be dropped and the resulting loss and observed jitter should be reported by means of an RTCP RR at the earliest possible sending time. The drop strategy may preferably be media-aware (i.e. favouring dropping predicted information over non-predicted information and similar techniques), or may be drop-head. If the buffer runs empty, the CS side should insert appropriate flag stuffing.
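The drop-head variant of the IMS-to-CS buffer described above might be sketched as follows. The class name, the capacity expressed in packets and the `dropped` counter are hypothetical; a media-aware strategy would replace the simple `popleft()` on overflow.

```python
from collections import deque

class DropHeadBuffer:
    """Bounded FIFO that discards the oldest packet when full (drop-head)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0  # losses to be reflected in the next RTCP RR

    def push(self, packet):
        if len(self.queue) >= self.capacity:
            self.queue.popleft()  # overflow: drop the oldest RTP packet
            self.dropped += 1
        self.queue.append(packet)

    def pop(self):
        """Return the next packet, or None when the buffer is empty
        (the CS side would then insert flag stuffing)."""
        return self.queue.popleft() if self.queue else None
```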

A buffer shall be maintained in the direction from the CS to the IMS side. The size of the buffer should be kept small enough to allow for a low end-to-end delay, but large enough to conceal most network jitter on the CS side. If the buffer overflows, then video bits must be dropped, preferably in a media-aware fashion, i.e. at GOB/slice/picture boundaries. IM-MGWs may also take into account the type of media data, i.e. coded with or without prediction. When the buffer runs empty, no activity is required on the IMS side.

If the CS video call is changed to a speech-only call [46], the video component on the IMS side shall be dropped.

12.2.4.4 Data loss detection

If RTP packet loss is detected on input to the MTSI MGW at the IMS side, including losses caused by buffer-full condition as described above, corresponding H.223 AL-SDU sequence number increments should be made on the CS side to enable loss detection and proper concealment in the receiving CS UE.

If packet loss is detected on the CS side, e.g. through H.223 AL-SDU sequence numbers, those losses should be indicated towards the IMS side through corresponding RTP packet sequence number increments. The deliberate increments made for this reason will be visible in the RTCP RR from the MTSI client and the MTSI MGW should take that into account when acting on RTCP RR from the MTSI client, as the CS side losses are not related to the IMS network conditions.
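As an illustration of the IMS-to-CS direction, the following sketch derives the number of lost RTP packets from a gap in the 16-bit RTP sequence numbers and advances an AL-SDU sequence number accordingly. The function names are hypothetical, packet reordering is ignored, and the default modulus of 256 assumes an 8-bit AL-SDU sequence number field; other adaptation layer configurations would use a different modulus.

```python
def rtp_packets_lost(prev_seq, seq):
    """Packets missing between two consecutively received RTP sequence
    numbers (16-bit, wrapping). Reordering is not handled in this sketch."""
    return (seq - prev_seq - 1) % 65536

def next_al_sdu_seq(prev_al_seq, lost, modulus=256):
    """AL-SDU sequence number for the next SDU, skipping one value per lost
    RTP packet so that the CS UE can detect the loss."""
    return (prev_al_seq + 1 + lost) % modulus
```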

12.2.4.5 Data integrity indication

This is mainly relevant in the direction from CS to IMS. The H.223 AL-SDUs include a CRC that forms an unreliable indication of data corruption. On the IMS side, no generic protocol mechanisms are available to convey this CRC and/or the result of a CRC check. The MTSI MGW shall discard any AL-SDUs which fail a CRC check and are not of a payload type that supports the indication of possible bit errors in the RTP payload header or data. If such payload type is in use, the MTSI MGW may forward corrupted packets, but in this case shall indicate the possible corruption by the means available in the payload header or data. One example is setting the Q bit of RFC 3267 [28] to 0 for AMR speech data that was carried in an H.223 AL-SDU with CRC indicating errors. Another example is setting the F bit of RFC 6184 [25] for H.264 (AVC) NAL units or the F bit of [120] for H.265 (HEVC) NAL units that may contain bit errors.
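The decision described above can be summarized in a small sketch. The function name and the returned labels are illustrative only; note that forwarding a marked packet is optional ("may"), so a gateway could equally discard in that case.

```python
def handle_al_sdu(crc_ok, payload_supports_error_flag):
    """Decide what to do with an H.223 AL-SDU received on the CS side."""
    if crc_ok:
        return "forward"
    if payload_supports_error_flag:
        # Forward, but mark possible corruption in the payload header,
        # e.g. AMR Q bit set to 0, or NAL unit F bit set.
        return "forward_marked"
    return "discard"
```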

The H.223 AL-SDU CRC is not fully fail-safe, and it is therefore recommended that an MTSI client is designed to be robust and to conceal corrupt media data, similar to the CS UE.

12.2.4.6 Packet size considerations

12.2.4.6.0 General

The same packet size and alignment requirements and considerations as defined in clause 7.5.2 of the present document and in TS 26.111 [45] apply to the MTSI MGW and controlling MGCF, since together they act both as an MTSI client towards the IMS and as a CS UE towards the CS side. The maximum available buffer size for packetization of media data may differ between the IMS and the CS UE. To avoid unfavourable segmentation of data (especially video) by the MTSI MGW, the controlling MGCF should indicate the SDP "a" attribute "3gpp_MaxRecvSDUSize" to the MTSI client in terminal. This attribute indicates the maximum SDU size of the application data (excluding RTP/UDP/IP headers) that can be transmitted to the receiver without segmentation. The specific maximum SDU size limit is determined by the MGCF from the H.245 bearer capability exchange between the CS UE and the MGCF. For example, the MTSI MGW determines this through the maximumAl2SDUSize and maximumAl3SDUSize fields of the H223Capability member in the H.245 TerminalCapabilitySet message.

12.2.4.6.1 The Maximum Receive SDU Size attribute "3gpp_MaxRecvSDUSize"

The ABNF for the maximum receive SDU size attribute is described as follows:

Max-receive-SDU-size-def = "a" "=" "3gpp_MaxRecvSDUSize" ":" size-value CRLF

size-value = 1*5DIGIT ; 0 to 65535 in octets

The value "size-value" indicates the maximum SDU size of application data, excluding RTP/UDP/IP headers, that can be transmitted to the other end point without segmentation.
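A receiver could parse this attribute line as in the sketch below, which follows the ABNF above (one to five digits, value 0 to 65535). The function name and the choice to return None for any non-matching or out-of-range line are assumptions made for illustration.

```python
import re

# One to five ASCII digits after the attribute name, per the ABNF.
_MAX_RECV_SDU = re.compile(r"a=3gpp_MaxRecvSDUSize:([0-9]{1,5})")

def parse_max_recv_sdu_size(line):
    """Return the SDU size in octets, or None if the line is not a valid
    3gpp_MaxRecvSDUSize attribute."""
    m = _MAX_RECV_SDU.fullmatch(line.rstrip("\r\n"))
    if m is None:
        return None
    value = int(m.group(1))
    return value if value <= 65535 else None
```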

The parameter "3gpp_MaxRecvSDUSize" should be included in the SDP at the session level and/or at the media level. Its usage is governed by the following rules:

1. At the session level, the "3gpp_MaxRecvSDUSize" attribute shall apply to the combination of the data from all the media streams in the session.

2. At the media level, the "3gpp_MaxRecvSDUSize" attribute indicates to the MTSI client in terminal that this particular media stream in the session has a specific maximum SDU size limit beyond which received SDUs will be segmented before delivery to the CS UE.

3. If the "3gpp_MaxRecvSDUSize" attribute is included at the session and media levels, then the particular media streams have specific maximum SDU size limits for their own data while the session has an overall maximum SDU size limit for all the media data in the session.

The MGCF includes the "3gpp_MaxRecvSDUSize" attribute in the SDP offer or answer sent to the MTSI client in terminal after the MGCF determines the bearer capability of the CS UE (see Annex E of [65]). Upon reception of the SDP offer or answer that includes the "3gpp_MaxRecvSDUSize" attribute, the MTSI client in terminal need not include this attribute in its subsequent exchange of messages with the MTSI MGW.

There are no offer/answer implications on the "3gpp_MaxRecvSDUSize" attribute. The "3gpp_MaxRecvSDUSize" attribute in the SDP from the MTSI MGW is only an indication to the MTSI client in terminal of the maximum SDU size that avoids segmentation for the specified media streams and/or session.

NOTE: Default operation in the absence of the "3gpp_MaxRecvSDUSize" attribute in SDP is to not have any SDU size limits for any of the media streams or session.

12.2.4.7 Setting RTP timestamps

In general, no explicit timestamps exist at the CS side. Even without transcoding functionality, the MTSI MGW may have to inspect and be able to interpret media data to set correct RTP timestamps.
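For speech, a timestamp can be derived from the frame position once the MGW knows how many 20 ms frame intervals have elapsed (including any intervals without transmitted frames). The sketch below assumes the usual RTP clock rates of 8000 Hz for AMR and 16000 Hz for AMR-WB; the function name and parameters are hypothetical, and video would instead require timing information recovered from the bitstream.

```python
def rtp_timestamp(base_ts, frame_index, clock_rate_hz, frame_ms=20):
    """RTP timestamp of the speech frame at position frame_index, counted in
    frame intervals from the frame that carried base_ts (32-bit wrap-around)."""
    ticks_per_frame = clock_rate_hz * frame_ms // 1000
    return (base_ts + frame_index * ticks_per_frame) % 2**32
```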

12.2.4.8 Protocol termination

The MTSI MGW shall terminate the H.223 protocol at the CS side. Similarly, the MTSI MGW shall terminate RTP and RTCP at the IMS side.

12.2.4.9 Media synchronization

The IM-MGW and controlling MGCF should forward and translate the timing information between the IMS side (RTP timestamps, RTCP sender reports) and the CS side (H.245 message H223SkewIndication) to allow for media synchronization in the MTSI client in terminal and the CS UE. The MTSI MGW shall account for its own contribution to the skew in both directions. Note that transmission timing of H223SkewIndication and RTCP SR must be decoupled. H223SkewIndication has no timing restrictions, but is typically sent only once in the beginning of the session. RTCP SR timing is strictly regulated in RFC 3550 [9], RFC 4585 [40], and clause 7.3. To decouple send timings, the time shift information conveyed in H223SkewIndication and RTCP SR must be kept as part of the MTSI MGW/MGCF session state. H223SkewIndication should be sent at least once, and may be sent again when RTCP SR indicates a synchronization change. A synchronization change of less than 50 ms (value to be confirmed) should be considered insignificant and need not be signalled.

NOTE: This procedure is not supported in the present Release in a decomposed MGCF and IM-MGW, as H.245 is treated on the MGCF and RTCP is sent at the IM-MGW, and no means are defined to forward information from the H223SkewIndication over the Mn interface.
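Where the skew information is available in one node, the decision whether to send a new H223SkewIndication could look like the sketch below. It assumes the skew is tracked in milliseconds as part of the session state; the function name is hypothetical and the 50 ms default mirrors the provisional threshold stated above.

```python
def should_signal_skew(last_signalled_ms, current_ms, threshold_ms=50):
    """True if H223SkewIndication should be (re)sent for the current skew."""
    if last_signalled_ms is None:
        return True  # nothing sent yet: send at least once
    # Changes smaller than the threshold are considered insignificant.
    return abs(current_ms - last_signalled_ms) >= threshold_ms
```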

12.2.5 Session control

The MGCF shall offer translation between H.245 and SIP/SDP signalling according to TS 29.163 [65] to allow for end-to-end capability negotiation.

12.3 GERAN/UTRAN CS inter-working

This clause defines requirements only for the PS side of the MGW, i.e. for the PS session in-between the MTSI client in a terminal and the MGW. The CS side of the MGW, i.e. in-between the MGW and the CS terminal, is out of scope of this clause.

This clause applies for MTSI MGWs supporting inter-working between a CS terminal using CS GERAN/UTRAN access or an MTSI client in terminal performing SRVCC to CS and:

– an MTSI client in terminal using 3GPP access; or:

– an MTSI client in terminal using fixed access; or:

– a non-MTSI client.

The requirements and recommendations for these three cases are harmonized to enable using the same procedures regardless of the type of PS client and what access it uses, as long as it uses IP based access.

The target for this clause is to enable tandem-free operation when the same codec (AMR or AMR-WB) is used by both end-points.

An MTSI MGW may also support the other codecs listed in clause 18.2.2 for inter-working between an MTSI client in terminal using fixed access and a CS terminal using GERAN or UTRAN access. This means that tandem coding will be used and then the PS side and the CS side operate independently of each other. This further means that the requirements and recommendations for the PS side of the MGW are the same as for an MTSI client in terminal using fixed access, as described in clause 18, unless it is explicitly defined below.

12.3.0 3G-324M

If 3G-324M is supported in the GERAN/UTRAN CS, then the inter-working can be made as specified in clause 12.2.

12.3.1 Codecs for MTSI media gateways

12.3.1.1 Speech interworking between 3GPP PS access and CS GERAN/UTRAN

This clause applies to MTSI MGWs used for interworking between an MTSI client in terminal using 3GPP access and a CS GERAN/UTRAN UE.

MTSI media gateways supporting speech communication between an MTSI client in terminal using 3GPP access and terminals operating in the CS domain in GERAN and UTRAN should support Tandem-Free Operation (TFO) for AMR or AMR-WB according to TS 28.062 [37], and Transcoder-Free Operation (TrFO), see TS 23.153 [38].

MTSI media gateways supporting speech communication and supporting TFO and/or TrFO shall support:

– AMR speech codec modes 12.2, 7.4, 5.9 and 4.75 [11], [12], [13], [14] and source-controlled rate operation [15].

MTSI media gateways should also support the other AMR codec types and configurations as defined in Clause 5.4 in [16].

In the receiving direction, from the MTSI client in the terminal, the MTSI media gateway shall be capable of restricting codec mode changes to be aligned to every other frame border and shall be capable of restricting codec mode changes to neighbouring codec modes within the negotiated codec mode set.

NOTE 1: This means that the MTSI client in a terminal will apply and accept mode changes according to UMTS AMR2 [16]. An example of an SDP offer for how the MTSI MGW can restrict AMR mode changes in the MTSI client in a terminal is shown in Table A.2.1. An example of an SDP answer from the MTSI MGW for restricting the mode changes in the MTSI client in a terminal is shown in Table A.3.4a.
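The two restrictions (changes aligned to every other frame border, and only to a neighbouring mode in the negotiated mode set) can be illustrated as follows. Modes are given as AMR mode indices, so the mandatory set 4.75, 5.9, 7.4 and 12.2 corresponds to [0, 2, 4, 7]; the function name and the use of an even frame index as the allowed border are assumptions of this sketch.

```python
def next_mode(current, requested, mode_set, frame_index, period=2):
    """Mode to use for the frame at frame_index, moving at most one step
    within mode_set (sorted ascending) and only on allowed frame borders."""
    if frame_index % period != 0 or requested == current:
        return current
    i = mode_set.index(current)
    if requested > current and i + 1 < len(mode_set):
        return mode_set[i + 1]  # step up to the neighbouring mode
    if requested < current and i > 0:
        return mode_set[i - 1]  # step down to the neighbouring mode
    return current
```

A request to go from mode 0 directly to mode 7 is thus served in several steps, one neighbouring mode at a time.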

MTSI media gateways supporting wideband speech communication at 16 kHz sampling frequency and supporting TFO and/or TrFO for wideband speech shall support:

– AMR wideband codec modes 12.65, 8.85 and 6.60 [17], [18], [19], [20] and source-controlled rate operation [21].

MTSI media gateways supporting wideband speech communication at 16 kHz sampling frequency should also support the other AMR-WB codec types and configurations as defined in [16].

NOTE 2: This means that the MTSI client in a terminal will apply and accept mode changes according to UMTS AMR-WB [16]. An example of an SDP offer for how the MTSI MGW can restrict AMR and AMR-WB mode changes in the MTSI client in a terminal is shown in Table A.2.4. An example of an SDP answer from the MTSI MGW for restricting the mode changes in the MTSI client in a terminal is shown in Table A.3.4.

MTSI MGWs supporting wideband speech communication shall also support narrowband speech communications. When offering both wideband speech and narrowband speech communication, wideband shall be listed as the first payload type in the m line of the SDP offer (RFC 4566 [8]).

Requirements applicable to MTSI media gateways for DTMF events are described in Annex G.

12.3.1.1a Speech inter-working between fixed access and CS GERAN/UTRAN

This clause applies to MTSI MGWs used for interworking between an MTSI client in terminal using fixed access and a CS GERAN/UTRAN UE.

Media codecs for MTSI MGWs for speech inter-working between fixed access and CS GERAN/UTRAN are specified in TS 181 005 [98] in clause 6.2 for narrow-band codecs and in clause 6.3 for wide-band codecs.

MTSI MGWs for speech inter-working between fixed access and CS GERAN/UTRAN supporting AMR and AMR-WB shall follow clause 12.3.1.1 for the AMR and AMR-WB codecs. Tandem-free inter-working should be used whenever possible.

For the other codecs, the MTSI MGW shall follow the recommendations and requirements defined in clause 18 for the respective codec. For these codecs, tandem-free inter-working is not possible when interworking with CS GERAN/UTRAN.

Requirements applicable to MTSI media gateways for DTMF events are described in Annex G.

12.3.1.2 Text

The CTM coding format defined in TS 26.226 [52] is used for real time text in CS calls. In order to arrange inter‑working, a transcoding function between CTM and RFC 4103 is required in the MTSI media gateway. A buffer shall be used for rate adaptation between receiving text from a real-time text transmitter according to the present document and transmitting to a CTM receiver. A gateway buffer of 2K characters is considered sufficient according to clause 13.2.4 in EG 202 320 [51].

12.3.2 RTP payload formats for MTSI media gateways

12.3.2.1 Speech

For RTP payload formats, see clause 18.4.3.

MTSI media gateways supporting AMR or AMR-WB shall support the bandwidth-efficient payload format and should support the octet-aligned payload format. When offering both payload formats, the bandwidth-efficient payload format shall be listed before the octet-aligned payload format in the preference order defined in the SDP.

The MTSI media gateway should use the SDP parameters defined in table 12.1 for the session.

For all access technologies and for normal operating conditions, the MTSI media gateway should encapsulate the number of non-redundant speech frames in the RTP packets that corresponds to the ptime value received in SDP from the other MTSI client, or if no ptime value has been received then according to "Recommended encapsulation" defined in table 12.1. The MTSI media gateway may encapsulate more non-redundant speech frames in the RTP packet but shall not encapsulate more than 4 non-redundant speech frames in the RTP packets. The MTSI media gateway may encapsulate any number of redundant speech frames in an RTP packet but the length of an RTP packet, measured in ms, shall never exceed the maxptime value.

Table 12.1: Recommended encapsulation parameters

Access technology	Recommended encapsulation (if no ptime and no RTCP_APP_REQ_AGG has been received)	ptime (ms)	maxptime (ms) when redundancy is not supported	maxptime (ms) when redundancy is supported
Default	1 non-redundant speech frame per RTP packet. Max 4 or 12 speech frames in total depending on whether redundancy is supported, but not more than a received maxptime value requires.	20	80	240
HSPA, E-UTRAN, NR	1 non-redundant speech frame per RTP packet. Max 4 or 12 speech frames in total depending on whether redundancy is supported, but not more than a received maxptime value requires.	20	80	240
EGPRS	2 non-redundant speech frames per RTP packet, but not more than a received maxptime value requires. Max 4 or 12 speech frames in total depending on whether redundancy is supported, but not more than a received maxptime value requires.	40	80	240
GIP	1 to 4 non-redundant speech frames per RTP packet, but not more than a received maxptime value requires. Max 12 speech frames in total, but not more than a received maxptime value requires.	20, 40, 60 or 80	N/A	240

When the access technology is not known to the MTSI media gateway, the default encapsulation parameters defined in Table 12.1 shall be used.
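The selection of the number of non-redundant speech frames per RTP packet might be sketched as below, assuming 20 ms speech frames. The function name and parameters are hypothetical; `default_frames` stands for the "Recommended encapsulation" value of Table 12.1 for the access technology in use (1 for Default, 2 for EGPRS).

```python
def frames_per_packet(ptime_ms=None, maxptime_ms=None, default_frames=1, frame_ms=20):
    """Non-redundant speech frames to place in each RTP packet."""
    # Follow the received ptime if any, else the recommended default.
    frames = ptime_ms // frame_ms if ptime_ms else default_frames
    # At least 1 frame, and never more than 4 non-redundant frames.
    frames = max(1, min(frames, 4))
    if maxptime_ms:
        # The packet length in ms shall not exceed maxptime.
        frames = min(frames, max(1, maxptime_ms // frame_ms))
    return frames
```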

The SDP offer shall include an RTP payload type where octet-align=0 is defined or where the octet-align parameter is not specified and should include another RTP payload type with octet-align=1. MTSI media gateways offering wide-band speech shall offer these parameters and parameter settings also for the RTP payload types used for wide-band speech.
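A minimal sketch of how an MGW could assemble the media part of such an offer is given below: wideband payload types are listed first, and for each codec the bandwidth-efficient format (octet-align not specified) precedes the octet-aligned one. The payload type numbers, port and ptime/maxptime values are illustrative assumptions; the examples in Annex A remain the reference.

```python
def build_speech_offer(port=49152, wideband=True):
    """Return SDP lines for a speech m-line offering AMR (and AMR-WB)."""
    codecs = []
    if wideband:
        # Wideband first; bandwidth-efficient before octet-aligned.
        codecs += [(97, "AMR-WB/16000/1", None),
                   (98, "AMR-WB/16000/1", "octet-align=1")]
    codecs += [(99, "AMR/8000/1", None),
               (100, "AMR/8000/1", "octet-align=1")]
    pts = " ".join(str(pt) for pt, _, _ in codecs)
    lines = [f"m=audio {port} RTP/AVP {pts}"]
    for pt, rtpmap, fmtp in codecs:
        lines.append(f"a=rtpmap:{pt} {rtpmap}")
        if fmtp:
            lines.append(f"a=fmtp:{pt} {fmtp}")
    lines += ["a=ptime:20", "a=maxptime:240"]
    return lines
```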

MTSI media gateways should support the RTCP-APP signalling defined in clause 10.2.1. The Codec Mode Request (RTCP_APP_CMR) is only relevant when AMR or AMR-WB is used but the Redundancy Request and the Frame Aggregation Request can be used for all codecs. When RTCP-APP is not supported or cannot be used in the session then adaptation can also be based on RTCP Receiver Reports/Sender Reports.

MTSI media gateways should support redundancy according to clause 9.

NOTE: Support of transmitting redundancy may be especially useful in the case an MTSI media gateway is aware of the used access technology and knows that the Generic Access technology is used.

12.3.2.2 Text

Both CTM according to TS 26.226 [52] and RFC 4103 make use of ITU-T Recommendation T.140 presentation and character coding. Therefore inter-working is a matter of payload packetization and CTM modulation/demodulation.

12.3.3 Explicit Congestion Notification

An MTSI MGW can be used to enable ECN between the MTSI client in terminal and the MTSI MGW when inter-working with CS GERAN/UTRAN.

If ECN is supported in the MTSI MGW, then the MTSI MGW shall also:

– support ECN as described in this specification for the MTSI client in terminal, except that the MTSI MGW does not determine whether ECN can be used based on the Radio Access Technology that is used towards the MTSI client in terminal;

– support RTP/AVPF and SDPCapNeg if the MTSI MGW supports RTCP AVPF ECN feedback messages;

– be capable of enabling end-to-end rate adaptation between the MTSI client in terminal and the CS terminal by performing the following:

– negotiate the use of ECN with the MTSI client in terminal, if it can be confirmed that the network used towards the MTSI client in terminal properly handles ECN-marked packets;

– inter-work adaptation requests between the MTSI client in terminal and the CS GERAN/UTRAN.

12.3.4 Codec switching procedures with SRVCC

An MTSI client in terminal (hereinafter "local client") using 3GPP PS access may be handed over to CS access. By that SRVCC procedure, the end-point of the IP connection moves from the local client to a CS MGW in the CS network, as described in TS 23.216 (SRVCC) [133].

In order to achieve this handover, the MSC server, controlling the CS MGW, sends a SIP INVITE message:

– either to the remote client (in case of SRVCC handover without SRVCC enhancement);

– or to the ATCF (in case of SRVCC handover with ATCF enhancement),

to change the communication end from the MTSI client in terminal to the CS MGW as described in TS 23.237 [134].

If EVS is used between the local and remote client before SRVCC and if AMR-WB is used after SRVCC by the local CS UE, an MTSI MGW (e.g. MSC/CS-MGW or ATCF/ATGW) can send the RTCP_APP_EP2I request message (see clause 10.1.2.10), or a CMR in the RTP payload requesting an EVS AMR-WB IO mode, to the remote client to request that it switches from the EVS Primary mode to the EVS AMR-WB IO mode. The mode-set used in CS shall be included in the RTCP_APP_EP2I request message. Furthermore, the RTCP_APP_EP2I request message also supports signalling to restrict the timing and destination of codec mode changes. An SDP offer/answer negotiation between the MTSI MGW and the remote client can also be performed to align the mode-sets, to optimize the resource usage, and to request switching to the EVS AMR-WB IO mode.

Correspondingly, the RTCP_APP_EI2P request message can be used to switch from the EVS AMR-WB IO mode to the EVS Primary mode, e.g. in case an SRVCC handover to a CS access and a switch to the EVS AMR-WB IO mode is followed by a reverse SRVCC to perform handover back to the PS access. An SDP offer/answer negotiation can also be performed to restore the session, e.g. bitrates, bandwidths and other configuration parameters, to what was used before SRVCC.

NOTE: The DTX operation of EVS Primary and AMR-WB IO may be configured in sending direction with either a fixed SID update interval (from 3 to 100 frames) or an adaptive SID update interval – more details can be found in clauses 4.4.3 and 5.6.1.1 of TS 26.445 [125]. The DTX operation of AMR-WB is defined with a fixed interval of 8 frames for SID updates. Implementers of MTSI MGWs are advised to take into account the SID flexibility of EVS (with respect to AMR-WB) for the interworking between AMR-WB and EVS AMR-WB IO.

12.4 PSTN

12.4.1 3G-324M

If 3G-324M is supported in the PSTN, then the inter-working can be made as specified in clause 12.2.

12.4.2 Text

PSTN text telephony inter-working with PS environments is described in ITU-T Recommendation H.248.2 [50] and further elaborated in EG 202 320 [51].

Text telephony modem tones are sensitive to packet loss, jitter and echo canceller behaviour. Therefore, conversion of modem based transmission of real-time text is best done at the border of the PSTN. If PSTN text telephone tones need to be carried audio coded in a PS network, considerations must be taken to carry them reliably as for example specified in ITU-T Recommendations V.151 [54] and V.152 [55].

When inter-working with PSTN text telephones, it must be considered that in PSTN most text telephone communication methods do not allow simultaneous speech and text transmission. An MTSI client in terminal indicating text capability shall not automatically initiate text connection efforts on the PSTN circuit. Instead, text connection efforts should be triggered by one of the following: an explicit requirement for text support from the MTSI client in terminal, active transmission of text from the MTSI client in terminal, or active transmission of text telephone tones from the PSTN terminal. See clause 13 of EG 202 320 [51].

Note that the primary goal of real-time text support in MTSI is not to offer a replica of PSTN text telephony functionality. On the contrary, real-time text in MTSI is aiming at being a generally useful mainstream feature, complementing the general usability of the Multimedia Telephony Service for IMS.

12.5 GIP inter-working

12.5.1 Text

RFC 4103 [31] and T.140 are specified as the default real-time text codec in SIP telephony devices in RFC 4504 [53]. When GIP implements this codec, the media stream contents are identical for the two environments. Packetization will also in many cases be equal, but consideration must be taken to cope with different levels of redundancy and possible use of different media security and integrity measures.

12.5.2 Speech

See Clause 12.7.

12.6 Void

12.6.1 Void

12.6.2 Void

12.7 Inter-working with other IMS and non-IMS IP networks

12.7.1 General

IMS and MTSI services are required to support inter-working with similar services operating on other IP networks, both IMS based and non-IMS based, [2]. It is an operator option to provide transcoding when the end-to-end codec negotiation fails to agree on a codec to be used for the session. The requirements herein apply to MTSI MGWs when such transcoding is provided.

These requirements were designed for sessions carried with IP end-to-end, possibly inter-connected through one or more other IP networks.

A main objective is to harmonize the requirements for this inter-working case with the requirements for GERAN/UTRAN CS inter-working defined in Clause 12.3. There is however one major difference as the MGW requirements in Clause 12.3 apply only to the PS side of the MTSI MGW, i.e. between the MTSI MGW and the MTSI client in the terminal, while here there are requirements for the MTSI MGW both towards the MTSI client in the terminal and towards the remote network.

Most requirements included here apply only to the PS access towards the remote network but there are also requirements that target both the local MTSI client in terminal and the remote network or even only the local MTSI client.

12.7.2 Speech

12.7.2.1 General

This clause defines how speech media should be handled in MTSI MGWs in inter-working scenarios between an MTSI client in terminal using 3GPP access and a non-3GPP IP network and between an MTSI client in terminal using fixed access and a non-3GPP IP network. This clause therefore defines requirements for what the MTSI MGW needs to support and how it should behave during session setup and session modification. A few SDP examples are included in Annex A.10.

12.7.2.2 Speech codecs and formats

12.7.2.2.1 MTSI MGW for interworking between MTSI client in terminal using 3GPP access and other IMS or non-IMS IP networks

This clause applies to MTSI MGWs used for interworking between an MTSI client in terminal using 3GPP access and a client using another IMS or non-IMS IP network.

MTSI MGWs offering speech communication between an MTSI client in a terminal and a client in another IP network through a Network-to-Network Interface (NNI) using AMR shall support:

– AMR speech codec modes 12.2, 7.4, 5.9 and 4.75 [11], [12], [13], [14] and source-controlled rate operation [15], both towards the local MTSI client in terminal and towards the remote network;

– G.711, both A-law and μ-law PCM, [77], towards the remote network.

and should support:

– linear 16 bit PCM (L16) at 8 kHz sampling frequency, towards the remote network.

When such MTSI MGWs also offer wideband speech communication using AMR-WB they shall support:

– AMR wideband codec modes 12.65, 8.85 and 6.60 [17], [18], [19], [20] and source-controlled rate operation [21], both towards the local MTSI client in terminal and towards the remote network;

and should support:

– G.722 (SB-ADPCM) at 64 kbps, [78], towards the remote network; and:

– linear 16 bit PCM (L16) at 16 kHz sampling frequency, towards the remote network.

NOTE: A TrGW decomposed from an IBCF can also be media-unaware and forward any media transparently without changing the encoding. Transcoding support is optional at the Ix interface.

12.7.2.2.2 MTSI MGW for interworking between MTSI client in terminal using fixed access and other IMS or non-IMS IP networks

This clause applies to MTSI MGWs used for interworking between an MTSI client in terminal using fixed access and a client using another IMS or non-IMS IP network.

Media codecs for MTSI MGWs for speech inter-working between fixed access and IP clients in other IMS or non-IMS IP networks are specified in TS 181 005 [98] in clause 6.2 for narrow-band codecs and in clause 6.3 for wide-band codecs. In addition, the MTSI MGW should support linear 16 bit PCM (L16) at 8 kHz sampling frequency for narrow-band speech. An MTSI MGW supporting wideband speech should also support linear 16 bit PCM (L16) at 16 kHz sampling frequency.

MTSI MGWs for speech inter-working between fixed access and CS GERAN/UTRAN supporting AMR and AMR-WB shall follow clause 12.7.2.2.2 for the AMR and AMR-WB codecs. Tandem-free inter-working should be used whenever possible.

For the other codecs, the MTSI MGW shall follow the recommendations and requirements defined in clause 18 for the respective codec.

12.7.2.2.3 Common procedures

If the remote network supports AMR for narrowband speech and/or AMR-WB for wideband speech, then transcoding shall be avoided whenever possible. In this case, the MTSI MGW should not be included in the RTP path unless it is required for purposes not related to transcoding. If the MTSI MGW is included in the RTP path, then it shall support forwarding the RTP payload regardless of codec mode and packetization.

NOTE: An example of where transcoding may be required when AMR and/or AMR-WB are supported by the remote network is when the remote terminal is limited to modes that are not supported by the local MTSI client in terminal due to operator configuration.

If the MTSI MGW is performing transcoding of AMR or AMR-WB then it shall be capable of restricting mode changes, both mode change period and mode changes to neighboring mode, if this is required by the remote network.

Requirements applicable to MTSI MGW for DTMF events are described in Annex G.

12.7.2.3 Codec preference order for session negotiation

It is important to optimize the quality-bandwidth compromise, even though the NNI uses a fixed IP network. For this reason, the following preference order should be used by MTSI MGWs unless another preference order is defined in bilateral agreements between the operators or configured otherwise by the operator:

– The best option is if a codec can be used end-to-end. For example, using AMR or AMR-WB end-to-end is preferable over transcoding through G.711 or G.722 respectively.

– The second best solution is to use G.711 or G.722 as inter-connection codecs, for narrow-band and wide-band speech respectively, since these codecs offer a good quality while keeping a reasonable bit rate.

– The linear 16 bit PCM format should only be used as the last resort, when none of the above solutions are possible.

If a wide-band speech session is possible, then fall-back to narrow-band speech should be avoided whenever possible, unless another preference order is indicated in the SDP.

NOTE: There may be circumstances, for example bit rate constraints, when a fall-back to narrow-band speech is acceptable since the alternative would be a session setup failure.
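
The preference order above can be illustrated with the following informative sketch. It is a simplification under stated assumptions: the codec labels, the function name `select_codec` and the `wideband_capable` flag are hypothetical, only AMR and AMR-WB are treated as end-to-end candidates, and operator configuration or bilateral agreements (which take precedence) are not modelled. The sketch also assumes that a transcoded wide-band session is preferred over an end-to-end narrow-band one, which is one reading of the fall-back rule.

```python
from typing import Optional, Set, Tuple

# Illustrative labels only; these are not SDP encoding names.
END_TO_END = {"wb": ["AMR-WB"], "nb": ["AMR"]}
INTERCONNECT = {"wb": "G.722", "nb": "G.711"}
LAST_RESORT = {"wb": "L16/16000", "nb": "L16/8000"}


def select_codec(local: Set[str], remote: Set[str],
                 wideband_capable: bool) -> Optional[Tuple[str, str]]:
    """Pick the codec to use towards the remote network.

    local:  codecs supported by the local MTSI client in terminal
    remote: codecs supported by the remote network
    Returns (codec, "end-to-end" | "transcoded"), or None if nothing matches.
    """
    # Try wide-band first so that fall-back to narrow-band is avoided.
    bands = ["wb", "nb"] if wideband_capable else ["nb"]
    for band in bands:
        # 1) Best option: the same codec end-to-end, no transcoding.
        for codec in END_TO_END[band]:
            if codec in local and codec in remote:
                return codec, "end-to-end"
        # 2) G.711 / G.722 as inter-connection codec, 3) L16 as last resort.
        for codec in (INTERCONNECT[band], LAST_RESORT[band]):
            if codec in remote:
                return codec, "transcoded"
    return None


print(select_codec({"AMR-WB", "AMR"}, {"G.722", "G.711"}, True))
```

In this sketch a remote network offering only G.722 and G.711 leads to a transcoded G.722 session rather than a narrow-band one.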

12.7.2.4 RTP profiles

MTSI MGWs offering speech communication over the NNI shall support the RTP/AVP profile and should support the RTP/AVPF profile, [40]. If the RTP/AVPF profile is supported then the SDP Capability Negotiation (SDPCapNeg) framework shall also be supported, [69].

An MTSI MGW supporting EVS should support the RTCP-APP signalling for speech adaptation defined in clause 10.2.1.

12.7.2.5 RTP payload formats

The payload format to be used for AMR and AMR-WB encoded media is defined in Clause 12.3.2.1. The payload format to be used for EVS encoded media is defined in [125]. The MTSI MGW shall support the following payload SDP parameters for AMR and AMR-WB: octet-align, mode-set, mode-change-period, mode-change-capability, mode-change-neighbor, maxptime, ptime, channels and max-red.

The payload format to be used for G.711 encoded media is defined in RFC 3551, [10], for both μ-law (PCMU) and A-law (PCMA).

The payload format to be used for G.722 encoded media is defined in RFC 3551, [10].

NOTE: The sampling frequency for G.722 is 16 kHz but is set to 8000 Hz in SDP since it was (erroneously) defined this way in the original version of the RTP A/V profile, see [10].

The payload format to be used for linear 16 bit PCM is the L16 format defined in RFC 3551, [10]. When this format is used for narrow-band speech then the rate (sampling frequency) indicated on the a=rtpmap line shall be 8000. When this format is used for wide-band speech then the rate (sampling frequency) indicated on the a=rtpmap line shall be 16000.
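
As an informative illustration of the rate values described above, the following sketch builds the corresponding a=rtpmap lines. The helper names are hypothetical. Payload types 0, 8 and 9 are the static assignments for PCMU, PCMA and G722 in RFC 3551; the dynamic payload types 96 and 97 used for L16 are arbitrary example values, since RFC 3551 assigns static L16 payload types only for 44.1 kHz.

```python
def rtpmap(payload_type: int, encoding: str, clock_rate: int) -> str:
    """Format a single SDP a=rtpmap attribute line."""
    return f"a=rtpmap:{payload_type} {encoding}/{clock_rate}"


def l16_rtpmap(payload_type: int, wideband: bool) -> str:
    """L16 uses rate 8000 for narrow-band and 16000 for wide-band speech."""
    return rtpmap(payload_type, "L16", 16000 if wideband else 8000)


lines = [
    rtpmap(0, "PCMU", 8000),
    rtpmap(8, "PCMA", 8000),
    # G.722 samples at 16 kHz but is signalled with 8000 in SDP (see NOTE).
    rtpmap(9, "G722", 8000),
    l16_rtpmap(96, wideband=False),  # 96 is an example dynamic payload type
    l16_rtpmap(97, wideband=True),   # 97 is an example dynamic payload type
]
print("\n".join(lines))
```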

The payload formats to be used for the other codecs are listed in Clause 18.4.3.

12.7.2.6 Packetization

For the G.711, G.722 and linear 16 bit PCM formats, the frame length shall be 20 ms, i.e. 160 and 320 speech samples in each frame for narrow-band and wide-band speech respectively.
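
The sample counts above follow directly from the sampling frequency and the 20 ms frame length. The informative sketch below reproduces the arithmetic and also derives the payload size per frame. The bits-per-sample figures are assumptions not stated in this clause: 8 bits for G.711, 4 bits per 16 kHz sample for G.722 operating at 64 kbit/s, and 16 bits for L16. The function names are hypothetical.

```python
FRAME_MS = 20  # frame length required for G.711, G.722 and L16


def samples_per_frame(rate_hz: int) -> int:
    """Number of speech samples in one 20 ms frame."""
    return rate_hz * FRAME_MS // 1000


def payload_bytes_per_frame(rate_hz: int, bits_per_sample: int) -> int:
    """Encoded size of one 20 ms frame in bytes (single channel)."""
    return samples_per_frame(rate_hz) * bits_per_sample // 8


for name, rate, bits in [("G.711", 8000, 8), ("G.722", 16000, 4),
                         ("L16 narrow-band", 8000, 16),
                         ("L16 wide-band", 16000, 16)]:
    print(name, samples_per_frame(rate), payload_bytes_per_frame(rate, bits))
```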

MTSI MGWs offering speech communication over the NNI shall support encapsulating up to 4 non-redundant speech frames into the RTP packets.

MTSI MGWs may support application layer redundancy. If redundancy is supported then the MTSI MGW should support encapsulating up to 8 redundant speech frames in the RTP packets. Thereby, an RTP packet may contain up to 12 frames, up to 4 non-redundant and up to 8 redundant frames.

An MTSI MGW setting up a speech session should align the ptime and maxptime between the networks so that the same packetization can be used end-to-end, even when transcoding is used.

The MGW should use the packetization schemes indicated by the ptime value in the SDP offer and answer. If no ptime value is present in the SDP then the MGW should encapsulate 1 frame per packet or the packetization used by the end-point clients.

The MTSI MGW should preserve the packetization used by the end-point clients to minimize the buffering times otherwise caused by jitter. For example, if one end-point adapts the packetization to use 2 frames per packet then the MTSI MGW should adapt the packetization to the other end-point to also use 2 frames per packet. This applies also when the MTSI MGW performs transcoding. The packet size can become quite large for some combinations of formats and packetization. If the packet size exceeds the Maximum Transfer Unit (MTU) of the network then the MTSI MGW should encapsulate fewer frames per packet.

NOTE: It is an implementation consideration to determine the MTU of the network. RFC 4821 [79] describes one method that can be used to discover the path MTU.
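
The MTU constraint can be illustrated with a short informative sketch. It assumes a 1500 byte MTU as a default, a fixed 12 byte RTP header without CSRC entries or header extensions, an 8 byte UDP header, and IP headers without options; payload-format headers and lower-layer encapsulation are ignored. The function name and parameters are hypothetical.

```python
RTP_HEADER = 12              # fixed RTP header only
UDP_HEADER = 8
IP_HEADER = {4: 20, 6: 40}   # IPv4 without options, IPv6 without extensions


def frames_per_packet(requested: int, frame_bytes: int,
                      mtu: int = 1500, ip_version: int = 4) -> int:
    """Largest frame count not above `requested` that fits within the MTU.

    Always returns at least 1, even if a single frame would not fit.
    """
    budget = mtu - IP_HEADER[ip_version] - UDP_HEADER - RTP_HEADER
    fitting = budget // frame_bytes
    return max(1, min(requested, fitting))


# Wide-band L16 frames are 640 bytes each, so 4 frames do not fit in 1500 bytes.
print(frames_per_packet(4, 640))
```

Under these assumptions a request for 4 wide-band L16 frames per packet is reduced to 2, while 4 G.711 frames of 160 bytes each fit unchanged.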

When the MTSI MGW does not perform any transcoding then it shall be transparent to the packetization schemes used by the end-point clients.

12.7.2.7 RTCP usage and adaptation

The RTP implementation shall include an RTCP implementation.

MTSI MGWs offering speech should support AVPF (RFC 4585 [40]) configured to operate in early mode. When allocating RTCP bandwidth, it is recommended to allocate RTCP bandwidth and set the values for the "b=RR:" and the "b=RS:" parameters such that a good compromise between the RTCP reporting needs for the application and bandwidth utilization is achieved, see also SDP examples in Annex A.10. When an MTSI MGW uses tandem-free inter-working between two PS networks then it should align the RTCP bandwidths such that RTCP packets can be sent with the same frequency in both networks. This is to allow for sending adaptation requests end-to-end without being forced to buffer the requests in the MTSI MGW. The value of "trr-int" should be set to zero or not transmitted at all (in which case the default "trr‑int" value of zero will be assumed) when Reduced-Size RTCP (see clause 7.3.6) is not used.

For speech sessions, between the MTSI client in terminal and the MTSI MGW, it is beneficial to keep the size of RTCP packets as small as possible in order to reduce the potential disruption of RTCP onto the RTP stream in bandwidth-limited channels. RTCP packet sizes can be minimized by using Reduced-Size RTCP packets or using the parts of RTCP compound packets (according to RFC 3550 [9]) which are required by the application.

The MTSI MGW shall be capable of adapting the session to handle possible congestion. For AMR and AMR-WB encoded media, the MTSI MGW shall support the adaptation signalling method using RTCP APP packets as defined in clause 10.2, both towards the MTSI client in terminal and towards the remote network. As the IP client in the remote network may or may not support the RTCP APP signalling method, the MTSI MGW shall also be capable of using the inband CMR in the AMR payload. When receiving inband CMR in the payload from the remote network, the MTSI MGW does not need to move the adaptation signalling to RTCP APP packets before sending it to the MTSI client in terminal.

For G.711 (PCM), G.722 and linear 16 bit PCM encoded media, the MTSI MGW shall support RFC 3550 for signalling the experienced quality using RTCP Sender Reports and Receiver Reports.

For a given RTP based media stream to/from the MTSI client in terminal, the MTSI MGW shall transmit RTCP packets from and receive RTCP packets on the same port number.

For a given RTP based media stream to/from the remote network, the MTSI MGW shall transmit RTCP packets from and receive RTCP packets on the same port number, not necessarily the same port number as used to/from the MTSI client in terminal.

This facilitates inter-working with fixed/broadband access. However, the MTSI MGW may, based on configuration or local policy, accept RTCP packets that are not received from the same remote port where RTCP packets are sent by either the MTSI client in terminal or the remote network.

12.7.2.8 RTP usage

For AMR and AMR-WB encoded media, the MTSI MGW shall follow the same requirements when inter-working with other IP network as when inter-working with GERAN/UTRAN CS, see clause 12.3.2.1.

For a given RTP based media stream to/from the MTSI client in terminal, the MTSI MGW shall transmit RTP packets from and receive RTP packets on the same port number.

For a given RTP based media stream to/from the remote network, the MTSI MGW shall transmit RTP packets from and receive RTP packets on the same port number, not necessarily the same port number as used to/from the MTSI client in terminal.

This facilitates inter-working with fixed/broadband access. However, the MTSI MGW may, based on configuration or local policy, accept RTP packets that are not received from the same remote port where RTP packets are sent by either the MTSI client in terminal or the remote network.

12.7.2.9 Session setup and session modification

The MTSI MGW shall be capable of dynamically adding and dropping speech media during the session.

The MTSI MGW may use the original SDP offer received from the MTSI client in terminal when creating an SDP offer that is to be sent outbound to the remote network.

If the MTSI MGW adds codecs to the SDP offer then it shall follow the recommendations of Clause 12.7.2.3 when creating the outbound SDP offer and when selecting which codec to include in the outbound SDP answer.

If the MTSI MGW generates an SDP offer based on the offer received from the MTSI client in terminal, it should maintain the ptime and maxptime values as indicated by the MTSI client in terminal. If the MTSI MGW generates an SDP offer without using the SDP offer from the MTSI client in terminal then it should define the ptime and maxptime values in accordance with Clause 12.7.2.6, i.e. the preferred values for ptime and maxptime are 20 and 80 respectively.

If the MTSI MGW does not support AVPF (nor SDPCapNeg) then it shall not include the corresponding lines in the SDP offer that is sent to the remote network.

12.7.2.10 Audio level alignment

In case of interworking, the audio levels should be aligned to ensure suitable audio levels to the end users. This is especially important when codecs with different overload points are used on each side of the MTSI MGW as this can result in an asymmetrical loudness between the end points.

NOTE 1: The overload point of a given codec refers to the adjustment factor between the digital levels in input/output of this codec and the resulting acoustic levels. In practice the overload point value corresponds to the analog Root Mean Square (RMS) level of a full-scale sinusoidal signal.

For MTSI client in terminal using fixed access, clause 18.8 applies to ensure proper audio alignment.

For communications requiring interworking with other IMS or non-IMS IP networks, terminals connected to these networks may use different codecs, which have different overload points. In this case, it is recommended that the MTSI MGW doing transcoding ensure proper audio level alignment. This alignment shall be performed such that the nominal level is preserved (0 dBm0 shall be maintained to 0 dBm0). As an example, a fixed CAT-IQ DECT terminal implementing G.722 with a 9 dBm0 overload point as recommended in ITU-T Recommendation G.722 [78] might need some audio level alignment in case of wideband voice interworking with a 3GPP terminal using AMR-WB with a 3.14 dBm0 overload point. The audio level alignment may use dynamic range control to prevent saturation or clipping.

NOTE 2: The definition of the dBm0 unit can be found in ITU-T P.10 [108].
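
The G.722 / AMR-WB example above can be worked through numerically. The informative sketch below assumes that digital levels are expressed relative to a full-scale sinusoid, so that the dBm0 level equals the digital level plus the overload point; under that assumption, preserving 0 dBm0 requires a gain equal to the source overload point minus the destination overload point. The function names are hypothetical, and the hard clipping in `apply_gain` is only a placeholder for the dynamic range control that an implementation may use.

```python
def alignment_gain_db(src_overload_dbm0: float, dst_overload_dbm0: float) -> float:
    """Digital gain (dB) that maps 0 dBm0 at the source to 0 dBm0 at the destination."""
    return src_overload_dbm0 - dst_overload_dbm0


def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in dB to a linear factor."""
    return 10 ** (gain_db / 20)


def apply_gain(samples, gain_db: float, full_scale: int = 32767):
    """Scale 16 bit samples; hard clipping stands in for dynamic range control."""
    g = db_to_linear(gain_db)
    return [max(-full_scale - 1, min(full_scale, round(s * g))) for s in samples]


# G.722 (9 dBm0 overload point) transcoded to AMR-WB (3.14 dBm0 overload point).
gain = alignment_gain_db(9.0, 3.14)
print(round(gain, 2), round(db_to_linear(gain), 3))
```

With these figures the gain from G.722 towards AMR-WB is about +5.86 dB, and the opposite direction needs about -5.86 dB. The positive gain is the direction in which loud input can saturate, which is where dynamic range control becomes relevant.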

12.7.3 Explicit Congestion Notification

An MTSI MGW can be used to enable ECN within the local network when the local ECN-capable MTSI client in terminal is in a network that properly handles ECN-marked packets, and either the remote network cannot be confirmed to properly handle ECN-marked packets or the remote terminal does not support or use ECN.

If ECN is supported in the MTSI MGW, then the MTSI MGW shall also:

– support RTP/AVPF and SDPCapNeg if the MTSI MGW supports RTCP AVPF ECN feedback messages;

– be capable of enabling end-to-end rate adaptation between the local MTSI client in terminal and the remote client by performing the following towards the local MTSI client in terminal:

– negotiate the use of ECN;

– support ECN as described in this specification for the MTSI client in terminal, except that the MTSI MGW does not determine whether ECN can be used based on the Radio Access Technology.

NOTE: The adaptation requests are transmitted between the local and the remote client without modification by the MTSI MGW.

An MTSI MGW can also be used to enable ECN end-to-end if the remote client uses ECN in a different way than what is described in this specification for the MTSI client in terminal, e.g. if the remote client only supports probing for the ECN initiation phase or it needs the RTCP AVPF ECN feedback messages.

12.7.4 Text

The codec and other considerations for real-time text described in the present document for MTSI clients in terminal using 3GPP access apply also to MTSI clients in terminal using fixed access. There are thus no inter-working considerations on the media level between these types of end-points.

12.7.5 Inter-working IPv4 and IPv6 networks

If different IP versions are used by the offerer and the answerer, information in the SDP offer or answer related to IP version and QoS negotiation should be modified appropriately by the MTSI MGW so that the offerer and the answerer agree on identical or similar source bit-rates.

For video, b=AS in IPv6 should be assumed to be the product of b=AS in IPv4 and 1.04, rounded down to the nearest integer, when other information that can be used to re-compute b=AS in IPv6 from b=AS in IPv4 is not present. Likewise, b=AS in IPv4 should be assumed to be the product of b=AS in IPv6 and 0.96, rounded up to the nearest integer. These formulas meet the relationship of b=AS values for 176×144 and 320×240 in Table N.x. Depending on service policy or codec configuration, other formulas can be used.
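
The two default conversions for video can be written as the following informative sketch. Integer arithmetic is used so that the rounding direction is not affected by floating-point error; the function names are hypothetical.

```python
def b_as_ipv4_to_ipv6(b_as_v4: int) -> int:
    """b=AS (IPv6) = floor(b=AS (IPv4) * 1.04)."""
    return (b_as_v4 * 104) // 100


def b_as_ipv6_to_ipv4(b_as_v6: int) -> int:
    """b=AS (IPv4) = ceil(b=AS (IPv6) * 0.96)."""
    # Negating before and after floor division yields a ceiling.
    return -((-b_as_v6 * 96) // 100)


print(b_as_ipv4_to_ipv6(100), b_as_ipv6_to_ipv4(104))
```

For example, b=AS:100 in IPv4 maps to b=AS:104 in IPv6, and converting 104 back gives 100.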

An MTSI MGW for interworking between IPv4 and IPv6 networks supporting the ‘a=bw-info’ attribute (see clause 19) shall re-compute the bandwidth properties signalled with this attribute if only bandwidths for either IPv4 or IPv6 are present. If bandwidth properties are provided with values for both IPv4 and IPv6 then the MTSI MGW should not re-compute the bandwidths.

12.8 MGW handling for NO_REQ interworking

The meaning of "none" and "NO_REQ" for EVS (as specified in TS 26.445 [125]) is not equivalent to code-point "CMR=15" for AMR and AMR-WB (as specified according to TS 26.114 and RFC 4867 with its errata):

– For AMR-WB, CMR=15 overrides the previously received CMR value (corresponding to a speech mode or CMR=15). In other words, when an MTSI client receives CMR=15, it is no longer restricted for its outbound packets by the previously received CMR; however, it still complies with the negotiated mode-set.

– For EVS, the ‘NO_REQ’ and ‘none’ CMR code points mean that there is no request and this CMR value shall be ignored. In other words, when an MTSI client receives NO_REQ or ‘none’ for EVS, it is still restricted for its outbound packets by the previously received CMR (if any) and, in addition, it still complies with the negotiated codec operation modes.

MGWs in the path, repacking between the RTP format according to RFC 4867 [28] and the EVS RTP format in TS 26.445 [125] shall translate between these code-points (in transcoder-free operation):

– When translating a single frame per packet from AMR-WB to EVS (AMR-WB IO): CMR=15 shall be replaced by the highest possible mode of EVS AMR-WB IO allowed in the session.

– When translating a single frame per packet from EVS (AMR-WB IO) to AMR-WB: NO_REQ and none shall be replaced by the previously sent CMR (or the highest possible mode of AMR-WB allowed in the session if no request has been sent since the beginning of the session).

– When translating more than one frame per packet (e.g. from 1 frame per packet to 2 frames per packet or vice versa), the MGW may have to "combine" or "repeat" CMRs following the same translation as for the single frame per packet when applicable.

The above translation rules apply except when the MGW wants to change the CMR. An example is when an MGW detects problems at an early stage in uplink which may require the MGW to send a CMR to limit the bitrate to a lower value than the incoming CMR from the remote media receiver.

NOTE: When EVS AMR-WB IO is not used (transcoder-free operation is not possible), the speech path is split into two links (AMR-WB and EVS) and the adaptation on these two links is independent of each other. CMR translation between AMR-WB and EVS at the MGW is therefore not required.
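
The single-frame-per-packet translation rules of this clause can be sketched as follows. This is an informative illustration under stated assumptions, not a bit-exact implementation: requested modes are represented as AMR-WB mode indices (0 to 8), the EVS NO_REQ and none code points are represented by an abstract sentinel rather than their encoded values in TS 26.445 [125], and the class and method names are hypothetical. The case where the MGW itself decides to change the CMR is not modelled.

```python
from typing import Iterable, Optional, Union

AMRWB_CMR_15 = 15      # CMR=15 code point in the RFC 4867 payload format
EVS_NO_REQ = "NO_REQ"  # abstract stand-in for the EVS NO_REQ / none code points


class CmrTranslator:
    """Per-session CMR translation state for transcoder-free repacking."""

    def __init__(self, allowed_modes: Iterable[int]) -> None:
        # Highest AMR-WB / EVS AMR-WB IO mode allowed in the session.
        self.highest_allowed = max(allowed_modes)
        # Last mode request forwarded towards the AMR-WB side, if any.
        self.last_sent_to_amrwb: Optional[int] = None

    def amrwb_to_evs(self, cmr: int) -> int:
        """CMR=15 becomes the highest mode allowed; other requests pass through."""
        if cmr == AMRWB_CMR_15:
            return self.highest_allowed
        return cmr

    def evs_to_amrwb(self, cmr: Union[int, str]) -> int:
        """NO_REQ/none becomes the previously sent CMR, or the highest allowed mode."""
        if cmr == EVS_NO_REQ:
            if self.last_sent_to_amrwb is None:
                return self.highest_allowed
            return self.last_sent_to_amrwb
        self.last_sent_to_amrwb = cmr
        return cmr


t = CmrTranslator(allowed_modes=[0, 1, 2])
print(t.amrwb_to_evs(AMRWB_CMR_15), t.evs_to_amrwb(EVS_NO_REQ))
```

The state kept in `last_sent_to_amrwb` reflects the asymmetry described above: CMR=15 lifts an earlier restriction, whereas NO_REQ leaves it in place, so the MGW has to repeat the earlier request when repacking towards AMR-WB.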