17 Management of Media Adaptation
3GPP TS 26.114, Release 18: IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction
17.1 General
For the purpose of quality control or network management, it can be necessary to adjust the speech and video adaptation of the MTSI client in terminal. To effectively manage, i.e., initialize and update, the media adaptation of a large number of terminals, which can be implemented in different fashions, the 3GPP MTSIMA (MTSI Media Adaptation) MO defined in this clause may be used.
The MO, which exploits information estimated or received from various entities such as the ongoing multimedia packet stream, the far-end MTSI client in terminal, the IMS, and network nodes such as the eNodeB, provides two sets of parameters that can be used in the design of adaptation state machines for speech and video respectively. The parameters are designed such that dependence on the media codec or radio access bearer technology is avoided as much as possible, so as not to constrain the evolution of these elements. In addition, vendor-specific parameters that take advantage of the implementation can be placed under Ext nodes.
By altering the parameters of the MO via the OMA-DM protocol, the media adaptation behavior of the MTSI client in terminal can be modified up to the extent allowed by the implementation. Note that, due to the underlying uncertainties and complexities, one should expect only to shape the expected bit rate trajectory of the multimedia stream over time-varying transmission conditions, rather than to control the media flow in a timely and stringent manner. Detailed descriptions of the speech and video adaptation parameters can be found in tables 17.1 and 17.2.
The Management Object Identifier shall be: urn:oma:mo:ext-3gpp-mtsima:1.0.
Protocol compatibility: The MO is compatible with OMA Device Management protocol specifications, version 1.2 and upwards, and is defined using the OMA DM Device Description Framework as described in the Enabler Release Definition OMA-ERELD_DM-V1_2 [67].
17.2 Media adaptation management object
The following nodes and leaf objects in figure 17.1 shall be contained under the 3GPP_MTSIMA node if the MTSI client in terminal supports the feature described in this clause. Information of DDF for this MO is given in Annex J.
Figure 17.1: MTSI media adaptation management object tree
Node: /<X>
This interior node specifies the unique object id of an MTSI media adaptation management object. The purpose of this interior node is to group together the parameters of a single object.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
The following interior nodes shall be contained if the MTSI client in terminal supports the "MTSI media adaptation management object".
/<X>/Speech
The Speech node is the starting point of parameters related to speech adaptation if any speech codecs are available.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>
This interior node is used to allow a reference to a list of speech adaptation parameters.
– Occurrence: OneOrMore
– Format: node
– Minimum Access Types: Get
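Every node in this clause carries an Occurrence value that bounds how many instances may exist under its parent. As a rough illustration only, the following Python sketch checks an instance count against the four Occurrence values used in this clause. The names OCCURRENCE_BOUNDS and occurrence_ok are hypothetical and are not part of the MO.

```python
# Lower and upper instance bounds for the Occurrence values used in this clause.
# None means "no upper bound".
OCCURRENCE_BOUNDS = {
    "One": (1, 1),
    "ZeroOrOne": (0, 1),
    "OneOrMore": (1, None),
    "ZeroOrMore": (0, None),
}


def occurrence_ok(occurrence: str, count: int) -> bool:
    """Return True if `count` instances satisfy the given Occurrence value."""
    low, high = OCCURRENCE_BOUNDS[occurrence]
    return count >= low and (high is None or count <= high)
```

For example, a Speech node with no /<X>/Speech/<X> child would fail the OneOrMore check, whereas a missing ID leaf is acceptable because ID is ZeroOrOne.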
/<X>/Speech/<X>/ID
This leaf node represents the identification number of a set of parameters related to speech adaptation.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/TAG
This leaf node represents the identification tag of a set of parameters for speech adaptation. It is recommended to have at least one node, for example ID, TAG, or an implementation-specific one, for identification purposes so that each set of parameters can be distinguished and accessed.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR
This interior node is used to allow a reference to a list of parameters related to packet loss rate (PLR).
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR/MAX
This leaf node represents the maximum PLR tolerated when redundancy is not used, before the receiver signals the sender to attempt adaptation that reduces PLR or operate at modes more robust to packet loss.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Speech/<X>/PLR/LOW
This leaf node represents the minimum PLR tolerated, before the receiver signals the sender to probe for higher bit rate, increase the packet rate, reduce redundancy, or perform other procedures that could improve speech quality under such favorable conditions.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Speech/<X>/PLR/STATE_REVERSION
This leaf node represents the maximum PLR tolerated after adaptation state machine has taken actions, based on the measured PLR lower than LOW. Once PLR exceeds this threshold, the receiver decides that the actions taken to improve speech quality were not successful.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Speech/<X>/PLR/RED_INEFFECTIVE
This leaf node represents the maximum PLR tolerated, after adaptation state machine has taken actions to increase redundancy. Once PLR exceeds this threshold, the receiver decides that the situation was not improved but degraded.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Speech/<X>/PLR/DURATION_MAX
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the MAX threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR/DURATION_LOW
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the LOW threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR/DURATION_STATE_REVERSION
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the STATE_REVERSION threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR/DURATION_RED_INEFFECTIVE
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the RED_INEFFECTIVE threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/PLR/DURATION
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the PLR thresholds.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
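The DURATION nodes above pair each PLR threshold with its own observation window, with PLR/DURATION acting as the default window when no specific duration is given (see table 17.1). A minimal Python sketch of a sliding-window PLR measurement follows. The class SlidingPlr, the helper window_for, and the dictionary layout of the parameters are assumptions made for illustration; the MO does not prescribe how the receiver measures PLR.

```python
from collections import deque


class SlidingPlr:
    """Packet loss rate (%) over the most recent window_ms milliseconds."""

    def __init__(self, window_ms: int):
        self.window_ms = window_ms
        self._events = deque()  # entries are (time_ms, lost)

    def record(self, time_ms: int, lost: bool) -> None:
        """Record one expected packet as received (lost=False) or lost."""
        self._events.append((time_ms, lost))
        # Drop events that have slid out of the observation window.
        while self._events and self._events[0][0] <= time_ms - self.window_ms:
            self._events.popleft()

    def plr_percent(self) -> float:
        if not self._events:
            return 0.0
        lost = sum(1 for _, is_lost in self._events if is_lost)
        return 100.0 * lost / len(self._events)


def window_for(plr_params: dict, threshold: str) -> int:
    """Use DURATION_<threshold> if configured, else the default DURATION."""
    return plr_params.get(f"DURATION_{threshold}", plr_params["DURATION"])
```

A receiver built this way would keep one SlidingPlr per threshold and compare, for instance, the value measured over window_for(params, "MAX") with PLR/MAX.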
/<X>/Speech/<X>/PLB
This interior node is used to allow a reference to a list of parameters related to an event, packet loss burst (PLB), in which a large number of packets are lost during a limited period.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>/PLB/LOST_PACKET
This leaf node represents the number of packets lost during a period of PLB/DURATION.
– Occurrence: One
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/PLB/DURATION
This leaf node represents the period (ms) for which LOST_PACKET is counted.
– Occurrence: One
– Format: int
– Minimum Access Types: Get
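Table 17.1 describes a packet loss burst as the loss of LOST_PACKET or more packets within the latest period of PLB/DURATION. The Python sketch below shows one way such a detector could be written. The class name BurstDetector and the per-loss timestamp interface are hypothetical.

```python
from collections import deque


class BurstDetector:
    """Flags a packet loss burst (PLB) from timestamps of lost packets."""

    def __init__(self, lost_packet: int, duration_ms: int):
        self.lost_packet = lost_packet  # PLB/LOST_PACKET
        self.duration_ms = duration_ms  # PLB/DURATION
        self._loss_times = deque()

    def on_loss(self, time_ms: int) -> bool:
        """Record one lost packet; return True if a PLB is detected."""
        self._loss_times.append(time_ms)
        # Keep only losses that fall inside the latest PLB/DURATION period.
        while self._loss_times and self._loss_times[0] <= time_ms - self.duration_ms:
            self._loss_times.popleft()
        return len(self._loss_times) >= self.lost_packet
```

The action taken once a burst is flagged is left to the adaptation state machine and is not shown here.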
/<X>/Speech/<X>/ECN
This interior node is used to allow a reference to a list of parameters related to Explicit Congestion Notification (ECN) to IP.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/USAGE
This leaf node represents a Boolean parameter that enables or disables ECN-based adaptation.
– Occurrence: ZeroOrOne
– Format: bool
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/MIN_RATE
This leaf node represents the minimum bit rate (bps, excluding IP, UDP, RTP and payload overhead) that speech encoder should use during ECN-based adaptation.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/STEPWISE_DOWNSWITCH
This leaf node represents a Boolean parameter that selects which down-switch method to use, i.e., direct or step-wise, for ECN-triggered adaptation.
– Occurrence: ZeroOrOne
– Format: bool
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/RATE_LIST
This leaf node represents the list of bit rates to use during stepwise down-switch. This parameter is only applicable when stepwise down-switch is used.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/INIT_WAIT
This leaf node represents the time (ms) that the sender should wait before an up-switch is attempted in the beginning of the session if no rate control information or reception quality feedback information is received.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/INIT_UPSWITCH_WAIT
This leaf node represents the time (ms) that the sender should wait at each step during up-switch in the beginning of the session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/CONGESTION_WAIT
This leaf node represents the minimum interval (ms) between detection of ECN-CE and up-switch from the reduced rate.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ECN/CONGESTION_UPSWITCH_WAIT
This leaf node represents the waiting time (ms) at each step during up-switch after a congestion event, except for the initial up-switch which uses the ECN/CONGESTION_WAIT time.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
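The speech ECN parameters above can be read together as a small decision procedure: STEPWISE_DOWNSWITCH selects between a direct drop to MIN_RATE and one step down RATE_LIST per congestion event, and the two CONGESTION wait times govern the later up-switches. The Python sketch below illustrates this reading. All function names are hypothetical, the textual form of RATE_LIST is assumed from the example "(5000, 6000, 7500, 12500)" in table 17.1, and clamping at MIN_RATE during a stepwise down-switch is an interpretation rather than a stated rule.

```python
def parse_rate_list(text: str) -> list:
    """Parse a RATE_LIST string such as "(5000, 6000, 7500, 12500)" into ints."""
    return [int(item) for item in text.strip("() ").split(",") if item.strip()]


def rate_after_ecn_ce(current_bps, min_rate_bps, stepwise, rate_list):
    """Bit rate to use after one ECN-CE congestion event."""
    floor = min(current_bps, min_rate_bps)  # never reduce below ECN/MIN_RATE
    if not stepwise:
        return floor  # direct down-switch to ECN/MIN_RATE
    lower = [rate for rate in rate_list if rate < current_bps]
    target = max(lower) if lower else current_bps  # one step down the list
    return max(target, floor)


def upswitch_wait_ms(first_after_congestion, congestion_wait_ms,
                     congestion_upswitch_wait_ms=5000):
    """Wait before an up-switch; None means no up-switch for the session."""
    if first_after_congestion:
        # A negative CONGESTION_WAIT indicates an infinite waiting time.
        return None if congestion_wait_ms < 0 else congestion_wait_ms
    return congestion_upswitch_wait_ms
```

The default of 5000 ms for the subsequent up-switches is the value given for ECN/CONGESTION_UPSWITCH_WAIT in table 17.1.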
/<X>/Speech/<X>/ICM
This interior node is used to allow a reference to a list of parameters related to Initial Codec Mode (ICM).
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>/ICM/INITIAL_CODEC_RATE
This leaf node represents the bit rate (bps, excluding IP, UDP, RTP and payload overhead) that the speech encoder should use when starting the encoding in the beginning of the session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ICM/INITIAL_CODEC_BANDWIDTH
This leaf node represents the audio bandwidth that the EVS speech encoder should use when starting the encoding in the beginning of the session, unless specified by bw, bw-send, or bw-recv parameter.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
– Values: nb, wb, swb, fb
/<X>/Speech/<X>/ICM/INIT_WAIT
This leaf node represents the time (ms) that the sender should wait before an up-switch is attempted in the beginning of the session if no rate control information or reception quality feedback information is received.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ICM/INIT_UPSWITCH_WAIT
This leaf node represents the time (ms) that the sender should wait at each step during up-switch in the beginning of the session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_SEND
This leaf node represents the initial partial redundancy offset (-1, 0, 2, 3, 5, or 7) that the EVS speech encoder should use when starting the encoding in the beginning of the session that uses channel aware mode, unless asked otherwise by the far-end MTSI client in terminal with the ch-aw-recv parameter.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_RECV
This leaf node represents the initial partial redundancy offset (-1, 0, 2, 3, 5, or 7) that the MTSI client in terminal should ask the far-end MTSI client in terminal with the ch-aw-recv parameter to use when starting the encoding in the beginning of the session that uses channel aware mode.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
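Several ICM leaves accept only a small set of values: INITIAL_CODEC_BANDWIDTH is one of nb, wb, swb, fb, and the partial redundancy offsets are one of -1, 0, 2, 3, 5, 7. A management server or client could check a parameter set before applying it, along the lines of the Python sketch below. The function validate_icm and the flat dictionary representation of the ICM node are assumptions for illustration.

```python
ALLOWED_BANDWIDTHS = {"nb", "wb", "swb", "fb"}
ALLOWED_RED_OFFSETS = {-1, 0, 2, 3, 5, 7}


def validate_icm(icm: dict) -> list:
    """Return a list of problems found in an ICM parameter set (empty if none)."""
    problems = []
    bandwidth = icm.get("INITIAL_CODEC_BANDWIDTH")
    if bandwidth is not None and bandwidth not in ALLOWED_BANDWIDTHS:
        problems.append(f"INITIAL_CODEC_BANDWIDTH: unexpected value {bandwidth!r}")
    for key in ("INIT_PARTIAL_REDUNDANCY_OFFSET_SEND",
                "INIT_PARTIAL_REDUNDANCY_OFFSET_RECV"):
        offset = icm.get(key)
        if offset is not None and offset not in ALLOWED_RED_OFFSETS:
            problems.append(f"{key}: unexpected value {offset!r}")
    return problems
```

Absent leaves are not reported, since every ICM leaf has Occurrence ZeroOrOne.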
/<X>/Speech/<X>/MEDIA_ROBUSTNESS
This interior node is used to allow a reference to a list of parameters related to Media Robustness Adaptation that can be used for the CHEM feature. Each unique codec type is identified by the CODEC_ID under a corresponding instance of the MEDIA_ROBUSTNESS node which groups the parameters associated with the codec type/CODEC_ID.
– Occurrence: ZeroOrMore
– Format: node
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/CODEC_ID
This leaf node represents the codec MIME type.
– Occurrence: One
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/TAG
This leaf node represents the identification tag of a set of parameters for speech robustness adaptation of a codec type identified by the CODEC_ID. It is recommended to have at least one node, for example TAG or an implementation-specific one, for identification purposes so that each set of parameters can be distinguished and accessed.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/CFG_BIT_RATE_LIST
This leaf node is used to provide a list of the bit rates of the configurations of the codec type (CODEC_ID), ordered from the bit rate of the least robust configuration first to the bit rate of the most robust configuration last.
– Occurrence: One
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/CFG_RED_LIST
This leaf node is used to provide a list of the redundancy levels of the configurations of the codec type (CODEC_ID), ordered from the redundancy level of the least robust configuration first to the redundancy level of the most robust configuration last.
– Occurrence: One
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/HIGH_PLR_THRESH_LIST
This leaf node is used to provide a list of the high PLR thresholds for each codec configuration except for the most robust configuration. A high PLR threshold for a given codec configuration is the highest tolerable PLR at that codec configuration before the MTSI client requests a more robust codec configuration that will yield lower PLR.
– Occurrence: One
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/LOW_PLR_THRESH_LIST
This leaf node is used to provide a list of the low PLR thresholds for each codec configuration except for the least robust configuration. A low PLR threshold for a given codec configuration is the lowest tolerable PLR at that codec configuration before the MTSI client requests a less robust codec configuration that will yield better quality.
– Occurrence: One
– Format: chr
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/DJB_PLR
This leaf node indicates whether the estimated PLR is measured before or after de-jitter buffering.
– Occurrence: One
– Format: bool
– Minimum Access Types: Get
/<X>/Speech/<X>/MEDIA_ROBUSTNESS/PLR_AVG_WINDOW
This leaf node indicates the duration of the sliding window used by the media receiver to estimate the received PLR.
– Occurrence: One
– Format: int
– Minimum Access Types: Get
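The two threshold lists under MEDIA_ROBUSTNESS can be used to step between codec configurations ordered from least to most robust. The Python sketch below shows one possible reading. It assumes that both lists have already been parsed into numbers, that they follow the same least-to-most-robust order as CFG_BIT_RATE_LIST, and that configuration i uses high_thresh[i] and low_thresh[i - 1]; this index mapping and the name next_config are assumptions, not statements of the MO.

```python
def next_config(index: int, plr: float, high_thresh: list, low_thresh: list) -> int:
    """Return the configuration index to request for the measured PLR (%).

    Index 0 is the least robust configuration. high_thresh covers every
    configuration except the most robust one, and low_thresh covers every
    configuration except the least robust one.
    """
    last = len(high_thresh)  # index of the most robust configuration
    if index < last and plr > high_thresh[index]:
        return index + 1  # request a more robust configuration
    if index > 0 and plr < low_thresh[index - 1]:
        return index - 1  # request a less robust configuration
    return index
```

With three configurations, high_thresh and low_thresh each hold two entries, and the PLR passed in would be the value estimated over PLR_AVG_WINDOW.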
/<X>/Speech/<X>/N_INHIBIT
This leaf node represents the period (number of speech frames) for which adaptation is disabled to avoid the ping-pong effects, when adaptation state machine transitions from one state to another then back to the original state.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
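One way to picture N_INHIBIT is as a guard that notices a transition from one state to another and back again, and then blocks further adaptation for N_INHIBIT speech frames. The Python sketch below is such a guard under stated assumptions: the class name PingPongGuard, the use of string state labels, and the rule that only an immediate A, B, A sequence counts as ping-pong are all hypothetical.

```python
class PingPongGuard:
    """Disables adaptation for n_inhibit frames after an A -> B -> A sequence."""

    def __init__(self, n_inhibit: int):
        self.n_inhibit = n_inhibit
        self._history = []     # up to the last three states visited
        self._frames_left = 0  # remaining frames for which adaptation is disabled

    def on_transition(self, new_state: str) -> None:
        """Record a state change and start inhibition on ping-pong."""
        self._history = (self._history + [new_state])[-3:]
        h = self._history
        if len(h) == 3 and h[0] == h[2] and h[0] != h[1]:
            self._frames_left = self.n_inhibit

    def on_frame(self) -> None:
        """Call once per speech frame to count down the inhibition period."""
        if self._frames_left > 0:
            self._frames_left -= 1

    def allowed(self) -> bool:
        return self._frames_left == 0
```

The initial state is recorded with a first call to on_transition so that a return to it can be recognized.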
/<X>/Speech/<X>/N_HOLD
This leaf node represents the period (proportion of PLR/DURATION) that can substitute other periods such as DURATION_LOW or DURATION_RED_INEFFECTIVE, when they are not available.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/T_RESPONSE
This leaf node represents the expected response time (ms) for a request to be fulfilled. If a request transmitted to the far-end is not granted within a period of T_RESPONSE, the request can be considered lost during transmission or the far-end MTSI client in terminal might have decided not to grant it.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Speech/<X>/Ext
The Ext is an interior node where vendor-specific information can be placed (vendor meaning application vendor, device vendor, etc.). Usually the vendor extension is identified by a vendor-specific name under the Ext node. The tree structure under the vendor identifier is not defined and can therefore include one or more non-standardized sub-trees.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video
The Video node is the starting point of parameters related to video adaptation if any video codecs are available.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>
This interior node is used to allow a reference to a list of video adaptation parameters.
– Occurrence: OneOrMore
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/ID
This leaf node represents the identification number of a set of parameters related to video adaptation.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/TAG
This leaf node represents the identification tag of a set of parameters for video adaptation. It is recommended to have at least one node, for example ID, TAG, or an implementation-specific one, for identification purposes so that each set of parameters can be distinguished and accessed.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
/<X>/Video/<X>/PLR
This interior node is used to allow a reference to a list of parameters related to PLR.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/PLR/MAX
This leaf node represents the maximum PLR tolerated, before the receiver signals the sender to reduce the bit rate such that PLR is reduced.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Video/<X>/PLR/LOW
This leaf node represents the minimum PLR tolerated, before the receiver signals the sender to increase the bit rate.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Video/<X>/PLR/DURATION_MAX
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the MAX threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/PLR/DURATION_LOW
This leaf node represents the duration (ms) of sliding window over which PLR is observed and computed. The computed value is compared with the LOW threshold.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/PLB
This interior node is used to allow a reference to a list of parameters related to PLB.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/PLB/LOST_PACKET
This leaf node represents the number of packets lost during a period of PLB/DURATION.
– Occurrence: One
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/PLB/DURATION
This leaf node represents the period (ms) for which LOST_PACKET is counted.
– Occurrence: One
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY
This interior node is used to allow a reference to a list of parameters related to the minimum video quality.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/BIT_RATE
This interior node is used to allow a reference to a list of parameters related to the minimum bit rate.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/BIT_RATE/ABSOLUTE
This leaf node represents the minimum bit rate (kbps) that video encoder should use.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/BIT_RATE/RELATIVE
This leaf node represents the minimum bit rate (proportion of the bit rate negotiated for the video session) that video encoder should use.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Video/<X>/MIN_QUALITY/FRAME_RATE
This interior node is used to allow a reference to a list of parameters related to the minimum frame rate.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/FRAME_RATE/ABSOLUTE
This leaf node represents the minimum frame rate (fps, frames per second) that video encoder should use.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/FRAME_RATE/RELATIVE
This leaf node represents the minimum frame rate (proportion of the maximum frame rate limited by the codec profile/level negotiated for the video session) that video encoder should use.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Video/<X>/MIN_QUALITY/QP
This interior node is used to allow a reference to a list of parameters related to video quantisation.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/MIN_QUALITY/QP/H264
This leaf node represents the maximum value of luminance quantization parameter QPY that video encoder should use if H.264 is negotiated for the video session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
– Values: 0 ~ 51
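The BIT_RATE and FRAME_RATE nodes each offer an ABSOLUTE and a RELATIVE leaf, the latter expressed as a percentage of a reference value (the negotiated bit rate, or the maximum frame rate allowed by the negotiated profile and level). The Python sketch below converts the pair into a single minimum. The node descriptions do not say how the two leaves combine when both are present; taking the larger of the two is an assumption made here, and the name effective_minimum is hypothetical.

```python
def effective_minimum(absolute, relative_percent, reference):
    """Combine ABSOLUTE and RELATIVE minimums into one value.

    `absolute` and `relative_percent` may be None when the leaf is absent.
    `reference` is the value that RELATIVE is a percentage of.
    Returns None when neither leaf is configured.
    """
    candidates = []
    if absolute is not None:
        candidates.append(absolute)
    if relative_percent is not None:
        candidates.append(reference * relative_percent / 100.0)
    # Assumption: when both leaves exist, the stricter (larger) minimum applies.
    return max(candidates) if candidates else None
```

For a session negotiated at 384 kbps with BIT_RATE/ABSOLUTE of 64 and BIT_RATE/RELATIVE of 25 %, this reading yields a minimum of 96 kbps.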
/<X>/Video/<X>/ECN
This interior node is used to allow a reference to a list of parameters related to Explicit Congestion Notification (ECN) to IP.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/STEP_UP
This leaf node represents the proportion of current encoding rate estimated by video receiver, which is used to ask video sender to increase the rate by this value.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/STEP_DOWN
This leaf node represents the decrease in the requested maximum encoding rate over current rate, when a down-switch is requested by the receiver.
– Occurrence: ZeroOrOne
– Format: chr
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/INIT_WAIT
This leaf node represents the minimum waiting time (ms) before up-switch is attempted in the initial phase of the session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/INIT_UPSWITCH_WAIT
This leaf node represents the waiting time (ms) at each step during up-switch in the beginning of the session.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/CONGESTION_WAIT
This leaf node represents the minimum interval (ms) between detection of ECN-CE and up-switch from the reduced rate.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/CONGESTION_UPSWITCH_WAIT
This leaf node represents the waiting time (ms) at each step during up-switch after a congestion event, except for the initial up-switch which uses the ECN/CONGESTION_WAIT time.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/MIN_RATE
This interior node is used to allow a reference to a list of parameters related to the minimum bit rate during ECN-based adaptation.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/MIN_RATE/ABSOLUTE
This leaf node represents the minimum bit rate (kbps, excluding IP, UDP, RTP and payload overhead) that video encoder should use during ECN-based adaptation.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/ECN/MIN_RATE/RELATIVE
This leaf node represents the minimum bit rate (proportion of the bit rate negotiated for the video session) that video encoder should use during ECN-based adaptation.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/RTP_GAP
This leaf node represents the maximum interval between packets (proportion of the estimated frame period) tolerated, before the receiver declares bursty packet loss or severe congestion condition.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/INC_FBACK_MIN_INTERVAL
This leaf node represents the minimum interval (ms) at which rate adaptation feedback such as TMMBR should be sent from the receiver to the sender, when the bit rate is being increased.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/DEC_FBACK_MIN_INTERVAL
This leaf node represents the minimum interval (ms) at which rate adaptation feedback such as TMMBR should be sent from the receiver to the sender, when the bit rate is being decreased.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
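INC_FBACK_MIN_INTERVAL and DEC_FBACK_MIN_INTERVAL limit how often the receiver sends rate adaptation feedback such as TMMBR, with separate limits for increases and decreases. A minimal Python sketch of such a limiter follows. The class FeedbackLimiter is hypothetical, and measuring the interval from the last feedback of either direction is an assumption.

```python
class FeedbackLimiter:
    """Rate-limits adaptation feedback sent from the receiver to the sender."""

    def __init__(self, inc_min_interval_ms: int, dec_min_interval_ms: int):
        self.inc_min_interval_ms = inc_min_interval_ms
        self.dec_min_interval_ms = dec_min_interval_ms
        self._last_sent_ms = None  # time of the most recent feedback, if any

    def try_send(self, now_ms: int, increasing: bool) -> bool:
        """Return True and note the send time if feedback may be sent now."""
        interval = self.inc_min_interval_ms if increasing else self.dec_min_interval_ms
        if self._last_sent_ms is not None and now_ms - self._last_sent_ms < interval:
            return False
        self._last_sent_ms = now_ms
        return True
```

Configuring a shorter interval for decreases than for increases lets the receiver react quickly to congestion while probing upward more cautiously.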
/<X>/Video/<X>/TP_DURATION_HI
This leaf node represents the duration (ms) of sliding window over which the interval between packet arrival and playout is observed. The computed value is compared with TARGET_PLAYOUT_MARGIN_HI.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/TP_DURATION_MIN
This leaf node represents the duration (ms) of sliding window over which the interval between packet arrival and playout is observed. The computed value is compared with TARGET_PLAYOUT_MARGIN_MIN.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/TARGET_PLAYOUT_MARGIN_HI
This leaf node represents the upper threshold of the interval (ms) between packet arrival and its properly scheduled playout.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/TARGET_PLAYOUT_MARGIN_MIN
This leaf node represents the lower threshold of the interval (ms) between packet arrival and its properly scheduled playout.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/RAMP_UP_RATE
This leaf node represents the rate (kbps/s) at which video encoder should increase its maximum bit rate from current value to the value indicated in the most recently received TMMBR message.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
/<X>/Video/<X>/RAMP_DOWN_RATE
This leaf node represents the rate (kbps/s) at which video encoder should decrease its maximum bit rate from current value to the value indicated in the most recently received TMMBR message.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
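RAMP_UP_RATE and RAMP_DOWN_RATE describe a slope in kbps per second rather than a step, so the encoder's maximum bit rate approaches the TMMBR value gradually. The Python sketch below computes the new maximum after a given elapsed time. The function name ramp_toward and the linear update are illustrative assumptions.

```python
def ramp_toward(current_kbps: float, tmmbr_kbps: float,
                ramp_up_rate: float, ramp_down_rate: float,
                elapsed_s: float) -> float:
    """Move the encoder maximum bit rate toward the latest TMMBR value.

    ramp_up_rate and ramp_down_rate are in kbps/s, as in RAMP_UP_RATE and
    RAMP_DOWN_RATE. The result never overshoots the TMMBR value.
    """
    if tmmbr_kbps > current_kbps:
        return min(tmmbr_kbps, current_kbps + ramp_up_rate * elapsed_s)
    return max(tmmbr_kbps, current_kbps - ramp_down_rate * elapsed_s)
```

Calling this once per control interval, with elapsed_s set to the interval length, produces a piecewise linear approach to the requested rate.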
/<X>/Video/<X>/DECONGEST_TIME
This leaf node represents the time (ms) the receiver should command the sender to spend in decongesting the transmission path, before attempting to transmit at the sustainable rate of the path.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
/<X>/Video/<X>/HOLD_DROP_END
This leaf node represents a tri-valued parameter that controls how the sender should behave in case video quality cannot meet the requirements set in BIT_RATE, FRAME_RATE, or QP.
– Occurrence: ZeroOrOne
– Format: int
– Minimum Access Types: Get
– Values: 0, 1, 2
/<X>/Video/<X>/INITIAL_CODEC_RATE
This leaf node represents the initial bit rate (proportion of the bit rate negotiated for the video session) that the sender should begin encoding video at.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
/<X>/Video/<X>/X_PERCENTILE
This leaf node represents the percentile point of packet arrival distribution used with the TARGET_PLAYOUT_MARGIN parameters.
– Occurrence: ZeroOrOne
– Format: float
– Minimum Access Types: Get
– Values: 0 ~ 100 %
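X_PERCENTILE, the TP_DURATION windows, and the TARGET_PLAYOUT_MARGIN thresholds can be read as one measurement: take the intervals between packet arrival and playout observed in the window, pick the X-th percentile, and compare it with the upper and lower thresholds. The Python sketch below follows that reading using a nearest-rank percentile. The function names, the percentile method, and the returned labels are assumptions; the action taken for each outcome is not defined here.

```python
import math


def percentile(values: list, p: float) -> float:
    """Nearest-rank percentile of a non-empty list, with 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


def classify_margin(margins_ms: list, x_percentile: float,
                    margin_hi_ms: int, margin_min_ms: int) -> str:
    """Compare the X-th percentile of observed margins with both thresholds."""
    observed = percentile(margins_ms, x_percentile)
    if observed > margin_hi_ms:
        return "above_hi"
    if observed < margin_min_ms:
        return "below_min"
    return "within"
```

In practice the two thresholds would be evaluated over their own windows, TP_DURATION_HI and TP_DURATION_MIN; a single list is used here to keep the sketch short.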
/<X>/Video/<X>/Ext
The Ext is an interior node where vendor-specific information can be placed (vendor meaning application vendor, device vendor, etc.). Usually the vendor extension is identified by a vendor-specific name under the Ext node. The tree structure under the vendor identifier is not defined and can therefore include one or more non-standardized sub-trees.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
/<X>/Ext
The Ext is an interior node where vendor-specific information can be placed (vendor meaning application vendor, device vendor, etc.). Usually the vendor extension is identified by a vendor-specific name under the Ext node. The tree structure under the vendor identifier is not defined and can therefore include one or more non-standardized sub-trees.
– Occurrence: ZeroOrOne
– Format: node
– Minimum Access Types: Get
Table 17.1: Speech adaptation parameters of 3GPP MTSIMA MO
Parameter (Unit) |
Usage |
|||
PLR/MAX (%) |
Packet loss rate (PLR) above this threshold, when redundancy is not used, indicates that performance is not satisfactory. Adaptation state machine at the receiver should signal the sender to attempt adaptation that reduces PLR or operate at modes more robust to packet loss. When using the example adaptation state machines of Annex C, this parameter corresponds to PLR_1. |
|||
PLR/LOW (%) |
PLR below this threshold indicates that conditions are favorable and better quality can be supported. Adaptation state machine at the receiver should signal the sender to probe for higher bit rate, increase the packet rate, reduce redundancy, or perform other procedures that could improve speech quality under such favorable conditions. When in the probing state, if PLR falls below this threshold, then the sender should adapt to a higher bit rate. When using the example adaptation state machines of Annex C, this parameter corresponds to PLR_2. |
|||
PLR/STATE_REVERSION (%) |
PLR above this threshold, after adaptation state machine has taken actions based on PLR lower than LOW, indicates that the actions taken to improve speech quality were not successful. Adaptation state machine at the receiver should signal the sender to return to the previous state where it stayed before attempting to improve speech quality. When using the example adaptation state machines of Annex C, this parameter corresponds to PLR_3. |
|||
PLR/RED_INEFFECTIVE (%) |
PLR above this threshold, after adaptation state machine has taken actions to increase redundancy, indicates that situation was not improved but degraded. Adaptation state machine at the receiver should signal the sender to use a lower bit rate and no redundancy. When using the example adaptation state machines of Annex C, this parameter corresponds to PLR_4. |
|||
PLR/DURATION_MAX (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the MAX threshold. |
|||
PLR/DURATION_LOW (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the LOW threshold. |
|||
PLR/DURATION_STATE_REVERSION (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the STATE_REVERSION threshold. |
|||
PLR/DURATION_RED_INEFFECTIVE (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the RED_INEFFECTIVE threshold. |
|||
PLR/DURATION (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the PLR thresholds. This applies as the default duration in case no specific DURATION is specified. |
|||
PLB/LOST_PACKET (integer) |
When loss of LOST_PACKET or more packets is detected in the latest period of PLB/DURATION, this event is categorized as a packet loss burst (PLB) and adaptation state machine should take appropriate actions to reduce the impact on speech quality. |
|||
PLB/DURATION (ms) |
Duration of sliding window over which lost packets are counted. |
|||
ECN/USAGE (Boolean) |
Switch to enable or disable ECN-based adaptation. This parameter should be translated as follows: "0" = OFF, "1" = ON. |
|||
ECN/MIN_RATE (bps) |
Lower boundary for the media bit-rate adaptation in response to ECN-CE marking. The media bit-rate shall not be reduced below this value as a reaction to the received ECN-CE. The value of this parameter is assigned to the ECN_min_rate parameter defined in Clause 10.2.0. The ECN_min_rate should be selected to maintain an acceptable service quality while reducing the resource utilization. Default value: Same as ICM/INITIAL_CODEC_RATE if defined, otherwise same as Initial Codec Mode (ICM), see Clause 7.5.2.1.6. |
|||
ECN/STEPWISE_DOWNSWITCH (Boolean) |
Switch to select down-switch method. This parameter should be translated as follows: "0" = direct down-switch to ECN/MIN_RATE; "1" = stepwise down-switch according to ECN/RATE_LIST (one step per congestion event). |
|||
ECN/RATE_LIST (character set) |
List of bit rates (e.g. codec modes) to use during stepwise down-switch. This parameter is only applicable when stepwise down-switch is used. If the codec does not support exactly the rate which is indicated then the highest rate supported by the codec below the indicated value should be used. Depending on the codec, the values can be understood as either the highest rate or the average rate. The entries in the list may either be generic, i.e. usable for any codec, but can also be codec-specific. The default usage is the generic list where the bit rates [in bps] are included, e.g. (5000, 6000, 7500, 12500). A codec-specific list may indicate desired modes, e.g. for AMR the list could be (0,2,4,7). The use of certain rates in this list may be prevented by the results of session negotiation involving SDP attributes such as the "mode-set" parameter. The SDP parameter "mode-change-neighbor" may lead to using intermediate modes when transitioning between rates in this list. If this parameter is not defined or contains bit rates not negotiated in the session, then the mode-set included in SDP is used. If no mode-set is defined in SDP, then "4750, 5900, 7400, 12200" is used for AMR, which corresponds to the "0, 2, 4, 7" modes. |
|||
ECN/INIT_WAIT (ms) |
The waiting time before the first up-switch is attempted in the beginning of the session, to avoid premature up-switch. This parameter shall be used instead of the ICM/INIT_WAIT parameter if ECN is used in the session. Default value is defined in Clause 7.5.2.1.6. |
|||
ECN/INIT_UPSWITCH_WAIT (ms) |
This parameter is used in up-switches in the beginning of the session. Note that the first up-switch in the beginning of the session uses the ECN/INIT_WAIT time. Only the subsequent up-switches use the ECN/INIT_UPSWITCH_WAIT time. This parameter shall be used instead of the ICM/INIT_UPSWITCH_WAIT parameter if ECN is used in the session. Default value is defined in Clause 7.5.2.1.6. |
|||
ECN/CONGESTION_WAIT (ms) |
The waiting time after an ECN-CE marking for which an up-switch shall not be attempted. The value of this parameter is assigned to the ECN_congestion_wait parameter defined in Clause 10.2.0. A negative value indicates an infinite waiting time, i.e. to prevent up-switch for the whole remaining session. Default value: Same as the ECN_congestion_wait parameter defined in Clause 10.2.0. |
|||
ECN/CONGESTION_UPSWITCH_WAIT (ms) |
This parameter is used in up-switches after a congestion event. Note that the first up-switch after a congestion event uses the ECN/CONGESTION_WAIT time. Only the subsequent up-switches use the ECN/CONGESTION_UPSWITCH_WAIT time. Default value is 5000 ms. |
|||
ICM/INITIAL_CODEC_RATE (bps) |
The bit rate that the speech encoder should use for the encoding of the speech at the start of the RTP stream. |
|||
ICM/INITIAL_CODEC_BANDWIDTH (character set) |
The audio bandwidth that the EVS speech encoder in EVS Primary mode should use for the encoding of the speech at the start of the RTP stream. |
|||
ICM/INIT_WAIT (ms) |
To avoid premature up-switch when ECN is not used in the session, this parameter defines the waiting time before the first up-switch is attempted in the beginning of the session. Default value: Same as Initial Waiting Time as defined in Clause 7.5.2.1.6. |
|||
ICM/INIT_UPSWITCH_WAIT (ms) |
When ECN is not used in the session, this parameter is used in up-switches in the beginning of the session until the first down-switch occurs. Note that the first up-switch in the beginning uses the INIT_WAIT time. Only the subsequent up-switches use the INIT_UPSWITCH_WAIT time. Default value: Same as Initial Upswitch Waiting Time as defined in Clause 7.5.2.1.6. |
|||
ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_SEND (integer) |
The initial partial redundancy offset (-1, 0, 2, 3, 5, or 7) that the EVS speech encoder should use when starting the encoding in the beginning of the session that uses channel aware mode, unless asked otherwise by the far-end MTSI client in terminal with the ch-aw-recv parameter. |
|||
ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_RECV (integer) |
The initial partial redundancy offset (-1, 0, 2, 3, 5, or 7) that the MTSI client in terminal should ask the far-end MTSI client in terminal to use with the ch-aw-recv parameter when starting the encoding in the beginning of the session that uses channel aware mode. |
|||
N_INHIBIT (integer) |
If adaptation state machine transitions from one state to another then back to the original state, adaptation state machine should not return to the other state in less than N_INHIBIT speech frames, to avoid the ping-pong effects. |
|||
N_HOLD (integer) |
N_HOLD x PLR/DURATION can be used as the period for which PLR is observed and computed. For example, the computed value can be compared with the LOW threshold when DURATION_LOW is not defined. |
|||
T_RESPONSE (ms) |
If the receiver does not detect expected responses from the sender within a period of T_RESPONSE after having sent a request, the receiver should consider this request as not fulfilled and take appropriate actions. |
|||
CODEC_ID (character set) |
MIME Type of the codec for which the media robustness adaptation PLR thresholds are configured. |
|||
CFG_BIT_RATE_LIST (character set) |
List of bit rates (or codec modes) describing the codec configurations to use during media robustness adaptation. The entries are listed in order, from the bit rate of the least robust configuration first to the bit rate of the most robust configuration last. If there are multiple codec configurations with the same bit rate but different loss robustness (e.g., EVS 13.2 channel aware and non-channel aware modes, or a codec mode with different levels of application layer redundancy), the same bit rate is listed multiple times in the list. If the codec does not support exactly the indicated rate, then the highest rate supported by the codec below the indicated value should be used. Depending on the codec, the values can be understood as either the highest rate or the average rate. The entries in the list may be either generic, i.e. usable for any codec, or codec-specific. The default usage is the generic list where the bit rates [in bps] are included, e.g. (5000, 6000, 7500, 12500). A codec-specific list may indicate desired modes, e.g. for AMR the list could be (0,2,4,7). The use of certain rates in this list may be prevented by the results of session negotiation involving SDP attributes such as the "mode-set" parameter or "b=AS" attribute. The SDP parameter "mode-change-neighbor" may lead to using intermediate modes when transitioning between rates in this list. If this parameter is not defined or contains bit rates not negotiated in the session, then the rates or mode-set included in SDP are used. |
|||
CFG_RED_LIST (character set) |
List of redundancy levels describing the codec configurations to use during media robustness adaptation. The redundancy levels are listed in order to correspond to respective CFG_BIT_RATE_LIST entries and describe the redundancy levels from the least robust configuration first to the most robust last. The redundancy level is described using one of the values below: |
|||
Value |
Description |
|||
0 |
No redundancy |
|||
P |
Partial Redundancy |
|||
1 |
100% repetition application layer redundancy |
|||
2 |
200% repetition application layer redundancy |
|||
3 |
300% repetition application layer redundancy |
|||
Commas are used to separate redundancy levels of each codec configuration in the list. If the codec configuration does not support the redundancy level, then the codec configuration shall not be requested by the media receiver for media robustness adaptation. |
||||
HIGH_PLR_THRESH_LIST (character set) |
List of high PLR thresholds for each codec configuration in the order described for the CFG_BIT_RATE_LIST with the exception of not having a high PLR threshold for the most robust codec configuration, i.e., the last entry in the HIGH_PLR_THRESH_LIST corresponds to the threshold for requesting the most robust configuration when using the second most, or a less, robust configuration. When the estimated PLR exceeds a PLR threshold in this list corresponding to a given codec configuration, the media receiver shall request the next more robust codec configuration. E.g., if the first high PLR threshold in the list, which corresponds to the least robust codec configuration is exceeded, then the media receiver requests the second least robust codec configuration. The PLR values are represented as a percent (e.g., 2.5 is 2.5% PLR) and separated by commas for each codec configuration. |
|||
LOW_PLR_THRESH_LIST (character set) |
List of low PLR thresholds for each codec configuration in the order described for the CFG_BIT_RATE_LIST with the exception of not having a low PLR threshold for the least robust codec configuration, i.e., the first entry in the LOW_PLR_THRESH_LIST corresponds to the threshold for requesting the least robust configuration when using the second least, or a more, robust configuration. When the estimated PLR drops below a PLR threshold in this list corresponding to a given codec configuration, the media receiver shall request the next less robust codec configuration. E.g., if the estimated PLR drops below the last low PLR threshold in the list, which corresponds to the most robust codec configuration, then the media receiver requests the second most robust codec configuration. The PLR values are represented as a percent (e.g., 2.5 is 2.5% PLR) and separated by commas for each codec configuration. |
|||
DJB_PLR (Boolean) |
||||
Value |
Description |
|||
0 |
Measure PLR pre-DJB |
|||
1 |
Measure PLR post-DJB |
|||
PLR_AVG_WINDOW (ms) |
Indicates the duration of the sliding window (in ms) over which the PLR is observed and computed. |
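The relationship between CFG_BIT_RATE_LIST, HIGH_PLR_THRESH_LIST and LOW_PLR_THRESH_LIST can be illustrated with a minimal sketch. It assumes configurations are indexed from 0 (least robust) upwards and that both threshold lists have one entry fewer than the configuration list, as described above. The function name and the example threshold values are hypothetical and not part of the MO.

```python
def next_config_index(index, plr_percent, high_thresh, low_thresh):
    """Pick the codec configuration the media receiver would request next.

    index       -- current position in CFG_BIT_RATE_LIST (0 = least robust)
    high_thresh -- HIGH_PLR_THRESH_LIST; entry i guards the move from i to i + 1
    low_thresh  -- LOW_PLR_THRESH_LIST; entry i guards the move from i + 1 to i
    """
    # The most robust configuration has no high threshold, so it cannot step up.
    if index < len(high_thresh) and plr_percent > high_thresh[index]:
        return index + 1
    # The least robust configuration has no low threshold, so it cannot step down.
    if index > 0 and plr_percent < low_thresh[index - 1]:
        return index - 1
    return index


# Hypothetical thresholds for three configurations.
HIGH = [2.0, 5.0]
LOW = [1.0, 3.0]
```

Keeping each low threshold below the high threshold for the same pair of configurations leaves a hysteresis band, which limits oscillation between neighbouring configurations.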
Table 17.2: Video adaptation parameters of 3GPP MTSIMA MO
Parameter (Unit) |
Usage |
PLR/MAX (%) |
Upper threshold of PLR above which adaptation state machine at the receiver should signal the sender to reduce the bit rate. PLR is measured per RTP packet and in addition to packets that do not arrive at the receiver ever, packets that arrive but do not make it in time for their properly scheduled playout are considered as lost. |
PLR/LOW (%) |
Lower threshold of PLR below which adaptation state machine at the receiver may signal the sender to increase the bit rate. |
PLR/DURATION_MAX (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the MAX threshold. |
PLR/DURATION_LOW (ms) |
Duration of sliding window over which PLR is observed and computed. The computed value is compared with the LOW threshold. |
PLB/LOST_PACKET (integer) |
When loss of LOST_PACKET or more packets is detected in the last period of PLB/DURATION, this event is categorized as a packet loss burst (PLB) and adaptation state machine should take appropriate actions to reduce the impact on video quality. |
PLB/DURATION (ms) |
Duration of sliding window over which lost packets are counted. |
MIN_QUALITY/BIT_RATE/ABSOLUTE (kbps) |
Minimum bit rate that video encoder should use. If the MTSI client in terminal is unable to maintain this minimum bit rate, it should drop the video stream component or put it on hold. If both MIN_QUALITY/BIT_RATE/ABSOLUTE and MIN_QUALITY/BIT_RATE/RELATIVE are set, the larger of these two shall be used as the minimum bit rate. |
MIN_QUALITY/BIT_RATE/RELATIVE (%) |
Minimum bit rate (as a proportion of the bit rate negotiated for the video session) that the video encoder should use. If the MTSI client in terminal is unable to maintain this minimum bit rate, it should drop the video stream component or put it on hold. If both MIN_QUALITY/BIT_RATE/ABSOLUTE and MIN_QUALITY/BIT_RATE/RELATIVE are set, the larger of these two shall be used as the minimum bit rate. |
MIN_QUALITY/FRAME_RATE/ABSOLUTE (fps) |
Minimum frame rate that video encoder should use. If the MTSI client in terminal is unable to maintain this minimum frame rate, it should drop the video stream component or put it on hold. The minimum frame rate is considered unmet if the interval between encoding times of video frames is larger than the reciprocal of the minimum frame rate. If both MIN_QUALITY/FRAME_RATE/ABSOLUTE and MIN_QUALITY/FRAME_RATE/RELATIVE are set, the larger of these two shall be used as the minimum frame rate. |
MIN_QUALITY/FRAME_RATE/RELATIVE (%) |
Minimum frame rate (as a proportion of the maximum frame rate supported as specified by the video codec profile/level negotiated for the session) that video encoder should use. If the MTSI client in terminal is unable to maintain this minimum frame rate, it should drop the video stream component or put it on hold. The minimum frame rate is considered unmet if the interval between encoding times of video frames is larger than the reciprocal of the minimum frame rate. If both MIN_QUALITY/FRAME_RATE/ABSOLUTE and MIN_QUALITY/FRAME_RATE/RELATIVE are set, the larger of these two shall be used as the minimum frame rate. |
MIN_QUALITY/QP/H264 (integer) |
Maximum value of QPY that video encoder should use if H.264 is negotiated for the video session. The encoder should generate video stream such that QPY does not exceed H264. If the MTSI client in terminal is unable to maintain this maximum QPY value, it should drop the video stream component or put it on hold. |
ECN/STEP_UP (%) |
When an up-switch is requested by the receiver, this parameter defines the proportion of the session media bandwidth (b=AS) that is used to increment the requested maximum encoding rate over the currently used rate. The receiver estimates the currently used rate over an implementation dependent time period. Default value: 10. |
ECN/STEP_DOWN (character set) |
List of proportions (%) by which video receiver requests that the encoder rate be reduced relative to the currently used rate in response to each congestion event. The receiver estimates the currently used rate over an implementation dependent time period. The receiver uses the first value in the list for the first congestion event, the second value for the second congestion event etc. The list may consist of only one value. If there are more congestion events than there are values in the list, then the last value is used for each additional congestion event. The receiver resets to use the first value in the list after an up-switch has started i.e. after the CONGESTION_WAIT time. Default Value: "30, 20, 10". |
ECN/INIT_WAIT (ms) |
The waiting time before the first up-switch is attempted in the initial phase of the session, to avoid premature up-switch. Default value is 500 ms. The initial phase starts at the beginning of the session and ends when the first congestion event is detected. |
ECN/INIT_UPSWITCH_WAIT (ms) |
This parameter is the waiting time used before attempting up-switches in the initial phase of the session. Note that the first up-switch in the initial phase uses the INIT_WAIT time. Only the subsequent up-switches use the INIT_UPSWITCH_WAIT time. Default value: 500 ms. |
ECN/CONGESTION_WAIT (ms) |
The waiting time after an ECN-CE marking for which an up-switch shall not be attempted. A negative value indicates an infinite waiting time, i.e. to prevent up-switch for the whole remaining session. Default value: 5000 ms. |
ECN/CONGESTION_UPSWITCH_WAIT (ms) |
This parameter is the waiting time used before attempting up-switches after a congestion event. Note that the first up-switch after a congestion event uses the CONGESTION_WAIT time. Only the subsequent up-switches use the CONGESTION_UPSWITCH_WAIT time. Default value is 5000 ms. |
ECN/MIN_RATE/ABSOLUTE (kbps) |
Lower boundary for the media bit-rate adaptation in response to ECN-CE marking. The media bit-rate shall not be reduced below this value as a reaction to the received ECN-CE. The ECN/MIN_RATE/ABSOLUTE should be selected to maintain an acceptable service quality while reducing the resource utilization. If the GBR is known to the client to be lower than the ECN/MIN_RATE then the GBR value shall be used instead of the ECN/MIN_RATE value. Default value: 48 kbps. If both ECN/MIN_RATE/ABSOLUTE and ECN/MIN_RATE/RELATIVE are set, the larger of these two shall be used as the lower boundary for the media bit-rate adaptation in response to ECN-CE marking. |
ECN/MIN_RATE/RELATIVE (%) |
Lower boundary (as a proportion of the bit rate negotiated for the video session) for the media bit-rate adaptation in response to ECN-CE marking. The media bit-rate shall not be reduced below this value as a reaction to the received ECN-CE. The ECN/MIN_RATE/RELATIVE should be selected to maintain an acceptable service quality while reducing the resource utilization. If the GBR is known to the client to be lower than the ECN/MIN_RATE then the GBR value shall be used instead of the ECN/MIN_RATE value. Default value: Same as INITIAL_CODEC_RATE for video. If both ECN/MIN_RATE/ABSOLUTE and ECN/MIN_RATE/RELATIVE are set, the larger of these two shall be used as the lower boundary for the media bit-rate adaptation in response to ECN-CE marking. |
RTP_GAP (float) |
If no RTP packets are received for longer than this period (proportion of the estimated frame period), the receiver should declare bursty packet loss or severe congestion condition. Packet loss gap can be detected as follows: based on the reception history of video packets and their time-stamps, the receiver keeps a running estimate of the frame period, T_FRAME_EST. If the receiver does not receive any RTP packets for a duration of RTP_GAP x T_FRAME_EST, then it should react accordingly. Typical RTP_GAP values can range from 0.5 to 5.0. |
INC_FBACK_MIN_INTERVAL (ms) |
Minimum interval between transmitting TMMBR messages that increase the maximum rate limit. |
DEC_FBACK_MIN_INTERVAL (ms) |
Minimum interval between transmitting TMMBR messages that decrease the maximum rate limit. |
TP_DURATION_HI (ms) |
Duration of sliding window over which the interval between packet arrival and playout is observed and computed. The computed value is compared with the TARGET_PLAYOUT_MARGIN_HI threshold. |
TP_DURATION_MIN (ms) |
Duration of sliding window over which the interval between packet arrival and playout is observed and computed. The computed value is compared with the TARGET_PLAYOUT_MARGIN_MIN threshold. |
TARGET_PLAYOUT_MARGIN_HI (ms) |
Upper threshold of the interval between packet arrival and its properly scheduled playout. The interval is measured from playout time to the X percentile point (X_PERCENTILE) of the packet arrival distribution. When this upper threshold is exceeded, the receiver may signal the sender to increase the bit rate. |
TARGET_PLAYOUT_MARGIN_MIN (ms) |
Lower threshold of the interval between packet arrival and its properly scheduled playout. The interval is measured from playout time to the X percentile point (X_PERCENTILE) of the packet arrival distribution. When the interval falls below this lower threshold, the receiver should signal the sender to decrease the bit rate. |
RAMP_UP_RATE (kbps/s) |
Rate at which video encoder should increase its target bit rate to a higher max rate limit. |
RAMP_DOWN_RATE (kbps/s) |
Rate at which video encoder should decrease its target bit rate to a lower max rate limit. |
DECONGEST_TIME (ms) |
Minimum time the receiver should command the sender to spend in decongesting the transmission path, before attempting to transmit at the sustainable rate of the path. The receiver can achieve decongestion by first sending a TMMBR message with a value below the sustainable rate of the path. Once the receiver concludes that congestion has been cleared, it can send a TMMBR message with a value closer to the sustainable rate of the path. If the receiver concludes that congestion has not been cleared yet, it may attempt to clear the remaining congestion for another period of DECONGEST_TIME. A short DECONGEST_TIME results in a quick and aggressive decongestion by reducing the bit rate radically while a long DECONGEST_TIME results in a long and conservative decongestion. A value of 0 indicates that the receiver should not attempt to perform any decongestion at all. |
HOLD_DROP_END (integer) |
Tri-valued parameter that controls how the sender should behave in case video quality cannot meet the requirements set in BIT_RATE, FRAME_RATE, or QP. This parameter indicates whether the sender should put the video stream on hold while maintaining QoS reservations, drop the video stream and release QoS reservations, or end the session. Allowed values of this parameter are defined as follows: "0" = HOLD, "1" = DROP, "2" = END. |
INITIAL_CODEC_RATE (%) |
Initial bit rate (proportion of the bit rate negotiated for the video session) that the sender should begin encoding video at. |
X_PERCENTILE (%) |
X percentile point of the packet arrival distribution used with TARGET_PLAYOUT_MARGIN parameters. |
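As an illustration of how the video ECN/STEP_DOWN list in table 17.2 could be applied, the following sketch parses the comma-separated value and selects the reduction for the n-th congestion event since the last reset. The function names are hypothetical, and clamping the result at a minimum rate is an assumption based on the ECN/MIN_RATE description.

```python
def parse_step_down(value):
    """Parse an ECN/STEP_DOWN value such as "30, 20, 10" into percentages."""
    return [float(item) for item in value.split(",")]


def step_down_percent(steps, congestion_event_count):
    """Reduction for the n-th congestion event (1-based); the last entry repeats."""
    index = min(congestion_event_count, len(steps)) - 1
    return steps[index]


def reduced_rate_kbps(current_kbps, percent, min_rate_kbps):
    """Reduce the currently used rate by `percent`, but not below the minimum."""
    return max(current_kbps * (100.0 - percent) / 100.0, min_rate_kbps)


steps = parse_step_down("30, 20, 10")  # default value from table 17.2
```

The event counter is expected to restart at 1 once an up-switch has started, so that the first list entry is used again for the next congestion event.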
17.3 Management procedures
This clause explains how speech and video adaptation of the MTSI client in terminal can be managed using 3GPP MTSIMA MO and OMA-DM protocol. First, it is necessary to describe the expected behavior of media adaptation, i.e., reaction of the MTSI client in terminal to the received RTCP-APP and TMMBR messages, information on the transmission results such as RTCP RR and SR, signalled changes in transport characteristics such as ECN Congestion Experienced (ECN-CE) marking in IP packet headers, and analysis of packet reception status. Such descriptions, which include many parameters of different nature, can be made in the form of adaptation state machines or state transition tables, as in Annex C, based on the criteria for service quality or the policy for network management.
Some parameters in the descriptions can be determined in session setup or measured during session, and therefore do not need to be managed externally. For example, the maximum or minimum bit rate of speech and video codecs, and round-trip time (RTT) belong to this class of parameters. It is also possible that other parameters are implementation-specific, or related to detailed features of media codec or underlying radio access bearer technology. These classes of parameters are not provided by 3GPP MTSIMA MO but still can be included under Ext nodes as vendor extensions.
The next step will be to select the parameters to be included in 3GPP MTSIMA MO. It might not be practical or necessary to update all parameters in the descriptions and selecting a subset of key parameters might simplify the management. The set of parameters selected should enable the behavior of media adaptation to be controlled up to the necessary extent.
The results of session setup may influence the selection of media adaptation methods to apply. For example, the negotiated media codec and the bandwidth, or whether to use ECN or not may determine the necessary adaptation procedures. Selection of session parameters from 3GPP MTSINP MO falls outside the scope of the present document. Information available to the MTSI client in terminal that may assist such decisions includes, but may not be limited to, the radio access bearer technology, information on service provider broadcast by (e)NodeB, date and time, and service policy.
17.3.1 Management of speech adaptation
3GPP MTSIMA MO contains a set of parameters which can be used in the construction of adaptation state machines. If available, information on the expected behavior of the network, such as the scheduling strategy applied to eNodeB, can assist the design and calibration process. Basically the receiver estimates the encoding and payload packetization status of the sender, and transmits appropriate RTCP-APP messages when the state of adaptation state machine needs to be switched.
Each PLR in table 17.1 is used to specify the conditions, usually as a threshold, to enter or exit a state. MAX, LOW, STATE_REVERSION, and RED_INEFFECTIVE correspond to PLR_1, PLR_2, PLR_3, and PLR_4 in Annex C respectively. Once the measured PLR exceeds or falls below the thresholds, while meeting certain conditions, adaptation state machine triggers the programmed transitions. A subset of PLRs can be used to construct adaptation state machines with fewer states. For example, the two-state adaptation state machine in Annex C can be built with MAX and LOW. DURATION_MAX, DURATION_LOW, DURATION_STATE_REVERSION, and DURATION_RED_INEFFECTIVE can be used to specify the duration of sliding window over which MAX, LOW, STATE_REVERSION, and RED_INEFFECTIVE PLR are observed and computed. DURATION is reserved for the case when it is not necessary to separately specify the durations. N_HOLD allows setting of the duration as an integer multiple of DURATION.
With each pair of a PLR and a DURATION, the observation period of each PLR can be controlled and the sensitivity of each transition path can be tailored to meet the requirements. For example, larger DURATION values are likely to smooth out the impact of bursty loss of packets and reduce the likelihood of frequent transitions between states, i.e., the ping-pong effects, but can delay the reaction to events that require immediate repairing actions. In general, transitions to states designed for better transmission conditions need to be taken more conservatively than transitions to states for worse transmission conditions. Other requirements can be combined with PLR to refine the conditions for transitions.
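The pairing of a PLR threshold with a DURATION can be sketched as a sliding-window estimator that feeds a two-state machine. This is only an illustrative sketch: the class name, the state labels "good" and "bad", and the choice to count loss per observed packet are assumptions, not definitions from the MO.

```python
from collections import deque


class PlrWindow:
    """Packet loss rate over a sliding window of `duration_ms` milliseconds."""

    def __init__(self, duration_ms):
        self.duration_ms = duration_ms
        self.events = deque()  # (time_ms, lost) pairs, oldest first

    def record(self, time_ms, lost):
        self.events.append((time_ms, lost))
        # Drop observations that have slid out of the window.
        while self.events and self.events[0][0] <= time_ms - self.duration_ms:
            self.events.popleft()

    def plr_percent(self):
        if not self.events:
            return 0.0
        lost = sum(1 for _, was_lost in self.events if was_lost)
        return 100.0 * lost / len(self.events)


def next_state(state, max_window, low_window, plr_max, plr_low):
    """Two-state machine driven by PLR/MAX and PLR/LOW (values in percent)."""
    if state == "good" and max_window.plr_percent() > plr_max:
        return "bad"
    if state == "bad" and low_window.plr_percent() < plr_low:
        return "good"
    return state
```

Using a longer window for the LOW comparison than for the MAX comparison is one way to make the return to the better state more conservative than the exit from it.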
Packet loss burst (PLB) refers to a devastating event in which a large number of packets are lost during a limited period. Immediate measures, such as changing the bit rate or payload packetization, are required to reduce the impact on the perceived speech quality. As PLR and PLR/DURATION enable detailed specification of PLR, PLB can be described efficiently with PLB/LOST_PACKET and PLB/DURATION.
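A packet loss burst check based on PLB/LOST_PACKET and PLB/DURATION might look like the following sketch. The class and method names are hypothetical, and how a lost packet is detected (for example from RTP sequence number gaps) is left outside the example.

```python
from collections import deque


class BurstDetector:
    """Flags a packet loss burst: LOST_PACKET or more losses within DURATION."""

    def __init__(self, lost_packet, duration_ms):
        self.lost_packet = lost_packet
        self.duration_ms = duration_ms
        self.loss_times = deque()  # times of recent losses, oldest first

    def on_loss(self, time_ms):
        """Record one lost packet and report whether a burst is now detected."""
        self.loss_times.append(time_ms)
        # Keep only the losses that fall inside the last DURATION.
        while self.loss_times and self.loss_times[0] <= time_ms - self.duration_ms:
            self.loss_times.popleft()
        return len(self.loss_times) >= self.lost_packet
```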
The parameters ICM/INITIAL_CODEC_RATE, ICM/INIT_WAIT and ICM/INIT_UPSWITCH_WAIT can be used to control the rate adaptation during the beginning of the session. ICM/INITIAL_CODEC_RATE is used to define what codec mode should be used when starting the encoding for the RTP stream. In EVS Primary mode, ICM/INITIAL_CODEC_BANDWIDTH is used to define which audio bandwidth should be used when starting the encoding for the RTP stream. ICM/INIT_WAIT defines the period over which the sending MTSI client in terminal should use the Initial Codec Mode when ECN is not used. If no codec mode request or other feedback information is received within this period then the sender is allowed to adapt to a higher rate. Since it is unknown in the beginning of the RTP stream whether the transmission path can support higher rates, the adaptation to higher bit rates needs to be conservative. It is therefore recommended that when adapting to a higher rate the sender increases the rate only to the next higher rate in the list of codec modes allowed in the session. It is also recommended that the sender waits for a while in-between consecutive up-switches, to give the receiver a chance to evaluate whether the new rate can be sustained. This waiting period in-between consecutive up-switches can be controlled with the ICM/INIT_UPSWITCH_WAIT parameter when ECN is not used. For the channel aware mode of EVS Primary, ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_SEND and ICM/INIT_PARTIAL_REDUNDANCY_OFFSET_RECV can be used to configure the initial redundancy offset for the send and the receive directions respectively.
When ECN is used in the session, the ECN/INIT_WAIT and ECN/INIT_UPSWITCH_WAIT parameters are used instead of the ICM/INIT_WAIT and ICM/INIT_UPSWITCH_WAIT parameters, respectively.
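The initial up-switch behaviour can be sketched as a schedule of (time, rate) pairs. The sketch assumes that no codec mode request or other feedback arrives, that the sender steps to the next higher allowed rate each time, and that the same logic applies whether the ICM or the ECN variants of the wait parameters are in effect. The function name and the wait values in the example are hypothetical.

```python
def initial_upswitch_schedule(allowed_rates, initial_rate,
                              init_wait_ms, init_upswitch_wait_ms):
    """Return [(time_ms, rate), ...] for the start of the RTP stream.

    The first up-switch happens after INIT_WAIT; each later one follows
    the previous up-switch by INIT_UPSWITCH_WAIT.
    """
    rates = sorted(allowed_rates)
    start = rates.index(initial_rate)
    schedule = [(0, initial_rate)]
    time_ms = init_wait_ms
    for rate in rates[start + 1:]:
        schedule.append((time_ms, rate))
        time_ms += init_upswitch_wait_ms
    return schedule


# AMR rates in bps; the wait values are illustrative only.
schedule = initial_upswitch_schedule([4750, 5900, 7400, 12200], 5900, 1000, 500)
```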
N_INHIBIT can be used to limit the earliest time for the next transition, after transition is temporarily disabled due to frequent transitions among a limited number of states. Use of N_INHIBIT is suggested as a measure to avoid unnecessary transitions during rapid fluctuations of transmission conditions. It is left to the discretion of the implementation to handle RTCP-APP messages received before the sender is allowed to transition again.
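One possible reading of N_INHIBIT is sketched below: after the state machine goes from one state to another and back, re-entering the other state is blocked for N_INHIBIT speech frames. The class name, the use of a running frame index, and counting the inhibit period from the moment of the return are assumptions made for the example.

```python
class InhibitGuard:
    """Blocks a quick return to a state that was just left (ping-pong guard)."""

    def __init__(self, n_inhibit):
        self.n_inhibit = n_inhibit
        self.previous = None       # state occupied before the current one
        self.blocked_state = None  # state that may not be re-entered yet
        self.blocked_until = 0     # frame index at which the block expires

    def allows(self, target, frame):
        """True if a transition to `target` is permitted at this frame."""
        return not (target == self.blocked_state and frame < self.blocked_until)

    def on_transition(self, old, new, frame):
        """Call after every executed transition old -> new."""
        if new == self.previous:
            # A -> B -> A has just completed: hold off re-entering B (= old).
            self.blocked_state = old
            self.blocked_until = frame + self.n_inhibit
        self.previous = old
```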
T_RESPONSE refers to the maximum period the receiver can tolerate, before declaring that either the transmitted RTCP-APP message was lost or its execution was denied by the sender. After the timer expires, the receiver may retransmit the request or transmit a new request, or choose to be satisfied with current status.
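The T_RESPONSE timer can be pictured as a deadline attached to each outstanding request. The sketch below uses hypothetical names and leaves the follow-up action (retransmit, send a new request, or accept the current status) to the caller.

```python
class PendingRequest:
    """Tracks one RTCP-APP request awaiting the expected sender response."""

    def __init__(self, sent_at_ms, t_response_ms):
        self.deadline_ms = sent_at_ms + t_response_ms
        self.fulfilled = False

    def on_response(self):
        """Call when the expected change is observed in the received stream."""
        self.fulfilled = True

    def timed_out(self, now_ms):
        """True once T_RESPONSE has elapsed without the expected response."""
        return not self.fulfilled and now_ms > self.deadline_ms
```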
Adaptation state machines using the above parameters collect the information on transmission path by analysing the packet reception process. Another, more direct source of information can be provided by network nodes, such as eNodeB, in the form of Explicit Congestion Notification (ECN) to IP. A key benefit of ECN is more refined initiation of adaptation in which the receiver can be aware of incoming deterioration of transmission conditions even before any packets are dropped by network node, i.e., as an early-warning scheme for congestion.
STEPWISE_DOWNSWITCH can be used to control the path of bit-rate reduction, i.e., whether to directly down-switch to ECN/MIN_RATE or to gradually down-switch via several intermediate bit-rates specified in ECN/RATE_LIST. The former path may be preferred when rapid reduction of the bit-rate is required while the latter path may be employed for more graceful degradation of speech quality.
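The two down-switch paths can be compared with a short sketch. It assumes rates are expressed in bps, that one call corresponds to one congestion event, and that a rate already at or below ECN/MIN_RATE is left unchanged. The function name is hypothetical.

```python
def rate_after_congestion(current_rate, stepwise_downswitch, rate_list, min_rate):
    """Rate to use after one congestion event (ECN-CE marking).

    stepwise_downswitch -- ECN/STEPWISE_DOWNSWITCH (False = direct, True = stepwise)
    rate_list           -- ECN/RATE_LIST as a list of bit rates
    min_rate            -- ECN/MIN_RATE, the floor for ECN-triggered reduction
    """
    if current_rate <= min_rate:
        return current_rate  # never reduce below the floor
    if not stepwise_downswitch:
        return min_rate      # direct down-switch
    # Stepwise: next lower entry of the list, one step per congestion event.
    lower = [r for r in rate_list if min_rate <= r < current_rate]
    return max(lower) if lower else min_rate


RATES = [5000, 6000, 7500, 12500]  # generic example list from table 17.1
```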
To avoid premature up-switch before the congestion has been cleared, waiting periods during which the sender is not allowed to increase the bit-rate can be defined with the ECN/CONGESTION_WAIT parameter. The ECN/CONGESTION_UPSWITCH_WAIT parameter is used to prevent congestion from re-occurring during the up-switch after the ECN/CONGESTION_WAIT period.
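The interplay of the two waiting periods can be sketched as a single gating function. The function and argument names are hypothetical. The treatment of a negative ECN/CONGESTION_WAIT as an infinite wait follows the parameter description in table 17.1.

```python
def upswitch_allowed(now_ms, last_ce_ms, last_upswitch_ms,
                     congestion_wait_ms, congestion_upswitch_wait_ms):
    """Decide whether an up-switch may be attempted after a congestion event.

    last_ce_ms       -- time of the most recent ECN-CE marking
    last_upswitch_ms -- time of the most recent up-switch, or None
    """
    if congestion_wait_ms < 0:
        return False  # negative value: no up-switch for the rest of the session
    if now_ms - last_ce_ms < congestion_wait_ms:
        return False  # still inside ECN/CONGESTION_WAIT
    if last_upswitch_ms is not None and last_upswitch_ms > last_ce_ms:
        # Not the first up-switch since the congestion event.
        return now_ms - last_upswitch_ms >= congestion_upswitch_wait_ms
    return True
```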
To align speech adaptation of the MTSI client in terminal with the purpose of quality control or network management, not only the terminals, which might be managed by different service providers, but also the behaviour, such as scheduling strategy or ECN-marking policy, of network nodes should be considered in the construction of adaptation state machines. It is also possible to program the terminals to adapt differently, as a means of differentiating the quality of service.
With 3GPP MTSIMA MO, it is possible to shape a rough trajectory of the bit rate over time-varying transmission conditions but the maximum and minimum bit rates of speech codec are determined during session setup with mode-set, which can be managed with RateSet leaf of 3GPP MTSINP MO (see clause 15).
Adaptation state machines designed to recover the once reduced bit or packet rate at an earliest opportunity might be considered as an adaptation policy oriented to service quality. However, such an aggressive up-switch before the transmission conditions fully recover takes the risk of degrading the quality or even backward transitions, i.e., the ping-pong effects. Such an optimistic adaptation strategy might not necessarily result in higher quality but can influence the service quality of other terminals sharing the same link. On the other hand, adaptation state machines that increase the once reduced bit or packet rate more conservatively are likely to avoid such situations but might be late in the recovery of speech quality after the transmission conditions are restored.
Even at similar total bit rates, bit stream consisting of a smaller number of larger packets can be at a disadvantage during transmission over packet networks or shared links, when the link quality deteriorates or the link becomes congested, than bit stream consisting of a larger number of smaller packets, since many types of schedulers installed in the network nodes base their decisions on the size of packets such that lower priorities are assigned to larger packets. RTCP_APP_REQ_RED, RTCP_APP_REQ_AGG, and RTCP_APP_CMR specify detailed request for the bit rate and packetization. Bit-fields of RTCP_APP_REQ_RED and RTCP_APP_REQ_AGG are restricted by parameters, such as max-red and maxptime, which are negotiated during session setup.
17.3.2 Management of video adaptation
Compared with speech adaptation where the number of allowed bit rates from speech encoder is limited and each encoded speech frame covers the same short period, e.g., 20 ms, or contains the same number of bits when voice activity is present, video adaptation should tolerate a higher level of uncertainty in the control of the bit rate. Moreover, due to the structural dependence between encoded video frames, from motion estimation and compensation, packetization is not likely to be used as an opportunity for adaptation. This dependence necessitates not only controlling the bit rate but also putting an end to error propagation with AVPF NACK or PLI.
Output bit rate from video encoder depends also on the scene being encoded and even if maintaining a constant bit rate is intended, actual output bit rate is likely to fluctuate around a target value. In the design of adaptation state machines for video, this uncertainty needs to be compensated for, for example, with additional implementation margin.
Encoded speech frames have a clear boundary in the bit stream and multiple speech frames can be transported over an RTP packet. In contrast, an encoded video frame, whose size depends on the bit rate, frame rate, and image size, is typically far larger than an encoded speech frame. Multiple packets are usually necessary to transport even a predicted frame, which is usually smaller than an intra frame.
As in speech adaptation, basic information on transmission path can be obtained from analyzing received packet stream. However, perceived video quality can be more sensitive to PLR since the compression ratio of video is typically higher than that of speech and even a minor level of packet loss can initiate error propagation to the following predicted frames, rendering them unrecognizable. For example, at comparable PLR values, speech quality can be acceptable but video quality can be significantly damaged such that dropping the media might be considered. Two parameters for PLR, MAX and LOW, and two additional parameters for their durations, DURATION_MAX and DURATION_LOW, are available for video adaptation.
PLB/LOST_PACKET and PLB/DURATION are also available for video but the fundamental differences in the frame structure need to be taken into account when the event of packet loss burst is defined for video.
INC_FBACK_MIN_INTERVAL and DEC_FBACK_MIN_INTERVAL can be used to control the rate of adaptation and also the amount of signalling overhead. These two minimum intervals are provided separately since the minimum interval between the feedback messages to decrease the bit rate typically needs to be shorter than the one to increase the bit rate. The urgency of rate-decreasing conditions generally requires shorter minimum feedback intervals.
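The effect of the two minimum intervals can be illustrated with a minimal sketch. It assumes that both intervals are expressed in milliseconds and are measured from the most recently sent feedback message of either kind; the class name FeedbackGate and its method are hypothetical and are not defined by the MO.

```python
class FeedbackGate:
    """Rate-limits feedback messages (hypothetical helper, not part of the MO)."""

    def __init__(self, inc_min_interval_ms, dec_min_interval_ms):
        # Assumption: both leaf values are given in milliseconds.
        self.min_interval_ms = {
            "increase": inc_min_interval_ms,
            "decrease": dec_min_interval_ms,
        }
        self.last_sent_ms = None

    def may_send(self, direction, now_ms):
        """Return True and record the send time if a message may be sent now."""
        min_interval = self.min_interval_ms[direction]
        if self.last_sent_ms is not None and now_ms - self.last_sent_ms < min_interval:
            return False
        self.last_sent_ms = now_ms
        return True
```

With a shorter interval for decreases, e.g., FeedbackGate(2000, 500), a rate-decreasing request can follow the previous feedback message after 500 ms, whereas a rate-increasing request has to wait 2 000 ms.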
Target bit rate for video is determined during session setup and can be considered as the maximum bit rate to be used during the session, which can be configured with the Bandwidth leaf of 3GPP MTSINP MO. On the other hand, BIT_RATE can be used to set a lower threshold for the bit rate. Whether MIN_QUALITY/BIT_RATE/ABSOLUTE or MIN_QUALITY/BIT_RATE/RELATIVE is to be used is left to the discretion of the implementation or service provider. For example, the capability of setting a fixed minimum bit rate can be necessary when the lowest quality of MTSI is required to be comparable to the quality of 3G-324M, whose bit rate for video is in general set to 47 ~ 49 kbps. If both MIN_QUALITY/BIT_RATE/ABSOLUTE and MIN_QUALITY/BIT_RATE/RELATIVE are set, the larger of these two shall be used as the minimum bit rate.
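The rule for combining the two leaves can be sketched as follows. The sketch assumes that the ABSOLUTE leaf is expressed in kbps and that the RELATIVE leaf is a percentage of the bit rate negotiated for the session; the function name and its arguments are hypothetical.

```python
def effective_min_bit_rate_kbps(negotiated_kbps, absolute_kbps=None, relative_percent=None):
    """Return the lower bit rate threshold in kbps, or None if neither leaf is set."""
    candidates = []
    if absolute_kbps is not None:
        candidates.append(float(absolute_kbps))
    if relative_percent is not None:
        # Assumption: RELATIVE is a percentage of the negotiated bit rate.
        candidates.append(negotiated_kbps * relative_percent / 100.0)
    # If both leaves are set, the larger value is used as the minimum.
    return max(candidates) if candidates else None
```

Under these assumptions, with 48 kbps as the absolute value and 25 % as the relative value, a session negotiated at 384 kbps has a minimum of 96 kbps, whereas a session negotiated at 64 kbps has a minimum of 48 kbps.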
In the case of speech adaptation, the MTSI client in terminal limits the initial codec mode (ICM) to a lower mode than the maximum mode negotiated, until at least one frame block or an RTCP message is received with rate control information (see clause 7.5). This policy is recommended to avoid congestion during the initial phase of the session, when the information on the transmission path is known to neither the sender nor the receiver. INITIAL_CODEC_RATE can be used for video with similar objectives as those of ICM, i.e., a warming-up process at the beginning of the session. Once the session starts and few packets are lost during delivery, the receiver will attempt to increase the bit rate by transmitting TMMBR messages requesting higher bit rates until the negotiated value is reached. However, a low INITIAL_CODEC_RATE can reduce the video quality at session setup when the transmission path is free of congestion.
The maximum bit rate allowed for video communication in a session depends on the outcome of the SDP offer-answer negotiation. For inter-working with 3G-324M it is likely that the bit rate is limited to 47 ~ 49 kbps while for high-quality video communication it is foreseen that bit rates in the order of several hundred kbps might be used. This can be challenging when setting the ECN/MIN_RATE parameter since the configuration of the MTSI client in terminal parameters occurs rarely while the maximum allowed bit rate used for video may vary from session to session.
Two parameters, ECN/MIN_RATE/ABSOLUTE and ECN/MIN_RATE/RELATIVE, are therefore provided to enable better control of the video rate adaptation algorithm. The ECN/MIN_RATE/RELATIVE parameter is provided to limit the range of bit rate variations during a session to avoid large quality variations. The ECN/MIN_RATE/ABSOLUTE parameter is provided to avoid reducing the bit rate to an unacceptably low quality level.
FRAME_RATE can be used to set a lower threshold for the frame rate. As the bit rate is controlled during adaptation between two limits, the frame rate also needs to be controlled between the limits while maintaining a balance between spatial quality and temporal resolution (see clause 10.3). Just as an increase in codec profile/level can result in an abrupt increase of the maximum image size, e.g., from QCIF to CIF, it can also quadruple the maximum frame rate at a fixed image size. With the "imageattr" attribute, it is possible that image sizes whose maximum frame rates are unspecified by codec profile/level, such as 272×224, can be negotiated (see clause A.4). In this case, the maximum frame rate is determined as the maximum value at the maximum image size supported by the profile/level negotiated. Whether MIN_QUALITY/FRAME_RATE/ABSOLUTE or MIN_QUALITY/FRAME_RATE/RELATIVE is used to specify the lower threshold of the frame rate is left to the discretion of the implementation or service provider. If both are set, the larger of these two shall be used as the minimum frame rate.
RTP_GAP can be used to set the maximum interval between received packets before the receiver considers repairing actions. During periods of severe congestion or packet loss, the receiver may not receive packets for an unexpectedly long period. Observing such gaps in the reception of packets can be used by the receiver to request the sender to decrease the bit or packet rate. In the case of severe packet loss, this gap can be detected before any other observations are made and thus allows for faster reaction, while detection of packet loss requires reception of at least one packet after the loss.
However, estimating such gaps in the arrival of packets can be challenging because the video encoder may not always output packets at regular intervals and the typical scheduling strategy of a network node, especially in the downlink, can cause jitter in the delivery of video packets. Therefore, it is recommended that RTP_GAP is set conservatively and the measured gap is based on a moving average estimate of the frame period observed by the receiver. The timestamps of the received packets allow the receiver to estimate the frame period based on the past few received video frames. Since typical video encoders are not likely to abruptly change their encoding frame rates, this estimate can serve as a fairly reliable basis for detecting the gaps in the transport of video packets.
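One possible way to realize such an estimate is sketched below. The sketch assumes that RTP_GAP is expressed as a multiple of the estimated frame period, that the video RTP clock rate is 90 kHz, and that frames are received in order. It uses an exponentially weighted moving average as the moving average; the class name and the smoothing factor alpha are hypothetical.

```python
class RtpGapDetector:
    """Detects reception gaps relative to an estimated frame period (sketch)."""

    def __init__(self, rtp_gap_factor, alpha=0.125, clock_rate_hz=90000):
        self.rtp_gap_factor = rtp_gap_factor  # assumption: multiple of frame period
        self.alpha = alpha                    # hypothetical smoothing factor
        self.clock_rate_hz = clock_rate_hz
        self.last_timestamp = None
        self.frame_period_ms = None

    def on_new_frame(self, rtp_timestamp):
        """Update the frame period estimate from the RTP timestamp of a new frame."""
        if self.last_timestamp is not None:
            # The modulo handles wrap-around of the 32-bit RTP timestamp.
            delta_ticks = (rtp_timestamp - self.last_timestamp) % (1 << 32)
            sample_ms = 1000.0 * delta_ticks / self.clock_rate_hz
            if self.frame_period_ms is None:
                self.frame_period_ms = sample_ms
            else:
                self.frame_period_ms += self.alpha * (sample_ms - self.frame_period_ms)
        self.last_timestamp = rtp_timestamp

    def gap_detected(self, ms_since_last_packet):
        """Return True if the silence exceeds the configured multiple of the period."""
        if self.frame_period_ms is None:
            return False  # no estimate is available yet
        return ms_since_last_packet > self.rtp_gap_factor * self.frame_period_ms
```

A larger factor corresponds to the conservative setting recommended above, since it reduces false detections caused by scheduling jitter at the cost of a slower reaction.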
Leaf nodes for luminance quantization parameter, H263, MPEG4, and H264, can be used to set a lower threshold for the image clarity to be maintained. Target range of the quantization parameters depends on the video codec negotiated.
If the MTSI client in terminal cannot maintain the bit rate or the frame rate higher than the lower thresholds, or cannot maintain the quantization parameter lower than the higher threshold, the video stream might be put on hold, dropped, or the session might be ended, depending on the criteria for service quality or policy for network management, with HOLD_DROP_END.
RAMP_UP_RATE and RAMP_DOWN_RATE can be used to control how fast the sender changes its target bit rate from its current target value to the value indicated in the most recently received TMMBR message, when the bit rate is being increased and decreased respectively. As with INC_FBACK_MIN_INTERVAL and DEC_FBACK_MIN_INTERVAL, rates for ramping up and down need to be different, as rapid ramping down is usually necessary whereas rapid ramping up is undesirable as it can cause sudden congestion in the transmission path.
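The ramping behaviour can be sketched as a bounded step towards the requested value. The sketch assumes that both rates are expressed in kbps per second; the function name and its arguments are hypothetical.

```python
def next_target_bit_rate(current_kbps, requested_kbps, elapsed_s,
                         ramp_up_kbps_per_s, ramp_down_kbps_per_s):
    """Move the sender target bit rate towards the value requested in TMMBR.

    Assumption: ramp rates are given in kbps per second.
    """
    if requested_kbps > current_kbps:
        # Increase slowly and never overshoot the requested value.
        return min(requested_kbps, current_kbps + ramp_up_kbps_per_s * elapsed_s)
    # Decrease quickly and never undershoot the requested value.
    return max(requested_kbps, current_kbps - ramp_down_kbps_per_s * elapsed_s)
```

For instance, with a ramp-up rate of 20 kbps/s and a ramp-down rate of 200 kbps/s, one second after a request the target moves from 200 kbps to 220 kbps when 400 kbps is requested, but from 400 kbps to 200 kbps when 100 kbps is requested.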
DECONGEST_TIME can be used to control the time spent in resolving the congestion of transmission path. Smaller values of this parameter can result in faster reduction of the bit rate while larger values can be used for slower decongestion. If the situation at the receiver does not improve at the end of initial decongesting, another round of decongestion can be attempted, or the video stream can be dropped or put on hold.
From received packets, video frames are typically reconstructed to YUV format, converted to formats such as RGB, and stored in the frame buffer, before being fed to the display for visual presentation. TARGET_PLAYOUT_MARGIN_HI and TARGET_PLAYOUT_MARGIN_MIN can be used to maintain appropriate playout margin, defined as the interval between packet arrival and its properly scheduled playout. Duration of sliding window over which the interval is observed and computed can be controlled with TP_DURATION_HI and TP_DURATION_MIN.
In general, video should be encoded, packetized, transmitted, de-packetized, decoded, and played out within a total delay target. In addition, processing of video should be appropriately synchronized to that of speech. If the estimated playout margin exceeds TARGET_PLAYOUT_MARGIN_HI, it is considered that video packets are arriving too early and there remains room for a higher bit rate in the transmission path. Therefore the receiver may signal the sender to increase the bit rate with TMMBR messages. If the estimated playout margin falls below TARGET_PLAYOUT_MARGIN_MIN, it is considered that video packets are arriving too late and the current transmission path cannot sustain the bit rate. Therefore the receiver should signal the sender to reduce the bit rate to enable earlier arrival of video packets.
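The decision described above can be sketched as follows. For simplicity, the sketch assumes a single sliding window instead of separate TP_DURATION_HI and TP_DURATION_MIN windows, assumes that all values are in milliseconds, and estimates the playout margin as the mean over the window. The class name and the returned labels are hypothetical.

```python
from collections import deque


class PlayoutMarginMonitor:
    """Compares a windowed playout margin with the two target margins (sketch)."""

    def __init__(self, margin_hi_ms, margin_min_ms, window_ms):
        self.margin_hi_ms = margin_hi_ms
        self.margin_min_ms = margin_min_ms
        self.window_ms = window_ms  # assumption: one window for both thresholds
        self.samples = deque()      # entries are (arrival_ms, margin_ms)

    def add_packet(self, arrival_ms, scheduled_playout_ms):
        """Record the margin of a packet and drop samples outside the window."""
        self.samples.append((arrival_ms, scheduled_playout_ms - arrival_ms))
        while arrival_ms - self.samples[0][0] > self.window_ms:
            self.samples.popleft()

    def action(self):
        """Return 'increase', 'decrease', or 'hold' for the requested bit rate."""
        if not self.samples:
            return "hold"
        # Assumption: the margin estimate is the mean over the window.
        margin = sum(m for _, m in self.samples) / len(self.samples)
        if margin > self.margin_hi_ms:
            return "increase"   # packets arrive early: room for a higher bit rate
        if margin < self.margin_min_ms:
            return "decrease"   # packets arrive late: path cannot sustain the rate
        return "hold"
```

An implementation could replace the mean with a percentile of the observed distribution without changing the comparison logic.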
X_PERCENTILE can be used to control the target playout margin but the packet arrival distribution is left to the discretion of the implementation, which might be implemented as statistical models or empirical data.
17.4 Management of media robustness adaptation
17.4.1 General
The MEDIA_ROBUSTNESS node defined under the 3GPP MTSIMA MO may be used to manage media robustness adaptation across different vendor terminals in a network. For each codec type, the MO node provides a list of codec configurations arranged from least robust to most robust. For each of these configurations, with the exception of the first and last, the MO node also provides two sets of PLR threshold levels (see Figure 17.2):
– A set of high PLR thresholds that trigger the media receiver to request a more robust configuration from the media sender when the PLR is increasing in order to reduce the effects of the higher PLR on QoE, and
– A set of low PLR thresholds that trigger the media receiver to request a less robust configuration from the media sender when the PLR is decreasing in order to take advantage of the improved media quality supported at the lower PLR.
The high PLR and low PLR thresholds between two codec configurations can be set independently to avoid ping-pong switching by introducing hysteresis. The least robust configuration does not have a low PLR threshold and the most robust configuration does not have a high PLR threshold. The MO node does not describe the type of request message that shall be used for adapting the codec configuration as this is determined by the codec configuration being requested, i.e., in-band RTP CMR if requesting a speech codec mode change, RTCP-APP if requesting a speech application layer redundancy change, and TMMBR for video.
Figure 17.2: High and Low PLR thresholds for media robustness adaptation
The MO node also provides a flag to indicate whether PLR measurements are to be made before or after the de-jitter buffer in the receiver. Measuring before provides an estimate of the PLR over the transport link to the media receiver while measuring after provides a PLR estimate that is closer to the QoE after the media decoder. The MO node also specifies the sliding averaging window over which PLR estimates are to be calculated.
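The switching between configurations with hysteresis can be sketched as follows. The sketch assumes that the configurations are indexed from 0 (least robust) upwards, that the PLR value passed in is already the estimate over the sliding averaging window, and that a missing threshold is represented by None. The function name and the list layout are hypothetical and do not reflect the structure of the MO node.

```python
def next_configuration(index, plr, high_thresholds, low_thresholds):
    """Return the index of the configuration to request from the media sender.

    high_thresholds[i]: PLR above which configuration i + 1 is requested
                        (None for the most robust configuration).
    low_thresholds[i]:  PLR below which configuration i - 1 is requested
                        (None for the least robust configuration).
    """
    high = high_thresholds[index]
    low = low_thresholds[index]
    if high is not None and plr > high:
        return index + 1  # request a more robust configuration
    if low is not None and plr < low:
        return index - 1  # request a less robust configuration
    return index          # inside the hysteresis band: no request
```

For example, with high_thresholds = [2.0, 5.0, None] and low_thresholds = [None, 1.0, 3.0], a receiver that moved from configuration 0 to configuration 1 at a PLR of 2.5 % stays in configuration 1 until the PLR falls below 1.0 %, which avoids ping-pong switching around a single threshold.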
The parameters are specified independent of the media codec or radio access bearer technology. In addition, vendor specific parameters of the implementation can be placed under Ext nodes. Detailed descriptions of the speech robustness adaptation parameters can be found in table 17.1.