6 Data transport

26.2343GPPProtocols and codecsRelease 17Transparent end-to-end Packet-switched Streaming Service (PSS)TS

6.1 Packet based network interface

PSS clients and servers shall support an IP-based network interface for the transport of session control and media data. Control and media data are sent using TCP/IP [8] and UDP/IP [7]. An overview of the protocol stack can be found in figure 2 of the present document.

6.2 RTP over UDP/IP

6.2.1 General

The IETF RTP [9] provides means for sending real-time or streaming data over UDP (see [7]). The encoded media is encapsulated in the RTP packets with media specific RTP payload formats. RTP payload formats are defined by IETF. RTP also provides a protocol called RTCP (see clause 6 in [9]) for feedback about the transmission quality.

RTP/UDP/IP transport of speech, audio and video shall be supported. RTP/UDP/IP transport of timed text should be supported. Sending of RTCP shall be performed according to the used RTP profile, indicated RTCP bandwidth, and other RTCP related parameters. The transmission times of RTCP shall be controlled by algorithms performing as the ones specified in the RTP specification [9], and if AVPF is used according to [57]. For information on how the RTCP transmission interval depends on different values of the RTCP parameters, see Annex A.3.2.3.

6.2.2 RTP profiles

For RTP/UDP/IP transport of continuous media the following RTP profile shall be supported:

– RTP Profile for Audio and Video Conferences with Minimal Control [10], also called RTP/AVP;

For RTP/UDP/IP transport of continuous media the following RTP profile should be supported:

– Extended RTP Profile for RTCP-based Feedback (RTP/AVPF) [57], also called RTP/AVPF. A PSS client or server shall support the generic NACK message specified in section 6.2.1 of [57] if RTP retransmission is supported. A PSS client or server is not required to support the other feedback formats specified in section 6 of [57].

Clause A.3.2.3 in Annex A of the present document provides more information about the minimum RTCP transmission interval.

6.2.3 RTP and RTCP extensions

6.2.3.1 RTCP extended reports

A PSS client should implement the framework and SDP signalling of the RTP Control Protocol Extended Reports [58]. A PSS client should further implement the following report formats:

– Loss RLE Report Block defined in section 4.1of [58].

A PSS client should send the report block(s) indicated by SDP signalling from the PSS server. A PSS server may limit the report blocks size using SDP signalling. For best utility the client should report in every packet and provide redundancy by reporting also on past RTCP intervals. In cases where size restrictions prevent the client from both reporting on all the RTP packets and providing redundancy, the client shall stop the redundant reporting to address this restriction. If this action is still not enough to reduce the reports to satisfactory sizes, the client may then choose not to send the report in every packet.

6.2.3.2 RTCP App packet for client buffer feedback (NADU APP packet)

A PSS client supporting Signalling for Client Buffer Feedback (see clause 10.2.3) shall report the next application data unit to be decoded for buffer status reporting and rate adaptation by sending the RTCP APP packet. A NADU APP packet shall be sent only after the client has received at least one RTP packet on the media stream and shall be accompanied by a complementary RR packet. The RR and NADU packets shall contain information that represents a single simultaneous ‘snapshot’ of the media stream. The format of a generic RTCP APP packet is shown in Figure 3 below:

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| subtype | PT=APP=204 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC/CSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| name (ASCII) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| application-dependent data …
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: Generic Format of an RTCP APP packet.

For rate adaptation the name and subtype fields must be set to the following values:

name: The NADU APP data format is detected through the name "PSS0", i.e. 0x50535330 and the subtype.

subtype: This field shall be set to 0 for the NADU format.

length: The number of 32 bit words –1, as defined in RFC 3550 [9]. This means that the field will be 2+3*N, where N is the number of sources reported on. The length field will typically be 5, i.e. 24 bytes packets.

application-dependent data: One or more of the following data format blocks (as described in Figure 4) can be included in the application-dependent data location of the APP packet. The APP packets length field is used to detect how many blocks of data are present. The block shall be sent for the SSRCs for which there are a report block as part of either a Receiver Report or a Sender Report, included in the RTCP compound packet. A NADU APP packet shall not contain any other data format than the one described in figure 4 below.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Playout Delay | NSN |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved | NUN | Free Buffer Space (FBS) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4: Data format block for NADU reporting

SSRC: The SSRC of the media stream the buffered packets belong to.

Playout delay (16 bits): The difference in milliseconds between the scheduled playout time of the next ADU to be decoded, (whose sequence number is indicated in the NSN field) and the current time when generating the RTCP packet that contains the NADU APP block, both measured on the media playout clock. The client shall always indicate this value, unless it is not well defined, when it may use the reserved value (0xFFFF). When the buffer is empty (the client has not yet received the packet with sequence number NSN), the playout delay is not well defined and the client should use the reserved value 0xFFFF for this field. When the media clock is not advancing (e.g. while paused or re-buffering), the playout delay corresponds to the difference between the playout time of the next ADU and the media time at which playout will resume.

The point at which the media playout clock is measured should be chosen such that, if the only packet in the buffer is that with sequence number NSN, the playout delay indicates the time remaining until the media playout will ‘starve’ and this stream might need re-buffering. In the calculations of playout delay above, this point is used to determine the playout point of a media packet even though actual playout may occur later in the decoding chain. The target buffer time (see clause 5.3.2.2) must be measured from the same point.

The playout delay allows the server to have a more precise value of the amount of time before the client will underflow. The playout delay shall be computed until the actual media playout (i.e., audio playback or video display).

NSN (16 bits): The RTP sequence number of the next ADU to be decoded for the SSRC reported on. In the case where the buffer does not contain any packets for this SSRC, the next not yet received sequence number shall be reported, i.e. an NSN value that is one larger than the least significant 16 bits of the RTCP SR or RR report block’s "extended highest sequence number received".

NUN (5 bits): The unit number (within the RTP packet) of the next ADU to be decoded. The first unit in a packet has a unit number equal to zero. The unit number is incremented by one for each ADU in an RTP packet. In the case of an audio codec, an ADU is defined as an audio frame. In the case of H.264 (AVC) or H.265 (HEVC), an ADU is defined as a NAL unit. FBS (16 bit): The amount of free buffer space available in the client at the time of reporting. The reported free buffer space shall be less than or equal to the buffer space that has been reported as available for adaptation by the 3GPP-Adaptation RTSP header, see clause 5.3.2.2. The amount of free buffer space are reported in number of complete 64 byte blocks, thus allowing for up to 4194304 bytes to be reported as free. If more is available, it shall be reported as the maximal amount available, i.e. 4194304 with a field value 0xffff.

Reserved (11 bits): These bits are not used and shall be set to 0 and shall be ignored by the receiver.

6.2.3.3 RTP retransmission

6.2.3.3.1 General

A PSS client should implement RTP retransmission. A PSS server should implement RTP retransmission. A PSS client or server implementing RTP retransmission shall implement the payload format, SDP signalling and mechanisms of the RTP retransmission payload format [81]. In addition to the specifications and recommendations in [81], a PSS client and server supporting RTP retransmission shall follow the definitions in the following clauses.

6.2.3.3.2 Multiplexing scheme

The RTP retransmission payload format [81] provides two different schemes for multiplexing the original and the retransmission stream, i.e. session-multiplexing and SSRC-multiplexing. PSS servers shall use SSRC-multiplexing and shall not use session-multiplexing.

6.2.3.3.3 RTCP retransmission request

PSS clients shall use the NACK feedback message format defined in the "Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)" [57] for requesting the retransmission of RTP packets.

Before requesting the retransmission of RTP packets the client should assess whether a requested packet can be decoded in time by checking the latest receiver buffer status. If the client sends RTCP APP packets for client buffer feedback, as defined in section 6.2.3.2, the same assessment should be performed by the server, according to the latest RTCP APP packet it has received.

6.2.3.3.4 Congestion control and usage with rate adaptation

To avoid network congestion due to the additional bandwidth required for the retransmission of lost packets, a PSS server or client implementing RTP retransmission shall estimate the available link rate and adapt the total transmission rate of the RTP session, including retransmissions, to the available link rate. The actual algorithms providing link-rate estimation and transmission-rate adaptation are implementation specific. Rules and information sources for the estimation of the available link rate are described in clause 10.2.1 of the present document. To adapt the total transmission rate including retransmissions, a PSS server can e.g. skip retransmissions, use the transmission rate adaptation described in clause 10.2.2 of the present document or use any other suitable method.

If the server uses multiple streams for rate adaptation, the server may receive retransmission requests for a stream that is different from the one it is currently using. The server should thus not flush its retransmission buffer after switching streams.

6.2.4 RTP payload formats

For RTP/UDP/IP transport of continuous media the following RTP payload formats shall be used:

– AMR narrow-band speech codec (see clause 7.2) RTP payload format according to [11]. A PSS client is not required to support multi-channel sessions;

– AMR wideband speech codec (see clause 7.2) RTP payload format according to [11]. A PSS client is not required to support multi-channel sessions;

– Extended AMR-WB codec (see clause 7.3) RTP payload format according to [85];

– Enhanced aacPlus and MPEG-4 AAC codec (see clause 7.3) RTP payload format according to [13]; the size of audioMuxElements shall be limited to the maximum size of one audio frame, which is 6144 bits per AAC channel; moreover multiplexing of multiple audio frames into one audioMuxElement should be avoided if this would lead to fragmentation across RTP packets;

– – H.264 (AVC) video codec (see clause 7.4) RTP payload format according to [92]. A PSS client is required to support all three packetization modes: single NAL unit mode, non-interleaved mode and interleaved mode. For the interleaved packetization mode, a PSS client shall support streams for which the value of the "sprop-deint-buf-req" MIME parameter is less than or equal to MaxCPB * 1000 / 8, inclusive, in which "MaxCPB" is the value for VCL parameters of the H.264 (AVC) profile and level in use, as specified in [90]. Parameter sets shall not be transmitted within the RTP payload, i.e., all parameter sets required for a session must be provided in the SDP;

– H.265 (HEVC) video codec (see clause 7.4) RTP payload format according to [118];

– 3GPP timed text format (see clause 7.9) RTP payload format according to [80];

– encrypted "enc-isoff-generic" (see Annex R) RTP payload format according [102];

– RTP retransmission payload format according to [81];

– RTP Header Extension to signal CVO information as specified in clause 6.2.5.

6.2.5 Coordination of Video Orientation

In the PSS RTP-streaming context, Coordination of Video Orientation consists in signalling of the current orientation of the image to a CVO-capable PSS client for appropriate rendering and displaying. A CVO-capable PSS server performs signalling of the CVO by indicating this in the SDP as specified in clause 5.3.3.2 and using RTP Header Extensions with a byte formatted as specified in clause 7.4.5 of TS 26.114 [115] for CVO (corresponding to urn:3gpp:video-orientation) and Higher Granularity CVO (corresponding to urn:3gpp:video-orientation:6).

A CVO-capable PSS client should rotate the video to compensate the rotation as indicated in Tables 7.2 (for CVO) and 7.3 (for Higher Granularity CVO) of clause 7.4.5 in [115]. When compensating for both rotation and flip, the operations shall be performed in the LSB to MSB order i.e. rotation compensation first and then flip.

A CVO-capable PSS server will typically add the payload bytes as defined in this clause to the last RTP packet in each group of packets which make up a key frame (I-frame or IDR picture in H.264). The PSS server may also add the payload bytes onto the last RTP packet in each group of packets which make up another type of frame (e.g. a P-Frame) only if the current value is different from the previous value sent.

If this is the only header extension present, a total of 8 bytes are appended to the RTP header, and the last packet in the sequence of RTP packets is marked with both the marker bit and the Extension bit, as defined in RFC3550 [9].

6.3 HTTP over TCP/IP

The IETF TCP provides reliable transport of data over IP networks, but with no delay guarantees. It is the preferred way for sending the scene description, text, bitmap graphics and still images. There is also need for an application protocol to control the transfer. The IETF HTTP [17] provides this functionality.

HTTP/TCP/IP transport shall be supported for:

– still images (see clause 7.5);

– bitmap graphics (see clause 7.6);

– synthetic audio (see clause 7.3A);

– vector graphics (see clause 7.7);

– text (see clause 7.8);

– timed text (see clause 7.9);

– HTML5 and related resources;

– presentation description (see clause 5.3.3).

HTTP/TCP/IP transport should be supported for:

– 3GP files for download, progressive download and Dynamic Adaptive Streaming over HTTP (see 3GPP TS 26.247 [112]).

Transport security in HTTP over TCP/IP is achieved using the HTTPS (Hypertext Transfer Protocol Secure) as specified in RFC2818 [105] and TLS as specified in TLS profile of Annex E in TS 33.310 [111]. In case secure delivery is desired, HTTPS and TLS should be used to authenticate the server and to ensure secure transport of the content from server to client.

6.4 Transport of RTSP

Transport of RTSP shall be supported according to RFC 2326 [5].