C.2 FTP

3GPP TS 33.108, 3G Security: Handover interface for Lawful Interception (LI), Release 17


C.2.1 Introduction

At the HI3 interface, FTP is used over the Internet protocol stack to deliver the result of interception. FTP is defined in IETF STD 9 [13], IP in IETF STD0005 [15], and TCP in IETF STD0007 [16].

FTP supports reliable delivery of data. The data may be temporarily buffered in the sending node (MF) in case of link failure. FTP is independent of the payload data it carries.

C.2.2 Usage of the FTP

In the packet data LI, the MF acts as the FTP client and the receiving node (LEMF) acts as the FTP server. The client pushes the data to the server.

The receiving node LEMF stores the received data as files. The sending entity (MF) may buffer files.

Several smaller intercepted data units may be gathered into larger packages before sending, to increase bandwidth efficiency.

The following configurable intercept data collection (= transfer package closing / file change) threshold parameters should be supported:

– frequency of transfer, based on send timeout, e.g. X ms.

– frequency of transfer, based on volume trigger, e.g. X octets.
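The two thresholds above can be illustrated with a minimal sketch of a collection buffer at the MF. The class name, its methods and the injected clock are hypothetical, and treating a timeout of 0 as "timer disabled" is an assumption made for the sketch (it mirrors the T2 description in table C.4).

```python
import time


class PackageBuffer:
    """Collects intercepted data units and closes a package on a threshold.

    Hypothetical helper: timeout_ms is the send timeout, max_octets the
    volume trigger. A timeout of 0 disables the timer (assumption).
    """

    def __init__(self, timeout_ms, max_octets, clock=time.monotonic):
        self.timeout_ms = timeout_ms
        self.max_octets = max_octets
        self._clock = clock
        self._units = []
        self._size = 0
        self._opened_at = None

    def add(self, unit):
        """Append one unit; return the closed package if a threshold fired."""
        if self._opened_at is None:
            self._opened_at = self._clock()
        self._units.append(unit)
        self._size += len(unit)
        return self.flush() if self.should_close() else None

    def should_close(self):
        """True if the volume trigger or the send timeout has been reached."""
        if not self._units:
            return False
        if self._size >= self.max_octets:
            return True
        if self.timeout_ms > 0:
            elapsed_ms = (self._clock() - self._opened_at) * 1000.0
            return elapsed_ms >= self.timeout_ms
        return False

    def flush(self):
        """Return the collected octets as one package and reset the buffer."""
        package = b"".join(self._units)
        self._units, self._size, self._opened_at = [], 0, None
        return package
```

A real sender would also call should_close() periodically, so that the timeout fires even when no new data unit arrives.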

There are two possible ways to send the interception data from the MF to the LEMF. One way is to produce files that each contain interception data for only one observed target (see "File naming method A)"). The other way is to multiplex all the intercepted data that the MF receives into the same sequence of general purpose interception files sent by the MF (see "File naming method B)").

The HI2 and HI3 are logically different interfaces, even though in some installations the HI2 and HI3 packet streams might also be delivered via a common transmission path from a MF to a LEMF. It is possible to correlate HI2 and HI3 packet streams by having common (referencing) data fields embedded in the IRI and the CC packet streams.

File naming:

The names for the files transferred to a LEA are formed according to one of the 2 available formats, depending on the delivery file strategy chosen (e.g. due to national convention or operator preference).

Either each file contains data of only one observed target (as in method A) or several targets’ data is put to files common to all observed target traffic through a particular MF node (as in method B).

The maximum set of allowed characters in interception file names is "a"…"z", "A"…"Z", "-", "_", ".", and decimals "0"…"9".
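As an illustration, the character restriction can be checked with a short sketch such as the following. The function name is hypothetical, and rejecting an empty name is an assumption that the text does not state.

```python
import re

# Letters, digits, "-", "_" and "." only, as listed in the text above.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_interception_filename(name):
    """Return True if every character of name is in the allowed set.

    An empty name is rejected (assumption, not stated in the text).
    """
    return _ALLOWED_NAME.fullmatch(name) is not None
```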

File naming method A):

<LIID>_<seq>.<ext>

LIID = See clause 7.1.

seq = integer ranging between [0..2^64-1], in ASCII form (not exceeding 20 ASCII digits), identifying the sequence number for file transfer from this node per a specific target.

ext = ASCII integer ranging between ["1".."8"] (in hex: 31H…38H), identifying the file type. The possible file type codings for intercepted data are shown in table C.1. The types "2", "4", and "6" are reserved for the HI3 interface and type "8" is reserved for data files according to a national requirement by using the same file naming concept.

Table C.1: Possible file types

File types that the LEA may get	Intercepted data types
"1" (in binary: 0011 0001)	IRI / as option HI1 notifications (see annex A.2.2)
"2" (in binary: 0011 0010)	CC(MO)
"4" (in binary: 0011 0100)	CC(MT)
"6" (in binary: 0011 0110)	CC(MO&MT)
"7" (in binary: 0011 0111)	IRI + CC(MO&MT)
"8" (in binary: 0011 1000)	for national use

The least significant bit, which is ‘1’ in file type "1", is reserved for indicating IRI data and may be used to indicate that the HI2 and HI3 packet streams are delivered via a common transmission path from a MF to a LEMF.

The bit 2 of the ext tells whether the CC(MO) is included in the intercepted data.

The bit 3 of the ext tells whether the CC(MT) is included in the intercepted data.

The bit 4 of the ext tells whether the intercepted data is according to a national requirement.

Thus, for CC(MO) data, the file type is "2", for CC(MT) data "4", for CC(MO&MT) data "6" and for "national use" data the file type is "8".

When HI2 and HI3 packet streams are delivered via a common transmission path from a MF to a LEMF, then the file type is "7", that indicates the presence of both the IRI and the CC(MO&MT) data.
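The bit assignments of the ext character can be illustrated with a minimal sketch that decodes the low four bits of its ASCII code. The function name and the dictionary keys are hypothetical. The values "3" and "5" are within the stated range but are not listed in table C.1, and the sketch decodes them without judging their validity.

```python
def decode_file_type(ext):
    """Decode the ext character ("1".."8") into its content flags.

    Bit 1 (least significant) = IRI, bit 2 = CC(MO), bit 3 = CC(MT),
    bit 4 = national use, taken from the low nibble of the ASCII code.
    """
    if len(ext) != 1 or not "1" <= ext <= "8":
        raise ValueError("ext must be a single character from '1' to '8'")
    bits = ord(ext) & 0x0F
    return {
        "iri": bool(bits & 0x1),
        "cc_mo": bool(bits & 0x2),
        "cc_mt": bool(bits & 0x4),
        "national": bool(bits & 0x8),
    }
```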

This alternative A is used when each target’s intercepted data is gathered per observed target to dedicated delivery files. This method provides the result of interception in a very refined form to the LEAs, but requires somewhat more resources in the sending node than alternative B. With this method, the data sorting and interpretation tasks of the LEMF are considerably easier to facilitate in near real time than in alternative B.

File naming method B):

The other choice is to use monolithic fixed format file names (with no trailing file type part in the file name):

<filenamestring> (e.g. ABXY00041014084400006)

where:

ABXY = Source node identifier part, used for all files by the mobile network operator "AB" from this MF node named "XY".

00 = year 2000

04 = month April

10 = day 10

14 = hour

08 = minutes

44 = seconds

0000 = extension

ext = file type. Coding: "2" = CC(MO), "4" = CC(MT), "6" = CC(MO&MT), "8" = national use. The type "1" is reserved for IRI data files and may be used for indicating that the HI2 and HI3 packet streams are delivered via a common transmission path from a MF to a LEMF. In such a case, the file type is "7", that indicates the presence of both the IRI and the CC(MO&MT) data.

This alternative B is used when several targets’ intercepted data is gathered to common delivery files. This method does not provide the result of interception in as refined a form to the LEAs as alternative A, but it performs faster from the MF point of view. With this method, the MF does not need to keep as many files open as in alternative A.
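A receiving application could split a method B file name into its parts roughly as sketched below. The names are hypothetical. The sketch assumes a fixed four-character source node identifier, as in the example "ABXY", and relies on the two-digit year handling of Python's strptime, which maps "00" to 2000. Neither assumption is stated as a rule in the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MethodBName:
    node_id: str         # source node identifier, e.g. "ABXY"
    timestamp: datetime  # year, month, day, hour, minutes, seconds
    extension: str       # four-digit extension, e.g. "0000"
    file_type: str       # ext, e.g. "6" for CC(MO&MT)


def parse_method_b(name, node_id_len=4):
    """Split a method B file name into its fields (illustrative only)."""
    expected = node_id_len + 12 + 4 + 1
    if len(name) != expected:
        raise ValueError(f"expected {expected} characters, got {len(name)}")
    stamp = name[node_id_len:node_id_len + 12]
    return MethodBName(
        node_id=name[:node_id_len],
        # %y maps 00..68 to 2000..2068 (Python convention, an assumption here)
        timestamp=datetime.strptime(stamp, "%y%m%d%H%M%S"),
        extension=name[node_id_len + 12:node_id_len + 16],
        file_type=name[-1],
    )
```

Applied to the example name ABXY00041014084400006, the sketch yields the node identifier "ABXY", the time 2000-04-10 14:08:44, the extension "0000" and the file type "6".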

C.2.3 Exceptional procedures

Overflow at the receiving end (LEMF) is avoided due to the nature of the protocol.

In case the transit network or receiving end system (LEMF) is down for a reasonably short time period, the local buffering at the MF will be sufficient as a delivery reliability backup procedure.

In case the transit network or receiving end system (LEMF) is down for a very long period, the local buffering at the MF may have to be terminated. Then the following intercepted data coming from the intercepting nodes towards the MF would be discarded, until the transit network or LEMF is up and running again.

C.2.4 CC contents for FTP

C.2.4.1 Fields

The logical contents of the CC-header are described here.

CC-header = (Version, HeaderLength, PayloadLength, PayloadType, PayloadTimeStamp, PayloadDirection, CCSeqNumber, CorrelationNumber, LIID, PrivateExtension).

The Information Element CorrelationNumber forms the means to correlate the IRI and CC of the communication session intercepted.

The first column indicates whether the Information Element referred is Mandatory, Conditional or Optional.

The second column is the Type in decimal.

The third column is the length of the Value in octets.

(Notation used in table C.2: M = Mandatory, O = Optional, C = Conditional).

Table C.2: Information elements in the first version of the CC header

Mode	Type	Length	Value
M	130	2	Version = the version number of the format version to be used. This field has a decimal value, this enables version changes to the format version. The values are allocated according to national conventions.
O	131	2	HeaderLength = Length of the CC-header up to the start of the payload in octets. (This field is optional since it is useful only in such cases that these information elements would be transferred without a dynamic length encapsulation that contains all the length information anyway. This field could be needed in case of e.g. adapting to a local encapsulation convention.)
O	132	2	PayloadLength = Length of the payload following the CC-header in octets. (This field is optional since it is useful only in such cases that these information elements would be transferred without a dynamic length encapsulation that contains all the length information anyway. This field could be needed in case of e.g. adapting to a local encapsulation convention.)
M	133	1	PayloadType = Type of the payload, indicating the type of the CC. This field has a decimal value. The possible PDP Type values can be found in the standards (e.g. TS 29.060 [17]). The value 255 is reserved for future PDP Types and means: "Other". The PDP Type values defined in TS 29.060 [17] are used for the GTPv2 and for the PMIP protocols as well. The PDN Type (GTPv2) or the IPv6 Home network prefix option/IPv4 home address option (PMIP) are mapped to the PDP Type values based on the IP version information.
O	134	4	PayloadTimeStamp = Payload timestamp according to intercepting node. (Precision: 1 second, timezone: UTC). Format: Seconds since 1970-01-01 as in e.g. Unix (length: 4 octets).
C	137	1	PayloadDirection = Direction of the payload data. This field has a decimal value 0 if the payload data is going towards the target (i.e. downstream), or 1 if the payload data is being sent from the target (i.e. upstream). If this information is transferred otherwise, e.g. in the protocol header, this field is not mandatory. If the direction information is not available otherwise, it is mandatory to include it here in the CC header.
O	141	4	CCSeqNumber = Identifies the sequence number of each CC packet during interception of the target. This field has a 32-bit value.
M	144	8 or 20	CorrelationNumber = Identifies an intercepted session of the observed target. This can be implemented by using e.g. the Charging Id (4 octets, see TS 32.215 [14]) with the (4-octet/16-octet) IPv4/IPv6 address of the GGSN node maintaining the PDP context attached after the first 4 octets.
			<Possible future parameters are to be allocated between 145 and 250.>
O	254	1-25	LIID = Field indicating the LIID as defined in this document. This field has a character string value, e.g. "ABCD123456".
O	255	1-N	PrivateExtension = An optional field. The optional Private Extension contains vendor or LEA or operator specific information. It is described in the document TS 29.060 [17].

Table C.3: Information elements in the second version of the CC header

Mode	Type	Length	Value
M	130	2	Version = the version number of the format version to be used. This field has a decimal value, this enables version changes to the format version. The values are allocated according to national conventions.
O	131	2	HeaderLength = Length of the CC-header up to the start of the payload in octets. (This field is optional since it is useful only in such cases that these information elements would be transferred without a dynamic length encapsulation that contains all the length information anyway. This field could be needed in case of e.g. adapting to a local encapsulation convention).
O	132	2	PayloadLength = Length of the payload following the CC-header in octets. (This field is optional since it is useful only in such cases that these information elements would be transferred without a dynamic length encapsulation that contains all the length information anyway. This field could be needed in case of e.g. adapting to a local encapsulation convention.)
M	133	1	PayloadType = Type of the payload, indicating the type of the CC. This field has a decimal value. The possible PDP Type values can be found in the standards (e.g. TS 29.060 [17]). The value 255 is reserved for future PDP Types and means: "Other". The PDP Type values defined in TS 29.060 [17] are used for the GTPv2 and for the PMIP protocols as well. The PDN Type (GTPv2) or the IPv6 Home network prefix option/IPv4 home address option (PMIP) are mapped to the PDP Type values based on the IP version information.
O	134	4	PayloadTimeStamp = Payload timestamp according to intercepting node. (Precision: 1 second, timezone: UTC). Format: Seconds since 1970-01-01 as in e.g. Unix (length: 4 octets).
C	137	1	PayloadDirection = Direction of the payload data. This field has a decimal value 0 if the payload data is going towards the target (i.e. downstream), or 1 if the payload data is being sent from the target (i.e. upstream). If this information is transferred otherwise, e.g. in the protocol header, this field is not mandatory. If the direction information is not available otherwise, it is mandatory to include it here in the CC header.
O	141	4	CCSeqNumber = Identifies the sequence number of each CC packet during interception of the target. This field has a 32-bit value.
M	144	8 or 20	CorrelationNumber = Identifies an intercepted session of the observed target. This can be implemented by using e.g. the Charging Id (4 octets, see TS 32.215 [14]) with the (4-octet/16-octet) IPv4/IPv6 address of the GGSN node maintaining the PDP context attached after the first 4 octets.
			<Possible future parameters are to be allocated between 145 and 250.>
M	251	2	MainElementID = Identifier for the TLV element that encompasses one or more HeaderElement-PayloadElement pairs for intercepted packets.
M	252	2	HeaderElementID = Identifier for the TLV element that encompasses the CC-header of a PayloadElement.
M	253	2	PayloadElementID = Identifier for the TLV element that encompasses one intercepted Payload packet.
O	254	1-25	LIID = Field indicating the LIID as defined in this document. This field has a character string value, e.g. "ABCD123456".
O	255	1-N	PrivateExtension = An optional field. The optional Private Extension contains vendor or LEA or operator specific information. It is described in the document TS 29.060 [17].

C.2.4.2 Information element syntax

The dynamic Type-Length-Value (TLV) format is used for its ease of implementation and good encoding and decoding performance. Subfield sizes: Type = 2 octets, Length = 2 octets and Value = 0…N octets. The Length covers the Value subfield only; the Type and Length subfields are excluded from it. Each standardized field has its own distinct Type.

The octets in the Type and Length subfields are in little-endian order (i.e. least significant octet first). Any multi-octet Value subfield that has a (hexadecimal 2/4/8-octet) numeric value, rather than an ASCII character string value, is also to be interpreted as little-endian (word/double word/long word). This means that the least significant octet/word/double word is sent before the more significant octet/word/double word.

TLV encoding:

Type (2 octets)

Length (2 octets)

Value (0-N octets)

Figure C.4: Information elements in the CC header
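The encoding rules above (2-octet Type, 2-octet Length, both little-endian, Length counting only the Value octets) can be sketched as follows. The function names are hypothetical. The example encodes a PayloadTimeStamp (Type 134) as a 4-octet little-endian number.

```python
import struct


def encode_tlv(ie_type, value):
    """Encode one information element: Type and Length as 2-octet
    little-endian integers, followed by the Value octets."""
    return struct.pack("<HH", ie_type, len(value)) + value


def decode_tlv(buf, offset=0):
    """Decode the element starting at offset.

    Returns (type, value, next_offset). Raises ValueError if the buffer
    ends before the announced Value length.
    """
    if offset + 4 > len(buf):
        raise ValueError("truncated Type/Length subfields")
    ie_type, length = struct.unpack_from("<HH", buf, offset)
    start, end = offset + 4, offset + 4 + length
    if end > len(buf):
        raise ValueError("truncated Value subfield")
    return ie_type, buf[start:end], end


# PayloadTimeStamp (Type 134): seconds since 1970-01-01, 4 octets, little-endian.
timestamp_ie = encode_tlv(134, struct.pack("<I", 1_000_000_000))
```

Character string values, such as the LIID (Type 254), would be passed to encode_tlv as ASCII octets without any byte reordering.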

TLV encoding can always be applied in a nested fashion for structured values.


T	L	V	T	L	v TLV TLV TLV TLV

(The small "v" refers to the start of a Value field that has inside it a nested structure).

Figure C.5: Information elements in the CC header

In figure C.6, the TLV structure for UMTS HI3 transfer is presented for the case that there is just one intercepted packet inside the CC message. (There can be more CC Header IEs and CC Payload IEs in the CC, if there are more intercepted packets in the same CC message).

Figure C.6: IE structure of a CC message that contains one intercepted packet

The first octet of the first TLV element will start right after the last octet of the header of the protocol that is being used to carry the CC information.

The first TLV element (i.e. the main TLV IE) comprises the whole dynamic length CC information, i.e. the dynamic length CC header and the dynamic length CC payload.

Inside the main TLV IE there are at least 2 TLV elements: the Header of the payload and the Payload itself. The Header contains all the ancillary IEs related to the intercepted CC packet. The Payload contains the actual intercepted packet.

There may be more than one intercepted packet in one UMTS HI3 delivery protocol message. If the Value of the main TLV IE is longer than the 2 (first) TLV Information Elements inside it, then it is an indication that there is more than one intercepted packet inside the main TLV IE (i.e. 4 or more TLV IEs in total). The number of TLV IEs in the main TLV IE is always even, since for every intercepted packet there is one TLV IE for header and one TLV IE for payload.

C.2.5 Other considerations

The FTP protocol mode parameters used:

Transmission Mode: stream

Format: non-print

Structure: file-structure

Type: binary

The FTP service command to define the file system function at the server side: STORE mode for data transmission.

The FTP client (= user-FTP process at the MF) uses e.g. the default standard FTP ports 20 (for the data connection) and 21 (for the control connection); ‘passive’ mode is supported. The data transfer process listens on the data port for a connection from a server-FTP process.

For the file transfer from the MF to the LEMF(s) e.g. the following data transfer parameters are provided for the FTP client (at the MF):

– transfer destination (IP) address, e.g. "194.89.205.4";

– transfer destination username, e.g. "LEA1";

– transfer destination directory path, e.g. "/usr/local/LEA1/1234-8291";

– transfer destination password;

– interception file type, e.g. "2" (this is needed only if the file naming method A is used).
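As an illustration of how these parameters might be used, the following minimal sketch is based on Python's standard ftplib module. The function names are hypothetical. The push function accepts any object offering the cwd and storbinary methods of ftplib.FTP, so it can be exercised without a live server.

```python
import io
from ftplib import FTP


def connect(dest_addr, username, password, passive=True):
    """Open the control connection on port 21 and log in (no anonymous login)."""
    ftp = FTP()
    ftp.connect(dest_addr, 21)
    ftp.login(username, password)
    ftp.set_pasv(passive)
    return ftp


def push_file(ftp, dest_dir, filename, data):
    """Store one interception file in the destination directory.

    storbinary() switches to binary (image) type before issuing STOR.
    """
    ftp.cwd(dest_dir)
    ftp.storbinary(f"STOR {filename}", io.BytesIO(data))
```

With the example values from the list above, a sender would call connect("194.89.205.4", "LEA1", password) once and then push_file(ftp, "/usr/local/LEA1/1234-8291", name, data) for each file.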

The LEMF may use various kinds of directory structures for the reception of interception files. It is strongly recommended that, at the LEMF machine, the structure and the access and modification rights of the storage directories are adjusted to prevent unwanted directory operations by an FTP client.

The use of IPSec services for this interface is recommended.

Timing considerations for the FTP transmission

The MF and LEMF sides control the timers to ensure reliable, near-real time data transfer. The transmission related timers are defined within the lower layers of the used protocol and are out of scope of this document.

The following timers may be used within the LI application:

Table C.4: Timing considerations

Name	Controlled by	Units	Description
T1 inactivity timer	LEMF	Seconds	Triggered by no activity within the FTP session (no new files). The FTP session is torn down when T1 expires. To send another file, a new connection is established. The timer avoids FTP session overflow at the LEMF side.
T2 send file trigger	MF	Milliseconds	Forces the file to be transmitted to the LEMF (even if the size limit has not been reached yet in case of volume trigger active). If the timer is set to 0 the only trigger to send the file is the file size parameter (see C.2.2).

C.2.6 Profiles (informative)

As there are several ways (usage profiles) in which data transfer can be arranged using FTP, this clause contains practical considerations on how the communications can be set up. Guidance is given for client-server arrangements, session establishment, timeouts, and the handling of the files (in RAM or on disk). An example batch file is described for the case where the sending FTP client uses files. If (logical) files are instead sent directly from the client’s RAM, the procedure can in principle be similar, though no script file would then be needed.

At the LEMF side an FTP server process is run, and at the MF an FTP client. No FTP server (which could be accessed from outside the operator network) shall run in the MF. The FTP client can be implemented in many ways; the FTP usage is presented here by way of example only. The FTP client can be implemented by a batch file or by a file sender program that uses FTP via an API. The login needs to occur only once per e.g. <destaddr> and <leauser> pair. Once the login is done, the files can be transferred just by repeating the "mput" command and checking the transfer status (e.g. from the API routine return value). To prevent the inactivity timer from triggering, a dummy command (e.g. "pwd") can be sent every T seconds (T should be less than L, the actual idle time limit). If the number of FTP connections is to be kept to a minimum, the FTP file transfer method B is to be preferred to method A (though method A helps the LEMF more by pre-sorting the data sent).

Simple example of a batch file extract:

FTP commands usage scenario for transferring a list of files:

To prevent FTP command line buffer overflow, the best way is to use wildcarded file names and let the FTP implementation do the file name expansion (instead of the shell). This way the number of files for one mput is not limited:

ftp <flags> <destaddr>

user <leauser> <leapasswd>

cd <destpath>

lcd <srcpath>

bin

mput <files>

nlist <lastfile> <checkfile>

close

EOF

This set of commands opens an FTP connection to a LEA site, logs in with a given account (auto‑login is disabled), transfers a list of files in binary mode, and checks the transfer status in a simplified way.

Brief descriptions for the FTP commands used in the example:

user <user‑name> <password> Identify the client to the remote FTP server.

cd <remote‑directory> Change the working directory on the remote machine to remote‑directory.

lcd <directory> Change the working directory on the local machine.

bin Set the file transfer type to support binary image transfer.

mput <local‑files> Expand wild cards in the list of local files given as arguments and do a put for each file in the resulting list. Store each local file on the remote machine.

nlist <remote‑directory> <local‑file> Print a list of the files in a directory on the remote machine. Send the output to local‑file.

close Terminate the FTP session with the remote server, and return to the command interpreter. Any defined macros are erased.

The parameters are as follows:

<flags> contains the FTP command options, e.g. "-i -n -V -p", which correspond to "interactive prompting off", "auto-login disabled", "verbose mode disabled", and "passive mode enabled". (These depend on the ftp version used.)

<destaddr> contains the IP address or DNS address of the destination (LEA).

<leauser> contains the receiving (LEA) username.

<leapasswd> contains the receiving (LEA) user’s password.

<destpath> contains the destination path.

<srcpath> contains the source path.

<files> contains the wildcarded file specification (matching the files to be transferred).

<lastfile> contains the name of the last file to be transferred.

<checkfile> is a (local) file to be checked upon transfer completion; if it exists, the transfer is considered successful.

The FTP application should do the following things if the check file is not found:

‑ keep the failed files;

‑ raise "file transfer failure" error condition (i.e. send alarm to the corresponding LEA);

‑ the data can be buffered for a time that the buffer size allows. If that would finally be exhausted, DF would start dropping the corresponding target’s data until the transfer failure is fixed;

‑ the transmission of the failed files is retried until the transfer eventually succeeds. Then the DF would again start collecting the data;

‑ upon successful file transfer the sent files are deleted from the DF.
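The handling listed above can be sketched as a small retry routine. All names are hypothetical: transfer stands for whatever routine sends one file and reports success, and raise_alarm for the mechanism that reports the "file transfer failure" condition. The sketch bounds the retries per call (an assumption made for the sketch); the caller is expected to invoke it again later, and buffer exhaustion is not modelled.

```python
def deliver_pending(pending, transfer, raise_alarm, max_attempts=3):
    """Try to deliver every buffered file.

    pending     -- dict mapping file name to file contents (the buffer)
    transfer    -- callable(name, data) returning True on success
    raise_alarm -- callable(name), invoked when a file could not be sent
    Returns the names delivered in this call. Failed files stay in pending.
    """
    delivered = []
    for name in list(pending):
        for _ in range(max_attempts):
            if transfer(name, pending[name]):
                del pending[name]        # delete only after a successful transfer
                delivered.append(name)
                break
        else:
            raise_alarm(name)            # keep the failed file and report it
    return delivered
```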

The FTP server at LEMF shall not allow anonymous login of an FTP client.

It is required that the FTP implementation guarantees that the LEMF starts processing data only after the data transfer is complete.

The following implementation example addresses a particular issue of FTP implementation. It is important however to highlight that there are multiple ways of addressing the problem in question, and therefore the given example does not in any way suggest being the default one.

The MF sends the data under a filename which indicates that the file is temporary. Once the data transfer is complete, the MF renames the temporary file to an ordinary one (as defined in F.3.2.2).

The procedure for renaming the file should be as follows:

1) open an FTP channel (if not already open) from the MF to the LEMF;

2) send the data to the LEMF using the command "put" with the temporary filename;

3) after the MF has finished sending the file, rename it to the ordinary filename with the command "ren".

Brief descriptions for the FTP commands used in the example:

ren <from-name> <to-name> Rename the file from-name to to-name.

If the FTP client wants to send files to the LEMF using the command "mput" (e.g. the MF has stored many IRI files and wants to send them all together with one command), every successfully transferred file has to be renamed after the "mput" command has ended.
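The rename procedure can be sketched with the storbinary and rename methods of Python's ftplib.FTP. The function names are hypothetical, and the ".tmp" suffix is only an illustrative marker for a temporary file, since the text does not define one. The second function mirrors the "mput" case by uploading all files first and renaming them afterwards.

```python
import io

TEMP_SUFFIX = ".tmp"  # hypothetical marker for a temporary file


def put_then_rename(ftp, final_name, data, temp_suffix=TEMP_SUFFIX):
    """Upload under a temporary name, then rename to the ordinary name."""
    temp_name = final_name + temp_suffix
    ftp.storbinary(f"STOR {temp_name}", io.BytesIO(data))
    ftp.rename(temp_name, final_name)


def put_many_then_rename(ftp, files, temp_suffix=TEMP_SUFFIX):
    """Upload every file under a temporary name, then rename each one."""
    uploaded = []
    for final_name, data in files.items():
        temp_name = final_name + temp_suffix
        ftp.storbinary(f"STOR {temp_name}", io.BytesIO(data))
        uploaded.append((temp_name, final_name))
    for temp_name, final_name in uploaded:
        ftp.rename(temp_name, final_name)
```

With this arrangement the LEMF can ignore any file that still carries the temporary marker, which is one way of ensuring that processing starts only after the transfer is complete.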