8 Asset information
3GPP TS 26.244, Release 17: Transparent end-to-end Packet-switched Streaming Service (PSS); 3GPP file format (3GP)
8.1 General
Asset information in a 3GP file describes the contained media. Clause 8.2 defines 3GPP asset meta data that is backward compatible with Release 6. To provide richer information for audio, ID3 version 2 (ID3v2) tags may also be included as described in clause 8.3.
8.2 3GPP asset meta data
A user-data box (‘udta’), as defined in [7], may be present in conforming files. It should reside within the Movie box, but may reside within the Track box, following the hierarchy of boxes described in clause 6.2.
Within the user-data box, there may reside sub-boxes that contain asset meta-data, taken from the list of boxes in tables 8.1 through 8.12d below (zero or more sub-boxes of each kind, zero or one for each language or role of location information). Each of the sub-boxes conforms to the definition of a "full box" as specified in [7] (hence the ‘Version’ and ‘Flags’ fields).
The following sub-boxes are in use for the following purposes:
– titl – title for the media (see table 8.1)
– dscp – caption or description for the media (see table 8.2)
– cprt – notice about organisation holding copyright for the media file (see table 8.3)
– perf – performer or artist (see table 8.4)
– auth – author of the media (see table 8.5)
– gnre – genre (category and style) of the media (see table 8.6)
– rtng – media rating (see table 8.7)
– clsf – classification of the media (see table 8.8)
– kywd – media keywords (see table 8.9)
– loci – location information (see table 8.10)
– albm – album title and track number for the media (see table 8.11)
– yrrc – recording year for the media (see table 8.12)
– coll – name of the collection from which the media comes (see table 8.12a)
– urat – user ‘star’ rating of the media (see table 8.12b)
– thmb – thumbnail image of the media (see table 8.12c)
– orie – orientation information (see table 8.12d)
Table 8.1: The Title box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘titl’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Title | String | Text of title | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Title: null-terminated string in either UTF-8 or UTF-16 characters, giving title information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
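The packing rule for the Language field can be illustrated with a minimal Python sketch. It assumes that the Pad bit and the three 5-bit values are read as one big-endian 16-bit integer with the first character in the most significant bits, as is conventional for the ISO base media file format [7]. The function names `pack_language` and `unpack_language` are hypothetical and not part of the present document.

```python
def pack_language(code: str) -> int:
    """Pack a three-letter lower-case code into the 16-bit Pad + Language field."""
    if len(code) != 3 or not all("a" <= c <= "z" for c in code):
        raise ValueError("expected three lower-case ASCII letters")
    value = 0
    for c in code:
        # Each character is stored as (ASCII value - 0x60), which fits in 5 bits.
        value = (value << 5) | (ord(c) - 0x60)
    # Three 5-bit values occupy 15 bits, so the top (Pad) bit stays 0.
    return value


def unpack_language(value: int) -> str:
    """Recover the three-letter code from the 16-bit Pad + Language field."""
    return "".join(chr(((value >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0))
```

For example, under these assumptions the code "eng" packs to 0x15C7 and "und" packs to 0x55C4.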
Table 8.2: The Description box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘dscp’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Description | String | Text of description | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Description: null-terminated string in either UTF-8 or UTF-16 characters, giving description information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
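A reader needs to tell the two string encodings apart before it can locate the terminator. The following Python sketch shows one possible approach: it treats a leading byte order mark as the signal for UTF-16 and otherwise falls back to UTF-8. Accepting a little-endian mark and tolerating a missing terminator are assumptions of this sketch, not requirements of the present document, and the name `decode_text` is hypothetical.

```python
import codecs
from typing import Tuple


def decode_text(data: bytes) -> Tuple[str, int]:
    """Decode one null-terminated string; return (text, bytes consumed)."""
    if data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE):
        # UTF-16: step over 16-bit code units until the 0x0000 terminator.
        end = 2
        while end + 1 < len(data) and data[end:end + 2] != b"\x00\x00":
            end += 2
        # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
        text = data[:end].decode("utf-16")
        return text, min(end + 2, len(data))
    # UTF-8: the terminator is a single zero byte.
    end = data.find(b"\x00")
    if end < 0:
        end = len(data)  # assumption: tolerate a missing terminator
    return data[:end].decode("utf-8"), min(end + 1, len(data))
```

The returned byte count lets a caller continue parsing any fields that follow the string in the same box.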
Table 8.3: The Copyright box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘cprt’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Copyright | String | Text of copyright notice | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Copyright: null-terminated string in either UTF-8 or UTF-16 characters, giving copyright information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
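The boxes in tables 8.1 to 8.6 share the same layout, so a writer can serialise them with one routine. The Python sketch below is illustrative only: it assumes UTF-8 text, big-endian fields, and a BoxHeader.Size that counts the whole box including its header, as is usual for boxes defined in [7]. The function name `build_text_box` is hypothetical.

```python
import struct


def build_text_box(box_type: bytes, language: str, text: str) -> bytes:
    """Serialise a full box holding Pad + Language followed by one UTF-8 string."""
    packed = 0
    for c in language:
        packed = (packed << 5) | (ord(c) - 0x60)  # 5 bits per character
    payload = struct.pack(">H", packed) + text.encode("utf-8") + b"\x00"
    # Header: Size (4 bytes) + Type (4 bytes) + Version and Flags (4 bytes, all zero).
    size = 12 + len(payload)
    return struct.pack(">I4sI", size, box_type, 0) + payload
```

Calling `build_text_box(b"cprt", "eng", "ACME")` under these assumptions yields a 19-byte Copyright box.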
Table 8.4: The Performer box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘perf’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Performer | String | Text of performer | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Performer: null-terminated string in either UTF-8 or UTF-16 characters, giving performer information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.5: The Author box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘auth’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Author | String | Text of author | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Author: null-terminated string in either UTF-8 or UTF-16 characters, giving author information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.6: The Genre box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘gnre’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Genre | String | Text of genre | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Genre: null-terminated string in either UTF-8 or UTF-16 characters, giving genre information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.7: The Rating box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘rtng’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| RatingEntity | Unsigned int(32) | Four-character code rating entity | |
| RatingCriteria | Unsigned int(32) | Four-character code rating criteria | |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| RatingInfo | String | Text of media-rating information | |
RatingEntity: four-character code that indicates the rating entity grading the asset, e.g., ‘BBFC’. The values of this field should follow common names of worldwide movie rating systems, such as those mentioned in [http://www.movie-ratings.net/, October 2002].
RatingCriteria: four-character code that indicates which rating criteria are being used for the corresponding rating entity, e.g., ‘PG13’.
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
RatingInfo: null-terminated string in either UTF-8 or UTF-16 characters, giving rating information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.8: The Classification box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘clsf’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| ClassificationEntity | Unsigned int(32) | Four-character code classification entity | |
| ClassificationTable | Unsigned int(16) | Index to classification table | |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| ClassificationInfo | String | Text of media-classification information | |
ClassificationEntity: four-character code that indicates the classification entity classifying the asset. The values of this field should follow names of worldwide classification systems to be identified, but may be assigned blanks to indicate no specific classification entity.
ClassificationTable: binary code that indicates which classification table is being used for the corresponding classification entity. 0x00 is reserved to indicate no specific classification table.
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
ClassificationInfo: null-terminated string in either UTF-8 or UTF-16 characters, giving classification information, taken from the corresponding classification table, if specified. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.9: The Keywords box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘kywd’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| KeywordCnt | Unsigned int(8) | Binary number of keywords | |
| Keywords | KeywordStruct[KeywordCnt] | Array of structures that hold the actual keywords (see Table 8.9.1) | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
KeywordCnt: binary code that indicates the number of keywords provided. This number shall be greater than 0.
Keywords: Array of structures that hold the actual keywords, according to table 8.9.1.
Table 8.9.1: The Keyword Struct
| Field | Type | Details | Value |
|---|---|---|---|
| KeywordSize | Unsigned int(8) | Binary size of keyword | |
| KeywordInfo | String | Text of keyword | |
KeywordSize: binary code that indicates the total size (in bytes) of the keyword information field.
KeywordInfo: null-terminated string in either UTF-8 or UTF-16 characters, giving keyword information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
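The nested layout of the Keywords box (a count followed by size-prefixed strings) can be sketched in Python as follows. The sketch operates on the bytes that follow the full-box header and assumes that KeywordSize counts all bytes of KeywordInfo including its terminator, and that UTF-16 strings are big-endian. The name `parse_keywords` is hypothetical.

```python
def parse_keywords(payload: bytes):
    """Parse the 'kywd' payload that follows the full-box header.

    Returns (packed language value, list of keyword strings).
    """
    language = int.from_bytes(payload[0:2], "big") & 0x7FFF  # drop the Pad bit
    count = payload[2]  # KeywordCnt
    pos, keywords = 3, []
    for _ in range(count):
        size = payload[pos]  # KeywordSize
        raw = payload[pos + 1:pos + 1 + size]  # KeywordInfo
        pos += 1 + size
        encoding = "utf-16" if raw[:2] == b"\xfe\xff" else "utf-8"
        # Remove the terminator after decoding so both encodings are handled alike.
        keywords.append(raw.decode(encoding).rstrip("\x00"))
    return language, keywords
```

Because each keyword carries its own size, a reader can skip keywords it cannot decode without scanning for the terminator.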
Table 8.10: The Location Information box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘loci’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Name | String | Text of place name | |
| Role | Unsigned int(8) | Non-negative value indicating role of location | |
| Longitude | Unsigned int(32) | Fixed-point value of the longitude | |
| Latitude | Unsigned int(32) | Fixed-point value of the latitude | |
| Altitude | Unsigned int(32) | Fixed-point value of the Altitude | |
| Astronomical_body | String | Text of astronomical body | |
| Additional_notes | String | Text of additional location-related information | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Name: null-terminated string in either UTF-8 or UTF-16 characters, indicating the name of the place. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Role: indicates the role of the place. Value 0 indicates "shooting location", 1 indicates "real location", and 2 indicates "fictional location". Other values are reserved.
Longitude: fixed-point 16.16 number indicating the longitude in degrees. Negative values represent western longitude.
Latitude: fixed-point 16.16 number indicating the latitude in degrees. Negative values represent southern latitude.
Altitude: fixed-point 16.16 number indicating the altitude in meters. The reference altitude, indicated by zero, is set to the sea level.
Astronomical_body: null-terminated string in either UTF-8 or UTF-16 characters, indicating the astronomical body on which the location exists, e.g. "earth". If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Additional_notes: null-terminated string in either UTF-8 or UTF-16 characters, containing any additional location-related information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
NOTE 1: If the location information refers to a time-variant location, ‘Name’ should express a high-level location, such as "Finland" for several places in Finland or "Finland-Sweden" for several places in Finland and Sweden. Further details on time-variant locations can be provided as ‘Additional_notes’.
NOTE 2: The values of longitude, latitude and altitude provide cursory Global Positioning System (GPS) information of the media content.
NOTE 3: A value of longitude (latitude) that is less than -180 (-90) or greater than 180 (90) indicates that the GPS coordinates (longitude, latitude, altitude) are unspecified, i.e. none of the given values for longitude, latitude or altitude are valid.
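The fixed-point 16.16 fields and the rule in NOTE 3 can be illustrated with a short Python sketch. Since negative values are permitted, the sketch assumes that the 32-bit field is interpreted as a big-endian two's-complement integer whose low 16 bits are the fraction; this interpretation, and the function names, are assumptions of the sketch.

```python
import struct


def decode_16_16(raw: bytes) -> float:
    """Convert a 4-byte fixed-point 16.16 field to a floating-point number."""
    (value,) = struct.unpack(">i", raw)  # signed, big-endian
    return value / 65536.0


def encode_16_16(x: float) -> bytes:
    """Convert a number to a 4-byte fixed-point 16.16 field."""
    return struct.pack(">i", round(x * 65536))


def coordinates_specified(longitude: float, latitude: float) -> bool:
    """Apply NOTE 3: out-of-range values mean the coordinates are unspecified."""
    return -180.0 <= longitude <= 180.0 and -90.0 <= latitude <= 90.0
```

With this representation the resolution is 1/65536 of a degree for longitude and latitude, and 1/65536 of a metre for altitude.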
Table 8.11: The Album box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘albm’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| AlbumTitle | String | Text of album title | |
| TrackNumber | Unsigned int(8) | Optional integer with track number | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
AlbumTitle: null-terminated string in either UTF-8 or UTF-16 characters, giving album information. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
TrackNumber: the track number (order number) of the media on this album. This is an optional field.
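Because TrackNumber is optional, a reader has to decide whether it is present. One plausible approach, sketched below in Python, is to check whether a byte remains inside the box after the terminator of AlbumTitle. This detection rule is an assumption of the sketch rather than something stated in the present document, and the name `parse_album` is hypothetical. The input is the payload that follows the full-box header.

```python
def parse_album(payload: bytes):
    """Parse the 'albm' payload; return (album title, track number or None)."""
    body = payload[2:]  # skip the 16-bit Pad + Language field
    if body[:2] in (b"\xfe\xff", b"\xff\xfe"):
        # UTF-16: find the 0x0000 terminator on a 16-bit boundary.
        end = 2
        while end + 1 < len(body) and body[end:end + 2] != b"\x00\x00":
            end += 2
        title, rest = body[:end].decode("utf-16"), body[end + 2:]
    else:
        end = body.find(b"\x00")
        if end < 0:
            end = len(body)
        title, rest = body[:end].decode("utf-8"), body[end + 1:]
    # Assumption: a remaining byte after the title is the TrackNumber field.
    track = rest[0] if rest else None
    return title, track
```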
Table 8.12: The Recording Year box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘yrrc’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| RecordingYear | Unsigned int(16) | Integer value of recording year | |
RecordingYear: the year when the media was recorded.
Table 8.12a: The Collection name box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘coll’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| Name | String | Text of collection name | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive.
Name: null-terminated string in either UTF-8 or UTF-16 characters, giving collection name information. A collection contains works that may be conceptually independent, usually with some aspect in common, and may be user-defined. If UTF-16 is used, the string shall start with the BYTE ORDER MARK (0xFEFF).
Table 8.12b: The User-rating box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘urat’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Unsigned int(24) | | 0 |
| StarRating | Unsigned int(8) | User’s ‘star’ rating | |
StarRating: either the value 0 (indicating no rating assigned) or a value in the range 10 through 50 inclusive, indicating a rating between 1 star (1.0, lowest rated by the user) and 5 stars (5.0, highest rated by the user) inclusive.
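The mapping from the StarRating field to a number of stars amounts to a division by ten, with 0 treated as a special case. A minimal Python sketch follows; the function name `stars_from_rating` and the choice to reject out-of-range values with an exception are assumptions of the sketch.

```python
def stars_from_rating(star_rating: int):
    """Map the StarRating field to a star count, or None if no rating is assigned."""
    if star_rating == 0:
        return None  # no rating assigned
    if not 10 <= star_rating <= 50:
        raise ValueError("StarRating must be 0 or in the range 10..50")
    return star_rating / 10.0  # e.g. 35 corresponds to 3.5 stars
```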
Table 8.12c: The Thumbnail box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘thmb’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Format | Unsigned int(32) | Four-character code of the coding format | |
| Data | bytes to end of box | Image data | |
Format: four-character code that indicates the encoding system for the thumbnail or thumbnail reference. The value shall be ‘jpeg’.
Data: the image data, as indicated in the Format field. The Data is the image or reference in the indicated format. The Format ‘jpeg’ indicates an image in the JPEG format, which shall conform to the requirements of section 7.5 of [3] (i.e. 3GPP TS 26.234).
Table 8.12d: The Orientation Information box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘orie’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Digital_zoom | Unsigned int(16) | Fixed-point value of the enlargement scale factor | |
| Optical_zoom | Unsigned int(16) | Fixed-point value of the optical magnification scale factor | |
| Pan_indication | Bit(1) | true or magnetic | |
| Pan | Unsigned int(32) | Fixed-point value of the compass direction in a plane parallel to the earth surface | |
| Rotation | Unsigned int(32) | Fixed-point value of the rotation position relative to the direction pointed at by the camera | |
| Tilt | Unsigned int(32) | Fixed-point value of the rotation position relative to the plane of constant altitude | |
Digital_zoom: fixed-point 8.8 number indicating the enlargement scale factor of the image due to cropping and interpolating the pixel dimensions back to the original size.
Optical_zoom: fixed-point 8.8 number indicating the optical magnification scale factor.
Pan_indication: a value of 1 indicates that the direction is "true"; a value of 0 indicates that the direction is "magnetic".
Pan: fixed-point 16.15 number indicating the compass direction of the component in the plane parallel to the earth’s surface of any vector which points in the same direction that the camera is facing. North corresponds to 0 degrees, East corresponds to 90 degrees, West corresponds to -90 degrees, and South corresponds to 180 degrees. If the camera is pointing in a direction perpendicular to the earth’s surface (either straight up at the sky or straight down at the ground), then the value of Pan is undefined.
Rotation: fixed-point 16.16 number in degrees indicating the rotational position around the axis corresponding to the direction that the camera is facing. Since Tilt and Rotation are independent parameters, Rotation is defined for a Tilt value of 0, i.e. the camera is first tilted to be pointing parallel to the earth’s surface in the direction that would correspond to Pan. Rotation is then the amount of counter-clockwise rotation about the axis that the camera is facing needed to bring a vector initially pointing straight up towards the sky into alignment with the camera "up" direction. In the event that Pan is undefined as the camera is either pointing straight up or straight down, Rotation can be defined as the amount of rotation needed to bring a vector initially pointing North into alignment with the camera "up" direction.
NOTE: The Rotation and CVO convey the same information.
Tilt: fixed-point 16.16 number in degrees indicating the rotational position around the axis in the plane of constant altitude through the camera centre that is perpendicular to the Pan direction. When the camera is pointing parallel to the earth’s surface, Tilt is 0. When the camera is pointing straight up towards the sky, Tilt is 90 degrees, and when the camera is pointing straight down towards the earth, Tilt is -90 degrees.
8.3 ID3 version 2 meta data
ID3 version 2 meta-data can be stored in 3GP files by using the Meta box defined by the ISO base media file format [7]. The procedure is specified by MP4REG, the MP4 Registration Authority [32] and is provided here for information.
The ID3v2 meta data is stored in the Meta box (‘meta’), which shall contain a Handler box with handler ‘ID32’. The actual meta data is either stored in one or more ID3v2 box(es) inside the meta-data box, or this entire set of box(es) is referenced as the primary item, and stored elsewhere. The ID3v2 box is defined in Table 8.13.
Table 8.13: ID3v2 box
| Field | Type | Details | Value |
|---|---|---|---|
| BoxHeader.Size | Unsigned int(32) | | |
| BoxHeader.Type | Unsigned int(32) | | ‘ID32’ |
| BoxHeader.Version | Unsigned int(8) | | 0 |
| BoxHeader.Flags | Bit(24) | | 0 |
| Pad | Bit(1) | | 0 |
| Language | Unsigned int(5)[3] | Packed ISO-639-2/T language code | |
| ID3v2data | Unsigned int(8)[] | Complete ID3 version 2.x.x data | |
Language: declares the language code for the following text. See ISO 639-2/T for the set of three character codes. Each character is packed as the difference between its ASCII value and 0x60. The code is confined to being three lower-case letters, so these values are strictly positive. If the ID3 tag itself contains language fields, the Language value must not conflict with them; in such cases the codes ‘mul’ (multiple languages) or ‘und’ (undetermined language) should be used instead.
ID3v2data: binary data that corresponds to ID3v2 tag format (e.g. for v.2.4.0: http://www.id3.org/id3v2.4.0-structure.txt) and its native frames (e.g. for v.2.4.0: http://www.id3.org/id3v2.4.0-frames.txt). The ID3 tag must not contain any footer information, because it is never needed. Both the ID3v2 tag format and its native frames must use the same version of the specification. The size of this field can be derived from the box size. The version of the ID3 data may be found by inspecting it.
The ID3v2 box contains complete ID3 version 2.x.x data, which should be parsed according to the ID3v2 [33] specifications for v.2.x.x tags. There may be multiple ID3v2 boxes using different language codes.
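Inspecting the ID3 data for its version can be sketched in Python as follows. The sketch relies on the ID3v2 tag header layout (the identifier "ID3", followed by a major version byte, a revision byte and a flags byte) and, for v.2.4.0 tags, on bit 0x10 of the flags byte signalling a footer. It operates on the bytes that follow the full-box header, and the name `inspect_id32_payload` is hypothetical.

```python
def inspect_id32_payload(payload: bytes):
    """Return (language code, ID3 version string, footer flag) for an 'ID32' payload."""
    packed = int.from_bytes(payload[:2], "big")
    language = "".join(chr(((packed >> s) & 0x1F) + 0x60) for s in (10, 5, 0))
    tag = payload[2:]  # ID3v2data runs to the end of the box
    if len(tag) < 10 or tag[:3] != b"ID3":
        raise ValueError("payload does not start with an ID3v2 tag header")
    major, revision, flags = tag[3], tag[4], tag[5]
    # A footer is only defined for v.2.4.0 tags and must not be present here.
    has_footer = major == 4 and bool(flags & 0x10)
    return language, "2.%d.%d" % (major, revision), has_footer
```

A reader could use the returned footer flag to reject tags that carry footer information.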