The specification describes the box as follows:
8.3.2 Track Header Box
8.3.2.1 Definition
Box Type: ‘tkhd’
Container: Track Box (‘trak’)
Mandatory: Yes
Quantity: Exactly one
This box specifies the characteristics of a single track. Exactly one Track Header Box is contained in a track.
In the absence of an edit list, the presentation of a track starts at the beginning of the overall presentation. An empty edit is used to offset the start time of a track. The default value of the track header flags for media tracks is 7 (track_enabled, track_in_movie, track_in_preview). If in a presentation all tracks have neither track_in_movie nor track_in_preview set, then all tracks shall be treated as if both flags were set on all tracks. Server hint tracks should have the track_in_movie and track_in_preview set to 0, so that they are ignored for local playback and preview.
Under the ‘iso3’ brand or brands that share its requirements, the width and height in the track header are measured on a notional ‘square’ (uniform) grid. Track video data is normalized to these dimensions (logically) before any transformation or placement caused by a layup or composition system. Track (and movie) matrices, if used, also operate in this uniformly‐scaled space.
The duration field here does not include the duration of following movie fragments, if any, but only of the media in the enclosing Movie Box. The Movie Extends Header box may be used to document the duration including movie fragments, when desired and possible.
version is an integer that specifies the version of this box (0 or 1 in this specification)
flags is a 24‐bit integer with flags; the following values are defined:
track_enabled: Indicates that the track is enabled. Flag value is 0x000001. A disabled track (the low bit is zero) is treated as if it were not present.
track_in_movie: Indicates that the track is used in the presentation. Flag value is 0x000002.
track_in_preview: Indicates that the track is used when previewing the presentation. Flag value is 0x000004.
track_size_is_aspect_ratio: Indicates that the width and height fields are not expressed in pixel units. The values have the same units but these units are not specified. The values are only an indication of the desired aspect ratio. If the aspect ratios of this track and other related tracks are not identical, then the respective positioning of the tracks is undefined, possibly defined by external contexts. Flag value is 0x000008.
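As a small illustration of the flag values listed above, the following Python sketch names the bits set in a 24-bit flags value. The class name `TkhdFlags` and the helper `describe_flags` are made up for this example and are not part of the specification.

```python
from enum import IntFlag


class TkhdFlags(IntFlag):
    """Flag bits defined for the Track Header Box (hypothetical helper type)."""
    TRACK_ENABLED = 0x000001
    TRACK_IN_MOVIE = 0x000002
    TRACK_IN_PREVIEW = 0x000004
    TRACK_SIZE_IS_ASPECT_RATIO = 0x000008


def describe_flags(value: int) -> list[str]:
    """Return the names of the defined flags that are set in `value`."""
    return [flag.name for flag in TkhdFlags if value & flag]


if __name__ == "__main__":
    # 7 is the default for media tracks; 3 is the value seen in the dump.
    print(describe_flags(7))
    print(describe_flags(3))
```

A flags value of 3, as in the hex dump further down, therefore means the track is enabled and used in the presentation but not in the preview.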
creation_time is an integer that declares the creation time of this track (in seconds since midnight, Jan. 1, 1904, in UTC time).
modification_time is an integer that declares the most recent time the track was modified (in seconds since midnight, Jan. 1, 1904, in UTC time).
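Both timestamps count seconds from midnight, 1 January 1904 (UTC), not from the Unix epoch. A minimal Python sketch of the conversion follows; the names `MP4_EPOCH` and `mp4_time_to_datetime` are chosen for this example only.

```python
from datetime import datetime, timedelta, timezone

# Epoch used by creation_time and modification_time.
MP4_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def mp4_time_to_datetime(seconds: int) -> datetime:
    """Convert seconds since 1904-01-01 00:00:00 UTC to an aware datetime."""
    return MP4_EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    # A stored value of 0 maps to the epoch itself.
    print(mp4_time_to_datetime(0).isoformat())
    # 2082844800 seconds separate the 1904 epoch from the Unix epoch.
    print(mp4_time_to_datetime(2082844800).isoformat())
```

In the dump below both fields are zero, which most likely means the writer simply left them unset rather than that the track dates from 1904.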
track_ID is an integer that uniquely identifies this track over the entire life‐time of this presentation. Track IDs are never re‐used and cannot be zero.
duration is an integer that indicates the duration of this track (in the timescale indicated in the Movie Header Box). The value of this field is equal to the sum of the durations of all of the track’s edits. If there is no edit list, then the duration is the sum of the sample durations, converted into the timescale in the Movie Header Box. If the duration of this track cannot be determined then duration is set to all 1s.
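Because the duration is counted in the Movie Header Box timescale, it only becomes a time in seconds once that timescale is known. The sketch below assumes the timescale has already been read from the mvhd box; the function name and the choice to return `None` for an unknown duration are this example's own conventions.

```python
from typing import Optional


def tkhd_duration_seconds(duration: int, movie_timescale: int,
                          version: int = 0) -> Optional[float]:
    """Convert a tkhd duration to seconds, or None if it is 'all 1s' (unknown)."""
    # The field is 32 bits wide in version 0 and 64 bits wide in version 1.
    unknown = 0xFFFFFFFFFFFFFFFF if version == 1 else 0xFFFFFFFF
    if duration == unknown:
        return None
    if movie_timescale <= 0:
        raise ValueError("movie timescale must be positive")
    return duration / movie_timescale


if __name__ == "__main__":
    # 40 units at an assumed timescale of 1000 units per second.
    print(tkhd_duration_seconds(40, 1000))
    # The same 40 units at a timescale of 600 are a different duration.
    print(tkhd_duration_seconds(40, 600))
```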
layer specifies the front-to-back ordering of video tracks; tracks with lower numbers are closer to the viewer. 0 is the normal value, and -1 would be in front of track 0, and so on.
alternate_group is an integer that specifies a group or collection of tracks. If this field is 0 there is no information on possible relations to other tracks. If this field is not 0, it should be the same for tracks that contain alternate data for one another and different for tracks belonging to different such groups. Only one track within an alternate group should be played or streamed at any one time, and must be distinguishable from other tracks in the group via attributes such as bitrate, codec, language, packet size etc. A group may have only one member.
volume is a fixed 8.8 value specifying the track’s relative audio volume. Full volume is 1.0 (0x0100) and is the normal value. Its value is irrelevant for a purely visual track. Tracks may be composed by combining them according to their volume, and then using the overall Movie Header Box volume setting; or more complex audio composition (e.g. MPEG‐4 BIFS) may be used.
matrix provides a transformation matrix for the video; (u,v,w) are restricted here to (0,0,1), hex (0,0,0x40000000).
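The hex value 0x40000000 for w = 1 follows from the fixed-point layout of the matrix: as far as I understand the specification, the nine values are stored in the order a, b, u, c, d, v, x, y, w, with u, v and w in 2.30 fixed point and the other six in 16.16. The Python sketch below decodes the stored integers under that assumption; `decode_matrix` is a name invented for this example.

```python
from typing import List, Sequence


def decode_matrix(raw: Sequence[int]) -> List[float]:
    """Decode nine signed 32-bit matrix entries into floats.

    Assumed stored order: a, b, u, c, d, v, x, y, w.
    Every third entry (u, v, w) is 2.30 fixed point; the rest are 16.16.
    """
    if len(raw) != 9:
        raise ValueError("a tkhd matrix has exactly nine entries")
    return [
        value / (1 << 30) if index % 3 == 2 else value / (1 << 16)
        for index, value in enumerate(raw)
    ]


if __name__ == "__main__":
    unity = [0x00010000, 0, 0, 0, 0x00010000, 0, 0, 0, 0x40000000]
    print(decode_matrix(unity))
```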
width and height fixed‐point 16.16 values are track‐dependent as follows:
For text and subtitle tracks, they may, depending on the coding format, describe the suggested size of the rendering area. For such tracks, the value 0x0 may also be used to indicate that the data may be rendered at any size, that no preferred size has been indicated and that the actual size may be determined by the external context or by reusing the width and height of another track. For those tracks, the flag track_size_is_aspect_ratio may also be used.
For non‐visual tracks (e.g. audio), they should be set to zero.
For all other tracks, they specify the track’s visual presentation size. These need not be the same as the pixel dimensions of the images, which is documented in the sample description(s); all images in the sequence are scaled to this size, before any overall transformation of the track represented by the matrix. The pixel dimensions of the images are the default values.
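Width and height are 16.16 fixed-point numbers and volume is 8.8, so the raw integers have to be divided by 2^16 or 2^8 to obtain the real values. A minimal Python sketch of both directions, with helper names chosen for this example:

```python
def fixed_to_float(raw: int, frac_bits: int) -> float:
    """Interpret `raw` as a fixed-point number with `frac_bits` fractional bits."""
    return raw / (1 << frac_bits)


def float_to_fixed(value: float, frac_bits: int) -> int:
    """Encode `value` as fixed point, rounding to the nearest representable step."""
    return round(value * (1 << frac_bits))


if __name__ == "__main__":
    print(fixed_to_float(0x000C0000, 16))   # a width or height field (16.16)
    print(fixed_to_float(0x0100, 8))        # full volume (8.8)
    print(hex(float_to_fixed(12.0, 16)))
```

For whole-number sizes, shifting the raw value right by 16 bits gives the same result as the division.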
aligned(8) class TrackHeaderBox
   extends FullBox('tkhd', version, flags) {
   if (version==1) {
      unsigned int(64) creation_time;
      unsigned int(64) modification_time;
      unsigned int(32) track_ID;
      const unsigned int(32) reserved = 0;
      unsigned int(64) duration;
   } else { // version==0
      unsigned int(32) creation_time;
      unsigned int(32) modification_time;
      unsigned int(32) track_ID;
      const unsigned int(32) reserved = 0;
      unsigned int(32) duration;
   }
   const unsigned int(32)[2] reserved = 0;
   template int(16) layer = 0;
   template int(16) alternate_group = 0;
   template int(16) volume = {if track_is_audio 0x0100 else 0};
   const unsigned int(16) reserved = 0;
   template int(32)[9] matrix =
      { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
      // unity matrix
   unsigned int(32) width;
   unsigned int(32) height;
}
000005c0: 0000 5c74 6b68 6400 0000 0300 0000 0000  ..\tkhd.........
000005d0: 0000 0000 0000 0100 0000 0000 0000 2800  ..............(.
000005e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000005f0: 0100 0000 0000 0000 0000 0000 0000 0000  ................
00000600: 0100 0000 0000 0000 0000 0000 0000 0040  ...............@
00000610: 0000 0000 0c00 0000 0c00 0000 0000 2465  ..............$e
000007e0: 0001 bc74 7261 6b00 0000 5c74 6b68 6400  ...trak...\tkhd.
000007f0: 0000 0300 0000 0000 0000 0000 0000 0200  ................
00000800: 0000 0000 0000 2000 0000 0000 0000 0000  ...... .........
00000810: 0000 0101 0000 0000 0100 0000 0000 0000  ................
00000820: 0000 0000 0000 0000 0100 0000 0000 0000  ................
00000830: 0000 0000 0000 0040 0000 0000 0000 0000  .......@........
00000840: 0000 0000 0000 2465 6474 7300 0000 1c65  ......$edts....e
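size 32 00 00 00 5c == 0x5c == 92
type 32 74 6b 68 64 == 'tkhd'
version 8 00 == 0
flags 24 00 00 03 == track_enabled | track_in_movie
creation_time 32 00 00 00 00
modification_time 32 00 00 00 00
track_ID 32 00 00 00 01 == 1
reserved 32 00 00 00 00
duration 32 00 00 00 28 == 40 (in Movie Header Box timescale units; 40 ms only if that timescale is 1000)
reserved 32*2 00 00 00 00 00 00 00 00
layer 16 00 00
alternate_group 16 00 00
volume 16 00 00 (in the second tkhd, track_ID 2, it is 01 00 == 1.0)
reserved 16 00 00
matrix 32*9 unity matrix (00 01 00 00, 0, 0, 0, 00 01 00 00, 0, 0, 0, 40 00 00 00)
width 32 00 0c 00 00 >> 16 == 12
height 32 00 0c 00 00 >> 16 == 12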
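The hand decoding above can be cross-checked with a short parser. The following Python sketch follows the syntax definition quoted earlier and runs on the two tkhd boxes copied byte for byte from the hex dump. The function name `parse_tkhd` and the dict layout are this example's own choices, and it assumes the whole box (including the 32-bit size and the type) is passed in; 64-bit box sizes are not handled.

```python
import struct


def parse_tkhd(box: bytes) -> dict:
    """Parse one complete tkhd box (size and type included) into a dict."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"tkhd" or size != len(box):
        raise ValueError("not a complete tkhd box")
    version = box[8]
    flags = int.from_bytes(box[9:12], "big")
    # Version 1 widens both timestamps and the duration to 64 bits.
    times_fmt = ">QQIIQ" if version == 1 else ">IIIII"
    creation, modification, track_id, _reserved, duration = struct.unpack_from(
        times_fmt, box, 12)
    pos = 12 + struct.calcsize(times_fmt)
    # reserved[2], layer, alternate_group, volume (8.8), reserved
    _r0, _r1, layer, alternate_group, volume, _r2 = struct.unpack_from(
        ">IIhhhH", box, pos)
    pos += 16
    matrix = struct.unpack_from(">9i", box, pos)
    pos += 36
    width, height = struct.unpack_from(">II", box, pos)  # 16.16 fixed point
    return {
        "version": version, "flags": flags,
        "creation_time": creation, "modification_time": modification,
        "track_ID": track_id, "duration": duration,
        "layer": layer, "alternate_group": alternate_group,
        "volume": volume / 256, "matrix": matrix,
        "width": width / 65536, "height": height / 65536,
    }


# First tkhd box from the dump (size field starts at offset 0x5bf).
FIRST_TKHD = bytes.fromhex(
    "0000005c 746b6864 00000003 "   # size, type, version + flags
    "00000000 00000000 00000001 "   # creation_time, modification_time, track_ID
    "00000000 00000028 "            # reserved, duration
    "00000000 00000000 "            # reserved[2]
    "0000 0000 0000 0000 "          # layer, alternate_group, volume, reserved
    "00010000 00000000 00000000 "   # matrix a, b, u
    "00000000 00010000 00000000 "   # matrix c, d, v
    "00000000 00000000 40000000 "   # matrix x, y, w
    "000c0000 000c0000"             # width, height
)

# Second tkhd box from the dump (inside the trak at 0x7df).
SECOND_TKHD = bytes.fromhex(
    "0000005c 746b6864 00000003 "
    "00000000 00000000 00000002 "
    "00000000 00000020 "
    "00000000 00000000 "
    "0000 0001 0100 0000 "          # alternate_group 1, volume 1.0
    "00010000 00000000 00000000 "
    "00000000 00010000 00000000 "
    "00000000 00000000 40000000 "
    "00000000 00000000"             # width and height are zero
)

if __name__ == "__main__":
    for raw in (FIRST_TKHD, SECOND_TKHD):
        print(parse_tkhd(raw))
```

The first box comes out as track 1 with a 12 x 12 presentation size and zero volume, which fits a visual track. The second has full volume and zero width and height, which is what the specification prescribes for a non-visual track such as audio.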