Sample Description Box
Box Types: ‘stsd’
Container: Sample Table Box (‘stbl’)
Mandatory: Yes
Quantity: Exactly one
The sample description table gives detailed information about the coding type used, and any initialization information needed for that coding.
The information stored in the sample description box after the entry‐count is both track‐type specific as documented here, and can also have variants within a track type (e.g. different codings may use different specific information after some common fields, even within a video track).
Which type of sample entry form is used is determined by the media handler, using a suitable form, such as one defined in clause 12, or defined in a derived specification, or registration.
Multiple descriptions may be used within a track.
Note Though the count is 32 bits, the number of items is usually much smaller, and is restricted by the fact that the reference index in the sample table is only 16 bits.
If the ‘format’ field of a SampleEntry is unrecognized, neither the sample description itself, nor the associated media samples, shall be decoded.
Note The definition of sample entries specifies boxes in a particular order, and this is usually also followed in derived specifications. For maximum compatibility, writers should construct files respecting the order both within specifications and as implied by the inheritance, whereas readers should be prepared to accept any box order.
All string fields shall be null‐terminated, even if unused. “Optional” means there is at least one null byte.
Entries that identify the format by MIME type, such as a TextSubtitleSampleEntry, TextMetaDataSampleEntry, or SimpleTextSampleEntry, all of which contain a MIME type, may be used to identify the format of streams for which a MIME type applies. A MIME type applies if the contents of the string in the optional configuration box (without its null termination), followed by the contents of a set of samples, starting with a sync sample and ending at the sample immediately preceding a sync sample, are concatenated in their entirety, and the result meets the decoding requirements for documents of that MIME type. Non‐sync samples should be used only if that format specifies the behaviour of ‘progressive decoding’, and then the sample times indicate when the results of such progressive decoding should be presented (according to the media type).
Note The samples in a track that is all sync samples are therefore each a valid document for that MIME type.
In some classes derived from SampleEntry, namespace and schema_location are used both to identify the XML document content and to declare “brand” or profile compatibility. Multiple namespace identifiers indicate that the track conforms to the specification represented by each of the identifiers, some of which may identify supersets of the features present. A decoder should be able to decode all the namespaces in order to be able to decode and present correctly the media associated with this sample entry.
Note Additionally, namespace identifiers may represent performance constraints, such as limits on document size, font size, drawing rate, etc., as well as syntax constraints such as features that are not permitted or ignored.
aligned(8) class Box (unsigned int(32) boxtype,
optional unsigned int(8)[16] extended_type) {
unsigned int(32) size;
unsigned int(32) type = boxtype;
if (size==1) {
unsigned int(64) largesize;
} else if (size==0) {
// box extends to end of file
}
if (boxtype==‘uuid’) {
unsigned int(8)[16] usertype = extended_type;
}
}
aligned(8) class FullBox(unsigned int(32) boxtype, unsigned int(8) v, bit(24) f)
extends Box(boxtype) {
unsigned int(8) version = v;
bit(24) flags = f;
}
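The Box and FullBox headers above can be read with a few lines of Python. This is a minimal sketch, not a complete parser: the function names `read_box_header` and `read_full_box_header` are made up for illustration, and a `size` of 0 is treated as "extends to the end of the buffer passed in".

```python
import struct

def read_box_header(buf, offset=0):
    """Return (box_type, payload_start, box_end) for the box at `offset`."""
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    payload_start = offset + 8
    if size == 1:
        # A 64-bit largesize follows the type field.
        (size,) = struct.unpack_from(">Q", buf, payload_start)
        payload_start += 8
    elif size == 0:
        # Box extends to the end of the enclosing data (assumed: end of buf).
        size = len(buf) - offset
    if box_type == b"uuid":
        payload_start += 16  # skip the 16-byte usertype
    return box_type.decode("latin-1"), payload_start, offset + size

def read_full_box_header(buf, offset=0):
    """Like read_box_header, but also read the 8-bit version and 24-bit flags."""
    box_type, payload_start, box_end = read_box_header(buf, offset)
    version = buf[payload_start]
    flags = int.from_bytes(buf[payload_start + 1:payload_start + 4], "big")
    return box_type, version, flags, payload_start + 4, box_end
```

Calling `read_full_box_header` on the first 16 bytes of an 'stsd' box yields the type, version, flags, the offset of entry_count, and the offset at which the box ends.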
aligned(8) abstract class SampleEntry (unsigned int(32) format) extends Box(format)
{
const unsigned int(8)[6] reserved = 0;
unsigned int(16) data_reference_index;
}
class BitRateBox extends Box('btrt'){
unsigned int(32) bufferSizeDB;
unsigned int(32) maxBitrate;
unsigned int(32) avgBitrate;
}
aligned(8) class SampleDescriptionBox (unsigned int(32) handler_type) extends FullBox('stsd', version, 0){
int i ; // i is a temporary loop variable; it occupies no space in the file
unsigned int(32) entry_count;
for (i = 1 ; i <= entry_count ; i++){
SampleEntry(); // an instance of a class derived from SampleEntry
}
}
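The entry loop of SampleDescriptionBox can be sketched as follows. The helper name `list_sample_entries` is hypothetical, and the sketch assumes 32-bit box sizes only (no largesize) and that the buffer starts at the 'stsd' size field.

```python
import struct

def list_sample_entries(stsd):
    """Yield (format, data_reference_index, entry_bytes) for each sample entry."""
    size, box_type, version_flags, entry_count = struct.unpack_from(">I4sII", stsd, 0)
    if box_type != b"stsd":
        raise ValueError("not an stsd box")
    offset = 16  # 8-byte box header + version/flags + entry_count
    for _ in range(entry_count):
        entry_size, fmt = struct.unpack_from(">I4s", stsd, offset)
        if entry_size < 16 or offset + entry_size > len(stsd):
            raise ValueError("bad sample entry size")
        # SampleEntry: 6 reserved bytes, then the 16-bit data_reference_index.
        (dref_index,) = struct.unpack_from(">H", stsd, offset + 14)
        yield fmt.decode("latin-1"), dref_index, stsd[offset:offset + entry_size]
        offset += entry_size
```

Each yielded `entry_bytes` starts at the entry's own size field, so it can be handed to a format-specific parser selected by the four-character `format`.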
Reference: ISO/IEC 14496-12:2015, page 170
12.1.3 Sample entry
12.1.3.1 Definition
Video tracks use VisualSampleEntry. // used when handler_type indicates a video track
class VisualSampleEntry(codingname) extends SampleEntry (codingname){
unsigned int(16) pre_defined = 0;
const unsigned int(16) reserved = 0;
unsigned int(32)[3] pre_defined = 0;
unsigned int(16) width;
unsigned int(16) height;
template unsigned int(32) horizresolution = 0x00480000; // 72 dpi
template unsigned int(32) vertresolution = 0x00480000; // 72 dpi
const unsigned int(32) reserved = 0;
template unsigned int(16) frame_count = 1;
string[32] compressorname;
template unsigned int(16) depth = 0x0018;
int(16) pre_defined = -1;
// other boxes from derived specifications
CleanApertureBox clap; // optional
PixelAspectRatioBox pasp; // optional
}
000006d0: 00 00 00 01 00 00 01 0b 73 74 62 6c 00 00 00 a7 ........stbl....
000006e0: 73 74 73 64 00 00 00 00 00 00 00 01 00 00 00 97 stsd............
000006f0: 61 76 63 31 00 00 00 00 00 00 00 01 00 00 00 00 avc1............
00000700: 00 00 00 00 00 00 00 00 00 00 00 00 00 0c 00 0c ................
00000710: 00 48 00 00 00 48 00 00 00 00 00 00 00 01 00 00 .H...H..........
00000720: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 ................
00000740: ff ff 00 00 00 31 61 76 63 43 01 f4 00 0a ff e1 .....1avcC......
00000750: 00 19 67 f4 00 0a 91 9b 2b f2 cb 80 b6 40 00 00 ..g.....+....@..
00000760: 03 00 40 00 00 0c 83 c4 89 65 80 01 00 05 68 eb ..@......e....h.
00000770: e3 c4 48 00 00 00 10 70 61 73 70 00 00 00 01 00 ..H....pasp.....
00000780: 00 00 01 00 00 00 18 73 74 74 73 00 00 00 00 00 .......stts.....
00000790: 00 00 01 00 00 00 01 00 00 0e 10 00 00 00 1c 73 ...............s
000007a0: 74 73 63 00 00 00 00 00 00 00 01 00 00 00 01 00 tsc.............
000007b0: 00 00 01 00 00 00 01 00 00 00 14 73 74 73 7a 00 ...........stsz.
000007c0: 00 00 00 00 00 02 f0 00 00 00 01 00 00 00 14 73 ...............s
000007d0: 74 63 6f 00 00 00 00 00 00 00 01 00 00 00 30 00 tco...........0.
000007e0: 00 01 bc 74 72 61 6b 00 00 00 5c 74 6b 68 64 00 ...trak...\tkhd.
000007f0: 00 00 03 00 00 00 00 00 00 00 00 00 00 00 02 00 ................
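00 00 00 a7 // SampleDescriptionBox size = 0xa7 (167 bytes)
73 74 73 64 // SampleDescriptionBox type 'stsd'
00 // version (8 bits)
00 00 00 // flags (24 bits)
00 00 00 01 // entry_count == 1
00 00 00 97 // VisualSampleEntry size = 0x97 (151 bytes)
61 76 63 31 // VisualSampleEntry type 'avc1'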
aligned(8) abstract class SampleEntry (unsigned int(32) format) extends Box(format)
{
const unsigned int(8)[6] reserved = 0;
unsigned int(16) data_reference_index;
}
00 00 00 00 00 00 // const unsigned int(8)[6] reserved = 0;
00 01 // unsigned int(16) data_reference_index;
00 00 // VisualSampleEntry unsigned int(16) pre_defined = 0;
00 00 // VisualSampleEntry const unsigned int(16) reserved = 0;
00 00 00 00 00 00 00 00 00 00 00 00 // VisualSampleEntry unsigned int(32)[3] pre_defined = 0;
00 0c // VisualSampleEntry unsigned int(16) width; == 12
00 0c // VisualSampleEntry unsigned int(16) height; == 12
00 48 00 00 // VisualSampleEntry template unsigned int(32) horizresolution = 0x00480000; 72 dpi
00 48 00 00 // VisualSampleEntry template unsigned int(32) vertresolution = 0x00480000; 72 dpi
00 00 00 00 // VisualSampleEntry const unsigned int(32) reserved = 0;
00 01 // VisualSampleEntry template unsigned int(16) frame_count = 1;
00 * 32 // VisualSampleEntry string[32] compressorname; 32 zero bytes, i.e. an empty name
00 18 // VisualSampleEntry template unsigned int(16) depth = 0x0018;
ff ff // VisualSampleEntry int(16) pre_defined = -1;
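The same walk over the fixed VisualSampleEntry fields can be done with one `struct` format string. This is a sketch under stated assumptions: `parse_visual_sample_entry` and `SAMPLE` are names invented here, `SAMPLE` reproduces the first 86 bytes of the 'avc1' entry from the hex dump (offset 0x6ec onward), and the resolutions are treated as 16.16 fixed-point numbers, which is how 0x00480000 comes out as 72 dpi.

```python
import struct

# Box header (8) + SampleEntry (8) + VisualSampleEntry fixed fields (70) = 86 bytes.
# 'x' pad codes skip the reserved / pre_defined fields.
VISUAL_FIXED = struct.Struct(">I4s 6x H 2x 2x 12x H H I I 4x H 32s H h")

def parse_visual_sample_entry(entry):
    (size, fmt, dref, width, height, hres, vres,
     frame_count, name, depth, pre_defined) = VISUAL_FIXED.unpack_from(entry, 0)
    name_len = min(name[0], 31)  # first byte counts the bytes that follow
    return {
        "format": fmt.decode("latin-1"),
        "size": size,
        "data_reference_index": dref,
        "width": width,
        "height": height,
        "horizresolution": hres / 65536,  # 16.16 fixed point
        "vertresolution": vres / 65536,
        "frame_count": frame_count,
        "compressorname": name[1:1 + name_len].decode("utf-8", "replace"),
        "depth": depth,
        "child_boxes_offset": VISUAL_FIXED.size,  # where avcC, pasp, ... begin
    }

SAMPLE = bytes.fromhex(
    "00000097 61766331 000000000000 0001"
    "0000 0000 000000000000000000000000"
    "000c 000c 00480000 00480000 00000000 0001"
    + "00" * 32 +
    "0018 ffff"
)
```

`parse_visual_sample_entry(SAMPLE)` reports a 12x12 picture at 72 dpi with depth 24, and `child_boxes_offset` is 86, which matches the 'avcC' box starting at 0x6ec + 0x56 = 0x742 in the dump.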
Reference: ISO/IEC 14496-15, 5.3.3.1, defined as follows:
aligned(8) class AVCDecoderConfigurationRecord {
unsigned int(8) configurationVersion = 1;
unsigned int(8) AVCProfileIndication;
unsigned int(8) profile_compatibility;
unsigned int(8) AVCLevelIndication;
bit(6) reserved = '111111'b;
unsigned int(2) lengthSizeMinusOne;
bit(3) reserved = '111'b;
unsigned int(5) numOfSequenceParameterSets;
for (i=0; i< numOfSequenceParameterSets; i++) {
unsigned int(16) sequenceParameterSetLength ;
bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit;
}
unsigned int(8) numOfPictureParameterSets;
for (i=0; i< numOfPictureParameterSets; i++) {
unsigned int(16) pictureParameterSetLength;
bit(8*pictureParameterSetLength) pictureParameterSetNALUnit;
}
if( AVCProfileIndication != 66 && AVCProfileIndication != 77 &&
AVCProfileIndication != 88 )
{
bit(6) reserved = '111111'b;
unsigned int(2) chroma_format;
bit(5) reserved = '11111'b;
unsigned int(3) bit_depth_luma_minus8;
bit(5) reserved = '11111'b;
unsigned int(3) bit_depth_chroma_minus8;
unsigned int(8) numOfSequenceParameterSetExt;
for (i=0; i< numOfSequenceParameterSetExt; i++) {
unsigned int(16) sequenceParameterSetExtLength;
bit(8*sequenceParameterSetExtLength) sequenceParameterSetExtNALUnit;
}
}
}
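A minimal Python sketch of reading this record is shown below. The function name and `SAMPLE_AVCC` are illustrative only; `SAMPLE_AVCC` is the 41-byte 'avcC' payload taken from the hex dump (offset 0x74a onward). The sketch does not parse the chroma_format / bit-depth extension block; it only reports how many bytes remain after the PPS list, since checking the remaining length before reading that block is the cautious choice for a reader.

```python
import struct

def parse_avc_decoder_configuration_record(record):
    """Parse the fixed fields, SPS list and PPS list of an avcC payload."""
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", record, 0)

    def read_parameter_sets(count, pos):
        sets = []
        for _ in range(count):
            (length,) = struct.unpack_from(">H", record, pos)
            pos += 2
            if pos + length > len(record):
                raise ValueError("parameter set runs past the end of the record")
            sets.append(record[pos:pos + length])
            pos += length
        return sets, pos

    sps, pos = read_parameter_sets(b5 & 0x1F, 6)        # low 5 bits: SPS count
    pps, pos = read_parameter_sets(record[pos], pos + 1)
    return {
        "configurationVersion": version,
        "AVCProfileIndication": profile,
        "profile_compatibility": compat,
        "AVCLevelIndication": level,
        "nal_length_size": (b4 & 0x03) + 1,             # lengthSizeMinusOne + 1
        "sps": sps,
        "pps": pps,
        "trailing_bytes": len(record) - pos,            # extension block, if any
    }

SAMPLE_AVCC = bytes.fromhex(
    "01 f4 00 0a ff e1 00 19 "
    "67 f4 00 0a 91 9b 2b f2 cb 80 b6 40 00 00 "
    "03 00 40 00 00 0c 83 c4 89 65 80 "
    "01 00 05 68 eb e3 c4 48"
)
```

For `SAMPLE_AVCC` the result has profile 244, level 10, 4-byte NAL unit lengths, one 25-byte SPS, one 5-byte PPS, and `trailing_bytes` equal to 0.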
Reference: ISO/IEC 14496-15, 5.4.2, defined as follows:
5.4.2 AVC video stream definition
5.4.2.1 Sample entry name and format
5.4.2.1.1 Definition
Sample Entry and Box Types: ‘avc1’, ‘avc2’, ‘avc3’, ‘avc4’, ‘avcC’, ‘m4ds’, ‘btrt’
Container: Sample Description Box (‘stsd’)
Mandatory: An ‘avc1’, ‘avc2’, ‘avc3’ or ‘avc4’ sample entry is mandatory
Quantity: One or more sample entries may be present
An AVC visual sample entry shall contain an AVC Configuration Box, as defined below. This includes an AVCDecoderConfigurationRecord, as defined in 5.3.3.1.
An optional BitRateBox may be present in the AVC visual sample entry to signal the bit rate information of the AVC video stream. Extension descriptors that should be inserted into the Elementary Stream Descriptor, when used in MPEG-4, may also be present.
Multiple sample entries may be used, as permitted by the ISO Base Media File Format specification, to indicate sections of video that use different configurations or parameter sets.
The sample entry name ‘avc1’ or ‘avc3’ may only be used when the stream to which this sample entry applies is a compliant and usable AVC stream as viewed by an AVC decoder operating under the configuration (including profile and level) given in the AVCConfigurationBox. The file format specific structures that resemble NAL units (see Annex A) may be present but shall not be used to access the AVC base data; that is, the AVC data shall not be contained in Aggregators (though they may be included within the bytes referenced by the additional_bytes field) nor referenced by Extractors.
The sample entry name ‘avc2’ or ‘avc4’ may only be used when Extractors or Aggregators (Annex A) are required to be supported, and an appropriate Toolset is required (for example, as indicated by the file-type brands). This sample entry type indicates that, in order to form the intended AVC stream, Extractors shall be replaced with the data they are referencing, and Aggregators shall be examined for contained NAL Units. Tier grouping may be present.
5.4.2.1.2 Syntax
// Visual Sequences
class AVCConfigurationBox extends Box(‘avcC’) {
AVCDecoderConfigurationRecord() AVCConfig;
}
class MPEG4ExtensionDescriptorsBox extends Box(‘m4ds’) {
Descriptor Descr[0 .. 255];
}
class AVCSampleEntry() extends VisualSampleEntry (type) {
// type is ‘avc1’ or ‘avc3’
AVCConfigurationBox config;
MPEG4ExtensionDescriptorsBox (); // optional
}
class AVC2SampleEntry() extends VisualSampleEntry (type) {
// type is ‘avc2’ or ‘avc4’
AVCConfigurationBox avcconfig;
MPEG4ExtensionDescriptorsBox descr; // optional
}
5.4.2.1.3 Semantics
Compressorname in the base class VisualSampleEntry indicates the name of the compressor used with the value “\012AVC Coding” being recommended; the first byte is a count of the remaining bytes, here represented by \012, which (being octal 12) is 10 (decimal), the number of bytes in the rest of the string.
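The counted-string layout of compressorname (one length byte, then the text, zero-padded to 32 bytes) can be sketched like this. Both helper names are hypothetical, and truncating to 31 bytes is a simplification that could split a multi-byte UTF-8 character.

```python
def encode_compressorname(name):
    """Build the 32-byte compressorname field: length byte + text + zero padding."""
    raw = name.encode("utf-8")[:31]          # at most 31 bytes fit after the count
    return bytes([len(raw)]) + raw + bytes(31 - len(raw))

def decode_compressorname(field):
    """Read the text back out of a 32-byte compressorname field."""
    if len(field) != 32:
        raise ValueError("compressorname must be exactly 32 bytes")
    count = min(field[0], 31)
    return field[1:1 + count].decode("utf-8", "replace")
```

For the recommended value, `encode_compressorname("AVC Coding")` starts with the byte 0x0a (octal 012, decimal 10), followed by the ten characters and 21 zero bytes.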
config is defined in 5.3.3. If a separate parameter set stream is used, numOfSequenceParameterSets and numOfPictureParameterSets shall both be zero.
Descr is a descriptor that should be placed in the ElementaryStreamDescriptor when this stream is used in an MPEG-4 systems context. This does not include SLConfigDescriptor or DecoderConfigDescriptor, but includes the other descriptors in order to be placed after the SLConfigDescriptor.
00000740: ff ff 00 00 00 31 61 76 63 43 01 f4 00 0a ff e1 .....1avcC......
00000750: 00 19 67 f4 00 0a 91 9b 2b f2 cb 80 b6 40 00 00 ..g.....+....@..
00000760: 03 00 40 00 00 0c 83 c4 89 65 80 01 00 05 68 eb ..@......e....h.
00000770: e3 c4 48 00 00 00 10 70 61 73 70 00 00 00 01 00 ..H....pasp.....
00 00 00 31 // AVCConfigurationBox size = 0x31 (49 bytes)
61 76 63 43 // AVCConfigurationBox type 'avcC'
01 // AVCDecoderConfigurationRecord unsigned int(8) configurationVersion = 1;
f4 // unsigned int(8) AVCProfileIndication; 0xf4 == 244 (High 4:4:4 Predictive)
00 // unsigned int(8) profile_compatibility;
0a // unsigned int(8) AVCLevelIndication; 0x0a == 10 (level 1.0)
ff = 1111 1111 // bit(6) reserved = '111111'b; unsigned int(2) lengthSizeMinusOne == 3, i.e. 4-byte NAL unit lengths
e1 = 1110 0001 // bit(3) reserved = '111'b; unsigned int(5) numOfSequenceParameterSets == 1, i.e. one SPS
00 19 // unsigned int(16) sequenceParameterSetLength == 25
67 f4 00 0a 91 9b 2b f2 cb 80 b6 40 00 00 03 00 40 00 00 0c 83 c4 89 65 80 // bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit; SPS data (25 bytes)
01 // unsigned int(8) numOfPictureParameterSets == 1
00 05 // unsigned int(16) pictureParameterSetLength == 5
68 eb e3 c4 48 // bit(8*pictureParameterSetLength) pictureParameterSetNALUnit; PPS data (5 bytes)
// The box ends here (8 + 41 = 49 bytes): this file carries no chroma_format / bit-depth fields, even though AVCProfileIndication is not 66, 77 or 88.
Reference: ISO/IEC 14496-12, 12.1.4 Pixel Aspect Ratio and Clean Aperture, page 170
00000770: e3 c4 48 00 00 00 10 70 61 73 70 00 00 00 01 00 ..H....pasp.....
00000780: 00 00 01 00 00 00 18 73 74 74 73 00 00 00 00 00 .......stts.....
class PixelAspectRatioBox extends Box(‘pasp’){
unsigned int(32) hSpacing;
unsigned int(32) vSpacing;
}
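00 00 00 10 // PixelAspectRatioBox size = 0x10 (16 bytes)
70 61 73 70 // PixelAspectRatioBox type 'pasp'
00 00 00 01 // unsigned int(32) hSpacing; == 1
00 00 00 01 // unsigned int(32) vSpacing; == 1, i.e. square pixels (1:1)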
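Reading the 'pasp' box takes a single `struct` call. The sketch below uses a made-up helper name, `parse_pasp`, and returns hSpacing:vSpacing as a `Fraction`; rejecting a zero vSpacing is a choice made here, not something taken from the specification text quoted above.

```python
import struct
from fractions import Fraction

def parse_pasp(box):
    """Return the pixel aspect ratio hSpacing/vSpacing of a 16-byte 'pasp' box."""
    size, box_type, h_spacing, v_spacing = struct.unpack_from(">I4sII", box, 0)
    if box_type != b"pasp" or size != 16:
        raise ValueError("not a 16-byte pasp box")
    if v_spacing == 0:
        raise ValueError("vSpacing is zero")
    return Fraction(h_spacing, v_spacing)

# The 'pasp' box from the hex dump (offset 0x773).
SAMPLE_PASP = bytes.fromhex("00000010 70617370 00000001 00000001")
```

`parse_pasp(SAMPLE_PASP)` gives a ratio of 1, so the 12x12 coded picture is also displayed as 12x12; with a ratio other than 1 the display width would be the coded width multiplied by the ratio.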
Parsing avc1 requires ISO/IEC 14496-15.
See also:
https://blog.csdn.net/badousuan/article/details/79519862