This article is compiled with reference to these blog posts: https://blog.csdn.net/leixiaohua1020
https://blog.csdn.net/u011913612/article/details/53642355
This article gives a brief analysis of a commonly used FFmpeg function: avformat_find_stream_info(). The function reads a portion of the audio/video data and extracts information about the streams. The declaration of avformat_find_stream_info() is in libavformat\avformat.h, as shown below.
/**
* Read packets of a media file to get stream information. This
* is useful for file formats with no headers such as MPEG. This
* function also computes the real framerate in case of MPEG-2 repeat
* frame mode.
* The logical file position is not changed by this function;
* examined packets may be buffered for later processing.
*
* @param ic media file handle
* @param options If non-NULL, an ic.nb_streams long array of pointers to
* dictionaries, where i-th member contains options for
* codec corresponding to i-th stream.
* On return each dictionary will be filled with options that were not found.
* @return >=0 if OK, AVERROR_xxx on error
*
* @note this function isn't guaranteed to open all the codecs, so
* options being non-empty at return is a perfectly normal behavior.
*
* @todo Let the user decide somehow what information is needed so that
* we do not waste time getting stuff the user does not need.
*/
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);
Judging from the comment, avformat_find_stream_info() mainly reads some packets and extracts the stream information from them. Some file formats, such as MPEG, have no header; in that case this function is particularly useful because it can recover the stream information from the packets it reads. For MPEG-2 repeat frame mode, the function also computes the real frame rate.
The logical file position is not changed by this function; the packets that were read may be buffered for later processing.
The comment explains the function's purpose well. Since its job is to update stream information, we can guess that it mainly fills in the fields of the AVStream structure.
A brief explanation of the parameters:
ic: the input AVFormatContext.
options: extra options; not studied in depth here.
Some important fields of the AVStream structure:
int index: the index of this stream.
AVCodecContext *codec: the codec context of this stream.
AVRational time_base: the time base. With it, PTS and DTS can be converted to real time. Other FFmpeg structures also have a time_base field, but in my experience only the time_base in AVStream is reliable. PTS * time_base = real time.
start_time: the time of the first PTS in the stream.
nb_frames: the number of frames in this stream.
AVRational avg_frame_rate: the average frame rate.
int64_t duration: the duration of this stream.
AVDictionary *metadata: metadata.
AVPacket attached_pic: an attached picture, e.g. the album cover embedded in some MP3/AAC audio files.
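Before diving into the source, here is a minimal usage sketch showing where avformat_find_stream_info() sits in a typical probing flow and how the AVStream fields above are read afterwards (the input path and the error handling are placeholders, not part of the original post):
#include <inttypes.h>
#include <libavformat/avformat.h>

int probe_file(const char *path)
{
    AVFormatContext *ic = NULL;
    int ret = avformat_open_input(&ic, path, NULL, NULL);   /* read the file header */
    if (ret < 0)
        return ret;
    ret = avformat_find_stream_info(ic, NULL);               /* probe packets, fill each AVStream */
    if (ret < 0) {
        avformat_close_input(&ic);
        return ret;
    }
    for (unsigned i = 0; i < ic->nb_streams; i++) {
        AVStream *st = ic->streams[i];
        /* PTS * time_base = seconds; print start_time in seconds as an example */
        double start = st->start_time == AV_NOPTS_VALUE ? 0
                     : st->start_time * av_q2d(st->time_base);
        av_log(NULL, AV_LOG_INFO, "stream %d: type %d, start %.3f s, frames %" PRId64 "\n",
               st->index, st->codecpar->codec_type, start, st->nb_frames);
    }
    avformat_close_input(&ic);
    return 0;
}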
The source code of this function lives in libavformat/utils.c.
Here we just note its key points. The function's main job is to fill in the AVStream structure of each media stream (audio/video). Skimming its code, we find that it already implements finding the decoder, opening the decoder, reading audio/video packets, and decoding audio/video frames. In other words, the function effectively walks through the entire decoding pipeline.
Source code analysis:
part 1: variable definitions
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options)
{
int i, count = 0, ret = 0, j;
int64_t read_size;
AVStream *st;
AVCodecContext *avctx;
AVPacket pkt1, *pkt;
int64_t old_offset = avio_tell(ic->pb);
// new streams might appear, no options for those
int orig_nb_streams = ic->nb_streams;
int flush_codecs;
int64_t max_analyze_duration = ic->max_analyze_duration;
int64_t max_stream_analyze_duration;
int64_t max_subtitle_analyze_duration;
int64_t probesize = ic->probesize;
int eof_reached = 0;
int *missing_streams = av_opt_ptr(ic->iformat->priv_class, ic->priv_data, "missing_streams");
flush_codecs = probesize > 0;
...
Several variables are defined here:
probesize is the probing size. To obtain the stream information, the function tries to read some packets and analyze them; probesize limits the maximum amount of data it is allowed to read.
orig_nb_streams is the number of streams known at this point; new streams may still appear while probing, and no options apply to those. A typical movie contains three streams: an audio stream, a video stream and a subtitle stream.
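If the defaults are not enough for a given file, the caller can raise these limits before opening the input; a small fragment (the values are arbitrary examples, and the usual libavformat/libavutil headers are assumed):
AVFormatContext *ic = avformat_alloc_context();
ic->probesize = 10 * 1024 * 1024;               /* read at most ~10 MB while probing */
ic->max_analyze_duration = 10 * AV_TIME_BASE;   /* analyze at most ~10 seconds of input */

/* the same limits can also be passed as AVOptions to avformat_open_input() */
AVDictionary *opts = NULL;
av_dict_set(&opts, "probesize", "10485760", 0);
av_dict_set(&opts, "analyzeduration", "10000000", 0);   /* in microseconds */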
part 2: setting max_stream_analyze_duration and friends
max_stream_analyze_duration = max_analyze_duration;
max_subtitle_analyze_duration = max_analyze_duration;
if (!max_analyze_duration) {
max_stream_analyze_duration =
max_analyze_duration = 5*AV_TIME_BASE;
max_subtitle_analyze_duration = 30*AV_TIME_BASE;
if (!strcmp(ic->iformat->name, "flv"))
max_stream_analyze_duration = 90*AV_TIME_BASE;
if (!strcmp(ic->iformat->name, "mpeg") || !strcmp(ic->iformat->name, "mpegts"))
max_stream_analyze_duration = 7*AV_TIME_BASE;
}
This sets up the limits on the maximum analysis duration. If max_analyze_duration was not set by the caller, max_stream_analyze_duration and max_analyze_duration default to 5 * AV_TIME_BASE (5 seconds) and max_subtitle_analyze_duration to 30 * AV_TIME_BASE. In addition, the flv and mpeg/mpegts formats get their own per-stream analysis limits (90 and 7 seconds respectively).
part 3: first pass over the streams
for (i = 0; i < ic->nb_streams; i++) {
const AVCodec *codec;
AVDictionary *thread_opt = NULL;
st = ic->streams[i];
avctx = st->internal->avctx;
if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO ||
st->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE) {
/* if (!st->time_base.num)
st->time_base = */
if (!avctx->time_base.num)
avctx->time_base = st->time_base;
}
/* check if the caller has overridden the codec id */
#if FF_API_LAVF_AVCTX
FF_DISABLE_DEPRECATION_WARNINGS
if (st->codec->codec_id != st->internal->orig_codec_id) {
st->codecpar->codec_id = st->codec->codec_id;
st->codecpar->codec_type = st->codec->codec_type;
st->internal->orig_codec_id = st->codec->codec_id;
}
FF_ENABLE_DEPRECATION_WARNINGS
#endif
// only for the split stuff
if (!st->parser && !(ic->flags & AVFMT_FLAG_NOPARSE) && st->request_probe <= 0) {
st->parser = av_parser_init(st->codecpar->codec_id);
if (st->parser) {
if (st->need_parsing == AVSTREAM_PARSE_HEADERS) {
st->parser->flags |= PARSER_FLAG_COMPLETE_FRAMES;
} else if (st->need_parsing == AVSTREAM_PARSE_FULL_RAW) {
st->parser->flags |= PARSER_FLAG_USE_CODEC_TS;
}
} else if (st->need_parsing) {
av_log(ic, AV_LOG_VERBOSE, "parser not found for codec "
"%s, packets or times may be invalid.\n",
avcodec_get_name(st->codecpar->codec_id));
}
}
if (st->codecpar->codec_id != st->internal->orig_codec_id)
st->internal->orig_codec_id = st->codecpar->codec_id;
ret = avcodec_parameters_to_context(avctx, st->codecpar);
if (ret < 0)
goto find_stream_info_err;
if (st->request_probe <= 0)
st->internal->avctx_inited = 1;
codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
/* Force thread count to 1 since the H.264 decoder will not extract
* SPS and PPS to extradata during multi-threaded decoding. */
av_dict_set(options ? &options[i] : &thread_opt, "threads", "1", 0);
if (ic->codec_whitelist)
av_dict_set(options ? &options[i] : &thread_opt, "codec_whitelist", ic->codec_whitelist, 0);
/* Ensure that subtitle_header is properly set. */
if (st->codecpar->codec_type == AVMEDIA_TYPE_SUBTITLE
&& codec && !avctx->codec) {
if (avcodec_open2(avctx, codec, options ? &options[i] : &thread_opt) < 0)
av_log(ic, AV_LOG_WARNING,
"Failed to open codec in %s\n",__FUNCTION__);
}
// Try to just open decoders, in case this is enough to get parameters.
if (!has_codec_parameters(st, NULL) && st->request_probe <= 0) {
if (codec && !avctx->codec)
if (avcodec_open2(avctx, codec, options ? &options[i] : &thread_opt) < 0)
av_log(ic, AV_LOG_WARNING,
"Failed to open codec in %s\n",__FUNCTION__);
}
if (!options)
av_dict_free(&thread_opt);
}
The function contains several loops over the streams; this is the first one. The first pass does the following:
1. Obtain the codec context avctx and set its time_base, codec_id, codec_type and orig_codec_id.
2. If the parser is NULL, initialize the parser.
3. Copy the stream's codec parameters (st->codecpar) into the codec context. This is done with avcodec_parameters_to_context(), shown below:
int avcodec_parameters_to_context(AVCodecContext *codec,
const AVCodecParameters *par)
{
codec->codec_type = par->codec_type;
codec->codec_id = par->codec_id;
codec->codec_tag = par->codec_tag;
codec->bit_rate = par->bit_rate;
codec->bits_per_coded_sample = par->bits_per_coded_sample;
codec->bits_per_raw_sample = par->bits_per_raw_sample;
codec->profile = par->profile;
codec->level = par->level;
switch (par->codec_type) {
case AVMEDIA_TYPE_VIDEO:
codec->pix_fmt = par->format;
codec->width = par->width;
codec->height = par->height;
codec->field_order = par->field_order;
codec->color_range = par->color_range;
codec->color_primaries = par->color_primaries;
codec->color_trc = par->color_trc;
codec->colorspace = par->color_space;
codec->chroma_sample_location = par->chroma_location;
codec->sample_aspect_ratio = par->sample_aspect_ratio;
codec->has_b_frames = par->video_delay;
break;
case AVMEDIA_TYPE_AUDIO:
codec->sample_fmt = par->format;
codec->channel_layout = par->channel_layout;
codec->channels = par->channels;
codec->sample_rate = par->sample_rate;
codec->block_align = par->block_align;
codec->frame_size = par->frame_size;
codec->delay =
codec->initial_padding = par->initial_padding;
codec->trailing_padding = par->trailing_padding;
codec->seek_preroll = par->seek_preroll;
break;
case AVMEDIA_TYPE_SUBTITLE:
codec->width = par->width;
codec->height = par->height;
break;
}
if (par->extradata) {
av_freep(&codec->extradata);
codec->extradata = av_mallocz(par->extradata_size + AV_INPUT_BUFFER_PADDING_SIZE);
if (!codec->extradata)
return AVERROR(ENOMEM);
memcpy(codec->extradata, par->extradata, par->extradata_size);
codec->extradata_size = par->extradata_size;
}
return 0;
}
As we can see, it simply copies the parameters according to the media type.
Next, the decoder is looked up by codec ID.
This is done by find_probe_decoder(), which is also defined in libavformat/utils.c. Let's look at the decoder lookup process:
static const AVCodec *find_probe_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
{
const AVCodec *codec;
#if CONFIG_H264_DECODER
/* Other parts of the code assume this decoder to be used for h264,
* so force it if possible. */
if (codec_id == AV_CODEC_ID_H264)
return avcodec_find_decoder_by_name("h264");
#endif
codec = find_decoder(s, st, codec_id);
if (!codec)
return NULL;
if (codec->capabilities & AV_CODEC_CAP_AVOID_PROBING) {
const AVCodec *probe_codec = NULL;
while (probe_codec = av_codec_next(probe_codec)) {
if (probe_codec->id == codec_id &&
av_codec_is_decoder(probe_codec) &&
!(probe_codec->capabilities & (AV_CODEC_CAP_AVOID_PROBING | AV_CODEC_CAP_EXPERIMENTAL))) {
return probe_codec;
}
}
}
return codec;
}
find_decoder() is then called for the actual lookup:
static const AVCodec *find_decoder(AVFormatContext *s, const AVStream *st, enum AVCodecID codec_id)
{
#if FF_API_LAVF_AVCTX
FF_DISABLE_DEPRECATION_WARNINGS
if (st->codec->codec)
return st->codec->codec;
FF_ENABLE_DEPRECATION_WARNINGS
#endif
switch (st->codecpar->codec_type) {
case AVMEDIA_TYPE_VIDEO:
if (s->video_codec) return s->video_codec;
break;
case AVMEDIA_TYPE_AUDIO:
if (s->audio_codec) return s->audio_codec;
break;
case AVMEDIA_TYPE_SUBTITLE:
if (s->subtitle_codec) return s->subtitle_codec;
break;
}
return avcodec_find_decoder(codec_id);
}
If a decoder has already been set, it is returned according to the media type; otherwise the lookup is done by codec ID. This essentially walks the linked list of codecs whose head is first_avcodec; the list was registered in av_register_all(). Each codec's ID is matched in turn, and the matching decoder is returned.
The decoder is then opened with avcodec_open2().
This will be analyzed further later.
To sum up what the first pass over all the streams does: it initializes parameters such as time_base and codec_id, initializes the parser, and then, for each stream, finds the decoder matching codec_id and opens it. In other words, after the first pass every stream has a usable decoder; a hand-rolled equivalent of this lookup-and-open step is sketched below.
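For reference, this is roughly what that lookup-and-open step looks like when done by hand with the public API (a sketch only; avformat_find_stream_info() itself works on its internal codec context and uses find_probe_decoder()):
/* st is an AVStream taken from an opened AVFormatContext */
const AVCodec *dec = avcodec_find_decoder(st->codecpar->codec_id);
if (dec) {
    AVCodecContext *avctx = avcodec_alloc_context3(dec);
    if (avctx &&
        avcodec_parameters_to_context(avctx, st->codecpar) >= 0 &&
        avcodec_open2(avctx, dec, NULL) >= 0) {
        /* the decoder is now open and its context parameters are filled in */
    }
    avcodec_free_context(&avctx);
}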
part 4: second pass over the streams
for (i = 0; i < ic->nb_streams; i++) {
#if FF_API_R_FRAME_RATE
ic->streams[i]->info->last_dts = AV_NOPTS_VALUE;
#endif
ic->streams[i]->info->fps_first_dts = AV_NOPTS_VALUE;
ic->streams[i]->info->fps_last_dts = AV_NOPTS_VALUE;
}
The second pass over the streams happens after the decoders have been opened and just initializes a few fields for each stream. last_dts is the last decoding timestamp seen. fps_first_dts and fps_last_dts are used for the frame-rate calculation, roughly as sketched below.
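As a rough illustration only (not the exact logic of ff_rfps_calculate()), once the first and last DTS of n probed packets are known, a frame rate could be estimated like this (guess_fps is a hypothetical helper, not an FFmpeg function):
/* dts values are expressed in st->time_base units */
static double guess_fps(int64_t first_dts, int64_t last_dts, int n, AVRational time_base)
{
    if (n < 2 || last_dts <= first_dts)
        return 0.0;
    double span_sec = (last_dts - first_dts) * av_q2d(time_base);  /* elapsed time in seconds */
    return (n - 1) / span_sec;                                     /* frames per second */
}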
part 5: the infinite loop
The loop that follows is quite long. Let's recap the work done so far: we iterated over the streams twice, set up the decoder and the frame-rate bookkeeping for each stream, and initialized each stream's parser. When introducing the function we said it tries to read some data in, parse it, and derive more detailed stream information from it; so far no data has been read at all. What comes next, then, must be reading data in, decoding it and analyzing the streams.
read_size = 0;
for (;;) {
int analyzed_all_streams;
if (ff_check_interrupt(&ic->interrupt_callback)) {
ret = AVERROR_EXIT;
av_log(ic, AV_LOG_DEBUG, "interrupted\n");
break;
}
/* check if one codec still needs to be handled */
for (i = 0; i < ic->nb_streams; i++) {
int fps_analyze_framecount = 20;
int count;
st = ic->streams[i];
if (!has_codec_parameters(st, NULL))
break;
/* If the timebase is coarse (like the usual millisecond precision
* of mkv), we need to analyze more frames to reliably arrive at
* the correct fps. */
if (av_q2d(st->time_base) > 0.0005)
fps_analyze_framecount *= 2;
if (!tb_unreliable(st->internal->avctx))
fps_analyze_framecount = 0;
if (ic->fps_probe_size >= 0)
fps_analyze_framecount = ic->fps_probe_size;
if (st->disposition & AV_DISPOSITION_ATTACHED_PIC)
fps_analyze_framecount = 0;
/* variable fps and no guess at the real fps */
count = (ic->iformat->flags & AVFMT_NOTIMESTAMPS) ?
st->info->codec_info_duration_fields/2 :
st->info->duration_count;
if (!(st->r_frame_rate.num && st->avg_frame_rate.num) &&
st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
if (count < fps_analyze_framecount)
break;
}
// Look at the first 3 frames if there is evidence of frame delay
// but the decoder delay is not set.
if (st->info->frame_delay_evidence && count < 2 && st->internal->avctx->has_b_frames == 0)
break;
if (!st->internal->avctx->extradata &&
(!st->internal->extract_extradata.inited ||
st->internal->extract_extradata.bsf) &&
extract_extradata_check(st))
break;
if (st->first_dts == AV_NOPTS_VALUE &&
!(ic->iformat->flags & AVFMT_NOTIMESTAMPS) &&
st->codec_info_nb_frames < ((st->disposition & AV_DISPOSITION_ATTACHED_PIC) ? 1 : ic->max_ts_probe) &&
(st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO ||
st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO))
break;
}
analyzed_all_streams = 0;
if (!missing_streams || !*missing_streams)
if (i == ic->nb_streams) {
analyzed_all_streams = 1;
/* NOTE: If the format has no header, then we need to read some
* packets to get most of the streams, so we cannot stop here. */
if (!(ic->ctx_flags & AVFMTCTX_NOHEADER)) {
/* If we found the info for all the codecs, we can stop. */
ret = count;
av_log(ic, AV_LOG_DEBUG, "All info found\n");
flush_codecs = 0;
break;
}
}
/* We did not get all the codec info, but we read too much data. */
if (read_size >= probesize) {
ret = count;
av_log(ic, AV_LOG_DEBUG,
"Probe buffer size limit of %"PRId64" bytes reached\n", probesize);
for (i = 0; i < ic->nb_streams; i++)
if (!ic->streams[i]->r_frame_rate.num &&
ic->streams[i]->info->duration_count <= 1 &&
ic->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO &&
strcmp(ic->iformat->name, "image2"))
av_log(ic, AV_LOG_WARNING,
"Stream #%d: not enough frames to estimate rate; "
"consider increasing probesize\n", i);
break;
}
/* NOTE: A new stream can be added there if no header in file
* (AVFMTCTX_NOHEADER). */
ret = read_frame_internal(ic, &pkt1);
if (ret == AVERROR(EAGAIN))
continue;
if (ret < 0) {
/* EOF or error*/
eof_reached = 1;
break;
}
pkt = &pkt1;
if (!(ic->flags & AVFMT_FLAG_NOBUFFER)) {
ret = ff_packet_list_put(&ic->internal->packet_buffer,
&ic->internal->packet_buffer_end,
pkt, 0);
if (ret < 0)
goto find_stream_info_err;
}
st = ic->streams[pkt->stream_index];
if (!(st->disposition & AV_DISPOSITION_ATTACHED_PIC))
read_size += pkt->size;
avctx = st->internal->avctx;
if (!st->internal->avctx_inited) {
ret = avcodec_parameters_to_context(avctx, st->codecpar);
if (ret < 0)
goto find_stream_info_err;
st->internal->avctx_inited = 1;
}
if (pkt->dts != AV_NOPTS_VALUE && st->codec_info_nb_frames > 1) {
/* check for non-increasing dts */
if (st->info->fps_last_dts != AV_NOPTS_VALUE &&
st->info->fps_last_dts >= pkt->dts) {
av_log(ic, AV_LOG_DEBUG,
"Non-increasing DTS in stream %d: packet %d with DTS "
"%"PRId64", packet %d with DTS %"PRId64"\n",
st->index, st->info->fps_last_dts_idx,
st->info->fps_last_dts, st->codec_info_nb_frames,
pkt->dts);
st->info->fps_first_dts =
st->info->fps_last_dts = AV_NOPTS_VALUE;
}
/* Check for a discontinuity in dts. If the difference in dts
* is more than 1000 times the average packet duration in the
* sequence, we treat it as a discontinuity. */
if (st->info->fps_last_dts != AV_NOPTS_VALUE &&
st->info->fps_last_dts_idx > st->info->fps_first_dts_idx &&
(pkt->dts - st->info->fps_last_dts) / 1000 >
(st->info->fps_last_dts - (uint64_t)st->info->fps_first_dts) /
(st->info->fps_last_dts_idx - st->info->fps_first_dts_idx)) {
av_log(ic, AV_LOG_WARNING,
"DTS discontinuity in stream %d: packet %d with DTS "
"%"PRId64", packet %d with DTS %"PRId64"\n",
st->index, st->info->fps_last_dts_idx,
st->info->fps_last_dts, st->codec_info_nb_frames,
pkt->dts);
st->info->fps_first_dts =
st->info->fps_last_dts = AV_NOPTS_VALUE;
}
/* update stored dts values */
if (st->info->fps_first_dts == AV_NOPTS_VALUE) {
st->info->fps_first_dts = pkt->dts;
st->info->fps_first_dts_idx = st->codec_info_nb_frames;
}
st->info->fps_last_dts = pkt->dts;
st->info->fps_last_dts_idx = st->codec_info_nb_frames;
}
if (st->codec_info_nb_frames>1) {
int64_t t = 0;
int64_t limit;
if (st->time_base.den > 0)
t = av_rescale_q(st->info->codec_info_duration, st->time_base, AV_TIME_BASE_Q);
if (st->avg_frame_rate.num > 0)
t = FFMAX(t, av_rescale_q(st->codec_info_nb_frames, av_inv_q(st->avg_frame_rate), AV_TIME_BASE_Q));
if ( t == 0
&& st->codec_info_nb_frames>30
&& st->info->fps_first_dts != AV_NOPTS_VALUE
&& st->info->fps_last_dts != AV_NOPTS_VALUE)
t = FFMAX(t, av_rescale_q(st->info->fps_last_dts - st->info->fps_first_dts, st->time_base, AV_TIME_BASE_Q));
if (analyzed_all_streams) limit = max_analyze_duration;
else if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE) limit = max_subtitle_analyze_duration;
else limit = max_stream_analyze_duration;
if (t >= limit) {
av_log(ic, AV_LOG_VERBOSE, "max_analyze_duration %"PRId64" reached at %"PRId64" microseconds st:%d\n",
limit,
t, pkt->stream_index);
if (ic->flags & AVFMT_FLAG_NOBUFFER)
av_packet_unref(pkt);
break;
}
if (pkt->duration) {
if (avctx->codec_type == AVMEDIA_TYPE_SUBTITLE && pkt->pts != AV_NOPTS_VALUE && pkt->pts >= st->start_time) {
st->info->codec_info_duration = FFMIN(pkt->pts - st->start_time, st->info->codec_info_duration + pkt->duration);
} else
st->info->codec_info_duration += pkt->duration;
st->info->codec_info_duration_fields += st->parser && st->need_parsing && avctx->ticks_per_frame ==2 ? st->parser->repeat_pict + 1 : 2;
}
}
if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
#if FF_API_R_FRAME_RATE
ff_rfps_add_frame(ic, st, pkt->dts);
#endif
if (pkt->dts != pkt->pts && pkt->dts != AV_NOPTS_VALUE && pkt->pts != AV_NOPTS_VALUE)
st->info->frame_delay_evidence = 1;
}
if (!st->internal->avctx->extradata) {
ret = extract_extradata(st, pkt);
if (ret < 0)
goto find_stream_info_err;
}
/* If still no information, we try to open the codec and to
* decompress the frame. We try to avoid that in most cases as
* it takes longer and uses more memory. For MPEG-4, we need to
* decompress for QuickTime.
*
* If AV_CODEC_CAP_CHANNEL_CONF is set this will force decoding of at
* least one frame of codec data, this makes sure the codec initializes
* the channel configuration and does not only trust the values from
* the container. */
try_decode_frame(ic, st, pkt,
(options && i < orig_nb_streams) ? &options[i] : NULL);
if (ic->flags & AVFMT_FLAG_NOBUFFER)
av_packet_unref(pkt);
st->codec_info_nb_frames++;
count++;
}
This infinite loop does the following:
1. Check whether the user has requested an interrupt. If so, the interrupt callback is honored and the function returns.
2. Iterate over the streams once more and check whether any codec still needs further handling. Each codec's parameters are examined; if they are all ready, the loop exits because no further analysis is needed. If some codec information is still incomplete, execution continues.
3. Read one packet with read_frame_internal(). This function is quite complex and will be analyzed later.
4. Append the packet to the packet buffer with ff_packet_list_put(), and update the total amount of data read: read_size += pkt->size;
5. Update the codec context parameters again with avcodec_parameters_to_context(), which we have already looked at.
6. Check the DTS for continuity. The average DTS increment so far is (fps_last_dts - fps_first_dts) / (fps_last_dts_idx - fps_first_dts_idx); if the jump from the previous DTS to the current one exceeds 1000 times that average packet duration, it is treated as a discontinuity and the fps_first_dts/fps_last_dts statistics are reset:
if (st->info->fps_last_dts != AV_NOPTS_VALUE &&
st->info->fps_last_dts_idx > st->info->fps_first_dts_idx &&
(pkt->dts - st->info->fps_last_dts) / 1000 >
(st->info->fps_last_dts - (uint64_t)st->info->fps_first_dts) /
(st->info->fps_last_dts_idx - st->info->fps_first_dts_idx)) {
av_log(ic, AV_LOG_WARNING,
"DTS discontinuity in stream %d: packet %d with DTS "
"%"PRId64", packet %d with DTS %"PRId64"\n",
st->index, st->info->fps_last_dts_idx,
st->info->fps_last_dts, st->codec_info_nb_frames,
pkt->dts);
st->info->fps_first_dts =
st->info->fps_last_dts = AV_NOPTS_VALUE;
}
7. Update the stored DTS values. The code:
/* update stored dts values */
if (st->info->fps_first_dts == AV_NOPTS_VALUE) {
st->info->fps_first_dts = pkt->dts;
st->info->fps_first_dts_idx = st->codec_info_nb_frames;
}
st->info->fps_last_dts = pkt->dts;
st->info->fps_last_dts_idx = st->codec_info_nb_frames;
If at this point there is still not enough information, the function tries to actually decode some data and analyze it.
First look at the comment:
If there is still no information, we try to open the codec and decompress a frame. We try to avoid this in most cases, since it takes longer and uses more memory. For MPEG-4, we need to decompress for QuickTime.
If AV_CODEC_CAP_CHANNEL_CONF is set, this forces decoding of at least one frame of codec data; this makes sure the codec initializes the channel configuration and does not just trust the values from the container.
Decoding one frame of data is attempted with try_decode_frame(). Some parameters, such as a video stream's pix_fmt, can only be obtained by actually running the decoder (e.g. h264_decode_frame).
To sum up the work of this infinite loop: it checks whether the information of all streams is complete; if so, it returns; if not, it tries to decode a frame of data and obtain from it the parameters, such as pix_fmt, that can only be recovered by decoding. After the call, these values can be inspected through codecpar, as in the sketch below.
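A small sketch of what a caller might check after avformat_find_stream_info() to see whether probing filled in those decode-only parameters (video_index is assumed to have been found beforehand):
AVStream *st = ic->streams[video_index];
if (st->codecpar->width > 0 &&
    st->codecpar->format != AV_PIX_FMT_NONE) {
    /* width/height and the pixel format were recovered during probing */
    av_log(NULL, AV_LOG_INFO, "video: %dx%d, pix_fmt %s\n",
           st->codecpar->width, st->codecpar->height,
           av_get_pix_fmt_name(st->codecpar->format));
}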
part 6: checking whether the end of the file was reached
if (eof_reached) {
int stream_index;
for (stream_index = 0; stream_index < ic->nb_streams; stream_index++) {
st = ic->streams[stream_index];
avctx = st->internal->avctx;
if (!has_codec_parameters(st, NULL)) {
const AVCodec *codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
if (codec && !avctx->codec) {
AVDictionary *opts = NULL;
if (ic->codec_whitelist)
av_dict_set(&opts, "codec_whitelist", ic->codec_whitelist, 0);
if (avcodec_open2(avctx, codec, (options && stream_index < orig_nb_streams) ? &options[stream_index] : &opts) < 0)
av_log(ic, AV_LOG_WARNING,
"Failed to open codec in %s\n",__FUNCTION__);
av_dict_free(&opts);
}
}
// EOF already reached while reading the stream above.
// So continue with reoordering DTS with whatever delay we have.
if (ic->internal->packet_buffer && !has_decode_delay_been_guessed(st)) {
update_dts_from_pts(ic, stream_index, ic->internal->packet_buffer);
}
}
}
If the end of the file was reached, the streams are iterated once more and their codec parameters are checked; for any stream whose parameters are incomplete, avcodec_open2() is called again to initialize them.
part 7: flushing the decoders
if (flush_codecs) {
AVPacket empty_pkt = { 0 };
int err = 0;
av_init_packet(&empty_pkt);
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
/* flush the decoders */
if (st->info->found_decoder == 1) {
do {
err = try_decode_frame(ic, st, &empty_pkt,
(options && i < orig_nb_streams)
? &options[i] : NULL);
} while (err > 0 && !has_codec_parameters(st, NULL));
if (err < 0) {
av_log(ic, AV_LOG_INFO,
"decoding for stream %d failed\n", st->index);
}
}
}
}
Some frames may still be buffered inside the decoders; they need to be flushed out.
has_codec_parameters() checks whether the relevant members of an AVStream have all been set; it is used in several places inside avformat_find_stream_info(). Its definition is shown below.
has_codec_parameters()
static int has_codec_parameters(AVStream *st, const char **errmsg_ptr)
{
AVCodecContext *avctx = st->internal->avctx;
#define FAIL(errmsg) do { \
if (errmsg_ptr) \
*errmsg_ptr = errmsg; \
return 0; \
} while (0)
if ( avctx->codec_id == AV_CODEC_ID_NONE
&& avctx->codec_type != AVMEDIA_TYPE_DATA)
FAIL("unknown codec");
switch (avctx->codec_type) {
case AVMEDIA_TYPE_AUDIO:
if (!avctx->frame_size && determinable_frame_size(avctx))
FAIL("unspecified frame size");
if (st->info->found_decoder >= 0 &&
avctx->sample_fmt == AV_SAMPLE_FMT_NONE)
FAIL("unspecified sample format");
if (!avctx->sample_rate)
FAIL("unspecified sample rate");
if (!avctx->channels)
FAIL("unspecified number of channels");
if (st->info->found_decoder >= 0 && !st->nb_decoded_frames && avctx->codec_id == AV_CODEC_ID_DTS)
FAIL("no decodable DTS frames");
break;
case AVMEDIA_TYPE_VIDEO:
if (!avctx->width)
FAIL("unspecified size");
if (st->info->found_decoder >= 0 && avctx->pix_fmt == AV_PIX_FMT_NONE)
FAIL("unspecified pixel format");
if (st->codecpar->codec_id == AV_CODEC_ID_RV30 || st->codecpar->codec_id == AV_CODEC_ID_RV40)
if (!st->sample_aspect_ratio.num && !st->codecpar->sample_aspect_ratio.num && !st->codec_info_nb_frames)
FAIL("no frame in rv30/40 and no sar");
break;
case AVMEDIA_TYPE_SUBTITLE:
if (avctx->codec_id == AV_CODEC_ID_HDMV_PGS_SUBTITLE && !avctx->width)
FAIL("unspecified size");
break;
case AVMEDIA_TYPE_DATA:
if (avctx->codec_id == AV_CODEC_ID_NONE) return 1;
}
return 1;
}
part 8: computing rfps
ff_rfps_calculate(ic);
part 9: another pass over the streams
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
avctx = st->internal->avctx;
if (avctx->codec_type == AVMEDIA_TYPE_VIDEO) {
if (avctx->codec_id == AV_CODEC_ID_RAWVIDEO && !avctx->codec_tag && !avctx->bits_per_coded_sample) {
uint32_t tag= avcodec_pix_fmt_to_codec_tag(avctx->pix_fmt);
if (avpriv_find_pix_fmt(avpriv_get_raw_pix_fmt_tags(), tag) == avctx->pix_fmt)
avctx->codec_tag= tag;
}
/* estimate average framerate if not set by demuxer */
if (st->info->codec_info_duration_fields &&
!st->avg_frame_rate.num &&
st->info->codec_info_duration) {
int best_fps = 0;
double best_error = 0.01;
AVRational codec_frame_rate = avctx->framerate;
if (st->info->codec_info_duration >= INT64_MAX / st->time_base.num / 2||
st->info->codec_info_duration_fields >= INT64_MAX / st->time_base.den ||
st->info->codec_info_duration < 0)
continue;
av_reduce(&st->avg_frame_rate.num, &st->avg_frame_rate.den,
st->info->codec_info_duration_fields * (int64_t) st->time_base.den,
st->info->codec_info_duration * 2 * (int64_t) st->time_base.num, 60000);
/* Round guessed framerate to a "standard" framerate if it's
* within 1% of the original estimate. */
for (j = 0; j < MAX_STD_TIMEBASES; j++) {
AVRational std_fps = { get_std_framerate(j), 12 * 1001 };
double error = fabs(av_q2d(st->avg_frame_rate) /
av_q2d(std_fps) - 1);
if (error < best_error) {
best_error = error;
best_fps = std_fps.num;
}
if (ic->internal->prefer_codec_framerate && codec_frame_rate.num > 0 && codec_frame_rate.den > 0) {
error = fabs(av_q2d(codec_frame_rate) /
av_q2d(std_fps) - 1);
if (error < best_error) {
best_error = error;
best_fps = std_fps.num;
}
}
}
if (best_fps)
av_reduce(&st->avg_frame_rate.num, &st->avg_frame_rate.den,
best_fps, 12 * 1001, INT_MAX);
}
if (!st->r_frame_rate.num) {
if ( avctx->time_base.den * (int64_t) st->time_base.num
<= avctx->time_base.num * avctx->ticks_per_frame * (int64_t) st->time_base.den) {
av_reduce(&st->r_frame_rate.num, &st->r_frame_rate.den,
avctx->time_base.den, (int64_t)avctx->time_base.num * avctx->ticks_per_frame, INT_MAX);
} else {
st->r_frame_rate.num = st->time_base.den;
st->r_frame_rate.den = st->time_base.num;
}
}
if (st->display_aspect_ratio.num && st->display_aspect_ratio.den) {
AVRational hw_ratio = { avctx->height, avctx->width };
st->sample_aspect_ratio = av_mul_q(st->display_aspect_ratio,
hw_ratio);
}
} else if (avctx->codec_type == AVMEDIA_TYPE_AUDIO) {
if (!avctx->bits_per_coded_sample)
avctx->bits_per_coded_sample =
av_get_bits_per_sample(avctx->codec_id);
// set stream disposition based on audio service type
switch (avctx->audio_service_type) {
case AV_AUDIO_SERVICE_TYPE_EFFECTS:
st->disposition = AV_DISPOSITION_CLEAN_EFFECTS;
break;
case AV_AUDIO_SERVICE_TYPE_VISUALLY_IMPAIRED:
st->disposition = AV_DISPOSITION_VISUAL_IMPAIRED;
break;
case AV_AUDIO_SERVICE_TYPE_HEARING_IMPAIRED:
st->disposition = AV_DISPOSITION_HEARING_IMPAIRED;
break;
case AV_AUDIO_SERVICE_TYPE_COMMENTARY:
st->disposition = AV_DISPOSITION_COMMENT;
break;
case AV_AUDIO_SERVICE_TYPE_KARAOKE:
st->disposition = AV_DISPOSITION_KARAOKE;
break;
}
}
}
This loop handles the video and audio streams.
For a video stream, it first takes the raw pixel format and derives the corresponding codec tag from it (for raw video without a tag), and then estimates the average frame rate if the demuxer did not set one.
For an audio stream, disposition is initialized from the audio service type; disposition is a bitmask of AV_DISPOSITION_* flags describing the role of the stream (e.g. commentary or hearing-impaired audio).
part 10: computing timing-related parameters
if (probesize)
estimate_timings(ic, old_offset);
static void estimate_timings(AVFormatContext *ic, int64_t old_offset)
{
int64_t file_size;
/* get the file size, if possible */
if (ic->iformat->flags & AVFMT_NOFILE) {
file_size = 0;
} else {
file_size = avio_size(ic->pb);
file_size = FFMAX(0, file_size);
}
if ((!strcmp(ic->iformat->name, "mpeg") ||
!strcmp(ic->iformat->name, "mpegts")) &&
file_size && (ic->pb->seekable & AVIO_SEEKABLE_NORMAL)) {
/* get accurate estimate from the PTSes */
estimate_timings_from_pts(ic, old_offset);
ic->duration_estimation_method = AVFMT_DURATION_FROM_PTS;
} else if (has_duration(ic)) {
/* at least one component has timings - we use them for all
* the components */
fill_all_stream_timings(ic);
ic->duration_estimation_method = AVFMT_DURATION_FROM_STREAM;
} else {
/* less precise: use bitrate info */
estimate_timings_from_bit_rate(ic);
ic->duration_estimation_method = AVFMT_DURATION_FROM_BITRATE;
}
update_stream_timings(ic);
{
int i;
AVStream av_unused *st;
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
av_log(ic, AV_LOG_TRACE, "stream %d: start_time: %0.3f duration: %0.3f\n", i,
(double) st->start_time * av_q2d(st->time_base),
(double) st->duration * av_q2d(st->time_base));
}
av_log(ic, AV_LOG_TRACE,
"format: start_time: %0.3f duration: %0.3f bitrate=%"PRId64" kb/s\n",
(double) ic->start_time / AV_TIME_BASE,
(double) ic->duration / AV_TIME_BASE,
(int64_t)ic->bit_rate / 1000);
}
}
From the code of estimate_timings() we can see that there are three estimation methods: (1) from the PTS (presentation timestamps), via estimate_timings_from_pts(). Its basic idea is to read the PTS of the AVPacket at the end of each stream and the PTS of the AVPacket at the start, and subtract the two to get the duration. (2) from durations that are already known, via fill_all_stream_timings(). I have not read its code in detail, but judging from the function's comment, when some streams carry duration information it is simply propagated to the other streams.
(3) from the bitrate, via estimate_timings_from_bit_rate(). Its basic idea is to take the total file size and the total bitrate and divide one by the other to get the duration.
The code of the simplest of these methods, estimate_timings_from_bit_rate(), is attached here.
static void estimate_timings_from_bit_rate(AVFormatContext *ic)
{
int64_t filesize, duration;
int i, show_warning = 0;
AVStream *st;
/* if bit_rate is already set, we believe it */
if (ic->bit_rate <= 0) {
int64_t bit_rate = 0;
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
if (st->codecpar->bit_rate <= 0 && st->internal->avctx->bit_rate > 0)
st->codecpar->bit_rate = st->internal->avctx->bit_rate;
if (st->codecpar->bit_rate > 0) {
if (INT64_MAX - st->codecpar->bit_rate < bit_rate) {
bit_rate = 0;
break;
}
bit_rate += st->codecpar->bit_rate;
} else if (st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO && st->codec_info_nb_frames > 1) {
// If we have a videostream with packets but without a bitrate
// then consider the sum not known
bit_rate = 0;
break;
}
}
ic->bit_rate = bit_rate;
}
/* if duration is already set, we believe it */
if (ic->duration == AV_NOPTS_VALUE &&
ic->bit_rate != 0) {
filesize = ic->pb ? avio_size(ic->pb) : 0;
if (filesize > ic->internal->data_offset) {
filesize -= ic->internal->data_offset;
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
if ( st->time_base.num <= INT64_MAX / ic->bit_rate
&& st->duration == AV_NOPTS_VALUE) {
duration = av_rescale(8 * filesize, st->time_base.den,
ic->bit_rate *
(int64_t) st->time_base.num);
st->duration = duration;
show_warning = 1;
}
}
}
}
if (show_warning)
av_log(ic, AV_LOG_WARNING,
"Estimating duration from bitrate, this may be inaccurate\n");
}
As the code shows, the function does two things: (1) If the AVFormatContext carries no bit_rate information, the bit_rate of all AVStreams is summed up and used as the AVFormatContext's bit_rate.
(2) The duration is obtained by dividing the file size by the bitrate. Concretely:
AVStream->duration = (filesize * 8 / bit_rate) / time_base
PS:
1) filesize is multiplied by 8 to convert bytes into bits.
2) The actual computation is done by av_rescale(); x = av_rescale(a, b, c) means x = a * b / c.
3) The division by time_base is needed because AVStream's duration is expressed in time_base units; note that this differs from the duration in AVFormatContext, whose unit is AV_TIME_BASE (fixed at 1000000, i.e. microseconds). A small worked example follows.
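A quick worked example (the numbers are invented for illustration): a 90,000,000-byte file with a total bitrate of 1,200,000 bit/s lasts 90,000,000 * 8 / 1,200,000 = 600 seconds; for a stream with time_base = 1/90000 that is 600 * 90000 = 54,000,000 time_base units, which is exactly what the av_rescale() call in the code computes:
int64_t filesize = 90000000;            /* bytes, invented example value */
int64_t bit_rate = 1200000;             /* bits per second, invented example value */
AVRational time_base = { 1, 90000 };    /* a typical MPEG-TS stream time base */
/* same formula as in estimate_timings_from_bit_rate(): duration in time_base units */
int64_t duration = av_rescale(8 * filesize, time_base.den,
                              bit_rate * (int64_t)time_base.num);
/* duration == 54000000, i.e. 600 seconds */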
part 11: updating the data
if (ret >= 0 && ic->nb_streams)
/* We could not have all the codec parameters before EOF. */
ret = -1;
for (i = 0; i < ic->nb_streams; i++) {
const char *errmsg;
st = ic->streams[i];
/* if no packet was ever seen, update context now for has_codec_parameters */
if (!st->internal->avctx_inited) {
if (st->codecpar->codec_type == AVMEDIA_TYPE_AUDIO &&
st->codecpar->format == AV_SAMPLE_FMT_NONE)
st->codecpar->format = st->internal->avctx->sample_fmt;
ret = avcodec_parameters_to_context(st->internal->avctx, st->codecpar);
if (ret < 0)
goto find_stream_info_err;
}
if (!has_codec_parameters(st, &errmsg)) {
char buf[256];
avcodec_string(buf, sizeof(buf), st->internal->avctx, 0);
av_log(ic, AV_LOG_WARNING,
"Could not find codec parameters for stream %d (%s): %s\n"
"Consider increasing the value for the 'analyzeduration' and 'probesize' options\n",
i, buf, errmsg);
} else {
ret = 0;
}
}
compute_chapters_end(ic);
/* update the stream parameters from the internal codec contexts */
for (i = 0; i < ic->nb_streams; i++) {
st = ic->streams[i];
if (st->internal->avctx_inited) {
int orig_w = st->codecpar->width;
int orig_h = st->codecpar->height;
ret = avcodec_parameters_from_context(st->codecpar, st->internal->avctx);
if (ret < 0)
goto find_stream_info_err;
#if FF_API_LOWRES
// The decoder might reduce the video size by the lowres factor.
if (st->internal->avctx->lowres && orig_w) {
st->codecpar->width = orig_w;
st->codecpar->height = orig_h;
}
#endif
}
#if FF_API_LAVF_AVCTX
FF_DISABLE_DEPRECATION_WARNINGS
ret = avcodec_parameters_to_context(st->codec, st->codecpar);
if (ret < 0)
goto find_stream_info_err;
#if FF_API_LOWRES
// The old API (AVStream.codec) "requires" the resolution to be adjusted
// by the lowres factor.
if (st->internal->avctx->lowres && st->internal->avctx->width) {
st->codec->lowres = st->internal->avctx->lowres;
st->codec->width = st->internal->avctx->width;
st->codec->height = st->internal->avctx->height;
}
#endif
if (st->codec->codec_tag != MKTAG('t','m','c','d')) {
st->codec->time_base = st->internal->avctx->time_base;
st->codec->ticks_per_frame = st->internal->avctx->ticks_per_frame;
}
st->codec->framerate = st->avg_frame_rate;
if (st->internal->avctx->subtitle_header) {
st->codec->subtitle_header = av_malloc(st->internal->avctx->subtitle_header_size);
if (!st->codec->subtitle_header)
goto find_stream_info_err;
st->codec->subtitle_header_size = st->internal->avctx->subtitle_header_size;
memcpy(st->codec->subtitle_header, st->internal->avctx->subtitle_header,
st->codec->subtitle_header_size);
}
// Fields unavailable in AVCodecParameters
st->codec->coded_width = st->internal->avctx->coded_width;
st->codec->coded_height = st->internal->avctx->coded_height;
st->codec->properties = st->internal->avctx->properties;
FF_ENABLE_DEPRECATION_WARNINGS
#endif
st->internal->avctx_inited = 0;
}
Finally, the data in the various structures is brought in sync: the AVCodecContext inside AVStream must match the AVCodecContext inside AVStream's AVStreamInternal, and the AVCodecParameters in AVStream must match that internal AVCodecContext as well. The code above mainly synchronizes the data of these structures.
Summary
avformat_open_input() reads the file header; for an MP4 file it parses all the boxes. But it only stores what it has read into the corresponding data structures; at that point many fields of AVStream are still blank.
avformat_find_stream_info() checks these important fields and fills in the ones that are still blank. Since a lot of information has already been gathered while parsing the file header, avformat_find_stream_info() can often fill in its members from that information alone, and it returns as soon as the important members are filled in; in this case the function is very efficient.
For some files, however, the information from the file header alone is not enough; for example, a video stream's pix_fmt can only be obtained by running the decoder (e.g. h264_decode_frame). In that case the function reads some data in, parses it, tries to decode it, and finally extracts the needed information from it. Once all the information is available, avformat_find_stream_info() computes start_time, the bit rate and other values; the timing-related calculations are rather involved and I have not fully understood them, so they are left for future study. After the computation, the relevant structure members are updated.