本篇只关注编码相关的内容,主要分三块来阐述:
初始化流程、编码流程、编码控制
如上图所示,step1~step31是初始化流程,主要是创建相关的对象,step32~step49是编码流程。
初始化流程
VideoStreamEncoder实现了VideoSinkInterface和EncodedImageCallback接口,
只看 Java 代码我们会发现这个类的方法基本都是 package private 的,而且并没有被其他类调用过,对 MediaCodecVideoEncoder
接口的调用都发生在 native 层,在 webrtc/sdk/android/src/jni/androidmediaencoder_jni.cc
这个文件中。当做为一个VideoSinkInterface对象时,可以接收WebRTC Native 源码1:相机采集实现分析分发的图像。当做为一个EncodedImageCallback对象时,可以接收MediaCodecVideoEncoder编码后的码流数据。MediaCodecVideoEncoder的callback_是一个VCMEncodedFrameCallback对象,VCMEncodedFrameCallback的post_encode_callback_是一个VideoStreamEncoder对象,因此编码后的数据就可以通过callback_成员回调到VideoStreamEncoder。
创建视频发送Stream时会调用WebRtcVideoSendStream的RecreateWebRtcStream,在这个过程会完成视频编码模块初始化工作,RecreateWebRtcStream主要代码如下:
void WebRtcVideoChannel::WebRtcVideoSendStream::RecreateWebRtcStream() {
stream_ = call_->CreateVideoSendStream(std::move(config),
parameters_.encoder_config.Copy());
if (source_) {
stream_->SetSource(this, GetDegradationPreference());
}
}
主要包括如下两个流程:
- 调用Call对象的CreateVideoSendStream创建VideoSendStream对象。
- 调用SetSource函数设置图像数据源,最后会调用VideoTrack的AddOrUpdateSink函数注册sink对象,该sink对象是一个VideoStreamEncoder对象。
Call的CreateVideoSendStream主要代码如下:
webrtc::VideoSendStream* Call::CreateVideoSendStream(
webrtc::VideoSendStream::Config config,
VideoEncoderConfig encoder_config) {
VideoSendStream* send_stream = new VideoSendStream(
num_cpu_cores_, module_process_thread_.get(), &worker_queue_,
call_stats_.get(), transport_send_.get(), bitrate_allocator_.get(),
video_send_delay_stats_.get(), event_log_, std::move(config),
std::move(encoder_config), suspended_video_send_ssrcs_,
suspended_video_payload_states_);
return send_stream;
}
VideoSendStream的构造函数如下所示:
VideoSendStream::VideoSendStream(
int num_cpu_cores,
ProcessThread* module_process_thread,
rtc::TaskQueue* worker_queue,
CallStats* call_stats,
RtpTransportControllerSendInterface* transport,
BitrateAllocator* bitrate_allocator,
SendDelayStats* send_delay_stats,
RtcEventLog* event_log,
VideoSendStream::Config config,
VideoEncoderConfig encoder_config,
const std::map<uint32_t, RtpState>& suspended_ssrcs,
const std::map<uint32_t, RtpPayloadState>& suspended_payload_states)
: worker_queue_(worker_queue),
thread_sync_event_(false /* manual_reset */, false),
stats_proxy_(Clock::GetRealTimeClock(),
config,
encoder_config.content_type),
config_(std::move(config)),
content_type_(encoder_config.content_type) {
video_stream_encoder_.reset(
new VideoStreamEncoder(num_cpu_cores, &stats_proxy_,
config_.encoder_settings,
config_.pre_encode_callback,
std::unique_ptr<OveruseFrameDetector>()));
worker_queue_->PostTask(std::unique_ptr<rtc::QueuedTask>(new ConstructionTask(
&send_stream_, &thread_sync_event_, &stats_proxy_,
video_stream_encoder_.get(), module_process_thread, call_stats, transport,
bitrate_allocator, send_delay_stats, event_log, &config_,
encoder_config.max_bitrate_bps, suspended_ssrcs, suspended_payload_states,
encoder_config.content_type)));
// Wait for ConstructionTask to complete so that |send_stream_| can be used.
// |module_process_thread| must be registered and deregistered on the thread
// it was created on.
thread_sync_event_.Wait(rtc::Event::kForever);
send_stream_->RegisterProcessThread(module_process_thread);
// TODO(sprang): Enable this also for regular video calls if it works well.
if (encoder_config.content_type == VideoEncoderConfig::ContentType::kScreen) {
// Only signal target bitrate for screenshare streams, for now.
video_stream_encoder_->SetBitrateObserver(send_stream_.get());
}
ReconfigureVideoEncoder(std::move(encoder_config));
}
主要是创建了VideoStreamEncoder对象和VideoSendStreamImpl对象,然后调用ReconfigureVideoEncoder初始化编码器,VideoSendStreamImpl对象的创建是在ConstructionTask的Run函数中完成的,如下所示:
bool Run() override {
send_stream_->reset(new VideoSendStreamImpl(
stats_proxy_, rtc::TaskQueue::Current(), call_stats_, transport_,
bitrate_allocator_, send_delay_stats_, video_stream_encoder_,
event_log_, config_, initial_encoder_max_bitrate_,
std::move(suspended_ssrcs_), std::move(suspended_payload_states_),
content_type_));
return true;
}
VideoStreamEncoder的构造函数如下所示:
VideoStreamEncoder::VideoStreamEncoder(
uint32_t number_of_cores,
SendStatisticsProxy* stats_proxy,
const VideoSendStream::Config::EncoderSettings& settings,
rtc::VideoSinkInterface<VideoFrame>* pre_encode_callback,
std::unique_ptr<OveruseFrameDetector> overuse_detector)
: shutdown_event_(true /* manual_reset */, false),
number_of_cores_(number_of_cores),
initial_rampup_(0),
source_proxy_(new VideoSourceProxy(this)),
sink_(nullptr),
settings_(settings),
codec_type_(PayloadStringToCodecType(settings.payload_name)),
video_sender_(Clock::GetRealTimeClock(), this),
overuse_detector_(
overuse_detector.get()
? overuse_detector.release()
: new OveruseFrameDetector(
GetCpuOveruseOptions(settings.full_overuse_time),
this,
stats_proxy)),
stats_proxy_(stats_proxy),
pre_encode_callback_(pre_encode_callback),
max_framerate_(-1),
pending_encoder_reconfiguration_(false),
encoder_start_bitrate_bps_(0),
max_data_payload_length_(0),
nack_enabled_(false),
last_observed_bitrate_bps_(0),
encoder_paused_and_dropped_frame_(false),
clock_(Clock::GetRealTimeClock()),
degradation_preference_(
VideoSendStream::DegradationPreference::kDegradationDisabled),
posted_frames_waiting_for_encode_(0),
last_captured_timestamp_(0),
delta_ntp_internal_ms_(clock_->CurrentNtpInMilliseconds() -
clock_->TimeInMilliseconds()),
last_frame_log_ms_(clock_->TimeInMilliseconds()),
captured_frame_count_(0),
dropped_frame_count_(0),
bitrate_observer_(nullptr),
encoder_queue_("EncoderQueue") {
RTC_DCHECK(stats_proxy);
encoder_queue_.PostTask([this] {
RTC_DCHECK_RUN_ON(&encoder_queue_);
overuse_detector_->StartCheckForOveruse();
video_sender_.RegisterExternalEncoder(
settings_.encoder, settings_.payload_type, settings_.internal_source);
});
}
主要是初始化source_proxy_对象和video_sender_对象,VideoSender构造函数如下所示:
VideoSender::VideoSender(Clock* clock,
EncodedImageCallback* post_encode_callback)
: _encoder(nullptr),
_mediaOpt(clock),
_encodedFrameCallback(post_encode_callback, &_mediaOpt),
post_encode_callback_(post_encode_callback),
_codecDataBase(&_encodedFrameCallback),
frame_dropper_enabled_(true),
current_codec_(),
encoder_params_({BitrateAllocation(), 0, 0, 0}),
encoder_has_internal_source_(false),
next_frame_types_(1, kVideoFrameDelta) {
_mediaOpt.Reset();
// Allow VideoSender to be created on one thread but used on another, post
// construction. This is currently how this class is being used by at least
// one external project (diffractor).
sequenced_checker_.Detach();
}
主要是初始化_encodedFrameCallback和codecDataBase对象,VCMEncodedFrameCallback构造函数如下所示,
post_encode_callback是一个VideoStreamEncoder对象:
VCMEncodedFrameCallback::VCMEncodedFrameCallback(
EncodedImageCallback* post_encode_callback,
media_optimization::MediaOptimization* media_opt)
: internal_source_(false),
post_encode_callback_(post_encode_callback),
media_opt_(media_opt),
framerate_(1),
last_timing_frame_time_ms_(-1),
timing_frames_thresholds_({-1, 0}),
incorrect_capture_time_logged_messages_(0),
reordered_frames_logged_messages_(0),
stalled_encoder_logged_messages_(0) {
}
VCMCodecDataBase构造函数如下所示,encoded_frame_callback_是一个VCMEncodedFrameCallback对象:
VCMCodecDataBase::VCMCodecDataBase(
VCMEncodedFrameCallback* encoded_frame_callback)
: number_of_cores_(0),
max_payload_size_(kDefaultPayloadSize),
periodic_key_frames_(false),
pending_encoder_reset_(true),
send_codec_(),
receive_codec_(),
encoder_payload_type_(0),
external_encoder_(nullptr),
internal_source_(false),
encoded_frame_callback_(encoded_frame_callback),
dec_map_(),
dec_external_map_() {}
在VideoStreamEncoder的构造函数中,调用VideoSender的RegisterExternalEncoder函数最终将编码器对象保存在VCMCodecDataBase的external_encoder_成员,如下所示:
void VCMCodecDataBase::RegisterExternalEncoder(VideoEncoder* external_encoder,
uint8_t payload_type,
bool internal_source) {
// Since only one encoder can be used at a given time, only one external
// encoder can be registered/used.
external_encoder_ = external_encoder;
encoder_payload_type_ = payload_type;
internal_source_ = internal_source;
pending_encoder_reset_ = true;
}
编码器对象是通过WebRtcVideoEncoderFactory创建的,如下所示:
void WebRtcVideoChannel::WebRtcVideoSendStream::SetCodec(
const VideoCodecSettings& codec_settings,
bool force_encoder_allocation) {
std::unique_ptr<webrtc::VideoEncoder> new_encoder;
if (force_encoder_allocation || !allocated_encoder_ ||
allocated_codec_ != codec_settings.codec) {
const webrtc::SdpVideoFormat format(codec_settings.codec.name,
codec_settings.codec.params);
new_encoder = encoder_factory_->CreateVideoEncoder(format);
parameters_.config.encoder_settings.encoder = new_encoder.get();
const webrtc::VideoEncoderFactory::CodecInfo info =
encoder_factory_->QueryVideoEncoder(format);
parameters_.config.encoder_settings.full_overuse_time =
info.is_hardware_accelerated;
parameters_.config.encoder_settings.internal_source =
info.has_internal_source;
} else {
new_encoder = std::move(allocated_encoder_);
}
parameters_.config.encoder_settings.payload_name = codec_settings.codec.name;
parameters_.config.encoder_settings.payload_type = codec_settings.codec.id;
}
以Android MediaCodec编码器为例,MediaCodecVideoEncoderFactory的CreateVideoEncoder定义如下:
VideoEncoder* MediaCodecVideoEncoderFactory::CreateVideoEncoder(
const cricket::VideoCodec& codec) {
if (supported_codecs().empty()) {
ALOGW << "No HW video encoder for codec " << codec.name;
return nullptr;
}
if (FindMatchingCodec(supported_codecs(), codec)) {
ALOGD << "Create HW video encoder for " << codec.name;
JNIEnv* jni = AttachCurrentThreadIfNeeded();
ScopedLocalRefFrame local_ref_frame(jni);
return new MediaCodecVideoEncoder(jni, codec, egl_context_);
}
ALOGW << "Can not find HW video encoder for type " << codec.name;
return nullptr;
}
可见VCMCodecDataBase的external_encoder_是一个MediaCodecVideoEncoder对象。
回到VideoSendStream的构造函数中,调用的ReconfigureVideoEncoder最后会调用VCMCodecDataBase的SetSendCodec函数,如下所示,主要是创建并初始化VCMGenericEncoder对象,其中external_encoder_是一个MediaCodecVideoEncoder对象,encoded_frame_callback_是一个VCMEncodedFrameCallback对象。
bool VCMCodecDataBase::SetSendCodec(const VideoCodec* send_codec,
int number_of_cores,
size_t max_payload_size) {
ptr_encoder_.reset(new VCMGenericEncoder(
external_encoder_, encoded_frame_callback_, internal_source_));
encoded_frame_callback_->SetInternalSource(internal_source_);
if (ptr_encoder_->InitEncode(&send_codec_, number_of_cores_,
max_payload_size_) < 0) {
RTC_LOG(LS_ERROR) << "Failed to initialize video encoder.";
DeleteEncoder();
return false;
}
}
VCMGenericEncoder的构造函数如下所示,encoder_和vcm_encoded_frame_callback_保存的分别是MediaCodecVideoEncoder对象和VCMEncodedFrameCallback对象。
VCMGenericEncoder::VCMGenericEncoder(
VideoEncoder* encoder,
VCMEncodedFrameCallback* encoded_frame_callback,
bool internal_source)
: encoder_(encoder),
vcm_encoded_frame_callback_(encoded_frame_callback),
internal_source_(internal_source),
encoder_params_({BitrateAllocation(), 0, 0, 0}),
streams_or_svc_num_(0) {}
VCMGenericEncoder的InitEncode函数定义如下所示:
int32_t VCMGenericEncoder::InitEncode(const VideoCodec* settings,
int32_t number_of_cores,
size_t max_payload_size) {
RTC_DCHECK_RUNS_SERIALIZED(&race_checker_);
TRACE_EVENT0("webrtc", "VCMGenericEncoder::InitEncode");
streams_or_svc_num_ = settings->numberOfSimulcastStreams;
codec_type_ = settings->codecType;
if (settings->codecType == kVideoCodecVP9) {
streams_or_svc_num_ = settings->VP9().numberOfSpatialLayers;
}
if (streams_or_svc_num_ == 0)
streams_or_svc_num_ = 1;
vcm_encoded_frame_callback_->SetTimingFramesThresholds(
settings->timing_frame_thresholds);
vcm_encoded_frame_callback_->OnFrameRateChanged(settings->maxFramerate);
if (encoder_->InitEncode(settings, number_of_cores, max_payload_size) != 0) {
RTC_LOG(LS_ERROR) << "Failed to initialize the encoder associated with "
"payload name: "
<< settings->plName;
return -1;
}
vcm_encoded_frame_callback_->Reset();
encoder_->RegisterEncodeCompleteCallback(vcm_encoded_frame_callback_);
return 0;
}
主要是调用MediaCodecVideoEncoder的InitEncode函数创建并初始化编码器,并将VCMEncodedFrameCallback注册到MediaCodecVideoEncoder中用来接收编码后的数据。
InitEncode函数最后会调用java层MediaCodecVideoEncoder的initEncode函数,如下所示,在这里完成了创建并初始化Android MediaCodec编码器的工作。
@CalledByNativeUnchecked
boolean initEncode(VideoCodecType type, int profile, int width, int height, int kbps, int fps,
EglBase14.Context sharedContext) {
final boolean useSurface = sharedContext != null;
Logging.d(TAG,
"Java initEncode: " + type + ". Profile: " + profile + " : " + width + " x " + height
+ ". @ " + kbps + " kbps. Fps: " + fps + ". Encode from texture : " + useSurface);
this.profile = profile;
this.width = width;
this.height = height;
if (mediaCodecThread != null) {
throw new RuntimeException("Forgot to release()?");
}
EncoderProperties properties = null;
String mime = null;
int keyFrameIntervalSec = 0;
boolean configureH264HighProfile = false;
if (type == VideoCodecType.VIDEO_CODEC_VP8) {
mime = VP8_MIME_TYPE;
properties = findHwEncoder(
VP8_MIME_TYPE, vp8HwList(), useSurface ? supportedSurfaceColorList : supportedColorList);
keyFrameIntervalSec = 100;
} else if (type == VideoCodecType.VIDEO_CODEC_VP9) {
mime = VP9_MIME_TYPE;
properties = findHwEncoder(
VP9_MIME_TYPE, vp9HwList, useSurface ? supportedSurfaceColorList : supportedColorList);
keyFrameIntervalSec = 100;
} else if (type == VideoCodecType.VIDEO_CODEC_H264) {
mime = H264_MIME_TYPE;
properties = findHwEncoder(
H264_MIME_TYPE, h264HwList, useSurface ? supportedSurfaceColorList : supportedColorList);
if (profile == H264Profile.CONSTRAINED_HIGH.getValue()) {
EncoderProperties h264HighProfileProperties = findHwEncoder(H264_MIME_TYPE,
h264HighProfileHwList, useSurface ? supportedSurfaceColorList : supportedColorList);
if (h264HighProfileProperties != null) {
Logging.d(TAG, "High profile H.264 encoder supported.");
configureH264HighProfile = true;
} else {
Logging.d(TAG, "High profile H.264 encoder requested, but not supported. Use baseline.");
}
}
keyFrameIntervalSec = 20;
}
if (properties == null) {
throw new RuntimeException("Can not find HW encoder for " + type);
}
runningInstance = this; // Encoder is now running and can be queried for stack traces.
colorFormat = properties.colorFormat;
bitrateAdjustmentType = properties.bitrateAdjustmentType;
if (bitrateAdjustmentType == BitrateAdjustmentType.FRAMERATE_ADJUSTMENT) {
fps = BITRATE_ADJUSTMENT_FPS;
} else {
fps = Math.min(fps, MAXIMUM_INITIAL_FPS);
}
forcedKeyFrameMs = 0;
lastKeyFrameMs = -1;
if (type == VideoCodecType.VIDEO_CODEC_VP8
&& properties.codecName.startsWith(qcomVp8HwProperties.codecPrefix)) {
if (Build.VERSION.SDK_INT == Build.VERSION_CODES.LOLLIPOP
|| Build.VERSION.SDK_INT == Build.VERSION_CODES.LOLLIPOP_MR1) {
forcedKeyFrameMs = QCOM_VP8_KEY_FRAME_INTERVAL_ANDROID_L_MS;
} else if (Build.VERSION.SDK_INT == Build.VERSION_CODES.M) {
forcedKeyFrameMs = QCOM_VP8_KEY_FRAME_INTERVAL_ANDROID_M_MS;
} else if (Build.VERSION.SDK_INT > Build.VERSION_CODES.M) {
forcedKeyFrameMs = QCOM_VP8_KEY_FRAME_INTERVAL_ANDROID_N_MS;
}
}
Logging.d(TAG, "Color format: " + colorFormat + ". Bitrate adjustment: " + bitrateAdjustmentType
+ ". Key frame interval: " + forcedKeyFrameMs + " . Initial fps: " + fps);
targetBitrateBps = 1000 * kbps;
targetFps = fps;
bitrateAccumulatorMax = targetBitrateBps / 8.0;
bitrateAccumulator = 0;
bitrateObservationTimeMs = 0;
bitrateAdjustmentScaleExp = 0;
mediaCodecThread = Thread.currentThread();
try {
MediaFormat format = MediaFormat.createVideoFormat(mime, width, height);
format.setInteger(MediaFormat.KEY_BIT_RATE, targetBitrateBps);
format.setInteger("bitrate-mode", VIDEO_ControlRateConstant);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT, properties.colorFormat);
format.setInteger(MediaFormat.KEY_FRAME_RATE, targetFps);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, keyFrameIntervalSec);
if (configureH264HighProfile) {
format.setInteger("profile", VIDEO_AVCProfileHigh);
format.setInteger("level", VIDEO_AVCLevel3);
}
Logging.d(TAG, " Format: " + format);
mediaCodec = createByCodecName(properties.codecName);
this.type = type;
if (mediaCodec == null) {
Logging.e(TAG, "Can not create media encoder");
release();
return false;
}
mediaCodec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
if (useSurface) {
eglBase = new EglBase14(sharedContext, EglBase.CONFIG_RECORDABLE);
// Create an input surface and keep a reference since we must release the surface when done.
inputSurface = mediaCodec.createInputSurface();
eglBase.createSurface(inputSurface);
drawer = new GlRectDrawer();
}
mediaCodec.start();
outputBuffers = mediaCodec.getOutputBuffers();
Logging.d(TAG, "Output buffers: " + outputBuffers.length);
} catch (IllegalStateException e) {
Logging.e(TAG, "initEncode failed", e);
release();
return false;
}
return true;
}
MediaCodecVideoEncoder的RegisterEncodeCompleteCallback定义如下所示,可见callback_成员是一个VCMEncodedFrameCallback对象。
int32_t MediaCodecVideoEncoder::RegisterEncodeCompleteCallback(
EncodedImageCallback* callback) {
RTC_DCHECK_CALLED_SEQUENTIALLY(&encoder_queue_checker_);
JNIEnv* jni = AttachCurrentThreadIfNeeded();
ScopedLocalRefFrame local_ref_frame(jni);
callback_ = callback;
return WEBRTC_VIDEO_CODEC_OK;
}
回到WebRtcVideoSendStream的RecreateWebRtcStream函数,会调用VideoSendStream的SetSource设置Source,最后会在VideoSourceProxy的SetSource函数中调用WebRtcVideoSendStream的AddOrUpdateSink函数将VideoStreamEncoder这个sink对象注册到Source上,这样Source的图像数据就可以分发到VideoStreamEncoder对象进行编码了。
视频编码流程
视频编码流程是从VideoBroadcaster回调VideoStreamEncoder的OnFrame开始的。
VideoStreamEncoder的OnFrame定义如下:
void VideoStreamEncoder::OnFrame(const VideoFrame& video_frame) {
RTC_DCHECK_RUNS_SERIALIZED(&incoming_frame_race_checker_);
VideoFrame incoming_frame = video_frame;
// Local time in webrtc time base.
int64_t current_time_us = clock_->TimeInMicroseconds();
int64_t current_time_ms = current_time_us / rtc::kNumMicrosecsPerMillisec;
// In some cases, e.g., when the frame from decoder is fed to encoder,
// the timestamp may be set to the future. As the encoding pipeline assumes
// capture time to be less than present time, we should reset the capture
// timestamps here. Otherwise there may be issues with RTP send stream.
if (incoming_frame.timestamp_us() > current_time_us)
incoming_frame.set_timestamp_us(current_time_us);
// Capture time may come from clock with an offset and drift from clock_.
int64_t capture_ntp_time_ms;
if (video_frame.ntp_time_ms() > 0) {
capture_ntp_time_ms = video_frame.ntp_time_ms();
} else if (video_frame.render_time_ms() != 0) {
capture_ntp_time_ms = video_frame.render_time_ms() + delta_ntp_internal_ms_;
} else {
capture_ntp_time_ms = current_time_ms + delta_ntp_internal_ms_;
}
incoming_frame.set_ntp_time_ms(capture_ntp_time_ms);
// Convert NTP time, in ms, to RTP timestamp.
const int kMsToRtpTimestamp = 90;
incoming_frame.set_timestamp(
kMsToRtpTimestamp * static_cast<uint32_t>(incoming_frame.ntp_time_ms()));
if (incoming_frame.ntp_time_ms() <= last_captured_timestamp_) {
// We don't allow the same capture time for two frames, drop this one.
RTC_LOG(LS_WARNING) << "Same/old NTP timestamp ("
<< incoming_frame.ntp_time_ms()
<< " <= " << last_captured_timestamp_
<< ") for incoming frame. Dropping.";
return;
}
bool log_stats = false;
if (current_time_ms - last_frame_log_ms_ > kFrameLogIntervalMs) {
last_frame_log_ms_ = current_time_ms;
log_stats = true;
}
last_captured_timestamp_ = incoming_frame.ntp_time_ms();
encoder_queue_.PostTask(std::unique_ptr<rtc::QueuedTask>(new EncodeTask(
incoming_frame, this, rtc::TimeMicros(), log_stats)));
}
EncodeTask的Run定义如下:
bool Run() override {
RTC_DCHECK_RUN_ON(&video_stream_encoder_->encoder_queue_);
video_stream_encoder_->stats_proxy_->OnIncomingFrame(frame_.width(),
frame_.height());
++video_stream_encoder_->captured_frame_count_;
const int posted_frames_waiting_for_encode =
video_stream_encoder_->posted_frames_waiting_for_encode_.fetch_sub(1);
RTC_DCHECK_GT(posted_frames_waiting_for_encode, 0);
if (posted_frames_waiting_for_encode == 1) {
video_stream_encoder_->EncodeVideoFrame(frame_, time_when_posted_us_);
} else {
// There is a newer frame in flight. Do not encode this frame.
RTC_LOG(LS_VERBOSE)
<< "Incoming frame dropped due to that the encoder is blocked.";
++video_stream_encoder_->dropped_frame_count_;
video_stream_encoder_->stats_proxy_->OnFrameDroppedInEncoderQueue();
}
if (log_stats_) {
RTC_LOG(LS_INFO) << "Number of frames: captured "
<< video_stream_encoder_->captured_frame_count_
<< ", dropped (due to encoder blocked) "
<< video_stream_encoder_->dropped_frame_count_
<< ", interval_ms " << kFrameLogIntervalMs;
video_stream_encoder_->captured_frame_count_ = 0;
video_stream_encoder_->dropped_frame_count_ = 0;
}
return true;
}
成员video_stream_encoder_是一个VideoStreamEncoder对象,EncodeVideoFrame函数定义如下:
void VideoStreamEncoder::EncodeVideoFrame(const VideoFrame& video_frame,
int64_t time_when_posted_us) {
RTC_DCHECK_RUN_ON(&encoder_queue_);
if (pre_encode_callback_)
pre_encode_callback_->OnFrame(video_frame);
if (!last_frame_info_ || video_frame.width() != last_frame_info_->width ||
video_frame.height() != last_frame_info_->height ||
video_frame.is_texture() != last_frame_info_->is_texture) {
pending_encoder_reconfiguration_ = true;
last_frame_info_ = rtc::Optional<VideoFrameInfo>(VideoFrameInfo(
video_frame.width(), video_frame.height(), video_frame.is_texture()));
RTC_LOG(LS_INFO) << "Video frame parameters changed: dimensions="
<< last_frame_info_->width << "x"
<< last_frame_info_->height
<< ", texture=" << last_frame_info_->is_texture << ".";
}
if (initial_rampup_ < kMaxInitialFramedrop &&
video_frame.size() >
MaximumFrameSizeForBitrate(encoder_start_bitrate_bps_ / 1000)) {
RTC_LOG(LS_INFO) << "Dropping frame. Too large for target bitrate.";
AdaptDown(kQuality);
++initial_rampup_;
return;
}
initial_rampup_ = kMaxInitialFramedrop;
int64_t now_ms = clock_->TimeInMilliseconds();
if (pending_encoder_reconfiguration_) {
ReconfigureEncoder();
last_parameters_update_ms_.emplace(now_ms);
} else if (!last_parameters_update_ms_ ||
now_ms - *last_parameters_update_ms_ >=
vcm::VCMProcessTimer::kDefaultProcessIntervalMs) {
video_sender_.UpdateChannelParemeters(rate_allocator_.get(),
bitrate_observer_);
last_parameters_update_ms_.emplace(now_ms);
}
if (EncoderPaused()) {
TraceFrameDropStart();
return;
}
TraceFrameDropEnd();
VideoFrame out_frame(video_frame);
// Crop frame if needed.
if (crop_width_ > 0 || crop_height_ > 0) {
int cropped_width = video_frame.width() - crop_width_;
int cropped_height = video_frame.height() - crop_height_;
rtc::scoped_refptr<I420Buffer> cropped_buffer =
I420Buffer::Create(cropped_width, cropped_height);
// TODO(ilnik): Remove scaling if cropping is too big, as it should never
// happen after SinkWants signaled correctly from ReconfigureEncoder.
if (crop_width_ < 4 && crop_height_ < 4) {
cropped_buffer->CropAndScaleFrom(
*video_frame.video_frame_buffer()->ToI420(), crop_width_ / 2,
crop_height_ / 2, cropped_width, cropped_height);
} else {
cropped_buffer->ScaleFrom(
*video_frame.video_frame_buffer()->ToI420().get());
}
out_frame =
VideoFrame(cropped_buffer, video_frame.timestamp(),
video_frame.render_time_ms(), video_frame.rotation());
out_frame.set_ntp_time_ms(video_frame.ntp_time_ms());
}
TRACE_EVENT_ASYNC_STEP0("webrtc", "Video", video_frame.render_time_ms(),
"Encode");
overuse_detector_->FrameCaptured(out_frame, time_when_posted_us);
video_sender_.AddVideoFrame(out_frame, nullptr);
}
有必要的话先裁剪缩放,然后调用VideoSender的AddVideoFrame函数,定义如下:
// Add one raw video frame to the encoder, blocking.
int32_t VideoSender::AddVideoFrame(const VideoFrame& videoFrame,
const CodecSpecificInfo* codecSpecificInfo) {
EncoderParameters encoder_params;
std::vector<FrameType> next_frame_types;
bool encoder_has_internal_source = false;
{
rtc::CritScope lock(¶ms_crit_);
encoder_params = encoder_params_;
next_frame_types = next_frame_types_;
encoder_has_internal_source = encoder_has_internal_source_;
}
rtc::CritScope lock(&encoder_crit_);
if (_encoder == nullptr)
return VCM_UNINITIALIZED;
SetEncoderParameters(encoder_params, encoder_has_internal_source);
if (_mediaOpt.DropFrame()) {
RTC_LOG(LS_VERBOSE) << "Drop Frame "
<< "target bitrate "
<< encoder_params.target_bitrate.get_sum_bps()
<< " loss rate " << encoder_params.loss_rate << " rtt "
<< encoder_params.rtt << " input frame rate "
<< encoder_params.input_frame_rate;
post_encode_callback_->OnDroppedFrame(
EncodedImageCallback::DropReason::kDroppedByMediaOptimizations);
return VCM_OK;
}
// TODO(pbos): Make sure setting send codec is synchronized with video
// processing so frame size always matches.
if (!_codecDataBase.MatchesCurrentResolution(videoFrame.width(),
videoFrame.height())) {
RTC_LOG(LS_ERROR)
<< "Incoming frame doesn't match set resolution. Dropping.";
return VCM_PARAMETER_ERROR;
}
VideoFrame converted_frame = videoFrame;
const VideoFrameBuffer::Type buffer_type =
converted_frame.video_frame_buffer()->type();
const bool is_buffer_type_supported =
buffer_type == VideoFrameBuffer::Type::kI420 ||
(buffer_type == VideoFrameBuffer::Type::kNative &&
_encoder->SupportsNativeHandle());
if (!is_buffer_type_supported) {
// This module only supports software encoding.
// TODO(pbos): Offload conversion from the encoder thread.
rtc::scoped_refptr<I420BufferInterface> converted_buffer(
converted_frame.video_frame_buffer()->ToI420());
if (!converted_buffer) {
RTC_LOG(LS_ERROR) << "Frame conversion failed, dropping frame.";
return VCM_PARAMETER_ERROR;
}
converted_frame = VideoFrame(converted_buffer,
converted_frame.timestamp(),
converted_frame.render_time_ms(),
converted_frame.rotation());
}
int32_t ret =
_encoder->Encode(converted_frame, codecSpecificInfo, next_frame_types);
if (ret < 0) {
RTC_LOG(LS_ERROR) << "Failed to encode frame. Error code: " << ret;
return ret;
}
{
rtc::CritScope lock(¶ms_crit_);
// Change all keyframe requests to encode delta frames the next time.
for (size_t i = 0; i < next_frame_types_.size(); ++i) {
// Check for equality (same requested as before encoding) to not
// accidentally drop a keyframe request while encoding.
if (next_frame_types[i] == next_frame_types_[i])
next_frame_types_[i] = kVideoFrameDelta;
}
}
return VCM_OK;
}
成员_encoder是一个VCMGenericEncoder对象,Encode函数定义如下:
int32_t VCMGenericEncoder::Encode(const VideoFrame& frame,
const CodecSpecificInfo* codec_specific,
const std::vector<FrameType>& frame_types) {
RTC_DCHECK_RUNS_SERIALIZED(&race_checker_);
TRACE_EVENT1("webrtc", "VCMGenericEncoder::Encode", "timestamp",
frame.timestamp());
for (FrameType frame_type : frame_types)
RTC_DCHECK(frame_type == kVideoFrameKey || frame_type == kVideoFrameDelta);
for (size_t i = 0; i < streams_or_svc_num_; ++i)
vcm_encoded_frame_callback_->OnEncodeStarted(frame.timestamp(),
frame.render_time_ms(), i);
return encoder_->Encode(frame, codec_specific, &frame_types);
}
考虑encoder_为MediaCodecVideoEncoder的情况,Encode函数定义如下:
int32_t MediaCodecVideoEncoder::Encode(
const VideoFrame& frame,
const CodecSpecificInfo* /* codec_specific_info */,
const std::vector<FrameType>* frame_types) {
RTC_DCHECK_CALLED_SEQUENTIALLY(&encoder_queue_checker_);
if (sw_fallback_required_)
return WEBRTC_VIDEO_CODEC_FALLBACK_SOFTWARE;
JNIEnv* jni = AttachCurrentThreadIfNeeded();
ScopedLocalRefFrame local_ref_frame(jni);
const int64_t frame_input_time_ms = rtc::TimeMillis();
if (!inited_) {
return WEBRTC_VIDEO_CODEC_UNINITIALIZED;
}
bool send_key_frame = false;
if (codec_mode_ == kRealtimeVideo) {
++frames_received_since_last_key_;
int64_t now_ms = rtc::TimeMillis();
if (last_frame_received_ms_ != -1 &&
(now_ms - last_frame_received_ms_) > kFrameDiffThresholdMs) {
// Add limit to prevent triggering a key for every frame for very low
// framerates (e.g. if frame diff > kFrameDiffThresholdMs).
if (frames_received_since_last_key_ > kMinKeyFrameInterval) {
ALOGD << "Send key, frame diff: " << (now_ms - last_frame_received_ms_);
send_key_frame = true;
}
frames_received_since_last_key_ = 0;
}
last_frame_received_ms_ = now_ms;
}
frames_received_++;
if (!DeliverPendingOutputs(jni)) {
if (!ProcessHWError(true /* reset_if_fallback_unavailable */)) {
return sw_fallback_required_ ? WEBRTC_VIDEO_CODEC_FALLBACK_SOFTWARE
: WEBRTC_VIDEO_CODEC_ERROR;
}
}
if (frames_encoded_ < kMaxEncodedLogFrames) {
ALOGD << "Encoder frame in # " << (frames_received_ - 1)
<< ". TS: " << static_cast<int>(current_timestamp_us_ / 1000)
<< ". Q: " << input_frame_infos_.size() << ". Fps: " << last_set_fps_
<< ". Kbps: " << last_set_bitrate_kbps_;
}
if (drop_next_input_frame_) {
ALOGW << "Encoder drop frame - failed callback.";
drop_next_input_frame_ = false;
current_timestamp_us_ += rtc::kNumMicrosecsPerSec / last_set_fps_;
frames_dropped_media_encoder_++;
return WEBRTC_VIDEO_CODEC_OK;
}
RTC_CHECK(frame_types->size() == 1) << "Unexpected stream count";
// Check if we accumulated too many frames in encoder input buffers and drop
// frame if so.
if (input_frame_infos_.size() > MAX_ENCODER_Q_SIZE) {
ALOGD << "Already " << input_frame_infos_.size()
<< " frames in the queue, dropping"
<< ". TS: " << static_cast<int>(current_timestamp_us_ / 1000)
<< ". Fps: " << last_set_fps_
<< ". Consecutive drops: " << consecutive_full_queue_frame_drops_;
current_timestamp_us_ += rtc::kNumMicrosecsPerSec / last_set_fps_;
consecutive_full_queue_frame_drops_++;
if (consecutive_full_queue_frame_drops_ >=
ENCODER_STALL_FRAMEDROP_THRESHOLD) {
ALOGE << "Encoder got stuck.";
return ProcessHWErrorOnEncode();
}
frames_dropped_media_encoder_++;
return WEBRTC_VIDEO_CODEC_OK;
}
consecutive_full_queue_frame_drops_ = 0;
rtc::scoped_refptr<VideoFrameBuffer> input_buffer(frame.video_frame_buffer());
VideoFrame input_frame(input_buffer, frame.timestamp(),
frame.render_time_ms(), frame.rotation());
if (!MaybeReconfigureEncoder(jni, input_frame)) {
ALOGE << "Failed to reconfigure encoder.";
return WEBRTC_VIDEO_CODEC_ERROR;
}
const bool key_frame =
frame_types->front() != kVideoFrameDelta || send_key_frame;
bool encode_status = true;
int j_input_buffer_index = -1;
if (!use_surface_) {
j_input_buffer_index = Java_MediaCodecVideoEncoder_dequeueInputBuffer(
jni, j_media_codec_video_encoder_);
if (CheckException(jni)) {
ALOGE << "Exception in dequeu input buffer.";
return ProcessHWErrorOnEncode();
}
if (j_input_buffer_index == -1) {
// Video codec falls behind - no input buffer available.
ALOGW << "Encoder drop frame - no input buffers available";
if (frames_received_ > 1) {
current_timestamp_us_ += rtc::kNumMicrosecsPerSec / last_set_fps_;
frames_dropped_media_encoder_++;
} else {
// Input buffers are not ready after codec initialization, HW is still
// allocating thme - this is expected and should not result in drop
// frame report.
frames_received_ = 0;
}
return WEBRTC_VIDEO_CODEC_OK; // TODO(fischman): see webrtc bug 2887.
} else if (j_input_buffer_index == -2) {
return ProcessHWErrorOnEncode();
}
}
if (input_frame.video_frame_buffer()->type() !=
VideoFrameBuffer::Type::kNative) {
encode_status =
EncodeByteBuffer(jni, key_frame, input_frame, j_input_buffer_index);
} else {
AndroidVideoFrameBuffer* android_buffer =
static_cast<AndroidVideoFrameBuffer*>(
input_frame.video_frame_buffer().get());
switch (android_buffer->android_type()) {
case AndroidVideoFrameBuffer::AndroidType::kTextureBuffer:
encode_status = EncodeTexture(jni, key_frame, input_frame);
break;
case AndroidVideoFrameBuffer::AndroidType::kJavaBuffer:
encode_status =
EncodeJavaFrame(jni, key_frame, NativeToJavaFrame(jni, input_frame),
j_input_buffer_index);
break;
default:
RTC_NOTREACHED();
return WEBRTC_VIDEO_CODEC_ERROR;
}
}
if (!encode_status) {
ALOGE << "Failed encode frame with timestamp: " << input_frame.timestamp();
return ProcessHWErrorOnEncode();
}
// Save input image timestamps for later output.
input_frame_infos_.emplace_back(frame_input_time_ms, input_frame.timestamp(),
input_frame.render_time_ms(),
input_frame.rotation());
last_input_timestamp_ms_ =
current_timestamp_us_ / rtc::kNumMicrosecsPerMillisec;
current_timestamp_us_ += rtc::kNumMicrosecsPerSec / last_set_fps_;
// Start the polling loop if it is not started.
if (encode_task_) {
rtc::TaskQueue::Current()->PostDelayedTask(std::move(encode_task_),
kMediaCodecPollMs);
}
if (!DeliverPendingOutputs(jni)) {
return ProcessHWErrorOnEncode();
}
return WEBRTC_VIDEO_CODEC_OK;
}
考虑use_surface_为true的情况,调用java层MediaCodecVideoEncoder的encodeTexture函数,定义如下:
@CalledByNativeUnchecked
boolean encodeTexture(boolean isKeyframe, int oesTextureId, float[] transformationMatrix,
long presentationTimestampUs) {
checkOnMediaCodecThread();
try {
checkKeyFrameRequired(isKeyframe, presentationTimestampUs);
eglBase.makeCurrent();
// TODO(perkj): glClear() shouldn't be necessary since every pixel is covered anyway,
// but it's a workaround for bug webrtc:5147.
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
drawer.drawOes(oesTextureId, transformationMatrix, width, height, 0, 0, width, height);
eglBase.swapBuffers(TimeUnit.MICROSECONDS.toNanos(presentationTimestampUs));
return true;
} catch (RuntimeException e) {
Logging.e(TAG, "encodeTexture failed", e);
return false;
}
}
通过opengl方式往MediaCodec的输入Surface绘制,将图像数据送到OMX进行编码。
然后调用DeliverPendingOutputs函数,定义如下:
bool MediaCodecVideoEncoder::DeliverPendingOutputs(JNIEnv* jni) {
RTC_DCHECK_CALLED_SEQUENTIALLY(&encoder_queue_checker_);
while (true) {
ScopedJavaLocalRef<jobject> j_output_buffer_info =
Java_MediaCodecVideoEncoder_dequeueOutputBuffer(
jni, j_media_codec_video_encoder_);
if (CheckException(jni)) {
ALOGE << "Exception in set dequeue output buffer.";
ProcessHWError(true /* reset_if_fallback_unavailable */);
return WEBRTC_VIDEO_CODEC_ERROR;
}
if (IsNull(jni, j_output_buffer_info)) {
break;
}
int output_buffer_index =
Java_OutputBufferInfo_getIndex(jni, j_output_buffer_info);
if (output_buffer_index == -1) {
ProcessHWError(true /* reset_if_fallback_unavailable */);
return false;
}
// Get key and config frame flags.
ScopedJavaLocalRef<jobject> j_output_buffer =
Java_OutputBufferInfo_getBuffer(jni, j_output_buffer_info);
bool key_frame =
Java_OutputBufferInfo_isKeyFrame(jni, j_output_buffer_info);
// Get frame timestamps from a queue - for non config frames only.
int64_t encoding_start_time_ms = 0;
int64_t frame_encoding_time_ms = 0;
last_output_timestamp_ms_ =
Java_OutputBufferInfo_getPresentationTimestampUs(jni,
j_output_buffer_info) /
rtc::kNumMicrosecsPerMillisec;
if (!input_frame_infos_.empty()) {
const InputFrameInfo& frame_info = input_frame_infos_.front();
output_timestamp_ = frame_info.frame_timestamp;
output_render_time_ms_ = frame_info.frame_render_time_ms;
output_rotation_ = frame_info.rotation;
encoding_start_time_ms = frame_info.encode_start_time;
input_frame_infos_.pop_front();
}
// Extract payload.
size_t payload_size = jni->GetDirectBufferCapacity(j_output_buffer.obj());
uint8_t* payload = reinterpret_cast<uint8_t*>(
jni->GetDirectBufferAddress(j_output_buffer.obj()));
if (CheckException(jni)) {
ALOGE << "Exception in get direct buffer address.";
ProcessHWError(true /* reset_if_fallback_unavailable */);
return WEBRTC_VIDEO_CODEC_ERROR;
}
// Callback - return encoded frame.
const VideoCodecType codec_type = GetCodecType();
EncodedImageCallback::Result callback_result(
EncodedImageCallback::Result::OK);
if (callback_) {
std::unique_ptr<EncodedImage> image(
new EncodedImage(payload, payload_size, payload_size));
image->_encodedWidth = width_;
image->_encodedHeight = height_;
image->_timeStamp = output_timestamp_;
image->capture_time_ms_ = output_render_time_ms_;
image->rotation_ = output_rotation_;
image->content_type_ = (codec_mode_ == VideoCodecMode::kScreensharing)
? VideoContentType::SCREENSHARE
: VideoContentType::UNSPECIFIED;
image->timing_.flags = TimingFrameFlags::kInvalid;
image->_frameType = (key_frame ? kVideoFrameKey : kVideoFrameDelta);
image->_completeFrame = true;
CodecSpecificInfo info;
memset(&info, 0, sizeof(info));
info.codecType = codec_type;
if (codec_type == kVideoCodecVP8) {
info.codecSpecific.VP8.pictureId = picture_id_;
info.codecSpecific.VP8.nonReference = false;
info.codecSpecific.VP8.simulcastIdx = 0;
info.codecSpecific.VP8.temporalIdx = kNoTemporalIdx;
info.codecSpecific.VP8.layerSync = false;
info.codecSpecific.VP8.tl0PicIdx = kNoTl0PicIdx;
info.codecSpecific.VP8.keyIdx = kNoKeyIdx;
} else if (codec_type == kVideoCodecVP9) {
if (key_frame) {
gof_idx_ = 0;
}
info.codecSpecific.VP9.picture_id = picture_id_;
info.codecSpecific.VP9.inter_pic_predicted = key_frame ? false : true;
info.codecSpecific.VP9.flexible_mode = false;
info.codecSpecific.VP9.ss_data_available = key_frame ? true : false;
info.codecSpecific.VP9.tl0_pic_idx = tl0_pic_idx_++;
info.codecSpecific.VP9.temporal_idx = kNoTemporalIdx;
info.codecSpecific.VP9.spatial_idx = kNoSpatialIdx;
info.codecSpecific.VP9.temporal_up_switch = true;
info.codecSpecific.VP9.inter_layer_predicted = false;
info.codecSpecific.VP9.gof_idx =
static_cast<uint8_t>(gof_idx_++ % gof_.num_frames_in_gof);
info.codecSpecific.VP9.num_spatial_layers = 1;
info.codecSpecific.VP9.spatial_layer_resolution_present = false;
if (info.codecSpecific.VP9.ss_data_available) {
info.codecSpecific.VP9.spatial_layer_resolution_present = true;
info.codecSpecific.VP9.width[0] = width_;
info.codecSpecific.VP9.height[0] = height_;
info.codecSpecific.VP9.gof.CopyGofInfoVP9(gof_);
}
}
picture_id_ = (picture_id_ + 1) & 0x7FFF;
// Generate a header describing a single fragment.
RTPFragmentationHeader header;
memset(&header, 0, sizeof(header));
if (codec_type == kVideoCodecVP8 || codec_type == kVideoCodecVP9) {
header.VerifyAndAllocateFragmentationHeader(1);
header.fragmentationOffset[0] = 0;
header.fragmentationLength[0] = image->_length;
header.fragmentationPlType[0] = 0;
header.fragmentationTimeDiff[0] = 0;
if (codec_type == kVideoCodecVP8) {
int qp;
if (vp8::GetQp(payload, payload_size, &qp)) {
current_acc_qp_ += qp;
image->qp_ = qp;
}
} else if (codec_type == kVideoCodecVP9) {
int qp;
if (vp9::GetQp(payload, payload_size, &qp)) {
current_acc_qp_ += qp;
image->qp_ = qp;
}
}
} else if (codec_type == kVideoCodecH264) {
h264_bitstream_parser_.ParseBitstream(payload, payload_size);
int qp;
if (h264_bitstream_parser_.GetLastSliceQp(&qp)) {
current_acc_qp_ += qp;
image->qp_ = qp;
}
// For H.264 search for start codes.
const std::vector<H264::NaluIndex> nalu_idxs =
H264::FindNaluIndices(payload, payload_size);
if (nalu_idxs.empty()) {
ALOGE << "Start code is not found!";
ALOGE << "Data:" << image->_buffer[0] << " " << image->_buffer[1]
<< " " << image->_buffer[2] << " " << image->_buffer[3]
<< " " << image->_buffer[4] << " " << image->_buffer[5];
ProcessHWError(true /* reset_if_fallback_unavailable */);
return false;
}
header.VerifyAndAllocateFragmentationHeader(nalu_idxs.size());
for (size_t i = 0; i < nalu_idxs.size(); i++) {
header.fragmentationOffset[i] = nalu_idxs[i].payload_start_offset;
header.fragmentationLength[i] = nalu_idxs[i].payload_size;
header.fragmentationPlType[i] = 0;
header.fragmentationTimeDiff[i] = 0;
}
}
callback_result = callback_->OnEncodedImage(*image, &info, &header);
}
// Return output buffer back to the encoder.
bool success = Java_MediaCodecVideoEncoder_releaseOutputBuffer(
jni, j_media_codec_video_encoder_, output_buffer_index);
if (CheckException(jni) || !success) {
ProcessHWError(true /* reset_if_fallback_unavailable */);
return false;
}
// Print per frame statistics.
if (encoding_start_time_ms > 0) {
frame_encoding_time_ms = rtc::TimeMillis() - encoding_start_time_ms;
}
if (frames_encoded_ < kMaxEncodedLogFrames) {
int current_latency = static_cast<int>(last_input_timestamp_ms_ -
last_output_timestamp_ms_);
ALOGD << "Encoder frame out # " << frames_encoded_
<< ". Key: " << key_frame << ". Size: " << payload_size
<< ". TS: " << static_cast<int>(last_output_timestamp_ms_)
<< ". Latency: " << current_latency
<< ". EncTime: " << frame_encoding_time_ms;
}
// Calculate and print encoding statistics - every 3 seconds.
frames_encoded_++;
current_frames_++;
current_bytes_ += payload_size;
current_encoding_time_ms_ += frame_encoding_time_ms;
LogStatistics(false);
// Errors in callback_result are currently ignored.
if (callback_result.drop_next_frame)
drop_next_input_frame_ = true;
}
return true;
}
DeliverPendingOutputs主要流程如下:
调用java层MediaCodecVideoEncoder的dequeueOutputBuffer函数从编码器取出数据,封装成OutputBufferInfo。
转换OutputBufferInfo为EncodedImage。
回调callback_的OnEncodedImage来分发EncodedImage,callback_成员是一个VCMEncodedFrameCallback对象,通过其OnEncodedImage最终将EncodedImage传给VideoSendStreamImpl,VideoSendStreamImpl的OnEncodedImage函数定义如下:
EncodedImageCallback::Result VideoSendStreamImpl::OnEncodedImage(
const EncodedImage& encoded_image,
const CodecSpecificInfo* codec_specific_info,
const RTPFragmentationHeader* fragmentation) {
// Encoded is called on whatever thread the real encoder implementation run
// on. In the case of hardware encoders, there might be several encoders
// running in parallel on different threads.
size_t simulcast_idx = 0;
if (codec_specific_info->codecType == kVideoCodecVP8) {
simulcast_idx = codec_specific_info->codecSpecific.VP8.simulcastIdx;
}
if (config_->post_encode_callback) {
config_->post_encode_callback->EncodedFrameCallback(EncodedFrame(
encoded_image._buffer, encoded_image._length, encoded_image._frameType,
simulcast_idx, encoded_image._timeStamp));
}
{
rtc::CritScope lock(&encoder_activity_crit_sect_);
if (check_encoder_activity_task_)
check_encoder_activity_task_->UpdateEncoderActivity();
}
protection_bitrate_calculator_.UpdateWithEncodedData(encoded_image);
EncodedImageCallback::Result result = payload_router_.OnEncodedImage(
encoded_image, codec_specific_info, fragmentation);
RTC_DCHECK(codec_specific_info);
int layer = codec_specific_info->codecType == kVideoCodecVP8
? codec_specific_info->codecSpecific.VP8.simulcastIdx
: 0;
{
rtc::CritScope lock(&ivf_writers_crit_);
if (file_writers_[layer].get()) {
bool ok = file_writers_[layer]->WriteFrame(
encoded_image, codec_specific_info->codecType);
RTC_DCHECK(ok);
}
}
return result;
}
payload_router_是一个PayloadRouter对象,在这里完成后续的RTP打包和传输的工作。
编码控制
编码流控
MediaCodec 流控相关的接口并不多,一是配置时设置目标码率和码率控制模式,二是动态调整目标码率(API 19+)。
配置时指定目标码率和码率控制模式:
mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, bitRate);
mediaFormat.setInteger(MediaFormat.KEY_BITRATE_MODE,
MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);
// 其他配置
mVideoCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
码率控制模式在 MediaCodecInfo.EncoderCapabilities
类中定义了三种,在 framework 层有另一套名字和它们的值一一对应:
- CQ 对应于
OMX_Video_ControlRateDisable
,它表示完全不控制码率,尽最大可能保证图像质量; - CBR 对应于
OMX_Video_ControlRateConstant
,它表示编码器会尽量把输出码率控制为设定值,即我们前面提到的“不为所动”; - VBR 对应于
OMX_Video_ControlRateVariable
,它表示编码器会根据图像内容的复杂度(实际上是帧间变化量的大小)来动态调整输出码率,图像复杂则码率高,图像简单则码率低;
动态调整目标码率:
Bundle param = new Bundle();
param.putInt(MediaCodec.PARAMETER_KEY_VIDEO_BITRATE, bitrate);
mediaCodec.setParameters(param);
API 是很简单,但我们究竟该用哪种模式?
1.对于质量要求高、不在乎带宽(例如本地存文件)、解码器支持码率剧烈波动的情况,显然 CQ 是不二之选;
2.VBR 输出码率会在一定范围内波动,对于小幅晃动,方块效应会有所改善,但对剧烈晃动仍无能为力,而连续调低码率则会导致码率急剧下降,如果无法接受这个问题,那 VBR 就不是好的选择;
3.WebRTC 使用的是 CBR,稳定可控是 CBR 的优点,一旦稳定可控,那我们就可以自己实现比较可靠的控制了;
4.方块效应优化 VBR 的码率存在一个波动范围,因此使用 VBR 可以在一定程度上优化方块效应,但对于视频内容的剧烈变化,VBR 就只能望洋兴叹了。
WebRTC 的做法是,获取每个输出帧的 QP 值,如果 QP 值过大,就说明图像复杂度太高,如果 QP 值持续超过上界,那就重启编码器,用更低的输出分辨率来编码;如果 QP 值过低,则说明图像复杂度太低,如果 QP 值持续低于下界,也会重启编码器,用更高的输出分辨率来编码。关于 QP 值的获取,可以查看 WebRTC 相关代码:master/webrtc/common_video/h264/h264_bitstream_parser.cc。
硬编码与软编码
视频编码有硬编码和软编码,在android平台上,MediaCodec封装了硬编码和软编码,对于软编码,也可以直接使用其他主流开源编码库,比如H264编码标准有libx264、libopenh264,vp8/vp9编码标准有libvpx。
webrtc同时支持硬编码和软编码,在android平台上,硬编码使用的是MediaCodec,是在java层封装调用的,软编码H264使用的是libopenh264,vp8/vp9使用的是libvpx,都是在native层封装调用的。webrtc定义了如下编码相关的主要类:
java层主要类如下所示:
native层主要类如下所示:
在java层和native层都定义了VideoEncoderFactory接口和VideoEncoder和接口,VideoEncoderFactory创建VideoEncoder,VideoEncoder实现视频编码功能。从实现的角度来说,可以分为以下几类:
- 基于android系统提供的MediaCodec实现的HardwareVideoEncoder和MediaCodecVideoEncoder,支持H264、VP8、VP9编码。
- 基于libopenh264实现的H264Encoder,支持H264编码。
- 基于libvpx实现的VP8Encoder和VP9Encoder,分别支持VP8、VP9编码。
其他类只是包装类,比如native层VideoEncoderWrapper可以用于包装java层HardwareVideoEncoder,因为编码操作都是从native层调用的,借助VideoEncoderWrapper就可以在native层统一接口了。同理native层的VideoEncoderFactoryWrapper可以用于包装java层VideoEncoderFactory对象。
以java层为例,VideoEncoderFactory定义如下:
/** Factory for creating VideoEncoders. */
public interface VideoEncoderFactory {
/** Creates an encoder for the given video codec. */
@CalledByNative VideoEncoder createEncoder(VideoCodecInfo info);
/**
* Enumerates the list of supported video codecs. This method will only be called once and the
* result will be cached.
*/
@CalledByNative VideoCodecInfo[] getSupportedCodecs();
}
createEncoder根据VideoCodecInfo创建对应类型的编码器,getSupportedCodecs获取支持的编码器类型,这里的类型指的是编码标准,比如H264、VP8、VP9等。获取的信息用于生成sdp信息用于协商会话使用的编码器类型。实际定义了HardwareVideoEncoderFactory和SoftwareVideoEncoderFactory两种,DefaultVideoEncoderFactory只是对它们的封装。特定的VideoEncoderFactory创建特定的VideoEncoder,比如HardwareVideoEncoderFactory创建的是HardwareVideoEncoder,SoftwareVideoEncoderFactory创建的是VP8Encoder或者VP9Encoder,由info参数决定。
VideoEncoder主要定义如下:
/**
* Initializes the encoding process. Call before any calls to encode.
*/
@CalledByNative VideoCodecStatus initEncode(Settings settings, Callback encodeCallback);
/**
* Releases the encoder. No more calls to encode will be made after this call.
*/
@CalledByNative VideoCodecStatus release();
/**
* Requests the encoder to encode a frame.
*/
@CalledByNative VideoCodecStatus encode(VideoFrame frame, EncodeInfo info);
/**
* Informs the encoder of the packet loss and the round-trip time of the network.
*
* @param packetLoss How many packets are lost on average per 255 packets.
* @param roundTripTimeMs Round-trip time of the network in milliseconds.
*/
@CalledByNative VideoCodecStatus setChannelParameters(short packetLoss, long roundTripTimeMs);
/** Sets the bitrate allocation and the target framerate for the encoder. */
@CalledByNative VideoCodecStatus setRateAllocation(BitrateAllocation allocation, int framerate);
/** Any encoder that wants to use WebRTC provided quality scaler must implement this method. */
@CalledByNative ScalingSettings getScalingSettings();
/**
* Should return a descriptive name for the implementation. Gets called once and cached. May be
* called from arbitrary thread.
*/
@CalledByNative String getImplementationName();
native层的EncoderAdapter的internal_encoder_factory_成员是一个InternalEncoderFactory对象,external_encoder_factory_成员是一个CricketToWebRtcEncoderFactory对象,CricketToWebRtcEncoderFactory的external_encoder_factory_成员是一个MediaCodecVideoEncoderFactory对象,VideoEncoderFactoryWrapper的encoder_factory_成员是一个java层的VideoEncoderFactory对象,VideoEncoderWrapper的encoder_成员是一个java层的VideoEncoder对象。
上面定义了这么多种VideoEncoderFactory和VideoEncoder,实际使用的是哪一种呢?实际使用哪一种跟调用webrtc api传递的参数、硬件平台以及android系统版本相关。
参数主要是PeerConnectionFactory相关的,比如:
public static void initializeFieldTrials(String fieldTrialsInitString) {
nativeInitializeFieldTrials(fieldTrialsInitString);
}
fieldTrialsInitString的值会影响VideoEncoderSoftwareFallbackWrapper的行为。
还有就是给PeerConnectionFactory构造函数传的encoderFactory的值。
相关的代码如下所示:
public PeerConnectionFactory(
Options options, VideoEncoderFactory encoderFactory, VideoDecoderFactory decoderFactory) {
checkInitializeHasBeenCalled();
nativeFactory = nativeCreatePeerConnectionFactory(options, encoderFactory, decoderFactory);
if (nativeFactory == 0) {
throw new RuntimeException("Failed to initialize PeerConnectionFactory!");
}
}
jlong CreatePeerConnectionFactoryForJava(
JNIEnv* jni,
const JavaParamRef<jobject>& joptions,
const JavaParamRef<jobject>& jencoder_factory,
const JavaParamRef<jobject>& jdecoder_factory,
rtc::scoped_refptr<AudioProcessing> audio_processor) {
cricket::WebRtcVideoEncoderFactory* legacy_video_encoder_factory = nullptr;
cricket::WebRtcVideoDecoderFactory* legacy_video_decoder_factory = nullptr;
std::unique_ptr<cricket::MediaEngineInterface> media_engine;
if (jencoder_factory.is_null() && jdecoder_factory.is_null()) {
// This uses the legacy API, which automatically uses the internal SW
// codecs in WebRTC.
if (video_hw_acceleration_enabled) {
legacy_video_encoder_factory = CreateLegacyVideoEncoderFactory();
legacy_video_decoder_factory = CreateLegacyVideoDecoderFactory();
}
media_engine.reset(CreateMediaEngine(
adm, audio_encoder_factory, audio_decoder_factory,
legacy_video_encoder_factory, legacy_video_decoder_factory, audio_mixer,
audio_processor));
} else {
// This uses the new API, does not automatically include software codecs.
std::unique_ptr<VideoEncoderFactory> video_encoder_factory = nullptr;
if (jencoder_factory.is_null()) {
legacy_video_encoder_factory = CreateLegacyVideoEncoderFactory();
video_encoder_factory = std::unique_ptr<VideoEncoderFactory>(
WrapLegacyVideoEncoderFactory(legacy_video_encoder_factory));
} else {
video_encoder_factory = std::unique_ptr<VideoEncoderFactory>(
CreateVideoEncoderFactory(jni, jencoder_factory));
}
std::unique_ptr<VideoDecoderFactory> video_decoder_factory = nullptr;
if (jdecoder_factory.is_null()) {
legacy_video_decoder_factory = CreateLegacyVideoDecoderFactory();
video_decoder_factory = std::unique_ptr<VideoDecoderFactory>(
WrapLegacyVideoDecoderFactory(legacy_video_decoder_factory));
} else {
video_decoder_factory = std::unique_ptr<VideoDecoderFactory>(
CreateVideoDecoderFactory(jni, jdecoder_factory));
}
rtc::scoped_refptr<AudioDeviceModule> adm_scoped = nullptr;
media_engine.reset(CreateMediaEngine(
adm_scoped, audio_encoder_factory, audio_decoder_factory,
std::move(video_encoder_factory), std::move(video_decoder_factory),
audio_mixer, audio_processor));
}
rtc::scoped_refptr<PeerConnectionFactoryInterface> factory(
CreateModularPeerConnectionFactory(
network_thread.get(), worker_thread.get(), signaling_thread.get(),
std::move(media_engine), std::move(call_factory),
std::move(rtc_event_log_factory)));
RTC_CHECK(factory) << "Failed to create the peer connection factory; "
<< "WebRTC/libjingle init likely failed on this device";
// TODO(honghaiz): Maybe put the options as the argument of
// CreatePeerConnectionFactory.
if (has_options) {
factory->SetOptions(options);
}
OwnedFactoryAndThreads* owned_factory = new OwnedFactoryAndThreads(
std::move(network_thread), std::move(worker_thread),
std::move(signaling_thread), legacy_video_encoder_factory,
legacy_video_decoder_factory, network_monitor_factory, factory.release());
owned_factory->InvokeJavaCallbacksOnFactoryThreads();
return jlongFromPointer(owned_factory);
}
可见传给PeerConnectionFactory构造函数encoderFactory参数的值直接影响使用哪一个VideoEncoderFactory,也就直接影响使用哪一个VideoEncoder。demo中调用如下:
if (peerConnectionParameters.videoCodecHwAcceleration) {
encoderFactory = new DefaultVideoEncoderFactory(
rootEglBase.getEglBaseContext(), true /* enableIntelVp8Encoder */, enableH264HighProfile);
decoderFactory = new DefaultVideoDecoderFactory(rootEglBase.getEglBaseContext());
} else {
encoderFactory = new SoftwareVideoEncoderFactory();
decoderFactory = new SoftwareVideoDecoderFactory();
}
factory = new PeerConnectionFactory(options, encoderFactory, decoderFactory);
可见使能hardware acceleration时,使用的是DefaultVideoEncoderFactory,而DefaultVideoEncoderFactory createEncoder时会先后尝试HardwareVideoEncoderFactory和SoftwareVideoEncoderFactory两种,如下所示:
public VideoEncoder createEncoder(VideoCodecInfo info) {
final VideoEncoder videoEncoder = hardwareVideoEncoderFactory.createEncoder(info);
if (videoEncoder != null) {
return videoEncoder;
}
return softwareVideoEncoderFactory.createEncoder(info);
}
也就是无法使用硬编码的情况下使用软编码,而是否能使用硬编码取决于硬件平台以及android系统版本是否支持对应的编码标准,如下所示:
public VideoEncoder createEncoder(VideoCodecInfo input) {
VideoCodecType type = VideoCodecType.valueOf(input.name);
MediaCodecInfo info = findCodecForType(type);
if (info == null) {
// No hardware support for this type.
// TODO(andersc): This is for backwards compatibility. Remove when clients have migrated to
// new DefaultVideoEncoderFactory.
if (fallbackToSoftware) {
SoftwareVideoEncoderFactory softwareVideoEncoderFactory = new SoftwareVideoEncoderFactory();
return softwareVideoEncoderFactory.createEncoder(input);
} else {
return null;
}
}
String codecName = info.getName();
String mime = type.mimeType();
Integer surfaceColorFormat = MediaCodecUtils.selectColorFormat(
MediaCodecUtils.TEXTURE_COLOR_FORMATS, info.getCapabilitiesForType(mime));
Integer yuvColorFormat = MediaCodecUtils.selectColorFormat(
MediaCodecUtils.ENCODER_COLOR_FORMATS, info.getCapabilitiesForType(mime));
if (type == VideoCodecType.H264) {
boolean isHighProfile = nativeIsSameH264Profile(input.params, getCodecProperties(type, true))
&& isH264HighProfileSupported(info);
boolean isBaselineProfile =
nativeIsSameH264Profile(input.params, getCodecProperties(type, false));
if (!isHighProfile && !isBaselineProfile) {
return null;
}
}
return new HardwareVideoEncoder(codecName, type, surfaceColorFormat, yuvColorFormat,
input.params, getKeyFrameIntervalSec(type), getForcedKeyFrameIntervalMs(type, codecName),
createBitrateAdjuster(type, codecName), sharedContext);
}
VideoCodecInfo的name包含要使用的编码类型信息,比如H264、VP8、VP9,可以对应一个VideoCodecType,如下所示:
enum VideoCodecType {
VP8("video/x-vnd.on2.vp8"),
VP9("video/x-vnd.on2.vp9"),
H264("video/avc");
private final String mimeType;
private VideoCodecType(String mimeType) {
this.mimeType = mimeType;
}
String mimeType() {
return mimeType;
}
}
调用findCodecForType判断使用的硬件平台以及android系统版本是否支持该编码类型,支持的话就使用HardwareVideoEncoder,否则就会调用SoftwareVideoEncoderFactory创建对应的软解码器,findCodecForType定义如下所示:
private MediaCodecInfo findCodecForType(VideoCodecType type) {
for (int i = 0; i < MediaCodecList.getCodecCount(); ++i) {
MediaCodecInfo info = null;
try {
info = MediaCodecList.getCodecInfoAt(i);
} catch (IllegalArgumentException e) {
Logging.e(TAG, "Cannot retrieve encoder codec info", e);
}
if (info == null || !info.isEncoder()) {
continue;
}
if (isSupportedCodec(info, type)) {
return info;
}
}
return null; // No support for this type.
}
isSupportedCodec定义如下所示:
private boolean isSupportedCodec(MediaCodecInfo info, VideoCodecType type) {
if (!MediaCodecUtils.codecSupportsType(info, type)) {
return false;
}
// Check for a supported color format.
if (MediaCodecUtils.selectColorFormat(
MediaCodecUtils.ENCODER_COLOR_FORMATS, info.getCapabilitiesForType(type.mimeType()))
== null) {
return false;
}
return isHardwareSupportedInCurrentSdk(info, type);
}
先调用codecSupportsType比对mimeType,如下所示:
static boolean codecSupportsType(MediaCodecInfo info, VideoCodecType type) {
for (String mimeType : info.getSupportedTypes()) {
if (type.mimeType().equals(mimeType)) {
return true;
}
}
return false;
}
接着调用isHardwareSupportedInCurrentSdk比对硬件平台以及android系统版本,如下所示:
private boolean isHardwareSupportedInCurrentSdk(MediaCodecInfo info, VideoCodecType type) {
switch (type) {
case VP8:
return isHardwareSupportedInCurrentSdkVp8(info);
case VP9:
return isHardwareSupportedInCurrentSdkVp9(info);
case H264:
return isHardwareSupportedInCurrentSdkH264(info);
}
return false;
}
如果是H264,调用的是isHardwareSupportedInCurrentSdkH264,如下所示:
private boolean isHardwareSupportedInCurrentSdkH264(MediaCodecInfo info) {
// First, H264 hardware might perform poorly on this model.
if (H264_HW_EXCEPTION_MODELS.contains(Build.MODEL)) {
return false;
}
String name = info.getName();
// QCOM H264 encoder is supported in KITKAT or later.
return (name.startsWith(QCOM_PREFIX) && Build.VERSION.SDK_INT >= Build.VERSION_CODES.KITKAT)
// Exynos H264 encoder is supported in LOLLIPOP or later.
|| (name.startsWith(EXYNOS_PREFIX)
&& Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP);
}
可见,在这里完成了硬件平台以及android系统版本的判断。
总的来说,实际使用哪一种编码器跟调用webrtc api传递的参数、硬件平台以及android系统版本相关。
总结
本篇文章主要分析了webrtc视频编码模块流程,这个流程就是创建一系列相关的对象,然后编码器设置好输入输出,VideoStreamEncoder对象负责输入输出的衔接,编码器的输入是通过将VideoStreamEncoder注册到VideoTrack来完成图像数据的接收,此时VideoStreamEncoder是做为一个VideoSinkInterface对象,编码器的输出是通过将VCMEncodedFrameCallback注册到MediaCodecVideoEncoder,再经过VideoStreamEncoder来完成码流数据的打包和传输,这工作是交给VideoSendStreamImpl来完成的,此时VideoStreamEncoder是做为一个EncodedImageCallback对象。