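In live555, video and audio are encoded separately, so how are the two kept in sync?
If the video and audio timestamps can both be kept aligned with NTP time, audio/video synchronization follows.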
Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks.
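How does live555 implement this mechanism?
The overall idea is:
- The RTSP server uses the Sender Report (SR) of the RTCP protocol to send an NTP timestamp to the RTSP client.
- The RTSP client (the receiver of the data) maps the RTP timestamps of the audio and video streams onto the absolute time carried in RTCP (the NTP timestamp), which synchronizes A/V.
This absolute time is the offset of the current time from Jan 1 1900 00:00:00.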
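To make the "offset from Jan 1 1900" concrete, here is a minimal sketch of how a Unix-epoch time could be turned into the 64-bit NTP format that a Sender Report carries. The names `NtpTimestamp`, `unixToNtp`, and `kNtpUnixOffset` are hypothetical and are not part of live555; only the constant 0x83AA7E80 (the number of seconds between 1900 and 1970) also appears in the live555 code quoted later.

```cpp
#include <cassert>
#include <cstdint>

// Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
constexpr std::uint32_t kNtpUnixOffset = 0x83AA7E80u; // 2208988800

// Hypothetical type: the two 32-bit halves of a 64-bit NTP timestamp.
struct NtpTimestamp {
    std::uint32_t msw; // whole seconds since 1900-01-01 00:00:00
    std::uint32_t lsw; // fraction of a second, in units of 1/2^32 s
};

// Hypothetical helper: convert Unix seconds + microseconds to NTP format.
NtpTimestamp unixToNtp(std::uint32_t unixSeconds, std::uint32_t microseconds) {
    assert(microseconds < 1000000u);
    NtpTimestamp ntp;
    ntp.msw = unixSeconds + kNtpUnixOffset;
    // fraction = microseconds * 2^32 / 10^6, done in 64 bits to avoid overflow
    ntp.lsw = static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(microseconds) << 32) / 1000000u);
    return ntp;
}
```

For instance, half a second past the Unix epoch, `unixToNtp(0, 500000)`, gives `msw == 0x83AA7E80` and `lsw == 0x80000000`.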
First, look at the timestamp code before any synchronization is applied:
void RTPReceptionStats::noteIncomingPacket(u_int16_t seqNum,
                                           u_int32_t rtpTimestamp,
                                           unsigned timestampFrequency,
                                           Boolean useForJitterCalculation,
                                           struct timeval& resultPresentationTime,
                                           Boolean& resultHasBeenSyncedUsingRTCP,
                                           unsigned packetSize)
{
    ...
    // Record the inter-packet delay
    struct timeval timeNow;
    gettimeofday(&timeNow, NULL);
    ...
    // Return the 'presentation time' that corresponds to "rtpTimestamp":
    if (fSyncTime.tv_sec == 0 && fSyncTime.tv_usec == 0)
    {
        // This is the first timestamp that we've seen, so use the current
        // 'wall clock' time as the synchronization time. (This will be
        // corrected later when we receive RTCP SRs.)
        fSyncTimestamp = rtpTimestamp; // the first RTP timestamp
        fSyncTime = timeNow;           // use the current system time as the initial reference time
    }
    int timestampDiff = rtpTimestamp - fSyncTimestamp;
    // Note: This works even if the timestamp wraps around
    // (as long as "int" is 32 bits)
    // Divide this by the timestamp frequency to get real time:
    double timeDiff = timestampDiff/(double)timestampFrequency;
    // Add this to the 'sync time' to get our result:
    unsigned const million = 1000000;
    unsigned seconds, uSeconds;
    if (timeDiff >= 0.0)
    {
        // Compute the presentation time
        seconds = fSyncTime.tv_sec + (unsigned)(timeDiff);
        uSeconds = fSyncTime.tv_usec + (unsigned)((timeDiff - (unsigned)timeDiff)*million);
        if (uSeconds >= million)
        {
            uSeconds -= million;
            ++seconds;
        }
    }
    else
    {
        timeDiff = -timeDiff;
        seconds = fSyncTime.tv_sec - (unsigned)(timeDiff);
        uSeconds = fSyncTime.tv_usec - (unsigned)((timeDiff - (unsigned)timeDiff)*million);
        if ((int)uSeconds < 0)
        {
            uSeconds += million;
            --seconds;
        }
    }
    resultPresentationTime.tv_sec = seconds;
    resultPresentationTime.tv_usec = uSeconds;
    resultHasBeenSyncedUsingRTCP = fHasBeenSynchronized;
    // Save these as the new synchronization timestamp & time:
    fSyncTimestamp = rtpTimestamp;
    fSyncTime = resultPresentationTime;
    fPreviousPacketRTPTimestamp = rtpTimestamp;
}
Two member variables matter here: fSyncTimestamp and fSyncTime.
class RTPReceptionStats {
    ...
private:
    // Used to convert from RTP timestamp to 'wall clock' time:
    Boolean fHasBeenSynchronized;
    u_int32_t fSyncTimestamp;
    struct timeval fSyncTime;
};
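- fSyncTimestamp: an RTP timestamp. By default, the rtpTimestamp of packet N becomes the fSyncTimestamp used for packet N+1.
- fSyncTime: a 'wall clock' time. By default, the 'wall clock' time computed for packet N becomes the fSyncTime used for packet N+1.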
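In essence, RTPReceptionStats::noteIncomingPacket converts an RTP timestamp into a 'wall clock' time.
When the first RTP packet arrives, the system time is taken as the first 'wall clock' time.
After that, whenever the RTP timestamp changes, the change is converted into real time: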
int timestampDiff = rtpTimestamp - fSyncTimestamp;
// Divide this by the timestamp frequency to get real time:
double timeDiff = timestampDiff/(double)timestampFrequency;
That change is then applied to the 'wall clock' time, for example:
seconds = fSyncTime.tv_sec + (unsigned)(timeDiff);
uSeconds = fSyncTime.tv_usec + (unsigned)((timeDiff - (unsigned)timeDiff)*million);
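As the logic above shows, the result depends entirely on the accuracy of the receiver's system time, which seeds the first 'wall clock' value; there is no correction mechanism.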
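The conversion described above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the live555 implementation: `WallClock` and `advance` are hypothetical names, and the arithmetic uses 64-bit integer microseconds instead of the `double` used in the quoted code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for struct timeval.
struct WallClock {
    std::int64_t sec;
    std::int32_t usec;
};

// Hypothetical helper: given the previous sync point (syncTime, syncTimestamp),
// return the 'wall clock' time that corresponds to rtpTimestamp.
WallClock advance(WallClock syncTime, std::uint32_t syncTimestamp,
                  std::uint32_t rtpTimestamp, unsigned timestampFrequency) {
    assert(timestampFrequency > 0);
    // Unsigned subtraction, then a signed 32-bit view of the result: this
    // keeps working when the RTP timestamp wraps around (two's complement
    // is assumed, as on all mainstream platforms).
    std::int32_t diff = static_cast<std::int32_t>(rtpTimestamp - syncTimestamp);
    // RTP ticks -> microseconds, added to the previous sync time.
    std::int64_t totalUsec = syncTime.sec * 1000000LL + syncTime.usec
        + static_cast<std::int64_t>(diff) * 1000000LL
              / static_cast<std::int64_t>(timestampFrequency);
    WallClock result;
    result.sec = totalUsec / 1000000LL;
    result.usec = static_cast<std::int32_t>(totalUsec % 1000000LL);
    if (result.usec < 0) { // normalize when the total is negative
        result.usec += 1000000;
        --result.sec;
    }
    return result;
}
```

With a 90 kHz video clock, a step of 45000 ticks from a sync time of 100.9 s yields 101.4 s. Nothing in this calculation refers to the sender's clock, which is why a separate correction step is needed.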
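Where does live555 correct the time?
The RTSP client (the receiver of the data) uses the Sender Report delivered over RTCP: the NTP timestamp and RTP timestamp it carries are used to correct fSyncTimestamp and fSyncTime.
The correction code is: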
void RTPReceptionStats::noteIncomingSR(u_int32_t ntpTimestampMSW,
                                       u_int32_t ntpTimestampLSW,
                                       u_int32_t rtpTimestamp)
{
    fLastReceivedSR_NTPmsw = ntpTimestampMSW;
    fLastReceivedSR_NTPlsw = ntpTimestampLSW;
    gettimeofday(&fLastReceivedSR_time, NULL);
    // Use this SR to update time synchronization information:
    // ntpTimestampMSW : NTP timestamp, most significant word (upper 32 bits of the 64-bit NTP timestamp)
    fSyncTimestamp = rtpTimestamp;
    fSyncTime.tv_sec = ntpTimestampMSW - 0x83AA7E80; // 1/1/1900 -> 1/1/1970
    // ntpTimestampLSW : NTP timestamp, least significant word (lower 32 bits of the 64-bit NTP timestamp)
    double microseconds = (ntpTimestampLSW * 15625.0) / 0x04000000; // 10^6/2^32
    fSyncTime.tv_usec = (unsigned)(microseconds + 0.5);
}
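The NTP-to-Unix arithmetic in that function can be checked in isolation. The sketch below is a hypothetical standalone version (`UnixTime` and `ntpToUnix` are illustrative names, not live555 API). The factor 15625.0 / 0x04000000 equals 10^6 / 2^32, since 10^6 = 15625 × 64 and 2^32 = 0x04000000 × 64. One assumption beyond the quoted code: the sketch carries into the seconds field when rounding produces exactly 1000000 microseconds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for struct timeval.
struct UnixTime {
    std::uint32_t sec;
    std::uint32_t usec;
};

// Hypothetical helper: convert the two halves of an NTP timestamp into
// seconds and microseconds since the Unix epoch.
UnixTime ntpToUnix(std::uint32_t ntpMSW, std::uint32_t ntpLSW) {
    UnixTime t;
    t.sec = ntpMSW - 0x83AA7E80u; // 1/1/1900 -> 1/1/1970
    // The fraction is in units of 1/2^32 s; scale by 10^6 / 2^32.
    double microseconds = (ntpLSW * 15625.0) / 0x04000000;
    t.usec = static_cast<std::uint32_t>(microseconds + 0.5);
    // Assumption added in this sketch (not in the quoted code): rounding a
    // fraction very close to 1 can give 1000000, so carry into seconds.
    if (t.usec >= 1000000u) {
        t.usec -= 1000000u;
        ++t.sec;
    }
    return t;
}
```

For example, `ntpToUnix(0x83AA7E80 + 5, 0x80000000)` gives 5 seconds and 500000 microseconds.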
With Sender Reports correcting the video and audio times promptly and independently, the two streams stay synchronized.
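As an illustration of why per-stream correction is enough, the following sketch maps RTP timestamps from two streams with different clock rates onto one timeline. All names (`StreamSync`, `presentationTimeUsec`) and all numbers are hypothetical; the sync points stand for the (RTP timestamp, NTP time) pairs that each stream would take from its own Sender Report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-stream state, analogous to fSyncTimestamp / fSyncTime.
struct StreamSync {
    unsigned frequency;          // RTP clock rate in Hz
    std::uint32_t syncTimestamp; // RTP timestamp from the latest SR
    std::int64_t syncTimeUsec;   // sender wall-clock time from the same SR, in microseconds
};

// Hypothetical helper: sender wall-clock time (microseconds) for an RTP timestamp.
std::int64_t presentationTimeUsec(const StreamSync& s, std::uint32_t rtpTimestamp) {
    assert(s.frequency > 0);
    // Signed view of the unsigned difference, so wraparound is handled.
    std::int32_t diff = static_cast<std::int32_t>(rtpTimestamp - s.syncTimestamp);
    return s.syncTimeUsec
        + static_cast<std::int64_t>(diff) * 1000000LL
              / static_cast<std::int64_t>(s.frequency);
}
```

Suppose the video stream (90 kHz) was synced at RTP 123456 and time 1000000000 µs, and the audio stream (8 kHz) at RTP 777 and time 1000250000 µs. A video frame at RTP 213456 and an audio packet at RTP 6777 then both map to 1001000000 µs, so a player can present them together even though their raw RTP timestamps are unrelated.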
References:
https://en.wikipedia.org/wiki/Network_Time_Protocol
RFC 3550, RTP: A Transport Protocol for Real-Time Applications