This source-code analysis is based on Android 8.0.
Source directories
Java layer
framework/base/core/java/android/os/MessageQueue.java
framework/base/core/java/android/os/Looper.java
Native layer
system/core/libutils/include/utils/RefBase.h
system/core/libutils/RefBase.cpp
framework/base/core/jni/android_os_MessageQueue.h
framework/base/core/jni/android_os_MessageQueue.cpp
system/core/libutils/include/utils/Looper.h
system/core/libutils/Looper.cpp
framework/native/include/android/looper.h
framework/base/native/android/looper.cpp
Recap
In the previous article we covered the relationship between Handler, Looper and MessageQueue. MessageQueue's next() method contains this snippet:
Message next() {
....
for (;;) {
if (nextPollTimeoutMillis != 0) {
Binder.flushPendingCommands();
}
nativePollOnce(ptr, nextPollTimeoutMillis);
}
...
}
And when a message is enqueued, there is this snippet:
boolean enqueueMessage(Message msg, long when) {
...
synchronized (this) {
...
if (p == null || when == 0 || when < p.when) {
// New head, wake up the event queue if blocked.
msg.next = p;
mMessages = msg;
needWake = mBlocked;
}
...
// We can assume mPtr != 0 because mQuitting is false.
if (needWake) {
nativeWake(mPtr);
}
}
}
Likewise, in Looper:
public static void loop() {
...
for (;;) {
Message msg = queue.next(); // might block
}
...
}
From these three snippets and their comments we can see that enqueueMessage() may run while the loop is blocked (i.e. the queue was previously empty), and next() may block while fetching a message. Why is that, and why doesn't the blocking cause an ANR? The key lies in the two native methods: nativePollOnce and nativeWake.
In early Android versions the wake-up mechanism was a Linux pipe. A pipe is in essence a file, but differs from an ordinary file: its buffer is typically one page (4 KB), and it has a read end and a write end. Reading blocks when the pipe is empty; writing blocks when the buffer is full. Since Android 6.0, the pipe has been replaced by a lighter-weight eventfd, which is what the Android 8.0 code below uses, but the core idea is unchanged: block on a read, wake via a write.
Now let's dive into the native layer.
UML diagram
First, look at the native methods in MessageQueue.java:
MessageQueue.java
private native static long nativeInit();
private native static void nativeDestroy(long ptr);
private native void nativePollOnce(long ptr, int timeoutMillis); /*non-static for callbacks*/
private native static void nativeWake(long ptr);
private native static boolean nativeIsPolling(long ptr);
private native static void nativeSetFileDescriptorEvents(long ptr, int fd, int events);
// constructor
MessageQueue(boolean quitAllowed) {
mQuitAllowed = quitAllowed;
mPtr = nativeInit();
}
The constructor calls nativeInit(); let's enter the native layer and see what it does.
android_os_MessageQueue.cpp
static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
if (!nativeMessageQueue) {
jniThrowRuntimeException(env, "Unable to allocate native queue");
return 0;
}
nativeMessageQueue->incStrong(env);
return reinterpret_cast<jlong>(nativeMessageQueue);
}
It simply creates a NativeMessageQueue object, so let's look at that constructor:
NativeMessageQueue::NativeMessageQueue() :
mPollEnv(NULL), mPollObj(NULL), mExceptionObj(NULL) {
mLooper = Looper::getForThread();
if (mLooper == NULL) {
mLooper = new Looper(false);
Looper::setForThread(mLooper);
}
}
It creates a Looper object. Note that this is the native-layer Looper, not the Java-layer Looper; it essentially re-implements the Java Looper's logic natively.
Looper::Looper(bool allowNonCallbacks) :
mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
mPolling(false), mEpollFd(-1), mEpollRebuildRequired(false),
mNextRequestSeq(0), mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
mWakeEventFd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);//1
LOG_ALWAYS_FATAL_IF(mWakeEventFd < 0, "Could not make wake event fd: %s",
strerror(errno));
AutoMutex _l(mLock);//2
rebuildEpollLocked();//3
}
1. eventfd() creates an event object in the kernel and returns a file descriptor representing it; the Looper later writes to and reads from this fd to wake up and be woken.
2. AutoMutex _l(mLock) acquires mLock and releases it automatically when _l goes out of scope; it relies on C++ constructors and destructors (RAII) to pair the lock and unlock.
3. rebuildEpollLocked() (re)builds the epoll set.
Next, look at rebuildEpollLocked:
void Looper::rebuildEpollLocked() {
// Close old epoll instance if we have one.
if (mEpollFd >= 0) {
#if DEBUG_CALLBACKS
ALOGD("%p ~ rebuildEpollLocked - rebuilding epoll set", this);
#endif
//close the old epoll instance
close(mEpollFd);
}
// Allocate the new epoll instance and register the wake pipe.
//create a new epoll instance and register the wake fd; the argument is a hint for the number of fds to watch, and the call allocates kernel resources for the instance
mEpollFd = epoll_create(EPOLL_SIZE_HINT);
LOG_ALWAYS_FATAL_IF(mEpollFd < 0, "Could not create epoll instance: %s", strerror(errno));
struct epoll_event eventItem;
memset(& eventItem, 0, sizeof(epoll_event)); // zero out unused members of data field union
eventItem.events = EPOLLIN;
eventItem.data.fd = mWakeEventFd;//store the mWakeEventFd created earlier in the event item
//add mWakeEventFd (described by eventItem) to the epoll set, so epoll can monitor the object it represents
int result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeEventFd, & eventItem);
LOG_ALWAYS_FATAL_IF(result != 0, "Could not add wake event fd to epoll instance: %s",
strerror(errno));
for (size_t i = 0; i < mRequests.size(); i++) {
const Request& request = mRequests.valueAt(i);
struct epoll_event eventItem;
request.initEventItem(&eventItem);
int epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, request.fd, & eventItem);
if (epollResult < 0) {
ALOGE("Error adding epoll events for fd %d while rebuilding epoll set: %s",
request.fd, strerror(errno));
}
}
}
Note the signature: int epoll_ctl(int epfd, int op, int fd, struct epoll_event* event);
This is epoll's event-registration function:
The first parameter is the return value of epoll_create();
The second parameter is the action, expressed with one of three macros:
EPOLL_CTL_ADD: register a new fd with epfd;
EPOLL_CTL_MOD: modify the monitored events of an already-registered fd;
EPOLL_CTL_DEL: remove an fd from epfd;
The third parameter is the fd to monitor;
The fourth parameter tells the kernel which events to monitor.
Back in NativeMessageQueue::NativeMessageQueue():
it does not create a new Looper every time; instead the current thread's Looper is stored in TLS:
void Looper::setForThread(const sp<Looper>& looper) {
sp<Looper> old = getForThread(); // also has side-effect of initializing TLS
if (looper != NULL) {
looper->incStrong((void*)threadDestructor);
}
pthread_setspecific(gTLSKey, looper.get());
if (old != NULL) {
old->decStrong((void*)threadDestructor);
}
}
sp is similar to a Java strong reference, and the native layer also has wp, similar to a Java weak reference; Android wraps C++ object lifetime management this way. For details, see the relevant source discussed in 深入理解Android 卷I (Understanding Android Internals, Volume I).
TLS, i.e. Thread Local Storage, can be understood by analogy with Java's ThreadLocal. In a single-threaded program there is only one copy of each program-lifetime variable, because there is only one execution unit; in a multi-threaded program, some variables need a private copy per thread. Such per-thread variables live in a storage area private to each thread, hence the names Thread Local Storage and Thread Specific Data.
At this point initialization is complete: a NativeMessageQueue is created, a Looper is created and stored in TLS, and inside the Looper an epoll instance is created with the wake event registered, so we can later receive callbacks; this can be understood by analogy with setOnClickListener. Finally the pointer to the new NativeMessageQueue is returned to the Java layer as a jlong. Note that reinterpret_cast is a C++ forced conversion, usually used to convert one pointer type to another.
nativePollOnce()
Inside Looper's loop() infinite loop, MessageQueue's next() is called, which calls nativePollOnce() and enters the native layer:
static void android_os_MessageQueue_nativePollOnce(JNIEnv* env, jobject obj,
jlong ptr, jint timeoutMillis) {
NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
}
The ptr argument is the jlong pointer to the NativeMessageQueue obtained earlier from nativeInit(); it is cast back, and pollOnce() is invoked.
android_os_MessageQueue.cpp
void NativeMessageQueue::pollOnce(JNIEnv* env, jobject pollObj, int timeoutMillis) {
mPollEnv = env;
mPollObj = pollObj;
mLooper->pollOnce(timeoutMillis);
mPollObj = NULL;
mPollEnv = NULL;
if (mExceptionObj) {
env->Throw(mExceptionObj);
env->DeleteLocalRef(mExceptionObj);
mExceptionObj = NULL;
}
}
Looper.h
inline int pollOnce(int timeoutMillis) {
return pollOnce(timeoutMillis, NULL, NULL, NULL);
}
Looper.cpp
int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
int result = 0;
for (;;) {
while (mResponseIndex < mResponses.size()) {
const Response& response = mResponses.itemAt(mResponseIndex++);
int ident = response.request.ident;
if (ident >= 0) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning signalled identifier %d: "
"fd=%d, events=0x%x, data=%p",
this, ident, fd, events, data);
#endif
if (outFd != NULL) *outFd = fd;
if (outEvents != NULL) *outEvents = events;
if (outData != NULL) *outData = data;
return ident;
}
}
if (result != 0) {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning result %d", this, result);
#endif
if (outFd != NULL) *outFd = 0;
if (outEvents != NULL) *outEvents = 0;
if (outData != NULL) *outData = NULL;
return result;
}
result = pollInner(timeoutMillis);
}
}
pollOnce's timeoutMillis is the timeout computed at the Java layer; it then calls pollInner:
int Looper::pollInner(int timeoutMillis) {
...
// Poll.
int result = POLL_WAKE;
mResponses.clear();
mResponseIndex = 0;
// We are about to idle.
mPolling = true;
struct epoll_event eventItems[EPOLL_MAX_EVENTS];
//the key call
int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);
// No longer idling.
mPolling = false;
// Acquire lock.
mLock.lock();
// Rebuild epoll set if needed.
if (mEpollRebuildRequired) {
mEpollRebuildRequired = false;
rebuildEpollLocked();
goto Done;
}
// Check for poll error.
if (eventCount < 0) {
if (errno == EINTR) {
goto Done;
}
ALOGW("Poll failed with an unexpected error: %s", strerror(errno));
result = POLL_ERROR;
goto Done;
}
// Check for poll timeout.
if (eventCount == 0) {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - timeout", this);
#endif
result = POLL_TIMEOUT;
goto Done;
}
// Handle all events.
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - handling events from %d fds", this, eventCount);
#endif
for (int i = 0; i < eventCount; i++) {
int fd = eventItems[i].data.fd;
uint32_t epollEvents = eventItems[i].events;
//found the file descriptor we registered for wake events
if (fd == mWakeEventFd) {
if (epollEvents & EPOLLIN) {
//woken from epoll_wait(); drain the eventfd
awoken();
} else {
ALOGW("Ignoring unexpected epoll events 0x%x on wake event fd.", epollEvents);
}
} else {
ssize_t requestIndex = mRequests.indexOfKey(fd);
if (requestIndex >= 0) {
int events = 0;
if (epollEvents & EPOLLIN) events |= EVENT_INPUT;
if (epollEvents & EPOLLOUT) events |= EVENT_OUTPUT;
if (epollEvents & EPOLLERR) events |= EVENT_ERROR;
if (epollEvents & EPOLLHUP) events |= EVENT_HANGUP;
pushResponse(events, mRequests.valueAt(requestIndex));
} else {
ALOGW("Ignoring unexpected epoll events 0x%x on fd %d that is "
"no longer registered.", epollEvents, fd);
}
}
}
Done: ;
// Invoke pending message callbacks.
mNextMessageUptime = LLONG_MAX;
while (mMessageEnvelopes.size() != 0) {
nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
const MessageEnvelope& messageEnvelope = mMessageEnvelopes.itemAt(0);
if (messageEnvelope.uptime <= now) {
// Remove the envelope from the list.
// We keep a strong reference to the handler until the call to handleMessage
// finishes. Then we drop it so that the handler can be deleted *before*
// we reacquire our lock.
{ // obtain handler
sp<MessageHandler> handler = messageEnvelope.handler;
Message message = messageEnvelope.message;
mMessageEnvelopes.removeAt(0);
mSendingMessage = true;
mLock.unlock();
#if DEBUG_POLL_AND_WAKE || DEBUG_CALLBACKS
ALOGD("%p ~ pollOnce - sending message: handler=%p, what=%d",
this, handler.get(), message.what);
#endif
handler->handleMessage(message);
} // release handler
mLock.lock();
mSendingMessage = false;
result = POLL_CALLBACK;
} else {
// The last message left at the head of the queue determines the next wakeup time.
mNextMessageUptime = messageEnvelope.uptime;
break;
}
}
// Release lock.
mLock.unlock();
// Invoke all response callbacks.
for (size_t i = 0; i < mResponses.size(); i++) {
Response& response = mResponses.editItemAt(i);
if (response.request.ident == POLL_CALLBACK) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE || DEBUG_CALLBACKS
ALOGD("%p ~ pollOnce - invoking fd event callback %p: fd=%d, events=0x%x, data=%p",
this, response.request.callback.get(), fd, events, data);
#endif
// Invoke the callback. Note that the file descriptor may be closed by
// the callback (and potentially even reused) before the function returns so
// we need to be a little careful when removing the file descriptor afterwards.
int callbackResult = response.request.callback->handleEvent(fd, events, data);
if (callbackResult == 0) {
removeFd(fd, response.request.seq);
}
// Clear the callback reference in the response structure promptly because we
// will not clear the response vector itself until the next poll.
response.request.callback.clear();
result = POLL_CALLBACK;
}
}
return result;
}
The most critical call is epoll_wait(): it blocks until an event occurs on a watched fd or the timeout expires. When nativeWake() writes to the wake fd, epoll_wait() returns; otherwise it keeps blocking. Note that result can take the following values:
POLL_WAKE: the wake fd (mWakeEventFd, created in nativeInit) was written to, i.e. someone explicitly called wake();
POLL_ERROR: an error occurred while waiting, and control jumps to Done;
POLL_TIMEOUT: the wait timed out;
POLL_CALLBACK: one or more callbacks were invoked, either a native message handler or the callback of a registered fd.
After being woken, the looper first drains the eventfd by calling awoken():
void Looper::awoken() {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ awoken", this);
#endif
uint64_t counter;
TEMP_FAILURE_RETRY(read(mWakeEventFd, &counter, sizeof(uint64_t)));
}
It is very simple: it just reads the counter out of the eventfd, clearing it. After Done, any due native Messages are dispatched via handler->handleMessage(). Note that this handler is not the Java-layer Handler: it is a native MessageHandler, and it and the native Message class are declared in Looper.h.
nativeWake()
Next, let's see how the wake-up happens.
android_os_MessageQueue.cpp
static void android_os_MessageQueue_nativeWake(JNIEnv* env, jclass clazz, jlong ptr) {
NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
nativeMessageQueue->wake();
}
void NativeMessageQueue::wake() {
mLooper->wake();
}
Looper.cpp
void Looper::wake() {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ wake", this);
#endif
uint64_t inc = 1;
ssize_t nWrite = TEMP_FAILURE_RETRY(write(mWakeEventFd, &inc, sizeof(uint64_t)));
if (nWrite != sizeof(uint64_t)) {
if (errno != EAGAIN) {
LOG_ALWAYS_FATAL("Could not write wake signal to fd %d: %s",
mWakeEventFd, strerror(errno));
}
}
}
It simply calls write() to write the integer 1 into the eventfd. TEMP_FAILURE_RETRY retries the call if it is interrupted by a signal, until the write succeeds. Once the write lands, the other side is unblocked: epoll_wait() returns and execution continues with the code after it.