Overview
This article digs into the mono and il2cpp source code to examine how the C# virtual machines used by Unity implement the .NET multithreading APIs.
Multithreading APIs
Let's start with the multithreading-related API definitions in .NET.
In .NET, thread pools are exposed through the ThreadPool API. The pool can be used to execute tasks, post work items, process asynchronous I/O, wait on behalf of other threads, and process timers.
In general, when you call an asynchronous I/O API (for example HttpWebRequest's BeginGetResponse or FileStream's BeginWrite), or use a delegate for an asynchronous callback, the global thread pools supply the threads that execute these async tasks and callback methods.
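For instance, with delegates (a minimal sketch; delegate BeginInvoke is implemented by mono and .NET Framework, but note that il2cpp throws NotSupportedException for it), the callback below runs on a worker-pool thread:

using System;
using System.Threading;

class DelegateAsyncExample
{
    static void Main()
    {
        Func<int, int> square = x => x * x;
        // BeginInvoke queues the invocation onto the worker thread pool (mono/.NET Framework only).
        square.BeginInvoke(21, ar =>
        {
            int result = square.EndInvoke(ar); // completes on a worker thread
            Console.WriteLine("result=" + result + ", on pool thread: " + Thread.CurrentThread.IsThreadPoolThread);
        }, null);
        Thread.Sleep(1000); // keep the process alive long enough for the callback
    }
}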
Thread pool scope:
- Global
- Thread-local
Thread pool types:
- Worker thread pool: dedicated to internal work, such as delegate async callbacks and user-submitted async tasks.
- IO thread pool: dedicated to external I/O tasks; its threads mostly sit in a wait state. (The mono implementation uses just one thread.)
Thread pool policy:
- Minimum thread count: both the default and the minimum equal the CPU count
- Maximum thread count: defaults to CPU count * 100
- Queue: unbounded
- Idle timeout: 5s to 60s
- Scheduling algorithm: the .NET ThreadPool "Hill Climbing" algorithm
Related APIs (a usage sketch follows the list):
public static bool SetMaxThreads (int workerThreads, int completionPortThreads);
public static bool SetMinThreads (int workerThreads, int completionPortThreads);
public static void GetAvailableThreads (out int workerThreads, out int completionPortThreads);
public static bool UnsafeQueueUserWorkItem (System.Threading.IThreadPoolWorkItem callBack, bool preferLocal);
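A minimal usage sketch of these APIs (the specific values are illustrative only):

using System;
using System.Threading;

class ThreadPoolConfigExample
{
    static void Main()
    {
        int minWorker, minIo, maxWorker, maxIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        Console.WriteLine("min: " + minWorker + "/" + minIo + ", max: " + maxWorker + "/" + maxIo);

        // Raise the worker floor so short bursts don't wait on the pool's gradual thread injection.
        ThreadPool.SetMinThreads(Environment.ProcessorCount * 2, minIo);
        ThreadPool.SetMaxThreads(Environment.ProcessorCount * 100, maxIo);

        // Queue a work item onto the worker thread pool.
        ThreadPool.QueueUserWorkItem(_ =>
            Console.WriteLine("on pool thread: " + Thread.CurrentThread.IsThreadPoolThread));

        int availWorker, availIo;
        ThreadPool.GetAvailableThreads(out availWorker, out availIo);
        Console.WriteLine("available: " + availWorker + "/" + availIo);
        Thread.Sleep(500); // let the queued item run
    }
}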
Async I/O
Next, taking one asynchronous I/O call as an example, let's observe how the async task gets executed across multiple threads.
Using HttpWebRequest's BeginGetResponse() as the concrete case, the async task involves three kinds of threads:
- main-thread: the main thread, where the caller of the async method runs.
- io-thread: an I/O thread from the global IO thread pool, used to block waiting on async I/O.
- worker-thread: a worker thread from the global worker thread pool, used to execute async tasks and delegate async callbacks.
A single async I/O task executes in four main stages, hopping across threads as it runs:
- [main thread] BeginGetResponse issues the async I/O request on the main thread; the task hands its I/O handle to the IO thread pool to await readiness (1st thread switch: the task passes from the main thread to the I/O thread).
- [main thread -> I/O thread] The I/O thread performs a blocking wait; once the I/O operation becomes ready, it enqueues the task into the worker thread pool for execution (the I/O write).
- [I/O thread -> worker thread] On a worker thread, the socket performs the write, and the async-callback task is enqueued into the worker thread pool once more, awaiting the delegate's async callback.
- [worker thread -> worker thread] After the async I/O result is handed to the worker thread pool, the pool periodically pulls tasks from its work queue and dispatches each to a worker thread, which invokes BeginGetResponse's callback with the result of the async I/O.
Our tests show that on il2cpp iOS, multithreaded downloads via HttpWebRequest exhibit noticeable network jitter and slow download speeds, and both symptoms are related to this design.
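For reference, the caller's side of the flow traced below looks roughly like this (a minimal sketch; the URL is a placeholder):

using System;
using System.IO;
using System.Net;
using System.Threading;

class BeginGetResponseExample
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/"); // placeholder URL
        // Step 1: issued on the main thread; completion arrives on a worker thread (step 4).
        request.BeginGetResponse(GetResponseCallBack, request);
        Thread.Sleep(3000); // keep the process alive for the demo
    }

    static void GetResponseCallBack(IAsyncResult ar)
    {
        var request = (HttpWebRequest)ar.AsyncState;
        using (var response = (HttpWebResponse)request.EndGetResponse(ar))
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // Runs on a worker-pool thread, not the main thread.
            Console.WriteLine("status=" + response.StatusCode + ", pool thread: " + Thread.CurrentThread.IsThreadPoolThread);
            reader.ReadToEnd();
        }
    }
}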
Step 1: the main thread (main-thread) initiates the async I/O request
HttpWebRequest.BeginGetResponse(AsyncCallback) // start fetching the response asynchronously
-> SimpleAsyncResult.RunWithLock()
-> HttpWebRequest.CheckIfForceWrite()
-> WebConnectionStream.WriteRequestAsync(SimpleAsyncResult) // the HTTP output stream starts writing the request headers
-> WebConnectionStream.SetHeaderAsync()
-> WebConnection.BeginWrite()
-> NetworkStream.BeginWrite()
-> Socket.BeginSend() // the socket prepares to send data
-> Socket.QueueIOSelectorJob(new IOSelectorJob(IOOperation.Write, BeginSendCallback))
-> IOSelector.Add(handle, job) // take the lock and enqueue the write job into the IO thread pool's queue
-> selector_thread_wakeup() -> [io-thread] // wake up the IO thread pool
Step 2: the I/O thread (io-thread) blocks waiting for I/O
selector_thread() // IO thread loop
{
    for update in updates: // walk the global I/O update queue
        case UPDATE_ADD: // add a socket-fd job
            poll_register_fd(fd) // threadpool-ms-io-poll.cpp: register the fd into poll_fds so poll can watch it
        case UPDATE_REMOVE_SOCKET: // remove a socket-fd job
            poll_remove_fd(fd) // threadpool-ms-io-poll.cpp: remove the fd
    poll_event_wait(wait_callback)
    {
        poll(poll_fds) // block in poll on the fd list
        for poll_fds: // walk the ready fds
            wait_callback(fd)
            {
                managedList = threadPoolStateHash->find(fd) // global hash table <fd, jobList>
                job = get_job_for_event(managedList) // find the job for this event
                if (job != null) {
                    threadpool_ms_enqueue_work_item(job) // enqueue the job into the worker thread pool, awaiting its callback
                    {
                        System.Threading.ThreadPool.UnsafeQueueCustomWorkItem(job, false) // hand the delegate's async-result callback to the managed worker pool
                    }
                }
                poll_register_fd(fd) // re-register the fd into poll_fds for the next poll
            }
    }
}
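The selector pattern above, condensed into a C# sketch (using Socket.Select in place of the native poll; all names here are illustrative, not il2cpp's):

using System;
using System.Collections.Generic;
using System.Net.Sockets;
using System.Threading;

class SelectorSketch
{
    // fd -> pending write job; stands in for il2cpp's <fd, jobList> hash table.
    readonly Dictionary<Socket, Action> pendingWrites = new Dictionary<Socket, Action>();

    public void Register(Socket socket, Action job)
    {
        lock (pendingWrites) pendingWrites[socket] = job;
    }

    // The single selector thread: wait for readiness, then hand jobs to the worker pool.
    public void SelectorLoop()
    {
        while (true)
        {
            List<Socket> checkWrite;
            lock (pendingWrites) checkWrite = new List<Socket>(pendingWrites.Keys);
            if (checkWrite.Count == 0) { Thread.Sleep(1); continue; }

            Socket.Select(null, checkWrite, null, 1000 * 1000); // timeout in microseconds
            foreach (Socket ready in checkWrite) // Select leaves only the writable sockets in the list
            {
                Action job;
                lock (pendingWrites)
                {
                    if (!pendingWrites.TryGetValue(ready, out job)) continue;
                    pendingWrites.Remove(ready);
                }
                // Like threadpool_ms_enqueue_work_item: run the job on a worker thread.
                ThreadPool.UnsafeQueueUserWorkItem(_ => job(), null);
            }
        }
    }
}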
Step 3: a worker thread (worker-thread) performs the I/O operation
worker_thread() // worker thread loop
{
    while (true) {
        System.Bool System.Threading._ThreadPoolWaitCallback::PerformWaitCallback()
        {
            ThreadPoolWorkQueue::Dispatch() // dispatch tasks
            {
                while {time < tpQuantum} // time-sliced: each Dispatch may run for at most ThreadPoolGlobals.tpQuantum, then yields until the next dispatch
                {
                    System.Threading.IThreadPoolWorkItem::ExecuteWorkItem()
                    {
                        System.IOSelectorJob::System.Threading.IThreadPoolWorkItem.ExecuteWorkItem() // execute the async I/O job
                        {
                            System.IOAsyncCallback::Invoke(System.IOAsyncResult)
                            {
                                System.Net.Sockets.Socket_<>c::<BeginSend>b__241_0(System.IOAsyncResult)
                                {
                                    System.Net.Sockets.Socket::BeginSendCallback(System.Net.Sockets.SocketAsyncResult,System.Int32)
                                    {
                                        System.Net.Sockets.Socket::Send_internal(System.Net.Sockets.SafeSocketHandle,System.Byte*,System.Int32,System.Net.Sockets.SocketFlags,System.Int32&,System.Boolean)
                                        {
                                            SocketImpl::Send() // the socket sends the data
                                        }
                                        socketAsyncResult.Complete() // post the result back to the caller asynchronously
                                        {
                                            ThreadPool.UnsafeQueueUserWorkItem(asyncResult) -> [worker-thread] // enqueue the delegate's async-result callback into the worker thread pool
                                        }
                                        IOSelector.Add(new IOSelectorJob(IOOperation.Write, BeginSendCallback)) // re-register the write job with the IO thread pool for the next poll
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
Socket.BeginSendCallback() at mcs\class\System\System.Net.Sockets\Socket.cs
Step 4: a worker thread (worker-thread) delivers the async callback
worker_thread() // worker thread loop
{
    while (true) {
        System.Bool System.Threading._ThreadPoolWaitCallback::PerformWaitCallback()
        {
            ThreadPoolWorkQueue::Dispatch() // dispatch tasks
            {
                while {time < tpQuantum} // time-sliced: each Dispatch may run for at most ThreadPoolGlobals.tpQuantum, then yields until the next dispatch
                {
                    System.Threading.QueueUserWorkItemCallback::System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
                    {
                        System.Threading.WaitCallback::Invoke(System.Object)
                        {
                            System.AsyncCallback::Invoke(System.IAsyncResult)
                            {
                                System.Net.SimpleAsyncResult::SetCompleted()
                                {
                                    System.Net.SimpleAsyncResult::DoCallback_private()
                                    {
                                        System.AsyncCallback::Invoke(System.IAsyncResult)
                                        {
                                            ::GetResponseCallBack(System.IAsyncResult) // the user's callback receives the async result
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
IO thread pool
In mono/il2cpp, the .NET IO thread pool is implemented as a single thread on both Android and iOS, built on top of poll.
Enqueuing a task
ves_icall_System_IOSelector_Add // enqueue an async I/O task into the IO thread pool
void ves_icall_System_IOSelector_Add (intptr_t handle, Il2CppIOSelectorJob *job)
{
    ThreadPoolIOUpdate *update;
    update = update_get_new (); // append a new update entry to the global updates list
    il2cpp::os::SocketHandleWrapper socketHandle(il2cpp::os::PointerToSocketHandle(reinterpret_cast<void*>(handle)));
    update->type = UPDATE_ADD; // update type
    update->data.add.fd = (int)socketHandle.GetSocket()->GetDescriptor(); // socket descriptor
    update->data.add.job = job; // the async I/O job
    selector_thread_wakeup (); // wake the IO thread pool with a write to its wakeup handle
}
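The enqueue-update-then-wake pattern can be sketched in C# like this (illustrative names; the real implementation wakes the thread blocked in poll by writing to a wakeup file descriptor rather than setting an event):

using System.Collections.Concurrent;
using System.Threading;

class UpdateQueueSketch
{
    struct Update { public int Fd; public object Job; }

    readonly ConcurrentQueue<Update> updates = new ConcurrentQueue<Update>(); // like threadpool_io->updates
    readonly AutoResetEvent wakeup = new AutoResetEvent(false); // stands in for the wakeup pipe

    public void Add(int fd, object job) // like ves_icall_System_IOSelector_Add
    {
        updates.Enqueue(new Update { Fd = fd, Job = job });
        wakeup.Set(); // like selector_thread_wakeup()
    }

    public void SelectorLoop()
    {
        while (true)
        {
            wakeup.WaitOne(); // the real thread blocks inside poll() instead, with the wakeup fd in its set
            Update u;
            while (updates.TryDequeue(out u))
            {
                // UPDATE_ADD: register u.Fd for polling and remember u.Job
                // for the readiness callback (elided in this sketch).
            }
        }
    }
}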
Task dispatch
The I/O thread's loop periodically drains the global I/O update queue.
selector_thread() at External\il2cpp\il2cpp\libil2cpp\mono\ThreadPool\threadpool-ms-io.cpp
static void selector_thread(void* data)
{
    for (;;) {
        int i, j;
        int res;
        threadpool_io->updates_lock.Lock();
        for (i = 0; i < threadpool_io->updates_size; ++i) {
            ThreadPoolIOUpdate *update = &threadpool_io->updates [i];
            switch (update->type) {
            case UPDATE_EMPTY:
                break;
            case UPDATE_ADD: {
                // ...
                poll_register_fd(fd, operations, !exists);
                break;
            }
            case UPDATE_REMOVE_SOCKET:
                poll_remove_fd(fd);
                break;
            // ...
            }
        }
        threadpool_io->updates_lock.Unlock(); // (abridged; the lock is released before waiting)
        poll_event_wait(wait_callback, state); // poll the fd list; wait_callback fires when an fd's state changes
    }
}
static void wait_callback (int fd, int events, void* user_data)
{
    ThreadPoolStateHash::iterator iter = states->find(fd);
    list = iter->second;
    if (list && (events & EVENT_IN) != 0) {
        Il2CppIOSelectorJob *job = get_job_for_event (list, EVENT_IN);
        if (job) {
            threadpool_ms_enqueue_work_item (il2cpp::vm::Domain::GetCurrent(), (Il2CppObject*) job); // enqueue the read job into the worker thread pool
        }
    }
    if (list && (events & EVENT_OUT) != 0) {
        Il2CppIOSelectorJob *job = get_job_for_event (list, EVENT_OUT);
        if (job) {
            threadpool_ms_enqueue_work_item (il2cpp::vm::Domain::GetCurrent(), (Il2CppObject*) job); // enqueue the write job into the worker thread pool
        }
    }
    remove_fd = (events & EVENT_ERR) == EVENT_ERR;
    if (!remove_fd) {
        //mono_g_hash_table_replace (states, int_TO_POINTER (fd), list);
        states->insert(ThreadPoolStateHash::value_type(fd, list));
        operations = get_operations_for_jobs (list);
        /*mono_trace (G_LOG_LEVEL_DEBUG, MONO_TRACE_IO_THREADPOOL, "io threadpool: res fd %3d, events = %2s | %2s | %3s",
            fd, (operations & EVENT_IN) ? "RD" : "..", (operations & EVENT_OUT) ? "WR" : "..", (operations & EVENT_ERR) ? "ERR" : "...");*/
        threadpool_io->backend.register_fd (fd, operations, false); // re-register the fd for the next poll
    } else {
        //mono_trace (G_LOG_LEVEL_DEBUG, MONO_TRACE_IO_THREADPOOL, "io threadpool: err fd %d", fd);
        states->erase(ThreadPoolStateHash::key_type(fd));
        //mono_g_hash_table_remove (states, int_TO_POINTER (fd));
        threadpool_io->backend.remove_fd (fd); // the fd errored: remove it
    }
}
threadpool_ms_enqueue_work_item: enqueue an async task into the worker thread pool
at il2cpp\libil2cpp\mono\ThreadPool\threadpool-ms.cpp
bool threadpool_ms_enqueue_work_item (Il2CppDomain *domain, Il2CppObject *work_item)
{
    static Il2CppClass *threadpool_class = NULL;
    static MethodInfo *unsafe_queue_custom_work_item_method = NULL;
    //Il2CppDomain *current_domain;
    bool f;
    void* args [2];
    IL2CPP_ASSERT(work_item);
    if (!threadpool_class)
        threadpool_class = il2cpp::vm::Class::FromName(il2cpp_defaults.corlib, "System.Threading", "ThreadPool");
    if (!unsafe_queue_custom_work_item_method)
        unsafe_queue_custom_work_item_method = (MethodInfo*)il2cpp::vm::Class::GetMethodFromName(threadpool_class, "UnsafeQueueCustomWorkItem", 2);
    IL2CPP_ASSERT(unsafe_queue_custom_work_item_method);
    f = false;
    args [0] = (void*) work_item;
    args [1] = (void*) &f;
    // call into the C# thread pool method System.Threading.ThreadPool.UnsafeQueueCustomWorkItem()
    Il2CppObject *result = il2cpp::vm::Runtime::InvokeWithThrow(unsafe_queue_custom_work_item_method, NULL, args);
    return true;
}
Worker thread pool
Enqueuing a task
ThreadPool.UnsafeQueueUserWorkItem() (UnsafeQueueCustomWorkItem takes the same path into the shared work queue)
public static bool UnsafeQueueUserWorkItem (System.Threading.WaitCallback callBack, object state)
{
    //...
    ThreadPoolGlobals.workQueue.Enqueue(workItem, forceGlobal);
}
The code lives at mcs\class\referencesource\mscorlib\system\threading\threadpool.cs. Note that what sits under mcs\class\referencesource was not written by mono or Unity; it is Microsoft's .NET Framework reference source, pulled in by mono.
ThreadPoolGlobals.workQueue is of type ThreadPoolWorkQueue: a global work queue whose data structure is a linked list of segments (256 task slots per segment), implementing a lock-free unbounded queue.
ThreadPoolWorkQueue.Enqueue: enqueuing a task.
public void Enqueue(IThreadPoolWorkItem callback, bool forceGlobal)
{
    ThreadPoolWorkQueueThreadLocals tl = null;
    if (!forceGlobal)
        tl = ThreadPoolWorkQueueThreadLocals.threadLocals;
#if !MONO
    if (loggingEnabled)
        System.Diagnostics.Tracing.FrameworkEventSource.Log.ThreadPoolEnqueueWorkObject(callback);
#endif
    if (null != tl)
    {
        tl.workStealingQueue.LocalPush(callback); // on a pool thread: push onto the thread-local work-stealing queue
    }
    else
    {
        QueueSegment head = queueHead; // the global queue is a singly linked list [QueueSegment -> QueueSegment -> ...]; each QueueSegment holds 256 task slots
        while (!head.TryEnqueue(callback)) // try to claim a slot in the current head segment
        {
            Interlocked.CompareExchange(ref head.Next, new QueueSegment(), null); // no free slot: link a new QueueSegment after head, then advance the head and retry
            // if (head.Next == null) { head.Next = new QueueSegment(); }
            // [old-head] -> next -> [new-head]
            while (head.Next != null)
            {
                Interlocked.CompareExchange(ref queueHead, head.Next, head);
                // if (queueHead != head) { queueHead = head.Next; }
                head = queueHead;
            }
        }
    }
    EnsureThreadRequested();
}
Task dequeue
public void Dequeue(ThreadPoolWorkQueueThreadLocals tl, out IThreadPoolWorkItem callback, out bool missedSteal)
{
    callback = null;
    missedSteal = false;
    WorkStealingQueue wsq = tl.workStealingQueue;
    if (wsq.LocalPop(out callback)) // 1. try the thread-local queue first (LIFO pop, contention-free)
        Contract.Assert(null != callback);
    if (null == callback)
    {
        // 2. fall back to the global segmented queue, advancing queueTail past used-up segments
        QueueSegment tail = queueTail;
        while (true)
        {
            if (tail.TryDequeue(out callback))
            {
                Contract.Assert(null != callback);
                break;
            }
            if (null == tail.Next || !tail.IsUsedUp())
            {
                break;
            }
            else
            {
                Interlocked.CompareExchange(ref queueTail, tail.Next, tail);
                tail = queueTail;
            }
        }
    }
    if (null == callback)
    {
        // 3. still nothing: steal from other threads' local queues, starting at a random victim
        WorkStealingQueue[] otherQueues = allThreadQueues.Current;
        int i = tl.random.Next(otherQueues.Length);
        int c = otherQueues.Length;
        while (c > 0)
        {
            WorkStealingQueue otherQueue = Volatile.Read(ref otherQueues[i % otherQueues.Length]);
            if (otherQueue != null &&
                otherQueue != wsq &&
                otherQueue.TrySteal(out callback, ref missedSteal))
            {
                Contract.Assert(null != callback);
                break;
            }
            i++;
            c--;
        }
    }
}
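Dequeue searches in a fixed order: the thread-local queue, then the global queue, then stealing from other threads starting at a random victim. That order condenses into the following sketch (illustrative types; the real WorkStealingQueue is a lock-free deque where the owner pops from one end and thieves steal from the other):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

class DequeueOrderSketch
{
    [ThreadStatic] static Stack<Action> localQueue; // stands in for the thread's WorkStealingQueue (LIFO LocalPop)
    static readonly ConcurrentQueue<Action> globalQueue = new ConcurrentQueue<Action>(); // stands in for the segmented global queue
    static readonly List<ConcurrentQueue<Action>> otherQueues = new List<ConcurrentQueue<Action>>(); // other threads' local queues
    static readonly Random random = new Random(); // the real code keeps a per-thread Random in tl.random

    static Action Dequeue()
    {
        // 1. Thread-local queue first: LIFO pop, cache-friendly and contention-free.
        if (localQueue != null && localQueue.Count > 0)
            return localQueue.Pop();

        // 2. Then the global queue (FIFO across segments).
        Action callback;
        if (globalQueue.TryDequeue(out callback))
            return callback;

        // 3. Finally, steal from other threads, starting at a random victim so
        //    idle threads don't all gang up on the same queue.
        if (otherQueues.Count > 0)
        {
            int start = random.Next(otherQueues.Count);
            for (int c = 0; c < otherQueues.Count; c++)
            {
                if (otherQueues[(start + c) % otherQueues.Count].TryDequeue(out callback))
                    return callback; // the real TrySteal takes from the end opposite the owner
            }
        }
        return null; // caller may request another thread if a steal was missed
    }
}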
Task dispatch
ThreadPoolWorkQueue.Dispatch
Inside worker_thread there is a loop that periodically calls Dispatch on the ThreadPoolWorkQueue to pull tasks off the queue and run them one by one. Dispatching is time-sliced: each Dispatch call gets ThreadPoolGlobals.tpQuantum (30 ms) to hand out tasks; anything left over waits for the next Dispatch.
static internal bool Dispatch()
{
    var workQueue = ThreadPoolGlobals.workQueue; // the global work queue shared by the worker threads
    //
    // The clock is ticking! We have ThreadPoolGlobals.tpQuantum milliseconds to get some work done, and then
    // we need to return to the VM.
    //
    int quantumStartTime = Environment.TickCount;
    //
    // Update our records to indicate that an outstanding request for a thread has now been fulfilled.
    // From this point on, we are responsible for requesting another thread if we stop working for any
    // reason, and we believe there might still be work in the queue.
    //
    // Note that if this thread is aborted before we get a chance to request another one, the VM will
    // record a thread request on our behalf. So we don't need to worry about getting aborted right here.
    //
    workQueue.MarkThreadRequestSatisfied();
#if !MONO
    // Has the desire for logging changed since the last time we entered?
    workQueue.loggingEnabled = FrameworkEventSource.Log.IsEnabled(EventLevel.Verbose, FrameworkEventSource.Keywords.ThreadPool|FrameworkEventSource.Keywords.ThreadTransfer);
#endif
    //
    // Assume that we're going to need another thread if this one returns to the VM. We'll set this to
    // false later, but only if we're absolutely certain that the queue is empty.
    //
    bool needAnotherThread = true;
    IThreadPoolWorkItem workItem = null;
    try
    {
        //
        // Set up our thread-local data
        //
        ThreadPoolWorkQueueThreadLocals tl = workQueue.EnsureCurrentThreadHasQueue();
        //
        // Loop until our quantum expires.
        //
        while ((Environment.TickCount - quantumStartTime) < ThreadPoolGlobals.tpQuantum) // run for at most ThreadPoolGlobals.tpQuantum
        {
            //
            // Dequeue and EnsureThreadRequested must be protected from ThreadAbortException.
            // These are fast, so this will not delay aborts/AD-unloads for very long.
            //
            try { }
            finally
            {
                bool missedSteal = false;
                workQueue.Dequeue(tl, out workItem, out missedSteal); // pull the next task off the queue
                if (workItem == null)
                {
                    //
                    // No work. We're going to return to the VM once we leave this protected region.
                    // If we missed a steal, though, there may be more work in the queue.
                    // Instead of looping around and trying again, we'll just request another thread. This way
                    // we won't starve other AppDomains while we spin trying to get locks, and hopefully the thread
                    // that owns the contended work-stealing queue will pick up its own workitems in the meantime,
                    // which will be more efficient than this thread doing it anyway.
                    //
                    needAnotherThread = missedSteal;
                }
                else
                {
                    //
                    // If we found work, there may be more work. Ask for another thread so that the other work can be processed
                    // in parallel. Note that this will only ask for a max of #procs threads, so it's safe to call it for every dequeue.
                    //
                    workQueue.EnsureThreadRequested();
                }
            }
            if (workItem == null)
            {
                // Tell the VM we're returning normally, not because Hill Climbing asked us to return.
                return true;
            }
            else
            {
#if !MONO
                if (workQueue.loggingEnabled)
                    System.Diagnostics.Tracing.FrameworkEventSource.Log.ThreadPoolDequeueWorkObject(workItem);
#endif
                //
                // Execute the workitem outside of any finally blocks, so that it can be aborted if needed.
                //
                if (ThreadPoolGlobals.enableWorkerTracking)
                {
                    bool reportedStatus = false;
                    try
                    {
                        try { }
                        finally
                        {
                            ThreadPool.ReportThreadStatus(true);
                            reportedStatus = true;
                        }
                        workItem.ExecuteWorkItem();
                        workItem = null;
                    }
                    finally
                    {
                        if (reportedStatus)
                            ThreadPool.ReportThreadStatus(false);
                    }
                }
                else
                {
                    workItem.ExecuteWorkItem();
                    workItem = null;
                }
                //
                // Notify the VM that we executed this workitem. This is also our opportunity to ask whether Hill Climbing wants
                // us to return the thread to the pool or not.
                //
                if (!ThreadPool.NotifyWorkItemComplete())
                    return false;
            }
        }
        // If we get here, it's because our quantum expired. Tell the VM we're returning normally.
        return true;
    }
    catch (ThreadAbortException tae)
    {
        //
        // This is here to catch the case where this thread is aborted between the time we exit the finally block in the dispatch
        // loop, and the time we execute the work item. QueueUserWorkItemCallback uses this to update its accounting of whether
        // it was executed or not (in debug builds only). Task uses this to communicate the ThreadAbortException to anyone
        // who waits for the task to complete.
        //
        if (workItem != null)
            workItem.MarkAborted(tae);
        //
        // In this case, the VM is going to request another thread on our behalf. No need to do it twice.
        //
        needAnotherThread = false;
        // throw; //no need to explicitly rethrow a ThreadAbortException, and doing so causes allocations on amd64.
    }
    finally
    {
        //
        // If we are exiting for any reason other than that the queue is definitely empty, ask for another
        // thread to pick up where we left off.
        //
        if (needAnotherThread)
            workQueue.EnsureThreadRequested();
    }
    // we can never reach this point, but the C# compiler doesn't know that, because it doesn't know the ThreadAbortException will be reraised above.
    Contract.Assert(false);
    return true;
}