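This article is based on the objc4-706 runtime.
The most important property of a pointer qualified with __weak is that it is automatically set to nil once the object it points to is destroyed. This behavior is implemented entirely by the runtime. The idea is simple. Take the following statement: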
id __weak weakObj = strongObj;
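Conceptually, strongObj is used as the key and weakObj (more precisely, its address) as the value in a table. When strongObj is destroyed, the runtime finds all __weak references to it in the table and sets them to nil.
The actual implementation, of course, involves many more details.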
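The idea above can be sketched in a few lines of C++. This is only a toy model under stated assumptions: ToyWeakTable, registerWeak, and objectDestroyed are hypothetical names, and a standard hash map stands in for the runtime's own tables.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the runtime's weak table: maps an object's
// address to the addresses of every weak variable that points at it.
struct ToyWeakTable {
    std::unordered_map<void *, std::vector<void **>> referrers;

    // Roughly what `id __weak weakObj = strongObj;` records.
    void registerWeak(void **location, void *object) {
        *location = object;
        if (object) referrers[object].push_back(location);
    }

    // Roughly what happens when the object is destroyed: every weak
    // variable that still points at it is reset to null.
    void objectDestroyed(void *object) {
        auto it = referrers.find(object);
        if (it == referrers.end()) return;
        for (void **location : it->second) {
            if (*location == object) *location = nullptr;
        }
        referrers.erase(it);
    }
};
```

Registering two weak variables for one object and then calling objectDestroyed leaves both variables null and the table empty.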
Creating and destroying variables
Taking the example above again, the compiler actually rewrites it:
{
id __weak weakObj = strongObj;
}
// becomes
{
id __weak weakObj;
objc_initWeak(&weakObj, strongObj);
// the variable goes out of scope here, so it is destroyed
objc_destroyWeak(&weakObj);
}
Both objc_initWeak and objc_destroyWeak can be found in NSObject.mm:
id
objc_initWeak(id *location, id newObj)
{
if (!newObj) {
*location = nil;
return nil;
}
return storeWeak<false/*old*/, true/*new*/, true/*crash*/>
(location, (objc_object*)newObj);
}
void
objc_destroyWeak(id *location)
{
(void)storeWeak<true/*old*/, false/*new*/, false/*crash*/>
(location, nil);
}
Both are calls to the storeWeak function template. (Why a template? Is it faster? Questions from a C++ beginner...)
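One plausible answer to that question, offered as a guess rather than a statement of the runtime authors' intent: boolean template parameters are compile-time constants, so each instantiation keeps only the branches it needs. The sketch below uses hypothetical names (Trace, toyStoreWeak, and the three wrappers) to show the pattern.

```cpp
#include <cassert>

// Hypothetical counters standing in for the real unregister/register work.
struct Trace {
    int unregistered = 0;
    int registered = 0;
};

// HaveOld and HaveNew are known at compile time, so the compiler can drop
// the untaken branch from each instantiation instead of testing a flag at
// run time.
template <bool HaveOld, bool HaveNew>
void toyStoreWeak(Trace &trace) {
    static_assert(HaveOld || HaveNew, "at least one side must exist");
    if (HaveOld) trace.unregistered++;  // clean up the previous value
    if (HaveNew) trace.registered++;    // record the new value
}

// The three entry points differ only in their template arguments.
inline void toyInitWeak(Trace &t)    { toyStoreWeak<false, true>(t); }
inline void toyDestroyWeak(Trace &t) { toyStoreWeak<true, false>(t); }
inline void toyAssignWeak(Trace &t)  { toyStoreWeak<true, true>(t); }
```

The same source serves all three call sites, while each compiled version contains only its own path.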
Assignment
What happens when an existing __weak variable is assigned a new value?
weakObj = anotherStrongObj;
// becomes
objc_storeWeak(&weakObj, anotherStrongObj);
It is implemented as follows:
id
objc_storeWeak(id *location, id newObj)
{
return storeWeak<true/*old*/, true/*new*/, true/*crash*/>
(location, (objc_object *)newObj);
}
Once again, it is just a wrapper around the storeWeak function template.
storeWeak
The implementation of storeWeak is fairly long, so let's go through it piece by piece:
// Update a weak variable.
// If HaveOld is true, the variable has an existing value
// that needs to be cleaned up. This value might be nil.
// If HaveNew is true, there is a new value that needs to be
// assigned into the variable. This value might be nil.
// If CrashIfDeallocating is true, the process is halted if newObj is
// deallocating or newObj's class does not support weak references.
// If CrashIfDeallocating is false, nil is stored instead.
template <bool HaveOld, bool HaveNew, bool CrashIfDeallocating>
static id
storeWeak(id *location, objc_object *newObj)
{
assert(HaveOld || HaveNew);
if (!HaveNew) assert(newObj == nil);
Class previouslyInitializedClass = nil;
id oldObj;
SideTable *oldTable;
SideTable *newTable;
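The comment above the function explains the three template parameters, and their effect is also easy to see in the code that follows. The function begins by declaring its variables. Note the SideTable type: it is the struct the current runtime uses to store reference counts and weak references. Its layout is as follows (member functions omitted):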
struct SideTable {
spinlock_t slock;
RefcountMap refcnts;
weak_table_t weak_table;
};
slock is a spinlock that guards operations on a SideTable instance. refcnts holds the reference counts. weak_table holds the weak references (weak_table_t is analyzed in detail later).
Back to storeWeak:
// Acquire locks for old and new values.
// Order by lock address to prevent lock ordering problems.
// Retry if the old value changes underneath us.
retry:
if (HaveOld) {
oldObj = *location;
oldTable = &SideTables()[oldObj];
} else {
oldTable = nil;
}
if (HaveNew) {
newTable = &SideTables()[newObj];
} else {
newTable = nil;
}
SideTable::lockTwo<HaveOld, HaveNew>(oldTable, newTable);
if (HaveOld && *location != oldObj) {
SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
goto retry;
}
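This part obtains oldObj, oldTable, and newTable, and locks both tables. Note that oldTable and newTable are looked up in SideTables using the object's address as the key. SideTables returns a hash table that holds a fixed number of SideTable instances, usually 64.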
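The two ideas in this step, picking a table from an object's address and locking two tables in address order, can be sketched as follows. Everything here is an assumption made for illustration: the stripe count of 64 is taken from the text, the shift-and-xor hash is only one possible choice, and ToyLock is a bookkeeping stand-in rather than a real spinlock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Assumed stripe count; the article says there are usually 64 tables.
constexpr std::size_t kStripeCount = 64;

// Hypothetical address hash: discard low alignment bits, mix in some
// higher bits, and reduce the result to a stripe index.
inline std::size_t stripeIndexFor(const void *object) {
    auto addr = reinterpret_cast<std::uintptr_t>(object);
    return ((addr >> 4) ^ (addr >> 9)) % kStripeCount;
}

// Not a real lock: it only records whether it is held, which is enough
// to show the ordering logic in a single-threaded sketch.
struct ToyLock {
    bool held = false;
    void lock()   { assert(!held); held = true; }
    void unlock() { assert(held);  held = false; }
};

// Lock two locks in address order, so that two threads locking the same
// pair always acquire them in the same sequence. Null pointers are skipped.
inline void lockTwo(ToyLock *a, ToyLock *b) {
    if (a == b) {  // same table (or both null): lock once
        if (a) a->lock();
        return;
    }
    if (std::less<ToyLock *>()(b, a)) std::swap(a, b);  // lower address first
    if (a) a->lock();
    if (b) b->lock();
}

inline void unlockTwo(ToyLock *a, ToyLock *b) {
    if (a) a->unlock();
    if (b && b != a) b->unlock();
}
```

Because many objects share one stripe, the old and new objects may map to the same table, which is why the same-lock case is handled separately.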
// Prevent a deadlock between the weak reference machinery
// and the +initialize machinery by ensuring that no
// weakly-referenced object has an un-+initialized isa.
if (HaveNew && newObj) {
Class cls = newObj->getIsa();
if (cls != previouslyInitializedClass &&
!((objc_class *)cls)->isInitialized())
{
SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
_class_initialize(_class_getNonMetaClass(cls, (id)newObj));
// If this class is finished with +initialize then we're good.
// If this class is still running +initialize on this thread
// (i.e. +initialize called storeWeak on an instance of itself)
// then we may proceed but it will appear initializing and
// not yet initialized to the check above.
// Instead set previouslyInitializedClass to recognize it on retry.
previouslyInitializedClass = cls;
goto retry;
}
}
This part is also well commented: it makes sure the object's class has already gone through +initialize.
// Clean up old value, if any.
if (HaveOld) {
weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}
// Assign new value, if any.
if (HaveNew) {
newObj = (objc_object *)weak_register_no_lock(&newTable->weak_table,
(id)newObj, location,
CrashIfDeallocating);
// weak_register_no_lock returns nil if weak store should be rejected
// Set is-weakly-referenced bit in refcount table.
if (newObj && !newObj->isTaggedPointer()) {
newObj->setWeaklyReferenced_nolock();
}
// Do not set *location anywhere else. That would introduce a race.
*location = (id)newObj;
}
else {
// No new value. The storage is not changed.
}
SideTable::unlockTwo<HaveOld, HaveNew>(oldTable, newTable);
return (id)newObj;
}
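The logic of the last part is also clear. First, if there is an old value (HaveOld), weak_unregister_no_lock removes the variable's registration from oldTable's weak_table. Then, if there is a new value (HaveNew), weak_register_no_lock registers it in newTable's weak_table, and setWeaklyReferenced_nolock marks the object as having been weakly referenced.
That wraps up storeWeak. The real work happens in weak_register_no_lock and weak_unregister_no_lock.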
weak_table_t
Before looking at those two functions, let's see what kind of structure weak_table_t is:
/**
* The global weak references table. Stores object ids as keys,
* and weak_entry_t structs as their values.
*/
struct weak_table_t {
weak_entry_t *weak_entries;
size_t num_entries;
uintptr_t mask;
uintptr_t max_hash_displacement;
};
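- weak_entries is the array that stores the weak reference entries;
- num_entries is the number of weak_entry_t entries currently stored;
- mask is the length of the dynamically allocated weak_entries array minus 1; it reduces a hash value to an index and doubles as a record of the array size;
- max_hash_displacement is the largest displacement any entry has needed after a hash collision.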
In effect, weak_table_t is a dynamically growing hash table.
Next come its operations, starting with growing the whole table:
#define TABLE_SIZE(entry) (entry->mask ? entry->mask + 1 : 0)
// Grow the given zone's table of weak references if it is full.
static void weak_grow_maybe(weak_table_t *weak_table)
{
size_t old_size = TABLE_SIZE(weak_table);
// Grow if at least 3/4 full.
if (weak_table->num_entries >= old_size * 3 / 4) {
weak_resize(weak_table, old_size ? old_size*2 : 64);
}
}
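As shown, once the number of entries in weak_table reaches three quarters of its capacity, the capacity is doubled. Note that on the first growth, when mask is 0, the initial size is 64. The actual resizing is delegated to weak_resize.
Besides growing, there is of course shrinking: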
// Shrink the table if it is mostly empty.
static void weak_compact_maybe(weak_table_t *weak_table)
{
size_t old_size = TABLE_SIZE(weak_table);
// Shrink if larger than 1024 buckets and at most 1/16 full.
if (old_size >= 1024 && old_size / 16 >= weak_table->num_entries) {
weak_resize(weak_table, old_size / 8);
// leaves new table no more than 1/2 full
}
}
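Shrinking requires the table to have at least 1024 slots and to be at most one sixteenth full, in which case it shrinks to one eighth of its size. Again the actual work is delegated to weak_resize: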
static void weak_resize(weak_table_t *weak_table, size_t new_size)
{
size_t old_size = TABLE_SIZE(weak_table);
weak_entry_t *old_entries = weak_table->weak_entries;
weak_entry_t *new_entries = (weak_entry_t *)
calloc(new_size, sizeof(weak_entry_t));
weak_table->mask = new_size - 1;
weak_table->weak_entries = new_entries;
weak_table->max_hash_displacement = 0;
weak_table->num_entries = 0; // restored by weak_entry_insert below
if (old_entries) {
weak_entry_t *entry;
weak_entry_t *end = old_entries + old_size;
for (entry = old_entries; entry < end; entry++) {
if (entry->referent) {
weak_entry_insert(weak_table, entry);
}
}
free(old_entries);
}
}
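weak_resize allocates a new array and re-adds the entries of the old array with weak_entry_insert. Note that in the middle of the code mask is set to the new array size minus 1, and max_hash_displacement and num_entries are both reset to zero, because weak_entry_insert updates these two values itself. Next, weak_entry_insert: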
/**
* Add new_entry to the object's table of weak references.
* Does not check whether the referent is already in the table.
*/
static void weak_entry_insert(weak_table_t *weak_table, weak_entry_t *new_entry)
{
weak_entry_t *weak_entries = weak_table->weak_entries;
assert(weak_entries != nil);
size_t begin = hash_pointer(new_entry->referent) & (weak_table->mask);
size_t index = begin;
size_t hash_displacement = 0;
while (weak_entries[index].referent != nil) {
index = (index+1) & weak_table->mask;
if (index == begin) bad_weak_table(weak_entries);
hash_displacement++;
}
weak_entries[index] = *new_entry;
weak_table->num_entries++;
if (hash_displacement > weak_table->max_hash_displacement) {
weak_table->max_hash_displacement = hash_displacement;
}
}
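This is an ordinary hash table insertion with linear probing. hash_pointer hashes the pointer's address. The hash is ANDed with mask because the table size is always a power of two (it starts at 64 and is only ever multiplied or divided by powers of two), so mask, being the size minus 1, is a number of the form 0b111...11, and ANDing with it is equivalent to taking the remainder. hash_displacement records how far the entry was displaced from its home slot after collisions.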
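The equivalence between masking and taking the remainder, and the wrap-around of the probe step, can be checked with a small sketch. slotFor and nextSlot are hypothetical helper names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// For a power-of-two table size, `hash & (size - 1)` equals `hash % size`.
inline std::size_t slotFor(std::size_t hash, std::size_t mask) {
    return hash & mask;
}

// One linear-probing step; the mask makes the index wrap to 0 at the end
// of the table.
inline std::size_t nextSlot(std::size_t index, std::size_t mask) {
    return (index + 1) & mask;
}
```

With a table of 64 slots the mask is 63, so stepping from slot 63 lands on slot 0.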
Where there is insertion, there is also removal:
/**
* Remove entry from the zone's table of weak references.
*/
static void weak_entry_remove(weak_table_t *weak_table, weak_entry_t *entry)
{
// remove entry
if (entry->out_of_line()) free(entry->referrers);
bzero(entry, sizeof(*entry));
weak_table->num_entries--;
weak_compact_maybe(weak_table);
}
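It frees the out-of-line referrers array if there is one, simply zeroes the entry, decrements weak_table's num_entries, and finally checks whether the table should shrink.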
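The grow and shrink thresholds from weak_grow_maybe and weak_compact_maybe can be restated as two pure functions, which makes the boundary cases easy to check. The function names are hypothetical; the arithmetic is copied from the code quoted earlier.

```cpp
#include <cassert>
#include <cstddef>

// Size the table should have after the grow check: double at 3/4 full,
// and start at 64 when the table is still empty (old_size == 0).
inline std::size_t sizeAfterGrowCheck(std::size_t old_size,
                                      std::size_t num_entries) {
    if (num_entries >= old_size * 3 / 4) {
        return old_size ? old_size * 2 : 64;
    }
    return old_size;
}

// Size the table should have after the shrink check: divide by 8 when
// there are at least 1024 slots and the table is at most 1/16 full.
inline std::size_t sizeAfterCompactCheck(std::size_t old_size,
                                         std::size_t num_entries) {
    if (old_size >= 1024 && old_size / 16 >= num_entries) {
        return old_size / 8;
    }
    return old_size;
}
```

A table of 1024 slots holding 64 entries shrinks to 128 slots and ends up exactly half full, which matches the comment in the original source.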
Lastly, there is a function that looks up the entry for a given object:
/**
* Return the weak reference table entry for the given referent.
* If there is no entry for referent, return NULL.
* Performs a lookup.
*
* @param weak_table
* @param referent The object. Must not be nil.
*
* @return The table of weak referrers to this object.
*/
static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
assert(referent);
weak_entry_t *weak_entries = weak_table->weak_entries;
if (!weak_entries) return nil;
size_t begin = hash_pointer(referent) & weak_table->mask;
size_t index = begin;
size_t hash_displacement = 0;
while (weak_table->weak_entries[index].referent != referent) {
index = (index+1) & weak_table->mask;
if (index == begin) bad_weak_table(weak_table->weak_entries);
hash_displacement++;
if (hash_displacement > weak_table->max_hash_displacement) {
return nil;
}
}
return &weak_table->weak_entries[index];
}
This, too, is standard hash table fare.
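One detail is worth spelling out: the lookup does not stop at the first empty slot. Since removal simply zeroes a slot, an empty slot may sit between an entry and its home position, so the search is bounded by max_hash_displacement instead. The sketch below is a toy version under assumptions: ToyTable is a hypothetical type, keys are non-zero integers standing in for object addresses, and the hash is the identity so that collisions are easy to stage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size table. The size must be a power of two, and a
// zero key marks an empty slot (like a nil referent).
struct ToyTable {
    std::vector<std::uintptr_t> slots;
    std::size_t mask;
    std::size_t max_hash_displacement = 0;

    explicit ToyTable(std::size_t size) : slots(size, 0), mask(size - 1) {
        assert(size != 0 && (size & (size - 1)) == 0);
    }

    // The caller must leave at least one free slot; the runtime guarantees
    // this by growing the table at 3/4 full.
    void insert(std::uintptr_t key) {
        assert(key != 0);
        std::size_t index = key & mask;
        std::size_t displacement = 0;
        while (slots[index] != 0) {
            index = (index + 1) & mask;
            displacement++;
            assert(displacement <= mask);  // table must not be full
        }
        slots[index] = key;
        if (displacement > max_hash_displacement) {
            max_hash_displacement = displacement;
        }
    }

    // Returns a pointer to the slot holding key, or nullptr if the key is
    // absent. The search gives up once it has probed farther than any
    // stored entry was ever displaced.
    const std::uintptr_t *find(std::uintptr_t key) const {
        assert(key != 0);
        std::size_t index = key & mask;
        std::size_t displacement = 0;
        while (slots[index] != key) {
            index = (index + 1) & mask;
            displacement++;
            if (displacement > max_hash_displacement) return nullptr;
        }
        return &slots[index];
    }
};
```

In a table of 8 slots, the keys 1, 9, and 17 all hash to slot 1, so they occupy slots 1, 2, and 3 and the maximum displacement becomes 2.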
weak_entry_t
So how are the weak references themselves stored? Let's move on to weak_entry_t:
#define WEAK_INLINE_COUNT 4
#define REFERRERS_OUT_OF_LINE 2
struct weak_entry_t {
DisguisedPtr<objc_object> referent;
union {
struct {
weak_referrer_t *referrers;
uintptr_t out_of_line_ness : 2;
uintptr_t num_refs : PTR_MINUS_2;
uintptr_t mask;
uintptr_t max_hash_displacement;
};
struct {
// out_of_line_ness field is low bits of inline_referrers[1]
weak_referrer_t inline_referrers[WEAK_INLINE_COUNT];
};
};
bool out_of_line() {
return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
}
weak_entry_t& operator=(const weak_entry_t& other) {
memcpy(this, &other, sizeof(other));
return *this;
}
weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
: referent(newReferent)
{
inline_referrers[0] = newReferrer;
for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
inline_referrers[i] = nil;
}
}
};
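First, DisguisedPtr<T> behaves exactly like T*. The type exists to hide the pointer from memory leak detection tools (the original comment reads: "DisguisedPtr<T> acts like pointer type T*, except the stored value is disguised to hide it from tools like leaks."). So DisguisedPtr<objc_object> referent can be read as objc_object *referent.
referent records the object being weakly referenced. The union that follows contains two structs. Starting with the first: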
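- referrers: an array of weak_referrer_t that stores the addresses of the weak variables. weak_referrer_t is defined as typedef DisguisedPtr<objc_object *> weak_referrer_t;
- out_of_line_ness: a 2-bit flag that tells whether the union's memory currently holds the first struct or the second;
- num_refs: PTR_MINUS_2 is the word size minus 2 bits, so together with out_of_line_ness it fills exactly one word; it stores the number of references held in referrers;
- mask and max_hash_displacement: the same hash table bookkeeping as analyzed above.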
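So the first struct is also a hash table, while the second is an array of the same size as the first struct: the so-called inline storage. References are stored inline first; once there are more than WEAK_INLINE_COUNT (that is, 4) of them, the entry switches to the dynamic hash table of the first struct. The constructor at the bottom of the code reflects this.
Note that weak_entry_t overloads the assignment operator, turning assignment into a memory copy.
Its operations are similar to those of weak_table_t above, with the added handling of the inline case, so they are not analyzed in detail here.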
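The switch from inline to out-of-line storage can be sketched as follows. This is a simplification under stated assumptions: ToyReferrers is a hypothetical type, the two forms are separate members instead of sharing a union, and a vector replaces the hash table used by the runtime.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kInlineCount = 4;  // mirrors WEAK_INLINE_COUNT

// Hypothetical referrer list: up to four addresses live inline; adding a
// fifth moves everything to heap storage.
struct ToyReferrers {
    std::array<void **, kInlineCount> inline_slots{};  // all null initially
    std::vector<void **> out_of_line_slots;
    bool out_of_line = false;

    void append(void **referrer) {
        if (!out_of_line) {
            for (auto &slot : inline_slots) {
                if (slot == nullptr) {
                    slot = referrer;
                    return;
                }
            }
            // Inline storage is full: migrate the existing entries.
            out_of_line_slots.assign(inline_slots.begin(), inline_slots.end());
            inline_slots.fill(nullptr);
            out_of_line = true;
        }
        out_of_line_slots.push_back(referrer);
    }

    std::size_t count() const {
        if (out_of_line) return out_of_line_slots.size();
        std::size_t n = 0;
        for (void **slot : inline_slots) {
            if (slot) n++;
        }
        return n;
    }
};
```

The inline form avoids a heap allocation for the common case of an object with only a handful of weak references.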
weak_register_no_lock
Now for weak_register_no_lock:
/**
* Registers a new (object, weak pointer) pair. Creates a new weak
* object entry if it does not exist.
*
* @param weak_table The global weak table.
* @param referent The object pointed to by the weak reference.
* @param referrer The weak pointer address.
*/
id
weak_register_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id, bool crashIfDeallocating)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
if (!referent || referent->isTaggedPointer()) return referent_id;
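The first part does little more than set up: it casts the arguments and returns early for nil and tagged pointers. referent is the object being weakly referenced, and referrer is the address of the weak variable.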
// ensure that the referenced object is viable
bool deallocating;
if (!referent->ISA()->hasCustomRR()) {
deallocating = referent->rootIsDeallocating();
}
else {
BOOL (*allowsWeakReference)(objc_object *, SEL) =
(BOOL(*)(objc_object *, SEL))
object_getMethodImplementation((id)referent,
SEL_allowsWeakReference);
if ((IMP)allowsWeakReference == _objc_msgForward) {
return nil;
}
deallocating =
! (*allowsWeakReference)(referent, SEL_allowsWeakReference);
}
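This part is interesting. If the object's class has no custom memory management methods (hasCustomRR), deallocating is set to the result of rootIsDeallocating, that is, whether the object is currently being destroyed. If it does have custom memory management methods, the allowsWeakReference message is sent instead, asking whether weak references are allowed. Either way, we end up with a deallocating value.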
if (deallocating) {
if (crashIfDeallocating) {
_objc_fatal("Cannot form weak reference to instance (%p) of "
"class %s. It is possible that this object was "
"over-released, or is in the process of deallocation.",
(void*)referent, object_getClassName((id)referent));
} else {
return nil;
}
}
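As the previous part shows, a true deallocating means something is wrong, so this part handles it: crash if crashIfDeallocating is set, otherwise return nil.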
// now remember it and where it is being stored
weak_entry_t *entry;
if ((entry = weak_entry_for_referent(weak_table, referent))) {
append_referrer(entry, referrer);
}
else {
weak_entry_t new_entry(referent, referrer);
weak_grow_maybe(weak_table);
weak_entry_insert(weak_table, &new_entry);
}
// Do not set *referrer. objc_storeWeak() requires that the
// value not change.
return referent_id;
}
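The last part finally does the real work. It first uses weak_entry_for_referent to check whether the object already has a weak_entry_t. If so, append_referrer adds the variable's address to it. If not, it creates a new weak_entry_t, calls weak_grow_maybe to grow the weak table if necessary, and inserts the entry with weak_entry_insert.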
weak_unregister_no_lock
Next is weak_unregister_no_lock:
void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id,
id *referrer_id)
{
objc_object *referent = (objc_object *)referent_id;
objc_object **referrer = (objc_object **)referrer_id;
weak_entry_t *entry;
if (!referent) return;
if ((entry = weak_entry_for_referent(weak_table, referent))) {
remove_referrer(entry, referrer);
bool empty = true;
if (entry->out_of_line() && entry->num_refs != 0) {
empty = false;
}
else {
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
if (entry->inline_referrers[i]) {
empty = false;
break;
}
}
}
if (empty) {
weak_entry_remove(weak_table, entry);
}
}
// Do not set *referrer = nil. objc_storeWeak() requires that the
// value not change.
}
The main idea is simple: weak_entry_for_referent finds the corresponding entry, and remove_referrer removes the variable's address from it. Finally, if the entry has become empty, weak_entry_remove removes it from the weak table.
Automatically setting to nil
Weak variables are set to nil after the object is destroyed because weak_clear_no_lock is called during the object's dealloc:
/**
* Called by dealloc; nils out all weak pointers that point to the
* provided object so that they can no longer be used.
*
* @param weak_table
* @param referent The object being deallocated.
*/
void
weak_clear_no_lock(weak_table_t *weak_table, id referent_id)
{
objc_object *referent = (objc_object *)referent_id;
weak_entry_t *entry = weak_entry_for_referent(weak_table, referent);
if (entry == nil) {
/// XXX shouldn't happen, but does with mismatched CF/objc
//printf("XXX no entry for clear deallocating %p\n", referent);
return;
}
It first sets up, fetching the weak reference entry and handling the case where the object has no weak references.
// zero out references
weak_referrer_t *referrers;
size_t count;
if (entry->out_of_line()) {
referrers = entry->referrers;
count = TABLE_SIZE(entry);
}
else {
referrers = entry->inline_referrers;
count = WEAK_INLINE_COUNT;
}
It then gets the array of weak variable addresses and the number of slots to scan.
for (size_t i = 0; i < count; ++i) {
objc_object **referrer = referrers[i];
if (referrer) {
if (*referrer == referent) {
*referrer = nil;
}
else if (*referrer) {
_objc_inform("__weak variable at %p holds %p instead of %p. "
"This is probably incorrect use of "
"objc_storeWeak() and objc_loadWeak(). "
"Break on objc_weak_error to debug.\n",
referrer, (void*)*referrer, (void*)referent);
objc_weak_error();
}
}
}
weak_entry_remove(weak_table, entry);
}
The loop sets each of them to nil, and finally the whole entry is removed.
Accessing a weak reference
When a weak reference is read, ARC inserts some extra operations:
obj = weakObj;
// becomes roughly (objc_loadWeakRetained returns the retained object, or nil)
id tmp = objc_loadWeakRetained(&weakObj);
obj = tmp;
objc_release(tmp);
The main job of objc_loadWeakRetained is to call rootTryRetain:
ALWAYS_INLINE bool
objc_object::rootTryRetain()
{
return rootRetain(true, false) ? true : false;
}
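In effect it tries to increment the reference count, so that the weakly referenced object is not freed while it is in use.
For the implementation of rootRetain, see 《Objective-C 小记(7)retain & release》.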
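The try-retain idea can be illustrated with a single-threaded toy model. ToyObject and toyLoadWeakRetained are hypothetical names, and the sketch ignores the locking and the reference count encoding that the real runtime relies on.

```cpp
#include <cassert>

// Hypothetical object header: a retain count plus a "deallocating" flag.
struct ToyObject {
    int retain_count = 1;
    bool deallocating = false;

    // Succeeds only while the object is still alive.
    bool tryRetain() {
        if (deallocating) return false;
        retain_count++;
        return true;
    }

    // Dropping the last reference marks the object as deallocating.
    void release() {
        assert(retain_count > 0);
        if (--retain_count == 0) deallocating = true;
    }
};

// Reads a weak slot: returns the object with an extra retain, or nullptr
// if the slot is empty or the object is already being destroyed.
inline ToyObject *toyLoadWeakRetained(ToyObject **location) {
    ToyObject *object = *location;
    if (object == nullptr) return nullptr;
    return object->tryRetain() ? object : nullptr;
}
```

The caller owns the extra retain and must balance it with a release once it is done with the object.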
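Summary
Storing a weak reference involves quite a few hash lookups:
- one to pick the SideTable, which is presumably split up for performance reasons;
- one in weak_table_t;
- one in weak_entry_t.
As for the cost, it does not feel particularly expensive, so use weak references wherever they fit.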