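As is well known, when the Objective-C runtime initializes it registers three callbacks with dyld. One of them, load_images, is responsible for collecting and invoking +load methods.
When too many +load methods run at launch, app startup time suffers. A single +load costs on the order of a few milliseconds, but once there are many of them, optimizing them becomes worthwhile.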
Demo code for this article
After running the demo, you can use this entry point to test and inspect the results.
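1. How to collect all the +load methods in a project
You can simply run a regex search over the source files to find the +load methods they contain. The approach described below uses lldb instead.
Run the following commands in Xcode's lldb console to list them:
# Set a breakpoint on every +load method with a regular expression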
(lldb) br s -r "\+\[.+ load\]$"
Breakpoint 28: 42 locations.
# List the breakpoint's locations (it is breakpoint 28 here)
(lldb) br list 28
28: regex = '\+\[.+ load\]$', locations = 42, resolved = 42, hit count = 0
28.1: where = RuntimeLearning`+[SubTestUnsafeSwizzle load] + 8 at TestUnsafeSwizzle.m:23, address = 0x0000000107a146a8, resolved, hit count = 0
28.2: where = RuntimeLearning`+[TestCategorySwizzle(EventTrack) load] + 8 at TestCategorySwizzle.m:51, address = 0x0000000107a149b8, resolved, hit count = 0
... (many locations omitted)
28.12: where = Foundation`+[NSObject(NSObject) load], address = 0x00007fff2574d4d2, resolved, hit count = 0
28.13: where = CoreFoundation`+[NSObject(NSObject) load], address = 0x00007fff23c91d70, resolved, hit count = 0
...
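Only part of the breakpoint list is shown here. Note that some system +load methods are listed as well. To break only on the +load methods in our own code, pass the module name to the breakpoint command.
For example, my project is RuntimeLearning.
With that module specified, the breakpoint covers only the +load methods in the project: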
(lldb) br s -s RuntimeLearning -r "\+\[.+ load\]$"
Breakpoint 29: 10 locations.
(lldb) br list 29
29: regex = '\+\[.+ load\]$', module = RuntimeLearning, locations = 10, resolved = 10, hit count = 0
29.1: where = RuntimeLearning`+[SubTestUnsafeSwizzle load] + 8 at TestUnsafeSwizzle.m:23, address = 0x0000000107a146a8, resolved, hit count = 0
29.2: where = RuntimeLearning`+[TestCategorySwizzle(EventTrack) load] + 8 at TestCategorySwizzle.m:51, address = 0x0000000107a149b8, resolved, hit count = 0
... (some locations omitted)
29.10: where = RuntimeLearning`+[UIFont(Test) load] + 8 at UIFont+Test.m:18, address = 0x0000000107a14f58, resolved, hit count = 0
(lldb)
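There are several ways to collect the +load methods in a project. The next question is how to get rid of them.
2. What we usually do in +load
These methods exist for a reason: the work we put in +load relies on how and when the runtime calls it.
2.1 Method swizzling (hooking) in +load
When we hook a method, we want the hook in place before the method is ever called, and we want hooks installed from several categories of the same class to all take effect. +load fits these requirements well:
- +load runs while the runtime is loading the image, before main.
- Every category's +load on a class is executed, in compile order, and a category's +load runs after the +load of the class itself.
- +load runs down the inheritance chain: superclass before subclass.
- +load is called only where it is implemented. If a class does not implement it, the runtime does not fall back to a superclass's or a category's implementation on that class's behalf.
So is there an alternative?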
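__attribute__((constructor))
Constructor functions also run before main (dyld runs an image's initializers right after that image's +load methods), so moving work there does not shorten startup. We rule this option out.
initialize
+initialize is triggered the first time a class receives a message, but it has two behaviors we often do not want:
- If a subclass does not implement +initialize, the superclass's implementation is called again for it.
- If several categories implement it, they effectively override the class's own +initialize and each other's; only the one from the last-compiled category is called.
As a result, if we hook in several categories of a class, only one hook takes effect. Is there a way around this? A technically feasible approach is discussed later.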
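2.2 Registering routes in +load
When implementing routing between screens, we sometimes need to register identifiers in a routing table. This is often done in +load so that the table is complete by the time the app has launched.
There are several ways to replace this:
- A configuration file, such as JSON or a plist
- A manager class that maintains the registrations
- A protocol that declares the required information; containers adopt the protocol, and when the data is needed the runtime is used to collect information from every class that conforms to it
- __attribute__((section("name"))): write the data into the executable and read it back when it is needed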
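3. How to replace +load with +initialize for method hooking
Let us first look at how +load methods are loaded. As mentioned earlier, the runtime registers callbacks with dyld, and one of them, load_images, is what calls the +load methods.
3.1 _objc_init registers the load_images callback with dyld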
void _objc_init(void)
{
static bool initialized = false;
if (initialized) return;
initialized = true;
// fixme defer initialization until an objc-using image is found?
environ_init();
tls_init();
static_init();
runtime_init();
exception_init();
cache_init();
_imp_implementationWithBlock_init();
_dyld_objc_notify_register(&map_images, load_images, unmap_image);
}
3.2 What load_images does
It calls prepare_load_methods to collect all the +load methods of the image, then calls call_load_methods to invoke them.
void
load_images(const char *path __unused, const struct mach_header *mh)
{
// Return without taking locks if there are no +load methods here.
if (!hasLoadMethods((const headerType *)mh)) return;
recursive_mutex_locker_t lock(loadMethodLock);
// Discover load methods
{
mutex_locker_t lock2(runtimeLock);
prepare_load_methods((const headerType *)mh);
}
// Call +load methods (without runtimeLock - re-entrant)
call_load_methods();
}
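3.3 Collecting the +load IMPs
- The class list is read from the Mach-O __objc_nlclslist section and each class's +load is collected. schedule_class_load recurses up the inheritance chain first, so the +load methods are added through add_class_to_loadable_list into an array of structs in superclass-first order.
- The category list is read from the Mach-O __objc_nlcatlist section, and each category's +load is added, in Mach-O order, through add_category_to_loadable_list into a second array of structs.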
void prepare_load_methods(const headerType *mhdr)
{
size_t count, i;
runtimeLock.assertLocked();
// Get the non-lazy class list and collect each class's +load method
classref_t const *classlist =
_getObjc2NonlazyClassList(mhdr, &count);
for (i = 0; i < count; i++) {
schedule_class_load(remapClass(classlist[i]));
}
category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
for (i = 0; i < count; i++) {
category_t *cat = categorylist[i];
Class cls = remapClass(cat->cls);
if (!cls) continue; // category for ignored weak-linked class
if (cls->isSwiftStable()) {
_objc_fatal("Swift class extensions and categories on Swift "
"classes are not allowed to have +load methods");
}
realizeClassWithoutSwift(cls, nil);
ASSERT(cls->ISA()->isRealized());
add_category_to_loadable_list(cat);
}
}
// Recursively schedule +load for cls and any un-+load-ed superclasses.
// cls must already be connected.
static void schedule_class_load(Class cls)
{
if (!cls) return;
ASSERT(cls->isRealized()); // _read_images should realize
if (cls->data()->flags & RW_LOADED) return;
// Ensure superclass-first ordering
schedule_class_load(cls->superclass);
add_class_to_loadable_list(cls);
cls->setInfo(RW_LOADED);
}
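The superclass-first ordering that schedule_class_load enforces can be illustrated with a small plain-C model. This is only a sketch: model_class, its loaded flag (standing in for RW_LOADED), the scheduled array, and model_schedule_class_load are hypothetical names, not runtime API, and unlike the real add_class_to_loadable_list the model records every class, whether or not it implements +load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an Objective-C class; not the runtime's layout. */
typedef struct model_class {
    const char *name;
    struct model_class *superclass;
    bool loaded;                 /* plays the role of the RW_LOADED flag */
} model_class;

#define MAX_SCHEDULED 16
static const char *scheduled[MAX_SCHEDULED];  /* names in scheduling order */
static size_t scheduled_count = 0;

/* Same shape as schedule_class_load: recurse to the superclass first,
   then record this class, then mark it so it is never recorded twice. */
static void model_schedule_class_load(model_class *cls)
{
    if (cls == NULL) return;     /* walked past the root class */
    if (cls->loaded) return;     /* already scheduled earlier */

    model_schedule_class_load(cls->superclass);

    assert(scheduled_count < MAX_SCHEDULED);
    scheduled[scheduled_count++] = cls->name;
    cls->loaded = true;
}
```

With a chain Root <- Middle <- Leaf, scheduling Leaf first still records Root, then Middle, then Leaf, and scheduling Root again afterwards adds nothing.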
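3.3.1 How add_category_to_loadable_list works
The loadable_category struct holds the Category and the IMP of its +load. The loadable_category entries are stored in an array, loadable_categories.
add_class_to_loadable_list is implemented in much the same way; only the struct being stored differs.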
struct loadable_class {
Class cls; // may be nil
IMP method;
};
struct loadable_category {
Category cat; // may be nil
IMP method;
};
// List of categories that need +load called (pending parent class +load)
static struct loadable_category *loadable_categories = nil;
static int loadable_categories_used = 0;
static int loadable_categories_allocated = 0;
void add_category_to_loadable_list(Category cat)
{
IMP method;
loadMethodLock.assertLocked();
// Get the IMP of the category's +load method directly
method = _category_getLoadMethod(cat);
// Don't bother if cat has no +load method
if (!method) return;
if (PrintLoading) {
_objc_inform("LOAD: category '%s(%s)' scheduled for +load",
_category_getClassName(cat), _category_getName(cat));
}
// Grow the array if it is full
if (loadable_categories_used == loadable_categories_allocated) {
loadable_categories_allocated = loadable_categories_allocated*2 + 16;
loadable_categories = (struct loadable_category *)
realloc(loadable_categories,
loadable_categories_allocated *
sizeof(struct loadable_category));
}
// Store the category and its IMP in the array
loadable_categories[loadable_categories_used].cat = cat;
loadable_categories[loadable_categories_used].method = method;
loadable_categories_used++;
}
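The collect-then-call pattern above can be tried outside the runtime with a plain-C sketch. Everything here (model_loadable, model_add_loadable, model_call_all, model_sample_load) is a hypothetical name chosen for illustration; only the growth rule, new capacity = old capacity * 2 + 16, is taken from the listing above.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*model_imp)(void);

/* Hypothetical counterpart of loadable_category: a label plus a function pointer. */
struct model_loadable {
    const char *name;
    model_imp method;
};

static struct model_loadable *model_list = NULL;
static int model_used = 0;
static int model_allocated = 0;
static int model_calls = 0;       /* how many collected functions have run */

/* Sample "+load" body used only to observe that a call happened. */
static void model_sample_load(void)
{
    model_calls++;
}

/* Append an entry, growing the array with the same rule as the runtime. */
static void model_add_loadable(const char *name, model_imp method)
{
    if (method == NULL) return;   /* nothing to call later, so skip it */

    if (model_used == model_allocated) {
        int grown_capacity = model_allocated * 2 + 16;
        struct model_loadable *grown =
            realloc(model_list, (size_t)grown_capacity * sizeof(*grown));
        if (grown == NULL) abort();
        model_list = grown;
        model_allocated = grown_capacity;
    }
    assert(model_used < model_allocated);

    model_list[model_used].name = name;
    model_list[model_used].method = method;
    model_used++;
}

/* Call every collected function pointer directly, in insertion order. */
static void model_call_all(void)
{
    for (int i = 0; i < model_used; i++) {
        model_list[i].method();
    }
}
```

Because each entry is called through its stored function pointer rather than looked up by name, two entries registered under the same name are both called, which is the property that makes every +load run.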
3.4 call_load_methods invokes the +load methods
It first calls the classes' +load methods through call_class_loads, then the categories' +load methods through call_category_loads.
void call_load_methods(void)
{
static bool loading = NO;
bool more_categories;
loadMethodLock.assertLocked();
// Re-entrant calls do nothing; the outermost call will finish the job.
if (loading) return;
loading = YES;
void *pool = objc_autoreleasePoolPush();
do {
// 1. Repeatedly call class +loads until there aren't any more
while (loadable_classes_used > 0) {
call_class_loads();
}
// 2. Call category +loads ONCE
more_categories = call_category_loads();
// 3. Run more +loads if there are classes OR more untried categories
} while (loadable_classes_used > 0 || more_categories);
objc_autoreleasePoolPop(pool);
loading = NO;
}
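That is a rough picture of how the runtime handles +load:
- Collect the +load methods, classes before categories and superclasses before subclasses, into two separate arrays of structs.
- Call the +load methods, classes first, then categories.
- Because the runtime collects every +load IMP and then calls each one, every +load is guaranteed to run, classes before categories, and classes in superclass-to-subclass order.
3.5 How +initialize is called
Set a breakpoint in an +initialize method, then enable disassembly in Xcode through Debug > Debug Workflow > Always Show Disassembly. The stack shows a call to the following function, CALLING_SOME_+initialize_METHOD: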
libobjc.A.dylib`CALLING_SOME_+initialize_METHOD:
0x7fff513fc0f2 <+0>: pushq %rbp
0x7fff513fc0f3 <+1>: movq %rsp, %rbp
0x7fff513fc0f6 <+4>: movq 0x38a0a3cb(%rip), %rsi ; "initialize"
0x7fff513fc0fd <+11>: callq *0x3663a18d(%rip) ; (void *)0x00007fff513f7780: objc_msgSend
-> 0x7fff513fc103 <+17>: popq %rbp
0x7fff513fc104 <+18>: retq
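This is the implementation behind CALLING_SOME_+initialize_METHOD.
It is the callInitialize function, which goes through objc_msgSend. That explains why an +initialize in a category overrides the +initialize in the class itself: message dispatch finds only one implementation.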
void callInitialize(Class cls)
asm("_CALLING_SOME_+initialize_METHOD");
// Goes through objc_msgSend message dispatch
void callInitialize(Class cls)
{
((void(*)(Class, SEL))objc_msgSend)(cls, @selector(initialize));
asm("");
}
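The difference between message dispatch and walking the method list can be shown with a plain-C model. This is a sketch under one stated assumption: category methods sit in front of the class's own methods in the list, with the most recently attached category first. model_method, model_methods, model_lookup, and model_collect_all are hypothetical names, not runtime API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical method entry: a selector name and who supplied it. */
struct model_method {
    const char *sel;
    const char *owner;
};

/* Assumed list order after categories are attached:
   later categories first, the class's own methods last. */
static const struct model_method model_methods[] = {
    { "initialize",  "Demo(CategoryB)" },
    { "initialize",  "Demo(CategoryA)" },
    { "description", "Demo" },
    { "initialize",  "Demo" },
};
#define MODEL_METHOD_COUNT (sizeof(model_methods) / sizeof(model_methods[0]))

/* Message-send style lookup: the first match wins, the rest are never seen. */
static const struct model_method *model_lookup(const char *sel)
{
    assert(sel != NULL);
    for (size_t i = 0; i < MODEL_METHOD_COUNT; i++) {
        if (strcmp(model_methods[i].sel, sel) == 0) return &model_methods[i];
    }
    return NULL;
}

/* +load style collection: visit every entry and keep all matches. */
static size_t model_collect_all(const char *sel,
                                const struct model_method **out, size_t max)
{
    assert(sel != NULL && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < MODEL_METHOD_COUNT && found < max; i++) {
        if (strcmp(model_methods[i].sel, sel) == 0) out[found++] = &model_methods[i];
    }
    return found;
}
```

In this model, looking up "initialize" returns only the CategoryB entry, while collecting returns all three implementations. That is the gap the rest of this article closes by collecting the +initialize IMPs by hand.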
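callInitialize is called from initializeNonMetaClass. That function is too long to quote in full; in essence it recursively initializes the superclass first. A simplified version: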
void initializeNonMetaClass(Class cls)
{
ASSERT(!cls->isMetaClass());
Class supercls;
bool reallyInitialize = NO;
// Make sure super is done initializing BEFORE beginning to initialize cls.
// See note about deadlock above.
supercls = cls->superclass;
if (supercls && !supercls->isInitialized()) {
initializeNonMetaClass(supercls);
}
// A lot of code omitted here
callInitialize(cls);
}
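The following plain-C sketch models two consequences of this flow: superclasses are initialized first, and a class without its own +initialize ends up running the inherited implementation with itself as the receiver. The names (model_class, model_resolve_initialize, model_initialize_class, model_logging_initialize) are hypothetical, and the model leaves out the locking and thread handling of the real initializeNonMetaClass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct model_class model_class;
typedef void (*model_initialize_imp)(model_class *self);

/* Hypothetical stand-in for a class with an optional +initialize. */
struct model_class {
    const char *name;
    model_class *superclass;
    model_initialize_imp initialize;   /* NULL if the class does not implement it */
    bool initialized;
};

#define MODEL_LOG_MAX 8
static const char *model_receivers[MODEL_LOG_MAX];  /* receiver of each call */
static size_t model_receiver_count = 0;

/* Sample +initialize body: records which class it was called on. */
static void model_logging_initialize(model_class *self)
{
    assert(model_receiver_count < MODEL_LOG_MAX);
    model_receivers[model_receiver_count++] = self->name;
}

/* Message-send style resolution: use the nearest implementation up the chain. */
static model_initialize_imp model_resolve_initialize(const model_class *cls)
{
    for (; cls != NULL; cls = cls->superclass) {
        if (cls->initialize != NULL) return cls->initialize;
    }
    return NULL;
}

/* Initialize the superclass first, then send "initialize" to cls once. */
static void model_initialize_class(model_class *cls)
{
    if (cls == NULL || cls->initialized) return;
    model_initialize_class(cls->superclass);
    cls->initialized = true;

    model_initialize_imp imp = model_resolve_initialize(cls);
    if (imp != NULL) imp(cls);
}
```

With a Base that implements the method and a Derived that does not, initializing Derived runs Base's implementation twice, once with Base and once with Derived as the receiver. This is why the test code later in this article guards each +initialize with a check such as self == [SuperHookMethodInInitialize self].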
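3.6 Doing something with +initialize
Now that we understand how +load and +initialize are called, suppose we want to move the swizzling code into +initialize. We can imitate the +load flow: when +initialize is called, collect the +initialize methods of the class and its categories as well, and call them by hand.
The flow is:
- Collect the +initialize IMPs into an array of structs according to the rules: classes before categories, with classes gathered along the inheritance chain.
- When +initialize is called, manually call the collected IMPs.
3.6.1 Collecting the list of +initialize IMPs
We also define a struct to store the Class and the IMP: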
struct initailize_class {
Class cls;
IMP method;
};
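Next comes the function that collects the IMPs. It takes a Class, a selector, and count, an out parameter that receives the number of entries in the returned array. The caller is responsible for freeing the returned array.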
struct initailize_class *gatherClassMethodImps(Class gatherCls, SEL gatherSel, unsigned int *count) {
if (gatherCls == Nil || gatherSel == NULL || count == NULL) {
return nil;
}
struct initailize_class *initailize_classes = nil;
int used = 0, allocated = 0;
Class cls = gatherCls;
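// Walk up the inheritance chain and, from each class's isa (its metaclass) method list, collect the matching class methods
// Also stop at Nil so that a root class other than NSObject cannot loop forever
while (cls != Nil && cls != NSObject.class) {
unsigned int methodCount = 0; // separate name so it does not shadow the count out parameter
Class gatherClsIsa = object_getClass(cls);
Method *methodList = class_copyMethodList(gatherClsIsa, &methodCount);
for (unsigned int i = 0; i < methodCount; i++) {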
Method method = methodList[i];
SEL sel = method_getName(method);
if (sel == gatherSel) {
IMP imp = method_getImplementation(method);
if (used == allocated) {
allocated = allocated * 2 + 16;
initailize_classes = (struct initailize_class *)realloc(initailize_classes, sizeof(struct initailize_class) * allocated);
}
initailize_classes[used].cls = cls;
initailize_classes[used].method = imp;
used++;
}
}
free(methodList);
cls = [cls superclass]; // class_getSuperclass(cls);
}
// Reverse the array so the entries run from the top of the inheritance chain down
struct initailize_class *reverse_initailize_classes = calloc(used, sizeof(struct initailize_class));
for (NSInteger i = used - 1; i >= 0; i--) {
reverse_initailize_classes[used - i - 1].cls = initailize_classes[i].cls;
reverse_initailize_classes[used - i - 1].method = initailize_classes[i].method;
}
*count = used;
free(initailize_classes);
initailize_classes = nil;
return reverse_initailize_classes;
}
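3.6.2 Iterating over and calling the +initialize methods
originalImp is passed in because this function is triggered from inside some class's +initialize, so the caller's own +initialize is filtered out to avoid calling it again.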
void callClassMethods(Class cls, SEL callSel, IMP originalImp) {
unsigned int count = 0;
struct initailize_class *initailize_classes = gatherClassMethodImps(cls, callSel, &count);
for (unsigned int i = 0; i < count; i ++) {
struct initailize_class item = initailize_classes[i];
if (item.method != originalImp) { // skip the caller's own implementation
// Cast the IMP to its real signature before calling it
void (*imp)(Class, SEL) = (void (*)(Class, SEL))item.method;
imp(item.cls, callSel);
}
}
if (initailize_classes) {
free(initailize_classes);
initailize_classes = nil;
}
}
3.6.3 Writing the test code
- Define a method to be hooked, and hook it in the class itself, its categories, its subclass, and its superclass.
- Collect and call the +initialize methods in the subclass.
- Expected result: the hooks follow the class inheritance chain, and within a class the class comes before its categories.
/// Superclass
@interface SuperHookMethodInInitialize : NSObject
// The method to be hooked
- (void)methodToBeHooked;
@end
/// The class itself
@interface HookMethodInInitialize : SuperHookMethodInInitialize
@end
/// Subclass
@interface SubHookMethodInInitialize : HookMethodInInitialize
@end
/// Category A
@interface HookMethodInInitialize(CategoryA)
@end
/// Category B
@interface HookMethodInInitialize(CategoryB)
@end
/// Category C
@interface HookMethodInInitialize(CategoryC)
@end
@implementation SuperHookMethodInInitialize
+ (void)initialize {
if (self == [SuperHookMethodInInitialize self]) {
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInSuper)];
});
}
}
- (void)methodToBeHooked {
NSLog(@"%s", __FUNCTION__);
}
- (void)hookedMethodInSuper {
[self hookedMethodInSuper];
NSLog(@"%s", __FUNCTION__);
}
@end
@implementation HookMethodInInitialize
+ (void)initialize {
if (self == [HookMethodInInitialize self]) {
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInMine)];
});
}
}
- (void)hookedMethodInMine {
[self hookedMethodInMine];
NSLog(@"%s", __FUNCTION__);
}
@end
@implementation SubHookMethodInInitialize
+ (void)initialize {
if (self == [SubHookMethodInInitialize self]) {
Method currentMethod = class_getClassMethod(self, _cmd);
IMP currentMethodImp = method_getImplementation(currentMethod);
callClassMethods(self, @selector(initialize), currentMethodImp);
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInSubclass)];
});
}
}
- (void)hookedMethodInSubclass {
[self hookedMethodInSubclass];
NSLog(@"%s", __FUNCTION__);
}
@end
@implementation HookMethodInInitialize(CategoryA)
+ (void)initialize {
if (self == [HookMethodInInitialize self]) {
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInCategoryA)];
});
}
}
- (void)hookedMethodInCategoryA {
[self hookedMethodInCategoryA];
NSLog(@"%s", __FUNCTION__);
}
@end
@implementation HookMethodInInitialize(CategoryB)
+ (void)initialize {
if (self == [HookMethodInInitialize self]) {
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInCategoryB)];
});
}
}
- (void)hookedMethodInCategoryB {
[self hookedMethodInCategoryB];
NSLog(@"%s", __FUNCTION__);
}
@end
@implementation HookMethodInInitialize(CategoryC)
+ (void)initialize {
if (self == [HookMethodInInitialize self]) {
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInCategoryC)];
});
}
}
- (void)hookedMethodInCategoryC {
[self hookedMethodInCategoryC];
NSLog(@"%s", __FUNCTION__);
}
@end
Call the method on the subclass and see how the hooked method is invoked:
[[SubHookMethodInInitialize new] methodToBeHooked];
Run it and check the log:
2020-06-18 20:34:25.528307+0800 RuntimeLearning[7568:93450] -[SuperHookMethodInInitialize methodToBeHooked]
2020-06-18 20:34:25.528860+0800 RuntimeLearning[7568:93450] -[SuperHookMethodInInitialize hookedMethodInSuper]
2020-06-18 20:34:25.529013+0800 RuntimeLearning[7568:93450] -[HookMethodInInitialize(CategoryC) hookedMethodInCategoryC]
2020-06-18 20:34:25.529163+0800 RuntimeLearning[7568:93450] -[HookMethodInInitialize hookedMethodInMine]
2020-06-18 20:34:25.529273+0800 RuntimeLearning[7568:93450] -[HookMethodInInitialize(CategoryA) hookedMethodInCategoryA]
2020-06-18 20:34:25.551023+0800 RuntimeLearning[7568:93450] -[HookMethodInInitialize(CategoryB) hookedMethodInCategoryB]
2020-06-18 20:34:25.551186+0800 RuntimeLearning[7568:93450] -[SubHookMethodInInitialize hookedMethodInSubclass]
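The log shows that the hooks along the inheritance chain behave as expected and follow the class hierarchy. Within the class that has categories, however, the order is not class first, then categories.
This is because when the subclass triggers +initialize, the runtime first sends +initialize to the classes above it in the chain, and for a class with categories that message lands on a category's implementation. Hence the result above: CategoryC's hook was installed before the class's own.
Improving the code
If we also collect the +initialize IMP list in the category that wins message dispatch (CategoryC here) and call it there, the class is guaranteed to go before its categories: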
@implementation HookMethodInInitialize(CategoryC)
+ (void)initialize {
if (self == [HookMethodInInitialize self]) {
Method currentMethod = class_getClassMethod(self, _cmd);
IMP currentMethodImp = method_getImplementation(currentMethod);
callClassMethods(self, @selector(initialize), currentMethodImp);
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
[MethodSwizzleUtil swizzleInstanceMethodWithClass:self originalSel:@selector(methodToBeHooked) replacementSel:@selector(hookedMethodInCategoryC)];
});
}
}
@end
Run it and check the log:
2020-06-18 20:49:55.871230+0800 RuntimeLearning[7693:104497] -[SuperHookMethodInInitialize methodToBeHooked]
2020-06-18 20:49:55.871384+0800 RuntimeLearning[7693:104497] -[SuperHookMethodInInitialize hookedMethodInSuper]
2020-06-18 20:49:55.871497+0800 RuntimeLearning[7693:104497] -[HookMethodInInitialize hookedMethodInMine]
2020-06-18 20:49:55.871594+0800 RuntimeLearning[7693:104497] -[HookMethodInInitialize(CategoryA) hookedMethodInCategoryA]
2020-06-18 20:49:55.871682+0800 RuntimeLearning[7693:104497] -[HookMethodInInitialize(CategoryB) hookedMethodInCategoryB]
2020-06-18 20:49:55.871960+0800 RuntimeLearning[7693:104497] -[HookMethodInInitialize(CategoryC) hookedMethodInCategoryC]
2020-06-18 20:49:55.872050+0800 RuntimeLearning[7693:104497] -[SubHookMethodInInitialize hookedMethodInSubclass]
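The result now matches what we expected and is consistent with the behavior of +load.
4. Summary
- Reading the source showed how +load is loaded and how +initialize is called. Following the runtime's approach to +load, we collect the list of +initialize IMPs and call them at a suitable moment, which gives behavior similar to +load.
- The essential difference between +initialize and +load is that one goes through message dispatch while the other is called directly through the IMP function pointer. That is what causes the differences between them.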