Simple implementation of static interception of iOS object method calls
Several recent articles have covered optimizing app startup through binary reordering. All of these schemes first record which functions are called, and then reorder the code according to how frequently each function is called.
Among these function calls, Objective-C (OC) method calls are the most numerous. They can be recorded at runtime by using a third-party library such as fishhook to hook every objc_msgSend call, or by static instrumentation applied after compilation and before linking.
Static instrumentation is generally implemented in one of the following two ways:
Implement code instrumentation with the help of LLVM syntax tree analysis.
Compile the source code into a static library and implement code instrumentation by modifying the code section of the .o object file in the static library.
Both of these methods are complicated to implement. You either need to understand LLVM, or you must be familiar with the machine code in object files and with how symbol tables work.
This article introduces a third static hook scheme. It also relies on static libraries, and it hooks the objc_msgSend function so that OC method calls are instrumented after compilation and before linking.
The principle of this solution is very simple. A static library is just an intermediate product of the build. Every external symbol referenced by an object file in the static library is stored in a string table, and each function call records only the index of the function's name in that table. The actual call target is resolved from the symbol name at link time. We can therefore replace objc_msgSend in the string tables of all static libraries with another string of the same length: hook_msgSend (the name is arbitrary as long as it has the same length and is unique). We then implement a function named hook_msgSend in the main project source code. This function must have the same signature as objc_msgSend, so that every objc_msgSend call in the static libraries becomes a hook_msgSend call at link time.
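To illustrate why the replacement name must have the same length, here is a minimal Python sketch. The string table contents and the helper name_at are hypothetical, made up for the demonstration; the leading underscore reflects how C symbol names are stored in Mach-O object files. Because names are referenced by their byte offset in the table, a same-length replacement leaves every other offset valid.

```python
def name_at(table: bytes, offset: int) -> bytes:
    """Read the NUL-terminated name that starts at the given byte offset."""
    return table[offset:table.index(b"\0", offset)]


# Hypothetical string table contents (NUL-separated symbol names).
table = b"\0_objc_msgSend\0_NSLog\0_objc_release\0"

# Symbols refer to their names by offset, so remember two offsets up front.
msg_send_offset = table.index(b"_objc_msgSend\0")
nslog_offset = table.index(b"_NSLog\0")

# Same-length replacement. Including the trailing NUL keeps longer names
# such as _objc_msgSendSuper2 from matching.
patched = table.replace(b"_objc_msgSend\0", b"_hook_msgSend\0")

# The old offset now yields the new name, and other names have not moved.
print(name_at(patched, msg_send_offset))
print(name_at(patched, nslog_offset))
```

If the new name were longer or shorter, every name stored after it would shift and the recorded offsets would point at the wrong bytes.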
The following are the specific implementation steps:
1. Write the implementation of hook_msgSend in the main project.
The function signature of hook_msgSend must match that of objc_msgSend. The function must live in the main project code, and it must be written in assembly. Its logic is the same as the objc_msgSend hook implementations described in many other articles.
Many hook implementations of objc_msgSend are actually incomplete. To fully understand the ABI rules for function calls, please refer to: "Deep Function Calls in the Underlying iOS System"
2. Compile all other code uniformly into one or more static libraries.
The rest of the source code is compiled, grouped by function, into one or more static libraries, and the main project links against these libraries. This way of organizing code is very mature. The most common approach is to use CocoaPods, a dependency integration tool, so we will not go into the details here.
3. Add Run Script to Build Phases of the main project.
We need to make sure that this script runs before any of the static libraries are linked, so it can be placed directly below the Compile Sources phase.
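As a rough illustration of what such a Run Script phase might invoke, the following Python sketch gathers the static library paths that a symbol replacement program would receive as input. It assumes the libraries are built into the directory named by Xcode's BUILT_PRODUCTS_DIR environment variable. The function name find_static_libraries is hypothetical, and a real project may keep its .a files elsewhere.

```python
import os
from pathlib import Path


def find_static_libraries(root: str) -> list:
    """Return the sorted paths of all .a files found under root."""
    return sorted(str(p) for p in Path(root).rglob("*.a"))


if __name__ == "__main__":
    # Assumption: Xcode exports BUILT_PRODUCTS_DIR to Run Script phases;
    # fall back to the current directory when run outside Xcode.
    products_dir = os.environ.get("BUILT_PRODUCTS_DIR", ".")
    for lib in find_static_libraries(products_dir):
        print(lib)
```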
4. Run Script to implement static library symbol substitution.
This is the most critical step. We write a symbol replacement program and run it from the Run Script phase. Its input parameters are the paths of all static libraries linked by the main project. There is no restriction on how the program is written: you can use Ruby, Python, or C. Whichever you choose, you first need to understand the file structure of a static library (.a file), which is described in the article "Dive into the Static Library of the iOS System". Once you understand that structure, the symbol replacement program needs to do the following:
A) Open the static library .a file.
B) Find the part of the string table defined in the .a file. The description of the string table is as follows:
int size;        // The size of the string table
char strings[];  // The contents of the string table; each string is terminated by \0
The string table consists of strings separated by \0, and these strings are the names of all the external and internal symbols referenced by the object file.
C) Replace the objc_msgSend string in the string table with the hook_msgSend string.
D) Save and close the static library .a file.
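Steps A to D above can be sketched in Python as follows. This is a simplified sketch, not a full implementation: instead of locating each object file's string table, it scans the whole .a file for the NUL-delimited name and relies on the same-length replacement to keep every offset valid. That shortcut could in principle also touch an identical byte sequence outside a string table. The function name patch_static_library is hypothetical, and the leading underscore is how C symbol names are stored in Mach-O files.

```python
import re
import sys

OLD = b"_objc_msgSend"
NEW = b"_hook_msgSend"


def patch_static_library(path: str, old: bytes = OLD, new: bytes = NEW) -> int:
    """Replace each NUL-delimited occurrence of old with new; return the count."""
    if len(old) != len(new):
        raise ValueError("replacement name must have the same length")

    # A) Open the static library and read it as raw bytes.
    with open(path, "rb") as f:
        data = f.read()

    # B) Match only complete names: NUL before and after, so names such as
    #    _objc_msgSendSuper2 are left alone.
    pattern = re.compile(rb"(?<=\0)" + re.escape(old) + rb"(?=\0)")

    # C) Substitute the new name for every match.
    patched, count = pattern.subn(lambda m: new, data)

    # D) Write the file back only if something changed.
    if count:
        with open(path, "wb") as f:
            f.write(patched)
    return count


if __name__ == "__main__":
    # Each command-line argument is the path of one static library.
    for lib_path in sys.argv[1:]:
        print(lib_path, patch_static_library(lib_path))
```

Because the script rewrites the libraries in place, running it a second time on the same file finds nothing left to replace and returns 0.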
5. Compile, link and run your main project program.
The advantage of the static hook method introduced in this article is that we do not have to hook all OC method calls. Instead, we can selectively intercept method calls for specific objects and classes, which avoids the function call storm that replacing objc_msgSend at runtime can cause. This technique can therefore be applied not only to gathering statistics for code reordering, but also to other monitoring and statistical tasks. In addition, the method is not limited to hooking objc_msgSend: it can hook any other function as well, so the technique can also be applied in other areas.