Detailed explanation of Android Native memory leak detection solution

Author | yeconglu

A complete Android Native memory leak detection tool consists of three main parts: proxy implementation, stack backtrace, and cache management. The proxy implementation is what gives the tool access to memory management calls on the Android platform, and stack backtrace is the part that most affects performance and stability.

This article covers Native memory leak monitoring in three parts:

  • Proxy implementation: the methods, advantages, and disadvantages of three schemes: Inline Hook, PLT/GOT Hook, and LD_PRELOAD.
  • Leak detection: the basic idea behind detecting Android Native memory leaks, with sample code that includes the caching logic.
  • Stack acquisition: how to obtain the Android Native stack so that the call stack can be recorded when memory is allocated.

1. Implementation of proxy memory management function

First, let's introduce three solutions for implementing proxy memory management functions:

  • Inline Hook
  • PLT/GOT Hook
  • LD_PRELOAD

1. Native Hook

(1) Solution comparison: Inline Hook and PLT/GOT Hook

There are currently two main Native Hook solutions: Inline Hook and PLT/GOT Hook.

Inline Hook depends on instruction relocation. When a program is compiled, the address at which it will be loaded is unknown, so compiled code often refers to other locations relative to the current instruction (for example, "branch 64 bytes ahead"). Such an instruction only works at the address it was compiled for: if it is copied somewhere else, the same offset points to a different place. Instruction relocation means adjusting these operands so that a moved instruction still refers to its original target.

When performing Inline Hook, the first instructions of the target function are overwritten with a jump, and the original instructions are moved elsewhere so that they can still be executed. If any of them uses a relative address, running it from the new location would make the program jump to, or read from, the wrong place. Instruction relocation is therefore required to ensure that the moved instructions behave as they did before.

(2) Example: Hooking the malloc function in an Android application

To better understand the application scenarios of Native Hook, let's look at a practical case: hooking the malloc function in an Android application to monitor memory allocation.

① Inline Hook Implementation

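    #include <assert.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <dlfcn.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <android/log.h>

    #define TAG "NativeHook"
    #define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, TAG, __VA_ARGS__)

    // Schematic only: 0x01 stands in for a real, architecture-specific branch opcode.
    #define JUMP_OPCODE 0x01

    typedef void* (*orig_malloc_func_type)(size_t size);

    orig_malloc_func_type orig_malloc;
    unsigned char backup[8]; // holds the original machine code

    // The "opcode byte + pointer" layout used below fits in 8 bytes only on a 32-bit target.
    static_assert(1 + sizeof(void *) <= sizeof(backup), "placeholder jump layout assumes a 32-bit target");

    void* my_malloc(size_t size) {
        // Note: if the logger itself allocates memory, this call re-enters my_malloc.
        LOGD("Memory allocated: %zu bytes", size);

        // Create a new function pointer, orig_malloc_with_backup, that points to a new memory area
        // (rebuilt on every call to keep the example short)
        void *orig_malloc_with_backup = mmap(NULL, sizeof(backup) + 8,
                                             PROT_READ | PROT_WRITE | PROT_EXEC,
                                             MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        if (orig_malloc_with_backup == MAP_FAILED) {
            return NULL;
        }

        // Copy the backed-up instructions to the new memory area
        memcpy(orig_malloc_with_backup, backup, sizeof(backup));

        // Append a jump so that execution returns to the remainder of the original malloc
        unsigned char *jump = (unsigned char *)orig_malloc_with_backup + sizeof(backup);
        void *resume_addr = (unsigned char *)orig_malloc + sizeof(backup); // jump target
        jump[0] = JUMP_OPCODE;
        memcpy(jump + 1, &resume_addr, sizeof(resume_addr));

        // Call through the orig_malloc_with_backup function pointer
        orig_malloc_func_type orig_malloc_with_backup_func_ptr = (orig_malloc_func_type)orig_malloc_with_backup;
        void *result = orig_malloc_with_backup_func_ptr(size);

        // Release the temporary memory area
        munmap(orig_malloc_with_backup, sizeof(backup) + 8);
        return result;
    }

    void *get_function_address(const char *func_name) {
        void *handle = dlopen("libc.so", RTLD_NOW);
        if (!handle) {
            LOGD("Error: %s", dlerror());
            return NULL;
        }
        void *func_addr = dlsym(handle, func_name);
        dlclose(handle);
        return func_addr;
    }

    void inline_hook() {
        void *orig_func_addr = get_function_address("malloc");
        if (orig_func_addr == NULL) {
            LOGD("Error: cannot find the address of the 'malloc' function");
            return;
        }

        // Remember the original function
        orig_malloc = (orig_malloc_func_type)orig_func_addr;

        // Back up the original machine code
        memcpy(backup, orig_func_addr, sizeof(backup));

        // Change the page protection (real code should check the return value)
        size_t page_size = sysconf(_SC_PAGESIZE);
        uintptr_t page_start = (uintptr_t)orig_func_addr & (~(page_size - 1));
        mprotect((void *)page_start, page_size, PROT_READ | PROT_WRITE | PROT_EXEC);

        // Build the jump instruction
        unsigned char jump[8] = {0};
        void *hook_addr = (void *)my_malloc; // address of our hook function
        jump[0] = JUMP_OPCODE;
        memcpy(jump + 1, &hook_addr, sizeof(hook_addr));

        // Write the jump instruction to the entry point of the target function
        memcpy(orig_func_addr, jump, sizeof(jump));

        // Make sure the CPU does not keep executing stale instructions
        __builtin___clear_cache((char *)orig_func_addr, (char *)orig_func_addr + sizeof(jump));
    }

    void unhook() {
        void *orig_func_addr = get_function_address("malloc");
        if (orig_func_addr == NULL) {
            LOGD("Error: cannot find the address of the 'malloc' function");
            return;
        }

        // Change the page protection
        size_t page_size = sysconf(_SC_PAGESIZE);
        uintptr_t page_start = (uintptr_t)orig_func_addr & (~(page_size - 1));
        mprotect((void *)page_start, page_size, PROT_READ | PROT_WRITE | PROT_EXEC);

        // Write the backed-up machine code back to the entry point of the target function
        memcpy(orig_func_addr, backup, sizeof(backup));
        __builtin___clear_cache((char *)orig_func_addr, (char *)orig_func_addr + sizeof(backup));
    }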

In my_malloc, we need to execute the backed-up instructions first, and then jump back to the remainder of the original malloc function:

  • Create a new function pointer, orig_malloc_with_backup, that points to a new memory area holding the backed-up instructions and a jump instruction.
  • Copy the backed-up instructions to the new memory area.
  • Add a jump instruction at the end of the new memory area so that the execution flow jumps back to the remainder of the original malloc function.
  • Call the orig_malloc_with_backup function pointer.

There are three difficulties here, which are explained in detail below:

● How to modify the protection attributes of memory pages

The expression orig_func_addr & (~(page_size - 1)) yields the starting address of the memory page that contains orig_func_addr. The trick is that page_size is always a power of 2, so in the binary representation of page_size - 1 the low bits are all 1 and the high bits are all 0. After inversion, the low bits are all 0 and the high bits are all 1. ANDing orig_func_addr with ~(page_size - 1) therefore clears the low bits of orig_func_addr, which gives the starting address of the memory page.
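The masking trick can be checked in isolation. The following is a minimal sketch; the helper names page_floor and page_span are illustrative assumptions and do not appear in the original code. page_span additionally shows how many bytes have to be covered when the patched range crosses a page boundary, a case the single-page example above does not handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Start address of the page that contains addr.
// page_size must be a power of two for the mask to be valid.
inline std::uintptr_t page_floor(std::uintptr_t addr, std::size_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(static_cast<std::uintptr_t>(page_size) - 1);
}

// Number of bytes, counted from page_floor(addr), needed to cover
// every page that overlaps the range [addr, addr + len).
inline std::size_t page_span(std::uintptr_t addr, std::size_t len, std::size_t page_size) {
    const std::uintptr_t start = page_floor(addr, page_size);
    const std::uintptr_t end = page_floor(addr + len + page_size - 1, page_size);  // round up
    return static_cast<std::size_t>(end - start);
}
```

With 4096-byte pages, the address 0x12345678 is rounded down to 0x12345000, and an 8-byte patch that starts 4 bytes before a page boundary spans two pages.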

The line mprotect((void *)page_start, page_size, PROT_READ | PROT_WRITE | PROT_EXEC); modifies the protection attributes of a memory page. mprotect takes three parameters: the starting address of the memory area to be modified, the size of the memory area, and the new protection attributes. Here, we set the protection attributes of the memory page containing orig_func_addr to readable, writable, and executable (PROT_READ | PROT_WRITE | PROT_EXEC) so that we can modify the code in this page. mprotect returns -1 on failure, and some systems refuse pages that are writable and executable at the same time, so real code should check the return value.
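As a hedged sketch of a more defensive variant, the helper below (set_protection is an assumed name, not part of the original code) rounds the range out to whole pages, so that a patch crossing a page boundary is covered, and reports failure instead of ignoring it. The protection flags are left to the caller.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Applies `prot` to every page that overlaps [addr, addr + len).
// Returns true on success, false if the range is empty or mprotect fails.
bool set_protection(void* addr, std::size_t len, int prot) {
    const long raw_page_size = sysconf(_SC_PAGESIZE);
    assert(raw_page_size > 0);
    if (addr == nullptr || len == 0) {
        return false;
    }
    const std::uintptr_t page_size = static_cast<std::uintptr_t>(raw_page_size);
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(addr);
    const std::uintptr_t start = begin & ~(page_size - 1);                        // round down
    const std::uintptr_t end = (begin + len + page_size - 1) & ~(page_size - 1);  // round up
    return mprotect(reinterpret_cast<void*>(start), end - start, prot) == 0;
}
```

For a hook, the call would look like set_protection(orig_func_addr, sizeof(backup), PROT_READ | PROT_WRITE | PROT_EXEC), followed by an early return if it reports failure.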

● How to restore the original function

To restore the original function, we need to save the original machine code before Hook, and then write the saved machine code back to the entry point of the function when we need to restore it.

The backup array in the code is used to save the original machine code. In the inline_hook function, we copy the original machine code to the backup array before modifying the machine code. Then, we provide an unhook function to restore the original machine code. When you need to restore the malloc function, you can call the unhook function.
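The backup-then-restore pattern can be tried safely on an ordinary byte buffer instead of live code. This is only an illustration under that assumption; PatchRecord, apply_patch, and restore_patch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Remembers the bytes that a patch overwrote so they can be written back later.
struct PatchRecord {
    unsigned char* target = nullptr;
    std::vector<unsigned char> original;
};

// Saves `len` bytes at `target`, then overwrites them with `patch`.
PatchRecord apply_patch(unsigned char* target, const unsigned char* patch, std::size_t len) {
    assert(target != nullptr && patch != nullptr);
    PatchRecord record;
    record.target = target;
    record.original.assign(target, target + len);  // back up first
    std::memcpy(target, patch, len);               // then overwrite
    return record;
}

// Writes the saved bytes back, undoing apply_patch.
void restore_patch(const PatchRecord& record) {
    if (record.target != nullptr && !record.original.empty()) {
        std::memcpy(record.target, record.original.data(), record.original.size());
    }
}
```

Keeping the length inside the record avoids the mismatch that can occur when the backup size and the patch size are maintained separately.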

It should be noted that this example assumes that the machine code length of the function entry point is 8 bytes. In actual use, you need to determine the length of the machine code according to the actual situation and adjust the size of the backup array and the parameters of the memcpy function accordingly.

● How to implement instruction relocation

We take a simple ARM64 assembly code as an example to demonstrate how to relocate instructions. Assume that we have the following target function:

    TargetFunction:
        mov x29, sp
        sub sp, sp, #0x10
        ; ... other instructions ...
        bl SomeFunction
        ; ... other instructions ...
        b TargetFunctionEnd

We need to insert a jump instruction at the beginning of TargetFunction to redirect the execution flow to our HookFunction. To achieve this goal, we need to do the following:

  • Back up the overwritten instructions: The instructions at the beginning of TargetFunction will be overwritten by our jump, so they must be saved first. Every ARM64 instruction is 4 bytes long, so a single b overwrites only mov x29, sp; a longer jump sequence (needed when HookFunction is outside the range of b, roughly ±128 MB) overwrites more. In this example, we assume that both mov x29, sp and sub sp, sp, #0x10 are overwritten, and back up both.
  • Insert jump instruction: Insert a jump instruction to HookFunction at the beginning of TargetFunction. In ARM64 assembly, we can use the b instruction to achieve this goal:
 b HookFunction
  • Handle the overwritten instructions: In HookFunction, we need to execute the overwritten instructions. In this example, we need to execute two instructions in HookFunction: mov x29, sp and sub sp, sp, #0x10.
  • Relocate jumps and data references: Any overwritten instruction that uses a PC-relative operand (a branch such as b or bl, or an address computation such as adr or adrp) must have its operand recalculated for its new address before it is executed from there. In this example, the two overwritten instructions do not depend on their own address, so they can be copied unchanged. Instructions that are not overwritten, such as bl SomeFunction and b TargetFunctionEnd, stay where they are and need no adjustment.
  • Return to the target function: After executing the overwritten instructions and other custom operations in HookFunction, we need to return to the unmodified part of the target function. In this example, we add a jump instruction at the end of HookFunction that jumps back to the first instruction after the overwritten ones in TargetFunction.

After the above steps, we have inserted a jump to HookFunction at the start of TargetFunction and taken care of any address-dependent instructions that were moved. When execution reaches TargetFunction, the program jumps to HookFunction, executes the overwritten instructions and the custom operations, and then returns to the unmodified part of the target function.
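To make the relocation step concrete, the sketch below re-encodes an ARM64 b or bl instruction that has been moved to a new address so that it still reaches its original target. It only manipulates instruction words as integers, so it runs on any host. The function names are illustrative assumptions; the encoding used (a 6-bit opcode followed by a signed 26-bit word offset) is the standard ARM64 format for these two instructions.

```cpp
#include <cassert>
#include <cstdint>

// True if insn is an ARM64 B (0x14000000 | imm26) or BL (0x94000000 | imm26).
inline bool is_b_or_bl(std::uint32_t insn) {
    return (insn & 0x7C000000u) == 0x14000000u;
}

// Absolute target of a B/BL instruction located at address pc.
inline std::uint64_t branch_target(std::uint32_t insn, std::uint64_t pc) {
    std::int64_t imm26 = static_cast<std::int64_t>(insn & 0x03FFFFFFu);
    if (imm26 & 0x02000000) {
        imm26 -= 0x04000000;  // sign-extend the 26-bit field
    }
    return pc + static_cast<std::uint64_t>(imm26 * 4);  // offset is counted in instructions
}

// Re-encodes a B/BL that moved from old_pc to new_pc so that it keeps its target.
// Returns false when the target is out of range (about +/-128 MiB) from new_pc.
inline bool relocate_branch(std::uint32_t insn, std::uint64_t old_pc,
                            std::uint64_t new_pc, std::uint32_t* out) {
    assert(is_b_or_bl(insn) && out != nullptr);
    const std::int64_t delta =
        static_cast<std::int64_t>(branch_target(insn, old_pc) - new_pc);
    const std::int64_t limit = std::int64_t{1} << 27;
    if (delta % 4 != 0 || delta < -limit || delta >= limit) {
        return false;
    }
    const std::uint32_t imm26 = static_cast<std::uint32_t>(delta / 4) & 0x03FFFFFFu;
    *out = (insn & 0xFC000000u) | imm26;  // keep the opcode, replace the offset
    return true;
}
```

When relocate_branch reports that the target is out of range, a hooking library has to fall back to a longer sequence that loads the absolute address into a register; that case is not shown here.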

② PLT/GOT Hook Implementation

PLT (Procedure Linkage Table) and GOT (Global Offset Table) are two tables used to resolve dynamic symbols in shared libraries under Linux.

PLT (Procedure Linkage Table): holds a small entry (stub) for each function imported from a dynamic link library. When the program calls such a function, it first jumps to the corresponding entry in the PLT, which then finds the actual function address through the GOT and jumps to it.

GOT (Global Offset Table): stores the actual addresses of functions and variables in dynamic link libraries. At run time, the dynamic linker fills these addresses into the GOT as needed, and the PLT entries read them from there.

In PLT/GOT Hook, we can modify the function address in GOT so that when the program calls a certain function, it actually calls our custom function. In this way, we can add additional logic (such as detecting memory leaks) in the custom function and then call the original function. This method can achieve non-intrusive modification of the program without recompiling the program.

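    #include <stdio.h>
    #include <dlfcn.h>
    #include <unistd.h>
    #include <android/log.h>

    #define TAG "NativeHook"
    #define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, TAG, __VA_ARGS__)

    typedef void* (*orig_malloc_func_type)(size_t size);

    orig_malloc_func_type orig_malloc;

    void* my_malloc(size_t size) {
        LOGD("Memory allocated: %zu bytes", size);
        return orig_malloc(size);
    }

    // Hypothetical helper, not shown here: returns the address of the GOT slot
    // that the hooked library uses for `symbol`. A real implementation locates
    // the slot by walking that library's ELF relocation entries. dlsym() cannot
    // do this, because it returns the address of the function itself.
    void **find_got_slot(const char *symbol);

    void plt_got_hook() {
        void **got_func_addr = find_got_slot("malloc");
        if (got_func_addr == NULL) {
            LOGD("Error: Cannot find the GOT entry of 'malloc' function");
            return;
        }

        // Backup the original function
        orig_malloc = (orig_malloc_func_type)dlsym(RTLD_DEFAULT, "malloc");
        if (orig_malloc == NULL) {
            LOGD("Error: Cannot find the 'malloc' function");
            return;
        }

        // Replace the GOT entry with the address of our hook function
        // (the page that holds the GOT may first have to be made writable)
        *got_func_addr = (void *)my_malloc;
    }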

RTLD_DEFAULT in the above code is a special handle value. When it is passed as the handle parameter of dlsym(), dlsym() searches for the specified symbol in all dynamic link libraries loaded by the current process, not just in one specific library.
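The indirection that a GOT hook exploits can be modeled in portable code, with a plain function pointer standing in for a single GOT entry. This is only a model under that assumption: it does not touch a real GOT, and all names in it (g_slot, install_hook, and so on) are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

using malloc_fn = void* (*)(std::size_t);

// Stand-in for one GOT entry: every call site reads the target from here.
static malloc_fn g_slot = std::malloc;
static malloc_fn g_original = nullptr;
static std::size_t g_bytes_requested = 0;

// Hook body: records the request, then forwards to the saved original.
static void* counting_malloc(std::size_t size) {
    assert(g_original != nullptr);
    g_bytes_requested += size;
    return g_original(size);
}

// Installing the hook means saving the slot's value and overwriting it.
void install_hook() {
    if (g_original == nullptr) {
        g_original = g_slot;
        g_slot = counting_malloc;
    }
}

// Removing the hook writes the saved value back into the slot.
void remove_hook() {
    if (g_original != nullptr) {
        g_slot = g_original;
        g_original = nullptr;
    }
}

// A call site that goes through the slot, as a PLT stub does.
void* call_through_slot(std::size_t size) {
    return g_slot(size);
}
```

No code is modified at any point: only a pointer changes, which is why this style of hook needs no instruction relocation. The trade-off is that only calls routed through that particular slot are intercepted.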

(3) Differences between Inline Hook and GOT Hook

The key point is which address each of the two Native Hook implementations writes to:

① Inline Hook

    void *get_function_address(const char *func_name) {
        void *handle = dlopen("libc.so", RTLD_NOW);
        ...
        void *func_addr = dlsym(handle, func_name);
        dlclose(handle);
        return func_addr;
    }

    void *orig_func_addr = get_function_address("malloc");
    memcpy(orig_func_addr, jump, sizeof(jump));

The address returned by dlsym is the actual address of the function in memory, that is, its entry point (the first instruction of the function). Inline Hook overwrites the instructions at this address.

② GOT Hook

    void **got_func_addr = find_got_slot("malloc"); // hypothetical helper that locates the GOT slot
    *got_func_addr = my_malloc;

A GOT Hook writes to the GOT slot that holds the address of malloc, which is why got_func_addr is a double pointer (void **). Note that dlsym does not return this slot: it returns the entry point of the function here as well, and writing through that address would corrupt the function's code. The slot has to be located separately, typically by parsing the relocation entries of the library whose calls are to be intercepted.

2. Using LD_PRELOAD

Using LD_PRELOAD, you can override memory management functions without modifying the source code. Although this method has many limitations on the Android platform, the underlying principle is still worth understanding.

LD_PRELOAD is an environment variable used to preload dynamic link libraries when a program starts. By setting LD_PRELOAD, we can force a specified library to be loaded, thereby changing the behavior of the program without modifying its source code. This method is often used for debugging, performance analysis, and memory leak detection.

The principle and method of using LD_PRELOAD to detect memory leaks are as follows:

(1) Principle

When the LD_PRELOAD environment variable is set, the program loads the specified library before any other library. This allows the custom library to override functions from the original library (such as glibc). For memory leak detection, we override the memory allocation and release functions (such as malloc, calloc, realloc, and free) so that relevant information is recorded whenever memory is allocated or released.

(2) Methods:

  • Create a custom library: First, create a custom memory leak detection library that overrides the memory allocation and release functions. Each override calls the original memory management function. On allocation, it adds the memory block and its related information (such as allocation size and call stack) to a global memory allocation table; on release, it deletes the corresponding block from the table.
  • Set the LD_PRELOAD environment variable: Before running the program, set LD_PRELOAD to the path of the custom library. The program then loads the custom library first and uses the overridden memory management functions.
  • Run the program: While the program runs, the overridden functions record information about memory allocation and release. Memory leaks are detected by checking which memory blocks still exist in the global memory allocation table during or after the run.

In short, LD_PRELOAD lets us change the behavior of the program dynamically, record every allocation and release, and use that record to detect memory leaks and locate their source.

3. Summary

Finally, we summarize the advantages and disadvantages of the three proxy implementation methods in this section in a table:

2. Detecting Native memory leaks

In this section, we will introduce the overall idea of detecting Native layer memory leaks based on the proxy implementation of PLT/GOT Hook.

1. Principle Introduction

In Android, to detect memory leaks in the Native layer, you can override memory allocation and release functions such as malloc, calloc, realloc, and free so that relevant information is recorded each time memory is allocated and released. For example, we can create a global memory allocation table to store all allocated memory blocks and their metadata (such as allocation size and allocation location). When a block is released, its entry is deleted from the table. Checking the table periodically then reveals the blocks that have not been released.
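The bookkeeping described above can be sketched independently of any hooking mechanism. AllocationTable and its members are assumed names for illustration; the `site` string stands in for whatever location information (such as a call stack) a real tool would store.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

struct AllocationInfo {
    std::size_t size;
    const char* site;  // where the block was allocated (illustrative)
};

class AllocationTable {
public:
    // Called after a successful allocation.
    void on_alloc(void* ptr, std::size_t size, const char* site) {
        if (ptr == nullptr) {
            return;
        }
        std::lock_guard<std::mutex> lock(mutex_);
        const bool inserted = live_.emplace(ptr, AllocationInfo{size, site}).second;
        assert(inserted);  // an allocator never hands out an address that is still live
        (void)inserted;
    }

    // Called before the block is released.
    void on_free(void* ptr) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_.erase(ptr);
    }

    // Blocks that were allocated but not yet released.
    std::size_t live_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.size();
    }

    // Total size of all blocks that are still live.
    std::size_t live_bytes() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t total = 0;
        for (const auto& entry : live_) {
            total += entry.second.size;
        }
        return total;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<void*, AllocationInfo> live_;
};
```

A non-zero live_count() at a checkpoint does not prove a leak by itself, since blocks may simply still be in use; comparing snapshots taken at different times is one way to separate growth from steady state.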

2. Code Examples

The main technique in the following code is to override the memory management functions and to reference the original functions through weak symbols. This allows relevant information to be recorded on every allocation and release, while the original functions are found and called dynamically at run time.

Here is a code example:

    #include <cstdlib>
    #include <cstdio>
    #include <map>
    #include <mutex>
    #include <dlfcn.h>
    #include <vector>
    #include <android/log.h>

    #define TAG "CheckMemoryLeaks"
    #define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, TAG, __VA_ARGS__)

    // Global memory allocation table: stores each allocated block and its metadata (size, call stack)
    std::map<void*, std::pair<size_t, std::vector<void*>>> g_memoryAllocations;
    std::mutex g_memoryAllocationsMutex;

    // The table and the logger allocate memory themselves. Without this per-thread
    // flag, the hooks would re-enter themselves and deadlock on the mutex.
    thread_local bool t_inTracker = false;

    // Weak symbols that reference the original memory management functions
    extern "C" void* __libc_malloc(size_t size) __attribute__((weak));
    extern "C" void __libc_free(void* ptr) __attribute__((weak));
    extern "C" void* __libc_realloc(void *ptr, size_t size) __attribute__((weak));
    extern "C" void* __libc_calloc(size_t nmemb, size_t size) __attribute__((weak));

    void* (*lt_malloc)(size_t size);
    void (*lt_free)(void* ptr);
    void* (*lt_realloc)(void *ptr, size_t size);
    void* (*lt_calloc)(size_t nmemb, size_t size);

    #define LT_MALLOC (*lt_malloc)
    #define LT_FREE (*lt_free)
    #define LT_REALLOC (*lt_realloc)
    #define LT_CALLOC (*lt_calloc)

    // Records the call stack at allocation time (the implementation is the topic of section 3)
    std::vector<void*> record_call_stack() {
        // ...
        return {};
    }

    // Resolves the original memory management functions; if a weak symbol is
    // undefined, the address is obtained with dlsym instead
    void init_original_functions() {
        if (!lt_malloc) {
            if (__libc_malloc) {
                lt_malloc = __libc_malloc;
            } else {
                lt_malloc = (void*(*)(size_t))dlsym(RTLD_NEXT, "malloc");
            }
        }
        // calloc, realloc and free are resolved in the same way...
    }

    // Override malloc
    extern "C" void* malloc(size_t size) {
        init_original_functions();

        // Call the original malloc
        void* ptr = LT_MALLOC(size);
        if (ptr == nullptr || t_inTracker) {
            return ptr; // nothing to record, or an allocation made by the tracker itself
        }

        t_inTracker = true;
        {
            // Record the call stack and add the new block to the global table
            std::vector<void*> call_stack = record_call_stack();
            std::unique_lock<std::mutex> lock(g_memoryAllocationsMutex);
            g_memoryAllocations[ptr] = std::make_pair(size, call_stack);
        }
        t_inTracker = false;
        return ptr;
    }

    // Override calloc
    extern "C" void* calloc(size_t nmemb, size_t size) {
        // Same structure as malloc
        // ...
    }

    // Override realloc
    extern "C" void* realloc(void* ptr, size_t size) {
        init_original_functions();

        // Call the original realloc
        void* newPtr = LT_REALLOC(ptr, size);
        if (t_inTracker) {
            return newPtr;
        }
        if (newPtr == nullptr && size != 0) {
            return nullptr; // realloc failed: the old block is still valid and stays in the table
        }

        t_inTracker = true;
        {
            // Replace the old entry with the new block and its metadata
            std::vector<void*> call_stack = record_call_stack();
            std::unique_lock<std::mutex> lock(g_memoryAllocationsMutex);
            g_memoryAllocations.erase(ptr);
            if (newPtr != nullptr) {
                g_memoryAllocations[newPtr] = std::make_pair(size, call_stack);
            }
        }
        t_inTracker = false;
        return newPtr;
    }

    // Override free
    extern "C" void free(void* ptr) {
        init_original_functions();

        if (ptr != nullptr && !t_inTracker) {
            t_inTracker = true;
            {
                // Remove the released block from the global table
                std::unique_lock<std::mutex> lock(g_memoryAllocationsMutex);
                g_memoryAllocations.erase(ptr);
            }
            t_inTracker = false;
        }

        // Call the original free
        LT_FREE(ptr);
    }

    // Checks for memory leaks
    void check_memory_leaks() {
        t_inTracker = true; // logging may allocate; keep those allocations out of the table

        // The mutex protects the global table against data races between threads
        std::unique_lock<std::mutex> lock(g_memoryAllocationsMutex);

        if (g_memoryAllocations.empty()) {
            // An empty table means no leak was detected
            LOGD("No memory leaks detected.");
        } else {
            // Otherwise, print the address, size, and call stack of every block that is still live
            LOGD("Memory leaks detected:");
            for (const auto& entry : g_memoryAllocations) {
                LOGD(" Address: %p, Size: %zu bytes\n", entry.first, entry.second.first);
                LOGD(" Call stack:");
                for (void* frame : entry.second.second) {
                    LOGD(" %p\n", frame);
                }
            }
        }

        lock.unlock();
        t_inTracker = false;
    }

    int main() {
        // Resolve the original memory management functions
        init_original_functions();

        // Sample usage
        void* ptr1 = malloc(10);
        void* ptr2 = calloc(10, sizeof(int));
        void* ptr3 = malloc(20);
        ptr3 = realloc(ptr3, 30);
        free(ptr1);
        free(ptr2);
        free(ptr3);

        // Check for memory leaks
        check_memory_leaks();
        return 0;
    }

The core logic of the above code includes:

  • Rewrite memory management functions: rewrite malloc, calloc, realloc and free, add memory blocks and their information to the global memory allocation table when allocating memory, and delete the corresponding memory blocks from the table when releasing memory.
  • Weak symbols reference original memory management functions: Use __attribute__((weak)) to define four weak symbols to reference memory management functions in glibc/eglibc. Check the weak symbol definition in the init_original_functions function. If it is not defined, use the dlsym function to find the original memory management function.
  • Global memory allocation table: Define the global memory allocation table to store all allocated memory blocks and their information. The table is a map, the key is the memory block address, and the value is a pair, including the memory block size and call stack.
  • Call stack recording: Recording the current call stack when allocating memory helps to find the source of the leak when detecting memory leaks.
  • Memory leak detection: Define the check_memory_leaks function to check for memory blocks that still exist in the global memory allocation table, indicating a memory leak.

(1) Use weak symbols: Prevent calls to the dlsym function from causing infinite recursion

The dlsym function is used to find symbols in a dynamic link library. However, in glibc and eglibc, the dlsym function may call the calloc function internally. If we are redefining the calloc function and calling the dlsym function in the calloc function to get the original calloc function, infinite recursion will occur.

Functions such as __libc_calloc are declared as weak symbols to avoid conflicts with the strong symbol definitions of these functions in glibc or eglibc. Then in the init_original_functions function, we check whether functions such as __libc_calloc are nullptr. If so, it means that glibc or eglibc has not defined these functions, so use the dlsym function to get the addresses of these functions. If not, it means that glibc or eglibc has already defined these functions, so use those definitions directly.
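The null check on a weak symbol can be demonstrated on its own. In this sketch, lt_demo_internal_malloc is a hypothetical symbol that no library is assumed to define, and std::malloc plays the role that dlsym plays in the article's code. The example assumes a GCC- or Clang-compatible toolchain on an ELF platform, where an undefined weak reference resolves to a null address instead of causing a link error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

using malloc_fn = void* (*)(std::size_t);

// Hypothetical symbol: declared weak, so the program still links when nothing defines it.
extern "C" void* lt_demo_internal_malloc(std::size_t size) __attribute__((weak));

// Prefers the weak symbol when some library provides it, otherwise uses the fallback.
malloc_fn resolve_malloc(malloc_fn fallback) {
    assert(fallback != nullptr);
    if (lt_demo_internal_malloc != nullptr) {
        return lt_demo_internal_malloc;  // defined somewhere: use it directly
    }
    return fallback;                     // undefined: the address is null
}
```

Because the check happens at run time, the same binary works both with libraries that export the symbol and with libraries that do not.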

(2) Explanation of RTLD_NEXT

RTLD_NEXT is a special "pseudo handle" that tells dlsym to find the next occurrence of a symbol, searching the libraries that come after the current one in the search order. It is often used to find and call the original function that has been overridden or intercepted.

In Linux, if a program links multiple dynamic link libraries, and there are multiple functions with the same name defined in these libraries, then by default, the program will use the first function found. But sometimes, we may need to overwrite a function in another library in one library, and at the same time need to call the original function. At this time, you can use RTLD_NEXT.

dlsym(RTLD_NEXT, "malloc") will find the next symbol named "malloc", which is the original malloc function. Then we can call the original malloc function in the custom malloc function.

(3) Notes

Detecting memory leaks may increase the runtime overhead of the program and may cause some problems related to thread safety. When using this method, we need to ensure that the code is thread-safe and perform memory leak detection without affecting program performance. At the same time, manual detection of memory leaks may not find all memory leaks, so it is recommended that you use other tools (such as AddressSanitizer, LeakSanitizer or Valgrind) to assist in detecting memory leaks.

3. Get the Android Native stack

You may have noticed that in the implementation of Native memory leak detection in the second part, the implementation of record_call_stack was omitted. So we are left with a question: how to record the call stack when allocating memory? In the last section, we will explain how to obtain the Android Native stack.

1. Use the unwind function

(1) Tools and methods

On Android, we cannot rely on the backtrace_symbols function, because Android's Bionic libc has historically not implemented it. However, we can use the dladdr function instead of backtrace_symbols to get symbol information.

Android NDK provides the unwind.h header file, which declares the _Unwind_Backtrace function. It can be used to obtain the stack information of the current thread.

(2) Get the stack information of the current thread

If we need to obtain the stack information of the current thread, we can use the unwind function in Android NDK. The following is a sample code for using the unwind function to obtain the stack information:

    #include <unwind.h>
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdint.h>

    // Holds the state of the backtrace while it is being captured
    struct BacktraceState {
        void** current;
        void** end;
    };

    // Callback invoked for every frame
    _Unwind_Reason_Code unwind_callback(struct _Unwind_Context* context, void* arg) {
        BacktraceState* state = static_cast<BacktraceState*>(arg);
        uintptr_t pc = _Unwind_GetIP(context);
        if (pc) {
            if (state->current == state->end) {
                return _URC_END_OF_STACK;
            } else {
                *state->current++ = reinterpret_cast<void*>(pc);
            }
        }
        return _URC_NO_REASON;
    }

    // Captures up to max frames into buffer and returns the number of frames stored
    int capture_backtrace(void** buffer, int max) {
        BacktraceState state = {buffer, buffer + max};
        _Unwind_Backtrace(unwind_callback, &state);
        return static_cast<int>(state.current - buffer);
    }

    // Prints the captured frames
    void print_backtrace(void** buffer, int count) {
        for (int idx = 0; idx < count; ++idx) {
            const void* addr = buffer[idx];
            const char* symbol = "";
            void* relative_addr = nullptr;

            Dl_info info;
            if (dladdr(addr, &info)) {
                if (info.dli_sname) {
                    symbol = info.dli_sname;
                }
                // Relative address = absolute address - base address of the containing library
                relative_addr = reinterpret_cast<void*>(
                    reinterpret_cast<uintptr_t>(addr) - reinterpret_cast<uintptr_t>(info.dli_fbase));
            }

            printf("%-3d %p %s (relative addr: %p)\n", idx, addr, symbol, relative_addr);
        }
    }

    int main() {
        const int max_frames = 128;
        void* buffer[max_frames];

        // Capture the backtrace, then print only the frames that were actually stored
        int count = capture_backtrace(buffer, max_frames);
        print_backtrace(buffer, count);
        return 0;
    }

In the above code, the capture_backtrace function uses the _Unwind_Backtrace function to obtain the stack information, and then we use the dladdr function to obtain the base address of the SO library where the function is located (info.dli_fbase), and then calculate the relative address of the function (relative_addr). Then when printing the stack information, the relative address of the function is also printed.
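The same interface can be used to fill in the record_call_stack function that the leak detection code in the second part left out. The following is one possible sketch, not the author's implementation: the max_depth and skip parameters and the CaptureState type are assumptions added for illustration.

```cpp
#include <unwind.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace {

struct CaptureState {
    std::vector<void*>* frames;
    std::size_t skip;       // innermost frames still to be dropped
    std::size_t max_depth;  // upper bound on stored frames
};

// Called once per frame by _Unwind_Backtrace.
_Unwind_Reason_Code on_frame(struct _Unwind_Context* context, void* arg) {
    CaptureState* state = static_cast<CaptureState*>(arg);
    if (state->frames->size() >= state->max_depth) {
        return _URC_END_OF_STACK;  // buffer is full: stop unwinding
    }
    const std::uintptr_t pc = _Unwind_GetIP(context);
    if (pc == 0) {
        return _URC_NO_REASON;
    }
    if (state->skip > 0) {
        --state->skip;             // drop the tracker's own frames
        return _URC_NO_REASON;
    }
    state->frames->push_back(reinterpret_cast<void*>(pc));
    return _URC_NO_REASON;
}

}  // namespace

// Returns up to max_depth return addresses of the current thread,
// omitting the `skip` innermost frames.
std::vector<void*> record_call_stack(std::size_t max_depth = 32, std::size_t skip = 1) {
    assert(max_depth > 0);
    std::vector<void*> frames;
    frames.reserve(max_depth);  // allocate once, before unwinding starts
    CaptureState state = {&frames, skip, max_depth};
    _Unwind_Backtrace(on_frame, &state);
    return frames;
}
```

Because std::vector allocates memory, calling this from inside an overridden malloc requires some form of re-entrancy protection; a fixed-size array avoids the allocation entirely at the cost of a hard depth limit.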

(3) libunwind related interfaces

① _Unwind_Backtrace

_Unwind_Backtrace is part of the unwinding interface declared in unwind.h and is used to obtain the call stack of the current thread. It traverses the stack frames and calls a user-defined callback function on each one; the callback can query the frame through its context (for example, to read its instruction address). The function prototype is as follows:

 _Unwind_Reason_Code _Unwind_Backtrace(_Unwind_Trace_Fn trace, void *trace_argument);

Parameters:

  • trace: callback function, which will be called on each stack frame. The callback function needs to return a value of type _Unwind_Reason_Code, indicating the execution result.
  • trace_argument: User-defined parameter passed to the callback function. Usually used to store stack information or other user data.

② _Unwind_GetIP

_Unwind_GetIP is declared in the same header and returns the instruction pointer of the stack frame described by context (for every frame except the innermost one, this is the return address into that function). Its implementation depends on the underlying hardware architecture (such as ARM, x86, etc.) and the operating system. The function prototype is as follows:

 uintptr_t _Unwind_GetIP(struct _Unwind_Context *context);

Parameters:

  • context: context information of the current stack frame. It is created by _Unwind_Backtrace and passed to the callback function on each stack frame.

Return value: an unsigned integer that holds the instruction address of the frame. This address can be used to obtain symbolic information such as the function name, source file name, and line number.

③ Availability in different Android versions

The _Unwind_Backtrace and _Unwind_GetIP functions are not part of the C library. They are provided by the unwinder runtime of the toolchain (libgcc or LLVM's libunwind, depending on the toolchain) rather than by glibc or by Bionic, the lightweight C library that Android uses. Their availability and behavior on Android therefore depend on the NDK toolchain used to build the program as well as on the Android system version.

In earlier Android versions (such as Android 4.x), unwinding support was incomplete, which may cause the _Unwind_Backtrace and _Unwind_GetIP functions to not work properly. In this case, other methods are required to obtain stack information, such as manually traversing the stack frame or using a third-party library.

Starting from Android 5.0 (Lollipop), support is more complete, and in Android 5.0 and higher these two functions can generally be used directly to obtain stack information.

Although these two functions are available in newer versions of Android, their behavior may be affected by compiler optimization and by whether unwind information is present in the binary. In actual use, we need to choose the most appropriate method according to the specific situation.

2. Manually traverse the stack frame to obtain stack information

In the Android system, the specific implementation of _Unwind_Backtrace depends on the underlying hardware architecture (such as ARM, x86, etc.) and the operating system. It normally works from unwind tables that the compiler emits into the binary (such as .eh_frame on ARM64 or .ARM.exidx on 32-bit ARM). When the code is built with frame pointers, the stack can also be traversed without these tables: on the ARM64 architecture, the Frame Pointer (FP) register and the Link Register (LR) provide everything that is needed.

If _Unwind_Backtrace is not used, we can manually traverse the stack frame to obtain the stack information.

(1) Sample code for ARM64 architecture

The following is a sample code based on the ARM64 architecture, showing how to manually traverse the stack frame using the Frame Pointer (FP) register:

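    #include <stdio.h>
    #include <stdint.h>
    #include <dlfcn.h>

    // Requires code built with frame pointers (-fno-omit-frame-pointer)
    void print_backtrace_manual() {
        uintptr_t fp = 0;

        // Read the current FP register (x29); it points to this function's frame record
        asm volatile("mov %0, x29" : "=r"(fp));

        int depth = 0;
        while (fp && depth++ < 64) {
            // A frame record holds the caller's FP at [fp] and the saved LR at [fp + 8]
            uintptr_t prev_fp = *(uintptr_t*)(fp);
            uintptr_t lr = *(uintptr_t*)(fp + 8);

            // Look up the symbol that corresponds to the return address
            Dl_info info;
            if (dladdr(reinterpret_cast<void*>(lr), &info) && info.dli_sname) {
                printf("%p %s\n", reinterpret_cast<void*>(lr), info.dli_sname);
            } else {
                printf("%p\n", reinterpret_cast<void*>(lr));
            }

            // Callers live at higher addresses; stop if the chain looks corrupted
            if (prev_fp <= fp) {
                break;
            }
            fp = prev_fp;
        }
    }

In the above code, we first read the current FP (x29) register value. Then, by following the FP chain, we read the return address saved in each frame record (the saved LR). The LR register itself (x30) is not read directly, because it is overwritten as soon as the function calls anything; reading it first and then reading the saved copy would also print the first frame twice. Finally, we use the dladdr function to get the symbol information corresponding to each return address and print the stack information.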

In this code, *(uintptr_t*)(fp) means to get the value at the memory address pointed to by fp. fp is an unsigned integer, which represents a memory address. (uintptr_t*)(fp) converts fp into a pointer, and then the * operator gets the value pointed to by the pointer.

In the ARM64 architecture, a new stack frame is created when a function is called. Each stack frame contains the function's local variables, parameters, return address, and other information related to the function call. Among them, the Frame Pointer (FP) register (x29) saves the FP register value of the previous stack frame, and the Link Register (LR) register (x30) saves the return address of the function.

In this code, the fp variable holds the address of the current frame record. *(uintptr_t*)(fp) therefore reads the caller's saved FP, which is the address of the caller's frame record, and *(uintptr_t*)(fp + 8) reads the saved return address. Assigning the first value back to fp moves the loop to the caller's frame on the next iteration.

(2) Sample code for ARM architecture

In the ARM architecture, we can use the Frame Pointer (FP) register (R11) and the Link Register (LR) register (R14) to manually traverse the stack frame. The following is an example code based on the ARM architecture, showing how to manually traverse the stack frame to obtain stack information:

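 #include <stdint.h>
 #include <stdio.h>
 #include <dlfcn.h>

 // C++ source, 32-bit ARM only. Assumes ARM-mode code built with frame
 // pointers whose frame record is {saved FP, saved LR} starting at r11.
 void print_backtrace_manual_arm() {
     uintptr_t fp = 0;

     // Read the current FP (r11) register value.
     asm volatile("mov %0, r11" : "=r"(fp));

     while (fp) {
         // Load the caller's saved FP and the saved LR from the frame record.
         uintptr_t prev_fp = *(uintptr_t*)(fp);
         uintptr_t lr = *(uintptr_t*)(fp + 4);
         if (lr == 0) break;

         // Look up the symbol that contains the return address.
         Dl_info info;
         if (dladdr(reinterpret_cast<void*>(lr), &info) && info.dli_sname) {
             printf("%p %s\n", reinterpret_cast<void*>(lr), info.dli_sname);
         } else {
             printf("%p\n", reinterpret_cast<void*>(lr));
         }

         // Move up to the caller's frame.
         fp = prev_fp;
     }
 }

This sample works the same way as the ARM64 version: we read the current FP (R11), follow the FP chain, take the return address (the saved LR value) from each frame record, and resolve it with dladdr. Because pointers are 4 bytes wide on ARM, the saved LR sits at offset 4 instead of 8. Note that the frame layout on 32-bit ARM is less uniform than on ARM64: Thumb code normally uses R7 as the frame pointer, and GCC positions the frame pointer differently from Clang, so the offsets above must match the way the code was compiled.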

From the above sample code, we can see that manually traversing the stack frames works in much the same way on different architectures; only the registers and the frame layout differ. This method obtains stack information without _Unwind_Backtrace, provided the code was compiled with frame pointers, and it also helps us understand how the call stack is organized when debugging.

(3) Registers

In the function call process, fp (Frame Pointer), lr (Link Register) and sp (Stack Pointer) are three key registers. The relationship between them is as follows:

  • fp (Frame Pointer): The frame pointer register points to the base of the current stack frame. Every function call has its own stack frame, which stores the function's local variables, parameters, return address and other information, and the fp register helps locate and access this information. The register has a different name on each architecture: X29 on ARM64, R11 on ARM, and RBP on x86_64.
  • lr (Link Register): The link register holds the return address of a function, that is, the place where execution continues after the function returns. On ARM64 the lr register is X30 and on ARM it is R14. On x86_64 there is no dedicated register: the call instruction pushes the return address onto the stack.
  • sp (Stack Pointer): The stack pointer register points to the top of the stack of the current stack frame. During a function call, stack space is allocated or released by adjusting the stack pointer. On ARM64 the register is named SP, on ARM it is R13, and on x86_64 it is RSP.

fp, lr and sp work together during a function call to make the call and the return work correctly: fp locates data in the stack frame, lr holds the return address of the function, and sp manages the stack space. When traversing the stack frames to obtain stack information, we rely on the relationship between these three registers to locate each stack frame and its contents.
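As a portable alternative to reading these registers with per-architecture inline assembly, GCC and Clang provide the builtins __builtin_frame_address and __builtin_return_address. The following is a minimal sketch, assuming one of those compilers; FrameInfo and capture_frame_info are illustrative names that do not come from the article, and the stack pointer is only approximated by the address of a local variable.

```cpp
#include <cstdint>

// Snapshot of the values that fp, lr and sp stand for, taken inside
// capture_frame_info(). All names here are illustrative.
struct FrameInfo {
    std::uintptr_t fp;         // frame address of capture_frame_info() itself
    std::uintptr_t ret;        // return address into the caller (the lr value)
    std::uintptr_t sp_approx;  // address of a local, close to the stack pointer
};

// noinline keeps a real call, so the return address points into the caller.
__attribute__((noinline)) FrameInfo capture_frame_info() {
    FrameInfo info{};
    info.fp = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
    info.ret = reinterpret_cast<std::uintptr_t>(__builtin_return_address(0));
    volatile char marker = 0;  // lives in this function's stack frame
    info.sp_approx = reinterpret_cast<std::uintptr_t>(&marker);
    return info;
}
```

Only level 0 is used here. Passing a non-zero level to these builtins makes the compiler walk further up the chain, which is only reliable when every frame on the way was built with frame pointers.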

(4) Stack frame

Stack Frame is an important concept in function calling. Each time a function is called, a new stack frame is created on the stack. The stack frame contains local variables, parameters, return addresses, and some other information related to function calls. The following figure is a standard function calling process:

  • EBP: Base pointer register (x86 naming), pointing to the bottom of the stack frame. The counterpart is R11 on ARM and X29 on ARM64.
  • ESP: Stack pointer register (x86 naming), pointing to the top of the stack frame. The counterpart is R13 on ARM and SP on ARM64.
  • EIP: Instruction pointer register (x86 naming), which stores the address of the next instruction the CPU will execute. On ARM it is PC, which is register R15.

Each function call saves EBP and EIP so that the caller's stack frame can be restored when the function returns. The saved EBP values form a linked list: each one points to the saved EBP of the calling function.

In the Android system, the basic principle of the stack frame is the same as in other operating systems. From the stack frame delimited by SP and FP, we can recover the SP and FP of the parent function and thereby the parent's stack frame (the function prologue pushes registers such as FP and LR onto the stack as soon as the function is entered). Repeating this step yields the call order of all functions.

In the ARM64 and ARM architectures, we can use the FP chain (frame pointer chain) to traverse the stack frames. The method is: start from the current FP register and follow the chain upward until we encounter a null pointer (NULL) or an invalid address. During the traversal, we extract the return address (the LR value saved next to the FP) and other related information from each stack frame.
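The "invalid address" check matters in practice, because a corrupted or frame-pointer-less frame can send the walk into unmapped memory. The following is a minimal sketch of a bounded walk, assuming the ARM64-style frame record of {saved FP, saved LR}; FrameRecord and walk_fp_chain are illustrative names, and the stack bounds are passed in by the caller rather than queried from the system.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Frame record as stored by an ARM64 prologue: saved caller FP, then saved LR.
struct FrameRecord {
    std::uintptr_t prev_fp;  // address of the caller's frame record, 0 at the end
    std::uintptr_t lr;       // return address into the caller
};

// Follows the chain from `fp` and collects return addresses. A record is only
// read if it lies inside [stack_lo, stack_hi) and is properly aligned, and the
// walk only continues while the chain moves toward higher addresses.
std::vector<std::uintptr_t> walk_fp_chain(std::uintptr_t fp,
                                          std::uintptr_t stack_lo,
                                          std::uintptr_t stack_hi,
                                          std::size_t max_frames) {
    std::vector<std::uintptr_t> pcs;
    while (fp != 0 && pcs.size() < max_frames) {
        const bool in_range = stack_hi >= sizeof(FrameRecord) &&
                              fp >= stack_lo &&
                              fp <= stack_hi - sizeof(FrameRecord);
        const bool aligned = fp % alignof(FrameRecord) == 0;
        if (!in_range || !aligned) break;  // invalid address: stop walking

        const FrameRecord* rec = reinterpret_cast<const FrameRecord*>(fp);
        if (rec->lr == 0) break;           // no return address: end of chain
        pcs.push_back(rec->lr);

        // The stack grows downward, so a caller's record must sit at a higher
        // address. This also stops on prev_fp == 0 and on cyclic chains.
        if (rec->prev_fp <= fp) break;
        fp = rec->prev_fp;
    }
    return pcs;
}
```

On a real device, the bounds would have to come from the thread's stack range, for example via pthread_getattr_np and pthread_attr_getstack. That part is left out of the sketch.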

(5) Name Mangling

The symbol names in a Native stack may differ from the function names defined in the code, because the compiler (GCC or Clang) mangles names according to fixed rules when it generates the symbol table.

C++ supports function overloading, that is, the same function name can have different parameter types and numbers. To distinguish these functions, the compiler mangles each function name into a unique symbol name that encodes the function name, the enclosing namespaces, the parameter types, and so on. For example, for the following C++ function:

 namespace test { int foo(int a, double b); }

After mangling, the generated symbol is _ZN4test3fooEid, where:

  • _Z is the prefix that marks a mangled C++ symbol, and N and E open and close the nested (namespace-qualified) name.
  • 4test means the namespace name is test, and 4 is the length of the namespace name.
  • 3foo means the function name is foo, and 3 is the length of the function name.
  • id represents the parameter types of the function: i stands for int and d stands for double.
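When printing a captured stack, mangled names can be turned back into readable signatures with abi::__cxa_demangle from <cxxabi.h>, which both libstdc++ and libc++abi provide. The following is a minimal sketch; the wrapper name demangle is illustrative, and it falls back to the input string when the name cannot be demangled.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the demangled form of `mangled`, or the input unchanged on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

For the example above, demangle("_ZN4test3fooEid") yields test::foo(int, double). Because the call allocates memory, it is better suited to the reporting stage than to the malloc hook itself.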

IV. Practical suggestions

The previous sections covered the three aspects of implementing Android Native memory leak monitoring: proxy implementation, detection of Native memory leaks, and obtaining the Android Native stack. Finally, let’s compare some existing memory leak detection tools and give some practical suggestions.

1. Comparison of Native memory leak detection tools

In practical applications, we need to choose the most suitable solution based on the specific scenario. The first three tools in the table below are ready-made, but they have certain limitations and, in particular, are not suitable for online use.

2. Practical advice

In actual projects, we can combine multiple memory leak detection solutions to improve the detection effect. Here are some suggestions:

  • Coding specifications: Following coding specifications and best practices when writing code, such as using smart pointers and avoiding circular references, effectively reduces the risk of memory leaks.
  • Code review: Conduct regular code reviews during development to check the code for potential memory leaks. Code review helps us discover and fix problems early and improves code quality.
  • Automated testing: Introduce automated tests in the project that check key functions for memory leaks. Tools such as ASan and LSan can run in a continuous integration environment to ensure that newly submitted code does not introduce new memory leaks.
  • Performance monitoring: In an online environment, regularly monitor the memory usage of the application. If abnormal memory usage is found, use manual detection methods, or reproduce the problem in the development environment and analyze it further with other tools.
  • Problem positioning: When a memory leak is found, quickly locate where it occurs based on the information reported by the tool. Combining the stack information with the relative address of each frame helps us understand the cause of the problem and fix it.
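The relative address mentioned in the last point is the absolute address minus the load base of the library that contains it. Because libraries are loaded at a different base in each process, this offset is the value that offline tools such as addr2line expect. The following is a minimal sketch of formatting one frame; format_frame and libdemo.so are illustrative names, and the base address is assumed to be known already (for example from the dli_fbase field that dladdr fills in).

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a frame as "<module>+0x<offset>", where offset = pc - load base.
// Returns "<module>+?" if pc lies below the base, which indicates bad input.
std::string format_frame(const std::string& module,
                         std::uintptr_t base,
                         std::uintptr_t pc) {
    if (pc < base) {
        return module + "+?";
    }
    char buf[32];
    std::snprintf(buf, sizeof(buf), "+0x%" PRIxPTR, pc - base);
    return module + buf;
}
```

For example, format_frame("libdemo.so", 0x70000000, 0x70001a2c) produces libdemo.so+0x1a2c, which can then be resolved against the unstripped library on the development machine.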

V. Conclusion

During the development and testing phase, we can use tools such as ASan, LSan and Valgrind to detect memory leaks. In online environments, these tools are not suitable for direct use because of their performance overhead. In that case, we can use manual detection methods, combined with code review and good programming habits, to minimize memory leaks.

However, none of these tools guarantees that every memory leak will be detected. Finding and fixing memory leaks requires a deep understanding of the code and good programming habits. Only then can we effectively prevent and resolve memory leaks and improve the stability and performance of our applications.
