From C pseudocode to assembly, hands-on implementation of objc_msgSend

The objc_msgSend function underpins everything we do with Objective-C. Gwynne Raskind, a reader of Friday Q&A, suggested that I talk about the internals of objc_msgSend. What better way to understand something than to implement it yourself? Let's implement objc_msgSend ourselves.

Tramapoline! Trampopoline! (Trampoline)

When you write a method that sends an Objective-C message:

    [obj message]

The compiler generates a call to objc_msgSend:

    objc_msgSend(obj, @selector(message));

Then objc_msgSend will be responsible for forwarding this message.

What does it do? It looks up the appropriate function pointer, or IMP, and then jumps to it. Any parameters passed to objc_msgSend end up as parameters of the IMP, and the IMP's return value becomes the return value seen by the original caller.

Because objc_msgSend only receives the parameters, finds the appropriate function pointer, and jumps to it, it is sometimes called a trampoline. More generally, any piece of code whose job is to pass control from one piece of code to another can be called a trampoline.

This forwarding behavior is what makes objc_msgSend special. Because it simply looks up the right code and jumps straight to it, it is quite general. It can take any combination of arguments, because it leaves them untouched for the IMP to read. Return values are a little trickier, but they are handled by different variants of objc_msgSend.

Unfortunately, none of this forwarding behavior can be implemented in pure C. There is no way to pass an arbitrary set of parameters through from one C function to another. You can use variadic parameters, but they are passed differently from normal parameters, and more slowly, so they are not a suitable substitute for normal C parameters.
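To make the look-up-then-call idea concrete, here is a minimal sketch in plain C. It is a toy model rather than anything from the runtime: toy_send, toy_lookup, toy_imp, and the method table are hypothetical names, and selectors are plain strings. Because C cannot forward an arbitrary argument list, every implementation is forced to share one fixed signature, which is the limitation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for IMP: every method must share this one signature. */
typedef int (*toy_imp)(int receiver, int arg);

static int toy_add(int receiver, int arg)   { return receiver + arg; }
static int toy_scale(int receiver, int arg) { return receiver * arg; }

/* A toy method table mapping selector names to implementations. */
struct toy_method { const char *name; toy_imp imp; };

static const struct toy_method toy_methods[] = {
    { "add:",   toy_add   },
    { "scale:", toy_scale },
};

/* Find the implementation for a selector name, or NULL if there is none. */
toy_imp toy_lookup(const char *sel)
{
    assert(sel != NULL);
    for (size_t i = 0; i < sizeof toy_methods / sizeof toy_methods[0]; i++) {
        if (strcmp(toy_methods[i].name, sel) == 0)
            return toy_methods[i].imp;
    }
    return NULL;
}

/* Look up, then call. This only works because the signature is fixed. */
int toy_send(int receiver, const char *sel, int arg)
{
    toy_imp imp = toy_lookup(sel);
    /* -1 is an arbitrary stand-in for "unrecognized selector". */
    return imp ? imp(receiver, arg) : -1;
}
```

Calling toy_send(2, "add:", 3) looks up toy_add and calls it. A method taking two arguments, or a double, could not go through toy_send at all, and that is the gap the assembly trampoline fills.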

If objc_msgSend could be implemented in C, it would look roughly like this:

    id objc_msgSend(id self, SEL _cmd, ...)
    {
        Class c = object_getClass(self);
        IMP imp = class_getMethodImplementation(c, _cmd);
        return imp(self, _cmd, ...);   /* pseudocode: C cannot forward "..." */
    }

This is a bit oversimplified. In reality there is a method cache to speed up the lookup, so it is closer to this:

    id objc_msgSend(id self, SEL _cmd, ...)
    {
        Class c = object_getClass(self);
        IMP imp = cache_lookup(c, _cmd);
        if(!imp)
            imp = class_getMethodImplementation(c, _cmd);
        return imp(self, _cmd, ...);   /* pseudocode: C cannot forward "..." */
    }

For speed, cache_lookup is usually implemented inline.
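The cache-first flow can be sketched in self-contained C. This is a toy under stated assumptions: selectors are small integers, the cache is a plain array, and the names toy_get_implementation, slow_lookup, and slow_lookups are hypothetical. In this sketch the wrapper itself stores the result of the slow path; in the real runtime, filling the cache is the runtime's job.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_imp)(void);

static int impl_zero(void) { return 0; }
static int impl_one(void)  { return 1; }

#define TOY_SELECTOR_COUNT 2

/* The "slow" table stands in for the full method lookup. */
static const toy_imp slow_table[TOY_SELECTOR_COUNT] = { impl_zero, impl_one };

/* The cache starts empty (all NULL) and is filled on demand. */
static toy_imp cache[TOY_SELECTOR_COUNT];

/* Counts how often the slow path ran, so the effect of the cache is visible. */
int slow_lookups = 0;

static toy_imp slow_lookup(int sel)
{
    slow_lookups++;
    return slow_table[sel];
}

/* Cache first; on a miss, take the slow path and remember the answer. */
toy_imp toy_get_implementation(int sel)
{
    assert(sel >= 0 && sel < TOY_SELECTOR_COUNT);
    toy_imp imp = cache[sel];
    if (!imp) {
        imp = slow_lookup(sel);
        cache[sel] = imp;
    }
    return imp;
}
```

Looking up the same selector twice increments slow_lookups only once, which is the whole point of the cache: the common case never reaches the slow path.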

Assembly

In Apple's version of the runtime, the entire function is implemented in assembly to maximize speed. Every Objective-C message send goes through objc_msgSend, and even the simplest action in an application can involve thousands or millions of messages.

To keep things simple, I will use as little assembly as possible in my own implementation and push the complexity into a separate C function. The assembly code will implement the equivalent of the following function:

    id objc_msgSend(id self, SEL _cmd, ...)
    {
        IMP imp = GetImplementation(self, _cmd);
        return imp(self, _cmd, ...);
    }

GetImplementation can then do its work in a more readable way.

The assembly code needs to:

1. Store all potential parameters in a safe place to ensure that GetImplementation does not overwrite them.

2. Call GetImplementation.

3. Save the return value somewhere.

4. Restore all parameter values.

5. Jump to the IMP returned by GetImplementation.

Let’s get started!

I'll use x86-64 assembly here, which is convenient for working on a Mac, but the concepts apply equally to i386 or ARM.

This function will be saved in a separate file called msgsend-asm.s. This file can be passed to the compiler like a source file and it will be compiled and linked into the program.

The first thing to do is declare global symbols. For some boring historical reason, global symbols for C functions have an underscore in front of their names:

    .globl _objc_msgSend
    _objc_msgSend:

The linker will happily bind calls to the nearest available objc_msgSend. Simply linking this into a test app is enough to make [obj message] expressions use our code instead of Apple's runtime, which makes it very easy to check that the code works.

Integer and pointer arguments are passed in the registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order. Floating-point arguments travel in the %xmm registers, and anything that does not fit in registers is passed on the stack. The first thing this function does is save the six integer registers onto the stack so they can be restored later:

    pushq %rsi
    pushq %rdi
    pushq %rdx
    pushq %rcx
    pushq %r8
    pushq %r9

In addition to these registers, %rax acts as a hidden parameter. It is used for variadic calls, where it holds the number of vector registers used to pass arguments, so that the called function can set up its variable argument list correctly. In case the target function is variadic, I save this register as well:

    pushq %rax

For completeness, the %xmm registers used to pass floating-point parameters should also be preserved. However, if I can ensure that GetImplementation doesn't use any floating point, I can ignore them and keep the code more concise.

Next, align the stack. Mac OS X requires the stack to be aligned on a 16-byte boundary when a function is called. The code above actually leaves the stack aligned already, but it is still worth aligning it explicitly so that alignment is guaranteed and the call below cannot crash on a misaligned stack. To align the stack, I first push the original value of %r12 and then copy the current stack pointer into %r12. The choice of %r12 is arbitrary; any callee-saved register will do. What matters is that the value is still there after the call to GetImplementation. I then bitwise AND the stack pointer with -0x10, which clears its bottom four bits:

    pushq %r12
    mov %rsp, %r12
    andq $-0x10, %rsp

Now the stack pointer is aligned. This cannot overwrite the registers saved above: the stack grows downwards, and aligning only ever moves the stack pointer further down.
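The alignment arithmetic can be checked in C. The sketch below models the stack pointer as a plain integer; align_down_16, after_pushes, and the example address 0x1008 are hypothetical and chosen only for illustration. It relies on one fact about the x86-64 calling convention: the stack is 16-byte aligned just before a call, so on entry to a function, with the return address already pushed, the stack pointer is 8 bytes past a 16-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a multiple of 16, as `andq $-0x10, %rsp` does. */
uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)0xF;
}

/* Model the stack pointer after `count` 8-byte pushes (the stack grows down). */
uintptr_t after_pushes(uintptr_t sp, unsigned count)
{
    assert(sp >= (uintptr_t)count * 8);   /* the toy stack must not underflow */
    return sp - (uintptr_t)count * 8;
}
```

Starting from 0x1008, the seven pushes of the argument registers and %rax give 0xFD0, which is already a multiple of 16. Pushing %r12 moves it to 0xFC8, and the AND then rounds it down to 0xFC0, below everything that was saved.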

It's time to call GetImplementation. It takes two parameters, self and _cmd. The calling convention passes these in %rdi and %rsi respectively. However, that is exactly where they were when objc_msgSend was called, and they have not been moved, so there is nothing to set up. All that remains is to actually call GetImplementation, whose symbol name also needs a leading underscore:

    callq _GetImplementation

Integer and pointer return values are returned in %rax, so that is where the returned IMP is found. Since %rax needs to be restored to its original state, the IMP has to be moved somewhere else. I arbitrarily picked %r11:

    mov %rax, %r11

Now it's time to restore the state. First, restore the stack pointer previously saved in %r12, and then restore the old value of %r12:

    mov %r12, %rsp
    popq %r12

Then restore the registers in the reverse of the order in which they were pushed:

    popq %rax
    popq %r9
    popq %r8
    popq %rcx
    popq %rdx
    popq %rdi
    popq %rsi

Now everything is ready. The argument registers are back in their original state, so the parameters the target function expects are in place. The IMP is in %r11, and all we need to do is jump to it:

    jmp *%r11

That's it! No more assembly code is needed. The jump hands control to the method implementation. From that code's point of view, it is as if the sender had called the method directly; all the detours taken along the way have vanished. When the method returns, it returns straight to the caller of objc_msgSend with no further work needed, and the method's return value is found in the usual place.

There are some details to be aware of with unconventional return values, such as large structures (ones that do not fit in registers). On x86-64, large structures are returned using a hidden first parameter. When you write a call like this:

    NSRect r = SomeFunc(a, b, c);

The call is translated into something like this:

    NSRect r;
    SomeFunc(&r, a, b, c);

The address of the memory for the return value is passed in %rdi. Since objc_msgSend expects %rdi and %rsi to contain self and _cmd, it doesn't work for a message that returns a large structure. The same problem exists on several platforms. The runtime therefore provides objc_msgSend_stret for struct returns, which works like objc_msgSend except that it knows to look for self in %rsi and _cmd in %rdx.
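The hidden-parameter transformation can be written out by hand in C. In this sketch, toy_rect is a hypothetical stand-in for NSRect (four doubles, too large for the return registers on x86-64), and make_rect_hidden spells out roughly what the compiler arranges behind the scenes for make_rect. Both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NSRect: four doubles, 32 bytes in total. */
typedef struct { double x, y, width, height; } toy_rect;

/* What the programmer writes: a function returning a large struct by value. */
toy_rect make_rect(double x, double y, double w, double h)
{
    toy_rect r = { x, y, w, h };
    return r;
}

/* Roughly what the compiler arranges: the caller supplies the storage,
   and its address travels as an extra first argument. */
void make_rect_hidden(toy_rect *out, double x, double y, double w, double h)
{
    assert(out != NULL);
    out->x = x;
    out->y = y;
    out->width = w;
    out->height = h;
}
```

Both forms produce the same rectangle. The extra pointer in the second form is what displaces self and _cmd by one register, and that shift is what objc_msgSend_stret accounts for.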

A similar problem occurs on some platforms when sending messages that return floating-point values. On these platforms, the runtime provides objc_msgSend_fpret (and, on x86-64, objc_msgSend_fp2ret for particularly extreme cases).

Method Lookup

Let's move on to implementing GetImplementation. Thanks to the assembly trampoline above, this code can be written in C. Remember that in the real runtime, this code is written directly in assembly to make it as fast as possible. That not only gives tighter control over the code, but also avoids having to save and restore registers the way the code above does.

GetImplementation could simply call class_getMethodImplementation and be done with it, handing the work off to the Objective-C runtime's own implementation. That would be a bit boring. The real objc_msgSend first searches the class's method cache to maximize speed. Since GetImplementation is meant to mimic objc_msgSend, it does the same. If the cache has no entry for the given selector, it falls back to asking the runtime.

What we need now are some structure definitions. The method cache is a private structure in the class structure, and in order to get it we need to define our own version. Although private, the definitions of these structures are available through Apple's open source implementation of the Objective-C runtime.

First you need to define a cache entry:

    typedef struct {
        SEL name;
        void *unused;
        IMP imp;
    } cache_entry;

Pretty simple. Don't ask me what the unused field is for; I have no idea why it's there. Here's the full definition of the cache:

    struct objc_cache {
        uintptr_t mask;
        uintptr_t occupied;
        cache_entry *buckets[1];
    };

The cache is implemented as a hash table. The table is built for speed above all else, with everything else kept simple, so it is a little unusual. The size of the table is always a power of 2. The table is keyed by selector, and the bucket index is computed directly from the selector's pointer value, possibly shifted to drop irrelevant low bits, and then ANDed with a mask. Here are the macros that compute the bucket index from a selector and a mask:

    #ifndef __LP64__
    # define CACHE_HASH(sel, mask) (((uintptr_t)(sel)>>2) & (mask))
    #else
    # define CACHE_HASH(sel, mask) (((unsigned int)((uintptr_t)(sel)>>0)) & (mask))
    #endif
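Masking works as a cheap modulo because the table size is a power of two. The sketch below illustrates this with the same shape as the 64-bit macro; TOY_CACHE_HASH, toy_mask_for_size, toy_bucket_index, and the selector value 0x10f3ad are hypothetical, and the sketch assumes the mask is simply the table size minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the 64-bit CACHE_HASH above: no shift, just mask the low bits. */
#define TOY_CACHE_HASH(sel, mask) (((unsigned int)((uintptr_t)(sel) >> 0)) & (mask))

/* For a table whose size is a power of two, assume the mask is size - 1. */
uintptr_t toy_mask_for_size(uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);   /* must be a power of two */
    return size - 1;
}

/* Bucket index for a selector-like value in a table of the given size. */
uintptr_t toy_bucket_index(uintptr_t sel, uintptr_t size)
{
    return TOY_CACHE_HASH(sel, toy_mask_for_size(size));
}
```

With a table of 8 buckets the mask is 7, so a selector value of 0x10f3ad lands in bucket 5, the same answer as 0x10f3ad % 8 but computed with a single AND.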

Last is the structure of the class itself. This is the type that Class points to:

    struct class_t {
        struct class_t *isa;
        struct class_t *superclass;
        struct objc_cache *cache;
        IMP *vtable;
    };

Now that we have all the necessary structures, let's start implementing GetImplementation:

    IMP GetImplementation(id self, SEL _cmd)
    {

The first thing to do is to get the class of the object. The real objc_msgSend does the equivalent of self->isa, but here I will use the official API:

        Class c = object_getClass(self);

Since I want to get at the internals, I cast the result to a pointer to the class_t structure:

        struct class_t *classInternals = (struct class_t *)c;

Now it's time to look up IMP. First we initialize it to NULL. If we find it in the cache, we assign it a value. If it's still NULL after looking up the cache, we fall back to the slower method:

        IMP imp = NULL;

Next, get a pointer to the cache:

        struct objc_cache *cache = classInternals->cache;

Calculate the index of the bucket and get a pointer to the buckets array:

        uintptr_t index = CACHE_HASH(_cmd, cache->mask);
        cache_entry **buckets = cache->buckets;

Then we search the cache for the selector we want. The runtime resolves collisions by probing linearly, so this simply walks through subsequent buckets until it finds the entry we need or hits a NULL entry:

        for(; buckets[index] != NULL; index = (index + 1) & cache->mask)
        {
            if(buckets[index]->name == _cmd)
            {
                imp = buckets[index]->imp;
                break;
            }
        }

If the entry is not found, we call into the runtime and take the slower path. In the real objc_msgSend, everything above is written in assembly, and this is the point where it leaves assembly and calls the runtime's own functions. Once the cache lookup has failed, any hope of sending the message quickly is gone. Speed matters much less at this point, because the call is going to be slow anyway and this path is taken relatively rarely. That makes it acceptable to abandon assembly here and use more maintainable C:

        if(imp == NULL)
            imp = class_getMethodImplementation(c, _cmd);

Either way, we now have the IMP. If it was in the cache, it was found there; otherwise the runtime looked it up. The class_getMethodImplementation call also populates the cache, so the next call will be faster. All that's left is to return the IMP:

        return imp;
    }
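The probing loop is easier to follow on a table small enough to trace by hand. The following sketch is a toy, not the runtime's cache: toy_entry, toy_cache_insert, and toy_cache_find are hypothetical names, plain integers stand in for both SEL and IMP, and the insert routine is an assumption added so the example has something to find. The lookup loop has the same shape as the one in GetImplementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry: integers stand in for both SEL and IMP. */
typedef struct { uintptr_t name; int imp; } toy_entry;

#define TOY_CACHE_SIZE 4u                      /* always a power of two */
#define TOY_CACHE_MASK (TOY_CACHE_SIZE - 1u)

static toy_entry *toy_buckets[TOY_CACHE_SIZE]; /* NULL marks an empty bucket */
static unsigned toy_occupied;

/* Insert by probing forward from the hashed slot to the first empty bucket. */
void toy_cache_insert(toy_entry *entry)
{
    /* Keep at least one bucket empty so every probe sequence terminates. */
    assert(toy_occupied < TOY_CACHE_SIZE - 1u);
    uintptr_t index = entry->name & TOY_CACHE_MASK;
    while (toy_buckets[index] != NULL)
        index = (index + 1) & TOY_CACHE_MASK;
    toy_buckets[index] = entry;
    toy_occupied++;
}

/* Walk subsequent buckets until the name matches or an empty bucket is hit. */
toy_entry *toy_cache_find(uintptr_t name)
{
    for (uintptr_t index = name & TOY_CACHE_MASK;
         toy_buckets[index] != NULL;
         index = (index + 1) & TOY_CACHE_MASK) {
        if (toy_buckets[index]->name == name)
            return toy_buckets[index];
    }
    return NULL;
}
```

Names 1 and 5 both hash to bucket 1 in a four-bucket table, so inserting both puts the second one in bucket 2. Looking up 5 then starts at bucket 1, skips the mismatch, and finds the entry one step later, while looking up 9 walks the same buckets and stops at the first empty one.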

Testing

To make sure it works, I wrote a quick test program:

    #import <Foundation/Foundation.h>

    @interface Test : NSObject
    - (void)none;
    - (void)param: (int)x;
    - (void)params: (int)a : (int)b : (int)c : (int)d : (int)e : (int)f : (int)g;
    - (int)retval;
    @end

    @implementation Test

    - (id)init
    {
        fprintf(stderr, "in init method, self is %p\n", self);
        return self;
    }

    - (void)none
    {
        fprintf(stderr, "in none method\n");
    }

    - (void)param: (int)x
    {
        fprintf(stderr, "got parameter %d\n", x);
    }

    - (void)params: (int)a : (int)b : (int)c : (int)d : (int)e : (int)f : (int)g
    {
        fprintf(stderr, "got params %d %d %d %d %d %d %d\n", a, b, c, d, e, f, g);
    }

    - (int)retval
    {
        fprintf(stderr, "in retval method\n");
        return 42;
    }

    @end

    int main(int argc, char **argv)
    {
        for(int i = 0; i < 20; i++)
        {
            Test *t = [[Test alloc] init];
            [t none];
            [t param: 9999];
            [t params: 1 : 2 : 3 : 4 : 5 : 6 : 7];
            fprintf(stderr, "retval gave us %d\n", [t retval]);

            NSMutableArray *a = [[NSMutableArray alloc] init];
            [a addObject: @1];
            [a addObject: @{ @"foo" : @"bar" }];
            [a addObject: @("blah")];
            a[0] = @2;
            NSLog(@"%@", a);
        }
    }

In case the runtime's implementation was being called by accident, I added some debug logging to GetImplementation to confirm that mine was the one being called. Everything works fine; even the literals and subscripting go through the replacement implementation.

Conclusion

The core of objc_msgSend is pretty simple. Its implementation needs some assembly, which makes it harder to understand than it would otherwise be, and the real one uses even more assembly for the sake of performance. But by building a simple assembly trampoline and implementing the logic in C, we can see how it works, and it really isn't rocket science.

Obviously, you should not use the replacement objc_msgSend implementation in your own app. You will regret it. Do this only for learning purposes.
