In-depth understanding of OC/C++ closures

Author: Cui Xiaobing

Background

Apple's Objective-C compiler lets you freely mix C++ and Objective-C in the same source file; the combined language is called Objective-C++. Other languages (such as Swift, Kotlin, and Dart) talk to C++ through separate files and a bridging layer (for example, Kotlin uses JNI and Dart uses FFI), so the same-file mixing that Objective-C offers is far more convenient. Although OC and C++ can be written in one file, there are some caveats: Objective-C++ does not add C++ features to OC classes, nor OC features to C++ classes. For example, you cannot call C++ objects with OC syntax, you cannot add constructors and destructors to OC objects, and you cannot use this and self interchangeably. The class hierarchies are independent: C++ classes cannot inherit from OC classes, and OC classes cannot inherit from C++ classes.

This article explores a topic I previously found confusing: mixing OC's Block with C++'s lambda.

  • Experimental environment: C++ version is C++14, OC is limited to ARC.

Basic understanding

Before we go into more detail, let’s first compare the two:

Syntax

^(int x, NSString *y) {}     // ObjC, takes int and NSString*
[](int x, std::string y) {}  // C++, takes int and std::string

^{ return 42; }   // ObjC, returns int
[]{ return 42; }  // C++, returns int

^int { if (something) return 42; else return 43; }
[]() -> int { if (something) return 42; else return 43; }

Principle

For the underlying structure of OC's Block, see "In-depth Study of Block Capture of External Variables and __block Implementation Principles" (https://halfrost.com/ios_block/). We will not go into detail here and will only expand the code for comparison.

- (void)viewDidLoad {
    [super viewDidLoad];

    int x = 3;
    void (^block)(int) = ^(int a) {
        NSLog(@"%d", x);
    };
    block(5);
}

Rewriting with clang -rewrite-objc gives the following results:

struct __ViewController__viewDidLoad_block_impl_0 {
    struct __block_impl impl;
    struct __ViewController__viewDidLoad_block_desc_0 *Desc;
    int x;
    __ViewController__viewDidLoad_block_impl_0(void *fp, struct __ViewController__viewDidLoad_block_desc_0 *desc, int _x, int flags = 0) : x(_x) {
        impl.isa = &_NSConcreteStackBlock;
        impl.Flags = flags;
        impl.FuncPtr = fp;
        Desc = desc;
    }
};
static void __ViewController__viewDidLoad_block_func_0(struct __ViewController__viewDidLoad_block_impl_0 *__cself, int a) {
    int x = __cself->x; // bound by copy
    NSLog((NSString *)&__NSConstantStringImpl__var_folders_st_jhg68rvj7sj064ft0rznckfh0000gn_T_ViewController_d02516_mii_0, x);
}

static struct __ViewController__viewDidLoad_block_desc_0 {
    size_t reserved;
    size_t Block_size;
} __ViewController__viewDidLoad_block_desc_0_DATA = {0, sizeof(struct __ViewController__viewDidLoad_block_impl_0)};

static void _I_ViewController_viewDidLoad(ViewController *self, SEL _cmd) {
    ((void (*)(__rw_objc_super *, SEL))(void *)objc_msgSendSuper)((__rw_objc_super){(id)self, (id)class_getSuperclass(objc_getClass("ViewController"))}, sel_registerName("viewDidLoad"));
    int x = 3;
    void (*block)(int) = ((void (*)(int))&__ViewController__viewDidLoad_block_impl_0((void *)__ViewController__viewDidLoad_block_func_0, &__ViewController__viewDidLoad_block_desc_0_DATA, x));
    ((void (*)(__block_impl *, int))((__block_impl *)block)->FuncPtr)((__block_impl *)block, 5);
}
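To make the rewritten output easier to follow, here is a stripped-down, hand-written analog in plain C++. It is only a sketch: the names BlockSketch, block_body, and run_block_sketch are hypothetical, and the layout leaves out isa, the flags, and the descriptor. It illustrates two things visible in the output above: the captured int is copied into the struct when the block is created, and the call goes through a function pointer that receives the struct itself.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the compiler-generated block struct.
struct BlockSketch {
    int (*func_ptr)(BlockSketch *self, int a); // plays the role of impl.FuncPtr
    int x;                                     // captured by copy at creation time
};

// Plays the role of __ViewController__viewDidLoad_block_func_0.
static int block_body(BlockSketch *self, int a) {
    (void)a;        // the original block ignores its argument as well
    return self->x; // reads the copy stored in the struct, not the outer variable
}

// Creates the "block", changes the outer variable, then invokes the block.
int run_block_sketch() {
    int x = 3;
    BlockSketch block = {block_body, x}; // x is copied here
    x = 100;                             // later changes are not visible to the block
    assert(block.x == 3);
    return block.func_ptr(&block, 5);
}
```

Calling run_block_sketch() returns 3 rather than 100, which matches the "bound by copy" comment in the rewritten code.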

C++ lambda uses a completely different implementation mechanism, which converts the lambda expression into an anonymous C++ class. Here we use cppinsights to look at the implementation of C++ lambda.

#include <cstdio>

struct A {
    int x;
    int y;
};

int main()
{
    A a = {1, 2};
    int m = 3;
    auto add = [&a, m](int n) -> int {
        return m + n + a.x + a.y;
    };
    m = 30;
    add(20);
}

The cppinsights output is:

#include <cstdio>

struct A
{
    int x;
    int y;
};

int main()
{
    A a = {1, 2};
    int m = 3;

    class __lambda_12_15
    {
    public:
        inline int operator()(int n) const
        {
            return ((m + n) + a.x) + a.y;
        }

    private:
        A & a;
        int m;

    public:
        __lambda_12_15(A & _a, int & _m)
        : a{_a}
        , m{_m}
        {}
    };

    __lambda_12_15 add = __lambda_12_15{a, m};
    m = 30;
    add.operator()(20);
    return 0;
}

As you can see, the lambda expression add is converted to the class __lambda_12_15, and the operator() is overloaded. The call to add is also converted to a call to add.operator().
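The practical consequence of this layout can be checked with a small sketch. AddFunctor below is a hypothetical hand-written counterpart of the generated class, and call_lambda and call_functor are illustrative helper names. Both versions copy m at construction time and keep a reference to a, so a later change to m is invisible to the closure while a later change to a.x is visible.

```cpp
#include <cassert>

struct A {
    int x;
    int y;
};

// Hand-written counterpart of the generated closure class (hypothetical name).
class AddFunctor {
public:
    AddFunctor(A &a, int m) : a_(a), m_(m) {}
    int operator()(int n) const { return m_ + n + a_.x + a_.y; }

private:
    A &a_;  // captured by reference: sees later changes to a
    int m_; // captured by value: frozen at construction time
};

int call_lambda() {
    A a = {1, 2};
    int m = 3;
    auto add = [&a, m](int n) -> int { return m + n + a.x + a.y; };
    m = 30;         // not seen by the lambda, m was copied
    a.x = 10;       // seen by the lambda, a is a reference
    return add(20); // 3 + 20 + 10 + 2
}

int call_functor() {
    A a = {1, 2};
    int m = 3;
    AddFunctor add(a, m);
    m = 30;
    a.x = 10;
    return add(20); // same arithmetic as the lambda version
}
```

Both functions return 35, which suggests that the lambda and the explicit functor behave the same way for this capture list.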

Capturing variables

An OC Block can capture variables in only two ways, plain capture and __block capture:

int x = 42;
void (^block)(void) = ^{ printf("%d\n", x); };
block(); // prints 42

__block int x = 42;
void (^block)(void) = ^{ x = 43; };
block(); // x is now 43

C++ lambda brings more flexibility and can capture variables in the following ways:

[]          Capture nothing
[&]         Capture any referenced variable by reference
[=]         Capture any referenced variable by making a copy
[=, &foo]   Capture any referenced variable by making a copy, but capture variable foo by reference
[bar]       Capture bar by making a copy; don't copy anything else
[this]      Capture the this pointer of the enclosing class

int x = 42;
int y = 99;
int z = 1001;
auto lambda = [=, &z] {
    // can't modify x or y here, but we can read them
    z++;
    printf("%d, %d, %d\n", x, y, z);
};
lambda(); // prints 42, 99, 1002
// z is now 1002
Memory Management

OC Blocks and C++ lambdas both start out as stack objects, but what happens to them afterwards is completely different. An OC Block is essentially an OC object: it is stored by reference, never by value. To extend its lifetime, an OC Block must be copied to the heap. It follows OC's reference counting rules, so copy and release must be balanced (the same applies to Block_copy and Block_release). The first copy moves the Block from the stack to the heap, and each subsequent copy only increases its reference count. When the reference count reaches 0, the Block is destroyed and the objects it captured are released.

A C++ lambda is stored by value, not by reference. All captured variables are stored in the anonymous class object as its member variables. When the lambda is copied, these variables are copied too, and only the appropriate constructors and destructors need to run. One point is extremely important here: variables captured by reference are stored in the anonymous object as plain references and receive no special treatment. This means the lambda may still access these variables after their lifetime has ended, resulting in undefined behavior or a crash. For example:

- (void)viewDidLoad {
    [super viewDidLoad];

    int x = 3;
    lambda = [&x]() -> void {
        NSLog(@"x = %d", x);
    };
}

- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    lambda();
}

// From the output, we can see that x is a random value
2022-02-12 23:15:01.375925+0800 BlockTest[63517:1006998] x = 32767
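The value semantics described above can be observed in plain C++. The following is a sketch under stated assumptions: Tracker is a hypothetical helper that counts its own copy-constructions, and make_reader is an illustrative by-value variant of the dangling example (a pure C++ function rather than an OC method). It shows that copying a closure copies its captured members, and that a by-value capture stays valid after the enclosing scope has ended.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper that counts how often it is copy-constructed.
struct Tracker {
    int *copies;
    explicit Tracker(int *counter) : copies(counter) {}
    Tracker(const Tracker &other) : copies(other.copies) { ++*copies; }
};

// Returns how many Tracker copies are made by copying an existing closure.
int copies_made_by_copying_lambda() {
    int copies = 0;
    Tracker tracker(&copies);
    auto first = [tracker]() { return tracker.copies != nullptr; };
    int before = copies; // copies made so far by the capture itself
    auto second = first; // copies the closure and, with it, the captured Tracker
    (void)second;
    return copies - before;
}

// By-value variant of the dangling example: x is copied into the closure,
// so the returned callable does not refer to the dead stack frame.
std::function<int()> make_reader() {
    int x = 3;
    return [x]() { return x; };
}
```

Here copies_made_by_copying_lambda() returns 1 and make_reader()() returns 3. Replacing [x] with [&x] in make_reader would reproduce the undefined behavior shown in the log above.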

By comparison, the object that this points to usually lives on the heap, so its lifetime is guaranteed to some extent. Even so, lifetime safety is not absolute, and in some cases you need a smart pointer to extend the lifetime.

auto strongThis = shared_from_this();
doSomethingAsynchronously([strongThis, this]() {
    someMember_ = 42;
});
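A self-contained sketch of this pattern is shown below. Worker, makeCallback, and run_after_owner_released are hypothetical names, and a stored std::function stands in for the asynchronous call, which the snippet above does not define. The point is that the captured shared_ptr keeps the object alive until the closure itself is destroyed.

```cpp
#include <cassert>
#include <functional>
#include <memory>

class Worker : public std::enable_shared_from_this<Worker> {
public:
    // Returns a callback that keeps this object alive until the callback is destroyed.
    std::function<int()> makeCallback() {
        auto strongThis = shared_from_this();
        return [strongThis, this]() {
            someMember_ = 42; // safe: strongThis guarantees *this still exists
            return someMember_;
        };
    }

private:
    int someMember_ = 0;
};

// Drops the only external owner before running the callback.
int run_after_owner_released() {
    auto worker = std::make_shared<Worker>();
    std::weak_ptr<Worker> observer = worker;
    std::function<int()> callback = worker->makeCallback();
    worker.reset();              // the callback is now the only owner
    assert(!observer.expired()); // still alive because of strongThis
    int result = callback();
    callback = nullptr;          // destroying the closure releases the object
    assert(observer.expired());
    return result;
}
```

run_after_owner_released() returns 42 even though the original shared_ptr was reset before the call. Note that shared_from_this() is only valid when the object is already owned by a std::shared_ptr.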

Closure mixed capture problem

Everything discussed so far kept the two worlds separate: OC's Block did not involve C++ objects, and C++'s lambda did not involve OC objects. That is probably what we would most like to see, but in mixed code it turns out to be wishful thinking. The two often reach into each other's territory, which leads to some rather puzzling problems.

C++ lambda captures OC objects

Can C++ lambda capture Objective-C variables? If so, will there be a circular reference problem? If there is a circular reference problem, how to deal with it?

Capturing OC objects by value

As shown in the code below, the OCClass class has a C++ field cppObj. The initializer of OCClass creates cppObj and assigns its callback field. The lambda captures self, which by the rules above counts as a capture by value.

class CppClass {
public:
    CppClass() {
    }

    ~CppClass() {
    }
public:
    std::function<void()> callback;
};

@implementation OCClass {
    std::shared_ptr<CppClass> cppObj;
}

- (void)dealloc {
    NSLog(@"%s", __FUNCTION__);
}

- (instancetype)init {
    if (self = [super init]) {
        cppObj = std::make_shared<CppClass>();
        cppObj->callback = [self]() -> void {
            [self executeTask];
        };
    }
    return self;
}

- (void)executeTask {
    NSLog(@"execute task");
}

@end

OCClass *ocObj = [[OCClass alloc] init];

Unfortunately, this capture method will cause a circular reference: the OCClass object ocObj holds cppObj, and cppObj holds ocObj through callback.

Looking at the corresponding assembly code, we can find that when capturing, ARC semantics is triggered and self is automatically retained.

These lines of assembly code increase the reference count of self.

0x10cab31ea <+170>: movq   -0x8(%rbp), %rdi
0x10cab31ee <+174>: movq   0x5e7b(%rip), %rax        ; (void *)0x00007fff2018fa80: objc_retain
0x10cab31f5 <+181>: callq  *%rax

Finally, let’s look at the parameters of the anonymous class. We can see that self is of type OCClass *, which is a pointer type.

Then we can simply think of the capture pseudo code as follows, and the retain behavior will occur under ARC semantics:

__strong __typeof(self) capture_self = self;

// Expand
__strong OCClass *capture_self = self;

To solve the problem of circular references, you can use __weak.

cppObj = std::make_shared<CppClass>();
__weak __typeof(self) wself = self;
cppObj->callback = [wself]() -> void {
    [wself executeTask];
};

Looking at the assembly code again, we find that the previous objc_retain logic has disappeared and is replaced by objc_copyWeak.
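The same ownership shape can be reproduced in pure C++ as an analogy, which may help readers more familiar with std::shared_ptr than with ARC. This is a sketch, not the ARC mechanism itself: Owner stands in for OCClass, Task for CppClass, a shared_ptr capture plays the role of the __strong capture, and a weak_ptr capture plays the role of __weak. All names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct Task {
    std::function<void()> callback;
};

struct Owner {
    std::shared_ptr<Task> task = std::make_shared<Task>();
    int executed = 0;
};

// Strong capture: Owner -> Task -> callback -> Owner forms a cycle.
bool leaks_with_strong_capture() {
    std::weak_ptr<Owner> observer;
    {
        auto owner = std::make_shared<Owner>();
        observer = owner;
        owner->task->callback = [owner]() { owner->executed++; };
    } // owner goes out of scope, but the callback still holds a strong reference
    bool stillAlive = !observer.expired();
    if (auto strong = observer.lock()) {
        strong->task->callback = nullptr; // break the cycle so the sketch itself does not leak
    }
    return stillAlive;
}

// Weak capture: the callback does not keep Owner alive, so no cycle forms.
bool leaks_with_weak_capture() {
    std::weak_ptr<Owner> observer;
    {
        auto owner = std::make_shared<Owner>();
        observer = owner;
        std::weak_ptr<Owner> weakOwner = owner;
        owner->task->callback = [weakOwner]() {
            if (auto strong = weakOwner.lock()) strong->executed++;
        };
    }
    return !observer.expired();
}
```

leaks_with_strong_capture() returns true (the object outlives its last external owner), while leaks_with_weak_capture() returns false.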

Capturing OC objects by reference

So is it possible to capture self by reference?

cppObj = std::make_shared<CppClass>();
cppObj->callback = [&self]() -> void {
    [self executeTask];
};

You can see that there is no objc_retain logic in the assembly code either.

Finally, let’s look at the parameters of the anonymous class. We can see that self is of type OCClass *&, which is a pointer reference type.

It can be seen that reference capture does not retain self. You can simply think of the capture pseudo code as follows, and no retain behavior will occur under ARC semantics.

__unsafe_unretained __typeof(self) &capture_self = self;

// Expand
__unsafe_unretained OCClass *&capture_self = self;

When is the captured OC object released?

Take this code snippet as an example:

auto cppObj = std::make_shared<CppClass>();
OCClass2 *oc2 = [[OCClass2 alloc] init];
cppObj->callback = [oc2]() -> void {
    [oc2 class];
};

As you can see, std::function is destructed in the destructor of CppClass, and when std::function is destructed, the captured OC variable oc2 is released.
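The release timing can be imitated in pure C++ by letting a std::shared_ptr stand in for the retained OC object, since its use_count is observable. In this sketch CppClassSketch, LifetimeCounts, and captured_lifetime are hypothetical names; the shared_ptr named payload plays the role of oc2.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct CppClassSketch {
    std::function<void()> callback;
};

// use_count of the captured object while captured and after the holder is gone.
struct LifetimeCounts {
    long whileCaptured;
    long afterHolderDestroyed;
};

LifetimeCounts captured_lifetime() {
    auto payload = std::make_shared<int>(7); // stands in for oc2
    LifetimeCounts counts{};
    {
        auto cppObj = std::make_shared<CppClassSketch>();
        cppObj->callback = [payload]() { (void)*payload; };
        counts.whileCaptured = payload.use_count(); // payload + the copy inside callback
    } // ~CppClassSketch runs here, destroying callback and its captured copy
    counts.afterHolderDestroyed = payload.use_count();
    return counts;
}
```

The count goes from 2 while the closure exists back to 1 once the object holding the std::function is destroyed, mirroring the release of oc2 described above.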

Conclusion

The essence of C++ lambda is to create an anonymous structure type to store captured variables. ARC will ensure that the C++ structure type containing OC object fields follows ARC semantics:

  • The constructor of the C++ structure will initialize the OC object fields to nil;
  • When the OC object field is assigned a value, the previous value is released and the new value is retained (if it is a block, a copy is made);
  • When the destructor of the C++ structure is called, the OC object fields will be released.

A C++ lambda can capture OC objects by value or by reference.

  1. Capturing an OC object by reference is equivalent to using __unsafe_unretained; it has lifetime problems, is inherently dangerous, and is not recommended;
  2. Capturing by value is equivalent to using __strong and may cause circular references; use __weak when necessary.

How does OC's Block capture C++ objects?

Let's look at how OC's Block captures C++ objects.

The HMRequestMonitor in the code is a C++ structure, in which the WaitForDone and SignalDone methods are mainly used to achieve synchronization.

struct HMRequestMonitor {
public:
    bool WaitForDone() { return is_done_.get(); }
    void SignalDone(bool success) { done_with_success_.set_value(success); }
    ResponseStruct& GetResponse() { return response_; }
private:
    ... ..
};

The upload method uses the HMRequestMonitor object to synchronously wait for the result of the network request (the code has been adjusted for typesetting).

hermas::ResponseStruct HMUploader::upload(
    const char *url,
    const char *request_data,
    int64_t len,
    const char *header_content_type,
    const char *header_content_encoding) {
    HMRequestModel *model = [[HMRequestModel alloc] init];
    ... ...

    auto monitor = std::make_shared<hermas::HMRequestMonitor>();
    std::weak_ptr<hermas::HMRequestMonitor> weakMonitor(monitor);
    DataResponseBlock block = ^(NSError *error, id data, NSURLResponse *response) {
        // lock() returns an empty shared_ptr if the monitor has already been destroyed
        if (auto strongMonitor = weakMonitor.lock()) {
            strongMonitor->SignalDone(true);
        }
    };
    [m_session_manager requestWithModel:model callBackWithResponse:block];
    monitor->WaitForDone();
    return monitor->GetResponse();
}

Here the block captures the std::weak_ptr directly.
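Because lock() on a std::weak_ptr returns an empty std::shared_ptr once the target is gone, a completion callback is safer when it checks the result before dereferencing it. The sketch below shows that pattern in plain C++. MonitorSketch is a hypothetical, much reduced stand-in for hermas::HMRequestMonitor, and a std::function replaces the OC block.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for hermas::HMRequestMonitor, reduced to one flag.
struct MonitorSketch {
    bool signaled = false;
    void SignalDone(bool success) { signaled = success; }
};

// Builds a completion callback that only touches the monitor if it still exists.
std::function<bool()> make_completion(const std::shared_ptr<MonitorSketch> &monitor) {
    std::weak_ptr<MonitorSketch> weakMonitor(monitor);
    return [weakMonitor]() {
        if (auto strong = weakMonitor.lock()) { // promote to shared_ptr, may be empty
            strong->SignalDone(true);
            return true;
        }
        return false; // monitor already destroyed: nothing to signal
    };
}

bool signal_while_alive() {
    auto monitor = std::make_shared<MonitorSketch>();
    auto done = make_completion(monitor);
    return done() && monitor->signaled;
}

bool signal_after_release() {
    auto monitor = std::make_shared<MonitorSketch>();
    auto done = make_completion(monitor);
    monitor.reset(); // simulate the monitor going away before the response arrives
    return done();
}
```

signal_while_alive() returns true and signal_after_release() returns false without dereferencing a null pointer. In the upload method the monitor outlives the wait, so the check mainly guards against later changes such as adding a timeout.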

Not using __block

The following conclusions can be drawn from the experiment:

1. The C++ object will be captured by the OC block and passed by value. Through the breakpoint, it can be found that the copy constructor of std::weak_ptr is called.

template<class _Tp>
inline
weak_ptr<_Tp>::weak_ptr(weak_ptr const& __r) _NOEXCEPT
    : __ptr_(__r.__ptr_),
      __cntrl_(__r.__cntrl_)
{
    if (__cntrl_)
        __cntrl_->__add_weak();
}

2. The weak reference count of monitor changes as follows:

  1. When monitor is initialized, weak_count = 1;
  2. When weakMonitor is initialized, weak_count = 2, an increase of 1;
  3. After the OC Block captures it, weak_count = 4, an increase of 2. The assembly code shows two places where this happens:
     • On first capture, weakMonitor is copied at line 142 of the assembly code;
     • When the block is copied from the stack to the heap, weakMonitor is copied again at line 144 of the assembly code.

Note that C++'s weak_count is somewhat unusual: its value is the number of weak references plus 1. The reason for this design is fairly involved; for details, see https://stackoverflow.com/questions/5671241/how-does-weak-ptr-work

If std::weak_ptr is not used here, but std::shared_ptr is captured directly, its strong reference count is 3 after being captured, and the logic is the same as the above std::weak_ptr. (In essence, std::shared_ptr and std::weak_ptr are both C++ classes)

std::shared_ptr<hermas::HMRequestMonitor> monitor = std::make_shared<hermas::HMRequestMonitor>();
DataResponseBlock block = ^(NSError * _Nonnull error, id _Nonnull data, NSURLResponse * _Nonnull response) {
    monitor->SignalDone(true);
};

(lldb) po monitor
std::__1::shared_ptr<hermas::HMRequestMonitor>::element_type @ 0x00006000010dda58 strong=3 weak=1
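The "one copy on capture, one more when the closure is copied" pattern can be imitated in pure C++, where use_count makes the strong count observable. This is an analogy rather than the block runtime itself: the lambda plays the role of the stack block and copying it into a std::function plays the role of the stack-to-heap copy. StrongCounts and strong_counts are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Strong counts observed after each copying step.
struct StrongCounts {
    long afterCapture;
    long afterSecondCopy;
};

StrongCounts strong_counts() {
    auto monitor = std::make_shared<int>(0);
    auto closure = [monitor]() { return *monitor; }; // first copy: into the closure
    StrongCounts counts{};
    counts.afterCapture = monitor.use_count();
    std::function<int()> stored = closure;           // second copy: the closure is copied
    counts.afterSecondCopy = monitor.use_count();
    (void)stored;
    return counts;
}
```

The counts are 2 after the capture and 3 after the second copy, the same total as the strong = 3 reported by lldb above.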

Using __block

So is it possible to use __block to modify captured C++ variables? Through experiments, we found that it is feasible.

The following conclusions can be drawn:

  1. OC's Block can capture C++ objects by passing references;
  2. The weak reference count of monitor is as follows:
  • When initializing monitor, weak_count = 1;
  • When initializing weakMonitor, weak_count = 2, increase by 1;
  • After OC Block captures, weak_count = 2, mainly because the move constructor is triggered, which is just a transfer of ownership and does not change the reference count;

Questions about __block

Readers who know C++ may wonder: since the move constructor is triggered and only ownership is transferred, monitor was passed in as an rvalue and has become nullptr. Why, then, can monitor in the example still be accessed? Let's verify it:

1. When the code first executes, the address of the monitor variable is:

(lldb) po &monitor
0x0000700001d959e8

2. When the block assignment is executed, the move constructor of std::shared_ptr is called:

  • The address of this in the move constructor is 0x0000600003b0c830;
  • The address of __r is also 0x0000700001d959e8, which is consistent with the address of monitor.

3. When the block is executed, print the address of monitor again, and you will find that the address of monitor has changed and is consistent with this in the second step, which means that monitor has become this in the second step.

(lldb) po &monitor
0x0000600003b0c830

During the whole process, the address of monitor changes, and they are two different std::shared_ptr objects. Therefore, monitor can still be accessed.
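The effect of that move can be sketched in plain C++. The heap slot below is only a stand-in for the storage a __block variable ends up in (an assumption made for illustration, not the actual block runtime layout), and MoveReport and move_to_heap are hypothetical names. It shows that move-constructing a second std::shared_ptr leaves the count unchanged, empties the source, and produces an object at a different address.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct MoveReport {
    long countBefore;
    long countAfter;
    bool sourceEmpty;
    bool addressChanged;
};

MoveReport move_to_heap() {
    auto monitor = std::make_shared<int>(0); // the shared_ptr object lives on the stack
    MoveReport report{};
    report.countBefore = monitor.use_count();
    // Move-construct a second shared_ptr object on the heap.
    auto heapSlot = std::make_unique<std::shared_ptr<int>>(std::move(monitor));
    report.countAfter = heapSlot->use_count();
    report.sourceEmpty = (monitor == nullptr);            // a moved-from shared_ptr is empty
    report.addressChanged = (heapSlot.get() != &monitor); // two distinct shared_ptr objects
    return report;
}
```

The count stays at 1 before and after the move. In the OC case the name monitor is redirected to the new object, which is why it remains usable there, whereas in this sketch the moved-from stack variable is simply empty.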

When are captured C++ objects released?

Similarly, when the OC Block is released, the captured C++ object will be released.

Capturing shared_from_this

In C++, this is a pointer, which is essentially an integer. Capturing this in an OC Block is no different in essence from capturing an integer, so we will not discuss it in detail. Instead, we focus on shared_from_this in C++, which returns the smart pointer version of this.

If a C++ class wants to access shared_from_this, it must inherit from the enable_shared_from_this class and pass its own class name as a template parameter.

class CppClass : public std::enable_shared_from_this<CppClass> {
public:
    CppClass() {}
    ~CppClass() {}

    void attachOCBlock();
public:
    OCClass2 *ocObj2;
    void dosomething() {}
};

void CppClass::attachOCBlock() {
    ocObj2 = [[OCClass2 alloc] init];
    auto shared_this = shared_from_this();
    ocObj2.ocBlock = ^{
        shared_this->dosomething();
    };
}

@interface OCClass2 : NSObject
@property (nonatomic, copy) void (^ocBlock)(void);
@end

auto cppObj = std::make_shared<CppClass>();
cppObj->attachOCBlock();

According to the previous conclusion, in the CppClass member function attachOCBlock, ocBlock directly capturing shared_from_this will also cause a circular reference, which can also be solved by using std::weak_ptr.

void CppClass::attachOCBlock() {
    ocObj2 = [[OCClass2 alloc] init];
    std::weak_ptr<CppClass> weak_this = shared_from_this();
    ocObj2.ocBlock = ^{
        // lock() returns an empty shared_ptr once the CppClass object is gone
        if (auto strong_this = weak_this.lock()) {
            strong_this->dosomething();
        }
    };
}

Conclusion

OC's Block can capture C++ objects.

  • If you capture a C++ object on the stack in the normal way, the copy constructor will be called;
  • If the __block method is used to capture a C++ object on the stack, the move constructor will be called, and the C++ object modified by __block will be redirected when it is captured.

Summary

This article began with a brief comparison of OC's Block and C++'s lambda along four dimensions: syntax, principle, variable capture, and memory management. It then devoted most of its space to mixed capture between OC and C++ closures. I wrote it because I did not want to keep guessing and working by trial and error; only by understanding the underlying mechanism can we write better mixed OC/C++ code. I also hope it helps readers who have run into the same confusion. For mixed OC/C++ development as a whole, though, this is just the tip of the iceberg. Many difficult problems remain, and I look forward to exploring more of them in the future.
