How to fix unexplained app logouts on iOS 15

After iOS 15 was publicly released, we started receiving reports from users that they were repeatedly being logged out and sent back to the login screen, for no apparent reason, when opening our app (Cookpad). Surprisingly, this was not a problem we had come across while testing the iOS 15 beta.

If you’re here looking for the fix, then scroll down directly to the conclusion, but if you want to learn more about how we debugged this particular issue, then let’s get started.

Reproducing the reported problem

The user reports contained very little detail. All we knew was that, starting with iOS 15, users were finding themselves logged out after opening the app.

We have no video or specific steps to reproduce the issue, so I tried to launch the app in various ways hoping to see it for myself. I tried reinstalling the app, I tried launching with and without a network connection, I tried force quitting, and after 30 minutes of trying, I gave up and started responding to users saying I had no idea what the specific issue was.

It wasn’t until I later unlocked my phone and launched Cookpad, without doing anything else first, that I saw the app drop straight to the login screen, just as our users had reported!

After that, I couldn't reproduce the problem reliably, but it seemed to be related to picking the phone up again after not using it for a while.

Narrowing down the problem

I was concerned that reinstalling the app from Xcode might prevent the issue from reproducing, so before doing that it was time to look at the code and try to narrow down the issue. Based on our implementation, I came up with three potential causes.

1. The data in UserDefaults is cleared.

2. An unexpected API call returns HTTP 401 and triggers a logout.

3. Keychain threw an error.

I was able to rule out the first two potential causes thanks to some subtle behavior I observed after reproducing the issue myself.

  • The login screen did not ask me to select a region - this indicates that there is no problem with the data in UserDefaults, because our "Show region selection" preference is still in effect.
  • The main UI didn't show up, not even briefly - this suggests that no network requests were attempted, so the logout probably happened too early in the launch for an API response to be the cause.

That left the Keychain, which led me to my next question.

What changed and why is it so hard to reproduce?

I took a cursory look at the release notes and did a quick Google search and I couldn't find anything, so I had to keep digging to get a better understanding of the issue.

Access to Keychain data is provided through the Security[1] framework, which is notoriously tricky. While there are a number of third-party libraries that wrap this framework to make things easier, we maintain our own simple wrapper based on some Apple sample code.

Looking at this code, we call the SecItemCopyMatching[2] function to load our access token. It hands back the data along with an OSStatus code describing the result. Unfortunately, while our wrapper throws on unsuccessful results and includes the status code for debugging purposes, we discarded this information in the next layer up and simply treated any error as nil.
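To illustrate the difference, here is a minimal sketch of a wrapper that keeps the status code all the way up. It is not Cookpad's actual wrapper: `KeychainError`, `readKeychainData`, and the service and account values are hypothetical names, and the sketch assumes the token is stored as a generic password item.

```swift
import Foundation
import Security

/// Hypothetical error type that carries the raw OSStatus so callers can log it.
struct KeychainError: Error, Equatable {
    let status: OSStatus
}

/// Reads a generic password item.
/// Returns nil only when the item does not exist; every other failure is thrown
/// with its status code intact.
func readKeychainData(service: String, account: String) throws -> Data? {
    let query: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrService as String: service,
        kSecAttrAccount as String: account,
        kSecReturnData as String: true,
        kSecMatchLimit as String: kSecMatchLimitOne
    ]

    var result: CFTypeRef?
    let status = SecItemCopyMatching(query as CFDictionary, &result)

    switch status {
    case errSecSuccess:
        return result as? Data
    case errSecItemNotFound:
        return nil
    default:
        throw KeychainError(status: status)
    }
}

// Lossy: `try?` collapses a failed read and a missing token into the same nil.
// let token = try? readKeychainData(service: "com.example.app", account: "access-token")

// Preserving: the status code survives and can be logged.
do {
    let data = try readKeychainData(service: "com.example.app", account: "access-token")
    print(data == nil ? "No token stored" : "Token loaded")
} catch let error as KeychainError {
    print("Keychain read failed with OSStatus \(error.status)")
} catch {
    print("Unexpected error: \(error)")
}
```

The point of the sketch is only that "no token" and "could not read the token" are different outcomes, and that a caller using `try?` cannot tell them apart.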

We're on a weekly release schedule, thanks to a lot of automation. At this point, our next release deadline was the next day. Because we didn't fully understand how widespread the issue was, and we weren't sure if we'd be able to release a fix before code freeze, I took the opportunity to address the lack of observability by adding some additional non-fatal logging using Crashlytics.

Although we couldn't yet change how sessions were loaded, we could start logging the errors and better document the behavior as it was currently implemented.

This gave us some useful data points that we could then monitor over the following weeks.
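A change of that kind can be sketched roughly as follows. Everything here is illustrative: `SessionLoadResult`, `loadSession`, and the two injected closures are hypothetical names rather than our real code. With Crashlytics, the `reportNonFatal` closure would typically forward to `Crashlytics.crashlytics().record(error:)`.

```swift
import Foundation

/// Hypothetical result of trying to restore a session at launch.
enum SessionLoadResult: Equatable {
    case loggedIn(token: String)
    case loggedOut
}

/// Loads the session exactly as before (any failure still means "logged out"),
/// but reports the underlying error instead of silently discarding it.
func loadSession(
    readToken: () throws -> String?,
    reportNonFatal: (Error) -> Void
) -> SessionLoadResult {
    do {
        if let token = try readToken() {
            return .loggedIn(token: token)
        }
        // No token stored: the user really is logged out.
        return .loggedOut
    } catch {
        // Behavior is unchanged, but the failure is now observable.
        reportNonFatal(error)
        return .loggedOut
    }
}
```

Injecting the reporter as a closure keeps the sketch independent of any particular logging SDK and makes the logging path easy to exercise in a unit test.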

The number of affected users slowly decreased over the 10.58.0 and 10.59.0 releases due to a mitigation that was introduced while we worked to identify the root cause, which was fixed in 10.60.0.

At this point, I was able to capture the exact error code returned. The culprit was errSecInteractionNotAllowed[3]:

Interaction with the Security Server is not allowed.

This error tells us that we are trying to read data from the Keychain at a point in time when the data is not available. This usually happens when you try to read data that has been stored and has its accessibility set to kSecAttrAccessibleWhenUnlocked[4] while the device is still locked.

Now this makes perfect sense, but the only problem is that in Cookpad we only read from the Keychain when the app is launched, and my assumption was that the user must have tapped the app icon to launch the app, so the device should always be unlocked at that point, right?

So, what exactly had changed? Even when I was able to reproduce the issue, I was 100% sure my phone was unlocked when I tapped the app icon, so I didn't understand why I was getting this Keychain error.

Determined to find the cause, I replaced our application's implementation with a debugging tool that would try and log Keychain reads at different points in its lifecycle.
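A probe of this kind does not need to be elaborate. The following is a minimal sketch rather than the tool I actually used: `KeychainProbe` and its `attemptRead` closure are hypothetical, and the closure stands in for whatever Keychain read the app performs.

```swift
import Foundation

/// Records whether a Keychain read worked at a named point in the app lifecycle.
final class KeychainProbe {
    struct Entry: Equatable {
        let point: String
        let outcome: String
    }

    private(set) var entries: [Entry] = []
    private let attemptRead: () throws -> Void

    /// `attemptRead` should perform the real Keychain read and throw on failure.
    init(attemptRead: @escaping () throws -> Void) {
        self.attemptRead = attemptRead
    }

    /// Tries the read once and stores the result under the given label.
    func record(at point: String) {
        do {
            try attemptRead()
            entries.append(Entry(point: point, outcome: "Success"))
        } catch {
            entries.append(Entry(point: point, outcome: "Failure (\(error))"))
        }
    }
}

// Intended use: call `record(at:)` on one shared probe from main.swift,
// AppDelegate.init(), application(_:didFinishLaunchingWithOptions:), and so on,
// then display or log `entries` once the UI is on screen.
```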

In a scenario where I can reproduce the problem, I observed the following results:

  • main.swift — Failure (errSecInteractionNotAllowed)
  • AppDelegate.init() — Failure (errSecInteractionNotAllowed)
  • AppDelegate.applicationProtectedDataDidBecomeAvailable(_:) — Success
  • AppDelegate.application(_:didFinishLaunchingWithOptions:) — Success
  • ViewController.viewDidAppear(_:) — Success

So this (half) explains it. In order to avoid holding some implicitly unwrapped optional properties on our AppDelegate, we were doing some setup in the init() method, part of which involved reading an access token from the Keychain. This was why the read would fail, and ultimately why some users would find themselves logged out.

I learned an important lesson here that I shouldn't assume protected data is available when the AppDelegate is initialized, but to be honest I'm still upset because I don't understand why it isn't available. After all, we haven't changed this part of the code in years and it has been working fine in iOS 12, 13, and 14, so what's the reason?

Finding the root cause

My debugging interface was useful, but it was missing some important information that would help answer all of my questions: time.

I know that "protected data" isn't available until AppDelegate.application(_:didFinishLaunchingWithOptions:), but it still doesn't make sense because to reproduce the issue I'm doing the following:

1. Launch the app.

2. Use it briefly.

3. Force quit the app.

4. Lock my device and leave it for about 30 minutes.

5. Unlock the device.

6. Launch the app again.

Whenever I launch the app again in step 6, I'm 100% sure the device is unlocked, so I firmly believe I should be able to read data from the Keychain in AppDelegate.init().

It wasn't until I looked at the timing of all of these steps that things started to make a little sense.

Here are the lifecycle points again, this time with timestamps:

  • main.swift — 11:38:47
  • AppDelegate.init() — 11:38:47
  • AppDelegate.application(_:didFinishLaunchingWithOptions:) — 12:03:04
  • ViewController.viewDidAppear(_:) — 12:03:04

The app itself launched 25 minutes before I actually unlocked my phone and tapped the app icon!
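Once every lifecycle event carries a timestamp, spotting this kind of pause can be automated. The sketch below is illustrative only: `LifecycleEvent`, `largestGap(in:)`, and the `clock` helper are hypothetical names, and the dates are rebuilt from the times listed above on an arbitrary day.

```swift
import Foundation

/// A lifecycle point together with the time it was reached.
struct LifecycleEvent {
    let name: String
    let date: Date
}

/// Returns the longest pause between two consecutive events, or nil if there
/// are fewer than two events.
func largestGap(in events: [LifecycleEvent]) -> (from: String, to: String, seconds: TimeInterval)? {
    var best: (from: String, to: String, seconds: TimeInterval)?
    for (earlier, later) in zip(events, events.dropFirst()) {
        let seconds = later.date.timeIntervalSince(earlier.date)
        if let current = best, current.seconds >= seconds {
            continue
        }
        best = (from: earlier.name, to: later.name, seconds: seconds)
    }
    return best
}

/// Builds a Date from a wall-clock time on an arbitrary day (illustration only).
func clock(_ hours: Int, _ minutes: Int, _ seconds: Int) -> Date {
    Date(timeIntervalSinceReferenceDate: TimeInterval(hours * 3600 + minutes * 60 + seconds))
}

let events = [
    LifecycleEvent(name: "main.swift", date: clock(11, 38, 47)),
    LifecycleEvent(name: "AppDelegate.init()", date: clock(11, 38, 47)),
    LifecycleEvent(name: "application(_:didFinishLaunchingWithOptions:)", date: clock(12, 3, 4)),
    LifecycleEvent(name: "ViewController.viewDidAppear(_:)", date: clock(12, 3, 4))
]

if let gap = largestGap(in: events) {
    print("Longest pause: \(gap.from) -> \(gap.to), \(Int(gap.seconds)) seconds")
}
```

For the times listed above, the longest pause falls between AppDelegate.init() and application(_:didFinishLaunchingWithOptions:) and lasts 1,457 seconds, a little over 24 minutes.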

I had never considered that there could be such a huge delay. It was actually @_saagarjha who suggested that I check the timestamps, after which he pointed me to this tweet.

Twitter: Home of Apple's developer documentation

The tweet reads: Interesting iOS 15 optimization. Duet now attempts to preemptively "prewarm" third-party apps by running them through dyld and pre-main static initializers a few minutes before you tap an app icon. The app is then suspended, and the subsequent "launch" appears faster.

Now it all made sense. We most likely didn't catch the issue during beta testing because we hadn't given the iOS 15 beta enough time to "learn" our usage habits, so the issue only reproduced in real-world scenarios where the device predicted that I was about to launch the app. I still don't know how this prediction is formed, but I'm just going to chalk it up to "Siri intelligence" and leave it at that.

Conclusion

Starting in iOS 15, the system may decide to "prewarm" your app before the user actually tries to open it, which makes it more likely that protected data will be unavailable at a time when you assume it should be available.

Protect yourself by waiting for the application(_:didFinishLaunchingWithOptions:) delegate callback before reading protected data and, if possible, by checking UIApplication.isProtectedDataAvailable (or listening for the corresponding delegate callback or notification) and handling the result accordingly.
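As a rough sketch of that advice, and not our production code, the Keychain read can be moved out of init() and placed behind a small gate. `DeferredSessionLoader` and `loadSessionIfPossible` are hypothetical names, and the closure body is a placeholder for the actual token read.

```swift
import Foundation

/// Hypothetical helper: runs `load` at most once, and only while protected data is readable.
final class DeferredSessionLoader {
    private(set) var didLoad = false

    /// Returns true if `load` ran during this call.
    @discardableResult
    func loadIfPossible(protectedDataAvailable: Bool, load: () -> Void) -> Bool {
        guard !didLoad, protectedDataAvailable else { return false }
        didLoad = true
        load()
        return true
    }
}

#if canImport(UIKit) && !os(watchOS)
import UIKit

final class AppDelegate: UIResponder, UIApplicationDelegate {
    private let sessionLoader = DeferredSessionLoader()

    // No Keychain access in init(): as observed above, init() can run long
    // before the user unlocks the device and opens the app.

    func application(
        _ application: UIApplication,
        didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]? = nil
    ) -> Bool {
        loadSessionIfPossible(application)
        return true
    }

    // Covers the case where protected data was still unavailable at launch.
    func applicationProtectedDataDidBecomeAvailable(_ application: UIApplication) {
        loadSessionIfPossible(application)
    }

    private func loadSessionIfPossible(_ application: UIApplication) {
        sessionLoader.loadIfPossible(protectedDataAvailable: application.isProtectedDataAvailable) {
            // Placeholder: read the access token from the Keychain and restore the session.
        }
    }
}
#endif
```

Until the gate has fired, the app has to treat the session as "not yet known" rather than "logged out", which is where most of the real work in such a change would lie.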

We have still seen a very small number of non-fatal reports where isProtectedDataAvailable was false in application(_:didFinishLaunchingWithOptions:). Short of deferring the read of the access token from the Keychain even further, which would be a large undertaking, there is not much more we can do, and it isn't worth investigating further right now.

This was a pretty hard bug to debug, and the change in behavior appeared to be completely undocumented, which certainly didn't help. If you are also stuck with this issue, please consider filing a duplicate of FB9780579 [5].

I learned a lot from this, and I hope you will too!

Update: Since this article was posted, a number of people have pointed me to Apple’s documentation of the prewarming behavior [6], which is relatively thorough. However, others have also told me that in certain scenarios they still observe behavior that differs from what is documented, so proceed with caution.

References

[1] Security: https://developer.apple.com/documentation/security

[2] SecItemCopyMatching: https://developer.apple.com/documentation/security/1398306-secitemcopymatching?language=objc

[3] errSecInteractionNotAllowed: https://developer.apple.com/documentation/security/errsecinteractionnotallowed?changes=_3

[4] kSecAttrAccessibleWhenUnlocked: https://developer.apple.com/documentation/security/ksecattraccessiblewhenunlocked

[5] FB9780579: https://openradar.appspot.com/FB9780579

[6] Apple's documentation on prewarming: https://developer.apple.com/documentation/uikit/app_and_environment/responding_to_the_launch_of_your_app/about_the_app_launch_sequence#3894431
