iOS advanced page performance optimization

Preface

In software development we often hear that "premature optimization is the root of all evil": don't optimize too early, and don't over-optimize. I think it is necessary to keep performance in mind while coding, but within limits; development progress should not be held up for the sake of performance. When time is tight, we often ship a "quick and dirty" solution first and then iterate and optimize later, which is the approach of agile development, in contrast to the waterfall process of traditional software development.

Causes of lag

On iOS, displaying content on the screen requires the CPU and GPU to work together. The CPU computes the content to display: view creation, layout calculation, image decoding, text drawing, and so on. It then submits the result to the GPU, which transforms, composites, and renders it. The GPU writes the rendered result to the frame buffer, where it waits for the next VSync signal before being shown on screen. Because of vertical synchronization, if the CPU or GPU has not finished its work within one VSync interval, that frame is dropped and has to wait for the next opportunity, while the display keeps showing the previous content. This is what causes the interface to stutter.
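To make the timing concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). It assumes a 60 Hz display and treats the CPU and GPU work for one frame as a single duration, which is a simplification. The function names are hypothetical and are not part of any iOS API.

```c
/* Time available to produce one frame at the given refresh rate, in ms.
   At an assumed 60 Hz this is roughly 16.67 ms. */
static double frame_budget_ms(double refresh_hz) {
    return 1000.0 / refresh_hz;
}

/* Number of VSync intervals a frame occupies when producing it takes
   work_ms. 1 means the frame is on time; a larger value means the
   previous frame stayed on screen for the extra intervals. */
static int vsync_intervals_used(double work_ms, double refresh_hz) {
    double budget = frame_budget_ms(refresh_hz);
    int n = (int)(work_ms / budget);      /* whole intervals fully consumed */
    if ((double)n * budget < work_ms) {
        n++;                              /* round a partial interval up */
    }
    return n < 1 ? 1 : n;
}
```

Under these assumptions, a frame that needs 20 ms of work misses its VSync and occupies two intervals, so the user sees the previous frame twice.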

Therefore, we need to balance the CPU and GPU loads to avoid overloading one side. To do this, we first need to understand what the CPU and GPU are responsible for.

The above figure shows the location of each module in the iOS system. Now let's take a closer look at the operations corresponding to the CPU and GPU.

CPU-intensive tasks

Layout calculation

Layout calculation is the most common place in iOS where CPU resources are consumed. If the view hierarchy is complex, calculating the layout information of all layers will consume some time. Therefore, we should try to calculate the layout information in advance, and then adjust the corresponding properties at the right time. We should also avoid unnecessary updates and only update when the layout actually changes.

Object creation

The object creation process is accompanied by memory allocation, property setting, and even file reading operations, which consumes CPU resources. Try to use lightweight objects instead of heavy objects to optimize performance. For example, CALayer is much lighter than UIView. If the view element does not need to respond to touch events, it is more appropriate to use CALayer.

Creating view objects through Storyboard also involves file deserialization operations, and its resource consumption is much greater than creating objects directly through code. In performance-sensitive interfaces, Storyboard is not a good technical choice.

For list-type pages, you can also borrow the reuse mechanism of UITableView. Whenever you need a view object, first try to retrieve one from the cache pool by its identifier. If one is found, reuse it; if not, perform the real initialization. While the user scrolls, a view that leaves the screen is put into the cache pool under its identifier, and a view that enters the visible area is either reused or initialized according to the same rule.
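The reuse rule described above can be sketched without UIKit. The following plain C is an illustration only: View, ReusePool, pool_enqueue, and pool_dequeue are hypothetical names, the fixed capacity is an arbitrary assumption, and UITableView's real implementation is not public.

```c
#include <stddef.h>
#include <string.h>

#define POOL_CAPACITY 16   /* arbitrary limit for this sketch */

/* Stand-in for a view object; identifier plays the role of a reuse identifier. */
typedef struct {
    const char *identifier;
} View;

typedef struct {
    View  *items[POOL_CAPACITY];
    size_t count;
} ReusePool;

/* Called when a view scrolls off screen: keep it for later use.
   Returns 0 if the pool is full and the caller should destroy the view. */
static int pool_enqueue(ReusePool *pool, View *view) {
    if (pool->count == POOL_CAPACITY) {
        return 0;
    }
    pool->items[pool->count++] = view;
    return 1;
}

/* Called when a view is about to appear: returns a cached view with a
   matching identifier, or NULL to tell the caller to initialize a new one. */
static View *pool_dequeue(ReusePool *pool, const char *identifier) {
    for (size_t i = 0; i < pool->count; i++) {
        if (strcmp(pool->items[i]->identifier, identifier) == 0) {
            View *hit = pool->items[i];
            pool->items[i] = pool->items[--pool->count];  /* swap-remove */
            return hit;
        }
    }
    return NULL;
}
```

The caller pays the cost of initialization only when pool_dequeue returns NULL; every other request is served by an object that has already been created.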

Autolayout

Auto Layout is a layout technology that Apple introduced in iOS 6. In most cases it greatly speeds up development, especially when supporting multiple languages. Arabic, for example, is laid out from right to left, which Auto Layout handles through leading and trailing constraints.

However, Auto Layout often causes serious performance problems in complex views. For performance-sensitive pages, it is better to lay out views manually and to control how often layout runs, so that views are laid out again only when it is really necessary.

Text calculation

If an interface contains a large amount of text (such as Weibo, WeChat Moments, etc.), the calculation of the width and height of the text will take up a large part of the resources and is unavoidable.

A common case is UITableView, where tableView:heightForRowAtIndexPath: is called very frequently. Even a calculation that is not expensive on its own adds up to a noticeable cost when it runs that often. The optimization is to avoid recalculating the text height on every call. After obtaining the Model data, calculate the layout information from the text content once and store it as a property on the corresponding Model. The UITableView callbacks can then read the stored property directly instead of measuring the text again.
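The caching idea can be sketched in plain C as follows. Everything here is hypothetical: RowModel stands in for the Model, and measure_text_height is a deliberately crude placeholder (fixed characters per line, fixed line height) for real text measurement, not an iOS API.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a model object that carries its own cached layout. */
typedef struct {
    const char *text;
    double      cached_height;    /* meaningful only when height_is_valid != 0 */
    int         height_is_valid;
} RowModel;

static int measure_calls = 0;     /* counts expensive measurements, for illustration */

/* Hypothetical placeholder for real text measurement. */
static double measure_text_height(const char *text, size_t chars_per_line,
                                  double line_height) {
    measure_calls++;
    size_t len = strlen(text);
    size_t lines = len == 0 ? 1 : (len + chars_per_line - 1) / chars_per_line;
    return (double)lines * line_height;
}

/* What a height callback would call: measure once, then reuse the result. */
static double row_height(RowModel *row) {
    if (!row->height_is_valid) {
        row->cached_height = measure_text_height(row->text, 20, 18.0);
        row->height_is_valid = 1;
    }
    return row->cached_height;
}

/* Call when the text changes so that the next lookup measures again. */
static void row_set_text(RowModel *row, const char *text) {
    row->text = text;
    row->height_is_valid = 0;
}
```

However many times row_height is called for the same row, the measurement runs only once until row_set_text invalidates the cached value.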

Text Rendering

All text controls that appear on screen, including UIWebView, are ultimately typeset and drawn into bitmaps by CoreText. Common text controls (UILabel, UITextView, etc.) do their typesetting and drawing on the main thread, so displaying a large amount of text puts heavy pressure on the CPU.

To optimize the performance of this part, we need to abandon the use of the upper-level controls provided by the system and directly use CoreText for typesetting control.

Wherever possible, try to avoid making changes to the frame of a view that contains text, because it will cause the text to be redrawn. For example, if you need to display a static block of text in the corner of a layer that frequently changes size, put the text in a sublayer instead.

The above paragraph is quoted from iOS Core Animation: Advanced Techniques. Translated, it means that the view containing text will trigger the re-rendering of the text when the layout is changed. For static text, we should minimize the layout modification of the view in which it is located.

Drawing of images

Image drawing usually refers to drawing into a canvas with functions whose names start with CG, and then creating an image from the canvas and displaying it. As the earlier module diagram showed, Core Graphics runs on the CPU, so calling CG functions consumes CPU resources. We can move the drawing to a background thread and then set the result as the layer's contents on the main thread. The code is as follows:

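  - (void)display {
      // Draw on a background queue so the main thread stays responsive.
      dispatch_async(backgroundQueue, ^{
          CGContextRef ctx = CGBitmapContextCreate(...);
          // draw in context...
          CGImageRef img = CGBitmapContextCreateImage(ctx);
          CGContextRelease(ctx);
          // Layer properties must be set on the main thread.
          dispatch_async(dispatch_get_main_queue(), ^{
              layer.contents = (__bridge id)img;
              CGImageRelease(img); // the layer keeps its own reference
          });
      });
  }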

Decoding of images

Once an image file has been loaded, it must then be decompressed. This decompression can be a computationally complex task and take considerable time. The decompressed image will also use substantially more memory than the original.

In other words, a loaded image must be decoded before it can be displayed, and decoding costs both time and memory: the decoded bitmap is much larger than the compressed file it came from.
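A rough estimate shows why the decoded image is so much larger. This plain C sketch assumes 4 bytes per pixel (8-bit RGBA) and no row padding; the function name is hypothetical, and the real figure depends on the pixel format used.

```c
#include <stddef.h>

/* Bytes needed for a decoded bitmap, assuming 4 bytes per pixel
   (8-bit RGBA) and no padding at the end of each row. */
static size_t decoded_bitmap_bytes(size_t width_px, size_t height_px) {
    return width_px * height_px * 4;
}
```

Under these assumptions a 1024 × 1024 image occupies 4 MiB once decoded, however small the compressed file is.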

In order to save memory, the iOS system delays the decoding process. The decoding process is performed only after the image is set to the contents property of the layer or the image property of the UIImageView. However, these two operations are performed on the main thread, which still brings performance problems.

If you want to decode in advance, you can use ImageIO or draw the image into a CGContext ahead of time. For details of this practice, see iOS Core Animation: Advanced Techniques.

One more point. The commonly used UIImage loading methods are imageNamed: and imageWithContentsOfFile:. After imageNamed: loads an image, the image is decoded immediately and the system caches the decoded result, but the caching strategy is not public and we cannot know when the image will be released. On some performance-sensitive pages, we can therefore hold images loaded by imageNamed: in static variables to prevent them from being released, trading space for time.

GPU-intensive tasks

Compared with the CPU, the GPU's job is relatively simple: it receives the submitted textures and vertex descriptions (triangles), applies transforms, blends and renders, and then outputs to the screen. Broadly speaking, most CALayer properties are drawn by the GPU.

The following operations will reduce the performance of GPU drawing:

Texture rendering

All bitmaps, including images, text, and rasterized content, must eventually be uploaded from memory to video memory and bound as GPU textures. Both the upload and the GPU's adjusting and rendering of textures consume considerable GPU resources. When a large number of images is displayed in a short period (for example, when a table view full of images is scrolled quickly), CPU usage stays low while GPU usage is very high, and the interface still drops frames. The only way to avoid this is to limit how many images are displayed in a short period and, where possible, to combine multiple images into one.

In addition, when the image is too large and exceeds the maximum texture size of the GPU, the image needs to be pre-processed by the CPU first, which will cause additional resource consumption for both the CPU and GPU.

Blending of views

When multiple views (or CALayers) overlap and display, the GPU will first blend them together. If the view structure is too complex, the blending process will also consume a lot of GPU resources. In order to reduce the GPU consumption of this situation, the application should minimize the number and layers of views, and reduce unnecessary transparent views.

Off-Screen Rendering

Off-screen rendering means that the layer is rendered in a buffer opened outside the current screen buffer before being displayed.

Off-screen rendering requires multiple context switches: first from the current screen (On-Screen) to off-screen (Off-Screen), and then, once off-screen rendering is complete and its result has to be displayed, from off-screen back to the current screen. Context switches are expensive.

Common triggers of off-screen rendering are:

  • Shadow (UIView.layer.shadowOffset/shadowRadius/…)
  • Rounded corners (when UIView.layer.cornerRadius and UIView.layer.masksToBounds are used together)
  • Layer Mask
  • Enable rasterization (shouldRasterize = true)

When using shadows, setting shadowPath at the same time can avoid off-screen rendering and greatly improve performance. There will be a Demo to demonstrate this later. Off-screen rendering triggered by rounded corners can be avoided by using CoreGraphics to process the image into rounded corners.

CALayer has a shouldRasterize property; setting it to true enables rasterization. With rasterization enabled, the layer is drawn to an off-screen image, which is then cached and drawn in place of the actual layer's contents and sublayers. For layers with many sublayers or complex effects, this is more efficient than redrawing everything on every frame. However, producing the rasterized image takes time and consumes additional memory.

Rasterization will also bring some performance loss. Whether to enable it depends on the actual usage scenario. It is not recommended to use it when the layer content changes frequently. It is best to use Instruments to compare the FPS before and after enabling it to see if it has achieved the optimization effect.

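Note:

When shouldRasterize = true, remember to set rasterizationScale as well, typically to [UIScreen mainScreen].scale. Its default value is 1.0, so without it a rasterized layer looks blurry on a Retina screen.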

Instruments Usage

Instruments is a set of tools. Here we only demonstrate the use of Core Animation. You will see the following options in the lower right corner of the Core Animation options:

Color Blended Layers

This option highlights blended areas of the screen on a scale from green to red according to how much blending takes place. The redder the area, the worse the performance and the greater the impact on metrics such as frame rate. Red is usually caused by several semi-transparent layers stacked on top of one another.

Color Hits Green and Misses Red

When UIView.layer.shouldRasterize = YES, expensive drawing is cached and presented as a simple flattened image. If other parts of the page (such as reused UITableViewCells) hit that cache, they are shown in green; if they miss, they are shown in red. The more red, the worse the performance. Rasterizing and generating the cache is expensive, so the overall cost goes down only if the cache is hit often and used effectively. Otherwise new caches have to be generated constantly, which makes the performance problem worse.

Color Copied Images

For images with color formats that are not supported by the GPU, they can only be processed by the CPU, and such images are marked as blue. The more blue, the worse the performance.

Color Immediately

Normally the Core Animation instrument updates layer debug colors once every 10 milliseconds. For some effects this is too slow. This option makes it update every frame (which may affect rendering performance and skew frame rate measurements, so don't leave it on all the time).

Color Misaligned Images

This option checks if the image is scaled and if the pixels are aligned. Scaled images are marked in yellow, and pixel misalignment is marked in purple. The more yellow and purple, the worse the performance.

Color Offscreen-Rendered Yellow

This option will make layers that are rendered off-screen appear yellow. The more yellow, the worse the performance. These layers that appear yellow are likely to need to be optimized using shadowPath or shouldRasterize.

Color OpenGL Fast Path Blue

This option will make any layer drawn directly with OpenGL appear blue. The more blue, the better the performance. If you only use UIKit or Core Animation APIs, it will have no effect.

Flash Updated Regions

This option will display the redrawn content in yellow. The more yellow that shouldn't appear, the worse the performance. Usually we want only the updated part to be marked yellow.

Demo

Among the options above, Color Blended Layers, Color Offscreen-Rendered Yellow, and Color Hits Green and Misses Red are the ones most commonly used to test performance. Below I focus on detecting off-screen rendering and rasterization. I wrote a simple demo that sets a shadow effect. The code is as follows:

  view.layer.shadowOffset = CGSizeMake(1, 1);
  view.layer.shadowOpacity = 1.0;
  view.layer.shadowRadius = 2.0;
  view.layer.shadowColor = [UIColor blackColor].CGColor;
  // view.layer.shadowPath = CGPathCreateWithRect(CGRectMake(0, 0, 50, 50), NULL);

When shadowPath is not set, the FPS detected by Instruments is basically below 20 (iPhone 6 device). After setting shadowPath, it is basically maintained at around 55, and the performance improvement is very obvious.

Let's take a look at the rasterization detection. The code is as follows:

  view.layer.shouldRasterize = YES;
  view.layer.rasterizationScale = [UIScreen mainScreen].scale;

After checking the Color Hits Green and Misses Red option, the following is displayed:

We can see that the cache is effective while the content is stationary but is of little use during fast scrolling. Whether to enable rasterization therefore depends on the specific scenario; use Instruments to compare performance before and after enabling it.

Summary

This article summarizes some of the theory behind performance tuning and then introduces the Core Animation debugging options in Instruments. The most important rule of performance optimization is to measure with tools rather than guess. First check for problems such as off-screen rendering, then use Time Profiler to find time-consuming function calls. After each change, use the tools again to confirm that there is an improvement. Proceed step by step, and carefully.

I suggest you analyze your own application to deepen your impression. Enjoy~
