The Evolution of veImageX: iOS High-Performance Image Loading SDK

1. SDK Introduction

Images are a common element in business applications. veImageX (ImageX for short) provides businesses with a flexible, efficient, one-stop image processing solution that includes a server SDK, an upload SDK, and a client image loading SDK.

1.1 Mainstream open source image loading SDKs

Before introducing the veImageX image loading SDK, let's look at the mainstream image loading SDKs in the industry. The veImageX image loading SDK is written in Objective-C. The mainstream open source image loading SDKs written in Objective-C include YYWebImage and SDWebImage.

YYWebImage: An asynchronous image loading framework (a component of YYKit). It was created as an improved replacement for SDWebImage, PINRemoteImage, and FLAnimatedImage. It uses YYCache for memory and disk caching and YYImage for WebP/APNG/GIF decoding. Unfortunately, this excellent framework stopped receiving updates around 2017;
SDWebImage: A widely used framework that loads network images asynchronously and supports features such as local image caching. It is also an excellent image loading framework.

1.2 Advantages of veImageX SDK

The veImageX image loading SDK draws on the strengths of existing solutions and was developed around the characteristics of real online business applications. Compared with the open source image loading SDKs above, it has the following main features:

A layered, modular architecture lets a business select only the functional modules it needs, which minimizes package size;
Supports high-compression image formats such as WebP, AVIF, and HEIF. In particular, a self-developed high-performance HEIF software decoding library decodes HEIF efficiently and removes the dependence on the iOS system version for native HEIF support;
Supports cloud encryption and client decryption to protect the privacy and security of images;
The SDK's network library supports HTTPDNS, which helps prevent content hijacking and domain name hijacking, reduces the image decoding failure rate, and improves the client image loading experience;
Supports collecting and reporting a range of image-related data. Combined with the real-time dashboards in the veImageX console, this supports data discovery, analysis, monitoring, diagnosis, and tracking for business operations and product experience improvement.

2. SDK Architecture

Over time, the SDK gained more and more functions, and different businesses began to need different subsets of them. As App package sizes came under pressure, the SDK also had to shrink its own footprint. These demands raised the bar for modularization and plug-in support, and the SDK architecture evolved into the one shown in the following figure.

The SDK is divided into three main layers:

Interface layer: The top layer. It provides the interfaces for image loading and processing, and its interface design is consistent with the mainstream open source image loading SDKs. It also provides adapters for open source image loading SDKs (such as YYWebImage and SDWebImage), so that businesses can get started quickly and switch seamlessly;
Management layer: The middle layer. It manages the interaction among modules, including cloud control configuration management and authorization management;
Module layer: This layer contains the modules of the image loading process: the download module, cache module, decoding module, log reporting module, and so on. A business can depend on only the modules it needs, which keeps dependencies to a minimum.

3. How does UIImageView render a network image through the SDK?

The most common use of images in business is loading network images. Taking the native iOS control UIImageView as an example, the complete process of loading a network image through the SDK is as follows:

Initiate image request -> query memory cache -> query disk cache -> join download queue -> start downloading -> obtain undecoded image data from server -> decode the undecoded image data to get a renderable image -> cache the decoded image and undecoded image data into memory and disk respectively -> UIImageView renders the decoded image. At this point, a network image is successfully loaded and displayed to the user.
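The lookup order in this flow can be sketched in a few lines. The following is a language-neutral illustration in Python, not the SDK's Objective-C implementation; `ImagePipeline`, its dictionaries, and the injected `download` and `decode` callables are all hypothetical stand-ins for the real modules.

```python
from typing import Callable, Dict, Tuple


class ImagePipeline:
    """Hypothetical sketch of the order: memory -> disk -> download -> decode."""

    def __init__(self, download: Callable[[str], bytes],
                 decode: Callable[[bytes], object]):
        self.download = download                    # fetches undecoded bytes for a URL
        self.decode = decode                        # turns bytes into a renderable image
        self.memory_cache: Dict[str, object] = {}   # holds decoded images
        self.disk_cache: Dict[str, bytes] = {}      # holds undecoded data (stand-in for files)

    def load(self, url: str) -> Tuple[object, str]:
        """Return (image, source); source names the stage that satisfied the request."""
        image = self.memory_cache.get(url)
        if image is not None:
            return image, "memory"

        data = self.disk_cache.get(url)
        source = "disk"
        if data is None:
            data = self.download(url)
            self.disk_cache[url] = data             # undecoded data goes to disk
            source = "network"

        image = self.decode(data)
        self.memory_cache[url] = image              # decoded image goes to memory
        return image, source
```

In this sketch, the first `load` of a URL reports `"network"`, a repeat within the same process reports `"memory"`, and a request after the memory cache is emptied (as after a cold start) reports `"disk"`.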

4. SDK module introduction

With the complete loading process for network images in mind, this section introduces the five major modules of the SDK: downloading, caching, decoding, image post-processing, and log reporting.

4.1 Download module

The download module's main task is to fetch network images from the server to the client through a network library. This step is critical to image loading: whether the download succeeds determines whether the image can be displayed at all, and the performance of the network library determines the download speed. Both are directly reflected in the user experience. Therefore, in addition to an implementation based on Apple's native network library, the download module supports an implementation based on TTNetwork, ByteDance's self-developed network library. TTNetwork includes network optimizations such as HTTPDNS, HTTP/2 and HTTPS connection reuse, link selection, and dynamic strategies. It also supports the QUIC protocol and provides finer-grained network monitoring, giving the SDK efficient support for image downloads. The SDK supports the native and self-developed network libraries by default. If a business has its own network library, it can be integrated as a plug-in.

Businesses usually download multiple images concurrently. In a feed scenario, if the user scrolls back and forth, the same image is requested multiple times. Downloading it again for each request would waste user traffic and increase bandwidth costs. To avoid this, the SDK manages concurrent download tasks and identifies requests for the same image. Download tasks are managed and scheduled with the native NSOperation and NSOperationQueue of iOS. In addition, an identifier is generated from the request parameters to uniquely identify each download task. The download manager tracks these identifiers so that the same image is not downloaded more than once within the same time period.
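As an illustration of coalescing duplicate requests, the sketch below keys each in-flight download by an identifier built from the request parameters and attaches later requests for the same key to the existing task. It is written in Python for brevity and is single-threaded. The names `make_identifier` and `DownloadManager` and the identifier scheme are assumptions; the SDK's actual scheduling relies on NSOperation and NSOperationQueue as described above.

```python
from typing import Callable, Dict, List


def make_identifier(url: str, params: Dict[str, str]) -> str:
    """Build a stable key from the request parameters (hypothetical scheme)."""
    query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"{url}?{query}" if query else url


class DownloadManager:
    """Starts one download per identifier and fans the result out to all callers."""

    def __init__(self, start_download: Callable[[str], None]):
        self._start_download = start_download
        # identifier -> callbacks waiting for that download
        self._pending: Dict[str, List[Callable[[bytes], None]]] = {}

    def request(self, url: str, params: Dict[str, str],
                on_done: Callable[[bytes], None]) -> bool:
        """Register a callback; return True only if a new download was started."""
        key = make_identifier(url, params)
        if key in self._pending:
            self._pending[key].append(on_done)   # join the in-flight task
            return False
        self._pending[key] = [on_done]
        self._start_download(key)
        return True

    def finish(self, key: str, data: bytes) -> None:
        """Deliver the downloaded data to every caller waiting on this key."""
        for callback in self._pending.pop(key, []):
            callback(data)
```

A real implementation would also need locking, cancellation, and error delivery, which are omitted here.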

4.2 Cache Module

The cache module is a two-level cache made up of a memory cache and a disk cache. When an image is downloaded to the client, it is stored in both. If the image is requested again during the App life cycle, it can be served from the memory cache. If it is requested again after a cold start, it can be served from the disk cache. This speeds up image loading and improves the user experience, and it also reduces user traffic and saves bandwidth costs. Adding an expiration time to cached entries addresses the problem of image freshness.

For the memory cache, in addition to the native iOS NSCache, the SDK supports a strong-weak cache based on weak references: when a cached object is no longer held by anyone, it is released promptly to reduce memory usage. The SDK also supports a thread-safe LRU cache that actively releases memory when it receives a low-memory notification, relieving memory pressure. For the disk cache, in addition to basic file management through the iOS NSFileManager, the SDK supports a thread-safe LRU cache.
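To make the LRU behavior concrete, here is a minimal thread-safe sketch in Python. It assumes a cost limit measured in bytes and a method the host calls on a low-memory notification; `LRUMemoryCache` and its method names are hypothetical and do not reflect the SDK's cache protocol.

```python
import threading
from collections import OrderedDict


class LRUMemoryCache:
    """Evicts the least recently used entries once the total cost exceeds the limit."""

    def __init__(self, cost_limit: int):
        self._cost_limit = cost_limit
        self._items = OrderedDict()      # key -> (value, cost), oldest first
        self._total_cost = 0
        self._lock = threading.Lock()    # guards every access to the fields above

    def get(self, key):
        with self._lock:
            entry = self._items.get(key)
            if entry is None:
                return None
            self._items.move_to_end(key)  # mark as most recently used
            return entry[0]

    def set(self, key, value, cost: int) -> None:
        with self._lock:
            old = self._items.pop(key, None)
            if old is not None:
                self._total_cost -= old[1]
            self._items[key] = (value, cost)
            self._total_cost += cost
            while self._total_cost > self._cost_limit and self._items:
                _, (_, evicted_cost) = self._items.popitem(last=False)
                self._total_cost -= evicted_cost

    def on_memory_warning(self) -> None:
        """Drop everything when the system reports memory pressure."""
        with self._lock:
            self._items.clear()
            self._total_cost = 0

    def keys(self):
        with self._lock:
            return list(self._items)
```

Note that in this sketch an entry whose cost alone exceeds the limit is evicted immediately after insertion.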

Because images are used in different scenarios, a single fixed cache algorithm cannot suit all of them, and an App that uses only one will see a low cache hit rate. Both the memory cache and the disk cache are defined by protocols, so in addition to the cache algorithms the SDK supports by default, businesses can implement custom caches and use different cache algorithms in different scenarios, which can greatly improve the cache hit rate. In some business scenarios, the SDK's cache hit rate reaches about 80%. The higher the cache hit rate, the greater the bandwidth cost savings.

4.3 Decoding Module

Images arrive at the client as undecoded data and must be decoded before they can be displayed. The SDK can decode through ImageIO, the native iOS decoding framework, so every format that Apple supports natively is also supported by the SDK. For formats that are not natively supported, such as WebP, AVIF, and VVIC (an image format developed by ByteDance based on the BVC algorithm), the SDK decodes through self-developed or open source decoders. To support a new image format, you only need to implement the protocols for animated and static images of that format and integrate the implementation into the SDK as a plug-in.
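One common way to route data to the right decoder plug-in is to inspect the file signature. The Python sketch below assumes a registry in which the most recently registered plug-in is asked first. `DecoderRegistry` and the plug-in names are hypothetical; the signature checks use the well-known PNG, RIFF/WebP, and ISO BMFF `ftyp` markers and look only at the major brand.

```python
from typing import Callable, List, Optional, Set, Tuple


def is_png(data: bytes) -> bool:
    return data.startswith(b"\x89PNG\r\n\x1a\n")


def is_webp(data: bytes) -> bool:
    # RIFF container: "RIFF" + 4-byte size + "WEBP"
    return data[:4] == b"RIFF" and data[8:12] == b"WEBP"


def has_ftyp_brand(data: bytes, brands: Set[bytes]) -> bool:
    # ISO BMFF (HEIF/AVIF): 4-byte box size, then "ftyp", then the major brand
    return data[4:8] == b"ftyp" and data[8:12] in brands


class DecoderRegistry:
    """Maps undecoded data to the name of the first plug-in that accepts it."""

    def __init__(self) -> None:
        self._decoders: List[Tuple[str, Callable[[bytes], bool]]] = []

    def register(self, name: str, can_decode: Callable[[bytes], bool]) -> None:
        self._decoders.insert(0, (name, can_decode))  # newest plug-in is asked first

    def pick(self, data: bytes) -> Optional[str]:
        for name, can_decode in self._decoders:
            if can_decode(data):
                return name
        return None
```

Adding support for a new format in this sketch means registering one more `(name, can_decode)` pair, which mirrors the plug-in idea described above.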

4.3.1 SDK Features: HEIF support for all iOS systems

High-compression formats such as HEIF are already in mature use within ByteDance. For bandwidth, HEIF saves another 30% compared with WebP at the same quality, which has saved the company considerable bandwidth cost. For loading, HEIF supports progressive loading: the HEIF thumbnail is loaded first and the original HEIF image afterwards, so images load well even on poor networks. With the company's self-developed high-performance HEIF software decoding library, the SDK's HEIF decoding no longer depends on the iOS system. Natively, HEIF still images require iOS 11 or later and HEIF animated images require iOS 13 or later. With the SDK, both can be used on lower iOS versions as well, which greatly widens the range in which HEIF can be applied and increases the bandwidth cost savings.

4.4 Image Post-Processing Module

After an image is loaded, the business can also apply real-time transformations to it as needed, such as rounded corners or super-resolution. These are implemented through image post-processing. The following introduces one distinctive SDK capability: super-resolution.

4.4.1 SDK Features: Super Resolution

Super-resolution refers to restoring a high-resolution image from a given low-resolution image based on machine learning/deep learning methods. With the help of image post-processing, real-time image super-resolution can be achieved on mobile devices.

Super-resolution is generally used in two scenarios. The first is improving the user experience: when the original image has low resolution and clarity, super-resolution sharpens it for a better viewing experience. The second is downgraded transmission followed by super-resolution: when the user requests a high-resolution image, the image is transmitted at a reduced resolution and then restored on the client to the requested resolution through super-resolution, which saves bandwidth costs.

4.5 Log reporting module

The SDK includes three major log modules: the image performance log, the user perception log, and the large image monitoring log. Together they provide data support for business operations and product experience improvement. In the Volcano Engine veImageX console, you can view visualized dashboards in real time and monitor image metrics comprehensively.

Among them, the image performance log includes data such as image URL, download time, decoding time, error code, image source, etc., which is used to monitor various performance indicators of the image; the user perception log includes data such as image URL, ImageView Size, ImageView display time, etc., which is used to monitor various user experience indicators; the large image monitoring log includes data such as large image URL, memory usage, image file size, image resolution size, etc., which can comprehensively monitor abnormal large image situations.

5. Evolution: Performance Optimization

The SDK is committed to providing the ultimate user experience for image loading. To this end, the SDK has made many related performance optimizations. The following mainly introduces how the SDK improves the image loading experience, reduces memory usage, and optimizes animated image playback.

5.1 Improve image loading experience

The speed of image loading directly affects the user experience. Efficient image loading is an indispensable capability of the SDK.

Progressive Loading

When loading large static images, or loading multi-frame animated images, or in weak network scenarios, you can enable progressive loading of images to improve the image loading experience.

The SDK supports progressive loading of static PNG and JPEG images as well as static HEIF images, for which the HEIF thumbnail is loaded first and the original HEIF image afterwards. The SDK also supports progressive loading of animated images, which can be played while they are still downloading. Under normal network conditions, this shortens the time to the first frame. Under weak network conditions, a buffering mechanism similar to video playback improves the playback experience of animated images.

Force Decode

For decoding, the SDK supports Force Decode, which hands the bitmap buffer to the rendering process ahead of time and so reduces the time spent on copying during later rendering. If the decoded bitmap buffer is in a format that the iOS display hardware does not support directly, it is converted in advance. This avoids conversion overhead on the main thread during rendering and improves the frame rate while images load.

5.2 Elegant Memory Control

Apps usually load images in many scenarios. When a large number of images are loaded, they can occupy a great deal of memory. If memory usage becomes too high, the App runs out of memory (OOM), which the user experiences as a crash: the application suddenly exits.

The SDK provides the following solutions to reduce image memory usage:

Release memory cache

When the system memory is tight and a low memory notification is received, the cache module will release the memory cache in a timely manner. It also provides an interface for the business to actively release the memory cache at an appropriate time.

Global image downsampling

The size of the image in memory can be simply estimated using the following formula:

 memoryCost (bytes) = imageWidth (pixels) * imageHeight (pixels) * 4

As the formula shows, reducing memory means reducing the image's width and height as far as possible without affecting function or experience. When you cannot be sure whether a downloaded image will be much larger than the ImageView that displays it, you can enable global image downsampling. There are two modes: downsampling by size limit and downsampling by memory limit.

Downsampling by size limit:

If the current image is larger than the configured downsampling size, the original image is scaled down to that size.

Downsampling by memory limit:

If the memory usage of the current image exceeds the memory limit, the original image is scaled down with its aspect ratio preserved until its memory usage is just below the limit.
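The two downsampling modes follow directly from the memory formula above. The sketch below is a worked example in Python under stated assumptions: 4 bytes per pixel as in the formula, aspect-ratio-preserving scaling in both modes, and rounding down so the result never exceeds the limit. The function names are hypothetical.

```python
import math
from typing import Tuple

BYTES_PER_PIXEL = 4  # from the formula: width * height * 4


def memory_cost(width: int, height: int) -> int:
    """Estimated decoded size in bytes of a single frame."""
    return width * height * BYTES_PER_PIXEL


def downsample_to_size(width: int, height: int,
                       max_width: int, max_height: int) -> Tuple[int, int]:
    """Fit the image inside max_width x max_height; never upscale."""
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


def downsample_to_memory(width: int, height: int,
                         memory_limit: int) -> Tuple[int, int]:
    """Scale both sides by the same factor so the cost fits the memory limit."""
    cost = memory_cost(width, height)
    if cost <= memory_limit:
        return width, height
    # Memory grows with the square of the linear scale, hence the square root.
    scale = math.sqrt(memory_limit / cost)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

For example, a 4000 x 3000 image costs 48,000,000 bytes by the formula. With a 12,000,000-byte limit the linear scale is the square root of 0.25, that is 0.5, which gives 2000 x 1500.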

Disable image rendering

Before each rendering operation, the SDK calls back to the business with the metadata of the current image, such as its width and height, the number of frames of an animated image, and the estimated memory consumption. The business can use this information to block the rendering of oversized images that do not meet expectations.

Large image monitoring

In real business scenarios, the resolution and frame count of an image to be displayed are not known in advance. In one extreme case from production, an animated image generated from a user's screen recording had a resolution of 1080p and hundreds of frames. After decoding, it occupied more than 1 GB of memory and caused an immediate OOM on some low-end devices. OOMs of this kind are hard to troubleshoot with conventional methods. So how can such unexpectedly large online images be monitored effectively? The SDK defines a large image along three dimensions: image display size, memory size after decoding, and image file size. When an image exceeds the threshold in any of the three dimensions, it is recorded in the large image monitoring log and reported later. The business can view the details under the large image monitoring metrics in the veImageX console. When the memory usage value is abnormally large, the business can find the corresponding image URL promptly and, taking the actual business scenario into account, take the image offline to reduce the online OOM rate.
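The three-dimension check can be sketched as follows. This Python example is illustrative only: the `ImageInfo` and `Thresholds` types, the threshold values in the usage note, and the reading of "display size" as pixel area are assumptions, and decoded memory is estimated as width * height * 4 per frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageInfo:
    url: str
    pixel_width: int
    pixel_height: int
    frame_count: int   # 1 for a static image
    file_size: int     # size of the undecoded file in bytes


@dataclass
class Thresholds:
    max_pixels: int          # limit on width * height (assumed meaning of display size)
    max_decoded_bytes: int   # limit on memory after decoding all frames
    max_file_bytes: int      # limit on the undecoded file size


def large_image_reasons(info: ImageInfo, limits: Thresholds) -> List[str]:
    """Return the dimensions in which the image exceeds its threshold."""
    reasons = []
    pixels = info.pixel_width * info.pixel_height
    if pixels > limits.max_pixels:
        reasons.append("display_size")
    if pixels * 4 * info.frame_count > limits.max_decoded_bytes:
        reasons.append("decoded_memory")
    if info.file_size > limits.max_file_bytes:
        reasons.append("file_size")
    return reasons
```

With a 1920 x 1080 animated image of 150 frames, the estimate is 8,294,400 * 150 = 1,244,160,000 bytes, which is more than 1 GB and consistent with the case described above. Any non-empty result would mark the image for the large image monitoring log.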

5.3 Optimization of animated image playback

Animated images are also common in business, and optimizing them well improves the user experience. Playing an animated image means continuously decoding frame after frame, which consumes a lot of CPU resources. The SDK compares the currently available memory with the memory required to hold all frames of the animated image. If the available memory is sufficient, the SDK caches all frames to save CPU resources. If it is not, the SDK discards each frame after it has been played and decodes frames again each time they are needed, trading CPU for memory. This strikes a balance between CPU consumption and memory savings.
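The trade-off between caching all frames and decoding on demand can be sketched like this. The Python class below is a simplified illustration: `AnimatedPlayer`, the injected `decode_frame` callable, and the way available memory is passed in are all hypothetical, and a real player would also handle frame timing.

```python
from typing import Callable, Dict


class AnimatedPlayer:
    """Caches every decoded frame if memory allows, otherwise decodes on demand."""

    def __init__(self, frame_count: int, frame_bytes: int, available_memory: int,
                 decode_frame: Callable[[int], object]):
        self.frame_count = frame_count
        # Cache all frames only when the whole animation fits in available memory.
        self.cache_all = frame_count * frame_bytes <= available_memory
        self._decode = decode_frame
        self._cache: Dict[int, object] = {}

    def frame(self, index: int):
        """Return the decoded frame for a playback position (loops at the end)."""
        index %= self.frame_count
        if self.cache_all:
            if index not in self._cache:
                self._cache[index] = self._decode(index)  # decoded once, then reused
            return self._cache[index]
        return self._decode(index)  # nothing retained: costs CPU, saves memory
```

In this sketch, looping a three-frame animation twice triggers three decodes when all frames are cached and six when they are not.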

6. Final Thoughts

Although there are already many mature image loading SDKs in the industry, it is also important to have an SDK that fits the company's own business development. The image loading SDK was born in this context as an indispensable part of the end-to-end veImageX product. Beyond performance optimization, the adoption of the HEIF format has saved the company considerable bandwidth cost. The SDK is also continuing to try new image formats with higher compression rates, such as VVIC. As for cutting-edge capabilities, with continued iteration and optimization of the image super-resolution algorithm, I believe it will bring further experience improvements and cost savings in the future.
