iQIYI’s overseas app network optimization practices

When targeting global users, the importance of the network is self-evident. Imagine a mobile app that takes 10 seconds to open its homepage: half of the users may leave. iQIYI launched an international version for global users. Facing the complex overseas network environment, we carried out a series of targeted optimizations and achieved good results. Here we summarize and share some of our practices and ideas, in the hope that they are helpful to you.

Three core principles run through everything below: avoid network requests where possible, aim for 0-RTT connection setup, and keep the requested content as small as possible.

01 Research the local network situation

In the first versions of the App, raise the sampling rate for request-link data. With enough samples you can see clearly what the network environment is like in the markets you are promoting in, and the data exposes the network problems of each country and region. Getting this groundwork right is very important before large-scale promotion and investment.

Figure: Network latency from overseas users to overseas data centers (this is monitoring-node data; user-side latency is higher)

Figure: Mobile network conditions in major overseas countries and regions

During the research phase, the following problems stood out, and they had a real impact on our operations and on the App experience:

1. Serious hijacking by operators, both DNS hijacking and HTTP hijacking.
2. Complex mobile networks; Southeast Asia's network infrastructure still needs improvement.
3. Low-end Android phones account for a share of devices large enough to affect decisions.
4. High latency between international users and our servers.

In the initial stage, the core technical work is to solve the problems above and build solid infrastructure for later operations. Because most business interfaces are HTTP-based, we focused our improvements on the HTTPS request path.

Figure: Analysis of an HTTPS request stage

The first HTTPS request costs 5 RTTs:

1 RTT (DNS) + 1 RTT (TCP handshake) + 2 RTT (TLS 1.2) + 1 RTT (HTTP request and response)

Taking a 50 ms delay between client and service as an example, the latency of one HTTPS interface = 50 ms * 5 + 100 ms (server side) = 350 ms. For a non-domestic user, opening the homepage takes 1.1 s (which corresponds to an RTT of about 200 ms in the same formula), and that is obviously too long.
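The arithmetic above can be sketched in a few lines of Python. The function name and its defaults are illustrative assumptions; only the 5-RTT breakdown, the 50 ms example and the 100 ms server time come from the text.

```python
def first_request_latency_ms(rtt_ms, server_ms=100, dns=1, tcp=1, tls=2, http=1):
    """Latency of a first HTTPS request: (number of round trips) * RTT + server time."""
    round_trips = dns + tcp + tls + http  # 5 with TLS 1.2 and a cold DNS cache
    return round_trips * rtt_ms + server_ms


# 50 ms is the article's example; 100 ms and 200 ms are added only for comparison.
for rtt in (50, 100, 200):
    print(f"RTT {rtt} ms -> {first_request_latency_ms(rtt)} ms")
```

Every round trip removed from the handshake saves one full RTT, which is why the rest of the article concentrates on cutting round trips rather than on server time.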

The technical improvements are described below.

Figure: Summary of the key points of technical optimization

02 Improvement and optimization plan

01 Improvement and optimization of basic links

1. DNS optimization and adjustment

We switched DNS resolution to HTTPDNS. After this went live, the efficiency of initial connection requests improved by 17%. HTTPDNS:

  • Solves domain name hijacking (data reported back from Southeast Asia shows a lot of hijacking).
  • Solves the problem of LocalDNS not assigning the nearest node.
  • Allows resolution to be pre-warmed in step with the business.
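A minimal sketch of the idea, assuming a hypothetical `httpdns_lookup` callable that asks an HTTPDNS service over HTTP(S) and returns a list of IPs. The class, its parameters and the TTL are illustrative and not iQIYI's implementation; the demo uses a fake lookup so it runs without network access.

```python
import socket
import time


class HttpDnsResolver:
    """Resolve via HTTPDNS first, fall back to local DNS, and cache the result."""

    def __init__(self, httpdns_lookup, local_lookup=socket.gethostbyname, ttl_s=300):
        self._httpdns_lookup = httpdns_lookup  # hypothetical: host -> list of IP strings
        self._local_lookup = local_lookup      # fallback: host -> single IP string
        self._ttl_s = ttl_s                    # assumed cache lifetime
        self._cache = {}                       # host -> (expiry time, list of IPs)

    def resolve(self, host):
        now = time.monotonic()
        cached = self._cache.get(host)
        if cached and cached[0] > now:
            return cached[1]
        try:
            ips = list(self._httpdns_lookup(host))
        except Exception:
            ips = []  # HTTPDNS unreachable: use the fallback below
        if not ips:
            ips = [self._local_lookup(host)]
        self._cache[host] = (now + self._ttl_s, ips)
        return ips

    def prewarm(self, hosts):
        """Resolve hosts the app is about to need, e.g. during startup."""
        for host in hosts:
            self.resolve(host)


# Demo with stand-in lookups (documentation-range addresses, no real network calls).
resolver = HttpDnsResolver(lambda host: ["203.0.113.10"],
                           local_lookup=lambda host: "198.51.100.7")
resolver.prewarm(["api.example"])
print(resolver.resolve("api.example"))
```

Pre-warming at startup means the first real business request finds the address already cached and skips the DNS round trip.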

2. Optimization and adjustment of the transport layer

The MTU Problem

  • Mismatched MTU values on the client and the server lead to excessive packet loss. Some AWS setups use jumbo frames by default, with an MTU of 9001, while the receiving end defaults to 1500, which causes some packet loss.
  • If you combine multiple cloud providers with VPN networking, packets encapsulated in an IP tunnel must still fit within 1500 bytes, so the tunnel headers reduce the usable payload; oversized packets are lost and retransmitted.
  • The worst case: some networks block the ICMP protocol, so the path MTU cannot be negotiated automatically.
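To make the numbers concrete, here is a small sketch of how much TCP payload fits in one packet for a given MTU. The 20-byte header sizes are the standard IPv4 and TCP headers without options; the 20-byte tunnel overhead is an assumed IP-in-IP example, and real VPN protocols add different amounts.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def max_tcp_payload(link_mtu, tunnel_overhead=0):
    """Largest TCP payload (MSS) that fits in one packet without fragmentation."""
    return link_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


print(max_tcp_payload(1500))      # ordinary Ethernet path
print(max_tcp_payload(9001))      # jumbo frames, as in the AWS case above
print(max_tcp_payload(1500, 20))  # assumed IP-in-IP tunnel: one extra IPv4 header
```

A sender that sizes segments for the 9001-byte MTU will exceed what a 1500-byte path can carry, which is where the loss in the first bullet comes from when path MTU discovery cannot do its job.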

TCP-level optimization

TCP congestion control optimization

CongWin (the congestion window) is the number of bytes that can be sent continuously without receiving an acknowledgment from the receiver. It is adjusted dynamically, and its useful upper limit is the product of bandwidth and delay. Take a 100 Mbps link with a 100 ms delay: bandwidth-delay product = 100 Mbps * 100 ms = (100/8) MB/s * (100/1000) s = 1.25 MB. In theory CongWin can grow to 1.25 MB. CentOS defaults to an initial CongWin of 20*MSS, around 29 KB, which is far from that 1.25 MB upper limit. Raising the default lets TCP ramp up faster.
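The calculation can be checked with a short sketch. The 1460-byte MSS and the idealised "window doubles every RTT" slow-start model are assumptions for illustration; the 100 Mbps, 100 ms and 20*MSS figures are the ones quoted in the text.

```python
import math


def bdp_bytes(bandwidth_mbps, rtt_ms):
    """Bandwidth-delay product in bytes."""
    return bandwidth_mbps * 1_000_000 / 8 * rtt_ms / 1000


def slow_start_rounds(initial_segments, target_segments):
    """Round trips needed to reach the target window if it doubles every RTT."""
    rounds, cwnd = 0, initial_segments
    while cwnd < target_segments:
        cwnd *= 2
        rounds += 1
    return rounds


MSS = 1460  # assumed segment size in bytes
bdp = bdp_bytes(100, 100)
target = math.ceil(bdp / MSS)

print(f"BDP: {bdp / 1_000_000:.2f} MB, i.e. {target} segments")
print(f"initial window of 20 segments: {20 * MSS} bytes")
for initial in (20, 40, 80):
    print(f"start at {initial} segments -> {slow_start_rounds(initial, target)} RTTs to fill the pipe")
```

In this simplified model each doubling of the initial window saves one round trip of ramp-up, which matters most for the short responses typical of API calls.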

TCP Fast Open (TFO)

Even with TCP keepalive, connections still get dropped and have to be re-established. TFO is an optimization for this case: a client that has connected to the server before can send data in the SYN packet, which saves one RTT on reconnection.

Figure: Principle mechanism of TFO

The RTT to overseas services is usually over 100 ms, so the saving matters: in our observation, enabling TFO improved HTTP request efficiency by about 12%.

02 Improvement and optimization of the application layer

1. HTTP Optimization

HTTP/1.1 has keep-alive, which reuses TCP connections and reduces the cost of opening new ones. This suits browser traffic better; on mobile, most requests still go over new connections. HTTP/1.1's serial request handling also suffers from head-of-line blocking.

2. SSL layer optimization

Upgrade to TLS 1.3 where possible, use its pre-shared key (PSK) mechanism, and enable ssl_early_data to get to "0-RTT". If you cannot upgrade the TLS version, switch the key exchange algorithm to ECDHE, which is fast to compute and cuts the handshake from 2-RTT to 1-RTT, achieving a result similar to TLS 1.3.

Figure: Differences between TLS versions

With TLS 1.3, the time to complete an HTTPS request (not counting DNS) drops from 4 RTTs to 3 RTTs: 1 RTT (TCP handshake) + 1 RTT (TLS 1.3) + 1 RTT (HTTP request).

3. Upgrade to HTTP/2

Several important improvements: binary framing, multiplexing, and header compression.

Multiplexing

In HTTP/2, there are two very important concepts: frame and stream. Frame represents the smallest unit of data. Each frame identifies which stream it belongs to. A stream is a data stream composed of multiple frames. Multiplexing means that multiple streams can exist in a TCP connection. These improvements can avoid HTTP head-of-line blocking and improve transmission performance.
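The frame and stream relationship can be illustrated with a toy model. This is not the real HTTP/2 wire format: a frame here is just a `(stream_id, chunk)` tuple, and all function names are made up for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, payload, frame_size):
    """Split one message into frames, each tagged with its stream id."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one shared connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: rebuild each stream from the frames that carry its id."""
    streams = defaultdict(bytes)
    for stream_id, chunk in wire:
        streams[stream_id] += chunk
    return dict(streams)


wire = interleave(to_frames(1, b"homepage-json", 4), to_frames(3, b"poster", 4))
print(wire)
print(reassemble(wire))
```

Because every frame names its stream, the short response on stream 3 completes without waiting for stream 1 to finish, which is what removes head-of-line blocking at the HTTP layer.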

Header Compression

If developers do not keep header content under control, headers grow out of hand; clients, for example, easily end up storing very large cookies. Header compression reduces the cost of sending these repeated headers with every request.

Figure: HTTP/2's frame transmission mechanism

4. Dynamic acceleration through edge nodes is a very effective method

Place edge nodes as close to users as possible and use them to optimize routing and links, which improves the efficiency of dynamic services. Compared with direct connections, dynamic acceleration improved P90 interface latency by 60%.

Figure: Effect of dynamic acceleration for iQIYI overseas

5. Enable a fallback mechanism

For failed requests, fall back to an alternative protocol such as QUIC or KCP. The client failure rate is about 3%; for these requests we retry over a UDP-based protocol. In our observation, the success rate increased by 45%.
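A minimal sketch of such a fallback chain, assuming each transport is wrapped in a callable that either returns a response or raises. The function and the stand-in transports are hypothetical; a real client would plug in its HTTPS stack and a QUIC or KCP library.

```python
def request_with_fallback(request, primary, fallbacks):
    """Try the primary transport first, then each fallback in order.

    Each transport is a (name, send) pair; send(request) returns a response
    or raises on failure.
    """
    failed = []
    for name, send in [primary, *fallbacks]:
        try:
            return name, send(request)
        except Exception:
            failed.append(name)
    raise ConnectionError(f"all transports failed: {failed}")


# Stand-in transports for the demo (no real network traffic).
def flaky_https(request):
    raise TimeoutError("TCP connection timed out")


def fake_quic(request):
    return f"ok:{request}"


print(request_with_fallback("GET /home", ("https", flaky_https), [("quic", fake_quic)]))
```

Returning the name of the transport that succeeded makes it easy to report how often the fallback path is actually used.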

03 Optimization of transmission content

1. Apply Brotli

Because Brotli ships with a preset dictionary, at a comparable compression level its compression ratio is at least 17% better than gzip's. The average Content-Size of our interfaces dropped from 30 KB to 18 KB.

2. The interface is changed from JSON to Google Protobuf

The main reason for using Protobuf is parsing efficiency, which is at least four to five times higher than JSON's; the gap is more obvious when the nesting is deep and the data volume is large. Be aware, though, that Protobuf's built-in varint encoding only shrinks small integers: a value below 128 fits in one byte and larger values take more. The actual size saving is modest, so if data volume in production is large, external compression such as gzip is still indispensable.
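The varint behaviour is easy to see in a small encoder for unsigned integers. This is a standalone sketch of the encoding (7 payload bits per byte, high bit set on every byte except the last), not a call into the protobuf library.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # lowest 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


for value in (1, 127, 128, 300, 1_600_000_000):
    encoded = encode_varint(value)
    print(value, encoded.hex(), f"{len(encoded)} byte(s)")
```

Small values shrink to a single byte, but something like a Unix timestamp still needs five, and string fields are stored as-is, which is why an outer compression layer still pays off.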

3. Image format upgraded to WebP

Along with the switch to WebP, we lowered the quality of poster images. In practice, a poster quality of 85% is hard to tell apart from the original with the naked eye. Compared with JPEG or PNG at the same quality, file size drops by up to 45%. The effect is obvious: homepage images in the App load visibly faster.

04 Optimization and improvement at the business level

1. Reduce unnecessary requests

Some common content, such as navigation and channels, changes only when operators actively update it. As the original figure shows, we added an interface that is requested during the startup phase and returns the timestamp of the latest content update. If that differs from the timestamp in the local cache, the client requests the update asynchronously.
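The timestamp check can be sketched as follows. The class, the `fetch_content` callable and the shape of the startup response (a mapping from content key to update timestamp) are assumptions for illustration; the refresh is written synchronously here, whereas a real client would run it in the background.

```python
class CachedContent:
    """Refetch a piece of content only when the server timestamp has changed."""

    def __init__(self, fetch_content):
        self._fetch_content = fetch_content  # hypothetical: key -> fresh content
        self._cache = {}                     # key -> (timestamp, content)

    def stale_keys(self, server_timestamps):
        """Keys whose cached timestamp is missing or differs from the server's."""
        return [key for key, ts in server_timestamps.items()
                if self._cache.get(key, (None, None))[0] != ts]

    def refresh(self, server_timestamps):
        """Fetch only the stale keys and return which ones were refetched."""
        stale = self.stale_keys(server_timestamps)
        for key in stale:
            self._cache[key] = (server_timestamps[key], self._fetch_content(key))
        return stale

    def get(self, key):
        return self._cache[key][1]


cache = CachedContent(lambda key: f"content of {key}")
print(cache.refresh({"navigation": 1700000000, "channels": 1700000500}))  # first launch
print(cache.refresh({"navigation": 1700000000, "channels": 1700009999}))  # only channels changed
```

On most launches the timestamps match and the client makes one small startup request instead of one request per content type.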

2. Differentiate user networks and apply different strategies

For videos, the default playback quality on non-WiFi networks is 360P. For posters, the backend interface provides URLs in two qualities: high quality for WiFi and low quality for 4G.
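A sketch of how a client might apply these rules. The field names `high` and `low`, the network labels and the example URLs are hypothetical; the text only states the 360P default on non-WiFi and the two poster qualities.

```python
def pick_poster_url(network, poster):
    """Use the high-quality poster on WiFi and the low-quality one otherwise."""
    return poster["high"] if network == "wifi" else poster["low"]


def default_video_quality(network, wifi_quality):
    """Non-WiFi playback starts at 360P; the WiFi default is supplied by the caller."""
    return wifi_quality if network == "wifi" else "360P"


poster = {
    "high": "https://img.example/poster_hq.webp",
    "low": "https://img.example/poster_lq.webp",
}
for network in ("wifi", "4g"):
    print(network, pick_poster_url(network, poster), default_video_quality(network, "720P"))
```

Having the backend return both URLs keeps the decision on the client, which is the only side that knows the current network type.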

3. More business optimization

Add request retries, tune HTTP timeouts, cache requests, and so on. These can be adjusted according to business needs.

03 Ending

After this series of detailed optimizations, the user experience of iQIYI's overseas version continues to improve. Key indicators such as user interface latency, client failure rate, and video playback success rate have improved greatly. This also helped iQIYI reach No. 1 in the app stores of many Southeast Asian countries. App optimization, server latency optimization, and product experience improvement have to complement one another to maximize the user experience.
