Towards advanced level: Network basics that excellent Android programmers must know

1. Introduction

Network communication has always been an important module in Android projects. Many excellent network frameworks have emerged from the Android open source community: from early utility classes that simply wrapped HttpClient and HttpURLConnection, to the more complete and feature-rich Volley open sourced by Google, to the now-popular OkHttp and Retrofit.

To understand the similarities and differences between them (or, more specifically, to gain a deeper understanding of network communication in Android development), you need a solid grasp of network fundamentals and the basic principles behind the Android network frameworks. Only then can you settle on the network communication practice best suited to your app when it matters.

As it happens, this knowledge comes up all the time in day-to-day Android development and source-code reading, and mastering these network fundamentals is one of the basic technical qualities an Android programmer needs on the road to an advanced level.

With that in mind, this article introduces some computer network fundamentals, along with how they are used in Android development, the problems you may run into, and their solutions.

This article is mainly divided into the following parts:

  • 1) Computer network architecture;
  • 2) HTTP related;
  • 3) TCP related;
  • 4) Socket.

2. About the Author

Shu Dafei: Android development engineer at Ctrip.com.

3. Computer network architecture

Computer network architecture, i.e., the layered structure of computer network systems that we often see, is worth clarifying first to avoid confusing HTTP and TCP, two protocols that do not sit at the same layer. Depending on the reference model, there are several versions of the layered structure, such as the OSI model and the TCP/IP model.

The following is an example of a 5-layer structure that is commonly seen:

As shown in the figure above, the five-layer architecture can ultimately achieve end-to-end data transmission and communication. What are they responsible for, and how do they ultimately achieve end-to-end communication?

  • Application layer: for example, the HTTP protocol. It essentially defines how data is packaged and parsed. If the application layer protocol is HTTP, the data is packaged according to that protocol, i.e., into a request line, request headers, and a request body. Once packaged, the data is handed down to the transport layer.
  • Transport layer: this layer contains the TCP and UDP protocols, corresponding to reliable and unreliable transport respectively. Because TCP provides reliable transmission, it has to solve how to establish a connection, how to guarantee reliable transmission without data loss, and how to perform flow control and congestion control. At this layer we usually work through Socket, a set of encapsulated programming interfaces through which we can operate TCP or UDP to establish connections. When we establish a connection with a Socket we usually specify a port number, so this layer is responsible for delivering data to the corresponding port.
  • Network layer: this layer includes the IP protocol and various routing protocols, so it determines the IP address to which the data is transmitted; optimal routes, routing algorithms, and so on are involved along the way.
  • Data link layer: the protocol that impressed me most here is ARP, which resolves an IP address into a MAC address, i.e., the hardware address, so that the corresponding unique machine can be found.
  • Physical layer: the lowest layer, providing binary bit-stream transmission; this is where data actually starts being transmitted over the physical medium (wired or wireless).

So, through the respective roles of these five layers (physical transmission medium, MAC address, IP address, port number, data), the data reaches its destination and is parsed according to the application layer protocol, finally achieving end-to-end network communication and data transmission.

The following will focus on things related to HTTP and TCP.

4. HTTP related

This section mainly talks about some basic knowledge about Http, as well as some practical applications in Android and the problems encountered and their solutions.

4.1 Correctly understand HTTP’s “connectionless” and “stateless”

Http is connectionless and stateless.

Connectionless does not mean that no connection is needed. HTTP is only an application layer protocol; ultimately it still relies on services provided by the transport layer, such as the TCP protocol, to establish the connection.

Connectionless means that HTTP stipulates that each connection only processes one request, and the connection is disconnected after a request is completed. This is mainly to relieve the pressure on the server and reduce the connection's occupation of server resources. My understanding is that establishing a connection is actually a matter of the transport layer. For HTTP facing the application layer, it is connectionless because the upper layer has no perception of the lower layer.

Stateless means that each request is independent and has no memory of previous request transactions. Therefore, things like cookies appear to save some states.

4.2 Request Message and Response Message

Here we will briefly talk about the basic knowledge of the format of HTTP request messages and response messages.

Request message:

Response message:
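The original figures are not reproduced here; as a rough illustration (the URL, host, and bodies below are made up), an HTTP/1.1 request and response look like this:

    Request message (request line + request headers + blank line + request body):

        POST /login HTTP/1.1
        Host: www.example.com
        Content-Type: application/x-www-form-urlencoded
        Content-Length: 23

        username=tom&password=1

    Response message (status line + response headers + blank line + response body):

        HTTP/1.1 200 OK
        Content-Type: application/json
        Content-Length: 26

        {"code":0,"msg":"success"}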

Regarding GET and POST, we all know the following differences:

  • GET concatenates the request parameters onto the URL, so they end up shown in the address bar, while POST puts the request data into the request body, where it is not shown in the address bar.
  • GET has a limit on the length of the parameters that can be passed.

Questions:

  • Regarding point 1), it is indeed inappropriate to expose private data in the browser's address bar, but what about App development, where there is no address bar? Is this still a factor in choosing POST or GET?
  • Regarding point 2), the length limit is a browser limitation and has nothing to do with GET itself. In App development, can this point be ignored?

4.3 HTTP Cache Mechanism

The reason I want to introduce the HTTP caching mechanism here is that OkHttp uses it for network request caching, rather than having the client write its own caching strategy the way frameworks such as Volley do.

Http cache is mainly controlled by two fields in the header: Cache-control and ETag, which will be introduced below.

1) Cache-Control mainly includes the following values:

  • private: only the client may cache;
  • public: both the client and proxy servers may cache;
  • max-age: the cache expiration time;
  • no-cache: a comparison cache (validation) is required before the cached data is used;
  • no-store: nothing is cached at all.

In effect, a cache policy is being set here; the server sends it to the client in the response header of the first request. Specifically:

  • max-age: the time until the cache expires. If the resource is requested again within this period, the cache can be used directly.
  • no-cache: indicates that a comparison cache (validation) is needed before the cached data is used. If this field is set, then even if the max-age cache has not expired, a request must still be made to confirm whether the resource has been updated and whether the data needs to be fetched again. How the comparison is done is where ETag comes in, described below. If the server confirms the resource has not been updated, it returns 304 and the local cache is used; if there is an update, it returns the latest resource;
  • no-store: if this field is set, nothing is cached and nothing is read from the cache.

2) ETag: It is used for comparison caching. Etag is an identification code of the server resource.

When the client sends the first request, the server returns the ETag of the currently requested resource. On the next request, the client carries this ETag in the If-None-Match header. The server compares the ETag sent by the client with the latest ETag of the resource; if they are the same, the resource has not been updated and the server returns 304.
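For instance (the header values and path here are made up for illustration), a typical comparison-cache exchange looks roughly like this:

    First response from the server:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, no-cache
        ETag: "abc123"

    A later request from the client:
        GET /api/user HTTP/1.1
        Host: www.example.com
        If-None-Match: "abc123"

    Server reply when the resource has not changed:
        HTTP/1.1 304 Not Modified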

3) Summary:

The Http cache mechanism is implemented through the cooperation of Cache-control and Etag.
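To tie this back to OkHttp: below is a minimal sketch of turning on OkHttp's HTTP-compliant cache. The directory name and the 10 MB size are arbitrary choices for illustration, not values from the article.

    import android.content.Context;
    import java.io.File;
    import okhttp3.Cache;
    import okhttp3.OkHttpClient;

    public final class HttpClientFactory {
        // Give OkHttp a cache directory and a size limit; it then honors
        // Cache-Control / ETag headers (including 304 handling) on its own.
        public static OkHttpClient create(Context context) {
            File cacheDir = new File(context.getCacheDir(), "http_cache"); // arbitrary name
            long cacheSize = 10L * 1024 * 1024;                            // 10 MB, an assumed size
            return new OkHttpClient.Builder()
                    .cache(new Cache(cacheDir, cacheSize))
                    .build();
        }
    }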

4.4 HTTP Cookies

As mentioned above, the HTTP protocol is stateless, and cookies are used to store some state on the client. A cookie generally contains attributes such as domain, path, and Expires (expiration time). The server can write state into the client's cookie via Set-Cookie in the response header, and the client carries the cookie along the next time it initiates a request.

Problems encountered and solutions in Android development:

Speaking of Cookies, if you usually only do App development, you don’t encounter them often, but if it involves WebView requirements, you may encounter them.

Let me tell you about a painful experience I had with WebView cookies in one project: the requirement was that the H5 page loaded in the WebView needed a logged-in state, so after logging in on the native page we had to manually write the login ticket into the WebView cookie, so that the H5 page loaded in the WebView could be verified by the server using the ticket in the cookie (a sketch of that write follows).
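A minimal sketch of writing the ticket into the WebView cookie store; the domain and cookie name below are placeholders, not the project's real ones:

    import android.webkit.CookieManager;

    public final class WebViewCookieHelper {
        // Write the login ticket into the WebView cookie store so pages loaded
        // for this domain carry it. "https://example.com" and "ticket" are placeholders.
        public static void writeTicket(String ticket) {
            CookieManager cookieManager = CookieManager.getInstance();
            cookieManager.setAcceptCookie(true);
            cookieManager.setCookie("https://example.com", "ticket=" + ticket + "; path=/");
            cookieManager.flush(); // persist immediately (available from API 21)
        }
    }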

But I ran into a problem: when debugging the WebView through Chrome inspect, the manually written cookies were indeed there, but when the request was made the cookies were not sent along, so the request failed verification. After investigation it turned out that the relevant WebView cookie setting is disabled by default; it can be enabled with the following code:

    CookieManager cookieManager = CookieManager.getInstance();
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
        cookieManager.setAcceptThirdPartyCookies(mWebView, true);
    } else {
        cookieManager.setAcceptCookie(true);
    }

4.5 Https

We all know that HTTPS guarantees the security of our data transmission: HTTPS = HTTP + SSL. The main principle behind its security is the use of an asymmetric encryption algorithm. The reason the commonly used symmetric encryption algorithms are not safe here is that both parties encrypt and decrypt with the same key, so as soon as either party leaks the key, others can use it to decrypt the data.

The core essence of asymmetric encryption algorithms that can achieve secure transmission is that information encrypted by a public key can only be decrypted with a private key, and information encrypted by a private key can only be decrypted by a public key.

1) Briefly describe why asymmetric encryption algorithms are secure:

The server applies for a certificate issued by a CA and obtains the public and private keys of the certificate. The private key is known only to the server, while the public key can be shared with others, such as passing the public key to the client. The client then encrypts the data it transmits using the public key sent by the server, and the server can decrypt the data using the private key. Since the data encrypted by the public key on the client can only be decrypted by the private key, and only the server has the private key, data transmission is secure.

The above is just a brief description of how asymmetric encryption algorithms ensure data security. In fact, the working process of HTTPS is much more complicated than this (due to space limitations, I will not go into details here, there are many related articles on the Internet):

One is that the client also needs to verify the legitimacy and validity of the CA certificate sent by the server, because there is a risk that the CA certificate may be swapped during the transmission process. This involves the question of how the client can verify the legitimacy of the server certificate to ensure the legitimacy of the identities of both parties in the communication;

Another problem is that although asymmetric algorithms ensure data security, their efficiency is relatively poor compared to symmetric algorithms. How to optimize them to ensure data security and improve efficiency?

2) How does the client verify the legitimacy of the certificate?

First, the CA certificate generally includes the following:

  • The issuing authority and version of the certificate;
  • The user of the certificate;
  • The certificate's public key;
  • Certificate validity period;
  • The certificate's digital signature hash value and the signature hash algorithm (the digital signature hash value is encrypted with the issuing CA's private key);
  • etc.

The client verifies the legitimacy of the certificate sent by the server roughly as follows: it first decrypts the digital signature in the certificate with the issuing CA's public key (the signature was encrypted with the CA's private key) to obtain hash value 1, then uses the signature hash algorithm named in the certificate to compute hash value 2 over the certificate contents. If the two values are equal, the certificate is legitimate and the server can be trusted.

3) Problems encountered and solutions in Android development:

By the way, during project development, I used Android WebView to load a web page on the company's test server. The certificate expired, causing the web page to fail to load and a white screen to appear.

The solution was to temporarily ignore the SSL error in the test environment so that the page could load. Of course, don't do this in production: it introduces security issues, and Google Play will reject the app during review.

The best way is to override onReceivedSslError() of WebViewClient:

    @Override
    public void onReceivedSslError(WebView view, SslErrorHandler handler, SslError error) {
        if (ContextHolder.sDebug) {
            handler.proceed();
            return;
        }
        super.onReceivedSslError(view, handler, error);
    }

4.6 HTTP 2.0

OkHttp can be configured to use the HTTP/2 protocol. HTTP/2 is a huge improvement over HTTP/1.x, mainly in the following points (an OkHttp configuration sketch follows the list).

1) Binary format: HTTP/1.x is a text protocol, while HTTP/2 is a binary protocol whose basic unit is the frame. Besides the data, each frame carries a Stream Identifier that marks which request the frame belongs to, which makes network transmission very flexible;

2) Multiplexing: this is a big improvement. The original HTTP/1.x one-connection-per-request model had serious limitations and caused many problems, such as the cost and efficiency of establishing multiple connections.

To work around the efficiency problem, HTTP/1.x clients initiate as many concurrent requests as possible to load resources. However, browsers limit the number of concurrent requests to the same domain, so the usual optimization is to spread the requested resources across different domains to get around this limit.

The multiplexing supported by HTTP/2 solves this problem well: multiple requests share one TCP connection and can run concurrently on it, which removes the cost of establishing multiple TCP connections and also solves the efficiency problem.

So what is the principle that supports multiple requests to be sent concurrently on a TCP connection? The basic principle is the binary framing mentioned above. Because each frame has an identity, different frames of multiple requests can be sent out concurrently and out of order. The server will organize each frame into the corresponding request based on its identity.

3) Header compression: This is mainly done by compressing the header to reduce the size of the request, reduce traffic consumption, and improve efficiency. This is because there was a problem before that each request had to carry a header, and the data in the header was usually unchanged.

4) Support server push.
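Back to the OkHttp configuration mentioned at the start of this section: OkHttp negotiates HTTP/2 automatically over TLS when the server supports it, and explicitly listing the protocols looks roughly like the sketch below (an optional step, not something the article prescribes).

    import java.util.Arrays;
    import okhttp3.OkHttpClient;
    import okhttp3.Protocol;

    public final class Http2ClientFactory {
        // Prefer HTTP/2 and fall back to HTTP/1.1
        // (OkHttp requires HTTP_1_1 to be present in the protocol list).
        public static OkHttpClient create() {
            return new OkHttpClient.Builder()
                    .protocols(Arrays.asList(Protocol.HTTP_2, Protocol.HTTP_1_1))
                    .build();
        }
    }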

5. TCP related

TCP is connection-oriented and provides reliable data transmission. At this layer, we usually operate TCP through Socket API to establish connections, etc.

5.1 Three-way handshake to establish a connection

  • First handshake: the client sends SYN=1 to indicate a request to establish a connection, together with a random initial sequence number seq=x;
  • Second handshake: the server replies with SYN=1, ACK=1 to indicate a reply to the connection request, sets ack=x+1 (so that the client, on receiving it, can confirm this is indeed the server it tried to connect to), and also generates its own random sequence number seq=y and sends it to the client;
  • Third handshake: the client sends ACK=1, seq=x+1, ack=y+1 (so the server knows it is the same client). The whole exchange is sketched below.
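A schematic of the exchange, where x and y stand for the two random initial sequence numbers:

    Client                                     Server
      | --- SYN=1, seq=x --------------------->  |   (1st handshake)
      | <-- SYN=1, ACK=1, seq=y, ack=x+1 ------  |   (2nd handshake)
      | --- ACK=1, seq=x+1, ack=y+1 ---------->  |   (3rd handshake)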

Why is a three-way handshake required to establish a connection?

First of all, it is clear that two handshakes are the bare minimum. In the first handshake, the client sends a connection request to the server; after the server receives it, the server knows it can connect with the client, but at this point the client does not know whether the server has received the message, so the server must reply. Only after the client receives the server's reply can it be sure that it can connect with the server; this is the second handshake.

The client can only start sending data after it is sure it can connect to the server, so two handshakes are definitely the minimum.

So why is a third handshake needed? Suppose there were no third handshake and we considered the connection established after two handshakes; what would happen?

The third handshake exists to prevent an invalid (stale) connection request segment from suddenly arriving at the server and causing errors.

The specific situation is:

A connection request sent earlier by the client is delayed at some network node for some reason and only reaches the server at some later point, after that connection has already been released. It is a long-invalid message, but the server still believes it is the first handshake of a new connection request from the client, so the server replies to the client, performing the second handshake.

If there were only two handshakes, the connection would now be considered established, but the client has no data to send, and the server would simply wait, wasting a great deal of resources. Hence the third handshake: only if the client responds again is the connection actually established, which avoids this situation.

5.2 Wave four times to disconnect

After the analysis of the connection diagram above, this diagram should not be difficult to understand.

There is a main question here: why is there one more wave than when establishing a connection?

Notice that the server's ACK (reply to the client) and FIN (termination) are not sent at the same time: the ACK comes first and the FIN later. This is easy to understand: when the client asks to disconnect, the server may still have unsent data, so it ACKs first, finishes sending that data, and only then sends FIN. That is how it becomes four waves; a simplified schematic follows.
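A simplified schematic of the four waves, where u and v stand for the sequence numbers in use at that point:

    Client                                     Server
      | --- FIN=1, seq=u --------------------->  |   (1st wave: client asks to close)
      | <-- ACK=1, ack=u+1 --------------------  |   (2nd wave: server acknowledges, may still send data)
      | <-- FIN=1, ACK=1, seq=v, ack=u+1 ------  |   (3rd wave: server has finished sending)
      | --- ACK=1, ack=v+1 ------------------->  |   (4th wave: client acknowledges)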

The above describes the process of TCP establishing and disconnecting connections. The most important feature of TCP is to provide reliable transmission. So how does it ensure that data transmission is reliable? This is the sliding window protocol to be discussed below.

5.3 Sliding Window Protocol

The sliding window protocol is the basis of TCP's reliable transmission: the send window slides forward to send further frames only after acknowledgement frames are received.

Here is an example. Suppose the send window is 3 frames:

Initially, the send window covers the first 3 frames [1,2,3]. These 3 frames can be sent, while later frames cannot be sent yet. When, say, frame [1] has been sent and its acknowledgement arrives from the receiver, the send window slides forward by 1 frame to [2,3,4]. Again, only frames inside the send window can be sent, and so on.

After receiving a frame, the receiver puts it into the corresponding position in the receive window and then moves the receive window forward. The receive window also has a size, just like the send window; for example, if the receive window is 5 frames, frames that fall outside it are discarded.

Different settings of the send window and receive window size lead to different protocols:

Stop-and-wait protocol: each frame must be acknowledged before the next frame is sent. Disadvantage: poor efficiency.

Go-Back-N protocol: uses cumulative acknowledgement. After the receiver correctly receives N frames, it sends a cumulative acknowledgement to the sender confirming that those N frames have been received correctly. If the sender does not receive an acknowledgement within the specified time, it assumes a timeout or data loss and resends all frames after the last acknowledged one. Disadvantage: frames after the erroneous one have already been sent, yet they still have to be resent, which is wasteful.

Selective repeat protocol: if an error occurs, only the erroneous PDU is retransmitted, which improves transmission efficiency and avoids unnecessary retransmissions.

There is one last problem left here: Since there is a mismatch between the sending efficiency and the receiving efficiency between the sending window and the receiving window, congestion will occur. To solve this problem, TCP has a set of flow control and congestion control mechanisms.

5.4 Flow Control and Congestion Control

1) Flow control:

Flow control is the control of the flow on a communication path. The sender dynamically adjusts the sending rate by obtaining feedback from the receiver to achieve the effect of controlling the flow. Its purpose is to ensure that the sender's sending speed does not exceed the receiver's receiving speed.

2) Congestion Control:

Congestion control is the control of the traffic of the entire communication subnet and is a global control.

① Slow start + congestion avoidance

Let’s take a look at a classic picture first:

At the beginning, slow start is used: the congestion window is set to 1 and grows exponentially until it reaches the slow start threshold (ssthresh = 16), after which it switches to congestion avoidance, i.e., additive growth. When it grows to the point where network congestion occurs, the congestion window drops back to 1 (slow start again) and the new slow start threshold is adjusted down to 12, and so on. A worked progression is sketched below.
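Using the numbers from the classic diagram (ssthresh = 16 and congestion occurring when the window reaches 24), the congestion window evolves roughly as follows:

    slow start:            1 -> 2 -> 4 -> 8 -> 16          (doubles each round trip until ssthresh = 16)
    congestion avoidance:  16 -> 17 -> 18 -> ... -> 24     (+1 per round trip)
    congestion occurs:     new ssthresh = 24 / 2 = 12, congestion window reset to 1
    slow start again:      1 -> 2 -> 4 -> 8 -> 12, then additive growth from 12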

② Fast retransmission + fast recovery

Fast retransmit: the retransmission mechanism mentioned above does not start retransmitting until the timeout expires without a reply from the receiver. The idea behind fast retransmit is that if the sender receives three duplicate ACKs from the receiver, it can conclude that a segment has been lost and retransmit it immediately, without waiting for the configured timeout to expire, which improves retransmission efficiency.

Fast recovery: the congestion control described above reduces the congestion window to 1 when the network is congested and restarts slow start; the problem is that the network then takes a long time to return to full speed. Fast recovery optimizes this: when congestion occurs, the congestion window is only reduced to the new slow start threshold (i.e., 12 in the example) instead of to 1, and congestion avoidance (additive growth) then begins directly, as shown in the following figure:

Fast retransmission and fast recovery are further improvements to congestion control.

6. About Socket

Socket is a set of APIs for operating TCP/UDP. Low-level HTTP stacks such as HttpURLConnection and OkHttp ultimately use Socket to send their network requests, while Volley and Retrofit are higher-level encapsulations that in the end rely on HttpURLConnection or OkHttp to establish the connection and send the request.

You should at least know how to use Socket at a simple level: create a Socket on each end (on the server side it is a ServerSocket) and then establish the connection, as in the sketch below.
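A minimal sketch of plain Java Socket usage; the port number (9999) and the message text are arbitrary choices for illustration:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    public final class SocketDemo {
        // Server side: accept one connection, read a line, echo it back.
        static void runServer() throws Exception {
            try (ServerSocket server = new ServerSocket(9999);
                 Socket client = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String line = in.readLine();
                out.println("echo: " + line);
            }
        }

        // Client side: connect to the server, send a line, print the reply.
        static void runClient() throws Exception {
            try (Socket socket = new Socket("127.0.0.1", 9999);
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
                out.println("hello");
                System.out.println(in.readLine());
            }
        }
    }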

7. Conclusion

Of course, the above is only the computer network fundamentals that I know and consider important; there is still a lot of network knowledge that needs to be understood and explored in depth. This has turned into a long piece, and it is really a summary of my own network fundamentals. There may be omissions; I am just throwing out some ideas and hope you will point out anything I got wrong.
