Do you know how to learn the TCP protocol?

TCP is currently the de facto foundation of the Internet. Many higher-level application protocols such as HTTP and FTP are based on TCP.

Learning the TCP protocol can feel extremely dry, especially for students who do not yet know where it is used or why it matters. In practice, given how networked and distributed systems have developed, TCP is the foundation beneath the foundation. Many network problems, configurations, intrusions, defenses, and even architectures depend on the specific mechanisms and applications of TCP.

The following is my summary of the TCP learning process:

1. Understand the importance and necessity of learning the TCP protocol, and understand why the TCP protocol was developed

2. Learn TCP's three-way handshake and four-way wave (connection teardown). Focus on why the handshake takes three steps and the teardown takes four, and on how the connection state changes throughout. (See the classic state diagram and the handshake and wave sequence diagrams.)

a. Why three handshakes, and not one, two, or four? Consider a single handshake first. A sends a connection request to B. If B never receives it, B has no idea that A sent anything, and A has no idea whether B received it. A single handshake is therefore unreliable. What about two? A sends a connection request, and B receives it and replies. Once A receives the reply, A knows that B is ready, but B cannot tell whether A is ready: A may not have received B's reply, or A may have received it, and B cannot distinguish the two cases. The result is only a one-way confirmation. What about four? The second handshake already tells A that B is ready, and the third tells B that A is ready, so a fourth would be redundant and would only waste network resources.

b. Why four waves, and not three? The two directions of the connection can be considered separately: two waves (a FIN and its ACK) close one direction, another two close the other, and only then is the whole connection closed. The design exists because one side may still have data to send after the other side has finished. TCP is a full-duplex protocol, so each side sends data in its own direction independently and must close its own direction.

1. Handshake and waving process

2. TCP state transition diagram
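The handshake and wave sequences can be traced as transitions in a small lookup table. The following is a simplified sketch: the state names follow the classic TCP state diagram, but the event names (such as recv_syn_ack) and the run helper are hypothetical labels chosen for illustration, and many transitions (simultaneous open, RST, most timeouts) are left out.

```python
# Simplified subset of the TCP state diagram.
# Assumption: event names are illustrative labels, not a real API.
TRANSITIONS = {
    # Three-way handshake, client side
    ("CLOSED", "send_syn"): "SYN_SENT",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",  # client then sends the final ACK
    # Three-way handshake, server side
    ("CLOSED", "listen"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",           # server replies with SYN+ACK
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    # Four-way wave, side that closes first
    ("ESTABLISHED", "send_fin"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # this side sends the last ACK
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Four-way wave, side that closes second
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # replies ACK, may still send data
    ("CLOSE_WAIT", "send_fin"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(start, events):
    """Apply events in order and return the list of states visited."""
    state = start
    path = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state}")
        state = TRANSITIONS[key]
        path.append(state)
    return path


client = run("CLOSED", ["send_syn", "recv_syn_ack", "send_fin",
                        "recv_ack", "recv_fin", "timeout_2msl"])
server = run("CLOSED", ["listen", "recv_syn", "recv_ack",
                        "recv_fin", "send_fin", "recv_ack"])
print(" -> ".join(client))
print(" -> ".join(server))
```

Reading the two printed paths side by side shows the asymmetry discussed above: the side that closes first passes through FIN_WAIT and TIME_WAIT, while the other side sits in CLOSE_WAIT until it has finished sending.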

3. Learn how the TCP protocol is designed to maintain reliability.

The main purpose here is to have a reference for architecture and design in other communication scenarios.

1) Sequence numbers, acknowledgments, and packet reassembly.

Problems faced: during network transmission, data can be corrupted, lost, duplicated, or delivered out of order.

Essentially, to ensure the reliability of transmission, the content of the transmission needs to be verified.

a. For data corrupted in transit (for example, cosmic rays striking a rocket and flipping a bit from 0 to 1), the strategy is to discard the damaged segment and have it resent, so that no fatal error gets through. TCP carries its own checksum field for this verification. The algorithm maps the whole block of data to a 16-bit checksum through a fixed function (TCP uses the ones' complement sum of the data taken as 16-bit words).
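The checksum idea can be sketched in a few lines. This is a minimal illustration of the 16-bit ones' complement sum used by TCP, applied to a bare payload only; a real TCP checksum also covers the TCP header and a pseudo-header built from IP fields, which are omitted here. The function name internet_checksum and the sample payload are assumptions made for the example.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = b"hello tcp"                       # hypothetical payload
checksum = internet_checksum(payload)

# Flip a single bit, as a cosmic ray might, and the checksum no longer matches.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
print(internet_checksum(payload) == checksum)    # receiver accepts
print(internet_checksum(corrupted) == checksum)  # receiver discards
```

A 16-bit sum cannot catch every possible corruption (two errors can cancel out), which is one reason higher layers often add their own integrity checks.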

b. If the data arrives intact but segments are out of order, duplicated, or missing, the strategy is not to discard but to reassemble.

Consider two situations. In the first, a packet is lost, leaving a 1000-byte hole in the middle of the data: how does the receiver tell the sender which range is missing? In the second, the network or the retransmission mechanism delivers the same packet several times: how does the receiver drop the redundant copies and keep each byte only once?

TCP was designed with exactly this in mind, and uses sequence numbers (SEQ) and acknowledgment numbers (ACK) to handle it. The sequence number gives the position of the segment's first byte in the byte stream, and the ACK number is the number of the last in-order byte received plus 1, that is, the next byte the receiver expects. Both sides can therefore keep an accurate record of which bytes have been sent and received, and each knows which bytes the other side still needs and which it already has.
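The two situations above (a hole in the data, and a duplicated packet) can be played through with a toy receiver. This is a sketch under simplifying assumptions: the Receiver class and its method names are hypothetical, retransmitted segments are assumed to start on the same byte boundaries as the originals, and sequence number wraparound is ignored.

```python
class Receiver:
    """Toy receiver that reassembles a byte stream and returns cumulative ACKs."""

    def __init__(self, initial_seq):
        self.next_expected = initial_seq  # next in-order byte we still need
        self.out_of_order = {}            # seq -> payload held until the gap fills
        self.delivered = bytearray()      # bytes handed to the application

    def on_segment(self, seq, payload):
        """Accept one segment and return the ACK number to send back."""
        end = seq + len(payload)
        if end <= self.next_expected:
            return self.next_expected     # pure duplicate: nothing new, just re-ACK
        if seq < self.next_expected:      # partial overlap: drop bytes we already have
            payload = payload[self.next_expected - seq:]
            seq = self.next_expected
        self.out_of_order.setdefault(seq, payload)  # keep first copy only
        # Deliver every segment that is now contiguous with the stream.
        while self.next_expected in self.out_of_order:
            chunk = self.out_of_order.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected


rx = Receiver(initial_seq=1000)
segments = [
    (1000, b"AAAA"),  # in order
    (1008, b"CCCC"),  # arrives early: bytes 1004..1007 are missing
    (1008, b"CCCC"),  # duplicate of the early segment
    (1004, b"BBBB"),  # fills the hole
    (1000, b"AAAA"),  # late duplicate of the first segment
]
acks = [rx.on_segment(seq, data) for seq, data in segments]
print(acks)
print(bytes(rx.delivered))
```

While the hole exists, every ACK repeats 1004, which tells the sender exactly where the missing range starts; once the hole is filled, the ACK jumps to 1012 and the duplicates are ignored.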

2) Retransmission mechanism

a. Timeout retransmission

To make sure data arrives, timeouts must be handled properly. If no acknowledgment arrives within the timeout period, the safest response is to resend.

First, the data is copied into the send buffer, and a timer is started when each packet is sent. If an acknowledgment arrives before the timer expires, the transmission succeeded and the data is removed from the buffer. Otherwise the packet is retransmitted, until a maximum number of attempts is reached.

TCP measures the round-trip time (RTT) and its variation as segments are acknowledged. These measurements give a rough picture of network conditions and are used to set the timeout period. Typically the timeout starts out long (for example 6 s) and may then shrink to a shorter value such as 0.5 s once RTT samples arrive.
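One way to turn RTT samples into a timeout is the smoothed estimator standardized in RFC 6298. The sketch below follows that scheme's formulas as I understand them (smoothing factors 1/8 and 1/4, timeout equal to the smoothed RTT plus four times the variation, doubling on timeout); the class name RtoEstimator and the starting and floor values used in the demo are assumptions for illustration, not values taken from the article.

```python
class RtoEstimator:
    """Retransmission timeout estimator in the style of RFC 6298."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation
    K = 4           # how many variations to add on top of the smoothed RTT

    def __init__(self, initial_rto=1.0, min_rto=1.0, max_rto=60.0):
        self.srtt = None      # smoothed round-trip time, seconds
        self.rttvar = None    # round-trip time variation, seconds
        self.rto = initial_rto
        self.min_rto = min_rto
        self.max_rto = max_rto

    def on_rtt_sample(self, rtt):
        """Update the estimate with one measured RTT and return the new RTO."""
        if self.srtt is None:                 # first measurement
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + self.K * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self):
        """Back off: double the RTO after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


# Assumed demo values: a 3 s starting timeout and a 0.2 s floor.
est = RtoEstimator(initial_rto=3.0, min_rto=0.2)
for sample in [0.1, 0.1, 0.1, 0.1]:
    print(round(est.on_rtt_sample(sample), 4))
print(round(est.on_timeout(), 4))
```

With a steady 100 ms round trip the timeout drops from the conservative starting value to a few hundred milliseconds, which matches the "long at first, shorter later" behavior described above.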

b. Fast retransmit

TCP also has a smarter way to resend than waiting for a timeout, called fast retransmit. The receiver always acknowledges up to the first missing byte, so every segment that arrives after a gap triggers another ACK with the same number. When the sender sees three duplicate ACKs in a row, it treats the segment as lost and resends it immediately, without waiting for the retransmission timer to expire.
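The duplicate-ACK rule can be sketched as a small counter on the sender side. The class name FastRetransmitDetector is hypothetical, and the sketch only detects the loss; it does not model the congestion window changes that accompany fast retransmit in real stacks.

```python
DUP_ACK_THRESHOLD = 3  # duplicates needed before resending


class FastRetransmitDetector:
    """Counts duplicate ACKs and signals when a segment should be resent."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the sequence number to retransmit, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                return ack        # the receiver is still waiting for this byte
        else:
            self.last_ack = ack   # new data acknowledged: reset the counter
            self.dup_count = 0
        return None


detector = FastRetransmitDetector()
incoming_acks = [1000, 2000, 2000, 2000, 2000, 5000]
decisions = [detector.on_ack(a) for a in incoming_acks]
print(decisions)
```

Note that three duplicates means four identical ACKs in total: the original ACK for 2000 plus three repeats. The fifth entry in the printed list is the point at which the sender resends the segment starting at byte 2000.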

3) Flow control (sliding window)

a. Sliding Window Protocol

The sliding window essentially synchronizes the rates of the sender and receiver during communication. The sender's send window and the receiver's receive window together keep the transmission reliable and coordinate its speed.

For the sender, the byte stream is divided into four sections: (1) data that has been sent and acknowledged; (2) data that has been sent but not yet acknowledged; (3) data that has not been sent but may be sent now, because the receiver has space for it; (4) data that has not been sent and cannot be sent yet, because the receiver has no space for it.

Similarly, for the receiver, the byte stream is divided into three sections: (1) data that has been received and acknowledged; (2) data that has not yet been received but that the receiver is ready to accept; (3) data that the receiver is not ready to accept.

Sliding means advancing the window from the position of the last contiguous ACK to the position of the next contiguous ACK. Note the word "contiguous": data beyond a gap does not count as received, so the window cannot slide past it.
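The four sender sections can be computed from three numbers. In this sketch the parameter names snd_una (first unacknowledged byte) and snd_nxt (next byte to send) borrow common TCP terminology, while the function sender_regions, the byte offsets, and the window size are made up for illustration. It assumes snd_nxt never runs past the end of the window.

```python
def sender_regions(total_bytes, snd_una, snd_nxt, window):
    """Split byte offsets [0, total_bytes) into the four sender sections.

    snd_una: first byte sent but not yet acknowledged
    snd_nxt: first byte not yet sent
    window:  bytes the receiver says it can accept, counted from snd_una
    Each section is a half-open range (start, end).
    """
    window_end = min(snd_una + window, total_bytes)
    return {
        "sent_and_acked": (0, snd_una),
        "sent_not_acked": (snd_una, snd_nxt),
        "can_send_now": (snd_nxt, window_end),
        "must_wait": (window_end, total_bytes),
    }


before = sender_regions(total_bytes=100, snd_una=20, snd_nxt=35, window=30)
# An ACK for byte 35 arrives: the left edge moves, and the window slides right.
after = sender_regions(total_bytes=100, snd_una=35, snd_nxt=35, window=30)
print(before)
print(after)
```

Comparing the two results shows the slide: acknowledging bytes 20 to 34 moves the right edge of the window from 50 to 65, so fifteen more bytes become sendable.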

b. Contraction and expansion of sliding window

The most powerful thing about the sliding window is that it dynamically adjusts the window size of the sender and receiver to synchronize the communication between the sender and receiver rather than just managing the sending and receiving bytes.

In the TCP header, a 16-bit field stores the window size. The receiving host uses it to tell the sending host how much data it can accept, and the sender never sends more than this limit. The receiving host can raise or lower the value according to its processing capacity, and the sending host simply follows.

When the window shrinks to its minimum (a zero window), the protocol forbids the sender from sending more data to the receiver. Does that mean a deadlock in which both sides wait forever? No: the sender retries after a while with a small window probe. If the window is still closed, it waits longer before the next probe, and so on until the maximum number of retries is reached.
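The retry-with-longer-waits behavior can be sketched as a loop with a growing delay. Everything here is a simplified assumption: the function name wait_for_window, the starting delay, the doubling factor, and the cap are illustrative values, and the list of advertised windows stands in for the receiver's replies to each probe.

```python
def wait_for_window(advertised_windows, initial_delay=1.0, factor=2.0, max_delay=60.0):
    """Probe a closed window with growing delays.

    advertised_windows: the window size reported by the receiver, first in
    the segment that closed the window and then in reply to each probe.
    Returns (seconds_waited, window); window is 0 if it never reopened.
    """
    elapsed = 0.0
    delay = initial_delay
    for window in advertised_windows:
        if window > 0:
            return elapsed, window        # window reopened: sending may resume
        elapsed += delay                  # wait, then send the next probe
        delay = min(max_delay, delay * factor)
    return elapsed, 0                     # gave up after the last probe


# The receiver reports a zero window three times, then frees 500 bytes.
waited, window = wait_for_window([0, 0, 0, 500])
print(waited, window)
```

The waits in this run are 1 s, 2 s, and 4 s, so the sender resumes after 7 s with a 500-byte window instead of stalling forever.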

4) Congestion Control

Congestion control exists because the mechanisms above, however thorough, only let the two endpoints verify data and stay in sync with each other. They ignore one key factor: the state of the network. The network is a shared channel. If the channel is congested, the sender should send less; if it has spare capacity, the sender can send more.

TCP congestion control includes: slow start, congestion avoidance, congestion occurrence, and fast recovery.

Slow start sets the congestion window to 1 segment when sending begins, and then increases it by 1 segment for each acknowledgment received.

Because this roughly doubles the window every round trip (exponential growth), a slow start threshold is introduced. When a TCP connection starts, throughput rises sharply; once the window reaches the threshold, the growth rate drops and the window rises slowly (roughly linearly) instead.
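The interaction between slow start and the threshold can be seen in a round-by-round simulation. This is a coarse sketch, not a model of any real TCP stack: the function name simulate_cwnd is hypothetical, the window is counted in whole segments per round trip, and a loss is handled the simplest way (halve the threshold and restart from 1 segment), which ignores fast recovery.

```python
def simulate_cwnd(rounds, ssthresh=16, loss_rounds=()):
    """Return the congestion window (in segments) at the start of each round trip."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Congestion detected: remember half the window as the new threshold
            # and restart slow start (simplified timeout-style reaction).
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: +1 segment per ACK doubles the window each round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: about +1 segment per round trip.
            cwnd += 1
    return history


history = simulate_cwnd(rounds=14, ssthresh=16, loss_rounds={7})
print(history)
```

The printed list shows three phases: doubling up to the threshold of 16, slow linear growth after it, and a restart from 1 after the simulated loss in round 7, this time with the lower threshold of 9.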

4. Learn the common attacks and abnormal situations in TCP

  • SYN attack (SYN flood). This attacks connection establishment. The attacker sends a SYN, and the victim replies with a SYN-ACK. Normally the initiator would acknowledge that reply so the victim can enter the ESTABLISHED state. The attacker never answers, so the victim keeps the entry in its half-open connection queue while it retries until timeout. A large number of such requests in a short time makes the victim's half-open queue grow continuously, which slows system response, congests the network, and can even crash the system.
  • RST attack. An RST (reset) is normally sent when either side considers a connection abnormal: it clears that connection's buffers and forces the other side to close. An RST attack abuses this to break an existing connection. For example, while A and B are connected, C pretends to be A and sends B a segment with the RST bit set. B then discards all state for its connection with A, and the next segment from A is no longer recognized. Likewise, if C pretends to be A and sends B a SYN, B itself will respond with an RST. This kind of attack is mainly used to knock out important connections so the attacker can exploit the disruption.
