A Brief Analysis of High-Performance IO Models

Server-side programming often requires building high-performance IO models. There are four common IO models:

(1) Synchronous blocking IO: the traditional IO model.

(2) Synchronous non-blocking IO: all sockets are blocking by default; non-blocking IO requires setting the socket to NONBLOCK. Note that the NIO discussed here is not the Java NIO (New IO) library.

(3) IO multiplexing: the classic Reactor design pattern, sometimes also called asynchronous blocking IO. Selector in Java and epoll in Linux both belong to this model.

(4) Asynchronous IO: the classic Proactor design pattern, also called asynchronous non-blocking IO.

Synchronous and asynchronous describe how the user thread interacts with the kernel: synchronous means that after initiating an IO request, the user thread must wait for, or poll until, the kernel IO operation completes before continuing; asynchronous means the user thread continues executing after initiating the IO request, and when the kernel IO operation completes, the kernel notifies the user thread or invokes the callback the user thread registered.

Blocking and non-blocking describe how the user thread invokes kernel IO operations: blocking means the IO operation must fully complete before the call returns to user space; non-blocking means the call returns a status value immediately, without waiting for the IO operation to finish.

In addition, Richard Stevens describes a fifth model, signal-driven IO, in Volume 1 of Unix Network Programming; since it is rarely used, this article does not cover it. Next, we analyze the implementation principles of the four common IO models in detail, using the IO read operation as the running example.

1. Synchronous blocking IO

The synchronous blocking IO model is the simplest IO model: the user thread is blocked while the kernel performs the IO operation. As shown in Figure 1, the user thread initiates an IO read through the read system call, transferring control from user space to kernel space. The kernel waits until a data packet arrives, then copies the received data into user space, completing the read. The pseudocode for a user thread using this model is:

    {
        read(socket, buffer);
        process(buffer);
    }

That is, the user must wait for read to copy the data from the socket into the buffer before it can go on to process the received data. Throughout the IO request, the user thread is blocked: it can do nothing else after initiating the request, so CPU utilization is poor.

2. Synchronous non-blocking IO

Synchronous non-blocking IO builds on synchronous blocking IO by setting the socket to NONBLOCK, so the user thread returns immediately after initiating an IO request. As shown in Figure 2, because the socket is non-blocking, the call returns at once; but no data has been read, so the user thread must keep issuing IO requests until the data arrives, at which point it actually reads the data and continues execution. The pseudocode for a user thread using this model is:

    {
        while (read(socket, buffer) != SUCCESS)
            ;
        process(buffer);
    }

That is, the user must repeatedly call read to try to fetch the data from the socket until the read succeeds, and only then process the received data. Although each individual IO request now returns immediately, the thread must continuously poll and retry while waiting for data, which consumes a great deal of CPU.
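The two models above can be contrasted in a minimal, runnable Python sketch (not part of the original article). A `socketpair` stands in for a network connection, and a timer thread plays the role of the remote peer; the function names `blocking_read` and `nonblocking_read` are purely illustrative.

```python
import socket
import threading

def blocking_read():
    """Model 1: recv blocks until the peer's data arrives."""
    server, client = socket.socketpair()
    threading.Timer(0.05, lambda: server.send(b"hello")).start()
    data = client.recv(1024)          # blocks here until the peer writes
    server.close(); client.close()
    return data

def nonblocking_read():
    """Model 2: recv returns immediately; the caller must poll."""
    server, client = socket.socketpair()
    client.setblocking(False)         # the NONBLOCK setting from the text
    threading.Timer(0.05, lambda: server.send(b"hello")).start()
    attempts = 0
    while True:
        try:
            data = client.recv(1024)  # raises if no data is ready yet
            break
        except BlockingIOError:
            attempts += 1             # busy-wait: this is what burns CPU
    server.close(); client.close()
    return data, attempts

print(blocking_read())                # b'hello'
data, attempts = nonblocking_read()
print(data, attempts > 0)             # b'hello' True
```

The `attempts` counter makes the cost of model 2 visible: thousands of wasted `recv` calls accumulate during the 50 ms wait, which is exactly the polling overhead the article describes.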
Generally, this model is rarely used directly; rather, the non-blocking IO feature serves as a building block inside other IO models.

3. IO multiplexing

The IO multiplexing model is built on the multiplexing function select provided by the kernel. Using select avoids the polling problem of the synchronous non-blocking IO model. As shown in Figure 3, the user first adds the sockets that need IO to select, then blocks waiting for the select system call to return. When data arrives, a socket is activated and select returns; the user thread then formally issues the read request, reads the data, and continues execution.

Viewed purely as a flow, an IO request made through select differs little from the synchronous blocking model; it even adds extra steps (registering the sockets to monitor, calling select), so for a single connection it is less efficient. The biggest advantage of select, however, is that one thread can service IO requests on multiple sockets at once: the user registers several sockets and repeatedly calls select, reading whichever sockets become active, thereby handling multiple IO requests in the same thread. Under the synchronous blocking model, that goal can only be achieved with multiple threads. The pseudocode for a user thread using select is:

    {
        select(socket);           // add the socket to the watch set
        while (1) {
            sockets = select();   // block until some socket is active
            for (socket in sockets) {
                if (can_read(socket)) {
                    read(socket, buffer);
                    process(buffer);
                }
            }
        }
    }

Before the while loop, the socket is added to select's watch set; inside the loop, select is called repeatedly to obtain the activated sockets, and once a socket is readable, read is called to fetch its data. The advantages of select, however, do not stop there.
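The select-based event loop above can be sketched in runnable Python (not from the original article; the helper `select_loop` and the message contents are illustrative, and `socketpair` again stands in for real connections). One thread waits in `select` and reads only sockets that are guaranteed to be ready:

```python
import select
import socket
import threading

def select_loop(n_sockets=3):
    """Model 3: one thread multiplexes several sockets with select."""
    pairs = [socket.socketpair() for _ in range(n_sockets)]

    # Peers write at slightly different times, as real clients would.
    for i, (writer, _reader) in enumerate(pairs):
        threading.Timer(0.02 * (i + 1), writer.send, args=(b"msg%d" % i,)).start()

    received = []
    pending = {reader for _writer, reader in pairs}
    while pending:
        # Block until at least one registered socket is readable.
        readable, _, _ = select.select(list(pending), [], [])
        for sock in readable:
            received.append(sock.recv(1024))  # guaranteed not to block now
            pending.discard(sock)
    for writer, reader in pairs:
        writer.close(); reader.close()
    return sorted(received)

print(select_loop())   # [b'msg0', b'msg1', b'msg2']
```

Note that all three messages are collected by a single thread, with no polling loop: the thread sleeps inside `select` whenever nothing is ready.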
Although the approach above lets a single thread handle multiple IO requests, each individual request is still essentially blocking (blocked in the select call), and its average latency is even longer than under the synchronous blocking IO model. CPU utilization can be improved if the user thread merely registers the sockets or IO requests it cares about, goes off to do its own work, and handles the data only once it has arrived. The IO multiplexing model uses the Reactor design pattern to implement this mechanism.

As shown in Figure 4, the abstract class EventHandler represents an IO event handler: it holds an IO file handle, Handle (obtained via get_handle), and an operation on that handle, handle_event (read, write, etc.). Subclasses of EventHandler customize the handler's behavior. The Reactor class manages EventHandlers (register, remove, etc.) and runs the event loop in handle_events, repeatedly calling the select function of the synchronous event demultiplexer (usually the kernel). select blocks until some file handle is activated (becomes readable, writable, etc.) and then returns, at which point handle_events invokes the handle_event of the handler associated with that file handle.

As shown in Figure 5, the Reactor approach moves the work of polling IO status out of the user thread and into the unified handle_events event loop. After registering its event handler, the user thread can carry on with other work (asynchronously), while the Reactor thread calls the kernel's select to check socket status. When a socket is activated, the corresponding user thread is notified (or its registered callback is invoked) to run handle_event, which reads and processes the data.
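The EventHandler/Reactor structure described above can be sketched in runnable Python (not from the original article; the subclass `EchoCollector` and the demo wiring are illustrative, with `socketpair` standing in for network connections):

```python
import select
import socket
import threading

class EventHandler:
    """Abstract handler: a file handle plus a reaction to events on it."""
    def get_handle(self):
        raise NotImplementedError
    def handle_event(self):
        raise NotImplementedError

class Reactor:
    """Manages EventHandlers and runs the handle_events event loop."""
    def __init__(self):
        self._handlers = {}
    def register(self, handler):
        self._handlers[handler.get_handle()] = handler
    def remove(self, handler):
        self._handlers.pop(handler.get_handle(), None)
    def handle_events(self):
        while self._handlers:
            # Synchronous event demultiplexer: the thread blocks in select.
            readable, _, _ = select.select(list(self._handlers), [], [])
            for handle in readable:
                self._handlers[handle].handle_event()

class EchoCollector(EventHandler):
    """Illustrative subclass: reads once, records the data, deregisters."""
    def __init__(self, sock, reactor, out):
        self.sock, self.reactor, self.out = sock, reactor, out
    def get_handle(self):
        return self.sock
    def handle_event(self):
        self.out.append(self.sock.recv(1024))
        self.reactor.remove(self)

def demo_reactor():
    reactor, out = Reactor(), []
    pairs = [socket.socketpair() for _ in range(2)]
    for i, (writer, reader) in enumerate(pairs):
        reactor.register(EchoCollector(reader, reactor, out))
        threading.Timer(0.02, writer.send, args=(b"data%d" % i,)).start()
    reactor.handle_events()   # loop exits once every handler has fired
    for writer, reader in pairs:
        writer.close(); reader.close()
    return sorted(out)

print(demo_reactor())   # [b'data0', b'data1']
```

In this sketch the loop exits when no handlers remain; a real Reactor would run indefinitely, with handlers registering and deregistering as connections come and go.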
Because the select call itself blocks, the multiplexed IO model is also called the asynchronous blocking IO model. Note that the blocking here refers to the thread blocking inside select, not on the socket. When using IO multiplexing, sockets are generally set to NONBLOCK, but this has no practical effect: by the time the user thread issues the actual IO request, the data has already arrived, so the thread will not block on the read. The pseudocode for a user thread using the IO multiplexing model is:

    void UserEventHandler::handle_event() {
        if (can_read(socket)) {
            read(socket, buffer);
            process(buffer);
        }
    }

    {
        Reactor.register(new UserEventHandler(socket));
    }

The user overrides EventHandler's handle_event function to read and process the data; the user thread then only needs to register its EventHandler with the Reactor. The pseudocode of the Reactor's handle_events event loop is:

    Reactor::handle_events() {
        while (1) {
            sockets = select();
            for (socket in sockets) {
                get_event_handler(socket).handle_event();
            }
        }
    }

The event loop repeatedly calls select to obtain the activated sockets, then invokes handle_event on the EventHandler corresponding to each one.

IO multiplexing is the most widely used IO model, but its asynchrony is not "thorough": it still relies on the select system call, which blocks the thread. Hence IO multiplexing can only be called asynchronous blocking IO, not true asynchronous IO.

4. Asynchronous IO

"True" asynchronous IO requires stronger support from the operating system. In the IO multiplexing model, the event loop merely notifies the user thread of a status event on a file handle, and the user thread reads and processes the data itself. In the asynchronous IO model, by the time the user thread receives the notification, the kernel has already read the data and placed it in the buffer the user thread specified.
After the IO completes, the kernel simply notifies the user thread, and the data can be used directly. The asynchronous IO model uses the Proactor design pattern to implement this mechanism.

As shown in Figure 6, the Proactor and Reactor patterns are similar in structure but differ markedly in how the user (Client) employs them. In the Reactor pattern, the user thread registers an event handler for the events it is interested in with the Reactor object, which invokes the handler's processing function when the event fires. In the Proactor pattern, the user thread registers an AsynchronousOperation (read, write, etc.), a Proactor, and a CompletionHandler (to run when the operation completes) with the AsynchronousOperationProcessor. The AsynchronousOperationProcessor uses the Facade pattern to expose a set of asynchronous operation APIs (read, write, etc.) for users to call. When the user thread calls one of these asynchronous APIs, it continues executing its own task, while the AsynchronousOperationProcessor starts an independent kernel thread to perform the operation, achieving true asynchrony. When the asynchronous IO operation completes, the AsynchronousOperationProcessor retrieves the Proactor and CompletionHandler that the user thread registered along with the AsynchronousOperation, and forwards the CompletionHandler together with the IO result data to the Proactor. The Proactor is responsible for calling back each asynchronous operation's completion handler, handle_event. Although each asynchronous operation can in principle be bound to its own Proactor object, the Proactor is generally implemented as a Singleton in the operating system, to centralize the dispatch of completion events.

As shown in Figure 7, in the asynchronous IO model the user thread initiates a read request directly through the asynchronous IO API provided by the kernel, and returns immediately to continue executing its own code.
At this point, however, the user thread has already registered the invoked AsynchronousOperation and its CompletionHandler with the kernel, and the operating system starts an independent kernel thread to handle the IO operation. When the data for the read request arrives, the kernel reads the data from the socket and writes it into the user-specified buffer. The kernel then hands the read data and the CompletionHandler registered by the user thread to its internal Proactor, which notifies the user thread that the IO has completed (generally by invoking the completion event handler the user thread registered), finishing the asynchronous IO. The pseudocode for a user thread using the asynchronous IO model is:

    void UserCompletionHandler::handle_event(buffer) {
        process(buffer);
    }

    {
        aio_read(socket, new UserCompletionHandler);
    }

The user overrides CompletionHandler's handle_event function to process the data; the parameter buffer holds the data the Proactor has already prepared. The user thread simply calls the asynchronous IO API provided by the kernel and registers the overridden CompletionHandler.

Compared with the IO multiplexing model, asynchronous IO is not widely used: many high-performance concurrent servers find that IO multiplexing plus a multithreaded task-processing architecture basically meets their needs. Moreover, operating system support for asynchronous IO is still imperfect; it is often simulated on top of IO multiplexing (when an IO event fires, the user thread is not notified immediately; instead, the data is first read or written into the user-specified buffer). Java has supported asynchronous IO since Java 7, and interested readers can try it out.
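Since Python's standard library does not expose POSIX aio directly, the Proactor-style flow above can be emulated with a background thread that performs the read and then invokes the completion handler with the already-filled buffer, exactly the kind of simulation the text mentions. This runnable sketch is not from the original article; `aio_read`, `UserCompletionHandler`, and `demo_proactor` are illustrative names:

```python
import socket
import threading

class CompletionHandler:
    """The user overrides handle_event to consume the already-read buffer."""
    def handle_event(self, buffer):
        raise NotImplementedError

def aio_read(sock, handler):
    """Proactor-style sketch: a background thread does the blocking read,
    then hands the completed result to the user's CompletionHandler."""
    def worker():
        buffer = sock.recv(1024)       # the "kernel" fills the buffer
        handler.handle_event(buffer)   # completion callback with the data
    t = threading.Thread(target=worker)
    t.start()
    return t

class UserCompletionHandler(CompletionHandler):
    def __init__(self):
        self.done = threading.Event()
        self.result = None
    def handle_event(self, buffer):
        self.result = buffer.upper()   # process(buffer)
        self.done.set()

def demo_proactor():
    server, client = socket.socketpair()
    handler = UserCompletionHandler()
    aio_read(client, handler)          # returns immediately
    # The initiating thread is free to do other work here...
    server.send(b"hello")
    handler.done.wait(2)               # wait only so the demo can report
    server.close(); client.close()
    return handler.result

print(demo_proactor())   # b'HELLO'
```

The key contrast with the Reactor sketch is that `handle_event` receives the data itself, not a readiness notification: by the time the callback runs, the read has already happened.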
This article has briefly described the structure and principles of the four common high-performance IO models at three levels: basic concepts, workflow, and code examples, and has clarified the easily confused concepts of synchronous, asynchronous, blocking, and non-blocking. Understanding these high-performance IO models helps you choose the one that best fits the actual characteristics of your business when developing server-side programs, thereby improving service quality. I hope this article is helpful to you.

The copyright of this article is shared by the author and Blog Park. Author: Florian.