1. Background

The transmission protocol is the connection channel between the "terminal" and the "computing power": its stability and transmission efficiency determine the user experience of terminal computing-power applications, and it is the core guarantee that lets the product offer users "one-point access, ready-to-use" computing services. The mainstream cloud terminal transmission protocols in the industry are VMware's PCoIP, Citrix's ICA, Microsoft's RDP, and Red Hat's SPICE. Among them, SPICE is the only fully open-source protocol, so most cloud computer vendors refer to the SPICE architecture when developing their own transmission protocols. To achieve independent control of key cloud computer technologies, the computing service product group of the SaaS product department took the SPICE protocol as a base, drew on cutting-edge protocols such as WebRTC and QUIC, and explored productization of the cloud terminal transmission protocol from six aspects: in-band protocol transformation, codec optimization, WAN transmission, virtual display solutions, USB redirection, and SDK optimization.

2. Detailed explanation of the SPICE protocol principle

2.1 Overall architecture

Structurally, the SPICE protocol can be divided into four components: the server, the client, the guest agent, and the QXL device and driver.
The relationship between the components is shown in the figure:

Figure 2.1 Relationship between SPICE protocol components

The overall architecture of the SPICE protocol is shown below. The client runs on the user's terminal device and presents the desktop environment to the user. The SPICE server is integrated into the KVM virtual machine process as a dynamic library and communicates with the client through the SPICE protocol.

Figure 2.2 SPICE protocol architecture

As the architecture diagram shows, the client and server communicate through channels, and each channel type carries a specific kind of data. Each channel uses a dedicated TCP socket, which can be secure (using SSL) or insecure. Channels are described in detail in subsequent chapters.

2.2 Detailed explanation of the server-side principle

The SPICE server is implemented by libspice, a pluggable library for the virtual device interface (VDI). On one side, the server uses the SPICE protocol to communicate with remote clients; on the other side, it interacts with the VDI host application (such as QEMU). The server architecture is shown in Figure 2.3. The server communicates with the client through various channels: the main channel, input channel, playback channel, recording channel, display channel, and cursor channel. Each channel transmits a different type of data. For example, the main channel carries simple control instructions, the input channel carries mouse and keyboard messages, and the display channel carries desktop display data. Each channel uses a dedicated TCP socket.

Figure 2.3 Server architecture diagram

Communication between the server and the virtual machine relies on QEMU to virtualize the QXL interfaces and the I/O interfaces.
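The per-channel messaging described above can be pictured with a short sketch. The mini-header below (a 16-bit message type plus a 32-bit payload size, little-endian) is purely illustrative; the real SPICE data header has a different layout and also carries fields such as a serial number.

```python
import struct

# Illustrative mini-header: 16-bit message type + 32-bit payload size,
# little-endian. NOT the actual SPICE wire format.
HEADER = struct.Struct("<HI")

def pack_message(msg_type: int, payload: bytes) -> bytes:
    """Frame one channel message: header followed by payload."""
    return HEADER.pack(msg_type, len(payload)) + payload

def unpack_message(data: bytes):
    """Parse one framed message back into (type, payload)."""
    msg_type, size = HEADER.unpack_from(data)
    return msg_type, data[HEADER.size:HEADER.size + size]

# e.g. a keyboard event on the inputs channel (the type id is made up)
wire = pack_message(101, b"key-down:A")
msg_type, payload = unpack_message(wire)
```

In the real protocol, each such framed message travels over the dedicated TCP socket of the channel it belongs to.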
The I/O interfaces include the agent interface, the input interface, and the audio interfaces (playback and recording), which establish the main channel, input channel, playback channel, and recording channel respectively. The QXL interface encapsulates calls to the graphics device interface through the QXL driver inside the virtual machine, which serves the display channel and cursor channel. Unlike the I/O interfaces, there can be multiple QXL interfaces, which supports features such as multi-screen display. The SPICE graphics system structure is shown in Figure 2.4. Red Server initiates a scheduling request for each QXL interface, and Red Dispatcher creates a Red Worker connected through a channel socket. Red Dispatcher is responsible for QXL scheduling: when the program initializes, when image compression settings change, when video streaming changes, or when the mouse mode is set, it dispatches the request to a Red Worker for specific processing. Red Worker processes the actual QXL commands and manages the display and cursor channels, covering the communication pipes, image compression, video stream creation and encoding, cache control, and so on.

Figure 2.4 Graphics system architecture diagram

To sum up, the three core components of the SPICE server are:

(1) Red Server (reds.c)

Red Server listens for client connection requests, accepts connections, and communicates with clients. It is mainly responsible for:
- Channel management (register, unregister, shutdown)
- Notifying clients of active channels so that they can create them
- Managing the main channel and the input channel
- Socket operations and connection management
- Handling SSL and ticketing
(2) Red Worker (red_worker.c)

The SPICE server creates a worker thread (Red Worker) for each QXL interface. As described above, the Red Worker is mainly responsible for processing QXL commands and managing the display and cursor channels, including the communication pipes, image compression, video stream creation and encoding, and cache control.
(3) Red Dispatcher (red_dispatcher.c)

Red Dispatcher handles QXL scheduling: it creates the Red Worker thread for a QXL interface and forwards requests to it when the program initializes, when image compression settings change, when video streaming changes, or when the mouse mode is set.
2.3 Detailed explanation of the guest-side principle

In the SPICE protocol, the guest is the virtual machine used by the user; guests running different operating systems such as Windows and Linux are currently supported. The guest is composed of virtualized hardware, and its content is transmitted remotely to the client through SPICE for the user to operate, which puts a ceiling on the user experience. For this reason, the SPICE protocol adds two modules in the guest, Vdagent and QXL, to enhance the user experience.

Figure 2.5 Guest framework diagram

(1) Vdagent

Vdagent is an application running inside the guest that communicates with the server through a dedicated spicevmc device created by QEMU. Vdagent is both a command executor for the client and server and an event listener inside the guest. For basic usage, functions such as keyboard and mouse synchronization and audio and video display can all be implemented through the virtual hardware interfaces provided by QEMU. However, since the client runs as an application on the user's physical machine, the guest and the physical machine interact only indirectly, so Vdagent implements the following four functions to enhance the user experience.
Mouse mode. Based on QEMU, the SPICE protocol provides a server-side mouse mode, which drives the relative displacement of the QEMU virtual mouse from the mouse displacement vectors received from the client. Vdagent additionally implements a client-side mouse mode, which bypasses the virtual mouse interface provided by QEMU and moves the guest mouse directly from the absolute position information of the client mouse. By comparison, the client-side mouse mode adapts better to complex network environments: it maintains overall control accuracy even under severe packet loss, reducing issues such as mouse drag.
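Why absolute positions hold up better under packet loss can be shown with a toy simulation (the functions below are hypothetical illustrations, not SPICE code): a lost relative delta offsets the pointer permanently, while absolute positions are self-correcting as soon as the next update arrives.

```python
def apply_deltas(start, deltas, lost_indices=()):
    # Server-side mode: each lost delta permanently offsets the pointer.
    x = start
    for i, d in enumerate(deltas):
        if i not in lost_indices:
            x += d
    return x

def apply_absolute(positions, lost_indices=()):
    # Client-side mode: the last position that arrives is authoritative,
    # so earlier losses do not accumulate into drift.
    last = None
    for i, p in enumerate(positions):
        if i not in lost_indices:
            last = p
    return last

deltas = [5, 5, 5, 5]          # pointer should end at 20
positions = [5, 10, 15, 20]    # the same motion, sent as absolute coords
```

With one packet lost, the delta-based pointer ends up 5 pixels short, while the absolute-position pointer still lands on 20.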
Resolution adaptation. From the user's point of view, the client is a window program running on the physical machine, and the window should be resizable. Without resolution adaptation, the resolutions of the client window and the guest no longer match after a resize, so the desktop image is scaled up or down, or even displayed incompletely or with black borders. Vdagent therefore avoids this mismatch by calling QXL to adjust the guest resolution to fit the client window size.
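The adaptation logic can be sketched as follows. The mode list and the snap-to-mode rule are assumptions for illustration only; the real QXL driver can be asked for an arbitrary resolution that matches the window exactly.

```python
# Assumed list of modes a hypothetical driver might advertise.
SUPPORTED_MODES = [(1024, 768), (1280, 720), (1920, 1080), (2560, 1440)]

def best_mode(win_w, win_h, modes=SUPPORTED_MODES):
    """Pick the largest mode that still fits the client window.

    Falls back to the smallest mode when nothing fits, so the desktop
    is never rendered larger than the window (avoiding cropping).
    """
    fitting = [(w, h) for w, h in modes if w <= win_w and h <= win_h]
    return max(fitting) if fitting else min(modes)
```

For a 1366x768 window this picks 1280x720 rather than 1024x768, keeping the scaled image as close to 1:1 as the mode list allows.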
Clipboard sharing. If a user works on the local physical machine and the guest at the same time, sharing text between the two machines becomes essential. For this, both the client and Vdagent implement clipboard monitoring and writing: when either side's local clipboard is updated, the other side receives the clipboard data and updates its own clipboard accordingly, so the clipboard is effectively shared.
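A minimal model of this two-way sync might look like the sketch below. For brevity it pushes the data eagerly on every copy, whereas the real agent protocol first advertises ownership ("grab") and transfers the data only on demand; the class and method names here are invented.

```python
class ClipboardPeer:
    """Toy model of one side (client or guest agent) of clipboard sync."""

    def __init__(self, name):
        self.name = name
        self.data = ""
        self.remote = None  # the peer on the other side of the channel

    def copy(self, text):
        # Local copy: take ownership and notify the peer.
        self.data = text
        self.remote.on_update(self)

    def on_update(self, owner):
        # Peer announced new clipboard contents; mirror them locally.
        # (The real protocol would fetch the data lazily, on paste.)
        self.data = owner.data

client, guest = ClipboardPeer("client"), ClipboardPeer("guest")
client.remote, guest.remote = guest, client
client.copy("hello from the client")
```

After the `copy` call both peers hold the same text, and a copy performed on the guest side propagates back the same way.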
File transfer. Similar to clipboard sharing, transferring files between the two machines is just as important as sharing text. The user can simply select a file on the physical machine and drag it into the client window to start the transfer, and Vdagent places the received file in a designated path on the guest.

(2) QXL

QXL in the broad sense refers to the whole display module; in the narrow sense it consists of two parts: the QXL graphics card device created by QEMU, and the corresponding QXL driver inside the guest. The guest's screen must be drawn by a graphics card, and since the QXL card is emulated by QEMU purely on the CPU, its performance is poor; without a QXL driver, screen refresh visibly stutters. For this reason, QXL drivers with equivalent functionality have been developed for the different guest operating systems. On one hand the driver improves the image processing of the QXL card; on the other hand it exposes the interface for adjusting the resolution via the QXL card, which Vdagent uses to implement resolution adaptation.

2.4 Detailed explanation of the client principle

The main job of the SPICE client is to parse and render the data sent from the remote end, providing the ability to access the virtual machine desktop remotely. Both the SPICE server and client adopt a modular design and a multi-channel scheme for remote desktop transmission. The advantage is lower code coupling, which makes it easy to support new features by horizontal extension; each SPICE channel provides one specific function. The client infrastructure is shown in Figure 2.6. To keep the structure cleanly cross-platform, SPICE defines a set of common interfaces (the Platform classes) and keeps their platform-specific implementations in parallel directories. These interfaces cover many low-level services, such as timers and cursor operations.
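A sketch of what such a Platform interface might look like, transposed to Python (the method names and the X11-flavored return values are invented for illustration; the real classes are C++ with one implementation per platform directory):

```python
import abc

class Platform(abc.ABC):
    """Common low-level services, implemented once per platform."""

    @abc.abstractmethod
    def create_timer(self, interval_ms, callback):
        """Return a platform timer firing callback every interval_ms."""

    @abc.abstractmethod
    def set_cursor(self, shape):
        """Change the pointer shape in the client window."""

class LinuxPlatform(Platform):
    # Hypothetical X11-backed implementation.
    def create_timer(self, interval_ms, callback):
        return ("x11-timer", interval_ms, callback)

    def set_cursor(self, shape):
        return f"x11-cursor:{shape}"
```

The rest of the client codes only against `Platform`, so a Windows or macOS implementation can live in a parallel directory without touching shared logic.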
Application is the main class; it contains and controls the client, the monitors, and the screens. Its main jobs are parsing command-line arguments, running the main message loop, handling events (connection, disconnection, errors, etc.), redirecting mouse events to the input handler, and switching full-screen mode.

(1) Channels

Clients and servers communicate through channels. Each channel type carries a specific type of data, and each channel uses a dedicated TCP socket, which can be secure (using SSL) or insecure. On the client side each channel runs in its own thread, so a different QoS can be provided per channel by adjusting thread priorities. RedClient is the main channel class; it controls the other instantiated channels (creating them via a factory pattern, connecting, disconnecting, etc.).

Figure 2.6 Client infrastructure diagram

The ancestors of all channels are:

- RedPeer - a socket wrapper class for secure and insecure communication, providing basic functions such as connect, disconnect, close, send, receive, and socket handover for migration. It defines the common message classes InMessages, CompoundInMessage, and OutMessage; every message carries a type, a size, and data.
- RedChannelBase - inherits from RedPeer; provides the basics for establishing a channel connection with the server and supports channel capability exchange with the server.
- RedChannel - inherits from RedChannelBase. This class is the parent of all instantiated channels; it sends outgoing messages and dispatches incoming ones. The RedChannel thread runs an event loop with various event sources (e.g., send and abort triggers); the channel socket is added as an event source to trigger sending and receiving of SPICE messages.
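The inheritance chain can be mirrored in a compact Python sketch (method bodies elided; the real implementation is C++):

```python
class RedPeer:
    """Socket wrapper: connect/disconnect/send/receive, SSL or plain."""

class RedChannelBase(RedPeer):
    """Adds channel establishment and capability exchange."""

class RedChannel(RedChannelBase):
    """Parent of all concrete channels; runs the per-channel event loop."""

# Concrete channels, instantiated and managed by RedClient:
class DisplayChannel(RedChannel): ...
class InputsChannel(RedChannel): ...
class CursorChannel(RedChannel): ...
```

Every concrete channel thus inherits both the socket plumbing of RedPeer and the event-loop machinery of RedChannel, which is what lets each channel run in its own thread with its own priority.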
The available channels are:

- Main - implemented by RedClient
- DisplayChannel - handles graphics commands, images, and video streams
- InputsChannel - keyboard and mouse input
- CursorChannel - pointer device position, visibility, and cursor shape
- PlaybackChannel - audio received from the server and played by the client
- RecordChannel - audio captured at the client

(2) Screens and Windows

- ScreenLayer - a screen layer is attached to a specific screen and provides operations (set, clear, update, invalidate, etc.) on a rectangular area.
- RedScreen - uses screen layers (e.g., display, cursor) to implement the screen logic and controls the window that displays its contents.
- RedDrawable - platform-specific basic pixmap implementation; supports basic rendering operations (e.g., copy, blend, combine).
- RedWindow_p - platform-specific window data and methods.
- RedWindow - inherits from RedDrawable and RedWindow_p; implements basic window state and cross-platform functionality (e.g., show, hide, move, minimize, set title, set cursor).

3. SPICE protocol product transformation plan

The biggest advantage of the SPICE protocol is that it is completely open source, which makes functional extension and secondary development for specific scenarios straightforward. Its shortcomings for productization are just as obvious, however, and many problems must be solved before the SPICE protocol can become a product: insufficient video processing capability, excessive network bandwidth usage, a screen that freezes easily, and a generally poor user experience. This article proposes some productization ideas oriented toward user experience in common scenarios.

3.1 In-band protocol transformation

The SPICE protocol is a typical out-of-band protocol, meaning that the client and the server interact through the QEMU layer.
This makes the SPICE protocol heavily dependent on QEMU and therefore inflexible. To integrate better with the underlying virtualization layer of the mobile cloud and reduce the dependence on QEMU, the SPICE protocol needs to be transformed into an in-band protocol. The biggest difference between the two is where the SPICE server lives. The out-of-band architecture is shown in Figure 3.1: the SPICE server sits in the QEMU layer and interacts with the virtual machine's QXL driver and VDI agent through the QXL device and the VDI port respectively. The client connects to the SPICE server through the host machine's IP address plus a port number, and different virtual machines require different ports.

Figure 3.1 Out-of-band protocol structure diagram

The in-band architecture is shown in Figure 3.2: the SPICE server lives inside the virtual machine as a SPICE process in the guest and is responsible for data transmission. Besides the QXL driver and VDI agent, the guest must also implement screen capture and video encoding. In the out-of-band protocol, video encoding happens on the server side; the in-band protocol instead uses tools such as DXGI for screen capture and performs video encoding on the guest side. The transmission between server and client uses WebRTC to achieve more efficient and stable data communication.

Figure 3.2 In-band protocol architecture

At present there are two feasible approaches to the in-band transformation, as shown in Figure 3.3. The first is to migrate the SPICE server's initialization and invocation code out of QEMU into a SPICE process, turning SPICE into an executable service.
That service completes the data interaction at the bottom of the system and connects to the client through multiple channels. The second approach uses the SPICE stream proxy component: the stream proxy performs screen capture and encoding, and connects to the client through libspice migrated into the virtual machine. Both approaches aim to port the SPICE server into the virtual machine, but their implementations differ. The difficulty of the first is that the SPICE server, as a dynamic library of QEMU, contains only the interfaces QEMU calls; the initialization and invocation parts would have to be ported out of QEMU and turned into a service compatible with different operating systems. On balance, the second approach, using the stream proxy for screen capture and encoding, is more feasible and easier to implement, and is the better fit for the in-band transformation.

Figure 3.3 In-band transformation solution

3.2 H.26x codec integration

(1) H.264 codec integration

H.264 is the digital video compression format standardized after MPEG-4 by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU). In SPICE, H.264 is implemented mainly through the x264enc element of GStreamer, a framework for processing audio and video. x264enc is a GStreamer plug-in that implements H.264 encoding on top of the x264 library. The integration goes as follows: first install the x264 encoding library on the physical machine running spice-server, then install GStreamer (x264enc ships in the gst-plugins-ugly plug-in package), and after the plug-in package is correctly installed, build and install spice-server with H.264 encoding enabled on the server side.
Similarly, once GStreamer is correctly installed on the client side, H.264 decoding is available there. The core of GStreamer is the pipeline: a data stream is fed into the pipeline, processed by it, and output at the other end. For server-side H.264 encoding, the pipeline encodes the data stream with x264enc and outputs an H.264 stream that is sent to the client; after the client decodes it, the graphical interface is displayed. In SPICE, the H.264 encoding pipeline is described in GStreamer as follows:

appsrc is-live=true format=time do-timestamp=true name=src ! videoconvert ! x264enc name=encoder byte-stream=true qp-min=15 qp-max=35 tune=4 sliced-threads=true speed-preset=ultrafast intra-refresh=true ! appsink name=sink

The pipeline consists of four elements: appsrc, videoconvert, x264enc, and appsink. appsrc feeds the application's data into the GStreamer pipeline. videoconvert converts between video formats; specifically, it converts the stream obtained by appsrc into a format the next element (here x264enc) can accept. appsink is the sink element that lets the application pull the data stream back out of the pipeline. Those three are standard GStreamer elements; the core of H.264 encoding is x264enc, which encodes raw video into the H.264 format. Among x264enc's properties, qp stands for the quantization parameter, and the qp-related settings control the bit rate: when they are set, x264enc runs in QP mode. The qp value reflects how far the encoded image quality deviates from the original video stream; with qp=0 the encoder produces lossless output.
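The pipeline description above is just a string handed to GStreamer's parser, so it can be assembled programmatically. The sketch below parameterizes only the qp bounds for illustration; in a real client the resulting string would be passed to `Gst.parse_launch()` (or `gst_parse_launch()` in C).

```python
def h264_encoder_pipeline(qp_min=15, qp_max=35):
    """Assemble the x264enc pipeline description quoted above.

    Only the qp bounds are parameterized here, purely for illustration.
    """
    return (
        "appsrc is-live=true format=time do-timestamp=true name=src ! "
        "videoconvert ! "
        f"x264enc name=encoder byte-stream=true qp-min={qp_min} qp-max={qp_max} "
        "tune=4 sliced-threads=true speed-preset=ultrafast intra-refresh=true ! "
        "appsink name=sink"
    )

pipeline = h264_encoder_pipeline()
```

Raising qp_min trades image quality for bandwidth, which is exactly the knob the qp discussion above is about.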
With qp=51 (the largest quantizer x264enc accepts), image quality is at its worst but the bit rate is at its lowest.

Figure 3.4 Comparison of different qp values (left qp=51, right qp=0)

With this in place the server performs H.264 encoding and the traffic bandwidth is significantly reduced. H.264 decoding is done on the client, using a GStreamer-based H.264 decoder. The client implementation centers on the uridecodebin plug-in, which automatically selects a suitable audio/video decoder for a given URI, thereby hiding the container type and decoder type of different media. On the client, appsrc first pulls the data stream from the application. decodebin automatically detects the input stream's format and constructs the corresponding GStreamer elements in the background for decoding: it inserts the typefind element to determine the media type of the stream, which for H.264 decoding is video/x-h264. The h264parse element splits out H.264 frame data, and avdec_h264 performs the decoding. After decoding, videoconvert converts the video stream into a format appsink can accept, and the result is handed to the application for rendering.

Figure 3.5 H.264 decoder pipeline

(2) H.265 codec integration

H.265 is the video coding standard developed by ITU-T VCEG as the successor to H.264. On the server side, H.265 integration is likewise based on GStreamer, using the x265enc encoder. The resulting pipeline is described as follows:

appsrc is-live=true format=time do-timestamp=true name=src ! videoconvert ! video/x-raw,format=(string)I420 ! x265enc name=encoder tune=4 speed-preset=ultrafast ! video/x-h265, stream-format=byte-stream, alignment=au, profile=(string)main !
appsink name=sink

As with H.264, the raw input of the H.265 pipeline comes from the application, so appsrc is still used to obtain data, and videoconvert converts the video into a data format x265enc can handle. Unlike H.264, the output format of videoconvert is given explicitly here as I420, because the x265enc version in use works best with I420 input. Compared with x264enc, x265enc exposes fewer parameters; in SPICE only tune and speed-preset=ultrafast are currently set, controlling the bit rate and image quality. It is worth noting that during development H.265 support proved less mature than H.264, and theoretically feasible configurations were not always fully supported, so the data formats must be stated explicitly; if the video format of a stream is not specified, the stream may error out and fail to encode. H.265 decoding works like H.264 decoding: it only requires replacing h264parse and avdec_h264 with h265parse and avdec_h265, which decodebin handles automatically.

3.3 WAN environment transmission optimization

(1) Limitations of SPICE in the WAN

The three SPICE modules are isolated from each other. To let them communicate, SPICE implements server-guest communication on top of QEMU's virtual hardware, while the client and the server, usually installed on different physical machines, use TCP to establish their data channels. In a LAN, TCP is adequate for audio and video transmission. In a more complex WAN, however, TCP's large headers and its blocking and loss-retransmission mechanisms add latency and extra bandwidth consumption on long links. SPICE is a real-time desktop transmission protocol.
The remote data includes not only real-time audio and video but also other real-time command data, all of which place high demands on network latency and throughput. Porting SPICE to the WAN therefore requires transmission optimization.

(2) Transmission optimization based on the QUIC protocol

QUIC is a UDP-based reliable transport protocol, implemented at the application layer, that Google developed as an alternative to TCP. Compared with reliability implemented in the transport layer as TCP does, reliability implemented in the application layer adapts better to complex networks. In wide-area networks with difficult conditions, QUIC can therefore exploit the characteristics of the underlying UDP to reduce latency and increase throughput while still guaranteeing reliable delivery.

Figure 3.6 TCP and QUIC transmission architecture

As the figure shows, both the client and the server sit in the application layer. On the left is SPICE's current network transmission scheme, which sends and receives data over WebSocket on top of TCP. On the right is the modified QUIC scheme, which must be applied on both the client and the server. Since QUIC covers the same function as TCP, after switching from TCP to QUIC some mechanisms must be added to the overall architecture to reproduce what WebSocket provided.

(3) Transmission optimization based on the WebRTC framework

WebRTC is a real-time audio and video communication technology released by Google, providing capture, encoding and decoding, network transmission, and display. Unlike QUIC, which replaces the transport underneath, WebRTC replaces the transmission framework itself (such as WebSocket) to establish connections directly between applications.
Standard WebRTC uses UDP as the underlying transport to achieve point-to-point connections between applications, with an optimized transmission mechanism. Compared with the TCP-based WebSocket framework, UDP-based WebRTC offers better resilience and lower latency.

Figure 3.7 Transmission architecture of WebSocket and WebRTC

The WebRTC-based architecture is similar to the original WebSocket architecture: on both the client and the server, the WebSocket framework is replaced with WebRTC.

3.4 Virtual display solution

Display technology in cloud terminal transmission protocols usually takes one of three forms: a virtual graphics card, GPU virtualization (vGPU), or graphics card pass-through. Considering product cost and physical resource consumption, the mainstream cloud desktop (cloud computer) products aimed at ordinary office users all adopt the virtual graphics card approach.

(1) Working principle of the virtual graphics card

The virtual machine management software on the transmission protocol server (VMware, Hyper-V, QEMU-KVM) usually provides the virtual machine with one or more built-in virtual graphics card devices (QXL, Cirrus, SVGA, etc.). The Windows operating system in the virtual machine writes all desktop display data to the virtual graphics card. Once the virtual graphics card has received the complete desktop image, it shares the image data with the management software on the host through some mechanism (such as shared memory).
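Part of the latency argument for QUIC can be shown with a deliberately simplified round-trip model. It only counts handshake round trips before application data can flow, and ignores TLS 1.3, retransmission, and congestion control; the transport labels are illustrative.

```python
def setup_latency_ms(rtt_ms, transport):
    """Toy model: connection-setup latency = handshake RTTs x RTT."""
    rtts = {
        "tcp": 1,          # TCP three-way handshake
        "tcp+tls1.2": 3,   # plus 2 RTTs for the TLS 1.2 handshake
        "quic": 1,         # transport and crypto handshakes combined
        "quic-0rtt": 0,    # resumed QUIC connection sends data at once
    }
    return rtts[transport] * rtt_ms
```

On a 50 ms WAN link this model puts secure TCP setup at 150 ms versus 50 ms for a fresh QUIC connection, which is the kind of gap that matters for an interactive desktop protocol.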
The management software restores this data into a window, so the virtual machine's desktop image can be seen on the host. At the same time, the image data the virtual graphics card shares with the host can be intercepted (the management software has built-in APIs to capture each virtual machine's image data), compressed by an image compression algorithm (usually H.264), and sent to the client through a network transmission protocol (such as SPICE or VNC). The workflow is shown in the figure below:

Figure 3.8 Working principle of the virtual graphics card

(2) Windows Display Driver Model (WDDM)

Graphics cards, whether physical or virtual, necessarily involve graphics drivers. Since Windows Vista, the display/graphics driver model of the Windows operating system has been the Windows Display Driver Model (WDDM). A complete WDDM driver consists of two parts: the kernel-mode display miniport driver and the user-mode display driver. Its architecture is as follows:

Figure 3.9 WDDM architecture diagram

WDDM 1.2 introduced three types of graphics drivers:
- Full graphics driver: the full-featured version, supporting 2D and 3D hardware acceleration with complete rendering, display, and video functions.
- Display-only driver (DOD): as the name suggests, this type of driver provides only the most basic display function and does not support computation (rendering).
- Render-only driver: this type of driver supports only rendering, not display.

In cloud desktop applications, the DOD driver is usually used inside the virtual machine to drive the virtual graphics card (QXL) for display, while the complex and time-consuming rendering work is handed to the CPU on the host side.

(3) Indirect Display Driver (IDD)

Starting with Windows 10 version 1607, WDDM 2.1 provides the Indirect Display Driver (IDD) model to implement virtual display functionality.
This driver simulates a "virtual display" device on a Windows computer (virtual machine), and uses software to connect the virtual display to the output port of the (virtual/physical) graphics card (simulating HDMI/VGA). For the virtual machine operating system, the virtual display is equivalent to a "real physical display". The device can be seen on the system display settings control panel, and can be copied and expanded like a physical display. Objectively, the virtual display "does not exist", so the screen on the display cannot be seen. However, we can intercept the display data output by the virtual graphics card to the virtual display, draw it in a window, or send it to the required client through a transmission protocol, so that the virtual machine can realize dual-screen display in the client window;
The IDD driver is developed on IddCx (Indirect Display Driver Class eXtension). It is a "pure" user-mode driver that contains no kernel-mode components and can use any DirectX API to process desktop image data. IDD also runs in session 0 with no components in the user session, which helps overall system reliability. The driver framework is as follows:

Figure 3.10 IDD architecture diagram

(4) Optimization of virtual display solutions

To improve the display performance of virtual machines under the SPICE transmission protocol and support features such as multi-screen display, window adaptation, and resolution adjustment, corresponding display drivers need to be developed. Two candidate solutions follow:
Figure 3.11 DOD+IDD display solution

In this solution, the underlying virtualization software loads one virtual graphics card driven for display by a DOD driver, and the screen is then extended through an IDD virtual display to achieve dual screens.
Figure 3.12 DOD+DOD display solution

This solution instead adds two virtual graphics cards to the virtual machine, both using DOD drivers. Each virtual graphics card corresponds to one virtual display device on the host, thereby achieving dual screens.

3.5 USB redirection policy

(1) Problems with USB redirection in the SPICE protocol

Usability of USB peripherals is a key factor in the user experience of cloud terminal products. At present, SPICE's USB redirection still has the following problems:

① The driver loading scheme frequently installs and uninstalls drivers, making the redirection process slow and unstable. First, installing and uninstalling a driver takes a long time; on machines with older configurations it can take minutes, so there is a long wait between mapping a device and actually being able to use it. Second, driver installation and removal cause repeated device refreshes, which can disturb devices that are already mapped and make them work abnormally. Finally, frequent driver churn can corrupt the system's driver store; without redirection the system then cannot load the correct driver and the device becomes unusable.

② Redirection of other USB devices such as cameras needs further development and testing. Bulk data transfer and scenarios with heavy real-time output, such as USB cameras, are relatively inefficient and inevitably consume considerable bandwidth; these factors also need to be considered.

(2) Transformation plan

The whole redirection process involves the SPICE client, the SPICE server, and the guest. The detailed flow is shown in Figure 3.13.
When a USB device needs to be mapped, control and read/write requests for the device are issued in the Guest and transmitted to the SPICE server through the VDI interface provided by the virtualization software QEMU. After receiving a message, the SPICE server parses the data according to the USB redirection protocol defined by libusbredir, processes it, and sends it to the client through the USBRedir channel defined by the SPICE protocol. The client interprets the received data through a generic USB driver and operates the physical USB device; after the device responds, the data is returned along the same path. When the device no longer needs to be mapped, SPICE uninstalls the corresponding generic driver; the operating system ejects the device and allows it to be re-identified, matching it against its driver store by the device's USB attributes. Because the system keeps a backup, the device's vendor-supplied functional driver can still be found even after it was overwritten by the generic driver: once SPICE uninstalls the generic driver, the correct device driver is reloaded when the USB device is re-identified.

Figure 3.13 USB device redirection architecture diagram

USB device redirection in the SPICE cloud terminal transmission protocol project uses USBRedir technology. Based on this technology, the transformation plan for the problems described above is as follows:
① Driver replacement technology is used instead of the current install-on-demand approach to loading drivers. Installing a driver updates the operating system's driver store before the device driver can be loaded, whereas driver replacement modifies the USB device attributes so that the device matches a generic driver that is already present. Because the operating system matches drivers by USB device attributes, the generic device driver is pre-installed against a set of customized device attributes, and during mapping the device's reported attributes are rewritten to those predefined values; when the operating system matches a driver by attribute, it therefore loads the generic driver. When the device is not being redirected, its real attributes are reported unmodified and the operating system loads the device's own functional driver. Driver replacement through attribute modification thus avoids frequent driver installation and uninstallation.
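The attribute-substitution idea above can be sketched as follows. This is a minimal, hypothetical illustration, not SPICE code: the reserved ID pair, the in-memory driver table, and all function names are assumptions made for the example. The operating system picks a driver by (vendor ID, product ID); the generic driver is pre-registered under a reserved pair, and the device reports either its real IDs or the reserved pair depending on whether it is being redirected.

```python
# Sketch of driver matching by USB device attributes (hypothetical, not SPICE code).
# The OS selects a driver by (vendor_id, product_id); a generic driver is
# pre-registered under a reserved ID pair, and redirection rewrites the IDs
# the device reports so that the generic driver matches without any
# install/uninstall cycle.

# Reserved attributes that the pre-installed generic driver is registered under.
GENERIC_VID, GENERIC_PID = 0xFFFF, 0x0001

# Simulated OS driver store: (vid, pid) -> driver name.
driver_store = {
    (GENERIC_VID, GENERIC_PID): "usb_generic_redirector",
    (0x046D, 0x0825): "vendor_webcam_driver",  # example vendor-supplied driver
}

def reported_attributes(real_vid, real_pid, redirecting):
    """Return the attributes the device reports to the OS."""
    if redirecting:
        # Rewrite attributes so the OS matches the generic driver.
        return GENERIC_VID, GENERIC_PID
    # Not redirecting: report real attributes, so the functional driver matches.
    return real_vid, real_pid

def match_driver(real_vid, real_pid, redirecting):
    """Simulate the OS matching a driver by the reported USB attributes."""
    return driver_store[reported_attributes(real_vid, real_pid, redirecting)]
```

With this scheme, `match_driver(0x046D, 0x0825, redirecting=True)` selects the generic redirector, while the same device with `redirecting=False` gets its vendor driver back, without touching the driver store.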
② USB data compression can be used to improve redirection, reducing bandwidth use and increasing data transmission efficiency. During USB redirection, the scenarios with large data volumes are usually file copying and USB camera transmission. For file copying the content must not be modified, so lossless compression is required. For USB camera redirection, the volume of captured data is usually large and is typically transmitted uncompressed; since some information loss is acceptable for such data, it can be compressed with mature video codecs such as MJPEG or H.264 before being sent to the server, which decodes and restores it.

3.6 SDK cropping and optimization
(1) API refactoring
The SPICE client code is implemented with GTK+, which uses GObject (in C) to simulate object-oriented programming. The resulting code is cumbersome, and as the code base grows it becomes increasingly hard to maintain; most importantly, developers unfamiliar with GTK+'s operating mechanism find this set of API interfaces nearly unusable. The new version of the server code therefore no longer simulates object orientation with GObject but is developed in C++. Accordingly, the client framework also needs to be optimized: we chose the C++-based cross-platform framework Qt to restructure the code and redesign the API, lowering the barrier for API users.
(2) Smartcard removal
Smart cards are plastic cards with an embedded microchip, usually the size of a credit card. The chip generally contains a CPU, RAM, and I/O and requires a specific card reader to interact with; no single reader is compatible with all types of smart cards.
Moreover, although smart cards are safer for many applications, they remain susceptible to certain types of attacks that can recover information from the chip; differential power analysis, for example, can be used to infer the on-chip private keys used by public-key algorithms such as RSA. In addition, mobile cloud computers mainly target individual users, and smart-card scenarios in private cloud deployments see almost no use, so this feature can be removed.
(3) Coroutine removal
spice-gtk uses native coroutines to implement I/O reading and writing. Compared with threads, native coroutines reduce creation overhead and avoid pointless scheduling, but they are not well suited to mobile platforms. Coroutines fit high-concurrency, high-throughput scenarios at the cost of scheduling fairness: a long-running computation (such as image decoding) delays the response of I/O tasks, and therefore other synchronous data processing such as audio playback. Moreover, a single-threaded coroutine design cannot fundamentally avoid blocking, for example on file operations or memory page faults. A cloud desktop terminal does not need to handle high-concurrency scenarios, so multithreading better preserves fairness across each channel's data processing: audio playback and peripheral redirection are not affected while a web page with complex animations is being displayed.

4. Summary and evolution direction
The SPICE protocol emerged to solve the remote desktop problem, so it has obvious shortcomings in the current mobile Internet era, let alone the fully immersive Internet era to come. To adapt to the rapidly developing Internet, the cloud terminal transmission protocol must continue to evolve along several technical routes.
1️⃣ Adapting to container technology
Container technology, represented by Docker, is developing rapidly.
Unlike virtual machines, which consume substantial resources, Docker's lightness and low resource consumption have made it widely used in cloud applications such as cloud gaming and cloud phones. The cloud terminal transmission protocol should adapt to container technology to support these lightweight cloud application scenarios.
2️⃣ Ultra-high-definition video support
Ultra-high definition is the display industry's next major technological change after digitalization and high definition. As the resolution of users' screens keeps rising, support for ultra-high-definition video in the cloud terminal transmission protocol has been put on the agenda.
3️⃣ Panoramic video support
The metaverse has become the hottest topic of the moment, and virtual reality (VR) is its most promising entrance, called the "next-generation universal computing platform". It uses computers to create realistic three-dimensional virtual scenes with which users can interact, producing a strong sense of immersion. If the cloud terminal transmission protocol wants a place in the metaverse, it must evolve toward wearable terminals.
In summary, today's cloud terminal transmission protocol stands at the entrance to the next generation of the Internet in a still primitive state; to support the metaverse of the future, there is still a long way to go.