A brief talk about WebGPU, the web-based GPU API

Part 01

WebGPU R&D Background  

In the early days of GPU-accelerated Web development, developers mostly used the WebGL API, released in 2011, for graphics drawing. This API is based on OpenGL ES and was for a time the only choice for low-level GPU graphics on the Web. Its programmable shaders gave it a performance advantage over Canvas2D for certain drawing tasks. The API can only be used after obtaining a WebGL context from a canvas element. Its state-machine-style design, centered on internal global state, has long been criticized by developers: to guarantee correct drawing results, they must carefully construct the sequence of procedural API calls and manage the setting and restoring of state, which also introduces a certain amount of performance overhead.

As technology has advanced, the GPU is no longer exclusive to graphics rendering; it shines in fields such as the metaverse, machine learning, big data, and neural networks. With the growing demand for computing power, the role of the GPU has become more and more important. At the same time, a new generation of graphics APIs (Vulkan, Metal, DirectX 12) has appeared on the desktop. They adopt object-oriented designs that give developers lower-level interface access, more control over the GPU, flexible API calling patterns, and general-purpose parallel computing capabilities, allowing developers to extract the maximum performance from the GPU.

The Web also needs these capabilities. Built on the design concepts of modern graphics APIs, WebGPU came into being. It is not an upgrade of WebGL: WebGPU has its own abstract design and does not directly wrap any single native graphics API. The following is a schematic diagram of the WebGPU architecture.

Part 02

Important concepts in WebGPU  

2.1 Adapters and Devices

When you start reading the WebGPU specification, the first concepts you encounter are the adapter and the device. The following figure shows the abstraction from the physical device (GPU) to the logical device.

The adapter, GPUAdapter, corresponds one-to-one to a physical GPU; a computer may have multiple GPU devices (e.g. integrated and discrete graphics). The adapter acts as a translation layer linking WebGPU to the native graphics API. A GPUAdapter can be obtained as follows.
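A minimal sketch using the standard navigator.gpu.requestAdapter call (the powerPreference hint is optional):

```javascript
// Request an adapter; resolves to null if WebGPU is unavailable.
const adapter = await navigator.gpu.requestAdapter({
  powerPreference: 'high-performance', // hint: prefer the discrete GPU
});
if (!adapter) throw new Error('WebGPU is not supported in this browser.');
```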

The device here, GPUDevice, is a logical device and does not correspond directly to a physical GPU. The GPU is a shared resource: a browser may run multiple Web applications, each of which can use the GPU independently, so an agent-like role is needed to mediate their access to GPU functionality. That is the role of the WebGPU device. The GPUDevice object is the entry point for most of the APIs used later; in a sense it resembles the WebGL context, but it is not tied to a canvas. A GPUDevice is obtained as follows.
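A minimal sketch of requesting the logical device from the adapter obtained above:

```javascript
// Request a logical device (and its default queue) from the adapter.
const device = await adapter.requestDevice();
```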

2.2 Shaders

A shader is a program that runs on the GPU. Modern GPU rendering is implemented as a pipeline (a programmable logic pipeline), and shader code executes at certain programmable stages of that pipeline. If you have worked with WebGL, you may know about vertex shaders and fragment shaders. The application organizes data resources and passes them to the shader in the form of variables (uniform/attribute); the shader runs and hands its results to the next stage for processing.

Shaders are important tools for developers to control the GPU. Complex calculations, scene effects, image processing, and more can all be handled by shader programs. WebGPU not only provides vertex shaders and fragment shaders, but also supports general-purpose parallel computation via compute shaders, which are carried by the WebGPU compute pipeline (the pipeline concept is introduced below) and offer more computing capability than WebGL. WebGL implements shader code in GLSL (the language used by OpenGL), while WebGPU has a redesigned shader language, WGSL. The following is an example of creating shader code and the corresponding modules (GPUShaderModule).
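Below is a sketch of two WGSL shader modules, one vertex and one fragment. The variable names (uniforms, uTexture, uSampler, aPosition, aUv) are the ones referenced in the resource section that follows; the single mvpMatrix uniform field is an illustrative assumption.

```javascript
// Vertex shader module: transforms positions and forwards uv coordinates.
const vertexModule = device.createShaderModule({
  code: /* wgsl */ `
    struct Uniforms {
      mvpMatrix : mat4x4<f32>,
    };
    @group(0) @binding(0) var<uniform> uniforms : Uniforms;

    struct VertexOut {
      @builtin(position) position : vec4<f32>,
      @location(0) vUv : vec2<f32>,
    };

    @vertex
    fn main(@location(0) aPosition : vec3<f32>,
            @location(1) aUv : vec2<f32>) -> VertexOut {
      var out : VertexOut;
      out.position = uniforms.mvpMatrix * vec4<f32>(aPosition, 1.0);
      out.vUv = aUv;
      return out;
    }
  `,
});

// Fragment shader module: samples the texture at the interpolated uv.
const fragmentModule = device.createShaderModule({
  code: /* wgsl */ `
    @group(0) @binding(1) var uSampler : sampler;
    @group(0) @binding(2) var uTexture : texture_2d<f32>;

    @fragment
    fn main(@location(0) vUv : vec2<f32>) -> @location(0) vec4<f32> {
      return textureSample(uTexture, uSampler, vUv);
    }
  `,
});
```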

2.3 Resources (buffers, textures, samplers)

In the shader example above, several variables are defined: uniforms, uTexture, uSampler, aPosition, aUv, and so on. The values of these variables come from the external application's data resources. The data is stored in video memory and ultimately passed to the shader program to produce the corresponding results. Data resources can be roughly divided into four categories: vertex attribute data, shader variable (uniform buffer) data, texture data, and samplers.

Vertex attribute data mainly stores vertex position coordinates, normal vectors, texture coordinates (for sampling textures), and so on, which are necessary for basic drawing. Shader variable data is the general data the shader program needs to run, such as affine transformation matrices, scene lighting parameters, and material parameters. Texture data mostly stores image resources and is often used for mapping effects when drawing. A sampler is a special resource that specifies how a texture is addressed and filtered, e.g. magnification and minification filtering, anisotropic filtering, and mipmap filtering. Vertex attribute data and shader variable data map to GPUBuffer, namely the vertex buffer object (VBO) and uniform buffer object (UBO); texture data corresponds to GPUTexture; and the sampler is a GPUSampler object. All three object types are created through GPUDevice. The following are examples of creating each type of resource.
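The sketches below show one way to create each resource type; vertexData (a Float32Array) and image (a loaded ImageBitmap) are assumed to exist, and the sizes and formats are illustrative.

```javascript
// Vertex buffer (VBO): created mapped so the CPU can write into it directly.
const vertexBuffer = device.createBuffer({
  size: vertexData.byteLength,
  usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
  mappedAtCreation: true,
});
new Float32Array(vertexBuffer.getMappedRange()).set(vertexData);
vertexBuffer.unmap(); // end the mapping; the buffer is now usable by the GPU

// Uniform buffer (UBO) backing the `uniforms` variable in the shader.
const uniformBuffer = device.createBuffer({
  size: 64, // one mat4x4<f32> = 16 floats * 4 bytes
  usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
});

// Texture holding the image data.
const texture = device.createTexture({
  size: [image.width, image.height],
  format: 'rgba8unorm',
  usage: GPUTextureUsage.TEXTURE_BINDING |
         GPUTextureUsage.COPY_DST |
         GPUTextureUsage.RENDER_ATTACHMENT,
});
device.queue.copyExternalImageToTexture(
  { source: image },
  { texture },
  [image.width, image.height],
);

// Sampler describing filtering and addressing behavior.
const sampler = device.createSampler({
  magFilter: 'linear',
  minFilter: 'linear',
});
```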

GPUBuffer creation uses the buffer-mapping mechanism: while a region of video memory is mapped, the CPU can access it. In the example above, mappedAtCreation is set to true when creating the GPUBuffer so that it starts out mapped, and the mapping is ended (unmap) after the data has been written.

2.4 Binding Group

In the examples above we created GPUBuffer objects for vertex attributes, a GPUBuffer object for uniform variables, a GPUTexture object for image resources, and a sampler object. How the vertex-attribute GPUBuffer is passed to the GPU is explained in the pipeline and command-encoding sections below. The other three resources (shader variables, textures, and samplers) must be submitted to the GPU in an efficient way. For this purpose WebGPU introduces the concept of binding groups, i.e. GPUBindGroup: a data container that groups data resources and passes them to the shader program, allowing data to be organized and allocated efficiently. This grouped form of data organization reduces the number of CPU-GPU communications, thereby improving performance; it also lets shaders with different behavior share the same group of resources, enabling resource reuse. The following figure shows the different forms of data organization and transmission in WebGL and WebGPU.

As the figure shows, the WebGL API is designed around setting internal global state: resources are bound to binding points one by one through API calls, each of which essentially mutates that global state. WebGPU instead puts resource data into a container and sends it to the GPU through command submission (see the section on encoders and queues). Creating a GPUBindGroup requires a corresponding descriptor, whose structure is as follows.
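Simplified from the WebGPU specification, the descriptor looks roughly like this:

```javascript
// GPUBindGroupDescriptor (simplified):
// {
//   layout:  GPUBindGroupLayout,        // the layout this group conforms to
//   entries: [                          // one GPUBindGroupEntry per resource
//     {
//       binding:  Number,               // matches @binding(n) in the shader
//       resource: GPUBufferBinding | GPUSampler | GPUTextureView,
//     },
//   ],
// }
```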

The binding group has a corresponding layout (GPUBindGroupLayout). The layout describes, for the shader program, the type of each resource, the group it belongs to, its binding point (binding), and which shader stages may see it (visibility). Looking closely at the shader examples above, you will find declarations such as @group(0) @binding(0), meaning the resource is bound to binding point 0 of group 0. The binding layout must be filled in to match the declarations in the shader program. Each GPUBindGroupEntry object designates one binding slot, and a resource created through WebGPU is attached to that slot (specified in the resource field). The following is a simple example of GPUBindGroup creation: we package the previously created GPUBuffer, sampler, and texture objects into one binding group object.
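A sketch of creating the layout and bind group for the resources above; the visibility flags assume the uniform buffer is read in the vertex stage and the texture/sampler in the fragment stage, matching the earlier shader code.

```javascript
// Layout: declares what lives at each binding point and who can see it.
const bindGroupLayout = device.createBindGroupLayout({
  entries: [
    { binding: 0, visibility: GPUShaderStage.VERTEX, buffer: { type: 'uniform' } },
    { binding: 1, visibility: GPUShaderStage.FRAGMENT, sampler: {} },
    { binding: 2, visibility: GPUShaderStage.FRAGMENT, texture: {} },
  ],
});

// Bind group: attaches the actual resources to the binding slots of group 0.
const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [
    { binding: 0, resource: { buffer: uniformBuffer } },
    { binding: 1, resource: sampler },
    { binding: 2, resource: texture.createView() },
  ],
});
```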

2.5 Pipeline

After creating the shader modules and preparing the data resources, the next important task is building the pipeline. When most developers start learning graphics rendering, the first concept they encounter is the rendering pipeline, an important mechanism of modern image rendering. Yet this concept is not reflected in the WebGL API design; its fragmented API organization makes it hard for beginners to connect each step to the GPU pipeline. WebGL requires developers to organize the application's execution flow themselves, which is why you see APIs such as gl.bindVertexArray, gl.bindBuffer, gl.bindTexture, and gl.useProgram: different resources or states are bound as needed to draw different objects or effects. WebGPU pipelines are divided into render pipelines and compute pipelines.

As the name implies, the render pipeline (GPURenderPipeline) is a pipeline used for drawing. Running this pipeline ultimately produces a 2D image, which can be displayed on screen or rendered to a framebuffer. Creating a GPURenderPipeline requires a corresponding descriptor, whose structure is as follows.
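Simplified from the specification, the descriptor is structured roughly as follows:

```javascript
// GPURenderPipelineDescriptor (simplified):
// {
//   layout:       GPUPipelineLayout | 'auto',
//   vertex:       GPUVertexState,       // vertex stage + vertex buffer layouts
//   primitive:    GPUPrimitiveState,    // primitive assembly / topology
//   depthStencil: GPUDepthStencilState, // optional depth & stencil tests
//   multisample:  GPUMultisampleState,  // anti-aliasing sample configuration
//   fragment:     GPUFragmentState,     // fragment stage + color targets
// }
```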


The GPUVertexState and GPUFragmentState fields represent the vertex-shader and fragment-shader programmable stages respectively. GPUPrimitiveState specifies primitive assembly, i.e. which primitive type is used during rasterization. GPUDepthStencilState describes the depth/stencil test configuration. GPUMultisampleState specifies multisampling, used to reduce aliasing. The following is an example of creating a GPURenderPipeline.
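A sketch of render pipeline creation using the two shader modules and the bind group layout from earlier; the vertex buffer layout assumes interleaved vec3 position + vec2 uv data.

```javascript
const renderPipeline = device.createRenderPipeline({
  layout: device.createPipelineLayout({ bindGroupLayouts: [bindGroupLayout] }),
  vertex: {
    module: vertexModule,
    entryPoint: 'main',
    buffers: [{
      arrayStride: 5 * 4, // 3 floats position + 2 floats uv per vertex
      attributes: [
        { shaderLocation: 0, offset: 0,     format: 'float32x3' }, // aPosition
        { shaderLocation: 1, offset: 3 * 4, format: 'float32x2' }, // aUv
      ],
    }],
  },
  fragment: {
    module: fragmentModule,
    entryPoint: 'main',
    targets: [{ format: navigator.gpu.getPreferredCanvasFormat() }],
  },
  primitive: { topology: 'triangle-list' },
});
```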


The example above shows the two shader modules generated earlier being configured into the render pipeline, and it also describes the layout of the vertex attributes (mentioned in the resources section) for the shader. In the vertex shader there are two declarations, @location(0) aPosition and @location(1) aUv, which represent the incoming vertex position and uv coordinate attributes respectively. @location(0) and @location(1) correspond to the shaderLocation values in the pipeline configuration.

In most cases WebGL is used purely as a graphics API and rarely for anything else, such as computation. The compute pipeline is what gives WebGPU its "computing power". It is not part of the traditional rendering pipeline; it is used for general-purpose GPU parallel computing, and the final results are stored in a buffer that can hold data of any type. The compute pipeline has only one stage, the compute stage. Creating a GPUComputePipeline requires a corresponding descriptor, whose structure is as follows.
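Simplified from the specification:

```javascript
// GPUComputePipelineDescriptor (simplified):
// {
//   layout:  GPUPipelineLayout | 'auto',
//   compute: GPUProgrammableStage, // { module: GPUShaderModule, entryPoint: '...' }
// }
```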

GPUProgrammableStage indicates a programmable stage, similar to GPUVertexState and GPUFragmentState. The vertex shader is invoked once per vertex and the fragment shader once per pixel, whereas the compute shader is invoked according to the work items defined by the developer, with each work item corresponding to one thread. Work items are partitioned into workgroups: groups of threads (thread blocks) that can share memory, communicate, and coordinate with one another. In WebGPU the workgroup is modeled as a three-dimensional grid, as shown in the figure below.

Each smallest cube (black edges) can be regarded as a work item, and multiple work items form a workgroup (red dashed edges). In compute shader code you will see declarations such as @workgroup_size(x, y, z), which tell the GPU how large this compute shader's workgroup is. The choice of workgroup size (workgroup_size) usually depends on the semantics of the work-item coordinates. The following is a simple GPUComputePipeline creation example.
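A sketch of a compute pipeline for the grayscale-histogram example discussed below; srcTexture (the source image texture) and the 16×16 workgroup size are illustrative assumptions.

```javascript
// Compute shader: each invocation handles one pixel and atomically
// increments the histogram bin for that pixel's gray level.
const histogramModule = device.createShaderModule({
  code: /* wgsl */ `
    @group(0) @binding(0) var srcTexture : texture_2d<f32>;
    @group(0) @binding(1) var<storage, read_write> bins : array<atomic<u32>, 256>;

    @compute @workgroup_size(16, 16)
    fn main(@builtin(global_invocation_id) id : vec3<u32>) {
      let size = textureDimensions(srcTexture);
      if (id.x >= size.x || id.y >= size.y) { return; }
      let color = textureLoad(srcTexture, vec2<i32>(id.xy), 0);
      // Rec. 709 luma, quantized to a bin index in [0, 255].
      let gray = dot(color.rgb, vec3<f32>(0.2126, 0.7152, 0.0722));
      atomicAdd(&bins[u32(gray * 255.0)], 1u);
    }
  `,
});

const computePipeline = device.createComputePipeline({
  layout: 'auto', // derive the bind group layout from the shader
  compute: { module: histogramModule, entryPoint: 'main' },
});
```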

This is a simple example of computing an image's grayscale histogram. Thanks to the GPU's parallel architecture, we no longer need to traverse the image pixels serially, which greatly speeds up the calculation.

2.6 Command Encoding and Queuing

The work above can be regarded as the preparation stage, mainly covering data preparation and pipeline construction. The final drawing or computation is carried out in the form of commands and queues. The command encoder (GPUCommandEncoder) has two main common functions: creating pass encoders and copying buffer resources (GPUBuffer/GPUTexture). A GPUCommandEncoder is created from the device object, as follows:
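```javascript
// Create a command encoder from the logical device.
const commandEncoder = device.createCommandEncoder();
```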

WebGPU passes are divided into render passes and compute passes, corresponding to the render pipeline and the compute pipeline. The two kinds of pass objects are created and begun through the corresponding methods on the GPUCommandEncoder object (beginRenderPass/beginComputePass), each combined with its own descriptor, yielding the pass encoder objects GPURenderPassEncoder/GPUComputePassEncoder. This kind of encoder is an abstraction in the WebGPU API design and a replacement for WebGL's global state setting. Through the encoder object you can set the required pipeline, binding groups, and vertex attribute buffers, and call the draw/dispatch functions for drawing or computation. The following is an example of using the encoder objects.
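A sketch of recording both kinds of pass; context (the canvas's 'webgpu' context configured for this device), histogramBindGroup, vertexCount, and the image width/height are assumed from earlier setup.

```javascript
// Render pass: draw into the canvas's current texture.
const renderPass = commandEncoder.beginRenderPass({
  colorAttachments: [{
    view: context.getCurrentTexture().createView(),
    clearValue: { r: 0, g: 0, b: 0, a: 1 },
    loadOp: 'clear',
    storeOp: 'store',
  }],
});
renderPass.setPipeline(renderPipeline);
renderPass.setBindGroup(0, bindGroup);
renderPass.setVertexBuffer(0, vertexBuffer);
renderPass.draw(vertexCount);
renderPass.end();

// Compute pass: dispatch enough 16x16 workgroups to cover the image.
const computePass = commandEncoder.beginComputePass();
computePass.setPipeline(computePipeline);
computePass.setBindGroup(0, histogramBindGroup);
computePass.dispatchWorkgroups(Math.ceil(width / 16), Math.ceil(height / 16));
computePass.end();
```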

After the finish function is called, the GPUCommandEncoder object yields a command buffer object (GPUCommandBuffer), which stores the GPU commands. These commands are submitted through the command queue (GPUQueue), as follows:
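```javascript
// finish() yields a GPUCommandBuffer; submitting it hands the recorded
// commands to the device's default queue for execution.
const commandBuffer = commandEncoder.finish();
device.queue.submit([commandBuffer]);
```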

Part 03

Conclusion  

As a brand-new API, WebGPU injects new vitality into Web application development. It advances the Web from graphics rendering to general-purpose parallel computing, making the GPU an important player in Web applications and a key to building high-performance applications in the future.
