The temperature of a newly purchased reference-design GPU without water cooling soared from room temperature to 85 degrees when running at full load. Model training is not a matter of minutes, so the card is very likely to run at high temperature for a long time. It is a pity to let such an expensive GPU keep heating up!

I was first inspired by a friend's article on Zhihu, "Assembling a Deep Learning Platform from Scratch (GPU cooling)": http://t.cn/RK9wyBK

That article changes the GPU fan speed through nvidia-settings in the Ubuntu X server environment. With the default nvidia-settings configuration, the fan speed does not exceed 70% even when the GPU reaches 85 degrees under load, which is not enough to cool the GPU properly, so the fan speed has to be set manually.

Note: The following settings apply to GPUs on Linux systems. Windows users should search for related articles.

1. If you have a display (X server)

You can follow the article "Assembling a Deep Learning Platform from Scratch" mentioned above. Here are the key steps:

1. Modify the /etc/X11/xorg.conf file
2. Add Option "Coolbits" "4" to Section "Device"
3. Restart your computer: sudo reboot
4. Run nvidia-settings with GPUFanControlState=1 and GPUTargetFanSpeed=100.
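
A minimal sketch of the last step, written as a small Python wrapper around nvidia-settings. It assumes the commonly used `[gpu:N]` and `[fan:N]` target syntax with the first GPU and first fan (index 0), and that steps 1 to 3 have already enabled Coolbits under a running X server. The function names `fan_speed_command` and `set_fan_speed` are illustrative and do not come from the Zhihu article.

```python
import subprocess
from typing import List


def fan_speed_command(speed_percent: int, gpu: int = 0, fan: int = 0) -> List[str]:
    """Build the nvidia-settings call that enables manual fan control and sets a target speed."""
    if not 0 <= speed_percent <= 100:
        raise ValueError("fan speed must be a percentage between 0 and 100")
    return [
        "nvidia-settings",
        # 1 = the user controls the fan manually
        "-a", f"[gpu:{gpu}]/GPUFanControlState=1",
        # Older drivers named this attribute GPUCurrentFanSpeed
        "-a", f"[fan:{fan}]/GPUTargetFanSpeed={speed_percent}",
    ]


def set_fan_speed(speed_percent: int, gpu: int = 0, fan: int = 0) -> None:
    """Run the command; needs a running X server with Coolbits enabled."""
    subprocess.run(fan_speed_command(speed_percent, gpu, fan), check=True)
```

Under these assumptions, `set_fan_speed(100)` is equivalent to typing `nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=100"` in a terminal. Cards with several fans would need one call per fan index.
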
Here GPUTargetFanSpeed=100 sets the fan speed: 100 means the fan runs at 100% speed, and other values can be used as well. Note that in newer NVIDIA drivers GPUCurrentFanSpeed has been renamed to GPUTargetFanSpeed. GPUFanControlState=1 means that the user can adjust the GPU fan speed manually.

Thanks to the original Zhihu author: Zhang San

2. If you don’t have a monitor

After building a deep learning environment on Ubuntu, many people disable Ubuntu's X desktop service and connect to the GPU machine over ssh from another Windows computer. With the X server disabled, the machine boots straight into command-line mode, and the first method above no longer applies, because nvidia-settings can only run in the X desktop environment. Forcing the setting anyway produces an error.

So under normal circumstances the fan speed cannot be changed this way. But is there any other way? Yes! You need to trick the system into thinking a monitor is attached, a setup often referred to as headless mode. The main solution follows the article "fan speed without X: powermizer drops card to p8": http://t.cn/RK9ASS5

That article provides a script for changing the fan speed. Running the script under Ubuntu adjusts the fan speed in real time to cool the GPU. Here are the detailed steps:

1. Clone this GitHub repository to the local directory /opt: https://github.com/boris-dimitrov/set_gpu_fans_public
This repository contains several files; the main one is cool_gpu. After cloning the folder, we can run cool_gpu to adjust the fan speed.
2. Rename the folder to set-gpu-fans. Due to an oversight by the author, the cool_gpu code refers to this folder as "set-gpu-fans", but the folder created by git clone is named "set_gpu_fans_public".
3. Create a symbolic link so that the system knows where this code is.
4. Go to the set-gpu-fans folder and run cool_gpu from there.
This command runs the cool_gpu cooling code. Once it starts, it prints status messages that update in real time.

Before starting the compute test, let's look at the current GPU temperature. Here we use 2 cards for the test. The Perf (performance) state of those 2 cards has been adjusted to "P2" (the other cards are still at P8), their temperature is 35 degrees, and the three fans are running at 55%. "P2" is an NVIDIA performance state: states range from P0 to P12, where P0 is the highest-performance state, P2 is the state used while running compute workloads, and P12 is the lowest-performance (lowest-power) state.

Once model training starts, the program keeps adjusting the fan speed automatically to hold the temperature down. After training has run for a while, the final status is as follows: the fans were all adjusted to 80% speed, and the temperature stabilized at 65 degrees! Compared with the data at the beginning of the article, the graphics card temperature dropped from 84 degrees to 65 degrees, a drop of nearly 20 degrees!

3. One thing to note

Before the article used in the second part came out, another article was circulating on the Internet, which can be considered the original version; the code in the second part is an improvement on it. The link is here (Set fan speed without an X server): http://t.cn/RK9yQmf

However, the original code in that article has a serious problem: although it can force the fan speed to change, the GPU is downgraded and its power state is forced down to P8, resulting in a serious drop in computing performance! The article was probably published long ago and does not suit the latest graphics cards and drivers, which is why the improved version in the second part exists. So please do not use the original version of the code; otherwise GPU performance will be limited.

Reposted from Leifeng.com.
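
One way to check for the P8 problem described above is to read each card's performance state while a training job is running. The sketch below assumes nvidia-smi is on the PATH and uses its CSV query mode. The names `GpuStatus`, `parse_status`, `throttled_gpus` and `read_status`, as well as the 50% utilization threshold, are illustrative choices and are not part of any of the cited scripts.

```python
import subprocess
from typing import List, NamedTuple, Optional

QUERY_FIELDS = "index,temperature.gpu,fan.speed,pstate,utilization.gpu"


class GpuStatus(NamedTuple):
    index: int
    temperature_c: int
    fan_percent: Optional[int]  # None when the card reports no fan reading
    pstate: str
    utilization_percent: int


def _to_int(field: str) -> Optional[int]:
    """Convert a numeric CSV field; return None for placeholders such as "[N/A]"."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_status(csv_text: str) -> List[GpuStatus]:
    """Parse the output of nvidia-smi --query-gpu=... --format=csv,noheader,nounits."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, temp, fan, pstate, util = [part.strip() for part in line.split(",")]
        rows.append(GpuStatus(int(index), int(temp), _to_int(fan), pstate, _to_int(util) or 0))
    return rows


def throttled_gpus(rows: List[GpuStatus], busy_threshold: int = 50) -> List[int]:
    """Return indices of GPUs that are busy yet sitting in the low-power P8 state."""
    return [r.index for r in rows if r.pstate == "P8" and r.utilization_percent >= busy_threshold]


def read_status() -> List[GpuStatus]:
    """Query the local driver through nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_status(out)
```

Calling `throttled_gpus(read_status())` during training returns the indices of cards that are under load but still in P8. An idle card in P8 is normal and is not reported.
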
This article was written by Hu Zhihao and originally published on the author’s personal blog.