An unexpected result of technological evolution: How did games and cryptocurrencies become the “computing power base” of AI?


This past spring we witnessed the biggest technology frenzy since the turn of the century. Describing the progress of artificial intelligence (AI) over the past few months as "springing up like mushrooms after rain" would be too conservative; "big bang" may be the more fitting phrase. Even Dr. Lu Qi, the former president of Baidu, an industry leader widely regarded as exceptionally driven, admitted that he "can't keep up (with the papers and code) because there are simply too many."

Looking back to November 30, 2022, the door to a new era suddenly opened. OpenAI released ChatGPT, and people were surprised to find that AI had recaptured the glory of AlphaGo, this time on a much broader front. Generative AI represented by GPT-3 appeared to handle language across the board, while MidJourney and Stable Diffusion made painting no longer a craft unique to humans. In the months that followed, "large language model" (LLM) became a household keyword, and Internet giants such as Microsoft, Google, and Facebook (Meta) were back in the spotlight.

Chinese companies have also stepped up. Baidu's "Wenxin Yiyan", SenseTime's "RiRiXin", Alibaba's "Tongyi", Tencent's "Hunyuan", and Huawei's "Pangu" have all made their debuts. By May, companies, research institutes, and universities had released more than 30 large models, with the grand ambition of "building the IT foundation of a new era", a spectacle described as "an industrial revolution every day and a Renaissance every night".


Of course, the future of AI is not free of worries. In an article published in early March 2023, Bloomberg reported that 10% to 15% of Google's total annual electricity consumption goes to AI projects, roughly equivalent to the annual household electricity use of 500,000 residents of Atlanta. According to the International Data Corporation (IDC), AI currently accounts for about 3% of global energy consumption; by 2025, just two years later, that figure is projected to soar to 15%, with a correspondingly large impact on the environment.

In this sense, energy is the first foundation of AI. Perhaps AI will hit the energy wall before it benefits all mankind.

01

How does AI consume all this energy?

Why does AI consume so much power? The answer involves its other foundation: computing power. AI is a compute-intensive technology, and applications like ChatGPT in particular demand enormous amounts of computation, and therefore enormous amounts of energy.

The recent AI wave is driven by deep learning, which builds multi-layered artificial neural networks (deep neural networks) in which every artificial neuron has its own adjustable parameters. A large language model typically means billions, tens of billions, or even more parameters, which is largely what makes good results possible; on top of that, a huge data set is needed to teach the model how to respond correctly. Supporting both is massive computing power.

**Computing power, data, and algorithms are the three essential elements of AI; none of them can be missing.** When ChatGPT was first released, it was based on the GPT-3 model, which contains 175 billion parameters and was trained on 45 TB of data. A single training run requires about 3,640 PF-days of compute; in other words, a machine performing one quadrillion (10^15) floating-point operations per second would need 3,640 days to complete one training run.
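To get a feel for that arithmetic, here is a small back-of-the-envelope check in Python. The PF-day conversion follows directly from the definition above; the token count and the "6 × parameters × tokens" rule of thumb are not from this article and are used only as a rough, hedged cross-check.

```python
# Back-of-the-envelope check of the "3640 PF-days" figure quoted above.
# 1 PF-day = a machine sustaining 10**15 floating-point operations per second
# for one full day. The token count and the 6*N*D rule below are common rough
# estimates, not figures from the article.

SECONDS_PER_DAY = 86_400
pf_days = 3640
total_flops = pf_days * 1e15 * SECONDS_PER_DAY
print(f"Total training compute: {total_flops:.2e} FLOPs")      # about 3.1e23

# Rough cross-check: compute ~= 6 * parameters * training tokens
params = 175e9    # GPT-3 parameter count from the text
tokens = 300e9    # commonly cited GPT-3 token count (assumption)
print(f"6*N*D estimate:        {6 * params * tokens:.2e} FLOPs")  # about 3.2e23
```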


And that is just training. Putting an AI model to work in the real world to answer questions or take actions, a stage known as "inference", consumes even more energy than training. Chip giant Nvidia estimates that 80% to 90% of the cost of a model like GPT-3 goes to inference rather than training.

There are three main reasons why AI training and inference require so much computing power: growing data sets, growing parameter counts, and the diminishing returns of ever-larger models. Generally speaking, the more data a model sees, the more it learns, much like a human learner; unlike human learning, however, iterating many times over ever-larger data sets makes the energy consumed climb rapidly as well.

As the number of model parameters grows, the number of connections between artificial neurons grows even faster (in fully connected layers, roughly with the square of the number of neurons), and the computation and energy required soar accordingly. In one reported test case, a 4-fold increase in model parameters led to an 18,000-fold increase in energy consumption.
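As a rough illustration of why parameter counts balloon, here is a minimal sketch assuming plain fully connected layers (the layer widths are purely illustrative): widening every layer by a factor of k multiplies the number of weights by roughly k².

```python
# Minimal sketch: parameter count of a plain fully connected network.
# Each pair of adjacent layers of widths (w_in, w_out) contributes a weight
# matrix of w_in * w_out entries plus w_out biases, so widening the network
# inflates the parameter count roughly quadratically.

def dense_param_count(layer_widths):
    """Total weights + biases of a fully connected network with these widths."""
    total = 0
    for w_in, w_out in zip(layer_widths[:-1], layer_widths[1:]):
        total += w_in * w_out + w_out
    return total

for width in (128, 512, 2048):
    widths = [width] * 8              # 8 layers of equal width (illustrative)
    print(f"width {width:5d}: {dense_param_count(widths):,} parameters")
```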

Worse still, making models ever bigger does not straightforwardly make them better; there is also a question of cost-effectiveness. In 2019, researchers at the Allen Institute for Artificial Intelligence (AI2) published a paper demonstrating the diminishing marginal returns of large models: the ResNeXt model released in 2017 required about 35% more computation than its 2015 predecessor, yet improved accuracy by only 0.5%.

However, until that optimal balance is found, people will keep pushing computing power upward. According to a post published by OpenAI, the amount of compute used in the largest AI training runs increased roughly 300,000-fold between 2012 and 2018, which corresponds to a doubling roughly every three and a half months, or about every 100 days.
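The implied doubling time can be checked in a few lines. Treating the growth as a smooth exponential over roughly six years (an assumption; the exact start and end dates shift the answer a little) gives a figure in the same ballpark as the one quoted above.

```python
import math

# Sanity check: what doubling time does a 300,000x increase over
# roughly six years (2012-2018) imply?
growth = 300_000
days = 6 * 365
doublings = math.log2(growth)                       # about 18 doublings
print(f"{doublings:.1f} doublings -> one every {days / doublings:.0f} days")
# -> roughly one doubling every ~120 days, i.e. every three to four months
```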

This is probably the new Moore's Law of the AI era.

02

Computing power: Moore's Law in the AI era

In 1965, Gordon Moore, who would go on to co-found Intel, observed what became known as Moore's Law: the number of transistors that can fit on an integrated circuit doubles roughly every two years. That means a chip of the same size holds about 1,000 times more transistors every 20 years, and about 1,000,000 times more every 40 years.
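Those multiples follow from simple repeated doubling, as a two-line check shows:

```python
# Doubling every two years means 10 doublings per 20 years, 20 per 40 years.
print(2 ** 10)   # 1024      -> roughly a 1,000-fold increase in 20 years
print(2 ** 20)   # 1048576   -> roughly a 1,000,000-fold increase in 40 years
```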

The information age we live in today is built on Moore's Law, which has been an important driving force for the development of computer technology.

In a sense, the momentum provided by Moore's Law is only an "external factor". The development of computer technology also needed an "internal factor", one that comes from human nature: play.

The desire for "games" and "ownership" has been engraved in our genes, even before the birth of the "human" species. Not long after the computer was invented, games became its important use. As early as 1952, American computer scientist Arthur Samuel wrote the first checkers program on an IBM computer. Later, he also coined the term "machine learning". Today, this term and "artificial intelligence" often appear together. In 1966, in order to continue playing the "Star Trek" game he developed, American computer scientist and Turing Award winner Ken Thompson simply wrote an operating system and designed a programming language. That operating system was later Unix. Today, Linux and macOS operating systems on computers, Android and iOS operating systems on mobile phones can be regarded as its close relatives. And that programming language is the famous C language.


In 1981, IBM launched the personal computer (PC), and PC games emerged as a natural next step. Faster hardware enables more powerful software, and more powerful software in turn pushes hardware to upgrade; the two are intertwined like climbing vines. In 1992, the hugely popular 3D game "Wolfenstein 3D" was born. In 3D games, the individual rendering calculations are not difficult, but they must be done extremely fast. The environment and characters are built from many polygons, whose shape and position are determined by the 3D coordinates of their vertices. The graphics card must perform matrix multiplications (and perspective division) on a great many vertices to work out exactly how these models should appear on a flat screen; it then has to compute the color of every single pixel. All of this has to happen very quickly, because the scene in a 3D game changes many times every second.

Fortunately, these calculations are simple and largely independent of one another, so a card dedicated to graphics needs above all to run huge numbers of such calculations in parallel and move data quickly. These requirements led the graphics processing unit (GPU), the heart of the graphics card, down a different path from the CPU: it is optimized specifically for this kind of massively parallel image work. A minimal sketch of the per-vertex workload is given below.
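As a concrete (if greatly simplified) illustration of that per-vertex workload, here is a minimal NumPy sketch. The vertex data and projection matrix are made up for the example; a real engine uses proper model-view-projection matrices, but the structure of the work, the same matrix multiply applied to every vertex followed by a perspective divide, is the same.

```python
import numpy as np

# Every vertex goes through the same matrix multiplication and perspective
# divide, and vertices are independent of one another -- exactly the
# repetitive, parallel workload a GPU is built for.

n = 100_000
vertices = np.random.uniform(-1, 1, size=(n, 3))        # model-space points
vertices[:, 2] -= 5.0                                   # push in front of camera
homogeneous = np.hstack([vertices, np.ones((n, 1))])    # (x, y, z, 1)

f = 1.0 / np.tan(np.radians(60) / 2)                    # toy perspective projection
projection = np.array([[f, 0,  0,  0],
                       [0, f,  0,  0],
                       [0, 0, -1, -0.1],
                       [0, 0, -1,  0]])

clip = homogeneous @ projection.T                       # one matmul covers every vertex
screen_xy = clip[:, :2] / clip[:, 3:4]                  # perspective divide per vertex
print(screen_xy.shape)                                  # (100000, 2)
```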

After the turn of the century, signs that Moore's Law was faltering became increasingly obvious. Process technology was approaching physical limits: as transistors shrank, they became ever harder to manufacture and integrate, and heat dissipation and power delivery became ever bigger problems. Multi-core designs gradually became the mainstream answer, and both CPUs and GPUs raced toward more cores.

Then, Bitcoin came along.

Cryptocurrencies such as Bitcoin are produced by computation, a process known as "mining". Mining is a massive parallel search: the same hash calculation is repeated millions upon millions of times per second with different inputs. When cryptocurrency prices were soaring, mining became a lucrative business, and eager "miners" chasing wealth bought up graphics cards until they were out of stock, demand that further stimulated the push for more computing power.
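A toy example makes the shape of that workload clear. The sketch below is not real Bitcoin mining (real mining double-hashes actual block headers against a far harder target), but it shows the core loop: try a nonce, hash, check, repeat, with every attempt independent of every other attempt, which is why the search parallelizes so well.

```python
import hashlib

# Toy proof-of-work loop. The block data and difficulty are illustrative only.

def mine(block_data: str, difficulty: int = 4):
    """Find a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1           # every candidate nonce can be tested independently

nonce, digest = mine("example block header")
print(f"nonce={nonce}  hash={digest}")
```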

When chip makers first developed GPUs, how could they have imagined that many years later this "gaming gear" would be put to work "mining"?

03

Technology's own arrangement

And that is far from the only unexpected turn.

In 2010, the US Air Force bought about 2,000 Sony PlayStation 3 game consoles. Was this so pilots could train by playing games, or did the officers simply want to play?

Neither.

With the help of physicist Gaurav Khanna, these consoles were wired together into a supercomputer built specifically for processing high-resolution satellite imagery. Its floating-point performance was at least 30 times that of the most powerful graphics card on the market at the time; even today, more than a decade later, the most powerful consumer graphics card barely reaches a fifth of it.

This is clearly not something Sony or gamers ever anticipated, yet it is not hard to understand: game consoles are optimized for games. The PlayStation 3's chip has a CPU and GPU working in concert, can split work across eight cores, and lets those cores share data with one another.

Today, AI needs these same capabilities. The dominant AI technique today is deep learning, whose basic idea is "connectionism": a single artificial neuron has no intelligence of its own, but when a large number of neurons are connected together, intelligence often "emerges". The key is scale: a great many neurons in a very large network, and growth in network scale is one of the main drivers of improvements in model capability.

**Obviously, the larger the network, the greater the demand for computing power.** Today's large neural networks are usually trained and run on GPUs. The algorithms behind neural networks involve enormous numbers of parameters that are updated on every training iteration; the more there is to update, the more memory bandwidth matters, and high memory bandwidth is one of the GPU's strengths. Moreover, at the level of individual neurons the training computations are relatively simple and largely independent, so the GPU's parallel compute can also be used to accelerate them. A rough sketch of what one such training step looks like follows.
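The sketch below shows a single training step for one fully connected layer, written in plain NumPy on the CPU with made-up sizes. The point is the shape of the work: large, regular matrix multiplications (the parallel arithmetic GPUs excel at), plus an update that reads and rewrites every parameter on every step (which is why memory bandwidth matters so much).

```python
import numpy as np

# One simplified training step for a single fully connected layer.
batch, d_in, d_out = 256, 4096, 4096
rng = np.random.default_rng(0)
W = rng.normal(0, 0.01, size=(d_in, d_out))     # layer parameters
x = rng.normal(size=(batch, d_in))              # input activations
grad_out = rng.normal(size=(batch, d_out))      # gradient arriving from the next layer

y = x @ W                                       # forward pass: one big matmul
grad_W = x.T @ grad_out                         # backward pass: another big matmul
W -= 1e-3 * grad_W                              # update touches every parameter

print(y.shape, grad_W.shape)                    # (256, 4096) (4096, 4096)
```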


This is certainly not what graphics cards were designed for, yet by accident they became the infrastructure of the AI era. Games and cryptocurrencies, to a certain extent, laid the computing-power foundation for the AI that came later. In a sense, this was technology's own arrangement.

04

Technology always surprises

Today, AI has begun to drive social and industrial change. Without graphics cards, AI might not have entered our lives so quickly, and graphics cards themselves grew out of human enthusiasm and inventiveness, above all the pursuit of games and, later, cryptocurrencies. It is, in hindsight, a rather unexpected beginning.

Matt Ridley, the well-known science writer, argues in his book "Bottom-Up" that **technological innovation, like biological evolution, has no preset direction; only after a period of survival of the fittest do the best-suited technologies take root and grow.** Once a technology becomes mainstream, it keeps improving itself. Technology behaves like an organism of its own, with its own direction of development, and as it advances, the technologies that win out keep accumulating and the pace of development keeps accelerating.

Kevin Kelly holds a similar view. In his book "What Technology Wants", he writes that technological development is not linear but full of twists, turns, and repetitions; its evolution is complex and uncertain, and its future often exceeds our expectations.

So the energy consumption problem of AI may also find unexpected solutions. People have already begun trying to make AI less power-hungry, with techniques such as reduced precision, model compression, and model pruning, and they are actively exploring renewable energy to supply it more cleanly. That is certainly a good start.
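As a rough sketch of two of those ideas, the snippet below prunes the smallest weights of a single randomly generated weight matrix and stores the survivors at half precision. Real compression pipelines are far more careful than this; it only illustrates the intent of trading a little accuracy for much less computation and memory.

```python
import numpy as np

# Toy illustration of magnitude pruning + reduced precision on one weight matrix.
rng = np.random.default_rng(0)
weights = rng.normal(size=(1024, 1024)).astype(np.float32)

# Prune: keep only the largest 50% of weights by magnitude, zero the rest.
threshold = np.quantile(np.abs(weights), 0.5)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Reduce precision: half the bytes per stored weight.
half_precision = pruned.astype(np.float16)

print("nonzero fraction:", np.count_nonzero(pruned) / pruned.size)
print("bytes before/after:", weights.nbytes, half_precision.nbytes)
```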

Perhaps we should let AI explore this question itself; it might come back with a surprising answer!

Author: Mammoth, Harbin University of Science and Technology

Reviewer | Yu Yang, Head of Tencent Security Xuanwu Lab

The cover image and the images in this article come from a copyright library; reproduction of the image content is not authorized.
