SEMI Vision: Cloud giants' GPU capital expenditure is expected to exceed US$320 billion in 2025. TSMC and others are accelerating production expansion to cope with the supply-demand gap

Recently, DeepSeek, a Chinese AI large-model startup, has been all over the news. In just a few months, DeepSeek released two open-source large language models, DeepSeek-V3 and DeepSeek-R1, which match the world's top models, such as Meta's Llama 3.1, OpenAI's GPT-4, and Anthropic's Claude 3.5 Sonnet, on multiple key performance benchmarks.

What is most striking is that DeepSeek's training cost is far lower than that of these established models, and the GPUs it used were not top-tier hardware, yet it delivered remarkable performance.

Economist Ed Yardeni pointed out in a report: "DeepSeek spent only $5.6 million in two months to develop its DeepSeek-v3 model." In contrast, Anthropic CEO Dario Amodei mentioned last year that it costs $100 million to $1 billion to build a model. And because these models are open source, they have great advantages in cost and pricing.

However, it cannot be ignored that although DeepSeek's costs are far lower than those of the traditional large companies, its breakthrough still rests on strong GPU support. As AI competition intensifies, and the training and inference markets continue to expand, computing power will remain the key to victory, and the role of the GPU cannot be overlooked.

US tech giants remain unaffected and continue to scramble for GPUs

Neither DeepSeek nor concerns about an AI bubble have dampened corporate investment enthusiasm. Cloud vendors will still make large capital expenditures in 2025. To win the AI battle, technology companies have spent recent years building data centers, scrambling for GPU cards, and securing electricity. A new round of fierce competition is about to begin:

On January 21, OpenAI announced a new project: the Stargate Project, which plans to invest $500 billion over the next four years to build a new artificial intelligence infrastructure for OpenAI in the United States, with the first $100 billion invested this year.

Amazon plans to invest $100 billion in infrastructure this year, up from $77 billion in 2024 and more than double the $48 billion the year before. The vast majority of the money will go to data centers and servers for Amazon Web Services.

Microsoft announced in early January 2025 that it plans to invest $80 billion in fiscal year 2025 (ending June of this year) to build data centers capable of handling artificial intelligence workloads.

Google plans to invest $75 billion in 2025 (up 42% from $53 billion last year), and Google CEO Sundar Pichai called the opportunity in AI "unprecedented, which is why we are investing more to seize it."

Meta will invest $60 billion to $65 billion in AI-related capital expenditures this year. Meta CEO Mark Zuckerberg said: "I still think that investing heavily in capital expenditures and infrastructure will be a strategic advantage in the long run. We may find other situations at some point, but I think it's too early to draw conclusions. For now, I bet that the ability to build this infrastructure will be a major advantage."

Even Oracle, which was very cautious about AI in the past few years, has increased its 2025 capital expenditure. Oracle will roughly double its capital expenditure in 2025 from 2024, to about $13.6 billion; in fiscal 2021, its capital expenditure was only about $2 billion. Oracle Chief Financial Officer Safra Catz said on the fiscal 2025 earnings call that AI demand drove Oracle's cloud infrastructure revenue up 52%, and that cloud revenue is expected to reach $25 billion this fiscal year.

Among the major technology companies, Oracle is the only one in the S&P 500 whose market value has not exceeded $1 trillion. Unlike Amazon and Microsoft, Oracle mainly leases its data centers rather than owning them outright. Analysts say this unique data center strategy lets Oracle compete effectively with better-funded rivals, because a larger share of its capital expenditure can go toward purchasing GPUs rather than the large-scale facility spending Microsoft undertakes. Oracle also plays an important role in Stargate, building and operating the computing systems together with Nvidia and OpenAI.

It is estimated that the combined capital expenditure of Microsoft, Amazon, Google, and Meta reached $246 billion in 2024, up from $151 billion in 2023, and 2025 spending may exceed $320 billion. GPU purchases will account for a large share of these outlays, and clusters of 100,000 cards are gradually becoming the standard for AI computing.
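As a rough sanity check on that estimate, the per-company 2025 figures quoted above can simply be summed (a back-of-the-envelope calculation, not SEMI's methodology; Meta's guidance is a $60–65 billion range, so a low and a high total are computed):

```python
# Back-of-the-envelope check of the 2025 capex estimate, using the
# per-company figures quoted in the article (US$ billions).
capex_2025 = {
    "Amazon": 100,
    "Microsoft": 80,
    "Google": 75,
}
meta_low, meta_high = 60, 65  # Meta guided $60-65B

low_total = sum(capex_2025.values()) + meta_low    # 315
high_total = sum(capex_2025.values()) + meta_high  # 320

print(low_total, high_total)  # 315 320
```

The four-company total lands in the $315–320 billion range, broadly consistent with the "may exceed $320 billion" estimate.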

Nvidia is undoubtedly the biggest beneficiary of this wave. Meta, one of its three largest customers, is accelerating construction of a data center of over 2 GW and plans to deploy more than 1.3 million GPUs by the end of 2025; Oracle is building a zettascale cloud super-cluster supporting up to 131,072 Blackwell GPUs, expected to launch in the first half of 2025; and Microsoft became the world's largest GPU buyer last year. According to Omdia, Microsoft purchased up to 485,000 Hopper chips in 2024, roughly twice as many as any other company, and much of this year's $80 billion budget is also expected to go toward GPU purchases.

In 2025, Nvidia's Blackwell GPU will undoubtedly become the most watched chip in the market. Although the series faces some technical challenges, Nvidia still plans to launch it ahead of schedule. According to Business Insider, Nvidia is actively pushing SK Hynix to prepare its next-generation memory as early as possible to accelerate the mass production of Blackwell GPU.

AMD is also actively pushing its GPUs into customers' hands. The popular DeepSeek-V3 model was developed using AMD Instinct GPUs and ROCm software. The MI300 series has become AMD's fastest-growing product ever, and Microsoft has purchased the Instinct MI300X. According to Nextplatform, AMD's data center GPU sales were expected to exceed US$5 billion in 2024, almost 10 times the 2023 figure. AMD CEO Lisa Su has revealed that more than 100 enterprise and AI customers are actively deploying the MI300X.

To capture the AI market quickly, AMD plans to launch its next-generation MI350 series GPUs ahead of schedule. The MI350 will begin sampling to major customers this quarter, with production and shipments ramping by mid-year.

Under this trend, the GPU market may once again face demand exceeding supply, with only a few leading manufacturers able to obtain priority allocation.

Can expanding production solve the GPU shortage problem?

The biggest constraint behind GPU shortages is production capacity. CoWoS packaging and HBM memory, the two pillars of modern GPUs, are also the main bottlenecks. According to DIGITIMES Research, driven by strong demand for cloud AI accelerators, global demand for CoWoS and similar packaging capacity may increase by 113% in 2025. To this end, TSMC and SK Hynix, the two key manufacturers in the GPU supply chain, have stepped up production expansion in an attempt to close the supply gap.

Amid growing geopolitical and economic uncertainty, TSMC's advanced packaging roadmap underwent multiple adjustments in 2024. Under TSMC's latest plan, CoWoS monthly capacity is expected to reach 35,000 wafers in 2024, more than double to 75,000 wafers by 2025, and rise further to 135,000 wafers in 2026.

TSMC CoWoS capacity forecast (Source: SEMI Vision)
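Taking the monthly wafer-capacity figures above at face value (a simple illustrative calculation, not SEMI Vision's model), the implied year-over-year growth is:

```python
# TSMC CoWoS monthly capacity forecast (wafers/month), as quoted above.
cowos_capacity = {2024: 35_000, 2025: 75_000, 2026: 135_000}

growth_2025 = cowos_capacity[2025] / cowos_capacity[2024]  # ~2.14x
growth_2026 = cowos_capacity[2026] / cowos_capacity[2025]  # exactly 1.8x

print(f"{growth_2025:.2f}x, {growth_2026:.2f}x")  # 2.14x, 1.80x
```

In other words, the 2025 step is slightly more than a doubling, followed by another 80% jump in 2026.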

TSMC's CoWoS capacity in 2024 is roughly double that of 2023, yet supply remains tight. According to SEMI Vision, TSMC's advanced packaging expansion projects in Zhunan, Chiayi, Taichung, and Tainan are all proceeding at full speed. The Zhunan AP6B advanced packaging plant obtained its use permit on December 3, and the Chiayi plant broke ground in May this year. The Taichung AP5B plant is expected to enter production in the first half of next year, and the Tainan AP8 plant (a converted Innolux fab) is slated for small-scale production at the end of 2025. According to a January 20 report in the Economic Daily, TSMC will invest another NT$200 billion to build two new CoWoS packaging plants in Phase III of the Southern Taiwan Science Park (Nanke). If true, counting the Chiayi Science Park (Jiake) plants currently under construction, TSMC will have a total of eight CoWoS plants in the short term: two in Jiake Phase I, two at the former Innolux Fab 4, two in Nanke Phase III, and two in Jiake Phase II. The red-hot demand for CoWoS packaging is evident.

TSMC advanced packaging factory (Source: SEMI Vision)

SK Hynix, for its part, is also busy. As a major HBM supplier, its 2025 HBM capacity is already sold out, so it continues to expand, aiming for HBM capacity of 140,000 wafers per month by next year. At the same time, it is accelerating product iteration and working closely with TSMC on 16-layer HBM4, with mass production and shipment expected to begin in the second half of 2026.

SK Hynix's full-year 2024 revenue hit a record high, surpassing the more than 21 trillion won it earned in 2022, and its operating profit also exceeded the record set during the 2018 semiconductor super-cycle. Samsung, by contrast, has effectively ceded the HBM3E market to SK Hynix; its hopes now rest on HBM4. Samsung intends to use new hybrid bonding technology to achieve 16-layer HBM4, though current progress does not appear smooth. Even so, Samsung is expanding its HBM capacity, planning to reach 140,000 to 150,000 wafers per month by the end of this year and 170,000 to 200,000 wafers per month by the end of next year.

Micron's HBM products are progressing well; its CEO expects HBM revenue in the hundreds of millions of dollars in fiscal 2024 and billions in fiscal 2025, with a goal of capturing 20% of the HBM market next year. In Taiwan, Micron is significantly expanding HBM capacity, including the A3 plant in Taichung and its Fab 11 in Taoyuan. Micron has also opened a new office in Taiwan and acquired AUO's Taichung plant to convert it into a DRAM production base, planning to triple its current monthly output from 20,000 wafers to 60,000 by the end of next year.

In the short term, even as expansion plans proceed, the supply-demand gap will persist. In particular, as AI and data center applications develop further, GPU demand will keep rising, and expanding existing capacity alone may not close the market gap quickly.

When GPUs are not enough, can ASICs fill the gap?

The GPU shortage has dragged on for well over a year, and cloud vendors have been constrained by Nvidia for years. As a result, the cloud giants have all invested in ASIC development: Google's TPU (tensor processing unit) has become an industry benchmark; Amazon AWS has launched two self-developed chips, Trainium and Inferentia; Microsoft followed with its Maia and Cobalt series; Meta is pursuing the market with its MTIA chips; and OpenAI is reportedly working with Broadcom to develop its own ASICs to support large-scale model training.

These ASICs are already seeing real adoption. Apple released the first preview of its iPhone AI features in July 2024, with the underlying model trained on Google's TPUs. At Amazon's AWS re:Invent conference, Apple announced it would use Amazon's self-developed AI chips for model training, and it is also evaluating Amazon's latest Trainium2 chip. All signs indicate that ASICs can already fill some of the gaps left by GPUs.

ASICs offer distinctive value. Nvidia's B200 and other GPUs improve performance largely by enlarging die area, so the chips keep growing. ASICs focused purely on compute discard some of the GPU's general-purpose functions, making them an attractive way to raise performance while cutting power. In other words, ASICs offer high performance, low power consumption, low cost, strong confidentiality and security, and smaller board footprints.

Even NVIDIA, the "King of GPUs", has not ignored the potential of ASICs. According to reports, NVIDIA has begun planning an ASIC product line and is recruiting thousands of engineers in chip design, software development, and AI R&D in Taiwan. Jensen Huang has said bluntly that NVIDIA can further expand its customer base this way: CSP customers may become NVIDIA's competitors, but all CSPs will still be NVIDIA's customers, and cloud customers will remain inseparable from NVIDIA.

The era of ASIC popularity also brought Broadcom and Marvell to the fore. Broadcom has already reached a trillion-dollar market value, and Marvell's market value has also exceeded the 100-billion-dollar mark. The rapid rise of these companies in the ASIC field is not only due to their own technical research and development capabilities, but also reflects the huge market demand for customized computing solutions.

As AI models grow more complex and large-scale applications spread, the ASIC market is growing explosively. According to Morgan Stanley, the AI ASIC market will grow from US$12 billion in 2024 to US$30 billion in 2027, a compound annual growth rate (CAGR) of about 34%. Broadcom is more optimistic, predicting ASIC market demand of US$60 billion to US$90 billion in 2027, while Marvell expects the data center ASIC market to climb to US$42.9 billion in 2028.
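For reference, the growth rate implied by Morgan Stanley's two endpoints can be checked directly (the small gap versus the cited ~34% presumably comes from rounding or slightly different base figures, an assumption on our part):

```python
# CAGR implied by growing from $12B (2024) to $30B (2027), i.e. 3 years.
start, end, years = 12, 30, 3
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # 35.7%
```

The endpoints imply roughly 36% per year, close to the cited figure.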

Although the GPU shortage is unlikely to be fully resolved in the short term, the rise of ASICs offers a feasible way to fill the gap. Especially in cloud computing and AI, ASICs' customized designs deliver more efficient, lower-power solutions for specific workloads, and they are gradually becoming a powerful complement to GPUs.

As more and more cloud service providers invest in ASIC research and development, the future computing ecosystem may become more diversified. GPU and ASIC complement each other and jointly promote the advancement of the entire AI and cloud computing industry.

Final Thoughts

DeepSeek's emergence has led many to conclude that it is bad news for computing power and for NVIDIA. However, an analysis by "Information Equality" argues that the computing power required for frontier exploration (developing new models) differs from that required for fast-follower catch-up (improving on existing results). The article cites "The Bitter Lesson", emphasizing that computing power is the fundamental driver of AI research: historically, AI breakthroughs have come from scaling compute rather than from algorithmic innovation alone, and as computing power continues to grow, AI capabilities will take qualitative leaps.

Therefore, the GPU's central position on the future AI battlefield remains unshaken. The future of the AI industry is still a contest of computing power. Whether for an emerging company like DeepSeek or traditional giants such as OpenAI, Google, and Meta, the GPU will remain the cornerstone of technological innovation and product breakthroughs. In 2025, the battle for GPUs will still be one to watch.

Observations from the semiconductor industry
