Written after DeepSeek became popular: AI is developing so fast, will it develop faster and faster in the future?

Produced by: Science Popularization China

Author: Wang Chen (PhD Candidate at Institute of Computing Technology, Chinese Academy of Sciences)

Producer: China Science Expo

Editor's note: To showcase the latest trends in intelligent technology, the China Science Popularization Frontier Technology Project has launched a series of articles on "Artificial Intelligence" to provide a glimpse into the latest progress in artificial intelligence and respond to various concerns and curiosities. Let us explore together and welcome the intelligent era.

Recently, DeepSeek, a "new top player" in the AI industry, has sparked heated discussion on social media with its powerful capabilities. Some people call it a productivity tool of the future, some are curious about how it will change daily life, and some worry that it will take away their jobs...

To help everyone learn more about this highly anticipated smart assistant, we invited Wang Chen, a doctoral student at the Institute of Computing Technology of the Chinese Academy of Sciences, to answer 10 questions about DeepSeek's core principles, usage tips, and future trends. Whether you are an AI novice or a technology expert, this article has answers for you. Let's see whether this "smart assistant" can become a true partner in our lives!

DeepSeek attracted global attention during the Spring Festival, and many platforms have since announced that they have integrated the DeepSeek large model. What exactly is it?

DeepSeek is an artificial intelligence startup based in Hangzhou. It was founded in July 2023 by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer (Huanfang Quantitative), and focuses on the research and development of large language models.

Before the Spring Festival, DeepSeek released two open-source large language models bearing its name: DeepSeek-V3 (December 26, 2024) and DeepSeek-R1 (January 20, 2025). Their performance is comparable to that of leading closed-source models such as OpenAI's GPT-4o and o1, at a significantly lower cost.

The DeepSeek-V3 model is designed to provide cost-effective service. It responds quickly and handles everyday tasks such as natural language processing, question answering, translation, and content generation. The DeepSeek-R1 model focuses on complex reasoning tasks, especially mathematical problems, code generation, and logical reasoning, but its response time is correspondingly longer.

Why has DeepSeek attracted so much attention?

After DeepSeek-V3 and DeepSeek-R1 were released before the Spring Festival, they quickly attracted worldwide attention for two reasons: performance comparable to that of the top large models, led by OpenAI's, and low training and inference costs. DeepSeek's cost-effectiveness challenges the dominance of large models from the United States, and its launch lets more companies and users access state-of-the-art AI at a lower price.

DeepSeek has open-sourced its technical details and model weights, allowing more people to build on its results for innovation and research and development. At the same time, DeepSeek has made its online services available for free, drawing a large number of users and creating an unprecedented boom. Seven days after the official release of DeepSeek-R1, DeepSeek surpassed ChatGPT to top the App Store free app download rankings. The success of DeepSeek marks major progress in China's AI field and strengthens China's position in the global AI technology competition. Many companies and universities have already begun to deploy the DeepSeek models on their own infrastructure, further demonstrating their potential for wide application.

Why can it achieve such powerful capabilities at such low cost and with limited computing power? Is it strong only in Chinese, or in all areas?

DeepSeek achieves powerful capabilities at a low training cost mainly because of its sustained, long-term innovation in model architecture and algorithms.

Specifically, DeepSeek reduces the cost of inference by using techniques such as the Mixture-of-Experts (MoE) architecture and Multi-Head Latent Attention (MLA). At the same time, with the help of data distillation, distributed training optimization, and hardware-level tuning, DeepSeek further improves resource utilization, thereby reducing training costs. Together, these optimizations allow DeepSeek to deliver strong performance while keeping both training and inference costs low.
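To make the Mixture-of-Experts idea more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is not DeepSeek's actual router: the softmax gate, the eight toy "experts", the random router scores, and the names `route_token` and `moe_layer` are all assumptions made for illustration. The point it tries to show is only that each token activates a small subset of experts, so most of the layer's parameters are not computed for that token.

```python
import math
import random

def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weight_sum = sum(probs[i] for i in chosen)
    return {i: probs[i] / weight_sum for i in chosen}

def moe_layer(token_value, experts, router_scores, top_k):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = route_token(router_scores, top_k)
    output = sum(weight * experts[i](token_value) for i, weight in gates.items())
    return output, sorted(gates)

# Hypothetical toy experts: each one just scales its input differently.
NUM_EXPERTS = 8
experts = [lambda x, scale=s: x * scale for s in range(1, NUM_EXPERTS + 1)]

random.seed(0)
router_scores = [random.uniform(-1.0, 1.0) for _ in range(NUM_EXPERTS)]
output, active = moe_layer(2.0, experts, router_scores, top_k=2)
active_fraction = len(active) / NUM_EXPERTS
print(f"active experts: {active}, fraction of experts run: {active_fraction:.2f}")
```

In this toy setup only 2 of the 8 experts run for the token, which is the intuition behind the lower inference cost; a real model routes vectors rather than single numbers and learns the router scores during training.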

DeepSeek performs exceptionally well at understanding and using Chinese. It can understand classical Chinese and compose poetry, and it also has an accurate grasp of popular internet slang. In contrast, ChatGPT's Chinese is grammatically fluent but can read as stiff. However, DeepSeek's capabilities are not limited to Chinese. In multiple official benchmark evaluations, DeepSeek has reached the top tier in English, encyclopedic knowledge, long text, code, mathematics, and other areas.

DeepSeek's performance in different fields

(Image source: Reference 2)

In the field of AI, does using Chinese mean higher efficiency?

In the field of AI, greater “efficiency” often means faster processing, higher accuracy of understanding, or better quality of generated content.

First of all, Chinese and English differ considerably in structure. Chinese is an ideographic language, in which a single character can carry several meanings, while English is an alphabetic language, in which each word is built from multiple letters. Compared with English, Chinese is more concise and has a higher information density: when expressing the same meaning, Chinese can often convey it in less space. Therefore, in the field of AI, using Chinese can improve expression efficiency and thus reduce costs.
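The density claim can be illustrated with a small, hedged comparison. The sketch below counts characters and UTF-8 bytes for one hand-picked sentence pair; the sentences and the helper name `text_stats` are assumptions chosen for illustration. Note that what a language model actually pays for is tokens, which depend on the model's tokenizer, so character counts are only a rough proxy and do not by themselves establish lower cost.

```python
def text_stats(text):
    """Return character count and UTF-8 byte count for a string."""
    return {"characters": len(text), "utf8_bytes": len(text.encode("utf-8"))}

# Hypothetical example pair chosen for illustration only.
chinese = "人工智能发展很快"
english = "Artificial intelligence is developing rapidly"

zh = text_stats(chinese)
en = text_stats(english)
print("Chinese:", zh)
print("English:", en)
```

For this pair the Chinese sentence uses far fewer characters, but each Chinese character takes three bytes in UTF-8, so the gap narrows at the byte level. Whether it survives at the token level depends on how much Chinese text the tokenizer was built from.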

At the same time, the rich semantics and complex grammatical structure of Chinese also challenge an AI's ability to understand. For example, in Chinese, "花" can mean a flower or the act of spending, which can make it harder for AI to work out the intended meaning. English also has polysemous words, but its sentence structure may be more explicit. Therefore, when processing Chinese, AI needs more contextual information to understand the meaning accurately.

In addition, the amount of data and the design and optimization of the model should also be taken into consideration. If the AI model uses a large amount of Chinese data during training, it may perform better when processing Chinese tasks. Conversely, if the data mainly comes from English or other languages, the AI may be more efficient when processing these languages. Some models may be designed specifically for a certain language, in which case they will naturally be more efficient in that language.

There is no consensus on whether Chinese has significant advantages in the field of AI. In the future, how to tap the potential advantages of Chinese may become an important research direction.

Why can DeepSeek show its “deep thinking process” when answering user questions?

DeepSeek-R1 can show its deep thinking process when answering user questions because it uses Chain of Thought (CoT) technology. Chain of Thought technology imitates the way humans think. It requires the model to break down complex tasks into simple steps and then solve them step by step, thereby enhancing the model's ability in complex reasoning tasks.

OpenAI's o-series models also use chain-of-thought technology, but OpenAI does not disclose the model's original chain of thought to users and provides only a summary of it. As an open-source model, DeepSeek-R1 exposes its chain of thought in full, so users can see the model's entire reasoning process as it solves a problem.
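As a rough illustration of what "seeing the entire reasoning process" can look like in practice, the sketch below separates a reasoning block from the final answer in a piece of model output. It assumes the raw output wraps its reasoning in `<think>...</think>` tags; that tag format, the hand-written sample text, and the helper name `split_reasoning` are assumptions for illustration, not something stated in this article.

```python
import re

def split_reasoning(raw_output):
    """Separate a <think>...</think> block from the final answer text."""
    match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
    if match is None:
        # No visible reasoning block: treat everything as the answer.
        return "", raw_output.strip()
    reasoning = match.group(1).strip()
    answer = (raw_output[:match.start()] + raw_output[match.end():]).strip()
    return reasoning, answer

# Hypothetical model output, written by hand for illustration.
raw = (
    "<think>\n"
    "Step 1: 17 * 6 = 102.\n"
    "Step 2: 102 + 8 = 110.\n"
    "</think>\n"
    "The result is 110."
)
reasoning, answer = split_reasoning(raw)
steps = [line for line in reasoning.splitlines() if line.startswith("Step")]
print(len(steps), "reasoning steps; final answer:", answer)
```

A service that shows only a summary would, in effect, keep the `reasoning` part private and return a shortened version of it alongside `answer`.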

What are the respective characteristics of ChatGPT and DeepSeek? Do they represent two separate future directions for large AI models, or will they converge?

ChatGPT is based on OpenAI's GPT series of models. It is trained on a large amount of multilingual data, supports multiple languages and modalities, and can provide services across languages and domains. As a closed-source model, it is offered by OpenAI to users as an online service.

DeepSeek is optimized for the Chinese language and has lower training and inference costs. DeepSeek is an open-source model that users can deploy and customize as needed. At present, the two differ in technical architecture and market positioning, but as the technology develops, they are likely to learn from each other and converge. For example, DeepSeek may learn from ChatGPT's multimodal capabilities, and ChatGPT may optimize its localized services to meet the challenge from competitors such as DeepSeek.

DeepSeek released an open source model. After opening it to the public, how should it maintain its leading position?

Liang Wenfeng, founder of DeepSeek, has said that today's generative artificial intelligence is not the end point and that the future goal is artificial general intelligence. At a time when AI technology is developing rapidly, no one holds a lasting technological advantage over competitors, and even keeping a model closed source cannot prevent being overtaken. To meet this challenge, the company hopes to build lasting value through the growth of its team and to stay ahead through continuous innovation. The decision to open-source is based on this consideration. Open source can break technology monopolies, lower the technical barrier to entry, and stimulate broader technical cooperation and innovation. It can also attract more developers to contribute and help build an open and diverse technology ecosystem. DeepSeek hopes to promote the long-term development of the technology in this way, maintain its leading position, and become a leader in AI technology.

When you open the usage page, there are options for "Deep Thinking (R1)" and "Online Search". How do the two differ in use? How can we make better use of this large reasoning model?

After the Deep Thinking (R1) option is turned on, the backend switches to the DeepSeek-R1 model, which focuses on scenarios that require complex reasoning, such as math or programming problems. It shows its detailed thinking process and provides both the reasoning steps and the final result.

The online search option allows the model to obtain real-time Internet search results. It is suitable for time-sensitive questions that require the latest information. The model can provide real-time updated answers based on the search results.

When using the Deep Thinking (R1) function, users do not need to guide the model's thinking in their questions. They only need to state their needs clearly and avoid ambiguous wording so that the model can understand the request and provide accurate answers. In Deep Thinking mode, besides reading the model's final answer, users can also follow the model's thinking process to better understand how the problem was solved.

Which fields of work are likely to be most impacted by DeepSeek, or even replaced?

Large language models such as DeepSeek may affect industries whose work relies on information retrieval and data analysis, is highly repetitive, and has clear goals. For example, fields such as content creation, data processing, translation and proofreading, human customer service, human resources management, and financial auditing may be replaced by automated AI technology. AI can efficiently complete the tasks users require, thereby reducing dependence on manual labor.

However, for jobs that require creativity, emotional intelligence, and interpersonal communication, human involvement is still indispensable. As AI technology develops rapidly, people need to keep strengthening these abilities, which AI cannot easily replace. They help individuals stay competitive in the workplace and ensure that, in the future work environment, people and AI can collaborate, complement each other, and jointly promote social progress.

Why is AI developing so fast? Will it continue to develop faster?

AI has developed rapidly in the past few years, driven by several factors.

First, the significant improvement in computing power, especially the development of hardware such as GPUs, has enabled AI systems to process larger amounts of data and train more complex models, thereby improving overall performance.

Second, the rapid development of Internet technology has provided a rich source of data for AI training. At the same time, breakthroughs in deep learning architectures and algorithms have steadily increased AI capabilities. In recent years, technology companies and investors have seen the potential of AI and have provided strong support in funding and technology. Together, these factors have driven the leapfrog development of AI technology.

Although many experts believe that AI will continue to develop rapidly, it is uncertain whether it can maintain its current speed. Optimists believe that AI progress will be exponential: as AI becomes more capable, AI iteration will become faster and faster, and AI will eventually surpass humans completely. However, computing power and data may become bottlenecks. Training large models requires more and more computing power, and the growth of computing power is currently not enough to fully meet the needs of AI training. At the same time, existing human-generated data may be exhausted in the next few years.

How AI technology can break through the bottleneck of computing power and data and continue to develop rapidly in the future still requires the joint efforts of researchers around the world. In addition, the ethical, legal and social issues that AI may cause have gradually aroused people's concerns. Some scientists have called for a pause in the development of more powerful AI systems until people can ensure their safety and controllability.

References:

1. https://en.wikipedia.org/wiki/DeepSeek

2. https://api-docs.deepseek.com/zh-cn/news/news1226

3. https://api-docs.deepseek.com/zh-cn/news/news250120

4. Liu, A., Feng, B., Xue, B., Wang, B., Wu, B., Lu, C., ... & Piao, Y. (2024). DeepSeek-V3 technical report. arXiv preprint arXiv:2412.19437.

5. Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., ... & He, Y. (2025). DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948.
