Welcome to Science Popularization China's special winter vacation column "High-tech Classes for Children"! As one of today's most cutting-edge technologies, artificial intelligence is changing our lives at an astonishing speed. From smart voice assistants to driverless cars, from AI painting to machine learning, it has opened up a future full of infinite possibilities for us. This column uses videos and text to explain to children, in an easy-to-understand way, the principles and applications of artificial intelligence and its profound impact on society. Come and start this AI journey with us!

At the end of 2022, the term "ChatGPT" quietly entered the public eye. If you haven't heard of it, or think of it as just another chatbot, you are underestimating it. Many industries, such as news, law, education, and customer service, already use ChatGPT in production and services. So what exactly is ChatGPT? In this issue, we will learn about ChatGPT and the technology behind it.

Let's start with its name. "Chat" means chatting: you can think of it as a chat application like WeChat or QQ, except that the one on the other end is not your friend but an AI. "GPT" is the more important part; it is the abbreviation of "Generative Pre-trained Transformer". Generative means it can generate new text in response to the information it receives. Pre-trained means that ChatGPT was trained on a huge amount of text before ever talking to you. Transformer is a deep learning model, and it is fair to say that the transformer is the core of the whole GPT.

To understand ChatGPT, we have to start with how AI learns to speak. When humans speak, we pick words from the "dictionary" in our minds and form sentences. If we simply let an AI pick words from the dictionary at random, the sentences it forms will most likely be incoherent and meaningless. To make computers speak human language, people introduced the Markov model. Simply put, a Markov model connects each word to the words that came before it. For example, according to the corpus, the word after "soda" is far more likely to be "biscuit" or "water" than "table" or "carrot". If we add the word "eat" before "soda", then "biscuit" becomes more likely than "water". Sentences generated this way are much closer to human language than randomly generated ones.
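To see the Markov idea in action, here is a minimal Python sketch of a bigram model: each next word is drawn from the words that were actually seen following the current word. The tiny corpus and the starting word are made up purely for illustration; real systems count word pairs over billions of sentences.

```python
import random
from collections import defaultdict

# A toy corpus; in a real system this would be millions of sentences.
corpus = "i eat soda biscuit . i drink soda water . i eat biscuit .".split()

# Count which words follow which: the heart of a (bigram) Markov model.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start, length=6):
    """Walk the chain: pick each next word from words seen after the current one."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:  # dead end: nothing ever followed this word
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate("i"))  # e.g. "i eat soda water . i drink"
```

Run it a few times: the output is crude, but because each word is chosen based on the word before it, it already looks far less random than words picked blindly from a dictionary.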
Building on this idea, a model called the recurrent neural network, or RNN for short, was developed in the 1980s. An RNN takes good account of word order and of the influence of earlier words on later ones. But RNNs have a limitation known as the vanishing gradient problem: as a sentence grows longer, the network forgets what was said at the beginning. People therefore optimized the RNN and developed the long short-term memory model, abbreviated LSTM, to cure this "forgetfulness". Even that was not enough. RNN-based models still have two problems: they learn too slowly, because they must read words one at a time, and their understanding of word meaning is not good enough. To address this, a new neural network architecture emerged: the transformer. Transformer-based models learn very quickly and can absorb a huge amount of text data in a short time. The GPT models that communicate with people today have been trained on at least 45 TB of text data.

The transformer also introduces a technique called "self-attention". This lets the model use the other words in a passage to help work out the meaning of each word, so it understands what we say much better.
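To make this less abstract, here is a minimal numpy sketch of scaled dot-product self-attention, the computation at the heart of the transformer. Everything concrete here, the sequence length of 4, the vector size of 8, and the random weight matrices, is invented for illustration; a real GPT stacks many such layers with learned weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of word vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv         # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # how relevant each word is to every other
    weights = softmax(scores)                # each row becomes a probability distribution
    return weights @ V                       # each output mixes information from all words

# Toy example: 4 "words", each represented by an 8-dimensional vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8): one enriched vector per word
```

The key point is the `weights @ V` step: every word's new representation is a weighted blend of all the words around it, which is exactly how the model lets context shape meaning.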
Of course, GPT is still being optimized. GPT-4, for example, has stronger logical reasoning and can even understand the content of pictures; its prospects are immense. In fact, language models like GPT, which have an enormous number of parameters and require training on vast amounts of text, are called large language models. Besides GPT, Alibaba's PLUG, Huawei's Pangu-α, and Baidu's ERNIE 3.0 are all large language models. With the help of these large language models, our work and lifestyle may undergo tremendous changes. Are you ready?

Planning and production
This article is a work of the Science Popularization China-Creation Cultivation Program
Produced by: Science Popularization Department of the China Association for Science and Technology
Producer: China Science and Technology Press Co., Ltd., Beijing Zhongke Xinghe Culture Media Co., Ltd.
Author: Beijing Yunyuji Culture Communication Co., Ltd.
Reviewer: Qin Zengchang, Associate Professor, School of Automation Science and Electrical Engineering, Beihang University
Planning: Fu Sijia
Editor: Fu Sijia
Proofreader: Xu Lailinlin