ChatGPT is so powerful, but it can't say exactly 10 words?

Recently, you may have heard about the various chatbots that are popular on the Internet. Behind them is the GPT model. GPT (Generative Pre-trained Transformer) is a type of model; the "GPT" discussed below refers specifically to the model behind ChatGPT. As a powerful large language model, GPT has demonstrated astonishing capabilities. It has become a helpful assistant in many people's lives, for example for writing emails, learning English, and reading papers.

As a chatbot, it has matched or even surpassed human performance on many tasks, which is genuinely impressive. But today we are not here to praise it; instead, we will look at a seemingly simple task that GPT is completely unable to do.

To be clear, GPT does recognize numbers. If you ask it like this:

Huh? GPT, which is supposedly "well-versed in the past and the present", surely understands what "10 words" means, so why can't it correctly output a sentence of exactly 10 words?

Autoregressive Model

To explain why GPT is not up to such a simple task, we first need to start from the principle underlying GPT: the autoregressive model. Don't be scared off by this abstract-sounding term; the concept is actually very simple.

What an autoregressive model does is essentially a game of guessing words. Take a small scene from an English class as an example:

After ruling out some letters that were obviously very unlikely, the student finally guessed that the second letter was "h".

The next step is to consider which letters or words are more common, and therefore more probable, after "ch". The student now has to weigh how probable each candidate letter is and, naturally, guess the one with the larger probability, since that gives the best chance of being right. Consulting the dictionary again, the student estimated the probabilities from word frequencies and used them to guess the third and fourth letters in turn, arriving at "chat".

The student's guessing game is actually a vivid illustration of how the autoregressive model, and hence GPT, works. When GPT generates text, it is playing the same guessing game, except that the letters are replaced by tokens.

Token: a term from natural language processing that refers to the smallest unit of text processing. A token may be a character, a word, or even a short phrase.

More generally, GPT computes the probability of each possible next output given the current context and samples its output according to those probabilities. That is, it generates according to P(current output | current context).
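
To make this concrete, here is a minimal sketch of "sample the output according to P(current output | current context)". The probability table for the context "ch" is entirely made up for illustration; a real GPT computes such a distribution over tens of thousands of tokens with a neural network.

```python
import random

# Made-up distribution over the next letter given the context "ch"
# (a real model would compute this over its whole token vocabulary)
next_probs = {"a": 0.45, "e": 0.25, "i": 0.15, "o": 0.10, "u": 0.05}

def sample_next(probs):
    """Draw one candidate with probability proportional to its weight."""
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

context = "ch"
print(context + sample_next(next_probs))  # most often "cha", on the way to "chat"
```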

Of course, in real use there is no teacher to correct GPT's answers. But the teacher's corrections in the guessing example correspond to the dataset GPT is trained on. During training, GPT uses this dataset to adjust P(current output | current context) so that its answers become more accurate.

The prompt we give GPT plays the role of the first letter "c" that the teacher announces at the start. GPT then begins to generate output from this initial input: it first guesses the opening piece of its output, the equivalent of "h", and then, treating "ch" as the new "current context", guesses the following letters/tokens step by step.

1. So when do we stop?

Sharp readers may have noticed a problem. Without a teacher's correction, GPT could seemingly guess forever: it just keeps guessing the next token, one after another, and never stops. Yet although GPT often rambles on repetitively, it does stop in the end. What brings this guessing game to a halt?

Here is how GPT solves the problem. Engineers realized that making GPT stop its endless guessing is very simple: just "expand" the token vocabulary so that "stop" itself becomes a new token (the end-of-sequence token). GPT then keeps guessing until it guesses the "stop" token.
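
A toy generation loop, with an invented stand-in for the model, shows how such a "stop" token ends the guessing game. Everything here is illustrative, not GPT's actual code.

```python
import random

EOS = "<stop>"  # the extra "stop" token added to the vocabulary

def fake_next_token_probs(context):
    """Stand-in for GPT: the longer the output, the likelier the stop token."""
    p_stop = min(0.9, 0.1 * len(context))
    return {EOS: p_stop, "blah": 1.0 - p_stop}

def generate(prompt, max_tokens=50):
    context = list(prompt)
    while len(context) < max_tokens:
        probs = fake_next_token_probs(context)
        token = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
        if token == EOS:        # guessed the stop token: generation ends here
            break
        context.append(token)   # otherwise keep guessing the next token
    return context

print(generate(["Hello", ","]))
```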

2. It's time to stop, but the probabilities won't let it

Now that we know how the autoregressive model works, we can return to the original question. In the example above, GPT's "mind" may have gone through a calculation like this: P(? | "Please say a sentence that contains exactly 10 Chinese characters. Life is more than just the present")

GPT is quite ruthless about this. It does not care that you asked for exactly 10 Chinese characters; it does not really care about your needs at all. It only sees the probability distribution and samples according to it.

When GPT had spoken nine Chinese characters and should have ended the output after just one more, it looked up its probability table and found that the probability of producing only one more character and then stopping was too small (which also means such cases were too rare in the training corpus), so it simply kept going and ignored the earlier requirement of "exactly 10 Chinese characters".
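
As a made-up numerical illustration of that moment: suppose that after nine characters the "finish with one more character" continuation has probability 0.03 and "keep talking fluently" has probability 0.97. Sampling will then break the 10-character limit almost every time.

```python
import random

# Invented probabilities for the two kinds of continuation after 9 characters
probs = {"end after exactly one more character": 0.03,
         "keep going with a longer, more fluent continuation": 0.97}

counts = {k: 0 for k in probs}
for _ in range(10_000):
    pick = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
    counts[pick] += 1
print(counts)  # the "keep going" branch wins the vast majority of samples
```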

Two capabilities that GPT lacks

1. Lack of planning

At each step, the autoregressive model samples based only on the current information (the current context); there is no overall planning during sampling. From a human perspective, if the requirement is exactly 10 words, you should not blurt out 9 of them in one breath; you should weigh every word and check whether the remaining words can still form a complete, fluent sentence. The autoregressive model (GPT) cares about none of this. It is blind and short-sighted: each time it only looks at P(current output | current context) and never asks whether the probability of the whole response, P(total output | initial context), is good enough. A toy comparison of these two criteria is sketched below.
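
Here is a two-step example with invented numbers. Choosing the locally most probable option at every step (a stand-in for GPT's step-by-step decisions) can end up with a less probable whole sentence than planning over complete outputs would give.

```python
# First word and, given it, the second part of a two-step "sentence".
# All probabilities are invented purely for illustration.
p_first = {"sunny": 0.6, "life": 0.4}
p_second = {
    "sunny": {"day today.": 0.5, "and very hot.": 0.5},
    "life":  {"is short.": 0.9, "goes on.": 0.1},
}

# Step-by-step: maximise P(current output | current context) at each step.
w1 = max(p_first, key=p_first.get)               # "sunny" (0.6)
w2 = max(p_second[w1], key=p_second[w1].get)     # 0.5
stepwise_prob = p_first[w1] * p_second[w1][w2]   # 0.6 * 0.5 = 0.30

# Planned: score every complete sentence, i.e. P(total output | initial context).
best = max(((a, b, p_first[a] * p_second[a][b])
            for a in p_first for b in p_second[a]),
           key=lambda t: t[2])                   # ("life", "is short.", 0.36)

print("step-by-step:", w1, w2, stepwise_prob)
print("planned     :", best)
```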

2. Lack of reflection and revision

Autoregressive models also lack the ability to "reflect and revise". Humans are basically capable of reflection: if you say or do something wrong, you will at least think to yourself, "Oops, that was wrong, I need to fix it."

In the task of saying exactly 10 words, if you blurt out too much in one breath, for example "Today's weather is very nice, the sun is so...", what do you do? It's already at 10 words. Am I about to fail this little Turing test? Quick, revise it: delete "very", and there is room for one more word.

GPT, however, is stubbornly "honest". Every token it has said simply gets added to the new "current context"; it never deletes or revises content it has already sampled. Guessing tokens one by one, it lets its mistakes pile up. In other words, although GPT can see its previous output, it has no ability to reflect on it and revise it.
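
A small sketch of this append-only behaviour. The model here is a trivial stub; the point is that the context only ever grows, and nothing already generated is edited or removed.

```python
def fake_model_next_token(context):
    """Stand-in for GPT: returns some next token given the current context."""
    return f"word{len(context)}"

context = ["Please", "say", "exactly", "10", "words", ":"]  # the prompt
for _ in range(5):
    token = fake_model_next_token(context)
    context.append(token)   # append only: there is no context.pop() or edit
print(context)
```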

3. Are all AI models like this?

Not all machine learning models share this shortcoming. For example, the Go-playing AI AlphaGo (nicknamed the "Dog" in Chinese), in its Monte Carlo tree search algorithm, will go back and revise earlier choices when it finds that a line of play has too low a winning rate.
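
As a contrast, here is a heavily simplified backtracking search (not AlphaGo's actual Monte Carlo tree search) that can undo an earlier choice when a partial line turns out badly, which is exactly what an autoregressive sampler never does. The moves and scores are invented.

```python
def search(options, depth, path, min_score):
    """Find a sequence of `depth` moves whose total score reaches min_score."""
    if depth == 0:
        return list(path) if sum(s for _, s in path) >= min_score else None
    for move, score in options:
        path.append((move, score))
        found = search(options, depth - 1, path, min_score)
        if found is not None:
            return found
        path.pop()   # revise: undo a choice that led to a dead end
    return None

options = [("solid move", 1), ("sharp move", 3)]
print(search(options, depth=3, path=[], min_score=8))
# -> [("sharp move", 3), ("sharp move", 3), ("sharp move", 3)]
```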

This also reminds us to cultivate good planning skills and the ability to reflect on and improve ourselves. Otherwise, even someone as "well-read" as GPT will be unable to complete the simple task of saying exactly 10 words.

Planning and production

Source: Institute of Physics, Chinese Academy of Sciences (ID: cas-iop)

Editor: He Tong

Proofreading: Xu Lai, Lin Lin
