The World Health Organization's (WHO) artificial-intelligence health assistant SARAH listed fake names and addresses of non-existent clinics in San Francisco. Meta's short-lived science chatbot Galactica fabricated academic papers and generated Wikipedia articles on the history of space bears. In February, Air Canada was ordered to comply with a refund policy fabricated by its customer-service chatbot. Last year, a lawyer was fined for submitting court documents filled with false judicial opinions and legal citations made up by ChatGPT.

Examples of large language models (LLMs) making things up are now commonplace. The problem is that they do so with a straight face: most of the fabricated content looks like the truth, making it hard to tell real from fake. Sometimes this can be laughed off, but once it touches professional fields such as law and medicine, the consequences can be very serious.

How to detect hallucinations in large models quickly and effectively has therefore become a hot research topic that technology companies and research institutions around the world are racing to solve.

Now, a new method proposed by a team at the University of Oxford can help detect hallucinations in large models quickly: they attempt to quantify how likely an LLM is to be hallucinating when it generates a given answer, and therefore how much that answer can be trusted, improving the accuracy of its question answering.

The research team says the method can identify "confabulations" in LLM-generated biographies and in answers on topics such as trivia, general knowledge and the life sciences.

The work matters because it provides a general way to detect LLM hallucinations without human supervision or domain-specific knowledge. This helps users understand the limitations of LLMs and encourages their application across fields.

The paper, titled "Detecting Hallucinations in Large Language Models Using Semantic Entropy", has been published in the journal Nature.

In a News & Views article published alongside the paper, Karin Verspoor, Dean of the School of Computing Technologies at RMIT University, noted that having the task completed by one LLM and evaluated by a third LLM amounts to "fighting fire with fire". She also cautioned that "using an LLM to evaluate an LLM-based method seems to be circular and may be biased". The authors, however, argue that their method can help users understand when LLM answers should be treated with caution, which in turn means LLMs can be trusted in a wider range of applications.

How to quantify the degree of hallucination in an LLM?

Let's first understand how large-model hallucinations arise. LLMs are designed to generate new content. When you ask a chatbot a question, its answer is not simply looked up in a database; it has to be produced through a great deal of numerical computation. These models generate text by predicting the next word in a sentence. Inside the model are hundreds of millions of numbers, like a giant spreadsheet, recording the probability of one word following another. During training, these values are continually adjusted so that the model's predictions match the language patterns found in the vast amounts of text on the Internet.
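As a rough illustration of that prediction-and-sampling step, here is a toy sketch in Python. It is not the internals of any real model: the prompt, the tiny vocabulary and the probability table are all invented for the example, and a real LLM computes its distribution with a neural network over tens of thousands of tokens.

```python
import random

# Toy sketch (not any real model's internals): an LLM produces text by
# repeatedly sampling the next word from a probability distribution that
# depends on the text so far. The table below is invented for illustration.
NEXT_WORD_PROBS = {
    "The capital of France is": {"Paris": 0.90, "Lyon": 0.06, "Berlin": 0.04},
}

def sample_next_word(prompt: str, temperature: float = 1.0) -> str:
    probs = NEXT_WORD_PROBS[prompt]
    # Temperature reshapes the distribution: values below 1 sharpen it
    # (the most likely word dominates), values above 1 flatten it.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs), weights=weights, k=1)[0]

# The same prompt can yield different words on different runs --
# the "statistical slot machine" in action.
for _ in range(5):
    print(sample_next_word("The capital of France is"))
```

Because each word is drawn from a probability distribution, running the same prompt several times can produce different continuations, which is exactly the slot-machine behaviour described next.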
A large language model is therefore a kind of "statistical slot machine" that generates text according to statistical probability: pull the lever, and out comes a word.

Most existing methods for detecting LLM hallucinations rely on supervised learning, which requires large amounts of labelled data and generalises poorly to new domains. In this study, the research team instead used semantic entropy, which needs no labelled data and performs well across multiple datasets and tasks.

Semantic entropy measures the uncertainty in the meaning of text generated by a language model. It assesses the reliability of the model's output by considering how the meaning of words and sentences can vary across differently worded responses. The method detects "confabulation", a subcategory of "hallucination" that refers to inaccurate and arbitrary content, typically produced when the LLM lacks the relevant knowledge. It takes into account the subtleties of language: the same answer can be expressed in many different ways, and different wordings may or may not share the same meaning.

Figure | A brief introduction to semantic entropy and confabulation detection.

As the figure shows, a traditional entropy-based uncertainty measure treats differently worded answers as different answers: it counts "Paris", "This is Paris" and "Paris, the capital of France" as three distinct responses even though they mean the same thing, so it is a poor fit for language tasks. Semantic entropy instead clusters answers with the same meaning before computing the entropy. Low semantic entropy means the large language model is highly certain about the meaning of what it generates.

The semantic-entropy approach can also detect confabulations in long passages. The team first decomposes a long generated answer into small factual units. For each factoid, an LLM generates a series of questions to which that factoid could be an answer, and the original LLM then produces M candidate answers to each question. The team then computes the semantic entropy of these answers, including the original factoid itself: a high average semantic entropy suggests that the questions related to the factoid attract confabulated answers. In the example shown, the sampled answers convey the same meaning even though their wording differs considerably, so semantic entropy correctly classifies Fact 1 as non-confabulated, something a traditional entropy measure might miss.

The research team compared semantic entropy with other detection methods in two main settings.

1. Detecting confabulations in question answering and math problems

Figure | Detecting confabulations in sentence-length generations.

Semantic entropy outperforms all baseline methods on both AUROC (area under the receiver operating characteristic curve) and AURAC (area under the rejection accuracy curve), indicating that it predicts LLM errors more accurately and improves the model's accuracy when it is allowed to refuse to answer.

2. Detecting confabulations in biographies

Figure | Detecting GPT-4 confabulations in paragraph-length biographies.

The discrete variant of the semantic-entropy estimator outperforms the baseline methods on both AUROC and AURAC (scores on the y-axis), with both scores significantly higher than those of the baselines. Semantic entropy is more accurate whenever the model answers more than 80% of the questions; the P(True) baseline achieves better accuracy only when the 20% of answers judged most likely to be confabulated are rejected.
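To make the core clustering idea concrete, here is a minimal sketch of a semantic-entropy calculation. It is not the authors' implementation: in the paper the "same meaning" test is a bidirectional-entailment check performed with a natural-language-inference model, and cluster probabilities can come from the model's token likelihoods; the `means_the_same` function below is a crude, hypothetical string-normalisation rule, and cluster probabilities are estimated from sample frequencies (roughly in the spirit of the discrete variant), purely so the example is self-contained and runnable.

```python
import math
from collections import Counter

def means_the_same(a: str, b: str) -> bool:
    """Stand-in for the paper's bidirectional-entailment check (normally
    done with a natural-language-inference model). Here: a crude
    normalisation rule, just to keep the sketch self-contained."""
    def norm(s: str) -> str:
        return (s.lower()
                 .replace("this is ", "")
                 .replace(", the capital of france", "")
                 .strip(". "))
    return norm(a) == norm(b)

def semantic_entropy(answers: list[str]) -> float:
    """Cluster sampled answers by meaning, then compute the entropy of
    the clusters' empirical probabilities."""
    clusters: list[list[str]] = []
    for ans in answers:
        for cluster in clusters:
            if means_the_same(ans, cluster[0]):
                cluster.append(ans)
                break
        else:                       # no existing cluster matched
            clusters.append([ans])
    probs = [len(c) / len(answers) for c in clusters]
    return -sum(p * math.log(p) for p in probs)

def naive_entropy(answers: list[str]) -> float:
    """Baseline: treat every distinct string as a different answer."""
    probs = [c / len(answers) for c in Counter(answers).values()]
    return -sum(p * math.log(p) for p in probs)

# Ten sampled answers, worded differently but all meaning the same thing.
samples = ["Paris", "This is Paris", "Paris, the capital of France",
           "Paris", "Paris", "This is Paris", "Paris",
           "Paris, the capital of France", "Paris", "This is Paris"]

print(f"naive entropy:    {naive_entropy(samples):.2f}")    # high: three distinct strings
print(f"semantic entropy: {semantic_entropy(samples):.2f}")  # zero: a single meaning cluster
```

The naive entropy is high because the three wordings are counted as different answers, while the semantic entropy is zero because they all fall into one meaning cluster; a high semantic entropy over the sampled answers is what flags a likely confabulation, and for paragraph-length text the same calculation is applied to each decomposed factoid via the automatically generated questions.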
Shortcomings and Prospects

The probabilistic approach proposed by the research team takes semantic equivalence into account and successfully identifies a key class of hallucinations: those that arise from a lack of knowledge in the LLM. Such hallucinations are at the core of many current failures and will remain a problem even as models improve, because humans cannot supervise every context and case. Confabulation is especially prominent in question answering, but it occurs in other domains as well.

Notably, the semantic-entropy method does not rely on domain-specific knowledge, which suggests that similar progress can be made in further applications such as abstractive summarisation. Extending the method to other input variants, such as restatements or counterfactual scenarios, would not only allow cross-checking but also enable scalable oversight in the form of debate, pointing to the method's broad applicability and flexibility.

The success of semantic entropy at detecting errors further confirms that LLMs have some capacity to "know what they don't know", perhaps more than previous studies have suggested.

However, the method mainly targets hallucinations caused by gaps in the LLM's knowledge, such as inventing facts out of thin air or attributing something to the wrong person. It may work less well for other kinds of hallucination, such as those caused by erroneous training data or flaws in model design. In addition, the semantic clustering step relies on natural-language-inference tools, whose accuracy in turn affects the estimate of semantic entropy.

In future work, the researchers hope to apply the semantic-entropy method in more fields and to combine it with other approaches to improve the reliability and trustworthiness of LLMs. For example, it could be combined with techniques such as adversarial training and reinforcement learning to further improve LLM performance, or with other indicators to evaluate LLM credibility more comprehensively.

It is important to remember, though, that as long as LLMs are based on probability, there will always be some randomness in what they generate. Roll 100 dice and you get one pattern; roll them again and you get another. Even if the dice are weighted so that certain patterns appear more often, as they are in LLMs, you will not get exactly the same result every time. And even if a model is wrong only once in every thousand or hundred thousand uses, that adds up to a great many errors given how often this technology is used every day. The more accurate these models become, the easier it is to let our guard down.

What do you think about hallucinations in large models?

References:
https://www.nature.com/articles/s41586-024-07421-0
https://www.technologyreview.com/2023/12/19/1084505/generative-ai-artificial-intelligence-bias-jobs-copyright-misinformation/