Nature News: AI beats human mathematicians to solve classic mathematical problems for the first time

Large artificial intelligence (AI) models have outperformed human mathematicians.

In a paper published today in Nature, a research team at Google DeepMind introduced FunSearch, a method for searching for new solutions in mathematics and computer science. It works by pairing a pre-trained large language model (LLM) with an automatic "evaluator" that guards against hallucinations and incorrect ideas. By iterating back and forth between these two components, initial solutions evolve into new knowledge.

The study marks the first time LLMs have been used to make a new discovery on a challenging open problem in science or mathematics. FunSearch found new solutions to the cap set problem, a long-standing open problem in mathematics. To demonstrate the practical usefulness of FunSearch, the researchers also used it to discover more efficient algorithms for the "bin packing" problem, which has ubiquitous applications, such as improving the efficiency of data centers.

Scientific progress has always relied on the ability to share new understanding. What makes FunSearch a particularly powerful scientific tool is that the program it outputs reveals how its solution was built, not just what the solution is. The authors of the paper said, "Hopefully this will inspire further insights from scientists using FunSearch, driving a virtuous cycle of improvement and discovery."

“The solutions generated by FunSearch are much richer in concept than just lists of numbers,” said Jordan Ellenberg, co-author and professor of mathematics at the University of Wisconsin-Madison. “When I studied them, I learned something.”

Finding the largest cap sets, solving the "bin packing" problem

FunSearch uses an evolutionary approach powered by LLMs to promote and develop the highest scoring ideas. These ideas are expressed as computer programs so that they can be run and evaluated automatically.

First, the user writes a description of the problem in the form of code, which includes the process of evaluating programs and the seed program used to initialize the program pool.

FunSearch is an iterative process. In each iteration, the system selects some programs from the current pool of programs and feeds them back to the LLMs. Subsequently, the LLMs build on this creatively and generate new programs that are automatically evaluated. The best programs are added back to the existing pool of programs, creating a self-improvement cycle.
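The loop described above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the "programs" are short lists of integers rather than real code, the hypothetical `propose` function is a random mutation standing in for the LLM, and `evaluate`, `search`, and all parameter values are invented for this sketch rather than taken from FunSearch.

```python
import random

def evaluate(program):
    """Hypothetical evaluator: higher scores are better."""
    # Toy objective: reward candidates that are close to a hidden target.
    target = [3, 1, 4, 1, 5]
    return -sum(abs(a - b) for a, b in zip(program, target))

def propose(parents, rng):
    """Stand-in for the LLM: copy one parent and perturb one entry."""
    child = list(rng.choice(parents))
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

def search(seed_program, iterations=2000, pool_size=10, seed=0):
    rng = random.Random(seed)
    # The pool starts with the user-supplied seed program.
    pool = [(evaluate(seed_program), seed_program)]
    for _ in range(iterations):
        # Select a few of the best programs to build on.
        parents = [p for _, p in sorted(pool, reverse=True)[:3]]
        child = propose(parents, rng)
        # Score the new candidate automatically and add it to the pool.
        pool.append((evaluate(child), child))
        # Keep only the highest-scoring candidates.
        pool = sorted(pool, reverse=True)[:pool_size]
    return max(pool)

seed_program = [0, 0, 0, 0, 0]
best_score, best_program = search(seed_program)
```

Because the pool only ever drops its lowest-scoring members, the best score can never fall below that of the seed program; this is the self-improvement cycle in miniature.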

FunSearch uses Google's PaLM 2, but it is compatible with other code-trained LLMs.

Figure | FunSearch process

The research focused on the cap set problem, an open challenge that has puzzled mathematicians in multiple research fields for decades, and which the renowned mathematician Terence Tao once described as his favorite open problem.

The problem involves finding the largest set of points (called a cap set) in a high-dimensional grid in which no three points lie on a line. The problem is important because it serves as a model for other problems in extremal combinatorics, which studies how large or small a collection of numbers, graphs, or other objects can be. Brute-force computational approaches do not work: the number of possibilities to consider quickly becomes greater than the number of atoms in the universe.
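To make the definition concrete, here is a minimal sketch of a checker for such sets (usually called cap sets). It assumes the standard formulation, in which the grid is the set of vectors of length n with entries 0, 1, 2 and arithmetic is done modulo 3; three distinct points then lie on a line exactly when they sum to zero in every coordinate. The function name `is_cap_set`, the example sets, and the choice of dimension 8 are illustrative, and 10^80 is only a commonly quoted rough estimate of the number of atoms in the observable universe.

```python
from itertools import combinations

def is_cap_set(points):
    """Return True if no three distinct points are collinear (mod 3)."""
    pts = set(map(tuple, points))
    for a, b in combinations(pts, 2):
        # The unique third point on the line through a and b is -(a + b) mod 3.
        c = tuple((-(x + y)) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True

# A cap set of size 4 in dimension 2.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Three points on a diagonal line: not a cap set.
diagonal = [(0, 0), (1, 1), (2, 2)]

# Why brute force fails: in dimension 8 (illustrative choice) the grid has
# 3**8 = 6561 points, so there are 2**6561 candidate subsets.
n_points = 3 ** 8
too_many = 2 ** n_points > 10 ** 80
```

Checking a given set is cheap; the difficulty lies in searching the enormous space of candidate sets.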

Figure | Interactive chart showing the evolution from a seed program (top) to a new high-scoring function (bottom). Each circle is a program and its size is proportional to the score assigned to it.

However, FunSearch generated solutions, in the form of programs, that in some settings discovered the largest cap sets ever found. This is the largest increase in the size of cap sets in the past 20 years. FunSearch also outperformed state-of-the-art computational solvers.

In addition, the researchers explored the flexibility of FunSearch by applying it to real-world challenges in computer science. The "bin packing" problem asks how to pack items of different sizes into the fewest possible bins, and it lies at the heart of many real-world problems.
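As a rough illustration of what a bin-packing heuristic looks like in code, the sketch below packs items one at a time, choosing among the bins that still have room by means of a scoring function. Framing the heuristic as a `priority` function is an assumption made for this example, as are the names `pack`, `best_fit`, and `worst_fit` and the sample data; the two rules shown are classic textbook heuristics, not the algorithm FunSearch discovered.

```python
def pack(items, capacity, priority):
    """Place each arriving item in the feasible bin with the highest priority."""
    bins = []  # remaining free space in each open bin
    for item in items:
        feasible = [i for i, rem in enumerate(bins) if rem >= item]
        if feasible:
            best = max(feasible, key=lambda i: priority(item, bins[i]))
            bins[best] -= item
        else:
            # No open bin has room, so open a new one.
            bins.append(capacity - item)
    return len(bins)

def best_fit(item, remaining):
    # Prefer the bin that would be left with the least free space.
    return -(remaining - item)

def worst_fit(item, remaining):
    # Prefer the bin that would be left with the most free space.
    return remaining - item

items = [4, 7, 3, 6]
bins_best = pack(items, capacity=10, priority=best_fit)    # uses 2 bins
bins_worst = pack(items, capacity=10, priority=worst_fit)  # uses 3 bins
```

Swapping the scoring function changes the number of bins used on the same input, which is why a search over such functions can yield better packing algorithms.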

It’s just the beginning

Discovering new mathematical knowledge and algorithms in different fields is a notoriously difficult task that is largely beyond the capabilities of state-of-the-art AI systems. To solve such challenging problems using FunSearch, the research introduces several key components.

It’s worth noting that FunSearch isn’t a black box that simply generates solutions to problems. Instead, it generates programs that describe how those solutions were arrived at, and this method of showing work is how scientists typically operate.

FunSearch tends to find solutions represented by highly compact programs, that is, solutions with low Kolmogorov complexity. Short programs can describe very large objects, which allows FunSearch to scale to large needle-in-a-haystack problems. This property also makes its program output easier for researchers to understand.
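The point that a short program can describe a very large object can be illustrated with a simple, well-known construction: in the standard modulo-3 formulation of the cap set problem, the set of all points whose coordinates are only 0 or 1 contains no three points on a line. The sketch below is an illustration chosen for this article, not a program produced by FunSearch; the names `zero_one_points` and `has_no_three_on_a_line` are hypothetical.

```python
from itertools import combinations, product

def zero_one_points(n):
    """A two-line description of a set with 2**n points."""
    return list(product((0, 1), repeat=n))

def has_no_three_on_a_line(points):
    """Check the cap set property with arithmetic modulo 3."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        # Third point on the line through a and b.
        c = tuple((-(x + y)) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True

small = zero_one_points(6)    # 64 points, cheap to verify directly
large = zero_one_points(12)   # 4096 points from the same short program
```

The description stays the same length while the object it generates doubles in size with every added dimension.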

More importantly, this interpretability of FunSearch programs can provide researchers with actionable insights. For example, when using FunSearch, some of its high-scoring outputs have interesting symmetries in their code.

Figure | Left: inspecting the code generated by FunSearch yields further actionable insights. Right: the original “admissible” set built using the (much shorter) program on the left.

The results on the cap set problem show that the FunSearch technique can go beyond established results on difficult combinatorial problems, where intuition is hard to build. The researchers expect this approach to be instrumental in new discoveries on similar theoretical problems in combinatorics and to open up new possibilities in areas such as communication theory.

Hard combinatorial problems such as online bin packing can also be tackled with other AI methods, such as neural networks and reinforcement learning. Those approaches have proven effective too, but they can require a lot of resources to deploy. FunSearch, on the other hand, outputs code that can be easily inspected and deployed, which means its solutions have the potential to be embedded in a variety of real-world industrial systems and bring rapid benefits.

FunSearch demonstrates that if one can guard against the hallucinations of LLMs, the power of these models can be harnessed not only to generate new mathematical discoveries, but also to reveal potentially effective solutions to important real-world problems.

The research team anticipates that it will become common practice to use LLM-driven approaches to generate efficient, customized algorithms for many problems in science and industry, both long-standing and new.

In fact, this is just the beginning. The researchers said: "We will also work to expand its capabilities to address a variety of society's pressing scientific and engineering challenges."

Reference Links:

https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/

https://www.nature.com/articles/s41586-023-06924-6
