The "many-body problem" (also called the N-body problem) sounds simple but is notoriously difficult for modern mathematics. A many-body problem involves multiple interacting entities. In physics, there is no general closed-form (analytical) solution to the three-body problem (see: https://en.wikipedia.org/wiki/Three-body_problem). Seemingly simple problems like this expose the limitations of our analytical tools. That does not make them unsolvable; it just means we must resort to approximations and numerical techniques. The three-body problem of the sun, the moon, and the earth can be computed numerically with enough accuracy to help astronauts land on the moon.

In deep learning, an N-body problem of our own is emerging. Many of the more advanced systems now deal with multi-agent settings, where each agent may have goals (i.e., objective functions) that cooperate or compete with a global goal. In multi-agent deep learning systems, and even in modular deep learning systems, researchers need to design scalable methods of cooperation. Johannes Kepler University, Microsoft, DeepMind, and OpenAI have all recently published papers exploring aspects of this problem.

A team at Johannes Kepler University that includes Sepp Hochreiter (co-inventor of the LSTM) has proposed using simulated Coulomb forces (i.e., electrostatic forces whose magnitude is inversely proportional to the square of the distance) as an alternative objective function for training generative adversarial networks (GANs). Finding the equilibrium between two adversarial networks is a hot research topic; even this two-body problem of deep learning is quite difficult. The study found that this approach prevents the undesirable phenomenon of "mode collapse". Moreover, the setup converges toward an optimal solution: the potential has only one local minimum, which happens to be the global one.
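To make the electrostatic analogy concrete, here is a toy sketch (my own simplified version, not the paper's exact Plummer kernel or training procedure): real samples act as positive charges, generated samples as negative charges, and the generator would be trained to drive the resulting potential toward zero everywhere.

```python
import numpy as np

def smoothed_kernel(a, b, eps=1.0):
    # Simplified smoothed inverse-distance kernel. The Coulomb GAN paper
    # uses a Plummer kernel; this is an illustrative stand-in whose eps
    # term likewise avoids the singularity at zero distance.
    d = np.linalg.norm(a - b)
    return 1.0 / (d + eps)

def potential(x, real_samples, fake_samples, eps=1.0):
    # Real samples contribute positive charge, generated samples negative.
    # When the generated distribution matches the real one, the two terms
    # cancel and the potential vanishes everywhere.
    pos = sum(smoothed_kernel(x, r, eps) for r in real_samples) / len(real_samples)
    neg = sum(smoothed_kernel(x, f, eps) for f in fake_samples) / len(fake_samples)
    return pos - neg
```

In the actual method the discriminator learns to approximate this potential field and the generator follows its gradient, which is what yields the single (global) minimum.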
This approach may be an improvement over the Wasserstein objective function (a.k.a. the Earth Mover's Distance), which was extremely popular a few months ago. The team has named their creation the "Coulomb GAN".

Microsoft Maluuba published a paper introducing an AI system that plays Ms. Pac-Man at a superhuman level. The version the researchers tackled is close to the original arcade game: the character collects pellets and fruit while avoiding the ghosts. The paper, titled "Hybrid Reward Architecture for Reinforcement Learning", introduces a reinforcement learning (RL) implementation that departs from the typical architecture (the Hybrid Reward Architecture, or HRA). What is surprising about this paper is the number of objective functions used: the solution employs roughly 1,800 value functions, that is, one micro-agent for each pellet, each fruit, and each ghost. Microsoft's research shows that breaking the problem into sub-problems across thousands of micro-agents actually works. The coupling between agents is left implicit in this model.

DeepMind addresses the coordination of multiple agents through sharing. In the paper "Distral: Robust Multitask Reinforcement Learning", the researchers coordinate agents with an approach inspired by "mind melding": each agent is encapsulated, but some information is allowed to pass through the agent's encapsulation boundary, in the hope that this narrow channel is more scalable and robust. In the authors' words: "We propose a new approach for joint training of multiple tasks, which we refer to as Distral (distill and transfer learning). Instead of sharing parameters between the different networks, we propose to share a 'distilled' policy that captures common behaviour across tasks."
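The shared-policy idea described above can be sketched as a KL-divergence regularizer that pulls each task policy toward a common distilled policy. The `distill` step below (a normalized geometric mean of the task policies) is a hypothetical stand-in for the paper's actual distillation procedure, shown here only to make the coupling mechanism concrete.

```python
import numpy as np

def kl(p, q):
    # KL divergence between two discrete action distributions.
    return float(np.sum(p * np.log(p / q)))

def distral_regularizer(task_policies, shared_policy, beta=0.5):
    # Each per-task policy is penalized for straying from the shared
    # "distilled" policy; beta controls how strongly the tasks are
    # coupled through this narrow channel.
    return beta * sum(kl(p, shared_policy) for p in task_policies)

def distill(task_policies):
    # Hypothetical distillation step: a normalized geometric mean of the
    # task policies, used as a simple centroid for this sketch.
    g = np.exp(np.mean([np.log(p) for p in task_policies], axis=0))
    return g / g.sum()
```

Each task's training loss would then be its ordinary RL objective plus this regularizer, so tasks interact only through the shared policy rather than through shared weights.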
"Each network is trained to solve its own task while constrained to stay close to the shared policy", which in turn is trained by distillation to be the centroid of all task policies. The result is faster and more stable learning, validating the narrow-channel approach. The open question in these multi-agent (N-body) problems is the nature of the coupling: the DeepMind paper demonstrates the effectiveness of loose coupling relative to the naive, tightly coupled approach (i.e., weight sharing).

OpenAI recently published an interesting paper on multi-agent systems in which agents learn models of the other agents in the system. The paper, titled "Learning with Opponent-Learning Awareness", shows that "tit-for-tat" strategies emerge when multi-agent systems are given this kind of social awareness. Although the results still have robustness issues, it is a fascinating approach because it addresses a key dimension of intelligence (see: multi-dimensional intelligence).

In summary, many leading deep learning research groups are actively exploring modular deep learning: multi-agent systems composed of different objective functions, all working together toward a single global objective. There are still many open problems, but it is clear that this approach is very promising. Last year, I found developments in game theory to be the most instructive for future progress; this year we will see many more attempts to explore loosely coupled multi-agent systems.