When it comes to emotional robots, science fiction has created many touching characters. The one closest to us is MOSS in the film "The Wandering Earth", who remarked that "to keep humanity rational forever is indeed a luxury," yet faithfully accompanied Liu Peiqiang to the very end. Back in reality, however, improving the emotional intelligence of robots remains a problem that scientists have yet to solve. Among the major scientific questions, engineering challenges, and industrial technology problems for 2024 released by the China Association for Science and Technology, the top ten frontier scientific questions include "research on digital humans and robots with both emotion and intelligence," proposed by the China Society of Image and Graphics. So what level of "emotional intelligence" have digital humans and robots reached today? And in the eyes of scientists, what would the ideal digital human or robot look like? We invited Gao Yue, one of the proposers of this question, a standing committee member of the Affective Computing and Understanding Committee of the China Society of Image and Graphics and an associate professor at Tsinghua University, to talk about it. The following is organized from Gao Yue's narration.

The question of "digital humans and robots with both emotion and intelligence" is the fruit of collective thinking within the Affective Computing and Understanding Committee of the China Society of Image and Graphics, and a key scientific question that many of our colleagues have explored for years.

Artificial intelligence is developing very rapidly. Since the emergence of systems like AlphaGo, important breakthroughs have been made in many scenarios, and in the past two years technologies such as embodied intelligence have also advanced quickly, greatly promoting intelligent decision-making, analysis, and processing. It should be pointed out that in daily life we have a great deal of emotional exchange with the people, objects, and events around us. How to make these technologies and devices intelligent while also capable of emotional communication is very important, and it is a question we have been thinking about for a long time.

Image: the robot "Number 5" in the film "Short Circuit" (source: film screenshot)

Ideally, what do digital humans and robots with both emotion and intelligence look like?

Digital humans exist mainly in virtual space, where there is more room for design and where scenes that are rare in daily life, or richer than it, can be presented. Robots, by contrast, live in real space: they can be seen and touched, and they can interact with us behaviorally and even physically, which poses greater challenges.

From an application perspective, digital humans and robots keep expanding into the scenarios we hope for. There are now many digital humans on Internet platforms, reporting news or telling entertaining stories. A couple of years ago, such digital humans could open their mouths to speak or make a few gestures, but the movements were so mechanical that you could tell at a glance they were fake. Today, many digital humans are highly realistic, and their speech is accompanied by rich emotional expressions and movements.

Image: the 3D AI synthetic news anchor previously launched by Xinhua News Agency (source: Xinhua News Agency)
At the same time, many automated robots already exist in daily life, such as factory robots and coffee-making robots, but their interaction with us is still fairly mechanical: you issue a command, it gives feedback and executes it, making you a cup of coffee or an auto part. Humans, however, need a great deal of emotional interaction with the outside world and hope to blend into the surrounding environment. You can play with the cats and dogs at home, and they keep you company. Many people who watched the cartoon "Doraemon" probably wanted a robot cat of their own, not only because it could pull whatever you wanted out of its pocket, but because it was like a friend, comforting you when you were sad and sharing your joy when you were happy. This is a very typical example of robots integrating into life, and integrating very naturally.

Image source: screenshot from the Doraemon anime

Emotional interaction matters greatly to us humans. In daily life we all hope for a partner with whom we can communicate emotionally. Whether robot or digital human, we hope it is not merely a tool for completing tasks but something that can truly integrate into our lives. The intelligent companion robots that have drawn so much attention over the past two years, for instance, must be more than cold machines. If your coffee machine suddenly walked over and asked whether you wanted coffee, the scene would be rather odd; but if it could ask your opinion, understand what you are thinking, and even infer your current state from your movements and living habits, that is a direction we can work toward.

How do we score the "emotional intelligence" of digital humans and robots?

How to score the "emotional intelligence" of robots or digital humans is also a question the academic community cares about deeply, because it is harder than evaluating intelligence. To evaluate intelligence quantitatively, we can design different metrics for different tasks; autonomous driving, for example, is graded into levels from L0 to L5. Emotion and mood, however, remain hard to quantify. For a reaction to some event, happy or unhappy, we could of course build a scale rating happiness from 1 to 10, but such a scale is genuinely hard to define.

We all hope digital humans and robots can communicate seamlessly with humans. The classic Turing test evaluated whether a machine could be told apart from a human, and evaluation along the emotional dimension is similar: how do we assess the strength of emotional communication and the effectiveness of emotional motivation? We can already evaluate many specific tasks, such as judging emotion from facial expressions, but a more general and complete analysis model still needs exploration, along with standardized evaluations that can state what emotional level a system has reached. Such standards will certainly be needed in the future.
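To make the task-level evaluation concrete, here is a minimal sketch in Python of how overall accuracy and per-class recall might be computed for a facial-expression emotion classifier. The label set, the `predict_emotion` stub, and the toy samples are all illustrative assumptions, not any real benchmark.

```python
from collections import Counter

# Illustrative label set; real benchmarks differ in categories and granularity.
EMOTIONS = ["happy", "sad", "angry", "neutral"]

def predict_emotion(features):
    """Hypothetical stand-in for a trained facial-expression classifier."""
    # A real system would run a vision model here; we just return a fixed label.
    return "neutral"

def evaluate(samples):
    """Compute accuracy and per-class recall over (features, true_label) pairs."""
    correct = 0
    per_class_total = Counter()
    per_class_hit = Counter()
    for features, true_label in samples:
        per_class_total[true_label] += 1
        if predict_emotion(features) == true_label:
            correct += 1
            per_class_hit[true_label] += 1
    accuracy = correct / len(samples)
    recall = {c: per_class_hit[c] / per_class_total[c] for c in per_class_total}
    return accuracy, recall

# Toy data standing in for an annotated facial-expression dataset.
samples = [([0.1, 0.9], "happy"), ([0.8, 0.2], "neutral"), ([0.5, 0.5], "sad")]
acc, rec = evaluate(samples)
print(f"accuracy={acc:.2f}, per-class recall={rec}")
```

Per-task metrics like this are roughly the level of evaluation that exists today; the missing piece the question points to is a standardized, cross-task score of overall emotional ability.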
What is so difficult about having both emotion and intelligence?

As mentioned earlier, digital humans now look very real. That realism comes from work in computer graphics and virtual reality on making movements more continuous and simulated scenes more lifelike, which is the appearance side of the problem. When we care about "emotion and intelligence," however, we should attend not only to visual authenticity but also to emotional expression and to accurate judgment of emotion when responding to external feedback.

To let digital humans and robots do this better, we need to understand the generation mechanisms and representations of human emotion at a more fundamental level. In other words, besides teaching digital humans and robots to analyze and judge human emotion from external signals, we also need to understand emotional states from the human brain itself, for example what changes occur in different scenarios. Patients with certain conditions, such as children with depression or autism, differ in emotional state from most people, so recognizing those differences well is very important. A robot built to assemble cars or pour coffee might, at the same time, help spot precursors of disease, or provide diagnostic support and emotional care for people with autism or cognitive impairment.

We hope robots and digital humans can keep moving closer to humans in both emotion and intelligence. Of course, we still have a long way to go in exploring our own emotional cognition. On one hand, brain-cognition research must advance: the brain is so complex that we are far from fully understanding its operating mechanisms, or even our own cognition, the causes of our emotions, and how to analyze them. On the other hand, new technologies are needed to further strengthen the capabilities of digital humans and robots. These two lines are advancing in parallel; how to bring them together, and where the road finally leads, will also take a long time to explore.

If we collect a large amount of data for affective computing and emotion judgment, train models on that data, and then use the models to judge what a person would normally feel in a given situation, I would call that the primary stage. For more general use in the future, we will also need to consider emotional differences among different groups of people, and even across regions and specific environments. This personalization and diversity make affective computing and emotion judgment very difficult.
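As a concrete picture of that primary stage, the sketch below fits one population-level classifier on pooled, labeled emotion data and uses it to predict the typical emotion for a new situation. The three-dimensional feature encoding and the labels are invented for illustration; real affective-computing pipelines use far richer signals (face, voice, physiology) and models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training set: each row is a feature vector summarizing a
# situation (e.g. pooled audio/visual cues), each label an emotion.
X_train = np.array([
    [0.9, 0.1, 0.0],   # bright voice, smiling
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.3],   # flat voice, frowning
    [0.2, 0.8, 0.4],
])
y_train = np.array(["happy", "happy", "sad", "sad"])

# "Primary stage": one model for everyone, trained on pooled data.
model = LogisticRegression().fit(X_train, y_train)

# Judge the "normal" emotion for a new situation.
x_new = np.array([[0.7, 0.3, 0.1]])
print(model.predict(x_new))        # e.g. ['happy']
print(model.predict_proba(x_new))  # class probabilities

# The unsolved step the text points to: conditioning on the individual,
# group, or region (personalized models), which one pooled model misses.
```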
From this perspective, it is indeed hard to make digital humans and robots interact with people emotionally, because everyone is an independent individual, and personalization is itself difficult. When a comedian tells a joke at a stand-up show, every audience member reacts differently, yet you can hardly repeat the same joke mechanically a hundred times to each person just to collect feedback for research. What matters in such cases is reading external reactions in real time and adjusting one's pace accordingly. Humans are fairly good at this, but getting robots to give well-judged, personalized feedback still faces many technical difficulties.

For example, when we observe the outside world we take in large amounts of visual, auditory, and other sensory data, and our reactions cannot simply be sorted into a few discrete categories; the internal connections are more complicated. It is like tossing a coin: it should land heads or tails, yet you find it can also hover in a wobbling state in between, which is much harder to judge.

Moreover, sensory data is often incomplete. We can hardly place a person inside a device bristling with cameras and sensors for observation; it would cause great discomfort and is difficult to arrange. Understanding a person's current state from limited local information, and training intelligent models to give feedback on that basis, is harder still.
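One common way to cope with such incomplete input, sketched below under assumed interfaces, is late fusion that simply skips missing modalities: each available modality is encoded to a fixed-size vector and the vectors are averaged, so the system degrades gracefully rather than failing when, say, audio is absent. The encoders are placeholders of my own, not a published method.

```python
import numpy as np

DIM = 4  # shared embedding size (illustrative)

# Placeholder encoders; real systems would use trained vision/audio/text models.
def encode_vision(frame): return np.ones(DIM) * 0.2
def encode_audio(clip): return np.ones(DIM) * 0.7
def encode_text(utterance): return np.ones(DIM) * 0.5

ENCODERS = {"vision": encode_vision, "audio": encode_audio, "text": encode_text}

def fuse(observations):
    """Average the embeddings of whatever modalities are present.

    `observations` maps modality name -> raw input; missing modalities
    are simply absent. Returns None if nothing was observed at all.
    """
    embeddings = [ENCODERS[m](x) for m, x in observations.items() if m in ENCODERS]
    if not embeddings:
        return None
    return np.mean(embeddings, axis=0)

# A full observation versus a degraded one with no audio.
print(fuse({"vision": "frame", "audio": "clip", "text": "hello"}))
print(fuse({"vision": "frame", "text": "hello"}))  # still works without audio
```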
There are further problems to solve once these models and methods are embedded in an actual digital human or robot that must interact with you. Computing efficiency is one: if you say something and the machine takes ten seconds to respond, it feels stuck and the experience is poor. Making feedback truly real-time requires improvements in both computing efficiency and hardware, including chips.

How do we solve it?

Behind "digital humans and robots with both emotion and intelligence" lie genuinely complex problems, and many technical difficulties must be resolved step by step. Several aspects are involved.

On one hand, richer data is needed, since many artificial intelligence methods depend on data. In recent years more and more researchers have turned to affective computing, so more data is available to support model training. Advances in hardware, computing power, and sensor technology have also made data easier to collect: EEG data, for instance, used to be very hard to gather, but EEG acquisition equipment is now much easier to use.

Research on emotion in medicine, psychology, and brain science has also progressed rapidly, and much of this work is discussed and explored together with doctors in the relevant fields. Extensive follow-up data helps us model and understand people's emotional states, including the differences between internal states and external expression. It also helps clarify which factors relate to emotion, and how external signals such as speech and behavior relate to changes in the brain, which in turn helps us judge emotion. This interdisciplinary integration offers an important path to breakthroughs in emotional cognition and understanding.

Advances in hardware are equally important. A digital human may not need a physical form at all; it can interact with you through headphones, screens, and other devices. A robot, however, must solve the problem of its physical carrier. In some scenarios a robot can lift a car, yet in others you want it to shake your hand gently. In many practical activities we need robots to do what we want accurately and in real time, which will certainly require the involvement of machinery, materials, sensors, and other fields.

Planning and production
Interviewed expert丨Gao Yue, Associate Professor at Tsinghua University; Standing Committee Member of the Affective Computing and Understanding Committee, China Society of Image and Graphics
Compiled丨Yang Yang
Planning丨Sun Jingya, Dingzong (China Society of Image and Graphics)
Editor丨Yang Yang
Proofreading丨Xu Lai, Lin Lin