Can AI "copy" human smiles in advance? Is it possible for AI to integrate into the human social world?

Can AI "copy" human smiles in advance? Is it possible for AI to integrate into the human social world?

The emergence of large language models (LLMs) such as ChatGPT has given robots human-like language abilities. However, when robots talk to humans, their facial expressions still appear unnatural, even unsettling.

This undoubtedly dampens people's willingness to communicate with machines and makes interaction between the two difficult.

Therefore, in a future era of human-machine coexistence, it is crucial to design robots that can not only make a variety of facial expressions but also know when to use them.

Now, a research team from Columbia University and its collaborators has taken an important step forward by creating a robot called Emo that can predict human facial expressions and produce them in sync. It can even anticipate an upcoming smile about 840 milliseconds (roughly 0.8 seconds) before the person smiles.

Emo can make eye contact with people and uses two artificial intelligence (AI) models to predict and "copy" a person's smile before it appears. The research team said this is a major step toward robots that can accurately predict human facial expressions, improve interactions, and build trust between humans and robots.

The related research paper, titled "Human-robot facial coexpression," was published today in the scientific journal Science Robotics. Yuhang Hu, a Ph.D. student in the Department of Mechanical Engineering at Columbia University, is the first author and co-corresponding author, and his advisor, Professor Hod Lipson of Columbia University, is the other corresponding author.

Image: Yuhang Hu and Emo face to face. (Source: Creative Machines Lab)

In a FOCUS article published in Science Robotics at the same time, Rachael Jack, Professor of Computational Social Cognition at the University of Glasgow, commented:

"Human social interactions are inherently multimodal, involving a complex combination of visual and auditory signals, and while the study by Hu and colleagues focused on a single modality - facial expressions - their work makes a significant contribution to the development of more complex social synchronization skills across multimodal signals ."

In her view, although this is a complex interdisciplinary task, "it is possible to truly integrate social robots into the human social world."

Emo smiled, but it was more than just a smile

If you walked up to a robot with a human head and it smiled at you, what would you do? You would probably smile back, perhaps thinking that the two of you were genuinely communicating.

But how does the robot know how to do this? Or better yet, how does it know how to make you smile back?

To do this, Yuhang Hu and his colleagues needed to solve two major challenges: one was how to mechanically design an expressive robot face, which involves complex hardware and actuation mechanisms; the other was knowing which expressions to generate so that they look natural, timely, and genuine.

According to the paper, Emo is equipped with 26 actuators, a soft silicone skin on its head, and a magnetic attachment system for easy customization and quick maintenance. To achieve more realistic interactions, the research team integrated a high-resolution camera into the pupil of each of Emo's eyes, enabling the eye contact that is essential for non-verbal communication.

Figure | Robot face platform

In addition, they developed two AI models: one predicts human facial expressions by analyzing subtle changes in the target face, and the other generates the motor commands needed to produce the corresponding expressions on the robot. The first model was trained on online videos of human expressions, while the second was trained by letting the robot watch its own expressions on a live camera feed. The team demonstrated the effectiveness of both models through quantitative evaluations against baseline methods.

Figure | Model architecture: inverse model (A) and prediction model (B)
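
As a concrete illustration of how such a two-model pipeline can fit together, here is a minimal sketch. The landmark-based expression representation, module names, layer sizes, and dimensions are assumptions for illustration, not the architecture described in the paper.

```python
# Illustrative sketch only: a prediction model that anticipates the peak of a
# human expression from its first frames, and an inverse model that turns a
# desired expression into commands for the robot's 26 facial actuators.
import torch
import torch.nn as nn

N_ACTUATORS = 26        # number of facial actuators reported for Emo
EXPR_DIM = 226          # hypothetical size of a facial-landmark expression code


class PredictionModel(nn.Module):
    """Predicts the target (peak) expression from a few early frames."""
    def __init__(self, n_frames: int = 5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_frames * EXPR_DIM, 256), nn.ReLU(),
            nn.Linear(256, EXPR_DIM),
        )

    def forward(self, early_frames):          # (batch, n_frames, EXPR_DIM)
        return self.net(early_frames)         # (batch, EXPR_DIM)


class InverseModel(nn.Module):
    """Maps a desired expression code to normalized actuator commands."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EXPR_DIM, 256), nn.ReLU(),
            nn.Linear(256, N_ACTUATORS), nn.Sigmoid(),   # commands in [0, 1]
        )

    def forward(self, expression):
        return self.net(expression)


# Inference flow: observe the onset of a human expression, predict its peak,
# then drive the actuators toward it before the expression fully forms.
predict, inverse = PredictionModel(), InverseModel()
early_frames = torch.randn(1, 5, EXPR_DIM)        # stand-in for real landmark frames
motor_commands = inverse(predict(early_frames))   # shape (1, 26)
```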

To train Emo to make facial expressions, the research team placed Emo in front of a camera and had it make random movements. After a few hours, Emo had learned the relationship between motor commands and the facial expressions they produce, much as humans practice expressions by looking in a mirror. The team calls this "self-modeling," similar to how humans imagine themselves making a specific expression.
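
A minimal sketch of that self-modeling loop, under the same assumptions as above, might look like the following. The helper functions standing in for the robot's control and vision stack are hypothetical placeholders (here they return dummy data so the sketch runs), not code from the paper.

```python
# Hedged sketch of "self-modeling": the robot issues random motor commands
# ("motor babbling"), watches its own face, and learns a mapping from the
# resulting expression back to the command that produced it.
import torch
import torch.nn as nn

N_ACTUATORS, EXPR_DIM = 26, 226   # 26 actuators; expression size is an assumption


def send_motor_commands(cmd):
    pass                                   # placeholder: drive the facial actuators


def capture_and_extract_landmarks():
    return torch.rand(EXPR_DIM)            # placeholder: camera + landmark detector


def collect_self_model_data(n_samples=2000):
    commands, expressions = [], []
    for _ in range(n_samples):
        cmd = torch.rand(N_ACTUATORS)      # random ("babbled") motor command
        send_motor_commands(cmd)
        expressions.append(capture_and_extract_landmarks())
        commands.append(cmd)
    return torch.stack(expressions), torch.stack(commands)


# Inverse model: desired expression -> motor commands, learned by regression.
inverse_model = nn.Sequential(nn.Linear(EXPR_DIM, 256), nn.ReLU(),
                              nn.Linear(256, N_ACTUATORS), nn.Sigmoid())
expressions, commands = collect_self_model_data()
optimizer = torch.optim.Adam(inverse_model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(inverse_model(expressions), commands)
    loss.backward()
    optimizer.step()
```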

The team then played videos of people’s facial expressions and had Emo observe them frame by frame. After a few hours of training, Emo was able to predict people’s facial expressions by observing tiny changes in their faces.
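
The prediction side can, in principle, be trained in a similarly simple supervised way: use the early frames of each recorded expression as input and the eventual peak expression as the target. The sketch below assumes the same landmark representation as above and is not the authors' exact training procedure.

```python
# Hedged sketch: train a model to map the first few frames of an unfolding
# human expression to its eventual peak, so the robot can start moving its
# actuators (reportedly ~840 ms) before the expression fully appears.
import torch
import torch.nn as nn

EXPR_DIM, N_FRAMES = 226, 5    # assumed landmark size and observation window

prediction_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(N_FRAMES * EXPR_DIM, 256), nn.ReLU(),
    nn.Linear(256, EXPR_DIM),
)
optimizer = torch.optim.Adam(prediction_model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()


def training_step(video_landmarks):
    """video_landmarks: (batch, T, EXPR_DIM) landmark sequences from videos."""
    early = video_landmarks[:, :N_FRAMES]   # onset of the expression
    peak = video_landmarks[:, -1]           # final / peak expression as the label
    optimizer.zero_grad()
    loss = loss_fn(prediction_model(early), peak)
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage with random stand-in data in place of real video landmarks.
loss = training_step(torch.randn(8, 30, EXPR_DIM))
```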

In Yuhang Hu's view, accurately predicting human facial expressions is an important breakthrough in human-robot interaction. "When robots interact with people in real time, it not only improves the quality of interaction, but also helps build trust between people and robots. In the future, when you interact with a robot, it will observe and interpret your facial expressions just like a real person."

It is worth mentioning that the potential impact of this research may extend beyond robotics to fields such as neuroscience and experimental psychology.

For example, a robotic system that can anticipate and synchronize facial expressions could serve as a tool for studying the mirror neuron system. By interacting with participants while measuring brain activity, researchers can gain insight into the neural correlates of social interaction and communication.

In the field of psychology, robots with the ability to predict and synchronize facial expressions can be used as educational tools to help people with autism develop better social communication skills. Studies have shown that robots can effectively engage children with autism spectrum disorder (ASD) and promote their social interactions.

Shortcomings and Prospects

Although Emo can already predict human facial expressions and respond to them quickly and in sync, it is still far from fully capturing the richness of human facial communication, and simple mimicry coming from an adult-looking robot can even feel off-putting.

However, the research team believes that just as infants learn to imitate their parents before they can produce facial expressions on their own, robots must first learn to predict and imitate human expressions before they can mature into more spontaneous, self-driven expressive communication.

In future work, they hope to expand Emo's range of expressions and to train Emo to produce expressions based on what humans say. They are working to integrate verbal communication into Emo and to connect it to large language models such as ChatGPT.

However, they also noted that the facial expressions a robot imitates must be chosen with care. For example, certain facial gestures, such as smiling, nodding, and maintaining eye contact, usually elicit natural responses and are viewed positively in human communication. In contrast, expressions such as pouting or frowning should be imitated cautiously, since they may be misinterpreted as sarcastic or convey unintended emotions.

In addition, how human users perceive these expressions is the ultimate measure of success. An important step in the future is to verify the emotional effects of these expressions when humans and robots interact in various situations in the real world to determine their psychological validity.

The study also has certain limitations, one of which is that "the model's predictions and facial expression imitations may lack cultural sensitivity."

It is well known that different cultures may attach different norms and meanings to certain facial expressions. For example, while a smile is often considered a sign of happiness or friendliness in many cultures, in others it can signal embarrassment or uncertainty. Similarly, direct eye contact may be seen as a sign of confidence and honesty in some cultures but as rude or confrontational in others.

Future work could explore incorporating cultural context into the model. One possible approach is to include datasets from different cultural backgrounds and incorporate an understanding of cultural norms into the algorithm.

Figure | Yuhang Hu working in Hod Lipson's laboratory. (Source: John Abbott/Columbia Engineering School)

Finally, one topic that cannot be avoided is that as robots become ever more capable of behaving like humans, researchers must consider the ethical issues associated with this technology. Preventing possible abuse, such as deception or manipulation, requires a strong ethical framework and oversight.

Nevertheless, this research is very exciting. As the research team said:

“We are approaching a future where robots can seamlessly integrate into our daily lives, providing companionship, assistance, and even empathy. Imagine a world where interacting with a robot is as natural and comfortable as talking to a friend.”

Reference Links:

https://www.science.org/doi/10.1126/scirobotics.adi4724

https://www.science.org/doi/10.1126/scirobotics.ado5755

https://www.engineering.columbia.edu/news/robot-can-you-say-cheese

https://www.creativemachineslab.com/
