Why do natural language interaction tools disappoint people more the more human-like they become?

With Siri as a precedent, anthropomorphism has become a must-have for natural language interaction tools. Whether it is an AI voice assistant serving individual users, an intelligent customer service system run by an enterprise, or a home appliance with voice functions, each is expected to have its own branded character and persona, almost as if the machine had come to life.

We usually assume that anthropomorphizing natural language interaction tools eases users' "uncanny valley" discomfort and makes them more willing to talk to these tools. However, recent research suggests this may not be the case.

The Thousand Routines of Becoming Human

First, let's look at the "thousand routines" used to personify natural language interaction tools.

The first step is to give the tool a harmless name.

We often say that once you pick up a small animal and give it a name, it will most likely become your pet. The same is true for AI. Once a natural language interaction tool has a name, it is more or less destined to go further and further down the road of coming to life. In Chinese, these names usually begin with "Xiao" ("little"), which makes the tools seem small and harmless, and the names are typically gender-neutral, which keeps them uncontroversial.

The second step is to use speech generation technology to imitate human tone.

Once the tool has a name, a cold electronic voice will no longer do. Even the earlier approach to speech generation, which combined recordings of real people with rule matching, sounds somewhat stiff. This is where neural network speech generation, represented by Google's WaveNet, comes in. By capturing many features of how real people speak and taking into account semantics, part of speech, grammar, context and other parameters, it generates a natural speaking tone complete with pauses and moments of apparent thought, as heard in Google Assistant.

The third step is to make the conversation itself more human-like.

In natural language interaction, speech is generated from text content. So beyond an anthropomorphic "speaking tone", the "speaking content" must also be more human-like. Here the maturity of technologies such as semantic understanding, multi-turn dialogue, and natural language generation becomes very important. For example, the full-duplex interaction that Microsoft applies in Xiaoice supports "thinking while listening" and "rhythm control": it tracks the user's intent throughout the whole dialogue, reduces the user's waiting time, proactively raises new topics to break a silence, and adjusts the content and timing of its answers on its own. When such dialogue content is voiced through speech generation technology, it can pass for the real thing and make people believe they are talking to a human.

The last step is to put on "human skin".

Beyond the core technology, some peripheral touches make natural language interaction tools feel more human. For example, designers give them a cute cartoon image, add a few scripted responses so that they can act cute and coy, and add details to the interface so that people forget they are talking to a machine.

With these steps, you can basically create a natural language interaction tool that "takes human form".

The more human, the cuter?

Managing expectations with natural language interaction tools

But one question we have rarely asked is whether, in actual use, natural language interaction tools really get better the more human-like they are. Recently, the Media Effects Research Laboratory at Pennsylvania State University ran an experiment on exactly this.

The researchers told volunteers that they would be buying a digital camera on an e-commerce platform and would need to consult online customer service. Every customer service agent was in fact an intelligent natural language interaction system, but the researchers varied the agents in how human they appeared and how responsive they were. Different groups of volunteers met different agents. Some agents stated outright during the conversation that they were machines, some showed only the dialog box content, and some "disguised" themselves as humans with real-person avatars and names.

At the same time, agents at each level of anthropomorphism also differed in responsiveness. Some answered user questions quickly and accurately, while others failed to understand what the user said and dodged the question.
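The design described above can be pictured as a small grid of conditions. The sketch below is only an illustration: the label strings are paraphrased from this article, and the assumption that every identity cue was paired with every responsiveness level is ours, not something the article states.

```python
from itertools import product

# Hypothetical labels paraphrased from the article, not the study's own terms.
IDENTITY_CUES = ("declared_machine", "dialog_only", "disguised_human")
RESPONSIVENESS = ("high", "low")


def build_conditions():
    """Return every identity-cue / responsiveness pairing as a dict.

    Assumes a fully crossed design, which the article does not confirm.
    """
    return [
        {"identity_cue": cue, "responsiveness": level}
        for cue, level in product(IDENTITY_CUES, RESPONSIVENESS)
    ]


conditions = build_conditions()
for condition in conditions:
    print(condition["identity_cue"], condition["responsiveness"])
```

Comparing satisfaction across rows that share the same responsiveness level is what isolates the effect of how human the agent appears.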

When the subjects were surveyed about their satisfaction after the interaction, the results were surprising.

Intuitively, we would expect that the more responsive an intelligent customer service agent is, the more satisfied people will be. But in practice, at the same level of responsiveness, user satisfaction also depends on how human the agent appears. For example, given the same interaction content, participants who clearly know they are dealing with a machine might rate their satisfaction at 80 points, while an agent disguised as a human might receive only 60. The reason is that when a machine agent shows more human characteristics, users expect more of it and hope it can solve their problems the way a person would. If they do not get the answers they want, their disappointment is magnified.
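The reasoning above can be made concrete with a toy calculation. This is a minimal sketch and not the researchers' model: the function `satisfaction`, the `penalty` weight, and the expectation values of 70 and 100 are all hypothetical, chosen only so that the output lines up with the illustrative 80 and 60 points mentioned above.

```python
def satisfaction(delivered, expected, penalty=1.0):
    """Toy score on a 0-100 scale.

    Delivered quality minus a penalty for any unmet expectation.
    Exceeding expectations earns no bonus in this simplified sketch.
    """
    shortfall = max(0.0, expected - delivered)
    score = delivered - penalty * shortfall
    return max(0.0, min(100.0, score))


# Same delivered quality, different expectations (hypothetical numbers).
declared_machine = satisfaction(delivered=80, expected=70)
disguised_human = satisfaction(delivered=80, expected=100)

print(declared_machine)  # 80.0
print(disguised_human)   # 60.0
```

In this sketch the service itself is identical in both calls. Only the expectation set by the human-like presentation changes, and that alone lowers the score.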

We feel the same way when we use natural language interaction ourselves. When a voice assistant or intelligent customer service agent cannot solve the problem and falls back on acting cute and telling jokes, our irritation tends to rise sharply.

Ultimately, how human-like natural language interaction should be is a question of "user expectation management". Raising user expectations too far can backfire.

It is easy to be a person, but difficult to be a tool

An important trend today, however, is that the human-likeness and the usefulness of natural language interaction are developing unevenly.

From the perspective of the difficulty of technological development, making natural language interaction tools closer to humans is much easier than making them more effective.

Technologies such as Google's WaveNet and Microsoft's full-duplex interaction are already enough to bring pronunciation patterns, conversational rhythm and other details very close to those of humans. In the future, combined with computer vision and even robot manufacturing technology, we may be able to create a conversational partner that is hard to tell apart from a human.

In fact, we can already see visually humanized AI "talkers" today, such as AI news anchors or Sophia, the robot built by Hanson Robotics.

However, the problem-solving ability of these tools has not kept pace. Specifically, there are still gaps in their understanding of human language data, especially the relatively scarce corpora for minority languages, the elderly, and children, and their grasp of vocabulary in specialized fields is not comprehensive enough. When a conversation turns to a vertical industry, AI often runs into knowledge blind spots.

As a result, helping the "instrumentality" of natural language interaction catch up with its "humanity" may remain an industry trend for a long time to come. Examples include building knowledge graphs for individual industry segments, accumulating vocabulary libraries, and collecting corpora in different dialects and languages from different groups of people for AI training.

As the technology keeps advancing, people's expectations of natural language interaction tools will inevitably keep rising. To avoid the "short board effect", in which the weakest capability limits the whole, we should perhaps devote more energy to pursuing things other than "human nature".
