"I'm going to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe." If you say this to a person, he will easily understand your three intentions: one, go to Quyuan Fenghe; two, open the skylight; three, listen to Jay Chou's old songs. But if we replace people with machines, such as cars, will the cars be able to understand and give corresponding operational feedback? As we all know, voice is naturally one of the most suitable ways of in-car interaction because of its convenient and safe operation. It has almost become the standard of in-vehicle solutions in the industry, although there are large differences in the voice solutions made by various companies. For example, the semantic understanding multi-tasking mentioned at the beginning is still a relatively new technology application in the industry. Few companies have been able to implement it. Most manufacturers focus on improving the accuracy of voice recognition and natural language understanding. Chen Hualiang, head of AliOS data intelligence, revealed that they are currently upgrading the technology of voice, focusing on improving the experience of scene-based intelligent semantic understanding (SSLU: Scene-based Spoken Language Understanding), which is an intelligent upgrade of language understanding based on natural language understanding, which includes the improvement of multi-domain task processing capabilities. Common dialogue systems are generally composed of several modules: automatic speech recognition (ASR), natural language understanding (NLU), dialogue management (DM), natural language generation (NLG) and text to speech (TTS). 
It is reported that AliOS has implemented a self-play approach to generating dialogue training data, together with a crowdsourcing solution. Combining a comprehensive understanding of people, cars, and scenarios, it transfers linguistic and semantic prior knowledge and knowledge-graph knowledge into the dialogue system and trains end-to-end deep learning dialogue models. This improves scenario coverage and dialogue fluency and lets the system better understand voice commands in the context of the scenario.

Take the command from the beginning as an example. AliOS first recognizes each word of the sentence "I want to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe", then uses the user's current scenario to understand its meaning, and finally calls the related services to carry out the combined operation: navigating to Quyuan Fenghe, opening the sunroof, and playing Jay Chou's old songs.

Chen Hualiang said: "Spoken language is usually vague and incomplete in meaning. It is not enough to achieve understanding of spoken expression by relying solely on massive corpus data. We believe that only with more information such as people, cars, and scenes can we achieve scene-based intelligent natural language understanding capabilities and provide users with a better voice experience."

He said that AliOS has so far concentrated on optimizing and upgrading its voice technology in several high-frequency in-vehicle scenarios, such as navigation, music, audiobooks, and radio. The upgrades cover multi-condition search, multi-tasking during navigation, changing preferences during navigation, multi-slot queries, and more.
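As an illustration of what a multi-intent, multi-slot result could look like for the example command, the following Python sketch splits the sentence into three structured intents. It uses hand-written regular expressions rather than the end-to-end deep learning models the article describes, and it ignores scene context entirely. The `Intent` class, the domain and action names, and the patterns are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Intent:
    domain: str                      # e.g. "navigation", "vehicle", "music"
    action: str                      # what the domain service should do
    slots: Dict[str, str] = field(default_factory=dict)

def parse_command(text: str) -> List[Intent]:
    """Rule-based stand-in for multi-intent understanding (illustrative only)."""
    intents: List[Intent] = []

    # Navigation: capture the capitalized place name after "on the way to".
    m = re.search(r"on the way to ([A-Z][\w ]*)", text)
    if m:
        intents.append(
            Intent("navigation", "route_to", {"destination": m.group(1).strip()})
        )

    # Vehicle control: a fixed phrase with no slots.
    if re.search(r"open the sunroof", text, re.IGNORECASE):
        intents.append(Intent("vehicle", "open_sunroof"))

    # Music: two slots, the artist and an optional "old" qualifier.
    m = re.search(r"listen to ([A-Z][\w ]*?)'s (old )?songs", text)
    if m:
        slots = {"artist": m.group(1)}
        if m.group(2):
            slots["era"] = "old"
        intents.append(Intent("music", "play", slots))

    return intents
```

Calling `parse_command` on the example sentence yields one navigation intent with a `destination` slot, one vehicle intent with no slots, and one music intent with `artist` and `era` slots. A dialogue manager could then dispatch each of these to its own service.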
A few examples illustrate this. Asked "How far is it from here to Zhongshan Park?", AliOS understands that the user wants the distance from the current location to Zhongshan Park. Told "Delete the previous waypoint", AliOS deletes the last waypoint. Asked to "Play some songs that suit the occasion for me", AliOS plays appropriate songs based on the current weather and time.

In addition, AliOS has now achieved multimodal fusion of voice, vision, gestures, and other interaction methods at the system level, striving to provide users with an immersive experience. This will be applied widely in scenarios such as in-car music, news broadcasts, audiobooks, and in-car navigation.