"I want to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe." Say this to a person and they will easily pick out three intentions: go to Quyuan Fenghe, open the sunroof, and play Jay Chou's old songs. But if the listener is a machine, such as a car, can it understand the request and respond with the corresponding operations?

Voice is one of the most natural ways to interact inside a car because it is convenient and safe to operate. It has almost become a standard feature of in-vehicle solutions across the industry, although the voice solutions offered by different companies vary widely. The multi-task semantic understanding described at the beginning, for example, is still a relatively new application, and few companies have been able to implement it. Most manufacturers focus on improving the accuracy of speech recognition and natural language understanding.

Chen Hualiang, head of AliOS data intelligence, revealed that the team is currently upgrading its voice technology, with a focus on improving scene-based intelligent semantic understanding (SSLU: Scene-based Spoken Language Understanding). SSLU is an upgrade built on top of natural language understanding, and it includes stronger multi-domain task processing. A common dialogue system is generally composed of several modules: automatic speech recognition (ASR), natural language understanding (NLU), dialogue management (DM), natural language generation (NLG), and text to speech (TTS).
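To make the five-module structure concrete, here is a minimal sketch of how such stages might be chained. Everything in it is an illustrative assumption rather than AliOS code: the `Turn` record, the stage functions, and the keyword table are hypothetical, and each stage is a stub (the "audio" is simply UTF-8 text, and keyword spotting stands in for a trained model).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    """State passed through the pipeline for one user utterance (hypothetical)."""
    audio: bytes
    text: str = ""
    intents: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)
    reply_text: str = ""
    reply_audio: bytes = b""


def asr(turn: Turn) -> Turn:
    # Stub: a real recognizer decodes speech; here the "audio" is UTF-8 text.
    turn.text = turn.audio.decode("utf-8")
    return turn


def nlu(turn: Turn) -> Turn:
    # Stub: keyword spotting stands in for a trained understanding model.
    keywords = {"sunroof": "open_sunroof", "songs": "play_music"}
    lowered = turn.text.lower()
    turn.intents = [intent for word, intent in keywords.items() if word in lowered]
    return turn


def dm(turn: Turn) -> Turn:
    # Stub: map every recognized intent to one service call.
    turn.actions = [f"call:{intent}" for intent in turn.intents]
    return turn


def nlg(turn: Turn) -> Turn:
    # Stub: template-based reply text.
    if turn.intents:
        turn.reply_text = "OK: " + ", ".join(turn.intents) + "."
    else:
        turn.reply_text = "Sorry, I did not understand."
    return turn


def tts(turn: Turn) -> Turn:
    # Stub: a real synthesizer produces a waveform; here we just encode the text.
    turn.reply_audio = turn.reply_text.encode("utf-8")
    return turn


PIPELINE: List[Callable[[Turn], Turn]] = [asr, nlu, dm, nlg, tts]


def run_pipeline(audio: bytes) -> Turn:
    turn = Turn(audio=audio)
    for stage in PIPELINE:
        turn = stage(turn)
    return turn
```

Calling `run_pipeline(b"Open the sunroof and play some old songs")` passes the same `Turn` object through each stage in order, which is the main point of the sketch: every module consumes the output of the one before it.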
It is reported that AliOS has implemented its own self-play approach to generating dialogue training data, along with crowdsourcing solutions. Combining a comprehensive understanding of people, cars, and scenarios, it migrates linguistic and semantic prior knowledge and knowledge-graph knowledge into the dialogue system and trains end-to-end deep learning dialogue models. The aim is to improve scenario coverage and dialogue fluency so that the system can better understand voice commands in context.

Taking the command from the beginning as an example, AliOS first recognizes each word of the sentence "I want to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe". It then combines the user's current usage scenario to understand what the sentence means and calls the related services to carry out the compound operation: navigating to Quyuan Fenghe, opening the sunroof, and playing Jay Chou's old songs.

Chen Hualiang said: "Spoken language is usually vague and incomplete in meaning. It is not enough to achieve understanding of spoken expression by relying solely on massive corpus data. We believe that only with more information such as people, cars, and scenes can we achieve scene-based intelligent natural language understanding capabilities and provide users with a better voice experience."

He added that AliOS has so far focused on optimizing and upgrading its voice technology in several high-frequency in-vehicle scenarios, such as navigation, music, audiobooks, and radio, to support multi-condition search, multi-tasking within navigation, changing preferences during navigation, multi-slot queries, and more.
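As a rough illustration of what "multi-task" understanding with slots and scene information involves, the sketch below splits the example command into three intents. It is deliberately rule-based and is not the end-to-end deep learning model the article describes; the `Intent` type, the `RULES` table, the `understand` function, and the `current_location` scene key are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Intent:
    domain: str
    action: str
    slots: Dict[str, str]


# Hypothetical hand-written patterns; a production system would learn these.
RULES = [
    ("navigation", "navigate",
     re.compile(r"on the way to (?P<destination>[A-Z][\w ]*?)(?:[.,!?]|$)")),
    ("vehicle", "open_sunroof",
     re.compile(r"open the sunroof")),
    ("music", "play",
     re.compile(r"listen to (?P<artist>[A-Z][\w ]*?)'s (?P<era>old )?songs")),
]


def understand(utterance: str, scene: Dict[str, str]) -> List[Intent]:
    """Extract every matching intent, then fill gaps from the scene context."""
    intents: List[Intent] = []
    for domain, action, pattern in RULES:
        match = pattern.search(utterance)
        if match is None:
            continue
        # Keep only the named groups that actually matched.
        slots = {k: v.strip() for k, v in match.groupdict().items() if v}
        if domain == "navigation":
            # The speaker never states a starting point; take it from the scene.
            slots.setdefault("origin", scene["current_location"])
        intents.append(Intent(domain, action, slots))
    return intents


utterance = ("I want to open the sunroof and listen to Jay Chou's old songs "
             "on the way to Quyuan Fenghe.")
for intent in understand(utterance, {"current_location": "here"}):
    print(intent)
```

The design point is that a single utterance yields a list of intents rather than one, and that slots missing from the spoken words (here, the trip's origin) are filled from scene information instead of being requested from the user.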
A few examples make this concrete. Asked "How far is it from here to Zhongshan Park?", AliOS understands the question as the distance from the current location to Zhongshan Park. Told "Delete the previous waypoint", it removes the most recently added waypoint. Told "Play some songs that suit the occasion for me", it chooses songs based on the current weather and time.

In addition, AliOS has achieved multimodal fusion of voice, vision, gestures, and other interaction methods at the underlying system level, aiming to give users an immersive experience. This will be widely used in scenarios such as in-car music, news broadcasts, audiobooks, and in-car navigation.

The author is a winner of Toutiao's Qingyun Plan and Baijiahao's Bai+ Plan, the 2019 Baidu Digital Author of the Year, Baijiahao's Most Popular Author in the Technology Field, the 2019 Sogou Technology and Culture Author, and the 2021 Baijiahao Quarterly Influential Creator. Other awards include the 2013 Sohu Best Industry Media Person, third place in the Beijing round of the 2015 China New Media Entrepreneurship Competition, the 2015 Guangmang Experience Award, third place in the finals of the 2015 China New Media Entrepreneurship Competition, and the 2018 Baidu Dynamic Annual Powerful Celebrity.