Recently, Musk showed off Tesla's strength at AI Day: humanoid robots, the Dojo supercomputer, and a demonstration of how the pure-vision approach works, all conveying his confidence in the technology. While many manufacturers have chosen LiDAR as their perception route, Tesla still insists on pure vision and keeps doubling down on it.

The underlying principle of autonomous driving is a combination of three steps: perception, decision-making, and execution. The perception layer uses sensors to gather information about the surrounding road; the data is processed on the vehicle and in the cloud; and execution commands are produced that let the car drive itself. Perception is the first of the three steps and a prerequisite for decision-making and execution.

At the perception level, two technical routes currently compete in the market: visual perception and LiDAR. The LiDAR camp argues that camera-based perception is not accurate enough, and that autonomy at L3 and above requires LiDAR. The vision camp argues that cameras capture rich environmental information, make objects easy to classify and label, and, most importantly, are cheap in a way LiDAR cannot match. Viewed from either technology or cost, the core difference between the two solutions is whether LiDAR is needed to achieve high-level autonomous driving. The two camps argue endlessly about which is better. So which route will have the last laugh?

LiDAR vs. Visual Perception: Performance Comparison

The LiDAR route is dominated by LiDAR itself, with millimeter-wave radar, ultrasonic sensors, and cameras as auxiliaries.
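The three-step loop described above can be sketched as a toy pipeline. Everything here is illustrative: the frame format, the 30 m safety gap, and the function names are assumptions for the sketch, not any vendor's actual API.

```python
# Toy sketch of the perception -> decision -> execution loop.
# All names and thresholds are illustrative assumptions.

def perceive(frame):
    """Pretend perception: distance (m) to the nearest detected obstacle."""
    return min(frame["detections"], default=float("inf"))

def decide(obstacle_distance, safe_gap=30.0):
    """Pretend planning: brake when an obstacle is inside the safe gap."""
    return "brake" if obstacle_distance < safe_gap else "cruise"

def execute(command):
    """Pretend actuation: forward the command to the vehicle actuators."""
    return f"actuator: {command}"

frame = {"detections": [52.0, 18.5]}
print(execute(decide(perceive(frame))))  # actuator: brake
```

The point of the sketch is only the data flow: sensors feed perception, perception feeds a decision, and the decision becomes an actuator command.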
A LiDAR unit perceives the environment by emitting laser pulses and measuring the time and phase difference between emission and return, from which it computes the relative distance between the vehicle and surrounding objects, enabling real-time environmental perception and obstacle avoidance. LiDAR offers long detection range, high accuracy, and strong resistance to interference; it can actively scan multiple objects and build a 3D model of the surroundings from the resulting point cloud. Even poor light at night does not affect detection. LiDAR is not bothered by darkness, but it is sensitive to weather: rain, snow, dust, and fog all degrade its recognition. A LiDAR solution fused with high-precision maps can effectively compensate for the vision solution's heavy dependence on the environment and on computing power. These performance advantages lead most carmakers to treat LiDAR as an indispensable perception device for L3 and above.

Visual perception is a camera-based solution, and cameras cost far less than LiDAR: a camera runs to tens of dollars, while a LiDAR unit costs several hundred, many times more. Camera technology is also maturing; high-resolution, high-frame-rate imaging makes the perceived environment richer. But cameras are limited in dark environments, where accuracy and safety suffer. Tesla's much-criticized phantom braking, for example, occurs in tunnels and under bridges where sudden shadows appear: because of how the camera works, the algorithm treats the shadow as an obstacle and the vehicle abruptly slows down on its own, creating a safety hazard.
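The time-difference ranging described above reduces to one formula: distance = (speed of light × round-trip time) / 2. A minimal sketch, with an illustrative 200 ns return time:

```python
# Minimal sketch of LiDAR time-of-flight ranging.
# The pulse travels to the target and back, so divide by 2.

C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_s: float) -> float:
    """Return target distance in metres from a round-trip time in seconds."""
    return C * round_trip_s / 2.0

# A return pulse arriving 200 ns after emission implies a target ~30 m away.
print(round(tof_distance(200e-9), 2))  # 29.98
```

Real sensors refine this with phase measurement and per-pulse timing electronics, but the geometry is exactly this halved round trip.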
On hardware performance alone, the camera in a vision solution falls far short of LiDAR. Software is what saves it: the vision solution relies on powerful algorithms to keep image processing and decision execution working. Compared with LiDAR, visual perception has obvious weaknesses: cameras depend on lighting conditions, have lower perception accuracy, are heavily dependent on algorithms and computing power, and face high barriers to data acquisition and algorithm iteration. On raw performance, LiDAR is clearly superior. Tesla has spent enormous sums on computing power and algorithms, yet it has always insisted on the visual perception route. What is its reasoning?

Tesla's Logic for the Pure Vision Route

In Musk's view, "pure visual perception is the path to real-world AI," and this reflects his way of solving problems. First-principles thinking means returning to the most basic facts of a matter, breaking it down into its elements for structural analysis, and from there finding the optimal path to the goal. When humans drive, we collect road information with our eyes and process it with our brains; an autonomous car, then, should also be able to drive safely through visual perception plus algorithmic processing. What Tesla wants to do is imitate the human ability to obtain information visually and use that to achieve autonomous driving.

Since camera perception is less accurate, Tesla relies on its unique data advantage and its capacity to build computing power and algorithms to smooth over this defect. On data: while other self-driving manufacturers are still collecting data in the road-testing phase, Tesla has accumulated a massive amount of real road data from the millions of camera-equipped cars it has sold worldwide.
This deep-learning training data has long since established a moat for Tesla's algorithms; other manufacturers cannot replicate the speed at which these samples accumulate or the efficiency of the resulting algorithms, and can only look on anxiously. On computing power, Tesla's newly built supercomputer Dojo is extremely powerful; it was set up specifically to train Tesla's entire autonomous driving stack, including Autopilot, in a concentrated way. At the camera level, Tesla has also innovated, using a "pseudo-LiDAR" technique that performs depth estimation on camera pixels and forms a LiDAR-like point cloud for 3D object detection, improving depth-estimation accuracy. The gap between LiDAR and cameras has begun to narrow.

People rely on vision when driving: our neural networks extract signals such as distance and speed from visual information, and Tesla's neural networks seem increasingly able to do the same. Tesla's vision route is gradually closing the gap with the LiDAR solution, but the price paid along the way is one that latecomers cannot follow or copy, which itself builds a strong barrier for Tesla. The pure vision solution rests on training against massive sample data and on the computing power behind advanced image-processing algorithms, which makes it a difficult route that only a few climbers will choose. Tesla's chief AI scientist Andrej Karpathy said at the CVPR 2021 autonomous driving workshop that pure vision-based autonomous driving is technically harder to implement because it requires neural networks that work extremely well from video input alone. But the advantage is that "once you really get it working, it's a universal vision system that can be deployed anywhere on Earth."
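The core of the pseudo-LiDAR idea is geometric: once a network has estimated a depth for each pixel, the pixel can be back-projected through the pinhole camera model into a 3D point, and the collection of such points behaves like a LiDAR point cloud. A hedged sketch follows; the intrinsics (fx, fy, cx, cy) and the depth values are illustrative assumptions, not Tesla's actual camera model or network output.

```python
# Sketch of pseudo-LiDAR back-projection via the pinhole camera model.
# Intrinsics and depths below are made-up illustrative values.

def pixel_to_point(u, v, depth, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0):
    """Lift pixel (u, v) with estimated depth (metres) to a camera-frame 3D point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def depth_map_to_cloud(depth_map):
    """depth_map: dict mapping (u, v) -> per-pixel depth from a (hypothetical) network."""
    return [pixel_to_point(u, v, z) for (u, v), z in depth_map.items()]

# The principal-point pixel lands on the optical axis: (0, 0, depth).
cloud = depth_map_to_cloud({(640, 360): 10.0, (800, 360): 10.0})
print(cloud)  # [(0.0, 0.0, 10.0), (1.6, 0.0, 10.0)]
```

The hard part in practice is not this projection but the depth estimation feeding it; errors in the estimated depth translate directly into displaced points, which is why the text stresses accuracy improvements.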
In the future, visual perception systems will be deployed not only in cars but in any product that needs vision: robots, drones, AR/VR, and so on, becoming a universal capability. That is Tesla's longer-term ambition. But however attractive the vision, in reality a gap remains between today's visual perception and the LiDAR solution. News reports still surface of Tesla accidents caused by recognition and perception failures. For now, the LiDAR camp is still smiling and walking ahead on safety.

Will LiDAR Have the Last Laugh?

Which camp laughs last comes down to which moves faster: the mass production of LiDAR, or the iteration of vision technology. The data shows more and more newly registered radar companies: China currently has some 14,000 radar-related companies, with 2,640 newly registered in 2020, a year-on-year increase of 29.3%. Low-cost LiDAR products released by listed companies such as Hesai Technology, and by giants such as Huawei, are ready for mass production.

Growth on the supply side is driven by huge demand. Most companies working on L3 and L4 autonomy, startups and large firms alike, have adopted LiDAR, and most purchase it rather than develop it themselves. The market accepts the LiDAR solution for the safety advantages of its high-precision hardware, even though the cost is high for now. Broad acceptance creates large demand, production capacity expands in response, large-scale mass production is on the way, and economies of scale will push costs further down, establishing a virtuous circle.
After more than ten years of development, LiDAR has proven itself an essential sensor for high-level autonomous driving, and even Tesla has shown signs of interest. News that Tesla signed a contract with LiDAR maker Luminar to use LiDAR for testing and development sparked wide speculation. Tesla later clarified that it would stick to the pure vision route, but its intentions with LiDAR remain hard to read.

On the pure vision route, the camera is cheap but its safety record is worrying, and everything hinges on algorithms and computing power; Tesla's massive data and supercomputing advantage cannot be imitated by anyone. This means the pure vision route will either run far ahead or end up evenly matched, but other companies cannot follow it regardless of the outcome: without real technical depth, joining this race is a gamble.

In the long run, the two perception routes will remain controversial over cost and safety. For now, it is unclear whether LiDAR's scaling can outpace the development of Tesla's pure vision technology, so it is hard to say whether LiDAR will have the last laugh. But compared with the uncertain trajectory of visual perception, the LiDAR solution is already on the road to large-scale mass production, and its bright outlook gives this camp the confidence to face the future with a smile.