Understanding the “Identity” of AI Core in One Article

Throughout 2018, from the start of the year to its end, artificial intelligence was the keyword of the mobile phone industry.

Phone makers have been busy telling users that AI will make their phones smarter, and have launched a string of features such as voice assistants, face unlock, and smart photo classification. Industry trends inevitably drive changes upstream in the supply chain, and mobile phone chips, which sit at the top of that chain, are no exception.

Apple's A12 and Huawei's Kirin 980 both tout a built-in NPU that boosts the phone's AI processing capability. Qualcomm deliberately attached the "artificial intelligence" label to its Snapdragon 845 marketing. Samsung's newly launched Exynos 9820 is the first Exynos chip with an integrated NPU. MediaTek has emphasized the concept of an "AI-dedicated core" in the Helio P70...

If you are not an expert in integrated circuits, terms such as NPU and AI-dedicated core can be confusing. What exactly is an AI-dedicated core, and what role does it play? Answering these questions is the purpose of this article.

What exactly is an AI chip?

Before answering this question, let's first look at two concepts: the CPU and the GPU.

Simply put, the CPU is the "brain" of the phone and the "commander-in-chief" that keeps it running normally. GPU stands for graphics processing unit, and its main job is, as the name suggests, image processing.

Now for the division of labor between the two. The CPU follows the von Neumann architecture, whose core idea is "store the program, execute it sequentially". It is like a methodical housekeeper who does everything step by step. Ask the CPU to plant a tree and it will dig the hole, water it, plant the tree, and cover it with soil, one step after another.

Ask the GPU to plant a tree and it will call in workers A, B, C and so on, splitting the digging, watering, planting, and covering into separate subtasks that are done together. This is because the GPU computes in parallel: it breaks a problem into several parts, each handled by an independent computing unit. In image processing every pixel needs to be computed, which fits the way the GPU works.

As one expert on Zhihu put it, the CPU is like an old professor who can handle anything from integrals to differentials. But some jobs consist of a huge number of simple arithmetic problems on numbers under 100, and the best approach is not to have the old professor work through them one by one but to hire dozens of elementary school students and share out the work. That is the division of labor between the CPU and the GPU: the CPU handles complex, general-purpose computation, while the GPU was born for image processing, on computers and smartphones alike.
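The professor-and-students analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `brighten` operation, the four-worker pool, and the tiny "image" are hypothetical, and Python threads merely stand in for GPU compute units without delivering GPU-class speed. The point is the shape of the work: the same simple operation applied to every pixel, either one pixel after another or split across independent workers.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel: int) -> int:
    """A simple per-pixel operation: add 40 and clamp to the 8-bit range."""
    return min(pixel + 40, 255)


def process_sequentially(pixels: list[int]) -> list[int]:
    """CPU style: one worker walks through every pixel in order."""
    return [brighten(p) for p in pixels]


def process_in_parallel(pixels: list[int], workers: int = 4) -> list[int]:
    """GPU style: split the pixels into chunks and give each chunk its own worker."""
    chunk_size = max(1, -(-len(pixels) // workers))  # ceiling division
    chunks = [pixels[i:i + chunk_size] for i in range(0, len(pixels), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so the results can be concatenated directly.
        processed = list(pool.map(process_sequentially, chunks))
    return [p for chunk in processed for p in chunk]


# Hypothetical grayscale pixel values.
image = [0, 100, 200, 250, 30, 60, 90, 120, 215, 216]

# Both strategies produce the same picture; only the way the work is divided differs.
print(process_sequentially(image) == process_in_parallel(image))
```

Because each pixel is computed independently of its neighbors, the work can be divided among as many workers as are available, which is why this kind of job suits a GPU.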

However, when the demand for artificial intelligence emerged, this division of labor ran into problems. Deep learning on a device differs from traditional computing: the rules are learned in advance on back-end servers from a large amount of training data, which yields parameters the device can then use to make judgments. For example, if the training samples are face images, the function implemented on the device is face recognition.

A CPU often needs hundreds or even thousands of instructions to process a single neuron and cannot support large-scale parallel computing, while the phone's GPU is already busy with the image processing needs of various apps. Forcing AI tasks onto the CPU and GPU generally results in low efficiency and severe heat generation.
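To see why a single neuron costs a general-purpose core so many instructions, consider what a neuron computes: a weighted sum of its inputs plus a bias, passed through an activation function. The sketch below is a minimal illustration, not any vendor's implementation; the ReLU activation, the sample numbers, and the 1,024-by-1,024 layer size are assumptions chosen for the example.

```python
def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    """One artificial neuron: weighted sum plus bias, then ReLU (assumed activation)."""
    total = bias
    for x, w in zip(inputs, weights):
        total += x * w  # one multiply-accumulate (MAC) per input
    return max(0.0, total)


def layer_mac_count(n_inputs: int, n_neurons: int) -> int:
    """Multiply-accumulate operations needed for one fully connected layer."""
    return n_inputs * n_neurons


# Hypothetical inputs and pre-trained parameters.
output = neuron([1.0, 2.0, 3.0], [0.5, -1.0, 0.5], bias=0.25)
macs = layer_mac_count(1024, 1024)

print(output)  # 0.25
print(macs)    # 1048576
```

Even this modest hypothetical layer needs over a million multiply-accumulates per inference. A scalar core has to wrap each of them in loads and loop control, whereas hardware built for the task can perform many of them at once.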

This pushed chipmakers such as Qualcomm and MediaTek to come up with solutions, and the solutions proposed by the major mobile chip manufacturers differ somewhat.

Qualcomm's current commercial flagship processor is the Snapdragon 845, which carries the Adreno 630 GPU. Compared with the previous-generation Snapdragon 835, its AI processing capability has tripled, and it supports neural network frameworks from multiple platforms. Perhaps out of confidence in its GPU performance, or perhaps because it did not anticipate the arrival of AI demand, Qualcomm has not included an independent AI computing unit and still relies on the CPU, GPU, DSP and other units to handle AI workloads on the side.

MediaTek has always been an underestimated player. Its solution is somewhat similar to Google's TPU: it uses an ASIC (application-specific integrated circuit) to build an AI core dedicated to artificial intelligence workloads, a small block of IP integrated into the Helio P60, Helio P70 and other chips. The advantages of the AI core are high speed and low power consumption. It works in coordination with the CPU and GPU: the CPU handles complex, general-purpose computation, the GPU handles image processing, and the AI core handles deep learning scenarios.

The NPU mentioned at the beginning of the article stands for neural processing unit, also called a neural network processor, and is the solution adopted by the Apple A12, Kirin 980 and Exynos 9820. It is in fact also a kind of AI core. In layman's terms it is an artificial intelligence accelerator: the GPU is built around processing data in blocks, whereas AI applications on phones need real-time processing. The accelerator addresses exactly this pain point and takes over deep learning work, relieving the pressure on the CPU and GPU.

As can be seen, the NPUs in the Apple A12, Kirin 980 and Exynos 9820 work on a similar principle to the AI core: they take computing workload off the CPU and GPU and offload AI tasks such as face recognition and voice recognition to an ASIC. AI cores have already become an industry trend.

However, the concept of the "NPU" is not yet fully unified. Some players still marshal computing resources by integrating multiple DSP cores, and Cambricon's IP has some problems processing MobileNet v1/v2. This highlights that MediaTek needs to take bigger steps in this regard.

Is AI a breakthrough or a fantasy?

Handling AI scenarios with a "dedicated core" is not without drawbacks: it serves a single function, takes longer to develop, raises chip cost, and takes up space inside the phone. This is probably why Qualcomm did not choose this route.

However, to judge whether AI-dedicated cores are a real leap forward or a useless fantasy, we only need to compare a few real usage scenarios.

Take AI face recognition, the most widely used application, as an example. It is a process of "scanning and detection" followed by "result judgment". During scanning it must locate facial landmarks, recognize facial attributes, and extract facial features; it then performs feature comparison, face identification, and liveness verification. Face recognition is not a purely algorithmic process; it also involves multiple computing units such as the CPU, GPU, VPU, and DLA.

Some media outlets have run a comparative face recognition test on smartphones powered by the MediaTek Helio P60, the Qualcomm Snapdragon 845 and the Snapdragon 710. The first has an AI core, while the latter two rely on software-optimized solutions. The resulting face recognition times were 316.5 ms, 687.5 ms and 950 ms respectively. The Helio P60 and Snapdragon 710 are both mid-range processors, yet the Helio P60 is far faster than the Snapdragon 710 and even takes less than half the time of the flagship Snapdragon 845, which shows the advantage of the AI core.
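The size of the gap is easy to check from the three reported times. The short script below uses only the figures quoted above; the function names are just labels for this example.

```python
# Face recognition times in milliseconds, as reported in the test cited above.
times_ms = {
    "Helio P60": 316.5,
    "Snapdragon 845": 687.5,
    "Snapdragon 710": 950.0,
}

baseline = times_ms["Helio P60"]


def time_saved_percent(other_ms: float) -> float:
    """Share of the other chip's time that the Helio P60 saves, in percent."""
    return round((1 - baseline / other_ms) * 100, 1)


def speedup(other_ms: float) -> float:
    """How many times faster the Helio P60 finished."""
    return round(other_ms / baseline, 2)


for chip in ("Snapdragon 845", "Snapdragon 710"):
    print(chip, time_saved_percent(times_ms[chip]), speedup(times_ms[chip]))
```

On these figures the Helio P60 saves about 54% of the Snapdragon 845's time (roughly 2.2 times faster) and about two thirds of the Snapdragon 710's time (roughly 3 times faster).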

Why is the gap so large? Face recognition requires the camera first to find the face, whether the lighting is dim or the face is turned away, then to measure facial features accurately, such as the size of the eyes and the length of the face, and finally to compare them with known samples to determine who the person is. The whole process demands a great deal of computing power. The Helio P60, with its dedicated AI core, is naturally more efficient than chips whose CPU and GPU handle AI on the side, even the flagship Snapdragon 845.

Having tasted success with the AI-dedicated core, MediaTek upgraded it again in the Helio P70. AI processing capability is 30% higher than in the previous generation, supporting more complex AI applications such as human pose recognition, AI video encoding, real-time photo beautification, scene detection, and AR functions.

For example, when a beauty blogger is live streaming, one APU (MediaTek's name for its AI core) in the Helio P70 can handle face detection and real-time beautification while another APU simultaneously handles HDR processing and background blur. With the Snapdragon 845 solution, a single DSP has to work through face detection, image segmentation, background blur, HDR processing, and multi-frame synthesis, which results in a speed difference.
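Why splitting the work across two units shortens each frame can be shown with a toy schedule. Every number below is hypothetical and chosen only for illustration; none of them is a measurement of any chip. The sketch also assumes the two groups of tasks do not depend on each other's results.

```python
# Hypothetical per-frame costs in milliseconds; illustrative only, not measurements.
task_ms = {
    "face detection": 6,
    "real-time beautification": 8,
    "HDR processing": 9,
    "background blur": 5,
}


def frame_time(assignment: list[list[str]]) -> int:
    """Time per frame when each inner list runs on its own unit, one task after another."""
    return max(sum(task_ms[task] for task in unit) for unit in assignment)


# One unit runs all four tasks back to back.
single_unit = [list(task_ms)]

# Two units, split the way the live-streaming example above describes.
two_units = [
    ["face detection", "real-time beautification"],
    ["HDR processing", "background blur"],
]

print(frame_time(single_unit))  # 28
print(frame_time(two_units))    # 14
```

With these made-up costs the frame time halves, because the slower of the two units sets the pace instead of the sum of all tasks. In practice the gain depends on how evenly the tasks divide and on how much they depend on one another.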

In photography, for example, a high dynamic range (HDR) image requires three 12-bit RAW frames to be merged, after which the ISP outputs the optimized photo. The time allowed between pressing the shutter and producing the photo is extremely short, so the job demands a lot of computing power and often causes a delay of 2-3 seconds. The dual-core APU of the Helio P70, however, can accelerate the work in parallel on two threads and complete photo optimization in under 1 second, which is more efficient than processing on a single DSP.
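To give a feel for the per-pixel work involved, here is a heavily simplified sketch of merging three bracketed 12-bit frames. It is not MediaTek's pipeline: the exposure ratios, the hat-shaped weighting function, and the two-pixel frames are assumptions for the example, and a real pipeline also aligns the frames, demosaics, and tone-maps the result.

```python
MAX_12BIT = 4095  # largest value a 12-bit sample can hold


def weight(value: int) -> float:
    """Trust mid-range samples most; near-black and clipped samples get almost no weight."""
    return max(1e-3, 1.0 - abs(value / MAX_12BIT * 2.0 - 1.0))


def merge_hdr(frames: list[list[int]], exposures: list[float]) -> list[float]:
    """Merge bracketed linear frames into one relative brightness value per pixel."""
    merged = []
    for pixel_values in zip(*frames):
        weighted_sum = 0.0
        weight_total = 0.0
        for value, exposure in zip(pixel_values, exposures):
            w = weight(value)
            weighted_sum += w * (value / exposure)  # scale back to a common exposure
            weight_total += w
        merged.append(weighted_sum / weight_total)
    return merged


# Hypothetical two-pixel frames shot at 1/4x, 1x and 4x exposure.
# The second pixel is clipped at 4095 in the longest exposure.
frames = [[100, 750], [400, 3000], [1600, 4095]]
exposures = [0.25, 1.0, 4.0]

result = merge_hdr(frames, exposures)
print([round(v) for v in result])
```

The clipped sample in the long exposure is almost ignored, so the bright pixel is recovered from the two shorter exposures. Every pixel of every frame goes through this kind of arithmetic, which is why the step benefits from parallel hardware.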

MediaTek is not alone. At the launch of the Kirin 980, Huawei also showed off the AI advantages of its dual-core NPU, mainly in image and video processing. In object recognition, for example, it has moved from recognizing outlines to recognizing details; in real-time object segmentation, from fairly rough scene segmentation to fine segmentation. The Kirin 980 also supports real-time tracking of multiple objects, recognizes 4,500 images per minute (75 per second), and supports background replacement in video.

Another major advantage of AI cores is probably battery life. At least Apple, Huawei, and MediaTek are eager to prove it, and they focus on two aspects:

On the one hand, the value of an AI core lies in its coordinated division of labor with the CPU and GPU. Piling too many tasks onto the CPU and GPU only wastes power and raises temperature. For example, although the Snapdragon 845 is very powerful, it still warms up slightly when taking AI photos. Products with AI cores, such as those based on the Helio P70, do not have this problem.

On the other hand, with the help of an AI core, the phone can learn user behavior, predict the user's usage scenario, and allocate performance accordingly. For example, the CPU can run at full speed while you are gaming, and unnecessary performance can be held back while you are reading an e-book.

Final Thoughts

In everyday life, the demand for image processing over the past two years was limited to beautification. Now short videos and live streaming place higher demands on the AI performance of phones, and MediaTek's AI core was born for exactly this purpose.

What can be concluded is that MediaTek, Huawei, and others have undoubtedly bet correctly on the future direction of mobile chips by improving AI capability through AI-dedicated cores or similar designs. Perhaps in two or three years, AI-dedicated cores will be an indispensable part of phone chips. We look forward to these chip giants continuing to compete, innovate, and make breakthroughs in AI-dedicated cores.
