Looking for the next unicorn: in-depth analysis of three models of AI entrepreneurship

This wave of AI looks dazzling, but there are really only four directions in which it can land in real products. First, breakthroughs in speech and semantics finally make voice interaction possible. Second, breakthroughs in computer vision lead to AR, a display method that blends real and virtual space. Third, breakthroughs in computer vision allow automation to be upgraded, producing highly automated products such as self-driving cars and robots. Fourth, machine learning provides a new way of processing data. For the basic entrepreneurial model behind the fourth direction, please refer to the previous article: AI Entrepreneurship Inspirations from Two Things Just Done by Google DeepMind. Startups pursuing the first three directions follow one of three models: product-based, from hard to soft, and from soft to hard. This article explores the advantages and disadvantages of these three models.

Product-based

The AI startups we are most familiar with are basically product-based, such as Mobvoi and Rokid Robotics in China, and Jibo, Savioke, Knightscope, Meta and others abroad. The defining feature of this type of startup is that it tries to use technological breakthroughs in AI to create new products with novel experiences (voice interaction, AR and so on) and then deliver a beautiful sales curve.

So what is a beautiful sales curve?

A beautiful sales curve means that once the market for a new product takes off, sales at least double every year for the next three years, rather than creeping up by 10% a year. Conversely, this means that the decisive period for these startups is really only three years. No matter how much you did beforehand, if you miss these three years, all your hard work may become worthless.
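To make the gap between the two curves concrete, here is a minimal sketch of the arithmetic. It assumes that "double" means sales multiply by 2 each year, and it uses a purely hypothetical starting volume of 100,000 units; neither figure comes from any real company.

```python
def project_sales(initial_units, yearly_multiplier, years=3):
    """Return yearly sales, starting with the launch year (year 0)."""
    sales = [initial_units]
    for _ in range(years):
        # Each year's sales are the previous year's times the multiplier.
        sales.append(sales[-1] * yearly_multiplier)
    return sales


# Hypothetical launch-year volume, chosen only for illustration.
LAUNCH_UNITS = 100_000

doubling = project_sales(LAUNCH_UNITS, 2.0)  # the "beautiful" curve
slow = project_sales(LAUNCH_UNITS, 1.1)      # 10% growth per year

for year, (fast_units, slow_units) in enumerate(zip(doubling, slow)):
    print(f"year {year}: doubling={fast_units:,.0f}  ten_percent={slow_units:,.0f}")
```

Under these assumptions the doubling curve ends its third year at eight times the launch volume, while the 10% curve ends at roughly 1.33 times, which is why missing the three-year window is so costly.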

How long does it take for the market to go from its early stage to a real takeoff? No one knows. The trend can be judged by logical deduction, but the specific start time is essentially a blind guess. It may be 1 year, 2 years, or even 5 or 10 years.

As a result, there are only two key things that every product-based AI entrepreneur needs to get right:

1. Make adequate preparations during a warm-up period whose length cannot be clearly predicted, including products, sales channels, production and manufacturing capabilities, etc.

2. Once the market takes off, deliver the sales curve described above.

If the first point is not done well, the result is Luo Yonghao and his Smartisan Technology. If the second is not done well, the result resembles certain large companies, such as Motorola, that have everything but cannot get anything done. If both are done well, the company will definitely become a new unicorn.

It must be emphasized that in achieving these two goals, what really matters is the product experience rather than how advanced the technology is. In other words, this model calls for people like Steve Jobs, who can use technology well, rather than people like Sergey Brin, who can create technology. Even if all the technology comes from others, that hardly matters as long as the user experience does not suffer. In practice, however, this generation of AI entrepreneurs is smart and ambitious and does not want to run pure assembly companies, so these startups usually try to control certain key technologies, such as ASR, in the early stages. The most extreme example is Mobvoi. According to various sources, Mobvoi has built its own ASR, NLU and even search.

This makes product-based entrepreneurship very much like climbing the north face of Mount Everest. If it succeeds, it is bound to be a big success, but succeeding is very hard. The high risk comes from the following two aspects:

1. The length of the warm-up period is highly unpredictable. Leaving AI aside, the most successful Chinese company to have followed this model so far is DJI. DJI's sales began to take off around 2013 (there is no official data yet, only Dronelife's estimate).

So when was DJI founded? In 2006, which means DJI waited almost seven years for its products to take off.

2. Costs rise significantly because of the desire to control key technologies. As mentioned earlier, the new generation of entrepreneurs usually does not want to run a simple assembly and sales company, so they take hold of several key technologies early on. This benefits the company after the product is launched. Without control over these technologies, even a successful company may end up like some of today's PC and mobile phone makers. The choice itself may not be wrong. After all, Amazon also acquired three companies to build its own technology stack in order to make Echo. For a startup, however, it undoubtedly raises costs and risks significantly and leaves the company with no income but high expenses for a long time.

From hard to soft

Both "from hard to soft" and "from soft to hard" mean that the company does not make products itself but provides services to companies that do. From hard to soft means the company believes its advantage must start at the front end (such as the microphone array) and extend to the back end (the cloud). From soft to hard means the company believes the cloud is the core of intelligence and the front end has lower priority. Of course, it would be best to be strong at both ends, but startups are usually limited by resources and by the founders' backgrounds and can only emphasize one end first. Each of the two models can be pursued in either voice interaction or computer vision, but because it is difficult to explain both fields together, the following takes voice interaction as the example to explain the two models first and then analyzes them in general.

Relatively few Chinese voice interaction AI startups have taken the path from hard to soft. The most typical is probably SoundAI Technology (it is one of my portfolio companies, so I know it relatively well). SoundAI Technology starts from the most basic layer, the acoustic array, first doing noise suppression, dereverberation, echo cancellation and so on, and only then considering ASR and the rest, which is the opposite of the path taken by Unisound and others.

The advantage of this model is that it sits at the very front of the industry chain, is easy to deploy, and is the path that data must pass through. Data itself is the core driving force of ASR and, in the future, even NLU, so this model has more potential.

The disadvantage is that in the short term, hardware is needed to acquire customers, and hardware production has to be organized, which requires a lot of start-up capital.

It can be said that the success of product-based startups has two external requirements: the trend must arrive, and the product must stand the test of the market. Companies that go from hard to soft also have two external requirements: the trend must arrive, and the technology must be strong and competitively priced. The customers of B2B companies are usually very rational, and fancy marketing has little effect on them.

From soft to hard

A typical startup that has gone from soft to hard in voice interaction is Unisound (Yunzhisheng). Its choice of model and its positioning can even be seen from the company's name.

The advantage of moving from soft to hard is that it is easier to cover existing, mature computing platforms. For example, if every app needs its own Siri, then companies like Unisound only need to build technical barriers and wait for Ctrip and Toutiao to come to them. The main challenge in this direction is competing directly with large companies (such as Baidu and iFlytek). This article focuses on deployment in new hardware products, so we will not expand on this point.

The downside is that it is difficult to deploy on new hardware products (Echo, cars, robots, AR and so on), because doing so requires an array layer in the middle, and without it the results are very poor. Once the technology cannot be deployed, its advantages are easily eroded. Speech recognition accuracy is data-driven by nature, but if you do not make hardware such as arrays, you cannot deploy. If you cannot deploy, there is no data, so it is hard to build a positive feedback loop among data, technology, accuracy and scenarios, and hard to solve the problem of recognition accuracy in real environments. Unisound and others obviously recognize this and are actively expanding in this direction. When they do, they usually run into the general problems of software companies crossing over into hardware, such as having no bargaining power with the supply chain, which can make the same product cost tens of percent more.

The difference in routes stems from different understandings of computing architecture

The discussion above uses speech and semantics companies as the example, but it applies to computer vision as well, though the details differ. For example, a Movidius chip may complete image recognition on the device itself, whereas a microphone array only pre-processes the signal and then passes the result to the cloud. What these model choices have in common is an underlying understanding of, and set of assumptions about, computing architecture.

So far, there are three kinds of such assumptions and cognitions:

The first is that, to ensure the experience (speed and so on), the client always plays an important role, and the cloud assists the client in completing the computation. Nearly all the hardware products we use, such as mobile phones and tablets, follow this model.

The second is that most computing should take place in the cloud. Google's Chromebook follows this model, as did the terminals once used in banks.

The third is the emerging sensor + fog computing + cloud architecture, which can be seen as an extension of the first. For example, if every device in a smart home connects directly to the cloud, the computing cost is too high. It is better to have a central hub at home that first handles whatever it can (for example, turning on the air conditioner when it is cold or closing the windows when it rains, without sending anything to the cloud) and connects to the cloud only when it cannot handle a task itself.
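The hub-first idea in the third architecture can be sketched roughly as follows. This is only an illustration under stated assumptions: the sensor names, the 16-degree threshold, the `Reading` type and the `fake_cloud` stand-in are all hypothetical and do not describe any real smart-home product or API.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    value: float


# Hypothetical local rules the home hub can evaluate without the cloud.
# Each rule maps a sensor value to an action name, or None for "do nothing".
LOCAL_RULES = {
    "temperature_c": lambda v: "turn_on_air_conditioner" if v < 16.0 else None,
    "rain_mm_per_h": lambda v: "close_windows" if v > 0.0 else None,
}


def handle(reading, send_to_cloud):
    """Handle a reading on the hub if a local rule exists, else use the cloud."""
    rule = LOCAL_RULES.get(reading.sensor)
    if rule is not None:
        action = rule(reading.value)
        # Handled locally: nothing is transmitted to the cloud.
        return ("hub", action if action is not None else "no_action")
    # No local rule: fall back to the cloud service.
    return ("cloud", send_to_cloud(reading))


def fake_cloud(reading):
    """Stand-in for a real cloud call (assumption, for illustration only)."""
    return f"cloud handled {reading.sensor}"


print(handle(Reading("temperature_c", 10.0), fake_cloud))
print(handle(Reading("rain_mm_per_h", 0.0), fake_cloud))
print(handle(Reading("voice_query", 1.0), fake_cloud))
```

The design choice illustrated here is simply that the cloud callback is invoked only on the fallback path, so routine readings never leave the home.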

In reality, the first two architectures often clash head-on, with serious consequences. Here are two examples:

One clash took place on the PC. The Network Computer that Oracle and others tried to build at the time essentially moved all computation to the back end, turning the front end into a mere input and output device. The attempt obviously failed miserably, but interestingly, more than 20 years later, once the PC category was mature enough, Chromebooks following the same route saw some hope of success.

The other took place between native apps and HTML5. At the time, Facebook was very keen to promote HTML5. Zuckerberg wanted to use web apps to break the monopoly of iOS and Android, but in fact Facebook almost died because of this choice, which nearly made it miss the mobile Internet. The subsequent large acquisitions of Instagram and WhatsApp were probably related to this mistake.

My basic understanding is this: when a new category of hardware product first appears, the device itself must be powerful enough to provide the best possible experience. As applications, bandwidth and so on gradually develop, computing on the device may move to the cloud because of cost advantages, but that is a long process. It took more than 20 years for the PC to reach the point where this became possible.

If this is right, then for new hardware products, the model that establishes itself first will be from hard to soft, not from soft to hard. AR, autonomous driving and the like, just like the voice interaction discussed above, must first solve the problems on the device, so that the product responds quickly and accurately in real time and the user experience is assured, before anything else can be discussed. You can also look at this from another angle. For new products, it is more likely that the iPhone comes first and the Android phone follows, rather than the other way around, because the iPhone has a stronger impact on users and establishes a new category more easily, and the iPhone could not have been built with HTML.

But the model from hard to soft does place more complex demands on founders. For example, Chen Xiaoliang of SoundAI Technology is first of all an acoustics expert and also a speech recognition expert, so he chose the route of starting from the front end and combining it with the back end. The CTO of Unisound is a computer scientist who is more proficient in algorithms and deep learning, so he naturally leans toward solving problems with data, neural network algorithms and greatly increased computing power (from cloud computing to HPC). Switching to a device-first path requires overcoming barriers in both mindset and technology, which may not be easy.

Summary

The following two things are highly certain:

1. The wave of AI is coming.

2. There will definitely be new hardware products.

Therefore, new unicorns will definitely emerge from this product line.

If you believe that new hardware products must be backed by powerful devices to ensure a good experience, and that new categories basically appear in the order of iPhone first and Android second, then you will probably agree that unicorns will first appear in the model from hard to soft. (I have discussed these views with many people, but as it happens, I only recently met a charming woman investor who sees things the same way I do. You won't say I plagiarized her views...)

It must be emphasized that this article mainly discusses what may happen with new hardware products. Data analysis is not covered (it is pure cloud computing), nor are products or services aimed mainly at existing platforms (mobile phones, tablets and so on).
