In the technology industry, there is no such thing as a "century-old store." Few expected that Nokia, once seemingly invincible in the mobile phone market, would decline and exit during the Internet era. With the advent of the AI era, technology giants have begun disrupting themselves from within, hoping to break through their own growth bottlenecks rather than be left behind. In AI, data is an indispensable basic ingredient; as the saying goes, whoever owns the data owns the future. Is that true?

A new study from Google highlights an important business dynamic in the current AI boom. Consumers and the economy increasingly depend on an ecosystem of tech companies that has long been seen as innovative and resistant to monopoly, because internal disruption allowed small companies to unseat large ones. But when competition hinges on machine learning systems fed by enormous amounts of data, overtaking the tech giants may become harder than ever.

On Monday, Google posted a preprint describing an expensive collaboration with Carnegie Mellon University (CMU). Their image recognition experiment ran for two months on 50 powerful GPUs and used an unprecedented collection of 300 million labeled images (much work in image recognition uses a standard set of just 1 million images). The project was designed to test whether a system could achieve more accurate image recognition simply by being given more data, rather than by tweaking existing algorithms. The answer is yes.

After the Google and CMU researchers trained standard image processing systems on the new dataset, they found it produced new state-of-the-art results on several standard benchmarks for how well software interprets images, such as detecting objects in photos. There was a clear correlation between the amount of training data and the accuracy of the resulting image recognition algorithms. The finding helps settle a question that has been circulating in AI research: whether existing algorithms can be "squeezed" for more accuracy simply by feeding them more data.

The experiment suggests that, to a large extent, more data means more strength, which implies that data-rich giants such as Google, Facebook, and Microsoft may reap even greater benefits than before. That said, the gains from Google's enormous 300-million-image dataset were not themselves enormous: the object detection score rose by only about 3 percentage points in going from 1 million to 300 million images. The paper's authors believe, however, that they can tune their software to better suit super-large datasets and widen this advantage. Even if that turns out not to be the case, small advantages matter in the technology industry. Every point of accuracy in a self-driving car's vision system is crucial, and for a product that may generate billions of dollars in revenue, a small gain in efficiency is worth a great deal.

For AI-centric companies, collecting data has become a defensive strategy. Google, Microsoft, and others have open-sourced much of their software, and even hardware designs, but rarely the data that makes those tools work.
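To make the scaling discussion above concrete, here is a minimal, purely illustrative Python sketch. It assumes the roughly logarithmic relationship between dataset size and accuracy that the reported numbers suggest; the score values below are hypothetical, chosen only to match the article's figure of about 3 percentage points gained between 1 million and 300 million images.

```python
# Back-of-the-envelope sketch (not from the paper): fit
# score = a * log10(n) + b to two hypothetical (size, score) points
# and extrapolate to other dataset sizes.
import numpy as np

sizes = np.array([1e6, 300e6])    # number of labeled training images
scores = np.array([34.0, 37.0])   # illustrative object-detection scores

# Linear fit in log10(dataset size)
a, b = np.polyfit(np.log10(sizes), scores, deg=1)

for n in [1e6, 10e6, 100e6, 300e6, 1e9]:
    print(f"{n:>13,.0f} images -> predicted score {a * np.log10(n) + b:.1f}")
```

Under this toy fit, each tenfold increase in data buys a little over one point of accuracy, which illustrates why data-rich giants see diminishing but still commercially valuable returns from ever-larger datasets.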
According to Leifeng.com, when Google announced the open-sourcing of its TensorFlow AI engine, it said that the real value in AI lies not in software or algorithms, but in the data needed to make them smarter. Google may give away other things, but it will hold on to the data, at least for now. (A toy illustration of this point, the same code trained on different amounts of data, appears at the end of this article.)

Still, the tech giants do make some data public. Last year, Google released a massive dataset of more than 7 million YouTube videos, and Salesforce released a language dataset drawn from Wikipedia to help algorithms analyze text. But Luke de Oliveira, a partner at AI development lab Manifold and a visiting researcher at Lawrence Berkeley National Lab, said such releases rarely offer much value to potential competitors. "These datasets are never critical to a product's ability to maintain its market position," he said.

With the rise of cloud computing, companies like Amazon and Microsoft can offer powerful processing to anyone over the Internet. But the richest data remains in the hands of giants such as Google and Facebook, whose services are used by billions of people and capture rich communication data, from text and pictures to video and voice. All of these companies are working hard to build powerful AI software, but their real competitive advantage lies in vast stores of high-quality data they can use to teach that software to think like people.

The Google and CMU researchers said that, given the valuable data they have processed, they hope their latest work will help spur the creation of more open image datasets at "Google scale." "We sincerely hope the vision community does not underestimate the value of data, and that we can build larger datasets through collective effort," they wrote. Abhinav Gupta, who led the research, said one option is to work with the Common Visual Data Foundation, a nonprofit launched by Facebook and Microsoft that has already released open image datasets.

Meanwhile, data-poor companies that want to survive in a sea of data-rich giants will have to get creative. DataRobot CEO Jeremy Achin speculates that data-pooling models could gain traction as machine learning spreads across companies and industries; in insurance, for example, data pooled from small companies could let them compete with large ones on risk prediction. Advances that make machine learning less dependent on data could also upend the data economics of AI. Uber acquired an AI company working on that problem last year, and such techniques may let newcomers sidestep the data advantage of established AI players.

Rachel Thomas, co-founder of Fast.ai, which works to make machine learning accessible, believes startups can often apply machine learning in areas beyond the reach of the Internet giants, such as agriculture. "I'm not sure these big companies have a huge advantage everywhere; there are a lot of specific areas where no one is collecting any data right now," she said. Even the giants of artificial intelligence have their blind spots.

From: Leifeng.com, compiled from Wired
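As promised above, here is a hedged, minimal sketch of the article's central claim that open code without data is only half the story. This is not Google's experiment: it trains the same small TensorFlow model twice, once on 1% of the public MNIST digit dataset and once on all of it, and compares test accuracy. The model shape, epoch count, and 600-image subset are arbitrary choices for illustration.

```python
# Same open-source code, different amounts of data: train an identical
# small model on 1% of MNIST and on the full training set, then compare
# accuracy on the held-out test set.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

for n in (600, 60_000):  # 1% of MNIST vs the full training set
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train[:n], y_train[:n], epochs=5, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"trained on {n:>6} images -> test accuracy {acc:.3f}")
```

On a typical run, the 1% model lands several points below the full-data model: identical, freely available code, very different results depending on who holds the data.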