In March 2023, AI technology represented by GPT-4 set off a wave of AI fever. In the half year since, there have been many new developments in the field, some of which may thoroughly change the lives of "working people". Here is a round-up of the AI developments from the past six months most worthy of your attention.

GPT's biggest rival, Gemini, appears

In May 2023, at the Google I/O developer conference, Google CEO Sundar Pichai revealed that Google DeepMind was training the Gemini model, a large language model designed specifically to compete with GPT-4. According to an analysis by the semiconductor research firm SemiAnalysis, Gemini's computing power will be five times that of GPT-4. In addition, compared with GPT-4, Gemini better supports multimodal input: besides text, it can also process images and voice, which should make it more convenient to use than the GPT-4 available at the time.

And we may soon be able to witness Gemini's performance. According to a September 14 report by the technology outlet The Information, Gemini has been opened for use and testing by some companies. It may not be long before Gemini is deployed across Google's product line and begins to serve the public.

OpenAI trains a more versatile "GPT-5"

Of course, facing an opponent with multimodal capabilities like Gemini, OpenAI will not sit idly by. In fact, as early as March, at the GPT-4 launch event, GPT-4 demonstrated its multimodal processing capabilities: the presenter hand-drew a sketch of a web page, photographed it, sent the photo to GPT-4, and told it to build a web page with that layout. GPT-4 immediately wrote the code for the page. In the months after the launch, however, ordinary users did not get to experience these multimodal capabilities in ChatGPT.

To meet the challenge from Google's Gemini, OpenAI combined ChatGPT with its new image generation model DALL·E 3 to make GPT more "versatile". After an update released on September 25, GPT-4 can also process voice and image information, with test features for voice conversation and image recognition. For example, the content below was generated by DALL·E 3 and ChatGPT: GPT can not only draw a picture from a text description, but also interpret the information in a picture and modify the picture according to the conversation.

[Image: pictures created from text by DALL·E 3; GPT explains why the hedgehog in the picture looks so pleased. One image was generated by DALL·E 3 from the prompt "Show that the little hedgehog is enthusiastic".]

Beyond combining DALL·E 3 with ChatGPT, OpenAI has also begun work on a "GPT-5". Back in March (shortly after GPT-4 appeared), there was a wave of calls online to suspend research on GPT-5 out of concern for information security and privacy, and OpenAI CEO Sam Altman promised not to train a GPT-5 model in the short term. Yet according to The Information, six months after GPT-4's debut, OpenAI has begun developing a new model codenamed "Gobi", designed from the start to be multimodal. Some media outlets even claim it may become the future GPT-5.
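To make "multimodal input" concrete, here is a minimal sketch of what sending text plus an image to a chat model can look like, modeled on OpenAI's public chat-completions API as it existed at the time of writing. The model name, endpoint, and image URL below are placeholders and may have changed since; treat this as an illustration, not a definitive integration guide.

```python
# A minimal sketch of a multimodal (text + image) chat request,
# modeled on OpenAI's chat-completions API circa late 2023.
import requests

API_KEY = "sk-..."  # placeholder; supply your own key

payload = {
    "model": "gpt-4-vision-preview",  # multimodal model name at the time; may differ now
    "messages": [{
        "role": "user",
        # A single message can mix text parts and image parts.
        "content": [
            {"type": "text", "text": "What is the hedgehog in this picture doing?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/hedgehog.jpg"}},  # placeholder URL
        ],
    }],
    "max_tokens": 300,
}

resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
# The reply arrives as ordinary text in the first choice.
print(resp.json()["choices"][0]["message"]["content"])
```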
Microsoft releases a "worker benefit package": Microsoft Copilot

On September 21, Microsoft released the Microsoft Copilot suite. If you are unfamiliar with Microsoft Copilot, you can understand it this way: commonly used software such as Word, Excel, and PowerPoint, as well as the browser that ships with Windows, will all be backed by GPT-4.

Take Word, which we use most often, as an example. When writing a document, you can simply give Word a topic and let it generate a document on that topic automatically. It also has an image-matching feature: you no longer need to spend time hunting for pictures online, because it can generate images from your text. As for Excel, with Microsoft Copilot you no longer need to memorize formulas or program inside Excel. Just tell Excel what you want, and it will write the formulas, write the code, and analyze the data for you; you only have to wait for the results. In addition, when browsing the web, you do not even need to read a page in detail: the browser can summarize the important information on the current page for you, saving a great deal of time. This may be the most "worker-friendly" AI gift package of the past six months.

AI helps humans understand smells

Of the human senses of vision, hearing, and smell, smell may be far more complicated than we think. For vision, the colors we see can be tied to the wavelength of light; for hearing, sound corresponds to the vibration frequency of objects. Smell is harder: humans have hundreds of olfactory receptors, and the everyday smells around us are usually blends of many kinds of odor molecules, so it is difficult to build a simple, clean mathematical model for it.

A paper published in Science in August pointed out that scientists used graph neural networks to learn the relationship between smells and their ingredients and to generate an odor map. Based on this map, known compounds can be combined to formulate a desired smell. More importantly, the odor map drawn by AI covers 500,000 potential odors, which means that with AI's help we may be able to smell flavors we have never imagined. This could greatly change the food and fragrance industries and make our lives more "flavorful".

[Image: the process by which AI identifies odors. Source: see references.]

Autonomous driving tells you how it drives

On September 14, the autonomous driving company Wayve released LINGO-1, an open-loop driving commentator. You can think of it simply as a commentator for a self-driving car. Why does autonomous driving need a commentator? This is actually a very interesting and important line of research. Imagine that when you drive, every choice and action has a basis: you think the car ahead is too slow and the next lane is clear and safe, so you move over to overtake; or you slow down at an intersection because there are many people around and you need to watch the surroundings carefully. LINGO-1 can likewise explain the behavior of a self-driving car in familiar natural language, and it can respond to specific questions from humans. For example, when the self-driving car stops at an intersection, you can ask it, "What are you observing right now?"; if someone is cycling nearby, you can ask, "How do you judge that you are keeping a safe distance from this cyclist?"
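Wayve has not published a programming interface for LINGO-1, so the sketch below is purely hypothetical: it only shows the shape of this question-and-answer pattern, where driving state plus a question go in and a natural-language explanation comes out. Every name in it (DrivingState, query_vlm, and so on) is invented for illustration.

```python
# A toy, hypothetical sketch of the "driving commentator" pattern.
# LINGO-1's real interface is not public; nothing here is Wayve's API.
from dataclasses import dataclass, field

@dataclass
class DrivingState:
    speed_kmh: float                      # current vehicle speed
    detected_objects: list = field(default_factory=list)  # e.g. "cyclist, 4 m ahead-left"

def query_vlm(prompt: str) -> str:
    # Placeholder: a real system would run a trained vision-language
    # model over camera frames here, not return canned text.
    return "I am slowing down because a cyclist is 4 metres ahead on my left."

def answer(state: DrivingState, question: str) -> str:
    """Fuse vehicle state and the user's question into one prompt,
    then let the (hypothetical) vision-language model decode a reply."""
    prompt = (
        f"Vehicle speed: {state.speed_kmh} km/h. "
        f"Scene: {'; '.join(state.detected_objects)}. "
        f"Question: {question}"
    )
    return query_vlm(prompt)

state = DrivingState(speed_kmh=18.0,
                     detected_objects=["cyclist, 4 m ahead-left", "crosswalk"])
print(answer(state, "How do you judge that you are keeping a safe distance?"))
```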
Currently, the accuracy of LINGO-1's answers is only about 60%, but its capabilities are improving steadily, and this kind of research matters: it improves the explainability of artificial intelligence. In the past, many decisions made by self-driving cars were a black box to humans; we did not know why the car changed lanes, or why it chose not to overtake when it could have. Once engineers understand how the AI makes decisions, they can design better self-driving algorithms and keep improving the safety of self-driving cars. At the same time, it can increase ordinary users' understanding of and trust in self-driving, so that the decision-making process of artificial intelligence is no longer a black box.

AI surpasses humans in multiple competitions

On August 30, an article published in Nature showed that AI has surpassed humans in first-person-view drone racing. First-person-view drone racing differs from flying an ordinary remote-controlled aircraft: the pilot must observe the environment from the perspective of the fast-moving drone and control it accordingly. The AI, for its part, must rapidly analyze the information streaming back from the onboard video sensors and make decisions that optimize the flight path. According to the Nature article, beating the human champions in this competition is a "milestone for mobile robotics and machine intelligence", and the achievement may prove valuable for future self-driving cars and autonomous aircraft.

Beyond drones, AI has also put in an astonishing performance against CAPTCHAs. CAPTCHAs are everywhere in daily life: when you log in to a web page, you often face distorted letters and numbers to identify, a puzzle-like slider to drag, or images to click, say the ones that are right-side up. Their purpose is to stop bots from logging in and registering maliciously. But a paper from July 2023 suggested that AI may now handle CAPTCHAs better than humans do. The experiment had more than 1,000 human testers solve the CAPTCHAs used on 120 mainstream websites. The results showed that human accuracy ranged from 50% to 80%, while AI accuracy ranged from 85% to 100%, above 96% in most cases. Besides being more accurate, AI also solved CAPTCHAs about 0.5 seconds faster than humans. This means that as AI develops, the CAPTCHAs humans use to fend off bots may become less and less effective, a serious challenge for network and information security. Scientists will need to design new CAPTCHAs that can defeat AI without stumping humans.

The human brain simulation project "fails"

Although artificial intelligence has advanced greatly in recent years, humans have met their "Waterloo" in trying to simulate the human brain. In 2013, European scientists launched the ten-year Human Brain Project, planning to use cutting-edge computing to simulate the tens of billions of neurons in the human brain and the synapses between them, thereby reproducing the brain's operation and uncovering its mysteries. October 1 of this year was the project's deadline, but despite a total investment of 1.3 billion euros (about 10 billion yuan), scientists remain far from the goal of "simulating the human brain".
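For a sense of what "simulating neurons" means at the smallest scale, here is a minimal sketch of a leaky integrate-and-fire neuron, one of the simplest point-neuron models that large-scale brain simulations build on. All parameter values are illustrative; a brain-scale simulation would have to couple tens of billions of such units through trillions of synapses.

```python
# A single leaky integrate-and-fire neuron, integrated with Euler steps.
# Parameters are textbook-style illustrative values, not fitted to data.
dt, total_time = 1e-4, 0.5                       # time step and duration (s)
tau = 0.02                                       # membrane time constant (s)
v_rest, v_th, v_reset = -65e-3, -50e-3, -70e-3   # rest / threshold / reset (V)
r_m, i_in = 1e7, 2e-9                            # membrane resistance (ohm), input current (A)

v = v_rest
spike_times = []
for step in range(int(total_time / dt)):
    # Membrane equation: dv/dt = (-(v - v_rest) + r_m * i_in) / tau
    v += dt * (-(v - v_rest) + r_m * i_in) / tau
    if v >= v_th:                 # threshold crossed: record a spike, reset voltage
        spike_times.append(step * dt)
        v = v_reset

print(f"{len(spike_times)} spikes in {total_time} s of simulated time")
```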
Scientists greatly underestimated the complexity of the human brain; it can be said that this was a failure in humanity's attempt to build an "artificial intelligence". Yet even though the Human Brain Project did not succeed in that attempt, the exploration was far from meaningless. Through the project, scientists have indeed gained a deeper understanding of the brain. For example, they drew a more detailed map of the human brain and discovered several previously unknown areas in the prefrontal cortex. They also established links between gene expression and brain structure, which lets them study diseases related to brain structure (such as depression) at the genetic level. Some digital brain models have even been applied clinically, for instance in Parkinson's disease.

[Image source: unsplash.com. Photographer: Xu Haiwei]

Besides Europe's Human Brain Project, China, the United States, Japan, South Korea, Australia, and other countries have brain projects of their own. Only with a deeper understanding of the brain's structure and the origin of intelligence can we develop "artificial intelligence" technology better.

The field of artificial intelligence is developing extremely rapidly. In just half a year, GPT-4 has met a strong opponent, AI has helped humans build odor maps and more detailed brain maps, and self-driving cars have gained "commentators". It is this rapid technological progress that lets us enjoy a safer and more convenient life; its development is a science-fiction blockbuster playing out in our lives every day.

References
[1] https://www.semianalysis.com/p/google-gemini-eats-the-world-gemini
[2] https://www.theinformation.com/articles/google-nears-release-of-gemini-ai-to-rival-openai
[3] https://openai.com/dall-e-3
[4] https://blogs.microsoft.com/zh/blog/2023/09/21/announcing-microsoft-copilot-your-everyday-ai-companion/
[5] https://sitn.hms.harvard.edu/flash/2023/this-ai-smells-better-than-you/
[6] https://wayve.ai/thinking/lingo-natural-language-autonomous-driving/
[7] https://www.nature.com/articles/d41586-023-02600-x#ref-CR3
[8] https://arxiv.org/pdf/2307.12108.pdf
[9] https://qz.com/ai-bots-recaptcha-turing-test-websites-authenticity-1850734350
[10] https://www.nature.com/articles/s41586-023-06419-4

Planning and production
Author: Science Scraps popular science team
Review: Yu Yang, Head of Tencent Security Xuanwu Lab
Planning: Cui Yinghao
Editor: Lin Lin