Welcome to "High-tech Classes for Children", a special winter vacation column from Science Popularization China! Artificial intelligence is one of today's most cutting-edge technologies, and it is changing our lives at astonishing speed. From smart voice assistants to driverless cars, from AI painting to machine learning, it opens up a future full of possibilities. This column uses videos and text to explain, in an easy-to-understand way, the principles and applications of artificial intelligence and its profound impact on society. Come and start this AI journey with us!

This is a little girl named Susan from England, and this is her father, Adam. The father and daughter have something in common. Look carefully at their photos. Can you spot it?

AI-generated images

Here is the answer: both photos were generated by AI. Susan and Adam do not exist, and their identities are made up. You may be a little surprised. In the past, no matter how realistic the characters in games and animated films were, you could still tell at a glance that they were computer-generated. The people in these two pictures, however, look almost exactly like real people. Besides generating photos of people, AI can also draw pictures in many different styles on request. In today's episode, let's talk about how AI draws such pictures.

Generative Adversarial Networks

Behind AI-generated images there is a very important technology called GAN, short for Generative Adversarial Network. It was proposed by Ian Goodfellow and his colleagues in 2014. The name sounds sophisticated, but the principle is actually easy to understand. Suppose we want to build a GAN that draws pictures of human faces. This network has two important members: the generator and the discriminator.
The generator's task is to produce portrait images. These generated images are mixed with photos of real people and handed to the discriminator, which has to work out which ones came from the generator and which ones are real photos. If a generated photo fools the discriminator, the generator scores a point; otherwise the discriminator scores a point. As you can imagine, the generator's first attempts are very crude, and you can pick them out at a glance among real photos. But after thousands of rounds of training, the generated images get closer and closer to real photos. Along the way, the discriminator also has to sharpen its judgment in order to keep scoring, and to deceive the increasingly clever discriminator, the generator has to keep improving too. After tens of millions of rounds of this contest, AI can draw extremely realistic portraits.

Of course, GANs are not limited to portraits. People can also have them draw images in different styles. For example, if a GAN is used to generate paintings in the style of Picasso, its discriminator no longer judges whether a picture looks like a real person. Instead, it judges which pictures are authentic Picasso works and which are AI-generated. With training of this kind, pictures in many different styles can be drawn. Well-known GAN variants such as StyleGAN build on this adversarial idea.

Besides GAN, there is another image generation technology: the diffusion model, of which Stable Diffusion is the best-known example. Recently popular image generators such as Midjourney are generally understood to rely on models of this kind. Simply put, a diffusion model starts from a picture made of nothing but random noise and removes the noise step by step until the expected image appears.
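For readers who know a little Python, the idea of removing noise step by step can be pictured with a toy sketch. It is not how Stable Diffusion is actually implemented: a real diffusion model uses a trained neural network to predict the noise, while this toy cheats and works the noise out from a known answer. The function `denoise`, the five-number "picture" `target`, and the `rate` setting are all made up for illustration.

```python
import random


def denoise(noisy, target, steps=50, rate=0.2):
    """Toy denoiser: each step removes a fraction of the estimated noise."""
    image = list(noisy)
    for _ in range(steps):
        # A real model predicts the noise with a trained neural network.
        # This toy cheats: it computes the noise from a known target picture.
        estimated_noise = [pixel - goal for pixel, goal in zip(image, target)]
        image = [pixel - rate * noise
                 for pixel, noise in zip(image, estimated_noise)]
    return image


random.seed(0)
target = [0.0, 0.5, 1.0, 0.5, 0.0]              # hypothetical 5-pixel "picture"
noisy = [random.gauss(0, 1) for _ in target]    # start from pure random noise
result = denoise(noisy, target)

print("start :", [round(p, 2) for p in noisy])
print("finish:", [round(p, 2) for p in result])
```

Each pass removes only a fifth of the remaining noise, so the picture sharpens gradually instead of appearing all at once, which is the part of the idea this sketch is meant to show.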
The image generation software we are familiar with today also has another very important function: generating images from a description written in natural language. This is not easy, but fortunately two technologies make it possible.

The first is image recognition. Over the past few decades, applications from autonomous driving to searching for objects in pictures have all relied on AI to recognize what an image contains. To make this work, humans labeled the contents of a huge number of pictures and used them to train AI, so that it can now recognize a wide variety of things.

The second is natural language processing. Over the past few decades, people have been trying to make AI understand what we write and say, so that it can better grasp the meaning of the text we give it. The Chinese word for owl, 猫头鹰, is written with the characters for "cat", "head", and "eagle". When you say "there is an owl in the tree", the computer knows you are talking about one bird, not about a cat, a head, and an eagle sitting in the tree.

As image recognition and natural language processing matured, a technology called cross-modal retrieval emerged. A modality is the form in which data exists, such as text, image, or video. Cross-modal retrieval links data of different modalities, for example linking the word "cup" in a text to the image of a cup in a picture. With the help of cross-modal retrieval, AI can turn the text we type into image information.

Today, AI image generation based on GANs and diffusion models is widely used. Beyond images, the same kind of technology is used extensively to generate music, video, and text. Since the end of 2022, many companies have even announced that they will use AI painters to replace human painters, and on social media we may come across AI-generated pictures and videos as well.
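The cross-modal matching described above, linking the word "cup" to a picture of a cup, can be sketched in a few lines of Python. This is only an illustration under strong assumptions: real systems learn their number lists (called vectors) from huge amounts of data, whereas the vectors, file names, and the `best_image` helper below are invented by hand for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors standing in for what a trained model would produce.
# Words and pictures live in the same 3-number space so they can be compared.
text_vectors = {
    "cup": [0.9, 0.1, 0.0],
    "owl": [0.1, 0.9, 0.2],
}
image_vectors = {
    "cup.jpg": [0.8, 0.2, 0.1],
    "owl.jpg": [0.0, 1.0, 0.3],
    "tree.jpg": [0.1, 0.3, 0.9],
}


def best_image(word):
    """Return the name of the picture whose vector is closest to the word's."""
    query = text_vectors[word]
    return max(image_vectors,
               key=lambda name: cosine(query, image_vectors[name]))


print(best_image("cup"))
print(best_image("owl"))
```

The key design choice is that text and images share one space of numbers. Once both kinds of data are described that way, "which picture matches this word?" becomes a simple question of which vectors are closest.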
Of course, some people have expressed concerns about AI-generated pictures and videos. They are so realistic that people with bad intentions may use them to commit fraud or spread rumors. Many AI companies have taken this into account and have begun to place restrictions on the AI services they provide. Many countries have also begun to consider improving their laws and regulations on AI-generated content. I believe that as the rules improve and the technology develops further, the benefits of AI will far outweigh its drawbacks, and AI will ultimately serve humanity better.

Planning and production
This article is a work of the Science Popularization China Creation Cultivation Program.
Produced by: Science Popularization Department of China Association for Science and Technology
Producer: China Science and Technology Press Co., Ltd.; Beijing Zhongke Xinghe Culture Media Co., Ltd.
Author: Beijing Yunyuji Culture Communication Co., Ltd.
Reviewer: Qin Zengchang, Associate Professor, School of Automation Science and Electrical Engineering, Beihang University
Planning: Fu Sijia
Editor: Fu Sijia
Proofreading: Xu Lailinlin