In the summer of 2013, a nondescript post appeared on Google's open source blog titled "Learning the Meaning Behind Words." "Computers are not very good at understanding human language right now, and while we are still some way from that goal, we are making significant progress using the latest machine learning and natural language processing techniques," the post said. Google had taken a massive amount of human language data from print media and the internet, thousands of times larger than the largest previous dataset, fed it into a biologically inspired "neural network," and had the system look for correlations and connections between words. Using what is called "unsupervised learning," the system began to discover patterns. It noticed, for example, that the word "Beijing" relates to "China" in the same way that "Moscow" relates to "Russia," whatever those words actually mean. Could the computer be said to have "understood"? That is a question only philosophers can answer, but it is clear that the system had captured something of the essence of what it was "reading." Google named the system "word2vec," for words to vectors of numbers, and made it open source.

To mathematicians, vectors have all kinds of wonderful properties. You can treat them almost like simple numbers, adding, subtracting, and multiplying them. Doing exactly that, the researchers soon discovered something startling and unexpected. They called it "linguistic regularities in continuous-space word representations," and explaining it is not as hard as it sounds. Word2vec turns words into vectors so that you can do mathematical operations on them. Enter China + river and you get the Yangtze River. Enter Paris − France + Italy and you get Rome. Enter king − man + woman and you get queen. The results were astonishing.
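To make that vector arithmetic concrete, here is a minimal sketch using the open-source gensim library and a pretrained word2vec model; gensim and the model file path are illustrative assumptions, not details from the original post.

```python
# A minimal sketch of word-vector arithmetic, assuming the gensim library
# and a locally available pretrained word2vec file (the path below is
# illustrative; Google's published GoogleNews vectors are one option).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# king - man + woman: the nearest remaining word is typically "queen".
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# Paris - France + Italy: typically "Rome" (capitalization follows the corpus).
print(vectors.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=1))
```

Under the hood, most_similar normalizes the word vectors, adds the positive ones, subtracts the negative ones, and returns the nearest remaining word by cosine similarity, which is the "linguistic regularity" described above.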
The word2vec system found its way into Google's machine translation and search engines; industry adopted it in other areas as well, such as recruiting; and it became an essential tool for a new generation of data-driven linguists across science and engineering. Two years passed before anyone realized there was a problem.

Machine learning broadly comprises three areas: unsupervised learning, in which the machine is simply handed a pile of data, as the word2vec system was, and asked to find patterns, regularities, and useful ways to distill, represent, or visualize it; supervised learning, in which the system is given a set of categorized or labeled examples to learn from, such as whether a parolee went on to reoffend, and then uses the learned model to make predictions about new examples it has never seen or for which the ground truth is not yet known; and reinforcement learning, in which the system is placed in an environment of rewards and punishments, like a course where prizes and dangers coexist, and asked to find the best way to minimize punishment and maximize reward.

There is a growing awareness that the world has come to depend, in all sorts of ways, on mathematical and computational models from the field of machine learning. These models, some as simple as spreadsheets and others complex enough to be called AI, are gradually replacing both human judgment and more traditional, explicitly programmed software. This is happening not just in technology and business but in areas with ethical and moral weight. The justice system increasingly uses "risk assessment" software to determine bail and parole. Vehicles on the road are increasingly driving themselves. Our loan applications, resumes, and medical exams are increasingly evaluated by software rather than by people. As we move deeper into the 21st century, more and more of us are working to make the world, figuratively and literally, drive itself.

In recent years, two different groups have sounded the alarm. The first group is concerned about the ethical risks of current technology. If facial recognition systems are markedly less accurate for a particular ethnicity or gender, or if someone is denied bail on the basis of an unaudited statistical model that no one in the courtroom, not the judge, the lawyers, or the defendant, understands, there is a problem. Such problems cannot be solved within any single traditional discipline; they require dialogue among computer scientists, sociologists, lawyers, policy experts, and ethicists. That dialogue has begun.

The second group is concerned about the dangers that lie ahead. As our systems become more flexible and make decisions in real time, in both the virtual and the physical worlds, those dangers draw closer. Without question, the past ten years have seen the most exciting progress in the history of AI and machine learning, and also the most sudden and worrying. At the same time, an unspoken taboo has gradually been broken, and AI researchers are no longer shy about discussing safety. In fact, over the past five years this concern has gone from the fringe to the mainstream of the field.

While there is debate over whether immediate or long-term problems should take priority, the two groups agree on the big picture. As machine learning systems become more common and more powerful, we will find ourselves more and more often in the position of the sorcerer's apprentice: we summon a force, give it a set of instructions, expect it to be autonomous yet completely obedient, and then, once we realize the instructions are imprecise or incomplete, scramble to stop it, lest we get, in some clever and terrible way, exactly what we asked for.

How to prevent such catastrophic divergence, how to ensure that these models capture our norms and values, understand what we mean or intend, and, above all, do what we want, has become one of the most central and pressing questions in computer science. It is called the alignment problem.

As the research frontier moves ever closer to so-called "general" intelligence, and as real-world machine learning systems touch ever more of the ethics of personal and public life, there has been a sudden, energetic response to this warning. A diverse group is reaching across traditional disciplinary boundaries. Nonprofits, think tanks, and research institutes are taking up the cause. A growing number of leaders in industry and academia are speaking out, and research funding is rising in response. The first generation of graduate students specializing in machine learning ethics and safety has enrolled. The first responders to the alignment problem have arrived on the scene.

Machine learning is ostensibly a technical problem, but it increasingly involves human ones. Human, social, and public problems are becoming technical; technical problems are becoming human, social, and public. Our successes and failures in getting these systems to behave the way we want them to behave turn out to provide a real and revealing mirror in which to see ourselves. For better or worse, the story of humanity over the next century is likely to be one of building and launching all sorts of intelligent systems.
Like the sorcerer’s apprentice, we will find ourselves just one of many autonomous agents in a world filled with brooms. How should we teach them? What should we teach them?