Late-night blockbuster! Google releases the most powerful AI model Gemini, which "beats" GPT-4 in 30 benchmark tests

After months of anticipation, Google's long-awaited large model, Gemini, is finally here.

Google CEO Sundar Pichai and Google DeepMind CEO Demis Hassabis described it as "a huge leap forward for AI models" and said it "will eventually impact almost all of Google's products." Pichai said in a statement, "These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we've undertaken as a company." Google released three models this time: Gemini Nano, Gemini Pro, and Gemini Ultra. Among them:

Gemini Nano is a lightweight version that runs natively and offline on Android devices such as the Pixel 8 Pro;

Gemini Pro is a more powerful version that will soon power a host of Google AI services and is being plugged into Bard starting today (a rough API sketch follows this list);

Gemini Ultra is the most powerful large model Google has created so far, designed primarily for data centers and enterprise applications and scheduled to launch next year.
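
As a rough illustration of how developers might call the Gemini Pro tier once API access opens, here is a minimal sketch. It assumes Google's generative AI Python SDK (google-generativeai) and the model name "gemini-pro"; neither detail comes from the article itself.

```python
# Minimal sketch (assumption): calling Gemini Pro via the google-generativeai
# Python SDK. The model name "gemini-pro" and the availability of this API are
# assumptions for illustration, not details confirmed in this article.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content(
    "Summarize what a Tensor Processing Unit is in two sentences."
)
print(response.text)
```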

In terms of performance, Gemini is ahead of GPT-4 in 30 out of 32 benchmarks, including broad tests such as the massive multitask language understanding (MMLU) benchmark, as well as tests of its ability to generate Python code.

Figure | Gemini outperforms the state of the art on a range of text and coding benchmarks.

Figure | Gemini outperforms the state-of-the-art across a range of multi-modal benchmarks.

In addition, Gemini Ultra scored 90.0% on MMLU (massive multitask language understanding), making it the first model to surpass human experts on the benchmark, which spans 57 subjects including mathematics, physics, history, law, medicine, and ethics to test both world knowledge and problem-solving ability.

Gemini’s clearest advantage in these benchmarks comes from its ability to understand and interact with video and audio as well as text. This is largely by design: multimodality has been part of the Gemini project from the beginning. Rather than training separate models for images and speech, as OpenAI did with DALL-E and Whisper, Google built a single “multisensory” model from the start. Demis Hassabis says Google has always been interested in very general systems, and in particular in how to mix all of these modes: collecting as much data as possible from any number of inputs and senses, and then giving equally diverse responses.
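
To make that multimodal design concrete, here is a minimal sketch of a single request that mixes an image with a text instruction. It assumes the same google-generativeai Python SDK and a vision-capable model name such as "gemini-pro-vision"; both are assumptions for illustration rather than details from the article.

```python
# Minimal sketch (assumption): a mixed image + text request, assuming the
# google-generativeai SDK and a vision-capable model named "gemini-pro-vision".
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-pro-vision")
image = Image.open("chart.png")  # any local image file

# One request can carry multiple modalities: a text prompt plus an image.
response = model.generate_content(
    ["Describe the trend shown in this chart in one sentence.", image]
)
print(response.text)
```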

Right now, Gemini's most basic mode is text in, text out, but more powerful variants such as Gemini Ultra can also process images, video, and audio. Demis Hassabis said Gemini will eventually gain capabilities such as motion and touch, closer to robotics, and will pick up more senses over time, becoming more aware, more accurate, and better grounded in the process: "these models will better understand the world around them." The Gemini models will, of course, still produce hallucinations.

Benchmarks aren't everything, though. The true test of Gemini's capabilities will come from everyday users who want it to brainstorm, find information, write code, and more. Google appears to see coding in particular as Gemini's killer app: it powers a new code-generation system called AlphaCode 2, which Google says outperforms 85% of coding-competition participants, compared with roughly 50% for the original AlphaCode.

Just as important to Google, Gemini is a markedly more efficient model. It was trained on Google's own Tensor Processing Units and runs faster and cheaper than previous Google models such as PaLM. Alongside the new model, Google also announced a new version of its TPU system, TPU v5p, a computing system designed for training and running large-scale models in data centers.

It is worth noting that Gemini is currently available only in English, with other languages to follow. Sundar Pichai said the model will eventually be integrated into Google's search engine, advertising products, the Chrome browser, and more.

The AI era ushered in by ChatGPT has now lasted a year. Does the release of Gemini mean Google has caught up? Or can Google once again stand at the top of the artificial intelligence industry?

Attached: Statement from Sundar Pichai, CEO of Google and Alphabet:

Every technological revolution is an important opportunity for scientific discovery, human progress, and improved lives. I firmly believe that the artificial intelligence (AI) transformation we are experiencing will be the most profound change of our generation, far bigger than the earlier shifts to mobile or to the web. AI will not only create opportunities for people around the world, from the everyday to the extraordinary, but will also drive a new wave of knowledge, learning, creativity, and productivity on a scale we have never seen before.

That’s what excites me: making AI useful to everyone in the world.

We’re nearly eight years into our journey as an AI-first company. And the pace of progress has not slowed down, it’s accelerated: Today, millions of people are using generative AI in our products to do things that weren’t possible just last year, like answering more complex questions and using new tools to collaborate and innovate. At the same time, developers around the world are using our models and infrastructure to build new generative AI applications, and startups and enterprises of all sizes are using our AI tools to grow.

It’s an incredible dynamic, but we’ve only begun to explore the possibilities.

We are doing this in a bold and responsible way. This means pursuing ambitious goals in our research, developing technologies that can bring great benefits to people and society, while also building in safeguards and working with governments and experts to address the risks that emerge as AI capabilities grow. We continue to invest in the best tools, foundational models, and infrastructure, and we use our AI principles as a guide to continually improve our products and services.

Today, we are taking the next step in our journey with the launch of Gemini, our most advanced and general model yet, which performs extremely well on multiple leading benchmarks. Our first release, Gemini 1.0, is optimized for different scales, including Ultra, Pro, and Nano. These are our first models entering the Gemini era, and the first realization of the vision we set when we founded Google DeepMind earlier this year. Models in this new era represent one of the largest scientific and engineering efforts we have undertaken as a company. I am incredibly excited about the upcoming developments and the opportunities that Gemini will bring to people around the world.

–Sundar

Reference Links:

https://blog.google/technology/ai/google-gemini-ai/#capabilities
https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf
