MMLM: Gemini. Translation and commentary on "Introducing Gemini: our largest and most capable AI model"

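Overview: On December 6, 2023, Google released Gemini, a large-scale multimodal model that marks a new stage in the development of Google's language models; by Google's account, its multimodal and general capabilities are clearly ahead of most current mainstream large models. It is Google's largest and most capable AI model to date. Gemini was built from the ground up to be multimodal, so it can generalize across and seamlessly understand, operate on and combine different types of information, including text, images, audio, video and code. This gives it sophisticated multimodal reasoning and advanced coding capabilities. It can power Google products, support more advanced customer-service interactions, content creation and marketing campaigns, and it performs well on natural-language, code-generation and competitive-programming tasks.

Background: As AI technology advances, language models keep improving, but existing models have shown weaknesses in multimodal processing and consistency.

Pain point addressed: Gemini targets the knowledge and abilities a future AI assistant should have: multimodal, general and reliable.

Solution:

>> Gemini is trained with multimodality in mind from the start, so it can understand and reason about all kinds of inputs natively.

>> Gemini exceeds the current state of the art on most of the language, image and knowledge benchmarks tested (30 of 32 for Gemini Ultra), which demonstrates its multimodal strength.

>> Gemini also performs well on natural-language, code-generation and competitive-programming tasks.

>> Gemini's three versions (Ultra, Pro and Nano) are optimized for different scenarios and run efficiently from servers to on-device.

>> The Gemini series is developed with an emphasis on responsibility and safety, using multiple layers of safeguards.

>> Gemini will be used in several Google products and will also be opened to developers through an API.

In short, Gemini substantially raises Google's models in multimodal capability, generality and runtime efficiency, addresses shortcomings of earlier models in these areas, and may help advance AI assistants.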

Translation and commentary on "Introducing Gemini: our largest and most capable AI model"

Note from Sundar

Introducing Gemini

State-of-the-art performance

See more details in our Gemini technical report.

Gemini surpasses state-of-the-art performance on a range of benchmarks including text and coding.

Gemini surpasses state-of-the-art performance on a range of multimodal benchmarks.

Next-generation capabilities

Learn more about Gemini’s capabilities and see how it works.

Sophisticated reasoning

Gemini unlocks new scientific insights

Understanding text, images, audio and more

Gemini explains reasoning in math and physics

Advanced coding

Gemini excels at coding and competitive programming

See more details in our AlphaCode 2 technical report.

Scalable and efficient

More reliable, scalable and efficient

A row of Cloud TPU v5p AI accelerator supercomputers in a Google data center.

Responsibility and safety

Built with responsibility and safety at the core

Availability

Making Gemini available to the world

Gemini Pro in Google products

Try Gemini online

Building with Gemini

Gemini Ultra coming soon

The Gemini era: enabling a future of innovation

Translation and commentary on "Introducing Gemini: our largest and most capable AI model"

Source

Introducing Gemini: Google’s most capable AI model yet

Date

December 6, 2023

Authors

Sundar Pichai

CEO of Google and Alphabet

Demis Hassabis

CEO and Co-Founder, Google DeepMind

Note from Sundar

A note from Google and Alphabet CEO Sundar Pichai:

Every technology shift is an opportunity to advance scientific discovery, accelerate human progress, and improve lives. I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it. AI has the potential to create opportunities — from the everyday to the extraordinary — for people everywhere. It will bring new waves of innovation and economic progress and drive knowledge, learning, creativity and productivity on a scale we haven’t seen before.

That’s what excites me: the chance to make AI helpful for everyone, everywhere in the world.

Nearly eight years into our journey as an AI-first company, the pace of progress is only accelerating: Millions of people are now using generative AI across our products to do things they couldn’t even a year ago, from finding answers to more complex questions to using new tools to collaborate and create. At the same time, developers are using our models and infrastructure to build new generative AI applications, and startups and enterprises around the world are growing with our AI tools.

This is incredible momentum, and yet, we’re only beginning to scratch the surface of what’s possible.

We’re approaching this work boldly and responsibly. That means being ambitious in our research and pursuing the capabilities that will bring enormous benefits to people and society, while building in safeguards and working collaboratively with governments and experts to address risks as AI becomes more capable. And we continue to invest in the very best tools, foundation models and infrastructure and bring them to our products and to others, guided by our AI Principles.

Now, we’re taking the next step on our journey with Gemini, our most capable and general model yet, with state-of-the-art performance across many leading benchmarks. Our first version, Gemini 1.0, is optimized for different sizes: Ultra, Pro and Nano. These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company. I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.

– Sundar

Introducing Gemini

By Demis Hassabis, CEO and Co-Founder of Google DeepMind, on behalf of the Gemini team

AI has been the focus of my life's work, as for many of my research colleagues. Ever since programming AI for computer games as a teenager, and throughout my years as a neuroscience researcher trying to understand the workings of the brain, I’ve always believed that if we could build smarter machines, we could harness them to benefit humanity in incredible ways.

This promise of a world responsibly empowered by AI continues to drive our work at Google DeepMind. For a long time, we’ve wanted to build a new generation of AI models, inspired by the way people understand and interact with the world. AI that feels less like a smart piece of software and more like something useful and intuitive — an expert helper or assistant.

Today, we’re a step closer to this vision as we introduce Gemini, the most capable and general model we’ve ever built.

Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.

Introducing Gemini: our largest and most capable AI model

Gemini is also our most flexible model yet — able to efficiently run on everything from data centers to mobile devices. Its state-of-the-art capabilities will significantly enhance the way developers and enterprise customers build and scale with AI.

We’ve optimized Gemini 1.0, our first version, for three different sizes:

>> Gemini Ultra — our largest and most capable model for highly complex tasks.

>> Gemini Pro — our best model for scaling across a wide range of tasks.

>> Gemini Nano — our most efficient model for on-device tasks.

State-of-the-art performance

We've been rigorously testing our Gemini models and evaluating their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.

With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities.

Our new benchmark approach to MMLU enables Gemini to use its reasoning capabilities to think more carefully before answering difficult questions, leading to significant improvements over just using its first impression.
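The post does not describe the mechanics of this approach. The Gemini technical report calls it uncertainty-routed chain-of-thought: the model samples several reasoning chains, takes the majority answer when the samples agree strongly enough, and otherwise falls back to its greedy answer. A minimal sketch of that routing step, where the function name, the default threshold and the way answers are represented are all assumptions made for illustration:

```python
from collections import Counter
from typing import Sequence


def route_answer(sampled_answers: Sequence[str], greedy_answer: str,
                 threshold: float = 0.6) -> str:
    """Pick a final answer from several sampled chain-of-thought answers.

    If the most common sampled answer reaches the agreement threshold, use it;
    otherwise fall back to the single greedy ("first impression") answer.
    The threshold value here is illustrative, not taken from the source.
    """
    if not sampled_answers:
        return greedy_answer
    top_answer, votes = Counter(sampled_answers).most_common(1)[0]
    agreement = votes / len(sampled_answers)
    return top_answer if agreement >= threshold else greedy_answer
```

For example, with sampled answers `["B", "B", "B", "A"]` the agreement is 0.75, so the majority answer is used; with four different sampled answers the agreement is 0.25 and the greedy answer is kept.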

Gemini Ultra also achieves a state-of-the-art score of 59.4% on the new MMMU benchmark, which consists of multimodal tasks spanning different domains requiring deliberate reasoning.

With the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models, without assistance from optical character recognition (OCR) systems that extract text from images for further processing. These benchmarks highlight Gemini’s native multimodality and indicate early signs of Gemini's more complex reasoning abilities.

See more details in our Gemini technical report.

Gemini surpasses state-of-the-art performance on a range of benchmarks including text and coding.

Gemini surpasses state-of-the-art performance on a range of multimodal benchmarks.

Next-generation capabilities

Until now, the standard approach to creating multimodal models involved training separate components for different modalities and then stitching them together to roughly mimic some of this functionality. These models can sometimes be good at performing certain tasks, like describing images, but struggle with more conceptual and complex reasoning.

We designed Gemini to be natively multimodal, pre-trained from the start on different modalities. Then we fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain.

Learn more about Gemini’s capabilities and see how it works.

Sophisticated reasoning

Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information. This makes it uniquely skilled at uncovering knowledge that can be difficult to discern amid vast amounts of data.

Its remarkable ability to extract insights from hundreds of thousands of documents through reading, filtering and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance.

Gemini unlocks new scientific insights

Understanding text, images, audio and more

Gemini 1.0 was trained to recognize and understand text, images, audio and more at the same time, so it better understands nuanced information and can answer questions relating to complicated topics. This makes it especially good at explaining reasoning in complex subjects like math and physics.

Gemini explains reasoning in math and physics

Advanced coding

Our first version of Gemini can understand, explain and generate high-quality code in the world’s most popular programming languages, like Python, Java, C++, and Go. Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world.

Gemini Ultra excels in several coding benchmarks, including HumanEval, an important industry-standard for evaluating performance on coding tasks, and Natural2Code, our internal held-out dataset, which uses author-generated sources instead of web-based information.
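HumanEval results are conventionally reported as pass@k: the probability that at least one of k sampled programs passes the unit tests for a problem. The post does not say which k or which sampling setup was used for Gemini, so the following is only a sketch of the standard unbiased estimator from the paper that introduced HumanEval, with `n` samples generated per problem and `c` of them passing:

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples drawn, c of them pass the unit tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark score is then the mean of `pass_at_k` over all problems. With k = 1 the estimate reduces to the fraction of passing samples, c / n.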

Gemini can also be used as the engine for more advanced coding systems. Two years ago we presented AlphaCode, the first AI code generation system to reach a competitive level of performance in programming competitions.

Using a specialized version of Gemini, we created a more advanced code generation system, AlphaCode 2, which excels at solving competitive programming problems that go beyond coding to involve complex math and theoretical computer science.

When evaluated on the same platform as the original AlphaCode, AlphaCode 2 shows massive improvements, solving nearly twice as many problems, and we estimate that it performs better than 85% of competition participants — up from nearly 50% for AlphaCode. When programmers collaborate with AlphaCode 2 by defining certain properties for the code samples to follow, it performs even better.

We’re excited for programmers to increasingly use highly capable AI models as collaborative tools that can help them reason about the problems, propose code designs and assist with implementation — so they can release apps and design better services, faster.

Gemini excels at coding and competitive programming

See more details in our AlphaCode 2 technical report.

Scalable and efficient

More reliable, scalable and efficient

We trained Gemini 1.0 at scale on our AI-optimized infrastructure using Google’s in-house designed Tensor Processing Units (TPUs) v4 and v5e. And we designed it to be our most reliable and scalable model to train, and our most efficient to serve.

On TPUs, Gemini runs significantly faster than earlier, smaller and less-capable models. These custom-designed AI accelerators have been at the heart of Google's AI-powered products that serve billions of users like Search, YouTube, Gmail, Google Maps, Google Play and Android. They’ve also enabled companies around the world to train large-scale AI models cost-efficiently.

Today, we’re announcing the most powerful, efficient and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models. This next generation TPU will accelerate Gemini’s development and help developers and enterprise customers train large-scale generative AI models faster, allowing new products and capabilities to reach customers sooner.

A row of Cloud TPU v5p AI accelerator supercomputers in a Google data center.

Responsibility and safety

Built with responsibility and safety at the core

At Google, we’re committed to advancing bold and responsible AI in everything we do. Building upon Google’s AI Principles and the robust safety policies across our products, we’re adding new protections to account for Gemini’s multimodal capabilities. At each stage of development, we’re considering potential risks and working to test and mitigate them.

Gemini has the most comprehensive safety evaluations of any Google AI model to date, including for bias and toxicity. We’ve conducted novel research into potential risk areas like cyber-offense, persuasion and autonomy, and have applied Google Research’s best-in-class adversarial testing techniques to help identify critical safety issues in advance of Gemini’s deployment.

To identify blindspots in our internal evaluation approach, we’re working with a diverse group of external experts and partners to stress-test our models across a range of issues.

To diagnose content safety issues during Gemini’s training phases and ensure its output follows our policies, we’re using benchmarks such as Real Toxicity Prompts, a set of 100,000 prompts with varying degrees of toxicity pulled from the web, developed by experts at the Allen Institute for AI. Further details on this work are coming soon.
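The post does not say which metrics were computed on Real Toxicity Prompts. The paper that introduced the benchmark reports two: expected maximum toxicity (the worst score among the generations for each prompt, averaged over prompts) and toxicity probability (the share of prompts with at least one generation scoring 0.5 or higher). A minimal sketch of both, assuming the per-generation toxicity scores have already been produced by some external scorer; the function name and input layout are hypothetical:

```python
from statistics import mean
from typing import Sequence, Tuple


def toxicity_metrics(scores_per_prompt: Sequence[Sequence[float]],
                     toxic_threshold: float = 0.5) -> Tuple[float, float]:
    """Return (expected maximum toxicity, toxicity probability).

    scores_per_prompt[i] holds toxicity scores in [0, 1] for the generations
    sampled from prompt i; the scorer itself is outside this sketch.
    """
    worst = [max(scores) for scores in scores_per_prompt]
    expected_max = mean(worst)
    toxic_prob = mean(1.0 if w >= toxic_threshold else 0.0 for w in worst)
    return expected_max, toxic_prob
```

Tracking both numbers across training checkpoints is one way such a benchmark could be used for diagnosis; whether Gemini's evaluation worked this way is not stated in the source.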

To limit harm, we built dedicated safety classifiers to identify, label and sort out content involving violence or negative stereotypes, for example. Combined with robust filters, this layered approach is designed to make Gemini safer and more inclusive for everyone. Additionally, we’re continuing to address known challenges for models such as factuality, grounding, attribution and corroboration.
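The layered idea, classifiers that label content followed by a filter that acts on the labels, can be illustrated with a toy sketch. Nothing here reflects Google's actual classifiers or policies: the `Verdict` type, the `moderate` function, the threshold and the keyword-based stand-in classifier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A classifier maps text to a score in [0, 1] for one harm category.
Classifier = Callable[[str], float]


@dataclass
class Verdict:
    labels: List[str]   # categories whose score reached the threshold
    allowed: bool       # False if any label was raised


def moderate(text: str, classifiers: Dict[str, Classifier],
             threshold: float = 0.5) -> Verdict:
    """Layer 1: label the text with every classifier. Layer 2: filter on the labels."""
    labels = [name for name, clf in classifiers.items() if clf(text) >= threshold]
    return Verdict(labels=labels, allowed=not labels)


# Stand-in classifier for demonstration only: a keyword check, not a trained model.
def toy_violence_classifier(text: str) -> float:
    return 1.0 if "attack" in text.lower() else 0.0
```

Keeping the labelling step separate from the filtering step means the same labels can feed different policies, for example blocking in one product and only logging in another.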

Responsibility and safety will always be central to the development and deployment of our models. This is a long-term commitment that requires building collaboratively, so we’re partnering with the industry and broader ecosystem on defining best practices and setting safety and security benchmarks through organizations like MLCommons, the Frontier Model Forum and its AI Safety Fund, and our Secure AI Framework (SAIF), which was designed to help mitigate security risks specific to AI systems across the public and private sectors. We’ll continue partnering with researchers, governments and civil society groups around the world as we develop Gemini.

Availability

Making Gemini available to the world

Gemini 1.0 is now rolling out across a range of products and platforms:

Gemini Pro in Google products

We’re bringing Gemini to billions of people through Google products.

Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more. This is the biggest upgrade to Bard since it launched. It will be available in English in more than 170 countries and territories, and we plan to expand to different modalities and support new languages and locations in the near future.

We’re also bringing Gemini to Pixel. Pixel 8 Pro is the first smartphone engineered to run Gemini Nano, which is powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp — with more messaging apps coming next year.

In the coming months, Gemini will be available in more of our products and services like Search, Ads, Chrome and Duet AI.

We’re already starting to experiment with Gemini in Search, where it's making our Search Generative Experience (SGE) faster for users, with a 40% reduction in latency in English in the U.S., alongside improvements in quality.

Try Gemini online

Product demo: https://bard.google.com/

Building with Gemini

Starting on December 13, developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.

Google AI Studio is a free, web-based developer tool to prototype and launch apps quickly with an API key. When it's time for a fully-managed AI platform, Vertex AI allows customization of Gemini with full data control and benefits from additional Google Cloud features for enterprise security, safety, privacy and data governance and compliance.

Android developers will also be able to build with Gemini Nano, our most efficient model for on-device tasks, via AICore, a new system capability available in Android 14, starting on Pixel 8 Pro devices. Sign up for an early preview of AICore.
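As a rough illustration of what calling Gemini Pro with an API key from Google AI Studio can look like, the sketch below builds a REST request using only the Python standard library. The endpoint root, the `gemini-pro` model name and the request body shape are assumptions based on the Gemini API as it was offered around launch; they are not given in the post and may have changed, so check the current API documentation before relying on them.

```python
import json
import urllib.request

# Assumed endpoint and model name; check the current Gemini API documentation.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str,
                  model: str = "gemini-pro") -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_ROOT}/models/{model}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.load(resp)
```

Separating `build_request` from `generate` keeps the request construction testable without a key or network access. In practice an official client library would likely be the more convenient route.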

Gemini Ultra coming soon

For Gemini Ultra, we’re currently completing extensive trust and safety checks, including red-teaming by trusted external parties, and further refining the model using fine-tuning and reinforcement learning from human feedback (RLHF) before making it broadly available.

As part of this process, we’ll make Gemini Ultra available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback before rolling it out to developers and enterprise customers early next year.

Early next year, we’ll also launch Bard Advanced, a new, cutting-edge AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.

The Gemini era: enabling a future of innovation

This is a significant milestone in the development of AI, and the start of a new era for us at Google as we continue to rapidly innovate and responsibly advance the capabilities of our models.

We’ve made great progress on Gemini so far and we’re working hard to further extend its capabilities for future versions, including advances in planning and memory, and increasing the context window for processing even more information to give better responses.

We’re excited by the amazing possibilities of a world responsibly empowered by AI — a future of innovation that will enhance creativity, extend knowledge, advance science and transform the way billions of people live and work around the world.
