Gemma: Open Models Based on Gemini Research and Technology

Abstract

This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two model sizes (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of the safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.
