OLMo: Accelerating the Science of Language Models

Abstract

Language models (LMs) have become ubiquitous in both NLP research and commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Because these details are essential to studying the models scientifically, including their biases and potential risks, we believe the research community must have access to powerful, truly open LMs. To this end, this technical report details the first release of OLMo, a state-of-the-art, truly Open Language Model, together with its framework for building and studying the science of language modeling. Unlike most prior efforts, which have released only model weights and inference code, we release OLMo and the whole framework, including the training data and the training and evaluation code. We hope this release will empower and strengthen the open research community and inspire a new wave of innovation.
