OLMo: Accelerating the Science of Language Models
February 1, 2024
Authors: Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
cs.AI
Abstract
Language models (LMs) have become ubiquitous in both NLP research and
commercial product offerings. As their commercial importance has surged, the
most powerful models have become closed off, gated behind proprietary
interfaces, with important details of their training data, architectures, and
development undisclosed. Given the importance of these details in
scientifically studying these models, including their biases and potential
risks, we believe it is essential for the research community to have access to
powerful, truly open LMs. To this end, this technical report details the first
release of OLMo, a state-of-the-art, truly Open Language Model and its
framework to build and study the science of language modeling. Unlike most
prior efforts that have only released model weights and inference code, we
release OLMo and the whole framework, including training data and training and
evaluation code. We hope this release will empower and strengthen the open
research community and inspire a new wave of innovation.