
Nemotron-4 15B Technical Report

February 26, 2024
Authors: Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick LeGresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi, Jonathan Cohen, Bryan Catanzaro
cs.AI

Abstract

We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. Nemotron-4 15B demonstrates strong performance when assessed on English, multilingual, and coding tasks: it outperforms all existing similarly-sized open models on 4 out of 7 downstream evaluation areas and achieves competitive performance to the leading open models in the remaining ones. Specifically, Nemotron-4 15B exhibits the best multilingual capabilities of all similarly-sized models, even outperforming models over four times larger and those explicitly specialized for multilingual tasks.