Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
August 30, 2023
作者: Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Alham Fikri Aji, Zhengzhong Liu, Andy Hock, Andrew Feldman, Jonathan Lee, Andrew Jackson, Preslav Nakov, Timothy Baldwin, Eric Xing
cs.AI
Abstract
We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric
foundation and instruction-tuned open generative large language models (LLMs).
The models are based on the GPT-3 decoder-only architecture and are pretrained
on a mixture of Arabic and English texts, including source code in various
programming languages. With 13 billion parameters, they demonstrate better
knowledge and reasoning capabilities in Arabic than any existing open Arabic
and multilingual models by a sizable margin, based on extensive evaluation.
Moreover, the models are competitive in English compared to English-centric
open models of similar size, despite being trained on much less English data.
We provide a detailed description of the training, the tuning, the safety
alignment, and the evaluation of the models. We release two open versions of
the model -- the foundation Jais model, and an instruction-tuned Jais-chat
variant -- with the aim of promoting research on Arabic LLMs. Available at
https://huggingface.co/inception-mbzuai/jais-13b-chat