
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

August 30, 2023
Authors: Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Alham Fikri Aji, Zhengzhong Liu, Andy Hock, Andrew Feldman, Jonathan Lee, Andrew Jackson, Preslav Nakov, Timothy Baldwin, Eric Xing
cs.AI

Abstract

We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code in various programming languages. With 13 billion parameters, they demonstrate better knowledge and reasoning capabilities in Arabic than any existing open Arabic and multilingual models by a sizable margin, based on extensive evaluation. Moreover, the models are competitive in English compared to English-centric open models of similar size, despite being trained on much less English data. We provide a detailed description of the training, the tuning, the safety alignment, and the evaluation of the models. We release two open versions of the model -- the foundation Jais model, and an instruction-tuned Jais-chat variant -- with the aim of promoting research on Arabic LLMs. Available at https://huggingface.co/inception-mbzuai/jais-13b-chat