NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment
May 2, 2024
作者: Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev
cs.AI
Abstract
Aligning Large Language Models (LLMs) with human values and preferences is essential for making them helpful and safe. However, building efficient tools to perform alignment can be challenging, especially for the largest and most capable LLMs, which often contain tens or hundreds of billions of parameters. We present NeMo-Aligner, a toolkit for model alignment that can efficiently scale to hundreds of GPUs for training. NeMo-Aligner comes with highly optimized and scalable implementations of the major model-alignment paradigms, including Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), SteerLM, and Self-Play Fine-Tuning (SPIN). Additionally, our toolkit supports running most of these alignment techniques in a Parameter-Efficient Fine-Tuning (PEFT) setting. NeMo-Aligner is designed for extensibility, allowing new alignment techniques to be supported with minimal effort. It is open-sourced under the Apache 2.0 license, and we invite community contributions at https://github.com/NVIDIA/NeMo-Aligner
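
As background for readers, one of the paradigms named above, DPO, trains the policy with a contrastive log-likelihood objective against a frozen reference model. The standard formulation from Rafailov et al. (2023) is sketched below for reference; it is not drawn from NeMo-Aligner's source and may differ in detail from the toolkit's implementation:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]
$$

Here $\pi_\theta$ is the policy being trained, $\pi_{\mathrm{ref}}$ is the frozen reference model (typically the supervised fine-tuned checkpoint), $(y_w, y_l)$ are the preferred and dispreferred responses for prompt $x$, $\beta$ controls the strength of the implicit KL penalty, and $\sigma$ is the logistic sigmoid.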