PERL: Parameter Efficient Reinforcement Learning from Human Feedback

March 15, 2024
作者: Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Simral Chaudhary, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon
cs.AI

Abstract

Reinforcement Learning from Human Feedback (RLHF) has proven to be a strong method for aligning pretrained Large Language Models (LLMs) with human preferences. But training models with RLHF is computationally expensive and a complex process overall. In this work, we study RLHF where the underlying models are trained using the parameter-efficient method of Low-Rank Adaptation (LoRA) introduced by Hu et al. [2021]. We investigate the setup of "Parameter Efficient Reinforcement Learning" (PERL), in which we perform reward model training and reinforcement learning using LoRA. We compare PERL to conventional fine-tuning (full-tuning) across various configurations on 7 benchmarks of reward modeling and reinforcement learning, including 2 novel datasets. We find that PERL performs on par with the conventional RLHF setting, while training faster and using less memory. This preserves the high performance of RLHF while reducing the computational burden that limits its adoption as an alignment technique for Large Language Models. We also release 2 novel thumbs-up/down preference datasets, "Taskmaster Coffee" and "Taskmaster Ticketing", to promote research around RLHF.
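The parameter-efficient method the abstract refers to, LoRA [Hu et al., 2021], freezes the pretrained weight matrix W and learns only two small low-rank factors whose product approximates the weight update. The toy sketch below illustrates that idea with plain Python lists; the matrix sizes, names, and values are illustrative only, not the paper's actual training setup.

```python
# Illustrative sketch of Low-Rank Adaptation (LoRA), the technique PERL
# applies to reward model training and reinforcement learning. A frozen
# d_out x d_in weight W is adapted as W + (alpha / r) * B @ A, where only
# B (d_out x r) and A (r x d_in) are trained: 2*d*r parameters instead of d*d.

def matmul(M, N):
    """Naive product of two list-of-lists matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def lora_forward(W, A, B, x, alpha, r):
    """Apply a LoRA-adapted linear layer to the input vector x."""
    delta = matmul(B, A)                 # low-rank update B @ A
    scale = alpha / r                    # standard LoRA scaling factor
    d_out, d_in = len(W), len(W[0])
    return [sum((W[i][j] + scale * delta[i][j]) * x[j] for j in range(d_in))
            for i in range(d_out)]

# Toy example: d_in = d_out = 2, rank r = 1 (values are hypothetical).
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (identity here)
B = [[1.0], [0.0]]             # trainable factor, d_out x r
A = [[0.0, 2.0]]               # trainable factor, r x d_in
x = [3.0, 4.0]

y = lora_forward(W, A, B, x, alpha=1.0, r=1)
# B @ A = [[0, 2], [0, 0]], so the adapted weight is [[1, 2], [0, 1]]
print(y)  # → [11.0, 4.0]
```

At rank r much smaller than the hidden dimension, the trainable parameter count shrinks dramatically, which is the source of the speed and memory savings the abstract reports.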
