Deep Bayesian Active Learning for Preference Modeling in Large Language Models

June 14, 2024
Authors: Luckeciano C. Melo, Panagiotis Tigas, Alessandro Abate, Yarin Gal
cs.AI

Abstract

Leveraging human preferences for steering the behavior of Large Language Models (LLMs) has demonstrated notable success in recent years. Nonetheless, data selection and labeling are still a bottleneck for these systems, particularly at large scale. Hence, selecting the most informative points for acquiring human feedback may considerably reduce the cost of preference labeling and unleash the further development of LLMs. Bayesian Active Learning provides a principled framework for addressing this challenge and has demonstrated remarkable success in diverse settings. However, previous attempts to employ it for Preference Modeling did not meet such expectations. In this work, we identify that naive epistemic uncertainty estimation leads to the acquisition of redundant samples. We address this by proposing the Bayesian Active Learner for Preference Modeling (BAL-PM), a novel stochastic acquisition policy that not only targets points of high epistemic uncertainty according to the preference model but also seeks to maximize the entropy of the acquired prompt distribution in the feature space spanned by the employed LLM. Notably, our experiments demonstrate that BAL-PM requires 33% to 68% fewer preference labels in two popular human preference datasets and exceeds previous stochastic Bayesian acquisition policies.
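
To make the acquisition rule described above concrete, below is a minimal sketch of a BAL-PM-style scoring and sampling step. It assumes an ensemble as the approximation of the preference model's posterior and uses the log-distance to the nearest already-acquired prompt as a stand-in for the entropy term over the acquired prompt distribution; the function names, the `beta` weighting, and both estimators are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def epistemic_uncertainty(ensemble_probs):
    """BALD-style mutual information for binary preference labels,
    estimated from an ensemble (one common Bayesian approximation).

    ensemble_probs: array (n_models, n_candidates) of P(response A is
    preferred over response B) under each ensemble member.
    """
    def bern_entropy(p):
        p = np.clip(p, 1e-12, 1.0 - 1e-12)
        return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

    predictive = bern_entropy(ensemble_probs.mean(axis=0))  # H[mean p]
    expected = bern_entropy(ensemble_probs).mean(axis=0)    # mean H[p]
    return predictive - expected

def entropy_gain_proxy(candidate_feats, acquired_feats):
    """Illustrative proxy for the entropy term: log-distance from each
    candidate prompt to its nearest already-acquired prompt in the LLM
    feature space. Prompts far from everything acquired so far would
    raise the entropy of the acquired prompt distribution."""
    if acquired_feats.shape[0] == 0:
        return np.zeros(candidate_feats.shape[0])
    d = np.linalg.norm(
        candidate_feats[:, None, :] - acquired_feats[None, :, :], axis=-1)
    return np.log(d.min(axis=1) + 1e-12)

def acquire(ensemble_probs, candidate_feats, acquired_feats,
            batch_size=8, beta=1.0, temperature=1.0, rng=None):
    """Stochastic acquisition: sample candidates in proportion to a
    softmax over combined scores instead of greedy top-k selection."""
    if rng is None:
        rng = np.random.default_rng()
    scores = (epistemic_uncertainty(ensemble_probs)
              + beta * entropy_gain_proxy(candidate_feats, acquired_feats))
    logits = scores / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(scores.shape[0], size=batch_size,
                      replace=False, p=probs)
```

In this sketch, the entropy proxy discourages acquiring prompts that sit close to already-labeled ones in the LLM's feature space, which is how BAL-PM counters the redundant acquisitions that the abstract attributes to naive epistemic uncertainty estimation, while sampling from a softmax (rather than taking the top-k) keeps the policy stochastic.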
