
Tailoring Self-Rationalizers with Multi-Reward Distillation

November 6, 2023
Authors: Sahana Ramnath, Brihi Joshi, Skyler Hallinan, Ximing Lu, Liunian Harold Li, Aaron Chan, Jack Hessel, Yejin Choi, Xiang Ren
cs.AI

Abstract

Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant scales (e.g., 175B-parameter GPT-3); and 2) focuses largely on downstream performance, ignoring the semantics of the rationales themselves, e.g., are they faithful, true, and helpful for humans? In this work, we enable small-scale LMs (approx. 200x smaller than GPT-3) to generate rationales that not only improve downstream task performance but are also more plausible, consistent, and diverse, as assessed by both automatic and human evaluation. Our method, MaRio (Multi-rewArd RatIOnalization), is a multi-reward conditioned self-rationalization algorithm that optimizes multiple distinct properties such as plausibility, diversity, and consistency. Results on five difficult question-answering datasets (StrategyQA, QuaRel, OpenBookQA, NumerSense, and QASC) show that MaRio not only improves task accuracy but also improves the self-rationalization quality of small LMs along the aforementioned axes more than a supervised fine-tuning (SFT) baseline. Extensive human evaluations confirm that MaRio rationales are preferred over SFT rationales and show qualitative improvements in plausibility and consistency.
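
The abstract does not spell out how MaRio conditions generation on multiple rewards, so the sketch below is only one plausible illustration of the general technique it names: score each rationale with several reward functions, quantize each score into a discrete level, and prepend one control token per reward to the training sequence (in the style of quantized reward conditioning, as in Quark). All names here, including `plausibility_reward`, `conditioned_example`, the bin count, and the token format, are hypothetical and not taken from the paper.

```python
# A minimal sketch of multi-reward conditioned self-rationalization,
# loosely in the spirit of MaRio as described in the abstract.
# The reward functions, bin counts, and control-token scheme are
# illustrative assumptions, not the paper's actual implementation.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Rationale:
    question: str
    text: str
    answer: str

def plausibility_reward(r: Rationale) -> float:
    """Stub: would score whether the rationale supports the answer
    (e.g., with a trained acceptability classifier)."""
    return 0.8  # placeholder value

def diversity_reward(r: Rationale) -> float:
    """Stub: a crude proxy that rewards low token repetition."""
    tokens = r.text.split()
    return len(set(tokens)) / max(len(tokens), 1)

def consistency_reward(r: Rationale) -> float:
    """Stub: would check that the rationale entails the predicted answer."""
    return 0.6  # placeholder value

REWARDS: dict[str, Callable[[Rationale], float]] = {
    "plaus": plausibility_reward,
    "div": diversity_reward,
    "consist": consistency_reward,
}

def quantize(score: float, bins: int = 3) -> int:
    """Map a [0, 1] reward score into one of `bins` discrete levels."""
    return min(int(score * bins), bins - 1)

def conditioned_example(r: Rationale) -> str:
    """Prefix the training sequence with one control token per reward,
    so the LM learns to associate token levels with rationale quality.
    At inference time, the highest-level tokens would be supplied to
    steer generation toward high-reward rationales."""
    control = " ".join(
        f"<{name}_{quantize(fn(r))}>" for name, fn in REWARDS.items()
    )
    return f"{control} Q: {r.question} Rationale: {r.text} A: {r.answer}"

if __name__ == "__main__":
    sample = Rationale(
        question="Can a sunflower grow outdoors in the Arctic in winter?",
        text="Sunflowers need long hours of sunlight; the Arctic winter is dark.",
        answer="no",
    )
    print(conditioned_example(sample))
```

The design point the sketch tries to capture is that each quality axis gets its own conditioning signal rather than being collapsed into a single scalar reward, which is what lets the model be steered on plausibility, diversity, and consistency independently.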