PERSONA: A Reproducible Testbed for Pluralistic Alignment

July 24, 2024
Authors: Louis Castricato, Nathan Lile, Rafael Rafailov, Jan-Philipp Fränken, Chelsea Finn
cs.AI

Abstract

The rapid advancement of language models (LMs) necessitates robust alignment with diverse user values. However, current preference optimization approaches often fail to capture the plurality of user opinions, instead reinforcing majority viewpoints and marginalizing minority perspectives. We introduce PERSONA, a reproducible testbed designed to evaluate and improve pluralistic alignment of LMs. We procedurally generate diverse user profiles from US census data, resulting in 1,586 synthetic personas with varied demographic and idiosyncratic attributes. We then generate a large-scale evaluation dataset containing 3,868 prompts and 317,200 feedback pairs obtained from our synthetic personas. Leveraging this dataset, we systematically evaluate LM capabilities in role-playing diverse users, verify these capabilities through human judges, and establish both a benchmark for pluralistic alignment approaches, PERSONA Bench, and an extensive dataset for creating new and future benchmarks. The full dataset and benchmarks are available at: https://www.synthlabs.ai/research/persona.
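
The abstract describes a pipeline that procedurally samples persona attributes from US census data and uses the resulting profiles to condition an LM as a role-played user. Below is a minimal sketch of that idea; the attribute names, marginal weights, idiosyncrasy list, and prompt wording are all illustrative assumptions, not the authors' actual pipeline, which draws on real census distributions not reproduced here.

    # Hypothetical sketch of procedural persona generation in the spirit of
    # PERSONA. All distributions and attribute names below are made up for
    # illustration; a census-grounded version would sample from joint
    # distributions (e.g., derived from PUMS microdata).
    import random

    ATTRIBUTES = {
        "age_bracket": (["18-29", "30-44", "45-64", "65+"],
                        [0.21, 0.25, 0.32, 0.22]),
        "education": (["high school", "some college", "bachelor's", "graduate"],
                      [0.28, 0.29, 0.27, 0.16]),
        "region": (["Northeast", "Midwest", "South", "West"],
                   [0.17, 0.21, 0.38, 0.24]),
        "political_lean": (["liberal", "moderate", "conservative"],
                           [0.25, 0.37, 0.38]),
    }

    # Idiosyncratic traits give each persona individuality beyond demographics.
    IDIOSYNCRASIES = [
        "keeps a vegetable garden",
        "writes short fiction",
        "volunteers at an animal shelter",
        "restores vintage radios",
    ]

    def sample_persona(rng: random.Random) -> dict:
        """Draw one synthetic persona from the illustrative distributions."""
        persona = {name: rng.choices(values, weights)[0]
                   for name, (values, weights) in ATTRIBUTES.items()}
        persona["idiosyncrasy"] = rng.choice(IDIOSYNCRASIES)
        return persona

    def to_system_prompt(persona: dict) -> str:
        """Render a persona as a role-play instruction for an LM."""
        attrs = "; ".join(f"{k}: {v}" for k, v in persona.items())
        return (f"You are role-playing a user with these attributes: {attrs}. "
                "Answer every prompt in character.")

    if __name__ == "__main__":
        rng = random.Random(0)  # fixed seed keeps the persona set reproducible
        for _ in range(3):
            print(to_system_prompt(sample_persona(rng)))

Under this sketch, feedback pairs such as those in the evaluation dataset could be collected by asking each persona-conditioned model to compare candidate responses to a prompt; the fixed seed mirrors the testbed's stated goal of reproducibility.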
