PrefPalette: Personalized Preference Modeling with Latent Attributes

Abstract

Personalizing AI systems requires understanding not just what users prefer, but also the reasons that underlie those preferences; yet current preference models typically treat human judgment as a black box. We introduce PrefPalette, a framework that decomposes preferences into attribute dimensions and tailors its preference predictions to the values of distinct social communities in a human-interpretable manner. PrefPalette operationalizes a cognitive science principle known as multi-attribute decision making in two ways: (1) a scalable counterfactual attribute synthesis step that generates synthetic training data to isolate the effects of individual attributes (e.g., formality, humor, cultural values), and (2) attention-based preference modeling that learns how different social communities dynamically weight these attributes. This approach moves beyond aggregate preference modeling to capture the diverse evaluation frameworks that drive human judgment. When evaluated on 45 social communities from the online platform Reddit, PrefPalette outperforms GPT-4o by 46.6% in average prediction accuracy. Beyond raw predictive improvements, PrefPalette also sheds light on intuitive, community-specific profiles: scholarly communities prioritize verbosity and stimulation, conflict-oriented communities value sarcasm and directness, and support-based communities emphasize empathy. By modeling the attribute-mediated structure of human judgment, PrefPalette delivers both superior preference modeling and transparent, interpretable insights, and serves as a first step toward more trustworthy, value-aware personalized applications.
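To make the idea of community-specific attribute weighting concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes each attribute has a small key vector, each community has a query vector, and a response's preference score is the attention-weighted sum of its per-attribute scores. All names (`attribute_attention`, `preference_score`) and all numeric values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attribute_attention(community_query, attribute_keys):
    """Return one attention weight per attribute; the weights sum to 1."""
    names = list(attribute_keys)
    scale = math.sqrt(len(community_query))
    logits = [dot(community_query, attribute_keys[n]) / scale for n in names]
    return dict(zip(names, softmax(logits)))


def preference_score(weights, attribute_scores):
    """Attention-weighted sum of a response's per-attribute scores."""
    return sum(weights[n] * attribute_scores[n] for n in weights)


# Hypothetical toy values, chosen only for illustration.
attribute_keys = {
    "formality": [1.0, 0.0],
    "humor": [0.0, 1.0],
    "empathy": [0.5, 0.5],
}
community_query = [0.2, 2.0]  # a made-up community that leans toward humor

weights = attribute_attention(community_query, attribute_keys)

# Per-attribute scores for two candidate responses (also made up).
response_a = {"formality": 0.9, "humor": 0.1, "empathy": 0.3}
response_b = {"formality": 0.2, "humor": 0.8, "empathy": 0.4}

score_a = preference_score(weights, response_a)
score_b = preference_score(weights, response_b)
preferred = "A" if score_a > score_b else "B"
print(weights, preferred)
```

In this toy setup the weights themselves are the interpretable part: inspecting `weights` shows which attributes the hypothetical community emphasizes, which mirrors the kind of community profile the abstract describes.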
