When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
January 16, 2026
Authors: Zhongxiang Sun, Yi Zhan, Chenglei Shen, Weijie Yu, Xiao Zhang, Ming He, Jun Xu
cs.AI
Abstract
Personalized large language models (LLMs) adapt model behavior to individual users to enhance user satisfaction, yet personalization can inadvertently distort factual reasoning. We show that when personalized LLMs face factual queries, they often generate answers aligned with a user's prior history rather than with the objective truth. These personalization-induced hallucinations degrade factual reliability and may propagate incorrect beliefs, and we trace them to entanglement between personalization and factual representations. To address this issue, we propose Factuality-Preserving Personalized Steering (FPPS), a lightweight inference-time approach that mitigates personalization-induced factual distortions while preserving personalized behavior. We further introduce PFQABench, the first benchmark designed to jointly evaluate factual and personalized question answering under personalization. Experiments across multiple LLM backbones and personalization methods show that FPPS substantially improves factual accuracy while maintaining personalized performance.
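
The abstract does not spell out how FPPS intervenes at inference time, but the name suggests it belongs to the activation-steering family: adding or removing a direction in the model's hidden states during generation. The sketch below illustrates that general idea with a forward hook on one decoder layer; the backbone name, the layer index, the scaling factor `alpha`, and the randomly initialized steering direction are all placeholders for illustration, not the paper's actual method or settings.

```python
# Minimal sketch of inference-time activation steering (an assumption about
# the general technique family FPPS belongs to, not the paper's exact method).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # placeholder backbone
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.eval()

# Hypothetical steering direction: in practice this would be estimated, e.g.
# from activation differences between factual and personalization-dominated
# prompts at a chosen layer; here it is random purely for illustration.
layer_idx, alpha = 20, 4.0
steer = torch.randn(model.config.hidden_size, dtype=torch.bfloat16)
steer = steer / steer.norm()

def hook(module, inputs, output):
    # Decoder layers return a tuple whose first element is the hidden states;
    # subtract the scaled direction to damp the personalization signal.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden - alpha * steer.to(hidden.device)
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[layer_idx].register_forward_hook(hook)
try:
    prompt = "Given my past chats, what year did the Apollo 11 landing happen?"
    ids = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**ids, max_new_tokens=64, do_sample=False)
    print(tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True))
finally:
    handle.remove()  # restore the unsteered model
```

Presumably, to match the abstract's claim of preserving personalized behavior, such an intervention would be applied selectively (e.g. only on queries identified as factual) and with a steering direction derived from contrasting factual versus personalization-driven representations rather than a random vector; both choices are assumptions here.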