Investigating Human-Aligned Large Language Model Uncertainty
March 16, 2025
Authors: Kyle Moore, Jesse Roberts, Daryl Watson, Pamela Wisniewski
cs.AI
Abstract
Recent work has sought to quantify large language model uncertainty to facilitate model control and modulate user trust. Previous work focuses on measures of uncertainty that are theoretically grounded or that reflect the model's average overt behavior. In this work, we investigate a variety of uncertainty measures in order to identify those that correlate with human group-level uncertainty. We find that Bayesian measures and a variation on entropy measures, top-k entropy, tend to agree with human behavior as a function of model size. We also find that some strong measures decrease in human-similarity as model size grows, but, by multiple linear regression, we find that combining multiple uncertainty measures provides comparable human-alignment with reduced size-dependency.
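
The abstract does not spell out how top-k entropy is computed. Below is a minimal Python sketch, assuming it denotes Shannon entropy over the k highest-probability next-token options after renormalization; this reading is an assumption, not the paper's stated definition.

```python
import numpy as np

def top_k_entropy(probs: np.ndarray, k: int = 10) -> float:
    """Shannon entropy over the k most probable options, renormalized.

    ASSUMPTION: the paper's exact definition is not given in the
    abstract; this sketch computes entropy of the renormalized
    top-k next-token (or answer-option) distribution.
    """
    probs = np.asarray(probs, dtype=float)
    top_k = np.sort(probs)[-k:]       # k largest probabilities
    top_k = top_k / top_k.sum()       # renormalize to a distribution
    return float(-(top_k * np.log(top_k)).sum())

# A peaked distribution yields lower top-k entropy than a flat one.
peaked = np.array([0.90, 0.05, 0.03, 0.01, 0.01])
flat = np.array([0.20, 0.20, 0.20, 0.20, 0.20])
print(top_k_entropy(peaked, k=3), top_k_entropy(flat, k=3))
```

Restricting the computation to the top k options is one way to keep the measure from being dominated by the long tail of near-zero-probability tokens in a large vocabulary.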
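The combination step is described only as multiple linear regression over several uncertainty measures. The following hedged sketch illustrates that idea with synthetic stand-in data; the measure names, shapes, and target are illustrative and not the paper's actual dataset or pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# HYPOTHETICAL data: one row per test item. Columns are per-item model
# uncertainty measures (e.g. entropy, top-k entropy, a Bayesian measure);
# y is human group-level uncertainty per item (e.g. response entropy
# across a population of annotators).
rng = np.random.default_rng(0)
X = rng.random((200, 3))  # [entropy, top_k_entropy, bayesian_measure]
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0, 0.05, 200)

reg = LinearRegression().fit(X, y)
print("R^2 vs. human uncertainty:", reg.score(X, y))
print("per-measure weights:", reg.coef_)
```

The fitted coefficients indicate how much each individual measure contributes to predicting human uncertainty, which is one way a combined predictor could match human alignment while depending less on model size.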