Estimating the Hallucination Rate of Generative AI
June 11, 2024
Authors: Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David Blei
cs.AI
Abstract
This work is about estimating the hallucination rate for in-context learning
(ICL) with Generative AI. In ICL, a conditional generative model (CGM) is
prompted with a dataset and asked to make a prediction based on that dataset.
The Bayesian interpretation of ICL assumes that the CGM is calculating a
posterior predictive distribution over an unknown Bayesian model of a latent
parameter and data. With this perspective, we define a hallucination
as a generated prediction that has low probability under the true latent
parameter. We develop a new method that takes an ICL problem -- that is, a CGM,
a dataset, and a prediction question -- and estimates the probability that the
CGM will generate a hallucination. Our method only requires generating queries
and responses from the model and evaluating its response log probability. We
empirically evaluate our method on synthetic regression and natural language
ICL tasks using large language models.
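As a rough illustration of the procedure the abstract describes, the sketch below estimates a hallucination rate by Monte Carlo: sample queries and responses from the model conditioned on the in-context dataset, score each response's log probability, and report the fraction falling below a threshold. The `generate_query`, `generate_response`, and `response_log_prob` methods and the fixed threshold are hypothetical stand-ins for a CGM's sampling and scoring interfaces, not the authors' exact estimator.

```python
# Minimal sketch (assumed interface, not the paper's estimator) using only
# the operations the abstract names: sampling queries/responses from the
# model and evaluating response log probabilities.
# `cgm` is assumed to expose three hypothetical methods:
#   generate_query(dataset)              -> a query x sampled given the dataset
#   generate_response(dataset, query)    -> a response y sampled given (dataset, x)
#   response_log_prob(dataset, query, y) -> log p(y | x, dataset) under the model

def estimate_hallucination_rate(cgm, dataset, n_samples=100, log_prob_threshold=-5.0):
    """Fraction of sampled responses whose log probability, conditioned on
    the in-context dataset, falls below `log_prob_threshold`."""
    n_low = 0
    for _ in range(n_samples):
        query = cgm.generate_query(dataset)
        response = cgm.generate_response(dataset, query)
        if cgm.response_log_prob(dataset, query, response) < log_prob_threshold:
            n_low += 1
    return n_low / n_samples
```

In the paper's Bayesian framing, the relevant criterion is low probability under the true latent parameter, which is unobserved; the hard-coded threshold here only marks where that criterion would enter.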