Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

Abstract

Large language models (LLMs) possess impressive linguistic capabilities but often fail to faithfully retain factual knowledge, leading to hallucinations and unreliable outputs. Understanding LLMs' knowledge deficiencies by exhaustively evaluating them against full-scale knowledge bases is computationally prohibitive, especially for closed-weight models. We propose stochastic error ascent (SEA), a scalable and efficient framework for discovering knowledge deficiencies (errors) in closed-weight LLMs under a strict query budget. Rather than naively probing all knowledge candidates, SEA formulates error discovery as a stochastic optimization process: it iteratively retrieves new high-error candidates by leveraging their semantic similarity to previously observed failures. To further enhance search efficiency and coverage, SEA employs hierarchical retrieval across document and paragraph levels, and constructs a relation directed acyclic graph to model error propagation and identify systematic failure modes. Empirically, SEA uncovers 40.7x more knowledge errors than Automated Capability Discovery and 26.7% more than AutoBencher, while reducing the cost-per-error by 599x and 9x, respectively. Human evaluation confirms the high quality of the generated questions, while ablation and convergence analyses validate the contribution of each component in SEA. Further analysis of the discovered errors reveals correlated failure patterns across LLM families and recurring deficits, highlighting the need for better data coverage and targeted fine-tuning in future LLM development.
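
To make the core idea concrete, the following is a minimal sketch of a budget-limited search that prefers candidates similar to previously observed failures. It is an illustration under stated assumptions, not the SEA implementation: the function names `cosine` and `error_ascent`, the `is_error` callback, the greedy top-k selection, and the toy two-dimensional embeddings are all hypothetical, and the hierarchical retrieval and relation graph described in the abstract are omitted.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def error_ascent(embeddings, is_error, budget, batch_size, seed=0):
    """Greedy, budget-limited search for items the model gets wrong.

    embeddings: one vector per knowledge candidate (assumed precomputed).
    is_error:   hypothetical callback; is_error(i) is True if the model
                fails on item i. Each call counts as one query.
    Returns (failures, queried) as lists of item indices.
    """
    rng = random.Random(seed)
    unseen = set(range(len(embeddings)))
    failures, queried = [], []

    while unseen and len(queried) < budget:
        k = min(batch_size, budget - len(queried), len(unseen))
        if failures:
            # Rank unseen items by similarity to the closest known failure;
            # ties are broken by index so the order is deterministic.
            def score(i):
                return max(cosine(embeddings[i], embeddings[f]) for f in failures)
            batch = sorted(unseen, key=lambda i: (-score(i), i))[:k]
        else:
            # No failure observed yet: fall back to random exploration.
            batch = rng.sample(sorted(unseen), k)
        for i in batch:
            unseen.discard(i)
            queried.append(i)
            if is_error(i):
                failures.append(i)
    return failures, queried


# Toy data (invented for illustration): items 0-9 point roughly along the
# x-axis and are treated as errors; items 10-19 point roughly along the
# y-axis and are treated as correctly answered.
toy_rng = random.Random(1)
toy_embeddings = [(1.0, toy_rng.uniform(0.0, 0.2)) for _ in range(10)]
toy_embeddings += [(toy_rng.uniform(0.0, 0.2), 1.0) for _ in range(10)]

failures, queried = error_ascent(
    toy_embeddings, lambda i: i < 10, budget=12, batch_size=4
)
```

In this toy setup, once the random first batches hit one item from the error cluster, later batches are drawn from its neighbours first, which is the intuition behind steering a fixed query budget toward regions where failures have already been seen.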

