Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

Abstract

Large language models (LLMs) possess impressive linguistic capabilities but often fail to faithfully retain factual knowledge, leading to hallucinations and unreliable outputs. Understanding LLMs' knowledge deficiencies by exhaustively evaluating them against full-scale knowledge bases is computationally prohibitive, especially for closed-weight models. We propose stochastic error ascent (SEA), a scalable and efficient framework for discovering knowledge deficiencies (errors) in closed-weight LLMs under a strict query budget. Rather than naively probing all knowledge candidates, SEA formulates error discovery as a stochastic optimization process: it iteratively retrieves new high-error candidates by leveraging their semantic similarity to previously observed failures. To further enhance search efficiency and coverage, SEA employs hierarchical retrieval across the document and paragraph levels, and constructs a relation directed acyclic graph to model error propagation and identify systematic failure modes. Empirically, SEA uncovers 40.7x more knowledge errors than Automated Capability Discovery and 26.7% more than AutoBencher, while reducing the cost-per-error by 599x and 9x, respectively. Human evaluation confirms the high quality of the generated questions, while ablation and convergence analyses validate the contribution of each component in SEA. Further analysis of the discovered errors reveals correlated failure patterns across LLM families and recurring deficits, highlighting the need for better data coverage and targeted fine-tuning in future LLM development.
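The core loop described in the abstract (spend a fixed query budget, and prefer candidates that resemble earlier failures) can be illustrated with a toy sketch. This is not the authors' implementation: the names `embed`, `cosine`, and `error_ascent` are hypothetical, the bag-of-words embedding is a stand-in for a real sentence encoder, `is_error` is an assumed callback that queries the model and grades its answer, and the hierarchical retrieval and relation graph are omitted entirely.

```python
import math
import random
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumed stand-in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def error_ascent(candidates, is_error, budget, batch_size=2, seed=0):
    """Query at most `budget` candidates, favoring ones similar to past failures.

    `is_error(candidate)` is a hypothetical callback that returns True when
    the model under test answers the candidate incorrectly.
    """
    rng = random.Random(seed)
    remaining = list(candidates)
    rng.shuffle(remaining)  # random order until a first failure is observed
    failures, queried = [], 0
    while remaining and queried < budget:
        if failures:
            fail_vecs = [embed(f) for f in failures]
            # Rank unqueried candidates by similarity to the closest failure.
            remaining.sort(
                key=lambda c: max(cosine(embed(c), f) for f in fail_vecs),
                reverse=True,
            )
        batch = remaining[:min(batch_size, budget - queried)]
        remaining = remaining[len(batch):]
        for c in batch:
            queried += 1
            if is_error(c):
                failures.append(c)
    return failures, queried
```

A call such as `error_ascent(paragraphs, is_error, budget=100)` would return the failing paragraphs found and the number of queries spent. In this sketch the budget counts one query per candidate; a real setup would more likely count model calls or tokens.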
