Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Abstract

While understanding the knowledge boundaries of LLMs is crucial for preventing hallucination, research on these boundaries has predominantly focused on English. In this work, we present the first study to analyze how LLMs recognize knowledge boundaries across different languages by probing their internal representations when processing known and unknown questions in multiple languages. Our empirical studies reveal three key findings: 1) LLMs' perceptions of knowledge boundaries are encoded in the middle to middle-upper layers across different languages; 2) language differences in knowledge boundary perception follow a linear structure, which motivates our proposal of a training-free alignment method that effectively transfers knowledge boundary perception ability across languages, thereby helping reduce hallucination risk in low-resource languages; 3) fine-tuning on bilingual question pair translation further enhances LLMs' recognition of knowledge boundaries across languages. Given the absence of standard testbeds for cross-lingual knowledge boundary analysis, we construct a multilingual evaluation suite comprising three representative types of knowledge boundary data. Our code and datasets are publicly available at https://github.com/DAMO-NLP-SG/LLM-Multilingual-Knowledge-Boundaries.
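To make findings 1) and 2) concrete, the following is a minimal sketch on synthetic vectors, not the authors' implementation. It assumes that a linear probe (here a simple difference-of-means classifier) is fit on "hidden states" from a source language, and that the language difference is a constant offset that can be removed by mean-shifting. The abstract does not name the alignment method, so mean-shifting is an assumption made for illustration, and every function name and parameter below is hypothetical.

```python
import numpy as np


def make_reps(rng, n, direction, offset, noise=0.5):
    """Synthetic 'hidden states': label 1 = known question, 0 = unknown."""
    labels = rng.integers(0, 2, size=n)
    signs = 2.0 * labels - 1.0
    reps = signs[:, None] * direction + offset
    reps = reps + noise * rng.normal(size=(n, direction.size))
    return reps, labels


def fit_probe(reps, labels):
    """Difference-of-means linear probe; returns weights and bias."""
    mu_known = reps[labels == 1].mean(axis=0)
    mu_unknown = reps[labels == 0].mean(axis=0)
    w = mu_known - mu_unknown
    b = -0.5 * w @ (mu_known + mu_unknown)
    return w, b


def accuracy(w, b, reps, labels):
    """Fraction of examples the probe classifies correctly."""
    preds = (reps @ w + b > 0).astype(int)
    return float((preds == labels).mean())


def mean_shift(target, source):
    """Assumed training-free alignment: move the target-language mean
    onto the source-language mean."""
    return target - target.mean(axis=0) + source.mean(axis=0)


def run_demo(seed=0, dim=16, n=500):
    rng = np.random.default_rng(seed)
    # Shared known/unknown direction across both synthetic languages.
    direction = np.zeros(dim)
    direction[0] = 2.0
    # The target language differs from the source by a constant offset.
    offset_src = np.zeros(dim)
    offset_tgt = rng.normal(size=dim)
    offset_tgt[0] = 5.0
    src_reps, src_labels = make_reps(rng, n, direction, offset_src)
    tgt_reps, tgt_labels = make_reps(rng, n, direction, offset_tgt)
    # Fit the probe on the source language only.
    w, b = fit_probe(src_reps, src_labels)
    raw = accuracy(w, b, tgt_reps, tgt_labels)
    aligned = accuracy(w, b, mean_shift(tgt_reps, src_reps), tgt_labels)
    return raw, aligned


if __name__ == "__main__":
    raw_acc, aligned_acc = run_demo()
    print(f"target accuracy without alignment: {raw_acc:.2f}")
    print(f"target accuracy after mean shift:  {aligned_acc:.2f}")
```

In this toy setting the source-language probe transfers poorly to the shifted target language until the means are aligned. Mean-shifting implicitly assumes a similar ratio of known to unknown questions in both languages, which real data may not satisfy.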
