Benchmarking Chinese Knowledge Rectification in Large Language Models

Abstract

While Large Language Models (LLMs) exhibit remarkable generative capabilities, they are not without flaws, particularly in the form of hallucinations. This issue is even more pronounced when LLMs are applied to specific languages and domains. For example, LLMs may generate nonsensical information when handling Chinese ancient poetry, proverbs, or idioms, owing to a lack of specific knowledge. To this end, this paper introduces a benchmark for rectifying Chinese knowledge in LLMs via knowledge editing. Specifically, we introduce a new Chinese dataset, CKnowEdit, built by collecting seven types of knowledge from various sources, including classical texts, idioms, and content from Baidu Tieba Ruozhiba, thereby accounting for the unique polyphony, antithesis, and logical constructs inherent in the Chinese language. Through analysis of this dataset, we uncover the challenges that current LLMs face in mastering Chinese. Furthermore, our evaluation of state-of-the-art knowledge editing techniques on this dataset reveals substantial room for improvement in the rectification of Chinese knowledge. Code and dataset are available at https://github.com/zjunlp/EasyEdit.
