CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners

Abstract

Knowledge Editing (KE) enables the modification of outdated or incorrect information in large language models (LLMs). While existing KE methods can update isolated facts, they struggle to generalize these updates to multi-hop reasoning tasks that depend on the modified knowledge. Through an analysis of reasoning circuits, the neural pathways LLMs use for knowledge-based inference, we observe that current layer-localized KE approaches such as MEMIT and WISE, which edit only one or a few model layers, fail to effectively incorporate updated information into these reasoning pathways. To address this limitation, we propose CaKE (Circuit-aware Knowledge Editing), a novel method that enables more effective integration of updated knowledge in LLMs. CaKE leverages strategically curated data, guided by our circuit-based analysis, that forces the model to use the modified knowledge and stimulates it to develop appropriate reasoning circuits for the newly integrated knowledge. Experimental results show that CaKE enables more accurate and consistent use of updated knowledge across related reasoning tasks, yielding an average 20% improvement in multi-hop reasoning accuracy on the MQuAKE dataset compared to existing KE methods. We release the code and data at https://github.com/zjunlp/CaKE.

