Lost in the Middle: How Language Models Use Long Contexts

Abstract

While recent language models can take long contexts as input, relatively little is known about how well they use longer contexts. We analyze language model performance on two tasks that require identifying relevant information within the input context: multi-document question answering and key-value retrieval. We find that performance is often highest when relevant information occurs at the beginning or end of the input context, and degrades significantly when models must access relevant information in the middle of long contexts. Furthermore, performance decreases substantially as the input context grows longer, even for explicitly long-context models. Our analysis provides a better understanding of how language models use their input context and offers new evaluation protocols for future long-context models.
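To make the key-value retrieval setup concrete, the following is a minimal sketch of how one might build a synthetic input in which the queried pair sits at a controlled position. The abstract does not specify the data format, so the random UUID keys and values, the JSON serialization, the prompt wording, and the function names `build_kv_prompt` and `position_sweep` are all assumptions made for illustration.

```python
import json
import uuid


def build_kv_prompt(num_pairs: int, target_index: int) -> tuple[str, str, str]:
    """Return (prompt, target_key, target_value) with the queried pair at target_index.

    Hypothetical helper: the UUID keys/values, JSON layout, and prompt wording
    are assumptions of this sketch, not details taken from the abstract.
    """
    if not 0 <= target_index < num_pairs:
        raise ValueError("target_index must be in [0, num_pairs)")
    # Random UUID strings stand in for keys and values.
    pairs = [(str(uuid.uuid4()), str(uuid.uuid4())) for _ in range(num_pairs)]
    target_key, target_value = pairs[target_index]
    # dict() and json.dumps() both preserve insertion order, so the queried
    # pair stays at the chosen position inside the serialized context.
    context = json.dumps(dict(pairs), indent=1)
    prompt = (
        "Extract the value corresponding to the specified key "
        "in the JSON object below.\n\n"
        f"{context}\n\n"
        f'Key: "{target_key}"\nCorresponding value:'
    )
    return prompt, target_key, target_value


def position_sweep(num_pairs: int, num_positions: int) -> list[int]:
    """Evenly spaced target positions from the first pair to the last."""
    if num_positions < 2:
        return [0]
    return [i * (num_pairs - 1) // (num_positions - 1) for i in range(num_positions)]
```

Running the same query once per index returned by `position_sweep` and scoring whether the model's output contains `target_value` would give an accuracy-by-position curve, which is the kind of measurement the abstract describes.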
