ReALM: Reference Resolution As Language Modeling

Abstract

Reference resolution is an important problem, one that is essential for understanding and successfully handling context of different kinds. This context includes both previous conversational turns and non-conversational entities, such as entities on the user's screen or those running in the background. Although LLMs have proven extremely powerful across a variety of tasks, they remain underutilized for reference resolution, particularly for non-conversational entities. This paper demonstrates how LLMs can be used to build an extremely effective system for resolving references of various types, by showing how reference resolution can be converted into a language modeling problem, even though it involves entities, such as those on screen, that are not traditionally conducive to being reduced to a text-only modality. We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4: our smallest model achieves performance comparable to that of GPT-4, and our larger models substantially outperform it.
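To make the idea of converting reference resolution into a language modeling problem more concrete, the sketch below shows one way on-screen entities might be flattened into text and combined with earlier turns into a single prompt. It is only an illustration: the abstract does not specify an encoding, so the `Entity` fields, the reading-order sort, the prompt layout, and the numeric answer format are all assumptions, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    # All fields are hypothetical; the abstract does not define an entity schema.
    kind: str   # assumed type label, e.g. "phone_number"
    text: str   # visible text of the entity
    top: int    # assumed vertical screen position
    left: int   # assumed horizontal screen position


def screen_to_text(entities):
    """Serialize entities in reading order (top to bottom, then left to right)."""
    ordered = sorted(entities, key=lambda e: (e.top, e.left))
    lines = [f"{i}. [{e.kind}] {e.text}" for i, e in enumerate(ordered, start=1)]
    return ordered, "\n".join(lines)


def build_prompt(entities, history, query):
    """Combine screen entities, previous turns, and the request into one text prompt."""
    ordered, screen = screen_to_text(entities)
    prompt = (
        "Screen entities:\n" + screen + "\n\n"
        "Previous turns:\n" + "\n".join(history) + "\n\n"
        "User request: " + query + "\n"
        "Relevant entity numbers:"
    )
    return ordered, prompt


def parse_answer(ordered, answer):
    """Map a completion such as '2' or '1, 3' back to entities, ignoring invalid numbers."""
    picked = []
    for token in answer.replace(",", " ").split():
        if token.isdigit() and 1 <= int(token) <= len(ordered):
            picked.append(ordered[int(token) - 1])
    return picked
```

Under these assumptions, a language model would be asked to complete the prompt with entity numbers, and `parse_answer` would map that completion back to the entities the user referred to.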
