Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Abstract

Large language models (LLMs) can suggest missing elements from items listed in a prompt, which can be used for list completion or for recommendations based on a user's history. However, their performance degrades when they are presented with too many items: they start to suggest items that are already included in the input list. This occurs at around 100 items for mid-2024 flagship LLMs. We evaluate this phenomenon on both synthetic problems (e.g., finding missing numbers in a given range of shuffled integers) and realistic movie recommendation scenarios. We refer to this issue as attention overflow, because preventing repetition requires attending to all items simultaneously. Although iterative loops can mitigate this problem, their cost increases with the repetition rate, which affects the language models' ability to derive novelty from lengthy inputs.
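The synthetic setting and the iterative-loop mitigation mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names (`make_missing_number_task`, `repetition_rate`, `suggest_until_novel`) and the `suggest` callable standing in for a model call are all hypothetical.

```python
import random
from typing import Callable, List, Optional, Set, Tuple


def make_missing_number_task(
    n: int, n_missing: int, seed: int = 0
) -> Tuple[List[int], Set[int]]:
    """Shuffle the integers 1..n and hide `n_missing` of them.

    Returns the shuffled list shown to the model and the hidden set.
    """
    rng = random.Random(seed)
    numbers = list(range(1, n + 1))
    rng.shuffle(numbers)
    missing = set(numbers[:n_missing])
    shown = numbers[n_missing:]
    return shown, missing


def repetition_rate(shown: List[int], suggestions: List[int]) -> float:
    """Fraction of suggestions that were already present in the input list."""
    if not suggestions:
        return 0.0
    shown_set = set(shown)
    return sum(s in shown_set for s in suggestions) / len(suggestions)


def suggest_until_novel(
    shown: List[int],
    suggest: Callable[[List[int]], int],  # hypothetical stand-in for an LLM call
    max_rounds: int = 5,
) -> Tuple[Optional[int], int]:
    """Iterative loop: query again until the suggestion is not in the input.

    Returns the first novel suggestion (or None) and the number of calls made.
    """
    shown_set = set(shown)
    for calls in range(1, max_rounds + 1):
        candidate = suggest(shown)
        if candidate not in shown_set:
            return candidate, calls
    return None, max_rounds
```

With any callable passed as `suggest`, `suggest_until_novel` reports how many calls were needed before a non-repeated item appeared. Under the simplifying assumption that each call repeats an input item independently with probability r, the expected number of calls is 1 / (1 - r), which is one way to read the statement that the cost of the loop grows with the repetition rate.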
