Hyper-multi-step: The Truth Behind Difficult Long-context Tasks

Abstract

Long-context language models (LCLMs), characterized by their extensive context windows, are becoming increasingly popular. Meanwhile, many long-context benchmarks present challenging tasks that even the most advanced LCLMs struggle to complete. However, the underlying sources of difficulty in these tasks have seldom been studied. To bridge this gap, we conduct experiments indicating that their difficulty stems primarily from two basic issues: "multi-matching retrieval," which requires the simultaneous retrieval of multiple items, and "logic-based retrieval," which necessitates logical judgment within the retrieval criteria. These two problems, while seemingly straightforward, actually exceed the capabilities of LCLMs because they are proven to be hyper-multi-step in nature, that is, they demand numerous steps to solve. This finding could explain why LLMs struggle with more advanced long-context tasks, and it provides a more accurate perspective for rethinking solutions to them.
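As a rough illustration of the two retrieval problems named in the abstract, the following minimal sketch contrasts them with ordinary single-item lookup. The key-value data and all function names are hypothetical and chosen only for illustration; they are not taken from the paper's experiments. The point is that both harder variants require examining every item in the context rather than locating a single one.

```python
# Hypothetical toy "context": a long document reduced to key-value pairs.
# The data and function names below are illustrative assumptions only.
context = {"k1": 17, "k2": 42, "k3": 17, "k4": 8, "k5": 42}


def direct_retrieval(ctx, key):
    """Ordinary retrieval: one query key maps to exactly one item."""
    return ctx[key]


def multi_matching_retrieval(ctx, value):
    """Multi-matching retrieval: return every key whose value equals the query.

    Each item must be checked, so the number of steps grows with context size.
    """
    return [k for k, v in ctx.items() if v == value]


def logic_based_retrieval(ctx, threshold):
    """Logic-based retrieval: the criterion is a logical test (here, v > threshold).

    A comparison has to be evaluated for each item, not a simple exact match.
    """
    return [k for k, v in ctx.items() if v > threshold]


if __name__ == "__main__":
    print(direct_retrieval(context, "k4"))
    print(multi_matching_retrieval(context, 17))
    print(logic_based_retrieval(context, 20))
```

For a program, each of these is a simple loop. The abstract's claim is that for an LCLM the per-item checks add up to many implicit steps, which is what makes the tasks hyper-multi-step.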
