Hyper-multi-step: The Truth Behind Difficult Long-context Tasks
October 6, 2024
Author: Yijiong Yu
cs.AI
Abstract
Long-context language models (LCLMs), characterized by their extensive context
windows, are becoming increasingly popular. Meanwhile, many long-context
benchmarks present challenging tasks that even the most advanced LCLMs struggle
to complete. However, the underlying sources of these difficulties have seldom
been studied. To bridge this gap, we conduct experiments showing that the
difficulty stems primarily from two basic issues: "multi-matching retrieval,"
which requires the simultaneous retrieval of multiple items, and "logic-based
retrieval," which necessitates logical judgment within the retrieval criteria.
These two problems, while seemingly straightforward, actually exceed the
capabilities of LCLMs because they prove to be hyper-multi-step in nature,
demanding numerous steps to solve. This finding explains why LLMs struggle with
more advanced long-context tasks and provides a more accurate perspective for
rethinking solutions to them.