LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models
October 12, 2024
Authors: Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Rongqiao An, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun
cs.AI
Abstract
Enlarging the context window of large language models (LLMs) has become a
crucial research area, particularly for applications involving extremely long
texts. In this work, we propose a novel training-free framework for processing
long texts, utilizing a divide-and-conquer strategy to achieve comprehensive
document understanding. The proposed LLM×MapReduce framework splits the
entire document into several chunks for LLMs to read and then aggregates the
intermediate answers to produce the final output. The main challenge for
divide-and-conquer long text processing frameworks lies in the risk of losing
essential long-range information when splitting the document, which can lead
the model to produce incomplete or incorrect answers based on the segmented
texts. Disrupted long-range information can be classified into two categories:
inter-chunk dependency and inter-chunk conflict. We design a structured
information protocol to better cope with inter-chunk dependency and an
in-context confidence calibration mechanism to resolve inter-chunk conflicts.
Experimental results demonstrate that LLM×MapReduce can outperform
representative open-source and commercial long-context LLMs, and is applicable
to several different models.
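To make the described pipeline concrete, the following is a minimal Python sketch of the divide-and-conquer loop outlined in the abstract: the document is split into chunks, each chunk is answered independently with a self-reported confidence score (a simple stand-in for the paper's in-context confidence calibration), and the intermediate answers are then aggregated into a final response. The chunking scheme, prompt wording, and the `call_llm` stub are illustrative assumptions, not the paper's actual structured information protocol.

```python
# Minimal sketch of a divide-and-conquer (map-reduce) pipeline in the spirit
# of the abstract. The chunking scheme, prompt wording, and the `call_llm`
# stub are illustrative assumptions, not the paper's actual protocol.

from dataclasses import dataclass
from typing import List


@dataclass
class IntermediateAnswer:
    """Structured map-stage output: an answer plus a self-reported confidence,
    used later to arbitrate inter-chunk conflicts."""
    answer: str
    confidence: float  # in [0, 1], elicited from the model in-context


def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API; swap in a real client."""
    raise NotImplementedError


def split_into_chunks(document: str, chunk_size: int = 4000) -> List[str]:
    # Naive fixed-size splitting; the paper's structured information protocol
    # would also carry cross-chunk context, which is omitted here.
    return [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]


def map_stage(chunks: List[str], question: str) -> List[IntermediateAnswer]:
    # Each chunk is read independently and must report a confidence score,
    # a crude approximation of in-context confidence calibration.
    results = []
    for chunk in chunks:
        reply = call_llm(
            f"Read the passage and answer the question.\n"
            f"Passage:\n{chunk}\n\nQuestion: {question}\n"
            f"Reply exactly as: ANSWER: <text> | CONFIDENCE: <0-1>"
        )
        answer, _, conf = reply.partition("| CONFIDENCE:")
        results.append(IntermediateAnswer(
            answer=answer.replace("ANSWER:", "").strip(),
            confidence=float(conf.strip() or 0.0),
        ))
    return results


def reduce_stage(intermediate: List[IntermediateAnswer], question: str) -> str:
    # Aggregate chunk-level answers; conflicting answers are presented to the
    # model ordered by confidence before a final synthesis call.
    ranked = sorted(intermediate, key=lambda r: r.confidence, reverse=True)
    evidence = "\n".join(f"- ({r.confidence:.2f}) {r.answer}" for r in ranked)
    return call_llm(
        f"Combine these chunk-level answers (highest confidence first) into "
        f"one final answer.\nQuestion: {question}\nAnswers:\n{evidence}"
    )


def llm_map_reduce(document: str, question: str) -> str:
    chunks = split_into_chunks(document)
    return reduce_stage(map_stage(chunks, question), question)
```

Because every stage is an ordinary LLM call with no gradient updates, a pipeline of this shape stays training-free and can wrap several different backbone models, which is consistent with the applicability claim in the abstract.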