LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Abstract

Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks. However, LLMs with long context windows are notorious for their expensive training costs and high inference latency. Even the most advanced models, such as GPT-4 and Claude2, often make mistakes when processing inputs of over 100k tokens, a phenomenon also known as "lost in the middle". In this paper, we propose LongAgent, a method based on multi-agent collaboration that scales LLMs (e.g., LLaMA) to a context of 128k tokens and demonstrates potential superiority over GPT-4 in long-text processing. In LongAgent, a leader is responsible for understanding user intent and directing team members to acquire information from documents. Because members hallucinate, it is non-trivial for the leader to obtain accurate information from the responses of dozens to hundreds of members. To address this, we develop an inter-member communication mechanism that resolves response conflicts caused by hallucinations through information sharing. Our experimental results indicate that LongAgent offers a promising alternative for long-text processing. The agent team instantiated with LLaMA-7B achieves significant improvements over GPT-4 in tasks such as 128k-long text retrieval and multi-hop question answering.
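
The leader/member workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `split_chunks`, `toy_member`, and `resolve_conflicts` are hypothetical, the member is a toy keyword matcher standing in for an LLM, and merging the chunks of two disagreeing members is only one assumed way to realize "information sharing".

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Answer = Optional[str]
# Hypothetical member interface: (chunk, question) -> answer, or None if nothing found.
Member = Callable[[str, str], Answer]


def split_chunks(text: str, chunk_words: int) -> List[str]:
    """Split text into consecutive chunks of at most `chunk_words` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_words])
            for i in range(0, len(words), chunk_words)]


def toy_member(chunk: str, question: str) -> Answer:
    """Toy stand-in for an LLM member: reads 'key=value' facts from its chunk.

    To simulate a hallucination, it returns a wrong answer when its chunk
    contains the word 'noise' but no matching fact.
    """
    tokens = chunk.split()
    for token in tokens:
        if token.startswith(question + "="):
            return token.split("=", 1)[1]
    if "noise" in tokens:
        return "9999"  # simulated hallucination
    return None


def resolve_conflicts(question: str, chunks: List[str], member: Member,
                      max_rounds: int = 5) -> Answer:
    """Leader loop: gather member answers; on conflict, let two disagreeing
    members share (merge) their chunks and ask again (an assumed strategy)."""
    for _ in range(max_rounds):
        groups: Dict[str, List[int]] = defaultdict(list)
        for idx, chunk in enumerate(chunks):
            answer = member(chunk, question)
            if answer is not None:
                groups[answer].append(idx)
        if not groups:
            return None            # no member found anything
        if len(groups) == 1:
            return next(iter(groups))  # all answering members agree
        # Conflict: pick one member from each of two disagreeing groups.
        (_, first_ids), (_, second_ids) = list(groups.items())[:2]
        i, j = first_ids[0], second_ids[0]
        merged = chunks[i] + " " + chunks[j]
        chunks = [c for k, c in enumerate(chunks) if k not in (i, j)] + [merged]
    return None  # conflict not resolved within max_rounds


document = "alpha beta gamma noise delta code=1234 epsilon zeta eta theta"
answer = resolve_conflicts("code", split_chunks(document, 4), toy_member)
```

In this toy run, one member hallucinates because its chunk lacks the fact; once it also sees the chunk that contains the fact, its answer changes and the conflict disappears. A real system would replace `toy_member` with calls to an LLM and would need a more careful policy for choosing which members communicate.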
