Extending Context Window of Large Language Models via Semantic Compression

Abstract

Transformer-based Large Language Models (LLMs) often limit the length of the input text so that generated responses remain fluent and relevant. This constraint restricts their applicability in scenarios involving long texts. We propose a novel semantic compression method that enables generalization to texts 6-8 times longer without incurring significant computational cost or requiring fine-tuning. The framework draws inspiration from source coding in information theory and uses a pre-trained model to reduce the semantic redundancy of long inputs before passing them to an LLM for downstream tasks. Experimental results show that the method effectively extends the context window of LLMs across a range of tasks, including question answering, summarization, few-shot learning, and information retrieval. Furthermore, the method maintains consistent fluency in text generation while reducing the associated computational overhead.
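
The compress-then-prompt idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the input is segmented or which pre-trained model performs the compression, so everything below is an assumption made for illustration: the fixed-size word chunking, the function names split_into_chunks and compress_context, and the toy keep_first_sentence summarizer, which only stands in for a real pre-trained summarization model.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Fixed-size chunking is an assumption of this sketch, not the paper's method.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def compress_context(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 200,
) -> str:
    """Compress a long input by summarizing each chunk and joining the results.

    The shortened text would then be placed in the LLM prompt instead of the
    original long input.
    """
    chunks = split_into_chunks(text, max_words)
    return "\n".join(summarize(chunk) for chunk in chunks)


def keep_first_sentence(chunk: str) -> str:
    """Toy stand-in for a pre-trained summarizer: keeps only the first sentence."""
    head, sep, _ = chunk.partition(". ")
    return head + "." if sep else chunk


if __name__ == "__main__":
    long_text = "Alpha beta gamma. Delta epsilon zeta. " * 50
    compressed = compress_context(long_text, keep_first_sentence, max_words=60)
    print(len(long_text.split()), "words before,", len(compressed.split()), "words after")
```

In practice the summarize argument would wrap a call to a real summarization model, and the compression ratio would depend on that model rather than on the first-sentence heuristic used here.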
