Human-like Episodic Memory for Infinite Context LLMs
July 12, 2024
Authors: Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang
cs.AI
Abstract
Large language models (LLMs) have shown remarkable capabilities, but still
struggle with processing extensive contexts, limiting their ability to maintain
coherence and accuracy over long sequences. In contrast, the human brain excels
at organising and retrieving episodic experiences across vast temporal scales,
spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that
integrates key aspects of human episodic memory and event cognition into LLMs,
enabling them to effectively handle practically infinite context lengths while
maintaining computational efficiency. EM-LLM organises sequences of tokens into
coherent episodic events using a combination of Bayesian surprise and
graph-theoretic boundary refinement in an on-line fashion. When needed, these
events are retrieved through a two-stage memory process, combining
similarity-based and temporally contiguous retrieval for efficient and
human-like access to relevant information. Experiments on the LongBench dataset
demonstrate EM-LLM's superior performance, outperforming the state-of-the-art
InfLLM model with an overall relative improvement of 4.3% across various tasks,
including a 33% improvement on the PassageRetrieval task. Furthermore, our
analysis reveals strong correlations between EM-LLM's event segmentation and
human-perceived events, suggesting a bridge between this artificial system and
its biological counterpart. This work not only advances LLM capabilities in
processing extended contexts but also provides a computational framework for
exploring human memory mechanisms, opening new avenues for interdisciplinary
research in AI and cognitive science.
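The two mechanisms named in the abstract can be illustrated with a minimal sketch: segmenting a token stream into events wherever per-token surprise (negative log-probability) crosses a threshold, then retrieving events in two stages — first by similarity to a query, then by pulling in temporally adjacent events. This is not the authors' implementation; all names (`segment_by_surprise`, `retrieve`, the threshold value) are hypothetical, and the paper's graph-theoretic boundary refinement step is omitted for brevity.

```python
import math

def segment_by_surprise(token_logprobs, threshold=3.0):
    """Split token indices into events at points of high surprise (-log p)."""
    events, current = [], []
    for i, lp in enumerate(token_logprobs):
        if current and -lp > threshold:
            events.append(current)   # surprising token starts a new event
            current = []
        current.append(i)
    if current:
        events.append(current)
    return events

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def retrieve(events, event_vecs, query_vec, k=2, neighbours=1):
    """Stage 1: top-k events by similarity; stage 2: add temporal neighbours."""
    top = sorted(range(len(events)),
                 key=lambda i: cosine(event_vecs[i], query_vec),
                 reverse=True)[:k]
    selected = set()
    for i in top:                        # contiguity buffer around each hit
        for j in range(i - neighbours, i + neighbours + 1):
            if 0 <= j < len(events):
                selected.add(j)
    return sorted(selected)
```

For example, log-probabilities `[-0.5, -0.2, -4.0, -0.3, -5.0, -0.1]` with threshold 3.0 split into three events, `[[0, 1], [2, 3], [4, 5]]`, since tokens 2 and 4 exceed the surprise threshold; retrieval then returns the best-matching event plus its temporal neighbours.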