Larimar: Large Language Models with Episodic Memory Control
March 18, 2024
Authors: Payel Das, Subhajit Chaudhury, Elliot Nelson, Igor Melnyk, Sarath Swaminathan, Sihui Dai, Aurélie Lozano, Georgios Kollias, Vijil Chenthamarakshan, Jiří Navrátil, Soham Dan, Pin-Yu Chen
cs.AI
Abstract
Efficient and accurate updating of knowledge stored in Large Language Models
(LLMs) is one of the most pressing research challenges today. This paper
presents Larimar - a novel, brain-inspired architecture for enhancing LLMs with
a distributed episodic memory. Larimar's memory allows for dynamic, one-shot
updates of knowledge without the need for computationally expensive re-training
or fine-tuning. Experimental results on multiple fact editing benchmarks
demonstrate that Larimar attains accuracy comparable to most competitive
baselines, even in the challenging sequential editing setup, but also excels in
speed - yielding speed-ups of 4-10x depending on the base LLM - as well as
flexibility due to the proposed architecture being simple, LLM-agnostic, and
hence general. We further provide mechanisms for selective fact forgetting and
input context length generalization with Larimar and show their effectiveness.
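To make the abstract's central mechanism concrete, here is a minimal sketch of the kind of one-shot, training-free memory update that a distributed episodic memory enables. It assumes a simplified linear key-value memory updated in closed form by least squares; the names (EpisodicMemory, write, read, _address) and the addressing scheme are hypothetical illustrations, not the paper's actual implementation, which couples such a memory to an encoder-decoder around the base LLM.

```python
# Toy sketch (assumed, not from the Larimar codebase): a linear episodic
# memory whose write is a closed-form least-squares solve rather than
# gradient-based fine-tuning, so a new fact is stored in one shot.
import numpy as np

class EpisodicMemory:
    def __init__(self, num_slots: int, code_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Memory matrix: one row per slot, randomly initialized.
        self.M = rng.normal(scale=0.1, size=(num_slots, code_dim))

    def _address(self, Z: np.ndarray) -> np.ndarray:
        # Hypothetical addressing: least-squares weights W with W @ M ~= Z.
        return Z @ np.linalg.pinv(self.M)

    def write(self, Z: np.ndarray) -> None:
        # One-shot write: given episode codes Z, solve min_M ||W M - Z||^2
        # in closed form. No gradient steps on the LLM weights are taken.
        W = self._address(Z)
        self.M = np.linalg.pinv(W) @ Z

    def read(self, query: np.ndarray) -> np.ndarray:
        # Read-out: address memory with the query code and return w @ M,
        # which a decoder could then condition on.
        w = self._address(query[None, :])
        return (w @ self.M)[0]

# Usage: store two encoded "facts" in one shot, then read one back.
mem = EpisodicMemory(num_slots=8, code_dim=16)
Z = np.random.default_rng(1).normal(size=(2, 16))  # stand-in episode codes
mem.write(Z)
readout = mem.read(Z[0])  # approximately recovers the stored code
```

In this sketch the update cost depends only on the size of the memory matrix, not on the base model, which parallels the speed-ups and the LLM-agnostic flexibility the abstract reports.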