TransformerFAM: Feedback attention is working memory
April 14, 2024
Authors: Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, Khe Chai Sim, Pedro Moreno Mengibar
cs.AI
Abstract
While Transformers have revolutionized deep learning, their quadratic
attention complexity hinders their ability to process infinitely long inputs.
We propose Feedback Attention Memory (FAM), a novel Transformer architecture
that leverages a feedback loop to enable the network to attend to its own
latent representations. This design fosters the emergence of working memory
within the Transformer, allowing it to process indefinitely long sequences.
TransformerFAM requires no additional weights, enabling seamless integration
with pre-trained models. Our experiments show that TransformerFAM significantly
improves Transformer performance on long-context tasks across various model
sizes (1B, 8B, and 24B). These results showcase the potential to empower Large
Language Models (LLMs) to process sequences of unlimited length.
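To make the feedback-loop idea concrete, here is a minimal sketch (not the authors' implementation) of how an attention layer could carry a feedback-attention-memory (FAM) state across input blocks while reusing the existing query/key/value and output projections, so that no additional weights are introduced. The names `feedback_attention_block`, `w_qkv`, `w_out`, the FAM length `M`, and the block length `L` are illustrative assumptions; the paper further combines this feedback with block sliding-window attention, which the sketch omits.

```python
# Hypothetical simplification of feedback attention with a block-wise loop.
# A FAM state is carried across blocks: the FAM slots are concatenated with
# the current block, so block tokens can attend to the memory and the memory
# slots attend over both the block and their previous state, producing the
# next FAM. Only the shared projections w_qkv / w_out are used (no new params).
import torch
import torch.nn.functional as F

def feedback_attention_block(x_block, fam, w_qkv, w_out, num_heads):
    """One block step: x_block (B, L, D), fam (B, M, D)."""
    B, L, D = x_block.shape
    # Project block tokens and FAM slots with the *same* weights.
    qkv_in = torch.cat([x_block, fam], dim=1)            # (B, L+M, D)
    q, k, v = (qkv_in @ w_qkv).chunk(3, dim=-1)          # each (B, L+M, D)

    def split(t):  # (B, T, D) -> (B, H, T, D/H)
        return t.view(B, -1, num_heads, D // num_heads).transpose(1, 2)

    q, k, v = map(split, (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)        # (B, H, L+M, D/H)
    out = out.transpose(1, 2).reshape(B, L + fam.shape[1], D) @ w_out
    y_block = out[:, :L]      # contextualized block outputs
    fam_next = out[:, L:]     # updated working memory, fed back at the next step
    return y_block, fam_next

# Usage: iterate over blocks of an arbitrarily long sequence, carrying FAM.
if __name__ == "__main__":
    B, D, H, L, M = 2, 64, 4, 16, 8
    w_qkv = torch.randn(D, 3 * D) / D ** 0.5
    w_out = torch.randn(D, D) / D ** 0.5
    fam = torch.zeros(B, M, D)                           # initial working memory
    for x_block in torch.randn(B, 4 * L, D).split(L, dim=1):
        y, fam = feedback_attention_block(x_block, fam, w_qkv, w_out, H)
    print(y.shape, fam.shape)
```

Because the fixed-size FAM state is the only information carried forward between blocks, per-step compute and memory stay constant regardless of total sequence length, which is what lets the model process indefinitely long inputs in this scheme.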