

RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

May 22, 2023
作者: Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan
cs.AI

Abstract

The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the long short-term memory mechanism of an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long-term and short-term memories, stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate text of arbitrary length without forgetting. Since human users can easily observe and edit the natural language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems that go beyond local editing suggestions. In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as interactive fiction that engages directly with consumers. We call this usage of generative models ``AI As Contents'' (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that interacts directly with readers rather than with writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
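The recurrence the abstract describes can be illustrated with a minimal sketch. All names below are hypothetical, and `stub_llm` is a stand-in for the real ChatGPT call; the actual system retrieves long-term memories by embedding similarity rather than the naive recency slice used here. The point is only the loop structure: each step, the LLM produces a paragraph, a rewritten short-term memory, and a plan, which together form the prompt for the next step.

```python
def stub_llm(prompt: str) -> dict:
    """Stand-in for an LLM call (hypothetical); a real system would call ChatGPT."""
    step = prompt.count("[P]")  # crude position signal recovered from the prompt
    return {
        "paragraph": f"Paragraph {step + 1} of the story.",
        "short_term_memory": f"Summary of the story up to paragraph {step + 1}.",
        "next_plan": f"Plan for paragraph {step + 2}.",
    }

def generate(n_paragraphs: int) -> list[str]:
    """RecurrentGPT-style loop: natural-language state threaded between LLM calls."""
    long_term_memory: list[str] = []  # stored on disk in the real system
    short_term_memory = "The story has not started."
    plan = "Open the story."
    output: list[str] = []
    for _ in range(n_paragraphs):
        # Build the prompt from the recurrent state, much as an RNN uses h_t.
        recent = "\n".join(long_term_memory[-2:])  # naive stand-in for retrieval
        prompt = (
            f"SHORT-TERM MEMORY: {short_term_memory}\n"
            f"RELEVANT EARLIER TEXT:\n{recent}\n"
            f"PLAN: {plan}\n"
            + "[P]\n" * len(output)
        )
        result = stub_llm(prompt)
        output.append(result["paragraph"])
        long_term_memory.append(result["paragraph"])      # append-only long-term memory
        short_term_memory = result["short_term_memory"]   # rewritten every step
        plan = result["next_plan"]                        # editable by a human user
    return output
```

Because the loop carries only bounded natural-language state between calls, generation length is decoupled from the LLM's context window, and a user can inspect or edit `short_term_memory` and `plan` between steps, which is what makes the system interactive and interpretable.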