
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

May 22, 2023
作者: Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan
cs.AI

Abstract

The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism in an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long- and short-term memory, stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate texts of arbitrary length without forgetting. Since human users can easily observe and edit the natural language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems beyond local editing suggestions. In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as an interactive fiction that directly interacts with consumers. We call this usage of generative models ``AI As Contents'' (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that directly interacts with readers instead of interacting with writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
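The recurrence the abstract describes, where each timestep produces a paragraph, archives a summary into long-term memory on disk, and refreshes a short-term memory carried in the prompt, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `call_llm` and `retrieve_relevant` are hypothetical stubs standing in for a real LLM call (e.g. to ChatGPT) and an embedding-based retrieval step.

```python
# Minimal sketch of RecurrentGPT-style language-based recurrence.
# `call_llm` is a deterministic stub; a real system would query an LLM
# such as ChatGPT here.

def call_llm(prompt: str) -> str:
    """Stub for an LLM call; returns a deterministic placeholder."""
    return "Paragraph: " + prompt.replace("\n", " ")[:40] + "..."

def retrieve_relevant(long_term: list, k: int = 2) -> list:
    """Stub retrieval: a real system would rank long-term memory entries
    by embedding similarity to the current short-term memory."""
    return long_term[-k:]

def recurrent_generate(outline: str, steps: int) -> str:
    short_term = outline     # recent summary, carried inside the prompt
    long_term = []           # archived summaries, stored on disk in practice
    paragraphs = []
    for _ in range(steps):
        prompt = (
            "Relevant earlier events:\n" + "\n".join(retrieve_relevant(long_term)) +
            "\nRecent summary:\n" + short_term +
            "\nWrite the next paragraph of the story."
        )
        paragraph = call_llm(prompt)
        paragraphs.append(paragraph)
        long_term.append(short_term)                      # archive old summary
        short_term = call_llm("Summarize: " + paragraph)  # refresh short-term memory
    return "\n\n".join(paragraphs)
```

Because the memories are plain natural language strings, a human user can inspect or edit `short_term` and `long_term` between iterations, which is what makes the generation process interpretable and interactive.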