

Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs

May 12, 2026
作者: Guinan Su, Yanwu Yang, Xueyan Li, Jonas Geiping
cs.AI

Abstract

The continued improvements in language model capability have unlocked their widespread use as drivers of autonomous agents, for example in coding or computer-use applications. However, the core of these systems has not changed much since early instruction-tuned models like ChatGPT. Even advanced AI agents operate on message-exchange formats, successively exchanging messages with users, systems, themselves (i.e., chain-of-thought), and tools in a single stream of computation. This single-stream bottleneck in chat models leads to a number of limitations: the agent cannot act (generate output) while reading and, conversely, cannot react to new information while writing. Similarly, the agent cannot act while thinking and cannot think while reading or acting on information. In this work, we show that models can be unblocked by switching from instruction-tuning for sequential message formats to instruction-tuning for multiple, parallel streams of computation, splitting each role into a separate stream. Every forward pass of the language model then simultaneously reads from multiple input streams and generates tokens in multiple output streams, all of which causally depend on earlier timesteps. We argue that this data-driven change remedies the usability limitations outlined above, improves model efficiency through parallelization, improves model security through better separation of concerns, and can further improve model monitorability.
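To make the stream semantics concrete, here is a minimal toy sketch of the decoding loop the abstract describes: at each timestep the model reads the tokens revealed so far on every input stream and appends one token to every output stream, so all emissions causally depend only on earlier timesteps. The stream names (`user`, `thought`, `action`), the `MultiStreamState` container, and the stand-in `toy_model_step` are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class MultiStreamState:
    # One token list per stream; all streams advance in lockstep per forward pass.
    inputs: dict                                  # e.g. {"user": [...], "tool": [...]}
    outputs: dict = field(default_factory=dict)   # e.g. {"thought": [...], "action": [...]}

def toy_model_step(history):
    """Stand-in for one forward pass (hypothetical): emit one token per
    output stream, conditioned on everything visible at earlier timesteps."""
    seen = sum(len(toks) for toks in history.values())
    return {"thought": f"t{seen}", "action": f"a{seen}"}

def run(state, n_steps):
    for name in ("thought", "action"):
        state.outputs.setdefault(name, [])
    for t in range(n_steps):
        # Read: only the prefix of each input stream up to timestep t is visible.
        visible = {k: v[: t + 1] for k, v in state.inputs.items()}
        # Causal history: visible inputs plus all outputs generated before t.
        history = {**visible, **state.outputs}
        emitted = toy_model_step(history)
        # Write: append one new token to every output stream simultaneously,
        # so the agent "thinks" and "acts" in the same forward pass.
        for name, tok in emitted.items():
            state.outputs[name].append(tok)
    return state

state = run(MultiStreamState(inputs={"user": ["hi", "there", "!"]}), n_steps=3)
```

Note how reading and writing interleave per timestep rather than alternating whole messages: new input tokens can arrive on one stream while output tokens are still being produced on the others, which is the unblocking the paper argues for.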