Mindstorms in Natural Language-Based Societies of Mind

Abstract

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overcome the limitations of single LLMs, improving multimodal zero-shot reasoning. In these natural language-based societies of mind (NLSOMs), new agents, all communicating through the same universal symbolic language, are easily added in a modular fashion. To demonstrate the power of NLSOMs, we assemble and experiment with several of them (having up to 129 members), leveraging mindstorms in them to solve some practical AI tasks: visual question answering, image captioning, text-to-image synthesis, 3D generation, egocentric retrieval, embodied AI, and general language-based task solving. We view this as a starting point towards much larger NLSOMs with billions of agents, some of which may be humans. With this emergence of great societies of heterogeneous minds, many new research questions have suddenly become paramount to the future of artificial intelligence. What should be the social structure of an NLSOM? What would be the (dis)advantages of having a monarchical rather than a democratic structure? How can principles of NN economies be used to maximize the total reward of a reinforcement learning NLSOM? In this work, we identify, discuss, and try to answer some of these questions.
