Mindstorms in Natural Language-Based Societies of Mind

Abstract

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overcome the limitations of single LLMs, improving multimodal zero-shot reasoning. In these natural language-based societies of mind (NLSOMs), new agents, all communicating through the same universal symbolic language, are easily added in a modular fashion. To demonstrate the power of NLSOMs, we assemble and experiment with several of them (having up to 129 members), leveraging mindstorms in them to solve some practical AI tasks: visual question answering, image captioning, text-to-image synthesis, 3D generation, egocentric retrieval, embodied AI, and general language-based task solving. We view this as a starting point towards much larger NLSOMs with billions of agents, some of which may be humans. With this emergence of great societies of heterogeneous minds, many new research questions have suddenly become paramount to the future of artificial intelligence. What should be the social structure of an NLSOM? What would be the (dis)advantages of having a monarchical rather than a democratic structure? How can principles of NN economies be used to maximize the total reward of a reinforcement learning NLSOM? In this work, we identify, discuss, and try to answer some of these questions.
