MindAgent: Emergent Gaming Interaction
September 18, 2023
Authors: Ran Gong, Qiuyuan Huang, Xiaojian Ma, Hoi Vo, Zane Durante, Yusuke Noda, Zilong Zheng, Song-Chun Zhu, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao
cs.AI
Abstract
Large Language Models (LLMs) are capable of performing complex
scheduling in a multi-agent system and can coordinate these agents to
complete sophisticated tasks that require extensive collaboration. However,
despite the introduction of numerous gaming frameworks, the community has
insufficient benchmarks for building general multi-agent collaboration
infrastructure that encompasses both LLM and human-NPC collaboration. In this
work, we propose a novel infrastructure, MindAgent, to evaluate emergent
planning and coordination capabilities for gaming interaction. In particular, our
infrastructure leverages existing gaming frameworks to i) require understanding
by the coordinator of a multi-agent system, ii) collaborate with human players
via appropriate instructions without fine-tuning, and iii) establish in-context
learning from few-shot prompts with feedback. Furthermore, we introduce CUISINEWORLD, a new
gaming scenario and related benchmark that measures multi-agent collaboration
efficiency and supervises multiple agents playing the game simultaneously. We
conduct comprehensive evaluations with a new automatic metric, CoS, for calculating
collaboration efficiency. Finally, our infrastructure can be deployed in
real-world gaming scenarios in a customized VR version of CUISINEWORLD and
adapted to the existing, broader Minecraft gaming domain. We hope our findings on
LLMs and the new infrastructure for general-purpose scheduling and coordination
can help shed light on how such skills can be obtained by learning from large
language corpora.