CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation
July 8, 2024
Authors: Xinying Guo, Mingyuan Zhang, Haozhe Xie, Chenyang Gu, Ziwei Liu
cs.AI
Abstract
Crowd Motion Generation is essential in entertainment industries such as
animation and games as well as in strategic fields like urban simulation and
planning. This new task requires an intricate integration of control and
generation to realistically synthesize crowd dynamics under specific spatial
and semantic constraints, whose challenges are yet to be fully explored. On the
one hand, existing human motion generation models typically focus on individual
behaviors, neglecting the complexities of collective behaviors. On the other
hand, recent methods for multi-person motion generation depend heavily on
pre-defined scenarios and are limited to a fixed, small number of inter-person
interactions, thus hampering their practicality. To overcome these challenges,
we introduce CrowdMoGen, a zero-shot text-driven framework that harnesses the
power of Large Language Models (LLMs) to incorporate collective intelligence
into the motion generation framework as guidance, thereby enabling
generalizable planning and generation of crowd motions without paired training
data. Our framework consists of two key components: 1) a Crowd Scene Planner that
learns to coordinate motions and dynamics according to specific scene contexts
or introduced perturbations, and 2) a Collective Motion Generator that
efficiently synthesizes the required collective motions based on the holistic
plans. Extensive quantitative and qualitative experiments have validated the
effectiveness of our framework, which not only fills a critical gap by
providing scalable and generalizable solutions for the Crowd Motion Generation task
but also achieves high levels of realism and flexibility.
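
The two-component structure described in the abstract can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the names `AgentPlan`, `plan_crowd_scene`, `generate_collective_motion`, and `stub_llm` are hypothetical, the LLM is replaced by a stub, and linear interpolation stands in for the learned motion generator. It shows only the data flow from a text prompt to per-person plans to per-person trajectories, not the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class AgentPlan:
    """Hypothetical per-person plan: an action label plus 2D waypoints."""
    action: str
    waypoints: List[Point]


def plan_crowd_scene(prompt: str, num_agents: int,
                     llm: Callable[[str], str]) -> List[AgentPlan]:
    """Stage 1 (assumed interface): query an LLM for one action per person.

    The start and goal positions are a fixed toy layout, not planner output.
    """
    plans = []
    for i in range(num_agents):
        action = llm(f"Scene: {prompt}. Give a short action for person {i}.")
        start, goal = (float(i), 0.0), (float(i), 5.0)
        plans.append(AgentPlan(action=action, waypoints=[start, goal]))
    return plans


def generate_collective_motion(plans: List[AgentPlan],
                               num_frames: int) -> List[List[Point]]:
    """Stage 2 placeholder: linear interpolation between first and last waypoint.

    A real generator would synthesize full-body motion; this only yields root paths.
    """
    motions = []
    for plan in plans:
        (x0, y0), (x1, y1) = plan.waypoints[0], plan.waypoints[-1]
        trajectory = []
        for t in range(num_frames):
            a = t / (num_frames - 1) if num_frames > 1 else 0.0
            trajectory.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
        motions.append(trajectory)
    return motions


def stub_llm(query: str) -> str:
    """Stand-in for a real LLM call; always returns the same action."""
    return "walk forward"


plans = plan_crowd_scene("people crossing a plaza", 3, stub_llm)
motions = generate_collective_motion(plans, num_frames=5)
```

Keeping the planner and generator behind separate functions mirrors the split the abstract describes: the plan is an explicit intermediate result, so the planning stage could be swapped or perturbed without touching the generation stage.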