3D-GPT: Procedural 3D Modeling with Large Language Models

Abstract

In the pursuit of efficient automated content creation, procedural generation, which leverages modifiable parameters and rule-based systems, has emerged as a promising approach. It can nonetheless be demanding, since its intricacy requires a deep understanding of rules, algorithms, and parameters. To reduce this workload, we introduce 3D-GPT, a framework that uses large language models (LLMs) for instruction-driven 3D modeling. 3D-GPT positions LLMs as proficient problem solvers that break procedural 3D modeling tasks into accessible segments and assign the appropriate agent to each task. 3D-GPT integrates three core agents: the task dispatch agent, the conceptualization agent, and the modeling agent. Together they achieve two objectives. First, they enrich concise initial scene descriptions into detailed forms while dynamically adapting the text to subsequent instructions. Second, they integrate procedural generation, extracting parameter values from the enriched text to interface effortlessly with 3D software for asset creation. Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions with reliable results, but also collaborates effectively with human designers. Furthermore, it integrates seamlessly with Blender, unlocking expanded manipulation possibilities. Our work highlights the potential of LLMs in 3D modeling and offers a basic framework for future advances in scene generation and animation.
