Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

Abstract

Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to advances in image diffusion models and optimization strategies. However, current methods struggle to generate correct 3D content for semantically complex prompts, i.e., prompts that describe multiple interacting objects bound to different attributes. In this work, we propose a general framework named Progressive3D, which decomposes the entire generation process into a series of locally progressive editing steps to create precise 3D content for complex prompts. In each editing step, we constrain content changes to occur only in regions determined by user-defined region prompts. Furthermore, we propose an overlapped semantic component suppression technique that encourages the optimization process to focus more on the semantic differences between prompts. Extensive experiments demonstrate that the proposed Progressive3D framework generates precise 3D content for prompts with complex semantics and is general across text-to-3D methods driven by different 3D representations.
