CharacterShot: Controllable and Consistent 4D Character Animation

Abstract

In this paper, we propose CharacterShot, a controllable and consistent 4D character animation framework that enables any individual designer to create dynamic 3D characters (i.e., 4D character animation) from a single reference character image and a 2D pose sequence. We first pretrain a powerful 2D character animation model on top of a cutting-edge DiT-based image-to-video model, so that any 2D pose sequence can serve as the control signal. We then lift the animation model from 2D to 3D by introducing a dual-attention module together with a camera prior, generating multi-view videos with spatial-temporal and spatial-view consistency. Finally, we apply a novel neighbor-constrained 4D Gaussian splatting optimization to these multi-view videos, which yields continuous and stable 4D character representations. Moreover, to improve character-centric performance, we construct a large-scale dataset, Character4D, containing 13,115 unique characters with diverse appearances and motions, rendered from multiple viewpoints. Extensive experiments on our newly constructed benchmark, CharacterBench, demonstrate that our approach outperforms current state-of-the-art methods. Code, models, and datasets will be publicly available at https://github.com/Jeoyal/CharacterShot.
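To make the idea of spatial-temporal and spatial-view consistency more concrete, the following is a minimal sketch of one way a dual-attention pass over multi-view video tokens could be arranged. It is not the authors' implementation: the function names `self_attention` and `dual_attention`, the `(views, frames, patches, dim)` token layout, the residual connections, and the use of weight-free single-head attention are all assumptions made for illustration, and the camera prior is omitted.

```python
import numpy as np


def self_attention(x):
    """Single-head scaled dot-product self-attention with no learned weights.

    x has shape (batch, tokens, dim). Queries, keys, and values are all x,
    which is a simplification chosen for this sketch.
    """
    d = x.shape[-1]
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x


def dual_attention(tokens):
    """Hypothetical dual-attention pass over tokens of shape (V, T, N, D).

    V = views, T = frames, N = patches per frame, D = feature dimension.
    """
    V, T, N, D = tokens.shape

    # Spatial-temporal branch: each view attends over its own frames and patches.
    st = self_attention(tokens.reshape(V, T * N, D)).reshape(V, T, N, D)
    tokens = tokens + st

    # Spatial-view branch: each frame attends over all views and patches.
    per_frame = tokens.transpose(1, 0, 2, 3).reshape(T, V * N, D)
    sv = self_attention(per_frame).reshape(T, V, N, D).transpose(1, 0, 2, 3)
    return tokens + sv
```

In this arrangement the first branch shares information only within a view, while the second shares it only within a time step, so a token can be influenced by every other view and frame after both branches have run. A real model would add learned projections, multiple heads, and normalization.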
