TextToon: Real-Time Text Toonify Head Avatar from Single Video

Abstract

We propose TextToon, a method for generating a drivable toonified avatar. Given a short monocular video sequence and a written instruction describing the avatar style, our model generates a high-fidelity toonified avatar that can be driven in real time by another video of an arbitrary identity. Existing related works rely heavily on multi-view modeling to recover geometry via texture embeddings and present the result in a static manner, which limits control. The multi-view video input also makes these models difficult to deploy in real-world applications. To address these issues, we adopt a conditional embedding Tri-plane to learn realistic and stylized facial representations in a Gaussian deformation field. Additionally, we expand the stylization capabilities of 3D Gaussian Splatting by introducing an adaptive pixel-translation neural network and leveraging patch-aware contrastive learning to achieve high-quality images. To push our work toward consumer applications, we develop a real-time system that runs at 48 FPS on a GPU machine and 15-18 FPS on a mobile machine. Extensive experiments demonstrate that our approach outperforms existing methods at generating textual avatars in terms of quality and real-time animation. Please refer to our project page for more details: https://songluchuan.github.io/TextToon/.
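
The abstract names patch-aware contrastive learning without giving its formulation. As an illustration only, the sketch below shows a generic InfoNCE-style patch contrastive loss of the kind commonly used in image translation: the feature of a patch in the stylized output is pulled toward the feature of the patch at the same location in the input and pushed away from patches at other locations. The function name `patch_info_nce`, the temperature of 0.07, and the use of cosine similarity are assumptions made for this sketch and are not taken from TextToon.

```python
import numpy as np


def patch_info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for one patch embedding (illustrative, not TextToon's loss).

    query:     (D,)   feature of a patch in the stylized output
    positive:  (D,)   feature of the patch at the same location in the input
    negatives: (N, D) features of patches at other locations
    """
    def l2_normalize(x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    q = l2_normalize(query)
    pos = l2_normalize(positive)
    neg = l2_normalize(negatives)

    # Cosine similarities scaled by the (assumed) temperature; positive at index 0.
    logits = np.concatenate([[q @ pos], neg @ q]) / temperature

    # Numerically stable cross-entropy with target class 0.
    logits = logits - logits.max()
    return float(-(logits[0] - np.log(np.exp(logits).sum())))


# Usage: a well-matched positive should give a lower loss than a mismatched one.
rng = np.random.default_rng(0)
q = rng.normal(size=16)
negs = rng.normal(size=(8, 16))
loss_matched = patch_info_nce(q, q + 0.01 * rng.normal(size=16), negs)
loss_mismatched = patch_info_nce(q, -q, negs)
```

In a full pipeline the embeddings would come from an encoder applied to image patches and the loss would be averaged over many patch locations; both are omitted here to keep the sketch self-contained.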