Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor

Abstract

Recent years have witnessed considerable progress in editing images with text instructions. However, when these editors are applied to dynamic scenes, the edited scene tends to be temporally inconsistent because 2D editors operate frame by frame. To tackle this issue, we propose Control4D, a novel approach for high-fidelity and temporally consistent 4D portrait editing. Control4D is built upon an efficient 4D representation and a 2D diffusion-based editor. Instead of using direct supervision from the editor, our method learns a 4D GAN from it and thereby avoids inconsistent supervision signals. Specifically, we employ a discriminator to learn the generation distribution from the edited images and then update the generator with the discrimination signals. For more stable training, multi-level information is extracted from the edited images and used to facilitate the learning of the generator. Experimental results show that Control4D surpasses previous approaches and achieves more photo-realistic and consistent 4D editing performance. The link to our project website is https://control4darxiv.github.io.
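The adversarial scheme summarized in the abstract (a discriminator fitted to the edited images, then a generator updated from the discriminator's signal) can be illustrated with a toy sketch. This is not the authors' implementation and makes strong simplifying assumptions: each "frame" is a single scalar, the generator is one parameter `theta` that renders the same value for every frame, and the discriminator is a logistic model with parameters `w` and `b`. All names (`gan_step`, `discriminator`, `theta`, `w`, `b`, `lr`) are hypothetical, and the multi-level information mentioned in the abstract is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminator(x, w, b):
    """Probability that the scalar 'frame' x came from the 2D editor (real)."""
    return sigmoid(w * x + b)


def gan_step(theta, w, b, edited_frames, lr=0.05):
    """One alternating discriminator/generator update on a 1-D toy problem.

    theta         : generator parameter (toy stand-in for the 4D representation)
    w, b          : logistic discriminator parameters
    edited_frames : scalars standing in for per-frame editor outputs,
                    which may disagree with each other
    """
    n = len(edited_frames)

    # Discriminator step: minimize
    #   -mean(log D(real)) - log(1 - D(fake))
    # using d/dz[-log sigmoid(z)] = sigmoid(z) - 1
    # and   d/dz[-log(1 - sigmoid(z))] = sigmoid(z).
    grad_w, grad_b = 0.0, 0.0
    for x in edited_frames:
        p_real = discriminator(x, w, b)
        grad_w += (p_real - 1.0) * x / n
        grad_b += (p_real - 1.0) / n
    p_fake = discriminator(theta, w, b)
    grad_w += p_fake * theta
    grad_b += p_fake
    w -= lr * grad_w
    b -= lr * grad_b

    # Generator step: minimize the non-saturating loss -log D(theta).
    # The generator never sees an edited frame directly; it only
    # receives the gradient that flows through the discriminator.
    p_fake = discriminator(theta, w, b)
    theta -= lr * (p_fake - 1.0) * w

    return theta, w, b
```

A possible usage, under the same assumptions: call `gan_step` repeatedly, starting from `theta = w = b = 0.0`, with a list of slightly inconsistent edited values such as `[1.0, 1.2, 0.8]`. The point of the sketch is only that the generator is driven by the discrimination signal, not by a per-frame reconstruction loss against any individual edited frame.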

