Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor

Abstract

Recent years have seen considerable progress in editing images with text instructions. However, when these editors are applied to dynamic scenes, the edited scene tends to be temporally inconsistent because the 2D editors operate frame by frame. To tackle this issue, we propose Control4D, a novel approach for high-fidelity and temporally consistent 4D portrait editing. Control4D is built upon an efficient 4D representation combined with a 2D diffusion-based editor. Instead of using direct supervision from the editor, our method learns a 4D GAN from it, which avoids the inconsistent supervision signals. Specifically, we employ a discriminator to learn the generation distribution from the edited images and then update the generator with the discrimination signals. For more stable training, multi-level information is extracted from the edited images and used to facilitate the learning of the generator. Experimental results show that Control4D surpasses previous approaches and achieves more photo-realistic and consistent 4D editing performance. The link to our project website is https://control4darxiv.github.io.
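
As a rough illustration of the discriminator-based supervision described above, the sketch below computes adversarial losses in which frames produced by the 2D editor play the role of "real" samples and renders of the 4D scene play the role of "fake" samples. The abstract does not state which GAN objective Control4D uses, so the non-saturating logistic loss here is an assumption, and the names `discriminator_loss`, `generator_loss`, `logits_edited`, and `logits_rendered` are hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(logits_edited, logits_rendered):
    """Logistic discriminator loss (assumed objective, not taken from the paper).

    logits_edited:   discriminator outputs on frames from the 2D editor ("real")
    logits_rendered: discriminator outputs on renders of the 4D scene ("fake")
    """
    real_term = sum(softplus(-l) for l in logits_edited) / len(logits_edited)
    fake_term = sum(softplus(l) for l in logits_rendered) / len(logits_rendered)
    return real_term + fake_term


def generator_loss(logits_rendered):
    """Non-saturating generator loss: push renders toward the 'real' side."""
    return sum(softplus(-l) for l in logits_rendered) / len(logits_rendered)


if __name__ == "__main__":
    # Toy logits only; a real system would obtain these from a trained network.
    print(discriminator_loss([2.0, 1.5], [-1.0, -2.5]))
    print(generator_loss([-1.0, -2.5]))
```

Because the generator is penalized only through the discriminator's judgment of its renders, it is never asked to match any single edited frame pixel by pixel, which is one way to read the abstract's claim that inconsistent per-frame supervision is avoided.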
