Cora: Correspondence-aware image editing using few step diffusion

Abstract

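Image editing is an important task in computer graphics, vision, and VFX, and recent diffusion-based methods achieve fast, high-quality results. However, edits that require significant structural changes, such as non-rigid deformations, object modifications, or content generation, remain challenging. Existing few-step editing approaches either produce irrelevant textures or struggle to preserve key attributes of the source image (e.g., pose). This paper introduces Cora, a new editing framework that addresses these limitations. By introducing correspondence-aware noise correction and interpolated attention maps, Cora aligns textures and structures between the source and target images through semantic correspondence, enabling accurate texture transfer while generating new content where needed. Cora provides control over the balance between content generation and preservation. Extensive experiments show that Cora is quantitatively and qualitatively superior at maintaining structure, texture, and identity across diverse edits, including pose changes, object addition, and texture refinement. A user study confirms that Cora delivers superior results, outperforming alternative methods.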

Original English abstract

Image editing is an important task in computer graphics, vision, and VFX, with recent diffusion-based methods achieving fast and high-quality results. However, edits requiring significant structural changes, such as non-rigid deformations, object modifications, or content generation, remain challenging. Existing few-step editing approaches either produce artifacts such as irrelevant textures or struggle to preserve key attributes of the source image (e.g., pose). We introduce Cora, a novel editing framework that addresses these limitations through correspondence-aware noise correction and interpolated attention maps. Our method aligns textures and structures between the source and target images through semantic correspondence, enabling accurate texture transfer while generating new content when necessary. Cora offers control over the balance between content generation and preservation. Extensive experiments demonstrate that, quantitatively and qualitatively, Cora excels in maintaining structure, textures, and identity across diverse edits, including pose changes, object addition, and texture refinements. User studies confirm that Cora delivers superior results, outperforming alternatives.

Cora: Correspondence-aware image editing using few step diffusion
