Edicho: Consistent Image Editing in the Wild

Abstract

Consistent editing across in-the-wild images is a well-established need, yet it remains technically challenging because of factors that cannot be controlled, such as object poses, lighting conditions, and photography environments. Edicho addresses this with a training-free solution based on diffusion models, built on the design principle of using explicit image correspondence to direct editing. Specifically, its key components are an attention manipulation module and a carefully refined classifier-free guidance (CFG) denoising strategy, both of which take the pre-estimated correspondence into account. This inference-time algorithm is plug-and-play and compatible with most diffusion-based editing methods, such as ControlNet and BrushNet. Extensive results demonstrate the efficacy of Edicho in consistent cross-image editing under diverse settings. We will release the code to facilitate future studies.
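For readers unfamiliar with classifier-free guidance, the following is a minimal sketch of the standard CFG combination step only. It is not Edicho's refined, correspondence-aware variant, whose details the abstract does not give. The function name `cfg_combine` and the toy values are hypothetical, chosen purely for illustration.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Standard classifier-free guidance (not Edicho's refined variant).

    Extrapolates from the unconditional noise prediction toward the
    conditional one: eps = eps_uncond + s * (eps_cond - eps_uncond).
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy stand-ins for a denoiser's two noise predictions (hypothetical values).
eps_u = [0.0, 0.5, -1.0]
eps_c = [1.0, 0.5, 1.0]

guided = cfg_combine(eps_u, eps_c, guidance_scale=7.5)
```

With a guidance scale of 1 the result equals the conditional prediction, and with a scale of 0 it equals the unconditional one; larger scales push the result further along the conditional direction.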

