FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing

Abstract

Our work addresses limitations of previous approaches to object-centric editing, such as unrealistic results caused by shape discrepancies and limited control over object replacement or insertion. To this end, we introduce FlexEdit, a flexible and controllable object editing framework that iteratively adjusts latents at each denoising step using our FlexEdit block. First, we optimize the latents at test time to align with the specified object constraints. Then, our framework employs an adaptive mask, automatically extracted during denoising, to protect the background while seamlessly blending new content into the target image. We demonstrate the versatility of FlexEdit on various object editing tasks and curate an evaluation test suite with samples from both real and synthetic images, along with novel evaluation metrics designed for object-centric editing. We conduct extensive experiments on different editing scenarios, demonstrating the superiority of our editing framework over recent advanced text-guided image editing methods. Our project page is published at https://flex-edit.github.io/.
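
To make the mask-based blending idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes the adaptive mask can be approximated by thresholding a per-position score map; the names `threshold_mask`, `blend_latents`, and the `threshold` value are hypothetical and chosen only for illustration, and small nested lists stand in for real latent tensors.

```python
# Toy illustration of mask-based latent blending (assumption: not FlexEdit's actual code).
# Latents and masks are represented as small 2D lists of floats.

def threshold_mask(scores, threshold=0.5):
    """Turn a hypothetical per-position score map into a binary mask.

    Positions scoring at or above `threshold` are treated as the edited
    object region (1.0); all others are treated as background (0.0).
    """
    return [[1.0 if s >= threshold else 0.0 for s in row] for row in scores]


def blend_latents(source, edited, mask):
    """Blend two latents: edited values inside the mask, source values outside."""
    return [
        [m * e + (1.0 - m) * s for s, e, m in zip(src_row, edit_row, mask_row)]
        for src_row, edit_row, mask_row in zip(source, edited, mask)
    ]


source = [[1.0, 1.0], [1.0, 1.0]]   # stands in for the source-image latent
edited = [[5.0, 5.0], [5.0, 5.0]]   # stands in for the edited latent
scores = [[0.9, 0.1], [0.4, 0.6]]   # hypothetical object-region scores

mask = threshold_mask(scores)
blended = blend_latents(source, edited, mask)
print(blended)  # [[5.0, 1.0], [1.0, 5.0]]
```

In this sketch the background positions keep the source values unchanged, which is the property the abstract describes as protecting the background; how the real mask is extracted during denoising is not specified here.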
