SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Abstract

Recent advances in text-guided image editing enable users to perform image edits through simple text inputs, leveraging the extensive priors of multi-step diffusion-based text-to-image models. However, these methods often fall short of the speed required for real-world and on-device applications because of the costly multi-step inversion and sampling process involved. In response, we introduce SwiftEdit, a simple yet highly efficient editing tool that achieves instant text-guided image editing (in 0.23 s). SwiftEdit rests on two novel contributions: a one-step inversion framework that enables one-step image reconstruction via inversion, and a mask-guided editing technique with our proposed attention rescaling mechanism that performs localized image editing. Extensive experiments demonstrate the effectiveness and efficiency of SwiftEdit. In particular, SwiftEdit is at least 50 times faster than previous multi-step methods while maintaining competitive editing quality. Our project page is at: https://swift-edit.github.io/

