Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models

Abstract

We present a method to create interpretable concept sliders that enable precise control over attributes in image generations from diffusion models. Our approach identifies a low-rank parameter direction corresponding to one concept while minimizing interference with other attributes. A slider is created using a small set of prompts or sample images; thus slider directions can be created for either textual or visual concepts. Concept Sliders are plug-and-play: they can be composed efficiently and continuously modulated, enabling precise control over image generation. In quantitative experiments comparing to previous editing techniques, our sliders exhibit stronger targeted edits with lower interference. We showcase sliders for weather, age, styles, and expressions, as well as slider compositions. We show how sliders can transfer latents from StyleGAN for intuitive editing of visual concepts for which textual description is difficult. We also find that our method can help address persistent quality issues in Stable Diffusion XL including repair of object deformations and fixing distorted hands. Our code, data, and trained sliders are available at https://sliders.baulab.info/
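
As a rough illustration of what a "low-rank parameter direction" that can be "composed" and "continuously modulated" might look like, the following minimal sketch adds scaled low-rank updates to a single frozen weight matrix. It is not the authors' implementation: the function name `apply_sliders`, the matrix shapes, and the randomly initialized "age" and "weather" factors are hypothetical placeholders, and the sketch assumes the standard LoRA form in which each slider contributes `scale * (up @ down)` to the base weight.

```python
import numpy as np


def apply_sliders(weight, sliders):
    """Return weight + sum_i scale_i * (up_i @ down_i).

    weight  : (out_dim, in_dim) frozen base matrix
    sliders : iterable of (down, up, scale) tuples, where
              down has shape (rank, in_dim) and up has shape (out_dim, rank)
    """
    edited = weight.copy()
    for down, up, scale in sliders:
        # Each slider is a low-rank update; `scale` is the slider position.
        edited += scale * (up @ down)
    return edited


rng = np.random.default_rng(0)
out_dim, in_dim, rank = 8, 6, 2
weight = rng.normal(size=(out_dim, in_dim))

# Hypothetical sliders: random factors standing in for trained directions.
age_down = rng.normal(size=(rank, in_dim))
age_up = rng.normal(size=(out_dim, rank))
weather_down = rng.normal(size=(rank, in_dim))
weather_up = rng.normal(size=(out_dim, rank))

# Continuous modulation: sweep one slider from negative to positive.
sweep = [apply_sliders(weight, [(age_down, age_up, s)]) for s in (-1.0, 0.0, 1.0)]

# Composition: apply two sliders at once, each with its own scale.
composed = apply_sliders(
    weight,
    [(age_down, age_up, 0.5), (weather_down, weather_up, -1.5)],
)
```

In this form a scale of zero leaves the base weight unchanged and several sliders combine by simple addition, which is one way to read the plug-and-play behavior described in the abstract.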
