Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
October 9, 2025
Authors: Rishubh Parihar, Or Patashnik, Daniil Ostashev, R. Venkatesh Babu, Daniel Cohen-Or, Kuan-Chieh Wang
cs.AI
Abstract
Instruction-based image editing offers a powerful and intuitive way to
manipulate images through natural language. Yet, relying solely on text
instructions limits fine-grained control over the extent of edits. We introduce
Kontinuous Kontext, an instruction-driven editing model that provides a new
dimension of control over edit strength, enabling users to adjust edits
gradually from no change to a fully realized result in a smooth and continuous
manner. Kontinuous Kontext extends a state-of-the-art image editing model to
accept an additional input, a scalar edit strength, which is then paired with
the edit instruction, enabling explicit control over the extent of the edit. To
inject this scalar information, we train a lightweight projector network that
maps the input scalar and the edit instruction to coefficients in the model's
modulation space. To train our model, we synthesize a diverse dataset of
image-edit-instruction-strength quadruplets using existing generative models,
followed by a filtering stage to ensure quality and consistency. Kontinuous
Kontext provides a unified approach for fine-grained control over edit strength
in instruction-driven editing, from subtle to strong, across diverse operations
such as stylization, attribute, material, background, and shape changes,
without requiring attribute-specific training.
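As a rough illustration of the scalar-injection mechanism described in the abstract, the sketch below shows one way a lightweight projector could map an edit-strength scalar and an instruction embedding to scale/shift coefficients in a model's modulation space. All names, dimensions, and the MLP architecture here are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch (assumed, not the authors' code): a small MLP projects
# [edit strength, instruction embedding] to modulation coefficients.
import torch
import torch.nn as nn

class StrengthProjector(nn.Module):
    # instr_dim / mod_dim / hidden are hypothetical sizes chosen for illustration.
    def __init__(self, instr_dim: int = 768, mod_dim: int = 3072, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(instr_dim + 1, hidden),  # +1 for the scalar strength
            nn.SiLU(),
            nn.Linear(hidden, 2 * mod_dim),    # predicts (scale, shift)
        )

    def forward(self, strength: torch.Tensor, instr_emb: torch.Tensor):
        # strength: (B, 1) scalar in [0, 1]; instr_emb: (B, instr_dim)
        coeffs = self.net(torch.cat([strength, instr_emb], dim=-1))
        scale, shift = coeffs.chunk(2, dim=-1)
        # These would be injected into the base editing model's modulation layers.
        return scale, shift

# Usage: sweep the strength from "no change" (0.0) toward the full edit (1.0).
proj = StrengthProjector()
instr_emb = torch.randn(1, 768)  # placeholder pooled instruction embedding
for s in (0.0, 0.5, 1.0):
    scale, shift = proj(torch.tensor([[s]]), instr_emb)
```

Keeping the projector small matters here: only the mapping from (strength, instruction) to modulation coefficients is learned, so the base editing model can remain frozen while the scalar smoothly interpolates the edit's extent.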