
FireRed-Image-Edit-1.0 Technical Report

February 12, 2026
Authors: Super Intelligence Team, Changhao Qiao, Chao Hui, Chen Li, Cunzheng Wang, Dejia Song, Jiale Zhang, Jing Li, Qiang Xiang, Runqi Wang, Shuang Sun, Wei Zhu, Xu Tang, Yao Hu, Yibo Chen, Yuhao Huang, Yuxuan Duan, Zhiyi Chen, Ziyuan Guo
cs.AI

Abstract

We present FireRed-Image-Edit, a diffusion transformer for instruction-based image editing that achieves state-of-the-art performance through systematic optimization of data curation, training methodology, and evaluation design. We construct a 1.6B-sample training corpus, comprising 900M text-to-image and 700M image editing pairs from diverse sources. After rigorous cleaning, stratification, auto-labeling, and two-stage filtering, we retain over 100M high-quality samples balanced between generation and editing, ensuring strong semantic coverage and instruction alignment. Our multi-stage training pipeline progressively builds editing capability via pre-training, supervised fine-tuning, and reinforcement learning. To improve data efficiency, we introduce a Multi-Condition Aware Bucket Sampler for variable-resolution batching and Stochastic Instruction Alignment with dynamic prompt re-indexing. To stabilize optimization and enhance controllability, we propose Asymmetric Gradient Optimization for DPO, DiffusionNFT with layout-aware OCR rewards for text editing, and a differentiable Consistency Loss for identity preservation. We further establish REDEdit-Bench, a comprehensive benchmark spanning 15 editing categories, including newly introduced beautification and low-level enhancement tasks. Extensive experiments on REDEdit-Bench and public benchmarks (ImgEdit and GEdit) demonstrate competitive or superior performance against both open-source and proprietary systems. We release code, models, and the benchmark suite to support future research.
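The abstract names a Multi-Condition Aware Bucket Sampler for variable-resolution batching but does not describe how it works. As a rough illustration of the general idea only, the sketch below groups samples so that each batch shares one resolution bucket and one condition count. The bucket shapes, the `num_conditions` field, and both function names are assumptions made for this example; none of them come from the report, and the authors' sampler may differ substantially.

```python
import random
from collections import defaultdict

# Hypothetical (width, height) buckets; not taken from the report.
BUCKETS = [(512, 512), (640, 384), (384, 640), (768, 512), (512, 768)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


def bucketed_batches(samples, batch_size, seed=0):
    """Build batches in which every sample shares one (bucket, num_conditions) key.

    `samples` is a list of dicts with keys "id", "width", "height", and
    "num_conditions". The last key is an assumed field: the number of
    conditioning images (0 for text-to-image, 1 or more for editing).
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for s in samples:
        key = (nearest_bucket(s["width"], s["height"]), s["num_conditions"])
        groups[key].append(s["id"])

    batches = []
    for key, ids in groups.items():
        rng.shuffle(ids)  # randomize order inside each group
        for i in range(0, len(ids), batch_size):
            batches.append((key, ids[i:i + batch_size]))
    rng.shuffle(batches)  # interleave groups across training steps
    return batches
```

Because every batch shares a single key, its images can be resized to one shape and stacked without padding, and text-to-image and editing samples never mix in the same batch.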