MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing

May 5, 2025
Authors: Zinan Guo, Pengze Zhang, Yanze Wu, Chong Mou, Songtao Zhao, Qian He
cs.AI

Abstract

Current multi-subject customization approaches face two critical challenges: diverse multi-subject training data is difficult to acquire, and attributes become entangled across different subjects. To address both, we propose MUSAR, a simple yet effective framework that achieves robust multi-subject customization while requiring only single-subject training data. First, to overcome the data limitation, we introduce debiased diptych learning. It constructs diptych training pairs from single-subject images to facilitate multi-subject learning, while actively correcting the distribution bias introduced by diptych construction via static attention routing and dual-branch LoRA. Second, to eliminate cross-subject entanglement, we introduce a dynamic attention routing mechanism, which adaptively establishes bijective mappings between generated images and conditional subjects. This design not only decouples multi-subject representations but also maintains scalable generalization performance as the number of reference subjects increases. Comprehensive experiments demonstrate that MUSAR, despite requiring only a single-subject dataset, outperforms existing methods in image quality, subject consistency, and interaction naturalness, including methods trained on multi-subject datasets.
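
The abstract does not say how static attention routing is implemented. As a rough illustration only, one way to read it is as a fixed attention mask: the image tokens of each diptych panel may attend only to the tokens of that panel's own reference subject. The sketch below assumes that reading, and also assumes that tokens are grouped panel by panel and reference by reference. The names `static_routing_mask` and `masked_attention` are hypothetical and do not come from the paper.

```python
import math


def static_routing_mask(panel_tokens, ref_tokens, num_subjects=2):
    """Build a fixed mask; mask[i][j] is True when image token i may attend
    to reference token j. Assumption: tokens are grouped by panel/reference."""
    total_refs = num_subjects * ref_tokens
    mask = []
    for panel in range(num_subjects):
        # Every token of this panel sees only the reference with the same index.
        row = [j // ref_tokens == panel for j in range(total_refs)]
        mask.extend(list(row) for _ in range(panel_tokens))
    return mask


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention restricted to the pairs allowed by mask."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for query, allowed in zip(queries, mask):
        # Keep only the keys and values this query is routed to.
        scores = [
            sum(a * b for a, b in zip(query, key)) / scale
            for key, ok in zip(keys, allowed) if ok
        ]
        kept = [value for value, ok in zip(values, allowed) if ok]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(kept[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, kept)) / total for d in range(dim)]
        )
    return outputs
```

With two panels and two references, a left-panel token in this sketch can never mix in features from the right panel's reference, which is the kind of cross-subject leakage the routing is meant to prevent. The dynamic routing described in the abstract would instead choose the image-to-subject assignment adaptively; that part is not sketched here.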
