SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning

Abstract

Fine-tuning a pre-trained Text-to-Image (T2I) model on a tailored portrait dataset is the mainstream method for text-driven customization of portrait attributes. Because of Semantic Pollution during fine-tuning, existing methods struggle to preserve the original model's behavior and to achieve incremental learning while customizing target attributes. To address this issue, we propose SPF-Portrait, a pioneering work that learns the customized semantics alone while eliminating semantic pollution in text-driven portrait customization. SPF-Portrait uses a dual-path pipeline that introduces the original model as a reference for the conventional fine-tuning path. Through contrastive learning, we ensure adaptation to the target attributes and purposefully align all unrelated attributes with the original portrait. We introduce a novel Semantic-Aware Fine Control Map, which represents the precise response regions of the target semantics, to spatially guide the alignment between the contrastive paths. This alignment not only effectively preserves the performance of the original model but also avoids over-alignment. Furthermore, we propose a novel response enhancement mechanism that reinforces the performance of the target attributes while mitigating the representation discrepancy inherent in direct cross-modal supervision. Extensive experiments demonstrate that SPF-Portrait achieves state-of-the-art performance. Project webpage: https://spf-portrait.github.io/SPF-Portrait/
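
To make the idea of spatially guided alignment concrete, the following is a minimal sketch, not the paper's actual loss. The abstract gives no formulas, so everything here is an assumption for illustration: the function name masked_alignment_loss, the use of a squared difference, and the convention that the control map holds values in [0, 1], where 1 marks a region that responds to the target attribute. The sketch only shows the general pattern of keeping the fine-tuned path close to the reference path outside the target region.

```python
from typing import List

Grid = List[List[float]]  # a single-channel H x W feature map (toy stand-in)


def masked_alignment_loss(ref: Grid, tuned: Grid, control_map: Grid) -> float:
    """Weighted mean squared difference between two feature maps.

    Hypothetical convention: control_map values lie in [0, 1]. A value of 1
    marks a region where the target attribute responds (left free to change);
    a value of 0 marks a region that should stay aligned with the reference.
    Each position is therefore weighted by (1 - control_map value).
    """
    total = 0.0
    weight_sum = 0.0
    for ref_row, tuned_row, map_row in zip(ref, tuned, control_map):
        for r, t, m in zip(ref_row, tuned_row, map_row):
            w = 1.0 - m
            total += w * (r - t) ** 2
            weight_sum += w
    # If the map covers everything, nothing is constrained and the loss is zero.
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy 2 x 2 example: the top-right position belongs to the target region,
# so its large change is ignored; the small drift at bottom-right is penalized.
ref = [[1.0, 2.0], [3.0, 4.0]]
tuned = [[1.0, 5.0], [3.0, 4.5]]
control_map = [[0.0, 1.0], [0.0, 0.0]]
loss = masked_alignment_loss(ref, tuned, control_map)
```

In this toy setup only the drift outside the mapped region contributes to the loss, which mirrors the stated goal of adapting the target attribute while leaving unrelated attributes aligned with the original portrait.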
