Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts

Abstract

Ensuring the ethical deployment of text-to-image models requires effective techniques to prevent the generation of harmful or inappropriate content. While concept erasure methods offer a promising solution, existing finetuning-based approaches suffer from notable limitations. Anchor-free methods risk disrupting sampling trajectories, leading to visual artifacts, while anchor-based methods rely on the heuristic selection of anchor concepts. To overcome these shortcomings, we introduce a finetuning framework, dubbed ANT, which Automatically guides deNoising Trajectories to avoid unwanted concepts. ANT is built on a key insight: reversing the condition direction of classifier-free guidance during mid-to-late denoising stages enables precise content modification without sacrificing early-stage structural integrity. This inspires a trajectory-aware objective that preserves the integrity of the early-stage score function field, which steers samples toward the natural image manifold, without relying on heuristic anchor concept selection. For single-concept erasure, we propose an augmentation-enhanced weight saliency map that precisely identifies the critical parameters contributing most to the unwanted concept, enabling more thorough and efficient erasure. For multi-concept erasure, our objective function offers a versatile plug-and-play solution that significantly boosts performance. Extensive experiments demonstrate that ANT achieves state-of-the-art results in both single- and multi-concept erasure, delivering high-quality, safe outputs without compromising generative fidelity. Code is available at https://github.com/lileyang1210/ANT
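To make the key insight concrete, the following is a minimal sketch of classifier-free guidance whose condition direction is flipped once denoising reaches a chosen point in the schedule. It is an illustration only, not the ANT finetuning objective or the authors' implementation. The function name `guided_noise`, the parameter `flip_from`, the step-counting convention, and the choice to apply standard guidance in the early steps are all assumptions introduced here.

```python
from typing import List


def guided_noise(
    eps_uncond: List[float],
    eps_cond: List[float],
    scale: float,
    step: int,
    total_steps: int,
    flip_from: float = 0.5,
) -> List[float]:
    """Combine unconditional and conditional noise predictions.

    Standard classifier-free guidance computes
        eps_uncond + scale * (eps_cond - eps_uncond).
    Here the sign of the condition direction is reversed once `step`
    reaches `flip_from * total_steps`.

    Assumptions (hypothetical, not taken from the paper):
      - `step` counts denoising progress, with 0 the first, noisiest step;
      - `flip_from` is the fraction of the schedule at which the
        "mid-to-late" stage begins;
      - early steps use ordinary guidance, leaving them untouched.
    """
    # Direction pointing from the unconditional toward the conditional prediction.
    direction = [c - u for c, u in zip(eps_cond, eps_uncond)]
    # Steer toward the concept early on, away from it in mid-to-late steps.
    sign = -1.0 if step >= flip_from * total_steps else 1.0
    return [u + sign * scale * d for u, d in zip(eps_uncond, direction)]


early = guided_noise([0.0, 1.0], [1.0, 3.0], scale=2.0, step=0, total_steps=10)
late = guided_noise([0.0, 1.0], [1.0, 3.0], scale=2.0, step=8, total_steps=10)
print(early, late)
```

In this toy example the early call follows the usual guidance direction, while the late call moves the prediction the opposite way by the same amount, which is one way to read "reversing the condition direction" in the mid-to-late stages.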
