Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation

Abstract

Continual Test-Time Adaptation (CTTA) aims to maintain model performance under evolving target domains by adapting online without labeled data. However, practical deployments often cannot retain the source dataset due to privacy or licensing constraints, and purely source-free CTTA methods tend to become unstable under long-term distribution shift, suffering from compounding self-training errors and catastrophic forgetting. We introduce DO-ALL (Distill Once, Adapt Life-Long), a plug-and-play framework that revisits source information in a compact and privacy-conscious form via Dataset Distillation (DD). Before deployment, DO-ALL performs DD to produce a small set of synthetic distilled anchors that summarize the source distribution. During adaptation, each target sample is matched with its most semantically aligned anchor, which provides a stable reference for various CTTA methods via source replay, representation alignment, and manifold-smoothing regularization. DO-ALL can be seamlessly integrated into existing CTTA algorithms, consistently improving long-term robustness across CIFAR100-C, ImageNet-C, and the CCC benchmark. This demonstrates the potential of leveraging DD to enable stable and continuous adaptation without retaining raw source data. The code is available at https://github.com/blue-531/DOALL.
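
The anchor-matching step can be illustrated with a minimal sketch. The abstract does not say how "most semantically aligned" is measured, so this example assumes cosine similarity between feature vectors. The function names `cosine_similarity` and `match_anchors` and the toy vectors are hypothetical and not taken from the DO-ALL code base.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unaligned with everything
    return dot / (norm_u * norm_v)


def match_anchors(target_feats, anchor_feats):
    """Return, for each target feature vector, the index of its most similar anchor.

    Assumption: alignment is scored by cosine similarity in feature space.
    """
    matches = []
    for t in target_feats:
        sims = [cosine_similarity(t, a) for a in anchor_feats]
        matches.append(max(range(len(sims)), key=sims.__getitem__))
    return matches


# Toy data: three hypothetical distilled-anchor features and two target features.
anchors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [[0.9, 0.1], [0.4, 0.5]]
matched = match_anchors(targets, anchors)
print(matched)  # [0, 2]
```

In a real pipeline the features would presumably come from the adapted model's encoder, and the matched anchor would then feed the replay, alignment, and regularization terms named in the abstract. Those parts are outside the scope of this sketch.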