PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization
July 27, 2023
Authors: Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak
cs.AI
Abstract
In a joint vision-language space, a text feature (e.g., from "a photo of a dog") can effectively represent its relevant image features (e.g., from dog photos). Inspired by this, we propose PromptStyler, which tackles source-free domain generalization by simulating various distribution shifts in the joint space, synthesizing diverse styles via prompts without using any images. Our method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that the learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located near their corresponding content features (from "[class]") in the joint vision-language space. After learning the style word vectors, we train a linear classifier using the synthesized style-content features. PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome, and DomainNet, even though it requires no images and takes only ~30 minutes of training on a single GPU.
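Since the abstract spells out the full pipeline, a compact sketch may help make it concrete. The following is a minimal, simplified reconstruction of the idea, not the authors' released code: the placeholder token "X", the class list, the softmax temperature, the iteration counts, the optimizer hyperparameters, and the plain cross-entropy classifier objective are illustrative assumptions; only the three prompt templates, the two loss ideas (style diversity and content consistency), and the linear classifier come from the abstract. It assumes PyTorch and OpenAI's clip package (https://github.com/openai/CLIP).

```python
# Minimal sketch of the PromptStyler pipeline (simplified reconstruction,
# not the authors' code). Assumes:
#   pip install torch ftfy regex git+https://github.com/openai/CLIP.git
import torch
import torch.nn.functional as F
import clip

device = "cpu"  # the clip package loads fp32 weights on CPU, fp16 on CUDA
model, preprocess = clip.load("ViT-B/32", device=device)
for p in model.parameters():
    p.requires_grad_(False)  # CLIP stays frozen; only style words are learned

def encode_with_style_word(tokens, style_vec, pos):
    """Encode a tokenized prompt, swapping the token embedding at `pos`
    (a placeholder word "X") for the learnable style word vector S*."""
    x = model.token_embedding(tokens).clone()        # [B, 77, d_embed]
    x[:, pos] = style_vec                            # inject pseudo-word S*
    x = x + model.positional_embedding
    x = model.transformer(x.permute(1, 0, 2)).permute(1, 0, 2)
    x = model.ln_final(x)
    eot = tokens.argmax(dim=-1)                      # index of the EOT token
    return x[torch.arange(x.shape[0]), eot] @ model.text_projection

# Content features from "[class]" prompts (illustrative class list).
classes = ["dog", "elephant", "giraffe", "guitar", "horse", "house", "person"]
with torch.no_grad():
    content_feats = F.normalize(
        model.encode_text(clip.tokenize(classes).to(device)), dim=-1)

style_tok = clip.tokenize("a X style of a").to(device)
sc_toks = clip.tokenize([f"a X style of a {c}" for c in classes]).to(device)
POS = 2          # "X" sits right after the SOT token and "a"
K = 80           # number of style word vectors to learn (assumed here)
d_embed = model.token_embedding.weight.shape[1]
style_words = []

for k in range(K):
    s = torch.randn(d_embed, device=device, requires_grad=True)
    opt = torch.optim.SGD([s], lr=0.002, momentum=0.9)
    with torch.no_grad():  # features of previously learned styles are fixed
        prev = (F.normalize(torch.stack([
            encode_with_style_word(style_tok, w, POS)[0] for w in style_words
        ]), dim=-1) if style_words else None)
    for step in range(100):
        style_feat = F.normalize(
            encode_with_style_word(style_tok, s, POS), dim=-1)
        # Style diversity: push the new style feature toward orthogonality
        # with every previously learned style feature.
        l_style = ((style_feat @ prev.T).abs().mean()
                   if prev is not None else torch.tensor(0.0))
        # Content consistency: each style-content feature must stay closest
        # to its own class's content feature among all classes.
        sc = F.normalize(encode_with_style_word(sc_toks, s, POS), dim=-1)
        l_content = F.cross_entropy(
            sc @ content_feats.T / 0.07, torch.arange(len(classes)))
        opt.zero_grad()
        (l_style + l_content).backward()
        opt.step()
    style_words.append(s.detach())

# Train a linear classifier on the K x N synthesized style-content features
# (plain cross-entropy here; the paper's classifier loss may differ).
with torch.no_grad():
    feats = torch.cat([
        F.normalize(encode_with_style_word(sc_toks, s, POS), dim=-1)
        for s in style_words])                       # [K * N, d]
labels = torch.arange(len(classes)).repeat(K)
linear = torch.nn.Linear(feats.shape[1], len(classes))
opt = torch.optim.SGD(linear.parameters(), lr=0.005, momentum=0.9)
for epoch in range(50):
    opt.zero_grad()
    loss = F.cross_entropy(linear(feats), labels)
    loss.backward()
    opt.step()

# Inference: CLIP image features go through the same linear head, e.g.
#   img_feat = F.normalize(model.encode_image(preprocess(img)[None]), dim=-1)
#   pred = classes[linear(img_feat).argmax(-1)]
```

Note that every training signal above is a text feature: images enter only at test time, through CLIP's image encoder, which is what lets the method operate without any source-domain images.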