GeneCIS: A Benchmark for General Conditional Image Similarity

June 13, 2023
Authors: Sagar Vaze, Nicolas Carion, Ishan Misra
cs.AI

Abstract

We argue that there are many notions of 'similarity' and that models, like humans, should be able to adapt to these dynamically. This contrasts with most representation learning methods, supervised or self-supervised, which learn a fixed embedding function and hence implicitly assume a single notion of similarity. For instance, models trained on ImageNet are biased towards object categories, while a user might prefer the model to focus on colors, textures or specific elements in the scene. In this paper, we propose the GeneCIS ('genesis') benchmark, which measures models' ability to adapt to a range of similarity conditions. Extending prior work, our benchmark is designed for zero-shot evaluation only, and hence considers an open-set of similarity conditions. We find that baselines from powerful CLIP models struggle on GeneCIS and that performance on the benchmark is only weakly correlated with ImageNet accuracy, suggesting that simply scaling existing methods is not fruitful. We further propose a simple, scalable solution based on automatically mining information from existing image-caption datasets. We find our method offers a substantial boost over the baselines on GeneCIS, and further improves zero-shot performance on related image retrieval benchmarks. In fact, though evaluated zero-shot, our model surpasses state-of-the-art supervised models on MIT-States. Project page at https://sgvaze.github.io/genecis/.
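
To make the idea of conditional similarity concrete, here is a toy sketch of retrieval under two different conditions. It is not the method proposed in the paper: the two-dimensional embeddings are hand-made (one axis standing in for object category, one for color), the additive fusion of reference and condition is an assumption chosen for simplicity, and the function name `conditional_retrieve` is hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def conditional_retrieve(reference, condition, gallery):
    """Return the gallery key most similar to the fused query.

    Assumption: the query is formed by adding the condition embedding
    to the reference embedding (a simple illustrative fusion only).
    """
    query = [r + c for r, c in zip(reference, condition)]
    return max(gallery, key=lambda name: cosine(query, gallery[name]))


# Hand-made toy embeddings: axis 0 ~ object category, axis 1 ~ color.
reference = [0.7, 0.7]
gallery = {
    "same_object": [1.0, 0.1],  # shares the object, differs in color
    "same_color": [0.1, 1.0],   # shares the color, differs in object
}
conditions = {"object": [1.0, 0.0], "color": [0.0, 1.0]}

for name, cond in conditions.items():
    print(name, "->", conditional_retrieve(reference, cond, gallery))
```

With a fixed embedding and no condition, the two gallery items are equally similar to the reference; the condition is what breaks the tie, so the same reference image retrieves a different result depending on which notion of similarity is requested.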