InfoNCE Induces Gaussian Distribution

Abstract

Contrastive learning has become a cornerstone of modern representation learning, enabling training on massive unlabeled data for both task-specific and general (foundation) models. The prototypical loss in contrastive training is InfoNCE, together with its variants. In this work, we show that the InfoNCE objective induces Gaussian structure in the representations that emerge from contrastive training. We establish this result in two complementary regimes. First, we show that under certain alignment and concentration assumptions, projections of the high-dimensional representation asymptotically approach a multivariate Gaussian distribution. Second, under weaker assumptions, we show that adding a small, asymptotically vanishing regularization term that promotes low feature norm and high feature entropy leads to similar asymptotic results. We support our analysis with experiments on synthetic data and CIFAR-10 across multiple encoder architectures and sizes, which show consistent Gaussian behavior. This perspective provides a principled explanation for the Gaussianity commonly observed in contrastive representations. The resulting Gaussian model enables principled analytical treatment of learned representations and is expected to support a wide range of applications in contrastive learning.
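
For readers unfamiliar with the objective named above, the following is a minimal sketch of a common batch form of the InfoNCE loss. It is not taken from the paper: the function name `info_nce_loss`, the unit-norm normalization, the temperature value, the use of in-batch negatives, and the one-directional form are all assumptions made for illustration.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """One-directional InfoNCE over a batch (illustrative sketch).

    Row i of z_a and row i of z_b are treated as a positive pair;
    every other row of z_b serves as a negative for row i of z_a.
    """
    # Assumption: features are projected to the unit sphere, so the
    # similarity below is cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarities scaled by the temperature, shape (n, n).
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i sits on the diagonal.
    return -np.mean(np.diag(log_prob))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))
    print("aligned pairs:   ", info_nce_loss(z, z))
    print("mismatched pairs:", info_nce_loss(z, z[::-1]))
```

Correctly paired views give a lower loss than mismatched ones. The sketch only illustrates the objective itself; it does not reproduce the asymptotic analysis or the regularized variant described in the abstract.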
