Where Rectified Flows Leak: Characterizing the Membership Signal Along the Interpolation Path

Abstract

Understanding what generative models retain from their training data remains challenging, with implications for copyright and privacy. Beyond verbatim reproduction, models can encode subtler traces of their training data that never surface in their outputs yet remain exploitable. We study this regime for Rectified Flows, which are increasingly used in deployed generative systems. We analyse the interpolation path X_λ = (1-λ)X_0 + λX_1 that defines Rectified Flow training and show that the gap between the reconstruction of training and test data follows a bell-shaped curve over λ. This gap accumulates during training while validation metrics remain stable. The signal has a maximum whose location we derive in closed form under Gaussian assumptions. We validate these predictions on both audio and images and show that the bell-shaped structure is universal, while the peak prediction holds when our assumptions are satisfied. As a proof of concept, we exploit this λ-resolved structure to perform a Membership Inference Attack that distinguishes members of the training set from non-members.
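
As an illustration of the quantities named in the abstract, the following is a minimal sketch of how a λ-resolved train/test reconstruction gap could be measured. It is not the authors' implementation. It assumes that X_0 is standard Gaussian noise and X_1 is a data sample, that the model predicts the velocity X_1 - X_0, and that reconstruction means a single step from X_λ to the endpoint. The names `interpolate`, `reconstruction_error`, `gap_curve`, and `velocity_fn` are hypothetical.

```python
import numpy as np


def interpolate(x0, x1, lam):
    """Point on the straight path X_lam = (1 - lam) * X_0 + lam * X_1."""
    return (1.0 - lam) * x0 + lam * x1


def reconstruction_error(velocity_fn, x1, lam, rng):
    """Squared error of a one-step reconstruction of x1 from X_lam.

    Assumption: X_0 is standard Gaussian noise and x1 is the data sample.
    velocity_fn(x_lam, lam) is a hypothetical stand-in for a trained model.
    """
    x0 = rng.standard_normal(x1.shape)
    x_lam = interpolate(x0, x1, lam)
    # With the exact velocity X_1 - X_0, x_lam + (1 - lam) * v equals x1.
    x1_hat = x_lam + (1.0 - lam) * velocity_fn(x_lam, lam)
    return float(np.sum((x1_hat - x1) ** 2))


def gap_curve(velocity_fn, members, non_members, lambdas, seed=0):
    """Mean non-member error minus mean member error for each lam in [0, 1)."""
    rng = np.random.default_rng(seed)
    gaps = []
    for lam in lambdas:
        err_in = np.mean(
            [reconstruction_error(velocity_fn, x, lam, rng) for x in members]
        )
        err_out = np.mean(
            [reconstruction_error(velocity_fn, x, lam, rng) for x in non_members]
        )
        gaps.append(err_out - err_in)
    return np.array(gaps)
```

A membership test in this spirit would threshold a sample's reconstruction error at the value of λ where the gap is largest. The threshold and the choice of λ would have to be calibrated on data whose membership is known.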