
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

August 26, 2024
Authors: Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J. Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov
cs.AI

Abstract

Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g., the dynamics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level: they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions that describe different dynamics. We argue that multiple processes in the natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine, where the development of diseases and their respective treatment responses depend on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrating along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions, unlike previously proposed methods. We demonstrate the ability of MFM to improve prediction of individual treatment responses on a large-scale multi-patient single-cell drug screen dataset.
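
The idea of conditioning a flow matching model on an embedding of the initial population can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: a per-coordinate mean stands in for the GNN embedding, samples are paired by index, straight-line interpolation paths are used (a common flow matching choice that the abstract does not specify), and the two hand-written vector fields replace a trained network. All function names are hypothetical.

```python
import random


def embed_population(population):
    """Permutation-invariant embedding of a population.

    Assumption: a per-coordinate mean stands in for the GNN of the abstract.
    """
    dim = len(population[0])
    return [sum(x[d] for x in population) / len(population) for d in range(dim)]


def flow_matching_loss(vector_field, source, target, rng):
    """Monte Carlo flow matching loss for one (source, target) population pair.

    The field is evaluated on straight-line paths between paired samples and
    receives the embedding of the source population as an extra input.
    """
    emb = embed_population(source)
    total = 0.0
    for x0, x1 in zip(source, target):
        t = rng.random()  # time sampled uniformly in [0, 1)
        x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]  # point on the path
        u = [b - a for a, b in zip(x0, x1)]  # velocity of the straight-line path
        v = vector_field(x_t, t, emb)
        total += sum((vi - ui) ** 2 for vi, ui in zip(v, u))
    return total / len(source)


def integrate(vector_field, population, steps=50):
    """Push a population from t=0 to t=1 with explicit Euler steps.

    The embedding is computed once, from the initial population.
    """
    emb = embed_population(population)
    dt = 1.0 / steps
    current = [list(x) for x in population]
    for k in range(steps):
        t = k * dt
        current = [
            [xi + dt * vi for xi, vi in zip(x, vector_field(x, t, emb))]
            for x in current
        ]
    return current


def make_pair(center, rng, n=64):
    """Toy data: every sample is shifted by the mean of its own population,
    so the dynamics depend on the population itself."""
    source = [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)] for _ in range(n)]
    emb = embed_population(source)
    target = [[a + e for a, e in zip(x, emb)] for x in source]
    return source, target


def population_aware_field(x, t, emb):
    """Oracle field for the toy data: move every sample by the embedding."""
    return list(emb)


def population_blind_field(x, t, emb):
    """Baseline that ignores the population embedding."""
    return [0.0 for _ in x]


rng = random.Random(0)
source, target = make_pair((2.0, -1.0), rng)
aware_loss = flow_matching_loss(population_aware_field, source, target, rng)
blind_loss = flow_matching_loss(population_blind_field, source, target, rng)
pushed = integrate(population_aware_field, source)
```

In this toy setting the field that reads the embedding matches the target velocities up to floating-point error, while the field that ignores it cannot, since the required shift differs from one population to the next. A practical model would replace both hand-written fields with a neural network trained by minimizing this loss over many population pairs.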
November 14, 2024