

CauScale: Neural Causal Discovery at Scale

February 9, 2026
Authors: Bo Peng, Sirui Chen, Jiaguo Tian, Yu Qiao, Chaochao Lu
cs.AI

Abstract

Causal discovery is essential for advancing data-driven fields such as scientific AI and data analysis, yet existing approaches face significant time- and space-efficiency bottlenecks when scaling to large graphs. To address this challenge, we present CauScale, a neural architecture designed for efficient causal discovery that scales inference to graphs with up to 1000 nodes. CauScale improves time efficiency via a reduction unit that compresses data embeddings, and improves space efficiency by adopting tied attention weights, which avoid maintaining axis-specific attention maps. To maintain high accuracy in causal discovery, CauScale adopts a two-stream design: a data stream extracts relational evidence from high-dimensional observations, while a graph stream integrates statistical graph priors and preserves key structural signals. CauScale successfully scales to 500-node graphs during training, where prior work fails due to space limitations. Across test data with varying graph scales and causal mechanisms, CauScale achieves 99.6% mAP on in-distribution data and 84.4% mAP on out-of-distribution data, while delivering 4x to 13,000x inference speedups over prior methods. Our project page is at https://github.com/OpenCausaLab/CauScale.
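As a rough illustration of the two efficiency mechanisms the abstract names, the sketch below is a minimal PyTorch rendering, not the authors' implementation: all module and parameter names (`ReductionUnit`, `TiedAxialBlock`, the slot count `k`) are hypothetical. It shows (1) a reduction unit that compresses the sample axis of the data embeddings before attention, and (2) axial attention whose weights are tied (shared) across the node and sample axes, so only one attention module's parameters are stored rather than one per axis.

```python
# Hypothetical sketch of the abstract's two efficiency ideas; not CauScale's code.
import torch
import torch.nn as nn


class TiedAxialBlock(nn.Module):
    """Applies the *same* attention module along two axes of a
    (batch, nodes, samples, dim) tensor, avoiding axis-specific weights."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # One attention module reused for both axes -> "tied" weights.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def _attend(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B*, L, dim); standard pre-norm residual self-attention.
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, s, d = x.shape
        # Attention over the sample axis (fold nodes into the batch axis).
        x = self._attend(x.reshape(b * n, s, d)).reshape(b, n, s, d)
        # Attention over the node axis, reusing the same weights.
        x = x.transpose(1, 2).reshape(b * s, n, d)
        x = self._attend(x).reshape(b, s, n, d).transpose(1, 2)
        return x


class ReductionUnit(nn.Module):
    """Compresses the sample axis from `num_samples` to `k` learned slots,
    shrinking the sequence length that later attention layers see."""

    def __init__(self, num_samples: int, k: int):
        super().__init__()
        self.proj = nn.Linear(num_samples, k)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, nodes, samples, dim) -> (batch, nodes, k, dim)
        return self.proj(x.transpose(-1, -2)).transpose(-1, -2)


if __name__ == "__main__":
    x = torch.randn(2, 50, 200, 32)   # 50 nodes, 200 samples, dim 32
    x = ReductionUnit(200, 16)(x)     # compress samples: 200 -> 16
    x = TiedAxialBlock(32)(x)         # tied axial attention
    print(x.shape)                    # torch.Size([2, 50, 16, 32])
```

Under these assumptions, the savings come from reusing one set of query/key/value projections for both axes and from attending over k compressed slots rather than all raw samples; this is plausibly what lets training reach 500-node graphs where per-axis attention maps would exhaust memory.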