Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms
May 22, 2025
作者: Baran Hashemi, Kurt Pasque, Chris Teska, Ruriko Yoshida
cs.AI
Abstract
Dynamic programming (DP) algorithms for combinatorial optimization problems
combine maximization, minimization, and classical addition in their
recursions. The associated value functions correspond to convex polyhedra in
the max-plus semiring. Existing Neural Algorithmic Reasoning models, however,
rely on softmax-normalized dot-product attention, whose smooth exponential
weighting blurs these sharp polyhedral structures and collapses when evaluated
in out-of-distribution (OOD) settings. We introduce Tropical attention, a
novel attention function that operates natively in the max-plus semiring of
tropical geometry. We prove that Tropical attention can approximate tropical
circuits of DP-type combinatorial algorithms. We then show that Tropical
transformers improve empirical OOD performance on algorithmic reasoning
tasks, in both length generalization and value generalization, surpassing
softmax baselines while remaining stable under adversarial attacks. We also
propose adversarial-attack generalization as a third axis for Neural
Algorithmic Reasoning benchmarking. Our results demonstrate that Tropical
attention restores the sharp, scale-invariant reasoning absent from softmax.
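To make the max-plus setting concrete, here is a minimal NumPy sketch of the tropical semiring operations the abstract refers to, together with a toy attention step that scores query-key pairs tropically and selects a single best value per query instead of softmax-averaging. This is an illustrative assumption about how such a mechanism could look, not the paper's actual architecture; the function names `tropical_matmul` and `tropical_attention` are hypothetical.

```python
import numpy as np

# In the tropical (max-plus) semiring, "addition" is max and
# "multiplication" is ordinary +, so matrix products become
# max-plus products: (A ⊗ B)[i, j] = max_k (A[i, k] + B[k, j]).
def tropical_matmul(A, B):
    # A: (m, k), B: (k, n) -> (m, n) via broadcasting over k.
    return np.max(A[:, :, None] + B[None, :, :], axis=1)

def tropical_attention(Q, K, V):
    """Toy max-plus attention (illustrative, not the paper's model):
    scores are tropical inner products, and each query hard-selects
    the value of its single highest-scoring key."""
    # scores[i, j] = max_d (Q[i, d] + K[j, d])
    scores = np.max(Q[:, None, :] + K[None, :, :], axis=-1)  # (nq, nk)
    # argmax selection keeps decisions sharp; adding any constant
    # to all scores leaves the selection unchanged (translation
    # invariance, the tropical analogue of scale invariance).
    best = np.argmax(scores, axis=-1)
    return V[best]

Q = np.array([[0.0, 0.0]])
K = np.array([[1.0, 0.0],
              [0.0, 5.0]])
V = np.array([[1.0],
              [2.0]])
print(tropical_attention(Q, K, V))  # the query picks key 1's value: [[2.]]
```

Note that, unlike softmax attention, the output here is a hard selection rather than a convex combination, which is one hedged way to read the abstract's claim that exponential weighting "blurs" polyhedral structure while max-plus operations preserve it.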