

Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms

May 22, 2025
作者: Baran Hashemi, Kurt Pasque, Chris Teska, Ruriko Yoshida
cs.AI

Abstract

Dynamic programming (DP) algorithms for combinatorial optimization problems combine maximization, minimization, and classical addition in their recursions. The associated value functions correspond to convex polyhedra in the max-plus semiring. Existing Neural Algorithmic Reasoning models, however, rely on softmax-normalized dot-product attention, whose smooth exponential weighting blurs these sharp polyhedral structures and collapses when evaluated in out-of-distribution (OOD) settings. We introduce Tropical attention, a novel attention function that operates natively in the max-plus semiring of tropical geometry. We prove that Tropical attention can approximate tropical circuits of DP-type combinatorial algorithms. We then show that Tropical transformers enhance empirical OOD performance on algorithmic reasoning tasks, in both length generalization and value generalization, surpassing softmax baselines while remaining stable under adversarial attacks. We also present adversarial-attack generalization as a third axis for Neural Algorithmic Reasoning benchmarking. Our results demonstrate that Tropical attention restores the sharp, scale-invariant reasoning absent from softmax.
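
To make the "operates natively in the max-plus semiring" idea concrete, here is a rough, illustrative NumPy sketch of a max-plus attention head, not the authors' implementation: both the query-key scoring and the value aggregation replace the usual multiply-and-sum with add-and-max. The names tropical_matmul and tropical_attention are placeholders chosen for this sketch, not identifiers from the paper.

```python
import numpy as np

def tropical_matmul(A, B):
    """Max-plus (tropical) matrix product: C[i, j] = max_k (A[i, k] + B[k, j])."""
    # Broadcast A as (m, 1, k) against B transposed as (1, n, k), then reduce with max over k.
    return np.max(A[:, None, :] + B.T[None, :, :], axis=-1)

def tropical_attention(Q, K, V):
    """Illustrative max-plus attention head (a sketch, not the paper's exact formulation).

    Scores are tropical inner products of queries and keys; values are aggregated
    with another max-plus product instead of a softmax-weighted sum.
    """
    scores = tropical_matmul(Q, K.T)   # (n_q, n_k): scores[i, j] = max_d (Q[i, d] + K[j, d])
    return tropical_matmul(scores, V)  # (n_q, d_v): out[i, d] = max_j (scores[i, j] + V[j, d])

# Tiny usage example with random inputs.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(tropical_attention(Q, K, V).shape)  # (4, 8)
```

Because max and addition are piecewise-linear, this aggregation keeps outputs on the polyhedral (tropical) structure the abstract refers to, whereas softmax's exponential weighting smooths it out.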
