Neural Predictor-Corrector: Solving Homotopy Problems with Reinforcement Learning
February 3, 2026
Authors: Jiayao Mai, Bangyan Liao, Zhenjun Zhao, Yingping Zeng, Haoang Li, Javier Civera, Tailin Wu, Yi Zhou, Peidong Liu
cs.AI
Abstract
The homotopy paradigm, a general principle for solving challenging problems, appears across diverse domains such as robust optimization, global optimization, polynomial root-finding, and sampling. Practical solvers for these problems typically follow a predictor-corrector (PC) structure, but rely on hand-crafted heuristics for step sizes and iteration termination, which are often suboptimal and task-specific. To address this, we unify these problems under a single framework, which enables the design of a general neural solver. Building on this unified view, we propose the Neural Predictor-Corrector (NPC), which replaces hand-crafted heuristics with automatically learned policies. NPC formulates policy selection as a sequential decision-making problem and leverages reinforcement learning to automatically discover efficient strategies. To further enhance generalization, we introduce an amortized training mechanism, enabling one-time offline training for a class of problems and efficient online inference on new instances. Experiments on four representative homotopy problems demonstrate that our method generalizes effectively to unseen instances. It consistently outperforms classical and specialized baselines in efficiency while exhibiting superior stability across tasks, highlighting the value of unifying homotopy methods into a single neural framework.
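To make the abstract's setup concrete, the following is a minimal sketch of the classical predictor-corrector structure on one of the four problem classes, polynomial root-finding. It tracks roots of a homotopy H(x, t) from a start system g at t = 0 to the target f at t = 1 with an Euler predictor and a Newton corrector. The fixed step size `dt` and fixed corrector iteration count `newton_iters` are exactly the hand-crafted heuristics the paper proposes to replace with a learned policy; the specific systems, the gamma constant, and all function names here are illustrative choices, not taken from the paper.

```python
# Illustrative classical predictor-corrector homotopy tracker (not the
# paper's NPC): fixed step size and corrector budget stand in for the
# hand-crafted heuristics that NPC learns instead.
gamma = 0.6 + 0.8j                      # random complex "gamma trick" constant
f  = lambda x: x**2 - 3*x + 2           # target system: roots 1 and 2
df = lambda x: 2*x - 3
g  = lambda x: x**2 + 1                 # start system: known roots +/- i
dg = lambda x: 2*x

def H(x, t):  return gamma*(1-t)*g(x) + t*f(x)   # homotopy H(x, t)
def Hx(x, t): return gamma*(1-t)*dg(x) + t*df(x) # dH/dx
def Ht(x, t): return -gamma*g(x) + f(x)          # dH/dt

def track(x, dt=0.02, newton_iters=3):
    """Track one root path from t=0 to t=1.

    dt and newton_iters are the hand-crafted heuristics discussed in
    the abstract: fixed, task-specific, and generally suboptimal.
    """
    t = 0.0
    while t < 1.0:
        step = min(dt, 1.0 - t)
        x = x - step * Ht(x, t) / Hx(x, t)   # Euler predictor along the path
        t += step
        for _ in range(newton_iters):        # Newton corrector: solve H(., t) = 0
            x = x - H(x, t) / Hx(x, t)
    return x

roots = sorted(track(x0).real for x0 in (1j, -1j))
print([round(r, 6) for r in roots])          # approximately the roots 1 and 2 of f
```

In the paper's framing, the RL policy would observe the tracking state at each step and choose `step` and the corrector budget adaptively, rather than using the fixed values above.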