Graph-Enhanced Deep Reinforcement Learning for Multi-Objective Unrelated Parallel Machine Scheduling
February 8, 2026
Authors: Bulent Soykan, Sean Mondesire, Ghaith Rabadi, Grace Bochenek
cs.AI
Abstract
The Unrelated Parallel Machine Scheduling Problem (UPMSP) with release dates, setups, and eligibility constraints presents a significant multi-objective challenge. Traditional methods struggle to balance minimizing Total Weighted Tardiness (TWT) and Total Setup Time (TST). This paper proposes a Deep Reinforcement Learning framework using Proximal Policy Optimization (PPO) and a Graph Neural Network (GNN). The GNN effectively represents the complex state of jobs, machines, and setups, allowing the PPO agent to learn a direct scheduling policy. Guided by a multi-objective reward function, the agent simultaneously minimizes TWT and TST. Experimental results on benchmark instances demonstrate that our PPO-GNN agent significantly outperforms a standard dispatching rule and a metaheuristic, achieving a superior trade-off between both objectives. This provides a robust and scalable solution for complex manufacturing scheduling.
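The two objectives the agent balances can be made concrete with a short sketch. This is an illustration only, not the paper's implementation: the weighted-sum scalarization and the coefficients `w_twt`/`w_tst` are assumptions, and a DRL reward would typically be the negative of this quantity (or its per-step change).

```python
# Illustrative sketch (not the paper's code): computing Total Weighted
# Tardiness (TWT) and Total Setup Time (TST) for a candidate schedule,
# then combining them via an assumed weighted-sum scalarization.

def total_weighted_tardiness(jobs):
    """jobs: dicts with completion time C, due date d, tardiness weight w."""
    return sum(j["w"] * max(0, j["C"] - j["d"]) for j in jobs)

def total_setup_time(setups):
    """setups: durations of all setups incurred across machines."""
    return sum(setups)

def scalarized_objective(jobs, setups, w_twt=1.0, w_tst=1.0):
    # Hypothetical weighted-sum combination of the two objectives;
    # the paper's exact reward coefficients are not specified here.
    return w_twt * total_weighted_tardiness(jobs) + w_tst * total_setup_time(setups)

jobs = [
    {"C": 12, "d": 10, "w": 2},  # tardy by 2 with weight 2 -> contributes 4
    {"C": 8,  "d": 9,  "w": 1},  # finishes early -> contributes 0
]
setups = [3, 1]                   # TST = 4

print(total_weighted_tardiness(jobs))      # 4
print(scalarized_objective(jobs, setups))  # 8.0
```

Minimizing such a scalarized quantity is one common way to trade off the two objectives; the paper's contribution is learning a dispatching policy that achieves this balance directly rather than searching over complete schedules.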