布爾可滿足性問題的模仿學習求解法
Boolean Satisfiability via Imitation Learning
September 29, 2025
作者: Zewei Zhang, Huan Liu, Yuanhao Yu, Jun Chen, Xiangyu Xu
cs.AI
摘要
我們提出ImitSAT,一種基於模仿學習的衝突驅動子句學習(CDCL)求解器的分支策略,專為布爾可滿足性問題(SAT)設計。與以往預測實例級信號以間接改進CDCL分支的方法不同,或依賴於強化學習及不足的CDCL信息來增強分支,ImitSAT從專家KeyTrace中學習,該KeyTrace將完整運行壓縮為一系列存活的決策序列。在同一實例上重放KeyTrace幾乎無衝突,提供了密集的決策級監督,並直接減少了傳播——這是牆鐘時間的主要貢獻者。這種前綴條件下的監督使ImitSAT能夠無需探索即可重現高質量分支,從而實現更快的收斂、穩定的訓練以及與CDCL的無縫集成。大量實驗表明,ImitSAT減少了傳播次數和運行時間,超越了最先進的學習方法。我們已在https://github.com/zewei-Zhang/ImitSAT上公開了源代碼及訓練模型。
English
We propose ImitSAT, a branching policy for conflict-driven clause learning
(CDCL) solvers based on imitation learning for the Boolean satisfiability
problem (SAT). Unlike previous methods that predict instance-level signals to
improve CDCL branching indirectly, or rely on reinforcement learning and
insufficient CDCL information to enhance branching, ImitSAT learns from expert
KeyTrace that collapses a full run into the sequence of surviving decisions.
Replaying a KeyTrace on the same instance is nearly conflict-free, providing
dense decision-level supervision and directly reducing propagations -- the
dominant contributor to wall-clock time. This prefix-conditioned supervision
enables ImitSAT to reproduce high-quality branches without exploration,
yielding faster convergence, stable training, and seamless integration into
CDCL. Extensive experiments demonstrate that ImitSAT reduces propagation counts
and runtime, outperforming state-of-the-art learned approaches. We released the
source code and trained model at https://github.com/zewei-Zhang/ImitSAT