When AI Navigates the Fog of War

Abstract

Can AI reason about a war before its trajectory becomes historically obvious? Analyzing this capability is difficult because retrospective geopolitical prediction is heavily confounded by training-data leakage. We address this challenge through a temporally grounded case study of the early stages of the 2026 Middle East conflict, which unfolded after the training cutoff of current frontier models. We construct 11 critical temporal nodes, 42 node-specific verifiable questions, and 5 general exploratory questions, requiring models to reason only from information that would have been publicly available at each moment. This design substantially mitigates training-data leakage concerns, creating a setting well-suited for studying how models analyze an unfolding crisis under the fog of war, and provides, to our knowledge, the first temporally grounded analysis of LLM reasoning in an ongoing geopolitical conflict. Our analysis reveals three main findings. First, current state-of-the-art large language models often display a striking degree of strategic realism, reasoning beyond surface rhetoric toward deeper structural incentives. Second, this capability is uneven across domains: models are more reliable in economically and logistically structured settings than in politically ambiguous multi-actor environments. Finally, model narratives evolve over time, shifting from early expectations of rapid containment toward more systemic accounts of regional entrenchment and attritional de-escalation. Since the conflict remains ongoing at the time of writing, this work can serve as an archival snapshot of model reasoning during an unfolding geopolitical crisis, enabling future studies without the hindsight bias of retrospective analysis.
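The temporal grounding described above (each node exposes only information that was public at that moment) can be illustrated with a minimal sketch. This is not the authors' pipeline: the names `Document`, `TemporalNode`, `visible_context`, and `build_prompt` are hypothetical, and the dates and texts are placeholders rather than data from the study.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Document:
    """A public source with its publication date (hypothetical schema)."""
    published: date
    text: str


@dataclass
class TemporalNode:
    """One temporal node: a cutoff date plus its node-specific questions."""
    cutoff: date
    questions: list[str] = field(default_factory=list)


def visible_context(node: TemporalNode, corpus: list[Document]) -> list[Document]:
    """Keep only documents published on or before the cutoff, oldest first."""
    return sorted(
        (d for d in corpus if d.published <= node.cutoff),
        key=lambda d: d.published,
    )


def build_prompt(node: TemporalNode, corpus: list[Document], question: str) -> str:
    """Assemble a prompt that exposes only information available at the node."""
    lines = [f"Today is {node.cutoff.isoformat()}. Use only the sources below."]
    for d in visible_context(node, corpus):
        lines.append(f"[{d.published.isoformat()}] {d.text}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Placeholder corpus and node; dates and texts are invented for illustration.
corpus = [
    Document(date(2026, 1, 10), "DOC_B"),
    Document(date(2026, 1, 1), "DOC_A"),
]
node = TemporalNode(cutoff=date(2026, 1, 5), questions=["PLACEHOLDER_QUESTION"])
prompt = build_prompt(node, corpus, node.questions[0])
```

Filtering on publication date is only one way to enforce the constraint. It limits what appears in the prompt, but it cannot by itself exclude knowledge already in the model's weights, which is why the study relies on a conflict that postdates the models' training cutoff.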
