Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics
May 12, 2026
Authors: Jishnu Sethumadhavan Nair, Patrice Bechard, Rishabh Maheshwary, Surajit Dasgupta, Sravan Ramachandran, Aakash Bhagat, Shruthan Radhakrishna, Pulkit Pattnaik, Johan Obando-Ceron, Shiva Krishna Reddy Malay, Sagar Davasam, Seganrasan Subramanian, Vipul Mittal, Sridhar Krishna Nemala, Christopher Pal, Srinivas Sunkara, Sai Rajeswar
cs.AI
Abstract
World models enable agents to anticipate the effects of their actions by internalizing environment dynamics. In enterprise systems, however, these dynamics are often defined by tenant-specific business logic that varies across deployments and evolves over time, making models trained on historical transitions brittle under deployment shift. We ask a question the world-models literature has not addressed: when the rules can be read at inference time, does an agent still need to learn them? We argue, and demonstrate empirically, that in settings where transition dynamics are configurable and readable, runtime discovery complements offline training by grounding predictions in the active system instance. We propose enterprise discovery agents, which recover relevant transition dynamics at runtime by reading the system's configuration rather than relying solely on internalized representations. We introduce CascadeBench, a reasoning-focused benchmark for enterprise cascade prediction that adopts the evaluation methodology of World of Workflows on diverse synthetic environments, and use it together with deployment-shift evaluation to show that offline-trained world models can perform well in-distribution but degrade as dynamics change, whereas discovery-based agents are more robust under shift by grounding their predictions in the current instance. Our findings suggest that, in configurable enterprise environments, agents should not rely solely on fixed internalized dynamics, but should incorporate mechanisms for discovering relevant transition logic at runtime.
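The contrast the abstract draws between internalized dynamics and runtime discovery can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the paper's method or CascadeBench itself: the rule format (`when`/`then` pairs), the `apply_cascade` helper, and the `FrozenPredictor` and `DiscoveryPredictor` classes are all hypothetical. The frozen predictor stands in for an offline-trained world model by keeping a snapshot of the rules seen at training time, while the discovery predictor re-reads the tenant configuration on every prediction.

```python
from typing import Any, Callable

# Hypothetical rule format: {"when": (field, value), "then": (field, value)}.
Rule = dict[str, Any]
State = dict[str, Any]


def apply_cascade(state: State, action: State, rules: list[Rule],
                  max_rounds: int = 10) -> State:
    """Apply an action, then fire matching rules until nothing changes."""
    new_state = {**state, **action}
    for _ in range(max_rounds):
        changed = False
        for rule in rules:
            cond_field, cond_value = rule["when"]
            set_field, set_value = rule["then"]
            if (new_state.get(cond_field) == cond_value
                    and new_state.get(set_field) != set_value):
                new_state[set_field] = set_value
                changed = True
        if not changed:
            break
    return new_state


class FrozenPredictor:
    """Stand-in for an offline-trained model: dynamics fixed at training time."""

    def __init__(self, training_rules: list[Rule]) -> None:
        self._rules = [dict(r) for r in training_rules]  # snapshot, never updated

    def predict(self, state: State, action: State) -> State:
        return apply_cascade(state, action, self._rules)


class DiscoveryPredictor:
    """Reads the live configuration each time it predicts."""

    def __init__(self, read_config: Callable[[], list[Rule]]) -> None:
        self._read_config = read_config

    def predict(self, state: State, action: State) -> State:
        return apply_cascade(state, action, self._read_config())


# Hypothetical tenant configuration: high priority escalates, escalation reassigns.
tenant_config = {"rules": [
    {"when": ("priority", "high"), "then": ("escalated", True)},
    {"when": ("escalated", True), "then": ("assignee", "tier2")},
]}

frozen = FrozenPredictor(tenant_config["rules"])
discovery = DiscoveryPredictor(lambda: tenant_config["rules"])

state = {"priority": "low", "escalated": False, "assignee": "tier1"}
action = {"priority": "high"}

# In-distribution: both predictors agree with the live system.
before_frozen = frozen.predict(state, action)
before_discovery = discovery.predict(state, action)

# Deployment shift: the tenant re-routes escalations to a different group.
tenant_config["rules"][1] = {"when": ("escalated", True), "then": ("assignee", "oncall")}

truth = apply_cascade(state, action, tenant_config["rules"])
after_frozen = frozen.predict(state, action)
after_discovery = discovery.predict(state, action)
```

In this toy setup the two predictors agree before the configuration changes; afterward only the one that reads the active configuration matches the live cascade. A real system would involve far richer business logic and a learned model rather than a rule snapshot, so this is only a schematic of the failure mode under deployment shift.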