Does More Inference-Time Compute Really Help Robustness?

July 21, 2025
Authors: Tong Wu, Chong Xiang, Jiachen T. Wang, Weichen Yu, Chawin Sitawarin, Vikash Sehwag, Prateek Mittal
cs.AI

Abstract

Recently, Zaremba et al. demonstrated that increasing inference-time computation improves robustness in large proprietary reasoning LLMs. In this paper, we first show that smaller-scale, open-source models (e.g., DeepSeek R1, Qwen3, Phi-reasoning) can also benefit from inference-time scaling using a simple budget forcing strategy. More importantly, we reveal and critically examine an implicit assumption in prior work: intermediate reasoning steps are hidden from adversaries. By relaxing this assumption, we identify an important security risk, intuitively motivated and empirically verified as an inverse scaling law: if intermediate reasoning steps become explicitly accessible, increased inference-time computation consistently reduces model robustness. Finally, we discuss practical scenarios where models with hidden reasoning chains are still vulnerable to attacks, such as models with tool-integrated reasoning and advanced reasoning extraction attacks. Our findings collectively demonstrate that the robustness benefits of inference-time scaling depend heavily on the adversarial setting and deployment context. We urge practitioners to carefully weigh these subtle trade-offs before applying inference-time scaling in security-sensitive, real-world applications.
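The abstract does not spell out the budget forcing strategy, but a minimal sketch of one common implementation for open reasoning models is shown below: suppress the model's end-of-thinking marker and append a continuation cue until a minimum reasoning-token budget is spent. The model name, the "</think>" marker, and the " Wait," cue are illustrative assumptions, not the paper's exact setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model choice; any open reasoning model that emits an explicit
# end-of-thinking marker (assumed here to be "</think>") would work similarly.
MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)

def generate_with_budget(prompt: str, min_thinking_tokens: int = 512) -> str:
    """Keep the model reasoning until at least `min_thinking_tokens` tokens
    have been spent, then let it produce the final answer."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    # Assumes "</think>" maps to a single token; we take the last id as a fallback.
    end_think = tokenizer.encode("</think>", add_special_tokens=False)[-1]
    spent = 0
    while spent < min_thinking_tokens:
        out = model.generate(
            ids,
            max_new_tokens=min_thinking_tokens - spent,
            eos_token_id=end_think,  # stop when the model tries to exit reasoning
            do_sample=False,
        )
        spent += out.shape[1] - ids.shape[1]
        ids = out
        if spent < min_thinking_tokens:
            # Budget not yet met: drop the end-of-thinking marker (if emitted)
            # and append a continuation cue to force the reasoning to continue.
            if ids[0, -1].item() == end_think:
                ids = ids[:, :-1]
            cue = tokenizer(" Wait,", return_tensors="pt",
                            add_special_tokens=False).input_ids
            ids = torch.cat([ids, cue], dim=-1)
    # Close the reasoning block and generate the final answer.
    close = tokenizer("</think>", return_tensors="pt",
                      add_special_tokens=False).input_ids
    ids = torch.cat([ids, close], dim=-1)
    final = model.generate(ids, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(final[0], skip_special_tokens=True)

Raising min_thinking_tokens is what "increasing inference-time compute" means under this sketch; the paper's finding is that whether this helps or hurts robustness depends on whether an adversary can read the extended reasoning chain.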