FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments
January 9, 2026
Authors: Zhi Yang, Runguo Li, Qiqi Qiang, Jiashun Wang, Fangqi Lou, Mengping Li, Dongpo Cheng, Rui Xu, Heng Lian, Shuo Zhang, Xiaolong Liang, Xiaoming Huang, Zheng Wei, Zhaowei Liu, Xin Guo, Huacan Wang, Ronghao Chen, Liwen Zhang
cs.AI
Abstract
Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated decision-making. Their abilities to plan, invoke tools, and manipulate mutable state introduce new security risks in high-stakes, highly regulated financial environments. However, existing safety evaluations largely focus on language-model-level content compliance or abstract agent settings, and fail to capture execution-grounded risks arising from real operational workflows and state-changing actions. To bridge this gap, we propose FinVault, the first execution-grounded security benchmark for financial agents. It comprises 31 regulatory case-driven sandbox scenarios with state-writable databases and explicit compliance constraints, together with 107 real-world vulnerabilities and 963 test cases that systematically cover prompt injection, jailbreaking, and financially adapted attacks, as well as benign inputs for false-positive evaluation. Experimental results reveal that existing defense mechanisms remain ineffective in realistic financial agent settings: average attack success rates (ASR) reach up to 50.0% on state-of-the-art models and remain non-negligible even for the most robust systems (ASR 6.7%). These results highlight the limited transferability of current safety designs and the need for stronger finance-specific defenses. Our code can be found at https://github.com/aifinlab/FinVault.
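To make the two reported quantities concrete, the following is a minimal sketch of how an attack success rate and a false-positive rate could be computed from per-case outcomes. It is not taken from the FinVault code base: the `CaseResult` record, its field names, and the category labels are hypothetical, and the sketch assumes ASR is the fraction of attack cases that end in a non-compliant action, with the false-positive rate being the fraction of benign cases the agent wrongly refuses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseResult:
    # Hypothetical per-test-case record; field names are assumptions.
    category: str   # e.g. "prompt_injection", "jailbreak", "financial_adapted", "benign"
    violated: bool  # True if the run ended in a non-compliant, state-changing action
    refused: bool   # True if the agent declined to carry out the request


def attack_success_rate(results: List[CaseResult]) -> float:
    """Fraction of attack (non-benign) cases in which the attack succeeded."""
    attacks = [r for r in results if r.category != "benign"]
    if not attacks:
        return 0.0
    return sum(r.violated for r in attacks) / len(attacks)


def false_positive_rate(results: List[CaseResult]) -> float:
    """Fraction of benign cases that the agent refused."""
    benign = [r for r in results if r.category == "benign"]
    if not benign:
        return 0.0
    return sum(r.refused for r in benign) / len(benign)
```

Under these assumptions, a per-model ASR would be computed over that model's attack cases and then averaged across scenarios or attack types; the exact aggregation used for the reported figures is not stated in the abstract.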