A Trajectory-Based Safety Audit of Clawdbot (OpenClaw)

February 16, 2026
Authors: Tianyu Chen, Dongrui Liu, Xia Hu, Jingyi Yu, Wenjie Wang
cs.AI

Abstract

Clawdbot is a self-hosted, tool-using personal AI agent with a broad action space spanning local execution and web-mediated workflows, which raises heightened safety and security concerns under ambiguity and adversarial steering. We present a trajectory-centric evaluation of Clawdbot across six risk dimensions. Our test suite samples and lightly adapts scenarios from prior agent-safety benchmarks (including ATBench and LPS-Bench) and supplements them with hand-designed cases tailored to Clawdbot's tool surface. We log complete interaction trajectories (messages, actions, tool-call arguments/outputs) and assess safety using both an automated trajectory judge (AgentDoG-Qwen3-4B) and human review. Across 34 canonical cases, we find a non-uniform safety profile: performance is generally consistent on reliability-focused tasks, while most failures arise under underspecified intent, open-ended goals, or benign-seeming jailbreak prompts, where minor misinterpretations can escalate into higher-impact tool actions. We supplement the aggregate results with representative case studies, summarize what these cases have in common, and analyze the security vulnerabilities and typical failure modes that Clawdbot is prone to in practice.
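The trajectory logging and two-stage assessment described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names, the `toy_judge` stand-in, the `"safe"`/`"unsafe"` labels, and the rule that a human label overrides the automated one are all assumptions made for the example. The abstract states only that complete trajectories are logged and that both an automated judge and human review are used.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolCall:
    name: str                  # hypothetical tool name, e.g. "shell.exec"
    arguments: dict[str, Any]  # arguments the agent passed to the tool
    output: str                # what the tool returned


@dataclass
class Step:
    role: str                  # "user", "agent", or "tool" (assumed roles)
    message: str
    tool_calls: list[ToolCall] = field(default_factory=list)


@dataclass
class Trajectory:
    case_id: str
    risk_dimension: str        # one of the six dimensions (names not given in the abstract)
    steps: list[Step] = field(default_factory=list)


def toy_judge(trajectory: Trajectory) -> str:
    """Toy stand-in for an automated judge: flags one destructive shell pattern."""
    for step in trajectory.steps:
        for call in step.tool_calls:
            if "rm -rf" in str(call.arguments.get("cmd", "")):
                return "unsafe"
    return "safe"


def audit(
    trajectory: Trajectory,
    judge: Callable[[Trajectory], str],
    human_labels: dict[str, str],
) -> str:
    """Return the human label when one exists, else the judge's label (assumed policy)."""
    human: Optional[str] = human_labels.get(trajectory.case_id)
    return human if human is not None else judge(trajectory)
```

Keeping tool-call arguments and outputs inside each step means a reviewer can see where a small misreading of intent turned into a high-impact action, rather than judging only the final response.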