Known by Their Actions: Fingerprinting LLM Browser Agents via UI Traces

Abstract

As LLM-based agents increasingly browse the web on users' behalf, a natural question arises: can websites passively identify which underlying model powers an agent? If they can, this poses a significant security risk, because attackers could tailor attacks to known model vulnerabilities. Across 14 frontier LLMs and four web environments spanning information retrieval and shopping tasks, we show that an agent's actions and interaction timings, captured via a passive JavaScript tracker, are sufficient to identify the underlying model with up to 96% F1. We formalise this attack surface by demonstrating that classifiers trained on agent actions generalise across model sizes and families. We further show that strong classifiers can be trained from few interaction traces and that agent identity can be inferred early within an episode. Injecting randomised timing delays between actions substantially degrades classifier performance, but does not provide robust protection: a classifier retrained on delayed traces largely recovers performance. We release our harness and a labelled corpus of agent traces at https://github.com/KabakaWilliam/known_actions.
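To make the pipeline described above concrete, the following is a minimal sketch, not the released harness. The abstract does not specify the feature set or classifier, so everything here is an assumption for illustration: the action vocabulary `ACTIONS`, the trace format (a non-empty, time-ordered list of `(timestamp_ms, action)` pairs), the nearest-centroid classifier, and the uniform delay model in `add_random_delays`.

```python
import random
from statistics import mean, pstdev

# Hypothetical action vocabulary; the real tracker's event types may differ.
ACTIONS = ("click", "scroll", "type")

def features(trace):
    """Map a non-empty trace of (timestamp_ms, action) pairs to a feature vector:
    mean and std of inter-action gaps, then the fraction of each action type."""
    times = [t for t, _ in trace]
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    fractions = [sum(1 for _, a in trace if a == name) / len(trace) for name in ACTIONS]
    return [mean(gaps), pstdev(gaps)] + fractions

def fit_centroids(labelled_traces):
    """labelled_traces: list of (model_label, trace). Returns label -> mean feature vector."""
    grouped = {}
    for label, trace in labelled_traces:
        grouped.setdefault(label, []).append(features(trace))
    return {label: [mean(col) for col in zip(*vecs)] for label, vecs in grouped.items()}

def predict(centroids, trace):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    x = features(trace)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def add_random_delays(trace, max_delay_ms, rng):
    """Assumed defence model: add a uniform random delay before every action
    after the first; delays accumulate, so event order is preserved."""
    shift, delayed = 0.0, []
    for i, (t, action) in enumerate(trace):
        if i > 0:
            shift += rng.uniform(0, max_delay_ms)
        delayed.append((t + shift, action))
    return delayed
```

In this sketch the timing features are in milliseconds while the action fractions lie in [0, 1], so the distance is dominated by timing unless the features are rescaled. Retraining on delayed traces, as the abstract describes, would correspond to calling `fit_centroids` on traces that have already been passed through `add_random_delays`.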