Getting Better at Working with You: Compiling User Corrections into Runtime Enforcement for Coding Agents

Abstract

Interactive LLM agents are becoming part of daily work, but they do not reliably become easier to work with over time: a correction remembered in one session may still be violated in the next. We study this gap between preference access and preference compliance. In tasks derived from anonymized real-user friction cases, Mem0 memory still leaves 57.5% of applicable preference checks violated. We introduce Test-time Rule Acquisition and Compiled Enforcement (TRACE), a drop-in skill-layer pipeline for coding-agent runtimes that mines user corrections, rewrites them as atomic rules, and compiles them into runtime checks that must pass before an agent completes future tasks. Unlike runtime checks written ahead of time by developers, TRACE skills come from the user's own chat corrections. We evaluate TRACE with simulated user-in-the-loop experiments on ClawArena coding-agent tasks and MemoryArena-derived memory-intensive tasks. On ClawArena, TRACE reduces held-out preference violation from 100.0% to 37.6% on in-distribution tasks and from 100.0% to 2.0% on out-of-distribution tasks. On MemoryArena-derived tasks, TRACE reduces in-distribution violation from 100.0% to 60.5% while matching or exceeding the strongest memory baseline on task pass rate. These results suggest that compiling corrections into runtime enforcement can address a repeated-friction failure mode that memory alone does not reliably solve, reducing the need for users to restate the same correction across future sessions. Experiment code is available at https://github.com/YujunZhou/TRACE_exp, and the deployable skill is available at https://github.com/YujunZhou/tellonce.
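To make the idea of compiling corrections into runtime checks concrete, the following is a minimal illustrative sketch, not the TRACE implementation. It assumes that a user correction has already been rewritten as an atomic "never do X" rule and that such a rule can be approximated by a regular expression over the agent's output. The names `Rule`, `compile_forbid_pattern`, `enforce`, `finalize`, and `RuleViolation`, as well as the two example rules, are hypothetical and do not come from the paper or its repositories.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rule:
    """One atomic rule distilled from a user correction (hypothetical schema)."""
    name: str
    description: str
    check: Callable[[str], bool]  # returns True when the output complies


class RuleViolation(Exception):
    """Raised when an output fails at least one compiled check."""


def compile_forbid_pattern(name: str, description: str, pattern: str) -> Rule:
    """Compile a 'never do X' rule into a regex check (an assumed, simplistic compiler)."""
    regex = re.compile(pattern)
    return Rule(name, description, lambda output: regex.search(output) is None)


def enforce(rules: List[Rule], output: str) -> List[str]:
    """Return the names of violated rules; an empty list means all checks pass."""
    return [rule.name for rule in rules if not rule.check(output)]


def finalize(rules: List[Rule], output: str) -> str:
    """Gate task completion: only return the output if every check passes."""
    violated = enforce(rules, output)
    if violated:
        raise RuleViolation(", ".join(violated))
    return output


# Two made-up corrections, already rewritten as atomic rules.
rules = [
    compile_forbid_pattern("no_print", "Use logging instead of print.", r"\bprint\("),
    compile_forbid_pattern("no_bare_except", "Never use a bare except.", r"except\s*:"),
]

draft = "try:\n    run()\nexcept:\n    print('failed')\n"
fixed = "try:\n    run()\nexcept ValueError:\n    logging.error('failed')\n"

print(enforce(rules, draft))  # names of the rules the draft violates
print(enforce(rules, fixed))  # empty when the revised output complies
```

In this sketch the gate sits between the agent's draft and task completion: `finalize` refuses a violating draft, which is the point at which an agent would be asked to revise instead of the user having to repeat the correction. Regex checks are only a stand-in here; the abstract does not specify what form the compiled checks take.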