Steering LLMs via Scalable Interactive Oversight
February 4, 2026
Authors: Enyu Zhou, Zhiheng Xi, Long Ma, Zhihao Zhang, Shihan Dou, Zhikai Lei, Guoteng Wang, Rui Zheng, Hang Yan, Tao Gui, Qi Zhang, Xuanjing Huang
cs.AI
Abstract
As Large Language Models increasingly automate complex, long-horizon tasks such as vibe coding, a supervision gap has emerged. While models excel at execution, users often struggle to guide them effectively due to insufficient domain expertise, the difficulty of articulating precise intent, and the inability to reliably validate complex outputs. This presents a critical challenge in scalable oversight: enabling humans to responsibly steer AI systems on tasks that surpass their own ability to specify or verify. To tackle this, we propose Scalable Interactive Oversight, a framework that decomposes complex intent into a recursive tree of manageable decisions to amplify human supervision. Rather than relying on open-ended prompting, our system elicits low-burden feedback at each decision node and recursively aggregates these signals into precise global guidance. Validated on web development tasks, our framework enables non-experts to produce expert-level Product Requirement Documents, achieving a 54% improvement in task alignment. Crucially, we demonstrate that this framework can be optimized via Reinforcement Learning using only online user feedback, offering a practical pathway for maintaining human control as AI scales.
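The core mechanism the abstract describes — decomposing intent into a tree of small decisions, eliciting a low-burden choice at each node, and recursively aggregating the choices into global guidance — can be illustrated with a minimal sketch. All names here (`DecisionNode`, `oversee`, the toy web-development tree) are hypothetical illustrations, not the paper's actual implementation:

```python
# Minimal sketch (hypothetical names, not the paper's implementation):
# decompose an intent into a tree of small decisions, collect one
# low-burden choice per node, and aggregate choices into a single spec.
from dataclasses import dataclass, field

@dataclass
class DecisionNode:
    question: str                     # one small, answerable decision
    options: list[str]                # low-burden choices shown to the user
    children: list["DecisionNode"] = field(default_factory=list)

def oversee(node: DecisionNode, choose) -> dict:
    """Recursively elicit a choice at each node and merge the answers."""
    spec = {node.question: choose(node.question, node.options)}  # local feedback
    for child in node.children:
        spec.update(oversee(child, choose))                      # recursive aggregation
    return spec

# Usage: a toy web-development intent tree answered by a scripted "user".
tree = DecisionNode("layout", ["single-page", "multi-page"], [
    DecisionNode("theme", ["light", "dark"]),
    DecisionNode("auth", ["none", "email login"]),
])
scripted = {"layout": "single-page", "theme": "dark", "auth": "email login"}
spec = oversee(tree, lambda q, opts: scripted[q])
# spec now maps each decision to the user's choice; in the framework, this
# aggregate would be compiled into a Product Requirement Document.
```

In this toy version each answer is a categorical pick, so aggregation is a dictionary merge; the point is only that global guidance emerges from many easy local judgments rather than one open-ended prompt.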