
### Steering LLMs via Scalable Interactive Oversight

February 4, 2026
Authors: Enyu Zhou, Zhiheng Xi, Long Ma, Zhihao Zhang, Shihan Dou, Zhikai Lei, Guoteng Wang, Rui Zheng, Hang Yan, Tao Gui, Qi Zhang, Xuanjing Huang
cs.AI

Abstract

As Large Language Models increasingly automate complex, long-horizon tasks such as vibe coding, a supervision gap has emerged. While models excel at execution, users often struggle to guide them effectively due to insufficient domain expertise, the difficulty of articulating precise intent, and the inability to reliably validate complex outputs. This presents a critical challenge in scalable oversight: enabling humans to responsibly steer AI systems on tasks that surpass their own ability to specify or verify. To tackle this, we propose Scalable Interactive Oversight, a framework that decomposes complex intent into a recursive tree of manageable decisions to amplify human supervision. Rather than relying on open-ended prompting, our system elicits low-burden feedback at each node and recursively aggregates these signals into precise global guidance. Validated on a web development task, our framework enables non-experts to produce expert-level Product Requirement Documents, achieving a 54% improvement in alignment. Crucially, we demonstrate that this framework can be optimized via Reinforcement Learning using only online user feedback, offering a practical pathway for maintaining human control as AI scales.
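To make the core idea concrete, here is a minimal sketch of how a "recursive tree of manageable decisions" with per-node feedback might be aggregated into global guidance. This is an illustrative toy, not the authors' implementation: the `DecisionNode` structure, the example questions, and the flat-outline aggregation are all assumptions introduced for demonstration.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionNode:
    """One low-burden decision posed to the user (hypothetical structure)."""
    question: str
    choice: str                      # the user's lightweight feedback, e.g. a picked option
    children: list["DecisionNode"] = field(default_factory=list)

def aggregate(node: DecisionNode, depth: int = 0) -> str:
    """Recursively fold per-node choices into one global specification outline."""
    indent = "  " * depth
    lines = [f"{indent}- {node.question}: {node.choice}"]
    for child in node.children:
        lines.append(aggregate(child, depth + 1))
    return "\n".join(lines)

# Example: steering a web-development task toward a PRD-like specification.
# Questions and options here are invented for illustration.
root = DecisionNode("Site type", "portfolio", [
    DecisionNode("Color scheme", "dark"),
    DecisionNode("Pages", "home + projects", [
        DecisionNode("Projects layout", "grid"),
    ]),
])
spec = aggregate(root)
```

The point of the sketch is the interaction pattern: each node asks for a small, verifiable choice rather than an open-ended prompt, and recursion composes those choices into a precise document the user could never have authored in one shot.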