Language Server CLI Empowers Language Agents with Process Rewards
October 27, 2025
Authors: Yifan Zhang, Lanser Contributors
cs.AI
Abstract
Large language models routinely hallucinate APIs and mislocalize edits, while
language servers compute verified, IDE-grade facts about real code. We present
Lanser-CLI, a CLI-first orchestration layer that pins and mediates a Language
Server Protocol (LSP) server for coding agents and CI, exposing deterministic,
replayable workflows. Our position is that language servers provide not only
structural information (definitions, references, types, diagnostics) but also
an actionable process reward: machine-checked, step-wise signals that align an
agent's planning loop with program reality. In this work, Lanser-CLI
contributes: (i) a robust addressing scheme beyond brittle "file:line:col" via
a Selector DSL (symbolic, AST-path, and content-anchored selectors) with a
principled relocation algorithm; (ii) deterministic Analysis Bundles that
normalize Language Server responses and capture environment/capability metadata
with stable content hashes; (iii) a safety envelope for mutating operations
(rename, code actions) with preview, workspace jails, and Git-aware,
transactional apply; and (iv) a process-reward functional derived from Language
Server facts (diagnostic deltas, disambiguation confidence, and safe-apply
checks) that is computable online and replayable offline. We formalize
determinism under frozen snapshots and establish a monotonicity property for
the process reward, making it suitable for process supervision and
counterfactual analysis. Project Page:
https://github.com/yifanzhang-pro/lanser-cli
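
Two of the ideas in the abstract, stable content hashes over normalized server responses and a reward derived from diagnostic deltas, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from Lanser-CLI: the function names `bundle_hash` and `diagnostic_delta_reward`, the dictionary layout of a diagnostic (`file`, `line`, `col`, `message`, `severity`), and the scoring rule (net reduction in error count) are all hypothetical. The actual bundle format and reward functional may differ.

```python
import hashlib
import json


def bundle_hash(response: dict) -> str:
    """Return a SHA-256 hex digest of a response after canonicalizing it.

    Hypothetical normalization: diagnostics are sorted by position and
    message so that the order in which a server emits them does not
    change the hash, and JSON keys are serialized in sorted order.
    """
    normalized = dict(response)
    normalized["diagnostics"] = sorted(
        response.get("diagnostics", []),
        key=lambda d: (d["file"], d["line"], d["col"], d["message"]),
    )
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diagnostic_delta_reward(before: list, after: list) -> int:
    """Net reduction in error-level diagnostics across one agent step.

    Hypothetical scoring rule: positive when the step leaves fewer
    errors than it started with, negative when it leaves more.
    """
    def error_count(diagnostics: list) -> int:
        return sum(1 for d in diagnostics if d["severity"] == "error")

    return error_count(before) - error_count(after)


if __name__ == "__main__":
    err = {"file": "a.py", "line": 1, "col": 0,
           "message": "undefined name", "severity": "error"}
    warn = {"file": "a.py", "line": 3, "col": 4,
            "message": "unused import", "severity": "warning"}

    # Same facts, different emission order: the hashes should match.
    first = bundle_hash({"server": "example", "diagnostics": [err, warn]})
    second = bundle_hash({"diagnostics": [warn, err], "server": "example"})
    print(first == second)

    # A step that removes the only error yields a positive score.
    print(diagnostic_delta_reward([err, warn], [warn]))
```

Because both functions depend only on their inputs, the same recorded responses produce the same hash and the same score when replayed later, which is the property the abstract refers to as computable online and replayable offline.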