From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI
June 12, 2026
Authors: Yongheng Zhang, Ziang Liu, Jiaxuan Zhu, Shuai Wang, Xiangqi Chen, Haojing Huang, Jiayi Kuang, Siyu Chen, Ao Shen, Hao Wu, Qiufeng Wang, Qian-Wen Zhang, Junnan Dong, Wenhao Jiang, Ying Shen, Hai-Tao Zheng, Yinghui Li, Di Yin, Xing Sun, Philip S. Yu
cs.AI
Abstract
Large Language Models (LLMs) are undergoing a fundamental transformation from conversational generators into integrated AI systems capable of reasoning, action, memory, and self-improvement. We conceptualize this transition as a shift from Chatbot to Digital Colleague: from conversational answers to persistent work. We organize the transition along two tightly coupled dimensions. First, at the cognitive core level, LLMs are advancing from Chatbot-era "fast thinking" systems driven by next-token prediction toward Thinking LLMs that leverage inference-time computation, Chain-of-Thought reasoning, reflection, process supervision, and reinforcement learning to support more deliberate and reliable cognition. Second, at the tool-augmented task execution level, LLMs are progressing from tool-calling Agents that invoke external resources in an ad hoc manner toward OpenClaw-style workstation systems equipped with persistent Workspaces, Skills, verification loops, and governance. The "Workspace + Skill" paradigm turns episodic tool use into colleague-like collaboration through state persistence, reusable procedures, task closure, and experience reuse. We also examine how data construction shifts from instruction-response pairs to State-Action-Observation trajectories, and how evaluation moves from static benchmarks to sandboxed, auditable, self-evolving AI ecosystems.
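
To make the contrast between instruction-response pairs and State-Action-Observation trajectories concrete, the following is a minimal Python sketch. It is not the authors' data format: the class names (`InstructionResponsePair`, `Step`, `Trajectory`), the field names, and the tool-call strings are all hypothetical, chosen only to illustrate how a trajectory records workspace state alongside each action and its result.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InstructionResponsePair:
    """Chatbot-style example: one prompt, one answer, no environment state."""
    instruction: str
    response: str


@dataclass
class Step:
    """One State-Action-Observation step (hypothetical field names)."""
    state: dict[str, Any]   # snapshot of the workspace before the action
    action: str             # e.g. a tool call or skill invocation
    observation: str        # what the environment returned


@dataclass
class Trajectory:
    """A task recorded as an ordered sequence of steps."""
    task: str
    steps: list[Step] = field(default_factory=list)

    def record(self, state: dict[str, Any], action: str, observation: str) -> None:
        # Deep-copy the state so later workspace changes do not rewrite history.
        self.steps.append(Step(copy.deepcopy(state), action, observation))


# A single pair captures only the final answer.
pair = InstructionResponsePair("Summarize the report.", "The report covers ...")

# A trajectory captures how a persistent workspace evolves across actions.
workspace = {"files": []}
traj = Trajectory(task="write report")
traj.record(workspace, "create_file('report.md')", "ok")
workspace["files"].append("report.md")
traj.record(workspace, "run_check('report.md')", "1 warning")
```

In this sketch each step stores its own snapshot of the workspace, so the record can be replayed or audited step by step, which is the kind of information a single instruction-response pair does not carry.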