Notes2Skills: From Lab Notebooks to Certainty-Aware Scientific Agent Skills

June 10, 2026
Authors: Shi Liu, Jiayao Chen, Chengwei Qin, Yanqing Hu, Jufan Zhang, Linyi Yang
cs.AI

Abstract

Scientific discovery workflows rely heavily on lab notes, where researchers record observations, interpret uncertain results, and plan follow-up experiments. Unlike the polished final results presented in publications, these notes preserve evolving scientific reasoning and the author's uncertainty, giving AI an opportunity to engage with scientific exploration more comprehensively and deeply. However, most prior work on scientific text focuses on papers, protocols, or structured databases, leaving informal laboratory notes underexplored as inputs to AI agents for science. This gap matters because lab notes often intermingle validated observations, tentative judgments, and possible next experimental steps within the same passage. If these signals are conflated, an AI agent may mistake uncertain scientific judgments for confirmed conclusions or executable actions. To address this, we present Notes2Skills, a two-stage framework for turning lab notebooks into verifiable skills for scientific AI agents while preserving the author's certainty. Across seven conditions and three wet-lab sessions, Notes2Skills is the only configuration that neither mistakes uncertain notes for firm instructions nor discards firm ones. We show that certainty preservation is the missing piece between lab notebooks and reliable agent skills, opening a path toward safer AI co-scientist systems.