FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

Abstract

Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, which yields homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. Inspired by GFlowNets, FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS), using the quality of the current ideas, as assessed by an LLM-based generative reward model (GRM), as a supervision signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Starting from this population, FlowPIE models idea generation as a test-time idea evolution process: it applies selection, crossover, and mutation under the isolation island paradigm, with GRM-based fitness computation, to incorporate cross-domain knowledge. This effectively mitigates the information cocoons that arise from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility, and diversity than strong LLM-based and agent-based frameworks, while enabling reward scaling at test time.
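
To make the evolutionary stage concrete, the following is a minimal sketch of a generic island-model loop with selection, crossover, and mutation. It is not FlowPIE's implementation: ideas are reduced to toy lists of concept tokens, `toy_fitness` is a hypothetical stand-in for a GRM score, and the ring migration step, the truncation selection, and all function and parameter names are assumptions made for illustration.

```python
import random
from typing import Callable, List

Idea = List[str]  # toy stand-in: an idea is a fixed-length list of concept tokens


def toy_fitness(idea: Idea) -> float:
    """Hypothetical stand-in for a GRM score: fraction of distinct concepts."""
    return len(set(idea)) / len(idea)


def crossover(a: Idea, b: Idea, rng: random.Random) -> Idea:
    """One-point crossover between two ideas of equal length (>= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(idea: Idea, vocabulary: List[str], rng: random.Random,
           rate: float = 0.2) -> Idea:
    """Replace each token with a random vocabulary token with probability `rate`."""
    return [rng.choice(vocabulary) if rng.random() < rate else tok for tok in idea]


def evolve_islands(islands: List[List[Idea]], vocabulary: List[str],
                   fitness: Callable[[Idea], float], generations: int = 10,
                   migrate_every: int = 3, seed: int = 0) -> List[List[Idea]]:
    """Evolve each island separately; periodically migrate the best ideas.

    Assumes every island holds at least two ideas.
    """
    rng = random.Random(seed)
    islands = [list(pop) for pop in islands]  # do not modify the caller's lists
    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            ranked = sorted(pop, key=fitness, reverse=True)
            parents = ranked[: max(2, len(ranked) // 2)]  # truncation selection
            children = [ranked[0]]  # elitism: carry over the island's best idea
            while len(children) < len(pop):
                a, b = rng.sample(parents, 2)
                children.append(mutate(crossover(a, b, rng), vocabulary, rng))
            islands[i] = children
        if gen % migrate_every == 0:
            # Ring migration (assumed): each island's best idea replaces the
            # worst idea of the next island in the ring.
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
                pop[worst] = bests[i - 1]
    return islands
```

Keeping the islands separate between migrations is what lets each sub-population drift toward different regions of the idea space; in a real system the fitness call would be an LLM-based evaluation rather than a token count, and the operators would act on text rather than token lists.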