FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration

Abstract

Scientific idea generation (SIG) is critical to AI-driven autonomous research, yet existing approaches are often constrained by a static retrieval-then-generation paradigm, which leads to homogeneous and insufficiently divergent ideas. In this work, we propose FlowPIE, a tightly coupled retrieval-generation framework that treats literature exploration and idea generation as a co-evolving process. FlowPIE expands literature trajectories via a flow-guided Monte Carlo Tree Search (MCTS) inspired by GFlowNets, using the quality of current ideas, as assessed by an LLM-based generative reward model (GRM), as a supervision signal to guide adaptive retrieval and construct a diverse, high-quality initial population. Based on this population, FlowPIE models idea generation as a test-time idea evolution process, applying selection, crossover, and mutation under the isolation island paradigm with GRM-based fitness computation to incorporate cross-domain knowledge. This effectively mitigates the information cocoons that arise from over-reliance on parametric knowledge and static literature. Extensive evaluations demonstrate that FlowPIE consistently produces ideas with higher novelty, feasibility, and diversity than strong LLM-based and agent-based frameworks, while enabling reward scaling at test time.
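The evolution stage described above (selection, crossover, and mutation across isolated islands, with a fitness score deciding who survives) can be illustrated with a generic island-model loop. The sketch below is not FlowPIE's implementation. It assumes, purely for illustration, that an idea is a numeric vector and that a toy function stands in for the GRM score. Every name in it (`Idea`, `toy_fitness`, `crossover`, `mutate`, `evolve_islands`, `migrate_every`) is hypothetical. The flow-guided MCTS retrieval stage is not covered.

```python
import random
from typing import Callable, List

Idea = List[float]  # hypothetical stand-in: an "idea" as a numeric vector


def toy_fitness(idea: Idea) -> float:
    """Stand-in for a GRM score (assumption): highest when every value is 0.5."""
    return -sum((x - 0.5) ** 2 for x in idea)


def crossover(a: Idea, b: Idea, rng: random.Random) -> Idea:
    """Uniform crossover: each component comes from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(idea: Idea, rng: random.Random, scale: float = 0.1) -> Idea:
    """Gaussian perturbation of every component; returns a new list."""
    return [x + rng.gauss(0.0, scale) for x in idea]


def evolve_islands(
    islands: List[List[Idea]],
    fitness: Callable[[Idea], float],
    generations: int = 30,
    migrate_every: int = 5,
    seed: int = 0,
) -> List[List[Idea]]:
    rng = random.Random(seed)
    islands = [list(pop) for pop in islands]  # do not reorder the caller's lists
    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            # Selection: keep the fitter half as parents (truncation selection).
            pop.sort(key=fitness, reverse=True)
            parents = pop[: max(2, len(pop) // 2)]
            # Refill the island with mutated crossovers of random parent pairs.
            children = [
                mutate(crossover(*rng.sample(parents, 2), rng), rng)
                for _ in range(len(pop) - len(parents))
            ]
            islands[i] = parents + children
        if gen % migrate_every == 0:
            # Migration: each island's worst member is replaced by a copy of
            # the best member of the neighbouring island (ring topology).
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
                pop[worst] = list(bests[i - 1])
    return islands
```

Because the parents are carried over unchanged, the best score on each island never decreases from one generation to the next. Occasional migration is the only channel through which material crosses between islands, which is the usual motivation for island models: populations diverge in isolation and exchange only their strongest candidates.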
