Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering
August 21, 2025
Authors: Bolei He, Xinran He, Run Shao, Shanfu Shu, Xianwei Xue, Mingquan Cheng, Haifeng Li, Zhenhua Ling
cs.AI
Abstract
Large Language Models (LLMs) perform well in general QA but often struggle in
domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces
external knowledge but suffers from hallucinations and latency due to noisy
retrievals. Continued pretraining internalizes domain knowledge but is costly
and lacks cross-domain flexibility. We attribute this challenge to the
long-tail distribution of domain knowledge, which leaves partial yet useful
internal knowledge underutilized. We further argue that knowledge acquisition
should be progressive, mirroring human learning: first understanding concepts,
then applying them to complex reasoning. To address this, we propose Select2Know
(S2K), a cost-effective framework that internalizes domain knowledge through an
internal-external knowledge self-selection strategy and selective supervised
fine-tuning. We also introduce a structured reasoning data generation pipeline
and integrate GRPO to enhance reasoning ability. Experiments on medical, legal,
and financial QA benchmarks show that S2K consistently outperforms existing
methods and matches domain-pretrained LLMs with significantly lower cost.
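The abstract names two mechanisms without detailing them: an internal-external knowledge self-selection gate and selective supervised fine-tuning. The sketch below is an illustrative assumption of how such components could look, not the paper's actual algorithm: the self-selection gate answers from internal knowledge when the model is confident and falls back to retrieved (external) context otherwise, and "selective SFT" is read here as masking the training loss to low-confidence tokens so that only long-tail knowledge is internalized. The threshold values are hypothetical.

```python
# Hypothetical sketch of S2K-style components (assumptions, not the
# paper's published method).

def route_question(internal_confidence: float, tau: float = 0.7) -> str:
    """Self-selection gate: use the model's internal knowledge when its
    confidence clears the threshold, otherwise retrieve external context."""
    return "internal" if internal_confidence >= tau else "external"

def selective_loss_mask(token_logprobs: list[float],
                        threshold: float = -1.5) -> list[int]:
    """Selective-SFT mask: 1 = train on this token (model is unsure,
    i.e. low log-probability), 0 = skip (already-known knowledge)."""
    return [0 if lp >= threshold else 1 for lp in token_logprobs]

# Example: a confident answer stays internal; an unsure one triggers RAG.
print(route_question(0.92))   # confident -> answer from internal knowledge
print(route_question(0.31))   # unsure -> fall back to external retrieval

# Example: only the low-probability (long-tail) token gets SFT loss.
print(selective_loss_mask([-0.1, -2.4, -0.6]))
```

Under this reading, the gate avoids RAG latency and retrieval noise on questions the model already answers well, while the loss mask keeps fine-tuning cost focused on the underutilized long-tail knowledge the abstract highlights.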