Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering
August 21, 2025
Authors: Bolei He, Xinran He, Run Shao, Shanfu Shu, Xianwei Xue, Mingquan Cheng, Haifeng Li, Zhenhua Ling
cs.AI
Abstract
Large Language Models (LLMs) perform well in general QA but often struggle in
domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces
external knowledge but suffers from hallucinations and latency due to noisy
retrievals. Continued pretraining internalizes domain knowledge but is costly
and lacks cross-domain flexibility. We attribute this challenge to the
long-tail distribution of domain knowledge, which leaves partial yet useful
internal knowledge underutilized. We further argue that knowledge acquisition
should be progressive, mirroring human learning: first understanding concepts,
then applying them to complex reasoning. To address this, we propose Select2Know
(S2K), a cost-effective framework that internalizes domain knowledge through an
internal-external knowledge self-selection strategy and selective supervised
fine-tuning. We also introduce a structured reasoning data generation pipeline
and integrate GRPO to enhance reasoning ability. Experiments on medical, legal,
and financial QA benchmarks show that S2K consistently outperforms existing
methods and matches domain-pretrained LLMs with significantly lower cost.
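The internal-external self-selection idea can be illustrated with a toy routing rule. The confidence score, threshold, and fallback order below are illustrative assumptions for exposition, not the paper's actual selection mechanism:

```python
from typing import Optional

def self_select(internal_conf: float,
                retrieved_passage: Optional[str],
                threshold: float = 0.7) -> str:
    """Route a question to internal or external knowledge.

    internal_conf: hypothetical confidence the model assigns to its
    own (parametric) answer; the 0.7 threshold is an assumption.
    """
    if internal_conf >= threshold:
        return "internal"   # model's own knowledge is trusted
    if retrieved_passage is not None:
        return "external"   # fall back to retrieved (RAG) knowledge
    return "internal"       # nothing retrieved; answer from parameters

# Examples of the routing decision:
print(self_select(0.9, None))             # -> internal
print(self_select(0.3, "retrieved doc"))  # -> external
```

The point of such routing is cost: confident questions skip retrieval entirely, avoiding the noise and latency the abstract attributes to RAG.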