Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering

Abstract

Large Language Models (LLMs) perform well on general question answering (QA) but often struggle in domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces external knowledge but suffers from hallucinations and added latency caused by noisy retrieval. Continued pretraining internalizes domain knowledge but is costly and lacks cross-domain flexibility. We attribute this challenge to the long-tail distribution of domain knowledge, which leaves partial yet useful internal knowledge underutilized. We further argue that knowledge acquisition should be progressive, mirroring human learning: first understanding concepts, then applying them to complex reasoning. To address this, we propose Select2Know (S2K), a cost-effective framework that internalizes domain knowledge through an internal-external knowledge self-selection strategy and selective supervised fine-tuning. We also introduce a structured reasoning data generation pipeline and integrate GRPO to enhance reasoning ability. Experiments on medical, legal, and financial QA benchmarks show that S2K consistently outperforms existing methods and matches domain-pretrained LLMs at significantly lower cost.
