KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Abstract

Large Language Models (LLMs), particularly slow-thinking models, often exhibit severe hallucination: they output incorrect content because they cannot accurately recognize their knowledge boundaries during reasoning. While Reinforcement Learning (RL) can enhance complex reasoning abilities, its outcome-oriented reward mechanism often lacks factual supervision over the thinking process, which further exacerbates the hallucination problem. To address severe hallucination in slow-thinking models, we propose Knowledge-enhanced RL (KnowRL). KnowRL integrates a factuality reward, based on knowledge verification, into the RL training process, guiding models to perform fact-based slow thinking and helping them recognize their knowledge boundaries. This targeted factual input during RL training enables the model to learn and internalize fact-based reasoning strategies. By directly rewarding adherence to facts within the reasoning steps, KnowRL fosters a more reliable thinking process. Experimental results on three hallucination evaluation datasets and two reasoning evaluation datasets demonstrate that KnowRL effectively mitigates hallucinations in slow-thinking models while maintaining their original strong reasoning capabilities. Our code is available at https://github.com/zjunlp/KnowRL.
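
To make the idea of a factuality reward concrete, the following is a minimal illustrative sketch, not the KnowRL implementation. The abstract does not specify how claims are extracted, how they are verified, or how the factuality reward is combined with the outcome reward, so everything here is an assumption: claims are modeled as (subject, relation, object) triples, verification is plain set membership in a toy knowledge base, and the two rewards are added with a hypothetical weight `factuality_weight`.

```python
from typing import List, Set, Tuple

# Hypothetical representation: one factual claim from a reasoning trace,
# written as a (subject, relation, object) triple.
Claim = Tuple[str, str, str]

# Hypothetical toy knowledge base used for verification in this sketch only.
KNOWLEDGE_BASE: Set[Claim] = {
    ("Paris", "capital_of", "France"),
    ("Canberra", "capital_of", "Australia"),
}


def factuality_reward(claims: List[Claim], kb: Set[Claim]) -> float:
    """Return the fraction of claims that the knowledge base supports.

    An empty trace gets 0.0 here; that is a design choice of this sketch.
    """
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if claim in kb)
    return supported / len(claims)


def combined_reward(
    answer_correct: bool,
    claims: List[Claim],
    kb: Set[Claim],
    factuality_weight: float = 0.5,  # assumed value, not taken from the paper
) -> float:
    """Add a weighted factuality term to an outcome-only reward."""
    outcome = 1.0 if answer_correct else 0.0
    return outcome + factuality_weight * factuality_reward(claims, kb)


# Two traces that both reach the correct final answer.
faithful_trace = [("Paris", "capital_of", "France")]
mixed_trace = [
    ("Paris", "capital_of", "France"),
    ("Sydney", "capital_of", "Australia"),  # not supported by the toy KB
]

faithful_score = combined_reward(True, faithful_trace, KNOWLEDGE_BASE)
mixed_score = combined_reward(True, mixed_trace, KNOWLEDGE_BASE)
```

Under these assumptions, an outcome-only reward would score both traces identically because both final answers are correct, whereas the combined reward ranks the trace with an unsupported intermediate claim lower. This is the kind of supervision over the thinking process that the abstract describes, expressed in the simplest possible form.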
