KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Abstract

Large Language Models (LLMs), particularly slow-thinking models, often exhibit severe hallucination, outputting incorrect content because they cannot accurately recognize their knowledge boundaries during reasoning. While Reinforcement Learning (RL) can enhance complex reasoning abilities, its outcome-oriented reward mechanism often lacks factual supervision over the thinking process, further exacerbating the hallucination problem. To address the high hallucination rate in slow-thinking models, we propose Knowledge-enhanced RL, KnowRL. KnowRL guides models to perform fact-based slow thinking by integrating a factuality reward, based on knowledge verification, into the RL training process, helping them recognize their knowledge boundaries. This targeted factual input during RL training enables the model to learn and internalize fact-based reasoning strategies. By directly rewarding adherence to facts within the reasoning steps, KnowRL fosters a more reliable thinking process. Experimental results on three hallucination evaluation datasets and two reasoning evaluation datasets demonstrate that KnowRL effectively mitigates hallucinations in slow-thinking models while maintaining their original strong reasoning capabilities. Our code is available at https://github.com/zjunlp/KnowRL.
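To make the idea of a factuality reward concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the reward formula, so everything here is an assumption for illustration: the function names `factuality_score` and `combined_reward`, the `fact_weight` parameter, the additive combination, and the use of a plain Python set as a stand-in for knowledge verification.

```python
from typing import Callable, Iterable


def factuality_score(claims: Iterable[str],
                     is_supported: Callable[[str], bool]) -> float:
    """Fraction of claims that the verifier marks as supported.

    Returns 0.0 when no claims are given (an arbitrary choice for this sketch).
    """
    claims = list(claims)
    if not claims:
        return 0.0
    return sum(1 for claim in claims if is_supported(claim)) / len(claims)


def combined_reward(answer_correct: bool,
                    claims: Iterable[str],
                    is_supported: Callable[[str], bool],
                    fact_weight: float = 0.5) -> float:
    """Outcome reward plus a weighted factuality term (hypothetical form)."""
    outcome = 1.0 if answer_correct else 0.0
    return outcome + fact_weight * factuality_score(claims, is_supported)


# Toy knowledge base standing in for a real verification step (assumption).
knowledge_base = {"Paris is the capital of France"}
reasoning_claims = [
    "Paris is the capital of France",
    "Lyon is the capital of France",
]
reward = combined_reward(True, reasoning_claims, knowledge_base.__contains__)
```

In this toy setup, a response with a correct final answer still earns less than the maximum when half of the claims in its reasoning are unsupported, which is the kind of process-level signal that an outcome-only reward cannot provide.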

