KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Abstract

Large Language Models (LLMs), particularly slow-thinking models, often exhibit severe hallucination: they output incorrect content because they cannot accurately recognize their knowledge boundaries during reasoning. While Reinforcement Learning (RL) can enhance complex reasoning abilities, its outcome-oriented reward mechanism often lacks factual supervision over the thinking process, which further exacerbates the hallucination problem. To address the high hallucination rate of slow-thinking models, we propose Knowledge-enhanced RL, KnowRL. KnowRL guides models to perform fact-based slow thinking by integrating a factuality reward, based on knowledge verification, into the RL training process, helping them recognize their knowledge boundaries. This targeted factual input during RL training enables the model to learn and internalize fact-based reasoning strategies. By directly rewarding adherence to facts within the reasoning steps, KnowRL fosters a more reliable thinking process. Experimental results on three hallucination evaluation datasets and two reasoning evaluation datasets demonstrate that KnowRL effectively mitigates hallucinations in slow-thinking models while maintaining their original strong reasoning capabilities. Our code is available at https://github.com/zjunlp/KnowRL.
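To make the idea of adding a factuality reward to an outcome-oriented reward more concrete, the following is a minimal illustrative sketch and not the released KnowRL implementation. The function names, the additive combination, the `fact_weight` parameter, and the toy verifier with its small fact set are all assumptions introduced here for illustration; the abstract does not specify the reward formula or the verification procedure.

```python
from typing import Callable, Iterable

# Hypothetical toy knowledge source; a real system would query an external knowledge base.
KNOWN_FACTS = {
    "paris is the capital of france",
    "water boils at 100 c at sea level",
}


def toy_verifier(claim: str) -> bool:
    """Assumed verifier: a claim counts as supported if it matches a known fact."""
    return claim.strip().lower() in KNOWN_FACTS


def factuality_score(claims: Iterable[str], is_supported: Callable[[str], bool]) -> float:
    """Fraction of claims marked as supported; 0.0 when there are no claims."""
    claim_list = list(claims)
    if not claim_list:
        return 0.0
    supported = sum(1 for claim in claim_list if is_supported(claim))
    return supported / len(claim_list)


def combined_reward(
    answer_correct: bool,
    claims: Iterable[str],
    is_supported: Callable[[str], bool],
    fact_weight: float = 0.5,
) -> float:
    """Outcome reward plus a weighted factuality term (assumed additive form)."""
    outcome = 1.0 if answer_correct else 0.0
    return outcome + fact_weight * factuality_score(claims, is_supported)


if __name__ == "__main__":
    reasoning_claims = [
        "Paris is the capital of France",
        "The moon is made of cheese",
    ]
    print(combined_reward(True, reasoning_claims, toy_verifier))
```

In this sketch, a response with a correct final answer but unsupported reasoning claims earns less than one whose reasoning is fully supported, which is the kind of process-level signal the abstract says outcome-only rewards lack.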
