Democratizing Reasoning Ability: Tailored Learning from Large Language Model

October 20, 2023
Authors: Zhaoyang Wang, Shaohan Huang, Yuxuan Liu, Jiahai Wang, Minghui Song, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
cs.AI

Abstract

Large language models (LLMs) exhibit impressive emergent abilities in natural language processing, but their democratization is hindered by huge computation requirements and their closed-source nature. Recent research on advancing open-source smaller LMs by distilling knowledge from black-box LLMs has obtained promising results in instruction-following ability. However, reasoning ability, which is more challenging to foster, remains relatively unexplored. In this paper, we propose a tailored learning approach that distills such reasoning ability into smaller LMs to facilitate the democratization of this exclusive reasoning ability. Rather than merely employing the LLM as a data annotator, we exploit its potential as a reasoning teacher by building an interactive multi-round learning paradigm. This paradigm enables the student to expose its deficiencies to the black-box teacher, which can then provide customized training data in return. Further, to exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from its self-made mistakes. Thanks to seamless integration with the multi-round learning paradigm, learning from self-reflection and learning from the LLM are both tailored to the student's learning status. Comprehensive experiments and analysis on mathematical and commonsense reasoning tasks demonstrate the effectiveness of our method. The code will be available at https://github.com/Raibows/Learn-to-Reason.
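
The control flow of the interactive multi-round paradigm described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `ToyStudent`, `toy_teacher`, `multi_round_learning`, and the `budget` parameter are hypothetical names introduced here, the "student" is a lookup table rather than a language model, the "teacher" simply returns corrected answers, and self-reflection is reduced to the student recording its own wrong answers. The sketch only shows the loop of exposing mistakes, receiving tailored data, and retraining.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Example = Tuple[str, str]        # (question, gold answer)
Mistake = Tuple[str, str, str]   # (question, student's wrong answer, gold answer)


@dataclass
class ToyStudent:
    """Hypothetical stand-in for a smaller LM: it memorizes what it is trained on."""
    memory: Dict[str, str] = field(default_factory=dict)
    reflections: List[Mistake] = field(default_factory=list)

    def answer(self, question: str) -> str:
        return self.memory.get(question, "unknown")

    def reflect(self, mistakes: List[Mistake]) -> None:
        # Placeholder for self-reflection: keep a record of self-made mistakes.
        self.reflections.extend(mistakes)

    def train(self, data: List[Example]) -> None:
        for question, gold in data:
            self.memory[question] = gold


def toy_teacher(mistakes: List[Mistake], budget: int = 2) -> List[Example]:
    """Hypothetical black-box teacher: returns corrected data for at most
    `budget` of the mistakes the student exposed in this round."""
    return [(question, gold) for question, _wrong, gold in mistakes[:budget]]


def multi_round_learning(student: ToyStudent, dataset: List[Example],
                         rounds: int) -> List[int]:
    """Run the loop and return the number of student mistakes seen in each round."""
    history: List[int] = []
    for _ in range(rounds):
        # 1. The student attempts the questions, exposing its deficiencies.
        mistakes = [(q, student.answer(q), gold)
                    for q, gold in dataset if student.answer(q) != gold]
        history.append(len(mistakes))
        if not mistakes:
            break
        # 2. The student records its own mistakes (self-reflection placeholder).
        student.reflect(mistakes)
        # 3. The teacher returns training data tailored to those mistakes.
        tailored = toy_teacher(mistakes)
        # 4. The student trains on the tailored data before the next round.
        student.train(tailored)
    return history


dataset = [("1+1", "2"), ("2+2", "4"), ("3+3", "6"), ("4+4", "8")]
history = multi_round_learning(ToyStudent(), dataset, rounds=5)
print(history)  # [4, 2, 0]
```

Because the toy teacher corrects only two mistakes per round, the student needs several rounds to cover the dataset, which is why each round's training data depends on the student's current state rather than being fixed in advance.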