Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models

May 24, 2025
Authors: Haoyuan Sun, Jiaqi Wu, Bo Xia, Yifu Luo, Yifei Zhao, Kai Qin, Xufei Lv, Tiantian Zhang, Yongzhe Chang, Xueqian Wang
cs.AI

Abstract

In 2025, at a critical juncture in the pursuit of Artificial General Intelligence (AGI), reinforcement fine-tuning (RFT) has demonstrated significant potential for enhancing the reasoning capability of large language models (LLMs) and has led to cutting-edge AI models such as OpenAI-o1 and DeepSeek-R1. Moreover, the efficient application of RFT to enhance the reasoning capability of multimodal large language models (MLLMs) has attracted widespread attention from the community. In this position paper, we argue that reinforcement fine-tuning powers the reasoning capability of multimodal large language models. We first introduce the fundamental background knowledge that researchers interested in this field should be familiar with. We then summarize how RFT has improved the reasoning capability of MLLMs in five key points: diverse modalities, diverse tasks and domains, better training algorithms, abundant benchmarks, and thriving engineering frameworks. Finally, we propose five promising directions for future research that the community might consider. We hope that this position paper will provide valuable insights to the community at this pivotal stage in the advancement toward AGI. A summary of work on RFT for MLLMs is available at https://github.com/Sun-Haoyuan23/Awesome-RL-based-Reasoning-MLLMs.
