FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling

October 28, 2025
Authors: Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, Yang Liu, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Chenyi Zhuang, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Shi Gu
cs.AI

Abstract

Function calling (FC) enables large language models (LLMs) and autonomous agents to interface with external tools, a critical capability for solving complex, real-world problems. As this ability becomes increasingly central to advanced AI systems, the need for high-quality, multi-turn training data to develop and refine it has become pressing. Existing data synthesis methods, such as random environment sampling or multi-agent role-playing, struggle to generate high-quality data in real-world environments. The practical challenges are threefold: targeted model training, isolation of tool architecture, and multi-turn logical dependency. To address these structural deficiencies, we present FunReason-MT, a novel data synthesis framework for real-world multi-turn tool use. FunReason-MT overcomes the complexity barrier in multi-turn FC data by employing 1) Environment-API Graph Interactions to gather varied, high-quality trajectories, 2) Advanced Tool-Query Synthesis to simplify the construction of hard queries, and 3) Guided Iterative Chain for sophisticated chain-of-thought (CoT) generation. Evaluations on the Berkeley Function-Calling Leaderboard (BFCLv3) demonstrate the power of our framework: a 4B model trained on FunReason-MT-generated data achieves state-of-the-art performance among comparable-sized models and outperforms most closed-source models. Further performance improvements on BFCLv4 confirm that FunReason-MT provides a reliable and robust data source for agentic learning.
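To make the "multi-turn logical dependency" challenge concrete, the following minimal Python sketch shows a two-turn function-calling trajectory in which the second call can only be built from the result of the first. It is an illustration under stated assumptions, not the FunReason-MT pipeline: the tools `search_flights` and `book_flight`, the `execute` helper, and the trajectory record format are all hypothetical and do not come from the paper or from BFCL.

```python
import json

# Hypothetical tools; names, signatures, and return values are illustrative only.
def search_flights(origin: str, destination: str) -> dict:
    return {"flight_id": f"{origin}-{destination}-001", "price": 320}

def book_flight(flight_id: str) -> dict:
    return {"status": "booked", "flight_id": flight_id}

TOOLS = {"search_flights": search_flights, "book_flight": book_flight}

def execute(call: dict) -> dict:
    """Dispatch one call of the form {"name": ..., "arguments": {...}}."""
    return TOOLS[call["name"]](**call["arguments"])

trajectory = []

# Turn 1: the user asks for a flight, so the model emits a search call.
call_1 = {"name": "search_flights",
          "arguments": {"origin": "SFO", "destination": "JFK"}}
result_1 = execute(call_1)
trajectory.append({"turn": 1, "call": call_1, "result": result_1})

# Turn 2: the user says "book it". The argument is not in the user's message;
# it has to be carried over from the result of turn 1.
call_2 = {"name": "book_flight",
          "arguments": {"flight_id": result_1["flight_id"]}}
result_2 = execute(call_2)
trajectory.append({"turn": 2, "call": call_2, "result": result_2})

print(json.dumps(trajectory, indent=2))
```

A training example for turn 2 is only valid if the turn 1 call and its result are present in the context, which is one reason why sampling tool calls independently of each other tends to produce weak multi-turn data.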