FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling

October 28, 2025
Authors: Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, Yang Liu, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Chenyi Zhuang, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Shi Gu
cs.AI

Abstract

Function calling (FC) empowers large language models (LLMs) and autonomous agents to interface with external tools, a critical capability for solving complex, real-world problems. As this ability becomes increasingly central to advanced AI systems, the need for high-quality, multi-turn training data to develop and refine it cannot be overstated. Existing data synthesis methods, such as random environment sampling or multi-agent role-playing, struggle to generate high-quality data in real-world environments. The practical challenges are threefold: targeted model training, isolation of the tool architecture, and multi-turn logical dependency. To address these structural deficiencies, we present FunReason-MT, a novel data synthesis framework for real-world multi-turn tool use. FunReason-MT overcomes the complexity barrier in multi-turn FC data by employing 1) Environment-API Graph Interactions to gather varied, high-quality trajectories, 2) Advanced Tool-Query Synthesis to simplify the construction of hard queries, and 3) Guided Iterative Chain for sophisticated chain-of-thought (CoT) generation. Evaluations on the Berkeley Function-Calling Leaderboard (BFCLv3) demonstrate the power of the framework: a 4B model trained on FunReason-MT-generated data achieves state-of-the-art performance among comparably sized models, outperforming most closed-source models. Further performance improvements on BFCLv4 confirm that FunReason-MT provides a reliable and robust data source for agentic learning.
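The report does not include an implementation here, but as a rough illustration of the first component, the sketch below shows one way an environment-API graph could drive multi-turn trajectory collection: tools are nodes, and an edge exists when one tool's outputs can satisfy another tool's inputs, so a walk over the graph yields a logically dependent multi-turn tool-call skeleton. All names in this sketch (ToolNode, APIGraph, sample_trajectory) are hypothetical and are not taken from the paper.

```python
# Minimal sketch, assuming a graph whose nodes are tools and whose edges encode
# "output of A can feed input of B". Not the authors' implementation.
import random
from dataclasses import dataclass, field


@dataclass
class ToolNode:
    name: str                                     # e.g. "search_flights"
    produces: set = field(default_factory=set)    # entity types this tool returns
    consumes: set = field(default_factory=set)    # entity types it needs as input


@dataclass
class APIGraph:
    tools: dict                                   # name -> ToolNode

    def successors(self, tool: ToolNode):
        """Tools whose inputs can be satisfied by `tool`'s outputs."""
        return [t for t in self.tools.values()
                if t.name != tool.name and t.consumes & tool.produces]


def sample_trajectory(graph: APIGraph, max_turns: int = 4):
    """Random walk over the API graph producing a multi-turn tool-call skeleton.

    Each step yields a (turn_index, tool_name) pair; downstream stages would
    attach user queries, arguments, and chain-of-thought to this skeleton.
    """
    current = random.choice(list(graph.tools.values()))
    trajectory = [(0, current.name)]
    for turn in range(1, max_turns):
        candidates = graph.successors(current)
        if not candidates:
            break                                 # dead end: nothing consumes current outputs
        current = random.choice(candidates)
        trajectory.append((turn, current.name))
    return trajectory


if __name__ == "__main__":
    graph = APIGraph(tools={
        "search_flights": ToolNode("search_flights", produces={"flight_id"}),
        "book_flight":    ToolNode("book_flight", consumes={"flight_id"},
                                   produces={"booking_id"}),
        "send_receipt":   ToolNode("send_receipt", consumes={"booking_id"}),
    })
    print(sample_trajectory(graph))  # e.g. [(0, 'search_flights'), (1, 'book_flight'), (2, 'send_receipt')]
```

In the full framework described by the abstract, such skeletons would only be the starting point: the Advanced Tool-Query Synthesis and Guided Iterative Chain stages would then supply the hard user queries and the CoT reasoning attached to each turn.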