Teaching a Language Model to Speak the Language of Tools

June 29, 2025
Author: Simeon Emanuilov
cs.AI

Abstract

External tool integration through function-calling is essential for practical language model applications, yet most multilingual models lack reliable tool-use capabilities in non-English languages. Even state-of-the-art multilingual models struggle with determining when to use tools and generating the structured outputs required for function calls, often exhibiting language confusion when prompted in lower-resource languages. This work presents a methodology for adapting existing language models to enable robust tool use in any target language, using Bulgarian as a case study. The approach involves continued training of the BgGPT model series (2.6B, 9B, 27B parameters) on a novel bilingual dataset of 10,035 function-calling examples designed to support standardized protocols like MCP (Model Context Protocol). The research introduces TUCAN (Tool-Using Capable Assistant Navigator), which achieves up to 28.75% improvement in function-calling accuracy over base models while preserving core language understanding, as verified on established Bulgarian benchmarks. Beyond accuracy gains, TUCAN models demonstrate production-ready response formatting with clean, parsable function calls, contrasting with the verbose and inconsistent outputs of base models. The models, evaluation framework, and dataset are released to enable replication for other languages. This work demonstrates a practical approach for extending tool-augmented capabilities beyond English-centric systems.
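
To make the contrast between "clean, parsable function calls" and verbose output concrete, here is a minimal sketch of how a caller might check a model response. The JSON shape (`name` plus `arguments`), the `TOOLS` registry, and the `get_weather` tool are assumptions made for illustration; the abstract does not specify TUCAN's actual output format.

```python
import json

# Hypothetical tool registry (an assumption, not from the paper):
# tool name -> set of required argument names.
TOOLS = {"get_weather": {"city"}}


def parse_function_call(response: str):
    """Return (name, arguments) if the response is a clean, valid call, else None."""
    try:
        payload = json.loads(response.strip())
    except json.JSONDecodeError:
        # Free-form prose around the call makes the response unparsable.
        return None
    if not isinstance(payload, dict):
        return None
    name = payload.get("name")
    args = payload.get("arguments")
    if not isinstance(name, str) or name not in TOOLS:
        return None
    if not isinstance(args, dict):
        return None
    # Every required argument must be present.
    if not TOOLS[name].issubset(args):
        return None
    return name, args


# A response that contains only the structured call parses directly.
clean = '{"name": "get_weather", "arguments": {"city": "София"}}'
# A chatty response that merely mentions the tool does not.
verbose = 'Разбира се! Ще извикам get_weather с city="София".'

clean_result = parse_function_call(clean)
verbose_result = parse_function_call(verbose)
```

Under these assumptions, only the first response yields a usable call; the second would need brittle text extraction before any tool could be invoked, which is the practical cost of inconsistent formatting.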