Teaching a Language Model to Speak the Language of Tools
June 29, 2025
Author: Simeon Emanuilov
cs.AI
Abstract
External tool integration through function-calling is essential for practical
language model applications, yet most multilingual models lack reliable
tool-use capabilities in non-English languages. Even state-of-the-art
multilingual models struggle with determining when to use tools and generating
the structured outputs required for function calls, often exhibiting language
confusion when prompted in lower-resource languages. This work presents a
methodology for adapting existing language models to enable robust tool use in
any target language, using Bulgarian as a case study. The approach involves
continued training of the BgGPT model series (2.6B, 9B, 27B parameters) on a
novel bilingual dataset of 10,035 function-calling examples designed to support
standardized protocols like MCP (Model Context Protocol). The research
introduces TUCAN (Tool-Using Capable Assistant Navigator), which achieves up to
28.75% improvement in function-calling accuracy over base models while
preserving core language understanding, as verified on established Bulgarian
benchmarks. Beyond accuracy gains, TUCAN models demonstrate production-ready
response formatting with clean, parsable function calls, contrasting with the
verbose and inconsistent outputs of base models. The models, evaluation
framework, and dataset are released to enable replication for other languages.
This work demonstrates a practical approach for extending tool-augmented
capabilities beyond English-centric systems.
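
To make the contrast between clean, parsable function calls and verbose outputs concrete, here is a minimal sketch in Python. It assumes a JSON call format with `name` and `arguments` fields, a hypothetical `get_weather` tool, and a simple exact-match scoring rule; none of these are taken from the paper, and the actual TUCAN output format and evaluation framework may differ.

```python
import json
from typing import Optional


def parse_function_call(response: str) -> Optional[dict]:
    """Return the call as a dict if the whole response is a JSON function call, else None.

    The {"name": ..., "arguments": {...}} shape is an assumed format for illustration.
    """
    try:
        payload = json.loads(response.strip())
    except json.JSONDecodeError:
        return None  # extra prose around the JSON makes the response unparsable
    if not isinstance(payload, dict):
        return None
    name = payload.get("name")
    arguments = payload.get("arguments")
    if not isinstance(name, str) or not isinstance(arguments, dict):
        return None
    return {"name": name, "arguments": arguments}


def call_accuracy(responses: list, expected: list) -> float:
    """Fraction of responses whose parsed call equals the expected call.

    An expected value of None means no tool should be called for that prompt.
    """
    correct = sum(parse_function_call(r) == e for r, e in zip(responses, expected))
    return correct / len(expected)


# Hypothetical expected call for a Bulgarian prompt asking about the weather in Sofia.
expected_call = {"name": "get_weather", "arguments": {"city": "София"}}

# A clean response: the call and nothing else.
clean = '{"name": "get_weather", "arguments": {"city": "София"}}'

# A verbose response: the same call wrapped in conversational text.
verbose = 'Разбира се! Ще извикам функцията: ' + clean

# A plain reply to a greeting, where no tool call is expected.
plain = "Здравейте! Как мога да помогна?"

if __name__ == "__main__":
    print(parse_function_call(clean))
    print(parse_function_call(verbose))
    print(call_accuracy([clean, verbose, plain], [expected_call, expected_call, None]))
```

Under this strict rule the verbose response counts as a failure even though it contains the right call, which is one way a downstream system that expects machine-readable output would experience it.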