APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets
June 26, 2024
Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong
cs.AI
Abstract
The advancement of function-calling agent models requires diverse, reliable,
and high-quality datasets. This paper presents APIGen, an automated data
generation pipeline designed to synthesize verifiable high-quality datasets for
function-calling applications. We leverage APIGen and collect 3,673 executable
APIs across 21 different categories to generate diverse function-calling
datasets in a scalable and structured manner. Each data point in our dataset is
verified through three hierarchical stages: format checking, actual function
execution, and semantic verification, ensuring its reliability and
correctness. We demonstrate that models trained with our curated datasets, even
with only 7B parameters, can achieve state-of-the-art performance on the
Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models.
Moreover, our 1B model achieves exceptional performance, surpassing
GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000
high-quality entries, aiming to advance the field of function-calling agents.
The dataset is available on Hugging Face:
https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k and the
project homepage is https://apigen-pipeline.github.io/
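
The three hierarchical verification stages named in the abstract (format checking, actual function execution, semantic verification) can be illustrated with a minimal sketch. This is not the authors' implementation: the sample layout (`query`, `calls`, `name`, `arguments`), the `API_REGISTRY` table, the toy `get_weather` function, and the keyword-based semantic check are all assumptions made for illustration, and the abstract does not say how each stage is actually implemented.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def get_weather(city: str) -> dict:
    """Toy stand-in for a real executable API (hypothetical)."""
    return {"city": city, "temp_c": 21}


# Hypothetical registry mapping API names to callables.
API_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def check_format(raw: str) -> Optional[dict]:
    """Stage 1: parse the JSON and check required fields and types."""
    try:
        sample = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(sample, dict) or not isinstance(sample.get("query"), str):
        return None
    calls = sample.get("calls")
    if not isinstance(calls, list) or not calls:
        return None
    for call in calls:
        if not isinstance(call, dict):
            return None
        if call.get("name") not in API_REGISTRY:
            return None
        if not isinstance(call.get("arguments"), dict):
            return None
    return sample


def execute_calls(sample: dict) -> Optional[List[Any]]:
    """Stage 2: actually run every call; any exception rejects the sample."""
    results = []
    for call in sample["calls"]:
        try:
            results.append(API_REGISTRY[call["name"]](**call["arguments"]))
        except Exception:
            return None
    return results


def semantic_check(sample: dict, results: List[Any]) -> bool:
    """Stage 3 placeholder: every argument value must appear in the query.

    A real pipeline would need a much stronger check; this keyword
    heuristic only marks where that check would sit.
    """
    if any(result is None for result in results):
        return False
    query = sample["query"].lower()
    return all(
        str(value).lower() in query
        for call in sample["calls"]
        for value in call["arguments"].values()
    )


def verify(raw: str) -> str:
    """Run the stages in order and report where a sample is rejected."""
    sample = check_format(raw)
    if sample is None:
        return "rejected:format"
    results = execute_calls(sample)
    if results is None:
        return "rejected:execution"
    if not semantic_check(sample, results):
        return "rejected:semantic"
    return "accepted"
```

Because the stages run in order of increasing cost, a malformed sample is discarded before any function is executed, and only samples that execute cleanly reach the semantic check.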