LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls

Abstract

Augmenting Large Language Models (LLMs) with external tools enables them to execute complex, multi-step tasks. However, tool learning is hampered by static synthetic data pipelines in which data generation and model training run as two separate, non-interactive processes. This approach fails to focus adaptively on a model's specific weaknesses and allows noisy labels to persist, degrading training efficiency. We introduce LoopTool, a fully automated, model-aware data evolution framework that closes this loop by tightly integrating data synthesis and model training. LoopTool iteratively refines both the data and the model through three synergistic modules: (1) Greedy Capability Probing (GCP) diagnoses the model's mastered and failed capabilities; (2) Judgement-Guided Label Verification (JGLV) uses an open-source judge model to find and correct annotation errors, progressively purifying the dataset; and (3) Error-Driven Data Expansion (EDDE) generates new, challenging samples based on identified failures. This closed-loop process operates within a cost-effective, open-source ecosystem, eliminating dependence on expensive closed-source APIs. Experiments show that our 8B model trained with LoopTool significantly surpasses its 32B data generator and achieves new state-of-the-art results on the BFCL-v3 and ACEBench benchmarks for its scale. Our work demonstrates that closed-loop, self-refining data pipelines can dramatically enhance the tool-use capabilities of LLMs.
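The three-module loop described above can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`Sample`, `probe`, `verify`, `expand`, `loop_iteration`) is hypothetical, the model, judge, generator, and trainer are stand-in callables, exact string match is assumed as the correctness check, and keeping all samples between rounds is an assumption rather than something the abstract specifies.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Sample:
    query: str
    label: str  # reference tool call; may be noisy


# Hypothetical stand-ins for the real components.
Model = Callable[[str], str]            # query -> predicted tool call
Judge = Callable[[str, str, str], str]  # (query, label, prediction) -> "label" or "prediction"
Generator = Callable[[Sample], Sample]  # failure seed -> new, related sample
Trainer = Callable[[Model, List[Sample]], Model]


def probe(model: Model, data: List[Sample]) -> Tuple[List[Sample], List[Tuple[Sample, str]]]:
    """GCP-like step: split data into mastered samples and (sample, prediction) mismatches."""
    mastered, mismatched = [], []
    for s in data:
        pred = model(s.query)
        if pred == s.label:
            mastered.append(s)
        else:
            mismatched.append((s, pred))
    return mastered, mismatched


def verify(judge: Judge, mismatched: List[Tuple[Sample, str]]) -> Tuple[List[Sample], List[Sample]]:
    """JGLV-like step: the judge decides whether the label or the prediction is right."""
    relabeled, true_errors = [], []
    for s, pred in mismatched:
        if judge(s.query, s.label, pred) == "prediction":
            relabeled.append(Sample(s.query, pred))  # label was wrong: correct it
        else:
            true_errors.append(s)                    # model was wrong: keep as a failure seed
    return relabeled, true_errors


def expand(generator: Generator, true_errors: List[Sample]) -> List[Sample]:
    """EDDE-like step: synthesize one new sample per genuine model failure."""
    return [generator(s) for s in true_errors]


def loop_iteration(model: Model, judge: Judge, generator: Generator,
                   train: Trainer, data: List[Sample]) -> Tuple[Model, List[Sample]]:
    """One round: probe, verify labels, expand from failures, then retrain."""
    mastered, mismatched = probe(model, data)
    relabeled, true_errors = verify(judge, mismatched)
    new_data = mastered + relabeled + true_errors + expand(generator, true_errors)
    return train(model, new_data), new_data


# Toy run: the "model" uppercases the query, and the label "X" is treated as known noise.
toy_data = [Sample("a", "A"), Sample("b", "X"), Sample("c", "C!")]
toy_model: Model = lambda q: q.upper()
toy_judge: Judge = lambda q, label, pred: "prediction" if label == "X" else "label"
toy_generator: Generator = lambda s: Sample(s.query + "'", s.label)
toy_train: Trainer = lambda m, d: m  # no-op placeholder for fine-tuning

_, evolved = loop_iteration(toy_model, toy_judge, toy_generator, toy_train, toy_data)
```

In the toy run, the noisy label on query "b" is replaced by the model's prediction, while the genuine failure on query "c" is kept and used as the seed for one additional sample. In a real system the no-op trainer would be replaced by an actual fine-tuning step, and the loop would be repeated on the evolved dataset.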
