API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs

Abstract

There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As a result, there is tremendous interest in methods that can acquire sufficient quantities of training and test data involving calls to tools and APIs. Two lines of research have emerged as the predominant strategies for addressing this challenge. The first focuses on synthetic data generation techniques, while the second involves curating task-adjacent datasets that can be transformed into API- and tool-based tasks. In this paper, we focus on identifying, curating, and transforming existing datasets and, in turn, introduce API-BLEND, a large corpus for training and systematic testing of tool-augmented LLMs. The datasets mimic real-world scenarios involving API tasks such as API/tool detection, slot filling, and sequencing of the detected APIs. We demonstrate the utility of the API-BLEND dataset for both training and benchmarking.
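To make the three API tasks concrete, the following is a minimal sketch of how an intent- and slot-annotated dialogue might be turned into an API-sequence target. It is an illustration only, not the conversion pipeline used for API-BLEND: the `ApiCall` class, the `intent`/`slots` field names, the output string format, and the restaurant example are all hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ApiCall:
    """One detected API with its filled slots (hypothetical schema)."""
    name: str
    slots: Dict[str, str]


def to_api_sequence(annotated_turns: List[dict]) -> List[ApiCall]:
    """Map each annotated turn to an API call, preserving dialogue order.

    Assumes every turn carries an 'intent' (treated as the API name) and a
    'slots' dict (treated as the API arguments); both keys are illustrative.
    """
    return [
        ApiCall(name=turn["intent"], slots=dict(turn["slots"]))
        for turn in annotated_turns
    ]


def render(calls: List[ApiCall]) -> str:
    """Serialize the sequence as one target string a model could be trained to emit."""
    rendered = []
    for call in calls:
        args = ", ".join(f"{key}={value}" for key, value in call.slots.items())
        rendered.append(f"{call.name}({args})")
    return "; ".join(rendered)


# Hypothetical annotated dialogue: two turns, each with an intent and slots.
example_turns = [
    {"intent": "FindRestaurant", "slots": {"city": "Rome", "cuisine": "Italian"}},
    {"intent": "BookTable", "slots": {"time": "19:00"}},
]

example_calls = to_api_sequence(example_turns)
example_target = render(example_calls)
print(example_target)
```

In this sketch, the API names correspond to detection, the key-value arguments to slot filling, and the order of the calls to sequencing.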
