ADaPT: As-Needed Decomposition and Planning with Language Models

Abstract

Large Language Models (LLMs) are increasingly being used for interactive decision-making tasks requiring planning and adapting to the environment. Recent works employ LLMs-as-agents in broadly two ways: iteratively determining the next action (iterative executors) or generating plans and executing sub-tasks using LLMs (plan-and-execute). However, these methods struggle with task complexity, as the inability to execute any sub-task may lead to task failure. To address these shortcomings, we introduce As-Needed Decomposition and Planning for complex Tasks (ADaPT), an approach that explicitly plans and decomposes complex sub-tasks as-needed, i.e., when the LLM is unable to execute them. ADaPT recursively decomposes sub-tasks to adapt to both task complexity and LLM capability. Our results demonstrate that ADaPT substantially outperforms established strong baselines, achieving success rates up to 28.3% higher in ALFWorld, 27% in WebShop, and 33% in TextCraft -- a novel compositional dataset that we introduce. Through extensive analysis, we illustrate the importance of multilevel decomposition and establish that ADaPT dynamically adjusts to the capabilities of the executor LLM as well as to task complexity.
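The as-needed recursion described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the `Executor` and `Planner` interfaces, the `max_depth` cutoff, and the rule that every sub-task must succeed are assumptions made for the example, and the toy executor and planner stand in for what would be LLM calls.

```python
from typing import Callable, List

# Hypothetical interfaces: in a real agent both would wrap LLM calls.
Executor = Callable[[str], bool]      # attempts a task, reports success
Planner = Callable[[str], List[str]]  # splits a task into sub-tasks


def adapt(task: str, executor: Executor, planner: Planner,
          depth: int = 1, max_depth: int = 3) -> bool:
    """Try to execute `task` directly; decompose only if execution fails."""
    if executor(task):
        return True                   # no decomposition needed
    if depth >= max_depth:
        return False                  # assumed cutoff to bound the recursion
    subtasks = planner(task)
    if not subtasks:
        return False
    # Assumption: the task succeeds only if every sub-task succeeds, in order.
    return all(adapt(sub, executor, planner, depth + 1, max_depth)
               for sub in subtasks)


# Toy stand-ins so the sketch runs without an LLM.
def toy_executor(task: str) -> bool:
    # Pretend the executor can only handle tasks of at most two words.
    return len(task.split()) <= 2


def toy_planner(task: str) -> List[str]:
    # Split the task description into two halves.
    words = task.split()
    mid = len(words) // 2
    return [" ".join(words[:mid]), " ".join(words[mid:])]


if __name__ == "__main__":
    print(adapt("craft a wooden pickaxe", toy_executor, toy_planner))
```

With these stand-ins, a task the executor can handle is never decomposed, while a harder task is split only as deep as needed. A more capable executor would therefore trigger fewer planner calls on the same task.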
