FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments

Abstract

Large Language Models are increasingly deployed as the decision-making core of autonomous agents capable of effecting change in external environments. Yet in conversational benchmarks, which simulate real-world, customer-centric issue-resolution scenarios, these agents frequently fail because incorrect decisions cascade into later turns. These challenges are particularly pronounced for open-source LLMs with smaller parameter counts, limited context windows, and constrained inference budgets, all of which contribute to increased error accumulation in agentic settings. To tackle these challenges, we present the Failure-Aware Meta-Agentic (FAMA) framework. FAMA operates in two stages: first, it analyzes failure trajectories from baseline agents to identify the most prevalent errors; second, it employs an orchestration mechanism that activates a minimal subset of specialized agents tailored to address these failures by injecting targeted context for the tool-use agent before the decision-making step. Experiments across open-source LLMs demonstrate performance gains of up to 27% across evaluation modes over standard baselines. These results highlight that targeted curation of context through specialized agents to address common failures is a valuable design principle for building reliable, multi-turn tool-use LLM agents that simulate real-world conversational scenarios.
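
The two stages described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the FAMA implementation: the abstract does not specify data formats, agent types, or activation rules, so every name here (`top_failure_modes`, `Specialist`, `SPECIALISTS`, `orchestrate`, `build_prompt`, the failure-mode labels, and the state keys `pending_write` and `known_ids`) is hypothetical. Stage 1 is reduced to counting labeled failure modes, and stage 2 to a rule-based selector that activates only specialists whose trigger matches the current state.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


def top_failure_modes(failed_trajectories: List[dict], k: int = 2) -> List[str]:
    """Stage 1 (assumed form): rank labeled failure modes by frequency."""
    counts = Counter(t["failure_mode"] for t in failed_trajectories)
    return [mode for mode, _ in counts.most_common(k)]


@dataclass
class Specialist:
    """Hypothetical specialized agent: a trigger plus a context generator."""
    trigger: Callable[[dict], bool]  # should this agent run for this state?
    advise: Callable[[dict], str]    # short context note for the tool-use agent


# Hypothetical registry keyed by failure mode; labels are illustrative only.
SPECIALISTS: Dict[str, Specialist] = {
    "policy_violation": Specialist(
        trigger=lambda s: s.get("pending_write", False),
        advise=lambda s: "Confirm with the user before any write action.",
    ),
    "wrong_tool_arguments": Specialist(
        trigger=lambda s: bool(s.get("known_ids")),
        advise=lambda s: "Known identifiers: " + ", ".join(sorted(s["known_ids"])),
    ),
}


def orchestrate(state: dict, prevalent: List[str]) -> List[str]:
    """Stage 2 (assumed form): activate only the specialists that target a
    prevalent failure mode and whose trigger fires for the current state."""
    notes = []
    for mode in prevalent:
        spec = SPECIALISTS.get(mode)
        if spec is not None and spec.trigger(state):
            notes.append(spec.advise(state))
    return notes


def build_prompt(base_prompt: str, notes: List[str]) -> str:
    """Inject the targeted context ahead of the decision-making step."""
    if not notes:
        return base_prompt
    bullet_list = "\n".join(f"- {n}" for n in notes)
    return base_prompt + "\n\n[Targeted context]\n" + bullet_list


# Toy data: six failed baseline runs with hand-assigned labels.
failures = [
    {"failure_mode": "policy_violation"},
    {"failure_mode": "policy_violation"},
    {"failure_mode": "wrong_tool_arguments"},
    {"failure_mode": "premature_termination"},
    {"failure_mode": "wrong_tool_arguments"},
    {"failure_mode": "policy_violation"},
]
prevalent = top_failure_modes(failures, k=2)
state = {"pending_write": True, "known_ids": []}
notes = orchestrate(state, prevalent)
prompt = build_prompt("You are a customer-support agent.", notes)
print(prompt)
```

In this toy run only the policy specialist fires, because no identifiers are known yet, so the tool-use agent receives a single extra line of context rather than the output of every specialist. A real orchestrator would likely be model-driven rather than rule-based; the rules here only stand in for the idea of selecting a minimal subset.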
