Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

Abstract

We study two-level autoresearch for cooperation: an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). A researcher agent R (run as a coding agent) reads the inner-loop source code, edits system prompts, feedback functions, helper libraries, and iteration logic, runs evaluations, and decides what to keep, following the autoresearch paradigm. Across two games (Cleanup and Gathering), two policy-synthesizer LLMs, and two welfare objectives (utilitarian efficiency and Rawlsian maximin), the researcher reliably exceeds hand-designed baselines, sharply reduces run-to-run variance, and outperforms prompt-only optimization. The discovered pipelines are objective-dependent: only under maximin does the researcher inject an explicit fairness mechanism into synthesizer pipelines, a class of mechanism that is absent from its own objective-agnostic system prompt and from every efficiency-optimized pipeline. This supports an information-design reading in which the researcher chooses what to reveal to the boundedly rational synthesizer as a function of the welfare objective. Code at https://github.com/vicgalle/autoresearch-social-dilemmas.
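
To make the two welfare objectives and the keep-or-discard step concrete, the following is a minimal sketch, not the code from the repository. It assumes the utilitarian objective is the sum of per-agent returns and the Rawlsian maximin objective is the return of the worst-off agent; the paper's exact definitions may include normalization. The function names, the pipeline names, and the toy return values are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def utilitarian(returns: Sequence[float]) -> float:
    """Assumed utilitarian welfare: total return summed over agents."""
    return sum(returns)


def maximin(returns: Sequence[float]) -> float:
    """Assumed Rawlsian maximin welfare: return of the worst-off agent."""
    return min(returns)


def keep_if_better(
    candidates: Iterable[str],
    evaluate: Callable[[str], Sequence[float]],
    welfare: Callable[[Sequence[float]], float],
) -> Tuple[str, float]:
    """Greedy outer loop: evaluate each candidate pipeline and keep the
    one with the highest welfare score seen so far."""
    best_name, best_score = "", float("-inf")
    for name in candidates:
        score = welfare(evaluate(name))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical per-agent returns for three imagined pipelines.
# These numbers are invented for illustration, not results from the paper.
toy_returns: Dict[str, List[float]] = {
    "baseline": [10.0, 10.0, 10.0],
    "specialised": [30.0, 25.0, 2.0],
    "fair_rotation": [15.0, 14.0, 13.0],
}

if __name__ == "__main__":
    for objective in (utilitarian, maximin):
        winner, score = keep_if_better(
            toy_returns, toy_returns.__getitem__, objective
        )
        print(f"{objective.__name__}: keep {winner!r} (score {score})")
```

With these toy numbers the same loop keeps a different pipeline under each objective, which is the sense in which a discovered pipeline can be objective-dependent: the selection rule is unchanged and only the welfare function differs.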