Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

Abstract

We study two-level autoresearch for cooperation: an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). Following the autoresearch paradigm, a researcher agent R (run as a coding agent) reads the inner-loop source code, edits system prompts, feedback functions, helper libraries, and iteration logic, runs evaluations, and decides which changes to keep. Across two games (Cleanup and Gathering), two policy-synthesizer LLMs, and two welfare objectives (utilitarian efficiency and Rawlsian maximin), the researcher reliably exceeds hand-designed baselines, sharply reduces run-to-run variance, and outperforms prompt-only optimization. The discovered pipelines are objective-dependent: only under maximin does the researcher inject an explicit fairness mechanism into the synthesizer pipeline, a class of mechanism that is absent both from the researcher's own objective-agnostic system prompt and from every efficiency-optimized pipeline. This supports an information-design reading in which the researcher chooses what to reveal to the boundedly rational synthesizer as a function of the welfare objective. Code is available at https://github.com/vicgalle/autoresearch-social-dilemmas.
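The contrast between the two welfare objectives can be made concrete with a minimal sketch. It assumes each evaluation yields one scalar return per agent. The function names and the example returns are hypothetical, and the paper's exact definitions (for instance, any normalization by episode length or number of agents) may differ.

```python
from typing import Sequence


def utilitarian_welfare(returns: Sequence[float]) -> float:
    """Total return summed over agents (assumed, unnormalized form)."""
    return float(sum(returns))


def maximin_welfare(returns: Sequence[float]) -> float:
    """Return of the worst-off agent, i.e. the Rawlsian criterion."""
    return float(min(returns))


# Hypothetical per-agent returns for two candidate pipelines.
pipeline_a = [10.0, 10.0, 0.0]  # higher total, but one agent gets nothing
pipeline_b = [6.0, 6.0, 6.0]    # lower total, evenly shared

for name, returns in [("A", pipeline_a), ("B", pipeline_b)]:
    print(name, utilitarian_welfare(returns), maximin_welfare(returns))
```

In this toy case the utilitarian objective ranks pipeline A first while maximin ranks pipeline B first, which illustrates why an outer loop optimizing the two objectives can end up keeping different pipeline edits.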