SePO: System Prompt Optimization with a Self-Evolving Prompt Agent

Abstract

System prompt optimization improves agent behavior without modifying the underlying model, yielding human-readable, model-agnostic instructions. Existing methods build a prompt agent that refines task agents' system prompts, yet leave the prompt agent's own system prompt hand-engineered and fixed. We propose Self-Evolving Prompt Optimization (SePO), which treats the prompt agent's own system prompt as an optimization target alongside task agents' system prompts. SePO adopts a self-referential design: a single prompt agent improves both the task agents' system prompts and its own, using an open-ended evolutionary search that maintains an archive of candidate prompts as stepping stones. Training proceeds in two stages: pre-training evolves the prompt agent on a multi-task pool, and fine-tuning then applies it to a target task. Across five benchmarks spanning math (AIME'25), abstract reasoning (ARC-AGI-1), graduate-level science (GPQA), code generation (MBPP), and logic puzzles (Sudoku), SePO consistently outperforms Manual-CoT, TextGrad, and MetaSPO, improving average accuracy by 4.49 points over Manual-CoT. The prompt optimization skill acquired during pre-training also generalizes to tasks outside the pre-training mixture rather than merely memorizing per-task prompts.
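The self-referential, archive-based search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`Candidate`, `evolve`, `toy_rewrite`, `toy_evaluate`, `best`) is hypothetical, the LLM call and the benchmark evaluation are replaced by deterministic toy stand-ins, and uniform parent sampling and the way the two stages are chained are assumptions, since the abstract does not specify them.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    task_prompt: str   # system prompt given to the task agent
    agent_prompt: str  # system prompt of the prompt agent itself
    score: float       # evaluation score of task_prompt


def evolve(
    rewrite: Callable[[str, str], str],
    evaluate: Callable[[str], float],
    seed_task_prompt: str,
    seed_agent_prompt: str,
    steps: int,
    rng: random.Random,
) -> List[Candidate]:
    """Archive-based search: every candidate is kept as a possible stepping stone."""
    archive = [
        Candidate(seed_task_prompt, seed_agent_prompt, evaluate(seed_task_prompt))
    ]
    for _ in range(steps):
        # Assumption: parents are sampled uniformly from the archive.
        parent = rng.choice(archive)
        # Self-referential step: the prompt agent first rewrites its own
        # system prompt, then uses the result to rewrite the task prompt.
        new_agent_prompt = rewrite(parent.agent_prompt, parent.agent_prompt)
        new_task_prompt = rewrite(new_agent_prompt, parent.task_prompt)
        archive.append(
            Candidate(new_task_prompt, new_agent_prompt, evaluate(new_task_prompt))
        )
    return archive


def best(archive: List[Candidate]) -> Candidate:
    return max(archive, key=lambda c: c.score)


def toy_rewrite(agent_prompt: str, target: str) -> str:
    # Stand-in for an LLM call: appends a marker instead of generating text.
    return target + "+"


def toy_evaluate(task_prompt: str) -> float:
    # Stand-in for benchmark accuracy: rewards "+" markers, capped at 1.0.
    return min(1.0, task_prompt.count("+") / 5)


rng = random.Random(0)

# Stage 1 (pre-training): score candidates by their mean over a task pool.
pool = [toy_evaluate, toy_evaluate]
pre = evolve(
    toy_rewrite,
    lambda p: sum(f(p) for f in pool) / len(pool),
    "Solve the task.",
    "Improve the prompt.",
    steps=10,
    rng=rng,
)

# Stage 2 (fine-tuning): reuse the evolved agent prompt on a target task.
fine = evolve(
    toy_rewrite,
    toy_evaluate,
    "Solve the task.",
    best(pre).agent_prompt,
    steps=5,
    rng=rng,
)
print(best(fine).task_prompt, best(fine).score)
```

In a real setting, `rewrite` would call a language model conditioned on the prompt agent's system prompt, and `evaluate` would run the task agent on held-out benchmark examples. The archive keeps low-scoring candidates as well, so a later step can still branch from them.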