DEI: Diversity in Evolutionary Inference for Quality-Diversity Search

Abstract

We present DEI (Diversity in Evolutionary Inference), a distributed Quality-Diversity (QD) search framework that assigns heterogeneous large language models (LLMs) as mutation operators across peer nodes, which communicate through non-blocking collective operations. Unlike homogeneous parallel search, which replicates a single model's inductive biases across all workers, DEI treats each LLM's distinct creative prior as a complementary source of behavioral novelty. We extend the Digital Red Queen framework under DEI: at the end of each round, nodes share their local best solutions, which seed the next round's population. This creates cross-model adversarial pressure that drives robustness beyond what intra-model self-play achieves. We evaluate on Core War, a competitive programming benchmark in which Redcode warrior programs battle inside a simulated machine. At an equal total LLM-call budget, a four-node heterogeneous ensemble (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, and Claude Haiku 4.5) achieves a 124% higher merged-archive QD-Score (45.90 vs. 20.46) and 28% higher relative coverage (80.6% vs. 63.0% of cells) than a single-node baseline. The heterogeneous ensemble also outperforms an equal-budget homogeneous ensemble on QD-Score, coverage, and held-out solution generality across all four model families. These results provide the first empirical evidence that model diversity, not merely parallelism, is the key driver of gains in distributed LLM-based QD search.
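
The reported gains are relative improvements over the single-node baseline, and they can be checked with a short sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes the conventional QD definitions (QD-Score as the sum of fitness over filled archive cells, coverage as the fraction of cells filled) and a merge that keeps the best solution per cell. The names `merge_archives`, `qd_score`, `coverage`, and `relative_gain`, and the two-dimensional cell key, are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a cell is a pair of behavior-bin indices,
# and an archive maps each filled cell to the fitness of its best solution.
Cell = Tuple[int, int]
Archive = Dict[Cell, float]


def merge_archives(archives: Iterable[Archive]) -> Archive:
    """Merge per-node archives, keeping the highest fitness in each cell (assumed rule)."""
    merged: Archive = {}
    for archive in archives:
        for cell, fitness in archive.items():
            if cell not in merged or fitness > merged[cell]:
                merged[cell] = fitness
    return merged


def qd_score(archive: Archive) -> float:
    """Sum of fitness over filled cells (conventional QD-Score definition)."""
    return sum(archive.values())


def coverage(archive: Archive, total_cells: int) -> float:
    """Fraction of the behavior grid that holds at least one solution."""
    return len(archive) / total_cells


def relative_gain(new: float, baseline: float) -> float:
    """Percentage improvement of `new` over `baseline`."""
    return (new - baseline) / baseline * 100.0


# Figures reported in the abstract.
qd_gain = relative_gain(45.90, 20.46)        # roughly 124 percent
coverage_gain = relative_gain(80.6, 63.0)    # roughly 28 percent
```

Note that the 28% coverage figure is a relative gain; the absolute difference is 17.6 percentage points (80.6% minus 63.0%).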