DEI: Diversity in Evolutionary Inference for Quality-Diversity Search

Abstract

We present DEI (Diversity in Evolutionary Inference), a distributed Quality-Diversity (QD) search framework that assigns heterogeneous large language models (LLMs) as mutation operators to peer nodes, which communicate through non-blocking collective operations. Unlike homogeneous parallel search, which replicates a single model's inductive biases across all workers, DEI treats each LLM's distinct creative prior as a complementary source of behavioral novelty. We extend the Digital Red Queen framework with DEI: at the end of each round, nodes share their locally best solutions to seed the next round's population. This creates cross-model adversarial pressure that drives robustness beyond what intra-model self-play achieves. We evaluate DEI on Core War, a competitive programming benchmark in which Redcode warrior programs battle inside a simulated machine. At an equal total LLM-call budget, a four-node heterogeneous ensemble (GPT-5.4-mini, Claude Sonnet 4.6, GPT-5.2, and Claude Haiku 4.5) achieves a 124% higher merged-archive QD-Score (45.90 vs. 20.46) and 28% higher coverage (80.6% vs. 63.0% of cells) than a single-node baseline. The heterogeneous ensemble also outperforms an equal-budget homogeneous ensemble on QD-Score, coverage, and held-out solution generality across all four model families. These results provide the first empirical evidence that model diversity, not merely parallelism, is the key driver of gains in distributed LLM-based QD search.
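To make the reported metrics concrete, the following is a minimal sketch of how a merged-archive QD-Score and coverage could be computed. It assumes the definitions commonly used in the QD literature (QD-Score as the sum of elite fitness over filled cells, coverage as the fraction of cells filled) and a keep-the-best-per-cell merge; the abstract does not spell out the exact definitions, so the cell representation, the merge rule, and all function names here are illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: a cell is a 2-D behavior-descriptor bin,
# and an archive maps each filled cell to the best fitness found there.
Cell = Tuple[int, int]
Archive = Dict[Cell, float]


def merge_archives(archives: Iterable[Archive]) -> Archive:
    """Merge per-node archives, keeping the highest-fitness elite per cell (assumed rule)."""
    merged: Archive = {}
    for archive in archives:
        for cell, fitness in archive.items():
            if cell not in merged or fitness > merged[cell]:
                merged[cell] = fitness
    return merged


def qd_score(archive: Archive) -> float:
    """Sum of elite fitness over all filled cells (common QD-Score definition)."""
    return sum(archive.values())


def coverage(archive: Archive, total_cells: int) -> float:
    """Fraction of the grid's cells that hold at least one solution."""
    return len(archive) / total_cells


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Toy archives from two nodes on a 2x2 grid (made-up values, not results from the paper).
node_a: Archive = {(0, 0): 0.5, (0, 1): 0.25}
node_b: Archive = {(0, 1): 0.75, (1, 1): 0.25}
merged = merge_archives([node_a, node_b])

print(qd_score(merged))                 # 1.5
print(coverage(merged, total_cells=4))  # 0.75

# Relative gains recomputed from the figures quoted in the abstract.
print(f"{relative_gain(45.90, 20.46):.0f}%")  # 124%
print(f"{relative_gain(80.6, 63.0):.0f}%")    # 28%
```

Under these assumptions, the 28% coverage figure is a relative gain: the absolute difference between 80.6% and 63.0% is 17.6 percentage points.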