ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

Abstract

The transition from sequential to parallel computing is essential for modern high-performance applications but is hindered by the steep learning curve of concurrent programming. This challenge is magnified for irregular data structures (such as sparse graphs, unbalanced trees, and non-uniform meshes), where static scheduling fails and data dependencies are unpredictable. Current Large Language Models (LLMs) often fail catastrophically on these tasks, generating code plagued by subtle race conditions, deadlocks, and sub-optimal scaling. We bridge this gap with ParEVO, a framework designed to synthesize high-performance parallel algorithms for irregular data. Our contributions include: (1) the Parlay-Instruct Corpus, a curated dataset of 13,820 tasks synthesized via a "Critic-Refine" pipeline that explicitly filters for empirically performant algorithms that make effective use of Work-Span parallel primitives; (2) specialized DeepSeek, Qwen, and Gemini models fine-tuned to align probabilistic generation with the rigorous semantics of the ParlayLib library; and (3) an Evolutionary Coding Agent (ECA) that improves the "last mile" of correctness by iteratively repairing code using feedback from compilers, dynamic race detectors, and performance profilers. On the ParEval benchmark, ParEVO achieves an average 106x speedup (with a maximum of 1103x) across the suite, and a robust 13.6x speedup on complex irregular graph problems specifically, outperforming state-of-the-art commercial models. Furthermore, our evolutionary approach matches state-of-the-art expert human baselines, achieving up to a 4.1x speedup on specific highly irregular kernels. Source code and datasets are available at https://github.com/WildAlg/ParEVO.
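The abstract refers to Work-Span parallel primitives and to unbalanced structures without defining the cost model. As a minimal sketch, assuming the standard definitions (work is the total number of unit-cost operations, span is the longest chain of dependent operations), the following counts both for a divide-and-conquer reduction over `n` leaves. The names `reduce_cost`, `balanced`, and `skewed` are hypothetical and introduced only for illustration; this is not ParlayLib code and not the authors' implementation.

```python
from typing import Callable, Tuple


def reduce_cost(n: int, split: Callable[[int], int]) -> Tuple[int, int]:
    """Return (work, span) in unit-cost combine steps for reducing n leaves.

    split(n) gives the size of the left part. The two parts are assumed
    to be processed in parallel, so their spans combine with max().
    """
    if n == 1:
        return 0, 0  # a single leaf needs no combine step
    left = split(n)
    left_work, left_span = reduce_cost(left, split)
    right_work, right_span = reduce_cost(n - left, split)
    # One combine step joins the two partial results.
    return left_work + right_work + 1, max(left_span, right_span) + 1


def balanced(n: int) -> int:
    """Split evenly, as a balanced tree would (assumed for illustration)."""
    return n // 2


def skewed(n: int) -> int:
    """Peel off one element at a time, as a degenerate unbalanced tree would."""
    return 1


n = 64
balanced_work, balanced_span = reduce_cost(n, balanced)
skewed_work, skewed_span = reduce_cost(n, skewed)

print("balanced: work", balanced_work, "span", balanced_span)
print("skewed:   work", skewed_work, "span", skewed_span)
```

Both splits perform the same work (`n - 1` combines), but the skewed split has a span equal to its work, so it exposes no parallelism. This is one way to see why unbalanced inputs are harder to parallelize than regular ones.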

