実行可能な関数型抽象化：高度な数学問題のための生成プログラムの推論

要旨

科学者はしばしば、特定の問題例から抽象的な手順を推論し、その抽象化を用いて新たに関連する例を生成します。例えば、システムの形式的な規則や特性をコード化したプログラムは、RL（手続き型環境）から物理学（シミュレーションエンジン）に至る幅広い分野で有用です。これらのプログラムは、パラメータ化（例えば、グリッドワールドの設定や初期物理条件）に基づいて異なる出力を実行する関数と見なすことができます。我々は、数学問題に対するこのようなプログラムを指すためにEFA（Executable Functional Abstraction）という用語を導入します。EFAに類似した構成は、モデルのストレステスト用の問題生成器として数学的推論に有用であることが示されています。しかし、これまでの研究は小学校レベルの数学（その単純な規則はプログラムに容易にコード化できる）に限定されており、高度な数学に対するEFAの生成はこれまで人間のエンジニアリングを必要としてきました。我々は、高度な数学問題に対するEFAの自動構築を探求します。EFAの自動構築タスクをプログラム合成タスクとして操作化し、LLMをシード数学問題とその段階的な解法に基づいて条件付け、シード問題の背後にある一般化された問題と解法クラスに忠実な候補EFAプログラムを生成するEFAGenを開発します。さらに、有効なEFAが持つべき特性を実行可能なユニットテストの観点で形式化し、これらのテストを検証可能な報酬として使用して、LLMがより優れたEFAの作成者になるように訓練する方法を示します。EFAGenによって構築されたEFAが、シード問題に忠実であり続け、学習可能な問題バリエーションを生成し、EFAGenが競技レベルの数学問題の多様なソースにわたってEFAを推論できることを実証します。最後に、モデルが作成したEFAの下流用途を示します。例えば、学習者が解くのが難しいまたは易しい問題バリエーションを見つけることや、データ生成などです。

English

Scientists often infer abstract procedures from specific instances of problems and use the abstractions to generate new, related instances. For example, programs encoding the formal rules and properties of a system have been useful in fields ranging from RL (procedural environments) to physics (simulation engines). These programs can be seen as functions which execute to different outputs based on their parameterizations (e.g., gridworld configuration or initial physical conditions). We introduce the term EFA (Executable Functional Abstraction) to denote such programs for math problems. EFA-like constructs have been shown to be useful for math reasoning as problem generators for stress-testing models. However, prior work has been limited to abstractions for grade-school math (whose simple rules are easy to encode in programs), while generating EFAs for advanced math has thus far required human engineering. We explore the automatic construction of EFAs for advanced math problems. We operationalize the task of automatically constructing EFAs as a program synthesis task, and develop EFAGen, which conditions an LLM on a seed math problem and its step-by-step solution to generate candidate EFA programs that are faithful to the generalized problem and solution class underlying the seed problem. Furthermore, we formalize properties any valid EFA must possess in terms of executable unit tests, and show how the tests can be used as verifiable rewards to train LLMs to become better writers of EFAs. We demonstrate that EFAs constructed by EFAGen behave rationally by remaining faithful to seed problems, produce learnable problem variations, and that EFAGen can infer EFAs across multiple diverse sources of competition-level math problems. Finally, we show downstream uses of model-written EFAs e.g. finding problem variations that are harder or easier for a learner to solve, as well as data generation.

実行可能な関数型抽象化：高度な数学問題のための生成プログラムの推論

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

要旨

Support