Case2Code: Learning Inductive Reasoning with Synthetic Data

Abstract

Complex reasoning is one of the most impressive abilities of large language models (LLMs). Most LLMs are skilled at deductive reasoning, using techniques such as chain-of-thought prompting or iterative tool use to solve challenging tasks step by step. In this paper, we focus on evaluating and teaching LLMs to perform inductive reasoning, in which a model must infer underlying rules by observing examples or sequential transformations. However, collecting large-scale and diverse human-generated inductive data is challenging. We therefore focus on data synthesis in the code domain and propose the Case2Code task, which exploits the expressiveness and correctness of programs. Specifically, we collect a diverse set of executable programs, synthesize input-output transformations for each program, and require LLMs to infer the underlying code implementations from the synthetic I/O cases. We first evaluate representative LLMs on the synthesized Case2Code task and show that case-to-code induction is challenging for them. We then synthesize large-scale Case2Code training samples to train LLMs to perform inductive reasoning. Experimental results show that such induction training not only improves in-distribution Case2Code performance but also enhances various coding abilities of the trained LLMs, demonstrating the great potential of learning inductive reasoning via synthetic data.
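To make the task concrete, the following is a minimal sketch of how one Case2Code sample might be assembled and checked. It is an illustration under stated assumptions, not the authors' pipeline: the helper names (`synthesize_cases`, `build_prompt`, `is_consistent`), the prompt wording, and the toy `reference` program are all hypothetical.

```python
from typing import Any, Callable, List, Tuple

# One case pairs positional arguments with the observed output (or error text).
Case = Tuple[tuple, Any]


def run_safely(program: Callable[..., Any], args: tuple) -> Any:
    """Execute the program; report an exception as text instead of raising."""
    try:
        return program(*args)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


def synthesize_cases(program: Callable[..., Any], inputs: List[tuple]) -> List[Case]:
    """Run the reference program on each input to obtain input-output cases."""
    return [(args, run_safely(program, args)) for args in inputs]


def build_prompt(cases: List[Case], func_name: str = "f") -> str:
    """Format the cases as an induction prompt (hypothetical wording)."""
    lines = [f"Write a Python function `{func_name}` consistent with these examples:"]
    for args, result in cases:
        arg_text = ", ".join(repr(a) for a in args)
        lines.append(f"{func_name}({arg_text}) -> {result!r}")
    return "\n".join(lines)


def is_consistent(candidate: Callable[..., Any], cases: List[Case]) -> bool:
    """Check that a candidate implementation reproduces every observed case."""
    return all(run_safely(candidate, args) == expected for args, expected in cases)


def reference(xs: list) -> int:
    """Toy hidden rule (an assumption for illustration): sum of even elements."""
    return sum(x for x in xs if x % 2 == 0)


cases = synthesize_cases(reference, [([1, 2, 3, 4],), ([],), ([5, 7],)])
prompt = build_prompt(cases)
```

In this sketch the model would see only `prompt`, and a generated function would count as correct if `is_consistent` returns `True` for it. Executing the reference program is what makes the synthetic cases trustworthy without human labeling.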
