ReGAL: Refactoring Programs to Discover Generalizable Abstractions

January 29, 2024
Authors: Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
cs.AI

Abstract

While large language models (LLMs) are increasingly being used for program synthesis, they lack the global view needed to develop useful abstractions; they generally predict programs one at a time, often repeating the same functionality. Generating redundant code from scratch is both inefficient and error-prone. To address this, we propose Refactoring for Generalizable Abstraction Learning (ReGAL), a gradient-free method for learning a library of reusable functions via code refactorization, i.e. restructuring code without changing its execution output. ReGAL learns from a small set of existing programs, iteratively verifying and refining its abstractions via execution. We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains. On three datasets (LOGO graphics generation, Date reasoning, and TextCraft, a Minecraft-based text game), both open-source and proprietary LLMs improve in accuracy when predicting programs with ReGAL functions. For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on graphics, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains. Our analysis reveals ReGAL's abstractions encapsulate frequently-used subroutines as well as environment dynamics.
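To make the refactor-verify-refine loop described above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: `run` stands in for a program executor and `llm_refactor` for an LLM call that proposes shared helper functions plus a rewritten program, neither of which is the paper's actual interface.

```python
# Minimal sketch of a ReGAL-style, gradient-free abstraction-learning loop.
# Assumptions (hypothetical, not the paper's API):
#   run(program, library)          -> executes a program with the helper library available
#   llm_refactor(program, library) -> (new_helpers: dict, rewritten_program)

def regal_refactor(programs, run, llm_refactor, n_rounds=3):
    """Propose shared helpers via refactoring; keep only those that
    preserve each program's execution output."""
    library = {}  # helper name -> helper source
    for _ in range(n_rounds):
        for prog in programs:
            original_output = run(prog, library)
            # Ask the LLM to factor repeated functionality into helpers.
            helpers, rewritten = llm_refactor(prog, library)
            candidate = {**library, **helpers}
            try:
                # Verify: refactoring must not change the execution output.
                if run(rewritten, candidate) == original_output:
                    library = candidate  # refine the library with verified helpers
            except Exception:
                continue  # discard abstractions that fail to execute
    return library
```

Consistent with the abstract, the learned library would then be supplied to an LLM at prediction time, so newly generated programs can call the shared helpers instead of regenerating the same functionality from scratch.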