CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
February 11, 2025
Authors: Junlong Li, Daya Guo, Dejian Yang, Runxin Xu, Yu Wu, Junxian He
cs.AI
Abstract
Reasoning is a fundamental capability of Large Language Models. While prior
research predominantly focuses on enhancing narrow skills like math or code
generation, improving performance on many other reasoning tasks remains
challenging due to sparse and fragmented training data. To address this issue,
we propose CodeI/O, a novel approach that systematically condenses diverse
reasoning patterns inherently embedded in contextually grounded code by
transforming the original code into a code input-output prediction format. By
training models to predict inputs/outputs given code and test cases entirely in
natural language as Chain-of-Thought (CoT) rationales, we expose them to
universal reasoning primitives -- like logic flow planning, state-space
searching, decision tree traversal, and modular decomposition -- while
decoupling structured reasoning from code-specific syntax and preserving
procedural rigor. Experimental results demonstrate that CodeI/O leads to consistent
improvements across symbolic, scientific, logic, math & numerical, and
commonsense reasoning tasks. By matching the existing ground-truth outputs or
re-executing the code with predicted inputs, we can verify each prediction and
further enhance the CoTs through multi-turn revision, resulting in CodeI/O++
and achieving higher performance. Our data and models are available at
https://github.com/hkust-nlp/CodeIO.
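
The verification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy function `collatz_steps` and the helpers `verify_output_prediction` and `verify_input_prediction` are hypothetical names chosen for illustration. The sketch assumes that an output prediction is checked against the ground-truth output obtained by executing the code, and that an input prediction is checked by re-executing the code with the predicted input.

```python
def collatz_steps(n: int) -> int:
    """Toy reference function (hypothetical), standing in for a code sample."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def verify_output_prediction(func, test_input: dict, predicted_output) -> bool:
    """Output prediction: compare with the ground-truth output from execution."""
    return func(**test_input) == predicted_output


def verify_input_prediction(func, predicted_input: dict, target_output) -> bool:
    """Input prediction: re-execute the code with the predicted input.

    Any input that reproduces the target output is accepted, since several
    inputs may map to the same output. Inputs that raise are rejected.
    """
    try:
        return func(**predicted_input) == target_output
    except Exception:
        return False


if __name__ == "__main__":
    # Output prediction for the input n=6.
    print(verify_output_prediction(collatz_steps, {"n": 6}, 8))
    # Input prediction for the target output 8: both 6 and 40 are valid.
    print(verify_input_prediction(collatz_steps, {"n": 40}, 8))
    # An invalid predicted input is rejected instead of crashing.
    print(verify_input_prediction(collatz_steps, {"n": 0}, 8))
```

In a revision loop of the kind the abstract mentions, a failed check would presumably be turned into feedback for another attempt at the rationale. That part is omitted here because the abstract does not specify its details.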