CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance
February 4, 2025
Authors: Yongchao Chen, Yilun Hao, Yueying Liu, Yang Zhang, Chuchu Fan
cs.AI
Abstract
Existing methods fail to effectively steer Large Language Models (LLMs)
between textual reasoning and code generation, leaving symbolic computing
capabilities underutilized. We introduce CodeSteer, an effective method for
guiding LLM code/text generation. We construct a comprehensive benchmark
SymBench comprising 37 symbolic tasks with adjustable complexity and also
synthesize datasets of 12k multi-round guidance/generation trajectories and
5.5k guidance comparison pairs. We fine-tune the Llama-3-8B model with a newly
designed multi-round supervised fine-tuning (SFT) and direct preference
optimization (DPO). The resulting model, CodeSteerLLM, augmented with the
proposed symbolic and self-answer checkers, effectively guides the code/text
generation of larger models. Augmenting GPT-4o with CodeSteer raises its
average performance score from 53.3 to 86.4, even outperforming the existing
best LLMs, OpenAI o1 (82.7), o1-preview (74.8), and DeepSeek R1 (76.8), across all
37 tasks (28 seen, 9 unseen). Trained for GPT-4o, CodeSteer demonstrates
superior generalizability, providing an average 41.8 performance boost on
Claude, Mistral, and GPT-3.5. CodeSteer-guided LLMs fully harness symbolic
computing to maintain strong performance on highly complex tasks. Models,
datasets, and code are available at
https://github.com/yongchao98/CodeSteer-v1.0.
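The abstract describes a small steering model that guides a larger model's choice between code and text over multiple rounds. One way to picture that loop is the minimal sketch below. It is not the authors' implementation: the names `steer`, `Turn`, `toy_guide`, and `toy_solve`, the `"text"`/`"code"`/`"finalize"` guidance strings, and the round limit are all hypothetical, and the symbolic and self-answer checkers are omitted because the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List

FINALIZE = "finalize"  # hypothetical signal that the guide accepts the current answer


@dataclass
class Turn:
    guidance: str  # what the steering model asked for ("text" or "code")
    answer: str    # what the task model produced under that guidance


def steer(
    question: str,
    guide: Callable[[str, List[Turn]], str],
    solve: Callable[[str, str], str],
    max_rounds: int = 5,
) -> str:
    """Alternate guidance and generation until the guide finalizes or rounds run out."""
    history: List[Turn] = []
    answer = ""
    for _ in range(max_rounds):
        guidance = guide(question, history)
        if guidance == FINALIZE:
            break
        answer = solve(question, guidance)
        history.append(Turn(guidance, answer))
    return answer


# Toy stand-ins for the two models (assumptions for illustration only).
def toy_guide(question: str, history: List[Turn]) -> str:
    if not history:
        return "text"      # first try plain textual reasoning
    if history[-1].guidance == "text":
        return "code"      # then ask for a code-based attempt
    return FINALIZE        # accept the code-based answer


def toy_solve(question: str, guidance: str) -> str:
    if guidance == "code":
        return str(sum(range(1, 101)))  # computed symbolically
    return "about 5000"                 # rough textual estimate


result = steer("What is 1 + 2 + ... + 100?", toy_guide, toy_solve)
print(result)
```

In a real setting `guide` and `solve` would wrap calls to the steering model and the larger task model respectively; the history passed to `guide` is what makes the guidance multi-round.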