CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

July 15, 2024
Authors: Yang Zhao, Di Huang, Chongxiao Li, Pengwei Jin, Ziyuan Nan, Tianyun Ma, Lei Qi, Yansong Pan, Zhenxing Zhang, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Xing Hu, Yunji Chen
cs.AI

Abstract

The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction-tuning data; even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world is of higher quality than code generated by LLMs, and (2) LLMs like GPT-3.5 excel at summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural language description through multi-level summarization. Experimental results show that CodeV surpasses the previous open-source SOTA by a relative 14.4% (BetterV on VerilogEval) and 11.3% (RTLCoder on RTLLM), and also outperforms the previous commercial SOTA, GPT-4, by a relative 22.1% on VerilogEval.
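The data-construction idea in the abstract is straightforward to sketch: take real-world Verilog, ask an LLM to summarize it at increasing levels of abstraction, and pair the final summary (as the instruction) with the original code (as the target). Below is a minimal Python sketch, assuming an OpenAI-style chat API; the prompt wording, model choice, and two-level split are illustrative assumptions, not the paper's exact pipeline.

# A minimal sketch of multi-level summarization for instruction-pair
# construction. Assumes the openai Python package (v1+) and an
# OPENAI_API_KEY in the environment; prompts and levels are hypothetical.
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def build_instruction_pair(verilog_code: str) -> dict:
    """Turn one real-world Verilog snippet into an (instruction, code) pair.

    Level 1: summarize the code into a detailed functional description.
    Level 2: condense that description into a short, requirement-style
    instruction, phrased the way a user might ask for the design.
    """
    detailed = chat(
        "Summarize the functionality of this Verilog module in detail:\n\n"
        + verilog_code
    )
    instruction = chat(
        "Rewrite the following description as a concise design requirement "
        "an engineer might give:\n\n" + detailed
    )
    # The real collected code is kept as the training target, so the
    # fine-tuned model learns to map descriptions to high-quality Verilog.
    return {"instruction": instruction, "output": verilog_code}

Note the inversion this enables: the LLM, which the paper observes is weak at generating Verilog, only has to describe it, while the collected real-world code supplies the high-quality targets for fine-tuning.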
