CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

Abstract

The growing complexity and cost of modern processor design have driven a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages such as Python. However, these methods fall short on hardware description languages (HDLs) such as Verilog because high-quality instruction-tuning data is scarce; even advanced LLMs such as GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world is of higher quality than code generated by LLMs, and (2) LLMs such as GPT-3.5 are better at summarizing Verilog code than at generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of first generating descriptions and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural language description through multi-level summarization. Experimental results show that CodeV surpasses the previous open-source SOTA by a relative 14.4% on VerilogEval (BetterV) and by a relative 11.3% on RTLLM (RTLCoder), and also outperforms the previous commercial SOTA, GPT-4, by a relative 22.1% on VerilogEval.
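
The data-construction idea described above (start from real Verilog code and have an LLM summarize it into a description) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the two summarization levels (a detailed description, then a condensed high-level one), the prompt wording, and the names `LLM`, `InstructionPair`, `multi_level_summarize`, `build_pair`, and `stub_llm`. The stub stands in for a real model call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a model reply.
LLM = Callable[[str], str]


@dataclass
class InstructionPair:
    instruction: str  # natural language description (input for instruction tuning)
    response: str     # real-world Verilog code (target for instruction tuning)


def multi_level_summarize(verilog_code: str, llm: LLM) -> str:
    """Summarize code in two passes (the number of levels is an assumption)."""
    # Level 1 (assumed): ask for a detailed description of the module.
    detailed = llm("Describe in detail what this Verilog module does:\n" + verilog_code)
    # Level 2 (assumed): condense the detailed description into a short spec.
    return llm(
        "Condense this description into a brief high-level specification:\n"
        + detailed
        + "\n\nCode:\n"
        + verilog_code
    )


def build_pair(verilog_code: str, llm: LLM) -> InstructionPair:
    """Pair the generated description with the original, unmodified code."""
    return InstructionPair(
        instruction=multi_level_summarize(verilog_code, llm),
        response=verilog_code,
    )


def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns canned text per prompt type."""
    if prompt.startswith("Describe"):
        return "The module takes inputs a and b and drives y with a AND b."
    return "Implement a 2-input AND gate."


AND_GATE = "module and_gate(input a, input b, output y);\n  assign y = a & b;\nendmodule"
pair = build_pair(AND_GATE, stub_llm)
print(pair.instruction)
```

The point of this direction is that the code side of each pair is never generated: the Verilog stays exactly as collected, and only the description comes from the model, which matches the two observations stated in the abstract.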
