GoodVibe: Security-by-Vibe for LLM-Based Code Generation
February 11, 2026
Authors: Maximilian Thang, Lichao Wu, Sasha Behrouzi, Mohamadreza Rostami, Jona te Lintelo, Stjepan Picek, Ahmad-Reza Sadeghi
cs.AI
Abstract
Large language models (LLMs) are increasingly used for code generation in fast, informal development workflows, often referred to as vibe coding, where speed and convenience are prioritized, and security requirements are rarely made explicit. In this setting, models frequently produce functionally correct but insecure code, creating a growing security risk. Existing approaches to improving code security rely on full-parameter fine-tuning or parameter-efficient adaptations, which are either costly and prone to catastrophic forgetting or operate at coarse granularity with limited interpretability and control.
We present GoodVibe, a neuron-level framework for improving the security of code language models by default. GoodVibe is based on the key insight that security-relevant reasoning is localized to a small subset of neurons. We identify these neurons using gradient-based attribution from a supervised security task and perform neuron-selective fine-tuning that updates only this security-critical subspace. To further reduce training cost, we introduce activation-driven neuron clustering, enabling structured updates with minimal overhead. We evaluate GoodVibe on six LLMs across security-critical programming languages, including C++, Java, Swift, and Go. GoodVibe substantially improves the security of generated code while preserving general model utility, achieving up to a 2.5x improvement in security over base models, matching or exceeding full fine-tuning with over 4,700x fewer trainable parameters, and reducing training computation by more than 3.6x compared to the parameter-efficient baseline (LoRA). Our results demonstrate that neuron-level optimization offers an effective and scalable approach to securing code generation without sacrificing efficiency or generality.