SecureCode v2.0: A Production-Grade Dataset for Training Security-Aware Code Generation Models

Abstract

AI assistants produce vulnerable code in 45% of security-relevant scenarios, introducing flaws into production systems at scale. Yet existing secure coding datasets fall short: they lack incident grounding, do not provide the scale modern training requires, and miss the operational security context developers need for production deployments. We present SecureCode v2.0, a production-grade dataset of 1,215 security-focused coding examples that passed structural validation and expert security review. Every example ties to an actual documented security incident with CVE references, provides vulnerable and secure implementations, demonstrates concrete attacks, and includes defense-in-depth operational guidance. The dataset covers 11 vulnerability categories (the complete OWASP Top 10:2025 plus AI/ML Security Threats) across 11 languages (Python, JavaScript, Java, Go, PHP, C#, TypeScript, Ruby, Rust, Kotlin, and YAML for infrastructure-as-code). Our quality assurance framework ensures complete incident grounding. Each example includes SIEM integration strategies, infrastructure hardening recommendations (Docker, AppArmor, WAF configurations), and testing approaches using language-appropriate frameworks. The dataset uses a 4-turn conversational structure mirroring actual developer-AI interactions, escalating from basic implementations to advanced security considerations and defense-in-depth guidance. Our contributions are: (1) 1,215 rigorously validated examples split into 989 training, 122 validation, and 104 test examples, (2) an automated validation framework ensuring dataset consistency, (3) a 4-turn conversational structure capturing realistic security workflows, (4) comprehensive operational security guidance with SIEM integration strategies, (5) complete language-specific implementation fidelity, and (6) open-source release of data, validation tools, and benchmarking protocols.
