Critical Risk Evaluation of Amazon's Nova Premier: An Analysis Under the Frontier Model Safety Framework

Abstract

Nova Premier is Amazon's most capable multimodal foundation model and a teacher model for distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. The evaluations target three high-risk domains: Chemical, Biological, Radiological and Nuclear (CBRN); Offensive Cyber Operations; and Automated AI R&D. They combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report the core findings. Based on this evaluation, we find that Nova Premier is safe for public release, in line with the commitments we made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.