Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

Abstract

Nova Premier is Amazon's most capable multimodal foundation model and a teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. The evaluations target three high-risk domains: Chemical, Biological, Radiological and Nuclear (CBRN); Offensive Cyber Operations; and Automated AI R&D. They combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release, in line with the commitments we made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.