Qwen3 Technical Report

May 14, 2025
Authors: An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong Tang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jing Zhou, Jingren Zhou, Junyang Lin, Kai Dang, Keqin Bao, Kexin Yang, Le Yu, Lianghao Deng, Mei Li, Mingfeng Xue, Mingze Li, Pei Zhang, Peng Wang, Qin Zhu, Rui Men, Ruize Gao, Shixuan Liu, Shuang Luo, Tianhao Li, Tianyi Tang, Wenbiao Yin, Xingzhang Ren, Xinyu Wang, Xinyu Zhang, Xuancheng Ren, Yang Fan, Yang Su, Yichang Zhang, Yinger Zhang, Yu Wan, Yuqiong Liu, Zekun Wang, Zeyu Cui, Zhenru Zhang, Zhipeng Zhou, Zihan Qiu
cs.AI

Abstract

In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes models of both dense and Mixture-of-Experts (MoE) architectures, with parameter scales ranging from 0.6 billion to 235 billion. A key innovation in Qwen3 is the integration of thinking mode (for complex, multi-step reasoning) and non-thinking mode (for rapid, context-driven responses) into a unified framework. This eliminates the need to switch between different models, such as chat-optimized models (e.g., GPT-4o) and dedicated reasoning models (e.g., QwQ-32B), and enables dynamic mode switching based on user queries or chat templates. Qwen3 also introduces a thinking budget mechanism that allows users to allocate computational resources adaptively during inference, balancing latency and performance according to task complexity. Moreover, by leveraging knowledge from the flagship models, we significantly reduce the computational resources required to build smaller-scale models while ensuring their highly competitive performance. Empirical evaluations demonstrate that Qwen3 achieves state-of-the-art results across diverse benchmarks, including code generation, mathematical reasoning, and agentic tasks, remaining competitive with larger MoE models and proprietary models. Compared to its predecessor Qwen2.5, Qwen3 expands multilingual support from 29 to 119 languages and dialects, enhancing global accessibility through improved cross-lingual understanding and generation. To facilitate reproducibility and community-driven research and development, all Qwen3 models are publicly available under the Apache 2.0 license.
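As a concrete illustration of the unified thinking/non-thinking design described above, the released Qwen3 checkpoints expose the mode switch through the standard Hugging Face chat template. The sketch below follows the usage pattern from the public Qwen3 model cards; the checkpoint name, prompt, and generation settings are illustrative, and `enable_thinking` is the template flag documented for these models.

```python
# Minimal sketch of Qwen3's dynamic mode switching via the chat template.
# Checkpoint, prompt, and max_new_tokens are illustrative choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # any open Qwen3 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# Thinking mode: the model first emits a <think>...</think> reasoning block,
# then its final answer. Set enable_thinking=False for fast, non-thinking
# responses from the same model -- no separate chat/reasoning checkpoints.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

The thinking budget mentioned in the abstract is a separate inference-time control: as described in the report, it caps the number of tokens the model may spend inside the thinking block, and the model is prompted to finalize its answer once the budget is exhausted, trading reasoning depth for latency.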
