
Qwen3 Technical Report

May 14, 2025
Authors: An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, Chujie Zheng, Dayiheng Liu, Fan Zhou, Fei Huang, Feng Hu, Hao Ge, Haoran Wei, Huan Lin, Jialong Tang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jing Zhou, Jingren Zhou, Junyang Lin, Kai Dang, Keqin Bao, Kexin Yang, Le Yu, Lianghao Deng, Mei Li, Mingfeng Xue, Mingze Li, Pei Zhang, Peng Wang, Qin Zhu, Rui Men, Ruize Gao, Shixuan Liu, Shuang Luo, Tianhao Li, Tianyi Tang, Wenbiao Yin, Xingzhang Ren, Xinyu Wang, Xinyu Zhang, Xuancheng Ren, Yang Fan, Yang Su, Yichang Zhang, Yinger Zhang, Yu Wan, Yuqiong Liu, Zekun Wang, Zeyu Cui, Zhenru Zhang, Zhipeng Zhou, Zihan Qiu
cs.AI

Abstract

In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The series includes both dense and Mixture-of-Experts (MoE) architectures, with parameter counts ranging from 0.6 billion to 235 billion. A key innovation in Qwen3 is the integration of a thinking mode (for complex, multi-step reasoning) and a non-thinking mode (for rapid, context-driven responses) into a unified framework. This eliminates the need to switch between different models, such as chat-optimized models (e.g., GPT-4o) and dedicated reasoning models (e.g., QwQ-32B), and enables dynamic mode switching based on user queries or chat templates. Qwen3 also introduces a thinking budget mechanism that lets users adaptively allocate computational resources during inference, balancing latency against performance according to task complexity. Moreover, by leveraging knowledge from the flagship models, we significantly reduce the computational resources required to build smaller-scale models while ensuring their highly competitive performance. Empirical evaluations show that Qwen3 achieves state-of-the-art results across diverse benchmarks, including code generation, mathematical reasoning, and agent tasks, remaining competitive against larger MoE models and proprietary models. Compared to its predecessor Qwen2.5, Qwen3 expands multilingual support from 29 to 119 languages and dialects, enhancing global accessibility through improved cross-lingual understanding and generation. To facilitate reproducibility and community-driven research and development, all Qwen3 models are publicly available under the Apache 2.0 license.
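As a minimal sketch of how the two headline mechanisms surface to users, the snippet below follows the usage documented in the public Qwen3 model cards on Hugging Face: the `enable_thinking` flag toggles between modes through the chat template, and a simple wrapper caps the thinking phase at a token budget. The model ID is a real released checkpoint, but the budget value, the stop-thinking cue string, and the resume logic are illustrative assumptions, not the report's exact implementation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"  # smallest dense model in the series
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are below 30?"}]

# Non-thinking mode: a fast, context-driven reply with no <think> block.
fast_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,
)

# Thinking mode with a budget: let the model reason inside
# <think>...</think> for at most `budget` new tokens.
think_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(think_prompt, return_tensors="pt").to(model.device)
budget = 512
out = model.generate(**inputs, max_new_tokens=budget)
new_text = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False
)

if "</think>" not in new_text:
    # Budget exhausted mid-thought: close the think block with a stop cue
    # (illustrative wording, not the report's exact string) and let the
    # model emit the final answer directly.
    resumed = think_prompt + new_text + (
        "\nTime is up; I will answer based on my thinking so far.\n</think>\n\n"
    )
    inputs = tokenizer(resumed, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```

Per the same model cards, the chat template also honors `/think` and `/no_think` tags placed inside user messages as a softer, per-turn mode switch.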