
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

January 25, 2024
Authors: Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang
cs.AI

Abstract

The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.
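The fill-in-the-blank objective mentioned in the abstract means the model can complete a gap conditioned on both the code before it and the code after it, rather than only continuing a left-to-right prefix. Below is a minimal sketch of such infilling with one of the released base checkpoints via Hugging Face transformers. The `deepseek-ai/deepseek-coder-1.3b-base` model ID and the sentinel token strings follow the project's public repository, but they are assumptions here rather than details stated in this abstract, so verify them against the model's tokenizer before relying on them.

```python
# A minimal sketch of fill-in-the-middle (FIM) code infilling with a
# DeepSeek-Coder base model. Assumption: the sentinel tokens and model ID
# below match the DeepSeek-Coder repository; check the tokenizer config.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"  # smallest model in the series
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# FIM prompt: the model generates the code belonging at the <｜fim▁hole｜>
# position, conditioned on both the preceding prefix and the trailing suffix.
prompt = """<｜fim▁begin｜>def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
<｜fim▁hole｜>
    return quick_sort(left) + [pivot] + quick_sort(right)<｜fim▁end｜>"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, i.e. the infilled middle segment.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

In this format the suffix (here, the final `return` line) steers the completion: the model should produce the partitioning code that defines `left` and `right`, which plain prefix-only generation could not be constrained to do.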