IQuest-Coder-V1 Technical Report

Abstract

In this report, we introduce the IQuest-Coder-V1 series (7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose a code-flow multi-stage training paradigm that captures the dynamic evolution of software logic across the phases of the pipeline. The models are developed through an evolutionary pipeline. Initial pre-training covers code facts, repository data, and completion data. A specialized mid-training stage then integrates reasoning and agentic trajectories at 32k context and repository-scale modeling at 128k context to build deep logical foundations. Finally, post-training on specialized coding capabilities splits into two paths: the thinking path (using reasoning-driven RL) and the instruct path (optimized for general assistance). IQuest-Coder-V1 achieves state-of-the-art performance among competitive models across critical dimensions of code intelligence: agentic software engineering, competitive programming, and complex tool use. To address deployment constraints, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism designed to optimize the trade-off between model capacity and deployment footprint, offering an architectural route to a better balance of efficacy and efficiency. We believe the release of the IQuest-Coder-V1 series, including the complete white-box chain of checkpoints from the pre-training bases to the final thinking and instruction models, will advance research in autonomous code intelligence and real-world agentic systems.
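The staged pipeline summarized in the abstract can be sketched as a small data structure. This is only an illustrative outline: the `Stage` class, the `pipeline` helper, and the stage labels are hypothetical names introduced here, the ordering of the two mid-training phases is an assumption, and no hyperparameters are implied beyond the context sizes the abstract states.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One phase of the training pipeline (hypothetical structure)."""
    name: str
    ingredients: Tuple[str, ...]
    context: Optional[str] = None  # None where the abstract gives no context size


# Stages shared by both released model variants, as listed in the abstract.
PRETRAIN = Stage("pre-training",
                 ("code facts", "repository data", "completion data"))
MID_32K = Stage("mid-training (32k)",
                ("reasoning trajectories", "agentic trajectories"), "32k")
MID_128K = Stage("mid-training (128k)",
                 ("repository-scale data",), "128k")

# Post-training splits into two specialized paths.
BRANCHES = {
    "thinking": Stage("post-training: thinking path", ("reasoning-driven RL",)),
    "instruct": Stage("post-training: instruct path",
                      ("general-assistance optimization",)),
}


def pipeline(path: str) -> List[Stage]:
    """Return the ordered stages leading to the 'thinking' or 'instruct' model."""
    # Assumption: the 32k phase precedes the 128k phase.
    return [PRETRAIN, MID_32K, MID_128K, BRANCHES[path]]


if __name__ == "__main__":
    for stage in pipeline("thinking"):
        print(stage.name, "|", ", ".join(stage.ingredients))
```

Both paths share the first three stages and differ only in the final one, which mirrors the checkpoint chain from pre-training bases to the thinking and instruction models.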