# IQuest-Coder-V1 Technical Report

Abstract

In this report, we introduce the IQuest-Coder-V1 series (7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose a code-flow multi-stage training paradigm, which captures the dynamic evolution of software logic across the phases of the pipeline. Our models are developed through an evolutionary pipeline that starts with initial pre-training on code facts, repository data, and completion data. We then apply a specialized mid-training stage that integrates reasoning and agentic trajectories at 32k context and repository-scale data at 128k context to build deep logical foundations. The models are finalized with post-training on specialized coding capabilities, which splits into two paths: the thinking path (using reasoning-driven RL) and the instruct path (optimized for general assistance). IQuest-Coder-V1 achieves state-of-the-art performance among competitive models across critical dimensions of code intelligence: agentic software engineering, competitive programming, and complex tool use. To address deployment constraints, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism designed to balance model capacity against deployment footprint, offering an architectural route to a better efficacy-efficiency trade-off. We believe the release of the IQuest-Coder-V1 series, including the complete white-box chain of checkpoints from the pre-training bases to the final thinking and instruct models, will advance research in autonomous code intelligence and real-world agentic systems.
