KAT-Coder-V2 Technical Report

Abstract

We present KAT-Coder-V2, an agentic coding model developed by the KwaiKAT team at Kuaishou. KAT-Coder-V2 adopts a "Specialize-then-Unify" paradigm that decomposes agentic coding into five expert domains (SWE, WebCoding, Terminal, WebSearch, and General). Each domain undergoes independent supervised fine-tuning and reinforcement learning before all five are consolidated into a single model via on-policy distillation. We develop KwaiEnv, a modular infrastructure sustaining tens of thousands of concurrent sandbox instances, and scale RL training along three axes: task complexity, intent alignment, and scaffold generalization. We further propose MCLA for stabilizing MoE RL training, and Tree Training, which eliminates redundant computation over tree-structured trajectories and yields up to a 6.2x speedup. KAT-Coder-V2 achieves 79.6% on SWE-bench Verified (vs. Claude Opus 4.6 at 80.8%) and 88.7 on PinchBench (surpassing GLM-5 and MiniMax M2.7), ranks first across all three frontend aesthetics scenarios, and maintains strong generalist scores on Terminal-Bench Hard (46.8) and tau^2-Bench (93.9). Our model is publicly available at https://streamlake.com/product/kat-coder.
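To make the idea of redundant computation over tree-structured trajectories concrete, the following is a minimal illustrative sketch, not the report's Tree Training implementation. It assumes trajectories are plain lists of token IDs that branch from shared prefixes, and it only counts how many token positions would be processed with and without prefix sharing. The function names `count_flat_tokens` and `count_tree_tokens` and the toy data are hypothetical, and the resulting ratio is unrelated to the 6.2x figure reported above.

```python
from typing import Dict, Sequence


def count_flat_tokens(trajectories: Sequence[Sequence[int]]) -> int:
    """Token positions processed when each trajectory is handled independently."""
    return sum(len(t) for t in trajectories)


def count_tree_tokens(trajectories: Sequence[Sequence[int]]) -> int:
    """Token positions processed when shared prefixes are merged into one trie.

    Each trie node stands for one token position that is computed once,
    no matter how many trajectories pass through it.
    """
    root: Dict[int, dict] = {}
    nodes = 0
    for traj in trajectories:
        node = root
        for tok in traj:
            if tok not in node:
                node[tok] = {}  # first time this prefix is seen: new computation
                nodes += 1
            node = node[tok]
    return nodes


# Toy rollouts (hypothetical token IDs) that branch from common prefixes.
trajectories = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 6, 7],
    [1, 2, 8, 9],
]
flat = count_flat_tokens(trajectories)
tree = count_tree_tokens(trajectories)
print(f"flat={flat} tree={tree} ratio={flat / tree:.2f}")
```

In this toy case the flat count is 14 and the merged count is 9, because the prefixes `[1, 2, 3]` and `[1, 2]` are each stored only once. The saving grows with how much of each rollout is shared, which is the general intuition behind deduplicating computation across branches.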
