Command A: An Enterprise-Ready Large Language Model

Abstract

In this report, we describe the development of Command A, a powerful large language model purpose-built to excel at real-world enterprise use cases. Command A is an agent-optimised and multilingual-capable model, with support for 23 languages of global business and a novel hybrid architecture that balances efficiency with top-of-the-range performance. It offers best-in-class Retrieval Augmented Generation (RAG) capabilities with grounding and tool use to automate sophisticated business processes. These abilities are achieved through a decentralised training approach that includes self-refinement algorithms and model merging techniques. We also include results for Command R7B, which shares capability and architectural similarities with Command A. Weights for both models have been released for research purposes. This technical report details our original training pipeline and presents an extensive evaluation of our models across a suite of enterprise-relevant tasks and public benchmarks, demonstrating excellent performance and efficiency.