Xmodel-LM Technical Report

Abstract

We introduce Xmodel-LM, a compact and efficient 1.1B-parameter language model pre-trained on over 2 trillion tokens. It is trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization. Despite its small size, Xmodel-LM performs strongly and notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly available on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
