TeleChat Technical Report
January 8, 2024
Authors: Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Zhongjiang He, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang, Yan Wang, Xin Wang, Luwen Pu, Huihan Xu, Ruiyu Fang, Yu Zhao, Jie Zhang, Xiaomeng Huang, Zhilong Lu, Jiaxin Peng, Wenjun Zheng, Shiquan Wang, Bingkai Yang, Xuewei He, Zhuoru Jiang, Qiyi Xie, Yanhan Zhang, Zhongqiu Li, Lingling Shi, Weiwei Fu, Yin Zhang, Zilu Huang, Sishi Xiong, Yuxiang Zhang, Chao Wang, Shuangyong Song
cs.AI
Abstract
In this technical report, we present TeleChat, a collection of large language
models (LLMs) with 3 billion, 7 billion, and 12 billion parameters. The
collection includes pretrained language models as well as fine-tuned chat
models aligned with human preferences. TeleChat is first pretrained on an
extensive corpus of diverse English and Chinese texts comprising trillions of
tokens. The model is then fine-tuned to align with human preferences,
following the detailed methodology we describe. We evaluate TeleChat on a
variety of tasks, including language understanding, mathematics, reasoning,
code generation, and knowledge-based question answering. Our findings indicate
that TeleChat achieves performance comparable to other open-source models of
similar size across a wide range of public benchmarks. To support future
research and applications of LLMs, we release the fine-tuned model checkpoints
of TeleChat's 7B and 12B variants, along with code and a portion of our
pretraining data, to the public community.
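
Since the report announces publicly released checkpoints, the following is a minimal sketch of how one might load a released TeleChat chat model with the Hugging Face transformers library. The repository id "Tele-AI/telechat-7B", the prompt, and the generation settings are assumptions for illustration; consult the official release for the exact identifiers and recommended inference settings.

```python
# Minimal sketch: loading a released TeleChat checkpoint via Hugging Face
# transformers. The repo id below is an assumption for illustration; use the
# id published in the official TeleChat release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tele-AI/telechat-7B"  # assumed repository id

# Custom model architectures typically require trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",          # place weights automatically (needs accelerate)
    trust_remote_code=True,
)

# Example prompt (Chinese): "Please introduce large language models."
prompt = "请介绍一下大型语言模型。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This uses only the generic AutoModel loading path, so it makes no claim about TeleChat-specific chat templates or generation utilities; the released code may provide its own dedicated chat interface.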