CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
May 31, 2023
Authors: Nghi D. Q. Bui, Hung Le, Yue Wang, Junnan Li, Akhilesh Deepak Gotmare, Steven C. H. Hoi
cs.AI
Abstract
Code intelligence plays a key role in transforming modern software
engineering. Recently, deep learning-based models, especially Transformer-based
large language models (LLMs), have demonstrated remarkable potential in
tackling code intelligence tasks by leveraging massive open-source code data and
programming language features. However, the development and deployment of such
models often require expertise in both machine learning and software
engineering, creating a barrier to model adoption. In this paper, we
present CodeTF, an open-source Transformer-based library for state-of-the-art
Code LLMs and code intelligence. Following the principles of modular design and
extensible framework, we design CodeTF with a unified interface to enable rapid
access and development across different types of models, datasets and tasks.
Our library supports a collection of pretrained Code LLMs and popular
code benchmarks, including a standardized interface to train and serve Code
LLMs efficiently, and data features such as language-specific parsers and
utility functions for extracting code attributes. In this paper, we describe
the design principles, the architecture, key modules and components, and
compare CodeTF with other related library tools. Finally, we hope CodeTF can
bridge the gap between machine learning/generative AI and software engineering,
providing a comprehensive open-source solution for developers, researchers, and
practitioners.
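
To make the phrase "utility functions for extracting code attributes" concrete, here is a minimal sketch of what such a utility might look like for Python source. It uses only Python's standard `ast` module and is not CodeTF's API: the function name `extract_code_attributes`, the returned dictionary layout, and the `SAMPLE` snippet are all assumptions made for illustration.

```python
import ast


def extract_code_attributes(source: str) -> dict:
    """Collect function names, class names, and imported modules.

    Hypothetical helper for illustration only; not part of CodeTF.
    """
    tree = ast.parse(source)
    attributes = {"functions": [], "classes": [], "imports": []}
    # ast.walk visits every node in the tree (in no guaranteed source order).
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            attributes["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            attributes["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            attributes["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            attributes["imports"].append(node.module or "")
    return attributes


# Hypothetical input snippet used to exercise the helper.
SAMPLE = '''
import os
from typing import List

class Greeter:
    def greet(self, name):
        return "hi " + name

def main():
    print(Greeter().greet("x"))
'''

attrs = extract_code_attributes(SAMPLE)
print(attrs)
```

A library covering several programming languages would likely need a language-agnostic parser rather than `ast`, which handles Python only; the sketch shows just the general shape of parsing source code and then reading attributes off the syntax tree.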