Mini-GPTs: Efficient Large Language Models through Contextual Pruning
December 20, 2023
Authors: Tim Valicenti, Justice Vidal, Ritik Patnaik
cs.AI
Abstract
In AI research, the optimization of Large Language Models (LLMs) remains a
significant challenge, crucial for advancing the field's practical applications
and sustainability. Building upon the foundational work of Professor Song Han's
lab at MIT, this paper introduces a novel approach to developing Mini-GPTs via
contextual pruning. Our methodology strategically prunes the computational
architecture of traditional LLMs, like Phi-1.5, focusing on retaining core
functionalities while drastically reducing model sizes. We employ the technique
across diverse and complex datasets, including US law, Medical Q&A, Skyrim
dialogue, English-Taiwanese translation, and Economics articles. The results
underscore the efficiency and effectiveness of contextual pruning, not merely
as a theoretical concept but as a practical tool in developing domain-specific,
resource-efficient LLMs. Contextual pruning is a promising method for building
domain-specific LLMs, and this research is a building block towards future
development with more hardware compute, refined fine-tuning, and quantization.
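To make the idea concrete, below is a minimal, illustrative sketch of one plausible form of contextual pruning applied to a single linear layer: output neurons are scored by their mean absolute activation on domain-specific calibration inputs, and low-activity neurons are removed. The function name, scoring rule, and keep ratio are assumptions chosen for illustration, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def contextual_prune_linear(layer: nn.Linear,
                            calib_inputs: torch.Tensor,
                            keep_ratio: float = 0.5) -> nn.Linear:
    """Keep only the output neurons with the highest mean absolute
    activation on domain calibration data (hypothetical scoring rule;
    the paper's exact criterion may differ)."""
    with torch.no_grad():
        acts = layer(calib_inputs)                  # (n_samples, out_features)
        scores = acts.abs().mean(dim=0)             # importance per output neuron
        k = max(1, int(keep_ratio * layer.out_features))
        keep = torch.topk(scores, k).indices.sort().values
        pruned = nn.Linear(layer.in_features, k, bias=layer.bias is not None)
        pruned.weight.copy_(layer.weight[keep])     # weight rows = output neurons
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[keep])
    return pruned

# Toy usage: random tensors stand in for hidden states from domain text.
torch.manual_seed(0)
layer = nn.Linear(64, 128)
calib = torch.randn(256, 64)
smaller = contextual_prune_linear(layer, calib, keep_ratio=0.25)
print(tuple(layer.weight.shape), "->", tuple(smaller.weight.shape))  # (128, 64) -> (32, 64)
```

In a full model, any downstream layer consuming the pruned output would need its input dimension reduced to match, and the pruned network would typically be fine-tuned afterward to recover quality, consistent with the fine-tuning step the abstract mentions.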