
Mini-GPTs: Efficient Large Language Models through Contextual Pruning

December 20, 2023
Authors: Tim Valicenti, Justice Vidal, Ritik Patnaik
cs.AI

Abstract

In AI research, the optimization of Large Language Models (LLMs) remains a significant challenge, crucial for advancing the field's practical applications and sustainability. Building upon the foundational work of Professor Song Han's lab at MIT, this paper introduces a novel approach to developing Mini-GPTs via contextual pruning. Our methodology strategically prunes the computational architecture of traditional LLMs, like Phi-1.5, focusing on retaining core functionality while drastically reducing model size. We apply the technique across diverse and complex datasets, including US law, medical Q&A, Skyrim dialogue, English-Taiwanese translation, and economics articles. The results underscore the efficiency and effectiveness of contextual pruning, not merely as a theoretical concept but as a practical tool for developing domain-specific, resource-efficient LLMs. Contextual pruning is a promising method for building domain-specific LLMs, and this research is a building block for future development with greater hardware compute, refined fine-tuning, and quantization.
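
The abstract describes pruning driven by domain-specific calibration data. As a rough illustration of that general idea (not the authors' implementation), the sketch below masks out neurons in a model's linear layers whose average activation magnitude stays near zero on calibration text from the target domain; the `calib_loader` and `threshold` names, and the mask-rather-than-remove strategy, are illustrative assumptions.

```python
# Minimal sketch of contextual pruning, assuming a PyTorch model and a
# DataLoader of domain-specific token-id batches. Hypothetical, simplified:
# neurons that are nearly inactive on the target domain are zeroed out.
import torch
import torch.nn as nn

def contextual_prune(model: nn.Module, calib_loader, threshold: float = 1e-3):
    stats, hooks = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            # Accumulate mean |activation| per output neuron; for a
            # (batch, seq, features) output this reduces over batch and seq.
            act = output.detach().abs().mean(dim=tuple(range(output.dim() - 1)))
            stats[name] = stats.get(name, 0) + act
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            hooks.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        for batch in calib_loader:  # batches of token ids from the target domain
            model(batch)

    for h in hooks:
        h.remove()

    # Zero the weights (and biases) of neurons that stayed essentially
    # inactive across the calibration set.
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name in stats:
            inactive = stats[name] / len(calib_loader) < threshold
            module.weight.data[inactive, :] = 0.0
            if module.bias is not None:
                module.bias.data[inactive] = 0.0
```

Masking alone only zeroes parameters; to realize actual memory and latency savings, the masked rows (and the matching input columns of downstream layers) would be physically removed, shrinking the layer dimensions themselves.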