Mini-GPTs: Efficient Large Language Models through Contextual Pruning

Abstract

In AI research, optimizing Large Language Models (LLMs) remains a significant challenge, and one that is crucial for the field's practical applications and sustainability. Building on the foundational work of Professor Song Han's lab at MIT, this paper introduces a novel approach to developing Mini-GPTs via contextual pruning. Our methodology strategically prunes the computational architecture of traditional LLMs, such as Phi-1.5, retaining core functionality while drastically reducing model size. We apply the technique across diverse and complex datasets, including US law, Medical Q&A, Skyrim dialogue, English-Taiwanese translation, and Economics articles. The results underscore the efficiency and effectiveness of contextual pruning, not merely as a theoretical concept but as a practical tool for developing domain-specific, resource-efficient LLMs. Contextual pruning is a promising method for building domain-specific LLMs, and this research is a building block toward future development with more hardware compute, refined fine-tuning, and quantization.
