Delta Activations: A Representation for Finetuned Large Language Models

Abstract

The success of powerful open-source Large Language Models (LLMs) has enabled the community to create a vast collection of post-trained models adapted to specific tasks and domains. However, navigating and understanding these models remains challenging due to inconsistent metadata and unstructured repositories. We introduce Delta Activations, a method that represents a finetuned model as a vector embedding by measuring the shift in its internal activations relative to a base model. This representation allows for effective clustering by domain and task, revealing structure in the model landscape. Delta Activations also exhibits desirable properties: it is robust across finetuning settings and is additive when finetuning datasets are mixed. In addition, we show that Delta Activations can embed tasks via few-shot finetuning, and we further explore its use for model selection and merging. We hope Delta Activations can facilitate the practice of reusing publicly available models. Code is available at https://github.com/OscarXZQ/delta_activations.
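The core idea, an embedding built from the shift in activations relative to a base model, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes the activations have already been extracted as plain vectors, one per probe prompt, and that the per-prompt differences are simply averaged. The abstract does not specify the probe prompts, layer, or token position, so none are modeled here. The names `delta_embedding` and `cosine_similarity` and all toy values are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def delta_embedding(base_acts: Sequence[Sequence[float]],
                    tuned_acts: Sequence[Sequence[float]]) -> Vector:
    """Average, over probe prompts, of (finetuned activation - base activation).

    Assumption: both inputs hold one activation vector per probe prompt,
    in the same prompt order and with the same dimensionality.
    """
    if not base_acts or len(base_acts) != len(tuned_acts):
        raise ValueError("need the same non-zero number of prompts for both models")
    dim = len(base_acts[0])
    total = [0.0] * dim
    for base_vec, tuned_vec in zip(base_acts, tuned_acts):
        if len(base_vec) != dim or len(tuned_vec) != dim:
            raise ValueError("all activation vectors must share one dimensionality")
        for i in range(dim):
            total[i] += tuned_vec[i] - base_vec[i]
    return [value / len(base_acts) for value in total]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embeddings of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


# Toy activations for two probe prompts (illustrative values, not real data).
base = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
math_model_a = [[1.0, 1.0, 2.0], [2.0, 1.0, 1.0]]  # shifts along dimension 0
math_model_b = [[0.5, 1.0, 2.0], [2.5, 1.0, 1.0]]  # also shifts along dimension 0
code_model = [[0.0, 1.0, 3.0], [1.0, 1.0, 2.0]]    # shifts along dimension 2

emb_math_a = delta_embedding(base, math_model_a)
emb_math_b = delta_embedding(base, math_model_b)
emb_code = delta_embedding(base, code_model)

# Models whose activations shift in the same direction get similar embeddings.
sim_same_domain = cosine_similarity(emb_math_a, emb_math_b)
sim_cross_domain = cosine_similarity(emb_math_a, emb_code)
```

In this toy setup the two models that shift along the same dimension end up with a higher cosine similarity than the pair that shifts along different dimensions, which is the kind of signal a clustering step over such embeddings would rely on.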
