How Programming Concepts and Neurons Are Shared in Code Language Models

Abstract

Several studies have explored the mechanisms of large language models (LLMs) in coding tasks, but most have focused on programming languages (PLs) in a monolingual setting. In this paper, we investigate the relationship between multiple PLs and English in the concept space of LLMs. We perform a few-shot translation task on 21 PL pairs using two Llama-based models. By decoding the embeddings of intermediate layers during this task, we observe that the concept space is closer to English (including PL keywords) and assigns high probabilities to English tokens in the second half of the intermediate layers. We analyze neuron activations for 11 PLs and English, finding that while language-specific neurons are primarily concentrated in the bottom layers, those exclusive to each PL tend to appear in the top layers. For PLs that are highly aligned with multiple other PLs, identifying language-specific neurons is not feasible. These PLs also tend to have a larger keyword set than other PLs and are closer to the model's concept space regardless of the input/output PL in the translation task. Our findings provide insights into how LLMs internally represent PLs, revealing structural patterns in the model's concept space. Code is available at https://github.com/cisnlp/code-specific-neurons.
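To make "decoding the embeddings of intermediate layers" concrete, the following is a minimal toy sketch of a logit-lens style projection: each layer's hidden vector is multiplied by the output (unembedding) matrix and passed through a softmax to see which token that layer already favors. Whether the paper uses exactly this procedure is an assumption; the vocabulary, the matrix values, the hidden states, and all function names here are hypothetical, and a real model would typically also apply its final normalization layer before the projection.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_layer(hidden, unembed, vocab):
    """Project one hidden vector onto the vocabulary.

    `unembed` holds one row per vocabulary token; the dot product of the
    hidden vector with each row gives that token's logit.
    """
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in unembed]
    return dict(zip(vocab, softmax(logits)))


def top_token_per_layer(hidden_states, unembed, vocab):
    """Return the most probable token at every layer."""
    tops = []
    for hidden in hidden_states:
        probs = decode_layer(hidden, unembed, vocab)
        tops.append(max(probs, key=probs.get))
    return tops


# Hypothetical toy data: three keyword tokens, two hidden dimensions,
# and one hidden vector per layer at the position being predicted.
VOCAB = ["def", "func", "fn"]
UNEMBED = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
HIDDEN_STATES = [[0.1, 0.9], [0.8, 0.2], [2.0, 0.0]]

if __name__ == "__main__":
    for layer, token in enumerate(top_token_per_layer(HIDDEN_STATES, UNEMBED, VOCAB)):
        print(f"layer {layer}: {token}")
```

In this toy setup the favored token changes between the first and later layers, which is the kind of layer-by-layer shift such a projection is meant to expose; with a real model, the hidden states would come from the network itself rather than from hand-written lists.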
