Unraveling the cognitive patterns of Large Language Models through module communities

Abstract

Large Language Models (LLMs) have reshaped our world with significant advancements in science, engineering, and society through applications ranging from scientific discovery and medical diagnostics to chatbots. Despite their ubiquity and utility, the underlying mechanisms of LLMs remain concealed within billions of parameters and complex structures, making their inner architecture and cognitive processes challenging to comprehend. We address this gap by adopting approaches used to understand emergent cognition in biology and by developing a network-based framework that links cognitive skills, LLM architectures, and datasets, ushering in a paradigm shift in foundation model analysis. The skill distribution in the module communities shows that, while LLMs do not strictly parallel the focalized specialization observed in specific biological systems, they exhibit unique communities of modules whose emergent skill patterns partially mirror the distributed yet interconnected cognitive organization seen in avian and small mammalian brains. Our numerical results highlight a key divergence between biological systems and LLMs, where skill acquisition benefits substantially from dynamic, cross-regional interactions and neural plasticity. By integrating cognitive science principles with machine learning, our framework provides new insights into LLM interpretability and suggests that effective fine-tuning strategies should leverage distributed learning dynamics rather than rigid modular interventions.

