BrainSurgery: 모델 편집 및 업사이클링을 위한 재현 가능하고 신뢰할 수 있는 선언적 가중치 조작

초록

딥러닝 모델이 확장됨에 따라, 대규모 체크포인트의 관리, 검사 및 수정이 점점 더 어려워지고 있다. 연구자들은 종종 계층 재구성, 정밀도 변환, 저순위 분해, 아키텍처 디버깅을 위해 모델 가중치를 변경해야 하지만, 이러한 작업 흐름은 취약한 임시방편적인 Python 스크립트에 의존하는 경우가 많다. 본 연구에서는 신경망 체크포인트에 대한 강력하고 재현 가능한 '텐서 수술' 도구인 BrainSurgery를 소개하고, 모델 업사이클링부터 LoRA 추출에 이르는 네 가지 예시와 세 가지 사례 연구를 포함한 시스템 시연을 제공한다. BrainSurgery는 저장소 형식과 메모리 관리를 추상화함으로써 선언적 YAML 계획을 통해 복잡한 변환을 실행한다. 표현력 있는 정규 표현식과 구조적 타겟팅을 통해 구조적 수정, 수학적 변환 및 텐서 재구성을 지원하며, 내장 검증을 통해 텐서 형태, 데이터 유형, 값을 검증하여 무시되기 쉬운 오류를 방지한다. BrainSurgery는 재현 가능하고 검증된 연산을 통해 향후 연구를 위한 강력한 기반을 제공할 것으로 기대한다.

English

As deep learning models scale, managing, inspecting, and modifying large checkpoints has become increasingly challenging. Researchers often need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows often rely on fragile ad-hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible "tensor surgery" on neural network checkpoints, and provide a system demonstration covering four examples and three case studies from model upcycling to LoRA extraction. By abstracting storage formats and memory management, BrainSurgery executes complex transformations through declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery will provide a strong foundation for future research through its reproducible and validated operations.