BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

Abstract

As deep learning models scale, managing, inspecting, and modifying large checkpoints becomes increasingly challenging. Researchers often need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows typically rely on fragile, ad hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible "tensor surgery" on neural network checkpoints, and provide a system demonstration covering four examples and three case studies, ranging from model upcycling to LoRA extraction. By abstracting storage formats and memory management, BrainSurgery executes complex transformations through declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery will provide a strong foundation for future research through its reproducible and validated operations.
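
To make the idea of a declarative plan with regex targeting and built-in assertions concrete, the following is a minimal sketch in plain Python. It is not BrainSurgery's actual plan syntax or API: the step keys (`op`, `target`, `to`, `by`, `shape`), the operation names, and the `apply_plan` helper are all hypothetical, and tensors are stood in for by nested lists so the example has no dependencies. The plan is written as the list of dictionaries that a YAML file would typically parse into.

```python
import re


def shape(t):
    """Shape of a nested-list tensor (assumes a rectangular layout)."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0] if t else None
    return tuple(dims)


def scale(t, factor):
    """Return a new nested list with every element multiplied by factor."""
    if isinstance(t, list):
        return [scale(x, factor) for x in t]
    return t * factor


def apply_plan(checkpoint, plan):
    """Apply hypothetical plan steps in order; fail loudly instead of silently."""
    state = dict(checkpoint)  # work on a copy of the name -> tensor mapping
    for step in plan:
        pattern = re.compile(step["target"])
        # Regex targeting: a step applies to every tensor whose full name matches.
        matches = [(name, pattern.fullmatch(name)) for name in state]
        matches = [(name, m) for name, m in matches if m]
        if not matches:
            raise KeyError(f"no tensor matches {step['target']!r}")
        for name, m in matches:
            if step["op"] == "rename":
                # Capture groups from the target are reused in the new name.
                state[m.expand(step["to"])] = state.pop(name)
            elif step["op"] == "scale":
                state[name] = scale(state[name], step["by"])
            elif step["op"] == "assert_shape":
                if shape(state[name]) != tuple(step["shape"]):
                    raise AssertionError(
                        f"{name}: expected shape {tuple(step['shape'])}, "
                        f"got {shape(state[name])}"
                    )
            else:
                raise ValueError(f"unknown op {step['op']!r}")
    return state


# Toy checkpoint with illustrative tensor names.
checkpoint = {
    "layers.0.attn.weight": [[1.0, 2.0], [3.0, 4.0]],
    "layers.1.attn.weight": [[5.0, 6.0], [7.0, 8.0]],
    "embed.weight": [[0.5, 0.5]],
}

# Hypothetical plan: restructure names, transform values, then validate.
plan = [
    {"op": "rename", "target": r"layers\.(\d+)\.attn\.weight",
     "to": r"blocks.\1.attention.weight"},
    {"op": "scale", "target": r"blocks\.\d+\.attention\.weight", "by": 0.5},
    {"op": "assert_shape", "target": r"blocks\.\d+\.attention\.weight",
     "shape": [2, 2]},
]

result = apply_plan(checkpoint, plan)
```

In this sketch, a step whose pattern matches nothing or whose shape assertion fails raises an error immediately, which is the kind of guard against silent errors that the abstract attributes to built-in assertions. A real tool would additionally need to handle storage formats, data types, and memory management, none of which is modeled here.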