BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

Abstract

As deep learning models scale, managing, inspecting, and modifying large checkpoints becomes increasingly challenging. Researchers frequently need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows often rely on fragile, ad-hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible "tensor surgery" on neural network checkpoints, and provide a system demonstration covering four examples and three case studies, ranging from model upcycling to LoRA extraction. By abstracting away storage formats and memory management, BrainSurgery executes complex transformations through declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex-based and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery's reproducible, validated operations will provide a strong foundation for future research.
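To make the idea of a declarative plan with regex targeting and built-in assertions concrete, the following is a minimal, standard-library-only sketch. It is not BrainSurgery's actual plan schema or API: the step keys (`op`, `target`, `to`, `dtype`, `shape`), the `apply_plan` function, and the `FakeTensor` stand-in are all hypothetical names chosen for illustration, and the plan is written as the kind of Python list a parsed YAML file might yield.

```python
import re
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor: only the metadata a plan checks."""
    shape: tuple
    dtype: str


def apply_plan(checkpoint, plan):
    """Apply each step of a (hypothetical) declarative plan to a name -> tensor dict.

    The input mapping is copied, so the original checkpoint is left untouched.
    """
    state = dict(checkpoint)
    for step in plan:
        pattern = re.compile(step["target"])
        # A step only touches tensors whose full name matches the regex.
        matched = [name for name in state if pattern.fullmatch(name)]
        if not matched:
            raise ValueError(f"no tensor matches {step['target']!r}")

        if step["op"] == "rename":
            for name in matched:
                new_name = pattern.sub(step["to"], name)
                if new_name != name and new_name in state:
                    raise ValueError(f"rename would overwrite {new_name!r}")
                state[new_name] = state.pop(name)
        elif step["op"] == "cast":
            for name in matched:
                state[name] = FakeTensor(state[name].shape, step["dtype"])
        elif step["op"] == "assert":
            for name in matched:
                tensor = state[name]
                if "shape" in step and tensor.shape != tuple(step["shape"]):
                    raise AssertionError(f"{name}: shape {tensor.shape}")
                if "dtype" in step and tensor.dtype != step["dtype"]:
                    raise AssertionError(f"{name}: dtype {tensor.dtype}")
        else:
            raise ValueError(f"unknown op {step['op']!r}")
    return state


# Toy checkpoint and plan; all names are illustrative assumptions.
checkpoint = {
    "layers.0.attn.weight": FakeTensor((4, 4), "float32"),
    "layers.1.attn.weight": FakeTensor((4, 4), "float32"),
    "embed.weight": FakeTensor((10, 4), "float32"),
}
plan = [
    {"op": "rename", "target": r"layers\.(\d+)\.attn\.weight",
     "to": r"blocks.\1.attention.weight"},
    {"op": "cast", "target": r"blocks\..*", "dtype": "bfloat16"},
    {"op": "assert", "target": r"blocks\.\d+\.attention\.weight",
     "shape": [4, 4], "dtype": "bfloat16"},
]
result = apply_plan(checkpoint, plan)
```

In this sketch the final `assert` step fails loudly if an earlier step produced an unexpected shape or dtype, which is the kind of silent error the abstract says built-in assertions are meant to catch. A real tool would additionally have to handle storage formats and memory management, which this toy version ignores.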