BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling
June 8, 2026
Authors: Gianluca Barmina, Annemette Broch Pirchert, Andrea Blasi Núñez, Lukas Galke Poech, Peter Schneider-Kamp
cs.AI
Abstract
As deep learning models scale, managing, inspecting, and modifying large checkpoints has become increasingly challenging. Researchers frequently need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows typically rely on fragile ad-hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible "tensor surgery" on neural network checkpoints, and provide a system demonstration covering four examples and three case studies ranging from model upcycling to LoRA extraction. By abstracting away storage formats and memory management, BrainSurgery executes complex transformations specified as declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery, through its reproducible and validated operations, will provide a strong foundation for future research.
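To make the idea of a declarative plan with regex targeting and built-in assertions concrete, the following is a minimal sketch in plain Python. It is not BrainSurgery's actual plan syntax or API, which the abstract does not specify: the `Tensor` stand-in, the `apply_plan` function, the step keys (`op`, `match`, `to`, `dtype`, `shape`), and the tensor names are all hypothetical. The plan is written as a list of dictionaries, roughly what a YAML plan might look like once parsed.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tensor:
    """Hypothetical stand-in for a checkpoint tensor: metadata plus flat values."""
    shape: tuple
    dtype: str
    data: tuple


def apply_plan(state, plan):
    """Apply each step of a declarative plan to a name -> Tensor mapping."""
    state = dict(state)  # work on a copy so the input checkpoint is untouched
    for step in plan:
        pattern = re.compile(step["match"])
        # Resolve targets up front so renames do not disturb the iteration.
        matched = [name for name in state if pattern.fullmatch(name)]
        if not matched:
            # Fail loudly instead of silently doing nothing.
            raise ValueError(f"no tensor matches {step['match']!r}")
        for name in matched:
            if step["op"] == "rename":
                # Backreferences such as \1 carry captured groups into the new name.
                state[pattern.sub(step["to"], name)] = state.pop(name)
            elif step["op"] == "cast":
                # Only the dtype label changes here; a real cast would convert values.
                state[name] = replace(state[name], dtype=step["dtype"])
            elif step["op"] == "assert":
                tensor = state[name]
                if "shape" in step and tensor.shape != tuple(step["shape"]):
                    raise AssertionError(f"{name}: shape {tensor.shape}")
                if "dtype" in step and tensor.dtype != step["dtype"]:
                    raise AssertionError(f"{name}: dtype {tensor.dtype}")
            else:
                raise ValueError(f"unknown op {step['op']!r}")
    return state


checkpoint = {
    "encoder.layer.0.weight": Tensor((2, 2), "float32", (1.0, 0.0, 0.0, 1.0)),
    "encoder.layer.1.weight": Tensor((2, 2), "float32", (0.5, 0.5, 0.5, 0.5)),
}

plan = [
    {"op": "rename", "match": r"encoder\.layer\.(\d+)\.weight", "to": r"blocks.\1.weight"},
    {"op": "cast", "match": r"blocks\.\d+\.weight", "dtype": "bfloat16"},
    {"op": "assert", "match": r"blocks\.\d+\.weight", "shape": [2, 2], "dtype": "bfloat16"},
]

edited = apply_plan(checkpoint, plan)
```

The design point this sketch tries to capture is that the plan is data rather than code: it can be stored, reviewed, and re-run, and the final assertion step turns an unexpected shape or data type into an explicit failure instead of a silent error.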