ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
March 27, 2025
作者: Haoming Xu, Shuxun Wang, Yanqiu Zhao, Yi Zhong, Ziyan Jiang, Ningyuan Zhao, Shumin Deng, Huajun Chen, Ningyu Zhang
cs.AI
Abstract
This paper presents the ZJUKLAB team's submission for SemEval-2025 Task 4:
Unlearning Sensitive Content from Large Language Models. This task aims to
selectively erase sensitive knowledge from large language models, avoiding both
over-forgetting and under-forgetting issues. We propose an unlearning system
that leverages Model Merging (specifically TIES-Merging), combining two
specialized models into a more balanced unlearned model. Our system achieves
competitive results, ranking second among 26 teams, with an online score of
0.944 for Task Aggregate and 0.487 for overall Aggregate. In this paper, we
also conduct local experiments and perform a comprehensive analysis of the
unlearning process, examining performance trajectories, loss dynamics, and
weight perspectives, along with several supplementary experiments, to
understand the effectiveness of our method. Furthermore, we analyze the
shortcomings of our method and evaluation metrics, emphasizing that MIA scores
and ROUGE-based metrics alone are insufficient to fully evaluate successful
unlearning. Finally, we emphasize the need for more comprehensive evaluation
methodologies and rethinking of unlearning objectives in future research. Code
is available at https://github.com/zjunlp/unlearn/tree/main/semeval25.Summary
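The merging step described in the abstract can be illustrated with a minimal sketch of the TIES-Merging procedure (trim, elect sign, disjoint merge) on flat parameter vectors. This is an illustrative assumption of how the technique works in general, not the authors' actual implementation; the function name `ties_merge` and the toy models are hypothetical.

```python
import numpy as np

def ties_merge(base, finetuned_models, density=0.2, lam=1.0):
    """Sketch of TIES-Merging on flat parameter vectors (illustrative only)."""
    # 1. Task vectors: difference between each specialized model and the base.
    task_vectors = [ft - base for ft in finetuned_models]

    # 2. Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.size))
        threshold = np.sort(np.abs(tv))[-k]  # k-th largest magnitude
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))

    # 3. Elect sign: per-parameter sign of the summed trimmed vectors.
    elected_sign = np.sign(np.sum(trimmed, axis=0))
    elected_sign[elected_sign == 0] = 1.0

    # 4. Disjoint merge: average only the entries that agree with the elected sign.
    stack = np.stack(trimmed)
    agree = (np.sign(stack) == elected_sign) & (stack != 0)
    counts = np.maximum(agree.sum(axis=0), 1)  # avoid division by zero
    merged_vector = (stack * agree).sum(axis=0) / counts

    # 5. Add the merged task vector back onto the base model.
    return base + lam * merged_vector

# Toy usage: merge two hypothetical specialized models sharing one base.
base = np.zeros(6)
model_a = base + np.array([0.9, -0.1, 0.0, 0.5, 0.0, -0.8])
model_b = base + np.array([0.7, 0.2, 0.0, -0.4, 0.0, -0.6])
merged = ties_merge(base, [model_a, model_b], density=0.5)
```

In the system described above, the two inputs would be two separately trained unlearned models rather than the toy vectors used here; the sign election resolves their conflicting updates before averaging.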