Using Captum to Explain Generative Language Models

Abstract

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users' understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We give an overview of the available functionality and provide example applications that demonstrate its potential for understanding learned associations within generative language models.
