Using Captum to Explain Generative Language Models

Abstract

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users' understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionality and present example applications that illustrate its potential for understanding learned associations within generative language models.
