Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
November 1, 2023
作者: Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen
cs.AI
Abstract
Dynamic shape computations have become critical in modern machine learning
workloads, especially in emerging large language models. The success of these
models has driven demand for deploying them to a diverse set of backend
environments. In this paper, we present Relax, a compiler abstraction for
optimizing end-to-end dynamic machine learning workloads. Relax introduces
first-class symbolic shape annotations to track dynamic shape computations
globally across the program. It also introduces a cross-level abstraction that
encapsulates computational graphs, loop-level tensor programs, and library
calls in a single representation to enable cross-level optimizations. We build
an end-to-end compilation framework using the proposed approach to optimize
dynamic shape models. Experimental results on large language models show that
Relax delivers performance competitive with state-of-the-art hand-optimized
systems across platforms and enables deployment of emerging dynamic models to a
broader set of environments, including mobile phones, embedded devices, and web
browsers.
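To make the idea of first-class symbolic shape annotations concrete, the sketch below shows (in plain Python, not the actual Relax API) how a compiler can carry named symbolic dimensions such as `n` through operators and check their consistency at compile time; all class and function names here are illustrative assumptions.

```python
# Illustrative sketch, not the real Relax API: propagating symbolic
# tensor shapes through an operator, in the spirit of Relax's
# first-class symbolic shape annotations.
from dataclasses import dataclass


@dataclass(frozen=True)
class SymDim:
    """A symbolic dimension, e.g. an unknown sequence length 'n'."""
    name: str


@dataclass
class TensorType:
    """A tensor type whose shape mixes concrete ints and SymDim symbols."""
    shape: tuple


def matmul_shape(a: TensorType, b: TensorType) -> TensorType:
    """Infer the result shape of a 2-D matmul, keeping symbols symbolic.

    Two symbolic dims with the same name are known equal statically,
    so the inner-dimension check works without concrete values.
    """
    (m, k1), (k2, n) = a.shape, b.shape
    assert k1 == k2, f"inner dimensions mismatch: {k1} vs {k2}"
    return TensorType((m, n))


n = SymDim("n")                    # dynamic length, unknown until runtime
x = TensorType((n, 4096))          # activations: [n, 4096]
w = TensorType((4096, 11008))      # weights:     [4096, 11008]
y = matmul_shape(x, w)
print(y.shape)                     # the symbol 'n' survives: (SymDim(name='n'), 11008)
```

Because the symbol `n` is preserved rather than erased to "unknown", later passes can reuse it globally, e.g. to prove that two buffers of shape `(n, 11008)` can share memory.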