Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
November 1, 2023
作者: Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen
cs.AI
Abstract
Dynamic shape computations have become critical in modern machine learning
workloads, especially in emerging large language models. The success of these
models has driven demand for deploying them to a diverse set of backend
environments. In this paper, we present Relax, a compiler abstraction for
optimizing end-to-end dynamic machine learning workloads. Relax introduces
first-class symbolic shape annotations to track dynamic shape computations
globally across the program. It also introduces a cross-level abstraction that
encapsulates computational graphs, loop-level tensor programs, and library
calls in a single representation to enable cross-level optimizations. We build
an end-to-end compilation framework using the proposed approach to optimize
dynamic shape models. Experimental results on large language models show that
Relax delivers performance competitive with state-of-the-art hand-optimized
systems across platforms and enables deployment of emerging dynamic models to a
broader set of environments, including mobile phones, embedded devices, and web
browsers.
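To make the idea of first-class symbolic shape annotations concrete, the sketch below shows (in plain Python, not the actual Relax API) how a compiler can carry named symbolic dimensions such as `n` through operators and check their consistency at compile time; all class and function names here are illustrative assumptions.

```python
# Illustrative sketch, not the real Relax API: propagating symbolic
# tensor shapes through an operator, in the spirit of Relax's
# first-class symbolic shape annotations.
from dataclasses import dataclass


@dataclass(frozen=True)
class SymDim:
    """A symbolic dimension, e.g. an unknown sequence length 'n'."""
    name: str


@dataclass
class TensorType:
    """A tensor type whose shape mixes concrete ints and SymDim symbols."""
    shape: tuple


def matmul_shape(a: TensorType, b: TensorType) -> TensorType:
    """Infer the result shape of a 2-D matmul, keeping symbols symbolic.

    Two symbolic dims with the same name are known equal statically,
    so the inner-dimension check works without concrete values.
    """
    (m, k1), (k2, n) = a.shape, b.shape
    assert k1 == k2, f"inner dimensions mismatch: {k1} vs {k2}"
    return TensorType((m, n))


n = SymDim("n")                    # dynamic length, unknown until runtime
x = TensorType((n, 4096))          # activations: [n, 4096]
w = TensorType((4096, 11008))      # weights:     [4096, 11008]
y = matmul_shape(x, w)
print(y.shape)                     # the symbol 'n' survives: (SymDim(name='n'), 11008)
```

Because the symbol `n` is preserved rather than erased to "unknown", later passes can reuse it globally, e.g. to prove that two buffers of shape `(n, 11008)` can share memory.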