
OmniPred: Language Models as Universal Regressors

February 22, 2024
Authors: Xingyou Song, Oscar Li, Chansoo Lee, Bangding Yang, Daiyi Peng, Sagi Perel, Yutian Chen
cs.AI

Abstract

Over the broad landscape of experimental design, regression has been a powerful tool to accurately predict the outcome metrics of a system or model given a set of parameters, but has been traditionally restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over (x,y) evaluation data from diverse real world experiments. Using data sourced from Google Vizier, one of the largest blackbox optimization databases in the world, our extensive experiments demonstrate that through only textual representations of mathematical parameters and values, language models are capable of very precise numerical regression, and if given the opportunity to train over multiple tasks, can significantly outperform traditional regression models.
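The abstract's key idea is that parameters x and outcome metrics y are fed to the language model purely as text. A minimal sketch of what such a textual serialization might look like is below; the exact field format and the sign/mantissa/exponent encoding of y are illustrative assumptions, not the paper's actual tokenization scheme.

```python
def serialize_x(params):
    # Render each parameter name/value pair as plain text in a fixed order.
    # (Hypothetical format; the paper's real representation may differ.)
    return ",".join(f"{k}:{v}" for k, v in sorted(params.items()))

def serialize_y(value):
    # Represent the metric as sign, mantissa digits, and exponent, so the
    # model can emit numbers as short token sequences (assumed scheme).
    mantissa, exp = f"{value:.3e}".split("e")
    sign = "+" if value >= 0 else "-"
    return f"<{sign}> {mantissa.lstrip('-')} E{int(exp)}"

x_text = serialize_x({"learning_rate": 1e-3, "layers": 4})
y_text = serialize_y(0.8532)
# The model is then trained end-to-end to map x_text -> y_text.
```

Because both sides are plain strings, one model can in principle be trained across many heterogeneous tasks with differing parameter spaces, which is what enables the multi-task transfer the abstract describes.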