

OmniPred: Language Models as Universal Regressors

February 22, 2024
Authors: Xingyou Song, Oscar Li, Chansoo Lee, Bangding Yang, Daiyi Peng, Sagi Perel, Yutian Chen
cs.AI

Abstract

Over the broad landscape of experimental design, regression has been a powerful tool to accurately predict the outcome metrics of a system or model given a set of parameters, but has been traditionally restricted to methods which are only applicable to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over (x,y) evaluation data from diverse real world experiments. Using data sourced from Google Vizier, one of the largest blackbox optimization databases in the world, our extensive experiments demonstrate that through only textual representations of mathematical parameters and values, language models are capable of very precise numerical regression, and if given the opportunity to train over multiple tasks, can significantly outperform traditional regression models.
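The abstract's core idea is that (x, y) evaluation data can be fed to a language model purely as text: parameters are serialized into a string, and the target metric is rendered as a token sequence for the model to predict. A minimal sketch of one possible serialization is below; the function names and the exact text format are illustrative assumptions, not the paper's actual encoding scheme.

```python
def serialize_x(params):
    """Serialize a parameter dict into a flat text string.

    Sorting keys makes the representation deterministic. The
    "key:value" comma-joined format is a hypothetical choice for
    illustration, not OmniPred's actual format.
    """
    return ",".join(f"{k}:{v}" for k, v in sorted(params.items()))


def serialize_y(value, precision=4):
    """Render the outcome metric as fixed-precision scientific
    notation, so every y maps to a bounded-length token sequence."""
    return f"{value:.{precision}e}"


# Example: one (x, y) pair from a hypothetical tuning experiment.
x_text = serialize_x({"learning_rate": 1e-3, "batch_size": 128})
y_text = serialize_y(0.8312)
# x_text == "batch_size:128,learning_rate:0.001"
# y_text == "8.3120e-01"
```

A multi-task regressor in this style would simply train on many such (x_text, y_text) pairs drawn from different experiments, which is what lets a single model transfer across tasks where a classical per-task regressor cannot.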