Predicting integers from continuous parameters

Abstract

We study the problem of predicting numeric labels that are constrained to the integers or to a subrange of the integers, such as the number of up-votes on a social media post or the number of bicycles available at a public rental station. While it is possible to model these labels as continuous values and apply traditional regression, doing so changes the underlying distribution on the labels from discrete to continuous. Discrete distributions have certain benefits, which raises the question of whether such integer labels can be modeled directly by a discrete distribution whose parameters are predicted from the features of a given instance. Moreover, we focus on the use case of output distributions of neural networks, which adds the requirement that the parameters of the distribution be continuous, so that backpropagation and gradient descent can be used to learn the weights of the network. We investigate several options for such distributions, some existing and some novel, and test them on a range of tasks, including tabular learning, sequential prediction, and image generation. We find that, overall, the best performance comes from two distributions: Bitwise, which represents the target integer in bits and places a Bernoulli distribution on each bit, and a discrete analogue of the Laplace distribution, which has exponentially decaying tails around a continuous mean.
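To make the two best-performing distributions concrete, the following is a minimal sketch of how their log-probabilities might be computed from continuous parameters. It rests on assumptions not stated in the abstract: the function names, the least-significant-bit-first ordering, the use of logits for the bit probabilities, and the specific form p(k) ∝ exp(-|k - mu| / scale) for the discrete Laplace analogue are illustrative choices and may differ from the parametrization used in the paper.

```python
import math


def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    return min(z, 0.0) - math.log1p(math.exp(-abs(z)))


def bitwise_log_prob(target, logits):
    """Log-probability of a non-negative integer under independent Bernoulli bits.

    Assumption: logits[i] is a real-valued model output for bit i,
    least significant bit first (hypothetical convention).
    """
    if target < 0 or target >= 2 ** len(logits):
        raise ValueError("target is not representable with len(logits) bits")
    total = 0.0
    for i, logit in enumerate(logits):
        bit = (target >> i) & 1
        # P(bit = 1) = sigmoid(logit), P(bit = 0) = sigmoid(-logit).
        total += log_sigmoid(logit if bit else -logit)
    return total


def discrete_laplace_log_prob(k, mu, scale):
    """Log-probability of integer k under p(k) proportional to exp(-|k - mu| / scale).

    Assumption: this is one possible discrete analogue of the Laplace
    distribution with a continuous mean parameter mu and scale > 0.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    q = math.exp(-1.0 / scale)
    frac = mu - math.floor(mu)
    # Closed-form normalizer: two geometric series, one on each side of mu.
    log_z = math.log(q ** frac + q ** (1.0 - frac)) - math.log1p(-q)
    return -abs(k - mu) / scale - log_z
```

In both cases the parameters (`logits`, `mu`, `scale`) are continuous, so the negative log-probability could serve as a differentiable training loss, which is the requirement the abstract places on output distributions of neural networks.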
