Feasible Learning
January 24, 2025
Authors: Juan Ramirez, Ignacio Hounie, Juan Elenter, Jose Gallego-Posada, Meraj Hashemizadeh, Alejandro Ribeiro, Simon Lacoste-Julien
cs.AI
Abstract
We introduce Feasible Learning (FL), a sample-centric learning paradigm where
models are trained by solving a feasibility problem that bounds the loss for
each training sample. In contrast to the ubiquitous Empirical Risk Minimization
(ERM) framework, which optimizes for average performance, FL demands
satisfactory performance on every individual data point. Since any model that
meets the prescribed performance threshold is a valid FL solution, the choice
of optimization algorithm and its dynamics play a crucial role in shaping the
properties of the resulting solutions. In particular, we study a primal-dual
approach which dynamically re-weights the importance of each sample during
training. To address the challenge of setting a meaningful threshold in
practice, we introduce a relaxation of FL that incorporates slack variables of
minimal norm. Our empirical analysis, spanning image classification, age
regression, and preference optimization in large language models, demonstrates
that models trained via FL can learn from data while displaying improved tail
behavior compared to ERM, with only a marginal impact on average performance.
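To make the contrast with ERM concrete, the feasibility problem described in the abstract can be written as follows. This is a plausible formalization inferred from the text, not the paper's exact notation: here $\ell_i(\theta)$ denotes the loss of model $\theta$ on training sample $i$, $\epsilon$ is the prescribed performance threshold, and the choice of squared L2 norm on the slacks is an assumption.

```latex
% ERM: optimize average performance over the training set
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \ell_i(\theta)

% FL: a feasibility problem bounding the loss of every sample
\text{find} \;\; \theta
  \quad \text{s.t.} \quad \ell_i(\theta) \le \epsilon,
  \quad i = 1, \dots, n

% Relaxed FL: slack variables s_i of minimal norm absorb
% constraints that a fixed threshold cannot satisfy
\min_{\theta,\, s} \; \| s \|_2^2
  \quad \text{s.t.} \quad \ell_i(\theta) \le \epsilon + s_i,
  \quad s_i \ge 0
```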
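The primal-dual approach mentioned in the abstract can be sketched as below. This is a minimal illustrative implementation, not the authors' exact algorithm: it assumes projected gradient ascent on per-sample dual variables, and the names `epsilon`, `dual_lr`, and `primal_dual_fl_step` are hypothetical.

```python
import torch

def primal_dual_fl_step(model, batch_x, batch_y, batch_idx, lambdas,
                        loss_fn, opt, epsilon=0.1, dual_lr=0.01):
    """One primal-dual step on the FL Lagrangian. The dual variable
    lambdas[i] acts as a dynamic per-sample weight: it grows while the
    constraint loss_i <= epsilon is violated and decays toward zero
    once the sample meets the threshold."""
    # Per-sample losses; loss_fn must not reduce over the batch.
    per_sample_loss = loss_fn(model(batch_x), batch_y)  # shape: (batch,)
    violation = per_sample_loss - epsilon               # constraint residual

    # Primal update: descend the Lagrangian, re-weighting each
    # sample's gradient by its current multiplier.
    opt.zero_grad()
    (lambdas[batch_idx].detach() * per_sample_loss).mean().backward()
    opt.step()

    # Dual update: projected gradient ascent, keeping multipliers >= 0.
    with torch.no_grad():
        lambdas[batch_idx] = torch.clamp(
            lambdas[batch_idx] + dual_lr * violation.detach(), min=0.0)
```

In this sketch, `loss_fn` is assumed to return unreduced per-sample losses (e.g., `torch.nn.CrossEntropyLoss(reduction='none')`) and `lambdas` is a length-n tensor of nonnegative multipliers, one per training sample. Because the pure feasibility problem has no objective beyond its constraints, the primal step reduces to descent on a dynamically re-weighted loss, which is what lets the optimization dynamics shape which feasible solution is found.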