Active Learners as Efficient PRP Rerankers
May 15, 2026
Authors: Jeremías Figueiredo Paschmann, Juan Kaplan, Francisco Nattero, Santiago Barron, Juan Wisznia, Luciano del Corro
cs.AI
Abstract
Pairwise Ranking Prompting (PRP) elicits pairwise preference judgments from an LLM, which are then aggregated into a ranking, usually via classical sorting algorithms. However, judgments are noisy, order-sensitive, and sometimes intransitive, so sorting assumptions do not match the setting. Because sorting aims to recover a full permutation, truncating it to meet a call budget does not produce a dependable top-K. We thus reframe PRP reranking as active learning from noisy pairwise comparisons and show that active rankers are drop-in replacements that improve NDCG@10 per call in the call-constrained regime. Our noise-robust framework also introduces a randomized-direction oracle that uses a single LLM call per pair. This approach converts systematic position bias into zero-mean noise, enabling unbiased aggregate ranking without the cost of bidirectional calls.
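The randomized-direction idea can be illustrated with a small simulation. The sketch below is not the authors' implementation: the `judge` callable, the additive position-bias model, and all names and parameter values are assumptions made for illustration. It shows a toy judge that favors whichever item is shown first, and compares a fixed presentation order against a single call with a coin-flipped order.

```python
import random


def randomized_direction_oracle(a, b, judge, rng):
    """Compare a and b with ONE judge call, showing them in a random order.

    `judge(first, second)` is a hypothetical callable returning True when the
    item shown first is preferred. Returns the preferred item.
    """
    if rng.random() < 0.5:
        return a if judge(a, b) else b
    return b if judge(b, a) else a


def make_simulated_judge(relevance, p_correct, position_bias, rng):
    """Toy stand-in for an LLM judge (an assumed noise model, not the paper's)."""
    def judge(first, second):
        # Base chance of preferring the first-shown item, from true relevance.
        p_first = p_correct if relevance[first] > relevance[second] else 1.0 - p_correct
        # Additive bias toward the first position, clipped to a valid probability.
        p_first = min(1.0, max(0.0, p_first + position_bias))
        return rng.random() < p_first
    return judge


def win_rate(a, b, compare, trials):
    """Fraction of trials in which compare(a, b) returns a."""
    return sum(compare(a, b) == a for _ in range(trials)) / trials


rng = random.Random(0)
relevance = {"doc_hi": 2, "doc_lo": 1}
judge = make_simulated_judge(relevance, p_correct=0.7, position_bias=0.2, rng=rng)


def fixed(x, y):
    # Always shows x first, so x inherits the position bias.
    return x if judge(x, y) else y


def randomized(x, y):
    return randomized_direction_oracle(x, y, judge, rng)


trials = 20000
hi_shown_first = win_rate("doc_hi", "doc_lo", fixed, trials)
hi_shown_second = 1.0 - win_rate("doc_lo", "doc_hi", fixed, trials)
hi_randomized = win_rate("doc_hi", "doc_lo", randomized, trials)

print(f"doc_hi wins, always shown first:  {hi_shown_first:.3f}")
print(f"doc_hi wins, always shown second: {hi_shown_second:.3f}")
print(f"doc_hi wins, randomized order:    {hi_randomized:.3f}")
```

Under this assumed additive model, the fixed-order win rate of the better document depends on which slot it occupies, while the randomized oracle's win rate should sit near the unbiased `p_correct`, because the bias helps and hurts each item with equal probability. Real LLM position bias need not be additive, so this only illustrates the mechanism.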