Believe Your Model: Distribution-Guided Confidence Calibration

Abstract

Large Reasoning Models have demonstrated remarkable performance thanks to advances in test-time scaling techniques, which improve prediction accuracy by generating multiple candidate responses and selecting the most reliable answer. Prior work has shown that internal model signals such as confidence scores can partly indicate response correctness and exhibit a distributional correlation with accuracy, yet this distributional information has not been fully exploited to guide answer selection. Motivated by this, we propose DistriVoting, which incorporates distributional priors as an additional signal alongside confidence during voting. Specifically, our method (1) first decomposes the mixed confidence distribution into positive and negative components using Gaussian Mixture Models, and (2) then applies a reject filter based on positive/negative samples drawn from these components to mitigate the overlap between the two distributions. In addition, to further reduce the overlap at the level of the distribution itself, we propose SelfStepConf, which uses step-level confidence to dynamically adjust the inference process, increasing the separation between the two distributions and thereby improving the reliability of confidence scores in voting. Experiments across 16 models and 5 benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches.
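The two-stage idea described in the abstract (split the confidence scores into two Gaussian components, filter, then vote) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the reject filter, so the sketch assumes the simplest possible one, dropping every response whose posterior probability under the higher-mean ("positive") component falls below 0.5. All function names, the threshold, and the toy data are hypothetical, and the mixture is fitted with a plain hand-written EM loop.

```python
import math
from collections import defaultdict

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(xs, iters=100):
    """Fit a 1-D two-component Gaussian mixture with EM (needs len(xs) >= 2)."""
    ordered = sorted(xs)
    half = len(ordered) // 2
    lo, hi = ordered[:half], ordered[half:]
    means = [sum(lo) / len(lo), sum(hi) / len(hi)]  # initialise from lower/upper halves
    mu = sum(xs) / len(xs)
    var0 = max(sum((x - mu) ** 2 for x in xs) / len(xs), 1e-6)
    variances, weights = [var0, var0], [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in xs:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances (variance floored)
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6
            )
    return weights, means, variances

def posterior_positive(x, weights, means, variances):
    """Posterior probability that x belongs to the higher-mean component."""
    pos = 0 if means[0] >= means[1] else 1
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    total = sum(p)
    return p[pos] / total if total > 0 else 0.0

def weighted_vote(answers, confidences):
    """Baseline: plain confidence-weighted voting."""
    scores = defaultdict(float)
    for ans, conf in zip(answers, confidences):
        scores[ans] += conf
    return max(scores, key=scores.get)

def filtered_vote(answers, confidences, threshold=0.5):
    """Assumed filter: vote only over samples likely to be 'positive'."""
    params = fit_two_gaussians(confidences)
    kept = [(a, c) for a, c in zip(answers, confidences)
            if posterior_positive(c, *params) >= threshold]
    if not kept:  # nothing survived the filter: fall back to the baseline
        return weighted_vote(answers, confidences)
    return weighted_vote(*zip(*kept))

# Toy data: six low-confidence "17" responses, two high-confidence "42" responses.
answers = ["17"] * 6 + ["42"] * 2
confidences = [0.30, 0.32, 0.35, 0.33, 0.31, 0.34, 0.90, 0.92]
print(weighted_vote(answers, confidences))   # 17
print(filtered_vote(answers, confidences))   # 42
```

On this toy input the many low-confidence responses outweigh the two confident ones in the plain weighted vote, while the filtered vote discards the low-confidence cluster before counting. How the actual reject filter and SelfStepConf work is not recoverable from the abstract alone.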
