Optimizing Decomposition for Optimal Claim Verification

Abstract

Current research on the Decompose-Then-Verify paradigm for evaluating the factuality of long-form text typically treats decomposition and verification in isolation, overlooking their interactions and potential misalignment. We find that existing decomposition policies, typically hand-crafted demonstrations, do not align well with downstream verifiers in terms of atomicity (a novel metric quantifying information density), leading to suboptimal verification results. We formulate finding the optimal decomposition policy for optimal verification as a bilevel optimization problem. To approximate a solution to this strongly NP-hard problem, we propose dynamic decomposition, a reinforcement learning framework that leverages verifier feedback to learn a policy for dynamically decomposing claims to verifier-preferred atomicity. Experimental results show that dynamic decomposition outperforms existing decomposition policies, improving verification confidence by 0.07 and accuracy by 0.12 (on a 0-1 scale) on average across varying verifiers, datasets, and atomicities of input claims.

