Scaling Policy Compliance Assessment in Language Models with Policy Reasoning Traces

Abstract

Policy compliance assessment is the fundamental task of evaluating whether an input case strictly complies with a set of human-defined rules, generally known as policies. In practice, human experts follow a systematic, step-by-step process to identify violations of the specific stipulations outlined in a policy. However, documentation of such gold-standard, expert-level reasoning processes is costly to acquire. In this paper, we introduce Policy Reasoning Traces (PRT), a form of specialized generated reasoning chains that serve as a reasoning bridge to improve an LLM's policy compliance assessment capabilities. Our empirical evaluations demonstrate that using PRTs in both inference-time and training-time scenarios significantly enhances the performance of open-weight and commercial models, setting a new state of the art on HIPAA and GDPR policies. Beyond accuracy gains, we also highlight how PRTs can improve an LLM's ability to accurately cite policy clauses, as well as influence compliance decisions through their high utilization in the raw chains of thought.
