Aligning VLM Assistants with Personalized Situated Cognition

Abstract

Vision-language models (VLMs) aligned with general human objectives, such as being harmless and hallucination-free, have become valuable assistants for humans in managing visual tasks. However, people with diverse backgrounds differ in their cognition even in the same situation, and may therefore have personalized expectations of VLM assistants. This highlights the urgent need to align VLM assistants with personalized situated cognition for real-world assistance. To study this problem, we first simplify it by characterizing individuals based on the sociological concept of Role-Set. We then propose evaluating individuals' actions to examine whether personalized alignment is achieved. Further, we construct a benchmark named PCogAlignBench, which includes 18k instances and 20 individuals with different Role-Sets. Finally, we present a framework called PCogAlign, which constructs a cognition-aware and action-based reward model for personalized alignment. Experimental results and human evaluations demonstrate the reliability of PCogAlignBench and the effectiveness of our proposed PCogAlign. We will open-source the constructed benchmark and code at https://github.com/NLPGM/PCogAlign.
