MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems

Abstract

Human social interactions depend on the ability to infer others' unspoken intentions, emotions, and beliefs, a cognitive skill grounded in the psychological concept of Theory of Mind (ToM). While large language models (LLMs) excel at semantic understanding tasks, they struggle with the ambiguity and contextual nuance inherent in human communication. To bridge this gap, we introduce MetaMind, a multi-agent framework inspired by psychological theories of metacognition and designed to emulate human-like social reasoning. MetaMind decomposes social understanding into three collaborative stages: (1) a Theory-of-Mind Agent generates hypotheses about user mental states (e.g., intent, emotion), (2) a Domain Agent refines these hypotheses using cultural norms and ethical constraints, and (3) a Response Agent generates contextually appropriate responses while validating alignment with inferred intent. Our framework achieves state-of-the-art performance across three challenging benchmarks, with a 35.7% improvement in real-world social scenarios and a 6.2% gain in ToM reasoning. Notably, it enables LLMs to match human-level performance on key ToM tasks for the first time. Ablation studies confirm the necessity of all components and showcase the framework's ability to balance contextual plausibility, social appropriateness, and user adaptation. This work advances AI systems toward human-like social intelligence, with applications in empathetic dialogue and culturally sensitive interactions. Code is available at https://github.com/XMZhangAI/MetaMind.
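The three-stage decomposition described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the names `Hypothesis`, `tom_agent`, `domain_agent`, `response_agent`, and `run_pipeline` are hypothetical, and simple keyword rules stand in for the LLM calls a real system would presumably make. It only illustrates how hypotheses might flow from generation, through refinement, to a validated response.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    """A candidate mental state (hypothetical structure, not from the paper)."""
    kind: str          # e.g. "intent" or "emotion"
    description: str   # short natural-language description
    score: float       # assumed plausibility in [0, 1]


def tom_agent(utterance: str) -> List[Hypothesis]:
    """Stage 1: propose hypotheses about the user's mental state.

    Placeholder keyword rules stand in for a model call.
    """
    hypotheses = [Hypothesis("intent", "wants information", 0.4)]
    if "tired" in utterance.lower():
        hypotheses.append(Hypothesis("emotion", "feels worn out", 0.7))
        hypotheses.append(Hypothesis("intent", "wants to vent", 0.6))
    return hypotheses


def domain_agent(
    hypotheses: List[Hypothesis],
    is_appropriate: Callable[[Hypothesis], bool],
) -> List[Hypothesis]:
    """Stage 2: drop hypotheses that violate a constraint, then rank the rest.

    `is_appropriate` is an assumed stand-in for cultural and ethical checks.
    """
    kept = [h for h in hypotheses if is_appropriate(h)]
    return sorted(kept, key=lambda h: h.score, reverse=True)


def response_agent(hypotheses: List[Hypothesis]) -> str:
    """Stage 3: draft a response and check it reflects the top hypothesis."""
    fallback = "Could you tell me more?"
    if not hypotheses:
        return fallback
    top = hypotheses[0]
    draft = f"It seems the user {top.description}. How can I help?"
    # Toy validation step: the draft must mention the inferred state.
    return draft if top.description in draft else fallback


def run_pipeline(utterance: str) -> str:
    """Chain the three stages; the constraint here accepts everything."""
    candidates = tom_agent(utterance)
    refined = domain_agent(candidates, is_appropriate=lambda h: True)
    return response_agent(refined)


if __name__ == "__main__":
    print(run_pipeline("I'm so tired of this project"))
```

Keeping each stage as a separate function with a narrow interface mirrors the staged design the abstract describes, and makes it straightforward to swap any placeholder for a model-backed component.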
