Towards Interactive Intelligence for Digital Humans

Abstract

We introduce Interactive Intelligence, a novel paradigm for digital humans capable of personality-aligned expression, adaptive interaction, and self-evolution. To realize this, we present Mio (Multimodal Interactive Omni-Avatar), an end-to-end framework composed of five specialized modules: Thinker, Talker, Face Animator, Body Animator, and Renderer. This unified architecture integrates cognitive reasoning with real-time multimodal embodiment to enable fluid, consistent interaction. Furthermore, we establish a new benchmark to rigorously evaluate the capabilities of interactive intelligence. Extensive experiments demonstrate that our framework outperforms state-of-the-art methods across all evaluated dimensions. Together, these contributions move digital humans beyond superficial imitation toward intelligent interaction.
