Towards Interactive Intelligence for Digital Humans
December 15, 2025
Authors: Yiyi Cai, Xuangeng Chu, Xiwei Gao, Sitong Gong, Yifei Huang, Caixin Kang, Kunhang Li, Haiyang Liu, Ruicong Liu, Yun Liu, Dianwen Ng, Zixiong Su, Erwin Wu, Yuhan Wu, Dingkun Yan, Tianyu Yan, Chang Zeng, Bo Zheng, You Zhou
cs.AI
Abstract
We introduce Interactive Intelligence, a novel paradigm for digital humans that are capable of personality-aligned expression, adaptive interaction, and self-evolution. To realize this, we present Mio (Multimodal Interactive Omni-Avatar), an end-to-end framework composed of five specialized modules: Thinker, Talker, Face Animator, Body Animator, and Renderer. This unified architecture integrates cognitive reasoning with real-time multimodal embodiment to enable fluid, consistent interaction. Furthermore, we establish a new benchmark to rigorously evaluate the capabilities of interactive intelligence. Extensive experiments demonstrate that our framework outperforms state-of-the-art methods across all evaluated dimensions. Together, these contributions move digital humans beyond superficial imitation toward intelligent interaction.