

GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

March 1, 2026
Authors: Zhuokang Shen, Yifan Wang, Hanyu Chen, Wenxuan Huang, Shaohui Lin
cs.AI

Abstract

Recent advances in large language models (LLMs) have enabled increasingly capable chatbots. However, most existing systems focus on single-user settings and do not generalize well to multi-user group chats, where agents must intervene more proactively and accurately under complex, evolving contexts. Existing approaches typically rely on LLMs for both reasoning and generation, leading to high token consumption, limited scalability, and potential privacy risks. To address these challenges, we propose GroupGPT, a token-efficient and privacy-preserving agentic framework for multi-user chat assistants. GroupGPT adopts a small-large model collaborative architecture that decouples intervention-timing decisions from response generation, enabling efficient and accurate decision-making. The framework also supports multimodal inputs, including memes, images, videos, and voice messages. We further introduce MUIR, a benchmark dataset for multi-user chat-assistant intervention reasoning. MUIR contains 2,500 annotated group chat segments with intervention labels and rationales, supporting evaluation of both timing accuracy and response quality. We evaluate a range of models on MUIR, from large language models to smaller counterparts. Extensive experiments demonstrate that GroupGPT produces accurate and well-timed responses, achieving an average score of 4.72/5.0 in LLM-based evaluation, and is well received by users across diverse group chat scenarios. Moreover, GroupGPT reduces token usage by up to 3x compared to baseline methods, while sanitizing user messages for privacy before cloud transmission. Code is available at: https://github.com/Eliot-Shen/GroupGPT
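The abstract describes a pipeline in which a lightweight local component decides *whether* to intervene, and only on a positive decision are sanitized messages forwarded to a large cloud model for response generation. The following is a minimal, hypothetical sketch of that control flow; the `should_intervene` heuristic and regex-based `sanitize` are placeholders standing in for the paper's small model and privacy-sanitization stage, not GroupGPT's actual implementation.

```python
import re


def should_intervene(messages):
    # Placeholder for the small timing model: here a trivial heuristic
    # (a question in the recent context) decides whether the assistant
    # should speak up at all.
    return any("?" in m for m in messages[-3:])


def sanitize(message):
    # Toy privacy sanitization: mask email addresses and phone-like
    # numbers before any text leaves the device.
    message = re.sub(r"[\w.]+@[\w.]+", "[EMAIL]", message)
    message = re.sub(r"\b\d{3}[-\s]?\d{3,4}[-\s]?\d{4}\b", "[PHONE]", message)
    return message


def group_assistant_turn(messages, cloud_llm):
    """One assistant turn: gate locally, sanitize, then generate remotely."""
    if not should_intervene(messages):
        # Staying silent costs zero cloud tokens -- this gating is the
        # source of the token savings the abstract describes.
        return None
    sanitized = [sanitize(m) for m in messages]
    return cloud_llm(sanitized)
```

The key design point this illustrates is the decoupling itself: the expensive generator is never invoked (and never sees raw user text) unless the cheap local gate fires, which is how token efficiency and privacy preservation are obtained from the same architectural split.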