Macaron-A2UI: A Generative UI Model for Personal Agents

Abstract

As personal agents evolve to handle complex, user-centric tasks, static plain-text chat is rapidly becoming a bottleneck. Generative UI emerges as the necessary new interface layer, dynamically synthesizing the right controls, options, and state from the interaction context in real time. We present Macaron-A2UI, a model for Generative UI in personal agents. Our goal is to move beyond text-only interaction by enabling agents to generate natural language together with lightweight, executable UI actions for information collection, preference refinement, confirmation, and multi-goal organization. We build a large-scale Generative UI corpus from heterogeneous dialogue sources, introduce A2UI-Bench for controlled evaluation, and train 30B-, 235B-, and 754B-parameter models with parameter-efficient LoRA-based supervised fine-tuning followed by reward-driven reinforcement learning. The best Macaron-A2UI model reaches 75.6 overall on A2UI-Bench without explicit schema hints, surpassing the strongest full-schema frontier baseline. We release the models, benchmark, and evaluation protocol to support future work on Generative UI for personal agents.
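
To make the idea of an agent turn that pairs natural language with lightweight UI actions concrete, the following is a minimal illustrative sketch. It is not the Macaron-A2UI output schema: the names `UIAction`, `AgentTurn`, `render_plain`, and the four action kinds are hypothetical, chosen only to mirror the four uses named in the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Literal

# Hypothetical action kinds, named after the four uses listed in the abstract.
ActionKind = Literal["collect_info", "refine_preference", "confirm", "organize_goals"]


@dataclass
class UIAction:
    """One lightweight UI element attached to an agent turn (illustrative only)."""
    kind: ActionKind
    label: str
    options: List[str] = field(default_factory=list)


@dataclass
class AgentTurn:
    """A single agent response: free text plus zero or more UI actions."""
    text: str
    actions: List[UIAction] = field(default_factory=list)


def render_plain(turn: AgentTurn) -> str:
    """Fallback rendering for a text-only client: one line per UI action."""
    lines = [turn.text]
    for action in turn.actions:
        suffix = f" ({' / '.join(action.options)})" if action.options else ""
        lines.append(f"[{action.kind}] {action.label}{suffix}")
    return "\n".join(lines)


# Example turn: the agent answers in text and offers two UI actions.
turn = AgentTurn(
    text="I found two flights that fit your dates.",
    actions=[
        UIAction("refine_preference", "Preferred departure time", ["Morning", "Evening"]),
        UIAction("confirm", "Book the selected flight"),
    ],
)
print(render_plain(turn))
```

Keeping the text and the actions in one structure means a client that cannot render controls can still fall back to the plain-text form, which is one plausible design choice among several.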