G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness

May 8, 2025
Authors: Jaehyun Jeon, Jang Han Yoon, Min Soo Kim, Sumin Shim, Yejin Choi, Hanbin Kim, Youngjae Yu
cs.AI

Abstract

Evaluating user interface (UI) design effectiveness extends beyond aesthetics to influencing user behavior, a principle central to Design Persuasiveness. A/B testing is the predominant method for determining which UI variations drive higher user engagement, but it is costly and time-consuming. While recent Vision-Language Models (VLMs) can automate UI analysis, current approaches focus on isolated design attributes rather than comparative persuasiveness, the key factor in optimizing user interactions. To address this, we introduce WiserUI-Bench, a benchmark designed for the Pairwise UI Design Persuasiveness Assessment task, featuring 300 real-world UI image pairs labeled with A/B test results and expert rationales. Additionally, we propose G-FOCUS, a novel inference-time reasoning strategy that enhances VLM-based persuasiveness assessment by reducing position bias and improving evaluation accuracy. Experimental results show that G-FOCUS surpasses existing inference strategies in consistency and accuracy for pairwise UI evaluation. By promoting VLM-driven evaluation of UI persuasiveness, our work offers an approach to complement A/B testing, advancing scalable UI preference modeling and design optimization. Code and data will be released publicly.
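
The abstract does not say how consistency and accuracy are computed. As a rough illustration of what position bias means in a pairwise setting, the sketch below queries a judge in both presentation orders and counts a pair as consistent only when the same design is chosen each time. This is not G-FOCUS itself and not the authors' evaluation code: the `judge` callable, the `"first"`/`"second"` return convention, and the metric names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical judge interface: takes the two UIs in presentation order
# and returns "first" or "second" for the one it finds more persuasive.
Judge = Callable[[str, str], str]


def evaluate_pairwise(judge: Judge, pairs: List[Tuple[str, str, str]]) -> Dict[str, float]:
    """Score a judge on (ui_a, ui_b, winner) triples, where winner is "a" or "b".

    Each pair is shown in both orders. A pair is consistent when the judge
    picks the same design regardless of order, and it counts as correct
    only if that consistent pick matches the labeled winner.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")

    consistent = 0
    correct = 0
    for ui_a, ui_b, winner in pairs:
        # Forward order: ui_a is shown first.
        pick_forward = "a" if judge(ui_a, ui_b) == "first" else "b"
        # Reversed order: ui_b is shown first.
        pick_reversed = "b" if judge(ui_b, ui_a) == "first" else "a"

        if pick_forward == pick_reversed:
            consistent += 1
            if pick_forward == winner:
                correct += 1

    total = len(pairs)
    return {
        "consistency": consistent / total,
        "consistent_accuracy": correct / total,
    }


def always_first(first_ui: str, second_ui: str) -> str:
    """A maximally position-biased toy judge: always prefers the first slot."""
    return "first"


def prefers_longer(first_ui: str, second_ui: str) -> str:
    """An order-independent toy judge: prefers the longer string."""
    return "first" if len(first_ui) > len(second_ui) else "second"
```

A judge that always picks the first slot scores zero consistency under this measure, which is the failure mode that a bias-reducing inference strategy would aim to avoid.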
