A Preliminary Study for GPT-4o on Image Restoration
May 8, 2025
Authors: Hao Yang, Yan Yang, Ruikun Zhang, Liyuan Pan
cs.AI
Abstract
OpenAI's GPT-4o model, integrating multi-modal inputs and outputs within an
autoregressive architecture, has demonstrated unprecedented performance in
image generation. In this work, we investigate its potential impact on the
image restoration community. We present the first systematic evaluation of
GPT-4o across diverse restoration tasks. Our experiments reveal that, although
restoration outputs from GPT-4o are visually appealing, they often fall short in
pixel-level structural fidelity when compared to ground-truth images. Common
issues are variations in image proportions, shifts in object positions and
quantities, and changes in viewpoint. To address this, taking image dehazing,
deraining, and low-light enhancement as representative case studies, we show
that GPT-4o's outputs can serve as powerful visual priors, substantially
enhancing the performance of existing dehazing networks. This study offers practical
guidelines and a baseline framework to facilitate the integration of GPT-4o
into future image restoration pipelines. We hope this study of GPT-4o image
restoration will accelerate innovation in the broader field of image
generation. To support further research, we will release GPT-4o-restored images from
over 10 widely used image restoration datasets.