A Preliminary Study for GPT-4o on Image Restoration

Abstract

OpenAI's GPT-4o model, which integrates multi-modal inputs and outputs within an autoregressive architecture, has demonstrated unprecedented performance in image generation. In this work, we investigate its potential impact on the image restoration community. We present the first systematic evaluation of GPT-4o across diverse restoration tasks. Our experiments reveal that, although restoration outputs from GPT-4o are visually appealing, they often fall short in pixel-level structural fidelity when compared to ground-truth images. Common issues include changes in image proportions, shifts in object positions and quantities, and changes in viewpoint. To address this, taking image dehazing, deraining, and low-light enhancement as representative case studies, we show that GPT-4o's outputs can serve as powerful visual priors, substantially enhancing the performance of existing dehazing networks. This study offers practical guidelines and a baseline framework to facilitate the integration of GPT-4o into future image restoration pipelines. We hope that the study of GPT-4o image restoration will accelerate innovation in the broader field of image generation. To support further research, we will release GPT-4o-restored images from over 10 widely used image restoration datasets.
