HyPER-GAN: Hybrid Patch-Based Image-to-Image Translation for Real-Time Photorealism Enhancement
March 11, 2026
Authors: Stefanos Pasios, Nikos Nikolaidis
cs.AI
Abstract
Generative models are widely employed to enhance the photorealism of synthetic data for training computer vision algorithms. However, they often introduce visual artifacts that degrade the accuracy of these algorithms, and they demand high computational resources, which limits their applicability in real-time training or evaluation scenarios. In this paper, we propose the Hybrid Patch Enhanced Realism Generative Adversarial Network (HyPER-GAN), a lightweight image-to-image translation method based on a U-Net-style generator designed for real-time inference. The model is trained on paired synthetic and photorealism-enhanced images, complemented by a hybrid training strategy that incorporates matched patches from real-world data to improve visual realism and semantic consistency. Experimental results demonstrate that HyPER-GAN outperforms state-of-the-art paired image-to-image translation methods in terms of inference latency, visual realism, and semantic robustness. Moreover, we show that the proposed hybrid training strategy improves visual quality and semantic consistency compared to training the model solely on paired synthetic and photorealism-enhanced images. Code and pretrained models are publicly available for download at: https://github.com/stefanos50/HyPER-GAN
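To make the hybrid training idea concrete, the following is a minimal, illustrative sketch of patch mixing: a random subset of grid-aligned patches in a synthetic image is replaced with patches taken from a real image. This is not the authors' implementation; the function name `hybrid_patch_mix`, the grid-based patch layout, and same-location pasting (in place of whatever patch-matching scheme HyPER-GAN actually uses) are all assumptions for illustration only.

```python
import numpy as np

def hybrid_patch_mix(synthetic, real, patch=32, ratio=0.25, rng=None):
    """Replace a random fraction `ratio` of grid-aligned patches in
    `synthetic` (H x W x C array) with same-location patches from `real`.

    Illustrative sketch only: HyPER-GAN's actual strategy matches
    patches from real-world data; here matching is approximated by
    taking the patch at the identical spatial location.
    """
    rng = rng or np.random.default_rng(0)
    out = synthetic.copy()
    h, w = synthetic.shape[:2]
    # Enumerate the top-left corners of all non-overlapping grid cells.
    cells = [(y, x)
             for y in range(0, h - patch + 1, patch)
             for x in range(0, w - patch + 1, patch)]
    k = int(len(cells) * ratio)
    # Pick k cells at random and paste the real-image patch into each.
    for i in rng.permutation(len(cells))[:k]:
        y, x = cells[i]
        out[y:y + patch, x:x + patch] = real[y:y + patch, x:x + patch]
    return out
```

Such mixed images could then serve as additional training targets, exposing the generator to genuine real-world texture statistics while preserving the scene layout of the synthetic input.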