HyPER-GAN: Hybrid Patch-Based Image-to-Image Translation for Real-Time Photorealism Enhancement
March 11, 2026
Authors: Stefanos Pasios, Nikos Nikolaidis
cs.AI
Abstract
Generative models are widely employed to enhance the photorealism of synthetic data used to train computer vision algorithms. However, they often introduce visual artifacts that degrade the accuracy of these algorithms, and they demand substantial computational resources, limiting their applicability in real-time training or evaluation scenarios. In this paper, we propose the Hybrid Patch Enhanced Realism Generative Adversarial Network (HyPER-GAN), a lightweight image-to-image translation method built on a U-Net-style generator and designed for real-time inference. The model is trained on paired synthetic and photorealism-enhanced images, complemented by a hybrid training strategy that incorporates matched patches from real-world data to improve visual realism and semantic consistency. Experimental results demonstrate that HyPER-GAN outperforms state-of-the-art paired image-to-image translation methods in inference latency, visual realism, and semantic robustness. Moreover, we show that the proposed hybrid training strategy indeed improves visual quality and semantic consistency compared to training solely on paired synthetic and photorealism-enhanced images. Code and pretrained models are publicly available at: https://github.com/stefanos50/HyPER-GAN
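The abstract does not detail how matched real-world patches enter the training targets; the full procedure is in the paper and the linked repository. Purely as an illustration of the general idea, the sketch below builds a "hybrid" training target by replacing each patch of a synthetic image with a blend of that patch and its nearest real-world patch under a simple L2 criterion. All function names (`extract_patches`, `match_patch`, `hybrid_target`), the distance metric, and the blending weight are hypothetical simplifications, not the authors' method.

```python
import numpy as np

def extract_patches(img, patch, stride):
    """Slice an HxWxC image into a stack of patch x patch tiles."""
    H, W, _ = img.shape
    tiles, coords = [], []
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            tiles.append(img[y:y + patch, x:x + patch])
            coords.append((y, x))
    return np.stack(tiles), coords

def match_patch(syn_patch, real_patches):
    """Index of the real patch closest to syn_patch (mean squared error)."""
    d = np.mean((real_patches - syn_patch) ** 2, axis=(1, 2, 3))
    return int(np.argmin(d))

def hybrid_target(synthetic, real, patch=8, stride=8, mix=0.5):
    """Illustrative hybrid target: each synthetic patch is blended with
    its best-matching real-world patch (convex combination, weight mix)."""
    target = synthetic.copy()
    real_patches, _ = extract_patches(real, patch, stride)
    syn_patches, coords = extract_patches(synthetic, patch, stride)
    for p, (y, x) in zip(syn_patches, coords):
        j = match_patch(p, real_patches)
        target[y:y + patch, x:x + patch] = (1 - mix) * p + mix * real_patches[j]
    return target
```

In an actual GAN training loop, such hybrid targets would supplement the paired photorealism-enhanced images as supervision, which is the role the abstract attributes to the matched real-world patches.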