Article title

Perceptually Optimised Swin-Unet for Low-Light Image Enhancement

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
In this paper, we propose a novel approach to low-light image enhancement using a transformer-based Swin-Unet and a perceptually driven loss that incorporates Learned Perceptual Image Patch Similarity (LPIPS), a deep-feature distance aligned with human visual judgements. Specifically, our U-shaped Swin-Unet applies shifted-window self-attention across scales, with skip connections and multi-scale fusion, mapping a low-light RGB image to its enhanced version in a single pass. Training uses a compact objective combining Smooth-L1, LPIPS (AlexNet), MS-SSIM (detached), inverted PSNR, channel-wise colour consistency, and Sobel-gradient terms, with a small LPIPS weight chosen via ablation. This addresses the limits of purely pixel-wise losses by integrating perceptual and structural components that produce visually superior results. Experiments on LOL-v1, LOL-v2, and SID show that, while our Swin-Unet does not surpass the current state of the art on standard metrics, the LPIPS-based loss significantly improves perceptual quality and visual fidelity. These results confirm the viability of transformer-based U-Net architectures for low-light enhancement, particularly in resource-constrained settings, and point to larger variants and further tuning of the loss weights as future work.
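To make the training objective concrete, the following is a minimal PyTorch sketch of such a compound loss, assuming the publicly available lpips and pytorch-msssim packages. The weights (w_*), the reading of "inverted PSNR" and "(detached)", and the exact form of the colour-consistency term are illustrative assumptions, not the paper's published settings.

    # Minimal sketch of a compound loss in the spirit of the abstract:
    # Smooth-L1 + LPIPS (AlexNet) + MS-SSIM + inverted PSNR
    # + channel-wise colour consistency + Sobel-gradient terms.
    # Assumes `pip install lpips pytorch-msssim`; weights are placeholders.
    import torch
    import torch.nn.functional as F
    import lpips                          # Zhang et al.'s deep perceptual metric [48]
    from pytorch_msssim import ms_ssim

    lpips_alex = lpips.LPIPS(net='alex')  # AlexNet backbone, frozen by default


    def sobel_grad(x: torch.Tensor) -> torch.Tensor:
        """Per-channel Sobel gradient magnitude (rough edge map)."""
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                          device=x.device).view(1, 1, 3, 3)
        ky = kx.transpose(2, 3)
        c = x.shape[1]
        gx = F.conv2d(x, kx.expand(c, 1, 3, 3), padding=1, groups=c)
        gy = F.conv2d(x, ky.expand(c, 1, 3, 3), padding=1, groups=c)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)


    def compound_loss(pred, target, w_l1=1.0, w_lpips=0.1, w_ssim=0.2,
                      w_psnr=0.1, w_col=0.05, w_grad=0.1):
        """pred/target: (N, 3, H, W) in [0, 1]; H, W >= 160 for 5-level MS-SSIM."""
        l1 = F.smooth_l1_loss(pred, target)                   # pixel fidelity
        lp = lpips_alex(pred * 2 - 1, target * 2 - 1).mean()  # LPIPS expects [-1, 1]
        # "(detached)" is read here as detaching the target branch
        ssim_term = 1.0 - ms_ssim(pred, target.detach(), data_range=1.0)
        # "Inverted PSNR": one plausible reading, penalising low PSNR
        mse = F.mse_loss(pred, target)
        inv_psnr = 1.0 / (10.0 * torch.log10(1.0 / (mse + 1e-8)) + 1e-8)
        # Colour consistency: match per-channel spatial means
        col = F.l1_loss(pred.mean(dim=(2, 3)), target.mean(dim=(2, 3)))
        grad = F.l1_loss(sobel_grad(pred), sobel_grad(target))  # edge preservation
        return (w_l1 * l1 + w_lpips * lp + w_ssim * ssim_term
                + w_psnr * inv_psnr + w_col * col + w_grad * grad)

In a training loop this would be used as loss = compound_loss(model(low), gt), where model is any U-shaped Swin backbone that outputs the enhanced RGB image; lpips_alex should be moved to the same device as the model before training.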
Year
Pages
23-42
Physical description
Bibliography: 53 items, illustrations, tables, charts
Authors
  • Warsaw University of Technology, Warsaw, Poland
  • Warsaw University of Technology, Warsaw, Poland
Bibliography
  • [1] A. Brateanu, R. Balmez, A. Avram, C. Orhei, and C. Ancuti. LYT-NET: Lightweight YUV transformer-based network for low-light image enhancement. IEEE Signal Processing Letters 32:2065-2069, 2025. doi:10.1109/LSP.2025.3563125.
  • [2] Y. Cai, H. Bian, J. Lin, H. Wang, R. Timofte, et al. Retinexformer: One-stage Retinex-based transformer for low-light image enhancement. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 12504-12513, 2023. doi:10.1109/ICCV51070.2023.01149.
  • [3] H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, et al. Swin-Unet: Unet-like pure transformer for medical image segmentation. In: Computer Vision - ECCV 2022 Workshops, vol. 13803 of Lecture Notes in Computer Science, 2023. doi:10.1007/978-3-031-25066-8_9.
  • [4] C. Chen, Q. Chen, J. Xu, and V. Koltun. Learning to see in the dark. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), pp. 3291-3300, 2018. doi:10.1109/CVPR.2018.00347.
  • [5] H. Chen, Y. Wang, T. Guo, C. Xu, Y. Deng, et al. Pre-trained image processing transformer. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), pp. 12299-12310, 2021. doi:10.1109/CVPR46437.2021.01212.
  • [6] Z. Cui, K. Li, L. Gu, S. Su, P. Gao, et al. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. In: 33rd British Machine Vision Conference (BMVC 2022), 2022. https://bmvc2022.mpi-inf.mpg.de/238/.
  • [7] C.-M. Fan, T.-J. Liu, and K.-H. Liu. Half wavelet attention on M-Net+ for low-light image enhancement. In: 2022 IEEE International Conference on Image Processing (ICIP 2022), pp. 3878-3882, 2022. doi:10.1109/ICIP46576.2022.9897503.
  • [8] Y. Feng, C. Zhang, P. Wang, P. Wu, Q. Yan, et al. You only need one color space: An efficient network for low-light image enhancement. arXiv, arXiv:2402.05809, 2024. doi:10.48550/arXiv.2402.05809.
  • [9] X. Fu, Y. Liao, D. Zeng, Y. Huang, X.-P. Zhang, et al. A probabilistic method for image enhancement with simultaneous illumination and reflectance estimation. IEEE Transactions on Image Processing 24(12):4965-4977, 2015. doi:10.1109/TIP.2015.2474701.
  • [10] Z. Gu, F. Li, F. Fang, and G. Zhang. A novel Retinex-based fractional-order variational model for images with severely low light. IEEE Transactions on Image Processing 29:7233-7247, 2020. doi:10.1109/TIP.2019.2958144.
  • [11] C. Guo, C. Li, J. Guo, C. C. Loy, J. Hou, et al. Zero-reference deep curve estimation for low-light image enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 1780-1789, 2020. doi:10.1109/CVPR42600.2020.00185.
  • [12] X. Guo and Q. Hu. Low-light image enhancement via breaking down the darkness. International Journal of Computer Vision 131:48-66, 2023. doi:10.1007/s11263-022-01667-9.
  • [13] H. Hou, Y. Hou, Y. Shi, B. Wei, and J. Xu. NLHD: A pixel-level non-local Retinex model for low-light image enhancement. arXiv, arXiv:2106.06971, 2021. doi:10.48550/arXiv.2106.06971.
  • [14] J. H. Jang, Y. Bae, and J. B. Ra. Contrast-enhanced fusion of multisensor images using subband-decomposed multiscale Retinex. IEEE Transactions on Image Processing 21(8):3479-3490, 2012. doi:10.1109/TIP.2012.2197014.
  • [15] Y. Jiang, X. Gong, D. Liu, Y. Cheng, C. Fang, et al. EnlightenGAN: Deep light enhancement without paired supervision. IEEE Transactions on Image Processing 30:2340-2349, 2021. doi:10.1109/TIP.2021.3051462.
  • [16] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25 (NIPS 2012), vol. 25, pp. 1097-1105, 2012. https://proceedings.neurips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html.
  • [17] E. H. Land. The Retinex theory of color vision. Scientific American 237(6):108-128, 1977. doi:10.1038/scientificamerican1277-108.
  • [18] J. Li, J. Li, F. Fang, F. Li, and G. Zhang. Luminance-aware pyramid network for low-light image enhancement. IEEE Transactions on Multimedia 23:3153-3165, 2021. doi:10.1109/TMM.2020.3021243.
  • [19] R. Liu, L. Ma, J. Zhang, X. Fan, and Z. Luo. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), pp. 10561-10570, 2021. doi:10.1109/CVPR46437.2021.01042.
  • [20] Y. Liu, T. Huang, W. Dong, F. Wu, X. Li, et al. Low-light image enhancement with multi-stage residue quantization and brightness-aware attention. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), pp. 12106-12115, 2023. doi:10.1109/ICCV51070.2023.01115.
  • [21] Z. Liu, H. Hu, Y. Lin, Z. Yao, Z. Xie, et al. Swin transformer V2: Scaling up capacity and resolution. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 11999-12009, 2022. doi:10.1109/CVPR52688.2022.01170.
  • [22] A. Mittal, A. K. Moorthy, and A. C. Bovik. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing 21(12):4695-4708, 2012. doi:10.1109/TIP.2012.2214050.
  • [23] A. Mittal, R. Soundararajan, and A. C. Bovik. Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters 20(3):209-212, 2013. doi:10.1109/LSP.2012.2227726.
  • [24] S. Moran, P. Marza, S. McDonagh, S. Parisot, and G. Slabaugh. DeepLPF: Deep Local Parametric Filters for image enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 12823-12832, 2020. doi:10.1109/CVPR42600.2020.01284.
  • [25] M. Noyan, A. R. Gosthipaty, R. Wightman, and P. Cuenca. timm PyTorch Image Models. In: Hugging Face, 2025. https://huggingface.co/timm.
  • [26] NVIDIA Corporation. NVIDIA cuDNN. In: NVIDIA DEVELOPER, 2025. https://developer.nvidia.com/cudnn.
  • [27] S. Park, S. Yu, B. Moon, S. Ko, and J.-I. Paik. Low-light image enhancement using variational optimization-based Retinex model. IEEE Transactions on Consumer Electronics 63(2):178-184, 2017. doi:10.1109/TCE.2017.014847.
  • [28] PyTorch. Previous PyTorch Versions, 2025. https://pytorch.org/get-started/previous-versions/.
  • [29] A. Rogozhnikov. Einops: Clear and reliable tensor manipulations with Einstein-like notation. In: International Conference on Learning Representations (ICLR 2022), 2022. https://openreview.net/forum?id=oapKSVM2bcj.
  • [30] A. Rogozhnikov. einops, 2025. https://einops.rocks/.
  • [31] H. Shakibania, S. Raoufi, and H. Khotanlou. CDAN: Convolutional dense attention-guided network for low-light image enhancement. Digital Signal Processing 156:104802, 2025. doi:10.1016/j.dsp.2024.104802.
  • [32] A. Wang, Y. Li, J. Peng, Y. Ma, X. Wang, et al. Real-time image enhancer via learnable spatial-aware 3D lookup tables. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), pp. 2451-2460, 2021. doi:10.1109/ICCV48922.2021.00247.
  • [33] R. Wang, Q. Zhang, C.-W. Fu, X. Shen, W.-S. Zheng, et al. Underexposed photo enhancement using deep illumination estimation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), pp. 6842-6850, 2019. doi:10.1109/CVPR.2019.00701.
  • [34] T. Wang, K. Zhang, T. Shen, W. Luo, B. Stenger, et al. Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2023), vol. 37 no. 3, pp. 2654-2662, 2023. doi:10.1609/aaai.v37i3.25364.
  • [35] Y. Wang, R. Wan, W. Yang, H. Li, L.-P. Chau, et al. Low-light image enhancement with normalizing flow. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2022), vol. 36 no. 3, pp. 2604-2612, 2022. doi:10.1609/aaai.v36i3.20162.
  • [36] Z. Wang, X. Cun, J. Bao, W. Zhou, J. Liu, et al. Uformer: A general U-shaped transformer for image restoration. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 17662-17672, 2022. doi:10.1109/CVPR52688.2022.01716.
  • [37] C. Wei, W. Wang, W. Yang, and J. Liu. Deep Retinex decomposition for low-light enhancement. In: Proceedings of the British Machine Vision Conference (BMVC 2018), 2018. https://bmva-archive.org.uk/bmvc/2018/contents/papers/0451.pdf.
  • [38] J. Wen, C. Wu, T. Zhang, Y. Yu, and P. Swierczynski. Self-reference deep adaptive curve estimation for low-light image enhancement. arXiv, arXiv:2308.08197, 2023. doi:10.48550/arXiv.2308.08197.
  • [39] K. Xu, X. Yang, B. Yin, and R. W. H. Lau. Learning to restore low-light images via decomposition-and-enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 2278-2287, 2020. doi:10.1109/CVPR42600.2020.00235.
  • [40] X. Xu, R. Wang, C.-W. Fu, and J. Jia. SNR-aware low-light image enhancement. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 17737-17747, 2022. doi:10.1109/CVPR52688.2022.01719.
  • [41] W. Yang, S. Wang, Y. Fang, Y. Wang, and J. Liu. Band representation-based semi-supervised low-light image enhancement: Bridging the gap between signal fidelity and perceptual quality. IEEE Transactions on Image Processing 30:3461-3473, 2021. doi:10.1109/TIP.2021.3062184.
  • [42] W. Yang, W. Wang, H. Huang, S. Wang, and J. Liu. Sparse gradient regularized deep Retinex network for robust low-light image enhancement. IEEE Transactions on Image Processing 30:2072-2086, 2021. doi:10.1109/TIP.2021.3050850.
  • [43] X. Yi, H. Xu, H. Zhang, L. Tang, and J. Ma. Diff-Retinex: Rethinking low-light image enhancement with a generative diffusion model. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6725-6735, 2023. doi:10.1109/ICCV51070.2023.01130.
  • [44] D. You, J. Tao, Y. Zhang, and M. Zhang. Low-light image enhancement based on gray scale transformation and improved Retinex. Infrared Technology (Hongwai Jishu) 45(2):161-170, 2023.
  • [45] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, et al. Learning enriched features for real image restoration and enhancement. In: European Conference on Computer Vision (ECCV 2020), vol. 12370 of Lecture Notes in Computer Science, pp. 492-511, 2020. doi:10.1007/978-3-030-58595-2_30.
  • [46] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, et al. Restormer: Efficient transformer for high-resolution image restoration. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 5718-5729, 2022. doi:10.1109/CVPR52688.2022.00564.
  • [47] H. Zeng, J. Cai, L. Li, Z. Cao, and L. Zhang. Learning image-adaptive 3D lookup tables for high-performance photo enhancement in real-time. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(4):2058-2073, 2022. doi:10.1109/TPAMI.2020.3005590.
  • [48] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), pp. 586-595, 2018. doi:10.1109/CVPR.2018.00068.
  • [49] Y. Zhang, X. Guo, J. Ma, W. Liu, and J. Zhang. Beyond brightening low-light images. International Journal of Computer Vision 129(4):1013-1037, 2021. doi:10.1007/s11263-020-01407-x.
  • [50] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu. Residual dense network for image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(7):2480-2495, 2021. arXiv:1812.10477.
  • [51] Y. Zhang, J. Zhang, and X. Guo. Kindling the darkness: A practical low-light image enhancer. In: Proceedings of the ACM International Conference on Multimedia (ACM MM), pp. 1632-1640, 2019. doi:10.1145/3343031.3350926.
  • [52] D. Zhou, Z. Yang, and Y. Yang. Pyramid diffusion models for low-light image enhancement. arXiv, arXiv:2305.10028, 2023. doi:10.48550/arXiv.2305.10028.
  • [53] S. Zhou, C. Li, and C. C. Loy. LEDNet: Joint low-light enhancement and deblurring in the dark. In: European Conference on Computer Vision (ECCV), vol. 13666 of Lecture Notes in Computer Science, pp. 573-589, 2022. doi:10.1007/978-3-031-20068-7_33.
Document type
YADDA identifier
bwmeta1.element.baztech-dd1beda6-1c89-4045-adc1-cabcfd57a780