Article title

Attention-based U-Net for image demoiréing

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Image demoiréing is a particular example of an image restoration problem. Moiré is an interference pattern generated by overlaying similar but slightly offset patterns. In this paper, we present a deep learning-based algorithm that reduces moiré distortions. The proposed solution includes a description of the cross-sampling procedure, a training dataset management method optimized for limited computing resources. The suggested neural network architecture is based on the Attention U-Net structure, an exceptionally effective model that had not previously been applied to image demoiréing. Its main improvement over the U-Net network is the introduction of attention gates, additional computing operations that make the algorithm focus on target structures. We also examined three MSE- and SSIM-based loss functions. The SSIM index is used to predict the perceived quality of digital images and videos, and a similar approach has been applied in various computer vision areas. The authors' main contributions to the image demoiréing problem are the use of a novel architecture for this task, an innovative two-part loss function, and the atypical use of the cross-sampling training procedure.
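The two-part loss mentioned in the abstract combines a pixel-wise MSE term with an SSIM-based term. The exact weighting and SSIM implementation used by the authors are not given in this record, so the sketch below is only an illustrative PyTorch-style example, assuming a simplified global SSIM (no sliding Gaussian window) and a hypothetical weighting factor alpha:

    import torch
    import torch.nn.functional as F

    def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
        # Simplified SSIM computed over the whole tensors (no sliding window),
        # assuming inputs scaled to the [0, 1] range.
        mu_x, mu_y = x.mean(), y.mean()
        var_x, var_y = x.var(), y.var()
        cov_xy = ((x - mu_x) * (y - mu_y)).mean()
        num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
        den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
        return num / den

    def demoire_loss(output, target, alpha=0.8):
        # Hypothetical two-part loss: weighted sum of MSE and (1 - SSIM).
        # alpha is an illustrative value, not a parameter reported in the paper.
        mse = F.mse_loss(output, target)
        ssim = ssim_global(output, target)
        return alpha * mse + (1.0 - alpha) * (1.0 - ssim)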
Year
Pages
3-17
Physical description
Bibliography: 29 items, illustrations, tables, charts
Authors
Bibliography
  • [1] I. Alhashim and P. Wonka. High quality monocular depth estimation via transfer learning. arXiv, 2018. doi:10.48550/arXiv.1812.11941.
  • [2] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu. Residual dense network for image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(7):2480-2495, 2020. doi:10.1109/tpami.2020.2968521.
  • [3] P. Anderson, X. He, C. Buehler, et al. Bottom-up and top-down attention for image captioning and visual question answering. In Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 6077-6086, Salt Lake City, Utah, 18-23 Jun 2018. IEEE. doi:10.1109/CVPR.2018.00636.
  • [4] X. Cheng, Z. Fu, and J. Yang. Multi-scale dynamic feature encoding network for image demoiréing. In Proc. 2019 IEEE/CVF Int. Conf. Computer Vision Workshops (ICCVW), pages 3486-3493, Seoul, Korea, 27 Oct. - 2 Nov. 2019. IEEE. doi:10.1109/ICCVW.2019.00432.
  • [5] J. Gurrola-Ramos, O. Dalmau, and T. E. Alarcon. A residual dense U-Net neural network for image denoising. IEEE Access, 9:31742-31754, 2021. doi:10.1109/ACCESS.2021.3061062.
  • [6] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao. Deep ordinal regression network for monocular depth estimation. In Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), 2018. doi:10.1109/cvpr.2018.00214.
  • [7] B. He, C. Wang, B. Shi, and L. Y. Duan. Mop moiré patterns using MopNet. In Proc. 2019 IEEE/CVF Int. Conf. Computer Vision (ICCV), pages 2424-2432, Seoul, Korea, 27 Oct. - 2 Nov. 2019. IEEE. doi:10.1109/ICCV.2019.00251.
  • [8] B. He, C. Wang, B. Shi, and L.-Y. Duan. FHDe2Net: Full high definition demoireing network. In A. Vedaldi, H. Bischof, T. Brox, and J.-M. Frahm, editors, Computer Vision - Proc. ECCV 2020, volume 12367 of Lecture Notes in Computer Science, pages 713-729, Glasgow, UK, 23-28 Aug. 2020. Springer International Publishing. doi:10.1007/978-3-030-58542-6_43.
  • [9] X. Hu, M. A. Naiel, A. Wong, et al. RUNet: A robust UNet architecture for image super-resolution. In Proc. 2019 IEEE/CVF Conf. Computer Vision and Pattern Recognition Workshops (CVPRW), pages 505-507, Long Beach, California, 16-20 Jun 2019. IEEE. doi:10.1109/CVPRW.2019.00073.
  • [10] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In Y. Bengio and Y. LeCun, editors, Proc. 3rd Int. Conf. Learning Representations, ICLR 2015, San Diego, CA, 7-9 May 2015. Accessible in arXiv. doi:10.48550/arXiv.1412.6980.
  • [11] O. Kupyn, T. Martyniuk, J. Wu, and Z. Wang. DeblurGAN-v2: Deblurring (orders-of-magnitude) faster and better. In Proc. 2019 IEEE/CVF Int. Conf. Computer Vision (ICCV), pages 8877-8886, Seoul, Korea, 27 Oct - 2 Nov 2019. IEEE. doi:10.1109/ICCV.2019.00897.
  • [12] Z. Lu and Y. Chen. Single image super resolution based on a modified U-net with mixed gradient loss. Signal, Image and Video Processing, 16(5): 1143-1151, 2022. doi:10.1007/s11760-021-02063-5.
  • [13] S. Nah, T. H. Kim, and K. M. Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In Proc. 2017 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pages 257-265, Honolulu, Hawaii, 21-26 Jul 2017. IEEE. doi:10.1109/CVPR.2017.35.
  • [14] O. Oktay, J. Schlemper, L. L. Folgoc, et al. Attention U-Net: Learning where to look for the pancreas. arXiv, 2018. doi:10.48550/arXiv.1804.03999.
  • [15] G. Palubinskas. Image similarity/distance measures: what is really behind MSE and SSIM? International Journal of Image and Data Fusion, 8(1):32-53, 2016. doi:10.1080/19479832.2016.1273259.
  • [16] A. Paszke, S. Gross, F. Massa, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 - Proc. 33rd Conf. Neural Information Processing Systems (NeurIPS 2019), volume 11, pages 8024-8035, Vancouver, Canada, 8-14 Dec. 2019. Accessible in arXiv. doi:10.48550/arXiv.1912.01703.
  • [17] B. D. Ripley. Pattern Recognition and Neural Networks. Cambridge University Press, 1996. doi:10.1017/CBO9780511812651.
  • [18] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In N. Navab, J. Hornegger, W. M. Wells, et al., editors, Medical Image Computing and Computer-Assisted Intervention - Proc. MICCAI 2015, volume 9351 of Lecture Notes in Computer Science, pages 234-241, Munich, Germany, 5-9 Oct. 2015. Springer International Publishing. doi:10.1007/978-3-319-24574-4_28.
  • [19] De R. I. M. Setiadi. PSNR vs SSIM: imperceptibility quality assessment for image steganography. Multimedia Tools and Applications, 80(6): 8423-8444, 2021. doi:10.1007/s11042-020-10035-z.
  • [20] Y. Sun, Y. Yu, and W. Wang. Moiré photo restoration using multiresolution convolutional neural networks, 8 May 2018. https://paperswithcode.com/paper/moire-photo-restoration-using-multiresolution. [Accessed: May, 2022].
  • [21] F.-J. Tsai, Y.-T. Peng, Y.-Y. Lin, et al. Stripformer: Strip transformer for fast image deblurring. In S. Avidan, G. Brostow, M. Cissé, et al., editors, Computer Vision - Proc. ECCV 2022, volume 13679, Part XIX of Lecture Notes in Computer Science, pages 146-162, Tel Aviv, Israel, 23-27 Oct. 2022. Springer Nature Switzerland. doi:10.1007/978-3-031-19800-7_9.
  • [22] Z. Tu, H. Talebi, H. Zhang, et al. MAXIM: Multi-Axis MLP for image processing. In Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 5759-5770, New Orleans, Louisiana, 18-24 Jun 2022. IEEE. doi:10.1109/CVPR52688.2022.00568.
  • [23] X. Wang, Y. Li, H. Zhang, and Y. Shan. Towards real-world blind face restoration with generative facial prior. In Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 9164-9174, Virtual conference, 20-25 Jun 2021. IEEE. doi:10.1109/CVPR46437.2021.00905.
  • [24] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Processing, 13(4): 600-612, Apr. 2004. doi:10.1109/TIP.2003.819861.
  • [25] Z. Wang, X. Cun, J. Bao, et al. Uformer: A general U-shaped transformer for image restoration. In Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 17662-17672, New Orleans, Louisiana, 18-24 Jun 2022. IEEE. doi:10.1109/CVPR52688.2022.01716.
  • [26] Y. Sun, Y. Yu, and W. Wang. Moiré photo restoration using multiresolution convolutional neural networks. IEEE Trans. Image Processing, 27(8):4160-4172, 2018. doi:10.1109/tip.2018.2834737.
  • [27] X. Yu, P. Dai, W. Li, L. Ma, et al. Towards efficient and scale-robust ultra-high-definition image demoiréing. In S. Avidan, G. Brostow, M. Cissé, et al., editors, Computer Vision - Proc. ECCV 2022, volume 13678, Part XVIII of Lecture Notes in Computer Science, pages 646-662, Tel Aviv, Israel, 23-27 Oct. 2022. Springer Nature Switzerland. doi:10.1007/978-3-031-19797-0_37.
  • [28] S. W. Zamir, A. Arora, S. Khan, et al. Restormer: Efficient transformer for high-resolution image restoration. In Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 5718-5729, New Orleans, Louisiana, 18-24 Jun 2022. IEEE. doi:10.1109/CVPR52688.2022.00564.
  • [29] B. Zheng, S. Yuan, G. Slabaugh, and A. Leonardis. Image demoiréing with learnable bandpass filters. In Proc. 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), pages 3633-3642, Virtual conference, 14-19 Jun 2020. IEEE. doi:10.1109/CVPR42600.2020.00369.
Notes
Record created with funds from the Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the "Społeczna odpowiedzialność nauki" (Social Responsibility of Science) programme - module: popularisation of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-f0c59eba-4cbc-42f3-a10b-3fe44cf7fca1