Transformer-based cross-modal multi-contrast network for ophthalmic diseases diagnosis

Yu, Yang; Zhu, Hongqing

doi:10.1016/j.bbe.2023.06.001

Artykuł - szczegóły

Tytuł artykułu

Transformer-based cross-modal multi-contrast network for ophthalmic diseases diagnosis

Autorzy

Yu Yang , Zhu Hongqing

Wybrane pełne teksty z tego czasopisma

Identyfikatory

DOI

10.1016/j.bbe.2023.06.001

Warianty tytułu

Języki publikacji

Abstrakty

Automatic diagnosis of various ophthalmic diseases from ocular medical images is vital to support clinical decisions. Most current methods employ a single imaging modality, especially 2D fundus images. Considering that the diagnosis of ophthalmic diseases can greatly benefit from multiple imaging modalities, this paper further improves the accuracy of diagnosis by effectively utilizing cross-modal data. In this paper, we propose Transformerbased cross-modal multi-contrast network for efficiently fusing color fundus photograph (CFP) and optical coherence tomography (OCT) modality to diagnose ophthalmic diseases. We design multi-contrast learning strategy to extract discriminate features from crossmodal data for diagnosis. Then channel fusion head captures the semantically shared information across different modalities and the similarity features between patients of the same category. Meanwhile, we use a class-balanced training strategy to cope with the situation that medical datasets are usually class-imbalanced. Our method is evaluated on public benchmark datasets for cross-modal ophthalmic disease diagnosis. The experimental results demonstrate that our method outperforms other approaches. The codes and models are available at https://github.com/ecustyy/tcmn.

Słowa kluczowe

ophthalmic disease diagnosis vision transformer cross modal contrastive learning feature fusion

diagnostyka chorób oczu transformator wizyjny fuzja funkcji

Wydawca

Nałęcz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences
Elsevier

Czasopismo

Biocybernetics and Biomedical Engineering

Rocznik

2023

Tom

Vol. 43, no. 3

Strony

507--527

Opis fizyczny

Bibliogr. 47 poz., rys., tab., wykr.

Twórcy

autor

Yu Yang

School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China

autor

Zhu Hongqing

hqzhu@ecust.edu.cn

School of Information Science & Engineering, East China University of Science and Technology, No. 130 Mei Long Road, Shanghai 200237, China

Bibliografia

[1] He J, Wang J, Han Z, Ma J, Wang C, Qi M. An interpretable transformer network for the retinal disease classification using optical coherence tomography. Sci Rep 2023;13(1):3637. https://doi.org/10.1038/s41598-023-30853-z.
[2] Hu X, Zhang LX, Gao L, Dai W, Han X, Lai YK, et al. Glim-net: Chronic glaucoma forecast transformer for irregularly sampled sequential fundus images. IEEE Trans Med Imag 2023. https://doi.org/10.1109/TMI.2023.3243692. 1-1.
[3] Wu J, Fang H, Li F, Fu H, Lin F, Li J. et al. Gamma challenge: glaucoma grading from multi-modality images. arXiv preprint arXiv:220206511; 2022.
[4] Toğaçar M, Ergen B, Tümen V. Use of dominant activations obtained by processing oct images with the cnns and slime mold method in retinal disease detection. Biocybernet Biomed Eng 2022;42(2):646-66. https://doi.org/10.1016/j. Bbe.2022.05.005.
[5] Wang W, Li X, Xu Z, Yu W, Zhao J, Ding D, et al. Learning two-stream cnn for multi-modal age-related macular degeneration categorization. IEEE J Biomed Health Informat 2022;26(8):4111-22. https://doi.org/10.1109/JBHI.2022.3171523.
[6] Palanisamy G, Shankar NB, Ponnusamy P, Gopi VP. A hybrid feature preservation technique based on luminosity and edge based contrast enhancement in color fundus images. Biocybernet Biomed Eng 2020;40(2):752-63. https://doi.org/ 10.1016/j.bbe.2020.02.006.
[7] Sambyal N, Saini P, Syal R, Gupta V. Modified u-net architecture for semantic segmentation of diabetic retinopathy images. Biocybernet Biomed Eng 2020;40 (3):1094-109. https://doi.org/10.1016/j.bbe.2020.05.006.
[8] Pathan S, Kumar P, Pai R, Bhandary SV. Automated detection of optic disc contours in fundus images using decision tree classifier. Biocybernet Biomed Eng 2020;40(1):52-64. https:// doi.org/10.1016/j.bbe.2019.11.003.
[9] Xu Z, Zou B, Liu Q. A dark and bright channel prior guided deep network for retinal image quality assessment. Biocybernet Biomed Eng 2022;42(3):772-83. https://doi.org/ 10.1016/j.bbe.2022.06.002.
[10] Liu X, Zhang D, Yao J, Tang J. Transformer and convolutional based dual branch network for retinal vessel segmentation in octa images. Biomed Signal Process Control 2023;83 104604. https://doi.org/10.1016/j.bspc.2023.104604.
[11] Elsharkawy M, Sharafeldeen A, Soliman A, Khalifa F, Ghazal M, El-Daydamony E, et al. A novel computer-aided diagnostic system for early detection of diabetic retinopathy using 3doct higher-order spatial appearance model. Diagnostics 2022;12(2). https://doi.org/10.3390/diagnostics12020461.
[12] He X, Deng Y, Fang L, Peng Q. Multi-modal retinal image classification with modality-specific attention network. IEEE Trans Med Imag 2021;40(6):1591-602. https://doi.org/10.1109/ TMI.2021.3059956.
[13] Bhati A, Gour N, Khanna P, Ojha A. Discriminative kernel convolution network for multi-label ophthalmic disease detection on imbalanced fundus image dataset. Comput Biol Med 2023;153 106519. https://doi.org/10.1016/j.compbiomed.2022.106519.
[14] Wang K, Xu C, Li G, Zhang Y, Zheng Y, Sun C. Combining convolutional neural networks and self-attention for fundus diseases identification. Sci Rep 2023;13(1):76. https://doi.org/ 10.1038/s41598-022-27358-6.
[15] Kuntha Pin Jee Ho Chang YN. Comparative study of transfer learning models for retinal disease diagnosis from fundus images. Comput Mater Continua 2022;70(3):5821-34. https:// doi.org/10.32604/cmc.2022.021943.
[16] Hsu HY, Chou YB, Jheng YC, Kao ZK, Huang HY, Chen HR, et al. Automatic segmentation of retinal fluid and photoreceptor layer from optical coherence tomography images of diabetic macular edema patients using deep learning and associations with visual acuity. Biomedicines 2022;10(6). https://doi.org/10.3390/biomedicines10061269.
[17] Xu Y, Fan Y. Dual-channel asymmetric convolutional neural network for an efficient retinal blood vessel segmentation in eye fundus images. Biocybernet Biomed Eng 2022;42 (2):695-706. https://doi.org/10.1016/j.bbe.2022.05.003.
[18] Meshkin RS, Armstrong GW, Hall NE, Rossin EJ, Hymowitz MB, Lorch AC. Effectiveness of a telemedicine program for triage and diagnosis of emergent ophthalmic conditions. Eye (Lond) 2023;37(2):325-31. https://doi.org/10.1038/s41433-022-01940-8.
[19] Fang H, Shang F, Fu H, Li F, Zhang X, Xu Y. Multi-modality images analysis: a baseline for glaucoma grading via deep learning. In: Fu H, Garvin MK, MacGillivray T, Xu Y, Zheng Y, editors. Ophthalmic Medical Image Analysis. Cham: Springer International Publishing; 2021. p. 139-47. https://doi.org/ 10.1007/978-3-030-87000-3_15.
[20] Hua CH, Kim K, Huynh-The T, You JI, Yu SY, Le-Tien T, et al. Convolutional network with twofold feature augmentation for diabetic retinopathy recognition from multi-modal images. IEEE J Biomed Health Informat 2021;25(7):2686-97. https://doi.org/10.1109/jbhi.2020.3041848.
[21] Liu R, Li Q, Xu F, Wang S, He J, Cao Y, et al. Application of artificial intelligence-based dual-modality analysis combining fundus photography and optical coherence tomography in diabetic retinopathy screening in a community hospital. BioMed Eng OnLine 2022;21(1):1-11. https://doi.org/10.1186/s12938-022-01018-2.
[22] Han J, Choi S, Park JI, Hwang JS, Han JM, Lee HJ, et al. Classifying neovascular age-related macular degeneration with a deep convolutional neural network based on optical coherence tomography images. Sci Rep 2022;12(1):2232. https://doi.org/10.1038/s41598-022-05903-7.
[23] Marrakchi Y, Makansi O, Brox T. Fighting class imbalance with contrastive learning. In: de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., et al., editors. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. Cham: Springer International Publishing; 2021, pp. 466-476. https://doi.org/10.1007/978-3-030-87199-4_44.
[24] Li X, Jia M, Islam MT, Yu L, Xing L. Self-supervised feature learning via exploiting multi-modal data for retinal disease diagnosis. IEEE Trans Med Imag 2020;39(12):4023-33. https:// doi.org/10.1109/TMI.2020.3008871.
[25] He K, Fan H, Wu Y, Xie S, Girshick R. Momentum contrast for unsupervised visual representation learning. In: 2020 IEEE/ CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020, pp. 9726-9735. https://doi.org/10.1109/ CVPR42600.2020.00975.
[26] Lin Z, Zhang D, Shi D, Xu R, Tao Q, Wu L, et al. Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation. J Biomed Inform 2023;138 104281. https://doi.org/10.1016/j.jbi.2023.104281.
[27] Khosla P, Teterwak P, Wang C, Sarna A, Tian Y, Isola P, et al. Supervised contrastive learning. Adv Neural Informat Proces Syst 2020;33:18661-73.
[28] Grill JB, Strub F, Altché F, Tallec C, Richemond P, Buchatskaya E, et al. Bootstrap your own latent-a new approach to selfsupervised learning. Adv Neural Informat Process Syst 2020;33:21271-84.
[29] Chen X, He K. Exploring simple siamese representation learning. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2021, pp. 15745-15753. https://doi.org/10.1109/CVPR46437.2021.01549.
[30] Deng Z, Cai Y, Chen L, Gong Z, Bao Q, Yao X, et al. Rformer: Transformer-based generative adversarial network for real fundus image restoration on a new clinical benchmark. IEEE J Biomed Health Informat 2022;26(9):4645-55. https://doi.org/ 10.1109/JBHI.2022.3187103.
[31] Philippi D, Rothaus K, Castelli M. A vision transformer architecture for the automated segmentation of retinal lesions in spectral domain optical coherence tomography images. Sci Rep 2023;13(1):517. https://doi.org/10.1038/s41598-023-27616-1.
[32] Oh W, Yoo H, Ha T, Oh S. Local selective vision transformer for depth estimation using a compound eye camera. Pattern Recogn Lett 2023;167:82-9. https://doi.org/10.1016/j. Patrec.2023.02.010.
[33] Domínguez C, Heras J, Mata E, Pascual V, Royo D, Ángel Zapata M. Binary and multi-class automated detection of age-related macular degeneration using convolutional- and transformer-based architectures. Comput Methods Programs Biomed 2023;229. https://doi.org/10.1016/j.cmpb.2022.107302 107302.
[34] Gu Z, Li Y, Wang Z, Kan J, Shu J, Wang Q. et al. Classification of diabetic retinopathy severity in fundus images using the vision transformer and residual attention. Computat Intell Neurosci 2023;2023. https://doi.org/10.1155/2023/1305583.
[35] Ju L, Wang X, Wang L, Liu T, Zhao X, Drummond T, et al. Relational subsets knowledge distillation for long-tailed retinal diseases recognition. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2021. p. 3-12.
[36] Galdran A, Carneiro G, González Ballester MA. Balanced-mixup for highly imbalanced medical image classification. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2021. p. 323-33.
[37] Li M, Zhang Y, Ji Z, Xie K, Yuan S, Liu Q. et al. Ipn-v2 and octa500: methodology and dataset for retinal image segmentation. arXiv preprint arXiv:201207261; 2020.
[38] Caron M, Touvron H, Misra I, Jégou H, Mairal J, Bojanowski P. et al. Emerging properties in self-supervised vision transformers. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021, pp. 9650-9660.
[39] Cai Z, Lin L, He H, Tang X. Corolla: An efficient multimodality fusion framework with supervised contrastive learning for glaucoma grading. In: 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI); 2022, pp. 1-4. https://doi.org/10.1109/ISBI52829.2022.9761712.
[40] Chen X, Xie S, He K. An empirical study of training self-supervised vision transformers. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV); 2021, pp. 9620-9629. https://doi.org/10.1109/ICCV48922.2021.00950.
[41] Mai S, Li Q, Zhao Q, Gao M. Few-shot transfer learning for hereditary retinal diseases recognition. In: International Conference on Medical Image Computing and ComputerAssisted Intervention. Springer; 2021. p. 97-107.
[42] Lee J, Oh J, Shin I, Kim Ys, Sohn DK, Kim Ts, et al. Moving from 2d to 3d: volumetric medical image classification for rectal cancer staging. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2022. p. 780-90.
[43] He Y, Liang W, Zhao D, Zhou HY, Ge W, Yu Y. et al. Attribute surrogates learning and spectral tokens pooling in transformers for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022, pp. 9119-9129.
[44] Liu H, Jiang X, Li X, Bao Z, Jiang D, Ren B. Nommer: Nominate synergistic context in vision transformer for visual recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022, pp. 12073- 12082.
[45] Nawaz M, Nazir T, Javed A, Tariq U, Yong HS, Khan MA, et al. An efficient deep learning approach to automatic glaucoma detection using optic disc and optic cup localization. Sensors 2022;22(2). https://doi.org/10.3390/s22020434.
[46] Tulsani A, Kumar P, Pathan S. Automated segmentation of optic disc and optic cup for glaucoma assessment using improved unet++ architecture. Biocybernet Biomed Eng 2021;41(2):819-32. https://doi.org/10.1016/j.bbe.2021.05.011.
[47] Malinowski K, Saeed K. An iris segmentation using harmony search algorithm and fast circle fitting with blob detection. Biocybernet Biomed Eng 2022;42(1):391-403. https://doi.org/ 10.1016/j.bbe.2022.02.010.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-f2ef3081-c0bf-4409-8120-c1d8e7ecf5f2