Article title

Few-shot medical image classification with simple shape and texture text descriptors using vision-language models

Languages of publication
EN
Abstracts
EN
Deep learning methods are gaining momentum in radiology. In this work, we investigate the usefulness of vision-language models (VLMs) and large language models for binary few-shot classification of medical images. We utilize the GPT-4 model to generate text descriptors that encapsulate the shape and texture characteristics of objects in medical images. Subsequently, these GPT-4-generated descriptors, alongside VLMs pre-trained on natural images, are employed to classify chest X-rays and breast ultrasound images. Our results indicate that few-shot classification of medical images using VLMs and GPT-4-generated descriptors is a viable approach. However, accurate classification requires the exclusion of certain descriptors from the calculations of the classification scores. Moreover, we assess the ability of VLMs to evaluate shape features in breast mass ultrasound images. This is performed by comparing VLM-based results generated for shape-related text descriptors with the actual values of the shape features calculated using segmentation masks. We further investigate the degree of variability among the sets of text descriptors produced by GPT-4. Our work provides several important insights into the application of VLMs for medical image analysis.
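For illustration, the descriptor-based classification scheme described in the abstract can be written down in a few lines. The sketch below is a minimal reading of the approach (in the spirit of refs [6], [7], [21]), not the paper's exact implementation: the OpenCLIP model choice, the prompt template, the descriptor lists, and the input file name breast_us.png are all assumptions. Each class score is the mean image-text cosine similarity over that class's descriptors; per the abstract, some descriptors may need to be excluded from this average for accurate classification.

    # Hedged sketch: CLIP-style classification of a medical image from
    # shape/texture text descriptors. Model, prompts, and descriptors are
    # illustrative assumptions, not the paper's exact setup.
    import numpy as np
    import torch
    import open_clip
    from PIL import Image
    from skimage import measure

    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k")  # LAION-trained weights [23]
    tokenizer = open_clip.get_tokenizer("ViT-B-32")

    # Stand-ins for GPT-4-generated descriptors (hypothetical examples).
    descriptors = {
        "benign breast mass": [
            "an oval, well-circumscribed mass",
            "smooth, regular margins",
            "a homogeneous internal texture",
        ],
        "malignant breast mass": [
            "an irregularly shaped mass",
            "spiculated, ill-defined margins",
            "a heterogeneous internal texture",
        ],
    }

    image = preprocess(Image.open("breast_us.png")).unsqueeze(0)  # assumed file
    with torch.no_grad():
        img = model.encode_image(image)
        img = img / img.norm(dim=-1, keepdim=True)
        scores = {}
        for cls, descs in descriptors.items():
            txt = model.encode_text(
                tokenizer([f"an ultrasound image showing {d}" for d in descs]))
            txt = txt / txt.norm(dim=-1, keepdim=True)
            # Class score = mean cosine similarity over the class's descriptors;
            # the paper reports that dropping certain descriptors from this
            # average improves accuracy.
            scores[cls] = (img @ txt.T).mean().item()
    print(scores, "->", max(scores, key=scores.get))

    def circularity(mask: np.ndarray) -> float:
        # Mask-derived shape feature 4*pi*A / P**2: 1.0 for a perfect circle,
        # lower for irregular shapes. This is one common example of the kind of
        # ground-truth feature the abstract compares against VLM scores for
        # shape-related descriptors; the paper's exact features are in the
        # full text.
        props = measure.regionprops(mask.astype(int))[0]
        return 4.0 * np.pi * props.area / props.perimeter ** 2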
Year
Pages
art. no. e153838
Physical description
Bibliography: 26 items, figures, tables
Authors
author
  • RIKEN Center for Brain Science, Wako, Japan
  • Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland
  • RIKEN Center for Brain Science, Wako, Japan
  • Faculty of Computer Science, Universitas Indonesia, Depok, Indonesia
  • RIKEN Center for Brain Science, Wako, Japan
Bibliography
  • [1] J. Zhang, J. Huang, S. Jin, and S. Lu, “Vision-language models for vision tasks: A survey,” arXiv preprint arXiv:2304.00685, 2023.
  • [2] OpenAI, “GPT-4 technical report,” 2023.
  • [3] H. Nori, N. King, S.M. McKinney, D. Carignan, and E. Horvitz, “Capabilities of GPT-4 on medical challenge problems,” arXiv preprint arXiv:2303.13375, 2023.
  • [4] A.E. Johnson et al., “MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs,” arXiv preprint arXiv:1901.07042, 2019.
  • [5] B. Boecking et al., “Making the most of text semantics to improve biomedical vision–language processing,” in European Conference on Computer Vision. Springer, 2022, pp. 1–21.
  • [6] A. Radford et al., “Learning transferable visual models from natural language supervision,” in ICML, 2021.
  • [7] S. Menon and C. Vondrick, “Visual classification via description from large language models,” arXiv preprint arXiv:2210.07183, 2022.
  • [8] C. Pellegrini, M. Keicher, E. Özsoy, P. Jiraskova, R. Braren, and N. Navab, “Xplainer: From X-ray observations to explainable zero-shot diagnosis,” arXiv preprint arXiv:2303.13391, 2023.
  • [9] Z. Qin, H. Yi, Q. Lao, and K. Li, “Medical image understanding with pretrained vision language models: A comprehensive study,” arXiv preprint arXiv:2209.15517, 2022.
  • [10] W.G. Flores, W.C. de Albuquerque Pereira, and A.F.C. Infantosi, “Improving classification performance of breast lesions on ultrasonography,” Pattern Recognit., vol. 48, no. 4, pp. 1125–1136, 2015.
  • [11] G.-G. Wu et al., “Artificial intelligence in breast ultrasound,” World J. Radiol., vol. 11, no. 2, p. 19, 2019.
  • [12] M. Byra, “Breast mass classification with transfer learning based on scaling of deep representations,” Biomed. Signal Process. Control, vol. 69, p. 102828, 2021.
  • [13] Y. Shen et al., “Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams,” Nat. Commun., vol. 12, no. 1, p. 5645, 2021.
  • [14] M. Byra, K. Dobruch-Sobczak, H. Piotrzkowska-Wroblewska, Z. Klimonda, and J. Litniewski, “Prediction of response to neoadjuvant chemotherapy in breast cancer with recurrent neural networks and raw ultrasound signals,” Phys. Med. Biol., vol. 67, no. 18, p. 185007, 2022.
  • [15] N. Antropova, B.Q. Huynh, and M.L. Giger, “A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets,” Med. Phys., vol. 44, no. 10, pp. 5162–5171, 2017.
  • [16] C. Thomas, M. Byra, R. Marti, M.H. Yap, and R. Zwiggelaar, “Bus-set: A benchmark for quantitative evaluation of breast ultrasound segmentation networks with public datasets,” Med. Phys., vol. 50, no. 5, pp. 3223–3243, 2023.
  • [17] D.S. Kermany et al., “Identifying medical diagnoses and treatable diseases by image-based deep learning,” Cell, vol. 172, no. 5, pp. 1122–1131, 2018.
  • [18] M.H. Yap et al., “Automated breast ultrasound lesions detection using convolutional neural networks,” IEEE J. Biomed. Health Inform., vol. 22, no. 4, pp. 1218–1226, 2017.
  • [19] T. Fawcett, “An introduction to ROC analysis,” Pattern Recognit. Lett., vol. 27, no. 8, pp. 861–874, 2006.
  • [20] L. Van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res., vol. 9, no. 11, 2008.
  • [21] G. Ilharco et al., “OpenCLIP,” Jul. 2021, doi: 10.5281/zenodo.5143773.
  • [22] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 2019, pp. 8024–8035.
  • [23] C. Schuhmann et al., “LAION-5B: An open large-scale dataset for training next generation image-text models,” in Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022. [Online]. Available: https://openreview.net/forum?id=M3Y74vmsMcY
  • [24] S. Han et al., “A deep learning framework for supporting the classification of breast lesions in ultrasound images,” Phys. Med. Biol., vol. 62, no. 19, p. 7714, 2017.
  • [25] C.W. Hong et al., “Reader agreement and accuracy of ultrasound features for hepatic steatosis,” Abdom. Radiol., vol. 44, pp. 54–64, 2019.
  • [26] M. Byra et al., “Liver fat assessment in multiview sonography using transfer learning with convolutional neural networks,” J. Ultrasound Med., vol. 41, no. 1, pp. 175–184, 2022, doi: 10.1002/jum.15693.
Notes
Record created with funding from the Polish Ministry of Science and Higher Education (MNiSW), agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularisation of science (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-d5f52c05-c337-430b-8fd2-bcfcff50ce3c