Article title

Targeted data augmentation for improving model robustness

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
This paper proposes a new and effective bias mitigation method called targeted data augmentation (TDA). Since removing biases is often tedious and challenging and may not always lead to effective bias mitigation, we propose an alternative approach: skillfully inserting biases during training to improve model robustness. To validate the proposed method, we applied TDA to two representative and diverse datasets: a clinical skin lesion dataset and a dataset of male and female faces. We identified and manually annotated existing instrument and sampling biases in these datasets, focusing on black frames and ruler marks in the skin lesion dataset and glasses in the face dataset. Using the counterfactual bias insertion (CBI) method, we confirmed that these biases strongly affect model performance. By randomly inserting the identified biases into training samples, we demonstrated that TDA significantly reduced bias measures by a factor of two to more than 50, with only a negligible increase in the error rate. We performed our research on three model families: EfficientNet, DenseNet and Vision Transformer.
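The two ideas named in the abstract, randomly inserting a known bias into training samples (TDA) and checking how predictions react when that bias is inserted (CBI), can be illustrated with a minimal sketch. This record does not include the authors' implementation, so everything below is an assumption made for illustration: the function names `insert_black_frame`, `targeted_augment` and `count_switched`, the insertion probability `p`, the border width, and the use of a simple count of changed predictions as the bias measure. The black frame is chosen because it is one of the biases the abstract mentions.

```python
import numpy as np


def insert_black_frame(image, border_frac=0.1):
    """Hypothetical bias-insertion function: paint a black border on an HxWxC image."""
    out = image.copy()
    h, w = out.shape[:2]
    bh = max(1, int(h * border_frac))
    bw = max(1, int(w * border_frac))
    out[:bh] = 0       # top rows
    out[-bh:] = 0      # bottom rows
    out[:, :bw] = 0    # left columns
    out[:, -bw:] = 0   # right columns
    return out


def targeted_augment(images, insert_bias, p=0.5, rng=None):
    """Insert the bias into each training image independently with probability p (assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    return [insert_bias(img) if rng.random() < p else img.copy() for img in images]


def count_switched(predict, images, insert_bias):
    """CBI-style check (assumed measure): number of predictions that change once the bias is inserted."""
    before = np.array([predict(img) for img in images])
    after = np.array([predict(insert_bias(img)) for img in images])
    return int(np.sum(before != after))
```

In this sketch, `predict` stands for any callable that maps one image to a class label. A model trained on the output of `targeted_augment` would be expected to yield a lower `count_switched` value than one trained on the unmodified images, which is the kind of comparison the abstract reports.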
Year
Pages
143--155
Physical description
Bibliography: 37 items, figures, tables
Authors
  • Department of Intelligent Control Systems and Decision Support, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
author
  • Department of Intelligent Control Systems and Decision Support, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Department of Intelligent Control Systems and Decision Support, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
Bibliography
  • [1] Abbas, Q., Celebi, M.E. and García, I.F. (2011). Hair removal methods: A comparative study for dermoscopy images, Biomedical Signal Processing and Control 6(4): 395-404.
  • [2] Barata, C., Marques, J.S. and Celebi, M.E. (2019). Deep attention model for the hierarchical diagnosis of skin lesions, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, USA, pp. 2757-2765.
  • [3] Bardou, D., Bouaziz, H., Lv, L. and Zhang, T. (2022). Hair removal in dermoscopy images using variational autoencoders, Skin Research and Technology 28(3): 445-454.
  • [4] Bissoto, A., Fornaciali, M., Valle, E. and Avila, S. (2019). (DE)Constructing bias on skin lesion datasets, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Long Beach, USA, pp. 1-9.
  • [5] Bissoto, A., Valle, E. and Avila, S. (2020). Debiasing skin lesion datasets and models? Not so fast, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, USA, pp. 3192-3201.
  • [6] Chai, C. and Li, G. (2020). Human-in-the-loop techniques in machine learning, IEEE Data Engineering Bulletin 43(3): 37-52.
  • [7] Chauhan, A. (2019). Gender classification dataset, https://www.kaggle.com/datasets/cashutosh/gender-classification-dataset.
  • [8] Codella, N.C., Gutman, D., Celebi, M.E., Helba, B., Marchetti, M.A., Dusza, S.W., Kalloo, A., Liopyris, K., Mishra, N., Kittler, H. and Halpern, A. (2018). Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC), 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, USA, pp. 168-172.
  • [9] Combalia, M., Codella, N.C., Rotemberg, V., Helba, B., Vilaplana, V., Reiter, O., Carrera, C., Barreiro, A., Halpern, A.C., Puig, S. and Malvehy, J. (2019). BCN20000: Dermoscopic lesions in the wild, arXiv: 1908.02288.
  • [10] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J. and Houlsby, N. (2021). An image is worth 16×16 words: Transformers for image recognition at scale, International Conference on Learning Representations, Vienna, Austria.
  • [11] Dwork, C., Immorlica, N., Kalai, A.T. and Leiserson, M. (2018). Decoupled classifiers for group-fair and efficient machine learning, 1st Conference on Fairness, Accountability and Transparency, New York, NY, pp. 119-133.
  • [12] Gao, D., Wu, R., Liu, J., Fan, X. and Tang, X. (2020). Finding robust transfer features for unsupervised domain adaptation, International Journal of Applied Mathematics and Computer Science 30(1): 99-112, DOI: 10.34768/amcs-2020-0008.
  • [13] He, J. and van de Vijver, F. (2012). Bias and equivalence in cross-cultural research, Online Readings in Psychology and Culture 2(2): 2307-0919.
  • [14] Hou, Q., Jiang, P., Wei, Y. and Cheng, M.-M. (2018). Self-erasing network for integral object attention, 32nd Conference on Advances in Neural Information Processing Systems, NeurIPS 2018.
  • [15] Huang, G., Liu, Z. and Weinberger, K.Q. (2016). Densely connected convolutional networks, CoRR: abs/1608.06993.
  • [16] Huang, Q., Chen, X., Metaxas, D. and Nadar, M.S. (2019). Brain segmentation from k-space with end-to-end recurrent attention network, in D. Shen et al. (Eds), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2019, Springer, Cham, pp. 275-283.
  • [17] ISIC (2020). SIIM-ISIC 2020 challenge dataset, International Skin Imaging Collaboration, https://challenge2020.isic-archive.com/.
  • [18] Le Bras, R., Swayamdipta, S., Bhagavatula, C., Zellers, R., Peters, M., Sabharwal, A. and Choi, Y. (2020). Adversarial filters of dataset biases, Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, pp. 1078-1088.
  • [19] Li, H., Liu, Y., Ouyang, W. and Wang, X. (2019). Zoom out-and-in network with map attention decision for region proposal and object detection, International Journal of Computer Vision 127(3): 225-238.
  • [20] Luengo-Oroz, M., Bullock, J., Pham, K.H., Lam, C.S.N. and Luccioni, A. (2021). From artificial intelligence bias to inequality in the time of COVID-19, IEEE Technology and Society Magazine 40(1): 71-79.
  • [21] Mahtani, K., Spencer, E.A., Brassey, J. and Heneghan, C. (2018). Catalogue of bias: Observer bias, BMJ Evidence-Based Medicine 23(1): 23-24.
  • [22] Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A. (2021). A survey on bias and fairness in machine learning, ACM Computing Surveys 54(6): 1-35, DOI: 10.1145/3457607.
  • [23] Mikołajczyk, A., Grochowski, M. and Kwasigroch, A. (2021). Towards explainable classifiers using the counterfactual approach - Global explanations for discovering bias in data, Journal of Artificial Intelligence and Soft Computing Research 11(1): 51-67.
  • [24] Mikołajczyk, A., Majchrowska, S. and Limeros, S.C. (2022). The (de)biasing effect of GAN-based augmentation methods on skin lesion images, arXiv: 2206.15182.
  • [25] Oliveira, R.B., Mercedes Filho, E., Ma, Z., Papa, J.P., Pereira, A.S. and Tavares, J.M.R. (2016). Computational methods for the image segmentation of pigmented skin lesions: A review, Computer Methods and Programs in Biomedicine 131: 127-141.
  • [26] Ramella, G. (2021). Hair removal combining saliency, shape and color, Applied Sciences 11(1): 447.
  • [27] Shorten, C. and Khoshgoftaar, T.M. (2019). A survey on image data augmentation for deep learning, Journal of Big Data 6(1): 1-48.
  • [28] Surówka, G. and Ogorzałek, M. (2022). Segmentation of the melanoma lesion and its border, International Journal of Applied Mathematics and Computer Science 32(4): 683-699, DOI: 10.34768/amcs-2022-0047.
  • [29] Tan, M. and Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks, Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 6105-6114.
  • [30] Torralba, A. and Efros, A.A. (2011). Unbiased look at dataset bias, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2011, Colorado Springs, USA, pp. 1521-1528.
  • [31] Tschandl, P., Rosendahl, C. and Kittler, H. (2018). The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Scientific Data 5: 180161.
  • [32] Van Molle, P., De Strooper, M., Verbelen, T., Vankeirsbilck, B., Simoens, P. and Dhoedt, B. (2018). Visualizing convolutional neural networks to improve decision support for skin lesion classification, in D. Stoyanov et al. (Eds), Understanding and Interpreting Machine Learning in Medical Image Computing Applications, Springer, Cham, pp. 115-123.
  • [33] Wang, Z., Qinami, K., Karakozis, I.C., Genova, K., Nair, P., Hata, K. and Russakovsky, O. (2020). Towards fairness in visual recognition: Effective strategies for bias mitigation, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, pp. 8919-8928.
  • [34] Wesker, K.H., Radlanski, R.J. and Kaczmarzyk, T. (2015). Face: Atlas of Clinical Anatomy, Kwintesencja, Warsaw, (in Polish).
  • [35] Zawacki, A., Helba, B., Shih, G., Weber, J., Elliott, J., Combalia, M., Kurtansky, N., Codella, N., Culliton, P. and Rotemberg, V. (2020). SIIM-ISIC melanoma classification, https://kaggle.com/competitions/siim-isic-melanoma-classification.
  • [36] Zhang, B.H., Lemoine, B. and Mitchell, M. (2018). Mitigating unwanted biases with adversarial learning, Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, New Orleans, USA, pp. 335-340.
  • [37] Zhao, J., Wang, T., Yatskar, M., Ordonez, V. and Chang, K.-W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints, arXiv: 1707.09457.
Notes
Record prepared with funds from the Ministry of Science and Higher Education (MNiSW), agreement no. POPUL/SP/0154/2024/02, under the programme "Social Responsibility of Science II", module: Popularisation of Science (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-53d25592-45a7-42d5-8cde-6c92a89b4df4