Article title

A linear-attention-combined convolutional neural network for EEG-based visual stimulus recognition

Publication languages
EN
Abstracts
EN
The recognition of visual stimuli from EEG (electroencephalogram) signals has become a major topic in Brain-Computer Interface (BCI) research. Although the underlying spatial features of EEG can effectively represent visual stimulus information, exploiting the local-global structure of EEG to achieve better decoding performance remains highly challenging. In this paper we therefore propose a deep learning architecture, the Linear-Attention-combined Convolutional Neural Network (LACNN), for EEG-based visual stimulus classification. The proposed architecture combines Convolutional Neural Network (CNN) and Linear Attention modules, effectively extracting local and global EEG features for decoding while keeping computational complexity and parameter count low. We conducted extensive experiments on a public EEG dataset from the Stanford Digital Repository. The experimental results demonstrate that LACNN achieves average decoding accuracies of 54.13% and 29.83% on the 6-category and 72-exemplar classification tasks, respectively, outperforming state-of-the-art methods, which indicates that our method can effectively decode visual stimuli from EEG. Further analysis of LACNN shows that the Linear Attention module improves the separability between category features and localizes key brain-region information consistent with the principles of the experimental paradigm.
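The paper itself is not reproduced in this record, but the abstract's core idea (a convolutional front end for local EEG features combined with a linear-attention block that captures global context at linear cost) can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' LACNN: the layer sizes, the elu(x) + 1 feature map, and the input shape (124 electrodes by 32 time samples per trial, an assumed preprocessing of the Stanford dataset) are all hypothetical choices.

    # Minimal sketch of a linear-attention-combined CNN for EEG decoding.
    # NOT the published LACNN; shapes and layers are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LinearAttention(nn.Module):
        # Kernelized self-attention with O(N) cost in the number of tokens,
        # using the common positive feature map phi(x) = elu(x) + 1.
        def __init__(self, dim):
            super().__init__()
            self.qkv = nn.Linear(dim, dim * 3)
            self.proj = nn.Linear(dim, dim)

        def forward(self, x):                        # x: (batch, tokens, dim)
            q, k, v = self.qkv(x).chunk(3, dim=-1)
            q, k = F.elu(q) + 1, F.elu(k) + 1
            kv = torch.einsum('bnd,bne->bde', k, v)  # global key-value summary
            z = 1.0 / (torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + 1e-6)
            out = torch.einsum('bnd,bde,bn->bne', q, kv, z)
            return self.proj(out)

    class LACNNSketch(nn.Module):
        # Conv layer for local spatial features + linear attention for global
        # temporal context + a small classification head (hypothetical layout).
        def __init__(self, n_channels=124, n_classes=6, dim=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, dim, kernel_size=(n_channels, 1)),  # spatial filter
                nn.BatchNorm2d(dim),
                nn.ELU(),
            )
            self.attn = LinearAttention(dim)
            self.head = nn.Linear(dim, n_classes)

        def forward(self, x):                        # x: (batch, 1, channels, time)
            h = self.conv(x).squeeze(2).transpose(1, 2)  # (batch, time, dim) tokens
            h = h + self.attn(h)                     # residual global mixing
            return self.head(h.mean(dim=1))          # pool over time, classify

    logits = LACNNSketch()(torch.randn(8, 1, 124, 32))  # 8 mock trials -> (8, 6)

Because phi(K)^T V is only a dim-by-dim summary, memory and compute grow linearly with trial length rather than quadratically as in softmax attention, which matches the abstract's claim of low computational complexity.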
Authors
author
  • College of Communication Engineering, Jilin University, Changchun 130025, China
  • College of Communication Engineering, Jilin University, Changchun 130025, China
author
  • College of Communication Engineering, Jilin University, Changchun 130025, China
YADDA identifier
bwmeta1.element.baztech-44b6ba81-319e-4b4b-9097-d5d27fe351a3