Balancing Privacy and Accuracy in Federated Learning for Speech Emotion Recognition

Mohammadi, Samaneh; Mohammadi, Mohammadreza; Sinaei, Sima; Balador, Ali; Nowroozi, Ehsan; Flammini, Francesco; Conti, Mauro

doi:10.15439/2023F444

Artykuł - szczegóły

Tytuł artykułu

Balancing Privacy and Accuracy in Federated Learning for Speech Emotion Recognition

Autorzy

Mohammadi Samaneh , Mohammadi Mohammadreza , Sinaei Sima , Balador Ali , Nowroozi Ehsan , Flammini Francesco , Conti Mauro

Wybrane pełne teksty z tego czasopisma

http://annals-csis.org

Identyfikatory

DOI

10.15439/2023F444

Warianty tytułu

Języki publikacji

Abstrakty

Speech Emotion Recognition (SER) is a valuable technology that identifies human emotions from spoken language, enabling the development of context-aware and personalized intelligent systems. To protect user privacy, Federated Learning (FL) has been introduced, enabling local training of models on user devices. However, FL raises concerns about the potential exposure of sensitive information from local model parameters, which is especially critical in applications like SER that involve personal voice data. Local Differential Privacy (LDP) has been successful in preventing privacy leaks in image and video data. However, it encounters notable accuracy degradation when applied to speech data, especially in the presence of high noise levels. In this paper, we propose an approach called LDP-FL with CSS, which combines LDP with a novel client selection strategy (CSS). By leveraging CSS, we aim to improve the representatives of updates and mitigate the adverse effects of noise on SER accuracy while ensuring client privacy through LDP. Furthermore, we conducted model inversion attacks to evaluate the robustness of LDP-FL in preserving privacy. These attacks involved an adversary attempting to reconstruct individuals' voice samples using the output labels provided by the SER model. The evaluation results reveal that LDP-FL with CSS achieved an accuracy of 65-70\%, which is 4\% lower than the initial SER model accuracy. Furthermore, LDP-FL demonstrated exceptional resilience against model inversion attacks, outperforming the non-LDP method by a factor of 10. Overall, our analysis emphasizes the importance of achieving a balance between privacy and accuracy in accordance with the requirements of the SER application.

Słowa kluczowe

federated learning privacy-preserving mechanism differential privacy speech emotion recognition

uczenie federacyjne ochrona prywatności prywatność różnicowa rozpoznawanie emocji mowy

Wydawca

Polskie Towarzystwo Informatyczne

Czasopismo

Annals of Computer Science and Information Systems

Rocznik

2023

Tom

Vol. 35

Strony

191--199

Opis fizyczny

Bibliogr. 26 poz., rys., wykr.

Twórcy

autor

Mohammadi Samaneh

samaneh.mohammadi@ri.se

RISE Research Institutes of Sweden, Västerås, Sweden
Mälardalen University, Västerås, Sweden

autor

Mohammadi Mohammadreza

mohammadreza.mohammadi@ri.se

RISE Research Institutes of Sweden, Västerås, Sweden
University of Padua, Padua, Italy

autor

Sinaei Sima

sima.sinaei@ri.se

RISE Research Institutes of Sweden, Västerås, Sweden

autor

Balador Ali

ali.balador@mdu.se

Mälardalen University, Västerås, Sweden

autor

Nowroozi Ehsan

e.nowroozi@qub.ac.uk

Queen’s University Belfast, Centre of Secure Information Technologies, Belfast, Northern Ireland, United Kingdom.

autor

Flammini Francesco

francesco.flammini@mdu.se

Mälardalen University, Västerås, Sweden

autor

Conti Mauro

mauro.conti@unipd.it

University of Padua, Padua, Italy

Bibliografia

1. R. A. Khalil, E. Jones, M. I. Babar, T. Jan, M. H. Zafar, and T. Alhussain, “Speech emotion recognition using deep learning techniques: A review,” IEEE Access, vol. 7, pp. 117 327–117 345, 2019.
2. M. B. Akçay and K. Oğuz, “Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers,” Speech Communication, vol. 116, pp. 56–76, 2020.
3. P. Chhikara, P. Singh, R. Tekchandani, N. Kumar, and M. Guizani, “Federated learning meets human emotions: A decentralized framework for human–computer interaction for iot applications,” IEEE Internet of Things Journal, vol. 8, no. 8, pp. 6949–6962, 2020.
4. J. L. Kröger, O. H.-M. Lutz, and P. Raschke, “Privacy implications of voice and speech analysis–information disclosure by inference,” Privacy and Identity Management. Data for Better Living: AI and Privacy: 14th IFIP WG 9.2, 9.6/11.7, 11.6/SIG 9.2. 2 International Summer School, Windisch, Switzerland, August 19–23, 2019, Revised Selected Papers 14, pp. 242–258, 2020.
5. P. Voigt and A. Von dem Bussche, “The eu general data protection regulation (gdpr),” A Practical Guide, 1st Ed., Cham: Springer International Publishing, vol. 10, no. 3152676, pp. 10–5555, 2017.
6. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial intelligence and statistics. PMLR, 2017, pp. 1273–1282.
7. S. Latif, S. Khalifa, R. Rana, and R. Jurdak, “Federated learning for speech emotion recognition applications,” in 2020 19th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). IEEE, 2020, pp. 341–342.
8. M. Nasr, R. Shokri, and A. Houmansadr, “Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning,” in 2019 IEEE symposium on security and privacy (SP). IEEE, 2019, pp. 739–753.
9. M. S. Jere, T. Farnan, and F. Koushanfar, “A taxonomy of attacks on federated learning,” IEEE Security & Privacy, vol. 19, no. 2, pp. 20–28, 2020.
10. Z. Xiong, Z. Cai, D. Takabi, and W. Li, “Privacy threat and defense for federated learning with non-iid data in aiot,” IEEE Transactions on Industrial Informatics, vol. 18, no. 2, pp. 1310–1321, 2021.
11. Y. Zhao, J. Zhao, M. Yang, T. Wang, N. Wang, L. Lyu, D. Niyato, and K.-Y. Lam, “Local differential privacy-based federated learning for internet of things,” IEEE Internet of Things Journal, vol. 8, no. 11, pp. 8836–8853, 2020.
12. K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. Quek, and H. V. Poor, “Federated learning with differential privacy: Algorithms and performance analysis,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3454–3469, 2020.
13. M. Kim, O. Günlü, and R. F. Schaefer, “Federated learning with local differential privacy: Trade-offs between privacy, utility, and communication,” in ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 2650–2654.
14. M. A. Pathak, Privacy-preserving machine learning for speech processing. Springer Science & Business Media, 2012.
15. T. Feng, R. Peri, and S. Narayanan, “User-level differential privacy against attribute inference attack of speech emotion recognition in federated learning,” arXiv preprint https://arxiv.org/abs/2204.02500, 2022.
16. A. A. Alnuaim, M. Zakariah, A. Alhadlaq, C. Shashidhar, W. A. Hatamleh, H. Tarazi, P. K. Shukla, and R. Ratna, “Human-computer interaction with detection of speaker emotions using convolution neural networks,” Computational Intelligence and Neuroscience, vol. 2022, 2022.
17. M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, 2015, pp. 1322–1333.
18. V. Tsouvalas, T. Ozcelebi, and N. Meratnia, “Privacy-preserving speech emotion recognition through semi-supervised federated learning,” in 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops). IEEE, 2022, pp. 359–364.
19. Y. Chang, S. Laridi, Z. Ren, G. Palmer, B. W. Schuller, and M. Fisichella, “Robust federated learning against adversarial attacks for speech emotion recognition,” arXiv preprint https://arxiv.org/abs/2203.04696, 2022.
20. T. Tuncer, S. Dogan, and U. R. Acharya, “Automated accurate speech emotion recognition system using twine shuffle pattern and iterative neighborhood component analysis techniques,” Knowledge-Based Systems, vol. 211, p. 106547, 2021.
21. P. Liu, X. Xu, and W. Wang, “Threats, attacks and defenses to federated learning: issues, taxonomy and perspectives,” Cybersecurity, vol. 5, no. 1, pp. 1–19, 2022.
22. R. Bassily, “Linear queries estimation with local differential privacy,” in The 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 2019, pp. 721–729.
23. M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016, pp. 308–318.
24. A. Balador, S. Sinaei, M. Pettersson, and I. Kaya, “Dais project - distributed artificial intelligence systems: Objectives and challenges,” in 26th Ada-Europe International Conference on Reliable Software Technologies (AEiC’22), 2022.
25. H. Cao, D. G. Cooper, M. K. Keutmann, R. C. Gur, A. Nenkova, and R. Verma, “Crema-d: Crowd-sourced emotional multimodal actors dataset,” IEEE transactions on affective computing, vol. 5, no. 4, pp. 377–390, 2014.
26. F. Eyben, M. Wöllmer, and B. Schuller, “Opensmile: the munich versatile and fast open-source audio feature extractor,” in Proceedings of the 18th ACM international conference on Multimedia, 2010, pp. 1459–1462.

Uwagi

1. Main Track Regular Papers

2. Opracowanie rekordu ze środków MEiN, umowa nr SONP/SP/546092/2022 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2024).

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-160442c5-5008-4245-a654-b0e1195c5363