Subjective tests of speaker recognition for selected voice disguise techniques

Staroniewicz, Piotr

doi:10.24425/ijet.2024.149587

Article - details

Article title

Subjective tests of speaker recognition for selected voice disguise techniques

Authors

Staroniewicz Piotr

Identifiers

DOI

10.24425/ijet.2024.149587

Abstract

Research work on the effectiveness of voice disguise techniques is important for the development of biometric systems (surveillance) as well as phonoscopic research (forensics). A speaker recognition system or a listener can be deliberately or non-deliberately misled by technical or natural methods. It is important to determine the impact of these techniques on both automatic systems and live listeners. This paper presents the results of listening tests conducted on a group of 40 people. The effectiveness of speaker recognition was investigated using selected natural (chosen from four groups of deliberate natural techniques: phonation, phonemic, prosodic and deformation) and technical (pitch shifting, GSM coding) voice disguise techniques. The results were related to the previously obtained outcomes for the automatic method of verification carried out using a classical speaker recognition system based on MFCC (Mel Frequency Cepstral Coefficients) parameterisation and GMM (Gaussian Mixture Models) classification.

Keywords

speaker recognition; forensics; biometrics; voice disguise

Publisher

Polish Academy of Sciences, Committee of Electronics and Telecommunication

Journal

International Journal of Electronics and Telecommunications

Year

2024

Volume

Vol. 70, No. 3

Pages

615-620

Physical description

Bibliography: 19 items, figures.

Contributors

author

Staroniewicz Piotr

piotr.staroniewicz@pwr.edu.pl

Wrocław University of Science and Technology

https://orcid.org/0000-0003-4244-0592

Bibliography

[1] F. Alegre, G. Soldi, N. Evans, B. Fauve, J. Liu, “Evasion and Obfuscation in Speaker Recognition Surveillance and Forensics”, Proc. International Conference on Biometrics and Forensics (IWBF), IEEE, 2014. http://dx.doi.org/10.1109/IWBF.2014.6914244
[2] P. Staroniewicz, “Effect of the deliberate and non-deliberate natural voice disguise on speaker recognition performance”, in Acoustics, acoustoelectronics and electrical engineering, ed. Franciszek Witos, Gliwice, Wydawnictwo Politechniki Śląskiej, 2021, pp. 312-325 (Monografia - Politechnika Śląska; nr 888). https://dx.doi.org/10.34918/80139
[3] M. Farrus, “Voice Disguise in Automatic Speaker Recognition”, ACM Computing Surveys, Vol. 51, No. 4, Article 68, July 2018. https://doi.org/10.1016/j.forsciint.2007.05.019
[4] I. Krzosek-Piwowarczyk, O. Komosa, W. Maciejko, „Kryminalistyczna identyfikacja mówcy maskującego głos”, Problemy Kryminalistyki 280 (2) 2013 39-52.
[5] S. S. Kajarekar, H. Bratt, E. Shriberg, R. de Leon, “A Study of Intentional Voice Modifications for Evading Automatic Speaker Recognition”, Proc. Speaker and Language Recognition Workshop, 2006, IEEE Odyssey 2006. https://doi.org/10.1109/ODYSSEY.2006.248123
[6] H. J. Kunzel, J. Gonzales-Rodriguez, J. Ortega-Garcia, “Effect of voice disguise on the performance of a forensic automatic speaker recognition system”, Proc. IEEE Odyssey - The Speaker and Language Recognition Workshop, 2004.
[7] P. Perrot, G. Aversano, G. Chollet, “Voice disguise and automatic detection: review and perspectives”, Progress in nonlinear speech processing, pp. 101-117, (ed.): Springer 2007. https://doi.org/10.1007/978-3-540-71505-4_7
[8] C. Zhang, T. Tan, “Voice disguise and automatic speaker recognition”, Forensic Science International 175 (2008) 118-122. https://doi.org/10.1016/j.forsciint.2007.05.019
[9] R. D. Rodman, M. S. Powell, “Computer Recognition of Speakers Who Disguise Their Voice”, Proc. of the International Conference on Signal Processing Applications and Technology 2000 (ICSPAT 2000), Dallas, TX, October 2000. https://api.semanticscholar.org/CorpusID:16980245
[10] W. Majewski, P. Staroniewicz, “Imitation of Target Speakers by Different Types of Impersonators”, Analysis of Verbal and Nonverbal Communication and Enactment, Springer LNCS vol. 6800, 104-112, 2011. https://doi.org/10.1007/978-3-642-25775-9_10
[11] P. Staroniewicz, “Test of robustness of GMM speaker verification in VoIP telephony”, Archives of Acoustics 2007, vol. 32, no. 4, suppl., pp. 187-192. https://acoustics.ippt.pan.pl/index.php/aa/article/download/1408/1225
[12] J. Krajewski, S. Schnieder, D. Sommer, A. Batliner, B. Schuller, “Applying Multiple Classifiers and Non-Linear Dynamic Features for Detecting Sleepiness from Speech”, Neurocomputing 84, pp. 65-75, 2012. https://doi.org/10.1016/j.neucom.2011.12.021
[13] P. Staroniewicz, “Influence of Natural Voice Disguise Techniques on Automatic Speaker Recognition”, Proc. of Joint Conference - Acoustics, Ustka 2018, pp.1-4 (ed.): IEEE 2018. https://doi.org/10.1109/ACOUSTICS.2018.8502372
[14] S. Brachmański, “Speech signal noise reduction in forensic audio analysis”, Proc. 56 OSA 15-18.09.2009, pp. 135-140, 2009.
[15] A. B. Dobrucki, S. Brachmański, “Test signals used in electroacoustics and speech technology”, Proc. Signal processing, algorithms, architectures, arrangements and applications, SPA 2017, 20-22.09.2017, IEEE 2017. https://doi.org/10.23919/SPA.2017.8166828
[16] P. Staroniewicz, “Considering basic emotional state information in speaker verification”, Proc. 4th International Conference on Biometrics and Forensics (IWBF) IEEE 2016. https://doi.org/10.1109/IWBF.2016.7449689
[17] D. A. Reynolds, R. C. Rose, “Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models”, IEEE Trans. Speech and Audio Proc. Vol.3(1), pp. 72-83. 1995. https://doi.org/10.1109/89.365379
[18] D. A. Reynolds, T. F. Quatieri, R. B. Dunn, “Speaker Verification Using Adapted Gaussian Mixture Models”, Digital Signal Processing 10, pp. 19-41, 2000. https://doi.org/10.1006/dspr.1999.0361
[19] F. Bimbot, J. F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-Garcia, D. Petrovska-Delacretaz, D. A. Reynolds, “A tutorial on text-independent speaker verification”, EURASIP J. Appl. Signal Process., vol. 2004, pp. 430-451, 2004. https://doi.org/10.1155/S1110865704310024

Notes

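Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II) - module: Popularisation of Science (2025).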

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-4ede0a57-182b-4e5e-8cbe-1e92103d2db6