


2019 | Vol. 39, no. 1 | 246–255
Article title

Multi-channel acoustic analysis of phoneme /s/ mispronunciation for lateral sigmatism detection

Title variants
Publication languages
EN
Abstracts
EN
The paper presents a method for computer-aided detection of lateral sigmatism. The aim of the study is to design an automated sigmatism diagnosis tool. For that purpose, a reference speech corpus has been collected. It contains 438 recordings of the phoneme /s/ surrounded by selected vowels, with both normative and simulated pathological pronunciation. The acoustic signal is recorded with an acoustic mask: a set of microphones arranged on a semi-cylindrical surface around the subject's face. Frames containing the /s/ phoneme are subjected to beamforming and feature extraction. Two feature vectors, containing, among others, Mel-frequency cepstral coefficients and fricative formants, are defined and evaluated in a binary classification task using support vector machines. Single-channel analysis is confronted with multi-channel processing. The experimental results show that multi-channel speech signal processing supported by beamforming generally improves pathology detection.
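The multi-channel step outlined above combines the mask's microphone signals by beamforming before features are extracted. As a rough illustration only (the paper does not publish its implementation, and the function name and delay convention below are hypothetical), a minimal delay-and-sum beamformer over integer sample delays can be sketched as:

```python
def delay_and_sum(channels, delays):
    """Average the microphone channels after compensating each one's
    arrival delay (in whole samples), so the target source adds
    coherently while off-axis noise averages out.

    channels -- list of equal-length sample lists, one per microphone
    delays   -- per-channel lag (in samples) relative to the reference
    """
    n = len(channels[0])
    out = [0.0] * n
    for ch, lag in zip(channels, delays):
        for i in range(n):
            j = i + lag  # advance the lagging channel to align it
            if 0 <= j < n:
                out[i] += ch[j]
    return [v / len(channels) for v in out]

# Channel 1 is channel 0 delayed by one sample; aligning with
# delays [0, 1] and averaging recovers the original signal
# (except at the trailing edge, where one channel has no data).
mics = [[1.0, 2.0, 3.0, 4.0],
        [0.0, 1.0, 2.0, 3.0]]
aligned = delay_and_sum(mics, [0, 1])
```

In practice the delays follow from the array geometry and an assumed source position (here, the speaker's mouth behind the mask), and fractional delays would be handled by interpolation or in the frequency domain; the integer-delay version above only conveys the principle.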
Publisher

Year
Pages
246–255
Physical description
Bibliography: 39 items, figures, tables, charts
Authors
  • Faculty of Biomedical Engineering, Silesian University of Technology, Zabrze, Poland
  • Faculty of Biomedical Engineering, Silesian University of Technology, Zabrze, Poland
  • Non-Resident Faculty of Jesuit University Ignatianum in Cracow, Krakow, Poland
  • Silesia's Center of Hearing and Speech MEDINCUS, Katowice, Poland
Bibliography
  • [1] Bilibajkic R, Saric Z, Jovicic ST, Punisic S, Subotic M. Automatic detection of stridence in speech using the auditory model. Comput Speech Lang 2016;36:122–35. http://dx.doi.org/10.1016/j.csl.2015.08.006.
  • [2] Irwin JV. Distribution and production characteristics of /s/in the vocabulary and spontaneous speech of children. Speech Lang 1982;7:217–35. http://dx.doi.org/10.1016/B978-0-12-608607-2.50013-3.
  • [3] Borsel JV, Rentergem SV, Verhaeghe L. The prevalence of lisping in young adults. J Commun Disorders 2007;40(6):493–502. http://dx.doi.org/10.1016/j.jcomdis.2006.12.001.
  • [4] Lobacz P, Dobrzanska K. Acoustic description of sibilant phones in pronunciation of pre-school children, (PL) Opis akustyczny glosek sybilantnych w wymowie dzieci przedszkolnych. Audiofonologia 1999;15:7–26.
  • [5] Trzaskalik J. Sigmatismus lateralis in Polish logopedics literature. Theoretical considerations, (PL) Seplenienie boczne w polskiej literaturze logopedycznej. Rozważania teoretyczne. Forum Logopedyczne 2016;24:33–46.
  • [6] Hu W, Qian Y, Song FK, Wang Y. Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers. Speech Commun 2015;67:154–66. http://dx.doi.org/10.1016/j.specom.2014.12.008.
  • [7] Su P-H, Wu C-H, Lee L-S. A recursive dialogue game for personalized computer-aided pronunciation training. IEEE/ACM Trans Audio Speech Lang Process 2015;23(1):127–41. http://dx.doi.org/10.1109/TASLP.2014.2375572.
  • [8] Wang H, Qian X, Meng H. Phonological modeling of mispronunciation gradations in L2 English speech of L1 Chinese learners. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2014. pp. 7714–8. http://dx.doi.org/10.1109/ICASSP.2014.6855101.
  • [9] Wei S, Hu G, Hu Y, Wang R-H. A new method for mispronunciation detection using support vector machine based on pronunciation space models. Speech Communication 2009;51(10):896–905. http://dx.doi.org/10.1016/j.specom.2009.03.004. Special issue: Spoken Language Technology for Education.
  • [10] Valentini-Botinhao C, Degenkolb-Weyers S, Maier A, Noeth E, Eysholdt U, Bocklet T. Automatic detection of sigmatism in children. Proc. WOCCI 2012 - Workshop on Child; 2012. pp. 1–4.
  • [11] Benselam Z, Guerti M, Bencherif M. Arabic speech pathology therapy computer aided system. J Comput Sci 2007;3(9):685–92.
  • [12] Skorek E. Faces of Speech Impediments, (PL) Oblicza wad wymowy. Warszawa; 2001.
  • [13] Ostapiuk B. Dyslalia. About speech quality testing in speech therapy, (PL) Dyslalia. O badaniu jakości wymowy w logopedii. Wydawnictwo Naukowe Uniwersytetu Szczecińskiego; 2013.
  • [14] Brandstein M, Ward D. Microphone Arrays. Signal Processing Techniques and Applications. Springer-Verlag Berlin Heidelberg; 2001.
  • [15] Benesty J, Sondhi M, Huang Y. Springer Handbook of Speech Processing. Springer; 2008.
  • [16] Benesty J, Chen J, Huang Y. Microphone Array Signal Processing, Springer Topics in Signal Processing. Springer Berlin Heidelberg; 2008.
  • [17] Krol D, Lorenc A, Swiecinski R. Detecting laterality and nasality in speech with the use of a multi-channel recorder. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015. pp. 5147–51. http://dx.doi.org/10.1109/ICASSP.2015.7178952.
  • [18] Krecichwost M, Miodonska Z, Trzaskalik J, Pyttel J, Spinczyk D. Acoustic Mask for Air Flow Distribution Analysis in Speech Therapy. Cham: Springer International Publishing; 2016. p. 377–87. http://dx.doi.org/10.1007/978-3-319-39796-2_31.
  • [19] Salvati D, Drioli C, Foresti GL. On the use of machine learning in microphone array beamforming for far-field sound source localization. 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). 2016. pp. 1–6. http://dx.doi.org/10.1109/MLSP.2016.7738899.
  • [20] Pasha S, Ritz C. Informed source location and DOA estimation using acoustic room impulse response parameters. 2015 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). 2015. pp. 139–44. http://dx.doi.org/10.1109/ISSPIT.2015.7394316.
  • [21] Argentieri S, Danes P, Soueres P. Prototyping filter-sum beamformers for sound source localization in mobile robotics. Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on; 2005. p. 3551–6. http://dx.doi.org/10.1109/ROBOT.2005.1570660.
  • [22] Ren W-J, Hu D-H, Ding C-B. An improved method to sort and pair TDOA based on the correlation between TDOAs. Radar (Radar), 2011 IEEE CIE International Conference on, Vol. 2. 2011. pp. 1041–4. http://dx.doi.org/10.1109/CIE-Radar.2011.6159730.
  • [23] Rabiner L, Schafer R. Theory and Applications of Digital Speech Processing. 1st Edition. Upper Saddle River, NJ, USA: Prentice Hall Press; 2010.
  • [24] Oppenheim AV. Digital signal processing. Englewood Cliffs, N.J: Prentice-Hall; 1975.
  • [25] Rabiner L, Juang B-H. Fundamentals of Speech Recognition. Upper Saddle River, NJ, USA: Prentice-Hall, Inc; 1993.
  • [26] Sahidullah M, Saha G. Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition. Speech Commun 2012;54(4):543–65.
  • [27] Koolagudi SG, Rastogi D, Rao KS. Identification of language using mel-frequency cepstral coefficients (MFCC). Procedia Engineering 2012;38:3391–8. International Conference on Modelling, Optimization and Computing.
  • [28] Miodonska Z, Bugdol MD, Krecichwost M. Dynamic time warping in phoneme modeling for fast pronunciation error detection. Comput Biol Med. http://dx.doi.org/10.1016/j.compbiomed.2015.12.004.
  • [29] Nowak PM. The role of vowel transitions and frication noise in the perception of Polish sibilants. J Phonetics 2006;34 (2):139–52. http://dx.doi.org/10.1016/j.wocn.2005.03.001.
  • [30] Haley KL, Seelinger E, Mandulak KC, Zajac DJ. Evaluating the spectral distinction between sibilant fricatives through a speaker-centered approach. J Phonetics 2010;38(4):548–54. http://dx.doi.org/10.1016/j.wocn.2010.07.006.
  • [31] Zygis M, Hamann S. Perceptual and acoustic cues of Polish coronal fricatives. Proc. 15th ICPhS. 2003. pp. 395–8.
  • [32] Gordon M, Barthmaier P, Sands K. A cross-linguistic acoustic study of voiceless fricatives. J Int Phonetic Assoc 2002;141-174.
  • [33] Miodonska Z, Krecichwost M, Szymanska A. Computer- Aided Evaluation of Sibilants in Preschool Children Sigmatism Diagnosis. Cham: Springer International Publishing; 2016. p. 367–76. http://dx.doi.org/10.1007/978-3-319-39796-2_30.
  • [34] Burges CJC. A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 1998;2(2):121–67. http://dx.doi.org/10.1023/A:1009715923555.
  • [35] Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995. pp. 273–97.
  • [36] Platt J. Sequential minimal optimization: A fast algorithm for training support vector machines. Tech. rep. April 1998. https://www.microsoft.com/en-us/research/publication/sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/.
  • [37] Arlot S, Celisse A. A survey of cross-validation procedures for model selection. Stat Surveys 2010;4:40–79. http://dx.doi.org/10.1214/09-SS054.
  • [38] Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2; 1995. p. 1137–43.
  • [39] Jassem W. The formant patterns of fricative consonants. STL-QPSR 1962;3(3):6–15.
Notes
Record developed under agreement 509/P-DUN/2018, with funds of the Ministry of Science and Higher Education (MNiSW) allocated to science-dissemination activities (2019).
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-55675895-88f8-4c72-a747-83fa89d19565