Article title

Fusing the electromagnetic articulograph, high-speed video cameras and a 16-channel microphone array for speech analysis

Contents
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Electromagnetic articulography (EMA) is one of the instrumental phonetic research methods used for recording and assessing articulatory movements. Usually, articulographic data are analysed together with standard audio recordings. This paper, however, demonstrates how coupling the articulograph with devices providing other types of information may be used in more advanced speech research. A novel measurement system is presented that consists of the AG 500 electromagnetic articulograph, a 16-channel microphone array with a dedicated audio recorder and a video module consisting of 3 high-speed cameras. It is argued that synchronization of all these devices allows for comparative analyses of results obtained with the three components. To complement the description of the system, the article presents innovative data analysis techniques developed by the authors as well as preliminary results of the system’s accuracy.
Year
Pages
257–266
Physical description
Bibliography: 41 items, figures, charts, tables.
Authors
author
  • Polytechnic Institute, State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland
author
  • Department of Speech Therapy and Applied Linguistics, Maria Curie-Skłodowska University, Sowińskiego 17, 20-040 Lublin, Poland
  • Institute of Applied Polish Studies, University of Warsaw, Krakowskie Przedmieście 26/28, 00-927 Warszawa, Poland
author
  • Polytechnic Institute, State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland
author
  • Polytechnic Institute, State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland
  • Amsterdam School of International Business, Amsterdam University of Applied Sciences, Fraijlemaborg 133, 1102CV Amsterdam, Netherlands
author
  • Polytechnic Institute, State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland
Bibliography
  • [1] K. Mathiak, U. Klose, H. Ackermann, I. Hertrich, W.E. Kincses, and W. Grodd, “Stroboscopic articulography using fast magnetic resonance imaging”, Int. J. Lang. Commun. Disord., 35(3), 419–425 (2000).
  • [2] L. Davidson, “Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance”, J. Acoust. Soc. Am. 120(1), 407–415 (2006).
  • [3] K.L. Moll, “Cinefluorographic techniques in speech research”, J. Speech Hear. Res. 3, 227–241 (1960).
  • [4] T. Baer, J. Gore, S. Boyce, and P. Nye, “Analysis of vocal tract shape and dimensions using magnetic resonance imaging: vowels”, J. Acoust. Soc. Am. 90, 799–828 (1991).
  • [5] C. Moore, “The correspondence of vocal tract resonance with volumes obtained from magnetic resonance images”, J. Speech Hear. Res. 35, 1009–1023 (1992).
  • [6] B. Wein, W. Angerstein, C. Neuschaefer-Rube, A. Obrębowski, and S. Klajman, “Badanie obwodowego narządu mowy przy wymowie głosek polskich za pomocą jądrowego rezonansu magnetycznego (NMR)”, Polish Journal of Otolaryngology (Otolaryngologia Polska) 48(2), 178–198 (1994).
  • [7] J. Dang, K. Honda, and H. Suzuki, “Morphological and acoustical analysis of the nasal and paranasal cavities”, J. Acoust. Soc. Am., 96(4), 2088–2100 (1994).
  • [8] A. Serrurier and P. Badin, “A three-dimensional linear articulatory model of velum based on MRI data”, Proceedings of the 9th Eurospeech, 2161–2164 (2005).
  • [9] M. Toda, S. Maeda, and K. Honda, “Formant-cavity affiliation in sibilant fricatives”, Turbulent sounds: An interdisciplinary guide, eds. S. Fuchs, M. Toda, M. Zygis, De Gruyter Mouton, 343–371 (2010).
  • [10] T. Sorensen, A. Toutios, L. Goldstein, and S.S. Narayanan, “Characterizing vocal tract dynamics with real-time MRI”, 15th Conference on Laboratory Phonology, (2016).
  • [11] M. Stone, G. Stock, K. Bunin, K. Kumar, and M. Epstein, “Comparison of speech production in upright and supine position”, J. Acoust. Soc. Am., 122, 532–541 (2007).
  • [12] J.S. Perkell, M.H. Cohen, M.A. Svirsky, M.L. Matthies, I. Garabieta, and M.T. Jackson, “Electromagnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements”, J. Acoust. Soc. Am. 92(6), 3078–3096 (1992).
  • [13] K. Richmond, “Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion”, Advances in Nonlinear Speech Processing – Lect. Notes Comput. Sc. 4885, 263–272 (2007).
  • [14] P. Badin, F. Elisei, G. Bailly, and Y. Tarabalka, “An audiovisual talking head for augmented speech generation: models and animations based on a real speaker’s articulatory data”, Proceedings of the 5th AMDO, 132–143 (2008).
  • [15] O. Engwall and J. Beskow, “Resynthesis of 3D tongue movements from facial data”, Proceedings of the 8th Eurospeech, 2261–2264 (2003).
  • [16] J. Beskow, O. Engwall, and B. Granström, “Simultaneous measurements of facial and intraoral articulation”, Proceedings of Fonetik, 57–60 (2003).
  • [17] H. Kjellström and O. Engwall, “Audiovisual to articulatory inversion”, Speech Communication 51(3), 195–209 (2009).
  • [18] D. Schabus, M. Pucher, and P. Hoole, “The MMASCS multimodal annotated synchronous corpus of audio, video, facial motion and tongue motion data of normal, fast and slow speech”, Proceedings of the 9th LREC, 3411–3416 (2014).
  • [19] R.A. Krakow and M.K. Huffman, “Instruments and techniques for investigating nasalization and velopharyngeal function in the laboratory: An introduction”, Phonetics and Phonology 5: Nasals, Nasalization and the Velum, 3–59 (1993).
  • [20] F. Bell-Berti, “An electromyographic study of velopharyngeal function in speech”, J. Speech. Hear. Res. 19, 225–240 (1976).
  • [21] T. Bressmann, B. Radovanovic, G.V. Kulkarni, P. Klaiman, and D. Fisher, “An ultrasonographic investigation of cleft-type compensatory articulations of voiceless velar stops”, Clinical Linguistics & Phonetics, 1028–1033 (2011).
  • [22] S. Rossato, P. Badin, and F. Bouaouni, “Velar movements in French: An articulatory and acoustical analysis of coarticulation”, Proceedings of the 15th ICPhS, 3141–3144 (2003).
  • [23] D. Warren, “Velopharyngeal orifice size and upper pharyngeal pressure-flow patterns in normal speech”, Plast. Reconstr. Surg. 33, 148–161 (1964).
  • [24] D. Warren, “Velopharyngeal orifice size and upper pharyngeal pressure-flow patterns in cleft palate speech: a preliminary study”, Plast. Reconstr. Surg., 34, 15–26 (1964).
  • [25] D. Warren, “Nasal emission of air and velopharyngeal function”, Cleft Palate J., 16, 279–285 (1967).
  • [26] M. Rothenberg, “Measurements of air flow in speech”, J. Speech Hear. Res. 20, 155–176 (1977).
  • [27] S.G. Fletcher and M.E. Bishop, “Measurement of nasality with TONAR”, Cleft Palate J. 7, 610–621 (1970).
  • [28] R.M. Dalston, D.W. Warren, and E.T. Dalston, “Use of nasometry as a diagnostic tool for identifying patients with velopharyngeal impairment”, Cleft Palate J. 28(2), 184–189 (1991).
  • [29] A. Lorenc, “Wymowa normatywna polskich samogłosek nosowych i spółgłoski bocznej”, Dom Wydawniczy ELIPSA, Warszawa 2016.
  • [30] D. Król, A. Lorenc, and R. Święciński, “Detecting laterality and nasality in speech with the use of a multi-channel recorder”, Proceedings of the 40th IEEE ICASSP, 5147–5151 (2015).
  • [31] A. Lorenc, R. Święciński, and D. Król, “Assessment of sound laterality with the use of a multi-channel recorder”, Proceedings of the 18th ICPhS, (2015).
  • [32] P. Hoole, C. Mooshammer, and H.G. Tillmann, “Kinematic analysis of vowel production in German”, Proceedings of the 3rd ICSLP, 53–56 (1994).
  • [33] Carstens Medizinelektronik GmbH, “SyBox Opto-4 Manual ver. 0”, http://ag500.de/Sybox_man.pdf
  • [34] C. Kroos, “Measurement accuracy in 3D electromagnetic articulography (Carstens AG500)”, 8th International Seminar on Speech Production, 61–64 (2008).
  • [35] Y. Yunusova, J.R. Green, and A. Mefferd, “Accuracy assessment for AG500, electromagnetic articulograph”, J. Speech Lang. Hear. Res. 52(2), 81–84 (2009).
  • [36] P. Hoole and A. Zierdt, “Five-dimensional articulography”, Speech Motor Control: New developments in basic and applied research, eds. B. Maassen and P.H.H.M. Van Lieshout, 331–349 (2009).
  • [37] M. Stella, P. Bernardini, F. Sigona, A. Stella, M. Grimaldi, and B. Gili Fivela, “Numerical instabilities and three-dimensional electromagnetic articulography”, J. Acoust. Soc. Am. 132(6), 3941–3949 (2012).
  • [38] Carstens Medizinelektronik GmbH, “NormPos Program Description”, http://www.ag500.de/manual/ag500/NormPos.pdf
  • [39] P. Boersma and D. Weenink, “Praat: doing phonetics by computer” [software], (2016). Different versions retrieved in 2016 from: http://www.praat.org/.
  • [40] C.T. Best, C. Kroos, R.L. Bundgaard-Nielsen, B. Baker, M. Harvey, M. Tiede, and L. Goldstein, “Articulatory basis of the apical/laminal distinction: tongue tip/body coordination in the Wubuy 4-way coronal stop contrast”, Proceedings of the 10th ISSP, 33–36 (2014).
  • [41] M. Tiede, “MVIEW: Multi-channel visualization application for displaying dynamic sensor movement” [software], (2010).
Notes
PL
Record prepared under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-495fdbc4-1112-444f-90df-344a43b2b13c