Projekt koncepcyjny bazy danych do przechowywania nagrań z badań artykulograficznych mowy polskiej

Wielgat, R.; Jędryka, R.; Mik, Ł.; Król, D.

doi:10.5604/01.3001.0010.7558

Article details

Title variants

Conceptual design of a database to store recordings from articulographic studies of Polish speech

Publication languages

Abstracts

The article describes the structure and functionality of an articulographic database for storing data from studies conducted with an electromagnetic articulograph, an acoustic camera, and three video cameras. The database supports selective retrieval of different types of data for scientific research, in particular data about the speaker, the recording session, the recordings, and the experiments, and it interoperates with programs that carry out experiments. The structure and construction of the database are described. Potential future applications in statistical analysis and in speech inversion experiments using dynamic Bayesian network (DBN) models are also presented.
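As a rough illustration of the selective retrieval mentioned in the abstract, the sketch below models speakers, recording sessions, recordings, and experiments as relational tables and fetches one speaker's recordings for a single modality. It is only a minimal sketch: the use of SQLite, every table and column name, the modality labels, and the file paths are hypothetical assumptions made for this example and are not taken from the article.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions,
# not the schema described in the article.
SCHEMA = """
CREATE TABLE speaker (
    speaker_id INTEGER PRIMARY KEY,
    code       TEXT NOT NULL UNIQUE
);
CREATE TABLE recording_session (
    session_id INTEGER PRIMARY KEY,
    speaker_id INTEGER NOT NULL REFERENCES speaker(speaker_id),
    label      TEXT NOT NULL
);
CREATE TABLE recording (
    recording_id INTEGER PRIMARY KEY,
    session_id   INTEGER NOT NULL REFERENCES recording_session(session_id),
    modality     TEXT NOT NULL
                 CHECK (modality IN ('ema', 'acoustic_camera', 'video')),
    file_path    TEXT NOT NULL
);
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE experiment_recording (
    experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
    recording_id  INTEGER NOT NULL REFERENCES recording(recording_id),
    PRIMARY KEY (experiment_id, recording_id)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demonstration rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO speaker VALUES (1, 'SPK01')")
    conn.execute("INSERT INTO recording_session VALUES (1, 1, 'session-1')")
    conn.executemany(
        "INSERT INTO recording VALUES (?, ?, ?, ?)",
        [
            (1, 1, "ema", "spk01/s1/ema.pos"),
            (2, 1, "acoustic_camera", "spk01/s1/array.wav"),
            (3, 1, "video", "spk01/s1/cam1.avi"),
            (4, 1, "video", "spk01/s1/cam2.avi"),
            (5, 1, "video", "spk01/s1/cam3.avi"),
        ],
    )
    conn.execute("INSERT INTO experiment VALUES (1, 'demo-experiment')")
    conn.execute("INSERT INTO experiment_recording VALUES (1, 1)")
    conn.commit()
    return conn


def recordings_for(conn, speaker_code, modality):
    """Return file paths of one speaker's recordings for a single modality."""
    rows = conn.execute(
        """
        SELECT r.file_path
        FROM recording AS r
        JOIN recording_session AS s ON s.session_id = r.session_id
        JOIN speaker AS p ON p.speaker_id = s.speaker_id
        WHERE p.code = ? AND r.modality = ?
        ORDER BY r.recording_id
        """,
        (speaker_code, modality),
    )
    return [row[0] for row in rows]
```

Calling `recordings_for(build_demo_db(), "SPK01", "video")` would return only the three video files for that speaker, leaving the articulograph and acoustic camera data untouched.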

Keywords

electromagnetic articulography; database; Bayesian network; speech inversion; acoustic camera; articulatory phonetics; acoustic phonetics

Publisher

Państwowa Wyższa Szkoła Zawodowa w Tarnowie

Journal

Science, Technology and Innovation

Year

2017

Volume

Vol. 1, no. 1

Pages

64–72

Physical description

Bibliography: 30 items.

Authors

author

Wielgat R.

rwielgat@poczta.onet.pl

State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland

author

Jędryka R.

State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland

author

Mik Ł.

Maria Curie-Skłodowska University, Department of Speech Therapy and Applied Linguistics, Sowińskiego 17, 20-040 Lublin, Poland

author

Król D.

State Higher Vocational School in Tarnow, Mickiewicza 8, 33-100 Tarnów, Poland

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-e243460a-4a64-42a6-8376-194eee992781