Article title

Determination of Input Parameters of the Neural Network Model, Intended for Phoneme Recognition of a Voice Signal in the Systems of Distance Learning

Publication languages
EN
Abstracts
EN
The article addresses the introduction of voice signal recognition tools into distance learning systems. The research results establish the prospects of neural network tools for phoneme recognition. The main difficulty in building a neural network model for phoneme recognition in a distance learning system is shown to be the uncertain duration of a phoneme-like element. For this reason, the most effective type of neural network model, the multilayer perceptron, cannot be applied directly, because its number of input parameters is fixed. To mitigate this shortcoming, a procedure is developed that transforms the non-stationary digitized voice signal into a fixed number of mel-cepstral coefficients, which serve as the basis for calculating the input parameters of the neural network model. Unlike known procedures, it allows linear scaling of phoneme-like elements. A number of computer experiments confirmed that the proposed procedure for coding input parameters provides acceptable accuracy of neural network phoneme recognition under near-natural conditions of a distance learning system. Prospects for further research on neural network phoneme recognition of voice signals in distance learning systems are connected with increasing the admissible noise level. Adapting the proposed procedure to various natural languages and to other applied tasks, for instance biometric authentication in the banking sector, is also of great interest.
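The abstract does not give the details of the coding procedure, so the following is only a minimal sketch of the general idea: a phoneme-like segment of arbitrary length is linearly rescaled to a fixed length, split into a fixed number of frames, and each frame is reduced to a few mel-cepstral coefficients. All function names, the sample rate, the frame count, and the filter and coefficient counts are assumptions chosen for illustration, not values from the article. The sample rate is treated as nominal after rescaling.

```python
import numpy as np

def rescale_linear(segment, target_len):
    """Linearly resample a 1-D segment to exactly target_len samples."""
    src = np.linspace(0.0, 1.0, num=len(segment))
    dst = np.linspace(0.0, 1.0, num=target_len)
    return np.interp(dst, src, segment)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct2(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def fixed_size_features(segment, sample_rate=16000, target_len=2048,
                        n_frames=8, n_filters=20, n_ceps=12):
    """Map a variable-length segment to n_frames * n_ceps values.

    Hypothetical illustration only; parameter values are assumptions.
    target_len must be divisible by n_frames.
    """
    scaled = rescale_linear(np.asarray(segment, dtype=float), target_len)
    frames = scaled.reshape(n_frames, target_len // n_frames)
    frame_len = frames.shape[1]
    windowed = frames * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    log_energy = np.log(power @ bank.T + 1e-10)  # small offset avoids log(0)
    return dct2(log_energy, n_ceps).ravel()
```

With these assumed defaults, segments of 900 and of 3100 samples both yield a vector of 96 values (8 frames of 12 coefficients), which is the kind of fixed-size input a multilayer perceptron requires.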
Year
Pages
425-432
Physical description
Bibliography: 18 items, figures, tables, charts.
Authors
author
  • Caspian State University of Technologies and Engineering named after Sh. Yessenov, Aktau, Republic of Kazakhstan
  • National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine
  • Department of IT Engineering, Almaty University of Power Engineering and Telecommunications, Almaty, Republic of Kazakhstan
  • Kyiv National University of Construction and Architecture, Ukraine
Notes
Record prepared under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2019).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-8d8e9fcb-d8ae-4980-845b-cb7a6cf0ca8c