Article title

A class of neuro-computational methods for Assamese fricative classification

Publication languages
EN
Abstracts
EN
In this work, a class of neuro-computational classifiers is used to classify the fricative phonemes of the Assamese language. Initially, a Recurrent Neural Network (RNN) based classifier is used. Later, a neuro-fuzzy classifier is used in its place. We use two different feature sets: one built from specific acoustic-phonetic characteristics, and another capturing temporal attributes using linear prediction cepstral coefficients (LPCC) and a Self-Organizing Map (SOM). Here, we present the experimental details and the difference in performance obtained by replacing the RNN-based classifier with an adaptive neuro-fuzzy inference system (ANFIS) based block, for both feature sets, in recognizing Assamese fricative sounds.
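As an illustration of the LPCC features mentioned in the abstract, the following is a minimal sketch of how linear prediction cepstral coefficients can be computed for a single speech frame. It uses the standard autocorrelation method with the Levinson-Durbin recursion, followed by the usual LPC-to-cepstrum recursion. The function names, the Hamming window, and the defaults `order=12` and `n_ceps=12` are assumptions made for this example; the abstract does not state the settings used in the paper.

```python
import numpy as np

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] from autocorrelation lags r[0..order].

    Convention: s[n] is approximated by sum_k a[k] * s[n-k].
    """
    a = np.zeros(order + 1)
    err = r[0]  # prediction error energy, updated at every order
    for i in range(1, order + 1):
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k = acc / err  # reflection coefficient
        prev = a.copy()
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1 / (1 - sum_k a[k] z^-k)."""
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def lpcc(frame, order=12, n_ceps=12):
    """LPCC vector for one frame (assumed settings: Hamming window, order 12)."""
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(frame) - 1:len(frame) + order]  # lags 0..order
    if r[0] == 0.0:
        return np.zeros(n_ceps)  # silent frame: no spectral envelope to model
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)
```

For a first-order model with coefficient 0.5, the recursion reproduces the closed form c_n = 0.5^n / n, which gives a quick sanity check. In a pipeline like the one described, per-frame vectors of this kind would be the input to the SOM and the classifier.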
Year
Pages
59-70
Physical description
Bibliography: 37 items, figures, tables.
Authors
author
  • Department of Applied Electronics and Instrumentation, GIMT, Guwahati-781017, Assam, India
author
  • Department of Electronics and Communication Engineering, Gauhati University, Guwahati-781014, Assam, India
author
  • Department of Electronics and Communication Technology, Gauhati University, Guwahati-781014, Assam, India
Bibliography
  • [1] M. Sarma and K.K. Sarma, Phoneme-Based Speech Segmentation Using Hybrid Soft Computing Framework. Studies in Computational Intelligence vol. 550, Springer India, New Delhi, 2014.
  • [2] M. Sarma and K. K. Sarma, An ANN based approach to Recognize Initial Phonemes of Spoken Words of Assamese Language. Elsevier International Journal of Applied Soft Computing, vol. 13, no. 5, pp. 2281-2291, 2013.
  • [3] T Robinson, M Hochberg and S Renals, IPA: Improved phone modelling with recurrent neural networks. In Proceedings of IEEE ICASSP, 1994.
  • [4] T Lee, P C Ching and L W Chan, An RNN Based Speech Recognition System with Discriminative Training. In Proceedings of the 4th European Conference on Speech Communication and Technology, pp. 1667-1670, 1995.
  • [5] L. H. R. C. Jamieson, Experiments on the Implementation of Recurrent Neural Networks for Speech Phone Recognition. Proceedings of the Thirtieth Annual Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, pp. 779-782, November, 1996.
  • [6] T. Koizumi, M. Mori, S. Taniguchi and M. Maruya, Recurrent Neural Networks for Phoneme Recognition. In Proceedings Fourth International Conference, vol. 1, pp. 326 -329, 1996.
  • [7] L J M Rothkrantz and D Nollen, Speech Recognition Using Elman Neural Networks. Text, Speech and Dialogue, Lecture Notes in Computer Science, 1692: 146-151, 1999.
  • [8] Y Sun, L T Bosch and L Boves, Hybrid HMM/BLSTM-RNN for Robust Speech Recognition. In Proceedings of 13th International Conference on Text, Speech and Dialogue, Springer-Verlag Berlin, Heidelberg: 400-407, 2010.
  • [9] O Vinyals, S V Ravuri and D Povey, Revisiting Recurrent Neural Networks For Robust ASR. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012.
  • [10] T Mikolov and G Zweig, Context Dependent Recurrent Neural Network Language Model. In Proceedings of the IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA: 234-239, 2012.
  • [11] S. Badura, M. Fnitrik, O. Skvarek, M. Klimo, Bimodal vowel recognition using fuzzy logic networks - naive approach. In Proceedings of ELEKTRO, Rajecke Teplice, 2014.
  • [12] R. Halavati, S. B. Shouraki, S. H. Zadeh, Recognition of human speech phonemes using a novel fuzzy approach. Applied Soft Computing, 7:828-839, 2007.
  • [13] I. B. Fredj, K. Ouni, A novel phonemes classification method using fuzzy logic. Science Journal of Circuits, Systems and Signal Processing vol. 2, no. 1, pp. 1-5, 2013.
  • [14] P. Melin, J. Urias, D. Solano, M. Soto, M. Lopez and O. Castillo, Voice Recognition with Neural Networks, Type-2 Fuzzy Logic and Genetic Algorithms, Engineering Letters, vol. 3, no. 2, 2006.
  • [15] A. Taleb, Speech Recognition by Fuzzy-Neuro ANFIS Network and Genetic Algorithms, In Proceedings of International Conference on Intelligent Computational Systems, Dubai, 2012.
  • [16] A. Jongman, R. Wayland, and S. Wong, Acoustic characteristics of English fricatives. Journal of the Acoustical Society of America, vol. 108, no. 3, September, 2000.
  • [17] D. O’Shaughnessy, Speech Communication Human and Machine, 2nd Edition, IEEE Press, New York, 2000.
  • [18] Kenneth N. Stevens, Acoustic Phonetics, 1st MIT Press paperback Edition, The MIT Press, Cambridge, Massachusetts, London, England, 2000.
  • [19] P. Ladefoged, S. F. Disner, Vowels and Consonants, 3rd Edition, Wiley-Blackwell Publishing Ltd., West Sussex, UK, 2012.
  • [20] G. C. Goswami, Structure of Assamese, 1st Edition, Department of Publication, Gauhati University, Guwahati, Assam, India, 1982.
  • [21] U. N. Goswami, An Introduction to Assamese, Mani-Manik Prakash, Guwahati, Assam, India, 1978.
  • [22] G. C. Goswami and J. P. Tamuli, “Asamiya”, in G. Cardona and D. Jain (eds.), The Indo-Aryan Languages, London: Routledge, pp. 391-443, 2003.
  • [23] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice-Hall of India Pvt. Ltd., Delhi, India, 2005.
  • [24] J. R. Jang, ANFIS: Adaptive Network-Based Fuzzy Inference System, IEEE Transactions on Systems, Man and Cybernetics, Vol. 23, No. 3, 1993.
  • [25] A. Abraham, Neuro Fuzzy Systems: State-of-the-art Modeling Techniques, Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence, Lecture Notes in Computer Science, Vol. 2084, pp. 269-276, 2001.
  • [26] K. Haese, Self-organizing feature maps with self-adjusting learning parameters. IEEE Transactions on Neural Networks, vol. 9, pp. 1270-1278, 1998.
  • [27] J. S. R. Jang, C. T. Sun and E. Mizutani, Neuro-Fuzzy and Soft-Computing, 1st Edition, Prentice-Hall of India Pvt. Ltd., Delhi, India, 2011.
  • [28] J. Makhoul, Linear prediction: A tutorial review, In Proceedings of IEEE, vol. 63, pp. 561-580, 1975.
  • [29] B. P. Bogert, M. J. R. Healy, and J. W. Tukey, The Quefrency Alanysis of Time Series for Echoes: Cepstrum, Pseudo Autocovariance, Cross-Cepstrum and Saphe Cracking, Proceedings of the Symposium on Time Series Analysis, Chapter 15, pp. 209-243, 1963.
  • [30] A. M. Noll and M. R. Schroeder, Short-Time 'Cepstrum' Pitch Detection, Journal of the Acoustical Society of America, vol. 36, no. 5, pp. 1030-1036, 1964.
  • [31] A. M. Noll, Short-Time Spectrum and Cepstrum Techniques for Vocal-Pitch Detection, Journal of the Acoustical Society of America, vol. 36, no. 2, pp. 296-302, 1964.
  • [32] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1975.
  • [33] Sarma B D, Sarma M, Sarma M and Prasanna S R M, Development of Assamese Phonetic Engine: Some Issues, In Proceedings of INDICON-2013, IIT Bombay, Mumbai, India, 2013.
  • [34] Rajen Barua, The X sound in Assamese language, The Assam Tribune, Guwahati, Sunday, March 5, 2006.
  • [35] Prof. Gautam Baruah, Dept. of CSE, IIT Guwahati, Available via tdil.mit.gov.in/assamesecodechartoct02.pdf
  • [36] http://www.speech.kth.se/wavesurfer/
  • [37] P. Boersma and D. Weenink, Praat: doing phonetics by computer. Available via http://www.fon.hum.uva.nl/praat
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-ac623236-2c34-41d8-8505-683249eb23de