Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
We present a comparative evaluation of classification algorithms for a fusion engine used in a speaker identity selection task. The fusion engine combines the scores from a number of classifiers, which use the GMM-UBM approach to match speaker identity. The performance of the evaluated classification algorithms was examined in both the text-dependent and text-independent operation modes. The experimental results indicated a significant improvement in speaker identification accuracy of approximately 7% and 14.5% for the text-dependent and text-independent scenarios, respectively. Based on these findings, we suggest using fusion with a discriminative algorithm such as a Support Vector Machine in real-world speaker identification applications, where the text-independent scenario predominates.
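The score-level fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how the fusion engine was built, so everything below is an assumption made for illustration. The function names (`train_linear_svm`, `fuse`, `identify`), the two-classifier setup, and the toy scores are hypothetical, and the linear SVM is trained by plain batch subgradient descent on the hinge loss rather than by a library solver. Each training vector holds the scores that the individual classifiers assigned to one (utterance, candidate speaker) pair, labelled +1 when the candidate is the true speaker and -1 otherwise.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM by batch subgradient descent on the
    L2-regularised hinge loss. Returns (weights, bias)."""
    dim = len(X[0])
    n = len(X)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge loss
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fuse(w, b, scores):
    """Fused score for one candidate: weighted sum of classifier scores."""
    return sum(wj * sj for wj, sj in zip(w, scores)) + b


def identify(w, b, trial):
    """Pick the enrolled speaker with the highest fused score.
    `trial` maps speaker name -> list of per-classifier scores."""
    return max(trial, key=lambda spk: fuse(w, b, trial[spk]))


# Toy training data (invented): [score from classifier 1, score from classifier 2].
# Classifier 1 separates targets from impostors; classifier 2 is mostly noise.
X = [
    [2.0, 0.5], [1.5, -0.2], [1.8, 0.1], [2.2, -0.4],      # true-speaker pairs
    [-1.0, 0.3], [-1.5, -0.1], [-0.5, 0.6], [-1.2, -0.5],  # impostor pairs
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)

# One test utterance scored against two enrolled speakers (hypothetical names).
# Classifier 2 on its own would prefer "bob"; the fusion weights decide.
trial = {"alice": [1.2, -0.3], "bob": [-0.8, 0.9]}
print(identify(w, b, trial))
```

In this sketch the fusion stage learns how much to trust each classifier from labelled score vectors, which is the sense in which it is discriminative. A fixed rule such as summing the scores would give every classifier the same weight regardless of its reliability.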
Keywords
Publisher
Journal
Year
Volume
Pages
55–64
Physical description
Bibliography: 30 items, figures, tables.
Contributors
Bibliography
- [1] Altman, N.S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3), 175–185
- [2] Beigi, H. (2011). Speaker Recognition, Encyclopedia of Cryptography and Security, Springer, pp. 1232–1242
- [3] Bimbot, F., Bonastre, J.F., Fredouille, C., Gravier, G., Magrin-Chagnolleau, I., Meignier, S., Reynolds, D.A. (2004). A tutorial on text-independent speaker verification. EURASIP Journal on Applied Signal Processing, 2004, 430–451
- [4] Bishop, C.M. (2008, June). A new framework for machine learning. In IEEE World Congress on Computational Intelligence (pp. 1–24). Springer Berlin Heidelberg
- [5] Bouchard, G. (2007). Bias-variance tradeoff in hybrid generative-discriminative models. In Machine Learning and Applications. ICMLA 2007. Sixth International Conference on (pp. 124–129). IEEE
- [6] Burges, C.J.C., Ben, J.I., Denker, J.S., LeCun, Y., Nohl, C.R. (1993). Off line recognition of handwritten postal words using neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 7(04), 689–704
- [7] Campbell, J.P. (1997). Speaker recognition: a tutorial. Proceedings of the IEEE, 85(9), 1437–1462
- [8] Campbell, J.P., Reynolds, D.A. (1999, March). Corpora for the evaluation of speaker recognition systems. In Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on (Vol. 2, pp. 829–832). IEEE
- [9] Dehak, N., Kenny, P.J., Dehak, R., Dumouchel, P., Ouellet, P. (2011). Front-end factor analysis for speaker verification. IEEE Transactions on Audio, Speech, and Language Processing, 19(4), 788–798
- [10] Damper, R.I., Higgins, J.E. (2003). Improving speaker identification in noise by subband processing and decision fusion. Pattern Recognition Letters, 24(13), 2167–2173
- [11] Furui, S. (1981). Cepstral analysis technique for automatic speaker verification. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(2), 254–272
- [12] Ganchev, T., Siafarikas, M., Mporas, I., Stoyanova, T. (2014). Wavelet basis selection for enhanced speech parametrization in speaker verification. International Journal of Speech Technology, 17(1), 27–36
- [13] Hermansky, H., Morgan, N. (1994). RASTA processing of speech. IEEE transactions on speech and audio processing, 2(4), 578–589
- [14] Hsu, C.W., Lin, C.J. (2002). A comparison of methods for multiclass support vector machines. IEEE transactions on Neural Networks, 13(2), 415–425
- [15] Kittler, J., Hatef, M., Duin, R.P., Matas, J. (1998). On combining classifiers. IEEE transactions on pattern analysis and machine intelligence, 20(3), 226–239
- [16] Kuncheva, L.I., Alpaydin, E. (2007). Combining Pattern Classifiers: Methods and Algorithms, IEEE Transactions on Neural Networks, 18(3), 964–964
- [17] Kung, S.Y. (2014). Kernel methods and machine learning. Cambridge University Press. pp. 341–342
- [18] Larcher, A., Lee, K.A., Ma, B., Li, H. (2014). Text-dependent speaker verification: Classifiers, databases and RSR2015. Speech Communication, 60, 56–77
- [19] Mitchell, H. B. (2007). Multi-sensor data fusion: an introduction. Springer Science & Business Media
- [20] Monte-Moreno, E., Chetouani, M., Faundez-Zanuy, M., Sole-Casals, J. (2009). Maximum likelihood linear programming data fusion for speaker recognition. Speech Communication, 51(9), 820–830
- [21] Najafian, M., Safavi, S., Weber, P., Russell, M. (2016). Identification of British English regional accents using fusion of i-vector and multi-accent phonotactic systems. ODYSSEY
- [22] Nandakumar, K., Jain, A. K. (2008, September). Multibiometric template security using fuzzy vault. In Biometrics: Theory, Applications and Systems, 2008. BTAS 2008. 2nd IEEE International Conference on (pp. 1–6). IEEE
- [23] Pal, S.K., Mitra, S. (1996). Noisy fingerprint classification using multilayer perceptron with fuzzy geometrical and textural features. Fuzzy sets and systems, 80(2), 121–132
- [24] Ramachandran, R.P., Farrell, K.R., Ramachandran, R., Mammone, R.J. (2002). Speaker recognition–general classifier approaches and data fusion methods. Pattern Recognition, 35(12), 2801–2821
- [25] Raudys, Š. (2006). Trainable fusion rules. I. Large sample size case. Neural Networks, 19(10), 1506–1516
- [26] Reynolds, D.A., Rose, R.C. (1995). Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE transactions on speech and audio processing, 3(1), 72–83
- [27] Reynolds, D.A., Quatieri, T.F., Dunn, R.B. (2000). Speaker verification using adapted Gaussian mixture models. Digital signal processing, 10(1), 19–41
- [28] Safavi, S., Gan, H., Mporas, I., Sotudeh, R. (2016). Fraud detection in voice-based identity authentication applications and services. In The IEEE International Conference on Data Mining series (ICDM)
- [29] Safavi, S., Hanani, A., Russell, M., Jancovic, P., Carey, M.J. (2012). Contrasting the effects of different frequency bands on speaker and accent identification. IEEE Signal Processing Letters, 19(12), 829–832
- [30] Safavi, S., Jancovic, P., Russell, M.J., Carey, M.J. (2013). Identification of gender from children’s speech by computers and humans. In INTERSPEECH (pp. 2440–2444)
Notes
PL
Record prepared with funds from the Polish Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-52117dc4-7912-4ea4-ae80-87ff199cbe0f