An embedded system for real-time speaker recognition using Raspberry Pi platform

Weychan, R.; Marciniak, T.; Dąbrowski, A.

doi:10.15199/13.2016.4.1

Article details

Title variants

System wbudowany do rozpoznawania mówcy w czasie rzeczywistym zrealizowany za pomocą platform Raspberry Pi

Conference

IEEE SPA 2015 (19 ; 23-25.09.2015 ; Poznań, Poland)

Publication languages

Abstracts

The paper presents an embedded system that performs real-time speaker recognition from internet radio broadcasts. The proposed solution was developed in the open-source Python programming language. It was first tested in the Windows environment and then adapted to the Unix operating system so that it could be used on the Raspberry Pi 2 platform. We analyzed the available libraries to select the most convenient solutions for the individual blocks of the speaker recognition task. We also indicate the parameters for which the algorithm exhibits the greatest efficiency. The prepared software is available in a GitHub repository.

The article presents a system that recognizes speakers from internet radio. The proposed solution uses tools provided as publicly available software for the Python language. The software was tested in the Windows environment and then adapted to run on the Raspberry Pi 2 platform under Linux. The article analyzes the available libraries that were used to implement the feature extraction and speech signal modeling algorithms. The experiments made it possible to select the system parameters that give the best identification accuracy together with the highest data processing speed. The prepared software is available in a GitHub repository.
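The decision step of a GMM-based speaker identification system, as named in the abstract and keywords, can be illustrated with a short sketch. It is not the authors' implementation, which is in their repository. It assumes that each enrolled speaker already has a trained diagonal-covariance GMM and that the incoming audio has already been converted to feature vectors. The model parameters, speaker names and feature values below are hypothetical and chosen only to make the example runnable.

```python
import math


def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            # Log-density of a 1-D Gaussian, summed over feature dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))


def identify_speaker(frames, models):
    """Return (best_name, scores); scores hold the mean per-frame log-likelihood."""
    scores = {}
    for name, (weights, means, variances) in models.items():
        total = sum(diag_gmm_loglik(f, weights, means, variances) for f in frames)
        scores[name] = total / len(frames)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical two-component models over two-dimensional features.
# Each entry is (weights, means, variances).
MODELS = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Hypothetical feature vectors from a short segment of audio.
FRAMES = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify_speaker(FRAMES, MODELS)
print(best, scores)
```

Averaging the log-likelihood over frames keeps scores comparable between segments of different length. A real system would train the models from enrollment recordings and use higher-dimensional features.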

Keywords

speaker recognition; GMM; internet radio; Python; Raspberry Pi

Publisher

Wydawnictwo SIGMA-NOT

Journal

Elektronika : konstrukcje, technologie, zastosowania

Year

2016

Volume

Vol. 57, no. 4

Pages

3–6

Physical description

22 references, illustrations, figures

Contributors

author

Weychan R.

Poznan University of Technology, Faculty of Computing Science, Chair of Control and Systems Engineering, Division of Signal Processing and Electronic Systems

author

Marciniak T.

Poznan University of Technology, Faculty of Computing Science, Chair of Control and Systems Engineering, Division of Signal Processing and Electronic Systems

author

Dąbrowski A.

Poznan University of Technology, Faculty of Computing Science, Chair of Control and Systems Engineering, Division of Signal Processing and Electronic Systems

Bibliography

[1] Beigi H. 2011. Fundamentals of speaker recognition. Springer Science & Business Media.
[2] Dabrowski A., Drgas S., Marciniak T. 2008. “Detection of GSM speech coding for telephone call classification and automatic speaker recognition”. ICSES conference proceedings : 415–418.
[3] Davis S., Mermelstein P. 1980. “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences”. Acoustics, Speech and Signal Processing, IEEE Transactions on vol. 28 no. 4 : 357–366.
[4] Garofolo J. S., Linguistic Data Consortium et al. 1993. TIMIT: acoustic-phonetic continuous speech corpus.
[5] Github source code repository. [Online]. Available: https://github.com/audiodsp/Internet_radio_speaker_recognition.
[6] Jones E., Oliphant T., Peterson P. et al. 2001. “SciPy: Open source scientific tools for Python”. [Online]. Available: http://www.scipy.org/.
[7] Lenarczyk P., Piotrowski Z. 2013. “Speaker recognition system based on GMM multivariate probability distributions built-in a digital watermarking token”. Przegląd Elektrotechniczny vol. 89 no. 2a : 59–63.
[8] Marciniak T., Weychan R., Dabrowski A., and Krzykowska A. 2010. “Speaker recognition based on short Polish sequences”. IEEE SPA: Signal Processing Algorithms, Architectures, Arrangements, and Applications Conference Proceedings : 95–98.
[9] Marciniak T., Weychan R., Stankiewicz A., Dabrowski A. 2014. “Biometric speech signal processing in a system with digital signal processor”. Bulletin of the Polish Academy of Sciences Technical Sciences vol. 62 no. 3 : 589–594.
[10] Numeric and scientific Python packages. [Online]. Available: https://wiki.python.org/moin/NumericAndScientific.
[11] Python Software Foundation. Python language reference, version 2.7. [Online]. Available: http://www.python.org.
[12] Pedregosa F. et al. 2011. “Scikit-learn: Machine learning in Python”. Journal of Machine Learning Research vol. 12 : 2825–2830.
[13] Pygame project website. [Online]. Available: https://www.pygame.org/wiki/about.
[14] Pymedia – Python module for multimedia files and streams processing. [Online]. Available: http://pymedia.org/features.html.
[15] PyQT4 project website. [Online]. Available: https://wiki.python.org/moin/PyQt4.
[16] Rajewski M., Skrzypczak M., 2016. Automatic speaker recognition from Internet radio using embedded system (in Polish: Automatyczne rozpoznawanie mówcy z radia internetowego przy wykorzystaniu systemu wbudowanego), B. Sc. thesis (unpublished), Supervisor: Marciniak T., Poznan.
[17] Raspberry Pi platform detailed overview. [Online]. Available: http://elinux.org/Rpi_Hardware.
[18] Van der Walt S., Colbert S., Varoquaux G. 2011. “The NumPy array: A structure for efficient numerical computation”. Computing in Science and Engineering vol. 11 : 22–30.
[19] Weychan R., Marciniak T., Dabrowski A. 2012. “Analysis of differences between MFCC after multiple GSM transcodings”, Przegląd Elektrotechniczny 6/2012 : 24–29.
[20] Weychan R., Marciniak T., Stankiewicz A., Dabrowski A. 2014. “Real time speaker recognition from internet radio”. IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) : 128–132.
[21] Weychan R., Marciniak T., Stankiewicz A., Dabrowski A. 2015. “Implementation aspects of speaker recognition using Python language and Raspberry Pi platform”. IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) : 162–167.
[22] Weychan R., Stankiewicz A., Marciniak T., Dabrowski A. 2014. “Improving of speaker identification from mobile telephone calls”. Multimedia Communications, Services and Security, ser. Communications in Computer and Information Science vol. 429 : 254–264.

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-11959754-6a93-4b02-8200-cec6858a2003