Article title

Speech sound detection employing deep learning

Conference
Federated Conference on Computer Science and Information Systems (16 ; 02-05.09.2021 ; online)
Publication languages
EN
Abstracts
EN
Speech is the primary means of communication between people, both in everyday conversation and as a speech signal that is transmitted and recorded in numerous ways. The latter is especially important during the global SARS-CoV-2 pandemic, when it is often not possible to meet and talk with people in person. Streaming, VoIP calls, and live podcasts are just some of the many applications that have seen a significant increase in usage due to the necessity of social distancing. In this paper, we present a method to design, develop, and test a deep learning-based algorithm that performs voice activity detection better than benchmark solutions such as the WebRTC VAD algorithm, an industry standard based mainly on a classic approach to speech signal processing.
Pages
221-222
Physical description
Bibliography: 6 items, tables.
Authors
author
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
  • Gdańsk University of Technology, Faculty of Electronics Telecommunication and Informatics, Multimedia Systems Department, Gabriela Narutowicza 11/12, 80-233 Gdańsk, Poland
Bibliography
  • 1. H. Haneche, B. Boudraa, and A. Ouahabi, “A new way to enhance speech signal based on compressed sensing,” Measurement, vol. 151, p. 107117, 2020. https://doi.org/10.1016/j.measurement.2019.107117. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0263224119309832
  • 2. K. Paciorek. Andrzej Duda o: LGBT, TVP, koronawirusie, głosach po Bosaku i o szansach w starciu z Trzaskowskim (in Polish). Youtube (Imponderabilia channel). [Online]. Available: https://www.youtube.com/watch?v=Izxj72bg4A4
  • 3. Freesound. Party Sounds recording from the online royalty free recordings archive. [Online]. Available: https://freesound.org/people/FreqMan/sounds/23153/
  • 4. B. McFee, C. Raffel, D. Liang, D. P. Ellis, M. McVicar, E. Battenberg, and O. Nieto, “librosa: Audio and music signal analysis in Python,” in Proceedings of the 14th Python in Science Conference, vol. 8, 2015.
  • 5. GitHub. Python interface to the WebRTC voice activity detector. [Online]. Available: https://github.com/wiseman/py-webrtcvad
  • 6. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. [Online]. Available: https://www.tensorflow.org/
Notes
Track 5: Young Researchers Workshop on Artificial Intelligence and Cybersecurity
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-ea1c30ff-71aa-4ff8-8ed2-eebc5d69a519