Wyniki wyszukiwania - BazTech

Ograniczanie wyników

1 Computer Science

1 2020

Powiadomienia systemowe

Sesja wygasła!

Znaleziono wyników: 1

Liczba wyników na stronie

Wyniki wyszukiwania

Sortuj według:

Ogranicz wyniki do:

Hybrid CNN-Ligru acoustic modeling using sincnet raw waveform for hindi ASR

Kumar Ankit, Aggarwal Rajesh Kumar

Computer Science

2020

T. 21 (4)

397-417

Deep neural networks (DNN) currently play a most vital role in automatic speech recognition (ASR). The convolution neural network (CNN) and recurrent neural network (RNN) are advanced versions of DNN. They are right to deal with the spatial and temporal properties of a speech signal, and both properties have a higher impact on accuracy. With its raw speech signal, CNN shows its superiority over precomputed acoustic features. Recently, a novel first convolution layer named SincNet was proposed to increase interpretability and system performance. In this work, we propose to combine SincNet-CNN with a light-gated recurrent unit (LiGRU) to help reduce the computational load and increase interpretability with a high accuracy. Different configurations of the hybrid model are extensively examined to achieve this goal. All of the experiments were conducted using the Kaldi and Pytorch-Kaldi toolkit with the Hindi speech dataset. The proposed model reports an 8.0% word error rate (WER).