Search results - BazTech

1

Speech emotion recognition using wavelet packet reconstruction with attention-based deep recurrent neutral networks

Meng Hao, Yan Tianhao, Wei Hongwei, Ji Xun

Bulletin of the Polish Academy of Sciences. Technical Sciences | 2021 | Vol. 69, no. 1 | art. no. e136300

EN

Speech emotion recognition (SER) is a challenging task in human-computer interaction because it is difficult to find a feature set that fully discriminates between emotional states. The FFT is commonly used to process the raw signal when extracting low-level descriptor features such as short-time energy, fundamental frequency, formants, and MFCCs (mel-frequency cepstral coefficients). However, these features are built in the frequency domain and ignore information from the temporal domain. In this paper, we propose a novel framework that combines a multi-layer wavelet sequence set obtained by wavelet packet reconstruction (WPR) with a conventional feature set to form a mixed feature set, which is used for emotion recognition with recurrent neural networks (RNNs) based on the attention mechanism. In addition, because silent frames have an adverse effect on SER, we apply voice activity detection based on the autocorrelation function to eliminate emotionally irrelevant frames. We show that the proposed algorithm significantly outperforms the traditional feature set in predicting spontaneous emotional states on the IEMOCAP corpus and the EMODB database, respectively, and that it achieves better classification in both speaker-independent and speaker-dependent experiments. Notably, we obtain accuracies of 62.52% and 77.57% in the speaker-independent (SI) experiments and 66.90% and 82.26% in the speaker-dependent (SD) experiments.
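The silent-frame removal step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything below is an assumption made for illustration: the function names, the frame length, the hop size, the lag range, and the threshold are all hypothetical. The idea shown is only the general one: a voiced frame is roughly periodic, so its normalized autocorrelation has a strong peak at some nonzero lag, while a silent frame does not.

```python
import math

def normalized_autocorr_peak(frame, min_lag, max_lag):
    """Largest normalized autocorrelation value over lags in [min_lag, max_lag]."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0  # an all-zero frame has no periodic structure
    best = 0.0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best

def voiced_frames(signal, frame_len=400, hop=200,
                  min_lag=40, max_lag=200, threshold=0.5):
    """Return start indices of frames whose autocorrelation peak passes the threshold.

    All default values are illustrative assumptions, not taken from the paper.
    """
    kept = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        if normalized_autocorr_peak(frame, min_lag, max_lag) >= threshold:
            kept.append(start)
    return kept

# Toy input: 800 samples of silence followed by 800 samples of a periodic tone.
tone = [math.sin(2 * math.pi * n / 80) for n in range(800)]
demo_signal = [0.0] * 800 + tone
demo_kept = voiced_frames(demo_signal)
```

In this toy input the frames that lie entirely in the silent part are dropped and the frames that lie entirely in the tone are kept. Real recordings contain background noise, so a practical threshold would need tuning on data.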

2

Accurate identification on individual similar communication emitters by using HVG-NTE feature

Li Ke, Ge Wei, Yang Xiaoya, Xu Zhengrong

Bulletin of the Polish Academy of Sciences. Technical Sciences | 2021 | Vol. 69, no. 2 | art. no. e136741

EN

Individual identification of similar communication emitters in a complex electromagnetic environment has great research value and significance in both military and civilian fields. In this paper, a feature extraction method called HVG-NTE is proposed, based on the idea of system nonlinearity. The HVG degree distribution is extracted, and its shape is quantified with NTE to improve anti-noise performance. XGBoost is then used to build a classifier for communication emitter identification. On a transient-signal data set of radio stations of the same plant, batch, and model, our method achieves better recognition performance than state-of-the-art techniques and is suitable for small sample sizes.
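The degree-distribution step can be sketched as follows, assuming that HVG stands for the horizontal visibility graph (the abstract does not expand the acronym). In that construction, each sample of a time series is a node, and two samples are linked when every sample between them is strictly lower than both. The function names are hypothetical, and the sketch stops at the degree distribution: it does not attempt the NTE quantification or the XGBoost classifier, whose details the abstract does not give.

```python
from collections import Counter

def hvg_degrees(series):
    """Node degrees of the horizontal visibility graph of a time series."""
    n = len(series)
    degrees = [0] * n
    for i in range(n - 1):
        highest_between = float("-inf")  # tallest sample strictly between i and j
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                degrees[i] += 1
                degrees[j] += 1
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # sample i cannot see past a sample at least as tall as itself
    return degrees

def degree_distribution(degrees):
    """Fraction of nodes having each degree, keyed by degree."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: counts[k] / total for k in sorted(counts)}

demo_degrees = hvg_degrees([3, 1, 2, 4])
demo_distribution = degree_distribution(demo_degrees)
```

For the four-sample series above, the first sample sees all three others, while the second sample is blocked from the last one by the third, so the degrees differ by position.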

3

RGB-D face recognition using LBP-DCT algorithm

Kumar Sunil B L, Kumari Sharmila M

Applied Computer Science | 2021 | Vol. 17, no. 3 | pp. 73-81

EN

Face recognition is an image processing application that identifies or verifies an individual's identity. 2D images are commonly used to identify faces, but such images are very sensitive to changes in lighting and viewing angle. Images captured by 3D cameras and stereo cameras can also be used for recognition, but they require fairly long processing times. RGB-D images produced by Kinect are used as a new alternative to 3D images. Such cameras cost less and can be used in any situation and any environment. This paper evaluates the performance of face recognition algorithms on RGB-D images. These algorithms compute a descriptor from the RGB and depth-map face images based on local binary patterns (LBP). The images are also tested with a fusion of the LBP and DCT methods. In the experiment, the fusion of LBP and DCT achieves a recognition rate of 97.5%.
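As a rough illustration of the local binary pattern descriptor named in the abstract, the sketch below computes the basic 3x3 LBP code of each interior pixel and collects the codes into a 256-bin histogram. This is the textbook form of LBP, not necessarily the variant used in the paper; the function names and the neighbour ordering are assumptions, and the DCT fusion step is not shown. The same function could be applied separately to a grayscale face image and to a depth map, each given as a list of rows.

```python
def lbp_code(image, r, c):
    """Basic 3x3 LBP code of the pixel at row r, column c."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left corner (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit  # set the bit when the neighbour is not darker
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

demo_patch = [[1, 2, 3],
              [8, 5, 4],
              [7, 6, 5]]
demo_code = lbp_code(demo_patch, 1, 1)
demo_hist = lbp_histogram(demo_patch)
```

Because each code depends only on whether a neighbour is at least as bright as the centre, the histogram is unchanged by any monotonic brightness shift, which is one reason LBP is often chosen when lighting varies.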