Search results - BazTech

1

HMM-based phoneme speech recognition system for the control and command of industrial robots

Naik Adwait

Technical Transactions

|

2021

|

Vol. 118, iss. 1

art. no. e2021002

EN

In recent years, the integration of human-robot interaction with speech recognition has gained considerable momentum in the manufacturing industries. Conventional methods of controlling robots include semi-autonomous, fully autonomous, and wired methods. Operating a robot through a teaching pendant or a joystick is easy to implement but is not effective when the robot is deployed to perform complex repetitive tasks. Speech and touch are natural ways of communicating for humans, and speech recognition, being the best option, is a heavily researched technology. In this study, we aim to develop a stable and robust speech recognition system that allows humans to communicate with machines (a robotic arm) in a seamless manner. This paper investigates the potential of the linear predictive coding technique for developing a stable and robust HMM-based phoneme speech recognition system for applications in robotics. Our system is divided into three segments: a microphone array, a voice module, and a robotic arm with three degrees of freedom (DOF). To validate our approach, we performed experiments with simple and complex sentences for various robotic activities such as manipulating a cube and pick-and-place tasks. Moreover, we analyzed the test results to rectify problems including accuracy and recognition score.
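As an illustration of the linear predictive coding step mentioned in this abstract, the following is a minimal sketch of LPC analysis by the autocorrelation method with the Levinson-Durbin recursion. It is not the author's implementation; the function names `autocorrelation` and `lpc` and the test signal are assumptions made for this example.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal x."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) for k in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion (hypothetical helper).

    Returns (a, err) where a[k-1] is the coefficient of x[t-k] in the
    prediction x[t] ~ sum_k a[k-1] * x[t-k], and err is the residual energy.
    """
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # signal is perfectly predicted already
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# A decaying exponential behaves like a first-order autoregressive signal.
signal = [0.9 ** t for t in range(200)]
coeffs, residual = lpc(signal, 2)
```

In a recognizer of the kind described, coefficients like these would be computed per short frame and passed on as features; the framing, windowing, and HMM stages are outside this sketch.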

2

Navigation security module with real-time voice command recognition system

Yagimli M., Kursat-Tezer H.

Polish Maritime Research

|

2017

|

nr 2

17--26

EN

The real-time voice command recognition system used in this study aims to increase situational awareness, and therefore the safety of navigation, especially during close manoeuvres of warships and on the courses of commercial vessels in narrow waters. With the developed system, the safety of navigation, which is especially important in precision manoeuvres, can be controlled with voice command recognition-based software. The system was observed to work with 90.6% accuracy using Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) parameters and with 85.5% accuracy using Linear Predictive Coding (LPC) and DTW parameters.
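The DTW matching step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' software: the functions `dtw_distance` and `recognize`, the template dictionary, and the one-dimensional toy features are all assumptions. In practice each frame would carry an MFCC or LPC feature vector.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, templates):
    """Return the name of the stored template closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(features, templates[name]))


# Toy command templates (hypothetical values).
templates = {"port": [(0.0,), (1.0,)], "starboard": [(5.0,), (6.0,)]}
command = recognize([(0.0,), (0.0,), (1.0,)], templates)
```

DTW tolerates differences in speaking rate because a frame in one sequence may align with several frames in the other, which is why the stretched utterance above still matches its template at zero cost.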

3

Wysokowydajne nawęglanie niskociśnieniowe stali

Dybowski K.

Zeszyty Naukowe. Rozprawy Naukowe / Politechnika Łódzka

|

2013

|

Z. 473

1--101

PL

Niniejsza dysertacja zawiera zbiór wyników badań i analiz autora, stanowiących jego wkład w inżynierię powierzchni, związany z wysokowydajną technologią nawęglania niskociśnieniowego stali. Obejmuje ona analizę możliwości podniesienia temperatury procesu oraz właściwą jego organizację, w celu zwiększenia wydajności, przy jednoczesnym zachowaniu wysokich właściwości mechanicznych i ograniczeniu odkształceń cieplnych. Część pierwsza rozprawy, obejmująca rozdziały 1-5, opisuje metodę nawęglania niskociśnieniowego stali w atmosferze acetylen-etylen-wodór oraz modyfikację tej technologii, polegającą na wstępnym azotowaniu stali na etapie nagrzewania do temperatury nawęglania. W tej części pracy przedstawiono również nowoczesny sposób obróbki cieplnej po nawęglaniu, jakim jest hartowanie w gazach pod wysokim ciśnieniem. Ponadto określono wymagania, jakie stawia się warstwom nawęglonym oraz scharakteryzowano wpływ budowy strukturalnej na właściwości mechaniczne tych warstw. W drugiej części rozprawy, w rozdziale 7, przedstawiono wpływ organizacji procesu na wydajność nawęglania niskociśnieniowego, wynikającą z możliwości zastosowania podziału procesu na jedno- i wielosegmentowy. Dokonano analizy efektywności wytwarzania warstw nawęglanych o znacznych grubościach, pod kątem skrócenia całkowitego czasu procesu i uzyskania wysokiej skuteczności przekazywania węgla z atmosfery do powierzchni nawęglanych detali. W rozdziale 8 zaprezentowano wyniki badań dotyczące wpływu temperatury oraz sposobu nawęglania na właściwości mechaniczne nawęglonej stali, tj. wytrzymałość zmęczeniową na zginanie, odporność na zmęczenie stykowe, odporność na dynamiczne obciążenia. Określono, jakie czynniki i w jakim stopniu wpływają na poziom wytrzymałości nawęglonej stali.
Wykazano, że odpowiedni sposób prowadzenia nawęglania niskociśnieniowego gwarantuje możliwość podniesienia temperatury procesu w celu jego intensyfikacji, bez pogorszenia właściwości mechanicznych wynikających z rozrostu ziarna austenitu. W rozdziale 9 określono wpływ podwyższenia temperatury procesu nawęglania na wielkość odkształceń cieplnych oraz zaproponowano nowatorski sposób ograniczenia tych odkształceń poprzez nasycanie węglem austenitu już na etapie nagrzewania do temperatury nawęglania, co powoduje zwiększenie granicy plastyczności. W rozdziale 10 i 11 podsumowano wyniki przeprowadzonych badań i sformułowano wnioski dotyczące możliwości intensyfikacji procesu nawęglania w wyniku zastosowania wysokowydajnej technologii nawęglania niskociśnieniowego.

EN

This dissertation is a collection of results from studies and analyses by its author, which constitute his contribution to surface engineering in the field of high-performance low pressure carburizing of steel. It includes an analysis of the possibility of raising the process temperature and of organizing the process appropriately in order to increase efficiency while maintaining high mechanical properties and reducing thermal deformations. The first part of the dissertation, comprising chapters 1-5, describes the method of low pressure carburizing of steel in an acetylene-ethylene-hydrogen atmosphere and a modification of this technology consisting of pre-nitriding the steel during heating up to the carburizing temperature. This part also presents high pressure gas quenching, a modern heat treatment applied after carburizing. Furthermore, the requirements for carburized layers are defined and the impact of structure on the mechanical properties of these layers is characterized. In the second part of the dissertation, in Chapter 7, the study reviews the influence of process organization (resulting from the possibility of dividing the process into one or multiple segments) on the efficiency of low pressure carburizing. The efficiency of producing carburized layers of considerable thickness was analyzed with a view to reducing the overall process time and obtaining highly efficient carbon transfer from the atmosphere to the surface of the elements being carburized. Chapter 8 presents research results on the influence of temperature and carburizing method on the mechanical properties of carburized steel, i.e. bending fatigue strength, impact resistance and pitting resistance. It determines which factors affect the strength of carburized steel, and to what extent.
It was shown that an appropriate way of conducting low pressure carburizing guarantees the possibility of increasing the process temperature in order to intensify the process without the deterioration of mechanical properties that results from austenite grain growth. Chapter 9 sets out the impact of this temperature increase on thermal deformation during the carburizing process. An innovative way to reduce thermal deformation has been proposed: saturating the austenite with carbon already during heating up to the carburizing temperature, which increases the yield strength. In Chapters 10 and 11 the results of the conducted studies were summarized and conclusions were drawn concerning the possibility of intensifying the carburizing process by applying the high-performance technology of low pressure carburizing.

4

Zastosowanie współczynników predykcji liniowej do kwalifikacji zdrowych i chorych na zapalenie zatok obocznych nosa w oparciu o termogramy twarzy

Murawski P., Kalicki B.

Przegląd Elektrotechniczny

|

2013

|

R. 89, nr 3a

298--300

PL

Artykuł przedstawia wyniki prac nad zastosowaniem współczynników LPC do kwalifikacji osób zdrowych i chorych na zapalenie zatok w oparciu o automatyczną analizę rozkładu temperatury na powierzchni twarzy. Wskazuje, że możliwym jest skuteczne ich wykorzystanie a tym samym zaimplementowanie algorytmu w sprzęcie pomiarowym jakim jest kamera termowizyjna.

EN

The paper presents the results of using LPC coefficients to classify subjects as healthy or suffering from sinusitis, based on automatic analysis of the temperature distribution on the surface of the face. It shows that the coefficients can be used effectively and thus that the algorithm could be implemented in hardware as an additional measurement method in clinical practice.

5

A method of designing an adaptive uniform quantizer for LPC coefficients quantization

Eskic Z., Peric Z., Nikolic J.

Przegląd Elektrotechniczny

|

2011

|

R. 87, nr 7

245-248

EN

This paper proposes a method of designing an adaptive uniform quantizer for frame-by-frame quantization of LPC coefficients. The method first determines the support region thresholds of two uniform quantizers designated to quantize the minimal and the maximal value of the LPC coefficients of each frame. Based on this, an estimate of the uniform quantizer thresholds for LPC coefficient quantization is provided. The results obtained by testing the proposed method on speech signals from the TIMIT database are presented and discussed in the paper.

PL

W artykule zaproponowano metodę zuniformowanego adaptacyjnego kwantowania współczynnika LPC (linear prediction coders). Początkowo obliczana jest minimalna i maksymalna wartość LPC dla każdej ramki. Następnie zuniformowany współczynnik jest określany. Zaprezentowano test metody na przykładzie przetwarzania sygnału mowy z bazy TIMIT.
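The idea of adapting a uniform quantizer to each frame can be sketched as below. This is a simplified illustration under stated assumptions, not the method of the paper: here the frame's minimum and maximum are used directly as the support region and are passed along unquantized, whereas the paper quantizes them with two dedicated uniform quantizers. The function names and the 4-bit setting are hypothetical.

```python
def quantize_frame(coeffs, bits=4):
    """Uniformly quantize one frame of LPC coefficients over [min, max]."""
    lo, hi = min(coeffs), max(coeffs)
    levels = 2 ** bits
    if hi == lo:
        return [0] * len(coeffs), lo, hi  # degenerate frame: a single value
    step = (hi - lo) / levels
    # Cell index of each coefficient; the top edge is clamped into the last cell.
    indices = [min(int((c - lo) / step), levels - 1) for c in coeffs]
    return indices, lo, hi


def dequantize_frame(indices, lo, hi, bits=4):
    """Reconstruct coefficients at the midpoints of the quantization cells."""
    step = (hi - lo) / (2 ** bits)
    return [lo + (i + 0.5) * step for i in indices]


frame = [-1.2, 0.5, 0.9, -0.3]  # toy coefficient values, assumed for illustration
idx, lo, hi = quantize_frame(frame)
restored = dequantize_frame(idx, lo, hi)
```

Because the support region follows each frame's range, the step size shrinks for frames whose coefficients span a narrow interval, and the reconstruction error stays within half a step.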

6

Speech nonfluency detection and classification based on linear prediction coefficients and neural networks

Kobus A., Kuniszyk-Jóźkowiak W., Smołka E., Codello I.

Journal of Medical Informatics & Technologies

|

2010

|

Vol. 15

135--143

EN

The goal of the paper is to present a speech nonfluency detection method based on linear prediction coefficients obtained using the covariance method. The application “Dabar” was created for the research. It implements three different LP methods and can feed the coefficients they compute to the input of Kohonen networks. Neural networks were used to classify utterances into the categories of fluent and nonfluent. The first was a Kohonen network (SOM), used to reduce the LP coefficient representation of each window, which served as input data to the SOM input layer, to a vector of winning neurons of the SOM output layer. Radial Basis Function (RBF) networks, linear networks and Multi-Layer Perceptrons were used as classifiers. The research was based on 55 fluent samples and 54 samples with blockades on plosives (p, b, d, t, k, g). The examination ended with a classification result of 76%.

7

Maskowanie długich przerw w muzyce z użyciem modelowania sinusoidalno-szumowego i adaptacji do kontekstu

Bartkowiak M., Latanowicz B.

Elektronika : konstrukcje, technologie, zastosowania

|

2010

|

Vol. 51, nr 12

61-65

PL

W artykule opisano technikę pozwalającą ukryć lub zamaskować długie przerwy (do 0,5 s) w programach muzycznych. Brakująca treść zastępowana jest sygnałem syntezowanym z modeli widmowych na podstawie treści otaczającej lukę. Składowe tonalne sygnału syntezowane są z użyciem modelu sinusoidalnego. Połączenia trajektorii sinusoidalnych z obu stron przerwy dokonuje heurystyczny algorytm adaptacyjny. Przed połączeniem trajektorie są klasyfikowane jako stabilne lub zmienne, co pozwala prawidłowo odtworzyć dźwięki z efektami vibrato i glissando. Składowa szumowa jest modelowana i syntezowana przy użyciu spaczonego modelu LPC. Wyniki oceny odsłuchowej potwierdzają wysoką jakość dźwięku.

EN

The paper describes a technique that allows long gaps (up to 0.5 second) in music programs to be concealed or mitigated. Missing content is replaced by a signal synthesized from spectral models using data surrounding the gap. Tonal components are synthesized using a sinusoidal model. A heuristic adaptive algorithm is employed to link model parameters across the gap. Prior to linking, sinusoidal partials are categorized as stable or variable, which allows vibrato or glissando notes in music to be handled properly. The noise part is synthesized using a warped LPC model. Results of blind listening tests show a significant advantage in subjective audio quality.

8

Rozpoznawanie komend głosowych za pomocą sieci neuronowych

Duda J.

Śląskie Wiadomości Elektryczne

|

2004

|

Nr 2 (53)

21--24

EN

The article presents information on the construction of a voice command recognition system. It discusses in more detail the preparation of voice patterns using Linear Predictive Coding (LPC) and a method of classifying them with neural networks. It also presents a program written to test the feasibility of using neural networks for voice command recognition.

9

Rozpoznawanie mówców

Dustor A., Izydorczyk J.

Przegląd Telekomunikacyjny + Wiadomości Telekomunikacyjne

|

2003

|

nr 2-3

71-76

PL

Omówiono problematykę identyfikacji i weryfikacji mówcy. Przedstawiono poszczególne elementy składowe systemu rozpoznawania mówcy skupiając się szczególnie na zagadnieniach ekstrakcji parametrów z sygnału mowy, tworzeniu modeli mówcy zarówno parametrycznych jak i nieparametrycznych oraz na metodach rozpoznawania. Skrótowo przedstawiono również zagadnienia związane z zasobami mowy.

EN

The article presents the fundamentals of speaker recognition and discusses in more detail some basic problems of this technology, such as feature extraction, model training and recognition. A short description of speech corpora is also included.

10

Source enhanced linear prediction of speech incorporating simultaneously masked spectral weighting

Lukasiak J., Burnett I.S.

Journal of Telecommunications and Information Technology

|

2001

|

nr 3

15-23

EN

Linear prediction is the cornerstone of most modern speech compression algorithms. This paper proposes modifying the calculation of the linear predictor coefficients to incorporate a weighting function based on the simultaneous masking property of the ear. The resultant prediction filter better models the perceptual characteristics of the source and results in the removal of more perceptually important information from the input speech signal than a standard LP filter. When employed in a low rate speech codec the net effect is an improvement in subjective quality, with no increase in transmission rate and only a modest increase in computational complexity.