Search results
Searched in keywords: small data
Results found: 2
EN
In this paper, several statistical learning algorithms are used to predict the maximal length of fatigue cracks based on a sample of 31 observations. The small-data regime is still a problem for many professionals, especially in areas where failures occur rarely. The analyzed object is a high-pressure nozzle of a heavy-duty gas turbine. Operating parameters of the engines are used for the regression analysis. The following algorithms are used in this work: multiple linear and polynomial regression, random forest, kernel-based methods, AdaBoost, extreme gradient boosting, and artificial neural networks. A substantial part of the paper provides advice on the effective selection of features. The paper explains how to process the dataset in order to reduce uncertainty and thus simplify the analysis of the results. The proposed loss and cost functions are custom-designed to promote solutions that accurately predict the longest cracks. The obtained results confirm that some of the algorithms can accurately predict the maximal lengths of fatigue cracks even when the sample is small.
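The abstract does not give the exact form of the custom loss or the feature set, so the following is only a minimal sketch of the general idea under stated assumptions: a few of the listed regressors are cross-validated on a 31-sample dataset and scored with a hypothetical cost that weights errors on the longest cracks more heavily. All data, weights, and model settings here are illustrative, not the paper's actual pipeline.

```python
# Hypothetical sketch (not the paper's exact method): small-sample regression
# scored with a custom cost that emphasizes the longest cracks.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(31, 5))             # stand-in operating parameters
y = np.abs(rng.normal(size=31)) * 10.0   # stand-in maximal crack lengths [mm]

def long_crack_cost(y_true, y_pred):
    """Squared error weighted by normalized crack length, so errors on the
    longest cracks are penalized most (an assumed, illustrative cost)."""
    w = y_true / y_true.max()
    return float(np.mean(w * (y_true - y_pred) ** 2))

models = {
    "linear": LinearRegression(),
    "kernel_ridge": KernelRidge(kernel="rbf", alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "adaboost": AdaBoostRegressor(n_estimators=100, random_state=0),
}

# Leave-one-out cross-validation keeps every one of the 31 observations in
# play, which matters in the small-data regime.
for name, model in models.items():
    pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
    print(f"{name:14s} custom cost = {long_crack_cost(y, pred):.3f}")
```

A length-weighted cost is only one way to bias a model toward the longest cracks; the paper's own loss and cost functions may be defined differently.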
EN
The lack of dopamine in the human brain is the cause of Parkinson's disease (PD), a degenerative disorder common among older people worldwide. However, failure to detect the disease before the first clinical diagnosis has led to an increased mortality rate. Research efforts towards the early detection of PD have encountered challenges such as small dataset size, class imbalance, overfitting, high false detection rates, and model complexity. This paper aims to improve early detection of PD using machine learning through data augmentation for very small datasets. We propose using spline interpolation and Piecewise Cubic Hermite Interpolating Polynomial (Pchip) interpolation to generate synthetic data instances. We further investigate reducing the dimensionality of the features for effective and real-time classification while considering the computational complexity of implementation on real-life mobile phones. For classification we use a Bidirectional LSTM (BiLSTM) deep learning network and compare the results with traditional machine learning algorithms such as Support Vector Machine (SVM), Decision Tree, Logistic Regression, KNN, and Ensemble bagged trees. For experimental validation we use the Oxford Parkinson's disease dataset with 195 data samples, which we augmented with 571 synthetic data samples. The results for BiLSTM show that even with a holdout of 90%, the model was still able to effectively recognize PD, with average accuracies over ten experimental rounds using 22 features of 82.86%, 97.1%, and 96.37% for the original, augmented (Spline), and augmented (Pchip) datasets, respectively. Our results show that the proposed data augmentation schemes significantly (p < 0.001) improved the accuracy of PD recognition on a small dataset using both classical machine learning models and BiLSTM.
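The abstract names the interpolation methods but not the exact augmentation procedure, so the sketch below only illustrates one plausible reading under assumptions: each feature column is interpolated over the sample index with SciPy's cubic spline or PCHIP interpolators and evaluated at random intermediate positions to create synthetic rows. The sample counts (195 original, 571 synthetic, 22 features) are taken from the abstract; everything else is illustrative.

```python
# Hypothetical sketch of interpolation-based augmentation for a small dataset.
import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

def augment(X, n_new, method="pchip", seed=0):
    """Interpolate each feature column over the sample index and evaluate at
    random non-integer positions between existing samples."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(X))
    new_pos = rng.uniform(0, len(X) - 1, size=n_new)
    cols = []
    for j in range(X.shape[1]):
        interp = (PchipInterpolator(idx, X[:, j]) if method == "pchip"
                  else CubicSpline(idx, X[:, j]))
        cols.append(interp(new_pos))
    return np.column_stack(cols)

# Example with the counts reported in the abstract: 195 samples, 22 features,
# augmented with 571 synthetic rows (random stand-in data, not the Oxford set).
X = np.random.default_rng(1).normal(size=(195, 22))
X_aug = np.vstack([X, augment(X, n_new=571, method="pchip")])
print(X_aug.shape)  # (766, 22)
```

In practice the interpolation would presumably be done separately within each class (PD and healthy), with labels assigned accordingly, so that synthetic rows do not blend the two classes; the sketch omits labels for brevity.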