The research described in this paper concerns the reduction of data streams derived from medical devices, i.e. ECG recordings. The experimental studies covered three instance selection techniques: the thresholding method, bounds checking and frequent data reduction. It was shown that applying instance selection techniques may reduce the data stream by over 90% without losing anomalies or the measurements that are key values for the medical diagnosis.
PL (translated from Polish)
In this work, a reduction of the data stream obtained from medical devices was carried out. The experimental studies covered the application of three instance selection techniques: the threshold elimination method, range verification and frequent object reduction. The work shows that applying instance selection allows the data stream to be reduced by over 90% without losing the values that are key to making a medical diagnosis.
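The abstract names the techniques but does not define them, so the following is only a minimal sketch of how two of them might act on a stream of samples. It assumes that thresholding drops a sample that differs from the last kept sample by less than `delta`, and that bounds checking always keeps samples outside the range `[low, high]` so that anomalies survive. The function `reduce_stream`, its parameters and the toy signal are hypothetical; frequent data reduction is not sketched because the abstract gives no detail about it.

```python
from typing import Iterable, Iterator, Tuple


def reduce_stream(
    samples: Iterable[float], delta: float, low: float, high: float
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) pairs that survive the reduction.

    Assumed rules (not taken from the paper):
    - bounds checking: a value outside [low, high] is always kept;
    - thresholding: an in-range value is kept only if it differs from
      the last kept value by at least delta.
    """
    last_kept = None
    for i, x in enumerate(samples):
        out_of_bounds = x < low or x > high
        changed = last_kept is None or abs(x - last_kept) >= delta
        if out_of_bounds or changed:
            last_kept = x
            yield i, x


# Toy signal: a flat baseline with one out-of-range spike at index 4.
signal = [0.0, 0.01, 0.02, 0.01, 1.5, 0.02, 0.0, 0.01]
kept = list(reduce_stream(signal, delta=0.1, low=-1.0, high=1.0))
reduction = 1 - len(kept) / len(signal)
```

On this toy signal the spike is retained while most of the near-constant baseline is dropped. The reduction ratio obtained here says nothing about the 90% figure reported for real ECG data.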
Instance selection is often performed as one of the preprocessing methods which, along with feature selection, allows a significant reduction in computational complexity and an increase in prediction accuracy. So far, only a few authors have considered ensembles of instance selection methods, while ensembles of final predictive models attract many researchers. To bridge that gap, in this paper we compare four ensembles adapted to instance selection: Bagging, Feature Bagging, AdaBoost and Additive Noise. The last one is introduced for the first time in this paper. The study is based on an empirical comparison performed on 43 datasets and 9 base instance selection methods. The experiments are divided into three scenarios. In the first one, evaluated on a single dataset, we demonstrate the influence of the ensembles on the compression–accuracy relation; in the second scenario the goal is to achieve the highest prediction accuracy; and in the third one both accuracy and the level of dataset compression constitute a multi-objective criterion. The obtained results indicate that ensembles of instance selection improve the base instance selection algorithms, except for unstable methods such as CNN and IB3, and that this improvement is achieved at the expense of compression. In the comparison, Bagging and AdaBoost lead in most of the scenarios. In the experiments we evaluate three classifiers: 1NN, kNN and SVM. We also note a deterioration in prediction accuracy for robust classifiers (kNN and SVM) trained on data filtered by any instance selection method (including the ensembles) when compared with the results obtained when the entire training set was used to train these classifiers.
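As an illustration of what a Bagging ensemble of instance selection could look like, here is a minimal sketch under stated assumptions. The abstract does not give the voting rule, so the sketch assumes that each bootstrap round runs a base selector, each selected instance receives one vote, and an instance is kept when its share of votes reaches `vote_threshold`. The base selector is a plain version of Hart's condensed nearest neighbour (CNN) with squared Euclidean distance. All function names, parameters and the toy data are hypothetical.

```python
import random


def nearest_label(store, X, y, q):
    """Label of the stored instance closest to q (1NN, squared Euclidean)."""
    best = min(store, key=lambda j: sum((a - b) ** 2 for a, b in zip(X[j], q)))
    return y[best]


def cnn_select(idx, X, y):
    """Condensed nearest neighbour restricted to the instances listed in idx."""
    store = [idx[0]]
    changed = True
    while changed:  # repeat passes until no instance is added
        changed = False
        for i in idx:
            if i not in store and nearest_label(store, X, y, X[i]) != y[i]:
                store.append(i)  # misclassified by the current store: keep it
                changed = True
    return set(store)


def bagging_selection(X, y, n_rounds=25, vote_threshold=0.5, seed=0):
    """Keep instances selected in at least vote_threshold of the rounds.

    The voting rule is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    n = len(X)
    votes = [0] * n
    for _ in range(n_rounds):
        # Bootstrap sample with replacement, duplicates dropped, order kept.
        sample = list(dict.fromkeys(rng.randrange(n) for _ in range(n)))
        for i in cnn_select(sample, X, y):
            votes[i] += 1
    return [i for i in range(n) if votes[i] / n_rounds >= vote_threshold]


# Toy two-class data: two small, well-separated clusters.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
y = [0, 0, 0, 1, 1, 1]
selected = bagging_selection(X, y)
```

In this sketch a lower `vote_threshold` keeps more instances and a higher one keeps fewer, which is one simple way to expose the trade-off between compression and accuracy that the abstract discusses.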