Article title

Instance selection techniques in reduction of data streams derived from medical devices

Title variants
PL
Redukcja strumienia danych pozyskiwanych z urządzeń diagnostyki medycznej za pomocą technik selekcji przypadków
Publication languages
EN
Abstracts
EN
The research described in this paper concerns the reduction of streams of data derived from medical devices, i.e. ECG recordings. The experimental studies covered three instance selection techniques: the thresholding method, bounds checking and frequent data reduction. It was shown that applying the instance selection techniques can reduce the data stream by over 90% without losing the anomalies or the measurements that are key to the medical diagnosis.
PL
This work carries out a reduction of the data stream acquired from medical devices. The experimental studies covered the application of three instance selection techniques: threshold elimination, range verification and frequent-object reduction. The work demonstrates that instance selection allows the data stream to be reduced by over 90% without losing the values that are key to making a medical diagnosis.
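The three techniques named in the abstracts can be illustrated with a minimal sketch over a single-channel stream of numeric ECG samples. The function name, the `delta` threshold and the normal range used below are illustrative assumptions, not values from the paper:

```python
def reduce_stream(samples, delta=0.05, low=-1.0, high=1.5):
    """Reduce a sample stream with three instance selection rules
    (illustrative parameters, not taken from the paper)."""
    last = None  # last sample that was kept
    for x in samples:
        if x < low or x > high:
            # bounds checking: values outside the assumed normal
            # range are candidate anomalies and are always kept
            yield x
            last = x
        elif last is None or abs(x - last) >= delta:
            # thresholding: keep an in-range sample only if it
            # differs from the last kept one by at least `delta`
            yield x
            last = x
        # frequent data reduction: repeated or near-identical
        # values fall through here and are dropped

# A run of near-constant values collapses to one sample while the
# out-of-range value 2.0 survives:
kept = list(reduce_stream([0.0, 0.0, 0.0, 0.01, 0.2, 0.2, 2.0, 0.1]))
# kept == [0.0, 0.2, 2.0, 0.1]
```

Dropping only redundant in-range samples while unconditionally keeping out-of-range ones is one way a large reduction ratio can be achieved without losing diagnostically relevant measurements, which is the behaviour the abstract describes.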
Year
Pages
115–118
Physical description
Bibliography: 38 items, figures, tables, charts
Authors
  • University of Computer Sciences and Skills, ul. Rzgowska 17 a, 93-008 Lodz, Poland
author
  • Lodz University of Technology, Institute of Information Technology, ul. Wólczańska 215, 90-924 Lodz, Poland
Bibliography
  • [1] Bellman, R. (2013). Dynamic programming. Courier Corporation.
  • [2] Bellman, R. E. (2015). Adaptive control processes: a guided tour. Princeton University Press.
  • [3] Keogh, E., Mueen, A. (2011). Curse of dimensionality. In Encyclopedia of Machine Learning (pp. 257-258). Springer US.
  • [4] Chen, L. (2009). Curse of dimensionality. In Encyclopedia of Database Systems (pp. 545-546). Springer US.
  • [5] Abdi, H., Williams, L. J. (2010). Principal component analysis. Wiley interdisciplinary reviews: computational statistics, 2(4), pp. 433-459.
  • [6] Gorban, A. N., Kégl, B., Wunsch, D. C., & Zinovyev, A. Y. (Eds.). (2008). Principal manifolds for data visualization and dimension reduction, Vol. 58, pp. 96-130. Berlin-Heidelberg: Springer.
  • [7] Byczkowska-Lipińska L., Wosiak A. (2015). Feature Selection and Classification Techniques in the Assessment of the State for Large Power Transformers. Przegląd Elektrotechniczny, R. 91 NR 1/2015, doi:10.15199/48.2015.01.39
  • [8] Liu L., Özsu M. T. (Eds.) (2009). Encyclopedia of database systems. Berlin, Heidelberg, Germany: Springer.
  • [9] Choudhury, M., Lin, Y. R., Sundaram, H., Candan, K. S., Xie, L., & Kelliher, A. (2010). How does the data sampling strategy impact the discovery of information diffusion in social media?. ICWSM, 10, pp. 34-41.
  • [10] Holmes, A. (2012). Hadoop in practice. Manning Publications Co.
  • [11] Buza K., Nanopoulos A., Schmidt-Thieme L., Koller J. (2011, July). Fast classification of electrocardiograph signals via instance selection. In Healthcare Informatics, Imaging and Systems Biology (HISB), 2011 First IEEE International Conference on, pp. 9-16.
  • [12] Ramírez-Gallego S., Krawczyk B., García S., Woźniak M., Herrera F. (2017). A survey on data preprocessing for data stream mining: Current status and future directions. Neurocomputing, 239, pp. 39-57.
  • [13] García, S., Ramírez-Gallego, S., Luengo, J., Benítez, J. M., Herrera, F. (2016). Big data preprocessing: methods and prospects. Big Data Analytics, 1(1), 9.
  • [14] Pyle, D. (1999). Data preparation for data mining (Vol. 1). Morgan Kaufmann.
  • [15] Hall, M. A. (1999). Correlation-based feature selection for machine learning.
  • [16] Guyon, I., Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, vol. 3, pp. 1157-1182.
  • [17] Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers & Electrical Engineering, 40(1), 16-28.
  • [18] Giráldez, R. (2005, June). Feature influence for evolutionary learning. In Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation, pp. 1139-1145. ACM.
  • [19] Kim, J. O., Mueller, C. W. (1978). Factor analysis: Statistical methods and practical issues, Vol. 14. Sage.
  • [20] Dunteman, G. H. (1989). Principal components analysis. Vol. 69. Sage.
  • [21] Smith, L. I. (2002). A tutorial on principal components analysis. Cornell University, USA, vol. 51(52), no 65.
  • [22] Groth, D., Hartmann, S., Klie, S., Selbig, J. (2013). Principal components analysis. Computational Toxicology: Volume II, pp. 527-547.
  • [23] Bressan, M., Vitria, J. (2003). On the selection and classification of independent features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), pp. 1312-1317.
  • [24] Liu, H., Motoda, H. (2002). On issues of instance selection. Data Mining and Knowledge Discovery, 6(2), pp. 115-130.
  • [25] Olvera-López, J. A., Carrasco-Ochoa, J. A., Martínez-Trinidad, J. F., & Kittler, J. (2010). A review of instance selection methods. Artificial Intelligence Review, vol. 34(2), pp. 133-143.
  • [26] Garcia, S., Derrac, J., Cano, J., & Herrera, F. (2012). Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34(3), pp. 417-435.
  • [27] García, S., Luengo, J., Herrera, F. (2015). Data preprocessing in data mining (pp. 59-139). New York: Springer.
  • [28] Garcia, S., Derrac, J., Cano, J., & Herrera, F. (2012). Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34(3), pp. 417-435.
  • [29] Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine learning, vol. 6(1), pp. 37-66.
  • [30] Salganicoff, M. (1993, December). Density-adaptive learning and forgetting. In Proceedings of the Tenth International Conference on Machine Learning (Vol. 3, pp. 276-283).
  • [31] Klinkenberg, R. (2004). Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, vol. 8(3), pp. 281-300.
  • [33] Brighton, H., & Mellish, C. (2002). Advances in instance selection for instance-based learning algorithms. Data mining and knowledge discovery, vol. 6(2), pp. 153-172.
  • [34] Goldberger A.L., Amaral L.A.N., Glass L., Hausdorff J.M., Ivanov P.Ch., Mark R.G., Mietus J.E., Moody G.B., Peng C.K., Stanley H.E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation vol. 101(23), pp. e215-e220, DOI: 10.1161/01.CIR.101.23.e215
  • [35] Goldsmith R.L., Bigger J.T., Bloomfield D.M., Krum H., Steinman R.C., Sackner-Bernstein J., Packer M. Long-term carvedilol therapy increases parasympathetic nervous system activity in chronic congestive heart failure. American Journal of Cardiology 1997; vol. 80, pp. 1101-1104.
  • [36] Olvera-López, J. A., Carrasco-Ochoa, J. A., Martínez-Trinidad, J. F., & Kittler, J. (2010). A review of instance selection methods. Artificial Intelligence Review, vol. 34(2), pp. 133-143.
  • [37] Witten I. H., Frank E., Hall M. A., Pal C. J. (2017). Data Mining: Practical machine learning tools and techniques. Fourth Edition. Morgan Kaufmann.
  • [38] Guo, L., Chen, F., Gao, C., & Xiong, W. (2012). Performance Measurement Model of Multi-Source Data Fusion Based on Network Situation Awareness. Przegląd Elektrotechniczny, vol. 88(7b), pp. 315-319.
Notes
Prepared with funds of the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for activities popularizing science (2017 tasks).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-d86f1531-a65e-4287-8ce4-16677a9e5b09