Clustering with Missing Values

Simiński, K.

Article details

Selected full texts from this journal

https://fi.episciences.org/

Abstract

The paper presents a clustering algorithm for data with missing values. The approach applies both marginalisation and imputation. The result of the clustering is a type-2 fuzzy set / rough fuzzy set, which makes it possible to distinguish between original and imputed data. The method can be applied to data sets in which every attribute lacks some values. The paper is accompanied by numerical examples of clustering synthetic and real-life data sets.

Keywords

missing values, clustering

Publisher

Polskie Towarzystwo Matematyczne

Journal

Fundamenta Informaticae

Year

2013

Volume

Vol. 123, no. 3

Pages

331–350

Physical description

Bibliography: 29 items, tables, charts.

Contributors

author

Simiński K.

krzysztof.siminski@polsl.pl

Institute of Informatics, Silesian University of Technology, ul. Akademicka 16, 44-100 Gliwice, Poland

Bibliography

[1] Acuña, E., Rodriguez, C.: The treatment of missing values and its effect in the classifier accuracy, Classification, Clustering and Data Mining Applications, Springer, Berlin, Heidelberg (D. Banks, L. House, F. R. McMorris, P. Arabie, W. Gaul, Eds.), 2004, 639–648.
[2] Box, G. E. P., Jenkins, G.: Time Series Analysis, Forecasting and Control, Holden-Day, Incorporated, Oakland, California, 1970.
[3] Chan, L. S., Gilman, J. A., Dunn, O. J.: Alternative Approaches to Missing Values in Discriminant Analysis, Journal of the American Statistical Association, 71(356), 1976, 842–844.
[4] Czogała, E., Łęski, J.: Fuzzy and Neuro-Fuzzy Intelligent Systems, Series in Fuzziness and Soft Computing, Physica-Verlag, A Springer-Verlag Company, Heidelberg, New York, 2000.
[5] Dempster, A. P., Laird, N. M., Rubin, D. B.: Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39(1), 1977, 1–38.
[6] Dunn, J. C.: A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact, Well Separated Clusters, Journal of Cybernetics, 3(3), 1973, 32–57.
[7] Frank, A., Asuncion, A.: UCI Machine Learning Repository, 2010.
[8] Gath, I., Geva, A. B.: Unsupervised optimal fuzzy clustering, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 11(7), Jul 1989, 773–780, ISSN 0162-8828.
[9] Ghahramani, Z., Jordan, M.: Learning From Incomplete Data, Technical report, Lab Memo No. 1509, CBCL Paper No. 108, MIT AI Lab, 1995.
[10] Hathaway, R., Bezdek, J.: Fuzzy c-means clustering of incomplete data, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 31(5), Oct 2001, 735–744, ISSN 1083-4419.
[11] Himmelspach, L., Conrad, S.: Fuzzy Clustering of Incomplete Data Based on Cluster Dispersion, Computational Intelligence for Knowledge-Based Systems Design, 13th International Conference on Information Processing and Management of Uncertainty, IPMU 2010, Dortmund, Germany, June 28 – July 2, 2010. Proceedings (E. Hüllermeier, R. Kruse, F. Hoffmann, Eds.), 6178, Springer, 2010, 59–68.
[12] Hwang, C., Rhee, F. C.-H.: An interval type-2 fuzzy C spherical shells algorithm, Proceedings of the 2004 IEEE International Conference on Fuzzy Systems, 2004, 1117–1122.
[13] Juršič, M., Lavrač, N.: Fuzzy clustering of documents, Proceedings of Conference on Data Mining and Data Warehouses (SiKDD 2008), 2008.
[14] Kim, D.-W., Kang, B.-Y.: Iterative Clustering Analysis for Grouping Missing Data in Gene Expression Profiles, in: Advances in Knowledge Discovery and Data Mining (W. Ng, M. Kitsuregawa, J. Li, K. Chang, Eds.), vol. 3918 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2006, 129–138.
[15] Laskaris, N. A., Zafeiriou, S. P.: Beyond FCM: Graph-theoretic post-processing algorithms for learning and representing the data structure, Pattern Recognition, 41(8), 2008, 2630–2644, ISSN 0031-3203.
[16] Łęski, J.: Neuro-fuzzy systems (in Polish: Systemy neuronowo-rozmyte), Wydawnictwa Naukowo-Techniczne, Warszawa, 2008.
[17] Li, J., Gao, X., Tian, C.: FCM-based clustering algorithm ensemble for large data sets, Lecture Notes in Artificial Intelligence LNAI 4223, 2006, 559–567.
[18] Pedrycz, W.: Conditional fuzzy clustering in the design of radial basis function neural networks, IEEE Transactions on Neural Networks, 9(4), 1998, 601–612.
[19] Renz, C., Rajapakse, J. C., Razvi, K., Liang, S. K. C.: Ovarian cancer classification with missing data, Proceedings of the 9th International Conference on Neural Information Processing, ICONIP’02, 2, 2002, 809–813.
[20] Sarkar, M., Leong, T.-Y.: Fuzzy K-means clustering with missing values, Proceedings of American Medical Informatics Association Annual Symposium (AMIA), 2001, 588–592.
[21] Sikora, M., Krzykawski, D.: Application of data exploration methods in analysis of carbon dioxide emission in hard-coal mines’ dewater pump stations, Mechanization and Automation of Mining, 413(6), 2005.
[22] Timm, H., Döring, C., Kruse, R.: Differentiated treatment of missing values in fuzzy clustering, IFSA’03: Proceedings of the 10th international fuzzy systems association World Congress conference on Fuzzy sets and systems, Springer-Verlag, Berlin, Heidelberg, 2003, ISBN 3-540-40383-3, 354–361.
[23] Timm, H., Döring, C., Kruse, R.: Different approaches to fuzzy clustering of incomplete datasets, International Journal of Approximate Reasoning, 35(3), 2004, 239–249, ISSN 0888-613X, Integration of Methods and Hybrid Systems.
[24] Timm, H., Kruse, R.: Fuzzy cluster analysis with missing values, NAFIPS – 1998 Conference of the North American Fuzzy Information Processing Society, Aug 1998, 242–246.
[25] Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D., Altman, R. B.: Missing value estimation methods for DNA microarrays, Bioinformatics, 17(6), 2001, 520–525.
[26] Wagstaff, K.: Clustering with Missing Values: No Imputation Required, Classification, Clustering, and Data Mining Applications (Proceedings of the Meeting of the International Federation of Classification Societies) (D. Banks, L. House, F. R. McMorris, P. Arabie, W. Gaul, Eds.), Springer, 2004, 649–658.
[27] Wagstaff, K. L., Laidler, V. G.: Making the Most of Missing Values: Object Clustering with Partial Data in Astronomy, Proceedings of Astronomical Data Analysis Software and Systems XIV, 347, 2005, 172–176.
[28] Zhang, C., Zhu, X., Zhang, J., Qin, Y., Zhang, S.: GBKII: An Imputation Method for Missing Values, Advances in Knowledge Discovery and Data Mining, 4426, 2007, 1080–1087.
[29] Zhang, S.: Shell-neighbor method and its application in missing data imputation, Applied Intelligence, 2010, 1–11, ISSN 0924-669X, 10.1007/s10489-009-0207-6.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-e0f4f23f-32de-48cd-82d1-f8ed8ac64dfa