Impact of learners’ quality and diversity in collaborative clustering

Rastin, Parisa; Matei, Basarab; Cabanes, Guénaël; Grozavu, Nistor; Bennani, Younés

doi:10.2478/jaiscr-2018-0030

Abstract

Collaborative clustering is a data mining task that aims to use several clustering algorithms to analyze different aspects of the same data. Its goal is to reveal the common underlying structure of data spread across multiple data sites by applying clustering techniques. The idea is that each collaborator shares some information about the segmentation (structure) of its local data and improves its own clustering with the information provided by the other learners. This paper analyses the impact of the quality and diversity of the potential learners on the quality of the collaboration for topological collaborative clustering algorithms based on the learning of a Self-Organizing Map (SOM). Experimental analysis on real datasets showed that the diversity between learners impacts the quality of the collaboration. We also showed that some internal quality indexes are good estimators of the increase in quality due to the collaboration.

Keywords

collaborative clustering; topological neural networks; unsupervised learning; diversity; quality

Publisher

University of Social Sciences

Journal

Journal of Artificial Intelligence and Soft Computing Research

Year

2019

Volume

Vol. 9, No. 2

Pages

149–165

Physical description

Bibliography: 41 items, figures.

Authors

author

Rastin Parisa

rastin@lipn.univ-paris13.fr

LIPN, UMR CNRS 7030, Institut Galilée, Université Paris 13, 99 avenue Jean-Baptiste Clément, 93430 Villetaneuse

author

Matei Basarab

LIPN, UMR CNRS 7030, Institut Galilée, Université Paris 13, 99 avenue Jean-Baptiste Clément, 93430 Villetaneuse

author

Cabanes Guénaël

LIPN, UMR CNRS 7030, Institut Galilée, Université Paris 13, 99 avenue Jean-Baptiste Clément, 93430 Villetaneuse

author

Grozavu Nistor

LIPN, UMR CNRS 7030, Institut Galilée, Université Paris 13, 99 avenue Jean-Baptiste Clément, 93430 Villetaneuse

author

Bennani Younés

LIPN, UMR CNRS 7030, Institut Galilée, Université Paris 13, 99 avenue Jean-Baptiste Clément, 93430 Villetaneuse

Notes

Record prepared under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2019).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-97c6fb61-620d-40b1-9f81-8aff5c86a280