PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

Social Network Optimization for Cluster Ensemble Selection

Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
This paper studies the cluster ensemble selection problem for unsupervised learning. Given a large ensemble of clustering solutions, our goal is to select a subset of solutions to form a smaller yet better performing cluster ensemble than using all available solutions. The common way of aggregating the chosen solutions is accumulating the information of the selected results to a similarity matrix. This paper suggests transforming the similarity matrix to a modularity matrix and then applying a new consensus function which optimizes modularity measure in it. We represent the modularity maximization problem as a 0-1 quadratic program which can be exactly solved for small datasets. We also established a new greedy algorithm, namely sum linkage, to optimize the objective function specially for large scale datasets in a very short time. We show that the proposed consensus partition gets much closer to the actual cluster structure than the partitions obtained from the direct application of common cluster ensemble methods. The promising results compared with other most cited consensus functions show the excellent efficiency of the proposed method.
Wydawca
Rocznik
Strony
79--102
Opis fizyczny
Bibliogr. 47 poz., rys., tab., wykr.
Twórcy
autor
  • School of Computer Science and Technology, Jilin University, Jilin Province, China
  • School of Computer Science, Iran University of Sci. and Technology, Tehran, Iran
  • School of Computer Science, Iran University of Sci. and Technology, Tehran, Iran
  • Young Researchers and Elite Club, Yasooj Branch, Islamic Azad University, Yasooj, Iran
autor
  • Department of Computer Engineering, Nourabad Mamasani Branch, Islamic Azad University, Nourabad Mamasani, Iran
  • Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam
Bibliografia
  • [1] Jain AK. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett., 2010. 31(8):651-666. doi:10.1016/j.patrec.2009.09.011. URL https://doi.org/10.1016/j.patrec.2009.09.011.
  • [2] Alizadeh H, Minaei-Bidgoli B, Parvin H. Optimizing Fuzzy Cluster Ensemble in String Representation. Int. J. Pattern Recognit. Artif. Intell., 2013. 27(2). doi:10.1142/S0218001413500055. URL https://doi.org/10.1142/S0218001413500055.
  • [3] Strehl A, Ghosh J. Cluster Ensembles — A Knowledge Reuse Framework for Combining Multiple Partitions. J. Mach. Learn. Res., 2002. 3:583-617. URL http://jmlr.org/papers/v3/strehl02a.html.
  • [4] Alizadeh H, Minaei-Bidgoli B, Parvin H. Cluster ensemble selection based on a new cluster stability measure. Intell. Data Anal., 2014. 18(3):389-408. doi:10.3233/IDA-140647. URL https://doi.org/10.3233/IDA-140647.
  • [5] Fern XZ, Lin W. Cluster Ensemble Selection. Statistical Analysis and Data Mining, 2008. 1(3):128-141. doi:10.1002/sam.10008. URL https://doi.org/10.1002/sam.10008.
  • [6] Azimi J, Fern XZ. Adaptive Cluster Ensemble Selection. In: Boutilier C (ed.), IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11-17, 2009. 2009 pp. 992-997. URL http://ijcai.org/Proceedings/09/Papers/168.pdf.
  • [7] Abbasi S, Nejatian S, Parvin H, Rezaie V, Bagherifard K. Clustering ensemble selection considering quality and diversity. Artif. Intell. Rev., 2019. 52(2):1311-1340. doi:10.1007/s10462-018-9642-2. URL https://doi.org/10.1007/s10462-018-9642-2.
  • [8] Alizadeh H, Minaei-Bidgoli B, Parvin H. To improve the quality of cluster ensembles by selecting a subset of base clusters. J. Exp. Theor. Artif. Intell., 2014. 26(1):127-150. doi:10.1080/0952813X.2013.813974. URL https://doi.org/10.1080/0952813X.2013.813974.
  • [9] Agarwal G, Kempe D. Modularity-Maximizing Graph Communities via Mathematical Programming. Physics of Condensed Matter, 2007. 66. doi:10.1140/epjb/e2008-00425-1.
  • [10] Newman MEJ. Communities, modules and large-scale structure in networks, 2012.
  • [11] Fortunato S. Community detection in graphs. CoRR, 2009. abs/0906.0612. 0906.0612, URL http://arxiv.org/abs/0906.0612.
  • [12] Fred ALN, Jain AK. Combining Multiple Clusterings Using Evidence Accumulation. IEEE Trans. Pattern Anal. Mach. Intell., 2005. 27(6):835-850. doi:10.1109/TPAMI.2005.113. URL https://doi.org/10.1109/TPAMI.2005.113.
  • [13] Caruana R, Niculescu-Mizil A, Crew G, Ksikes A. Ensemble selection from libraries of models. In: Brodley CE (ed.), Machine Learning, Proceedings of the Twenty-first International Conference (ICML 2004), Banff, Alberta, Canada, July 4-8, 2004, volume 69 of ACM International Conference Proceeding Series. ACM, 2004 doi:10.1145/1015330.1015432. URL https://doi.org/10.1145/1015330.1015432.
  • [14] Hadjitodorov ST, Kuncheva LI, Todorova LP. Moderate diversity for better cluster ensembles. Inf. Fusion, 2006. 7(3):264-275. doi:10.1016/j.inffus.2005.01.008. URL https://doi.org/10.1016/j.inffus. 2005.01.008.
  • [15] Parvin H, Minaei-Bidgoli B. A clustering ensemble framework based on elite selection of weighted clusters. Adv. Data Anal. Classif., 2013. 7(2):181-208. doi:10.1007/s11634-013-0130-x. URL https://doi.org/10.1007/s11634-013-0130-x.
  • [16] Singh V, Mukherjee L, Peng J, Xu J. Ensemble clustering using semidefinite programming with applications. Mach. Learn., 2010. 79(1-2):177-200. doi:10.1007/s10994-009-5158-y. URL https://doi.org/10.1007/s10994-009-5158-y.
  • [17] Rao PR, da Costa JPP. A performance study of a consensus clustering algorithm and properties of partition graph. In: 2010 IEEE International Conference on Computational Intelligence and Computing Research. 2010 pp. 1-5.
  • [18] Guénoche A. Consensus of partitions : a constructive approach. Adv. Data Anal. Classif., 2011. 5(3):215-229. doi:10.1007/s11634-011-0087-6. URL https://doi.org/10.1007/s11634-011-0087-6.
  • [19] Christou IT. Coordination of Cluster Ensembles via Exact Methods. IEEE Trans. Pattern Anal. Mach. Intell., 2011. 33(2):279-293. doi:10.1109/TPAMI.2010.85. URL https://doi.org/10.1109/TPAMI.2010.85.
  • [20] Vega-Pons S, Ruiz-Shulcloper J. A Survey of Clustering Ensemble Algorithms. Int. J. Pattern Recognit. Artif. Intell., 2011. 25(3):337-372. doi:10.1142/S0218001411008683. URL https://doi.org/10.1142/S0218001411008683.
  • [21] Brandes U, Delling D, Gaertler M, Görke R, Hoefer M, Nikoloski Z, Wagner D. On Modularity Clustering. IEEE Trans. Knowl. Data Eng., 2008. 20(2):172-188. doi:10.1109/TKDE.2007.190689. URL https://doi.org/10.1109/TKDE.2007.190689.
  • [22] Zhang X, Wang R, Wang Y, Wang J, Qiu Y, Wang L, Chen L. Modularity optimization in community detection of complex networks. EPL (Europhysics Letters), 2009. 87:38002. doi:10.1209/0295-5075/87/38002.
  • [23] Zhang X, Li Z, Wang R, Wang Y. A combinatorial model and algorithm for globally searching community structure in complex networks. J. Comb. Optim., 2012. 23(4):425-442. doi:10.1007/s10878-010-9356-0. URL https://doi.org/10.1007/s10878-010-9356-0.
  • [24] Lancichinetti A, Fortunato S. Consensus clustering in complex networks. CoRR, 2012. abs/1203.6093. 1203.6093, URL http://arxiv.org/abs/1203.6093.
  • [25] Gambette P, Guénoche A. Bootstrap clustering for graph partitioning. RAIRO - Operations Research, 2011. 45(4):339-352. doi:10.1051/ro/2012001. URL https://doi.org/10.1051/ro/2012001.
  • [26] Hosseinzadeh R, Alizadeh H, Nazemi E. Community Detection Ensemble in Social Networks. In: 11th Iranian Conf. on Intelligent Systems (ICIS13), Tehran. 2013 pp. 27-35.
  • [27] Newman MEJ, Girvan M. Finding and evaluating community structure in networks. Phys. Rev. E, 2004. 69(2):026113. doi:10.1103/PhysRevE.69.026113. URL http://link.aps.org/doi/10.1103/PhysRevE.69.026113.
  • [28] Breckenridge JN. Replicating Cluster Analysis: Method, Consistency, and Validity. Multivariate Behavioral Research, 1989. 24(2):147-161. doi:10.1207/s15327906mbr2402\_1. PMID: 26755276, https://doi.org/10.1207/s15327906mbr2402_1, URL https://doi.org/10.1207/s15327906mbr2402_1.
  • [29] Roth V, Lange T, Braun ML, Buhmann JM. A Resampling Approach to Cluster Validation. In: Härdle WK, Rönz B (eds.), COMPSTAT 2002, Proceedings in Computational Statistics, 15th Symposium, Berlin, Germany, August 24-28, 2002. Springer, 2002 pp. 123-128. doi:10.1007/978-3-642-57489-4\_13. URL https://doi.org/10.1007/978-3-642-57489-4\_13.
  • [30] Roth V, Lange T. Feature Selection in Clustering Problems. In: Thrun S, Saul LK, Schölkopf B (eds.), Advances in Neural Information Processing Systems 16 [Neural Information Processing Systems, NIPS 2003, December 8-13, 2003, Vancouver and Whistler, British Columbia, Canada]. MIT Press, 2003 pp. 473-480. URL http://papers.nips.cc/paper/2486-feature-selection-in-clustering-problems.
  • [31] Pascual D, Pla F, Sánchez JS. Cluster validation using information stability measures. Pattern Recognit. Lett., 2010. 31(6):454-461. doi:10.1016/j.patrec.2009.07.009. URL https://doi.org/10.1016/j.patrec.2009.07.009.
  • [32] Law MHC, Topchy AP, Jain AK. Multiobjective Data Clustering. In: 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June - 2 July 2004, Washington, DC, USA. IEEE Computer Society, 2004 pp. 424-430. doi:10.1109/CVPR.2004.170. URL http://doi.ieeecomputersociety.org/10.1109/CVPR.2004.170.
  • [33] Guimera R, Amaral LAN. Functional cartography of complex metabolic networks. Nature, 2005. 433(7028):895-900. URL http://dx.doi.org/10.1038/nature03288.
  • [34] Clauset A, Newman M, Moore C. Finding community structure in very large networks. Physical review. E, Statistical, nonlinear, and soft matter physics, 2004. 70 6 Pt 2:066111.
  • [35] Newman ME. Modularity and community structure in networks. Proc Natl Acad Sci U S A, 2006. 103(23):8577-8582. doi:10.1073/pnas.0601602103. URL http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&list_uids=16723398&dopt=AbstractPlus.
  • [36] Ahn YY, Bagrow J, Jørgensen S. Link communities reveal multiscale complexity in networks. Nature, 2010. 466(7307):761-764. doi:10.1038/nature09182.
  • [37] Li S, Chen Y, Du H, Feldman MW. A genetic algorithm with local search strategy for improved detection of community structure. Complexity, 2010. 15(4):53-60. doi:10.1002/cplx.20300. URL https://doi.org/10.1002/cplx.20300.
  • [38] Newman MEJ. Fast algorithm for detecting community structure in networks. Phys. Rev. E, 2004. 69:066133. doi:10.1103/PhysRevE.69.066133. URL https://link.aps.org/doi/10.1103/PhysRevE.69.066133.
  • [39] Mojarad M, Parvin H, Nejatian S, Rezaie V. Consensus Function Based on Clusters Clustering and Iterative Fusion of Base Clusters. Int. J. Uncertain. Fuzziness Knowl. Based Syst., 2019. 27(1):97-120. doi:10.1142/S0218488519500053. URL https://doi.org/10.1142/S0218488519500053.
  • [40] Tan P, Steinbach MS, Kumar V. Introduction to Data Mining. Addison-Wesley, 2005. ISBN 0-321-32136-7. URL http://www-users.cs.umn.edu/\%7Ekumar/dmbook/.
  • [41] Mojarad M, Nejatian S, Parvin H, Mohammadpoor M. A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters. Appl. Intell., 2019. 49(7):2567-2581. doi:10.1007/s10489-018-01397-x. URL https://doi.org/10.1007/s10489-018-01397-x.
  • [42] Iam-on N, Boongoen T, Garrett SM, Price CJ. A Link-Based Approach to the Cluster Ensemble Problem. IEEE Trans. Pattern Anal. Mach. Intell., 2011. 33(12):2396-2409. doi:10.1109/TPAMI.2011.84. URL https://doi.org/10.1109/TPAMI.2011.84.
  • [43] Zhou Z, Tang W. Clusterer ensemble. Knowl. Based Syst., 2006. 19(1):77-83. doi:10.1016/j.knosys.2005.11.003. URL https://doi.org/10.1016/j.knosys.2005.11.003.
  • [44] Topchy AP, Jain AK, Punch WF. Clustering Ensembles: Models of Consensus and Weak Partitions. IEEE Trans. Pattern Anal. Mach. Intell., 2005. 27(12):1866-1881. doi:10.1109/TPAMI.2005.237. URL https://doi.org/10.1109/TPAMI.2005.237.
  • [45] Nguyen N, Caruana R. Consensus Clusterings. In: Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), October 28-31, 2007, Omaha, Nebraska, USA. IEEE Computer Society, 2007 pp. 607-612. doi:10.1109/ICDM.2007.73. URL https://doi.org/10.1109/ICDM.2007.73.
  • [46] Minaei-Bidgoli B, Parvin H, Alinejad-Rokny H, Alizadeh H, Punch WF. Effects of resampling method and adaptation on clustering ensemble efficacy. Artif. Intell. Rev., 2014. 41(1):27-48. doi:10.1007/s10462-011-9295-x. URL https://doi.org/10.1007/s10462-011-9295-x.
  • [47] Dietterich TG. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 1998. 10(7):1895-1923.
Uwagi
Opracowanie rekordu ze środków MNiSW, umowa Nr 461252 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2020).
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-e9c34d3a-8725-4ced-8588-0b413ef9e9d6
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.