PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

On Seeking Consensus Between Document Similarity Measures

Autorzy
Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
This paper investigates the application of consensus clustering and meta-clustering to the set of all possible partitions of a data set. We show that when using a ”complement” of Rand Index as a measure of cluster similarity, the total-separation partition, putting each element in a separate set, is chosen.
Wydawca
Rocznik
Strony
43--68
Opis fizyczny
Bibliogr. 28 poz., rys., tab.
Twórcy
autor
  • Institute of Computer Science, Polish Academy of Sciences ul. Jana Kazimierza 5, 01-248 Warszawa, Poland
Bibliografia
  • [1] Wang H, Shan H, Banerjee A. Bayesian cluster ensembles. Statistical Analysis and Data Mining, 2011; 4:54–70. doi:10.1002/sam.10098.
  • [2] Gionis A, Mannila H, Tsaparas P. Clustering Aggregation. ACM Trans. Knowl. Discov. Data, 2007;1(1). doi:10.1145/1217299.1217303.
  • [3] Caruana R, Elhawary M, Nguyen N, Smith C. Meta Clustering. In: Proceedings of the Sixth International Conference on Data Mining, ICDM ’06. IEEE Computer Society, Washington, DC, USA. 2006 pp. 107–118. ISBN 0-7695-2701-9.
  • [4] Niu D, Dy JG, Jordan MI. Multiple Non-Redundant Spectral Clustering Views. In: ICML’10. 2010 pp. 831–838. URL http://dblp.uni-trier.de/db/conf/icml/icml2010.html#NiuDJ10.
  • [5] Bifulco I, Iorio F, Napolitano F, Raiconi G, Tagliaferri R. Interactive Visualization Tools for Meta-Clustering. In: Proceedings of the 2009 conference on New Directions in Neural Networks: 18th Italian Workshop on Neural Networks: WIRN 2008. IOS Press, Amsterdam, The Netherlands, The Netherlands. ISBN 978-1-58603-984-4, 2009 pp. 223–231.
  • [6] Bifulco I, Fedullo C, Napolitano F, Raiconi G, Tagliaferri R. Multiple data structure discovery through global optimisation, meta clustering and consensus methods. In: International Journal of Knowledge Engineering and Soft Data Paradigms, v.1 n.4, October 2009, pp. 300–317. URL https://doi.org/10.1504/IJKESDP.2009.028984.
  • [7] Dasgupta S, Ng V. Which clustering do you want? inducing your ideal clustering with minimal feedback. J. Artif. Int. Res., 2010;39:581–632. URL http://dl.acm.org/citation.cfm?id=1946417.1946430.
  • [8] Cui Y, Fern XZ, Dy JG. Learning multiple nonredundant clusterings. ACM Transactions on Knowledge Discovery from Data (TKDD), 2010;4(3):15:1–15:32. doi:10.1145/1839490.1839496.
  • [9] Anderberg M. Cluster Analysis for Applications. Academic Press, London, 1973. ISBN:9781483191393.
  • [10] Strehl A, Ghosh J. Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res., 2003;3:583–617. doi:10.1162/153244303321897735.
  • [11] Goder A, Filkov V. Consensus Clustering Algorithms: Comparison and Refinement. In: Munro JI, Wagner D (eds.), Proceedings of the Workshop on Algorithm Engineering and Experiments, ALENEX 2008, San Francisco, California, USA, January 19, 2008 pp. 109–117. doi:10.1137/1.9781611972887.11.
  • [12] Hore P, Hall LO, Goldgof DB. A scalable framework for cluster ensembles. Pattern Recogn., 2009; 42(5):676–688. doi:10.1016/j.patcog.2008.09.027.
  • [13] Ghosh J, Acharya A. Cluster ensembles. Wiley Interdisc. Rew.: Data Mining and Knowledge Discovery, 2011;1(4):305–315. doi:10.1002/widm.32.
  • [14] Li T, Ding C. Weighted consensus clustering. In: Proceedings of 2008 SIAM International Conference on Data Mining (SDM 2008), Atlanta, April 24-26, 2008. Society for Industrial and Applied Mathematics, 2008 pp. 798–809. URL https://doi.org/10.1137/1.9781611972788.72.
  • [15] Punera K, Ghosh J. Consensus Based Ensembles of Soft Clusterings. Applied Artificial Intelligence: An International Journal, 2008;22(7-8):780–810. doi:10.1080/08839510802170546.
  • [16] Monti S, Tamayo P, Mesirov J, Golub T. Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Mach. Learn., 2003;52(1-2):91–118. doi:10.1023/A:1023949509487.
  • [17] Topchy A, Jain AK, Punch W. Clustering ensembles: Models of consensus and weak partitions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005;27:1866–1881. doi:10.1109/TPAMI.2005.237.
  • [18] Nguyen N, Caruana R. Consensus Clusterings. In: Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), October 28-31, 2007, Omaha, Nebraska, USA. 2007 pp. 607–612. doi:10.1109/ICDM.2007.73.
  • [19] Wang Y, Pan Y. Semi-Supervised Consensus Clustering for Gene Expression Data Analysis. BioData Mining, 2014;7(7):13. doi:10.1186/1756-0381-7-7.
  • [20] Vogel T, Naumann F. Semi-Supervised Consensus Clustering: Reducing Human Effort. In: Proceedings of the International Workshop on Data Integration and Applications. 2014. doi:10.1109/ICDMW.2014.97.
  • [21] Barthelemy JP, Leclerc B. The median procedure for partition. In: et al IC (ed.), Partitioning Data Sets, AMS DIMACS Series in Discrete Mathematics. 1995 pp. 3–34.
  • [22] Gordon A, Vichi M. Partitions of partitions. Journal of Classification, 1998;15(2):265–285. doi:10.1007/s003579900034.
  • [23] Goder A, Filkov V. Consensus Clustering Algorithms: Comparison and Refinement. In: Alenex, volume 8. SIAM, 2008 pp. 109–117. doi:10.1137/1.9781611972887.11.
  • [24] Morlini I, Zani S. Comparing Approaches for Clustering Mixed Mode Data: An Application in Marketing Research. In: Palumbo F, Lauro CN, Greenacre M (eds.), Data Analysis and Classification: Proceedings of the 6th Conference. Springer, 2010 pp. 49–57. doi:10.1007/978-3-642-03739-9_6.
  • [25] Lei Y, Bezdek JC, Romano S, Vinh NX, Chan J, Bailey J. Ground Truth Bias in External Cluster Validity Indices. CoRR, 2016. abs/1606.05596.
  • [26] Milligan GW, Cooper MC. A study of the comparability of external criteria for hierarchical cluster analysis. Multivar. Behav. Res. 1986;21(4):441–458. doi:10.1207/s15327906mbr2104_5.
  • [27] Fowlkes EB, Mallows CL. A method for comparing two hierarchical clusterings, J. Am. Stat. Assoc., 1983;78(383):553–569. doi:10.2307/2288117.
  • [28] Simone Romano VNKV James Bailey. Standardized Mutual Information for Clustering Comparisons: One Step Further in Adjustment for Chance. In: Proceedings of The 31st International Conference on Machine Learning. 2014 pp. 1143–1151. URL http://jmlr.org/proceedings/papers/v32/romano14.pdf.
Uwagi
Opracowanie rekordu w ramach umowy 509/P-DUN/2018 ze środków MNiSW przeznaczonych na działalność upowszechniającą naukę (2018).
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-1d4770e0-1f8e-4eaf-bf4f-e7ae259f8180
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.