Unsupervised and Interactive Semi-supervised Clustering for Large Image Database Indexing and Retrieval

Lai, H. P.; Visani, M.; Boucher, A.; Ogier, J. M.

doi:10.3233/FI-2014-988

Article details

Article title

Unsupervised and Interactive Semi-supervised Clustering for Large Image Database Indexing and Retrieval

Authors

Lai H. P., Visani M., Boucher A., Ogier J. M.

Selected full texts from this journal

https://fi.episciences.org/

Identifiers

DOI

10.3233/FI-2014-988

Title variants

Publication languages

Abstracts

Feature space structuring methods play a very important role in finding information in large image databases. They organize indexed images in order to facilitate, accelerate and improve the results of subsequent retrieval. Clustering, one kind of feature space structuring, can organize a dataset into groups of similar objects either without prior knowledge (unsupervised clustering) or with a limited amount of prior knowledge (semi-supervised clustering). In this paper, we present both formal and experimental comparisons of different unsupervised clustering methods for structuring large image databases. We use image databases of increasing size (Wang, PascalVoc2006, Caltech101, Corel30k) to study the scalability of the different approaches. We then present a new interactive semi-supervised clustering model, which allows users to provide feedback in order to improve the clustering results according to their wishes. We also compare our proposed model experimentally with the semi-supervised HMRF-kmeans clustering method.
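To make the distinction between unsupervised and semi-supervised clustering concrete, here is a minimal sketch in plain Python. It is not the authors' interactive model and not HMRF-kmeans. It only contrasts a basic k-means started from arbitrary centroids with the same k-means started from centroids computed on a few labeled points, a seeding-style use of prior knowledge. All function names, the toy 2-D points and the seed labels are assumptions made for illustration. Agreement with a reference partition is scored with the Rand index.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, centroids, max_iter=100):
    """Basic k-means: alternate assignment and centroid update until stable."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean(members)
    return labels


def seeded_centroids(points, seeds):
    """Initial centroids from labeled seeds (hypothetical helper).

    seeds maps a point index to a class label in 0..k-1; this sketch
    assumes every class has at least one seed.
    """
    k = max(seeds.values()) + 1
    return [mean([points[i] for i, c in seeds.items() if c == j])
            for j in range(k)]


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Toy data (assumed): two well-separated groups of 2-D feature vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
reference = [0, 0, 0, 1, 1, 1]

# Unsupervised: no prior knowledge, start from the first two points.
unsupervised = kmeans(points, centroids=points[:2])

# Semi-supervised: a user has labeled one point per group.
seeds = {0: 0, 3: 1}
semi_supervised = kmeans(points, centroids=seeded_centroids(points, seeds))

print(rand_index(unsupervised, reference))
print(rand_index(semi_supervised, reference))
```

On data this cleanly separated both variants recover the same grouping. Seeding matters more when clusters overlap or when the initial centroids are poorly placed, which is the situation user feedback is meant to help with.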

Keywords

unsupervised clustering, semi-supervised clustering, interactive learning

Publisher

IOS Press

Journal

Fundamenta Informaticae

Year

2014

Volume

Vol. 130, no. 2

Pages

201–218

Physical description

Bibliography: 23 items, tables, charts.

Contributors

author

Lai H. P.

hien_phuong.lai@univ-lr.fr

L3I, Université de La Rochelle, France; IFI, MSI team; IRD, UMI 209 UMMISCO; Vietnam National University, Vietnam

author

Visani M.

muriel.visani@univ-lr.fr

L3I, Université de La Rochelle, France; IFI, MSI team; IRD, UMI 209 UMMISCO; Vietnam National University, Vietnam

author

Boucher A.

alainboucher12@gmail.com

L3I, Université de La Rochelle, France; IFI, MSI team; IRD, UMI 209 UMMISCO; Vietnam National University, Vietnam

author

Ogier J. M.

jean-marc.ogier@univ-lr.fr

L3I, Université de La Rochelle, France; IFI, MSI team; IRD, UMI 209 UMMISCO; Vietnam National University, Vietnam

Bibliography

[1] Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications, SIGMOD Rec., 27(2), June 1998, 94–105.
[2] Ankerst, M., Breunig, M. M., Kriegel, H.-P., Sander, J.: OPTICS: ordering points to identify the clustering structure, SIGMOD Rec., 28(2), June 1999, 49–60.
[3] Ball, G. H., Hall, D. J.: A clustering technique for summarizing multivariate data, Systems Research and Behavioral Science, 12, 1967, 153–155.
[4] Basu, S., Banerjee, A., Mooney, R. J.: Semi-supervised Clustering by Seeding, Proceedings of the Nineteenth International Conference on Machine Learning, ICML ’02, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.
[5] Basu, S., Bilenko, M., Banerjee, A., Mooney, R. J.: Probabilistic Semi-Supervised Clustering with Constraints, in: Semi-Supervised Learning (O. Chapelle, B. Schölkopf, A. Zien, Eds.), MIT Press, Cambridge, MA, 2006.
[6] Dubey, A., Bhattacharya, I., Godbole, S.: A cluster-level semi-supervision model for interactive clustering, Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I, ECML PKDD’10, Springer-Verlag, Berlin, Heidelberg, 2010.
[7] Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise, Proc. of the 2nd Intl. Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1996.
[8] Guha, S., Rastogi, R., Shim, K.: CURE: an efficient clustering algorithm for large databases, SIGMOD Rec., 27(2), June 1998, 73–84.
[9] Hinneburg, A., Keim, D. A.: A General Approach to Clustering in Large Databases with Noise, Knowledge and Information Systems, 5, 2003, 387–415.
[10] Katayama, N., Satoh, S.: The SR-tree: an index structure for high-dimensional nearest neighbor queries, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, SIGMOD ’97, ACM, New York, NY, USA, 1997.
[11] Kaufman, L., Rousseeuw, P. J.: Finding groups in data: an introduction to cluster analysis, John Wiley and Sons, New York, 1990.
[12] Lance, G. N., Williams, W. T.: A general theory of classificatory sorting strategies. II. Clustering systems, The Computer Journal, 10(3), 1967, 271–277.
[13] Likas, A., Vlassis, N., Verbeek, J. J.: The global k-means clustering algorithm, Pattern Recognition, 36(2), February 2003, 451–461.
[14] MacQueen, J.: Some methods for classification and analysis of multivariate observations, Proc. Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, Univ. of Calif. Press, 1967.
[15] Rand, W.M.: Objective Criteria for the Evaluation of Clustering Methods, Journal of the American Statistical Association, 66(336), 1971, 846–850.
[16] Rosenberg, A., Hirschberg, J.: V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007.
[17] Rousseeuw, P.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, Journal of Computational & Applied Mathematics, 20(1), November 1987, 53–65.
[18] van de Sande, K. E. A., Gevers, T., Snoek, C. G. M.: Evaluation of Color Descriptors for Object and Scene Recognition, IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, Alaska, 2008.
[19] Sheikholeslami, G., Chatterjee, S., Zhang, A.: Wavecluster: A multi-resolution clustering approach for very large spatial databases, In Proc. of the 24th VLDB, New York, NY, USA, 1998.
[20] Sivic, J., Zisserman, A.: Video Google: a text retrieval approach to object matching in videos, Proceedings. Ninth IEEE International Conference on Computer Vision (ICCV), 2003.
[21] Wagstaff, K., Cardie, C., Rogers, S., Schrödl, S.: Constrained K-means Clustering with Background Knowledge, Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2001.
[22] Wang, W., Yang, J., Muntz, R. R.: STING: A Statistical Information Grid Approach to Spatial Data Mining, Proceedings of the 23rd International Conference on Very Large Data Bases, VLDB ’97, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1997.
[23] Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, SIGMOD ’96, ACM, New York, NY, USA, 1996.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-a2dd657a-a8ff-45ae-987f-d180ef037a86