Vol. 99, no. 2
Article title
Title variants
Publication languages
This work focuses on the problem of clustering resources contained in knowledge bases represented through the multi-relational standard languages that are typical of the Semantic Web and ultimately founded on Description Logics. The proposed solution relies on effective, language-independent dissimilarity measures based on a finite number of dimensions, each corresponding to one of a committee of discriminating features; the committee stands for a context and is represented by concept descriptions in Description Logics. The proposed clustering algorithm expresses the possible clusterings as tuples of central elements: in this categorical setting, we resort to the notion of medoid w.r.t. the given metric. These centers are iteratively adjusted following the rationale of the fuzzy clustering approach, i.e. one where membership in each cluster is not deterministic but graded, ranging over the unit interval. This copes better with the inherent uncertainty of knowledge bases expressed in Description Logics, which adopt an open-world semantics. Extensive experimentation with a number of ontologies demonstrates the feasibility of our method and its effectiveness in terms of major clustering validity indices.
Physical description
Bibliography: 36 items, tables.
- Dipartimento di Informatica, Università degli Studi di Bari, Campus Universitario, Via Orabona 4, 70125 Bari, Italy
Document type
YADDA identifier