On the order equivalence relation of binary association measures

Paradowski, M.

doi:10.1515/amcs-2015-0047

Article details

Article title

On the order equivalence relation of binary association measures

Authors

Paradowski M.

Identifiers

DOI

10.1515/amcs-2015-0047

Abstract

Over a century of research has produced more than a hundred binary association measures, many of which share similar properties. An overview of binary association measures is presented, focused on their order equivalences. Association measures are grouped according to their relations, and transformations between them are shown both formally and visually. A generalization coefficient is proposed, based on the joint probability and the marginal probabilities. Combining association measures is a recent trend in computer science: measures are combined in linear and nonlinear discrimination models and in automated feature selection or construction. Knowledge of their relations is particularly important for avoiding meaningless results, zeroed generalized variances, and the curse of dimensionality, or simply for saving time.
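As an illustration of what order equivalence means in practice, the sketch below checks it numerically for two well-known coefficients from the reference list, Jaccard and Dice. It is not taken from the paper. It assumes the usual reading of the term (two measures are order-equivalent when they never disagree on which of two inputs is more strongly associated) and the usual contingency counts: a for joint presences, b and c for presences in only one of the two variables. The helper names `same_order` and `one_sided` are hypothetical; `one_sided` is an arbitrary illustrative measure chosen only to show a pair that fails the check.

```python
from fractions import Fraction
from itertools import product


def jaccard(a, b, c):
    """Jaccard coefficient a / (a + b + c), computed exactly."""
    return Fraction(a, a + b + c)


def dice(a, b, c):
    """Dice coefficient 2a / (2a + b + c), computed exactly."""
    return Fraction(2 * a, 2 * a + b + c)


def one_sided(a, b, c):
    """Illustrative (hypothetical) measure a / (a + b); it ignores c."""
    return Fraction(a, a + b)


def same_order(f, g, samples):
    """Return True if f and g never disagree on the ordering of two samples."""
    for x, y in product(samples, repeat=2):
        if (f(*x) < f(*y)) != (g(*x) < g(*y)):
            return False
    return True


# Small grid of contingency counts; a >= 1 keeps every denominator nonzero.
samples = [(a, b, c) for a in range(1, 6) for b in range(6) for c in range(6)]

# Dice is a strictly increasing function of Jaccard: D = 2J / (1 + J).
dice_is_transform = all(
    dice(*s) == 2 * jaccard(*s) / (1 + jaccard(*s)) for s in samples
)

jaccard_dice_equivalent = same_order(jaccard, dice, samples)
jaccard_one_sided_equivalent = same_order(jaccard, one_sided, samples)
```

Exact fractions are used instead of floats so that ties are detected reliably. A passing check on a finite grid is only evidence, not a proof; for Jaccard and Dice the equivalence follows from the monotone transformation D = 2J / (1 + J). Two order-equivalent measures produce the same ranking, so feeding both into a combined model adds a redundant feature, which is the kind of problem the abstract warns about.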

Keywords

association coefficient, result ranking, linear combination, zeroed variance determinant, feature selection

Publisher

Oficyna Wydawnicza Uniwersytetu Zielonogórskiego

Journal

International Journal of Applied Mathematics and Computer Science

Year

2015

Volume

Vol. 25, no. 3

Pages

645–657

Physical description

Bibliography: 30 items, tables, charts.

Contributors

author

Paradowski M.

mariusz.paradowski@pwr.edu.pl

Department of Computational Intelligence, Wrocław University of Technology, Wyb. Wyspiańskiego 27, 50–370 Wrocław, Poland

References

[1] Batagelj, V. and Bren, M. (1995). Comparing resemblance measures, Journal of Classification 12(1): 73–90.
[2] Buczyński, A. (2004). Text Acquisition from the Internet for Linguistic Research, Master’s thesis, Warsaw University, Warsaw (in Polish).
[3] Chapelle, O. and Wu, M. (2010). Gradient descent optimization of smoothed information retrieval metrics, Information Retrieval 13(3): 216–235.
[4] Cheetham, A.H. and Hazel, J.E. (1969). Binary (presence-absence) similarity coefficients, Journal of Paleontology 43(5): 1130–1136.
[5] Choi, S.-S., Cha, S.-H. and Tappert, C.C. (2010). A survey of binary similarity and distance measures, Journal of Systemics, Cybernetics & Informatics 8(1): 43–48.
[6] Clarke, K.R., Somerfield, P.J. and Chapman, M.G. (2006). On resemblance measures for ecological studies, including taxonomic dissimilarities and a zero-adjusted Bray–Curtis coefficient for denuded assemblages, Journal of Experimental Marine Biology and Ecology 330(1): 55–80.
[7] Consonni, V. and Todeschini, R. (2012). New similarity coefficients for binary data, Match-Communications in Mathematical and Computer Chemistry 68(2): 581.
[8] Dice, L.R. (1945). Measures of the amount of ecologic association between species, Ecology 26(3): 297–302.
[9] Duarte, J.M., Santos, J.B.d. and Melo, L.C. (1999). Comparison of similarity coefficients based on RAPD markers in the common bean, Genetics and Molecular Biology 22(3): 427–432.
[10] Friedman, J.H. (1997). On bias, variance, 0/1–loss, and the curse-of-dimensionality, Data Mining and Knowledge Discovery 1(1): 55–77.
[11] Gower, J.C. and Legendre, P. (1986). Metric and Euclidean properties of dissimilarity coefficients, Journal of Classification 3(1): 5–48.
[12] Hoang, H.H., Kim, S.N. and Kan, M.-Y. (2009). A re-examination of lexical association measures, Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications, Singapore, pp. 31–39.
[13] Hubalek, Z. (1982). Coefficients of association and similarity, based on binary (presence-absence) data: An evaluation, Biological Reviews 57(4): 669–689.
[14] Jaccard, P. (1912). The distribution of the flora in the alpine zone, New Phytologist 11(2): 37–50.
[15] Johnson, R.A. and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis, 6th Edn., Pearson International Edition, Prentice Hall, Upper Saddle River, NJ.
[16] Kazienko, P. (2009). Mining indirect association rules for web recommendation, International Journal of Applied Mathematics and Computer Science 19(1): 165–186, DOI: 10.2478/v10006-009-0015-5.
[17] Kekäläinen, J. (2005). Binary and graded relevance in IR evaluations—comparison of the effects on ranking of IR systems, Information Processing & Management 41(5): 1019–1033.
[18] Liu, T.-Y. (2009). Learning to rank for information retrieval, Foundations and Trends in Information Retrieval 3(3): 225–331.
[19] Nieddu, L. and Rizzi, A. (2007). Proximity measures in symbolic data analysis, Statistica 63(2): 195–211.
[20] Pecina, P. (2005). An extensive empirical study of collocation extraction methods, Proceedings of the Association for Computational Linguistics Student Research Workshop, Ann Arbor, MI, USA, pp. 13–18.
[21] Pecina, P. (2008). A machine learning approach to multiword expression extraction, Proceedings of the Language Resources and Evaluation Workshop Towards a Shared Task for Multiword Expressions, Marrakech, Morocco, pp. 54–61.
[22] Pecina, P. (2010). Lexical association measures and collocation extraction, Language Resources and Evaluation 44(1–2): 137–158.
[23] Pecina, P. and Schlesinger, P. (2006). Combining association measures for collocation extraction, Proceedings of the COLING/Association for Computational Linguistics on Main Conference, Sydney, Australia, pp. 651–658.
[24] Petrović, S., Šnajder, J. and Bašić, B.D. (2010). Extending lexical association measures for collocation extraction, Computer Speech & Language 24(2): 383–394.
[25] Rifqi, M., Lesot, M.-J. and Detyniecki, M. (2008). Fuzzy order-equivalence for similarity measures, Annual Meeting of the North American Fuzzy Information Processing Society, NAFIPS 2008, New York, NY, USA, pp. 1–6.
[26] Segond, M. and Borgelt, C. (2011). Item set mining based on cover similarity, in J.Z. Huang, L. Cao and J. Srivastava (Eds.), Advances in Knowledge Discovery and Data Mining, Springer, Berlin/Heidelberg, pp. 493–505.
[27] Tan, P.-N., Kumar, V. and Srivastava, J. (2004). Selecting the right objective measure for association analysis, Information Systems 29(4): 293–313.
[28] Tversky, A. (1977). Features of similarity, Psychological Review 84(4): 327.
[29] Washtell, J. and Markert, K. (2009). A comparison of windowless and window-based computational association measures as predictors of syntagmatic human associations, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, Vol. 2, pp. 628–637.
[30] Wolda, H. (1981). Similarity indices, sample size and diversity, Oecologia 50(3): 296–302.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-66385c10-2a11-4515-bb54-24b11d811281