Identyfikatory
Warianty tytułu
Języki publikacji
Abstrakty
The Robinson–Foulds (RF) distance is the most popular method of evaluating the dissimilarity between phylogenetic trees. In this paper, we define and explore in detail properties of the Matching Cluster (MC) distance, which can be regarded as a refinement of the RF metric for rooted trees. Similarly to RF, MC operates on clusters of compared trees, but the distance evaluation is more complex. Using the graph theoretic approach based on a minimum-weight perfect matching in bipartite graphs, the values of similarity between clusters are transformed to the final MC-score of the dissimilarity of trees. The analyzed properties give insight into the structure of the metric space generated by MC, its relations with the Matching Split (MS) distance of unrooted trees and asymptotic behavior of the expected distance between binary n-leaf trees selected uniformly in both MC and MS [...].
Rocznik
Tom
Strony
669--684
Opis fizyczny
Bibliogr. 51 poz., rys., tab., wykr.
Twórcy
autor
- Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Narutowicza 11/12, 80-233 Gdańsk, Poland
autor
- Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Narutowicza 11/12, 80-233 Gdańsk, Poland
Bibliografia
- [1] Alberich, R., Cardona, G., Rosselló, F. and Valiente, G. (2009). An algebraic metric for phylogenetic trees, Applied Mathematics Letters 22(9): 1320–1324.
- [2] Aldous, D.J. (1991). The continuum random tree II: An overview, in M.T. Barlow and N. H. Bingham (Eds.), Stochastic Analysis, Cambridge University Press, Cambridge, pp. 23–70.
- [3] Bansal, M., Burleigh, J.G., Eulenstein, O. and Fernández-Baca, D. (2010). Robinson–Foulds supertrees, Algorithms for Molecular Biology 5(1): 18.
- [4] Bansal, M.S., Dong, J. and Fernández-Baca, D. (2011). Comparing and aggregating partially resolved trees, Theoretical Computer Science 412(48): 6634–6652.
- [5] Biedrzycki, R. and Arabas, J. (2012). KIS: An automated attribute induction method for classification of DNA sequences, International Journal of Applied Mathematics and Computer Science 22(3): 711–721, DOI: 10.2478/v10006-012-0053-2.
- [6] Bininda-Emonds, O.R.P., Cardillo, M., Jones, K.E., MacPhee, R. D.E., Beck, R.M.D., Grenyer, R., Price, S.A., Vos, R.A., Gittleman, J.L. and Purvis, A. (2007). The delayed rise of present-day mammals, Nature 446(7135): 507–512.
- [7] Blum, M.G.B., François, O. and Janson, S. (2006). The mean, variance and limiting distribution of two statistics sensitive to phylogenetic tree balance, The Annals of Applied Probability 16(4): 2195–2214.
- [8] Boc, A., Philippe, H. and Makarenkov, V. (2010). Inferring and validating horizontal gene transfer events using bipartition dissimilarity, Systematic Biology 59(2): 195–211.
- [9] Bogdanowicz, D. and Giaro, K. (2012). Matching split distance for unrooted binary phylogenetic trees, IEEE/ACM Transactions on Computational Biology and Bioinformatics 9(1): 150–160.
- [10] Bogdanowicz, D., Giaro, K. and Wróbel, B. (2012). TreeCmp: Comparison of trees in polynomial time, Evolutionary Bioinformatics 8: 475–487.
- [11] Bolikowski, L. and Gambin, A. (2007). New metrics for phylogenies, Fundamenta Informaticae 78(2): 199–216.
- [12] Boorman, S.A. and Olivier, D.C. (1973). Metrics on spaces of finite trees, Journal of Mathematical Psychology 10(1): 26–59.
- [13] Bordewich, M. and Semple, C. (2005). On the computational complexity of the rooted subtree prune and regraft distance, Annals of Combinatorics 8(4): 409–423.
- [14] Brinkmeyer, M., Griebel, T. and Böcker, S. (2011). Polynomial supertree methods revisited, Advances in Bioinformatics 2011: 524182.
- [15] Bryant, D. (1997). Building Trees, Hunting for Trees, and Comparing Trees—Theory and Methods in Phylogenetic Analysis, Ph.D. thesis, University of Canterbury, Christchurch.
- [16] Cardona, G., Llabrés, M., Rosselló, F. and Valiente, G. (2009a). Metrics for phylogenetic networks I: Generalizations of the Robinson–Foulds metric, IEEE/ACM Transactions on Computational Biology and Bioinformatics 6(1): 46–61.
- [17] Cardona, G., Llabrés, M., Rosselló, F. and Valiente, G. (2009b). Metrics for phylogenetic networks II: Nodal and triplets metrics, IEEE/ACM Transactions on Computational Biology and Bioinformatics 6(3): 454–469.
- [18] Cardona, G., Rossello, F. and Valiente, G. (2009). Comparison of tree-child phylogenetic networks, IEEE/ACM Transactions on Computational Biology and Bioinformatics 6(4): 552–569.
- [19] Cardona, G., Llabrés, M., Rosselló, F. and Valiente, G. (2010). Nodal distances for rooted phylogenetic trees, Journal of Mathematical Biology 61(2): 253–276.
- [20] Critchlow, D.E., Pearl, D.K. and Qian, C. (1996). The triples distance for rooted bifurcating phylogenetic trees, Systematic Biology 45(3): 323–334.
- [21] Darlu, P. and Guénoche, A. (2011). TreeOfTrees method to evaluate the congruence between gene trees, Journal of Classification 28(3): 390–403.
- [22] Davies, T.J., Barraclough, T.G., Chase, M.W., Soltis, P.S., Soltis, D.E. and Savolainen, V. (2004). Darwin’s abominable mystery: Insights from a supertree of the angiosperms, Proceedings of the National Academy of Sciences of the United States of America 101(7): 1904–1909.
- [23] Felsenstein, J. (2003). Inferring Phylogenies, Sinauer Associates, Sunderland, MA.
- [24] Frąckiewicz, M. and Palus, H. (2011). KHM clustering technique as a segmentation method for endoscopic colour images, International Journal of Applied Mathematics and Computer Science 21(1): 203–209, DOI: 10.2478/v10006-011-0015-0.
- [25] Gabow, H.N. and Tarjan, R.E. (1989). Faster scaling algorithms for network problems, SIAM Journal on Computing 18(5): 1013–1036.
- [26] Górecki, P. and Eulenstein, O. (2012). A Robinson–Foulds measure to compare unrooted trees with rooted trees, in L. Bleris, I.Mandoiu, R. Schwartz and J.Wang (Eds.), Bioinformatics Research and Applications, Lecture Notes in Computer Science, Vol. 7292, Springer, Berlin/Heidelberg, pp. 115–126.
- [27] Gusfield, D. (1991). Efficient algorithms for inferring evolutionary trees, Networks 21(1): 19–28.
- [28] Hayes, M., Walenstein, A. and Lakhotia, A. (2009). Evaluation of malware phylogeny modelling systems using automated variant generation, Journal in Computer Virology 5(4): 335–343.
- [29] Hillis, D.M., Heath, T.A. and John, K.S. (2005). Analysis and visualization of tree space, Systematic Biology 54(3): 471–482.
- [30] Kennedy, M., Page, R. D.M. and Prum, R. (2002). Seabird supertrees: Combining partial estimates of procellariiform phylogeny, The Auk 119(1): 88–108.
- [31] Lin, Y., Rajan, V. and Moret, B.M.E. (2012). A metric for phylogenetic trees based on matching, IEEE/ACM Transactions on Computational Biology and Bioinformatics 9(4): 1014–1022.
- [32] Ma, B., Li, M. and Zhang, L. (1998). On reconstructing species trees from gene trees in term of duplications and losses, Proceedings of the 2nd Annual International Conference on Computational Molecular Biology, RECOMB’98, New York, NY, USA, pp. 182–191.
- [33] McKenzie, A. and Steel, M. (2000). Distributions of cherries for two models of trees, Mathematical Biosciences 164(1): 81–92.
- [34] Munzner, T., Guimbretière, F., Tasiran, S., Zhang, L. and Zhou, Y. (2003). TreeJuxtaposer: Scalable tree comparison using focus+context with guaranteed visibility, ACM Transactions on Graphics 22(3): 453–462.
- [35] Nguyen, N., Mirarab, S. and Warnow, T. (2012). MRL and SuperFine+MRL: New supertree methods, Algorithms for Molecular Biology 7(1): 3.
- [36] Nye, T.M., Liò, P. and Gilks,W.R. (2006). A novel algorithm and web-based tool for comparing two alternative phylogenetic trees, Bioinformatics 22(1): 117–119.
- [37] Orlin, J.B. and Ahuja, R.K. (1992). New scaling algorithms for the assignment and minimum mean cycle problems, Mathematical Programming 54(1–3): 41–56.
- [38] Penny, D., Watson, E.E. and Steel, M.A. (1993). Trees from languages and genes are very similar, Systematic Biology 42(3): 382–384.
- [39] Pompei, S., Loreto, V. and Tria, F. (2011). On the accuracy of language trees, PLoS ONE 6(6): e20109.
- [40] Restrepo, G., Héber, M. and Llanos, E.J. (2007). Three dissimilarity measures to contrast dendrograms, Journal of Chemical Information and Modeling 47(3): 761–770.
- [41] Robinson, D.F. and Foulds, L.R. (1981). Comparison of phylogenetic trees, Mathematical Biosciences 53(1–2): 131–147.
- [42] Sackin, M.J. (1972). “Good” and “bad” phenograms, Systematic Zoology 21(2): 225–226.
- [43] Semple, C. and Steel, M. (2003). Phylogenetics, Oxford University Press, Oxford.
- [44] Shao, K.-T. and Sokal, R.R. (1990). Tree balance, Systematic Zoology 39(3): 266–276.
- [45] Steel, M. A. and Penny, D. (1993). Distributions of tree comparison metrics—some new results, Systematic Biology 42(2): 126–141.
- [46] Stockham, C., Wang, L.-S. and Warnow, T. (2002). Statistically based postprocessing of phylogenetic analysis by clustering, Bioinformatics 18(suppl 1): S285–S293.
- [47] Suri, R. and Warnow, T. (2010). Spruce, Website, http://www.cs.utexas.edu/~phylo/software/spruce/.
- [48] Swenson, M.S., Suri, R., Linder, C.R. and Warnow, T. (2011). An experimental study of Quartets MaxCut and other supertree methods, Algorithms for Molecular Biology 6(1): 7.
- [49] Swofford, D. (2002). PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods). Version 4, Sinauer Associates, Sunderland, MA.
- [50] Wang, J.T., Shan, H., Shasha, D. and Piel, W.H. (2005). Fast structural search in phylogenetic databases, Evolutionary Bioinformatics Online 1: 37–46.
- [51] Williams, W.T. and Clifford, H.T. (1971). On the comparison of two classifications of the same set of elements, Taxon 20(4): 519–522.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-79f9a591-d6b7-4116-8884-0d82c62c316a