Recut : A Seriation Algorithm Balancing Smooth Display and Aggregated Features

Qi, J.; Ohsawa, Y.

doi:10.3233/FI-2016-1387

Artykuł - szczegóły

Tytuł artykułu

Recut : A Seriation Algorithm Balancing Smooth Display and Aggregated Features

Autorzy

Qi J. , Ohsawa Y.

Wybrane pełne teksty z tego czasopisma

https://fi.episciences.org/

Identyfikatory

DOI

10.3233/FI-2016-1387

Warianty tytułu

Języki publikacji

Abstrakty

Previous seriation algorithms are confronted with a balance problem. Some approaches provide permutations with perfect wholeness, where matrix rows/columns are associated with increasing or decreasing gradient. However, this smooth permutation may lead to the blurred representation of the data structure, such as clustering structures and detailed structures inside clusters. Some other approaches indicate these structures well by tighter aggregating similar rows/columns, but this aggregation is alway at the cost of losing necessary coherence of the matrix rows/columns. In this paper, we introduce a seriation algorithm that aims at balancing the smoothness of the permutation and the clarity of the matrix structure. The permutation algorithm greedily and recursively replaces high-dissimilar object pairs with low-dissimilar ones, and the optimization algorithm searches the global optimizing solution by applying the simulated annealing algorithm. A comparison study shows both empirical and statistical evidence that Recut can provide more accurate and visually appropriate permutation by considering the balance problem.

Słowa kluczowe

seriation matrix reordering matrix visualization data visualization data analysis

Wydawca

Polskie Towarzystwo Matematyczne

Czasopismo

Fundamenta Informaticae

Rocznik

2016

Tom

Vol. 146, nr 3

Strony

293--304

Opis fizyczny

Bibliogr. 17 poz., rys., tab., wykr.

Twórcy

autor

Qi J.

3225347132@mail.ecc.u-tokyo.ac.jp

Department of Systems Innovation, School of Engineering, The University of Tokyo, 7 Chome-3-1 Hongo, Bunkyo, Tokyo, Japan

autor

Ohsawa Y.

ohsawa@sys.t.u-tokyo.ac.jp

Department of Systems Innovation, School of Engineering, The University of Tokyo, 7 Chome-3-1 Hongo, Bunkyo, Tokyo, Japan

Bibliografia

[1] Hubert L, Arabie P, Meulman J. Combinatorial data analysis: Optimization by dynamic programming. vol. 6. SIAM; 2001. doi:10.1137/1.9780898718553.
[2] Chen CH. Generalized association plots: Information visualization via iteratively generated correlation matrices. Statistica Sinica. 2002;p. 7-29.
[3] Brusco MJ, Köhn HF, Stahl S. Heuristic implementation of dynamic programming for matrix permutation problems in combinatorial data analysis. Psychometrika. 2008;73(3):503-522. doi:10.1007/s11336-007-9049-5.
[4] Brusco MJ, Stahl S. Branch-and-bound applications in combinatorial data analysis. Springer Science & Business Media; 2006. doi:10.1007/0-387-28810-4.
[5] Caraux G, Pinloche S. PermutMatrix: a graphical environment to arrange gene expression profiles in optimal linear order. Bioinformatics. 2005;21(7):1280-1281. doi:10.1093/bioinformatics/bti141.
[6] Hubert L, Schultz J. Quadratic assignment as a general data analysis strategy. British journal of mathematical and statistical psychology. 1976;29(2):190-241. doi:10.1111/j.2044-8317.1976.tb00714.x.
[7] Barnard ST, Pothen A, Simon H. A spectral algorithm for envelope reduction of sparse matrices. Numerical linear algebra with applications. 1995;2(4):317-334. doi:10.1002/nla.1680020402.
[8] Rodgers JL, Thompson TD. Seriation and multidimensional scaling: A data analysis approach to scaling asymmetric proximity matrices. Applied psychological measurement. 1992;16(2):105-117. doi:10.1177/014662169201600201.
[9] Ding C, He X. Linearized cluster assignment via spectral ordering. In: Proceedings of the twenty-first international conference on Machine learning. ACM; 2004. p. 30. doi:10.1145/1015330.1015407.
[10] Climer S, Zhang W. Rearrangement clustering: Pitfalls, remedies, and applications. The Journal of Machine Learning Research. 2006;7:919-943. Available from: http://dl.acm.org/citation.cfm?id=1248547.1248579.
[11] McCormick Jr WT, Schweitzer PJ, White TW. Problem decomposition and data reorganization by a clustering technique. Operations Research. 1972;20(5):993-1009.
[12] Niermann S. Optimizing the ordering of tables with evolutionary computation. The American Statistician. 2012. doi:10.1198/000313005X22770.
[13] Lenstra J. Technical Note-Clustering a Data Array and the Traveling-Salesman Problem. Operations Research. 1974;22(2):413-414. doi:10.1287/opre.22.2.413.
[14] Fisher RA. The use of multiple measurements in taxonomic problems. Annals of eugenics. 1936;7(2):179-188.
[15] Bezdek JC, Hathaway RJ. VAT: A tool for visual assessment of (cluster) tendency. In: Neural Networks, 2002. IJCNN’02. Proceedings of the 2002 International Joint Conference on. vol. 3. IEEE; 2002. p. 2225-2230.
[16] Buchta C, Hornik K, Hahsler M. Getting things in order: an introduction to the R package seriation. Journal of Statistical Software. 2008;25(3):1-34. doi:10.18637/jss.v025.i03.
[17] Gruvaeus G, Wainer H. Two Additions to Hierarchical Cluster Analysis. British Journal of Mathematical and Statistical Psychology. 1972;25(2):200-206. doi:10.1111/j.2044-8317.1972.tb00491.x.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-096ea6a6-4911-4ad1-9f00-78e31351101e