Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
The selection of data representation and metric for a given data set is one of the most crucial problems in machine learning, since it affects the results of classification and clustering methods. In this paper we investigate how to combine various data representations and metrics into a single function that reflects the relationships between data set elements better than any single representation-metric pair. Our approach relies on optimizing a linear combination of selected distance measures by least-squares approximation. Applying our method to the classification and clustering of chemical compounds appears to increase the accuracy of these methods.
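To make the approach described in the abstract concrete, the sketch below illustrates one plausible reading of it: pairwise distance matrices from several representation-metric pairs are combined linearly, and the weights are fitted by least-squares approximation against a target dissimilarity. The function names, the binary same-class/different-class target, and the non-negativity clipping are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Minimal sketch (not the authors' implementation): given k pairwise distance
# matrices D_1..D_k computed on the same data set (one per representation-metric
# pair), fit weights w so that sum_i w_i * D_i approximates a target
# dissimilarity T in the least-squares sense.

def fit_combined_metric(distance_matrices, target):
    """Fit weights of a linear combination of distance matrices by least squares."""
    n = target.shape[0]
    iu = np.triu_indices(n, k=1)            # use each unordered pair once
    A = np.column_stack([D[iu] for D in distance_matrices])
    b = target[iu]
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.clip(w, 0.0, None)            # keep the combination a valid dissimilarity

def combined_distance(distance_matrices, weights):
    """Evaluate the fitted combination on the full distance matrices."""
    return sum(w * D for w, D in zip(weights, distance_matrices))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))
    labels = rng.integers(0, 2, size=30)
    # Two example representation-metric pairs: Euclidean and Manhattan distances.
    diff = X[:, None, :] - X[None, :, :]
    D_euc = np.sqrt((diff ** 2).sum(-1))
    D_man = np.abs(diff).sum(-1)
    # Illustrative target: 0 for same-class pairs, 1 for different-class pairs.
    T = (labels[:, None] != labels[None, :]).astype(float)
    w = fit_combined_metric([D_euc, D_man], T)
    D_comb = combined_distance([D_euc, D_man], w)
    print("fitted weights:", w)
```

The combined matrix D_comb can then be passed to any distance-based classifier or clustering routine (e.g. nearest-neighbour classification or k-means on precomputed distances) in place of a single representation-metric pair.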
Keywords
Journal
Year
Volume
Pages
83–92
Physical description
Bibliography: 27 items, figures
Contributors
author
- Faculty of Mathematics and Computer Science, ul. Lojasiewicza 6, 30-348 Kraków
author
- Faculty of Mathematics and Computer Science, ul. Lojasiewicza 6, 30-348 Kraków
Bibliography
- [1] Aczel A., Sounderpandian J., Complete Business Statistics. McGraw Hill, New York 2009.
- [2] Atkeson C., Moore A., Schaal S., Locally weighted learning. Artificial Intelligence Review, 1997, 11, pp. 11–73.
- [3] Bar-Hillel A., Hertz T., Shental N., Weinshall D., Learning a Mahalanobis metric from equivalence constraints. Journal of Machine Learning Research, 2005, 6, pp. 937–965.
- [4] Cover T., Hart P., Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory, 1967, 13, pp. 21–27.
- [5] Cox T.F., Cox M.A.A., Multidimensional Scaling. Chapman and Hall, London 1994.
- [6] Deng Z., Chuaqui C., Singh J., Knowledge-based design of target-focused libraries using protein-ligand interaction constraints. Journal of Medicinal Chemistry, 2006, 49(2), pp. 490–500.
- [7] Domeniconi C., Gunopulos D., Adaptive nearest neighbor classification using support vector machines. Advances in Neural Information Processing Systems, 2002, 14, pp. 665–672.
- [8] Geppert H., Vogt M., Bajorath J., Current Trends in Ligand-Based Virtual Screening: Molecular Representations, Data Mining Methods, New Application Areas, and Performance Evaluation. Journal of Chemical Information and Modeling, 2010, 50, pp. 205–216.
- [9] Goldberger J., Roweis S., Hinton G., Salakhutdinov R., Neighbourhood Components Analysis. Advances in Neural Information Processing Systems, 2004, 17, pp. 513–520.
- [10] Hastie T., Tibshirani R., Discriminant Adaptive Nearest Neighbor Classification. IEEE Trans. Pattern Anal. Mach. Intell., 1996, 18, pp. 607–616.
- [11] Hubert L., Arabie P., Comparing partitions. Journal of Classification, 1985, 2, pp. 193–218.
- [12] Jaakkola T.S., Haussler D., Exploiting Generative Models in Discriminative Classifiers. Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II, 1999, pp. 487–493.
- [13] Kedem D., Tyree S., Weinberger K.Q., Sha F., Lanckriet G., Non-linear Metric Learning. Advances in Neural Information Processing Systems, 2012, 25, pp. 2582–2590. Available via http://books.nips.cc/papers/files/nips25/NIPS2012_1223.pdf.
- [14] Klekota J., Roth F.P., Chemical Substructures That Enrich for Biological Activity. Bioinformatics, 2008, 24(21), pp. 2518–2525.
- [15] Kohavi R., A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI ’95), 1995, pp. 1137–1143.
- [16] Lloyd S., Least Squares Quantization in PCM. IEEE Trans. Inf. Theor., 1982, 28, pp. 129–137.
- [17] Roweis S.T., Saul L.K., Nonlinear dimensionality reduction by locally linear embedding. Science, 2000, 290, pp. 2323–2326.
- [18] Scholkopf B., Smola A.J., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2001.
- [19] Shalev-Shwartz S., Singer Y., Ng A.Y., Online and Batch Learning of Pseudometrics. Proceedings of the Twenty-first International Conference on Machine Learning (ICML ’04), 2004, pp. 743–750.
- [20] Shental N., Hertz T., Weinshall D., Pavel M., Adjustment Learning and Relevant Component Analysis. Proceedings of the 7th European Conference on Computer Vision-Part IV (ECCV ’02), 2002, pp. 776–792.
- [21] Śmieja M., Warszycki D., Tabor J., Bojarski A.J., Asymmetric Clustering Index in a Case Study of 5-HT1A Receptor Ligands. PLoS ONE, 9(7): e102069, doi:10.1371/journal.pone.0102069, 2014.
- [22] Sneath P.H.A., The Application of Computers to Taxonomy. J. Gen. Microbiol., 1957, 17, pp. 201–226.
- [23] Takeda H., Farsiu S., Milanfar P., Robust kernel regression for restoration and reconstruction of images from sparse noisy data. IEEE International Conference on Image Processing, 2006, pp. 1257–1260.
- [24] Xing E.P., Ng A.Y., Jordan M.I., Russell S., Distance Metric Learning, With Application To Clustering With Side-Information. Advances in Neural Information Processing Systems, 2003, 15, pp. 505–512.
- [25] Warszycki D., Mordalski S., Kristiansen K., Kafel R., Sylte I., Chilmonczyk Z., Bojarski A.J., A Linear Combination of Pharmacophore Hypotheses as a New Tool in Search of New Active Compounds – An Application for 5-HT1A Receptor Ligands. PLoS ONE, 8(12): e84510, doi:10.1371/journal.pone.0084510, 2013.
- [26] Weinberger K.Q., Saul L.K., Distance Metric Learning for Large Margin Nearest Neighbor Classification. J. Mach. Learn. Res., 2009, 10, pp. 207–244.
- [27] Weinberger K.Q., Saul L.K., Fast solvers and efficient implementations for distance metric learning. ACM International Conference Proceeding Series, 2008, 307, pp. 1160–1167.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-13fb7a2a-1681-43c9-9803-7400dbdede32