Article title

Exponential machines

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Modeling interactions between features improves the performance of machine learning solutions in many domains (e.g. recommender systems or sentiment analysis). In this paper, we introduce Exponential machines (ExM), a predictor that models all interactions of every order. The key idea is to represent an exponentially large tensor of parameters in a factorized format called tensor train (TT). The tensor train format regularizes the model and lets you control the number of underlying parameters. To train the model, we develop a stochastic Riemannian optimization procedure, which allows us to fit tensors with 2^160 entries. We show that the model achieves state-of-the-art performance on synthetic data with high-order interactions and that it works on par with high-order factorization machines on a recommender system dataset MovieLens 100K.
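To make the key idea of the abstract concrete, below is a minimal NumPy sketch written for this record (it is not the authors' code; per references [1] and [20], their implementation builds on TensorFlow and the t3f library). With the weight tensor W stored as TT-cores G_1, ..., G_d, the prediction over all 2^d feature subsets collapses to a chain of small matrix products, prod_k (G_k[0] + x_k * G_k[1]), so evaluation is linear in d. The function name and core shapes are illustrative assumptions.

```python
import numpy as np

def exm_predict(tt_cores, x):
    """Score one example with a weight tensor stored in TT format.

    tt_cores[k] has shape (r_k, 2, r_{k+1}); slice 0 of the middle axis
    stands for the constant factor 1, slice 1 for the feature x[k]. The
    implicit weight tensor has 2**d entries but is never materialized.
    """
    acc = np.ones((1, tt_cores[0].shape[0]))  # leading TT-rank r_0 = 1
    for core, x_k in zip(tt_cores, x):
        # One link of the chain product: contract with G_k[0] + x_k * G_k[1].
        acc = acc @ (core[:, 0, :] + x_k * core[:, 1, :])
    return acc.item()  # trailing TT-rank is 1, so acc ends up 1 x 1

# Toy usage: d = 8 features with TT-rank 4 gives 2**8 implicit weights,
# but only about 2 * d * r**2 stored parameters.
rng = np.random.default_rng(0)
d, r = 8, 4
ranks = [1] + [r] * (d - 1) + [1]
cores = [rng.normal(size=(ranks[k], 2, ranks[k + 1])) for k in range(d)]
print(exm_predict(cores, rng.normal(size=d)))
```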
Year
Pages
789–797
Physical description
Bibliography: 31 items, figures, charts, tables
Authors
author
  • National Research University Higher School of Economics
  • Institute of Numerical Mathematics RAS
author
  • Federal Research Center “Computer Science and Control” RAS
author
  • Institute of Numerical Mathematics RAS
  • Skolkovo Institute of Science and Technology
Bibliography
  • [1] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems”, 2015. Software available from tensorflow.org.
  • [2] I. Bayer, “fastFM: A library for factorization machines”, Journal of Machine Learning Research, 2016.
  • [3] M. Blondel, A. Fujino, N. Ueda, and M. Ishihata, “Higher-order factorization machines”, Advances in Neural Information Processing Systems 29 (NIPS), 2016.
  • [4] M. Blondel, M. Ishihata, A. Fujino, and N. Ueda, “Polynomial networks and factorization machines: New insights and efficient training algorithms”, In Advances in Neural Information Processing Systems 29 (NIPS), 2016.
  • [5] A. Bordes, S. Ertekin, J. Weston, and L. Bottou, “Fast kernel classifiers with online and active learning”, The Journal of Machine Learning Research, 6, 1579–1619, 2005.
  • [6] B.E. Boser, I.M. Guyon, and V.N. Vapnik, “A training algorithm for optimal margin classifiers”, In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152, 1992.
  • [7] J.D. Carroll and J.J. Chang, “Analysis of individual differences in multidimensional scaling via an n-way generalization of ‘Eckart-Young’ decomposition”, Psychometrika, 35, 283–319, 1970.
  • [8] A. Cichocki, N. Lee, I. Oseledets, A.-H. Phan, Q. Zhao, and D. Mandic, “Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions”, Foundations and Trends® in Machine Learning, 9(4‒5), 249–429, 2016.
  • [9] A. Cichocki, A.-H. Phan, Q. Zhao, N. Lee, I. Oseledets, M. Sugiyama, and D. Mandic, “Tensor networks for dimensionality reduction and large-scale optimization: Part 2 applications and future perspectives”, Foundations and Trends® in Machine Learning, 9(6), 431–673, 2017.
  • [10] D. Dheeru and E.K. Taniskidou, “UCI machine learning repository”, 2017.
  • [11] F.M. Harper and J.A. Konstan, “The MovieLens datasets: History and context”, ACM Transactions on Interactive Intelligent Systems (TiiS), 2015.
  • [12] R.A. Harshman, “Foundations of the PARAFAC procedure: Models and conditions for an explanatory multimodal factor analysis”, UCLA Working Papers in Phonetics, 16, 1–84, 1970.
  • [13] S. Holtz, T. Rohwedder, and R. Schneider, “On manifolds of tensors of fixed TT-rank”, Numerische Mathematik, pages 701–731, 2012.
  • [14] V. Khrulkov, A. Novikov, and I. Oseledets, “Expressive power of recurrent neural networks”, In International Conference on Learning Representations (ICLR), 2018.
  • [15] D. Kingma and J. Ba, “Adam: A method for stochastic optimization”, In International Conference on Learning Representations (ICLR), 2015.
  • [16] V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets, and V. Lempitsky, “Speeding-up convolutional neural networks using fine-tuned CP-decomposition”, In International Conference on Learning Representations (ICLR), 2014.
  • [17] R. Livni, S. Shalev-Shwartz, and O. Shamir, “On the computational efficiency of training neural networks”, In Advances in Neural Information Processing Systems 27 (NIPS), 2014.
  • [18] C. Lubich, I.V. Oseledets, and B. Vandereycken, “Time integration of tensor trains”, SIAM Journal on Numerical Analysis, pages 917–941, 2015.
  • [19] G. Meyer, S. Bonnabel, and R. Sepulchre, “Regression on fixed-rank positive semidefinite matrices: A Riemannian approach”, The Journal of Machine Learning Research, 593–625, 2011.
  • [20] A. Novikov, P. Izmailov, V. Khrulkov, M. Figurnov, and I. Oseledets, “Tensor Train decomposition on TensorFlow (T3F)”, arXiv preprint arXiv:1801.01928, 2018.
  • [21] A. Novikov, D. Podoprikhin, A. Osokin, and D. Vetrov, “Tensorizing neural networks”, In Advances in Neural Information Processing Systems 28 (NIPS), 2015.
  • [22] I.V. Oseledets, “Tensor-train decomposition”, SIAM J. Scientific Computing, 33(5), 2295–2317, 2011.
  • [23] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python”, Journal of Machine Learning Research, 12, 2825–2830, 2011.
  • [24] S. Rendle, “Factorization machines”, In Data Mining (ICDM), 2010 IEEE 10th International Conference on, pages 995–1000, 2010.
  • [25] U. Schollwöck, “The density-matrix renormalization group in the age of matrix product states”, Annals of Physics, 326(1), 96–192, 2011.
  • [26] E. Stoudenmire and D.J. Schwab, “Supervised learning with tensor networks”, In Advances in Neural Information Processing Systems 29 (NIPS), 2016.
  • [27] M. Tan, I.W. Tsang, L. Wang, B. Vandereycken, and S.J. Pan, “Riemannian pursuit for big matrix recovery”, In Proceedings of The 31st International Conference on Machine Learning (ICML), 2014.
  • [28] S. Wahls, V. Koivunen, H.V. Poor, and M. Verhaegen, “Learning multidimensional Fourier series with tensor trains”, In Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on, pages 394–398. IEEE, 2014.
  • [29] Z. Xu and Y. Ke, “Stochastic variance reduced Riemannian eigensolver”, arXiv preprint arXiv:1605.08233, 2016.
  • [30] J. Yang and A. Gittens, “Tensor machines for learning target-specific polynomial features”, arXiv preprint arXiv:1504.01697, 2015.
  • [31] H. Zhang, S.J. Reddi, and S. Sra, “Riemannian SVRG: Fast stochastic optimization on Riemannian manifolds”, Advances in Neural Information Processing Systems 29 (NIPS), 2016.
Notes
Record created under agreement 509/P-DUN/2018 with funds from the Ministry of Science and Higher Education (MNiSW) allocated to activities promoting science (2019).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-0df3b067-aabc-4060-8afa-3fbb598d3783