

Article title

Extreme Classification under Limited Space and Time Budget

Publication languages
EN
Abstracts
EN
We discuss a new framework for solving extreme classification (i.e., learning problems with an extremely large label space) in which we reduce the original problem to a structured prediction problem. This reduction yields learning algorithms that work under a strict time and space budget. We mainly focus on a recently introduced algorithm, referred to as LTLS, which is, to the best of our knowledge, the first truly logarithmic-time and logarithmic-space (in the number of labels) method for extreme classification. We compare this algorithm with two other approaches that also rely on a transformation to structured prediction problems. The first encodes the original labels as binary sequences; the second follows the label tree approach. The comparison reveals the trade-off between computational complexity (in time and space) and predictive performance.
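As a rough illustration of the first alternative approach mentioned in the abstract (encoding labels as binary sequences), a minimal Python sketch might look as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the class name, the choice of one logistic regression per bit, and the random data are all hypothetical. Each of K labels receives a ceil(log2 K)-bit code and one binary classifier is trained per bit, so both training and prediction use O(log K) models instead of O(K).

    # Minimal sketch (an assumption, not the paper's code): reduce K-way
    # classification to ceil(log2 K) binary problems, one per code bit.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    class BinaryCodeReduction:
        def __init__(self, n_labels):
            self.n_bits = max(1, int(np.ceil(np.log2(n_labels))))
            self.models = [LogisticRegression(max_iter=1000)
                           for _ in range(self.n_bits)]

        def fit(self, X, y):
            # Bit b of each label index is the target of the b-th model.
            for b, model in enumerate(self.models):
                model.fit(X, (y >> b) & 1)
            return self

        def predict(self, X):
            # Reassemble predicted bits into label indices; a decoded code
            # may not match any valid label when K is not a power of two.
            bits = np.stack([m.predict(X) for m in self.models], axis=1)
            return bits @ (1 << np.arange(self.n_bits))

    # Usage: 1024 labels need only 10 binary classifiers.
    X = np.random.randn(500, 20)
    y = np.random.randint(0, 1024, size=500)
    print(BinaryCodeReduction(1024).fit(X, y).predict(X[:3]))

LTLS itself takes the structured-prediction view further: labels are identified with paths in a small trellis graph and prediction amounts to Viterbi-style decoding [1, 2], which keeps both time and space logarithmic in the number of labels [4].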
Pages
9–23
Physical description
Bibliography: 35 items, figures.
Authors
author
  • Institute of Computing Science, Poznan University of Technology, Poznań, Poland
  • Institute of Computing Science, Poznan University of Technology, Poznań, Poland
  • Microsoft Research, Redmond, USA
Bibliography
  • [1] Viterbi A.J., Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967.
  • [2] Seshadri N., Sundberg C.E.W., List Viterbi decoding algorithms with applications. IEEE Transactions on Communications, 1994.
  • [3] Doppa J., Fern A., Tadepalli P., HC-Search: Learning heuristics and cost functions for structured prediction. Journal of Artificial Intelligence Research (JAIR), 2013.
  • [4] Jasinska K., Karampatziakis N., Log-time and log-space extreme classification. In: Workshop on Extreme Classification at Neural Information Processing Systems (NIPS), 2016.
  • [5] Dembczynski K., Kotłowski W., Waegeman W., Busa-Fekete R., Hüllermeier E., Consistency of probabilistic classifier trees. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD). Springer-Verlag, 2016.
  • [6] Jasinska K., Dembczynski K., Busa-Fekete R., Pfannschmidt K., Klerx T., Hüllermeier E., Extreme F-measure maximization using sparse probability estimates. In: International Conference on Machine Learning (ICML), 2016.
  • [7] Yen I.E.H., Huang X., Ravikumar P., Zhong K., Dhillon I., Pd-sparse: A primal and dual sparse approach to extreme multiclass and multilabel classification. In: International Conference on Machine Learning (ICML), 2016.
  • [8] Bhatia K., Jain H., Kar P., Varma M., Jain P., Sparse local embeddings for extreme multi-label classification. In: Neural Information Processing Systems (NIPS), 2015.
  • [9] Yu H., Jain P., Kar P., Dhillon I.S., Large-scale multi-label learning with missing labels. In: International Conference on Machine Learning (ICML), 2014.
  • [10] Weston J., Bengio S., Usunier N., Wsabie: Scaling up to large vocabulary image annotation. In: International Joint Conference on Artificial Intelligence (IJCAI), 2011.
  • [11] Mineiro P., Karampatziakis N., Fast label embeddings via randomized linear algebra. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2015.
  • [12] Prabhu Y., Varma M., FastXML: A fast, accurate and stable tree-classifier for extreme multi-label learning. In: Knowledge Discovery and Data Mining (KDD), 2014.
  • [13] Choromanska A., Langford J., Logarithmic time online multiclass prediction. In: Neural Information Processing Systems (NIPS), 2015.
  • [14] Niculescu-Mizil A., Abbasnejad E., Label filters for large scale multilabel classification. In: Workshop on Extreme Classification at the International Conference on Machine Learning (ICML), 2015.
  • [15] Weston J., Makadia A., Yee H., Label partitioning for sublinear ranking. In: International Conference on Machine Learning (ICML), 2013.
  • [16] Cissé M., Usunier N., Artières T., Gallinari P., Robust bloom filters for large multilabel classification tasks. In: Neural Information Processing Systems (NIPS), 2013.
  • [17] Jasinska K., Dembczynski K., Consistent label tree classifiers for extreme multilabel classification. In: Workshop on Extreme Classification at the International Conference on Machine Learning (ICML), 2015.
  • [18] Vijayanarasimhan S., Shlens J., Monga R., Yagnik J., Deep networks with large output spaces. In: Workshop contribution at the International Conference on Learning Representations (ICLR), 2014.
  • [19] Shrivastava A., Li P., Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS). In: Uncertainty in Artificial Intelligence (UAI), 2015.
  • [20] Fagin R., Lotem A., Naor M., Optimal aggregation algorithms for middleware. In: Principles of Database Systems (PODS). ACM 2001.
  • [21] Stock M., Pahikkala T., Airola A., De Baets B., Waegeman W., Efficient pairwise learning using kernel ridge regression: an exact two-step method. Computing Research Repository (CoRR), 2016.
  • [22] Indyk P., Motwani R., Approximate nearest neighbors: Towards removing the curse of dimensionality. In: ACM Symposium on Theory of Computing, 1998.
  • [23] Beygelzimer A., Langford J., Zadrozny B., Machine learning techniques: reductions between prediction quality metrics. In: Performance Modeling and Engineering. Springer-Verlag, 2008.
  • [24] Beygelzimer A., Daumé H., Langford J., Mineiro P., Learning reductions that really work. In: Proceedings of the IEEE, 2016.
  • [25] Dietterich T., Bakiri G., Solving multiclass learning problems via error-correcting output codes. Journal of Machine Learning Research (JMLR), 1996.
  • [26] Huffman D., A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers (IRE), 1952.
  • [27] Weinberger K., Dasgupta A., Langford J., Smola A., Attenberg J., Feature hashing for large scale multitask learning. In: International Conference on Machine Learning (ICML), ACM, 2009.
  • [28] Dembczynski K., Waegeman W., Cheng W., Hüllermeier E., An analysis of chaining in multi-label classification. In: European Conference on Artificial Intelligence (ECAI), 2012.
  • [29] Mena D., Montanes E., Quevedo J.R., del Coz J.J., Using A* for inference in probabilistic classifier chains. In: International Joint Conference on Artificial Intelligence (IJCAI), 2015.
  • [30] Beygelzimer A., Langford J., Lifshits Y., Sorkin G.B., Strehl A.L., Conditional probability tree estimation analysis and algorithms. In: Uncertainty in Artificial Intelligence (UAI), 2009.
  • [31] Fox J., Applied regression analysis, linear models, and related methods. Sage, 1997.
  • [32] Morin F., Bengio Y., Hierarchical probabilistic neural network language model. In: Artificial Intelligence and Statistics Conference (AISTATS), 2005, pp. 246–252.
  • [33] Lafferty J.D., McCallum A., Pereira F.C.N., Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: International Conference on Machine Learning (ICML), 2001.
  • [34] Tsochantaridis I., Joachims T., Hofmann T., Altun Y., Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research (JMLR), 2005.
  • [35] Langford J., Strehl A., Li L., Vowpal Wabbit, 2007. http://mloss.org/software/view/53/
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for activities disseminating science (2017 tasks).
YADDA identifier
bwmeta1.element.baztech-e258df38-e6c0-492e-bea7-108bd0b72aee