

Article title

Extreme Classification under Limited Space and Time Budget

Publication languages
EN
Abstracts
EN
We discuss a new framework for solving extreme classification (i.e., learning problems with an extremely large label space) in which we reduce the original problem to a structured prediction problem. This reduction yields learning algorithms that work under a strict time and space budget. We mainly focus on a recently introduced algorithm, referred to as LTLS, which is, to the best of our knowledge, the first truly logarithmic-time and logarithmic-space (in the number of labels) method for extreme classification. We compare this algorithm with two other approaches that also rely on a transformation to structured prediction problems. The first encodes the original labels as binary sequences; the second follows the label tree approach. The comparison reveals the trade-off between computational complexity (in time and space) and predictive performance.
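As a rough illustration of the first alternative approach mentioned in the abstract (encoding labels as binary sequences), a minimal Python sketch might look as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the class name, the choice of one logistic regression per bit, and the random data are all hypothetical. Each of K labels receives a ceil(log2 K)-bit code and one binary classifier is trained per bit, so both training and prediction use O(log K) models instead of O(K).

    # Minimal sketch (an assumption, not the paper's code): reduce K-way
    # classification to ceil(log2 K) binary problems, one per code bit.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    class BinaryCodeReduction:
        def __init__(self, n_labels):
            self.n_bits = max(1, int(np.ceil(np.log2(n_labels))))
            self.models = [LogisticRegression(max_iter=1000)
                           for _ in range(self.n_bits)]

        def fit(self, X, y):
            # Bit b of each label index is the target of the b-th model.
            for b, model in enumerate(self.models):
                model.fit(X, (y >> b) & 1)
            return self

        def predict(self, X):
            # Reassemble predicted bits into label indices; a decoded code
            # may not match any valid label when K is not a power of two.
            bits = np.stack([m.predict(X) for m in self.models], axis=1)
            return bits @ (1 << np.arange(self.n_bits))

    # Usage: 1024 labels need only 10 binary classifiers.
    X = np.random.randn(500, 20)
    y = np.random.randint(0, 1024, size=500)
    print(BinaryCodeReduction(1024).fit(X, y).predict(X[:3]))

LTLS itself takes the structured-prediction view further: labels are identified with paths in a small trellis graph and prediction amounts to Viterbi-style decoding [1, 2], which keeps both time and space logarithmic in the number of labels [4].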
Pages
9–23
Physical description
Bibliography: 35 items, figures.
Authors
author
  • Institute of Computing Science, Poznan University of Technology, Poznań, Poland
  • Institute of Computing Science, Poznan University of Technology, Poznań, Poland
  • Microsoft Research, Redmond, USA
Bibliography
  • [1] Viterbi A.J., Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967.
  • [2] Seshadri N., Sundberg C.E.W., List Viterbi decoding algorithms with applications. IEEE Transactions on Communications, 1994.
  • [3] Doppa J., Fern A., Tadepalli P., HC-Search: Learning heuristics and cost functions for structured prediction. Journal of Artificial Intelligence Research (JAIR), 2013.
  • [4] Jasinska K., Karampatziakis N., Log-time and log-space extreme classification. In: Workshop on Extreme Classification at Neural Information Processing Systems (NIPS), 2016.
  • [5] Dembczynski K., Kotłowski W., Waegeman W., Busa-Fekete R., Hüllermeier E., Consistency of probabilistic classifier trees. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD). Springer-Verlag, 2016.
  • [6] Jasinska K., Dembczynski K., Busa-Fekete R., Pfannschmidt K., Klerx T., Hüllermeier E., Extreme F-measure maximization using sparse probability estimates. In: International Conference on Machine Learning (ICML), 2016.
  • [7] Yen I.E.H., Huang X., Ravikumar P., Zhong K., Dhillon I., Pd-sparse: A primal and dual sparse approach to extreme multiclass and multilabel classification. In: International Conference on Machine Learning (ICML), 2016.
  • [8] Bhatia K., Jain H., Kar P., Varma M., Jain P., Sparse local embeddings for extreme multi-label classification. In: Neural Information Processing Systems (NIPS), 2015.
  • [9] Yu H., Jain P., Kar P., Dhillon I.S., Large-scale multi-label learning with missing labels. In: International Conference on Machine Learning (ICML), 2014.
  • [10] Weston J., Bengio S., Usunier N., Wsabie: Scaling up to large vocabulary image annotation. In: International Joint Conference on Artificial Intelligence (IJCAI), 2011.
  • [11] Mineiro P., Karampatziakis N., Fast label embeddings via randomized linear algebra. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2015.
  • [12] Prabhu Y., Varma M., FastXML: A fast, accurate and stable tree-classifier for extreme multi-label learning. In: Knowledge Discovery and Data Mining (KDD), 2014.
  • [13] Choromanska A., Langford J., Logarithmic time online multiclass prediction. In: Neural Information Processing Systems (NIPS), 2015.
  • [14] Niculescu-Mizil A., Abbasnejad E., Label filters for large scale multilabel classification. In: Workshop on Extreme Classification at the International Conference on Machine Learning (ICML), 2015.
  • [15] Weston J., Makadia A., Yee H., Label partitioning for sublinear ranking. In: International Conference on Machine Learning (ICML), 2013.
  • [16] Cissé M., Usunier N., Artières T., Gallinari P., Robust bloom filters for large multilabel classification tasks. In: Neural Information Processing Systems (NIPS), 2013.
  • [17] Jasinska K., Dembczynski K., Consistent label tree classifiers for extreme multilabel classification. In: Workshop on Extreme Classification at the International Conference on Machine Learning (ICML), 2015.
  • [18] Vijayanarasimhan S., Shlens J., Monga R., Yagnik J., Deep networks with large output spaces. In: Workshop contribution at the International Conference on Learning Representations (ICLR), 2014.
  • [19] Shrivastava A., Li P., Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS). In: Uncertainty in Artificial Intelligence (UAI), 2015.
  • [20] Fagin R., Lotem A., Naor M., Optimal aggregation algorithms for middleware. In: Principles of Database Systems (PODS). ACM 2001.
  • [21] Stock M., Pahikkala T., Airola A., De Baets B., Waegeman W., Efficient pairwise learning using kernel ridge regression: an exact two-step method. Computing Research Repository (CoRR), 2016.
  • [22] Indyk P., Motwani R., Approximate nearest neighbors: Towards removing the curse of dimensionality. In: ACM Symposium on Theory of Computing, 1998.
  • [23] Beygelzimer A., Langford J., Zadrozny B., Machine learning techniques: reductions between prediction quality metrics. In: Performance Modeling and Engineering. Springer-Verlag, 2008.
  • [24] Beygelzimer A., Daumé H., Langford J., Mineiro P., Learning reductions that really work. In: Proceedings of the IEEE, 2016.
  • [25] Dietterich T., Bakiri G., Solving multiclass learning problems via error-correcting output codes. Journal of Machine Learning Research (JMLR), 1996.
  • [26] Huffman D., A method for the construction of minimum-redundancy codes. Proceedings of the Institute of Radio Engineers (IRE), 1952.
  • [27] Weinberger K., Dasgupta A., Langford J., Smola A., Attenberg J., Feature hashing for large scale multitask learning. In: International Conference on Machine Learning (ICML), ACM, 2009.
  • [28] Dembczynski K., Waegeman W., Cheng W., Hüllermeier E., An analysis of chaining in multi-label classification. In: European Conference on Artificial Intelligence (ECAI), 2012.
  • [29] Mena D., Montanes E., Quevedo J.R., del Coz J.J., Using A* for inference in probabilistic classifier chains. In: International Joint Conference on Artificial Intelligence (IJCAI), 2015.
  • [30] Beygelzimer A., Langford J., Lifshits Y., Sorkin G.B., Strehl A.L., Conditional probability tree estimation analysis and algorithms. In: Uncertainty in Artificial Intelligence (UAI), 2009.
  • [31] Fox J., Applied regression analysis, linear models, and related methods. Sage, 1997.
  • [32] Morin F., Bengio Y., Hierarchical probabilistic neural network language model. In: Artificial Intelligence and Statistics Conference (AISTATS), 2005, pp. 246–252.
  • [33] Lafferty J.D., McCallum A., Pereira F.C.N., Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: International Conference on Machine Learning (ICML), 2001.
  • [34] Tsochantaridis I., Joachims T., Hofmann T., Altun Y., Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research (JMLR), 2005.
  • [35] Langford J., Strehl A., Li L., Vowpal Wabbit, 2007. http://mloss.org/software/view/53/
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for activities disseminating science (2017 tasks).
YADDA identifier
bwmeta1.element.baztech-e258df38-e6c0-492e-bea7-108bd0b72aee