Article title

Word Embeddings for Morphologically Complex Languages

Authors
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Recent methods for learning word embeddings, such as GloVe and Word2Vec, have succeeded in representing semantic and syntactic relations spatially. We extend GloVe by introducing separate vectors for the base form and the grammatical form of a word, using a morphosyntactic dictionary for this purpose. This allows the vectors to capture the properties of words better. We also present model results on the word analogy test and introduce a new test based on WordNet.
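The word analogy test mentioned in the abstract is commonly run by vector arithmetic: for a question "a is to b as c is to ?", the answer is the word whose vector is closest in cosine similarity to b - a + c. The sketch below illustrates that general procedure only; the toy vectors, the word list, and the function names `cosine` and `solve_analogy` are invented for illustration and are not taken from the paper, and the paper's split into base-form and grammatical-form vectors is not modeled here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Return the word d maximizing cos(d, b - a + c), excluding a, b and c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Hypothetical 3-d toy vectors (axes: "cat-ness", "dog-ness", "plural-ness").
# Real embeddings are learned from a corpus; these are hand-made assumptions.
toy_vectors = {
    "kot":  [1.0, 0.0, 0.0],    # cat, singular
    "koty": [1.0, 0.0, 1.0],    # cat, plural
    "pies": [0.0, 1.0, 0.0],    # dog, singular
    "psy":  [0.0, 1.0, 1.0],    # dog, plural
    "dom":  [-1.0, -1.0, 0.0],  # house, used as a distractor
}

answer = solve_analogy(toy_vectors, "kot", "koty", "pies")
print(answer)
```

With these toy vectors the plural offset is the same for both nouns, so the question "kot is to koty as pies is to ?" resolves to the plural of pies.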
Year
Volume
Pages
127–138
Physical description
Bibliography: 17 items.
Contributors
  • Department of Theoretical Computer Science, Faculty of Mathematics and Computer Science, Jagiellonian University, ul. prof. Stanisława Łojasiewicza 6, 30-348 Kraków, Poland
Bibliography
  • [1] Manning C.D., Raghavan P., Schütze H., Introduction to Information Retrieval. Cambridge University Press, 2008.
  • [2] Sebastiani F., Machine learning in automated text categorization. ACM computing surveys (CSUR), 2002, 34 (1), pp. 1–47.
  • [3] Tellex S., Katz B., Lin J., Fernandes A., Marton G., Quantitative evaluation of passage retrieval algorithms for question answering. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, 2003, pp. 41–47.
  • [4] Turian J., Ratinov L., Bengio Y., Word representations: a simple and general method for semi-supervised learning. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2010, pp. 384–394.
  • [5] Socher R., Bauer J., Manning C.D., Ng A.Y., Parsing with compositional vector grammars. In: ACL (1), 2013, pp. 455–465.
  • [6] Mikolov T., Yih W.t., Zweig G., Linguistic regularities in continuous space word representations. In: HLT-NAACL. vol. 13., 2013, pp. 746–751.
  • [7] Mikolov T., Sutskever I., Chen K., Corrado G.S., Dean J., Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems 26, 2013, pp. 3111–3119.
  • [8] Pennington J., Socher R., Manning C.D., GloVe: Global vectors for word representation. In: EMNLP. vol. 14., 2014, pp. 1532–1543.
  • [9] Bengio Y., Ducharme R., Vincent P., Jauvin C., A neural probabilistic language model. Journal of Machine Learning Research, 2003, 3 (Feb), pp. 1137–1155.
  • [10] Mikolov T., Chen K., Corrado G., Dean J., Efficient estimation of word representations in vector space. CoRR, 2013, abs/1301.3781.
  • [11] Luong T., Socher R., Manning C.D., Better word representations with recursive neural networks for morphology. In: Proceedings of the Seventeenth Conference on Computational Natural Language Learning, CoNLL 2013, Sofia, Bulgaria, August 8-9, 2013, 2013, pp. 104–113.
  • [12] Botha J.A., Blunsom P., Compositional morphology for word representations and language modelling. In: ICML, 2014, pp. 1899–1907.
  • [13] Deerwester S., Dumais S.T., Furnas G.W., Landauer T.K., Harshman R., Indexing by latent semantic analysis. Journal of the American Society for Information Science, 1990, 41 (6), pp. 391–407.
  • [14] Duchi J., Hazan E., Singer Y., Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 2011, 12 (Jul), pp. 2121–2159.
  • [15] Miłkowski M., Polimorfologik. https://github.com/morfologik/polimorfologik 2016.
  • [16] Maziarz M., Piasecki M., Szpakowicz S., Approaching plWordNet 2.0. In: Proceedings of the 6th Global Wordnet Conference, January 2012.
  • [17] Schnabel T., Labutov I., Mimno D.M., Joachims T., Evaluation methods for unsupervised word embeddings. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17–21, 2015, pp. 298–307.
Notes
Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities (tasks for 2017).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-4ec7fcc5-82b2-4139-9f0a-70298332531d