Article title

Knowledge graphs effectiveness in Neural Machine Translation improvement

Languages of publication
EN
Abstracts
EN
Maintaining semantic relations between words during the translation process yields more accurate target-language output from Neural Machine Translation (NMT). Although difficult to achieve from training data alone, it is possible to leverage Knowledge Graphs (KGs) to retain source-language semantic relations in the corresponding target-language translation. The core idea is to use KG entity relations as embedding constraints to improve the mapping from source to target. This paper describes two embedding constraints, both of which employ Entity Linking (EL)—assigning a unique identity to entities—to associate words in training sentences with those in the KG: (1) a monolingual embedding constraint that supports an enhanced semantic representation of the source words through access to relations between entities in a KG; and (2) a bilingual embedding constraint that forces entity relations in the source language to be carried over to the corresponding entities in the target-language translation. The method is evaluated for English-Spanish translation exploiting Freebase as a source of knowledge. Our experimental results demonstrate that exploiting KG information not only decreases the number of unknown words in the translation but also improves translation quality.
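The abstract does not give the exact formulation of the embedding constraints, but the general idea—penalizing embeddings of linked entities that violate a KG relation, in the TransE style of reference [9]—can be sketched as follows. All function names, the toy string identifiers, and the `weight` hyperparameter are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def transe_score(head, relation, tail):
    """TransE-style score ||h + r - t||: near zero when the relation holds."""
    return np.linalg.norm(head + relation - tail)

def kg_regularized_loss(nmt_loss, linked_triples, embeddings, weight=0.1):
    """Add a KG embedding constraint to a base NMT training loss.

    linked_triples: (head, relation, tail) identifiers produced by entity
    linking the training sentence against the KG (toy string keys here).
    embeddings: dict mapping each identifier to its embedding vector.
    """
    penalty = sum(
        transe_score(embeddings[h], embeddings[r], embeddings[t])
        for h, r, t in linked_triples
    )
    return nmt_loss + weight * penalty

# Toy example: the triple (h, r, t) satisfies h + r == t exactly,
# so it contributes no penalty and the base loss is unchanged.
emb = {
    "h": np.array([1.0, 0.0]),
    "r": np.array([0.0, 1.0]),
    "t": np.array([1.0, 1.0]),
}
print(kg_regularized_loss(2.0, [("h", "r", "t")], emb))  # -> 2.0
```

In the bilingual variant described in the abstract, the same kind of penalty would be applied to the target-side embeddings of the linked entities, encouraging the source-language relation to survive translation.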
Publisher
Journal
Year
Volume
Pages
299–318
Physical description
Bibliography: 54 items, figures, tables
Authors
  • Tulane University, Department of Computer Science, New Orleans, LA, United States
  • Institute for Human and Machine Cognition (IHMC), Ocala, FL, United States
  • Michigan State University, Department of Computer Science and Engineering, East Lansing, MI, United States
Bibliography
  • [1] Ahmadnia B., Dorr B.J.: Augmenting Neural Machine Translation through Round-Trip Training Approach, Open Computer Science, vol. 9(1), pp. 268–278, 2019.
  • [2] Ahmadnia B., Dorr B.J.: Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pp. 25–32, 2019.
  • [3] Ahmadnia B., Serrano J., Haffari G.: Persian-Spanish Low-Resource Statistical Machine Translation Through English as Pivot Language. In: Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP 2017), pp. 24–30, 2017.
  • [4] Annervaz K.M., Chowdhury S.B.R., Dukkipati A.: Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp. 313–322, 2018.
  • [5] Arthur P., Neubig G., Nakamura S.: Incorporating Discrete Translation Lexicons into Neural Machine Translation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1557–1567, 2016.
  • [6] Auer S., Bizer C., Kobilarov G., Lehmann J., Cyganiak R., Ives Z.: DBpedia: A Nucleus for a Web of Open Data. In: Proceedings of the 6th International Semantic Web Conference, pp. 722–735. 2008.
  • [7] Bahdanau D., Cho K., Bengio Y.: Neural Machine Translation by Jointly Learning to Align and Translate. In: Proceedings of the International Conference on Learning Representations, 2015.
  • [8] Bollacker K., Evans C., Paritosh P., Sturge T., Taylor J.: Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge. In: Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 1247–1250, 2008.
  • [9] Bordes A., Usunier N., García-Durán A., Weston J., Yakhnenko O.: Translating Embeddings for Modeling Multi-Relational Data. In: NIPS’13: Proceedings of the 26th International Conference on Neural Information Processing Systems, vol. 2, pp. 2787–2795, 2013.
  • [10] Chah N.: OK Google, What Is Your Ontology? Or: Exploring Freebase Classification to Understand Google’s Knowledge Graph, CoRR, vol. abs/1805.03885, 2018.
  • [11] Chatterjee R., Negri M., Turchi M., Federico M., Specia L., Blain F.: Guiding Neural Machine Translation Decoding with External Knowledge. In: Proceedings of the Second Conference on Machine Translation, pp. 157–168, 2017.
  • [12] Chousa K., Sudoh K., Nakamura S.: Training Neural Machine Translation using Word Embedding-based Loss, CoRR, vol. abs/1807.11219, 2018.
  • [13] Dasgupta S.S., Ray S.N., Talukdar P.: HyTE: Hyperplane-based Temporally aware Knowledge Graph Embedding. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2001–2011, 2018.
  • [14] Dong X.L., Gabrilovich E., Heitz G., Horn W., Lao N., Murphy K., Strohmann T., Sun S., Zhang W.: Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 601–610, 2014.
  • [15] Dorr B.J.: Machine Translation Divergences: A Formal Description and Proposed Solution. In: Computational Linguistics, vol. 20(4), pp. 597–633, 1994.
  • [16] Dorr B.J., Pearl L., Hwa R., Habash N.: DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-Level Alignment. In: AMTA’02: Proceedings of the 5th Conference of the Association for Machine Translation in the Americas on Machine Translation: From Research to Real Users, pp. 31–43, 2002.
  • [17] Du J., Way A., Zydron A.: Using BabelNet to Improve OOV Coverage in SMT. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pp. 9–15, 2016.
  • [18] Färber M., Ell B., Menne C., Rettinger A.: A Comparative Survey of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO, Semantic Web Journal, pp. 1–26, 2015.
  • [19] Gehring J., Auli M., Grangier D., Yarats D., Dauphin Y.N.: Convolutional Sequence to Sequence Learning, CoRR, 2017.
  • [20] Gu J., Lu Z., Li H., Li V.O.K.: Incorporating Copying Mechanism in Sequence-to-Sequence Learning. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1631–1640, 2016.
  • [21] Gulcehre Ç., Firat O., Xu K., Cho K., Barrault L., Lin H.-Ch., Bougares F., Schwenk H., Bengio Y.: On Using Monolingual Corpora in Neural Machine Translation, CoRR, vol. abs/1503.03535, 2015.
  • [22] Hochreiter S., Schmidhuber J.: Long short-term memory, Neural Computation, vol. 9(8), pp. 1735–1780, 1997.
  • [23] Jean S., Cho K., Memisevic R., Bengio Y.: On Using Very Large Target Vocabulary for Neural Machine Translation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, pp. 1–10, 2015.
  • [24] Klein G., Kim Y., Deng Y., Senellart J., Rush A.M.: OpenNMT: Open-Source Toolkit for Neural Machine Translation. In: Proceedings of 55th Annual Meeting of the Association for Computational Linguistics, pp. 67–72, 2017.
  • [25] Koehn P., Och F.J., Marcu D.: Statistical Phrase-Based Translation. In: Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pp. 127–133, 2003.
  • [26] Li S., Xu J., Miao G., Zhang Y., Chen Y.: A Semantic Concept Based Unknown Words Processing Method in Neural Machine Translation. In: Natural Language Processing and Chinese Computing, pp. 233–242, 2018.
  • [27] Lin Y., Liu Z., Sun M., Liu Y., Zhu X.: Learning Entity and Relation Embeddings for Knowledge Graph Completion. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 2181–2187, 2015.
  • [28] Linzen T., Dupoux E., Goldberg Y.: Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies, Transactions of the Association for Computational Linguistics, vol. 4, pp. 521–535, 2016.
  • [29] Lu Y., Zhang J., Zong C.: Exploiting Knowledge Graph in Neural Machine Translation. In: Proceedings of Machine Translation: 14th China Workshop, CWMT, pp. 27–38, 2019.
  • [30] Luong M.T., Manning C.D.: Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1054–1063, 2016.
  • [31] Luong M.T., Pham H., Manning C.D.: Effective Approaches to Attention-based Neural Machine Translation. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421, 2015.
  • [32] Luong T., Sutskever I., Le Q., Vinyals O., Zaremba W.: Addressing the Rare Word Problem in Neural Machine Translation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp. 11–19, 2015.
  • [33] Moussallem D., Arcan M., Ngonga Ngomo A.C., Buitelaar P.: Augmenting Neural Machine Translation with Knowledge Graphs, CoRR, vol. abs/1902.08816, 2019.
  • [34] Moussallem D., Usbeck R., Roder M., Ngonga Ngomo A.C.: MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach. In: Proceedings of Knowledge Capture Conference, 2017.
  • [35] Navigli R., Ponzetto S.P.: BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network, Artificial Intelligence, vol. 193, pp. 217–250, 2012.
  • [36] Pan S.J., Yang Q.: A Survey on Transfer Learning, IEEE Transactions on Knowledge and Data Engineering, vol. 22(10), pp. 1345–1359, 2010.
  • [37] Papineni K., Roukos S., Ward T., Zhu W.J.: BLEU: A Method for Automatic Evaluation of Machine Translation. In: Proceedings of the 40th Annual Meeting on the Association for Computational Linguistics, pp. 311–318, 2001.
  • [38] Robbins H., Monro S.: A stochastic approximation method, Annals of Mathematical Statistics, vol. 22, pp. 400–407, 1951.
  • [39] Rumelhart D.E., Hinton G.E., Williams R.J.: Learning representations by backpropagating errors, Nature, vol. 323, pp. 533–536, 1986.
  • [40] Sennrich R., Haddow B.: Linguistic Input Features Improve Neural Machine Translation. In: Proceedings of the First Conference on Machine Translation, pp. 83–91, 2016.
  • [41] Sennrich R., Haddow B., Birch A.: Neural Machine Translation of Rare Words with Subword Units. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1715–1725, 2016.
  • [42] Shi C., Liu S., Ren S., Feng S., Li M., Zhou M., Sun X., Wang H.: Knowledge-Based Semantic Embedding for Machine Translation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 2245–2254, 2016.
  • [43] Song Y., Roth D.: Machine Learning with World Knowledge: The Position and Survey, CoRR, vol. abs/1705.02908, 2017.
  • [44] Sorokin D., Gurevych I.: Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 3306–3317, 2018.
  • [45] Sutskever I., Vinyals O., Le Q.V.: Sequence to Sequence Learning with Neural Networks. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 3104–3112, 2014.
  • [46] Tang G., Cap F., Pettersson E., Nivre J.: An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 1320–1331, 2018.
  • [47] Tang G., Muller M., Rios A., Sennrich R.: Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 4263–4272, 2018.
  • [48] Tang G., Sennrich R., Nivre J.: An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation. In: Proceedings of the Third Conference on Machine Translation, pp. 26–35, 2018.
  • [49] Tran K., Bisazza A., Monz C.: The Importance of Being Recurrent for Modeling Hierarchical Structure. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 4731–4736, 2018.
  • [50] Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser L., Polosukhin I.: Attention is all you need. In: Proceedings of Advances in Neural Information Processing Systems, pp. 5998–6008, 2017.
  • [51] Wang Z., Zhang J., Feng J., Chen Z.: Knowledge Graph Embedding by Translating on Hyperplanes. In: Proceedings of the 28th AAAI Conference on Artificial Intelligence, pp. 1112–1119, 2014.
  • [52] Wu L., Tian F., Qin T., Lai J., Liu T.Y.: A Study of Reinforcement Learning for Neural Machine Translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3612–3621, 2018.
  • [53] Yang B., Mitchell T.: Leveraging Knowledge Bases in LSTMs for Improving Machine Reading. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp. 1436–1446, 2017.
  • [54] Yin W., Kann K., Yu M., Schütze H.: Comparative Study of CNN and RNN for Natural Language Processing, CoRR, vol. abs/1702.01923, 2017.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-4f6ce46f-982b-469a-81dc-a774f27aef74