Article title
Identifiers
Title variants
Publication languages
Abstracts
The aim of this article is to investigate the potential of fine-tuning on natural language inference (NLI) data to improve information retrieval and ranking. We demonstrate this for both English and Polish, using data from one of the largest Polish e-commerce sites and selected open-domain datasets. We employ both monolingual and multilingual sentence encoders, fine-tuned with a supervised method that uses a contrastive loss and NLI data. Our results show that NLI fine-tuning increases the performance of the models on both tasks and in both languages, and that it can improve mono- as well as multilingual models. Finally, we investigate the uniformity and alignment of the embeddings to explain the effect of NLI-based fine-tuning in the out-of-domain setting.
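The abstract names two technical ingredients: supervised contrastive fine-tuning on NLI data and an alignment/uniformity analysis of the resulting embeddings. The sketch below is not the paper's code; it is a minimal illustration, assuming PyTorch, illustrative function names, tensor layouts, and a temperature value. It shows a SimCSE-style supervised loss that treats each premise's entailment hypothesis as the positive and its contradiction hypothesis as an in-batch hard negative, together with the alignment and uniformity diagnostics of Wang and Isola (reference 17 below).

```python
import torch
import torch.nn.functional as F


def nli_contrastive_loss(premise_emb: torch.Tensor,
                         entail_emb: torch.Tensor,
                         contra_emb: torch.Tensor,
                         temperature: float = 0.05) -> torch.Tensor:
    """SimCSE-style supervised objective: each premise should be closer to its
    entailment hypothesis than to any other in-batch hypothesis, including the
    contradiction hypotheses used as hard negatives. Inputs are [batch, dim]
    sentence embeddings; the temperature value is an illustrative assumption."""
    z_p = F.normalize(premise_emb, dim=-1)
    z_pos = F.normalize(entail_emb, dim=-1)
    z_neg = F.normalize(contra_emb, dim=-1)
    # Cosine similarity of every premise to every positive and every hard negative.
    logits = torch.cat([z_p @ z_pos.T, z_p @ z_neg.T], dim=1) / temperature
    # The "correct class" for premise i is its own entailment hypothesis (column i).
    labels = torch.arange(z_p.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)


def alignment(x: torch.Tensor, y: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Average distance between normalized embeddings of positive pairs."""
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    return (x - y).norm(dim=1).pow(alpha).mean()


def uniformity(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Log of the mean Gaussian potential over all embedding pairs; embeddings
    spread more uniformly on the hypersphere give a lower value."""
    x = F.normalize(x, dim=-1)
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```

Lower values are better for both diagnostics: alignment measures how close positive pairs stay, while uniformity measures how evenly the embeddings cover the hypersphere. Comparing the two before and after NLI fine-tuning is the kind of analysis the abstract refers to when explaining the out-of-domain gains.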
Year
Volume
Pages
949–953
Physical description
Bibliography: 18 items, charts, tables
Authors
author
- Allegro sp. z o.o. Wierzbięcice 1B, 61-569 Poznań, Poland
author
- Allegro sp. z o.o. Wierzbięcice 1B, 61-569 Poznań, Poland
author
- Allegro sp. z o.o. Wierzbięcice 1B, 61-569 Poznań, Poland
author
- Allegro sp. z o.o. Wierzbięcice 1B, 61-569 Poznań, Poland
- Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warszawa, Poland
Bibliography
- 1. W. Guo, X. Liu, S. Wang, H. Gao, A. Sankar, Z. Yang, Q. Guo, L. Zhang, B. Long, B.-C. Chen, and D. Agarwal, “DeText: A deep text ranking framework with BERT,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, ser. CIKM ’20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 2509–2516. [Online]. Available: https://doi.org/10.1145/3340531.3412699
- 2. J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with GPUs,” IEEE Transactions on Big Data, vol. 7, no. 3, pp. 535–547, 2019.
- 3. R. Mroczkowski, P. Rybak, A. Wróblewska, and I. Gawlik, “HerBERT: Efficiently pretrained transformer-based language model for Polish,” in Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing. Kyiv, Ukraine: Association for Computational Linguistics, Apr. 2021, pp. 1–10. [Online]. Available: https://www.aclweb.org/anthology/2021.bsnlp-1.1
- 4. A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, “Unsupervised cross-lingual representation learning at scale,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics, Jul. 2020, pp. 8440–8451. [Online]. Available: https://aclanthology.org/2020.acl-main.747
- 5. Y. Yang, D. Cer, A. Ahmad, M. Guo, J. Law, N. Constant, G. H. Abrego, S. Yuan, C. Tar, Y.-H. Sung, B. Strope, and R. Kurzweil, “Multilingual universal sentence encoder for semantic retrieval,” 2019. [Online]. Available: https://aclanthology.org/2020.acl-demos.12.pdf
- 6. S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, “A large annotated corpus for learning natural language inference,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2015.
- 7. T. Gao, X. Yao, and D. Chen, “SimCSE: Simple contrastive learning of sentence embeddings,” in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Online and Punta Cana, Dominican Republic: Association for Computational Linguistics, Nov. 2021, pp. 6894–6910. [Online]. Available: https://aclanthology.org/2021.emnlp-main.552
- 8. R. Litschko, I. Vulić, S. P. Ponzetto, and G. Glavaš, “On cross-lingual retrieval with multilingual text encoders,” Information Retrieval Journal, vol. 25, no. 2, pp. 149–183, Jun. 2022. [Online]. Available: https://doi.org/10.1007/s10791-022-09406-x
- 9. P. Rybak, R. Mroczkowski, J. Tracz, and I. Gawlik, “KLEJ: Comprehensive benchmark for Polish language understanding,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics, Jul. 2020, pp. 1191–1201. [Online]. Available: https://aclanthology.org/2020.acl-main.111
- 10. S. Dadas, M. Perełkiewicz, and R. Poświata, “Evaluation of sentence representations in Polish,” in Proceedings of the Twelfth Language Resources and Evaluation Conference. European Language Resources Association, 2020, pp. 1674–1680. [Online]. Available: https://aclanthology.org/2020.lrec-1.207
- 11. A. Wróblewska and K. Krasnowska-Kieraś, “Polish evaluation dataset for compositional distributional semantics models,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver, Canada: Association for Computational Linguistics, Jul. 2017, pp. 784–792. [Online]. Available: https://aclanthology.org/P17-1073
- 12. Y. Chen, S. Liu, Z. Liu, W. Sun, L. Baltrunas, and B. Schroeder, “WANDS: Dataset for product search relevance assessment,” in Proceedings of the 44th European Conference on Information Retrieval, 2022.
- 13. D. Wadden, S. Lin, K. Lo, L. L. Wang, M. van Zuylen, A. Cohan, and H. Hajishirzi, “Fact or Fiction: Verifying Scientific Claims,” in EMNLP, 2020.
- 14. N. Thakur, N. Reimers, A. Rücklé, A. Srivastava, and I. Gurevych, “BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models,” in Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021. [Online]. Available: https://openreview.net/forum?id=wCu6T5xFjeJ
- 15. R. Rei, C. Stewart, A. C. Farinha, and A. Lavie, “COMET: A neural framework for MT evaluation,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Online: Association for Computational Linguistics, Nov. 2020, pp. 2685–2702. [Online]. Available: https://aclanthology.org/2020.emnlp-main.213
- 16. J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” CoRR, vol. abs/1810.04805, 2018. [Online]. Available: http://arxiv.org/abs/1810.04805
- 17. T. Wang and P. Isola, “Understanding contrastive representation learning through alignment and uniformity on the hypersphere,” in Proceedings of the 37th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, H. Daumé III and A. Singh, Eds., vol. 119. PMLR, 13–18 Jul 2020, pp. 9929–9939. [Online]. Available: https://proceedings.mlr.press/v119/wang20k.html
- 18. C. Wang, Y. Yu, W. Ma, M. Zhang, C. Chen, Y. Liu, and S. Ma, “Towards representation alignment and uniformity in collaborative filtering,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022, pp. 1816–1825. [Online]. Available: http://arxiv.org/abs/2206.12811
Notes
1. Thematic Tracks Short Papers
2. Record prepared with funding from the Polish Ministry of Education and Science (MEiN), agreement no. SONP/SP/546092/2022, under the "Społeczna odpowiedzialność nauki" (Social Responsibility of Science) programme, module: Popularisation of Science and Promotion of Sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-1ba6acb7-cd08-4a70-8fde-b28b7cde7a05