Article title

Automatic questions generation based on keywords using language models

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
In this paper, we present a novel method for question generation, one of the most impactful NLP tasks in interactions between users and user interfaces, chatbots, and intelligent assistants. Our method outperforms commonly used approaches in both the quality and the speed of question generation. Additionally, we benchmark the most widely used question-generation methods based on Large Language Models, applied both in a few-shot setting and fine-tuned for the task. Our work targets Polish, which has one of the most challenging and complex grammars, making the task even more difficult.
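The record does not reproduce the paper's actual prompt format, so the sketch below is only a hypothetical illustration of the few-shot setting the abstract mentions: each in-context example pairs a keyword list with a reference question, and the model is asked to complete the final entry. The function name `build_fewshot_prompt` and the demo example pairs are assumptions, not taken from the paper.

```python
def build_fewshot_prompt(examples, keywords):
    """Assemble a few-shot prompt that asks a language model to turn
    a list of keywords into a natural-language question.

    examples -- list of (keyword_list, reference_question) pairs
    keywords -- keyword list for which a question should be generated
    """
    lines = []
    for kw, question in examples:
        lines.append(f"Keywords: {', '.join(kw)}")
        lines.append(f"Question: {question}")
    # The final entry is left open for the model to complete.
    lines.append(f"Keywords: {', '.join(keywords)}")
    lines.append("Question:")
    return "\n".join(lines)


# Hypothetical in-context examples (English here for readability; the
# paper works on Polish data).
demo_examples = [
    (["capital", "Poland"], "What is the capital of Poland?"),
    (["boiling point", "water"], "At what temperature does water boil?"),
]
prompt = build_fewshot_prompt(demo_examples, ["author", "Pan Tadeusz"])
print(prompt)
```

The resulting string would then be passed to a generative model (e.g. via a text-generation API); the model continues the pattern and emits a question for the last keyword list.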
Authors
  • JT Weston sp. z o.o. Warszawa, Poland
author
  • Wroclaw University of Science and Technology, Faculty of Information and Communication Technology, Department of Computer Engineering, Poland
  • JT Weston sp. z o.o. Warszawa, Poland
  • University of Warsaw, Poland
Bibliography
  • [1] Z. Wang, “Generating complex questions from knowledge graphs with query graphs,” in 2022 IEEE 10th International Conference on Information, Communication and Networks (ICICN), 2022, pp. 606-613.
  • [2] D. Rothman, Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, Hugging Face, and OpenAI’s GPT-3, ChatGPT, and GPT-4. Packt Publishing Ltd, 2022.
  • [3] T. Gniazdowski, M. Bazan, and M. Marchwiany, “Test dataset to reproduce experiments,” https://github.com/TomekGniazdowski/Automatic-questions-generation-based-on-keywords-using-language-models-paper-dataset, 2024.
  • [4] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in 3rd International Conference on Learning Representations (ICLR 2015), May 2015.
  • [5] C. Gulcehre, S. Ahn, R. Nallapati, B. Zhou, and Y. Bengio, “Pointing the unknown words,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), K. Erk and N. A. Smith, Eds. Berlin, Germany: Association for Computational Linguistics, Aug. 2016, pp. 140-149. [Online]. Available: https://aclanthology.org/P16-1014
  • [6] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” Advances in neural information processing systems, vol. 27, 2014.
  • [7] D. Lindberg, F. Popowich, J. Nesbit, and P. Winne, “Generating natural language questions to support learning on-line,” in Proceedings of the 14th European workshop on natural language generation, 2013, pp. 105-114.
  • [8] R. Meng, S. Zhao, S. Han, D. He, P. Brusilovsky, and Y. Chi, “Deep keyphrase generation,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), R. Barzilay and M.-Y. Kan, Eds. Vancouver, Canada: Association for Computational Linguistics, Jul. 2017, pp. 582-592. [Online]. Available: https://aclanthology.org/P17-1054
  • [9] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), A. Moschitti, B. Pang, and W. Daelemans, Eds. Doha, Qatar: Association for Computational Linguistics, Oct. 2014, pp. 1724-1734. [Online]. Available: https://aclanthology.org/D14-1179
  • [10] S. R. Indurthi, D. Raghu, M. M. Khapra, and S. Joshi, “Generating natural language question-answer pairs from a knowledge graph using a rnn based question generation model,” in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, 2017, pp. 376-385.
  • [11] W. Hu, B. Liu, J. Ma, D. Zhao, and R. Yan, “Aspect-based question generation,” International Conference on Learning Representations, ICLR 2018, 2018, workshop track.
  • [12] X. Shen, J. Chen, J. Chen, C. Zeng, and Y. Xiao, “Diversified query generation guided by knowledge graph,” in Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 2022, pp. 897-907.
  • [13] R. Doi, T. Charoenporn, and V. Sornlertlamvanich, “Automatic question generation for chatbot development,” in 2022 7th International Conference on Business and Industrial Research (ICBIR). IEEE, 2022, pp. 301-305.
  • [14] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” Journal of Machine Learning Research, vol. 21, no. 140, pp. 1-67, 2020. [Online]. Available: http://jmlr.org/papers/v21/20-074.html
  • [15] A. Kumar, S. Dandapat, and S. Chordia, “Translating web search queries into natural language questions,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, and T. Tokunaga, Eds. Miyazaki, Japan: European Language Resources Association (ELRA), May 2018. [Online]. Available: https://aclanthology.org/L18-1151
  • [16] R. Pandey, D. Chaudhari, S. Bhawani, O. Pawar, and S. Barve, “Interview bot with automatic question generation and answer evaluation,” in 2023 9th international conference on advanced computing and communication systems (ICACCS), vol. 1. IEEE, 2023, pp. 1279-1286.
  • [17] G. Deena, K. Raja et al., “Keyword extraction using latent semantic analysis for question generation,” Journal of Applied Science and Engineering, vol. 26, no. 4, pp. 501-510, 2022.
  • [18] G. Deena and K. Raja, “Objective type question generation using natural language processing,” International Journal of Advanced Computer Science and Applications, vol. 13, no. 2, 2022.
  • [19] J. J. G. Torres, M. B. Bîndilă, S. Hofstee, D. Szondy, Q.-H. Nguyen, S. Wang, and G. Englebienne, “Automated question-answer generation for evaluating rag-based chatbots,” in Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024, 2024, pp. 204-214.
  • [20] N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. De Laroussilhe, A. Gesmundo, M. Attariyan, and S. Gelly, “Parameter-efficient transfer learning for NLP,” in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97. PMLR, 09-15 Jun 2019, pp. 2790-2799. [Online]. Available: https://proceedings.mlr.press/v97/houlsby19a.html
  • [21] E. J. Hu, yelong shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-rank adaptation of large language models,” in International Conference on Learning Representations, 2022. [Online]. Available: https://openreview.net/forum?id=nZeVKeeFYf9
  • [22] T. Brown et al., “Language models are few-shot learners,” Advances in neural information processing systems, vol. 33, pp. 1877-1901, 2020.
  • [23] M. Blšták and V. Rozinajová, “Automatic question generation based on sentence structure analysis using machine learning approach,” Natural Language Engineering, vol. 28, no. 4, pp. 487-517, 2022.
  • [24] S. Banerjee and A. Lavie, “Meteor: An automatic metric for mt evaluation with improved correlation with human judgments,” in Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, 2005, pp. 65-72.
  • [25] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 2002, pp. 311-318.
  • [26] C.-Y. Lin, “Rouge: A package for automatic evaluation of summaries,” in Text summarization branches out, 2004, pp. 74-81.
  • [27] T. Zhang*, V. Kishore*, F. Wu*, K. Q. Weinberger, and Y. Artzi, “Bertscore: Evaluating text generation with bert,” in International Conference on Learning Representations, 2020. [Online]. Available: https://openreview.net/forum?id=SkeHuCVFDr
  • [28] T. Sellam, D. Das, and A. Parikh, “BLEURT: Learning robust metrics for text generation,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D. Jurafsky, J. Chai, N. Schluter, and J. Tetreault, Eds. Online: Association for Computational Linguistics, Jul. 2020, pp. 7881-7892. [Online]. Available: https://aclanthology.org/2020.acl-main.704
  • [29] Y. Huang, L. Sun, H. Wang, S. Wu, Q. Zhang, Y. Li, C. Gao, Y. Huang, W. Lyu, Y. Zhang et al., “Trustllm: Trustworthiness in large language models,” arXiv preprint arXiv:2401.05561, 2024.
  • [30] Z. Dong, Z. Zhou, C. Yang, J. Shao, and Y. Qiao, “Attacks, defenses and evaluations for llm conversation safety: A survey,” arXiv preprint arXiv:2402.09283, 2024.
  • [31] C. Walker, C. Rothon, K. Aslansefat, Y. Papadopoulos, and N. Dethlefs, “Safellm: Domain-specific safety monitoring for large language models: A case study of offshore wind maintenance,” arXiv preprint arXiv:2410.10852, 2024.
  • [32] M. Honnibal and I. Montani, “spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing,” 2017, https://github.com/explosion/spaCy.
  • [33] M. Honnibal and M. Johnson, “An improved non-monotonic transition system for dependency parsing,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, L. Màrquez, C. Callison-Burch, and J. Su, Eds. Lisbon, Portugal: Association for Computational Linguistics, Sep. 2015, pp. 1373-1378. [Online]. Available: https://aclanthology.org/D15-1162
  • [34] J. Nivre and J. Nilsson, “Pseudo-projective dependency parsing,” in Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), K. Knight, H. T. Ng, and K. Oflazer, Eds. Ann Arbor, Michigan: Association for Computational Linguistics, Jun. 2005, pp. 99-106. [Online]. Available: https://aclanthology.org/P05-1013
  • [35] W. Kieraś and M. Woliński, “Morfeusz 2 - analizator i generator fleksyjny dla języka polskiego,” Język Polski, vol. XCVII, no. 1, pp. 75-83, 2017.
  • [36] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, “BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D. Jurafsky, J. Chai, N. Schluter, and J. Tetreault, Eds. Online: Association for Computational Linguistics, Jul. 2020, pp. 7871-7880. [Online]. Available: https://aclanthology.org/2020.acl-main.703
  • [37] S. Dadas, “Polish bart base,” https://huggingface.co/sdadas/polish-bart-base, 2022.
  • [38] A. Chrabrowa, Ł. Dragan, K. Grzegorczyk, D. Kajtoch, M. Koszowski, R. Mroczkowski, and P. Rybak, “Evaluation of transfer learning for Polish with a text-to-text model,” in Proceedings of the Thirteenth Language Resources and Evaluation Conference, N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, and S. Piperidis, Eds. Marseille, France: European Language Resources Association, Jun. 2022, pp. 4374-4394. [Online]. Available: https://aclanthology.org/2022.lrec-1.466
  • [39] S. Rothe, J. Mallinson, E. Malmi, S. Krause, and A. Severyn, “A simple recipe for multilingual grammatical error correction,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computational Linguistics, Aug. 2021, pp. 702-707. [Online]. Available: https://aclanthology.org/2021.acl-short.89
  • [40] P. Lison and J. Tiedemann, “OpenSubtitles2016: Extracting large parallel corpora from movie and TV subtitles,” in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, and S. Piperidis, Eds. Portorož, Slovenia: European Language Resources Association (ELRA), May 2016, pp. 923-929. [Online]. Available: https://aclanthology.org/L16-1147
  • [41] M. Bañón, P. Chen, B. Haddow, K. Heafield, H. Hoang, M. Esplà-Gomis, M. L. Forcada, A. Kamran, F. Kirefu, P. Koehn, S. Ortiz Rojas, L. Pla Sempere, G. Ramírez-Sánchez, E. Sarrías, M. Strelec, B. Thompson, W. Waites, D. Wiggins, and J. Zaragoza, “ParaCrawl: Web-scale acquisition of parallel corpora,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D. Jurafsky, J. Chai, N. Schluter, and J. Tetreault, Eds. Online: Association for Computational Linguistics, Jul. 2020, pp. 4555-4567. [Online]. Available: https://aclanthology.org/2020.acl-main.417
  • [42] C. Borowski, “Polish verb conjugator,” 2023. [Online]. Available: https://github.com/chriseborowski/Polish-verb-conjugator
  • [43] M. Grootendorst, “Keybert: Minimal keyword extraction with bert.” 2020. [Online]. Available: https://doi.org/10.5281/zenodo.4461265
  • [44] S. Rose, D. Engel, N. Cramer, and W. Cowley, Automatic Keyword Extraction from Individual Documents. John Wiley & Sons, Ltd, 2010, ch. 1, pp. 1-20. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/9780470689646.ch1
  • [45] F. Barrios, F. López, L. Argerich, and R. Wachenchauzer, “Variations of the similarity function of textrank for automated summarization,” 2015, simposio Argentino de Inteligencia Artificial.
  • [46] AI@Meta, “Llama 3 model card,” https://huggingface.co/meta-llama/Meta-Llama-3-8B, 2024. [Online]. Available: https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
  • [47] R. Zhang, J. Han, C. Liu, A. Zhou, P. Lu, Y. Qiao, H. Li, and P. Gao, “LLaMA-adapter: Efficient fine-tuning of large language models with zero-initialized attention,” 2024. [Online]. Available: https://openreview.net/forum?id=d4UiXAHN2W
  • [48] Lightning AI, “LitGPT,” https://github.com/Lightning-AI/litgpt, 2023.
  • [49] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics, June 2011, pp. 142-150. [Online]. Available: http://www.aclweb.org/anthology/P11-1015
  • [50] R. Tatman, “Question-answer dataset.” [Online]. Available: https://www.kaggle.com/datasets/rtatman/questionanswer-dataset
  • [51] A. Bordes, N. Usunier, S. Chopra, and J. Weston, “Large-scale simple question answering with memory networks,” CoRR, vol. abs/1506.02075, 2015. [Online]. Available: http://arxiv.org/abs/1506.02075
  • [52] P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, “SQuAD: 100,000+ questions for machine comprehension of text,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, J. Su, K. Duh, and X. Carreras, Eds. Austin, Texas: Association for Computational Linguistics, Nov. 2016, pp. 2383-2392. [Online]. Available: https://aclanthology.org/D16-1264
  • [53] W. Xiong, J. Wu, H. Wang, V. Kulkarni, M. Yu, S. Chang, X. Guo, and W. Y. Wang, “TWEETQA: A social media focused question answering dataset,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, A. Korhonen, D. Traum, and L. Màrquez, Eds. Florence, Italy: Association for Computational Linguistics, Jul. 2019, pp. 5020-5031. [Online]. Available: https://aclanthology.org/P19-1496
  • [54] D. C. Austin Walters, “Sentence classification.” [Online]. Available: https://github.com/lettergram/sentence-classification
  • [55] N. B. et al., “Deep translator.” [Online]. Available: https://github.com/nidhaloff/deep-translator
  • [56] K. Ociepa, L. Flis, K. Wróbel, S. Kondracki, SpeakLeash Team, and Cyfronet Team, “Introducing bielik-7b-instruct-v0.1: Instruct Polish language model,” 2024, accessed: 2024-07-19. [Online]. Available: https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1
  • [57] I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in International Conference on Learning Representations, 2019. [Online]. Available: https://openreview.net/forum?id=Bkg6RiCqY7
  • [58] ——, “SGDR: Stochastic gradient descent with warm restarts,” in International Conference on Learning Representations, 2017. [Online]. Available: https://openreview.net/forum?id=Skq89Scxx
  • [59] S. Minaee, T. Mikolov, N. Nikzad, M. Chenaghlu, R. Socher, X. Amatriain, and J. Gao, “Large language models: A survey,” 2024. [Online]. Available: https://arxiv.org/abs/2402.06196
  • [60] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed, “Mistral 7b,” 2023. [Online]. Available: https://arxiv.org/abs/2310.06825
Notes
1. Record prepared with funds from the Ministry of Science and Higher Education (MNiSW), agreement no. POPUL/SP/0154/2024/02, under the programme "Social Responsibility of Science II", module: Popularisation of science (2025).
2. This work was financed with European Union funds from the Smart Growth Operational Programme 2014-2020, Measure 1.1: R&D projects of enterprises, Sub-measure 1.1.1: Industrial research and development carried out by enterprises.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-d003a15f-64cc-434c-88a4-a63a55a7ae43