Identifiers
Title variants
Publication languages
Abstracts
The volume of textual information on the Internet is growing exponentially, and text summarization (TS) plays a crucial role in handling this massive amount of content. Manual TS is time-consuming and impractical in applications that involve huge amounts of text. Automatic text summarization (ATS) is an essential technology for overcoming these challenges. Non-negative matrix factorization (NMF) is a useful tool for extracting semantic content from textual data. Existing NMF approaches focus only on how the factorized matrices should be modeled and neglect the relationships among sentences, although these relationships lead to a better factorization for TS. This paper proposes a novel non-negative matrix factorization for text summarization (NMFTS). The proposed ATS model imposes regularization on pairwise sentence vectors. A new cost function based on the Frobenius norm is designed, and an algorithm with iterative updating rules is developed to minimize it. The proposed NMFTS extracts semantic content by reducing the size of documents and mapping similar sentences close together in the latent topic space. Compared with basic NMF, the convergence time of the proposed method does not increase. The convergence proof of NMFTS and empirical results on benchmark data sets show that the suggested updating rules converge fast and achieve superior results compared to other methods.
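As a point of reference for the abstract, the following is a minimal sketch of basic NMF with the standard multiplicative updating rules for the Frobenius-norm cost, i.e. the baseline that the abstract compares against. It is not the NMFTS method: the abstract does not state the pairwise sentence regularizer or its updating rules, so they are not reproduced here. The function name `nmf_frobenius`, the toy term-by-sentence matrix `A`, and all parameter values are assumptions made only for illustration.

```python
import numpy as np


def nmf_frobenius(A, k, n_iter=200, eps=1e-9, seed=0):
    """Basic NMF: approximate A (terms x sentences) by W @ H with W, H >= 0.

    Uses the standard multiplicative updates for the cost ||A - W H||_F.
    This is the plain baseline, NOT the regularized NMFTS of the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random start so the multiplicative updates can move.
    W = rng.random((m, k)) + eps  # term-by-topic weights
    H = rng.random((k, n)) + eps  # topic-by-sentence weights
    errors = []
    for _ in range(n_iter):
        # eps in the denominators only guards against division by zero.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(A - W @ H, "fro"))
    return W, H, errors


# Hypothetical toy data: 5 terms (rows) by 4 sentences (columns).
A = np.array(
    [
        [2, 1, 0, 0],
        [1, 2, 0, 0],
        [0, 0, 3, 1],
        [0, 0, 1, 2],
        [1, 0, 0, 1],
    ],
    dtype=float,
)

W, H, errors = nmf_frobenius(A, k=2)
print("first and last reconstruction error:", errors[0], errors[-1])
print("sentence weights in the 2-topic latent space:")
print(H)
```

Each column of `H` is the representation of one sentence in the latent topic space. A regularized variant such as the one the abstract describes would add a penalty term to the cost so that related sentences receive nearby columns.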
Publisher
Year
Volume
Pages
37–49
Physical description
Bibliography: 47 items, figures.
Authors
- Department of Computer Engineering, University of Bonab, Bonab, Iran
Notes
Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularization of science and promotion of sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-3bae1733-4135-4554-a07d-029b85d10781