K-Graph: knowledgeable graph for text documents

Mittal, Varsha; Gangodkar, Durgaprasad; Pant, Bhaskar

doi:10.2478/jok-2021-0006

Selected full texts from this journal

https://journalofkonbin.com

Abstract

Graph databases are used in many applications, including science and business, because of their low complexity, low overhead, and low time complexity. Graph-based storage has the advantage of capturing semantic and structural information rather than relying on the Bag-of-Words technique alone. An approach called Knowledgeable Graphs (K-Graph) is proposed to capture semantic knowledge. Documents are stored using graph nodes. Using weighted subgraphs, the frequent subgraphs are extracted and stored in the Fast Embedding Referral Table (FERT). The table is maintained at different levels according to the headings and subheadings of the documents, which reduces the memory overhead and the retrieval and access time for the required subgraph. The authors propose an approach that reduces data redundancy to a large extent. On real-world datasets, K-Graph's performance and power usage are reported as threefold greater than those of current methods. An accuracy of ninety-nine per cent demonstrates the robustness of the proposed algorithm.
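
The record gives no implementation details, so the following Python sketch is only a loose illustration of the general idea of keeping frequent graph fragments in a table organised by heading level. It is not the authors' algorithm. The word-adjacency graph, the function names section_edges and build_table, the min_support parameter, and the use of single edges as the smallest possible "subgraphs" are all assumptions made for this example, and the resulting dictionary is only a stand-in for a structure like FERT.

```python
from collections import Counter, defaultdict


def section_edges(text):
    """Return the set of directed word-adjacency edges in one section (assumed graph model)."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def build_table(documents, min_support=2):
    """Map heading level -> {edge: support} for edges found in >= min_support documents.

    `documents` is a list of documents; each document is a list of
    (heading_level, section_text) pairs. This layout is hypothetical.
    """
    support = defaultdict(Counter)
    for doc in documents:
        per_level = defaultdict(set)
        for level, text in doc:
            per_level[level] |= section_edges(text)
        for level, edges in per_level.items():
            # Each edge is counted at most once per document and level.
            support[level].update(edges)
    return {
        level: {edge: n for edge, n in counts.items() if n >= min_support}
        for level, counts in support.items()
    }


docs = [
    [(1, "graph mining methods"), (2, "frequent subgraph mining")],
    [(1, "graph mining survey"), (2, "weighted subgraph mining")],
]
table = build_table(docs)
print(table)
```

Keying the table by heading level means a lookup for a fragment under a subheading only scans the entries for that level, which is one plausible reading of how a level-wise table could shorten retrieval.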

Keywords

subgraph mining; graph database; text classification

Publisher

Wydawnictwo Instytutu Technicznego Wojsk Lotniczych

Journal

Journal of KONBiN

Year

2021

Volume

Vol. 51, iss. 1

Pages

73–89

Physical description

Bibliography: 28 items, figures, tables.

Authors

author

Mittal Varsha

Graphic Era Deemed to be University, Dehradun, Uttarakhand, India

author

Gangodkar Durgaprasad

Graphic Era Deemed to be University, Dehradun, Uttarakhand, India

author

Pant Bhaskar

Graphic Era Deemed to be University, Dehradun, Uttarakhand, India

References

1. Atastina I., Sitohang B., Saptawati G., Moertini V.S.: A Review of Big Graph Mining Research. IOP Conf. Ser. Mater. Sci. Eng., 180, 12-16, 2017.
2. Abdelhamid E., Canim M., Sadoghi M., Bhattacharjee B., Chang Y., Kalnis P.: Incremental Frequent Subgraph Mining for Large Evolving Graphs. IEEE Transactions on Knowledge and Data Engineering, 29, 12, 2017.
3. Dhiman A., Jain S.K.: Frequent subgraph mining algorithms for single large graphs: A brief survey. International Conference on Advances in Computing, Communication, Automation (ICACCA) (Spring), Apr. 2016.
4. Gee K.R., Cook D.J.: Text Classification Using Graph-Encoded Linguistic Elements. In FLAIRS Conference, 487-492, 2005.
5. Geibel, Krumnack U., Pustylnikow O., Mehler A.: Structure-Sensitive Learning of Text Types. Advances in Artificial Intelligence, 4830, 642-646, 2007.
6. Giarelis N., Kanakaris N., Karacapilidis N.: On a Novel Representation of Multiple Textual Documents in a single Graph. Proceedings of International Conference on Intelligent Decision Technologies IDT 2020, Split, Croatia, 105-115, 2020.
7. https://shodhganga.inibnet.ac.in.
8. https://library.stanford.edu/spc/universityarchives/dissertations-and-theses.
9. https://indiankanoon.org/browse/supremecourt/
10. http://read.gov/books/
11. Huan J., Wang J., Prins J.: Efficient mining of frequent subgraphs in the presence of isomorphism. Third IEEE International Conference on Data Mining, 549–552, 2003.
12. Inokuchi A., Washio T., Motoda H.: An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data. Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, London, UK, 13–23, 2003.
13. Kang U., Tsourakakis C.E., Faloutsos C.: PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations. Ninth IEEE International Conference on Data Mining, Miami Beach, FL, USA, Dec. 2009.
14. Kuramochi M., Karypis G.: Frequent Subgraph Discovery. Proceedings – IEEE International Conference on Data Mining, ICDM, 313–320, 2010.
15. Kuramochi M., Karypis G.: GREW - a scalable frequent subgraph discovery algorithm. IEEE International Conference on Data Mining (ICDM’04), 439–442, 2004.
16. Markov A.: Efficient Graph-based Representation of web Documents. Proceedings of the Third International Workshop on Mining Graphs, Trees and Sequences, Porto, Portugal, 52-62, 2005.
17. Markov A., Last M., Kandel A.: A Fast Categorization of Web Documents represented by Graphs. Advances in Web Mining and Web Usage Analysis, 4811, 56-71, 2007.
18. Mukund D., Kuramochi M., Karypis G.: Frequent Sub-structure Based Approaches for Classifying Chemical Compounds. Proceedings of the Third IEEE International Conference on Data Mining, 2003.
19. Nijssen S., Kok J.N.: A Quickstart in Frequent Structure Mining Can Make a Difference. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 2004.
20. Paulheim H.: Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web, vol. 8, no. 3, 489–508, 2016.
21. Pokorny J.: Integration of Relational and Graph Database Functionally. Foundation of Computing and Decision Sciences, 44, 4, 427-441, 2019.
22. Schenker A.: Graph Theoretic Techniques for Web Content Mining. PhD Thesis, University of South Florida, 2003.
23. Ramraj T., Prabhakar R.: Frequent Subgraph Mining Algorithms – A Survey. Procedia Comput. Sci., 47, 197–204, 2015.
24. Rehman S.U., Khan A.U., Fong S.: Graph mining: A survey of graph mining techniques. Seventh International Conference on Digital Information Management (ICDIM 2012), 88–92, 2012.
25. Rehman S.U., Asghar S., Fong S.: An Efficient Ranking Scheme for Frequent Subgraph Patterns. Proceedings of the 2018 10th International Conference on Machine Learning and Computing, New York, NY, USA, 257–262, 2018.
26. Tao F., Murtagh F., Farid M.: Weighted Association Rule Mining Using Weighted Support and Significant Framework. Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, USA, 2003.
27. Yan X., Han J.: gSpan: graph-based substructure pattern mining. IEEE International Conference on Data Mining Proceedings, 721–724, 2002.
28. Yan X., Han J.: CloseGraph: Mining Closed Frequent Graph Patterns. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 286–295, 2003.

Notes

Record prepared with funding from MNiSW (the Polish Ministry of Science and Higher Education), agreement no. 461252, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2021).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-d6ed48da-0d1d-430d-ab3a-44c3c332312f