In this article we propose a general framework incorporating semantic indexing and search of texts within scientific document repositories. In our approach, a semantic interpreter, which can be seen as a tool for automatic tagging of textual data, is interactively updated based on feedback from the users, in order to improve quality of the tags that it produces. In our experiments, we index our document corpus using the Explicit Semantic Analysis (ESA) method. In this algorithm, an external knowledge base is used to measure relatedness between words and concepts, and those assessments are utilized to assign meaningful concepts to given texts. In the paper, we explain how the weights expressing relations between particular words and concepts can be improved by interaction with users or by employment of expert knowledge. We also present some results of experiments on a document corpus acquired from the PubMed Central repository to show feasibility of our approach.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.