Article title

Automatic Generation of Annotated Corpora of Diagnoses with ICD-10 codes based on Open Data and Linked Open Data

Conference
Federated Conference on Computer Science and Information Systems (15 ; 06-09.09.2020 ; Sofia, Bulgaria)
Publication languages
EN
Abstracts
EN
We propose methods for the automatic generation of corpora that contain descriptions of diagnoses in Bulgarian and their associated ICD-10-CM codes (International Classification of Diseases, 10th Revision, Clinical Modification). The proposed approach is based on available open data and Linked Open Data and can be easily adapted to other languages. The resulting corpus generated for Bulgarian clinical texts consists of about 370,000 pairs of diagnoses and corresponding ICD-10 codes, which is beyond the usual size that can be produced manually; moreover, it was created from scratch and in a relatively short time. The corpora can be updated further whenever new open resources become available or the current ones are updated.
Pages
163--167
Physical description
Bibliography: 11 items, illustrations, tables, charts.
Authors
  • Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
  • Faculty of Mathematics and Informatics, Sofia University "St. Kliment Ohridski", Sofia, Bulgaria
  • Faculty of Mathematics and Informatics, Sofia University "St. Kliment Ohridski", Sofia, Bulgaria
author
  • Faculty of Mathematics and Informatics, Sofia University "St. Kliment Ohridski", Sofia, Bulgaria
Bibliography
  • 1. A. Névéol, H. Dalianis, S. Velupillai, G. Savova, P. Zweigenbaum. "Clinical natural language processing in languages other than English: opportunities and challenges." Journal of Biomedical Semantics, 2018 Dec 1;9(1):12.
  • 2. S. Boytcheva, "Multilingual aspects of information extraction from medical texts in Bulgarian." Multilingual Processing in Eastern and Southern EU Languages: Less-resourced Technologies and Translation, Cambridge Scholars Publishing. 2012 Apr 25:308-29.
  • 3. S. Boytcheva, "Automatic matching of ICD-10 codes to diagnoses in discharge letters." In Proceedings of the second workshop on biomedical natural language processing, RANLP 2011, pp. 11-18, September 2011.
  • 4. M. Voinov et al. Latin-Bulgarian Dictionary. Planeta-3, pp. 792, 1999. (in Bulgarian)
  • 5. Q. Wang et al. "A study of entity-linking methods for normalizing Chinese diagnosis and procedure terms to ICD codes". Journal of Biomedical Informatics. 2020 Apr 13:103418. https://doi.org/10.1016/j.jbi.2020.103418
  • 6. U. Marovac, A. Avdić, D. Janković, and S. Marovac. "Creating Resources for Marking Diagnoses in Electronic Health Reports in Serbian". International Journal of Electrical Engineering and Computing, 2020. 4(1), pp. 18-23.
  • 7. M. Almagro, R. M. Unanue, V. Fresno and S. Montalvo, "ICD-10 Coding of Spanish Electronic Discharge Summaries: An Extreme Classification Problem", IEEE Access, 2020, vol. 8, pp. 100073-100083, 2020, http://dx.doi.org/10.1109/ACCESS.2020.2997241.
  • 8. A. Bagheri, A. Sammani, PGM Van der Heijden, FW Asselbergs, and DL Oberski. "Automatic ICD-10 classification of diseases from Dutch discharge letters". In: Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: C2C. 2020, pp. 281-289.
  • 9. H. Dalianis. "Clinical text retrieval-an overview of basic building blocks and applications". In Professional Search in the Modern World, 2014, pp. 147-165. Springer, Cham.
  • 10. J. Wei, and K. Zou. "EDA: Easy data augmentation techniques for boosting performance on text classification tasks". arXiv preprint https://arxiv.org/abs/1901.11196. 2019 Jan 31.
  • 11. N. Khairova, S. Petrasova, W. Lewoniewski, O. Mamyrbayev, and K. Mukhsina. "Automatic extraction of synonymous collocation pairs from a text corpus". In 2018 Federated Conference on Computer Science and Information Systems (FedCSIS). 2018 Sep 9, pp. 485-488, IEEE.
Notes
1. Track 1: Artificial Intelligence
2. Technical Session: 5th International Workshop on Language Technologies and Applications
3. Record prepared with funds from MNiSW, agreement No. 461252, under the programme "Social Responsibility of Science", module: Popularisation of science and promotion of sport (2021).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-fd3ed88e-82a1-4745-bc94-d405e12f2630