Recent advances in medical science have considerably accelerated the rate at which new information is published. The MEDLINE database grows by 500,000 new citations each year. At this rate of growth, it is difficult to keep up with the literature manually. Thus, there is a need for automatic information extraction systems that retrieve and organize information in the biomedical domain. Biomedical Named Entity Recognition (NER) is one such fundamental information extraction task, and it underpins broader information management goals in the biomedical domain. Due to the complex vocabulary (e.g., mRNA) and free nomenclature (e.g., IL2), identifying named entities in biomedical text is more challenging than in other domains and therefore requires special attention. In this paper, we deploy two bi-directional encoder-based systems, BioBERT and RoBERTa, to identify named entities in biomedical text. Owing to its domain-specific training, BioBERT gives reasonably good performance on the biomedical NER task. However, the structure of RoBERTa makes it more suitable for the task: RoBERTa yields a significant improvement in F-score over BioBERT. In addition, we present a comparative study of the training loss attained with the Adam and LAMB optimizers.