Article title

Exploit relations between the word letters and their placement in the word for Arabic root extraction

Publication languages
EN
Abstracts
EN
This paper presents a new root-extraction approach for Arabic words. The approach tries to assign a unique root to each Arabic word without relying on a database of word roots, a list of word patterns, or a list of all Arabic prefixes and suffixes. Unlike most Arabic rule-based stemmers, it tries to predict the positions of the root letters one by one, based on rules and relations among the word's letters and their placement in the word. This paper focuses on two parts of the approach. The first introduces rules to distinguish between the Arabic definite article and the permanent component that may be found in any Arabic word. The second classifies Arabic letters into groups according to their positions in the word. The proposed approach is implemented as a system composed of several modules that together extract the word root. The approach has been evaluated on the words of the Holy Quran, and the evaluation results show that the root-extraction algorithm is promising.
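The abstract does not state the actual rules, so the following is only a minimal illustrative sketch of the two ideas it names, not the paper's method. The length threshold, the three position labels, and the function names `split_definite_article` and `position_groups` are all assumptions introduced here for illustration.

```python
# Hypothetical illustration only: the abstract does not give the paper's rules.
DEFINITE_ARTICLE = "\u0627\u0644"  # alif + lam, the Arabic definite article "al-"
MIN_REMAINDER = 3  # assumed threshold: most Arabic roots have at least three letters


def split_definite_article(word: str) -> tuple[str, str]:
    """Return (article, remainder) under a toy length rule.

    The article is "" when the rule treats a leading alif-lam as part of the word.
    """
    remainder = word[len(DEFINITE_ARTICLE):]
    if word.startswith(DEFINITE_ARTICLE) and len(remainder) >= MIN_REMAINDER:
        return DEFINITE_ARTICLE, remainder
    return "", word


def position_groups(words: list[str]) -> dict[str, set[str]]:
    """Map each letter to the set of positions (first, middle, last) where it occurs."""
    groups: dict[str, set[str]] = {}
    for word in words:
        for index, letter in enumerate(word):
            if index == 0:
                position = "first"
            elif index == len(word) - 1:
                position = "last"
            else:
                position = "middle"
            groups.setdefault(letter, set()).add(position)
    return groups
```

A length check alone is far too coarse for real text; it only shows where a rule of this kind would sit in a pipeline that has no root, pattern, or affix lists to consult.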
Pages
327–341
Physical description
Bibliography: 12 items, figures, tables.
Contributors
author
  • Faculty of Information Technology and Computer Sciences, Yarmouk University, Irbid, Jordan
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-50ffcad3-39e6-443f-b3d6-1474660738a7