Article title

Lexicon management and standard formats

Authors
Identifiers
Title variants
Conference
Human Language Technologies as a challenge for Computer Science and Linguistics (2; 21-23.04.2005; Poznań, Poland)
Publication languages
EN
Abstracts
EN
International standards for lexicon formats are in preparation. To a certain extent, the proposed formats converge with the results of earlier standardization projects. However, their adequacy for (i) lexicon management and (ii) lexicon-driven applications has been little debated in the past, nor is it being debated as part of the present standardization effort. We examine these issues. IGM has developed XML formats compatible with the emerging international standards, and we report experimental results on large-coverage lexicons.
Year
Pages
337-348
Physical description
Bibliography: 41 items.
Contributors
author
Bibliography
  • [1] A. W. Appel and G. J. Jacobson: The world's fastest Scrabble program. Comm. ACM, 31(5), (1988), 572-578 & 585.
  • [2] S. Bird and E. Loper: NLTK: the Natural Language Toolkit. In Proc. of ACL, (2004).
  • [3] O. Blanc: Rapport d'avancement Outilex. IGM, 2003.
  • [4] O. Blanc and A. Dister: Automates lexicaux avec structure de traits. In RECITAL 2004, (2004), 23-32.
  • [5] Ch. Boitet, M. Mangeot and G. Sérasset: The PAPILLON project: cooperatively building a multilingual lexical data-base to derive open source dictionaries & lexicons. In COLING Workshop on NLP and XML. Taipei, Taiwan, (2002), 93-96.
  • [6] T. Briscoe: Lexical issues in Natural Language Processing. In E. Klein and F. Veltman, (Eds). Natural Language and Speech. Springer. (1991).
  • [7] B. Courtois: Un système de dictionnaires électroniques pour les mots simples du français. Langue Française, 87. Paris: Larousse, (1990).
  • [8] H. Cunningham: GATE, a general architecture for text engineering. Computers and the Humanities, 36 (2002), 223-254.
  • [9] M. Domenig: Word Manager: A System for the Definition, Access and Maintenance of Lexical Databases. In Proc. of COLING. Budapest, 1 (1988).
  • [10] W. N. Francis and H. Kucera: Manual of Information to Accompany a Standard Corpus of Present-Day Edited American English, for use with Digital Computers (Corrected and Revised edition). Department of Linguistics, Brown University, Providence, Rhode Island, 1979.
  • [11] G. Francopoulo: Proposition de norme des lexiques pour le traitement automatique du langage. AFNOR, 21 p. (2003).
  • [12] M. George: Terminology and other language resources. Lexical Resource Markup Framework. ISO. 16 p. (2003).
  • [13] D. Gibbon and Th. Trippel: A multi-view hyper-lexicon resource for speech and language system development. In Proc. of LREC. Athens, (2000), 1713-1718.
  • [14] M. Gross: Lexicon-Grammar. The Representation of Compound Words. In Proc. of COLING. Bonn, (1986), 1-6.
  • [15] C. Grover and A. Lascarides: XML-Based Data Preparation for Robust Deep Parsing. In Proc. Joint EACL-ACL Meeting, Toulouse. (2001).
  • [16] L. Hayashi and J. Hatton: Combining UML, XML and relational database technologies. The best of all worlds for robust linguistic databases. In Proc. IRCS Workshop on Linguistic Databases, (2001).
  • [17] H.-G. Huh and E. Laporte: A resource-based Korean morphological annotation system. In Proc. Int. Joint Conf. on Natural Language Processing, Jeju. Korea, (2005).
  • [18] N. Ide and L. Romary: Standards for language resources. In Proc. LREC. Las Palmas, (2002), 839-844.
  • [19] N. Ide and J. Veronis: Text Encoding Initiative: Background and Context. Dordrecht: Kluwer, 1995.
  • [20] D. Jurafsky and J. Martin: Speech and language processing. Prentice Hall, 2000.
  • [21] E. Laporte: Symbolic natural language processing. In Applied Combinatorics on Words. Lothaire, Cambridge Univ. Press, (2005), 153-195.
  • [22] K. Lee, H. Bunt, S. Bauman, L. Burnard, L. Clement, E. de la Clergerie, Th. Declerck, L. Romary, A. Roussanaly and C. Roux: Towards an international standard on feature structure representation. In Proc. of LREC, (2004), 373-376.
  • [23] W. Lezius: Morphy: German Morphology, Part-of-Speech Tagging and Applications. In Proc. EURALEX. Stuttgart, (2000), 619-623.
  • [24] Ch. Lieske, S. McCormick and G. Thurmair: The Open Lexicon Interchange Format (OLIF) Comes of Age. Machine Translation Summit VIII, (2001).
  • [25] E. Loper and S. Bird: NLTK: the Natural Language Toolkit. In Proc. ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, Philadelphia. (2002).
  • [26] C. Lucchesi and T. Kowaltowski: Applications of finite automata representing large vocabularies. Software - Practice and Experience, 23(1). Wiley & Sons. (1993), 15-30.
  • [27] M. Marcus, B. Santorini and M. A. Marcinkiewicz: Building a large annotated corpus of English: The Penn Treebank, Computational Linguistics, 19(2), (1993), 313-330.
  • [28] S. Nirenburg: The Subworld Concept Lexicon and the Lexicon Management System. Computational Linguistics, 13(3-4), (1987).
  • [29] B. Normier and M. Nossin: GENELEX Project: Eureka for Linguistic Engineering. Proc. Int. Workshop on Electronic Dictionaries, Oiso, Kanagawa, Japan, (1990), 63-70.
  • [30] K. Oflazer and Sh. Inkelas: A Finite State Pronunciation Lexicon for Turkish. In Proc. EACL Workshop on Finite State Methods in NLP. Budapest, (2003).
  • [31] S. Paumier: Unitex. Manuel d'utilisation. Research report, (2002).
  • [32] H. Poirier: The XELDA framework. (1999). http://www.dcs.shef.ac.uk/~hamish/dalr/baslow/xelda.pdf
  • [33] M. F. Porter: An algorithm for suffix stripping. Program, 14(3), (1980), 130-137.
  • [34] U. Quasthoff: Tools for Automatic Lexicon Maintenance: Acquisition, Error Correction, and the Generation of Missing Values. In Proc. LREC, (1998), 853-856.
  • [35] D. Revuz: Minimization of acyclic deterministic automata in linear time. Theoretical Computer Science, 92(1), (1992), 181-189.
  • [36] L. Romary: Towards an Abstract Representation of Terminological Data Collections. The TMF model. TAMA. Antwerp. 2001.
  • [37] M. Silberztein: A new approach to tagging: the use of a large-coverage electronic dictionary. Applied Computer Translation, 1(4), (1991).
  • [38] M. Silberztein: INTEX: a corpus processing system. In Proc. COLING. Kyoto, (1994).
  • [39] M. Silberztein: INTEX: an FST toolbox. Theoretical Computer Science, 231(1), (2000), 33-46.
  • [40] C. Vertan and W. von Hahn: Towards a Generic Architecture for Lexicon Management. In Proc. LREC, (2002), 45-48.
  • [41] P. Wittenburg, W. Peters and S. Drude: Analysis of Lexical Structures from Field Linguistics and Language Engineering. In Proc. LREC, (2002).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BSW3-0021-0007