Article title
Authors
Identifiers
Title variants
Conference
Human Language Technologies as a challenge for Computer Science and Linguistics (2; 21-23.04.2005; Poznań, Poland)
Publication languages
Abstracts
International standards for lexicon formats are in preparation. To a certain extent, the proposed formats converge with prior results of standardization projects. However, their adequacy for (i) lexicon management and (ii) lexicon-driven applications has been little debated in the past, and it is not being debated as part of the present standardization effort. We examine these issues. IGM has developed XML formats compatible with the emerging international standards, and we report experimental results on large-coverage lexicons.
Keywords
Journal
Year
Volume
Pages
337-348
Physical description
Bibliography: 41 items.
Contributors
author
- Institut Gaspard-Monge (IGM), University of Marne-la-Vallée, France, eric.laporte@univ-mlv.fr
Bibliography
- [1] A. W. Appel and G. J. Jacobson: The world's fastest Scrabble program. Comm. ACM, 31(5), (1988), 572-578 & 585.
- [2] S. Bird and E. Loper: NLTK: the Natural Language Toolkit. In Proc. of ACL, (2004).
- [3] O. Blanc: Rapport d'avancement Outilex. IGM, 2003.
- [4] O. Blanc and A. Dister: Automates lexicaux avec structure de traits. In RECITAL 2004, (2004), 23-32.
- [5] Ch. Boitet, M. Mangeot and G. Serasset: The PAPILLON project: cooperatively building a multilingual lexical database to derive open source dictionaries & lexicons. In COLING Workshop on NLP and XML, Taipei, Taiwan, (2002), 93-96.
- [6] T. Briscoe: Lexical issues in Natural Language Processing. In E. Klein and F. Veltman, (Eds). Natural Language and Speech. Springer. (1991).
- [7] B. Courtois: Un système de dictionnaires électroniques pour les mots simples du français. Langue Française, 87. Paris, Larousse, (1990).
- [8] H. Cunningham: GATE, a general architecture for text engineering. Computers and the Humanities, 36 (2002), 223-254.
- [9] M. Domenig: Word Manager: A System for the Definition, Access and Maintenance of Lexical Databases. In Proc. of COLING, Budapest, 1 (1988).
- [10] W. N. Francis and H. Kucera: Manual of Information to Accompany a Standard Corpus of Present-Day Edited American English, for use with Digital Computers (Corrected and Revised edition). Department of Linguistics, Brown University, Providence, Rhode Island, 1979.
- [11] G. Francopoulo: Proposition de norme des lexiques pour le traitement automatique du langage. AFNOR, 21 p. (2003).
- [12] M. George: Terminology and other language resources. Lexical Resource Markup Framework. ISO. 16 p. (2003).
- [13] D. Gibbon and Th. Trippel: A multi-view hyper-lexicon resource for speech and language system development. In Proc. of LREC. Athens, (2000), 1713-1718.
- [14] M. Gross: Lexicon-Grammar: The Representation of Compound Words. In Proc. of COLING, Bonn, (1986), 1-6.
- [15] C. Grover and A. Lascarides: XML-Based Data Preparation for Robust Deep Parsing. In Proc. Joint EACL-ACL Meeting, Toulouse. (2001).
- [16] L. Hayashi and J. Hatton: Combining UML, XML and relational database technologies. The best of all worlds for robust linguistic databases. In Proc. IRCS Workshop on Linguistic Databases, (2001).
- [17] H.-G. Huh and E. Laporte: A resource-based Korean morphological annotation system. In Proc. Int. Joint Conf. on Natural Language Processing, Jeju, Korea, (2005).
- [18] N. Ide and L. Romary: Standards for language resources. In Proc. LREC. Las Palmas, (2002), 839-844.
- [19] N. Ide and J. Veronis: Text Encoding Initiative: Background and Context. Dordrecht: Kluwer, 1995.
- [20] D. Jurafsky and J. Martin: Speech and language processing. Prentice Hall, 2000.
- [21] E. Laporte: Symbolic natural language processing. In Applied Combinatorics on Words. Lothaire, Cambridge Univ. Press, (2005), 153-195.
- [22] K. Lee, H. Bunt, S. Bauman, L. Burnard, L. Clement, E. de la Clergerie, Th. Declerck, L. Romary, A. Roussanaly and C. Roux: Towards an international standard on feature structure representation. In Proc. of LREC, (2004), 373-376.
- [23] W. Lezius: Morphy: German Morphology, Part-of-Speech Tagging and Applications. In Proc. EURALEX, Stuttgart, (2000), 619-623.
- [24] Ch. Lieske, S. McCormick and G. Thurmair: The Open Lexicon Interchange Format (OLIF) Comes of Age. Machine Translation Summit VIII, (2001).
- [25] E. Loper and S. Bird: NLTK: the Natural Language Toolkit. In Proc. ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, Philadelphia. (2002).
- [26] C. Lucchesi and T. Kowaltowski: Applications of finite automata representing large vocabularies. Software - Practice and Experience, 23(1). Wiley & Sons. (1993), 15-30.
- [27] M. Marcus, B. Santorini and M. A. Marcinkiewicz: Building a large annotated corpus of English: The Penn Treebank, Computational Linguistics, 19(2), (1993), 313-330.
- [28] S. Nirenburg: The Subworld Concept Lexicon and the Lexicon Management System. Computational Linguistics, 13(3-4), (1987).
- [29] B. Normier and M. Nossin: GENELEX Project: Eureka for Linguistic Engineering. Proc. Int. Workshop on Electronic Dictionaries, OISA, Kanagawa, Japan, (1990), 63-70.
- [30] K. Oflazer and Sh. Inkelas: A Finite State Pronunciation Lexicon for Turkish. In Proc. EACL Workshop on Finite State Methods in NLP, Budapest, (2003).
- [31] S. Paumier: Unitex. Manuel d'utilisation. Research report, (2002).
- [32] H. Poirier: The XELDA framework. (1999). http:// www.dcs.shcf.ac.uk/ hamish/dalr/baslow/xelda.pdf
- [33] M. F. Porter: An algorithm for suffix stripping. Program, 14(3), (1980), 130-137.
- [34] U. Quasthoff: Tools for Automatic Lexicon Maintenance: Acquisition, Error Correction, and the Generation of Missing Values. In Proc. LREC, (1998), 853-856.
- [35] D. Revuz: Minimization of acyclic deterministic automata in linear time. Theoretical Computer Science, 92(1), (1992), 181-189.
- [36] L. Romary: Towards an Abstract Representation of Terminological Data Collections. The TMF model. TAMA. Antwerp. 2001.
- [37] M. Silberztein: A new approach to tagging: the use of a large-coverage electronic dictionary. Applied Computer Translation, 1(4), (1991).
- [38] M. Silberztein: INTEX: a corpus processing system. In Proc. COLING. Kyoto, (1994).
- [39] M. Silberztein: INTEX: an FST toolbox. Theoretical Computer Science, 231(1), (2000), 33-46.
- [40] C. Vertan and W. von Hahn: Towards a Generic Architecture for Lexicon Management. In Proc. LREC, (2002), 45-48.
- [41] P. Wittenburg, W. Peters and S. Drude: Analysis of Lexical Structures from Field Linguistics and Language Engineering. In Proc. LREC, (2002).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BSW3-0021-0007