Eksperymenty z zakresu klasyfikacji czasowników w semantycznym słowniku walencyjnym polskiego

Hajnicz, E.

Article details

Selected full texts from this journal

http://ipipan.waw.pl/instytut/wydawnictwo/prace-ipi-pan

Title variants

Experiments on classifying verbs in a semantic dictionary of Polish

Abstract

This report describes initial experiments on the syntactic-semantic classification of Polish verbs. It first reviews existing work on the subject, most of which concerns English. It then describes Grade Correspondence-Cluster Analysis, the method used for classification, and presents the syntactic-semantic valence dictionary of Polish verbs that serves as the data source. Finally, it presents the classification experiments themselves together with their evaluation.

Keywords

computational linguistics, lexical semantics, electronic valence dictionaries, verb classification, selectional preferences, wordnet

Publisher

Instytut Podstaw Informatyki PAN

Journal

Prace Instytutu Podstaw Informatyki Polskiej Akademii Nauk

Year

2011

Volume

No. 1021

Pages

1-32

Physical description

Bibliography: 46 items.

Contributors

author

Hajnicz E.

Instytut Podstaw Informatyki PAN, ul. Ordona 21, 01-237 Warszawa, Poland, Elzbieta.Hajnicz@ipipan.waw.pl

Bibliography

ARPA (1994) Proceedings of the ARPA Human Language Technology Workshop, Morgan Kaufmann, Princeton, NJ.
C. F. Baker, C. J. Fillmore, J. B. Lowe (1998) The Berkeley FrameNet Project, in: Proceedings of COLING-ACL'98, pp. 86-90, Montreal, Canada.
G. Carroll, M. Rooth (1998) Valence Induction with a Head-Lexicalized PCFG, in: Proceedings of EMNLP-1998, pp. 36-45, Granada, Spain.
A. Ciok, T. Kowalczyk, E. Pleszczyńska, W. Szczęsny (1995) Algorithms of grade correspondence-cluster analysis, Archiwum Informatyki Teoretycznej i Stosowanej, vol. 7, no. 1-4, pp. 5-22.
M. J. Collins (1997) Three generative, lexicalised models for statistical parsing, in: Proceedings of ACL'97, pp. 16-23, Madrid, Spain.
H. T. Dang (2004) Investigations into the Role of Lexical Semantics in Word Sense Disambiguation, PhD thesis, Computer and Information Science Department, University of Pennsylvania.
A. P. Dempster, N. M. Laird, D. B. Rubin (1977) Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol. 39, pp. 185-197.
C. J. Fillmore, C. R. Johnson, M. R. L. Petruck (2003) Background to FrameNet, International Journal of Lexicography, vol. 16, no. 3, pp. 235-250.
D. J. Gildea (2002) Probabilistic model of verb-argument structure, in: Proceedings of CoNLL-2002, pp. 308-314, Taipei, Taiwan.
E. Hajnicz (2009a) Problems with Pruning in Automatic Creation of Semantic Valence Dictionary for Polish, in: V. Matousek, P. Mautner (eds.), Proceedings of TSD 2009, vol. 5729 of Lecture Notes in Artificial Intelligence, pp. 131-138, Springer-Verlag, Pilsen, Czech Republic.
E. Hajnicz (2009b) Semantic annotation of verb arguments in shallow parsed Polish sentences by means of EM selection algorithm, in: M. Marciniak, A. Mykowiecka (eds.), Aspects of Natural Language Processing, vol. 5070 of Lecture Notes in Computer Science, pp. 211-240, Springer-Verlag.
E. Hajnicz (2011) Proces tworzenia semantycznego słownika walencyjnego, Inżynieria Lingwistyczna, Akademicka Oficyna Wydawnicza Exit, Warsaw.
E. Hajnicz, M. Woliński (2009) How Valence Information Influences Parsing Polish with Świgra, in: M. A. Kłopotek, A. Przepiórkowski, S. T. Wierzchoń, K. Trojanowski (eds.), Recent Advances in Intelligent Information Systems, Challenging Problems in Science: Computer Science, pp. 193-206, Akademicka Oficyna Wydawnicza Exit, Warsaw.
T. Hofmann, J. Puzicha (1998) Statistical Models for co-occurrence data, Memo.
J. Hughes (1994) Automatically Acquiring Classification of Words, PhD thesis, School of Computer Studies, University of Leeds, Leeds.
E. Joanis, S. Stevenson (2003) A general feature space for automatic verb classification, in: Proceedings of EACL-2003, pp. 163-170, Budapest, Hungary.
E. Joanis, S. Stevenson, D. James (2008) A general feature space for automatic verb classification, Natural Language Engineering, vol. 14, no. 3, pp. 337-367.
F. Keller, M. Corlay, S. Corlay, M. W. Crocker, S. Trevin (1999) Gsearch: a tool for syntactic investigation of unparsed corpora, in: Proceedings of EACL-1999, pp. 56-63, Bergen, Norway.
M. Kim, H. Yoo, R. S. Ramakrishna (2004) Cluster Validation for High-Dimensional Datasets, in: C. Bussler, D. Fensel (eds.), Proceedings of the 11th International Conference on Artificial Intelligence Methods, Systems and Applications, vol. 3192 of Lecture Notes in Computer Science, pp. 178-187, Springer-Verlag, Berlin / Heidelberg.
N. Kolev, B. Vaz de Mendes, U. dos Anjos (2006) Copulas: a Review and Recent Developments, Stochastic Models, vol. 22, no. 4, pp. 617-660.
T. Kowalczyk, E. Pleszczyńska, F. Ruland (eds.) (2004) Grade Models and Methods for Data Analysis. With Applications for the Analysis of Data Populations, Studies in Fuzziness and Soft Computing, Springer-Verlag, Berlin Heidelberg New York.
J. Książyk, O. Matyja, E. Pleszczyńska, M. Wiech (2005) Analiza danych medycznych i demograficznych przy użyciu programu GradeStat, Instytut Podstaw Informatyki, Polska Akademia Nauk, Instytut „Pomnik - Centrum Zdrowia Dziecka", Warsaw.
M. Lapata, C. Brew (1999) Using subcategorization to resolve verb class ambiguity, in: Proceedings of the Joint SIGDAT Conference on Empirical methods in NLP and Very Large Corpora, pp. 266-274, College Park, MD.
M. Lapata, C. Brew (2004) Verb class disambiguation using informative priors, Computational Linguistics, vol. 30, no. 1, pp. 45-73.
B. Levin (1993) English verb classes and alternations: a preliminary investigation, University of Chicago Press, Chicago, IL.
M. P. Marcus (1994) The Penn TreeBank: A revised corpus design for extracting predicate-argument structure, in: ARPA.
M. P. Marcus, G. Kim, M. A. Marcinkiewicz, R. MacIntyre, A. Bies, M. Ferguson, K. Katz, B. Schasberger (1994) The Penn Treebank: Annotating predicate argument structure, in: ARPA, pp. 114-119.
O. Matyja (2003) Smooth Grade Correspondence Analysis and Related Computer System, PhD thesis, Instytut Podstaw Informatyki, Polska Akademia Nauk.
D. McCarthy (2001) Lexical Acquisition at the Syntax-Semantics Interface: Diathesis Alternations, Subcategorization Frames and Selectional Preferences, PhD thesis, University of Sussex.
P. Merlo, S. Stevenson (2001) Automatic verb classification based on statistical distributions of argument structure, Computational Linguistics, vol. 27, no. 3, pp. 373-408.
P. Merlo, S. Stevenson, V. Tsang, G. Allaria (2002) A multilingual paradigm for automatic verb classification, in: Proceedings of ACL'02, pp. 207-214, Philadelphia, PA.
M. Niewiadomska-Bugaj, T. Kowalczyk (2005) On grade transformation and its implications for copulas, Brazilian Journal of Probability and Statistics, vol. 19, pp. 125-137.
M. Piasecki, S. Szpakowicz, B. Broda (2009) A Wordnet from the Ground Up, Oficyna Wydawnicza Politechniki Wrocławskiej, Wrocław.
P. Resnik (1993) Selection and Information: A Class-Based Approach to Lexical Relationships, PhD thesis, University of Pennsylvania, Philadelphia, PA.
F. Ribas (1994) An Experiment on Learning Appropriate Selectional Restrictions from Parsed Corpus, in: Proceedings of COLING-1994, pp. 769-774, Kyoto, Japan.
F. Ribas (1995a) On Acquiring Appropriate Selectional Restrictions from Corpora Using a Semantic Taxonomy, PhD thesis, University of Catalonia.
F. Ribas (1995b) On Learning More Appropriate Selectional Restrictions, in: Proceedings of EACL'95, pp. 112-118, Dublin, Ireland.
M. Rooth (1998) Two-Dimensional Clusters in Grammatical Relations, Tech. rep. 3, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.
M. Rooth, S. Riezler, D. Prescher, G. Carroll, F. Beil (1999) Inducing a semantically annotated lexicon via EM-based clustering, in: Proceedings of ACL'99, pp. 104-111, College Park, MD.
L. Shi, R. Mihalcea (2005) Putting Pieces Together: Combining FrameNet, VerbNet and PropBank for Robust Semantic Parsing, in: A. Gelbukh (ed.), Proceedings of CICLing-2005, vol. 3406 of Lecture Notes in Computer Science, pp. 100-111, Springer-Verlag, Heidelberg.
S. Stevenson, P. Merlo (1997) Lexical structure and parsing complexity, Language and Cognitive Processes, vol. 12, no. 2/3, pp. 349-399.
S. Stevenson, P. Merlo (1999) Automatic Verb Classification Using Distributions of Grammatical Features, in: Proceedings of EACL'99, pp. 45-52, Bergen, Norway.
R. S. Swier, S. Stevenson (2004) Unsupervised semantic role labelling, in: Proceedings of EMNLP-2004, pp. 95-102, Barcelona, Spain.
S. Schulte im Walde (2000) Clustering verbs semantically according to their alternation behaviour, in: Proceedings of COLING-2000, pp. 747-753, Saarbrücken, Germany.
B. Zapirain, E. Agirre, L. Marquez (2008) Robustness and generalization of role sets: PropBank vs. VerbNet, in: Proceedings of ACL'08, pp. 550-558, Columbus, OH.
Y. Zhao, G. Karypis (2001) Criterion Functions for Document Clustering: Experiments and Analysis, Tech. rep. #01-40, University of Minnesota, Minneapolis, MN.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUJ8-0024-0069