A Canonical Semi-Deterministic Transducer

Beros, A. A.; De la Higuera, C.

doi:10.3233/FI-2016-1394

Artykuł - szczegóły

Tytuł artykułu

A Canonical Semi-Deterministic Transducer

Autorzy

Beros A. A. , De la Higuera C.

Wybrane pełne teksty z tego czasopisma

https://fi.episciences.org/

Identyfikatory

DOI

10.3233/FI-2016-1394

Warianty tytułu

Języki publikacji

Abstrakty

We prove the existence of a canonical form for semi-deterministic transducers with sets of pairwise incomparable output strings. Based on this, we develop an algorithm which learns semi-deterministic transducers given access to translation queries. We also prove that there is no learning algorithm for semi-deterministic transducers that uses only domain knowledge.

Słowa kluczowe

grammatical inference semi-deterministic transducers

Wydawca

Polskie Towarzystwo Matematyczne

Czasopismo

Fundamenta Informaticae

Rocznik

2016

Tom

Vol. 146, nr 4

Strony

431--459

Opis fizyczny

Bibliogr. 59 poz., rys.

Twórcy

autor

Beros A. A.

beros@math.hawaii.edu

Department of Mathematics, University of Hawai‘i at Mānoa, 2565 McCarthy Mall, Keller Hall 401A, Honolulu, Hawaii 96822, USA

autor

De la Higuera C.

cdlh@univ-nantes.fr

Laboratoire d’Informatique de Nantes Atlantique, Université de Nantes, 2 rue de la Houssiniėre BP 92208, 44322 Nantes Cedex 03, France

Bibliografia

[1] Nivat M. Transductions des langages de Chomsky. In: Annales de l’institut Fourier. Institut Fourier. 1968; 18(1):339-455. Available from: http://eudml.org/doc/73950.
[2] Berstel J. Transductions and context-free languages. Vieweg, Teubner Verlag, Leipzig; 1979. doi: 10.1007/978-3-663-09367-1.
[3] Sakarovitch J. Elements of Automata Theory. Cambridge University Press; 2009. ISBN-0521844258, 9780521844253.
[4] Mohri M. Finite-state transducers in language and speech processing. Computational Linguistics. 1997; 23(3):269-311. Available from: http://dl.acm.org/citation.cfm?id=972695.972698.
[5] Mohri M. Minimization algorithms for sequential transducers. Theoretical Computer Science. 2000; 234:177-201. doi:10.1016/S0304-3975(98)00115-7.
[6] Mohri M, Pereira FCN, Riley M. A Design Principles of a Weighted Finite-State Transducer Library. Theoretical Computer Science. 2000;231(1):17-32. doi:10.1016/S0304-3975(99)00014-6.
[7] Roark B, Sproat R. Computational Approaches to Syntax and Morphology. Oxford University Press; 2007. ISBN-13:978-0199274772, 10:0199274770.
[8] Amengual JC, Benedí JM, Casacuberta F, Castaño A, Castellanos A, Jiménez VM, et al. The EuTrans-I Speech Translation System. Machine Translation. 2001;15(1):75-103. doi:10.1023/A:1011116115948.
[9] Casacuberta F, Vidal E. Machine Translation with Inferred Stochastic Finite-State Transducers. Computational Linguistics. 2004;30(2):205-225. doi:10.1162/089120104323093294.
[10] Clark A. Partially Supervised Learning of Morphology with Stochastic Transducers. In: Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium. 2001. p. 341-348.
[11] Carme J, Gilleron R, Lemay A, Niehren J. Interactive Learning of Node Selecting Tree Transducer. Machine Learning Journal. 2007;66(1):33-67. doi:10.1007/s10994-006-9613-8.
[12] Bernard M, Janodet JC, Sebban M. A Discriminative Model of Stochastic Edit Distance in the Form of a Conditional Transducer. In: Sakakibara Y, Kobayashi S, Sato K, Nishino T, Tomita E, editors. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’06. vol. 4201 of LNAI. Springer-Verlag; 2006. p. 240-252. doi:10.1007/11872436 _20.
[13] Oncina J, García P, Vidal E. Learning Subsequential Transducers for Pattern Recognition Interpretation Tasks. Pattern Analysis and Machine Intelligence. 1993;15(5):448-458. doi:10.1109/34.211465.
[14] Vilar JM, Jiménez VM, Amengual J, Castellanos A, Llorens D, Vidal E. Text and speech translation by means of subsequential transducers. Natural Language Engineering. 1996;2(4):351-354.
[15] Oncina J, Varó MA. Using domain information during the learning of a subsequential transducer. In: Miclet L, de la Higuera C, editors. Proceedings of ICGI ’96. No. 1147 in LNAI. Springer-Verlag; 1996. p.301-312.
[16] Kermorvant C, de la Higuera C. Learning Languages with Help. In: Adriaans P, Fernau H, van Zaannen M, editors. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’02. vol. 2484 of LNAI. Springer-Verlag; 2002. p. 161-173.
[17] Coste F, Fredouille D, Kermorvant C, de la Higuera C. Introducing Domain and Typing Bias in Automata Inference. In: Paliouras G, Sakakibara Y, editors. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’04. vol. 3264 of LNAI. Springer-Verlag; 2004. p. 115-126. doi:10.1007/978-3-540-30195-0_11.
[18] Vilar JM. Query learning of subsequential transducers. In: Miclet L, de la Higuera C, editors. Proceedings of ICGI ’96. No. 1147 in LNAI. Springer-Verlag; 1996. p. 72-83.
[19] Vilar JM. Improve the learning of subsequential transducers by using alignments and dictionaries. In: de Oliveira AL, editor. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’00. vol. 1891 of LNAI. Springer-Verlag; 2000. p. 298-312.
[20] Starkie B, van Zaanen M, Estival D. The Tenjinno Machine Translation Competition. In: Sakakibara Y, Kobayashi S, Sato K, Nishino T, Tomita E, editors. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’06. vol. 4201 of LNAI. Springer-Verlag; 2006. p. 214-226. doi: 10.1007/11872436_18.
[21] Clark A. Large Scale Inference of Deterministic Transductions: Tenjinno problem 1. In: Sakakibara Y, Kobayashi S, Sato K, Nishino T, Tomita E, editors. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’06. vol. 4201 of LNAI. Springer-Verlag; 2006. p. 227-239.
[22] Casacuberta F, de la Higuera C. Optimal linguistic decoding is a difficult computational problem. Pattern Recognition Letters. 1999;20(8):813-821. doi:10.1016/S0167-8655(99)00045-8.
[23] Casacuberta F, de la Higuera C. Computational complexity of problems on probabilistic grammars and transducers. In: de Oliveira AL, editor. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’00. vol. 1891 of LNAI. Springer-Verlag; 2000. p. 15-24. ISBN: 3-540-41011-2.
[24] Allauzen C, Mohri M. p-Subsequentiable Transducers. In: Implementation and Application of Automata, 7th International Conference, CIAA 2002, Revised Papers. vol. 2608 of LNCS. Springer-Verlag; 2002. p.24-34. doi: 10.1007/3-540-44977-9_2.
[25] Akram HI. Learning Probabilistic Subsequential Transducers. Technische Universität München; 2013.
[26] Kunen K. Set theory: an introduction to independence proofs. vol. 102 of Studies in Logic and the Foundations of Mathematics. North-Holland Publishing Co., Amsterdam-New York; 1980.
[27] Jech T. Set theory. Springer Monographs in Mathematics. Berlin: Springer-Verlag; 2003. The third millennium edition, revised and expanded. ISBN: 978-3-540-44085-7, 978-3-540-44761-0. doi: 10.1007/3-540-44761-X.
[28] Beros A. Anomalous Vacillatory Learning. Journal of Symbolic Logic. 2013 12;78(4):1183-1188. Available from: http://journals.cambridge.org/article_S0022481200126490, doi: 10.2178/jsl/1058448450.
[29] Beros A. Learning Theory in the Arithmetic Hierarchy. Journal of Symbolic Logic. 2014 9;79(3):908-927. Available from: http://journals.cambridge.org/article_S0022481214000231, doi: 10.1017/jsl.2014.23.
[30] Beros A, de la Higuera C. A canonical semi-deterministic transducer. arXiv:1405.2476. 2014.
[31] Nerode A. Linear Automaton Transformations. Proceedings of the American Mathematical Society. 1958; 9(4):541-544.
[32] Blum M, Blum L. Towards a mathematical theory of inductive inference. Information and Control. 1975; 28:125-155.
[33] Gold EM. Language Identification in the Limit. Information and Control. 1967;10(5):447-474.
[34] Valiant LG. A theory of the learnable. Communications of the Association for Computing Machinery. 1984; 27(11):1134-1142.
[35] Angluin D. Queries and Concept Learning. Machine Learning Journal. 1987;2:319-342. doi: 10.1007/BF00116828.
[36] de la Higuera C. Characteristic Sets for Polynomial Grammatical Inference. Machine Learning Journal. 1997;27:125-138. doi: 10.1023/A:1007353007695.
[37] de la Higuera C. Grammatical Inference: Learning Automata and Grammars. Cambridge University Press; 2010. ISBN: 0521763169, 9780521763165.
[38] Sakarovitch J, De Souza R. On the decidability of bounded valuedness for transducers. In: Mathematical Foundations of Computer Science 2008. vol. 5162 of LNCS. Springer; 2008. p. 588-600. doi:10.1007/978-3-540-85238-4_48.
[39] Akram HI, de la Higuera C, Eckert C. Actively Learning Probabilistic Subsequential Transducers. In: Heinz J, de la Higuera C, Oates T, editors. Proceedings of the Eleventh International Conference on Grammatical Inference, University of Maryland, College Park, United States. vol. 21, 2012. p. 19-33. Available from: http://jmlr.csail.mit.edu/proceedings/papers/v21/akram12a/akram12a.pdf.
[40] Akram HI, de la Higuera C. Learning Probabilistic Subsequential Transducers from Positive Data. In: Proceedings of ICAART 2013, Barcelona; 2012. Available from: http://dblp.uni-trier.de/rec/bib/conf/icaart/AkramH13.
[41] Alshawi H, Bangalore S, Douglas S. Learning Dependency Translation Models as Collections of Finite State Head Transducers. Computational Linguistics. 2000;26(1):45-60. doi: 10.1162/ 089120100561629.
[42] Alshawi H, Bangalore S, Douglas S. Head Transducer Model for Speech Translation and their Automatic Acquisition from Bilingual Data. Machine Translation. 2000;15(1-2):105-124. doi: 10.1023/A:1011187330969.
[43] Angluin D. A Note on the Number of Queries Needed to Identify Regular Languages. Information and Control. 1981;51:76-87.
[44] Bailly R, Carreras X, Quattoni A. Unsupervised Spectral Learning of Finite State Transducers. In: Advances in Neural Information Processing Systems 26; 2013. p. 800-808. Available from: http://papers.nips.cc/paper/4862-unsupervised-spectral-learning-of-finite-state-transducers.pdf.
[45] Balle B, Quattoni A, Carreras X. A Spectral Learning Algorithm for Finite State Transducers. In: ECML/PKDD (1). vol. 6911 of LNCS. Springer-Verlag; 2011. p. 156-171. doi: 10.1007/978-3-642-23780-5_20.
[46] Carme J, Gilleron R, Lemay A, Niehren J. Interactive learning of node selecting tree transducer. Machine Learning Journal. 2007;66(1):33-67. doi:10.1007/s10994-006-9613-8.
[47] Casacuberta F. Inference of finite-state transducers by using regular grammars and morphisms. In: de Oliveira AL, editor. Grammatical Inference: Algorithms and Applications, Proceedings of ICGI ’00. vol. 1891 of LNAI. Springer-Verlag; 2000. p. 1-14. doi: 10.1007/978-3-540-45257-7_1.
[48] Castellanos A, Vidal E, Varó MA, Oncina J. Language Understanding and Subsequential Transducer Learning. Computer Speech and Language. 1998;12:193-228. doi: 10.1006/csla.1998.0045.
[49] Chidlovskii B, Ragetli J, de Rijke M. Wrapper Generation via Grammar Induction. In: Proceedings of ECML 2000. vol. 1810. Springer-Verlag; 2000. p. 96-108. doi: 10.1007/3-540-45164-1_11.
[50] Chidlovskii B. Wrapper generation by k-reversible grammar induction. In: Proceedings of the Workshop on Machine Learning and Information Extraction; 2000.
[51] Chidlovskii B. Schema Extraction from XML: A Grammatical Inference Approach. In: Lenzerini M, Nardi D, Nutt W, Suciu D, editors. Proceedings of KRDB 2001. vol. 45 of CEUR Workshop Proceedings; 2001.
[52] Clark A. Learning Morphology with Pair Hidden Markov Models. In: ACL (Companion Volume). CNRS, Toulouse, France; 2001. p. 55-60. Available from: papers/eacl01.pdf.
[53] Eisner J. Parameter Estimation for Probabilistic Finite-State Transducers. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. ACL; 2002. p. 1-8. doi: 10.3115/1073083.1073085.
[54] Gold EM. Complexity of Automaton Identification from Given Data. Information and Control. 1978;37: 302-320.
[55] de la Higuera C. Grammatical inference: learning automata and grammars. Cambridge University Press; 2010. ISBN: 0521763169, 9780521763165.
[56] Kempe A. Finite State Transducers: Approximating Hidden Markov Models. In: 35th Annual Meeting of the ACL and 8th Conference of ECACL. ACL; 1997. p. 460-467. doi: 10.3115/979617.979676.
[57] Jech T. Set theory, Second Edition. Perspectives in Mathematical Logic. Springer; 1997. doi: 10.1007/978-3-662-22400-7.
[58] Mohri M. Generic e-Removal and Input e-Normalization Algorithms for Weighted Transducers. International Journal on Foundations of Computer Science. 2002;13(1):129-143. doi: http://dx.doi.org/10.1142/S0129054102000996.
[59] Oncina J. The Data Driven Approach Applied to the Ostia Algorithm. In: Honavar V, Slutski G, editors. Grammatical Inference, 4th International Colloquium, ICGI-98 Ames, Iowa, USA, July 1214, 1998 Proceedings, p. 50-56. doi: 10.1007/BFb0054063.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-19eea44d-99dc-494e-8bb4-cbec720439bc