Article title

Survey: Weighted Extended Top-down Tree Transducers. Part II. Application in Machine Translation

Authors
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
In this second part of the survey, we present the application of weighted extended top-down tree transducers in machine translation, that is, the automatic translation of natural-language texts. We present several formal properties that are relevant to machine translation and evaluate the weighted extended top-down tree transducer against those criteria. In addition, we demonstrate how to extract rules for an extended top-down tree transducer from existing linguistic data and how to obtain suitable rule weights automatically from similar information. Overall, the aim of the survey is twofold. First, it provides a synopsis that illustrates how theory (tree transducers) and practice (machine translation) interact in this particular example. Second, it presents a uniform and simplified treatment of the rule extraction and training algorithms that is accessible to the non-expert. Additional details can be found in the original results that are referenced throughout the text.
Publisher
Year
Pages
239-261
Physical description
Bibliography: 107 items, charts.
Contributors
author
Bibliography
  • [1] Alexandrakis, A., Bozapalidis, S.: Weighted grammars and Kleene's theorem, Inf. Process. Lett., 24(1), 1987, 1-4.
  • [2] Allauzen, C., Riley, M., Schalkwyk, J., Skut, W., Mohri, M.: OpenFst - A general and efficient weighted finite-state transducer library, Proc. CIAA, Springer, 2007, 11-23.
  • [3] Alshawi, H., Bangalore, S., Douglas, S.: Learning dependency translation models as collections of finite state head transducers, Computational Linguistics, 26(1), 2000, 45-60.
  • [4] Arnold, A., Dauchet, M.: Bi-transductions de forêts, Proc. ICALP, Edinburgh University Press, 1976, 74-86.
  • [5] Arnold, A., Dauchet, M.: Morphismes et bimorphismes d'arbres, Theoret. Comput. Sci., 20(1), 1982, 33-93.
  • [6] ATANLP Participants: Requirements on a tree transformation model for machine translation, available at: http://stp.lingfil.uu.se/atanlp/maletti-qf.pdf, 2010.
  • [7] Beesley, K. R., Karttunen, L.: Finite State Morphology, CSLI Publications, Palo Alto, 2003.
  • [8] Berstel, J., Reutenauer, C.: Recognizable formal power series on trees, Theoret. Comput. Sci., 18(2), 1982, 115-148.
  • [9] Bikel, D. M.: On the Parameter Space of Generative Lexicalized Statistical Parsing Models, Ph.D. Thesis, University of Pennsylvania, 2004.
  • [10] Blackwood, G. W., de Gispert, A., Brunning, J., Byrne, W.: Large-scale statistical machine translation with weighted finite state transducers, Proc. FSMNLP, IOS Press, 2009, 39-49.
  • [11] Bloem, R., Engelfriet, J.: A comparison of tree transductions defined by monadic second order logic and by attribute grammars, J. Comput. System Sci., 61(1), 2000, 1-50.
  • [12] Borchardt, B.: The Theory of Recognizable Tree Series, Ph.D. Thesis, Technische Universität Dresden, 2005.
  • [13] Brown, P. F., Cocke, J., Della Pietra, S. A., Della Pietra, V. J., Jelinek, F., Lafferty, J. D., Mercer, R. L., Roossin, P. S.: A statistical approach to machine translation, Computational Linguistics, 16(2), 1990, 79-85.
  • [14] Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., Mercer, R. L.: The mathematics of statistical machine translation: parameter estimation, Computational Linguistics, 19(2), 1993, 263-311.
  • [15] Charniak, E.: Statistical techniques for natural language parsing, AI Magazine, 18(4), 1997, 33-44.
  • [16] Charniak, E., Knight, K., Yamada, K.: Syntax-based language models for statistical machine translation, Proc. MT Summit IX, 2003.
  • [17] Chiang, D.: A hierarchical phrase-based model for statistical machine translation, Proc. ACL, Association for Computational Linguistics, 2005, 263-270.
  • [18] Chiang, D.: Hierarchical Phrase-Based Translation, Computational Linguistics, 33(2), 2007, 201-228.
  • [19] Chomsky, N.: Aspects of the Theory of Syntax, MIT Press, 1965.
  • [20] Cleophas, L.: Forest FIRE and FIRE wood: tools for tree automata and tree algorithms, Proc. FSMNLP, 2008, 191-198.
  • [21] Collins, M.: Head-Driven Statistical Models for Natural Language Parsing, Ph.D. Thesis, University of Pennsylvania, 1999.
  • [22] Comon, H., Dauchet, M., Gilleron, R., Löding, C., Jacquemard, F., Lugiez, D., Tison, S., Tommasi, M.: Tree Automata Techniques and Applications, available at: http://www.grappa.univ-lille3.fr/tata, 2007.
  • [23] Courcelle, B., Franchi-Zannettacci, P.: Attribute grammars and recursive program schemes I, Theoret. Comput. Sci., 17(1), 1982, 163-191.
  • [24] Courcelle, B., Franchi-Zannettacci, P.: Attribute grammars and recursive program schemes II, Theoret. Comput. Sci., 17(1), 1982, 235-257.
  • [25] Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms, J. Machine Learning Research, 7, 2006, 551-585.
  • [26] Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines, Cambridge University Press, 2000.
  • [27] Dauchet, M.: Transductions inversibles de forêts, Thèse 3ème cycle, Université de Lille, 1975.
  • [28] Davey, B. A., Priestley, H. A.: Introduction to Lattices and Order, 2nd edition, Cambridge University Press, 2002.
  • [29] Dempster, A. P., Laird, N. M., Rubin, D. B.: Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B, 39(1), 1977, 1-38.
  • [30] Drewes, F.: Grammatical Picture Generation - A Tree-Based Approach, Texts in Theoretical Computer Science. An EATCS Series, Springer, 2006.
  • [31] Duda, R. O., Hart, P. E.: Pattern Classification and Scene Analysis, John Wiley & Sons, New York, NY, 1973.
  • [32] Eilenberg, S.: Volume A: Automata, Languages, and Machines, vol. 59 of Pure and Applied Math., Academic Press, 1974.
  • [33] Engelfriet, J.: Bottom-up and top-down tree transformations: a comparison, Math. Systems Theory, 9(3), 1975, 198-231.
  • [34] Engelfriet, J.: Top-down tree transducers with regular look-ahead, Math. Systems Theory, 10(1), 1976, 289-303.
  • [35] Engelfriet, J.: Three hierarchies of transducers, Math. Systems Theory, 15(2), 1982, 95-125.
  • [36] Engelfriet, J., Fülöp, Z., Vogler, H.: Bottom-up and top-down tree series transformations, J. Autom. Lang. Combin., 7(1), 2002, 11-70.
  • [37] Engelfriet, J., Lilin, E., Maletti, A.: Composition and decomposition of extended multi bottom-up tree transducers, Acta Inform., 46(8), 2009, 561-590.
  • [38] Engelfriet, J., Maneth, S.: Macro tree transducers, attribute grammars, and MSO definable tree translations, Inform. and Comput., 154(1), 1999, 34-91.
  • [39] Engelfriet, J., Vogler, H.: Macro tree transducers, J. Comput. System Sci., 31(1), 1985, 71-146.
  • [40] Engelfriet, J., Vogler, H.: Modular tree transducers, Theoret. Comput. Sci., 78(2), 1991, 267-303.
  • [41] Frishert, M., Cleophas, L. G., Watson, B. W.: FIRE station: an environment for manipulating finite automata and regular expression views, Proc. CIAA, Springer, 2004, 125-133.
  • [42] Fülöp, Z.: On attributed tree transducers, Acta Cybernet., 5(1), 1981, 261-279.
  • [43] Fülöp, Z., Kühnemann, A., Vogler, H.: A bottom-up characterization of deterministic top-down tree transducers with regular look-ahead, Inf. Process. Lett., 91(2), 2004, 57-67.
  • [44] Fülöp, Z., Kühnemann, A., Vogler, H.: Linear deterministic multi bottom-up tree transducers, Theoret. Comput. Sci., 347(1-2), 2005, 276-287.
  • [45] Fülöp, Z., Vogler, H.: Weighted tree automata and tree transducers, in: Handbook of Weighted Automata (M. Droste, W. Kuich, H. Vogler, Eds.), chapter 9, Springer, 2009, 313-403.
  • [46] Galley, M., Graehl, J., Knight, K., Marcu, D., DeNeefe, S., Wang, W., Thayer, I.: Scalable inference and training of context-rich syntactic translation models, Proc. ACL, Association for Computational Linguistics, 2006, 961-968.
  • [47] Galley, M., Hopkins, M., Knight, K., Marcu, D.: What's in a translation rule?, Proc. NAACL, Association for Computational Linguistics, 2004, 273-280.
  • [48] Gécseg, F., Steinby, M.: Tree Automata, Akadémiai Kiadó, Budapest, 1984.
  • [49] Gécseg, F., Steinby, M.: Tree Languages, in: Handbook of Formal Languages (G. Rozenberg, A. Salomaa, Eds.), vol. 3, chapter 1, Springer, 1997, 1-68.
  • [50] Gildea, D.: Optimal parsing strategies for linear context-free rewriting systems, Proc. NAACL, Association for Computational Linguistics, 2010, 769-776.
  • [51] Gildea, D., Satta, G., Zhang, H.: Factoring synchronous grammars by sorting, Proc. ACL, Association for Computational Linguistics, 2006, 279-286.
  • [52] Golan, J. S.: Semirings and their Applications, Kluwer Academic, Dordrecht, 1999.
  • [53] Graehl, J.: Carmel finite-state toolkit, ISI/USC, http://www.isi.edu/licensed-sw/carmel, 1997.
  • [54] Graehl, J., Knight, K.: Training tree transducers, Proc. NAACL, Association for Computational Linguistics, 2004, 105-112. See also [55].
  • [55] Graehl, J., Knight, K., May, J.: Training tree transducers, Computational Linguistics, 34(3), 2008, 391-427.
  • [56] Hebisch, U., Weinert, H. J.: Semirings - Algebraic Theory and Applications in Computer Science, World Scientific, 1998.
  • [57] Iglesias, G., de Gispert, A., Banga, E. R., Byrne, W. J.: Hierarchical phrase-based translation with weighted finite state transducers, Proc. NAACL, Association for Computational Linguistics, 2009, 433-441.
  • [58] Juang, B.-H., Chou, W., Lee, C.-H.: Statistical and discriminative methods for speech recognition, in: Automatic Speech and Speaker Recognition - Advanced Topics (C.-H. Lee, F. K. Soong, K. K. Paliwal, Eds.), Kluwer Academic, 1996, 109-132.
  • [59] Jurafsky, D., Martin, J. H.: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Processing, Prentice Hall, 2000.
  • [60] Kari, J.: Image processing using finite automata, in: Recent Advances in Formal Languages and Applications, vol. 25 of Studies in Computational Intelligence, Springer, 2006, 171-208.
  • [61] Kempe, A., Baeijs, C., Gaál, T., Guingne, F., Nicart, F.: WFSC - A new weighted finite state compiler, Proc. CIAA, Springer, 2003, 108-119.
  • [62] Klarlund, N., Møller, A.: MONA Version 1.4 User Manual, BRICS, Department of Computer Science, University of Aarhus, 2001.
  • [63] Klein, D., Manning, C. D.: Accurate unlexicalized parsing, Proc. ACL, Association for Computational Linguistics, 2003, 423-430.
  • [64] Klein, D., Manning, C. D.: Fast exact inference with a factored model for natural language parsing, Proc. NIPS, MIT Press, 2003, 3-10.
  • [65] Knight, K.: Capturing practical natural language transformations, Machine Translation, 21(2), 2007, 121-133.
  • [66] Knight, K., Al-Onaizan, Y.: Translation with finite-state devices, Proc. Machine Translation and the Information Soup, Springer, 1998, 421-437.
  • [67] Knight, K., Graehl, J.: An overview of probabilistic tree transducers for natural language processing, Proc. CICLing, Springer, 2005, 1-24.
  • [68] Knuth, D. E.: Semantics of context-free languages, Math. Systems Theory, 2(2), 1968, 127-145.
  • [69] Koehn, P.: Statistical Machine Translation, Cambridge University Press, 2010.
  • [70] Koehn, P., Och, F. J., Marcu, D.: Statistical phrase-based translation, Proc. NAACL, Association for Computational Linguistics, 2003, 48-54.
  • [71] Kuich, W.: Tree transducers and formal tree series, Acta Cybernet., 14(1), 1999, 135-149.
  • [72] Kuich, W., Salomaa, A.: Semirings, Automata, Languages, vol. 5 of EATCS Monographs on Theoret. Comput. Sci., Springer, 1986.
  • [73] Li, Z., Callison-Burch, C., Dyer, C., Khudanpur, S., Schwartz, L., Thornton, W., Weese, J., Zaidan, O.: Joshua: An Open Source Toolkit for Parsing-Based Machine Translation, Proc. Workshop Statistical Machine Translation, Association for Computational Linguistics, 2009, 135-139.
  • [74] Ma, X., Cieri, C.: Corpus support for machine translation at LDC, Proc. LREC, 2006, 859-864.
  • [75] Maletti, A.: The Power of Tree Series Transducers, Ph.D. Thesis, Technische Universität Dresden, 2006.
  • [76] Maletti, A.: Compositions of extended top-down tree transducers, Inform. and Comput., 206(9-10), 2008, 1187-1196.
  • [77] Maletti, A.: Why synchronous tree substitution grammars?, Proc. NAACL, Association for Computational Linguistics, 2010, 876-884.
  • [78] Maletti, A.: How to train your multi bottom-up tree transducer, Proc. ACL, Association for Computational Linguistics, 2011, 825-834.
  • [79] Maletti, A.: Survey: Weighted extended top-down tree transducers - Part I: Basics and expressive power, Acta Cybernet., 2011, to appear; available at http://www.ims.uni-stuttgart.de/~maletti/pub/mal11.pdf.
  • [80] Maletti, A., Graehl, J., Hopkins, M., Knight, K.: The power of extended top-down tree transducers, SIAM J. Comput., 39(2), 2009, 410-430.
  • [81] Maletti, A., Satta, G.: Parsing algorithms based on tree automata, Proc. IWPT, Association for Computational Linguistics, 2009, 1-12.
  • [82] Maletti, A., Satta, G.: Parsing and translation algorithms based on weighted extended tree transducers, Proc. ATANLP, Association for Computational Linguistics, 2010, 19-27.
  • [83] Manning, C., Schütze, H.: Foundations of Statistical Natural Language Processing, MIT Press, 1999.
  • [84] May, J., Knight, K.: TIBURON: a weighted tree automata toolkit, Proc. CIAA, Springer, 2006, 102-113.
  • [85] Milo, T., Suciu, D., Vianu, V.: Typechecking for XML transformers, J. Comput. System Sci., 66(1), 2003, 66-97.
  • [86] Mohri, M., Pereira, F., Riley, M.: Weighted finite-state transducers in speech recognition, Computer Speech & Language, 16(1), 2002, 69-88.
  • [87] Nederhof, M.-J., Satta, G.: Computing partition functions of PCFGs, Research on Language & Computation, 6(2), 2009, 139-162.
  • [88] Nesson, R., Satta, G., Shieber, S. M.: Optimal k-arization of synchronous tree-adjoining grammar, Proc. ACL, Association for Computational Linguistics, 2008, 604-612.
  • [89] Och, F. J.: Minimum error rate training in statistical machine translation, Proc. ACL, 2003, 160-167.
  • [90] Och, F. J., Ney, H.: A systematic comparison of various statistical alignment models, Computational Linguistics, 29(1), 2003, 19-51.
  • [91] Och, F. J., Ney, H.: The alignment template approach to statistical machine translation, Computational Linguistics, 30(4), 2004, 417-449.
  • [92] Paciorek, C., Rosenfeld, R.: Minimum classification error training in exponential language models, Proc. NIST/DARPA Speech Transcription Workshop, 2000.
  • [93] Raoult, J.-C.: Rational tree relations, Bull. Belg. Math. Soc., 4(1), 1997, 149-176.
  • [94] Rounds, W. C.: Mappings and grammars on trees, Math. Systems Theory, 4(3), 1970, 257-287.
  • [95] Rozenberg, G., Salomaa, A., Eds.: Handbook of Formal Languages, Springer, 1997.
  • [96] Schölkopf, B., Smola, A. J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond, MIT Press, 2002.
  • [97] Shieber, S. M.: Synchronous grammars as tree transducers, Proc. TAG+7, 2004, 88-95.
  • [98] Sun, J., Zhang, M., Tan, C. L.: A non-contiguous tree sequence alignment-based model for statistical machine translation, Proc. ACL, Association for Computational Linguistics, 2009, 914-922.
  • [99] Thatcher, J. W.: Generalized² sequential machine maps, J. Comput. System Sci., 4(4), 1970, 339-367.
  • [100] Thatcher, J. W.: Tree Automata: An Informal Survey, in: Currents in the Theory of Computing (A. V. Aho, Ed.), chapter 4, Prentice Hall, 1973, 143-172.
  • [101] Vapnik, V. N.: Statistical Learning Theory, Wiley-Interscience, 1998.
  • [102] Wu, D.: Stochastic inversion transduction grammars and bilingual parsing of parallel corpora, Computational Linguistics, 23(3), 1997, 377-403.
  • [103] Wu, D., Wong, H.: Machine translation with a stochastic grammatical channel, Proc. ACL, Association for Computational Linguistics, 1998, 1408-1415.
  • [104] Yamada, K., Knight, K.: A decoder for syntax-based statistical MT, Proc. ACL, Association for Computational Linguistics, 2002, 303-310.
  • [105] Zhang, H., Huang, L., Gildea, D., Knight, K.: Synchronous binarization for machine translation, Proc. NAACL, Association for Computational Linguistics, 2006, 256-263.
  • [106] Zhang, M., Jiang, H., Aw, A., Li, H., Tan, C. L., Li, S.: A tree sequence alignment-based tree-to-tree translation model, Proc. ACL, Association for Computational Linguistics, 2008, 559-567.
  • [107] Zhang, M., Jiang, H., Li, H., Aw, A., Li, S.: Grammar comparison study for translational equivalence modeling and statistical machine translation, Proc. CoLing, Association for Computational Linguistics, 2008, 1097-1104.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS8-0022-0058