Article title

How to keep the HG weights non-negative: the truncated Perceptron reweighing rule

Authors
Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
The literature on error-driven learning in Harmonic Grammar (HG) has adopted the Perceptron reweighing rule. Yet this rule is not suited to HG, as it fails to ensure non-negative weights. A variant is therefore considered which truncates the updates at zero, keeping the weights non-negative. Convergence guarantees and error bounds for the original Perceptron are shown to extend to its truncated variant.
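
For concreteness, the update rule the abstract describes can be sketched in a few lines of Python. This is a minimal illustration under the usual error-driven HG setting; the function name, the learning rate eta, and the violation vectors in the example are illustrative assumptions, not details taken from the article.

    import numpy as np

    def truncated_perceptron_update(weights, winner_violations, loser_violations, eta=1.0):
        # Standard Perceptron step: promote constraints violated more by
        # the loser, demote constraints violated more by the winner.
        update = eta * (loser_violations - winner_violations)
        # Truncation at zero: clip componentwise so no weight goes negative.
        return np.maximum(weights + update, 0.0)

    # Example with three constraints: the loser violates constraint 0 more,
    # the winner violates constraint 2 more.
    w = np.array([0.5, 1.0, 0.2])
    winner = np.array([0.0, 1.0, 2.0])
    loser = np.array([2.0, 1.0, 0.0])
    w = truncated_perceptron_update(w, winner, loser)
    # w is now [2.5, 1.0, 0.0]; the untruncated rule would have driven
    # the last weight to -1.8.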
Year
Pages
345--375
Physical description
Bibliography: 40 items, figures, tables, charts.
Contributors
author
  • SFL (CNRS and University of Paris 8), France
  • UiL-OTS (Utrecht University)
Bibliography
  • [1] Tamás Sándor Bíró (2006), Finding the right words: Implementing Optimality Theory with Simulated Annealing, Ph.D. thesis, University of Groningen, available as ROA-896.
  • [2] Hans-Dieter Block (1962), The perceptron: A model of brain functioning, Reviews of Modern Physics, 34 (1): 123-135.
  • [3] Paul Boersma (1997), How we learn variation, optionality and probability, in Rob van Son, editor, Proceedings of the Institute of Phonetic Sciences (IFA) 21, pp. 43-58, Institute of Phonetic Sciences, University of Amsterdam.
  • [4] Paul Boersma (1998), Functional Phonology, Ph.D. thesis, University of Amsterdam, The Netherlands, Holland Academic Graphics.
  • [5] Paul Boersma and Bruce Hayes (2001), Empirical tests for the Gradual Learning Algorithm, Linguistic Inquiry, 32 (1): 45-86.
  • [6] Paul Boersma and Joe Pater (to appear), Convergence properties of a gradual learner for Harmonic Grammar, in John McCarthy and Joe Pater, editors, Harmonic Grammar and Harmonic Serialism, Equinox Press.
  • [7] Paul Boersma and Jan-Willem van Leussen (2014), Fast evaluation and learning in multi-level parallel constraint grammars, University of Amsterdam.
  • [8] Nicolò Cesa-Bianchi and Gábor Lugosi (2006), Prediction, learning, and games, Cambridge University Press.
  • [9] Andries W. Coetzee and Shigeto Kawahara (2013), Frequency biases in phonological variation, Natural Language and Linguistic Theory, 31 (1): 47-89.
  • [10] Andries W. Coetzee and Joe Pater (2008), Weighted constraints and gradient restrictions on place co-occurrence in Muna and Arabic, Natural Language and Linguistic Theory, 26 (2): 289-337.
  • [11] Andries W. Coetzee and Joe Pater (2011), The place of variation in phonological theory, in John Goldsmith, Jason Riggle, and Alan Yu, editors, Handbook of phonological theory, pp. 401-434, Blackwell.
  • [12] Nello Cristianini and John Shawe-Taylor (2000), An introduction to Support Vector Machines and other kernel-based methods, Cambridge University Press.
  • [13] Robert Frank and Shyam Kapur (1996), On the use of triggers in parameter setting, Linguistic Inquiry, 27 (4): 623-660.
  • [14] Yoav Freund and Robert E. Schapire (1999), Large margin classification using the Perceptron algorithm, Machine Learning, 37 (3): 277-296.
  • [15] Edward Gibson and Kenneth Wexler (1994), Triggers, Linguistic Inquiry, 25 (3): 407-454.
  • [16] Bruce Hayes (2004), Phonological acquisition in Optimality Theory: The early stages, in René Kager, Joe Pater, and Wim Zonneveld, editors, Constraints in phonological acquisition, pp. 158-203, Cambridge University Press.
  • [17] Gaja Jarosz (2013), Learning with hidden structure in Optimality Theory and Harmonic Grammar: Beyond Robust Interpretative Parsing, Phonology, 30 (1): 27-71.
  • [18] Karen Jesney and Anne-Michelle Tessier (2011), Biases in Harmonic Grammar: the road to restrictive learning, Natural Language and Linguistic Theory, 29 (1): 251-290.
  • [19] Frank Keller (2000), Gradience in grammar. Experimental and computational aspects of degrees of grammaticality, Ph.D. thesis, University of Edinburgh, Scotland.
  • [20] Jyrki Kivinen (2003), Online learning of linear classifiers, in Shahar Mendelson and Alexander J. Smola, editors, Advanced lectures on Machine Learning (LNAI 2600), pp. 235-257, Springer.
  • [21] Jyrki Kivinen, Manfred K. Warmuth, and Peter Auer (1997), The Perceptron algorithm versus Winnow: linear versus logarithmic mistake bounds when few input variables are relevant, Artificial Intelligence, 97 (1-2): 325-343.
  • [22] Norbert Klasner and Hans-Ulrich Simon (1995), From noise-free to noise-tolerant and from on-line to batch learning, in Wolfgang Maass, editor, Computational Learning Theory (COLT) 8, pp. 250-257, ACM.
  • [23] Gèraldine Legendre, Yoshiro Miyata, and Paul Smolensky (1998a), Harmonic Grammar: A formal multi-level connectionist theory of linguistic well-formedness: An application, in Morton Ann Gernsbacher and Sharon J. Derry, editors, Annual conference of the Cognitive Science Society 12, pp. 884-891, Lawrence Erlbaum Associates.
  • [24] Géraldine Legendre, Yoshiro Miyata, and Paul Smolensky (1998b), Harmonic Grammar: A formal multi-level connectionist theory of linguistic well-formedness: Theoretical foundations, in Morton Ann Gernsbacher and Sharon J. Derry, editors, Annual conference of the Cognitive Science Society 12, pp. 388-395, Lawrence Erlbaum.
  • [25] Gèraldine Legendre, Antonella Sorace, and Paul Smolensky (2006), The Optimality Theory/Harmonic Grammar connection, in Paul Smolensky and Gèraldine Legendre, editors, The Harmonic Mind, pp. 903-966, MIT Press.
  • [26] Nick Littlestone (1988), Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm, Machine Learning, 2 (4): 285-318.
  • [27] Giorgio Magri (2015), Idempotency in Optimality Theory, manuscript.
  • [28] Giorgio Magri (to appear), Error-driven learning in OT and HG: a comparison, Phonology.
  • [29] Marvin Minsky and Seymour Papert (1969), Perceptrons: An introduction to Computational Geometry, MIT Press.
  • [30] Mehryar Mohri and Afshin Rostamizadeh (2013), Perceptron Mistake bounds, arXiv:1305.0208.
  • [31] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2012), Foundations of Machine Learning, MIT Press.
  • [32] Albert B. J. Novikoff (1962), On convergence proofs on Perceptrons, in Proceedings of the symposium on the mathematical theory of automata, volume XII, pp. 615-622.
  • [33] Joe Pater (2008), Gradual learning and convergence, Linguistic Inquiry, 39 (2): 334-345.
  • [34] Alan Prince (2002), Entailed Ranking Arguments, ms., Rutgers University, New Brunswick, NJ. Rutgers Optimality Archive, ROA 500. Available at http://www.roa.rutgers.edu.
  • [35] Alan Prince and Bruce Tesar (2004), Learning phonotactic distributions, in René Kager, Joe Pater, and Wim Zonneveld, editors, Constraints in phonological acquisition, pp. 245-291, Cambridge University Press.
  • [36] Frank Rosenblatt (1958), The Perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review, 65 (6): 386-408.
  • [37] Frank Rosenblatt (1962), Principles of Neurodynamics, Spartan.
  • [38] Shai Shalev-Shwartz and Yoram Singer (2005), A new perspective on an old Perceptron algorithm, in Peter Auer and Ron Meir, editors, Conference on Computational Learning Theory (COLT) 18, Lecture Notes in Computer Science, pp. 264-278, Springer.
  • [39] Paul Smolensky and Géraldine Legendre (2006), The Harmonic Mind, MIT Press.
  • [40] Kenneth Wexler and Peter W. Culicover (1980), Formal principles of language acquisition, MIT Press, Cambridge, MA.
Remarks
Record developed with funds from MNiSW under agreement No. 461252, as part of the "Społeczna odpowiedzialność nauki" ("Social Responsibility of Science") programme, module: Popularization of science and promotion of sport (2020).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-37bcd932-3358-477b-8f8e-23bbd26f9912