Tytuł artykułu
Treść / Zawartość
Pełne teksty:
Identyfikatory
Warianty tytułu
Języki publikacji
Abstrakty
We derive well-understood and well-studied subregular classes of formal languages purely from the computational perspective of algorithmic learning problems. We parameterise the learning problem along dimensions of representation and inference strategy. Of special interest are those classes of languages whose learning algorithms are necessarily not prohibitively expensive in space and time, since learners are often exposed to adverse conditions and sparse data. Learned natural language patterns are expected to be most like the patterns in these classes, an expectation supported by previous typological and linguistic research in phonology. A second result is that the learning algorithms presented here are completely agnostic to choice of linguistic representation. In the case of the subregular classes, the results fall out from traditional model-theoretic treatments of words and strings. The same learning algorithms, however, can be applied to model-theoretic treatments of other linguistic representations such as syntactic trees or autosegmental graphs, which opens a useful direction for future research.
Wydawca
Czasopismo
Rocznik
Tom
Strony
151--194
Opis fizyczny
Bibliogr. 92 poz., rys.
Twórcy
autor
- Stony Brook University
autor
- Department of Linguistics and Language Development, San José State University
autor
- Department of Linguistics and Institute for Advanced Computational Science, Stony Brook University
Bibliografia
- [1]. Alëna AKSËNOVA, Jonathan RAWSKI, Thomas GRAF, and Jeffrey HEINZ (2020), The Computational Power of Harmony, in Harry VAN DER HULST, editor, Oxford Handbook of Vowel Harmony, Oxford University Press, under review.
- [2]. Martin ANTHONY and Norman BIGGS (1992), Computational Learning Theory, Cambridge University Press.
- [3]. Richard Brian APPLEGATE (1972), Ineseño Chumash Grammar, Ph.D. thesis, University of California, Berkeley.
- [4]. Frederico Augusto Casarsa AZEVEDO, Ludmila Ribeiro Bezerra CARVALHO, Lea Tenenholz GRINBERG, José Marcelo FARFEL, Renata Eloah de Lucena FERRETTI, Renata Elaine Paraizo LEITE, Wilson Jacob FILHO, Roberto LENT, and Suzana HERCULANO-HOUZEL (2009), Equal Numbers of Neuronal and Nonneuronal Cells Make the Human Brain an Isometrically Scaled-Up Primate Brain, The Journal of Comparative Neurology, 513:532-541, doi:10.1002/cne.21974.
- [5]. Danièle BEAUQUIER and Jean-Éric PIN (1989), Factors of Words, in Giorgio AUSIELLO, Mariangiola DEZANI-CIANCAGLINI, and Simonetta RONCHI DELLA ROCCA, editors, Automata, Languages and Programming: 16th International Colloquium, volume 372 of Lecture Notes in Computer Science, pp. 63-79, Springer Berlin / Heidelberg, doi:10.1007/BFb0035752.
- [6]. Julius Richard BÜCHI (1960), Weak Second-Order Arithmetic and Finite Automata, Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 6(1-6):66-92, doi:10.1002/malq.19600060105.
- [7]. Pascal CARON (1998), LANGAGE: A Maple Package for Automaton Characterization of Regular Languages, in Derick WOOD and Sheng YU, editors, Automata Implementation, volume 1436 of Lecture Notes in Computer Science, pp. 46-55, Springer Berlin / Heidelberg, doi:10.1007/BFb0031380.
- [8]. Pascal CARON (2000), Families of Locally Testable Languages, Theoretical Computer Science, 242(1-2):361-376, doi:10.1016/S0304-3975(98)00332-6.
- [9]. Jane CHANDLEE (2014), Strictly Local Phonological Processes, Ph.D. thesis, University of Delaware, https://chandlee.sites.haverford.edu/wp-content/uploads/2015/05/Chandlee_dissertation_2014.pdf.
- [10]. Jane CHANDLEE (2017), Computational Locality in Morphological Maps, Morphology, 27(4):599-641, doi:10.1007/s11525-017-9316-9.
- [11]. Jane CHANDLEE, Rémi EYRAUD, Jeffrey HEINZ, Adam JARDINE, and Jonathan RAWSKI (2019), Learning with Partially Ordered Representations, in Proceedings of the 16th Meeting on the Mathematics of Language, pp. 91-101, Association for Computational Linguistics, doi:10.18653/v1/W19-5708.
- [12]. Jane CHANDLEE and Adam JARDINE (2019), Autosegmental Input Strictly Local Functions, Transactions of the Association for Computational Linguistics, 7:157-168, doi:10.1162/tacl_a_00260.
- [13]. Alexander CLARK and Shalom LAPPIN (2011), Linguistic Nativism and the Poverty of the Stimulus, Wiley-Blackwell.
- [14]. Bruno COURCELLE (1994), Monadic Second-Order Definable Graph Transductions: A Survey, Theoretical Computer Science, 126(1):53-75, doi:10.1016/0304-3975(94)90268-2.
- [15]. Bruno COURCELLE and Joost ENGELFRIET (2012), Graph Structure and Monadic Second-Order Logic: A Language-Theoretic Approach, volume 138, Cambridge University Press.
- [16]. Colin DE LA HIGUERA (2010), Grammatical Inference: Learning Automata and Grammars, Cambridge University Press, doi:10.1017/CBO9781139194655.
- [17]. Aniello DE SANTO and Thomas GRAF (2019), Structure Sensitive Tier Projection: Applications and Formal Properties, in Raffaella BERNARDI, Greg KOBELE, and Sylvain POGODALLA, editors, Formal Grammar 2019, volume 11668 of Lecture Notes in Computer Science, pp. 35-50, Springer Verlag, doi:10.1007/978-3-662-59648-7_3.
- [18]. Hossep DOLATIAN and Jonathan RAWSKI (2020), Multi-Input Strictly Local Functions for Templatic Morphology, in Proceedings of the Society for Computation in Linguistics, volume 3, pp. 282-296, https://scholarworks.umass.edu/scil/vol3/iss1/28.
- [19]. Matt EDLEFSEN, Dylan LEEMAN, Nathan MYERS, Nathaniel SMITH, Molly VISSCHER, and David WELLCOME (2008), Deciding Strictly Local (SL) Languages, in Jon BREITENBUCHER, editor, Proceedings of the 2008 Midstates Conference for Undergraduate Research in Computer Science and Mathematics, pp. 66-73.
- [20]. Joost ENGELFRIET and Hendrik Jan HOOGEBOOM (2001), MSO Definable String Transductions and Two-Way Finite-State Transducers, ACM Transactions on Computational Logic, 2(2):216-254, doi:10.1145/371316.371512.
- [21]. Emmanuel FILIOT (2015), Logic-Automata Connections for Transformations, in Logic and Its Applications, volume 8923 of Lecture Notes in Computer Science, pp. 30-57, Springer Berlin / Heidelberg, doi:10.1007/978-3-662-45824-2_3.
- [22]. Sara FINLEY (2008), Formal and Cognitive Restrictions on Vowel Harmony, Ph.D. thesis, Johns Hopkins University.
- [23]. Jerry Alan FODOR and Zenon Walter PYLYSHYN (1988), Connectionism and Cognitive Architecture: A Critical Analysis, Cognition, 28(1-2):3-71, doi:10.1016/0010-0277(88)90031-5.
- [24]. Pedro GARCIA, Enrique VIDAL, and José ONCINA (1990), Learning Locally Testable Languages in the Strict Sense, in Proceedings of the 1st International Workshop on Algorithmic Learning Theory, pp. 325-338, https://grfia.dlsi.ua.es/repositori/grfia/pubs/111/alt1990.pdf.
- [25]. R. W. N. GOEDEMANS, Jeffrey HEINZ, and Harry VAN DER HULST (2015), StressTyp2, http://st2.ullet.net/.
- [26]. Edward Mark GOLD (1967), Language Identification in the Limit, Information and Control, 10(5):447-474, doi:10.1016/S0019-9958(67)91165-5.
- [27]. John Anton GOLDSMITH (1976), Autosegmental Phonology, Ph.D. thesis, Swarthmore College, https://dspace.mit.edu/bitstream/handle/1721.1/16388/03188555-MIT.
- [28]. Saul GORN (1967), Explicit Definitions and Linguistic Dominoes, in John HART and Satoru TAKASU, editors, Systems and Computer Science, pp. 77-115, University of Toronto Press, doi:10.3138/9781487592769.
- [29]. Thomas GRAF (2011), Closure Properties of the Minimalist Derivation Tree Languages, in Sylvain POGODALLA and Jean-Philippe PROST, editors, Logical Aspects of Computational Linguistics, volume 6736 of Lecture Notes in Computer Science, pp. 96-111, Springer Berlin / Heidelberg, doi:10.1007/978-3-642-22221-4_7.
- [30]. Thomas GRAF (2014), Beyond the Apparent: Cognitive Parallels Between Syntax and Phonology, in Carson T. SCHÜTZE and Linnaea STOCKALL, editors, Connectedness: Papers by and for Sarah Van Wagenen, volume 18 of UCLA Working Papers in Linguistics, pp. 161-174, http://phonetics.linguistics.ucla.edu/wpl/issues/wpl18/papers/graf.pdf.
- [31]. Thomas GRAF (2020), Curbing Feature Coding: Strictly Local Feature Assigment, in Proceedings of the Society for Computation in Linguistics, volume 3, pp. 362-371, doi:10.7275/f7y5-xz32.
- [32]. Thomas GRAF and Nazila SHAFIEI (2019), C-Command Dependencies as TSL String Constraints, in Proceedings of the Society for Computation in Linguistics, volume 2, pp. 205-215, doi:10.7275/4rrx-x488.
- [33]. Leonard H. HAINES (1969), On Free Monoids Partially Ordered by Embedding, Journal of Combinatorial Theory, 6(1):94-98, doi:10.1016/s0021-9800(69)80111-0.
- [34]. Jeffrey HEINZ (2010a), Learning Long-Distance Phonotactics, Linguistic Inquiry, 41(4):623-661, doi:10.1162/ling_a_00015.
- [35]. Jeffrey HEINZ (2010b), String Extension Learning, in Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 897-906, Association for Computational Linguistics, https://www.aclweb.org/anthology/P10-1092.
- [36]. Jeffrey HEINZ (2016), Computational Theories of Learning and Developmental Psycholinguistics, in Jeffrey LIDZ, William SYNDER, and Joe PATER, editors, The Oxford Handbook of Developmental Linguistics, chapter 27, pp. 633-663, Oxford University Press, doi:10.1093/oxfordhb/9780199601264.013.27.
- [37]. Jeffrey HEINZ (2018), The Computational Nature of Phonological Generalizations, in Larry HYMAN and Frank PLANK, editors, Phonological Typology, volume 23 of Phonetics and Phonology, chapter 5, pp. 126-195, Mouton de Gruyter, doi:10.1515/9783110451931-005.
- [38]. Jeffrey HEINZ, Anna KASPRZIK, and Timo KÖTZING (2012), Learning in the Limit with Lattice-Structured Hypothesis Spaces, Theoretical Computer Science, 457:111-127, doi:10.1016/j.tcs.2012.07.017.
- [39]. Jeffrey HEINZ, Chetan RAWAL, and Herbert G. TANNER (2011), Tier-based Strictly Local Constraints for Phonology, in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Short Papers, volume 2, pp. 58-64, Association for Computational Linguistics, https://aclweb.org/anthology/P11-2011.
- [40]. Jeffrey HEINZ and James ROGERS (2013), Learning Subregular Classes of Languages with Factored Deterministic Automata, in Proceedings of the 13th Meeting on the Mathematics of Language, pp. 64-71, Association for Computational Linguistics, https://www.aclweb.org/anthology/W13-3007.
- [41]. Suzana HERCULANO-HOUZEL and Roberto LENT (2005), Isotropic Fractionator: A Simple, Rapid Method for the Quantification of Total Cell and Neuron Numbers in the Brain, Journal of Neuroscience, 25(10):2518-2521, doi:10.1523/JNEUROSCI.4526-04.2005.
- [42]. Larry M. HYMAN (2009), How (Not) to Do Phonological Typology: The Case of Pitch-Accent, Language Sciences, 31(2-3):213-238, doi:10.1016/j.langsci.2008.12.007.
- [43]. Larry M. HYMAN and Francis X. KATAMBA (1993), A New Approach to Tone in Luganda, Language, 69(1):34-67, doi:10.2307/416415.
- [44]. Sanjay JAIN, Daniel OSHERSON, James S. ROYER, and Arun SHARMA (1999), Systems That Learn: An Introduction to Learning Theory (Learning, Development and Conceptual Change), The MIT Press, 2nd edition.
- [45]. Adam JARDINE (2017a), The Local Nature of Tone-Association Patterns, Phonology, 34(2):385-405, doi:10.1017/s0952675717000185.
- [46]. Adam JARDINE (2017b), On the Logical Complexity of Autosegmental Representations, in Proceedings of the 15th Meeting on the Mathematics of Language, pp. 22-35, Association for Computational Linguistics, doi:10.18653/v1/W17-3403.
- [47]. Adam JARDINE (2019), The Expressivity of Autosegmental Grammars, Journal of Logic, Language and Information, 28(1):9-54, ISSN 1572-9583, doi:10.1007/s10849-018-9270-x.
- [48]. Adam JARDINE, Nick DANIS, and Luca IACOPONI (2021), A Formal Investigation of Q-Theory in Comparison to Autosegmental Representations, Linguistic Inquiry, 52(2):333-358, doi:10.1162/ling_a_00376.
- [49]. Adam JARDINE and Jeffrey HEINZ (2016), Learning Tier-Based Strictly 2-Local Languages, Transactions of the Association for Computation in Linguistics, 4:87-98, doi:10.1162/tacl_a_00085.
- [50]. Adam JARDINE and Kevin MCMULLIN (2017), Efficient Learning of Tier-Based Strictly k-Local Languages, in Frank DREWES, Carlos MARTÍN-VIDE, and Bianca TRUTHE, editors, Language and Automata Theory and Applications: 11th International Conference, volume 10168 of Lecture Notes in Computer Science, pp. 64-76, Springer, Cham, doi:10.1007/978-3-319-53733-7_4.
- [51]. Kent JOHNSON (2004), Gold’s Theorem and Cognitive Science, Philosophy of Science, 71:571-592.
- [52]. Mark JOHNSON (1988), Attribute-Value Logic and the Theory of Grammar, Center for the Study of Language and Information.
- [53]. Michael KEARNS and Umesh VAZIRANI (1994), An Introduction to Computational Learning Theory, MIT Press.
- [54]. Paul J. KING (1989), A Logical Formalism for Head-Driven Phrase Structure Grammar, Ph.D. thesis, University of Manchester, https://www.proquest.com/docview/2201235827.
- [55]. Gregory M. KOBELE (2011), Minimalist Tree Languages are Closed Under Intersection with Recognizable Tree Languages, in Sylvain POGODALLA and Jean-Philippe PROST, editors, Logical Aspects of Computational Linguistics, volume 6736 of Lecture Notes in Computer Science, pp. 129-144, Springer Berlin / Heidelberg, doi:10.1007/978-3-642-22221-4_9.
- [56]. John Richard KRUEGER (1961), Chuvash Manual: Introduction, Grammar, Reader, and Vocabulary, volume 7 of Uralic and Altaic Series, Indiana University.
- [57]. Regine LAI (2015), Learnable vs. Unlearnable Harmony Patterns, Linguistic Inquiry, 46(3):425-451, doi:10.1162/LING_a_00188.
- [58]. Dakotah LAMBERT (2021), Grammar Interpretations and Learning TSL Online, in Adam JARDINE, Jane CHANDLEE, Jeffrey HEINZ, Menno VAN ZAANEN, and Rémi EYRAUD, editors, Proceeedings of the Fifteenth International Conference on Grammatical Inference (ICGI 2020/21), PMLR: Proceedings of Machine Language Research, http://proceedings.mlr.press/, in press.
- [59]. Dakotah LAMBERT and James ROGERS (2019), A Logical and Computational Methodology for Exploring Systems of Phonotactic Constraints, in Proceedings of the Society for Computation in Linguistics, volume 2, pp. 247-256, doi:10.7275/t0dv-9t05.
- [60]. Dakotah LAMBERT and James ROGERS (2020), Tier-Based Strictly Local Stringsets: Perspectives from Model and Automata Theory, in Proceedings of the Society for Computation in Linguistics, volume 3, pp. 330-337, doi:10.7275/2n1j-pj39.
- [61]. Shalom LAPPIN and Stuart Merrill SHIEBER (2007), Machine Learning Theory and Practice as a Source of Insight into Universal Grammar, Journal of Linguistics, 43(2):393-427, doi:10.1017/S0022226707004628.
- [62]. Julie Anne LEGATE and Charles D. YANG (2002), Empirical Re-assessment of Stimulus Poverty Arguments, The Linguistic Review, 18(1-2):151-162, doi:10.1515/tlir.19.1-2.151.
- [63]. Leonid LIBKIN (2004), Elements of Finite Model Theory, Texts in Theoretical Computer Science, Springer Berlin / Heidelberg, doi:10.1007/978-3-662-07003-1.
- [64]. Kevin MCMULLIN and Gunnar Ólafur HANSSON (2019), Inductive Learning of Locality Relations in Segmental Phonology, Laboratory Phonology: Journal of the Association for Laboratory Phonology, 10(1):1-53, doi:10.5334/labphon.150,article 14.
- [65]. Robert MCNAUGHTON and Seymour A. PAPERT (1971), Counter-Free Automata, MIT Press.
- [66]. Tom Michael MITCHELL (1982), Generalization as Search, Artificial Intelligence, 18(2):203-226, doi:10.1016/0004-3702(82)90040-6.
- [67]. Tom Michael MITCHELL (2017), Key Ideas in Machine Learning, in Machine Learning: Second Edition, McGraw Hill, http://www.cs.cmu.edu/~tom/mlbook/keyIdeas.pdf, forthcoming.
- [68]. Meyhar MOHRI, Afshin ROSTAMIZADEH, and Ameet TALWALKAR (2012), Foundations of Machine Learning, MIT Press.
- [69]. Max NELSON, Hossep DOLATIAN, Jonathan RAWSKI, and Brandon PRICKETT (2020), Probing RNN Encoder-Decoder Generalizations of Subregular Functions Using Reduplication, in Proceedings of the Society for Computation in Linguistics, volume 3, pp. 31-42, doi:10.7275/xd0r-pg04.
- [70]. Partha NIYOGI (2006), The Computational Nature of Language Learning and Evolution, Cambridge, MA: MIT Press.
- [71]. Martin Andreas NOWAK, Natalia L. KOMAROVA, and Partha NIYOGI (2002), Computational and Evolutionary Aspects of Language, Nature, 417(6889):611-617, doi:10.1038/nature00771.
- [72]. Chris OAKDEN (2020), Notational Equivalence in Tonal Geometry, Phonology, 37(2):257-296, doi:10.1017/S0952675720000123.
- [73]. Christopher POTTS and Geoffrey PULLUM (2002), Model Theory and the Content of OT Constraints, Phonology, 19:361-393.
- [74]. Jonathan RAWSKI (2019), Tensor Product Representations of Subregular Formal Languages, in Proceedings of the International Joint Conference on Artificial Intelligence workshop on Neural-Symbolic Learning and Reasoning, pp. 36-42.
- [75]. Jonathan RAWSKI and Hossep DOLATIAN (2020), Multi-Input Strictly Local Functions for Tonal Phonology, in Proceedings of the Society for Computation in Linguistics, volume 3, pp. 245-260, https://scholarworks.umass.edu/scil/vol3/iss1/25.
- [76]. Jonathan RAWSKI and Jeffrey HEINZ (2019), No Free Lunch in Linguistics or Machine Learning: Response to Pater, Language, 93(1):e125-e135, doi:10.1353/lan.2019.0004.
- [77]. James ROGERS (1998), A Descriptive Approach to Language-Theoretic Complexity, (Monograph.) Studies in Logic, Language, and Information, CSLI Publications.
- [78]. James ROGERS (2003), Syntactic Structures as Multi-Dimensional Trees, Research on Language and Computation, 1(3-4):265-305, doi:10.1023/A:1024695608419.
- [79]. James ROGERS, Jeff HEINZ, Margaret FERO, Jeremy HURST, Dakotah LAMBERT, and Sean WIBEL (2012), Cognitive and Sub-Regular Complexity, in Glyn MORRILL and Mark-Jan NEDERHOF, editors, Formal Grammar 2012, volume 8036 of Lecture Notes in Computer Science, pp. 90-108, Springer-Verlag, doi:10.1007/978-3-642-39998-5_6.
- [80]. James ROGERS, Jeffrey HEINZ, Gil BAILEY, Matt EDLEFSEN, Molly VISSCHER, David WELLCOME, and Sean WIBEL (2010), On Languages Piecewise Testable in the Strict Sense, in Christian EBERT, Gerhard JÄGER, and Jens MICHAELIS, editors, The Mathematics of Language: Revised Selected Papers from the 10th and 11th Biennial Conference on the Mathematics of Language, volume 6149 of LNCS/LNAI, pp. 255-265, FoLLI/Springer, doi:10.1007/978-3-642-14322-9_19.
- [81]. James ROGERS and Dakotah LAMBERT (2019a), Extracting Subregular Constraints from Regular Stringsets, Journal of Language Modelling, 7(2):143-176, doi:10.15398/jlm.v7i2.209.
- [82]. James ROGERS and Dakotah LAMBERT (2019b), Some Classes of Sets of Structures Definable Without Quantifiers, in Proceedings of the 16th Meeting on the Mathematics of Language, pp. 63-77, Association for Computational Linguistics, doi:10.18653/v1/W19-5706.
- [83]. Marcel-Paul SCHÜTZENBERGER (1965), On Finite Monoids Having Only Trivial Subgroups, Information and Control, 8(2):190-194, doi:10.1016/s0019-9958(65)90108-7.
- [84]. Imre SIMON (1975), Piecewise Testable Events, in Helmut BRAKHAGE, editor, Automata Theory and Formal Languages, volume 33 of Lecture Notes in Computer Science, pp. 214-222, Springer-Verlag, doi:10.1007/3-540-07407-4_23.
- [85]. Kristina STROTHER-GARCIA (2019), Using Model Theory in Phonology: A Novel Characterization of Syllable Structure and Syllabification, Ph.D. thesis, University of Delaware.
- [86]. Wolfgang THOMAS (1982), Classifying Regular Events in Symbolic Logic, Journal of Computer and Systems Sciences, 25:360-376, doi:10.1016/0022-0000(82)90016-2.
- [87]. Leslie Gabriel VALIANT (1984), A Theory of the Learnable, Communications of the ACM, 27(11):1134-1142, doi:10.1145/1968.1972.
- [88]. Iris VAN ROOIJ and Giosuè BAGGIO (2021), Theory Before the Test: How to Build High-Verisimilitude Explanatory Theories in Psychological Science, Perspectives on Psychological Science, doi:10.1177/1745691620970604.
- [89]. Vladimir VAPNIK (1995), The Nature of Statistical Learning Theory, Springer.
- [90]. Mai Ha VU, Ashkan ZEHFROOSH, Kristina STROTHER-GARCIA, Michael SEBOK, Jeffrey HEINZ, and Herbert G. TANNER (2018), Statistical Relational Learning with Unconventional String Models, Frontiers in Robotics and AI, 5(76):1-26, doi:10.3389/frobt.2018.00076.
- [91]. Edwin Samuel WILLIAMS (1976), Underlying Tone in Margi and Igbo, Linguistic Inquiry, 7(3):463-484.
- [92]. Charles YANG (2013), Who’s Afraid of George Kingsley Zipf? Or: Do Children and Chimps Have Language?, Significance, 10(6):29-34, doi:10.1111/j.1740-9713.2013.00708.x.
Uwagi
Opracowanie rekordu ze środków MNiSW, umowa Nr 461252 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2021).
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-3b169e6c-b9b4-4b8e-9589-12d749c6c4e0