Generative Grammar, Neural Networks, and the Implementational Mapping Problem: Response to Pater

Published online by Cambridge University Press:  01 January 2026

Ewan Dunbar*
Affiliation:
CNRS, Université Paris Diderot, and Université Sorbonne Paris Cité
* Laboratoire de Linguistique Formelle, CNRS, Université Paris Diderot, and Université Sorbonne Paris Cité [ewan.dunbar@univ-paris-diderot.fr]

Abstract

The target article (Pater 2019) proposes to use neural networks to model learning within existing grammatical frameworks. This is easier said than done. There is a fundamental gap to be bridged that does not receive attention in the article: how can we use neural networks to examine whether it is possible to learn some linguistic representation (a tree, for example) when, after learning is finished, we cannot even tell if this is the type of representation that has been learned (all we see is a sequence of numbers)? Drawing a correspondence between an abstract linguistic representational system and an opaque parameter vector that can (or perhaps cannot) be seen as an instance of such a representation is an implementational mapping problem. Rather than relying on existing frameworks that propose partial solutions to this problem, such as harmonic grammar, I suggest that fusional research of the kind proposed needs to directly address how to ‘find’ linguistic representations in neural network representations.

Type: Perspectives

Copyright © 2019 Linguistic Society of America

References

Ashida, Go, and Carr, Catherine E. 2011. Sound localization: Jeffress and beyond. Current Opinion in Neurobiology 21. 745-51. DOI: 10.1016/j.conb.2011.05.008.
Augasta, M. Gethsiyal, and Kathirvalavakumar, Thangairulappan. 2012. Reverse engineering the neural networks for rule extraction in classification problems. Neural Processing Letters 35. 131-50. DOI: 10.1007/s11063-011-9207-8.
Berwick, Robert, and Weinberg, Amy. 1984. The grammatical basis of linguistic performance: Language use and language acquisition. Cambridge, MA: MIT Press.
Chaabouni, Rahma, Dunbar, Ewan, Zeghidour, Neil; and Dupoux, Emmanuel. 2017. Learning weakly supervised multimodal phoneme embeddings. Proceedings of INTERSPEECH 2017, 2218-22. DOI: 10.21437/Interspeech.2017-1689.
Chomsky, Noam. 1957. Syntactic structures. The Hague: Mouton.
Chomsky, Noam. 1975. Reflections on language. New York: Pantheon Books.
Chomsky, Noam. 1995. The minimalist program. Cambridge, MA: MIT Press.
Dunbar, Ewan, Synnaeve, Gabriel; and Dupoux, Emmanuel. 2015. Quantitative methods for comparing featural representations. Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS), Glasgow. Online: https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2015/Papers/ICPHS1024.pdf.
Fodor, Jerry A., and Pylyshyn, Zenon W. 1988. Connectionism and cognitive architecture: A critical analysis. Cognition 28. 3-71. DOI: 10.1016/0010-0277(88)90031-5.
Gladkova, Anna, Drozd, Aleksandr; and Matsuoka, Satoshi. 2016. Analogy-based detection of morphological and semantic relations with word embeddings: What works and what doesn't. Proceedings of the NAACL Student Research Workshop, 8-15. DOI: 10.18653/v1/N16-2002.
Gulordava, Kristina, Bojanowski, Piotr, Grave, Edouard, Linzen, Tal; and Baroni, Marco. 2018. Colorless green recurrent networks dream hierarchically. Proceedings of NAACL-HLT 2018, 1195-1205. DOI: 10.18653/v1/N18-1108.
Hornik, Kurt. 1991. Approximation capabilities of multilayer feedforward networks. Neural Networks 4. 251-57. DOI: 10.1016/0893-6080(91)90009-T.
Kriegeskorte, Nikolaus, Mur, Marieke; and Bandettini, Peter. 2008. Representational similarity analysis—Connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience 2:4. DOI: 10.3389/neuro.06.004.2008.
Legendre, Géraldine, Miyata, Yoshiro; and Smolensky, Paul. 1990a. Can connectionism contribute to syntax? Harmonic grammar, with an application. CU-CS-485-90. Computer Science Technical Reports 467. Online: https://scholar.colorado.edu/csci_techreports/467.
Legendre, Géraldine, Miyata, Yoshiro; and Smolensky, Paul. 1990b. Harmonic grammar—a formal multi-level connectionist theory of linguistic well-formedness: Theoretical foundations. CU-CS-465-90. Computer Science Technical Reports 447. Online: https://scholar.colorado.edu/csci_techreports/447.
Leshno, Moshe, Lin, Vladimir Ya., Pinkus, Allan; and Schocken, Shimon. 1993. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks 6. 861-67. DOI: 10.1016/S0893-6080(05)80131-5.
Levy, Omer, and Goldberg, Yoav. 2014. Linguistic regularities in sparse and explicit word representations. Proceedings of the 18th Conference on Computational Natural Language Learning, 171-80. Online: http://www.aclweb.org/anthology/W14-1618.
Lin, Henry W., Tegmark, Max; and Rolnick, David. 2017. Why does deep and cheap learning work so well? Journal of Statistical Physics 168. 1223-47. DOI: 10.1007/s10955-017-1836-5.
Linzen, Tal. 2016. Issues in evaluating semantic spaces using word analogies. Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP, 13-18. Online: http://www.aclweb.org/anthology/W16-2503.
Linzen, Tal, Dupoux, Emmanuel; and Goldberg, Yoav. 2016. Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics 4. 521-35. Online: http://aclweb.org/anthology/Q16-1037.
Linzen, Tal, Dupoux, Emmanuel; and Spector, Benjamin. 2016. Quantificational features in distributional word representations. Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics (*SEM 2016), 1-11. Online: http://www.aclweb.org/anthology/S16-2001.
Marr, David. 1982. Vision: A computational investigation into the human representation and processing of visual information. Cambridge, MA: MIT Press.
Matthews, Robert J. 1991. Psychological reality of grammars. The Chomskyan turn, ed. by Kasher, Asa, 182-200. Oxford: Blackwell.
McCloskey, Michael. 1991. Networks and theories: The place of connectionism in cognitive science. Psychological Science 2. 387-95. DOI: 10.1111/j.1467-9280.1991.tb00173.x.
Mendelson, Elliott. 1997. Introduction to mathematical logic. 4th edn. New York: Chapman and Hall.
Mikolov, Tomas, Chen, Kai, Corrado, Greg; and Dean, Jeffrey. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781 [cs.CL]. Online: https://arxiv.org/abs/1301.3781.
Miller, George, and Chomsky, Noam. 1963. Finitary models of language users. Handbook of mathematical psychology, vol. 2, ed. by Luce, Robert Duncan, Bush, Robert T., and Galanter, Eugene, 419-92. New York: Wiley.
Özbakır, Lale, Baykasoğlu, Adil; and Kulluk, Sinem. 2010. A soft computing-based approach for integrated training and rule extraction from artificial neural networks: DIFACONN-miner. Applied Soft Computing 10. 304-17. DOI: 10.1016/j.asoc.2009.08.008.
Palangi, Hamid, Smolensky, Paul, He, Xiaodong; and Deng, Li. 2017. Question-answering with grammatically-interpretable representations. arXiv:1705.08432 [cs.CL]. Online: https://arxiv.org/abs/1705.08432.
Pater, Joe. 2019. Generative linguistics and neural networks at 60: Foundation, friction, and fusion. Language 95(1). e41-e74.
Phillips, Colin. 1996. Order and structure. Cambridge, MA: MIT dissertation.
Pinker, Steven, and Prince, Alan. 1988. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28. 73-193. DOI: 10.1016/0010-0277(88)90032-7.
Pollack, Jordan B. 1988. Recursive auto-associative memory: Devising compositional distributed representations. Proceedings of the 10th annual meeting of the Cognitive Science Society (CogSci 1988), 33-39. Online: http://mindmodeling.org/cogscihistorical/cogsci_10.pdf.
Rumelhart, David E., and McClelland, James L. 1986. On learning the past tenses of English verbs. Parallel distributed processing: Explorations in the microstructure of cognition, vol. 2, ed. by Rumelhart, David E., McClelland, James L., and the PDP Research Group, 216-71. Cambridge, MA: MIT Press.
Sanz, Ricardo. 2008. Top 100 most influential works in cognitive science. UPM Autonomous Systems Laboratory. Online: http://tierra.aslab.upm.es/public/index.php?option=com_content&task=view&id=141.
Smolensky, Paul. 1986. Information processing in dynamical systems: Foundations of harmony theory. Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1, ed. by Rumelhart, David E., McClelland, James L., and the PDP Research Group, 194-281. Cambridge, MA: MIT Press.
Smolensky, Paul. 1988a. The constituent structure of connectionist mental states: A reply to Fodor and Pylyshyn. The Southern Journal of Philosophy 26. 137-61. DOI: 10.1111/j.2041-6962.1988.tb00470.x.
Smolensky, Paul. 1988b. On the proper treatment of connectionism. Behavioral and Brain Sciences 11. 1-23. DOI: 10.1017/S0140525X00052432.
Smolensky, Paul, and Goldrick, Matthew. 2016. Gradient symbolic representations in grammar: The case of French liaison. Baltimore: Johns Hopkins University, and Evanston, IL: Northwestern University, ms. Online: http://roa.rutgers.edu/article/view/1552.
Smolensky, Paul, Goldrick, Matthew; and Mathis, Donald. 2014. Optimization and quantization in gradient symbol systems: A framework for integrating the continuous and the discrete in cognition. Cognitive Science 38. 1102-38. DOI: 10.1111/cogs.12047.
Taylor, Brian J., and Darrah, Marjorie A. 2005. Rule extraction as a formal method for the verification and validation of neural networks. Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, 2915-20. DOI: 10.1109/IJCNN.2005.1556388.