A Case for Deep Learning in Semantics: Response to Pater

Published online by Cambridge University Press:  01 January 2026

Christopher Potts*
Affiliation: Stanford University
* Department of Linguistics, Stanford University, Stanford, CA 94305 [cgpotts@stanford.edu]

Abstract

Pater's (2019) target article builds a persuasive case for establishing stronger ties between theoretical linguistics and connectionism (deep learning). This commentary extends his arguments to semantics, focusing in particular on issues of learning, compositionality, and lexical meaning.

Information

Type: Perspectives
Copyright: © 2019 Linguistic Society of America


References

Baroni, Marco, Bernardi, Raffaella; and Zamparelli, Roberto. 2014. Frege in space: A program for compositional distributional semantics. Linguistic Issues in Language Technology 9. 241-346. Online: http://csli-lilt.stanford.edu/ojs/index.php/LiLT/article/view/6/5.
Baroni, Marco, and Zamparelli, Roberto. 2010. Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. Proceedings of the 2010 conference on Empirical Methods in Natural Language Processing, 1183-93. Online: https://www.aclweb.org/anthology/D/D10/D10-1115.pdf.
Bowman, Samuel R. 2016. Modeling natural language semantics in learned representations. Stanford, CA: Stanford University dissertation.
Carlson, Gregory. 1977. Reference to kinds in English. Amherst: University of Massachusetts Amherst dissertation.
Clark, Herbert H. 1997. Dogmas of understanding. Discourse Processes 23. 567-98. DOI: 10.1080/01638539709545003.
Coecke, Bob, Sadrzadeh, Mehrnoosh; and Clark, Stephen. 2011. Mathematical foundations for a compositional distributed model of meaning. Linguistic Analysis 36. 345-84.
Cybenko, George. 1989. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2. 303-14. DOI: 10.1007/BF02551274.
Elman, Jeffrey L. 1990. Finding structure in time. Cognitive Science 14. 179-211. DOI: 10.1016/0364-0213(90)90002-E.
Fellbaum, Christiane (ed.) 1998. WordNet: An electronic database. Cambridge, MA: MIT Press.
Frank, Michael C., Goodman, Noah D.; and Tenenbaum, Joshua B. 2009. Using speakers' referential intentions to model early cross-situational word learning. Psychological Science 20. 578-85. DOI: 10.1111/j.1467-9280.2009.02335.x.
Goller, Christoph, and Küchler, Andreas. 1996. Learning task-dependent distributed representations by backpropagation through structure. IEEE International Conference on Neural Networks, 347-52. DOI: 10.1109/ICNN.1996.548916.
Harris, Randy Allen. 1993. The linguistic wars. Oxford: Oxford University Press.
Hastie, Trevor, Tibshirani, Robert; and Friedman, Jerome. 2009. The elements of statistical learning. 2nd edn. Berlin: Springer.
Hornik, Kurt. 1991. Approximation capabilities of multilayer feedforward networks. Neural Networks 4. 251-57. DOI: 10.1016/0893-6080(91)90009-T.
Jackendoff, Ray S. 1996. Semantics and cognition. The handbook of contemporary semantic theory, ed. by Lappin, Shalom, 539-59. Oxford: Blackwell.
Janssen, Theo M. V. 1997. Compositionality. Handbook of logic and language, ed. by Benthem, Johan van and Meulen, Alice ter, 417-73. Cambridge, MA: MIT Press, and Amsterdam: North-Holland.
Kaplan, David. 1999. What is meaning? Explorations in the theory of Meaning as use. Brief version—draft 1. Los Angeles: University of California, Los Angeles, ms.
Klein, Ewan, and Sag, Ivan A. 1985. Type-driven translation. Linguistics and Philosophy 8. 163-201. Online: http://www.jstor.org/stable/25001200.
Lakoff, Robin. 1971. If's, and's, and but's about conjunction. Studies in linguistic semantics, ed. by Fillmore, Charles J. and Langendoen, D. Terence, 114-49. New York: Holt, Rinehart, and Winston.
LeCun, Yann, Bengio, Yoshua; and Hinton, Geoffrey E. 2015. Deep learning. Nature 521. 436-44. DOI: 10.1038/nature14539.
Lenci, Alessandro. 2018. Distributional models of word meaning. Annual Review of Linguistics 4. 151-71. DOI: 10.1146/annurev-linguistics-030514-125254.
Lewis, David. 1970. General semantics. Synthese 22. 18-67. DOI: 10.1007/BF00413598.
Liang, Percy, and Potts, Christopher. 2015. Bringing machine learning and compositional semantics together. Annual Review of Linguistics 1. 355-76. DOI: 10.1146/annurev-linguist-030514-125312.
Manning, Christopher D. 2015. Computational linguistics and deep learning. Computational Linguistics 41. 701-7. DOI: 10.1162/COLI_a_00239.
Mitchell, Jeff, and Lapata, Mirella. 2010. Composition in distributional models of semantics. Cognitive Science 34. 1388-1429. DOI: 10.1111/j.1551-6709.2010.01106.x.
Montague, Richard. 1970. Universal grammar. Theoria 36. 373-98. DOI: 10.1111/j.1755-2567.1970.tb00434.x. [Reprinted in Montague 1974, 222-46.]
Montague, Richard. 1974. Formal philosophy: Selected papers of Richard Montague, ed. by Thomason, Richmond H. New Haven, CT: Yale University Press.
Partee, Barbara H. 1980. Semantics—mathematics or psychology? Semantics from different points of view, ed. by Bäuerle, Egli and Stechow, Arnim von, 1-14. Berlin: Springer.
Partee, Barbara H. 1981. Montague grammar, mental representations, and reality. Philosophy and grammar, ed. by Kanger, Stig and Öhman, Sven, 59-78. Dordrecht: Reidel.
Partee, Barbara H. 1984. Compositionality. Varieties of formal semantics, ed. by Landman, Fred and Veltman, Frank, 281-311. Dordrecht: Foris.
Partee, Barbara H. 1995. Lexical semantics and compositionality. An invitation to cognitive science, vol. 1: Language, 2nd edn., ed. by Gleitman, Lila R. and Liberman, Mark, 311-60. Cambridge, MA: MIT Press.
Pater, Joe. 2019. Generative linguistics and neural networks at 60: Foundation, friction, and fusion. Language 95(1). e41-e74.
Plate, Tony A. 1994. Distributed representations and nested compositional structure. Toronto: University of Toronto dissertation.
Pollack, Jordan B. 1990. Recursive distributed representations. Artificial Intelligence 46. 77-105. DOI: 10.1016/0004-3702(90)90005-K.
Potts, Christopher, and Levy, Roger. 2015. Negotiating lexical uncertainty and speaker expertise with disjunction. Berkeley Linguistics Society 41. 417-45. DOI: 10.20354/B4414110013.
Ruppenhofer, Josef, Ellsworth, Michael, Petruck, Miriam R. L., Johnson, Christopher R.; and Scheffczyk, Jan. 2006. FrameNet II: Extended theory and practice. Berkeley, CA: International Computer Science Institute.
Searle, John R. 1972. Chomsky's revolution in linguistics. The New York Review of Books 18. 12-29. Online: https://www.nybooks.com/articles/1972/06/29/a-special-supplement-chomskys-revolution-in-lingui/.
Smolensky, Paul. 1990. Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence 46. 159-216. DOI: 10.1016/0004-3702(90)90007-M.
Socher, Richard, Huval, Brody, Manning, Christopher D.; and Ng, Andrew Y. 2012. Semantic compositionality through recursive matrix-vector spaces. Proceedings of the 2012 joint conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 1201-11. Online: http://aclweb.org/anthology/D12-1000.
Socher, Richard, Pennington, Jeffrey, Huang, Eric H., Ng, Andrew Y.; and Manning, Christopher D. 2011. Semi-supervised recursive autoencoders for predicting sentiment distributions. Proceedings of the 2011 conference on Empirical Methods in Natural Language Processing, 151-61. Online: http://www.aclweb.org/anthology/D11-1014.
Socher, Richard, Perelygin, Alex, Wu, Jean Y., Chuang, Jason, Manning, Christopher D., Ng, Andrew Y.; and Potts, Christopher. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. Proceedings of the 2013 conference on Empirical Methods in Natural Language Processing, 1631-42. Online: http://aclweb.org/anthology/D13-1170.
Thomason, Richmond H. 1974. Introduction. In Montague 1974, 1-69.
Turney, Peter D., and Pantel, Patrick. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research 37. 141-88. DOI: 10.1613/jair.2934.
Wilson, Deirdre, and Carston, Robyn. 2007. A unitary approach to lexical pragmatics: Relevance, inference and ad hoc concepts. Pragmatics, ed. by Burton-Roberts, Noel, 230-59. Basingstoke: Palgrave Macmillan.