
Semantics and Deep Learning

Published online by Cambridge University Press:  16 December 2024

Lasha Abzianidze
Affiliation:
Utrecht University
Lisa Bylinina
Affiliation:
Utrecht University
Denis Paperno
Affiliation:
Utrecht University

Summary

This Element covers the interaction of two research areas: linguistic semantics and deep learning. It focuses on three phenomena central to natural language interpretation: reasoning and inference; compositionality; extralinguistic grounding. Representation of these phenomena in recent neural models is discussed, along with the quality of these representations and ways to evaluate them (datasets, tests, measures). The Element closes with suggestions on possible deeper interactions between theoretical semantics and language technology based on deep learning models.
Type
Element
Information
Online ISBN: 9781009542340
Publisher: Cambridge University Press
Print publication: 16 January 2025

1 IntroductionFootnote *

This Element covers the interaction of two areas of research: linguistic semantics and deep learning. These fields share a lot of mutually relevant ground, but at the same time, the dialogue between the respective research communities is often constrained by the lack of transparency in terminology and background assumptions. With this Element, we aim to foster the connections between the two fields by highlighting the relevance of these fields to each other and by providing an introduction into the points where natural language semantics and deep learning meet. Instead of enumerating all possibly relevant topics, we will take a close look at three fundamental meaning-related phenomena – semantic inference, compositionality, and extralinguistic grounding – and use them as study cases, discussing how these phenomena are treated in modern computational models. The discussion is accompanied by demonstrations that readers are invited to play with. No prior programming experience is required to run the code.Footnote 1

In recent years, the deep learning revolution has changed the landscape of natural language processing (NLP), especially so after a deep neural architecture that is the basis of practically all of today’s most successful models – the Transformer (Vaswani et al., Reference Vaswani, Shazeer and Parmar2017) – was introduced, and various models based on this architecture were trained on large quantities of primarily linguistic data. Artificial intelligence (AI) systems built on these models are growing and getting higher and higher scores on various tasks as we speak, advancing the state of the art (SOTA). These systems and tools are rapidly becoming a part of everyday life for more and more people, obviously so after the notable release of one such system, ChatGPT, in late 2022.Footnote 2

Given the ever-growing omnipresence of such tools, a solid understanding of both their successes and weaknesses is important. Some of the seemingly simple but at the same time fundamental questions that one can ask here are: Do these models understand the texts they process and produce? Do they capture the meaning of texts in natural language?

There are two aspects to these questions: an instrumental and a theoretical one. Instrumentally, the answer depends on how well deep learning models perform on tasks that – presumably – require semantic competence. As the discussion that follows will show, despite the fact that deep learning models have pushed the SOTA forward in many areas of NLP, what kinds of linguistic – and in particular, semantic – knowledge these models develop as a result of training is still the subject of ongoing study. Indeed, NLP evaluation is typically organized around datasets that may or may not reflect the generality and real nature of linguistic knowledge. More specifically, various semantic tasks reportedly still prove hard for modern models despite their superficial success.

On a theoretical level, the answer to these questions obviously depends on what meaning is and what is required of true natural language understanding. These questions lie at the core of the discipline of theoretical semantics. Theoretical choices concerning the nature of linguistic meanings provide a framework for the instrumental evaluation and development of NLP systems: What do we need to test in order to address models’ semantic capabilities? What types of learning agents do we think lead to better meaning representations?

This instrumental–theoretical relation goes both ways: On the one hand, theories of how humans convey and extract linguistic meanings set the stage for what to expect from artificial linguistic systems and agents. On the other hand, performance of deep learning models can inform linguistic theory: If we observe a particular success or failure of a model on some task, is it expected under our view on how meanings are represented and acquired, given how this model was trained? Or should we adjust our theoretical understanding of semantics accordingly?

These are all questions with no definitive answers, and we will not try to pretend otherwise in this Element. Instead, we will give substance to the debates around these questions and invite the readers to think about them together with us.

We start with laying out the necessary technical background on text representation in models of interest. Then we establish the theoretical context for our discussion and how it relates to the current debates about semantics in such models.

Then we move on to the three topics this Element will focus on. We start from the inferential perspective on semantics in Section 2. We discuss how deep learning systems apply to modeling inference between sentences or larger linguistic units. Then, in Section 3, we discuss how vector-based and deep learning methods approach the phenomenon of semantic compositionality, and how semantic compositionality is tested and probed. Finally, in Section 4, we turn to the quickly developing field of language and vision, where referential properties of language expressions receive an automated treatment. We discuss the representation of these phenomena in recent neural models and the quality of these representations, as well as ways to evaluate them. We close the Element with directions for future research and deeper possible interconnections between deep learning and theoretical semantics.

1.1 Technical Context: Vector Representations

Artificial neural networks are mathematical structures that formalize data processing as operations over numeric vectors. Let us unpack this.

Word Vectors

A (k-dimensional) vector is a sequence of k numbers. Vectors or vector combinations can represent diverse kinds of data, including linguistic data. For example, one version of GloVe (Pennington, Socher & Manning, Reference Pennington, Socher and Manning2014) assigns every word of English a fifty-dimensional vector:

  1. to: 0.680, 0.039, 0.302, …, 0.073, 0.065, 0.260

  2. and: 0.268, 0.143, 0.279, …, 0.632, 0.250, 0.381

  3. government: 0.388, 1.083, 0.450, …, 1.194, 0.653, 0.763.

Vectors in GloVe and other models are estimated from data, most commonly from the way words are used in texts. The numeric values in resulting word vectors encode diverse word properties correlated with word usage, including semantic and syntactic properties. Simplifying, one can think of these values as encoding word features including part of speech, gender, animacy, etcetera, although the values are continuous and do not correspond to interpretable features in a perfect or one-to-one fashion. So while the dimensions are usually estimated from distributions, they can be seen as reflecting an underlying conceptual space in the spirit of Gardenfors (Reference Gardenfors2004).

Relations between word vectors are often regular, allowing for methods such as vector analogy solving: to solve UK : London = France : ?, one can apply arithmetic operations to the words involved and search for the word whose vector is nearest to vec(London) − vec(UK) + vec(France).
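To make this concrete, here is a small code sketch, assuming the gensim library and one of its publicly downloadable 50-dimensional GloVe models; the model name and the exact neighbors returned are assumptions on our part rather than part of the original example.

```python
# A minimal sketch of vector analogy solving with pretrained 50-dimensional
# GloVe vectors via the gensim library. The model name refers to a publicly
# available download; the exact nearest neighbors returned may vary.
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")  # 50-dimensional GloVe vectors

# UK : London = France : ?  ->  search near vec(london) - vec(uk) + vec(france)
result = glove.most_similar(positive=["london", "france"], negative=["uk"], topn=3)
print(result)  # ideally "paris" appears among the top candidates
```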

1.1.1 Word Embedding Models and Neural Language Models

Naturally occurring texts are a rich source of data. The task of language models (LMs) to predict continuation of a textual sequence can be also thought of as the task of classifying contexts (sequences of words) according to which word(s) can serve as a likely continuation of the context. In this case, the number of classes is huge as each vocabulary item is its own class.

For example, one can take as input a single word context (e.g., scientific), encode it as a vector, and use the model to predict likelihood scores for different words to appear in the context of scientific. Such a model will assign high scores to words like approach or major, and low scores to words that are not particularly likely to appear near scientific. In practice, a model represents each context (e.g., the word scientific) with a relatively low-dimensional vector v from which a vector of scores for all vocabulary items is derived via the matrix multiplication M_WE v, where the word embedding matrix M_WE contains compact vectors of words in the vocabulary. There are good reasons to use vectors of relatively low dimensionality. First, they are more practical in computation, including various vector operations used in neural network models. Second, features of lower-dimensional vectors may better approximate abstract features of words, including features corresponding to semantic properties. Because of this, inputs with similarities in meaning or syntactic properties end up with substantial overlap in their vector features.

Systems that predict likely words in this way are known as word embedding models. They operate with individual word inputs, as in the example just provided.

Early methods that estimated word vectors from corpus data included the Hyperspace Analog of Language (Burgess & Lund, Reference Burgess and Lund1995) and Latent Semantic Analysis, also known as LSA (Landauer & Dumais, Reference Landauer and Dumais1997). The advent of neural network methods in the 2010s led to the creation of several efficient algorithms for word vector estimation, which were released as word2vec (Mikolov et al., Reference Mikolov, Chen, Corrado and Dean2013) and GloVe (Pennington et al., Reference Pennington, Socher and Manning2014). Similar to word2vec, fastText (Bojanowski et al., Reference Bojanowski, Grave, Joulin and Mikolov2017) extended its coverage to rare and unseen words by exploiting cues from the character sequences within the word. These algorithms proved robust and fared better in empirical evaluations than earlier methods (Baroni, Dinu, & Kruszewski, Reference Baroni, Dinu and Kruszewski2014).

In contrast to (static) word embedding models, neural LMs predict the probability distribution over the next word (or other text elements) given a sequence of other elements in context – for example, the sequence Let’s use the scientific or The cat is sitting. For the latter sequence, the next word prediction may look as in Table 1. As seen in this example, the sequence is predicted to be continued with a dot or “on,” with prepositions like “under” or “by” predicted as less likely, and many other words having negligible predicted probabilities (rounded to 0 in Table 1).

Table 1 Neural word embedding models (and neural language models) assign, for a given context, a logit score to each element of the model’s vocabulary; these scores can then be transformed into probabilities using the softmax function.

Classes:        on     the    by     .      from   he     my     at     under   ...
Scores:         7.1    2.3    2.6    7.5    1.8    0.22   0.25   4.74   6.0
Probabilities:  0.23   0      0.02   0.34   0      0      0      0.02   0.08
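The softmax step can be illustrated with a few lines of code. The sketch below recomputes probabilities from the logit scores of Table 1 using numpy; because the rest of the vocabulary is omitted here, the resulting numbers will not match the table exactly.

```python
# A small illustration (not the actual model) of how logit scores are turned
# into probabilities with the softmax function. The scores are the ones shown
# in Table 1; a real model produces one score per vocabulary item, so the
# probabilities computed over only nine candidates will differ from the table.
import numpy as np

classes = ["on", "the", "by", ".", "from", "he", "my", "at", "under"]
scores = np.array([7.1, 2.3, 2.6, 7.5, 1.8, 0.22, 0.25, 4.74, 6.0])

probs = np.exp(scores) / np.exp(scores).sum()   # softmax over the listed classes
for word, p in zip(classes, probs.round(2)):
    print(f"{word}\t{p}")
```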
The Sequence Neural Models: From Recurrent Networks to Transformers

Often, vector operations proceed via multiple computation steps – that is, output vector v is computed from vector u that is itself computed from the input vector x. The intermediate computation steps are called the hidden layers of the model, and a model that includes hidden layers is considered a deep neural network. Machine learning methods that create such models are known as deep learning.

For most purposes, assigning vectors to words is not enough if the goal is to process diverse kinds of structures, such as phrases, sentences, or longer texts. This motivates several types of sequence models, which can adapt to inputs of variable length. In all sequence models, the input is a sequence of vectors representing text units (for example, words), and the output is a sequence of calculated vectors:

x_1, x_2, …, x_n  ↦  h_1, h_2, …, h_n. (1)

The vector representations h_1, h_2, …, h_n that a sequence model derives can then be used for diverse tasks such as sequence classification, tagging (token classification), etcetera.

The oldest type of sequence neural network, inspired by real-time signal processing in humans, is the recurrent neural network (RNN). Recurrent neural networks process the input one element at a time, computing the memory representation h_k from h_{k−1} and the k-th input element x_k. Simple recurrent networks (SRNs), proposed by Elman (Reference Elman1990), already showed promising results on toy linguistic input, but presented diverse problems at training time. More efficient recurrent architectures were proposed later, with two gaining wide adoption: the Long Short-Term Memory, or LSTM (Hochreiter & Schmidhuber, Reference Hochreiter and Schmidhuber1997), and the Gated Recurrent Unit, or GRU (Cho et al., Reference Cho, Van Merriënboer and Gulcehre2014).

Most current applications, however, rely on the Transformer model (Vaswani et al., Reference Vaswani, Shazeer and Parmar2017); see Section 1.1.2. In addition to other practical benefits, the self-attention mechanism underlying Transformer models makes it easier to learn and execute nonlocal operations on the sequence.

Regardless of the precise underlying architecture, sequence models can be used to produce contextualized token vectors. If x_k is an input word embedding, the corresponding h_k in the output represents the k-th word in context. One can also select one of the output vectors, often the last one h_n, to represent the whole sequence. For example, in a task like natural language inference, the vector resulting from processing the concatenation of the premise and the hypothesis can serve to provide features labeling the example as entailment / contradiction / neutral; see Section 2.
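As an illustration, the following sketch (using PyTorch, with arbitrary toy dimensions and random inputs) builds a small LSTM sequence model, reads off the contextualized vectors h_1, …, h_n, and feeds the last one into a three-way classifier of the kind used for natural language inference.

```python
# A minimal sketch of a sequence model in PyTorch: an LSTM maps input vectors
# x_1..x_n to contextualized vectors h_1..h_n; the last vector can stand in
# for the whole sequence (e.g., a concatenated premise-hypothesis pair).
# Dimensions and the random inputs are arbitrary placeholders.
import torch
import torch.nn as nn

embedding_dim, hidden_dim, seq_len = 50, 64, 7
lstm = nn.LSTM(input_size=embedding_dim, hidden_size=hidden_dim, batch_first=True)

x = torch.randn(1, seq_len, embedding_dim)   # x_1 ... x_n (one sequence)
h, _ = lstm(x)                               # h_1 ... h_n, shape (1, seq_len, hidden_dim)

sequence_vector = h[:, -1, :]                # h_n as a representation of the whole sequence
classifier = nn.Linear(hidden_dim, 3)        # entailment / contradiction / neutral
print(classifier(sequence_vector).shape)     # torch.Size([1, 3])
```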

Subword Tokenization

For practical reasons, modern NLP models limit the size of their vocabulary. As a result, neural networks often represent text as sequences of tokens, where each word can be a token on its own (if the word is frequent) or broken into multiple tokens (if the word is rare). For example, in the first lines of Hamlet’s monologue “To be or not to be,” the SOTA GPT-4 model treats most words and punctuation marks as one token each. However, GPT-4’s underlying byte pair encoding (BPE) tokenizer breaks rare word forms such as nobler into subword tokens, for example nob and ler, character sequences that often occur as parts of other rare words.Footnote 3

There are several widely used subword tokenization algorithms, usually built upon the BPE method (Sennrich, Haddow & Birch, Reference Sennrich, Haddow and Birch2015). Models by Google often rely on the WordPiece algorithm (Song et al., Reference Song, Salcianu and Song2021), inspired by BPE but built upon a proprietary technology. SentencePiece (Kudo & Richardson, Reference Kudo and Richardson2018) can use BPEs but does not require word-separated input, applying to diverse languages and writing systems.
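The effect of subword tokenization can be inspected directly. The sketch below loads the publicly available GPT-2 BPE tokenizer from the transformers library (GPT-4’s own tokenizer is not used here, so the exact splits are only illustrative).

```python
# A minimal sketch of BPE subword tokenization, using the publicly available
# GPT-2 tokenizer as a stand-in. The exact splits shown depend on the
# tokenizer and are only illustrative of the general behavior.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok.tokenize("To be or not to be, that is the question"))  # mostly one token per word
print(tok.tokenize("nobler"))   # a rarer word form is split into subword pieces
```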

1.1.2 Transformer Architecture
Self-Attention

The attention mechanism originally gained wide acceptance in text processing in the field of machine translation as a useful addition to RNNs, starting from Bahdanau, Cho, and Bengio (Reference Bahdanau, Cho and Bengio2014). Later, Vaswani et al. (Reference Vaswani, Shazeer and Parmar2017) introduced self-attention as the core mechanism for sequence processing, completely replacing RNNs. Self-attention allows for efficient training of ever-larger models on ever-larger data, which was not technically feasible with RNNs.

We reproduce the equation of self-attention here:

Attention(Q, K, V) = softmax(QK^T / √d_k) V, (2)

where Q, K, and V are computed from the underlying sequence embedding M by multiplying it with matrices of numeric parameters: Q = MW_Q, K = MW_K, V = MW_V. The softmax function normalizes the scores that reflect the match between “query” vectors in Q and the “key” vectors in K, so that the weights for each position are positive and sum to 1; see the following example. Several self-attention components, called “attention heads,” are computed in parallel and combined into “multihead attention” Multihead, which is then combined with the input embedding matrix M via the residual connection, giving M + Multihead(M) as output.

Informally, self-attention copies to a given token vector m_k information from token vectors at other positions. The tokens to be copied from are determined by the match between the query vector of token m_k from matrix Q, and the key vectors of all tokens from matrix K.

For example, take input M to encode text as a sequence of numeric vectors:

M = [a matrix of token vectors, one column per token: the, cat, is, sitting]

Vectors encoding tokens can be similar to each other across some or all dimensions. The vectors for is and sitting are the most similar. For example, the sign of all their vector components is the same: negative for the first dimension and positive for the others. Numbers in token vectors can encode semantic and syntactic information – for example, the second dimension could partly encode part of speech, with positive values for verbs.

A match between queries and keys is computed as QKT, which after applying vector size and softmax normalization gives an attention matrix – for example:

          the    cat    is     sitting
the       0      0      0      0
cat       0.99   0.01   0.86   0
is        0.01   0.19   0.14   1
sitting   0      0.8    0      0

(each column shows how the corresponding token distributes its attention over the tokens listed in the rows; the weights in each column sum to 1)

The attention matrix specifies how much update each token’s vector receives from different tokens. In our example, the last token sitting gets its entire update from token is, while token the receives 99 percent of its update from cat.

The values of the updates are taken from a separate matrix V – for example:

V = [a matrix of value vectors, one column per token: the, cat, is, sitting]

The resulting attention update (again one column per token: the, cat, is, sitting) is added to the input.

After self-attention, the vector representations become more contextualized. In our toy example, both the and is received most of their attention update from cat. As a result, the vectors of the and is now signal aspects of their relation to the word cat. Being updated with similar information, these tokens also become more similar to each other:

[updated token vectors, one column per token: the, cat, is, sitting]

There are several other kinds of computation steps in Transformers, but self-attention is central. Informally, self-attention allows for information flow between positions in the sequence by selecting which positions to copy information from (via K to Q matching) and what form this information takes (via the V matrix). Roughly speaking, self-attention is the operation of selecting positions according to features in K and copying features from V. Components of self-attention specify the source, target, and nature of the copied information. For instance, Transformers could naturally approximate rules like “copy into the vector of a verb (encoded in Q) ontological semantic features (V) from the closest noun to the left (K).”
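The following sketch implements a single self-attention head as in equation (2) with numpy; the parameter matrices are random placeholders rather than trained weights.

```python
# A minimal numpy sketch of single-head self-attention as in equation (2).
# M holds one row per token; the parameter matrices are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 4, 8, 8          # e.g., "the cat is sitting"
M = rng.normal(size=(n_tokens, d_model))  # input token embeddings

W_Q, W_K, W_V = (rng.normal(size=(d_model, d_k)) for _ in range(3))
Q, K, V = M @ W_Q, M @ W_K, M @ W_V

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

attention_weights = softmax(Q @ K.T / np.sqrt(d_k))  # each row sums to 1
update = attention_weights @ V                        # information copied from other positions
output = M + update                                   # residual connection
print(attention_weights.round(2))
```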

Other Transformer Components

In addition to the core self-attention mechanism that drives contextualization of word or token representations, there are several other components to the computation. Each of the components contributes nontrivially to the vector output values of the Transformer (Mickus, Paperno, & Constant, Reference Mickus, Paperno and Constant2022).

For example, layer normalization is applied to intermediate vector representations at various points of computation. Every vector is scaled so that its dimensions have a mean of 0 and a standard deviation of 1, making sure that no vector’s dimensions take extreme values. This technique makes the training more efficient and reliable. It balances the potentially unbounded contributions of other computation components, especially the feedforward step (see later in this section), which can introduce extreme vector value updates.

Positional encodings are another component required for the Transformer to work for natural language. The self-attention mechanism updates the inputs on the basis of their vector representations. If the input was encoded simply via word vectors, the Transformer would have been a bag-of-words model, ignoring the order in which the words appear. To inject order information, each token in the input is encoded as the sum of token vectors and positional encodings, special vectors uniquely characterizing the position of the token in the text. Positional encodings are designed so that nearby positions in the sequence receive similar positional encoding vectors. This allows self-attention operations to target not only word features but also positional features (e.g., “copy features from a preceding adjective to the noun”).
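One common design, used in the original Transformer, is the sinusoidal positional encoding; the sketch below computes such encodings for a toy sequence (dimensions are arbitrary).

```python
# A minimal sketch of sinusoidal positional encodings (one common design):
# nearby positions receive similar vectors, which self-attention can exploit.
import numpy as np

def positional_encodings(n_positions, d_model):
    pos = np.arange(n_positions)[:, None]        # positions 0, 1, 2, ...
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((n_positions, d_model))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe

pe = positional_encodings(n_positions=6, d_model=8)
# token_embeddings + pe would be the Transformer's position-aware input
print(pe.round(2))
```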

Feedforward networks are interleaved in Transformers with self-attention and normalization operations. During the feedforward step, each token vector v (previously contextualized via self-attention) is passed through a neural network FFN, which consists of multiplying by two matrices of numeric weights and a nonlinear operation: FFN(v) = W_2 max(0, W_1 v).

The result is added to the input via a residual connection: v+FFN(v). The feedforward step is the one that introduces nonlinear transformations of the information about the current token and its context. Most of the numeric parameters of modern Transformer models correspond to the feedforward step. It has been argued that it is the feedforward networks that embody most of the knowledge encoded by Transformer models, including relational mappings such as correspondence between embeddings of present and past tense of verbs (Merullo, Eickhoff, & Pavlick, Reference Merullo, Eickhoff and Pavlick2023).
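The feedforward step, its residual connection, and layer normalization can be sketched in a few lines (with random placeholder weights).

```python
# A minimal sketch of the feedforward step with its residual connection and
# layer normalization, for a single token vector v. Weights are placeholders.
import numpy as np

rng = np.random.default_rng(1)
d_model, d_ff = 8, 32
W1 = rng.normal(size=(d_ff, d_model))
W2 = rng.normal(size=(d_model, d_ff))

def ffn(v):
    return W2 @ np.maximum(0, W1 @ v)        # FFN(v) = W_2 max(0, W_1 v)

def layer_norm(v, eps=1e-5):
    return (v - v.mean()) / (v.std() + eps)  # mean 0, standard deviation 1

v = rng.normal(size=d_model)                 # a contextualized token vector
out = layer_norm(v + ffn(v))                 # residual connection, then normalization
print(round(float(out.mean()), 6), round(float(out.std()), 6))  # approximately 0 and 1
```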

1.1.3 Neural Model Training

Deep neural network models include a large number of numeric parameters that need to be estimated, or learned, from data. In the case of Transformer LMs, these trainable parameters include numeric values in vectors of all tokens in the vocabulary and in matrices that define the model’s self-attention and feedforward operations.

Ultimately, large language models (LLMs) are evaluated on downstream tasks. For example, the natural language inference task often boils down to classifying sentence pairs as exemplifying an entailment, a contradiction, or neither.

Pretraining and Fine-Tuning

In modern deep learning models, a common approach to achieving state-of-the-art results on specific tasks combines self-supervised pretraining with task-specific fine-tuning, as illustrated in Figure 1.

Figure 1 Two-step training paradigm: in the pretraining stage, the model is trained on large text corpora on a word prediction task. Task-specific training (fine-tuning) happens separately, with the pretrained model as a starting point.

Typically, an LLM is pretrained on a distributional task, meaning that its output representations are optimized for predicting a match between the context and the textual element that can appear in it. For instance, the vector representation of a sentence can be trained to predict what continuations the sentence is likely to have. In GPT-like models (Radford et al., Reference Radford, Wu and Child2019), the training signal comes from predicting the next token in the context of the preceding sequence of tokens. In other models (like BERT; Devlin et al., Reference Devlin, Chang, Lee, Toutanova, Burstein, Doran and Solorio2019), token prediction happens in a bidirectional context (sentence with gaps), with tokens to be predicted typically replaced by a dedicated [MASK] token. As such, the vector representation that is useful for token prediction cannot be immediately applied to alternative tasks involving reasoning, question answering, or sentiment analysis, inviting additional approaches including fine-tuning; see, however, Radford et al. (Reference Radford, Wu and Child2019) and Brown et al. (Reference Brown, Mann, Ryder, Larochelle, Ranzato and Hadsell2020) for influential views on the transfer of pretrained LMs to new tasks without such computationally expensive steps.

Vector outputs of a pretrained model can serve as input to a simpler neural component such as a feedforward neural network. The latter makes the actual task-specific predictions, such as whether one sentence in a pair entails the other. The whole pipeline can then be trained on the task-specific data (e.g., inference data), updating both the feedforward network’s weights and the weights of the pretrained LM. The resulting fine-tuned LM differs from the original pretrained one, and produces task-specific vector representations of the input. Note that fine-tuning only produces reasonable empirical results when applied to a pretrained model, rather than learning the weights from scratch on task-specific data. One can think of the process of fine-tuning as highlighting the features of compositional representations produced by the pretrained model that are relevant for the specific task at hand, and suppressing irrelevant features. The intuition here is that the distributional pretraining allows the model to extract a wide set of features from the text, different subsets of which are useful for different downstream tasks. Features useful for one task (e.g., inference) may happen to be complementary to the features useful for another task (e.g., sentiment analysis), so instances of the same model fine-tuned on these tasks may prove quite distinct. Note that fine-tuning mainly affects representations at the top layers of deep models while the bulk of processing that happens in a majority of layers remains largely intact (Mickus et al., Reference Mickus, Paperno and Constant2022).
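As a minimal sketch of the starting point of fine-tuning, the following code (using the transformers library; the checkpoint name and the premise–hypothesis pair are only illustrative) loads a pretrained encoder together with a freshly initialized three-way classification head, which would then be trained on inference data.

```python
# A minimal sketch of the pretrain-then-fine-tune recipe: a pretrained encoder
# plus a small, randomly initialized classification head. "bert-base-uncased"
# is just one publicly available checkpoint; the actual fine-tuning loop on
# task-specific data is omitted here.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=3  # entailment / contradiction / neutral
)

# One premise-hypothesis pair, encoded as a single input for the classifier.
batch = tokenizer("A cat is sitting on a chair.", "A cat is on a chair.",
                  return_tensors="pt")
logits = model(**batch).logits   # essentially random before fine-tuning on inference data
print(logits)
```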

Both pretraining and fine-tuning follow the end-to-end training approach: model parameters (weights) are not estimated for each module separately. Instead, even in the biggest models, all parameters are tuned in parallel, with an eye on how they affect the output in a given task. In pretraining, the output is the likelihood score assigned to the currently predicted token; in fine-tuning, it is the task-specific output, such as the likelihood scores of the different classes in a classification task.

Fine-tuning a neural network on a dataset may lead to a loss of its generality. There is a risk that a model adapts to biases of the data on which it was fine-tuned, learning shallow statistical regularities. For example, the presence of the word not may be associated in a fine-tuning dataset with the example being a contradiction. A system fine-tuned on such a dataset may learn the shallow heuristic not ⇒ contradiction and fail to apply correctly to data from other sources where the heuristic is not helpful. For more discussion, see Section 2.

Methods for adjusting model parameters in neural networks rely on gradient descent. In simple terms, this means that each numeric parameter of the deep neural network is updated proportionally to the degree to which its change moves the model’s prediction toward the desired output. Measures of discrepancy between the prediction and the desired outputs are known as loss functions.
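A minimal sketch of one such update in PyTorch, with a toy linear model standing in for a much larger network:

```python
# A minimal sketch of one gradient-descent update in PyTorch: the cross-entropy
# loss measures the discrepancy between predicted class scores and the desired
# labels, and each parameter is nudged in the direction that reduces the loss.
import torch
import torch.nn as nn

model = nn.Linear(10, 3)                       # a stand-in for a much larger network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 10)                         # a toy batch of 4 input vectors
y = torch.tensor([0, 2, 1, 0])                 # desired class labels

loss = loss_fn(model(x), y)                    # how far predictions are from targets
loss.backward()                                # gradients: how each parameter affects the loss
optimizer.step()                               # update parameters using the gradients
optimizer.zero_grad()                          # reset gradients for the next update
```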

Instruction-Tuning

Instruction-tuning is a specific type of refinement for pretrained LMs that has shown a distinctive potential since 2022.

An LM that predicts probabilities of tokens in context can be used for text generation. In this case, one may first estimate the probabilities of possible tokens, and pick a likely one to be generated. The newly generated token is appended to the context, and the next possible token is predicted. This text-generation process is called autoregressive decoding and exists in several alternative algorithms such as greedy decoding, nucleus sampling, top-k sampling, and beam search.
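The sketch below illustrates these decoding options with a small publicly available LM (GPT-2 as a stand-in); its outputs will vary and are not meant to be representative of larger models.

```python
# A minimal sketch of autoregressive decoding with GPT-2 as a stand-in,
# illustrating greedy decoding, top-k / nucleus sampling, and beam search.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
prompt = tok("The cat is sitting", return_tensors="pt")

greedy = model.generate(**prompt, max_new_tokens=10, do_sample=False)
sampled = model.generate(**prompt, max_new_tokens=10, do_sample=True, top_k=50, top_p=0.9)
beams = model.generate(**prompt, max_new_tokens=10, num_beams=5)

for out in (greedy, sampled, beams):
    print(tok.decode(out[0], skip_special_tokens=True))
```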

Text generation is one of the tasks on which LMs can be fine-tuned. In particular, one can fine-tune LMs to generate responses to textual instruction. This is called instruction tuning. Furthermore, one can ask human annotators to rate or rank an LM’s multiple possible responses to instructions. On the basis of human preferences, LMs can be further refined using techniques such as reinforcement learning (Ouyang et al., Reference Ouyang, Wu and Jiang2022). This approach underlies the creation of ChatGPT and similar models (Touvron et al., Reference Touvron, Martin and Stone2023), which have proven effective at following many types of textual instructions.

Beyond Attention: Prompting for Few-Shot Learning

A possible bottleneck of highly parametrized compositional models is the amount of data required for successful learning. A modifier, such as the adjective red, modeled as a linear mapping from n-dimensional word vectors to n-dimensional phrase vectors (as in a lexical function model; see Section 3.2.2), would be represented via n² parameters (realistic case: n = 300, n² = 90,000). Learning compositional semantics therefore requires a wealth of data to estimate this huge number of parameters. In contrast, human learners need only a small number of examples to learn a new adjective and use it correctly with different nouns. Natural language compositionality can therefore be seen as a skill crucially involving few-shot learning. Indeed, few-shot learning behavior characterizes current LLMs (Brown et al., Reference Brown, Mann, Ryder, Larochelle, Ranzato and Hadsell2020; Patel et al., Reference Patel, Li and Rasooli2022). In few-shot evaluation, LLMs are not fine-tuned on the task, but are provided a few examples of the completed task as context – for example:

necktie -> cravat

wave -> onde

Within that context, the model is tasked with continuing yet another example.

man ->_
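In practice, such a prompt is simply passed to an LM as context; the sketch below does this with a small publicly available model (GPT-2), which serves only as a stand-in, since the few-shot behavior discussed next is characteristic of much larger LMs.

```python
# A minimal sketch of few-shot prompting: the completed examples from the text
# are concatenated into a single context, and the model is asked to continue.
# GPT-2 is only a stand-in; robust few-shot behavior is reported for much
# larger LMs, so the output here may well be wrong.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "necktie -> cravat\nwave -> onde\nman ->"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```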

The few-shot behavior of large pretrained LMs has been specifically demonstrated on presumably compositional tasks, such as question answering and unsupervised machine translation. The few-shot behavior of LLMs is not yet fully understood. Chan et al. (Reference Chan, Santoro and Lampinen2022) argue that both the Transformer architecture and the structure of natural language corpora are necessary for LMs to develop the few-shot learning behavior. Olsson et al. (Reference Olsson, Elhage and Nanda2022) argue for a causal mechanism they call “induction heads,” a specific way in which an attention mechanism can explain the few-shot learning behavior.

Beyond Attention: Chain of Thought

Under chain-of-thought prompting, the model is prompted to produce the text of the intermediate steps through which one can arrive at the output, which can help with recursion and multistep reasoning.

The following example taken from the Google AI blog illustrates the chain-of-thought prompting at work:Footnote 4

Example input

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

Example output

Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5+6 = 11. The answer is 11.

In the example, only the last sentence of the output constitutes the answer to the question. The rest of the output only helps the model to arrive at the final answer. Indeed, chain-of-thought prompting improves the few-shot learning of Transformer models on tasks that involve multiple reasoning steps.
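Constructing a chain-of-thought prompt amounts to placing one or more worked examples before the new question; in the sketch below, the follow-up question is our own illustrative addition, and a sufficiently capable model would be needed to reproduce the reported behavior.

```python
# A minimal sketch of chain-of-thought prompt construction: the worked example
# above is placed in the context, followed by a new question in the same
# format. Only the prompt is built here; it would be passed to an LM's
# generation function.
worked_example = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)
new_question = ("Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
                "How many apples do they have?\nA:")
prompt = worked_example + new_question
print(prompt)
```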

With this technical background in mind, we can turn to the second main ingredient of our survey: natural language semantics.

1.2 Theoretical Context: Natural Language Semantics

In this section, we set the theoretical foundation in semantics for the rest of the Element. Throughout the Element, we will talk a lot about “meaning,” so we need to make this notion a bit more specific before we embark on the main discussion.

The nature of linguistic meanings and their place in the overall architecture of natural language grammar have been debated for millennia, and we will certainly not try to settle this debate here or follow its historic development (see Harris Reference Harris1993 for an overview of recent history of semantics and linguistics in general).

Instead, let us take a different route: rather than directly asking fundamental questions about meaning, we shift our attention to more practical but related questions and let them guide us in building the theoretical basis for our discussion. Rather than asking “What is natural language semantics?” we can ask something like “How can semantic knowledge be detected in linguistic behavior?” On a more concrete level, instead of asking “What’s the meaning of sentence X?” we can ask “How do we find out whether someone knows the meaning of sentence X?”

Take the sentence A cat is sitting on a chair. We know what this simple sentence means. This knowledge can manifest in a number of ways – for one, we are able to distinguish situations that can be truthfully described by this sentence from situations in which this sentence is false.

For example, given a schematic depiction of a situation in Figure 2 on the left, we can agree that the sentence A cat is sitting on a chair is true in this situation and false in the situation portrayed in the picture on the right – this is one of the ways our knowledge of the meaning of this sentence manifests itself.

Figure 2 Depictions of two situations: The sentence A cat is sitting on a chair is true in the left-hand situation, but not in the right-hand one.

Source: Images generated with AI image generation tool Midjourney, accessed April 15, 2023.

As trivial as this observation may seem, it is the intuitive basis for the currently most widespread approach to linguistic meanings, truth-conditional semantics. Knowledge of truth conditions of a sentence – the ability to distinguish situations where it is true from the ones where it is not – is under this approach fundamentally tied to the knowledge of what the sentence means. Heim and Kratzer (Reference Kratzer and Heim1998) open their classic textbook with the statement that equates truth conditions with sentence meaning: “To know the meaning of a sentence is to know its truth-conditions.”

To sketch an implementation of this idea, let us think of sentence interpretation as a function I that takes two arguments – a sentence in natural language and a situation – and returns a truth value: True or False (along with whichever additional truth values your system is designed to have – for instance, the truth value Undefined). For our running example, this function will return True if its first argument is A cat is sitting on a chair and the second argument is the situation depicted in the left-hand side of Figure 2 (and it will, of course, return False for the other situation of the two, given the same sentence):

I(A cat is sitting on a chair, the left-hand situation in Figure 2) = True (3)

A different but related function I′ would simply output the set of situations in which the sentence is true:

I′(A cat is sitting on a chair) = {s : A cat is sitting on a chair is true in s} (4)

Both functions have their place in semantic practice. The latter can be used to pinpoint one possible notion of the meaning of a sentence: the set of situations where it is true.

Another core meaning-related intuition – besides the knowledge of truth conditions – is the ability to recognize whether sentences stand in a particular meaning relation to each other – that is, to draw inferences. Simply put, if we know the meaning of a sentence, we know what conclusions we can draw from it and what conclusions are not justified. Entailment – one type of semantic inference – is typically defined on pairs of sentences A and B along the following lines (Coppock & Champollion, Reference Coppock and Champollion2022):

A ENTAILS B if and only if: in any case where A is true, B is true too. (5)
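These two notions can be illustrated with a toy implementation in which situations are small Python dictionaries and entailment is checked by quantifying over them; all names and situations below are our own illustrative choices.

```python
# A toy sketch of the truth-conditional picture: situations are tiny Python
# dictionaries, sentence meanings are functions from situations to truth
# values (the function I in the text), and entailment as in (5) is checked by
# quantifying over all situations under consideration.
situations = [
    {"cat_on_chair": True,  "cat_sitting": True},
    {"cat_on_chair": True,  "cat_sitting": False},
    {"cat_on_chair": False, "cat_sitting": False},
]

def cat_is_sitting_on_a_chair(s):        # "A cat is sitting on a chair"
    return s["cat_on_chair"] and s["cat_sitting"]

def cat_is_on_a_chair(s):                # "A cat is on a chair"
    return s["cat_on_chair"]

def denotation(sentence):                # I': the set of situations where the sentence is true
    return [s for s in situations if sentence(s)]

def entails(a, b):                       # (5): wherever A is true, B is true too
    return all(b(s) for s in situations if a(s))

print(entails(cat_is_sitting_on_a_chair, cat_is_on_a_chair))  # True
print(entails(cat_is_on_a_chair, cat_is_sitting_on_a_chair))  # False
```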

Note that both semantic notions corresponding to the basic meaning-related intuitions we discussed – truth conditions and semantic inference – operate on the level of sentences. Formal semantics as we know it has indeed been shaped primarily by sentence-level phenomena. This does not, of course, mean that no meanings are assigned to smaller linguistic units – phrases and individual words. But it would be fair to say that in this tradition, lexical meanings are viewed through the lens of their potential to combine into bigger units – ultimately, sentences.

Lexical meanings are therefore designed with combinatorial potential in mind: they need to be of the right type to combine into meanings of sentential type when used in a sentence. The sentential meanings, in turn, need to support evaluation for truth or falsity given a state of affairs, and to be of the right type for semantic inference.

We can now formulate the questions we will focus on in the forthcoming sections:

  • How do deep learning models capture semantic relations between sentences? (Section 2 “Textual Inference”)

  • How do deep learning models build sentential meanings from meanings of smaller expressions? (Section 3 “Compositionality”)

  • How do deep learning models relate linguistic meanings to nonlinguistic information – in particular, visual information? (Section 4 “Grounding: Language and Vision”)

Before we move on to the main sections discussing these questions, let us take a step back and have another look at the two main semantic notions that we have already introduced: truth conditions and semantic inference. Now, with some theoretical and technical background, we can elaborate a bit more on the role of these notions in semantic theory and in deep learning models trained on textual data.

Truth Conditions or Inference?

Which of the two notions – truth conditions or semantic inference – is taken as basic with respect to the other one defines the two views on natural language semantics that, in turn, provide two different perspectives on semantics in deep learning models. Let us zoom in on this a bit.

The currently most widely adopted version of compositional formal semantics builds on truth conditions – a view that can be traced back to the philosophical tradition that includes Alfred Tarski, Rudolf Carnap, Donald Davidson, David Lewis, and Richard Montague. As famously formulated by David Lewis (Reference Lewis1970), “Semantics with no treatment of truth conditions is not semantics.” Under this truth-conditional view – we will call it referential to contrast with its alternative – sentence-level semantics amounts to an association between sentences and sets of situations that make them true. Semantic inference relations within sentence pairs would then be mediated by sets of situations they describe: the relation holds by virtue of a set-theoretic relation between sentence meanings.

Objects that sentences are mapped to can have different specifics – they can be situations, worlds, circumstances, models, cases, etcetera, depending on the implementation. There is also variation among systems in whether the mapping between sentences and objects that express truth conditions is direct (Kratzer & Heim, Reference Kratzer and Heim1998; Montague, Reference Montague1970) or indirect via a representation language, typically some logic (Coppock & Champollion, Reference Coppock and Champollion2022; Montague, Reference Montague1973). The core of the referential view on semantics would remain the same: Meaning is defined by reference, understood as a mapping from linguistic objects onto something external to language itself.

Alternatively, semantic inference relations (including but not limited to entailment) can be taken as basic – defined directly on sentence representations, without referencing the situations or worlds. We will call the view that builds on semantic relations the inferential view (Fitch, Reference Fitch1973; Lakoff, Reference Lakoff1970; Moss, Reference Moss2010, Reference Moss2015; Murzi & Steinberger, Reference Murzi and Steinberger2017; Schroeder-Heister, Reference Schroeder-Heister2018; Sommers, Reference Sommers1982; Van Benthem, Reference Van Benthem1986, Reference Van Benthem2008). This description groups together theories that have very important differences with each other, but, crucially for our discussion, they all capitalize on semantic relations between linguistic expressions (primarily sentences) as the core semantic notion.

The guiding observation for this view is that, given that people reason using language, the logical structures underlying human reasoning should correspond to the grammatical structure of natural language in a deep way. If these regularities are given center stage in analyzing meaning, reference and truth conditions can be explained as their by-product. This program can be summed up in two theses:

  1. Meanings of linguistic expressions are determined by their role in inference.

  2. To understand a linguistic expression is to know its role in inference.

The difference between the referential and inferential views is deep, but at the same time it carries mostly metasemantic value: it is a difference in the order of explanation and in points of departure, such as formal and traditional logics. Radical versions of these views can also be seen as endpoints on the scale of importance of corresponding intuitions for semantics – those of truth conditions and those of inference. In practice, the views of most semanticists probably lie somewhere in between: grounding in nonlinguistic information has doubtless potential to enrich linguistic meanings; on the other hand, at least for some semantic phenomena, it is useful to directly examine semantic relations between expressions.

The importance of the referential/inferential distinction in the context of deep learning has to do with the fact that most of the deep learning models we will discuss are trained on exclusively textual data. This means that the representations these models develop are not referentially grounded to anything external to linguistic data itself (see, however, Section 4 on vision-and-language models).

The text-only training setup has stirred a debate around the semantic properties of LM representations. Do models trained on exclusively textual data develop representations that encode the full range of semantic information? Can tasks formulated as text-only be informative and useful for enhancing and/or probing models’ semantic capabilities? We will now give an overview of this debate.

Grounding Argument against Semantics in Text-Only Models

Language is inherently grounded in a variety of extralinguistic experiences (Barsalou, Reference Barsalou2008; Clark, Reference Clark1996; Harnad, Reference Harnad1990; Meteyard et al., Reference Meteyard, Cuadrado, Bahrami and Vigliocco2012; Parikh, Reference Parikh2001). Linguistic communication essentially involves a connection between what we say and what we mean, naturally implemented as a mapping between two separate spaces – the “what we say” and the “what we mean,” respectively. The expression the smell of coffee, for example, describes a corresponding nonlinguistic olfactory experience. Can an agent that has not been exposed to the “what we mean” side of messages develop an understanding of what any message means?

The architecture of a lot of widely used computational models for language does not involve explicit mapping between text and “states of affairs” (although see Radford et al. Reference Radford, Kim and Hallacy2021 and Section 4); they are usually not trained with the objective of mapping between the object language and such a model-theoretic space. This has led many to conclude that such models do not encode semantics at all – a conclusion that seems practically unavoidable under a referentialist truth-conditional view on semantics.

An influential position piece elaborating on this argument is Bender and Koller (Reference Bender and Koller2020), even though it might be a stretch to classify their position as strictly referentialist (their “what is meant” includes things like communicative intent, which is not really model-theoretic). In their own words, they “argue that the language modeling task, because it only uses form as training data, cannot in principle lead to learning of meaning.” Since the language modeling task is that of string prediction, the “meanings” – whatever they are – are not in the training signal. Bender and Koller conclude that, for this reason, meanings cannot be learned as the result of this process, since LMs are not provided the means of solving the “symbol grounding problem” (Harnad, Reference Harnad1990) – that is, they have no means to connect text representations to the world these texts are used to communicate about.

To illustrate this position, Bender and Koller introduce a thought experiment that they call the octopus test, largely inspired by the Turing test for artificial intelligence (Turing, Reference Turing2009). In the scenario, two people are stranded on two islands and are communicating via telegraph using an underwater cable. Meanwhile, an intelligent octopus underwater is eavesdropping on their conversations and, being extremely good at detecting statistical patterns, learns to predict the two people’s replies to each other. Eventually, the octopus inserts itself into the conversation, successfully pretending to be one of the people. But when facing a situation that requires real-world knowledge of what a coconut is, the octopus fails since it knows nothing about the referent of the word.

Bender and Koller (Reference Bender and Koller2020) conclude that statistical patterns of co-occurrence cannot be enough to develop knowledge of meaning.Footnote 5 In the discussion that followed, other researchers cast doubt on this conclusion. Let us now review their arguments in favor of semantics without grounding (Merrill, Warstadt, & Linzen, Reference Merrill, Warstadt and Linzen2022; Piantadosi & Hill, Reference Piantadosi and Hill2022; Potts, Reference Potts2020).

Meaning without Grounding?

Consider a counterargument: It is one thing for a semantic theory to predict that text-based models should be unable to encode semantic information – but it is up to the actual behavior of these models to either support this or suggest otherwise.

Following Potts (Reference Potts2020), let us shift our focus from a priori conclusions to a more practical reformulation: “Is it possible for language models to achieve truly robust and general capabilities to answer questions, reason with language, and translate between languages?” In this way, the extent to which the models can do so defines the extent to which they encode semantics (and therefore, have the capacity to achieve natural language understanding), regardless of the training data and objective.

There are at least two reasons for optimism. First, the general ability of deep learning models to acquire abstract information not explicitly given during training has been shown on, for example, hierarchical syntactic structure; see for instance the survey in Linzen and Baroni (Reference Linzen and Baroni2021). Second, empirically, we do not really know which types of input are necessary for humans to learn meanings and manipulate them. Visual grounding is clearly not necessary as congenitally blind people still acquire language (Landau & Gleitman, Reference Landau and Gleitman1985); the same holds for smell (returning to the example with the smell of coffee), and so on. This does not mean that human semantic knowledge does not have a grounding component to it at all, but the extent to which human semantic representations can be constructed in the absence of different types of grounding suggests that the same can in principle hold of artificial learners.

Piantadosi and Hill (Reference Piantadosi and Hill2022) address the same question from the perspective of conceptual role theory – a view on cognition in many ways close to the inferentialist semantic paradigm (see Margolis and Laurence Reference Margolis and Laurence1999 for an overview of conceptual role theory and its alternatives). While acknowledging that the string-prediction training setup differs in format from human language acquisition, they review arguments suggesting that the meaning of a significant fraction of natural language expressions is primarily determined by the role they play in a larger mental theory rather than their reference.

Studies in language acquisition show some support for this idea: learning the meanings of various classes of words relies heavily on structural linguistic information (Gleitman Reference Gleitman1990; Gleitman et al., Reference Gleitman, Cassidy, Nappa, Papafragou and Trueswell2005; Landau and Gleitman Reference Landau and Gleitman1985). This is particularly true of expressions for concepts without observable correlates such as, for instance, verbs like think or believe (see Hacquard and Lidz Reference Hacquard and Lidz2022 for a review).

Taking this view to its extreme, a system relying on relations within one modality is not necessarily meaningless, with additional modalities providing various enrichments. Reference or grounding then adds to the “conceptual role” the word plays. The signal that the learner gets from text alone is already quite rich in conceptual role information, explicit and implicit. The task of the learner is to invert from observations to mechanisms that generate these data (see Merrill et al. Reference Merrill, Warstadt and Linzen2022 for an estimation of how entailment could be learned by a text-only model under some simplifying assumptions).

This perspective, again, gives a practical turn to the question of semantics in text-only LMs: In order to know whether LMs learn to represent semantics during training and what it looks like, one has to examine the models’ internal representations and how they relate to each other.

This practical angle motivates our Element: We will take a closer look at how these models perform on semantic tasks and examine the semantic properties of their internal representations.

The result can sometimes be disappointing: Despite the often-reported impressive performance of current deep learning models, upon closer investigation, it often turns out to be mere pattern memorization or bias propagation – and the sometimes “superhuman” scores on such tasks go down dramatically when the benchmark datasets are manipulated in a relevant way. At the same time, studies of LM representations reveal rich semantic structures such as color space geometry (Abdou et al., Reference Abdou, Kulmizev and Hershcovich2021) or the relative geographical positions of major cities (Gatti et al., Reference Gatti, Marelli, Vecchi and Rinaldi2022); some work shows indications that contextual representations of the latest text-only LMs implicitly encode models of entities and situations evolving as text progresses (Li, Nye, & Andreas Reference Li, Nye and Andreas2021; but see Kim and Schuster Reference Kim and Schuster2023 for a critique of these results).

In this overview, we would like to give the reader a balanced picture of challenges and successes in this domain and suggest possible future directions.

We are now moving on to the main part of the Element. We went over both technical and theoretical background for the upcoming discussion. We introduced vector representations for words and larger linguistic sequences and discussed how such representations are usually obtained from deep neural network models, often ones based on the Transformer architecture (Devlin et al., Reference Devlin, Chang, Lee, Toutanova, Burstein, Doran and Solorio2019; Radford et al., Reference Radford, Wu and Child2019). We introduced the main notions and intuitions in theoretical semantics: truth conditions and semantic inference. Finally, we highlighted the tension between the text-only setup common in deep learning language modeling and the architecture of most common theoretical semantics frameworks that involve a separate interpretation space. This tension is the driving point for the rest of the discussion.

We will start the main part with an overview of reasoning and inference in deep learning models (Section 2), then we turn to compositionality (Section 3) and language grounding (Section 4).

2 Textual Inference

Studying semantic relations between sentences has long been the focus of linguistic semantics. When modeling sentence meaning, regardless of the choice between the referential and inferential views (see Section 1.2), one of the central goals is to license as many (correct) semantic relations between sentences as possible. For example, a semantic analysis of the sentences A cat is sitting on a chair and A cat is on a chair is inadequate if it does not license the entailment of the latter from the former.

The task of detecting these relations between sentences in textual form (textual inference) has been the most common way in NLP to directly evaluate to what degree an LM captures sentence meaning. This brings us to a popular NLP task that was originally referred to as recognizing textual entailment (RTE) and is currently known as natural language inference (NLI). Following Dagan et al. (Reference Dagan, Roth, Sammons and Zanzotto2013):

Textual entailment is defined as a directional relationship between pairs of text expressions, denoted by T (the entailing Text) and H (the entailed Hypothesis). We say that T entails H if humans reading T would typically infer that H is most likely true.

A Text–Hypothesis pair annotated with a ground truth inference label is called a textual inference problem or an RTE/NLI problem. The terms Premise and Conclusion are also commonly used instead of Text and Hypothesis, respectively. Originally, Dagan, Glickman, and Magnini (Reference Dagan, Glickman and Magnini2006) proposed an NLP task on textual inference as a shared challenge called RTE.Footnote 6 They created a textual inference dataset – that is, a collection of textual inference problems, where they labeled the problems with entailment (⇒) and non-entailment (⇏) labels. Examples (1)–(3) are instances of RTE problems.

  (1) About two weeks before the trial started, I was in Shapiro’s office in Century City.

      ⇒ Shapiro works in Century City.

  (2) Green cards are becoming more difficult to obtain.

      ⇒ Green card is now difficult to receive.

  (3) The town is also home to the Dalai Lama and to more than 10,000 Tibetans living in exile.

      ⇏ The Dalai Lama has been living in exile since 10,000.

Although the RTE name and inference labels involve the term entailment, the notion of entailment found in the initial and subsequent inference datasets is a softer version of the logical entailment. This softness corresponds to the terms humans reading, typically, and most likely as highlighted in the definition just provided. For example, while (1) is considered textual entailment, strictly speaking, one can think of a possible scenario where a person has an office in Century City but does not work there. Another scenario that makes (1) non-entailment could be one in which Shapiro currently does not work in Century City but used to work there. However, during the creation of the RTE dataset, the authors deliberately gave little importance to tense in order to prevent a large number of problems from being labeled as non-entailment. In a similar spirit, (2) is an example of textual entailment, but becoming more difficult does not necessarily lead to being difficult.
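On the modeling side, an off-the-shelf NLI classifier can be queried on such problems directly; the sketch below uses one publicly released MNLI-tuned checkpoint (its predictions are not guaranteed to be correct, and the label set is read off the model’s configuration).

```python
# A minimal sketch of querying a publicly released NLI classifier on a
# Text-Hypothesis pair like (1). "roberta-large-mnli" is one available
# MNLI-tuned checkpoint; any similar checkpoint would work the same way.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = ("About two weeks before the trial started, "
           "I was in Shapiro's office in Century City.")
hypothesis = "Shapiro works in Century City."

inputs = tok(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1)[0]
for label, p in zip(model.config.id2label.values(), probs):
    print(label, round(float(p), 3))   # e.g., entailment / neutral / contradiction
```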

Due to the mismatch between textual and logical entailments, Zaenen, Karttunen, and Crouch (Reference Zaenen, Karttunen and Crouch2005) suggested using textual inference instead of textual entailment. Under the umbrella term textual inference they distinguish logical entailment from inferences triggered by conventional or conversational implicatures. We find their suggestion appealing and use textual inference instead of RTE and NLI throughout the Element.

Textual inference is an integral part of natural language understanding (NLU). Condoravdi et al. (Reference Condoravdi, Crouch and de Paiva2003) argue that detection of entailment and contradiction relations between texts is a minimal, necessary criterion for evaluating NLP systems on text understanding. A couple of inference evaluation datasets are a part of the standard NLU benchmarks GLUE (Wang, Singh, et al., Reference Wang, Singh and Michael2019) and SuperGLUE (Wang, Pruksachatkun, et al., Reference Wang, Pruksachatkun and Nangia2019).

Historically, textual inference was thought of as a potential module for downstream NLP applications such as question answering (QA), information retrieval (IR), information extraction (IE), (multi-)document summarization, etcetera. For example, in QA, a candidate answer should be entailed by a source text; in IR, a textual inference system can be used to validate a retrieved document based on its passage entailing the query phrase; in IE, the system should find a passage that entails entities in a target relation; a summary should be entailed by the source document(s). Despite these initial goals and expectations, the textual inference task became a stand-alone task over time. Due to the high performance of end-to-end models and the relative simplicity of their development, it is not common to use textual inference systems as a component of other systems.

In the next subsection, we will touch on several subtle and peculiar characteristics of the textual inference task as practiced in NLP. Then we will zoom in on a selection of semantic phenomena and corresponding datasets.

2.1 Things to Know about the Textual Inference Task

The inference capacity of LMs is evaluated on textual inference datasets. Let us discuss how such datasets are constructed and what kind of inference problems can be found in them. We will highlight several nonobvious properties of textual inference datasets.

2.1.1 Collecting Text–Hypothesis Pairs

Many textual inference datasets are created in two major steps: first, collecting Text–Hypothesis pairs, and second, annotating them with inference labels. The methods of collecting Text–Hypothesis pairs can roughly be divided into human-elicited, semiautomated, and fully automated methods.

The human-elicited method involves human annotators in creating an inference problem – for example, creating an entirely new problem, pairing existing sentences, or providing a Hypothesis given a Text or vice versa. Initially, inference problems in the series of RTE challenges were human-elicited by expert annotators and the organizers of the challenges. Due to the involvement of experts, the collection process was expensive and each iteration of the challenge prepared only 1,000 to 1,600 new inference problems. A step forward in the human-elicited collection came from Bowman et al. (Reference Bowman, Angeli, Potts and Manning2015), who created the Stanford NLI (SNLI) dataset, a collection of circa 570,000 sentence pairs. Hypotheses were written by crowd workers given a premise sentence and a target inference label. The size of SNLI has triggered a surge of deep learning models for textual inference. A collection protocol similar to SNLI was used to create another large inference dataset, multi-genre NLI (MNLI; Williams et al., Reference Williams, Nangia and Bowman2018).

Semiautomated collection methods partially automatize the generation of sentences or automatically transform existing sentences. Manual work usually involves verification of Text–Hypothesis pairs for fluency or carrying out certain tasks that are difficult to reliably automatize. Marelli et al. (Reference Marelli, Menini and Baroni2014) were the first to semiautomatically collect about 10,000 sentence pairs for the Sentences Involving Compositional Knowledge (SICK) dataset.

There are three main groups of approaches when collecting inference pairs with a fully automated method. The first approach takes advantage of already existing textual inference datasets and automatically modifies the problems. For example, Naik et al. (Reference Naik, Ravichander and Sadeh2018) modify MNLI data to create a stress test on spelling errors and various distractions (e.g., a high word overlap and length mismatch between a premise and a hypothesis). The second approach, as demonstrated by White et al. (Reference White, Rastogi, Duh and Van Durme2017), recasts datasets for other NLP tasks as inference datasets. The third approach automatically generates Text–Hypothesis pairs. This is usually done with the help of manually predesigned templates or a formal grammar such as a regular or context-free grammar. To automatically generate inference problems, Geiger et al. (Reference Geiger, Cases, Karttunen and Potts2018) use the regular grammar in (4) to construct sentences. Optional elements are marked with ?, Q ranges over {every, not every, some, no}, and the other grammatical category variables range over predefined sets of words.

  (4) Q Adj? N (does not)? Adv? V Q Adj? N
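A minimal sketch of how sentences can be produced from a template like (4) is given below; the tiny lexicon and the pairing of two randomly generated sentences are illustrative assumptions, not the generation code of the cited work.

```python
import random

# Toy lexicon for the template (4): Q Adj? N (does not)? Adv? V Q Adj? N.
# Optional slots are encoded as a choice between a filler and the empty string.
Q = ["every", "not every", "some", "no"]
ADJ = ["", "black "]
N = ["dog", "cat"]
ADV = ["", "quietly "]
VERBS = {"": ["sees", "likes"], "does not ": ["see", "like"]}  # verb form agrees with negation

def sentence(rng):
    neg = rng.choice(["", "does not "])
    return (f"{rng.choice(Q)} {rng.choice(ADJ)}{rng.choice(N)} "
            f"{neg}{rng.choice(ADV)}{rng.choice(VERBS[neg])} "
            f"{rng.choice(Q)} {rng.choice(ADJ)}{rng.choice(N)}")

rng = random.Random(0)
text, hypothesis = sentence(rng), sentence(rng)
print(text)
print(hypothesis)
# The gold inference label for such a pair would then be computed
# automatically from the semantics of the quantifiers and negation.
```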

All these methods are actively used when collecting Text–Hypothesis pairs for new textual inference datasets. When pairs are human-written, one needs to be aware of potential biases that human annotators might introduce (see Section 2.1.4 for more details). Inference datasets generated fully automatically usually focus on a particular set of semantic phenomena and tend to have sentences with less structural or lexical diversity. Finally, semiautomated methods try to combine the best of both worlds to produce Text–Hypothesis pairs with diversity and at scale.

2.1.2 Annotating Inferences

Annotation of textual inferences means labeling Text–Hypothesis pairs with a ground truth inference label. Methods of annotating inferences can be roughly divided into three categories.

For human annotation, usually, crowd workers rather than experts or trained annotators are employed to produce judgments. The gold label of an inference problem is commonly set to the label that receives a majority of votes from annotators. For example, a Text–Hypothesis pair in the SICK dataset is labeled as entailment if at least three out of five crowdsourced judgments are in agreement. When annotating a pair, sometimes one of the inference judgments comes from the author of the pair. For instance, this is the case for SNLI, MNLI, and the datasets of RTE challenges.

Automatic annotation of inferences is typically used when inference pairs are fully automatically generated (see Section 2.1.1). When modifying or recasting an existing dataset, an automatic annotation method can simply map original labels to inference labels. For example, if an original inference problem is entailment, a new problem that is obtained by adding an informative and consistent conjunct to a Hypothesis will have a neutral inference label.

The third approach is to use human annotations for a task simpler than inference. For example, the Monotonicity Entailment Dataset (MED, Yanaka et al. Reference Yanaka, Mineshima and Bekki2019a) asks crowd workers to make certain phrases in a sentence more specific – for example, make spectator in every spectator bought a ticket more specific with female spectator. With the help of the human-elicited phrasal inference and the monotonicity calculus (see Section 2.2.2), one can automatically detect that the original sentence entails the new sentence obtained with the phrase replacement.

It is important to keep in mind that not all gold labels are gold (see Section 2.1.5). Human annotation might introduce erroneous gold labels due to insufficient annotation guidelines or ambiguity. For example, a substantial number of gold labels in the SICK dataset are inconsistently applied to the inference problems (Kalouli et al., Reference Kalouli, Hu and Webb2023; Kalouli Real, & de Paiva, Reference Kalouli, Real and de Paiva2017; Marelli et al., Reference Marelli, Menini and Baroni2014). The reason behind this is that annotators interpreted indefinite noun phrases in different ways: A boy is running and A boy is not running can be judged as contradiction or neutral depending on coreference or lack thereof (see Section 2.1.3 for more discussion).

2.1.3 Two Interpretations of Contradiction

The contradiction label was introduced at the third RTE challenge (Giampiccolo et al., Reference Giampiccolo, Magnini, Dagan and Dolan2007) as part of a pilot three-way classification of textual inference. Unlike the two-way classification in previous RTE datasets, the three-way classification splits non-entailment inferences into contradiction and neutral (##). The contradiction label was defined in a similarly vague fashion to the entailment label. In particular, according to de Marneffe, Rafferty, and Manning (Reference de Marneffe, Rafferty and Manning2008), contradiction occurs when a Text and a Hypothesis are extremely unlikely to be true simultaneously. For the contradiction label, the annotation guidelines instructed that compatible referring expressions had the same reference in the absence of clear countervailing evidence.Footnote 7 This definition of contradiction worked well for the RTE challenge datasets mainly because the datasets kept Text–Hypothesis pairs grounded in natural data, which means that the pairs contained longer Texts and more definite NPs and named entities.

Annotation of the SICK dataset showed that if the co-reference of compatible referring expressions is not explicitly instructed for caption-like sentence pairs, crowd workers provide mixed annotations for the inference problems involving indefinite NPs and negation. For example, the SICK inference problems in (5) and (6) have the exact same structure from an inference perspective, but (5) gets the neutral gold label while (6) gets contradiction:

  (5) A couple is not looking at a map. ## A couple is looking at a map.

  (6) A soccer ball is not rolling into a goal net.
      A soccer ball is rolling into a goal net.

Many Text–Hypothesis pairs are not in a contradiction relation if no co-reference of entities or events is assumed. Recall the example where A boy is running and A boy is not running do not form a contradiction pair unless a boy in both sentences refers to the same entity. If event co-reference is adopted, a pair like A cat is sleeping and A dog is sleeping would become a contradiction: The only participant of the sleeping event cannot be both a cat and a dog. Even worse, event co-reference would make A cat is sleeping and A dog is running a contradiction due to the incompatibility of sleeping and running events. Such a notion of contradiction is highly odd from a purely logical perspective.

To instruct crowd workers about annotating coreference-enforced contradiction, the authors of SNLI grounded sentences in photos without showing actual photos to the crowd workers. In particular, the crowd workers were asked whether a Hypothesis was definitely a true, possibly a true, or definitely a false description of a photo whose caption was the Text.Footnote 8 Such a guideline prevents the co-reference issue the SICK dataset suffers from, but, on the other hand, it introduces somewhat odd contradiction problems that involve unrelated sentences, as illustrated by the SNLI problem in (7). It is important to note that problems like (7) are labeled as neutral in SICK. Hence models should not be trained on SICK and evaluated on SNLI/MNLI or vice versa, as these datasets use different interpretations of contradiction.Footnote 9

  (7) Dog carry [sic] leash in mouth runs through marsh.
      A ship hitting an iceberg.

The majority of the existing inference datasets adopt the co-reference-enforced notion of contradiction. Several inference datasets are annotated with binary labels, entailment and non-entailment, and avoid opting for one of the contradiction notions.

2.1.4 Biases in Textual Inference

The main idea behind collecting textual inference datasets is to teach an NLP system regularities governing NLI or to evaluate its semantic capacity. However, high system performance on a particular inference dataset does not necessarily mean that the system has learned the underlying inference regularities. It might easily be the case that the system learned accidentally introduced regularities behind the gold labels in the dataset. For example, a high word overlap between a Text and a Hypothesis is often a good indicator of the entailment relation, but it has little to do with the underlying rationale of inferences. Learning such accidental regularities might be easily overlooked in deep learning as models employ representations and transformations that are opaque for humans. Next we present two biases in textual inference datasets that further encourage models to learn accidental regularities about inferences.

The hypothesis-only bias is a dataset bias that allows models to achieve relatively high accuracy on the dataset while the models take only a Hypothesis as an input, completely ignoring the Text part. The hypothesis-only bias for the SNLI and MNLI datasets was concurrently reported by several works (Gururangan et al., Reference Gururangan, Swayamdipta and Levy2018; Poliak et al., Reference Poliak, Naradowsky and Haldar2018; Tsuchiya, Reference Tsuchiya2018). They showed that some neural models can correctly classify 63–69 percent of SNLI problems by looking only at a Hypothesis. This accuracy is twice as high as the majority baseline (34 percent).Footnote 10 For MNLI, the hypothesis-only performance range is 52–53 percent compared to 35 percent of the majority baseline. The root of the hypothesis-only bias lies in the data collection method of SNLI and MNLI. For example, in the test part of SNLI, 90 percent of inference problems with a word form of sleep in a Hypothesis are labeled as contradiction. This reflects the tactics crowd workers used to quickly provide a Hypothesis sentence per inference label. Using several neural models as examples, Gururangan et al. (Reference Gururangan, Swayamdipta and Levy2018) showed that after training on the datasets, the hypothesis-only bias gets projected into the predictions of the models.
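To see how a hypothesis-only model can exploit such a bias, consider the following minimal sketch; the handful of training examples is made up for illustration, and in practice one would train on the hypothesis field of a full dataset such as SNLI.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy hypothesis-only training data: the premise/Text is never used.
hypotheses = ["A man is sleeping.", "A woman is outdoors.",
              "Nobody is eating.", "A dog is playing fetch."]
labels = ["contradiction", "entailment", "contradiction", "neutral"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(hypotheses, labels)

# A classifier like this can only latch onto lexical cues in the hypothesis
# (e.g., 'sleeping' or 'nobody' as contradiction cues), mirroring the bias.
print(clf.predict(["A child is sleeping."]))
```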

Another common bias associated with inference datasets and learned by models is a high word overlap between a Text and a Hypothesis for entailment problems. The Heuristic Analysis for NLI Systems (HANS) dataset by McCoy, Pavlick, and Linzen (Reference McCoy, Pavlick and Linzen2019) intends to evaluate a model on the extent it uses a word-overlap heuristic for entailment classification. The dataset covers three types of heuristics depending on whether a Hypothesis is a subset, subsequence, or constituent of a Text. The entailment and non-entailment inference problems for each heuristic are given in (8). Note that every word in the shared Hypothesis sentence occurs in the Text sentences.

  (8) a. Subset heuristic
         (i) The cat with a collar slept. ⇒ The cat slept.
         (ii) The cat saw that the dog slept. ⇏ The cat slept.
      b. Subsequence heuristic
         (i) The dog and the cat slept. ⇒ The cat slept.
         (ii) The dog near the cat slept. ⇏ The cat slept.
      c. Constituent heuristic
         (i) The dog saw the cat slept. ⇒ The cat slept.
         (ii) If the cat slept, the dog was away. ⇏ The cat slept.

Several works (He, Wang, & Zhang, Reference He, Wang and Zhang2020; McCoy, Pavlick, & Linzen, Reference McCoy, Pavlick and Linzen2019) showed that when neural models fine-tuned on large inference datasets are evaluated on HANS, the accuracy on (i)-style problems is much higher than on (ii). For instance, the accuracy gap is greater than 70 percent for BERT fine-tuned on MNLI. This indicates that the neural models have difficulty distinguishing high lexical overlap from entailment.
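The word-overlap heuristic that HANS is designed to expose can be written down in a few lines; the sketch below is a deliberately naive baseline, not a model from the cited works.

```python
import re

def words(sentence):
    """Lowercased word set of a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def overlap_heuristic(text, hypothesis):
    """Predict entailment whenever every hypothesis word also occurs in the text."""
    return "entailment" if words(hypothesis) <= words(text) else "non-entailment"

# The heuristic gets the (i)-style HANS problems right ...
print(overlap_heuristic("The cat with a collar slept.", "The cat slept."))
# ... but it also wrongly predicts entailment for (ii)-style problems:
print(overlap_heuristic("If the cat slept, the dog was away.", "The cat slept."))
```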

Besides the two mentioned biases, there are also other dataset biases. A reversed word overlap bias is a tendency to label a problem with a low word overlap as non-entailment (Rajaee, Yaghoobzadeh, & Pilehvar, Reference Rajaee, Yaghoobzadeh and Pilehvar2022). Yet another bias is a negation bias, which is a preference to classify a problem as contradiction if it contains a negation word. The negation bias exists in SICK, SNLI, and MNLI (Gururangan et al., Reference Gururangan, Swayamdipta and Levy2018; Lai & Hockenmaier, Reference Lai and Hockenmaier2014). There is an entire line of research on textual inference that attempts to debias inference models.

2.1.5 Should the Textual Inference Task Be Categorical?

Textual inference is modeled as a two- or three-way classification task. But taking into account the soft nature of the entailment and contradiction notions, is a categorical classification suitable for textual inference? There have been at least two proposals for an alternative modeling of the textual inference task. One proposal models textual inference as a subjective probability of entailment while another one uses the distribution of human judgments over the inference labels instead of a single inference label.

Chen et al. (Reference Chen, Jiang and Poliak2020) argue for uncertain NLI (UNLI) where a Text–Hypothesis pair is estimated with a probability score rather than a single inference label. The probability score represents an average of subjective probabilities elicited from crowd workers.Footnote 11 An inference problem that gets the neutral gold label in SNLI but 0.84 entailment probability in UNLI is given in (9):

  (9) A man is singing into a microphone.
      (0.84) A man is performing on stage.

Nie, Zhou, and Bansal (Reference Nie, Zhou and Bansal2020) modeled the textual inference task as predicting a probability distribution over the inference labels. Following a proposal by Pavlick and Kwiatkowski (Reference Pavlick and Kwiatkowski2019), they created the ChaosNLI dataset where gold standard distributions per inference problem were derived from 100 crowdsourced judgments. (10) illustrates an SNLI problem that originally had the entailment gold label obtained as a majority label from three entailment and two neutral judgments. However, after re-annotating the problem as a part of ChaosNLI, it gets contradiction as the most probable label in the label distribution.

  (10) The lady wearing a red coat is giving a speech.
       [entailment: 0.40, neutral: 0.01, contradiction: 0.59] Woman is the center of attention.

In total, 25 percent of the SNLI problems that were reused in ChaosNLI received a majority inference label different from the original SNLI label. This indicates that inference gold labels defined as a majority among several judgments are difficult to replicate, and it raises questions about the adequacy of the gold standard inference labels and the categorical nature of the textual inference task.
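The contrast between a single majority label and a label distribution is easy to make concrete; the judgment counts below simply reuse the proportions from (10) and are otherwise illustrative.

```python
from collections import Counter

# 100 hypothetical crowd judgments for one Text-Hypothesis pair, as in ChaosNLI.
judgments = Counter(entailment=40, neutral=1, contradiction=59)
total = sum(judgments.values())

majority_label = judgments.most_common(1)[0][0]
distribution = {label: count / total for label, count in judgments.items()}

print(majority_label)  # 'contradiction'
print(distribution)    # {'entailment': 0.4, 'neutral': 0.01, 'contradiction': 0.59}
# A single gold label keeps only the first line of information and discards
# the disagreement encoded in the distribution.
```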

In this subsection, we covered several crucial characteristics of the textual inference task. We summarized common methods of creating datasets, namely collecting and annotating Text–Hypothesis pairs. During the dataset creation, one can control the interpretation of the contradiction label via annotation guidelines: whether to opt for the co-reference-enforced contradiction or for a more logical notion that largely narrows down the set of contradiction problems. However, it is not easy to keep inference datasets free from biases, especially when the sentences are collected via crowdsourcing. Notable biases of inference datasets are the hypothesis-only bias and the high word overlap for entailment problems. Finally, there are inference problems for which a single inference label is not representative. While there have been at least two suggestions for abandoning a single inference label in textual inference, most of the inference datasets are still created as a two- or three-way classification task.

2.2 Phenomena-Specific Textual Inference

In this section, we describe several textual inference datasets that were created with clear linguistic and semantic phenomena in mind; in other words, they contain inference problems that require correct treatment of certain semantically heavy words or semantically peculiar constructions. Such inference datasets are usually inspired by studies in formal semantics. The list of the datasets is given in Table 2 at the end of the section. Additionally, we mention the results of LMs on these datasets as reported by the original works.Footnote 12 Our focus on semantic phenomena-driven inference datasets distinguishes this section from other works that also summarize existing inference datasets (Chatzikyriakidis et al., Reference Chatzikyriakidis, Cooper, Dobnik and Larsson2017; Poliak, Reference Poliak2020; Storks, Gao, & Chai, Reference Storks, Gao and Chai2019).

Table 2 A list of phenomena-specific textual inference datasets discussed in the current section. In the Size column, t × p stands for generating p inference problems from each of t templates. In the Train part column, “Yes” means that a part of the dataset was used for training in the original experiments, and “Dev” means that the dataset has a designated development set. In the Lab. num. column, nσ stands for multi-labeling with n labels. Abbreviations: trained annotators (TA), crowd workers (CW), human-elicited (HE), and automatically/manually edited existing text (AE/ME).

| Dataset | Size | Train part | Pair coll. | Lab. anno. | Lab. num. | Phenomena |
|---|---|---|---|---|---|---|
| FraCaS (Cooper et al., 1996) | 334 | No | HE | TA | 3 | Quantifiers, plurals, anaphora, ellipsis, adjectives, comparatives, temporal reference, verbs, attitude |
| MED (Yanaka et al., 2019a) | 5,382 | No | ME | CW | 2 | Monotonicity reasoning |
| Semantic fragments (Richardson et al., 2020) | 40,000 | Yes | Auto | Auto | 3 | Negation, Boolean connectives, quantifiers, counting, comparatives, monotonicity |
| negNLI (Hossain et al., 2020) | 4,500 | No | ME | TA | 3 | Verb-level negation |
| Nan-NLI (T. H. Truong et al., 2022) | 258 | No | HE | TA | 3 | Diverse types of negation: verbal and nonverbal, clausal and sub-clausal, analytic and synthetic |
| IMPPRES (Jeretic et al., 2020) | 25,500 | No | Auto | Auto | 3 | Scalar implicature (six subparts) and presuppositions (eight subparts) |
| NOPE (Parrish et al., 2021) | 2,732 | No | HE | CW | 3 | Context-sensitivity of ten different types of presupposition triggers |
| TEA (Kober, Bijl de Vroe, & Steedman, 2019) | 11,138 | No | HE | TA | 2 | Tense and aspect: all combinations of present/past, simple/progressive/perfect and modal future, covering perfect and progressive aspect |
| HANS (McCoy, Pavlick, & Linzen, 2019) | 30 × 1,000 | No | Auto | Auto | 2 | Overlap heuristics: lexical, subsequence, sub-constituent |
| EQUATE (Ravichander et al., 2019) | 9,606 | No | AE, Auto | TA, CW, Auto | 2/3 | Quantitative reasoning (five subsets): verbal reasoning with quantities, basic arithmetic computation, inferences with approximations, and range comparisons |
| ConjNLI (Saha, Nie, & Bansal, 2020) | 1,623 | Dev | AE | TA | 3 | (Non-)Boolean use of connectives (e.g., and, or, but, nor) with quantifiers and negation |
| SpaceNLI (Abzianidze, Zwarts, & Winter, 2023) | 160 × 200 | No | Auto | Auto | 3 | Diverse types of spatial expressions: directional, argument orientation, projective, non-projective |
| AmbiEnt (A. Liu et al., 2023) | 1,645 | Dev | HE, Auto | TA | 3σ | Ambiguity: sentences involving a variety of lexical, syntactic, and pragmatic ambiguities |
2.2.1 The FraCaS Test Suite

We start with the FraCaS test suite (Cooper et al., Reference Cooper, Crouch and Eijck1996) as it covers several semantic phenomena that have been intensively studied in the semantics literature. The FraCaS test suite was originally created as a yes/no/unknown-QA test suite for NLP systems.Footnote 13 Only later was it converted into and used as a textual inference test suite by MacCartney and Manning (Reference MacCartney and Manning2007). It contains only 334 well-formed inference problems but has nine focused sections covering generalized quantifiers (74), plurals (33), nominal anaphora (28), ellipsis (55), adjectives (22), comparatives (31), temporal reference (70), verbs (8), and attitudes (13). Background knowledge is explicitly encoded in the FraCaS inference problems as premises (e.g., Every Swede is a Scandinavian), and (multistep) logic-based reasoning is the only challenge built into the dataset.

The FraCaS inference dataset has been rarely used for evaluating LMs due to its small size and imbalance of inference labels (e.g., entailment covers 52 percent of the problems while contradiction covers only 9 percent).

As was already mentioned, the size and label imbalance make FraCaS a nonrepresentative evaluation set. Nevertheless, its treatment of semantic phenomena and clear structure have motivated new ways of creating inference problems and datasets. FraCaS is mainly used for testing logic-based approaches (Abzianidze, Reference Abzianidze2016; Bernardy & Chatzikyriakidis, Reference Bernardy and Chatzikyriakidis2021; Hu, Chen, & Moss, Reference Hu, Chen and Moss2019).

2.2.2 Monotonicity

Reasoning with monotonicity is the most common phenomenon on which LMs have been evaluated. This is because monotonicity reasoning is well studied from a formal semantics point of view (Icard & Moss, Reference Icard and Moss2014; Van Benthem, Reference Van Benthem1986) and captures inferences that can be characterized by phrase substitutions directly in surface forms, without translations into an intermediate formal meaning representation. This facilitates the automatic generation of inference problems on monotonicity reasoning.

Not all phrase substitutions in a sentence result in a new sentence that is entailed by the original one. With the help of monotonicity reasoning, we can identify certain entailment-preserving substitutions. This is done by modeling the monotonicity properties of lexical units where quantifiers get the spotlight. Let us interpret the quantifier most as a binary function from unary predicates to {0,1} (for false and true, respectively), where it is non-monotone in its first argument position and upward monotone in its second argument position. This can be denoted as most(x,y). Since most is upward monotone in y’s position, inserting more general predicates in “most dogs y” should not decrease its truth value: “most dogs are running” ⇒ “most dogs are moving”, where ⇒ can be interpreted as entailment. In the case of the non-monotone position of x, we cannot predict an order between the values of “most x are running” when two comparable arguments (e.g., dog and pet) are inserted in it: “most dogs are running” does not entail “most pets are running” and vice versa.
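The monotonicity profile of most can be checked directly in a toy extensional model; the sets of entities below are made-up examples.

```python
def most(restrictor, scope):
    """True iff more than half of the restrictor set is also in the scope set."""
    return len(restrictor & scope) > len(restrictor) / 2

dogs = {"rex", "fido", "lassie"}
pets = dogs | {"tom_the_cat"}        # pets is a superset of dogs
running = {"rex", "fido"}
moving = running | {"tom_the_cat"}   # moving is a superset of running

# Upward monotone second argument: enlarging the scope preserves truth,
# so "most dogs are running" guarantees "most dogs are moving".
assert most(dogs, running) and most(dogs, moving)

# Non-monotone first argument: enlarging the restrictor gives no guarantee.
print(most(dogs, running))  # True
print(most(pets, running))  # False in this model, but could go either way
```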

It gets more complicated when dealing with nested scopes of monotone operators. Let us analyze (11) as (11ʹ), where each function is marked with monotonicity properties.Footnote 14 Following (11ʹ), each word in (11) is colored based on its polarity – that is, the monotonicity property of the position that is a result of interference of monotone functions. Green (red) stands for an upward (downward, respectively) monotone position. When green (red) words are replaced with synonymous or more general (more specific) concepts, the resulting sentence is entailed by the initial one, as demonstrated by (11)⇒(12); the results of the replacements in (12) are underlined.

  (11) Every person without a mustache who consumed alcohol tasted most snacks.

  (11′) Every↓↑(who↑↑(without↑↓(person, a mustache), consumed(alcohol)), tasted(most(snacks)))

  (12) Every man without facial hair who drank whiskey tried some snacks.

Textual inference datasets on monotonicity reasoning are usually (semi) automatically generated. The generation process goes as follows: (a) Polarity marking automatically detects the polarity of sub-phrases in a sentence by exploiting a syntactic structure and monotone operators in the sentence, (b) Phrase substitution substitutes polarity-marked sub-phrases with more general or specific phrases, and (c) Entailment labeling induces entailment relations based on the polarity of the substituted sub-phrases and the specificity order between substituted and substituting sub-phrases. Vanilla monotonicity reasoning cannot capture contradiction relations, hence most monotonicity-based inference datasets cover only entailment and non-entailment labels. For the extension of monotonicity reasoning with an exclusion relation, see MacCartney and Manning (Reference MacCartney and Manning2009) and Icard (Reference Icard2012).
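A minimal sketch of the substitution step of such a generation pipeline is given below; the hard-coded polarity marking and the tiny hypernym lexicon are illustrative assumptions, whereas real pipelines derive them from syntactic parses and lexical resources such as WordNet.

```python
# Toy lexicon mapping words to more general concepts (hypernyms).
HYPERNYMS = {"dogs": "animals", "ran": "moved"}

def generate_entailment(sentence, polarity):
    """Substitute one word in an upward monotone ('+') position with its hypernym."""
    tokens = sentence.split()
    for i, (token, pol) in enumerate(zip(tokens, polarity)):
        if pol == "+" and token in HYPERNYMS:
            hypothesis = tokens[:i] + [HYPERNYMS[token]] + tokens[i + 1:]
            # Replacing a word in an upward monotone position with a more
            # general one yields a sentence entailed by the original.
            return " ".join(hypothesis), "entailment"
    return None

# "some" is upward monotone in both argument positions, so all slots are '+'.
print(generate_entailment("some dogs ran", ["+", "+", "+"]))
# ('some animals ran', 'entailment')
```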

One of the first monotonicity-based inference datasets, the Monotonicity Entailment Dataset (MED), was semiautomatically created by Yanaka et al. (Reference Yanaka, Mineshima and Bekki2019a). The final dataset contains more than 5,000 problems. While the problems are evenly balanced between entailment and non-entailment classes, underlying monotonicity phenomena are unevenly distributed: upward (34 percent), downward (61 percent), non (5 percent). The MED dataset is only intended for evaluation and comes with no training part.Footnote 15

Yanaka et al. (Reference Yanaka, Mineshima and Bekki2019a) evaluated top textual inference models at that time, including BERT, and found that the models underperform (below the majority baseline) on downward-monotone problems when trained on standard training sets, SNLI and MNLI. When augmenting a training set with the HELP dataset, the experiments showed that increasing the proportion of upward (downward) monotonicity problems in the training set hurts the models’ ability to learn downward (respectively, upward) monotonicity reasoning. Chen (Reference Chen2021) reports the highest score by an LM on MED: a model with a tree structure encoder (Zhou, Liu, & Pan, Reference Zhou, Liu and Pan2016) and self-attention (Lin et al., Reference Lin, Feng and dos Santos2017) obtains an accuracy of 75.7 percent. A substantial improvement (93.4 percent) is reported by Chen, Gao, and Moss (Reference Chen, Gao and Moss2021) with a hybrid system that combines a monotonicity reasoning system with lexical databases and LLMs. However, such hybrid systems have an obvious advantage over purely neural models as they can faithfully mimic the algorithm underlying the creation of the evaluation data.

In contrast to the MED dataset, the monotonicity part of Semantic Fragments (hereafter referred to as monFrag) by Richardson et al. (Reference Richardson, Hu, Moss and Sabharwal2020) is fully automatically created: The sentence pairs are generated with a regular grammar using a restricted vocabulary of size 119 and following the polarity markings induced from monotone operators. Such controlled generation of the pairs backed up with polarity computation of Hu et al. (Reference Hu, Chen and Moss2019) guarantees correct assignments of three-way inference labels to the generated problems. monFrag contains 10K problems equally distributed over three labels and divided into simple and hard parts based on the number of relative clauses in sentences and the vocabulary size of quantifiers per part. A sample problem from the dataset is given in (13).

  (13) All black mammals saw exactly 5 stallions who danced
       Some black rabbits did not see exactly 5 stallions who danced

As a result of their probing experiments, Richardson et al. (Reference Richardson, Hu, Moss and Sabharwal2020) found that LMs poorly generalize on monFrag; namely, one of the best results is obtained by BERT: a 62.8 percent accuracy score when trained on SNLI and MNLI. They also show that BERT predicts monFrag with 97.8 percent accuracy when fine-tuned on 2,000 similar monotonicity problems while its score decreases only by 1.3 percent on the MNLI development set.

Monotonicity reasoning represents a substantial challenge for LMs when it comes to distinguishing the reasoning processes driven by downward and upward monotone operators. While within the limited vocabulary (of 100 words) LMs overall learn the monotonicity reasoning in monFrag, generalizing monotonicity reasoning for a larger vocabulary remains a difficult problem.

For more details related to monotonicity and LMs, we refer readers to the following works: Yanaka et al. (Reference Yanaka, Mineshima, Bekki and Inui2020) show that LMs have difficulty generalizing monotonicity reasoning systematically when syntactic structures in the training and test sets differ; Geiger, Richardson, and Potts (Reference Geiger, Richardson and Potts2020) demonstrate that BERT partially mirrors the causal dynamics of the algorithm that models a fragment of monotonicity reasoning restricted to negation and lexical entailment; and Geiger et al. (Reference Geiger, Cases, Karttunen and Potts2018) emphasize the importance of alignment for reasoning with monotone quantifiers.

2.2.3 Negation

Understanding and processing negation is a challenging task for NLP systems, including those based on deep neural networks. The experiments by Kassner and Schütze (Reference Kassner and Schütze2020) and Ettinger (Reference Ettinger2020) showed that when using BERT as an LM to predict a word in a sentence and in its negated version, BERT shows little to no sensitivity to the presence of negation. Additionally, Ribeiro et al. (Reference Ribeiro, Wu, Guestrin and Singh2020) demonstrated how inserting negation can mislead prominent commercial models for sentiment analysis.

Negation is present as a part of the challenge in most of the monotonicity reasoning-based inference datasets since it is one of the main sources of downward monotone operators. However, the complementing nature of negation is not fully captured by vanilla monotonicity reasoning. There are also synthetic challenge test sets (Richardson et al., Reference Richardson, Hu, Moss and Sabharwal2020) and adversarial/stress sets (Naik et al., Reference Naik, Ravichander and Sadeh2018) that focus on negation, but their coverage and the naturalness of sentences are rather low. Instead, we will discuss the textual inference dataset from Hossain et al. (Reference Hossain, Kovatchev and Dutta2020), hereafter referred to as negNLI, which is a manually created and labeled dataset of size 4,500. It builds on the standard inference datasets such as RTE, SNLI, and MNLI.

The motivation behind creating negNLI was to test SOTA transformer models and their training datasets on the proper treatment of negations and their coverage, respectively. To create new inference problems, they extracted 500 Text–Hypothesis pairs per dataset (in total 1,500), added negation manually to the main verb of each sentence, and formed three new negation-involving problems: Tneg–H, T–Hneg, and Tneg–Hneg, where the subscript neg marks the sentence with the inserted negation. Hence, negNLI consists of three subparts, negRTE, negSNLI, and negMNLI, corresponding to RTE, SNLI, and MNLI, respectively.

After experimenting with transformers such as BERT, RoBERTa (Liu et al., Reference Liu, Ott and Goyal2019) and XLNet (Yang et al., Reference Yang, Dai and Yang2019), Hossain et al. (Reference Hossain, Kovatchev and Dutta2020) found that the models underperform on negNLI when trained on the standard inference datasets. The results of these experiments are negative despite the problems in the subparts of negNLI being very similar to the original inference problems, differing only in terms of inserted negation particles.

Another inference dataset on negation worth mentioning is the NaN-NLI test suite (Truong et al., Reference Truong, Otmakhova and Baldwin2022), where NaN stands for Not another Negation. It is a small curated set of 258 inference problems and is intended only for evaluation use. The distinct feature of NaN-NLI is that it covers types of negation that rarely affect the inference labels in the datasets: nonverbal (e.g., not all and not very) and sub-clausal (e.g., negating a prepositional phrase as in not for the first time). The premises in the dataset are drawn from Pullum & Huddleston (Reference Pullum and Huddleston2002). For each premise, the authors handcrafted around five hypotheses to form inference problems driven by a negation item.

In the evaluation experiments, Truong et al. (Reference Truong, Otmakhova and Baldwin2022) use two pretrained LMs: RoBERTa and negRoBERTa, a variant of RoBERTa pretrained with negation data augmentation and a negation cue masking strategy (Truong et al., Reference Truong, Baldwin, Cohn and Verspoor2022). Both models are fine-tuned on MNLI and MNLI augmented with negMNLI of Hossain et al. (Reference Hossain, Kovatchev and Dutta2020). The highest results are obtained when fine-tuning the models on the augmented data. The obtained scores of both models are comparable (ca. 62.7 percent) and represent a moderate improvement over the majority class baseline (45.3 percent).

Negative results on modeling negation are also reported by Hartmann et al. (Reference Hartmann, de Lhoneux and Hershcovich2021) when evaluating the multilingual BERT model on five languages. Unlike previous datasets, Hartmann et al. (Reference Hartmann, de Lhoneux and Hershcovich2021) structured their multilingual inference dataset in minimal pairs of inference problems. In this way, the dataset tests a model on whether it correctly recognizes the effect the presence and absence of negation have on inference labels.

Classifying textual inference problems with negation remains a challenge for LMs, stemming from the scope-taking nature of negation and its ability to flip the meaning of a phrase. The latter behavior contrasts with the general word insertion mechanism, which usually introduces additional information to the meaning (e.g., inserting adjuncts or complements).

2.2.4 Implicatures and Presuppositions

Implicatures and presuppositions are pragmatic inferences that are different from standard logical entailment. While implicatures are defeasible suggestions made by an utterance, presuppositions are assumed true by an utterance as they are essential for interpreting its meaning. Unlike entailments, presuppositions can survive even when they are embedded under questions, conditionals, and negation. For instance, (14) shows examples of a presupposition and an implicature (of the type usually called scalar implicature).

  (14) Some of John’s kids are playing outside.
       presupposes that John has kids.
       implicates that One of John’s kids is not playing outside.

Note that the same presupposition would still be available if we considered the negated version of the sentence Some of John’s kids are not playing outside or the question Are some of John’s kids playing outside?

The implicature in (14) arises because an alternative to some – namely all – could have been used, but it was not. Pragmatic reasoning about why this alternative was not used can lead to a conclusion from (14) that this alternative is not true (i.e., not all of John’s kids are playing outside). The implicature can be canceled with the follow-up elaborating sentence Actually all of John’s kids are playing outside.

The inference relation built into inference datasets has an imprecise definition that says “T entails (contradicts) H if humans reading T would typically infer that H is most likely true (false)” (see p. 21) and represents a weaker relation than logical entailment. This raises a question: What is the relation between the entailment that textual inference models learn and pragmatic inferences like implicatures and presuppositions? Do textual inference models recognize implicatures as entailment or as neutral? Are they robust enough to consistently accommodate presuppositions?

To answer these questions, Jeretic et al. (Reference Jeretic, Warstadt, Bhooshan and Williams2020) automatically created an inference dataset, called ImpPres, focusing on scalar implicatures and presuppositions. The problems were generated from predefined sentence templates, in total more than 25,000. The scalar implicature part consists of six subparts, each focusing on a particular lexical scale: determiners ⟨some, all⟩, connectives ⟨or, and⟩, modals ⟨can, have to⟩, numerals ⟨2, 3⟩, gradable adjectives ⟨good, excellent⟩, and gradable verbs ⟨run, sprint⟩. The presupposition part has eight subparts involving all N, both, change of state, cleft existence, cleft uniqueness, only, possessed definites, and questions. As noted by the dataset authors, ImpPres is solely intended for evaluation purposes since the patterns in the dataset can be easily learned.
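The template-based construction can be illustrated with a small sketch for the ⟨some, all⟩ scale; the toy templates and the double labeling under a logical versus a pragmatic reading are our own illustration, not the ImpPres generation code.

```python
NOUNS = ["students", "dogs"]
VERBS = ["slept", "ran"]

def some_all_problems():
    """Generate Text-Hypothesis pairs for the <some, all> scale."""
    for noun in NOUNS:
        for verb in VERBS:
            text = f"Some {noun} {verb}."
            hypothesis = f"All {noun} {verb}."
            # Logically, "some" leaves "all" open (neutral); pragmatically,
            # the implicature "not all" turns the pair into a contradiction.
            yield text, hypothesis, {"logical": "neutral", "pragmatic": "contradiction"}

for text, hypothesis, labels in some_all_problems():
    print(text, "|", hypothesis, "|", labels)
```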

Experiments conducted on MNLI-trained BERT showed that with some consistency BERT uses pragmatic inference when some is in a premise – that is, identifies examples like ⟨some N V, all N V⟩ as contradiction. However, experiments on other subparts suggested that BERT cannot distinguish the scalar pairs for connectives and gradable concepts – for example, it treats X is good and X is excellent as semantically equivalent, and it inconsistently handles the cases of numerals and modals. Evaluation on the presupposition part reveals that BERT predicts entailment for presuppositions of clefts (e.g., it is X who V ⇒ Someone V), possessed definites, only, and questions (e.g., John knew why Ann left ⇒ Ann left) but fails to do so for numerals (e.g., Both N V ⇒ Exactly two N V) and change of state (e.g., X was healed ⇒ X used to be ill).

Jeretic et al. (Reference Jeretic, Warstadt, Bhooshan and Williams2020) conclude that the pragmatic reasoning capacity of BERT mostly comes from the pretraining stage – that is, masked language modeling, as MNLI contains an insufficient number of pragmatic inferences and almost no samples of those triggered lexically. This leaves the question open whether LMs are able to consistently carry out pragmatic reasoning.

A follow-up study by Parrish et al. (Reference Parrish, Schuster and Warstadt2021) created a test dataset of more than 2,000 inference problems on presuppositions. In the dataset, the Text consists of multiple naturally occurring sentences, while the Hypothesis is manually constructed for each Text. To model the gradable nature of presupposition projection/cancellation, they also designed variants of Text that contain negated presupposition triggers. The results of their experiments show that models performed comparably to humans on relatively simple cases (e.g., cleft, numeric determiners, and temporal adverbs) but failed to fully capture human-level context sensitivity and gradience.

For related work, we refer the readers to Jiang and de Marneffe (Reference Jiang and de Marneffe2019), Ross and Pavlick (Reference Ross and Pavlick2019), and Schuster, Chen, and Degen (Reference Schuster, Chen and Degen2020). Jiang and de Marneffe (Reference Jiang and de Marneffe2019) recast samples of CommitmentBank (de Marneffe, Simons, & Tonhauser, Reference de Marneffe, Simons and Tonhauser2019) to inference problems, where the Text consists of multiple sentences, and the Hypothesis is a complement of clause-embedding verbs under entailment-canceling environments (conditional, negation, modal, and question). Based on the experiments with BERT models, they concluded that the models still do not capture the full complexity of pragmatic reasoning. Ross and Pavlick (Reference Ross and Pavlick2019) studied whether BERT can make correct inferences about veridicality in verb–complement constructions. While the projectivity behavior of verb–complement verbs is different from presupposition projection, they share similarities when it comes to inferring embedded meaning. Schuster et al. (Reference Schuster, Chen and Degen2020) explored whether an LSTM-based sentence encoder can be used to predict the strength of scalar inferences, namely predicting semantic similarity between some kids play and some, but not all, kids play.

2.2.5 Other Targeted Inference Datasets

In addition to the discussed inference datasets, there are many other datasets that focus on semantic phenomena beyond the scope of the section. Kober et al. (Reference Kober, Bijl de Vroe and Steedman2019) designed and manually annotated a set of sentence pairs that require reasoning with tense and aspect.Footnote 16 Ravichander et al. (Reference Ravichander, Naik, Rose and Hovy2019) prepared the EQUATE dataset for quantitative reasoning formatted as inference problems. Saha, Nie, and Bansal (Reference Saha, Nie and Bansal2020) constructed the ConjNLI challenge set to evaluate LMs on understanding connectives (like and, or, but, nor) in conjunction with quantifiers and negation. In addition to the monotonicity fragment, Richardson et al. (Reference Richardson, Hu, Moss and Sabharwal2020) created synthetic data fragments for negation, Boolean connectives, quantifiers, and comparatives. Abzianidze et al. (Reference Abzianidze, Zwarts and Winter2023) curated inference problems on spatial reasoning and showed that LMs are far from mastering it. Liu et al. (Reference Liu, Wu, Michael, Bouamor, Pino and Bali2023) designed an inference dataset, called AmbiEnt, to evaluate models on reasoning with ambiguous sentences involving a variety of lexical, syntactic, and pragmatic ambiguities. The dataset shifts from three-way classification to multi-label classification with three inference labels. Inference problems that are sensitive to the ambiguity of the Text are classified with more than one inference label.

2.3 Interim Conclusion

Since the first RTE task (Dagan et al., Reference Dagan, Glickman and Magnini2006), reasoning with natural language remains a popular NLP task. In the age of deep learning, the task gained momentum with the creation of the SNLI (Bowman et al., Reference Bowman, Angeli, Potts and Manning2015) and MNLI (Williams et al., Reference Williams, Nangia and Bowman2018) datasets.Footnote 17 Both MNLI and RTE (the merge of RTE1, RTE2, RTE3, and RTE5) are part of the GLUE benchmark (Wang, Singh, et al., Reference Wang, Singh and Michael2019) for NLU. The new NLU benchmark SuperGLUE (Wang, Pruksachatkun, et al., Reference Wang, Pruksachatkun and Nangia2019) dropped MNLI as by that time systems had already reached 90 percent accuracy on the mismatched set, close to the human performance (92.8 percent). However, the RTE set was kept in SuperGLUE since system performance was nearly eight points lower than the human performance (93.6 percent). Currently, RTE’s human threshold is already beaten by PaLM (Chowdhery et al., Reference Chowdhery, Narang and Devlin2022).

Recently the NLP community started to actively create numerous inference datasets that focus on certain phenomena (Rogers & Rumshisky, Reference Rogers and Rumshisky2020) to evaluate the competence of LLMs. This opened the door to two new evaluation modalities, in addition to the standard train-and-test regime: adversarial testing and challenge testing. While the former targets the weak points of a model to emphasize its brittleness, the latter tries to evaluate the model’s competence on a particular linguistic phenomenon, which is usually out of the training set distribution.

Interestingly and somewhat unexpectedly, while the large models beat the SOTA on standard inference benchmark datasets (such as SNLI, MNLI, and RTE), new targeted inference datasets have been created that reveal the incompetence of these large models on a certain set of phenomena. Even if the models achieve human parity on (semantically) challenging inference datasets, there is substantial room for improving benchmarking in the textual inference task (Bowman & Dahl, Reference Bowman and Dahl2021), which will significantly affect the evaluation results.

3 Compositionality

Compositionality of linguistic meaning is responsible for the construction of propositional meanings from components put together combinatorially, in tandem with the syntax of language.

Compositionality usually assumes a syntactic structure used as an input to interpretation. Typical deep learning models, however, operate on surface strings rather than syntactic structures. The assumption is that the relevant aspects of syntactic parsing are learned implicitly during end-to-end learning. This is plausible as neural models have shown good results in relevant tasks, namely recognizing recursive languages (Bernardy, Reference Bernardy2018; Weiss, Goldberg, & Yahav, Reference Weiss, Goldberg and Yahav2018) and learning constrained interpreted languages (Hudson & Manning, Reference Hudson and Manning2018; Lake & Baroni, Reference Lake and Baroni2018). Sometimes, instead of the general notion of compositionality, the more special property of systematicity is explored, for example, in Lake and Baroni (Reference Lake and Baroni2018). Systematicity means extending semantic interpretation to combinations with new (atomic) lexical items.

Recursive compositional interpretation has been mainly explored on artificial languages of arithmetic expressions and sequence operations (Hupkes et al., Reference Hupkes, Dankers, Mul and Bruni2020; Hupkes, Veldhoen, & Zuidema, Reference Hupkes, Veldhoen and Zuidema2018; Nangia & Bowman, Reference Nangia and Bowman2018). In what follows, we review proposed methods of assessing compositional properties of neural systems (Andreas, Reference Andreas2019b; Ettinger et al., Reference Ettinger, Elgohary, Phillips and Resnik2018; Mickus, Bernard, & Paperno, Reference Mickus, Bernard and Paperno2020; Soulos et al., Reference Soulos, McCoy, Linzen and Smolensky2020). Kim and Linzen (Reference Kim and Linzen2020), for instance, include depth of recursion as one of the many aspects of systematic semantic generalization. We then explicate the computational processes and representations that mirror compositionality in SOTA computational models, most notably the Transformer.

The study of compositionality in current machine-learning models significantly overlaps with the study of generalization (Hupkes et al., Reference Hupkes, Giulianelli and Dankers2022) as compositionality is the mechanism that enables semantic generalization to unseen combinations of linguistic elements.

Notions of Compositionality

Philosophers of language and formal semanticists assume a notion of compositionality for (linguistic) signs that goes back to the ideas of Gottlob Frege and his student Rudolf Carnap, whereby the meaning of a complex expression is a function of the meanings of its parts and the way they are combined. This notion, although argued to be rather weak (Kracht, Reference Kracht2011), imposes certain constraints on the nature of the underlying objects. Namely, one distinguishes the (linguistic) forms and their meanings and assumes certain combination operations applied to them. The assumptions of structure-building operations, while weakening the notion of compositionality, are nonetheless useful, because they allow for an elegant account of structural ambiguity; the sentence Mary saw a man with binoculars has two readings (Mary used the binoculars vs. the man had the binoculars), which are derived from combining the same words in different ways.

In contrast to this Fregean notion of compositionality, some researchers in machine learning and cognitive science discuss compositionality of concept representations within a model without necessarily a link to a natural or formal language that may express those concepts. Sometimes this is discussed under the name of combinatorial properties – for example, conjunctions of properties (the concept of being round and striped) are considered compositional combinations of more basic concepts. Here, instead of (symbolic) linguistic expressions (such as the phrase round and striped), one focuses on learned meaning representations in cognitive or computational systems (e.g., the model’s hidden states corresponding to round and striped objects). This literature (e.g., Tokmakov, Wang, & Hebert, Reference Tokmakov, Wang and Hebert2019) investigates whether the system’s learned representations correspond to a decomposition of the inputs that are represented; inputs’ combination is assumed to follow structure-building rules, explicitly building on the analogy with syntactic structure in language (Andreas, Reference Andreas2019b). For instance, Du, Li, and Mordatch (Reference Du, Li and Mordatch2020) compose properties of objects such as shape, color, and position for the purposes of image generation. Modern language and vision models such as CLIP (Radford et al., Reference Radford, Kim and Hallacy2021) are known to learn primitive concepts well but still fail to treat concept compositions correctly (Yun et al., Reference Yun, Bhalla, Pavlick and Sun2022).

3.1 Tests of Compositionality

To a great extent, current neural approaches to language are black boxes. While architectures such as the Transformer are in principle Turing complete (Pérez, Barceló, & Marinkovic, Reference Pérez, Barceló and Marinkovic2021) and therefore capable of learning hierarchical syntax and compositional semantics accompanying it, they are not trying to implement these properties of language directly. Rather, compositionality is more of an emergent property.

Assume that a learner acquires a correspondence between language and a semantic representation. How can we tell if the resulting mapping is compositional? This question has been most persistently investigated in the study of emergent communication systems – for example, Kirby et al. (Reference Kirby, Tamariz, Cornish and Smith2015).

A common method of measuring compositionality of the meaning-form mapping is correlation analysis, as proposed, for example, by Kirby, Cornish, and Smith (Reference Kirby, Cornish and Smith2008). It can be applied regardless of whether the meaning-form mapping arises via iterated artificial language learning in humans, or in computational simulations that may or may not include neural network models. The idea is as follows: If we have a similarity metric defined on linguistic forms (such as the Levenshtein string edit distance) and a similarity metric defined on meaning representations (such as cosine of two vector representations of meaning), the similarities in form versus meaning should correlate. Pearson’s product-moment correlation is then used as a metric of compositionality. Alternative but related compositionality metrics have also been explored (Chaabouni et al., Reference Chaabouni, Kharitonov and Bouchacourt2020). The correlation-based methods are of course a very rough measure of compositionality as defined in philosophy of language. If the meaning of a complex expression is a function of the meanings of its parts, containing largely the same parts does not guarantee relatedness of meaning. Indeed, functions can map related expressions to very different values. Take the example of predicate logic where each formula is interpreted as 0 or 1. An arbitrarily large formula ϕ can be very close to ¬ϕ in terms of the string edit distance (1 edit), but its semantic value is opposite.
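The correlation method can be sketched in a few lines; the four forms and their vector meanings below are toy assumptions chosen to make the mapping compositional.

```python
import itertools
import numpy as np
from scipy.stats import pearsonr

def levenshtein(a, b):
    """Classic dynamic-programming string edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[-1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

forms = ["redcar", "bluecar", "reddog", "bluedog"]
meanings = {"redcar": [1, 0, 1, 0], "bluecar": [0, 1, 1, 0],
            "reddog": [1, 0, 0, 1], "bluedog": [0, 1, 0, 1]}

form_dist, meaning_dist = [], []
for a, b in itertools.combinations(forms, 2):
    form_dist.append(levenshtein(a, b))
    u, v = np.array(meanings[a]), np.array(meanings[b])
    meaning_dist.append(1 - u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

r, _ = pearsonr(form_dist, meaning_dist)
print(round(r, 2))  # a high correlation suggests a (roughly) compositional mapping
```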

But even when we stay away from extensionally interpreted predicate logic and close to natural language examples, meaning–form correlation appears to be problematic as a measure of compositionality. Common linguistic phenomena such as ambiguity and semantically irrelevant morphosyntactic variation can bring meaning–form correlation scores to very low values even in an otherwise perfectly compositional language, and meaning–form correlation as measured on naturalistic data is indeed strikingly low (Mickus et al., Reference Mickus, Bernard and Paperno2020).

3.1.1 Similarity-Based Tests

One approach to establishing whether compositional vector semantic representations are satisfactory relies on the notion of similarity. Vector spaces have inherent similarity structures that can be measured numerically with metrics like the cosine. The cosine values serve as the models’ similarity or relatedness predictions for pairs of sentences or phrases, and are compared to numeric similarity or relatedness scores produced by human annotators for the same phrase or sentence pairs. Metrics of choice for composition model evaluations are typically correlation coefficients (Pearson’s or Spearman’s).

The first such similarity datasets were rather small – for example, human similarity judgments for adjective-noun, noun-noun, and verb-object combinations for 108 phrase pairs for each type in Mitchell and Lapata (Reference Mitchell and Lapata2010); determiner–noun combinations (Bernardi et al., Reference Bernardi, Dinu, Marelli and Baroni2013) and sentences with transitive verbs (Kartsaklis, Sadrzadeh, & Pulman, Reference Kartsaklis, Sadrzadeh and Pulman2013).

These small controlled datasets featuring dozens of phrases or sentences raise concerns of generality and ecological validity. They might not be representative of semantic composition in general. As a result, a model might work well for this data but fail to extend to other phrases or more complex data.

This motivated more ecologically valid datasets consisting of varied sentences with a range of syntactic structures. The Semantic Textual Similarity (STS) task, introduced by Agirre et al. (Reference Agirre, Cer, Diab and Gonzalez-Agirre2012), presents sentence pairs annotated on a scale from 0 (“on different topics”) to 5 (“completely equivalent”). The original sentences in the pairs were taken from a variety of sources, such as image and video descriptions and outputs of machine translation models. The SICK dataset (Marelli et al., Reference Marelli, Menini and Baroni2014) tries to control for phenomena such as proper nouns that may affect model predictions but are distinct from composition and could confound the evaluation of compositional models.

There are also alternatives to human judgments on similarity or relatedness for evaluation of compositional representations. One proposal is that the similarity of vector representations of phrases should correspond to how often one of the components in the phrase is expressed by the same lexical item across languages (Ryzhova, Kyuseva, & Paperno, Reference Ryzhova, Kyuseva and Paperno2016). For example, the vector of the phrase sharp knife is expected to be more similar to that of sharp saw than sharp needle because across languages the former two consistently use the same translation for sharp (e.g., French tranchant), while the latter often differs (French aigu).

There is also the rank approach to intrinsic similarity-based evaluation. While ingenious, it has limited applicability and can only be used with vector models that can produce vector representations of phrases that are comparable to vectors of words. One can think of such a model as processing a corpus where every occurrence of the phrase red car is represented as a single token red_car. Such a model can then estimate a vector for the phrase red car (the observed phrase vector) just like it creates vectors for words red and car when they occur outside of the phrase. Ideally, an adequate composition model should predict a compositional vector for red car that closely resembles the observed vector of red_car. One metric of success for a compositional model is the rank: If the observed phrase vector is closer to the compositional one than vectors of other words and phrases, the model’s prediction is on the right track and the rank is 1; if the compositional model is further off track, the rank of the “correct” phrase vector is higher.
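A minimal sketch of the rank computation is given below; all vectors are made-up toy numbers, and vector addition stands in for whatever composition model is being evaluated.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Observed vectors "learned" from a corpus where red_car is treated as one token.
observed = {
    "red_car":  np.array([0.9, 0.8, 0.1]),
    "red":      np.array([1.0, 0.1, 0.0]),
    "car":      np.array([0.1, 1.0, 0.1]),
    "blue_car": np.array([0.0, 0.9, 0.6]),
}

# A (very simple) composition model: add the component vectors.
composed = observed["red"] + observed["car"]

# Rank the candidates by similarity to the composed vector and report the
# position of the observed phrase vector; rank 1 means the prediction is on track.
ranked = sorted(observed, key=lambda w: cosine(composed, observed[w]), reverse=True)
print(ranked.index("red_car") + 1)
```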

Rank evaluation of vector composition was first applied by Baroni and Zamparelli (Reference Baroni and Zamparelli2010) and extended more broadly by Dima et al. (Reference Dima, de Kok, Witte and Hinrichs2019). See also Boleda, Baroni, McNally, et al. (Reference Boleda, Baroni and McNally2013) on adjective–noun vector composition for non-intersective adjectives.

3.1.2 Representation Testing on Downstream Tasks

Compositionality of models can also be estimated indirectly via downstream tasks. The assumption is that solving the specific task requires adequate semantic representations, which must be compositional. Such tasks include inference (section 2), sentiment analysis (determining how positively a text, typically customer feedback, describes a certain object), and QA (Rajpurkar et al., Reference Rajpurkar, Zhang, Lopyrev and Liang2016):

  (15) passage: In meteorology, precipitation is any product of the condensation of atmospheric water vapor that falls under gravity. <>

       question: What causes precipitation to fall?

Closely related to QA, the LAMBADA task (Paperno et al., Reference Paperno, Kruszewski and Lazaridou2016) is a fill-in-the-blank task in which understanding of a whole passage, above and beyond the immediate sentential context of the masked word, is required to fulfill the task successfully. LAMBADA therefore approximately measures the ability of LMs to process the compositional meaning of discourse. The task was challenging for all models at the time the dataset was introduced, but large LMs with few-shot learning on the task (Brown et al., Reference Brown, Mann, Ryder, Larochelle, Ranzato and Hadsell2020; Chowdhery et al., Reference Chowdhery, Narang and Devlin2022) have shown impressive progress on LAMBADA.

Such tasks have been instrumental in validating models; they all clearly involve compositional meaning. For example, a negation placed in a well-chosen position in the text can completely change the entailment relation between two sentences, the set of correct answers to a question about the text, or the text’s sentiment. In other cases, a negation placed elsewhere might not interfere with the meaning of the text in the same ways, showing that proper treatment of negation requires compositionality: deriving the meaning of a complex text both from the elements in the text and the way they are combined.

3.1.3 Compositional Tasks
Toy Tasks

Researchers used dedicated toy tasks to study the ability of deep learning models to learn recursive compositional behavior. The Arithmetic Language task (Hupkes et al., Reference Hupkes, Veldhoen and Zuidema2018) consists in interpreting nested arithmetic expressions with + and − operations. For example, ((4−2)−1) maps to the value 1.
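As a toy illustration (our own sketch, not the original implementation of Hupkes et al.), such expression–value pairs can be generated with a few lines of code; a learner only ever sees the resulting pairs and has to induce the recursive interpretation:

import random

def generate(depth):
    # recursively generate a nested expression over the integers 1..9
    if depth == 0:
        return str(random.randint(1, 9))
    op = random.choice(["+", "-"])
    left = generate(random.randint(0, depth - 1))
    right = generate(random.randint(0, depth - 1))
    return f"({left}{op}{right})"

def interpret(expr):
    # gold interpretation; a learner only sees (expression, value) pairs
    return eval(expr)

expr = generate(depth=2)            # e.g. "((4-2)-1)"
print(expr, "->", interpret(expr))  # the model must learn this mapping from examples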

Paperno (Reference Paperno2022) proposes the Personal Relations task focusing on recursive composition in referring phrases. For example, learning systems are expected to map Ann’s friend’s child to Donna when trained on data that includes a mapping of Ann’s friend to Bill and Bill’s child to Donna.

Lake and Baroni (Reference Lake and Baroni2018) propose the SCAN task consisting in mapping commands such as jump twice to action sequences such as I_jump I_jump. The dataset includes recursive structures like jump twice and walk twice. The SCAN dataset supports multiple data splits into training, development, and testing partitions. The most challenging one is the jump split, whereby the training data contains the word jump only as the name of an atomic action I_jump, while the test set includes complex examples with jump, such as jump twice. This split is intended to demonstrate true recursive generalization from simple to complex examples, as opposed to learning to fill gaps in large numbers of superficially similar examples.
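The target mapping itself is simple and fully compositional, as the following sketch of a rule-based interpreter for a tiny fragment of SCAN illustrates (a simplification of the actual SCAN grammar):

PRIMITIVES = {"jump": "I_jump", "walk": "I_walk", "run": "I_run", "look": "I_look"}

def interpret(command):
    # interpret "<verb>", "<verb> twice", or "<verb> thrice" compositionally
    words = command.split()
    action = PRIMITIVES[words[0]]
    repetitions = {"twice": 2, "thrice": 3}.get(words[1], 1) if len(words) > 1 else 1
    return " ".join([action] * repetitions)

print(interpret("jump twice"))   # I_jump I_jump
# In the jump split, a learner sees "jump" -> I_jump and "walk twice" -> I_walk I_walk
# during training, and must generalize to "jump twice" at test time.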

Hupkes et al. (Reference Hupkes, Dankers, Mul and Bruni2020) developed a more complex “PCFG” task of processing commands that produce sequences; for example, append swap F G H , repeat I J produces G H F I J I J: the sequence F G H gets its first element swapped into the last position and is appended to the sequence I J repeated twice.

In all of these toy tasks, deep learning models showed mixed results. On both the Arithmetic Language and Personal Relations tasks, recurrent models such as GRUs showed good generalization behavior, but only for left-branching structures, and robust composition with alternative architectures such as Transformers or CNNs has not been reported. For the SCAN task, generalization on the hard jump split has been achieved by custom modifications of learning models that have no independent justification (Nye et al., Reference Nye, Solar-Lezama, Tenenbaum and Lake2020) or only a weak one (Chaabouni, Dessì, & Kharitonov, Reference Chaabouni, Dessì and Kharitonov2021). However, the chain-of-thought approach of Zhou et al. (Reference Zhou, Schärli and Hou2022) does appear to generalize to compositional tasks above and beyond SCAN. For the PCFG data, only some of the quantitative measures of compositionality showed high values for neural models.

Larger Tasks

Above and beyond intrinsic similarity-based evaluation and toy tasks, compositional properties of neural models have been explored in machine translation by Dankers, Bruni, and Hupkes (Reference Dankers, Bruni and Hupkes2022) and Hupkes et al. (Reference Hupkes, Dankers, Mul and Bruni2020), who argue that more training data makes neural models’ generalization more compositional.

Kim and Linzen (Reference Kim and Linzen2020) proposed the COGS dataset to test models on semantic parsing: translation of natural language sentences into logical formulae that represent their meanings. For example, A cat smiled is translated into (16):

  (16) cat(x1) and smile.agent(x2, x1)

On COGS, neural models showed good generalization in cases that could be treated as lexical substitution but struggled to generalize to novel structural configurations, for example, those created by deeper recursive syntactic embedding (e.g., The cat liked that the dog liked that the mouse liked that the girl saw the rat).

Srivastava et al. (Reference Srivastava, Rastogi and Rao2022) presented a benchmark of 204 language tasks (BIG-bench) that are supposed to go “beyond the imitation game” and test true linguistic generalization of LMs. Some of these tasks are designed to probe compositional semantic behavior, and they can include reasoning, as in the cause-and-effect task:

  (17) For each example, two events are given. Which event caused the other?

       choice: It started raining.

       choice: The driver turned the wipers on.

Many other aspects of compositionality in LMs are still waiting to be explored.

3.2 Methods for Compositionality

3.2.1 Levels of Composition

Compositional models exist for all levels of linguistic structure. For morphology, there have been different attempts to use morpheme decomposition in computing vector representations of derived words (Botha & Blunsom, Reference Botha, Blunsom, Xing and Jebara2014; Lazaridou et al., Reference Lazaridou, Marelli, Zamparelli and Baroni2013; Luong, Socher, & Manning, Reference Luong, Socher and Manning2013). Vector-based representations are thereby learned for individual morphemes. Soricut and Och (Reference Soricut and Och2015) combine a simple composition model with morphology induction. Most current NLP models do away with morphemes altogether. The simple and efficient fastText model (Bojanowski et al., Reference Bojanowski, Grave, Joulin and Mikolov2017) approximates a word’s vector as the sum of its character n-gram embeddings: rather than simply using the distribution of, for example, hipster across contexts, the system collects and sums the distributions of n-grams of characters – for example, hips, ipst, pste, ster. The contrasts in distributional informativeness of hips or ster versus ipst or pste might effectively approximate the effect of segmenting a word into morphemes. Another approach, standard in modern LMs, is subword tokenization, which may or may not correspond to morphemes. At the same time, there is evidence suggesting that morphologically informed segmentation might outperform subword segmentation (Hofmann, Pierrehumbert, & Schütze, Reference Hofmann, Pierrehumbert and Schütze2021). Among subword-based alternatives (including BPE but also others, e.g., Jinman et al., Reference Jinman, Zhong, Zhang and Liang2020; Pinter, Guthrie, and Eisenstein Reference Pinter, Guthrie and Eisenstein2017), fastText remains a robust method for producing rare word vectors (Prokhorov et al., Reference Prokhorov, Pilehvar and Kartsaklis2019; Vulić et al., Reference Vulić, Baker and Ponti2020).
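The character n-gram idea can be illustrated as follows (a simplified sketch: real fastText uses n-grams of several lengths, hashing, and the whole word as an additional unit; the embeddings here are random stand-ins):

import numpy as np

def char_ngrams(word, n=4):
    # fastText pads words with boundary symbols "<" and ">" before extracting n-grams
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

print(char_ngrams("hipster"))   # ['<hip', 'hips', 'ipst', 'pste', 'ster', 'ter>']

# a word vector is then approximated as the sum of its n-gram embeddings
rng = np.random.default_rng(0)
ngram_vectors = {g: rng.normal(size=50) for g in char_ngrams("hipster")}   # toy embeddings
word_vector = sum(ngram_vectors[g] for g in char_ngrams("hipster"))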

In phrase- and sentence-level composition, many earlier models relied on parse tree representations as input, and therefore featured recursive composition following grammatical structure (Clark, Coecke, & Sadrzadeh, Reference Clark, Coecke and Sadrzadeh2008; Irsoy & Cardie, Reference Irsoy and Cardie2014; Le & Zuidema, Reference Le and Zuidema2015; Paperno, Pham, & Baroni, Reference Paperno, Pham and Baroni2014; Socher et al., Reference Socher, Huval, Manning and Ng2012; Socher et al., Reference Socher, Perelygin and Wu2013). However, state-of-the-art LMs are instead trained end to end on text data without explicit parsing.

The general principle of having the same composition model for all levels of language structure up to the level of discourse has evolved as computational models grew more sophisticated. It was already present in latent semantic analysis (Landauer & Dumais, Reference Landauer and Dumais1997) in the simple form of vector addition. Modern LMs such as BERT and GPT employ a much more flexible mechanism of self-attention that has the same cross-level coverage from tokens up to monological or dialogical texts.

3.2.2 Theoretically Simple Models of Composition
The Additive Model of Composition

Assume that two items such as words that have vector representations are combined. What is the vector representation of their combination? The simplest approach to vector composition consists in adding up vectors of component words together. Repeated addition effectively treats text as a bag of words, meaning that word order and syntactic structure are ignored; texts with the same words in them are processed identically. Despite its simplicity, vector addition is surprisingly effective and robust in practice. For example, the sum of high-quality word vectors outperformed more sophisticated approaches to vector composition in the study of preposition ambiguity (Ritter et al., Reference Ritter, Long and Paperno2015). Vector addition has been used as a method for arriving at meaning representations of phrases, sentences, and even texts at least since Landauer and Dumais (Reference Landauer and Dumais1997). More recently, sentence representations as summed contextualized token vectors from Transformer-based models were suggested (Cer et al., Reference Cer, Yang and Kong2018). Additive composition is efficient for a good reason. Ultimately, dimensions of word vectors are used to predict in which contexts the word is likely to be used; this is the objective of word embeddings and neural LMs. This means that values in word vectors translate into scores of statistical association between words and their contexts, which are usually related to the Pointwise Mutual Information (PMI) score (Levy & Goldberg, Reference Levy and Goldberg2014):

PMI(w, c) = log ( p(w, c) / (p(w) p(c)) )    (6)

where p(w),p(c) are probabilities of the word and the context and p(w,c) is the probability of their joint occurrence. Under the idealizing assumption that two words’ associations with contexts do not interact non-trivially, it follows that the sum of two words’ PMI values for a given context approximates these two words’ combination’s PMI for the same context. As a result, if vector dimensions of words correspond to PMIs as they do in models like GloVe and skip-gram, then the sum car+red approximates the distributional profile of the phrase red car (Paperno & Baroni, Reference Paperno and Baroni2016). If dimensions of car indicate that car raises the probability of context c by a orders of magnitude, and dimensions of red indicate that red raises the probability of context c by b orders of magnitude, then the phrase red car plausibly raises the probability of context c by a+b orders of magnitude. This suggests additive vector composition as a strong baseline to the extent that words’ PMI scores are linear functions of their vector dimensions. For models that do not include log transformation in the calculation of association scores, as in Mitchell and Lapata (Reference Mitchell and Lapata2010), pointwise multiplication rather than addition is competitive.
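A minimal sketch of additive composition with toy vectors (in practice one would plug in trained embeddings such as skip-gram or GloVe):

import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# toy word vectors standing in for trained embeddings
red  = np.array([0.1, 0.8, 0.3])
car  = np.array([0.9, 0.2, 0.4])
bike = np.array([0.8, 0.1, 0.5])

red_car = red + car                # additive composition
weighted = 0.4 * red + 0.6 * car   # a weighted variant, discussed below

print(cosine(red_car, car), cosine(red_car, bike))   # the phrase stays close to its head noun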

Parametric Approaches to Vector Composition

The additive model has clear practical advantages. However, its conceptual issues are equally obvious. For instance, addition is effectively a bag-of-words model, agnostic of word order and syntactic structure. Addition predicts the exact same vectors for sentences Cats chase mice and Mice chase cats.

This observation motivates various parametric approaches to vector composition. This means that the vector of the phrase includes not just vector representations of the words involved, but also additional numeric parameters. Such parameters can be learned from distributional properties of phrases that can themselves be encoded in vectors. One simple parametric approach that proved efficient in different evaluations such as Mitchell and Lapata (Reference Mitchell and Lapata2010) is weighted addition:

AB = αA + βB    (7)

where α and β are scalar weights. For example, the phrase vector for red car can be computed by combining the vectors for the words red and car with different weights (e.g., 0.6·red + 0.4·car).

In Mitchell and Lapata’s experiments, different weight combinations were estimated for different types of phrases: the first component (adjective) received a high weight in adjective–noun phrases, while the second component (noun) had a higher weight in noun–noun compounds. One problem with weighted addition is its monotonicity. If relations between vectors are expected to be useful in predicting entailment relations between words and phrases, the composition system should allow for different monotonicity properties of elements in composition. For example, the determiner some maintains the entailment relations between the nouns it combines with, while no reverses them; in the case of (weighted) addition, however, the relations between some dog versus some animal and no dog versus no animal would be characterized by the exact same linear offset. In contrast, richer models of composition allow for both monotonic and non-monotonic computation, and the more powerful Transformer-based LLMs discussed in what follows are known to exploit context monotonicity (Bylinina & Tikhonov, Reference Bylinina and Tikhonov2022; Jumelet et al., Reference Jumelet, Denic and Szymanik2021).

These issues of additive models of composition are addressed by richer parametric models, which allow compositional combinations to proceed in more differentiated, even idiosyncratic ways. Directly inspired by type-driven semantic theory, the Lexical Function model (Baroni & Zamparelli, Reference Baroni and Zamparelli2010) treats one element in the phrase as a function and the other as its argument. The functions in question are linear, so composition reduces to the multiplication of the argument vector by the function-specific matrix:

AB = mat(A) B    (8)

An extension of the lexical function model to higher-order functions includes using multidimensional tensors in addition to matrices (Grefenstette et al., Reference Grefenstette, Dinu and Zhang2013). However, the increase in the number of parameters brought about with the introduction of tensors renders such compositional models increasingly impractical, motivating proposals such as the Practical Lexical Function model (Paperno et al., Reference Paperno, Pham and Baroni2014).
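To make the basic Lexical Function recipe concrete, here is a sketch with invented low-dimensional vectors: the matrix for an adjective is estimated by least squares from pairs of noun vectors and corpus-observed adjective–noun phrase vectors, in the spirit of Baroni and Zamparelli’s training procedure:

import numpy as np

# observed vectors for nouns and for "red + noun" phrases (toy numbers)
nouns   = np.array([[0.9, 0.2], [0.1, 0.8], [0.5, 0.5]])   # car, wine, shirt
phrases = np.array([[0.8, 0.4], [0.2, 0.9], [0.5, 0.7]])   # red car, red wine, red shirt

# estimate the matrix for "red" so that mat(red) @ noun ~ phrase (least squares)
red_matrix, *_ = np.linalg.lstsq(nouns, phrases, rcond=None)
red_matrix = red_matrix.T

new_noun = np.array([0.7, 0.3])              # e.g. an unseen noun like "bike"
predicted_phrase = red_matrix @ new_noun     # compositional vector for "red bike"
print(predicted_phrase)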

In contrast to Lexical Function and versions thereof, other highly parametric approaches apply matrix weights to both elements of the composition (Dima et al., Reference Dima, de Kok, Witte and Hinrichs2019; Guevara, Reference Guevara2011; Socher et al., Reference Socher, Huval, Manning and Ng2012, Reference Socher, Perelygin and Wu2013):

AB = mat1 A + mat2 B    (9)

where the matrices mat1, mat2 can be specific to the lexical items A, B being combined, or be shared across lexical items. Some studies, such as Gamallo (Reference Gamallo2021), experimented with a different compositional approach based on syntactic dependencies rather than constituent structure.

The problem with all the parametric approaches to vector composition lies in scaling up to diverse use cases and arbitrarily complex examples. End-to-end approaches such as LLMs work better for most tasks: not only are they more robust, scaling up rather easily to bigger input data, but they also do not depend on parsing quality or efficiency, all while sharing parameters of composition across different types of constructions.

3.2.3 Composition in State-of-the-Art Transformer Models
Attention-Based Composition

Modern computational models based on the Transformer architecture have at their heart the self-attention mechanism, combined with feedforward neural network sublayers. There are many different instances of both self-attention and feedforward layers in multilayer Transformers.

In practice, this means that Transformers are naturally adapted to execute the simple and relatively interpretable vector composition strategies discussed earlier in section 3.2.2. Both self-attention and feedforward steps include vector addition and input multiplication by a matrix. As such, Transformers can easily emulate (weighted) addition, (practical) lexical function, and other simple methods based on weighted sums and weight matrix multiplication.
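The following stripped-down sketch of a single self-attention head (omitting multiple heads, layer normalization, and other details of real Transformers) makes the connection explicit: the output for each token is a weighted sum of linearly transformed vectors of all tokens in the input:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: one vector per token (rows); Wq, Wk, Wv: learned matrices
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))   # how much each token attends to each other token
    return weights @ V                                  # each output is a weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))                    # three toy token vectors, e.g. "red", "car", "."
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (3, 4): contextualized vectors, one per token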

The ability to perform step-by-step computation (chain of thought) is useful not only for reasoning but also for complex semantic composition. Zhou et al. (Reference Zhou, Schärli and Hou2022) propose least-to-most prompting, a custom version of the chain-of-thought technique that allows GPT-3 to achieve good generalization on SCAN from just fourteen examples, as well as on two other simple compositional tasks. Drozdov et al. (Reference Drozdov, Schärli and Akyürek2022) show further that least-to-most prompting also helps in more realistic compositional tasks such as COGS.

3.3 Interim Conclusion

The problem of compositionality in neural systems was seriously addressed long before present-day Transformer systems. Already Smolensky (Reference Smolensky1990) tried to design a principled neural treatment of compositionality using filler/role decomposition, with final representations derived from combinations of vector representations of the elements being combined (“fillers”) and of their roles in the structure. More recently, Smolensky and colleagues attempted to establish such filler–role structures in modern trained recurrent neural networks (McCoy et al., 2019) and to enhance Transformers with explicit filler–role representations (Schlag et al., Reference Schlag, Smolensky and Fernandez2019).

There is active ongoing research on compositional generalization using vector-based and neural network systems. This includes methods for helping achieve compositional goals (e.g., few-shot prompting and chain-of-thought reasoning), research on testing compositional generalization (e.g., the development of datasets like COGS), as well as interpretability of the composition process. Ideally, a successful system will make correct predictions on examples that require compositional understanding of language (here some systems already show promising behavior, e.g., for SCAN), while also using vector representations that make semantic composition interpretable; the latter is a more remote goal, although analyses like the one by Merullo et al. (Reference Merullo, Eickhoff and Pavlick2023) already go in this direction.

From different strands of this research emerges ample evidence that the nature and the order of presentation of training data have a significant effect on the compositional behavior of trained neural models (e.g., Paperno, Reference Paperno2022). Chan et al. (Reference Chan, Santoro and Lampinen2022) show that statistical distributions of data in natural corpora enable essentially compositional few-shot behavior of LMs. Akyürek and Andreas (Reference Akyürek and Andreas2022) and Andreas (Reference Andreas2019a) propose and test methods of generating additional training data (data augmentation) that helps neural models arrive at compositional behavior. In their experiments, improvements are observed for various tasks that involve compositional behavior, including language modeling, SCAN, and COGS.

4 Grounding: Language and Vision

So far, we have mainly been discussing capabilities of deep learning models when it comes to meaning-related tasks that are defined on text – and, consequently, can be formulated for text-only models. Let us now take a step back and return to the theoretical debate we introduced in Section 1.2: Can text-based models develop representations that contain semantic information, given that such models lack an explicit, separate, nonlinguistic space to ground language in? We concluded, both on principled grounds and based on empirical results of text-only models’ behavior, that aspects of linguistic semantics are inferable from non-grounded text. Are models with non-grounded meaning representations qualitatively inferior and semantically defective when compared to models that are trained to connect linguistic representations to nonlinguistic objects and structures?

While for some researchers the answer is a definite yes (Bender & Koller, Reference Bender and Koller2020) and for others it is less obvious (Merrill et al., Reference Merrill, Warstadt and Linzen2022; Piantadosi & Hill Reference Piantadosi and Hill2022; Potts, Reference Potts2020), there is little doubt that information from additional modalities at least has potential to enrich models’ meaning representations. This section explores such grounding: We will focus on models and tasks that involve language in combination with additional, nonlinguistic information.

Linguistic data can be grounded in a variety of ways: the models can be connected to knowledge bases explicitly storing fragments of world knowledge (Du et al., Reference Du, Ding, Xiong, Liu and Qin2022; Guu et al., Reference Guu, Lee and Tung2020; Verga et al., Reference Verga, Sun, Soares and Cohen2020); texts can be associated with visual data (Li et al., Reference Li, Yatskar and Yin2019; Lu et al., Reference Lu, Batra, Parikh and Lee2019; Tan and Bansal Reference Tan and Bansal2019), or even some representation of smell (see an olfactory model in Kiela, Bulat, & Clark Reference Kiela, Bulat and Clark2015).

Reviewing all existing types of grounding in deep learning models is hardly possible within this Element, so we focus on just one type of grounding here: visual grounding. Vision-and-language (V&L) models have shown the most impressive breakthroughs recently, with the high quality of images generated by the newest text-to-image models and fine-grained textual control of the details in the image – see recent models like DALLE-2 (Ramesh et al., Reference Ramesh, Dhariwal and Nichol2022), Imagen (Saharia et al., Reference Saharia, Chan and Saxena2022), Stable Diffusion (Rombach et al., Reference Rombach, Blattmann and Lorenz2021), and others. Figure 3 shows the output of three recent text-to-image models given the same textual prompt as an example. These generated images look impressive at the time we are writing this Element, but they are most likely far from SOTA when you are reading this. The rapid developments in the V&L field, in combination with new and easier ways to personalize V&L models (Gal et al., Reference Gal, Alaluf, Atzmon, Patashnik, Bermano, Chechik and Cohen-Or2022; Ruiz et al., Reference Ruiz, Li and Jampani2022) and the addition of the visual modality to the latest ChatGPT, are attracting a lot of attention from wider communities outside NLP and computer vision to the mapping between language and images. This, in turn, is likely to speed up progress in the area and drive the adoption of these models as tools in digital creativity pipelines and beyond.

Figure 3 Images generated by DALLE-2 (left), Imagen (middle), and Stable Diffusion (right) with the same text prompt: A blue jay standing on a large basket of rainbow macarons.

(a) DALLE-2 image generated on the https://labs.openai.com website, accessed October 10, 2022.

(b) Imagen example from Saharia et al. (Reference Saharia, Chan and Saxena2022).

(c) Stable Diffusion image generated on the demo page https://huggingface.co/spaces/stabilityai/stable-diffusion, accessed October 10, 2022.

In the context of our survey, V&L models are most interesting not as a creative tool, but as a window into the role of extralinguistic grounding in linguistic semantic representations. They give us two streams of information: In roughly truth-conditional terms, we can think of them as “what is said” and “what is meant,” ignoring obvious caveats. As Bender and Koller (Reference Bender and Koller2020) note, to solve the symbol grounding problem, it is not enough to just have these two spaces: In a hypothetical V&L model they use as an example, a model has access to both texts and images, but the training objectives for textual data and for visual data are totally independent from each other. Such a model is not expected to make a connection between the two spaces – for example, it is not expected to be able to perform non-randomly on tasks that require establishing a contentful relation between image and text, such as producing an image caption. For grounding, a training objective needs to relate the two spaces somehow, and there are different potential ways to formulate such a relation.

This section will not give an exhaustive overview of V&L architectures, tasks, and results – it is a blooming field that is only partly relevant for the topic of this Element. Instead, this section will aim to sketch a general idea of how grounding language in visual modality can be approached, and of the main linguistic aspects of such alignment.

4.1 A Grounding Strategy

Ideas about the best ways to connect text to images vary a lot, as do actual implementations – from rather loose connections in terms of similarity (CLIP, Radford et al., Reference Radford, Kim and Hallacy2021) to two-stream models with additional connections in terms of cross-modal attention (ViLBERT and ViLBERT 12-in-1, Lu et al., Reference Lu, Batra, Parikh and Lee2019; Lu et al., Reference Lu, Goswami and Rohrbach2020) to architectures handling data from arbitrary sources and modalities as one single stream (VisualBERT, Li et al., Reference Li, Yatskar and Yin2019; Perceiver, Hawthorne et al., Reference Hawthorne, Jaegle and Cangea2022).

Let us focus on one particular setup – that of CLIP (Radford et al., Reference Radford, Kim and Hallacy2021). It is one of the simplest, but it has also proved to provide a good basis for more complicated architectures, serving as one of their components.

CLIP is pretrained with a contrastive learning objective. What this means is shown schematically in Figure 4: given a batch of image–text pairs, the model learns to distinguish the matching image–text pairs from the ones that do not match. Negative examples (the nonmatching images and texts) are constructed by mixing up images and texts from the original matching pairs. The model learns to distinguish matching pairs from nonmatching ones by jointly training two vector encoders – one for text and one for images – and encoding images and texts into a joint latent space where texts and images matching each other end up close to each other by cosine distance, and nonmatching ones are far from each other by the same distance measure. The learning objective for the model is to learn a contrast between such pairs, hence the objective name.

Figure 4 Summary of the core of CLIP training objective (Radford et al., Reference Radford, Kim and Hallacy2021): contrasting matching text–image pairs with other text–image combinations from the same batch.
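In code, the core of this objective can be sketched as follows (a generic version of the symmetric contrastive loss with randomly generated embeddings, not CLIP’s actual implementation):

import numpy as np

def clip_style_loss(text_emb, image_emb, temperature=0.07):
    # text_emb, image_emb: (N, d) arrays for N matching text-image pairs
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = (t @ v.T) / temperature                                              # N x N cosine similarities
    log_p_text = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))       # text -> image direction
    log_p_image = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))  # image -> text direction
    n = logits.shape[0]
    # the diagonal holds the matching pairs; training pushes their log-probabilities up
    return -(np.trace(log_p_text) + np.trace(log_p_image)) / (2 * n)

rng = np.random.default_rng(0)
print(clip_style_loss(rng.normal(size=(8, 16)), rng.normal(size=(8, 16))))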

Part of the motivation behind this setup is the fact that there is a lot of data of this type – matching image–text pairs – available, which makes it possible in principle to leverage supervision implicitly present in these pairings to learn grounding of language in visual modality (CLIP is trained on 400 million image–text pairs). Note that the model that results from this type of grounding does not allow for text or image generation based on either modality – it is a bimodal encoder, which means that the only thing this model allows for without any additional components is to say, given an image and a text (or two images or two texts), how far they are from each other in the resulting shared space. This very simple objective gives rise to models that proved useful as parts of generative models – for example, CLIP text encoder is a component in the Stable Diffusion text-to-image model (Rombach et al., Reference Rombach, Blattmann and Lorenz2021).

An important question about the training objective is, of course, what properties the resulting grounding relation has: Can similarity reasonably be seen as reference, or as any other relevant truth-conditional notion? If not, what stronger training objective or model architecture would be appropriate for this role? There is little to no work directly addressing this theoretical question, but a relevant observation is made in Pezzelle (Reference Pezzelle2023): CLIP-like contrastive pretraining gives rise to grounding that is sensitive not to truth-against-image specifically but to something quite different, namely a notion of a “good description” of an image, in which the level of specificity of a description can override its truthful applicability. Real but wrong descriptions (coming from a different image) are systematically predicted by CLIP to be a better fit for an image than a description that is true but too general compared to descriptions typically found in the training data – for instance, a nearly universally true description such as They are doing something here.

The space of possible V&L models and architectures is still waiting to be explored from the point of view of what kind of matches between text and image different types of training can give rise to (for a survey of Transformer V&L models, see Khan et al. Reference Khan, Naseer and Hayat2021; on V&L models before 2019/2020, see Zhang et al., Reference Zhang, Yang, He and Deng2020).

But how do we evaluate the quality of the language-to-vision grounding? Let us find out.

4.2 Evaluation of Vision-and-Language Models

There is a vast literature centered around evaluation and interpretation of V&L models. These efforts and corresponding datasets can be organized along two axes: (1) the type of task used by the dataset; (2) which phenomena the dataset targets.

For a comprehensive taxonomy of V&L tasks, see Li et al. (Reference Li, Zhang and Zhang2022). The most popular ones involve producing textual output given an image or an image–text combination, as in image captioning and visual question answering.

Many of the tasks are applicable only to a subset of V&L models (e.g., those that have a decoder component – either image decoding or text decoding). Tasks that are applicable to V&L models across the board are those that solely rely on the output of text and image encoding – we can call them matching tasks. This type of task tests whether a model is able to distinguish between matching image–text pairs and the ones that do not match: for example, given a picture and two texts, tell which one is a better match for the picture, out of the two – or, conversely, given a text and two pictures, tell which one fits the textual description better. This type of task is defined, for instance, on a model like CLIP that was described earlier in this Element – a model that does not have a decoding component. Let us focus on matching for the rest of this section.
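For illustration, such a matching item can be scored with an off-the-shelf CLIP checkpoint; the sketch below assumes the Hugging Face transformers library and a hypothetical local image file cats.jpg:

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# the public checkpoint is downloaded on first use
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cats.jpg")                       # hypothetical local image
captions = ["Cats chase mice.", "Mice chase cats."]  # a candidate caption and its foil

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(probs)   # the caption with the higher score is the model's preferred match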

We will review three recent datasets organized in terms of matching for V&L models, centered around different linguistic phenomena, and how they are handled in models trained for visual grounding: VALSE (Parcalabescu et al., Reference Parcalabescu, Cafagna and Muradjan2022), Winoground (Thrush et al., Reference Thrush, Jiang and Bartolo2022), and ARO (Yuksekgonul et al., Reference Yuksekgonul, Bianchi and Kalluri2022).

VALSE (Vision And Language Structured Evaluation) (Parcalabescu et al., Reference Parcalabescu, Cafagna and Muradjan2022) is a benchmark centered around linguistic phenomena that can be used to evaluate the visio-linguistic grounding of V&L models. Each task of VALSE has the same structure: Given an image, a model needs to distinguish a real caption from a foil. A foil is a modification of a real caption in which a word or phrase is altered. The modification targets a particular linguistic phenomenon and is meant to have consequences for the visual modality as well as for the text itself (i.e., it has truth-conditional impact). VALSE covers the following phenomena: existence, plurality, counting, spatial relations, actions, and entity coreference. For existence, for example, the original caption might be (18a), and its foil will have no inserted, as in (18b). The picture associated with the caption–foil pair will have animals in it. The model should prefer (18a) over (18b) as a match for the picture.

(18)
  a. There are animals shown. (Parcalabescu et al., Reference Parcalabescu, Cafagna and Muradjan2022)
  b. There are no animals shown.

Items for other target linguistic phenomena are structured in the same way.

VALSE data is sourced from existing V&L datasets with matching image–text pairs, with textual foils constructed using a combination of techniques, with additional filters that make sure that the foils are valid, are plausible, and do not exhibit distributional bias – in order to prevent models from solving the task disregarding the image, using just the clues from the text itself. As a final filtering step, the items go through human annotation. The resulting dataset consists of around 7,000 items in total, across linguistic phenomena. Out of five models benchmarked in the paper – CLIP (Radford et al., Reference Radford, Kim and Hallacy2021), LXMERT (Tan & Bansal, Reference Tan and Bansal2019), ViLBERT (Lu et al., Reference Lu, Batra, Parikh and Lee2019), ViLBERT 12-in-1 (Lu et al., Reference Lu, Goswami and Rohrbach2020), and VisualBERT (Li et al., Reference Li, Yatskar and Yin2019) – ViLBERT 12-in-1 shows the best results across the board. As for linguistic phenomena, V&L models are generally able to identify the presence or absence of objects, but struggle with everything else.

Winoground (Thrush et al., Reference Thrush, Jiang and Bartolo2022) is a dataset with the structure of items similar to that of VALSE, allowing for model evaluation in terms of matching between images and text. Unlike in VALSE, the items consist of two images and two captions each. An item (images I0 and I1 and captions C0 and C1) satisfies the Winoground schema if and only if:

  • (C0,I0) and (C1,I1) are a better match (and are preferred as such by annotators) than (C1,I0) and (C0,I1); and

  • C0 and C1 have the same words and/or morphemes but the order differs.

The constraint on pairs of captions having exactly the same words is a consequence of the phenomenon the benchmark is targeting: The focus of Winoground is compositionality in V&L models – that is, how the meaning of the caption is built from the words used in it given the way these words combine with each other. Figure 5 shows an example of a Winoground item.

Figure 5 An example from Winoground (Thrush et al., Reference Thrush, Jiang and Bartolo2022). The images are expected to each match just one of two captions that contain the same words but in a different order: some plants surrounding a lightbulb (left) and a lightbulb surrounding some plants (right).

The dataset was handcrafted by expert annotators and contains 400 items.

Performance on Winoground is computed using three metrics: (1) text score (selecting the correct caption given an image); (2) image score (selecting the correct image given a caption); (3) combination of the two (every combination for a given example must be scored correctly in order for the example to be considered correct).
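Given a model’s matching score s(caption, image), the three metrics can be computed per item roughly as follows (our rendering of these definitions; dataset-level scores are the proportions of items passing each check):

def winoground_scores(s, c0, c1, i0, i1):
    # s(caption, image) is the model's matching score, e.g. a CLIP similarity
    text_ok = s(c0, i0) > s(c1, i0) and s(c1, i1) > s(c0, i1)    # right caption for each image
    image_ok = s(c0, i0) > s(c0, i1) and s(c1, i1) > s(c1, i0)   # right image for each caption
    return text_ok, image_ok, text_ok and image_ok               # the third value is the group score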

Evaluation of a variety of state-of-the-art V&L models on Winoground shows that the models rarely, if at all, outperform chance. This is an indication that, effectively, existing V&L models act based on bag-of-words-like representations.

ARO (Attribution, Relation, and Order) (Yuksekgonul et al., Reference Yuksekgonul, Bianchi and Kalluri2022) is a compositionality benchmark that contains about 50,000 test items and thus is more than ten times larger than Winoground. This allows for statistical exploration of the types of model failures on the subsets of data. ARO data is constructed based on existing datasets (Visual Genome, Krishna et al. Reference Krishna, Zhu and Groth2017; GQA, Hudson and Manning Reference Hudson and Manning2019; COCO, Lin et al. Reference Lin, Maire and Belongie2014; Flickr30k, Young et al., 2014). The benchmark has four components:

  • Visual Genome Relation: Relation participants are swapped in the caption (the man is behind the tree vs. the tree is behind the man).

  • Visual Genome Attribution: Attributes in the caption are swapped (the crouched man and the open door vs. the open man and the crouched door).

  • COCO-Order and Flickr30k-Order: Original captions are linearly perturbed in several different ways.

After testing an array of state-of-the-art V&L models on ARO, the authors confirm that the models fail at capturing any of the targeted phenomena, and basically act as bag-of-word models.

From further experiments, the authors conclude that the widespread contrastive training objective does not give the model the incentive to learn compositional information: For decent performance on typical V&L datasets, it is enough to learn some strategy that shortcuts past compositionality. They further propose a small fix for this problem: introducing hard negatives into training. Hard negatives are examples that are similar to actually matching text–image pairs but differ from them in a way that would only be possible to pin down if compositional information is taken into account. For captions, these involve NP or verb swaps; for images, this is achieved by including images very similar to the target image (according to some encoder, e.g., CLIP) into the batch during training. The goal of this is to enrich the notion of similarity between texts and images that the model develops so that it is more structurally aware. The reported results of such enrichment suggest that this is indeed a direction that can lead to higher compositionality.
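As an illustration of the caption-side procedure, a hard negative can be created by swapping noun phrases in a caption. The sketch below uses spaCy’s noun-chunk detection (assuming the en_core_web_sm model is installed) and is a simplification of the procedures actually used in the paper:

import spacy

nlp = spacy.load("en_core_web_sm")

def swap_noun_phrases(caption):
    # swap the first and last noun chunks to create a hard negative caption
    doc = nlp(caption)
    chunks = list(doc.noun_chunks)
    if len(chunks) < 2:
        return caption
    first, last = chunks[0], chunks[-1]
    return (caption[:first.start_char] + last.text +
            caption[first.end_char:last.start_char] + first.text +
            caption[last.end_char:])

print(swap_noun_phrases("the man is behind the tree"))   # the tree is behind the man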

Finally, good performance of a V&L model does not necessarily mean that the model has learned tightly coupled vision-language representations – it might not be relying on the two modalities symmetrically in its performance (see Frank, Bugliarello, & Elliott Reference Frank, Bugliarello and Elliott2021; Hessel & Lee Reference Hessel and Lee2020; Parcalabescu & Frank Reference Parcalabescu and Frank2022).

4.3 Linguistic Effects of Visual Grounding

General evaluation of V&L models, while providing insight into what models learn as a result of multimodal pretraining, does not answer the question of what impact the additional, visual modality has on linguistic representations. Despite the fact that V&L models are used in a plethora of downstream applications, there is still not a lot of work that directly compares their text representations to those of language-only models.

Studies making such comparisons almost unanimously report advantages of multimodal pretraining for the quality of text representations. Most evidence for the advantages of multimodality comes from similarity judgments. Text embeddings produced by V&L models give rise to similarity scores between pairs of words that correlate systematically better with human similarity judgments than scores from text-only models (De Deyne et al., Reference de Deyne, Navarro, Collell and Perfors2021; Hill, Cho, & Korhonen Reference Hill, Cho and Korhonen2016; see also Baroni Reference Baroni2016).

But there is also work that reports better performance of models equipped with language-to-vision grounding on a whole battery of classic text-only tasks. Tan and Bansal (Reference Tan and Bansal2020) show that both BERT (Devlin et al., Reference Devlin, Chang, Lee, Toutanova, Burstein, Doran and Solorio2019) and RoBERTa (Liu et al., Reference Liu, Ott and Goyal2019), equipped with additional knowledge about visual counterparts of text tokens, outperform their text-only counterparts on all tasks included in the experiment – probably most notably, NLI tasks (QNLI and MNLI benchmarks; see Section 2).

Even though these results pointing in the direction of text representation improvement via visual grounding seem systematic and unanimous, it is often hard to reliably attribute the differences between models to the presence or absence of an additional modality: pretraining datasets for different models very rarely differ minimally (in the presence vs. absence of images), and the textual component of the data also differs quite a lot, captions being a rather special class of texts linguistically. This makes targeted comparison between V&L and text-only models very hard. In particular, different types of texts might be subject to reporting bias to a different extent: Certain properties of objects (e.g., their color) could be under-mentioned in texts across the board, but also tend to be mentioned less or more in different text genres. Additionally, reporting bias is a potential source of weakness of linguistic representations in models trained only on text, but it is hard to disentangle the role of this bias in different aspects of pretraining. Zhang et al. (Reference Zhang, Van Durme, Li and Stengel-Eskin2022) focus specifically on reporting bias and on whether visual grounding helps deal with it. They suggest a way to measure reporting bias by comparing co-occurrence information from text corpora against visual co-occurrences extracted from Visual Genome (Krishna et al., Reference Krishna, Zhu and Groth2017). They introduce the Visual Commonsense Tests (ViComTe) dataset with several property types for more than 5,000 objects. The dataset is exclusively textual and contains templates such as [subj] can be of color [obj], where one of the matching subject–object pairs would be (sky, blue). In a series of experiments, the authors test both V&L and text-only models on the task of matching entities with the correct physical attributes and conclude that visual grounding helps decrease the harm of reporting bias: Multimodal models perform better than text-only ones in reconstructing attribute distributions. Still, they suffer from reporting bias, albeit to a smaller degree. Finally, varying model sizes did not have an effect on performance, which suggests that data is key.

Pezzelle, Takmaz, and Fernández (Reference Pezzelle, Takmaz and Fernández2021) look at V&L versus text-only model performance with particular attention to lexical semantics: Rather than testing text representations across the board, they partition their dataset into concrete versus abstract subsets and make separate comparisons for each of them. Like some of the previous work, they use semantic similarity as the window into representation quality, by comparing similarity measures derived from models to human similarity judgments. The results point in the direction of an advantage of multimodal representations for the concrete lexicon, but not for abstract words.

It is maybe not surprising that the impact of visual grounding is not the same across semantic lexical classes. The detailed landscape of these effects given different lexical semantic properties is still waiting to be explored (see, however, Tikhonov, Bylinina, & Paperno Reference Tikhonov, Bylinina and Paperno2023 for some initial observations).

Among the five models tested by Pezzelle et al. (Reference Pezzelle, Takmaz and Fernández2021), Vokenization (Tan & Bansal, Reference Tan and Bansal2020) exhibits the most robust results. According to the authors, this might be due to the way the visual modality is incorporated into training. Unlike, for example, CLIP (Radford et al., Reference Radford, Kim and Hallacy2021), Vokenization aligns images with text on a token-by-token basis: each text token is paired with a corresponding image. Tentatively, this can lead to more fine-grained grounding than the sentence-level alignment seen in most other models, which might result in less structured linguistic representations. Recall a similar complaint about text-level contrastive pretraining in Yuksekgonul et al. (Reference Yuksekgonul, Bianchi and Kalluri2022), with hard negatives as a way to impose additional structure on linguistic representations.

Overall, different ways of evaluating V&L models seem to give somewhat contradictory results. On the one hand, visual grounding has been demonstrated to systematically improve linguistic representations. On the other hand, as shown by performance on V&L benchmarks that target particular linguistic phenomena that we discussed earlier in this Element, V&L models barely perform above chance. How should one make sense of this apparent contradiction? One possibility is that it boils down to the distinction between lexical and compositional aspects of linguistic representations targeted by different types of tests: Visual grounding helps lexical semantics, but it damages compositional properties of meanings of complex expressions.

To what extent this is a correct empirical characterization or an artifact of the training data or objectives of particular models currently remains an open question (see, e.g., recent work suggesting that lexical representations in V&L models do not obey fundamental constraints on lexical meaning – in particular, that ambiguous words can exhibit two readings at the same time; Rassin, Ravfogel, & Goldberg Reference Rassin, Ravfogel and Goldberg2022).

Grounding and the landscape of its effects on linguistic meaning is an area rich in intriguing open research questions that can be given an empirical turn with the help of deep learning models.

4.4 Interim Conclusion and a Theoretical Note

We discussed deep learning models that connect linguistic and visual modality. We looked into one way of making such a connection and explored the resulting models. Before closing the discussion, a theoretical note is due.

Throughout this section, we treated images as something that a linguistic description can be true or false of. Practically, we looked at the space of possible images as a space of possible situations (worlds, states of affairs), which are related to sentences by the notion of truth. Recall the sketch of the interpretation function discussed in the introduction:

I(A cat is sitting on a chair) = [a picture of a black cat sitting on a chair]    (10)

The function I relates sentences in natural language to states of affairs. This one particular state of affairs with a black cat sitting on a chair is shown by means of a picture in this equation, but that does not mean that the interpretation of the sentence A cat is sitting on a chair involves the picture – it is just a convenient shortcut because we cannot put an actual situation on a page as part of a formula. The picture simply represents it.

In fact, according to a prominent view, pictures themselves are content-bearing objects that can be input to an interpretation function quite like sentences in natural language. In pictorial semantics, pictures can be true or false with respect to a world and a bunch of additional parameters – quite like sentences in natural language semantics (Abusch, Reference Abusch, Gutzmann, Matthewson and Meier2020; Greenberg, Reference Greenberg2013, Reference Greenberg2021; Schlenker, Reference Schlenker2018):

(19) Truth of a picture (simplified from Schlenker Reference Schlenker2018)
A picture P is true in world w relative to viewpoint v along the system of projection S iff w projects to P from viewpoint v along S, or, in other words: proj_S(w, v) = P.

This setup does not support pictures as an interpretation space for language. Rather, pictures and language are two different types of content-bearing systems with (partially) shared mechanisms of semantic mapping on to an external interpretation space. This might seem like theoretical nitpicking, but taking the interpretational relation between different modalities seriously has the potential to guide architectures and analyses of models involving extralinguistic grounding. A connection between this theoretical view and practical work in shared V&L representations in deep learning models is waiting to be made.

5 Conclusions, Open Problems, and Further Directions

Our Element described the general landscape of semantics-related research in the field of deep learning. Deep learning allows us to develop computational models for what semanticists care about: (compositional) meaning representations, reasoning based on these representations, and language grounding in (visual) reality. State-of-the-art deep learning models treat these tasks in quite crude ways, but they are constantly improving and achieving good results on current evaluation benchmarks, which themselves become more and more sophisticated and hard to fool with simple shortcuts.Footnote 18

Having in mind that our readers would typically have background either in NLP/computational linguistics or in theoretical semantics, our conclusions and thoughts prompted by our discussion could fall into two groups as well: (1) further directions of progress in semantic technologies; (2) relevance of deep learning models for research in theoretical semantics and for language theory in general.

5.1 The Future of Semantic Technologies

Training efficiency.     While modern deep learning models often show something like compositional behavior, they seem to achieve this in a nonhuman way. In particular, a lot of training data is required. Future progress in deep learning will permit achieving compositional solutions from smaller data. Few-shot learning in LLMs is already a step in that direction; however, the amounts of data and computation necessary for a quality model make this approach unsustainable.

Better evaluation.     There is a need for a consensus on a principled set of evaluation criteria for fundamental semantic phenomena like compositionality or semantic inference. Linguists and philosophers of language can potentially have a significant impact here for the AI enterprise as a whole.

At the same time, expert-generated handcrafted datasets are usually relatively small in size and lack diversity. One natural direction in semantic evaluation is to use ensembles of datasets of different types.

Agent-oriented perspective.     So far, the bulk of deep learning approaches in computational treatment of language focus on the modeling perspective. This can be in the context of language modeling, which creates a probabilistic model of the text, or language and vision modeling, which leans somewhat more in the direction of grounded semantics, with images serving as models for a textual description. However, the agency of the speaker largely remains out of focus. As a result, a wealth of meaning-related phenomena in language within the domains of deixis and pragmatics escapes researchers’ attention. We expect this to limit the modeling of natural communication within AI systems. In the future, we expect further breakthroughs in the field to take into account a more complex communicative situation including the speaker’s agency and intent. Linguists should take the lead in showing the way forward in these fields and designing datasets for development and testing relevant computational models.

From classification to structure prediction.     While interpretation and semantic inference are processes, a lot of semantic NLP tasks are framed as classification. This framing only takes the final result into account and ignores the inner workings of the process. As a result, DNNs often learn shortcuts to predict correct inference classes instead of sound algorithms.

We can force deep learning systems to learn the sound, faithful reasoning behind the correct labels by making them learn the reasoning process that leads to the gold label. This automatically yields systems that are inherently explainable.

Learning proofs together with inference labels has recently been pursued by Clark, Tafjord, and Richardson (Reference Clark, Tafjord and Richardson2021), Saha et al. (Reference Saha, Ghosh, Srivastava and Bansal2020), and Tafjord, Dalvi, and Clark (Reference Tafjord, Dalvi and Clark2021). Unfortunately, there is no reliable automatic way to evaluate system-generated explanations of this type. We think that this is an important direction for future work.

Methods for comprehensive dataset creation.     Due to the high demand for large data, all large (>10,000 items) semantics datasets are created with the help of crowdsourcing, data recasting, or automatic generation of synthetic data. These methods are not practical for designing high-coverage datasets with comprehensive semantics-aware annotations. Collecting such datasets requires well-developed annotation guidelines and a group of trained annotators. This is feasible in current settings, as we already have examples of such large datasets with expert annotations: Universal Dependencies (Nivre et al., Reference Nivre, de Marneffe and Ginter2020), the Parallel Meaning Bank (Abzianidze et al., Reference Abzianidze, Bjerva and Evang2017), and the Abstract Meaning Representation corpora (Banarescu et al., Reference Banarescu, Bonial and Cai2013).

Recently, Dalvi et al. (Reference Dalvi, Jansen and Tafjord2021) collected about 1,800 multistep entailment trees, that is, proof trees in which the child nodes collectively entail the parent node.Footnote 19 Putting more resources into developing such annotation-rich datasets targeting semantic phenomena, and leveraging crowdsourcing for this purpose, will result in more comprehensive training and evaluation.

5.2 The Future of Semantic Research

Between language, neural models, and linguistic theory. There is a lot of work in the general field of deep net interpretability that probes the linguistic knowledge of LMs (see, e.g., Rogers, Kovaleva, and Rumshisky Reference Rogers, Kovaleva and Rumshisky2020 for an overview). These are experiments that establish the degree, quality, and limits of linguistic generalizations exhibited by, for example, models like BERT or GPT. Despite the growing amount of such work, its results have barely had any consequence for theories and analyses in theoretical linguistics, including theoretical semantics.

The reason, we believe, is mainly methodological: What is the place of the results of LM interpretability experiments in the process of constructing or revising linguistic theories? If a certain linguistic property of LM representations is discovered as a result of probing, why would language theory care? After all, this does not say anything about how people represent language, at least not directly.

There are several potential answers to this methodological stumbling block.

DNNs as theories. The first potential answer suggests treating models themselves as linguistic theories, albeit very different from the ones we are used to in theoretical linguistics at the moment (Baroni, Reference Baroni and Lappin2022). Models’ representations and weights that result from exposure to training data can be seen as ways of making sense of this data that also come with means to make predictions about new data (e.g., expectations about sentence acceptability). In this way, LMs can be seen as algorithmic linguistic theories. Manipulating different properties of models and training data in different ways and exploring the effects of such manipulations on the resulting “algorithmic theory” can uncover causal links between prominent generalizations and data or structures that trigger them.

Modeling of acquisition and learnability. This leads to learnability and language acquisition – another area where deep learning can be particularly helpful. Artificial learners such as deep nets can be used in testing which settings or which types of data or learning curricula lead to more humanlike language acquisition trajectories and results (Warstadt & Bowman, Reference Warstadt and Bowman2022). This, in turn, allows us to reverse-engineer hypotheses about mechanisms used by human language learners.

Finally, uncovering systematic misalignments between linguistic “knowledge” of neural LMs and implicit generalizations that guide humans’ linguistic behavior is important for the learnability debate (Davis, Reference Davis2022). Are there aspects of language, and, in particular, linguistic meanings, that can never be learned successfully by learning agents only exposed to texts, regardless of the model architecture or the amount of data? If yes, what do these aspects of language rely on? Would visual grounding be enough for successful learning? Maybe some meanings crucially rely on world knowledge or communicative reinforcement. These are all questions that are crucial for shaping our theories of meaning and language, and deep learning models provide rich experimental ground for theoretical advances in this domain.

Acknowledgments

We would like to thank – in alphabetical order – Albert Gatt, Dmitry Kourmyshov, Emar Maier, Timothee Mickus, Rick Nouwen, and Joost Zwarts for discussions, help with illustrations, and feedback on previous drafts. We would also like to thank anonymous reviewers who read an earlier version of this Element and gave us illuminating feedback. This allowed us to improve the text in many ways. Last but certainly not least we would like to thank Jonathan Ginzburg and Daniel Lassiter, the editors of Cambridge Elements in Semantics, for their patience, comments, and encouragement. Any remaining issues are our own. Part of this work was conducted while Lisa Bylinina held a position at the Center for Language and Cognition, University of Groningen. Lasha Abzianidze was partially supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 742204).

  • Jonathan Ginzburg

  • Université Paris-Cité

  • Jonathan Ginzburg is Professor of Linguistics at Université Paris-Cité (formerly Paris 7). He has held appointments at the Hebrew University of Jerusalem and King’s College, London. He is one of the founders and currently associate editor of the journal Dialogue and Discourse. His research interests include semantics, dialogue, and language acquisition. He is the author of Interrogative Investigations (CSLI Publications, 2001, with Ivan A. Sag) and The Interactive Stance: Meaning for Conversation (Oxford University Press, 2012).

  • Daniel Lassiter

  • University of Edinburgh

  • Daniel Lassiter is Senior Lecturer in Semantics in Linguistics & English Language at the University of Edinburgh. He works on topics at the intersection of formal semantics/pragmatics, cognitive psychology, and philosophy of language, including modality, conditionals, vagueness, scalar semantics, and Bayesian pragmatics. He is the author of Graded Modality (Oxford University Press, 2017) and numerous journal articles.

About the Series

  • Elements in Semantics emphasizes the field’s recent flourishing of interdisciplinary work, connecting linguistics and philosophy with cognitive science, computer science, neuroscience, law, anthropology, sociology, economics, and beyond. The series should be of interest to a broad community of researchers interested in the study of meaning from diverse perspectives.

Footnotes

* The authors are listed in alphabetical order. Given the authors’ equal contribution, each author has a right to list themselves as a first author when citing this Element. Main author of Section 2 “Textual Inference”: Lasha Abzianidze; main author of Section 3 “Compositionality”: Denis Paperno; main author of Section 4 “Grounding: Language and Vision”: Lisa Bylinina. Introduction and Conclusions: equal contribution.

1 See our public repository at https://github.com/kovvalsky/SemDL for code and demos for the three phenomena discussed in this Element.

3 For a visual demonstration, see https://platform.openai.com/tokenizer.

5 This argument applies to a different extent to pure LMs (trained exclusively for next-word prediction) and to models that underwent additional training on potentially more semantically grounding tasks, such as reinforcement learning with human feedback (Ouyang et al., Reference Ouyang, Wu and Jiang2022; Touvron et al., Reference Touvron, Martin and Stone2023) or natural language inference (Section 2). We thank a reviewer for this point.

6 A shared challenge or a shared task in NLP is a competition among NLP systems where systems are designed to tackle a common NLP problem. The shared task organizers usually provide training and test data for participant systems.

8 Similar instructions were shown to crowd-worker annotators of MNLI, but the word photo was replaced with situation or event as, unlike SNLI, MNLI contains sentences in various text genres.

9 Despite this, there are several works (we refrain from explicitly mentioning them) that overlook this mismatch between the interpretations of contradiction and jointly use these datasets for training and evaluation.

10 A majority baseline always predicts the most common label in a training dataset.

11 Note that the average might result in a probability close to 0.5 if annotators provide mixed estimates close to 0 and 1. To avoid such undesired results, one could opt for the mode or median of the estimates or simply drop the inference problems with such mixed judgments.
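A small numerical illustration of this point, with invented annotator estimates:

```python
import statistics

# Hypothetical per-annotator probability estimates for a single problem.
estimates = [0.05, 0.1, 0.9, 0.9, 0.95]
print(statistics.mean(estimates))    # ≈ 0.58: pulled toward the uninformative middle
print(statistics.median(estimates))  # 0.9: preserves the majority tendency
```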

12 The results quickly get outdated given the fast progress in the field.

14 Here, we adopt the quantifier scoping that follows the quantifier order in the surface form and yet yields a sensible semantic reading.

15 MED was preceded by the fully automatically generated monotonicity inference dataset HELP (Yanaka et al., Reference Yanaka, Mineshima and Bekki2019b). Because the automatic generation introduces some noise into the inference labels and affects the naturalness of the sentences, HELP is intended to be used as training data.

16 White et al. (Reference White, Rastogi, Duh and Van Durme2017), Poliak, Haldar, et al. (Reference Poliak, Haldar and Rudinger2018), and Vashishtha et al. (Reference Vashishtha, Poliak and Lal2020) together recast twenty datasets of other NLP tasks into inference dataset format. Their datasets cover phenomena such as temporal reasoning, event factuality, anaphora resolution, and semantic roles. However, the recast datasets have somewhat unnatural or uniformly structured Hypotheses.

17 It also gradually got a new name, natural language inference (NLI), partially due to these dataset names and terminology used in the corresponding papers.

18 See code illustrations for the topics of this Element at https://github.com/kovvalsky/SemDL.

19 As they report, it took in total circa 600 hours of work carried out by three graduate and undergraduate annotators.

References

Abdou, M., Kulmizev, A., Hershcovich, D., et al. (2021). Can language models encode perceptual structure without grounding? A case study in color. arXiv:2109.06129.Google Scholar
Abusch, D. (2020). Possible-worlds semantics for pictures. In Gutzmann, D., Matthewson, L., Meier, C., et al., eds., The Wiley Blackwell companion to semantics (pp. 131). Wiley Blackwell.Google Scholar
Abzianidze, L. (2016). Natural solution to fracas entailment problems. In Proceedings of *SEM (pp. 6474). ACL.Google Scholar
Abzianidze, L., Bjerva, J., Evang, K., et al. (2017). The Parallel Meaning Bank: Towards a multilingual corpus of translations annotated with compositional meaning representations. In Proceedings of EACL (pp. 242247). ACL.Google Scholar
Abzianidze, L., Zwarts, J., & Winter, Y. (2023). SpaceNLI: Evaluating the consistency of predicting inferences in space. In Proceedings of NALOMA (pp. 1224). ACL.Google Scholar
Agirre, E., Cer, D., Diab, M., & Gonzalez-Agirre, A. (2012). Semeval-2012 task 6: A pilot on semantic textual similarity. In Proceedings of Semeval (pp. 385393).Google Scholar
Akyürek, E., & Andreas, J. (2022). Compositionality as lexical symmetry. arXiv:2201.12926.Google Scholar
Andreas, J. (2019a). Good-enough compositional data augmentation. arXiv:1904.09545.CrossRefGoogle Scholar
Andreas, J. (2019b). Measuring compositionality in representation learning. In International conference on learning representations. https://openreview.net/forum?id=HJz05o0qK7.Google Scholar
Antol, S., Agrawal, A., Lu, J., et al. (2015). Vqa: Visual question answering. In Proceedings of IEEE/CVFICCV (pp. 24252433).Google Scholar
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473.Google Scholar
Banarescu, L., Bonial, C., Cai, S., et al. (2013). Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse (pp. 178186). ACL.Google Scholar
Baroni, M. (2016). Grounding distributional semantics in the visual world. Language and Linguistics Compass, 10(1), 313.CrossRefGoogle Scholar
Baroni, M. (2022). On the proper role of linguistically-oriented deep net analysis in linguistic theorizing. In Lappin, S., ed., Algebraic systems and the representation of linguistic knowledge (pp. 522). Taylor and Francis.Google Scholar
Baroni, M., Dinu, G., & Kruszewski, G. (2014). Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of ACL (pp. 238247). ACL.Google Scholar
Baroni, M., & Zamparelli, R. (2010). Nouns are vectors, adjectives are matrices: Representing adjective–noun constructions in semantic space. In Proceedings of EMNLP (pp. 11831193).Google Scholar
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59(1), 617645.CrossRefGoogle ScholarPubMed
Bender, E. M., & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of ACL (pp. 51855198). ACL.Google Scholar
Bernardi, R., Dinu, G., Marelli, M., & Baroni, M. (2013). A relatedness benchmark to test the role of determiners in compositional distributional semantics. In Proceedings of ACL (pp. 5357). ACL.Google Scholar
Bernardi, R., & Pezzelle, S. (2021). Linguistic issues behind visual question answering. Language and Linguistics Compass, 15(6), e12417.CrossRefGoogle ScholarPubMed
Bernardy, J.-P. (2018). Can recurrent neural networks learn nested recursion? Linguistic Issues in Language Technology, 16(1). https://aclanthology.org/2018.lilt-16.1.CrossRefGoogle Scholar
Bernardy, J.-P., & Chatzikyriakidis, S. (2021). Applied temporal analysis: A complete run of the FraCaS test suite. In Proceedings of IWCS (pp. 1120). ACL.Google Scholar
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135146.CrossRefGoogle Scholar
Boleda, G., Baroni, M., McNally, L., et al. (2013). Intensionality was only alleged: On adjective-noun composition in distributional semantics. In Proceedings of IWCS.Google Scholar
Botha, J., & Blunsom, P. (2014). Compositional morphology for word representations and language modelling. In Xing, E. P. & Jebara, T., eds., Proceedings of the 31st international conference on machine learning (Vol. 32, no. 2) (pp. 18991907). https://proceedings.mlr.press/v32/botha14.html.Google Scholar
Bowman, S. R., Angeli, G., Potts, C., & Manning, C. D. (2015). A large annotated corpus for learning natural language inference. In Proceedings of EMNLP (pp. 632642).CrossRefGoogle Scholar
Bowman, S. R., & Dahl, G. (2021). What will it take to fix benchmarking in natural language understanding? In Proceedings of NAACL. ACL.Google Scholar
Brown, T., Mann, B., Ryder, N., et al. (2020). Language models are few-shot learners. In Larochelle, H., Ranzato, M., Hadsell, R., et al., eds., Advances in neural information processing systems (Vol. 33, pp. 18771901). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.Google Scholar
Burgess, C., & Lund, K. (1995). Hyperspace analogue to language (HAL): A general model of semantic memory. In Annual meeting of the psychonomic society.Google Scholar
Bylinina, L., & Tikhonov, A. (2022). The driving forces of polarity-sensitivity: Experiments with multilingual pre-trained neural language models. In Proceedings of COGSCI (Vol. 44).Google Scholar
Cer, D., Yang, Y., Kong, S.-Y., et al. (2018). Universal sentence encoder. arXiv:1803.11175.Google Scholar
Chaabouni, R., Dessì, R., & Kharitonov, E. (2021). Can Transformers jump around right in natural language? Assessing performance transfer from scan. In Proceedings of blackboxnlp (pp. 136148).Google Scholar
Chaabouni, R., Kharitonov, E., Bouchacourt, D., et al. (2020). Compositionality and generalization in emergent languages. arXiv:2004.09124.Google Scholar
Chan, S. C., Santoro, A., Lampinen, A. K., et al. (2022). Data distributional properties drive emergent few-shot learning in transformers. arXiv:2205.05055.Google Scholar
Chatzikyriakidis, S., Cooper, R., Dobnik, S., & Larsson, S. (2017). An overview of natural language inference data collection: The way forward? In Proceedings of the computing natural language inference workshop.Google Scholar
Chen, T., Jiang, Z., Poliak, A., et al. (2020). Uncertain natural language inference. In Proceedings of ACL. ACL.CrossRefGoogle Scholar
Chen, Z. (2021). Attentive tree-structured network for monotonicity reasoning. In Proceedings of NALOMA (pp. 1221). ACL.Google Scholar
Chen, Z., Gao, Q., & Moss, L. S. (2021). NeuralLog: Natural language inference with joint neural and logical reasoning. In Proceedings of *SEM. ACL.Google Scholar
Cho, K., Van Merriënboer, B., Gulcehre, C., et al. (2014). Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv:1406.1078.Google Scholar
Chowdhery, A., Narang, S., Devlin, J., et al. (2022). Palm: Scaling language modeling with pathways. arXiv:2204.02311.Google Scholar
Clark, H. H. (1996). Using language. Cambridge University Press.CrossRefGoogle Scholar
Clark, P., Tafjord, O., & Richardson, K. (2021). Transformers as soft reasoners over language. In Proceedings of IJCAI.Google Scholar
Clark, S., Coecke, B., & Sadrzadeh, M. (2008). A compositional distributional model of meaning. In Proceedings of the second quantum interaction symposium (qi-2008) (pp. 133140).Google Scholar
Condoravdi, C., Crouch, D., de Paiva, V., et al. (2003). Entailment, intensionality and text understanding. In Proceedings of the HLT-NAACL 2003 workshop on text meaning (pp. 3845).CrossRefGoogle Scholar
Cooper, R., Crouch, D., Eijck, J. V., et al. (1996). Fracas: A framework for computational semantics. Deliverable D16.Google Scholar
Coppock, E., & Champollion, L. (2022). Invitation to formal semantics. Manuscript, Boston University and New York University.Google Scholar
Dagan, I., Glickman, O., & Magnini, B. (2006). The Pascal recognising textual entailment challenge. In Proceedings of the Pascal challenges workshop on recognising textual entailment (pp. 177190). Springer.Google Scholar
Dagan, I., Roth, D., Sammons, M., & Zanzotto, F. M. (2013). Recognizing textual entailment: Models and applications. Morgan & Claypool.CrossRefGoogle Scholar
Dalvi, B., Jansen, P., Tafjord, O., et al. (2021). Explaining answers with entailment trees. In Proceedings of EMNLP (pp. 73587370). ACL.Google Scholar
Dankers, V., Bruni, E., & Hupkes, D. (2022). The paradox of the compositionality of natural language: A neural machine translation case study. In Proceedings of ACL (pp. 41544175). ACL.Google Scholar
Davis, F. (2022). On the limitations of data: Mismatches between neural models of language and humans (Unpublished doctoral dissertation). Cornell University.Google Scholar
de Deyne, S., Navarro, D. J., Collell, G., & Perfors, A. (2021). Visual and affective multimodal models of word meaning in language and mind. Cognitive Science, 45(1). 144. https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.12922.CrossRefGoogle ScholarPubMed
de Marneffe, M.-C., Rafferty, A. N., & Manning, C. D. (2008). Finding contradictions in text. In Proceedings of ACL (pp. 10391047). ACL.Google Scholar
de Marneffe, M.-C., Simons, M., & Tonhauser, J. (2019). The commitmentbank: Investigating projection in naturally occurring discourse. Proceedings of Sinn und Bedeutung, 23(2), 107124.Google Scholar
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Burstein, J., Doran, C., & Solorio, T., eds., Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers) (pp. 41714186). Association for Computational Linguistics. https://aclanthology.org/N19-1423. https://doi.org/10.18653/v1/N19-1423.Google Scholar
Dima, C., de Kok, D., Witte, N., & Hinrichs, E. (2019). No word is an island: A transformation weighting model for semantic composition. Transactions of the Association for Computational Linguistics, 7, 437451.CrossRefGoogle Scholar
Drozdov, A., Schärli, N., Akyuürek, E., et al. (2022). Compositional semantic parsing with large language models. arXiv:2209.15003.Google Scholar
Du, L., Ding, X., Xiong, K., Liu, T., & Qin, B. (2022). Enhancing pretrained language models with structured commonsense knowledge for textual inference. Knowledge-Based Systems, 109488.CrossRefGoogle Scholar
Du, Y., Li, S., & Mordatch, I. (2020). Compositional visual generation with energy based models. In Neurips (Vol. 33, pp. 66376647). Curran Associates, Inc.Google Scholar
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179211.CrossRefGoogle Scholar
Ettinger, A. (2020). What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models. Transactions of the Association for Computational Linguistics, 8, 3448.CrossRefGoogle Scholar
Ettinger, A., Elgohary, A., Phillips, C., & Resnik, P. (2018). Assessing composition in sentence vector representations. In Proceedings of Coling (pp. 17901801). ACL.Google Scholar
Fitch, F. B. (1973). Natural deduction rules for English. Philosophical Studies, 24(2), 89104.CrossRefGoogle Scholar
Frank, S., Bugliarello, E., & Elliott, D. (2021). Vision-and-language or vision-for-language? On cross-modal influence in multimodal transformers. In Proceedings of EMNLP (pp. 98479857). Association for Computational Linguistics.Google Scholar
Gal, R., Alaluf, Y., Atzmon, Y., Patashnik, O., Bermano, A. H., Chechik, G., & Cohen-Or, D. (2022). An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv.Google Scholar
Gamallo, P. (2021). Compositional distributional semantics with syntactic dependencies and selectional preferences. Applied Sciences, 11(12), 113.CrossRefGoogle Scholar
Gardenfors, P. (2004). Conceptual spaces as a framework for knowledge representation. Mind and Matter, 2(2), 927.Google Scholar
Gatti, D., Marelli, M., Vecchi, T., & Rinaldi, L. (2022). Spatial representations without spatial computations. Psychological Science, 33(11), 19471958. https://doi.org/10.1177/09567976221094863.CrossRefGoogle ScholarPubMed
Geiger, A., Cases, I., Karttunen, L., & Potts, C. (2018). Stress-testing neural models of natural language inference with multiply-quantified sentences. arXiv.Google Scholar
Geiger, A., Richardson, K., & Potts, C. (2020). Neural natural language inference models partially embed theories of lexical entailment and negation. In Proceedings of blackboxnlp (pp. 163173).CrossRefGoogle Scholar
Giampiccolo, D., Magnini, B., Dagan, I., & Dolan, B. (2007). The third PASCAL recognizing textual entailment challenge. In Proceedings of the ACL-PASCAL workshop on textual entailment and paraphrasing. ACL.Google Scholar
Gleitman, L. (1990). The structural sources of verb meanings. Language Acquisition, 1(1), 355.CrossRefGoogle Scholar
Gleitman, L. R., Cassidy, K., Nappa, R., Papafragou, A., & Trueswell, J. C. (2005). Hard words. Language Learning and Development, 1(1), 2364.CrossRefGoogle Scholar
Goyal, Y., Khot, T., Summers-Stay, D., Batra, D., & Parikh, D. (2017). Making the v in vqa matter: Elevating the role of image understanding in visual question answering. In Proceedings of IEEE/CVF CVPR (pp. 69046913).CrossRefGoogle Scholar
Greenberg, G. (2013). Beyond resemblance. Philosophical Review, 122(2).CrossRefGoogle Scholar
Greenberg, G. (2021). Semantics of pictorial space. Review of Philosophy and Psychology, 12(4), 847887.CrossRefGoogle Scholar
Grefenstette, E., Dinu, G., Zhang, Y.- Z., et al. (2013). Multi-step regression learning for compositional distributional semantics. arXiv:1301.6939.Google Scholar
Guevara, E. R. (2011). Computing semantic compositionality in distributional semantics. In Proceedings of the ninth international conference on computational semantics (pp. 135144).Google Scholar
Gururangan, S., Swayamdipta, S., Levy, O., et al. (2018). Annotation artifacts in natural language inference data. In Proceedings of NAACL (pp. 107112). ACL.Google Scholar
Guu, K., Lee, K., Tung, Z., et al. (2020). Realm: Retrieval-augmented language model pre-training. arXiv.Google Scholar
Hacquard, V., & Lidz, J. (2022). On the acquisition of attitude verbs. Annual Review of Linguistics, 8, 193212.CrossRefGoogle Scholar
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1–3), 335346.CrossRefGoogle Scholar
Harris, R. A. (1993). The linguistics wars. Oxford University Press on Demand.CrossRefGoogle Scholar
Hartmann, M., de Lhoneux, M., Hershcovich, D., et al. (2021). A multilingual benchmark for probing negation-awareness with minimal pairs. In Proceedings of CONLL (pp. 244257). ACL.Google Scholar
Hawthorne, C., Jaegle, A., Cangea, C., et al. (2022). General-purpose, long-context autoregressive modeling with perceiver ar. arXiv:2202.07765.Google Scholar
He, Q., Wang, H., & Zhang, Y. (2020). Enhancing generalization in natural language inference by syntax. In Findings of EMNLP. ACL.Google Scholar
Hessel, J., & Lee, L. (2020). Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think! In Proceedings of EMNLP (pp. 861877).Google Scholar
Hill, F., Cho, K., & Korhonen, A. (2016). Learning distributed representations of sentences from unlabelled data. In Proceedings of NAACL (pp. 13671377).CrossRefGoogle Scholar
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 17351780.CrossRefGoogle ScholarPubMed
Hofmann, V., Pierrehumbert, J. B., & Schütze, H. (2021). Superbizarre is not superb: Derivational morphology improves BERT’s interpretation of complex words. arXiv:2101.00403.Google Scholar
Hong, R., Liu, D., Mo, X., et al. (2019). Learning to compose and reason with language tree structures for visual grounding. IEEE transactions on pattern analysis and machine intelligence.Google Scholar
Hossain, M. M., Kovatchev, V., Dutta, P., et al. (2020). An analysis of natural language inference benchmarks through the lens of negation. In Proceedings of EMNLP (pp. 91069118). ACL.Google Scholar
Hu, H., Chen, Q., & Moss, L. (2019). Natural language inference with monotonicity. In Proceedings of IWCS (pp. 815). ACL.Google Scholar
Hudson, D. A., & Manning, C. D. (2018). Compositional attention networks for machine reasoning. In International Conference on Learning Representations.Google Scholar
Hudson, D. A., & Manning, C. D. (2019). Gqa: A new dataset for real-world visual reasoning and compositional question answering. In Proceedings of IEEE/CVF CVPR (pp. 67006709).CrossRefGoogle Scholar
Hupkes, D., Dankers, V., Mul, M., & Bruni, E. (2020). Compositionality decomposed: How do neural networks generalise? Journal of Artificial Intelligence Research, 67, 757795.CrossRefGoogle Scholar
Hupkes, D., Giulianelli, M., Dankers, V., et al. (2022). State-of-the-art generalisation research in NLP: A taxonomy and review.Google Scholar
Hupkes, D., Veldhoen, S., & Zuidema, W. (2018). Visualisation and “diagnostic classifiers” reveal how recurrent and recursive neural networks process hierarchical structure. Journal of Artificial Intelligence Research, 61(1), 907926.CrossRefGoogle Scholar
Icard, T. F. (2012). Inclusion and exclusion in natural language. Studia Logica, 100(4), 705725.CrossRefGoogle Scholar
Icard, T. F., & Moss, L. S. (2014). Recent progress on monotonicity. LILT, 9.Google Scholar
Irsoy, O., & Cardie, C. (2014). Deep recursive neural networks for compositionality in language. NeurIPS, 27, 19.Google Scholar
Jeretic, P., Warstadt, A., Bhooshan, S., & Williams, A. (2020). Are natural language inference models IMPPRESsive? Learning IMPlicature and PRESupposition. In Proceedings of ACL (pp. 86908705). ACL.Google Scholar
Jiang, N., & de Marneffe, M.-C. (2019). Evaluating BERT for natural language inference: A case study on the CommitmentBank. In Proceedings of EMNLP–IJCNLP (pp. 60866091). ACL.Google Scholar
Jinman, Z., Zhong, S., Zhang, X., & Liang, Y. (2020). Pbos: Probabilistic bag-of-subwords for generalizing word embedding. arXiv:2010.10813.Google Scholar
Johnson, J., Hariharan, B., Van der Maaten, L., et al. (2017). Clevr: A diagnostic dataset for compositional language and elementary visual reasoning. In Proceedings of IEEE/CVF CVPR (pp. 29012910).CrossRefGoogle Scholar
Jumelet, J., Denic, M., Szymanik, J., et al. (2021). Language models use monotonicity to assess NPI licensing. In Findings of ACL–IJCNLP (pp. 49584969). ACL.Google Scholar
Kalouli, A.-L., Hu, H., Webb, A. F., et al. (2023). Curing the SICK and other NLI maladies. Computational Linguistics, 49(1), 199243.CrossRefGoogle Scholar
Kalouli, A.-L., Real, L., & de Paiva, V. (2017). Textual inference: Getting logic from humans. In Proceedings of IWCS.Google Scholar
Kartsaklis, D., Sadrzadeh, M., & Pulman, S. (2013). Separating disambiguation from composition in distributional semantics. In Proceedings of CONLL (pp. 114123). ACL.Google Scholar
Kassner, N., & Schütze, H. (2020). Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly. In Proceedings of ACL (pp. 78117818). ACL.Google Scholar
Khan, S., Naseer, M., Hayat, M., et al. (2021). Transformers in vision: A survey. ACM computing surveys (CSUR).Google Scholar
Kiela, D., Bulat, L., & Clark, S. (2015). Grounding semantics in olfactory perception. In Proceedings of ACL (pp. 231236).CrossRefGoogle Scholar
Kim, N., & Linzen, T. (2020). Cogs: A compositional generalization challenge based on semantic interpretation. In Proceedings of EMNLP.CrossRefGoogle Scholar
Kim, N., & Schuster, S. (2023). Entity tracking in language models. In Proceedings of ACL (pp. 38353855). ACL.Google Scholar
Kirby, S., Cornish, H., & Smith, K. (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences, 105(31), 1068110686.CrossRefGoogle Scholar
Kirby, S., Tamariz, M., Cornish, H., & Smith, K. (2015). Compression and communication in the cultural evolution of linguistic structure. Cognition, 141, 87102.CrossRefGoogle ScholarPubMed
Kober, T., Bijl de Vroe, S., & Steedman, M. (2019). Temporal and aspectual entailment. In Proceedings of IWCS (pp. 103119). ACL.Google Scholar
Kracht, M. (2011). Interpreted languages and compositionality (Vol. 89). Springer Science & Business Media.CrossRefGoogle Scholar
Kratzer, A., & Heim, I. (1998). Semantics in generative grammar (Vol. 1185). Blackwell Oxford.Google Scholar
Krishna, R., Zhu, Y., Groth, O., et al. (2017). Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1), 3273.CrossRefGoogle Scholar
Kudo, T., & Richardson, J. (2018). Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing.CrossRefGoogle Scholar
Lai, A., & Hockenmaier, J. (2014). Illinois-LH: A denotational and distributional approach to semantics. In Proceedings of SemEval (pp. 329334). ACL.Google Scholar
Lake, B., & Baroni, M. (2018). Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. In ICML.Google Scholar
Lakoff, G. (1970). Linguistics and natural logic. Synthese, 22(1), 151271.CrossRefGoogle Scholar
Landau, B., & Gleitman, L. R. (1985). Language and experience: Evidence from the blind child. Harvard University Press.Google Scholar
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211240.CrossRefGoogle Scholar
Lazaridou, A., Marelli, M., Zamparelli, R., & Baroni, M. (2013). Compositional-ly derived representations of morphologically complex words in distributional semantics. In Proceedings of ACL (pp. 15171526).Google Scholar
Le, P., & Zuidema, W. (2015). Compositional distributional semantics with long short term memory. arXiv:1503.02510.Google Scholar
Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. NeurIPS, 27, 19.Google Scholar
Lewis, D. (1970). General semantics. Synthese, 22(1/2), 1867.CrossRefGoogle Scholar
Li, B. Z., Nye, M., & Andreas, J. (2021). Implicit representations of meaning in neural language models. arXiv:2106.00737.Google Scholar
Li, F., Zhang, H., Zhang, Y.-F., et al. (2022). Vision-language intelligence: Tasks, representation learning, and large models. arXiv:2203.01922.Google Scholar
Li, L. H., Yatskar, M., Yin, D., et al. (2019). Visualbert: A simple and performant baseline for vision and language. arXiv:1908.03557.Google Scholar
Lin, T.-Y., Maire, M., Belongie, S., et al. (2014). Microsoft Coco: Common objects in context. In European conference on computer vision (pp. 740755).Google Scholar
Lin, Z., Feng, M., dos Santos, C. N., et al. (2017). A structured self-attentive sentence embedding. In ICLR.Google Scholar
Linzen, T., & Baroni, M. (2021). Syntactic structure from deep learning. Annual Review of Linguistics, 7, 195212.CrossRefGoogle Scholar
Liu, A., Wu, Z., Michael, J., et al. (2023). We’re afraid language models aren’t modeling ambiguity. In Bouamor, H., Pino, J., & Bali, K., eds., Proceedings of the 2023 conference on empirical methods in natural language processing (pp. 790807). Association for Computational Linguistics. https://aclanthology.org/2023.emnlp-main.51. https://doi.org/10.18653/v1/2023.emnlp-main.51.CrossRefGoogle Scholar
Liu, Y., Ott, M., Goyal, N., et al. (2019). Roberta: A robustly optimized BERT pretraining approach. arXiv:1907.11692.Google Scholar
Lu, J., Batra, D., Parikh, D., & Lee, S. (2019). Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. NeurIPS.Google Scholar
Lu, J., Goswami, V., Rohrbach, M., et al. (2020). 12-in-1: Multi-task vision and language representation learning. In Proceedings of IEEE/CVF CVPR (pp. 1043710446).Google Scholar
Luong, M.- T., Socher, R., & Manning, C. D. (2013). Better word representations with recursive neural networks for morphology. In Proceedings of CONLL.Google Scholar
MacCartney, B., & Manning, C. D. (2007). Natural logic for textual inference. In Proceedings of the ACL-PASCAL workshop on textual entailment and paraphrasing (pp. 193200). ACL.CrossRefGoogle Scholar
MacCartney, B., & Manning, C. D. (2009). An extended model of natural logic. In Proceedings of IWCS (pp. 140156). ACL.Google Scholar
Mao, J., Huang, J., Toshev, A., et al. (2016). Generation and comprehension of unambiguous object descriptions. In Proceedings of IEEE/CVF CVPR (pp. 1120).Google Scholar
Marelli, M., Menini, S., Baroni, M., et al. (2014). A sick cure for the evaluation of compositional distributional semantic models. In Proceedings of LREC (pp. 216223).Google Scholar
Margolis, E. E., & Laurence, S. E. (1999). Concepts: Core readings. MIT Press.Google Scholar
McCoy, R. T., Linzen, T., Dunbar, E., & Smolensky, P. (2019). RNNs implicitly implement tensor-product representations. In ICLR.Google Scholar
McCoy, R. T., Pavlick, E., & Linzen, T. (2019). Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference. In Proceedings of ACL (pp. 34283448). ACL.Google Scholar
Merrill, W., Warstadt, A., & Linzen, T. (2022). Entailment semantics can be extracted from an ideal language model. arXiv.CrossRefGoogle Scholar
Merullo, J., Eickhoff, C., & Pavlick, E. (2023). A mechanism for solving relational tasks in transformer language models.Google Scholar
Meteyard, L., Cuadrado, S. R., Bahrami, B., & Vigliocco, G. (2012). Coming of age: A review of embodiment and the neuroscience of semantics. Cortex, 48(7), 788804.CrossRefGoogle Scholar
Mickus, T., Bernard, T., & Paperno, D. (2020). What meaning–form correlation has to compose with: A study of MFC on artificial and natural language. In Proceedings of COLING (pp. 37373749). International Committee on Computational Linguistics.Google Scholar
Mickus, T., Paperno, D., & Constant, M. (2022). How to dissect a Muppet: The structure of transformer embedding spaces. TACL, 10, 981996.CrossRefGoogle Scholar
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv:1301.3781.Google Scholar
Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 13881429.CrossRefGoogle ScholarPubMed
Montague, R. (1970). English as a formal language. In Linguaggi nella societa e nella tecnica (pp. 188221). Edizioni di Communita.Google Scholar
Montague, R. (1973). The proper treatment of quantification in ordinary English. In Approaches to natural language (pp. 221242). Springer.CrossRefGoogle Scholar
Moss, L. S. (2010). Natural logic and semantics. In Logic, language and meaning (pp. 8493). Springer.CrossRefGoogle Scholar
Moss, L. S. (2015). Natural logic. The handbook of contemporary semantic theory (pp. 559592).CrossRefGoogle Scholar
Murzi, J., & Steinberger, F. (2017). Inferentialism. A Companion to the Philosophy of Language, 1, 197224.CrossRefGoogle Scholar
Naik, A., Ravichander, A., Sadeh, N., et al. (2018). Stress test evaluation for natural language inference. In Proceedings of COLING (pp. 23402353). ACL.Google Scholar
Nangia, N., & Bowman, S. (2018). ListOps: A diagnostic dataset for latent tree learning. In Proceedings of NAACL: Student research workshop. ACL.Google Scholar
Nie, Y., Zhou, X., & Bansal, M. (2020). What can we learn from collective human opinions on natural language inference data? In Proceedings of EMNLP (pp. 91319143). ACL.Google Scholar
Nivre, J., de Marneffe, M.-C., Ginter, F., et al. (2020). Universal Dependencies v2: An evergrowing multilingual treebank collection. In Proceedings of LREC. ELRA.Google Scholar
Nye, M., Solar-Lezama, A., Tenenbaum, J., & Lake, B. M. (2020). Learning compositional rules via neural program synthesis. NeurIPS, 33, 1083210842.Google Scholar
Olsson, C., Elhage, N., Nanda, N., et al. (2022). In-context learning and induction heads. arXiv:2209.11895.Google Scholar
Ouyang, L., Wu, J., Jiang, X., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS, 35, 2773027744.Google Scholar
Paperno, D. (2022). On learning interpreted languages with recurrent models. Computational Linguistics, 48(2), 471482.CrossRefGoogle Scholar
Paperno, D., & Baroni, M. (2016). When the whole is less than the sum of its parts: How composition affects pmi values in distributional semantic vectors. Computational Linguistics, 42(2), 345350.CrossRefGoogle Scholar
Paperno, D., Kruszewski, G., Lazaridou, A., et al. (2016). The LAMBADA dataset: Word prediction requiring a broad discourse context. In Proceedings of ACL.Google Scholar
Paperno, D., Pham, N. T., & Baroni, M. (2014). A practical and linguistically-motivated approach to compositional distributional semantics. In Proceedings of ACL (pp. 9099). ACL.Google Scholar
Parcalabescu, L., Cafagna, M., Muradjan, L., et al. (2022). Valse: A task-independent benchmark for vision and language models centered on linguistic phenomena. In Proceedings of ACL.CrossRefGoogle Scholar
Parcalabescu, L., & Frank, A. (2022). Mm-shap: A performance-agnostic metric for measuring multimodal contributions in vision and language models & tasks. arXiv:2212.08158.Google Scholar
Parikh, P. (2001). The use of language. CSLI Publications.Google Scholar
Parrish, A., Schuster, S., Warstadt, A., et al. (2021). NOPE: A corpus of naturally-occurring presuppositions in English. In Proceedings of CONLL (pp. 349366). ACL.Google Scholar
Patel, A., Li, B., Rasooli, M. S., et al. (2022). Bidirectional language models are also few-shot learners. arXiv:2209.14500.Google Scholar
Pavlick, E., & Kwiatkowski, T. (2019). Inherent disagreements in human textual inferences. TACL, 7, 677694.CrossRefGoogle Scholar
Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. In Proceedings of EMNLP (pp. 15321543).Google Scholar
Pérez, J., Barceló, P., & Marinkovic, J. (2021). Attention is Turing complete. Journal of Machine Learning Research, 22(1), 34633497.Google Scholar
Pezzelle, S. (2023). Dealing with semantic underspecification in multimodal NLP. In Proceedings of ACL (pp. 1209812112). ACL.Google Scholar
Pezzelle, S., Takmaz, E., & Fernández, R. (2021). Word representation learning in multimodal pre-trained transformers: An intrinsic evaluation. TACL, 9, 15631579.CrossRefGoogle Scholar
Piantadosi, S. T., & Hill, F. (2022). Meaning without reference in large language models. arXiv.Google Scholar
Pinter, Y., Guthrie, R., & Eisenstein, J. (2017). Mimicking word embeddings using subword rnns. arXiv:1707.06961.Google Scholar
Poliak, A. (2020). A survey on recognizing textual entailment as an NLP evaluation. In Proceedings of the first workshop on evaluation and comparison of NLP systems (pp. 92109). ACL.CrossRefGoogle Scholar
Poliak, A., Haldar, A., Rudinger, R., et al. (2018). Collecting diverse natural language inference problems for sentence representation evaluation. In Proceedings of EMNLP (pp. 6781). ACL.Google Scholar
Poliak, A., Naradowsky, J., Haldar, A., et al. (2018). Hypothesis only baselines in natural language inference. In Proceedings of *SEM (pp. 180191). ACL.Google Scholar
Potts, C. (2020). Is it possible for language models to achieve language understanding? (Medium post).Google Scholar
Prokhorov, V., Pilehvar, M. T., Kartsaklis, D., et al. (2019). Unseen word representation by aligning heterogeneous lexical semantic spaces. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, pp. 69006907).CrossRefGoogle Scholar
Pullum, G. K., & Huddleston, R. (2002). Negation. In The Cambridge grammar of the English language (pp. 785850). Cambridge University Press.CrossRefGoogle Scholar
Radford, A., Kim, J. W., Hallacy, C., et al. (2021). Learning transferable visual models from natural language supervision. In ICML (pp. 87488763).Google Scholar
Radford, A., Wu, J., Child, R., et al. (2019). Language models are unsupervised multitask learners.Google Scholar
Rajaee, S., Yaghoobzadeh, Y., & Pilehvar, M. T. (2022). Looking at the overlooked: An analysis on the word-overlap bias in natural language inference. In Proceedings of EMNLP (pp. 1060510616). ACL.Google Scholar
Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). Squad: 100,000+ questions for machine comprehension of text. arXiv:1606.05250.Google Scholar
Ramesh, A., Dhariwal, P., Nichol, A., et al. (2022). Hierarchical text-conditional image generation with clip latents. arXiv.Google Scholar
Rassin, R., Ravfogel, S., & Goldberg, Y. (2022). Dalle-2 is seeing double: Flaws in word-to-concept mapping in text2image models. arXiv:2210.10606.Google Scholar
Ravichander, A., Naik, A., Rose, C., & Hovy, E. (2019). EQUATE: A benchmark evaluation framework for quantitative reasoning in natural language inference. In Proceedings of CONLL (pp. 349361). ACL.Google Scholar
Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond accuracy: Behavioral testing of NLP models with CheckList. In Proceedings of ACL (pp. 49024912). ACL.Google Scholar
Richardson, K., Hu, H., Moss, L. S., & Sabharwal, A. (2020). Probing natural language inference models through semantic fragments. In AAAI.CrossRefGoogle Scholar
Ritter, S., Long, C., Paperno, D., et al. (2015). Leveraging preposition ambiguity to assess compositional distributional models of semantics. In Proceedings of *SEM.Google Scholar
Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. TACL, 8, 842866.CrossRefGoogle Scholar
Rogers, A., & Rumshisky, A. (2020). A guide to the dataset explosion in QA, NLI, and commonsense reasoning. In Proceedings of COLING: Tutorial abstracts (pp. 2732). International Committee for Computational Linguistics.Google Scholar
Rombach, R., Blattmann, A., Lorenz, D., et al. (2021). High-resolution image synthesis with latent diffusion models.CrossRefGoogle Scholar
Ross, A., & Pavlick, E. (2019). How well do NLI models capture verb veridicality? In Proceedings of EMNLP–IJCNLP (pp. 22302240). ACL.Google Scholar
Ruiz, N., Li, Y., Jampani, V., et al. (2022). Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. arXiv.Google Scholar
Ryzhova, D., Kyuseva, M., & Paperno, D. (2016). Typology of adjectives benchmark for compositional distributional models. In Proceedings of LREC (pp. 12531257).Google Scholar
Saha, S., Ghosh, S., Srivastava, S., & Bansal, M. (2020). PRover: Proof generation for interpretable reasoning over rules. In Proceedings of EMNLP (pp. 122136). ACL.Google Scholar
Saha, S., Nie, Y., & Bansal, M. (2020). ConjNLI: Natural language inference over conjunctive sentences. In Proceedings of EMNLP. ACL.Google Scholar
Saharia, C., Chan, W., Saxena, S., et al. (2022). Photorealistic text-to-image diffusion models with deep language understanding. Advances in Neural Information Processing Systems, 35, 3647936494.Google Scholar
Schlag, I., Smolensky, P., Fernandez, R., et al. (2019). Enhancing the transformer with explicit relational encoding for math problem solving. arXiv:1910.06611.Google Scholar
Schlenker, P. (2018). What is super semantics? Philosophical Perspectives, 32(1), 365453.CrossRefGoogle Scholar
Schroeder-Heister, P. (2018). Proof-theoretic semantics. In The Stanford encyclopedia of philosophy (Spring 2018 ed.). Metaphysics Research Lab, Stanford University.Google Scholar
Schuster, S., Chen, Y., & Degen, J. (2020). Harnessing the linguistic signal to predict scalar inferences. In Proceedings of ACL (pp. 53875403). ACL.Google Scholar
Sennrich, R., Haddow, B., & Birch, A. (2015). Neural machine translation of rare words with subword units. arXiv:1508.07909.Google Scholar
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46(1– 2), 159216.CrossRefGoogle Scholar
Socher, R., Huval, B., Manning, C. D., & Ng, A. Y. (2012). Semantic compositionality through recursive matrix-vector spaces. In Proceedings of EMNLP–CONLL (pp. 12011211).Google Scholar
Socher, R., Perelygin, A., Wu, J., et al. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of EMNLP (pp. 16311642).Google Scholar
Sommers, F. (1982). The logic of natural language. Oxford University Press.Google Scholar
Song, X., Salcianu, A., Song, Y., et al. (2021). Fast WordPiece tokenization. In Proceedings of EMNLP. ACL.Google Scholar
Soricut, R., & Och, F. J. (2015). Unsupervised morphology induction using word embeddings. In Proceedings of NAACL (pp. 16271637).Google Scholar
Soulos, P., McCoy, R. T., Linzen, T., & Smolensky, P. (2020). Discovering the compositional structure of vector representations with role learning networks. In Proceedings of blackboxnlp (pp. 238254). ACL.Google Scholar
Srivastava, A., Rastogi, A., Rao, A., et al. (2022). Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. arXiv:2206.04615.Google Scholar
Storks, S., Gao, Q., & Chai, J. Y. (2019). Recent advances in natural language inference: A survey of benchmarks, resources, and approaches. arXiv.Google Scholar
Suhr, A., Lewis, M., Yeh, J., & Artzi, Y. (2017). A corpus of natural language for visual reasoning. In Proceedings of ACL (pp. 217223).CrossRefGoogle Scholar
Tafjord, O., Dalvi, B., & Clark, P. (2021). ProofWriter: Generating implications, proofs, and abductive statements over natural language. In Findings of ACL–IJCNLP (pp. 36213634). ACL.Google Scholar
Tan, H., & Bansal, M. (2019). Lxmert: Learning cross-modality encoder representations from transformers. arXiv:1908.07490.Google Scholar
Tan, H., & Bansal, M. (2020). Vokenization: Improving language understanding with contextualized, visual-grounded supervision. arXiv:2010.06775.Google Scholar
Thrush, T., Jiang, R., Bartolo, M., et al. (2022). Winoground: Probing vision and language models for visio-linguistic compositionality. In Proceedings of IEEE/CVF CVPR.CrossRefGoogle Scholar
Tikhonov, A., Bylinina, L., & Paperno, D. (2023). Leverage points in modality shifts: Comparing language-only and multimodal word representations. In Proceedings of *SEM (pp. 1117). ACL.Google Scholar
Tokmakov, P., Wang, Y.-X., & Hebert, M. (2019). Learning compositional representations for few-shot recognition. In Proceedings of IEEE/CVF ICCV (pp. 63726381).CrossRefGoogle Scholar
Touvron, H., Martin, L., Stone, K., et al. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv:2307.09288.Google Scholar
Truong, T., Baldwin, T., Cohn, T., & Verspoor, K. (2022). Improving negation detection with negation-focused pre-training. In Proceedings of NAACL (pp. 41884193). ACL.Google Scholar
Truong, T. H., Otmakhova, Y., Baldwin, T., et al. (2022). Not another negation benchmark: The NaN–NLI test suite for sub-clausal negation. In Proceedings of AACL–IJCNLP (pp. 883894). ACL.Google Scholar
Tsuchiya, M. (2018). Performance impact caused by hidden bias of training data for recognizing textual entailment. In Proceedings of LREC. ELRA.Google Scholar
Turing, A. M. (2009). Computing machinery and intelligence. In Parsing the Turing test (pp. 2365). Springer.CrossRefGoogle Scholar
Van Benthem, J. (1986). Natural logic. In Essays in logical semantics (pp. 109119). Springer Netherlands.CrossRefGoogle Scholar
Van Benthem, J. (2008). A brief history of natural logic. In Logic, navya-nyaya and applications, homage to Bimal Krishna Matilal. College Publications.Google Scholar
Vashishtha, S., Poliak, A., Lal, Y. K., et al. (2020). Temporal reasoning in natural language inference. In Findings of EMNLP (pp. 40704078). ACL.Google Scholar
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. In Neurips.Google Scholar
Vedantam, R., Bengio, S., Murphy, K., et al. (2017). Context-aware captions from context-agnostic supervision. In Proceedings of IEEE/CVF CVPR (pp. 251260).CrossRefGoogle Scholar
Verga, P., Sun, H., Soares, L. B., & Cohen, W. W. (2020). Facts as experts: Adaptable and interpretable neural memory over symbolic knowledge. arXiv.Google Scholar
Vulić, I., Baker, S., Ponti, E. M., et al. (2020). Multi-simlex: A large-scale evaluation of multilingual and crosslingual lexical semantic similarity. Computational Linguistics, 46(4), 847897.CrossRefGoogle Scholar
Wang, A., Pruksachatkun, Y., Nangia, N., et al. (2019). Superglue: A stickier benchmark for general-purpose language understanding systems. In Neurips (Vol. 32). Curran Associates, Inc.Google Scholar
Wang, A., Singh, A., Michael, J., et al. (2019). GLUE: A multi-task benchmark and analysis platform for natural language understanding. In ICLR.Google Scholar
Warstadt, A., & Bowman, S. R. (2022). What artificial neural networks can tell us about human language acquisition. arXiv:2208.07998.Google Scholar
Weiss, G., Goldberg, Y., & Yahav, E. (2018). On the practical computational power of finite precision rnns for language recognition. In Proceedings of ACL (pp. 740745).Google Scholar
White, A. S., Rastogi, P., Duh, K., & Van Durme, B. (2017). Inference is everything: Recasting semantic resources into a unified evaluation framework. In Proceedings of IJCNLP (pp. 9961005). Asian Federation of Natural Language Processing.Google Scholar
Williams, A., Nangia, N., & Bowman, S. (2018). A broad-coverage challenge corpus for sentence understanding through inference. In Proceedings of NAACL (pp. 11121122). ACL.Google Scholar
Yanaka, H., Mineshima, K., Bekki, D., & Inui, K. (2020). Do neural models learn systematicity of monotonicity inference in natural language? In Proceedings of ACL (pp. 61056117). ACL.Google Scholar
Yanaka, H., Mineshima, K., Bekki, D., et al. (2019a). Can neural networks understand monotonicity reasoning? In Proceedings of blackboxnlp (pp. 3140).Google Scholar
Yanaka, H., Mineshima, K., Bekki, D., et al. (2019b). HELP: A dataset for identifying shortcomings of neural models in monotonicity reasoning. In Proceedings of *SEM.Google Scholar
Yang, Z., Dai, Z., Yang, Y., et al. (2019). Xlnet: Generalized autoregressive pretraining for language understanding. In Neurips (Vol. 32). Curran Associates, Inc.Google Scholar
Yi, K., Gan, C., Li, Y., et al. (2019). Clevrer: Collision events for video representation and reasoning. arXiv:1910.01442.Google Scholar
Yuksekgonul, M., Bianchi, F., Kalluri, P., et al. (2022). When and why vision-language models behave like bags-of-words, and what to do about it? arXiv.Google Scholar
Yun, T., Bhalla, U., Pavlick, E., & Sun, C. (2022). Do vision-language pretrained models learn primitive concepts? arXiv:2203.17271.Google Scholar
Zaenen, A., Karttunen, L., & Crouch, R. (2005). Local textual inference: Can it be defined or circumscribed? In Proceedings of the ACL workshop on empirical modeling of semantic equivalence and entailment. ACL.Google Scholar
Zhang, C., Van Durme, B., Li, Z., & Stengel-Eskin, E. (2022). Visual commonsense in pretrained unimodal and multimodal models. arXiv:2205.01850.Google Scholar
Zhang, C., Yang, Z., He, X., & Deng, L. (2020). Multimodal intelligence: Representation learning, information fusion, and applications. IEEE Journal of Selected Topics in Signal Processing, 14(3), 478493.CrossRefGoogle Scholar
Zhou, D., Schärli, N., Hou, L., et al. (2022). Least-to-most prompting enables complex reasoning in large language models. arXiv:2205.10625.Google Scholar
Zhou, Y., Liu, C., & Pan, Y. (2016). Modelling sentence pairs with tree-structured attentive encoder. In Proceedings of COLING (pp. 29122922). The COLING 2016 Organizing Committee.Google Scholar
Table 1 Neural word embedding models (and neural language models) assign, for a given context, an array of logit scores to each element of the model’s vocabulary, which can then be transformed into probabilities using the softmax function.

Figure 1 Two-step training paradigm: in the pretraining stage, the model is trained on large text corpora on a word prediction task. Task-specific training (fine-tuning) happens separately, with the pretrained model as a starting point.

Figure 2 Depictions of two situations: The sentence A cat is sitting on a chair is true in the left-hand situation, but not in the right-hand one.

Source: Images generated with AI image generation tool Midjourney, accessed April 15, 2023.

Table 2 A list of phenomena-specific textual inference datasets discussed in the current section. t×p in the size column stands for generating p number of inference problems from t number of templates. A part of a dataset was used for training in the original experiments. Multi-labeling with n number of labels. “Dev” stands for a dataset having a designated development set. A list of abbreviations used: trained annotators (TA), crowd workers (CW), human-elicited (HE), and automatically/manually edited existing text (AE/ME).

Figure 3 Images generated by DALLE-2 (left), Imagen (middle), and Stable Diffusion (right) with the same text prompt: A blue jay standing on a large basket of rainbow macarons. Notes: the DALLE-2 image was generated on the https://labs.openai.com website, accessed October 10, 2022; the Imagen example is from Saharia et al. (2022); the Stable Diffusion image was generated on the Stable Diffusion demo page https://huggingface.co/spaces/stabilityai/stable-diffusion, accessed October 10, 2022.

Figure 4 Summary of the core of CLIP training objective (Radford et al., 2021): contrasting matching text–image pairs with other text–image combinations from the same batch.

Figure 5 An example from Winoground (Thrush et al., 2022). The images are expected to each match just one of two captions that contain the same words but in a different order: some plants surrounding a lightbulb (left) and a lightbulb surrounding some plants (right).
