KAMP represents the first step in a very ambitious program of research. It is appropriate at this time to reflect upon this program, how far we have come, and what lies in the future.
KAMP represents not merely an attempt to devise an expedient strategy for getting text out of a computer, but rather embodies an entire theory of communication. The goal of such a theory could be summarized by saying that its objective is to account for how agents manage to intentionally affect the beliefs, desires and intentions of other agents. Developing such a theory requires examining utterances to determine the goals the speakers are attempting to achieve thereby, and in the process explicating the knowledge about their environment, about their audience, and about their language that these speakers must have. Language generation has been chosen as an ideal vehicle for the study of problems arising from such a theory because it requires one to face the problem of why speakers choose to do the things they do in a way that is not required by language understanding. Theories of language understanding make heavy use of the fact that the speaker is behaving according to a coherent plan. Language generation requires producing such a coherent plan in the first place, and therefore requires uncovering the underlying principles that make such a plan coherent.
This chapter discusses in detail a typical example that requires KAMP to form a plan involving several physical and illocutionary acts, and then to integrate the illocutionary acts into a single utterance. This example does not reflect every aspect of utterance planning, but hopefully touches upon enough of them to enable an understanding of the way KAMP works, to illustrate the principles discussed in earlier chapters of this book, and to provide a demonstration of KAMP's power and some of its limitations. It is important to bear in mind that the implementation of KAMP was done to test the feasibility of a particular approach to multiagent planning and language generation. Since it is not intended to be a “production” system, many details of efficiency involving both fundamental issues and engineering problems have been purposely disregarded in this discussion.
KAMP is based on a first-order logic natural-deduction system that is similar in many respects to the one proposed by Moore (1980). The current implementation does not take advantage of well-known techniques such as structure sharing and indexing that could be used to reduce some of the computational effort required. Nevertheless, the system is reliable, albeit inefficient, in making the necessary deductions to solve problems similar to the one described here.
This chapter examines some of the special requirements of a knowledge representation formalism that arise from the planning of linguistic actions. Utterance planning requires the ability to reason about a wide variety of intensional concepts that include knowledge per se, mutual knowledge, belief, and intention. Intensional concepts can be represented in intensional logic by operators that apply to both individuals and sentences. What makes intensional operators different from ordinary extensional ones such as conjunction and disjunction is that one cannot substitute terms that have the same denotation within the scope of one of these operators without sometimes changing the truth-value of the entire sentence. For example, suppose that John knows Mary's phone number. Suppose that unbeknown to John, Mary lives with Bill — and therefore Bill's phone number is the same as Mary's. It does not follow from these premises that John knows what Bill's phone number is.
The planning of linguistic actions requires reasoning about several different types of intensional operators. In this research we shall be concerned with the operators Know (and occasionally the related operator Believe), Mutually-Know, Knowref (knowing the denotation of a description), Intend (intending to make a proposition true) and Intend-To-Do (intending to perform a particular action).
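The failure of substitution in the phone-number example can be stated schematically with these operators. As a rough sketch (the function symbol phone and the notation below are illustrative only, and are not KAMP's actual axiomatization):

Knowref(John, phone(Mary)) ∧ phone(Bill) = phone(Mary) ⊭ Knowref(John, phone(Bill))

Here Knowref(John, phone(Mary)) may be glossed as ∃n Know(John, phone(Mary) = n): John knows some particular n to be Mary's number, yet even given the identity of the two numbers it does not follow that he knows that same n to be Bill's number, because the identity itself may not be known to him.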
This chapter discusses the problems of planning surface linguistic actions, including surface speech acts, concept activation actions, and focusing actions. What distinguishes these surface linguistic acts from the illocutionary acts considered in Chapter 5 is that they correspond directly to parts of the utterance that are produced by the planning agent. An agent intends to convey a proposition by performing an illocutionary act. There may be many choices available to him for the purpose of conveying the proposition with the intended illocutionary force. For example, he may make a direct request by using an imperative, or perform the act of requesting indirectly by asking a question. He usually has many options available to him for referring to objects in the world.
A surface linguistic act, on the other hand, represents a particular linguistic realization of the intended illocutionary act. Planning a surface speech act entails making choices about the many options that are left open by a high-level specification of an illocutionary act. In addition, the surface speech act must satisfy a multitude of constraints imposed by the grammar of the language. The domain of reasoning done by the planner includes actions along with their preconditions and effects. The grammatical constraints lie outside this domain of actions and goals (excluding, of course, the implicit goal of producing coherent English), and are therefore most suitably specified within a different system.
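To make the distinction concrete, the sketch below illustrates how a single high-level request leaves several surface realizations open. The class, field names, and hard-coded sentences are invented for illustration and are not KAMP's actual representation; a real planner would generate the sentences from the proposition rather than list them.

from dataclasses import dataclass

@dataclass
class IllocutionaryAct:
    """High-level act: what the speaker intends to convey, not how."""
    force: str         # e.g. "request"
    hearer: str        # intended addressee
    proposition: str   # the content to be made true or conveyed

def surface_realizations(act):
    """Enumerate candidate surface speech acts realizing a request.
    Each carries the same illocutionary force; choosing among them, and
    checking the choice against the grammar, is the surface-level task."""
    if act.force == "request":
        return [
            ("imperative", "Remove the pump."),
            ("indirect question", "Can you remove the pump?"),
            ("declarative", "I need you to remove the pump."),
        ]
    return []

act = IllocutionaryAct(force="request", hearer="apprentice",
                       proposition="removed(pump)")
for style, sentence in surface_realizations(act):
    print(style, "->", sentence)

Whichever option is selected must still satisfy the grammatical constraints mentioned above, which is why those constraints are best stated in a separate system.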
The planning of natural-language utterances builds on contributions from a number of disciplines. The construction of the multiagent planning system is relevant to artificial intelligence research on planning and knowledge representation. The axiomatization of illocutionary acts discussed in Chapter 5 relies on results in speech act theory and the philosophy of language. Constructing a grammar of English builds on the study of syntax in linguistics and of semantics in both linguistics and philosophy. A complete survey of the relevant literature would go far beyond the scope of this book. This chapter is included to give the reader an overview of some of the most important research that is pertinent to utterance planning.
Language generation
It was quite a long time before the problem of language generation began to receive the attention it deserves. Beginning in about 1982, there has been a virtual explosion in the quantity of research being done in this field, and a complete review of all of it could well fill a book (see Bolc and McDonald, forthcoming). This chapter presents an overview of some of the earlier work that provides a foundation for the research that follows.
Several early language-generation systems (e.g. Friedman, 1969) were designed more for the purpose of testing grammars than for communication.
This book is based on research I did in the Stanford University Computer Science Department for the degree of Doctor of Philosophy. I express my sincere gratitude to my dissertation reading committee: Terry Winograd, Gary Hendrix, Doug Lenat and Nils Nilsson. Their discussion and comments contributed greatly to the research reported here. Barbara Grosz's thoughtful comments on my thesis contributed significantly to the quality of the research. I also thank Phil Cohen and Bonnie Webber for providing detailed comments on the first draft of this book and for providing many useful suggestions, and Aravind Joshi for his efforts in editing the Cambridge University Press Studies in Natural Language Processing series.
This research was supported by the Office of Naval Research under contract N00014-80-C-0296, and by the National Science Foundation under grant MCS-8115105. The preparation of this book was in part made possible by a gift from the System Development Foundation to SRI International as part of a coordinated research effort with the Center for the Study of Language and Information at Stanford University.
This book would be totally unreadable were it not for the efforts of SRI International Senior Technical Editor Savel Kliachko, who transformed my muddled ramblings into golden prose.
Toward a theory of language generation and communication
A primary goal of natural-language generation research in artificial intelligence is to design a system that is capable of producing utterances with the same fluency as a human speaker. One could imagine a “Turing Test” of sorts in which a person is presented with a dialogue between a human and a computer and asked to identify, on the basis of the naturalness of each participant's use of English, which one is the computer. Unfortunately, no natural-language generation system yet developed can pass this test for an extended dialogue.
A language-generation system capable of passing this test would obviously have a great deal of syntactic competence. It would be capable of using correctly and appropriately such syntactic devices as conjunction and ellipsis; it would be competent at fitting its utterances into a discourse, using pronominal references where appropriate, choosing syntactic structures consistent with the changing focus, and giving an overall feeling of coherence to the discourse. The system would have a large knowledge base of basic concepts and commonsense knowledge so that it could converse about any situation that arose naturally in its domain.
However, even if a language-generation system met all the above criteria, it might still not be able to pass our “Turing Test” because to know only about the syntactic and semantic rules of the language is not enough.
This chapter deals with the design and implementation of a planning system called KAMP (an acronym for Knowledge And Modalities Planner) that is capable of planning to influence another agent's knowledge and intentions. The motivation for the development of such a planning system is the production of natural-language utterances. However, a planner with such capabilities is useful in any domain in which information-gathering actions play an important role, even when the domain does not involve planning speech acts or coordinating actions among multiple agents.
One could imagine, for example, a police crime laboratory to which officers bring for analysis substances found at the scene of a crime. The system's goal is to identify the unknown substance. The planner would know of certain laboratory operations that agents would be capable of performing — in effect actions that would produce knowledge about what the substance is or is not. A plan would consist of a sequence of such information-gathering actions, and the result of executing the entire plan would be that the agent performing the actions knows the identity of the mystery substance. Since the primary motivation for KAMP is a linguistic one, most of the examples will be taken from utterance planning; the reader should note, however, that the mechanisms proposed are general and appear to have interesting applications in other areas as well.
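A minimal sketch of this scenario, under invented assumptions (the candidate substances, tests, outcomes, and function names below are hypothetical and are not drawn from KAMP), treats each laboratory operation as an action that partitions the candidates by observable outcome; a plan is a sequence of tests whose combined outcomes are guaranteed to identify the substance.

# A toy sketch of the crime-laboratory scenario.  Each test is an
# information-gathering action; a plan is a sequence of tests after whose
# execution the agent knows which candidate substance it has, i.e. the
# observations distinguish every candidate from every other.

CANDIDATES = {"aspirin", "salt", "sugar", "baking_soda"}

# Outcome of each test on each candidate (invented data).
TESTS = {
    "flame_test":      {"aspirin": "no_color", "salt": "yellow",
                        "sugar": "no_color",  "baking_soda": "yellow"},
    "solubility_test": {"aspirin": "low",      "salt": "high",
                        "sugar": "high",       "baking_soda": "high"},
    "acid_test":       {"aspirin": "no_fizz",  "salt": "no_fizz",
                        "sugar": "no_fizz",    "baking_soda": "fizz"},
}

def refine(partition, outcomes):
    """Split each block of mutually indistinguishable candidates by outcome."""
    new = []
    for block in partition:
        groups = {}
        for s in block:
            groups.setdefault(outcomes[s], set()).add(s)
        new.extend(groups.values())
    return new

def plan_tests(candidates, tests):
    """Greedily pick tests until every candidate is distinguishable, so that
    executing the plan yields knowledge of the substance's identity."""
    plan, partition, available = [], [set(candidates)], dict(tests)
    while any(len(block) > 1 for block in partition) and available:
        best = max(available, key=lambda t: len(refine(partition, available[t])))
        refined = refine(partition, available[best])
        if len(refined) == len(partition):
            break                     # no remaining test is informative
        plan.append(best)
        partition = refined
        del available[best]
    return plan

print(plan_tests(CANDIDATES, TESTS))  # e.g. ['flame_test', 'solubility_test', 'acid_test']

The greedy choice of the most informative remaining test is only one possible strategy; the point is that the goal here is an epistemic one, a state in which the agent knows the identity of the substance, rather than a change in the physical world.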
In order to understand the relationship between syntactic theory and how people parse sentences, it is first necessary to understand the more general relationship between the grammar and the general cognitive system (GCS). The Chomskyan view, adhered to by most linguists working within the modern generative framework, is that the grammar is a cognitive subsystem whose vocabulary and operations are defined independently of the GCS and account for the structure of language (Chomsky, 1980). Linguistics is thus the branch of theoretical cognitive psychology which explains language structure.
There is another possible relationship between the grammar and the GCS in which linguistics does not play a primary theoretical role in explaining language structure. On this view, the structure of language is explained by basic principles of the GCS – for example, the nature of concepts in interaction with basic properties of the human information processing system. If this view is correct, grammars become convenient organizational frameworks for describing the structure of language. Linguistics is then a descriptive rather than a theoretical branch of cognitive psychology. The linguistics-as-descriptive position was held by the American Structuralists and is presently being revived from a somewhat different perspective in the form of “cognitive grammar” (Lakoff, in press).
These two frameworks for understanding the relationship between grammars and the cognitive system – linguistics as explanation and linguistics as description – suggest different research strategies for answering the question posed by the theme of this book: namely, What is the relationship between syntactic theory and how listeners parse sentences?
There has been some interest in recent years in finding functional explanations for various properties of human languages. The general form of these explanations is:
Languages have property P because if they did not, we
(a) couldn't learn them; or
(b) couldn't plan and produce sentences efficiently; or
(c) couldn't understand sentences reliably and efficiently; or
(d) wouldn't be able to express the sorts of messages we typically want to express.
Some linguists are dubious about the legitimacy of such investigations, which are indeed notoriously risky undertakings. It is all too easy to be seduced by what looks like a plausible explanation for some linguistic phenomenon, but there is really no way of proving that it is the correct explanation, or even that functional considerations are relevant at all. What, then, can be said in favor of this line of research?
Setting aside the sheer fascination of finding answers to why-questions, we can point to some more practical benefits that may result. First, we may find out something about the learning mechanism, or the sentence processing mechanism, or whichever component of the language faculty provides a likely functional explanation for the linguistic facts. In this paper we will concentrate on the sentence parsing mechanism. (See Fodor and Crain, in preparation, for discussion of language learning.) It is clear that one can derive at least some interesting hypotheses about how the parser is structured, by considering how it would have to be structured in order to explain why certain sentences are ungrammatical, why there are constraints excluding certain kinds of ambiguity, and so forth.
Since the late 1970s there has been vigorous activity in constructing highly constrained grammatical systems by eliminating the transformational component either totally or partially. There is increasing recognition of the fact that the entire range of dependencies that transformational grammars in their various incarnations have tried to account for can be captured satisfactorily by classes of rules that are nontransformational and at the same time highly constrained in terms of the classes of grammars and languages they define.
Two types of dependencies are especially important: subcategorization and filler-gap dependencies. Moreover, these dependencies can be unbounded. One of the motivations for transformations was to account for unbounded dependencies. The so-called nontransformational grammars account for the unbounded dependencies in different ways. In a tree adjoining grammar (TAG) unboundedness is achieved by factoring the dependencies and recursion in a novel and linguistically interesting manner. All dependencies are defined on a finite set of basic structures (trees), which are bounded. Unboundedness is then a corollary of a particular composition operation called adjoining. There are thus no unbounded dependencies in a sense.
This factoring of recursion and dependencies is in contrast to transformational grammars (TG), where recursion is defined in the base and the transformations essentially carry out the checking of the dependencies. The phrase linking grammars (PLGs) (Peters and Ritchie, 1982) and the lexical functional grammars (LFGs) (Kaplan and Bresnan, 1983) share this aspect of TGs; that is, recursion builds up a set of structures, some of which are then filtered out by transformations in a TG, by the constraints on linking in a PLG, and by the constraints introduced via the functional structures in an LFG.
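The adjoining operation itself is easy to state. The toy sketch below (illustrative only, not a published TAG implementation; the trees and labels are invented) represents trees as (label, children) pairs and shows how adjoining an auxiliary tree at an interior node stretches a filler-gap dependency that was stated locally within a single elementary tree.

# Trees are (label, children) pairs; a leaf is (label, []).  An auxiliary
# tree has exactly one foot node, written here with a "*" suffix matching
# its root label.

def adjoin(tree, path, aux):
    """Adjoin auxiliary tree aux at the node reached by path (a sequence of
    child indices) in tree; the excised subtree is re-inserted at the foot
    node of aux."""
    label, children = tree
    if not path:
        return plug_foot(aux, tree)
    i, rest = path[0], path[1:]
    new_children = list(children)
    new_children[i] = adjoin(children[i], rest, aux)
    return (label, new_children)

def plug_foot(aux, subtree):
    """Copy aux, placing subtree at its foot node."""
    label, children = aux
    if label.endswith("*"):
        return subtree
    return (label, [plug_foot(c, subtree) for c in children])

# Initial tree for "who John see _": the wh-filler and its gap ("e") are
# stated together, locally, in one bounded elementary structure.
initial = ("S'", [("who", []),
                  ("S", [("NP", [("John", [])]),
                         ("VP", [("see", []), ("e", [])])])])

# Auxiliary tree for "Mary think S", with root S and foot S*.
aux = ("S", [("NP", [("Mary", [])]),
             ("VP", [("think", []), ("S*", [])])])

# Adjoining at the embedded S stretches the who ... e dependency across
# "Mary think", without that dependency ever being stated nonlocally.
print(adjoin(initial, (1,), aux))

In this sense the unboundedness of the dependency is a corollary of the composition operation, as stated above, rather than a property of any single elementary structure.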
In this paper I want to draw together a number of observations bearing on how people interpret constituent questions. The observations concern the interpretation possibilities for “moved” and “unmoved” wh-phrases, as well as wide scope interpretation of quantifiers in embedded sentences. I will argue that languages typically display a correlation between positions that do not allow extractions and positions where a constituent cannot be interpreted with wide scope. Given this correlation, it seems natural to investigate the processes of extraction and wide-scope interpretation from the perspective of sentence processing, in the hope of explaining correlations between the two. I have singled out constituent questions because they illustrate the parsing problem for sentences with nonlocal filler-gap dependencies; they are a particularly interesting case to consider because of interactions between scope determining factors and general interpretive strategies for filler-gap association.
Gap-filling
To what extent is the process of gap-filling sensitive to formal, as opposed to semantic, properties of the linguistic input? One type of evidence that is relevant here is the existence of a morphological dependency between the filler and the environment of the gap, as illustrated in (1).
(1) a. Which people did Mary say — were invited to dinner?
b. *Which people did Mary say — was invited to dinner?
In languages with productive case marking, a similar type of dependency will hold between the case of the filler and the local environment of the gap. This kind of morphological agreement is typically determined by properties having to do with the surface form of the items in question, or with inherent formal properties, such as which noun class a given noun belongs to.
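A toy check of the dependency in (1) makes the formal character of the constraint explicit; the miniature lexicon and feature values below are invented for illustration only.

# The number feature of the wh-filler must agree with the finite verb in
# the clause containing the gap; the feature assignments are invented.
NUMBER = {
    "which people": "plural",
    "which person": "singular",
    "were": "plural",
    "was": "singular",
}

def gap_agreement_ok(filler, gap_verb):
    """Check the morphological dependency between a fronted filler and the
    finite verb local to its gap."""
    return NUMBER[filler] == NUMBER[gap_verb]

# (1a) Which people did Mary say _ were invited to dinner?
print(gap_agreement_ok("which people", "were"))   # True  -> acceptable
# (1b) *Which people did Mary say _ was invited to dinner?
print(gap_agreement_ok("which people", "was"))    # False -> rejected

The check consults only a formal feature of the filler and the verb, in the spirit of the surface and inherent formal properties mentioned above.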
The ostensible goal of this paper is to construct a general complexity metric for the processing of natural language sentences, focusing on syntactic determinants of complexity in sentence comprehension. The ultimate goal, however, is to determine how the grammars of natural languages respond to different types of syntactic processing complexity.
A complexity metric that accurately predicts the relative complexity of processing different syntactic structures is not, in itself, of much theoretical interest. There does not seem to be any compelling reason for linguistic theory or psycholinguistic theory to incorporate such a metric. Rather, ultimately the correct complexity metric should follow directly as a theorem or consequence of an adequate theory of sentence comprehension.
Different theories of sentence comprehension typically lead to distinct predictions concerning the relative perceptual difficulty of sentences. Hence, one reason for developing a complexity metric is simply to help pinpoint inadequacies of current theories of sentence comprehension and to aid in the evaluation and refinement of those theories. An explicit complexity metric should also help to reveal the relation between the human sentence processor and the grammars of natural languages. In particular, developing a well-motivated complexity metric is a crucial prerequisite for evaluating the hypothesis that the grammars of natural languages are shaped in some respect by the properties of the human sentence processor, since the most common form of this hypothesis claims that grammars tend to avoid generating sentences that are extremely difficult to process.
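As a purely illustrative example of the kind of metric at issue, and not the metric developed in this chapter, one family of candidates counts how many syntactic dependencies must be held open simultaneously as the sentence is read from left to right; the sentences and hand-marked dependency events below are assumptions made for the sketch.

def max_open_dependencies(events):
    """events lists +1 ('a dependency is opened') and -1 ('one is closed')
    in left-to-right order; return the peak number held open at once."""
    open_now, peak = 0, 0
    for e in events:
        open_now += e
        peak = max(peak, open_now)
    return peak

# "The rat the cat chased died."  (one level of center-embedding)
single_embedding = [+1, +1, -1, -1]
# "The rat the cat the dog bit chased died."  (double center-embedding)
double_embedding = [+1, +1, +1, -1, -1, -1]

print(max_open_dependencies(single_embedding))   # 2
print(max_open_dependencies(double_embedding))   # 3 -> predicted harder

A metric of this sort would be of interest only insofar as it fell out of an independently motivated theory of the comprehension mechanism, which is precisely the point made above.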