
From Generalized Phrase Structure Grammar to Categorial Grammar (and partway back again)

Published online by Cambridge University Press:  13 November 2025

Pauline Jacobson*
Affiliation:
Brown University , Providence, RI, USA

Abstract

The account of extraction using only generalized context-free phrase structure (put forth in a series of papers by Gazdar in the late 1970s and early 1980s and then codified in Generalized Phrase Structure Grammar) used slash as a feature to indicate that there was something missing in wh-extraction constructions. Although this was (deliberately) reminiscent of the slash of Categorial Grammar (CG) (which encodes argument selection), Gazdar et al. treated it as distinct from the CG slash. Subsequent work by Steedman proposed to unite them. This paper argues, first, that Gazdar et al. were correct to treat the two differently. Second, I advocate a natural view of syntactic categories under the CG worldview: we take the function categories of CG to correspond to functions on strings, and with this we preclude what I call S-crossing composition, used in many CG analyses. With this in mind, we suggest that rightward extraction as in Right Node Raising really is function composition, while wh-extraction should be handled by something much closer to the account in Gazdar et al. The two behave differently under coordination chains involving a silent and or or. This behavior provides evidence that the two should be kept distinct (see also work by Oehrle on this point), while providing striking evidence for the view of syntactic categories advocated here.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

1. Background

This paper is rooted in the framework of Categorial Grammar (CG); see, among many others, Steedman (1987, 2024) and Jacobson (1999, 2014). CG has many varieties; I will assume the framework outlined in Jacobson (2014). But while one of my overall goals here is to promote the CG worldview and to show that a natural view of syntactic categories under that view accounts for some surprising facts, this paper simultaneously explores one of the leading ideas (perhaps the leading idea) put forth in Gazdar, Klein, Pullum and Sag (1985) (hereafter GKPS) and also earlier in Gazdar (1979, 1981). I will argue – with Oehrle (1990) – that that idea should be incorporated (with modification) into CG.

This particular idea of interest here is the slash category account of what I will for convenience call wh-extraction constructions, by which I mean wh-questions, relative clauses (including those with that or with no complementizer/wh), and – even though it does not involve a wh – Topicalization, which shares many properties with the others. (Of course, the grammar itself both in the view in GKPS and in CG contains no actual category of wh-extraction, so I am using this merely as a convenient description of a group of phenomena.) Thus, one of the main coups of Generalized Phrase Structure Grammar (GPSG) was to provide simple analyses of the above constructions without movement by use of slash categories, categories that essentially encoded that something is missing from an expression. GKPS takes SLASH to be a feature; more specifically, they adopt a system where each feature has a name (here, SLASH) and a value (here, the name of whatever is the category of what is missing). Thus, the italicized S-like expression in (1) is of category S[SLASH:PP], and the VP node directly dominating crawled is of category VP[SLASH:PP]:

There are a few different ways that one can spell out the exact details of how this feature is passed, how at the bottom of the construction there can be something missing of the relevant category, and how the feature is introduced at the top of the construction. In various versions of GPSG, these were worked out in different ways, so we summarize here just one possibility that will be close to the ultimate conventions we will adopt in the later CG rendering. (The bottom is not the same as in GKPS but is compatible with the basic program and is closer to what we will adopt here.) It is also basically the account in other works in GPSG, in particular in Gazdar, Klein, Pullum and Sag (1982, 1984). Those accounts – and the account here – are traceless in that there is no item at all in the position of the gap.

There is an obvious simplicity to this approach; it removes the necessity of more than one level of syntactic representation and shows that the existence of such extraction constructions is compatible with a grammar containing only phrase structure rules (in, of course, very generalized form).Footnote 1 It should go without saying that to be able to analyze something without introducing new machinery is a welcome result. In this case, the additional machinery avoided by not having multiple levels of representation is the rules needed to map one level to another. But aside from the general theoretical benefit, a major appeal of any analysis along these lines is that Across the Board extraction, as in (3), came for free and was completely unsurprising:

This is nothing more than coordination of like categories, as both Sandy loves and Lee hates are S[SLASH:NP].

The use of the term slash for this is reminiscent of the slash notated as / in CG, and of course this was no accident; Gazdar (1979) clearly was borrowing that terminology from CG. Nonetheless, because GKPS was not rooted in the more general program of CG, the two are not the same. The slash of CG is used for argument selection: A VP can be seen as S/NP (I ignore how to notate here that the NP is taken to the left of the VP), a transitive verb as (S/NP)/NP, a preposition as PP/NP, and so forth. But this is not what is meant by slash in GKPS or in Gazdar (1979, 1981). The main question that this paper seeks to answer is: Should the two be the same? While the simplest answer would be yes, this paper argues that the correct answer is no. I first review some prior evidence and then present new evidence to this effect. The conclusion is that GKPS were right to treat the sort of extraction found in classic wh cases as being somewhat different from ordinary argument selection. Distinguishing between the argument selection slash of CG and the slash used for extraction is also featured in work in Type Logical Grammar (TLG); we return to this in Appendix 1.

The structure of this paper is as follows. Section 2 provides background and briefly lays out the particular version of CG assumed here. There we develop one of the key tools here, which is to view the so-called function categories of CG as literally corresponding to functions on strings (taking two strings as input and giving a third as output), a view that has consequences for some of the additional operations assumed in CG. Section 3 reviews one of the striking (and well-known) benefits of this view for the analysis of Right Node Raising (RNR) and other kinds of so-called Non-Constituent Conjunctions; see e.g. Steedman (1987), Dowty (1988), Steedman (1996). Section 4 goes on to raise the question noted at the outset here: Are the slash of CG and the slash used in wh-extraction constructions the same? This unification is put forth in e.g. Steedman (1987) and many of his subsequent works. But despite the initial appeal, Section 5 reviews some evidence in prior literature arguing against this sort of unification of e.g. RNR and wh-extraction constructions. This suggests (along with e.g. Oehrle 1990) that the slash of GPSG should be introduced into the CG machinery, with its passing from mothers to daughters handled essentially as in GKPS (translated into CG terms), rather than as we would expect if it were the CG slash. The main results in this paper are laid out in Section 6. There, I turn to the interaction of coordination with both wh-extraction and RNR, particularly in what I call Lakoff chains. These are chains of more than two coordinated expressions where only the last member is introduced by an overt conjunction (as in Curly, Moe, and Larry) and where some contain gaps and some do not.
Interestingly, we find that there is one striking difference between the types of Lakoff chains allowed in wh-extraction constructions and those in RNR; only the latter requires that the rightmost member of the chain contain a gap. It is shown that this is exactly as predicted if we use the basic CG slash mechanism for RNR cases and the GPSG-like mechanism for wh-extractions. Moreover, Section 6.4 shows that this result holds not only in coordination cases but also more generally for Left Branch Constraint effects. While wh-extraction from a left branch is often questionable (hence the so-called Left Branch Constraint), the corresponding RNR cases are quite impossible.

As noted above, many versions of TLG also treat the RNR case and the wh case differently; see e.g. Kubota and Levine (2020: 29–33). While the crucial facts in this paper have not to my knowledge been discussed within TLG (I believe that the key contrast between RNR and wh-extraction in Lakoff chains is new here), these contrasts should easily be accommodated in most versions of TLG in a similar way as is done here. In view of the similarities both in this local way and more general similarities (and differences) between TLG and the approach here, Appendix 1 gives some comparison between the two approaches, focusing primarily on big picture issues rather than on the specific analyses of right and left extraction. Appendix 2 discusses a possibility raised by one referee that the Lakoff chain data are truly ungrammatical but nonetheless acceptable and argues that this strategy – at least without considerable elaboration – provides no real explanation for the key contrasts.

2. The CG assumed here

2.1. Basics

We begin with some notation. We take any linguistic expression to be a triple of sound, category, and meaning, notated as <[sound], Cat, [[meaning]]> (I use orthography to represent the sound part). I assume a small set of basic categories; we can take these initially to be S, NP, N, PP, CP, although there might be a few others. I also assume that the categories can have features. In fact, as discussed in Chomsky (1970), GKPS, and others, we can think of the basic categories as themselves being feature bundles. This means that at the bottom of the system is a set of features where category features like S, NP, etc. are just themselves the value of one of the features. But for convenience I will notate the usual category features S, NP, etc. as given above. I also continue to use the terminology of basic categories, ignoring the fact that these are actually feature bundles.

In addition to the basic categories, we will (as is standard in CG) add in a set of function categories. We can define these recursively as follows:

A/RB is something that can combine with a B to its right to give an expression of category A, and A/LB takes its argument to the left. The semantics and semantic combinatorics are parallel; each expression of some basic category A has as its extension (I ignore intensions throughout) some member of a set a, and each function category of the form A/B denotes a function from b to a. As is standard, we use <b,a> to mean the set of all functions from b to a. Moreover, we take these function categories to literally correspond to functions on actual strings; this is central to this paper. The function corresponding to any category of the form A/B applies to any string which is of that category and gives back a function from strings of category B to strings of category A. Technically, then, the function corresponding to any category A/B is a function in <string, <string, string>>, and which actual function it is depends on the directionality of the slash.Footnote 2 Thus, the function corresponding to S/LNP, for example, applies to a string like [walks], and the result is a function mapping [Lee] to [Lee walks], [Sandy] to [Sandy walks], etc. Incidentally, while there is no primitive category VP (VP is S/LNP or perhaps in some cases S/LCP, etc.), I use VP for convenience as an abbreviation. And I will use TV to mean (S/LNP)/RNP.
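To make the idea of categories as functions on strings concrete, here is a toy rendering in Python (the language and all names, such as F_right and F_left, are my expository choices, not part of the proposal). Each slashed category corresponds to a function taking the string of the function expression and returning a function from argument strings to result strings:

```python
# A category A/RB corresponds to a function that, applied to the string
# of the A/RB expression, returns a function attaching a B-string to
# the RIGHT; A/LB attaches its argument string to the LEFT.

def F_right(fn_string):
    """The function corresponding to a category of the form A/RB."""
    return lambda arg_string: fn_string + " " + arg_string

def F_left(fn_string):
    """The function corresponding to a category of the form A/LB."""
    return lambda arg_string: arg_string + " " + fn_string

# walks is S/LNP: applying F:S/LNP to [walks] yields a function mapping
# subject strings to sentence strings.
walks = F_left("walks")
s1 = walks("Lee")      # "Lee walks"
s2 = walks("Sandy")    # "Sandy walks"
```

This is nothing more than the definition in the text restated executably; the directionality of the slash determines which of the two concatenation functions a category corresponds to.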

For any category X, we notate the corresponding function as F:X (read that as the function corresponding to the category X), and for every function F we refer to the corresponding category as CAT:F. We then define the relevant functions as follows:

In prose, (a) says that the function corresponding to the category A/RB is such that when it applies first to a string [x] and then to [y], it yields the string [xy] and the reverse for (b). Given this, we can formalize how the syntactic and semantic combinatorics work by adopting the very general rule schema in (6), where X ranges over R and L:
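The schema in (6) itself is not reproduced above, but its content as described in prose – application plus direction-sensitive concatenation over sound/category/meaning triples – can be sketched as follows (a toy Python sketch under my own encoding conventions; apply_schema and the tuple representation of categories are mine):

```python
# An expression is a triple (sound, category, meaning).  Categories are
# encoded as tuples: ('/R', A, B) is A/RB and ('/L', A, B) is A/LB.

def apply_schema(fn_expr, arg_expr):
    """One instantiation per slash direction, as in schema (6)."""
    fn_sound, fn_cat, fn_sem = fn_expr
    arg_sound, arg_cat, arg_sem = arg_expr
    slash, result_cat, wanted_cat = fn_cat
    if arg_cat != wanted_cat:
        raise TypeError("category mismatch")
    if slash == '/R':
        sound = fn_sound + " " + arg_sound   # argument to the right
    else:
        sound = arg_sound + " " + fn_sound   # argument to the left
    return (sound, result_cat, fn_sem(arg_sem))

# walks : <[walks], S/LNP, [[walk]]>, with a stub meaning
walks = ("walks", ('/L', 'S', 'NP'), lambda x: ('walk', x))
lee = ("Lee", 'NP', 'lee')
s = apply_schema(walks, lee)   # ("Lee walks", 'S', ('walk', 'lee'))
```

The semantics simply tracks the syntax here: the meaning of the result is the function meaning applied to the argument meaning, regardless of slash direction.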

Note that the directional features on items are largely predictable: All expressions of category S/NP, for example, take their subject NPs to the left. And all transitive verbs (expressions of category (S/NP)/NP) take their objects to the right. Therefore, we surely do not want to list the directional features on a case-by-case basis on lexical items. To this end, Jacobson (2014) suggests that lexical items are (in general) listed in underspecified form without directional features (although there could be occasional listed exceptions to the general rules), and the directional features are supplied by rules further specifying the items. For present purposes, it suffices to adopt three such rules (there are probably a few others):

These rules apply only when not already marked; they do not override existing features. And (c) will have to apply after the other two rules.Footnote 4 (This is the only place in the system where there is any kind of ordering among the rules.)
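The three rules themselves are not reproduced above (the numbered example was lost), so the sketch below encodes plausible versions inferred from the surrounding discussion: (a) an underspecified S/NP takes its NP to the left; (b) elsewhere, an argument is taken to the right by default; and (c) a modifier category X/X takes its argument to the left. All code and names here are mine, and the rule formulations are my reconstruction, not the author's exact statement:

```python
# Categories as tuples: ('/', A, B) is underspecified A/B;
# ('/R', A, B) and ('/L', A, B) are directionally marked.

def strip(cat):
    """The category with directional marks removed (for the X/X test)."""
    if isinstance(cat, tuple):
        return ('/', strip(cat[1]), strip(cat[2]))
    return cat

def specify(cat):
    """Supply directional features, never overriding existing marks."""
    if not isinstance(cat, tuple):
        return cat                      # basic category: nothing to do
    slash, a, b = cat
    a, b = specify(a), specify(b)
    if slash != '/':                    # already marked: leave alone
        return (slash, a, b)
    if a == 'S' and b == 'NP':
        return ('/L', a, b)             # (a) subject NPs to the left
    if strip(a) == strip(b):
        return ('/L', a, b)             # (c) modifiers to the left
    return ('/R', a, b)                 # (b) default: to the right

VP = specify(('/', 'S', 'NP'))                      # S/LNP
TV = specify(('/', ('/', 'S', 'NP'), 'NP'))         # (S/LNP)/RNP
and_cat = specify(('/', ('/', VP, VP), VP))         # (X/LX)/RX for VP and
```

As the text notes below, the fully specified category for and comes out as (X/LX)/RX: the inner X/X is a modifier and so is marked leftward, while the outer slash gets the rightward default.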

We assume that and is listed in the lexicon as being of category (X/X)/X – in other words, it takes its arguments one at a time rather than in a flat ternary fashion. For arguments for this, see, among others, Munn (1993), Jacobson (2014). The word order rules thus predict that the fully fleshed out category for and is (X/LX)/RX. This means that and takes its first argument to the right (by the default rule) and its second to the left (by the fact that the inner part here is a modifier; something like and chews gum is a VP modifier as it maps a VP like walks to the VP walks and chews gum). Partee and Rooth (1983) proposed that and has a generalized meaning corresponding to the meet operation, which is a binary operator. But since actual English and takes its arguments one at a time, we can take its meaning to be the Curryed version of this meet operation. In most of the cases looked at here, it conjoins things of type <x,t>, so it ends up having a meaning in those cases which is the Curryed version of the intersection operation. (For a case like Lee and Sandy, the two NPs here have to be generalized quantifiers to be conjoinable; see Partee and Rooth (1983) and the remarks below about Lift.) Also central to this paper is the behavior of what I call silent and, which I notate as and (as in walks, and talks, and chews gum). The analyses work out effortlessly if indeed this is just a silent version of ordinary and – it has no phonology but its syntax and semantics are the same. Thus, it is listed in the lexicon as (slightly simplified) <[∅], (X/X)/X, Curryed meet>. The directional features on the slashes will be the same as for ordinary and.
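The Curryed meet meaning of and for the type <x,t> case can be sketched directly, modeling <x,t> meanings as Python predicates (a toy illustration; the name curried_and is mine). Note that and takes its first, rightward argument first, so the second conjunct comes in first:

```python
# Curryed meet for conjoinables of type <x,t>: pointwise intersection,
# taken one argument at a time, right conjunct first.

def curried_and(q):
    """Applied first to the right conjunct q (giving the modifier
    'and q'), then to the left conjunct p."""
    return lambda p: lambda x: p(x) and q(x)

# walks and talks: and applies to [[talks]] first, then to [[walks]].
walks = lambda x: x in {"Lee", "Sandy"}
talks = lambda x: x in {"Sandy", "Kelly"}
walks_and_talks = curried_and(talks)(walks)
```

On this sketch walks_and_talks holds of Sandy (in both sets) but not of Lee or Kelly, i.e. it is the characteristic function of the intersection, as desired.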

But there is one interesting issue raised by silent and. Since everything said above regarding and is also true for or (except of course with generalized join instead of meet semantics), and since there is also a silent or, the grammar needs a way to ensure that a silent and (or or) be part of a chain where at the bottom there is the corresponding overt item. Thus, we get chains like Lee, Sandy, or Kelly left, but not *Lee, Sandy, Kelly left. (I use or here as it is completely clear that the second cannot mean the same as the first – meaning we cannot just have silent or all the way down. But it is not very good even if interpreted as Lee, Sandy, and Kelly left, although I have found people who marginally accept lists like this on the and reading. I have no account here for that dialect.) And we certainly do not want Lee, Sandy, or Kelly to be interpreted as Lee and Sandy or Kelly. All of this can be accomplished with appropriate feature passing; for details, see Jacobson (2025). As such, the category for and given above would be further elaborated, but this will do for present purposes.

2.2. Additional operations: Lift

The results in this paper rely on the adoption of two further operations – both of which have independent motivation and are not adopted just for the material here. The first is a Lift operation. By way of background, it is well known that ordinary NPs like Mitka, the most disobedient husky, etc., have the same distribution as generalized quantifiers like every dog, no cat, etc. To account for this, Montague (1973) assigned them all the same semantic type and syntactic category. Looking only at the semantics here, Montague treated both as being of type <<e,t>,t>. Partee and Rooth (1983), however, argued against having them only have the higher type, and proposed that they are born as denoting just individuals (hence type e) but can lift to the <<e,t>,t> type. While Partee and Rooth did not adopt a CG syntax (and hence treated these all as NPs), it is natural here to marry this with the syntax, which means that expressions of category NP can shift to S/(S/NP) (we ignore the directional features momentarily). We will, however, generalize the Partee and Rooth rule to allow expressions of any category to lift. We thus adopt a unary rule (a rule mapping a single triple into another triple) as follows:

Two related notes about (8). First, Lift is a perfectly general operation, totally independent of its role in linguistics. For any member a of a set A and any set F of functions from A to B, we can define the lift of a over such functions to be a function in <<a,b>,b>, such that for any function f in F, Lift(a) takes f as argument and returns f(a). Note that the semantics of the operation in (8) is just the lift of [[α]] over functions in <a,b>. Second, one might wonder why the syntactic output here apparently just happens to preserve word order; the lifted item will combine with some expression with a function category to give just what we would have gotten had that expression taken the original α as argument. (This is ensured by the way the slash directions are given in (8).) But once we take seriously the notion of syntactic function categories as literally being functions in <string, <string,string>>, this follows from the very definition of lift.Footnote 5 Moreover, the disjunctive syntactic category in the output means that (8) as given here is really two rules. But this inelegance also disappears once we reformulate this to actually incorporate the lift operation on syntactic categories. To exposit this requires more space, so we refer simply to Jacobson (2025) for a formalized version that does incorporate the actual lift operation and collapses the two cases in (8) into one as both just instantiate lift. Here we will adopt (8) with the understanding that it is really one general lift rule which, by definition, preserves order.Footnote 6

2.3. Function composition (or Geach)

The second operation is function composition, put forth in Steedman (1987) and in many of his subsequent works. Steedman introduced this as an account of wh-extraction constructions, and indeed it is that very analysis that I will be rejecting later, in favor of something closer to that in GKPS. Nonetheless, the adoption of function composition (or, as discussed below, a Curryed version of function composition) as a combinatory possibility has other welcome consequences.

Steedman proposed that two function expressions can combine by function composition (rather than function expressions always having to first take their arguments) and in particular proposes (among others) the following two rule schemas:

Unlike Steedman, I will assume these are the only two composition rules (modulo the possibility of having infix slashes; see fns. 2 and 6). But, one might ask, why just these two? (Indeed, Steedman actually allows for two others; we return directly.) Why, for example, is there no schema exactly like (9a) but where the phonological part of the output is [βα]? Steedman (2024) rules this out by his Principle of Consistency: ‘All rules linearize their inputs consistent with the directionality specified in the governing category’. As far as I can tell, this needs a minor reformulation; the principle needs to restrict the output category rather than directly restricting the way in which items are linearized, since the latter is a consequence of the former. But while a suitably reformulated principle might be quite general, it nonetheless would need to be stipulated in the grammar as an additional principle. It also does no work without being supplemented with a definition of governing category. (Despite the convergence of terminology, I suspect the definition in Chomsky (1981) is irrelevant here.) And this alone does not preclude what I will call S-crossing composition (which actually is allowed by Steedman). S-crossing composition would allow a string [α] of category A/RB to combine with a string [β] of category B/LC to give a string [αβ] of category A/LC, and similarly for the reverse. One of the key contributions of this paper is to provide evidence that this does not exist. Note that such a possibility would, for example, allow believe (of category VP/RS) to combine with won the election (of category S/LNP) to give believe won the election of category VP/LNP; and that in turn would (given the assumptions here) allow for sentences like Lee Don believes won the election, meaning ‘Lee believes Don won the election’, obviously an unfortunate result.
It should be noted that – as mentioned above – Steedman does in fact allow S-crossing composition. He nonetheless successfully rules out the above by the assumption that there actually are no expressions of category NP; they all have raised categories. We return to this (Section 5.2), but note that since we are – with Partee and Rooth – assuming that Don is born with category NP, having S-crossing composition would allow for a VP like Don believes won the election, meaning ‘believes Don won the election’. Thus, we rule this out here by the claim that S-crossing composition is not allowed – so the question at hand is why this is not allowed.

Indeed, the important point to note is that ruling out S-crossing composition – and all sorts of other imaginable possibilities – involves no stipulation here. Rather, the set of allowable function compositions follows from the very idea that syntactic categories correspond to functions on strings. Once this is taken seriously, the combinations are just those that are actual function composition; S-crossing composition is not. To show this requires an extra step: What we want to be composing is two functions in <string, string>, but recall that a function corresponding to a category A/B yields a function in <string, string> only after it has taken its first argument (some string). Thus, let F be the function corresponding to some category A/B. For any expression α, F([α]) is a function in <string, string>, and let us call that function F′α. With that definition, we recast the two schemas in (9) into the single schema in (10), where we can see that both the phonological effect of the rule and the resulting category literally are function composition:

As the reader can verify, this schema does not allow for S-crossing composition nor does it allow for many other conceivable possibilities but only for just the two instantiations shown in (9).Footnote 7
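The verification can be carried out mechanically on the string functions themselves (a toy Python sketch; the helper names are mine). Composing two rightward string functions yields exactly the (9a) pattern, while composing the string functions in the would-be S-crossing configuration puts the argument back in its clause-internal position, not to the left of the combined string:

```python
# String functions: a rightward category applied to [alpha] gives
# s -> "alpha s"; a leftward one gives s -> "s alpha".
def f_right(alpha):
    return lambda s: alpha + " " + s

def f_left(alpha):
    return lambda s: s + " " + alpha

compose = lambda f, g: lambda s: f(g(s))

# (9a)-style: [every student] of S/RVP composed with [loves] of VP/RNP
# yields the string function of [every student loves], an S/RNP.
c = compose(f_right("every student"), f_right("loves"))
out1 = c("the course")          # "every student loves the course"

# The crossing configuration: [believe] of VP/RS with [won the
# election] of S/LNP.  Genuine composition of the two functions gives:
crossed = compose(f_right("believe"), f_left("won the election"))
out2 = crossed("Don")           # "believe Don won the election"
```

Note that out2 places the NP between the two pieces; the S-crossing output – the string [believe won the election] taking its NP to the left – is simply not the composition of these two functions, which is the point of the text.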

2.4. Function composition recast as the Geach rule

Function composition is a binary operator: It takes two arguments to yield a third. But, like any binary operator, it can be Curryed to take one argument at a time. The Curryed version of this is found in a lot of CG literature and is known as the Geach rule or Division; I will refer to it here as Geach (but see Humberstone 2005) and notate it as g. Thus, g takes a function h in <a,b> and returns a function in <<c,a>,<c,b>>; more specifically, g(h) = λX<c,a>[λCc[h(X(C))]]. This is just the Curryed version of the function composition operation, such that for any functions h (in <a,b>) and f (in <c,a>), g(h)(f) = h ∘ f. Given this, the schema in (10) could instead be replaced by g in such a way as to ensure that what is allowed in (10) is simply recast to happen in two steps (g followed by application).Footnote 8 I believe that there are reasons to prefer this (see Jacobson 2014, 2025 for some advantages). However, for expository simplicity, we will for the rest of this paper instead adopt direct function composition via the schema in (10); switching to g plus application has no effect on the main points here. By the very definition of g, the only allowable instances are a mapping of an expression of category A/RB to (A/RC)/R(B/RC) or its left-slash analogue. In particular, there is no mapping of A/RB to (A/LC)/R(B/LC) or of A/LB to (A/RC)/L(B/RC), either of which would have the same effect as Steedman’s two S-crossing compositions.
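The equivalence of g-plus-application with direct composition is easy to check mechanically (toy Python sketch, names mine):

```python
# Geach as the Curryed version of composition:
# g(h) = lambda X . lambda c . h(X(c)), so that g(h)(f) == h o f.
def g(h):
    return lambda X: lambda c: h(X(c))

def compose(h, f):
    return lambda x: h(f(x))

# Any pair of functions will do for the check:
h = lambda n: n + 1      # a function in <a,b>
f = lambda n: n * 2      # a function in <c,a>
via_geach  = g(h)(f)(10)        # h(f(10)) = 21
via_direct = compose(h, f)(10)  # likewise 21
```

Since g(h)(f) and h ∘ f are the same function, replacing schema (10) with g followed by application changes nothing substantive, as the text says.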

3. Application to non-constituent conjunction

One of the striking (and well-known) benefits of allowing function composition (or Geach) into the grammar is that RNR is automatic. Consider (11):

Since every psychology student has the category S/RVP and loves is VP/RNP, these can compose to give S/RNP. every linguistics student is the same, and the two can conjoin (here, by two steps, given that and takes its arguments one at a time), and the full expression is S/RNP. It denotes the intersection of the set of things that every psych student loves and the set of things that every ling student hates, and at the end of the day the sentence says that the stats course is in that intersection. Similarly for cases with simple NPs in subject position:

Here these can lift and then compose.Footnote 9 Of course, the idea of having these as constituents is quite appalling to many, but see Jacobson (2014, 2023) and Steedman (2024) for discussion of why such a reaction is unwarranted.
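The string side of the simple-NP-subject derivation just sketched can be traced step by step (toy Python, names mine; I use the conjuncts discussed above, Sandy loves and Lee hates, with a shared object):

```python
# On the string side, composing [alpha] of A/RB with [beta] of B/RC
# yields [alpha beta] of A/RC (schema (9a)); conjunction joins two
# like-category strings; application attaches the shared argument.
def compose_R(alpha, beta):
    return alpha + " " + beta

def conjoin(left, right):
    return left + " and " + right

def apply_R(fn, arg):
    return fn + " " + arg

# Lift Sandy over VPs (S/R(S/LNP)), compose with loves (VP/RNP): S/RNP.
sandy_loves = compose_R("Sandy", "loves")
lee_hates   = compose_R("Lee", "hates")

# Both conjuncts are S/RNP, so they conjoin, and the result takes the
# shared object to its right:
rnr = apply_R(conjoin(sandy_loves, lee_hates), "the stats course")
```

The derivation delivers the RNR string directly, with each conjunct an ordinary constituent of category S/RNP.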

Aside from RNR, there are other kinds of so-called Non-Constituent Coordination that become entirely unsurprising. Consider (13), which Dowty (1988) showed is simple to derive with the use of lifting and function composition:

Both lobster and scallops are NPs but can lift over transitive verbs to be of category VP/L(VP/RNP). We can assume that on Tuesday is a VP modifier (ditto for on Wednesday) and hence of category VP/LVP. lobster and on Tuesday can thus combine by (left) composition to give lobster on Tuesday of category VP/L(VP/RNP). Note, then, that just as lobster is a lifted object, lobster on Tuesday is just a fancier version of that category; both want to take a TV to the left to give a VP. However, in the case in (13), that is not the next step: scallops on Wednesday also has this category, so lobster on Tuesday and scallops on Wednesday can conjoin and then combine with the TV serves to give a VP. We will not work through the semantics here as it simply tracks the syntax (see Dowty 1988 for full details). Thus, independent of the analysis of wh-extraction constructions, having function composition (or Geach) has some welcome consequences.
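The leftward composition step in this Dowty-style derivation can be checked on the string functions (toy Python, names mine):

```python
# Leftward string functions: a leftward category applied to [alpha]
# gives s -> "s alpha".
def f_left(alpha):
    return lambda s: s + " " + alpha

compose = lambda h, f: lambda s: h(f(s))

# lobster, lifted over TVs, is VP/L(VP/RNP); on Tuesday is VP/LVP.
# Left composition yields [lobster on Tuesday], still wanting a TV to
# its left:
chunk = compose(f_left("on Tuesday"), f_left("lobster"))
vp1 = chunk("serves")     # "serves lobster on Tuesday"

# Conjoining the two like-category chunks gives one big leftward
# function, which then takes the TV once:
conjoined = f_left("lobster on Tuesday and scallops on Wednesday")
vp2 = conjoined("serves")
```

Note that leftward composition linearizes the inner function's string before the outer one's, which is exactly why the surface order lobster on Tuesday comes out right.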

4. Unifying GPSG and CG?

We now return to the question posed at the end of Section 1: Should the GPSG slash and the CG slash be the same? Quite famously, Steedman (1987) and many of his subsequent works answer yes, and he provides an elegant account of wh-extraction using all the apparatus we have seen already. Because the account at first glance requires very little new machinery, it is surely the one to be preferred all other things being equal.

Since the case of wh-questions is perhaps the simplest (in terms of the syntax only), we will use that as our example. (Relative clauses involve extra complications due to a fuller set of allowable pied-piped expressions; for a few relevant analyses, see Pollard (1988) using Head Driven Phrase Structure Grammar (HPSG), Szabolcsi (1992) in Combinatory Categorial Grammar (CCG), and Jacobson (2019) in a CG with a variable free semantics, among others.) Take the underlined expression in Lee knows what Martha built. All we need is for a question word like what to be of category Q/R(S/RNP). We know that Martha built denotes the set of things that Martha built and what will map this to the appropriate question meaning. (Obviously the details depend on a full analysis of questions which we will not give here.) Since what can also serve as subject (Lee knows what pleases Martha), we can posit another item of category Q/R(S/LNP) – alternatively perhaps it is just Q/R(S/NP) with no directional specification on the slash. (In the dialect that distinguishes who from whom these would be distinct; in the dialect where they are merged, they would be the same category as what.)
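The string side of this Steedman-style derivation – the analysis under discussion, which the paper will ultimately reject for wh-extraction – can be sketched as follows (toy Python, names mine; the question semantics is deliberately left out, as in the text):

```python
# Rightward string functions, as before.
def f_right(alpha):
    return lambda s: alpha + " " + s

compose = lambda h, f: lambda s: h(f(s))

# [Martha built] as S/RNP: lifted Martha (S/RVP) composed with
# built (VP/RNP).  Applied to an object string, it gives a sentence:
martha_built = compose(f_right("Martha"), f_right("built"))
check = martha_built("a boat")        # "Martha built a boat"

# what, of category Q/R(S/RNP), takes the S/RNP string to its right:
what = f_right("what")
question = what("Martha built")       # "what Martha built"
```

The syntax thus comes out with no gap-specific machinery at all, which is exactly the initial appeal of unifying the two slashes; the remainder of the paper argues that this appeal does not survive closer scrutiny.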

Before continuing, it should be noted that it is not quite right to say that ‘nothing extra is needed’. For there is a complication here in that the general word order rules must be overridden; the general rules are such that we expect the category of what to be (only) Q/R(S/LNP). The possibility of what combining with S/RNP is unexpected. Even if we revise the category to say that the argument of what just has no specified directionality on S/NP, that too requires overriding the word order rules. There are other cases of a lexical item being listed so as to override the default (e.g., the postposition ago), so one might not consider this a deadly problem for an otherwise elegant hypothesis that function composition can get us what is needed for wh-extraction (once suitably extended to relative clauses and topicalization). But there are enough wh words that we probably do not want to make this analogous to the case of ago, so there remain open questions under this view of the category of question words, which is necessitated by having it take as argument something of the form S/NP rather than S with a different kind of slash. While this is arguably a minor problem, there are other reasons to distinguish these, for there are known differences between wh-extractions and RNR. We turn first to these in Section 5, and then Section 6 develops a new case.

5. Marrying CG and GPSG

5.1. Rightward extraction (RNR) vs. wh-extraction: Some known differences

As noted above, there are well-known differences between rightward extraction (as in e.g. RNR) and wh-extraction, a few of which we review below. (In dealing with rightward extraction, we for the most part restrict discussion to cases of RNR, though see (15b), which – if it were good – would go under the rubric of Heavy NP Shift. We will say little about the latter here, as it requires a more thorough discussion of three-place verbs than space allows; see footnote 10.) These differences led Oehrle (1990) – working within a CG framework – to adopt a different account of wh-extraction from the account well known from the works of Steedman. Borrowing heavily from GPSG, he suggested another recursive category A|B (I have modified his notation slightly), where | is not the same as the ordinary CG slash but encodes what we are used to thinking of informally as an extraction gap. (Of course, the notion of an extraction gap is not something that has any real theoretical status in the theory assumed here but is just used to pick out a group of constructions that often go under that rubric. The actual role of this in the theory is completely determined by what sorts of items select for something of category A|B.) Note that Jacobson (1999) suggests a similar strategy for pronouns, where she uses A^B to denote an expression of category A but with an unbound proform of category B within it; we might see the | and the superscript as closely related. Any expression of category A/B, A|B, or A^B has a meaning of type <b,a>. An A/B is something expecting an actual B argument in the syntax, an A^B encodes (in general) that the expression has within it a proform of category B, and an A|B encodes a missing B. (This is informal: In GKPS, this is cashed out by having the missing B actually be a silent item of that category – much like a trace.
We could adopt this but will not; here, by a missing B, we just mean a case where an item of category A/B (expecting a B in the syntax) maps to one of category A|B and will never find a B in the syntax.)

Following the exposition of SLASH given in Section 1, we posit that the top of a wh-extraction construction involves some item which subcategorizes for A|B (usually S|X, for X some set of categories). A question word like what asks for S|NP, and for Topicalization we might suppose that there is a rule (not a general rule but an additional rule stated for English) by which an S|X can map to S/LX (or perhaps S[TOP]/LX), for X some set of categories – i.e. just those that can topicalize. For the passing of |, we assume that any item of category A/B can map to (A|C)/(B|C), with the semantics of the Geach rule. This involves adding very little extra to the system that was developed for pronouns in Jacobson (1999), whereby A/B can map to A^C/B^C. Finally, for the bottom, we can assume that any item of category A/B can map to A|B (with no meaning change). Note that we have not here made any restriction on the directional feature on the slash, allowing an A/LB to map to (A|C)/L(B|C), which in turn allows for a gap to be on a left branch. The reason for this will be clear below, and we revisit this in Section 6.4. (Whatever analysis one gives for and will involve cases of gaps only on the left branch.) Finally, we follow Jacobson (2014) in defining the rules recursively. That system allows an (A/B)/C to combine with a B to give A/C and an (A/B)^C to combine with a B to give A^C, and we extend this to allow an (A/B)|C to combine with a B to give A|C. (Some of this can be circumvented by the use of Lift, but many derivations are simplified by this recursive definition.)
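As a concrete illustration, the three pieces of the system just described (the bottom rule, the Geach-style passing rule, and the top) can be sketched in a few lines of Python. This is a toy of my own, not part of the formalism: the class and function names are invented, word order is not checked, and the semantics is omitted.

```python
from dataclasses import dataclass
from typing import Union

# A category is an atom ('S', 'NP', ...), a slash category A/B, or a gap category A|B.
Cat = Union[str, "Slash", "Bar"]

@dataclass(frozen=True)
class Slash:          # A/B: expects an actual B argument in the syntax
    res: "Cat"
    dirn: str         # 'L' or 'R'
    arg: "Cat"

@dataclass(frozen=True)
class Bar:            # A|B: an A with a missing B somewhere inside
    res: "Cat"
    arg: "Cat"

def gap_bottom(c):
    """Bottom of the dependency: A/B maps to A|B (the B will never appear)."""
    assert isinstance(c, Slash)
    return Bar(c.res, c.arg)

def gap_pass(c, gap):
    """Geach-style passing: A/B maps to (A|C)/(B|C)."""
    assert isinstance(c, Slash)
    return Slash(Bar(c.res, gap), c.dirn, Bar(c.arg, gap))

def combine(f, a):
    """Application, extended recursively so that (A/B)|C plus B gives A|C.
    (Directionality is not checked in this toy.)"""
    if isinstance(f, Slash) and f.arg == a:
        return f.res
    if isinstance(f, Bar) and isinstance(f.res, Slash) and f.res.arg == a:
        return Bar(f.res.res, f.arg)
    raise TypeError("cannot combine")

built = Slash(Slash("S", "L", "NP"), "R", "NP")    # (S/LNP)/RNP
built_gap = gap_bottom(built)                      # (S/LNP)|NP: the object never appears
martha_built = combine(built_gap, "NP")            # recursive clause: S|NP
what = Slash("Q", "R", Bar("S", "NP"))             # what subcategorizes for S|NP
print(combine(what, martha_built))                 # 'Q'

said = Slash(Slash("S", "L", "NP"), "R", "S")      # VP/RS
said_g = gap_pass(said, "NP")                      # (VP|NP)/R(S|NP)
said_martha_built = combine(said_g, martha_built)  # VP|NP: the gap is passed upward
```

Here built becomes (S/LNP)|NP by the bottom rule, combines with its subject by the recursive clause to give S|NP, and what then selects that S|NP; the | is passed upward through said by the Geach-style rule.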

As to some differences between wh-extraction and RNR, McCawley (1982) notes that RNR is impervious to most island effects while of course wh-extraction is not, as in (14):

This, however, is not entirely conclusive since – if island effects are about processing rather than a grammatical principle (see Newmeyer 2016 for an overview) – the difference could well reduce to what happens when processing the filler comes first (ordinary wh-extraction) vs. processing the filler at the end. But much stronger evidence is found in McCloskey (1986), who shows that RNR cases allow preposition stranding in Irish, whereas wh-extraction cases do not.

There is another obvious difference between wh-extraction and RNR (and cases that might look like Heavy NP Shift). The difference in question concerns embedded subject extraction. The basic empirical facts are these: Embedded subject gaps are not allowed in rightward cases but are in wh cases. It is helpful to provide a context that gives the bad cases every chance of being good:

Since one referee was dubious about the facts, I did what is admittedly only an informal survey, but it gives at least a preliminary bit of data beyond one particular referee’s judgments (or just my own). There is no claim here to being an experiment in the sense in which that term is often used in much modern work (but see Jacobson 2018 for the position that that sense is too narrow an interpretation of the experimental method). However, besides going beyond a single informant, the informal survey here has the advantage of carefully controlling for context, something that is often missing in large-scale informant judgment studies. Thus, I checked the very cases above, with the scenario given above, with six consultants (all linguists), and all gave the lowest possible rating to (a) and the highest rating to (c). Three did find a slight improvement in (b) (as do I); I will not give an account of this here but will assume that both (a) and (b) are bad, though I suspect it is the added length and complexity that allow for a more charitable judgment of (b). The claim that Heavy NP Shift is impossible from the subject of a tensed clause is also documented in the literature; see e.g. Postal (1974), Bach (1977), and footnote 11.

Note that the ungrammaticality of (15a, b) is not actually surprising in any system using function composition for rightward extraction (including Heavy NP Shift), because no proposed function composition will allow believes to compose with won to give a VP/RNP. The question then becomes not why rightward extraction is bad, but why leftward extraction as in (15c) (or, for that matter, the simpler case of which student Professor Carberry said was terrific) is good. Here the answer is that this involves | and not composition, but what happens in a system with mixed composition?

5.2. A digression on Steedman’s explanation

Steedman (2024) has an interesting analysis of (15c). Since his system allows S-crossing composition, said can indeed compose with was terrific, giving the category VP/LNP. This can then combine by another S-crossing composition with Professor Carberry of category S/RVP, which yields Professor Carberry said was terrific of category S/LNP (i.e. an ordinary VP). Similarly for Professor Glazie said was terrible, and the two can conjoin by the ordinary conjunction of likes. And we have seen independently that question phrases like which candidate can take VPs as well as S/RNPs to give questions, which is what happens here. So, the goodness of leftward extraction in (15c) is quite unremarkable.

But it raises a new question. This analysis makes use of assuming that said was terrific is a VP/LNP. But then, what keeps this expression from combining with an NP like Sally to its left, to give Sally said was terrific meaning ‘said that Sally was terrific’? Steedman’s answer is that Sally is not an NP. More generally, with Montague (1973), he assumes that there are no actual expressions of category NP even though NP is one of the primitive categories in the grammar. Similarly, while individuals are presumably in the ontology (since there are functions of type <e,t>), there are no linguistic expressions that denote individuals. Rather, what we may think of as expressions of category NP are actually all type raised, where the raising happens via Case Marking (all NPs need case). NP is a schema over raised categories that includes S/R(S/LNP) for subject NPs, (S/LNP)/L((S/LNP)/RNP) for object NPs, etc. (so none of these denote individuals but have more complex meanings). Moreover – slightly oversimplifying – the allowable categories are just those that are raised over lexical categories, from which it follows that there is no item of category VP/R(VP/LNP), because there are no lexical items of category VP/LNP. There is, then, no way to derive the VP Sally said was terrific with the meaning ‘said that Sally was terrific’.

Aside from the oddity, noted in Partee and Rooth (1983), of having individuals in the ontology but no linguistic expressions with those denotations, and of having NP as a grammatical category but no expressions with this category, this has problems. First, one needs to ensure that the directionality of the main slash in each of these raised categories is as stated. That is, subject NPs must be S/R(S/LNP) – a completely unexpected category in view of the basic word order generalizations of English, where the last argument that is supplied to give S goes to the left. And object NPs must take the TV to the left – not to the right, as is the general default. Steedman (personal communication) suggests that this follows from various principles, such as the Principle of Consistency, all of which taken together ensure that there are no combinatory rules that override the possible orders specified by the lexical categories. But the existence of the Principle of Consistency itself does not follow from anything; there is no obvious reason why it should not be the case that some combinatory rules introduce new order possibilities. As such, this is a stipulation. (It is reminiscent of early transformational grammar adding a stipulation ensuring that all non-root transformational rules preserve structure; see Emonds (1976).) Note that it might appear that the view here also has a similar problem if one assumes that e.g. everyone is listed in the lexicon with the unexpected category S/R(S/LNP). But Jacobson (2014, 2025) proposes that it can instead simply be listed with category Lift(NP); see those works for detail. Second, an additional category (not given by Steedman’s generalized schema) needs to be posited for the RNR cases, as the object in RNR cases has to be S/L(S/RNP).
This is a valid category if derived from lifting NPs but not a valid instantiation of Steedman’s schema, since there are no lexical items of category S/RNP; see footnote 12. The third problem comes from the fact that other categories besides NPs can be subjects. For example, we get CP subjects in That Don lost is surprising. There is reason to believe that CPs might – like NPs – denote individuals in some sense (i.e. what one might take as an individual correlate of a proposition), but there is evidence that they are not the same syntactic category. See e.g. Rosenbaum (1967) and Grimshaw (1982) for important distributional differences. Hence is surprising is presumably of category S/LCP (as well as S/LNP), so again we would expect the two to compose to give believes is surprising of category VP/LCP. This would allow that Don lost believes is surprising as a VP, and ultimately we should get Lee that Don lost believes is surprising meaning ‘Lee believes (that) that Don lost is surprising’. Extending the explanation for the badness of the parallel case with an NP subject would, I believe, require the view that there are also no CPs that enter into the composition; all CPs would have to be type raised via case marking. But this is highly unlikely, given that we generally do not find case marking on CPs.

5.3. Back to the analysis here: | is not /

If, on the other hand, | and / are different and we take the view of syntactic categories advocated here, it is easy to allow embedded subject extraction in wh constructions but not in RNR, while at the same time ruling out the bad cases above. The fact that Professor Carberry said was terrible in (15a, b) cannot be an S/RNP follows here exactly as in Steedman’s account (no function composition will allow for this). But our explanation for (15c) does not rely on a stage in the derivation in which said was terrible is a VP/LNP, and we can maintain that that student is an ordinary NP. The fact that (15c) is good follows from the conventions for passing |, which are different from function composition. And the impossibility of e.g. Professor Carberry the butterfly researcher said was terrible (meaning ‘Carberry said that the butterfly researcher was terrible’) follows because there is no S-crossing composition. We now turn to new evidence for separating the extraction slash from the ordinary argument-selecting slash of CG.
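The absence of S-crossing composition can also be modeled directly. The following is my own toy sketch, not the paper’s formalism (the helper names and whitespace-joined strings are invented): each directional category corresponds to a function on strings, and a composition is legitimate only if the composed function still places its argument wholly on one side of a fixed string.

```python
def rfn(s):
    """String function for an expression s of category A/RB."""
    return lambda arg: s + " " + arg

def lfn(s):
    """String function for an expression s of category A/LB."""
    return lambda arg: arg + " " + s

def compose(f, g):
    """Literal function composition on string functions."""
    return lambda arg: f(g(arg))

def side(fn, probe="PROBE"):
    """On which side does the argument surface, if either?"""
    out = fn(probe)
    if out.endswith(" " + probe):
        return "R"
    if out.startswith(probe + " "):
        return "L"
    return None  # argument ends up string-internal: no category corresponds to this

# Harmonic composition: lifted subject (S/R(S/LNP)) with built ((S/LNP)/RNP)
martha_built = compose(rfn("Martha"), rfn("built"))
print(martha_built("the house"))   # 'Martha built the house'
print(side(martha_built))          # 'R': a genuine S/RNP, the RNR remnant

# S-crossing composition: said (VP/RS) with won (S/LNP)
said_won = compose(rfn("said"), lfn("won"))
print(said_won("Don"))             # 'said Don won'
print(side(said_won))              # None
```

The composed function for said won is perfectly coherent as a function, but since the argument surfaces inside the string, there is no fixed string s with said won applied to Don equal to s + ‘Don’ or ‘Don’ + s; this is the sense in which no category, and hence no S-crossing composition, exists on this view.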

6. Background

6.1. The so-called Coordinate Structure Constraint and Lakoff chains

One of the initially exciting results in GPSG was the appearance that Coordinate Structure Constraint (CSC) effects followed immediately from the fact that violations would involve coordination of unlikes; see Gazdar (1979, 1981). But a closer examination – especially of the way in which the SLASH passing conventions were developed in GKPS, combined with their analysis of and – revealed that this was not really so; nothing would stop the passing of SLASH onto just one conjunct (see footnote 13). Without going into the details of their actual analysis and solution, we can note that in a CG analysis there is also nothing to predict CSC effects. This holds regardless of whether we are looking at function composition or at the passing of |. The latter is governed by a rule mapping any A/B to (A|C)/(B|C), and this should allow | to occur on just one conjunct, as shown in (16). In (16a) | occurs on the left conjunct only and in (16b) on the right conjunct only. (The case where it is on both is just coordination of likes; thus, the analyses of ATB extraction in GKPS and in earlier works within GPSG carry over directly here.) Note that in (16b), the expression of category (VP/LVP)|NP combines directly with a VP to its left to give VP|NP by the recursive specification of the rules noted above.

The situation with respect to function composition (and hence, in the worldview here, the situation with rightward extraction) is more complex and will be central to the punchline of this paper, so we will postpone the full discussion. But if we take the view (with Steedman) that the wh case involves function composition, then it should still at least be possible to find cases of a gap on the right conjunct and not on the left, as in (17); this also assumes free lift, whereby the right conjunct VP lifts to VP/R(VP/LVP), as shown in (17). Note that ∘ in (17) shows where function composition has occurred in the combinatorics.

If and takes its arguments one at a time, then the case of and is really no different from the situation with think. Just as think allows wh-extraction from its right argument (e.g. a CP) only, so should and, regardless of whether wh-extraction involves / (passed by function composition) or | (passed by feature passing).

And indeed, the result that there should be no Coordinate Structure effects in wh-extractions is actually a welcome one, as it has often been argued that CSC effects seem to be about information structure and/or discourse coherence rather than a syntactic principle; see e.g. Lakoff (1986), Kehler (1996). Standard counterexamples in the literature include (18) (we use __ to indicate gaps, without any theoretical commitment to how gaps are represented in the grammar); see, for example, Ross (1967) and Goldsmith (1985). Note that these examples show that in wh-extraction we can find a gap in the right but not left conjunct, as in (18a), as well as in the left but not right conjunct, as in (18b). As shown in (16), this is exactly as we would expect if wh-extraction involved the passing of |. We have not yet fully shown what happens if, instead, it involves function composition, but (17) shows that at least examples like (18a) (with a gap in the right conjunct only) are expected:

It is worth noting that we find cases like these not just with VPs but also with full Ss:

A common reaction to examples like these is to posit that these involve a different and – a subordinating rather than a coordinating one. First, though, if it is correct that and combines with its arguments one at a time, it is always subordinating, so it is not clear what that could mean. Second, we would need to also posit a second or:

But even more striking are data discovered by Lakoff (1986), which he used to argue for a non-syntactic account of CSC effects. His precise argument is not exactly the one I will give below, but his data provide intriguing new evidence against positing a second and and or. And this is that we also find CSC violations in chains involving a silent and or a silent or:

The reason I take these to be strong evidence that we do not want a syntactic CSC (with a different and and or in the counterexamples) is that it seems even more suspicious to posit yet another and which is silent (and has all the same properties as and except its phonology), as well as another such item or. Although, as noted above, Lakoff’s own argument based on these examples was slightly different (and he did not deal with or), I nonetheless refer to these as Lakoff chains. By this I mean any chain with at least one silent and or or and where some of the conjuncts/disjuncts have gaps and some do not. (I use the term gap liberally, as Jacobson (2025) extends this also to cases of what one might call Left Node Raising; see Section 7.)

6.2. RNR and Lakoff chains

Leaving aside for the moment the correct treatment of wh-extraction, (17) above shows that we expect to be able to find a gap in the right conjunct and not the left in RNR as well, since this would involve ordinary function composition. It would be difficult to show this for the case of just two ordinary conjoined VPs (or Ss), but Lakoff chains provide just the evidence we need (see also Chaves 2014 for relevant discussion). Indeed, (23) bears out the prediction that we can find perfectly good cases of RNR with a gap in the rightmost conjunct but not in all of the others:

Here we know that this is a matter of RNR rather than, e.g., 2 six packs of beer in (23a) being an ordinary object of drink, because there is also a gap following bought. Thus 2 six packs of beer combines with the complex TV (VP/RNP) went to the store, bought, bicycled home, and then proceeded to drink.

We find similar effects with full Ss, as in (24), and also similar effects with or as in (25):

6.3. But wh-extraction and RNR differ

As shown in (21) and (22), the wh-extraction cases allow for a complete mix and match in terms of where there is a gap and where not. The rightmost conjunct/disjunct can contain a gap or not, and similarly for the leftmost one, and all the ones in between can be of either variety. But the view here predicts an interesting constraint on the rightward-looking Lakoff chains. Recall that we are claiming there is no S-crossing composition, which follows from the view that syntactic categories correspond to functions on strings. Further, we are claiming here that RNR is an automatic consequence of having function composition, and we have also adopted a free lift operation. With this – as shown earlier – it follows that we can get things like bicycled home and proceeded to drink __ as in (23a) (this of course is part of the larger Lakoff chain); we illustrate this again in (26) (note that what is notated here as and/and is intended to mean that either of those items can occur in that position):

Here the chain is launched by the lowest conjunct missing its object; i.e. it is of category VP/RNP. From there, this can combine with another full VP (which type lifts as in the above case), which would give us something like found his car, bicycled home, and proceeded to drink __, or it can combine with another VP/RNP by the ordinary conjunction of likes (to give e.g. bought __, bicycled home, and proceeded to drink __). And ultimately any of these chains will find an NP on the right, to give what we are calling the RNR Lakoff chain.

But to launch a chain with the full VP on the right and one with a gap on the left, we would need S-crossing composition. We show one way to try to derive such a case below (the use of * here is to indicate an illegitimate composition):

Of course, this is just one bad derivation and as such does not prove that there is no possible way to get this. But we simply invite the reader to try other possibilities; all require S-crossing composition (or some other illegitimate composition) somewhere.

In sum, the prediction is that a Lakoff chain in an RNR construction has to contain as its rightmost (lowest) expression something with a gap. Once such an expression launches the chain, all further VPs on the left can have a gap or not. And indeed, the prediction constraining the rightmost expression is borne out. (Note that (28a) is not a Lakoff chain but involves just a simple coordination – were it to be good, we would probably label it Heavy NP Shift rather than RNR, but the names are irrelevant. Importantly, the analysis here predicts that (28a) should be bad.)
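This prediction can be given a quick string-function sanity check. The following is my own toy illustration (the helper name and example strings are invented, and no real derivation is modeled): a rightward-looking chain remains a legitimate VP/RNP only if the gap sits in the rightmost conjunct.

```python
def is_vp_slash_np(fn, probe="PROBE"):
    """A function on strings can be a VP/RNP only if the argument
    surfaces as a suffix: fn(arg) == s + ' ' + arg for some fixed s."""
    return fn(probe).endswith(" " + probe)

# Gap in the rightmost conjunct: the chain launches as a VP/RNP,
# and adding a gapless conjunct on the left preserves that category.
drink_gap = lambda arg: "proceeded to drink " + arg
chain = lambda arg: "bicycled home and " + drink_gap(arg)

# Gap only in a left conjunct, rightmost conjunct gapless: the argument
# would have to surface chain-internally, so no category corresponds to this.
bad = lambda arg: "bought " + arg + " and bicycled home"

print(is_vp_slash_np(chain))   # True
print(is_vp_slash_np(bad))     # False
```

Further gapless or gapped conjuncts added on the left preserve the suffix property, but no arrangement with the gap only on the left does; this mirrors the contrast between the good and bad Lakoff chains above.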

All the corresponding wh cases are good. We can construct the same contrasts with S-level conjoined cases (note the contrast between (29b, c), where the latter has a gap in the rightmost conjunct):

6.4. Left branch effects more generally

Assuming that and takes its arguments one at a time, the cases where there is a gap in the left conjunct and not in the right conjunct – whether it be in wh-extraction or in RNR – show that the grammar also cannot contain anything like the Left Branch Constraint. We do, however, see in the Lakoff chain RNR cases a limited kind of Left Branch effect: It is impossible to have a gap in a left conjunct unless there is one also on the right (possibly further down). This follows for the rightward looking cases from the impossibility of S-crossing composition.

Note that the case of gap passing in coordinated structures is but a special case of gap passing in anything of the form (A/LB)/RC. So, we make a more general prediction beyond the case of coordination: We should find Left Branch effects in rightward extraction but not in leftward. Of course, there are many questionable cases of a gap on a left branch in wh-extraction, as in (30), which is certainly at least degraded:

But again, this could conceivably have to do with information structure, since we know that subject position tends to be old, given or presupposed information. But there is a very robust contrast between (30) and the corresponding right extraction case, a contrast which is predicted here:

This is as predicted; the reader can confirm that (31) would have to involve S-crossing function composition, whereas (30) should be good (without additional principles) given the | passing conventions. This is exactly parallel to the case with and.

One might have the reaction here that of course (31) is bad because Heavy NP Shift targets only items within the VP. This is correct, but why? I actually assume that run-of-the-mill Heavy NP Shift cases (involving three-place verbs, as in give to Mitka a delicious bone) are a matter of a verb like give having two different specifications on its directional slashes: (VP/INP)/RPP as in give a bone to Mitka and a second possibility (VP/RNP)/RPP. Hence, run-of-the-mill Heavy NP Shift cases have nothing to do with function composition. But there remains the question as to why (31) cannot be derived using function composition: S-crossing composition would allow just this. We can also embed these in conjuncts simply to avoid any commitment about Heavy NP Shift and turn this into a clear RNR case. As predicted, the same robust contrast emerges:

The (b) cases are simply beyond (at least this author’s) tolerance, no matter how we might try to give them context.

7. Conclusion and looking further

There are two morals to be drawn from the behavior of Lakoff chains. The first is that wh-extraction constructions behave differently from RNR (or rightward-looking extraction more generally), which follows if the latter is a matter of function composition while the former is handled by something akin to the slash of GPSG (rather than the CG slash). The CG slash is exactly what is inherited in function composition. And if we also take function categories in CG to correspond to functions on strings, then right and left composition are allowed because they are function composition, while S-crossing and other kinds of composition rules are not. The inability to have a rightward-looking Lakoff chain whose bottom does not contain a gap thus provides striking confirmation for the lack of S-crossing composition and, in turn, for this view of syntactic categories. We remind the reader that function composition can instead be broken down into the two steps of Geach + application, but the analogous restriction will hold. The passing of | does not show this restriction since, although it too is a special instance of a Geach rule (where A/B maps to (A|C)/(B|C)), the | has nothing to do with directionality (unlike /) and so is not limited in the same way.

Jacobson (2025) demonstrates some even more surprising facts about Lakoff chains that are predicted under the general account here. The first is that we also find cases of what we might think of as Left Node Raising; cases of this sort were first noted in Maxwell and Manning (1996) (though not under that name) but have received almost no discussion since. An example is (33):

These are handled effortlessly in a theory with function composition (or Geach), and their existence is unsurprising under CG. But there is also an interesting constraint on these, which follows from the lack of S-crossing composition. That is, they can only grow rightward. In (33), for example, we have two fancy lifted objects (of the sort discussed in Dowty 1988) followed by a full S. But we cannot have one such object, followed by an S and then another lifted object:

Note that this has a surprising reading in which the amazingly long-lasting rock band recorded grammars as well as funky-butt songs, but it cannot have the intended reading, which is the same as (33). For discussion of the odd reading, see again Jacobson (2025).

We give these examples without analysis, simply as a teaser; space precludes saying anything more about them, and full details are in Jacobson (2025). But these too provide further striking evidence for the view of syntactic categories advocated here and, in turn, for the more general CG worldview. That said, the main point of this paper is that wh-extraction gaps behave somewhat differently, vindicating the intuition clearly held by GKPS that these were not to be handled by the ordinary slash of CG.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/S0022226725100820.

Acknowledgements

For extremely useful and detailed comments, I would like to thank three anonymous referees. I would also like to thank audiences at the Pullumfest in Edinburgh 2023 and at MIT and Yale for helpful discussion of this and related material, as well as Mark Steedman, Bob Levine, and especially Yusuke Kubota for fruitful comments and interesting discussion, which I hope will continue beyond this paper.

Footnotes

1 The generalizations on phrase structure rules include stating some of them via metarules – rules that predict the existence of certain phrase structure rule schemas on the basis of others.

2 I am assuming that three-place verbs, as in give the bone to Mitka, first combine with the rightmost argument (here, the PP) and then with the direct object (here, the bone), at which point the bone is infixed in. This general view was first put forth in terms of movement rather than infixation (and for a limited range of cases) by Chomsky (1957); the CG version using infixation was proposed originally in Bach (1979, 1980), and the general idea was also implemented within GPSG by Jacobson (1987) and again in a movement theory by Larson (1988). This means, then, that there would be an additional slash, which we can notate as I for infixation, and give would have as its category ((S/LNP)/INP)/RPP. We mostly ignore three-place verbs in this paper except for some brief mentions, since working out the full details of infixation and folding it into the system would require a separate paper.

3 This obviously requires an additional rule for single-word adjectives, which generally (though not always) occur prenominally. See Jacobson (2014) for a fuller statement which extends to that case.

4 Note that this is somewhat reminiscent of the ID/LP (Immediate Dominance/Linear Precedence) treatment of GKPS in that word order generalizations can be stated across a swath of categories; it is also reminiscent of that property of X-bar theory. In fact, if all we had was application, one could recast the effect of the slash direction rules as properties of the combinatorics (and have no directionality specified on the slashes). This would involve e.g. a rule saying that any time an X combines with an S/X the X is to the left, and so forth. But as will be argued below, we do not have just function application, and it is not clear at all how the generalizations elucidated below regarding the order-preserving nature of e.g. Lift and function composition could be captured in this way. Indeed, this strategy is at odds with the program here of having the categories be functions in <string, <string, string>>. Without directional features on the slashes, there is no single function corresponding to any given category.

5 I thank (unfortunately, posthumously) Dick Oehrle for pointing this out to me.

6 If we have infix slashes, we also need a corresponding circumfix slash. Thus A/IB is an expression which takes its B argument as an infix; if a B expression lifts over this it will – by the definition of lift – be A/C(A/IB), and will take its A/IB argument as a circumfix. One of the reasons we are not fully working out the logic of I and C slashes here is that these are meaningless unless each string also has a dedicated infixation point and we have conventions for how the infixation point is inherited when two strings combine.

7 Again there are interesting questions if there are infixation slashes. It is conceivable that, for example, believes of category VP/RS could compose with won of category S/LNP, where the result would be the string believes won of category VP/INP. Whether this is possible depends on the precise conventions regarding infixation points and on how these are inherited, but if this is allowed it will indeed be function composition only if the subsequent combining of believes won with Don yields believes Don won, which is of course a perfectly innocent result.

8 This means that the formalization of g is as follows:

Given an expression α of the form <[α], A/B, [[α]]>, there is an expression β of the form <[α], (A/C)/(B/C), [[β]]>, where the Cat of β is that Cat whose associated function F is such that for any string [γ] in B/C (for any C), F-β([γ]) = F-α ∘ F-γ, and for any γ in B/C, [[β]]([[γ]]) = [[α]] ∘ [[γ]].
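A sketch of what this definition does (illustrative only; the lexical items and categories are assumed for the example): g leaves the string of α intact, but the string function and the meaning of the result each compose with those of the argument:

```python
def compose(f, g):
    """(f o g)(x) = f(g(x)); used here for string functions, and the same
    operation would serve for the meanings [[alpha]] o [[gamma]]."""
    return lambda x: f(g(x))

# alpha = 'believes' of VP/RS: its string function prefixes 'believes'.
F_alpha = lambda s: 'believes ' + s
# gamma = 'Kim likes' of S/RNP: its string function prefixes 'Kim likes'.
F_gamma = lambda np: 'Kim likes ' + np

# g maps alpha to beta such that applying beta to gamma gives the
# composition of the two string functions, per the definition above:
F_result = compose(F_alpha, F_gamma)   # string function of a VP/RNP
F_result('Mitka')                      # 'believes Kim likes Mitka'
```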

9 Jacobson (2014) defines all rules recursively, which allows for an additional derivation that does not require lifting Lee and Sandy; these can combine directly with loves. This provides derivations with fewer steps. The addition of such a derivation has no effect on the remarks here.

10 It was hypothesized in fn. 3 that three-place verbs, such as give (as in give the bone to Mitka), are of category (VP/INP)/RPP. If correct, Heavy NP Shift is a simple matter of letting all such verbs have a second category (VP/RNP)/RPP (or whatever category the non-direct-object complement is). It remains to explain the constraints on Heavy NP Shift to the effect that the rightmost NP needs to be heavy or have a special prosodic and/or semantic prominence, but that is a matter which any theory needs to deal with. Another well-known difficulty is that Heavy NP Shift is not allowed in the case of verbs taking two NP objects: give the very disobedient dog a bone has no variant *give a bone the very disobedient dog. We have nothing to say about this here.

11 Of course, corresponding cases with an infinitive are good as in:

12 Steedman (personal communication) claims that it is actually advantageous to have the addition of that category be just a language-particular stipulation rather than coming from the general system, as there are other subject–verb–object languages, such as French and Chinese, that lack RNR. If French and Chinese were exactly like English in every other respect, this indeed would be worrisome for the account here. But given that there are many other differences, I think it is an open question whether the account here is jeopardized, though I acknowledge that this needs further investigation.

13 See Pollard and Sag (1994: 201) for relevant discussion of this fact within HPSG. Although they give a possible syntactic explanation, they also indicate that their sympathies lie with the view put forth in, among others, Lakoff (1986), that these effects are due to information structure. This is the view that will be taken here as well.

References

Bach, Emmon. 1977. [Review of the book On raising: One rule of English grammar and its theoretical implications, by Paul Postal]. Language 53.7, 621–654.
Bach, Emmon. 1979. Control in Montague grammar. Linguistic Inquiry 10.4, 515–531.
Bach, Emmon. 1980. In defense of passive. Linguistics and Philosophy 3.3, 297–341.
Chaves, Rui. 2014. On the disunity of Right Node Raising phenomena: Extraposition, ellipsis, and deletion. Language 90.4, 834–886.
Chomsky, Noam. 1957. Syntactic structures. The Hague: Mouton.
Chomsky, Noam. 1970. Remarks on nominalization. In Jacobs, Roderick & Rosenbaum, Peter (eds.), Readings in English transformational grammar, 184–221. Waltham: Ginn.
Chomsky, Noam. 1981. Lectures on government and binding. Dordrecht: Foris.
Dowty, David. 1985. On recent analyses of the semantics of control. Linguistics and Philosophy 8.3, 291–331.
Dowty, David. 1988. Type raising, functional composition, and non-constituent conjunction. In Oehrle, Richard, Bach, Emmon & Wheeler, Deirdre (eds.), Categorial grammars and natural language structure, 153–197. Dordrecht: Kluwer.
Emonds, Joseph. 1976. A transformational approach to English syntax: Root, structure preserving and local transformations. New York: Academic Press.
Gazdar, Gerald. 1979. English as a context free language. Ms., University of Sussex.
Gazdar, Gerald. 1981. Unbounded dependencies and coordinate structure. Linguistic Inquiry 12.2, 155–184.
Gazdar, Gerald, Klein, Ewan, Pullum, Geoffrey & Sag, Ivan. 1982. Coordination and unbounded dependencies (Stanford University Working Papers in Linguistics, vol. 2: Developments in Generalized Phrase Structure Grammar), 38–71. Bloomington: Indiana University Linguistics Club.
Gazdar, Gerald, Klein, Ewan, Pullum, Geoffrey & Sag, Ivan. 1984. Foot features and parasitic gaps. In de Geest, Wim & Putseys, Yvan (eds.), Sentential complementation, 83–94. Dordrecht: Foris.
Gazdar, Gerald, Klein, Ewan, Pullum, Geoffrey & Sag, Ivan. 1985. Generalized phrase structure grammar. Cambridge, MA: Harvard University Press.
Goldsmith, John. 1985. A principled exception to the coordinate structure constraint. In Eilfort, William H., Kroeber, Paul D. & Peterson, Karen L. (eds.), Papers from the general session of the twenty-first regional meeting of the Chicago Linguistic Society, 133–143. Chicago: Chicago Linguistic Society.
Grimshaw, Jane. 1982. Subcategorization and grammatical relations. In Zaenen, Annie (ed.), Subjects and other subjects, 33–55. Bloomington: Indiana University Linguistics Club.
Humberstone, Lloyd. 2005. Geach's categorial grammar. Linguistics and Philosophy 28.3, 281–317.
Jacobson, Pauline. 1987. Phrase structure, grammatical relations, and discontinuous constituency. In Huck, Geoffrey & Ojeda, Almerindo (eds.), Discontinuous constituency (Syntax and Semantics 20), 27–69. New York: Academic Press.
Jacobson, Pauline. 1990. Raising as function composition. Linguistics and Philosophy 13.4, 423–475.
Jacobson, Pauline. 1999. Towards a variable-free semantics. Linguistics and Philosophy 22.2, 117–185.
Jacobson, Pauline. 2014. Compositional semantics: An introduction to the syntax/semantics interface. Oxford: Oxford University Press.
Jacobson, Pauline. 2018. What is (or, for that matter, is not) experimental semantics. In Ball, Derek & Rabern, Brian (eds.), The science of meaning. Oxford: Oxford University Press.
Jacobson, Pauline. 2019. Deconstructing reconstruction. In Krifka, Manfred & Schenner, Matthias (eds.), Reconstruction effects in relative clauses (Studia Grammatica), 303–356. Berlin: De Gruyter.
Jacobson, Pauline. 2023. Losing sight of the forest through the trees. In Chris Collins, Ordinary Working Grammarian (blog), June 2023. https://ordinaryworkinggrammarian.blogspot.com/2023/06/guest-blog-post-by-pauline-jacobson-on.html
Jacobson, Pauline. 2025. A categorial grammar view of grammatical categories: Evidence from coordination. Linguistic Inquiry. https://doi.org/10.1162/ling_a_00554
Kehler, Andrew. 1996. Coherence and the coordinate structure constraint. In Johnson, Jan, Juge, Matthew L. & Moxey, Jeri L. (eds.), Proceedings of the twenty-second annual meeting of the Berkeley Linguistic Society: General session and parasession on the role of learnability in grammatical theory, 220–231. Linguistic Society of America: eLanguage. https://journals.linguisticsociety.org/proceedings/index.php/BLS/article/view/1329/1113
Kubota, Yusuke & Levine, Robert D. 2020. Type-logical syntax. Cambridge, MA: MIT Press.
Lakoff, George. 1986. Frame semantic control of the coordinate structure constraint. Proceedings of the 22nd annual meeting of the Chicago Linguistic Society, 152–167. Chicago: Chicago Linguistic Society.
Larson, Richard. 1988. On the double object construction. Linguistic Inquiry 19.3, 335–391.
Maxwell, John T. & Manning, Christopher D. 1996. A theory of non-constituent coordination based on finite state rules. In Butt, Miriam & King, Tracy Holloway (eds.), Proceedings of the First LFG Conference. Grenoble: CSLI Publications.
McCawley, James. 1982. Parentheticals and discontinuous constituent structure. Linguistic Inquiry 13.1, 91–106.
McCloskey, James. 1986. Right node raising and preposition stranding. Linguistic Inquiry 17.1, 183–186.
Montague, Richard. 1973. The proper treatment of quantification in ordinary English. In Suppes, Patrick, Moravcsik, Julius & Hintikka, Jaakko (eds.), Approaches to natural language, 221–242. Dordrecht: Reidel.
Montalbetti, Mario. 1984. After binding: The interpretation of pronouns. Ph.D. dissertation, MIT.
Munn, Alan. 1993. Topics in the syntax and semantics of coordinate structures. Ph.D. dissertation, University of Maryland.
Newmeyer, Frederick. 2016. Nonsyntactic explanations of island constraints. Annual Review of Linguistics 2.1, 187–210.
Oehrle, Richard. 1990. Categorial frameworks, coordination, and extraction. In Halpern, Aaron L. (ed.), Proceedings of the Ninth West Coast Conference on Formal Linguistics, 411–426. Stanford, CA: Center for the Study of Language and Information, Leland Stanford Junior University.
Partee, Barbara & Rooth, Mats. 1983. Generalized conjunction and type ambiguity. In Bäuerle, Rainer, Schwarze, Christoph & von Stechow, Arnim (eds.), Meaning, use and interpretation of language, 361–383. Berlin: De Gruyter.
Pollard, Carl. 1988. Categorial grammar and phrase structure grammar: An excursion on the syntax-semantics frontier. In Oehrle, Richard, Bach, Emmon & Wheeler, Deirdre (eds.), Categorial grammars and natural language structures, 391–415. Dordrecht: Reidel.
Pollard, Carl & Sag, Ivan. 1994. Head driven phrase structure grammar. Chicago: University of Chicago Press.
Postal, Paul. 1974. On raising: One rule of English grammar and its theoretical implications. Cambridge, MA: MIT Press.
Rosenbaum, Peter. 1967. The grammar of the English predicate complement construction. Cambridge, MA: MIT Press.
Ross, John. 1967. Constraints on variables in syntax. Ph.D. dissertation, MIT.
Steedman, Mark. 1987. Combinatory grammars and parasitic gaps. Natural Language and Linguistic Theory 5.3, 403–439.
Steedman, Mark. 1996. Surface structure and interpretation. Cambridge, MA: MIT Press.
Steedman, Mark. 2024. On internal merge. Linguistic Inquiry. https://doi.org/10.1162/ling_a_00521
Szabolcsi, Anna. 1992. Combinatory grammar and projection from the lexicon. In Sag, Ivan & Szabolcsi, Anna (eds.), Lexical matters, 241–268. Stanford: CSLI Publications.
Supplementary material

Jacobson supplementary material (File, 261.7 KB).