11.1 What Is Non-Canonical?
Non-canonical syntax is found in a range of social settings in which English is spoken by non-native or second-language speakers. This introduction to Part III considers definitions and key concepts before turning to these social settings and the main sources of non-canonical syntax among these speakers, as well as appropriate methods for analysis.
In the context of language contact and variation, the term ‘non-canonical syntax’ has a number of related interpretations, all of which are relevant for looking at the emergence of non-canonical structures in non-native English varieties. The original use of the term refers to a range of phenomena within a given language – for example, within English – that involve divergence from typical or default syntactic structures. These are usually information-structural deviations from a default word order (Birner & Ward 1998; Ward & Birner 2004). ‘Non-canonical’ in this earlier sense can thus mean information-structural syntactic reorganisation or more generally any less conventional or innovative syntactic usage (Birner & Ward 1998; Huddleston & Pullum 2002).
The latter, more generalised characterisation accommodates the second common meaning according to which the term refers to non-standard or atypical systems at a more general level, for example, whole social registers or varieties. The ‘canon’ in these cases is not the set of unmarked structures within a variety, but often a variety as a whole. In this case, a standard (typically native) variety like Standard British or American English embodies the ‘canonical’ usage, which is distinguished from ‘non-canonical’ usage in other varieties of the language.
In recent work (Lange 2012; Leuckert 2019), a third related use has become more prominent, namely non-canonical usage in new, emerging, and contact varieties of a language. This use is slightly distinct from the second use above. The second use tends to be associated with prescriptive judgments of incorrectness and deviation from a conservative or outdated usage norm, whereas in this part of the volume, the focus is on the emergence of novel syntax – novel in terms of frequency, function, and/or form – in a range of situations of social contact and language learning and shift. These are often similar to the historical, vernacular environments in which usage that is now (but has not always been) canonical developed, so prescriptive concerns are not the focus. At times, a descriptive comparison to a norm or canon is valuable for understanding the stages of semantic, pragmatic, and/or syntactic change, but novel usage of this kind can also be described on its own terms.
Information structure is most directly related to the first of the three meanings above, but is implicated in all of them. The exigencies of discourse can reorganise the elements in a ‘canonical’ clause (a child sat in the mud) in a number of ways, including, for instance, preposing and postposing (in the mud sat a child or in the mud, a child sat), left and right dislocation (she sat in the mud, that child or one of the naughtiest children, she sat in the mud), argument reversal (that mud’s been sat in by a child), it-cleft (it was a child who sat in the mud), existential there (there was a child who sat in the mud), and wh-clefts (what the child did was sit in the mud). These effects are usually described as located in discourse, as they concern the status of information as common ground or novel for interlocutors, but they have also been described as involving the speaker’s (or writer’s) attitude or evaluation (Pham 2017).
A final dimension of ambiguity in terminology around non-canonical syntax, as noted elsewhere in the present volume, is whether it specifically pertains to the (non-)canonical ordering of arguments relative to their discourse status, as above, or to (non-)canonical usage in syntax more generally. The original term applied specifically to how the status of arguments in discourse affects a canonical or ‘unmarked’ ordering. For example, ‘non-canonical constructions are used in predictable ways in order to preserve a general old-before-new ordering of information in English’ (Ward & Birner 2004: 172). A large body of work has elucidated this relationship between discourse givenness, hearer familiarity, and order of constituents (Firbas 1966; Prince 1981; Vallduví 1992; Lambrecht 1994). Syntactic constructions that diverge from standard norms more generally – not just in information-structural terms, but also in terms of semantic and formal syntactic behaviour – have also been described in recent research as non-canonical. Indeed, as we will see, the two kinds cannot be easily dissociated, as discourse-based reorganisation can influence morphosyntactic behaviour and vice versa, and so the present volume casts a wide net to capture whole systems of syntactic innovation.
11.2 What Is Non-Native?
Although the native speaker is privileged in almost all fields of linguistics, the concept of the native speaker is surprisingly indeterminate. Nativeness can be defined in linguistic or social terms, including shared grammaticality judgments, mode of acquisition, amount of use, test performance, language dominance, type of variation, social group membership, historical status, or perception by others. Research on World Englishes and on English as a Lingua Franca has challenged some of these conceptualisations within linguistics, as it is in these contexts that common assumptions about nativeness start to give way (Paikeday 1985; Rampton 1990; Kachru 1992 [1982]; Schneider 2007; Agnihotri & Singh 2012; Mauranen et al. 2015).
For example, it is unremarkable in India to find people who pass one but not another ‘test’ of nativeness. A person might have used English every day for most of their life and affiliate themselves with it as one of their dominant languages, yet have highly variable grammatical structures and intuitions. Or they might have stable or ‘standard’ grammatical structures and intuitions but proficiency in only one register that is used relatively rarely. They might have a default code that mixes Hindi and English, so that they sound like native speakers in both but may be unable to speak either language on its own. Parents might transmit English to their children as their primary language while nevertheless retaining a host of highly variable phonetic and grammatical forms. The reality of these complexities sits uncomfortably beside simple categorisations of Indian speakers – and many millions of others in postcolonial contexts – as either ‘non-native’ or ‘native’.
The reality is one of a bilingual cline (Kachru 1992 [1982]) with variable language use and competence across different dimensions: acquisition, function, and context of situation. Acquisition refers to varying performance levels, for instance in rural as opposed to urban education and through daily use. Function may vary according to whether English is being used for personal communication, instrumentally as a national ‘link’ language, or as an international mode of communication. Context of situation may vary considerably according to regional, cultural, or occupational practices and norms. At one end of the cline are speakers who consider English a native language and at the other, speakers who command only a restricted subset of functional uses of English.
Mukherjee (2007: 182) offers an empirically based model of Indian English (IndE) as ‘semiautonomous’, caught between the pull of conservative (exonormative) and innovating (endonormative) forces. He notes a characteristic self-critical stance that suggests a lingering anxiety about correctness in varieties experiencing nativisation. Sharma (2023) shows intermediate patterns of this kind (e.g., native-like acceptance being more apparent in IndE phonology than grammar), and differing degrees of dialect confidence between IndE and Singapore English (SgE), again pointing to degrees of nativeness.
In this final part of the book, the focus is on speakers described as non-native. In fact, Part III implicates the full continuum of speech community types: English as a Native Language (ENL), English as a Second Language (ESL), English as a Foreign Language (EFL), and English as a Lingua Franca (ELF). This broad focus permits a consideration of how these different social contexts of use make speakers more or less sensitive to factors such as structural complexity, simplification, variation in input, and the emergence and focusing of new norms of use (see Szmrecsanyi 2009).
The intriguing case of ELF, for example, once again challenges the interpretation of ‘non-standard’ syntax as a deviation from a canonical standard as opposed to a new form of accommodation or collaboration. Many multinational firms now have commerce conducted solely among ELF speakers, who quickly converge upon and regularise new norms. A simple contrast of native and non-native competence does not always suffice for such situations. It is useful to bear in mind that speakers of ESL, EFL, and ELF are often speakers with substantial proficiency and competence (Paikeday 1985), and often strong affiliation with the language (Rampton 1990). When speech is produced and processed in these diverse communicative contexts, the acquisitional status of the individual but also their social goals and affiliations will influence how much their speech is affected by limited native speaker input, and how likely socio-pragmatic contexts are to generate new ways of signalling stance or meaning.
Some of the examples presented later in this introductory discussion are drawn from speech communities that combine native and non-native dynamics: for example, in urban multiethnolects in Europe, we find new syntactic forms triggered by group second language acquisition in high-migration zones, but these are rapidly absorbed into native usage by the adolescent population, quickly becoming identified with native speaker identity rather than learner status (Cheshire et al. 2013).
Wiese (2023) makes a compelling case for how these social and communicative situations are central to novel syntactic usage. In relation to non-canonical syntax in bilingual contexts, she notes:
Typically, the com-sits [communicative situations] associated with such dialects are initially restricted to these specific settings, that is, relevant characteristics are urbanity, youth, and ethnic and linguistic diversity. Further on, such dialects can loosen their association with a specific community and setting and spread to broader contexts, for instance, generally to com-sits among adolescents or to informal urban settings. From the point of view of com-sits, we can capture this as a broadening of the com-sit base when less specific situational characteristics become relevant. … Another aspect of com-sits that urban contact dialects highlight is that com-sits support the emergence of grammar. These urban contact dialects are not just characterised by a bunch of words, of course: their elements are integrated grammatically. Hence, linguistic elements can organise into different systems through their association with different com-sits.
Wiese (2023: 54–7) goes on to illustrate this through a striking example of ‘market grammar’ in Berlin’s Maybachufermarkt. Signs advertising produce were observed to follow a number of syntactic rules (order: numeral+classifier, numeral+currency; zero agreement: zwei Mango, *zwei Mangos) that allow users to cross language boundaries easily, swapping in lexicon from German, Turkish, and English. In the market, users were able to offer field researchers clear intuitions about these rules of their functionally driven, register-specific code (‘Not Mangos. Mango!’, ‘Nobody says Mangos here!’, ‘Kiste! That’s how one talks on the market.’, ‘No plural on the market!’).
We can see how the communicative situation influences syntactic outcomes by comparing this example to a very different non-native English-using environment, ELF in middle-class professional and academic contexts. In some contexts of spoken ELF, such as multinational workplaces, we may expect to see regularisation, simplification, and transparency (e.g., non-idiomatic language and one-to-one meaning-form correspondences) in syntactic constructions used between non-native speakers from different L1 backgrounds (Mauranen et al. 2015). On the other hand, in the register of ELF academic written discourse, Wu et al. (2020) find that ELF speakers use longer sentences, more coordinate phrases, and more complex nominals than are found in an American English reference corpus.
Social setting, function, and register are thus crucial factors in determining the specific communicative goals that give rise to specific syntactic formats. It is these situational pressures that generate many of the observed regularities and cycles of discourse-based restructuring, potentially a process universal to human language and only accelerated by contact.
11.3 What Gives Rise to Non-Canonical Syntax among Non-Native Speakers?
With our focus on a broad meaning of ‘non-canonical’, extending to information-structural as well as other syntactic innovation, two immediate questions arise: (1) what counts as an example of non-canonical syntax, and (2) what gives rise to these novel, (initially) non-canonical uses, especially in non-native speaker contexts?
In terms of the first of these – what counts as an instance of non-canonical syntax – Lange and Rütten (2017: 244) offer a concise summary: ‘Languages in general and both, varieties and registers of English in particular may display different preferences for the formal realization as well as the frequency of individual information-packaging strategies.’ These preferences give rise to expectations within language users of what is canonical (cf. Introduction to this volume). Non-canonical syntax can thus involve innovative syntactic constructions, which may not be attested at all in other varieties or registers, but also innovative frequency distributions, such that all the structures are shared with a reference variety, but infrequent constructions have become much more frequent in the variety in question (or vice versa). Relatedly, non-canonical syntax may involve a structure that is superficially similar to a canonical construction, but its function differs. Decades ago, S. V. Shastri, the creator of the first corpus of IndE in the late 1970s, observed that unlike ‘transparent’ features, involving a new form, more ‘opaque’ syntactic, semantic, and pragmatic features may be very common in IndE, where ‘it is perhaps not the “form” that is at variance but the “function”’ (Shastri 1992: 274).
Below, we consider some of the main sources of non-canonical syntax found in the language use of non-native or bilingual speakers (see Matras 2009 for a detailed review). One obvious source, but by no means the only one, is language transfer: change that is rooted in the contrasts between syntactic structures in English and those in a non-native speaker’s other (native) languages. Related to this is the relative need for pragmatically rich English language input for the acquisition of certain syntactic constructions, such that both L1 influence and universal discourse patterns intervene in its absence. A second source of change is inherent variability in English itself. It is often the presence of low-frequency non-canonical standard variants within the reference or standard variety that opens up variability for contact-driven non-canonical usage particular to non-native speakers. Finally, a third source is simply pragmatic innovation, where novel functions of language structure arise out of the exigencies of a new social setting.
11.3.1 L1-L2 Contrast and Input Demand
A clear source of non-canonical syntax among non-native speakers of a language is their first language(s) (L1). Examples include a much more frequent use of existing fronting devices (topicalisation, left dislocation) and novel topic-marking devices in many Asian varieties of English (Lange 2012; Leuckert 2019). These can involve constructions fairly directly transferred from the first languages (Can or not?: SgE, syntactic format transferred from Sinitic substrates) or indirect attempts to reconstitute a feature of the first language through constructions available in English (Every year, inflation is there: IndE, partly recreating the Indo-Aryan use of clause-final copula for existential meaning). Bao (2015) describes this as a ‘filtering’ of substrate meanings through superstrate lexicon: semantic functions of the L1 come to be reconstituted in reorganised elements of available L2 syntax. These are not simply errors, of course; they have developed their own grammatical rules and speakers have intuitions concerning these constructions (Parshad et al. 2016).
Some have argued that non-canonical usage in a contact variety arises in part due to complexity within the English system, suggesting an acquisitional source of divergence (Housen 2002; Davydova 2011). This can be refined to be a statement about difficulty relative to the structures available in the L1. Sharma (2023) argues that this interpretation of difficulty – termed input demand, or how much input a speaker of a given L1 needs to acquire an L2 form – may account for why certain non-canonical syntactic constructions persist more over time and become established as a new dialect feature while others do not.
Some syntactic innovations arise repeatedly across non-native English usage. Kortmann and Szmrecsanyi (2004) use the term ‘varioversals’ for such forms, due to their shared origin in the variety type, for example, arising out of acquisitional processes or restricted input in non-native contexts. Some examples are given in (1) below; Mauranen et al. (2015) cite similar features as potentially arising across ELF situations as well.
(1)
a. Irregular use of articles (omission):
   i. His sister passed message to me. (Singapore)
   ii. … like mudbrick hut with thatch roof. (West Africa)
b. Present perfect and simple past levelled (occurrence with punctual adverbials):
   i. I have seen him yesterday. (Philippines)
   ii. Nine supporters of Malawi’s president-elect Mr Bakili Muluzi have been killed when a bus crashed into them as they celebrated their victory. (East Africa)
c. Wider range of uses of the progressive (with stative verbs):
   i. We are knowing each other. (India)
   ii. Maybe one or two is having HIV/Aids. (South Africa)
d. Resumptive/shadow pronouns (with subject and object):
   i. Some teachers when I was in high school I liked them very much. (Papua New Guinea)
   ii. My daughter she is attending the University of Nairobi. (East Africa)
e. Loosening of sequence-of-tense rules (past perfect after present):
   i. Never before in the Capital’s history these colonies had faced such a flood threat. (India)
   ii. Most of these syringes and razors you find that they had been used by someone before. (South Africa)
f. Invariant non-concord tags (negative):
   i. Upili returned the book, isn’t it? (Sri Lanka)
   ii. He loves you, isn’t it? (West Africa)
Some non-canonical innovations in a contact setting arise from more general discourse-driven principles – not strictly grafted from first languages, but drawing on universal or logical tendencies that emerge due to wider mismatches between L1 and L2 and foregrounding of pragmatic functions (Lange 2012; Leuckert 2019). Current theoretical models in bilingualism research such as the Interface Hypothesis can also offer new insights into why we might repeatedly observe discourse-based solutions to the challenge of mismatched grammars in contact (Sharma 2023).
11.3.2 Non-Canonical Alternatives within Standard English
As L1 sources of change are so prominent and noticeable in non-native varieties, loci of variability within English tend to be less of a focus as a source of new usage, but they play a key part in the actuation and spread of non-canonical syntactic usage in second and foreign language varieties (Sharma 2001; Hundt & Vogel 2011). Indeed, there is a risk of over-idealising how homogeneous canonical, standard, or unmarked usage is in native varieties. Native and standard English varieties encompass a substantial degree of variability, as attested in the preceding parts of this book. An example noted by Ranta (2009) is lack of plural agreement in existential there-constructions in English: ELF speakers are often penalised for utterances such as There’s people outside, on the false assumption that native speakers do not produce such forms.
As noted, one form of change in non-native speakers may be an increase in the relative frequency of these ‘built-in’ non-canonical constructions. For example, IndE has dramatically higher frequencies of topicalisation and left dislocation constructions than British English, a difference in frequency rather than form. (As previously mentioned, IndE also allows non-canonical constructions that fall outside of the set of constructions licensed in British English at all.)
An example of repurposing an existing non-canonical form within Standard British English for a new non-canonical system in a new dialect is the development of a new topic-marking function of the relativiser who in Multicultural London English (MLE). Cheshire et al. (2013) compare the use of different relativisers in two varieties of English: traditional London English and the newer MLE. They show that the same canonical set of relativiser forms in English (Ø, that, who) is used in both varieties, but only in the latter variety has a new topic-marking function developed for who (e.g., my medium brother who moved to Antigua, with high topic persistence of that referent in the following discourse). They note that ‘present-day English does not have a specific topic-marking feature, using instead a range of non-canonical syntactic structures such as presentational existential clauses and left-dislocated “presentational” constructions to introduce both a new subject and a new topic into the discourse’ (2013: 68). By contrast, many of the languages that are in the linguistic ecology of high-contact MLE speakers (e.g., Jamaican Creole, Igbo, Yoruba, Moroccan Arabic, Sylheti, Bengali, Twi, Spanish, and Maltese) have overt topicalisers and focus devices. Cheshire et al. (2013) argue that the novel development in MLE is in line with Matras’s (2009) observation that, in contact situations, the features most susceptible to transfer are those that convey the speaker’s monitoring and directing of the interaction, due to their routinisation in usage. They conclude that ‘topic marking, then, can be considered to be one of the information structuring strategies that drive innovation in situations of extreme linguistic diversity’ (Cheshire et al. 2013: 67).
Their study offers a vivid example of existing (non-)canonical variation in English giving rise to a more elaborated non-canonical system. (It is worth noting that Mauranen et al. 2015 also observe systematic innovation and change in the relativiser system in ELF, supporting Matras’s speculation that certain parts of the syntactic structure are more susceptible to change in contact.)
Inherent variability in English as a source extends to further subtypes of sources, such as change arising from distributional biases in the input (see van Rooy 2008 for an example from Black South African English) and universal principles of markedness that are subtly present in native standard varieties and that may get amplified in a non-native variety. For instance, the accessibility hierarchy (Keenan & Comrie 1977) affects syntactic rules in both native and non-native varieties.
11.3.3 Pragmatic Innovation in the Usage Environment
Cheshire et al.’s (2013) example also relates to a further source of novel non-canonical syntax, namely innovation that arises out of particular functions or pragmatic needs in a given group. The social situations of European multiethnolects such as MLE are an intriguing environment to observe these processes at work.
These varieties have developed in major European cities after late twentieth-century migration. Examples include Multicultural London English (London), Kiezdeutsch (Berlin), Citétaal (Belgium), Rinkebysvenska (Stockholm), Straattaal (Netherlands), and further varieties in Copenhagen, Oslo, Helsinki, and other cities (Cheshire et al. 2015). The varieties share a number of defining features: they originated through late modern migration into working-class areas of European cities, they developed partly out of group second language acquisition in a social mix of non-native and near-native speakers in high-migration zones, they involve rapid change at many levels of the language, and they have been quickly absorbed into native usage by the adolescent population, shifting from learner to native speaker identity.
A number of grammatical innovations in these varieties originate in the pragmatic context of new usage by teenagers of recent migrant heritage background and mixed nativeness, in intensive social interactions that closely resemble the ‘com-sits’ described by Wiese (2023) earlier. One example of an MLE innovation is a novel causal interrogative construction with the format why … for? An example is Why they looking at me like that for?, a blend of Why … and What … for? One might argue that this is simply a learner error that has taken hold. However, the use of the form has been consistently observed in MLE to express purpose, rather than a general cause. Brookes et al. (2017) show convincingly that the redundant use of for, given its implicit presence in why, has the effect of being a ‘reinforcing’ or ‘expressive’ feature. They observe that the ‘variant typically occurs in pragmatically-charged environments, such as those that involve a confrontation between the participants in the conversational exchange or in direct speech contexts that report an aggressive encounter’ (2017: 6). Its absence in elliptical structures or in negated clauses reinforces this analysis.
It is reminiscent of non-canonical negation structures in Jespersen’s Cycle, whereby in French a redundant emphatic particle (ne … pas) developed out of the pressure for discourse informational clarity, starting out as non-canonical as the former negation device eroded, and later became the canonical negation (Dahl 1979). By analogy with those cases, MLE causal interrogatives could retain their narrow purpose function and register restriction, or they could grammaticalise into a general cause marker, losing their immediacy, emphasis, and stance/affect (confrontation) constraints. This sort of pragmatic bleaching with stabilisation and spread would resemble semantic bleaching found across grammaticalisation, such as the grammaticalisation of determiners out of more contextually constrained demonstratives (Hopper & Traugott 2003).
Thus, a new non-canonical construction in MLE involves initial grounding in speaker stance and immediacy, but with the potential to spread and lose its pragmatically narrow and marked functions over time. In the long term, a process of vernacular renewal may continue to regenerate forms of this kind, especially in high-contact urban youth groups.
The influence of immediacy of communicative function on linguistic form and use has been described in a range of research (Bernstein 1971; Koch & Oesterreicher 1985; Leuckert & Buschfeld 2021). It arises under specific conditions of intensive, vernacular social interaction. If we think back to Asian Englishes, discussed earlier, we can argue that IndE did not historically involve a preponderance of such interactional contexts, because of very limited nativisation and use in young peer groups; this is different to SgE, which was used in informal trade and later nativised more rapidly, with more extensive nativising use within child and teenage peer groups. The two varieties have developed very distinct kinds of innovative, or non-canonical, usage.
11.4 Summary
Adopting a broad definition of non-canonical syntax as atypical usage across any syntactic domain, this introduction to Part III has focused on key characteristics of such innovations among what are described as non-native English-speaking groups. Not surprisingly, difference between these speakers’ first languages and English is an important source of change. Two further sources of non-canonical syntax were noted. One is the presence of standard but minor (less frequent or more pragmatically constrained) variants, that is, inherent variability within English, that can lead to shifts in the balance of frequency or dominance of competing constructions. Another is pragmatic innovation that will always arise under certain interactional conditions, in native and non-native varieties alike.
The balance of these three forces can vary dramatically from situation to situation. The discussion pointed to the considerable need to understand the particular acquisitional and social setting to understand the nature of non-canonical innovations. If a situation involves little nativisation and language shift, with acquisition and use established through education and work, the syntactic innovations we observe are likely to be very different to those found in intensive adolescent peer groups involving rapid shift from non-native to native speaker status, with a reliance on the language as a vehicle of intimate personal exchange (Thomason & Kaufman 1988; Matras 2009).
Furthermore, the methods adopted for analysis also need to be particularly attuned to all of these considerations, to capture multiple dimensions of variation. Given the potential for innovation in frequency, function, form, or all three, both quantitative and qualitative perspectives on variation are needed, whether for naturalistic, elicited, or experimental data. And the question of nativeness and pragmatic context presents further challenges for coding precise uses of non-canonical forms.
The three chapters in this part of the volume elegantly illustrate ways to respond to these challenges. They span the full gamut of social situation types, from learner English (see Kircili, in Chapter 13), through ELF (see Neumaier & Leuckert, in Chapter 14), to nativising varieties of Asian English (see Götz & Kircili, in Chapter 12), with some intriguing observations about parallels across these. They investigate both classic non-canonical syntax that relates to information structure (adverbial fronting and introductory-it constructions) and usage that goes beyond this (absence of plural marking), and they highlight the complex interplay of L1 transfer with universal language processing and discourse pressures, as well as intriguing elements of genre sensitivity. The studies also demonstrate the value of triangulation across multiple perspectives on the data, with distinctive qualitative, (descriptive) quantitative, and statistical approaches to capturing incipient systematicity in non-canonical usage patterns.
12.1 Introduction
The following quote serves as an illustration of a non-canonical sentence pattern that – across registers – distinguishes itself by its highly expressive and productive nature, namely the introductory-it pattern (henceforward intro-it), also referred to as ‘(it-)extrapositioning’ (Kaltenböck Reference Kaltenböck2004), ‘preparatory’ or ‘anticipatory it’ (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999; Hewings & Hewings Reference Hewings and Hewings2002).
It was reported to Minivan News that at 5 pm around 20–25 people descended upon the Maldivian High Commission in Colombo with placards demanding the release of the hunger-strikers still held in jail […].
Given its information-structural value, the pattern is commonly employed in both the spoken and the written mode (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999) and has thus received quite a bit of scholarly attention in English as a Native Language (ENL) and English as a Foreign Language (EFL) contexts, where it has been shown to entail versatile functional and structural possibilities but also to occur in preferred registers of employment (cf. Section 12.2). When it comes to English as a Second Language (ESL) varieties, however, research is surprisingly scarce. This also applies to the Englishes spoken in South Asian regions, whose speakers, in fact, constitute the largest number of ESL speakers across the globe (Bolton Reference Bolton2008: 6). The history of English in Asia began in the early seventeenth century, with the foundation of the East India Company in 1600 marking the beginning of British trading – and ultimately imperial – activities. Since then, at least six varieties of English with varying evolutionary states (Schneider Reference Schneider2007) have evolved in South Asia – influenced and shaped not just by the superstrate language but also by prevalent and persistent linguistic, cultural, and social customs.
In the present chapter, these six varieties, namely Indian (IndE), Bangladeshi (BgE), Nepali (NpE), Maldivian (MvE), Pakistani (PkE), and Sri Lankan English (SLE), are employed to fill the existing gap in academic discourse concerning the use of the intro-it in outer-circle varieties in general and newspaper language in particular. While in Section 12.2 the structural and functional peculiarities of the pattern as well as previous research will be discussed, Sections 12.3 and 12.4 are dedicated to our own study, with a presentation of our research gaps and questions and our database and methodology (Section 12.3) as well as our quantitative and qualitative results (Section 12.4). This is followed by a discussion of the findings, a conclusion, and an outlook in Section 12.5.
12.2 Taking a Closer Look at the Intro-it
12.2.1 Defining the Intro-it
As has been mentioned in the Introduction to this volume, there are two major approaches to non-canonicity that can be distinguished: the frequency-based approach and the theory-based approach. When it comes to the intro-it pattern, this represents a particularly interesting distinction. With regard to the former approach, the fact that the pattern enables adherence to common information-structural principles like those of end-weight and end-focus (Behaghel 1909) makes it the cognitively least demanding choice, given that, in comparison with its syntactically canonical counterpart, the intro-it reaches a point of grammatical completeness before the notional subject is added. This fact benefits production, comprehension, and processing (Biber et al. 1999: 677) and makes the use of the pattern both the more natural and the more frequent choice.
From a structural or theoretical perspective, the situation is a little more complex: structurally, the notion of non-canonicity adopted by the researchers contributing to the present volume, in a nutshell, entails a deviation from an Sn-V-X structure, with ‘Sn’ referring to the notional – and thus semantically loaded – subject and ‘X’ operating as a placeholder for what is to follow to make a given construction grammatically complete (see the Introduction to this volume). Quirk et al. (1985: 53), for example, suggest seven basic clause patterns of English, which entail Sn-V to be followed by (1) zero, (2) Od-n, (3) Cs, (4) Oi-n-Od-n, (5) Od-n-Co, (6) A, or (7) Od-n-A. Deviations from the default Sn-V-X structure are, of course, not random choices, but are triggered by complex and versatile (information-)structural mechanisms, a circumstance which, as was mentioned above, becomes particularly apparent in the case of the intro-it. The pattern is defined by Biber et al. (1999) as the use of ‘the dummy subject it … in the ordinary subject position, anticipating a finite or non-finite clause in extraposition’ (1999: 155), a definition that already alludes to the structural diversity of the pattern. As is illustrated below, the expletive it, which excludes time, weather, or distance it (Callies 2009), may be followed by either a copular or a lexical verb, while, depending on the verbal constituent and what is to be expressed, the postverbal element may take the form of an adjective phrase (AdjP), a noun phrase (NP), or a prepositional phrase (PP), which is then succeeded by the notional subject clause constituting either a that-clause, a to-infinitive-clause, an -ing-clause, a (w)h-clause (e.g., with whether or how), or an if-clause.
| It | + | copular verb / lexical verb | + | AdjP / NP / PP | + | that-clause / to-inf-clause / -ing-clause / (w)h-clause / if-clause |
However, even though this is the most frequent case, the role of it as a proxy is not limited to the subject of a construction but may affect the object as well (Kaltenböck 2004: 65; Gentens 2016). It may, in fact, be used to fill grammatically vacant positions in almost all of the seven basic clause patterns previously discussed, as illustrated in Table 12.1 (Quirk et al. 1985: 1391–3).
Naturally, the frequencies with which the non-canonical structures are employed vary. While extraposed that-clauses, for example, are generally named as the most frequent realisation (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999), extraposed -ing-clauses are most commonly associated with rather informal spoken discourse (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985: 1393).

Table 12.1
| Pattern | Canonical variant | Non-canonical variant | Example |
|---|---|---|---|
| 1 | S-V-zero | Sg-V-zero-Sn | It doesn’t matter whether you like it. |
| 1 | S-V-zero | Sg-V-pass-zero-Sn | It is claimed that she does not know him. |
| 2 | S-V-Od | Sg-V-Od-Sn | It surprised me to see you. |
| 3 | S-V-Cs | Sg-V-Cs-Sn | It is nice to see you. |
| 5 | S-V-Od-Co | S-V-Od-g-Co-Od-n | He must consider it interesting working here. |
| 6 | S-V-A | Sg-V-A-Sn | It was on the news that he is guilty. |
| 7 | S-V-Od-A | S-V-Od-g-A-Od-n | She put it into his head that he should resign. |
Note. g refers to grammatical constituents while n refers to notional (or semantically loaded) ones.
In addition to its formal diversity, the pattern, very generally speaking, also enables the evaluation or assessment of discourse content (Couper-Kuhlen & Thompson Reference Couper-Kuhlen and Thompson2008). It is associated with particular semantic or rhetorical domains which can be derived from the nature of the verb and/or verb-complement structure, which, in a way, ‘set(s) the scene’ for the subsequent clausal content. These include the expression of necessity, importance, ease, or difficulty by means of particular lexical bundles containing, most commonly, combinations of four (e.g., it is clear that, it is possible to, it is difficult to), five (e.g., it may be possible to, it is interesting to note), or even six words (e.g., it should not be forgotten that, it must have been difficult to) (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999). Many previous studies focus on the classification of verb and adjective phrases. Examples include Römer (Reference Römer2009) or Hewings and Hewings (Reference Hewings and Hewings2002), with the latter suggesting a fourfold distinction between hedging (1), emphasising (2), attributive (3), and attitude marking functions (4).
(1) It seems reasonable to do some additional testing.
(2) It is essential to take the medication on time.
(3) It was proposed that the results had been forged.
(4) It is surprising to see him here.
A more comprehensive approach was adopted by Larsson (Reference Larsson2017), who, unlike Hewings and Hewings (Reference Hewings and Hewings2002), whose focus was mainly on AdjPs and passivised verb phrases, also took postverbal NPs into account. Another advantage of her approach is the fact that, while her classification, too, includes attitude markers, hedges, and emphatics (as well as combinations thereof), the latter two functions are only assigned if there is linguistic ‘evidence’, such as amplifiers preceding the adjective (e.g., extremely surprising, very disappointing) as an indicator of an emphatic use or the use of seem or a modal verb of possibility (e.g., could, might) as indicators of hedging, making the analysis less subjective and thus more reliable. Her final category, observations, is reserved for neutral propositional content which, prior to her study, had been mostly disregarded.
Now, what unites research on the semantic or rhetorical functions of the sentence pattern is the fact that there appear to be population- as well as register-specific preferences (see also Zhang Reference Zhang2015). It thus remains for us to determine whether this also holds true for ESL in general and newspaper language in particular.
12.2.2 Theories and Previous Research
To the best of our knowledge, the first researcher to discuss extraposition in his work is Jespersen (1927), claiming that ‘words in extraposition may be added as a kind of afterthought after the sentence has been completed; they stand outside it and form, as it were, a separate utterance which might even be called a separate sentence’ (1927: §17.12). Now, even though this first perspective seems to extend to structures like dislocations as well (e.g., He is nice, your brother.) and thus encompasses a broader range of phenomena, it-extrapositioning has still received quite a lot of scholarly attention ever since. Research has been dedicated to the description of the structural and functional employments as well as to the discussion of the information-structural mechanisms that may be at play. With the postponement of clausal subjects or objects to the end of a construction while it occupies the grammatical position, the pattern, for example, operates in accordance with the principle of end-weight (Behaghel 1909). The concept associates the placement of syntactically weighty elements towards the end of a sentence with reduced cognitive efforts for both the production and the comprehension of an utterance (see, e.g., Arnold et al. 2000). The applicability of the concept to the intro-it has been confirmed by a number of researchers, including Huddleston and Pullum (2002), Gómez-González (1997), and Kaltenböck (2004). Simultaneously, the structure also serves the adherence to the given-new progression (e.g., Clark & Haviland 1977; Prince 1981) and the principle of end-focus (Behaghel 1909: 138; Quirk et al. 1985: 1356–61; Biber et al. 1999: 897–9). Both concepts are strongly connected to the efficiency of information packaging. Given that the dummy it merely fulfils a grammatical function and does not contribute any semantic value to the message that is being produced or perceived, attention is placed on the postponed clausal element, which is commonly associated with new and thus focused information. While the study by Herriman (2000) supports the view that the structure enforces the principle of end-focus, Kaltenböck (2004) maintains that this is, indeed, strongly connected to the given-new status. He finds that the majority of instances with discourse-new clausal subjects (roughly 72%) were used in an intro-it construction. This is supported by Huddleston and Pullum (2002), as well as Miller (2001), who describe discourse-new clausal subjects in non-canonical position as more felicitous or even mandatory.
As far as other language contexts are concerned, quite some research has been devoted to EFL learners already. While the accounts report a general overrepresentation of the phenomenon, it has been found that the structural and functional realisations may vary. One example includes the preference for extraposed to-clauses by some learner populations, including German and Swedish learners of English (Callies Reference Callies2009; Larsson Reference Larsson2016). Likewise, Hewings and Hewings (Reference Hewings and Hewings2002) and Larsson (Reference Larsson2017) have determined that learners tend to use the structure to increase the force of a claim rather than for hedging purposes, with the employment of unusually forceful adjectives, such as amazing or stupid, resulting in non-target-like uses by some learners (Römer Reference Römer2009).
When it comes to the ESL context, non-canonical patterns, and among these fronting and left dislocation in particular, have been described as common features of ‘New Englishes’ (e.g., Alsagoff & Lick Reference Alsagoff, Lick, Foley, Kandiah, Zhiming, Gupta, Alsagoff, Lick, Wee, Talib and Bokhorst-Heng1998; Sharma Reference Sharma, Butt and King2003, Reference Sharma and Hickey2012; Bhatt Reference Bhatt and Mesthrie2008; Mesthrie & Bhatt Reference Mesthrie and Bhatt2008), with a number of corpus-based studies providing detailed accounts of a variety of syntactic structures (Lange Reference Lange2012; Winkle Reference Winkle2015; Leuckert Reference Leuckert2019). One study, Götz (Reference Götz2017), investigates fronting in the newspaper language of the same varieties considered in the present study and finds interesting variety-specific preferences, thus proving that fronting is by no means limited to informal spoken data – as suggested by previous research. As to research into the intro-it – or postponement patterns in general – only a very limited number of studies have been conducted. One of the few exceptions is Dubey’s (Reference Dubey1989) monograph on Newspaper English in India, in which he briefly mentions that, in comparison with the movement of participles or parts of NPs, the cases of postponements employed in newspapers in India were almost exclusively limited to clausal subjects, presumably to achieve end-focus (Reference Dubey1989: 77).
It remains to be seen in what ways this attestation of a preference for extraposed clausal structures can be extended, both on a structural and a functional basis. The precise nature of the research gaps identified and our research questions will be discussed in the following section.
12.3 The Intro-it in South Asian Varieties of English
12.3.1 Research Gaps and Research Questions
In the previous description of the status quo of research into the topic, we have identified two major research gaps which the present chapter intends to fill. To the best of our knowledge, no detailed account of the intro-it has been provided in the South Asian English context. Therefore, we pose the following research questions:
12.3.2 Database and Methodology
In order to avoid overinterpretations of (spontaneous) speech and because we are mainly interested in nativised forms of Englishes, we investigated a corpus of acrolectal newspaper language that had been subjected to an editing process. In doing so, we are able to ascertain that all instances of the intro-it we found are deliberate uses. The database used for the present chapter is the South Asian Varieties of English Corpus (SAVE Corpus; Bernaisch et al. Reference Bernaisch, Koch, Mukherjee and Schilk2011) which, based on two sources each, contains the newspaper language of six ESL varieties in South Asia. It was compiled between 2008 and 2011 and contains approximately 3 million words per subcorpus (i.e., about 18 million words in total). Details on the papers used as well as the time spans considered can be found in Table 12.2.

Table 12.2
| Country | Newspaper | URL | Time span |
|---|---|---|---|
| Bangladesh | Daily Star | www.thedailystar.net | 2003–2006 |
| Bangladesh | New Age | www.newagebd.com | 2005–2007 |
| India | The Statesman | www.thestatesman.net | 2002–2005 |
| India | The Times of India | http://timesofindia.indiatimes.com | 2002–2005 |
| Maldives | Dhivehi Observer | www.dhivehiobserver.com | 2004–2007/2008 |
| Maldives | Minivan News | www.minivannews.com | 2004–2008 |
| Nepal | Nepali Times | www.nepalitimes.com | 2000–2007/2000 |
| Nepal | The Himalayan Times | www.thehimalayantimes.com | 2002–2008 |
| Pakistan | Daily Times | www.dailytimes.com.pk | 2002–2006 |
| Pakistan | Dawn | www.dawn.com | 2002–2007 |
| Sri Lanka | Daily Mirror | www.dailymirror.lk | 2002–2007 |
| Sri Lanka | Daily News | www.dailynews.lk | 2001–2005 |
With BrE as the historical input variety of the varieties under scrutiny, the news section of the British National Corpus (BNC) was employed as a control corpus. The first version of the BNC (i.e., BNC 1994) was compiled between 1991 and 1994 and has a size of about 100 million words, 9 million of which are part of the periodicals section used as part of our database.
We employed a modified random sampling technique to extract the articles to be analysed, making a point to attempt to have each file of the corpus (each of which contains a varying number of articles) feature in the analysis. Files and/or articles were disregarded, however, if they were too short, did not contain neutral newspaper language (e.g., birthday greetings, advertisements), or were used repeatedly in the corpus.
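The sampling logic described above can be sketched roughly as follows. This is a minimal Python illustration, not the procedure actually used for the chapter: the function name `sample_articles`, the 100-word threshold, and the rule of drawing exactly one article per file are all hypothetical, and the exclusion of non-neutral texts (greetings, advertisements) is assumed to have been a manual step that is not modelled here.

```python
import random


def sample_articles(files, min_words=100, seed=42):
    """Draw one eligible article per corpus file (illustrative sketch).

    `files` maps a file name to a list of article texts. The word
    threshold and the one-article-per-file rule are assumptions made
    for this example, not values reported in the chapter.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    seen = set()               # texts already encountered anywhere in the corpus
    sample = {}
    for name in sorted(files):
        eligible = []
        for text in files[name]:
            if len(text.split()) < min_words:
                continue       # disregard articles that are too short
            if text in seen:
                continue       # disregard articles repeated in the corpus
            seen.add(text)
            eligible.append(text)
        if eligible:
            # every file with at least one eligible article is represented
            sample[name] = rng.choice(eligible)
    return sample


# Toy corpus: two files, one short article, one repeated article.
toy_files = {
    "file_01": ["a " * 150, "too short"],
    "file_02": ["a " * 150, "b " * 120],
}
toy_sample = sample_articles(toy_files)
```

In the toy corpus, the short article and the repeated one are skipped, so each file contributes exactly one distinct article.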
As Table 12.3 illustrates, roughly 8,000 sentences per subcorpus were extracted for the manual annotation process. They were copied into an Excel spreadsheet before identifying the instances of the intro-it, which were then annotated for a number of structural and functional variables, which can be found in Table 12.4.

Table 12.3
| Country | Corpus | Number of sentences (total) | Number of intro-its (total) | Number of intro-its (pts) |
|---|---|---|---|---|
| Great Britain | BNC_News | 8,055 | 143 | 17.75 |
| India | SAVE_IN | 8,289 | 165 | 19.91 |
| Bangladesh | SAVE_BD | 8,034 | 123 | 15.31 |
| Maldives | SAVE_MV | 7,951 | 280 | 35.22 |
| Sri Lanka | SAVE_SL | 8,130 | 256 | 31.49 |
| Nepal | SAVE_NP | 8,245 | 123 | 14.92 |
| Pakistan | SAVE_PK | 8,098 | 185 | 22.85 |
| Total | | 56,802 | 1,275 | 22.45 |
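The normalised frequencies in Table 12.3 can be recomputed from the raw counts. The short Python sketch below assumes that 'pts' stands for occurrences per thousand sentences, an interpretation that reproduces the reported values; the names `COUNTS` and `per_thousand` are introduced here for illustration only.

```python
# Sentence and intro-it counts as reported in Table 12.3:
# corpus -> (number of sentences, number of intro-its)
COUNTS = {
    "BNC_News": (8055, 143),
    "SAVE_IN": (8289, 165),
    "SAVE_BD": (8034, 123),
    "SAVE_MV": (7951, 280),
    "SAVE_SL": (8130, 256),
    "SAVE_NP": (8245, 123),
    "SAVE_PK": (8098, 185),
}


def per_thousand(hits, sentences):
    """Hits per 1,000 sentences, rounded to two decimal places."""
    return round(hits / sentences * 1000, 2)


rates = {corpus: per_thousand(hits, n) for corpus, (n, hits) in COUNTS.items()}

# Totals across all seven subcorpora
total_sentences = sum(n for n, _ in COUNTS.values())
total_hits = sum(hits for _, hits in COUNTS.values())
overall = per_thousand(total_hits, total_sentences)
```

Normalising in this way makes the subcorpora comparable even though the number of extracted sentences differs slightly from one to the next.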

Table 12.4
| Variable | Meaning |
|---|---|
| SEMANTICFUNCTION | semantic function based on verb or verb-complement structure |
| S_length | length of the sentence of interest in words |
| INTROIT_TF | indication of whether the given sentence contains an intro-it |
| COPULAR_TF | indication of whether the verb used is a copular verb or not |
| PASSIVE_TF | indication of whether the verb is used in the passive or not |
| THAT_CL | indication of whether the postponed constituent is a that-clause |
| ZERO_THAT_CL | indication of whether the postponed constituent is a zero that-clause |
| TO_INF_CL | indication of whether the postponed constituent is a to-inf clause |
| ING_CL | indication of whether the postponed constituent is an -ing clause |
| IF_CL | indication of whether the postponed constituent is an if-clause |
| WH_CL | indication of whether the postponed constituent is a wh-clause |
| H_CL | indication of whether the postponed constituent is an h-clause |
| INTROIT_S | indication of whether the intro-it affects the subject |
| INTROIT_O | indication of whether the intro-it affects the object |
Except for the s_length variable, which was determined using an Excel formula, each of the variables was coded manually by the authors of this chapter. While the majority are based on structural observations, the classification of the semantic function calls for some additional explanatory remarks. Due to the previously outlined advantages of her approach, we adopted Larsson’s (Reference Larsson2017) general distinction between attitude markers, expressing ‘the writer’s affective attitude towards what is stated in the clausal subject’, emphatics, as a means to ‘strengthen the force of the utterance’, hedges, used to weaken the forcefulness of an utterance and to indicate a lack of full commitment, and observations, which may express ‘affectively neutral … propositional content’ (Larsson Reference Larsson2017: 61).
Note, however, that we did not code for combinations of the first three categories because, after close consideration of the data at hand, we came to the realisation that particular categories override others. As the examples below illustrate, both emphatics (5) and hedges (6) exert a dominant influence on the perception of the meaning expressed by means of the attitude markers.
(5) It is extremely sad that she cannot attend the event.
(6) It might be useful to call him again.
In case of a combination of all three (7), the instance was considered a hedge as well, given that, irrespective of the strong terms in which a claim might be expressed, as soon as it is preceded by a hedge, its force is still weakened.
(7) It is perhaps very upsetting to her that she had to cancel the party.
Finally, given that our database consists of newspaper language, generally characterised by neutrality and formality, we considered it reasonable to slightly extend the observational category. We propose a distinction between observation-neutral, observation-attitude, observation-hedging, observation-emphatics, and observation-reporting. While the first category comprises factual information without any kind of assessment (e.g., It is the government’s responsibility that the laws are enforced.), observation-attitude, observation-emphatics, and observation-hedging – as opposed to the simple attitude, emphatics, and hedging categories – are not concerned with the attitude of the author of the article (or with making their statement more or less forceful) but with that of, for example, the people they are writing about (e.g., It is the manager’s declared belief that he will win the election. [attitude]; It has been criticised immensely that the minister did not attend the event. [emphatics]; The chairman maintained it seemed reasonable to take action. [hedging]). Lastly, observation-reporting predominantly contains verbs that are employed in passivised verb structures and which are commonly used in the news sector to report events neutrally (e.g., It is reported that fifty people were wounded during the attack.).
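The override rule applied to the attitude, emphatic, and hedge categories can be summarised as a simple precedence order. The Python sketch below is only an illustration of that rule: the coding itself was done manually, the function name `resolve_function` and its boolean inputs are hypothetical, and the treatment of a hedge combined with an emphatic but no attitude marker, as well as the fallback to an observation, are assumptions extrapolated from the reasoning given for example (7).

```python
def resolve_function(has_attitude, has_emphatic, has_hedge):
    """Return a single semantic function for an intro-it instance.

    Precedence (hedge > emphatic > attitude marker) follows the override
    rule described in the text; the 'observation' fallback for instances
    without any of the three cues is an assumption of this sketch.
    """
    if has_hedge:
        return "hedge"            # a hedge weakens the claim whatever else is present
    if has_emphatic:
        return "emphatic"         # an amplifier overrides the plain attitude reading
    if has_attitude:
        return "attitude marker"
    return "observation"


# (5) 'It is extremely sad that ...': attitude marker plus amplifier
example_5 = resolve_function(has_attitude=True, has_emphatic=True, has_hedge=False)
# (6) 'It might be useful to ...': attitude marker plus modal of possibility
example_6 = resolve_function(has_attitude=True, has_emphatic=False, has_hedge=True)
# (7) 'It is perhaps very upsetting ...': all three cues combined
example_7 = resolve_function(has_attitude=True, has_emphatic=True, has_hedge=True)
```

Encoding the rule as an ordered sequence of checks makes explicit that each instance receives exactly one label, which is why no combined categories are needed.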
Following the annotation process, all analyses were conducted using the software package R Studio. In order to attend to research question (1), the data were submitted to a regression modelling process using the R package dplyr that predicts the probability for an intro-it to occur based on the independent variables. To answer research questions (2a) and (2b), we zoomed in on the formal, functional, and semantic contexts of the intro-it by applying conditional inference trees (ctree) to our dataset (Hothorn et al. Reference Hothorn, Hornik and Zeileis2006). In a nutshell, ctree analysis recursively performs univariate splits of the dependent variable based on values of an independent variable or a set of covariates. As pointed out by Timofeev (Reference Timofeev2004) and Phelps and Merkle (Reference Phelps, Merkle, Eichhorn and Kovar2008), adopting such a classification tree approach presents a number of advantages that are relevant for our dataset: (i) it can handle small datasets, (ii) as a nonparametric statistical approach, this method only comes with a limited number of statistical assumptions, (iii) it can be applied to various data structures and is exceptionally efficient in handling categorical predictors, (iv) the output provides rankings of variable importance, and (v) the output of a classification tree analysis can easily be understood and interpreted. Please note that, in order to be in a position to reveal the effects of the individual variables on the complementation patterns of the intro-it and to thus provide a comprehensive account of the behaviour of these patterns in the individual corpora, we ran individual trees for each variable. The results of the analysis are presented and discussed in the following.
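The variable-selection step at the heart of a conditional inference tree can be illustrated with a deliberately simplified Python sketch. It is not the algorithm implemented in R: a Pearson chi-square statistic with a Monte Carlo permutation test is swapped in for the conditional inference framework of Hothorn et al., only one selection step is shown rather than the recursive partitioning, and the toy data and all function names are invented for illustration.

```python
import random
from collections import Counter


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    rows, cols, cells = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat


def permutation_p(xs, ys, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for independence of xs and ys."""
    rng = random.Random(seed)
    observed = chi_square(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association with xs
        if chi_square(xs, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def select_split_variable(covariates, response, alpha=0.05):
    """Pick the covariate most strongly associated with the response.

    Returns (name or None, p-values); None means no covariate passes alpha,
    in which case a tree would stop splitting at this node.
    """
    p_values = {name: permutation_p(values, response)
                for name, values in covariates.items()}
    best = min(p_values, key=p_values.get)
    return (best if p_values[best] < alpha else None), p_values


# Toy data (invented): 20 sentences, response = intro-it present or not.
toy_response = [True] * 10 + [False] * 10
toy_covariates = {
    "clause_type": ["that"] * 10 + ["to_inf"] * 10,  # perfectly associated
    "passive": [True, False] * 10,                   # unrelated to the response
}
toy_choice, toy_p = select_split_variable(toy_covariates, toy_response)
```

A full tree would then split the data on the selected covariate and repeat the same test within each resulting subset until no covariate reaches significance.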
12.4 Results
12.4.1 The Likelihood of Using the Intro-it across SAVEs
To predict the likelihood of an intro-it occurring across SAVEs and BrE, we fitted a generalised linear model with S_Length (logged) and Corpus as predictors. The interaction between the two variables turned out not to be significant, which means that the final model consisted only of the two main predictors S_Length (logged) and Corpus. The model is highly significant and explains 4.4% of the variance in the data (R² = 0.044). Although this might not seem to be a high value, the model performed significantly better than the baseline model, and visual inspection of the residual plots suggests that it fits the data reasonably well. The two main effects are illustrated in Figure 12.1.
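For orientation, the structure of the final model can be written out as follows. This is a sketch under two assumptions that the text does not state explicitly: that the generalised linear model is binomial with a logit link, and that the BNC serves as the reference level for Corpus. The coefficient symbols are introduced here purely for illustration.

```latex
\operatorname{logit}\, P(\mathrm{INTROIT\_TF} = 1)
  = \beta_0
  + \beta_1 \log(\mathrm{S\_Length})
  + \sum_{c \,\in\, \{\mathrm{BD},\, \mathrm{IN},\, \mathrm{MV},\, \mathrm{NP},\, \mathrm{PK},\, \mathrm{SL}\}}
    \gamma_c \, \mathbb{1}[\mathrm{Corpus} = c]
```

Under this reading, the absence of a significant interaction means that a single slope for logged sentence length is shared by all corpora, while each corpus coefficient shifts the baseline likelihood relative to the reference corpus.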

Figure 12.1 Effect plots illustrating the main effects Corpus (panel 1) and S_Length (logged) (panel 2) predicting the likelihood for an intro-it to occur in the SAVE Corpus and the BNC_NEWS
Figure 12.1, panel (a), Corpus effect plot: INTROIT_TF (vertical axis, 0.000 to 0.030) by Corpus (horizontal axis: BD, BNC, IN, MV, NP, PK, SL), with an error bar for each corpus.
Figure 12.1, panel (b), S_Length effect plot: INTROIT_TF (vertical axis, 0.00 to 0.10) by S_Length (horizontal axis, −2 to 4), with a confidence band and a rug plot along the horizontal axis.
It becomes immediately obvious that the intro-it is not generally more likely to occur across SAVEs than in BrE. In fact, the likelihood of using this pattern in BgE, IndE, NpE, or PkE is not significantly different from that in BrE. However, we do see that the pattern is significantly more likely to occur in SLE and MvE.
The second main predictor that increases the likelihood of using the intro-it is S_Length (logged). Since this effect does not engage in an interaction with Corpus, it seems that sentence length has a robust effect across varieties in the sense that the longer a sentence is, the more likely it is to make use of an intro-it. While an increased sentence length obviously generally increases the chance for non-canonical patterns to occur (e.g., cf. Götz Reference Götz2017; cf. Kircili, Chapter 13 in this volume; Kircili, Reference Kirciliforthcoming), it also makes sense from an information-packaging standpoint to make use of this pattern (e.g., when there are very long subjects; see Section 12.2.2). Here, the intro-it enables the writer to finalise the main structure of a sentence (i.e., Sg-V-Cs-Sn) and thus to be in a position to pack the main information (contained in the notional subject) into a structure that is in line with the principle of end-weight and thus more easily accessible to and processed by the reader, as in example (8) from BgE.
[It]Sg [is]V indeed [a curious coincidence]Cs [that just as the vice-president of the United States, Dick Cheney, stepped up the rhetoric against Iran, warning during a joint news conference with the Australian prime minister, John Howard, in Sydney that ‘it would be a serious mistake if a nation like Iran were to become a nuclear power’ and refusing to rule out the use of force to keep atomic weapons out of the hands of Tehran, Israel was reported by the conservative British broadsheet The Daily Telegraph to be in negotiations with the US-led forces for an ‘air corridor’ over Iraq should it decide on unilateral action on the theocratic west Asian state.]Sn
While this does not seem to be a particularly surprising finding, we can note that the cognitive effect of easing the processing load for the reader (and writer), which has been attested for other weight-sensitive structures, including the placement of subordinate clauses, and which has mainly been researched in the context of ENL varieties of English (e.g., Arnold et al. 2000; Diessel 2005; Junge et al. 2015), also seems to hold true for the intro-it and SAVEs.
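The sentence-length effect described above can be made concrete with a small sketch. It assumes a binary logistic model with a log-transformed, z-scored length predictor; the section does not state the model specification, so this is an illustration only. The function names, the intercept, the slope, and the sentence lengths are all hypothetical and are not estimates from the study.

```python
import math


def standardise_log_lengths(lengths):
    """Log-transform sentence lengths (in words), then z-score them."""
    logs = [math.log(n) for n in lengths]
    mean = sum(logs) / len(logs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))
    return [(x - mean) / sd for x in logs]


def predicted_probability(z_length, intercept=-4.0, slope=0.5):
    """Inverse logit of a linear predictor.

    The default intercept and slope are hypothetical values chosen for
    illustration; they are not coefficients reported in the chapter.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * z_length)))


# Hypothetical sentence lengths in words.
lengths = [8, 15, 22, 40, 95]
for n, z in zip(lengths, standardise_log_lengths(lengths)):
    p = predicted_probability(z)
    print(f"{n:>3} words  z = {z:+.2f}  p(intro-it) = {p:.3f}")
```

With a positive slope, the predicted probability rises monotonically with (logged) sentence length, which is the qualitative pattern the effect plot shows.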
12.4.2 Comparing the Use of the Intro-it across SAVEs
We will now turn to our second research question and zoom in on the cases in which intro-its were used across SAVEs by taking a look at their overall frequencies, their formal complementation patterns, and the semantic functions in which the construction is used.
Frequency of the Intro-it across SAVEs
Figure 12.2 illustrates the frequencies of the intro-it per thousand sentences (pts) across SAVEs as compared to the BNC_NEWS (see footnote 4).

Figure 12.2 Normalised frequencies of intro-its per thousand sentences across SAVEs and the BNC_NEWS
Figure 12.2 Long description
One axis, labeled INTROIT PTS (the frequency of intro-its per thousand sentences), ranges from 0 to 35 in increments of 5. The other axis, labeled CORPUS, lists the SAVE (South Asian Varieties of English) corpora and the BNC (British National Corpus). The labels are BNC-NEWS, SAVE-BD, SAVE-IN, SAVE-MV, SAVE-NP, SAVE-PK, and SAVE-SL, and the bars associated with each corpus vary in shade. The frequency is highest for SAVE-MV; the other corpora fall roughly between 15 and 31.
As Figure 12.2 illustrates, we can observe considerable variation in the frequencies of the intro-it between the varieties, which is highly significant (χ² = 132.86, df = 6, p < 2.2e-16). The highest frequencies are found in the newspaper language of the Maldives with 35.22 pts and Sri Lanka with 31.49 pts, followed by Pakistan with 22.85 pts, India with 19.91 pts, and Great Britain with 17.75 pts. The lowest frequencies are found in Bangladesh with 15.31 pts and Nepal with 14.92 pts. These frequencies are very much in line with the predicted probabilities of the pattern to occur (see Section 12.4.1). Compared to BrE, the historical input variety, we can observe that it is Bangladesh and Nepal that show similar – although slightly lower – frequencies, while India and Pakistan make use of the pattern somewhat more frequently. Yet, it is only writers from Sri Lanka and the Maldives who use the pattern significantly more frequently than BrE writers.
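For readers who want to reproduce this kind of comparison on their own data, the following sketch shows how a per-thousand-sentences rate and a Pearson chi-square statistic can be computed. It assumes the test is run on a corpus-by-outcome table (sentences with vs. without an intro-it); the df of 6 reported above is consistent with a 7 × 2 table, but the exact table is not given in the text. The corpus labels and counts below are hypothetical, not the study's data.

```python
def per_thousand_sentences(hits, sentences):
    """Normalised frequency: hits per thousand sentences (pts)."""
    return 1000 * hits / sentences


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. one row per corpus
    with the columns [sentences with intro-it, sentences without].
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: corpus -> (sentences with intro-it, all sentences).
counts = {
    "corpus_A": (180, 10000),
    "corpus_B": (350, 10000),
    "corpus_C": (150, 10000),
}
for name, (hits, n) in counts.items():
    print(name, round(per_thousand_sentences(hits, n), 2), "pts")

table = [[hits, n - hits] for hits, n in counts.values()]
stat, df = chi_square(table)
print(f"chi-square = {stat:.2f}, df = {df}")
```

The sketch stops at the test statistic; obtaining a p-value additionally requires the chi-square distribution, which a statistics package would normally supply.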
Structural Preferences of the Intro-it across SAVEs and British English
Turning to the structural preferences across SAVEs, we will first focus on complementation patterns. In doing so, we look at the different clause types that the intro-it may occur in. The findings are illustrated in Figure 12.3.

Figure 12.3 Ctree illustrating the clause types used to complement the intro-it ~ Corpus + Passive in the SAVE Corpus and the BNC_NEWS
Note. TH = that-clauses, ZT = zero that-clauses, TO = to-infinitive clauses, IN = infinitive clauses, IF = if-clauses, WH = wh-clauses, HO = how-clauses.
Figure 12.3 Long description
The decision tree diagram analyzes the clause types used to complement the intro-it, with particular focus on how they are influenced by the variable Passive TF. At the top of the tree, there is a node labeled Passive TF with a significance level of p less than 0.001, indicating a strong relationship between this variable and the outcome. The tree splits into two branches based on a numerical threshold.
Left Branch:
First corpus Node: This node is labeled corpus with a significance of p = 0.002, indicating a significant effect of the corpus on clause types.
BNC Node: This leads to a node labeled BNC with n = 115, meaning there are 115 observations from the BNC corpus. The corresponding bar graph here shows higher frequencies of clause types TH and TO.
BD, IN, MV, NP, PK, SL Node: This node represents the other corpora, aggregating data from BD, IN, MV, NP, PK, and SL, with n = 825. The graph for this node shows elevated values for TO and TH, with TO being the most frequent clause type in these corpora.
Right Branch:
Second corpus Node: This right branch leads to another corpus node, where the significance is p = 0.024. This is still statistically significant but weaker than the first corpus node.
BD, IN, MV, PK, SL Node: This node, with n = 288, displays a clear preference for the clause type TH, which has the highest bar in the graph, showing that these corpora use TH more frequently than others.
BNC, NP Node: This node includes data from the BNC and NP corpora, with n = 44. The bar graph for this node shows elevated frequencies for both TH and ZT, indicating a higher usage of these clause types in the BNC and NP corpora.
In conclusion, the decision tree analyzes how different corpora affect the choice of clause types used with the intro-it. The nodes highlight significant variations between corpora and show clear preferences for certain clause types within those corpora, with TH and TO being most prominent in the left-branch nodes and TH dominating in the right-branch nodes.
The ctree illustrates that the most important and significant factor triggering particular complementation patterns is voice of the verb across all varieties (see node 1; p < 0.001). When we follow the left branch of the tree (i.e., the cases where there is a main verb that is not used in the Passive), node 2 illustrates significantly different complementation patterns (p = 0.003) between the BNC on the one hand (see final node 3) and the SAVE data on the other, which all cluster together (see final node 4). Albeit significantly different, the overall distribution patterns follow the same order regarding the three most frequent complementation patterns: the most frequent complementation pattern for active constructions in newspaper language is to-infinitives, with the SAVEs making use of this construction in almost 60% of the cases, compared to ca. 45% in the BNC. These are illustrated by example (9) from BrE and example (10) from PkE.
Now [it]Sg [is]V [up to the aid agencies]Cs [to make life as comfortable and easy as possible for the refugees.]Sn
[It]Sg [is]V [difficult]Cs [to quantify the gains and losses],Sn but unsuspecting investors ought to have lost heavily.
That-clauses figure as the second most frequent complementation pattern (cf., however, Section 12.2.2), with ca. 40% of all cases in BrE and ca. 36% in the SAVEs, followed by zero-that clauses, which BrE newspapers make use of in ca. 6% of the cases and SAVE newspapers in ca. 4% of the cases. There is another noteworthy finding among the low-frequency complementation patterns: in BrE, the intro-it in active constructions is complemented by how-clauses (see (11)), which we do not find in the SAVEs at all. The SAVEs, in turn, complement the intro-it with wh-clauses, as in (12), which are not observed in the British data.
But [it]Sg [has been]V [a mystery]Cs [how this and spongiform encephalopathies, such as mad cow, kill brain cells.]Sn
No shots were fired during the raid and [it]Sg [is not known]V [where the men are being hauled up.]Sn
With verbs in the Passive voice, if we follow the tree to the right, there is again a significant split between the varieties (see node 5, p = 0.028), and this time Nepal clusters together with Great Britain (see final node 7), whereas the other SAVEs again form a cluster (see final node 6). While both groups use that-clauses most frequently, the former shows a wider range of complementation patterns than the latter, which uses that-clauses almost exclusively, namely in over 80% of the cases (see (13)). This is followed by to-infinitive clauses (ca. 8%, see (14)), zero-that clauses (ca. 4%, see (15)), and very low proportions of the other complementation types (each under 2%). While British and Nepalese English speakers also complement the introductory-it pattern most frequently with that-clauses (ca. 65%, see (16)), they also show a high proportion of zero-that clauses (over 20%, see (17)) and make use of how-clauses (ca. 5%, see (18)). The remaining patterns occur only with very low frequencies, albeit still higher than in the other SAVEs.
[It]Sg [was announced]V [that the party would look after the victim’s children.]Sn
During the Oslo meeting on June 26, [it]Sg [was agreed]V [to put pressure on Sri Lanka in this regards as well.]Sn
[It]Sg [was reported]V [ø he may have been manhandled.]Sn
[It]Sg [was agreed]V [that the wife was the chatelaine par excellence.]Sn
[ø It should be able to accumulate enough water by 2050,]Sn [it]Sg [is hoped.]V
[It]Sg [remains to be seen]V [how much Biggs has got left and even Duff has made mistakes.]Sn
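The percentages reported for the terminal nodes above are relative frequencies of clause types within each group. The sketch below shows one way to tabulate such shares from annotated hits. It is a descriptive tabulation only and does not implement the conditional inference tree itself; the record format, the function name, and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def clause_type_proportions(records):
    """Map (voice, corpus) to the share of each clause type in that group.

    Each record is a (voice, corpus, clause_type) tuple, where clause_type
    uses labels such as "TH", "ZT", or "TO" as in the figure note.
    """
    counts = defaultdict(Counter)
    for voice, corpus, clause_type in records:
        counts[(voice, corpus)][clause_type] += 1
    return {
        group: {ct: n / sum(tally.values()) for ct, n in tally.items()}
        for group, tally in counts.items()
    }


# Hypothetical annotated hits, not taken from the SAVE Corpus or the BNC.
records = [
    ("active", "BNC", "TO"),
    ("active", "BNC", "TH"),
    ("active", "BNC", "TO"),
    ("passive", "SAVE", "TH"),
    ("passive", "SAVE", "TH"),
    ("passive", "SAVE", "ZT"),
]
for group, shares in clause_type_proportions(records).items():
    print(group, {ct: round(share, 2) for ct, share in shares.items()})
```

Grouping by voice and corpus mirrors the two splitting variables in Figure 12.3; other predictors such as Verbtype could be added to the grouping key in the same way.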
In addition to the verb’s voice, the Verbtype and whether a subject or an object is referred to by the intro-it have an effect on the complementation type. This is illustrated in Figure 12.4.

Figure 12.4 Ctree illustrating the clause types used to complement the intro-it ~ Corpus + Verbtype + Function in the SAVE Corpus and the BNC_NEWS
Figure 12.4 Long description
A diagram illustrates a conditional inference tree used to analyze clause types in relation to verb type and corpus. The diagram consists of nodes connected by lines, and it presents how these nodes break down into different categories based on probabilities.
Top Node:
Verb type node: The conditional inference tree begins with a top node labeled verb type, which contains a probability value that indicates the likelihood of a certain verb type influencing the outcomes in subsequent nodes. This node is connected to two lower nodes, each labeled corpus with different associated probabilities. This suggests that the selection of a corpus or the type of corpus plays a significant role in determining the pattern of clause types used with different verb types.
Branching Nodes:
First corpus node: The left branch leads to the first corpus node. This node further branches out into additional nodes.
Second corpus node: The right branch leads to another corpus node, which connects to further nodes as well. These branching nodes break down the data further, leading to specific groupings of clauses or categories.
Bar Graphs:
At the bottom of the diagram, five bar graphs are presented, each corresponding to a different node with a specific sample size n.
Node 3 where n = 27: The first bar graph is labeled node 3 where n = 27. The y-axis ranges from 0 to 1, representing the frequency of different clause types, and the x-axis is labeled with the clause types TH, ZT, TO, IN, IF, WH, and HO. The bar graph displays varying heights of bars corresponding to each clause type, indicating their relative frequency within this node.
Node 5 where n = 7: The second bar graph is labeled node 5 where n = 7. It follows the same y-axis and x-axis as node 3, displaying the distribution of clause types across the data, with each bar corresponding to one of the clause types listed. The heights of the bars reflect how frequently each clause type occurs in this specific node, to-infinitive clauses being most frequent.
Node 6 where n = 365: The third bar graph is labeled node 6 where n = 365. This graph also shows the frequency of the seven clause types for the second largest subset of the data with 365 observations. The bar heights indicate the frequency distribution of different clause types in this data segment, that-clauses being most frequent.
Node 8 where n = 106: The fourth bar graph is labeled node 8 where n = 106. Like the previous graphs, it shows the frequency of clause types for a dataset with 106 observations. The bar heights show the distribution of different clause types in this segment of the data.
Node 9 where n = 767: The fifth bar graph, labeled node 9 where n = 767, represents the largest subset of the data. The bar heights in this graph indicate how the various clause types are distributed in this final node. To-infinitive clauses are most frequent, followed by that-clauses.
TO has the highest value in all the nodes except node 6, where TH has the highest value.
The ctree in Figure 12.4 illustrates that, across all corpora, complementation patterns in intro-it sentences differ significantly depending on the Verbtype used (see node 1, p < 0.001). If a lexical verb is used, there is another significant split between the corpora (see node 2, p = 0.002), with NpE showing significantly different complementation patterns (see node 3) from the other SAVEs, which this time cluster with BrE. While NpE shows an almost even preference for complementing the intro-it of lexical verbs with that-clauses, as in (19), or to-infinitive clauses, as in (20), the other varieties show a further split concerning the functional element that is referenced by the intro-it (see node 4, p = 0.04): if the it-pattern refers to an object, the six varieties show a clear preference for a to-clause, which they select in ca. 70% of the cases (see node 5), as illustrated in (21), whereas subjects are complemented by that-clauses in almost 80% of the cases (see node 6), as in example (22). As far as copular verbs are concerned, the SAVEs behave significantly differently from BrE (see node 7, p = 0.003), with BrE speakers showing an almost even distribution between that- and to-clauses (see node 8), illustrated in (23) and (24), and SAVE speakers choosing to-clauses in over half of the cases (see node 9), as in example (25).
[It]Sg [was carefully instilled]V [in her]A [that all girls grew old before they grew up.]Sn
[It]Sg [took]V [divers from the Gent Fire Brigade]Od [two days]A [to retrieve Prem Prasad’s body.]Sn
But a senior banker, Shaukat Tarin, told Dawn that [it]Sg [was]V [best]Cs [to leave it to shareholders to assess the merits and demerits.]Sn
[It]Sg [saddens]V [me]Od [that educated people in the government could do such things.]Sn
[It]Sg [has become]V [increasingly apparent]Cs [that Thomson belonged to the American experimental tradition.]Sn
[It]Sg [was]V [very difficult]Cs [for a court to say what a reasonable mother would have done in circumstances which were almost hypothetical.]Sn
But [it]Sg [will be]V [difficult]Cs [to say that the dead has been mourned in a befitting way by the nation itself.]Sn
After having looked in detail at the different structural preferences across the SAVEs and BrE, we will now turn to the semantic functions that are associated with the intro-it.
12.4.3 Semantic Function of the Intro-it across SAVEs and the BNC_NEWS
We will now turn to the semantic functions in which the intro-it is used. The ctree in Figure 12.5 illustrates a number of variety-specific preferences. After the first significant split between the British, Nepalese, and Maldivian varieties on the one hand, and BgE, IndE, PkE, and SLE on the other hand (see node 1, p < 0.001), if we first follow the branch to the left, the most noteworthy difference from the latter group is that the former group shows the highest proportions of the pattern used in an attitudinal function. Further down the tree, there is another significant split that clusters the British and the Nepalese speakers together (see final node 3) and separates them from the Maldivian speakers (see final node 4). British and Nepalese speakers show a clear preference for using the intro-it in an attitudinal function in ca. 36% of the cases, see (25), followed by an almost equal proportion of observation-neutral and observation-attitudinal cases with ca. 14% each (see final node 3), as illustrated in (26) and (27), respectively. In comparison, apart from a similarly high proportion of ca. 30% of all cases in the attitudinal function, MvE (see final node 4) also shows a relatively high proportion of intro-its that are used either in an observation-reporting function (see example (28)) or observation-neutrally (as in (29)).

Figure 12.5 Ctree illustrating the semantic functions in which the intro-it is used by the different SAVEs vs. the BNC_NEWS
Note. A = attitude, E = emphatic, H = hedging, OA = observation-attitudinal, OE = observation-emphatics, OH = observation-hedging, ON = observation-neutral, OR = observation-reporting
Figure 12.5 Long description
The tree diagram analyzes the semantic functions of the introductory it across different corpora. The diagram consists of a hierarchical structure of nodes connected by lines, illustrating the decision-making process and categorization of data.
Top Node:
Corpus node: At the top of the tree structure is a node labeled corpus, which contains various statistical p-values indicating the significance of different corpora in the analysis. These values help determine how strongly each corpus influences the results in subsequent branches of the diagram.
Branching Nodes:
The corpus node branches into several additional nodes based on specific corpora, each representing a different set of data sources or variables. The branches are labeled as follows: BNC, MV, NP, SL, BD, IN, and PK.
Each branch represents a corpus or a specific set of data from the given categories. These nodes lead to further breakdowns of the data based on the semantic functions of the introductory it.
Bar Graphs:
Underneath each branch, there are bar graphs labeled node 3 through node 9, representing different subsets of the data, categorized according to the various corpora. These nodes display the distribution of semantic functions across the following categories: A, OA, OE, OH, ON, and OR.
Each bar graph corresponds to a specific node and shows the frequency of each semantic function for that subset of data. The y-axis of each bar graph ranges from 0 to 1, indicating the relative frequency of each category. The x-axis contains the aforementioned categories, each representing a different semantic function that the introductory it may take.
The highest value is for A in nodes 3 and 4 and for OA in all the other nodes. The graphs follow an overall increasing trend except in nodes 3 and 4.
It would not be prudent to count too much on any unilateral pre-election cease-fire.
Now it is up to the aid agencies to make life as comfortable and easy as possible for the refugees.
The Himalmedia poll tried to gauge the people’s enthusiasm for general elections, and more than half the respondents felt it was ‘inappropriate’ for the prime minister to call elections during an emergency, while 29.4% agreed with the decision.
It has been pointed out that almost everyone in the broader democracy movement has charges pending against them.
It takes 35 minutes to reach the island from Male’ by launch.
When we follow the right branch of the tree, an overall trend that Sri Lanka, India, Pakistan, and Bangladesh have in common is the high proportion of using the intro-it in observation-attitudinal and -reporting functions, as compared to the other group. More specifically, we can observe that SLE splits off from the remaining varieties first (see node 5, p < 0.001) with the highest proportions of observation-attitudinal (see (30)) and -reporting (see (31)) functions (see final node 6).
‘It is good to see Mr. Samaraweera and Mr. Sooriyarachchi finally take a solid decision, because things were getting quite hard for them in the government,’ he said.
In fact it has been said that this was one of the root causes of the ethnic problem in Sri Lanka.
Further to the right of the tree, node 7 (p = 0.01) splits off IndE from BgE and PkE, with IndE showing the highest proportions of the intro-it in observation-reporting (as in (32)), -attitudinal (33), and -neutral (34) functions (see final node 8). Pakistan and Bangladesh, finally, use the pattern most frequently in an observation-attitudinal function (see (35)), followed by an observation-reporting (see (36)) and an attitudinal (37) function (see final node 9).
It was announced that the party would look after the victim’s children.
‘It is hard to I [sic] they are gone forever,’ said an emotional batchmate.
It takes a two-hour gruelling walk to reach Kodiyanpalayam.
‘It is obvious why four-day cricket struggles to attract spectators,’ said Woolmer.
It is expected that virus-less blood is safe blood.
It is difficult to loosen the gripping power of the long held index. (PK_DA_2007–01-21
It is clearly visible that the major split between the two variety groups shows different preferences for using the intro-it in an attitudinal function (in the case of the British, Nepalese, and Maldivian speakers), whereas the remaining SAVEs prefer using the pattern in observational functions of different subtypes.
12.5 Discussion, Conclusion, and Outlook
In this chapter, we set out to investigate the use of the intro-it in SAVEs as compared to BrE. First, we were interested to find out whether the construction is a frequent phenomenon in SAVEs (compared to BrE), as has been reported for other non-canonical structures (e.g., Alsagoff & Lick 1998; Sharma 2003, 2012; Bhatt 2008; Mesthrie & Bhatt 2008). Here, we found that the construction is indeed more likely to occur in MvE and SLE than in BrE, but none of the other SAVEs show high frequencies of this pattern. The same holds true for the normalised frequencies of the intro-it across varieties per thousand sentences. In this respect, neither the likelihood of using the intro-it nor the actual frequency of usage follows the overall trend that has been found for other non-canonical patterns in ESL types of English. However, we do find high frequencies in Sri Lanka and on the Maldives, as compared to the other varieties under scrutiny. MvE has previously been shown to behave similarly to BrE for other structural features (e.g., Mukherjee & Gries 2009; Götz 2017), whereas Sri Lanka has often been claimed to behave similarly to India, possibly due to epicentral influences (e.g., Mukherjee 2008). As far as the frequencies of the intro-it in newspaper language are concerned, however, neither of these observations can be corroborated. For Sri Lanka, we could argue that this finding is very much in line with recent literature that situates SLE between phase 3 and phase 4 concerning its evolutionary status (e.g., Bernaisch 2015; Kircili et al. 2024).
In this phase, we would indeed expect innovative and particular features to emerge in SLE, including a more frequent use of certain structural features, such as the construction at hand. For MvE, on the other hand, these high frequencies seem to be a rather unexpected finding. Therefore, another explanation could be the influence from the indigenous languages spoken in these two speech communities, because a structurally similar feature seems to exist in both Sinhala (Fernando 1973), the language most frequently spoken in Sri Lanka, and Dhivehi (Fritz 2004), the language most frequently spoken on the Maldives. We might tentatively interpret this finding in the sense that there might (also) be L1 influences at play, as documented for other structural features in SAVEs (e.g., Lange 2007 on focus marking in IndE). This hypothesis, however, will require further and much more systematic investigations on different databases which include information on the speakers’/writers’ L1s in order to make more robust claims.
Turning to the structural and semantic preferences of the intro-it, we found that the primary factors that trigger different structural complementation patterns are linguistic ones. More specifically, the quality of the verb has the main influence on the choice of the complementation pattern, that is, we found significantly different complementation preferences determined by whether the verb is (a) in the active or in the passive voice or (b) whether the it is used with a lexical or a copular verb. For verbs in the active voice, the preferred complementation pattern across varieties is to-infinitive clauses, whereas for passive constructions that-clauses are used most frequently. Also, as far as newspaper language is concerned, the previously reported observation that the intro-it is mainly complemented by that-clauses (Biber et al. 1999 on British and American conversation data) only holds true for lexical verbs when the subject is extraposed. For all other cases (i.e., for lexical verbs when objects are extraposed and for all copular verbs), the most frequent complementation pattern is the to-infinitive clause. Extraposed -ing clauses, which have been reported to occur frequently in rather informal spoken discourse (Quirk et al. 1985: 1393), accordingly, do not feature in newspaper data at all. The pattern of the intro-it thus seems to be very genre-sensitive, which stresses the necessity of taking register variation thoroughly into account when describing World Englishes (see also Mahboob & Liang 2014). This fact notwithstanding, such clear preferences triggered by linguistic factors across all varieties are particularly noteworthy in the sense that they might be universal features across all types of Englishes, and that they might indeed constitute parts of the ‘Common Core’ (Quirk et al. 1985: 16) of English.
However, below this level, we did find the complementation patterns of BrE to be significantly different from the SAVEs, or clustering together with MvE or NpE. Turning to our second research question, whether we can observe different preferences in the form of the extraposed clauses between the SAVEs and BrE, we see that our findings are very much in line with previous research on other non-canonical features in SAVEs using the same database (e.g., Götz 2017 on fronting). We attempt to explain these similarities by the observation that MvE and NpE do not seem to have developed as far as the other varieties in terms of their evolutionary state and therefore still behave more similarly to the historical input variety. The Englishes of Pakistan, India, Bangladesh, and Sri Lanka seem to have emerged – and distanced themselves – furthest from BrE, while MvE and NpE are still behaving most similarly to BrE. In the case of Nepal, for example, researchers have not even reached consensus on whether NpE is an EFL or ESL type of English (see Götz 2022 for a summary of the discussion). We therefore might assume that innovative uses of the intro-it correlate with the evolutionary state (Schneider 2007) of these New Englishes, meaning the more ‘advanced’ a variety is, the more it will deviate from BrE or American English (see also Mukherjee & Gries 2009; Götz 2017). This observation also seems to correlate with the status of the English language within the speech community (see also Koch & Bernaisch 2013).
As far as the semantic functions of the intro-it are concerned, we see more variety-specific diversity than on the structural level: apart from BrE and NpE as well as BgE and PkE forming pairs, each of the other varieties has developed unique semantic preferences concerning how it uses the intro-it. While it is again Nepal and the Maldives that show the most similarities to BrE (which we would like to assume to be the case for similar reasons as discussed above), Sri Lanka, India, Bangladesh, and Pakistan use the intro-it in significantly different semantic functions than the other varieties. This might give rise to the assumption that cultural differences become most visible in their different semantic profiles. Alternatively, this might be an indication of different writing styles being employed in newspapers in different varieties of English: some speech communities might put more emphasis on writing about different attitudes, whereas other speech communities might focus more on observational attitudes or observed reports. Since we have only looked into one syntactic pattern, however, this interpretation can at most serve as a hypothesis for future research. Hence, we would like to put special emphasis on this aspect as being a particularly valuable avenue for future research, because, so far, not many studies have discussed the topic of ‘semantic nativisation’ in ESL (see, however, Werner & Mukherjee 2012).
As we hope to have shown with the present study, investigating non-canonical syntactic features in SAVEs from different perspectives can be a worthwhile endeavour, as it has brought to light linguistic features that apply to different types of Englishes and those that are unique and characteristic of certain varieties of English. The intro-it in particular has been a rather underresearched phenomenon within the ESL paradigm, so that we hope to have been able to put this construction ‘on the research map’ in ESL varieties of English.
Also, extending the scope of intro-it studies to the newspaper genre has brought to light different structural complementation patterns that have not yet been revealed for other text types or modes. The same holds true for the semantic categories that we extended in the present study. For the newspaper genre, reporting seems to be a highly frequent function, so that it seemed necessary (and turned out to be relevant) to extend this category further. It would be worthwhile for future research to investigate different (non-canonical) features in the same (but also in different) database(s) in order to test if our findings are universally valid for the varieties under scrutiny or whether they might be construction-, or even database-specific.
13.1 Introduction
Adverbials are by far the most diverse sentence constituents. Not only are they employed to fulfil various semantic functions – from the expression of time and place to that of manner, reason, contrast, or purpose – they also come in various shapes and sizes – from a phrase, constituting nothing but its head (e.g., today), to lengthy subordinate clauses, as exemplified in the construction below.
Instead of writing a five-page-letter to your brother in San Francisco, waiting four weeks for the answer, where he writes that he didn’t really understand what you wanted to ask him[A:cl_c] you can just go to this machine, that is not bigger than a shoe-box, press a certain sequence of buttons and then- hope that your brother is at home.
This sentence-initial subordinate clause is, however, not simply remarkable for its syntactic complexity but also for its status as an optional adverbial, a type of constituent that has, due to its inherent flexibility, mostly been neglected in previous research (Callies Reference Callies2009: 37). This holds true especially in relation to the topic of fronting,Footnote 1 commonly defined as the placement of core elements in sentence-initial position (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999). This chapter intends to break with this habit. In the light of the two approaches to non-canonicity, namely the theory-based and the frequency-based approach, suggested in the Introduction to this volume, a distinction is proposed between two kinds of fronting phenomena, namely fronting (i.e., the initial placement of optional sentence constituents, enabling the consideration of optional adverbials) and preposing (i.e., the initial placement of obligatory constituents). To exemplify this distinction, this chapter investigates the production of adverbial fronting phenomena in German learners of English.
While in Section 13.2 a theoretical basis will be established by discussing not only the structural and functional peculiarities of fronting phenomena in general and adverbial fronting in particular but also previous research endeavours into the topics, Section 13.3 is dedicated to the study itself. First, the research gaps and research questions as well as the databases and methodology are presented before a detailed account and discussion of the results. This is followed by a conclusion as well as an outlook in Section 13.4.
13.2 Taking a Closer Look at Fronting Phenomena
13.2.1 Defining (Adverbial) Fronting Phenomena
English is a traditional and, in fact, ‘one of the most consistent rigid SVO [subject-verb-object] languages’ (Givón Reference Givón2001: 235): its canonical and thus standard syntactic structure involves a subject that precedes its verb and, depending on whether the latter is transitive or not, its object(s) (Givón Reference Givón1993; de Bleser Reference Bleser, Karnath and Thier2012: 426) as well as other obligatory or optional sentence elements, such as complements or adverbials (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 899). In contrast, non-canonical sentence patterns, from the theory-based or structural perspective discussed in the Introduction to this volume, deviate from this Sn-V-X core structure.Footnote 2 One pattern that meets this structural criterion of non-canonicity and, as the following discussion will hopefully show, simultaneously poses a very interesting case with regard to the frequency-based approach to non-canonicity as well, is fronting. It is commonly defined as ‘the initial placement of core elements which are normally found in post-verbal position’ (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985: 1377; Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 900). In fact, as the examples in (1) illustrate, in addition to the employment of the mentioned standard postverbal constituents as initial elements of a sentence, sometimes even the predication is affected (Greenbaum & Quirk Reference Greenbaum and Quirk1990: 163):
(1)
a. Whether I can attend, I don’t know.Footnote 3 (preposed direct object)
b. A strange person he is. (preposed subject complement)
c. John they called the boy. (preposed object complement)
d. Drawing a picture was the little girl. (preposed predication)
e. On the table she put the book. (preposed obligatory adverbial)
f. Last week I called him. (fronted optional adverbial)
In this chapter the focus is on the initial placement of one constituent in particular, namely the adverbial. While research has commonly focused on the discussion of instances like (1e) above, where a grammatically demanded adverbial is moved to the front, I propose an extension of the consideration of ‘core’ constituents to optional adverbials (as in (1f)), reserving the notion of preposing for the commonly marked movement of obligatory constituents and that of fronting for the common – and thus unmarked – sentence-initial placement of optional adverbials. It should be noted here that the term markedness is defined in various ways (see Haspelmath Reference Haspelmath2006 for an overview). In this chapter, and in line with the frequency-based approach to canonicity discussed in the Introduction to this volume, the term refers to the uncommonness or rarity of a feature in language use, as in this case, while the other approach to non-canonicity concerns the deviation from an expected theory-based or structural norm, as suggested by Halliday (Reference Halliday1994) in relation to his marked theme. This makes the topic of this chapter all the more interesting because its focus on fronting phenomena implies that, if a given sentence contains a sentence-initial optional adverbial, it is unmarked or canonical from a frequency-based perspective while the deviation from the S-V-X norm makes it structurally non-canonical. In case of preposing, however, both characteristics of non-canonicity – the frequency-based and the theory-based – are met.
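The terminological proposal above can be restated as a small decision rule. The following Python sketch is purely illustrative and not part of the study's method: the function name `classify_initial_adverbial` and the keys of the returned dictionary are hypothetical, and the rule simply encodes the definitions given in this section (obligatory adverbial in initial position = preposing; optional adverbial in initial position = fronting).

```python
def classify_initial_adverbial(is_obligatory: bool) -> dict:
    """Encode the chapter's terminology for a sentence-initial adverbial.

    Hypothetical helper, for illustration only.
    """
    return {
        # preposing = obligatory constituent moved to the front;
        # fronting = optional adverbial placed sentence-initially
        "label": "preposing" if is_obligatory else "fronting",
        # any non-subject first constituent departs from the S-V-X order
        "structurally_non_canonical": True,
        # only the rare pattern (preposing) is marked in terms of frequency
        "frequency_non_canonical": is_obligatory,
    }


# Example (1e): "On the table she put the book."
example_1e = classify_initial_adverbial(is_obligatory=True)
# Example (1f): "Last week I called him."
example_1f = classify_initial_adverbial(is_obligatory=False)
```

Under this reading, both examples are structurally non-canonical, but only (1e) is also non-canonical from the frequency-based perspective.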
In fact, it has been stressed repeatedly in the literature that preposing (as opposed to fronting) is a comparatively rare pattern (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985; Sinclair Reference Sinclair1990) while the optionality and innate flexibility of the majority of adverbials make them a fruitful but, in the context of non-canonicity, rarely discussed structure. The reason for the proposed extension is that previous research has proven that there are (discourse-)pragmatic, (information-)structural, semantic, and cognitive effects associated with the allocation of an adverbial (be it optional or obligatory) to the first slot within a construction (Sinclair Reference Sinclair1990; Diessel Reference Diessel2005; Wiechmann & Kerz Reference Wiechmann and Kerz2013; Junge et al. Reference Junge, Theakston and Lieven2015), and both preposing and fronting are subject to comparable if not identical (information-structural) implications, a topic that will be discussed later on in this section.
First, however, it is necessary to take a look at the diverse structural and functional peculiarities of adverbials in general. With regard to the former, as briefly indicated in the introduction, adverbials may be realised in various ways and in varying degrees of complexity, ranging from individual adverbs (such as luckily, perhaps, or frankly in (5a–c) below)Footnote 4 over adverb, noun, and prepositional phrases (such as slowly but surely in (3c), last Monday in (3a) or on the table in (3b) below) to different kinds of dependent clauses, illustrated under (2) (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 767–8).
(2)
a. Verbless: If in doubt, the example is out.
b. Finite: Although she likes Jon, she did not invite him.
c. Non-finite (to-inf.): To be honest, I don’t know how to react.
d. Non-finite (-ing): Working tirelessly for hours, she immediately fell asleep.
e. Non-finite (-ed): Compared to last year, they did a good job.
As far as the functional characteristics are concerned, over the years, various classification schemes have been suggested, including Quirk et al.’s (Reference Quirk, Greenbaum, Leech and Svartvik1985) differentiation between adjuncts, conjuncts, disjuncts, and subjuncts or Halliday’s (Reference Halliday1994) and Huddleston and Pullum’s (Reference Huddleston and Pullum2002) distinction between different kinds of adjuncts. The taxonomy adopted here is that by Biber et al. (Reference Biber, Johansson, Leech, Conrad and Finegan1999), who distinguish between so-called circumstance, linking, and stance adverbials. The former type is employed ‘to add circumstantial information about the proposition in the clause’ (Reference Biber, Johansson, Leech, Conrad and Finegan1999: 762), which may be temporal or spatial in nature or provide details about the process or manner, contingency, the extent or degree, or the recipient (Reference Biber, Johansson, Leech, Conrad and Finegan1999: 776–81). They are exemplified in (3):Footnote 5
(3)
Adverbials of circumstance
a. Time: Last Monday she cancelled her meeting.
b. Place: On the table she placed a heavy book.
c. Process/manner: Slowly but surely, the project is evolving.
d. Contingency: Because he lied repeatedly, he lost her trust.
e. Extent/degree: To a certain degree, I think she is right.
f. Recipient: For his family he would move mountains.
The second type, linking adverbials, is used to establish a connection between a given ‘clause (or some part of it) … [and] some other unit of discourse’ (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 762) and it therefore contributes significantly to the cohesion and coherence of a given output. Again, various subtypes, listed in (4), can be distinguished (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 875):
(4)
Linking adverbials
a. Addition/enumeration: Firstly, I want to discuss the unemployment rate.
b. Apposition: In other words, their conclusions were well-founded.
c. Contrast/concession: However, this shouldn’t stop you.
d. Result/inference: Hence, you shouldn’t rely on his help.
e. Summation: All in all, the event was a success.
f. Transition: Incidentally, you never know what life has in store for you.
Finally, stance adverbials express the speaker’s or writer’s attitude, point of view, or opinion as well as their assessment of, for example, the certainty or reliability of the proposition of a given message (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 762). Three major subtypes, listed in (5), can be identified (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 854):
(5)
Adverbials of stance
a. Attitude: Luckily, they returned home safe and sound.
b. Epistemic: Perhaps, I should leave now.
c. Style: Frankly, I don’t give a damn.
The fact that adverbials are (more commonly than not) optional goes hand in hand with their flexibility, that is, with the possibility to employ the constituent in varying slots within a given sentence, ranging, according to Quirk et al. (Reference Quirk, Greenbaum, Leech and Svartvik1985: 490), from Initial, Medial, and End (for the first, post-operator, and final position) to four intermediate slots.
Which position is ultimately chosen may depend on both their structural make-up – with individual adverbs being considered particularly flexible (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985: 491) – and the semantic categories discussed above. While linking adverbials have been found to be most commonly used initially, stance and circumstance/place adverbials tend to be placed in medial or final position, respectively (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999). The situation may differ, however, when several adverbials are used in juxtaposition. Hasselgård (Reference Hasselgård2010), drawing on Quirk et al.’s (Reference Quirk, Greenbaum, Leech and Svartvik1985) distinction, investigated adjunct adverbials and their placement in British English and, in doing so, also took a look at common patterns of adverbial clustering in fronted position. She determined that a combination of two adjuncts, one of time and one of space, was by far the most frequent and that these two are generally the types that are most likely to be found in combinations with other adverbials. Focusing adverbs (e.g., especially), considered as subjuncts in Quirk et al.’s (Reference Quirk, Greenbaum, Leech and Svartvik1985) taxonomy, are also common, but, if they are used, they always appear in cluster-initial position, modifying the following adverbial (Hasselgård Reference Hasselgård2010: 92–3).
However, the semantic class poses just one factor that exerts an influence on the positioning of adverbials. Even greater is the influence of powerful mechanisms that are at play when structuring a message, namely those of information structure (Halliday Reference Halliday1967), defined as ‘the complex interaction of numerous phenomena and principles that govern the organization of information in discourse’ (Callies Reference Callies2009: 10; see also Féry & Ishihara Reference Féry and Ishihara2016). These involve aspects like the thematic structure, the information status progression, syntactic weight, focus placement, or other stylistic reasons, some of which will be briefly discussed in the following.
The thematic structure, as the first information-structural principle, is concerned with the organisation of a clause as a message (Halliday Reference Halliday1994: 37). From a Hallidayan perspective, the theme, as the starting point of the message, which is further developed in what follows (i.e., the rheme), commonly constitutes the subject. If this criterion is not met, however, and the position is occupied by a different constituent, the theme is inevitably marked and thus receives special attention or emphasis (Reference Halliday1994: 44). This view is supported by numerous researchers, including Quirk et al. (Reference Quirk, Greenbaum, Leech and Svartvik1985), who consider the thematic constituent ‘the first thing that strikes the speaker … [while] the rest is added as an afterthought’ (Reference Quirk, Greenbaum, Leech and Svartvik1985: 1377) or Virtanen (Reference Virtanen and Virtanen2004: 81), who equates thematicity with informational foregrounding. This makes apparent the special status the first constituent occupies within a given sentence and thus provides evidence for the relevance of optional adverbials in relation to the topic of non-canonicity. Diessel (Reference Diessel2005), in his investigation of different types of adverbial clauses, also recognises the importance of the first position and makes clear that adverbials can be particularly beneficial in that they may be ‘used to organize the information flow in the ongoing discourse … [and] function to provide a thematic ground or orientation for subsequent clauses’ (Reference Diessel2005: 459).
The topic of information flow directly leads to another important principle that is strongly affected by fronting phenomena, namely the information status progression. It is concerned with the newness and givenness of information (Clark & Haviland Reference Clark, Haviland and Freedle1977; Prince Reference Prince and Cole1981) and thus with facts which are ‘new and at the centre of the … communicative interest’ (Bache & Davidsen-Nielsen Reference Bache and Davidsen-Nielsen1997: 114) and those which are ‘assume[d] to be known’ (Bache & Davidsen-Nielsen Reference Bache and Davidsen-Nielsen1997: 113) or are retrievable ‘from the preceding discourse’ (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 896). In the English language, the so-called given-new contract is assumed to hold between interlocutors (Clark & Haviland Reference Clark, Haviland and Freedle1977: 3), implying that the standard progression of the information status is from what is known to what is newly introduced to an interaction. As previous research has shown, this sequence may, in fact, be adhered to by means of both fronted and preposed adverbials. While this observation may seem natural, for example, for linking adverbials, which establish a connection between the previous and the upcoming discourse, it has also been confirmed for sentence-initial clausal adverbials which, according to Biber et al. (Reference Biber, Johansson, Leech, Conrad and Finegan1999: 835), frequently entail known information and, in a way, prepare the recipient for the new information that is to follow in the remainder of the message.
Two major principles that are also frequently associated with discourse-new information are those of end-focus and end-weight. The former claims that special focus, and thus emphasis, is placed on the final ‘lexical item of the last element in the clause’ (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 897), which commonly coincides with information that is newly introduced to the discourse. Likewise, new information is commonly provided in a ‘longer, “heavier” structure’ (Greenbaum & Quirk Reference Greenbaum and Quirk1990: 398), which, in accordance with the principle of end-weight (Behaghel Reference Behaghel1909: 139), is likely to be placed at the end of a sentence and after providing the necessary context. This reduces cognitive efforts on the part of both the producer and the recipient of a message, given that lengthy constituents do not have to be retained in the working memory until the point of grammatical completeness of a given construction is actually reached.
The way adverbials interact with these principles is twofold. On the one hand, researchers like Virtanen (Reference Virtanen2008) have found that adverbials of manner are, for example, frequently moved to the front in order to allow more complex constituents to be placed finally, thus enabling the adherence to the principle of end-weight. On the other hand, however, previous studies, in particular into the positioning of clauses, have determined that the clause semantics may, in fact, override (information-)structural as well as cognitive factors (Diessel Reference Diessel2005; Wiechmann & Kerz Reference Wiechmann and Kerz2013), that is, that lengthier and more complex constituents, which naturally entail greater processing efforts, or those that contain discourse-new and focused information, are put first due to their semantic class. Diessel (Reference Diessel2001) developed a sequence of adverbial clauses based on their likelihood to be encountered before or after their juxtaposed main clause. According to his findings, the most frequent fronted adverbial clauses (all of which are circumstance adverbials in Biber et al.’s Reference Biber, Johansson, Leech, Conrad and Finegan1999 framework) are conditionals, followed by temporal and causal clauses, while result and purpose clauses are least commonly found sentence-initially.
However, as has been established, such an intentional deviation from the canonical word order, and thus from common information-structural principles, brings with it the possibility to establish emphasis, which is also at play when dramatic or rhetoric effects are to be achieved (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985: 1378) or when, in line with Givón’s (Reference Givón and Haiman1985) principle of task-urgency, which highlights ‘the main purpose of the utterance’ (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 904), the element that is placed initially is ‘contextually most demanded’ (Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985: 1377).
The final two functions of fronting phenomena that will be discussed here are concerned with the relation between sentences in discourse. By using determiners or ‘anaphoric deictic markers such as this, that, these and such’ (Callies Reference Callies2009: 39), by repeating given information (Callies Reference Callies2009: 39), or by the use of particular types of linking adverbials, both preposing and fronting pose a fruitful means to establish cohesion or, as illustrated below in (6), to express a contrasting relationship between units of discourse:
(6)
Sam thought speaking in front of many people would make her uncomfortable. However, she ultimately enjoyed it very much.
However is used here to mark the contrast between the expected experience expressed in the first and the actual experience referred to in the second sentence. As becomes clear, this is typically achieved by providing different alternatives that are contrasted by means of links acting as contrasting connectives (Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999: 900–1). In fact, according to Callies (Reference Callies2009: 39), this ‘contrastive emphasis’ is actually the most common function of fronting phenomena. However, the function of establishing cohesion and coherence is not exclusive to linking adverbials. As Hasselgård (Reference Hasselgård1996) makes clear, both temporal and spatial adverbials may also contribute to cohesive discourse (cf. Reference Hasselgård1996: 110).
Having discussed the diverse structural and functional realisations of adverbials as well as the information-structural implications their initial placement may have, one final aspect that needs to be mentioned is that of inversion, without which, according to Biber et al. (Reference Biber, Johansson, Leech, Conrad and Finegan1999: 900), a true understanding of fronting phenomena is not possible, since the preposing of certain elements, among them particular types of adverbials, triggers the phenomenon. This is illustrated in the examples under (7):
(7)
a. [On the wall]A [hung]V [a nice picture.]S (subject-verb inversion)
b. [Next]A [came]V [the Queen.]S (subject-verb inversion)
c. [Rarely]A [have]Aux [I]S [seen]V [this.]O (subject-operator inversion)
As indicated, the initial placement of various types of adverbials may lead to the inversion of the subject and the entire verb phrase or its operator. These include opening adverbials of place, as in (7a), or time, illustrated in (7b), as well as negative or restrictive elements such as only, never, scarcely, or, as in (7c), rarely.
13.2.2 Previous Research into Adverbial Fronting Phenomena in Learner Language
Previous research on fronting phenomena as such has been wide-ranging. When it comes to preposing, quite a bit of research has been dedicated to both English as a Native Language and English as a Second Language contexts (e.g., Quirk et al. Reference Quirk, Greenbaum, Leech and Svartvik1985; Birner & Ward Reference Birner and Ward1998; Biber et al. Reference Biber, Johansson, Leech, Conrad and Finegan1999; Esser Reference Esser, Fischer, Tottie and Lehmann2002; Huddleston & Pullum Reference Huddleston and Pullum2002; Ward & Birner Reference Ward, Birner, Horn and Ward2006 on ENL; Lange Reference Lange2012; Winkle Reference Winkle2015; Götz Reference Götz2017 on ESL, to name but a few),Footnote 6 all of which have greatly contributed to the understanding of the structure, its characteristics, and its underlying mechanisms, also discussed by Götz and Kircili (Chapter 12 in this volume).
Research into learner language, however, has been scarce, with two laudable exceptions being Plag and Zimmermann (Reference Plag, Zimmermann, Börner and Vogel1998) and Callies (Reference Callies2009), both of whom investigated German learners of English. Callies (Reference Callies2009), using a multi-method approach, investigated multiple non-canonical sentence patterns. His results on preposing confirm that learners, just like native and second-language speakers of English, rather shy away from employing the construction, with the corpus analysis yielding only five instances of preposing of an obligatory constituent, one of which had been produced by a native speaker and four by learners of English (Reference Callies2009: 194–5). In relation to the use of sentence-initial optional adverbials, Callies (Reference Callies2009: 178) remarks that they tend to be overrepresented in the L2 output of German learners, which may be due to the fact that they are not familiar with the previously mentioned principles of end-weight or end-focus.
Other studies on the use of adverbials and adverbial fronting by learners have paid a lot of attention to the use of connectors/linking adverbials and, in this connection, especially their overrepresentation in learners from different L1 backgrounds, which is, in fact, so common that it has been referred to as a potential ‘universal feature of interlanguage’ (Gilquin et al. Reference Gilquin, Granger and Paquot2007: 328). Granger and Tyson (Reference Granger and Tyson1996), for instance, compared their employment by native speakers with that by French learners. They determined that there is no general overrepresentation but solely an over- or underrepresentation of connectors with particular functions. An overrepresentation was determined, for example, for connectors that either corroborate and thus support the argument (e.g., in fact, indeed) or provide an example (e.g., for instance, namely) or additional information (e.g., moreover) (Reference Granger and Tyson1996: 20). Connectors which serve to establish contrast (e.g., though, however) or cohesion (e.g., thus, therefore) were found to be less frequently used by learners. This implies that, while connector use for emphasis or exemplification seems common, using them to ‘change the direction’ of arguments or develop them logically does not (Granger & Tyson Reference Granger and Tyson1996: 20). Reasons include the insufficient knowledge of students concerning the semantic restrictions and the syntactic behaviour of individual adverbials as well as their ‘inexperience in manipulating connectors within the sentence structure’ (Granger & Tyson Reference Granger and Tyson1996: 25). They conclude that, instead of providing them with lists of words and phrases, students should learn to use them not as stylistic enhancers but as higher-level discourse units with a variety of semantic functions and a high degree of syntactic flexibility. 
This increased awareness of their semantic, stylistic, and syntactic properties, achieved, for instance, by studying authentic texts, might help students to attain the ultimate goal behind the use of connectives, namely cohesion in discourse (Granger & Tyson Reference Granger and Tyson1996: 25–6). In fact, this connection between curricular activities, designated teaching materials, and learner productions has been demonstrated in a number of studies; so much so that van Vuuren and Berns, in their Reference Vuuren and Berns2018 study on Dutch and Francophone learners of English, referred to it as a ‘teaching-induced interlanguage feature’ (Reference Vuuren and Berns2018: 457) (see also Field & Oi Reference Field and Oi1992 on Cantonese; Altenberg & Tapper Reference Altenberg, Tapper and Granger1998 on Swedish; Milton Reference Milton and Granger1998 on Chinese; or Leńko-Szymańska Reference Leńko-Szymańska2008 on Polish, Russian, Spanish, French, Swedish, Finnish, and German learners of English).
Another common focus of previous research endeavours has been on transfer-related performances. For Dutch learners, for example, it has been found that they use more circumstantial and linking adverbials, while the number of stance adverbials in fronted position is smaller than that used by native writers. With increasing proficiency, however, learners’ usage develops in the direction of native writing. The only exception is linking adverbials, which the learners continued to overuse. This fact is attributed to both the teaching materials and language transfer (van Vuuren Reference Vuuren2017; van Vuuren & Laskin Reference Vuuren and Laskin2017), given that linking adverbials (as well as local anchors) as cohesive devices were not only overrepresented in the English L2 output of their Dutch participants but also in their L1 writing (van Vuuren & de Vries Reference Vuuren, de Vries, de Haan, de Vries and van Vuuren2017; see also Larsson Reference Larsson2017 on the use of probably and possibly by Swedish and Norwegian and of maybe and perhaps by Spanish learners of English in Larsson et al. Reference Larsson, Callies, Hasselgård, Laso, van Vuuren, Verdaguer and Paquot2020).
13.3 Adverbial Fronting Phenomena in Learner Language
13.3.1 Research Gaps and Research Questions
Following the account of previous research endeavours into the topic, two research gaps were identified:
(1) There is a lack of research into the formal and functional peculiarities of adverbial fronting phenomena in German learner language going beyond the consideration of particular kinds of adverbials.
(2) There is a lack of research involving predictive modelling that might contribute to the understanding of the use of the structures by learners of English.
Consequently, this study intends to answer the following research questions:
(1) Can significant differences be determined between the frequencies of the different semantic and/or syntactic types of adverbials in the native speaker vs. the learner data and/or do the findings support previous research (particularly on conjuncts/adjuncts)?
(2) Can information-structural variables (information status or the syntactic complexity of the initial element) be determined as predictors for the use of adverbial fronting in the two corpora?
13.3.2 Databases and Methodology
This study makes use of the German sub-component of the International Corpus of Learner English (ICLE) and the Louvain Corpus of Native English Essays (LOCNESS; Granger Reference Granger and Granger1998). ICLE is a corpus of argumentative essays produced by advanced (university-level) students of English, compiled with the aim of enabling the investigation of ‘the interlanguage of advanced learners from various mother-tongue backgrounds’ (Granger & Tyson Reference Granger and Tyson1996: 18). The version employed for this chapter contains a total number of approximately 3.7 million words from 16 different first-language backgrounds. The German component contains about 234,000 words (Granger et al. Reference Granger, Dagneaux and Meunier2009).
LOCNESS was compiled at the Université catholique de Louvain in order to provide control data for ICLE (Granger & Tyson Reference Granger and Tyson1996: 19). The corpus contains approximately 320,000 words and it, too, is mostly made up of argumentative essays of 500+ words on a variety of topics, written by British A-level as well as British and American university students (‘LOCNESS description’ n.d.). For the present study, it was decided to use the American English argumentative essays in spite of the fact that the variety of instruction in Europe (and thus also in Germany) tends to be British English (Leńko-Szymańska Reference Leńko-Szymańska2008) because the former is commonly considered ‘the most important and influential dialect’ due to ‘films, television, popular music, the Internet and the World Wide Web, air travel and control, commerce, scientific publications, economic and military assistance, and activities … in world affairs’ (Algeo Reference Algeo2010: 183; cf. also Schneider Reference Schneider, Kachru, Kachru and Nelson2009). Consequently, the assumption is that the language of university students in their early twenties is likely to be shaped more by the continuous influence of and exposure to the language in these areas than by what they were taught at school and thus predominantly in the first years of language acquisition. Table 13.1 provides an overview of the material used.

Table 13.1
                         ICLE    LOCNESS    Total
Number of essays           74         48      122
Number of sentences      1619       1630     3249
Excluded sentences        193        142      335
Total                    1426       1488     2914
As indicated in Table 13.1, 74 out of 437 ICLE-Germany essays were randomly selected and compared with 48 out of 188 American English argumentative essays from LOCNESS, with the topics ranging from experiences with friends, pros and cons of cars, and environmental issues (in ICLE) to important discoveries, capital punishment, or feminism (in LOCNESS). Although the difference in the number of essays is not ideal, it was deemed more important to have a comparable number of sentences and to analyse complete essays (as opposed to the consideration of, e.g., only a particular number of sentences per essay). This has the advantage that, for instance, summative linking adverbials, which are naturally used towards the end of a text, also feature in the analysis.
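The sentence totals in Table 13.1 follow from simple subtraction of the excluded sentences. The Python sketch below re-derives them from the figures reported in the table; the dictionary layout and the function name are illustrative assumptions and not part of the study.

```python
# Figures as reported in Table 13.1; the data layout is illustrative only.
table_13_1 = {
    "ICLE": {"essays": 74, "sentences": 1619, "excluded": 193},
    "LOCNESS": {"essays": 48, "sentences": 1630, "excluded": 142},
}


def analysed_sentences(row):
    """Number of sentences remaining after the exclusions."""
    return row["sentences"] - row["excluded"]


analysed = {name: analysed_sentences(row) for name, row in table_13_1.items()}
total_analysed = sum(analysed.values())

for name, count in analysed.items():
    print(name, count)
print("Total", total_analysed)
```

The two per-corpus totals differ by only 62 sentences, which reflects the stated aim of keeping the number of sentences comparable despite the unequal number of essays.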
Turning to the annotation process itself, Table 13.2 shows the parameters for which each sentence was manually annotated. It is based on an annotation scheme used by Götz (2017) for her study on fronting phenomena in South Asian varieties of English, which was adopted with minor adjustments. Other than the given-new status (category gn) and the semantic function (front_function), the majority of variables are dedicated to the syntactic form and function of the first constituent of every sentence (e.g., front_el, front_form, front_obl_tf). While the explanations provided in the table are sufficient for the majority of the individual categories, a more detailed discussion is in order when it comes to the given/new distinction. According to Prince’s (1981) taxonomy of assumed familiarity, a threefold distinction can be made between new information, inferable information, and evoked (or given) information (see also Pham, Chapter 9 in this volume). In a first step of the analysis, this distinction was generally followed (with a slightly more extensive consideration of inferable entities than originally suggested in Prince’s work) to obtain a more fine-grained picture of the information status in the essays. For the sake of comparability with other studies, however, the inferable and the given categories were then conflated, so that the analysis proceeds with a twofold distinction between given and new information. For the statistical evaluation, different statistical tests and methods were employed, including descriptive statistics, conditional inference trees (cf. Hothorn et al. 2006), and logistic regressions using the R package dplyr in RStudio.

Table 13.2
Category          Explanation
CORPUS            Construction at hand belongs to ICLE or LOCNESS
FILENAME          Filename of the essay the construction is taken from
C or CN           Sentence is canonical (C) or non-canonical (NC)
S_LENGTH          Number of words the sentence at hand consists of
FRONT_IF          Sentence-initial element constitutes subject (F) or any other sentence constituent (T)
GN                First element constitutes given or inferable (G) or new (N) information
A_FRONT_TF        First element constitutes adverbial (T) or not (F)
FRONT_EL          First element constitutes subject (S), object (O), complement (C), verb (V), or adverbial (A)
FRONT_FORM        Syntactic realisation as one-word expression (x), phrase (NP, PP, ADVP, ADJP), or clause (CL)
FRONT_FUNCTION    Semantic function of the adverbial: circumstance (c), stance (s), or linking (l)
FRONT_OBL_TF      Sentence-initial element is obligatory (T) or optional (F)
LENGTH_EL_I       Number of words the first element consists of
CLUST_TF          Sentence at hand begins with adverbial cluster (T) or not (F)
13.3.3 Results and Discussion
To establish a basis for the following results, Figure 13.1 provides an overview of the frequency distributions of the placement of different sentence elements in sentence-initial position.

Figure 13.1 Frequency distribution of different sentence elements in fronted position (percentage)
Figure 13.1 Long description
Grouped bar chart. The vertical axis ranges from 0 to 80 in increments of 10. The horizontal axis shows the sentence-initial elements subject (S), verb (V), object (O), complement (C), and adverbial (A), each with a dark bar for ICLE and a light bar for LOCNESS. Subjects and adverbials dominate in both corpora; the other elements are much rarer. Values (S, V, O, C, A): ICLE 54.56, 0.00, 1.47, 0.28, 43.69; LOCNESS 69.15, 0.07, 0.67, 0.20, 29.91.
As can be seen in Figure 13.1, objects, complements, and verbs were hardly found sentence-initially in the data, while the canonical subject was the most commonly chosen sentence-initial constituent in both populations, constituting 54.56% (n = 779) of the 1,426 sentences taken from ICLE and 69.15% (n = 1029) of the 1,488 sentences from LOCNESS. Adverbials were the second most frequent sentence-initial element in both corpora, employed in 29.91% (n = 445) of the cases in LOCNESS compared to 43.69% (n = 622) in ICLE. Instances of adverbial preposing (and thus the use of obligatory adverbials as sentence-starters) were, as suggested by previous research, exceptionally rare, constituting only 2.02% (n = 9) of the instances identified in the native speaker data and as little as 0.48% (n = 3) of the cases in the learner data. In both populations, see (8) and (9), it was either the use of negative or restrictive adverbs or the use of spatial references, both of which trigger inversion, that resulted in the employment of preposing (see footnote 7):
(8) Next door to me[A:NP_c] lives a typical representative of this kind.
(9) Never at once[A:AdvP_c] in an article[A:PP_c] does it mention that euthanasia precedes some type of consequence.
This observation already illustrates the crucial role of adverbials in relation to fronting (as the initial placement of optional constituents) since they are, by far, the most common non-canonical fronted element.
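The preposing shares reported above are consistent with taking the sentence-initial adverbials of each corpus as the denominator (445 in LOCNESS, 622 in ICLE). The Python sketch below makes that reading explicit; the choice of denominator is an inference from the reported figures, and the names are illustrative.

```python
def share(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts reported in the text: (preposed adverbials, sentence-initial adverbials).
counts = {"LOCNESS": (9, 445), "ICLE": (3, 622)}

preposing_share = {
    corpus: share(preposed, adverbials)
    for corpus, (preposed, adverbials) in counts.items()
}
print(preposing_share)
```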
Zooming in on their different functions, Figure 13.2 provides an overview of percentage distributions of the different semantic types (circumstance, linking, stance) of adverbials in the two corpora.

Figure 13.2 Frequency distribution of adverbial functions in ICLE and LOCNESS (percentage)
Figure 13.2 Long description
Grouped bar chart. The vertical axis ranges from 0 to 60 in increments of 10. The horizontal axis shows the semantic functions circumstance (c), linking (l), and stance (s), each with a dark bar for ICLE and a light bar for LOCNESS. ICLE is lower for c and higher for l and s. Values (c, l, s): ICLE 39.65, 41.57, 18.78; LOCNESS 51.24, 32.81, 15.96.
Based on the percentages, we can derive clear preferences. While both groups least frequently opted for stance adverbials, circumstance and linking adverbials were preferred by native speakers and German learners, respectively, with both the stance and linking adverbials being significantly more frequently employed by the latter group (stance: χ2 = 10.829, df = 1, p < 0.001; linking: χ2 = 31.05, df = 1, p < 0.001). The results concerning linking adverbials in particular thus seem to confirm previous research in that these adverbials tend to be overrepresented in learner data. In fact, a closer, more qualitative look at individual concordance lines shows that the results corroborate some of the findings reported by Granger and Tyson (1996). Just like the French learners in their study, the German students seem to rely on fronted adverbials with a linking function to expand on information previously provided, using, for example, moreover, therefore, or so, but refraining from sentence-initial also, a potential false friend in German, which is found only twice in the learner but 16 times in the native speaker data. Interestingly, the establishment of contrast, commonly fulfilled, for example, by however in the American data, was instead achieved by the use of the coordinating conjunction but, used as many as 81 times in ICLE (compared to 13 instances in LOCNESS). Given that, even in their mother tongue, starting a sentence with but is considered stylistically rather clumsy, it is surprising that German learners still felt compelled to use the conjunction for this purpose in their L2 output. In both corpora, linking adverbials were also identified as the semantic type that was most commonly attested in adverbial clusters, as exemplified below.
Both combinations, illustrated in (10) and (11), contain linking as well as circumstance adverbials (roughly corresponding to the adjunct category investigated by Hasselgård 1996).
(10) On the one hand[A:PP_l] in some cases[A:PP_c] the car is a really useful thing […].
(11) However[A:Adv_l] if that person knew what he should do […],[A:cl_c] he would not have erroneous answers.
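The bracketed tags in the examples appear to encode the sentence element, its syntactic form, and its semantic function (e.g., [A:PP_l] for an adverbial realised as a prepositional phrase with a linking function). Assuming that reading, the following Python sketch extracts the tags from an annotated sentence and flags adverbial clusters. The regular expression, the function names, and the cluster criterion (more than one tagged initial adverbial) are assumptions made for illustration, not the annotation tooling used in the study.

```python
import re

# Pattern for tags such as [A:PP_l]: element, form, function (assumed reading).
TAG = re.compile(r"\[([A-Z]):([A-Za-z]+)_([a-z])\]")

FUNCTIONS = {"c": "circumstance", "l": "linking", "s": "stance"}


def parse_tags(sentence):
    """Return (element, form, function) for every tag found in the sentence."""
    return [
        (element, form, FUNCTIONS.get(function, function))
        for element, form, function in TAG.findall(sentence)
    ]


def is_cluster(sentence):
    """Treat a sentence with more than one tagged initial adverbial as a cluster."""
    return len(parse_tags(sentence)) > 1


example_10 = (
    "On the one hand[A:PP_l] in some cases[A:PP_c] "
    "the car is a really useful thing […]."
)
print(parse_tags(example_10))
print(is_cluster(example_10))
```

The ellipsis marker […] is left untouched because it does not match the element:form_function pattern.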
Moving on to the structural distribution, Figure 13.3 shows the frequency of the structural realisations of sentence-initial adverbials (i.e., CLAuse, PHRase, SINgle adverb) in percent.

Figure 13.3 Frequency distribution of syntactic realisations in ICLE and LOCNESS (percentage)
Figure 13.3 Long description
Grouped bar chart. The vertical axis ranges from 0 to 60 in increments of 10. The horizontal axis shows the syntactic realisations of sentence-initial adverbials, clause (CLA), phrase (PHR), and single adverb (SIN), each with a dark bar for ICLE and a light bar for LOCNESS. ICLE is lower for CLA and PHR and higher for SIN. Values (CLA, PHR, SIN): ICLE 21.22, 30.71, 48.07; LOCNESS 29.44, 35.73, 34.83.
What can be seen here is that the native speakers tended to use more complex syntactic realisations (i.e., phrases or clauses) more frequently, namely in 35.73% and 29.44% of the cases, while learners only used them in 30.71% and 21.22% of the instances, respectively. Still, the only statistically significant difference was determined for single constituents (χ2 = 45.674, df = 1, p < 0.001), used in 48.07% (n = 299) of the learner and 34.83% (n = 155) of the native speaker cases. The tendency for the American students to express themselves in more complex structures is exemplified in (12) to (14) below. While the first two sentences contain phrasal structures that, in ICLE, would have predominantly been expressed using a single constituent (e.g., therefore), the third example is particularly interesting: as is frequently the case in this subcorpus, the participant fronts a dependent clause, presumably in order to emphasise the reason behind the points that were made in the preceding discourse.
(12) For this reason[A:PP_l] people did not travel extremely long distances […].
(13) Because of this,[A:PP_l] it has been met with great opposition.
(14) In order for these people to be sure […],[A:cl_c] they need to be assured that it is safe and more economical.
This final example is also indicative of the observations made for the syntactic preferences of each type of adverbial, with circumstance adverbials being the only case for which clauses were preferred by native speakers while learners rather opted for phrase structures. For each of the other semantic categories, the single-word expression was, in fact, the most commonly chosen alternative.
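The chapter does not spell out how the χ2 values were computed. One reading that reproduces the value reported above for single constituents is a goodness-of-fit test on the two raw counts (299 in ICLE, 155 in LOCNESS) against equal expected frequencies. The Python sketch below works under that assumption and uses only the standard library; the function names are illustrative.

```python
import math


def chi_square_equal_expected(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df1(chi2):
    """Upper-tail p-value of the chi-square distribution with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Single-word adverbials as reported in the text: ICLE n = 299, LOCNESS n = 155.
chi2 = chi_square_equal_expected([299, 155])
p = p_value_df1(chi2)
print(round(chi2, 3), p < 0.001)
```

Under this reading the test compares raw counts only; it does not take into account that the two corpora contain different numbers of sentence-initial adverbials.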
Regarding the second research question, enquiring whether certain information-structural variables can be identified as predictors for adverbial fronting, the ctree in Figure 13.4 already alludes to the fact that, overall, there are significant differences between the two corpora. Likewise, both the given-new variable and the sentence length proved to be of relevance to the nature of the constituent that is found in sentence-initial position.

Figure 13.4 Ctree: Distribution of front_el~corpus+s_length+gn
Figure 13.4 Long description
The conditional inference tree shows a hierarchical classification based on corpus, sentence length (S_LENGTH), and givenness (GN: G or N), ending in bar charts. At the top, Node 1 is labeled CORPUS and branches into LOCNESS and ICLE.
The LOCNESS branch leads to Node 2, labeled S_LENGTH, which splits into two paths:
Up to 16 words, leading to Node 3, labeled GN, which divides into:
G, leading to Node 4,
N, leading to Node 5.
More than 16 words, leading to Node 6, labeled GN, which divides into:
N, leading to Node 7, labeled S_LENGTH, which divides into:
Up to 34 words, leading to Node 8,
More than 34 words, leading to Node 9.
G, leading directly to Node 10.
The ICLE branch from Node 1 leads to Node 11, labeled S_LENGTH, which divides into:
Up to 22 words, leading to Node 12, labeled GN, which divides into:
N, leading to Node 13,
G, leading to Node 14.
More than 22 words, leading directly to Node 15.
All terminal nodes (Nodes 4, 5, 8, 9, 10, 13, 14, 15) include small bar charts.
Each bar chart shows relative frequency values from 0 to 1 across five categories labeled A, C, O, S, and V.
In general, ctrees (Hothorn et al. 2006) are based on recursive partitioning algorithms, providing a means to meaningfully split the values or manifestations of the dependent variable based on the independent variables that are assumed to exert an influence. One major advantage is that ctrees also detect behavioural patterns in fairly small datasets, which makes the method well suited to the present chapter.
As can be seen, even though the first significant split separates the learner from the native speaker data, they also share certain behaviours, such as the fact that the next important splits (Node2 (p < 0.001), if the left branch of the tree is followed, and Node11 (p < 0.001), which is reached to the right) are both based on the length of the sentence at hand. Focusing on the learners, and thus on the right part of the tree, sentence-initial adverbials, compared to other sentence-initial constituents, are particularly commonly chosen in sentences that are longer than 22 words (cf. Node15). If they are shorter than that, the given-new distinction is of relevance (Node12; p = 0.02) and they are more frequently employed if they contain given information (Node14). When it comes to native speakers, and thus the left part of the tree, Node2 separates sentences from each other that are either longer or shorter than 16 words and in both cases, the next important split is the given-new status of the first constituent (Node6 (p = 0.008) for sentences that exceed 16 words, and Node3 (p = 0.034) for those that are shorter). If the branches of Node3 are followed, it can be seen that the percentages of canonical sentences (commencing with a subject) are much higher than those of sentence-initial adverbials (cf. Node4 and Node5). Interestingly, in Node5, which provides the frequency distributions for new information, the percentage of fronted adverbials is slightly higher than when the information is given (visible in Node4). This is striking because, as mentioned previously, the standard progression of the information status in an English sentence would commence with given and proceed to new information. This observation also holds true for particularly long constructions. 
If the left branch of Node6 is followed, which, again, refers to new information in sentence-initial position, there is another split (Node7; p = 0.025) between sentences that are up to 34 words long and those that exceed 34 words, with the percentage being particularly high for the latter instances (Node9; note, however, that this figure is only based on 34 cases).
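The splits described above can be read as a set of nested decision rules. The Python sketch below routes a single sentence to a terminal node using the thresholds named in the text (16, 34, and 22 words) and the given-new status of the first constituent. It is an illustrative reconstruction, not the fitted model: node numbers 8 and 13 are not mentioned in the running text and are inferred from the depth-first numbering of the other nodes, and the splits are assumed to be of the form "up to" versus "more than".

```python
def terminal_node(corpus, s_length, gn):
    """Route one sentence to a terminal node of the tree in Figure 13.4.

    corpus: "ICLE" or "LOCNESS"; s_length: number of words in the sentence;
    gn: "G" (given/inferable) or "N" (new) for the first constituent.
    Node ids 8 and 13 are inferred, not quoted from the text.
    """
    if corpus not in ("ICLE", "LOCNESS") or gn not in ("G", "N"):
        raise ValueError("unknown corpus or given-new code")

    if corpus == "LOCNESS":                 # Node 1: split on corpus
        if s_length <= 16:                  # Node 2: split on sentence length
            return 4 if gn == "G" else 5    # Node 3: split on given/new
        if gn == "G":                       # Node 6: split on given/new
            return 10
        return 8 if s_length <= 34 else 9   # Node 7: split on sentence length

    if s_length > 22:                       # Node 11: split on sentence length
        return 15
    return 14 if gn == "G" else 13          # Node 12: split on given/new


print(terminal_node("ICLE", 30, "N"))
print(terminal_node("LOCNESS", 40, "N"))
```

The function only identifies the node a sentence falls into; the distribution of sentence-initial elements within each node is shown in the bar charts of the figure and is not reproduced here.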
However, to determine whether the three variables corpus, gn, and s_length also have a predictive power, the glm function (allowing for two-way interaction) was used and yielded the results depicted in Figure 13.5.

Figure 13.5 Effect plot of glm a_front_tf~log(s_length)+gn+corpus+gn:corpus
Figure 13.5 Long description
Two effect plots side by side show how the predictors relate to a_front_tf. The left plot, titled S_LENGTH effect plot, displays a single line with a shaded confidence band: as sentence length increases from 0 to 80 along the x-axis, the a_front_tf value on the y-axis rises from 0.1 to 0.6. The right plot, titled GN by CORPUS effect plot, has two smaller panels. The first, labeled CORPUS = ICLE, shows a line sloping downward from G to N. The second, labeled CORPUS = LOCNESS, shows a line sloping upward from G to N.
Even though the model ‘only’ explains 5.9% of the variance in the data (R² = 0.059), it turned out to be highly significant and to perform significantly better than the baseline model. As the effect plots in Figure 13.5 make clear, the observations made for the above ctree are largely confirmed, and the variable Corpus was returned as a significant predictor (p < 0.001). The plot on the left indicates that, for both varieties, the likelihood of adverbial fronting increases significantly with increasing sentence length (p < 0.001), an observation that can be explained by the fact that additional/explicatory information or temporal references, as frequently supplied by circumstantial adverbials, are commonly provided in longer structures as in (15):
(15) From the times of the Roman Empire, when, for the amusement of the multitude, dissenters and runaway slaves were torn to shreds by ferocious animals, to modern entertainment in the form of brandished baseball bats, hurled cobblestones and indiscriminating hooliganism at soccer matches,[A:cl_c] violence has always found some people to whom it appeals as an attractive pastime.
The plot on the right, on the other hand, implies that there is a significant interaction between the given-new structure and the corpus. While the learners show a tendency for adverbial fronting to occur when the constituent contains given information, in LOCNESS it is significantly more frequently used with new information (p < 0.05) as illustrated in (16) and (17).
(16) When the death penalty is requested as a sentence,[A:cl_c] it is usually based upon the rage of our society towards a criminal’s violent act.
(17) When my tea finally arrived,[A:cl_c] I had decided that I didn’t want to drink it.
At first sight, this seems like a surprising result given that it implies that, while German learners structure their sentences in conformity with the given-new contract, native speakers frequently opt to deviate from the standard progression when using sentence-initial adverbials. However, considering the suggestion made by Diessel (2005), the finding may support the view that, under certain circumstances, particular information-structural principles can override each other and that a deviation is acceptable, for example, to enable an adherence to the principles of end-weight or end-focus or in order to achieve balanced weight, as in (16) above, where the lengths of the preverbal and the postverbal constituents (in number of words) are roughly the same.
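The model formula given for Figure 13.5, a_front_tf ~ log(s_length) + gn + corpus + gn:corpus, corresponds to a linear predictor on the log-odds scale that is converted into a probability with the inverse logit. The Python sketch below illustrates this mechanism only. The coefficients are hypothetical values invented for illustration (the chapter does not report them), chosen so that their signs mirror the direction of the effects described in the text, and the reference levels (gn = G, corpus = ICLE) are likewise an assumption.

```python
import math

# Hypothetical coefficients, invented for illustration only (not from the chapter).
# Assumed reference levels: gn = "G", corpus = "ICLE".
COEF = {
    "intercept": -2.0,
    "log_s_length": 0.6,
    "gn_N": -0.3,
    "corpus_LOCNESS": -0.9,
    "gn_N:corpus_LOCNESS": 0.7,
}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1 / (1 + math.exp(-x))


def p_adverbial_fronting(s_length, gn, corpus, coef=COEF):
    """Predicted probability of a sentence-initial adverbial under the sketch model."""
    gn_n = 1 if gn == "N" else 0
    locness = 1 if corpus == "LOCNESS" else 0
    eta = (
        coef["intercept"]
        + coef["log_s_length"] * math.log(s_length)
        + coef["gn_N"] * gn_n
        + coef["corpus_LOCNESS"] * locness
        + coef["gn_N:corpus_LOCNESS"] * gn_n * locness  # interaction term
    )
    return inv_logit(eta)


for corpus in ("ICLE", "LOCNESS"):
    for gn in ("G", "N"):
        print(corpus, gn, round(p_adverbial_fronting(20, gn, corpus), 3))
```

The interaction term is what allows the effect of given versus new information to point in opposite directions in the two corpora, as in the right-hand effect plot.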
13.4 Conclusion and Outlook
The aim of this study was to provide a comprehensive overview of adverbial fronting by German learners as opposed to their American English peers. It also demonstrates the merit of considering both approaches to non-canonicity, namely the theory-based and the frequency-based approach, in relation to adverbial fronting phenomena. The distinction proved particularly relevant for the newly proposed approach to the investigation of adverbials, which takes account of the fact that these sentence constituents play an important syntactic, pragmatic, and information-structural role, irrespective of whether they are obligatory or optional. Against this backdrop, the chapter gives an insight into the learners’ use of a constituent that, due to its frequently flexible nature, is commonly overlooked in the context of non-canonical sentence patterns.
The corpus analysis was able to verify numerous findings that have been reported in previous research. First and foremost, and perhaps unsurprisingly, it is confirmed that preposing is a comparatively rare structure in both the native speaker and the learner data. Considering the fact that, if the pattern is used, it would commonly entail place references or negative or restrictive adverbials triggering inversion, the complex nature of the resulting construction may be difficult to master, particularly for learners of English. Other findings that match previous research results include the overrepresentation of (particular) connectors along with potential L1-transfer-related choices. From a structural perspective, it was also found that German learners strongly prefer single-word adverbials to phrases and clauses (which are more frequently employed by native speakers), presumably because they are easier to acquire and to retrieve. The only type for which this did not hold true was the circumstance adverbial, for which a clear preference for clauses (in LOCNESS) or phrases (in ICLE) could be determined, a fact which might be attributed to the element’s common use to provide additional and/or explicatory information or to express time and/or place references by means of short and easy-to-use noun or prepositional phrases (e.g., last Monday, in California) and dependent clauses (e.g., when I saw him). Another important finding of the study is the correlation between the likelihood of encountering a fronted adverbial and the sentence length and, with regard to the native speaker data, the newness of information. While especially the latter finding seems surprising at first sight, it may support the claim that some information-structural forces take precedence over others (Diessel 2005), although this is a claim that needs to be verified by further research.
Both the theoretical discussion and the findings, which attest this contrast in the employment of adverbials by native speakers and learners, underline the importance of considering adverbial fronting (as opposed to preposing) as part of the realm of non-canonical syntax. Naturally, however, with the consideration of a very limited number of explanatory variables, this study only provides a starting point for further research into the topic.
In addition, future studies should take account of other (information‑) structural variables (e.g., theme-rheme or focus structure, constituent weight), learner context variables, and possibly also additional (interlanguage) varieties (see Kircili forthcoming) to increase the predictive power of a model and thus to provide a more detailed account of the mechanisms underlying the choice of adverbial fronting phenomena.
14.1 Introduction
The question to what extent the linguistic features found in English as a Lingua Franca (ELF) interactions are recurring – or even systematic – and to what extent they are the result of accommodational processes remains at the heart of ELF research. This is, at least in part, due to the fact that ELF encounters are typically transient interactions between speakers of various first languages, at varying stages of proficiency in English, and from different linguacultural backgrounds. ELF interactions can occur between speakers who meet repeatedly and over an extended time period, but often they are only temporary and take place only once in a specific situation and within a specific constellation of people (so-called ‘Transient International Groups’, Pitzl 2018). This leads to a complex situation of language contact, in which speakers contribute their Individual Multilingual Repertoires (IMRs) to a common Multilingual Resource Pool (MRP), from which they can choose during the ongoing encounter (Pitzl 2016: 295–9; 2018: 33). In consequence, Pitzl (2018: 36–7) identifies two major methodological challenges for the study of ELF: (1) what are suitable models to capture the fleeting nature of ELF encounters, and (2) how can the fact that grammar is constantly being negotiated by language users be considered in research? Indeed, the concept of (non‑)canonicity is particularly relevant in ELF, since this is the context in which the notion of the ‘elusive target’ is central: ELF users may or may not subscribe to a (supra-)regional or external standard; they may or may not have received formal education in English; they may or may not have spent significant time abroad; etc. Hence, it is difficult to determine what a ‘canonical’ structure in a specific ELF context might look like.
In this chapter, we therefore draw on insights from the theory of emergent grammar to investigate (potentially) non-canonical language usage in ELF face-to-face interactions. This means that we regard conversational interaction as consisting not ‘of sentences generated by rules, but of the linear on-line assembly of familiar fragments’ and grammar as ‘emergent and epiphenomenal to the ongoing creation of new combinations of forms in interactive encounters’ (Hopper 2011: 26). As we show in the following, this usage- and experience-based view of grammar acknowledges the somewhat idiosyncratic nature of ELF encounters while explaining shifting feature frequencies in ELF. To that end, we conduct a qualitative and quantitative analysis of minus-plurals in Asian and European ELF to investigate how much speaker L1s and their corresponding language families as well as other extra- and intra-linguistic factors play a role in the decision-making of ELF speakers. The term ‘minus-plurals’ describes examples such as We had two beer-Ø instead of We had two beers and has been preferred over other terms as it represents an ideologically neutral term (see Rüdiger 2019: 48, and Section 14.3 for more details). Furthermore, we provide a discussion of what statistics can and cannot contribute to an analysis of ELF encounters, which, so far, have been predominantly analysed using qualitative approaches. The two main research questions we address in this chapter are the following:
(1) Are minus-plural constructions in Asian and European ELF encounters the result of typological pressure, that is, are they selected because of their dominance in the linguistic ecology of the speech situation (cf. Ansaldo 2009) and therefore the result of ‘cooperative restructuring’ (Thompson 2017: 209)?
(2) Which other extra-linguistic and intra-linguistic context factors have a significant influence (if any) on the use of minus-plurals?
While the first research question primarily scrutinises L1 transfer as an explanation for the use of minus-plurals, the second research question focuses on the age and gender of the interactants, the animacy of the head noun, and the presence of a quantifier as potential factors.
We use two comparable ELF corpora to analyse the data from a qualitative and quantitative perspective since, hitherto, few studies have combined a close reading of select examples with statistical methods in ELF contexts. The goal of the quantitative study is to identify potentially impactful variables for occurrences of minus-plurals. Ultimately, we seek to improve our understanding of how linguistic features in ELF are employed and to what extent they can be attributed to (socio-)linguistic factors, cooperative restructuring, or a combination of these factors. Furthermore, we address the potential of explaining feature emergence in ELF as based on regionally identifiable but also diffuse cores in which certain languages and language families dominate, leading to more frequent occurrences of specific contact features.
In Section 14.2, we establish the theoretical framework of communicative dynamism and cooperative restructuring in ELF and introduce and explain minus-plurals as the feature in focus of our study. We then provide detailed information on ACE and VOICE, our dataset, the predictor variables, and the applied statistical method in Section 14.3. In Section 14.4, we present and discuss the results of our study. Finally, in Section 14.5, we conclude the chapter and provide an outlook.
14.2 Grammar in ELF: Syntactic Borderlands?
14.2.1 Emergent Grammar in ELF
ELF interactions come in manifold forms, which makes it hard – or maybe even impossible – to define what ‘ELF grammar’ actually looks like. In consequence, recent research has moved away from regarding ELF as a stable code and rather conceptualises it as ‘a series of more or less demanding communicative situations where speakers come with whatever their language skills to tackle the communicative tasks at hand’ (Ranta 2017: 247). According to this view, ELF grammar is in a state of fast change, strongly depending on the linguistic backgrounds and proficiency levels of its speakers, who are engaging in specific interactional encounters, which can be temporary and clearly task-based (such as asking for directions) but might also be more stable and general (such as roommates communicating through a lingua franca) (see Mauranen 2018: 107–8). Of course, this does not imply that morphosyntactic aspects cannot be investigated in ELF contexts; however, it highlights the necessity of suitable methods and an awareness of the transient nature of ELF when working in the field of ELF grammar. As Pitzl highlights, interactants in ELF situations bring their own individual multilingual resource pools, consisting of ‘all the linguistic resources … [they] have at their disposal’ (2016: 298), but they also have access to the shared multilingual resource pool which emerges during the encounter itself and is constantly modified and adapted (2018: 34–6; see also Hülmbauer 2009: 325). In other words, ELF speakers tend to ‘bridge lingua-cultural boundaries, resulting in code-mixing and foreign language use, and the use of forms or form-function relationships which are non-codified’ (Osimk-Teasdale & Dorn 2016: 373).
Thus, rather than merely listing features which seem to diverge from native speaker ‘standard English’ – whatever this term is supposed to mean – recent research in ELF primarily focuses on the strategies interactants use to establish mutual understanding, that is, on how they ‘effectively employ language in communication by drawing on a set of resources and … strategically use them in interaction’ (Vettorel 2019: 184). This includes, for instance, accommodation processes in general but also very specific strategies like restructuring or reformulating on the side of the speaker and repair initiations, such as requests for clarification, on the side of the hearer(s) (e.g., Wong 2000: 247; Cogo 2009; Vettorel 2019). However, there has been considerably less research on the question of why a certain strategy is preferred in specific interactional contexts, and, in particular, larger quantitative investigations of this topic are still missing (but see Neumaier 2023a, 2023b, who looks at the interface of conversational patterns and grammar in Southeast Asian interactions).
Although World Englishes research typically focuses on long-term situations of language contact, the question of how grammatical conventions come into being through language use is central to its research agenda, and the methodological approaches it has developed might be fruitfully applied to ELF contexts. When describing the emergence of creoles, Mufwene lists several aspects which seem to influence whether a feature is likely to be selected from a common pool of multilingual repertoires, including its ‘statistical frequency, semantic transparency, regularity, salience, [and] social status of the model speakers’ (2008: 19). Mufwene’s concept of the linguistic feature pool strongly resembles the idea of a shared multilingual resource pool advocated in current ELF research (as discussed previously). In fact, as Thompson puts it, ‘in view of the highly diverse interaction environments that embed ELF, we might expect that not only the extent but the capacity for exploitation is also greater – the virtual language, or resource pool, is bigger – than it is in non-lingua-franca contexts’ (2017: 214). That this is the case has already been illustrated in previous studies on minus-features in ELF conversations, for instance, on copula usage by (South-)East Asian speakers (see Leuckert & Neumaier 2016).
Accepting that ELF grammar is inherently adaptive and in a process of constant restructuring means, of course, that any investigation of syntactic patterns must be able to take this fluid nature into account. How, indeed, can one analyse something which is influenced not only by various linguistic inputs but also by the context-specific contingencies of the very specific communicative situation, where the focus is – in many cases – not predominantly on grammatical correctness but rather on intelligibility? The concept of ‘emergent grammar’ (Hopper 2011) as well as the related idea of ‘probabilistic grammar’ put forward by Szmrecsanyi et al. (2016) have been proposed as ways to account for this complexity and will also form the underlying framework for the present study. These conceptualisations of grammar as something dynamic go together well with the framework developed by Pham et al. (2024), as they also regard non-canonical syntactic constructions as logical products of language usage and acknowledge their theoretical potential to develop into unmarked constructions in specific contexts. Emergent grammar has been described as ‘an alternative to the standard lexical-item-and-rules model of linguistic description’ (Hopper 2011: 23). In communication, speakers resort to their individual linguistic feature pool, which consists not only of individual items but also of formulaic constructions and fragments, and the application of grammatical rules is subject to ‘different degrees of adaptation to meet syntactic constraints and the requirements of context’ (Widdowson 1989: 135).
Importantly, this does not imply that grammar cannot be investigated systematically; on the contrary, it is regarded as essential to establish a basis of analysis. The concept of emergent grammar acknowledges, however, that
when the study of language is directed toward spoken conversational interactions, the relevant results of traditional linguistics are soon exhausted. It is understood that categories don’t exist in advance of the communicative setting. Instead, they are constantly being elaborated in and by communication itself. They are unfinished and indeterminate. … Emergent Grammar focuses on the boundaries of categories rather than their prototypes, exploring the leading edges and the territory around them as they move.
Although Hopper’s statement does not refer to interactions between non-native speakers per se, we suggest that the concept of emergent grammar is well suited to investigate the ‘syntactic borderlands’ researchers enter when dealing with ELF conversations.
Probabilistic grammar models embrace this usage-based approach but also ‘incorporate statistical regularities derived from experience [of the speakers, and] … associate these quantitative patterns not (only) with surface forms or lexical items … but with abstract features or constraints’ (Grafmiller et al. 2018: 3). This gives rise to three predictions:
(a) The influence of certain cognitive factors on quantitative syntactic variation in across [sic] different (sub)varieties of a given language should be relatively stable in terms of the directions of those factors’ influence.
(b) Subtle variation in the types and frequencies of constructions will lead to gradient, yet detectable differences in the strength of different factors’ influence on speakers’ syntactic choices.
(c) This variation in the use of specific constructions may be driven by stylistic preferences among registers or speakers, by situational forces such as language/dialect contact, by cognitive pressures related to language processing, or by normal dialectal drift.
To date, probabilistic approaches have typically been concerned with grammatical patterns in second-language varieties of English. Heller et al. (2017), for instance, compare the use of the genitive in Asian Englishes with British English. Recently, however, researchers have started to acknowledge the enormous potential of probabilistic models for the study of ELF (Deshors 2020), revealing that ‘ELF is not only a discourse-driven and fuzzy phenomenon’ but exhibits underlying – and statistically measurable – systematicity while at the same time ‘ELF users not only react passively to ongoing processes but also contribute to create patterns within the core grammar of English’ (Laitinen 2020: 440).
14.2.2 Minus-Plurals in ELF
In this chapter, we suggest using a quantitative approach to investigate some of the intra- and extra-linguistic factors influencing the use of a specific non-canonical syntactic construction, the lack of overt morphological plural marking in nouns, in ELF contexts. Examples (1) and (2) illustrate the phenomenon.
(1) there are so many ethnic (.) group
(2) handle five beers but i can- cannot handle three beers and two coffee
Thus, while in (1) the head of the noun phrase (NP), group, is clearly semantically marked for plural by the preceding quantifier many as well as the copula, it is not morphologically marked as such. Similarly, in (2) the head noun coffee also remains morphologically unmarked, even though it is modified by a numeral quantifier indicating plurality – and even though the speaker explicitly marks plural in the preceding NPs (five beers, three beers). In the description of the selected feature, we rely on Rüdiger’s (2019: 48) classification, illustrated in Table 14.1. The main reason for using the term ‘minus-plurals’ instead of a more common one is its relative neutrality, since other labels, such as ‘omission’, ‘lack’, or ‘underuse’, are, to a degree, associated with more prescriptive ideas (consciously or subconsciously). Our decision to focus on minus-plurals is directly connected to our research questions: minus-plurals are a well-known feature of many varieties of English (see, for instance, eWAVE feature 57) and may result from language contact or represent simplification tendencies. In research on Asian ELF, minus-plurals have been reported regularly (e.g., Thompson 2017; Ji 2016; Kirkpatrick 2010).

Table 14.1

| Superordinate term | Subordinate terms | Other labels | Example |
| --- | --- | --- | --- |
| Non-canonical use | Plus-plural | superfluous items, overuse | two childrens |
| Non-canonical use | Minus-plural | omission, lack, underuse | two coffee |
14.3 Data and Method
Following the theoretical contextualisation of ELF and minus-plurals in the previous section, we now introduce the dataset, consisting of segments of ACE and VOICE, as well as our methodology.
14.3.1 ACE and VOICE
In order to trace minus-plurals in spoken ELF conversations from different contexts, we analysed the files containing non-institutionalised, spontaneous language in the Asian Corpus of English (ACE) and the Vienna-Oxford International Corpus of English (VOICE). ACE and VOICE are each 1-million-word corpora of naturally occurring face-to-face interactions of ELF speakers, with VOICE predominantly featuring European and ACE featuring Asian speakers from countries that are part of the Association of Southeast Asian Nations (ASEAN)Footnote 1 as well as China, Japan, and South Korea. The corpora can be accessed freely via http://corpus.ied.edu.hk/ace/ and https://voice.acdh.oeaw.ac.at/, respectively. ACE has been modelled to match the compilation procedure and structure of VOICE, with both corpora containing conversations in educational contexts (ca. 25%), leisure contexts (ca. 10%), professional business (ca. 20%), professional organisation (ca. 35%), and professional research/science (ca. 10%). The dataset we used is summarised in Table 14.2, which also gives the figures for the tokens of overtly marked plurals and minus-plurals (see Section 14.3.2 for further details on token extraction and annotation). To reach roughly equal word counts, we extracted conversations from two interactional contexts in ACE (leisure and education) and one interactional context in VOICE (leisure).

Table 14.2

| | Asian Corpus of English (ACE) | Vienna-Oxford International Corpus of English (VOICE) |
| --- | --- | --- |
| Word count | ca. 76,000 words | ca. 72,000 words |
| Number of speakers | 36 | 63 |
| Overtly marked plurals | 1,029 | 917 |
| Minus-plurals | 256 (ca. 20% of all cases) | 31 (ca. 3% of all cases) |
It is important to note that we did not normalise frequencies since we focus on the relative proportions of minus-plurals to overtly marked plurals, which means that frequency normalisation would not provide further insights.
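The relative proportions referred to here can be recomputed directly from the token counts in Table 14.2. The following is only a minimal sketch of that arithmetic; the function and variable names are illustrative assumptions and not part of the original analysis.

```python
# Token counts as reported in Table 14.2 of this chapter.
COUNTS = {
    "ACE": {"marked": 1029, "minus": 256},
    "VOICE": {"marked": 917, "minus": 31},
}


def minus_plural_share(marked: int, minus: int) -> float:
    """Share of minus-plurals among all plural contexts (marked + minus)."""
    return minus / (marked + minus)


# Hypothetical helper loop: one share per corpus.
shares = {
    corpus: minus_plural_share(c["marked"], c["minus"])
    for corpus, c in COUNTS.items()
}

for corpus, share in shares.items():
    print(f"{corpus}: {share:.1%} minus-plurals")
```

Because each share is a ratio within one corpus, the slightly different corpus sizes (ca. 76,000 versus ca. 72,000 words) do not enter the calculation.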
ACE and VOICE are compiled with the goal of accessing natural ELF conversations, meaning that L1 speakers of English and speakers at all levels of proficiency are generally included. Consequently, it is important to stress that neither ACE nor VOICE is a corpus of learner English. As Osimk-Teasdale and Dorn point out in their paper on Part-of-Speech tagging in VOICE, the ‘goal [is] not to tag “errors” or to register degrees of systematicity or conformity in reference to established norms of conventional usage’ (2016: 374) but to find out to what extent variation in plural marking in ELF contexts can be explained by language contact and by relevant interactional strategies.
14.3.2 Token Retrieval and Annotation
After deciding on a dataset for analysis, we manually identified and tagged all relevant constructions by reading through the corpus files and tagging each regular plural noun that is morphologically marked in Standard English, that is, with the inflectional {-s}-ending. In order to be able to make sound statements about the relative frequency of minus-plurals, we tagged both overtly marked plural forms and minus-plurals. Focusing on variation in this way allows us to investigate ‘alternate ways of saying “the same” thing’ (Labov 1972: 188). We annotated all relevant tokens yielded by the process described above for the predictor variables listed in Table 14.3.

Table 14.3

| Predictor variable | Levels |
| --- | --- |
| Sociolinguistic variables | |
| Age group | young, mid, old, n/a |
| Gender | male, female, n/a |
| Speaker’s L1 | language code according to ISO 639-3 |
| Language family | see Table 14.4 |
| Intra-linguistic variables | |
| Quantifier in NP | quantifier, no quantifier |
| Animacy of the head noun in NP | animate, inanimate |
| Corpus | ACE, VOICE |
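As a rough illustration of the annotation scheme in Table 14.3, each tagged token could be stored as one record with a field per predictor variable. This is a hypothetical sketch: the class name, the field names, and the speaker metadata in the example are assumptions made for illustration and do not reproduce the authors' actual annotation files.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class PluralToken:
    """One annotated plural context (hypothetical record layout)."""

    noun: str
    marked: bool  # True = overt {-s} plural, False = minus-plural
    age_group: Literal["young", "mid", "old", "n/a"]
    gender: Literal["male", "female", "n/a"]
    speaker_l1: str  # language code as listed in Table 14.4
    language_family: str  # branch label as listed in Table 14.4
    quantifier: bool  # is there a quantifier in the NP?
    animate: bool  # is the head noun animate?
    corpus: Literal["ACE", "VOICE"]


# 'we are friend online' is cited in the chapter as an ACE example without a
# quantifier. The speaker metadata below is a placeholder, not the real speaker's.
token = PluralToken(
    noun="friend",
    marked=False,
    age_group="n/a",
    gender="n/a",
    speaker_l1="tgl",  # placeholder value
    language_family="M",  # placeholder value
    quantifier=False,
    animate=True,
    corpus="ACE",
)
print(token)
```

A flat record of this kind maps directly onto one row of the data frame that a tree-based model would take as input.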
We included four variables that can be described as sociolinguistic in nature. Age is potentially relevant because of apparent-time change (e.g., Chambers 2004) that may be visible in the data, that is, there might be differences between generations due to younger speakers adopting new features or features disappearing or changing over time.Footnote 2 Gender, in turn, is relevant because female speakers are known as innovators in language development, which means that minus-plurals could possibly be more frequent in their language use. Corpus-linguistic methods have been identified as highly useful to investigate gender in World Englishes contexts as they contribute to ‘obtain[ing] a sociolinguistically and empirically valid picture’ (Bernaisch 2021: 5).
Both speaker L1 and language family are crucial in that they allow us to discuss the potential impact of language contact on occurrences of minus-plurals. Terassa, for instance, found that ‘speakers of HKE [Hong Kong English] might omit the plural suffix because Cantonese does not inflectionally mark its nouns for plural’ (2017: 139), although she identified low rates of minus-plurals in Singapore English (SgE) and Indian English, whose substratum languages do have inflectional plural marking. Language families may also provide insight, since language areas can have significant effects in statistical modelling (see, for instance, Bentz 2018: 121).
We would like to point out that we used the language codes according to the ISO 639-3, which is organised by the Summer Institute of Linguistics (SIL). The SIL, which also publishes the Ethnologue, has been criticised for its missionary activities and at times questionable language naming practices, but the ISO language codes offer a tool to quickly refer to languages with clearly identifiable codes. The language families, the languages, and the language codes relevant for our data are listed in Table 14.4. We decided to use sub-groups for the Indo-European languages, since we otherwise would have had to exclude language family from the statistical analysis due to an extreme skew towards Indo-European (IE) in the dataset.

Table 14.4

| Language families (17 branches) | Languages and language codes, ISO 639-3 (31 languages) |
| --- | --- |
| Afro-Asiatic (Afr) | Maltese (mlt) |
| Albanian (A > IE) | Albanian (alb) |
| Lolo-Burmese (B) | Burmese (bur) |
| Dravidian (D) | Tamil (tam) |
| Finno-Ugric (F) | Finnish (fin) |
| Germanic (G > IE) | Danish (dan), Dutch (dut), English (eng), German (ger), Norwegian (no), Swedish (swe) |
| Indo-Aryan (In > IE) | Hindi (hin) |
| Indo-Iranian (Ir > IE) | Iranian (ira) |
| Japonic (J) | Japanese (jap) |
| Korean (K) | Korean (kor) |
| Malayo-Polynesian (M) | Cebuano (ceb), Filipino (fil), Indonesian Malay (ind), Tagalog (tgl) |
| Romance (R > IE) | Catalan (cat), Italian (ita), Spanish (spa) |
| Sinitic (S) | Cantonese/Yue (yue), Mandarin (cmn), Putonghua (put) |
| Slavic (Sla > IE) | Czech (cze), Polish (pol), Serbian (srp) |
| Tai (Ta) | Thai (tha) |
| Turkic (T) | Kyrgyz (kir) |
| Vietic (V) | Vietnamese (vie) |
In addition to the sociolinguistic variables, we considered the presence or absence of a quantifier in the NP and the animacy of the head noun in the NP as potential intra-linguistic predictors. The role of quantifiers in SgE in relation to plural inflection, for instance, ‘has been examined in a number of accounts … with contradictory findings’ (Terassa 2017: 119). We decided not to include usage frequency of the nouns as a variable (as Terassa 2017 does in her study on plural inflection in Asian Englishes), since our dataset was relatively small while still allowing for inferential modelling. Animacy has been included since ‘human nouns are more likely to have plural marking than non-human (especially inanimate) nouns’ (Haspelmath 2013 in WALS) across all languages and animacy may therefore be a relevant predictor.
The final predictor in our model was corpus, that is, if the token occurs in ACE or VOICE. This predictor was included to find out if this essential distinction causes any significant splits in the model and since the two corpora, while compiled with comparability in mind, are different in some fundamental ways. First, they have been compiled in different timeframes – VOICE data were recorded between 2001 and 2007, while data collection for ACE started in 2009, with the corpus being released in 2014. This means that the corpora do not represent language use in their respective regions at identical periods of time. Second, and more importantly, the dominant language ecologies differ in the corpora and, by extension, in the regions they were created in. These differences are accounted for to an extent by the language families and speaker L1s, but the corpora are better suited to represent Asia and Europe at large (although ELF is by nature multilingual and there are no clear-cut boundaries). Third, and again in spite of the intended comparability, the two corpora are two different datasets created by different teams and in different situations.
14.3.3 Statistical Approach: RePrInDT
We subjected the variables described in the previous section to a statistical analysis using conditional inference trees. Tree-based methods, such as conditional inference trees and random forests, have gained in popularity in recent years and their increasing use is symptomatic of what has been called the ‘quantitative turn’ in linguistics (see Kortmann 2021). Such methods ‘function by repeatedly splitting data sets up in two parts such that the split leads to the best increase in terms of classification accuracy or in terms of some other statistical criterion’ (Gries 2020: 618; see also Bernaisch et al. 2014; Lohmann 2013; Tagliamonte & Baayen 2012). Gries (2020) addresses the widespread conception that tree-based methods are ‘easy to interpret’ even though they involve various pitfalls. Key issues in this context are that ‘there can be patterns in data that make trees underperform considerably when it comes to accuracy, variable importance/parsimony, and effects interpretation’ (Gries 2020: 644). An advancement of the conditional inference trees available, for instance, as part of R’s partykit package (Hothorn & Zeileis 2015), is Weihs and Buschfeld’s (2021) repeated undersampling in PrInDT (RePrInDT), which is itself a further developed version of PrInDT. RePrInDT improves conditional inference trees in R by undersampling the larger class in the dependent variable (in our case: 15% of overtly marked plurals) in order to achieve better prediction rates in the trees (cf. Weihs & Buschfeld 2021: 5). This function is particularly useful in such cases as ours, in which one class is (much) bigger than the other and, as a result, prediction would always favour the larger class.
The script runs numerous trees and provides information on all results as well as the three trees with the highest accuracy. Furthermore, a big advantage is the in-built calculation and plotting of the balanced accuracy, that is, the accuracy for both the larger and smaller class in the dependent variable. We decided to work with RePrInDT since, as in Weihs and Buschfeld’s (2021) case study on zero and realised subject pronouns in SgE, there is a strong imbalance in our dataset due to the much higher frequencies of overtly marked plural forms. It is important to note that, at an earlier stage, we had included speaker as a variable in the tree but decided to discard it since it did not appear as an important predictor variable in any of the best-performing trees.
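The two ideas described above, undersampling the larger class and evaluating with balanced accuracy, can be sketched in a few lines. This is a simplified illustration in Python and not the RePrInDT implementation (which is written in R and fits a conditional inference tree to every sample); the function names are assumptions, and the tree-fitting step is deliberately left out.

```python
import random


def undersample(labels, majority, fraction, rng):
    """Indices of all minority items plus a random fraction of majority items."""
    major = [i for i, y in enumerate(labels) if y == majority]
    minor = [i for i, y in enumerate(labels) if y != majority]
    kept = rng.sample(major, round(fraction * len(major)))
    return sorted(minor + kept)


def balanced_accuracy(y_true, y_pred):
    """Average of the per-class accuracies, so both classes weigh equally."""
    per_class = []
    for c in sorted(set(y_true)):
        hits = [p == c for t, p in zip(y_true, y_pred) if t == c]
        per_class.append(sum(hits) / len(hits))
    return sum(per_class) / len(per_class)


# Pooled token counts from Table 14.2 (ACE + VOICE).
labels = ["plus"] * (1029 + 917) + ["minus"] * (256 + 31)

# A classifier that always predicts the larger class looks accurate overall
# but is no better than chance once both classes are weighted equally.
always_plus = ["plus"] * len(labels)
plain_accuracy = sum(t == p for t, p in zip(labels, always_plus)) / len(labels)
print(round(plain_accuracy, 2), balanced_accuracy(labels, always_plus))

# One undersampling draw with the 15% fraction mentioned in the text. A
# repeated procedure would redraw many times and fit a tree to each sample.
sample = undersample(labels, "plus", 0.15, random.Random(1))
print(len(sample))
```

With the pooled counts from Table 14.2, keeping 15% of the overtly marked plurals and all minus-plurals yields 579 tokens, which appears consistent with the 579 observations reported for the best tree in Section 14.4.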
14.4 Results: Minus-Plurals in ACE and VOICE
In this section, we first present the results of the statistical analysis based on RePrInDT before moving on to a qualitative analysis highlighting usage contexts of minus-plurals in ACE and VOICE.
The best acceptable conditional inference tree yielded by RePrInDT is built on 579 observations and has eight terminal nodes, as can be seen in Figure 14.1. Speaker L1 proves to have the strongest effect and is responsible for the first split in the tree. From this first node, the tree splits into two further branches, both of which select corpus as the strongest predictor, thus separating ACE from VOICE speakers. If further splits occur, speaker L1, language family, or age play a role. Somewhat surprisingly, the purely intra-linguistic factors animacy and quantifier in the NP do not appear as important predictors in the model.

Figure 14.1 Best acceptable conditional inference tree
Figure 14.1 Long description
Tree diagram with multiple nodes representing a decision-making process based on the variable speaker.L 1 with a significance level of p less than 0.001 at the top node, node 1. The tree branches into two main paths from node 1, evaluating different conditions related to the speaker's language background. As the tree progresses downward, each node is labeled with conditions, sample sizes n, and p-values, which provide insight into the significance of each decision point. Node 4, with n equals 56, node 5, with n equals 160, node 7, with n equals 21, node 8, with n equals 14, node 11, with n equals 69, node 12, with n equals 57, node 14, with n equals 21, and node 15, with n equals 181, are terminal nodes. These nodes are followed by bar charts that split into two segments: one representing plus and the other representing minus. The vertical bars display proportions, with values ranging from 0.0 to 1.0 on the y-axis.
A closer look at the left branches of the tree reveals that node 2, corpus, splits the group into Southeast Asian and European speakers of ELF.Footnote 3 For the ACE group, minus-plurals are most strongly predicted if the speakers’ L1 is Burmese, English, Thai, or Vietnamese (node 5). This finding is partly expected, as the majority of these languages do not code plural via suffixes but rather rely on numeral classifiers or reduplication (cf. Iwasaki & Ingkaphirom 2005, on Thai) instead – according to WALS (feature 55A), numeral classifiers are obligatory for plural marking in Thai, Vietnamese, and Burmese. English seems to be the odd one out in this group, but it should be kept in mind that several Southeast Asian speakers indicated English as their L1; hence, it is likely that English in ACE refers to a Southeast Asian variety rather than Standard American or British English. Overall, our finding thus corroborates the hypothesis that the use of minus-plurals in English might be influenced by transfer from L1 structures (cf. also Deterding 2007; Wee & Ansaldo 2004: 64).
Node 4 of the tree also indicates relatively high rates of minus-plurals for L1 speakers of Indonesian Malay and Cantonese/Yue. Again, this is unsurprising from a typological perspective, as plurality is not coded through suffixation in these languages (WALS feature 33): while Indonesian Malay uses reduplication for plural marking, plural is not morphologically marked in Cantonese/Yue. As before, the high frequency of minus-plural constructions thus suggests potential transfer from speakers’ L1s. Previous research on Chinese ELF already pointed at possible L1 transfer in more formal settings of ACE (Ji 2016), and our study provides evidence that this also seems to hold for informal speech situations. However, as mentioned above, we could not find evidence for a strong effect of an existing quantifier in the NP, which has been suggested to trigger overt inflectional plural marking in similar linguistic contexts, such as SgE (Alsagoff & Ho 1998: 144).
For VOICE speakers, we find very low rates of minus-plural for speakers with English as their L1 (node 8). The relatively high rates for L1 speakers of Albanian, Italian, or Korean (node 7) are somewhat unexpected, however, and require further analysis. Again, typological transfer from the L1 might play a role: although Korean has an inflectional plural marker, for instance, plural is typically only overtly marked for emphasis and not if number is inferable from the context. In both Italian and Albanian, plural marking often involves inflectional vowel changes in the stem rather than the addition of inflectional morphemes.
On the right side of the tree in Figure 14.1, it is again corpus which is responsible for a first split in the data (node 9). For VOICE speakers, language family then splits the tree into a branch with hardly any predicted minus-plurals – this is the group of Maltese (Afro-Asiatic) and Finnish (Finno-Ugric) speakers as well as speakers of Romance and Slavic languages (node 11). Some minus-plurals are predicted if Germanic and Turkic languages are involved (node 12). For the group of ACE speakers, age turned out to be a relevant predictor: minus-plurals are predicted much more often if speakers are younger, that is, not older than 30 (node 14). Still, as this group is very small and only involves 21 participants, this finding should not be overinterpreted.
Overall, the tree has an acceptable balanced accuracy of 0.75, with a slightly better predictive power for plural marking (0.80) than for minus-plural (0.71) constructions. Two variables did not seem to have a central effect, and both of them are intra-linguistic: the existence of a quantifier in the NP and the animacy of the noun. Our findings are confirmed when looking at the second-best tree yielded by RePrInDT. As Figure 14.2 shows, the tree exhibits the same splits – speaker L1 is at node 1, followed by corpus. Language family and age of the speakers also have an effect: for VOICE, we find more minus-plurals if the speakers have an L1 which is Albanian, Italian, or Korean; for ACE, Burmese, English, Cantonese/Yue, Thai, and Vietnamese seem to trigger more minus-plurals.

Figure 14.2 Second-best acceptable conditional inference tree
Figure 14.2 Long description
The tree diagram is composed of ovals connected by lines, representing a decision-making process based on probability values. The diagram is structured horizontally with nodes displaying descriptions and associated probability values. At the top, node 1 evaluates the speaker. L 1 with p equals 0.001, which splits into two branches. The left branch leads to node 2 labeled corpus with p equals 0.001, which further splits into node 3 evaluating speaker L 1 with p equals 0.01. Node 3 then leads to two results: node 4 representing English and node 5 representing Albanian, Italian, and Korean. Node 2 also branches into node 6, evaluating language-family with p equals 0.035, and node 8 representing M, each with their respective sub-nodes.
On the right side of node 1 is node 9, evaluating corpus with p less than 0.001. This node splits into node 10, evaluating language-family with p equals 0.048, and node 13, evaluating age with p less than 0.001, both with further splits. Each node is followed by a bar chart showing the distribution of plus and minus, with black and gray bars representing the proportion of results at each terminal node. The diagram provides a clear hierarchical structure of decisions based on probability values and statistical significance.
This pattern is repeated when we re-run the model. Taken together, the three best trees yielded by RePrInDT all show language family, speaker L1, corpus, and age as best predictors, while the effect of the language-internal variables animacy and quantifier in the NP seems to be small. The balanced accuracy of the three best trees is 0.75, and thus acceptable though not excellent. When the ensemble of all 1,001 trees is considered, this balanced accuracy decreases slightly to 0.73. The pattern identified in the three best trees is confirmed in the overall ensemble. The language-internal predictors animacy and quantifier in the NP play a slightly greater role when all trees yielded by the model are taken into account.Footnote 4
Overall, our model predicts well. The histogram in Figure 14.3 shows the balanced accuracies of all trees from undersampling. The bold line represents the median of all balanced accuracies, which lies at 0.72. In total, 500 trees yielded by our model come with a greater balanced accuracy than this median value.
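The figure of 500 trees above the median follows from how a median is defined for an odd number of values: with 1,001 distinct balanced accuracies, the median is the 501st value in sorted order, leaving 500 values above and 500 below it. The sketch below illustrates this with simulated numbers; the accuracies are invented for the demonstration and are not the study's results.

```python
import random
from statistics import median

rng = random.Random(0)

# Hypothetical balanced accuracies for 1,001 trees. rng.sample draws without
# replacement, so all simulated values are distinct.
accuracies = [v / 100000 for v in rng.sample(range(60000, 85000), 1001)]

med = median(accuracies)  # middle element of the sorted list for odd n
above = sum(a > med for a in accuracies)
below = sum(a < med for a in accuracies)
print(above, below)
```

The count of trees above the median therefore describes the number of trees in the ensemble rather than the quality of the model; the median value itself (0.72) is the informative figure.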

Figure 14.3 Balanced accuracies of all 1,001 trees
Although statistical modelling did not find animacy and quantifier in the NP to constitute major factors for predicting minus-plurals, a qualitative analysis still finds highly interesting patterns related to them. Generally, minus-plurals are found to occur primarily with inanimate head nouns, as in (3) or (4).
our our contract are yearly (.) contract are yearly
NPs with inanimate head nouns account for the vast majority (75%) of all minus-plural constructions. Animate head nouns tend to be marked for plural in both the Southeast Asian and the European dataset, even though minus-plurals also occur in this context,Footnote 6 as shown by (5).
but er many many tourist especially in autumn and er summer yes and in winter mhm not so much because it’s cooler and er there’s high water
Furthermore, 68% of all minus-plurals can be identified in NPs without quantifiers, as in (6), taken from ACE.
[first name13] is a person who used to work in er university of the Philippines er and er we er we are friend online
Roughly one third (32%) of all minus-plurals involve quantification of some kind, for instance via a numeral determiner, as in (7).
yeah there are two supervisor
Hence, although quantifiers seem to lead to overt inflectional plural marking in most cases, which supports Alsagoff and Ho’s (1998: 144) observation for SgE, there is still a considerable number of utterances in which quantification of the NP is combined with the use of a minus-plural construction. One possible explanation for this tendency might be the wish to avoid redundancy, which is often described as a typical feature of ELF interactions (e.g., Cogo & Dewey 2012: 89). [Footnote 7] That is, speakers might not see the need for additional inflectional plural marking in cases where plurality is already indicated through a quantifying expression.
14.5 Discussion and Conclusion
In this chapter we analysed minus-plurals in the ELF corpora ACE and VOICE. Our main goal was to identify the frequencies of minus-plurals in conversational contexts. If frequency differences between the two corpora could be found, we further wanted to investigate whether these might best be explained by focusing on typological influence from the speakers’ respective L1s, or whether they could be due to other processes, such as grammatical restructuring in ELF interactions. By closely examining typological structures involved in ELF encounters and including language families and individual L1s, we intended to acknowledge that ‘multilingualism rather than English is to be understood as the overarching framework within which ELF communication takes place’ (Jenkins 2015: 67).
The statistical analysis revealed that speaker L1 as well as language family seem to have an effect on plural marking in ACE and VOICE, which means that, at least from a statistical perspective, transfer of L1 structures is a possible explanation for occurrences of minus-plurals in ELF encounters. However, minus-plurals do not constitute a preferred choice in ACE and VOICE in general, since they are (still) strongly linked to specific L1s. It is important to emphasise again that there are notable frequency differences of minus-plurals between ACE and VOICE. Thus, while there is no ground for speaking of minus-plurals as a ‘preferred choice’ in VOICE, they might be on their way to becoming unmarked or even ‘canonical’ in Asian ELF – at least based on a frequency-based definition of canonicity. In cases where minus-plurals occur, they confirm that the ‘multilingual repertoires of ELF users, who are by definition at least bilingual, are an integral part of communication moves in general, and often used in close combination with communication strategies in the co-construction of meaning, and become part of a shared Multilingual Resource Pool’ (Vettorel 2019: 203). Many Asian contact languages in ACE employ little to no affixation to mark number, and L1 emerged as a highly predictive factor across all statistical models in our study.
Another important aspect to consider is the role of ‘emergent grammar’ that we outlined in Section 14.2. Hopper finds that utterances in spontaneous spoken language ‘do not conform to fixed prior grammatical forms. Instead, they conform to norms, which means they more or less conform to recognizable patterns’ (2011: 42). Two aspects related to this idea are relevant to our study. First, speakers in ACE are more likely to encounter minus-plurals in Asian Englishes than speakers in VOICE are in European varieties of English and, as we have pointed out for various languages in the previous section, many languages that are part of their multilingual resource pools do not overtly mark number. Thus, minus-plurals (or the general absence of morphological number marking) are indeed at least one of the norms ASEAN speakers encounter – in addition to the norms associated with the varieties that are closer to Standard English (such as Standard Singapore English) and formal classroom English. Second, the use of minus-plurals does not in any significant way hinder effective communication. If speakers are used to succeeding with the linguistic tools available to them, there is no need to resort to other strategies. Indeed, repeated usage of a linguistic feature or construction has important consequences both in the short and the long term (see Bybee 2006 on frequency effects). Thus, minus-plural marking, which might subconsciously be considered as effective as, or more effective than, overt plural marking, may become one of the norms that Asian ELF speakers base utterances on in subsequent interactions. In this context, it is important to acknowledge that many ELF speakers employ strategies of second-language acquisition. These are challenging to operationalise as part of statistical modelling, but ELF is a variety used by bi- or multilinguals who may consciously or subconsciously resort to second-language acquisition strategies in order to achieve their communicative goals.
In methodological terms, we hope to have shown that sophisticated statistics can meaningfully be applied to ELF data. However, we also believe that caution is required when interpreting the statistical analysis. ELF encounters are multifaceted and influenced by a range of factors, such as the other languages that are part of the larger as well as the conversation-specific language ecology, the speakers’ individual sociolinguistic profiles and their linguistic backgrounds, processes of linguistic accommodation, etc. In addition, the linguistic feature at hand and its properties, be they phonetic/phonological, lexical, or morphosyntactic, might present certain constraints that make it more or less dynamic and, hence, challenging to analyse. It is hardly possible to operationalise all of these factors as part of a statistical procedure; instead, a mixed-method approach involving both qualitative and quantitative analyses seems most promising. In the present chapter, we addressed this problem by a close reading of the corpus, enriching the statistical analysis with insights from language typology, and scrutinising selected examples.
Finally, we share the belief that World Englishes and ELF research are not only distantly related but, instead, crucially share their interest in multilingual contact scenarios. Specifically, evolving modes of communication in the digital realm (e.g., Leuckert 2020) but also in other contexts, such as international trade and tourism, lead to ever-increasing ELF encounters that frequently involve a range of varieties of English. The concept of ‘probabilistic indigenisation’ (Szmrecsanyi et al. 2016) has been proposed for L2 varieties of English. At least based on our findings, this concept also proves helpful in improving our understanding of ELF encounters by framing them as dynamic and flexible but not as random. This understanding of ELF also ties in with how canonicity is viewed in ELF contexts, with some factors playing bigger and others smaller roles in the process of feature selection. In order to advance our understanding of ELF grammar as probabilistic grammar, future studies involving additional datasets and additional linguistic features will be required.
There is little doubt that non-canonical syntax is omnipresent in English – it occurs along diachronic, diatopic, and diastratic axes. Thus, it is a prevalent phenomenon through time, through all regions where English is used, across all layers of society, and, in fact, across all situations of language use. This edited collection was conceptualised with the intention of bringing together linguists working on non-canonical syntax in English, but from a range of perspectives and with different research foci. In this synopsis, we summarise the insights that have emerged from the different contributions regarding concepts, approaches, and methods central to the study of non-canonical syntax.
All major grammars of English account for the ‘rules’ or ‘common’ structures of English syntax, but they also try to account for the many deviations from these rules. Some do so by describing what they perceive as theoretically possible (e.g., Jespersen 1909–1949) and others by considering frequency differences in actual data (e.g., Quirk et al. 1985; Biber et al. 2021). In the Introduction to this volume, we define the concepts ‘canonical’ and ‘non‑canonical’ as follows:
A ‘canonical’ syntactic construction [is] a default structure, that is, an ordering, composition, formal marking, or realisation of elements, which under general circumstances will be chosen with the highest likelihood by a speaker or writer unless there are good reasons for choosing a different syntactic structure. By contrast, a ‘non-canonical’ syntactic construction is defined as a deviation from the default ordering, composition, formal marking, or realisation of elements which in the production process is motivated by one or several factors.
What emerges from studies such as Dreschler’s or Pham’s is that, for a syntactic construction to be perceived as ‘canonical’ or ‘non-canonical’ by language users, the existence or non-existence of other structurally similar constructions at a given point in a language’s development may also be essential. For instance, while many types of inversion existed beside late subjects in Old and Middle English, subject-operator and subject-main verb inversion are the only types of inversion in Present-Day English (PDE), which, as argued by Dreschler, makes them more non-canonical. Similarly, the existence of a whole cluster of constructions involving clefting makes each of these constructions less non-canonical in evaluative discourse situations, as outlined by Pham.
At the same time, the existence of alternatives is key to understanding non-canonical syntax. Non-canonical syntactic constructions as such are deviations from a canonical variant and have specific functions – indeed, as Mycock and Glaas point out, they are inherently multi-functional: they foreground or background, have information-structural functions, mark topics, express the speaker’s commitment to the proposition, facilitate processing, etc. These functions particularly apply to deviations from an expectable standard and are therefore fulfilled especially by infrequent constructions. It is these functions that ensure the longevity of these non-canonical constructions in the face of their rarity, as suggested by Mycock and Glaas.
In their contribution, Leuckert and Rüdiger shift the focus to (non‑)canonicity in linguistic publications and show that ‘canonical’, ‘non-canonical’, and related terms are variously preferred or dispreferred across linguistic journals, and that boundaries between terms or groups of terms are often fuzzy. In order to set the stage and establish a terminological framework for the remainder of the volume, the Introduction outlines two dominant approaches to non-canonical syntax following our definition: a frequency-based and a theory-based approach, which generally map onto the two approaches described above for accounts of non-canonical syntax across different grammars. These two approaches are complementary, and one would not make sense without the other.
Based on the contributions to this volume, we can observe that some areas of linguistics have a natural affinity to either the frequency-based or the theory-based approach. For instance, in historical syntax, the theory-based approach to non-canonical syntax prevails, as outlined in Hundt’s introduction to Part I, due to the scarcity of data, at least for the earliest periods of English. In this regard, the contributions to Part I of our volume represent valuable exceptions, because all of them are corpus studies: Dreschler provides a diachronic study covering the earliest periods of English, Mycock and Glaas focus on Early Modern English, and Lange on Late Modern English. At the same time, however, these studies show that non-canonical syntax in the history of English indeed represents a ‘moving target’, as posited in the Introduction to this volume. This means that, from a historical perspective, constructions that are non-canonical in PDE may well have been canonical in earlier stages of English, or vice versa. In contrast to studies in historical linguistics, studies on PDE in the functional-linguistic paradigm analyse constructions in their co-text and context. Consequently, they naturally tend towards an empiricist and thus frequency-based approach. This is illustrated by the majority of studies in Parts II and III, which analyse non-canonical syntax in register-based and non-native varieties of PDE.
How much can we gain from a frequency-based or a theory-based approach to non-canonical syntax in the different areas? In some cases, the frequency-based and the theory-based analyses may amount to the same classification of a construction as non-canonical or canonical. ProTags, for instance, are additions to a syntactically and semantically complete structure, and, even when we limit our scope to colloquial spoken British English, they are infrequent. They are thus, as outlined by Mycock and Glaas, non-canonical according to both approaches, which makes them interesting not only from a historical, but also from a PDE perspective. In other cases, by contrast, a change in scope or perspective can challenge established frequency- or theory-based judgments of constructions as canonical or non-canonical: for example, as shown in the study by Pham, clefts are clearly a deviation from the Sn-V-X pattern, but if we regard them as part of an extended set of explicitly evaluative lexico-grammatical stance constructions, they are in fact a canonical means of expressing evaluation as far as the theory of evaluation is concerned, while still remaining infrequent and thus non-canonical from a quantitative perspective. Similarly, as outlined by Götz and Kircili, it-extraposition also represents a deviation from a minimally complete structure, but extraposed sentences are more frequent than non-extraposed sentences, challenging the theory-based classification as non-canonical. Günther and Biber et al. discuss similar cases. On the one hand, a theory-based approach to particle placement in phrasal verbs entails defining the discontinuous variant as non-canonical, while, according to a frequency-based approach, the discontinuous variant is canonical, at least in spoken PDE. On the other hand, Non-Canonical Reduced Structures are clearly a deviation from the basic SVX clause structure of English, but, as discussed by Biber et al., they occur much more frequently and are thus arguably canonical in certain types of TV news broadcasts. These examples show that considering (non‑)canonicity both from a theory- and a frequency-based approach can help the researcher dissociate themselves from and reconsider established evaluations of syntactic constructions.
Taking things one step further, the contribution by Neumaier and Leuckert considers syntactic phenomena in English as a Lingua Franca within the context of non-canonical syntax – which, to the best of our knowledge, has rarely been done before. Thus, the contribution showcases that the added conceptual layer of non-canonical syntax and its links to other relevant concepts, such as Hopper’s (2011) ‘emergent grammar’, can be a fruitful addition in areas where it has never or rarely been invoked before.
In line with the usage-based direction of the contributions not only in Parts II and III, but also in Part I, all studies employ empirical methodology to investigate non-canonical syntax. The focus on actual language use in the study of non-canonical English syntax, often involving analyses of information status, in fact makes the empirical approach imperative and, in most cases, even requires a corpus-based methodology to permit the inclusion of contextual information in the analyses. The only exception in this volume is Günther’s investigation with an experimental approach to the study of particle placement. Her study hints at the potential of bringing methodological diversity to the empirical study of non-canonical syntax: while corpus-linguistic methodology is certainly highly effective in this context, there is no doubt that other methods may produce useful results, and larger studies could benefit from triangulation.
Importantly, in line with efforts in recent years to include register as a variable in explaining variation (see, for instance, Bohmann 2019 and the journal Register Studies, Gray & Egbert 2019), we also need to acknowledge that register plays an important role in the study of non-canonical syntax. The contribution by Biber et al., for instance, shows that the unique register of broadcasting invites the use of Non-Canonical Reduced Structures. Götz and Kircili, in turn, investigate non-canonical syntax in South Asian newspapers. Newspapers typically represent a specific register with a range of distinct (pervasive and functional) linguistic features, and they contain sub-registers at different levels of formality. Furthermore, non-canonical syntactic constructions are also not limited to spoken or written language. Instead, they can be found at any point on the continuum between the two poles that Koch and Oesterreicher (1985/2012) call ‘conceptually spoken’ and ‘conceptually written’ language, including Computer-Mediated Communication (CMC) – as shown, for instance, in the chapter by Pham. Similarly, non-canonical syntax is not limited to first- or institutionalised second-language varieties, but may also be found in learner varieties – as hinted at in the introduction by Sharma to Part III and shown in the contribution by Kircili.
This volume has highlighted non-canonical syntax and its applicability across a range of case studies in historical linguistics, register-based varieties, and non-native varieties of English. Obviously, it is impossible to do the immense complexity and diversity of non-canonical syntactic constructions justice in the scope of one volume. This is why it is meaningful to consider some possible directions for future research in the field. First, as is so often the case, it would be highly relevant to add more cross-linguistic studies into the mix. While some contributions to this volume hint at or even overtly include language contact, unique contact scenarios continually emerge, and language contact constantly takes place in the digital sphere. There has been important work on non-canonical syntax in other languages, but comparative work systematically taking into account the definitions and concepts outlined in this volume has the potential to advance what we know about non-canonical syntax. Second, as already hinted at, the study of non-canonical syntax in CMC with its broad spectrum of text types and registers still has enormous potential. This is due to the breadth of factors that go into CMC (which, admittedly, also lead to methodological challenges). Third, the methodological toolkit can and should be expanded. Important work, for instance by Levshina et al. (2023), has already shown significant advancements in how word order variation can be investigated. With corpus linguistics and research on Natural Language Processing developing at rapid rates, improved automated annotation of non-canonical syntactic constructions may become possible in the near future and would help researchers deal with big data for phenomena that, thus far, have required (semi-)manual identification. Fourth, as briefly addressed in the introduction by Dorgeloh and Wanner to Part II, AI and machine translation also represent important phenomena worthy of investigation. For instance, it might be interesting to look into how non-canonical syntax is used in AI-generated texts and how tools used in machine translation deal with non-canonical syntax. These questions arise because non-canonical syntax is typically defined not only by formal characteristics, but also by the discourse-pragmatic meaning it conveys, which, on the one hand, is dependent on the respective co- and context and, on the other hand, may not map 1:1 onto another language.
Whatever the answers to these questions may be, language will keep evolving. Syntactic (non‑)canonicity is thus necessarily dynamic, which is why it has remained one of the most fascinating topics in structural linguistics and will certainly continue to puzzle linguists.