1 Introduction
Vowel deletion is common in the Mayan language family, attested in languages such as Mam, Tektitek, Ixil, Uspantek, Tz’utujil, Sakapultek, Tseltal, Yucatec Maya, Tojolab’al, Mocho’, Huastec, and K’iche’ (Bennett 2016; Sapón et al. 2000). In these languages, deletion is typically restricted to short vowels, and is especially common for vowels that are unstressed, non-initial, and non-final (Bennett 2016; López Ixcoy 1994; Sapón et al. 2000). Vowel deletion is particularly prevalent in the Chichicastenango dialect of K’iche’, where it has a marked effect on both production and comprehension of the language. Building on the observations of Campbell (1977), López Ixcoy (1994) and England & Baird (2017), Wood (2020) describes deletion in content words as predictable based on vowel quality, syllable shape, syllable position, and stress, and unaffected by phrasal context. In function words, in contrast, deletion is practically unstudied and appears to be highly variable.
The Chichicastenango dialect is a stigmatized dialect of K’iche’. Speakers are typically aware of how people from other towns speak the language and often feel pressured to use these prescriptively correct forms when asked how their language is spoken. Therefore, a complete understanding of where vowels are deleted in Chichicastenango K’iche’ requires the use of data from speakers focused on the content rather than the form of their words. This article presents the vowel deletion patterns observed in content words and function words in a corpus of Chichicastenango K’iche’ spontaneous speech. The study provides support for many of the generalizations made in the previous literature, including the effects of vowel quality, onset, coda, and stress, and the observation that deletion in function words is more variable than in content words and is affected by phrasal context. However, the results also provide some novel information about the language, showing that vowels are deleted in some apparently final syllables and demonstrating differences in the deletion pattern at three levels of prosodic structure (stress domain of content words, extrametrical morphemes in content words, and function words).
The structure of the paper is as follows. Section 2 provides background on the language, its sounds, and past research on vowel deletion. Section 3 details the methods used for the corpus study. Section 4 describes the results. Section 5 discusses some of the main results of the study, including the differences observed between content words and function words and how this data informs understanding of prosodic structure in the language. Section 6 concludes the article.
2 Background
K’iche’ (ISO 639-3: quc) is a Mayan language belonging to the Eastern branch of the family spoken by around one million people in the highland area of Guatemala (Campbell 2017; Instituto Nacional de Estadística 2019). The language is characterized by a relatively high degree of dialectal variation at various levels of the grammar (Sapón et al. 2000; Romero 2009, 2017).
An official K’iche’ orthography was standardized by the Academia de Lenguas Mayas de Guatemala in 1987, by Decree 1046-87 (López Ixcoy 1997). This orthography indicates tense vowels with plain vowel symbols and lax vowels with diereses, and is therefore appropriate for Chichicastenango K’iche’, which has this phonological contrast, as discussed further below (Wood 2020). Most other works on K’iche’ use a modified orthography more suited to the vowel length contrast found in most other K’iche’ dialects.
The consonant inventory of K’iche’ is shown in Table 1. Where different from the IPA, the orthographic symbol is included in < >.
Table 1. Consonant inventory of K’iche’ with orthographic symbols in < >

Stops and affricates contrast a plain series with a series referred to as ‘glottalized’ in the Mayanist literature (Bennett 2016). In Chichicastenango K’iche’, like other Mayan languages, most of the glottalized consonants are ejective, but the bilabial is usually realized as a voiced implosive. The bilabial and uvular glottalized stops are often reduced to a glottal stop in casual speech. Whatever their realization, glottalized consonants cause glottalized phonation on adjacent vowels (Wood 2023).
Most K’iche’ dialects have vowels at five places of articulation /i e u o a/, with a phonemic length contrast (Sapón et al. 2000). In Chichicastenango K’iche’, however, the length contrast has been replaced by a contrast in quality, with historic long vowels appearing as ‘tense’ /i e u o a/ and historic short vowels as ‘lax’ /ɪ ɛ ʊ ɔ ə/ (<ï ë ü ö ä>) (Wood 2020). The vowels of Chichicastenango K’iche’ are shown in Figure 1.

Figure 1. Realizations of the 10 vowel phonemes in Chichicastenango K’iche’ in a controlled speech production task (Wood 2025a). Tense vowels are in dark gray and lax vowels in light gray. Ellipses enclose about one standard deviation.
This graph shows the realizations of the tense and lax vowel phonemes of Chichicastenango K’iche’ as produced by eleven speakers (three male, eight female) in a controlled speech production task (Wood 2025a). The data was collected through a translation task from Spanish and it includes only stressed monosyllabic words. The formant values were normalized with the Lobanov method (z-scores, Lobanov 1971) to permit comparisons across speakers. The large size of the /I/ ellipse is due to the fact that this phoneme is realized as [i], [e], [ɛ], or [ə], depending on the item and sometimes the speaker. This vowel appears to be merging phonetically with surrounding categories; however, it retains distinct phonological behavior and patterns as a lax vowel irrespective of its phonetic realization.
Primary stress is word-final in non-verbs in Chichicastenango K’iche’, with the exclusion of two extrametrical suffixes: the superlative /-ələχ/ and the suffix /-Ik/ which appears on some adjectives when in phrase-final position (for more information on phrase-final suffixes in K’iche’, see Henderson 2012; Royer 2021; Wood 2025b). Stress is non-final in some unadapted Spanish loanwords, which maintain the position of stress in the original Spanish and are common in everyday speech. Examples of stress in non-verbs are shown in Table 2. Here, and throughout the article, angle brackets < > enclose extrametrical morphemes. The audio files for these and all other examples cited throughout the paper are included in the supplementary materials. Each example also appears with a code to the source recording – many of which are publicly available in the Archive of the Indigenous Languages of Latin America (K’iche’ collection of Elizabeth Wood) – and timestamp. More details about the data sources can be found in the Appendix.
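The Lobanov normalization mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used for Figure 1: the function name and the sample values are hypothetical, and the choice of the population standard deviation is an assumption (some implementations use the sample standard deviation).

```python
from statistics import mean, pstdev


def lobanov_normalize(formant_hz):
    """Return z-scores of one speaker's values for a single formant."""
    mu = mean(formant_hz)
    # Assumption: population SD; the sample SD is an equally common choice.
    sigma = pstdev(formant_hz)
    if sigma == 0:
        raise ValueError("formant values must vary to be normalized")
    return [(f - mu) / sigma for f in formant_hz]


# Hypothetical F1 values (Hz) for one speaker, not taken from the study.
speaker_f1 = [300.0, 400.0, 500.0, 600.0, 700.0]
normalized_f1 = lobanov_normalize(speaker_f1)
```

Each speaker and each formant would be normalized separately, so that the resulting values express position relative to that speaker's own vowel space.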
Table 2. Stress in non-verbs, with extrametrical morphemes in angle brackets < >

In verbs, stress is sensitive to vowel quality and syllable shape. Stress falls on the leftmost syllable containing either a tense vowel or a coda, or if there are none, on the final syllable. The stress domain excludes most inflectional morphemes (person, aspect, and incorporated movement prefixes, and status suffixes, a set of suffixes which indicate the mood, transitivity, and phrase position of the verb). Examples of stress in verbs are shown in Table 3.
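The verbal stress rule just described can be sketched in a few lines. This is an illustrative reading of the rule, assuming the extrametrical morphemes have already been stripped; the `Syllable` type, the function name, and the schematic syllable sequences are hypothetical and are not attested forms.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    tense: bool      # True if the vowel is tense, False if lax
    has_coda: bool   # True if the syllable is closed


def verb_stress_index(stress_domain):
    """Index of the stressed syllable within the stress domain of a verb."""
    if not stress_domain:
        raise ValueError("stress domain must contain at least one syllable")
    for i, syl in enumerate(stress_domain):
        if syl.tense or syl.has_coda:
            return i                   # leftmost tense-vowel or closed syllable
    return len(stress_domain) - 1      # no such syllable: stress the final one


# Schematic syllable sequences (hypothetical, for illustration only).
all_lax_open = [Syllable(False, False), Syllable(False, False)]
closed_second = [Syllable(False, False), Syllable(False, True), Syllable(True, False)]
```

Under this sketch, `all_lax_open` receives final stress, while `closed_second` is stressed on its second syllable because the closed syllable precedes the tense one.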
Table 3. Stress in verbs, with extrametrical morphemes in angle brackets < >

Campbell (1977) observes that all short (i.e., lax) vowels in Chichicastenango K’iche’ are deleted when preceded by a consonant and followed by a stressed syllable, and provides a number of examples, all of which are disyllabic nouns. López Ixcoy (1994) notes that deletion is common but tends to affect non-final vowels, and England & Baird (2017) that it affects many unstressed vowels. Wood (2020) brings these descriptions together with original data taken from monologues and elicited materials and concludes that lax vowels are deleted in content words when in a non-final CV syllable adjacent to the stressed syllable. Tense vowels, stressed vowels, and those in onsetless, closed, or word-final syllables are never deleted. Lax vowels in unstressed, non-final CV syllables not adjacent to the stressed syllable are sometimes deleted, but less regularly than those in syllables adjacent to the stressed syllable. Very little previous research exists on the deletion of vowels in function words in Chichicastenango K’iche’. Wood (2020) observes that vowels may be deleted in function words in contexts that would be prevented in content words, such as word-final or closed syllables. These deletions sometimes appear to be regular, but in other cases appear to be optional, with the same vowel appearing variably in the same word and similar phrasal contexts.
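The content-word generalization attributed to Wood (2020) above can be restated as a small decision function. This is only a sketch of the prose description: the function name and the three labels ('never', 'regular', 'variable') are hypothetical shorthand, not terms from the earlier literature.

```python
def expected_deletion(tense, stressed, has_onset, has_coda,
                      word_final, adjacent_to_stress):
    """Classify one vowel of a content word as 'never', 'regular', or 'variable'."""
    if tense or stressed:
        return "never"
    if not has_onset or has_coda or word_final:
        return "never"
    # Remaining case: a lax vowel in an unstressed, non-final CV syllable.
    return "regular" if adjacent_to_stress else "variable"


# A lax vowel in a non-final CV syllable next to the stressed syllable.
pretonic = expected_deletion(tense=False, stressed=False, has_onset=True,
                             has_coda=False, word_final=False,
                             adjacent_to_stress=True)
```

The corpus study reported below tests how well this categorical description holds in spontaneous speech.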
This article addresses vowel deletion in Chichicastenango K’iche’ in a corpus of spontaneous speech. The primary research question is what conditions vowel deletion in the language. Specifically, the study explores (i) whether the previously identified factors of vowel quality, syllable structure, syllable position and stress are relevant, (ii) how predictable vowel deletion is according to these factors, and (iii) whether there are differences between content words and function words.
3 Methods
3.1 Data
The data used for this study comes from a corpus of spontaneous, monologic speech recorded in Chichicastenango in 2018–2019. The corpus comprises a total of 2 hours and 38 minutes of audio by 11 native speakers. It includes speakers of a range of ages (young adults to elders), genders (two men and nine women), and locations within the Chichicastenango area (city center and surrounding rural communities), and thus constitutes a broad sample of speakers from this dialect area. The monologues cover topics such as local history and traditions, traditional stories, personal anecdotes, instructions, and recipes. Speakers were asked in very general terms to talk about a particular topic, such as explaining how to prepare a local dish or relating something that happened to them in the past, and were allowed to speak freely for as long as they wished, which ranged from 1 to 15 minutes. These recordings were then transcribed by the author with help from native speakers. The recordings were made using a Zoom H4n digital recorder at a sampling rate of 44.1 kHz, using either the internal microphone or (in most cases) a connected Shure SM10A headset microphone. Recordings for which permission was granted by the speaker are publicly available in the Archive of the Indigenous Languages of Latin America (K’iche’ Collection of Elizabeth Wood) along with transcriptions and translations. The unarchived recordings are used for academic purposes with permission from the speakers.
For each morpheme, the underlying form was determined primarily based on comparison of any surface forms produced in the corpus as well as other Chichicastenango K’iche’ materials collected by the author. If a vowel occurred in at least one instance of that morpheme, it was included as an underlying vowel in all instances of that morpheme. Additionally, vowels known to be present in the historical form of the language and preserved in other K’iche’ dialects where deletion is much less prevalent (Larsen 1988; Ajpacajá Tum 2001; Ajpacajá Tum et al. 2005; Pixabaj & Angelina 2015) were assumed to be present underlyingly in Chichicastenango K’iche’. These two systems for identifying underlying vowels were consistent with each other.
For the purposes of this study, content words include nouns, verbs, adjectives, and adverbs. All other word types are considered to be function words, including prepositions, complementizers, determiners, pronouns, directional particles, tense–aspect–mood particles, negation markers, existential particles, and nominal modifiers. Relational nouns – a set of words in Mayan languages that are formally nominal, receiving nominal morphology such as possessive prefixes, but have functional uses (Polian 2017) – were categorized as content words. Relational nouns express meanings such as location, time, purpose, cause, and other circumstantial meanings. Temporal and locative relational nouns combine with prepositions, while others do not. Two examples of relational nouns are shown in (1)–(2).


(1) shows the relational noun -ümal ‘because of’, which appears with the 3rd person plural possessive prefix k-. (2) shows the locative relational noun -wï’ ‘on top of’ (literally, ‘hair’), which combines with the preposition pä and the 3rd person singular possessive prefix u-.
Phonologically unadapted Spanish borrowings, whether content words (as in Table 2) or function words (e.g. si [si] ‘if’, porque [ˈpoɾ.ke] ‘because’), were excluded from the study. It was observed during the course of the work on the language that deletion does not tend to occur in these words, which may reflect the form of the word in the original Spanish rather than the vowel deletion pattern of Chichicastenango K’iche’. Words that are composed of only a single vowel in the phonemic form, such as the plural marker e /e/, were also excluded, as it is not possible to know if such a word is present if its sole vowel is deleted. Instances where the word is repeated multiple times as a speech error, and instances where the speaker does not complete a grammatical sentence and instead immediately restarts with a new sentence after the word in question, were also excluded. Finally, in some cases there were gaps or uncertainties in the transcriptions, and no words from these portions were included to avoid categorization errors.
Due to time considerations and the greater number of vowels in content words as compared to function words, the content word data was taken from a subset of the corpus composed of ten recordings produced by eight different speakers, adding up to one hour. This data includes 6,967 underlying vowels (present or deleted) in 3,411 words, including 891 distinct lexical items.
The function word data was taken from the entire corpus. Words with fewer than ten tokens were excluded. For words with more than 500 tokens, a randomized subset was included: every third instance of the determiner rï /ɾI/ (1,728 total tokens, 576 included) and every other instance of the pronoun rï’ /ɾIʔ/ (1,072 total tokens, 536 included). For the remaining function words, every instance of the word present in the corpus was included in the study. This led to a total of 6,995 function words included in the study with 7,480 underlying vowels.
The function words that were included in the study are summarized in Table 4. ‘Phrase-final’ refers to the form of a word found only at the end of a phrase, and ‘phrase-medial’ to the form of the word found elsewhere (see Henderson 2012 for more information on function words which vary by phrase position in K’iche’). ‘Pre-C’ refers to the form of a word found before a word beginning with a consonant and ‘pre-V’ to the form of the word found before a word beginning with a vowel.
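The subsampling arithmetic for the two high-frequency words can be checked with a short sketch. The helper names are hypothetical, the token lists are placeholder IDs rather than corpus data, and the starting point of the every-nth selection is an assumption, since the text does not say which instance was counted first.

```python
MIN_TOKENS = 10  # words with fewer tokens than this were excluded


def eligible(token_count):
    """True if a function word has enough tokens to enter the study."""
    return token_count >= MIN_TOKENS


def every_nth(tokens, n):
    """Keep every n-th token, counting from the n-th one (assumed starting point)."""
    return tokens[n - 1::n]


determiner_tokens = list(range(1728))  # placeholder IDs for the tokens of the determiner
pronoun_tokens = list(range(1072))     # placeholder IDs for the tokens of the pronoun

determiner_sample = every_nth(determiner_tokens, 3)  # every third instance
pronoun_sample = every_nth(pronoun_tokens, 2)        # every other instance
```

With these inputs the two samples have 576 and 536 members, matching the counts reported above.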
Table 4. Function words included in the study

3.2 Categorization
Each potential vowel was classified according to vowel presence (dependent variable) as well as the factors vowel quality, coda, onset, stress position, syllable position, domain, item and speaker (independent variables).
Vowel presence: Depending on the segmental context, the following visual metrics were used in Praat (Boersma & Weenink 2023) to determine if a vowel was present or deleted. For underlying vowels between voiceless segments, any indication of voicing (voicing bar, periodicity) was sufficient to categorize a vowel as present. Examples are shown in Figure 2.

Figure 2. Present vs. deleted vowels between voiceless segments. Left: [pə k], right: [p s].
On the left of Figure 2 is shown the preposition pä /pə/ and the first consonant of the following word külew /kʊlew/ ‘their land’ in the phrase pä külew ‘on their land’ (planting, 2:46). There is clear voicing after the release of the /p/ before the closure for the /k/, and the vowel in pä was categorized as present. On the right of the figure is shown the same preposition pä /pə/ followed by the first consonant of the word sü’t /sʊʔt/ ‘cloth’ in the phrase pä sü’t ‘in cloths’ (planting, 13:04). There is no voicing between the /p/ and the following /s/. The frication immediately follows the stop burst. In this case, the underlying vowel in pä was categorized as deleted.
For underlying vowels adjacent to a voiced consonant (implosive, nasal, glide or liquid), a vowel was determined to be present or deleted based on the number of segments visible in the voiced portion. The vowel was determined to be present if the voiced portion showed a relatively abrupt change in formants or intensity marking a boundary between two segments. If there was only one segment in the voiced portion, the vowel was categorized as deleted. Examples are shown in Figure 3.

Figure 3. Present vs. deleted vowels after a voiced segment. Left: [na k], right: [n k].
On the left of Figure 3 is the word na /na/ ‘still’ followed by the first consonant /k/ of komo /komo/ ‘since’ in the speech fragment köjiwï’j na komo /kɔχiwIʔχ na komo/ ‘you wait for us still since…’ (marriage, 1:31). There is a sharp boundary in both the formant structure and the intensity curve between the initial nasal [n] and the following vowel [a]. This vowel was categorized as present. On the right of the figure is the same word na followed by the first consonant [k] of kib’ /kiɓ/ ‘themselves’ in the phrase kikï’j na kib’ /kikIʔχ na kiɓ/ ‘they still wait for each other’ (marriage, 4:13). Here the entire voiced portion forms one segment with no sharp internal boundaries. The whole portion is of low intensity and displays antiformants. The underlying vowel in this word was categorized as deleted in this token.
For underlying vowels adjacent to another vowel, a vowel was determined to be present if the vowel portion showed a change in the formants corresponding to the two expected vowels, an abrupt shift in intensity marking a boundary between two vowels, or evidence of glottalization in the middle of the vowel portion marking a boundary between two vowels. When only one vowel segment was present, which of the vowels was deleted was determined based on the formants. When the two underlying vowels were identical in quality and there was only one vowel segment produced, the first underlying vowel was categorized as deleted and the second as present, since it is always the first vowel that is deleted when there is a change in quality.
Examples are shown in Figure 4.

Figure 4. Present vs. deleted vowel preceding another vowel. Left: [ɓI uf], right: [ɓ us].
On the left is shown the directional particle b’ï /ɓI/ followed by the first two segments of the word ufideo /ufideo/ ‘its noodles’ in the phrase käqäya chï b’ï ufideo /kəqəja t͡ʃI ɓI ufideo/ ‘we add its noodles’ (caldores, 0:44). During the vowel portion, F2 clearly shifts from a high position to a lower position (front vowel to back vowel). The vowel /I/ in the directional was categorized as present in this token. On the right is shown the same directional particle followed by the first two segments of the word usopa /usopa/ ‘its broth’ in the phrase käqäya b’ï usopa /kəqəja ɓI usopa/ ‘we add its broth’ (caldores, 0:47). Here the vowel portion forms one segment with no shift in formants or intensity. F2 is low, indicating a back vowel. The underlying vowel in the directional particle was categorized as deleted in this token.
In total there were 11,499 present vowels and 2,948 deleted vowels in the data.
Vowel quality: Each underlying vowel was categorized as either tense (/a e i o u/) or lax (/ə ɛ I ɔ ʊ/). In most cases the phonemic identity of the vowel in a given morpheme is easily determined perceptually. However, for function words this was not always a trivial task, as the contrast between tense and lax vowel pairs can be difficult to distinguish when reduced. In this study the phonemic vowel was designated with reference to the perception of the sound over many different tokens and by comparing its formant structure to those of tense and lax vowels of the same type produced by the same speaker in the surrounding context. When this was ambiguous (both tense-like and lax-like perceptions and formant comparisons were frequent in the data for a given morpheme), the designation was made through comparison to the form found in other dialects and in the historical form of the language. For example, the vowel in the complementizer chï /t͡ʃI/ varies perceptually between tense-like and lax-like in Chichicastenango, but it corresponds to a short vowel in other K’iche’ dialects (Ajpacajá Tum 2001; Ajpacajá Tum et al. 2005) and was categorized as lax. Meanwhile the vowel in the exclusive particle xa /ʃa/ ‘only’ also varies perceptually between tense-like and lax-like in Chichicastenango, but corresponds to a long vowel in other K’iche’ dialects (Ajpacajá Tum 2001; Larsen 1988) and was categorized as tense. In total, there were 8,808 lax vowels and 5,639 tense vowels in the data.
Coda and onset: Each vowel was classified according to whether its syllable has an onset and whether its syllable has a coda in the underlying form, with syllabification at the level of the morphological word (including affixes but excluding clitics). Any word-initial consonants were assumed to be the onset of the first vowel, and any word-final consonants the coda of the last vowel. A single intervocalic consonant was assumed to be the onset of the second vowel. If there were two intervocalic consonants, the first was assumed to be the coda of the first vowel and the second the onset of the second vowel. In total, the data includes 5,809 vowels in closed syllables and 8,638 vowels in open syllables. There are 13,469 vowels in syllables with onsets and 978 vowels in onsetless syllables.
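The syllabification rules above can be sketched over a consonant/vowel skeleton. This is an illustrative implementation, not the procedure used in the study: the function name and the skeleton notation are hypothetical, and the treatment of clusters of three or more intervocalic consonants (only the last consonant is an onset) is an assumption, since the text covers only one- and two-consonant sequences.

```python
def onset_coda_flags(skeleton):
    """Return (has_onset, has_coda) for each V in a C/V skeleton such as 'CVCCVC'."""
    if set(skeleton) - {"C", "V"}:
        raise ValueError("skeleton may contain only 'C' and 'V'")
    vowels = [i for i, seg in enumerate(skeleton) if seg == "V"]
    flags = []
    for k, pos in enumerate(vowels):
        prev_vowel = vowels[k - 1] if k > 0 else -1
        is_last = k == len(vowels) - 1
        next_vowel = len(skeleton) if is_last else vowels[k + 1]
        before = pos - prev_vowel - 1  # consonants between previous vowel (or word edge) and this vowel
        after = next_vowel - pos - 1   # consonants between this vowel and next vowel (or word edge)
        has_onset = before >= 1        # the consonant immediately preceding the vowel is its onset
        # Word-finally any consonant is a coda; medially a single consonant
        # belongs to the next syllable, so a coda requires at least two.
        has_coda = after >= 1 if is_last else after >= 2
        flags.append((has_onset, has_coda))
    return flags


cvcvc = onset_coda_flags("CVCVC")  # single intervocalic consonant
vccv = onset_coda_flags("VCCV")    # two intervocalic consonants
```

The same function could be applied at the phrase or utterance level by concatenating the skeletons of adjacent words before calling it.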
Stress position: Each vowel in a content word was classified as stressed or unstressed following the pattern described in Section 2 of this article. Unstressed vowels were classified by the distance from the stressed syllable (adjacent, distant) and the direction (preceding or following).
For function words, determining whether a given syllable is stressed is more complicated. Little research exists on the prosody of function words and the interaction between word-level stress and intonation in K’iche’. Some previous works on K’iche’ have argued that some function words bear stress while others are unstressed. Henderson (2012) argues that the ‘phrase-final’ forms of directionals and other particles are stressed and the ‘phrase-medial’ forms unstressed.
For the purposes of this study, a given word was classified as stressed if it is: (a) the phrase-final form of one of these alternating words, (b) a pronoun, or (c) a word routinely perceived as prominent (/χeʔ/, /tɛʔ/, /aɾe/, /k'ɔ/). In a stressed multisyllabic word, the final syllable was classified as stressed, excluding the extrametrical suffixes /-Ik/ and /-ɔq/ – this is consistent with the stress pattern of content words as well as perceptual judgments of prominence. All other function words were classified as unstressed, including (a) the phrase-medial forms of alternating words (except for the existential /k'ɔ/, typically perceived as prominent), (b) determiners, (c) complementizers, (d) prepositions, and (e) other words usually perceived as non-prominent.
In total, there are 6,655 stressed vowels and 7,792 unstressed vowels in the data.
Syllable position: Each vowel was classified as final or non-final according to the position of its syllable within the morphological word (including affixes but excluding clitics). In total there are 10,408 vowels in final syllables and 4,039 vowels in non-final syllables.
Domain: Each vowel was classified as belonging to either a function word, the stress domain of a content word, or an extrametrical morpheme in a content word. For content words, all vowels were considered to be part of the stress domain unless they belonged to verbal person, aspect, or incorporated movement prefixes, the superlative adjectival suffix, or the status suffixes -ïk, -öq, -ö, -ü. As discussed above, these affixes are not stressed regardless of their position or shape, including when they might be expected to be stressed, such as in word-final position for a non-verb or containing a tense vowel or closed syllable in a verb. Examples of the exclusion of these morphemes from the stress domain can be seen in Tables 2 and 3 above. In total, the data includes 5,111 vowels belonging to the stress domain of content words, 1,856 vowels in extrametrical morphemes in content words, and 7,480 vowels in function words.
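As a consistency check on the counts reported in this section, each factor should partition the same total of underlying vowels. The sketch below simply re-adds the figures given in the text; the dictionary layout and level labels are hypothetical.

```python
# Total underlying vowels: 6,967 in content words plus 7,480 in function words.
TOTAL_VOWELS = 6_967 + 7_480

partitions = {
    "presence": {"present": 11_499, "deleted": 2_948},
    "quality": {"lax": 8_808, "tense": 5_639},
    "coda": {"closed": 5_809, "open": 8_638},
    "onset": {"with onset": 13_469, "onsetless": 978},
    "stress": {"stressed": 6_655, "unstressed": 7_792},
    "syllable position": {"final": 10_408, "non-final": 4_039},
    "domain": {"stress domain": 5_111, "extrametrical": 1_856, "function word": 7_480},
}

# Collect any factor whose level counts do not add up to the total.
mismatches = {
    name: sum(levels.values())
    for name, levels in partitions.items()
    if sum(levels.values()) != TOTAL_VOWELS
}
```

An empty `mismatches` dictionary indicates that every factor accounts for all vowels in the data set.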
Item: Each vowel was categorized according to the morpheme it belongs to, with separate items for each vowel in morphemes with multiple vowels. By this metric, there were 610 distinct items in the data.
Speaker: Each word was categorized according to the speaker who produced it. There are 11 different speakers represented in the data.
Function words were additionally classified according to independent variables that capture the phrasal context, which appeared to influence rates of deletion in function words but not in content words: syllable position (phrase), syllable position (utterance), coda (phrase-level), onset (phrase-level), coda (utterance-level), and onset (utterance-level).
Syllable position (phrase): Vowels were categorized as phrase-final if they are in the last syllable of the phonological phrase and not phrase-final if there is a following syllable within the phonological phrase. Due to a lack of previous research on phonological phrasing in K’iche’, for the purposes of this study the phonological phrase was determined based on the syntactic structure, assuming that the phonological phrase should correspond roughly to the syntactic phrase as it does cross-linguistically (Bennett & Elfner 2019). Each of the major components of a sentence was considered to be a separate phrase: subject or object noun phrases, prepositional or adverbial phrases, and the verb complex (verb together with dependent particles). Any material that precedes the verb complex, such as complementizers or conjunctions, was excluded from the verb phrase and considered to form a separate phrase. Examples are shown in (3)–(4) with phrase boundaries in square brackets.


(3) is composed of three phrases: the adverbial phrase të’ k’ü rï’ ‘then’, the subject noun phrase rï jün ali ‘the girl’, and the verb complex nä xäraj täj ‘didn’t want to’. (4) is composed of four phrases: the introductory material para ke ‘so that’, the subject noun phrase rï wïnäq ‘the people’, the verb kiköjönïk ‘believe’, and the prepositional phrase pä rï kanton ‘in the community’.
In total there are 3,915 non-phrase-final tokens and 3,565 phrase-final tokens in the function word data.
Syllable position (utterance): Vowels were categorized as utterance-final if they are in the last syllable before a pause, and non-utterance-final otherwise. In total there are 669 utterance-final tokens and 6,811 non-utterance-final tokens in the function word data.
Coda and onset (phrase-level and utterance-level): Words were syllabified separately at the phrase level (including all material within the phonological phrase) and the utterance level (including all material enclosed by pauses). Syllabification was determined according to the same rules used for word-level syllabification, discussed above. In total, there are 2,744 vowels in closed syllables with phrase-level syllabification, and 4,736 in open syllables. There are 495 vowels in onsetless syllables with phrase-level syllabification, and 6,985 in syllables with onsets. When syllabification is considered at the utterance level, there are 2,629 vowels in closed syllables, and 4,851 vowels in open syllables. There are 326 vowels in onsetless syllables with utterance-level syllabification, and 7,154 in syllables with onsets.
Finally, each lax vowel in a non-final, unstressed CV syllable not adjacent to a stressed syllable was categorized according to one additional factor.
Following vowel presence: Each vowel of this type was categorized according to whether the following underlying vowel was present or deleted. Of 1,476 vowels in this category, the following vowel was deleted in 661 cases and present in 815 cases.
3.3 Statistical analysis
The data was visualized in R (R Core Team 2020) with the package ggplot2 (Wickham 2016) and analyzed with mixed-effects logistic regression using the package lme4 (Bates et al. 2015). The statistical models used treatment (“dummy”) coding, and the baseline condition for each variable was where deletion was most expected.
Deletion in function words, unlike content words, often seems to depend on the larger phrasal context. The factors of word, phrase, and utterance level are not independent of each other: all utterance-final syllables are also word-final and nearly all are phrase-final (except those that precede a phrase-internal pause). Rather than creating a complex factor with levels formed of all possible combinations of word, phrase, and utterance position – which would result in some categories with very small amounts of data – model comparison was done with the AIC (Akaike’s information criterion) and BIC (Bayesian information criterion) functions in R to determine whether the word, phrase, or utterance level best predicted the presence of vowels in function words for the variables onset, coda and syllable position. These models predict vowel presence according to only the syllable position variable or the syllabification variables (onset, coda), respectively, in addition to the random effects of item and speaker. The results of the model comparison are shown in Tables 5 and 6.
Table 5. Model comparison of the syllable position variable

Table 6. Model comparison of the syllabification variables

Both AIC and BIC are lowest at the phrase level for both the syllable position variable and the syllabification variables (onset, coda). Therefore, syllabification and syllable position were included at the phrase level for function words in later models (but at the word level for content words).
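The comparison logic can be illustrated with the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The sketch below is not a reanalysis: the log-likelihoods and parameter counts are hypothetical placeholders (the actual values are those in Tables 5 and 6), and the helper names are illustrative.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def best_level(fits, n_obs):
    """fits maps a level name to (log_likelihood, n_params); lower scores are better."""
    by_aic = min(fits, key=lambda lvl: aic(*fits[lvl]))
    by_bic = min(fits, key=lambda lvl: bic(*fits[lvl], n_obs))
    return by_aic, by_bic


# Hypothetical fits for the three candidate levels (not the study's values).
fits = {"word": (-3100.0, 4), "phrase": (-3000.0, 4), "utterance": (-3050.0, 4)}
chosen = best_level(fits, n_obs=7480)  # 7,480 function-word vowels
```

Because the candidate models here have the same number of parameters, both criteria reduce to choosing the highest log-likelihood; the penalty terms matter only when the models differ in complexity.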
Next, vowel presence (present, deleted) was modeled in the complete data set (function words and content words). The fixed effects were vowel quality (lax, tense), onset (yes, no), coda (no, yes), syllable position (non-final, final) and stress (unstressed, stressed). An interaction was included between each of the fixed effects and domain (stress domain internal content word, stress domain external content word, function word), with the exception of stress since all stress domain external vowels in content words are unstressed. These interactions were included because deletion appeared to be less regular outside of the stress domain and in function words as compared to content words. speaker and item were included as random effects. The equation is shown as follows.
glmer(factor(vowel_presence) ~ vowel_quality * domain + onset * domain + coda *
domain + syllable_position * domain + stress + (1|speaker) + (1|item), family = binomial,
control = glmerControl(optimizer = "bobyqa"))
The baseline categories for each variable were those most expected to contribute to vowel deletion: lax vowel quality, onset, no coda, unstressed, non-final position, stress domain internal morpheme in content word.
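In R, the baseline of a categorical predictor is the first level of the factor under the default treatment contrasts. The sketch below shows one way the baselines listed above could be set. The text does not state how this was actually done, and the data frame, column names, and rows are hypothetical.

```r
# Hypothetical column names and toy rows; not the corpus data.
toy <- data.frame(
  vowel_quality = c("tense", "lax", "lax"),
  onset         = c("no", "yes", "yes"),
  coda          = c("yes", "no", "no"),
  stress        = c("stressed", "unstressed", "unstressed"),
  stringsAsFactors = FALSE
)

# The first level of a factor is the reference level under R's default
# treatment contrasts, so the deletion-favoring category is listed first.
toy$vowel_quality <- factor(toy$vowel_quality, levels = c("lax", "tense"))
toy$onset         <- factor(toy$onset,         levels = c("yes", "no"))
toy$coda          <- factor(toy$coda,          levels = c("no", "yes"))
toy$stress        <- factor(toy$stress,        levels = c("unstressed", "stressed"))

# Inspect the reference level of each predictor.
baselines <- sapply(toy, function(x) levels(x)[1])
print(baselines)
```

With this coding, each main effect in the model output is read as the change relative to the baseline category, which is how the results below are described.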
Model comparison was carried out with the AIC and BIC functions to ensure that the addition of the interaction terms improved the fit of the model. The results are shown in Table 7.
Table 7. Model comparison of the full model with and without interactions

Both AIC and BIC are lower for the model with interactions, indicating that this model has an improved fit.
Additionally, to explore how deletion operates for vowels that meet the qualifications for deletion, a second model was run for the subset of the data consisting of lax, unstressed vowels in non-final CV syllables. This model had fixed effects stress position (adjacent, distant) and domain, and the interaction between them, as well as random effects of speaker and item. The equation is shown as follows.
glmer(factor(vowel_presence) ~ stress_position * domain + (1|speaker) + (1|item), family =
binomial, control = glmerControl(optimizer = "bobyqa"))
The AIC and BIC values for this model compared to the same model without the interaction term are shown in Table 8. Again, both AIC and BIC are lower for the model with the interaction, indicating that the interaction improves the fit of the model.
Table 8. Model comparison of the second model with and without interaction term

Finally, to explore how deletion operates for vowels that meet the qualifications for deletion but are more distant from a stressed syllable, a third model was run consisting of lax, unstressed vowels in CV syllables preceding a stressed syllable by two or more syllables. This model had fixed effects following vowel presence (present, deleted) and domain (stress domain internal content word, stress domain external content word, function word), with an interaction between them, as well as random effects of speaker and item. The equation is shown as follows.
glmer(factor(vowel_presence) ~ following_vowel_presence * domain + (1|speaker) +
(1|item), family = binomial, control = glmerControl(optimizer = "bobyqa"))
The AIC and BIC for this model compared to an identical model without the interaction term are shown in Table 9. Again, the inclusion of the interaction improves the fit of the model.
Table 9. Model comparison of the third model with and without interaction term

4 Results
The results of the full statistical model are shown in Table 10.
There is a significant positive effect (p < .01) of vowel quality, domain (function word), onset, coda, and stress. There are also significant interactions between domain and vowel quality, onset and coda.
Table 10. Results of the full statistical model (fixed effects)

Vowel quality: The main effect of vowel quality shows that in content words, tense vowels are less likely to be deleted than lax vowels. The significant interaction with domain shows that the effect of vowel quality on deletion differs in content words and function words: in function words, tense vowels are more likely to be deleted than in content words. The number of present and deleted vowels in each domain is shown in Figure 5.

Figure 5. Deletion of vowels by vowel quality.
Examples of words with similar structures where a lax vowel is deleted and a tense vowel preserved are shown in (5). The tense vowel /a/ is preserved in (5a) while the lax vowel /ʊ/ is deleted in (5b) in a similar context: in an open, unstressed syllable preceding the stressed syllable.

The vowel /ʊ/ is known to be present underlyingly in the word jünab’ as it is present in other contexts, e.g. jünäb’ir [χʊmɓiɾ] ‘a year ago’ (elicited example, same speaker as 5a,b).
Onset: The main effect of onset shows that inside of the stress domain of content words, vowels in onsetless syllables are less likely to be deleted than vowels in syllables with onsets. The significant interaction with domain shows that vowels in onsetless syllables are comparatively less resistant to being deleted in extrametrical morphemes of content words and in function words. The number of present and deleted vowels in each domain is shown in Figure 6.

Figure 6. Deletion of vowels by onset.
Examples of words with similar structures where a vowel in an onsetless syllable is preserved while a vowel in a syllable with an onset is deleted are shown in (6). The vowel /ə/ is preserved in (6a), and realized as its tense counterpart [a] due to its word-initial position. The same vowel is deleted in (6b), where the possessive prefix provides an onset to this syllable.

Coda: The main effect of coda shows that in stress domain internal morphemes in content words, vowels in closed syllables are less likely to be deleted than vowels in open syllables. The significant interaction terms show that vowels in closed syllables are comparatively more likely to be deleted in function words or extrametrical morphemes belonging to content words. The number of present and deleted vowels in each domain is shown in Figure 7. There are practically no deletions inside of the stress domain in content words. In extrametrical morphemes in content words and in function words, vowel deletion is more likely, though still uncommon.

Figure 7. Deletion of vowels by coda.
The preservation of vowels in closed syllables is exemplified in (7) with two forms of the morpheme nïm ‘big, important’. In (7a) the vowel /ɪ/ is in a closed syllable and is not deleted, whereas in (7b) it is in an open syllable, still between the same consonants, and deleted.

Syllable position: There is no significant effect of syllable position, nor any significant interactions between syllable position and domain.
Stress: The main effect of stress shows that vowels are less likely to be deleted when stressed. The number of present and deleted vowels according to stress is shown in Figure 8. (Note that definitionally, no stressed vowels are extrametrical.)

Figure 8. Deletion of vowels by stress.
The preservation of stressed vowels is exemplified in (8) with two forms of the verb köj ‘wear’. In (8a), the vowel /ɔ/ in the verb root is in the stressed syllable and is not deleted. In (8b), the same vowel is in an unstressed syllable, as stress falls on the suffix /-om/. This unstressed vowel is deleted. At a segmental level, the context for these two vowels is very similar: in the penultimate syllable, which has the shape /kɔ/.

Domain: The main effect of domain shows that vowels that are lax and in unstressed CV syllables are more likely to be deleted inside of the stress domain of content words than in function words. The number of deleted and preserved vowels in each domain is shown in Figure 9. Whereas these vowels are very likely to be deleted inside of the stress domain of content words, they are about equally likely to be deleted or not in extrametrical morphemes and in the stress domain of function words.

Figure 9. Deletion of unstressed lax vowels in CV syllables across domains.
Finally, the results of the random effects in this full model are shown in Table 11. There is little variation across speakers in the overall likelihood of vowels being deleted. There is much larger variation across items, indicating that vowels in certain morphemes are more prone to being deleted than others.
The results of the statistical model of the subset of lax, unstressed vowels in non-final CV syllables are shown in Table 12.
Table 11. Results of full statistical model (random effects)

Table 12. Results of the model of lax, unstressed vowels in CV syllables (fixed effects)

There is a significant main effect of stress position and both levels of domain, as well as a significant interaction between these two factors.
Stress position: Vowels belonging to the stress domain of content words are less likely to be deleted when two or more syllables away from a stressed syllable than when adjacent to a stressed syllable. Vowels in function words are comparatively more likely to be deleted in this distant context.
Domain: Vowels adjacent to a stressed syllable are less likely to be deleted in an extrametrical syllable or function word than inside of the stress domain of content words. Vowels more distant from a stressed syllable are comparatively more likely to be deleted in function words than inside of the stress domain of content words. The number of deleted and preserved unstressed lax vowels in non-final CV syllables according to domain and position with respect to a stressed syllable is shown in Figure 10. Overall, vowels are deleted more often when adjacent to a stressed syllable than when more distant from one. However, the contrast between positions is much smaller for vowels in function words than for those in content words.

Figure 10. Deletion of unstressed lax vowels in non-final CV syllables.
Examples of the deletion of lax vowels in non-final open syllables preceding the stressed syllable are shown in (9).

These vowels are known to be present underlyingly, as they appear in the same morphemes in other contexts, e.g. utzüküxik [ut͡sʊkʃik] ‘its being searched for’ (mushrooms, 6:30), ixöq [iʃɔq] ‘woman’ (healing, 2:33), and mïq’ïna’ [mɪq'naʔ] ‘hot water’ (3recipes, 7:27).
Table 13. Results of the model of lax, unstressed vowels in CV syllables (random effects)

Examples of the deletion of lax vowels in non-final open syllables following the stressed syllable are shown in (10). The vowel /ɛ/ is deleted in (10a) following the stressed syllable, and the vowel /ə/ in (10b).

These vowels are known to be present underlyingly because they appear in other forms of the same morphemes, e.g. käch’äjb’ëj [kət͡ʃ'əχɓɛχ] ‘you use it to wash’ (healing, 0:46) and sïb’äläj [sɪɓələχ] ‘very’ (changes1, 3:35).
Finally, the results of the random effects in this model are shown in Table 13. Again, there is very little variation across speakers in the likelihood of vowels being deleted. There is greater variation across items, indicating that the vowels in some specific morphemes are more likely to be deleted than others.
The results of the statistical model of the subset of lax, unstressed vowels in CV syllables preceding a stressed syllable by two or more syllables are shown in Table 14.
Table 14. Results of the model of lax, unstressed vowels in CV syllables distant from a stressed syllable (fixed effects)

There is a significant effect of following vowel and both levels of domain, as well as significant interactions between them.
Following vowel: The main effect of a following vowel shows that vowels belonging to the stress domain of content words are much less likely to be deleted if the following vowel is deleted. By comparison, similar vowels in extrametrical morphemes or function words are more likely to be deleted in this context than those inside of the stress domain of content words; that is, they are less dependent on the presence of the following vowel.
Domain: The main effect of domain shows that vowels which precede a present vowel are less likely to be deleted if they are in extrametrical morphemes or function words than within the stress domain of content words.
The number of deleted and preserved vowels according to the presence of the following vowel and the domain is shown in Figure 11. Stress domain internal vowels in content words show the sharpest difference across contexts: they are almost always deleted when the following vowel is present and present when the following vowel is deleted. The same overall difference is observed for function words, but it is less predictable. Extrametrical vowels in content words are usually present in both contexts, but are proportionally more often deleted if the following vowel is present.

Figure 11. Deletion of lax vowels in non-final CV syllables distant from a stressed syllable.
Examples of how deletion affects lax vowels in non-final open syllables more distant from the stressed syllable are shown in (11). In (11a), the vowel of the initial syllable is deleted, preceding a syllable with a present vowel. In (11b), the vowel of the initial syllable is present, and it precedes a syllable with a deleted vowel.

Finally, the results of the random effects in this model are shown in Table 15. As in previous models, there is little variation across speakers. There is greater variation across items, indicating that the vowels in certain morphemes are more likely to be deleted than those in others.
Table 15. Results of the model of lax, unstressed vowels in CV syllables distant from a stressed syllable (random effects)

5 Discussion
As noted above, building on prior work by Campbell (Reference Campbell1977), López Ixcoy (Reference Ixcoy and Candelaria Dominga1994) and England & Baird (Reference England and Baird2017), Wood (Reference Wood2020) concluded that lax vowels are deleted in content words when in a non-final CV syllable adjacent to the stressed syllable. Tense vowels, stressed vowels, and those in onsetless, closed, or word-final syllables are not deleted. Lax vowels in unstressed, non-final CV syllables not adjacent to the stressed syllable are sometimes deleted, but less regularly than those in syllables adjacent to the stressed syllable. Very little previous research existed on the deletion of vowels in function words in Chichicastenango K’iche’, but Wood (Reference Wood2020) observed that vowels may be deleted in function words in contexts that would be prevented in content words, such as word-final. These deletions sometimes appear to be regular, but in other cases appear to be optional, with the same vowel appearing variably in the same word and similar phrasal contexts.
Most of these observations are well substantiated in the corpus data. Vowels are significantly less likely to be deleted when tense, in closed syllables, in onsetless syllables, or stressed. Deletion is very frequent when the vowel is lax and in an unstressed CV syllable adjacent to a stressed syllable. When in a similar context but more distant from a stressed syllable, deletion is less frequent. In function words, vowel deletion follows similar restrictions to content words but is much more variable. However, the corpus study results find no significant effect of syllable position on vowel deletion, nor an interaction with domain. The corpus data also clarifies the differences in the vowel deletion pattern across domains (within the stress domain of content words, extrametrical morphemes, and function words), including differences in the predictability of the pattern.
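The descriptive generalization for vowels inside the stress domain of content words can be restated as a small decision function. This is only an illustrative sketch: the function name, argument names, and the three labels are hypothetical, and it deliberately treats as categorical a pattern that the corpus shows to be gradient, especially outside the stress domain and in function words.

```r
# Hypothetical helper: a categorical reading of the tendencies described above
# for vowels inside the stress domain of content words. The corpus results are
# gradient, so the labels are expectations rather than exact predictions.
expected_outcome <- function(lax, stressed, has_onset, has_coda,
                             adjacent_to_stress) {
  # Tense vowels, stressed vowels, and vowels in onsetless or closed
  # syllables resist deletion.
  if (!lax || stressed || !has_onset || has_coda) {
    return("preserved")
  }
  # Lax vowels in unstressed CV syllables: deletion is very frequent next to
  # the stressed syllable and less regular farther away.
  if (adjacent_to_stress) "deleted" else "variable"
}

# A lax vowel in an unstressed CV syllable adjacent to the stressed syllable.
print(expected_outcome(lax = TRUE, stressed = FALSE, has_onset = TRUE,
                       has_coda = FALSE, adjacent_to_stress = TRUE))
```

Syllable position is left out of the sketch because it was not a significant predictor in the full model.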
The following sections discuss some of the major results of the study. Section 5.1 discusses the effect of syllable position, which was identified as a factor in previous literature but was not significant in the statistical model. Section 5.2 discusses differences in the deletion pattern between different domains. Finally, Section 5.3 addresses the limitations of the study and possibilities for future research.
5.1 Deletion in word-final syllables
Previous descriptions of vowel deletion in Chichicastenango K’iche’ had indicated that deletion is limited to non-final syllables (Campbell Reference Campbell1977; López Ixcoy Reference Ixcoy and Candelaria Dominga1994; Wood Reference Wood2020). However, this factor was not significant in the statistical model, either as a fixed effect or as an interaction with domain.
An exploration of these final syllables sheds some light on why this factor appeared in previous descriptions but was not significant in the statistical model. The statistical model included as the baseline categories those most expected to contribute to vowel deletion: lax vowel quality, unstressed syllable, syllable with an onset, syllable without a coda. The baseline for domain was vowel belonging to the stress domain of content words. But lax vowels only ever occur in word-final position in extrametrical morphemes or in function words. Any final syllables of content words that belong to the stress domain of the word must always have a tense vowel, a coda, or both. Therefore, vowel deletion would be fully prevented in word-final syllables belonging to the stress domain of content words through the effects of the restriction against deleting tense vowels and vowels in closed syllables.
Vowels belonging to extrametrical morphemes may, however, be lax in word-final position without the need of a following consonant. Specifically, the status suffix -ɔ/-ʊ, which appears on root transitive verbs when at the end of an intonational phrase, creates verbs with a final unstressed CV syllable with a lax vowel. These vowels are rarely or never deleted, as they practically always appear when expected; that is, at the end of an intonational phrase (Wood Reference Wood2025b). An example is shown in (15). Extrametrical morphemes are indicated in angle brackets < >.

Here, the lax vowel /ɔ/ is not deleted despite occurring in an unstressed CV syllable that follows the stressed syllable. However, it is not clear whether this is an effect of syllable position per se, as assumed in Wood (Reference Wood2020), or rather an effect of this specific suffix: because the suffix consists of only a single vowel, deleting that vowel would result in the complete loss of the morpheme.
Function words may also end in a lax vowel, including when they are the final syllable of a phonological phrase. Of the 2,420 lax vowels in unstressed CV syllables in function words, 802 are phrase-final. Of these, 306 are deleted (38.2%). The remaining 1,618 are not phrase-final. Of these, 967 are deleted (59.8%). Therefore vowel deletion is more frequent in a function word that is not at the end of a phrase, indicating that there is some effect of syllable position though it was not evident in the statistical results.
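The rates above can be recomputed directly from the token counts. The following R sketch uses only the counts given in the text; the data frame and its column names are illustrative.

```r
# Token counts for lax vowels in unstressed CV syllables in function words,
# as reported in the text.
counts <- data.frame(
  position = c("phrase-final", "non-phrase-final"),
  total    = c(802, 1618),
  deleted  = c(306, 967)
)

# The two subsets should add up to the 2,420 tokens reported overall.
stopifnot(sum(counts$total) == 2420)

# Deletion rate per position, as a percentage rounded to one decimal place.
counts$pct_deleted <- round(100 * counts$deleted / counts$total, 1)
print(counts)
```

The recomputed rates are 38.2% for phrase-final vowels and 59.8% for non-phrase-final vowels, matching the figures reported above.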
It is also interesting to consider the small number of cases where vowel deletion does occur in final syllables, counter to expectations. As noted above, vowel deletion would be expected to be avoided entirely in final syllables inside of the stress domain of content words due to the effects of vowel quality and syllable shape. However, of 3,151 stress domain internal vowels in word-final syllables, five are deleted. These cases might seem negligible due to the tiny number of occurrences. However, four of these five tokens are similar: they end in a CVC syllable with a glide onset and a lax vowel, and they follow the stressed syllable. Interestingly, there are no tokens in the data that end in this type of syllable where the vowel is not deleted. An example is shown in (12).

This vowel is known to be present underlyingly as it appears in other forms of the same morpheme /-əɾ/, e.g. säqärïk [sqəɾɪk] ‘it dawned’ (fishing, 0:03).
In order to understand why vowel deletion occurs in this position, but not in similar contexts where the first consonant is not a glide, it is necessary to understand the structure of verbs in K’iche’. As is common across the Mayan language family, K’iche’ has a set of verbal suffixes referred to as ‘status suffixes’ which indicate the mood and transitivity of the verb stem (Polian Reference Polian, Judith Aissen and Maldonado2017). In K’iche’, these suffixes also provide information about the phrase position of the verb, as they are absent or change form when the verb is not at the end of an intonational phrase (Henderson Reference Henderson2012; Royer Reference Royer2021; Wood Reference Wood2025b). Status suffixes appear on intransitive and root transitive verbs (a transitive root with no further derivational morphology), but not on derived transitive verbs (derived from a root that is not a transitive verb). When present, status suffixes are always the last morpheme in the word and are extrametrical. An example of the status suffix alternation is shown in (13).

In (13a), the verb kächäq’ïj ‘it cooks’ is followed by the noun phrase rï qäti’ ‘our meat’ within the intonational phrase, and it has the (null) phrase-medial status suffix. In (13b), the same verb occurs at the end of the intonational phrase and appears with the phrase-final suffix -ïk.
The alternation between phrase-medial and phrase-final forms is relevant to understanding the vowel deletion pattern because in Chichicastenango K’iche’, like in the related languages Q’anjob’al (Mateo Toledo Reference Mateo Toledo, Judith Aissen and Maldonado2017) and the San Mateo Ixtatán variety of Chuj (Coon Reference Coon2019; Royer Reference Royer2021), phrase-final status suffixes are additionally used in phrase-medial contexts in order to prevent disallowed word-final consonant clusters. The only surface clusters which occur in word-final position in Chichicastenango K’iche’ consist of either a glide or a glottal stop followed by another consonant. If a verb would otherwise end in a disallowed consonant cluster, the relevant phrase-final status suffix appears even in phrase-medial position. Notably, these potentially word-final consonant clusters which are avoided through the appearance of the phrase-final status suffix always result from vowel deletion. An example is shown in (14).

Here the verb kataq’änïk [kataq'nɪk] ‘you go up’ appears with the phrase-final suffix -ïk despite its phrase-medial position. The vowel /ə/ in the verb is deleted, which would result in a word-final [q'n] cluster if the suffix were absent.
The question that is raised by forms like kataq’nïk ‘you go up’ is why vowel deletion should occur in these verbs to begin with. If these vowels were not deleted, there would be no need for the phrase-final status suffix to appear in medial contexts. Analytical choices about the underlying structure of verbs, and whether or not the status suffixes are present in the underlying form of intonational phrase-medial verbs, result in different conclusions about the cause of vowel deletion in these cases. The typical assumption in the Mayanist literature is that phrase-final suffixes are removed in phrase-medial contexts rather than inserted in phrase-final contexts (e.g., Larsen Reference Larsen1988; Coon Reference Coon2016; Pixabaj & Telma, Reference Pixabaj and Telma2017; del Prado & Curiel, Reference del Prado, Curiel, Judith Aissen and Maldonado2017). This is certainly the case from a historical perspective, as status suffixes would have been present in all phrasal environments in Proto-Mayan. However, it is not clear whether it is the best analysis for the individual modern languages in which the appearance of the suffix is conditioned by phrasal context. Some authors describe the suffixes as being added in phrase-final position, instead (e.g., Baird Reference Baird2014; Pye, Pfeiler & Mateo Pedro Reference Pye, Pfeiler, Mateo Pedro, Judith Aissen and Maldonado2017).
If it is assumed that the phrase-final forms of the status suffixes are present underlyingly in all cases, no matter the prosodic context of the verb, then the fact that vowel deletion occurs in verbs like xik’iyr and kataq’nïk is not surprising. The presence of the status suffix means that the preceding syllable is not closed, as the last consonant of the verb stem becomes the onset of the status suffix syllable rather than the coda of the preceding syllable. Therefore, vowel deletion in these cases affects lax vowels in unstressed CV syllables adjacent to the stressed syllable, exactly where it occurs in other contexts. Under such an analysis, the status suffix would then be unpronounced in medial contexts unless required for phonotactic reasons.
If it is instead assumed that the phrase-final forms of the status suffixes are not present underlyingly, at least in intonational phrase-medial context, this means that there is a second context in which vowel deletion occurs. This second deletion rule would affect lax vowels in unstressed CVC syllables which follow the stressed syllable. This rule applies if it results in either a permissible word-final consonant cluster, as in (12), or a non-permissible cluster which can be avoided by adding in the relevant status suffix for that type of verb, as in (14). If the application of the rule would result in a non-permissible cluster which cannot be avoided in this way – as is the case in derived transitive verbs which have no status suffixes in any context – then deletion is prevented. An example is shown in (15).

The verb esäj ‘remove’ does not have a status suffix that appears in phrase-final contexts. Therefore, deletion of the vowel /ə/ in the final syllable would result in a word-final cluster [sχ] which cannot be prevented by inserting a status suffix, and this vowel is not deleted.
Neither of these possibilities was initially suggested by the corpus study because syllable position was determined according to the morphemes that appear in the surface form of the word. Phrase-final status suffixes were counted in the syllabification if and only if they were present in the surface form, irrespective of the phrase position of the verb.
In sum, there is no conclusive evidence that syllable position affects vowel deletion in content words. Extrametrical vowels are not deleted when expected in word-final syllables, but this may be due to the specific morphemes involved rather than their position in the word per se. The likelihood of vowel deletion in function words does appear to be influenced by phrase position. Finally, the minority of cases where vowel deletion does occur unexpectedly in word-final (closed) syllables in content words reveals the possible existence of a second vowel deletion rule, one that targets lax vowels in CVC syllables following the stressed syllable. Whether or not deletion occurs in such an environment depends on assumptions about the underlying structure of verbal status suffixes, which would be a fruitful avenue for future research.
5.2 Differences across domains
The statistical results show that there are important differences in how deletion operates between stress domain internal morphemes of content words, extrametrical morphemes, and function words. The likelihood of deletion is most predictable for stress domain internal morphemes in content words, where the difference between contexts where deletion is expected and not expected is very sharp. Whereas vowels are practically never deleted when tense, stressed, in closed syllables, or in onsetless syllables, they are almost always deleted when lax, unstressed, and in CV syllables adjacent to a stressed syllable. Extrametrical vowels in content words follow similar patterns, but less starkly. For example, deletion of vowels in closed syllables, though not the norm, is more common in extrametrical morphemes than those belonging to the stress domain. An example is shown in (16), with extrametrical morphemes in angle brackets < >.

The vowel is deleted in the first syllable, which has a coda /n/. This vowel belongs to the extrametrical 1st person singular personal prefix ïn-.
Similarly, the preservation of lax vowels in CV syllables that are adjacent to a stressed syllable, though not the dominant pattern, is more likely for extrametrical vowels than those belonging to the stress domain. An example is shown in (17).

Here the lax vowel /ɔ/ is not deleted in the syllable /kɔ/, which is extrametrical and precedes the stressed syllable.
Extrametrical lax vowels in CV syllables more distant from a stressed syllable are also less consistently preserved when the following vowel is deleted, or deleted when the following vowel is preserved. That is, they are less predictable based on the presence or absence of the following vowel. Examples are shown in (18).

In (18a) the lax vowel /ə/ is not deleted in the first syllable, despite the fact that it precedes a syllable with a present vowel. In (18b), conversely, the vowel in the first syllable is deleted, despite the fact that the following, unstressed vowel is also deleted.
Compared to extrametrical morphemes in content words, deletion in function words is even more variable. Although vowels are unlikely to be deleted when tense, stressed, in closed syllables, or in onsetless syllables, deletion is more likely in these contexts for vowels in function words than for vowels in content words. Even more notably, although vowels in function words are most likely to be deleted when lax and in unstressed CV syllables adjacent to a stressed syllable, the rate of deletion in this context is only a little over half. Examples of the same function words with and without the vowel in very similar contexts are shown in (19)–(21).



In (19a), the vowel in the diminutive sïn /sɪn/ is preserved, whereas it is deleted in (19b). These words occur in very similar contexts: following the determiner rï, whose vowel is deleted, and preceding the noun u’al ‘broth, juice’. In (20a), the vowel in the existential k’ö /k'ɔ/ is deleted and the vowel in the following particle wï /wɪ/ is present. In (20b), the vowel in the existential is present and that in the following particle is deleted. In both cases these words are followed by the same directional particle känöq. In (21a), the vowel in the particle chï /t͡ʃɪ/ ‘already, again’ is preserved, whereas it is deleted in (21b). In both cases, it follows a stressed syllable and precedes the directional particle b’ï /ɓɪ/ ‘here to there’.
In addition to the greater variability of vowel deletion in function words, there is a second important difference compared to content words. In content words, phrasal context is irrelevant. For instance, the extrametrical superlative suffix /-ələχ/ appears 33 times in the data. The first vowel, which follows the stressed syllable, is typically deleted. The second vowel, however, is never deleted, no matter the following word. Examples are shown in (22).

In (22a), the suffix precedes the word tïnämït ‘town’, which begins with a consonant. Therefore, resyllabification at the phrase level would not affect the shape of the final syllable of the word nïmäläj. In (22b), however, the following word ulew ‘land’ begins with a vowel. If resyllabification occurred at the phrase level, the final consonant would become the onset of the second word, leaving an open syllable at the end of the first word. This vowel is not deleted, however – neither in this instance nor in any of the other instances in the data where the following word begins with a vowel. The phrasal context does not affect where vowels are deleted in content words.
Similarly, the addition of an affix as compared to a function word to a given content word has a markedly different effect on deletion. Prefixes permit the deletion of an otherwise word-initial vowel. Function words never have this effect – even when they could provide an onset to the following word-initial vowel – and the word-initial vowel is realized as tense in this context regardless of its underlying quality. This contrast is shown in (23).

The intermediate lax /ʊ/ in külew ‘their land’ in (23a), which is in a CV syllable preceding a stressed syllable, is deleted. The underlyingly lax vowel /ʊ/ at the beginning of the word ulew ‘land’ in (23b), when preceded by the proclitic täq, occurs in the same type of segmental context: preceded by a single consonant and followed by a stressed syllable. However, it is realized as tense and is not deleted.
In sum, the results of the study suggest the existence of three different levels of prosodic structure in the language. The stress domain includes content word roots, derivational affixes, and some inflectional affixes. At this level, vowel deletion is highly predictable based on vowel quality, syllable structure, and stress. The morphological word incorporates the remaining, extrametrical inflectional affixes. At this level, vowel deletion remains fairly predictable based on the same factors, though less so than inside of the stress domain. Finally, the phrase level incorporates function words. Vowel deletion is highly variable in this context, but is influenced by the same types of factors that constrain deletion at lower levels: vowel quality, syllable shape, and stress.
5.3 Limitations and future research
Studies on vowel deletion and other forms of speech reduction have identified a number of factors not considered in this study that are relevant cross-linguistically. Many studies have shown that reduction is more common for high-frequency words than for low-frequency words, as higher frequency correlates with greater predictability and lower cognitive effort (e.g., Ernestus & Warner 2011; Clopper & Turnbull 2018, and references therein). The differences observed between content words and function words in the present study may be partially explained by word frequency effects. Deletion occurs in unexpected environments (such as a closed syllable or tense vowel) more commonly in function words than in content words, which may be due to their greater predictability and higher frequency. However, vowel preservation also occurs in unexpected environments more commonly in function words than in content words. This cannot be explained as a result of preserving information where it is less predictable. The effects of word frequency would be of great interest for future research. Because the language is understudied, however, frequency estimates for content words are not currently available.
Similarly, speech tempo is a common factor in vowel deletion cross-linguistically, with deletion occurring more at faster speech rates (e.g., Lindblom 1990). This study did not consider speech rate as a factor. The results do not suggest a likely effect of speech rate for content words, at least within the stress domain, as the contexts where deletion occurs or does not occur are highly predictable based on vowel quality, syllable structure, and stress alone. The author’s experience is that deletion also occurs in the same words in contexts which induce phonetic lengthening, such as focus or utterance-final position. Speech rate may have an effect on deletion in more variable contexts, however, such as in extrametrical morphemes in content words and in function words. Similarly, speech style is known to affect vowel deletion across languages, with deletion being more common in casual speech than in clear or formal speech (Ernestus & Warner 2011). Speech style could not be included as a factor for the corpus study, since all of the data belongs to the same speech style: spontaneous, fairly casual, but with the speaker aware that they were being recorded by someone interested in how their language is spoken. However, anecdotal evidence suggests that deletion in content words occurs predictably in the same contexts in other speech styles, such as in very clear speech produced to correct the pronunciation of the researcher, who is not a native speaker of the language, or in controlled speech production experiment data. Therefore, it is not the case that deletion is restricted only to casual speech. For function words, however, speech style may have a greater effect. Future research on the possible effects of speech rate and style could test these predictions.
Finally, the present study considered deletion as binary: vowels were categorized as either present or deleted. Even minor evidence of the presence of the vowel, such as voicing between otherwise adjacent voiceless consonants, was considered sufficient to code a vowel as present. These present vowels therefore cover a range from hyperarticulated vowels to extreme reductions. It is possible that deletion might not be a categorical phenomenon but rather a realization of gradient vowel reduction. Bennett, Henderson & Harvey (2023) argue that in the closely related language Uspanteko, the apparent deletion of vowels is actually better understood as a result of high levels of gestural overlap. Although the vowel is articulated, it is acoustically masked by the flanking consonants and therefore inaudible. They provide pilot electroglottographic (EGG) data showing that, at least in some cases, the EGG signal contained covert voicing that was not detectable in the acoustic signal. They further argue that this gestural overlap is not phonetic in nature, but phonologically controlled: it is not conditioned by speech rate or style, it is a language-specific pattern, and it shows phonological and morphological conditioning. Bennett et al. (2023) do not discuss word class specifically, but all of the provided examples are of content words.
There are many similarities between the vowel deletion patterns of Uspanteko and Chichicastenango K’iche’, at least for content words. In both languages, deletion targets unstressed vowels preceding the stressed syllable (when the word has final stress) or following the stressed syllable (when the word has non-final stress). In both languages, deletion produces highly marked consonant clusters which are rare or unattested underlyingly in the language, especially morpheme-internally, and which violate common cross-linguistic principles like sonority sequencing. In both languages, deletion occurs not only in fast, casual speech but also in careful elicited speech and in contexts that induce phonetic lengthening, like utterance-final position. The languages differ primarily in the specific phonological and morphological conditions that deletion follows. For instance, deletion is avoided in prefixes and between identical adjacent consonants in Uspanteko, but it is not restricted in either of these contexts in Chichicastenango K’iche’. An example of deletion in a prefix can be seen in (16) above; examples of deletions between identical consonants include anïnäq /anInəq/ [annəq] ‘quickly’ (3recipes, 3:06), qäqän /qəqən/ [qqən] ‘our legs’ (changes1, 1:56), and kkaj /kəkaχ/ [kkaχ] ‘they want’ (fishing, 5:41).
The languages also appear to differ, however, in the predictability of the deletion pattern. Bennett et al. (2023) note that deletion is more common in post-tonic position than pre-tonic position in Uspanteko, but in both contexts it is variable. The same speakers sometimes produce the same words with and without the vowels in quick succession. In Chichicastenango K’iche’, in contrast, deletion within the stress domain of content words is highly predictable, occurring over 90% of the time when expected.
In sum, there are many similarities between the apparent patterns of vowel deletion in Chichicastenango K’iche’ and the closely related language Uspanteko, where “deletion” has been argued to result from a gradient reduction process of gesture overlap. However, the Chichicastenango K’iche’ pattern is much more predictable, at least for content words. Further research is needed to assess the possibility of understanding the Chichicastenango K’iche’ pattern as a phonologically controlled gestural coordination.
6 Conclusion
The results of the corpus study provide empirical support for most of the previous generalizations about vowel deletion in content words, within the stress domain: deletion is prevented for tense vowels, those in closed syllables, those in onsetless syllables, and those that are stressed. Even though very reduced realizations are typical in casual, spontaneous speech, these restrictions are practically absolute in the data. Deletion occurs very regularly for lax vowels in unstressed CV syllables adjacent to the stressed syllable. Additionally, depending on how status suffixes are analyzed, there may be a second context where deletion occurs: lax vowels in unstressed post-tonic closed word-final syllables, if they have a glide onset or if an appropriate phrase-final status suffix can be recruited to prevent the subsequent consonant cluster from being word-final. Finally, for vowels of a type that could in principle be deleted but which are not adjacent to the stressed syllable, deletion typically occurs when the following vowel is present and does not occur when the following vowel is deleted; that is, deletion is avoided in adjacent syllables.
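The core generalization summarized above can be restated as a simple predicate. The following is only an illustrative sketch, not part of the corpus analysis: the class `Syllable`, its field names, and the function `deletion_expected` are hypothetical, and the sketch deliberately covers only the main context (lax vowels in unstressed CV syllables adjacent to the stressed syllable). It ignores the possible second context involving post-tonic word-final syllables and the avoidance of deletion in adjacent syllables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    """Hypothetical feature bundle for one syllable (names are illustrative)."""
    lax_vowel: bool           # True for lax vowels, False for tense vowels
    has_onset: bool           # False for onsetless syllables
    has_coda: bool            # True for closed syllables
    stressed: bool
    adjacent_to_stress: bool  # immediately before or after the stressed syllable


def deletion_expected(syl: Syllable) -> bool:
    """Return True only in the core context described in the text:
    a lax vowel in an unstressed CV syllable next to the stressed syllable."""
    return (
        syl.lax_vowel
        and syl.has_onset
        and not syl.has_coda
        and not syl.stressed
        and syl.adjacent_to_stress
    )


# First syllable of külew 'their land' (23a): lax vowel, CV, pretonic.
kulew_first = Syllable(lax_vowel=True, has_onset=True, has_coda=False,
                       stressed=False, adjacent_to_stress=True)

# First syllable of ulew 'land' (23b): same vowel, but no word-level onset.
ulew_first = Syllable(lax_vowel=True, has_onset=False, has_coda=False,
                      stressed=False, adjacent_to_stress=True)

print(deletion_expected(kulew_first))  # True
print(deletion_expected(ulew_first))   # False
```

Treating each restriction as a separate boolean makes it easy to see that a single unmet condition (tense vowel, closed syllable, missing onset, or stress) is enough to block deletion within the stress domain.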
Outside of the stress domain, deletion follows similar restrictions. However, it is less predictable in this context than within the stress domain. Vowels in extrametrical syllables are more likely to be deleted when expected to be present, or present when expected to be deleted, than vowels within the stress domain.
For function words, in turn, vowel deletion is highly unpredictable. Although deletion is uncommon for tense vowels, those in closed syllables, those in onsetless syllables, and those that are stressed, it occurs in these environments more frequently in function words than in content words. Furthermore, vowels of the type most likely to be deleted – lax, in unstressed CV syllables – are deleted in only a little over half of all instances in the data, compared to over 90% of the time within the stress domain of content words.
These results suggest the existence of three different levels of prosodic structure in Chichicastenango K’iche’. At each of these levels, similar phonological rules apply; however, the regularity of their application differs. The existence of similar phonological rules that apply differently at different levels of prosodic structure is attested cross-linguistically. For example, the assimilation of English nasals is obligatory within words but optional across word boundaries (Kiparsky 1985). Similarly, palatalization in English is gradient and variable at the post-lexical level but categorical at the level of stem derivation (Zsiga 2000). These types of patterns have been argued to result from rule scattering; that is, stabilization of a previously gradient phonetic process into a categorical rule without the previous process disappearing (Robinson 1976; Bermúdez-Otero & Trousdale 2012; Iosad 2016). Over time, the level at which a phonological pattern applies may change in the language, and the result may be the existence of multiple similar processes at different levels of structure where one is categorical and the other gradient.
The greater likelihood of deletion in unexpected contexts in function words makes sense from an articulatory perspective, as function words are more often reduced than content words, and a high degree of variability is characteristic of reduction (Warner 2019). Variability in vowel deletion in casual speech is found in patterns such as schwa deletion in English (Oshika et al. 1975) and French (Dell 1981; Bürki et al. 2011). The fact that vowels in these contexts are less likely to be deleted when expected, however, cannot be easily explained as resulting from reduction, and rather appears to be due to a phonological effect of prosodic structure.
Acknowledgements
Sib’alaj maltyox to all of the K’iche’ speakers who shared their language with me and made this work possible. Thank you to Scott Myers, Brandon Baird, Danny Law, Megan Crowhurst, Ryan Bennett and Joshua Lieberstein for helpful conversations on many aspects of this work. Thanks also to two anonymous reviewers and the editor for their comments, which substantially improved the manuscript. This material is based upon work supported by the National Science Foundation under Grant No. 000392968. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Competing interests
The author declares none.
Supplementary material
To view supplementary material for this article (including audio files to accompany the language examples), please visit https://doi.org/10.1017/S0025100325100765
Appendix A. Abbreviations
The following abbreviations are used in example glosses throughout the text.

Appendix B. Data sources
Unless otherwise noted, all examples cited throughout the text come from the collection of spontaneous, monologic speech used for the corpus study and are identified by a recording code and a timestamp. Audio files of each of these examples are included in the supplementary materials. Information about each of the recordings that make up the corpus is shown in the following table. The archive PID is included for all recordings archived in the Archive of the Indigenous Languages of Latin America (AILLA, the K’iche’ Collection of Elizabeth Wood, https://ailla.utexas.org/collections/563/).
Table B1. Information about the recordings that comprise the corpus