Evidence for stress in Filipino text-setting

Kie Zuraw; Paolo Roca

doi:10.1017/S0952675725000041

Published online by Cambridge University Press: 04 July 2025

Kie Zuraw*: Affiliation:
Department of Linguistics, University of California, Los Angeles, Los Angeles, CA, USA (https://ror.org/046rm7j60).
Paolo Roca: Affiliation:
School of Nursing, University of California, Los Angeles, Los Angeles, CA, USA.
*: Corresponding author: Kie Zuraw; Email: p.roca@ucla.edu

Abstract

Words in Tagalog/Filipino can be either penult-prominent or ultima-prominent. Scholars have been divided on whether the language has stress, or only phonemic vowel length in penults and default phrase-final prominence. Using a corpus of Original Pilipino Music, we find that both prominent penults and prominent ultimas are set to longer notes and stronger beats, even in phrase-medial position. We further find that among pre-tonic syllables, those that would plausibly attract secondary stress are mostly set to longer notes and stronger beats. Text-setting does not faithfully reflect differences in phonetic cues between the two types of prominence, nor is it sensitive to presumed phonetic differences between high and low vowels. We conclude that songwriters’ text-setting decisions reflect phonological stress in Filipino, and that both penult-prominent and ultima-prominent words bear stress.

Keywords

Filipino; Tagalog; music; text-setting; stress

Information

Type: Article
Information: Phonology, Volume 42, 2025, e7

DOI: https://doi.org/10.1017/S0952675725000041
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

We use data from Filipino pop music to address a long-standing debate about stress in Tagalog/Filipino, with three main findings.

First, we find that phonologically prominent syllables are set to longer notes and stronger beats. This is not surprising; many previous works on other languages have found that songwriters choose to begin stressed syllables on strong beats, and/or to give them long durations (to mention a few: Dell & Halle 2009; Hayes 2009; Proto 2013; Proto & Dell 2013; Temperley & Temperley 2013; Bellik 2019). However, to the best of our knowledge, this is the first study of phonological aspects of text-setting in a Philippine language, and possibly in any language of the Austronesian language family.¹

Our second finding concerns the contrast between penultimate and final prominence. Nearly all words in Tagalog/Filipino fall into one of two types: penult-prominent (prominence on the second-to-last syllable of the word), as in [ʔábot] ‘power, capacity’, and ultima-prominent (prominence on the last syllable of the word), as in [ʔabót] ‘arrival’. As discussed below, some scholars treat the two types as qualitatively different, with penult-prominent words like [ʔábot] having underlying length (/ʔaːbot/) and ultima-prominent words like [ʔabót] having only default phrase-final prominence (Constantino 1965; Schachter & Otanes 1972; Soberano 1980; Himmelmann & Kaufman 2020). We find evidence instead for a stress analysis: penultimate and final prominent syllables are treated similarly in the music corpus we examine, even when phrase-medial, both being set to stronger beats and longer notes. As further support for the stress analysis, we find that text-setting prominence does not shift onto phrase-final enclitics; that syllable shape matters for text-setting only where it has been claimed to matter for stress; and that syllables plausibly predicted to bear secondary stress are also set to longer notes and stronger beats. We conclude that Tagalog/Filipino has stress in all words, not just the penult-prominent ones.

Our third finding is that text-setting appears to be sensitive to prominence at an abstract, phonological level, in that it does not mirror the phonetic cues of word prominence, nor is it sensitive to vowel height, even though in speech low vowels should have longer duration and greater loudness.

In the next section, we lay out the two types of word prominence and their phonetic cues, and give background information on the language and music. §3 describes our methods; §4 and §5 give results for duration and beat strength. §6 provides Monte Carlo tests of statistical significance. §7 and §8 give results for phrase-final enclitics and pre-tonic syllables. §9 argues that the text-setting in this corpus does not track phonetics. An HTML file showing the annotated R code that generated our figures and results is provided as Supplementary Material.

2. Background

2.1. Tagalog and Filipino

The terms ‘Tagalog language’ and ‘Filipino language’ are often used interchangeably. When a distinction is intended, ‘Tagalog’ generally refers to the language of the Tagalog ethnic group in the northern Philippines, whereas ‘Filipino’ is the national language of the Philippines, based on Tagalog and enriched with vocabulary from other Philippine languages, English, Spanish and elsewhere. ‘Filipino’ tends to refer to the language as spoken outside the Tagalog region, or in Philippine cities where Tagalog and non-Tagalog people interact; it can also refer to a prestige variety of the language used in national media (Nolasco 2007). Filipino, especially as used outside of the Tagalog region, has grammatical differences from Tagalog (Rubrico 2012; Demeterio & Dreisbach 2017).

We use the term ‘Filipino’ in this article, because the music we are analysing forms part of the Philippine national mass media; the linguistic sources we cite use the term ‘Tagalog’.

2.2. Word prosody in Filipino

2.2.1. Two types of word

The great majority of words in Filipino fall into two classes: penult-prominent and ultima-prominent. Table 1 lists the phonetic properties that have been observed in Gonzalez (1970), Anderson (2006) and Klimenko et al. (2010);² examples are in (1). In the song corpus, there are about the same number of words of each kind (by type: 269 penult-prominent, 266 ultima-prominent; by token: 498 penult-prominent, 475 ultima-prominent). To avoid committing prematurely to a phonological analysis, we avoid the IPA stress notation (International Phonetic Association 1999) and instead place an acute accent over the prominent vowel.

Table 1 Two types of word in Filipino.

While lexical words are generally at least disyllabic, there are several monosyllabic function words, for example, [ʔat] ‘and’, [ba] ‘(question particle)’, [din] ‘also’. There are loanwords with antepenultimate prominence, such as [ʔágila] ‘eagle’, from Spanish águila, which was the only such word in the song corpus we used. And there are loans with prominence on a closed penult, such as [bɾiljánte] ‘diamond’, from Spanish brillante.

The spectrograms in Figure 1, made with Praat (Boersma & Weenink 2017) from recordings in the online dictionary of Tagalog.com, illustrate a minimal pair in citation form.³ The top example is [ʔábot] ‘power, capacity’, with penult and ultima vowels similarly long, penult louder than ultima (as can be seen by comparing the heights of the waveforms for [a] and [o]) and high pitch on the penult followed by a pitch fall between the two vowels, as shown by the pitch track (thick black trace overlaid on the spectrogram). The bottom example is [ʔabót] ‘arrival’, with a short penult vowel and long ultima vowel, penult and ultima similarly loud and a pitch fall late in the ultima.

Figure 1 Citation-form disyllables: penult-prominent [ʔábot] ‘power, capacity’ (top) and ultima-prominent [ʔabót] ‘arrival’ (bottom).

Further evidence that final prominence is not associated with longer duration (prominent ultimas are not longer than non-prominent ultimas) comes from Reed’s (2022) acoustic study of reduplication. Tagalog has semi-productive copying of a stem’s first two syllables, as in ka-sáma ‘companion’ vs. ka-sáma-sáma ‘constant companion’. Impressionistically, both copies often have word-level prominence. Reed finds that when penult-prominent roots undergo copying of their first two syllables, the copied penult is also long, as in d[aː]mi-dámi ‘quite a lot’, from dámi ‘quantity’. But when ultima-prominent roots are copied, the copy of the ultima is not long: an[o]-anó ‘what-pl’, from anó ‘what’. Reed takes this difference as evidence for the underlying-length analysis, but it is also consistent with final stress not causing additional lengthening beyond word-final lengthening.

The facts for multi-word utterances are less clear. Schachter & Otanes (1972) describe most Tagalog intonation patterns as including a phrase-final pitch-accent; for some patterns, this accent’s location depends on vowel length. The behaviour of these ‘fixed-P2 patterns’ (Schachter & Otanes 1972: 32) is illustrated in (2): for penult-prominent words, the pitch accent can optionally move to the phrase-final syllable when an enclitic is added; for ultima-prominent words, the pitch accent must move to the phrase-final syllable.⁴

The optionality for penult-prominent words, as well as the variety of intonation types and their overlapping semantics or usage, makes it difficult to say whether any given utterance falsifies these claims.

2.2.2. Previous analyses

Previous authors have fallen into two broad camps, summarised in Table 2.⁵ For some (Constantino 1965; Schachter & Otanes 1972; Soberano 1980; Himmelmann & Kaufman 2020), there is a phonemic length contrast in open penults. This explains the phonetic duration facts: long vowels are pronounced with greater duration, as are all final syllables (leaving the [ʔa] of [ʔabót] short). This analysis must stipulate that long vowels attract pitch accent away from its default phrase-final location, and that a syllable following the pitch accent has lower amplitude. For these authors, stress, if it exists at all in the language, is the surface result of prominence due to vowel length or intonational pitch-accent. (Constantino 1965 uses underlying stress to encode exceptions such as /bɾilˈjante/ → [bɾil.ján.te] ‘diamond’, which has prominence on a closed penult.)

Table 2 Approaches to word prominence in Filipino.

For other authors (Bloomfield 1917; Blake 1925; Ramos 1981; French 1988, 1991; Avery & Lamontagne 1995; Sabbagh 2014; Richards 2017), there is a phonemic stress contrast. This analysis must stipulate that both stressed syllables and word-final syllables are lengthened, but not additively, so that stressed, word-final syllables are not extra-long. Like the vowel-length analysis, the stress analysis stipulates that a syllable following the pitch accent has lower amplitude.

We now review three phonological phenomena relevant to length or stress, concluding that they can be analysed under either approach.⁶ Thus, new data are needed in order to distinguish between the two approaches.

The first phenomenon is that closed penults cannot bear prominence (with exceptions, mainly in loans): *hágkan is not allowed. Under the length analysis, this is straightforward: as in many languages, a syllable can have either a long vowel or a final consonant, but not both, ruling out *[haːg.kan]. A stress analysis can stipulate that stress must fall on one of the last two moras of the word (with final consonants extrametrical – that is, not bearing a mora). As shown in (3), stress (the bottom-most × mark on the grids) cannot fall on a consonant mora (ruling out (3d) *[haǵ.kan]), nor can it fall on a mora that is not one of the last two (ruling out (3e) *[hág.kan]).⁷ This is reminiscent of many trochaic languages’ ban on words that end with a heavy penult and a light ultima (see Hayes 1995 on trochaic shortening; Zuraw 2018).

The second phenomenon is prominence shift in verbs. As shown in (4), prominence shifts one syllable to the right when a suffix is added. Exceptional loan verbs with prominence on a closed penult retain that prominence and gain another on the final syllable.

Under a length analysis, ultima-prominent words require no explanation; they have no long vowel underlyingly, and continue to have no long vowel after suffixation. For penult-prominent words, some form of prosodic faithfulness can be invoked, whereby length moves in order to remain penultimate; see Shryock (1993) and Crosswhite (1998) for analyses of similar phenomena in other languages. Prosodic faithfulness has to be overridden by some mechanism that anchors exceptional vowel length in a closed syllable, as in [kweːntuhan]; the final syllable is far enough from the long vowel to get phrase-final prominence.

Under a stress analysis, a similar prosodic faithfulness mechanism is needed to keep stress penultimate when a penult-prominent root is suffixed, and the same mechanism applies to ultima-prominent roots, keeping stress final (see French 1988; Sabbagh 2004). For the exceptional words, again some mechanism keeps the exceptional stress in place, and an additional stress is added to the final syllable to avoid stress lapse.

The third phenomenon is prominence shift in some suffixed nouns. Depending on the morphology, several patterns are possible (see Schachter & Otanes 1972: 98–102). Both penult-prominent and ultima-prominent words can, when suffixed, become either penult-prominent or ultima-prominent, with or without lengthening of the root-initial vowel, and possibly other vowels. A sampling is given in (5). Under any account, there must be morpheme-specific prosodic requirements and optionality. A length account requires length to be deleted, moved, or added; a stress account requires stress to be moved or not moved and pre-tonic length to sometimes be added (see Sabbagh 2004 and Hagberg 2006 for stress-based accounts of part of the pattern).

While these length-based and stress-based accounts have their strengths and weaknesses, both are workable, and these primary phonological data are thus not decisive.

2.3. Original Pilipino Music (OPM)

OPM stands for Original Pilipino Music. The term can refer to any Philippine pop music, but there is a stylistic core of music that is most likely to be labelled OPM. Arceo-Dumlao (2017) is a rich collection of interviews with OPM songwriters, focusing on topics such as how an individual got into the music business, the inspiration for a song, or how a singer and songwriter met. There is little discussion of songwriting mechanics such as word choice, but the interviews do provide insight into the songwriting process. Some famous songs were written, music and lyrics, by one person in one day. In other cases, lyrics were written for a melody that had already been composed by someone else. And in yet other cases, the lyrics were written first and then another songwriter composed the melody.

Other literature on OPM includes engineering projects to classify songs into sub-genres, identify mood, distinguish OPM from K-pop, recommend songs automatically or predict hits (Deja et al. 2016; Mital et al. 2019; Abisado et al. 2021; Sulit 2022; Monterola et al. 2009); humanistic studies of the history, culture and politics of OPM (Lockard 1996; Maceda 2007; Gabrillo 2018; Domingo 2021; Peña 2021; Cayabyab 2021; Prudente 2021; Shunwei & Jia 2022; Nagai 2022; Gaillard 2022); a study of cover performances on social media (Anacin et al. 2021); and social-science studies of music preference (Boer et al. 2013; Limjuco et al. 2014) and of attitudes towards code-switching in OPM (Bareng 2019). We have identified two linguistic studies of OPM. Alegado et al. (2021) analyse instances of English in OPM according to their length and syntax (the corpus we examine here did not happen to contain any code-switches into English). Sumalinog et al. (2021) discuss OPM lyrics’ use of Swardspeak, ‘the vernacular language or code used by Filipino gay men in the Philippines and in the diaspora’ (Manalansan 2003: 46). Strikingly, more than half of the literature we identified has been published since 2020.

We failed to find linguistic analyses of text-setting – the relationship between text and notes – in music of any Philippine language. The closest publication we could find was Anderson (2015), a guide to performing Tagalog Kundiman songs, which notes several instances where it is musically effective for duration, beat strength and/or pitch to correlate with stress.

3. Methods and predictions

3.1. Song corpus

We found sheet music online for 19 usable songs. Nine songs were purchased from the composer Aldy Santos’s Web site, Aldy Sheet Music (aldysheetmusic.com), and ten songs from MuseScore (musescore.com), a site that allows users to upload their own transcriptions. (We found but excluded another seven songs: one was in 3/4 time; one was translated from Visayan, and song translators are working under different constraints; and five were each in a markedly different style from the core 19 songs.) Table A1 in the Appendix lists the songs.

All songs were in the 4/4 time signature, which means that each measure has four beats. In this time signature, each beat is a quarter note (or crotchet). For those unfamiliar with musical notation, what is important about this time signature is that one can count along to the music in a repeating pattern of 1–2–3–4, 1–2–3–4, 1–2–3–4, ….

3.2. Coding

After listening to recordings and correcting the sheet music where necessary (this was rare), we hand-converted each piece of sheet music into a spreadsheet. In Figure 2, we show a fragment from ‘Akin ka na lang’ by Francis Salazar, as performed by Morisette (with accent marks added). The opening words are Bákit hindíɁ mo maramdamán… (‘Why don’t you feel…’). Each column in the spreadsheet stands for one sixteenth-note of duration, with sixteen columns per ‘measure’ of music. The rows include a repeating metrical grid (the rows with × marks; Liberman 1975; Lerdahl & Jackendoff 1981, [1983] 1996), to guide us to the correct cell for data entry, and rows to enter information about each syllable.

Figure 2 Transcription fragment, from ‘Akin ka na lang’.

As mentioned above, each measure is counted as 1–2–3–4; these numbers appear on the ‘beat’ row of the spreadsheet. We assume, following the usual convention for music in the 4/4 time signature, that the strongest position in the measure is the 1, or downbeat; we show this by giving the downbeats the tallest columns in the metrical grid, with five × marks. The second-strongest position is the 3, which we give four × marks. The next-strongest positions are the 2 and the 4, with three × marks each. If a musician wants to count to the music more finely, dividing the measure into eight equal parts, they can count 1–and–2–and–3–and–4–and, with an and falling in the middle of each beat. These ands are numbered 1.5, 2.5, 3.5 and 4.5 in the ‘beat’ row, and are the next-strongest positions, with two × marks each. A musician can count even more finely, dividing the measure into 16 equal parts, often as 1–ee–and–a–2–ee–and–a–3–ee–and–a–4–ee–and–a. These ees and as, numbered 1.25, 1.75, 2.25, 2.75, etc., are the weakest positions, with one × mark each.
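The grid just described can be written out compactly. The following is a minimal sketch in Python, not the R code used for the study; the function name grid_strength and the choice to express positions as numbers from 1 up to (but not including) 5 are assumptions made for illustration.

```python
from fractions import Fraction


def grid_strength(position):
    """Return the number of grid marks (1-5) for a position in a 4/4 measure.

    `position` uses the numbering of the 'beat' row: 1 is the downbeat,
    1.25 the first 'ee', 1.5 the first 'and', and so on up to 4.75.
    """
    offset = Fraction(position) - 1            # distance from the downbeat, in beats
    if offset < 0 or offset >= 4 or (offset * 4).denominator != 1:
        raise ValueError("position must lie on the sixteenth-note grid of one measure")
    if offset == 0:
        return 5                               # beat 1 (downbeat)
    if offset == 2:
        return 4                               # beat 3
    if offset.denominator == 1:
        return 3                               # beats 2 and 4
    if offset.denominator == 2:
        return 2                               # the 'ands'
    return 1                                   # the 'ees' and 'as'


# Grid heights for the sixteen columns of one measure.
measure = [grid_strength(1 + column / 4) for column in range(16)]
```

Listing the sixteen values in order reproduces the repeating pattern of column heights in the metrical grid.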

The other rows encode linguistic information. Ba, for example, is entered in the ‘text’ row, in the column where it begins (the downbeat, identified as 1 in the ‘beat’ row). Because Ba is set to an eighth note, it extends over two columns; the underscore in the next cell indicates continuation. The 1 in the ‘stress’ row indicates that Ba is prominent.⁸ The L in the ‘length’ row indicates that this syllable has a long vowel (in this case, predictable from being a stressed open penult); closed syllables are coded as C, and open syllables with short vowels as S. The ‘syll_position’ row shows 2, indicating that Ba is a penult (second from the end of the word). The ‘line_number’ row shows that all the syllables depicted here are in the first line of the song. The line is not a repeat, and we made no special notes (so the ‘repeat’ and ‘notes’ rows are blank). Filipino spelling does not indicate prominence, pre-tonic length, or word-final glottal stops; we relied on a combination of dictionary entries and one author’s native-speaker knowledge for the ‘stress’ and ‘length’ rows, and for the Qs indicating word-final glottal stops in the ‘text’ row.
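To make the column convention concrete, here is a hedged Python sketch of how a ‘text’ row of this kind could be read back into syllables with onsets and durations. It is not the study’s R script: the function name parse_text_row, the use of an empty string for a column with no syllable, and the example row are all assumptions. Only the placement of Ba (an eighth note on the downbeat) follows the description above; the cells after it are invented.

```python
SIXTEENTH = 0.25  # one spreadsheet column, in quarter-note beats


def parse_text_row(cells):
    """Convert a 'text' row into (syllable, onset, duration) tuples.

    `cells` holds one string per sixteenth-note column: a syllable where a
    note begins, '_' where the note continues, '' where nothing is sung.
    Onsets are counted from the first column, so onset 0.0 is beat 1.
    """
    syllables = []
    note_open = False
    for column, cell in enumerate(cells):
        if cell == "_":
            if not note_open:
                raise ValueError(f"continuation without a syllable at column {column}")
            syllables[-1][2] += SIXTEENTH          # extend the current note
        elif cell == "":
            note_open = False                      # silence closes the note
        else:
            syllables.append([cell, column * SIXTEENTH, SIXTEENTH])
            note_open = True
    return [tuple(entry) for entry in syllables]


# Invented row: only 'Ba' (two columns from the downbeat) follows the text.
row = ["Ba", "_", "kit", "_", "", ""]
parsed = parse_text_row(row)
```

Each tuple could then be joined with the matching cells of the ‘stress’, ‘length’ and ‘syll_position’ rows by column index.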

An R script (R Core Team 2021) reads and processes these spreadsheets.

3.3. Exclusions

We excluded 223 syllables because they belonged to a word that fell entirely or in part on a triplet (a division of a note into three equal parts instead of two or four), because we had no principled way to classify the prominence of the second and third sub-beats of a triplet. All the songs we coded had four beats per measure, but some had short passages with a different number of beats per measure, and we excluded 11 syllables because they belonged to words that fell partly or wholly within such passages. We excluded repeated lines, so that our data set would not appear (in plots and statistical analyses) to be bigger and more consistent than it really is. Finally, we excluded the one word in the corpus with antepenultimate stress, ágila ‘eagle’, a Spanish loan.
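Note that the triplet and metre exclusions apply to whole words: one affected syllable removes every syllable of its word. A minimal Python sketch of that logic follows. The field names (word_id, on_triplet, irregular_metre, antepenult_stress, repeated_line), the helper syl and the example syllables are hypothetical and are not taken from the study’s spreadsheets or R code.

```python
def usable_syllables(syllables):
    """Apply the exclusions by whole word and by repeated line."""
    # A word is dropped in full if any one of its syllables is flagged.
    excluded_words = {
        s["word_id"]
        for s in syllables
        if s["on_triplet"] or s["irregular_metre"] or s["antepenult_stress"]
    }
    return [
        s for s in syllables
        if s["word_id"] not in excluded_words and not s["repeated_line"]
    ]


def syl(text, word_id, **flags):
    """Build one syllable record with all exclusion flags off by default."""
    record = {"text": text, "word_id": word_id, "on_triplet": False,
              "irregular_metre": False, "antepenult_stress": False,
              "repeated_line": False}
    record.update(flags)
    return record


# Invented example: word 1 is partly on a triplet, word 3 is in a repeated line.
example = [
    syl("sa", 1), syl("ma", 1, on_triplet=True),
    syl("bu", 2), syl("kas", 2),
    syl("bu", 3, repeated_line=True), syl("kas", 3, repeated_line=True),
]
kept = [s["text"] for s in usable_syllables(example)]
```

In this toy input only the two syllables of word 2 survive, because word 1 is lost to the triplet and word 3 to the repeated line.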

3.4. Predictions

We are comparing four syllable types: prominent and non-prominent penults and ultimas. Our absolute null hypothesis is that all four types are assigned to notes of similar length, and beats of similar strength.

The next closest to a null hypothesis is that OPM text-setting purely reflects the phonetics of word prosody, and tells us nothing about its phonology. In that case, non-prominent penults should be assigned to shorter notes than all the rest, as illustrated by the arrows on the left in (6), which start from the syllable type predicted to be set to longer notes, and point to the syllable type predicted to be set to shorter notes. The musical equivalent of loudness is less direct, but there is a tendency for loudness to signal beat strength (Lerdahl & Jackendoff [1983] 1996: 17–18, 78–79). Non-prominent ultimas should then be assigned to weaker beats than all the rest, as illustrated on the right in (6), but with dashed arrows to show that the predictions are less direct, starting from the syllable type predicted to be set to stronger beats, and pointing to the syllable type predicted to be set to weaker beats.

There are two non-null hypotheses: that generated by the vowel-length theory, and that generated by the stress theory. As illustrated in (7), the vowel-length theory predicts that prominent penults, which contain a phonemically long vowel, should be set to longer notes than all other syllable types. Versions of the vowel-length theory that assign predictable stress to those phonemically long vowels also predict that prominent penults should be assigned to stronger beats than the rest (since cross-linguistically, stressed syllables tend to fall on strong beats, as stated in §1). Dashed lines are again used for the beat-strength predictions, to reflect that they are made by only one version of the vowel-length theory.

The hypothesis generated by the stress theory is that prominent syllables, because they are stressed, should be assigned to stronger beats than non-prominent syllables are, as shown in (8). There is less research on note length in uncontroversial stress languages, with some evidence that German stressed syllables are set to longer notes (Girardi & Plag 2022). The arrows for note length are dashed to show that this prediction is less clear.

4. Duration

4.1. Quantifying duration

Duration was quantified in quarter-note beats: a whole note (or semibreve, a note that lasts one measure) has a duration of 4 beats, a half note (minim) has a duration of 2, a quarter note (crotchet) 1, an eighth note (quaver) 0.5 and a sixteenth note (semiquaver) 0.25.
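Since each spreadsheet column is one sixteenth note, a duration in beats is simply the number of columns a syllable occupies multiplied by 0.25. The Python sketch below states this conversion; the names NOTE_BEATS and beats_from_columns are assumptions made for illustration, and the dotted quarter note is our own added example of a value that falls between the named note types.

```python
SIXTEENTH = 0.25  # one spreadsheet column, in quarter-note beats

# Durations of the note values named in the text, in quarter-note beats.
NOTE_BEATS = {
    "whole": 4.0,       # semibreve
    "half": 2.0,        # minim
    "quarter": 1.0,     # crotchet
    "eighth": 0.5,      # quaver
    "sixteenth": 0.25,  # semiquaver
}


def beats_from_columns(n_columns):
    """Duration in beats of a syllable spanning `n_columns` sixteenth-note columns."""
    return n_columns * SIXTEENTH


# A syllable held for six columns lasts 1.5 beats (a dotted quarter note).
dotted_quarter = beats_from_columns(6)
```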

4.2. Duration results

There were 498 penult-prominent words and 475 ultima-prominent words analysed.

4.2.1. All final two syllables

The bean plots in Figure 3, made in R using the beanplot package (Kampstra 2008), show results for the final two syllables of all usable words. On the left are the penult-prominent words, like ʔábot, and on the right are the ultima-prominent words, like ʔabót. The left side of each pair, coloured orange, represents the smoothed distribution of duration for the penult in each type of word, and the right side, coloured sky blue, represents the distribution of duration for the ultima. The four horizontal line segments show the mean of each distribution.⁹

Figure 3 Duration of final two syllables of all words.

In the penult-prominent words, on the left, the penult tends to have a shorter duration (0.7 beats on average) than the ultima (1.0 beats). This might seem unexpected, but recall that in speech, final syllables tend to be long regardless of stress. In ultima-prominent words, the gap is bigger: penults have an average duration of 0.4 beats and ultimas 1.5. Overall then, ultimas are long, but less so in penult-prominent words.

Another way of looking at this plot is to compare the two orange distributions to each other: penults are slightly longer when prominent (0.7 on the left > 0.4 on the right). And comparing the two sky-blue distributions to each other, ultimas are longer when prominent (1.5 on the right > 1.0 on the left).

The plot in Figure 4 shows that syllable shape has little consistent effect. Whether a penult-prominent word has a closed or an open penult, ultimas tend to be slightly longer than penults. And regardless of the syllable shapes in an ultima-prominent word, the ultima is substantially longer than the penult.

Figure 4 Duration, broken down by syllable shape.

4.2.2. Separating out phrase- and line-final words

The very long durations seen for some ultimas are mostly line-final syllables, reflecting the musical tendency to place a long note at the end of a line. Splitting up the results into line-final versus line-medial words will allow us to see whether prominent ultimas are still long even when not line-final.

Furthermore, given various authors’ observations about how intonational prominence can track the end of the phrase, rather than the end of the word, we should further split line-medial words into phrase-medial and phrase-final, to see whether prominent ultimas are still long even when phrase-medial. (We assume that line-final words are always phrase-final.)

To determine phrase boundaries, we follow Schachter & Otanes’s (1972: 36) description of where optional pauses may occur; we take these optional pause locations to represent phrase boundaries. Schachter & Otanes state that phrase boundaries never occur after a proclitic or before an enclitic, and are more likely to occur at large syntactic breaks than at small ones, such as between a modifier and the word it modifies. They give the following examples:

We operationalised phrasehood by having a look-up list of enclitics, based mainly on Kaufman (2010).¹⁰ If a word was followed by one of these enclitics, it was considered phrase-medial.¹¹ If not, then it was considered phrase-final. In the examples above, our procedure correctly codes all the boldface words, but leaves bagong miscoded as phrase-final. (Monosyllabic words are ignored here because they are excluded from our results.) Following this example and Schachter & Otanes’s description, we decided that two-word modifier–modified pairs connected by the linker suffix -ng, like bagong paaralan, should be hand-coded as belonging to the same phrase; we identified 19 such sequences in the song corpus.
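The coding rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study’s R code: the set ENCLITICS is a small illustrative subset of Tagalog enclitics and not the look-up list actually used, and the function name code_phrase_position and the hand_linked argument (positions of words hand-coded as sharing a phrase with the next word) are hypothetical.

```python
# Illustrative subset of Tagalog enclitics (an assumption for this sketch,
# not the look-up list the study actually used).
ENCLITICS = {"ba", "din", "rin", "na", "pa", "lang", "naman", "ko", "mo", "ka"}


def code_phrase_position(words, hand_linked=frozenset(), enclitics=ENCLITICS):
    """Label each word in a line as phrase-'medial' or phrase-'final'.

    A word is medial if the next word is an enclitic, or if its index is in
    `hand_linked` (for example, a modifier joined to the next word by -ng).
    Every other word, including the last word of the line, is final.
    """
    labels = []
    for i in range(len(words)):
        followed_by_enclitic = (
            i + 1 < len(words) and words[i + 1].lower() in enclitics
        )
        if followed_by_enclitic or i in hand_linked:
            labels.append("medial")
        else:
            labels.append("final")
    return labels


# 'hindi' is followed by the enclitic 'mo', so it is coded as phrase-medial.
opening = code_phrase_position(["Bakit", "hindi", "mo", "maramdaman"])
# 'bagong paaralan' is hand-coded as one phrase via index 0.
linked = code_phrase_position(["bagong", "paaralan"], hand_linked={0})
```

Labels assigned to monosyllables such as mo play no role downstream, since monosyllabic words are excluded from the results.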

We assumed that line breaks occurred where displayed on the lyrics website Musixmatch (https://www.musixmatch.com/). Some of the songs rhyme, providing further support for the line-break locations, but we did not use rhyming or other criteria to make changes from Musixmatch’s line breaks.

Figure 5 illustrates this three-way breakdown. As we move from phrase-medial to phrase-final to line-final, durations get longer, especially for ultimas. But the key result still holds in all three environments: prominent penults are longer than non-prominent penults; and prominent ultimas are longer than non-prominent ultimas.

Figure 5 Note duration broken down by position.

4.3. Duration summary

We have seen that, as predicted by both the vowel-length analysis and the stress analysis, prominent penults are set to longer notes than non-prominent penults. As predicted by only the stress analysis, prominent ultimas are also set to longer notes than non-prominent ultimas. This is true even for phrase-medial position, where, in speech, ultima-prominent words may lose their intonational prominence.

We defer discussion of statistical significance to §6. A regression model, included in the Supplementary Material, finds that ultimas are longer, prominent syllables are longer and prominent ultimas are the longest of all. In §6, we argue that regression may not be conservative enough, and provide an alternative method for assessing significance.

5. Beat strength

5.1. Measuring beat strength: syncopation

OPM has extensive anticipatory syncopation (Temperley 1999; Tan et al. 2019), meaning that notes that, according to various expectations, should count as strong, begin slightly early. Two examples are shown in Figure 6: the syllable ram begins on the ‘and’ (second half) of a beat, the second-weakest position, but musically it behaves as though it began on the following measure’s first beat, which is the strongest position; the syllable man begins on the last sixteenth note of a beat, the weakest position, but musically behaves as though it began on the third beat of the measure, the second-strongest position. Temperley’s (1999) solution is to move each syncopated note forward, so that it counts as beginning later than it really does.

Figure 6 Examples of syncopation in first line of ‘Akin ka na lang’.

To operationalise this, we identified, for each syllable, the strongest beat that it contains: for ram, beat 1 of a measure, and for man beat 3 of a measure. One danger is that if a note is very long, it will always end up counting as strong, because it eventually goes on long enough to include a strong position. Therefore, we only looked for the strongest beat contained in the first 1.25 beats of the note. This is enough for a note that begins on the last beat of the measure (or even one sixteenth-note earlier) to count as beginning on a downbeat, if it lasts long enough.Footnote ¹² Our procedure ‘corrected’ 49% of sixteenth notes, 27% of eighth notes and 11% of quarter notes to a stronger position.

5.2. Beat strength results

As with duration, there were 498 penult-prominent words and 475 ultima-prominent words analysed.

5.2.1. All final two syllables

Results are plotted in Figure 7. The plot is the same as in Figure 3, except that the vertical axis measures beat strength. The strongest value, 5, is for a note that starts on or contains a downbeat (beat 1 of the measure); the next-strongest, 4, is for a note that starts on or contains beat 3 of a measure; 3 is for a note that starts on or contains beat 2 or 4; 2 is for a note that starts on or contains at most the second half of a beat, and 1 is for a note that starts on and contains at most the second or fourth quarter of a beat.
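The five-point scale, together with the strongest-beat-contained procedure of §5.1, can be sketched in a few lines of Python. This is an illustrative reconstruction, not the R code in the Supplementary Material. It assumes 4/4 time and a sixteenth-note grid, treats a note as ‘containing’ a position if it is still sounding there, and counts the position exactly 1.25 beats after the onset as inside the window; the function and parameter names are hypothetical.

```python
def position_strength(pos16: int) -> int:
    """Strength (5 = strongest) of a metrical position in 4/4.

    pos16 counts sixteenth notes from the start of a measure (0 = beat 1).
    The 4/4 meter and the sixteenth-note grid are assumptions of this sketch.
    """
    p = pos16 % 16
    if p == 0:        # beat 1 (downbeat)
        return 5
    if p == 8:        # beat 3
        return 4
    if p % 4 == 0:    # beat 2 or 4
        return 3
    if p % 2 == 0:    # second half of a beat
        return 2
    return 1          # second or fourth quarter of a beat


def corrected_strength(onset16: int, dur16: int, window16: int = 5) -> int:
    """Strongest position a note contains within its first 1.25 beats.

    onset16 and dur16 (>= 1) are in sixteenth notes; window16 = 5 is 1.25 beats.
    A position counts as contained only if the note is still sounding there
    (offset < dur16), an assumption about how 'contains' is to be read.
    """
    last_offset = min(dur16 - 1, window16)
    return max(position_strength(onset16 + k) for k in range(last_offset + 1))
```

Under these assumptions, a hypothetical quarter note starting on the ‘and’ of beat 4 (`corrected_strength(14, 4)`) is scored as a downbeat, whereas a sixteenth-length note in the same place would keep the strength of its own onset.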

Figure 7 Beat strength (correcting for syncopation) of final two syllables of all words.

On the left, we see that in penult-prominent words, average strength is somewhat higher in penults than in ultimas; the modal penult of such words is set to a downbeat, while their modal ultima is set to the second half of a beat (the peak of the distribution is at 2). On the right, for ultima-prominent words, ultimas are on average much stronger (modally downbeats) than penults (modally second half of a beat). Thus, beat strength matches prominence, especially for ultima-prominent words.

Just as we did for duration, we can also divide the results by syllable shape, as shown in Figure 8, with little effect: regardless of syllable shape, penult-prominent words have a stronger penult, and ultima-prominent words have a stronger ultima.

Figure 8 Beat strength broken down by syllable shape.

5.2.2. Separating out phrase- and line-final words

Just as with duration, we separate out phrase- and line-final words. As shown in Figure 9, for phrase-medial words, penult-prominent and ultima-prominent words look symmetrical, but moving to phrase-final and then line-final, penult-prominent words show less and less of a strength difference between penults and ultimas. Nevertheless, in all three environments, prominent syllables are stronger than non-prominent syllables within the same word type; prominent penults are stronger than non-prominent penults; and prominent ultimas are stronger than non-prominent ultimas.

Figure 9 Beat strength broken down by position.

5.3. Beat strength summary

The results again support the stress analysis. As both the vowel-length analysis and the stress analysis predict, prominent penults are set to stronger beats than both non-prominent penults and non-prominent ultimas. But as predicted only by the stress analysis, prominent ultimas are also set to stronger beats than both non-prominent ultimas and non-prominent penults. This holds even phrase-medially, where, in speech, ultima-prominent words may lose their intonational prominence.

A regression model, included in the Supplementary Material, finds that prominent ultimas and penults are set to the strongest beats, and non-prominent penults to the weakest. A more conservative method of assessing statistical significance is given now.

6. Monte Carlo tests for significance

Imagine a language where most words have final stress, and a musical style where most lines of music end in a long note. Even if songwriters make no effort to align stress with long notes, stressed syllables will still tend to be placed on long notes, because of line-final syllables. In other words, the simple null hypothesis that the regression models mentioned above use, ‘stressed and unstressed syllables are set to notes of equal duration’, won’t do, because the data are inherently biased to falsify that null hypothesis. We want to construct a null hypothesis that already includes such biases. One way to do this is to randomly re-combine text and music. In some studies, this scrambled, null-hypothesis corpus is constructed by drawing text from prose (the ‘Russian method’ for poetry; Ryan 2011; see references in Hayes 2013), but it is not clear what would be suitable prose to use in the case of OPM. Instead, we follow Gunkel & Ryan (2011) in scrambling the lines of lyrics within our corpus of songs.

We first compute several measures for the real data, such as the mean duration of prominent penults’ notes minus the mean duration of non-prominent penults’ notes (all the measures are listed in (10)). Then, we convert each musical line into a pattern representing the number of syllables, and the locations where a disyllabic or longer word ends. For example, Bákit hindíʔ mo maɾamdamán ‘why can’t you feel it’ has the structure σ σ | σ σ | σ σ σ σ σ |, with | representing the ends of words, not including monosyllabic mo ‘you’. These word boundaries are important because, as we saw in §4.2, word-final syllables tend to be given a long duration regardless of whether the word is penult- or ultima-prominent. Since we are interested not in that effect, but rather in the difference between penult- and ultima-prominent words, we want our null hypothesis to include word-final lengthening.

Our script randomly selects a line from the corpus that has the same structure, such as Sáma-sáma ɾiŋ maɾaɾatíŋ ‘will also arrive together’. (We generally coded words with two-syllable reduplication, like sáma-sáma ‘together’, as two separate words.) In this example, the selected line is from a different song (‘Tagumpay nating lahat’, written by Gary Granada and performed by Lea Salonga). The script combines the old line’s lyrics with the new line’s musical notes, as shown in Figure 10.

Figure 10 Lyrics randomly assigned to a new line of music.

Figure 11 Monte Carlo results.

We recompute the measures of interest on the new, scrambled corpus, and repeat the scrambling procedure 1,000 times, to obtain the distribution of values that we would expect to see under the null hypothesis, following Kessler (2001); Martin (2007, 2011); Hayes et al. (2009). Each plot in Figure 11 is for one of the measures in (10). The grey bars are a histogram of the measure’s values in the shuffled corpora, and the solid blue line is the value in the actual corpus. The estimated p value for each measure is the proportion of shuffled corpora that lie to the right of the solid blue line – how often we’d expect to see such an extreme result by chance. For many measures, the histogram does not overlap with the solid blue line at all, meaning that the estimated p value is less than 0.001. The measures and p values are given in (10), and depicted graphically in (11).Footnote ¹³
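The scrambling procedure can be sketched as follows. This is a minimal Python illustration, not the annotated R code in the Supplementary Material. The data layout (a dictionary per line with `structure`, `prominent` and `durations`), all function names, and the single simplified measure (mean duration of prominent minus non-prominent syllables) are assumptions made for the example. Ties with the observed value are counted as at least as extreme, which is a conservative choice rather than something stated in the text.

```python
import random
from collections import defaultdict
from statistics import mean


def line_structure(word_lengths):
    """Pattern such as 'σσ|σσ|σσσσσ|': '|' closes each word of 2+ syllables."""
    parts = []
    for n in word_lengths:
        parts.append("σ" * n)
        if n >= 2:            # monosyllables contribute no word boundary
            parts.append("|")
    return "".join(parts)


def measure(lines):
    """Simplified stand-in for the measures in (10):
    mean note duration of prominent minus non-prominent syllables."""
    prom, non = [], []
    for ln in lines:
        for dur, is_prom in zip(ln["durations"], ln["prominent"]):
            (prom if is_prom else non).append(dur)
    return mean(prom) - mean(non)


def scramble(lines, rng):
    """Give each line the notes of a randomly chosen line with the same structure."""
    pool = defaultdict(list)
    for ln in lines:
        pool[ln["structure"]].append(ln["durations"])
    return [{**ln, "durations": rng.choice(pool[ln["structure"]])} for ln in lines]


def monte_carlo_p(lines, n_iter=1000, seed=0):
    """Observed measure and the share of scrambled corpora at least as extreme."""
    rng = random.Random(seed)
    observed = measure(lines)
    hits = sum(measure(scramble(lines, rng)) >= observed for _ in range(n_iter))
    return observed, hits / n_iter
```

For instance, `line_structure([2, 2, 1, 4])` yields the pattern given above for Bákit hindíʔ mo maɾamdamán. Because lyrics are only ever paired with music of identical structure, biases such as word-final lengthening are present in every scrambled corpus and so form part of the null hypothesis.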

Measures (10a)–(10d) and (10g)–(10j), shown in bold, are those predicted by the stress analysis to be positive, or in any case greater than expected by chance; the other measures are included for completeness, even if no analysis predicts them to be non-zero. Since there are 12 measures being taken, if we require $p < 0.05$ to reject the null hypothesis, a Bonferroni correction (Dunn 1961) adjusts that threshold to $p < 0.004$. The measures meeting that significance criterion are marked with * in (11).
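As a quick check of the arithmetic behind that threshold (writing $\alpha$ for the significance level, a symbol not otherwise used here), dividing the conventional level by the 12 measures gives

$$\alpha_{\text{Bonferroni}} = \frac{0.05}{12} \approx 0.0042,$$

which corresponds, after rounding, to the threshold of $p < 0.004$.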

Except for (10d), then, the predictions of the stress analysis given in §3.4 are supported. The vowel-length analysis and the stress analysis overlap in correctly predicting (10a), (10b), (10g) and (10h) to be positive. The vowel-length analysis alone predicts (10e) and (10k) to be positive, which was not strongly supported. We conclude that our results support the stress analysis over the vowel-length analysis.

For those who prefer a regression analysis, despite its drawbacks, the html file in the Supplementary Material includes regression models that reach similar conclusions.

7. Phrase-final enclitics

Recall from §2.2.2 that some scholars hold that apparent word-final prominence is really phrase-final prominence, because when an ultima-prominent word is put into a phrase, its prominence shifts. For example, ultima-prominent damít ‘clothes’ may appear to have final stress in citation form, but when it is combined with the enclitic ko ‘my’ to form damit ko ‘my clothes’, this ‘stress’ now falls on ko. By contrast, the pa of penult-prominent sapátos ‘shoes’ remains (or at least can remain) ‘stressed’ in sapatos ko.

To test this idea, we examine enclitics like ko. If apparent word-final prominence is really phrase-final prominence, ko should be set to longer and stronger notes when it follows ultima-prominent words (because prominence should shift onto it) than when it follows penult-prominent words.

We extracted all monosyllabic enclitics from the corpus, and counted them as phrase-final if they were followed by a content word. This yielded 143 total tokens of phrase-final monosyllabic enclitics, of which 101 were after penult-prominent words and 42 after ultima-prominent words.

Looking first at note duration, shown in Figure 12, we see that monosyllabic enclitics, whether line-medial or line-final, are not set to longer notes when they follow an ultima-prominent word (yellow) than when they follow a penult-prominent word (green).

Figure 12 Phrase-final clitic duration.

As for beat strength, shown in Figure 13, monosyllabic enclitics are set to a weaker beat when they follow an ultima-prominent word than when they follow a penult-prominent word, the opposite of what phrasal prominence-shifting predicts. This is probably explained by the fact that weak and strong beats tend to alternate in music. When a prominent ultima, like the mit in damít, gets set to a strong beat, a following syllable will tend to be set to a weak beat; whereas when a prominent penult, like the pa in sapátos, is set to a strong beat, an enclitic two syllables later can also be set to a strong beat.

Figure 13 Phrase-final clitic beat strength.

Using the same Monte Carlo method as in §6 (plots in Supplementary Material), we found that enclitics were not set to significantly longer notes after ultima-prominent words than after penult-prominent ($p = 0.55$), nor to significantly stronger beats ($p = 0.82$).

The text-setting data thus contradict the idea that ultima prominence is illusory. Even if intonational pitch accent often moves onto an enclitic in speech, songwriters keep prominence on the word ultima.

8. Pre-tonic syllables

8.1. Introduction

Filipino morphology can create a long vowel earlier in the word – before the penult, that is – though this is often variable or optional. The most common morphological source of long vowels is verb aspect reduplication. As illustrated in (12), aspect reduplication produces a copied, prefixed syllable that typically contains a long vowel. (The infixes -um- and -in- represent voice and aspect.) All but ten of the pre-tonic long vowels in the song corpus come from aspect reduplication.

French (1988) and French (1991) give partly conflicting descriptions of secondary stress, and French (1991) calls for acoustic analysis of secondary stress – which as far as we know has still not been carried out – to clarify the picture. We will focus here on French’s claims about the types of words that are well attested in the song corpus. French’s two accounts agree that aspect reduplicants, like those shown in (12), receive secondary stress; for example, French would transcribe ‘will write’ as [ˌsuˈsulat]. French (1988) claims that closed syllables in prefixes generally attract secondary stress, as in [màg-pa-ka-ʔáɾal] ‘study intensely’ (and does not address closed, pre-tonic root syllables, as in the penult of [ta:-takbó]). French (1988) further claims that a closed prefix syllable will not receive secondary stress if a following prefix syllable itself has secondary stress (the context found in aspect reduplication), as in [mag-ˌpa-pa-ka-ˈʔaɾal] ‘will study intensely’, where the prefix /pa/ has undergone aspect reduplication. The two works make conflicting claims about default locations of secondary stress when prefixes are all open syllables with no aspect reduplication.

While acknowledging that much remains to be determined about Filipino secondary stress, we extract two hypotheses from these descriptions. First, pre-tonic syllables that are closed or have long vowels, as in words like [sa:-sabí-hin] ‘will be said’ and [nag-simuláʔ] ‘began’, should tend to be treated as having secondary stress, and thus be set to longer notes and stronger beats than pre-tonic syllables that are open and have a short vowel, as in [ka-ʔibíg-an] ‘friend’. Second, looking just at antepenults, stress clash avoidance should weaken or eliminate this effect when the next syllable is a prominent penult, so that the antepenult in a word like [pag-ʔíbig] ‘love’ or [ma-pa-páwiʔ] ‘will come to an end’ would not be set to particularly long notes or strong beats, despite being closed or having a long vowel, because the following syllable is prominent.
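The two hypotheses amount to a simple decision rule, restated below as a Python sketch. The function and argument names are hypothetical, and the rule is deliberately categorical: it returns a yes/no prediction, whereas the second hypothesis only says that clash avoidance should weaken or eliminate the effect.

```python
def expect_secondary_stress(closed: bool, long_vowel: bool,
                            is_antepenult: bool = False,
                            next_is_prominent: bool = False) -> bool:
    """Predict whether a pre-tonic syllable should be set as musically prominent.

    Hypothesis 1: closed or long-vowelled pre-tonic syllables attract
    secondary stress.
    Hypothesis 2: in antepenults, the effect is cancelled when the next
    syllable is a prominent penult (stress clash). Treating this as a
    full cancellation is a simplification made for this sketch.
    """
    heavy = closed or long_vowel
    clash = is_antepenult and next_is_prominent
    return heavy and not clash


# [nag-simuláʔ]: closed fourth-to-last syllable, so prominence is expected
print(expect_secondary_stress(closed=True, long_vowel=False))
# [pag-ʔíbig]: closed antepenult before a prominent penult, so it is not
print(expect_secondary_stress(closed=True, long_vowel=False,
                              is_antepenult=True, next_is_prominent=True))
```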

For this part of the analysis, we used words of three or more syllables. Table 3 shows how many tokens were found in each position. Because there were so few observations of fifth-, sixth- and seventh-to-last syllables, they are not included in the analysis.

Table 3 Number of open-syllable observations for analysing pre-tonic length.

8.2. Pre-tonic duration

The bean plots in Figures 14 and 15 show the distribution of note duration for pre-tonic syllables that are short-vowelled and open, short-vowelled and closed, or long-vowelled and open. (There were no long closed tokens.) In fourth-to-last syllables (Figure 14), we see that songwriters have assigned the short open syllables to the shortest note durations. In contrast to the duration data for penults and ultimas above in Figure 4, syllable shape does matter here: the closed syllables pattern with the long-vowel syllables – although, not surprisingly given the small amount of data, the differences are not significant.Footnote ¹⁴ This is consistent with French’s (1991) contention that both closed syllables and long-vowelled syllables (aspect reduplicants) attract secondary stress, and goes against the otherwise appealing notion that the reason a closed penult cannot be prominent is that it can’t have a long vowel: even though these closed pre-tonic syllables have short vowels, they appear to be receiving prominence.Footnote ¹⁵

Figure 14 Note duration in fourth-to-last syllables.

Figure 15 Note duration in third-to-last syllables.

For third-to-last syllables, in Figure 15, the data are further divided according to whether the following syllable is prominent (penult-prominent word, as in liːlípas ‘will elapse’) or not (ultima-prominent, as in puːputíʔ ‘will turn white’). The only syllables to receive longer note duration are closed and long-vowelled syllables that are not followed by a prominent penult, and thus not subject to stress clash. The difference between, on the one hand, long-vowelled syllables not subject to stress clash and, on the other, short-vowelled syllables was significant ($p < 0.001$).

8.3. Pre-tonic strength

The results for beat strength are less perfectly in line with our secondary-stress predictions, but still broadly support them. In fourth-to-last position, long-vowelled syllables – but not closed syllables – trend towards being set to stronger beats, as shown in Figure 16.

Figure 16 Beat strength in fourth-to-last syllables.

In third-to-last syllables (Figure 17), the syllables set to the strongest beats are those that we predict to have secondary stress: closed and long-vowelled syllables in ultima-prominent words (no stress clash). There also appears to be a difference within the short open syllables between those that are followed by a prominent syllable and those that are not. It could be that short open syllables prefer to bear secondary stress if followed by an unstressed syllable (Blake 1925; Avery & Lamontagne 1995). There is also a plausible musical explanation for this: unlike note length, beat strength alternates in the underlying musical structure. Because prominent penults and ultimas tend to be assigned to strong beats, there will thus be a tendency for an antepenult preceding a prominent penult to be weak, and for an antepenult preceding a prominent ultima to be strong. In the case of short-vowelled open syllables, this musical tendency creates a small difference; in the closed and long-vowelled syllables the musical tendency combines with stress clash avoidance to create a bigger difference. The difference between, on the one hand, long-vowelled syllables not subject to stress clash and, on the other, short-vowelled syllables was significant ($p < 0.001$).

Figure 17 Beat strength in third-to-last syllables.

8.4. Pre-tonic syllables summary

We have seen that closed or long-vowelled pre-tonic syllables are set to longer notes and stronger beats, as long as the following syllable is not the tonic (as in a word like liːlípas ‘will elapse (time)’). This supports French’s (1991) contention that pre-tonic closed syllables and long vowels attract secondary stress, subject to some stress clash avoidance. We saw earlier that open versus closed syllable shape in penults and ultimas, which does not affect stress (except that closed penults may not bear stress), was not important for note length and beat strength. Thus, syllable shape seems to matter for text-setting only where it has been claimed to matter for stress.

9. OPM text-setting does not track phonetics

While text-setting partly tracks the phonetics of duration and loudness (see §2.2.1), there are some mismatches. In speech, the last two syllables of penult-prominent words have two long vowels, and those of ultima-prominent words have a short and then a long vowel; music was a rough match to this (excluding line-final words), as summarised in Table 4, except that prominent ultimas were set to longer notes than either non-prominent ultimas or prominent penults, as predicted by the stress analysis. In speech, penult-prominent words have a loud penult and quiet ultima, which was reflected in beat strength, but prominent ultimas, which in speech have similar loudness to their non-prominent penults, were set to stronger beats, again as predicted by the stress analysis.

Table 4 Summary of phonetic and musical properties of last two syllables of words.

We also looked at vowel height, on the assumption that the low vowel /a/ should be longer and louder in speech than the high vowels /i, u/ (though the acoustic results on this in Gonzalez 1970 are not straightforward). The plots in Figure 18 show the note durations (left plot) and beat strengths (right plot) of the last two syllables of penult-prominent and ultima-prominent words. Rather than pairing the distributions of penult and ultima for each word type, the four syllables are all separated out, and each pair of distributions is for a low vowel (left) and a high vowel (right). Within each pair, the left and right distributions are almost identical, with no indication that songwriters assign low vowels to longer notes or stronger beats.

Figure 18 Note duration and beat strength by vowel height.

Although we found that OPM text-setting does not track phonetic detail, there is one area we found where it does track surface rather than underlying phonology. When a [ʔ]-final word is phrase-medial, the [ʔ] usually deletes, and the preceding vowel lengthens in compensation, as in, using Schachter & Otanes’s (1972: 16) length-based notation, [luːtoʔ] ‘cooked’ vs. [luːtuː ba] ‘cooked?’ and [hindiʔ] ‘no’ vs. [hindiː ba] ‘no?’. Within prominent ultimas followed by a consonant-initial enclitic, we found that underlyingly glottal-final ultimas, like the /diʔ/ in ‘no’, are set to somewhat longer notes and stronger beats than other prominent ultimas are, presumably reflecting their surface lengthening. (Plots are provided in the Supplementary Material.)

10. Conclusion

This study has found that prominent penults and prominent ultimas are both set, in a corpus of OPM songs, to longer notes and stronger beats, both phrase-medially and phrase-finally – and that these text-setting tendencies are not simple reflections of duration and loudness in speech. Text-setting seems to reflect stress at the word level, and not merely phrasal prominence: when an ultima-prominent word is followed by a phrase-final enclitic (e.g., damít ko), many authors have observed that intonational prominence tends to shift onto the enclitic, but as we saw in §7, it is the content word’s ultima (mít) that is musically prominent, not the enclitic (ko). Furthermore, while syllable shape (open vs. closed) did not affect text-setting of penults and ultimas, it did affect text-setting of pre-tonic syllables, which is where French (1991) has claimed that syllable shape affects stress. All this is evidence in favour of analysing Filipino as having stress, even though the stress is realised differently in different positions in speech, with stressed penults having greater duration than unstressed, and stressed ultimas having greater loudness than unstressed (in addition to possible intonational differences). As we discussed in §2.2.2, standard phonological data were insufficient to decide between the length-driven and the stress-driven analyses. We believe that the musical data here provide the first straightforward evidence in favour of one analysis, the stress-driven one.

If the basic phonological data are not decisive for phonologists, how is it that songwriters have converged on treating Filipino as having stress? It is possible that there’s something in the basic data that no phonologists have noticed, but which is decisive for children learning the language. Or cases like Filipino could be telling us that, faced with ambiguous data of the Filipino type, learners are biased to acquire a lexicon and grammar with stress.

Our findings echo those of Domene Moreno & Kabak (2022) for Turkish songs. In Turkish, as in Filipino, it has been proposed that words with non-final prominence bear true stress, while words with final prominence bear only phrase-final accent. Domene Moreno & Kabak measure beat strength and melodic peakhood in a song corpus. They find that, in Western European-style children’s songs, linguistically prominent syllables receive more of both types of musical prominence, with no difference between penultimate prominence and final prominence. Like us, they take this as evidence for word-level stress in Turkish.

Domene Moreno & Kabak found that songs they analysed in the Makam style did not give musical prominence to either type of Turkish prominent syllable. This raises the question of whether Western European-style Turkish children’s songs and OPM songs are both showing influence from English-language pop music’s tendency to align musical prominence and stress. This is possible, but does not explain away their or our results, because songwriters influenced by English songs would still have to decide what counts, in their language, as the equivalent to English’s stress. And in these Turkish and Filipino corpora, the songwriters have decided to treat both final and non-final prominent syllables as needing to be musically prominent.

The one interpretation of our data that could be consistent with an underlying-length analysis is effectively an empty one, where, before any phonology applies, an underlying length contrast gets converted into surface stress for all content words, both in words that have an underlying long vowel (/ʔa:bot/ → [ˈʔa:.bot]) and in words that do not, which receive final stress (/ʔabot/ → [ʔa.ˈbot]). Without direct access to speakers’ underlying representations, the availability of a deeper level of analysis with length only cannot be refuted by any data. More broadly, data alone cannot rule out an analysis of any phenomenon where a feature that the phonology appears to be sensitive to is actually the (un-neutralised) reflex of a different underlying feature, though there could be cross-linguistic or theoretical justifications for such an analysis. We do not, however, find any support for underlying length in the text-setting data, which appear to be sensitive only to stress.

We end with a methodological note on the usefulness of musical data for low- and medium-resource languages. Filipino could be considered a medium-resource language. Unlike for most of the world’s languages, there are corpora and engineering tools, either available or in development: see Jakubíček et al. (2013), Go & Nocon (2017), Go et al. (2017), Lazaro et al. (2009), Ang et al. (2014), and many others. But the extent of these resources is small compared to what exists for English, Korean, French and other languages with well-funded public and private research infrastructure.Footnote ¹⁶ Our song corpus consists of 1,662 words in total. A spoken corpus of that size would be too small for studying stress correlates, with too many sources of noise (speech rate, inherent duration and loudness of vowels, etc.). But in songs, we have access to songwriters’ categorical decisions about duration and strength, which makes the data clean enough for clear patterns to emerge. We originally coded and analysed just nine songs, and the main patterns were already there; adding the remaining ten songs made us more confident in the results, but didn’t change them. A small corpus of songs, even a number as small as what a research team could transcribe themselves from listening to recordings, can thus be useful for gaining insight into the phonology of a lower-resource language, as long as the object of study occurs with sufficient density in songs. In our case, most of the syllables in a song provided relevant data, so the density of observations per song was high.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S0952675725000041.

Data availability statement

The annotated R code that generated our figures and results, along with some additional statistical analysis, is available in the Supplementary Material.

Acknowledgements

For their feedback on various parts of this project, we thank participants in the UCLA phonology seminar, the UC Berkeley Phorum, the University of Ottawa linguistics colloquium, the Linguistic Society of the Philippines RIPPLE series (Researches, Insights, and Perspectives on the Philippine Linguistics Enterprise), the Manchester Phonology Meeting and the Linguistic Society of America annual meeting. We especially thank Peter Avery, François Dell, Bruce Hayes, Larry Hyman, Dan Kaufman, Stephanie Reed, Hannah Sande and the editors and reviewers of Phonology for their comments.

Funding statement

This research was supported by a grant from the UCLA Academic Senate’s Committee on Research.

Competing interests

The authors declare no competing interests.

A. Songs

Table A1 lists the songs used in the corpus. Because the sheet music we used is made by listeners, where possible we list the performer whose performance the sheet music is based on, as well as the composer, when known.

Table A1 Songs in the corpus.

Footnotes

1 We have found some (ethno)musicological studies of Austronesian song genres, such as Goldsworthy (1995, Fijian), Moyle (2007, Takuu) and Yampolsky (2022, Southern Tetun), that include basic text-setting information, such as number of syllables per line and whether vowel elisions and vowel length changes are used to make lyrics fit the meter. Aloufi (2021) sketches a preliminary analysis of line length and rhythm in Pohnpeian songs using Optimality Theory (Prince & Smolensky 1993, 2004).

2 Hwang et al. (2019) conducted a perception experiment of the minimal pair bábad ‘to soak’ and babád ‘immersed’, manipulating pitch, duration and intensity. If the penult was long, participants perceived bábad, regardless of ultima duration. If the penult was short, participants perceived babád, except that low pitch on the ultima caused slightly fewer babád responses. They conclude that listeners use primarily duration and secondarily pitch.

3 The two words are etymologically related.

4 A special case: when an ultima-prominent word ending in /ʔ/ is followed by a consonant-initial enclitic, the /ʔ/ deletes and the preceding vowel lengthens, according to Schachter & Otanes: /malíʔ na/ → [mali: na] ‘wrong now’. In these cases, the pitch accent can either stay on [li:] or move to [na].

5 Many of these authors treat word-initial [ʔ] as inserted, and therefore would treat the words in Table 2 as underlyingly /a/-initial.

6 There is a fourth phenomenon that will not be discussed in detail: Avery & Lamontagne (1995) appeal to stress, specifically avoidance of stress clash and stress lapse, in their analysis of infix placement in loans that begin with two consonants (on which see also Zuraw 2007). It is difficult to imagine a length-based analysis of their data.

7 This version of the stress analysis assigns only one mora to any vowel, even those that are phonetically lengthened, such as the /a/ in the first syllable of (3b).

8 We mark only a word’s main prominent syllable with 1. §8 tests the idea of predictable secondary stress earlier in the word.

9 Because some distributions here are skewed, we considered placing these lines at the medians. But because our duration and beat strength are coarse-grained, the medians ended up being misleading in two ways. First, distributions that are rather different sometimes had the same median. For example, the plot below shows that the mean durations for penults and ultimas of penult-prominent words are 0.7 beats and 1.0 beats. Their medians, however, are the same (0.5 beats); the longer tail for the ultimas is not enough to pull the median up to the next possible value, 0.75 beats. Second, distributions that are similar sometimes had rather different medians. The subtle difference between a distribution where almost half of the tokens are downbeats, and one where just over half of the tokens are downbeats, produces a correspondingly subtle difference in means, but a large difference in medians.

10 Our list of enclitics comprises ako, ka, siya, kami, kata, kita, tayo, kayo, sila, ko, mo, niya, ta, ninyo, nila, kita, na, pa, nga, talaga, po, ho, pala, yata, sana, nawa, ba, baga, namin, natin, din, rin, man, naman, lang, lamang, daw and raw, plus the combination of each of these with -ng, ’t and ’y. Because na has non-enclitic homophones, we hand-coded all 52 instances of na in the song corpus, marking 16 of them as non-enclitic.

11 Excluding function words that should not be phrase-final, namely the plural marker mga, the preposition-like nasa and linker-suffix-bearing numerals, demonstrative pronouns and personal pronouns, which in our corpus were aking, akong, ating, inyong, isang, itong, iyong, kayong, kitang, nating, nilang, nitong, siyang and tayong. These words were always coded as phrase-medial, except for one token of kayong ‘you’ that was line-final.

12 Tan et al. (2019) implement a different procedure to correct for anticipatory syncopation: a syllable on a weak nth note is considered as though it actually began on the next nth note, as long as no other syllables begin on or before that next nth note. (Working with English-language music, Tan et al. apply this correction only to stressed syllables, but we ignored stress so as not to bias the results to make stressed syllables appear stronger.) For example, the ram of maramdaman begins on the weak eighth note of a beat (i.e., the second of the two eighth notes in that beat), so it would be considered as if it began on the next eighth note, which is the downbeat of the next measure. As with our strongest-beat-contained procedure, ram is thus treated as though it really began on a downbeat. Unlike in our procedure, the di of hindi would also be promoted, from the second half of a beat to the beginning of the measure’s third beat. This is because our procedure requires the note to continue into a position to be promoted to it, and Tan et al.’s does not. Only 1.3% of syllables had different corrected strengths under the two procedures, so we did not pursue a reanalysis of the data under Tan et al.’s procedure.
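
The difference between the two corrections can be sketched as follows. This is a simplified illustration, not the code used in the study or by Tan et al.: it handles only the eighth-note level of a 4/4 measure, the strength values in `STRENGTH` are hypothetical, and the onsets, durations and following onsets given for ram and di are assumed for the sake of the example.

```python
# Hypothetical strengths for the eight eighth-note slots of a 4/4 measure:
# slot 0 = downbeat, slot 4 = third beat, slots 2 and 6 = beats two and four,
# odd slots = the weak second eighth note of each beat.
STRENGTH = [4, 1, 2, 1, 3, 1, 2, 1]


def strength_at(slot):
    """Strength of an eighth-note slot, counting slots across measures."""
    return STRENGTH[slot % 8]


def strongest_contained(onset, duration):
    """Strongest slot that the note actually sounds through."""
    return max(strength_at(s) for s in range(onset, onset + duration))


def tan_style(onset, next_onset):
    """Treat a weak-eighth onset as if it began on the next eighth note,
    provided no other syllable begins on or before that next eighth."""
    is_weak = onset % 2 == 1
    if is_weak and (next_onset is None or next_onset > onset + 1):
        return strength_at(onset + 1)
    return strength_at(onset)


# Assumed toy settings (slots counted in eighth notes from the bar line).
# 'ram': starts on the last eighth of a measure and is held over the bar line.
ram_ours = strongest_contained(onset=7, duration=3)
ram_tan = tan_style(onset=7, next_onset=10)

# 'di': starts on the second half of beat two and lasts a single eighth,
# so the note ends before the third beat begins.
di_ours = strongest_contained(onset=3, duration=1)
di_tan = tan_style(onset=3, next_onset=6)

print(ram_ours, ram_tan, di_ours, di_tan)
```

Under these assumptions ram receives downbeat strength under both procedures, whereas di is promoted to third-beat strength only under the Tan-style shift, because the note itself never reaches that position.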

13 Measures (10a)–(10d) are restricted to non-line-final words.

14 See Supplementary Material for Monte Carlo plots. For fourth-to-last syllables, we compared closed to short-open and long to short-open, in both note length and beat strength. Long vs. short-open note duration was the most promising, with $p = 0.031$, but this does not survive any correction for multiple comparisons. The Supplementary Material also contains regression models of pre-tonic note length and beat strength, which largely support the trends seen in the plots: in fourth-to-last syllables, long-vowel syllables are set to longer notes and stronger beats, closed syllables are set to longer notes (but not stronger beats); in third-to-last syllables, closed syllables are set to longer notes, especially if the following syllable is not stressed, and long vowels and lack of stress clash both predict stronger beats.
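
As a rough check on the multiple-comparisons point, the sketch below compares the reported p-value with two standard per-test thresholds. It assumes a family of four comparisons (two syllable-type contrasts by two measures, as listed above) and a conventional alpha of 0.05; neither figure is stated in the text, and the variable names are illustrative.

```python
# Assumed values: alpha and the family size m are not given in the text.
alpha = 0.05
m = 4            # closed vs. short-open, long vs. short-open, x 2 measures
p_best = 0.031   # smallest p-value, as reported for long vs. short-open duration

# Bonferroni: each test must reach alpha / m.
bonferroni_threshold = alpha / m

# Sidak: slightly less conservative, 1 - (1 - alpha)^(1/m).
sidak_threshold = 1 - (1 - alpha) ** (1 / m)

survives_bonferroni = p_best <= bonferroni_threshold
survives_sidak = p_best <= sidak_threshold

print(bonferroni_threshold, round(sidak_threshold, 4))
print(survives_bonferroni, survives_sidak)
```

With four comparisons the Bonferroni threshold is 0.0125 and the Šidák threshold is only marginally higher, so a p-value of 0.031 falls short of both. A step-down procedure such as Holm's would give the same verdict, since it tests the smallest p-value against the Bonferroni threshold.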

15 The reason closed penults don’t attract secondary stress in ultima-prominent words is, under French’s analysis, that Tagalog avoids stressing two syllables in a row (stress clash) where possible.

16 The multi-language DALI corpus of pop song recordings (Meseguer-Brocal et al. 2018) has three songs coded as Filipino, but on inspection two of those were Japanese songs that had been mislabelled, so there was only one Filipino song in the corpus, not enough to analyse. The multi-language Vocadito corpus (Bittner et al. 2021) has about 300 syllables of folk-song and nursery-rhyme Filipino songs, but provides only duration and requires considerable hand-correction if duration at the syllable level is desired. The Smule data sets of amateur karaoke performances (Smule, Inc. 2018a, b) no doubt include valuable data, but the raw data would require considerable processing to be usable for our purposes.

References

Abisado, Mideth B., Yongson, Mardyon B. & Los Trinos, Ma Ian P. (2021). Towards the development of music mood classification of original Pilipino music (OPM) songs based on audio and lyrics keyword. In ICSET 2021: 5th International Conference on E-Society, E-Education and E-Technology. New York: Association for Computing Machinery, 87–90. doi:10.1145/3485768.3485786

Alegado, Joni Rose, Labaya, Abigail M., Lirio, Pia S. & Rivera, Ruthier S. (2021). A linguistic analysis of Tagalog-English code switching in OPM love songs. Ms., Polytechnic University of the Philippines.

Aloufi, Aliaa (2021). Ponapean song meter in Optimality Theory. Advances in Language and Literary Studies 12, 46–50. doi:10.7575/aiac.alls.v.12n.3.p.46

Anacin, Carljohnson, Baker, David & Bennett, Andy (2021). Mimicking the mimics: problematizing cover performance of Filipino local music on social media. Media, Culture & Society 43, 1414–1430. doi:10.1177/01634437211029888

Anderson, Quiliano Niñeza (2015). Kundiman love songs from the Philippines: their development from folksong to art song and an examination of representative repertoire. PhD dissertation, University of Iowa. doi:10.17077/etd.hivytk5h

Anderson, Victoria B. (2006). Lexical stress without postlexical head-marking: evidence from Tagalog. JASA 120, 3092–3092. doi:10.1121/1.4787497

Ang, Federico, Guevara, Rowena Cristina, Miyanaga, Yoshikazu, Cajote, Rhandley, Ilao, Joel, Bayona, Michael Gringo Angelo & Laguna, Ann Franchesca (2014). Open domain continuous Filipino speech recognition: challenges and baseline experiments. IEICE Transactions on Information and Systems 97, 2443–2452. doi:10.1587/transinf.2013EDP7442

Arceo-Dumlao, Tina (2017). Himig at titik: a tribute to OPM songwriters. Makati City: Inquirer Books.

Avery, Peter & Lamontagne, Greg (1995). Infixation <and metathesis> in Tagalog. Paper presented at the Annual Meeting of the Canadian Linguistic Association.

Bareng, Jeanmarc M. (2019). Code switching in original Pilipino music (OPM). Ascendens Asia Journal of Multidisciplinary Research Abstracts 3, 685.

Bellik, Jennifer Ann (2019). Vowel intrusion in Turkish onset clusters. PhD dissertation, University of California, Santa Cruz.

Bittner, Rachel M., Pasalo, Katherine, Bosch, Juan José, Meseguer-Brocal, Gabriel & Rubinstein, David (2021). Vocadito: a dataset of solo vocals with $f_0$, note, and lyric annotations. In Proceedings of the 22nd International Society for Music Information Retrieval Conference, 3 pp.

Blake, Frank Ringgold (1925). A grammar of the Tagalog language. New Haven, CT: American Oriental Society.

Bloomfield, Leonard (1917). Tagalog texts with grammatical analysis. Urbana, IL: University of Illinois.

Boer, Diana, Fischer, Ronald, Atilano, Ma Luisa Gonzalez, Hernández, Jimena Garay, Garcia, Luz Irene Moreno, Mendoza, Socorro, Gouveia, Valdiney V., Lam, Jason & Lo, Eva (2013). Music, identity, and musical ethnocentrism of young people in six Asian, Latin American, and Western cultures. Journal of Applied Social Psychology 43, 2360–2376. doi:10.1111/jasp.12185

Boersma, Paul & Weenink, David (2017). Praat: doing phonetics by computer. Version 6.0.29. Available at http://www.praat.org/.

Cayabyab, Krina (2021). The (de-) and (re-) mythification of OPM: decentring a popular music sign. In Johan, Adil & Santaella, Mayco A. (eds.) Made in Nusantara: studies in popular music. New York: Routledge, 45–54. doi:10.4324/9780367855529-5

Constantino, Ernesto (1965). The sentence patterns of twenty-six Philippine languages. Lingua 15, 71–124. doi:10.1016/0024-3841(65)90009-4

Crosswhite, Katherine (1998). Segmental vs. prosodic correspondence in Chamorro. Phonology 15, 281–316. doi:10.1017/S0952675799003619

Deja, Jordan Aiko, Blanquera, Kim, Carabeo, Carlo Eliczar & Copiaco, Jo Rupert (2016). Genre classification of OPM songs through the use of musical features. In Nishizaki, Shin-ya, Numao, Masayuki, Caro, Jaime D.L. & Suarez, Merlin Teodosia C. (eds.) Theory and practice of computation: proceedings of Workshop on Computation: Theory and Practice WCTP2014. Singapore: World Scientific, 77–88.

Dell, François & Halle, John (2009). Comparing musical textsetting in French and in English songs. In Arleo, Andy & Aroui, Jean-Louis (eds.) Towards a typology of poetic forms. Amsterdam: Benjamins, 63–78. doi:10.1075/lfab.2.03del

Demeterio, Feorillo A. III & Dreisbach, Jeconiah Louis (2017). Disentangling the Rubrico and Dolalas hypotheses on the Davao Filipino language. Recoletos Multidisciplinary Research Journal 5, 1–15.

Domene Moreno, Christina & Kabak, Barış (2022). Prominence alignment in English and Turkish songs: implications for word prosodic typology. In Scharinger & Wiese (2022), 223–258. doi:10.1515/9783110770186-009

Domingo, Luis Zuriel P. (2021). Korean pop music a threat to contemporary Filipino identity? Globalization, nation, and interrogation in Philippine culture and identity. Asia Review 11, 247–265. doi:10.24987/SNUACAR.2021.8.11.2.247

Dunn, Olive Jean (1961). Multiple comparisons among means. Journal of the American Statistical Association 56, 52–64. doi:10.1080/01621459.1961.10482090

French, Koleen Matsuda (1988). Insights into Tagalog reduplication, infixation and stress from nonlinear phonology. Dallas, TX: Summer Institute of Linguistics and University of Texas at Arlington.

French, Koleen Matsuda (1991). Secondary stress in Tagalog. Oceanic Linguistics 30, 157–178. doi:10.2307/3623086

Gabrillo, James (2018). Rak en rol: the influence of psychedelic culture in Philippine music. Rock Music Studies 5, 257–274. doi:10.1080/19401159.2018.1510757

Gaillard, J.C. (2022). On the production of hybrid urban space(s) in post-colonial cities: Manila in the music of Eraserheads. Geografiska Annaler: Series B, Human Geography 105, 305–320. doi:10.1080/04353684.2022.2146523

Girardi, Elena & Plag, Ingo (2022). Metrical mapping in text-setting: empirical analysis and grammatical implementation. In Scharinger & Wiese (2022), 191–221. doi:10.1515/9783110770186-008

Go, Matthew Phillip & Nocon, Nicco (2017). Using Stanford part-of-speech tagger for the morphologically-rich Filipino language. In Roxas, Rachel Edita (ed.) Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation (PACLIC) 31. Manila: National University, 81–88.

Go, Matthew Phillip, Nocon, Nicco & Borra, Allan (2017). Gramatika: a grammar checker for the low-resourced Filipino language. In TENCON 2017 – 2017 IEEE Region 10 Conference, 471–475. doi:10.1109/TENCON.2017.8227910

Goldsworthy, David (1995). Continuities in Fijian music: meke and same. Yearbook for Traditional Music 27, 23–33. doi:10.2307/768101

Gonzalez, A. (1970). Acoustic correlates of accent, rhythm, and intonation in Tagalog. Phonetica 22, 11–44. doi:10.1159/000259307

Gunkel, Dieter & Ryan, Kevin M. (2011). Hiatus avoidance and metrification in the Rigveda. In Jamison, Stephanie W., Melchert, H. Craig & Vine, Brent (eds.) Proceedings of the 22nd annual UCLA Indo-European Conference. Bremen: Hempen, 53–68.

Hagberg, Lawrence Raymond (2006). An autosegmental theory of stress. Dallas, TX: SIL International.

Hayes, Bruce (1995). Metrical stress theory: principles and case studies. Chicago, IL: University of Chicago Press.

Hayes, Bruce (2009). Textsetting as constraint conflict. In Arleo, Andy & Aroui, Jean-Louis (eds.) Towards a typology of poetic forms: from language to metrics and beyond. Amsterdam: Benjamins, 43–61. doi:10.1075/lfab.2.02hay

Hayes, Bruce (2013). Milton, Maxent, and the Russian method. Paper presented at the M@90 Workshop on Metrical Structure, Massachusetts Institute of Technology, September 2013. Available at https://youtu.be/IRRyxW6xPeg.

Hayes, Bruce, Zuraw, Kie, Londe, Zsuzsa Cziráky & Siptár, Peter (2009). Natural and unnatural constraints in Hungarian vowel harmony. Lg 85, 822–863.

Himmelmann, Nikolaus P. & Kaufman, Daniel (2020). Austronesia. In Gussenhoven, Carlos & Chen, Aoju (eds.) The Oxford handbook of prosody. Oxford: Oxford University Press, 370–383.

Hwang, Hyun Kyung, Nagaya, Naonori & Villegas, Julián (2019). Cue weighting in the perception of Tagalog stress. JASA 146, 3052. doi:10.1121/1.5137583

International Phonetic Association (1999). Handbook of the International Phonetic Association: a guide to the use of the International Phonetic Alphabet. Cambridge: Cambridge University Press.

Jakubíček, Miloš, Kilgarriff, Adam, Kovář, Vojtěch, Rychlý, Pavel & Suchomel, Vít (2013). The TenTen corpus family. In Hardie, Andrew & Love, Robbie (eds.) Corpus linguistics 2013 abstract book. Lancaster: UCREL, 125–127.

Johan, Adil & Santaella, Mayco A. (eds.) (2021). Made in Nusantara: studies in popular music. New York: Routledge. doi:10.4324/9780367855529

Kampstra, Peter (2008). Beanplot: a boxplot alternative for visual comparison of distributions. Journal of Statistical Software 28, 1–9. doi:10.18637/jss.v028.c01

Kaufman, Daniel (2010). The morphosyntax of Tagalog clitics: a typologically driven approach. PhD dissertation, Cornell University.

Kessler, Brett (2001). The significance of word lists. Stanford, CA: Center for the Study of Language and Information.

Klimenko, Sergey B., Maria, Paz C. San Juan & Javier, Jem R. (2010). Stressed out with stress: perceptual recognition of acoustic correlates of stress in Tagalog. Paper presented at the 1st Philippine Conference-Workshop on Mother Tongue Based Multilingual Education, Capitol University, Cagayan de Oro City.

Lazaro, Lito Rodel S., Policarpio, Leslie L. & Guevara, Rowena Cristina L. (2009). Incorporating duration and intonation models in Filipino speech synthesis. In Proceedings: APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 annual summit and conference, 45–49.

Lerdahl, Fred & Jackendoff, Ray S. (1981). On the theory of grouping and meter. The Musical Quarterly 67, 479–506. doi:10.1093/mq/LXVII.4.479

Lerdahl, Fred & Jackendoff, Ray S. ([1983] 1996). A generative theory of tonal music. Cambridge, MA: MIT Press. Reissue. doi:10.7551/mitpress/12513.001.0001

Liberman, Mark Yoffe (1975). The intonational system of English. Master’s thesis, Massachusetts Institute of Technology.

Limjuco, Renan P., Ticudo, Ronald Jay S. & Pregua, Helenne U. (2014). Determinants of music type preference of university students in Davao City. UIC Research Journal 20, 177–189.

Lockard, Craig A. (1996). Popular musics and politics in modern Southeast Asia: a comparative analysis. Asian Music 27, 149–199. doi:10.2307/834493

Maceda, Teresita Gimenez (2007). Problematizing the popular: the dynamics of Pinoy pop(ular) music and popular protest music. Inter-Asia Cultural Studies 8, 390–413. doi:10.1080/14649370701393766

Manalansan, Martin F. (2003). Global divas: Filipino gay men in the diaspora. Durham, NC: Duke University Press. doi:10.2307/j.ctv12101tn

Martin, Andrew (2011). Grammars leak: modeling how phonotactic generalizations interact within the grammar. Lg 87, 751–770.

Martin, Andrew Thomas (2007). The evolving lexicon. PhD dissertation, University of California, Los Angeles.

Meseguer-Brocal, Gabriel, Cohen-Hadria, Alice & Peeters, Geoffroy (2018). DALI: a large dataset of synchronized audio, lyrics and notes, automatically created using teacher-student machine learning paradigm. In Proceedings of the 19th International Society for Music Information Retrieval (ISMIR) conference, 431–437.

Mital, Matt Ervin G., Tobias, Rogelio Ruzcko N.M.I., Bandala, Argel A., Billones, Robert Kerwin & Dadios, Elmer P. (2019). Utilization of genetic algorithm in classifying Filipino and Korean music through distinct windowing and perceptual features. In Niranjan, S.K., Rana, Ajay & Khurana, Himdweep (eds.) Proceedings of the Fourth International Conference on Contemporary Computing and Informatics (iCI 2019). Piscataway, NJ: IEEE, 121–126.

Monterola, Christopher, Abundo, Cheryl, Tugaff, Jeric & Venturina, Lorcel Ericka (2009). Prediction of potential hit song and musical genre using artificial neural networks. International Journal of Modern Physics C 20, 1697–1718. doi:10.1142/S0129183109014680

Moyle, Richard (2007). Taking five – quintuple metre in Takū tuki songs. In Moyle, Richard (ed.) Oceanic music encounters: the print resource and the human resource: essays in honour of Mervyn McLean. Auckland: Department of Anthropology, University of Auckland, 123–132.

Nagai, Hiroko (2022). Embrace our color: nationalism in Philippine popular music. In Hamza, Hafzan Zannie, Chan, Clare Suet Ching & Chin, Lena Farida Hussain (eds.) Proceedings of the 4th International Music & Performing Arts Conference: trending digital virtual and capital. Tanjong Malim: Sultan Idris Education University, 32–37.

Nolasco, Ricardo Ma. (2007). Filipino and Tagalog, not so simple / how to value our languages. Blog post. Archived at https://web.archive.org/web/20170730230733/http://svillafania.philippinepen.ph/2007/08/articles-filipino-and-tagalog-not-so.html.

Peña, Verne de la (2021). Songs for and of the youth: mapping trends in Philippine popular music, 1900–2000. In Johan & Santaella (2021), 92–100.

Prince, Alan & Smolensky, Paul (1993). Optimality Theory: constraint interaction in generative grammar. Technical Report 2, Rutgers Center for Cognitive Science. Subsequently published as Prince & Smolensky (2004).

Prince, Alan & Smolensky, Paul (2004). Optimality Theory: constraint interaction in generative grammar. Oxford: Blackwell. doi:10.1002/9780470759400

Proto, Teresa (2013). Singing in German: text-setting rules and language rhythm. In Vincenzo, Galatà (ed.) Multimodalità e multilingualità: la sfida più avanzata della comunicazione orale. Rome: Bulzoni, 9–10.

Proto, Teresa & Dell, François (2013). The structure of metrical patterns in tunes and in literary verse. Evidence from discrepancies between musical and linguistic rhythm in Italian songs. Probus 25, 105–138. doi:10.1515/probus-2013-0004

Prudente, Felicidad A. (2021). Colonialism and identity: a short history of popular music in the Philippines. In Johan & Santaella (2021), 35–44.

R Core Team (2021). R: a language and environment for statistical computing. Available at https://www.R-project.org.

Ramos, Teresita V. (1981). Tagalog structures. Honolulu, HI: University of Hawaii Press.

Reed, Stephanie (2022). Duration in Tagalog reduplication as evidence for phonemic vowel length. Paper presented at LabPhon 18, online, June 2022.

Richards, Norvin (2017). Some notes on Tagalog prosody and scrambling. Glossa 2, article no. 21.

Rubrico, Jessie Grace U. (2012). Indigenization of Filipino: the case of the Davao City variety. Ms., University of Malaya. Available at https://www.languagelinks.org/onlinepapers/Indigenization-of-Filipino.pdf.

Ryan, Kevin M. (2011). Gradient weight in phonology. PhD dissertation, University of California, Los Angeles.

Sabbagh, Joseph (2004). Stress shift and prosodic correspondence in Tagalog. Presented at the Western Conference on Linguistics (WECOL), University of Southern California.

Sabbagh, Joseph (2014). Word order and prosodic-structure constraints in Tagalog. Syntax 17, 40–89. doi:10.1111/synt.12012

Schachter, Paul & Otanes, Fe T. (1972). Tagalog reference grammar. Berkeley, CA: University of California Press. doi:10.1525/9780520321205

Scharinger, Mathias & Wiese, Richard (eds.) (2022). How language speaks to music: prosody from a cross-domain perspective. Berlin: De Gruyter. doi:10.1515/9783110770186

Shryock, Aaron (1993). A metrical analysis of stress in Cebuano. Lingua 91, 103–148. doi:10.1016/0024-3841(93)90010-T

Shunwei, Liu & Jia, Li (2022). Establishment of Philippine popular music industry. Multicultural Education 8, 60–82.

Smule, Inc. (2018a). DAMP-MVP: Digital archive of mobile performances – Smule multilingual vocal performance 300x30x2. Data set. Available at https://doi.org/10.5281/zenodo.2747436.

Smule, Inc. (2018b). DAMP-VSEP: Digital archive of mobile performances – vocal separation. Data set. Available at https://doi.org/10.5281/zenodo.3553059.

Soberano, Rosa (1980). The dialects of Marinduque Tagalog. Number 69 in Pacific Linguistics Series B. Canberra: Department of Linguistics, School of Pacific Studies, Australian National University.

Sulit, Jimson Montilla (2022). Content-based original Pilipino music (OPM) recommender system centered on mood. MICS capstone project, University of the Philippines.

Sumalinog, Deo Jamael M., Salid, Ronald S., Sarino, Exel Rose D. & Amante, F. (2021). Exploring gay lingo in some selected OPM songs. Asian Journal of Advanced Multidisciplinary Researches 1, 1–4.

Tan, Ivan, Lustig, Ethan & Temperley, David (2019). Anticipatory syncopation in rock: a corpus study. Music Perception 36, 353–370. doi:10.1525/mp.2019.36.4.353

Temperley, David (1999). Syncopation in rock: a perceptual perspective. Popular Music 18, 19–40. doi:10.1017/S0261143000008710

Temperley, Nicholas & Temperley, David (2013). Stress-meter alignment in French vocal music. JASA 134, 520–527. doi:10.1121/1.4807566

Yampolsky, Philip (2022). Poetic text and melodic text: text-setting in two song traditions of Timor. Asian Music 53, 80–126. doi:10.1353/amu.2022.0004

Zuraw, Kie (2007). The role of phonetic knowledge in phonological patterning: corpus and survey evidence from Tagalog infixation. Lg 83, 277–316.