
Auto-Tune as instrument: trap music's embrace of a repurposed technology

Published online by Cambridge University Press:  03 February 2025

Ben Duinker*
Affiliation:
Schulich School of Music, McGill University, 555 Sherbrooke St West, Montreal, Canada H3A 1E3 benjamin.duinker@mail.mcgill.ca

Abstract

This article explores Auto-Tune's importance to the production, perception and reception of trap music, a sub-genre of hip hop. Central to this exploration is the observation that Auto-Tuned trap vocals are readily audible as such because the software's pitch correction function is applied unnaturally quickly to the vocal audio signal, a feature herein termed ‘zero-onset Auto-Tune’. First, I posit that although Auto-Tune is ostensibly a pitch-correction device, its impact on vocal timbre is not well documented or understood. Second, I argue that Auto-Tune's recent importance as a creative tool in trap recasts it as an instrument. Third, I suggest that understanding Auto-Tune's repurposing as an instrument begets its situation in a lineage of technologies repurposed, adapted and embraced by the hip-hop community, including the turntable, digital sampler, and analogue mixer. And fourth, I propose that this repurposing surfaces in Auto-Tune's ability to facilitate emotiveness in trap vocals.

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2025. Published by Cambridge University Press

In a 2013 interview with The Fader, Jeffrey Lamar Williams, better known as Young Thug, admitted that ‘I don't really know how to sing, but I've been trying for years’ (Stephenson 2013). As one of the most prolific voices in contemporary hip hop, Thug does much more than ‘try to sing’. Across the three studio albums, three EPs, and 19 mixtapes he has released since 2011, Thug's vocal delivery has encompassed rapping, singing and many other utterances: yelps, screeches, shouts and croaks. Amid this vocalic diversity, one thing remains constant. The sonic traces of Auto-Tune are never far away on Thug's studio and live performances alike. On ‘Yeah Yeah’, his unreleased 2017 single with Travis Scott, Thug's first vocal entry (0:58) swoops up high in his range in pixelated granularity – although his unmediated singing voice is performing a glissando of sorts, Auto-Tune forces his pitch into a rapidly ascending scale, with discrete steps. Across the ensuing verse (1:00–2:08) Auto-Tune turns his shouted lyrics into a series of digital melismas (to borrow a term from Ragnhild Brøvig-Hanssen and Anne Danielsen 2016), each seeming to reach higher and more earnestly – to what, is not clear – culminating in several long-held notes on the lyrics ‘I ain't flexing fool/shoutout something to do’ (2:02). Young Thug's performance on ‘Yeah Yeah’ exemplifies what Kit Mackintosh describes as artists ‘[using Auto-Tune] to completely destabilize their [vocal] delivery by accentuating its warbling quality’ (Mackintosh 2021, p. 34). Yet it is not entirely clear who or what is unstable or warbling here: is it Thug's voice, or is it Auto-Tune's almost desperate-sounding fluttering between pitches?

Developed by Andy Hildebrand for Antares Audio Technologies in 1996, Auto-Tune was originally intended as a studio tool for covert adjustments and corrections to the pitch of recorded vocals in popular music. Made famous (and audible) by Cher's 1998 hit single ‘Believe’, Auto-Tune became a reviled yet constant fixture in popular music production, one that would come to be synonymous with certain artists’ vocal personas – notably T-Pain, Kanye West, Bon Iver and several other (almost exclusively male) artists. Yet perhaps Auto-Tune's most pervasive imprint on popular music is found in trap music, a sub-genre of hip hop that emerged in the American South in the early 2000s. Although Auto-Tune did not originally find favour in trap, the 2010s witnessed a surge in the popularity of trap artists who use Auto-Tune regularly, including Migos, Future, Travis Scott and Young Thug. The software has arguably found a more comfortable home here than anywhere else.Footnote 1

This article makes four interconnected arguments that encourage an understanding of the Auto-Tuned trap voice as artistically and procedurally distinct from the commercial music industry's other uses of Auto-Tune, and historically relevant to other technology-based practices in the broader hip-hop genre. In the first section, I summarise Auto-Tune's functionality and explain how its contemporary usage in trap music facilitates the type of vocal performance Young Thug delivers on ‘Yeah Yeah’. Here I also explore the ramifications of Auto-Tune's technological mediation of vocal timbre, unpacking how timbral characteristics of the voice are altered through Auto-Tune's mediation process. In the second section I propose that Auto-Tune's usage in trap music functions more like that of an instrument than a studio tool. This claim requires interrogation into the studio-based workflow of several trap artists and engagement with scholarship on compositional organology and digital instruments, and is made to illustrate the real-time, creative, symbiotic relationship between the human voice and technology. In the third section I situate Auto-Tune as the latest in a line of sound-producing or sound-modifying technologies whose intended functionalities have been repurposed to creative ends by hip-hop musicians. This situation speaks to the broader pattern of Black-led musical innovations evolving into widespread acceptance in the mainstream and, problematically, the co-optation of those innovations by that same mainstream. And finally, in the fourth section I explore the intersection between machine and human agency in Auto-Tune's functionality in trap music, engaging discourse surrounding Black post-humanism and questioning the notion of whose voice we are hearing in trap vocals, and how that affects our perception of emotiveness in this music.
Together, the positions I take in this paper understand Auto-Tune's usage in trap music as experimental and divergent from the software's original intended functionality, yet consistent with Hildebrand's claim that Auto-Tune was meant to facilitate emotive singing.Footnote 2 In so doing, this article situates trap music as a bridge between hip-hop, technology, emotion and perceived artifice in mainstream popular music.

Functionality and timbre

In her 2022 single ‘Twinnem’, Coi Leray (b. Coi Leray Collins) sings passionately about creating lifetime bonds of friendship, consistent with the song's title.Footnote 3 Her vocal performance encompasses soft, intimate-sounding references to camaraderie in the lyrics ‘go best friend’ (0:16) and earnest scepticism of others on ‘who these new n***as, I ain't feelin’ them’ (0:23). Amid the timbral variety and nuance that Leray employs throughout ‘Twinnem’, her voice is audibly yoked to the key of D major – an audibility owed to the track's overt reliance on Auto-Tune. Indeed, the momentary pitch fluctuations one hears on the repeated hook lyric ‘we killin'em’ (0:18) and the angular pitch trajectories one hears on the second-verse lyrics ‘far’, ‘ours’ and ‘scores’ are as much the product of Auto-Tune as they are of Leray's vocals.

Although not necessarily emblematic of the trap genre, Leray's voice on ‘Twinnem’ exemplifies the way Auto-Tune usage in trap has departed from the software's original intended purpose. Auto-Tune's functionality is driven by an algorithm that analyses the pitch content of an audio signal of recorded vocals and maps them onto the steps of a scale. In this way it takes a continuous parameter – frequency – and yokes it to a discrete system of 12-tone equal temperament. The scale onto which Auto-Tune performs this mapping is pre-set by the user, and is typically major or minor, the two diatonic systems common in Western music (although others are available in Auto-Tune's settings). In addition to pre-setting the scale, the user can also pre-set the retune speed, which determines how quickly the Auto-Tune algorithm will begin the mapping process once it detects an identifiable pitch. At its most covert, Auto-Tune's retune speed can be set to begin mapping well after the pitch is detected, which lets the singer's vocal pitch sound as natural and as accurate as possible, because the singing voice habitually takes a few moments to settle on a stable pitch. Such usage of Auto-Tune is subtly applied to smooth out pitch, especially on longer-held notes, and its effects are often only moderately perceptible to most untrained ears.Footnote 4 This covert functionality is thus meant to preserve the naturalness of vocal sound while improving its pitch – Catherine Provenzano writes that when Auto-Tune is used this way, ‘the voice's timbre remains intact and recognizable’ (Provenzano 2019, p. 110). To some extent, the voice's timbre also remains intact and recognisable even when Auto-Tune is used less than covertly: despite the obvious presence of Auto-Tune on ‘Twinnem’, we can still hear and identify the owner of the voice as Coi Leray.
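The principle described above – continuous frequency quantised to the steps of a pre-set scale, with a retune speed governing how quickly the correction takes hold – can be sketched in a few lines of Python. This is a toy illustration of the concept only, not Antares's proprietary algorithm: the C-major default, the 10 ms analysis frame and the exponential-glide model of retune speed are all assumptions made for the example.

```python
import math

A4 = 440.0
MAJOR = (0, 2, 4, 5, 7, 9, 11)  # semitones above an assumed tonic of C

def quantize_to_scale(freq_hz, scale=MAJOR):
    """Map a detected frequency to the nearest pitch of a pre-set scale."""
    midi = 69 + 12 * math.log2(freq_hz / A4)   # continuous MIDI pitch number
    octave, pc = divmod(midi, 12)
    # include neighbouring octaves so a pitch just below the octave
    # boundary can snap upward to the tonic above
    candidates = [s + o for s in scale for o in (-12, 0, 12)]
    target = octave * 12 + min(candidates, key=lambda s: abs(s - pc))
    return A4 * 2 ** ((target - 69) / 12)

def retune(detected_hz, retune_speed_ms, frame_ms=10):
    """Trace the corrected pitch frame by frame. A retune speed of zero
    snaps each frame straight onto the scale ('zero-onset' Auto-Tune);
    larger values glide toward the target, preserving the voice's
    natural settling on to a pitch."""
    alpha = 1.0 if retune_speed_ms == 0 else min(1.0, frame_ms / retune_speed_ms)
    out, current = [], detected_hz[0]
    for f in detected_hz:
        current += alpha * (quantize_to_scale(f) - current)
        out.append(current)
    return out

# A slightly sharp A (445 Hz) snaps instantly to 440 Hz at retune speed 0,
# but is still only part-way there after one frame at a 100 ms retune speed.
instant = retune([445.0] * 3, retune_speed_ms=0)
gentle = retune([445.0] * 3, retune_speed_ms=100)
```

The audible difference between Coi Leray's ‘Twinnem’ vocal and a covertly corrected pop vocal lies, in this sketch, entirely in the value of `retune_speed_ms`.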

Before continuing this discussion of timbre, Auto-Tune's functionality must be further unpacked. The fluctuations we hear on Leray's lyrics ‘we killin’ em’ arise from setting Auto-Tune's retune speed to zero – that is, forcing Auto-Tune to map its detected pitches to scale steps immediately upon identifying them. This is what we hear in Cher's ‘Believe’ (at 0:36 and other points in the song) and comprises the core of the Auto-Tune practice of Faheem Rasheed Najm, better known as T-Pain. T-Pain ushered ‘zero-onset Auto-Tune’ (my term that describes Auto-Tune's effect when the retune speed is set to zero) into mainstream consciousness with his 2005 debut album, appropriately titled Rappa Ternt Sanga.Footnote 5 The lead single from that album, ‘I'm Sprung’, gave listeners a taste of what would eventually become known as the ‘T-Pain Effect’: a pixelated, warbly vocal so deeply permeated by technological mediation (a thoroughly ‘wet’ vocal aesthetic, to use Victoria Malawey's term) that T-Pain's natural voice was difficult to discern amid the processing.Footnote 6 To be sure, the technological mediation we hear in T-Pain's voice on ‘I'm Sprung’ – his vocal wetness (after Malawey) – encompasses more than just Auto-Tune's effect, its impacts on timbre notwithstanding. Yet the zero-onset Auto-Tune is audible, nonetheless. T-Pain's performance of the lyrics ‘she got me doing the dishes’ (0:39) is transcribed in Example 1. The Auto-Tune here is both obvious and subtle. T-Pain can surely create the melismatic scoop between E-flat and F without Auto-Tune.
Yet the immediacy with which this scoop occurs is not conducive to the natural singing voice, nor is the immediate timbral consistency we hear on each longer tone.Footnote 7 The pixelated timbral quality of T-Pain's voice is probably achieved through a combination of Auto-Tune's effects on timbre and some other type of processing, possibly a bit crusher.Footnote 8 This processing becomes clear when T-Pain's Auto-Tuned voice on ‘I'm ’n Luv (wit a Stripper)’ – another single from Rappa Ternt Sanga – appears sporadically (e.g. 0:30) without this pixelated quality. Because of how thoroughly the ‘T-Pain effect’ has permeated popular discourse, songs like ‘I'm ’n Luv’ demonstrate that the T-Pain effect is not reducible to Auto-Tune alone.Footnote 9 This matters because critics of Auto-Tune risk judging the software on its combination with other vocal processing, as T-Pain has used it throughout his career: whenever Auto-Tune is said to imbue the voice with a distorted or metallic quality, that distortion technically comes from other processing tools.

Example 1. Excerpt from ‘I'm Sprung’ (T-Pain, 2005)

Auto-Tune does, however, impact vocal production of timbre in a significant way. The human voice produces pitch and timbre through a fundamental frequency (F0) and a series of overtones, which occur at higher frequencies. F0 is what we typically perceive as the voice's pitch, while the overtones – more specifically, their relative strengths – are what give each voice its unique timbre. The strengths of these overtones are determined by a variety of biophysical factors: the size (specifically the width) and positioning of a singer's vocal tract, the positioning of their soft palate, the shape of their oral cavity (which facilitates sympathetic vibration, creating formants – or local peaks in spectral energy), and the way air passes through their vocal folds.Footnote 10

While the nature of the relationship between a singer's physical characteristics and the vocal timbre they produce has been debated, the above summary suggests that a vocalist's singing mechanics do have a sizable impact on the timbre they produce.Footnote 11 Yet two matters confound any sense of absolute timbral consistency in the singing voice, and this is where Auto-Tune complicates matters of timbral ‘intactness’. First, the human voice is, in its physicality, inherently unstable – an interesting parallel with Provenzano's identification of instability in the politics of the singing voice.Footnote 12 Even well-trained singers cannot fully eliminate minute physical fluctuations in their voice. Just as it takes time for the voice to settle on a stable pitch, it takes time to settle on a stable timbre, but even then, there will always be small timbral fluctuations before such stability is attained. These are natural, and we are accustomed to hearing them.

Second, the singing voice naturally vibrates (fluctuates in pitch) to a certain degree, unique to each voice. The overtones of the voice – the partials above the fundamental frequency (F0) that go some way in determining the voice's timbre – also vibrate, meaning vibrato singing affects pitch and timbre. Auto-Tune modifies both these dimensions of the singing voice. Example 2 shows a spectrogram of a sung straight tone in its unmediated form, and after zero-onset Auto-Tune has been applied to it. The spectrogram visualises what we would hear. First, the timbral stability of the pitch is immediately achieved, which would not happen naturally. Second, the vibrato in the overtone partials is all but eliminated after Auto-Tune is applied. The modification of these two dimensions of vocal timbre goes much of the way towards producing the audible changes in timbre we hear with Auto-Tune, but a third change is worth exploring.

Example 2. Spectrogram of sung straight tone (performed by the author) without (above) and with (below) Auto-Tune applied to the audio signal

As Mariana Young (2016) explains, Auto-Tune does not, indeed it cannot, alter a singer's formants – the heightened spectral energy heard in certain overtones based on the resonant frequencies of a singer's oral cavity.Footnote 13 The location and strength of formants in any pitched sound will substantially contribute to that sound's timbre. Singers, for example, can use formants to vary their vowel production across a stable pitch. In essence, then, a singer will shape their oral cavity to emphasise certain overtones in the pitch they produce. If a singer changes pitch while maintaining a constant vowel, their formant structure will remain constant, despite the necessary transposition of F0 (the perceived pitch) and its related overtones. Because the formants do not change here, the timbres of the original and new pitches will be slightly different. This natural difference is not novel to listeners. In contrast, if Auto-Tune changes the singer's pitch over a constant vowel, it transposes everything: F0, overtones and formant structure. Young's assertion that Auto-Tune does not alter a singer's formants is thus true, but only insofar as Auto-Tune can preserve these formants in an unnatural way.Footnote 14 For many of Auto-Tune's very slight adjustments to pitch, such unnatural timbral consistency might go unnoticed, but in cases where the adjustments occur over a semitone or wider interval, these artificial consistencies are audible. This process thus creates an uneasy tension between the natural and the artificial in the trap voice, encouraging rumination on authenticity in this genre and on whose voice we are hearing – topics explored in more detail below.
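The timbral asymmetry described here can be made concrete with a toy model: fix a spectral envelope with two formant peaks (the 500 Hz and 1500 Hz values below are hypothetical, standing in for a singer's vocal-tract resonances) and compare how the harmonic amplitudes behave when pitch changes naturally (formants stay put, so the harmonics re-sample the envelope) versus when F0, overtones and formant structure are transposed together. A minimal sketch, under those assumptions:

```python
import math

def envelope(freq_hz):
    """Toy spectral envelope with two formant peaks at 500 and 1500 Hz
    (hypothetical values standing in for vocal-tract resonances)."""
    return (math.exp(-((freq_hz - 500) / 150) ** 2)
            + 0.5 * math.exp(-((freq_hz - 1500) / 250) ** 2))

def harmonic_amps(f0_hz, n=8):
    """Amplitudes of the first n harmonics of a tone at f0, as shaped
    by the fixed formant envelope above."""
    return [envelope(k * f0_hz) for k in range(1, n + 1)]

f0_old, f0_new = 220.0, 246.9  # roughly A3 raised a whole tone to B3

# Natural pitch change: the vocal tract (the envelope) stays put, so the
# harmonics land at new frequencies and their relative strengths shift.
natural = harmonic_amps(f0_new)

# Wholesale transposition (the zero-onset case described above): F0,
# overtones and formant structure all move together, so the amplitude
# pattern is identical to the old pitch's - an unnatural consistency.
transposed = harmonic_amps(f0_old)
```

The numbers themselves are unimportant; the contrast is the point. Listeners are accustomed to the first pattern of slight timbral change across pitches and register the second, perfectly preserved pattern as artificial.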

Kit Mackintosh's colourful description of Auto-Tune's effect on the trap voice stipulates that the approach many contemporary MCs take with it ‘is all about exploring and accentuating the feel and the internal architecture of their oral cavities in combination with the intensified texturising effects of Auto-Tune’ (Mackintosh 2021, p. 28). In a sense, he is describing this multi-level timbral functionality of zero-onset Auto-Tune: the singer's formant structure remains unchanged but is transposed along with the pitch (F0 and overtones) when the Auto-Tune does its work of adjusting the pitch to fit its scale, all while immediately flatlining the F0 and overtones of each pitch. That is perhaps why when Young Thug ascends through the first five pitches of a C-minor scale on a neutral syllable in his 2015 single ‘Best Friend’ (0:40), we hear an unnatural timbral straightness and the increasing strain in his voice as he ascends into his passaggio register. We are thus left with a curious sonic result that bears traces of the human and the machine that produced the sound. It forces the question: what, or whose, vocal timbre are we hearing?

Studio tool or instrument?

Before delving further into matters of timbre and agency, it is necessary to explore how Auto-Tune is used in trap music and how this usage differs from Auto-Tune's original intended purpose, which has continued in contemporary popular music. In short, Auto-Tune's functionality in trap music is instrument-like in nature. More specifically, Auto-Tune in trap music functions as a digital, compositional and collaborative instrument, topics I will explore in turn below. This discussion is supported by four main observations: first, that Auto-Tune in trap adds something appreciable to the voice, rather than removing its imperfections; second, that trap artists derive creative inspiration from Auto-Tune's effects and thus react to it as an instrumentalist would their instrument; third, that this reaction recasts Auto-Tune as an agent of composition, instead of its original role as an editing device; and fourth, that such positioning encourages us to seriously consider Auto-Tune's capabilities as a creative tool.

As mentioned earlier, Auto-Tune was originally intended to function as a correction device, to smooth out pitch-based inconsistencies in studio-recorded vocal performances. This functionality had two benefits: in streamlining the recording process, Auto-Tune relieved singers of the stress and labour involved in delivering pitch-perfect performances, and it enabled engineers and producers to exert more agile control over the final recorded product, much as they already could with the time-based quantisation tools that digital recording rigs offered. In both senses, Auto-Tune served as a labour-saving device (Provenzano 2019), a software-based production tool used to facilitate expressive vocal performances. Reviled or adored, Cher's performance on ‘Believe’ laid bare the effects of Auto-Tune for all to hear, and in so doing arguably created an expressive vocal performance. The distinction is important: Auto-Tune's role was no longer to remove something from a vocal performance, like a sculptor chipping away stone to create a sculpture, but rather to add something, like a painter adding paint to a canvas. Here, the canvas is the unmediated voice.

This evolution of Auto-Tune's usage is the first step in coming to define it as an instrument. In invoking the notion of an instrument, we must consider the whole ecosystem of physical action, sound production and sound modification. When striking a snare drum, for example, the drummer's motion involves movements in the torso, arm, wrist and hand, and sound is produced when the stick contacts the drumhead. Different movements will produce different sounds from the drum, as will modifications to the drum itself (such as tightening the drumhead). Furthermore, the drum can be miked, equalised and treated with other live processing; these elements also affect how the drum sounds. This ecosystem of various actions (playing, adjusting, processing) and sounds comprises the snare drum's functionality as an instrument and defines its sonic profile – the collection of sounds it can produce. Similarly, vocal theory often refers to the ‘vocal instrument’, which comprises the physical elements within the body that work to produce vocal sound. For example, Kate Heidemann's (2016) summary of vocal physiology and timbre details how air is propelled from the lungs via muscular contraction in the torso, its pressure modified in the vocal folds (thus turning it into sound), and its timbre shaped by the singer's oral cavity. The vocal instrument can be ‘adjusted’ in much the same way as a drum – these adjustments come from within the singer's body, however – and it can be miked, equalised and otherwise processed. As with the drum, this ecosystem of the voice comprises its functionality as an instrument. Auto-Tune's role in this ecosystem is often situated in this last category of ‘processing’. Alexander Refsum Jensenius (2022) writes that ‘the voice comprises both sound producing and sound-modifying parts as any other acoustical instrument’ (p. 48).
As such, Auto-Tune's role as a sound modifier makes logical sense to include within the vocal–instrument ecosystem.

Auto-Tune's shaping of the voice-as-heard thus represents the final stage in the vocal–instrument ecosystem that begins with the passage of air from the lungs and ends with the voice as heard through loudspeakers or headphones. Auto-Tune's role in this process is digital, and its presence, inseparable from the voice, goes some way in rendering the voice a digital musical instrument, or DMI. Most thoroughly discussed by Eduardo Miranda and Marcelo Wanderley (2006), DMIs encompass the growing list of instruments whose sonic output involves digitally produced sound. Miranda and Wanderley classify DMIs as instruments that ‘contain a control surface (gestural or performance controller), an input device, and a sound generation unit’ (p. 3). The performer effectuates input gestures on a controller, which is treated with a mapping scheme that determines which gestures produce which sounds. By this definition, DMIs produce sounds, and while Auto-Tune requires a sonic input that it can modify, the immense gulf between the unmediated singing voice and the Auto-Tuned final product almost suggests that Auto-Tune is creating sound rather than simply modifying it. While this is technically untrue – Auto-Tune does not create sound ex nihilo – similarities to Miranda and Wanderley's description of DMIs do exist. With Auto-Tune, the input gesture is the singer's unmediated voice (the gesture being the physical production of vocal sound), the gestural controller – so to speak – is the Auto-Tune interface, the mapping encompasses the settings used in Auto-Tune, and the sound production is the final product listeners hear. The difference between Auto-Tune and Miranda and Wanderley's DMI model is, of course, that sound production occurs from the input gestures, and not from the mapping scheme.

Framing the Auto-Tuned voice as a DMI foregrounds the step-by-step process of sound production and modification in Auto-Tuned trap vocals and invites discussion on how these are created in the studio and in live settings. Auto-Tune was designed as something producers could use after vocals are recorded. In this context, the vocalist records as they always would have – voice to microphone – in an unmediated environment. Auto-Tune is thus not part of the recording process, but part of the post-production process. While recording, the only awareness of Auto-Tune a singer might have is the reassurance (or tacit acceptance) that it would be used later to smooth out any imperfections in their pitch. Indeed, documentation of the recording process for ‘Believe’ suggests that, even there, Auto-Tune was applied post-factum.Footnote 15

Returning to T-Pain momentarily, the sessions for Rappa Ternt Sanga have not been widely discussed in public, but it is difficult to believe, with the quantity of Auto-Tune and other processing used on T-Pain's vocals, that he was not party to its application. In fact, he might have been hearing it and interacting with it as he recorded. Indeed, many trap rappers today record with zero-onset Auto-Tune running in real time.Footnote 16 The adjustments are thus not applied in post-production, but during the recording process.Footnote 17 Vocalists hear their Auto-Tuned selves through headphones. The Auto-Tuned vocal track is the only track a producer has available to them for mixing and mastering. The user of a DMI must become adept and familiar with the gesture-to-sound mappings the DMI affords, and with how to manipulate these and react to sounds produced by the DMI. In the same way, the trap vocalist recording with Auto-Tune running in real time must become familiar with its effects; how it will react to what the vocalist does; where it will flutter, fluctuate or roam off key. To be sure, much of this familiarity comes through experimentation and improvisation, some of which finds its way to final versions of songs.Footnote 18 Yet certain artists, having used Auto-Tune as much as they have, do become accustomed to its effects and results, and can record and perform with it to some level of predictability (even if such predictability is not always aesthetically desirable).
The vocalist here thus encounters challenges similar to those of learning a new DMI, which involves familiarity with mappings that link physical action with sonic output.Footnote 19 Example 3 documents the workflow for recorded vocals with Auto-Tune running in real time, as it is used in many of today's trap recordings.Footnote 20 In this workflow, the vocalist sings, chants, shouts or raps into the microphone (input gestures) and their voice is processed through the Auto-Tune plug-in, which is pre-set with a retune speed, scale and humanisation factor (mappings). The resulting audio signal is what the vocalist hears while they record and can thus be dramatically different from their internal hearing of their voice. The vocalist is thus navigating and reacting to two heard voices simultaneously – their unmediated ‘dry’ voice in their head, and their Auto-Tuned ‘wet’ voice in their headphones.

Example 3. Workflow diagram for Auto-Tune as it is used in trap music

Understanding the ecosystem in which zero-onset Auto-Tune operates suggests its connection to another concept in organology. Jonathan De Souza describes a compositional instrument as ‘a conceptual tool and a source of material’ (De Souza 2017, p. 133). Because zero-onset Auto-Tune runs while singers record, and because they react to it as they record, it becomes inseparable from the creation, or composition, of the recorded song. This differs from unmediated vocal recording because, theoretically, a singer/songwriter can write a song's melody and lyrics, and cultivate an approach to performing them, all before entering the studio. Yet if the song's vocalic content relies heavily on a tool like Auto-Tune, this process of composition cannot be fully realised until the vocals are recorded (or, at least, in dry studio runs immediately prior to recording). Further emphasising this notion of in-studio composition is rap's history of its artists doing much of their composing in the moment, through freestyling. Los Angeles-based producer Bainz, who works extensively with Young Thug, mentions in an interview that Thug's songs are often entirely freestyled, meaning that all the composition happens in the studio, and all of it happens with Auto-Tune running. Consequently, Auto-Tune's functionality as a compositional instrument makes natural sense.Footnote 21

De Souza's discussion probes how composers respond to instruments in their creative processes, more specifically how they navigate these instruments’ affordances. Here he is using the term ‘affordances’ in the sense that it is described in the work of James Gibson (1966). According to Gibson, affordances are what an environment offers or furnishes to an organism existing and acting in that environment. In the case of Auto-Tune, its affordances are three, as explained above: retune speed, scale and humanisation factor. Yet there is another key actor who influences these affordances: the producer. So far, we have described Auto-Tune as party to a vocal–instrument ecosystem that involves the physiological production of vocal sound, its transformation into a digital audio signal, and its modification and reshaping through the parameters of Auto-Tune. While the software itself technically ‘does’ these modifications, the vocalist's input plays an equally important role in determining what modifications are possible and might come to pass in real time. The producer also controls these outcomes, through the pre-set specifications (often called pre-sets) they establish on the Auto-Tune interface.Footnote 22 While pre-sets are, by nature, set prior to the recording process, they can also be modified in real time, or adjusted over a series of takes to craft what ultimately sounds like an evolving use of Auto-Tune across a longer performance, such as a verse or song, as was done in J. Cole's verse on Young Thug's 2019 single ‘The London’.Footnote 23 As his verse begins (0:39) with the lyrics ‘circumnavigate the globe’, no Auto-Tune is audible, but just 10 seconds later, its presence is tangible on the lyrics ‘you can never hit mine’ and undeniable a further 10 seconds later on the melismatic ‘Everybody sing’ (0:59).
From this point, Cole progresses through a series of off-beat rhymes on ‘heard’, ‘buried’ (pronounced ‘burred’ to rhyme), ‘birds’ and ‘word’, lingering on each of them, creating space for Auto-Tune's audible presence to emerge through melismatic bursts of energy. Again, without ethnographic material concerning the recording sessions for ‘The London’ we cannot be sure how these adjustments to Auto-Tune occurred – whether they were applied in real time or post-factum – but they did occur, demonstrating the fluidity with which Auto-Tune can engage the human voice, drastically affecting the aesthetics of its delivery and reception.

The workflow ecosystem documented above can thus also account for the creative input of the producer, as shown in Example 3. This collaborative workflow involves a vocal producer who is controlling, and possibly changing in real time, the settings on the plug-in, responding to the vocalist's performance.Footnote 24 It is important to again underscore that in both these workflows, the singer is responding in real time to the affordances of their voice and its Auto-Tuned processing: they are hearing their Auto-Tuned voice in their headphones, necessitating a level of comfort with producing one sound and hearing another.Footnote 25 In the collaborative workflow, however, the vocalist also responds to the actions of the producer. Thus, if Auto-Tune is an instrument, the singer and the producer are its performers and creators. Unpacking the ecosystem behind this relationship between performer, creator and instrument is necessary to understand the agentive power singers have over their recorded performances. While sources that explicitly decry the use of Auto-Tune in trap music are relatively scarce, the absence of singer testimony (beyond rare admissions such as the one by Young Thug cited in the introduction) on their relationship with Auto-Tune – instead spoken about at length by producers – risks marginalising their creative and agentive role in producing sonic outputs with this instrument.Footnote 26

Technological repurposing in hip-hop music

Understanding Auto-Tune as a digital, compositional and collaborative instrument begets its situation in a lineage of technologies repurposed, adapted and instrumentised by the hip-hop community.Footnote 27 In the 1970s, DJs began repurposing turntables from their original functionality as playback devices into tools for musical expression through looping, sampling and scratching. In the 1980s, producers began using digital samplers for beat creation, rather than for their original purpose as money- and labour-saving devices in large studio recording projects. Also in the 1980s, producers exploited the limitations of analogue mixers by facilitating signal bleed between channels, thereby cultivating new timbral possibilities in their mixes. In a more general sense, the hip-hop production community has continually harnessed technological limitations as wellsprings of creativity, leading to sonic innovations otherwise not possible in environments where money, time and access to top-quality equipment were not always available. In a similar vein, Auto-Tune's establishment as an instrument in trap music illustrates how technology and hip-hop music have always engaged in a bilateral evolutionary relationship, where developments in one realm spur developments in the other. Yet this evolutionary relationship also reflects the tension that Black-led musical innovation has encountered from the (largely white) musical mainstream over the 20th century. Understanding Auto-Tune's evolution through trap music thus intersects with the blowback it has received in some quarters of the music community. Below I unpack each of these technological repurposings and adaptations in turn, culminating with how Auto-Tune's use in trap music represents the newest iteration of this practice in hip-hop music.

The story of the turntable's rise to prominence in hip-hop production – indeed, shaping its aesthetic more than any other technology – has been documented widely.Footnote 28 In the 1970s, vinyl was the most popular means of dissemination for commercial music, and DJs who played sets at clubs would do so with crates of vinyl, two turntables and eventually a crossfader (which was first released commercially in 1977). Some version of this setup was to be found at many early hip-hop parties, often outdoors, across New York City, notably in the Bronx and Queens. While DJs in New York's disco scene were perfecting the art of seamless transitions between records with identical (or similar) tempos (usually measured in BPMs, or beats per minute), hip-hop DJs such as Kool Herc and Grandmaster Flash began performing these transitions with ever smaller segments of music. Instead of using entire songs, these DJs focused on sections of funk and disco records known as breakbeats – sparsely textured, highly rhythmic excerpts often featuring only drums, and occasionally the rest of the rhythm section. In this sense, the DJs were stringing together series of samples from other records, creating new, unique, musical experiences that transcended the identities of any original records that were used. Consequently, two musical innovations occurred: the records were repurposed (sampled) as material for new musical creations, which were originally only heard live but later formed the basis for commercial recordings, and the turntable itself was repurposed from a playback device to a source for composition and performance, an instrument, with a codified set of playing techniques that included scratching, back spinning, the merry-go-round and many others.Footnote 29

When digital samplers became popular in the 1980s, they were marketed as time-saving devices in recording or sound-scoring studios. As Tricia Rose explains, samplers were used to lift parts from existing recordings to efficiently flesh out audio mixes, shortcutting the need to hire expensive studio musicians, for example. She writes that ‘prior to rap, the most desirable use of a sample was to bury its identity … rap producers have inverted this logic, using samples as point of reference, as a means by which the process of repetition and recontextualisation can be highlighted and privileged’ (Reference Rose1994, p. 73). Rose discusses Marley Marl's realisation in the early 1980s that single drum hits could be easily sampled with this new technology, and beats created anew out of these single hits.Footnote 30 By the time the Akai MPC 60 was introduced to the world in 1988 as the first affordable sampler/playback device, hip-hop producers had cultivated the art of collage-style sampling and reproduction. In a sense, the new breed of samplers led by the MPC 60 was a response to these musical innovations already happening in hip-hop music: the sampler, too, became prized and used in hip-hop music as an instrument, one that could be used creatively in the studio and live performances alike.

Rose highlights another way early hip hop ‘used machines in ways not intended’ (p. 75): the distortion and bleed of audio signals on analogue recording rigs. By riding the low-end frequencies further than analogue mixing boards were designed to accommodate, producers frequently worked ‘in the red’, distorting an otherwise clean audio signal. A consequence of this practice was audio-signal bleed between tracks. On analogue mixers, if one channel is set too hot, its signal will leak into adjacent channels, which results in further distortion and loss of focus to the sound passing through the original channel. As Rose writes, ‘in traditional recording techniques, leakage is a problem to be avoided, it means the sounds on the tracks are not clearly separated, therefore making them less fixed in their articulation’ (p. 76). Yet this less fixed articulation is precisely what was sought by hip-hop beat producers, and analogue mixing technology was harnessed in its imperfections to make this possible.

In becoming an ‘audible, obvious technology’ (Burton Reference Burton2017, p. 76) that is laid bare in the final recorded product, zero-onset Auto-Tune represents a further example of hip hop's propensity to evolve the functionality of audio technologies. Vocalists and producers are inverting the intended use of a technology and harnessing new modes of creation in so doing. In its original functionality, Auto-Tune's appeal lay in its abilities as a labour-saving device, which Provenzano describes as ‘a tool that makes our lives easier even though we do not think of [it] as artistic, expressive, or agentive’ (Reference Provenzano2019, p. 164). Although she was not writing about trap music specifically, Provenzano's assertion that this labour-saving conception of Auto-Tune ‘misrepresents much of the work Auto-Tune and its users perform’ (p. 165) resonates with the trap community's usage of it over the past decade. Trap's repurposing of Auto-Tune has indeed recast it as artistic, expressive and, as I will discuss below, agentive.

The zero-onset Auto-Tune used in trap music diverges from how this tool was used in earlier forms of pop music. As I have argued in this section, foregrounding this divergence encourages Auto-Tune's situation among other technologies that have been repurposed and embraced by the hip-hop community. These repurposings – and the sound worlds that have resulted from them – have all been met with criticism, from assertions in the 1980s and 1990s that rap wasn't ‘music’ to more recent claims that Auto-Tune enables no-talent vocalists to succeed. Consider the similarity between J.D. Considine's Reference Considine1992 article ‘Fear of a Rap Planet’, in which guitarist Al Di Meola asks of hip-hop music – ‘where are the people who've learned to play their instrument?’ – and Robert Everett-Green's Reference Everett-Green2006 assertion that Auto-Tune has produced a new generation of ‘frankenmusic’. Both instances can be read as scathing critiques of the embrace of a new technology in popular music. Di Meola's comment implies that turntables and samplers do not count as instruments, and Everett-Green's, although penned before T-Pain's revolutionary use of Auto-Tune was widely emulated, reveals an ardent backlash against any form of technical manipulation that might render a music inauthentic (a point I expand upon below).

It helps to contextualise Auto-Tune's repurposing using Karin Bijsterveld and Trevor Pinch's notion of ‘breaching experiments’, where the authors propose to ‘treat the introduction [or repurposing] of musical machines as cases of breaches in musical culture’ (Reference Bijsterveld and Pinch2003, p. 538). The importance of such breaching experiments, the authors add, is that ‘breaches of convention reveal underlying norms and values’ (p. 538). Seen in this way, criticism of the way Auto-Tune is used in trap music, like hip hop's other repurposed technologies before it, perhaps reveals much about what we value in music, and what counts as music to us. It is, in a sense, an incarnation of what Mark V. Campbell has described in reference to the turntable as a critique through the modernist lens of orthodoxy. Campbell writes that ‘at the core of turntablism is what some might call an “unorthodox” use of the turntable and its components. Such unorthodoxy appears only as such through modernist lenses, where authority and detachment dictate “correct usage”’ (Campbell Reference Campbell2022, p. 49). Read through Campbell's lens of modernist orthodoxy, zero-onset Auto-Tune would qualify as an unorthodox usage. In setting up a dichotomy between the ‘correct usage’ of turntables and hip hop's ‘incorrect’ use of them, Campbell establishes a binary in which correct usage is accepted and incorrect usage is criticised. The difference with Auto-Tune, however, is that all usages are implicitly criticised. Many criticisms of Auto-Tune operate on the grounds that, when used as a pitch-correction device, Auto-Tune removes the humanness from the singing voice. These criticisms seem to conflate the under-the-radar, subtle pitch corrections that are made on nearly all pop records today with the overt, creative use of zero-onset Auto-Tune, which I have argued here is markedly distinct in its creation, production and aesthetic.

Indeed, many of the most pointed invectives against Auto-Tune were written well before Young Thug, Travis Scott and others began heavily using it. The late 2000s, when T-Pain and Kanye West were the most visible ‘unorthodox’ users of Auto-Tune, witnessed a surge in journalistic and online backlash against the software, as well as artistic outputs such as Jay-Z's famous 2009 single ‘D.O.A. (Death of Auto-Tune)’. The gradual recession of such negative feedback since then (excepting publications such as Rick Beato's Reference Beato2021 obituary for popular music) can be explained by Owen Marshall's observation that ‘headlines in tech journalism are typically reserved for the newest, most ground-breaking gadgets. Often, though, the really interesting stuff only happens once a technology begins to lose its novelty, recede into the background, and quietly incorporate itself into fundamental ways we think about, perceive, and act in the world’ (Reference Marshall2014). Writing in 2014 – again, before much of Thug and Scott's most innovative use of Auto-Tune – Marshall now seems almost prophetic: Auto-Tune is no longer a novelty, nor are the sonic traces of its zero-onset usage in trap music and its vocal ancestry, to which T-Pain and Kanye West certainly belong. Yet as the examples in this paper demonstrate, Auto-Tune's aesthetic impact on popular music continues to emerge and evolve.

What began as something radical and subversive – an incarnation of technological signifyin(g), perhaps – has now become commonplace not only in trap music, but also in many other quarters of mainstream popular music. In this way, zero-onset Auto-Tune belongs to a lineage of musical developments spurred by Black creators and eventually co-opted by white musicians and producers, a lineage that includes, broadly, the genres of blues, rock ’n’ roll, funk, disco, techno and hip-hop, as well as the performative, organological and technological innovations that came with them. Reading Auto-Tune's development in this context necessitates consideration of its impact within the hip-hop community. When Mackintosh provocatively argues that the music about which he writes in Neon Screams is not rap or hip hop, but a new genre entirely, he underscores the substantive stylistic divergence between cutting-edge Black musical practice in the 2020s and that of the final decades of the 20th century. Mackintosh hits on a divergence that has been widely panned by icons of earlier hip-hop generations: many artists active in the 1980s and 1990s have decried the direction in which trap music has recently taken hip hop (Grady Reference Grady2023). In summary, zero-onset Auto-Tune has driven two cycles of musical innovation: in evolving the broad, parent genre of hip-hop music, it also contributes to the ongoing development of mainstream popular music, because, as I have argued elsewhere, mainstream pop is intricately tied to currents in hip-hop music.Footnote 31 Put another way: what happens in hip-hop music tends to eventually happen in all popular music, laying the groundwork for tensions to arise when the cultural values, technological strategies and divergent markers of identity in trap music begin to permeate and intersect with a wider body of popular music that may not share those features.

Who and what are we hearing?

In a 2006 essay entitled ‘The death and life of digital audio’, Jonathan Sterne argues against contemporary critiques of digital audio as lacking the life-like qualities of live performance owing to the apparent loss of fidelity in digital audio formats such as the MP3 (Sterne, Reference Sterne2006). By his account, because MP3s are created through lossy compression, which discards portions of the audio signal deemed perceptually expendable, critics argue that this mediation removes the ‘life’ of the recording, rendering the MP3 ‘lifeless’. In offering a revisionist history of audio quality vis-à-vis recorded sound, Sterne resituates these critiques away from fidelity, writing that ‘fidelity is a metaphysical problem, based on the idea that a copy lacks some of the metaphysical “stuff” that an original sound once had’ (p. 340). By his reasoning, a more balanced critique of these formats must consider ‘the broader cultural formations through which recorded music moves’; digital audio, he argues, ‘must be understood within the contexts of its circulation’ (p. 339).

The critiques of the MP3 against which Sterne argues are markedly similar in scope to contemporary critiques of Auto-Tune – that its effects strip the voice of its life-like qualities. Because Auto-Tune removes some of the timbral and pitch-based fluctuations that make a singer's voice sound human, the Auto-Tuned voice – especially the zero-onset Auto-Tuned voice – bears ‘lifeless’ traits similar to those ascribed to the heavily compressed MP3 format. As described above, Sterne's solution to critiques of digital audio involves reconsidering the context in which digital audio is evaluated. We can apply that same logic to Auto-Tune, albeit slightly differently than Sterne does for the MP3: instead of focusing on which life-like qualities of the singer's voice are lost, we might focus on which life-like qualities the Auto-Tune instrument itself grafts on to the voice. As Auto-Tune searches for the ‘correct’ pitch in the singer's voice, it audibly lays bare all of its calculations. The resulting ‘digital melismas’ are the joint product of the singer's voice and Auto-Tune's ongoing find-and-replace functionality. The audio sounds imperfect; it sounds like a machine trying to do its job, and not always succeeding. We can almost hear Auto-Tune's effort in Lil’ Gotit's vocals on ‘Argentina’ (Reference Gotit2020), specifically on the lyrics ‘big racks’ (1:26). Auto-Tune's retune speed is audibly set at (or close to) zero for this recording. The zoomed-in spectrogram of Gotit's vocals in Example 4 shows that his (Auto-Tuned) pitch fluctuates extremely rapidly between D♯ and C♯. The sonic result is a kind of mechanical vibrato, one that could not easily be produced by an unmediated human voice alone. Mackintosh describes the process heard here: ‘[Auto-Tune will] rapidly oscillate between different notes as it desperately scrabbles to latch onto a stable pitch, meaning that vocals tremble and undulate with completely artificial vibrato’ (p. 34).
Instead of being used to fix inconsistencies in vocal performances, Auto-Tune in trap music exposes these inconsistencies further, and even creates new ones in the process. In this sense, what might have been considered mistakes are instead outcomes.

Example 4. Spectrogram of Lil’ Gotit's performance of the lyrics ‘big racks’ in the song ‘Argentina’ (2020). Note the rapid fluctuation between C♯4 (277.18 Hz) and D♯4 (311.13 Hz) on the lyric ‘racks’, as seen in the abrupt, step-like transpositions in the red and orange bands of the spectrogram
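The stepped contour visible in Example 4 can be illustrated with a short sketch. The following Python fragment is emphatically not Antares's proprietary algorithm; it is a minimal, hypothetical model (the function names `snap` and `retune` and the cents-based distance measure are my own assumptions) of how a retune speed of zero converts a smooth glissando into discrete jumps between permitted scale pitches:

```python
import math

def snap(freq_hz, scale_hz):
    """Return the scale frequency nearest to freq_hz, measured in cents."""
    return min(scale_hz, key=lambda s: abs(1200 * math.log2(freq_hz / s)))

def retune(freqs, scale_hz, retune_speed=0.0):
    """Pull each incoming pitch toward its nearest scale pitch.

    retune_speed is a smoothing factor between 0 and 1: at 0 ('zero-onset')
    the output jumps to the target pitch instantly, producing discrete
    steps; values approaching 1 glide slowly, approximating 'transparent'
    pitch correction.
    """
    out, current = [], freqs[0]
    for f in freqs:
        target = snap(f, scale_hz)
        # interpolate between the previous output pitch and the target
        current = target + (current - target) * retune_speed
        out.append(round(current, 2))
    return out

CS4, DS4 = 277.18, 311.13  # the two pitches visible in Example 4
# a smooth upward glissando from C#4 to D#4 in ten equal steps...
gliss = [CS4 * (DS4 / CS4) ** (i / 10) for i in range(11)]
# ...is rendered as an abrupt jump between exactly two pitches
print(retune(gliss, [CS4, DS4], retune_speed=0.0))
```

With `retune_speed=0.0` the printed contour contains only the two scale pitches, mirroring the abrupt, step-like bands in the spectrogram; raising the value lets the output glide through intermediate frequencies, approximating the transparent correction applied to most pop records.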

Yet Auto-Tune occasionally does sound like it is making mistakes, as during Lil’ Baby's performance on ‘Section 8’ (2018). Baby's performance here is, like many Auto-Tuned trap songs, quite limited in pitch range, lying almost exclusively between A3 and D4. On the lyrics ‘he got off’ (0:55), we hear the momentary introduction of a G3, heard nowhere else in the song. It sounds as though Auto-Tune guessed the ‘wrong’ destination pitch onto which to map the audio signal. Similarly, Young Thug's Auto-Tuned entry in ‘Yeah Yeah’ (discussed above) comes to rest on a prominent F♯5, a pitch that bears no structural relationship to the tonality of the beat (although this is by no means a requirement in the pitch realm of trap music) and has little to no bearing on the pitch content of Thug's ensuing verse, which occurs nearly an octave lower. Yet this illusion of error, unpredictability and imperfection is precisely what facilitates emotiveness, humanness and vulnerability in the Auto-Tuned trap voice. To be sure, Auto-Tune is not literally malfunctioning in these examples. Yet it may sound like it is, and that is where the illusion resides. Reynolds writes that ‘the lingering mystery [about Auto-Tune is] the extent to which the general public has adapted to hearing overly processed voices as the sound of lust, longing, and loneliness’. Certainly, the inconclusiveness and constant pitch fluctuation of the zero-onset Auto-Tuned voice play a role in this adaptation that Reynolds identifies. Following Brøvig-Hanssen and Danielsen's assertion that ‘when we emphasize the vulnerable aspects of technology, we humanize the machine’ (p. 126), to emphasise Auto-Tune's imperfections and illusory malfunctionality is to expose its vulnerability, to humanise it.Footnote 32 When we combine this notion with the depressingly and steadfastly human tropes of trap lyrics, amplified through the croaks, yelps and chants of its vocalists, perhaps we do indeed get what Mackintosh calls ‘a new incarnation of humanity only attainable through technology’ (p. 35). This ‘incarnation of humanity’ is perhaps what gives the zero-onset Auto-Tuned voice its life-like qualities.

One way to read Mackintosh's ‘new incarnation of humanity’ is through post-humanism. In concluding her analysis of Kanye West's ‘Stronger’, Nicola Dibben (Reference Dibben, Fabian, Timmers and Schubert2014) writes that ‘the combination of machine-like and human-like expressive characteristics can be seen as part of a Black post-humanism in which the technological is the locus of expression’ (p. 128). While such an assertion certainly resonates with the human/machine dynamics at play in zero-onset Auto-Tune, Dibben's Black post-humanism extends back much further in time than the hip-hop genre at large. Specifically, she cites the work of Alexander Weheliye (Reference Weheliye2002), who argues that such mechanised or technological post-humanism has plagued the Black voice since the advent of recording technology (Reference Weheliye2002, p. 25), and indeed extends right back to the denial of basic human rights to enslaved Black Americans, which rendered them subhuman in the eyes of the law. Situating this argument in a broader cultural vein, Sylvia Wynter asserts that historical conceptions of the human are entangled with epistemological histories that exclude persons who are not both white and male.Footnote 33 Consequently, those who fall outside this demographic are not afforded the means of representation as capable of embodying whatever it has traditionally meant to ‘be human’, and thus turn, in the case of trap, to technologically augmented modes of emotional expression. If this is true, then emotion, authenticity and vulnerability in the trap voice must be understood in full view of this historical denial of humanness.
Returning to Sterne's argument then, our judgements of zero-onset Auto-Tune, what it brings to vocal performances, what it replaces, and what it takes away, are framed in opposition to how we have judged the voice in other musical contexts; that is, we are compelled to understand Auto-Tune ‘within the [cultural] contexts of its circulation’ (p. 339).

In his 2017 book Posthuman Rap, Justin Adams Burton investigates how trap constructs inclusive space for communities excluded from a humanity that ‘favours whiteness, masculinity, heterosexuality, and fixed gender identities’ (abstract Ch. 1). According to Burton, in presenting as a largely Black genre, with artists like Young Thug publicly stating their belief in gender fluidity and otherwise dismantling rap's hypermasculine image, trap music finds itself squarely in this post-human realm. Auto-Tune thus helps give voice to a musical community whose otherness maps onto Burton's post-human. Indeed, Provenzano (Reference Provenzano, Fink, Latour and Wallmark2018) suggests that Auto-Tune might be heard as a way of rejecting the soundings of the white supremacist context of the North American music industry. Yet again, this concept is not new, and traces its roots further back in hip hop's history. As Rose writes of early hip-hop culture, ‘rap [was an] especially aggressive public display of counterpresence and voice. Each asserted the right to write – to inscribe one's identity on an environment that made legitimate avenues for material and social participation inaccessible’ (Reference Rose and Dagel Caponi1992, pp. 214–15). Trap's embrace of Auto-Tune is, in some sense, a reincarnation of this process, again a public display of counterpresence. This time, however, the counterpresence can be measured against early rap itself, with, as mentioned above, ample evidence that the pioneers of early rap have roundly criticised the Auto-Tune-infused direction in which contemporary trap music is leading the genre.

Although I have argued that the ‘what’ that we hear in the zero-onset Auto-Tuned voice is life-like, the ‘who’ that we hear is still unclear. For instance, whose emotion are we hearing in the zero-onset Auto-Tuned voice, in its life-like qualities? This lack of clarity can be unpacked with the help of Yvon Bonenfant's description of the ‘vocalic body’ (Reference Bonenfant2010).Footnote 34 In brief, Bonenfant describes the tactility of vocal timbre in how it travels from the body that produced it (the vocaliser) to the body that receives it (the listener). As sound waves emanate from a vocalising body, they are perceived by other bodies, but at this point the waves themselves are completely and literally disembodied from their source, while also being ‘fabricated by a living body and [carrying] a unique imprint of that body. When the sound reaches the listener, they must infer, invent, an assumed body, linked to the voice … there is the implication of a body and a representation of a body, but no flesh’ (p. 76). What body does a listener infer in the Auto-Tuned trap voice, where the vocalic body is so thoroughly mediated by a technological device?

Bonenfant's mapping of perceived vocals to the bodies that produced them necessitates some sort of emotional synergy between the vocaliser and the listener. For example, I can hear, even feel to some extent, 50 Cent's pain in his verse on ‘Hate it or Love it’ (The Game 2005), Tupac Shakur's sadness in ‘Brenda's Got a Baby’ (1991) and Vince Staples's conflicted emotions about his hometown of Long Beach in ‘Senorita’ (2015). More specifically, I can hear these artists feeling these emotions, through a process that is akin to Bonenfant's vocalic body. Advancements in artificial intelligence notwithstanding, it is safe to say that Auto-Tune itself is incapable of expressing emotion on its own, no matter how humanising its mediatory functionality might be. Yet according to Hildebrand, Auto-Tune was created to facilitate emotive singing; artists could focus on giving an emotionally charged performance without worrying about being perfectly in tune, knowing that Auto-Tune would do its work. Proponents of the zero-onset Auto-Tuned voice generally report that Auto-Tune is still doing just that: facilitating the conveyance of emotion.Footnote 35 The difference here, as mentioned earlier, is that Auto-Tune's original purpose stripped something away from vocals to facilitate emotion in singing, whereas zero-onset Auto-Tune is adding something to vocals to this end.

As Kit Mackintosh writes, ‘Auto-Tune enables emotional expression that artists are simply unable to achieve with their voices alone’ (p. 33). This complicates a listener's conception of the vocalic body, but it does not invalidate the concept altogether. We can hear the human voice – the trace of the singer's body – here; it is simply not all that we hear. Perhaps it is the post-human vocalic body we hear. Perhaps it is some sort of imaginary body, similar in concept to Paul Théberge's ‘imaginary sonic space’, where we know that the human voice cannot literally produce, on its own, the sound we hear, yet we are nonetheless able to imagine some human/machine hybrid producing it (which is what is really happening).Footnote 36 We as listeners, in a sense, are able to produce a conception of this hybrid, post-human body. As Nina Sun Eidsheim (Reference Eidsheim2018) argues, the voice and its timbre assume their identity and meaning in what the listener brings to them, rather than through some set of essential, intrinsic characteristics. Consequently, if listeners measure emotiveness, authenticity, humanness and any other life-like quality by how unmediated, or ‘dry’ (per Malawey), a recorded voice is, then of course zero-onset Auto-Tune will fall short in these measures (notwithstanding the fact that no recorded voice is completely unmediated). Our task, then, is to better appreciate how the vocalic body of the Auto-Tuned voice impacts our own bodies as listeners, asking questions posed by Arnie Cox (Reference Cox2011) such as ‘what's it like to do that’ or ‘what's it like to be that’? More generally, in listening to the Auto-Tuned voice we would do well to evaluate ourselves as much as, if not more than, we evaluate it.

Auto-Tune's bidimensional emotive functionality is aptly summed up by Reynolds, who writes that the ‘taste for and revulsion against Auto-Tune are part of the same syndrome, reflecting a deeply conflicted confusion in our desires: simultaneously craving the real and the true while continuing to be seduced by digital's perfection and the facility and flexibility of use it offers’ (2018). Perhaps some of zero-onset Auto-Tune's most progressive champions, like Young Thug, aim to fulfil both such desires: expressing ‘real’ and ‘true’ emotion through their vocals while navigating the threshold between automated ‘perfection’ and human ‘imperfection’. Occasionally this means not sounding like oneself, if we subscribe to Lauren Levy's assertion that Travis Scott's ‘most admirable accomplishment is the ways in which he has modified his voice to sound like everyone else than himself’ (Reference Levy2018). (This resonates with Snoop Dogg's criticism of contemporary trap music: when listening to Future, Migos or Drake, ‘I don't know who is who when the record is over’ (Grady Reference Grady2023).) In Young Thug's verse on ‘Abracadabra’, the timbral fluctuation in his voice on the lyrics ‘I'm making sure they missin’’ (1:12) is undeniable. The melisma that follows Thug's utterance of ‘missin’’ meanders into a digital soup of his trailing-off voice (to the point where it cracks slightly) and the frantic oscillation of pitch brought on by Auto-Tune in response to his wavering vocal pitch. Given our knowledge of Thug's meticulous and prolific working methods, it is almost certain that he consciously chose this take for the final version of the song. Indeed, his very next line, ‘Um, listen, I was littin’’, follows the same melodic contour but ends with a much more assured-sounding melismatic descent.
The mixture of subtle but eminently audible Auto-Tuned fluctuations with timbral inconsistencies in Thug's voice imbues the performance with an emotive power arguably as potent as that of any unmediated performance – to arrive at this conclusion, we simply must refocus where we look for this emotive power, to whom we ascribe it and what, as listeners, we project onto it.

Conclusion

In this paper I have explored zero-onset Auto-Tune in four realms: timbral, organological, technological and emotional. I began by summarising Auto-Tune's general functionality and showing how its use in trap music departs from this functionality, primarily through retune speed (hence the ‘zero-onset’ concept). I then unpacked how zero-onset Auto-Tune does, and does not, impact vocal timbre. Its rapid adjustment of pitch counters the constancy of singers’ formants, creating abrupt timbral shifts wherein the strength of specific overtone partials is artificially held constant across varying pitches. This factor, combined with the immediacy of timbral constancy in zero-onset Auto-Tune (not a natural feature of the human voice), contributes to the slight but very audible timbral modifications it facilitates. Following this discussion, I argued that zero-onset Auto-Tune should be considered a digital, compositional and collaborative instrument, using organology scholarship to support this argument. Because of Auto-Tune's role as a modifier of the sounds produced by the human voice, I propose that it should be considered a part – however prosthetic – of the vocal instrument ecosystem. Furthermore, because vocalists and producers engage with this part in real time, the vocal ecosystem, by virtue of Auto-Tune's participation in it, becomes a compositional, collaborative instrument.

I then situated zero-onset Auto-Tune in a lineage of technological repurposings that have punctuated hip-hop music's development since the 1970s. This approach not only encourages us to understand it as closely linked to hip hop's earlier technology-driven aesthetics, but also situates any negative reaction to it as a reincarnation of the depressingly perennial criticism of new developments in the hip-hop genre. I continued this discussion in the final section of the article, expanding on the notion of hearing the ‘human behind the voice’ when it is mediated so heavily with zero-onset Auto-Tune. Here I identified that the human–machine interaction can be considered the locus of emotiveness in zero-onset Auto-Tuned vocals, and that post-humanist scholarship has already engaged this notion in the context of other recorded, and specifically Black recorded, musical genres. Finally, in countering criticism of the apparent ‘dearth of life’ in these vocals, I questioned whether, as Sterne argued for the MP3, we as listeners can become sensitised to emotion and expressivity in the zero-onset Auto-Tuned voice by adjusting our traditional criteria for these vocal qualities.

This work – and other recent scholarship on Auto-Tune – raises many unanswered questions. Here I have limited my analysis to studio-recorded albums and the process of studio recording itself. Live performances of trap music, complete with zero-onset Auto-Tuned vocals, represent a drastically different context ripe for exploration. Many trap artists perform with Auto-Tune running in real time (just as they record), but the energy and unpredictability of performing in front of thousands of fans, without the comfort of multiple studio takes, forces more variable sonic outcomes in Auto-Tuned vocals.Footnote 37 While artists performing in genres where Auto-Tune is more taboo might have reason to worry about this variability (think of instances where singers have been ‘exposed’ in live events as having substandard pitch control), in trap music this is not an issue: Auto-Tune's presence is obvious, and much less currency is placed on being ‘able to sing’ (per Young Thug's comments in the introduction) with an unmediated voice. Nevertheless, the stark differences between recorded and live performance in trap music remain fertile ground for exploring zero-onset Auto-Tune's effects on musical aesthetics.

Further questions might involve the reception of zero-onset Auto-Tune with respect to singer gender and identity. Provenzano (2019) has already questioned the sympathy male-presenting singers have disproportionately received over female-presenting singers for using Auto-Tune, and gendered standards of acceptable vocal production may have something to do with this. Such standards intersect curiously with standards of singing in trap music. For example, Beyoncé's zero-onset Auto-Tuned voice on The Carters’ 2018 song ‘APESHIT’ is better read as an homage to trap music than as a tool to assist her pitch precision. Her use of it here is telling in several ways: the song features a trap beat, and Beyoncé raps/sings in triplet flow (a hallmark of trap). We might go so far as to say that not using Auto-Tune in this context would be stylistically incongruent. We know Beyoncé ‘can sing’ in the traditional sense, but that is not what zero-onset Auto-Tune is about, not here and not anywhere in trap.

Further work on this topic could situate Auto-Tune more thoroughly within the broader aesthetics and cultural positioning of trap music. Zachary Wallmark's (2022) analysis of vocables in the music of Megan Thee Stallion provides an important precedent for exploring the ad-lib-heavy vocal styles of many trap artists, who happen to produce these ad-libs with zero-onset Auto-Tune. Furthermore, trap's broader musical template involves a certain degree of sonic and timbral heterogeneity, a concept thoroughly explored by Olly Wilson in his famous 1992 essay ‘The heterogeneous sound ideal in African American music’. Wilson's concepts could be convincingly applied to trap music, and the way trap vocals are processed and mixed would be central to that application.

Tricia Rose eloquently sums up her ethnographic work with hip-hop producers active in the 1980s by writing that these producers ‘actively and aggressively deploy strategies that revise and manipulate musical technologies so that they will articulate black cultural priorities’ (1994, p. 78). The cultural priorities of which Rose speaks are encapsulated in early hip hop's continuation of Jamaican bass culture (the constant drive for the fattest, lowest bass sounds possible), its sampling or ‘versioning’ of previously created musical works, and its full exploration of rhythmic repetition and nuance in beat creation. When hip-hop music's evolution is contextualised within this assertion, it becomes easy to see how Auto-Tune's repurposing partakes in that evolution; less obvious, perhaps, are the cultural priorities that Auto-Tune espouses. Such priorities are best discerned through ethnography with the creators and consumers of trap music, necessarily confronting issues of social heterogeneity (Holt 2020) in this pursuit.Footnote 38 So far, my search for instances where trap rappers discuss the technical or practical aspects of using zero-onset Auto-Tune has turned up scant material; I hope to incorporate practitioner testimony into my research, and eventually to interview MCs and producers who have experience using Auto-Tune in real time in the studio. At the same time, I, and others envisaging ethnographic research in hip-hop music, must consider what Édouard Glissant (1997) has called ‘the right to opacity’: for our purposes, an artist's right to resist attempts to make their creations legible in an analytic system (in this case a Eurocentric one) that risks imposing a power relationship onto their work.Footnote 39 I engage with this topic as an outsider to both the trap genre and its musical practices, and foregrounding voices from within the trap community must be an essential element of my work in the long term.

In view of the close-miked recordings of crooners like Bing Crosby, the slap-echo on Elvis Presley's recorded vocals and the double-tracking of John Lennon's voice, Reynolds (2018) asks, of Auto-Tune's critics, whether there have ever been truly ‘natural’ vocals in recorded popular music. In each of those cases, technology made something impossible sound real and natural. If we understand Auto-Tune as a compositional instrument that facilitates the conveyance of emotion in a genre that has never purported to ‘know how to sing’, we might question our own beliefs about what constitutes talent, authenticity and creativity in music.

Footnotes

1 In becoming something of a generic trademark, like Kleenex or Band-Aid, the term ‘Auto-Tune’ is frequently used as a stand-in for any pitch correction software. In this article, I am writing primarily about Auto-Tune, as it is the only pitch correction software (to my knowledge) that offers real-time functionality.

2 In the original patent for Auto-Tune, Hildebrand writes that ‘when voices or instruments are out of tune, the emotional qualities of the performance are lost’ (1997).

3 ‘Twinnem’ is a term used to describe a lifelong bond between two people who share remarkable similarities and sympathies with one another, like twins, from which the term derives.

4 For more detail on how this covert effect is achieved, see Hoffman (2021).

5 It did not take long for the hip-hop community, and the wider public, to associate T-Pain with Auto-Tune to the point of synonymity. On a recent podcast (Nappy Boy 2022), the rapper Snoop Dogg discussed the recording of his 2007 single ‘Sensual Seduction’ (original title ‘Sexual Eruption’) – where his sung vocals are heavily Auto-Tuned – referring to the process of adding Auto-Tune to his vocals as ‘T-Paining’ them. T-Pain's appearance on The Ellen Show in 2009 and cameo in The Lonely Island's 2009 single ‘I'm on a Boat’ bear witness to his then-newfound popularity through his use of Auto-Tune, but also risk caricaturing his reputation. Indeed, as discussed by Catherine Provenzano (2018), T-Pain's first years after Rappa Ternt Sanga was released were marked by critical and public derision towards his use of Auto-Tune.

6 Malawey (2020) proposes a spectrum upon which the degree of technological mediation of recorded vocals can be characterised by ‘dry’ (completely unmediated), ‘wet’ (thoroughly mediated) and points therebetween.

7 Certain vocal techniques can evoke the sound of zero-onset Auto-Tune without actually using it, such as ‘sideways yodelling’, the technique of moving between chest and head voice on pitches very close in register. Extreme examples involve using head voice for pitches lower than those used with chest voice. The resulting quality of the change between pitches gives it an immediate, unnatural sound, like zero-onset Auto-Tune.

8 As an audio effect, a bit crusher creates distortion in an audio signal by reducing its sample rate or resolution.
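The two operations this footnote names can be modelled in a few lines. This is only an illustrative toy (the function name and parameter choices are my own, not any particular plugin's implementation): amplitudes are quantised to a coarse grid (resolution reduction) and every Nth sample is held (sample-rate reduction).

```python
import numpy as np

def bitcrush(signal, bits=4, downsample=4):
    """Toy bit crusher: quantise amplitudes in [-1, 1] to 2**bits
    levels, then hold every Nth sample to mimic a lower sample rate."""
    step = 2.0 / (2 ** bits)                 # amplitude grid spacing
    crushed = np.round(signal / step) * step  # resolution reduction
    # sample-rate reduction: keep every Nth sample, hold it N frames
    held = np.repeat(crushed[::downsample], downsample)[:len(signal)]
    return held

t = np.linspace(0.0, 1.0, 1000)
sine = np.sin(2 * np.pi * 5 * t)
out = bitcrush(sine, bits=3, downsample=8)
```

With `bits=3` the output can only take nine distinct amplitude values, which is what introduces the audible distortion.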

9 Such misconstruals go some way in explaining the massive backlash T-Pain received following his brief but intense run of top 20 singles in the late 2000s (see Tinsley 2021).

10 For a more detailed discussion on how vocal physiology creates and affects timbre in the singing voice, see Heidemann (2016) and Marchand Knight (2020).

11 A recent salvo in this debate is provided in Marchand Knight et al. (2023). Specifically, the authors advocate against essentialising vocal timbres based on physical appearances, as happens frequently in the opera world through the fach system, a German invention for categorising voice types in opera.

13 Specifically, Young writes that ‘because auto-tuning software doesn't correct formants, when it is used to track the pitches to an extreme degree, the voice starts to sound artificial’ (2016, p. 96).

14 Provenzano's assertion that the voice's timbre ‘stays intact’ is thus, like Young's point, true, but this very intactness is the source of Auto-Tune's effect on vocal timbre.

15 Sillitoe (1999) documented the recording process of ‘Believe’ through a conversation with the song's producers Mark Taylor and Brian Rawling, who reveal that the Auto-Tune on Cher's voice was added after the recording session, in post-production. Curiously, the producers were hesitant to admit Auto-Tune was used (they instead claimed the vocals were processed using a Korg vocoder), exemplifying the trepidation producers felt about using Auto-Tune in the late 1990s.

16 Lobad (2020) characterises contemporary Auto-Tune usage by writing that ‘today, rappers hit the studio with Auto-Tune already geared up and in full-effect as they record, rather than adding it to a track during the editing process’.

17 Provenzano (2019) states that Auto-Tune is the only pitch correction software capable of functioning with real-time input, as would be the case here.

18 Producer Alex Tumay (Red Bull Music Academy 2016, 19:00) describes Young Thug's recording process as constituting long sessions of trying out different vocals and vocalisations and fitting some of them together as the final product. With Auto-Tune always running, Thug's creative process is thus highly experimental: an ongoing, real-time cycle of trying things and evaluating how they sound.

19 Miranda and Wanderley (2006) stress the importance of considering mappings between action and sound in the context of new digital instruments.

20 Hip-hop producer Bainz states that, in his experience, ‘[in studio sessions] there usually isn't an Auto-Tune discussion, it's there by default. It's just there’ (Auto-Tune/Antares Audio Technologies 2023, 1:40).

21 Experimental vocalist Pamela Z's embrace of technology outlines a similar relationship: ‘highly idiomatic to Z's particular type of vocal performance, in many ways her electronic tools serve as an extension of the vocal instrument rather than an external instrument’ (Lansang 2019).

22 Several studio engineers discuss the importance of using Auto-Tune pre-sets when recording vocals for sessions with trap artists (see TheWavMan 2019; Kids Take Over 2021; and Sky Jordxn 2022).

23 As Provenzano writes of one hip-hop producer she interviewed, ‘he mentioned too how in many sessions, engineers work with the artist in real time, pulling up plugins as the session unfolds and as the performer performs in order to get the right sound’ (2018, p. 378). Reynolds (2018) quotes producer Chris ‘TEK’ O'Ryan as claiming ‘I'm embellishing – hearing what they [the singers are] doing in the booth and following their lead’.

24 This workflow invokes Bortolon-Vettor's notion of mixer-as-improviser (2023).

25 Reynolds (2018) reports on the importance of this level of comfort in the case of Young Thug.

26 In his 2014 NPR Music Tiny Desk concert, T-Pain opened the programme with pointed commentary on his use of Auto-Tune, drawing attention to the fact that he was not using it for that live performance and describing the experience of singing without it as ‘weird as hell’.

27 In a similar vein beyond hip hop, D'Errico states that ‘the emergence of software in DJ culture marked yet another moment in which a new technology negotiated its changing status as a musical instrument’ (2016, p. 3).

28 Kabango (2016) and Patrin (2020) both extensively document the advent of turntabling and sampling, situating these two practices as foundational in hip hop's musical genesis and subsequent evolution.

29 The Merry-go-round technique is attributed to DJ Kool Herc, who, in the late 1970s, developed an idiosyncratic method of selecting breakbeats and adeptly weaving them together live using two turntables and a crossfader.

30 Rose (1994, p. 79) cites Marley Marl's testimony to this effect.

31 See Duinker (2020), where I focus on how song form can be used as a lens to view hip-hop music's ongoing mainstreaming throughout its history, and the tension between that process being either a subversive take-over or a co-opting.

32 Here the authors are summarising a position taken by Nicola Dibben in her work on the Icelandic singer Björk (2009).

33 See McKittrick (2015), where the editor/author discusses this matter at length with Wynter.

34 Bonenfant's work draws on earlier research by the scholar who coined this term, Steven Connor (2000).

35 Although, as Provenzano has argued, positive reception of Auto-Tune as a facilitator of emotive singing is disproportionately directed to male singers. See Provenzano (2019, Ch. 5).

36 See Théberge (2018), where the author discusses the tangible perception of non-real sonic spaces using added reverb in recorded sound.

37 Here I am referencing the ethnographic work of Danielle Davis (2023) on listener and fan experiences of timbre in live performances of southern hip hop, especially trap.

38 Holt points out that scholars must be careful about whom they include and exclude when they explore what hip-hop music encompasses, what it means, and what it represents. These explorations often take the notion of community (i.e. who is being referred to in the context of artists, producers, consumers and listeners) for granted.

39 I adapt this idea from Toru Momii's (2021) work analysing the music of Mitski Miyawaki.

References

Auto-Tune/Antares Audio Technologies. 2023. ‘2023 GRAMMY interview: Bainz’, YouTube, https://www.youtube.com/watch?v=hlFGoukWWX8 (accessed 10 October 2023)
Beato, R. 2021. ‘Modern music's death by Auto-Tune’, YouTube, https://www.youtube.com/watch?v=NNXg5dIVC1M&ab_channel=RickBeato (accessed 4 April 2023)
Bijsterveld, K., and Pinch, T. 2003. ‘“Should one applaud?”: breaches and boundaries in the reception of new technologies in music’, Technology and Culture, 44/3, pp. 536–59
Bonenfant, Y. 2010. ‘Queer listening to queer vocal timbres’, Performance Research, 15/3, pp. 74–80
Bortolon-Vettor, E. 2023. ‘“Our album”: improvisation, creative process, and invisible labour in To Pimp a Butterfly’, paper presented at the IASPM Canada Conference, Quebec City QC, 18–21 May
Brøvig-Hanssen, R., and Danielsen, A. 2016. Digital Signatures: The Impact of Digitization on Popular Music Sound (Cambridge, MA, MIT Press)
Burton, J. A. 2017. Posthuman Rap (New York, Oxford University Press)
Campbell, M. V. 2022. Afrosonic Life (New York, Bloomsbury Academic)
Connor, S. 2000. Dumbstruck: A Cultural History of Ventriloquism (New York, Oxford University Press)
Considine, J. D. 1992. ‘Fear of a rap planet’, Musician, February, pp. 34–43
Cox, A. 2011. ‘Embodying music: principles of the mimetic hypothesis’, Music Theory Online, 17/2, https://mtosmt.org/issues/mto.11.17.2/mto.11.17.2.cox.html
Davis, D. 2023. ‘Live in Atlanta: listening to black popular music performances in concert’, poster presented at TIMBRE 2023 Conference, Thessaloniki, 10–12 July
D'Errico, M. 2016. ‘Interface aesthetics: sound, software, and the ecology of digital audio production’, PhD dissertation, University of California Los Angeles
De Souza, J. 2017. Music at Hand: Instruments, Bodies, and Cognition (Oxford, Oxford University Press)
Dibben, N. 2009. Björk (Bloomington, IN, Indiana University Press)
Dibben, N. 2014. ‘Understanding performance expression in popular music recordings’, in Expressiveness in Music Performances: Empirical Approaches Across Styles and Cultures, ed. Fabian, D., Timmers, R., and Schubert, E. (Oxford, Oxford University Press), pp. 117–32
Duinker, B. 2020. ‘Song form and the mainstreaming of hip-hop music’, Current Musicology, 107, https://doi.org/10.52214/cm.v107i.7177
Eidsheim, N. 2018. The Race of Sound (Durham, NC, Duke University Press)
Everett-Green, R. 2006. ‘Ruled by frankenmusic’, The Globe and Mail, 14 October, https://www.theglobeandmail.com/arts/ruled-by-frankenmusic/article969945/ (accessed 10 October 2023)
Gibson, J. 1966. The Senses Considered as Perceptual Systems (Boston, MA, Houghton Mifflin)
Glissant, É. 1997. Poetics of Relation, trans. Wing, B. (Ann Arbor, MI, University of Michigan Press)
Grady, M. C. 2023. ‘22 older rappers’ complaints about hip-hop and where it's headed’, XXL, 8 December, https://www.xxlmag.com/older-rappers-complain-hip-hop/
Heidemann, K. 2016. ‘A system for describing vocal timbre in popular song’, Music Theory Online, 22/1, https://mtosmt.org/issues/mto.16.22.1/mto.16.22.1.heidemann.html
Hildebrand, H. A. 1997. ‘Pitch detection and intonation correction apparatus and method’, United States Patent US5973252A, https://patents.google.com/patent/US5973252A/en (accessed 31 October 2023)
Hoffman, C. 2021. ‘How to perfectly Auto-Tune vocals in 7 steps’, Black Ghost Audio, https://www.blackghostaudio.com/blog/how-to-perfectly-auto-tune-vocals-in-7-steps (accessed 26 September 2023)
Holt, K. 2020. ‘Emcee ethnographies: a brief sketch of U.S. hip-hop ethnography’, Current Musicology, 105, https://doi.org/10.7916/cm.v0i105.5400
Jensenius, A. R. 2022. Sound Actions: Conceptualizing Musical Instruments (Cambridge, MA, MIT Press)
Kabango, S. 2016. Hip-Hop Evolution (HBO Canada)
Kids Take Over. 2021. ‘Bainz on being Young Thug's engineer, Punk Album, Future Stories (interview)’, YouTube, https://www.youtube.com/watch?v=2xCFK7kvUhs&ab_channel=KidsTakeOver (accessed 26 September 2023)
Lansang, R. L. 2019. ‘Songs for contemporary voices: perspectives and strategies of women making music in the twenty-first century’, PhD dissertation, Rutgers University
Levy, L. 2018. ‘The sneaky power of Travis Scott's voice’, The Fader, https://www.thefader.com/2018/08/08/travis-scotts-astroworld-real-voice (accessed 26 September 2023)
Lobad, N. 2020. ‘Auto-Tune in hip-hop: a brief history from T-Pain to Future’, Hot New Hip Hop, https://www.hotnewhiphop.com/313576-auto-tune-in-hip-hop-a-brief-history-from-t-pain-to-future-news (accessed 8 October 2023)
Mackintosh, K. 2021. Neon Screams: How Drill, Trap, and Bashment Made Music New Again (London, Repeater)
Malawey, V. 2020. A Blaze of Light in Every Word: Analyzing the Popular Singing Voice (New York, Oxford University Press)
Marchand Knight, J. 2020. ‘The singer's formant’, in Timbre and Orchestration Resource, ed. Soden, K., Duinker, B. and Gutiérrez Martínez, A., https://timbreandorchestration.org/writings/timbre-lingo/2019/12/10/the-singers-formant (accessed 29 September 2023)
Marchand Knight, J., Sares, A., and Deroche, M. 2023. ‘Visual biases in evaluation of speakers’ and singers’ voice type by cis and trans listeners’, Frontiers in Psychology, 14, https://doi.org/10.3389/fpsyg.2023.1046672
Marshall, O. 2014. ‘A brief history of Auto-Tune’, Sounding Out!, https://soundstudiesblog.com/2014/04/21/its-about-time-auto-tune/ (accessed 4 November 2023)
McKittrick, K. (ed.) 2015. Sylvia Wynter: On Being Human as Praxis (Raleigh, NC, Duke University Press)
Miranda, E., and Wanderley, M. 2006. New Digital Musical Instruments: Control and Interaction Beyond the Keyboard (Middleton, WI, A-R Editions)
Momii, T. 2021. ‘Music analysis and the politics of knowledge production: interculturality in the music of Honjoh Hidejirō, Miyata Mayumi, and Mitski’, PhD dissertation, Columbia University
Nappy Boy. 2022. ‘Snoop Dogg explains to T-Pain why he used Auto-Tune on Sensual Seduction’, Nappy Boy Radio Podcast, https://www.youtube.com/watch?v=NSwZ7gmZ7JE (accessed 30 September 2023)
NPR Music. 2014. ‘T-Pain: NPR Music Tiny Desk Concert’, https://youtu.be/CIjXUg1s5gc?si=xe3wm3tRya30w9a1 (accessed 26 April 2024)
Patrin, N. 2020. Bring That Beat Back: How Sampling Built Hip-Hop (Minneapolis, MN, University of Minnesota Press)
Provenzano, C. 2018. ‘Auto-Tune, labor, and the pop-music voice’, in The Relentless Pursuit of Tone: Timbre in Popular Music, ed. Fink, R., Latour, M. and Wallmark, Z. (New York, Oxford University Press), pp. 159–83
Provenzano, C. 2019. ‘Emotional signals: digital tuning software and the meanings of pop music voices’, PhD dissertation, New York University
Red Bull Music Academy. 2016. ‘Young Thug engineer Alex Tumay on recording | Red Bull Music Academy’, YouTube, https://www.youtube.com/watch?v=QvLYckq_BWE&ab_channel=RedBullMusicAcademy (accessed 26 September 2023)
Reynolds, S. 2018. ‘How Auto-Tune revolutionized the sound of popular music’, Pitchfork, https://pitchfork.com/features/article/how-auto-tune-revolutionized-the-sound-of-popular-music/ (accessed 15 July 2023)
Rose, T. 1992. ‘Flow, layering, and rupture in post-industrial New York’, in Signifyin(g), Sanctifying, & Slam Dunking: A Reader in African American Expressive Culture, ed. Dagel Caponi, G. (Amherst, MA, University of Massachusetts Press), pp. 191–221
Rose, T. 1994. Black Noise (Middletown, CT, Wesleyan University Press)
Sillitoe, S. 1999. ‘Recording Cher's “Believe”’, Sound on Sound, https://www.soundonsound.com/techniques/recording-cher-believe (accessed 4 October 2023)
Sky Jordxn. 2022. ‘GUNNA and YOUNG THUG AUTOTUNE SETTINGS EXPOSED’, YouTube, https://www.youtube.com/watch?v=91tgyTCI63E&ab_channel=SkyJordxn (accessed 26 September 2023)
Stephenson, W. 2013. ‘Gen F: Young Thug’, The Fader, https://www.thefader.com/2013/05/08/gen-f-young-thug (accessed 4 July 2023)
Sterne, J. 2006. ‘The death and life of digital audio’, Interdisciplinary Science Reviews, 31/4, pp. 338–48
Théberge, P. 2018. ‘The sound of nowhere: reverb and the construction of sonic space’, in The Relentless Pursuit of Tone: Timbre in Popular Music, ed. Fink, R., Latour, M. and Wallmark, Z. (New York, Oxford University Press), pp. 323–44
TheWavMan. 2019. ‘How to sound like Young Thug vocal effect tutorial! FL Studio’, YouTube, https://www.youtube.com/watch?v=dAQxDHW71mQ&ab_channel=TheWavMan (accessed 26 September 2023)
Tinsley, J. 2021. ‘T-Pain popularized Auto-Tune, but it came at a cost’, Andscape, https://andscape.com/features/t-pain-popularized-auto-tune-but-it-came-at-a-cost/ (accessed 26 September 2023)
Wallmark, Z. 2022. ‘Analyzing vocables in rap: a case study of Megan Thee Stallion’, Music Theory Online, 28/2, https://www.mtosmt.org/issues/mto.22.28.2/mto.22.28.2.wallmark.html
Weheliye, A. G. 2002. ‘“Feenin”: posthuman voices in contemporary Black popular music’, Social Text, 20/2, pp. 21–47
Wilson, O. 1992. ‘The heterogeneous sound ideal in African American music’, in Signifyin(g), Sanctifying, & Slam Dunking: A Reader in African American Expressive Culture, ed. Dagel Caponi, G. (Amherst, MA, University of Massachusetts Press), pp. 157–71
Young, M. 2016. Singing the Body Electric: The Human Voice and Sound Technology (Farnham, Taylor & Francis)
Cher, ‘Believe’. Believe. Warner Bros. 1998
Coi Leray, ‘Twinnem’. Trendsetter. Uptown. 2022
Jay-Z, ‘DOA (Death of Auto-Tune)’. The Blueprint 3. Roc-A-Fella. 2009
Kanye West, ‘Stronger’. Graduation. Roc-A-Fella. 2007
Lil' Baby, ‘Section 8’. Street Gossip. Quality Control. 2018
Lil' Gotit, ‘Argentina’. Superstar Creature. Alamo. 2020
T-Pain, ‘I'm ’n Luv (wit a Stripper)’. Rappa Ternt Sanga. Konvict. 2005a
T-Pain, ‘I'm Sprung’. Rappa Ternt Sanga. Konvict. 2005b
The Carters, ‘APESHIT’. Everything is Love. Parkwood. 2018
The Game ft. 50 Cent, ‘Hate it or Love it’. The Documentary. Aftermath. 2005
Travis Scott ft. Young Thug, ‘Yeah Yeah’. Days Before Rodeo. Grand Hustle. 2014
Tupac Shakur, ‘Brenda's Got a Baby’. 2Pacalypse Now. TNT. 1991
Vince Staples, ‘Senorita’. Summertime ’06. ARTium. 2015
Young Thug, ‘Best Friend’. Slime Season. 300. 2015
Young Thug ft. J. Cole & Travis Scott, ‘The London’. So Much Fun. 300. 2019
Young Thug ft. Travis Scott, ‘Abracadabra’. Business is Business. YSL. 2023
Example 1. Excerpt from ‘I'm Sprung’ (T-Pain, 2005)

Example 2. Spectrogram of sung straight tone (performed by the author) without (above) and with (below) Auto-Tune applied to the audio signal

Example 3. Workflow diagram for Auto-Tune as it is used in trap music

Example 4. Spectrogram of Lil' Gotit's performance of the lyric ‘Big Racks’ in the song ‘Argentina’ (2020). Note the rapid fluctuation between C#4 (277.18 Hz) and D#4 (311.13 Hz) on the lyric ‘Racks’, as seen in the abrupt, step-like transpositions in the red and orange bands of the spectrogram