
Bias and stereotyping: Human and artificial intelligence (AI)

Published online by Cambridge University Press: 25 July 2025

Okim Kang*
Affiliation:
Program of Applied Linguistics, Department of English, Northern Arizona University, Flagstaff, Arizona, USA
Kevin Hirschi
Affiliation:
Program of Applied Linguistics, Department of English, Northern Arizona University, Flagstaff, Arizona, USA
Department of Bicultural-Bilingual Studies, The University of Texas, San Antonio, USA
*Corresponding author: Okim Kang; Email: Okim.Kang@nau.edu

Abstract

As social and educational landscapes continue to change, especially around issues of inclusivity, there is an urgent need to reexamine how individuals from diverse linguistic backgrounds are perceived. Speakers are often misjudged because of listeners’ stereotypes about their social identities, resulting in biased language judgments that can limit educational and professional opportunities. Much research has demonstrated listeners’ biases toward L2-accented speech: accented utterances are perceived as less credible, less grammatical, or less acceptable for certain professional positions. Artificial intelligence (AI) technology has since emerged as a viable alternative for mitigating listeners’ biased judgments. It serves as a tool for assessing L2-accented speech and for establishing intelligibility thresholds for accented speech, and AI facial-analysis systems are also used to assess characteristics such as gender, age, and mood. However, these AI systems may themselves hold racial or accent biases. Accordingly, the current paper discusses both human listeners’ and AI systems’ biases toward L2 speech, illustrating these phenomena in various contexts. It concludes with specific recommendations and future directions for research and pedagogical practice.

Information

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press.

