1. Introduction
The proliferation of digital presentations in the Web 2.0 era has given rise to the popularization of a conceptual framework known as “digital oratory” (Lind, 2012) or “new oratory” (Rossette-Crake, 2020). Digital oratory is a combination of public speaking and digital platforms or tools (Sharma, 2023). It lies somewhere between traditional face-to-face public speaking and media production, providing opportunities for speakers to communicate with both live and online audiences (Lind, 2012). As a subfield of computer-assisted language learning (CALL), digital oratory fundamentally involves the integration of digital platforms into teaching methods (Cotellese, Bornak & Davidson, 2015), learning resources (Chang & Huang, 2015; Quagliata, 2014), and assignment design (Hayward, 2017; Lind, 2012). While the use of videos in technology-assisted language learning varies from passive practices, like watching videos, to active integration, such as student-produced video artifacts, using both open and closed platforms, and targets a wide range of skills (Yu & Zadorozhnyy, 2022), digital oratory encourages language learners to actively get involved in speech video production. With the increasing accessibility of built-in smartphone applications that enable users to create their own videos, digital oratory has emerged as a crucial component in L2 speaking pedagogy.
The rise of digital communication has transformed the landscape of public speaking, demanding that students adapt their nonverbal communication skills to the digital realm. Effective communication skills, especially in public speaking, play a crucial role in professional, academic, and social settings. Nonetheless, English as a second language (ESL) students encounter distinct challenges when communicating in a non-native language, particularly concerning nonverbal communication cues, which could significantly impact their message delivery and overall persuasive efficacy. For instance, ESL students may display closed or uptight postures, restless gestures, or lack of facial expression that hinder the effectiveness of their presentations (Gregersen, 2007; Gullberg, 2006). The fear of making language errors and self-consciousness about their accent can cause significant anxiety, undermining students’ confidence and ability to present effectively (Mak, 2011; Woodrow, 2006). While previous research has focused on speaking anxiety and L2 nonverbal performance, there is a paucity of research investigating the interplay of these two variables in the context of digital oratory. To address this gap, the present study extends the extant literature by examining whether and to what extent EFL students’ public speaking anxiety (PSA) influences nonverbal speech performance, with a specific focus on digital oratory.
The convergence of digital and traditional communication has created a compelling need to integrate digital oratory into public speaking pedagogy. Despite a substantial record of studies investigating L2 speaking anxiety and nonverbal communication (e.g. Diadori, 2024; Lindberg, McDonough & Trofimovich, 2021; Maher & King, 2020; McDonough, Kim, Uludag, Liu & Trofimovich, 2022), the focus has predominantly been on face-to-face interactions during class discussions or conversations in L2 classrooms. Recent years have witnessed a growing interest in exploring L2 students’ speaking performance in digital contexts (e.g. Asadnia & Atai, 2022; Bobkina & Domínguez Romero, 2022; Bobkina, Domínguez Romero & Gómez Ortiz, 2023; Chen, 2021, 2024). However, the interplay between L2 students’ PSA and their nonverbal performances in digital settings remains under-researched. To address this research gap, we pose the following questions:
RQ1. To what extent do four factors (cognitive, physiological, behavioral, and technical) affect L2 speakers’ digital oratory anxiety?
RQ2. How do L2 speakers perform in terms of nonverbal speech delivery in digital settings?
RQ3. How does digital oratory anxiety affect L2 speakers’ nonverbal speech performance in digital settings?
2. Digital oratory: Concept and framework
Digital oratory, a novel form of public speech disseminated through online media platforms (Lind, 2012), harnesses digital tools and technologies to deliver impactful presentations. It merges the traditional principles of effective public speaking with the use of digital media, such as videos, slideshows, and interactive multimedia, to illustrate key points or convey complex information. This emerging mode of public address is prominently observed on popular social media channels, such as TikTok, Instagram, and YouTube, and involves “formats that are typically relayed via videos uploaded to the Internet” (Rossette-Crake, 2020: 571). By leveraging social media platforms, speakers can effectively reach a wider audience and actively engage with them in real time. The array of digital oratory videos encompasses a vast multitude of genres, including but not limited to product or service reviews, expressive rants, instructional tutorials, personal experiences, humorous sketches, and video-based resumes (Sharma, 2023). Amidst the widespread adoption of digital communication in the social media era, educators have recognized the pedagogical necessity of incorporating digital oratory training into basic communication courses (Lind, 2020; Sharma, 2023). It has become a trend to utilize online media platforms, such as TED Talks (Chang & Huang, 2015), YouTube (Quagliata, 2014), and TikTok (Edwards, 2021), as learning resources in public speaking curricula.
The recent decade has witnessed a growing interest in utilizing technology to enhance ESL speaking (e.g. Bobkina & Domínguez Romero, 2022; Chen, 2024; Hsu, 2012; Sharma, 2023; Vallade, Kaufmann, Frisby & Martin, 2021; Xie, Chen & Ryder, 2021; Zheng, Wang & Chai, 2023). Digital oratory offers students experiential learning opportunities that allow them to analyze, imitate, practice, and reflect on their oral presentation skills (Lind, 2012). Integrating web conferences into online public speaking classes improves students’ speech delivery skills as well as digital skills (Cotellese et al., 2015). The use of web conferencing to teach online public speaking equips students with essential skills needed to thrive in a technology-rich working environment. Balakrishnan and Puteh (2014) identified the efficiency of integrating video blog learning and face-to-face instruction to help students acquire public speaking skills. TED Talks and YouTube videos are two prominent examples of applying the digital oratory approach to public speaking pedagogy. For instance, TED Talks can be used as an in-class activity for students to analyze speech structure, language use, persuasive skills, and visual aids (Hayward, 2017). Similarly, Lind (2012) assigned students to record an oral presentation video and publish it on YouTube. The concept of digital oratory has provided a direction for L2 language teachers to modernize speaking pedagogy, ensure it aligns with the demands of today’s digital contexts, and better prepare students to excel in communication in the digital age.
3. L2 public speaking anxiety
PSA is a prevalent issue experienced by many speakers, characterized as “a situation-specific social anxiety that arises from the real or anticipated enactment of an oral presentation” (Bodie, 2010: 72). This anxiety stems from the unique context of public speaking and can lead to various responses, including physiological arousal (e.g. heart rate and body sensations), negative cognition (e.g. internal or external pressure and concern), and uncontrolled behaviors such as trembling and avoiding eye contact (Gallego, McHugh, Penttonen & Lappalainen, 2022). For L2 speakers, PSA can be even more pronounced when delivering L2 speeches before audiences. Four main factors generally contribute to PSA among L2 speakers: communication apprehension, affective factors, language proficiency, and nonverbal communication skills. Communication apprehension is often experienced by L2 speakers during oral presentations in class or group discussion (Nakamura, Nomura & Saeki, 2020) when the inability to make themselves understood leads to negative self-perceptions (Tsang, 2022). L2 speakers with a high level of communication apprehension often experience a high level of PSA. Personality (i.e. introversion and extroversion) also influences communication apprehension. Introverted students may experience more PSA than extroverted students (MacIntyre & MacKay, 2019).
Affective factors such as self-doubt, fear of negative judgments, and fear of making mistakes before audiences may contribute to PSA in L2 speakers (Shumin, 2002; Sugiyati & Indriani, 2021). Language proficiency can be another significant source of anxiety for speakers who struggle to express ideas in a non-native language (Goberman, Hughes & Haydock, 2011). Speaking proficiency, including grammatical, sociolinguistic, and strategic competence, reflects the application of linguistic systems and functional communication skills (Canale & Swain, 1980; Huang & Bui, 2019). Unfamiliarity with nonverbal skills in the target language may cause anxiety when L2 speakers are compelled to perform actions for which they lack adequate nonverbal skills (Richmond, Wrench & McCroskey, 2013). PSA can negatively impact oral presentation performance as speakers may feel uneasy and perform poorly (Morris & Leach, 2017). Thus, PSA can be both a cause and an effect of poor delivery, with anxiety contributing to poor performance and vice versa (Woolfolk, 2019).
A considerable body of literature focuses on PSA in the L2 classroom (e.g. Liu, 2018; Miskam & Saidalvi, 2019; Taly & Paramasivam, 2020). For example, Miskam and Saidalvi (2019) found that Malaysian students suffered from moderate to high levels of anxiety when a class performance or public speaking was required. Similarly, Gregersen (2005) found that, compared with their non-anxious counterparts, anxious L2 learners tended to maintain more tense facial expressions during oral presentations and demonstrated limited eye contact. Improving nonverbal presentation skills, such as arm and hand gestures, has been found to mitigate L2 learners’ PSA (Tsang, 2020). PSA is a complex phenomenon influenced by various factors, including communication apprehension, affective factors, language proficiency, and nonverbal presentation skills. Understanding PSA is crucial for improving L2 students’ public speaking performance.
4. L2 nonverbal speech performance
One of the strategies to reduce PSA is the effective use of nonverbal communication skills (Bobkina et al., 2023; Chen, 2021, 2024). The role of nonverbal communication skills in public speaking is crucial, as it has a significant impact on the efficacy of conveying messages to the audience. Oral presentations rely heavily on nonverbal skills to highlight, illustrate, or reinforce a verbal message (Fraleigh & Tuman, 2011). Nonverbal cues, such as “eye contact, vocal delivery, enthusiasm, interaction with the audience, and body language,” can enhance messages being conveyed (De Grez, Valcke & Roozen, 2012: 133). These elements have been identified as crucial in oral presentation assessments. Poor body language or vocal delivery can negatively impact speakers’ impression on the audience (Gronbeck, 1990). Similarly, rigid facial features and limited affective features in online speech videos reduce video popularity (Trzciński & Rokita, 2017). Emotional tones, including passion and perceived sincerity of the speaker, are also crucial in creating an engaging atmosphere (Yalçın & Yalçın, 2010).
Although self-recorded videos are considered beneficial for enhancing L2 students’ oracy skills, students lacked confidence and felt intimidated when using video presentations (Bobkina & Domínguez Romero, 2022). Analyzing students’ kinesic physical skills in both traditional and digital contexts, Bobkina et al. (2023) found that students used kinesic markers such as gestures, facial expressions, and eye contact ineffectively, suggesting notable challenges in digital communication. As a result, placing a stronger focus on improving digital nonverbal communication skills is recommended to equip students for the evolving communication environment (Bobkina et al., 2023). However, teaching digital nonverbal speech skills can be challenging as it involves both cognitive and physical development. Speakers must not only control their body’s involuntary movements caused by speech anxiety but also align their body language and emotions with speech content. Identifying the root causes of speech anxiety is crucial, as it enables public speaking instructors to offer effective guidance for students in enhancing delivery skills across diverse settings, including on-site presentations with live audiences and online engagements with invisible audiences (Rudnick, 2017).
5. Methodology
This study explored the factors affecting L2 speakers’ digital oratory anxiety and investigated the relationship between their anxiety and nonverbal speech performance in a naturalistic setting. Given the in situ and exploratory nature of this study, we employed a non-intrusive approach to minimize participant disruption and reduce the risk of observational interference. To enhance the validity and robustness of the findings, we adopted a mixed-methods design, integrating both quantitative and qualitative data through methodological triangulation (Creswell & Plano Clark, 2018).
5.1 Participants
The participants were 40 second-year students aged 18 to 20 from different disciplines, including business (n = 18), sciences (n = 10), and humanities and social sciences (n = 12), at a Hong Kong university. There were 29 female students and 11 male students, and 94% of the participants were local students whose L1 was Cantonese, a dialect spoken in Hong Kong; the remaining 6% were Mandarin speakers from mainland China. Their English proficiency level was upper intermediate, equivalent to IELTS 6.5 on average, with Band 6 or above in speaking.
5.2 Context
The data were collected from a five-week public speaking workshop. The 40 participants met for 100 minutes each week in two groups taught by one instructor using the same teaching and learning resources. The students received training in basic public speaking skills, including topic selection, audience analysis, organizational patterns, speech outline, and speech delivery skills. The teaching materials were developed primarily based on two textbooks: Lucas and Stob (2020) and Powers (2016). In Week 1, each student selected a controversial topic concerning a local or global issue (e.g. woke culture, climate change, or technology) for an 8-minute speech. Between Weeks 2 and 4, students conducted audience analysis and prepared speech outlines, organizing their arguments in a clear pattern to support their chosen positions. In Week 5, the students video-recorded their speeches using smartphones, tablets, or laptops. The videos were required to show the speaker’s upper body. As for visual aids, the students were encouraged to use actual objects rather than PowerPoint slides. The rationale behind this recommendation was that when a speech is video-recorded with PowerPoint, the slides tend to dominate most of the screen, resulting in the speaker’s image appearing too small for the audience to see clearly. Not allowing PowerPoint slides enabled students to fully focus on their nonverbal communication skills, which meets one of the objectives of this study. After recording, they submitted the videos to a digital platform, Moodle (see Figure 1). A 5-point Likert scale questionnaire was then administered to gain insights into the speech anxiety they may have experienced while making a digital presentation. Follow-up interviews were also conducted to collect students’ perceptions of digital oratory.

Figure 1. Screenshots of students’ speech video samples.
5.3 Instruments
5.3.1 Digital oratory anxiety questionnaire
An examination of existing scales measuring PSA was conducted to design a public speaking anxiety inventory specifically tailored to digital contexts. Based on a thorough review, the two most relevant scales, namely the Personal Report of Public Speaking Anxiety (PRPSA) developed by McCroskey (1970) and the Public Speaking Anxiety Scale (PSAS) (Bartholomay & Houlihan, 2016), were considered for this study. The PRPSA has been widely used to assess public speaking anxiety within L2 contexts, as demonstrated in previous studies (e.g. Chen, 2024; Hsu, 2012; Zheng et al., 2023). Nonetheless, not all the PRPSA items are fit for investigating PSA in digital contexts (e.g. “I get anxious if someone asks me something about my topic that I don’t know”). Also, the PRPSA comprises 34 items requiring lengthy completion time; a shorter yet methodologically robust scale is thus highly desirable (Bartholomay & Houlihan, 2016).
A condensed PRPSA version, the PSAS, developed by Bartholomay and Houlihan (2016), contains 17 items under three subscales: cognitive, physiological, and behavioral. The cognitive scale captures speakers’ general perceptions of public speaking and anxiety as assessed through self-reports and interviews (Yadav et al., 2022). The physiological scale pertains to the body’s physiological responses to stress, including variables such as heart rate, blood pressure, sweat, and saliva. The behavioral scale focuses on observable behaviors exhibited during public speaking engagements (Gallego et al., 2022). In line with the research objectives, relevant items from the PRPSA and the PSAS were selected for this study. The selected items were then classified into three categories, namely cognitive, physiological, and behavioral, based on Bartholomay and Houlihan (2016). Given the technological nature inherent in digital oratory, an additional category labeled “technical” was included in the questionnaire to explore the potential influence of technology-related issues on L2 students’ performance in digital settings. As a result, the questionnaire comprised four categories with a total of 28 statements (see Appendix 1 in the supplementary material). All the statements in the questionnaire were carefully formulated and statistically tested for reliability using SPSS (Version 26). Results obtained from reliability analyses indicated high internal consistency for each category, with Cronbach’s alpha coefficients of .896 for the cognitive subscale, .877 for physiological, .782 for behavioral, and .865 for technical.
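The reliability check reported above was run in SPSS; as a rough illustration of what a Cronbach’s alpha coefficient measures, the following minimal Python sketch computes it from a respondents-by-items matrix. The function name `cronbach_alpha` and the toy ratings are hypothetical and are not the study’s data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must rate every item")
    item_columns = list(zip(*scores))            # one tuple of ratings per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical 5-point Likert ratings: 5 respondents x 3 items (illustration only).
toy_subscale = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
print(round(cronbach_alpha(toy_subscale), 3))
```

In this formulation, alpha rises when items covary strongly relative to their individual variances, which is why each subscale is checked separately rather than pooling all 28 statements.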
5.3.2 Rubrics of nonverbal speech performance
Although numerous rubrics are available to evaluate public speaking performances, assessments focusing on nonverbal speech skills in digital contexts are still rare. Therefore, we designed a set of rubrics to evaluate nonverbal speech performance in digital contexts, drawing on established public speaking assessment schemes from previous studies and textbooks (Bobkina & Domínguez Romero, 2022; Powers, 2016; Schneider, Börner, van Rosmalen & Specht, 2017; Zhang, Ardasheva & Austin, 2020). The rubrics have five components, namely (1) Posture/Gesture, (2) Eye Contact, (3) Facial Expression, (4) Voice Control, and (5) Technical Effect, which are measured on a 5-point Likert scale with 20 items in total (see Appendix 2). The Technical Effect category aligns with the technological nature of digital oratory. To assess the internal consistency of our measures, we conducted a reliability analysis using Cronbach’s alpha coefficient, which yielded high internal consistency values ranging from .765 to .866.
5.3.3 Semi-structured interviews
To obtain an in-depth understanding of students’ PSA in digital contexts, we conducted semi-structured interviews with 20 randomly selected students to elicit their perceptions and the challenges they may have encountered during digital presentations. The interviews were guided by three main questions: (1) How much time did you spend on recording your digital speech? (2) Did you encounter any difficulties during your digital presentation? (3) What have you learned from digital presentations? Each individual interview lasted approximately 20 minutes. All interviews were conducted in both English and Chinese to accommodate participants’ linguistic preferences, ensure accurate interpretation, and enhance clarity of the responses.
5.4 Data analysis
We incorporated three sets of data for analysis: (1) students’ responses to the questionnaire, (2) nonverbal speech performance, and (3) semi-structured interviews. The quantitative data included students’ self-reports of anxiety levels as measured by the questionnaire. Mean values of the cognitive, physiological, behavioral, and technical subscales were compared using statistical tests to ascertain whether any significant difference existed among the four subscales.
Two evaluators assessed the students’ nonverbal speech performances based on the designed rubrics. In cases where there was a discrepancy of 2 in the overall scores provided by the first two evaluators, a third assessor re-evaluated the performances using the same rubrics. To ensure objectivity, validity, and reliability, any conflicting issues in assessment were addressed through a negotiated process involving the three evaluators. Agreement among the assessors was high, as demonstrated by a kappa coefficient of 0.782. Statistical tests were then employed to examine whether there were significant differences among the five categories. Multiple regression analysis was performed to examine the relationship between digital oratory anxiety and nonverbal speech performance.
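To make the agreement statistic concrete, here is a minimal sketch of an unweighted Cohen’s kappa for two raters, together with a helper that flags cases for third-rater review. The source does not say whether the reported kappa was weighted or unweighted, and treating “a discrepancy of 2” as “2 or more” is an assumption; the function names and toy scores are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def needs_third_rater(score_a, score_b, threshold=2):
    """Assumption: a gap of `threshold` or more triggers re-evaluation."""
    return abs(score_a - score_b) >= threshold


# Hypothetical rubric scores from two evaluators (illustration only).
first = [3, 4, 2, 5, 3, 4]
second = [3, 4, 3, 5, 1, 4]
print(round(cohen_kappa(first, second), 3))
print([needs_third_rater(a, b) for a, b in zip(first, second)])
```

Kappa corrects raw percentage agreement for the agreement two raters would reach by chance, which matters on a 5-point scale where mid-range scores are common.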
The interview data were recorded and transcribed verbatim for qualitative analysis. The transcripts were then cross-checked and analyzed using a bottom-up coding approach by the three researchers to identify emerging themes associated with the participants’ perceptions. Cohen’s kappa coefficient of 0.83 demonstrated a high level of interrater reliability. To ensure confidentiality, the interview results were pseudonymized to protect participants’ privacy. All the data for this study, including speech videos, questionnaire responses, and semi-structured interviews, were collected in accordance with ethical guidelines, and informed consent to use the data for research purposes only was obtained from all participants.
6. Results
6.1 Digital oratory anxiety
The questionnaire examined the impacts of cognitive, physiological, behavioral, and technical factors on digital oratory anxiety. A Shapiro–Wilk test showed a significant deviation from normality across all responses, W(1120) = 0.91, p < .001, warranting the use of non-parametric methods for subsequent analyses. A Kruskal–Wallis H test found significant differences among the four factors, H(3) = 117.76, p < .001, with a large effect size (η² = 0.689). Table 1 presents the post hoc Mann–Whitney U tests with Bonferroni correction. All category pairs differed significantly (p < .005), with large effect sizes ranging from 0.582 to 0.86, except for Cognitive vs. Behavioral, which showed a moderate effect (r = 0.327). The cognitive factor demonstrated the highest mean score of 3.54 (SD = 0.51), followed by physiological (M = 3.35, SD = 0.41), indicating that negative self-perceptions, fear of judgment, and physiological responses (e.g. heart rate and perspiration) are the most anxiety-inducing factors. In contrast, the technical factor showed the lowest mean score (M = 2.02, SD = 0.25), suggesting students were less anxious about the technical components of video-recorded presentations.
Table 1. Post hoc results of the digital oratory anxiety questionnaire

*p < .005. **p < .001.
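For readers unfamiliar with the Kruskal–Wallis H test used above, the following sketch computes the tie-corrected H statistic from ranked data in plain Python. It is an illustrative implementation under stated assumptions, not the authors’ procedure (the software used for these tests is not specified), and the toy groups are hypothetical. With k groups the statistic is referred to a chi-square distribution with k − 1 degrees of freedom, which is why four factors give H(3).

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1              # positions are 0-based, ranks 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of observations."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)
    # Likert data contain many ties, so the tie correction matters here.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("H is undefined when all observations are identical")
    return h / correction


# Hypothetical subscale means for three small groups (illustration only).
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```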
Table 2 shows the items with a mean value above 3.5. No items in the technical subscale were found above 3. Under the cognitive subscale, the highest mean score (M = 4.10, SD = 0.63) of the statement “I am worried that my audience will think I am a bad speaker” suggested a strong fear of negative judgment, indicating a lack of confidence in L2 speaking and a fear of failure or criticism. The statement “I don’t feel satisfied after recording a digital speech” (M = 3.88, SD = 0.72) highlighted frequent dissatisfaction with recorded performances. Students may have high expectations for their delivery and perceive flaws in their speeches, leading to a sense of dissatisfaction. The statement “I have little confidence in giving a digital speech” (M = 3.73, SD = 0.75) indicated a moderate level of diffidence, suggesting that students may feel uncertain or uncomfortable with the digital format, struggling to adapt their speaking style effectively.
Table 2. Items on the Digital Oratory Anxiety Scale with a mean value above 3.5

Note. 1 = not at all, 2 = slightly, 3 = moderately, 4 = very, 5 = extremely.
As for physiological responses, the statements “Certain parts of my body feel very tense and rigid while giving a speech” (M = 3.96, SD = 0.80) and “My heart beats very fast when I am video recording my speech” (M = 3.80, SD = 0.76) suggested nervousness may lead to stiff body posture and fast heart beats. The statement “I breathe faster just before starting a digital speech in front of the camera” (M = 3.73, SD = 0.72) indicated an increase in breathing rate before speeches, which can affect vocal control. The statement “Realizing that only a little time remains in a speech makes me very tense and anxious” (M = 4.00, SD = 0.71) suggested anxiety heightened when time was limited. As for behavioral responses, the statement “I don’t know how to make eye contact with the audience in front of the camera” (M = 3.68, SD = 0.86) reflected a challenge among the students in establishing eye contact with virtual audiences.
6.2 Nonverbal speech performance
The 40 students were assessed across 20 items in a nonverbal speech performance test, grouped into five categories: Posture/Gesture, Eye Contact, Facial Expression, Voice Control, and Technical Effect. Technical Effect had the highest mean score (M = 4.14, SD = 0.31), followed by Voice Control (M = 3.49, SD = 0.56); Facial Expression (M = 2.94, SD = 0.82) and Posture/Gesture (M = 2.95, SD = 0.77) had similar mean scores, while Eye Contact had the lowest score (M = 2.79, SD = 1.00). The Shapiro–Wilk test revealed a significant deviation from normality across the full dataset, W(800) = 0.62, p < .001, requiring the use of non-parametric methods. A Kruskal–Wallis H test indicated statistically significant differences among the five categories, H(4) = 67.99, p < .001, with a large effect size (η² = 0.328). Post hoc comparisons using the Mann–Whitney U test (Bonferroni-adjusted threshold p < .005) showed that Eye Contact, Posture/Gesture, and Facial Expression scored significantly lower than Voice Control and Technical Effect, with effect sizes ranging from medium (0.356) to large (0.812) (see Table 3). These findings suggest that students performed best in areas related to voice control and technical effect, while nonverbal expressiveness such as eye contact and gesture may require additional instructional emphasis. Two items in particular, namely “Aware and responsive to invisible audience” (M = 2.55, SD = 1.14) and “Use spontaneous, meaningful, conversation-like gesture” (M = 2.79, SD = 0.91), attained the lowest mean scores among all the assessment items.
Table 3. Post hoc results of nonverbal speech performance

Note. 1 = Posture/Gesture; 2 = Eye Contact; 3 = Facial Expression; 4 = Voice Control; 5 = Technical Effect.
*p < .005. **p < .001.
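Two of the figures reported for the nonverbal performance analysis can be reproduced with simple arithmetic, sketched below. The Bonferroni threshold follows from dividing .05 by the number of pairwise comparisons among five categories. The η² formula shown, (H − k + 1)/(n − k), is one common estimator for the Kruskal–Wallis test; using it with n = 200 (40 students × 5 category scores) is an assumption that happens to match the reported value, not a formula or sample size stated in the source.

```python
from math import comb


def bonferroni_threshold(alpha, n_groups):
    """Per-comparison alpha when every pair of groups is compared once."""
    return alpha / comb(n_groups, 2)


def eta_squared_from_h(h, n_groups, n_obs):
    """Eta-squared estimate for Kruskal-Wallis: (H - k + 1) / (n - k)."""
    return (h - n_groups + 1) / (n_obs - n_groups)


# Five rubric categories give C(5, 2) = 10 pairwise Mann-Whitney comparisons.
print(bonferroni_threshold(0.05, 5))

# Assumption: n = 200 observations (40 students x 5 category scores).
print(round(eta_squared_from_h(67.99, 5, 200), 3))
```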
A multiple regression analysis was conducted to examine the relationship between nonverbal performance and digital oratory anxiety. Prior to the regression analysis, the assumption of normality of residuals was evaluated. The Shapiro–Wilk test indicated no significant deviation from normality (W = 0.949, p = 0.071), and this was corroborated by the Jarque–Bera test (JB = 2.37, p = 0.306). A histogram and Q-Q plot of residuals (see Figure 2) visually confirmed approximate normality, with the histogram displaying approximately normal distribution and the Q-Q plot showing that the residuals closely followed the theoretical normal line.

Figure 2. Histogram and Q-Q plot of residuals.
Table 4 shows the results of the multiple regression analysis between nonverbal performance and anxiety levels. The predictors were the four anxiety factors (cognitive, physiological, behavioral, and technical), and the outcome variable was the total performance score. The overall model yielded a non-significant relationship between the anxiety factors and nonverbal speech performance, R² = 0.116, F(4, 35) = 2.276, p = 0.081, indicating that the four predictors explained only 11.6% of the variance in the total performance score. The effect size, Cohen’s f² = 0.131, demonstrated a medium effect, suggesting a moderate impact of the speech anxiety factors on explaining variance in nonverbal performance. Among the anxiety factors, the cognitive factor demonstrated a significant negative association with performance (β = −11.390, p = 0.012), suggesting that higher levels of cognitive anxiety may lead to poorer speaking outcomes. However, no significant relationships were found between the physiological, behavioral, and technical factors and nonverbal performance. These results indicated that while the cognitive factor may have a significant impact, the other factors did not appear to play a significant role in affecting nonverbal performance. However, caution is necessary in generalizing these findings as the relatively small sample size (N = 40) might have contributed to the lack of statistical significance.
Table 4. Multiple regression analysis of digital oratory anxiety on nonverbal speech performances

*p < 0.05. **p < 0.01.
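The effect size reported for the regression can be recomputed from the reported R². The sketch below is illustrative only and assumes the usual formula for Cohen’s f² in multiple regression, f² = R² / (1 − R²), together with Cohen’s conventional benchmarks (0.02, 0.15, 0.35). The function names and the benchmark dictionary are hypothetical helpers, not part of the study.

```python
def cohens_f2(r_squared):
    """Cohen's f-squared for a regression model: R^2 / (1 - R^2)."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must lie in [0, 1)")
    return r_squared / (1 - r_squared)


# Cohen's conventional benchmarks for f-squared (assumed here)
BENCHMARKS = {"small": 0.02, "medium": 0.15, "large": 0.35}


def benchmarks_reached(f2):
    """List the conventional benchmarks that f2 meets or exceeds."""
    return [name for name, cutoff in BENCHMARKS.items() if f2 >= cutoff]


f2 = cohens_f2(0.116)          # R^2 as reported in the text
print(round(f2, 3))            # 0.131
print(benchmarks_reached(f2))  # ['small']
```

The recomputed value matches the reported f² = 0.131, which exceeds the small benchmark and sits just below the 0.15 benchmark for a medium effect.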
6.3 Semi-structured interview results
The semi-structured interviews revealed two primary dimensions of students’ perceptions of digital oratory: benefits and challenges. Table 5 presents a summary of these themes, including frequency counts and illustrative excerpts. Each theme is elaborated below, with interpretations supported by representative participant quotations.
Table 5. Benefits and challenges of digital oratory (n = 20)

6.3.1 Benefits of digital oratory
Opportunities for rehearsal. Most participants (n = 14, 70%) reported that digital oratory provided greater opportunities for rehearsal. The recording format allowed them to practice repeatedly, refine their delivery, and gain control over their performance. As Daniel explained, “I could record my speech many times until I got satisfied with my own performance.” Similarly, Mary shared that she “rarely did rehearsal before face-to-face presentation,” but found digital recording “very convenient.” These accounts suggest that digital oratory encouraged self-paced learning and performance monitoring, enabling students to approach speech practice with a level of reflection often missing in live presentations.
Opportunities for self-evaluation. Thirteen students (65%) highlighted self-evaluation as a major benefit. Watching their recorded speeches helped them become more aware of their delivery, body language, and tone. Kelly reflected that “it was weird to watch my own speech,” yet found it “super helpful.” Similarly, Kyle observed that recording was “the best way to improve” because it allowed him to “observe my own performance.” Through this process of visual reflection, students identified weaknesses, such as monotone delivery or limited gestures, and became more conscious of areas needing improvement. These findings indicate that digital oratory promotes metacognitive awareness, helping learners analyze and refine their own speaking performance.
Reduced speaking anxiety. Almost all participants (n = 18, 90%) viewed digital oratory as less stressful than live face-to-face presentations. Many appreciated having the option to record multiple takes and manage their environment. Alex commented, “Making a digital presentation is more fun and less stressful … I can record … again and again.” Likewise, James remarked that “no one is watching me; [it] helped me speak more confidently.” These reflections show how the asynchronous recording process helped reduce performance pressure. By allowing students to control their timing, environment, and output, digital oratory appeared to lower affective barriers and enhance speaking confidence.
6.3.2 Challenges of digital oratory
Maintaining eye contact. While students reported minimal difficulty using digital devices, the majority (n = 17, 85%) identified eye contact as a key challenge. Elaine remarked that “staring at the camera … looked awkward,” while Lizzy noted being “at a loss on how to make natural eye contact.” Without visual audience feedback, students struggled to maintain authentic engagement. This finding suggests the need for explicit training on how to maintain natural eye contact through the camera to project engagement and build virtual rapport with the unseen audience.
Using gestures effectively. Fifteen participants (75%) found it challenging to use gestures naturally within the camera frame. Charlotte shared that “my two hands looked strange waving before the camera,” and Bill added that excessive gestures “can be distracting.” The limited screen space made it difficult for students to judge appropriate movement and gesture size. These reflections highlight the importance of gesture awareness and framing, suggesting that learners benefit from instruction on adapting physical expressiveness for digital formats.
Managing distance and camera angle. More than half of the students (n = 11, 55%) reported difficulties with camera positioning and angle. Sarah admitted being unsure “how far or close” she should appear, while Tom noted that facial visibility affected the impression he made: “If my face is too small, they cannot see my expressions; too big looks strange.” These responses demonstrate that effective digital presentation requires a degree of technical competence, including awareness of framing, lighting, and camera distance. Without such visual control, students risk diminishing their communicative presence and clarity.
7. Discussion
In relation to RQ1, which sought to determine the extent to which four key factors influenced the level of anxiety L2 students experienced when engaging in digital oratory, the questionnaire findings revealed that students experienced varying levels of anxiety across the factors, with the cognitive and physiological factors being the most anxiety-inducing. That the cognitive factor attained the highest mean score suggests that L2 students may worry about being judged negatively and may lack confidence in their L2 speaking. A strong concern that the audience would consider them poor speakers might be associated with L2 speakers’ cultural background (Jones, 2004; Wang & Roopchund, 2015). Cultural factors, such as fear of losing “face,” making mistakes, or being criticized during a presentation, contribute substantially to the more salient PSA observed among learners from Confucian Heritage Cultures, such as China, Korea, and Japan, than among learners from other ethnic groups (He, 2013; Woodrow, 2006). Other affective elements, such as self-esteem and self-doubt, could also affect students’ cognitive status (Shumin, 2002). Since the cognitive factor was rated as the most anxiety-inducing by the students, language practitioners should adopt a holistic approach that addresses not only students’ linguistic needs but also their affective needs. This entails creating a learning environment characterized by a sense of security and comfort, in which learners are free from speaking apprehension and are encouraged to take risks in the target language.
The interview results reveal that digital oratory provides a less stressful environment for practicing presentation skills than face-to-face presentations with a live audience, because it allows students to record their speeches repeatedly until they are satisfied with their performance. These repeated attempts give students ample rehearsal. Self-recording also offers flexibility, enabling students to stop and restart their presentations whenever they wish to make another attempt.
Concerning RQ2, which aimed to examine L2 students’ nonverbal performance in a digital context, the nonverbal performance results revealed weak performance, particularly in eye contact and gestures. The interview results were consistent with the relatively low mean scores for eye contact and gestures. Although students perceived digital presentation as less stressful because it eliminated the need for physical proximity to the audience, the lack of physical presence hindered their ability to establish effective eye contact with the virtual audience. Students expressed uncertainty about where to direct their gaze, whether toward the camera or the screen. Similarly, the effective use of gestures proved challenging because physical presence is confined to the boundaries of the camera frame, which can limit the visibility and impact of gestures. However, the interview results also highlight a potential benefit of digital oratory: the opportunity for self-evaluation through reviewing one’s own speech recording. The self-evaluation process plays a crucial role in improving nonverbal performance, as it exposes students to areas that require improvement. By observing their own performance, students can identify specific aspects of their nonverbal delivery that need work and develop targeted strategies to refine them. This reflective practice also enhances students’ learning autonomy as proactive learners in digital settings.
With respect to RQ3, which explored the relationship between L2 students’ speech anxiety and their nonverbal speech performance, the overall regression model was not significant: taken together, the four anxiety factors did not significantly predict nonverbal speech performance, although the cognitive factor emerged as a significant individual predictor. The overall result is consistent with previous research showing that perceived anxiety does not necessarily correlate with actual speaking performance (Bobkina et al., 2023; Chen, 2024) in both face-to-face and digital contexts. One possible explanation concerns the accuracy of individuals’ metacognitive reports. External reports may not fully align with one’s internal psychological reality due to the subjective nature of consciousness and the limitations of introspection (Schmidt, 1990). Another possible explanation lies in the inconclusive impact of PSA on nonverbal speech performance (MacIntyre & Thivierge, 1995). On the one hand, nonverbal behaviors such as body posture, gesture, or facial expression can be useful clues for identifying L2 speech anxiety (Gregersen, 2007): L2 learners with a high level of speech anxiety tend to appear more uptight and make less eye contact with the audience. On the other hand, anxiety might enhance performance by stimulating increased effort and attentiveness during speech delivery, driving L2 students to perform better (Bobkina et al., 2023). Thus, anxiety levels and performance may not always be correlated in a consistent direction (Chen, 2024). Another noteworthy observation is that while the behavioral and technical factors were not perceived as major causes of speech anxiety, the nonverbal performance results revealed weak performance in eye contact and gestures, which are closely related to these two factors.
This incongruity between students’ metacognitive reports on speech anxiety and their actual nonverbal speech performances suggests a lack of awareness among students about the pivotal role that body language plays in digital presentation. Raising students’ awareness of the importance of digital nonverbal skills and providing guidance on posture, gesture, eye contact, facial expression, and camera positioning can significantly enhance students’ confidence and overall speech performance.
The findings have pedagogical implications for addressing PSA and improving L2 speakers’ nonverbal skills in digital contexts. Interventions should prioritize the cognitive and physiological factors, which students rated as the most anxiety-inducing. Engaging students in digital oratory, which involves repeated attempts at speech recording, can help alleviate negative self-perceptions and fear of judgment by providing a safe and controlled environment in which to practice public speaking without the pressure of a live audience. Encouraging students to self-evaluate their nonverbal performance during the video review process is also crucial for fostering reflective practice and self-awareness. To facilitate effective self-evaluation, students need clear guidance on what to focus on when reviewing their recordings, including specific criteria for assessing nonverbal performance, such as eye contact, posture, gesture, facial expression, and vocal delivery. To mitigate the impact of the behavioral and technical factors, students can address challenges related to camera distance, presentation angle, and eye contact by following these suggestions: (1) Experiment with camera distance to find the optimal framing, ensuring the head and upper body are clearly visible without being too close to or too far from the camera. (2) Be mindful of the framing to ensure that gestures are within the camera’s view, and practice the speech with purposeful gestures to engage the virtual audience. (3) Maintain eye contact with the invisible audience by looking naturally at the screen instead of staring at the camera lens.
8. Conclusion
The present study explored the relationship between digital oratory anxiety and nonverbal speech performance among L2 students. By triangulating quantitative and qualitative data from a questionnaire, nonverbal speech performances, and semi-structured interviews, we examined how L2 PSA relates to nonverbal speech performance, as well as the benefits and challenges students encountered while creating digital oratory videos. The findings highlight the value of digital oratory in facilitating L2 learners’ self-evaluation and reflection on nonverbal communication skills. The recording process allowed students to rehearse their presentations and refine certain nonverbal skills, but it also presented challenges, notably in maintaining eye contact, using gestures, and managing camera distance and angle.
This study primarily focused on nonverbal performance, while other speaking aspects such as content and language use were not within the current research scope. Future research could analyze students’ actual speech output, focusing on lexical, syntactic, and discourse patterns, to gain further insights into L2 digital oratory performances. To enrich the understanding of digital oratory, researchers could adopt a multimodal approach, analyzing both textual and visual features of speech videos (Trzciński & Rokita, 2017). Furthermore, the present study relied on self-report measures with a small sample size (N = 40) to assess anxiety levels, which may introduce response biases. Future research could benefit from a larger sample, including additional measures or variables that may potentially impact presentation skills, such as self-efficacy, prior experience, or content preparation. Experimental research employing a pre-test/post-test design could be conducted to compare speech performance of students with and without digital oratory training. This would allow us to measure the effects of digital oratory tasks as an intervention for enhancing speech performance and provide insights into integrating digital oratory in public speaking curricula. These studies would further illuminate the effectiveness of digital oratory as a highly affordable and accessible means of CALL to enhance L2 students’ public speaking skills in academic and professional contexts.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0958344025100396
Data availability statement
Data available on request due to privacy/ethical restrictions.
Acknowledgements
We would like to extend our sincere thanks to the anonymous reviewers and the editors for their valuable comments. Special thanks also go to Emeritus Professor John Powers from Hong Kong Baptist University for his inspiration in shaping this research.
Authorship contribution statement
Zeping Huang: Conceptualization, Formal analysis, Data curation, Methodology, Validation, Visualization, Writing – original draft, Writing – reviewing & editing; Wendy Ting Ting Wu: Data curation, Formal analysis, Writing – original draft; Mariah Chan: Visualization, Writing – reviewing; Jianwen Liu: Validation, Writing – reviewing.
Funding disclosure statement
This research did not receive any specific funding.
Competing interests statement
The authors declare no competing interests.
Ethical statement
Informed consent was obtained from all participants prior to data collection. Before collecting the speech data and administering the survey, participants were clearly informed that their involvement in this study was entirely voluntary and that they could withdraw at any time without penalty or consequence. Ethical approval was obtained from Hong Kong Baptist University. All the data collected from this project were used for research purposes only.
GenAI use disclosure statement
During the preparation of this work, we used ChatGPT-4 for language editing to refine phrasing, grammar, and style. No substantive intellectual content, such as hypotheses, literature review, methods, results, and discussion, was generated by the tool. The ChatGPT plugin for visualization was also used to assist in the graph creation of Figure 2, the histogram and Q-Q plot of residuals, on June 3 and 5, 2025. No private data or participant data were entered into these tools. Use of AI tools did not influence our interpretation or results; all statistical decisions and conclusions were made by the authors. After using the AI tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the published article.
About the authors
Zeping Huang is an assistant professor in the Department of English at the Hang Seng University of Hong Kong. Her research interests include corpus linguistics, CALL, world Englishes, discourse studies, and health communication. She has published in ReCALL, English World-Wide, Language Teaching Research, Public Relations Review, Journalism, and Dentistry Journal.
Wendy Ting Ting Wu is a PhD student in the Department of English Language and Literature at Hong Kong Baptist University. Her research interests are centered on interactional linguistics, corpus linguistics, health communication, and discourse analysis in social media.
Mariah Chan is a former lecturer at the Center for Language Education of the Hong Kong University of Science and Technology. Her research interests include English as a second language (ESL), public speaking, and corpus linguistics.
Jianwen Liu is an associate professor in the Department of English Language and Literature at Hong Kong Shue Yan University. Her research interests include corpus-based translation studies, corpus linguistics, and data-driven learning. She has published in Critical Arts, International Journal of Applied Linguistics, Corpora, Sustainability, and Asia Pacific Journal of Corpus Research.




