Introduction
Four decades of exploratory research have affirmed the characterization of state supreme courts as indispensable cogs in the American judiciary (Boyea and Brace Reference Boyea and Brace2021).Footnote 1 Scholarship on state court decision-making is expanding, a trend that Hall (Reference Hall, Howard and Randazzo2017) believes is “one of the most exciting and dynamic areas of study within the discipline of political science” (315). Yet a vital stage in a state supreme court’s decision-making process has long eluded empirical scrutiny – oral arguments.
A growing literature now analyzes US Supreme Court oral arguments. Seminal work by Johnson (Reference Johnson2001; Reference Johnson2004) led to the conclusion that “oral arguments can and do change justices’ minds” (Ringsmuth, Bryan, and Johnson Reference Ringsmuth, Bryan and Johnson2012, 436). Oral arguments allow justices to collect information for coalition building (Black, Sorenson, and Johnson Reference Black, Sorenson and Johnson2012), enhance public legitimacy (Black et al. Reference Black, Timothy, Ryan and Justin2024; Cann and Goelzhauser Reference Cann and Goelzhauser2023), and allow litigants to build their case after briefing (Johnson Reference Johnson2004; Johnson, Wahlbeck, and Spriggs Reference Johnson, Wahlbeck and Spriggs2006). However, this body of research has not yet expanded to the role of oral arguments beyond the highest court in America.
As scholars debated the influence of oral arguments, a spirited field of research burgeoned examining how justices reach their decisions. Attitudinal, legalistic, and strategic theories emerged to explain the US Supreme Court’s decisions (Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022), along with theories specified for state supreme courts (Hall Reference Hall, Howard and Randazzo2017). Recent scholarship extends these conventional theories. Scholars are probing how emotion during oral arguments and briefing, sometimes rooted in personality, intersects with decision-making by the US Supreme Court (Black et al. Reference Black, Truel, Johnson and Goldman2011; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016; Dickinson Reference Dickinson2019; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2020; Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022; Hall et al. Reference Hall, Hollibaugh, Klingler and Ramey2022). Specifically, Epstein and Weinshall (Reference Epstein, Weinshall and Guerriero2021) believe that “there is no getting around the fact that [emotions] can distort purely strategic decision-making” (31). The result of this research has been a novel theory of decision-making coined thinking-fast (Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022).
I exploit novel data on state supreme court oral arguments to measure the emotional content of judicial speech and its impact on decision-making. Studying this relationship at the state level is especially fruitful because of state supreme courts’ unique institutional arrangements relative to those at the US Supreme Court and the rich trove of unexplored data (Weinshall and Epstein Reference Weinshall and Epstein2020). Previously unanswered questions on decision-making at state supreme courts could be answered from a promising yet developing theory of decision-making (Hall Reference Hall, Howard and Randazzo2017).
Thus, I consider whether oral arguments influence judicial decision-making at state supreme courts. If so, do biases such as the emotion in justices’ statements and questions play a role in this influence?
By employing emerging natural language processing (NLP) methods (Dickinson Reference Dickinson2019; Hall et al. Reference Hall, Hollibaugh, Klingler and Ramey2022), I assemble a dataset of appellate cases spanning 2014 to 2021 before the New York Court of Appeals (NYCOA).Footnote 2 Harnessing those methods – such as deep learning and transformers – enables me to place justices’ emotional speech alongside other theoretically motivated predictors of case outcomes. The findings contribute to state supreme court research while also testing and supplementing theories of judicial decision-making. First, justices’ emotions expressed during oral arguments can explain their decision-making, and more broadly, oral arguments matter at state supreme courts. Indeed, the evidence suggests that justices negotiate their expressed emotions during questioning alongside the strategic goals or legal factors of a case, such as prior briefing. This conclusion provides support for a thinking-fast theory (Epstein and Weinshall Reference Epstein, Weinshall and Guerriero2021; Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022, 711), but explicated at the state supreme court level. These findings may also have implications for how emotions interact with personality-driven approaches to thinking-fast theory (Hall Reference Hall2018; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2020). Second, this study positions oral arguments as an important institutional feature framing behavior under a thinking-fast theory, previously identified as a promising approach (D’Elia-Kueper and Segal Reference D’Elia-Kueper, Segal, by and Kosslyn2015; Hall Reference Hall, Howard and Randazzo2017; Epstein and Weinshall Reference Epstein, Weinshall and Guerriero2021, 34). Lastly, this analysis demonstrates the usefulness of analyzing raw textual data with NLP and deep learning within the judicial politics literature.
Theory and previous research
Oral arguments at the federal and state levels
Since the early 2000s, scholarship on US Supreme Court oral arguments has expanded markedly. Johnson (Reference Johnson2001) established formative, early evidence that information gathered during oral arguments is used by justices to make substantive policy choices. Indeed, 80% of the issues in justices’ questions during oral arguments are original to oral arguments (Johnson Reference Johnson2004), providing a compelling motivation to analyze them. Oral arguments also shape judicial behavior and outcomes. They provide justices with supplementary information and fresh perspectives that otherwise could have been overlooked (Segal and Spaeth Reference Segal and Spaeth2002; Conlon and Karaba Reference Conlon and Karaba2012). The quality of the arguments presented also uniquely influences the decision of justices (Johnson, Wahlbeck, and Spriggs Reference Johnson, Wahlbeck and Spriggs2006; McAtee and McGuire Reference McAtee and McGuire2007). Ringsmuth, Bryan, and Johnson (Reference Ringsmuth, Bryan and Johnson2012) provided original evidence that high-quality arguments from attorneys can actually change a justice’s vote. Effective oral arguments also resolve any lingering questions from briefs and allow justices to assess the opinions of their colleagues for coalition building (Johnson Reference Johnson2004; Black, Johnson, and Wedeking Reference Black, Johnson and Wedeking2012). Experimental research has demonstrated the institutionally legitimizing effects of oral arguments (Cann and Goelzhauser Reference Cann and Goelzhauser2023), albeit with mixed results at state supreme courts (Black et al. Reference Black, Timothy, Ryan and Justin2024).
How do scholars measure these effects of oral arguments? Litigants who face more – and more verbose – questions are less likely to win the case or individual justices’ votes (Johnson et al. Reference Johnson, Black, Goldman and Truel2009; Epstein, Landes, and Posner Reference Epstein, Landes and Posner2010). To estimate more nuanced effects, the content of arguments themselves has been employed in clever measures of implicit biases (Shullman Reference Shullman2004; Johnson et al. Reference Johnson, Black, Goldman and Truel2009; Wistrich, Rachlinski, and Guthrie Reference Wistrich, Rachlinski and Guthrie2015; Vunikili et al. Reference Vunikili, Ochani, Jaiswal, Deshmukh, Chen and Ash2018). A seminal approach by Black et al. (Reference Black, Truel, Johnson and Goldman2011) analyzed “bags of words” to measure unpleasant and pleasant emotions directed at each litigant using a dictionary of language. The authors find that relatively more unpleasant emotional words directed toward a particular party can predict how the US Supreme Court will rule with 73% accuracy. These measures, unfortunately, do not capture the contextual understanding of emotion positioned within a justice’s entire question. Recent advancements in NLP and deep learning address this shortcoming by capturing the complex, contextual position of emotion in the sophisticated, difficult sentences seen in the judicial setting (Dickinson Reference Dickinson2019). Other innovations of the same ilk include Dietrich, Ryan, and Sen (Reference Dietrich, Ryan and Sen2018) and Chen et al. (Reference Chen, Kumar, Motwani and Yeres2025), both of which use physical attributes of oral arguments such as vocal pitch and facial expressions, suggesting that emotional speech contains information beyond the legal, political, and textual.
However, Sorenson (Reference Sorenson2023) departs from the canonical focus on “questions” used by many studies of oral arguments by distinguishing statements from questions in judicial speech. She finds that the two are theoretically and empirically distinct; more questions increase the chance of a litigant winning, while more statements decrease their odds. This result has broad implications for how justices communicate information during oral arguments, with Sorenson explaining that “certain types of communications are helpful while others are intended to monopolize time and convey specific attitudes” (1569). To maintain theoretical validity, I draw on this approach in the hypotheses below, in data collection, and in variable construction.
All the prior empirical work on oral arguments has focused exclusively on the US Supreme Court.Footnote 3 Thus, to examine the level of interest for this study, a state supreme court, I borrow heavily from the federal-level oral argument literature to compare and contrast findings in a subnational judiciary setting.Footnote 4
Decision-making: how do justices think?
Normative theories explaining why courts and justices reach their decisions cover a range of literatures.Footnote 5 Legalists posit that justices employ legal factors such as precedent, statutory analysis, constitutional analysis, textualism, originalism, and particularly briefing (Maltzman and Bailey Reference Maltzman and Bailey2011; Hall Reference Hall2018; Hazelton and Hinkle Reference Hazelton and Hinkle2022; Reference Hazelton and Hinkle2024). Attitudinal theory takes the view that justices make decisions primarily using policy or ideological preferences, inelastic to external actors or the factual stimuli of a particular case (Segal and Spaeth Reference Segal and Spaeth1993; 2002). Much of the empirical social science literature has focused on testing the latter theory, positing that precedent, originalism and textualism, and other legal factors inadequately explain judicial behavior as opposed to personal policy and ideological preferences (Segal and Spaeth Reference Segal and Spaeth1999; Howard and Segal Reference Howard and Segal2002; Bailey and Maltzman Reference Bailey and Maltzman2008).
However, cracks emerged in attitudinal theory. Epstein and Knight (Reference Epstein and Knight2013) point to unanimous decisions, filling the docket with “easy cases,” and the success of the US government. Strategic theory fills those gaps by describing judicial decision-making as the actions of rational actors seeking to achieve their goals, influenced by the preferences of others, expected choices, and institutional arrangements (Maltzman, Spriggs, and Wahlbeck Reference Maltzman, Spriggs and Wahlbeck2000; Epstein and Knight Reference Epstein and Knight1998; Reference Epstein and Knight2013). Recent research has focused on bridging strategic accounts to others (Epstein and Weinshall Reference Epstein, Weinshall and Guerriero2021), such as justices utilizing precedent to produce outcomes consistent with their own policy preferences (Hansford and Spriggs Reference Hansford and Spriggs2006).
An accumulating body of research examines strategic theory’s institutional structures specifically at state supreme courts. Scholars developed decision-making theories for these courts by examining the political preferences and characteristics of justices, the case facts and legal variables influencing judicial choice, institutional arrangements, and the external environment surrounding courts (Brace and Hall Reference Brace and Hall1990; Reference Brace and Hall1993; Reference Brace and Hall1997).Footnote 6 This work suggests that the attitudinal model could describe judicial behavior at the state level, but with institutional structure caveats (Brace and Hall Reference Brace and Hall1995; Hall and Brace Reference Hall and Brace1996).Footnote 7 Indeed, the culmination of this research was a neo-institutional theory of judicial decision-making at state supreme courts (Brace and Hall Reference Brace and Hall1990).
An emerging theory holds that a justice sides with a litigant “not because they are rationally advancing an economic or any other interest but because of an emotional response” (Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022, 711).Footnote 8 Unsurprisingly, experimental research has found that humans, including jurists, rely on heuristics or mental shortcuts using intuition, emotion, physical attractiveness, and personality to maximize efficient decision-making and minimize intense cognitive effort (Guthrie, Rachlinski, and Wistrich Reference Guthrie, Rachlinski and Wistrich2007; Rachlinski, Guthrie, and Wistrich Reference Rachlinski, Guthrie and Wistrich2011; Wistrich, Rachlinski, and Guthrie Reference Wistrich, Rachlinski and Guthrie2015; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016; Spamann and Klohn Reference Spamann and Klohn2016; Epstein, Parker, and Segal Reference Epstein, Parker and Segal2018; Segal, Sood, and Woodson Reference Segal, Sood and Woodson2019; Waterbury Reference Waterbury2024). These cues stem from the neurobiological and cognitive processes involved in legal decision-making (Kahan Reference Kahan2015; Liu and Li Reference Liu and Li2019). However, this line of theory has been forced to contend with the extensive legal training and expertise that justices possess, which would be expected to reduce biased reasoning (Posner Reference Posner2008; Kahan Reference Kahan2015). Liu and Li (Reference Liu and Li2019), nonetheless, find in experimental research that justices employ those same skills to rationalize or “decorate” a biased decision with the trappings of legalism (659). This result yields broad implications for the legal field; for the instant study, it raises the question of how scholars should measure the effects of biases if they are masked in decision-making.
Oral arguments as a source of emotional content and biased decision-making
Accordingly, considering where in appellate proceedings a biasing effect of heuristic processing may occur is a prerequisite to explaining that effect. During briefing? During oral arguments, as that literature may suggest (Black et al. Reference Black, Truel, Johnson and Goldman2011)? Or some combination of the two? Scholars offer different answers. Black et al. (Reference Black, Owens, Wedeking and Wohlfarth2016) and Hazelton and Hinkle (Reference Hazelton and Hinkle2022) demonstrate that emotional language in briefing can act as a bias with a negative effect. They propose that such strongly emotional language “decreases an attorney’s perceived credibility” and professionalism (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016, 397). These findings support the presence of heuristic cues in judicial decision-making, particularly during briefing, with justices relying on credibility cues that bias systematic decision-making (382). Personality, particularly conscientiousness, may also mediate the effect of emotionality in briefs (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2020).Footnote 9 A conscientious justice is likely to penalize emotional language even more because this personality trait devalues emotion (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2020, 118), suggesting additional extra-legal factors in decision-making. Hall et al. (Reference Hall, Hollibaugh, Klingler and Ramey2022) also find evidence that personality traits may influence US Supreme Court decision-making.
Yet, the other promising location of the effect on decision-making is oral arguments. Compelling research gives reason to suspect oral arguments, especially emotional arguments, are uniquely positioned to influence decision-making as compared to briefing (Johnson Reference Johnson2004; Black et al. Reference Black, Truel, Johnson and Goldman2011; Dickinson Reference Dickinson2019). Descriptively, it is true that US Supreme Court oral arguments contain emotionally charged speech by justices (Black et al. Reference Black, Truel, Johnson and Goldman2011; Black, Johnson, and Wedeking Reference Black, Johnson and Wedeking2012). Arguments also raise novel issues unseen in litigant briefs (Johnson Reference Johnson2004), giving fresh fodder for the justices to consider. They certainly pay attention to that fodder, as it has the ability to change their votes (Ringsmuth, Bryan, and Johnson Reference Ringsmuth, Bryan and Johnson2012). They are also a conversation among the panel of justices as they direct questions at each attorney, later used during coalition building (Johnson Reference Johnson2004; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016). But why is emotional language expressed during oral arguments? Cognitive research shows that, under high information load, decision-makers tend to shift from deliberate, rule-based reasoning to heuristic shortcuts that rely on emotional cues (Chaiken and Ledgerwood Reference Chaiken, Ledgerwood, Van Lange, Kruglanski and Higgins2011). The shortcut relevant to this emotional content identified by judicial politics scholars is the affect heuristic: a rapid, intuitive judgment formed by tagging stimuli with positive or negative affective valence (Finucane et al. Reference Finucane, Alhakami, Paul and Johnson2000; Kahneman 2011). In the context of judicial behavior, this implies that when cognitive demands are high (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2020; Hazelton and Hinkle Reference Hazelton and Hinkle2022; Reference Hazelton and Hinkle2024; Waterbury Reference Waterbury2024), justices may unconsciously form impressions of litigants or arguments based on emotional tone, and then use those impressions to guide decision-making (Liu and Li Reference Liu and Li2019). These impressions serve as cognitive anchors that simplify the choice before them. Later, due to professional norms and institutional expectations (Black et al. Reference Black, Timothy, Ryan and Justin2024), the justice may engage in a post hoc rationalization, using legal reasoning to justify the initial intuition (Liu and Li Reference Liu and Li2019). This process could unfold in four steps: (1) information strain; (2) unconscious affective tagging; (3) intuitive judgment on case merit; and (4) post-hoc legal rationalization. These steps form the microfoundations of the affect heuristic mechanism in legal decision-making, providing a potential, stepwise account of how emotion during oral arguments may influence judicial behavior at the individual level.
Why should we expect this particular process to operate specifically in judicial settings, and not just any high-stakes decision environment? Courts, and especially state supreme courts, possess unique institutional features that make them fertile ground for this kind of cognitive processing (Hall Reference Hall2018). First, justices must write or join written opinions to explain their decisions across a wide array of subject areas that they may be unfamiliar with (Meyer et al. Reference Meyer, Bergan, Agata and Agata2006). This necessity of public justification supports the presence of rationalization, the final step in the mechanism. Second, oral arguments are fast-paced, collegial, and adversarial, with justices engaging in real-time questioning of both litigants and one another (Black, Johnson, and Wedeking Reference Black, Johnson and Wedeking2012). This increases information strain and introduces dynamic, social cues into the environment, amplifying the likelihood of automatic emotional responses (Kahneman 2011). Third, state high court judges face large caseloads and limited clerical resources (Hughes, Wilhelm, and Vining Reference Hughes, Wilhelm and Vining2015; Boyea and Brace Reference Boyea and Brace2021). The NYCOA decides upward of 100 cases per year, which further taxes cognitive bandwidth and encourages the use of intuitive processing (Chaiken and Ledgerwood Reference Chaiken, Ledgerwood, Van Lange, Kruglanski and Higgins2011).Footnote 10 Finally, state courts are smaller institutional environments with repeat-player attorneys. Many lawyers appear before the same panels of justices across multiple terms, creating ongoing affective associations that may influence credibility judgments (McGuire Reference McGuire1995; Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016). Taken together, these conditions make the state supreme court oral argument setting an especially powerful site for observing affect-based cognitive processing.
These considerations also help explain why we should expect oral arguments over written briefs to be a particular source of affect-based biases. While briefs arrive well in advance and can be digested slowly with clerical assistance, oral arguments happen live and demand immediate processing under conditions of cognitive load.Footnote 11 In contrast to the systematic reading of briefs, oral arguments require justices to hold numerous facts and doctrines in memory, adapt to novel issues, and participate in a collaborative and strategic performance with other justices. These institutional features encourage heuristic processing, making it more likely that emotion expressed by justices during oral arguments shapes their final votes on the merits.
Thus, I expect that oral arguments will likely be a source of expressed emotional content that can ultimately be explicative of decision-making at a state supreme court. This expectation is given as the first hypothesis:
H1: A state supreme court justice who directs more positive emotion in speech toward the appellee during oral arguments is more likely to vote favorably for the appellee by affirming.
Oral arguments may or may not prove to be the genesis of an emotional bias in judicial decision-making; additional analysis will be needed to fully disentangle the emotional content in briefs from later moments of exposure and to isolate the cognitive microfoundations at play. Nonetheless, H1 provides the literature with an empirically testable foundation for state supreme courts that future research can build from.
Political science research has also shown that linguistic clarity in opinion writing is a factor considered strategically by US Supreme Court justices to promote lower court compliance with their decisions (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016); it is entirely conceivable that such a factor is portable to oral arguments, as the general principles encouraging clarity apply, though likely under differing mechanisms in this procedural setting. The same heuristic processing that incorporates emotions during oral arguments can also be induced when speech is linguistically clear and understandable (Kahneman 2011). When textual stimuli are unclear and difficult to read, cognitive strain mobilizes systematic processing and analytical reasoning (Chaiken and Ledgerwood Reference Chaiken, Ledgerwood, Van Lange, Kruglanski and Higgins2011). If a justice’s speech is linguistically unclear, I suspect that cognitive strain is experienced by that justice and the other justices listening. Conversely, clear speech may be revealing of the heuristic processing expressed by justices during oral arguments. Thus, these expectations are given as the second hypothesis:
H2: A state supreme court justice who uses more linguistically simple speech toward the appellee during oral arguments is more likely to vote favorably for the appellee by affirming.
The precise mechanisms underlying how clarity shapes decision-making are not directly assessed herein. Rather, H2 seeks to test if this factor plays a role in the assertion that oral arguments influence state supreme court outcomes.
Data and methods
The New York Court of Appeals
A lack of data on state supreme courts has hindered the analysis of their respective oral arguments (Weinshall and Epstein Reference Weinshall and Epstein2020).Footnote 12 A review of the germane literature reveals no robust database on oral arguments before a state supreme court. Therefore, I collect original data on the NYCOA to examine its arguments and test the effects of emotionality on decision-making at the subnational level.
The NYCOA comprises seven justices, including one Chief Justice. To hear a case, the NYCOA or the lower Appellate Division court must grant a Motion for Leave to Appeal (22 CRR-NY 500.20-22).Footnote 13 Yet, although the court exercises some docket control, its caseload remains comparable to that of its peers (Meyer et al. Reference Meyer, Bergan, Agata and Agata2006; Hinkle and Nelson Reference Hinkle and Nelson2016). The Court regularly schedules oral arguments for its cases, permitting thirty minutes per litigating party – appellant and appellee – with opportunities for rebuttal by the appellant (22 CRR-NY 500.18).Footnote 14
Selection of a single court presents potential concerns for this study’s design if that court has peculiar institutional designs and characteristics. If the NYCOA had unusual institutional features uncommon at other state supreme courts, there would likely be problematic implications for its analysis, such as biased measures. These abnormalities would also likely hinder generalization. However, the evidence suggests that the Court is not entirely unique. Table 1 presents theory-driven institutional features of the Court and the proportion of other state supreme courts sharing each feature. While a majority is not achieved across all institutional features, the table demonstrates a plurality for most, giving little reason to suspect detrimental implications. The table also presents no particular features that raise alarm for subsequent data collection or variable construction.
Table 1. The NYCOA’s Institutional Features as Compared to Other State Supreme Courts

Note: The institutional features considered are theoretically relevant to theories of decision-making specified for state supreme courts (Hall Reference Hall, Howard and Randazzo2017). For term in office, I include states with a range of 12–15 years. The source for opinion assignment and deliberation rules is Hughes, Wilhelm, and Vining (Reference Hughes, Wilhelm and Vining2015).
There are also strengths in selecting the NYCOA as the research setting as opposed to selecting other courts or multiple courts. Nicholson-Crotty and Meier (Reference Nicholson-Crotty and Meier2002) explain how selecting a single state can permit validity and generalization through sufficient intra-state variation, added internal and construct validity, and more detailed variables. By selecting the NYCOA as a single-state case, richer data collection is permitted while maintaining variance across time periods and among individual justices. The focal independent variables employed all leverage variance between justices in a multilevel structure of cases. Moreover, the NYCOA enjoys measurably more opinion citations than other state supreme courts and historical prestige (Hinkle and Nelson Reference Hinkle and Nelson2016, 180). The Court is closely watched by the national judiciary and media due to its prestige and its hearing of paramount cases (Meyer et al. Reference Meyer, Bergan, Agata and Agata2006), which subjects the hypotheses to a more demanding test (Nicholson-Crotty and Meier Reference Nicholson-Crotty and Meier2002).Footnote 15 In other words, the standing of the NYCOA raises the stakes of decision-making because its opinions are consumed by an array of institutions (Epstein and Knight Reference Epstein and Knight1998; Reference Epstein and Knight2013).
With data selection attended to, the task of collection remains. The State of New York provides an online database, Court-PASS, for accessing the court records of a particular NYCOA case. Considering the terms from 2014 to 2021, I collect from this system oral argument transcripts, the Court decision, litigants’ briefs, and filed amicus briefs.Footnote 16 In total, there are 822 cases in the data, capturing nearly the NYCOA’s entire docket during the years considered.Footnote 17 To obtain the prior case record, I use LexisNexis to gather the lower appeals court opinions – often the NY Appellate Division – and the district court opinion – often the NY Supreme Court.Footnote 18 These other panels of lower court justices, which range in size from three to seven, are included in the models via variables that measure their influence on the appellate court. Wherever possible, I capture the judicial decision-making record spanning from when a case enters a district court’s docket all the way to the NYCOA’s opinion.Footnote 19 These data allow measurement of the briefing NYCOA justices receive prior to oral arguments. From these data, NYCOA cases spanning 2014 to 2021 form a justice-level dataset.Footnote 20 I append justices’ utterances toward attorneys during arguments to create the raw textual dataset employed in downstream NLP tasks.Footnote 21
The outcome variable, justice votes, is also gathered from LexisNexis and coded as 1 if a justice in a case voted to affirm and 0 if they voted to reverse.Footnote 22
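To fix ideas, the sketch below shows one plausible shape for this justice-level dataset; the column names and values are hypothetical illustrations, not the actual Court-PASS or LexisNexis output.

```python
import pandas as pd

# Hypothetical justice-level rows: one per justice-case pair, with the
# vote coded 1 = affirm, 0 = reverse (the outcome variable).
votes = pd.DataFrame({
    "case_id": ["2015-001", "2015-001"],
    "justice": ["Justice A", "Justice B"],
    "vote": ["affirm", "reverse"],
})
votes["affirm"] = (votes["vote"] == "affirm").astype(int)

# Utterances appended at the justice-case-litigant level, forming the raw
# textual dataset used in the downstream NLP tasks.
utterances = pd.DataFrame({
    "case_id": ["2015-001"],
    "justice": ["Justice A"],
    "target": ["appellee"],
    "text": ["Counsel, doesn't the record contradict that reading?"],
})
```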
The Muppets go back to law school: deep learning with BERT
NLP and deep learning have recently made significant advances in the field (Dickinson Reference Dickinson2019; Hall et al. Reference Hall, Hollibaugh, Klingler and Ramey2022). Notably, Google in 2018 released its “BERT” transformer model (Devlin et al. Reference Devlin, Chang, Lee and Toutanova2018). Transformers are an innovation in machine learning whose architecture leverages both the power of learning via “attention” and embeddings, which are encodings of the contextual meaning of an input such as text (Vaswani et al. Reference Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, Guyon, Von Luxburg, Bengio, Wallach, Fergus, Vishwanathan and Garnett2017). These embeddings are central to how the model feeds data through attention heads to represent abstract ideas such as emotion in a high-dimensional vector space, where movements in this space correspond to semantic meanings such as skepticism, sarcasm, or sincerity. At a high level, the goal of BERT is to take an embedding and then attend to the other embeddings in the sequence to update that initial embedding based on its context (Devlin et al. Reference Devlin, Chang, Lee and Toutanova2018).Footnote 23 The key contribution this model makes is a flexible approach to empirically representing contextual, semantic meanings that can then be used in a downstream task, such as classifying a sentence as positive, neutral, or negative. BERT and other deep learning models are now ubiquitous in the NLP field and are already being deployed in judicial politics (Hall et al. Reference Hall, Hollibaugh, Klingler and Ramey2022), largely owing to their contextual understanding of words through embeddings. I leverage a domain-specific pretrained model, LEGAL-BERT, that is fine-tuned in later downstream tasks in a process known as transfer learning (Chalkidis et al. Reference Chalkidis, Fergadiotis, Malakasiotis, Aletras and Androutsopoulos2020).Footnote 24 Indeed, the researchers find predictive results improve when fine-tuning the out-of-the-box BERT model for domain-specific tasks such as emotion analysis of legal texts (Chalkidis et al. Reference Chalkidis, Fergadiotis, Malakasiotis, Aletras and Androutsopoulos2020, 2).
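As a concrete illustration of the embedding step described above, the snippet below extracts contextual embeddings from the publicly released LEGAL-BERT checkpoint (Chalkidis et al. 2020) via the HuggingFace transformers library; it is a minimal sketch, not the full pipeline used in this study.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "nlpaueb/legal-bert-base-uncased"  # public LEGAL-BERT release
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

utterance = "Counsel, doesn't the statute foreclose that argument?"
inputs = tokenizer(utterance, return_tensors="pt")

# Every token receives a 768-dimensional contextual embedding; each
# attention layer updates a token's vector using the rest of the sequence.
with torch.no_grad():
    embeddings = model(**inputs).last_hidden_state  # shape: (1, n_tokens, 768)
```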
I deploy the LEGAL-BERT model in transfer learning on the downstream task of emotion analysis. The advantage of a transformer for this analysis is that the order of words and their contextual meaning are not lost, an improvement over previous studies that measured emotion with non-contextual methods (Black et al. Reference Black, Truel, Johnson and Goldman2011; Wistrich, Rachlinski, and Guthrie Reference Wistrich, Rachlinski and Guthrie2015). I employ the SigmaLaw-ABSA corpus of 2,000 legal opinion texts that were annotated by expert hand-coders for negative, neutral, and positive emotion (Mudalige et al. Reference Mudalige, Karunarathna, Rajapaksha, de Silva, Ratnayaka, Perera and Pathirana2020). After fine-tuning LEGAL-BERT on this pre-coded corpus, the model attains an F1 score of 0.74; I then run inference to classify each utterance as 0 for negative, 1 for neutral, and 2 for positive.Footnote 25 Coding details, model training, model accuracy, and hyperparameter selection are all discussed in Section B.1 of the Supplemental Appendix.
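A stylized version of this transfer-learning step appears below, using a three-observation toy stand-in for the SigmaLaw-ABSA annotations; the actual training procedure and hyperparameters are those documented in Section B.1 of the Supplemental Appendix, not the defaults shown here.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "nlpaueb/legal-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID,
                                                           num_labels=3)

# Toy stand-in for SigmaLaw-ABSA: 0 = negative, 1 = neutral, 2 = positive.
train_ds = Dataset.from_dict({
    "text": ["The lower court plainly erred in its holding.",
             "The statute was enacted in 1986.",
             "Counsel's reading of the record is persuasive."],
    "label": [0, 1, 2],
}).map(lambda b: tokenizer(b["text"], truncation=True, padding="max_length"),
       batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="legal-bert-emotion",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_ds,
)
trainer.train()  # fine-tunes all LEGAL-BERT weights on the labeled corpus
```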
However, a quandary remains for classifying utterances in the data. Sorenson (Reference Sorenson2023) finds that questions and statements are both empirically and theoretically distinct in their effect on judicial behavior. There are more than 85,000 utterances in the data and many hundreds of documents; hand-coding all of them as questions or statements would be impractical. Accordingly, I hand-code several thousand utterances within the NYCOA data as either a statement or a question for transfer learning on LEGAL-BERT. Another deep learning model uses these hand-coded outcomes to predict whether out-of-sample utterances are statements or questions.Footnote 26 The model’s classification achieves a 0.75 F1 score out-of-sample across all classes. See Appendix B.2 of the Supplemental Appendix for further details on the coding, training procedure, and hyperparameter selection.
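Once fine-tuned on the hand-coded labels, out-of-sample classification reduces to an inference pass, as sketched below; the local checkpoint path and label names are hypothetical.

```python
from transformers import pipeline

# Hypothetical path to the LEGAL-BERT checkpoint fine-tuned on the
# hand-coded question/statement labels.
classify = pipeline("text-classification", model="./legal-bert-qs")

utterances = ["Counsel, where in the record is that finding?",
              "That argument was waived below."]
print(classify(utterances))
# e.g., [{'label': 'QUESTION', 'score': ...},
#        {'label': 'STATEMENT', 'score': ...}]
```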
Features of oral arguments, briefs, and cases
Several variables are employed as the focal, or main effect, measures to test the theoretical expectations concerning judicial behavior. Using the LEGAL-BERT embeddings classified by their associated emotional content, I calculate the average positive, negative, and neutral emotion at the justice-case-litigant level. I compute these averages separately for statements and for questions. To capture the relative level of emotionality, the average justice-level emotion directed at the appellant is subtracted from the average directed at the appellee. Thus, for each type of utterance, the main effect measures of emotionality are represented as ∆ Statement Emotion and ∆ Question Emotion, where positive values indicate more positive emotion directed at the appellee relative to the appellant and negative values indicate more negative emotion relative to the appellant. While this operationalization of emotionality does not directly capture all moments of emotional exposure between the litigant and the justice, it does offer observational insight into the emotional disposition of the justice throughout oral arguments.
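Concretely, the differencing could be computed as in the following sketch; the table `scored` and all column names are illustrative stand-ins for the classified utterance data, not the study’s actual variables.

```python
import pandas as pd

# Illustrative classified utterances: emotion coded 0/1/2 from the
# fine-tuned emotion model, utterance_type from the question/statement model.
scored = pd.DataFrame({
    "case_id": ["c1"] * 4,
    "justice": ["Justice A"] * 4,
    "utterance_type": ["question", "question", "statement", "statement"],
    "target": ["appellee", "appellant", "appellee", "appellant"],
    "emotion": [2, 0, 1, 1],
})

means = (scored
         .groupby(["case_id", "justice", "utterance_type", "target"])["emotion"]
         .mean()
         .unstack("target"))

# Positive values: relatively more positive emotion toward the appellee.
means["delta_emotion"] = means["appellee"] - means["appellant"]
print(means["delta_emotion"])
```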
To evaluate the clarity of arguments, I employ the Dale-Chall readability score.Footnote 27 The score quantifies readability using a list of 3,000 commonly understood words, combining average sentence length with the share of words falling outside that list (Chall and Dale Reference Chall and Dale1995). I compute this score separately for statements and questions and then calculate, by justice, the relative difference between the appellee and appellant. This operation yields the remaining main effect measures: ∆ Statement Clarity and ∆ Question Clarity. Doing so captures the relative amount of easily understandable judicial speech directed at a particular litigant, which is rooted in evidence that US Supreme Court opinion clarity is reflective of decision-making (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016).
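As one way to operationalize this scoring, the sketch below uses the textstat library’s Dale-Chall implementation; note that higher raw Dale-Chall scores indicate harder text, so the sign convention for a clarity measure must be set accordingly. Whether textstat matches the exact implementation used here is an assumption.

```python
import textstat

question = "Doesn't the plain text of the statute resolve this appeal?"
statement = "The heretofore unexamined jurisprudential ramifications persist."

# Higher Dale-Chall scores = harder text (Chall and Dale 1995).
for text in (question, statement):
    print(textstat.dale_chall_readability_score(text))
```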
The remaining model covariates are included as controls, drawn from various literatures including judicial politics and political ideology.Footnote 28 These include: Ideological Alignment, the absolute distance between a justice’s ideology and the median ideology of all justices presiding on that case (Martin and Quinn Reference Martin and Quinn2002; Bonica and Woodruff Reference Bonica and Woodruff2015); ∆ Amici Briefs, the difference between the number of amicus curiae briefs filed for the appellee and the number for the appellant (Collins Reference Collins2004); ∆ Attorney General Support, a dummy variable coded 1 if the attorney general supported the appellee either as the litigating party or through amicus curiae briefs, minus the same variable for the appellant (Epstein and Knight Reference Epstein and Knight2013); ∆ Resources, a measure of how well-equipped the appellee is for litigation minus the same measure for the appellant (Collins Reference Collins2004); Salience, a dummy variable coded 1 if relevant media outlets in New York state provided coverage of the case (McAtee and McGuire Reference McAtee and McGuire2007; Vining and Wilhelm Reference Vining and Wilhelm2011); Case Complexity, a latent construct of the number of provisions cited in the briefs and the number of legal issues raised by a case (Goelzhauser, Kassow, and Rice Reference Goelzhauser, Kassow and Rice2021); and Issue Area, a representation of the area of law a case concerns (Johnson Reference Johnson2004).Footnote 29 To account for briefing prior to oral arguments, I construct several variables: Lower Court Similarity, a measure of the linguistic similarity between the substantive content of the NYCOA opinion and the lower court opinion(s) (both district and lower appeals court); and ∆ Brief Similarity, a measure of how similar the content of the appellee’s brief is to the case opinion minus the same similarity for the appellant (Hinkle Reference Hinkle2015; Hazelton, Hinkle, and Spriggs Reference Hazelton, Hinkle and Spriggs2019; Hazelton and Hinkle Reference Hazelton and Hinkle2022).Footnote 30 These measures do not inherently capture emotional language as the main effects do by using classified LEGAL-BERT embeddings, but rather the extent to which the NYCOA borrows textual content from briefs or lower court opinions, which Hazelton and Hinkle (Reference Hazelton and Hinkle2022) empirically show to be an indicator of judicial decision-making.
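The similarity measures can be illustrated with a simple TF-IDF cosine-similarity sketch; this is one plausible operationalization offered for intuition, not necessarily the exact construction in Hinkle (2015) or Hazelton and Hinkle (2022), and the texts below are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

opinion = "The order of the Appellate Division should be affirmed."
appellee_brief = "The Appellate Division correctly affirmed the order below."
appellant_brief = "The decision below misreads the governing statute."

# Row 0 is the opinion; rows 1-2 are the litigants' briefs.
tfidf = TfidfVectorizer().fit_transform(
    [opinion, appellee_brief, appellant_brief])
sims = cosine_similarity(tfidf[0], tfidf[1:]).ravel()

# Positive values: the opinion is textually closer to the appellee's brief.
delta_brief_similarity = sims[0] - sims[1]
print(delta_brief_similarity)
```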
Results
I test my hypotheses by relying on case-justice-level data estimated by the following primary statistical model:

$$\Pr(\text{Affirm}_{ij} = 1) = \operatorname{logit}^{-1}\!\left(\beta_0 + \mathbf{X}_{ij}\boldsymbol{\beta} + \mathbf{Z}_{i}\boldsymbol{\gamma} + u_i\right), \qquad u_i \sim N(0, \sigma_u^2),$$

for i = 1, …, n, where i indexes each case and j indexes each justice ruling on case i,

$$\mathbf{X}_{ij}\boldsymbol{\beta}$$

is the level 1 variation, and

$$\mathbf{Z}_{i}\boldsymbol{\gamma}$$

is the level 2 variation, with $u_i$ the case-level random intercept. I test whether the main effect differs when justice-level fixed effects (FE) are omitted by running the above specification (Models 1 and 2 in Table 2) without including individual justice names.
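As an illustration of this specification, a multilevel logit with case-level random intercepts could be estimated as below; the snippet uses statsmodels’ Bayesian mixed GLM on toy data and is a sketch of the model form, not the software or data actually used in this study.

```python
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Toy justice-case rows; variable names are illustrative.
df = pd.DataFrame({
    "affirm": [1, 0, 1, 1, 0, 1],
    "d_question_emotion": [0.4, -0.2, 0.1, 0.6, -0.3, 0.2],
    "d_question_clarity": [0.3, -0.5, 0.0, 0.2, -0.1, 0.4],
    "case_id": ["c1", "c1", "c2", "c2", "c3", "c3"],
})

# Case-level random intercepts enter through vc_formulas; justice-level
# fixed effects could be added to the formula as C(justice).
model = BinomialBayesMixedGLM.from_formula(
    "affirm ~ d_question_emotion + d_question_clarity",
    vc_formulas={"case": "0 + C(case_id)"},
    data=df)
result = model.fit_vb()  # variational Bayes approximation
print(result.summary())
```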
Table 2. Oral Argument’s Impact on Justices: Multilevel Logit Regression with Random and Fixed Effects

Note: The outcome is a justice voting to affirm. In all instances, the differential indicates more emotion or speech directed at the appellee. Models include case-level random intercepts and justice-level fixed effects (when indicated in the table). Standard log-odds coefficients are reported. Standard errors are in parentheses.
*p < 0.05; **p < 0.01
Simulated data with interactive effects are used later to present easily interpretable quantities of interest: the predicted probability of a justice’s vote. Nonlinear models pose challenges for directly interpreting interaction effects, particularly as compared to standard linear models (Ai and Norton Reference Ai and Norton2003). Thus, I do not substantively interpret coefficient sizes in the subsequent analysis. Fortunately, alternative strategies, such as predicted probabilities held at margins of a predictor, allow for simulating substantively meaningful scenarios that are intuitive when assessing theoretical expectations. After reviewing overall results in Table 2, predicted probabilities for cases and justices themselves are presented in subsequent sections to further examine the hypotheses.
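The translation from a log-odds prediction to a predicted probability follows the inverse logit; the sketch below uses made-up coefficient values purely to illustrate the computation, not the estimates reported in Table 2.

```python
import numpy as np

# Made-up illustrative coefficients (not the paper's estimates).
beta_0, beta_emotion = -0.5, 0.33

grid = np.linspace(-1.0, 1.0, 5)             # margins of delta question emotion
log_odds = beta_0 + beta_emotion * grid
pred_prob = 1.0 / (1.0 + np.exp(-log_odds))  # inverse logit

for x, p in zip(grid, pred_prob):
    print(f"delta emotion = {x:+.1f} -> Pr(affirm) = {p:.2f}")
```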
Table 2 presents results for the main effects considered in my hypotheses: emotion and clarity differentials. The effect of ∆ Emotion is shown without FE for individual justices (Model 1) and with their inclusion (Model 2). This pattern is repeated for the effects of ∆ Clarity without FE (Model 3) and with FE (Model 4). The table separates other covariates between levels of variation to aid in interpretation. I report log-odds coefficients from the multilevel logistic regressions, which estimate the direction of each effect and its statistical significance. These coefficients indicate an increased or decreased likelihood of a justice voting for the appellee (i.e., to affirm).
The results in Table 2 are consistent with hypothesis expectations. Emotion contained within questions, when directed more at the appellee, has a statistically significant (p < 0.01), positive effect of 0.33 for a justice voting to affirm. This result remains robust when directly controlling for variation among justices presiding in the case (Model 2). However, the effect for differentially more emotion contained within statements is not statistically significant (p > 0.05). Taken together, these results suggest that emotion communicated during oral arguments has an effect on judicial behavior when that emotion is posed as a question. This result, while not initially considered in the hypotheses, is not entirely novel. It is also not surprising within this article’s theoretical framework of judicial behavior, given prior research that statements do not communicate the same information as questions (Sorenson Reference Sorenson2023). I refrain from concluding that the lack of significance for statements generally limits the implications or contravenes theoretical expectations. Nonetheless, subsequent models focus on questions’ effects. The remaining models (3–4) in Table 2 present coefficients for ∆ Clarity that support H2 expectations. Linguistic clarity contained within questions that is differentially directed at the appellee has a statistically significant (p < 0.001), positive effect of 0.07 for a justice voting to affirm. This result also remains robust when directly controlling for variation among justices presiding in each case (Model 4). As with emotion, linguistic clarity of statements does not have a statistically significant (p > 0.05) effect on a justice voting to affirm. Lastly, comparing Table 2’s models without FE (Models 1 and 3) to those with FE (Models 2 and 4), respectively, does not reveal any substantive differences in the coefficients to warrant further analysis. Justice-level FE are omitted in subsequent models to ease the estimation of predicted probabilities.
The findings may be confounded by unobserved factors. I test alternative model specifications under which the main effect remains robust.Footnote 31 See Section D in the Supplemental Appendix for complete results of the robustness testing.
Figure 1 displays the main effects from Table 2 – each justice’s predicted probability of voting to affirm. These quantities of interest allow for an intuitive interpretation of effect sizes and their substantive relation to hypothesis expectations. Specifically, the figure addresses H1 and H2 in that I expect a probability greater than a coin-flip (random chance, or > 50%) at positive margins of emotion and clarity. When a justice directs relatively more positive emotion toward the appellee, Figure 1 shows whether that litigant is more likely to win the justice’s vote. The results thus evaluate whether emotional and linguistic cues revealed during oral arguments forecast justices’ votes.

Figure 1. Predicted Probabilities for Oral Argument’s Impact on Justices. Notes: Figure 1(a) is derived from Model 1 and Figure 1(b) is derived from Model 3 in Table 2. The figure shows predicted probabilities for justices voting to affirm at margins of ∆ Emotion (left panel) and ∆ Clarity (right panel). Shaded areas represent 95% confidence intervals. The values of other continuous covariates are held constant at their means and categorical covariates at their reference level.
The figure demonstrates that as the appellee is treated with more positive emotion by a justice relative to the appellant, they are more likely to receive that same justice’s vote when all other covariates are held constant. Held at one unit of ∆ Question Emotion, the probability that a justice votes for the appellee is 84.01% (75.16, 90.12). For ∆ Statement Emotion, the probability is 84.74% (74.69, 91.27), also held at one unit. Figure 1 shows these probabilities decrease slightly using ∆ Clarity as the predictor. With all other covariates held constant, the probability a justice votes for the appellee is 80.47% (70.81, 87.50) held at one unit of ∆ Question Clarity. The probability is 81.06% (71.56, 87.92) held at one unit of ∆ Statement Clarity. While these probabilities have wide confidence intervals, the figure suggests that these cues are important enough to reflect substantially good odds (> 80%) of winning justice votes. In other words, a justice’s emotion and the linguistic clarity reflected in their speech forecast their decision (Black et al. Reference Black, Truel, Johnson and Goldman2011; Dietrich, Ryan, and Sen Reference Dietrich, Ryan and Sen2018; Dickinson Reference Dickinson2019; Chen et al. Reference Chen, Kumar, Motwani and Yeres2025).
The evidence from Figure 1 also suggests that oral arguments may play a unique role in state supreme court decision-making and should be incorporated into state-level theories (Hall Reference Hall, Howard and Randazzo2017). Indeed, the results support Epstein and Weinshall’s (2021) assessment that emotion “complicates our efforts to explain [justices] behavior” (31) using only strategic or legal accounts. Understanding state supreme court decision-making with purely legalistic, personality, strategic, and attitudinal factors is probably insufficient because it omits critical information (Dietrich, Ryan, and Sen Reference Dietrich, Ryan and Sen2018). However, these findings may not translate well to theories specified and tested for the US Supreme Court. These results nonetheless offer novel evidence for both hypotheses: emotionality and linguistic clarity in justices’ questions to litigants during oral arguments explain and forecast state supreme court decision-making.
Sorenson’s (Reference Sorenson2023) distinction between questions and statements (positive vs. negative effects) materializes differently at the state level. In each panel of Figure 1, the predicted probabilities for statements and questions are virtually identical, unsurprising given the lack of statistical significance for statements in Table 2. The figure shows that the difference between questions and statements is 0.73% for ∆ Emotion and 0.59% for ∆ Clarity. This finding seems to indicate that, while there is no negative effect for statements as Sorenson (Reference Sorenson2023) observes, we are unable to fully parse the mechanisms separating questions and statements without additional research. The lack of statistical significance for statements in Table 2, however, encourages focusing on questions’ effects in subsequent analyses.
Strategic processing: institutional actors and case characteristics
There are alternative mechanisms from case characteristics and external actors that can motivate justices to decide a case on strategic factors (Epstein and Knight Reference Epstein and Knight1998), fitting under a systematic reasoning explanation rather than heuristic processing (Todorov, Chaiken, and Henderson Reference Todorov, Chaiken, Henderson, Dillard and Pfau2002; Kahneman 2011). Prior research suggests that salience raises the stakes of decision-making and decreases justices’ susceptibility to being swayed by oral arguments (McAtee and McGuire Reference McAtee and McGuire2007; Black, Johnson, and Wedeking Reference Black, Johnson and Wedeking2012). Moreover, litigants with the support of the government have greater credibility and better odds of receiving a favorable ruling, largely due to judicial deference to the state (Segal Reference Segal1988; Epstein and Knight Reference Epstein and Knight2013). Additional model specifications test these alternatives with interactive effects for two relevant case factors: salience and government involvement.
Table 3 displays coefficients for the main effects of questions in oral arguments interacted with select measures relevant to strategic theories of judicial decision-making (Segal Reference Segal1988; Epstein and Knight Reference Epstein and Knight1998; McAtee and McGuire Reference McAtee and McGuire2007). The interaction with salience is shown for ∆ Question Emotion (Model 5) and ∆ Question Clarity (Model 6). This pattern is repeated for the interaction of New York Attorney General (AG) support with the same effects (Models 7 and 8, respectively). In Table 3, I again report log-odds coefficients from the multilevel logistic regressions, which assess the direction of the effects and their statistical significance.Footnote 32 These coefficients indicate how the likelihood that a justice votes for the appellee shifts across values of the interactions.
Table 3. Oral Argument Questions’ Interactive Effect with Legalistic Predictors

Note: The outcome is a justice voting to affirm. In all instances, the differential indicates more emotion or speech directed at the appellee. Models include case-level random intercepts and justice-level fixed effects (when indicated in the table). Standard log-odds coefficients are reported. Standard errors are in parentheses.
*p < 0.05; **p < 0.01
The results in Table 3 do not give reason to suspect that the effect of emotion or clarity on justice votes is dependent on case salience or government intervention. Emotion contained within questions that is differentially directed at the appellee remains statistically significant (p < 0.01), with a positive effect of 0.32 for a justice voting to affirm among salient cases. A similar sign and significance level is observed for ∆ Question Clarity. The interaction term coefficients of emotion and clarity with case salience are statistically insignificant (p > 0.05); I interpret the shift in the effects of ∆ Question Emotion and ∆ Question Clarity, moving from non-salient to salient cases and from negative to positive differential AG support, as null. In other words, there is no statistical difference observed in the effects of emotion and linguistic clarity when considering strategic theory-driven alternative explanations of decision-making.
These findings, in context with hypothesis expectations, further support the idea that emotion expressed by judges during oral argument is relevant to case outcomes and thus decision-making. Previous research on decision-making as a function of rational actors seeking to achieve their preferences may yet be highly explicative at the US Supreme Court (Maltzman, Spriggs, and Wahlbeck Reference Maltzman, Spriggs and Wahlbeck2000; Epstein and Knight Reference Epstein and Knight1998; Reference Epstein and Knight2013). However, the evidence presented here complicates this limited explanation at the NYCOA.
Does the legal or ideological justice avoid thinking fast?
Justices have unique preferences and judging styles, with distinct policy goals (Segal and Spaeth Reference Segal and Spaeth1993; Reference Segal and Harold2002) that interact with institutional actors (Maltzman, Spriggs, and Wahlbeck Reference Maltzman, Spriggs and Wahlbeck2000; Epstein and Knight Reference Epstein and Knight1998; Reference Epstein and Knight2013). Although attitudinal and strategic decision-making has been a focus of research (D’Elia-Kueper and Segal Reference D’Elia-Kueper, Segal, by and Kosslyn2015), justices consume litigants’ briefs and lower court opinions prior to voting and writing opinions (Maltzman and Bailey Reference Maltzman and Bailey2011; Hall Reference Hall2018; Hazelton and Hinkle Reference Hazelton and Hinkle2022). Accordingly, I carefully evaluate the pro forma legal information that very likely shapes justices’ behavior throughout the appellate proceedings. Evidence that the hypothesis expectations are not dependent on these interactions would be particularly compelling support for the claim that oral arguments influence state supreme courts via emotional judicial speech.
Figure 2 presents predicted probabilities for ∆ Question Emotion and ∆ Question Clarity, both interacted with ∆ Brief Similarity. The accompanying Table 4 shows the raw coefficients for these interactions.Footnote 33 The interaction of ∆ Brief Similarity, as the construct High-Quality Brief where ∆ Brief Similarity is greater than 0, is shown with ∆ Question Emotion (Model 9) and with ∆ Question Clarity (Model 10), respectively. Practically, this measure indicates that the appellee’s brief was more textually incorporated into the NYCOA’s opinion than the appellant’s, which is indicative of briefing’s influence on decision-making (Hazelton, Hinkle, and Spriggs Reference Hazelton, Hinkle and Spriggs2019; Hazelton and Hinkle Reference Hazelton and Hinkle2022; Reference Hazelton and Hinkle2024).

Figure 2. Predicted Probabilities for Oral Argument Questions’ Effect with High-Quality Briefing. Notes: Figure 2(a) is derived from Model 9 and Figure 2(b) is derived from Model 10 in Table 4. The figure shows predicted probabilities for justices voting to affirm at margins of the constituent terms ∆ Question Emotion and ∆ Question Clarity in the interaction with ∆ Brief Similarity > 0. Shaded areas represent 95% confidence intervals. The values of other continuous covariates are held constant at their means and categorical covariates at their reference level.
Table 4. Oral Argument Questions’ Effect on High-Quality Briefing

Note: The outcome is a justice voting to affirm. High-Quality Brief = (1, 0) is a construct of ∆ Brief Similarity where values are greater than 0. Models include case-level random intercepts. Standard log-odds coefficients are reported. Standard errors are in parentheses.
*p < 0.05; **p < 0.01
The interaction terms’ coefficients in Table 4 show that the extent of brief incorporation is not relevant to the effects of emotion and clarity. Models 9–10 yield statistically insignificant (p > 0.05) coefficients for the interactive terms with ∆ Question Emotion and ∆ Question Clarity. I interpret the shift in the effects of ∆ Question Emotion and ∆ Question Clarity, moving from negative to positive brief incorporation, as null. However, the constituent term for ∆ Question Emotion is not significant (p > 0.05); this may reflect sample size limitations within the model. The coefficient for ∆ Question Clarity retains its significance as compared to Table 2, but at a more modest level (p < 0.05). Turning to Figure 2, all other covariates held constant, when an appellee’s brief is differentially more incorporated by a justice, the probability that the justice votes for them is 83.66% (71.84, 91.13), held at a one-unit increase of ∆ Question Emotion. When an appellee’s brief is not more incorporated, a justice votes for them with an 80.87% (68.47, 89.16) probability held at the same margin. These probabilities decline when ∆ Question Clarity is the predictor. When an appellee’s brief is differentially more incorporated by a justice, the probability that the justice votes for them is 78.69% (65.48, 87.79), held at one unit of ∆ Question Clarity. When an appellee’s brief is not more incorporated, a justice votes for them with a 77.88% (65.40, 86.76) probability held at the same margin.
Regardless of whose brief is more incorporated in the NYCOA’s opinion, decision-making remains predictable by the models. This result is not entirely unexpected given that US Supreme Court justices very often raise novel issues during arguments not seen in briefs (Johnson Reference Johnson2004). However, emotion’s influence may be more varied. The evidence taken together suggests that when the Court preferentially incorporates text from briefs into the opinion (Maltzman and Bailey Reference Maltzman and Bailey2011; Hall Reference Hall2018; Hazelton and Hinkle Reference Hazelton and Hinkle2022), this can be partially explained by emotional content during oral arguments.
Beyond justices’ preferences and policy goals, I also test the hypotheses against ideological influences. Figure 3 presents predicted probabilities for the interaction with ideological alignment to test the attitudinal theory’s mediating effect on emotion (Maltzman, Spriggs, and Wahlbeck Reference Maltzman, Spriggs and Wahlbeck2000; Epstein and Knight Reference Epstein and Knight1998; Reference Epstein and Knight2013). Table 5 shows the raw coefficients used in the figure.Footnote 34 Column 1 contains the interaction of Ideological Alignment with ∆ Question Emotion (Model 11) and Column 2 with ∆ Question Clarity (Model 12), respectively. The interaction term, Aligned, takes a value of one when Ideological Alignment is less than 1. Practically, this measure captures cases where there is minimal ideological distance between a justice and the median ideology of the other justices on a given panel.Footnote 35
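To make the coding rule concrete, a minimal sketch of constructing Aligned from justice-level ideology scores follows; the column names and example values are hypothetical, not those of the replication data:

```python
import pandas as pd

# Hypothetical justice-case panel; one row per justice-case observation.
votes = pd.DataFrame({
    "case_id":  [1, 1, 1, 2, 2, 2],
    "justice":  ["A", "B", "C", "A", "B", "D"],
    "ideology": [-0.8, 0.2, 1.1, -0.8, 0.2, 0.5],
})

# Median ideology of the *other* justices on each panel.
votes["panel_median"] = votes.groupby("case_id")["ideology"].transform(
    lambda g: [g.drop(i).median() for i in g.index]
)

# Ideological Alignment as absolute distance from that median;
# Aligned = 1 when the distance falls below the threshold of 1.
votes["ideological_alignment"] = (votes["ideology"] - votes["panel_median"]).abs()
votes["aligned"] = (votes["ideological_alignment"] < 1).astype(int)

print(votes)
```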

Figure 3. Predicted Probabilities for Oral Argument Questions’ Effect with Ideologically Aligned Justices. Notes: Figure 3(a) is derived from Model 11 and Figure 3(b) is derived from Model 12 in Table 5. The figure shows predicted probabilities for justices voting to affirm at margins of the constituent terms ∆ Question Emotion and ∆ Question Clarity in the interaction with Ideological Alignment < 1. Shaded areas represent 95% confidence intervals. The values of other continuous covariates are held constant at their means and categorical covariates at their reference level.
Table 5. Oral Argument Questions’ Effect for Ideological Alignment

Note: The outcome is a justice voting to affirm. Aligned is a binary (1, 0) construct of Ideological Alignment, equal to one when Ideological Alignment is less than 1. Models include case-level random intercepts. Log-odds coefficients are reported. Standard errors are in parentheses.
*p < 0.05; **p < 0.01
The interaction term coefficients in Table 5 again demonstrate that the extent of ideological alignment is not fully relevant to the effects of emotion and clarity. Models 11–12 yield statistically insignificant (p > 0.05) coefficients for the interaction terms with ∆ Question Emotion and ∆ Question Clarity. I therefore interpret the effect of shifts in ∆ Question Emotion and ∆ Question Clarity, moving from less to more ideological alignment, as null. Moreover, the constituent terms for ∆ Question Emotion and ∆ Question Clarity are not significant (p > 0.05). This dampens the evidence that the influence of question emotion and clarity is independent of ideology, but it is not conclusive given the insignificance of the interaction terms themselves. Sample size limitations could again restrict the variance available for emotion and clarity in the specified models. Fortunately, Figure 3 provides additional interpretive context for these coefficients. The probability that an ideologically aligned justice votes for the appellee is 84.37% (75.56, 90.41) at a one-unit increase in ∆ Question Emotion. When the justice is not ideologically aligned, they vote for the appellee with an 80.55% (66.53, 89.62) probability at the same margin of question emotion. These results remain consistent with ∆ Question Clarity as the predictor. The probability that an ideologically aligned justice votes for the appellee is 80.81% (71.49, 87.61) at a one-unit increase in ∆ Question Clarity. When the justice is not ideologically aligned, they vote for the appellee with a 72.08% (58.59, 82.50) probability at the same margin of question clarity.
Comparing Figure 1’s results with Figure 3 reveals similar probabilities for justices’ votes at both levels of the interactions. While the confidence intervals are noticeably wider, this comparison eases concerns from Table 5 and lends credence to the view that the null effect most likely reflects sample size in the interaction rather than larger confounding issues from ideological alignment.
Conclusion
Justices are too often modeled as cold, calculating machines advancing their ideological preferences (Segal and Spaeth Reference Segal and Spaeth1993; Reference Segal and Harold2002), pursuing nebulous strategic goals (Epstein and Knight Reference Epstein and Knight1998; Maltzman, Spriggs, and Wahlbeck Reference Maltzman, Spriggs and Wahlbeck2000), or deciding as a legalist, “apolitical, apartisan, value-free umpire” (Hall Reference Hall2018; Hazelton and Hinkle Reference Hazelton and Hinkle2022; Epstein and Knight Reference Epstein and Knight2013, 13). Legal realists, attitudinalists, and proponents of strategic theory aimed to dispel unrealistic conceptions of decision-making, yet they often view justices as wholly engaged in rational goal pursuit or strict legal analysis. This article departs from these incomplete conceptualizations by incorporating emotion, as a heuristic at work during oral arguments, into state supreme court decision-making through a “thinking-fast” theory (Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022). This contribution reinforces the growing recognition among scholars across multiple disciplines that emotion is important for explaining judicial behavior (Black et al. Reference Black, Truel, Johnson and Goldman2011; Wistrich, Rachlinski, and Guthrie Reference Wistrich, Rachlinski and Guthrie2015; Segal, Sood, and Woodson Reference Segal, Sood and Woodson2019; Epstein, Parker, and Segal Reference Epstein, Parker and Segal2018; Epstein and Weinshall Reference Epstein, Weinshall and Guerriero2021), and it now carries evidence from the oral argument stage of proceedings. However, the results do not rule out the possibility that strategic, attitudinal, or legal theories explain certain behavior. The findings of this article are most beneficial when used to supplement these other theories, as they are “complementary, or at the least, not mutually exclusive” (Epstein, Sadl, and Weinshall Reference Epstein, Sadl and Weinshall2022, 703).
Using a novel dataset comprising nearly a thousand decisions and millions of words from justices, I demonstrate that oral arguments likely influence state supreme court decision-making. The interaction analyses show that, controlling for other factors, emotion and linguistic clarity probably play at least some role in this influence, which is theorized to operate under an affect heuristic model of cognition (Chaiken and Ledgerwood Reference Chaiken, Ledgerwood, Van Lange, Kruglanski and Higgins2011; Kahneman Reference Kahneman2011). Although there are fewer indications of an impact from clarity in judicial speech, emotion often explains decision-making. As Sorenson (Reference Sorenson2023) finds, this predictor is not always immune to other factors and often varies between statements and questions during oral arguments. The implications of these conclusions span multiple literatures. First, previous research had not linked the content and process of oral arguments to decision-making at state supreme courts, despite considerable research doing exactly that for the US Supreme Court (Johnson Reference Johnson2001; Reference Johnson2004; Johnson, Wahlbeck, and Spriggs Reference Johnson, Wahlbeck and Spriggs2006; Black, Johnson, and Wedeking Reference Black, Johnson and Wedeking2012; Ringsmuth, Bryan, and Johnson Reference Ringsmuth, Bryan and Johnson2012). This article addresses that gap and provides evidence that the information conferred during oral arguments explains judicial votes, even when considering well-tested alternative explanations. Second, this article extends the thinking-fast theory beyond state supreme courts, offering a decision-making framework portable to elite actors across institutions, including the US Supreme Court, legislatures, and executives. Third, the results have policy implications. Policymakers should consider the impact of overloaded dockets, barriers to entry for lawyers, and legitimizing factors such as decorum when designing state supreme courts and allocating resources (McGuire Reference McGuire1995; Cann and Goelzhauser Reference Cann and Goelzhauser2023; Black et al. Reference Black, Timothy, Ryan and Justin2024). Promoting professionalism and introducing fresh attorney faces during oral arguments may discourage emotional behavior in favor of objective legal doctrine and arguments. Lastly, this article employed novel natural language processing (NLP) methods and transformer models for large-N text data. Judicial politics, and political science broadly, are increasingly adopting methods such as deep learning, and this trend is expected to grow with the development of large language models (LLMs) (Hall et al. Reference Hall, Hollibaugh, Klingler and Ramey2022).
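To illustrate the general approach, rather than the exact pipeline in the replication materials, a transformer-based emotion score for a judicial utterance can be produced along the following lines (the model choice and signed-score convention here are assumptions):

```python
from transformers import pipeline  # Hugging Face transformers

# A generic sentiment model stands in here for whatever emotion
# classifier the replication materials actually use.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

utterances = [
    "Counsel, that argument simply ignores the plain text of the statute.",
    "That is a fair reading, and I appreciate the candor.",
]

for text in utterances:
    result = classifier(text)[0]
    # Signed score: a positive label keeps its confidence; a negative
    # label flips the sign, yielding one emotion scale per utterance.
    signed = result["score"] if result["label"] == "POSITIVE" else -result["score"]
    print(f"{signed:+.3f}  {text}")
```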
The limitations of this study are apparent. The data cover only one state supreme court in the US, which may limit generalizability. However, previous research has shown that single-state studies can still provide valuable analysis given certain criteria (Nicholson-Crotty and Meier Reference Nicholson-Crotty and Meier2002). In this study, New York provided a rigorous empirical testing ground for state supreme court oral arguments, as its characteristics did not present immediate anomalies compared to other state high courts.Footnote 36 Furthermore, even a restrictive interpretation of these results, comparing only the NYCOA to the US Supreme Court, still has implications for understanding how emotion influences judicial behavior, consistent with a thinking-fast theory of judging. Another key limitation is that the data did not measure emotion at the moment of exposure to case briefs, such as their issues, topics, or scenarios (Black et al. Reference Black, Owens, Wedeking and Wohlfarth2016), or to attorney speech. Collecting data on these exposures would allow comparisons of emotion across multiple phases, oral arguments and briefing alike, for a more complete picture of the decision-making process.
Several avenues for future research remain open. To fully parse the theoretical mechanisms explored in this article, we need more research that measures emotional exposure at additional steps in the decision-making process: the emotional content of briefing, litigants’ emotional speech toward justices, and the emotional content of the court’s opinion. Particularly compelling analyses may measure the emotional content of attorney speech and associated judicial speech line-by-line, response-by-response, to yield a clear chain of emotional exposure (a minimal sketch of such a pairing follows this paragraph). Measuring these points would allow a deeper understanding of the decision-making mechanisms explained by affect heuristic processing and of how oral arguments shape them. The other obvious line of future research is to explore all state supreme courts. The principal obstacle to this inquiry is the effort required to collect and analyze the necessary data; such a dataset would span hundreds of thousands of cases and tens of millions of utterances. This study focused on textual emotion analysis, but future research may also consider Dietrich, Ryan, and Sen’s (Reference Dietrich, Ryan and Sen2018) vocal pitch measure to evaluate the differences between written and vocalized emotionality.
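A minimal sketch of that response-by-response pairing, assuming a hypothetical turn structure and pre-computed emotion scores, could proceed as follows:

```python
from itertools import pairwise  # Python 3.10+

# Hypothetical transcript: (speaker_role, emotion_score) per utterance,
# in chronological order, with scores from any utterance-level classifier.
transcript = [
    ("attorney", -0.42),
    ("justice", -0.61),
    ("attorney", 0.10),
    ("justice", 0.05),
]

# Chain of emotional exposure: each attorney turn paired with the
# immediately following justice response.
exposure_pairs = [
    (a_score, j_score)
    for (a_role, a_score), (j_role, j_score) in pairwise(transcript)
    if a_role == "attorney" and j_role == "justice"
]

for a_score, j_score in exposure_pairs:
    print(f"attorney {a_score:+.2f} -> justice {j_score:+.2f} "
          f"(shift {j_score - a_score:+.2f})")
```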
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/jlc.2025.10008.
Data availability statement
Replication code and data can be found at the Journal of Law and Courts Dataverse (https://doi.org/10.7910/DVN/2NB75D).