A large empirical literature examines how judges’ traits affect how cases get resolved. This literature has led many to conclude that judges matter for case outcomes. But how much do they matter? Existing empirical findings understate the true extent of judicial influence over case outcomes since standard estimation techniques hide some disagreement among judges. We devise a machine learning method to reveal additional sources of disagreement. Applying this method to the Ninth Circuit, we estimate that at least 38% of cases could be decided differently based solely on the panel they were assigned to.
Existing approaches to conducting inference about the local average treatment effect (LATE) require assumptions that are considered tenuous in many applied settings. In particular, instrumental variable techniques require monotonicity and the exclusion restriction, while principal score methods rest on some form of the principal ignorability assumption. This paper provides new results showing that an estimator within the class of principal score methods allows conservative inference about the LATE without invoking such assumptions. I term this estimator the Compliance Probability Weighting estimator and show that, under very mild assumptions, it is asymptotically conservative for the LATE. I apply this estimator to a recent survey experiment and provide evidence of a stronger effect for the subset of compliers than the original authors had uncovered.
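The weighting idea can be conveyed with a stylized per-stratum calculation. The snippet below is a toy sketch of principal-score-style compliance weighting, not the paper's exact Compliance Probability Weighting estimator; all stratum sizes, compliance probabilities, and intent-to-treat (ITT) effects are synthetic, constructed so that each stratum's ITT equals the complier effect times the compliance probability.

```python
# Toy sketch (synthetic numbers, not the paper's estimator): weight each
# covariate stratum's intent-to-treat effect by its estimated compliance
# probability. When ITT_x = tau * p_x for a common complier effect tau,
# the normalized weighted ratio below recovers tau exactly.

strata = [
    # (n_x, p_x = estimated compliance probability, ITT_x)
    (100, 0.8, 0.8 * 0.5),
    (200, 0.5, 0.5 * 0.5),
    (150, 0.2, 0.2 * 0.5),
]

num = sum(n * p * itt for n, p, itt in strata)  # sum of n_x * p_x * ITT_x
den = sum(n * p * p for n, p, _ in strata)      # sum of n_x * p_x^2
cpw_estimate = num / den
print(cpw_estimate)  # 0.5, the complier effect built into the toy data
```

The point of the construction is that naive averaging of stratum ITTs would understate the complier effect, while compliance-probability weighting rescales each stratum by how many compliers it actually contains.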
Many philosophers think that doing philosophy cultivates valuable intellectual abilities and dispositions. Indeed this is a premise in a venerable argument for philosophy’s value. Unfortunately, empirical support for this premise has heretofore been lacking. We provide evidence that philosophical study has such effects. Using a large dataset (including records from over half a million undergraduates at hundreds of institutions across the United States), we investigate philosophy students’ performance on verbal and logical reasoning tests, as well as measures of valuable intellectual dispositions. Results indicate that students with stronger verbal abilities, and who are more curious, open-minded, and intellectually rigorous, are more likely to study philosophy. Nonetheless, after accounting for such baseline differences, philosophy majors outperform all other majors on tests of verbal and logical reasoning and on a measure of valuable habits of mind. This offers the strongest evidence to date that studying philosophy does indeed make people better thinkers.
This chapter focuses on correlation, a key metric in data science that quantifies the extent to which two quantities are linearly related. We begin by defining correlation between normalized and centered random variables. Then, we generalize the definition to all random variables and introduce the concept of covariance, which measures the average joint variation of two random variables. Next, we explain how to estimate correlation from data and analyze the correlation between the height of NBA players and different basketball statistics. In addition, we study the connection between correlation and simple linear regression. We then discuss the difference between uncorrelatedness and independence. In order to gain better intuition about the properties of correlation, we provide a geometric interpretation in which the covariance is an inner product between random variables. Finally, we show that correlation does not imply causation, as illustrated by the spurious correlation between temperature and unemployment in Spain.
This chapter describes how to model multiple discrete quantities as discrete random variables within the same probability space and manipulate them using their joint pmf. We explain how to estimate the joint pmf from data and use it to model precipitation in Oregon. Then, we introduce marginal distributions, which describe the individual behavior of each variable in a model, and conditional distributions, which describe the behavior of a variable when other variables are fixed. Next, we generalize the concepts of independence and conditional independence to random variables. In addition, we discuss the problem of causal inference, which seeks to identify causal relationships between variables. We then turn our attention to a fundamental challenge: it is impossible to completely characterize the dependence between all the variables in a model unless there are very few of them. This phenomenon, known as the curse of dimensionality, is the reason why independence assumptions are needed to make probabilistic models tractable. We conclude the chapter by describing two popular models based on such assumptions: naive Bayes and Markov chains.
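A minimal sketch of estimating a joint pmf from data and deriving marginal and conditional distributions from it (the precipitation counts are invented, not the Oregon data analyzed in the chapter):

```python
from collections import Counter

# Invented daily observations: (precipitation, season) pairs.
data = [("rain", "winter")] * 30 + [("dry", "winter")] * 10 \
     + [("rain", "summer")] * 5 + [("dry", "summer")] * 35

n = len(data)
joint = {pair: c / n for pair, c in Counter(data).items()}  # joint pmf estimate

# Marginal pmf of precipitation: sum the joint pmf over seasons.
p_rain = sum(p for (w, _), p in joint.items() if w == "rain")

# Conditional pmf of precipitation given winter: renormalize that slice.
p_winter = sum(p for (_, s), p in joint.items() if s == "winter")
p_rain_given_winter = joint[("rain", "winter")] / p_winter
print(p_rain, p_rain_given_winter)  # 0.4375 and 0.75
```

The marginal and the conditional disagree sharply here (0.44 vs. 0.75), which is exactly the dependence between the two variables that the joint pmf captures and a marginal alone would hide.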
This chapter begins by defining an averaging procedure for random variables, known as the mean. We show that the mean is linear, and also that the mean of the product of independent variables equals the product of their means. Then, we derive the mean of popular parametric distributions. Next, we caution that the mean can be severely distorted by extreme values, as illustrated by an analysis of NBA salaries. In addition, we define the mean square, which is the average squared value of a random variable, and the variance, which is the mean squared deviation from the mean. We explain how to estimate the variance from data and use it to describe temperature variability at different geographic locations. Then, we define the conditional mean, a quantity that represents the average of a variable when other variables are fixed. We prove that the conditional mean is an optimal solution to the problem of regression, where the goal is to estimate a quantity of interest as a function of other variables. We end the chapter by studying how to estimate average causal effects.
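The quantities defined above can be illustrated in a few lines (the temperature readings are invented): the two locations below share the same conditional mean, but only the variance reveals how differently they behave.

```python
import statistics
from collections import defaultdict

# Invented (location, temperature) readings.
readings = [("coastal", 14), ("coastal", 16), ("coastal", 15),
            ("inland", 5), ("inland", 25), ("inland", 15)]

temps = [t for _, t in readings]
mean = statistics.fmean(temps)      # the average value
var = statistics.pvariance(temps)   # mean squared deviation from the mean

# Conditional mean: the average temperature with the location held fixed.
by_loc = defaultdict(list)
for loc, t in readings:
    by_loc[loc].append(t)
cond_mean = {loc: statistics.fmean(ts) for loc, ts in by_loc.items()}
print(mean, cond_mean)  # overall mean 15.0; both conditional means also 15.0
```

Both locations average 15 degrees, yet the inland readings swing between 5 and 25; this is precisely the variability that the variance quantifies and the (conditional) mean cannot see.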
Quantifying the relationships between variables is affected by the spatial structure in which they occur and by the scales of the processes that affect them. First, this chapter covers the topics of spatial regression, spatial causal inference, and the Mantel and partial Mantel statistics. These are all methods designed to assess the relationships between variables of interest within a spatial structure. Then, multiscale analysis is presented because it is key to understanding how ecological processes and patterns change with the scale of observation. Indeed, multiscale analysis has become increasingly important as ecologists address studies at larger and larger scales, with an increasing probability of significant spatial heterogeneity. We describe several approaches, including multiscale ordination (MSO), Moran’s eigenvector maps (MEMs) and wavelet decomposition.
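As a concrete illustration of the simplest of these methods, the sketch below computes a Mantel statistic (the Pearson correlation between the upper triangles of two distance matrices) with a permutation test. The site coordinates are synthetic and chosen so the two matrices are exactly proportional, giving a Mantel r of 1.

```python
import itertools
import math
import random

# Synthetic coordinates for 6 sites; the second variable is a scaled copy,
# so the two distance matrices are proportional and the Mantel r is 1.
sites = [(0, 0), (1, 0), (0, 2), (3, 1), (2, 2), (4, 0)]
env = [(2 * x, 2 * y) for x, y in sites]

def upper_distances(pts, order):
    """Upper triangle of the pairwise distance matrix under a site ordering."""
    return [math.dist(pts[order[i]], pts[order[j]])
            for i, j in itertools.combinations(range(len(pts)), 2)]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a)
                           * sum((y - mb) ** 2 for y in b))

ident = list(range(len(sites)))
r_obs = pearson(upper_distances(sites, ident), upper_distances(env, ident))

# Permutation test: shuffle the site labels of one matrix (rows and columns
# together) and recompute the statistic under the null of no association.
random.seed(0)
perms, count = 999, 0
for _ in range(perms):
    order = ident[:]
    random.shuffle(order)
    if pearson(upper_distances(sites, ident),
               upper_distances(env, order)) >= r_obs:
        count += 1
p_value = (count + 1) / (perms + 1)
print(r_obs, p_value)  # r is 1 by construction; the p-value is small
```

Shuffling whole site labels, rather than individual matrix cells, is what preserves the within-matrix distance structure under the null.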
Political scientists regularly rely on a selection-on-observables assumption to identify causal effects of interest. Once a causal effect has been identified in this way, a wide variety of estimators can, in principle, be used to consistently estimate the effect of interest. While these estimators are all justified by appeals to the same causal identification assumptions, they often differ greatly in how they make use of the data at hand. For instance, methods based on regression rely on an explicit model of the outcome variable but do not explicitly model the treatment assignment process, whereas methods based on propensity scores explicitly model the treatment assignment process but do not explicitly model the outcome variable. Understanding the tradeoffs between estimation methods is complicated by these seemingly fundamental differences. In this paper we seek to rectify this problem. We do so by clarifying how most estimators of causal effects that are justified by an appeal to a selection-on-observables assumption are all special cases of a general weighting estimator. We then explain how this commonality yields diagnostics that allow meaningful comparisons across estimation methods, even when the methods are seemingly very different. We illustrate these ideas with two applied examples.
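The commonality the paper exploits can be previewed in a toy discrete example: with a saturated propensity model, a Hájek-style inverse-propensity-weighted estimate of E[Y(1)] coincides exactly with the stratified outcome-model (standardization) estimate, because both apply the same implicit weights to the data. The records below are synthetic.

```python
from collections import defaultdict

# Synthetic records: (stratum x, treatment z, outcome y).
data = [("a", 1, 3), ("a", 1, 5), ("a", 0, 2),
        ("b", 1, 10), ("b", 0, 6), ("b", 0, 8)]

# Saturated propensity model: e(x) = treated share within each stratum.
n_x, n1_x = defaultdict(int), defaultdict(int)
for x, z, _ in data:
    n_x[x] += 1
    n1_x[x] += z
e = {x: n1_x[x] / n_x[x] for x in n_x}

# Hajek IPW estimate of E[Y(1)]: weight treated outcomes by 1/e(x), normalize.
weights = [(1 / e[x], y) for x, z, y in data if z == 1]
ipw = sum(w * y for w, y in weights) / sum(w for w, _ in weights)

# Outcome-model (standardization) estimate: treated stratum means,
# averaged over the empirical distribution of the stratum variable.
mean1 = {x: sum(y for xx, z, y in data if xx == x and z == 1) / n1_x[x]
         for x in n_x}
std = sum(n_x[x] * mean1[x] for x in n_x) / len(data)
print(ipw, std)  # identical: the two estimators share the same implicit weights
```

With richer (non-saturated) models the two estimators diverge, and comparing their implicit weights is exactly the kind of diagnostic the abstract describes.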
We present a method for narrowing nonparametric bounds on treatment effects by adjusting for potentially large numbers of covariates, using generalized random forests. In many experimental or quasi-experimental studies, outcomes of interest are only observed for subjects who select (or are selected) to engage in the activity generating the outcome. Outcome data are thus endogenously missing for units who do not engage, and random or conditionally random treatment assignment before such choices is insufficient to identify treatment effects. Nonparametric partial identification bounds address endogenous missingness without having to make disputable parametric assumptions. Basic bounding approaches often yield bounds that are wide and minimally informative. Our approach can tighten such bounds while permitting agnosticism about the data-generating process and honest inference. A simulation study and replication exercise demonstrate the benefits.
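The bounding logic can be illustrated with a stylized Lee-type trimming calculation (all numbers synthetic; the paper's contribution, tightening such bounds with generalized random forests, is not reproduced here). Outcomes are observed only for units that engage, and the treated engager distribution is trimmed down to the control engagement rate.

```python
# Stylized Lee-type trimming bounds (synthetic numbers). Outcomes are
# observed only for units that engage; trimming makes both arms compare
# the same implicit share of "always-engagers".

y1_obs = [1, 2, 3, 4, 5, 6, 7, 8]   # treated arm: 8 engagers out of n1 = 10
y0_obs = [2, 3, 4, 3, 4, 2]         # control arm: 6 engagers out of n0 = 10
n1, n0 = 10, 10

p1, p0 = len(y1_obs) / n1, len(y0_obs) / n0
k = round((p0 / p1) * len(y1_obs))  # treated engagers to keep: 6 of 8 here

ys = sorted(y1_obs)
lower1 = sum(ys[:k]) / k            # worst case: keep only the k smallest
upper1 = sum(ys[-k:]) / k           # best case: keep only the k largest

mean0 = sum(y0_obs) / len(y0_obs)
bounds = (lower1 - mean0, upper1 - mean0)
print(bounds)  # (0.5, 2.5) with these numbers
```

Performing this trimming within covariate cells and aggregating, as the paper does with generalized random forests, can only shrink the resulting interval, which is the sense in which covariate adjustment tightens the bounds.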
The conventional wisdom in political science is that incumbency provides politicians with a massive electoral advantage. This assumption has been challenged by the recent anti-incumbent cycle. When is incumbency a blessing for politicians and when is it a curse? Incumbency Bias offers a unified theory that argues that democratic institutions will make incumbency a blessing or curse by shaping the alignment between citizens' expectations of incumbent performance and incumbents' capacity to deliver. This argument is tested through a comparative investigation of incumbency bias in Brazil, Argentina and Chile that draws on extensive fieldwork and an impressive array of experimental and observational evidence. Incumbency Bias demonstrates that rather than clientelistic or corrupt elites compromising accountability, democracy can generate an uneven playing field if citizens demand good governance but have limited information. While focused on Latin America, this book carries broader lessons for understanding the electoral returns to office around the world.
How do race, ethnicity, and gender shape a legislator’s approach to bill sponsorship and cosponsorship? This paper examines how institutional marginalization influences the legislative strategies of racial and gender minority representatives. Constrained by systemic barriers that limit their ability to sponsor legislation, minority legislators prioritize cosponsorship to achieve policy goals, build coalitions, and demonstrate responsiveness to their constituencies. Using the quasi-experimental context of the 2016 and 2018 U.S. congressional elections, I apply the synthetic difference-in-differences estimator and find that minority legislators sponsor fewer bills but cosponsor significantly more than their non-Hispanic White counterparts. Additionally, race-gendered effects reveal that women of color sponsor significantly less legislation than non-minority legislators and men of color. These patterns cannot be explained by factors like freshman status or primary election competitiveness. The findings highlight the strategic adaptations of minority legislators to navigate structural inequities and amplify their legislative influence. This study is the first to use a causal inference approach to explore the intersection of race, gender, and sponsorship and cosponsorship of congressional bills, contributing to a deeper understanding of legislative behavior among marginalized groups.
Sustainable agricultural practices have become increasingly important due to growing environmental concerns and the urgent need to mitigate the climate crisis. Digital agriculture, through advanced data analysis frameworks, holds promise for promoting these practices. Pesticides are a common tool in agricultural pest control; they are key to ensuring food security but also contribute significantly to the climate crisis. To combat this, Integrated Pest Management (IPM) stands as a climate-smart alternative. We propose a causal and explainable framework for enhancing digital agriculture, using pest management and its sustainable alternative, IPM, as a key example to highlight the contributions of causality and explainability. Despite its potential, IPM faces low adoption rates due to farmers’ skepticism about its effectiveness. To address this challenge, we introduce an advanced data analysis framework tailored to enhance IPM adoption. Our framework provides (i) robust pest population predictions across diverse environments using invariant and causal learning, (ii) explainable pest presence predictions using transparent models, (iii) actionable advice through counterfactual explanations for in-season IPM interventions, (iv) field-specific treatment effect estimations, and (v) assessments of the effectiveness of our advice using causal inference. By incorporating these features, our study illustrates the potential of causality and explainability to enhance digital agriculture by promoting climate-smart and sustainable practices, focusing on the specific case of pest management. Here, our framework aims to alleviate skepticism and encourage wider adoption of IPM practices among policymakers, agricultural consultants, and farmers.
In this opinion article, we discuss the application of critical realism as an alternative to the biopsychosocial model for understanding psychiatric disorders. Critical realism presents a stratified view of reality and recognises mental disorders as emergent phenomena; that is, their full explanation cannot be reduced to explanations at any lower level of biological processes alone. It thus underscores the significance of ontological depth, the interaction between agency and structure, and the context dependency and complex nature of causality. Critical realism provides the conceptual and epistemological basis for a more nuanced understanding of the aetiology of psychiatric conditions, which is multifactorial and includes biological, psychological and social dimensions. By exposing the conceptual and practical shortcomings of the biopsychosocial model, critical realism promises to advance the understanding of mental disorders and enable a more holistic approach to the care of people with mental disorders.
Emerging evidence suggests a co-occurrence of attention-deficit hyperactivity disorder (ADHD) and immune response-related conditions. However, it is unclear whether there is a causal relationship between ADHD and immune response.
Methods
We investigated the associations between ADHD traits, common variant genetic liability to ADHD, and serum C-reactive protein (CRP) levels in childhood and adulthood, using data from the Avon Longitudinal Study of Parent and Children. Genetic correlation was estimated using linkage-disequilibrium score regression. Two-sample Mendelian randomization (MR) was conducted to test potential causal effects of ADHD genetic liability on serum CRP as an indicator of systemic inflammation, as well as the genetically proxied levels of specific plasma cytokines.
Results
There was little evidence of an association between ADHD traits and CRP in childhood or adulthood. ADHD genetic liability was associated with higher serum CRP at ages 9 (β = 0.02; 95% confidence interval [CI] = 0.00, 0.03), 15 (β = 0.04; 95% CI = 0.02, 0.06), and 24 years (β = 0.03; 95% CI = 0.01, 0.05). There was evidence of genetic correlation between ADHD and CRP (r_g = 0.27; 95% CI = 0.19, 0.35). Two-sample MR found evidence of a bidirectional effect between genetic liability to ADHD and CRP (ADHD-CRP: β_IVW = 0.04, 95% CI = 0.01, 0.07; CRP-ADHD: OR_IVW = 1.09, 95% CI = 1.01, 1.17).
Conclusions
Further work is necessary to understand the biological pathways linking ADHD genetic liability and CRP, and to gain insight into how they might contribute to the links between ADHD and later-life adverse physical and mental health outcomes.
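For readers unfamiliar with the IVW estimator reported in the results, it is a ratio of inverse-variance-weighted sums of per-SNP summary statistics. The snippet below uses synthetic numbers, not the study's data.

```python
# Illustrative inverse-variance-weighted (IVW) Mendelian randomization
# estimate from per-SNP summary statistics (all numbers synthetic).
# beta_x: SNP-exposure effects; beta_y: SNP-outcome effects; se_y: their SEs.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.02, 0.04, 0.03]
se_y = [0.01, 0.01, 0.01]

num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
beta_ivw = num / den
print(beta_ivw)  # each per-SNP Wald ratio by/bx is 0.2, so the IVW estimate is 0.2
```

The bidirectional design in the study simply runs this twice, swapping which trait's genetic instruments play the role of the exposure.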
Many influential political science articles use close elections to study how important outcomes vary after a certain type of candidate wins, such as a Democrat or a Republican. This politician characteristic regression discontinuity (PCRD) design offers opportunities for inferential leverage but also the potential for confusion. In this article, we clarify what causal claims the PCRD licenses, offering a rigorous causal analysis that points to three principal lessons. First, PCRDs do nothing to isolate the effect of the politician characteristic of interest as distinct from other politician characteristics. Second, selection processes (regarding both “who runs” and “which elections are close”) can generate and exacerbate such confounding, as noted in Marshall (2024). Third, and more fortunately, this approach does make it possible to estimate the average effect of electing a leader of type “A” vs. “B” in the context of close elections, treating the units as districts, not leaders. We also suggest a set of tools that can aid in falsifying key assumptions, avoiding unwarranted claims, and surfacing mechanisms of interest. We illustrate these issues and tools through a reanalysis of an influential study about what happens when extremists win primaries (Hall 2015).
Do more restrictive voter identification (ID) laws decrease turnout? I argue that in the 2018 London local elections this was the case. Bromley was the only London borough to pilot a more restrictive ID scheme. The scheme was assessed by the Electoral Commission and Cabinet Office but lacked a good estimate of its impact on turnout. Applying a synthetic difference-in-differences (DID) methodology, which has several benefits compared to traditional DID methods, to turnout data from 2002 to 2018, I show that turnout was between 4.0 and 5.0 percentage points lower than would otherwise be expected. This indicates that more restrictive ID laws can meaningfully limit turnout, which has implications for future elections if governments choose to implement a more restrictive regime.
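A stripped-down synthetic-control-plus-DID sketch conveys the intuition (the turnout numbers are invented, and this is not the full synthetic DID estimator with unit and time weights used in the paper): control boroughs are weighted to reproduce the treated borough's pre-period turnout, and the effect is the treated post-period change minus the synthetic control's change.

```python
# Stripped-down synthetic-control-plus-DID illustration (invented data).
# "treated" is matched to a weighted average of two control units on
# pre-period turnout; the effect is the difference in post-period changes.

pre = {"treated": [40.0, 42.0, 41.0],
       "c1": [44.0, 48.0, 46.0],
       "c2": [36.0, 36.0, 36.0]}
post = {"treated": 37.0, "c1": 47.0, "c2": 37.0}

# One-dimensional least squares for the simplex weight w on c1 (1 - w on c2).
num = sum((t - b) * (a - b)
          for t, a, b in zip(pre["treated"], pre["c1"], pre["c2"]))
den = sum((a - b) ** 2 for a, b in zip(pre["c1"], pre["c2"]))
w = min(1.0, max(0.0, num / den))

synth_pre = [w * a + (1 - w) * b for a, b in zip(pre["c1"], pre["c2"])]
synth_post = w * post["c1"] + (1 - w) * post["c2"]

# DID against the synthetic control: treated change minus synthetic change.
effect = (post["treated"] - pre["treated"][-1]) - (synth_post - synth_pre[-1])
print(w, effect)  # w = 0.5 reproduces the treated pre-trend; effect = -5.0
```

The invented data are built so the weighted controls match the treated pre-period exactly; in real applications the fit is imperfect, which is one reason the paper's estimator also weights time periods.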
Researchers would often like to leverage data from a collection of sources (e.g., meta-analyses of randomized trials, multi-center trials, pooled analyses of observational cohorts) to estimate causal effects in a target population of interest. However, because different data sources typically represent different underlying populations, traditional meta-analytic methods may not produce causally interpretable estimates that apply to any reasonable target population. In this article, we present the CausalMetaR R package, which implements robust and efficient methods to estimate causal effects in a given internal or external target population using multi-source data. The package includes estimators of average and subgroup treatment effects for the entire target population. To produce efficient and robust estimates of causal effects, the package implements doubly robust and non-parametric efficient estimators and supports using flexible data-adaptive (e.g., machine learning techniques) methods and cross-fitting techniques to estimate the nuisance models (e.g., the treatment model, the outcome model). We briefly review the methods, describe the key features of the package, and demonstrate how to use the package through an example. The package aims to facilitate causal analyses in the context of meta-analysis.
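The package's R interface is not reproduced here; as a language-neutral illustration of why source and target estimates differ, transporting a stratum-level effect to a target population amounts to re-weighting by the target's covariate shares (all numbers synthetic).

```python
# Synthetic stratum-level effects estimated from source data, plus the
# covariate (stratum) shares in the source and target populations.
effect = {"young": 2.0, "old": 6.0}
source_share = {"young": 0.7, "old": 0.3}
target_share = {"young": 0.4, "old": 0.6}

source_ate = sum(source_share[x] * effect[x] for x in effect)
target_ate = sum(target_share[x] * effect[x] for x in effect)
print(source_ate, target_ate)  # roughly 3.2 in the source vs 4.4 in the target
```

Because the target has more of the high-effect stratum, a naive pooled source estimate would understate the target-population effect; the doubly robust machinery in the package generalizes this re-weighting to continuous covariates and estimated nuisance models.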
Guerini and Moneta (2017) have developed a sophisticated method of providing empirical evidence in support of the relations of causal dependence that macroeconomists engaging in agent-based modelling believe obtain in the target system of their models. The paper presents three problems that get in the way of successful applications of this method: problems that have to do with the potential chaos of the target system, the non-measurability of variables standing for individual or aggregate expectations, and the failure of macroeconomic aggregates to screen off individual expectations from the microeconomic quantities that constitute the aggregates. The paper also discusses the in-principle solvability of the three problems and uses a prominent agent-based model (the Keynes + Schumpeter model of the macroeconomy) as a running example.
Oral argument is the most public and visible part of the U.S. Supreme Court’s decision-making process. Yet what if some advocates are treated differently before the Court solely because of aspects of their identity? In this work, we leverage a causal inference framework to quantify the effect of an advocate’s gender on interruptions of advocates at both the Court-level and the justice-level. Examining nearly four decades of U.S. Supreme Court oral argument transcript data, we identify a clear and consistent gender effect that dwarfs other influences on justice interruption behavior, with female advocates interrupted more frequently than male advocates.