Causal inference lies at the core of many scientific endeavours. Yet answering causal questions is challenging, especially when studying culture as a causal force. Against this backdrop, this paper reviews research designs and statistical tools that can be used – together with strong theory and knowledge about the context of study – to identify the causal impact of culture on outcomes of interest. We especially discuss how overlooked strategies in cultural evolutionary studies can allow one to approximate an ideal experiment wherein culture is randomly assigned to individuals or entire groups (instrumental variables, regression discontinuity design, and epidemiological approach). In doing so, we also review the potential outcome framework as a tool to engage in causal reasoning in the cultural evolutionary field.
This paper studies whether school-based financial education has spillover effects from children to parents. Leveraging data from a large-scale experiment with public high schools in Peru and credit bureau records on the parents of the youth targeted, this study measures the impact of providing personal finance lessons during secondary school on parental financial behavior. Financial education lessons in the school yield limited average spillover effects, but lead to sizable effects on parental financial behavior within disadvantaged households. Among parents from poorer households, the treatment reduces default probability by 26%, increases credit scores by 5%, and increases current debt levels by 40%. The treatment has stronger effects among the parents of daughters, who experience a significant 6.7% increase in their credit score and a 28% reduction in their loan portfolio in arrears. Among the parents of boys, most of the spillover effects are muted.
Causal inference and machine learning are typically introduced in the social sciences separately as theoretically distinct methodological traditions. However, applications of machine learning in causal inference are increasingly prevalent. This Element provides theoretical and practical introductions to machine learning for social scientists interested in applying such methods to experimental data. We show how machine learning can be useful for conducting robust causal inference and provide a theoretical foundation researchers can use to understand and apply new methods in this rapidly developing field. We then demonstrate two specific methods – the prediction rule ensemble and the causal random forest – for characterizing treatment effect heterogeneity in survey experiments and testing the extent to which such heterogeneity is robust to out-of-sample prediction. We conclude by discussing limitations and tradeoffs of such methods, while directing readers to additional related methods available on the Comprehensive R Archive Network (CRAN).
We describe fundamental challenges to estimating heterogeneous treatment effects in the context of the statistical causal inference literature, proposed algorithms for addressing those challenges, and methods to evaluate how well heterogeneous treatment effects have been estimated. We illustrate the proposed algorithms using data from two large randomized trials of blood pressure treatments. We describe directions for future research in medical statistics and machine learning in this domain. The focus will be on how flexible machine learning methods can improve causal estimators, especially in the RCT setting.
Matching is a conceptually straightforward method to make groups of units comparable on observed characteristics. The method is, however, limited to settings where the study design is simple and the sample is moderately sized. We illustrate these limitations by asking what the causal effects would have been if a large-scale voter mobilization experiment that took place in Michigan for the 2006 election were scaled up to the full population of registered voters. Matching could help us answer this question, but no existing matching method can accommodate the six treatment arms and the 6,762,701 observations involved in the study. To offer a solution for this and similar empirical problems, we introduce a generalization of the full matching method that can be used with any number of treatment conditions and complex compositional constraints. The associated algorithm produces near-optimal matchings; the worst-case maximum within-group dissimilarity is guaranteed to be no more than four times greater than the optimal solution, and simulation results indicate that it comes considerably closer to the optimal solution on average. The algorithm’s ability to balance the treatment groups does not sacrifice speed, and it uses little memory, terminating in linearithmic time using linear space. This enables investigators to construct well-performing matchings within minutes even in complex studies with samples of several million units.
Surveys are a key tool for understanding political behavior, but they are subject to biases that render their estimates about the frequency of socially desirable behaviors inaccurate. For decades the American National Election Study (ANES) has overestimated voter turnout, though the causes of this persistent bias are poorly understood. The face-to-face component of the 2012 ANES produced a turnout estimate at least 13 points higher than the benchmark voting-eligible population turnout rate. We consider three explanations for this overestimate in the survey: nonresponse bias, over-reporting and the possibility that the ANES constitutes an inadvertent mobilization treatment. Analysis of turnout data supplied by voter file vendors allows the three phenomena to be measured for the first time in a single survey. We find that over-reporting is the largest contributor, responsible for six percentage points of the turnout overestimate, while nonresponse bias and mobilization account for an additional 4 and 3 percentage points, respectively.
Are student subject experiment pools comparable across institutions? Despite repeated concerns over the “college sophomore problem,” many experiment-based studies still rely on student subject pools due to their convenience and accessibility. In this paper, I investigate whether student subject pools are comparable across universities by examining how respondents across three student subject pools at distinct educational institutions perform on the same survey experiment about crisis bargaining between states. I argue that, due to selection biases inherent in university matriculation and the self-selection of students into experimental protocols, respondents across these subject pools will exhibit key demographic differences. I also examine whether respondents across these subject pools think similarly about international politics and respond comparably to experimental treatments. I find that, while there are significant demographic differences across subject pools, subjects across institutions respond similarly to experimental treatments—with the key exception of information regarding the regime type of a state. Furthermore, there is little evidence that these demographic differences impact conditional average treatment effects across subgroups. These findings carry critical implications for the use of student samples across political science and within international relations more specifically, particularly regarding the current replication crisis in the discipline.
Our research investigates the effects of residential energy efficiency audit programs on subsequent household electricity consumption. Here there is a one-time interaction between households, which participate voluntarily, and the surveyors. Our research objective is to determine whether and to what extent the surveys lead to behavioral changes. We then examine how persistent the effect of the intervention is over time and whether it decays or intensifies. The main evaluation problem here is survey participants’ self-selection, which we address econometrically via several non-parametric estimators involving kernel-based propensity-score matching. Our first estimator is difference-in-differences (DID). Our second estimator is quantile DID, which produces estimates across the outcome distribution. The comparison group consists of households that had not yet participated in the survey but participated later. We find that customers who participated in the survey reduced their electricity consumption by about 7% on average, compared to customers who had not yet participated in the survey. Considering the total number of high-usage households participating in the survey in 2009, we estimate that electricity consumption was reduced by an aggregate of 2 million kWh per year, which is approximately equal to the monthly consumption of 3500 typical households in California, with an estimated 1527 metric tons less of carbon dioxide emissions. Because the energy audit program is inexpensive ($10–$20 per household), it is cost-effective; a key remaining question is whether it is regressive. We find that as the quantiles of the outcome distribution increase, high-use households save proportionally less electricity than do low-use customers. Overall, our results imply that program designers can better target low-use and low-income households, because they are more likely to benefit from the programs through energy savings.
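As a rough illustration of the difference-in-differences idea mentioned in this abstract, the sketch below computes a basic two-group, two-period DID estimate from four group means. It is not the authors' estimator (which combines DID with kernel-based propensity-score matching). The function name and all numbers are hypothetical, chosen only to show the arithmetic.

```python
def did_estimate(treated_pre: float, treated_post: float,
                 control_pre: float, control_post: float) -> float:
    """Basic 2x2 difference-in-differences on group means.

    The change in the comparison group stands in for what would have
    happened to the treated group without the intervention
    (the parallel-trends assumption).
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Hypothetical mean monthly consumption in kWh (illustrative values only).
effect = did_estimate(treated_pre=1000.0, treated_post=920.0,
                      control_pre=1000.0, control_post=990.0)

# Express the effect relative to the treated group's baseline.
relative_effect = effect / 1000.0
```

With these made-up inputs the treated group falls by 80 kWh and the comparison group by 10 kWh, so the DID estimate is a 70 kWh reduction, or 7% of the baseline.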
This article examines how the presentation of information during a laboratory experiment can alter a study’s findings. We compare four possible ways to present information about hypothetical candidates in a laboratory experiment. First, we manipulate whether subjects experience a low-information or a high-information campaign. Second, we manipulate whether the information is presented statically or dynamically. We find that the design of a study can produce very different conclusions. Using the candidate’s gender as our manipulation, we find significant effects on a variety of candidate evaluation measures in low-information conditions, but almost no significant effects in high-information conditions. We also find that subjects in high-information settings tend to seek out more information in dynamic environments than in static ones, though their ultimate candidate evaluations do not differ. Implications and recommendations for future avenues of study are discussed.
Propensity score matching is used to estimate treatment effects when data are observational. Results presented in this study demonstrate the use of propensity score matching to evaluate the average treatment effect of unit-based pricing of household trash for reducing municipal solid waste disposal. Estimates of the average treatment effect on the treated for 34 New Hampshire communities range from an annual reduction of 631 pounds per household to 823 pounds per household. This represents an annual reduction of 42 percent to 54 percent from an average of 1530 pounds per household if a town did not adopt municipal solid waste user fees.
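To make the matching step concrete, here is a minimal sketch of one-to-one nearest-neighbour matching on a propensity score, with replacement, followed by an estimate of the average treatment effect on the treated (ATT). It assumes the propensity scores have already been estimated elsewhere. The function name and the data are hypothetical, and the study itself may use a different matching variant.

```python
from typing import List, Tuple

Unit = Tuple[float, float]  # (propensity score, outcome)


def att_nearest_neighbour(treated: List[Unit], controls: List[Unit]) -> float:
    """ATT from 1:1 nearest-neighbour matching on the propensity score.

    Each treated unit is paired with the control unit whose score is
    closest; controls may be reused (matching with replacement).
    """
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")
    differences = []
    for score_t, outcome_t in treated:
        _, outcome_c = min(controls, key=lambda c: abs(c[0] - score_t))
        differences.append(outcome_t - outcome_c)
    return sum(differences) / len(differences)


# Hypothetical towns: (estimated propensity score, pounds of waste per household).
treated_towns = [(0.8, 800.0), (0.6, 900.0)]
control_towns = [(0.75, 1500.0), (0.55, 1600.0), (0.2, 1400.0)]

att = att_nearest_neighbour(treated_towns, control_towns)
```

In this toy data each treated town is matched to the control with the nearest score (0.8 to 0.75, and 0.6 to 0.55), and both matched differences are 700 pounds, so the ATT is a reduction of 700 pounds per household.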
This study investigates the effects of a local information campaign on farmers’ interest in a rural development programme (RDP) in the former Yugoslav Republic of Macedonia. The results suggest that while our intervention succeeded in informing farmers, it had a negative, albeit only marginally significant, effect on the reported possibility of using future RDP support. This puzzling result can be attributed to increased awareness of administrative burden associated with RDP participation. An additional heterogeneity analysis suggests the negative effect is driven by unprofitable farmers who are averse to any administrative encumbrance, for whom upfront cofinancing of an RDP is untenable.
Crop scientists occasionally compute sample correlations between traits based on observed data from designed experiments, and this is often accompanied by significance tests of the null hypothesis that traits are uncorrelated. This simple approach does not account for effects due to the randomization layout and treatment structure of the experiments and hence statistical inference based on standard procedures is not appropriate. The present paper describes how valid inferences accounting for all relevant effects can be obtained using bivariate mixed linear model procedures. A salient feature of the approach is that the bivariate model is commensurate with the model used for univariate analysis of individual traits and allows bivariate correlations to be computed at the level of effects. Heterogeneity of correlations between effects can be assessed by likelihood ratio tests or by graphical inspection of bivariate scatter plots of effect estimates. If heterogeneity is found to be substantial, it is crucial to focus on the correlation of effects, and usually, the main interest will be in the treatment effects. If heterogeneity is judged to be negligible, the marginal correlation can be estimated from the bivariate model for an overall assessment of association. The proposed methods are illustrated using four examples. Pointers are given to alternative routes of analysis accounting for all treatment and design effects, such as regression with groups and analysis of covariance.
We utilize a treatment effects model to examine whether there are differences in costs and profits for manure-using corn producers versus non-users. We find that manure users have lower per-acre operating costs via reductions in fertilizer and soil conditioner costs; however, the use of manure reduces grain yields and ultimately leads to no difference in profit. Separate results indicate that manure-use restrictions do not affect costs or profits; thus policies could be in place to regulate manure usage without impacting the cost and profit structure of the farm.
A previous review of the neuroimaging studies in obstructive sleep apnea (OSA) called for specific attention to longitudinal studies of the treatment effects of OSA on neuroimaging. This chapter focuses on those studies where treatment effects were considered. The structural studies suggest that there are some notable changes in the structure of the human brain when continuous positive airway pressure (CPAP) is used to correct OSA. Some of these changes are even associated with cognitive changes in the expected cognitive domains and are seen with as little as only three months of treatment. The functional imaging studies together suggest that changes in brain function associated with working memory are evident when comparing treatment with no-treatment conditions in patients with OSA. Specifically, treatment often results in the recruitment of fewer cognitive resources to perform at the same level or better.
Objective – To provide a relatively non-technical review of recent statistical research on the analysis and interpretation of the results of randomised controlled trials in which there are possibly all three of the following types of protocol violation: non-adherence to allocated treatment, contamination (that is, patients receiving treatments other than the one to which they were allocated) and attrition (missing outcome data). Methods – The estimation methods involve the use of potential outcomes (counterfactuals) in the definition of a causal effect of treatment and in drawing valid inferences concerning its size. Results – The methods are explained through the use of simple arithmetical expressions involving the counts from three-way contingency tables (Outcome by Treatment Received by Random Allocation). Illustration is provided through the use of a hypothetical data set. Conclusions – Recent advances in statistical methodology enable one to estimate treatment effects from the results of randomised trials in which the treatment actually received is not necessarily the one to which the patient was allocated. These methods allow one to make adjustments to allow for both non-compliance and loss to follow-up. Even for such a 'broken' randomised trial, inference concerning causal effects is safer than that from data arising from an observational study that never involved random allocation in the first place.
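The "simple arithmetical expressions" on contingency-table counts can be illustrated with the standard instrumental-variable (Wald) ratio for the complier average causal effect: the intention-to-treat difference in outcomes divided by the difference in treatment receipt between the randomised arms. The sketch below is an assumption about the kind of calculation involved, not the paper's own worked example. It ignores attrition, and it relies on the usual assumptions of random allocation, no effect of allocation except through treatment received, and no defiers. The class name, function name and counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    n: int          # patients randomly allocated to this arm
    n_treated: int  # of those, how many actually received the active treatment
    n_success: int  # of those, how many had a good outcome


def complier_average_causal_effect(active: Arm, control: Arm) -> float:
    """Wald/IV ratio: ITT effect on outcome / ITT effect on treatment receipt."""
    itt_outcome = active.n_success / active.n - control.n_success / control.n
    itt_receipt = active.n_treated / active.n - control.n_treated / control.n
    if itt_receipt == 0:
        raise ValueError("allocation did not change treatment receipt")
    return itt_outcome / itt_receipt


# Hypothetical trial with non-adherence in the active arm
# and contamination in the control arm.
active_arm = Arm(n=200, n_treated=150, n_success=100)
control_arm = Arm(n=200, n_treated=10, n_success=72)

cace = complier_average_causal_effect(active_arm, control_arm)
```

Here the intention-to-treat difference in success rates is 0.50 - 0.36 = 0.14 and the difference in treatment receipt is 0.75 - 0.05 = 0.70, so the estimated effect among compliers is about 0.20, larger than the diluted intention-to-treat effect.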