This chapter describes the data collection strategy and multimethod research design employed to test the theory in the subsequent chapters of the book. The structure of the empirical analysis mirrors the book’s primary argument: to show how peacekeeping works from the bottom up, from the individual to the community to the country. Given that UN peacekeepers deploy to the most violent areas, the design needed to account for selection bias as well as other confounding variables in order to make causal inference possible. Using individual- and subnational/community-level data from Mali as well as cross-national data from the universe of multidimensional PKOs deployed in Africa, the book employs a three-part strategy to test the hypotheses in the next few chapters. First, the book considers the micro-level behavioral implications of the theory using a lab-in-the-field experiment and a survey experiment, both implemented in Mali. Second, it tests whether UN peacekeepers’ ability to increase individual willingness to cooperate aggregates upward to prevent communal violence in Mali. Third, the book considers whether these findings extend to other countries.
The remuneration of MPs affects who engages in politics. Even if average returns to office are positive, as found in all other studies, some officeholders’ returns are likely negative. Further, the timing of returns to office is crucial as politicians often enjoy delayed compensation like lucrative pensions. Utilizing administrative data for parliamentary candidates in Denmark from 1994 to 2015, we estimate first-time runners’ earnings and total income returns to office. Based on total income, practically all elected MPs experience economic gains during their first term. Computations of life-cycle returns reveal that the 25 percent highest earning candidates (pre-office) have no long-term economic gain from winning. Generally, considering the distribution and timing of returns to office improves studies of how office-holding is economically rewarded.
There are widespread assumptions that implicit group bias leads to biased behavior. This chapter summarizes existing evidence on the link between implicit group bias and biased behavior, with an analysis of the strength of that evidence for causality. Our review leads to the conclusion that although there is substantial evidence that implicit group bias is related to biased behavior, claims about causality are not currently supported. With plausible alternative explanations for observed associations, as well as the possibility of reverse causation, scientists and policy makers need to be careful about claims made and actions taken to address discrimination, based on the assumption that implicit bias is the problem.
We construct a framework for meta-analysis and other multi-level data structures that codifies the sources of heterogeneity between studies or settings in treatment effects and examines their implications for analyses. The key idea is to consider, for each of the treatments under investigation, the subject’s potential outcome in each study or setting were he to receive that treatment. We consider four sources of heterogeneity: (1) response inconsistency, whereby a subject’s response to a given treatment would vary across different studies or settings, (2) the grouping of nonequivalent treatments, where two or more treatments are grouped and treated as a single treatment under the incorrect assumption that a subject’s responses to the different treatments would be identical, (3) nonignorable treatment assignment, and (4) response-related variability in the composition of subjects in different studies or settings. We then examine how these sources affect heterogeneity/homogeneity of conditional and unconditional treatment effects. To illustrate the utility of our approach, we re-analyze individual participant data from 29 randomized placebo-controlled studies on the cardiovascular risk of Vioxx, a Cox-2 selective nonsteroidal anti-inflammatory drug approved by the FDA in 1999 for the management of pain and withdrawn from the market in 2004.
Behavioral science researchers have shown strong interest in disaggregating within-person relations from between-person differences (stable traits) using longitudinal data. In this paper, we propose a method of within-person variability score-based causal inference for estimating joint effects of time-varying continuous treatments by controlling for stable traits of persons. After explaining the assumed data-generating process and providing formal definitions of stable trait factors, within-person variability scores, and joint effects of time-varying treatments at the within-person level, we introduce the proposed method, which consists of a two-step analysis. Within-person variability scores for each person, which are disaggregated from stable traits of that person, are first calculated using weights based on a best linear correlation preserving predictor through structural equation modeling (SEM). Causal parameters are then estimated via a potential outcome approach, either marginal structural models (MSMs) or structural nested mean models (SNMMs), using calculated within-person variability scores. Unlike the approach that relies entirely on SEM, the present method does not assume linearity for observed time-varying confounders at the within-person level. We emphasize the use of SNMMs with G-estimation because of its property of being doubly robust to model misspecifications in how observed time-varying confounders are functionally related to treatments/predictors and outcomes at the within-person level. Through simulation, we show that the proposed method can recover causal parameters well and that causal estimates might be severely biased if one does not properly account for stable traits. An empirical application using data regarding sleep habits and mental health status from the Tokyo Teen Cohort study is also provided.
Covariate-adjusted treatment effects are commonly estimated in non-randomized studies. It has been shown that measurement error in covariates can bias treatment effect estimates when not appropriately accounted for. So far, these analyses have primarily assumed a true data-generating model that includes just a single covariate. It is, however, more plausible that the true model consists of more than one covariate. We evaluate when a further covariate may reduce bias due to measurement error in another covariate and in which cases including a further covariate is not recommended. We analytically derive the amount of bias related to the fallible covariate’s reliability and systematically disentangle bias compensation and amplification due to an additional covariate. With a fallible covariate, it is not always beneficial to include an additional covariate for adjustment, as the additional covariate can substantially increase the bias. The mechanisms behind an increased bias due to an additional covariate can be complex, even in a simple setting of just two covariates. Neither a high reliability of the fallible covariate nor a high correlation between the covariates can in general prevent substantial bias. We show the distorting effects of a fallible covariate in an empirical example and discuss adjustment for latent covariates as a possible solution.
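The amplification mechanism described above can be reproduced in a small simulation: adjusting for a second covariate that predicts the treatment but not the outcome increases the bias left over from a fallible confounder. The following sketch is illustrative only; the data-generating values are arbitrary and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.normal(size=n)                    # true confounder
x2 = rng.normal(size=n)                    # affects treatment only, not the outcome
m = rng.normal(size=n)                     # measurement error
x1_obs = x1 + m                            # fallible covariate, reliability 0.5
t = x1 + x2 + rng.normal(size=n)           # continuous "treatment"
y = 1.0 * t + x1 + rng.normal(size=n)      # true treatment effect = 1.0

def ols_coef(y, *covs):
    """OLS coefficient on the first listed covariate."""
    X = np.column_stack([np.ones(len(y)), *covs])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

tau_one = ols_coef(y, t, x1_obs)           # adjust for the fallible covariate only
tau_two = ols_coef(y, t, x1_obs, x2)       # additionally adjust for x2
```

With these values, residual confounding biases `tau_one` upward (to about 1.2 in expectation), and adding `x2` removes treatment variance unrelated to the confounder, amplifying the bias further (to about 1.33).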
In the behavioral and social sciences, quasi-experimental and observational studies are used due to the difficulty of achieving random assignment. However, the estimation of differences between groups in observational studies frequently suffers from bias due to differences in the distributions of covariates. To estimate average treatment effects when the treatment variable is binary, Rosenbaum and Rubin (1983a) proposed adjustment methods for pretreatment variables using the propensity score.
However, these studies were interested only in estimating the average causal effect and/or marginal means. In the behavioral and social sciences, a general estimation method is required to estimate parameters in multiple group structural equation modeling where the differences of covariates are adjusted.
We show that a Horvitz–Thompson-type estimator, the propensity score weighted M estimator (PWME), is consistent even when estimated propensity scores are used; moreover, its asymptotic variance with estimated propensity scores is shown to be smaller than that with true propensity scores.
Furthermore, we show that the asymptotic distribution of the propensity score weighted statistic under a null hypothesis is a weighted sum of independent χ² variables, each with one degree of freedom.
We show that the method can compare latent variable means with covariates adjusted using propensity scores, which was not feasible with previous methods. We also apply the proposed method to correlated longitudinal binary responses with informative dropout using data from the Longitudinal Study of Aging (LSOA). The results of a simulation study indicate that the proposed estimation method is more robust than the maximum likelihood (ML) estimation method, in that the PWME does not require knowledge of the relationships among the dependent variables and covariates.
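The weighting logic behind a Horvitz–Thompson-type estimator can be illustrated with a minimal inverse-propensity-weighting sketch. This shows only the general principle on a simulated scalar outcome, not the paper's PWME for structural equation models; all data and parameter values below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)                      # observed confounder
p_true = 1 / (1 + np.exp(-x))               # true propensity score
t = rng.binomial(1, p_true)                 # nonrandom treatment assignment
y = 2.0 * t + x + rng.normal(size=n)        # outcome; true ATE = 2.0

# Estimate the propensity score by logistic regression (Newton-Raphson).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (t - p))
e_hat = 1 / (1 + np.exp(-X @ beta))

# Horvitz-Thompson-type (here Hajek-normalized) weighted group means.
w1, w0 = t / e_hat, (1 - t) / (1 - e_hat)
ate_ipw = (w1 @ y) / w1.sum() - (w0 @ y) / w0.sum()
ate_naive = y[t == 1].mean() - y[t == 0].mean()   # biased comparison
```

The naive group difference is biased because treated units have systematically higher `x`; the weighted estimator with estimated propensity scores recovers the true effect of 2.0.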
A central theme of research on human development and psychopathology is whether a therapeutic intervention or a turning-point event, such as a family break-up, alters the trajectory of the behavior under study. This paper lays out and applies a method for using observational longitudinal data to make more confident causal inferences about the impact of such events on developmental trajectories. The method draws upon two distinct lines of research: work on the use of finite mixture modeling to analyze developmental trajectories and work on propensity scores. The essence of the method is to use the posterior probabilities of trajectory group membership from a finite mixture modeling framework, to create balance on lagged outcomes and other covariates established prior to t for the purpose of inferring the impact of first-time treatment at t on the outcome of interest. The approach is demonstrated with an analysis of the impact of gang membership on violent delinquency based on data from a large longitudinal study conducted in Montreal.
Graph-based causal models are a flexible tool for causal inference from observational data. In this paper, we develop a comprehensive framework to define, identify, and estimate a broad class of causal quantities in linearly parametrized graph-based models. The proposed method extends the literature, which mainly focuses on causal effects on the mean level and the variance of an outcome variable. For example, we show how to compute the probability that an outcome variable realizes within a target range of values given an intervention, a causal quantity we refer to as the probability of treatment success. We link graph-based causal quantities defined via the do-operator to parameters of the model-implied distribution of the observed variables using so-called causal effect functions. Based on these causal effect functions, we propose estimators for causal quantities and show that these estimators are consistent and converge at a rate of N^{-1/2} under standard assumptions. Thus, causal quantities can be estimated based on sample sizes that are typically available in the social and behavioral sciences. In the case of maximum likelihood estimation, the estimators are asymptotically efficient. We illustrate the proposed method with an example based on empirical data, placing special emphasis on the difference between the interventional and conditional distribution.
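The difference between the interventional and the conditional distribution, and a "probability of treatment success" for a target range, can be made concrete in a toy linear Gaussian model with a confounder. The model, coefficients, and target range below are invented for illustration; they are not the paper's empirical example.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical linear Gaussian model with confounder u:
#   u ~ N(0, 1),  x = u + e_x,  y = x + u + e_y,
#   e_x ~ N(0, 1), e_y ~ N(0, 0.5)
x0, lo, hi = 1.0, 1.0, 3.0      # intervention value and target outcome range

# Interventional distribution under do(x = x0): u is unaffected,
# so y ~ N(x0, Var(u) + Var(e_y)) = N(1.0, 1.5).
p_do = norm_cdf((hi - x0) / sqrt(1.5)) - norm_cdf((lo - x0) / sqrt(1.5))

# Conditional distribution given x = x0: E[u | x] = x0 / 2, Var(u | x) = 1/2,
# so y | x = x0 ~ N(1.5 * x0, 0.5 + 0.5) = N(1.5, 1.0).
p_cond = norm_cdf((hi - 1.5 * x0) / 1.0) - norm_cdf((lo - 1.5 * x0) / 1.0)
```

Because conditioning on `x` carries information about the confounder `u` while intervening does not, the two probabilities differ markedly (about 0.45 versus 0.62 here), which is exactly the interventional/conditional distinction the abstract emphasizes.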
Recently, machine learning (ML) methods have been used in causal inference to estimate treatment effects and thereby reduce concerns about model misspecification. However, many ML methods require that all confounders be measured to consistently estimate treatment effects. In this paper, we propose a family of ML methods that estimate treatment effects in the presence of cluster-level unmeasured confounders, a type of unmeasured confounder that is shared within each cluster and is common in multilevel observational studies. We show through simulation studies that our proposed methods are robust to bias from unmeasured cluster-level confounders in a variety of multilevel observational studies. We also examine the effect of taking an algebra course on math achievement scores in the Early Childhood Longitudinal Study, a multilevel observational educational study, using our methods. The proposed methods are available in the CURobustML R package.
Due to the difficulty of achieving random assignment, quasi-experimental or observational study designs are frequently used in the behavioral and social sciences. If a nonrandom assignment depends on the covariates, multiple group structural equation modeling that includes the regression function of the dependent variables on the covariates determining the assignment can provide reasonable estimates, provided the regression function is correctly specified. However, it is usually difficult to specify the correct regression function because the dimensions of the dependent variables and covariates are typically large. Propensity score adjustment methods have therefore been proposed, since they do not require specification of the regression function, and they have been applied in several studies. However, these methods produce biased estimates if the assignment mechanism is incorrectly specified. To make inference more robust, it would be useful to develop an estimation method that integrates the regression approach with the propensity score methodology. In this study we propose a doubly robust-type estimation method for marginal multiple group structural equation modeling. This method provides a consistent estimator if either the regression function or the assignment mechanism is correctly specified. A simulation study indicates that the proposed estimation method is more robust than the existing methods.
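The double-robustness property can be illustrated with a standard augmented inverse-propensity-weighted (AIPW) estimator of a scalar average treatment effect. The paper's estimator targets marginal multiple group SEM, so this is only a sketch of the underlying principle, using simulated data, a correctly specified outcome regression, and a deliberately misspecified propensity model.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-x))              # true (unknown) propensity
t = rng.binomial(1, e_true)
y = 2.0 * t + x + rng.normal(size=n)       # true ATE = 2.0

# Correctly specified outcome models: OLS of y on x within each arm,
# predicting counterfactual means m1(x), m0(x) for every unit.
def ols_predict(mask):
    X = np.column_stack([np.ones(mask.sum()), x[mask]])
    b = np.linalg.lstsq(X, y[mask], rcond=None)[0]
    return b[0] + b[1] * x

m1, m0 = ols_predict(t == 1), ols_predict(t == 0)

# Deliberately misspecified propensity model: a constant.
e_wrong = np.full(n, t.mean())

# AIPW combines both models; it stays consistent if either one is correct.
aipw = np.mean(m1 - m0
               + t * (y - m1) / e_wrong
               - (1 - t) * (y - m0) / (1 - e_wrong))
```

Even though the propensity model ignores the confounder entirely, the correct outcome regressions keep the AIPW estimate close to the true effect of 2.0, mirroring the "either-or" consistency guarantee described in the abstract.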
Mediation analysis practices in social and personality psychology would benefit from integrating practices from statistical mediation analysis, which is currently common in social and personality psychology, with practices from causal mediation analysis, which is not frequently used in psychology. In this chapter, I briefly describe each method on its own, then provide recommendations for how to integrate practices from each method to simultaneously evaluate statistical inference and causal inference as part of a single analysis. At the end of the chapter, I describe additional areas of recent development in mediation analysis that social and personality psychologists should also consider adopting in order to improve the quality of inference in their mediation analyses: latent variables and longitudinal models. Ultimately, this chapter is meant to be a gentle introduction to causal inference in the context of mediation, with very practical recommendations for how one can implement these practices in one’s own research.
Field research refers to research conducted with a high degree of naturalism. The first part of this chapter provides a definition of field research and discusses advantages and limitations. We then provide a brief overview of observational field research methods, followed by an in-depth overview of experimental field research methods. We discuss randomization schemes of different types in field experimentation, such as cluster randomization, block randomization, and randomized rollout or waitlist designs, as well as statistical implementation concerns when conducting field experiments, including spillover, attrition, and noncompliance. The second part of the chapter provides an overview of important considerations when conducting field research. We discuss the psychology of construal in the design of field research, conducting non-WEIRD field research, replicability and generalizability, and how technological advances have impacted field research. We end by discussing career considerations for psychologists who want to get involved in field research.
A quasi-experiment is a type of study that attempts to mimic the objectives and structure of traditional (randomized) experiments. However, quasi-experiments differ from experiments in that condition assignment is randomized in experiments whereas it is not randomized in quasi-experiments. This chapter reviews conceptual, methodological, and practical issues that arise in the design, implementation, and interpretation of quasi-experiments. The chapter begins by highlighting similarities and differences between quasi-experiments, randomized experiments, and nonexperimental studies. Next, it provides a framework for discussion of the relative strengths and weaknesses of different study types. The chapter then discusses traditional threats to causal inferences when conducting studies of different types and reviews the most common quasi-experimental designs and how they attempt to reach accurate assessments of the causal impact of independent variables. The chapter concludes with a discussion of how quasi-experiments might be integrated with studies of other types to produce richer insights.
Quantifying the causal effects of race is one of the more controversial and consequential endeavors to have emerged from the causal revolution in the social sciences. The predominant view within the causal inference literature defines the effect of race as the effect of race perception and commonly equates this effect with “disparate treatment” racial discrimination. If these concepts are indeed equivalent, the stakes of these studies are incredibly high as they stand to establish or discredit claims of discrimination in courts, policymaking circles and public opinion. This paper interrogates the assumptions upon which this enterprise has been built. We ask: what is a perception of race, a perception of, exactly? Drawing on a rich tradition of work in critical race theory and social psychology on racial cognition, we argue that perception of race and perception of other decision-relevant features of an action situation are often co-constituted; hence, efforts to distinguish and separate these effects from each other are theoretically misguided. We conclude that empirical studies of discrimination must turn to defining what constitutes just treatment in light of the social differences that define race.
Observational studies consistently report associations between tobacco use, cannabis use and mental illness. However, the extent to which this association reflects an increased risk of new-onset mental illness is unclear and may be biased by unmeasured confounding.
Methods
A systematic review and meta-analysis (CRD42021243903). Electronic databases were searched until November 2022. Longitudinal studies in general population samples assessing tobacco and/or cannabis use and reporting the association (e.g. risk ratio [RR]) with incident anxiety, mood, or psychotic disorders were included. Estimates were combined using random-effects meta-analyses. Bias was explored using a modified Newcastle–Ottawa Scale, confounder matrix, E-values, and Doi plots.
Results
Seventy-five studies were included. Tobacco use was associated with mood disorders (K = 43; RR: 1.39, 95% confidence interval [CI] 1.30–1.47), but not anxiety disorders (K = 7; RR: 1.21, 95% CI 0.87–1.68) and evidence for psychotic disorders was influenced by treatment of outliers (K = 4, RR: 3.45, 95% CI 2.63–4.53; K = 5, RR: 2.06, 95% CI 0.98–4.29). Cannabis use was associated with psychotic disorders (K = 4; RR: 3.19, 95% CI 2.07–4.90), but not mood (K = 7; RR: 1.31, 95% CI 0.92–1.86) or anxiety disorders (K = 7; RR: 1.10, 95% CI 0.99–1.22). Confounder matrices and E-values suggested potential overestimation of effects. Only 27% of studies were rated as high quality.
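Random-effects pooling of risk ratios of the kind reported above can be sketched as follows. The study-level estimates are invented toy numbers, not the review's data, and DerSimonian–Laird is used as one common estimator of the between-study variance.

```python
import numpy as np

# Hypothetical log risk ratios and standard errors from k = 5 studies.
log_rr = np.log([1.2, 1.5, 1.3, 1.8, 1.1])
se = np.array([0.10, 0.15, 0.12, 0.20, 0.08])

# Fixed-effect (inverse-variance) weights and Cochran's Q statistic.
w = 1 / se**2
theta_fe = (w * log_rr).sum() / w.sum()
q = (w * (log_rr - theta_fe) ** 2).sum()

# DerSimonian-Laird estimate of the between-study variance tau^2.
k = len(log_rr)
tau2 = max(0.0, (q - (k - 1)) / (w.sum() - (w**2).sum() / w.sum()))

# Random-effects pooled estimate and 95% CI, back-transformed to the RR scale.
w_re = 1 / (se**2 + tau2)
theta_re = (w_re * log_rr).sum() / w_re.sum()
se_re = (1 / w_re.sum()) ** 0.5
rr, lo_ci, hi_ci = np.exp([theta_re,
                           theta_re - 1.96 * se_re,
                           theta_re + 1.96 * se_re])
```

Pooling on the log scale and exponentiating at the end yields the "RR (95% CI)" format reported in the results; a nonzero tau² widens the interval to reflect between-study heterogeneity.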
Conclusions
Both substances were associated with psychotic disorders and tobacco use was associated with mood disorders. There was no clear evidence of an association between cannabis use and mood or anxiety disorders. Limited high-quality studies underscore the need for future research using robust causal inference approaches (e.g. evidence triangulation).
This chapter focuses on causal inference in healthcare, emphasizing the need to identify causal relationships in data to answer important questions related to efficacy, mortality, productivity, and care delivery models. The authors discuss the limitations of randomized controlled trials due to ethical or pragmatic considerations and introduce quasi-experimental research designs as a scientifically coherent alternative. They divide these designs into two broad categories, independence-based designs and model-based designs, and explain the validity of assumptions necessary for each design. The chapter covers key concepts such as potential outcomes, selection bias, heterogeneous treatment effects bias, average treatment effect, average treatment effect for the treated and untreated, and local average treatment effect. Additionally, it discusses important quasi-experimental designs such as regression discontinuity, difference-in-differences, and synthetic controls. The chapter concludes by highlighting the importance of careful selection and application of these methods to estimate causal effects accurately and open the black box of healthcare.
What was the effect of war outcomes on key indicators of state formation in a post-war phase? In this chapter I demonstrate that victors and losers of war were set into different state capacity trajectories after war outcomes were revealed. I do this using a set of cutting-edge causal inference techniques to analyse the gap in state capacity that was generated between winners and losers in the time-period of most stringent warfare (1865-1913). After substantiating that the outcomes of these wars were determined by exogenous or fortuitous events, I provide a short description of my treatment—i.e., defeat—and outcomes—i.e., total revenues and railroad mileage as key indicators of state infrastructural capacity. My estimator, a difference-in-differences model, shows defeat had a negative long-term impact on state capacity which remains remarkably robust even after relaxing key assumptions. Finally, I use the synthetic control method to estimate how state capacity in Paraguay and Peru would have evolved in a counterfactual world where these countries were spared the most severe defeats in late nineteenth-century Latin America.
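The difference-in-differences logic used here can be sketched in a minimal simulated two-period, two-group setup. All numbers, including the simulated "effect of defeat," are hypothetical and unrelated to the chapter's data; the key assumption encoded is parallel trends, i.e., a common time trend for winners and losers absent defeat.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000                                  # units per group
trend = 0.5                                # common time trend (parallel trends)
delta = -0.8                               # hypothetical effect of defeat

base_w = rng.normal(1.0, 1.0, n)           # winners' unit-level baselines
base_l = rng.normal(0.4, 1.0, n)           # losers may start lower (selection allowed)
pre_w = base_w + rng.normal(0, 0.5, n)
pre_l = base_l + rng.normal(0, 0.5, n)
post_w = base_w + trend + rng.normal(0, 0.5, n)
post_l = base_l + trend + delta + rng.normal(0, 0.5, n)

# Difference-in-differences: the losers' change minus the winners' change.
did = (post_l.mean() - pre_l.mean()) - (post_w.mean() - pre_w.mean())
```

The pre-existing level gap between winners and losers drops out of the double difference, so `did` recovers the effect of defeat (here about -0.8) despite the groups differing at baseline.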
From Part III of Methodological Challenges of Experimentation in Sociology, by Davide Barrera (Università degli Studi di Torino, Italy), Klarita Gërxhani (Vrije Universiteit Amsterdam), Bernhard Kittel (Universität Wien, Austria), Luis Miller (Institute of Public Goods and Policies, Spanish National Research Council), and Tobias Wolbring (School of Business, Economics and Society, Friedrich-Alexander-Universität Erlangen-Nürnberg).
This chapter addresses the often-misunderstood concept of validity. Much of the methodological discussion around sociological experiments is framed in terms of internal and external validity. The standard view is that the more we ensure that the experimental treatment is isolated from potential confounds (internal validity), the more unlikely it is that the experimental results can be representative of phenomena of the outside world (external validity). However, other accounts describe internal validity as a prerequisite of external validity: Unless we ensure internal validity of an experiment, little can be said of the outside world. We contend in this chapter that problems of either external or internal validity do not necessarily depend on the artificiality of experimental settings or on the laboratory–field distinction between experimental designs. We discuss the internal–external distinction and propose instead a list of potential threats to the validity of experiments that includes "usual suspects" like selection, history, attrition, and experimenter demand effects and elaborate on how these threats can be productively handled in experimental work. Moreover, in light of the different types of experiments, we also discuss the strengths and weaknesses of each regarding threats to internal and external validity.
Synthetic controls (SCs) are widely used to estimate the causal effect of a treatment. However, they do not account for the different speeds at which units respond to changes. Reactions may be inelastic or “sticky” and thus slower due to varying regulatory, institutional, or political environments. We show that these different reaction speeds can lead to biased estimates of causal effects. We therefore introduce a dynamic SC approach that accommodates varying speeds in time series, resulting in improved SC estimates. We apply our method to re-estimate the effects of terrorism on income (Abadie and Gardeazabal [2003, American Economic Review 93, 113–132]), tobacco laws on consumption (Abadie, Diamond, and Hainmueller [2010, Journal of the American Statistical Association 105, 493–505]), and German reunification on GDP (Abadie, Diamond, and Hainmueller [2015, American Journal of Political Science 59, 495–510]). We also assess the method’s performance using Monte Carlo simulations. We find that it reduces errors in the estimates of true treatment effects by up to 70% compared to traditional SCs, improving our ability to make robust inferences. An open-source R package, dsc, is made available for easy implementation.
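A traditional (static) synthetic control chooses nonnegative donor weights summing to one so that the weighted donor pool matches the treated unit's pre-treatment outcomes. The sketch below illustrates that standard baseline on simulated data, not the dynamic correction the paper proposes; the projected-gradient solver and toy donor mix are illustrative choices.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1
    rho = np.nonzero(u - css / (np.arange(len(v)) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0)

def fit_sc_weights(y_treated_pre, Y_donors_pre, steps=5000):
    """Donor weights minimizing pre-treatment fit, via projected gradient descent."""
    n_donors = Y_donors_pre.shape[1]
    w = np.full(n_donors, 1 / n_donors)
    lr = 1 / np.linalg.norm(Y_donors_pre, 2) ** 2   # step size from spectral norm
    for _ in range(steps):
        grad = Y_donors_pre.T @ (Y_donors_pre @ w - y_treated_pre)
        w = project_simplex(w - lr * grad)
    return w

# Toy example: the treated unit is a 60/40 mix of donors 0 and 1 pre-treatment.
rng = np.random.default_rng(1)
Y = rng.normal(size=(30, 4)).cumsum(axis=0)   # 4 donor outcome series, 30 periods
y = 0.6 * Y[:, 0] + 0.4 * Y[:, 1]             # treated unit's pre-period outcomes
w = fit_sc_weights(y, Y)
```

The recovered weights concentrate on donors 0 and 1 at roughly 0.6 and 0.4; the post-treatment gap between the treated unit and `Y @ w` would then serve as the effect estimate, which is the baseline the paper's dynamic method refines.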