
6 - Setting Up a Multivariable Analysis
Mitchell H. Katz, NYC Health and Hospitals
Book:

Multivariable Analysis

Published online:

09 October 2025

Print publication:

23 October 2025, pp 98-122
- Chapter
Summary

In setting up your model, include those variables, in addition to the risk factor or group assignment, that have been theorized or shown in prior research to be confounders or those that empirically are associated with the risk factor and the outcome in bivariate analysis.
Exclude variables that are on the intervening pathway between the risk factor and outcome, those that are extraneous because they are not on the causal pathway, redundant variables, and variables with a lot of missing data.
Sample size calculation for multivariable analysis is complicated, but statistical programs exist to help you calculate it. Missing data on independent variables can compromise your multivariable analysis. Several methods exist to compensate for missing data on independent variables, including deleting cases, using indicator variables to represent missing data, and estimating the missing values. Methods also exist for estimating missing outcome data, using other data you have about the subject and multiple imputation.
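As a rough illustration of the three approaches to missing independent data named in this summary, the sketch below applies each to a tiny made-up dataset. The records, the variable names (`age`, `chol`), and the helper functions are hypothetical and are not taken from the chapter; single mean imputation stands in here as the simplest way of "estimating the value", and it is not multiple imputation.

```python
from statistics import mean

# Hypothetical records: age is complete, cholesterol has missing values (None).
records = [
    {"age": 50, "chol": 200.0},
    {"age": 61, "chol": None},
    {"age": 47, "chol": 180.0},
    {"age": 58, "chol": None},
    {"age": 66, "chol": 250.0},
]

def complete_cases(rows, var):
    """Deleting cases: keep only rows where `var` was observed."""
    return [r for r in rows if r[var] is not None]

def add_missing_indicator(rows, var, fill=0.0):
    """Indicator approach: fill missing values with a constant and add a 0/1 flag."""
    return [
        {**r,
         var: fill if r[var] is None else r[var],
         var + "_missing": int(r[var] is None)}
        for r in rows
    ]

def mean_impute(rows, var):
    """Estimating the value: replace missing values with the observed mean."""
    observed_mean = mean(r[var] for r in rows if r[var] is not None)
    return [{**r, var: observed_mean if r[var] is None else r[var]} for r in rows]

deleted = complete_cases(records, "chol")
flagged = add_missing_indicator(records, "chol")
imputed = mean_impute(records, "chol")
```

Deleting cases shrinks the sample, while the other two approaches keep every subject at the cost of assumptions about the missing values.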

2 - Common Uses of Multivariable Models
Mitchell H. Katz, NYC Health and Hospitals
Book:

Multivariable Analysis

Published online:

09 October 2025

Print publication:

23 October 2025, pp 14-24
- Chapter
Summary

Multivariable analysis is used for four major types of studies: observational studies of etiology, randomized and nonrandomized intervention studies, studies of diagnosis, and studies of prognosis.
For observational studies, whether etiologic or intervention, the most important reason to do multivariable analysis is to eliminate confounding, since in observational studies the groups are not randomly assigned. With randomized studies, multivariable analysis is used to adjust for baseline differences that occurred by chance, to identify other independent predictors of outcome besides the randomized group, and x.
With studies of diagnosis, multivariable analysis is used to identify the best combination of diagnostic information to determine whether a person has a particular disease. Multivariable analysis can also be used to predict the prognosis of a group of patients with a particular set of known prognostic factors.

4 - Multiple Discrete Variables
Carlos Fernandez-Granda, New York University
Book:

Probability and Statistics for Data Science

Published online:

19 June 2025

Print publication:

03 July 2025, pp 109-160
- Chapter
Summary

This chapter describes how to model multiple discrete quantities as discrete random variables within the same probability space and manipulate them using their joint pmf. We explain how to estimate the joint pmf from data, and use it to model precipitation in Oregon. Then, we introduce marginal distributions, which describe the individual behavior of each variable in a model, and conditional distributions, which describe the behavior of a variable when other variables are fixed. Next, we generalize the concepts of independence and conditional independence to random variables. In addition, we discuss the problem of causal inference, which seeks to identify causal relationships between variables. We then turn our attention to a fundamental challenge: It is impossible to completely characterize the dependence between all variables in a model, unless they are very few. This phenomenon, known as the curse of dimensionality, is the reason why independence assumptions are needed to make probabilistic models tractable. We conclude the chapter by describing two popular models based on such assumptions: Naive Bayes and Markov chains.
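A minimal sketch of the first ideas in this summary: estimating a joint pmf from data, then deriving a marginal and a conditional distribution from it. The observations are invented (they are not the Oregon precipitation data the chapter uses), and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical observations: (rain_today, rain_tomorrow) pairs, 1 = rain, 0 = dry.
observations = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (1, 1)]

def estimate_joint_pmf(pairs):
    """Empirical joint pmf: fraction of observations equal to each (a, b)."""
    counts = Counter(pairs)
    n = len(pairs)
    return {outcome: Fraction(c, n) for outcome, c in counts.items()}

def marginal(joint, index):
    """Marginal pmf of one variable: sum the joint pmf over the other variable."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], Fraction(0)) + p
    return out

def conditional(joint, given_index, given_value):
    """Conditional pmf of the other variable, given that one variable is fixed."""
    other = 1 - given_index
    norm = marginal(joint, given_index)[given_value]
    return {
        outcome[other]: p / norm
        for outcome, p in joint.items()
        if outcome[given_index] == given_value
    }

joint = estimate_joint_pmf(observations)
p_today = marginal(joint, 0)
p_tomorrow_given_rain = conditional(joint, 0, 1)
```

With two binary variables the joint pmf has only four entries; with many variables the number of entries grows exponentially, which is the curse of dimensionality the summary refers to.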

Can Factor Investing Become Scientific?
Marcos M. López de Prado
Published online:

12 October 2023

Print publication:

09 November 2023
- Element
Virtually all journal articles in the factor investing literature make associational claims, in denial of the causal content of factor models. Authors do not identify the causal graph consistent with the observed phenomenon, they justify their chosen model specification in terms of correlations, and they do not propose experiments for falsifying causal mechanisms. Absent a causal theory, their findings are likely false, due to rampant backtest overfitting and incorrect specification choices. This Element differentiates between type-A and type-B spurious claims, and explains how both types prevent factor investing from advancing beyond its current phenomenological stage. It analyzes the current state of causal confusion in the factor investing literature, and proposes solutions with the potential to transform factor investing into a truly scientific discipline. This title is also available as Open Access on Cambridge Core.

1 - Clinical research design: analytical studies
from Part I - Quantitative methods in clinical neurology
- By Richard Mayeux
Albert Hofman, Erasmus Universiteit Rotterdam, Richard Mayeux, Columbia University, New York
Book:

Investigating Neurological Disease

Published online:

29 September 2009

Print publication:

30 August 2001, pp 3-10
- Chapter
Summary

Analytic studies are increasingly important for understanding relationships between diseases and their causes. They usually take the form of observational investigations, but can also include randomized clinical trials. Risk factors are antecedents that are considered to be components of the disease pathway. Appropriate study design and consideration of systematic bias and confounding help to establish an association between the exposure and disease. The purpose of maintaining the principles of causal inference and eliminating chance, bias, and confounding is to establish validity. The most serious concern in analytic studies is maintaining validity. Confounders are extraneous factors that are related to the disease and to a risk factor or exposure related to the disease. The confounder usually predicts disease in the absence of any risk factor. Investigators have to consider cost and efficiency in their design as well as the potential public health impact of any observed association.
