In linear multiple regression, “enhancement” is said to occur when $R^{2}=\boldsymbol{b}'\boldsymbol{r}>\boldsymbol{r}'\boldsymbol{r}$, where $\boldsymbol{b}$ is a $p\times 1$ vector of standardized regression coefficients and $\boldsymbol{r}$ is a $p\times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $\boldsymbol{x}$. When $p=1$ then $\boldsymbol{b}\equiv\boldsymbol{r}$ and enhancement cannot occur. When $p=2$, for all full-rank $\boldsymbol{R}_{xx}\neq\boldsymbol{I}$, $\boldsymbol{R}_{xx}=E[\boldsymbol{x}\boldsymbol{x}']=\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}'$ (where $\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}'$ denotes the eigendecomposition of $\boldsymbol{R}_{xx}$; $\lambda_{1}>\lambda_{2}$), the set $\boldsymbol{B}_{1}:=\{\boldsymbol{b}_{i}: R^{2}=\boldsymbol{b}_{i}'\boldsymbol{r}_{i}=\boldsymbol{r}_{i}'\boldsymbol{r}_{i};\ 0<R^{2}\le 1\}$ contains four vectors; the set $\boldsymbol{B}_{2}:=\{\boldsymbol{b}_{i}: R^{2}=\boldsymbol{b}_{i}'\boldsymbol{r}_{i}>\boldsymbol{r}_{i}'\boldsymbol{r}_{i};\ 0<R^{2}\le 1;\ R^{2}\lambda_{p}\le\boldsymbol{r}_{i}'\boldsymbol{r}_{i}<R^{2}\}$ contains an infinite number of vectors. When $p\ge 3$ (and $\lambda_{1}>\lambda_{2}>\cdots>\lambda_{p}$), both sets contain an uncountably infinite number of vectors. Geometrical arguments demonstrate that $\boldsymbol{B}_{1}$ occurs at the intersection of two hyper-ellipsoids in $\mathbb{R}^{p}$. Equations are provided for populating the sets $\boldsymbol{B}_{1}$ and $\boldsymbol{B}_{2}$ and for demonstrating that maximum enhancement occurs when $\boldsymbol{b}$ is collinear with the eigenvector that is associated with $\lambda_{p}$ (the smallest eigenvalue of the predictor correlation matrix).
These equations are used to illustrate the logic and the underlying geometry of enhancement in population, multiple-regression models. R code for simulating population regression models that exhibit enhancement of any degree and any number of predictors is included in Appendices A and B.
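The maximum-enhancement case described above can be sketched for two predictors. The following is a minimal illustration, not the authors' R code from the appendices; the function name `enhancement_demo` and the example values are assumptions. It places b along the eigenvector of the smallest eigenvalue of a 2×2 predictor correlation matrix, scales b so that b′Rxx b equals a chosen R², and compares b′r with r′r.

```python
import math


def enhancement_demo(rho, r_squared):
    """Hypothetical helper: two standardized predictors with correlation rho
    (0 < |rho| < 1) and a target population R^2 in (0, 1]."""
    rxx = [[1.0, rho], [rho, 1.0]]
    # A 2x2 correlation matrix has eigenvalues 1 + rho and 1 - rho with
    # eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    lam_small = 1.0 - abs(rho)
    sign = -1.0 if rho > 0 else 1.0
    v = [1.0 / math.sqrt(2.0), sign / math.sqrt(2.0)]
    # Scale b along v so that b' Rxx b = R^2 (b' Rxx b = scale^2 * lam_small).
    scale = math.sqrt(r_squared / lam_small)
    b = [scale * vi for vi in v]
    # Predictor-criterion correlations implied by the model: r = Rxx b.
    r = [sum(rxx[i][j] * b[j] for j in range(2)) for i in range(2)]
    model_r2 = sum(bi * ri for bi, ri in zip(b, r))  # b'r
    sum_sq_r = sum(ri * ri for ri in r)              # r'r
    return b, r, model_r2, sum_sq_r


b, r, model_r2, sum_sq_r = enhancement_demo(rho=0.6, r_squared=0.5)
print("b'r =", model_r2, " r'r =", sum_sq_r)
```

With these inputs r′r equals R²·λp (here 0.5 × 0.4), which is the lower bound that appears in the definition of B2.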
A family of solutions for linear relations among k sets of variables is proposed. It is shown how these solutions apply for k = 2, and how they can be generalized from there to k ≥ 3.
The family of solutions depends on three independent choices: (i) to what extent a solution may be influenced by differences in variances of components within each set; (ii) to what extent the sets may be differentially weighted with respect to their contribution to the solution—including orthogonality constraints; (iii) whether or not individual sets of variables may be replaced by an orthogonal and unit normalized basis.
Solutions are compared with respect to their optimality properties. For each solution the appropriate stationary equations are given. For one example it is shown how the determinantal equation of the stationary equations can be interpreted.
Using Carroll's external analysis, several studies have found that unfolding models account for more, although seldom significantly more, variance in preferences than Tucker's vector model. In studies of sociometric ratings and political preferences, the unfolding model again rarely outpredicted the vector model by a significant amount. Yet on cross-validation, the unfolding model consistently accounted for more variance. Results suggest that sometimes significance tests are less sensitive than cross-validation procedures to the small but consistent superiority of the unfolding model. Future researchers may wish to use significance tests and cross-validation techniques in comparing models.
A quadratic programming algorithm is presented for fitting Carroll’s weighted unfolding model for preferences to known multidimensional scale values. The algorithm can be applied directly to pairwise preferences; it permits nonnegativity constraints on subject weights; and it provides a means of testing various preference model hypotheses. While basically metric, it can be combined with Kruskal’s monotone regression to fit ordinal data. Monte Carlo results show that (a) adequacy of “true” preference recovery depends on the number of data points and the amount of error, and (b) the proportion of data variance accounted for by the model sometimes only approximately reflects “true” recovery.
Yuan and Chan (Psychometrika, 76, 670–690, 2011) recently showed how to compute the covariance matrix of standardized regression coefficients from covariances. In this paper, we describe a method for computing this covariance matrix from correlations. Next, we describe an asymptotic distribution-free (ADF; Browne in British Journal of Mathematical and Statistical Psychology, 37, 62–83, 1984) method for computing the covariance matrix of standardized regression coefficients. We show that the ADF method works well with nonnormal data in moderate-to-large samples using both simulated and real-data examples. R code (R Development Core Team, 2012) is available from the authors or through the Psychometrika online repository for supplementary materials.
We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, $\boldsymbol{x}$ (where $\boldsymbol{x}$ is an $n\times 1$ vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in $n$-dimensional space. For a full-rank predictor correlation matrix, $\boldsymbol{R}_{xx}$, of order $n$, and for regression models with constant $R^{2}$ (coefficient of determination), the OLS weight vectors for all possible criteria terminate on the surface of an $n$-dimensional ellipsoid. The population performance of alternate regression weights (such as equal weights, correlation weights, or rounded weights) can be modeled as a function of the Cartesian coordinates of the ellipsoid. These geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of $R^{2}$. To illustrate these ideas, we describe algorithms and R (R Development Core Team, 2009) code for: (1) generating points that are uniformly distributed on the surface of an $n$-dimensional ellipsoid, (2) populating the set of regression (weight) vectors that define an elliptical arc in $\mathbb{R}^{n}$, and (3) populating the set of regression vectors that have constant cosine with a target vector in $\mathbb{R}^{n}$. Each algorithm is illustrated with real data. The examples demonstrate the usefulness of studying all possible criteria when evaluating alternate regression weights in regression models with a fixed set of predictors.
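The ellipsoid geometry can be illustrated with a short sketch. It rests on the standard identity that standardized OLS weights b satisfy b′Rxx b = R², so weight vectors for all criteria with the same R² lie on one ellipsoid. The function names are hypothetical, and the sampler simply rescales random directions: it does not produce points uniformly distributed on the surface and is not the authors' algorithm.

```python
import math
import random


def quad_form(b, rxx):
    """Return b' Rxx b for a weight vector b and correlation matrix rxx."""
    p = len(b)
    return sum(b[i] * rxx[i][j] * b[j] for i in range(p) for j in range(p))


def weights_on_ellipsoid(rxx, r_squared, count, seed=0):
    """Hypothetical helper: draw `count` weight vectors with b' Rxx b = R^2.

    Each vector is a random Gaussian direction rescaled onto the ellipsoid.
    The draws are NOT uniform over the ellipsoid surface.
    """
    rng = random.Random(seed)
    p = len(rxx)
    vectors = []
    for _ in range(count):
        direction = [rng.gauss(0.0, 1.0) for _ in range(p)]
        scale = math.sqrt(r_squared / quad_form(direction, rxx))
        vectors.append([scale * d for d in direction])
    return vectors


# Illustrative (made-up) full-rank predictor correlation matrix.
rxx = [[1.0, 0.3, 0.2],
       [0.3, 1.0, 0.4],
       [0.2, 0.4, 1.0]]
for b in weights_on_ellipsoid(rxx, r_squared=0.4, count=3):
    print([round(x, 3) for x in b], round(quad_form(b, rxx), 6))
```

Each sampled vector corresponds to a different criterion, with predictor-criterion correlations r = Rxx b, and all of them share the same R².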
Commonality components have been defined as a method of partitioning squared multiple correlations. In this paper, the asymptotic joint distribution of all $2^{k}-1$ squared multiple correlations is derived. The asymptotic joint distribution of linear combinations of squared multiple correlations is obtained as a corollary. In particular, the asymptotic joint distribution of commonality components is derived as a special case. Simultaneous and nonsimultaneous asymptotic confidence intervals for commonality components can be obtained from this distribution.
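For k = 2 predictors there are 2² − 1 = 3 squared multiple correlations, and the commonality components are linear combinations of them. The sketch below uses the standard two-predictor commonality partition; the function name and the numeric inputs are illustrative assumptions, and no sampling distribution is computed.

```python
def commonality_two_predictors(r2_x1, r2_x2, r2_both):
    """Hypothetical helper: partition R^2 for two predictors.

    r2_x1, r2_x2: squared correlations of y with x1 alone and x2 alone.
    r2_both: squared multiple correlation of y with x1 and x2 together.
    """
    unique_x1 = r2_both - r2_x2      # variance explained only by x1
    unique_x2 = r2_both - r2_x1      # variance explained only by x2
    common = r2_x1 + r2_x2 - r2_both  # variance shared by x1 and x2
    return unique_x1, unique_x2, common


u1, u2, c = commonality_two_predictors(r2_x1=0.3, r2_x2=0.2, r2_both=0.4)
print(u1, u2, c, u1 + u2 + c)
```

The three components sum to the full-model R². The common component can be negative, which corresponds to suppression among the predictors.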
A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting systems yield maximally correlated composites and when they yield minimally similar weights. We then derive the least squares weights (for any set of predictors) that yield the largest drop in $R^{2}$ (the coefficient of determination) when switching to correlation weights. Our findings suggest that two characteristics of a model/data combination are especially important in determining the effectiveness of correlation weights: (1) the condition number of the predictor correlation matrix, $\boldsymbol{R}_{xx}$, and (2) the orientation of the correlation weights to the latent vectors of $\boldsymbol{R}_{xx}$.
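The drop in R² from switching to correlation weights can be computed directly in a population model. The sketch below is an illustration under stated assumptions, not the authors' derivation: function names and input values are hypothetical, and it relies on the standard result that a composite a′x of standardized predictors has squared correlation (a′r)² / (a′Rxx a) with the criterion.

```python
def matvec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def r2_for_weights(a, rxx, r):
    """Squared correlation between y and the composite a'x."""
    return dot(a, r) ** 2 / dot(a, matvec(rxx, a))


def r2_drop_with_correlation_weights(b, rxx):
    """Hypothetical helper: return (OLS R^2, correlation-weight R^2, drop)."""
    r = matvec(rxx, b)                    # implied correlations r = Rxx b
    r2_ols = dot(b, r)                    # R^2 = b'r
    r2_corr = r2_for_weights(r, rxx, r)   # use r itself as the weights
    return r2_ols, r2_corr, r2_ols - r2_corr


rxx = [[1.0, 0.6], [0.6, 1.0]]
print(r2_drop_with_correlation_weights([0.5, 0.1], rxx))
print(r2_drop_with_correlation_weights([0.5, -0.5], rxx))
```

In the second call b is an eigenvector of Rxx, so r is proportional to b and the two weighting systems give the same R². This is one concrete way the orientation of the weights relative to the latent vectors of Rxx matters.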
Every set of alternate weights (i.e., nonleast squares weights) in a multiple regression analysis with three or more predictors is associated with an infinite class of weights. All members of a given class can be deemed fungible because they yield identical SSE (sum of squared errors) and $R^{2}$ values. Equations for generating fungible weights are reviewed and an example is given that illustrates how fungible weights can be profitably used to evaluate parameter sensitivity in multiple regression.
Prediction and classification are two very active areas in modern data analysis. In this paper, prediction with nonlinear optimal scaling transformations of the variables is reviewed, and extended to the use of multiple additive components, much in the spirit of statistical learning techniques that are currently popular, among other areas, in data mining. Also, a classification/clustering method is described that is particularly suitable for analyzing attribute-value data from systems biology (genomics, proteomics, and metabolomics), and which is able to detect groups of objects that have similar values on small subsets of the attributes.
Mean squared error of prediction is used as the criterion for determining which of two multiple regression models (not necessarily nested) is more predictive. We show that an unrestricted (or true) model with $t$ parameters should be chosen over a restricted (or misspecified) model with $m$ parameters if $(P_{t}^{2}-P_{m}^{2})>(1-P_{t}^{2})(t-m)/n$, where $P_{t}^{2}$ and $P_{m}^{2}$ are the population coefficients of determination of the unrestricted and restricted models, respectively, and $n$ is the sample size. The left-hand side of the above inequality represents the squared bias in prediction by using the restricted model, and the right-hand side gives the reduction in variance of prediction error by using the restricted model. Thus, model choice amounts to the classical statistical tradeoff of bias against variance. In practical applications, we recommend that $P^{2}$ be estimated by adjusted $R^{2}$. Our recommendation is equivalent to performing the $F$-test for model comparison, and using a critical value of $2-(m/n)$; that is, if $F>2-(m/n)$, the unrestricted model is recommended; otherwise, the restricted model is recommended.
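The selection rule above is a single inequality and can be written as a small function. This is a minimal sketch: the function name and the example values are assumptions, and the inputs are treated as known population values (in practice the abstract recommends plugging in adjusted R² estimates).

```python
def prefer_unrestricted(p2_t, p2_m, t, m, n):
    """Hypothetical helper implementing (P_t^2 - P_m^2) > (1 - P_t^2)(t - m)/n.

    p2_t, p2_m: coefficients of determination of the unrestricted and
                restricted models.
    t, m:       numbers of parameters in the two models.
    n:          sample size.
    """
    squared_bias = p2_t - p2_m                      # cost of the restriction
    variance_reduction = (1.0 - p2_t) * (t - m) / n  # gain from the restriction
    return squared_bias > variance_reduction


# Illustrative values only.
print(prefer_unrestricted(p2_t=0.50, p2_m=0.45, t=6, m=3, n=100))
print(prefer_unrestricted(p2_t=0.50, p2_m=0.49, t=6, m=3, n=100))
```

In the first call the squared bias (0.05) exceeds the variance reduction (0.015), so the unrestricted model is preferred; in the second the bias (0.01) is smaller, so the restricted model is preferred.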
Guttman's assumption underlying his definition of “total images” is rejected: Partial images are not generally convergent everywhere. Even divergence everywhere is shown to be possible. The convergence type always found on partial images is convergence in quadratic mean; hence, total images are redefined as quadratic mean-limits. In determining the convergence type in special situations, the asymptotic properties of certain correlations are important, implying, in some cases, convergence almost everywhere, which is also effected by a countable population or multivariate normality or independent variables. The interpretations of a total image as a predictor, and a “common-factor score”, respectively, are made precise.
In a multiple regression analysis with three or more predictors, every set of alternate weights belongs to an infinite class of “fungible weights” (Waller, Psychometrika, in press) that yields identical SSE (sum of squared errors) and $R^{2}$ values. When the $R^{2}$ using the alternate weights is a fixed value, fungible weights ($\boldsymbol{a}_{i}$) that yield the maximum or minimum cosine with an OLS weight vector ($\boldsymbol{b}$) are called “fungible extrema.” We describe two methods for locating fungible extrema and we report R code (R Development Core Team, 2007) for one of the methods. We then describe a new approach for populating a class of fungible weights that is derived from the geometry of alternate regression weights. Finally, we illustrate how fungible weights can be profitably used to gauge parameter sensitivity in linear models by locating the fungible extrema of a regression model of executive compensation (Horton & Guerard, Commun. Stat. Simul. Comput. 14:441–448, 1985).
A class of monotonic transformations which generalize the power transformation is fit to the independent and dependent variables in multiple regression so that the resulting additive relationship is optimized. This is achieved by minimizing a quadratic fitting criterion with linear inequality constraints on the parameters. A quadratic programming technique which works reliably and quickly in this application is outlined. Some examples of the analysis of artificial and real data are offered.
This paper presents the first meta-analysis of the ‘Taking Game,’ a variant of the Dictator Game where participants take money from recipients instead of giving. Upon analyzing data from 39 experiments, which include 123 effect sizes and 7262 offers made by dictators, we discovered a significant framing effect: dictators are more generous in the Taking Game than in the Dictator Game (Cohen's d = 0.26, p < 0.0001), leaving approximately 35.5 percent of the stakes to recipients in the former as opposed to 27.5 percent in the latter. The difference is higher when the participants have earned their endowment before sharing or when the recipient is a charity. Consistent with the standard literature on giving, we also find that participants take less from a charity than from a standard recipient, take less when payoffs are hypothetical, or when recipients have previously earned their endowment. We also find that women (non-students) take less than men (students). Finally, it appears that participants from non-OECD countries leave more money to recipients than participants from OECD countries.
Chapter 11 introduces students to bivariate (simple) regression and multiple regression. Students learn the importance of linear relationships and how linearity can be used to make predictions on one variable from the knowledge of another variable or multiple variables. Interpretation and conceptual understanding of critical concepts in regression are emphasized.
This study explores the relationships between group-affect tone, teams’ transactive memory systems (TMSs), and teams’ incremental creativity. Data were collected from 334 team members and 70 team leaders across 70 teams. Results indicate that positive group-affect tone enhances TMS, while negative group-affect tone impedes it. TMS positively impacts team incremental creativity. Additionally, both types of group-affect tone influence incremental creativity through TMS mediation. This research advances TMS theory and group-affect tone, substantiating the affect-cognition model and deepening the understanding of TMS’s role in incremental creativity.
A hegemonic neoliberal ideology dominates all areas of work in Turkey, including healthcare. Though neoliberalism has been studied extensively from the perspective of meaning, values, and processes, managerial and leadership behavior dynamics require further research. This study analyzes the relationship between managerialism, toxic leadership, and ethical climate in an industry swept up by untamed neoliberalism, particularly in a nation where employment and human rights are ceremoniously protected. Through an analysis of medical doctors working in 207 public and private university hospitals in Turkey, we explored the role of managerialism and four distinct ethical climate types, resulting in the emergence of toxic leadership behaviors during the global pandemic. We theorize the extent to which toxic leaders emerge from managerialism. We further explain why the hegemonic Turkish leadership culture thrives in toxic behaviors such as paternalism, fealty, ingratiation, nepotism, and cronyism in the context of neoliberal expansion.
Chapter 6 starts with the description of multiple regression. Even if it is unlikely for multiple regression to be used as the primary method for multivariate biomarker discovery based on high-dimensional data, presenting this classical method provides the necessary background for regression analysis and highlights the weaknesses of multiple regression, which will be addressed by the subsequently presented methods. This chapter also presents partial least squares regression (PLSR), which by performing supervised dimensionality reduction addresses some weaknesses of multiple regression; however, by not performing any feature selection, PLSR does not reduce noise that is typically abundant in high-dimensional data.
This chapter explores ways to set up a model matrix so that linear combinations of the columns can fit curves and multidimensional surfaces. These extend to methods, within a generalized additive model framework, that use a penalization approach to constrain over-fitting. A further extension is to fitting quantiles of the data. The methodologies are important both for direct use for modeling data, and for checking for pattern in residuals from models that are in a more classical parametric style. The methodology is extended, in later chapters, to include smoothing terms in generalized linear models and models that allow for time series errors.