We investigate under what conditions the matrix of factor loadings from the factor analysis model with equal unique variances will give a good approximation to the matrix of factor loadings from the regular factor analysis model. We show that the two models will give similar matrices of factor loadings if Schneeweiss' condition, that the difference between the largest and the smallest value of unique variances is small relative to the sizes of the column sums of squared factor loadings, holds. Furthermore, we generalize our results and discuss the conditions under which the matrix of factor loadings from the regular factor analysis model will be well approximated by the matrix of factor loadings from Jöreskog's image factor analysis model. In particular, we discuss Guttman's condition (i.e., the number of variables increases without limit) for the two models to agree, in relation to the condition we have shown, and conclude that Schneeweiss' condition is a generalization of Guttman's condition. Some implications for practice are discussed.
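For orientation, the two models compared in this abstract can be written in standard factor-analytic notation. The symbols below ($\Sigma$, $\Lambda$, $\Psi$, $\sigma^2$, $p$, $m$) are assumptions introduced here for illustration and are not taken from the abstract, and the last line is only an informal reading of the verbal condition, not the paper's exact statement.

```latex
% Regular factor analysis model: p variables, m factors, free unique variances
\Sigma = \Lambda \Lambda' + \Psi, \qquad \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p)

% Factor analysis model with equal unique variances
\Sigma = \Lambda^{*} \Lambda^{*\prime} + \sigma^{2} I_p

% Informal reading of Schneeweiss' condition: the spread of the unique
% variances is small relative to each column sum of squared loadings
\psi_{\max} - \psi_{\min} \;\ll\; \sum_{i=1}^{p} \lambda_{ij}^{2}, \qquad j = 1, \dots, m
```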
A review is provided for the creation of the Psychometric Society in 1935, and the establishment of its journal, Psychometrika, in 1936. This document is part of the 80th anniversary celebration for Psychometrika’s founding, held during the annual meeting of the Psychometric Society in July of 2016 in Asheville, NC.
The objective of this paper is to introduce and motivate additional properties and interpretations for the redundancy variables. It is shown that these variables can be derived by application of certain invariance arguments and without reference to the index of redundancy. In addition, an optimality property for the variables is presented which is important whenever one restricts attention in a study to a subset of the redundancy variables. This optimality property pertains to the subset rather than to the individual variables.
The notion of scale freeness does not seem to have been well understood in the factor analytic literature. It has been believed that if the loss function that is minimized to obtain estimates of the parameters in the factor model is scale invariant, then the estimates are scale free. It is shown that scale invariance of the loss function is neither a necessary nor a sufficient condition for scale freeness. A theorem that ensures scale freeness in the orthogonal factor model is given in this paper.
The use of multilevel VAR(1) models to unravel within-individual process dynamics is gaining momentum in psychological research. These models accommodate the structure of intensive longitudinal datasets in which repeated measurements are nested within individuals. They estimate within-individual auto- and cross-regressive relationships while incorporating and using information about the distributions of these effects across individuals. An important quality feature of the obtained estimates pertains to how well they generalize to unseen data. Bulteel and colleagues (Psychol Methods 23(4):740–756, 2018a) showed that this feature can be assessed through a cross-validation approach, yielding a predictive accuracy measure. In this article, we follow up on their results by performing three simulation studies that allow us to systematically study five factors that likely affect the predictive accuracy of multilevel VAR(1) models: (i) the number of measurement occasions per person, (ii) the number of persons, (iii) the number of variables, (iv) the contemporaneous collinearity between the variables, and (v) the distributional shape of the individual differences in the VAR(1) parameters (i.e., normal versus multimodal distributions). Simulation results show that pooling information across individuals and using multilevel techniques prevent overfitting. Also, we show that when variables are expected to show strong contemporaneous correlations, performing multilevel VAR(1) in a reduced variable space can be useful. Furthermore, results reveal that multilevel VAR(1) models with random effects have a better predictive performance than person-specific VAR(1) models when the sample includes groups of individuals that share similar dynamics.
In this paper implicit function-based parameterizations for orthogonal and oblique rotation matrices are proposed. The parameterizations are used to construct Newton algorithms for minimizing differentiable rotation criteria applied to m factors and p variables. The speed of the new algorithms is compared to that of existing algorithms and to that of Newton algorithms based on alternative parameterizations. Several rotation criteria were examined and the algorithms were evaluated over a range of values for m. Initial guesses for Newton algorithms were improved by subconvergence iterations of the gradient projection algorithm. Simulation results suggest that no one algorithm is fastest for minimizing all criteria for all values of m. Among competing algorithms, the gradient projection algorithm alone was faster than the implicit function algorithm for minimizing a quartic criterion over oblique rotation matrices when m is large. In all other conditions, however, the implicit function algorithms were competitive with or faster than the fastest existing algorithms. The new algorithms showed the greatest advantage over other algorithms when minimizing a nonquartic component loss criterion.
A test for linear trend among a set of eigenvalues of a correlation matrix is developed. As a technical implementation of Cattell's scree test, this is a generalization of Anderson's test for the equality of eigenvalues, and extends Bentler and Yuan's work on linear trends in eigenvalues of a covariance matrix. The power of the minimum χ² and maximum likelihood ratio tests is compared. Examples show that the linear trend hypothesis is more realistic than the standard hypothesis of equality of eigenvalues, and that the hypothesis is compatible with standard decisions on the number of factors or components to retain in data analysis.
Principal components analysis can be redefined in terms of the regression of observed variables upon component variables. Two criteria for the adequacy of a component representation in this context are developed and are shown to lead to different component solutions. Both criteria are generalized to allow weighting, the choice of weights determining the scale invariance properties of the resulting solution. A theorem is presented giving necessary and sufficient conditions for equivalent component solutions under different choices of weighting. Applications of the theorem are discussed that involve the components analysis of linearly derived variables and of external variables.
A component method is presented maximizing Stewart and Love's redundancy index. Relationships with multiple correlation and principal component analysis are pointed out and a rotational procedure for obtaining bi-orthogonal variates is given. An elaborate example comparing canonical correlation analysis and redundancy analysis on artificial data is presented.
As an extension of Lastovicka's four-mode components analysis an n-mode components analysis is developed. Using a convenient notation, both a canonical and a least squares solution are derived. The relation between both solutions and their computational aspects are discussed.
Some existing three-way factor analysis and MDS models incorporate Cattell's “Principle of Parallel Proportional Profiles”. These models can—with appropriate data—empirically determine a unique best fitting axis orientation without the need for a separate factor rotation stage, but they have not been general enough to deal with what Tucker has called “interactions” among dimensions. This article presents a proof of unique axis orientation for a considerably more general parallel profiles model which incorporates interacting dimensions. The model, $X_k = A\,{}_{A}D_k\,H\,{}_{B}D_k\,B'$, does not assume symmetry in the data or in the interactions among factors. A second proof is presented for the symmetrically weighted case (i.e., where ${}_{A}D_k = {}_{B}D_k$). The generality of these models allows one to impose successive restrictions to obtain several useful special cases, including PARAFAC2 and three-way DEDICOM.
Tucker has outlined an application of principal components analysis to a set of learning curves, for the purpose of identifying meaningful dimensions of individual differences in learning tasks. Since the principal components are defined in terms of a statistical criterion (maximum variance accounted for) rather than a substantive one, it is typically desirable to rotate the components to a more interpretable orientation. “Simple structure” is not a particularly appealing consideration for such a rotation; it is more reasonable to believe that any meaningful factor should form a (locally) smooth curve when the component loadings are plotted against trial number. Accordingly, this paper develops a procedure for transforming an arbitrary set of component reference curves to a new set which are mutually orthogonal and, subject to orthogonality, are as smooth as possible in a well defined (least squares) sense. Potential applications to learning data, electrophysiological responses, and growth data are indicated.
Research questions in the human sciences often seek to answer if and when a process changes across time. In functional MRI studies, for instance, researchers may seek to assess the onset of a shift in brain state. For daily diary studies, the researcher may seek to identify when a person’s psychological process shifts following treatment. The timing and presence of such a change may be meaningful in terms of understanding state changes. Currently, dynamic processes are typically quantified as static networks where edges indicate temporal relations among nodes, which may be variables reflecting emotions, behaviors, or brain activity. Here we describe three methods for detecting changes in such correlation networks from a data-driven perspective. Networks here are quantified using the lag-0 pair-wise correlation (or covariance) estimates as the representation of the dynamic relations among variables. We present three methods for change point detection: dynamic connectivity regression, max-type method, and a PCA-based method. The change point detection methods each include different ways to test if two given correlation network patterns from different segments in time are significantly different. These tests can also be used outside of the change point detection approaches to test any two given blocks of data. We compare the three methods for change point detection as well as the complementary significance testing approaches on simulated and empirical functional connectivity fMRI data examples.
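The idea of comparing lag-0 correlation networks from two time segments can be illustrated with a toy scan. The sketch below is not an implementation of dynamic connectivity regression, the max-type method, or the PCA-based method named in the abstract, and it performs no significance test. It simply tries every candidate split, computes the pairwise correlations on each side, and keeps the split with the largest absolute difference on any edge. The function names, the `min_seg` parameter, and the simulated data are all hypothetical choices made for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_edges(rows):
    """Lag-0 pairwise correlations for every variable pair (i < j)."""
    p = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(p)]
    return {(i, j): pearson(cols[i], cols[j])
            for i in range(p) for j in range(i + 1, p)}


def scan_change_point(rows, min_seg=20):
    """Return (split, statistic) maximising the largest edge-wise
    absolute correlation difference between rows[:split] and rows[split:]."""
    best_split, best_stat = None, -1.0
    for s in range(min_seg, len(rows) - min_seg + 1):
        left, right = corr_edges(rows[:s]), corr_edges(rows[s:])
        stat = max(abs(left[e] - right[e]) for e in left)
        if stat > best_stat:
            best_split, best_stat = s, stat
    return best_split, best_stat


# Simulated data (an assumption for illustration): three variables,
# 200 time points. The relation between variables 0 and 1 flips sign
# at t = 100; variable 2 is independent noise throughout.
random.seed(0)
rows = []
for t in range(200):
    x1 = random.gauss(0, 1)
    sign = 1.0 if t < 100 else -1.0
    x2 = sign * x1 + 0.1 * random.gauss(0, 1)
    x3 = random.gauss(0, 1)
    rows.append((x1, x2, x3))

split, stat = scan_change_point(rows)
print("estimated change point:", split, "statistic:", round(stat, 3))
```

In this simulation the true change is at index 100, so the estimated split should land close to it. A real analysis would also need a test of whether the difference is larger than chance, which is the role of the significance testing approaches the abstract mentions.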
P. M. Bentler has shown that Rao's canonical factor analysis is in effect a psychometric analysis, leading to factors that are maximally assessible from the data. He contrasts this with Kaiser and Caffrey's alpha factor analysis that leads to factors that maximally represent the true factors in the content domain. Noting the problems associated with factors that may be highly assessible, but not very representative, or vice versa, Bentler suggests the need for a technique that would, insofar as possible, be optimal with respect to both criteria. Such a technique is presented here, and is shown to resolve into a traditional scaling method, which in turn acquires a richer psychometric interpretation.
Forecasts play a central role in decision-making under uncertainty. After a brief review of the general issues, this article considers ways of using high-dimensional data in forecasting. We consider selecting variables from a known active set, known knowns, using Lasso and One Covariate at a time Multiple Testing, and approximating unobserved latent factors, known unknowns, by various means. This combines both sparse and dense approaches to forecasting. We demonstrate the various issues involved in variable selection in a high-dimensional setting with an application to forecasting UK inflation at different horizons over the period 2020q1–2023q1. This application shows both the power of parsimonious models and the importance of allowing for global variables.
This chapter discusses the Fourier series representation for continuous-time signals. This is applicable to signals which are either periodic or have a finite duration. The connections between the continuous-time Fourier transform (CTFT), the discrete-time Fourier transform (DTFT), and Fourier series are also explained. Properties of Fourier series are discussed and many examples presented. For real-valued signals it is shown that the Fourier series can be written as a sum of a cosine series and a sine series; examples include rectified cosines, which have applications in electric power supplies. It is shown that the basis functions used in the Fourier series representation satisfy an orthogonality property. This makes the truncated version of the Fourier representation optimal in a certain sense. The so-called principal component approximation derived from the Fourier series is also discussed. A detailed discussion of the properties of musical signals in the light of Fourier series theory is presented, and leads to a discussion of musical scales, consonance, and dissonance. Also explained is the connection between Fourier series and the function-approximation property of multilayer neural networks, used widely in machine learning. An overview of wavelet representations and the contrast with Fourier series representations is also given.
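The claim that orthogonality makes the truncated Fourier representation optimal "in a certain sense" can be checked numerically on one of the signals the abstract mentions, a rectified cosine. The sketch below is an illustration written for this summary, not code from the chapter. It approximates the cosine and sine coefficients of |cos t| (period π) by a Riemann sum and then shows that the mean squared error of the partial sums shrinks as harmonics are added. All function names and sample counts are hypothetical.

```python
import math


def fourier_coeffs(f, period, n_harmonics, n_samples=4096):
    """Approximate real Fourier coefficients a_k, b_k (k = 0..n_harmonics)
    with a Riemann sum over one period."""
    dt = period / n_samples
    ts = [i * dt for i in range(n_samples)]
    w0 = 2 * math.pi / period  # fundamental angular frequency
    a, b = [], []
    for k in range(n_harmonics + 1):
        a.append((2 / period) * dt * sum(f(t) * math.cos(k * w0 * t) for t in ts))
        b.append((2 / period) * dt * sum(f(t) * math.sin(k * w0 * t) for t in ts))
    return a, b


def partial_sum(a, b, period, t, n_terms):
    """Evaluate the Fourier series truncated after n_terms harmonics."""
    w0 = 2 * math.pi / period
    total = a[0] / 2
    for k in range(1, n_terms + 1):
        total += a[k] * math.cos(k * w0 * t) + b[k] * math.sin(k * w0 * t)
    return total


def mean_square_error(f, a, b, period, n_terms, n_samples=2048):
    """Mean squared error of the truncated series over one period."""
    dt = period / n_samples
    return sum((f(i * dt) - partial_sum(a, b, period, i * dt, n_terms)) ** 2
               for i in range(n_samples)) / n_samples


def rectified_cosine(t):
    """Full-wave rectified cosine; its period is pi."""
    return abs(math.cos(t))


PERIOD = math.pi
a, b = fourier_coeffs(rectified_cosine, PERIOD, 8)
errors = [mean_square_error(rectified_cosine, a, b, PERIOD, n) for n in range(9)]

print("DC term a0/2:", a[0] / 2)  # analytically 2/pi
for n, err in enumerate(errors):
    print(n, "harmonics -> mean squared error", err)
```

Because |cos t| is even, the sine coefficients come out numerically zero, which matches the abstract's remark that a real-valued signal splits into a cosine series and a sine series.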
This chapter moves from regression to methods that focus on the pattern presented by multiple variables, albeit with applications in regression analysis. A strong focus is to find patterns that beg further investigation, and/or replace many variables by a much smaller number that capture important structure in the data. Methodologies discussed include principal components analysis and multidimensional scaling more generally, cluster analysis (the exploratory process that groups “alike” observations) and dendrogram construction, and discriminant analysis. Two sections discuss issues for the analysis of data, such as from high throughput genomics, where the aim is to determine, from perhaps thousands or tens of thousands of variables, which are shifted in value between groups in the data. A treatment of the role of balance and matching in making inferences from observational data then follows. The chapter ends with a brief introduction to methods for multiple imputation, which aims to use multivariate relationships to fill in missing values in observations that are incomplete, allowing them to have at least some role in a regression or other further analysis.
A total of 108 diverse sorghum (Sorghum bicolor) accessions were characterized for quantitative and qualitative fodder-related traits and zonate leaf spot (ZLS) (Gloeocercospora sorghi) disease during two successive wet seasons of 2019 and 2020 in an augmented randomized block design. The Shannon's diversity index and analysis of variance showed the existence of significant variability among qualitative and quantitative traits. K-means clustering showed a strong relationship between green fodder yield (GFY) and other yield-contributing traits. The dendrogram constructed based on morphological traits classified accessions into four diverse groups, and most genotypes fell into cluster II. The principal component analysis biplot showed a total variation of 68.96%, where GFY, stem weight per plant, panicle length and dry matter yield (DMY) contributed significantly. From the experimental results, three sorghum genotypes viz., IG-03-424, IG-01-436 and IG-03-438 were identified as promising for higher GFY (808.66 g/plant) and DMY (238.0 g/plant), respectively. Further, based on disease reactions under natural conditions, five genotypes viz., EC-512397, EC512393, EC512394, EC512399 and IG-02-437 were identified as potential donors for resistance to ZLS disease. These selected lines could be used as promising sources for high biomass and disease resistance in forage sorghum breeding programmes.
The shape of the flower can vary based on the type of pollinator or the environment in which the plant develops. In Vanilla insignis, there are no studies that analyse the shape of the labellum of the flower as has been done in V. planifolia. Therefore, the study aimed to determine the variation in the shape of the labellum of V. insignis through a morphometric analysis in different environmental conditions in the state of Quintana Roo, Mexico. The results showed that there were significant differences in the variables analysed. Principal component analysis and dendrogram analysis reveal that four V. insignis morphotypes were possibly associated with soil water availability conditions because there were significant differences between the variables that define the apical region. In addition, the distribution of the morphotypes corresponded with the presence of humidity regardless of geographical distances such as in the populations of Tenampulco, Puebla and Caobas, Quintana Roo. The presence of these morphotypes allows the development of conservation programmes and genetic improvement of the species of V. insignis and related commercial species.
This study assessed the genetic diversity of African yam bean (AYB) accessions using morphological and molecular markers. The accessions were grown, and morphological data collected were subjected to analysis of variance and multivariate analyses. Genomic DNA extracted from the accessions was amplified with inter simple sequence repeat (ISSR) markers. The diversity analysis was conducted using MEGA4 software. The accessions varied significantly (P < 0.05) in growth, flowering and seed-related parameters. Flowering commenced early in most accessions. The 100-seed weight ranged from 15.01 to 21.15 g, with a mean value of 18.30 g. Significant correlations existed between stem height, the number of leaves and leaf dimensions. Also, days to flowering correlated with pod formation; likewise, seed dimension had a positive association with seed weight. The principal biplot revealed that two components accounted for 41.77% of the observed variation. Analysis of the electropherogram showed that 95 loci comprising 1351 alleles were detected by the ISSR markers, with 65.26% polymorphism and a combined polymorphic information content of 0.85. The principal coordinate analysis placed accessions together on a plane based on their spatial relationship. The dendrogram showed that accession pairs (TSs-77, TSs-95) and (TSs-111, TSs-84) are closely related. The phylogram identified three kinships with a total length of 454. Accession TSs-115 is likely the progenitor while TSs-82 and TSs-86 are the most recent. The study concluded that a combination of morphological and ISSR markers is effective for the diversity study of AYB and that the existing genetic diversity in the accessions could be harnessed for its improvement, conservation and utilization.