Current practice in factor analysis typically involves analysis of correlation rather than covariance matrices. We study whether the standard z-statistic that evaluates whether a factor loading is statistically necessary is correctly applied in such situations and more generally when the variables being analyzed are arbitrarily rescaled. Effects of rescaling on estimated standard errors of factor loading estimates, and the consequent effect on z-statistics, are studied in three variants of the classical exploratory factor model under canonical, raw varimax, and normal varimax solutions. For models with analytical solutions we find that some of the standard errors as well as their estimates are scale equivariant, while others are invariant. For a model in which an analytical solution does not exist, we use an example to illustrate that neither the factor loading estimates nor the standard error estimates possess scale equivariance or invariance, implying that different conclusions could be obtained with different scalings. Together with the prior findings on parameter estimates, these results provide new guidance for a key statistical aspect of factor analysis.
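To make the scale question concrete, here is a minimal numerical sketch, not the paper's analytical derivations: refit an exploratory factor model after arbitrarily rescaling the variables and check whether the loading estimates transform with the scaling. The use of scikit-learn and all names and values are our own assumptions.

```python
# Minimal sketch (our own setup, not the paper's derivations): compare
# factor loading estimates before and after an arbitrary rescaling.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n, p, k = 2000, 6, 1
Lambda = rng.uniform(0.5, 0.9, size=(p, k))           # true loadings
F = rng.standard_normal((n, k))
X = F @ Lambda.T + 0.5 * rng.standard_normal((n, p))  # observed data

D = np.diag(rng.uniform(0.5, 3.0, size=p))            # arbitrary rescaling
fa_raw = FactorAnalysis(n_components=k).fit(X)
fa_scl = FactorAnalysis(n_components=k).fit(X @ D)

# Under scale equivariance, the rescaled-data loadings should equal
# D times the raw-data loadings (up to sign):
print(np.abs(fa_scl.components_.T))
print(np.abs(D @ fa_raw.components_.T))
```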
We address several issues that are raised by Bentler and Tanaka's [1983] discussion of Rubin and Thayer [1982]. Our conclusions are: standard methods do not completely monitor the possible existence of multiple local maxima; summarizing inferential precision by the standard output based on second derivatives of the log likelihood at a maximum can be inappropriate, even if there exists a unique local maximum; EM and LISREL can be viewed as complementary, albeit not entirely adequate, tools for factor analysis.
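The local-maxima concern can be illustrated with a small multi-start experiment. This is a hedged sketch under our own setup (scikit-learn's maximum likelihood FactorAnalysis with random unique-variance starts), not the authors' analyses.

```python
# Run ML factor analysis from several starting values for the unique
# variances and compare the attained log-likelihoods; differing limits
# would signal multiple local maxima that a single run cannot reveal.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 8))
X[:, :4] += rng.standard_normal((500, 1))   # one common factor
X[:, 4:] += rng.standard_normal((500, 1))   # another common factor

for s in range(5):
    start = rng.uniform(0.1, 2.0, size=X.shape[1])   # random start
    fa = FactorAnalysis(n_components=2, noise_variance_init=start,
                        max_iter=5000).fit(X)
    print(f"start {s}: mean log-likelihood = {fa.score(X):.6f}")
```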
Consider an old test X consisting of s sections and two new tests Y and Z similar to X consisting of p and q sections respectively. All subjects are given test X plus two variable sections from either test Y or Z. Different pairings of variable sections are given to each subsample of subjects. We present a method of estimating the covariance matrix of the combined test (X1, ..., Xs, Y1, ..., Yp, Z1, ..., Zq) and describe an application of these estimation techniques to linear, observed-score, test equating.
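The core estimation problem is maximum likelihood for a multivariate normal mean and covariance matrix when each examinee observes only some of the sections. The following is a generic EM sketch under that framing; it is our own minimal implementation, not the authors' specific procedure.

```python
# EM for a multivariate normal with values missing by design
# (np.nan marks the sections a subject did not take).
import numpy as np

def em_mvn(Y, n_iter=200):
    """ML mean vector and covariance matrix from incomplete rows."""
    n, p = Y.shape
    mu = np.nanmean(Y, axis=0)
    S = np.diag(np.nanvar(Y, axis=0))
    for _ in range(n_iter):
        sum_y, sum_yy = np.zeros(p), np.zeros((p, p))
        for row in Y:
            obs = ~np.isnan(row)
            mis = ~obs
            y, C = row.copy(), np.zeros((p, p))
            if mis.any():
                # E-step: condition the missing block on the observed one
                B = S[np.ix_(mis, obs)] @ np.linalg.inv(S[np.ix_(obs, obs)])
                y[mis] = mu[mis] + B @ (row[obs] - mu[obs])
                C[np.ix_(mis, mis)] = (S[np.ix_(mis, mis)]
                                       - B @ S[np.ix_(mis, obs)].T)
            sum_y += y
            sum_yy += np.outer(y, y) + C
        # M-step: update mean and covariance from expected sufficient stats
        mu = sum_y / n
        S = sum_yy / n - np.outer(mu, mu)
    return mu, S
```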
The normal theory based maximum likelihood procedure is widely used in structural equation modeling. Three alternatives are: normal theory based generalized least squares, normal theory based iteratively reweighted least squares, and the asymptotically distribution-free procedure. When data are normally distributed and the model structure is correctly specified, the four procedures are asymptotically equivalent. In practice, however, this equivalence is often invoked even when models are not correctly specified. This short paper clarifies the conditions under which the procedures are not asymptotically equivalent. Analytical results indicate that, when a model is incorrect, two factors contribute to the nonequivalence: the covariance matrices estimated by the different procedures differ, and the procedures use different scales to measure the distance between the sample covariance matrix and the estimated covariance matrix. The results are illustrated using real as well as simulated data. Implications of the results for model fit indices are also discussed, using the comparative fit index as an example.
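The "different scales" factor can be made concrete by evaluating the two discrepancy functions at the same sample and model-implied covariance matrices. The formulas below are the standard normal-theory ML and GLS discrepancies; the numerical setup is our own.

```python
import numpy as np

def f_ml(S, Sigma):
    """Normal-theory ML discrepancy: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = S.shape[0]
    return (np.log(np.linalg.det(Sigma))
            + np.trace(S @ np.linalg.inv(Sigma))
            - np.log(np.linalg.det(S)) - p)

def f_gls(S, Sigma):
    """Normal-theory GLS discrepancy: 0.5 tr{[(S - Sigma) S^-1]^2}."""
    R = (S - Sigma) @ np.linalg.inv(S)
    return 0.5 * np.trace(R @ R)

S = np.array([[1.0, 0.5], [0.5, 1.0]])          # "sample" covariance
Sigma = np.array([[1.0, 0.2], [0.2, 1.0]])      # a misspecified model value
print(f_ml(S, Sigma), f_gls(S, Sigma))          # agree only when Sigma is near S
```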
Given known item parameters, the bootstrap method can be used to determine the statistical accuracy of ability estimates in item response theory. Through a Monte Carlo study, the method is evaluated as a way of approximating the standard error and confidence limits for the maximum likelihood estimate of the ability parameter, and compared to the use of the theoretical standard error and confidence limits developed by Lord. At least for short tests, the bootstrap method yielded better estimates than the corresponding theoretical values.
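One natural version of this procedure, sketched below under our own minimal setup (Rasch model, known difficulties, parametric bootstrap), re-estimates ability from responses resampled at the initial estimate and compares the bootstrap standard error with the information-based one.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rasch_mle(x, b):
    """ML ability estimate given responses x and known difficulties b."""
    def negloglik(theta):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))
        return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    return minimize_scalar(negloglik, bounds=(-6, 6), method="bounded").x

rng = np.random.default_rng(0)
b = np.linspace(-2, 2, 20)                       # short test, known items
x = (rng.random(20) < 1 / (1 + np.exp(-(0.5 - b)))).astype(int)
theta_hat = rasch_mle(x, b)

# Parametric bootstrap: resample responses at theta_hat, re-estimate.
p = 1.0 / (1.0 + np.exp(-(theta_hat - b)))
boot = [rasch_mle((rng.random(20) < p).astype(int), b) for _ in range(500)]
print("bootstrap SE:        ", np.std(boot, ddof=1))
print("information-based SE:", 1.0 / np.sqrt(np.sum(p * (1 - p))))
```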
A maximum likelihood procedure for combining standardized mean differences based on a noncentral t-distribution is proposed. With a proper data augmentation technique, an EM algorithm is developed. Information and likelihood ratio statistics are discussed in detail for reliable inference. Simulation results favor the proposed procedure over both the existing normal theory maximum likelihood procedure and the commonly used generalized least squares procedure.
This paper uses an extension of the network algorithm originally introduced by Mehta and Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. By assuming that item difficulties are known, the algorithm is applicable to the statistical tests either given the maximum likelihood ability estimate or conditioned on the total score. A simulation study indicates that the network algorithm is an efficient tool for computing the significance level of a person fit statistic based on test lengths of 30 items or less.
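The network algorithm organizes this computation efficiently; purely as a conceptual sketch, the same conditional tail probability can be obtained for a short test by brute-force enumeration of all response patterns with the observed total score. The code and the choice of fit statistic below are our own, not the paper's.

```python
import numpy as np
from itertools import combinations

def exact_pvalue(x, b, theta):
    """Exact P(fit stat >= observed | total score) under the Rasch model."""
    x = np.asarray(x)
    n, r = len(x), int(x.sum())
    p = 1.0 / (1.0 + np.exp(-(theta - b)))

    def fit_stat(y):                        # minus log-likelihood as fit stat
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

    t_obs, num, den = fit_stat(x), 0.0, 0.0
    for ones in combinations(range(n), r):  # every pattern with score r
        y = np.zeros(n)
        y[list(ones)] = 1
        w = np.prod(np.where(y == 1, p, 1 - p))   # pattern probability
        den += w
        if fit_stat(y) >= t_obs - 1e-12:
            num += w
    return num / den

print(exact_pvalue([1, 1, 1, 0, 1, 0, 0, 1, 0, 0],
                   b=np.linspace(-2, 2, 10), theta=0.0))
```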
We apply the Hawkes process to the analysis of dyadic interaction. The Hawkes process is applicable to excitatory interactions, wherein the actions of each individual increase the probability of further actions in the near future. We consider the representation of the Hawkes process both as a conditional intensity function and as a cluster Poisson process. The former treats the probability of an action in continuous time via non-stationary distributions with arbitrarily long historical dependency, while the latter is conducive to maximum likelihood estimation using the EM algorithm. We first outline the interpretation of the Hawkes process in the dyadic context, and then illustrate its application with an example concerning email transactions in the workplace.
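As a hedged sketch of the conditional-intensity view, the code below writes out the intensity with an exponential kernel and simulates the process by Ogata's thinning method; the parameter names and values are our own, and the paper's EM estimation via the cluster representation is not reproduced here.

```python
import numpy as np

def intensity(t, history, mu, alpha, beta):
    """lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)) over past events."""
    h = np.asarray(history)
    return mu + alpha * np.exp(-beta * (t - h[h < t])).sum()

def simulate_hawkes(mu, alpha, beta, t_end, rng):
    """Ogata thinning: the intensity decays between events, so the current
    intensity plus one jump is a valid (conservative) upper bound."""
    t, events = 0.0, []
    while t < t_end:
        lam_bar = intensity(t, events, mu, alpha, beta) + alpha
        t += rng.exponential(1.0 / lam_bar)
        if (t < t_end and
                rng.random() <= intensity(t, events, mu, alpha, beta) / lam_bar):
            events.append(t)              # accepted baseline or excited action
    return np.array(events)

rng = np.random.default_rng(0)
print(simulate_hawkes(mu=0.2, alpha=0.6, beta=1.0, t_end=50.0, rng=rng)[:5])
```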
A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as it is when using the traditional listwise or pairwise present-data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artificial data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior.
The notion of scale freeness does not seem to have been well understood in the factor analytic literature. It has been believed that if the loss function that is minimized to obtain estimates of the parameters in the factor model is scale invariant, then the estimates are scale free. It is shown that scale invariance of the loss function is neither a necessary nor a sufficient condition for scale freeness. A theorem that ensures scale freeness in the orthogonal factor model is given in this paper.
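In the notation commonly used for this problem (ours here, not necessarily the paper's), with Sigma = Lambda Lambda' + Psi and D any positive-definite diagonal scaling matrix, the two properties read:

```latex
F(DSD,\, D\Sigma D) = F(S,\, \Sigma)
  \quad\text{(scale invariance of the loss)}

\hat{\Lambda}(DSD) = D\,\hat{\Lambda}(S), \qquad
\hat{\Psi}(DSD) = D\,\hat{\Psi}(S)\,D
  \quad\text{(scale freeness of the estimates)}
```

The point of the result is that the first property neither implies nor is implied by the second.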
In this paper, linear structural equation models with latent variables are considered. It is shown how many common models arise from incomplete observation of a relatively simple system. Subclasses of models with conditional independence interpretations are also discussed. Using an incomplete data point of view, the relationships between the incomplete and complete data likelihoods, assuming normality, are highlighted. For computing maximum likelihood estimates, the EM algorithm and alternatives are surveyed. For the alternative algorithms, simplified expressions for computing function values and derivatives are given. Likelihood ratio tests based on complete and incomplete data are related, and an example on using their relationship to improve the fit of a model is given.
This paper focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered, and ability estimators are defined from a very restricted set of assumptions and formulas. This approach encompasses most standard methods such as maximum likelihood, weighted likelihood, maximum a posteriori, and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to traditional ones by means of a simulation study under Rasch modeling.
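As a sketch of the M-estimation route, here is our own reduction for the ML estimator under the Rasch model, where the sandwich formula collapses to the familiar information-based ASE; all names are illustrative.

```python
import numpy as np

def ase_m_estimation(theta, b):
    """Sandwich ASE sqrt(B)/A for the Rasch ML ability estimator, whose
    estimating function is psi_j(theta) = x_j - p_j(theta)."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    A = np.sum(p * (1 - p))   # -E[d psi / d theta], summed over items
    B = np.sum(p * (1 - p))   # E[psi^2], summed over items
    return np.sqrt(B) / A     # here equal to 1/sqrt(test information)

print(ase_m_estimation(theta=0.5, b=np.linspace(-2, 2, 20)))
```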
Bootstrap and jackknife techniques are used to estimate ellipsoidal confidence regions of group stimulus points derived from INDSCAL. The validity of these estimates is assessed through Monte Carlo analysis. Asymptotic estimates of confidence regions based on a MULTISCALE solution are also evaluated. Our findings suggest that the bootstrap and jackknife techniques may be used to provide statements regarding the accuracy of the relative locations of points in space. Our findings also suggest that MULTISCALE asymptotic estimates of confidence regions based on small samples provide an optimistic view of the actual statistical reliability of the solution.
Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures have been developed that test the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and that the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used to test the MCAR mechanism even when the population distribution is multivariate normal.
Problem solving models relating individual and group solution times under time limit censoring are presented. Maximum likelihood estimates of parameters of the resulting censored distributions are derived and goodness of fit tests for the individual-group models are constructed. The methods are illustrated on data previously analyzed by Restle and Davis.
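As a small illustration of the censoring mechanics, with a distribution of our own choosing (exponential, which the paper need not use), the rate MLE under a fixed time limit has a closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
L, lam_true = 2.0, 0.8
t = rng.exponential(1.0 / lam_true, size=1000)   # latent solution times
solved = t <= L                                  # solved within the limit?
obs = np.minimum(t, L)                           # observed (censored) times

# MLE for the rate: uncensored count / total observed time
lam_hat = solved.sum() / obs.sum()
print(lam_hat)   # close to lam_true
```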
Maximum likelihood is an important approach to analysis of two-level structural equation models. Different algorithms for this purpose have been available in the literature. In this paper, we present a new formulation of two-level structural equation models and develop an EM algorithm for fitting this formulation. This new formulation covers a variety of two-level structural equation models. As a result, the proposed EM algorithm is widely applicable in practice. A practical example illustrates the performance of the EM algorithm and the maximum likelihood statistic.
Lord and Wingersky have developed a method for computing the asymptotic variance-covariance matrix of maximum likelihood estimates for item and person parameters under some restrictions on the estimates which are needed in order to fix the latent scale. The method is tedious, but can be simplified for the Rasch model when one is only interested in the item parameters. This is demonstrated here under a suitable restriction on the item parameter estimates.
The relationship between linear factor models and latent profile models is addressed within the context of maximum likelihood estimation based on the joint distribution of the manifest variables. Although the two models are well known to imply equivalent covariance decompositions, in general they do not yield equivalent estimates of the unconditional covariances. In particular, a 2-class latent profile model with Gaussian components underestimates the observed covariances but not the variances, when the data are consistent with a unidimensional Gaussian factor model. In explanation of this phenomenon we provide some results relating the unconditional covariances to the goodness of fit of the latent profile model, and to its excess multivariate kurtosis. The analysis also leads to some useful parameter restrictions related to symmetry.
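The phenomenon is straightforward to reproduce in simulation. The sketch below is our own (scikit-learn's GaussianMixture with diagonal class covariances standing in for the 2-class latent profile model); it compares the mixture-implied unconditional covariances with the sample ones.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n, p = 5000, 4
f = rng.standard_normal((n, 1))                        # one Gaussian factor
X = f @ np.full((1, p), 0.8) + 0.6 * rng.standard_normal((n, p))

# 2-class latent profile model: conditional independence within classes
gm = GaussianMixture(n_components=2, covariance_type="diag",
                     random_state=0).fit(X)
w, mu, v = gm.weights_, gm.means_, gm.covariances_

# Mixture-implied unconditional covariance matrix
m = (w[:, None] * mu).sum(axis=0)
implied = sum(w[k] * (np.diag(v[k]) + np.outer(mu[k], mu[k]))
              for k in range(2)) - np.outer(m, m)

print(np.round(np.cov(X.T), 3))   # sample covariance matrix
print(np.round(implied, 3))       # variances match; covariances smaller
```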
Standard procedures for estimating item parameters in item response theory (IRT) ignore collateral information that may be available about examinees, such as their standing on demographic and educational variables. This paper describes circumstances under which collateral information about examinees may be used to make inferences about item parameters more precise, and circumstances under which it must be used to obtain correct inferences.
The details of EM algorithms for maximum likelihood factor analysis are presented for both the exploratory and confirmatory models. The algorithm is essentially the same for both cases and involves only simple least squares regression operations; the largest matrix inversion required is for a q × q symmetric matrix, where q is the number of factors. The example that is used demonstrates that the likelihood for the factor analysis model may have multiple modes that are not simply rotations of each other; such behavior should concern users of maximum likelihood factor analysis and certainly should cast doubt on the general utility of second derivatives of the log likelihood as measures of precision of estimation.
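A compact sketch of an EM algorithm of this type for the exploratory model (x = Lambda f + e with diagonal Psi) follows; it is our own minimal version, and consistent with the abstract, the only non-trivial inversions are q × q.

```python
import numpy as np

def fa_em(X, q, n_iter=500):
    """EM for exploratory ML factor analysis with q factors."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    rng = np.random.default_rng(0)
    L = 0.1 * rng.standard_normal((p, q))          # starting loadings
    psi = np.diag(S).copy()                        # unique variances
    for _ in range(n_iter):
        # E-step: factor-score regression; Psi is diagonal, so the only
        # genuine matrix inversions below are q x q.
        Pi = np.diag(1.0 / psi)
        M = np.linalg.inv(np.eye(q) + L.T @ Pi @ L)    # q x q
        B = M @ L.T @ Pi                               # regression weights
        Eff = M + B @ S @ B.T          # average E[f f'] over the sample
        Exf = S @ B.T                  # average E[x f'] over the sample
        # M-step: least squares updates of loadings and uniquenesses
        L = Exf @ np.linalg.inv(Eff)                   # q x q inverse
        psi = np.diag(S - L @ Exf.T).copy()
    return L, psi
```

Running it from several dispersed starting values and comparing the attained likelihoods is the simplest check for the multiple, non-rotationally-equivalent modes the example in the paper exhibits.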