This chapter introduces random variables and explains how to use them to model uncertain numerical quantities that are discrete. We first provide a mathematical definition of random variables, building upon the framework of probability spaces. Then, we explain how to manipulate discrete random variables in practice, using their probability mass function (pmf), and describe the main properties of the pmf. Motivated by an example where we analyze Kevin Durant's free-throw shooting, we define the empirical pmf, a nonparametric estimator of the pmf that does not make strong assumptions about the data. Next, we define several popular discrete parametric distributions (Bernoulli, binomial, geometric, and Poisson), which yield parametric estimators of the pmf, and explain how to fit them to data via maximum-likelihood estimation. We conclude the chapter by comparing the advantages and disadvantages of nonparametric and parametric models, illustrated by a real-data example, where we model the number of calls arriving at a call center.
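As a rough illustration of the two routes to a pmf described above, the following Python sketch (not taken from the chapter) computes an empirical pmf from hypothetical call counts and fits a Poisson model by maximum likelihood; the data and variable names are illustrative assumptions.

```python
# Minimal sketch: nonparametric (empirical pmf) vs. parametric (Poisson MLE)
# estimation of a pmf. The call-count data below are hypothetical.
import numpy as np
from collections import Counter
from scipy.stats import poisson

counts = np.array([3, 1, 4, 2, 2, 5, 0, 3, 2, 4])   # hypothetical calls per minute

# Empirical pmf: relative frequency of each observed value.
freq = Counter(counts.tolist())
empirical_pmf = {k: v / len(counts) for k, v in sorted(freq.items())}

# Poisson MLE: the likelihood is maximized by the sample mean.
lam_hat = counts.mean()
poisson_pmf = {k: poisson.pmf(k, lam_hat) for k in empirical_pmf}

print(empirical_pmf)
print(poisson_pmf)
```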
This chapter introduces continuous random variables, which enable us to model uncertain continuous quantities. We again begin with a formal definition, but quickly move on to describe how to manipulate continuous random variables in practice. We define the cumulative distribution function and quantiles (including the median) and explain how to estimate them from data. We then introduce the concept of probability density and describe its main properties. We present two approaches to obtain nonparametric models of probability densities from data: the histogram and kernel density estimation. Next, we define two celebrated continuous parametric distributions – the exponential and the Gaussian – and show how to fit them to data using maximum-likelihood estimation. We use these distributions to model the interarrival time of calls at a call center, and height in a population, respectively. Finally, we discuss how to simulate continuous random variables via inverse transform sampling.
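The inverse transform sampling step mentioned at the end of the chapter can be sketched in a few lines; the rate parameter and sample size below are illustrative assumptions, not values from the text.

```python
# Minimal sketch of inverse transform sampling for an exponential distribution:
# the quantile function of Exp(lam) is F^{-1}(u) = -log(1 - u) / lam, so applying
# it to uniform draws yields exponential samples. lam and the sample size are
# illustrative.
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                           # assumed rate parameter
u = rng.uniform(size=100_000)       # U(0, 1) draws
samples = -np.log(1.0 - u) / lam    # inverse CDF applied to the uniforms

print(samples.mean())               # should be close to 1 / lam = 0.5
```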
The Hawkes process is a popular candidate for researchers to model phenomena that exhibit a self-exciting nature. The classical Hawkes process assumes the excitation kernel takes an exponential form, thus suggesting that the peak excitation effect of an event is immediate and the excitation effect decays towards 0 exponentially. While the assumption of an exponential kernel makes it convenient for studying the asymptotic properties of the Hawkes process, it can be restrictive and unrealistic for modelling purposes. A variation on the classical Hawkes process is proposed where the exponential assumption on the kernel is replaced by integrability and smoothness type conditions. However, it is substantially more difficult to conduct asymptotic analysis under this setup since the intensity process is non-Markovian when the excitation kernel is non-exponential, rendering techniques for studying the asymptotics of Markov processes inappropriate. By considering the Hawkes process with a general excitation kernel as a stationary Poisson cluster process, the intensity process is shown to be ergodic. Furthermore, a parametric setup is considered, under which, by utilising the recently established ergodic property of the intensity process, consistency of the maximum likelihood estimator is demonstrated.
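For concreteness, the conditional intensity of a Hawkes process with a general excitation kernel, lambda(t) = mu + the sum of g(t - t_i) over past events t_i < t, can be evaluated as in the Python sketch below; the baseline rate, the gamma-shaped kernel, and the event times are illustrative choices, not the paper's specification.

```python
# Minimal sketch of a Hawkes conditional intensity with a non-exponential
# excitation kernel. Baseline rate, kernel, and event times are illustrative.
import numpy as np
from math import gamma

def gamma_kernel(s, alpha=0.5, shape=2.0, rate=1.0):
    """Excitation kernel with a delayed (non-immediate) peak; not exponential."""
    return alpha * rate**shape * s**(shape - 1) * np.exp(-rate * s) / gamma(shape)

def intensity(t, event_times, mu=0.2, kernel=gamma_kernel):
    """Conditional intensity lambda(t) = mu + sum of kernel(t - t_i) over past events."""
    past = np.asarray(event_times, dtype=float)
    past = past[past < t]
    return mu + kernel(t - past).sum()

events = [1.0, 1.7, 3.2]            # hypothetical event times
print(intensity(4.0, events))
```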
We present a multidimensional data analysis framework for the analysis of ordinal response variables. Underlying the ordinal variables, we assume a continuous latent variable, leading to cumulative logit models. The framework includes unsupervised methods, when no predictor variables are available, and supervised methods, when predictor variables are available. We distinguish between dominance variables and proximity variables: dominance variables are analyzed using inner product models, whereas proximity variables are analyzed using distance models. An expectation–majorization–minimization algorithm is derived for estimating the parameters of the models. We illustrate our methodology with three empirical data sets, highlighting the advantages of the proposed framework. A simulation study is conducted to evaluate the performance of the algorithm.
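A minimal Python sketch of the cumulative logit link that underlies the latent-variable formulation is given below; the thresholds and the value of the linear predictor are hypothetical.

```python
# Minimal sketch: category probabilities in a cumulative logit model are obtained
# by differencing logistic CDFs evaluated at thresholds minus a linear predictor.
# Thresholds and eta are illustrative.
import numpy as np

def cumulative_logit_probs(eta, thresholds):
    """P(Y = c) for ordinal categories 1..C, given linear predictor eta."""
    cdf = 1.0 / (1.0 + np.exp(-(np.asarray(thresholds) - eta)))   # P(Y <= c)
    cdf = np.concatenate(([0.0], cdf, [1.0]))
    return np.diff(cdf)

print(cumulative_logit_probs(eta=0.3, thresholds=[-1.0, 0.0, 1.5]))  # 4 categories
```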
Current practice in factor analysis typically involves analysis of correlation rather than covariance matrices. We study whether the standard z-statistic that evaluates whether a factor loading is statistically necessary is correctly applied in such situations and more generally when the variables being analyzed are arbitrarily rescaled. Effects of rescaling on estimated standard errors of factor loading estimates, and the consequent effect on z-statistics, are studied in three variants of the classical exploratory factor model under canonical, raw varimax, and normal varimax solutions. For models with analytical solutions we find that some of the standard errors as well as their estimates are scale equivariant, while others are invariant. For a model in which an analytical solution does not exist, we use an example to illustrate that neither the factor loading estimates nor the standard error estimates possess scale equivariance or invariance, implying that different conclusions could be obtained with different scalings. Together with the prior findings on parameter estimates, these results provide new guidance for a key statistical aspect of factor analysis.
We address several issues that are raised by Bentler and Tanaka's [1983] discussion of Rubin and Thayer [1982]. Our conclusions are: standard methods do not completely monitor the possible existence of multiple local maxima; summarizing inferential precision by the standard output based on second derivatives of the log likelihood at a maximum can be inappropriate, even if there exists a unique local maximum; EM and LISREL can be viewed as complementary, albeit not entirely adequate, tools for factor analysis.
Consider an old test X consisting of s sections and two new tests Y and Z similar to X consisting of p and q sections respectively. All subjects are given test X plus two variable sections from either test Y or Z. Different pairings of variable sections are given to each subsample of subjects. We present a method of estimating the covariance matrix of the combined test (X1, ..., Xs, Y1, ..., Yp, Z1, ..., Zq) and describe an application of these estimation techniques to linear, observed-score, test equating.
The normal theory based maximum likelihood procedure is widely used in structural equation modeling. Three alternatives are: the normal theory based generalized least squares, the normal theory based iteratively reweighted least squares, and the asymptotically distribution-free procedure. When data are normally distributed and the model structure is correctly specified, the four procedures are asymptotically equivalent. However, this equivalence is often invoked even when models are not correctly specified. This short paper clarifies the conditions under which these procedures are not asymptotically equivalent. Analytical results indicate that, when a model is not correct, two factors contribute to the nonequivalence of the procedures: the covariance matrices estimated by the different procedures differ, and the procedures use different scales to measure the distance between the sample covariance matrix and the estimated covariance matrix. The results are illustrated using real as well as simulated data. The implication of the results for model fit indices is also discussed, using the comparative fit index as an example.
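The two sources of nonequivalence can be seen directly from the normal-theory ML and GLS discrepancy functions, sketched below in Python on illustrative matrices; this is a generic illustration, not the paper's code.

```python
# Minimal sketch of two standard SEM discrepancy functions. Under a correct model
# both are minimized at the same point, but they measure the distance between the
# sample covariance S and the model-implied covariance Sigma on different scales.
# The matrices are illustrative.
import numpy as np

def f_ml(S, Sigma):
    """Normal-theory ML discrepancy: log|Sigma| - log|S| + tr(S Sigma^-1) - p."""
    p = S.shape[0]
    return (np.linalg.slogdet(Sigma)[1] - np.linalg.slogdet(S)[1]
            + np.trace(S @ np.linalg.inv(Sigma)) - p)

def f_gls(S, Sigma):
    """Normal-theory GLS discrepancy: 0.5 * tr([(S - Sigma) S^-1]^2)."""
    A = (S - Sigma) @ np.linalg.inv(S)
    return 0.5 * np.trace(A @ A)

S = np.array([[1.0, 0.4], [0.4, 1.0]])        # sample covariance (illustrative)
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])    # model-implied covariance
print(f_ml(S, Sigma), f_gls(S, Sigma))
```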
Given known item parameters, the bootstrap method can be used to determine the statistical accuracy of ability estimates in item response theory. Through a Monte Carlo study, the method is evaluated as a way of approximating the standard error and confidence limits for the maximum likelihood estimate of the ability parameter, and compared to the use of the theoretical standard error and confidence limits developed by Lord. At least for short tests, the bootstrap method yielded better estimates than the corresponding theoretical values.
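The bootstrap scheme can be sketched as follows for a short Rasch test; the item difficulties, test length, true ability, and number of replicates are illustrative assumptions, not the study's design.

```python
# Minimal sketch: with item parameters treated as known, simulate response
# patterns at the estimated ability, re-estimate the ability by maximum
# likelihood, and take the standard deviation of the replicates as the
# bootstrap standard error. All numbers are illustrative.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])      # Rasch item difficulties (known)

def p_correct(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def mle_theta(x, b):
    """Maximum likelihood ability estimate for response pattern x."""
    nll = lambda t: -np.sum(x * np.log(p_correct(t, b))
                            + (1 - x) * np.log(1.0 - p_correct(t, b)))
    return minimize_scalar(nll, bounds=(-4, 4), method="bounded").x

x_obs = rng.binomial(1, p_correct(0.3, b))     # one observed response pattern
theta_hat = mle_theta(x_obs, b)

# Bootstrap replicates of the ability estimate.
reps = [mle_theta(rng.binomial(1, p_correct(theta_hat, b)), b) for _ in range(500)]
print(theta_hat, np.std(reps, ddof=1))         # estimate and its bootstrap SE
```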
A maximum likelihood procedure for combining standardized mean differences based on a noncentral t-distribution is proposed. With a proper data augmentation technique, an EM algorithm is developed. Information and likelihood ratio statistics are discussed in detail for reliable inference. Simulation results favor the proposed procedure over both the existing normal theory maximum likelihood procedure and the commonly used generalized least squares procedure.
This paper uses an extension of the network algorithm originally introduced by Mehta and Patel to construct exact tail probabilities for testing the general hypothesis that item responses are distributed according to the Rasch model. By assuming that item difficulties are known, the algorithm is applicable to the statistical tests either given the maximum likelihood ability estimate or conditioned on the total score. A simulation study indicates that the network algorithm is an efficient tool for computing the significance level of a person fit statistic based on test lengths of 30 items or less.
We apply the Hawkes process to the analysis of dyadic interaction. The Hawkes process is applicable to excitatory interactions, wherein the actions of each individual increase the probability of further actions in the near future. We consider the representation of the Hawkes process both as a conditional intensity function and as a cluster Poisson process. The former treats the probability of an action in continuous time via non-stationary distributions with arbitrarily long historical dependency, while the latter is conducive to maximum likelihood estimation using the EM algorithm. We first outline the interpretation of the Hawkes process in the dyadic context, and then illustrate its application with an example concerning email transactions in the workplace.
A general latent variable model is given which includes the specification of a missing data mechanism. This framework allows for an elucidating discussion of existing general multivariate theory bearing on maximum likelihood estimation with missing data. Here, missing completely at random is not a prerequisite for unbiased estimation in large samples, as when using the traditional listwise or pairwise present data approaches. The theory is connected with old and new results in the area of selection and factorial invariance. It is pointed out that in many applications, maximum likelihood estimation with missing data may be carried out by existing structural equation modeling software, such as LISREL and LISCOMP. Several sets of artificial data are generated within the general model framework. The proposed estimator is compared to the two traditional ones and found superior.
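Under normality, the observed-data likelihood maximized by such an estimator can be written pattern by pattern, with each case contributing the normal density of its observed variables only; the Python sketch below is a generic illustration with hypothetical parameter values and data, not the software discussed in the paper.

```python
# Minimal sketch of the observed-data (full-information) normal log-likelihood
# with missing values coded as NaN. Mean, covariance, and data are illustrative.
import numpy as np
from scipy.stats import multivariate_normal

def observed_data_loglik(data, mu, Sigma):
    """Sum of casewise normal log-densities over the observed entries of each row."""
    ll = 0.0
    for row in data:
        obs = ~np.isnan(row)
        ll += multivariate_normal.logpdf(row[obs], mu[obs], Sigma[np.ix_(obs, obs)])
    return ll

mu = np.zeros(3)
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
data = np.array([[0.2, np.nan, -0.1],
                 [1.0, 0.3, np.nan],
                 [np.nan, -0.5, 0.7]])
print(observed_data_loglik(data, mu, Sigma))
```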
The notion of scale freeness does not seem to have been well understood in the factor analytic literature. It has been believed that if the loss function that is minimized to obtain estimates of the parameters in the factor model is scale invariant, then the estimates are scale free. It is shown that scale invariance of the loss function is neither a necessary nor a sufficient condition for scale freeness. A theorem that ensures scale freeness in the orthogonal factor model is given in this paper.
In this paper, linear structural equation models with latent variables are considered. It is shown how many common models arise from incomplete observation of a relatively simple system. Subclasses of models with conditional independence interpretations are also discussed. Using an incomplete data point of view, the relationships between the incomplete and complete data likelihoods, assuming normality, are highlighted. For computing maximum likelihood estimates, the EM algorithm and alternatives are surveyed. For the alternative algorithms, simplified expressions for computing function values and derivatives are given. Likelihood ratio tests based on complete and incomplete data are related, and an example on using their relationship to improve the fit of a model is given.
This paper focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered, and ability estimators are defined from a very restricted set of assumptions and formulas. This approach encompasses most standard methods such as maximum likelihood, weighted likelihood, maximum a posteriori, and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to traditional ones by means of a simulation study under Rasch modeling.
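As an example of the familiar special case recovered for the maximum likelihood estimator, its ASE under the Rasch model is one over the square root of the test information at the ability value; the item difficulties in the sketch below are illustrative.

```python
# Minimal sketch: ASE of the ML ability estimate under the Rasch model,
# 1 / sqrt(I(theta)) with I(theta) = sum_j p_j(theta) (1 - p_j(theta)).
# Item difficulties and theta are illustrative.
import numpy as np

def rasch_ase_ml(theta, b):
    """Asymptotic standard error of the ML ability estimate at theta."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    information = np.sum(p * (1.0 - p))
    return 1.0 / np.sqrt(information)

b = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])    # assumed item difficulties
print(rasch_ase_ml(theta=0.0, b=b))
```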
Bootstrap and jackknife techniques are used to estimate ellipsoidal confidence regions of group stimulus points derived from INDSCAL. The validity of these estimates is assessed through Monte Carlo analysis. Asymptotic estimates of confidence regions based on a MULTISCALE solution are also evaluated. Our findings suggest that the bootstrap and jackknife techniques may be used to provide statements regarding the accuracy of the relative locations of points in space. Our findings also suggest that MULTISCALE asymptotic estimates of confidence regions based on small samples provide an optimistic view of the actual statistical reliability of the solution.
Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the population distribution is multivariate normal.
Problem-solving models relating individual and group solution times under time-limit censoring are presented. Maximum likelihood estimates of the parameters of the resulting censored distributions are derived, and goodness-of-fit tests for the individual-group models are constructed. The methods are illustrated on data previously analyzed by Restle and Davis.
Maximum likelihood is an important approach to analysis of two-level structural equation models. Different algorithms for this purpose have been available in the literature. In this paper, we present a new formulation of two-level structural equation models and develop an EM algorithm for fitting this formulation. This new formulation covers a variety of two-level structural equation models. As a result, the proposed EM algorithm is widely applicable in practice. A practical example illustrates the performance of the EM algorithm and the maximum likelihood statistic.