As a multivariate model of the number of events, Rasch's multiplicative Poisson model is extended such that the parameters for individuals in the prior gamma distribution have continuous covariates. The parameters for individuals are integrated out and the hyperparameters in the prior distribution are estimated by a numerical method separately from difficulty parameters that are treated as fixed parameters or random variables. In addition, a method is presented for estimating parameters in Rasch's model with missing values.
A method is presented for generalized canonical correlation analysis of two or more matrices with missing rows. The method is a combination of Carroll’s (1968) method and the missing data approach of the OVERALS technique (Van der Burg, 1988). In a simulation study we assess the performance of the method and compare it to an existing procedure called GENCOM, proposed by Green and Carroll (1988). We find that the proposed method outperforms the GENCOM algorithm both with respect to model fit and recovery of the true structure.
The measurement of latent traits and investigation of relations between these and a potentially large set of explaining variables is typical in psychology, economics, and the social sciences. Corresponding analysis often relies on surveyed data from large-scale studies involving hierarchical structures and missing values in the set of considered covariates. This paper proposes a Bayesian estimation approach based on the device of data augmentation that addresses the handling of missing values in multilevel latent regression models. Population heterogeneity is modeled via multiple groups enriched with random intercepts. Bayesian estimation is implemented in terms of a Markov chain Monte Carlo sampling approach. To handle missing values, the sampling scheme is augmented to incorporate sampling from the full conditional distributions of missing values. We suggest modeling the full conditional distributions of missing values in terms of non-parametric classification and regression trees. This makes it possible to consider information from latent quantities functioning as sufficient statistics. A simulation study reveals that this Bayesian approach provides valid inference and outperforms complete case analysis and multiple imputation in terms of statistical efficiency and computation time. An empirical illustration using data on mathematical competencies demonstrates the usefulness of the suggested approach.
We estimate historical stock returns for Swedish listed companies in a newly constructed data set of daily stock prices that spans more than 100 years. Stock returns exhibit all the familiar characteristics. The growth of the public sector depressed the stock market, and the process of globalization revitalized it. Banks played an important role in the early development of the stock market. There was little trading in the past, and we examine the effects on return measurement from missing data. Stock selection and the replacement of missing transaction prices through search back procedures or limit orders make little difference to a value-weighted stock price index, while ignoring the price effects of capital operations makes a big difference.
Standard portions or substitution of missing portion sizes with medians may generate bias when quantifying the dietary intake from FFQ. The present study compared four different methods to include portion sizes in FFQ.
Design
We evaluated three stochastic methods for imputation of portion sizes based on information about anthropometry, sex, physical activity and age. Energy intakes computed with standard portion sizes, defined as sex-specific medians (median), or with portion sizes estimated with multinomial logistic regression (MLR), ‘comparable categories’ (Coca) or k-nearest neighbours (KNN) were compared with a reference based on self-reported portion sizes (quantified by a photographic food atlas embedded in the FFQ).
Setting
The Danish Health Examination Survey 2007–2008.
Subjects
The study included 3728 adults with complete portion size data.
Results
Compared with the reference, the root-mean-square errors of the mean daily total energy intake (in kJ) computed with portion sizes estimated by the four methods were (men; women): median (1118; 1061), MLR (1060; 1051), Coca (1230; 1146), KNN (1281; 1181). The equivalent biases (mean error) were (in kJ): median (579; 469), MLR (248; 178), Coca (234; 188), KNN (−340; 218).
Conclusions
The methods MLR and Coca provided the best agreement with the reference. The stochastic methods allowed for estimation of meaningful portion sizes by conditioning on information about physiology, and they were suitable for multiple imputation. We propose using MLR or Coca to substitute missing portion size values, or when portion sizes need to be included in an FFQ without portion size data.
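The two agreement measures reported in the Results above, bias (mean error) and root-mean-square error, can be sketched as follows. This is a minimal illustration only: the function name `bias_and_rmse` and the intake values are hypothetical and are not taken from the study.

```python
import math


def bias_and_rmse(estimates, reference):
    """Return (mean error, root-mean-square error) of estimates against a reference."""
    if len(estimates) != len(reference) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [e - r for e, r in zip(estimates, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err * err for err in errors) / len(errors))
    return bias, rmse


# Hypothetical daily energy intakes in kJ (illustrative values, not study data).
reference = [9000.0, 11000.0, 8000.0, 10000.0]
imputed = [9500.0, 10800.0, 8600.0, 10300.0]
bias, rmse = bias_and_rmse(imputed, reference)
```

A positive bias means the imputed portion sizes overestimate intake on average, while the RMSE also reflects the spread of individual errors, so a method can have a small bias and still a large RMSE.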
This work deals with prediction of IBNR reserve under a different data ordering of the non-cumulative runoff triangle. The rows of the triangle are stacked, resulting in a univariate time series with several missing values. Under this ordering, two approaches entirely based on state space models and the Kalman filter are developed, implemented with two real data sets, and compared with two well-established IBNR estimation methods — the chain ladder and an overdispersed Poisson regression model. The remarks from the empirical results are: (i) computational feasibility and efficiency; (ii) accuracy improvement for IBNR prediction; and (iii) flexibility regarding IBNR modeling possibilities.
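The data ordering described above can be sketched in a few lines. The abstract only says that the rows are stacked, so this sketch assumes row-by-row stacking with the not-yet-observed cells of each row marked as missing (NaN). The function name `stack_triangle` and the claim amounts are hypothetical.

```python
import math


def stack_triangle(triangle, n_dev):
    """Stack rows of a non-cumulative runoff triangle into one series.

    Each row is padded to n_dev development periods; unobserved cells become NaN.
    """
    series = []
    for row in triangle:
        padded = list(row) + [math.nan] * (n_dev - len(row))
        series.extend(padded)
    return series


# Hypothetical incremental claims: one row per accident year.
triangle = [
    [100.0, 60.0, 30.0, 10.0],
    [110.0, 65.0, 32.0],
    [120.0, 70.0],
    [130.0],
]
series = stack_triangle(triangle, n_dev=4)
n_missing = sum(math.isnan(v) for v in series)
```

In this layout the missing entries are exactly the future cells of the triangle, so a method that predicts missing values in a time series, such as a Kalman filter applied to a state space model, yields the quantities needed for the reserve.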
To investigate item non-response in a postal food-frequency questionnaire (FFQ), and to assess the effect of substituting/imputing missing values on dietary intake levels in the Norwegian Women and Cancer study (NOWAC). We adapted k nearest neighbours (KNN) imputation and applied it, probably for the first time, to FFQ data.
Design
Data from a recent reproducibility study were used. The FFQ was mailed twice (test–retest) about 3 months apart to the same subjects. Missing responses in the test FFQ were imputed using the null value (frequencies = null, amount = smallest), the sample mode, the sample median, KNN, and retest values.
Setting
A methodological substudy of NOWAC, a national population-based cohort.
Subjects
A random sample of 2000 women aged 46–75 years was drawn from the cohort in 2002 (response 75%). The imputation methods were compared for 1430 women who completed at least 50% of the test FFQ.
Results
We imputed 16% missing values in the overall test data matrix. Compared to null value imputation, the largest differences in estimated dietary intake were seen for KNN, and for food items with a high proportion of missing values. Imputation with retest values increased total energy intake, indicating that not all missing values are caused by respondents failing to specify no consumption, and that null value imputation may lead to underestimation and misclassification.
Conclusion
Missing values in FFQs present a methodological challenge. We encourage the application and evaluation of newer imputation methods, including KNN, which may reduce imputation errors and give more accurate intake estimates.
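To illustrate the general idea of KNN imputation mentioned above, here is a minimal sketch. It is a generic textbook variant, not the adapted procedure used in the study: the distance measure, the averaging of donor values, the function name `knn_impute`, and the example data are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] < math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Hypothetical frequency answers for three food items; None marks a missing response.
answers = [
    [1, 2, 3],
    [1, 2, None],
    [5, 5, 5],
    [1, 3, 4],
]
completed = knn_impute(answers, k=2)
```

Unlike null value, mode, or median imputation, the filled-in value depends on how the respondent answered the other items, which is why the results can differ most for items with many missing responses.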
By using the alternating projection theorem of J. von Neumann, we obtain explicit formulae for the best linear interpolator and interpolation error of missing values of a stationary process. These are expressed in terms of multistep predictors and autoregressive parameters of the process. The key idea is to approximate the future by a finite-dimensional space.
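As a worked special case (a standard textbook result, not the general formulae of the paper), consider a single missing value $X_0$ of a stationary AR(1) process with all other values observed. The symbols $\phi$, $\sigma^2$, and $\gamma_k$ are notation introduced here for illustration.

```latex
% AR(1): X_t = \phi X_{t-1} + \varepsilon_t, \quad |\phi| < 1, \quad \operatorname{Var}(\varepsilon_t) = \sigma^2
% Autocovariances: \gamma_k = \phi^{|k|} \gamma_0, \quad \gamma_0 = \sigma^2 / (1 - \phi^2)
\begin{align*}
\hat{X}_0 &= \frac{\phi}{1 + \phi^2}\,\bigl(X_{-1} + X_{1}\bigr), \\
\mathbb{E}\bigl(X_0 - \hat{X}_0\bigr)^2
  &= \gamma_0 - 4 b \gamma_1 + 2 b^2 (\gamma_0 + \gamma_2),
  \qquad b = \frac{\phi}{1 + \phi^2}, \\
  &= \gamma_0 \,\frac{1 - \phi^2}{1 + \phi^2}
   = \frac{\sigma^2}{1 + \phi^2}.
\end{align*}
```

The interpolation error $\sigma^2/(1+\phi^2)$ is smaller than the one-step prediction error $\sigma^2$, because the interpolator uses the future value $X_1$ as well as the past value $X_{-1}$.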