A method for incorporating temporal decline in predictive validity into mental testing theory is outlined. Starting from the multivariate regression of a criterion on repeated measurements, an analytic extension yields a weighting function for the repeated measurements that replaces the beta weights. Besides optimizing prediction, the procedure permits an evaluation of any particular prognostic setting: when predictive validity declines exponentially and concurrent validity is imperfect, the prognostic range can be extended by optimal weighting (“predictive filtering”) of repeated measurements. A considerable gain in prognostic range over the traditional approach can be achieved when predictive validity declines in a concave-downward fashion.
In predicting $\tilde y$ scores from p > 1 observed scores $(\tilde x)$ in a sample of size ñ, the optimal strategy (minimum expected loss), under certain assumptions, is shown to be based upon the least squares regression weights $(\hat\beta)$ computed from a previous sample. Letting $\tilde r(\hat\beta)$ represent the correlation between $\tilde y$ and the predicted values $(\hat\beta'\tilde x)$, and letting $\tilde r(w)$ represent the correlation between $\tilde y$ and a different set of predicted values $(w'\tilde x)$, where w is any weighting system that is not a function of $\tilde y$, it is shown that the probability of $\tilde r(\hat\beta)$ being less than $\tilde r(w)$ cannot exceed .50. The relationship of this result to previous research and practical implications are discussed.
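For readers less familiar with the notation, the least squares weights invoked above take the standard textbook form (a restatement for orientation, not additional material from the paper):

```latex
% Least squares weights, estimated in the previous sample:
\hat\beta = (X'X)^{-1} X' y
% Predicted value for a new observation \tilde x:
\hat{\tilde y} = \hat\beta' \tilde x
```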
Issues of model selection have dominated the theoretical and applied statistical literature for decades. Model selection methods such as ridge regression, the lasso, and the elastic net have replaced ad hoc methods such as stepwise regression as a means of model selection. In the end, however, these methods lead to a single final model that is often taken to be the model considered ahead of time, thus ignoring the uncertainty inherent in the search for a final model. One method that has enjoyed a long history of theoretical developments and substantive applications, and that accounts directly for uncertainty in model selection, is Bayesian model averaging (BMA). BMA addresses the problem of model selection by not selecting a final model, but rather by averaging over a space of possible models that could have generated the data. The purpose of this paper is to provide a detailed and up-to-date review of BMA with a focus on its foundations in Bayesian decision theory and Bayesian predictive modeling. We consider the selection of parameter and model priors as well as methods for evaluating predictions based on BMA. We also consider important assumptions regarding BMA and extensions of model averaging methods to address these assumptions, particularly the method of Bayesian stacking. Simple empirical examples are provided and directions for future research relevant to psychometrics are discussed.
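The core averaging step can be illustrated concretely. The sketch below uses made-up data and the BIC approximation to posterior model probabilities (one common shortcut, not necessarily the priors discussed in the paper) to average predictions over a small space of linear models:

```python
import numpy as np
from itertools import combinations

# Hypothetical data: outcome y truly depends on predictors 0 and 1 only.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

def fit_ols(Xm, y):
    """OLS fit with intercept; returns coefficients and residual sum of squares."""
    Xd = np.column_stack([np.ones(len(y)), Xm])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta, np.sum((y - Xd @ beta) ** 2)

def bic(rss, n, k):
    """Gaussian BIC up to an additive constant: n*log(rss/n) + k*log(n)."""
    return n * np.log(rss / n) + k * np.log(n)

# Model space: every nonempty subset of the three predictors.
models = [c for r in range(1, 4) for c in combinations(range(3), r)]

bics, preds = [], []
for cols in models:
    beta, rss = fit_ols(X[:, cols], y)
    bics.append(bic(rss, n, len(cols) + 1))
    preds.append(np.column_stack([np.ones(n), X[:, cols]]) @ beta)

# Posterior model weights via the BIC approximation: w_m ∝ exp(-BIC_m / 2).
b = np.array(bics)
w = np.exp(-(b - b.min()) / 2)
w /= w.sum()

# BMA prediction: weighted average of the model-specific predictions.
y_bma = np.sum(w[:, None] * np.array(preds), axis=0)
print(w.round(3))
```

The weights concentrate on the models containing the two truly active predictors, and the averaged prediction reflects the remaining model uncertainty rather than a single selected model.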
Under a formula-score instruction (FSI), test takers omit items. If students are instead encouraged to answer every item (under a rights-only scoring instruction, ROI), the score distribution will differ. In this paper, we formulate a simple statistical model to predict the ROI score distribution from FSI data. The estimation error is also provided. In addition, a preliminary investigation of the probability of guessing correctly on omitted items, and of its sensitivity, is presented. Based on the data used in this paper, the probability of guessing correctly may be close to or slightly greater than the chance score.
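The kind of prediction described above might be sketched as follows; the symbols (rights, omits, guessing probability g) and the binomial error band are illustrative assumptions, not the paper's actual model:

```python
# Hypothetical sketch: predict a rights-only (ROI) score from formula-score
# (FSI) data, assuming each omitted item would be answered correctly with
# probability g (e.g. 1/k for k answer options).

def predicted_roi_score(rights, omits, g=0.25):
    """Expected rights-only score if omitted items were guessed at rate g."""
    expected = rights + g * omits
    # Binomial variance of the guessing contribution gives an error band.
    variance = omits * g * (1 - g)
    return expected, variance

exp_score, var = predicted_roi_score(rights=30, omits=8, g=0.25)
print(exp_score, var)  # 32.0 1.5
```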
Consumers typically overstate their intentions to purchase products relative to their actual purchase rates, a pattern called “hypothetical bias”. In laboratory choice experiments, we measure participants’ visual attention using mouse-tracking or eye-tracking while they make hypothetical as well as real purchase decisions. We find that participants spent more time looking at both the price and the product image prior to making a real “buy” decision than prior to making a real “don’t buy” decision. We demonstrate that including such information about visual attention improves the prediction of real buy decisions. This improvement is evident, although small in magnitude, using mouse-tracking data, but is not evident using eye-tracking data.
Functional impairment is a major concern among those presenting to youth mental health services and can have a profound impact on long-term outcomes. Early recognition and prevention for those at risk of functional impairment is essential to guide effective youth mental health care. Yet, identifying those at risk is challenging and impacts the appropriate allocation of indicated prevention and early intervention strategies.
Methods
We developed a prognostic model to predict a young person’s social and occupational functional impairment trajectory over 3 months. The sample included 718 young people (12–25 years) engaged in youth mental health care. A Bayesian random effects model was designed using demographic and clinical factors and model performance was evaluated on held-out test data via 5-fold cross-validation.
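The held-out evaluation described above rests on k-fold splitting. A minimal sketch of the fold assignment (Python, made-up shuffling, not the authors' code):

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    """Shuffle indices and split them into k roughly equal held-out folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    return np.array_split(idx, k)

# 718 matches the sample size described above.
folds = kfold_indices(718, k=5)
sizes = [len(f) for f in folds]
print(sizes)  # [144, 144, 144, 143, 143]
```

Each fold serves once as the test set while the model is trained on the remaining four, so every observation contributes exactly one held-out prediction.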
Results
Eight factors were identified as the optimal set for prediction: employment, education, or training status; self-harm; psychotic-like experiences; physical health comorbidity; childhood-onset syndrome; illness type; clinical stage; and circadian disturbances. The model had an acceptable area under the curve (AUC) of 0.70 (95% CI, 0.56–0.81) overall, indicating its utility for predicting functional impairment over 3 months. For those with good baseline functioning, it showed excellent performance (AUC = 0.80, 0.67–0.79) for identifying individuals at risk of deterioration.
Conclusions
We developed and validated a prognostic model for youth mental health services to predict functional impairment trajectories over a 3-month period. This model serves as a foundation for further tool development and demonstrates its potential to guide indicated prevention and early intervention for enhancing functional outcomes or preventing functional decline.
Prediction is a crucial mechanism of language comprehension. Our research question asked whether learners of Spanish were capable of using word order cues to predict the semantic class of the upcoming verb, and how this ability develops with proficiency. To answer this question, we conducted a self-paced reading study with three L2 Spanish groups at different proficiency levels and one native control group. Among the advanced L2 learners and native speakers, we found that reading times increased after the verb appeared in a word order not strongly associated with its semantic class. Because the only cue to the sentences’ word order was the presence or absence of the object marker a before the first noun, we suggest that these groups use this morphosyntactic cue to anticipate the semantic class of the upcoming verb. However, this pattern of processing behavior was not detected in our less experienced L2 groups.
This study investigated the predictive use of dative verb constraints in Mandarin among home-country-raised native speakers and classroom learners (including both sequential L2 learners and heritage speakers). In a visual world eye-tracking experiment, participants made anticipatory looks to the upcoming argument (recipient versus theme) following categorical restrictions of non-alternating verbs and gradient bias of alternating verbs before the acoustic onset of the disambiguating noun. Crucially, no delay or reduction in the prediction effects was observed among L2 learners and heritage speakers in comparison with home-country-raised native speakers. Mandarin proficiency and dominant language (English versus other) did not modulate prediction effects among classroom learners. These findings provide direct support for the assumption of error-driven learning accounts of the dative alternation, that is, language users actively predict upcoming arguments based on verb information during real-time sentence processing.
We tested whether verb-based prediction in late bilinguals is facilitated when the verb is a cognate versus non-cognate. Spanish–English bilinguals and Chinese–English bilinguals (control) listened to English sentences such as “The girl will adopt the dog” while viewing a scene containing either a dog and unadoptable objects (predictable condition) or a dog and other adoptable animals (unpredictable condition). The verb was either a cognate or non-cognate between Spanish and English and never a cognate between Chinese and English. Both groups of bilinguals were more likely to look at the target (the dog) in the predictable versus unpredictable condition. However, only low-proficient L1 Spanish bilinguals showed greater and earlier prediction when the verb was cognate than when it was non-cognate, suggesting that cognate facilitation effect occurs not only on the cognate word itself but also on prediction based on this cognate word, and that this effect is modulated by L2 proficiency.
This chapter uses a range of quotes and findings from the internet and the literature. The key premises of the chapter, illustrated with examples, are as follows. First, Big Data requires the use of algorithms. Second, algorithms can create misleading information. Third, algorithms can lead to destructive outcomes. But we should not forget that humans program algorithms. With Big Data come algorithms to run numerous, involved computations. We cannot oversee all these data ourselves, so we need the help of algorithms to make computations for us. We might label these algorithms as Artificial Intelligence, but this might suggest that they can do things on their own. They can run massive computations, but they need to be fed with data. And this feeding is usually done by us, by humans, and we also choose the algorithms to be used.
Transition to psychosis rates within ultra-high risk (UHR) services have been declining. It may be possible to ‘enrich’ UHR cohorts based on the environmental characteristics seen more commonly in first-episode psychosis cohorts. This study aimed to determine whether transition rates varied according to the accumulated exposure to environmental risk factors at the individual (migrant status, asylum seeker/refugee status, indigenous population, cannabis/methamphetamine use), family (family history or parental separation), and neighborhood (population density, social deprivation, and fragmentation) level.
Methods
The study included UHR people aged 15–24 who attended the PACE clinic from 2012 to 2016. Cox proportional hazards models (frequentist and Bayesian) were used to assess the association between individual and accumulated factors and transition to psychosis. UHR status and transition was determined using the CAARMS. Benjamini–Hochberg was used to correct for multiple comparisons in frequentist analyses.
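The Benjamini–Hochberg correction mentioned above is a simple step-up procedure. A minimal sketch with hypothetical p-values (not the study's actual values):

```python
import numpy as np

def benjamini_hochberg(pvals):
    """BH-adjusted p-values: the step-up procedure controlling the FDR."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # Each sorted p-value is scaled by m / rank ...
    ranked = p[order] * m / np.arange(1, m + 1)
    # ... then monotonicity is enforced from the largest p-value downward.
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adj, 0, 1)
    return out

# Hypothetical raw p-values for several environmental risk factors.
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.2]).round(4))
# [0.004 0.04  0.04  0.2  ]
```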
Results
Of the 461 young people included, 55.5% were female, the median follow-up was 307 days (IQR: 188–557), and 17.6% (n = 81) transitioned to a psychotic disorder. The proportion who transitioned increased incrementally with the number of individual-level risk factors present (HR = 1.51, 95% CI 1.19–1.93, p < 0.001, pcorr = 0.01). The number of family- and neighborhood-level exposures did not increase transition risk (p > 0.05). Cannabis use was the only specific risk factor significantly associated with transition (HR = 1.89, 95% CI 1.22–2.93, pcorr = 0.03, BF = 6.74).
Conclusions
There is a dose–response relationship between exposure to individual-level psychosis-related environmental risk factors and transition risk in UHR patients. If replicated, this could be incorporated into a novel approach to identifying the highest-risk individuals within clinical services.
Involuntary admissions to psychiatric hospitals are on the rise. If patients at elevated risk of involuntary admission could be identified, prevention may be possible. Our aim was to develop and validate a prediction model for involuntary admission of patients receiving care within a psychiatric service system using machine learning trained on routine clinical data from electronic health records (EHRs).
Methods
EHR data from all adult patients who had been in contact with the Psychiatric Services of the Central Denmark Region between 2013 and 2021 were retrieved. We derived 694 patient predictors (covering e.g. diagnoses, medication, and coercive measures) and 1134 predictors from free text using term frequency-inverse document frequency and sentence transformers. At every voluntary inpatient discharge (prediction time), without an involuntary admission in the 2 years prior, we predicted involuntary admission 180 days ahead. XGBoost and elastic net models were trained on 85% of the dataset. The models with the highest area under the receiver operating characteristic curve (AUROC) were tested on the remaining 15% of the data.
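Of the two free-text feature types mentioned, term frequency–inverse document frequency is easy to illustrate. A toy sketch on made-up clinical-note snippets (not the authors' pipeline, which also used sentence transformers and XGBoost/elastic net models):

```python
import math
from collections import Counter

def tfidf(docs):
    """Toy TF-IDF: term frequency times log of inverse document frequency."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks)
        vectors.append({t: (c / total) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

# Made-up note snippets; a term present in every document gets weight 0.
notes = ["patient admitted voluntarily", "patient discharged home"]
vecs = tfidf(notes)
print(vecs[0])
```

Terms that appear in every note (here, "patient") carry no discriminating weight, while rarer terms are up-weighted — which is why such features can separate discharges that precede an involuntary admission from those that do not.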
Results
The models were trained on 50 634 voluntary inpatient discharges among 17 968 patients. The cohort comprised 1672 voluntary inpatient discharges followed by an involuntary admission. The best XGBoost and elastic net models from the training phase obtained AUROCs of 0.84 and 0.83, respectively, in the test phase.
Conclusion
A machine learning model using routine clinical EHR data can accurately predict involuntary admission. If implemented as a clinical decision support tool, this model may guide interventions aimed at reducing the risk of involuntary admission.
Humanity’s situation with climate change is sometimes compared to that of a frog in a slowly boiling pot of water. Most of our climate science takes the form of prediction: telling the frog that in five minutes’ time he will be a little bit warmer. We need more risk assessment: telling the frog that the worst that could happen is he could boil to death, and that this is becoming increasingly likely over time. This approach can give a much clearer picture of the risks of climate change to human health, food security, and coastal cities.
Taking a simplified approach to statistics, this textbook teaches students the skills required to conduct and understand quantitative research. It provides basic mathematical instruction without compromising on analytical rigor, covering the essentials of research design; descriptive statistics; data visualization; and statistical tests including t-tests, chi-squares, ANOVAs, Wilcoxon tests, OLS regression, and logistic regression. Step-by-step instructions with screenshots are used to help students master the use of the freely accessible software R Commander. Ancillary resources include a solutions manual and figure files for instructors, and datasets and further guidance on using STATA and SPSS for students. Packed with examples and drawing on real-world data, this is an invaluable textbook for both undergraduate and graduate students in public administration and political science.
The aspirations-ability framework proposed by Carling has begun to place the question of who aspires to migrate at the center of migration research. In this article, building on key determinants assumed to impact individual migration decisions, we investigate their prediction accuracy when observed in the same dataset and in different mixed-migration contexts. In particular, we use a rigorous model selection approach and develop a machine learning algorithm to analyze two original cross-sectional face-to-face surveys conducted in Turkey and Lebanon among Syrian migrants and their respective host populations in early 2021. Studying similar nationalities in two hosting contexts with a distinct history of both immigration and emigration and large shares of assumed-to-be mobile populations, we illustrate that a) (im)mobility aspirations are hard to predict even under ‘ideal’ methodological circumstances, b) commonly referenced “migration drivers” fail to perform well in predicting migration aspirations in our study contexts, while c) aspects relating to social cohesion, political representation and hope play an important role that warrants more emphasis in future research and policymaking. Methodologically, we identify key challenges in quantitative research on predicting migration aspirations and propose a novel modeling approach to address these challenges.
Neural predictors underlying variability in depression outcomes are poorly understood. Functional MRI measures of subgenual cortex connectivity, self-blaming and negative perceptual biases have shown prognostic potential in treatment-naïve, medication-free and fully remitting forms of major depressive disorder (MDD). However, their role in more chronic, difficult-to-treat forms of MDD is unknown.
Methods:
Forty-five participants (n = 38 meeting minimum data quality thresholds) fulfilled criteria for difficult-to-treat MDD. Clinical outcome was determined by computing percentage change at follow-up from baseline (four months) on the self-reported Quick Inventory of Depressive Symptomatology (16-item). Baseline measures included self-blame-selective connectivity of the right superior anterior temporal lobe with an a priori Brodmann Area 25 region-of-interest, blood-oxygen-level-dependent a priori bilateral amygdala activation for subliminal sad vs happy faces, and resting-state connectivity of the subgenual cortex with an a priori defined ventrolateral prefrontal cortex/insula region-of-interest.
Findings:
A linear regression model showed that baseline severity of depressive symptoms explained 3% of the variance in outcomes at follow-up (F[3,34] = .33, p = .81). In contrast, our three pre-registered neural measures combined, explained 32% of the variance in clinical outcomes (F[4,33] = 3.86, p = .01).
Conclusion:
These findings corroborate the pathophysiological relevance of neural signatures of emotional biases and their potential as predictors of outcomes in difficult-to-treat depression.
Broadening prediction efforts from imminent psychotic symptoms to neurodevelopmental vulnerabilities can enhance the accuracy of diagnosing severe mental disorders. Early interventions, especially during adolescence, are vital as these disorders often follow a long prodromal phase of neurodevelopmental disturbances. Child and adolescent mental health services should lead a developmentally-sensitive model for timely, effective detection and intervention.
Despite strong evidence of efficacy of electroconvulsive therapy (ECT) in the treatment of depression, no sensitive and specific predictors of ECT response have been identified. Previous meta-analyses have suggested some pre-treatment associations with response at a population level.
Aims
Using 10 years (2009–2018) of routinely collected Scottish data on people with moderate to severe depression (n = 2074) receiving ECT, we tested two hypotheses: (a) that there were significant group-level associations between post-ECT clinical outcomes and pre-ECT clinical variables and (b) that it was possible to develop a method for predicting illness remission for individual patients using machine learning.
Method
Data were analysed on a group level using descriptive statistics and association analyses as well as using individual patient prediction with machine learning methodologies, including cross-validation.
Results
ECT is highly effective for moderate to severe depression, with a response rate of 73% and a remission rate of 51%. ECT response is associated with older age, psychotic symptoms, the necessity for urgent intervention, severe distress, psychomotor retardation, previous good response, lack of medication resistance, and consent status. Remission has the same associations except the necessity for urgent intervention, with the additions of a history of recurrent depression and low suicide risk. It is possible to predict remission with ECT with an accuracy of 61%.
Conclusions
Pre-ECT clinical variables are associated with both response and remission and can help predict individual response to ECT. This predictive tool could inform shared decision-making, prevent the unnecessary use of ECT when it is unlikely to be beneficial and ensure prompt use of ECT when it is likely to be effective.
This chapter is devoted to extensive instruction regarding bivariate regression, also known as ordinary least squares (OLS) regression. Students are presented with a scatterplot of data with a best-fitting line drawn through it. They are instructed on how to calculate the equation of this line (the least squares line) by hand and with the R Commander. Interpretation of the statistical output of the y-intercept, beta coefficient, and R-squared value is discussed. Statistical significance of the beta coefficient and its implications for the relationship between an independent and dependent variable are described. Finally, the use of the regression equation for prediction is illustrated.
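A worked example of the by-hand calculation the chapter describes (shown here in Python rather than the R Commander; the data are made up):

```python
# Five made-up (x, y) observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: sum of cross-deviations over sum of squared x-deviations.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
     sum((x - mean_x) ** 2 for x in xs)
# Intercept: the least squares line passes through the point of means.
b0 = mean_y - b1 * mean_x

# R-squared: share of the variance in y explained by the line.
ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot

print(b1, b0, r2)  # 0.6 2.2 0.6
```

Prediction then means plugging a new x into the fitted equation: at x = 6, the predicted y is 2.2 + 0.6 × 6 = 5.8.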
The chapter begins with an applied example describing the limitations of bivariate regression and the need to include multiple independent variables in a regression model to explain the dependent variable. The logic of multivariate regression is discussed as it compares to bivariate regression. Running a multivariate regression in the R Commander and interpretation of the results are the main foci of the chapter, with particular attention to the beta coefficients, y-intercept, and adjusted R-squared. Generating the multivariate regression equation from the R Commander output is covered, along with engaging in prediction using this equation. Finally, interpretation of dummy independent variables in a multivariate regression is covered.
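The multivariate regression and dummy-variable interpretation the chapter covers can be sketched via the normal equations (Python with made-up data, rather than the R Commander; the variable names are illustrative):

```python
import numpy as np

# Made-up data: one continuous predictor and one dummy predictor.
rng = np.random.default_rng(1)
n = 100
income = rng.normal(50, 10, n)    # continuous independent variable
urban = rng.integers(0, 2, n)     # dummy variable (0 = rural, 1 = urban)
y = 3 + 0.5 * income + 2 * urban + rng.normal(0, 1, n)

# Design matrix with an intercept column; solve (X'X) beta = X'y.
X = np.column_stack([np.ones(n), income, urban])
beta = np.linalg.solve(X.T @ X, X.T @ y)

intercept, b_income, b_urban = beta
# b_urban estimates the shift in y for urban (dummy = 1) versus rural
# cases, holding income constant -- the usual dummy interpretation.
print(beta.round(2))
```

With these simulated data the estimates land near the true values (0.5 for income, 2 for the urban shift), and the dummy coefficient reads as a level difference between the two groups at any fixed income.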