Highlights
• Existing recurrent stroke risk prediction models do not target the population that survives to outpatient follow-up.
• We developed and validated risk prediction models for patients attending stroke prevention clinics, but the predictive accuracy of the models was modest.
• Additional risk factors and innovative approaches to individualized secondary prevention are needed.
Introduction
Over the past 15 years, there has been considerable effort to reduce short-term morbidity and mortality after stroke and to reduce stroke recurrence after transient ischemic attack (TIA). For those with TIA, the emphasis on the first 90 days began in earnest in 2004 when studies showed high 30- and 90-day stroke recurrence rates.Reference Gladstone, Kapral, Fang, Laupacis and Tu1,Reference Lioutas, Ivan and Himali2 With changes in process, expedited assessments and treatment, rates of recurrent stroke after TIA dropped significantly.Reference Lioutas, Ivan and Himali2 Similarly, organized systems of care,Reference Ganesh, Lindsay and Fang3,Reference Kapral, Fang and Silver4 and expanded eligibility for thrombolysis Reference Thomalla, Simonsen and Boutitie5 and endovascular thrombectomy,Reference Goyal, Menon and van Zwam6,Reference Nogueira, Jadhav and Haussen7 have helped to reduce mortality and improve 90-day outcomes after ischemic stroke. Given these changes, there are more people living with the effects of stroke than ever before.
Most of these advances have focused on the early, highest-risk period after stroke or TIA. However, the long-term risk of stroke recurrence remains high. Even when people survive the high-risk 90-day period after stroke and TIA without adverse events, they still have over seven times the risk of another stroke in the following year, and five times the risk after 3 and 5 years, compared to matched controls.Reference Edwards, Kapral, Fang and Swartz8 Globally, the 10-year risk of stroke recurrence after TIA or minor stroke is 20%.Reference Khan and Yogendrakumar9
It is well recognized that the more vascular risk factors are controlled, the lower the recurrence risk.Reference Dong, Rundek, Wright, Anwar, Elkind and Sacco10,Reference Lin, Ovbiagele, Markovic and Towfighi11 Yet, almost no stroke survivors achieve targets on all risk factors, and many have recurrent strokes despite good risk control.Reference Lin, Ovbiagele, Markovic and Towfighi11 It remains unclear how much of the long-term recurrence or mortality risk for people seen in outpatient stroke clinic settings can be attributed to identifiable factors. To date, many clinical prediction models exist for in-hospital,Reference Smith, Shobha and Dai12 7-day,Reference Ay, Arsava and Johnston13 30-dayReference ODonnell, Fang and D'Uva14,Reference Wang, Lim, Levi, Heller and Fischer15 and 3-month outcomesReference Weimar, Konig, Kraywinkel, Ziegler and Diener16 for stroke/TIA patients. In a systematic review of 66 prediction models for survival and functional outcome in ischemic stroke patients, outcome periods ranged from seven days to ten years, with most models examining outcomes in the first three months.Reference Fahey, Crayton, Wolfe and Douiri17 However, people seen in stroke prevention clinics are those who survive the initial high-risk period and typically have sufficient residual function to attend outpatient clinics. Few of the abovementioned prediction models account for this survival bias. Given the limited resources for specialty stroke prevention services, the aging global population and the increasing survival of people after stroke,18 there is a need to better predict longer-term stroke recurrence and mortality for the stroke prevention clinic population.
The primary objective of this study was to determine whether demographic, medical history and stroke event data reflecting the episode of care for the index stroke, and available in an outpatient setting, could be used to create a prediction model for 1-year stroke recurrence after accounting for the competing risk of death. Specifically, we aimed to identify factors associated with 1-year outcomes among patients who survive and remain event-free for 90 days after discharge from acute care for stroke, or for one day after discharge for TIA, to mimic when these patients would typically be seen in a secondary prevention clinic. These timings reflect national practice and guidelines for stroke management.Reference Gladstone, Lindsay and Douketis19
Methods
Study cohort
Our cohort was obtained from the Ontario Stroke Registry (OSR) held at ICES (formerly known as the Institute for Clinical Evaluative Sciences). The OSR was a province-wide registry developed for monitoring and reporting on the quality of stroke care that included a population-based sample of patients with stroke and TIA who were seen at any of the province’s 150 acute care institutions until 2013.Reference Kapral, Silver and Richards20 Stroke was determined by clinical presentation, confirmed by brain imaging and obtained through chart reviews performed by trained abstractors with clinical expertise.
We included all patients aged 18 or older who were discharged alive between April 1, 2002 and March 31, 2013 after a hospitalization or emergency room visit with a diagnosis of ischemic stroke or TIA. We limited our cohort to Ontario residents who were in hospital for under 90 days and were not discharged to a long-term or palliative care facility (Figure 1). Patients with missing information on rurality index, income and modified Rankin scale (mRS) were excluded because, in our prior experience, patients with missing information on these factors have higher rates of missingness in other variables in administrative datasets. Next, we selected the patients who had a recorded referral to a secondary prevention clinic at discharge, resulting in a sample of 26,907 patients (13,848 with ischemic stroke and 13,059 with TIA). Finally, to capture patients who survived long enough to attend an outpatient clinic, we conducted a landmark analysis, excluding TIA patients who had died or had a recurrent stroke within one day of discharge (n = 63) and ischemic stroke patients who had died or had a recurrent stroke within 90 days of discharge (n = 703). The landmark date (1 day after hospital discharge for TIA and 90 days for ischemic stroke) was used as the new index date (time zero) for the risk prediction models.
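The landmark step described above can be illustrated with a minimal sketch. It is not the authors' SAS code; the record layout, the field names and the choice to treat an event on the landmark day itself as falling within the window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Landmark windows in days after discharge, as stated in the text.
LANDMARK_DAYS = {"stroke": 90, "tia": 1}


@dataclass
class Patient:
    # Hypothetical record layout; times are days since hospital discharge.
    index_event: str                     # "stroke" or "tia"
    days_to_first_event: Optional[int]   # death or recurrent stroke; None if none observed


def apply_landmark(patients: List[Patient]) -> List[Tuple[Patient, Optional[int]]]:
    """Drop patients with an event inside the landmark window and
    re-express follow-up time relative to the landmark date (time zero)."""
    kept = []
    for p in patients:
        landmark = LANDMARK_DAYS[p.index_event]
        t = p.days_to_first_event
        # Assumption: an event on the landmark day counts as "within" the window.
        if t is not None and t <= landmark:
            continue
        new_time = None if t is None else t - landmark
        kept.append((p, new_time))
    return kept
```

Under these assumptions, an ischemic stroke patient with a recurrence 120 days after discharge would enter the model with an event time of 30 days, while one with a recurrence at day 60 would be excluded.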

Figure 1. Flowchart of final cohort creation from OSR and CIHI-DAD datasets. CIHI-DAD = Canadian Institute for Health Information Discharge Abstract Database. *Excluded patients who had died/recurrent stroke within 90 days of discharge. **Excluded patients who had died/recurrent stroke within one day of discharge.
To assess outcomes, we used the Canadian Institute for Health Information Discharge Abstract Database (DAD) to capture hospital admissions for ischemic stroke, and the Ontario Registered Persons Database (RPDB) to capture deaths. The DAD includes data from the discharge abstracts of all acute care hospitals in Ontario, including admission and discharge dates and diagnoses, and the RPDB provides basic demographic information and vital statistics about anyone who has ever been eligible for Ontario’s universal health insurance plan. These datasets were linked using unique encoded identifiers and analyzed at ICES.
Candidate variable selection
The OSR collected 505 variables on patients with stroke/TIA including sociodemographic characteristics, vascular risk factors, comorbidities, stroke type, severity, Charlson comorbidity index score, CHADS2 and CHA2DS2-VASc risk scores at admission, hospital care, complications and outcomes including length of stay, neurologic deficit and mRS at discharge, discharge medications and disposition.
To begin variable selection, we used clinician input to narrow the initial OSR variables to 136 clinically relevant variables, including those with a potential relationship to stroke recurrence and excluding those with known poor coding reliability. We then excluded variables that would not be available to a clinician assessing a patient’s risk in an outpatient clinic setting, that had high missingness (>10%) or that had low prevalence (<3%). This selection process resulted in 18 candidate variables for univariable analysis for all models.
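The missingness and prevalence thresholds above lend themselves to a short illustration. The sketch below assumes binary candidate variables stored as True/False with None for missing, and computes prevalence among non-missing values; both are assumptions, since the text does not specify how these quantities were calculated. The function and variable names are hypothetical.

```python
from typing import Dict, List, Optional


def screen_variables(
    records: List[Dict[str, Optional[bool]]],
    max_missing: float = 0.10,     # exclude if more than 10% missing
    min_prevalence: float = 0.03,  # exclude if present in fewer than 3%
) -> List[str]:
    """Return the names of variables that pass both screening thresholds."""
    n = len(records)
    names = sorted({name for record in records for name in record})
    kept = []
    for name in names:
        values = [record.get(name) for record in records]
        observed = [v for v in values if v is not None]
        missing_fraction = (n - len(observed)) / n
        if missing_fraction > max_missing:
            continue
        # Assumption: prevalence is computed among non-missing values only.
        if not observed or sum(observed) / len(observed) < min_prevalence:
            continue
        kept.append(name)
    return kept
```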
Statistical analysis
Model creation
The ischemic stroke and TIA cohorts were randomly divided into a derivation cohort (2/3 of the sample; n = 9,232 for ischemic stroke, 8,706 for TIA) and a validation cohort (1/3 of sample; n = 4,616 for stroke; 4,353 for TIA). Characteristics of the derivation and validation cohorts were reported prior to model development using means for age and proportions for categorical variables and compared using standardized mean differences, with values of <0.10 indicating negligible differences between the two cohorts.
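For readers less familiar with standardized mean differences, the following sketch shows the commonly used pooled-variance definitions for a proportion and for a continuous variable. The text does not state which formula was applied, so these definitions are an assumption, and the example values in the usage note are illustrative rather than taken from Table 1.

```python
import math


def smd_binary(p1: float, p2: float) -> float:
    """Standardized difference between two proportions."""
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return 0.0 if pooled_sd == 0 else abs(p1 - p2) / pooled_sd


def smd_continuous(mean1: float, sd1: float, mean2: float, sd2: float) -> float:
    """Standardized difference between two means."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return 0.0 if pooled_sd == 0 else abs(mean1 - mean2) / pooled_sd
```

For example, hypothetical proportions of 0.43 and 0.44 in the two cohorts give a standardized difference of about 0.02, well under the 0.10 threshold for a negligible difference.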
Our primary and secondary outcomes were stroke recurrence and all-cause mortality, respectively, within one year after the respective landmark dates. For univariate analysis, we examined the association between each candidate predictor and recurrent stroke using a Fine-Gray subdistribution hazard model, reporting subdistribution hazard ratios (sdHR), to account for the competing risk of death. For univariate analysis to predict risk of death at one year, we used a Cox proportional hazards model. For age, as a continuous variable, we tested linearity of association with outcomes using cubic spline analyses with five knots at percentiles 5, 25, 50, 75 and 95. We performed multivariable regression analyses with backward selection using a p-value of <0.10 for variable inclusion in the first model. Age, as a continuous variable, and sex were chosen a priori for inclusion, and other variables with a p-value <0.10 were included in the final models.
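To illustrate why death is handled as a competing event rather than as censoring, the sketch below computes a nonparametric cumulative incidence function (the Aalen-Johansen estimator). This is not the Fine-Gray regression model used in the study, which requires a regression library; it is a simpler, related quantity, and the event coding (0 = censored, 1 = recurrent stroke, 2 = death) is an assumption made for this example.

```python
from typing import List, Tuple


def cumulative_incidence(
    times: List[float], events: List[int], cause: int = 1
) -> List[Tuple[float, float]]:
    """Aalen-Johansen cumulative incidence for `cause`.

    events: 0 = censored, 1 = recurrent stroke, 2 = death (hypothetical coding).
    Returns (time, cumulative incidence) at each distinct time with any event.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        if d_any > 0:
            # Only those still event-free can experience the cause of interest.
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve
```

Because deaths reduce the event-free probability instead of being censored, the estimated incidence of recurrent stroke cannot exceed the proportion of patients who could actually experience it.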
Model evaluation
Multicollinearity was assessed by evaluating condition indices, and no instances of high correlations (condition index >10) between variables in the initial and final models were found. Model discrimination was evaluated using time-dependent area under the curve (AUC) for the competing risk stroke models and c-statistic for the mortality models. We assessed the performance of each model in the validation cohorts, applying the risk coefficients from the derivation models and determining the c-statistic or time-dependent AUC. We also constructed calibration plots to compare the outcome rates by decile of predicted risk versus observed risk in the validation cohorts.
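The decile-based calibration comparison can be sketched as follows. This simplified version treats the outcome as a binary indicator at one year and ignores censoring and competing risks, which the study's own calibration plots may have handled differently; the function name and the equal-sized grouping by rank are assumptions.

```python
from typing import List, Tuple


def calibration_by_decile(
    predicted: List[float], observed: List[int], groups: int = 10
) -> List[Tuple[int, float, float]]:
    """Return (group number, mean predicted risk, observed event rate)
    for equal-sized groups ordered by predicted risk."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer patients than groups
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((g + 1, mean_predicted, observed_rate))
    return rows
```

A well-calibrated model would show mean predicted risk close to the observed rate in every group; plotting one against the other gives a calibration plot of the kind described above.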
All analyses were conducted using SAS version 9.4 (SAS Institute, Cary, NC). The use of data in this project was authorized under section 45 of Ontario’s Personal Health Information Protection Act, which does not require review by a Research Ethics Board. We followed the TRIPOD guidelineReference Collins, Reitsma, Altman and Moons21 for reporting our study (Supplemental Table S1).
Results
A full summary of baseline characteristics of the derivation and validation cohorts is available in Table 1. Univariate analyses between each candidate predictor and outcome variable in derivation cohorts are reported in Supplemental Tables S2 and S3.
Table 1. Baseline characteristics of the derivation and validation cohorts: candidate predictor variables

Data presented as n (%) unless otherwise indicated. Standardized differences were <0.10 for comparisons of all variables between derivation and validation cohorts (data not shown). *Includes myocardial infarction, angina, percutaneous coronary intervention or coronary artery bypass grafting. **Includes Alzheimer’s disease, chronic confusion or senility. CAD = coronary artery disease; COPD = chronic obstructive pulmonary disease; HF = heart failure; TIA = transient ischemic attack; SD = standard deviation.
Among patients with an ischemic stroke (mean age 69 years; 43% female), 238 (2.7%) had a recurrent stroke within one year of the landmark date. Of the 18 candidate variables, seven entered the final model. Age, prior stroke/TIA, mRS 3–5 and diabetes were associated with an increased risk of recurrent stroke. Comorbid hypertension was associated with a reduced risk of recurrent stroke. The time-dependent AUC was 0.62 (0.59–0.66) in the derivation cohort and 0.59 (0.54–0.64) in the validation cohort (Table 2, Figure 2A).
Table 2. Predictors of recurrent stroke among ischemic stroke patients in the final model, derivation cohort

AUC = area under the curve; CI = confidence interval; sdHR = subdistribution hazard ratio; TIA = transient ischemic attack.

Figure 2. Calibration figures for risk prediction models in validation cohorts.
Among patients with a TIA (mean age 70.0 years; 49% female), 298 recurrent strokes were observed within one year (3.44%). Nine of the 18 candidate variables entered the final model. Age, hypertension, prior stroke/TIA, CHF or pulmonary edema, smoking and discharge location (acute other) were all associated with an increased risk of recurrent stroke. Cancer and valvular heart disease were associated with a decreased risk of recurrent stroke. The time-dependent AUC was 0.67 (0.65–0.70) in the derivation cohort and 0.59 (0.55–0.64) in the validation cohort (Table 3, Figure 2B).
Table 3. Predictors of recurrent stroke among TIA patients in the final model, derivation cohort

AUC = area under the curve; CI = confidence interval; HF = heart failure; sdHR = subdistribution hazard ratio, TIA = transient ischemic attack.
Models for mortality in both the TIA and stroke cohorts had greater discrimination than the models for stroke recurrence (Supplemental Table S4). The c-statistics of the derivation and validation models ranged from 0.74 to 0.78 in the stroke and TIA cohorts (Supplemental Table S5; Figures 2C and 2D).
Discussion
Most outcome prediction models for stroke/TIA were developed to identify outcomes after discharge; however, these models may not apply to the population of people seen in stroke prevention clinics. One-third of people discharged alive will die, be institutionalized, or have a recurrent stroke or myocardial infarction within 1 day of discharge for TIA or 90 days of discharge for stroke.Reference Edwards, Kapral, Fang and Swartz8 Thus, most predictive models are weighted to outcomes that occur before clinic visits, limiting their utility in an outpatient clinical setting. By imposing a landmark period – including only those who survived to typical stroke prevention clinic visit windows without events – the current study sought to provide better models to guide clinicians in stroke clinics in identifying the highest-risk individuals.
In the stroke cohort, the model predicting stroke recurrence had relatively few variables: age, sex, mRS, atrial fibrillation, diabetes, hypertension and previous history of stroke/TIA. Although most of these predictors have been included in previous models, some do not appear consistently. For instance, age has been found to be unrelated to, or of low importance in predicting, the risk of recurrent stroke in a number of previous studies, but was important in our models.Reference Elhefnawy, Sheikh Ghadzi and Albitar22,Reference Elneihoum, Goransson, Falke and Janzon23 In the TIA cohort, age, prior history of stroke/TIA, CHF, smoking, cancer, hypertension, valvular heart disease and discharge to a facility other than rehabilitation were predictors of recurrent stroke. Similarly, some of these variables feature as significant predictors in other studies exploring the risk of stroke after TIA, but not consistently.Reference Lemmens, Smet and Thijs24–Reference Purroy, Jimenez Caballero and Gorospe26
The association between hypertension and the decreased risk of recurrent strokes in the ischemic stroke cohort may be unexpected, but information on the relationship between hypertension and recurrent stroke in the literature is contradictory.Reference Rundek, Sacco, Grotta, Albers and Broderick27 It is possible that there is greater adherence to hypertension treatment when the diagnosis is a stroke rather than a TIA. This is reflected in our TIA cohort findings, where hypertension was positively associated with a risk of stroke one year post TIA. Unfortunately, data concerning the effectiveness of treatment for known hypertension or treatment adherence were not available for this analysis.
The negative association between stroke outcome and history of cancer in the TIA cohort is contrary to the literature which reports an increased risk of recurrent stroke in patients with cancer.Reference Lau, Wong and Teo28–Reference Navi and Iadecola30 However, prior studies suggest that this increased stroke risk may be associated with more recent cancer diagnoses (i.e., within the first 6–12 months)Reference Wei, Chen and Wu31 and our dataset only specifies “history of cancer” without information on diagnosis dates or cancer type. Though the exact associations between cancer and stroke recurrence require further investigation, our results reinforce the complexity of this relationship.Reference Dardiotis, Aloizou and Markoula32 The unexpected protective effect of the baseline cancer and valvular heart disease on the risk of recurrent stroke in the TIA cohort might also be explained by unmeasured confounding.
The unreliability of sex as a significant predictor variable was expected given the competing evidence for and against its impact on stroke recurrence and mortality.Reference Phan, Blizzard and Reeves33,Reference Rexrode, Madsen, Yu, Carcel, Lichtman and Miller34 We did not find sex to be a predictor of stroke recurrence or mortality. There is some evidence that women have poorer medical management at presentation, which may act as an additional confounder in the relationship between sex and stroke recurrence and/or mortality.Reference Rexrode, Madsen, Yu, Carcel, Lichtman and Miller34 In the TIA cohort, there were no sex effects for the stroke recurrence model. However, women had a reduced mortality risk compared to men. Women presenting with TIA are more likely to have atypical symptoms and less likely to have diffusion-weighted MR imaging changes showing a minor stroke,Reference Coutts, Moreau and Asdaghi35 so women may be at a lower recurrence risk.
We conducted a targeted search of past prediction models but found no model that predicted stroke recurrence or death among stroke or TIA patients treated in stroke prevention clinics or surviving the early high-risk periods. Published c-statistics from models of in-hospital or discharged cohorts show performance similar to that of our models, with modest predictive accuracy, which limits clinical utility (Supplemental Table S6). Historically, clinical prediction models have had more success in predicting death than stroke recurrence.Reference Fahey, Crayton, Wolfe and Douiri17,Reference Purroy, Jimenez Caballero and Gorospe26,Reference Saposnik, Kapral and Liu36 Those findings are consistent with our results, with models for all-cause mortality in the stroke and TIA cohorts having greater c-statistics than the models for stroke recurrence.
Due to limitations of data availability, the current models include stroke and TIA populations between 2002 and 2013 in Ontario and do not capture any changes in population characteristics, patient management strategies and outcomes that might have occurred thereafter; hence, they may differ meaningfully from contemporary practice. In addition, study data did not include information on race and ethnicity, post-discharge changes in adherence to medications, rehabilitation care, lifestyle modifications and other cardiovascular health metrics such as control of blood pressure, blood glucose and total cholesterol, which could be available in outpatient settings and could improve the predictive ability of the models. Lastly, our aim was to predict outcomes among patients who were referred to a secondary prevention clinic after TIA/stroke. While the results are not intended to be generalizable to those who were not referred, it has previously been shown that, compared with people referred to stroke prevention clinics, those who are not referred tend to be older, are more likely to have dementia and to live in long-term care or a rural residence, are more often treated for the index event in a hospital without an on-site stroke prevention clinic, and have a higher risk of 1-year mortality but no difference in risk of recurrent stroke or TIA.Reference Kapral, Hall and Fang37 We also could not assess whether patients in fact attended the clinic after referral, or which factors may have affected referral decisions or access.
Previous prediction models have emphasized early stroke recurrence within the highest-risk period in the first 90 days post stroke or TIA,Reference Lemmens, Smet and Thijs24,Reference Gupta, Farrell and Mittal25 unlike the present study, where the emphasis is on patients who survived the first 90 days post ischemic stroke, or one day post TIA, without a recurrent stroke. Thus, the small number of predictor variables in the final models, the unexpected findings (e.g., hypertension and cancer), and low c-statistics in both cohorts reflect a broader theme in the stroke literature: after the early high-risk period, the risk of longer-term ischemic stroke recurrence is difficult to predict.Reference Rundek, Sacco, Grotta, Albers and Broderick27 Yet, this risk is 7-fold higher than in age-matched controls at one year and remains five times higher even five years after an event.Reference Edwards, Kapral, Fang and Swartz8 The current prediction models underscore the critical gaps that exist in our understanding of risk factors for stroke recurrence. Without knowledge of additional factors that increase the risk of recurrent stroke, the ability to mitigate this risk is limited. Measures of target attainment and adherence to risk reduction targets may help to fill this gap, and novel, modifiable risk factors may be needed to identify new targets. The laudable advancements in the acute treatment of stroke in recent years bring with them an increasing challenge: keeping those who have survived stroke and TIA, often with relatively little or no disability, free of recurrent stroke in the long term. Work to achieve and sustain long-term vascular risk reduction and to identify novel predictors of risk, together with innovative approaches to long-term individualized management, is needed for improved secondary stroke prevention.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/cjn.2025.10405.
Acknowledgments
This study was supported by ICES, which is funded by an annual grant from the Ontario Ministry of Health (MOH) and the Ministry of Long-Term Care (MLTC). This document used data adapted from the Statistics Canada Postal CodeOM Conversion File, which is based on data licensed from Canada Post Corporation, and/or data adapted from the Ontario Ministry of Health Postal Code Conversion File, which contains data copied under license from ©Canada Post Corporation and Statistics Canada. Parts of this material are based on data and/or information compiled and provided by the Canadian Institute for Health Information. The analyses, conclusions, opinions and statements expressed herein are solely those of the authors and do not reflect those of the funding or data sources; no endorsement is intended or should be inferred.
Author contributions
SA and LAM: conception, interpretation, drafting and final approval; CA, YB and FJ: analysis, interpretation, drafting and final approval; BSE, KMK and AL: interpretation, drafting and final approval; APC: analysis and interpretation, drafting and final approval; SRH: conception, acquisition, analysis, interpretation, drafting and final approval. All authors reviewed and edited the manuscript and approved its final version.
Funding statement
This study was supported by CIHR (Grant Ref No. 137038) and HSF (Project No. 000392). RHS receives salary support from an Ontario Clinician Scientist (Phase II) Award from the HSF Canada. In-kind support was provided by the Ontario Brain Institute. Dr Moira Kapral holds the Lillian Love Chair in Women’s Health, University Health Network/University of Toronto.
Competing interests
RHS reports stock ownership of Follow MD, and has contributed to an advisory board for Roche. All other authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.




