Depressive disorder is one of the most prevalent mental disorders, affecting over 300 million people.1 The prevalence of depressive disorder is positively correlated with age, placing middle-aged and older adults at a higher risk.Reference Vos, Lim, Abbafati, Abbas, Abbasi and Abbasifard2 If not appropriately treated, severe cases of depression can lead to intentional self-harm (ISH) behaviours or suicide attempts, which heightens the importance of the management of depression.
ISH behaviour, which encompasses all deliberate actions of self-poisoning (e.g. intentional drug overdoses) or self-inflicted injury (e.g. self-cutting), irrespective of the level of suicidal intent or other underlying motivations, is directly associated with a significant risk of future suicide.Reference Hawton, Harriss, Hall, Simkin, Bale and Bond3,Reference Duarte, Paulino, Almeida, Gomes, Santos and Gouveia-Pereira4 Individuals with records of more severe ISH behaviours are likely to be at a higher risk of suicide attempts, making hospitalisation due to intentional self-harm (HISH) a key indicator in suicide and mental health surveillance systems.Reference Mahinpey, Pollock, Liu, Contreras and Thompson5
As suicide prevention has become a major public health priority worldwide, several studies have investigated the impact of HISH. According to previous studies, HISH cost a total of £162 million in the UK in 2017 and incurred direct costs of 60 million USD in Denmark, and these costs are increasing. Furthermore, inpatient care accounts for a large portion of these expenses.Reference Tsiachristas, McDaid, Casey, Brand, Leal and Park6,Reference Dyvesether, Hastrup, Hawton, Nordentoft and Erlangsen7 Taken together, the public health impact and use of medical resources highlight the need for research on ISH and its prevention among patients with depression.
Although the mechanism of ISH in patients with depression is still unclear, it has commonly been assumed that gene-environment interactions (GxEs) play a vital role in developing the condition.Reference Althoff, Hudziak, Willemsen, Hudziak, Bartels and Boomsma8 In consideration of this, non-pharmacological interventions, which might affect environmental factors, have begun to be widely used, along with the concern about the effectiveness of antidepressants.Reference Farah, Alsawas, Mainou, Alahdab, Farah and Ahmed9
Physical activity, which refers to any bodily movement produced by skeletal muscles that requires energy expenditure, is one of the most commonly implemented treatment options for depression.Reference López-Torres Hidalgo, Aguilar Salmerón, Boix Gras, Campos Rosa, Escobar Rabadán and Escolano Vizcaíno10,Reference Fiona, Salih, Stuart, Katja, Matthew and Greet11 A considerable amount of literature has reported the effectiveness of physical activity in reducing depressive symptoms, including suicidal ideation, although this effect varies by genotype.Reference Dotson, Hsu, Langaee, McDonough, King and Cohen12–Reference Vancampfort, Hallgren, Firth, Rosenbaum, Schuch and Mugisha14
Physical activity may be classified into four main categories according to the absolute intensity, or rate of energy expenditure: sedentary, light, moderate and vigorous activity.Reference Piercy, Troiano, Ballard, Carlson, Fulton and Galuska15 Despite its wide clinical use, there are still no clear recommendations on what kind of physical activity should be done to prevent serious consequences, such as HISH, along with optimal doses for patients with depression. For instance, current physical activity guidelines from the World Health Organization (WHO) and the United States Department of Health and Human Services (HHS) predominantly emphasise moderate-to-vigorous physical activity (MVPA) for the general population, overlooking emerging concepts such as light intensity physical activity (LIPA) and genotype-specific effects of physical activity.Reference Fiona, Salih, Stuart, Katja, Matthew and Greet11,Reference Piercy, Troiano, Ballard, Carlson, Fulton and Galuska15 Although these guidelines have mentioned the benefits of physical activity for mental well-being along with other conditions, a domain-specific approach to depression patients has not been made in spite of emerging evidence.Reference Teychenne, White, Richards, Schuch, Rosenbaum and Bennie16
To address this evidence gap, we devised a method called the bidirectional analytical model to investigate genotype-specific effectiveness (BAIGE). This multi-phase model, developed to examine the genotype-specific effects of a particular intervention in longitudinal data-sets, consists of three steps: genetic stratification, retrospective cohort analysis (phase 1) and prospective cohort analysis (phase 2). In phase 1, based on a retrospective study design, it is examined whether specific factors influence the occurrence of the outcome. In the second phase, participants in whom the outcome has not yet occurred are followed up prospectively, and the relative risk of the outcome is estimated when a particular intervention is implemented. By utilising the longitudinal and comprehensive data ranging from sociodemographic information to electronic health records (EHRs) and accelerometry data in this model, we examined the beneficial effect of physical activity on the risk of HISH in patients with depression phenotypes.
Method
UK Biobank
The UK Biobank project is a large prospective cohort data-set comprising 502 492 participants from England, Scotland and Wales. Since 2006, extensive data including sociodemographic variables, online questionnaires, lifestyle factors, physical activity data and genetic data have been collected from the participants who visited 22 assessment centres across the UK. The participants were aged 40–69 years at recruitment, and follow-up data until 2021 were included in this study. All the participants provided written informed consent, and the project is approved by the North West Multi-centre Research Ethics Committee (MREC) as a research tissue bank (RTB) (REC reference: 21/NW/0157, IRAS project ID: 299116). Our research was conducted under application number 86 585.
Phenotype and outcome
A comprehensive medical record of each participant, coded clinical events (e.g. consultations, diagnoses, procedures and laboratory tests), and registration records, including admission and discharge information, are linked to the UK Biobank project. In this study, primary healthcare data (category 3000), summary diagnoses (category 2002) and record-level patient data (category 2006) were used to define the depression cohort and outcome.
The cohort of patients with depression was constructed using the definition of a single probable episode of major depression, as previously reported and validated by Smith et al,Reference Smith, Nicholl, Cullen, Martin, Ul-Haq and Evans17 along with related hospital episodes of the disorder. The cohort comprised participants with any of the following ICD-10 diagnoses: F32 (depressive episode); F33 (recurrent depressive episode); F34 (persistent mood (affective) disorders); F38 (other mood (affective) disorders); and F39 (unspecified mood (affective) disorder); or participants who had been depressed/down for a whole week (field ID 4598) for at least 2 weeks (field ID 4609) and had ever seen a general practitioner (GP) or psychiatrist for nerves, anxiety or depression (field ID 2090 and 2100); or participants with anhedonia for a whole week (field ID 4631), which lasted for at least 2 weeks (field ID 5375), who had ever seen a GP or psychiatrist for nerves, anxiety or depression (field ID 2090 and 2100). Detailed procedures used to derive and operationalise the phenotypic definition have been provided by the UK Biobank.18 The primary outcome of the study was ISH, which was defined using hospital in-patient records with ICD-10 diagnoses X60–X84 (self-harm).
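The inclusion rule and the outcome definition above can be written as a small boolean check. The following is a minimal sketch, not the authors' code: the function and argument names are hypothetical, and it assumes the UK Biobank questionnaire fields have already been decoded into plain booleans and week counts.

```python
# ICD-10 categories that qualify a participant for the depression cohort.
DEPRESSION_PREFIXES = ("F32", "F33", "F34", "F38", "F39")


def has_depression_diagnosis(icd10_codes):
    """True if any recorded ICD-10 code falls under F32, F33, F34, F38 or F39."""
    return any(code.upper().startswith(DEPRESSION_PREFIXES) for code in icd10_codes)


def is_self_harm_code(code):
    """True for ICD-10 codes in the X60-X84 block (intentional self-harm)."""
    code = code.upper().replace(".", "")
    if len(code) < 3 or code[0] != "X" or not code[1:3].isdigit():
        return False
    return 60 <= int(code[1:3]) <= 84


def meets_probable_depression(icd10_codes, low_mood_week, low_mood_weeks,
                              anhedonia_week, anhedonia_weeks,
                              seen_gp_or_psychiatrist):
    """Apply the three alternative criteria; argument names are hypothetical."""
    by_diagnosis = has_depression_diagnosis(icd10_codes)
    by_low_mood = low_mood_week and low_mood_weeks >= 2 and seen_gp_or_psychiatrist
    by_anhedonia = anhedonia_week and anhedonia_weeks >= 2 and seen_gp_or_psychiatrist
    return bool(by_diagnosis or by_low_mood or by_anhedonia)


# Illustrative participant: no hospital diagnosis, but 3 weeks of low mood
# and a GP consultation, so the questionnaire criterion is met.
print(meets_probable_depression(["I10"], True, 3, False, 0, True))
```

Only one of the three criteria needs to hold, which is why the checks are combined with a logical OR.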
Genotyping and imputation
Genome-wide genotype data were collected on all participants using either the UK Biobank lung exome variant evaluation axiom or the UK Biobank axiom array and underwent quality control procedures performed by Wellcome Trust Centre for Human Genetics, University of Oxford, UK.Reference Clare, Colin, Desislava, Gavin, Lloyd and Kevin19 Genotype data were imputed based on the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. Imputation data of over 90 million variants were available through the UK Biobank under approval. Based on three published genome-wide association studies (GWAS) and meta-analyses, which were conducted on a large cohort of more than 500 000 individuals with depression, we selected 55 risk single nucleotide polymorphisms (SNPs) associated with depressive disorders and their symptoms. Detailed information on the selected risk SNPs is shown in Supplementary Table 2 available at https://doi.org/10.1192/bjo.2024.845.Reference Okbay, Baselmans, De Neve, Turley, Nivard and Fontana20–Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne and Abdellaoui22 Participants with any missing value in imputation data or a mismatch between genetic and reported sex were excluded from the analysis.
Physical activity
Two types of physical activity assessment were used in this study: subjective and objective measurement for phases 1 and 2, respectively. For phase 1, we used the metabolic equivalent task (MET) score based on the latest version of the International Physical Activity Questionnaire – Short Form (IPAQ-SF).Reference Craig, Marshall, Sjöström, Bauman, Booth and Ainsworth23 After having adjusted for activity type (walking, moderate and vigorous)-specific weight, the MET score for each subject was derived and used to classify populations into three categories. Data were processed and categorised following the guidelines published by the IPAQ Research Committee.24
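As an illustration of the activity-type weighting described above, the sketch below computes total MET-minutes per week from IPAQ-SF style responses. It assumes the MET weights of the IPAQ short-form scoring protocol (3.3 for walking, 4.0 for moderate and 8.0 for vigorous activity); the function name and input layout are hypothetical, and the protocol's data-cleaning rules and the categorical thresholds are not reproduced here.

```python
# MET weights per activity type, assumed from the IPAQ short-form scoring protocol.
MET_WEIGHTS = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


def met_minutes_per_week(days, minutes):
    """Sum of weight x days per week x minutes per day over the activity types.

    `days` and `minutes` are dicts keyed by activity type (hypothetical layout).
    """
    total = 0.0
    for activity, weight in MET_WEIGHTS.items():
        total += weight * days.get(activity, 0) * minutes.get(activity, 0)
    return total


# Illustrative respondent: 5 days of 30 min walking, 2 days of 40 min moderate
# activity and 1 day of 20 min vigorous activity.
example_days = {"walking": 5, "moderate": 2, "vigorous": 1}
example_minutes = {"walking": 30, "moderate": 40, "vigorous": 20}
print(met_minutes_per_week(example_days, example_minutes))  # 495 + 320 + 160 = 975.0
```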
We used derived accelerometry data from the UK Biobank for an objective physical activity assessment. Between February 2013 and December 2015, email invitations were sent to UK Biobank participants with valid email addresses. Over 100 000 agreed participants received an AX3 wrist-worn triaxial logging accelerometer and were instructed to wear the device continuously for a whole week to track their entire physical activity. From measured raw acceleration data at 100 Hz with a dynamic range of ±8 g, physical activity data were extracted after calibration procedures. Physical activity data were classified into four types: sleep, sedentary behaviour, LIPA and MVPA using a random forest model composed of 100 decision trees trained on CAPTURE-24 data. CAPTURE-24 is a data-set recording participants’ physical activity over 24 h using an AX3 accelerometer and a wearable camera, with expert annotations based on camera images and time-use diaries. The fine-grained labels in CAPTURE-24 were mapped into four types of physical activity: sleep, sedentary behaviour (e.g. office/computer work, riding in a bus or train), LIPA (e.g. household activities, walking at work) and MVPA (e.g. bicycling, running).Reference Rosemary, Shing, Karl, Rema, Mark and Kazem25
Medication use
Medication use data were derived from the prescription medication data from the UK Biobank (field ID 20003). Information on medication use was obtained by trained health professionals in the verbal interview stage at an assessment centre of the UK Biobank. In this interview, participants reported the regular prescription medications they were taking. Short-term medications (e.g. a 1-week course of antibiotics for an infection) or medications that were prescribed but not taken were not included in the data collection. We utilised the pre-existing list of antidepressants, anxiolytics, antipsychotics and mood stabilizers from the previous literature, and included the use of psychotropic medications in the analysis.Reference Skelton, Rayner, Purves, Coleman, Gaspar and Glanville26
Statistical analyses
The BAIGE, which combines a genetic stratification module, a retrospective cohort study model and a prospective cohort study model, was built to assess the associations between physical activity and self-harm behaviour across the genotypes. The 16-year longitudinal follow-up period of the UK Biobank cohort was divided into two phases: from 2006 to May 2013 for phase 1 and from June 2013 to 2021 for phase 2. The scheme of the BAIGE is shown in Fig. 1.
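The split of the follow-up period into two phases can be sketched as a date rule. This is a hedged illustration only: the text gives the boundary as May/June 2013, whereas the exact first and last days used below (1 January 2006 and 31 December 2021) are assumptions, as are the function names.

```python
from datetime import date

PHASE1_START = date(2006, 1, 1)    # assumed first day of follow-up
PHASE2_START = date(2013, 6, 1)    # phase 1 ends in May 2013
STUDY_END = date(2021, 12, 31)     # assumed last day of follow-up


def phase_of(event_date):
    """Return 1 or 2 for the phase containing the date, or None if outside follow-up."""
    if PHASE1_START <= event_date < PHASE2_START:
        return 1
    if PHASE2_START <= event_date <= STUDY_END:
        return 2
    return None


def eligible_for_phase2(hish_dates):
    """Phase 2 follows only participants with no HISH record before June 2013."""
    return all(d >= PHASE2_START for d in hish_dates)


print(phase_of(date(2013, 5, 31)), phase_of(date(2013, 6, 1)))
print(eligible_for_phase2([date(2010, 3, 2)]))
```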
In this study, genotype data were binarised according to the presence of each risk SNP. Subjects were then grouped by hierarchical clustering based on the Euclidean distance between their binarised genetic vectors. As the number of clusters was unknown, we determined the optimal threshold for hierarchical clustering using the silhouette coefficient. We then generated decision trees to identify the SNPs that distinguish each cluster.
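The stratification step can be sketched with SciPy, which the study reports using for this purpose. The example below is an assumption-laden illustration rather than the authors' pipeline: the linkage method (average), the candidate thresholds, the helper names and the toy 12-SNP matrix are all hypothetical, and the silhouette coefficient is computed by hand from the pairwise distance matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def mean_silhouette(dist, labels):
    """Mean silhouette coefficient computed from a square distance matrix."""
    clusters = np.unique(labels)
    scores = np.zeros(len(labels))
    for i, own_label in enumerate(labels):
        own = labels == own_label
        n_own = own.sum()
        if n_own == 1:
            continue  # singletons keep a silhouette of 0 by convention
        # dist[i, i] is 0, so dividing by n_own - 1 excludes the point itself.
        a = dist[i, own].sum() / (n_own - 1)
        b = min(dist[i, labels == c].mean() for c in clusters if c != own_label)
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores.mean()


def select_threshold(X, thresholds, method="average"):
    """Cut the dendrogram at each threshold and keep the best silhouette."""
    condensed = pdist(X, metric="euclidean")
    tree = linkage(condensed, method=method)  # linkage method is an assumption
    dist = squareform(condensed)
    best = None
    for t in thresholds:
        labels = fcluster(tree, t=t, criterion="distance")
        k = len(np.unique(labels))
        if k < 2 or k >= len(labels):
            continue  # silhouette needs between 2 and n - 1 clusters
        score = mean_silhouette(dist, labels)
        if best is None or score > best[1]:
            best = (t, score, labels)
    return best


# Toy binarised genotypes: 12 subjects x 12 hypothetical risk SNPs, two groups.
half = 6
group_a = np.hstack([np.ones((half, half), int), np.eye(half, dtype=int)])
group_b = np.hstack([np.eye(half, dtype=int), np.ones((half, half), int)])
X = np.vstack([group_a, group_b])

best_t, best_score, labels = select_threshold(X, thresholds=[1.0, 2.0, 3.0, 4.0])
print(best_t, round(float(best_score), 3), labels)
```

Thresholds that yield a single cluster or only singletons are skipped because the silhouette coefficient is undefined or uninformative there.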
Univariate and multiple logistic regression analyses were conducted for the retrospective cohort study in phase 1 (from 2006 to May 2013). We employed univariate logistic regression analysis (model 1) to assess the relationship between the self-harm behaviour and self-reported physical activity level measured with IPAQ in each genetic cluster. In the multivariable logistic regression analysis (model 2), the model was adjusted for the potential confounders, including age group, sex, ethnicity, current employment status, education level, Townsend deprivation index, smoking status and alcohol consumption. Model 3 added the adjustment for psychotropic medication use.
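For intuition about the unadjusted model, a logistic regression with a single binary predictor yields the same odds ratio as the cross-product of the corresponding 2×2 table, and its Wald interval matches the log-odds-ratio formula used below. The sketch is illustrative only: the counts are hypothetical, the function name is invented for this example, and the adjusted models 2 and 3 require full regression software.

```python
import math


def odds_ratio_wald(exposed_events, exposed_total, ref_events, ref_total, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table (exposed vs reference group)."""
    a = exposed_events
    b = exposed_total - exposed_events
    c = ref_events
    d = ref_total - ref_events
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 HISH events among 1000 active participants versus
# 50 events among 1000 participants in the low-activity reference group.
or_, lo, hi = odds_ratio_wald(30, 1000, 50, 1000)
print(f"OR {or_:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level.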
The prospective cohort for phase 2 (from June 2013 to 2021) consisted of participants without HISH medical records before the beginning of the phase. To assess the effectiveness of physical activity as a non-pharmacological intervention for depressive symptoms, the optimal level of each physical activity had to be decided prior to analysis. Previous guidelines by the WHO have indicated the recommended level of MVPA, whereas there is no such recommendation for LIPA.Reference Fiona, Salih, Stuart, Katja, Matthew and Greet11 Therefore, the recommended level of LIPA was calculated using the methodology suggested by Chen et al.Reference Chen, Huang, He, Gao, Mahara and Lin27 Penalised Cox proportional hazards regression analysis was then conducted to examine the associations between the types and guideline- or calculation-based levels of physical activity and incident HISH in each genetic cluster. As in phase 1, model 2 added adjustments for age group, sex, ethnicity, current employment status, education level, Townsend deprivation index, smoking status and alcohol consumption to the unadjusted model 1. Model 3 additionally controlled for psychotropic medication use.
We used Python 3.10.9 for Windows (Python Software Foundation, Beaverton, Oregon, USA; https://www.python.org/) and SciPy 1.10.1 for Windows (The SciPy Project; https://docs.scipy.org/doc/scipy/) for genetic stratification. For other analyses, we used R version 4.2.2 for MacOS (R Foundation for Statistical Computing, Vienna, Austria; https://www.r-project.org/) along with the packages ‘survival’ 3.5.5, ‘rms’ 6.7-1 and ‘CutpointsOEHR’ 0.1.2. The mean and standard deviation are reported for continuous baseline characteristics except for the Townsend index (of which statistics are reported with median and interquartile range), and percentages are reported to present categorical variables. Odds ratios and hazard ratios are presented with a 95% confidence interval.
Results
Among 64 826 participants who met the criteria for the probable depression phenotype, 36 238 were eligible for the imputed genotype-based stratification. After excluding the subjects with missing data in covariates, 28 923 participants with depression were included in the retrospective cohort study (Supplementary Table 1). Participants who had no records of HISH until May 2013 and who provided valid accelerometer data comprised the prospective cohort in phase 2. Sociodemographic details and descriptive statistics of the cohort are presented in Table 1.
IPAQ, International Physical Activity Questionnaire.
Genetic stratification
We identified the three largest clusters from the result of hierarchical clustering, consisting of 6681, 8905 and 4401 participants, respectively (Fig. 2). The genetic features that distinguish each cluster were identified using the decision tree algorithm. The subsequent cohort studies were conducted on these three clusters.
Phase 1: retrospective cohort study
In phase 1, we sought to examine the association between the IPAQ physical activity group and HISH risk in each genetic cluster. Overall, in unadjusted logistic regression models, participants in the moderate or high IPAQ physical activity groups had significantly lower odds of HISH. There were similar findings in the models stratified by summed MET minutes per week for all activity. Model 2, a multivariable model adjusted for all covariates except psychotropic medication use, showed that the more active IPAQ groups had lower odds of HISH in the full cohort (odds ratio for the moderately active group: 0.771 [0.609–0.977], odds ratio for the highly active group: 0.764 [0.602–0.971]), although the findings from individual genetic clusters were mixed in both model 2 and model 3 (Table 2).
HISH, hospitalisation due to intentional self-harm; IPAQ, International Physical Activity Questionnaire; OR, odds ratio.
a The aORs from model 2 are adjusted for the potential confounders, including age group, sex, ethnicity, current employment status, education level, Townsend deprivation index, smoking status and alcohol consumption.
b The aORs from model 3 are adjusted for all covariates listed in Table 1.
Phase 2: prospective cohort study
Beginning in June 2013, we assessed the relationships between the accelerometer-measured physical activity levels and incident HISH in 7137 participants during a mean (s.d.) follow-up period of 8.55 (0.44) years. After the classification of the accelerometer data, the cohort was stratified using the guideline-based threshold for MVPA (≥150 min per week). As the amount of LIPA had a U-shaped relationship with the log-relative hazards, the ‘goldilocks’ level for LIPA (between 2.38 and 9.53 h a day) was derived from the calculated optimal cut points (Fig. 3).
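The two exposure definitions above reduce to simple threshold checks. The sketch below uses the reported cut points (150 min of MVPA per week and 2.38 to 9.53 h of LIPA per day); the function names are hypothetical, and treating the LIPA bounds as inclusive is an assumption because the text does not specify it.

```python
MVPA_MIN_PER_WEEK = 150.0            # guideline-based threshold reported in the text
LIPA_RANGE_H_PER_DAY = (2.38, 9.53)  # calculated 'goldilocks' range reported in the text


def meets_mvpa_guideline(mvpa_minutes_per_week):
    """True if weekly MVPA reaches the guideline-based threshold."""
    return mvpa_minutes_per_week >= MVPA_MIN_PER_WEEK


def within_lipa_range(lipa_hours_per_day):
    """True if daily LIPA lies inside the range (bounds assumed inclusive)."""
    lower, upper = LIPA_RANGE_H_PER_DAY
    return lower <= lipa_hours_per_day <= upper


# Illustrative participant: 120 min of MVPA per week and 4.5 h of LIPA per day.
print(meets_mvpa_guideline(120), within_lipa_range(4.5))
```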
The results of the penalised Cox proportional hazards regression analysis are presented in Table 3. The ‘goldilocks’ amount of LIPA was associated with a lower risk of ISH in genetic cluster 2 (hazard ratio 0.23; 95% CI 0.07–0.76), and the association remained significant after adjusting for possible confounders in model 2 (adjusted hazard ratio (aHR), 0.26; 95% CI 0.08–0.88), although no significant differences were observed in the other clusters. The same result was observed in the fully adjusted model 3, which additionally controlled for the use of psychotropic medications (aHR, 0.28; 95% CI 0.08–0.96). No significant differences in the risk of ISH were found between the levels of MVPA.
HISH, hospitalisation due to intentional self-harm; HR, hazard ratio.
a The aHRs from model 2 are adjusted for the potential confounders, including age group, sex, ethnicity, current employment status, education level, Townsend deprivation index, smoking status and alcohol consumption.
b The aHRs from model 3 are adjusted for all covariates listed in Table 1.
c The results for cluster 3 were not applicable due to the insufficient number of events. No significant difference was found between the groups within and outside the LIPA recommendation in the log rank test (P = 0.3).
Discussion
The primary aim of the current study was to investigate the preventive effect of physical activity on ISH in individuals with a depression phenotype. The findings from the retrospective cohort analysis in phase 1 revealed that participants with depression who engage in a moderate or higher level of physical activity tend to be at a lower risk of ISH. In the subgroup analysis, the relationship between the overall physical activity level and incident HISH differed across the genetic clusters. A possible explanation for the mixed results and discrepancies across the genetic clusters and models is the genotype-specific effect of physical activity, which in turn heightens the need for detailed research on the objective classification of physical activity and its impact in groups with different genotypes.
Analysis of the prospective cohort in phase 2 showed that maintaining a machine learning-derived ‘goldilocks’ level of LIPA was associated with a reduced risk of HISH in genetic cluster 2. As the effect was not significant in the other clusters, the calculated level of LIPA could be recommended for patients carrying the genetic variants that characterise cluster 2.
Meanwhile, the data from this study did not confirm a significant effect of adherence to the guideline-based amount of MVPA in preventing ISH. These findings are broadly consistent with recent studies that found a beneficial effect on depressive symptoms for LIPA rather than MVPA. Although existing studies and recommendations have focused on MVPA or the overall amount of physical activity, recent reports have suggested that objectively measured LIPA is more effective than other types of physical activity in alleviating depressive symptoms.Reference Kandola, Lewis, Osborn, Stubbs and Hayes28–Reference Helgadóttir, Forsell, Hallgren, Möller and Ekblom31 Because middle-aged and older adults often have limited compliance with and adherence to physical activity prescriptions, LIPA could be a more appropriate option for this population, which includes the participants of the UK Biobank.Reference Ku, Steptoe, Liao, Sun and Chen29
The molecular mechanism of LIPA's preventive effect on depressive symptoms is not clear; nevertheless, one can assume that LIPA affects the pathophysiology of depression in the way general physical activity does. According to our analyses, there were three SNPs that distinguish cluster 2: rs1432639, rs4543289 and rs11209948. Previous results from GWAS reported that two of the three variants, rs1432639 and rs11209948, are related to obesity/body mass index (BMI) and body fat percentage, respectively.Reference Wray, Ripke, Mattheisen, Trzaskowski, Byrne and Abdellaoui22,Reference Hübel, Gaspar, Coleman, Finucane, Purves and Hanscombe32 The nearest gene of the two variants is neuronal growth regulator 1 (NEGR1), which is suggested to be a central hub that links depression and obesity.Reference Noh, Park, Han and Lee33 When considering the proposed mechanism of neuroplasticity of physical activity in depression, the two variants are possibly involved in mediating neural plasticity along with NEGR1.Reference Kandola, Ashdown-Franks, Hendrikse, Sabiston and Stubbs34 Taken together with the role of rs4543289 in the severity and symptoms of depression, the complex interaction between these genetic components could provide a probable explanation for the greater effect of physical activity on ISH in cluster 2.Reference Hyde, Nagle, Tian, Chen, Paciga and Wendland21
One of the distinguishing features of our research is that we utilised various study designs and measurement tools to extend the current findings. For example, throughout our study, the amount of physical activity was largely associated with a relatively lower risk of incident ISH. Although the association was inconsistent in the retrospective cohort study, which used a categorisation derived from self-reported questionnaires, we took advantage of objectively measured physical activity and confirmed its effectiveness in the prospective cohort setting. Recent evidence, ranging from Mendelian randomisation studies to meta-analyses, has reported possible discrepancies between subjective and objective measurement of physical activity, with objective measurement (i.e. accelerometer data) more likely to be related to improved outcomes in depression and related conditions.Reference Choi, Chen, Stein, Klimentidis, Wang and Koenen35–Reference Vancampfort, Firth, Schuch, Rosenbaum, Mugisha and Hallgren37 This discrepancy could be attributed to the bias of self-reported assessment tools. In spite of its convenience and cost-effectiveness, subjective assessment of physical activity is vulnerable to measurement error and various types of bias. These limitations could be more significant when considering the impact of recall and social desirability bias in respondents, especially those of older age or with mental disorders.Reference Kandola, Lewis, Osborn, Stubbs and Hayes28,Reference Heesch, van Uffelen, Hill and Brown38 Moreover, it has been reported that participation in LIPA is especially prone to recall bias, which in turn provides a possible explanation for the lack of previous research on it, and emphasises the need for epidemiological studies with objective physical activity assessment.Reference Matthews, Moore, George, Sampson and Bowles39
Regarding the BAIGE used in our study, one might point out the potential for selection bias due to the decrease in population size as the study progresses. However, this reduction was an unavoidable consequence of the complete-case analysis, which we chose in order to avoid the bias that imputation or other methods could introduce. To investigate the potential selection bias due to the reduction in the study population, we conducted an additional t-distributed stochastic neighbour embedding (t-SNE) analysis for dimensionality reduction (Supplementary Fig. 2). We visualised how the distribution of data points changed from the initial data-set as the study progressed, which suggested that selection bias was minimal. Additionally, we statistically compared the distribution of demographic and clinical features using the Kolmogorov–Smirnov test, and found no statistically significant differences in the distributions across the three data-sets (P > 0.05).
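The distributional comparison can be illustrated with SciPy's two-sample Kolmogorov–Smirnov test. This is a sketch under stated assumptions, not the authors' analysis: the age values are synthetic, the helper name is hypothetical, and the 0.05 significance level is taken from the reported P-value threshold.

```python
import numpy as np
from scipy.stats import ks_2samp


def compare_to_initial(initial, retained, alpha=0.05):
    """Two-sample KS test of one feature between the initial and a retained data-set."""
    result = ks_2samp(initial, retained)
    return {
        "statistic": float(result.statistic),
        "p_value": float(result.pvalue),
        "differs": bool(result.pvalue < alpha),
    }


# Synthetic ages (40 to 69.9 years) and an evenly thinned subset standing in
# for the participants retained at a later stage of the study.
initial_age = np.arange(40.0, 70.0, 0.1)
retained_age = initial_age[::3]
print(compare_to_initial(initial_age, retained_age))
```

A large P-value here indicates only that no difference in that feature's distribution was detected; it does not by itself prove the absence of selection bias.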
To the best of our knowledge, this is the first study to examine the genotype-specific effect of LIPA on ISH in patients with depression. As depression in middle-aged adults is becoming more prevalent, serious consequences of the disorder, such as ISH, are recognised as a major public health concern. Furthermore, research on personalised intervention for depression has been mostly restricted to pharmacological treatment.Reference Vos, ter Hark, Schellekens, Spijker, van der Meij and Grotenhuis40 The present findings have clinical implications for developing personalised recommendations for LIPA, while further research on more genotypes and psychophysiological factors is needed for a better understanding of the non-pharmacological treatment options for depression.
Limitations
There are several limitations to this study. First, this study was based on a single database, and so could be affected by biases in the original data-set, such as its ethnic composition. Second, most of the variables were collected at a specific time point, which means that the variables might not fully represent the characteristics of the participants. For example, participants might have changed their physical activity pattern or lifestyle factors during the observation period or after the measurement. Third, the genetic stratification was based on previously reported risk SNPs, so there might be other genetic variants that we have yet to account for and that could have been uniquely identified in the UK Biobank data-set. Fourth, as the sum of hours during a day is fixed, the effects of specific intensity and level of activity might partly be due to changes in other types of activities. For example, previous studies reported the benefit of displacing sedentary activity with LIPA or MVPA.Reference Kandola, Lewis, Osborn, Stubbs and Hayes28 Fifth, because the statistical model used in the analysis presupposes a ‘U-shaped’ relationship between the continuous variable and the log-relative hazard, the optimal range could not be calculated for MVPA, whose relationship with the hazard was not ‘U-shaped’. This does not necessarily imply that MVPA has no preventive effect on ISH behaviour, although the findings of this study are consistent with previous reports when applying guideline-based thresholds.Reference Kandola, Lewis, Osborn, Stubbs and Hayes28 In further research, an improved computational method might be employed to derive a ‘goldilocks’ amount of MVPA.
Implications
Taken together, our findings highlight the epidemiological relationships between physical activity and ISH behaviour in depression across different genotypes. According to our study, a certain level of LIPA could be suggested as a feasible non-pharmacological option for patients with a particular genotype, in whom it was associated with a self-harm risk around a quarter of that otherwise observed (aHR 0.28 in genetic cluster 2). To develop more robust recommendations, future clinical investigations should focus on more extensive information on environmental factors and their interactions with genotypic data.
Supplementary material
Supplementary material is available online at https://doi.org/10.1192/bjo.2024.845
Data availability
The data supporting the findings of this study are available from UK Biobank (https://www.ukbiobank.ac.uk/) and can be requested under license.
Author contributions
Concept and design: J.J. and D.L. Acquisition, analysis, or interpretation of data: J.J., S.L., J.H.L. and D.L. Drafting of the manuscript: J.J., S.L., J.H.L. and D.L. Statistical analysis: J.J. and S.L. Obtained funding: J.J. and D.L. Administrative, technical, or material support: D.L. Supervision: J.L. and D.L.
Funding
This study was supported by the Ministry of Science & ICT of Korea through the National Research Foundation grant (RS-2023-00262747) and a Medical Scientist Training Programme.
Declaration of interest
None.