How to distinguish promotion, prevention, and treatment trials in public mental health: development and validation of the VErona-LUgano Tool (VELUT)

Marianna Purgato; Emiliano Albanese; Alden L. Gross; Anna Maria Annoni; Ceren Acarturk; Camilla Cadorin; Mark J. D. Jordans; Crick Lund; Davide Papola; Eleonora Prina; Marit Sijbrandij; Manuela Silva; Federico Tedeschi; Wietse A. Tol; Corrado Barbui

doi:10.1017/S2045796025100280

Published online by Cambridge University Press: 12 November 2025

Marianna Purgato*: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy; Cochrane Global Mental Health, University of Verona, Verona, Italy
Emiliano Albanese: Affiliation:
Institute of Public Health, Faculty of Biomedical Sciences, Università della Svizzera Italiana, Lugano, Switzerland; Department of Psychiatry, University of Geneva, Geneva, Switzerland
Alden L. Gross: Affiliation:
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University Center on Aging and Health, Baltimore, MD, USA
Anna Maria Annoni: Affiliation:
Institute of Public Health, Faculty of Biomedical Sciences, Università della Svizzera Italiana, Lugano, Switzerland
Ceren Acarturk: Affiliation:
Department of Psychology, Koc University, Istanbul, Türkiye
Camilla Cadorin: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy
Mark J. D. Jordans: Affiliation:
Research and Development Department, War Child, Amsterdam, The Netherlands; Amsterdam Institute for Social Science Research, University of Amsterdam, Amsterdam, The Netherlands; Centre for Global Mental Health, Health Service and Population Research Department, Institute of Psychiatry, Psychology and Neuroscience, King’s College, London, UK
Crick Lund: Affiliation:
Centre for Global Mental Health, Health Service and Population Research Department, Institute of Psychiatry, Psychology and Neuroscience, King’s College, London, UK; Alan J Flisher Centre for Public Mental Health, Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa
Davide Papola: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy; Cochrane Global Mental Health, University of Verona, Verona, Italy
Eleonora Prina: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy
Marit Sijbrandij: Affiliation:
Department of Clinical, Neuro- and Developmental Psychology, WHO Collaborating Center for Research and Dissemination of Psychological Interventions, Amsterdam Public Health Institute, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Manuela Silva: Affiliation:
Lisbon Institute of Global Mental Health, Comprehensive Health Research Center, NOVA University of Lisbon, Lisbon, Portugal
Federico Tedeschi: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy
Wietse A. Tol: Affiliation:
Section for Global Health, University of Copenhagen, Copenhagen, Denmark; Athena Institute, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Arq National Psychotrauma Centre, Diemen, The Netherlands
Corrado Barbui: Affiliation:
WHO Collaborating Centre for Research and Training in Mental Health and Service Evaluation, Department of Neurosciences, Biomedicine and Movement Sciences, Section of Psychiatry, University of Verona, Verona, Italy; Cochrane Global Mental Health, University of Verona, Verona, Italy
*Corresponding author: Marianna Purgato; Email: marianna.purgato@univr.it

Abstract

Background

Promoting mental health, preventing mental disorders and providing effective treatments are public health priorities. Randomized controlled trials (RCTs) frequently evaluate mental health and psychosocial support interventions to achieve one or more of these objectives. Distinguishing between RCTs focused on mental health promotion, prevention or treatment remains conceptually and methodologically challenging. No standardized tool exists to position RCTs along a promotion-to-treatment continuum in mental health. We aimed to develop and validate the VErona-LUgano Tool (VELUT) for distinguishing RCTs along the promotion-to-treatment continuum.

Methods

An interdisciplinary tool development group (TDG) was established. The Population, Intervention, Comparison and Outcome framework was used to define key constructs. Items in the tool were devised, categorized and reduced through qualitative and quantitative methods. Finally, we performed a preliminary validation of the VELUT applying item response theory (IRT) using data from 180 RCTs.

Results

The TDG generated 33 items for the initial version of the VELUT, reduced to 16 through review, cognitive interviews and psychometric analysis. Analyses of 180 RCTs using the 16-item tool showed high internal consistency (α = 0.94) and unidimensionality. Following further item reduction, a final 8-item version was retained. IRT models confirmed strong item discrimination for the 8 items, high scale reliability (marginal reliability >0.90 across most of the range of the scale), good response distribution and item performance, and alignment with the Institute of Medicine (IOM) promotion-to-treatment continuum.

Conclusions

The VELUT addresses methodological gaps in global mental health research by helping to position RCTs of mental health and psychosocial support (MHPSS) interventions along the IOM promotion-to-treatment continuum.

Keywords

evidence synthesis; item response theory; mental health; prevention; promotion

Information

Type: Original Article
Information: Epidemiology and Psychiatric Sciences, Volume 34, 2025, e54

DOI: https://doi.org/10.1017/S2045796025100280
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press.

Introduction

The Institute of Medicine (IOM) framework classifies mental health interventions into different categories, including promotion, prevention and treatment (Institute of Medicine, 2009). Numerous randomized controlled trials (RCTs) and systematic reviews have assessed the efficacy and effectiveness of promotion, prevention and treatment interventions for mental health across diverse populations (Barbui et al., 2020; Hoare et al., 2021; Papola et al., 2020; Purgato, Albanese et al., 2023; van Zoonen et al., 2014).

Promotion focuses on empowering individuals to enhance their well-being and resilience, and can be a distinct strategy or part of prevention and/or treatment efforts (Eaton, 2019). Prevention aims to avert, reduce or delay the onset of mental disorders through universal, selective or indicated approaches. Prevention may involve early detection and diagnosis, targeting risk and protective factors and reducing the impact of disease on functionality and quality of life (e.g., relapse prevention) (Acarturk et al., 2022; Augustinavicius et al., 2023; Buntrock et al., 2017; Purgato et al., 2021a, 2021b; Riello et al., 2021; Purgato et al., 2025a, 2025b). Treatment targets individuals with existing mental disorders, aiming to alleviate symptoms and improve functioning (Eaton, 2019; Tol et al., 2015).

However, distinguishing between promotion, prevention and (early) treatment trials remains challenging for at least five reasons (Cuijpers, 2022). First, prevention studies require outcome measures that entail a clear ascertainment of the onset of a new disorder in adequately sized samples. However, this is lengthy, resource-intensive and often unfeasible, especially in low-resource settings. Many trials rely on proxy outcomes of incidence, such as symptom worsening, which can obscure the true effectiveness of prevention efforts and conceptually overlap with how treatments are commonly evaluated in RCTs. Second, binary classification of mental health is challenging and may oversimplify its complexity, as mental health exists on a continuum rather than as a strict healthy/ill dichotomy (Papola and Patel, 2025; Patel et al., 2018). The chosen outcome measure may not necessarily be informative about whether the preventive potential of an intervention was investigated in the study. Third, studies may simultaneously focus on promotion, prevention and treatment. For several mental health and psychosocial support (MHPSS) interventions tested in RCTs, the boundaries between promotion, prevention and treatment are blurred, as all of them may be represented to different degrees. Moreover, the contents, components, levels, actors and possible mechanisms of action of MHPSS interventions are seldom explicitly described in RCTs (Papola et al., 2024; Purgato et al., 2021a). This implies issues related to the proverbial black box, poor unpacking of the interventions’ components, and no explicit reference to an a priori theory of change (Miller et al., 2021). Intervention manuals are often not reported or lack details on whether the intervention being tested was designed for promotion, prevention or treatment (Cuijpers et al., 2024; Watts et al., 2020). Fourth, having clarity on the aim of an intervention being evaluated is important for future use and implementation. If the distinction between promotion, prevention and treatment is diffuse, it might reflect uncertainty about the ultimate purpose of the intervention, and thereby its future use. Fifth, the reporting of many RCTs may be suboptimal, limiting appraisal of internal and external validity, and, therefore, inferential interpretation of findings. Sampling procedures are often poorly reported, and the inclusion and exclusion criteria of participants are often unclear (Miller et al., 2023). This makes it difficult to draw coherent lines between populations at risk of mental disorders and populations with mental disorders. For example, the same intervention may be conceived as preventive when given to the former population and as treatment when given to the latter.

Promotion, prevention and treatment RCTs should be clearly discerned to facilitate a better understanding of research findings in the field of public mental health. Appropriately appraising RCTs not only facilitates better overall evidence synthesis but also serves as a crucial tool for enhancing knowledge translation and policy uptake. A clear distinction can finally help researchers optimize the choice of outcomes, pinpoint research gaps, allocate efforts and resources where needed most, and avoid redundancies.

Against this background, we set out to develop a new measurement tool – the VErona-LUgano Tool (VELUT) – designed to position RCTs of MHPSS interventions along the promotion-to-treatment continuum through critical appraisal. This paper describes the methodological process, statistical analyses and results involved in the development and preliminary validation of this tool. It also provides the tool itself, along with instructions on how to use it in evidence synthesis.

Methods

The protocol for this study was approved by the Ethics Committee of Università della Svizzera Italiana (CE_2024_04 of 25-01-2024), and has been published (Purgato, Albanese et al., 2023).

The methods, processes and procedures described here are adapted from the practical guide for the development of health measurement scales (David and Norman, 2015), which we complemented with elements of the Child Health and Nutrition Research Initiative (CHNRI) research prioritization method (Rudan, 2016; Rudan et al., 2017; Shah et al., 2016). The VELUT is an outcome-based tool composed of questions with appropriate answer options. The items were devised, drafted, selected and tested in multiple rounds.

The development of VELUT began with a conceptualization step, in which we established a tool development group (TDG) composed of international experts (n = 14) in public mental health, global mental health and complementary disciplines, i.e., statistics and epidemiology. The TDG members are from 10 universities in 9 countries and are all co-authors of the present paper. The two coordinators of the TDG are the first co-authors of this paper (M.P. and E.A.) and worked in close collaboration with the last author, who originally conceived this endeavour (C.B.). Figure 1 presents the flowchart of the VELUT development methodological steps, described in detail below.

Legend: IRT: item response theory; N: number of items; RCT: randomized controlled trial; TDG: tool development group; USI: Università della Svizzera italiana; VELUT: VErona-LUgano Tool.

Figure 1. Study flowchart.

Devising items

The TDG adopted an iterative process to define the construct(s) targeted for assessment and agreed to use both the Population, Intervention, Comparison and Outcome (PICO) framework and the IOM mental disorders preventive intervention research framework for devising, reviewing and selecting items (Institute of Medicine, 2009; Richardson et al., 1995).

The TDG generated an initial pool of items related to, and collectively reflecting, the conceptualized constructs. We used two complementary approaches for item generation. First, we systematically searched for existing measures already developed to position RCTs along the promotion-to-treatment continuum; we found no studies reporting similar tools in any health field (Purgato, Albanese et al., 2023). Second, the TDG integrated multiple sources to develop the items, including theoretical frameworks, research expertise in the design and conduct of RCTs of MHPSS interventions, and collaboration with practitioners. Qualitative methods were employed with the TDG to elicit additional themes and insights regarding the usability and relevance of our tool. The qualitative methods included focus groups with TDG members, the tool’s intended end users, and three interview rounds focused on each individual item, its intended meaning, and exploration of its relevance and understanding. We also identified and discussed key methodological features of RCTs relevant to assessing the study’s promotion, prevention or treatment nature.

Selection of items

First, TDG members categorized all devised items according to the PICO elements, carefully considering the potential of each item to inform the promotion, prevention or treatment nature of the RCTs according to the IOM framework (Institute of Medicine, 2009). Second, the two TDG coordinators (M.P., a clinical psychologist, and E.A., an epidemiologist) applied techniques adapted from the CHNRI research prioritization methodology (Rudan, 2016; Rudan et al., 2017) to consolidate, combine and remove redundant items, maintaining an overall balance between the construct’s granularity and overall salient features. Third, M.P., E.A. and C.B. discussed and refined the wording and clarity of items. TDG members then participated in survey rounds to provide quantitative ratings of each item’s pertinence to the measured constructs (1 = not pertinent at all to 5 = extremely pertinent) and clarity (1 = extremely unclear to 5 = extremely clear). Fourth, TDG members participated in cognitive interviews to ensure consistency in the interpretation of the items (and response options) and to discuss face and content validity aspects. Finally, item wording and phrasing were improved through a consolidation step based on iterative discussion in a dedicated online session of TDG members, and independently revised by an external native English speaker for grammar and lexicon. The ensuing set of items formed the alpha version of the VELUT, which was used to assess a selection of 180 RCTs from recently published systematic reviews to support further item selection through IRT modelling (below).

Statistical analysis

Data collection

A team of seven evaluators (Appendix, page 8) used the tool to examine its application in a database of published RCTs on the effectiveness of MHPSS interventions (Papola et al., 2020; Purgato, Prina et al., 2023), which had been previously searched, selected and stored in a repository at the University of Verona.

Two assessors independently evaluated all 180 RCTs with the 16-item alpha version of the VELUT for the item response theory (IRT) modelling (described below) (MacCallum et al., 2006; van der Linden, 2018; van der Linden and Hambleton, 1997). Next, we computed an overall score for each primary study based on the data obtained with the application of the VELUT, and then we prepared lists of deleted/combined/rephrased items, informed by both the empirical application of the VELUT and the IRT results, for further discussion with the TDG.

Throughout the research process, the TDG engaged in methodological discussions online and during a 2-day in-person workshop in Lugano (Switzerland), including using an additional method to categorize RCTs.

Statistical analyses

We performed a formal psychometric evaluation to assess the VELUT’s reliability and validity. This included an analysis of the internal consistency and of the factor structure of the VELUT using IRT. Using the standardized version of Cronbach’s alpha (Guttman’s lambda-6), we assessed the internal consistency of the scale, the average inter-item correlation and the signal-to-noise ratio (Albanese et al., 2020), and interpreted IRT model results to identify redundant or overlapping items.
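As an illustration of the internal consistency measures named above, the following is a minimal sketch rather than the analysis code used in the study. It assumes the textbook formula for standardized alpha based on the average inter-item correlation, and one common definition of the signal-to-noise ratio; the function name and the small ratings matrix are hypothetical.

```python
import numpy as np

def standardized_alpha(scores):
    """Return (alpha_std, mean_r, signal_to_noise) for a trials x items array."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                      # number of items
    corr = np.corrcoef(scores, rowvar=False)  # item-by-item correlation matrix
    # Average the off-diagonal (inter-item) correlations.
    mean_r = (corr.sum() - k) / (k * (k - 1))
    # Standardized alpha from the average inter-item correlation.
    alpha_std = k * mean_r / (1 + (k - 1) * mean_r)
    # Assumed definition of signal-to-noise: k * r / (1 - r).
    signal_to_noise = k * mean_r / (1 - mean_r)
    return alpha_std, mean_r, signal_to_noise

# Hypothetical ratings: 6 trials x 4 items, each scored 0-3 (not study data).
ratings = np.array([
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 2, 1, 1],
    [2, 2, 3, 2],
    [3, 2, 3, 3],
    [3, 3, 3, 2],
])
alpha, mean_r, s2n = standardized_alpha(ratings)
print(round(alpha, 3), round(mean_r, 3), round(s2n, 2))
```

Under these definitions the signal-to-noise ratio equals alpha / (1 - alpha), so the two quantities carry the same information on different scales.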

Item response theory

We explored potentially uninformative items by measuring the response clustering (i.e., response or item non-variance ≥60%). We used confirmatory factor analysis (CFA), consistent with IRT, to verify whether the items fit the anticipated domain structure of the VELUT (i.e., PICO and IOM framework, above). Graded response (IRT) models for non-binary answer options were applied to relate the responses to our underlying construct of interest. This approach enabled us to explore characteristics of items, including their correlation with other items (e.g., discrimination) and relative location along the promotion-to-treatment continuum (e.g., item locations or thresholds). To summarize item loadings and thresholds, we generated item characteristic curves (ICCs) to display the probability of endorsing a given response as a function of the latent trait level (i.e., the promotion-to-treatment continuum, on the x-axis), modelled with a cumulative logistic distribution. The ICCs visually summarize item location and discrimination parameters (Orlando et al., 2000; van der Linden and Hambleton, 1997).
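The curves described above can be sketched with a generic logistic graded response model. This is an illustrative sketch, not the authors' estimation code: the function name and the item parameters are hypothetical, and the parameterization (one discrimination and a set of ordered thresholds per item) is the standard textbook form.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model.

    theta: latent trait value(s); a: discrimination; thresholds: increasing
    locations b_1 < ... < b_{K-1} for an item with K ordered categories.
    Returns an array of shape (len(theta), K).
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(thresholds, dtype=float)
    # P(Y >= k | theta) for k = 1..K-1: one cumulative logistic curve per threshold.
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    # Pad with P(Y >= 0) = 1 and P(Y >= K) = 0, then difference adjacent curves.
    ones = np.ones((theta.size, 1))
    zeros = np.zeros((theta.size, 1))
    cum = np.hstack([ones, cum, zeros])
    return cum[:, :-1] - cum[:, 1:]

# Hypothetical item: discrimination 2.0, thresholds at -1, 0 and 1 (not study values).
probs = grm_category_probs([-2.0, 0.0, 2.0], a=2.0, thresholds=[-1.0, 0.0, 1.0])
print(probs.round(3))
```

Each row gives the probabilities of the four response categories at one trait level. A larger discrimination makes the cumulative curves steeper, and each threshold marks the trait level at which the probability of responding in that category or higher is 50%.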

We used graded response logistic models (GRM) (Gross et al., 2014), allowing both location and discrimination to vary across items, with the slope of the ICC depicting the ability to distinguish between neighbouring levels of the latent trait (David and Norman, 2015). We used multidimensional IRT methods, specifically bifactor modelling (Reise, 2012), to capture possible departures from unidimensionality of the VELUT scale, and we explored the goodness of fit of these models with the Akaike information criterion, the Bayesian information criterion (Maydeu-Olivares and Joe, 2006) and the M2 statistic. The latter, which follows an approximate chi-square distribution, was purposely designed to work well with categorical item data (Maydeu-Olivares and Joe, 2005). M2 is less sensitive to sample size, is computationally efficient and performs well even in the case of sparse data (i.e., a high number of response patterns). Item fit was evaluated using normalized residuals (Bollen, 1989). Global model fit of IRT models was evaluated with the root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR) and comparative fit index (CFI).

For all items, we calculated the item information function, which refers to the precision of the item in measuring the latent trait. Item information functions sum to the test information function, which depicts the combined coverage and precision of the scale items relative to the promotion-to-treatment continuum (Lord, 1980). We tested whether the unidimensionality and local/conditional independence assumptions were met using parallel analysis with scree plots. Items that violated these assumptions were flagged for potential exclusion and discussed by the TDG. In addition, we evaluated the number of dimensions covered.
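The relationship between item information, test information and reliability can be sketched as follows. This is a hedged illustration rather than the study's code: it uses the standard information formula for a logistic graded response model, assumes the latent trait has unit variance so that reliability can be approximated as 1 - 1/information, and uses hypothetical function names and item parameters.

```python
import numpy as np

def grm_item_information(theta, a, thresholds):
    """Fisher information of one graded response item at each theta."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(thresholds, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    cum = np.hstack([np.ones((theta.size, 1)), cum, np.zeros((theta.size, 1))])
    probs = cum[:, :-1] - cum[:, 1:]          # category probabilities
    slope = a * cum * (1.0 - cum)             # derivative of each cumulative curve
    dprobs = slope[:, :-1] - slope[:, 1:]     # derivative of each category probability
    return (dprobs ** 2 / probs).sum(axis=1)

def reliability_from_information(info):
    """Approximate reliability, assuming a latent trait with variance 1."""
    return 1.0 - 1.0 / np.asarray(info, dtype=float)

# Hypothetical three-item scale: (discrimination, thresholds) per item.
items = [(2.5, [-1.5, 0.0, 1.5]), (3.0, [-1.0, 0.5]), (2.0, [-0.5, 1.0])]
theta_grid = np.linspace(-2, 2, 9)
# Test information is the sum of the item information functions.
test_info = sum(grm_item_information(theta_grid, a, b) for a, b in items)
print(reliability_from_information(test_info).round(2))
```

Under the unit-variance assumption, a test information of 10 corresponds to a reliability of 0.90, which is how an information curve can be read against a reliability threshold.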

Finally, we used IRT for a preliminary validation study of the final version of the VELUT with 36 RCTs, reusing a subset of the 180 trials, that is, the 20% at each decile between the 10th and 90th percentiles of the original distribution of the latent trait. We applied the IRT GRM to evaluate the VELUT’s psychometric properties and generated ICCs, threshold distribution plots (TDPs) and test characteristic curves (TCCs). ICCs and TDPs provide a visualization of the location and discrimination parameters, the latent trait range covered by the items, and the response options thresholds. TCCs illustrate the precision of the VELUT in estimating the underlying latent trait (theta) at different points along the trait continuum, and how well the scale captures the latent trait (theta) across RCTs, with marginal reliability thresholds ≥0.90 interpreted as indicative of excellent reliability (Roth and Tannenbaum, 2004). Analyses were carried out using Stata 15 (2017).
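The text does not spell out the exact rule used to draw the 36 trials, so the sketch below shows only one possible reading: taking the four trials whose estimated latent trait lies closest to each of the nine decile cut points (9 × 4 = 36, i.e., 20% of 180). The function name, the rule itself and the simulated trait values are assumptions for illustration.

```python
import numpy as np

def select_near_deciles(theta_hat, per_decile=4):
    """Pick `per_decile` distinct trials nearest each decile cut point (10th-90th)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    cut_points = np.percentile(theta_hat, np.arange(10, 100, 10))
    available = np.ones(theta_hat.size, dtype=bool)
    chosen = []
    for cut in cut_points:
        distance = np.abs(theta_hat - cut)
        distance[~available] = np.inf  # never pick the same trial twice
        nearest = np.argsort(distance, kind="stable")[:per_decile]
        available[nearest] = False
        chosen.extend(nearest.tolist())
    return sorted(chosen)

# Simulated latent trait estimates for 180 trials (hypothetical, not study data).
rng = np.random.default_rng(0)
theta_hat = rng.normal(size=180)
subset = select_near_deciles(theta_hat)
print(len(subset))
```

Sampling around fixed percentiles, rather than at random, spreads the validation subset across the whole range of the continuum, which matches the stated aim of representing the full promotion-to-treatment spectrum.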

Results

Devising items

The TDG collaboratively generated an initial pool of 18 items and progressively added 15 more, integrating multiple sources reflecting the PICO and IOM frameworks. The comprehensiveness (i.e., content validity) of the provisional list of 33 items was confirmed by achieving thematic saturation for the key design and conduct features of RCTs that may provide information on the relevant aspects of the IOM construct.

Selection of items and IRT results for the preliminary version of the VELUT

Of the 33 items developed, 6 were allocated to Population, 18 to Intervention, none to Comparison and 5 to Outcome in the PICO framework, and 4 to Study Setting (e.g., humanitarian, community). The items were evenly distributed across the IOM promotion, selective and indicated prevention and treatment continuum. The TDG members discussed the item list, eliminating redundant items (n = 3), rephrasing items (n = 17) and consolidating similar ones to maintain a balance between granularity and construct representation. We retained 16 items and improved their wording and clarity for the next steps.

The average ratings for pertinence and clarity of the 16 items, assessed by members of the TDG using a 1–5 Likert scale, were 3.76 (0.46, range: 2.85–4.37) and 3.75 (0.41, range: 3.16–4.67), respectively. Cognitive interviews confirmed that all TDG members understood all items and response options as intended and could interpret them correctly. Minor wording adaptations were made, such as the use of the term ‘screening’ instead of ‘selection’ for item 1, the use of the term ‘primarily’ instead of ‘initially’, the elimination of redundant words (i.e., ‘for example’), and the choice and clarification of abbreviations. Distributions of response options for the rated 180 RCTs based on the preliminary version of the VELUT are reported in the appendix (Appendix Table e1). Items u1 and u2; u2 and u3; u13, u14 and u16 were discussed with the TDG as their implementation was challenging according to the reviewers during the RCT assessment. The correlation matrix confirmed a strong correlation between items u1 and u2; u2 and u3; and between u13, u14 and u16. As a result, and after the conceptual in-person TDG discussion, some items were rephrased, one item (‘Were subgroup analyses done based on risk profiles and/or symptoms?’) was moved to the beginning of the tool to provide contextual information about the RCT being rated instead of providing information along the promotion-to-treatment continuum, and item u16 (‘In this specific RCT, was the trial setting humanitarian?’) was deleted. Some other items were combined. Principal component analysis and CFAs, respectively, indicated that one strong factor explained 57% of the total variance and 84% of the shared inter-item variance, while parallel analysis with a scree plot suggested strong statistical evidence for unidimensionality. Cronbach’s α of 0.94 suggested strong internal consistency reliability. A Rasch model fitted very poorly (RMSEA = 0.329; CFI = 0.92; SRMR = 0.313), indicating that a better model specification was required. A two-parameter graded-response IRT model fitted the data better (RMSEA = 0.158; CFI = 0.98; SRMR = 0.076), and fit improved further with a bifactor solution (RMSEA = 0.100; CFI = 0.99; SRMR = 0.059) that allowed specific factors to explain intercorrelations between p1, p2, p3 and o1, o2 in addition to the covariation explained by the common factor. The ICCs of the 16 items showed high heterogeneity of both item difficulty and slope (i.e., discrimination parameter) (Appendix Figure e4), with partially overlapping thresholds (u13 and u14) likely due to low or no endorsement of given response categories (Appendix Figure e5). The ICCs highlighted potentially redundant items (u6 and u11, which were highly correlated, r = 0.96) (Appendix Figure e6).

Further item selection led to minimal depreciation of the test information plot (Appendix Figure e7), and through intensive, iterative discussion during the in-person workshop, the TDG agreed that eight items were either duplicates, not informative or not applicable. Six were deleted, two were combined and one was moved to the beginning of the VELUT and transformed into a classification question that does not contribute to the overall score. An agreement was reached about the remaining eight items, their phrasing and respective response options, and about allowing missing data on items when information in the published RCT was missing or unclear.

Validation study and final version of the VELUT

Table 1 reports the frequencies of responses to the eight items of the final version of the VELUT applied to 36 RCTs independently selected by the TDG psychiatric epidemiologist (A.G.). To prioritize representation of trials along the full promotion-to-treatment continuum in this restricted set of 36 RCTs, we selected 20% of trials at each decile between the 10th and 90th percentiles of the original distribution of 180 trials. This selection of trials ensured an adequate spread of responses across all items, with the expected exception of the two outcome-related items (O1 and O2), for which answer options of indicated prevention prevailed. There were 12 ‘unclear’ responses out of 36 to item P3 regarding participants’ symptomatology at baseline; these were treated as missing data in graded-response IRT models and handled assuming missingness was random conditional on other variables in the model (i.e., missing at random, MAR). All item discriminations (estimated via weighted least squares means and variance adjusted estimation) were high (>0.58) to very high (>0.93), indicating a high correlation between the VELUT items and the underlying latent promotion-to-treatment continuum of interest.

Table 1. Frequencies of responses from the application of the VELUT to 36 selected RCTs, and item parameters

Loadings: The strength of the relationship between an item and the latent trait. Threshold: Points on the latent trait (θ) where the probability of selecting a specific response category or higher is 50%. WLSMV: weighted least squares means and variance adjusted for categorical data, such as Likert-scale responses.

^a Thresholds are on the latent trait scale, not on the observed scale.

The thresholds between response options (i.e., category boundaries) across all items ranged from theta values of −2.86 to 2.48, with most intermediate thresholds falling close to 0 on the latent trait continuum (assumed to follow a standard normal distribution, N(0,1)) (Table 1). Figure 2 presents the ICCs of the final version of the VELUT. The steepness of the slopes (i.e., discrimination) of the ICCs shown in the figure supports the ability of each item to distinguish between neighbouring levels of the underlying latent trait, the promotion-to-treatment continuum. The ICCs also show the broad range of location parameters across items.
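To make the role of thresholds and discrimination concrete, the following minimal sketch computes response-category probabilities for a single item under a graded-response model. It is an illustration only: it uses a logistic link for simplicity (the WLSMV estimates reported here are on a probit-type metric), and the function name and the item parameters are hypothetical rather than taken from Table 1.

```python
import math

def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under a logistic graded-response model.

    `thresholds` must be sorted in ascending order; K thresholds give K + 1 categories.
    """
    # Cumulative probabilities P(Y >= k | theta), one per threshold
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds
    ]
    # Pad with P(Y >= lowest) = 1 and P(Y > highest) = 0, then difference neighbours
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Hypothetical item (assumed values): discrimination 2.0, thresholds -1.0, 0.0, 1.5
probs = grm_category_probs(theta=0.0, discrimination=2.0, thresholds=[-1.0, 0.0, 1.5])
print([round(p, 3) for p in probs])
```

When theta equals a threshold, the probability of responding in that category or higher is exactly 0.5, which matches the definition of a threshold given in the note to Table 1. A larger discrimination makes the cumulative curves steeper around each threshold.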

Figure 2. Item characteristic curves.

The information and reliability of each item and the overall VELUT scale are illustrated in Figures 3 and 4, respectively.

Figure 3. Item information function (Mplus).

Figure 4. Test information function (Mplus).

The item information functions (Fig. 3) show high information for all items across a broad range of theta, except for O1 and O2. The test information function suggests that the VELUT is a strong measure of the latent trait across all the RCTs appraised (Fig. 4). Test information corresponding to a marginal reliability >0.90 was present across the entire observed range of the promotion-to-treatment latent trait, indicating excellent reliability of the scale, that is, a highly precise measurement of theta. The slope of the TCC (Fig. 5) is steep, indicating that the VELUT should differentiate well between RCTs with varying levels of the promotion-to-treatment trait, and the linear relationship with the computed score suggests that VELUT scores increase predictably and consistently as the latent trait does. Flat areas of the TCC were confined to extreme score values (Fig. 5). This result supports the good interpretability of the VELUT overall scores and their correspondence with the promotion-to-treatment continuum.
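The link between test information and reliability can be sketched as follows. This assumes one common convention, in which the latent trait has unit variance and reliability at a given theta is approximated as 1 − 1/I(θ). The convention actually used in the analysis is not stated here, and the function names are hypothetical.

```python
import math

def standard_error_from_information(information):
    """Standard error of the theta estimate at a given level of test information."""
    if information <= 0:
        raise ValueError("information must be positive")
    return 1.0 / math.sqrt(information)

def reliability_from_information(information):
    """Approximate reliability, assuming the latent trait has variance 1."""
    return 1.0 - standard_error_from_information(information) ** 2

def information_for_reliability(reliability):
    """Test information needed to reach a target reliability (same assumption)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return 1.0 / (1.0 - reliability)

# Under this convention, a reliability of 0.90 corresponds to information of about 10
print(round(information_for_reliability(0.90), 2))
```

Under this assumption, test information above roughly 10 across the observed range of theta corresponds to a standard error below about 0.32 and a reliability above 0.90.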

Figure 5. Test characteristic curve (TCC).

After extensive in-person discussion in the workshop, the TDG agreed that the overall VELUT score may be complemented by the frequencies of response options that correspond to the IOM domains of promotion, selective prevention, indicated prevention and treatment. This recognizes that one RCT can provide evidence for each domain, irrespective of its position on the promotion-to-treatment continuum. Both summaries of the responses can be used to inform the classification of RCTs in evidence synthesis (see https://www.iph.usi.ch/it/strumenti/velut). The VELUT is presented in Table 2.

Table 2. The Verona-Lugano Tool – VELUT

NA, unclear or insufficient information.

^a The study population was screened for the presence of a mental condition, or for indicators and proxies of psychological distress or another mental health condition, using a mental health instrument/symptom checklist.

^b Consider the operationalization of the inclusion and exclusion criteria (if any).

^c See study sample characteristics, typically reported in Table 1.
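The two complementary summaries agreed by the TDG, an overall score and the frequencies of response options by IOM domain, can be sketched as below. This is a minimal illustration under loud assumptions: the 1 to 4 coding of response options, the use of a simple sum as the overall score, and all names are hypothetical and are not taken from Table 2 or from the online tool.

```python
from collections import Counter

# Hypothetical coding (an assumption, not the published scoring):
# each response option is mapped to one IOM domain and an ordinal code.
DOMAIN_CODES = {"promotion": 1, "selective": 2, "indicated": 3, "treatment": 4}

def summarize(responses):
    """Return (sum score, domain frequencies) for one RCT.

    `responses` holds one domain label per item; None marks NA/unclear items,
    which are skipped in both summaries.
    """
    answered = [r for r in responses if r is not None]
    total = sum(DOMAIN_CODES[r] for r in answered)
    counts = Counter(answered)
    frequencies = {domain: counts.get(domain, 0) for domain in DOMAIN_CODES}
    return total, frequencies

# Two hypothetical RCTs rated on eight items
rct_a = ["selective", "indicated"] * 4
rct_b = ["promotion", "treatment"] * 4

total_a, profile_a = summarize(rct_a)
total_b, profile_b = summarize(rct_b)
print(total_a, profile_a)
print(total_b, profile_b)
```

In this toy example both trials obtain the same sum score while their domain profiles differ completely, which illustrates why the domain frequencies are reported alongside the overall score.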

Discussion

This paper presents the development and preliminary psychometric validation of the VELUT, a novel measurement tool designed to position RCTs along the IOM continuum, ranging from mental health promotion to treatment. Tool development followed rigorous, iterative qualitative and quantitative psychometric procedures and involved international experts from complementary disciplines, who participated as members of an interdisciplinary TDG.

Through multiple rounds of item generation, refinement meetings, qualitative feedback, cognitive interviews and statistical analyses, we identified a final set of eight items demonstrating strong psychometric properties. Our findings indicate that the VELUT exhibits excellent internal consistency, high item discrimination and robust unidimensionality, supporting its construct validity in assessing the promotion-, prevention- or treatment focus of MHPSS interventions in global and in public mental health.

Results from IRT analyses revealed that the category thresholds for all eight items were widely spaced, indicating good item distribution and strong discrimination across varying levels of the latent trait representing the spectrum from promotion to treatment. All 8 items were found to be non-redundant and informative, ensuring comprehensive coverage of the underlying construct. The development of the VELUT addresses a critical gap in global mental health research by providing a structured, evidence-based approach to characterize intervention studies along a continuum that is frequently conceptually and methodologically ambiguous.

Many of our study RCTs were classified as indicated prevention, a finding that aligns with prevailing trends in the scientific literature, which focus on population groups already presenting symptoms of mental disorders (Cuijpers, 2003, 2022; Thom et al., 2024).

In general, conducting prevention research in mental health is not easy. It requires measuring the incidence of mental disorders to determine whether an intervention reduces the onset of disorders. This presents three distinct challenges. First, the accurate measurement of incidence in prevention trials requires a formal diagnostic interview, which, unlike self-assessed symptom scales, is time-consuming and must be delivered by trained professionals (Dattani, 2023; Jensen-Doss and Hawley, 2010). Prevention trials therefore require substantial financial resources and access to trained professionals capable of administering diagnostic assessments. These resources can be difficult to mobilize, particularly in low- and middle-income settings where mental health services are limited and mental health professionals are scarce. Second, mental health prevention research often requires a broad and diverse participant base and long-term follow-up periods (Legemaat et al., 2023). Mental disorders often emerge gradually and can be influenced by several factors, including the structural and social determinants of mental health (Lund et al., 2018; Machado et al., 2024; Rose-Clarke et al., 2020). Also, prevention research often targets at-risk populations, complicating recruitment efforts and introducing potential ethical considerations and constraints (World Health Organization, 2022). Third, the boundaries between the different levels of prevention are often blurred in mental health research, and may be interpreted subjectively and disputed. This is part of the conceptual challenge that guided the development of the VELUT.

The VELUT is a new tool in mental health. Several critical appraisal tools for primary studies exist in the scientific literature (Medicine CEBM, 2023), but they are conceptually different from the VELUT. The VELUT shares several premises with the Cochrane Risk of Bias tool (RoB) (Higgins et al., 2023): like the RoB, the VELUT is outcome-specific and transparent. However, while the Cochrane RoB is explicitly designed to evaluate the risk of bias and the internal validity of primary studies, focusing primarily on methodology, the VELUT provides a complementary classification, that is, a structured approach for positioning RCTs along the IOM continuum from promotion to treatment, as well as for quantifying the extent to which each RCT emphasizes promotion/universal prevention, selective prevention, indicated prevention or treatment. This quantification, which no existing tool currently offers, is important, as two RCTs may have the same overall VELUT score yet exhibit markedly different profiles across the IOM domains. Reporting these profiles mitigates potential misclassification bias and underscores the conceptual limitation of relying on aggregate scores alone. Consequently, no fixed threshold was imposed for categorizing RCTs based on total scores, and we recommend against the use of cutoffs, which would be inherently arbitrary and may introduce bias.

Our study has limitations. First, our methodology is novel. However, to our knowledge, there are no established methodologies to develop a scale in evidence synthesis. Our methodology is grounded in existing CHNRI methodology and considers two robust conceptual frameworks, namely PICO and IOM. Second, unlike WHO guideline development groups, and although the TDG was gender balanced, low- and middle-income countries were less represented than high-income countries. However, several members of the TDG have long experience in humanitarian, resource-poor settings, which may have helped to ensure attention to socio-cultural factors in all the steps of the VELUT development process. We followed a balanced, multidisciplinary, highly participatory, democratic and replicable approach. Third, although we did confirm that the unidimensionality assumption of the IRT model was met, we cannot exclude deviations from local independence of items, although this seems unlikely. A strength of this study is that we used IRT models to inform item selection and to embed a preliminary validation conducted on a large set of relevant RCTs assessed by a trained assessment team. A further strength is that we also confirmed the known-groups validity of the VELUT by purposively selecting a few RCTs at each extreme of the spectrum, where VELUT scores were lowest and highest, respectively. This provides further empirical support for the construct validity of this new tool.

The TDG has developed a new pragmatic, free and accessible tool to address key challenges in evidence synthesis. Moreover, the VELUT aims to also promote better RCT design by introducing a public mental health perspective to existing methodological tools and resources. The VELUT helps to systematically – and in a pragmatic, user-friendly way – clarify which methodological characteristics align a primary study with a focus on promotion, prevention or treatment. This classification can be done with an overall VELUT score complemented by the computation of the frequencies of response options that correspond to the IOM domains of promotion, selective prevention, indicated prevention and treatment. This classification may guide key methodological decisions such as inclusion and exclusion criteria, outcome measures and intervention selection. The VELUT may also allow the classification of RCTs and, accordingly, guide subgroup and sensitivity analyses in systematic reviews and meta-analyses. Additionally, it will orient policymakers and funders when considering the criteria for RCTs to be funded or implemented in specific global and public health initiatives.

The VELUT has been specifically developed to address gaps in global mental health research and evidence synthesis. However, its conceptual framework and structure lend themselves to broader applicability across other areas of health research that may share the challenge of misclassification of studies along the promotion-to-treatment continuum. For example, the VELUT can be applied to trials involving children and adolescents, as well as populations with severe mental conditions or comorbidities, since the items are designed to capture general trial features independent of age group or diagnostic complexity. This new tool has potential implications for the design of new clinical trials that could be more precisely focused on studying promotion, prevention or treatment. For example, inclusion and exclusion criteria, tools for outcome assessment and analyses, and implementation settings could be defined using the VELUT. Implications also extend to clinical practice and the use of evidence in the choice of interventions; to priority setting and strategic planning activities; and, last but not least, to the development of calls for applications and the evaluation of research applications by funding agencies and policy makers.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S2045796025100280.

Availability of data and materials

Data will be made available upon motivated request to the contact authors, and after discussion with the TDG.

Acknowledgements

Not applicable.

Author contributions

Conceptualization: M.P., C.B., E.A.; methodology: A.G., E.A., M.J.D.J., W.A.T., C.L., M.P., C.B.; planning and being responsible for data analysis: A.G., E.A., M.P., C.B.; writing, critically reviewing, editing: M.P., E.A., A.G., A.M.A., C.A., C.C., M.J.D.J., C.L., D.P., E.P., M. Sijbrandij, M. Silva, F.T., W.A.T., C.B. M.P. and C.B. are guarantors and responsible for this work and its conduct, and will have full access to the data. M.P. and E.A. are joint first authors.

Financial support

This research is partially funded by a grant for the internationalization of research of the University of Verona (Bando PIA-2023) and by the Swiss National Science Foundation (Global Mental Health workshop 226841). D.P. was funded by the European Union’s Horizon-MSCA-2021-PF01 research program under grant agreement N 101061648. The funder had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; and decision to submit the manuscript for publication.

Competing interests

The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Ethical standards

The protocol for this study was approved by the Ethics Committee of Università della Svizzera Italiana (CE_2024_04 of 25-01-2024).

References

Acarturk, C, Uygun, E, Ilkkursun, Z, Carswell, K, Tedeschi, F, Batu, M, Eskici, S, Kurt, G, Anttila, M, Au, T, Baumgartner, J, Churchill, R, Cuijpers, P, Becker, T, Koesters, M, Lantta, T, Nose, M, Ostuzzi, G, Popa, M, Purgato, M, Sijbrandij, M, Turrini, G, Valimaki, M, Walker, L, Wancata, J, Zanini, E, White, RG, van Ommeren, M and Barbui, C (2022) Effectiveness of a WHO self-help psychological intervention for preventing mental disorders among Syrian refugees in Turkey: A randomized controlled trial. World Psychiatry 21(1), 88–95. https://doi.org/10.1002/wps.20939.

Albanese, E, Butikofer, L, Armijo-Olivo, S, Ha, C and Egger, M (2020) Construct validity of the Physiotherapy Evidence Database (PEDro) quality scale for randomized trials: Item response theory and factor analyses. Research Synthesis Methods 11(2), 227–236. https://doi.org/10.1002/jrsm.1385.

Augustinavicius, J, Purgato, M, Tedeschi, F, Musci, R, Leku, MR, Carswell, K, Lakin, D, van Ommeren, M, Cuijpers, P, Sijbrandij, M, Karyotaki, E, Tol, WA and Barbui, C (2023) Prevention and promotion effects of Self Help Plus: Secondary analysis of cluster randomised controlled trial data among South Sudanese refugee women in Uganda. BMJ Open 13(9), e048043. https://doi.org/10.1136/bmjopen-2020-048043.

Barbui, C, Purgato, M, Abdulmalik, J, Acarturk, C, Eaton, J, Gastaldon, C, Gureje, O, Hanlon, C, Jordans, M, Lund, C, Nose, M, Ostuzzi, G, Papola, D, Tedeschi, F, Tol, W, Turrini, G, Patel, V and Thornicroft, G (2020) Efficacy of psychosocial interventions for mental health outcomes in low-income and middle-income countries: An umbrella review. Lancet Psychiatry 7(2), 162–172. https://doi.org/10.1016/S2215-0366(19)30511-5.

Bollen, KA (1989) Structural Equations with Latent Variables. New York: Wiley.

Buntrock, C, Berking, M, Smit, F, Lehr, D, Nobis, S, Riper, H, Cuijpers, P and Ebert, D (2017) Preventing depression in adults with subthreshold depression: Health-economic evaluation alongside a pragmatic randomized controlled trial of a web-based intervention. Journal of Medical Internet Research 19(1), e5. https://doi.org/10.2196/jmir.6587.

Cuijpers, P (2003) Examining the effects of prevention programs on the incidence of new cases of mental disorders: The lack of statistical power. The American Journal of Psychiatry 160(8), 1385–1391. https://doi.org/10.1176/appi.ajp.160.8.1385.

Cuijpers, P (2022) Why primary prevention often is no prevention at all. European Neuropsychopharmacology 58, 1–3. https://doi.org/10.1016/j.euroneuro.2022.01.004.

Cuijpers, P, Boyce, N and van Ommeren, M (2024) Why treatment manuals of psychological interventions should be freely available. Lancet Psychiatry 11(5), 325–326. https://doi.org/10.1016/S2215-0366(24)00071-3.

Dattani, S (2023) How do researchers study the prevalence of mental illnesses? Published online at OurWorldinData.org. Retrieved from https://ourworldindata.org/how-do-researchers-study-the-prevalence-of-mental-illnesses (accessed 1 November 2025).

David, LS and Norman, AG (2015) Health Measurement Scales: A Practical Guide to Their Development and Use, 5th edn. Oxford: Oxford University Press.

Eaton, WW (2019) Public Mental Health. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780190916602.001.0001.

Gross, AL, Sherva, R, Mukherjee, S, Newhouse, S, Kauwe, JS, Munsie, LM, Waterston, LB, Bennett, DA, Jones, RN, Green, RC and Crane, PK (2014) Calibrating longitudinal cognition in Alzheimer’s disease across diverse test batteries and datasets. Neuroepidemiology 43(3–4), 194–205. https://doi.org/10.1159/000367970.

Higgins, JPT, Thomas, J, Chandler, J, Cumpston, M, Li, T, Page, MJ and Welch, VA (eds) (2023) Cochrane Handbook for Systematic Reviews of Interventions, version 6.4 (updated August 2023). Chichester, UK: John Wiley and Sons, Cochrane. www.training.cochrane.org/handbook.

Hoare, E, Collins, S, Marx, W, Callaly, E, Moxham-Smith, R, Cuijpers, P, Holte, A, Nierenberg, AA, Reavley, N, Christensen, H, Reynolds, CF, Carvalho, AF, Jacka, F and Berk, M (2021) Universal depression prevention: An umbrella review of meta-analyses. Journal of Psychiatric Research 144, 483–493. https://doi.org/10.1016/j.jpsychires.2021.10.006.

Jensen-Doss, A and Hawley, KM (2010) Understanding barriers to evidence-based assessment: Clinician attitudes toward standardized assessment tools. Journal of Clinical Child and Adolescent Psychology 39(6), 885–896. https://doi.org/10.1080/15374416.2010.517169.

Legemaat, AM, Burger, H, Geurtsen, GJ, Brouwer, M, Spinhoven, P, Denys, D and Bockting, CL (2023) Effects up to 20-year follow-up of preventive cognitive therapy in adults remitted from recurrent depression: The DELTA study. Psychotherapy and Psychosomatics 92(1), 55–64. https://doi.org/10.1159/000527906.

Lord, FM (1980) Applications of Item Response Theory to Practical Testing Problems. Mahwah, NJ: Erlbaum.

Lund, C, Brooke-Sumner, C, Baingana, F, Baron, EC, Breuer, E, Chandra, P, Haushofer, J, Herrman, H, Jordans, M, Kieling, C, Medina-Mora, ME, Morgan, E, Omigbodun, O, Tol, W, Patel, V and Saxena, S (2018) Social determinants of mental disorders and the Sustainable Development Goals: A systematic review of reviews. Lancet Psychiatry 5(4), 357–369. https://doi.org/10.1016/S2215-0366(18)30060-9.

MacCallum, RC, Browne, MW and Cai, L (2006) Testing differences between nested covariance structure models: Power analysis and null hypotheses. Psychological Methods 11(1), 19–35. https://doi.org/10.1037/1082-989X.11.1.19.

Machado, DB, Alves, FJO and Patel, V (2024) Economic interventions for the prevention of mental health problems: The role of cash transfers. The American Journal of Orthopsychiatry 94(4), 477–484. https://doi.org/10.1037/ort0000764.

Maydeu-Olivares, A and Joe, H (2005) Limited- and full-information estimation and goodness-of-fit testing in 2^n contingency tables: A unified framework. Journal of the American Statistical Association 100(471), 1009–1020. https://doi.org/10.1198/016214504000002069.

Maydeu-Olivares, A and Joe, H (2006) Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika 71, 713–732. https://doi.org/10.1007/s11336-005-1295-9.

Medicine CEBM (2023) Critical appraisal tools. Available at: https://www.cebm.ox.ac.uk/resources/ebm-tools/critical-appraisal-tools (accessed 1 November 2025).

Miller, KE, Jordans, MJD, Tol, WA and Galappatti, A (2021) A call for greater conceptual clarity in the field of mental health and psychosocial support in humanitarian settings. Epidemiology and Psychiatric Sciences 30, e5. https://doi.org/10.1017/S2045796020001110.

Miller, KE, Rasmussen, A and Jordans, MJD (2023) Strategies to improve the quality and usefulness of mental health trials in humanitarian settings. Lancet Psychiatry 10(12), 974–980. https://doi.org/10.1016/S2215-0366(23)00273-0.

Institute of Medicine (2009) Preventing Mental, Emotional, and Behavioral Disorders Among Young People: Progress and Possibilities. Washington, DC: National Academies Press.

Orlando, M, Sherbourne, CD and Thissen, D (2000) Summed-score linking using item response theory: Application to depression measurement. Psychological Assessment 12(3), 354–359. https://doi.org/10.1037/1040-3590.12.3.354.

Papola, D and Patel, V (2025) Towards person-centered care in global mental health: Implications for meta-analyses and clinical trials. Epidemiology and Psychiatric Sciences 34, e13. https://doi.org/10.1017/S2045796025000071.

Papola, D, Prina, E, Ceccarelli, C, Cadorin, C, Gastaldon, C, Ferreira, MC, Tol, WA, van Ommeren, M, Barbui, C and Purgato, M (2024) Psychological and social interventions for the promotion of mental health in people living in low- and middle-income countries affected by humanitarian crises. Cochrane Database of Systematic Reviews 5(5), CD014300. https://doi.org/10.1002/14651858.CD014300.pub2.

Papola, D, Purgato, M, Gastaldon, C, Bovo, C, van Ommeren, M, Barbui, C and Tol, WA (2020) Psychological and social interventions for the prevention of mental disorders in people living in low- and middle-income countries affected by humanitarian crises. Cochrane Database of Systematic Reviews 9(9), CD012417. https://doi.org/10.1002/14651858.CD012417.pub2.

Patel, V, Saxena, S, Lund, C, Thornicroft, G, Baingana, F, Bolton, P, Chisholm, D, Collins, PY, Cooper, JL, Eaton, J, Herrman, H, Herzallah, MM, Huang, Y, Jordans, MJD, Kleinman, A, Medina-Mora, ME, Morgan, E, Niaz, U, Omigbodun, O, Prince, M, Rahman, A, Saraceno, M, Sarkar, BK, De Silva, M, Singh, I, Stein, DJ, Sunkel, C and Unützer, J (2018) The Lancet Commission on global mental health and sustainable development. Lancet 392, 1553–1598. https://doi.org/10.1016/S0140-6736(18)31612-X.

Purgato, M, Albanese, E, Papola, D, Prina, E, Tedeschi, F, Gross, A, Sijbrandij, M, Acarturk, C, Annoni, AM, Silva, M, Jordans, MJD, Lund, C, Tol, WA, Cuijpers, P and Barbui, C (2023) How to distinguish promotion, prevention and treatment trials in public mental health? Study protocol for the development of the VErona-LUgano Tool (VELUT). BMJ Open 14(8), e082652. https://doi.org/10.1136/bmjopen-2023-082652.

Purgato, M, Carswell, K, Tedeschi, F, Acarturk, C, Anttila, M, Au, T, Bajbouj, M, Baumgartner, J, Biondi, M, Churchill, R, Cuijpers, P, Koesters, M, Gastaldon, C, Ilkkursun, Z, Lantta, T, Nosè, M, Ostuzzi, G, Papola, D, Popa, M, Roselli, V, Sijbrandij, M, Tarsitani, L, Turrini, G, Välimäki, M, Walker, L, Wancata, J, Zanini, E, White, RG, van Ommeren, M and Barbui, C (2021b) Effectiveness of Self-Help Plus in preventing mental disorders in refugees and asylum seekers in Western Europe: A multinational randomized controlled trial. Psychotherapy and Psychosomatics 90(6), 403–414. https://doi.org/10.1159/000517504.

Purgato, M, Prina, E, Ceccarelli, C, Cadorin, C, Abdulmalik, JO, Amaddeo, F, Arcari, L, Churchill, R, Jordans, MJD, Lund, C, Papola, D, Uphoff, E, van Ginneken, N, Tol, WA and Barbui, C (2023) Primary-level and community worker interventions for the prevention of mental disorders and the promotion of well-being in low- and middle-income countries. Cochrane Database of Systematic Reviews 10(10), CD014722. https://doi.org/10.1002/14651858.CD014722.pub2.

Purgato, M, Singh, R, Acarturk, C and Cuijpers, P (2021a) Moving beyond a ‘one-size-fits-all’ rationale in global mental health: Prospects of a precision psychology paradigm. Epidemiology and Psychiatric Sciences 30, e63. https://doi.org/10.1017/S2045796021000500.

Purgato, M, Tedeschi, F, Riello, M, Zaccoletti, D, Mediavilla, R, Ayuso-Mateos, JL, MacTaggart, D, Barbui, C and Rusconi, E (2025a) Effectiveness of Self-Help Plus in its digital version in reducing anxiety and post-traumatic symptomatology among nursing home workers during the COVID-19 pandemic: Secondary analysis of randomised controlled trial data. BMJ Mental Health 28(1), e301379. https://doi.org/10.1136/bmjment-2024-301379.

Purgato, M, Tedeschi, F, Turrini, G, Cadorin, C, Compri, B, Muriago, G, Ostuzzi, G, Pinucci, I, Prina, E, Serra, R, Tarsitani, L, Witteveen, AB, Roversi, A, Melchior, M, McDaid, D, Park, AL, Petri-Romao, P, Kalisch, R, Underhill, J, Bryant, R, Mediavilla Torres, R, Ayuso-Mateos, JL, Felez Nobrega, M, Haro, JM, Sijbrandij, M, Nose, M and Barbui, C (2025b) Effectiveness of a stepped-care programme of WHO psychological interventions in a population of migrants: Results from the RESPOND randomized controlled trial. World Psychiatry 24(1), 120–130. https://doi.org/10.1002/wps.21281.

Reise, SP (2012) The rediscovery of bifactor measurement models. Multivariate Behavioral Research 47, 667–696. https://doi.org/10.1080/00273171.2012.715555.

Richardson, WS, Wilson, MC, Nishikawa, J and Hayward, RS (1995) The well-built clinical question: A key to evidence-based decisions. ACP Journal Club 123(3), A12–13. https://doi.org/10.7326/ACPJC-1995-123-3-A12.

Riello, M, Purgato, M, Bove, C, Tedeschi, F, MacTaggart, D, Barbui, C and Rusconi, E (2021) Effectiveness of Self-Help Plus (SH+) in reducing anxiety and post-traumatic symptomatology among care home workers during the COVID-19 pandemic: A randomized controlled trial. Royal Society Open Science 8(11), 210219. https://doi.org/10.1098/rsos.210219.

Rose-Clarke, K, Gurung, D, Brooke-Sumner, C, Burgess, R, Burns, J, Kakuma, R, Kusi-Mensah, K, Ladrido-Ignacio, L, Maulik, PK, Roberts, T, Walker, IF, Williams, S, Yaro, P, Thornicroft, G and Lund, C (2020) Rethinking research on the social determinants of global mental health. The Lancet Psychiatry 7(8), 659–662. https://doi.org/10.1016/S2215-0366(20)30134-6.

Roth, FP and Tannenbaum, DE (2004) Multilingual Guidelines for Educational and Psychological Testing. American Educational Research Association: Washington, DC.

Rudan, I (2016) Setting health research priorities using the CHNRI method: IV. Key conceptual advances. Journal of Global Health 6(1), 010501. https://doi.org/10.7189/jogh.06.010501.

Rudan, I, Yoshida, S, Chan, KY, Sridhar, D, Wazny, K, Nair, H, Sheikh, A, Tomlinson, M, Lawn, JE, Bhutta, ZA, Bahl, R, Chopra, M, Campbell, H, El Arifeen, S, Black, RE and Cousens, S (2017) Setting.

Shah, H, Albanese, E, Duggan, C, Rudan, I, Langa, KM, Carrillo, MC, Chan, KY, Joanette, Y, Prince, M, Rossor, M, Saxena, S, Snyder, HM, Sperling, R, Varghese, M, Wang, H, Wortmann, M and Dua, T (2016) Research priorities to reduce the global burden of dementia by 2025. The Lancet Neurology 15(12), 1285–1294. https://doi.org/10.1016/S1474-4422(16)30235-6.

Thom, J, Jonas, B, Reitzle, L, Mauz, E, Holling, H and Schulz, M (2024) Trends in the diagnostic prevalence of mental disorders, 2012–2022: Using nationwide outpatient claims data for mental health surveillance. Deutsches Ärzteblatt International 121(11), 355–362. https://doi.org/10.3238/arztebl.m2024.0052.

Tol, WA, Purgato, M, Bass, JK, Galappatti, A and Eaton, W (2015) Mental health and psychosocial support in humanitarian settings: A public mental health perspective. Epidemiology and Psychiatric Sciences 24(6), 484–494. https://doi.org/10.1017/S2045796015000827.

van der Linden, WJ (2018) Handbook of Item Response Theory: Three Volume Set. Boca Raton, FL: CRC Press.

van der Linden, WJ and Hambleton, RK (eds) (1997) Handbook of Modern Item Response Theory. New York: Springer. https://doi.org/10.1007/978-1-4757-2691-6.

van Zoonen, K, Buntrock, C, Ebert, DD, Smit, F, Reynolds, CF III, Beekman, AT and Cuijpers, P (2014) Preventing the onset of major depressive disorder: A meta-analytic review of psychological interventions. International Journal of Epidemiology 43(2), 318–329. https://doi.org/10.1093/ije/dyt175.

Watts, S, van Ommeren, M and Cuijpers, P (2020) Open access of psychological intervention manuals. World Psychiatry 19(2), 251–252. https://doi.org/10.1002/wps.20756.

World Health Organization (2022) World Mental Health Report: Transforming Mental Health for All. Geneva: WHO.

Figure 1. Study flowchart.

Legend: IRT: item response theory; N: number of items; RCT: randomized controlled trial; TDG: tool development group; USI: Università della Svizzera italiana; VELUT: Verona Lugano tool.

Purgato et al. supplementary material

DOI: https://doi.org/10.1017/S2045796025100280.sm001

File 2.4 MB
