Introduction
Over the last two decades, the National Institutes of Health (NIH) has made significant contributions to advancing clinical and translational research (CTR) through its funding and leadership. Awards such as the Clinical and Translational Science Awards (CTSAs) have helped to foster and support new research by building a cadre of clinical and translational scientists. This network currently includes over 60 leading medical organizations across the United States (US) [1]. Programs like the Institutional Development Award (IDeA) Networks for Clinical and Translational Research (IDeA-CTR), established in 1993, currently support 13 statewide or regional competitive awards among states with historically low NIH funding success [2]. A hallmark feature of these NIH grants is the emphasis on evaluation with a focus on demonstrating the value of investments.
There are several theoretical approaches, tools, and metrics for assessing CTSA and CTR initiatives [3–9]. Current evaluation approaches rely on a range of data collection techniques and sources including, but not limited to, self-reported data, document reviews (e.g., CVs), surveys, administrative data, inventories, bibliometric measures, program tracking tools, and network analysis tools [3]. Yet there are few validated survey instruments that provide individual-level measures, and among the currently published approaches, there is no single tool or data source that captures the full range of individual-level experiences, research supports, and investments to assess changes in one’s research career development, accomplishments, and trajectory [3]. Most qualitative and quantitative data collection tools at the individual level focus on specific areas such as research skills, activities, or capacity [10–15], research benefits or impacts [16–18], research productivity [19–21], research funding or supports [10,22–27], and research infrastructure, operations, or culture [10,15,22]. The varied approaches limit the ability to fully understand potential changes and the factors linked to research success and independence among clinical and translational scientists across the US. There is a need to rigorously measure changes that occur, particularly as early-stage investigators move along the research continuum to achieve research independence and prominence, as well as what changes occur over time as individualized supports (e.g., pilot funding, mentorship, and training) are put in place to foster investigators’ research.
Nationally published guidelines for the evaluation of CTR initiatives funded by the NIH highlight the need for innovative approaches and standardized data collection efforts that generate site-specific findings and allow aggregation across institutions [28]. Moreover, CTSAs have been encouraged to develop evaluations that are prospective, and evaluators have been called on to implement “cutting-edge approaches” and to “gain efficiencies” by sharing resources across sites [28]. In response to the guidelines, we propose a new tool, the Researcher Investment Tool (RIT), built on prior evaluation work [3] and designed to measure individual researcher experience and support received throughout one’s research career. Evaluators may use the RIT to measure changes in researchers’ experiences and support over time based on investments provided through CTSA, CTR, and similar programs. The purpose of this article is to describe the development and psychometric characteristics of this novel RIT.
Methods
Tool development
The RIT was developed by the NNE-CTR Tracking and Evaluation Core based on a review of the literature including existing tools, measures, and approaches designed to assess the experiences, activities, outputs, and impact of an individual’s research. This narrative review has been reported elsewhere and included a total of 136 publications. The review segmented the literature into the following three areas:
1. Frameworks, models, or approaches with no underlying measures. The theoretical constructs and the corresponding descriptions or definitions were recorded.
2. Measures based on primary data collection. All individual items from available surveys, evaluation forms, focus groups, and interview protocols were captured. The response options and reviewer notes about the data collection process and participants were also recorded.
3. Measures/indices based on secondary data (e.g., bibliometric measures). All existing bibliometric measures were reviewed to include the measure name, definition, focus, and type (e.g., individual or organizational), as well as the date published. Administrative sources of data were also reviewed for type, measures, and data collection details (e.g., review of curriculum vitae).
All abstracted data were recorded in a shared spreadsheet with separate tabs based on the three categories listed above. Individual measures or items were placed into a focus area drawn from the literature and identified by the lead evaluator (e.g., research publications, skills or competencies) and linked to their citation. All items were assessed for frequency, similarities, and differences in their use.
Overall, the review revealed widespread variation, a lack of comprehensive tools, and a focus on productivity measures (e.g., publications and funding) and bibliometric measures. Given the nature of CTR awards, the evaluation team used the results of the review to identify prominent domains and underlying measures as well as domains and measures specific to clinical and translational research efforts, including mentorship, community engagement, research collaboration, and institutional support. The tool was drafted by the lead evaluator, and all citations for each item were recorded. New items were created by the lead evaluator, vetted by the evaluation team, and included in the initial cognitive testing phase conducted with senior-level researchers serving in a CTR leadership role.
Characteristics of the tool
The RIT is a 90-item questionnaire with two sections that measure a researcher’s experiences and perceptions on five-point scales. Section one measures experience using a Likert-type response ranging from “1 = no experience” to “5 = extensive experience.” Section two measures researcher perceptions on a similar Likert-type scale ranging from “1 = not at all” to “5 = a great extent.” There is also an optional background section that captures demographic information as well as research role and involvement. On average, the tool takes 10–15 minutes to complete.
As seen in Figure 1, the first two sections of the RIT include eight domains: 1) research skills, 2) service to profession, 3) research productivity, 4) research collaboration, 5) research mentorship, 6) community engagement, 7) research impact, and 8) researcher perceptions of institutional support. Additionally, the research collaboration domain contains two sub-domains, team composition and team collaboration, and the research mentorship domain contains two sub-domains, mentee experience and mentor experience. The items in each domain reflect existing measures compiled from the literature as well as new items added by the NNE-CTR Tracking and Evaluation Core to more fully capture the underlying construct and priorities of clinical and translational research efforts.

Figure 1. Researcher Investment Tool domains.
Researcher Investment Tool domains
Research skills
This domain is defined as the experiences investigators have in conducting activities related to research. There are 12 items that reflect the various skills or competencies researchers rely on, including the conceptualization of a research project [14,29], proposal or grant writing [14,22], regulatory compliance and submission of ethics applications [22,30], research management and oversight [29,31], the collection, management, and analyses of data [32], and the dissemination of research findings including reports and presentations [29,32]. Our team added one item to assess overall experience related to participating in someone else’s research, given the focus on new or early-stage investigators.
Service to the profession
This 13-item domain reflects the experiences investigators have in participating in research-related service activities. The activities may be self-directed or a result of a personal invitation by others. The service activities focus on teaching [11,22], mentoring [33], serving as a reviewer [11], contributing to new guidelines [11], overseeing research [29], participating in national advisory groups or other committees [34], serving as a peer reviewer for manuscripts and grant proposals [11], serving in an editorial role [35], and training students or postgraduates [36]. Additional items were added based on reviews by senior-level researchers serving in a CTR leadership role and a desire to capture common activities that typically occur by invitation as one’s research becomes more visible and one’s reputation grows (e.g., presenting research at a national meeting by invitation).
Research productivity
This 16-item domain draws on some of the most common measures reported in the literature. Productivity measures are frequently used to track individual-level research outputs and investments. The RIT includes items measuring the experiences investigators have in securing funding, contributing to scientific knowledge, and influencing future research, policies, and practices. The tool measures the type and source of research funding [20,27,37–47], including specific NIH support [48]. Publication metrics, publication types, and research citations [14,16,22,38] were included to assess the reach of an individual’s research, and items reflecting research products [49] or the translation of research [16,42] were also included in the RIT. Our team added two items: one to reflect internal funding, an indicator of early-stage support relevant to CTRs, and another to capture inquiries researchers may receive about their work, a common occurrence as investigators’ research becomes more visible.
Research collaboration
This domain focuses on the experiences and activities investigators have working with different types of research teams. The 12 items are based on literature exploring two sub-domains: the composition of teams (seven items) and their activities (five items). The literature focuses on research involving multiple disciplines [50–52], a patient voice [53,54], cross-disciplinary approaches [50,52,55–57], and the role of team members [50,56,58]. Our team added three items reflecting multi-institutional efforts and partner engagement to reflect the scope of our CTR initiative.
Research mentorship
This 15-item domain is defined as the experience investigators have in receiving tailored support provided by a colleague and their experience providing research mentoring. The items are divided into two sub-domains: mentee experience (seven items) and mentor experience (eight items). Understanding the experiences from both vantage points provides a more comprehensive measure [59]. The items were drawn from expert review as well as prior work focusing on receiving mentorship [25,39,60], engaging and connecting with mentors [33,60,61], learning from mentors [60,62], and mapping out a career path [23]. Key components of mentorship that framed this domain included mentoring students and junior researchers [35,39,63], assisting mentees with developing their own grant proposals and resultant funded projects [33,39,48], helping mentees discover new research opportunities [61,64], and introducing mentees to colleagues in the field. One item was added based on expert review to include the development of a mentor/mentee plan.
Community engagement
A core feature of CTR initiatives is the focus on community engagement. This domain includes five items related to the alignment of research with community interests, priorities, or concerns [65–67], efforts to include the community in research [65,66,68], and the dissemination of findings [22,69,70].
Research impact
The need to demonstrate the value of research expenditures has been well documented in the literature [5,17,18,71]. Yet, there are few standardized tools in clinical and translational science that measure long-term impact based on an individual’s research. This six-item domain measures the experience investigators have related to the influence of their research efforts. The items focus on research that has influenced health outcomes [6,17,72], policy [72,73], practice or fields of study [6,17,74], and future research [17,74]. Our team added one additional item to reflect contributions to theory.
Institutional support
This domain contains 11 items and is placed in a separate section because it emphasizes researcher perceptions rather than experience. The domain is defined as the perceptions investigators have regarding the culture, practices, and resources used by an organization to foster research. Items focus on leadership [50,75], seed funding [21,64], designated research time [50], recognition or career advancement [76–78], institutional supports and professional development opportunities [20,60,79,80], multi-disciplinary research [20,56,68,79], and roles, expectations, and rewards [19,46,50,76–78,81].
Validity testing process
Face and content validity
Typically, face and content validity involve reviewing the literature as well as expert review of the items and underlying constructs [54,82–86]. As such, three rounds of validity testing were conducted, and revisions to the initial 87-item draft RIT were incorporated after each round of feedback. The first round of validity testing involved a comprehensive literature review to abstract and code existing published measures. The second round assessed content through cognitive interviewing with two national experts to assess the comprehension, interpretation, and value of each item, the extent to which all items in a domain captured the underlying construct, and the appropriateness of the response options. The third round assessed content validity by surveying 19 senior-level CTR Core Leads and members of the administrative leadership team. The respondents had two weeks to complete the tool and were asked to indicate how well each item measured the [named] domain using a five-point star rating: one star indicated a “poor” match and five stars an “excellent” match with the [named] domain. Similar to other published efforts, an a priori decision was made to remove any items that did not meet our minimum threshold of at least 50% of participants rating the item with four or five stars [6]. In addition to the item rating, respondents were asked to provide open-ended feedback after each domain.
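To make this screening rule concrete, the following is a minimal sketch (not the study’s actual analysis code) of how the 50% retention threshold could be applied; the item names and star ratings are hypothetical and the data are assumed to sit in a pandas DataFrame with one column per item and one row per rater:

```python
import pandas as pd

# Hypothetical ratings: rows = raters, columns = items; values are 1-5 stars.
ratings = pd.DataFrame({
    "item_01": [5, 4, 4, 3, 5, 4, 5, 4, 4, 5],
    "item_02": [2, 3, 2, 4, 3, 2, 3, 5, 2, 3],  # would fall below the cutoff
})

# A priori retention rule: keep an item only if at least 50% of raters
# gave it four or five stars.
pct_high = (ratings >= 4).mean()               # proportion of 4-5 star ratings per item
items_to_drop = pct_high[pct_high < 0.50].index.tolist()
print(pct_high.round(2).to_dict())             # e.g., {'item_01': 0.9, 'item_02': 0.2}
print("Items below threshold:", items_to_drop)
```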
Reliability testing process
We tested the reliability (internal consistency and test-retest reliability) of the RIT by administering the tool to 16 staff/faculty (e.g., Research Navigators, CTR grant staff) across two collaborating institutions. The first round took place in February 2024, and fourteen days later participants were sent the RIT questionnaire for a second round. This interval was long enough that participants were unlikely to recall their earlier answers but short enough that their true responses should not have changed. Data were cleaned and analyzed using the statistical software package Stata 17.
Internal consistency testing
Using the first round of RIT data, we assessed the internal consistency of the eight domains and four sub-domains by calculating Cronbach’s alpha (α) coefficients. In addition, we calculated McDonald’s omega (ω) coefficients as reported in other similar studies [88,89]. While both approaches can be helpful, Cronbach’s α often outperforms McDonald’s ω when sample sizes and numbers of items are small [90]. Domains with a Cronbach’s α or McDonald’s ω coefficient above 0.70 are generally considered acceptable [91].
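For readers who wish to reproduce this type of calculation outside Stata, the sketch below computes Cronbach’s α for a single domain directly from its standard definition; it is an illustrative example with made-up item responses, not the study’s analysis code, and the domain and item names are hypothetical:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item across participants
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the summed domain score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses (rows = participants, columns = items on a 1-5 scale)
domain = pd.DataFrame({
    "skill_01": [1, 2, 3, 4, 5, 2, 3, 4],
    "skill_02": [1, 3, 3, 4, 5, 2, 2, 4],
    "skill_03": [2, 2, 3, 5, 5, 1, 3, 4],
})
print(round(cronbach_alpha(domain), 2))   # values >= 0.70 are generally considered acceptable
```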
Test-retest reliability
Using both the first and second rounds of RIT data, we assessed the test-retest reliability of the eight domains and four sub-domains using Pearson’s correlation coefficient (r). Similarly, correlations above 0.70 are generally considered strong [92].
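The test-retest step can be illustrated in the same spirit; the short sketch below, again using hypothetical domain scores rather than study data, correlates each participant’s mean domain score at the first administration with the score obtained 14 days later:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical mean domain scores for the same participants at test and retest
test   = np.array([2.1, 3.4, 1.8, 4.0, 2.9, 3.1, 2.5, 3.8, 1.9, 2.7, 3.3, 4.2])
retest = np.array([2.3, 3.2, 1.9, 4.1, 3.0, 2.9, 2.6, 3.9, 2.0, 2.8, 3.1, 4.0])

r, p = pearsonr(test, retest)   # r > 0.70 is generally considered strong
print(f"r = {r:.2f}, p = {p:.3f}")
```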
Results
Cognitive interviewing
Based on feedback during the cognitive interviewing, we removed six items and added 10 items to better assess experience across several domains, including research skills, service to the profession, research productivity, research collaboration, research mentorship, and research impact. This left a 91-item RIT for content validity testing; we also updated the wording to clarify the domain definitions.
Content validity
The response rate for the content validity testing was 53% (n = 10/19). Content validity results revealed an overall average across all items of 4.4 on the five-point scale, with “5” being the strongest support. The average rating across domains ranged from 4.0 to 4.8. One item from the research productivity domain was removed based on the a priori cutoff. This item focused on developing a research study website. In addition to the removal of one item, resulting in the final 90-item RIT, we incorporated the following changes to reflect the open-ended feedback received during this process. First, we shifted the placement of the community engagement and the research impact domains. Second, we made slight modifications to questions across all eight domains to decrease ambiguity, provide additional examples, specify intent, promote consistency in wording, and ensure items reflected measures related to biomedical and translational research.
Internal consistency
The response rate for the first round of reliability testing was 100% (n = 16). Items five and six within the research productivity domain were dropped from the analysis due to no variability in responses (all responses indicated 1 “no experience”), which prevented calculation of Cronbach’s α. With these items dropped, the internal consistency of all eight domains was high (see Table 1), with Cronbach’s α ranging from 0.85 (research productivity) to 0.97 (community engagement). As anticipated, McDonald’s ω values were similar to those of Cronbach’s α, ranging from 0.87 (research impact) to 0.97 (community engagement).
Table 1. Item counts and reliability results

1 Items 5 and 6 from research productivity were dropped from the analysis due to no variability in responses. * p < 0.05.
2 One participant was deemed an outlier and dropped from the test-retest analysis because several domains had flipped responses without variation at retest; three participants did not complete the retest. ** p < 0.01.
The response rate for the second round of testing was 81% (n = 13/16); however, one participant was dropped from the analyses due to information bias (when we inquired why there was no variability in numeric values across items within each domain, they reported that their second-round answers were not accurate), leaving 75% (n = 12/16) of respondents in the final analysis. Test-retest results were strong for seven of the eight domains, with significant correlation coefficients greater than 0.70 (p < 0.05), while one domain (research impact) had an adequate but non-significant correlation between 0.50 and 0.60.
Discussion
The RIT provides a first-of-its-kind, standardized, individual-level measure reflecting a broad spectrum of experiences and institutional supports researchers may encounter over time. The RIT domains were drawn from the literature and validated by experts and those involved in clinical and translational research efforts. The psychometric testing revealed strong content validity, acceptable internal consistency across all eight domains, and strong test-retest reliability for all domains except research impact. Possible reasons for the lower test-retest reliability include the difficulty of providing one aggregate rating when research impact may vary dramatically by project, as well as uncertainty regarding the actual impact of one’s research. Additional refinements to this domain may be needed over time.
As evaluators continue to be called on to document investments in support provided by CTSA, CTR, and other research scholar programs, the need to use valid and reliable measures remains important. This tool may be used over time, for benchmarking, and across institutions and programs to document researcher growth using a consistent approach. It also has potential for evaluating associations between the scope and magnitude of customized research support and changes in researchers’ experiences. It can also be administered to a defined cohort of scholars to identify strengths and weaknesses based on the experiences reported and the perceived level of support available. The findings can inform future professional development opportunities, mentorship programs, and pilot funding. Given the comprehensive nature of the tool, the RIT also has potential to measure the research productivity and experiences of investigators at all levels, including mid-career investigators. The complete tool has been made available with the intent that it can be widely adopted by others who are seeking to measure the career trajectory of researchers who are receiving tailored, wrap-around support to advance their research path.
Limitations and next steps
Although the RIT has strong psychometric properties, the tool was developed and tested as part of an evaluation of a CTR initiative. Therefore, the scholars involved and the focus of this program may not be representative of other areas of research. Given the nature of CTR funding, the tool includes a clear focus on community engagement and research collaboration; these domains may not be as relevant to some types of research, and interpretations should be made in context.
In addition to the focus areas, the participants may not have been representative, and the sample size for the validity and reliability testing was small, with limitations that are consistent with other validation efforts [6]. Future studies testing the tool would benefit from a more robust and diverse sample of researchers and additional advanced statistical techniques to analyze the complex relationship between items and domains. Despite efforts to create domains that reflect the underlying constructs and are mutually exclusive, more research is needed. Future studies that calculate content validity index scores and perform factor analysis could provide further insight into the items and the independence of the domains.
The RIT provides a snapshot of cumulative experiences and current supports through self-report and is lengthy due to its comprehensive nature. Repeated administrations of the tool will be needed to assess changes over time, and this approach may prove challenging for busy researchers. Finally, additional testing is needed to determine the time frame over which meaningful changes in each item are likely to be seen and when these changes are best assessed (e.g., years one, three, and five). We plan to conduct longitudinal analyses utilizing the RIT’s ability to measure changes in researchers’ experiences over time and to report the RIT’s sensitivity to detect change over multiple time periods. In addition, more work is needed to understand whether specific domains could be administered at different time points. For example, are there some domains that warrant inclusion in every administration while others are included only in year one and year five? This would shorten the tool and allow scholars more time to gain measurable experience.
Conclusions
The RIT provides a standardized approach for capturing individual-level measures of a broad spectrum of research experience and supports across eight underlying domains. Given the RIT’s strong psychometric properties, clinical and translational science-based initiatives may consider adopting this tool as part of their broader evaluation efforts. The results can be used to monitor research investments designed to strengthen new- and early-stage clinical and translational research scholars.
Supplementary material
The supplementary material for this article can be found at http://dx.doi.org/10.1017/cts.2024.673.
Author contributions
Dr Brenda Joly takes responsibility for the manuscript as a whole. The contributions of each author include the following: Conceptualization of the tool: BMJ; Literature search and review: BMJ, CG, KP; Development of the tool: BMJ, CG; Refinement and testing the tool: BMJ, CG, KC, VSH; Collection of data: BMJ, CG, KC; Analysis of data: KC, VSH; Interpretation of data: BMJ, CG, KC, VSH; Drafting and revising the manuscript: BMJ, CG, KC, KP, VSH.
Funding statement
The research reported here was supported by grant U54 GM115516 from the National Institutes of Health for the Northern New England Clinical and Translational Research Network.