Introduction
The importance of conserving plant genetic resources (PGR) for future generations and their accessibility to contemporary plant scientists and breeders is widely acknowledged (King et al., 2024). Genebanks, managing ex situ collections of PGR, play an essential role in enhancing agricultural production and sustainability (McCouch et al., 2013). According to FAO (2025a), in 2022 the global genebank network conserved approximately 5.9 million accessions across 871 genebanks. Thirteen of these genebanks are international (such as those managed by the CGIAR), six are regional and the remaining 852 are national genebanks – more than half of them located in Europe. Based on the data associated with the material conserved in these genebanks, gap analyses can be conducted to identify areas where plant genetic diversity has been under-collected, enabling the establishment of targeted collecting missions and conservation programmes (Dulloo and Khoury, 2023).
However, it is imperative to examine whether the material housed in genebanks is being adequately conserved and is readily accessible for utilization.
The performance of genebanks is a sensitive issue, with limited public data available regarding genebanks’ efficacy in fulfilling their mandates. However, informal exchanges within the genebank and user communities paint a concerning picture. Considering the essential role of genebanks in safeguarding global food security, as underscored by Target 2.5 of the Sustainable Development Goals (‘By 2030, maintain the genetic diversity of seeds, cultivated plants and farmed and domesticated animals and their related wild species’; UN, 2024a), assessing the reliability of these institutions is key, despite the sensitivity surrounding the issue. Recognizing effective and efficient genebanks as such and supporting less effective ones in better delivery of their fundamental activities should be prioritized. The 11th session of the FAO Intergovernmental Technical Working Group on Plant Genetic Resources for Food and Agriculture highlighted this need in their report: ‘The Working Group also recommends that FAO look into options on how and which capacity-building and evaluation mechanisms could be created to support genebanks in reaching the Genebank Standards and explore the possibility for creating an acknowledgement system’ (FAO, 2023). Additionally, the Plant Genetic Resources Strategy for Europe, as developed by ECPGR (2021), advocates for the establishment of a certification system for genebanks that would assure proper quality of the management of these valuable resources (see also van Hintum and Wijker, 2024).
One of the difficulties in addressing the performance and quality of a genebank is the lack of standard measures and agreed definitions of terms defining the operations. The same basic terms are not always defined in the same way in different genebanks; terms like ‘accession’, ‘base sample’ and related standards differ regarding minimum quantity of seeds per sample, regeneration, storage, safety duplication, phytosanitary tests, availability, etc. The lack of clarity regarding, e.g., what constitutes an accession within a genebank complicates assessments of the state of the collection and possible backlogs in viability testing or regeneration efforts. Often, genebanks have material with different priorities or different statuses. There can also be differences in how to determine when seeds of one accession from various regenerations need viability testing and/or regeneration. While the FAO Genebank Standards (FAO, 2014) offer a solid framework for defining terms and establishing minimum operational procedures, they are subject to interpretation, leading to inconsistencies among different genebanks and even within individual institutions.
This paper contributes to these ongoing discussions by enhancing the understanding of basic terminology employed in genebank operations. It also proposes metrics aimed at providing a more comprehensive overview of genebanks’ status. The idea of the proposed metrics is to have a set of established and easy-to-calculate parameters that can inform genebank management and help focus efforts and resources. The metrics can also be an important tool for data transparency and sharing among institutions and countries, facilitating standardization, collaboration and reporting.
Basic concepts
A genebank can be defined as an organization dedicated to the long-term conservation of PGR for the benefit of future generations of users while ensuring accessibility to the current generation. Conservation within this context can be defined as the maintenance of PGR material in accordance with established standards such as the FAO Genebank Standards (FAO, 2014). Accessibility, or availability, involves providing users with access to the conserved material under the Standard Material Transfer Agreement (SMTA) of the International Treaty on Plant Genetic Resources for Food and Agriculture or other Material Transfer Agreements (MTAs) (FAO, 2015). Availability also encompasses considerations such as the phytosanitary status of the material, seed quantity and viability, and capacity to overcome complex and changing legal restrictions for seed movement across borders.
These conservation and availability concepts also involve other important practical aspects for genebank management. Further elaboration and refinement of these definitions are essential to address potential disputes and complexities. Establishing common definitions and standards will require much discussion and collaboration across multiple stakeholders. The metrics outlined below will build upon the previously formulated definitions and presume that the genebank has implemented protocols and procedures for its operations, often referred to as Standard Operating Procedures (SOPs). For instance, a metric like the ‘number of accessions requiring germination testing’ presupposes the existence of a protocol outlining which accessions need testing, while the ‘number of performed germination tests’ also assumes that these tests were carried out in accordance with an SOP. It is important to note that while metrics provide quantifiable data, they do not inherently reflect the quality of the SOPs of an institution. Ideally, these SOPs adhere to internationally recognized standards such as the aforementioned ‘FAO Genebank Standards’.
The proposed metrics align with the framework outlined by Lusty et al. (2021), who describe a ‘Performance Management System for Long-Term Germplasm Conservation’ within the CGIAR context. The quantitative key performance indicators formulated by the Global Crop Diversity Trust and presented in that paper correspond closely to the metrics proposed here, insofar as they pertain to accessions. Broader targets, such as the implementation of a Quality Management System, extend beyond the scope of these metrics. Notably, all accession-based statistics required for the CGIAR’s online reporting tool (ORT) can be derived from the proposed metrics. The proposed metrics can thus be seen as a formalization of the metrics used in the ORT. However, clear community-based definitions for fundamental concepts such as ‘accession’ and ‘base sample’, also adopted outside the CGIAR community, can be expected to improve consistent online reporting within and outside the CGIAR system.
Defining terms like ‘accession’ and ‘base sample’ is essential for clarity and consistency within the context of genebank operations. An accession can be defined as a PGR unit corresponding to a sample of a cultivated variety, landrace or wild population that is managed according to a protocol that aims at its long-term conservation and timely availability for distribution to users. Ideally, genebanks should aim to conserve their accessions in perpetuity and assure they are available for immediate distribution. This definition does not include material that is on the waiting list to become an accession but not yet fully part of the collection (e.g. material recently collected or acquired that has still to complete all the processes of documentation and seed processing to be considered an accession), or material that used to be in the collection but is, for whatever reason, not fully managed anymore (e.g. historical or partially curated and archived accessions, sensu Hanson et al., 2024).
Several genebanks conserve more than one seed sample per accession; oftentimes, these samples are conserved under different storage conditions and are meant for a different use. The ‘base sample’ is the material of an accession that is stored under long-term storage conditions (LTS) and regularly monitored, according to the relevant protocols, and that will be used ultimately for regenerating the new base sample of this accession (FAO, 2014), i.e., it is the ‘life-line’ of the accession. The ‘active sample’ is the seed material of an accession used for distribution and research and may be stored under medium-term storage conditions (MTS) (FAO, 2014). It is important to note that not all genebanks distinguish between ‘active’ and ‘base’ collections (i.e. in many genebanks, all seed samples and accessions are stored under LTS conditions). Recent scientific research on seed longevity in different genebanks demonstrated that conservation under LTS conditions significantly increases seed longevity when compared to MTS conditions of dry seeds (see, e.g., Hay et al., 2013; van Treuren et al., 2018; Guzzon et al., 2021). Conservation of samples of the same accession under LTS and MTS temperature conditions can therefore be counterproductive, significantly diverting resources and activities that should be dedicated to seed conservation at optimal temperature conditions (LTS) to maximize seed longevity and minimize the need for regeneration and viability testing (Guzzon et al., 2021). For these reasons, we are focusing these metrics on ‘base’ collections.
Other technical genebank terms will be elucidated within the definitions of the metrics below.
Principles of proposed genebank metrics
Genebank metrics aim to describe the status of a genebank collection and the activities in the genebank. They do not describe generic institutional aspects such as funding and number of staff. The status of a genebank generally relates to a specific moment in time, e.g. the number of accessions. On the other hand, activities often relate to specific intervals of time, e.g. number of newly added accessions. When calculating these metrics, it is crucial to clearly specify both the moment of measuring and the period covered. A default period commonly utilized is a five-year timeframe. For momentary metrics, the end of the reporting period typically serves as the default moment for calculation. However, if a different moment or period is utilized, it should be explicitly indicated to ensure transparency and clarity in reporting.
Apart from the distinction between the metrics related to a moment and those related to a period, some metrics can be considered basic, possibly even mandatory. On the other hand, others can be regarded as optional, as they are further elaborations of the basic ones. An example of a mandatory metric is the number of accessions. This is a fundamental metric, whereas, e.g., the number of landraces or accessions originating from Asia can be considered optional elaborations – important but not as essential as the basic ones.
Genebank metrics should meet several criteria to allow for wide acceptance and adoption. First, they should capture fundamental concepts applicable to all genebanks, ensuring relevance and universality. Secondly, calculating the metrics should be relatively straightforward and feasible for any well-organized genebank, minimizing complexity and technical barriers. Additionally, the value of the metrics should ideally be derived from digitized data available in most genebank databases. This facilitates automation through the development of scripts or algorithms that can be written once and executed whenever needed. Once the script is established, users would only need to input parameters such as the reporting period and moment, streamlining the generation of metric values. This standardized approach not only facilitates monitoring and evaluation of genebank developments internally but also enables consistent reporting to funding agencies and international organizations like the FAO for initiatives such as the ‘State of the World’ reports (see, e.g., FAO, 2019, 2025b).
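The idea of a script that only takes the reporting moment and period as input can be illustrated with a minimal Python sketch. It is not part of the proposed metrics: the record layout (`AccessionRecord`, `date_in`, `date_out`) and the function names are hypothetical, and a real genebank database will store this information differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Tuple


@dataclass
class AccessionRecord:
    """Hypothetical minimal record; real genebank databases will differ."""
    accession_id: str
    date_in: date                    # date the material became an accession
    date_out: Optional[date] = None  # date it left the collection, if ever


def reporting_period(moment: date, years: int = 5) -> Tuple[date, date]:
    """Return (start, end) of a period of `years` years ending at `moment`."""
    try:
        start = moment.replace(year=moment.year - years)
    except ValueError:
        # `moment` is 29 February and the start year is not a leap year.
        start = moment.replace(year=moment.year - years, day=28)
    return start, moment


def accessions_at(records: Iterable[AccessionRecord], moment: date) -> int:
    """Momentary count: accessions that are part of the collection at `moment`."""
    return sum(
        1
        for r in records
        if r.date_in <= moment and (r.date_out is None or r.date_out > moment)
    )
```

With such functions in place, producing a report for another year would only require changing the `moment` argument; the five-year default follows the default period mentioned above.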
In defining genebank metrics, terms or concepts are often derived from the FAO/Bioversity Multi Crop Passport Descriptors (Alercia et al., 2015), a generally adopted list of ‘passport descriptors’ describing origin and identity of the accession. However, it is important to acknowledge that certain concepts, like ‘country of origin’, may present conceptual challenges (that extend beyond the scope of this paper). Addressing such conceptual challenges might require broader discussions and consensus-building efforts within the genebank community.
Finally, the metrics outlined in this paper primarily pertain to genebanks conserving orthodox seeds (i.e. seeds that can tolerate drying to low moisture content and subsequent freezing; Li and Pritchard, 2009), which currently represent the most common form of PGR conservation. Adapting the metrics for use with field-, in vitro- or cryo-collections or for in situ conservation projects will necessitate some adjustments to account for the unique characteristics and challenges associated with these conservation methods. This may involve collaborating with experts in each respective field and drawing upon established best practices to develop comprehensive and meaningful metrics for assessing the status and activities of genebank collections conserving recalcitrant species and/or clonal crops as well as in situ conservation programmes.
Validation process
Earlier versions of the list of genebank metrics originally developed by the Centre for Genetic Resources, The Netherlands (CGN) were shared with colleagues (genebank curators and database managers) from 14 institutions (13 genebanks and the Global Crop Diversity Trust; Table 1) for further development and validation of the metrics. Genebank curators and database managers filled in the metrics with the data of the genebank they manage and provided additional feedback on (1) the usefulness of this tool, (2) metrics that were not clear, (3) important metrics that were missing and (4) metrics considered redundant. After these iterations and inclusion of the feedback received, the initial list of metrics was amended into the version presented in this paper.
Table 1. Institutions that reviewed and validated the genebank metrics. FAO WIEWS codes are provided for all genebanks. IPK genebank collections are conserved in three different stations, each with its own WIEWS code. The column country refers to the country where the headquarters and/or main genebank of each institution is located (for institutions that operate in multiple countries).

The genebank metrics
Genebank management encompasses a wide array of activities and aspects, which can be grouped in various ways depending on the focus and objectives of the assessment. Here, the following thematic groups are considered: (1) size and composition of the collection, (2) data and documentation, (3) conservation, (4) availability and (5) distribution. For each category, the proposed metrics have been given coded names, complete names and brief descriptions. It is also indicated if the metrics are periodic (i.e. referring to a specific period of time or reporting interval) or momentary (reflecting the current status of the collection), and whether they should be considered mandatory (i.e. covering fundamental aspects of genebank management that should be readily available to the curators) or optional (covering further elaborations of mandatory metrics). See ‘Supplementary material 1’ for an overview of the genebank metrics proposed in this paper.
Size and composition of the collection
Many of the metrics relate to properties of the seed accessions. An accession in this context should be seen as the basic unit of conservation of the PGR collection, representing, as mentioned, a sample of a cultivated variety, landrace or wild population. It is an integral part of the collection and thus should be conserved and made available following the procedures highlighted in the SOPs of the genebank. In this context, material that is not yet or no longer fully included in the collection is not considered accessions. This refers, for instance, to new introductions that have not yet been fully accessioned as part of the collection or archived or historical accessions. Once the total number of accessions is known, it might be interesting to know more about the nature of these accessions: the biological status (variety, landrace, wild, etc.), the continent or country of origin and possibly how long the accessions have been in the collection, etc.

The number of accessions (NACC) in the collection, conserved under the SOPs of the genebank. This metric is the basis of all metrics in this category. It is the only mandatory metric in this first category of ‘Size and composition of the collection’.

These four metrics are further specifications of NACC and are all calculated similarly to NACC, only with the added requirement of the biological status (based on the FAO/Bioversity Multi Crop Passport Descriptors: ‘100’, ‘110’, ‘120’ or ‘200’ for wild or weedy populations, ‘300’ for traditional cultivars or landraces, ‘410’, ‘411’, ‘412’, ‘413’, ‘414’, ‘415’, ‘416’, ‘420’, ‘421’, ‘422’ and ‘423’ for breeding and research material and ‘500’ for advanced or improved cultivars; Alercia et al., 2015). The difference between the sum of these four metrics and the number of accessions indicates the number of accessions without a known biological status or with a status that is not included in these metrics.
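The grouping of biological status codes described above could be implemented along the following lines. This is a sketch under stated assumptions: the code sets are copied from the text, but the group labels and function names are hypothetical and are not the official metric codes.

```python
from collections import Counter
from typing import Iterable, Optional

# Code sets as listed in the text; the group labels are hypothetical names.
STATUS_GROUPS = {
    "wild_or_weedy": {"100", "110", "120", "200"},
    "landrace": {"300"},
    "breeding_or_research": {
        "410", "411", "412", "413", "414", "415", "416",
        "420", "421", "422", "423",
    },
    "advanced_cultivar": {"500"},
}


def status_group(sampstat) -> Optional[str]:
    """Return the group of a biological status code, or None if not covered."""
    code = "" if sampstat is None else str(sampstat).strip()
    for group, codes in STATUS_GROUPS.items():
        if code in codes:
            return group
    return None


def count_by_status(sampstats: Iterable) -> Counter:
    """Count accessions per group; anything else is counted as 'unclassified'."""
    counts = Counter()
    for sampstat in sampstats:
        counts[status_group(sampstat) or "unclassified"] += 1
    return counts
```

The 'unclassified' count corresponds to the difference between NACC and the sum of the four metrics, i.e. accessions with a missing status or a status outside the listed codes.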

A further specification of NACC, calculated similarly to NACC, only with the added requirement that the accession originates in the country where the genebank is located. An accession is considered as originating in a specific country when it was collected there (in the case of wild or weedy populations and traditional varieties or landraces) or developed there (in the case of breeding and research material and advanced cultivars). This metric indicates the level at which the genebank is oriented towards national material.

These six metrics are further specifications of NACC, calculated similarly to NACC, only with the added requirement that the origin country is located in a specific continent. A continent division of the (origin-)countries is required to calculate these metrics. The classification of the United Nations geoscheme (UN, 2024b) should be used, where all continents are distinguished except for North America and South America. North America can be considered to consist of the regions ‘Northern America’, ‘Central America’ and the ‘Caribbean’ and South America of the region ‘South America’ (see Supplementary Material 2).
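The continent division described above can be sketched as a small lookup function. The function name and the way the UN region names are passed in are assumptions made for illustration; Supplementary Material 2 remains the reference for the actual country list.

```python
from typing import Optional

# Regions of the UN geoscheme that are used as continents without change.
WHOLE_CONTINENTS = {"Africa", "Asia", "Europe", "Oceania"}
# Parts of the UN region 'Americas' that are counted as North America.
NORTH_AMERICA_PARTS = {"Northern America", "Central America", "Caribbean"}


def continent(un_region: str, un_area: str) -> Optional[str]:
    """Map a UN region and its most specific area name to one of six continents.

    `un_area` is only inspected for countries in the UN region 'Americas'.
    Returns None when the combination is not covered by the scheme.
    """
    if un_region in WHOLE_CONTINENTS:
        return un_region
    if un_region == "Americas":
        if un_area in NORTH_AMERICA_PARTS:
            return "North America"
        if un_area == "South America":
            return "South America"
    return None


# Illustrative excerpt only, keyed by ISO 3166-1 alpha-3 country code.
EXAMPLE_COUNTRIES = {
    "NLD": ("Europe", "Western Europe"),
    "MEX": ("Americas", "Central America"),
    "CUB": ("Americas", "Caribbean"),
    "CAN": ("Americas", "Northern America"),
    "PER": ("Americas", "South America"),
}
```

Counting accessions per continent then reduces to applying `continent` to the origin country of each accession and tallying the results.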

The only metric in this category with a time dimension. Per accession, the year of entering the genebank collection (becoming an accession) is subtracted from the year of the moment of reporting.
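As a minimal illustration of this subtraction, the following sketch computes the number of years in the collection per accession and the average over a set of accessions. The function name and the example years are hypothetical.

```python
from statistics import mean
from typing import Iterable, List


def years_in_collection(entry_years: Iterable[int], report_year: int) -> List[int]:
    """Per accession: year of reporting minus year of becoming an accession."""
    ages = []
    for entry_year in entry_years:
        if entry_year > report_year:
            raise ValueError("accession entered after the moment of reporting")
        ages.append(report_year - entry_year)
    return ages


# Three hypothetical accessions, reported at the end of 2024.
ages = years_in_collection([1985, 2004, 2024], report_year=2024)
average_age = mean(ages)
```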

The only two metrics in this category measured over a period of time. The first is a specification of NACC, calculated similarly to NACC, only with the added requirement that the moment of entering the genebank collection (becoming an accession) falls in the reporting period. The second metric, the number of accessions removed from the collection, is of a slightly different nature as it is not a property of the accessions in the collection. Accessions can be removed from the collection for various reasons, e.g., because they appear to be duplicates of other accessions in the collection. Accessions classified as archived or historical in the reporting period should thus be considered as removed from the collection.
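The two periodic metrics could be computed as in the sketch below. It assumes, hypothetically, that each row holds the date of entering the collection and, where applicable, the date of removal or reclassification as archived or historical; the convention that the period excludes its start date and includes its end date is also an assumption.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# (accession_id, date_in, date_out); date_out is None while the accession is
# in the collection, otherwise the date of removal or archiving.
Row = Tuple[str, date, Optional[date]]


def added_in_period(rows: Iterable[Row], start: date, end: date) -> int:
    """Number of rows whose date of entering the collection falls in (start, end]."""
    return sum(1 for _, date_in, _ in rows if start < date_in <= end)


def removed_in_period(rows: Iterable[Row], start: date, end: date) -> int:
    """Number of rows removed or archived in (start, end]."""
    return sum(
        1
        for _, _, date_out in rows
        if date_out is not None and start < date_out <= end
    )
```

Note that the second count only works if records of removed material are retained in the database together with a removal date.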
Data and documentation
The quality of the documentation of the accessions is challenging to measure. Simply counting datapoints does not yield relevant information. In the case of passport data, for example, a vernacular name is very important for traditional varieties and landraces, but irrelevant for wild populations, whereas the location of collecting is crucial for a wild population and less relevant for a modern variety. For passport data, this issue was addressed with the introduction of the Passport Data Completeness Index (PDCI), which weighs the importance of certain datapoints depending on the biological status of the accession (van Hintum et al., 2011). As a result, the PDCI gives a good indication of the completeness of passport data, but it says nothing about the reliability of these data. Regarding the phenotypic data, often referred to as characterization and evaluation data in the genebank community, it is more difficult, and possibly a simple count of the number of datapoints of this type is the best that can be done. Hopefully, in the coming years, this can be formalized into a metric indicating the number of datapoints that comply with a proper standard for this type of information, such as MIAPPE (Papoutsoglou et al., 2020), but that has not yet been achieved. Many genebanks are generating genomic data, which can greatly increase the usability of the accessions (e.g. allowing genome-wide association studies, allele mining and predictive breeding) and improve the genebank management (e.g. by defining core-collections or subsets, Sansaloni et al., 2020). A metric is proposed to indicate how many accessions have genomic data that are accessible to users (either in public repositories or upon request to the genebank curators). An additional option that can easily be added is the number of accessions that received a Digital Object Identifier (DOI). The DOIs were introduced as a standard to increase the possibility of univocally identifying genebank accessions, linking information sources and monitoring the flow of PGR (Alercia et al., 2018). Finally, all these data should (at least in principle) be publicly available to be reported, as access to data is the first step towards access to the material – this should be part of an SOP and is not reflected in a metric.

The calculation of the PDCI is done per accession and requires a simple script based on the paper about this metric (van Hintum et al., 2011). The value has a maximum of 10, which indicates complete passport documentation. For genebanks that contribute data to Genesys, this index can be automatically retrieved from this information system (Genesys, 2024). Supplementary Material 3 provides guidance in calculating the PDCI.

This metric refers to the average number of characterization and/or evaluation datapoints per accession of the collection, which are readily available in an information system. This obviously depends on the number of digitally stored and accessible datapoints (observations), which could miss the historical data stored on paper or otherwise.

This metric refers to the number of accessions for which genomic data were generated and are accessible internally (to the genebank) and externally by users (via public repositories or upon request to the genebank). For genebanks holding diverse genomic data, further specifications of this metric can be included, e.g. number of accessions with genomic marker data and number of accessions with genomic sequence data.

The DOI, a unique identifier of the accession, helps to link the information about accessions from different sources. DOIs are assigned by the secretariat of the ITPGRFA or other providers (Alercia et al., 2018).
Conservation
As mentioned before, accessions in a PGR collection should be managed following the SOPs of the genebank. If these SOPs comply with the international standards (FAO, 2014), proper conservation should be assured. However, it is essential to monitor the conservation status by indicating the level of activity regarding viability tests and regenerations, and to provide information on seed quality, available seed quantities and the status of the safety back-up of the accessions.

This metric is calculated only on data collected from base samples (i.e. one sample per accession intended for long-term conservation), indicating if the base sample has been tested in the reporting period. Tests on active samples (e.g. intended for short- to medium-term conservation and/or distribution) can also be counted here, provided that they give a direct estimate of the viability of the base sample.

This metric relates to any regeneration of the accession (including also non-base samples) in the reporting period, indicating if new seeds of the accession have been produced.

This metric indicates the average age of the base sample per conserved accession, i.e. the difference in years between the year of storage of the sample and the year of reporting.

These two metrics entirely depend on the SOPs of the genebanks that identify the time intervals for viability testing, the viability thresholds and the minimum seed quantity per sample that trigger regeneration activities. The comparison with the other Conservation metrics, indicating the number of viability tests and regenerations done, gives an overview of potential backlogs in the genebank operations.

Two metrics further specify the previous metric (which gives the number of regenerations needed). They provide an overview of the needs for regeneration. These two metrics also entirely depend on the SOPs of the genebanks that identify the viability thresholds and the minimum seed quantity per sample that trigger regeneration activities.
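How SOP-defined triggers translate into these counts can be illustrated with the following sketch. All names are hypothetical and the rule values used in the example are placeholders, not recommendations; treating a never-tested sample as needing a test is likewise an assumption that a genebank's SOP may handle differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SopRules:
    """Hypothetical SOP parameters; each genebank defines its own values."""
    test_interval_years: int    # maximum number of years between viability tests
    viability_threshold: float  # germination percentage below which to regenerate
    min_seed_count: int         # seed quantity below which to regenerate


@dataclass
class BaseSample:
    accession_id: str
    last_test_year: Optional[int]  # None if never tested
    viability: Optional[float]     # last measured germination percentage
    seed_count: int


def needs_viability_test(sample: BaseSample, rules: SopRules, report_year: int) -> bool:
    """True if the sample was never tested or the test interval has elapsed."""
    if sample.last_test_year is None:
        return True
    return report_year - sample.last_test_year >= rules.test_interval_years


def regeneration_reasons(sample: BaseSample, rules: SopRules) -> List[str]:
    """Return the reasons (possibly none) why the sample needs regeneration."""
    reasons = []
    if sample.viability is not None and sample.viability < rules.viability_threshold:
        reasons.append("low_viability")
    if sample.seed_count < rules.min_seed_count:
        reasons.append("low_quantity")
    return reasons
```

Counting the samples for which `regeneration_reasons` is non-empty gives the number of regenerations needed, while counting per reason gives the two further specifications. Comparing these numbers with the tests and regenerations actually performed indicates the backlog.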

These metrics indicate the level of safety backup (e.g. duplication of accessions in an external genebank) and the nature of this backup. Internal duplication (e.g. the same accessions conserved in both active and base collections in the same location by the genebank) is, in general, not considered as safety duplication and is not included in the framework of this metric. If accessions are duplicated under long-term storage conditions in at least one different location (including another genetic resources centre of the same institution in a different location), this is considered safety duplication and can be reported in these metrics. The first metric, indicating that an accession is either duplicated in another genebank or location, and as a second level of safety, in the Svalbard Global Seed Vault (see Asdal, 2025), is mandatory.
Availability
In principle, all accessions in the genebank collection should be available for distribution. However, this is not always the case; even if the SOPs are followed, it is possible that genebank accessions are (temporarily) not available for distribution to users. A genebank can run out of seeds for some accessions because of a sudden increased demand for specific material, a sudden decrease in viability of the seeds or simply mistakes in the operations. It is also possible that, due to a new phytosanitary issue, some materials cannot be distributed until retested. For this reason, metrics indicating the number of accessions that are not available, together with the reason for non-availability, can be very useful.
This is a problematic category since availability usually depends also on some characteristics of the requestor; legal or phytosanitary reasons can make it impossible to distribute material to specific countries or users. However, strict interpretation of these terms should be used.

The first metric, the only mandatory metric in this category, indicates that seeds of the accession are ready for distribution under SMTA or other MTAs in terms of seed amount and quality, phytosanitary requirements and legal status. The other three metrics explore the reasons for the unavailability of accessions and specify how many accessions are not available due to various issues.
Legal unavailability implies that material cannot be distributed under an SMTA or other MTAs. Phytosanitary unavailability implies that it cannot be moved according to phytosanitary regulations in the country of the genebank. The final reason, lack of quality seeds, implies that the accession does not have sufficient seeds. Moreover, some genebanks do not distribute seed samples if accessions do not reach the viability threshold for distribution established by the SOP of the genebank describing the distribution process.
Distribution
Even if the material is available, that does not mean it is actually requested, distributed and finally used. Defining use for genebank accessions is complex; therefore, the most straightforward interpretation of ‘number of samples distributed from the genebank’ can be used as a proxy to clarify the utilization of the collections. This includes ‘internal use’ of the genebank for routine genebank operations, e.g. germination testing, regeneration and safety duplication. It is essential that this component of internal use is separately quantified. Other metrics indicating the amount and type of use are relevant, such as the number of times accessions have been distributed per year they were in the collection, or metrics indicating the type and location of the users.

These metrics indicate the total use of material in the genebank. The third metric, the only mandatory metric in this category, shows the external distribution numbers in the reporting period. Only the first metric in this category includes internal use; internal use refers to the use of the genebank material for the management of the genebank, i.e. germination testing, regeneration and characterization. The use of accessions in scientific research or breeding activities done by the genebank or associated programmes in the same institution is not considered as internal use. The first two metrics refer to the distributions carried out by the genebank across its history to indicate the level at which the accession has been used in the past. As several genebanks have a long history encompassing several decades of activity and early distribution accounts might not be available or digitized, a starting date later than the beginning of the genebank operations can be used.

This metric, which excludes internal genebank use, indicates the ‘popularity’ of the genebank accessions as a rate of distribution over time. It is calculated by dividing the number of times each accession was distributed by the number of years this same accession has been part of the collection. As mentioned above, for genebanks that do not have a complete historical record of all their distributions, a starting date later than the beginning of the genebank operations, or a specific reporting period, can be proposed.
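The calculation described above can be illustrated with a minimal Python sketch. The function name `distribution_rate`, its parameters and the use of 365.25 days per year are assumptions made for illustration only and are not part of the proposed metric definitions. The optional `records_start` argument reflects the later starting date that a genebank without complete historical records might choose.

```python
from datetime import date
from typing import Optional


def distribution_rate(n_external: int, acquired: date, as_of: date,
                      records_start: Optional[date] = None) -> float:
    """External distributions per year that the accession was in the collection.

    n_external must count only external distributions (no internal genebank
    use) made between the effective start date and as_of.
    """
    # If distribution records only exist from a later date, count years from there.
    start = acquired if records_start is None else max(acquired, records_start)
    years = (as_of - start).days / 365.25  # assumed average year length
    if years <= 0:
        raise ValueError("as_of must be later than the effective start date")
    return n_external / years


# Illustrative values only: 5 external distributions over 20 years.
rate = distribution_rate(5, acquired=date(2004, 1, 1), as_of=date(2024, 1, 1))
print(round(rate, 2))
```

When a later starting date is used, the distribution count has to be restricted to the same period, otherwise the rate is overestimated.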

A further specification of DIS_EXT, to indicate the (inter-)national orientation of the genebank distributions. This metric excludes internal genebank use.

Further subdivisions of DIS_EXT. These metrics exclude internal genebank use. To calculate these metrics, a continent division of the countries of the users is needed. The classification of the United Nations geoscheme can be used, where all continents are distinguished except for North America and South America (UN, 2024b). North America can be considered to consist of the regions ‘Northern America’, ‘Central America’ and the ‘Caribbean’ and South America of the Region ‘South America’ (see Supplementary Material 2).
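The continent division described above could be implemented along the following lines. This is a sketch only: the function name `continent` and its three arguments are hypothetical, and it assumes that each user country has already been annotated with the region, sub-region and intermediate region labels of the United Nations geoscheme.

```python
# Labels that the text above assigns to North America.
NORTH_AMERICA_PARTS = {"Northern America", "Central America", "Caribbean"}


def continent(region: str, sub_region: str = "", intermediate_region: str = "") -> str:
    """Map UN geoscheme labels of a user's country to a continent."""
    if region != "Americas":
        # All other continents are already distinguished at region level.
        return region
    labels = {sub_region, intermediate_region}
    if labels & NORTH_AMERICA_PARTS:
        return "North America"
    if "South America" in labels:
        return "South America"
    raise ValueError("cannot assign a continent within the Americas")


# Illustrative call: a country in the intermediate region 'Central America'.
print(continent("Americas", "Latin America and the Caribbean", "Central America"))
```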

Further subdivisions of DIS_EXT. No formal categorization of users exists, and we are proposing the following subdivisions: private companies, public institutions, NGOs and private individuals. Public institutions include universities, public research centres and public genebanks.
Calculation, analysis and presentation
The proposed metrics can be calculated on any level of aggregation: the entire collection, per crop or even per accession. When done per accession, for example, the ‘number of accessions’ (NACC) is obviously 1, and for a landrace, the ‘number of accessions of wild or weedy populations’ (NACC_PW) will be 0. This approach allows for an aggregation that can be decided later; it can be aggregated per crop, per population type, per continent of origin or any other aggregation. The only complication when choosing this accession-wise approach is the metric ‘Number of accessions removed from the collection’ (NACC_OUT) since this metric cannot be calculated with the information about the material currently in the collection.
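The accession-wise approach can be sketched in Python as follows. The record fields (`crop`, `population_type`), the category labels and the three example accessions are invented for illustration; only the metric codes NACC and NACC_PW are taken from the text. Each accession contributes 1 to NACC and 0 or 1 to NACC_PW, and the level of aggregation is chosen afterwards.

```python
from collections import defaultdict

# Invented example records; a real genebank would read these from its database.
accessions = [
    {"crop": "lettuce", "population_type": "landrace"},
    {"crop": "lettuce", "population_type": "wild"},
    {"crop": "spinach", "population_type": "landrace"},
]


def accession_metrics(acc):
    """Metric values for a single accession."""
    is_wild_or_weedy = acc["population_type"] in {"wild", "weedy"}
    return {"NACC": 1, "NACC_PW": 1 if is_wild_or_weedy else 0}


def aggregate(records, key):
    """Sum the per-accession metrics for each value of the chosen field."""
    totals = defaultdict(lambda: defaultdict(int))
    for acc in records:
        for name, value in accession_metrics(acc).items():
            totals[acc[key]][name] += value
    return {group: dict(metrics) for group, metrics in totals.items()}


per_crop = aggregate(accessions, "crop")
print(per_crop)
```

The same records can be aggregated per population type by calling `aggregate(accessions, "population_type")`. A metric such as NACC_OUT cannot be derived in this way, because removed accessions are no longer among the records.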
Once the proposed metrics have been determined, they can be analysed and presented in various forms and for multiple purposes. They can be tabulated in a spreadsheet, transforming absolute numbers into percentages. Examples are given in Figures 1 and 2 and in Supplementary Material 4.

Figure 1. Small part of a screen dump of the spreadsheet that presents the genebank metrics for the CGN collection (see Supplementary Material 4 for the full spreadsheet) focusing on the base sample management of several crops. The columns are calculated based on the values (the absolute numbers) or a combination of values (the percentages) of the metrics aggregated on the crop level.

Figure 2. Small part of a screen dump of the spreadsheet that presents the genebank metrics for the CGN collection (see Supplementary Material 4 for the full spreadsheet) focusing on the destination of the distributed samples of several crops. The column ‘#out 2020-2024’ indicates the number of samples distributed in this period (not for internal genebank purposes).
In Figure 1, it is, among other things, immediately visible that relatively many germination tests were done in the reporting period of five years, given that the normal frequency of germination testing per stored seed sample at CGN is once every ten years. It can also be seen that most of the regeneration needs are due to low germination. The Cruciferae and Allium collections have relatively large regeneration needs – these crops are difficult to regenerate and might deserve extra attention to avoid bottlenecks in the future.
Figure 2 shows the ‘popularity’ of the crops and the destination of the material. It becomes, for example, immediately clear that despite the small size of the spinach collection, its popularity has been extremely high – this might call for expansion of this collection.
This type of data presentation provides important information for the curators and the genebank manager, as described in the examples above, and thus provides the business intelligence of a genebank. Having a more dynamic presentation of the data would be even more interesting. As the information systems used by genebanks differ, it would always be useful to have an interface allowing the user to drill down to more detailed aggregation levels, for example, starting at the collection level, going to the agricultural/horticultural crop level, the crop level within the horticultural crops, the crop types within the lettuce collection and the geographical origins of the butterhead lettuce (for an illustration of such a hierarchical structuring of PGR collections, see van Treuren et al., Reference van Treuren, Engels, Hoekstra and van Hintum2009). At each level, the metrics could generate important insights, for example, the division between the use by commercial versus private users, or the popularity in Asia. Such dynamic interfaces could create insights into the status and use of collections that are otherwise very difficult to obtain.
Keeping records of the metrics over time would furthermore allow an analysis of developments of the collection and allow better planning of activities such as regeneration and viability testing. More importantly, easy visualization and presentation of germplasm utilization trends would allow better strategic planning of the genebank operations, such as prioritization of the acquisition of new material, funding, staffing and infrastructure needs. A finalized and agreed list of genebank metrics could be incorporated in the major information systems used by genebanks (e.g. GRIN-Global), as well as public PGR information repositories (e.g. EURISCO and Genesys), to facilitate automatic calculation of the metrics and ease data transparency and reporting, thus enhancing the appreciation of genebanking, improving its global coordination and upgrading its capability for strategic planning.
Discussion
Transparency regarding the composition, management and use of genebank collections would significantly contribute to optimizing and rationalizing genebank management and operations. It could also be the basis for reporting and strengthening collaborations. The proposed metrics represent broad concepts, relevant to any genebank conserving orthodox-seeded species, and should be relatively easy to calculate by any genebank.
Agreed concepts are required for these metrics and for all communication regarding genebank composition, management and use. Basic concepts that require shared and agreed definitions are, for example, ‘accession’ and ‘base sample’. Transparency regarding the SOPs of the genebank is also essential for these metrics, but also for improvement of the genebank quality and, ultimately, genebank certification (van Hintum and Wijker, Reference van Hintum and Wijker2024).
The proposed list of genebank metrics is undoubtedly not the end point of the development of standardized genebank metrics but could serve as a positive step forward. The proposed metrics, which are inspired by reporting requirements for the FAO State of the World reports and the online reporting tool for CGIAR genebanks, could spark the discussion needed to facilitate a broadly accepted list of metrics within the genebank community. Coordinated use of the metrics would support the management of the genebank on the one hand and ease reporting and auditing to improve the communication with funders, policymakers and the user community on the other. We believe that such a coordinated effort could improve the genebanks’ delivery worldwide.
After their development at the CGN genebank, the metrics quickly revealed insights that improved management. Previously less explicit issues regarding the collection’s composition and use became apparent. Documentation metrics highlighted the strong focus on leafy and fruit vegetables, while usage data showed stark differences among crops – many were used less than 0.1 times per year, compared to over 0.4 for leafy and fruit vegetables, with spinach reaching 1.0. The metrics also clarified future priorities, such as addressing the significant regeneration backlog in the Cruciferae collection.
This first set of metrics was developed for orthodox seed collections in genebanks. Further discussions in the PGR community will be needed to optimize and validate metrics for the other ex situ conservation methodologies of PGR (i.e. in vitro conservation, cryopreservation and field collections) as well as community seed bank networks and in situ conservation (networks of genetic reserves and on-farm conservation). Clearly, the development of future metrics should be informed by the practical experiences of the communities involved. For example, the CGIAR genebank community has already implemented performance indicators for other ex situ conservation methodologies, which could serve as a foundation for the formulation of new, relevant metrics. From preliminary discussions held during the validation of the current metrics in the framework of the ‘New AEGIS’ project (https://www.ecpgr.org/aegis/projects/new-aegis), specific metrics for on-farm and in vitro conservation were mentioned. More detailed on-farm conservation specific metrics were proposed dealing with (1) the number of conservation sites per accession, (2) the type and number of ‘conservation units’ per accession (e.g. trees and populations) and (3) number of on-farm accessions duplicated in genebanks. For in vitro conservation, specific metrics should deal with (1) the number of accessions that are low in number (e.g. with a number of clones in slow growth conditions below the one recommended in the genebank’s SOPs), (2) average number of subcultures per accession and (3) number of accessions duplicated in cryopreservation and/or in glasshouse/field collections. Finally, further discussions and quantitative analyses of the metrics of different genebanks could allow the identification of thresholds for some of these metrics to facilitate quality management and certification for genebanks.
It is important to acknowledge that not all genebanks worldwide are currently in a position to implement these metrics, as their operations may be less structured and often lack adequate documentation of both materials and procedures. Nevertheless, the authors contend that the establishment of a global system of collaborating genebanks necessitates a certain standard of quality. Effective conservation depends fundamentally on reliability. By defining clear metrics and promoting transparency, we aim to contribute to the ongoing improvement of genebanks in pursuit of this shared objective. We hope that this initial discussion could foster the creation of sets of metrics tailored to the different methodologies of PGR conservation to increase the integration of conservation actions, their quality management and transparency.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S147926212510021X.
Acknowledgements
The authors acknowledge the support of the projects PRO-GRACE (supported by the European Union’s Horizon research and innovation programme under grant agreement no. 10109) and New AEGIS (funded by the Federal Ministry of Food and Agriculture, Germany, grant no. GenR 2024-1). The authors are grateful to Lorenzo Maggioni (ECPGR), Rob van Treuren (CGN), Rik Lievers (CGN) and Janny van Beem (Global Crop Diversity Trust) for their useful comments on earlier versions of the manuscript and all other colleagues who were involved in the various discussions on the topic of genebank metrics.