Introduction
In the early twenty-first century, total animal experimentation numbers worldwide increased dramatically, from an estimated 115 million animals used in 2005 to around 192 million in 2015 (Taylor & Alvarez, 2019, p. 210). While China notably increased its animal use during the period, many European countries also saw increases or stagnation (Taylor & Alvarez, 2019, p. 207). In Switzerland, for example, the overall number of animals used for scientific purposes has remained roughly stable since 1996 (around 600,000 animals per year), with a roughly stable distribution across different severity degrees (FSVO, 2024). In the European Union (plus Norway after 2018), the number of animals used for regulatory purposes and in routine production has decreased steadily since 2015, but the number of animals used in basic, applied, and translational research has remained stagnant at around 6–7 million per year (European Commission, 2025). Across all purposes in the European Union, the percentage of animals used for procedures of different severities has been shifting slightly, with fewer animals being used for “mild,” more for “moderate,” and a roughly stable number used for “severe” procedures (ibid.).
Many countries today have programs or institutions that support the “three Rs” of “replace, reduce, refine.” Should it be concerning, then, if overall harm to animals is not significantly decreasing, especially in research? Does it provide a reason to reconsider existing programs? One might take this for granted, but in fact, the question is controversial. Some scholars and activists have argued that rising or stagnating total animal experimentation numbers are a sign of failure on the part of “three Rs” programs, but representatives of the programs themselves have disagreed, arguing that total numbers obscure too much to be helpful in program evaluation (see the “Debates over 3Rs program evaluation” section for examples).
To move the debate forward, the present article contributes two points: one conceptual and one normative. First, it draws a distinction between two purposes for which total numbers might be relevant—they might matter for an assessment of program impact or for an assessment of program sufficiency relative to its goals (section “How total numbers might matter: program impact versus sufficiency”). While total numbers are at best indirectly relevant for an assessment of impact, they are directly relevant for assessing sufficiency, provided that program goals have implications in terms of total numbers. However, program goals are often vague. To see whether programs are sufficient to achieve their goals, policymakers first need to set clear goals.
Second, this article turns to the normative question of whether making a difference at the level of total animal experimentation numbers should be a policy goal (section “Why total numbers should matter”). Of course, that depends on normative background assumptions, but which ones? One might expect that only agents who value animals particularly highly would have reason to call for an overall decrease in harm to animals in experimentation. However, the article will argue that this policy goal is sensible from a widely agreeable “middle ground” standpoint which acknowledges that animals have a claim to protection but considers their claims less important than those of humans. The goal of this section will not be to list every possible critical concern about animal experimentation, which might also include concerns about scientific quality (Akhtar, 2015), financial cost (Bottini & Hartung, 2009; Keen, 2019; Meigs et al., 2018), and the psychological toll the practice can take on personnel (Baysal et al., 2024; Johnson & Smajdor, 2019; King & Zohny, 2022; LaFollette et al., 2020). Rather, the aim is to provide one robust argument for a policy goal of decreasing overall harm to animals in experimentation, even in cases where experimentation avoids those other critical concerns. The article ends with some suggestions for what policies beyond the “three Rs” would have to look like in order to be able to decrease overall harm by overcoming reliance on animal experimentation (section “Designing policies that fare better”), before concluding (section “Conclusion”).
In sum, this article argues that total animal experimentation numbers should not be dismissed wholesale when evaluating success and failure in animal experimentation policy, though program goals should be more clearly defined. From a widely agreeable standpoint, when the numbers indicate stagnating or even rising overall harm inflicted on animals in experiments, it is indeed time to reconsider “three Rs” programs.
Debates over 3Rs program evaluation
Today, many governments and institutions undertake efforts to promote the “three Rs” of “replace, reduce, refine” (Bayne et al., 2015; Neuhaus et al., 2022a, 2022b; see generally Russell & Burch, 1959). Scholars have argued that it is a sign of failure on the part of these programs when total animal experimentation numbers are stagnating or rising (Bailey, 2024; Blattner, 2019; Herrmann, 2019). Animal advocates have endorsed the same claim (Bertrand, 2024; Marshall et al., 2022; PETA, 2024), while others have stated sarcastically that the 3Rs “succeeded in preserving the status quo” (Stop Vivisection, 2016). It is sometimes assumed that the “reduce” principle calls for a decrease in overall animal experimentation (as pointed out by Olsson et al., 2012, p. 333). The traditional and still dominant interpretation, however, is that it calls for reducing the sample size within studies (Lauwereyns, 2018, p. 17; Russell & Burch, 1959, Ch. 4), which is compatible with an overall stagnation or increase if more individual studies are conducted. And while the “replace” principle calls for substituting animals with other models or approaches, new animal models can be developed in the meantime so that replacement progress is compatible with an overall increase of animals used (Müller, 2023). Thus, the implications of the “three Rs” for animal experimentation at the level of total numbers are not as straightforward as one might think.
The view that rising animal experimentation numbers indicate a failure of “three Rs” programs has been disputed by representatives of those programs. For instance, a 2012 report by the British NC3Rs on the institution’s evaluation framework contains a special annex titled “The Home Office statistics are not a gauge of progress in the 3Rs” (NC3Rs, 2012, p. 49). The annex argues that “the number of animals reported in the annual statistics is influenced by a range of scientific and strategic factors independent of the 3Rs” (NC3Rs, 2012, p. 51), such as investment decisions by major funding bodies and technological breakthroughs like gene modification techniques that open up new possibilities to engineer and use animal models (ibid.). Furthermore, the report continues, “reductions in animal use for some studies have been achieved but this may not be apparent if there is an overall increase in the number of such studies performed” (NC3Rs, 2012, p. 51), and “developments that avoid animal use are not easily counted” (NC3Rs, 2012, p. 52).
Along similar lines, scholars associated with a Swiss program for the “three Rs” have more recently argued:
To assess the effectiveness of the 3Rs principle as a policy instrument for advancing humane animal experimentation, appropriate parameters for measuring the effects of the 3Rs are still needed. […] First, the metrics track the numbers of animals used rather than those replaced, and second, most replacement approaches are not recognized as such. (Grimm et al., 2023, p. 6)
There is a persistent perception that there is a “missing effect” of the implementation of the 3Rs. However, […] proper outcome measures for replacement, reduction and refinement alike are largely undeveloped, and therefore one is entitled to question whether the effect really is missing, or whether instead we are simply unable to properly measure it on a wider, global scale. (Grimm et al., 2023, p. 10)
The central argument advanced by NC3Rs (2012) and Grimm et al. (2023) is straightforward: Overall numbers are irrelevant for “three Rs” evaluation because they are influenced by factors beyond the “three Rs” and because they only track the animal experiments conducted, not the ones avoided. However, as the next section will argue, the force of this argument depends on what we are evaluating: a program’s ability to make a difference, or its ability to achieve a given policy goal.
How total numbers might matter: program impact versus sufficiency
Despite stark differences in tone, critics and representatives of “three Rs” programs might not truly disagree over the role of total numbers but might rather be talking past each other. There is a difference between the question of whether a program has an impact (that is, whether it effects any changes in the world) and the question of whether a program is sufficient (that is, whether it achieves its goals). A program can have great effects while failing to have the specific effects it is meant to have. In such situations, it is easy to have pseudo-disagreements about whether the program is “working.”
By way of analogy, consider the following vignette:
Goalkeeper: An association football coach is furious because their team has been on a losing streak, conceding an increasing number of goals with each match. The coach tells the goalkeeper she needs to train harder, but the goalkeeper protests: The number of goals conceded is misleading, as it ignores other important variables (e.g., the number of shots on target) and it does not track how many goals were prevented. Say that the goalkeeper is indeed among the best in the league when it comes to the number of shots on target blocked, and has even been improving. It is just that the number of shots on target has been rising even more rapidly, masking the goalkeeper’s improvement in the overall result.
In this scenario, both the coach and the goalkeeper have a point. The coach has reasons to be unhappy with the number of goals conceded because this number is directly relevant to whether the team wins or loses. The goalkeeper is right to point out that she is doing her best from her position. The problem is that the two do not mean the same thing by “success” and “failure.” The coach is evaluating the sufficiency of current efforts: whether they are enough to adequately address the problem (the threat of losing games). The goalkeeper, by contrast, is evaluating impact: whether current efforts are making a difference. The coach and the goalkeeper do not truly disagree. Both can agree that factors beyond the goalkeeper’s influence, such as the team’s defense strategy, need revision if the team is to win games.
The case of “three Rs” programs is similar to an extent. When program representatives object to the use of total numbers to assess their own success or failure, they are talking about impact. The NC3Rs program evaluation report does so unambiguously, articulating a framework of “inputs–outputs/outcomes–interim impacts–mature impacts” (2012, pp. 6–7), where impacts are essentially the difference made by the initial inputs. Grimm et al. (2023) focus on “effects” and “effectiveness,” both of which, in context, again denote impact. However, policymakers and governments may be more interested in whether a program can adequately address a given problem and reach given goals, not just in whether it is making some difference.
However, there is also an important disanalogy between the goalkeeper vignette and animal experimentation: A football team definitely wants to concede fewer goals, but whether a given political body wants to conduct less (or less severe) overall animal experimentation is often unclear.
The European Union Directive 2010/63 does mention “full replacement” as a “final goal” in its Recital 10, but this has never been concretized into more specific reduction goals (as a European Parliament resolution highlighted; EU Parl, 2021). Most individual countries likewise do not design animal experimentation policies to achieve outcome-oriented goals, let alone specific reduction goals. Rather, they lay down licensing procedures (Olsson et al., 2024) and, more recently, provide resources for “three Rs” programs in a continuous, open-ended fashion. An exception is Germany, whose government in 2024 started a participatory process to develop a “reduction strategy” for overall animal experimentation (BMEL, 2024). The fate of this process is uncertain at the time of writing, however, as the governing coalition has changed (Tierschutzbund, 2025). There exist other, partial exceptions. In 2023, the European Commission announced a reduction plan for regulatory animal tests, though it refused to set specific reduction goals for animal experimentation overall (EC, 2023). From 2019 to 2024, the United States Environmental Protection Agency worked on a phase-out plan for the animal tests it requires and funds, though it subsequently abandoned the plan (D. Grimm, 2024). Two other federal agencies, the National Institutes of Health (NIH) and the Food and Drug Administration (FDA), announced commitments in 2025 to reduce their reliance on animal testing (FDA, 2025; NIH, 2025). Perhaps the most noteworthy exception for present purposes is the British NC3Rs, which has stated that it aims to prevent a rise in total animal experimentation in the United Kingdom from 2015 to 2025 (NC3Rs, 2014).
In these exceptional cases, where governments or institutions have committed to a reduction or the prevention of a rise in total animal experimentation numbers, a rise in the numbers unambiguously indicates that efforts have been insufficient. This does not contradict the NC3Rs’ claim that total numbers are unhelpful in assessing impact. The problem is that in most countries, no specific outcome-oriented goals exist for animal experimentation policy. And in the absence of clear enough goals, program sufficiency cannot be meaningfully evaluated.
Vagueness about program goals opens up a space for political maneuvering. Those who, for fiscal or other reasons, oppose the creation of additional programs can point out that existing programs are doing good work and are making an impact: it would certainly be worse without them! This tacitly equates impact with sufficiency, setting no goal for the program other than doing whatever it already does. Conversely, those wishing to revise or increase efforts can stress that the total numbers are not changing in the right direction, tacitly suggesting that achieving a particular overall change is part of the program’s goals, which may be unrealistic, given how the program is set up. Due to the inherent limitations of the “three Rs” framework (it says nothing about the allocation of research funds, for example; see Müller, 2024a; NC3Rs, 2012, p. 51), programs to promote the “three Rs” might be ill-equipped to achieve significant change at the level of total animal experimentation numbers, just as a goalkeeper alone cannot make up for bad defense. In this way, goal vagueness sets “three Rs” programs up for uncomfortable and unproductive debates about their supposed success or failure whenever animal experimentation statistics are released.
As a first result of this discussion, we can conclude that “three Rs” program representatives are right to point out that total numbers are dubious indicators of program impact. But what often interests policymakers, and should interest them, is the sufficiency of programs to achieve policy goals. The problem is that in animal experimentation policy, the goals are often unclear. To achieve a more productive debate, policymakers should make up their minds about what they want “three Rs” programs to achieve and articulate clear goals.
While setting policy goals is of course a political matter, it can be informed by philosophical arguments. Policymakers may want to stick to views that are compatible with widely shared values and normative assumptions that already underpin existing policy. From this standpoint, as the next section will argue, the policy goal of decreasing overall harm to animals is sensible.
Why total numbers should matter
For present purposes, assume (a) that most people view animals as having moral status and a nontrivial claim to their own welfare, but (b) that most people do not value animals so highly as to consider most animal experimentation to be all-things-considered wrong. In other words, most people think that animals are worth protecting for their own sake, but not in the same way and to the same extent as humans. They occupy some middle ground between anthropocentrism (the view that only humans have moral claims and animals matter only indirectly) and antispeciesism (the view that all interests should be considered equally irrespective of species) or an animal rights view (the view that animals have moral rights that trump mere human interests). Animal experiments are morally acceptable from this standpoint as long as they yield enough benefit for humans or, less frequently, other animals.
Many governments have good reason to think along the lines of this middle-ground view not just because their individual citizens endorse it, but also because they are already tacitly committed to it. An increasing number of legal systems give animals some special status as sentient beings and protect them for their own sake (Kotzmann, 2023). Many legal systems regulate animal experiments through licensing procedures based on harm–benefit analysis (Bayne et al., 2024; Brønstad et al., 2016; Laber et al., 2016). This reflects a commitment to both the good of animal welfare and the good of free research as well as health and safety. Assuming that most people and their governments thus occupy a middle ground between anthropocentrism and antispeciesism or animal rights, do they have good reasons to adopt the policy goal of an overall decrease in harm to animals in experimentation?
This might not be obvious at first sight. Total numbers clearly matter when the problem consists in all-things-considered wrongs or evils. However, according to the middle ground position, not all animal experiments are all-things-considered wrongs. Whether they are depends on how much benefit is generated and how much harm is inflicted. If there is any indication that many animal experiments create more harm than benefit (e.g., because of poor scientific quality or because species differences undermine translatability; Akhtar, 2015), a sensible policy goal would be to decrease the number of just these net-harmful experiments specifically, while the number of net-beneficial experiments may be allowed to stagnate or even increase. However, given that existing licensing procedures often already involve a form of harm–benefit analysis, at most, this could imply a policy of double-checking whether the procedures work as intended (e.g., by conducting retrospective harm–benefit analysis, as was done by Pound & Nicol, 2018). Though this may be worthwhile, it does not yet justify a policy goal of decreasing overall harm to animals in experimentation more generally. If the problem is bad science, the goal should be to reduce bad science, not animal-harming science.
However, it is not only all-things-considered wrongs that should be avoided, but also pro tanto wrongs. To use a simple example: Saving a young person rather than an old person from a burning building (when one cannot save both) might be the right decision because the young person’s claim to rescue is greater due to their future prospects. Still, the agent has reasons to regret their failure to save the old person even if they made all the morally right decisions (Williams, 1965). That greater moral claims outweighed the senior’s claim to rescue does not mean there is no problem; on the contrary, having to choose is a moral problem in itself. And it certainly does not mean that we need not be concerned if such cases are statistically on the rise. As Brigid Brophy pointed out over 50 years ago, if you constantly have to choose whom or what to save from burning buildings, you need better fire safety (Brophy, 1972, p. 137). The same is true for any nontrivial pro tanto wrong.
One could object that the state is not an individual moral agent and does not feel regret over value trade-offs. The law has ways to navigate conflicts between the goods it protects, such as the aforementioned harm–benefit analysis for animal experiments (Bayne et al., 2024; Brønstad et al., 2016; Laber et al., 2016; Olsson et al., 2024). As far as the law is concerned, sacrificing one good for the sake of another is not a problem if the proper procedures are followed. However, the state’s procedures should not be arbitrary. They should aim to protect all affected goods as fully as possible. And it is always better for the protection of two goods to avoid their conflict in the first place than merely to sacrifice one for the other when the conflict arises. Thus, although a state may not regret having to weigh animal welfare against scientific advances, it should still try to reduce the overall number of occasions on which it needs to choose.
Therefore, adopting the policy goal of decreasing total harm to animals in science and testing is sensible—it has good reasons supporting it—as long as we assume that harming animals for the sake of experimentation-derived benefits is regrettable to a nontrivial extent, even if we take it to be justified all-things-considered.
However, being supported by good reasons is not enough for a policy goal to be worth adopting. The goal must also not have stronger reasons against it. The main worry one might have about the policy goal of decreasing harm to animals in experimentation is that it threatens other social goods, such as the progress of science or health and safety. However, the above argument does not suggest that the policy goal should be to decrease harm to animals in experimentation no matter what. Rather, it should be to reduce the incidence of conflicts between animal welfare and science or health and safety, and only in this way to change the total animal experimentation numbers. In other words, the goal should be to decrease overall harm to animals in experimentation without diminishing the social goods that are enabled by science and health and safety. This requires that either the functions of animal-harming experiments are fulfilled by other approaches, or that any minus in benefits from such experiments is outweighed by greater benefits from other approaches, even if the benefits are different in kind.
This policy goal—decreasing conflicts between animal welfare and the goods enabled by science and regulatory testing—is not the same as increasing the number of animals replaced or animal experiments avoided. The former goal, but not the latter, would be achieved without effort in a world where major research trends just happened to go in the direction of fields that are naturally animal-free. In this world, no researchers would have actively replaced animals, but the goods of animal welfare and science would not be at odds as often. So the two goals differ in what they substantively demand. They also differ in measurability. Total numbers can feasibly be collected, as they are already officially reported in many jurisdictions. By contrast, counting the number of animals replaced or animal experiments avoided would require a great deal of controversial speculation about what would have happened absent intervention. Fortunately, this speculation is not necessary to assess progress toward the policy goal suggested by the argument in this section.
In sum, the acknowledgment that animal welfare matters, even if it does not matter in the same way or to the same extent as human welfare, implies that a policy goal of reducing harm to animals at the level of total numbers is sensible. We should aim to decrease the number of occasions on which we have to choose which of two goods to sacrifice. This argument does not show that most existing “three Rs” programs are failing on their own terms. For that, the goal of decreasing overall harm to animals in experimentation would have to be officially adopted. However, many governments and institutions would have good reasons to adopt that goal, and once they do, stagnant or rising total animal experimentation numbers will show many current “three Rs” programs to be insufficient, impactful though they may be.
Designing policies that fare better
So far, this article has argued that total numbers can be relevant for program sufficiency depending on policy goals and that a policy goal of decreasing overall harm to animals without jeopardizing science or health and safety is sensible. If this goal were adopted by jurisdictions in which total numbers are stagnating, then the respective “three Rs” programs would be clearly insufficient. To all this, one might respond that “three Rs” programs are still the best one can do to work in the direction of that goal. However, there exist alternative approaches that could be taken in addition to “three Rs” programs.
Notably in the Netherlands, efforts to move away from animal experimentation have increasingly been framed in terms of a technology transition rather than the “three Rs” (Denktank, 2015; NCad, 2016). A similar shift is visible in the previously mentioned steps toward reduction or phase-out strategies in Germany, the European Union, and the United States (see section “How total numbers might matter: program impact versus sufficiency”). Academics (Baumgartl-Simons & Hohensee, 2019; Herrmann, 2019; Hutchinson et al., 2022; Marshall et al., 2022; Müller, 2024b) and NGOs (Cruelty Free Europe, 2022; PETA, 2024) have made various suggestions for how to flesh out such strategies. Given that the policy goal is not just to decrease animal experimentation, but to decrease harm to animals without jeopardizing science or health and safety, a strategic approach that monitors intended and unintended consequences is needed (see Müller, 2024b). A reduction or phase-out strategy should furthermore be flexible enough to allow for course corrections when milestones are missed (see Müller, 2024b).
A challenge that reduction or phase-out efforts face is to tackle the environmental factors pointed out by the NC3Rs that affect animal experimentation but lie beyond the influence of “three Rs” programs, such as funding allocation, technological developments, and regulation. The “three Rs” approach does not address these factors because it focuses on the individual researcher as the main locus of control. No amount of “R” principles that make prescriptions only for researchers will overcome this limitation. However, one could draw on frameworks developed in other policy areas where a transformation in socio-technical systems is sought (notably, climate and biodiversity policy), such as transformative governance (Visseren-Hamakers et al., 2021), transition management (Loorbach, 2010), or mission-oriented innovation policy (Lindner et al., 2021; Mazzucato, 2018). All of these approaches heavily feature stakeholder involvement and combine the top-down with the bottom-up. The range of possible measures to be taken in the course of a transformative approach to animal experimentation policy is by no means restricted to prohibitive or other regulatory measures. A “planning without banning” approach (Müller, 2024c) suggests that a decrease in harm to animals in research should be achieved through non-prohibitive, stimulating interventions in the environment of scientific work, which is composed of funding opportunities, infrastructure and equipment, research networks, and educational programs, among other factors.
While developing viable reduction or phase-out strategies is a challenging task, governments could feasibly create working groups or competence centers to accomplish it.
Additional efforts beyond the current “three Rs” programs would require additional resources. They would also threaten the interests of some researchers, administrators, and industries that benefit from current arrangements. Very likely, the debate will thus feature “norm antipreneurs” (Bloomfield, 2015), who often defend the status quo by arguing that there is no problem to solve, and therefore that current programs do not produce “outcomes that vary radically and damagingly from expectations” (Bloomfield, 2015, p. 323). A vague statement of goals, and thus of the government’s and the public’s expectations, makes this argument exceedingly easy to make. Activists and policymakers, if they want to decrease conflicts between animal welfare and science or health and safety in the long run, would thus be well-advised to insist on clear goal-setting for “three Rs” programs.
Conclusion
This article has argued that representatives of “three Rs” programs are right to an extent when they dismiss total numbers as indicators of program success because they focus on program impact. However, policymakers should be more interested in whether programs are sufficient to attain policy goals, for which total numbers can very well be relevant. The problem is that clear outcome-oriented goals are often missing in animal experimentation policy. Thus, if policymakers want to be able to tell whether these programs are succeeding, they should make it clear what change they expect them to effect. This article has furthermore argued that there are good reasons to adopt an overall decrease in harm to animals in experimentation as a policy goal. Almost anyone can agree that it is a pity if we have to decide whether to protect animals or attain the benefits of science and regulatory testing—the goal should be to not have to choose as often. If policymakers follow this advice and adopt an overall decrease of harm to animals in experimentation as a policy goal, then stagnating or rising total numbers are indeed markers of program failure, and existing programs for the “three Rs” need to be reconsidered or complemented with new approaches. These would have to tackle the environmental factors that affect overall animal experimentation numbers, such as funding, regulation, and the focus of innovation activities.
Acknowledgments
The author thanks Angela Martin, Christian Rodriguez-Perez, and two anonymous reviewers for this journal for their helpful feedback on earlier versions of this article.
Financial support
This work was supported by the Swiss National Science Foundation’s NRP 79 (Grant No. 447944_214850). The author discloses that he is involved in one of the programs discussed in this article, the NRP 79, as a principal investigator of a research project, but he is not involved in, or personally affected by, the program’s evaluation.