
Can Scientific Communities Benefit from a Diversity of Standards?

Published online by Cambridge University Press:  25 September 2025

Matteo Michelini*
Affiliation:
Ruhr University Bochum, Bochum, Germany; Eindhoven University of Technology, Eindhoven, the Netherlands
Javier Osorio
Affiliation:
Autonomous University of Madrid, Madrid, Spain
*
Corresponding author: Matteo Michelini. Email: matteo.michelini@edu.ruhr-uni-bochum.de

Abstract

Current models of scientific inquiry assume that scientists all share the same evaluative standards. However, scientists often rely on different yet legitimate ones, a feature we call evaluative diversity. We investigate how scientific success is affected by diversity in evaluative standards through computer-based simulations. Our results show that communities with diverse standards benefit substantially from scientists sharing all the approaches they explored, regardless of whether they considered them valuable. Moreover, we find that even a moderate degree of evaluative diversity can, under certain conditions, lead scientists to reach more satisfying results than those they would reach in homogeneous communities.

Information

Type
Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Philosophy of Science Association

1. Introduction

Science is far from being an individual enterprise. Scientists spend a large part of their time talking to colleagues, attending conferences, and more generally, learning from others. In light of this, significant efforts have been made to understand the conditions for successful collective inquiry (Kitcher, Reference Kitcher1993).

To this end, philosophers have developed numerous formal models of group inquiry (Šešelja, Reference Šešelja and Edward2023). With the help of these models, they have generated a range of insightful findings, such as the idea that a restricted flow of information can enhance inquiry (Zollman, Reference Zollman2010) or that diversity in learning strategies tends to benefit science (Pöyhönen, Reference Pöyhönen2017; Weisberg and Muldoon, Reference Weisberg and Muldoon2009). Ultimately, these findings have been used to formulate normative recommendations (Petrovich and Viola, Reference Petrovich and Viola2018; Smaldino et al., Reference Smaldino, Moser, Velilla and Werling2022; Wu and O’Connor, Reference Wu and O’Connor2023).

In all of these models, scientific practice is understood as an instance of collective problem-solving, where scientists aim to find the objectively best approach available by relying both on social learning and individual exploration. Consequently, scientists with the same evidence are assumed to agree on the value of each explored approach (Bedessem, Reference Bedessem2019; Politi, Reference Politi2021).

Yet, this framework fails to capture a fundamental aspect of scientific practice. At times, there may be no unique way to assess existing approaches, and scientists may disagree on their epistemic value, even when they engage with the same approaches and possess the same evidence (Kellert et al., Reference Kellert, Longino and Kenneth Waters2006). Because no superior set of standards exists, scientists’ evaluations may diverge, either because of differing epistemic or nonepistemic values (Chang, Reference Chang2012; Longino, Reference Longino1987, Reference Longino2019) or because they pursue different research goals (Nickelsen, Reference Nickelsen2022; Parker, Reference Parker2006), even if their values and goals are all scientifically admissible (Ward, Reference Ward, Maria Baghramian, Carter and Cosker-Rowland2022, Reference Ward2021). In short, scientific communities often reflect a range of values and goals (Longino, Reference Longino1990), which leads scientists to apply diverse evaluative criteria to the same set of methodological frameworks, theories, or models. We refer to scenarios where these approaches are assessed using different but legitimate standards as instances of evaluative diversity.Footnote 1

Although present models have neglected it so far, evaluative diversity introduces specific mechanisms that influence collective scientific inquiry. However, a purely conceptual analysis might struggle to determine its impact because evaluative diversity presents an unusual trade-off of costs and benefits. On the one hand, it is reasonable to expect scientists with different criteria to explore many different approaches (Reijula et al., Reference Reijula, Kuorikoski and MacLeod2023), which is usually considered a good recipe for success (Wu and O’Connor, Reference Wu and O’Connor2023; Zollman, Reference Zollman2007). On the other hand, evaluative diversity may lead to a situation in which each scientist develops her own personal approach, which might not be relevant to others. Although evaluative diversity reduces the risk of herding (Strevens, Reference Strevens2013), it may also make social learning superfluous, as scientists are likely not to adopt approaches developed by those who aim to meet different evaluative criteria. As a consequence, several questions are still open: Do the normative recommendations and observations that we obtained from existing models also apply to contexts of evaluative diversity? Does evaluative diversity slow down inquiry? Consider a scientist who is researching a particular problem in her field. Would she gain more by collaborating with peers who share similar standards and goals or by engaging with those whose criteria and objectives differ?

This article introduces a computational model of scientific problem-solving in contexts of evaluative diversity.Footnote 2 We develop a novel extension of the NK model (Kauffman and Levin, Reference Kauffman and Levin1987; Wu, Reference Wu2023) where agents explore approaches by themselves and learn about approaches developed by others, but they may assign different scores to the same approach. Accordingly, we consider the success of a community as the average success of its members, where each member’s success is measured by how well the approach she adopts meets her own standards.

Our results suggest that scientists working in a multi-standards community maximize their success through constant communication. Crucially, agents should communicate not only the approach they adopt and consider valuable but also every approach they discover. In doing so, scientists can benefit from instances of proxy serendipity, where an approach deemed of little value by one agent proves invaluable to another. We show that even a moderate degree of evaluative diversity allows members of a community—under certain conditions—to reach more satisfying results than those reached by scientists who all share the same standards.

This article is organized as follows. Section 2 elaborates on present models of scientific inquiry and discusses the notion of evaluative diversity. Section 3 introduces the baseline model and the extension designed to incorporate evaluative diversity. Section 4 details the main results of the extended model. Section 5 contextualizes our findings within the existing literature and concludes.

2. Scientific inquiry with evaluative diversity

Formal models of scientific inquiry share the key assumption that scientific communities are governed by a shared fixed set of standards. According to these standards, the available scientific approaches are objectively ranked, and the success of a community is determined by the objective quality of the approach its members ultimately adopt (Smaldino et al., Reference Smaldino, Moser, Velilla and Werling2022; Šešelja, Reference Šešelja and Edward2023).

This assumption strongly affects the results obtained by such models. To see this, let us consider one of the most well-known findings in the literature, the “less-is-more” effect, which suggests that agents should share information with other agents only rarely (Zollman, Reference Zollman2007; Zollman, Reference Zollman2013), for instance, only when the problem at stake is especially hard (Frey and Šešelja, Reference Frey and Šešelja2020; Rosenstock et al., Reference Rosenstock, Bruner and O’Connor2017). Because evaluative standards are shared, scientists in existing models are usually highly likely to converge on a successful approach if given the same information. As a consequence, when information flows continuously (e.g., in a complete network), scientists are at risk of rapidly converging on approaches that may initially seem promising but that could be suboptimal. Instead, when the information flow is sparse, every approach is explored more slowly but more thoroughly, increasing the likelihood of uncovering optimal approaches, especially when two approaches are very similar. In short, under the assumption of a fixed shared set of values, a sparse information flow helps scientists achieve a temporary but efficient division of cognitive labor.

While such an assumption suits many instances of collective inquiry, it does not cover all of them. Specifically, it overlooks situations where no unique evaluation standard exists and scientists explore and discuss the same scientific approaches but assess them differently (Politi, Reference Politi2021; Bedessem, Reference Bedessem2019). In such scenarios, scientists may evaluate the same approach with respect to different but equally legitimate aims and values, which may lead them to disagree about its quality. Yet this does not prevent them from learning from each other. In short, present models of scientific inquiry fail to explore collective inquiry in the context of evaluative diversity.

These contexts are numerous because science naturally allows a multiplicity of scientifically legitimate values and objectives (Chang, Reference Chang2012; Longino, Reference Longino1987, Reference Longino2019, Reference Longino, Kellert, Longino and Kenneth Waters2006; Ward, Reference Ward, Maria Baghramian, Carter and Cosker-Rowland2022). In climate modeling, for example, scientists have very different aims, such as prediction of global average parameters or simulation of regional climate change. As a consequence, they develop a wide range of different types of simulations for the evolution of Earth’s climate while still learning about and possibly adopting models developed by others (Parker, Reference Parker2006; Winsberg, Reference Winsberg2012).

Another illustrative example can be found in the history of photosynthesis research, which stretches from the mid-18th century to 1960 and led to an accurate model of plant photosynthesis (Nickelsen, Reference Nickelsen2021, Reference Nickelsen2022). Notably, most scholars contributing to the field did not aim to develop a comprehensive model of photosynthesis; they focused on making significant contributions using familiar methods or addressing specific subgoals relevant to their own research. For instance, chemist Justus von Liebig, who developed a model for photosynthesis in 1843 (Liebig, Reference Liebig1843), was interested in the topic only with the intention of enhancing crop production. In contrast, Adolf Baeyer, who put forward his model of photosynthesis in 1870, aimed to explain the role of formaldehyde (Bayer, Reference Bayer1870). As Nickelsen notes, both Liebig’s model and Baeyer’s model had been considered valuable models for photosynthesis until the 1920s, while at the same time, new models were proposed so as to satisfy different aims. To this end, scientists would simply adapt, modify, and integrate (parts of) available models in light of their own goals. Previous models would provide “building blocks” for new ones, developed with completely different goals in mind (Nickelsen, Reference Nickelsen2022).

Additionally, evaluative diversity may come in degrees (Ludwig and Ruphy, Reference Ludwig, Ruphy and Edward2021). At one end of the spectrum, in contexts with high diversity, a scientific community may evaluate the same objects in radically different ways, driven by strongly divergent values and goals. At the other end, there may be scientific communities that share an almost unified set of values and goals, whose members hold highly similar evaluative criteria. For example, two climate scientists may evaluate models differently if they aim for different levels of precision and tractability. Instead, climate scientists who aim to model the climate of structurally similar regions with the same levels of precision are likely to have rather similar but not identical evaluative criteria (Parker, Reference Parker2006).

As mentioned, while scientists may have varying values and goals, this does not prevent them from discussing and adopting each other’s approaches. In fact, scientists may learn as much from each other in heterogeneous communities, where diverse standards coexist, as in homogeneous communities, which rely on a unified set of values and aims. Hence, it is natural to wonder what conditions grant a successful inquiry for heterogeneous communities and how they fare with respect to homogeneous ones. To elicit such conditions, we turn to agent-based modeling, in particular to the NK framework, a type of epistemic landscape.

3. The model

In formal philosophy of science, epistemic landscape models have been used to investigate the division of cognitive labor, the communication structure of scientific inquiry, and both cognitive and social diversity in scientific communities (Alexander et al., Reference Alexander, Himmelreich and Thompson2015; Grim et al., Reference Grim, Singer, Bramson, Holman, McGeehan and Berger2018; Huang, Reference Huang2024; Pöyhönen, Reference Pöyhönen2017; Weisberg and Muldoon, Reference Weisberg and Muldoon2009). These models study epistemic communities that explore a set of locations, where each location represents a research approach. Accordingly, each location has a specific height, representing the objective fruitfulness of each approach.Footnote 3 Here, we use an NK framework, a type of landscape that represents each approach as the combination of different components, and its quality as an aggregate of the quality of each component. The NK framework was first proposed in biology to study gene evolution (Kauffman and Levin, Reference Kauffman and Levin1987) and has lately been used by philosophers (Reijula et al., Reference Reijula, Kuorikoski and MacLeod2023; Wu, Reference Wu2023).

Because our aim is to explore the impact of evaluative diversity, this model incorporates novel features not present in the standard NK framework. Traditional NK models assume an evaluation that univocally determines the value of each approach. In contrast, while we also model agents as navigating a shared space of approaches either through social learning or individual exploration, each agent evaluates this space through their own standards. Hence, different agents may assign different values to the same approach. In this sense, agents may be interpreted as navigating different landscapes, even though the set of available approaches remains identical across the community. We refer to different landscapes in later sections as an intuitive shorthand. Accordingly, the degree of satisfaction of a scientist corresponds to the score she assigns to the approach she adopts, and the success of a community is measured as the average degree of satisfaction of its members.

Each research approach represents a potential way for an agent to tackle the problem at hand and may be understood differently depending on the context. First, a research approach can be taken broadly as a way of proceeding in a field, that is, a research program or methodology (Weisberg and Muldoon, Reference Weisberg and Muldoon2009). Scientists may disagree about the quality of a research program because they may disagree about the relevance of the results that can be obtained with it (Longino, Reference Longino2019). Second, an approach can represent a full-fledged theory. Scientists often value the same theory differently, for example, because they value its epistemic virtues differently (Chang, Reference Chang2012; Schindler, Reference Schindler2022). Finally, an approach can be understood as a possible model for a target phenomenon on which scientists may have contrasting opinions, for example, a climate evolution model.

We discuss the basic structure of our NK model in section 3.1, the different learning strategies in section 3.2, and, finally, how we evaluate success in section 3.3.

3.1. Approaches and scores

In an NK landscape, each approach results from $N$ design choices, where each choice selects one component from two available options. Formally, an approach is uniquely defined by a binary string $A = (a_1, a_2, \ldots, a_N)$, where each position $i \in [1, N]$ corresponds to a design choice and $a_i$ represents the component selected for that choice, $0$ or $1$. A landscape, therefore, consists of all possible binary strings of length $N$, with each string (i.e., each approach) representing a specific location.

An agent evaluates an approach based on its components, assigning a score to each component depending on how well it aligns with the agent’s evaluative criteria. The overall score assigned to an approach is the average score of all its components. In turn, the score an agent assigns to a component is influenced not only by the component itself but also by the components chosen for $K$ other design choices (when $K = 0$, the score depends solely on the component). An atomic evaluative unit for a component $a_i$ is a string that specifies all the components necessary to determine the score of $a_i$. Hence, an approach uniquely determines an evaluative unit for every component.Footnote 4
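
To make the representation concrete, the following minimal Python sketch shows one way an approach and its per-component scoring could be encoded. The helper names and the convention that the $K$ interacting positions are the next $K$ indices (taken cyclically) are illustrative assumptions, not a description of the code behind our results.

```python
import random

N, K = 10, 5  # illustrative values; see table 1 for the ranges we explore

def evaluative_unit(approach, i, K):
    """Atomic evaluative unit for component i: the component itself plus the K
    components it depends on (here taken to be the next K positions, cyclically,
    which is one common NK convention and an assumption of this sketch)."""
    positions = [i] + [(i + j) % len(approach) for j in range(1, K + 1)]
    return tuple(approach[p] for p in positions)

def score(approach, evaluation, K):
    """Overall score of an approach: the average of the per-component scores,
    where evaluation[i] maps each possible unit of component i to a value in [0, 1]."""
    per_component = [evaluation[i][evaluative_unit(approach, i, K)]
                     for i in range(len(approach))]
    return sum(per_component) / len(per_component)

def random_evaluation(N, K):
    """A personal evaluation: a random score in [0, 1] for every possible
    evaluative unit (2^(K+1) binary strings) of every component."""
    table = []
    for i in range(N):
        units = {}
        for u in range(2 ** (K + 1)):
            unit = tuple((u >> b) & 1 for b in range(K + 1))
            units[unit] = random.random()
        table.append(units)
    return table

approach = tuple(random.randint(0, 1) for _ in range(N))   # a random location
evaluation = random_evaluation(N, K)
print(round(score(approach, evaluation, K), 3))
```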

Parameter $K$ determines the degree of complexity of the problems agents face (Reijula et al., Reference Reijula, Kuorikoski and MacLeod2023; Wu, Reference Wu2023). A high $K$ represents highly complex challenges, where the quality of a component depends on many other components, and the overall scores of two approaches may be highly different even if they differ only in one component. A low $K$ represents a low-complexity challenge because the scores associated with the components are (almost) independent.

The personal evaluation of an agent is a function that assigns a specific score to each atomic evaluative unit of each component and thus univocally determines the overall score the agent assigns to any possible approach. A personal evaluation assigns to each evaluative unit a real number between $0$ and $1$. Although we do not explicitly model agents’ evaluative criteria, an agent’s personal evaluation is the product of such criteria: The extent to which the agents’ personal evaluations diverge reflects how much their evaluative criteria do so.

Agents are assigned a personal evaluation as follows. For each evaluative unit of each component, we toss a biased coin with a probability of landing heads equal to $d \in [0, 1]$, which we call the diversity coefficient. If the coin lands heads, agents evaluate the unit differently because its value is controversial (e.g., due to unshared background assumptions). If it lands tails, agents evaluate the unit in the same way and assign the same value (e.g., as with an experimental design yielding high statistical power).

A low degree of diversity ($d$) characterizes contexts in which scientists subscribe to (almost) the exact same standards. A high degree of diversity represents communities where scientists disagree over the value of almost every evaluative unit (be it a method, theory, or research program). Although we consider a full range of values for the diversity coefficient, real scientific communities are unlikely to exhibit a very high degree of diversity (e.g., $d > 0.7$) because scientists usually subscribe to a minimum core of shared standards (Kitcher, Reference Kitcher2013).

When all scientists agree on the score of a unit (coin lands tails), this score is randomly drawn from a uniform distribution over $[0, 1]$. For each unit on which they disagree (coin lands heads), we create a set of real numbers, themselves randomly drawn from a uniform distribution over $[0, 1]$. Each real number in the set represents the outcome of assessing the unit from a specific, distinct perspective. Each scientist is then randomly assigned a score from this set, ensuring that each value is equally likely to be assigned. Therefore, for each controversial unit, different groups emerge whose members agree internally but disagree with other groups on the score to assign. In fact, in real scientific contexts, whenever the evaluation of a specific component is controversial, scientists usually cluster around a limited set of perspectives (Chang, Reference Chang2012, Reference Chang and Saatsi2017; Longino, Reference Longino2019). The size of this set is determined independently for each evaluative unit and is randomly chosen from the natural numbers between $2$ and a maximum value $r$, a model parameter. This parameter, which we denote as the richness of available perspectives, controls the maximum number of distinct evaluations per unit. A low value for $r$ (e.g., $3$) represents a community where only a few different evaluative perspectives are available, and scientists usually split into very few large groups when evaluating controversial components, whereas a high value for $r$ (e.g., $20$) allows for a greater variety of perspectives.Footnote 5
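
A sketch of this initialization, under the same illustrative conventions as above, might look as follows; again, this is an assumption-laden illustration rather than the implementation used for the reported simulations.

```python
import random

def assign_personal_evaluations(n_agents, N, K, d, r):
    """For every evaluative unit, toss a biased coin (probability d of heads).
    Tails: all agents receive the same randomly drawn score. Heads: a pool of
    between 2 and r candidate scores is drawn, and each agent is assigned one
    of them at random, so agents cluster into groups per controversial unit.
    The unit layout matches the sketch above (one dict per component)."""
    agent_evals = [[dict() for _ in range(N)] for _ in range(n_agents)]
    for i in range(N):
        for u in range(2 ** (K + 1)):                # every unit of component i
            unit = tuple((u >> b) & 1 for b in range(K + 1))
            if random.random() < d:                  # controversial unit
                pool = [random.random() for _ in range(random.randint(2, r))]
                for agent in agent_evals:
                    agent[i][unit] = random.choice(pool)
            else:                                    # uncontroversial unit
                shared = random.random()
                for agent in agent_evals:
                    agent[i][unit] = shared
    return agent_evals

# Example: a small community with moderate diversity and at most five perspectives.
evals = assign_personal_evaluations(n_agents=5, N=10, K=5, d=0.3, r=5)
```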

Consider how this framework accounts for cases of evaluative diversity. If each approach is understood as a research program, then different components may represent different methods, different research questions, or background assumptions (Wu, Reference Wu2023). Consequently, scientists may subscribe to different standards when it comes to evaluating a specific method. The same applies to cases where each approach is understood as a scientific theory or a specific model. Each component may then represent a specific modeling choice or scientific tenet, and different positions may be held with respect to their quality. Diversity in personal evaluations reflects the plurality of values and goals. Specifically, the diversity coefficient ($d$) determines the probability of disagreeing on the value of a specific unit, whereas the richness ($r$) controls the sheer number of perspectives available to evaluate each unit.

3.2. Agents

There are two main behavioral procedures governing agents’ actions in the model: a local search algorithm and a social learning protocol. In each step of our simulation, agents first engage in local search, then in social learning.

The local search rule is an internal optimization mechanism by which the agents attempt to modify their specific approach, seeking a new approach with a higher score. First, an agent modifies a single component of her approach and lands on a new approach. The agent then compares the score of the new approach with the score of the original one and decides how to proceed. If the score for the new approach is higher, she adopts that approach; if not, she reverts the changed component and returns to the initial approach. Local search, in this context, aims to represent the typical explorations that scientists conduct on their own: Scientists change their approaches one component at a time to explore new possible ones (Reijula et al., Reference Reijula, Kuorikoski and MacLeod2023).
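
A minimal sketch of this local search rule, reusing the score helper from the sketch in section 3.1; returning the explored candidate alongside the kept approach is our own convenience here, anticipating the total-sharing protocol described below.

```python
import random

def local_search_step(approach, evaluation, K, score):
    """One local-search move: flip a single randomly chosen component and keep
    the modified approach only if the agent scores it higher. The explored
    candidate is also returned, so that it can later be shared under total
    sharing."""
    i = random.randrange(len(approach))
    candidate = list(approach)
    candidate[i] = 1 - candidate[i]
    candidate = tuple(candidate)
    if score(candidate, evaluation, K) > score(approach, evaluation, K):
        return candidate, candidate      # adopted, explored
    return approach, candidate           # reverted, explored
```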

Social learning represents the collaborative aspect of scientific research, where scientists can draw on the work of others. In this procedure, agents observe and potentially adopt the approaches of other agents they are connected with. First, each agent selects the approach(es) she shares. We consider two possible procedures: partial sharing, which has been the standard procedure in previous models (Wu, Reference Wu2023), and total sharing.

Partial Sharing. An agent shares the best approach she has found so far. This always corresponds to the approach she employs at the moment of social learning.

Total Sharing. An agent shares up to two approaches: the best one she has found so far (i.e., the one she employs at the moment) and the approach she explored during the last iteration of local search. If these two approaches correspond, the agent shares only one.

Agents share their approach(es) with whomever they are connected with. Then, an agent learns about all the approaches shared by the agents she is connected with, and she also immediately learns their corresponding scores based on her own personal evaluation. Consequently, she compares the most valuable among them (with respect to her personal evaluation) to the one she presently adopts. If the new one is more valuable than the present, she moves to the new approach.Footnote 6
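
The two protocols and the adoption rule could be sketched as follows, with the caveat that the agent representation and its field names are assumptions of this illustration:

```python
def social_learning_step(agents, neighbors, K, score, total_sharing=True):
    """One round of social learning. Each agent announces her current approach
    and, under total sharing, also the candidate from her last local-search
    move; she then adopts the best approach announced by her neighbors,
    judged by her own personal evaluation."""
    shared = []
    for a in agents:
        bundle = {a['approach']}
        if total_sharing and a.get('last_explored') is not None:
            bundle.add(a['last_explored'])
        shared.append(bundle)

    for idx, a in enumerate(agents):
        candidates = set().union(*(shared[j] for j in neighbors[idx]))
        if not candidates:
            continue
        best = max(candidates, key=lambda x: score(x, a['evaluation'], K))
        if score(best, a['evaluation'], K) > score(a['approach'], a['evaluation'], K):
            a['approach'] = best
```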

Finally, we explore two possible networks: a complete network, which represents systematic communication among all members, and a cycle network, which represents an instance of limited and sporadic communication.
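
In this sketch, the two communication structures reduce to two neighbor maps:

```python
def complete_network(n):
    """Systematic communication: every agent is connected to every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def cycle_network(n):
    """Sporadic communication: each agent is connected only to her two ring neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```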

3.3. Model overview and success

To determine how evaluative diversity affects scientific inquiry, we model the inquiry process with a computational simulation. First, we assign values to the relevant parameters (table 1), then initialize the community, and finally let the agents explore their environment while monitoring the progress of the investigation.

Table 1. Parameter Description and Value Range Explored

Parameter | Description | Values
$n$ | Number of agents | 5, 10, 20, 30, 40
$N$ | Number of components per approach | 10, 15
$K$ | Number of interdependencies | 3, 5, 7, 10
$d$ | Diversity coefficient | 0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0
Sharing protocol | Approach(es) shared | total, partial
Network | Communication structure | complete, cycle
$r$ | Richness of available perspectives | 2, 5, 7, 10

During initialization, each agent is assigned a personal evaluation and a randomly selected starting approach. In line with our focus, we interpret all agents’ evaluations as adhering to legitimate scientific standards, that is, as the result of scientifically admissible aims and values.Footnote 7 Once initialized, agents begin their inquiries, performing a local search at each step, followed by social learning. As a result, they explore new approaches and move around in the landscape.

We take the score ${s_i}$ an agent $i$ assigns to their current approach as an indicator of how successful their inquiry has been thus far, that is, as an indicator of the agent’s satisfaction. Consequently, we define the success $S$ of a community as the average value of the satisfaction of individuals, that is,

$$S = \frac{\sum_i s_i}{n}$$

where $n$ is the number of agents. We choose this criterion—which we name multiple satisfaction measure—inspired by Chang (Reference Chang2012), who defines a successful inquiry as one in which a scientist adopts a theory, model, or approach that meets their standards. Thus, community success is determined by the satisfaction of each member with their preferred approach, measured by the score each agent assigns to it. Because all scientists’ standards are equally legitimate, their progress should be evaluated based on their own standards.

This measure of success departs substantially from those traditionally used in models of scientific inquiry. These models assume the presence of a unique evaluation criterion (section 2) and measure the success of each scientist based on it.Footnote 8 However, as we focus on cases where this assumption does not hold, we resort to the multiple satisfaction measure. As a consequence, our notion of success is somewhat relative, insofar as the progress of a community is evaluated with respect to the standards that exist within it, rather than against an external, “objective” one. Yet we contend that this is appropriate for our scenario, as scientific pluralists have extensively argued (Chang, Reference Chang2012; Longino, Reference Longino1990, Reference Longino2002; Solomon, Reference Solomon2007).

Thus, the ideal outcome of scientific inquiry may involve scientists settling on different approaches. In homogeneous communities, the ideal outcome is one in which each scientist selects the objectively best approach, the success being evaluated according to a shared standard (Smaldino et al., Reference Smaldino, Moser, Velilla and Werling2022; Wu, Reference Wu2023). However, when agents’ evaluative standards diverge, they may adopt different approaches to maximize their individual scores.

For each combination of parameters, we run 500 simulations, and for each of these simulations, we measure the community success at each step. Then, we compute the average value for success within a specific set of parameters. In particular, each simulation stops after reaching 150 steps, at which point the community typically reaches a stable state because agents no longer change their approaches. We conduct simulations across a range of parameter values, as shown in table 1. For the baseline scenario, we set $n = 30$, $N = 10$, $K = 5$, $r = 5$. Unless specified otherwise, this configuration generates the presented results.

Thus, the average scientific success of a community serves as a reliable estimate of its expected success under a given set of parameters. Because the multiple satisfaction measure defines community success as the average success of individuals, this value also provides a reliable estimate of an individual’s expected success. Under the assumption that each agent is equally likely to start from any approach and hold any possible personal evaluation, the expected value of the multiple satisfaction measure aligns with the expected satisfaction of a single agent. Therefore, comparing the success of different communities under specific parameters offers insight into both collective and individual success within each community.
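
Putting the pieces together, a single run could be sketched as below, assuming the helper functions from the earlier sketches are in scope; the baseline parameter values are those listed above, whereas the structure of the driver itself is our illustration rather than the exact code behind the reported figures.

```python
import random

def run_simulation(n=30, N=10, K=5, d=0.3, r=5, steps=150,
                   total_sharing=True, network=complete_network):
    """One run of the model, reusing the helpers sketched above
    (assign_personal_evaluations, local_search_step, social_learning_step,
    score, complete_network / cycle_network). Returns the community success
    (multiple satisfaction measure) recorded at every step."""
    evals = assign_personal_evaluations(n, N, K, d, r)
    agents = [{'evaluation': e,
               'approach': tuple(random.randint(0, 1) for _ in range(N)),
               'last_explored': None}
              for e in evals]
    neighbors = network(n)
    history = []
    for _ in range(steps):
        for a in agents:                                       # local search first ...
            a['approach'], a['last_explored'] = local_search_step(
                a['approach'], a['evaluation'], K, score)
        social_learning_step(agents, neighbors, K, score,      # ... then social learning
                             total_sharing=total_sharing)
        history.append(sum(score(a['approach'], a['evaluation'], K)
                           for a in agents) / n)
    return history

# Averaging the endpoint over many runs (we use 500 per parameter combination)
# estimates the expected success reported in the figures.
# final_successes = [run_simulation()[-1] for _ in range(500)]
```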

4. Results

In line with previous models, homogeneous communities display well-known dynamics, such as rapidly converging on suboptimal solutions when communication is dense (Lazer and Friedman, Reference Lazer and Friedman2007; Wu, Reference Wu2023). However, the situation changes markedly when heterogeneous communities are introduced. These communities perform better when groups are highly connected and under conditions of total sharing because members can benefit from the diverse exploration paths taken by others. Furthermore, we find that communities with a moderate level of diversity in their evaluative criteria outperform homogeneous ones, as long as both are placed in conditions that are optimal for their performance.

To explain how these dynamics play out, we first discuss the behavior of homogeneous communities and then move to heterogeneous ones.

4.1. Homogeneous communities

The process of local search can be visualized by imagining a person climbing a landscape of peaks and valleys, where each location corresponds to an approach, and the location’s height corresponds to the score the agent assigns to it. Accordingly, a peak corresponds to an approach whose score is higher than the score of all the neighboring approaches, that is, of all the approaches that can be obtained by changing only one component of the starting approach. In the landscape, an agent can only move upward, meaning she can switch to another approach only if it has a higher score than her current one. Once an agent reaches a peak in her landscape, she cannot move any further using local search alone.

In a homogeneous community, agents assign the same scores to each approach. Hence, if two agents start from the same location, they are likely to visit the same locations and reach the same peak. Because of this, social learning involves a trade-off. On the one hand, when an agent shares her currently adopted approach, she allows others to reach a location that they could not have reached by themselves, thereby increasing their success. On the other hand, social learning hinders the exploration potential of a homogeneous community, that is, the number of approaches a community can still explore, because it leads agents to converge on the same locations.

The effect of the communication structure reflects this trade-off. In the early stages of the simulations, a highly connected community rapidly converges on the highest-scoring approach among those available, often getting stuck in suboptimal locations. In contrast, communities organized in a cycle network have greater exploration potential but slower growth. As a result of more sporadic communication, they do not immediately exploit each other’s findings, leading to extensive exploration of their respective starting areas. This makes the community more likely to converge to higher peaks in the long run (fig. 1). This result replicates the “less-is-more” effect (sec. 2).

Figure 1. The success of homogeneous communities under different network combinations ($d = 0$, sharing protocol = total). Shaded areas represent the standard error of the mean.

Total- and partial-sharing protocols show negligible differences. In fact, regardless of the protocol, agents in homogeneous communities adopt only the approaches that other agents also consider the most valuable.

4.2. Heterogeneous communities

The first crucial difference between a community with diverse criteria and one with homogeneous ones lies in the way agents explore approaches through local search. Whereas agents with the same evaluative criteria “see” the same valleys and peaks, agents with diverse criteria may “see” different peaks because the score assigned to an approach is determined by their evaluative criteria. Agents within heterogeneous communities are thus likely to follow different exploration paths even if starting from the same approaches.

Example 1. Consider a set of eight approaches (fig. 2) and two agents, $i$ and $j$, who both start at approach $000$. If $i$ and $j$ share the same evaluative criteria, with scores as shown in the left graph of figure 2, they converge to the approach $010$ and stop there. Instead, suppose agent $i$ assigns scores as indicated in the right graph while agent $j$ retains the scores from the left one. Then $j$ follows the same path, arriving at $010$, whereas agent $i$ may explore $001$ or $100$, then move to $101$, and finally reach $111$. The two agents end up on different approaches.

Figure 2. The same set of eight approaches with two different sets of scores, obtained through two different personal evaluations. Arrows show available exploration paths.
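
To illustrate how the two evaluations pull the agents apart, the following sketch runs a greedy local search over the eight approaches with hypothetical scores; because figure 2’s exact values are not reproduced here, the numbers are chosen only so that the search recovers the paths described in example 1.

```python
# Hypothetical scores standing in for the two graphs of figure 2 (the exact
# values in the figure are not reproduced here); per-approach scores stand in
# for the NK-derived ones.
left = {'000': 0.40, '001': 0.30, '010': 0.80, '100': 0.35,
        '011': 0.50, '101': 0.45, '110': 0.60, '111': 0.70}
right = {'000': 0.40, '001': 0.55, '010': 0.20, '100': 0.50,
         '011': 0.35, '101': 0.70, '110': 0.45, '111': 0.90}

def climb(start, scores):
    """Greedy local search: repeatedly move to the best-scoring single-flip
    neighbor until no neighbor improves on the current approach."""
    current = start
    while True:
        neighbors = [current[:i] + ('1' if current[i] == '0' else '0') + current[i + 1:]
                     for i in range(len(current))]
        best = max(neighbors, key=scores.get)
        if scores[best] <= scores[current]:
            return current
        current = best

print(climb('000', left))    # agent j (left evaluation) ends at 010
print(climb('000', right))   # agent i (right evaluation) ends at 111
```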

Different evaluations produce different search heuristics: Even if agents follow the same local search rule, this leads them to different positions. As Hong and Page (Reference Hong and Page2004) put it, different preferences may produce different associations between approaches and may lead agents on different exploration paths.

Hence, scientists with different evaluative criteria rarely converge. To see this, consider figure 3, which shows the average number of unique approaches that are being used in a community.Footnote 9 When evaluative criteria are shared, this number decreases rapidly, quickly converging to one or two. Instead, when agents have different evaluative criteria, they settle on different approaches: An increase in evaluative diversity ($d$) or in the richness of perspectives ($r$) usually results in an increase in the average number of unique approaches adopted by a community.Footnote 10 As a consequence, more diverse communities, on average, explore a larger number of approaches, as shown in figure 4.

Figure 3. Number of unique approaches adopted at each step by communities with different degrees of diversity.

Figure 4. Number of total approaches explored by communities with different degrees of diversity.

Yet at the same time, diverse criteria are likely to lead agents to ignore the approaches chosen by their neighbors. As agents try to maximize different criteria, they settle on approaches that are only valuable to themselves. Consider again example 1. If agent $i$ and agent $j$ have different evaluative criteria, they do not benefit from learning about the approaches chosen by each other: Agent $i$ has no interest in moving to approach $010$, and agent $j$ has no interest in moving to approach $111$. How, then, can diverse communities benefit from their own exploration potential?

Scientists with different evaluative criteria can profit from each other’s explorations as long as they can learn about every approach that other agents explore. The degree to which a heterogeneous community profits from its exploration potential depends on the type of sharing protocol and the communication structure. Whenever scientists share only the approaches they value the most, heterogeneous communities cannot fully exploit their exploration potential because agents are likely to deprive someone else of valuable indications.

Example 2. Consider the set of approaches in example 1, such that agent $i$’s personal evaluation is visualized by the left graph of figure 2 and agent $j$’s personal evaluation by the right one. Agent $i$ starts from position $011$ and explores approach $111$ before adopting approach $010$. If she does not share with $j$ details about approach $111$, she will deprive $j$ of the possibility of learning about a very valuable approach.

Communication networks have a similar effect. In a homogeneous community, regardless of whether agents are organized on a cycle or a complete network, whenever someone finds a highly valuable approach, sooner or later, everybody will converge on it. Valuable information travels either fast or slowly, but it always reaches everyone. Instead, if the community is heterogeneous and agents are not completely interconnected, much valuable information is at risk of getting lost.

Consider example 2 again, and suppose that this time agents share all the approaches they explore but are organized on a cycle. Suppose, for instance, that agent $k$ is connected with both $j$ and $i$, whereas $i$ is not connected to $j$. If agent $k$ does not have an interest in approach $111$, she may receive information about it from $i$, but she will not adopt it. As a consequence, in the next round of social learning, she will not share the details of approach $111$ with agent $j$. Even if the community uses total sharing, the lack of a direct connection between $i$ and $j$ prevents agent $j$ from learning about the approach $111$ discovered by $i$, from which $j$ would have greatly benefited. Instead, if agents $i, j, k$ had the same personal evaluation, the lack of a direct connection between $j$ and $i$ would not harm $j$. Whenever $j$ would benefit from learning about $111$, agent $k$ would also benefit from adopting it, which would result in $k$ moving to approach $111$ and sharing information about it with agent $j$.

In short, although there is great exploration potential in heterogeneous communities (fig. 4), their performance is suboptimal whenever agents share only the adopted approaches and are not completely interconnected: A limited flow of information prevents the community from benefiting from exploration.

Combining total sharing with a highly connected network completely changes the picture. As long as every agent shares each new approach she explores, most agents are always informed about the explorations of everyone else. Consequently, agents can benefit from approaches discovered by others, even if the discoverer neither explicitly sought out nor valued those approaches. We refer to these cases as instances of proxy serendipity because they resemble serendipitous discoveries—discoveries that occur to a scientist who was looking for something else (Copeland, Reference Copeland2017, Reference Copeland, Copeland, Ross and Sand2023)—carried out by a “proxy” agent. Consider example 2: Although agent $j$ could benefit greatly from learning about approach $111$ from agent $i$, agent $i$ does not value it at all.

The possibility of proxy serendipity makes social learning an added value for heterogeneous communities. Social learning enables agents to capitalize on others’ explorations and even increases the exploration potential of a diverse community. In fact, an agent who moves to a new approach is likely to explore its neighborhood through new paths. Whereas a dense information flow generates some degree of convergence in moderately heterogeneous and homogeneous communities, it opens up new paths for exploration in radically heterogeneous ones.

This suggests a crucial difference between homogeneous communities and diverse ones. In homogeneous communities, limiting the information flow does not prevent the community from capitalizing on exploration; it simply slows down its ability to do so (fig. 1). Important information always finds a way to reach everybody, even if indirectly or after some time. In diverse communities, limiting information can—and most of the time does—impede exploitation (fig. 5). Because information is not equally valuable to everyone, if an approach is not immediately communicated to the person who values it, the information is lost. This suggests the importance of circulating unsuccessful explorations. We return to this point in the discussion.

Figure 5. The average success of communities with different degrees of diversity is plotted based on different values for diversity, networks, and sharing protocols.

4.3. Diverse communities beat homogeneous ones

How, then, do homogeneous communities fare with respect to heterogeneous ones? The answer depends on a number of parameters. Nonetheless, figure 5 already hints at the main result of our work, which is that diverse communities may systematically outperform homogeneous ones. The combination of a complete network with total sharing benefits diverse communities to the point that they perform much better than any homogeneous community. This implies that an agent’s expected success can be higher in a completely connected, diverse community using total sharing than in a community where every member shares her criteria.

Figure 6 allows us to determine the degree of diversity required for heterogeneous communities to outperform homogeneous ones. For the parameter combination considered in the figure, this threshold lies around $0.3$, indicating that even a moderate level of diversity can be sufficient for a community to surpass a homogeneous counterpart organized on a cycle network. However, this value also depends on the complexity of the problem ($K$)—which we discuss in the next section—as well as the maximum number of unique evaluations available for each unit on which the community disagrees ($r$).

Figure 6. The average success of a community at 150 steps. Shaded areas represent the standard deviation.

The effect of $r$ on the success of a community is illustrated in figure 7. An increase in the richness of perspectives corresponds to greater success for heterogeneous communities with total sharing and a complete network and lower success for heterogeneous communities with partial sharing and a cycle network. In fact, a greater richness leads to a wider divergence in evaluations, promoting a more thorough exploration of different approaches. This fosters greater epistemic success whenever agents are all informed of other agents’ explorations—or a less successful inquiry when communication is scarce. The richness of perspectives functions as an amplifier of the effect of evaluative diversity. Consider figure 7 again, and compare the performance of a homogeneous community with partial sharing and a cycle network (green line in the first box) with that of a diverse community with $d = 0.3$, complete network, and total sharing (purple line in the second box). The homogeneous community outperforms the diverse one if $r = 2$, whereas the opposite is true if $r = 10$. Whenever agents can capitalize on diversity, a wide variety of perspectives amplifies its beneficial impact.

Figure 7. Impact of the maximum number of unique evaluations ($r$) on the success of communities. Shaded areas represent the standard error of the mean.

Figure 6 also provides us with information about the variability of our results. It shows that although the success of a homogeneous community can be highly variable (large standard deviation), increasing the evaluative diversity results in a decrease in variability (increasingly small standard deviation). In fact, homogeneous communities’ performance heavily depends on their starting conditions. If agents begin in a region filled with high-quality approaches, they achieve great success; if they start in a less favorable area, the entire community gets stuck with low-success approaches. On the other hand, the diversity in search paths within heterogeneous communities results in much more exploration, preventing the starting conditions from strongly affecting the results. The greater variability in homogeneous communities is primarily due to their sensitivity to initial conditions.

Finally, it should be noted that while heterogeneous communities outperform homogeneous ones under specific conditions, they only do so after a certain number of steps. As illustrated in figure 5, in the very short run (e.g., in the first 30–40 steps—although the precise number depends on the parameter combination), homogeneous communities fare much better than diverse ones. This is due to different search patterns: Homogeneous communities converge rather easily on superior peaks, whereas diverse communities may continue exploring for a much longer time (fig. 4).

4.4. Community size and complexity

Evaluative diversity also affects the way other factors influence the community. First, while homogeneous communities perform markedly worse as the complexity of the problem increases (sec. 4.1), heterogeneous communities with a dense information flow are less affected by it. Figure 8 shows this pattern. Homogeneous communities perform consistently worse because complexity makes individual agents more likely to land on suboptimal local peaks. An increase in $K$ corresponds to a decrease in the potential of local search. Because agents see the same landscape, they are likely to get stuck on the same suboptimal peaks. This effect is less pronounced in the case of heterogeneous communities with a dense information flow because agents follow different paths.

Figure 8. Impact of $K$ on the success of communities ($N = 15$). Shaded areas represent the standard error of the mean.

An increase in the size of a community affects the results in a similar way. As shown in figure 9, a larger number of agents tends to benefit homogeneous communities the most, whereas heterogeneous communities gain from this increase only when the flow of information is sufficiently dense. This is the case because adding more agents typically leads to a greater number of locations being explored. This expansion in explored areas is particularly significant in homogeneous communities, where the overall exploration potential is usually lower, allowing them to benefit from the increased size. In contrast, heterogeneous communities only profit from a larger number of explored approaches when every agent has access to that information. Because this is not the case with limited communication, increasing the number of agents has little or no effect.

Figure 9. Impact of the number of agents on the success of communities. Shaded areas represent the standard error of the mean.

4.5. Reassessing community performance

As discussed in section 3.3, our results are based on a measure of individual success determined by each agent’s own evaluation, reflecting the idea that scientists can adhere to different, yet epistemically admissible, standards. This approach differs significantly from how NK frameworks have traditionally been used and raises a natural question: How would our results change if epistemic success were assessed using a single, universally correct standard? To address this, we introduce the single-evaluation measure. Under this measure, diverse communities never outperform homogeneous ones. While the single-evaluation measure does not reflect the kinds of inquiry our model is designed to capture (sec. 2), the discussion that follows highlights the importance of the multiple satisfaction measure and serves as an initial robustness check of our findings (Frey and Šešelja, Reference Frey and Šešelja2020).

Here, we retain the behavioral rules of our model (agents still act according to their own evaluative criteria), but the epistemic success of each agent is now assessed according to only one evaluation, which we shall understand as the only epistemically correct evaluation. For every community, we randomly select one of the agents and take her personal evaluation to be the correct one. All other evaluations are to be considered epistemically inaccurate. Then, we evaluate every agent’s individual success based on the correct evaluation. For homogeneous communities, nothing effectively changes from what we had before because all agents share the same evaluation. Conversely, agents in heterogeneous communities may now be evaluated based on an evaluation they themselves do not use when selecting which approach to adopt.
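
In code, the modification is small; here is a sketch, reusing the agent representation and score helper assumed in the earlier sketches:

```python
import random

def single_evaluation_success(agents, K, score):
    """Single-evaluation measure: pick one agent's personal evaluation at
    random, treat it as the epistemically correct one, and average the score
    it assigns to every agent's adopted approach."""
    correct = random.choice(agents)['evaluation']
    return sum(score(a['approach'], correct, K) for a in agents) / len(agents)
```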

Under this modified setup, heterogeneous communities perform worse on average as diversity increases (fig. 10). This result reflects an increasing misalignment between the standards agents use to guide their search and the one used to measure their epistemic success. In other words, agents in diverse communities may adopt approaches that best meet their own criteria, but these approaches may score poorly under the epistemically correct standard. Figure 10 also illustrates that heterogeneous communities still profit more from a complete network and total sharing than from a cycle network and partial sharing.

Figure 10. Community performance under the single-evaluation measure. The average success of communities with different degrees of diversity is plotted based on different values for diversity, networks, and sharing protocols.

This analysis confirms that if a single epistemic standard is assumed to be correct, evaluative diversity offers no epistemic benefit. Thus, whether diversity and dense communication structures are epistemically beneficial or not depends on the underlying notion of epistemic success. This suggests that while our results might provide insights concerning the contexts we discussed earlier (sec. 2), one should be very careful in overgeneralizing them to different contexts of inquiry.

5. Discussion

This model explores collective problem-solving in a context of scientific evaluative diversity.Footnote 11 Our results can be summarized as follows. First, scientists with diverse standards can effectively collaborate to achieve satisfying outcomes, particularly when they are highly connected and willing to share intermediate results. The key mechanism driving this success, which we call proxy serendipity, allows scientists to benefit from approaches they might not have pursued independently but that ultimately prove valuable to them. Under these conditions, diverse communities may demonstrate greater success in addressing complex problems and consistently outperform homogeneous communities.

These results come with certain limitations. First, it is important to acknowledge the highly idealized nature of our model (Frey and Šešelja, Reference Frey and Šešelja2020; Martini and Fernández Pinto, Reference Martini and Fernández Pinto2017). As such, it is designed to explore logical possibilities rather than to provide direct explanations for real-world phenomena. Specifically, our model operates under the idealized assumptions that communication among scientists is always successful and that all the adopted evaluative standards are equally legitimate (sec. 4.5). Yet effective communication can be challenging, especially when scientists adhere to different standards, potentially leading to misinterpretations. Additionally, even in diverse scientific communities, not all evaluative standards may be scientifically admissible—some may be influenced by financial incentives rather than epistemic considerations. We recognize these limitations and plan to relax both assumptions in future work.

Nonetheless, a number of observations can be made in view of our findings. In what follows, we first discuss evaluative diversity in relation to the broader literature on diversity. Next, we explore the potential implications of our findings for general philosophy of science.

5.1. Evaluative diversity as a new type of diversity

To our knowledge, our model is the first to formally examine the impact of a plurality of values and aims on scientific practice. As such, it contributes to the growing body of formal literature on the role of diversity in science by adding a missing piece to it.

Our results suggest that the impact of diversity in personal evaluations is related to, but not reducible to, the effects of other kinds of diversity (Steel et al., Reference Steel, Fazelpour, Gillette, Crewe and Burgess2018), such as demographic diversity (Huang, Reference Huang2024; Steel et al., Reference Steel, Fazelpour, Crewe and Gillette2021) or cognitive diversity (Pöyhönen, Reference Pöyhönen2017). Diversity in evaluations naturally leads scientists to adopt different approaches and consequently results in a thorough exploration of various options. Furthermore, scientists with different evaluations explore the available approaches as if they follow different search heuristics (sec. 4.2). This supports Hong and Page’s (Reference Hong and Page2004) hypothesis that different preferences may lead to different search strategies and complements the formal literature on learning heuristics (Reijula et al., Reference Reijula, Kuorikoski and MacLeod2023; Reijula and Kuorikoski, Reference Reijula and Kuorikoski2022; Weisberg and Muldoon, Reference Weisberg and Muldoon2009).

Because evaluative diversity naturally fosters the exploration of various alternative approaches, looking at contexts characterized by it requires a shift in the modeler’s mindset. In homogeneous communities, one is usually looking for mechanisms that could prevent scientists from herding (Wu and O’Connor, Reference Wu and O’Connor2023). In contexts of evaluative diversity, herding is no longer a problem. When studying diverse communities, it is instead necessary to explore what mechanisms allow them to benefit from the exploration potential they naturally have (sec. 4.2).

5.2. Scientific communities can profit from evaluative diversity

Our results also suggest a number of observations concerning the social epistemology of scientific inquiry. First, they further qualify our general understanding of different communication protocols. They indicate that the optimal communication structure depends not only on the complexity of the problem at hand (Rosenstock et al., Reference Rosenstock, Bruner and O’Connor2017), the stubbornness of scientists (Frey and Šešelja, Reference Frey and Šešelja2020), the way they interpret evidence (Michelini et al., Reference Michelini, Osorio, Houkes, Šešelja and Straßer2023), and their research heuristics (Pöyhönen, Reference Pöyhönen2017) but also on the diversity of epistemic standards. In fact, previous work has shown that communities where agents share the same standards may perform better with limited information flow (Wu, Reference Wu2023), whereas we find that communities characterized by a plurality of goals and values may be better off communicating as much as possible.

According to our results, heterogeneous communities may greatly profit from agents sharing every approach they explore, regardless of whether they value those approaches (sec. 4.2). Under our formal assumptions, agents with diverse standards are better off sharing not only their substantive findings but also those they might not personally consider substantive. Any finding could still prove valuable to others, leading to instances of proxy-serendipitous discoveries. Proxy serendipity, where an agent adopts an approach discovered by someone else who is searching for something different, may be crucial to the success of heterogeneous communities.

These results are consistent with historical research on scientific progress, which shows that scientists can derive significant benefits from approaches initially developed for different purposes (Copeland, Reference Copeland, Copeland, Ross and Sand2023; Nickelsen, Reference Nickelsen2022). Chang describes such cases as instances of “co-optation,” citing, for example, Lavoisier and colleagues, who used phlogistonist experimental results to advance new theories (Chang, Reference Chang2012). Our model advances this literature by providing a formal framework that can capture this phenomenon and identify the conditions under which it becomes critical for successful scientific inquiry.

Nonetheless, in practice, scientists may not have an incentive to communicate the results of all their explorations (Strevens, Reference Strevens, Boyer-Kassem, Mayo-Wilson and Weisberg2017). Communicating the details of an approach is costly, and scientists may do so only if they believe it could grant them some reward. Indeed, scientists seldom share the results of their unsuccessful explorations because they expect little to no recognition for them, a problem usually known as the “file-drawer problem” (Rosenthal, Reference Rosenthal1979). In other words, heterogeneous communities may face a collective prisoner’s dilemma: While everyone benefits from learning about others’ (unsuccessful) explorations, there may be little to no personal benefit in sharing them. This dilemma resembles the one discussed by Boyer (Reference Boyer2014), Heesen (Reference Heesen2017), and Strevens (Reference Strevens, Boyer-Kassem, Mayo-Wilson and Weisberg2017), who ask what could motivate scientists to share intermediate results even when doing so exposes them to the risk of being scooped. Among the solutions, they suggest that information withholding may be prevented through some sort of Hobbesian contract between scientists (Strevens, Reference Strevens, Boyer-Kassem, Mayo-Wilson and Weisberg2017) or by adequately rewarding sharing practices (Boyer, Reference Boyer2014; Heesen, Reference Heesen2017). In a similar vein, Copeland (Reference Copeland2017) argues that institutions should, upon a new discovery, reward all scientists who contributed to it, even if only by inspiring the researchers who directly worked on the project. Our results suggest that this proposal may be promising: If a scientist is rewarded based on the extent to which she contributes to others’ successes, she may be more willing to consistently share her approaches in the hope that someone else finds them valuable.

Finally, our results show that under specific conditions, heterogeneous communities outperform homogeneous ones. This observation contributes to the ongoing debate on the effectiveness of scientific pluralism (Chang, Reference Chang and Saatsi2017), particularly with respect to how much we can and should support scientific groups that pursue divergent standards (Kourany, Reference Kourany2010). Chang (Reference Chang2012, p. 268) challenges the idea that scientific resources should be concentrated on a single line of inquiry adhering to a specific set of standards. Instead, he argues that society should support multiple lines of inquiry because scientists with different standards can benefit from each other. Our results may be taken to formally refine this argument. Specifically, they indicate that a scientist following a particular set of standards may be more likely to achieve greater success—that is, to discover a more valuable approach based on her standards—in a heterogeneous community rather than a homogeneous one, especially if the problem at stake is complex. However, this advantage holds only if the heterogeneous community is fully connected, with agents actively sharing any approach they explore. In short, funding research with diverse goals can be an optimal strategy, but only when accompanied by a system of incentives that encourages scientists to share all the approaches they explore with as many colleagues as possible.

6. Conclusion

In this article, we explored the effect of a plurality of values and goals on scientific problem-solving. We developed a formal model to simulate scientific inquiry in contexts where agents have different evaluative standards and, consequently, evaluate the same approaches differently. Our results suggest that a moderate degree of evaluative diversity, combined with a dense communication structure and willingness to share intermediate approaches, benefits the entire community.

This work represents an initial step toward adapting formal models of collective inquiry to the diverse nature of scientific practice. A possible direction for future research is to extend the present model so that scientists’ standards can themselves change. This would allow collective inquiry to be studied as a process in which scientific standards and scientific exploration continuously influence each other.

Acknowledgments

We would like to thank Dunja Šešelja, Christian Straßer, Wybo Houkes, Sanaa Jukola, Klee Shöppl, Leon Assaad, Rafael Fuchs, Jonas Stein, Felix Kopecky, Cristoph Merdes, and three anonymous referees for valuable feedback on earlier versions of this article.

Funding information

This research was supported by the Ministerio de Economía y Competitividad (grant number FFI2017-87395-P) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; project number 426833574).

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/psa.2025.10174.

Footnotes

1 When dealing with formal models, philosophers often use the terms theory, research approach, methods, and models interchangeably when discussing the targets of scientific investigation. In fact, most models of scientific inquiry allow for diverse interpretations, as does the present model (sec. 3). We will refer to these multiple targets as scientific approaches, or simply approaches, following Wu (Reference Wu2023) and Thoma (Reference Thoma2015).

2 All data, code, and supplementary material are available in the following Open Science Framework (OSF) repository: https://osf.io/6urwj/?view_only=01cd2aa373ae4a77ba41c53ecf1fb72a.

3 While the locations of the landscape have consistently been called approaches, their heights have received different names, reflecting slightly different interpretations. Pöyhönen (Reference Pöyhönen2017) and Weisberg and Muldoon (Reference Weisberg and Muldoon2009) speak of the epistemic significance of each approach, whereas Wu (Reference Wu2023) speaks of scores.

4 A more technical version of how scores are assigned can be found in an online-only appendix, available in the OSF repository.

5 Although the quantitative results presented depend on this specific setup, our main findings hold for other procedures. As a test, we performed simulations under the assumption that agents would draw their score from a normal distribution, and we obtained strikingly similar results. See the supplementary material for details.
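For concreteness, a minimal sketch of such a variant (our illustration only, not the assignment procedure described in the online appendix; the way the diversity parameter enters the draw, and the clipping to [0, 1], are assumptions made here for illustration) could read:

```python
import numpy as np

rng = np.random.default_rng(2025)

def personal_scores(baseline, diversity, rng):
    """Draw one agent's scores from a normal distribution centred on a
    shared baseline landscape. `diversity` scales how far personal
    evaluations may drift from the baseline (an illustrative choice),
    and scores are clipped to [0, 1] to stay on the landscape scale."""
    return np.clip(rng.normal(loc=baseline, scale=diversity), 0.0, 1.0)

baseline = rng.uniform(0.0, 1.0, size=8)    # heights of eight approaches
evaluations = [personal_scores(baseline, diversity=0.2, rng=rng)
               for _ in range(5)]            # five agents, five evaluations
```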

6 In real scientific communities, the total-sharing procedure is likely to require more effort from scientists than the partial-sharing procedure. Similarly, learning about new approaches from people with different evaluative standards is also likely to require more effort than learning from people with similar standards. In fact, scientists with different standards are likely to struggle to communicate about the value of their approaches. However, in the present work, we leave effort and communication difficulties out of the picture.

7 Although we are aware that distinguishing a legitimate scientific research program from a non-legitimate one may be rather complicated in real science (Chang, Reference Chang and Saatsi2017), we leave this matter aside and simply assume that a filter has been applied already. Combining scientifically admissible and inadmissible standards (e.g., due to financial incentives) represents a stimulating future direction of our research.

8 To highlight the implications of this difference, we later compare our results with those derived from a more conventional measure based on a unique epistemic standard (sec. 4.5). We are grateful to an anonymous reviewer for suggesting the inclusion of this comparative analysis.

9 As indicated in the figure legends, we usually compare a community organized on a cycle network with partial sharing to a community organized on a complete network with total sharing because they capture two extremes of information flow: The former involves minimal information exchange, whereas the latter corresponds to extensive information exchange. Figure 5 later in the article provides an overview of community success for all possible combinations. The behavior of communities under every combination of network and sharing protocol is detailed in the Supplementary Analysis.
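For readers who want to see the two extremes side by side, the following sketch (illustrative only; it assumes the networkx library, and the community size of 15 merely echoes the value used in figure 8) builds both communication structures and compares how many colleagues a single agent can learn from:

```python
import networkx as nx

N_AGENTS = 15  # echoes the community size used in figure 8

cycle = nx.cycle_graph(N_AGENTS)        # minimal information flow
complete = nx.complete_graph(N_AGENTS)  # maximal information flow

def audience(network, agent_id):
    """Colleagues from whom an agent can learn in a given structure."""
    return list(network.neighbors(agent_id))

print(len(audience(cycle, 0)), len(audience(complete, 0)))  # 2 vs 14
```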

10 Lazer and Friedman observe that in fully connected networks, agents initially converge on a few approaches but then go on exploring new ones, producing a “bouncing” effect—an initial decline in the number of unique approaches followed by a temporary increase—before ultimately settling on a single approach (Lazer and Friedman, Reference Lazer and Friedman2007, 679). In our model, this effect is absent in homogeneous and minimally diverse communities ($d < 0.3$), but it becomes more pronounced as diversity increases. Greater evaluative diversity sustains communal exploration, making the bouncing effect increasingly visible.
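A minimal sketch of how the bouncing pattern can be detected (illustrative only; the trajectory below is invented and the function name is hypothetical) counts the distinct approaches adopted at each step:

```python
def unique_adopted_per_step(history):
    """history[t] lists the approach adopted by each agent at step t.
    The resulting series makes the bouncing pattern visible: an early
    drop in distinct approaches, a temporary rebound, then convergence."""
    return [len(set(step)) for step in history]

# Invented toy trajectory of five agents over four steps:
history = [
    [0, 1, 2, 3, 4],   # initial spread of approaches
    [1, 1, 2, 2, 4],   # early convergence
    [1, 5, 2, 6, 4],   # rebound: renewed exploration
    [5, 5, 5, 5, 5],   # final convergence on a single approach
]
print(unique_adopted_per_step(history))  # [5, 3, 5, 1]
```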

11 We argue that our communities engage in collective problem-solving despite having different evaluative criteria because their search processes remain interdependent. Because agents learn from one another’s discoveries, problem-solving retains a fundamentally collective dimension.

References

Alexander, Jason McKenzie, Himmelreich, Johannes, and Thompson, Christopher. 2015. “Epistemic Landscapes, Optimal Search, and the Division of Cognitive Labor.” Philosophy of Science 82 (3):424–53. doi: https://doi.org/10.1086/681766.
Baeyer, Adolf von. 1870. “On the Withdrawal of Water and Its Significance for Plant Life and Fermentation.” Berichte der Deutschen Chemischen Gesellschaft 3:63–75.
Bedessem, Baptiste. 2019. “The Division of Cognitive Labor: Two Missing Dimensions of the Debate.” European Journal for Philosophy of Science 9 (1):article 3. doi: https://doi.org/10.1007/s13194-018-0230-8.
Boyer, Thomas. 2014. “Is a Bird in the Hand Worth Two in the Bush? Or, Whether Scientists Should Publish Intermediate Results.” Synthese 191 (1):17–35. doi: https://doi.org/10.1007/s11229-012-0242-4.
Chang, Hasok. 2012. Is Water H$_2$O? Evidence, Realism and Pluralism. Dordrecht, Netherlands: Springer.
Chang, Hasok. 2017. “Is Pluralism Compatible with Scientific Realism?” In The Routledge Handbook of Scientific Realism, edited by Saatsi, Juha, 176–86. New York: Routledge.
Copeland, Samantha. 2023. “Serendipity and the History of the Philosophy of Science.” In Serendipity Science: An Emerging Field and Its Methods, edited by Copeland, Samantha, Ross, Wendy, and Sand, Martin, 101–23. Cham, Switzerland: Springer. doi: https://doi.org/10.1007/978-3-031-33529-7-6.
Copeland, Samantha M. 2017. “On Serendipity in Science: Discovery at the Intersection of Chance and Wisdom.” Synthese 196:1–22. doi: https://doi.org/10.1007/s11229-017-1544-3.
Šešelja, Dunja. 2023. “Agent-Based Modeling in the Philosophy of Science.” In The Stanford Encyclopedia of Philosophy, edited by Zalta, Edward N. Stanford: Stanford University Press.
Frey, Daniel, and Šešelja, Dunja. 2020. “Robustness and Idealizations in Agent-Based Models of Scientific Interaction.” British Journal for the Philosophy of Science 71 (4):1411–37. doi: https://doi.org/10.1093/bjps/axy039.
Grim, Patrick, Singer, Dan, Bramson, Aaron, Holman, Bennett, McGeehan, Sean, and Berger, William. 2018. “Diversity, Ability, and Expertise in Epistemic Communities.” Philosophy of Science 86 (1):98–123. doi: https://doi.org/10.1086/701070.
Heesen, Remco. 2017. “Communism and the Incentive to Share in Science.” Philosophy of Science 84 (4):698–716. doi: https://doi.org/10.1086/693875.
Hong, Lu, and Page, Scott E. 2004. “Groups of Diverse Problem Solvers Can Outperform Groups of High-Ability Problem Solvers.” Proceedings of the National Academy of Sciences 101 (46):16385–89. doi: https://doi.org/10.1073/pnas.0403723101.
Huang, Alice C. W. 2024. “Landscapes and Bandits: A Unified Model of Functional and Demographic Diversity.” Philosophy of Science 91 (3):579–94. doi: https://doi.org/10.1017/psa.2023.169.
Kauffman, Stuart, and Levin, Simon. 1987. “Towards a General Theory of Adaptive Walks on Rugged Landscapes.” Journal of Theoretical Biology 128 (1):11–45. doi: https://doi.org/10.1016/S0022-5193(87)80029-2.
Kellert, Stephen H., Longino, Helen E., and Kenneth Waters, C., editors. 2006. Scientific Pluralism. Minneapolis: University of Minnesota Press.
Kitcher, Philip. 1993. The Advancement of Science: Science without Legend, Objectivity without Illusions. New York: Oxford University Press.
Kitcher, Philip. 2013. “Toward a Pragmatist Philosophy of Science.” Theoria 77:185–231.
Kourany, Janet A. 2010. Philosophy of Science after Feminism. Oxford: Oxford University Press. doi: https://doi.org/10.1093/acprof:oso/9780199732623.001.0001.
Lazer, David, and Friedman, Allan. 2007. “The Network Structure of Exploration and Exploitation.” Administrative Science Quarterly 52 (4):667–94. doi: https://doi.org/10.2189/asqu.52.4.667.
Liebig, Justus. 1843. “Die Wechselwirthschaft.” Annalen der Chemie und Pharmazie 46:58–97.
Longino, Helen E. 1987. “Can There Be a Feminist Science?” Hypatia 2 (3):51–64. doi: https://doi.org/10.1111/j.1527-2001.1987.tb01341.x.
Longino, Helen E. 1990. Science as Social Knowledge. Princeton, NJ: Princeton University Press.
Longino, Helen. 2002. The Fate of Knowledge. Princeton, NJ: Princeton University Press.
Longino, Helen. 2006. “Theoretical Pluralism and the Scientific Study of Behavior.” In Scientific Pluralism, edited by Kellert, Stephen, Longino, Helen, and Kenneth Waters, C., 102–31. Minneapolis: University of Minnesota Press.
Longino, Helen E. 2019. Studying Human Behavior: How Scientists Investigate Aggression and Sexuality. Chicago, IL: University of Chicago Press.
Ludwig, David, and Ruphy, Stéphanie. 2021. “Scientific Pluralism.” In The Stanford Encyclopedia of Philosophy, edited by Zalta, Edward N. Stanford: Stanford University Press.
Martini, Carlo, and Fernández Pinto, Manuela. 2017. “Modeling the Social Organization of Science: Chasing Complexity through Simulations.” European Journal for Philosophy of Science 7:221–38. doi: https://doi.org/10.1007/s13194-016-0153-1.
Michelini, Matteo, Osorio, Javier, Houkes, Wybo, Šešelja, Dunja, and Straßer, Christian. 2023. “Scientific Disagreements and the Diagnosticity of Evidence: How Too Much Data May Lead to Polarization.” Journal of Artificial Societies and Social Simulation 26 (4):5. doi: https://doi.org/10.18564/jasss.5113.
Nickelsen, Kärin. 2021. “Cooperative Division of Cognitive Labour: The Social Epistemology of Photosynthesis Research.” Journal for General Philosophy of Science 53:23–40. doi: https://doi.org/10.1007/s10838-020-09543-1.
Nickelsen, Kärin. 2022. Explaining Photosynthesis: Models of Biochemical Mechanisms, 1840–1960. Dordrecht, Netherlands: Springer.
Parker, Wendy. 2006. “Understanding Pluralism in Climate Modeling.” Foundations of Science 11 (4):349–68. doi: https://doi.org/10.1007/s10699-005-3196-x.
Petrovich, Eugenio, and Viola, Marco. 2018. “Social Epistemology at Work: From Philosophical Theory to Policy Advice.” RT: A Journal on Research Policy and Evaluation 6 (1). doi: https://doi.org/10.13130/2282-5398/9828.
Politi, Vincenzo. 2021. “Formal Models of the Scientific Community and the Value-Ladenness of Science.” European Journal for Philosophy of Science 11 (4):97. doi: https://doi.org/10.1007/s13194-021-00418-w.
Pöyhönen, Samuli. 2017. “Value of Cognitive Diversity in Science.” Synthese 194 (11):4519–40. doi: https://doi.org/10.1007/s11229-016-1147-4.
Reijula, Samuli, and Kuorikoski, Jaakko. 2022. “The Diversity-Ability Trade-Off in Scientific Problem Solving.” Philosophy of Science 88 (5):894–905. doi: https://doi.org/10.1086/714938.
Reijula, Samuli, Kuorikoski, Jaakko, and MacLeod, Miles. 2023. “The Division of Cognitive Labor and the Structure of Interdisciplinary Problems.” Synthese 201 (6):214. doi: https://doi.org/10.1007/s11229-023-04193-4.
Rosenstock, Sarita, Bruner, Justin, and O’Connor, Cailin. 2017. “In Epistemic Networks, Is Less Really More?” Philosophy of Science 84 (2):234–52. doi: https://doi.org/10.1086/690717.
Rosenthal, Robert. 1979. “The File Drawer Problem and Tolerance for Null Results.” Psychological Bulletin 86 (3):638. doi: https://doi.org/10.1037/0033-2909.86.3.638.
Schindler, Samuel. 2022. “Theoretical Virtues: Do Scientists Think What Philosophers Think They Ought to Think?” Philosophy of Science 89 (3):542–64. doi: https://doi.org/10.1017/psa.2021.40.
Smaldino, Paul E., Moser, Cody, Velilla, Alejandro Pérez, and Werling, Mikkel. 2022. “Maintaining Transient Diversity Is a General Principle for Improving Collective Problem Solving.” Perspectives on Psychological Science 19 (2):54–64. doi: https://doi.org/10.1177/17456916231180100.
Solomon, Miriam. 2007. Social Empiricism. Cambridge, MA: MIT Press.
Steel, Daniel, Fazelpour, Sina, Crewe, Bianca, and Gillette, Kinley. 2021. “Information Elaboration and Epistemic Effects of Diversity.” Synthese 198:1287–307. doi: https://doi.org/10.1007/s11229-019-02108-w.
Steel, Daniel, Fazelpour, Sina, Gillette, Kinley, Crewe, Bianca, and Burgess, Michael. 2018. “Multiple Diversity Concepts and Their Ethical-Epistemic Implications.” European Journal for Philosophy of Science 8 (3):761–80. doi: https://doi.org/10.1007/s13194-018-0209-5.
Strevens, Michael. 2013. “Herding and the Quest for Credit.” Journal of Economic Methodology 20 (1):19–34. doi: https://doi.org/10.1080/1350178x.2013.774849.
Strevens, Michael. 2017. “Scientific Sharing, Communism, and the Social Contract.” In Scientific Collaboration and Collective Knowledge, edited by Boyer-Kassem, Thomas, Mayo-Wilson, Conor, and Weisberg, Michael, 3–33. Oxford: Oxford University Press.
Thoma, Johanna. 2015. “The Epistemic Division of Labor Revisited.” Philosophy of Science 82 (3):454–72. doi: https://doi.org/10.1086/681768.
Ward, Zina B. 2021. “On Value-Laden Science.” Studies in History and Philosophy of Science Part A 85:54–62. doi: https://doi.org/10.1016/j.shpsa.2020.09.006.
Ward, Zina B. 2022. “Disagreement and Values in Science.” In The Routledge Handbook of Philosophy of Disagreement, edited by Baghramian, Maria, Carter, J. Adam, and Cosker-Rowland, Rach, 297–308. New York: Routledge.
Weisberg, Michael, and Muldoon, Ryan. 2009. “Epistemic Landscapes and the Division of Cognitive Labor.” Philosophy of Science 76 (2):225–52. doi: https://doi.org/10.1086/644786.
Winsberg, Eric. 2012. “Values and Uncertainties in the Predictions of Global Climate Models.” Kennedy Institute of Ethics Journal 22 (2):111–37. doi: https://doi.org/10.1353/ken.2012.0008.
Wu, Jingyi. 2023. “Better Than Best: Epistemic Landscapes and Diversity of Practice in Science.” Philosophy of Science 91 (5):1189–98. doi: https://doi.org/10.1017/psa.2023.129.
Wu, Jingyi, and O’Connor, Cailin. 2023. “How Should We Promote Transient Diversity in Science?” Synthese 201 (2):124. doi: https://doi.org/10.1007/s11229-023-04037-1.
Zollman, Kevin J. S. 2007. “The Communication Structure of Epistemic Communities.” Philosophy of Science 74 (5):574–87. doi: https://doi.org/10.1086/525605.
Zollman, Kevin J. S. 2010. “The Epistemic Benefit of Transient Diversity.” Erkenntnis 72 (1):17–35. doi: https://doi.org/10.1007/s10670-009-9194-6.
Zollman, Kevin J. S. 2013. “Network Epistemology: Communication in Epistemic Communities.” Philosophy Compass 8 (1):15–27. doi: https://doi.org/10.1111/j.1747-9991.2012.00534.x.
Table 1. Parameter Description and Value Range Explored

Figure 1. The success for homogeneous communities under different network combinations ($d = 0$, sharing protocol = total). Shaded areas represent the standard error of the mean.

Figure 2. The same set of eight approaches with two different sets of scores, obtained through two different personal evaluations. Arrows show available exploration paths.

Figure 3. Number of unique approaches adopted at each step by communities with different degrees of diversity.

Figure 4. Number of total approaches explored by communities with different degrees of diversity.

Figure 5. The average success of communities with different degrees of diversity is plotted based on different values for diversity, networks, and sharing protocols.

Figure 6. The average success of a community at 150 steps. Shaded areas represent the standard deviation.

Figure 7. Impact of the maximum number of unique evaluations ($r$) on the success of communities. Shaded areas represent the standard error of the mean.

Figure 8. Impact of $K$ on the success of communities ($N = 15$). Shaded areas represent the standard error of the mean.

Figure 9. Impact of the number of agents on the success of communities. Shaded areas represent the standard error of the mean.

Figure 10. Community performance under the single-evaluation measure. The average success of communities with different degrees of diversity is plotted based on different values for diversity, networks, and sharing protocols.
