A string of reports has shown that people cheat more in order to avoid penalties than to win equivalent rewards (e.g., Cameron and Miller, 2009; Grolleau et al., 2016; Huynh, 2020; Kern and Chugh, 2009; Klein et al., 2020; Markiewicz and Czupryna, 2020; Schindler and Pfattheicher, 2017; Zhang et al., 2023). To explain these findings, the literature heavily cites loss aversion (Kahneman and Tversky, 1979), the tendency to give more subjective weight to losses than gains, as the potential cause of this discrepancy. However, the loss aversion model has been recently challenged (see e.g., Ert and Erev, 2013; Gal and Rucker, 2018, 2021; Lejarraga et al., 2019; Malul et al., 2013; Rakow et al., 2020; Yechiam, 2019; Yechiam and Hochman, 2013; Yechiam and Zeif, 2025), with findings showing no loss aversion for the small amounts typically used in the experimental literature on cheating (e.g., Zeif and Yechiam, 2022). Moreover, some studies were not able to replicate the greater cheating for losses than for gains (Charness et al., 2019; Ezquerra et al., 2018; Jones and Paulhus, 2017; Ortiz et al., 2023; Reis et al., 2022; Shalvi and De Dreu, 2014; Steinel et al., 2022). The goal of the present study was to re-examine the effect of gain and loss framing on cheating, using various cheating paradigms and high-powered studies, in order to understand whether the difficulty of replicating the framing effect is due to its specificity to certain cheating paradigms or to a moderating variable (such as the incentive size), or alternatively whether the effect is very small or simply does not exist.
One popular explanation for the increased cheating in the loss frame is loss aversion, a component of Kahneman and Tversky’s (1979) prospect theory whereby losses are assumed to have larger subjective weights than equivalent gains. Loss aversion implies that the incentives to cheat are greater if task performance reduces losses than if it facilitates gains (Grolleau et al., 2016). However, another component of prospect theory that can explain the effect of framing on cheating is the reflection effect: Cheating likely involves some level of perceived risk (of being caught) as compared to not cheating. Hence, the reflection effect, individuals’ tendency to exhibit risk seeking in the loss domain and risk aversion in the gain domain, predicts that risky behaviors such as cheating should increase in the loss domain (see also Kacelnik and Bateson, 1997). Note that the reflection effect is independent of loss aversion and is considered to be driven by other components of the prospect theory function besides the weighting of losses (the S-shaped value function and underweighting of moderate probabilities; Kahneman and Tversky, 1979).
Importantly, both loss aversion and the reflection effect imply that cheating in the loss frame should be especially pronounced given high potential rewards/penalties. Loss aversion was found to be stronger in large compared with small amounts (e.g., Abdellaoui et al., 2011; Mrkva et al., 2020; Zeif and Yechiam, 2022), as was the reflection effect (Hogarth and Einhorn, 1990). Yet the findings of increased cheating under the threat of losses were also observed under small losses and gains (e.g., Schindler and Pfattheicher, 2017; see Supplementary Table 1). This may seem at odds with the modern studies showing no loss aversion for small losses. Still, it is possible that the effect of small-magnitude losses on cheating is not due to loss aversion per se, but rather a related phenomenon known as ‘loss attention’: the increased task attention and effort following losses compared to gains, which was observed in smaller losses as well (see Yechiam et al., 2015; Yechiam and Hochman, 2013). This increased effort may strengthen the attractiveness of cheating due to the increased motivation to perform well (see also Guo et al., 2024).
Under the loss attention account (Yechiam and Hochman, 2013), a possible moderator regarding performance-related cheating is the specific cheating paradigm. Particularly, individuals’ tendency to make more effort to perform well in the loss domain may only emerge in settings where participants feel that they have some degree of agency concerning the task outcomes. For instance, in the die-in-a-cup cheating paradigm (e.g., Shalvi et al., 2011), individuals cast a die in a closed cup and have no control over the task outcomes. In this setting, there may be no differences in the extent of effort allocation in the gain/loss domains, given that outcomes are not based on effort, and no differences in the motivation to cheat as part of this effort. Indeed, most of the null findings noted above regarding the effect of framing on cheating (see Supplementary Table 1) were in studies that used the die-in-a-cup task or a similar paradigm (e.g., Charness et al., 2019; Ezquerra et al., 2018; Shalvi and De Dreu, 2014; Steinel et al., 2022), whereas the findings of increased effects of loss framing were commonly observed in cognitive tasks where effort matters (e.g., Cameron and Miller, 2009; Grolleau et al., 2016; Kern and Chugh, 2009). For example, in Grolleau et al. (2016), participants completed puzzles and had to report the number of puzzles solved correctly. The results evidenced a very strong tendency to report that more puzzles were completed in a loss than in a gain frame, while no effect of framing emerged in a condition where performance was monitored. In this setting, participants’ increased effort investment may have led them to prioritize task success over ethical considerations (despite having no effect on performance level).
Conversely, under the loss aversion account, the moderating effect of payoff size may explain some of the discrepancies in the literature. As reviewed in Supplementary Table 1, the findings of increased cheating under loss framing were typically observed in experimental studies with moderate gains and losses (e.g., Grolleau et al., 2016; Zhang et al., 2023), although not always (see Cameron and Miller, 2009). By contrast, the findings of no such effect were typically reported for somewhat smaller outcomes (e.g., Jones and Paulhus, 2017; Ortiz et al., 2023; Reis et al., 2022), although again not always (cf. Schindler and Pfattheicher, 2017). Note that for very small outcomes, individuals were even found to behave as if they assign a larger weight to gains than losses, a phenomenon known as gain seeking (Heilprin and Erev, 2024; Zeif and Yechiam, 2022), predicting greater cheating in the gain frame for very small outcomes. The only previous study that directly supports the latter prediction is Cannito et al. (2023). They found that a loss frame (compared with a gain frame) significantly increased cheating when participants were presented with a 20-Euro potential reward or penalty, whereas a gain frame increased cheating when participants were presented with 2-Euro rewards/penalties.
The outlined theoretical framework thus highlights some predictions regarding the boundary conditions for the recorded effect of gain/loss framing on cheating that we aimed to test. The first four experimental studies evaluated the effect of gain/loss framing on cheating using different cheating paradigms, aiming to determine whether discrepancies between findings stem from the paradigms themselves, particularly the distinction between task settings where performance does or does not depend on cognitive effort. Study 5 was a preregistered experiment designed to evaluate our failure to reproduce the results of Grolleau et al. (2016), while Study 6 evaluated effort-related cheating without relying on self-report. Finally, Study 7 was a preregistered examination of the moderating effect of stake size.
1. Study 1: Random number reporting paradigm
In our initial study, we created a digital variant of the die-in-a-cup task, a paradigm involving potential distortion of participants’ communication regarding task outcomes. We tested whether the increased cheating in the loss frame is robust, or whether, similarly to previous results (e.g., Shalvi and De Dreu, 2014), it is not obtained in the domain of communication distortion of random events, reported to be the most prevalent modern paradigm for cheating (Gerlach et al., 2019).
1.1. Method
1.1.1. Participants
All studies were approved by the authors’ university ethics committee, and participants provided informed consent statements. All studies were conducted with Prolific Academic workers from the United States who stated that English was their first language. Participants taking part in one study were excluded from participating in any of the others. Assuming a small to medium effect (d = 0.3) and 85% power, a sample of 400 is required. In Study 1, a total of 400 participants completed the study (203 females, 191 males, and 6 others). The participants’ average age was 45.4 (SD = 15.5), with individuals ranging from 18 to 80 years old. In Studies 1–3 and 7, participants had an approval rate of at least 90%, whereas in Studies 4–6, no approval ranking was required (as explained below). We randomly allocated participants to the gain- and loss-frame conditions (n = 204 and 196, respectively). In both conditions, they received a fee of $0.6 for completing the study, plus an additional amount between $0.2 and $1.2 based on their reports.
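The stated sample size can be approximated with the standard normal-approximation formula for comparing two independent groups. The sketch below assumes a two-sided α of .05 and an independent-samples comparison, neither of which is stated in the text; the function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, power: float, alpha: float = 0.05) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2.
    The alpha level and the two-sided test are assumptions, not taken from the text.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


per_group = n_per_group(d=0.3, power=0.85)
total_n = 2 * per_group
print(per_group, total_n)
```

Under these assumptions, the approximation yields about 200 participants per condition, in line with the reported total of 400. A calculation based on the noncentral t distribution may differ by a participant or two.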
1.1.2. Task
We created a digital version of the die-in-a-cup task (see Shalvi et al., 2011, 2012), where participants were instructed to open the default scientific calculator on their computer, press the random-number button, and report the first four random numbers. Their payoff was based on the average of the numbers; segmenting the report was expected to facilitate cheating (Rilke et al., 2016). Deviations from the random average of 0.5 in the direction of higher payoffs were considered evidence of cheating. The task was presented in two framing conditions: In the gain frame, participants were informed that the average of the four numbers they reported would be added to their base payment of $0.2. In the loss frame, they were informed that the average of the four numbers would be deducted from their base payment (of $1.2). Note that in this and all studies, the baseline in the loss-frame condition was higher than that in the gain-frame condition, so that the financial outcomes (for either cheating or not cheating) would be the same in both conditions. In addition to the base payment, an amount of $0.6 was provided for the participants’ time. Participants first answered some demographic questions and then performed the main task. Complete study instructions are available in the Supplementary Material.
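The equivalence of financial outcomes across the two frames can be illustrated with a short sketch. It uses the base payments given above and assumes, as the framing implies, that higher reports are rewarded in the gain frame and lower reports in the loss frame; the function names are illustrative.

```python
def bonus_gain(avg: float) -> float:
    """Gain frame: the reported average is added to a $0.2 base payment."""
    return 0.2 + avg


def bonus_loss(avg: float) -> float:
    """Loss frame: the reported average is deducted from a $1.2 base payment."""
    return 1.2 - avg


# A deviation of size `dev` from 0.5 in the incentivized direction
# (upward in the gain frame, downward in the loss frame) pays the same amount.
for dev in (0.0, 0.1, 0.25, 0.5):
    gain = bonus_gain(0.5 + dev)
    loss = bonus_loss(0.5 - dev)
    print(f"deviation {dev:.2f}: gain ${gain:.2f}, loss ${loss:.2f}")
```

An honest report averaging 0.5 thus yields the same bonus in both frames, as does the most extreme possible misreport.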
1.2. Results
In the gain-frame condition, participants’ reported outcomes averaged 0.527 (SE = 0.011), whereas in the loss-frame condition, the average was 0.477 (SE = 0.014). The deviation from 0.5 in the direction of higher incentives (0.027 and 0.023, respectively) was significantly different from 0 across both conditions, t(399) = 2.87, p = .004, providing evidence that the participants tended to cheat, albeit weakly (Cohen’s d = 0.14). However, the difference between the gain and loss frames was not significant, t(398) = 0.20, p = .84, d = 0.02.
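The incentivized deviations and the one-sample effect size follow directly from the reported values. The sketch below assumes that Cohen’s d for the one-sample test was obtained as t divided by the square root of N, a common convention that is not stated in the text.

```python
import math

# Reported mean outcomes in Study 1
mean_gain, mean_loss = 0.527, 0.477

# Deviation from 0.5 in the incentivized direction:
# upward in the gain frame, downward in the loss frame.
dev_gain = round(mean_gain - 0.5, 3)
dev_loss = round(0.5 - mean_loss, 3)

# One-sample Cohen's d from the reported t statistic (assumed: d = t / sqrt(N))
t_stat, n_total = 2.87, 400
cohens_d = t_stat / math.sqrt(n_total)

print(dev_gain, dev_loss, round(cohens_d, 2))
```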
In an attempt to evaluate whether there is a difference in the quality of cheating behavior, we examined only the scores of participants who deviated from the random average of 0.5 in the incentivized direction (>0.5 in the gain frame and <0.5 in the loss frame), as evidence of potential cheating. In the gain frame, 55.4% of the participants deviated from 0.5 in the direction of greater incentives, compared with 50.0% in the loss frame, though the difference did not reach significance, χ²(1) = 1.17, p = .28. In the loss frame, for those who deviated from 0.5 in the direction of larger incentives, deviations were somewhat more extreme: 0.177 (SE = 0.010) compared with 0.138 (SE = 0.013) in the gain frame, and this difference was significant, t(209) = 2.37, p = .01, d = 0.33. This suggests that while slightly more participants cheated in the gain frame, the extent of cheating was significantly higher in the loss frame. No significant difference was observed in deviations from 0.5 toward the non-incentivized direction.
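The frequency comparison can be approximately reproduced from the reported percentages. The counts used below (113 of 204 and 98 of 196) are inferred from the percentages and group sizes rather than taken from the data, and the statistic is the uncorrected Pearson chi-square for a 2 × 2 table.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Inferred counts (assumption): participants deviating in the incentivized direction
gain_yes, gain_n = 113, 204  # 55.4% of 204
loss_yes, loss_n = 98, 196   # 50.0% of 196

chi2 = chi_square_2x2(gain_yes, gain_n - gain_yes, loss_yes, loss_n - loss_yes)
print(round(chi2, 2))
```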
2. Study 2: Random number reporting: Validation
The goal of the study was to replicate the unexpected findings of Study 1: while the overall extent of cheating in the gain and loss frames was similar, the quality of cheating was different, with somewhat less frequent and significantly more intense cheating in the loss frame. A similar result was also recently recorded by Steinel et al. (2022). An additional goal of Study 2 was to understand the relationship between this finding and loss aversion. For this purpose, in addition to the number-entering task, we administered a hypothetical choice task that involved a lottery producing an equal chance to either gain or lose an amount equal to the outcome obtained in the number-entering task. We then examined whether avoiding the lottery, which is indicative of loss aversion (Kahneman and Tversky, 1979), predicts the extent of cheating in the loss frame.
2.1. Method
2.1.1. Participants
A total of 400 Prolific Academic workers (201 females, 193 males, and 6 others) completed the study. Participants’ average age was 45.1 (SD = 15.3). They were randomly allocated to the gain- and loss-frame conditions (n = 207 and 193, respectively). Participants in both conditions received $0.6 for their participation and could earn an additional amount of up to $2 based on their reports and choices.
2.1.2. Task
The number filling task, serving as the cheating paradigm, was the same as in Study 1, with the exception that the base payment was 60 cents in the gain frame and $1.60 in the loss frame, and no additional amount was paid for the participants’ time. This task was followed by an incentivized decision task where participants chose whether to accept a lottery where they could win or lose the exact amount reported by them in the number filling task (and constituting their bonus payoff), with equal odds of winning or losing. Next, participants answered some demographic questions. Complete study instructions are available in the Supplementary Material.
2.2. Results
In the gain-frame condition, participants reported an outcome of 0.556 (SE = 0.011), on average, whereas in the loss condition, they reported 0.474 (SE = 0.012). Again, the deviations from 0.5 in the direction of higher incentives (0.056 and 0.026, respectively) were significant, t(399) = 4.94, p < .001, Cohen’s d = 0.25, providing evidence that participants cheated. However, the effect of framing was, again, not significant, t(398) = 1.80, p = .07, d = 0.18, with slightly more cheating in the gain frame than in the loss frame.
As previously, we also examined only the scores of participants whose mean deviations from random responding were in the incentivized direction (>0.5 in the gain frame and <0.5 in the loss frame). Again, responses consistent with the incentive structure were somewhat more frequent in the gain domain (Gain: 63.8%; Loss: 51.8%), this time significantly so, χ²(1) = 5.68, p = .015. However, the results showed similar deviations in the incentivized direction across the gain and loss frames (0.165 ± 0.007 and 0.166 ± 0.008, respectively), t(230) = 0.65, p = .52, d = 0.09. We thus did not fully replicate the more extreme cheating in the loss frame among those responding according to the incentive structure.
With respect to the lottery question, most participants (80.0%) chose to take the bet, which is consistent with the finding of no loss aversion, and even gain seeking, for small amounts of money (Heilprin and Erev, 2024; Zeif and Yechiam, 2022). When examining the relationship between avoiding the lottery and cheating, the overall correlation was small but significant, r = 0.12, p = .02, yet a similar correlation emerged for both framing conditions (Gain: r = 0.09; Loss: r = 0.08). Thus, participants who were more loss averse did have a slight tendency to cheat more in the loss frame, but also in the gain frame. Moreover, on average, there was no loss aversion at the population level and no increased cheating in the loss frame.
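The significance of the overall correlation can be checked from r and the sample size. The sketch assumes that all 400 participants entered the correlation and uses a normal approximation to the t distribution for the p value, which is adequate at this sample size; neither detail is stated in the text.

```python
import math
from statistics import NormalDist

r, n = 0.12, 400  # reported correlation; n assumed to be the full sample

# t statistic for testing a Pearson correlation against zero (df = n - 2)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-sided p value via the normal approximation (assumption; df = 398 is large)
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(round(t_stat, 2), round(p_value, 3))
```

The same computation applied to the within-condition correlations, with roughly half the sample each, would give considerably larger p values for coefficients of this size.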
3. Study 3: Binary number reporting paradigm
The absence of a more intense effect of losses than gains in the previous studies might be due to the fact that in the loss frame, people gained more by reporting smaller numbers, which may have been counterintuitive to some participants (Ayal et al., 2011). To address this, we administered a version of the number-filling task where participants had to calculate the average of the obtained numbers and only report whether this average is above or below 0.5, and an average above 0.5 was positively incentivized in both conditions.
3.1. Method
3.1.1. Participants
A total of 402 Prolific Academic workers (210 females, 190 males, and 2 others) completed the study. Participants’ average age was 45.3 (SD = 15.4). They were randomly allocated to the gain- and loss-frame conditions (n = 201 in both). Fees were the same as in Study 1.
3.1.2. Task
The number-filling task was used as in the previous studies, but this time, participants did not have to indicate the exact numbers obtained. Instead, they only needed to report whether the average of the numbers produced by the scientific calculator was above or below 0.5. In the gain-frame condition, they received a $1 bonus to their baseline fee (of $0.8) upon indicating that the average was above 0.5. In the loss-frame condition, they paid a $1 penalty from their baseline fee (of $1.8) if the average was below 0.5. The complete study instructions and demographic questions are available in the Supplementary Material.
3.2. Results
In the gain-frame condition, 58.7% of the participants reported that their average score was above 0.5, as compared to 63.2% in the loss frame. Overall, these proportions were above 50% (Binomial test Z = 4.33, p < .001, d = 0.22), denoting a significant degree of cheating. Yet, as in Studies 1 and 2, there was no significant difference between the two framing conditions, χ²(1) = 0.85, p = .36, d = 0.10 (Cohen’s d approximated using the formula of Sánchez-Meca et al., 2003). We thus do not find a difference between framing conditions in this paradigm either.
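The overall test against chance can be sketched as a normal-approximation binomial test. The count of 245 “above 0.5” reports out of 402 is inferred from the reported percentages (118 and 127 of 201), and the continuity correction is an assumption about how Z was computed.

```python
import math

n_total = 402
n_above = 118 + 127  # inferred from 58.7% and 63.2% of 201 (assumption)

expected = n_total * 0.5
sd = math.sqrt(n_total * 0.5 * 0.5)

# Normal approximation to the binomial test against p = 0.5
z_plain = (n_above - expected) / sd
z_corrected = (n_above - expected - 0.5) / sd  # with continuity correction

print(round(z_plain, 2), round(z_corrected, 2))
```

Both values are close to the reported Z of 4.33, with the continuity-corrected version being the nearer of the two.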
4. Study 4: Performance-level reporting paradigm
So far, the results showed no consistent differences when task outcomes are framed either as gains or as losses. We therefore moved to a different paradigm where participants do not report the results of a random process, but rather their own estimated success in a task where greater efforts can improve performance. This followed the study by Grolleau et al. (2016) in which participants completed puzzles and had to report the correct number of puzzles, and where loss framing was conducive to greater cheating. Somewhat differently from Grolleau et al. (2016), we used a battery of judgment tasks instead of mathematical puzzles. In order to avoid the effect of framing on the performance level, we informed participants of their framing condition only after they performed the cognitive task, before they self-rated their performance. We also externally evaluated performance on this task by adding a benchmark condition where we monitored participants’ performance.
4.1. Method
4.1.1. Participants
A total of 400 Prolific Academic workers (202 females, 195 males, and 3 others) completed the two main study conditions. No approval rate of previous tasks was required, allowing a wide range of task success levels. Participants’ average age was 43.6 (SD = 15.46). They were randomly allocated to the gain and loss frames (n = 201 and 199, respectively). In addition, 201 participants (95 females, 105 males, and 3 others) were recruited to the benchmark condition where their performance was externally monitored. Participants in all three conditions received a fee of $0.60 for completing the study, plus an additional amount based on their scores between $0 and $3.60.
4.1.2. Task
The task involved completing six Cognitive Reflection Test (CRT) items, three drawn from the original CRT (Frederick, 2005), and three from the Verbal CRT (VCRT; Sirota et al., 2020). Participants were told that they would need to answer questions on a sheet of paper, number each question, and write the solution next to the number. They were then provided with the solutions to the six questions and asked to report the number of correct answers on their sheet of paper. At this point, in the gain frame, participants were informed that for each correct answer, they would win $0.5 (up to $3) that would be added to their base payment (of $0.60), whereas in the loss frame, they were informed that for each incorrect answer, they would lose $0.5 (up to $3) that would be deducted from their base payoff (of $3.60). In the benchmark condition, participants were not provided with the solutions but were instead told to type in the answers that they had written on paper, allowing us to calculate the gap between self-evaluated scores and externally monitored scores.
4.2. Results
Figure 1 presents the difference in the number of correct answers between conditions. As can be seen, the number of correct answers reported in the two framing conditions was quite similar (Gain: 4.10 ± 0.11; Loss: 4.05 ± 0.12) and not significantly different, t(398) = 0.39, p = .72, d = 0.04. Both estimates were higher than the number of correct answers in the benchmark condition where participants’ performance was externally monitored (3.54 ± 0.15), t(599) = 3.17, p = .002, and the effect size of the deviation was small to moderate, d = 0.30. An ANOVA including all three conditions showed a significant difference, F(2, 598) = 5.91, p = .003, but post hoc Scheffé tests indicated a difference only between the benchmark condition and the gain and loss framing conditions (p = .008 and .021, respectively) and not between the two framing conditions (p = .95). Thus, self-reported estimates were higher than actual achievements, but no difference in self-reported success was found between framing conditions, unlike in Grolleau et al. (2016).

Figure 1 Study 4 results: Number of estimated correct answers in the 2 framing conditions. The solid line represents the number of questions solved in the benchmark condition with external evaluation. The error terms and dotted lines denote standard errors.
5. Study 5: Conceptual replication of Grolleau et al.
One major difference between the study of Grolleau et al. (2016) and Study 4 is that Grolleau et al.’s (2016) participants worked under time pressure. Specifically, they had five minutes to perform 20 numeric puzzles. Indeed, participants whose performance was externally monitored in Grolleau et al. (2016) successfully completed only 18% of the puzzles on average. A second difference is that in Grolleau et al. (2016), participants already knew their framing condition when they started to perform the task, whereas in Study 4, they knew it only when they were about to report the number of successes. This likely reduced the additional motivation to make an effort in the loss condition, since participants only learned about their assignment to the loss condition after the cognitive task. Possibly, the findings in Grolleau et al. (2016) may have been driven by a disparity in effort levels between the gain and loss domains (i.e., loss attention), which, while not affecting performance, did influence the subsequent tendency to cheat in the loss domain.
We thus conducted a preregistered study that was more similar to the experiment of Grolleau et al. (2016) in these two respects. In this study, 15 CRT-like questions were answered in a very short time window, and the framing condition was known in advance. We tested whether, in this setting, we would replicate the increased cheating in the loss frame. We controlled for the effect of framing on cognitive performance by manipulating both the nature of the evaluation (self-reported vs. external) and the gain/loss framing. The study thus had a 2 (self vs. external evaluation) × 2 (gain vs. loss framing) between-subject design, exactly as in Grolleau et al. (2016).
5.1. Method
5.1.1. Participants
The preregistered report is available at AsPredicted (Submission #214234). A total of 800 Prolific Academic workers (408 females, 386 males, and 6 others) completed the four main study conditions. The participants’ average age was 46.2 (SD = 16.4). They were randomly allocated to the self- and external evaluation conditions and to the gain and loss framing conditions (Self-evaluation, Gain: n = 201; Loss: n = 190; External evaluation, Gain: n = 207; Loss: n = 202). Participants in all four conditions received a fee of $0.80 for completing the study, plus an additional amount based on their scores ranging between $0 and $8.3.
5.1.2. Task
The task involved 15 logic questions taken from the CRT (Frederick, 2005), the CRT-2 (Thomson and Oppenheimer, 2016), the VCRT (Sirota et al., 2020), and Gruber (2010). In the self-evaluation conditions, participants were told that they would need to answer test questions on a sheet of paper, number each question, and write the solution next to the number. They were next informed of the gain/loss for each correct/incorrect answer in the gain/loss framing conditions, respectively. Those in the gain framing could get an additional $0.5 for each correct item, up to $7.5 above their base payoff (of $0.8). Those in the loss framing could lose $0.5 per incorrect item, up to $7.5, and this amount was deducted from their base payoff (of $8.3).
Participants were further notified that they had two minutes to perform the task, and that at the end of two minutes, they would need to indicate how many items they answered correctly, and then the experiment would end (it was important for us to clarify the end point in advance so that participants would not expect a surprise quiz, thus implying no direct negative repercussions for cheating).
In the external evaluation condition, participants were told that they would need to answer test questions and were informed of the gain/loss for each correct/incorrect answer in the gain/loss framing condition, respectively. They were next informed that they had two minutes to perform the task, after which the experiment would end. As indicated above, the task in all conditions consisted of 15 items that were randomly presented in two possible orders (each participant was randomly allocated to one of the order conditions). After performing the task, those in the self-evaluation conditions were provided with solutions to all 15 questions and then asked to report the number of questions they performed successfully.
5.2. Results
Figure 2 shows the average number of questions reported as answered correctly in the self-evaluation condition and those answered correctly in the external evaluation condition, by the gain/loss frame. As can be seen, there seemed to be massive cheating in the self-evaluation condition, with 6.02 (SE = 0.22) answers reported as correct compared to only 1.90 (SE = 0.10) answers observed to be correct in the external evaluation condition. Yet again, there was no increased cheating under the loss frame. Indeed, in the self-evaluation target condition, the reported number of correct answers was almost identical in the gain and loss frames (Gain: 6.05 ± 0.32; Loss: 5.98 ± 0.33). An ANOVA showed a significant effect of external versus self-evaluation, F(1, 799) = 279.70, p < .001, with a Cohen’s d of 1.18 denoting a very large effect size indicative of potential cheating. There was no effect of framing condition, F(1, 799) = 0.75, p = .39, or interaction between the framing condition and the type of evaluation, F(1, 799) = 0.36, p = .50. We thus do not replicate the significant interaction observed by Grolleau et al. (2016), and our findings are consistent with the absence of a significant effect of framing in the previous studies.
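The reported effect size is consistent with converting the F statistic for the evaluation factor into a standardized mean difference. The conversion below treats the main effect as a two-group comparison (self-evaluation, n = 391; external evaluation, n = 409) and is an assumption about how d was derived, not a description of the authors’ procedure.

```python
import math

f_stat = 279.70
n_self = 201 + 190      # self-evaluation conditions (gain + loss)
n_external = 207 + 202  # external evaluation conditions (gain + loss)

# For a two-group contrast, t = sqrt(F) and d = t * sqrt(1/n1 + 1/n2)
t_stat = math.sqrt(f_stat)
cohens_d = t_stat * math.sqrt(1 / n_self + 1 / n_external)

print(round(cohens_d, 2))
```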

Figure 2 Study 5 results: Number of estimated correct answers in the self-evaluation condition and actual correct answers in the external evaluation condition, under gain and loss framing. The error terms denote standard errors.
Going beyond the preregistered analysis, we also examined additional performance statistics in the external evaluation condition. The number of questions that participants tried to answer was similar (Gain: 6.72 ± 0.22; Loss: 7.03 ± 0.22) and not significantly different, t(407) = 0.99, p = .33. Surprisingly, the rate of correct answers from attempted questions was somewhat higher in the gain frame (0.31 ± 0.02) compared to the loss frame (0.24 ± 0.02), t(405) = 2.80, p = .003. Potentially, this could be due to greater choking under pressure in the loss condition.
6. Study 6: Illicit usage of resources paradigm
One could argue that self-evaluations of one’s performance are only indirect evidence of cheating. Indeed, they could be driven, for instance, by the participants basing their estimation on subjective beliefs despite being provided with the solutions. We therefore ran an additional study where, instead of evaluating cheating through self-reported scores, we examined the reliance on an illicit performance strategy: the usage of online resources. We administered a difficult knowledge question that Prolific workers were not likely to know the answer to (following Goodman et al., 2013; Domnich et al., 2015): the average life expectancy in Africa in 2024. Participants were instructed not to use online resources that provide easily accessible solutions to this question, and cheating was identified by responses matching the most accessible solutions in Google and Bing. We tested whether loss framing, compared to gain framing, would increase illicit usage of these online resources.
6.1. Method
6.1.1. Participants
A total of 400 Prolific Academic workers took part in the study (203 females, 196 males, and 1 other). Their average age was 45.6 (SD = 15.2). Participants were randomly allocated to the gain- and loss-frame conditions (n = 206 and 194, respectively). In both conditions, they received a fee of $0.6 for completing the study plus an additional amount between $0 and $0.4 based on their success in the experimental task, as indicated below.
6.1.2. Task
Initially, we provided instructions to avoid relying on online resources during the experiment. In order to validate the sensitivity of responses to cheating-related instructions, the request was presented in two versions (to which individuals were randomly assigned). In one version, participants were informed that the experiment measures their ability to independently solve cognitive problems, whereas in the other version, they were explicitly informed that they should not use online resources and not move away from the experiment’s window. In the gain-frame condition, participants were informed that they would win 10 cents for each correct answer beyond their baseline fee of $0.6, whereas in the loss-frame condition, they were informed that they would lose 10 cents for each wrong (or incomplete) answer, deducted from their baseline fee of $1.
Participants then proceeded to complete four items, including the main study question (‘What is the average life expectancy in Africa in 2024?’) and, for control purposes, three items from the CRT. The phrasing of the latter items was changed so that online solutions would be less accessible: The solutions did not appear on the main Google or Bing screen when using the question as a search query (at the time of the study). Finally, participants answered comprehension questions to check whether they interpreted the instructions as prohibiting the usage of online resources. All items appear in the Supplementary Material.
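The payment rules described above imply that the two frames are objectively equivalent. The following minimal sketch illustrates this, assuming that all four items were incentivized at 10 cents each (consistent with the stated bonus range of $0 to $0.4); the function names are hypothetical and amounts are held in cents to avoid floating-point rounding.

```python
# Illustrative check that both frames yield the same final payment (in cents).
# Assumption: all four items are incentivized at 10 cents each.
N_ITEMS = 4
PER_ITEM_CENTS = 10


def gain_frame_payment(n_correct: int) -> int:
    """Baseline of 60 cents plus 10 cents per correct answer."""
    return 60 + PER_ITEM_CENTS * n_correct


def loss_frame_payment(n_correct: int) -> int:
    """Baseline of 100 cents minus 10 cents per wrong or incomplete answer."""
    n_wrong = N_ITEMS - n_correct
    return 100 - PER_ITEM_CENTS * n_wrong


for k in range(N_ITEMS + 1):
    # Same number of correct answers, same final payment in both frames.
    assert gain_frame_payment(k) == loss_frame_payment(k)
    print(k, gain_frame_payment(k))
```

Under these assumptions, only the description of the outcome differs between conditions, not the amount paid for a given performance.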
Cheating on the main knowledge question was identified by a response conforming to the default answer appearing in Google or Bing search when copying and pasting the exact question (retrieved on January 14, 2024). This answer—64.38 years—was presented on the search engine’s main screen upon issuing the query without requiring one to look at any specific website. We also coded the rounded answers of 64.4 and 64 years as indicative of cheating. Additionally, we coded the first answer in the ‘People also ask’ section in Google (‘61 years for men and 65 years for women’) as a sign of cheating.
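The coding rule for the default search-engine answer can be sketched as follows. This is an illustration only: the function name, the numeric tolerance, and the handling of free-text responses are assumptions, and the sketch does not cover the ‘People also ask’ answer, whose exact operationalization is not described here.

```python
# Hypothetical coding rule for the numeric default answer only.
# Flagged values are the exact and rounded answers named in the text.
DEFAULT_ANSWERS = (64.38, 64.4, 64.0)


def matches_default_answer(response: str, tol: float = 1e-9) -> bool:
    """Return True if the response parses to one of the flagged values."""
    try:
        value = float(response.strip())
    except ValueError:
        # Non-numeric free text (e.g., "64 years") is not handled by this sketch.
        return False
    return any(abs(value - target) <= tol for target in DEFAULT_ANSWERS)


print(matches_default_answer("64.38"))
print(matches_default_answer("63"))
```

A production coding script would presumably also need to normalize units and free-text answers before matching.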
6.2. Results
The results showed that in the gain-frame condition, 16.0% of the responses were classified as being due to cheating, compared to 16.5% in the loss-frame condition. The difference was not significant, χ²(1) = 0.02, p = .90. On the CRT, the gain-frame condition resulted in a 60.2% (SE = 2.5%) success rate, compared to 62.4% (SE = 2.9%) in the loss condition, again a nonsignificant difference, t(398) = 0.57, p = .57, d = 0.06 (see Footnote 1).
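The comparison of cheating rates can be illustrated with a Pearson chi-square test on a 2 × 2 table. The sketch below is not the authors’ analysis script: the cell counts (33 of 206 and 32 of 194) are reconstructed from the reported percentages and group sizes, and the absence of a continuity correction is an assumption.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p value for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 df: P(Z^2 > stat).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Reconstructed counts (assumption): rows are gain and loss frames,
# columns are responses classified as cheating and not cheating.
stat, p_value = chi_square_2x2(33, 173, 32, 162)
print(round(stat, 2), round(p_value, 2))
```

With these reconstructed counts, the statistic and p value come out at approximately 0.02 and .90, in line with the reported values.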
Was this affected by the specific instructions regarding cheating? An examination of the effect of the instructions showed that the more explicit version of the instructions reduced the rate of illicit responses from 23.2% to only 9.7%, χ²(1) = 13.35, p < .001. However, while this validates the sensitivity of the cheating classification, a logistic regression indicated no interaction between the explicitness of the instructions and gain/loss framing, B = 0.19, Wald = 0.10, p = .75. In addition, the rate of those who understood that using online resources was not permitted during this experiment was higher in the explicit version of the instructions than in the alternative version (85.6% compared to 59.8%, respectively), χ²(1) = 26.35, p < .001. Yet, there was no significant difference between framing conditions in response to this question (Gain: 78.3%; Loss: 72.3%), χ²(1) = 1.48, p = .22.
7. Study 7: Stake size as a moderator
Although in all six previous studies, there was no significant main effect of gain/loss framing, the strongest effect of loss framing was in Study 3, which involved binary reporting of averages of random numbers as above or below 0.5. We therefore used the exact setting of Study 3 to evaluate the moderating effect of payoff size in a preregistered investigation. Supposedly, under the theoretical framework discussed above, the null effect of framing may be due to the small stakes involved: In Study 3, cheating responses could raise one’s salary at most by 1 U.S. dollar. In the present study, we manipulated both the payoff size and gain/loss framing. In the high payoff size condition, participants could win or avoid losing $5, whereas in the low payoff size condition, the stake was only $0.2. Again, under the assumed effect of loss aversion and the reflection effect, the larger payoff should lead to a greater tendency to cheat in the loss frame (so as not to lose the large fixed amount). Notably, recent studies showed no loss aversion for $5 amounts (e.g., Zeif and Yechiam, 2022), yet the literature on cheating did find asymmetric effects of gains and losses in outcomes of this magnitude (see Supplementary Table 1).

Figure 3. Study 7 results: Percentage of answers aligned with the incentive structure, by study condition. The solid line represents the expected average response under no cheating. The error bars denote standard errors.
7.1. Method
7.1.1. Participants
The preregistered report is available at AsPredicted (Submission #209195). In line with the protocol, a total of 800 Prolific Academic workers (404 females, 385 males, and 11 others) completed the study. Participants’ average age was 45.1 (SD = 15.76). They were randomly allocated to the low- and high-payoff conditions and to the gain- and loss-frame conditions (n = 394 and 406, and 398 and 402, respectively). Participants in all conditions received a participation fee of $0.6. In addition, in the low-payoff condition, participants could receive an additional amount of $0.2, whereas in the high-payoff condition, they could get an additional $5, based on their report.
7.1.2. Task
The binary number-reporting task was the same as in Study 3. In the gain-frame condition, participants’ baseline fee was $0.6, and they received either $0.2 if they indicated that the average outcome (of the scientific calculator’s 4 random outputs) was above 0.5 in the low-payoff condition, or $5 if they indicated so in the high-payoff condition. In the loss-frame condition, participants incurred a $0.2 or $5 penalty, deducted from their baseline fee (of $0.8 or $5.6), if they indicated that the average was below 0.5 in the low- and high-payoff conditions, respectively. The complete study instructions and demographic questions are available in the Supplementary Material.
7.2. Results
Figure 3 shows the difference between the conditions in the rate of those whose responses aligned with the incentive structure. As can be seen, across conditions, the rate of participants who reported mean numbers above 0.5 was above the chance level. Specifically, in the low-payoff condition, 60.4% of the participants reported averages above 0.5 in the gain frame and 64.5% in the loss frame. In the high-payoff condition, the respective rates were 62.2% in the gain frame and 66.3% in the loss frame. Thus, in all conditions, there was some degree of cheating, which was significant (Binomial test Z = 7.53, p < .001, d = 0.25). However, a logistic regression indicated no significant effect of framing condition (B = 0.17, SE = 0.21, p = .41), payoff size (B = 0.08, SE = 0.21, p = .72), or their interaction (B = 0.01, SE = 0.29, p = .98).
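The test of overall cheating against the 50% chance level can be sketched with a normal approximation to the binomial distribution. This is an illustration under assumptions: the exact number of above-0.5 reports is not given in the text, so the count used below is a hypothetical placeholder, and the exact form of the binomial test used in the analysis is not specified.

```python
import math


def binomial_z(successes: int, n: int, p0: float = 0.5) -> tuple[float, float]:
    """Normal-approximation z statistic and two-sided p value for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical count of incentive-aligned reports out of 800 participants.
z, p_value = binomial_z(successes=506, n=800)
print(round(z, 2), p_value < 0.001)
```

Because the denominator uses the null proportion of 0.5, the statistic depends only on the pooled rate and the total sample size.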
8. Mini meta-analysis
Potentially, the nonsignificant effects recorded in specific studies may add up when aggregated in a meta-analysis. We therefore conducted a mini-meta-analysis of the current studies using a random-effects model with inverse variance weighting (Hedges and Olkin, 2014) and the DerSimonian–Laird estimator, which is robust to the distribution of the effect sizes (Kontopantelis and Reeves, 2012). Cohen’s d was used as an effect size indicator. In Study 5, we converted the interaction term’s effect size (Cohen’s f) into Cohen’s d (as prescribed in Cohen, 1988). Study 7’s low- and high-payoff conditions were analyzed separately to allow for variations in stake size. The results are shown in Figure 4. Across studies, the overall effect size was 0.0041 (confidence interval: −0.061 to 0.069) and not significantly different from 0. Moreover, the variance between studies was completely accounted for by random sampling errors (Q(7) = 5.66, p = .58, I² = 0%), indicating that despite the variance in the basic tendency to cheat in the different paradigms used in the studies, there was no effect of the study method on the difference between framing conditions.
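The pooling procedure named above can be sketched as follows. This is a minimal illustration, not the analysis code used for the reported results: the effect sizes and sampling variances in the example are hypothetical placeholders, the 95% level (multiplier 1.96) is an assumption because the confidence level is not stated, and the f-to-d conversion shown is the standard two-group relation d = 2f, which may differ from the exact conversion applied in Study 5.

```python
import math


def f_to_d(f: float) -> float:
    """Convert Cohen's f to Cohen's d for two equal-sized groups (d = 2f)."""
    return 2.0 * f


def dersimonian_laird(effects: list[float], variances: list[float]) -> dict:
    """Random-effects pooling with inverse-variance weights and the DL tau^2 estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q, with k - 1 degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),  # assumed 95% level
        "Q": q,
        "tau2": tau2,
        "I2": i2,
    }


# Hypothetical effect sizes (Cohen's d) and sampling variances, for illustration only.
example = dersimonian_laird([0.05, -0.03, 0.10, -0.06], [0.010, 0.012, 0.020, 0.011])
print(example)
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is truncated at zero and the random-effects estimate coincides with the fixed-effect estimate, which corresponds to the I² = 0% pattern described above.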

Figure 4. A forest plot of the effect sizes in the present studies. A positive effect size denotes more cheating in the loss frame.
9. General discussion
We began this study aiming to examine the boundary conditions and moderating variables for the effect of negative framing on cheating, namely the increased tendency to cheat in the loss frame (e.g., Grolleau et al., 2016). However, in none of the paradigms and conditions we examined did we find an effect of framing on cheating. We observed no such effect in our web version of the die-in-a-cup task, where participants report the result of a random event (random numbers), in a binary version of the task, in self-reports of individuals’ performance level, and in usage of illicit online resources during performance. This provides evidence that the previously observed effect of framing on cheating is far from reliable and not easily replicable.
One might argue that our initial examination of the number-reporting task involved limited cheating; indeed, the effect size for cheating in this task was small. However, we also examined other paradigms where the effect size was larger. Indeed, it was largest in our conceptual replication of Grolleau et al. (2016), where there was more than a one-standard-deviation difference between self-reported and actual (externally reported) success (i.e., Cohen’s d > 1). Nevertheless, in all paradigms, the effect of framing on the level of cheating was not significant. Moreover, a mini-meta-analysis suggests that the inter-study variance in the effect size of gain/loss framing was simply the result of sampling error around a mean effect that is very close to 0 (and not significantly different from 0).
A preregistered examination of the moderating effect of stake size did not yield significant findings. The only variable that was found to be connected to the degree of cheating was individuals’ loss aversion in Study 2. Yet, participants were not loss averse on average, and moreover, this variable similarly predicted cheating in the gain and loss frame, which implies that it cannot account for the increased cheating in the loss frame, even at the individual level.
One limitation of the present studies is that, as in most framing studies, we did not verify whether the gain and loss frames were subjectively equivalent from the individual’s perspective (Mandel, 2023). While the gain and loss amounts were objectively similar, it could be that they were over- or under-estimated in the different domains. A second limitation is that we used windfall endowed amounts, meaning that the participants in the loss condition received initial outcomes and did not need to spend effort to gain them. Indeed, as shown in Supplementary Table 1, most of the studies that reported a null effect of gain/loss framing on cheating used a windfall endowment. This may have reduced loss/risk aversion in these studies (Jelschen and Schmidt, 2023). Replicating these null results with earned endowments is an important research direction.
Yet another limitation is that we did not fully replicate the methods of Grolleau et al.’s (2016) study, a study that arguably provides the strongest evidence for increased cheating in the loss domain. Some relevant differences include Grolleau et al.’s (2016) provision of cash money in advance, which may have increased the weight of losses (see, e.g., Raghubir and Srivastava, 2008), and the fact that we used somewhat smaller bonus amounts than Grolleau et al. (up to $8.3 wage in Study 4, which lasted 5 minutes, compared to 30 Euros in Grolleau et al., 2016, for 30 minutes of work), although the overall wage per hour was comparable. Our experiment also took place in a less controlled (online) setting. Finally, Grolleau et al.’s study was conducted using block randomization with a complete room assigned to a particular condition. This may have led to group dynamics facilitating the effect of framing on individuals’ behavior (see Footnote 2).
Another specific characteristic of our studies is that they focused on Prolific Academic workers. One might argue that these participants are less sensitive to incentive-related instructions because they are internally motivated to honestly perform the task (e.g., to improve their Prolific completion rate scores) as compared to students, particularly those with little or no experience in behavioral experiments (e.g., in Grolleau et al., 2016). However, there does not seem to be any evidence to support this, and moreover, in Study 6, we in fact showed that Prolific workers were highly sensitive to relevant instructions concerning cheating, with their cheating levels being elevated when these instructions were relatively ambiguous.
While not discounting these potential limitations, it is also possible that the prior framing-effect results in Grolleau et al. (2016) and similar studies were facilitated by the sheer prominence of the loss aversion model. In various sub-areas of decision-making research, loss aversion was often used as an off-the-shelf explanation allowing researchers to explain framing effects without needing to develop new, domain-specific theories (Gal and Rucker, 2018). Likewise, many of the above-cited studies of ethical decision-making were based on the assumption that due to loss aversion, loss framing decreases ethical behavior, and their goal was to find moderators and boundary conditions for its effect (e.g., Kern and Chugh, 2009; Zhang et al., 2023). This default belief may have inadvertently created a publication bias, where studies confirming the link were more readily published. The current results provide a call for a systematic examination of this literature, for example, through a meta-analysis.
One may argue that a gain/loss framing effect was reliably observed in vignette-based studies examining ethical behavior in scenarios such as tax return filing (Kirchler and Maciejovsky, 2001; Rees-Jones, 2018), the usage of bribes (Modesto and Pilati, 2025), or insider information (Kern and Chugh, 2009). However, these studies typically focused on very large hypothetical amounts of money or goods, and thus, it is not clear if the recorded asymmetric effect of losses denotes loss aversion or ‘ruin aversion’ (Yechiam, 2019).
To sum up, we began our series of studies by seeking paradigms where we might observe increased cheating in the loss frame, in order to account for the disparate results in the literature, yet we obtained null results across all paradigms examined. Our results left us with great skepticism about the replicability of these framing-related findings in the context of cheating behavior. These null results are also consistent with recent findings of judgment studies showing the absence of framing effects for precisely described gains and losses (Kühberger and Tanner, 2010; Mandel, 2014). We thus cannot rule out the possibility that the inconsistencies in the literature may, to a large extent, be driven by sampling noise around an extremely small (or zero) effect. Importantly, some of the empirical findings in this literature were used to support the construct of loss aversion and argue that it goes beyond simple monetary settings (e.g., Grolleau et al., 2016; Markiewicz and Czupryna, 2020). Our findings suggest that this conclusion needs to be reassessed.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/jdm.2025.10016.
Data availability statement
The data and code are available at https://osf.io/2zk7p/?view_only=cdf65225599d459c9849f725d509906d.
Funding statement
The studies received support from the Technion—Israel Institute of Technology.
Competing interest
The authors declare none.



