1. Introduction
Optimal insurance is a classical topic in actuarial science and insurance economics and has been an active area of research ever since the seminal work of Arrow (1963). Researchers have explored this topic from various perspectives, including by applying different premium principles, different optimization criteria, and different types of insurance contracts, and by formulating the problem in a game-theoretic context, among other extensions. The existing literature on optimal insurance, despite being extensive, mostly assumes that the insurance premium is independent of policyholders’ claim history. However, in insurance ratemaking, a reported claim leads to an increase in future premiums, which is seen as a “golden rule” in practice and plays a pivotal role in the credibility models (see Part V in Klugman et al., 2019). In the Handbook of Insurance, Pinquet (2013) writes that this golden rule is “almost systematic in non-life insurance” and can be justified by actuarial neutrality and the incentives it creates for risk prevention; see, for instance, Boyer and Dionne (1989) on automobile insurance and Ruser (1985) on workers’ compensation insurance. Also, if policyholders have good claim history, then their premiums are reduced to reflect this good history. We emphasize that both individual policyholders and insurers purchasing reinsurance are subject to this golden rule; for the former, the bonus-malus system frequently adopted in the pricing of automobile insurance is a good example, while Doherty and Smetters (2005) provide evidence for the latter. Therefore, incorporating this golden rule into the study of optimal insurance will lead to a more realistic model.
In this paper, we study an optimal insurance problem under a premium principle that takes into account the policyholder’s claim history and is consistent with the golden rule.
In the classical one-period setup (see, for example, Arrow 1963), a buyer of insurance faces a random loss Z and can mitigate this loss by purchasing an insurance contract with indemnity I, which is priced according to a premium principle
$\pi\;:\; I \mapsto \pi(I) \in \mathbb{R}_+$
. The buyer seeks an optimal insurance
$I^*$
to optimize her objective
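$\mathcal{J}$
of terminal wealth, that is, she optimizes
$\mathcal{J}\big(x - Z + I(Z) - \pi(I)\big)$
over admissible indemnities I,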
in which x is the buyer’s initial wealth, and
$x - Z + I(Z) - \pi(I)$
is her terminal wealth when she purchases an insurance contract with indemnity I. Arrow (1963) assumes that
$\pi$
is given by a type of expected-value premium principle, specifically,
$\pi(I) = f(\mathbb{E} I)$
for some function f. Another popular choice is the variance premium principle,
$\pi = \mathbb{E}(I) + \frac{\theta}{2}\mathbb{V}(I)$
, in which
$\theta \ge 0$
and
$\mathbb{V}(\cdot)$
denotes the variance operator; Guerra and Centeno (2010) further propose an extended version,
$\pi = \mathbb{E}(I) + g (\mathbb{V}(I))$
, in which g is an increasing function with
$g(0)=0$
(see also Chi 2012). In recent work, more general premium principles are considered; see Cao et al. (2024b) for a convex-type principle family and Jin et al. (2024) for general distortion principles. However, one noticeable drawback of those premium principles is that
$\pi$
is independent of the policyholder’s claim history. In other words, a policyholder who incurs a loss every year would pay the same premium as a policyholder who never reports a loss, if the two of them were to choose the same insurance contract. This independence of
$\pi$
from the policyholder’s experience contradicts the insurance ratemaking practice that places a surcharge on the policyholder’s future premium whenever she files a claim (see Part V in Klugman et al. 2019 for standard credibility theory).
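To address the above drawback, we propose a novel premium principle that is based on the variance premium principle (Footnote 1) but takes into account the policyholder’s past claim information:
$\pi(Y_{t-1}, I_t) = \mathbb{E}(I_t) + \frac{\theta}{2}\, Y_{t-1}\, \mathbb{V}(I_t). \qquad (1.1)$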
In (1.1),
$Y_{t-1}$
represents the policyholder’s claim habit at the beginning of period t, and
$I_t$
is the indemnity function of the insurance contract covering period t; so, both
$Y_{t-1}$
and
$I_t$
are known at time
$t-1$
, and
$\pi(Y_{t-1}, I_t)$
is the premium paid at time
$t-1$
in exchange for the coverage specified by
$I_t$
during
$(t-1, t]$
. In addition, given
$Y_0 \in [0,1]$
, the claim habit
$Y_{t-1}$
is updated to
$Y_t$
by
$Y_t = v\, Y_{t-1} + (1 - v)\, {\unicode{x1D7D9}}_{\{Z_t \gt 0\}}, \qquad (1.2)$
in which
$v \in (0,1)$
is a weighting factor, and
$Z_t$
is the random loss variable in period t. Note that the effect of past claims decays exponentially in (1.2), as in the more familiar consumption habit models; see Constantinides (1990). The proposed premium principle in (1.1) inherits all the desirable properties of the usual variance premium principle, such as nonnegative safety loading, translation invariance, and additivity for independent risks; see Table 5.1 in Kaas et al. (2008, p. 122). To better understand it, consider the following simple example. Suppose that, at time 0, the buyer chooses her insurance contract
$I_1$
and her initial claim habit
$Y_0$
is known; then the premium she pays at time 0 for the first period is given by
$\pi(Y_0, I_1)$
. At the end of the first period, the loss
$Z_1$
is realized, and either
$Z_1 = 0$
(no loss) or
$Z_1 \gt 0$
(resulting in a claim) occurs. If
$Z_1 \gt 0$
, then
$Y_1 = v Y_0 + (1 - v) \gt Y_0$
, and the variance loading in (1.1) increases from
$\theta Y_0/2$
to
$\theta Y_1 /2$
. As a consequence, the same coverage for the second period would require a higher premium, when compared to that for the first period. If
$Z_1 = 0$
, then
$Y_1 = v Y_0 \lt Y_0$
, and the exact opposite holds. Thus, we conclude that the proposed premium principle in (1.1) complies with the golden rule of insurance ratemaking outlined earlier.
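The two-branch update in this example can be checked numerically. The following is a minimal Python sketch and not part of the original paper; the function names and the parameter values (v = 0.8, θ = 0.5, Y_0 = 0.5) are illustrative assumptions.

```python
def update_claim_habit(y_prev: float, loss: float, v: float) -> float:
    """Claim-habit recursion: Y_t = v * Y_{t-1} + (1 - v) * 1{Z_t > 0}."""
    claim_indicator = 1.0 if loss > 0 else 0.0
    return v * y_prev + (1.0 - v) * claim_indicator


def variance_loading(theta: float, y_prev: float) -> float:
    """Variance loading theta * Y_{t-1} / 2 applied to the coverage in period t."""
    return 0.5 * theta * y_prev


# Illustrative (assumed) parameter values, not taken from the paper.
v, theta, y0 = 0.8, 0.5, 0.5

y1_claim = update_claim_habit(y0, loss=3.0, v=v)     # a claim is filed: habit rises
y1_no_claim = update_claim_habit(y0, loss=0.0, v=v)  # no claim: habit falls

print("loading in period 1:          ", variance_loading(theta, y0))
print("loading in period 2, claim:   ", variance_loading(theta, y1_claim))
print("loading in period 2, no claim:", variance_loading(theta, y1_no_claim))
```

With these values the loading for period 2 is higher than that for period 1 after a claim and lower after a claim-free period, which is the golden-rule behavior described above.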
Because our premium principle depends on the policyholder’s claim history and because the claim habit is updated dynamically via (1.2), a standard one-period model, as introduced in Arrow (1963), is no longer appropriate; indeed, neither claim history nor dynamic updating is possible under a one-period model. This motivates us to consider a two-period model, which is the minimum prerequisite to study the effect of claim history on the policyholder’s insurance decision. We assume a priori that the policyholder purchases proportional insurance at the beginning of each period (Footnote 2); denote by
$\alpha_1$
and
$\alpha_2$
, both taking values in [0,1], the ceded proportion in period 1 and period 2, respectively. Under the claim-dependent variance premium principle in (1.1), the buyer seeks an optimal strategy to maximize her mean-variance (MV) preferences,
$\mathcal{J}(X_2)=\mathbb{E}(X_2) - \frac{\gamma}{2} \mathbb{V}(X_2)$
, in which
$X_2$
is the buyer’s terminal wealth (at time 2), and
$\gamma \gt 0$
is the risk aversion parameter. Although the literature on optimal insurance is rich and extensive (see Gollier 2000 for a survey), we are not aware of any work that is closely related to the research problem as described above. Nevertheless, several papers study optimal insurance under some experience rating models that incorporate claim information or behavior, but they are vastly different from ours. Venezia and Levy (1983) consider a bonus-malus system (BMS), under which a reported claim leads to a penalty (malus) on future premium, a feature shared with our premium principle (1.1), but in their model, the policyholder might hide certain losses from the insurer (called underreporting) in order to receive the premium bonus. Under the expected value principle and utility maximization criterion, they show that the optimal contract is deductible insurance. Cao et al. (2024a) obtain finer results for a problem similar to the one in Venezia and Levy (1983). Jammernegg and Kischka (1994) also study a discrete-time insurance decision-making problem under an experience rating system, but their formulation of the policyholder’s decision is rather restrictive: the policyholder either buys full insurance or does not buy any insurance. Lastly, Holtan (2001) solves for optimal insurance coverage under a BMS, but the model suffers from several drawbacks, including studying the problem within a classic one-period framework and assuming independence between premium discounts and insurance contracts. Another form of performance-based premium is the retrospective premium, first discussed by Meyers (1980, 2004).
Unlike BMS or our approach, which adjust premiums based on the insured’s past claim history, a retrospective premium depends on the insured’s future losses and is therefore random at the beginning of the policy period. Specifically, the insured initially pays a deterministic basic premium (for example, calculated using the expected value premium principle). Then, at the end of the policy period, a reward or penalty is applied to the premium based on whether the realized loss is considered small or large by the insurer. For applications of retrospective premium, see Landriault et al. (2024) for optimal reinsurance design and Chen et al. (2016) for an optimal retrospective rating plan. Both works focus on one-period optimization problems.
As already hinted, the buyer is endowed with MV preferences, and as an immediate consequence, caution is warranted when we solve for the “optimal” MV insurance strategy. Indeed, MV problems under dynamic multi-period or continuous-time models are time inconsistent, an issue well recognized in the literature, and there are two popular notions regarding the optimality of MV problems, pioneered by Strotz (1955). The first one is to treat the problem as a noncooperative game that the buyer plays against her future selves (see Björk and Murgoci 2010 for a standard reference); the corresponding solution is called the time-consistent equilibrium strategy. The second one is to assume that the buyer solves the MV problem at time 0 and commits to an optimal strategy over the entire planning horizon; the corresponding solution is called the optimal precommitment strategy (see Li and Ng 2000 and Zhou and Li 2000). When it comes to dynamic MV (re)insurance problems, much of the existing literature chooses the former notion, due to the analytical tractability of the extended system of Hamilton–Jacobi–Bellman equations, while fewer papers aim to obtain the optimal precommitment insurance contract, in particular under multi-period discrete-time models. For those adopting the time-consistent approach, we refer to Zeng and Li (2011), Chen and Shen (2019), Cao et al. (2020), Li and Young (2022), and Yuan et al. (2023) for a short list; as noted, papers that follow the precommitment notion are rather limited. In a continuous-time framework, Li et al. (2017) demonstrate that the precommitment reinsurance strategy is different from the time-consistent one, but without providing further analysis.
Shen and Zou (2021) conduct a detailed comparison of time-consistent and precommitment solutions to MV investment and insurance problems. We further point out that the problem of optimal insurance under the MV criterion is no easy task even for a one-period problem; see Chi and Tan (2021), Boonen and Jiang (2022), and Liang et al. (2023) (Footnote 3).
In this paper, we obtain both the time-consistent and the precommitment solutions of the policyholder’s MV optimal insurance problem under the claim-dependent variance premium principle in (1.1). We summarize our main contributions as follows:
• We obtain the time-consistent equilibrium strategy in closed form (see Theorem 3.1), which depends strongly on the policyholder’s claim habit
$\{Y_0, Y_1\}$.
• We obtain the optimal constant precommitment strategy in semiclosed form (see Theorem 4.1). As is well known, solving for the optimal precommitment strategy in a multi-period model, such as the one considered here, is notoriously difficult (Footnote 4). As such, the first step we take is to focus on a subclass of strategies, namely, constant precommitment strategies
$(a_1, a_2) \in [0, 1]^2$, in which both
$a_1$ and
$a_2$ are constants determined at time 0. Under this assumption, we show that the optimal strategy can be found by solving two cubic equations, each of which admits a unique solution in [0, 1].
• We propose an efficient iterative algorithm to numerically solve for the optimal general precommitment strategy. Note that for a general precommitment strategy
$(\alpha_1, \alpha_2)$,
$\alpha_1\in[0,1]$ is still a constant but
$\alpha_2$ might depend on the realization of the loss in period 1. Inspired by Li and Ng (2000), we introduce a family of auxiliary problems, indexed by a free parameter
$\lambda$, and conclude that the optimal precommitment strategy coincides with the solution of the auxiliary problem under a specific parameter
$\lambda^*$. However, our MV auxiliary problems are much more challenging than those arising from the MV investment problems in Li and Ng (2000). In particular, the solution of the auxiliary problem for period 2,
$\widetilde{\alpha}_2^\lambda$, is already complex and depends upon a binary condition and upon solving a cubic equation (see (4.10) and (4.11)). Thus, when substituting
$\widetilde{\alpha}_2^\lambda$ into the wealth dynamics, we cannot obtain an analytical solution of the auxiliary problem for period 1. We, thus, resort to numerical methods and propose an iterative algorithm that efficiently computes the optimal general precommitment strategy, often with fewer than 10 iterations (see Section 4.2).
• Furthermore, we compare the three different optimal strategies – the time-consistent equilibrium strategy, optimal constant precommitment strategy, and optimal general precommitment strategy – in detail, and study the effect of the policyholder’s risk aversion parameter
$\gamma$, initial claim habit
$Y_0$, and weighting factor v (see (1.2)) on the optimal strategies and value functions. In particular, both
$Y_0$ and v have a significant impact on the policyholder’s insurance decision.
The rest of the paper is organized as follows. In Section 2, we formulate the buyer’s dynamic optimal insurance problem. In Section 3, we derive the buyer’s time-consistent insurance strategy. Section 4 is then devoted to deriving the constant precommitment insurance strategy in semiclosed form and computing the general precommitment strategy. In Section 5, we carry out numerical analyses for our main results. Section 6 concludes the paper. We place all proofs in Appendices A, B, and C. In an online appendix, we extend the study from a two-period setting to a general n-period setting.
2. Model
We consider a two-period insurance model and a buyer of insurance who is exposed to an insurable risk
$Z_i$
in period i,
$i=1,2$
; the extension to a general n-period model can be found in the online appendix. Assume that
$Z_1$
and
$Z_2$
are independent and identically distributed as a nonnegative random variable Z, which follows a mixture distribution of a point mass at 0 and a strictly positive random variable
$Z_+$
. We fix a filtered probability space
$(\Omega, \{\mathcal{F}_{i}\}_{i=0,1,2}, \mathbb{P})$
and denote the expectation and variance operators under
$\mathbb{P}$
by
$\mathbb{E}(\cdot)$
and
$\mathbb{V}(\cdot)$
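, respectively. Define
$q := \mathbb{P}(Z \gt 0) \in (0,1], \qquad \mu := \mathbb{E}(Z), \qquad \sigma^2 := \mathbb{V}(Z)$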
(equivalently,
$\mathbb{P}(Z = 0) = 1 - q \in [0,1)$
), and assume the variance of Z is finite (
$\sigma^2 \lt \infty$
). Note that the mixture distribution of Z implies

Let F (resp.
$S = 1 - F$
) denote the cumulative distribution function (resp. survival function) of Z.
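For orientation, the first two moments of such a mixture follow by conditioning on whether a loss occurs. The worked computation below is a standard fact about mixtures rather than a quotation from the paper; it uses only the probability of a positive loss, written here as q, and the positive part $Z_+$:
$\mathbb{E}(Z) = q\,\mathbb{E}(Z_+), \qquad \mathbb{E}(Z^2) = q\,\mathbb{E}(Z_+^2),$
$\mathbb{V}(Z) = q\,\mathbb{E}(Z_+^2) - q^2\,\big(\mathbb{E}(Z_+)\big)^2 = q\,\mathbb{V}(Z_+) + q(1-q)\,\big(\mathbb{E}(Z_+)\big)^2 .$
In particular, finiteness of the variance of Z is equivalent to finiteness of the second moment of $Z_+$.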
Remark 2.1. The loss variable Z (or its positive part
$Z_+$)
is quite general and can be interpreted as the aggregate loss in each period. Indeed, one can further assume that
$Z_+$
is given by an aggregate loss model, such as the compound Poisson model (see Chapter 9 in Klugman et al. 2019).
To mitigate her risk exposure, the policyholder purchases proportional insurance from a representative insurer at the beginning of each period. She chooses the ceded proportion
$\alpha_1$
at time 0 for period 1 and
$\alpha_2 $
at time 1 for period 2, with both taking values in [0, 1]. For convenience, we often denote the policyholder’s insurance strategy by
$\vec{\alpha} \;:\!=\; (\alpha_1, \alpha_2)$
and, when we call it a contract, we mean the two proportional insurance coverages with ceded proportions
$\alpha_1$
in period 1 and
$\alpha_2$
in period 2, respectively. For a chosen contract
$\vec{\alpha}$
, the policyholder transfers
$\alpha_i Z_i$
risk to the insurer and retains the remaining
$(1 - \alpha_i) Z_i$
risk in period i,
$i=1,2$
.
Practical insurance ratemaking models often adjust individual insureds’ premiums according to their claim history (see Part V of Klugman et al. 2019). Inspired by this fact, we introduce a process
$\{Y_i\}_{i=0,1}$
and call it the policyholder’s claim habit, which is defined as the weighted average of the number of the previously filed claims. Specifically, denoting by
$Y_0 \in [0,1]$
the policyholder’s claim habit at time 0, we define
$Y_1$
by
$Y_1 = v\, Y_0 + (1 - v)\, {\unicode{x1D7D9}}_{\{Z_1 \gt 0\}}, \qquad (2.2)$
in which
$v \in [0,1]$
is the weight placed on the previous claim habit, and
${\unicode{x1D7D9}}_{\cdot}$
is an indicator function with its subscript denoting the condition. We assume that the insurer applies the following variance premium principle under claim habit to determine the premium for a proportional contract
$\alpha_i$
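in period i:
$\pi(\alpha_i, Y_{i-1}) = \mathbb{E}(\alpha_i Z_i) + \frac{\theta}{2}\, Y_{i-1}\, \mathbb{V}(\alpha_i Z_i), \quad i = 1, 2, \qquad (2.3)$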
in which
$\theta \gt 0$
is a positive constant, and
$Y_1$
is given by (2.2). Note that the loading factor in (2.3),
$\theta Y_{i-1}$
, is not a fixed constant but depends on the policyholder’s claim history via
$\{Y_i\}_{i=0,1}$
. Our proposed premium in (2.3) generalizes the usual variance premium principle (see, for instance, Chi 2012 and Liang and Yuen 2016), which is a special case of (2.3) with
$Y_0 = Y_1$
(corresponding to
$v = 1$
). Please see Remark 2.2 for a detailed explanation of the claim-habit-dependent variance premium principle in (2.3).
Remark 2.2.
$\pi$
in (2.3) is a function of two arguments: the first argument
$\alpha_i$
specifies the indemnity of the contract
$\alpha_i Z_i$
for period i, and the second argument
$Y_{i-1}$
measures the policyholder’s heterogeneous riskiness that is revealed by her claim history. First, if
$v = 1$
in (2.2), the insurer ignores individual claim history in pricing, and the principle in (2.3) recovers the usual variance premium principle with a constant loading factor
$\theta/2$
(by setting
$Y_0 = 1$
). Therefore, the proposed principle in (2.3) generalizes the usual variance premium principle by taking into account claim history in ratemaking. Second, when the policyholder files a claim (that is, when
$Z_1 \gt 0$)
in period 1, we have
$Y_1 \gt Y_0$
by (2.2), and the premium loading in period 2 increases according to (2.3); the opposite occurs if there is no claim filed in period 1. As such, the premium principle in (2.3) is consistent with most experience rating models used in practice. For instance, consider a simple bonus-malus system in which the policyholder, starting from a rate class with premium loading
$\theta Y_0$
, is moved to a class with higher premium loading
$\theta(vY_0 + (1-v))$
when she files a claim in the previous period and to a class with lower premium loading
$\theta vY_0$
when she does not. Also, the claim habit defined in (2.2), which enters (2.3) as a factor, resembles partial credibility models (see Part V of Klugman et al. 2019). Third,
$\theta/2$
is the maximum variance loading imposed by the insurer. To see this, consider a policyholder who files a claim in every period; in this extreme case, the claim habit asymptotically approaches its highest level of
$Y = 1$
.
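To make the limiting claim concrete, suppose (as an illustrative assumption that goes beyond the two-period model) that the recursion in (2.2) is applied over n consecutive periods. Iterating it gives, for a policyholder who files a claim in every period and for one who never files a claim, respectively,
$Y_n = v^n Y_0 + (1 - v^n) \quad \text{and} \quad Y_n = v^n Y_0 .$
Hence, for $v \in [0,1)$, the claim habit tends to 1 in the first case and to 0 in the second, so the variance loading $\theta Y_n / 2$ stays between 0 and $\theta/2$.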
In the actuarial literature, there are two main approaches for calculating premiums. The first approach is based on a mathematical functional
$\pi$
, called a premium principle, which maps a contract indemnity to a positive number. Examples include the expected-value premium principle and the variance premium principle; see Chapter 5 in Kaas et al. (2008) for an overview of popular premium principles. The second approach is based on a statistical model, which first computes the base premium
$\bar{P}$
from risk factors (called features or covariates) and, next, applies credibility theory to adjust for claim experience. As an example, one popular choice for calculating the base premium
$\bar{P}$
is to rely on generalized linear regression models (GLMs); next, given the past claim experience
$\bar C$
, the individual premium for next year is given by
$P = Z \bar{P} + (1 - Z)\bar C$
, in which
$Z \in [0,1]$
is the credibility factor, determined by some credibility framework; see, for example, Part V “Credibility” in Klugman et al. (2019). Our proposed premium principle in (2.3) combines the advantages of both approaches: on one hand, it inherits analytical tractability from a well-established premium principle (that is, the variance premium principle); on the other hand, it inherits practical features from experience ratemaking models, as discussed above.
Let
$\{X_i\}_{i=0,1,2}$
denote the buyer’s wealth process, in which
$X_0$
is the buyer’s initial wealth at time 0. Here,
$X_i$
,
$i=1,2$
, clearly depends on the policyholder’s initial wealth, initial claim habit, and insurance strategy, but we suppress this dependence for notational simplicity. However, we will use the precise notation if confusion may arise or the dependence needs to be emphasized. For instance,
$X_2^{(0, x, y), (\alpha_1, \alpha_2)}$
denotes the buyer’s wealth at time 2 starting with
$X_0 = x$
and
$Y_0 = y$
under the insurance strategy
$(\alpha_1, \alpha_2)$
. For a given insurance strategy
$\vec{\alpha} = (\alpha_1, \alpha_2)$
, the buyer’s wealth follows the dynamics
$X_i = X_{i-1} - \pi(\alpha_i, Y_{i-1}) - (1 - \alpha_i)\, Z_i, \qquad (2.4)$
for
$i=1,2$
, in which the premium principle
$\pi$
is given by (2.3).
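The wealth and claim-habit recursions can be traced along a single loss path. The sketch below is illustrative only: it assumes that in each period the buyer pays the premium and bears the retained loss $(1 - \alpha_i) Z_i$, that the premium for a proportional contract equals $\alpha_i \mathbb{E}(Z) + \frac{\theta}{2} Y_{i-1} \alpha_i^2 \mathbb{V}(Z)$, and that `mu` and `sigma2` stand for the mean and variance of Z. All function names and numerical values are hypothetical.

```python
def premium(alpha: float, y_prev: float, mu: float, sigma2: float, theta: float) -> float:
    """Claim-habit variance premium for ceding a proportion alpha (assumed form)."""
    return alpha * mu + 0.5 * theta * y_prev * alpha**2 * sigma2


def run_two_periods(x0, y0, alphas, losses, mu, sigma2, theta, v):
    """Return terminal wealth and claim habit along one realized loss path."""
    x, y = x0, y0
    for alpha, z in zip(alphas, losses):
        # Pay the premium priced with the habit known at the start of the period,
        # then bear the retained part of the realized loss.
        x = x - premium(alpha, y, mu, sigma2, theta) - (1.0 - alpha) * z
        # Update the claim habit: it rises after a claim and decays otherwise.
        y = v * y + (1.0 - v) * (1.0 if z > 0 else 0.0)
    return x, y


# Hypothetical inputs: full coverage in both periods, a claim in period 1 only.
x2, y2 = run_two_periods(
    x0=10.0, y0=0.5, alphas=(1.0, 1.0), losses=(2.0, 0.0),
    mu=1.0, sigma2=4.0, theta=0.5, v=0.8,
)
print(x2, y2)
```

In this path the period-2 premium is larger than the period-1 premium for the same coverage, because the claim in period 1 raises the habit used for pricing period 2.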
The information available to the buyer at time 0 includes her initial wealth
$X_0$
and initial claim habit
$Y_0$
, so we set
$\mathcal{F}_0 = \sigma(X_0, Y_0)$
to be the
$\sigma$
-field generated by
$X_0$
and
$Y_0$
. At time 1, she observes the loss
$Z_1$
in period 1, so we set
$\mathcal{F}_1 = \sigma(X_0, Y_0, Z_1)$
; likewise,
$\mathcal{F}_2 = \sigma(X_0, Y_0, Z_1, Z_2)$
. The buyer’s strategy must be non-anticipative, which implies
$\alpha_1 \in \mathcal{F}_0$
and
$\alpha_2 \in \mathcal{F}_1$
. We assume that the buyer knows both
$X_0$
and
$Y_0$
at time 0; as such, the policyholder’s insurance strategy at time 0,
$\alpha_1 \in \mathcal{F}_0$
, is a constant. However,
$\alpha_2 \in \mathcal{F}_1$
is allowed to depend on the loss
$Z_1$
of period 1, which in turn determines
$X_1$
and
$Y_1$
by (2.4) and (2.2), respectively. Furthermore, both
$\alpha_1$
and
$\alpha_2$
take values in [0,1]; that is, neither over-insurance nor short-selling insurance is permitted, as it should be in real life. To summarize, the set of all admissible strategies,
$\mathcal{A}$
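, is given by
$\mathcal{A} := \big\{ (\alpha_1, \alpha_2) \,:\, \alpha_1 \in \mathcal{F}_0, \ \alpha_2 \in \mathcal{F}_1, \ \alpha_i \in [0,1] \text{ for } i = 1, 2 \big\}.$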
Some papers do not impose the constraint
$\alpha \in [0, 1]$
and allow
$\alpha \in \mathbb{R}$
; see, for example, Peng et al. (2014).
The buyer of insurance is endowed with MV preferences, which are characterized by (see Li and Ng 2000)
$\mathcal{J}(X_2) = \mathbb{E}(X_2) - \frac{\gamma}{2}\, \mathbb{V}(X_2),$
in which
$\gamma \gt 0$
balances the trade-off between “return” (mean) and “risk” (variance). Naturally, one would maximize
$\mathcal{J}(X_2)$
over all feasible strategies
$\vec{\alpha} \in \mathcal{A}$
to find an “optimal” strategy. However, doing so is problematic because of the so-called time inconsistency issue related to MV preferences
$\mathcal{J}$
; we refer readers to Björk and Murgoci (2010) for a nice introduction to this topic. There exist different notions of “optimality” associated with MV preferences (see Strotz 1955). In this work, we consider two types of policyholders: both are aware of the inherent time inconsistency of MV preferences
$\mathcal{J}$
, but tackle it differently. The first type seeks a time-consistent (Nash) equilibrium strategy, which she will follow without imposing precommitment; the second type of policyholder obtains an optimal strategy at time 0 and commits to this strategy even though it may cease to be optimal at time 1. In Section 3, we use the game-theoretical approach to find the time-consistent equilibrium strategy for the first-type policyholder. In Section 4, we consider the second type policyholder and find the optimal precommitment strategy.
3. Time-consistent solution
In this section, the buyer of insurance is thrifty, in the terminology of Strotz (1955), when facing time inconsistency of her MV preferences, and, in response, only considers those strategies, called time-consistent strategies, that she will obey in the future without requiring commitment at an earlier time. A standard time-consistent approach is the game-theoretical approach proposed by Björk and Murgoci (2010), under which the buyer’s selves in different time periods are seen as different players in a game; thus, the associated solution is often termed the (time-consistent, or Nash) equilibrium strategy. For discrete-time models such as ours, this approach boils down to solving the problem via backward induction from the last period to the first period.
To start, define the buyer’s objectives
$J_0$
and
$J_1$
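by
$J_0(x, y; \alpha_1, \alpha_2) := \mathbb{E}_{0,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{V}_{0,x,y}(X_2), \qquad (3.1)$
$J_1(x, y; \alpha_2) := \mathbb{E}_{1,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{V}_{1,x,y}(X_2), \qquad (3.2)$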
in which the subscripts in
$\mathbb{E}$
and
$\mathbb{V}$
denote the conditions of
$X_i = x \in \mathbb{R}$
and
$Y_i = y \in [0,1]$
for
$i=0$
or
$i=1$
. Next, we provide a formal definition of the time-consistent equilibrium strategy below.
Definition 3.1. A strategy
$ (\alpha_1^*, \alpha_2^*)$
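is called a time-consistent equilibrium strategy if it satisfies
$J_0(x, y; \alpha_1^*, \alpha_2^*) = \sup_{\alpha_1 \in [0,1]} J_0(x, y; \alpha_1, \alpha_2^*), \qquad (3.3)$
$J_1\big(X_1^{(0,x,y), \alpha_1^*}, Y_1; \alpha_2^*\big) = \sup_{\alpha_2 \in [0,1]} J_1\big(X_1^{(0,x,y), \alpha_1^*}, Y_1; \alpha_2\big), \qquad (3.4)$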
in which
$X_1^{(0,x,y), \alpha_1^*}$
is the buyer’s wealth at time 1 starting from
$X_0 = x$
and
$Y_0 = y$
and by following strategy
$\alpha_1^*$
in period 1, and
$Y_1$
is given by (2.2) with
$Y_0 = y$
.
In Definition 3.1, the key to ensuring time consistency is to impose the constraint
$\alpha_2 = \alpha_2^*$
in the buyer’s optimization problem at time 0 as in (3.3). In other words, the buyer solves for her best strategy at time 0, assuming that she will follow
$\alpha_2^*$
in period 2. By (3.4),
$\alpha_2^*$
is the buyer’s best strategy at time 1 when she follows
$\alpha_1^*$
in period 1. As such, upon finding
$(\alpha_1^*, \alpha_2^*)$
from (3.3) and (3.4) jointly, the buyer has no incentive to deviate from it at time 1, preserving consistency in decision.
Definition 3.1 suggests that we should follow a backward approach to find the time-consistent equilibrium strategy. To be precise, we first solve a general version of (3.4) at time 1 for an arbitrary pair of initial conditions
$(X_1, Y_1)$
to obtain
$\alpha_2^* \;:\!=\; \alpha_2^*(X_1, Y_1)$
; next, we solve (3.3) at time 0 given that the buyer follows
$\alpha_2^*$
in period 2. We summarize the findings on the buyer’s time-consistent equilibrium strategy in the next theorem, whose proof is given in Appendix A.
Theorem 3.1. The buyer’s time-consistent equilibrium strategy
$ (\alpha_1^*, \alpha_2^*)$
, in the sense of Definition 3.1, is uniquely given by

$\alpha_2^* = \frac{\gamma}{\gamma + \theta\, Y_1}, \qquad (3.6)$
in which
$Y_1$
is given by (2.2), and
$Y_h$
and
$Y_l$
are the two possible values of
$Y_1$
defined by
$Y_h := v\, Y_0 + (1 - v) \quad \text{and} \quad Y_l := v\, Y_0 .$
Based on
$\alpha_2^*$
in (3.6), we conclude that the policyholder’s equilibrium strategy in period 2 is negatively correlated with her claim habit
$Y_1$
and the insurer’s loading factor
$\theta$
, but is positively correlated with her risk aversion parameter
$\gamma$
. These findings make intuitive sense because when
$Y_1$
or
$\theta$
increases, the insurance contract becomes more expensive, and when
$\gamma$
increases, the policyholder becomes more risk averse and demands more coverage to reduce uncertainty. Interestingly,
$\alpha_2^*$
is independent of the distribution information of
$Z_2$
, but
$\alpha_1^*$
depends on the mean of the loss
$\mu$
. The policyholder’s equilibrium strategy in period 1, given by (3.5), is more complex, partially because we restrict
$\alpha_1 \in [0,1]$
, and the critical point of
${\unicode{x1D4EF}}$
in (A3) might lie outside this interval.
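The comparative statics for period 2 can be illustrated numerically. The sketch below is our own check, not the paper's proof: it assumes that the period-2 premium is $\alpha_2 \mathbb{E}(Z) + \frac{\theta}{2} Y_1 \alpha_2^2 \mathbb{V}(Z)$ and that wealth falls by the premium and the retained loss. Under these assumptions, the part of the period-2 MV objective that depends on the ceded proportion a is $-\frac{\theta}{2} Y_1 a^2 \sigma^2 - \frac{\gamma}{2}(1-a)^2 \sigma^2$, whose stationary point is $\gamma/(\gamma + \theta Y_1)$. The code compares a brute-force grid maximizer with that candidate; all names and parameter values are hypothetical.

```python
def period2_objective(a, y1, gamma, theta, sigma2):
    """Terms of the period-2 mean-variance objective that depend on a (assumed model)."""
    mean_part = -0.5 * theta * y1 * a**2 * sigma2      # premium loading lowers the mean
    variance_part = (1.0 - a) ** 2 * sigma2            # variance of the retained loss
    return mean_part - 0.5 * gamma * variance_part


def grid_argmax(y1, gamma, theta, sigma2, n=10_000):
    """Maximize the objective over the grid {0, 1/n, ..., 1} by brute force."""
    grid = [k / n for k in range(n + 1)]
    return max(grid, key=lambda a: period2_objective(a, y1, gamma, theta, sigma2))


def candidate_closed_form(y1, gamma, theta):
    """Stationary point gamma / (gamma + theta * y1) of the concave objective."""
    return gamma / (gamma + theta * y1)


gamma, theta, sigma2 = 2.0, 0.5, 4.0   # hypothetical parameter values
for y1 in (0.4, 0.6):                  # habit after a claim-free period vs. after a claim
    print(y1, grid_argmax(y1, gamma, theta, sigma2), candidate_closed_form(y1, gamma, theta))
```

The maximizer does not change when `sigma2` is varied, in line with the observation that the period-2 strategy does not depend on the loss distribution, and it is smaller for the higher claim habit.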
We close this section with a brief discussion of the conditions under which
$\alpha_1^*$
is in the interior of [0, 1]. In what follows, we fix the initial claim habit
$Y_0 = y \in [0,1]$
, which implies
$Y_h = y_h$
and
$Y_l = y_l$
as in (A1). By a straightforward calculation, one can show that

From the above two conditions, one can easily establish simple, though not necessarily tight, sufficient conditions leading to an interior solution. For instance,
$\alpha_1^* \in (0,1)$
if the probability of a positive loss q equals 1. This result also holds when
$v = 1$
, which reduces the premium in (2.3) to the usual variance premium principle. By using (3.8), one can show that if
$2 y - \gamma \mu(1-q)(1-v) \gt 0$
, then
$\alpha_1^* \lt 1$
. Overall, the conditions for
$\alpha_1^* \in (0,1)$
are not strong, and in all of our numerical examples, we always obtain
$\alpha_1^* \in (0,1)$
. On the other hand, note that when the buyer is extremely risk averse (
$\gamma \to \infty$
), the condition in (3.8) no longer holds, and
$\alpha_1^* = 1$
, indicating full insurance.
4. Precommitment solution
In this section, the buyer of insurance is aware of the time inconsistency of her MV preferences but chooses to precommit to her future behavior. As such, she solves her MV problem at time 0 and commits to this optimal strategy throughout the planning horizon even though it may cease to be optimal at a future time. The definition of the buyer’s time-inconsistent problem is given below.
Definition 4.1. A strategy
$(\widehat{\alpha}_1, \widehat{\alpha}_2)$
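is called a (time-inconsistent) precommitment strategy if it satisfies
$(\widehat{\alpha}_1, \widehat{\alpha}_2) \in \arg\max_{(\alpha_1, \alpha_2) \in \mathcal{A}} \Big\{ \mathbb{E}_{0,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{V}_{0,x,y}(X_2) \Big\}, \qquad (4.1)$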
in which
$X_0 = x \in \mathbb{R}$
and
$Y_0 = y \in [0,1]$
.
Note that (4.1) is significantly different from (3.3), although both are formulated at time 0; the key difference is that there is no consistency constraint on
$\alpha_2$
in (4.1). Although the MV insurance problem has been studied under a static setting in, for instance, Chi and Tan (2021) and Boonen and Jiang (2022), and is closely related to MV portfolio selection problems, which are well studied in the literature (see Li and Ng 2000), solving (4.1) is highly technical, mainly due to the presence of the claim habit
$Y_1$
and its involvement in the premium principle
$\pi$
defined by (2.3). As such, to tackle (4.1), we first restrict our attention to constant strategies.
4.1 Constant strategies
In this subsection, we study (4.1) under the additional constraint that the buyer of insurance follows a constant strategy
$(a_1, a_2) \in [0,1]^2$
, in which
$a_i$
denotes the ceded proportion in period i,
$i=1,2$
. We remark that for a general strategy
$(\alpha_1, \alpha_2) \in \mathcal{A}$
,
$\alpha_2 \in \mathcal{F}_1$
is allowed to depend on
$X_1$
or
$Y_1$
(through the realization of
$Z_1$
), but for a constant strategy
$(a_1, a_2)$
, both
$a_1$
and
$a_2$
are constants chosen by the buyer at time 0. The proof of Theorem 4.1 is provided in Appendix B.
Theorem 4.1. Let
$\widetilde{a}_2 \in (0, 1)$
denote the unique solution of
$\widetilde{{\unicode{x1D4F0}}}_c(a_2)=0$
and
$\overline{a}_2 \in (0, 1)$
the unique solution of
$\overline{{\unicode{x1D4F0}}}_c(a_2) = 0$
, in which
$\widetilde{{\unicode{x1D4F0}}}_c$
and
$\overline{{\unicode{x1D4F0}}}_c$
are defined, respectively, by

and

Then, the optimal constant precommitment strategy
$(a_1^*, a_2^*)$
is given by (the threshold of
$\tilde{a}_2$
is set equal to
$\infty$
if the denominator is 0)

Equations (4.2) and (4.3) are both cubic equations in
$a_2$
; as such, we are unlikely to obtain analytical sensitivity results on how the model parameters affect the optimal constant precommitment strategy
$(a_1^*, a_2^*)$
. Instead, in Section 5.3, we study these effects numerically. In the special case when
$q = 1$
(loss occurs with a probability of 1) or
$v = 1$
(
$\pi$
reduces to the usual variance premium principle), (4.2) and (4.3) reduce to linear equations, and we obtain the following corollary, in which we observe that the optimal constant precommitment strategies increase in the buyer’s risk aversion
$\gamma$
and decrease in both the safety loading factor
$\theta$
and initial claim habit y.
Corollary 4.1. When
$q=1$
, we have
$a_1^* = \frac{\gamma}{\gamma + \theta y}$
and
$a_2^* = \frac{\gamma}{\gamma + \theta(vy + 1-v)}$
. When
$v = 1$
, we have
$a_1^* = a_2^* = \frac{\gamma}{\gamma + \theta y}$
. In both cases,
$a_i^*$
increases in
$\gamma$
and decreases in
$\theta$
and y, for
$i = 1,2$
.
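The closed forms in Corollary 4.1 are easy to evaluate directly. The following is a minimal sketch; the function names and the values of $\theta$ and $v$ are illustrative assumptions (only $\gamma = 0.5$ and $y = 0.5$ are taken from the base case).

```python
def constant_strategy_q1(gamma, theta, y, v):
    """Closed forms of Corollary 4.1 when q = 1 (a loss occurs with certainty)."""
    a1 = gamma / (gamma + theta * y)
    a2 = gamma / (gamma + theta * (v * y + 1.0 - v))
    return a1, a2


def constant_strategy_v1(gamma, theta, y):
    """Closed forms of Corollary 4.1 when v = 1 (usual variance premium principle)."""
    a = gamma / (gamma + theta * y)
    return a, a


# theta = 0.2 and v = 0.6 are hypothetical values chosen for illustration only.
a1, a2 = constant_strategy_q1(gamma=0.5, theta=0.2, y=0.5, v=0.6)
print(a1, a2)
```

Raising `gamma` in this sketch raises both proportions, while raising `theta` or `y` lowers them, in line with the monotonicity stated in the corollary.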
4.2 General strategies
In this subsection, we study the general precommitment strategy
$(\alpha_1, \alpha_2)$
for Problem (4.1). Recall that for a general strategy
$(\alpha_1, \alpha_2) \in \mathcal{A}$
,
$\alpha_2$
depends on
$X_1$
or
$Y_1$
through the realization of
$Z_1$
, but for a constant strategy
$(a_1, a_2)$
, both
$a_1$
and
$a_2$
are chosen at time 0. Obtaining the optimal MV precommitment strategy in a discrete-time setting is highly nontrivial. Li and Ng (2000) were the first to obtain the precommitment strategy for a multi-period MV portfolio optimization problem; the key is the so-called embedding method, which embeds the original problem (one single challenging problem) into a family of auxiliary problems indexed by a parameter
$\lambda \in \mathbb{R}$
. These auxiliary problems are time consistent and can be solved relatively easily; after solving them, we select a special parameter
$\lambda^*$
under which the corresponding auxiliary problem is equivalent to the original MV problem and, hence, the solution of the auxiliary problem under
$\lambda^*$
is the optimal precommitment strategy we are looking for. Following their approach, we adopt this embedding technique and outline below the roadmap for finding the optimal precommitment strategy
$(\widehat{\alpha}_1, \widehat{\alpha}_2)$
for Problem (4.1).
• We first introduce a family of auxiliary problems indexed by parameter $\lambda$, with objective function $G_0^\lambda$ defined by
\begin{align} G_0^\lambda(x, y;\; \alpha_1, \alpha_2) &= \mathbb{E}_{0, x, y} \Big( \lambda X_2 - \frac{\gamma}{2} X_2^2 \Big). \tag{4.5} \end{align}
Note that we can write $G_0^\lambda$ in the form of $G_0^\lambda = \mathbb{E}_{0, x, y} (\phi(X_2))$, and the nonlinearity is inside the (conditional) expectation. In comparison, $J_0$ involves $\big(\mathbb{E}_{0,x,y} (X_2) \big)^2$, and the nonlinearity is outside the (conditional) expectation, which is known to cause time inconsistency (see, for instance, Björk and Murgoci, 2010). Denote the optimal strategy to the auxiliary problem in (4.5) by $(\widetilde\alpha_1^\lambda, \widetilde\alpha_2^\lambda)$, that is,
\begin{equation} (\widetilde{\alpha}^\lambda_1, \widetilde{\alpha}^\lambda_2) = \mathop{\textrm{arg max}}\limits_{(\alpha_1, \alpha_2) \in \mathcal{A}} \, G_0^\lambda(x, y;\; \alpha_1, \alpha_2). \tag{4.6} \end{equation}
• With $(\widetilde\alpha_1^\lambda, \widetilde\alpha_2^\lambda)$ obtained for all $\lambda \in \mathbb{R}$, Theorem 2 of Li and Ng (2000) shows that the optimal strategy $(\widehat{\alpha}_1, \widehat{\alpha}_2)$ to the original problem in (4.1) equals $(\widetilde\alpha_1^{\lambda^*}, \widetilde\alpha_2^{\lambda^*})$, in which $\lambda^*$ solves
\begin{align} \lambda^* = 1 + \gamma \, \mathbb{E}_{0,x,y} \left( X_2^{(\widetilde{\alpha}_1^{\lambda^*}, \, \widetilde{\alpha}_2^{\lambda^*})} \right)\!. \tag{4.7} \end{align}
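The form of the fixed-point condition (4.7) can be motivated by a short calculation. The sketch below assumes that $J_0$ is the mean of $X_2$ minus $\frac{\gamma}{2}$ times its variance, as in the objective values reported in Section 5.2; the symbols $U$, $m_1$, and $m_2$ are notation introduced here for illustration and are not part of the original argument.

```latex
\begin{align*}
J_0 &= \mathbb{E}_{0,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{V}_{0,x,y}(X_2)
     = U\big( \mathbb{E}_{0,x,y}(X_2), \, \mathbb{E}_{0,x,y}(X_2^2) \big),
\qquad
U(m_1, m_2) := m_1 + \frac{\gamma}{2}\, m_1^2 - \frac{\gamma}{2}\, m_2, \\[4pt]
\frac{\partial U}{\partial m_1} &= 1 + \gamma\, m_1,
\qquad
\frac{\partial U}{\partial m_2} = - \frac{\gamma}{2}.
\end{align*}
% U is convex in (m_1, m_2), so it lies above its tangent plane at the optimal
% moments (m_1^*, m_2^*). Hence, an optimal strategy of the original problem
% must also maximize the linearized objective
\begin{align*}
(1 + \gamma\, m_1^*)\, \mathbb{E}_{0,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{E}_{0,x,y}(X_2^2)
 = G_0^{\lambda}(x, y;\; \alpha_1, \alpha_2)
\quad \text{with} \quad \lambda = 1 + \gamma\, m_1^*,
\end{align*}
% which matches (4.5) and (4.7) when m_1^* is the expected terminal wealth
% under the optimal strategy.
```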
Following the above methodology, we first solve for the optimal strategy
$(\widetilde\alpha_1^\lambda, \widetilde\alpha_2^\lambda)$
in (4.6) to the auxiliary problem, for any
$\lambda\in \mathbb{R}$
. To that end, we apply the dynamic programming principle (DPP) and define the value function at time 0 and time 1, respectively, by

in which
$G_1^\lambda(x, y;\; \alpha_2) = \mathbb{E}_{1, x, y} \left( \lambda X_2 - \frac{\gamma}{2} X_2^2 \right)$
. The DPP implies

in which
$X_1^{(0,x,y), \alpha_1}$
is the buyer’s wealth at time 1 starting from the initial condition
$(X_0 = x, Y_0 = y)$
and following strategy
$\alpha_1$
in period 1, and
$Y_1$
is given by (2.2) with
$Y_0 = y$
. Equation (4.9) suggests that we can solve the auxiliary problem in (4.6) by a backward approach that is nearly identical to the one used in Section 3. The essential difference is that the consistency condition is imposed a priori (as a constraint) in Definition 3.1 but is satisfied automatically for the auxiliary problems, thanks to the DPP (4.9). Theorem 4.2 characterizes the optimal strategy
$\widetilde\alpha_2^\lambda$
for period 2 for any
$\lambda\in \mathbb{R}$
, whose proof is postponed to Appendix C.
Theorem 4.2. The optimal strategy of the auxiliary problem in period 2 of (4.8) is given by

in which
$\alpha_2^+$
is the unique positive solution of

Although
$\widetilde\alpha_2^\lambda$
is available semi-explicitly, solving for
$\widetilde\alpha_1^\lambda$
through (4.9) is highly nontrivial because (4.11) does not admit a closed-form solution, and the constraint
$\alpha_2 \le 1$
might be binding, as shown in (4.10). There is little hope of finding an analytical solution to
$\widetilde{\alpha}_1^\lambda$
, so we rely on numerical methods for obtaining
$\widetilde{\alpha}_1^\lambda$
in the next section.
With
$(\widetilde\alpha_1^\lambda, \widetilde\alpha_2^\lambda)$
for all
$\lambda \in \mathbb{R}$
in hand,
$\lambda^*$
is determined implicitly by (4.7), with

By Theorem 2 of Li and Ng (2000), we have

In Section 5.1, we elaborate on how
$\widetilde\alpha_1^\lambda$
and ultimately,
$(\widehat\alpha_1, \widehat\alpha_2)$
can be solved numerically.
Table 1. Model parameters in the base case.

5. Numerical study
In this section, we conduct a detailed numerical study for three key purposes. First, we numerically solve for the optimal general precommitment strategy discussed in Section 4.2. Second, we compare the three different optimal strategies considered in the previous two sections. Third, we investigate how various model parameters affect the three optimal strategies.
To start, we set up a base case as follows. The loss variable Z follows a mixture distribution of a point mass at 0 and a Gamma distributed random variable
$Z_+$
with probability density function

The model parameters in the base case are summarized in Table 1. Note that given the parameters in the base case, the mean and variance of
$Z_+$
are given by
$\widetilde{\mu} = \kappa \eta = 2$
and
$\widetilde{\sigma}^2 = \kappa \eta^2 =2$
, respectively. Furthermore, we obtain
$\mathbb{E}(Z) = q \widetilde{\mu} = 1.6$
and
$\mathbb{V}(Z) = q \widetilde{\sigma}^2 + q(1-q) \widetilde{\mu}^2 = 2.24.$
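The moments above can be reproduced with a few lines of code. This is a minimal sketch; the values $q = 0.8$, $\kappa = 2$, and $\eta = 1$ are inferred from the stated moments (treating $\eta$ as the scale parameter of the Gamma distribution) rather than read from Table 1, and the function name is hypothetical.

```python
def mixture_moments(q, kappa, eta):
    """Mean and variance of Z, a mixture of a point mass at 0 (probability 1 - q)
    and a Gamma(kappa, eta) variable Z_+ (probability q), with eta the scale."""
    mu_plus = kappa * eta            # mean of Z_+
    var_plus = kappa * eta ** 2      # variance of Z_+
    mean_z = q * mu_plus
    var_z = q * var_plus + q * (1.0 - q) * mu_plus ** 2
    return mean_z, var_z


# Parameter values inferred from the moments quoted in the text (an assumption).
mean_z, var_z = mixture_moments(q=0.8, kappa=2.0, eta=1.0)
print(mean_z, var_z)
```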
5.1 Numerical solution of the general precommitment strategy
The goal of this subsection is to illustrate how we numerically solve for the optimal general precommitment strategy
$(\widehat{\alpha}_1, \widehat{\alpha}_2)$
defined by (4.1). First, we follow the two-step approach introduced in Section 4.2 to obtain the optimal strategy
$(\widetilde{\alpha}^\lambda_1, \widetilde{\alpha}^\lambda_2)$
in (4.6) of the auxiliary problem for all
$\lambda$
. Second, we solve the implicit equation in (4.7) to identify the “optimal” parameter
$\lambda^*$
and use (4.13) to obtain
$\widehat{\alpha}_1 = \widetilde{\alpha}_1^{\lambda^*}$
and
$\widehat{\alpha}_2 = \widetilde{\alpha}_2^{\lambda^*}$
. We outline the details for each step below.
In the first step of finding
$(\widetilde{\alpha}^\lambda_1, \widetilde{\alpha}^\lambda_2)$
, note that the buyer’s initial wealth x and initial claim habit y are known at time 0, with
$x=10$
and
$y=0.5$
as in Table 1. In addition,
$\widetilde{\alpha}^\lambda_2$
is obtained semi-explicitly by (4.10) and can be easily computed by checking which condition holds in (4.10) and, then, solving the cubic equation in (4.11). Thus,
$\widetilde{\alpha}^\lambda_1$
is given by

in which
$X_2 = X_2^{(0, x, y), (\alpha_1, \widetilde{\alpha}^\lambda_2)}$
. Given a fixed strategy
$\alpha_1$
, we can efficiently compute
$G_0^\lambda(x,y;\; \alpha_1, \widetilde{\alpha}_2^\lambda)$
because
$\mathbb{E}_{0,x,y}(X_2)$
and
$\mathbb{E}_{0,x,y}(X_2^2)$
are easily obtainable via (C1) and (C2), along with (4.10). This motivates us to propose the following iterative improvement algorithm to search for the optimizer
$\widetilde{\alpha}^\lambda_1$
: At step k, we search for the maximizer of
$G_0^\lambda(x,y;\; \alpha_1, \widetilde{\alpha}_2^\lambda)$
over n equally spaced points of
$[m_k, M_k] \subset [0, 1]$
(we initiate
$[m_1, M_1] = [0,1]$
and choose
$n=50$
); denote the maximizer by
$\widetilde{\alpha}^{\lambda,k}_1$
. If
$|\widetilde{\alpha}^{\lambda,k}_1 - \widetilde{\alpha}^{\lambda,k-1}_1| \lt \epsilon$
(we set
$\epsilon = 10^{-6}$
), we stop and obtain
$\widetilde{\alpha}^{\lambda}_1 \approx \widetilde{\alpha}^{\lambda,k}_1$
. Otherwise, we set
$m_{k+1}= \max\{\widetilde{\alpha}^{\lambda,k}_1 - 1 / 2^{k}, 0\}$
and
$M_{k+1} = \min\{\widetilde{\alpha}^{\lambda,k}_1 + 1 / 2^{k}, 1\}$
and, then, proceed to step
$(k+1)$
. We plot
$\widetilde{\alpha}^{\lambda}_1$
as a function of
$\lambda$
in Figure 1 and observe that
$\widetilde{\alpha}^{\lambda}_1$
decreases as
$\lambda$
increases. This result is intuitively pleasing because with the increase of
$\lambda$
, the buyer of insurance places more weight on maximizing her expected wealth, but the insurance is sold above its actuarially fair value, so she will purchase less insurance.
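The iterative improvement search described above can be sketched as follows. This is an illustration under stated assumptions: `objective` stands in for the map $\alpha_1 \mapsto G_0^\lambda(x,y;\; \alpha_1, \widetilde{\alpha}_2^\lambda)$, whose evaluation via (C1), (C2), and (4.10) is not reproduced here, and the function name and the `max_steps` safeguard are hypothetical additions.

```python
import numpy as np


def refine_grid_search(objective, n=50, eps=1e-6, max_steps=60):
    """Maximize `objective` over [0, 1] by repeatedly searching n equally spaced
    points on an interval [m_k, M_k] that shrinks around the current maximizer."""
    lo, hi = 0.0, 1.0                      # [m_1, M_1] = [0, 1]
    prev = None
    best = None
    for k in range(1, max_steps + 1):
        grid = np.linspace(lo, hi, n)
        values = np.array([objective(a) for a in grid])
        best = float(grid[np.argmax(values)])
        if prev is not None and abs(best - prev) < eps:
            return best                    # successive maximizers agree within eps
        lo = max(best - 1.0 / 2 ** k, 0.0)     # m_{k+1}
        hi = min(best + 1.0 / 2 ** k, 1.0)     # M_{k+1}
        prev = best
    return best                            # safeguard: stop after max_steps


# Stand-in concave objective with a known maximizer (not the objective of the paper).
print(refine_grid_search(lambda a: -(a - 0.3712) ** 2))
```

Because the search interval halves in width at each step while the number of grid points stays fixed, the grid spacing shrinks geometrically, which is why a small $n$ suffices.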

Figure 1.
$\widetilde{\alpha}^{\lambda}_1$
as a function of
$\lambda$
.
In the second step, the key is to solve the implicit equation in (4.7), which we reproduce below for convenience

Recall that the expectation in (5.1) can be explicitly computed by (4.12). Because the solution
$\lambda^*$
is a fixed point of
$\varphi$
, we adopt an iterative algorithm that computes
$\lambda_{k+1} = \varphi(\lambda_k)$
and stops whenever
$|\lambda_{k+1} - \lambda_k| \lt \epsilon$
. For an initial value of
$\lambda_0 = 0$
, we find that the algorithm converges in 8 iterations with
$\lambda_8 = 4.1677 = \lambda^*$
, when the tolerance is
$\epsilon = 10^{-6}$
; see the convergence plot in Figure 2. Normally, the choice of the initial value
$\lambda_0$
has a major effect on the rate of convergence, but this is not the case here. Our extensive numerical work (not shown here) shows that the convergence of
$\lambda_n$
to
$\lambda^*$
is insensitive to the choice of the initial value
$\lambda_0$
and is efficient (within 10 iterations). Finally, once we obtain
$\lambda^*$
, we use (4.13) and the algorithm in the first step with
$\lambda = \lambda^*$
to obtain
$\widehat{\alpha}_1 = \widetilde{\alpha}_1^{\lambda^*}$
and
$\widehat{\alpha}_2 = \widetilde{\alpha}_2^{\lambda^*}$
.
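The fixed-point iteration in the second step can be sketched generically as follows. The map `phi` passed in below is a toy contraction used only to exercise the loop; it is not the map $\varphi$ of this paper, which requires the first-step algorithm and (4.12). The function name and the `max_iter` safeguard are assumptions.

```python
def fixed_point(phi, lam0=0.0, eps=1e-6, max_iter=1000):
    """Iterate lam_{k+1} = phi(lam_k) until successive iterates differ by less
    than eps; return the last iterate and the number of iterations used."""
    lam = lam0
    for k in range(1, max_iter + 1):
        lam_next = phi(lam)
        if abs(lam_next - lam) < eps:
            return lam_next, k
        lam = lam_next
    raise RuntimeError("fixed-point iteration did not converge")


# Toy map with fixed point 2 (hypothetical, for illustration only).
lam_star, n_iter = fixed_point(lambda lam: 1.0 + 0.5 * lam)
print(lam_star, n_iter)
```

The iteration converges whenever the map is a contraction near the fixed point; the fast convergence reported above suggests that $\varphi$ behaves this way for the base-case parameters.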

Figure 2. Convergence of
$\lambda_n$
to
$\lambda^* = 4.1677$
.
5.2 Comparison of the three optimal strategies
In this subsection, we compare the three different optimal strategies for the policyholder’s MV insurance problem. Recall that the time-consistent equilibrium strategy (TC for short) is stated in (3.5)–(3.6) in Theorem 3.1, the optimal constant precommitment strategy (CP for short) is given by (4.4) in Theorem 4.1, and the optimal general precommitment strategy (GP for short) is found by the two-step approach introduced in Section 4.2 and implemented in Section 5.1.
In the subsequent study, we write
$\alpha_i^*$
to denote a generic value of the optimal strategy, not just the equilibrium TC strategy as before, in period i (
$i=1,2$
), and we write
$X_2^*$
to denote the corresponding terminal wealth. Also, note that for all three strategies,
$\alpha_1^*$
is a constant;
$\alpha_2^*$
depends on
$Z_1$
(the loss in period 1) for the equilibrium TC and optimal GP strategies, but
$\alpha_2^*$
is again a constant for the optimal CP strategy, as suggested by its name. This explains why we compare
$\alpha_1^*$
directly but
$\mathbb{E}(\alpha_2^*)$
for the three optimal strategies. We also compare the corresponding expectations
$\mathbb{E}_{0}(X_2^*)$
, variances
$\mathbb{V}_{0}(X_2^*)$
, and objective values
$\mathbb{E}_{0}(X_2^*) - \frac{\gamma}{2} \mathbb{V}_{0}(X_2^*)$
, in which the subscript 0 is short for the initial condition triple (0, x, y). Before we present our results, we emphasize that both the time-consistent equilibrium and precommitment strategies are valid solutions of the MV problems, and we do not rank them in terms of superiority. With that in mind, we first fix the model parameters as in the base case (see Table 1) and present the comparison results in Table 2.
We summarize the findings in Table 2 and highlight several observations as follows. First, the policyholder achieves the highest objective value at time 0 when she follows the optimal GP strategy, a result that we anticipate due to the definition in (4.1). Second, the optimal GP strategy yields the lowest mean and variance of the terminal wealth, while the optimal CP strategy leads to the highest in both measures. We remark that both mean and variance are affected by the mean-variance trade-off parameter
$\gamma$
(which is set to 0.5 for the results in Table 2). Third, the three optimal strategies are close to each other in (expected) value for both periods. A closer look shows that the optimal GP strategy has the biggest
$\alpha_1^*$
, and the equilibrium TC strategy has the biggest
$\mathbb{E}_0[\alpha_2^*]$
.
The original MV objective of Markowitz was a bivariate function, and we follow the convention to convert it into a univariate objective by introducing a trade-off parameter
$\gamma$
. As already noted, the comparison results in Table 2 are obtained under a specific value of
$\gamma$
, namely,
$\gamma = 0.5$
. To have a better look at the performance of the optimal strategies, we vary the parameter
$\gamma$
but keep other parameters the same as in Table 1 and plot the efficient frontier (that is, mean against variance) in Figure 3. Again, because both the mean and variance are conditioned at time 0, we expect the efficient frontier under the optimal GP strategy to be the best among the three, which is indeed the case. We also observe from Figure 3 that the equilibrium TC strategy yields a better efficient frontier than the optimal CP strategy.
Table 2. Comparison of the three optimal strategies.


Figure 3. Efficient frontier under three optimal strategies.
Next, we perform Monte Carlo simulations to gain further insight into the policyholder’s terminal wealth
$X_2^*$
under the different optimal strategies. For the parameters specified in the base case in Table 1, we plot the histograms of
$X_2^*$
under each of the three optimal strategies in Figure 4. Note that in all panels of Figure 4, we exclude the special scenario for which there are no claims in both periods, that is, the scenario of
$Z_1 = Z_2 = 0$
, with a probability of
$(1 - q)^2 = 0.04$
. We observe from Figure 4 that the terminal wealth
$X_2^*$
under the optimal GP strategy has the lowest variance, as its histogram is most concentrated around the mean. However, we comment that such a result is sensitive to the policyholder’s risk aversion, measured by
$\gamma$
. Keeping all parameters unchanged, except for changing
$\gamma$
from 0.5 to 0.1, we replot the three histograms in Figure 5. The three histograms now have a similar shape, and it is difficult to tell, by the naked eye, which strategy produces the lowest variance.
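A simulation of this kind can be sketched as follows for a constant strategy $(a_1, a_2)$. Several ingredients are assumptions: the premium is taken as $\pi(\alpha, y) = \mu\alpha + \frac{\theta y \sigma^2}{2}\alpha^2$ (the form used in Appendix A), the wealth update in period 1 is assumed to mirror the one in period 2, the loss parameters are inferred from the moments in Section 5, and the values of $\theta$ and $v$ are hypothetical. For the TC and GP strategies, $\alpha_2$ would additionally depend on $Z_1$.

```python
import numpy as np


def simulate_terminal_wealth(a1, a2, n_paths=200_000, x=10.0, y=0.5, q=0.8,
                             kappa=2.0, eta=1.0, theta=0.2, v=0.6, seed=0):
    """Simulate X_2 under a constant proportional strategy (a1, a2).
    theta and v are hypothetical; q, kappa, eta are inferred, not from Table 1."""
    rng = np.random.default_rng(seed)
    mu = q * kappa * eta                                      # E(Z)
    sigma2 = q * kappa * eta ** 2 + q * (1 - q) * (kappa * eta) ** 2  # V(Z)

    def premium(a, habit):
        # Claim-habit-dependent variance premium (assumed form of (2.3)).
        return mu * a + 0.5 * theta * habit * sigma2 * a ** 2

    def draw_losses():
        occurs = rng.random(n_paths) < q                      # loss indicator
        return np.where(occurs, rng.gamma(kappa, eta, n_paths), 0.0)

    z1, z2 = draw_losses(), draw_losses()
    y1 = v * y + (1 - v) * (z1 > 0)                           # claim habit update
    x1 = x - premium(a1, y) - (1 - a1) * z1
    return x1 - premium(a2, y1) - (1 - a2) * z2


x2 = simulate_terminal_wealth(a1=0.8, a2=0.8)
print(x2.mean(), x2.var())
```

A histogram of the returned sample, after removing the paths with $Z_1 = Z_2 = 0$, gives a plot of the same kind as those discussed above.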

Figure 4. Histogram for the terminal wealth under different optimal strategies (
$\gamma = 0.5$
).
Note: All parameters are the same as those specified in the base case. We exclude the special scenario for which there are no losses in both periods.

Figure 5. Histogram for the terminal wealth under different optimal strategies (
$\gamma = 0.1$
).
Note: All parameters are the same as those specified in the base case except for
$\gamma = 0.1$
. We exclude the special scenario for which there are no losses in both periods.
5.3 Sensitivity analysis
In this subsection, our focus shifts to the sensitivity analysis of the optimal strategies and their associated value functions with respect to the model parameters. In this study, we examine the following key parameters: the policyholder’s risk aversion parameter
$\gamma$
, the policyholder’s initial claim habit
$y = Y_0$
, and the weight on the previous habit v (which helps determine the claim habit
$Y_1$
via (2.2)). The model parameters equal those for the base case (see Table 1), except for the parameter that we study, which will vary over a reasonable range. Also recall that the optimal insurance strategy for period 1, denoted by
$\alpha_1^*$
, is a constant, but the optimal insurance strategy for period 2, denoted by
$\alpha_2^*$
, is allowed to depend on the realization of
$Z_1$
for the equilibrium TC and optimal GP strategies. Therefore, similar to the comparison in Table 2, we investigate the effect of a chosen model parameter on
$\alpha_1^*$
and on the expected value of
$\alpha_2^*$
.
We first study the effect of the policyholder’s risk aversion parameter
$\gamma$
on decision-making and plot the optimal (expected) insurance strategies and the (time-0) value functions as a function of
$\gamma$
over [0, 3] in Figure 6. It is intuitively pleasing to see that both
$\alpha_1^*$
and
$\mathbb{E}(\alpha_2^*)$
are increasing functions of
$\gamma$
because risk aversion toward random losses is the reason the policyholder purchases insurance for those losses. As
$\gamma \to 0$
(risk-neutral policyholder), we observe
$\alpha_1^* \to 0$
and
$\alpha_2^* \to 0$
, and the convergence speed is fast and nearly identical for all three optimal strategies. However, for large enough
$\gamma$
(say
$\gamma \gt 1$
), the difference among the three optimal strategies in period 2 is easily visible. We also plot the time-0 value function,
$J_0(x, y;\; \alpha_1^*, \alpha_2^*)$
in (3.1), associated with each of the three optimal strategies. As expected, the optimal GP strategy always achieves the highest value of
$J_0$
, and it appears that the equilibrium TC strategy yields a higher
$J_0$
value than the optimal CP strategy. These two findings are consistent over the subsequent studies, and we do not repeat them.

Figure 6. Effect of the risk aversion parameter
$\gamma$
on optimal strategies and value functions.
The next parameter we study is the policyholder’s initial claim habit
$y = Y_0$
. Recall that we consider a claim-habit dependent variance premium principle, given by (2.3), and the larger the claim habit, the more expensive the insurance contract. Thus, what we observe from the left and middle panels in Figure 7 is fully anticipated; policyholders with higher claim habit have lower insurance demand. When
$y \to 0$
, we have
$\alpha_1^* \to 1$
(full insurance) for all three strategies because the insurance contract is actuarially fair when
$y = 0$
, and a risk-averse policyholder buys full insurance in this case. However, the same does not apply to
$\alpha_2^*$
for period 2, due to the fact that, even given
$y = Y_0 = 0$
, the policyholder’s claim habit
$Y_1$
used in pricing for period 2 can be strictly positive (recall
$Y_1 = (1 - v) {\unicode{x1D7D9}}_{\{Z_1 \gt 0\}}$
).

Figure 7. Effect of initial claim habit y on the optimal strategies and value functions.
Figure 8 analyzes
$v \in [0,1]$
, the weighting factor placed on the previous claim habit when calculating the new claim habit. Recall that
$Y_1 = v Y_0 + (1 - v) {\unicode{x1D7D9}}_{\{Z_1 \gt 0\}}$
and
$Y_0 = y$
is a fixed constant. As such, as v increases,
$Y_1$
might increase or decrease, depending on whether a loss occurs in period 1. For the chosen parameters,
$\alpha_1^*$
decreases with respect to v, but for other parameter values (not shown here), there is no definite monotonicity of
$\alpha_1^*$
with respect to v. Also,
$\mathbb{E}_0[\alpha_2^*]$
and the value functions are not necessarily monotonic with respect to v. It is interesting that, when
$Y_1$
is independent of the loss in period 1 (that is,
$v=1$
and the premium in (2.3) reduces to the usual variance premium principle), the equilibrium TC strategy and the optimal CP strategy coincide in both periods. Furthermore, we can show, by noting
$Y_1 = Y_0 = y$
given
$v = 1$
, that the equilibrium TC (optimal CP) strategies in both periods are the same and equal to

which recovers the result obtained in Corollary 3.1 of Li and Young (2021).
The last parameter we study is the probability that a positive loss occurs in any period, that is,
$q = \mathbb{P}(Z\gt0)\in(0,1)$
. We plot the impact of q on the optimal strategies and value functions in Figure 9. As q increases, the policyholder faces the trade-off between increasing her coverage, due to increased loss probability, and reducing her coverage, due to increased premium. This complex trade-off might explain the nonmonotonic behavior of
$\alpha_1^*$
and
$\mathbb{E}(\alpha_2^*)$
. The only exception is
$\mathbb{E}(\alpha_2^*)$
under the time-consistent notion, for which we can show analytically from (3.6) that
$\mathbb{E}(\alpha_2^*)$
always decreases with respect to q. The time-0 value functions decrease with respect to q for all three notions of optimality, which is intuitively pleasing.
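The monotonicity of $\mathbb{E}(\alpha_2^*)$ in q under the time-consistent notion can be checked numerically. The sketch below uses $\alpha_2^* = \frac{\gamma}{\gamma + \theta Y_1}$ from Appendix A and $Y_1 = v Y_0 + (1 - v) {\unicode{x1D7D9}}_{\{Z_1 \gt 0\}}$; the function name and the values of $\theta$ and $v$ are illustrative assumptions.

```python
def expected_alpha2_tc(q, gamma=0.5, theta=0.2, v=0.6, y=0.5):
    """E(alpha_2^*) for the equilibrium strategy alpha_2^* = gamma / (gamma + theta * Y_1),
    where Y_1 takes a high value with probability q and a low value otherwise."""
    y_high = v * y + (1.0 - v)       # claim habit after a loss in period 1
    y_low = v * y                    # claim habit after no loss in period 1
    alpha_high = gamma / (gamma + theta * y_high)
    alpha_low = gamma / (gamma + theta * y_low)
    return q * alpha_high + (1.0 - q) * alpha_low


# theta = 0.2 and v = 0.6 are hypothetical values chosen for illustration only.
for q in (0.2, 0.5, 0.8):
    print(q, expected_alpha2_tc(q))
```

The expectation is linear in q with slope equal to the difference between the two possible period-2 proportions, which is negative because a period-1 loss raises the claim habit and, hence, lowers the demand.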

Figure 8. Effect of the weighting factor v on the optimal strategies and value functions.

Figure 9. Effect of the loss probability q on the optimal strategies and value functions.
5.4 Comparison with deductible insurance
In the main analysis, we make an a priori assumption that the policyholder purchases proportional insurance to mitigate her risk exposure in both periods. This assumption allows us to derive analytical results in Sections 3 and 4. Although several studies have shown the optimality of proportional insurance under the usual variance premium principle (see Footnote 3), we conduct an ex post comparison of the performance between the optimal proportional insurance and the corresponding deductible insurance under the same premium level.
Recall that we consider three different notions of MV optimal strategies—time consistent, constant precommitment, and general precommitment. Under each notion, we obtain the optimal strategies,
$(\alpha_1^*, \alpha_2^*)$
, either analytically or numerically, which in turn lead to two premiums
$\pi_1^*$
and
$\pi_2^*$
by (2.3). We now assume that the policyholder spends the same premiums
$\pi_1^*$
and
$\pi_2^*$
on deductible insurance, with deductibles
$d_1$
and
$d_2$
in the two periods, respectively. Specifically, we establish the equations

and solve them numerically to obtain the corresponding deductibles
$d_1$
and
$d_2$
. After that, we compute the policyholder’s MV objective value under the deductible insurance
$(d_1,d_2)$
by

in which
$X_2^{(d_1,d_2)} = x - \pi_1^* -\big( Z_1\wedge d_1 \big) - \pi_2^* - \big(Z_2\wedge d_2\big)$
. The results of this comparison for the base case are reported in Table 3.
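The premium-matching step can be sketched numerically by bisection. The sketch rests on an assumption about how the deductible contract is priced: the premium of the indemnity $(Z - d)_+$ is taken as its mean plus $\frac{\theta y}{2}$ times its variance, which is consistent with the form of $\pi$ used in Appendix A for proportional insurance. The function names are hypothetical, and the moments are estimated from a simulated sample rather than computed in closed form.

```python
import numpy as np


def deductible_premium(d, losses, theta, habit):
    """Assumed variance-type premium of the indemnity (Z - d)_+, estimated from a sample."""
    indemnity = np.maximum(losses - d, 0.0)
    return indemnity.mean() + 0.5 * theta * habit * indemnity.var()


def match_deductible(target, losses, theta, habit, tol=1e-10):
    """Bisection for the deductible whose premium equals `target`.
    Relies on the premium being nonincreasing in d; requires
    0 <= target <= deductible_premium(0, ...)."""
    lo, hi = 0.0, float(losses.max())
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if deductible_premium(mid, losses, theta, habit) > target:
            lo = mid        # premium still too high: raise the deductible
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative use: theta and the proportion 0.8 are hypothetical values.
rng = np.random.default_rng(1)
z = np.where(rng.random(100_000) < 0.8, rng.gamma(2.0, 1.0, 100_000), 0.0)
target = 0.8 * z.mean() + 0.5 * 0.2 * 0.5 * (0.8 ** 2) * z.var()
print(match_deductible(target, z, theta=0.2, habit=0.5))
```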
With the same model parameters as in Table 1, we report in Table 3 the policyholder’s MV objective value under each of the three optimal proportional insurance strategies and under the corresponding deductible insurance. We observe that, under all three notions of optimality, the optimal proportional insurance yields a higher MV objective value than the deductible insurance with the same premiums. Furthermore, this finding is robust with respect to different values of the model parameters, which we confirm by an extensive numerical study (not shown here to save space, but available upon request). To summarize, the comparison study in this section numerically demonstrates the desirability of proportional insurance over deductible insurance under the generalized variance premium principle (2.3) and, thus, offers further support to our a priori assumption that the policyholder purchases proportional insurance.
Table 3. Comparison of proportional insurance and deductible insurance.

6. Conclusion
We study an optimal insurance problem for a buyer of insurance in a two-period discrete model. A novel feature of our model is that the insurer applies a variance premium principle that depends on the policyholder’s claim history; under the proposed premium rule, the (unit) premium increases in period 2 if the policyholder files a claim in period 1, which is consistent with experience ratemaking models used by practicing actuaries. The buyer purchases proportional insurance at the beginning of each period and seeks an optimal strategy to maximize MV preferences. We solve for both the time-consistent equilibrium strategy and the optimal precommitment strategy; note that the majority of the literature only solves for one particular optimal strategy for MV problems. We obtain the time-consistent equilibrium strategy in closed form (see Theorem 3.1) and the optimal constant precommitment strategy in semiclosed form (see Theorem 4.1). For the optimal general precommitment strategy, we find its solution in period 2 in semiclosed form (see (4.10)) and characterize its solution in period 1 via an implicit equation (see (4.13)); based on our analytical results, we propose a numerical algorithm that can compute the optimal general precommitment strategy efficiently. Last, we conduct a detailed comparison of the three optimal strategies and study how several key model parameters affect the policyholder’s decision and value function. Our model can be easily extended to an n-period problem, and the methodology introduced in the main paper can be applied to numerically obtain the corresponding optimal strategies (see the online appendix).
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/asb.2025.10072.
Acknowledgments
Jingyi Cao and Dongchen Li acknowledge the financial support from the Natural Sciences and Engineering Research Council of Canada (grant numbers 05061 and 04958, respectively). Virginia R. Young thanks the Cecil J. and Ethel M. Nesbitt Professorship for the financial support of her research.
Competing interests
None.
Appendix
A. Proof of Theorem 3.1
Proof. Step 1. We solve for
$\alpha_2^* = \mathop{\textrm{arg max}}_{\alpha_2} J_1(x, y;\; \alpha_2)$
at time 1 for an arbitrary pair
$(X_1, Y_1) = (x, y) \in \mathbb{R} \times [0,1]$
.
Given
$X_1 = x$
and
$Y_1 = y$
, we use (2.4) and (2.3) to obtain the buyer’s terminal wealth
$X_2 \;:\!=\; X_2^{(1,x,y), \alpha_2}$
under strategy
$\alpha_2$
by
$X_2 = x - \pi(\alpha_2, y) - (1- \alpha_2) Z_2 = x - \mu \alpha_2 - \frac{\theta y \sigma^2}{2} \, \alpha_2^2 - (1- \alpha_2) Z_2$
. Using the above result and (3.2), we have

Note that
$\gamma$
and
$\theta$
are given positive constants, and
$y = Y_1 \in [0,1]$
. As such, we have
$\alpha_2^* \in [0, 1]$
without constraint and
$\alpha_2^* \in \mathcal{F}_1$
, as required by the admissibility definition in (2.5).
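For the reader's convenience, the first-order condition behind Step 1 can be written out as follows. This is a sketch that assumes $J_1$ in (3.2) is the conditional mean of $X_2$ minus $\frac{\gamma}{2}$ times its conditional variance, with $\mathbb{E}(Z_2) = \mu$ and $\mathbb{V}(Z_2) = \sigma^2$; it is consistent with the expression $\alpha_2^* = \frac{\gamma}{\gamma + \theta Y_1}$ used in Step 2.

```latex
\begin{align*}
J_1(x, y;\; \alpha_2)
  &= \mathbb{E}_{1,x,y}(X_2) - \frac{\gamma}{2}\, \mathbb{V}_{1,x,y}(X_2) \\
  &= x - \mu \alpha_2 - \frac{\theta y \sigma^2}{2}\, \alpha_2^2
     - (1 - \alpha_2)\, \mu - \frac{\gamma}{2}\, (1 - \alpha_2)^2 \sigma^2, \\[4pt]
\frac{\partial J_1}{\partial \alpha_2}
  &= - \theta y \sigma^2 \alpha_2 + \gamma (1 - \alpha_2)\, \sigma^2 = 0
  \quad \Longrightarrow \quad
  \alpha_2^* = \frac{\gamma}{\gamma + \theta y}, \\[4pt]
\frac{\partial^2 J_1}{\partial \alpha_2^2}
  &= - (\gamma + \theta y)\, \sigma^2 \lt 0 .
\end{align*}
```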
Step 2. We solve for
$\alpha_1^* = \mathop{\textrm{arg max}}_{\alpha_1} \, J_0(x,y;\; \alpha_1, \alpha_2^*)$
, in which
$\alpha_2^*$
is obtained above.
Assuming that the buyer will follow
$\alpha_2^* = \frac{\gamma}{\gamma + \theta Y_1}$
in period 2, her terminal wealth
$X_2 \;:\!=\; X_2^{(0,x,y), (\alpha_1, \alpha_2^*)}$
is given by

Note that
$Z_1$
and
$Z_2$
are independent, but
$Y_1$
depends on
$Z_1$
via (2.2).
For the above
$X_2$
, we obtain

Note that the last expectation term above is independent of the buyer’s strategy
$\alpha_1$
and, thus, can be omitted when optimizing
$J_0$
with respect to
$\alpha_1$
.
Regarding the variance
$\mathbb{V}_{0,x,y}(X_2)$
, we have

in which
$\mathbb{C}$
denotes the covariance operator under
$\mathbb{P}$
. Note that the last variance term
$\mathbb{V}_{0,x,y}\big({\kern1pt}{\unicode{x1D4F0}}{\kern1.5pt}(Y_1) + {\unicode{x1D4F1}}{\kern1.5pt}(Y_1) Z_2 \big)$
is independent of the buyer’s strategy
$\alpha_1$
and, thus, can be omitted in the optimization. To analyze the covariance term, we first introduce
$Y_h$
and
$Y_l$
defined in (3.7), which are the two possible values of
$Y_1$
depending on whether the policyholder files a claim; the subscripts h and l in (3.7) refer to “high” and “low,” respectively. Further, for a fixed
$Y_0 = y \in [0,1]$
, define

We compute
$\mathbb{E}_{0,x,y} \left(Z_1 \, {\unicode{x1D4F0}}{\kern1.5pt}(Y_1) \right) = (1 - q) \, \mathbb{E}_{0,x,y} \big(Z_1 {\unicode{x1D4F0}}{\kern1.5pt}(Y_1) \big| Z_1 = 0 \big) + q \, \mathbb{E}_{0,x,y} \big(Z_1 {\unicode{x1D4F0}}{\kern1.5pt}(Y_1) \big| Z_1 \gt 0 \big) = {\unicode{x1D4F0}}{\kern1.5pt}(y_h) \mu$
and similarly,
$\mathbb{E}_{0,x,y} \left(Z_1 \cdot {\unicode{x1D4F1}}{\kern1.5pt}(Y_1) Z_2 \right) = \mathbb{E}_{0,x,y} \left(Z_1 {\unicode{x1D4F1}}{\kern1.5pt}(Y_1) \right) \cdot \mathbb{E}(Z_2) = {\unicode{x1D4F1}}{\kern1.5pt}(y_h) \, \mu^2$
. Therefore, we obtain the covariance term by

Thus, the following equivalence holds

in which
${\unicode{x1D4EF}}$
is defined by

Note that
${\unicode{x1D4EF}}$
defined in (A3) depends on the policyholder’s initial claim habit value,
$Y_0 = y$
, but is independent of her initial wealth
$X_0 = x$
.
We proceed to show that
${\unicode{x1D4EF}}$
has a unique maximizer in [0, 1]. By its definition in (A3), we easily see that
${\unicode{x1D4EF}}$
is differentiable on (0,1). By differentiating
${\unicode{x1D4EF}}$
, we obtain

and
${\unicode{x1D4EF}\;}''(\alpha_1) = - (\gamma + \theta y) \sigma^2 \lt 0$
. Therefore,
$\alpha_1^*$
defined in (A2) is unique and given by the expression in (3.5). This completes the proof.
B. Proof of Theorem 4.1
Proof. Under a constant strategy
$(a_1, a_2)$
, the buyer’s terminal wealth
$X_2$
is given by

in which
$x = X_0$
and
$y = Y_0$
. We obtain

in which
$\mathbb{E}(Y_1) = vy + q(1-v)$
,
$\mathbb{V}(Y_1)=q(1-q)(1-v)^2$
, and
$\mathbb{C}(Z_1, Y_1) = \mu (1-q) (1-v)$
. Therefore, maximizing
$J_0(x,y;\; a_1, a_2)$
is equivalent to minimizing
${\unicode{x1D4EF}}_c (a_1, a_2) $
defined by

Given
$a_2$
, we first minimize
${\unicode{x1D4EF}}_c$
over
$a_1$
. By
$\frac{\partial {\unicode{x1D4EF}}_c }{\partial a_1}(a_1, a_2)=0$
, we obtain

As
${\unicode{x1D4EF}}_c(a_1, a_2)$
is convex in
$a_1$
, subject to the constraint
$a_1\in [0, 1]$
,
${\unicode{x1D4EF}}_c(\cdot, a_2)$
is minimized at
$\hat{a}_1(a_2) \wedge 1$
. By noting that
$\hat{a}_1(a_2) \le 1 \iff \sqrt{\gamma \mathbb{C}(Z_1, Y_1)} a_2 \le \sqrt{2y}$
(recall that
$a_2 \ge 0$
), we define the minimized value as

in which
$\zeta= \sqrt{\frac{2y}{\gamma \mathbb{C}(Z_1, Y_1)}}$
if
$\mathbb{C}(Z_1, Y_1) \neq 0$
, and
$\zeta= \infty$
if
$\mathbb{C}(Z_1, Y_1) = 0$
.
Next, we minimize
${\unicode{x1D500}}{\kern1.5pt}(\cdot)$
over
$a_2 \in [0, 1]$
. For
${\unicode{x1D500}}_1(a_2)$
, a straightforward calculation shows that
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(a_2) = \widetilde{{\unicode{x1D4F0}}}_c(a_2)$
, which is defined in (4.2). Note that, by using (2.1), we obtain
$(\gamma+\theta y)q\sigma^2 - \gamma(1-q)\mu^2 = \theta y q \sigma^2 + \gamma q^2 \widetilde{\sigma}^2 \gt 0$
, which shows that the coefficient of the
$a_2^3$
-term in
$\widetilde{{\unicode{x1D4F0}}}_c$
is strictly positive. By a straightforward calculation, we obtain
${\unicode{x1D500}}_1^{{\kern2pt}\prime\prime}(a_2) \gt0$
,
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(0) = -2\gamma \lt0$
, and
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(1) \gt0$
. Thus, the equation
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(a_2) = 0$
, or equivalently
$\widetilde{{\unicode{x1D4F0}}}_c(a_2) = 0$
, admits a unique solution, which lies in (0, 1) by the strict monotonicity and the sign change just noted; denote this solution by
$\widetilde{a}_2$
. Subject to the constraint
$a_2 \le \zeta$
, we conclude that
${\unicode{x1D500}}_1$
is minimized at
$\widetilde{a}_2 \wedge \zeta$
.
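The root-finding step lends itself to a numerical sketch. Since ${\unicode{x1D500}}_1^{{\kern2pt}\prime}$ is strictly increasing, negative at 0, and positive at 1, bisection on [0, 1] locates its unique root, and the constraint $a_2 \le \zeta$ is then imposed by taking the minimum with $\zeta$. The derivative is passed as a generic callable; the toy cubic at the end uses hypothetical coefficients and is not the function defined in (4.2).

```python
def root_of_increasing(dh, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for the unique root of a strictly increasing function
    satisfying dh(lo) < 0 < dh(hi)."""
    if not (dh(lo) < 0.0 < dh(hi)):
        raise ValueError("dh must change sign from negative to positive")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dh(mid) < 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


def minimizer_below_threshold(dh, zeta):
    """Minimizer on [0, min(1, zeta)] of a convex function with derivative dh."""
    return min(root_of_increasing(dh), zeta)


# Toy derivative with hypothetical coefficients: increasing, equal to -2 at 0.
toy_dh = lambda a: 8.0 * a ** 3 + 2.0 * a - 2.0
```

For the toy derivative, the root is 0.5, so a threshold of 0.3 binds while an infinite threshold does not.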
For
${\unicode{x1D500}}{\kern1pt}_2(a_2)$
, we first obtain
${\unicode{x1D500}}_2^{{\kern2pt}\prime}(a_2) = \overline{{\unicode{x1D4F0}}}_c(a_2)$
, which is defined in (4.3). Similarly, we carry out computations to get
${\unicode{x1D500}}_2^{{\kern2pt}\prime\prime}(a_2) \gt0$
,
${\unicode{x1D500}}_2^{{\kern2pt}\prime}(0) = -2\gamma \lt0$
, and
${\unicode{x1D500}}_2^{{\kern2pt}\prime}(1) \gt0$
. In consequence, the equation
${\unicode{x1D500}}_2^{{\kern2pt}\prime}(a_2) =0$
, or equivalently
$\overline{{\unicode{x1D4F0}}}_c(a_2) = 0$
, admits a unique solution, which we denote by
$\overline{a}_2 \in(0,1)$
. Subject to the constraint
$a_2 \gt \zeta$
,
${\unicode{x1D500}}{\kern1pt}_2(\cdot)$
is minimized at
$ \overline{a}_2 \vee \zeta$
; note that if
$\zeta \ge 1$
, then
${\unicode{x1D500}}{\kern1.5pt}(a_2) = {\unicode{x1D500}}_1(a_2)$
on [0, 1]. Next, we show that
$\widetilde{a}_2$
and
$\overline{a}_2$
must lie on the same side of the threshold
$\zeta$
. To see this, by comparing
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(\cdot)$
and
${\unicode{x1D500}}_2^{{\kern2pt}\prime}(\cdot)$
, we have

Therefore, at
$a_2 = \zeta = \sqrt{\frac{2y}{\gamma \mathbb{C}(Z_1, Y_1)}}$
,
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(\zeta) = {\unicode{x1D500}}_2^{{\kern2pt}\prime}(\zeta)$
, and we analyze the following two mutually exclusive scenarios: (1)
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(\zeta) ={\unicode{x1D500}}_2^{{\kern2pt}\prime}(\zeta) \ge 0$
and (2)
${\unicode{x1D500}}_1^{{\kern2pt}\prime}(\zeta) = {\unicode{x1D500}}_2^{{\kern2pt}\prime}(\zeta) \lt 0$
. In the first scenario,
$\widetilde{a}_2 \le \zeta$
and
$\overline{a}_2 \le \zeta$
, from which we obtain that
${\unicode{x1D500}}{\kern1.5pt}(a_2) = {\unicode{x1D500}}_1(a_2)$
is minimized at
$\widetilde{a}_2 \wedge \zeta = \widetilde{a}_2$
. As such,
$a_2^* = \widetilde{a}_2$
and
$a_1^* = \hat{a}_1(\widetilde{a}_2)$
, corresponding to the first case in (4.4). In the second scenario,
$\widetilde{a}_2 \gt \zeta$
and
$\overline{a}_2 \gt \zeta$
, which implies that
${\unicode{x1D500}}{\kern1.5pt}(a_2) = {\unicode{x1D500}}{\kern1pt}_2(a_2)$
is minimized at
$\overline{a}_2 \vee \zeta = \overline{a}_2$
. Therefore,
$a_2^* = \overline{a}_2$
and
$a_1^* = 1$
, corresponding to the second case in (4.4). This completes the proof.
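The case distinction at the end of the proof can be summarized as a small selection routine. This is a sketch under stated assumptions: the two derivatives, the threshold $\zeta$, and the map $\hat{a}_1(\cdot)$ are passed in as callables, and the toy versions used for illustration are hypothetical, chosen only so that the two derivatives agree at $\zeta$ and $\hat{a}_1(a_2) \le 1$ exactly when $a_2 \le \zeta$.

```python
import math


def bisect_increasing(dh, lo=0.0, hi=1.0, tol=1e-12):
    """Unique root of a strictly increasing dh with dh(lo) < 0 < dh(hi)."""
    if not (dh(lo) < 0.0 < dh(hi)):
        raise ValueError("dh must change sign from negative to positive")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dh(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def select_constant_strategy(dh1, dh2, zeta, a1_hat):
    """Return (a1_star, a2_star) following the two scenarios of the proof."""
    # Scenario 1: the common derivative at zeta is nonnegative (this also
    # covers zeta >= 1, where only the first branch is active on [0, 1]).
    if math.isinf(zeta) or zeta >= 1.0 or dh1(zeta) >= 0.0:
        a2 = bisect_increasing(dh1)
        return a1_hat(a2), a2
    # Scenario 2: the derivative at zeta is negative, so both roots exceed zeta.
    a2 = bisect_increasing(dh2)
    return 1.0, a2


def toy_problem(zeta):
    """Hypothetical inputs satisfying dh1(zeta) == dh2(zeta)."""
    dh1 = lambda a: 8.0 * a ** 3 + 2.0 * a - 2.0
    dh2 = lambda a: dh1(a) + (a - zeta)
    a1_hat = lambda a: (a / zeta) ** 2
    return dh1, dh2, zeta, a1_hat
```

With the toy inputs, a threshold of 0.8 falls in the first scenario (interior $a_1^*$), whereas a threshold of 0.3 falls in the second ($a_1^* = 1$).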
C. Proof of Theorem 4.2
Proof. Step 1. We solve
$\max \limits_{\alpha_2 \in [0, 1]} G_1^\lambda(x, y;\; \alpha_2)$
at time 1.
Given
$X_1 = x$
and
$Y_1 = y$
, we use (2.3) and (2.4) to obtain the buyer’s terminal wealth
$X_2 \;:\!=\; X_2^{(1,x,y), \alpha_2}$
under strategy
$\alpha_2$
as
$ X_2 = x - \mu \alpha_2 - \frac{\theta y \sigma^2}{2} \, \alpha_2^2 - (1- \alpha_2) Z_2$
, which implies


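As a sanity check on the wealth equation above, the first two moments of $X_2$ can be computed by exact enumeration and compared with the closed forms $x - \mu - \frac{\theta y \sigma^2}{2}\alpha_2^2$ and $(1-\alpha_2)^2\sigma^2$ that follow from it. The sketch assumes that $\mu$ and $\sigma^2$ are the mean and variance of $Z_2$; the loss distribution is hypothetical.

```python
from fractions import Fraction as F


def terminal_wealth_moments(x, y, alpha2, theta, loss_dist):
    """Exact mean and variance of
    X2 = x - mu*a - (theta*y*sigma^2/2)*a^2 - (1 - a)*Z2.

    `loss_dist` maps values of Z2 to probabilities (hypothetical input);
    mu and sigma^2 are taken to be its mean and variance.
    """
    mu = sum(z * p for z, p in loss_dist.items())
    sigma2 = sum((z - mu) ** 2 * p for z, p in loss_dist.items())
    premium = mu * alpha2 + theta * y * sigma2 / 2 * alpha2 ** 2
    # Wealth in each loss state, paired with that state's probability.
    wealth = [(x - premium - (1 - alpha2) * z, p) for z, p in loss_dist.items()]
    mean = sum(w * p for w, p in wealth)
    var = sum((w - mean) ** 2 * p for w, p in wealth)
    return mu, sigma2, mean, var


x, y, alpha2, theta = F(10), F(1), F(1, 2), F(1, 5)
loss_dist = {F(0): F(3, 4), F(2): F(1, 4)}  # hypothetical loss distribution
mu, sigma2, mean, var = terminal_wealth_moments(x, y, alpha2, theta, loss_dist)
```

Exact rational arithmetic is used so that the comparison with the closed forms is an equality rather than a floating-point approximation.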
Then, maximizing
$G_1^\lambda(x, y;\; \alpha_2)$
is equivalent to minimizing
${\unicode{x1D4F0}}_p$
, defined by

By differentiating
${\unicode{x1D4F0}}_p(\alpha_2)$
, we obtain
$ {\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(\alpha_2) = \gamma(\theta y \sigma)^2 \alpha_2^3+ 2\big(\lambda\theta y +\gamma-\gamma(x-\mu)\theta y \big)\alpha_2 - 2\gamma$
and
$ {\unicode{x1D4F0}}_p^{{\kern1.5pt}\prime\prime}(\alpha_2) = 3\gamma(\theta y\sigma)^2 \alpha_2^2+2\big(\lambda\theta y+\gamma-\gamma(x-\mu)\theta y\big)$
. By Descartes’ rule of signs, the equation
${\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(\alpha_2) = 0$
admits a unique positive solution; denote this solution by
$\alpha_2^+$
(that is,
${\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(\alpha_2^+) = 0$
). Moreover,

Thus,
$\alpha_2^+$
is the unique minimizer of
${\unicode{x1D4F0}}_p(\alpha_2)$
on
$[0,\infty)$
. Because
${\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(0)= -2\gamma\lt0$
,
$\alpha_2^+ \lt 1$
if and only if
${\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(1) \gt 0$
, in which
${\unicode{x1D4F0}}_p^{{\kern1pt}\prime}(1) = \gamma(\theta y\sigma)^2 + 2\big(\lambda\theta y-\gamma(x-\mu)\theta y\big)$
. Therefore, the solution of
$\max_{\alpha_2} G_1^\lambda(x, y;\; \alpha_2)$
is given by (4.10).
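Step 1 can be sketched numerically using the explicit derivative given above. The coefficient signs of the cubic are (+, any, −), so there is exactly one sign change and hence one positive root. The code finds that root by bisection when it lies below 1 and otherwise returns the boundary value 1. The clipping at 1 reflects our reading of the argument and is not a transcription of (4.10); $\gamma \gt 0$ is assumed.

```python
def g_p_prime(alpha2, x, y, gamma, theta, lam, mu, sigma):
    """Derivative of the time-1 objective, as stated in the proof."""
    c3 = gamma * (theta * y * sigma) ** 2
    c1 = 2.0 * (lam * theta * y + gamma - gamma * (x - mu) * theta * y)
    return c3 * alpha2 ** 3 + c1 * alpha2 - 2.0 * gamma


def optimal_alpha2(x, y, gamma, theta, lam, mu, sigma, tol=1e-12):
    """Minimizer on [0, 1] of a function whose derivative is g_p_prime.

    Assumes gamma > 0, so that the derivative is negative at 0.
    """
    dg = lambda a: g_p_prime(a, x, y, gamma, theta, lam, mu, sigma)
    if dg(1.0) <= 0.0:
        # The unique positive root is at or beyond 1: full coverage.
        return 1.0
    lo, hi = 0.0, 1.0  # invariant: dg(lo) < 0 <= dg(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dg(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With all parameters equal to 1 and $\lambda = 0$, the derivative reduces to $\alpha_2^3 + 2\alpha_2 - 2$ and the root is interior; raising $x$ to 3 makes the derivative at 1 negative, so the routine returns 1.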