
Finite mixture models for option pricing: An application to Bitcoin options

Published online by Cambridge University Press:  16 December 2025

Tak Kuen Siu*
Affiliation:
Macquarie Business School, Macquarie University
*Postal address: Department of Actuarial Studies and Business Analytics, Macquarie Business School, Macquarie University, Sydney, NSW 2109, Australia. Email: Ken.Siu@mq.edu.au

Abstract

This paper considers option valuation under finite mixture models in a discrete-time economy. Specifically, the Esscher transform is employed to select a pricing kernel. Novel finite mixture models with negative-shifted Gamma and negative-shifted inverse Gaussian distributions are developed. A hybrid finite mixture model that allows different parametric forms for component distributions is introduced to incorporate model uncertainty. An empirical characteristic function estimation method is employed to estimate the finite mixture models. Closed-form pricing formulas for a European call option are obtained for some finite mixture models. Empirical examples using data on the Bitcoin-USD prices are provided to illustrate an application of the proposed models to value Bitcoin options.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Finite mixture models, such as finite mixture normals, play an important role in modelling empirical behaviour of financial returns. Empirical evidence supporting a mixture of normal distributions for S&P 500 returns data was provided in [Reference Rydén, Teräsvirta and Åsbrink60]. The significance of finite mixture models for modelling financial returns was discussed in [Reference Taylor73, Reference Tsay80]. The merits of finite mixture models include their analytical tractability and capability of capturing key empirical features of financial returns, such as the skewness and heavy-tailedness of distributions for returns. Finite mixture models also incorporate the impact of uncertainty about market news or events on distributions for financial returns as noted, for example, in [Reference Gemmill and Saflekos25, Reference Taylor73, Reference Taylor74]. Finite mixture models have important applications in machine learning and big data analytics [Reference Bishop8, Reference Fan, Lin, Genest, Banks, Molenberghs, Scott and Wang20]. For an in-depth statistical analysis of finite mixture models, see, for example, [Reference Hall and Titterington33, Reference McLachlan, Lee and Rathnayake50, Reference McLachlan and Peel51, Reference Titterington, Smith and Makov75]. Finite mixture models are intimately linked with the regime-switching models proposed, for example, in [Reference Hamilton34, Reference Tong76, Reference Tong77]. Specifically, if the finite-state (hidden) Markov chain modulating a Markov regime-switching model is replaced by a mixing random variable, the Markov regime-switching model becomes a finite mixture model. [Reference Tong78] pointed out that the stationary distribution of a simple self-exciting threshold autoregressive (SETAR) model is a mixture of normal distributions. See also [Reference Frühwirth-Schnatter23] for exploring links between finite mixture models and regime-switching models.

Option valuation under finite mixture models has been considered in the literature. [Reference Ritchey57] discussed the pricing of a European call option under a finite mixture of normal distributions and obtained a closed-form pricing formula. With heterogeneous expectations, [Reference Guo32] obtained the Ritchey pricing formula and showed that a finite mixture of lognormal distributions is desirable for recovering a risk-neutral distribution from option prices data. See [Reference Bhat and Kumar7, Reference Liu, Shackleton, Taylor and Xu47, Reference Rombouts and Stentoft58, Reference Rombouts and Stentoft59, Reference Venter and Maré81] for some other contributions to option pricing and related problems such as risk-neutral density estimation under finite mixture models. Option valuation under discrete-time regime-switching models has been considered in the literature. [Reference Duan, Popova and Ritchken15] proposed a general class of option pricing models based on regime-switching processes, which include the Markov regime-switching models and some GARCH-type models. [Reference Jalan, Matkovskyy and Aziz42, Reference Liu, Shackleton, Taylor and Xu45] studied option valuation under discrete-time hidden Markov regime-switching models. [Reference Elliott, Siu and Chan19, Reference Siu, Tong and Yang71] considered option pricing under a discrete-time Markov-switching GARCH-type model and a discrete-time SETAR-ARCH model, respectively. [Reference Ching, Siu and Li14] considered the pricing of exotic options under a discrete-time, high-order, Markov regime-switching model. [Reference Elliott, Siu and Badescu18] introduced a discrete-time, Markov regime-switching, affine term-structure model for pricing interest-rate securities. [Reference Siu, Tong and Yang68] discussed option pricing under discrete-time doubly Markov regime-switching models with non-normal innovations. 
[Reference Godin, Lai and Trottier29] proposed a novel approach to deal with path dependence in option valuation under a discrete-time hidden Markov regime-switching model. The approach was improved by [Reference Godin and Trottier30] based on the extended Girsanov principle allowing for non-normal logarithmic returns. [Reference Fu, Li, Wu and Zhang24] studied an option valuation problem under a discrete-time Markov-switching stochastic volatility model. The aforementioned works focus on models with normal innovations, apart from [Reference Godin and Trottier30, Reference Siu65].

Discrete-time finite mixture models offer a simpler solution to an option valuation problem than discrete-time Markov regime-switching models. Fusing techniques in actuarial science, financial mathematics, applied probability, statistics, econometrics, and data science, this paper discusses canonical option valuation under discrete-time finite mixture models. Specifically, the Esscher transform is employed to select a pricing kernel. The Esscher transform is an important tool in actuarial science and its use for option pricing was proposed in the seminal work [Reference Gerber and Shiu26]. For some subsequent developments, see, for example, [Reference Badescu, Elliott and Siu3, Reference Bühlmann, Delbaen, Embrechts and Shiryaev11, Reference Bühlmann, Delbaen, Embrechts and Shiryaev12, Reference Elliott, Chan and Siu16, Reference Gerber and Shiu27, Reference Gerber and Shiu28, Reference Goovaerts and Laeven31, Reference Siu62, Reference Siu, Tong and Yang70]. The pricing kernel selected by the Esscher transform is justified by maximizing the expected utility based on a state-dependent power utility function. Besides finite mixture normals, we introduce novel finite mixture models including a finite mixture model of negative-shifted Gamma (NegSGa) distributions, a finite mixture model of negative-shifted inverse Gaussian (NegSIG) distributions, as well as a hybrid finite mixture model of normal, NegSGa, and NegSIG distributions. The hybrid finite mixture model incorporates model uncertainty by allowing different parametric forms for its component distributions. The maximum likelihood estimation method is intractable when NegSGa and NegSIG distributions are involved. Given that closed-form expressions for the characteristic functions of normal, NegSGa, and NegSIG distributions are available, an empirical characteristic function (ECF) estimation method is adopted to estimate the finite mixture models. 
Closed-form pricing formulas for a European call option are obtained for these finite mixture models, up to solving non-linear equations for cases involving NegSIG distributions. Although works on option valuation under mixture normal GARCH-type models exist (see, for example, [Reference Badescu, Kulperger and Lazar4]) and have gained some empirical successes, the finite mixture models considered here lead to closed-form pricing formulas and provide a flexible way to incorporate non-normality. Using data on the daily adjusted close prices of Bitcoin-USD, estimation results for eight models (five finite mixture models and three non-mixture models) are provided. Numerical results are provided and analysed for European call Bitcoin option prices and their implied volatilities.

The rest of the paper is structured as follows. In Section 2, a discrete-time multiperiod finite mixture model is considered. The theoretical results for option valuation under the finite mixture model are presented based on the conditional Esscher transform in Section 3. Analytical solutions to option valuation are obtained for the finite mixture models in Section 4. The estimation procedures and results are provided in Section 5. Numerical results for option prices and implied volatilities are presented and analysed in Section 6. The final section provides concluding remarks and points out some limitations of the proposed models and study. The supplemental material contains Appendix A, an economic justification for the pricing kernel; Appendix B, proofs of lemmas and theorems; Appendix C, exploratory data analysis; and Appendix D, tables and figures.

2. A discrete-time multiperiod model

A discrete-time economy is considered in which uncertainties are described by a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$ , where $\mathbb{P}$ is a real-world probability measure. Let $\mathcal{T}$ be a finite discrete-time parameter set of the economy, where $\mathcal{T} = \{ 0, 1, 2, \ldots, T \}$ , for some finite horizon $T < \infty$ . The price process of a risky asset is described by a discrete-time stochastic process $\{ S_t \}_{t \in \mathcal{T}}$ with the state space ${\mathbb{R}}_+$ , (i.e. the set of non-negative real numbers). For each $t = 1, 2, \ldots, T$ , let $R_t \,:\!=\, \ln (S_t / S_{t-1})$ , (i.e. the logarithmic return of the risky asset in the tth period from time $t-1$ to time t).

For each $t = 1, 2, \ldots, T$ , $R_t$ takes on values in ${\mathbb{R}}$ , (i.e. the set of real numbers). Write $\mathbb{F}$ for the $\mathbb{P}$ -augmentation of the natural filtration $\{ \mathcal{F}_t \}_{t \in \mathcal{T}}$ generated by $\{ R_t \}_{t \in \mathcal{T} \setminus \{ 0 \}}$ such that $\mathcal{F}_t \,:\!=\,\sigma \{ R_1, R_2, \ldots, R_t \} \vee \mathcal{N}$ and $\mathcal{F}_0 \,:\!=\, \sigma \{ \emptyset,\Omega \} \vee \mathcal{N}$ , where $\mathcal{N}$ is a collection of $\mathbb{P}$ -null sets in $\mathcal{F}$ , $\mathcal{A} \vee \mathcal{B}$ is the minimal $\sigma$ -field containing the $\sigma$ -fields $\mathcal{A}$ and $\mathcal{B}$ , and $\emptyset$ is the empty set. Here it is supposed that for each $t = 1, 2, \ldots, T$ , $R_t$ has a finite mixture distribution with the cumulative distribution function (CDF) F(x), which is given as follows:

(2.1) \begin{equation} F (x) = \sum^{N}_{i = 1} p_i F_i (x),\end{equation}

where $p_i > 0$ for $i = 1, 2, \ldots, N$ , and $\sum^{N}_{i = 1} p_i = 1$ ; $F_i (x)$ is the ith-component CDF of the mixture CDF F(x); and $p_i$ is the weight allocated to the ith-component CDF $F_i (x)$ . Note that $R_1, R_2, \ldots, R_T$ are identically distributed and have the common finite mixture distribution F(x) defined by (2.1).

Let $\Theta$ denote a discrete random variable on $(\Omega, \mathcal{F}, \mathbb{P})$ taking a value on $\{ 1, 2, \ldots, N \}$ . Suppose that the probability mass function of $\Theta$ under $\mathbb{P}$ is given by

(2.2) \begin{equation} \mathbb{P} (\Theta = i) = p_i, \quad i = 1, 2, \ldots, N.\end{equation}

Assume that, for each $t \in \mathcal{T} \setminus \{ 0 \}$ and each $i = 1, 2, \ldots, N$ , the conditional distribution of $R_t$ given $\Theta = i$ under $\mathbb{P}$ is given by

(2.3) \begin{equation} \mathbb{P}(R_t \le x \mid \Theta = i) = F_{i}(x) \quad \mbox{for each $x \in {\mathbb{R}}$},\end{equation}

where $F_{i} (x)$ is the ith-component CDF of the finite mixture CDF F(x) in (2.1). Using the law of total probability, (2.2), and (2.3), for each $t \in \mathcal{T} \setminus \{ 0 \}$ , the unconditional CDF of $R_t$ under $\mathbb{P}$ is given by

(2.4) \begin{equation} F (x) = \mathbb{P}(R_t \le x) = \sum^{N}_{i=1}\mathbb{P}(R_t \le x \mid \Theta = i)\mathbb{P}(\Theta = i) = \sum^{N}_{i = 1} F_{i} (x) p_{i}.\end{equation}

This is the same as the CDF F(x) of the mixture distribution in (2.1). Consequently, the random variable $\Theta$ is interpreted as a mixing random variable for the mixture distribution in (2.1).
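The two-stage construction in (2.2)–(2.4) can be illustrated numerically. The following is a minimal sketch, assuming a two-component normal mixture with purely hypothetical weights and parameters; the names `mixture_cdf` and `sample_return` are illustrative and do not come from the paper. It draws the mixing variable first, then a return from the selected component, and compares the empirical CDF of a single-period return with (2.1).

```python
import random
from statistics import NormalDist

# Hypothetical two-component normal mixture (illustrative values only).
weights = [0.7, 0.3]
components = [NormalDist(0.001, 0.02), NormalDist(-0.002, 0.06)]

def mixture_cdf(x):
    """F(x) = sum_i p_i F_i(x), as in (2.1)."""
    return sum(p * c.cdf(x) for p, c in zip(weights, components))

def sample_return(rng):
    """Draw Theta from (2.2), then one return from F_Theta as in (2.3)."""
    i = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(components[i].mean, components[i].stdev)

rng = random.Random(0)
# Each draw uses a fresh Theta, so this samples the marginal law of one R_t.
draws = [sample_return(rng) for _ in range(100_000)]

x = -0.03
empirical = sum(d <= x for d in draws) / len(draws)
print(f"F({x}) from (2.1): {mixture_cdf(x):.4f}, empirical: {empirical:.4f}")
```

The sketch checks only the marginal distribution of one return. Along a single price path, the model uses one realization of $\Theta$ for all periods.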

For each $t \in \mathcal{T} \setminus \{ 0\}$ , let $M (\eta)$ be the moment-generating function (MGF) of $R_t$ evaluated at $\eta$ under $\mathbb{P}$ and $M_{i} (\eta)$ be the (conditional) MGF of $R_t$ given $\Theta = i$ evaluated at $\eta$ under $\mathbb{P}$ . That is,

(2.5) \begin{equation} M (\eta) \,:\!=\, \mathbb{E}[{\mathrm{e}}^{\eta R_t}] = \int^{\infty}_{-\infty}{\mathrm{e}}^{\eta x}\,{\mathrm{d}} F(x),\end{equation}

where $\mathbb{E} [\cdot]$ denotes the expectation under $\mathbb{P}$ , and

(2.6) \begin{equation} M_{i} (\eta) \,:\!=\, \mathbb{E}[{\mathrm{e}}^{\eta R_t} \mid \Theta = i ] = \int^{\infty}_{-\infty}{\mathrm{e}}^{\eta x}\,{\mathrm{d}} F_i(x),\end{equation}

where $\mathbb{E} [\,\cdot \mid \Theta = i ]$ is the conditional expectation given $\Theta = i$ under $\mathbb{P}$ . Then it is known that

(2.7) \begin{equation} M (\eta) = \sum^{N}_{i = 1} p_{i} M_{i} (\eta).\end{equation}
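The identity (2.7) can be checked directly from (2.1) and the linearity of the integral, assuming that each $M_i (\eta)$ is finite:

\begin{equation*} M (\eta) = \int^{\infty}_{-\infty}{\mathrm{e}}^{\eta x}\,{\mathrm{d}} F(x) = \sum^{N}_{i = 1} p_i \int^{\infty}_{-\infty}{\mathrm{e}}^{\eta x}\,{\mathrm{d}} F_i(x) = \sum^{N}_{i = 1} p_i M_i (\eta).\end{equation*}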

3. Pricing kernel using the conditional Esscher transform

In this section we specify a risk-neutral probability measure (or a pricing kernel) using a version of the conditional Esscher transform [Reference Bühlmann, Delbaen, Embrechts and Shiryaev11, Reference Bühlmann, Delbaen, Embrechts and Shiryaev12, Reference Siu, Tong and Yang70]. We also identify the respective martingale condition, and the distributions for the returns and mixing random variable under the risk-neutral probability measure. Lastly, we derive a generic expression for the price of a European call option.

Let $h \colon \{1, 2, \ldots, N \} \to {\mathbb{R}}$ be a real-valued function on the finite set $\{ 1, 2, \ldots, N \}$ . Assume that, for each $i \in \{1, 2, \ldots, N \}$ , $M_{i} (h (i)) < \infty$ (i.e. the conditional MGF of $R_t$ given $\Theta = i$ evaluated at $h (i) \in {\mathbb{R}}$ under $\mathbb{P}$ exists).

For each $t = 1, 2, \ldots, T$ and $i \in \{1, 2, \ldots, N \}$ , let

(3.1) \begin{equation} \lambda_t (h (i)) = \frac{{\mathrm{e}}^{h (i) R_t}}{M_{i} (h (i)) }.\end{equation}

Then $\lambda_t (h (i))$ is a positive $\mathcal{F}_t$ -measurable random variable.

For each $t = 1, 2, \ldots, T$ , let $\mathcal{G}_t$ be an enlarged $\sigma$ -field defined by

(3.2) \begin{equation} \mathcal{G}_t \,:\!=\, \mathcal{F}_t \vee \sigma \{\Theta \},\end{equation}

where $\sigma \{ \Theta \}$ denotes the $\sigma$ -algebra generated by the mixing random variable $\Theta$ . Write $\mathbb{G}$ for the respective enlarged filtration $\{ \mathcal{G}_t \}_{t \in \mathcal{T}}$ with $\mathcal{G}_0 \,:\!=\, \mathcal{F}_0 = \sigma \{\emptyset, \Omega \} \vee \mathcal{N}$ .

Similarly to (2.6), let $M_{\Theta} (\eta)$ denote the conditional MGF of $R_t$ given $\Theta$ evaluated at $\eta$ under $\mathbb{P}$ . That is,

(3.3) \begin{equation} M_{\Theta} (\eta) \,:\!=\, \mathbb{E}[{\mathrm{e}}^{\eta R_t} \mid \Theta] = \int^{\infty}_{-\infty}{\mathrm{e}}^{\eta x}\,{\mathrm{d}} F_{\Theta}(x),\end{equation}

where $\mathbb{E} [\, \cdot \mid \Theta ]$ is the conditional expectation given $\Theta$ under $\mathbb{P}$ . Note that $M_{\Theta} (\eta)$ may be thought of as the (random) MGF of the (random) distribution function $F_{\Theta} (x)$ under $\mathbb{P}$ . For each $i = 1, 2, \ldots, N$ , on the set $\{ \omega \in \Omega \mid \Theta (\omega) = i \}$ , $F_{\Theta} (x) = F_i (x)$ and $M_{\Theta} (\eta) = M_i (\eta)$ under $\mathbb{P}$ .

Similarly to (3.1), for each $t = 1, 2, \ldots, T$ let

(3.4) \begin{equation} \lambda_t (h (\Theta)) = \frac{{\mathrm{e}}^{h (\Theta) R_t}}{M_{\Theta} (h (\Theta)) }.\end{equation}

Then $\lambda_t (h (\Theta))$ is a positive $\mathcal{G}_t$ -measurable random variable.

We define the $\mathbb{G}$ -adapted process $\{ \Lambda_t^{h} (\Theta) \}_{t \in \mathcal{T}}$ by putting

(3.5) \begin{align} \Lambda_t^{h} (\Theta) & \,:\!=\, \prod^{t}_{k=1}\lambda_k(h(\Theta)) \nonumber \\ & = \prod^{t}_{k=1}\frac{{\mathrm{e}}^{h(\Theta)R_k}}{M_{\Theta}(h(\Theta))} = \exp\Bigg(h(\Theta)\sum^{t}_{k=1}R_k - t\ln(M_{\Theta}(h(\Theta)))\Bigg), \quad t \in \mathcal{T} \setminus \{ 0 \}, \end{align}

and $\Lambda^{h}_0 (\Theta) = 1$ .

We impose the following assumption.

Assumption 3.1. For each $t = 2, 3, \ldots, T$ , $R_t$ is conditionally independent of $\mathcal{F}_{t-1}$ given $\Theta$ under $\mathbb{P}$ .

Then we have the following lemma.

Lemma 3.1. Suppose that Assumption 3.1 holds. Then $\{\Lambda^{h}_t(\Theta)\}_{t\in\mathcal{T}}$ is a positive $(\mathbb{G}, \mathbb{P})$ -martingale.
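The complete proof of Lemma 3.1 is given in the supplemental material (Appendix B). The following is only an informal outline of the one-step argument. For $t \ge 2$ , both $\Lambda^{h}_{t-1} (\Theta)$ and $M_{\Theta} (h (\Theta))$ are $\mathcal{G}_{t-1}$ -measurable, and Assumption 3.1 reduces conditioning on $\mathcal{G}_{t-1} = \mathcal{F}_{t-1} \vee \sigma \{ \Theta \}$ to conditioning on $\Theta$ , so that

\begin{align*} \mathbb{E}[\Lambda^{h}_t(\Theta) \mid \mathcal{G}_{t-1}] & = \Lambda^{h}_{t-1}(\Theta)\,\frac{\mathbb{E}[{\mathrm{e}}^{h(\Theta)R_t} \mid \mathcal{G}_{t-1}]}{M_{\Theta}(h(\Theta))} \\ & = \Lambda^{h}_{t-1}(\Theta)\,\frac{\mathbb{E}[{\mathrm{e}}^{h(\Theta)R_t} \mid \Theta]}{M_{\Theta}(h(\Theta))} = \Lambda^{h}_{t-1}(\Theta), \end{align*}

where the last equality uses (3.3). For $t = 1$ , $\mathcal{G}_0$ is trivial, and the law of iterated expectations gives $\mathbb{E}[\Lambda^{h}_1(\Theta)] = \mathbb{E}[\mathbb{E}[{\mathrm{e}}^{h(\Theta)R_1} \mid \Theta] / M_{\Theta}(h(\Theta))] = 1 = \Lambda^{h}_0 (\Theta)$ .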

A probability measure $\mathbb{P}^{h}$ equivalent to $\mathbb{P}$ on $\mathcal{G}_T$ can then be defined by putting

(3.6) \begin{equation} \frac{{\mathrm{d}}\mathbb{P}^{h}}{{\mathrm{d}}\mathbb{P}} \bigg |_{\mathcal{G}_T} \,:\!=\, \Lambda^{h}_T (\Theta).\end{equation}

The probability measure $\mathbb{P}^{h}$ defined in (3.6) is a variant of the Esscher transform. It is intimately linked with the random Esscher transform in [Reference Siu, Tong and Yang69], and may be thought of as a version of the conditional Esscher transform in discrete time [Reference Bühlmann, Delbaen, Embrechts and Shiryaev11, Reference Bühlmann, Delbaen, Embrechts and Shiryaev12, Reference Siu, Tong and Yang70]. It is also related to the regime-switching Esscher transform in [Reference Elliott, Chan and Siu16, Reference Liu, Shackleton, Taylor and Xu45, Reference Siu, Tong and Yang68]. The random variable $h (\Theta)$ is interpreted as a random Esscher parameter. In the context of insurance mathematics, we can select an Esscher parameter so that, for a given premium level, the Esscher premium principle with the Esscher parameter is equal to the given premium level. The following lemma gives a result that will be used for later developments in this paper.

Lemma 3.2. Suppose that Assumption 3.1 holds. Then $R_1, R_2, \ldots, R_T$ are conditionally independent given $\Theta$ under $\mathbb{P}$ . Furthermore, $R_1, R_2, \ldots, R_T$ are also conditionally independent given $\Theta$ under $\mathbb{P}^{h}$ defined by (3.6).

Let $M^{h}_{\Theta} (\eta)$ be the conditional MGF of $R_t$ given $\Theta$ evaluated at $\eta$ under the probability measure $\mathbb{P}^{h}$ defined by (3.6). That is,

(3.7) \begin{equation} M^{h}_{\Theta} (\eta) \,:\!=\, \mathbb{E}^{h} [{\mathrm{e}}^{\eta R_t} \mid \Theta ],\end{equation}

where $\mathbb{E}^{h} [ \,\cdot \mid \Theta ]$ is the conditional expectation given $\Theta$ under $\mathbb{P}^{h}$ . Then the following lemma gives the relation between $M^{h}_{\Theta} (\eta)$ and $M_{\Theta} (\eta)$ . See, for example, [Reference Elliott, Siu and Chan19, Reference Gerber and Shiu26, Reference Gerber and Shiu27, Reference Liu, Shackleton, Taylor and Xu45, Reference Siu, Tong and Yang60] for some related discussions.

Lemma 3.3. Let ${\bar \eta} \colon \{ 1, 2, \ldots, N \} \to {\mathbb{R}}$ be a measurable real-valued function. Then, for each $t \in \mathcal{T} \setminus \{ 0 \}$ ,

(3.8) \begin{equation} M^{h}_{\Theta}({\bar\eta}(\Theta)) = \frac{M_{\Theta}({\bar\eta}(\Theta) + h(\Theta))}{M_{\Theta}(h(\Theta))}, \quad \text{$\mathbb{P}$-almost surely}.\end{equation}

Let r be the constant risk-free force of interest, i.e. the continuously compounded risk-free interest rate. By the fundamental theorem of asset pricing [Reference Harrison and Kreps36–Reference Harrison and Pliska38], the absence of arbitrage opportunities is essentially equivalent to the existence of a martingale measure, say $\mathbb{P}^{h^{\dagger}}$ , equivalent to $\mathbb{P}$ on $\mathcal{G}_T$ , such that the price process of the risky asset discounted by the risk-free force of interest r, say $\{{\mathrm{e}}^{- r t} S_t \}_{t \in \mathcal{T}}$ , is a positive $(\mathbb{G}, \mathbb{P}^{h^{\dagger}})$ -martingale. The latter statement holds if and only if there exists a random variable $h^{\dagger} \,:\!=\, h^{\dagger} (\Theta)$ , which is $\sigma \{ \Theta \}$ -measurable, such that, for each $t = 1, 2, \ldots, T$ ,

(3.9) \begin{equation} \mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{- r t} S_t \mid \mathcal{G}_{t-1} ] = {\mathrm{e}}^{- r (t-1)} S_{t-1},\end{equation}

where $\mathbb{E}^{h^{\dagger}} [\,\cdot \mid \mathcal{G}_{t-1}]$ is the conditional expectation given $\mathcal{G}_{t-1}$ under $\mathbb{P}^{h^{\dagger}}$ . Note that the martingale condition for the discounted share price process is with respect to the enlarged filtration $\mathbb{G}$ , which contains information about the share price process and the mixing random variable $\Theta$ . The latter may be thought of as private information that is not known to the public. Consequently, the assumption of the strong form of market information efficiency is imposed. It may be noted that some other studies considered the weak form of market information efficiency instead. See, for example, [Reference Bégin and Godin6, Reference Elliott and Madan17, Reference Godin and Trottier30].

The random variable $h^{\dagger} (\Theta)$ is interpreted as the risk-neutral random mixing Esscher parameter. In the following theorem, the fundamental theorem of asset pricing is adapted to the current modelling framework based on the possible values taken by the risk-neutral random mixing Esscher parameter $h^{\dagger} (\Theta)$ . These possible values are given by $h^{\dagger} (i)$ for $i = 1, 2, \ldots, N$ .

Theorem 3.1. Suppose an equivalent martingale measure is specified by the probability measure $\mathbb{P}^{h^{\dagger}}$ defined in (3.6) by setting $h = h^{\dagger}$ . Then the absence of arbitrage opportunities is essentially equivalent to the existence of N real numbers $h^{\dagger} (i)$ , $i = 1, 2, \ldots, N$ , such that

(3.10) \begin{equation} {\mathrm{e}}^{r} = \sum^{N}_{i=1}\frac{M_{i}(1 + h^{\dagger}(i))}{M_{i}(h^{\dagger}(i))}\mathbf{1}_{\{\Theta=i\}}, \quad \text{$\mathbb{P}$-almost surely}, \end{equation}

where $\mathbf{1}_{\{ \Theta = i \} }$ is the indicator function of the event $\{ \Theta = i \}$ .

By slightly modifying the arguments in [Reference Gerber and Shiu27], it may be seen that the N real numbers $h^{\dagger} (i)$ , $i = 1, 2, \ldots, N$ , that satisfy (3.10) are uniquely determined if they exist. The existence of N real numbers $h^{\dagger} (i)$ , $i = 1, 2, \ldots, N$ , that satisfy the martingale condition in (3.10) for some parametric cases will be discussed in Section 4. The pricing kernel selected by the conditional Esscher transform is justified by maximizing the expected state-dependent power utility in the supplemental material (Appendix A).

The following theorem shows that the probability mass function of the mixing random variable $\Theta$ remains unchanged when changing the probability measures from $\mathbb{P}$ to $\mathbb{P}^{h^{\dagger}}$ .

Theorem 3.2. Under $\mathbb{P}^{h^{\dagger}}$ , the mixing random variable $\Theta$ has the probability mass function

(3.11) \begin{equation} \mathbb{P}^{h^{\dagger}} (\Theta = i) = p_{i}, \quad i = 1, 2, \ldots, N. \end{equation}

Theorem 3.3 provides the conditional distribution of $R_t$ given $\Theta = i$ under $\mathbb{P}^{h^{\dagger}}$ for each $i = 1, 2, \ldots, N$ , and the unconditional distribution of $R_t$ under $\mathbb{P}^{h^{\dagger}}$ .

Theorem 3.3. For each $t \in \mathcal{T} \setminus \{ 0 \}$ and each $i = 1, 2, \ldots, N$ , let $F^{h^{\dagger}}_i(x)$ denote the conditional probability distribution of $R_t$ given $\Theta = i$ under $\mathbb{P}^{h^{\dagger}}$ , and let $F^{h^{\dagger}} (x)$ denote the unconditional probability distribution of $R_t$ under $\mathbb{P}^{h^{\dagger}}$ . Then

(3.12) \begin{equation} F^{h^{\dagger}}_i(x) \,:\!=\, \mathbb{P}^{h^{\dagger}}(R_t \le x \mid \Theta = i) = \frac{\int^{x}_{-\infty}{\mathrm{e}}^{h^{\dagger}(i)y}\,F_{i}({\mathrm{d}} y)}{M_i(h^{\dagger}(i))}, \end{equation}

where the integral is interpreted as the Riemann–Stieltjes integral. Furthermore,

(3.13) \begin{equation} F^{h^{\dagger}}(x) \,:\!=\, \mathbb{P}^{h^{\dagger}} (R_t \le x) = \sum^{N}_{i = 1} F^{h^{\dagger}}_i (x) p_i. \end{equation}

Besides using the conditional Esscher transform to specify a pricing kernel, we can adopt other prominent approaches such as the extended Girsanov principle [Reference Elliott and Madan17] and the second-order Esscher transform [Reference Monfort and Pegoraro54] to specify a pricing kernel. However, using the conditional Esscher transform here has several advantages. Firstly, in each of the (parametric) finite mixture distributions to be considered in Section 4, the distribution for returns under the risk-neutral probability measure specified by the conditional Esscher transform is in the same parametric family as the distribution for returns under the real-world probability measure. That is, the (parametric) distribution of returns is closed under changing probability measures via the conditional Esscher transform. Indeed, in continuous-time regime-switching modelling frameworks, it has been shown in some literature that the risk-neutral probability measure selected by the Esscher transform minimizes the relative entropy between an equivalent martingale measure and the real-world probability measure [Reference Elliott, Chan and Siu16, Reference Siu63]. Note that the relative entropy between an equivalent martingale measure and the real-world probability measure is a measure for the ‘distance’ between the two probability measures. Consequently, the risk-neutral probability measure selected by the Esscher transform is ‘closest’ to the real-world probability measure. Secondly, using the conditional Esscher transform, an analytical expression for the price of a European call option is obtained for each of the finite mixture distributions to be considered in Section 4. Thirdly, the pricing kernel specified by the conditional Esscher transform can be justified by an economic equilibrium based on maximization of the expected power utility.

Note that the martingale condition in (3.10) of Theorem 3.1 is equivalent to the following N equations satisfied by $h^{\dagger} (i)$ , $i = 1, 2, \ldots, N$ :

(3.14) \begin{equation} {\mathrm{e}}^{r} = \frac{M_{i} (1 + h^{\dagger} (i))}{M_{i} (h^{\dagger} (i))}, \quad i = 1, 2, \ldots, N.\end{equation}
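When no closed-form solution of (3.14) is available, each $h^{\dagger} (i)$ may be found numerically. The following is a minimal sketch, not the paper's procedure: it applies bisection to the equation $\ln M_i (1 + h) - \ln M_i (h) = r$ , whose left-hand side is non-decreasing in h because the logarithm of an MGF is convex. The function names, the bracket $[-50, 50]$ , and the parameter values are hypothetical. A normal component is used as the test case because its solution is known in closed form as $(r - \mu - \frac{1}{2} \sigma^2) / \sigma^2$ .

```python
def solve_esscher(log_mgf, r, lo, hi, tol=1e-12):
    """Solve log_mgf(1 + h) - log_mgf(h) = r for h by bisection on [lo, hi]."""
    def g(h):
        return log_mgf(1.0 + h) - log_mgf(h) - r

    if g(lo) > 0.0 or g(hi) < 0.0:
        raise ValueError("the bracket [lo, hi] does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)

# Hypothetical per-period parameters for one normal component.
mu, sigma, r = 0.001, 0.02, 0.0001

def normal_log_mgf(eta):
    """Logarithm of the normal MGF: mu * eta + 0.5 * eta^2 * sigma^2."""
    return mu * eta + 0.5 * eta**2 * sigma**2

h_numeric = solve_esscher(normal_log_mgf, r, lo=-50.0, hi=50.0)
h_closed = (r - mu - 0.5 * sigma**2) / sigma**2
print(f"bisection: {h_numeric:.6f}, closed form: {h_closed:.6f}")
```

For other components, the bracket has to lie inside the interval on which both $M_i (h)$ and $M_i (1 + h)$ are finite.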

Furthermore, from Lemma 3.2, $R_1, R_2, \ldots, R_T \mid \Theta$ are (conditionally) independent under $\mathbb{P}^{h^{\dagger}}$ . From (3.11) in Theorem 3.2,

(3.15) \begin{equation} \mathbb{P}^{h^{\dagger}} (\Theta = i) = p_{i}.\end{equation}

Using (3.8) in Lemma 3.3, for each $i = 1, 2, \ldots, N$ the conditional MGF of $R_t$ given $\Theta = i$ under $\mathbb{P}^{h^{\dagger}}$ evaluated at $\eta$ is given by

(3.16) \begin{equation} M^{h^{\dagger}}_i (\eta) = \frac{M_{i} (\eta + h^{\dagger} (i))}{M_{i} (h^{\dagger} (i))}.\end{equation}

Then the MGF of $R_t$ under $\mathbb{P}^{h^{\dagger}}$ evaluated at $\eta$ is given by

(3.17) \begin{equation} M^{h^{\dagger}} (\eta) = \sum^{N}_{i = 1} M^{h^{\dagger}}_{i} (\eta) p_i = \sum^{N}_{i = 1} \frac{M_{i} (\eta + h^{\dagger} (i))}{M_{i} (h^{\dagger} (i))} p_i.\end{equation}

A European call option with strike price K and exercise date T is considered. As in [Reference Gerber and Shiu26], we calculate the value of the call option at time $t = 0$ . Let $C_0$ denote the value of the call option at time $t = 0$ . Then

(3.18) \begin{equation} C_0 = \mathbb{E}^{h^{\dagger}} [{\mathrm{e}}^{- r T} (S_T - K)^+ ].\end{equation}

Using the law of iterated expectations and (3.15),

(3.19) \begin{align} C_0 & = \sum^{N}_{i=1}\mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{-rT}(S_T - K)^+ \mid \Theta = i]\mathbb{P}^{h^{\dagger}}(\Theta = i) \nonumber \\ & = \sum^{N}_{i=1}\mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{-rT}(S_T - K)^+ \mid \Theta = i]p_i. \end{align}

For each $i = 1, 2, \ldots, N$ , let

(3.20) \begin{equation} C_0 (i) \,:\!=\, \mathbb{E}^{h^{\dagger}} [ {\mathrm{e}}^{- r T} (S_T - K)^+ \mid \Theta = i ].\end{equation}

Then, from (3.19) and (3.20),

(3.21) \begin{equation} C_0 = \sum^{N}_{i = 1} C_0 (i) p_i.\end{equation}

4. Analytical solutions

In this section, four parametric cases, namely a finite mixture model of normal distributions, a finite mixture model of negative-shifted Gamma distributions, a finite mixture model of negative-shifted inverse Gaussian distributions, and a hybrid finite mixture model of normal, negative-shifted Gamma and negative-shifted inverse Gaussian distributions are considered. The existence of an equivalent martingale measure, or a pricing kernel, selected by a risk-neutral Esscher measure is discussed for each of the four parametric cases. The derivations presented adopt results from [Reference Gerber and Shiu26, Sections 3 and 4]. Some of the conventions and notation of [Reference Gerber and Shiu26] will also be used in the following.

4.1. Finite mixture model of normal distributions

Assume that, for each $i = 1, 2, \ldots, N$ , the ith-component CDF $F_{i} (x)$ is the CDF of a normal distribution with mean $\mu_i$ and variance $\sigma_i^2$ (i.e. $N (\mu_i, \sigma_i^2)$ ). Then

(4.1) \begin{equation} M_i (\eta) = \exp \bigg ( \mu_i \eta + \frac{1}{2} \eta^2 \sigma_i^2 \bigg).\end{equation}

From (3.14) and (4.1),

(4.2) \begin{equation} h^{\dagger} (i) = \frac{r - \mu_i - \frac{1}{2} \sigma^2_i}{\sigma^2_i}, \quad i = 1, 2, \ldots, N.\end{equation}

Then (4.2) ensures the existence of an equivalent martingale measure $\mathbb{P}^{h^{\dagger}}$ .

From (3.16), (4.1), and (4.2),

(4.3) \begin{equation} M^{h^{\dagger}}_{i} (\eta) = \exp \bigg ( \bigg (r - \frac{1}{2} \sigma^2_i \bigg ) \eta + \frac{1}{2} \eta^2 \sigma^2_i \bigg ).\end{equation}
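The substitution behind (4.3) can be written out as follows. From (3.16) and (4.1),

\begin{align*} M^{h^{\dagger}}_{i} (\eta) & = \frac{M_i(\eta + h^{\dagger}(i))}{M_i(h^{\dagger}(i))} = \exp\bigg(\mu_i\eta + \frac{1}{2}\sigma^2_i\big((\eta + h^{\dagger}(i))^2 - h^{\dagger}(i)^2\big)\bigg) \\ & = \exp\bigg(\big(\mu_i + \sigma^2_i h^{\dagger}(i)\big)\eta + \frac{1}{2}\eta^2\sigma^2_i\bigg), \end{align*}

and (4.2) gives $\mu_i + \sigma^2_i h^{\dagger} (i) = r - \frac{1}{2} \sigma^2_i$ , which yields the expression in (4.3).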

This implies that under $\mathbb{P}^{h^{\dagger}}$ , the conditional distribution of $R_t$ given $\Theta = i$ is a normal distribution with mean $r - \frac{1}{2} \sigma^2_i$ and variance $\sigma^2_i$ , (i.e. $R_t \mid \{ \Theta = i \} \sim N(r - \frac{1}{2} \sigma^2_i, \sigma^2_i)$ ), for each $i = 1, 2, \ldots, N$ and each $t = 1, 2, \ldots, T$ . Define

(4.4) \begin{equation} R_{1, T} \,:\!=\, \sum^{T}_{t = 1} R_t.\end{equation}

Note that by Lemma 3.2, $R_1, R_2, \ldots, R_T \mid \Theta$ are (conditionally) independent under $\mathbb{P}^{h^{\dagger}}$ . Furthermore, for each $t = 1, 2, \ldots, T$ and each $i = 1, 2, \ldots, N$ , $R_t \mid \{ \Theta = i \} \sim N\big(r - \frac{1}{2} \sigma^2_i, \sigma^2_i\big)$ under $\mathbb{P}^{h^{\dagger}}$ . Consequently, $R_{1, T} \mid \{ \Theta = i \} \sim N \big(\big(r - \frac{1}{2} \sigma^2_i\big) T, \sigma^2_i T\big)$ under $\mathbb{P}^{h^{\dagger}}$ . It may be seen that, for each $i = 1, 2, \ldots, N$ ,

(4.5) \begin{equation} C_0 (i) = \mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{-rT}(S_T - K)^+ \mid \Theta = i] = \mathrm{BSM}(S_0,K,T,r,\sigma_i).\end{equation}

Note that $\mathrm{BSM} (S_0, K, T, r, \sigma_i)$ is the Black–Scholes–Merton call price with volatility parameter $\sigma_i$ :

(4.6) \begin{equation} \mathrm{BSM}(S_0,K,T,r,\sigma_i) = S_0\Phi(d_1(i)) - K{\mathrm{e}}^{-rT}\Phi(d_2(i)),\end{equation}

where

\begin{equation*} d_1(i) = \frac{\ln({S_0}/{K}) + \big(r + \frac{1}{2}\sigma^2_i\big)T}{\sigma_i\sqrt{T}}, \qquad d_2(i) = d_1(i) - \sigma_i\sqrt{T}.\end{equation*}

Then, from (3.21) and (4.5), an analytical expression for the call price is

(4.7) \begin{equation} C_0 = \sum^{N}_{i = 1} \mathrm{BSM} (S_0, K, T, r, \sigma_i) p_i.\end{equation}
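As an illustration of (4.6) and (4.7), the following minimal Python sketch prices a call as the $p_i$-weighted sum of component Black–Scholes–Merton prices. The function names and all numerical inputs are hypothetical choices made for this example (they are not estimates from Section 5), and $T$, $r$, and $\sigma_i$ are assumed to share the same (daily) time unit.

```python
import math


def norm_cdf(x):
    """Standard normal CDF Phi(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(s0, strike, t, r, sigma):
    """Black-Scholes-Merton call price (4.6); t, r and sigma share one time unit."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)


def mixture_normal_call(s0, strike, t, r, sigmas, weights):
    """Mixture price (4.7): the p_i-weighted sum of component BSM prices."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("mixing probabilities must sum to one")
    return sum(p * bsm_call(s0, strike, t, r, s) for p, s in zip(weights, sigmas))


# Hypothetical daily-unit inputs, not estimates from the paper.
S0, K, T, R_DAILY = 100.0, 100.0, 21, 0.0433 / 252
SIGMAS, WEIGHTS = [0.05, 0.02], [0.35, 0.65]
price = mixture_normal_call(S0, K, T, R_DAILY, SIGMAS, WEIGHTS)
```

Because the component means $\mu_i$ drop out under $\mathbb{P}^{h^{\dagger}}$ , only the volatilities $\sigma_i$ and the mixing probabilities $p_i$ enter the sketch.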

4.2. Finite mixture model of negative-shifted gamma distributions

From the empirical studies in Section 5, the skewness of the logarithmic returns of Bitcoin in USD is negative. A shifted Gamma distribution may therefore be unsuitable for modelling these returns, since its skewness is positive [Reference Gerber and Shiu26, p. 117]. Instead of a finite mixture model of shifted Gamma distributions, we suppose that the logarithmic returns follow a finite mixture model of negative-shifted Gamma distributions. Specifically, we assume that under the real-world probability measure $\mathbb{P}$ , for each $i = 1, 2, \ldots, N$ and each $t = 1, 2, \ldots, T$ ,

(4.8) \begin{equation} R_t \mid \{ \Theta = i \} \overset{{\mathrm{d}}}{=} k_i - Y_i,\end{equation}

where $Y_i$ follows a Gamma distribution $\mathrm{Ga} (\alpha_i, \beta_i)$ with shape parameter $\alpha_i$ and rate parameter $\beta_i$ ; $\overset{{\mathrm{d}}}{=}$ means equality in distribution. In this case, for each $i = 1, 2, \ldots, N$ and each $t = 1, 2, \ldots, T$ , the conditional distribution of $R_t$ given $\{ \Theta = i \}$ under $\mathbb{P}$ is a negative-shifted Gamma distribution with shape parameter $\alpha_i$ , rate parameter $\beta_i$ , and shifted parameter $k_i$ (i.e. $\mathrm{NegSGa} (\alpha_i, \beta_i, k_i)$ ), where $k_i$ , $\alpha_i$ , and $\beta_i$ are all positive.

Let $G (x \mid \alpha, \beta)$ denote the CDF of a Gamma distribution with shape parameter $\alpha$ and rate parameter $\beta$ evaluated at $x > 0$ . Then, for each $i = 1, 2, \ldots, N$ , the ith-component CDF $F_{i} (x)$ is given by

(4.9) \begin{equation} F_i (x) = 1 - G (k_i - x \mid \alpha_i, \beta_i), \quad x < k_i.\end{equation}

Similarly to [Reference Gerber and Shiu26, (4.1.3)], for each $i = 1, 2, \ldots, N$ the MGF of $F_i (x)$ evaluated at $\eta$ is given by

(4.10) \begin{equation} M_i (\eta) = \bigg ( \frac{\beta_i}{\beta_i + \eta} \bigg )^{\alpha_i} {\mathrm{e}}^{k_i \eta}, \quad \eta > - \beta_i.\end{equation}

From (3.14) and (4.10),

(4.11) \begin{equation} {\mathrm{e}}^r = \bigg ( \frac{\beta_i + h^{\dagger} (i)}{\beta_i + h^{\dagger} (i) + 1} \bigg )^{\alpha_i}{\mathrm{e}}^{k_i}, \quad i = 1, 2, \ldots, N.\end{equation}

Take, for each $i = 1, 2, \ldots, N$ ,

(4.12) \begin{equation} \beta^{\dagger}_i = \beta_i + h^{\dagger} (i).\end{equation}

Then, from (4.11) and (4.12),

(4.13) \begin{equation} \beta^{\dagger}_i = \frac{1}{{\mathrm{e}}^{({k_i - r})/{\alpha_i}} - 1}, \quad k_i > r.\end{equation}

Indeed, (4.12) and (4.13) ensure the existence of an equivalent martingale measure $\mathbb{P}^{h^{\dagger}}$ provided that $k_i > r$ . The condition that $k_i > r$ is required to ensure that $\beta^{\dagger}_i > 0$ .
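The relations (4.11)–(4.13) can be checked numerically. The sketch below, with hypothetical function names and illustrative daily-unit parameter values, computes $\beta^{\dagger}_i$ from (4.13), recovers the Esscher parameter $h^{\dagger} (i)$ from (4.12), and evaluates both sides of the martingale condition (4.11).

```python
import math


def gamma_risk_neutral_rate(alpha, k, r):
    """beta_dagger from (4.13); k > r is needed for a positive rate."""
    if k <= r:
        raise ValueError("(4.13) requires k > r")
    return 1.0 / math.expm1((k - r) / alpha)


def esscher_parameter(beta, beta_dagger):
    """h_dagger implied by (4.12): beta_dagger = beta + h_dagger."""
    return beta_dagger - beta


# Hypothetical daily-unit parameters, chosen only for illustration.
ALPHA, BETA, SHIFT, R_DAILY = 2.0, 60.0, 0.04, 0.0433 / 252
beta_dagger = gamma_risk_neutral_rate(ALPHA, SHIFT, R_DAILY)
h_dagger = esscher_parameter(BETA, beta_dagger)

# Both sides of (4.11), written in terms of beta_dagger = beta + h_dagger.
lhs = math.exp(R_DAILY)
rhs = (beta_dagger / (beta_dagger + 1.0)) ** ALPHA * math.exp(SHIFT)
```

The two sides agree up to floating-point error, and changing `BETA` alters `h_dagger` but not `beta_dagger`.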

From (3.16), (4.10), and (4.12), for each $i = 1, 2, \ldots, N$ the conditional MGF of $R_t$ given $\Theta = i$ under $\mathbb{P}^{h^{\dagger}}$ evaluated at $\eta$ is given by

(4.14) \begin{equation} M^{h^{\dagger}}_{i} (\eta) = \bigg ( \frac{\beta^{\dagger}_i}{\beta^{\dagger}_i + \eta} \bigg )^{\alpha_i}{\mathrm{e}}^{k_i \eta}, \quad \eta > - \beta^{\dagger}_i.\end{equation}

Consequently, under $\mathbb{P}^{h^{\dagger}}$ , the conditional distribution of $R_t$ given $\Theta = i$ is a negative-shifted Gamma distribution with shape parameter $\alpha_i$ , rate parameter $\beta^{\dagger}_i$ , and shifted parameter $k_i$ . Since $R_1, R_2, \ldots, R_T \mid \Theta$ are (conditionally) independent under $\mathbb{P}^{h^{\dagger}}$ , $R_{1, T} \mid \{ \Theta = i \} \sim \mathrm{NegSGa} (\alpha_i T, \beta^{\dagger}_i, k_i T)$ under $\mathbb{P}^{h^{\dagger}}$ for each $i = 1, 2, \ldots, N$ .

Define

(4.15) \begin{equation} \kappa \,:\!=\, \ln \bigg ( \frac{K}{S_0} \bigg).\end{equation}

Then, similarly to [Reference Gerber and Shiu26, (4.1.7)],

(4.16) \begin{align} C_0(i) & = \mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{-rT}(S_T - K)^+ \mid \Theta = i] \nonumber \\ & = S_0G(k_iT - \kappa \mid \alpha_iT, \beta^{\dagger}_i + 1) - K{\mathrm{e}}^{-rT}G(k_iT - \kappa \mid \alpha_iT, \beta^{\dagger}_i) \nonumber \\ & = \mathrm{NSGa}(S_0,K,T,r,\alpha_i,\beta^{\dagger}_i,k_i), \quad \mbox{say}. \end{align}

Note that for (4.16) to hold, it is required that $k_i T > \kappa$ for each $i = 1, 2, \ldots, N$ .

From (3.21) and (4.16), an analytical expression for the call price is

(4.17) \begin{equation} C_0 = \sum^{N}_{i = 1} \mathrm{NSGa} (S_0, K, T, r, \alpha_i, \beta^{\dagger}_i, k_i) p_i.\end{equation}

From (4.13), $\beta^{\dagger}_i$ does not depend on $\beta_i$ for each $i = 1, 2, \ldots, N$ . Consequently, the call price $C_0$ in (4.17) does not depend on $\beta_i$ .
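A minimal sketch of the component price (4.16) is given below. It uses only the Python standard library, so the Gamma CDF $G (x \mid \alpha, \beta)$ is evaluated with a textbook power series and continued fraction for the regularized incomplete gamma function. This routine is an assumption of the sketch rather than the implementation used for the results in this paper. The function names and parameter values are hypothetical, and all quantities are in daily units.

```python
import math


def gamma_cdf(x, shape, rate, eps=1e-14, max_iter=10000):
    """G(x | shape, rate): regularized lower incomplete gamma P(shape, rate * x)."""
    if x <= 0.0:
        return 0.0
    z = rate * x
    log_front = -z + shape * math.log(z) - math.lgamma(shape)
    if z < shape + 1.0:
        # Power series: P = front * sum_n z^n / (shape (shape+1) ... (shape+n)).
        term = 1.0 / shape
        total = term
        for n in range(1, max_iter):
            term *= z / (shape + n)
            total += term
            if term < eps * total:
                break
        return total * math.exp(log_front)
    # Continued fraction for Q = 1 - P (modified Lentz method).
    tiny = 1e-300
    b = z + 1.0 - shape
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - shape)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return 1.0 - h * math.exp(log_front)


def nsga_call(s0, strike, t, r, alpha, k):
    """Component price (4.16), with beta_dagger taken from (4.13)."""
    if k <= r:
        raise ValueError("(4.13) requires k > r")
    beta_dagger = 1.0 / math.expm1((k - r) / alpha)
    x = k * t - math.log(strike / s0)  # k T - kappa; both CDF terms vanish if x <= 0
    return (s0 * gamma_cdf(x, alpha * t, beta_dagger + 1.0)
            - strike * math.exp(-r * t) * gamma_cdf(x, alpha * t, beta_dagger))


# Hypothetical daily-unit inputs, not estimates from the paper.
S0, K, T, R_DAILY, ALPHA, SHIFT = 100.0, 100.0, 21, 0.0433 / 252, 2.0, 0.04
price = nsga_call(S0, K, T, R_DAILY, ALPHA, SHIFT)
```

The signature of `nsga_call` has no argument for $\beta_i$ , which reflects the observation that the price does not depend on it.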

4.3. Finite mixture model of negative-shifted inverse Gaussian distributions

Let $J (x \mid a, b)$ denote the CDF of an inverse Gaussian (IG) distribution with parameters a and b evaluated at $x > 0$ . Then, from [Reference Gerber and Shiu26, (4.2.1)],

(4.18) \begin{equation} J (x \mid a, b) = \Phi\bigg({-}\frac{a}{\sqrt{2x}\,} + \sqrt{2bx}\bigg) + {\mathrm{e}}^{2a\sqrt{b}}\Phi\bigg({-}\frac{a}{\sqrt{2x}\,} - \sqrt{2bx}\bigg),\end{equation}

where $\Phi(\cdot)$ is the CDF of a standard normal distribution. That is, if $\mu^{\scriptstyle{sig}}$ and $\lambda$ are the mean and the shape parameter of the IG distribution, respectively, then $\mu^{\scriptstyle{sig}} = {a}/{2\sqrt{b}}$ and $\lambda = {a^2}/{2}$ .

We suppose that, under $\mathbb{P}$ , for each $i = 1, 2, \ldots, N$ and each $t = 1, 2, \ldots, T$ ,

(4.19) \begin{equation} R_t \mid \{ \Theta = i \} \overset{{\mathrm{d}}}{=} k_i - Y_i,\end{equation}

where $Y_i$ follows an IG distribution with parameters $a_i$ and $b_i$ . Note that $a_i$ , $b_i$ , and $k_i$ are all positive. In this case, the conditional distribution of $R_t$ given $\{ \Theta = i \}$ under $\mathbb{P}$ is a negative-shifted inverse Gaussian distribution with IG parameters $a_i$ and $b_i$ , as well as the shifted parameter $k_i$ . That is, under $\mathbb{P}$ , $R_t \mid \{ \Theta = i \} \sim \mathrm{NegSIG} (a_i, b_i, k_i)$ .

Then, for each $i = 1, 2, \ldots, N$ , the ith-component CDF $F_i (x)$ is given by

(4.20) \begin{equation} F_i (x) = 1 - J (k_i - x \mid a_i, b_i), \quad x < k_i.\end{equation}

Similarly to [Reference Gerber and Shiu26, (4.2.3)], for each $i = 1, 2, \ldots, N$ the MGF of $F_i (x)$ evaluated at $\eta$ is given by

(4.21) \begin{align} M_i(\eta) = \exp\left(a_i(\sqrt{b_i} - \sqrt{b_i + \eta}) + k_i\eta\right), \quad \eta > - b_i.\end{align}

From (3.14) and (4.21),

(4.22) \begin{equation} r = a_i \left(\sqrt{b_i + h^{\dagger} (i)} - \sqrt{b_i + h^{\dagger} (i) + 1}\right) + k_i.\end{equation}

Take, for each $i = 1, 2, \ldots, N$ ,

(4.23) \begin{equation} b^{\dagger}_i = b_i + h^{\dagger} (i).\end{equation}

Then, from (4.22) and (4.23),

(4.24) \begin{equation} \sqrt{b^{\dagger}_i} - \sqrt{b^{\dagger}_i + 1} = \frac{r - k_i}{a_i}.\end{equation}

Consequently, from (4.23), the existence of an equivalent martingale measure $\mathbb{P}^{h^{\dagger}}$ becomes the existence of a solution to the non-linear equation in (4.24) provided that $b_i > - h^{\dagger} (i)$ . The latter condition is required to ensure that $b^{\dagger}_i > 0$ .
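Equation (4.24) can be rearranged by elementary algebra. Writing $c = (k_i - r)/a_i$ and squaring $\sqrt{b^{\dagger}_i + 1} = \sqrt{b^{\dagger}_i} + c$ gives $\sqrt{b^{\dagger}_i} = (1 - c^2)/(2c)$ , which is positive when $0 < c < 1$ . This rearrangement is added here for illustration and is not a formula stated above. The sketch below, with hypothetical function names and parameter values, computes the root in this way and evaluates the residual of (4.24).

```python
import math


def ig_b_dagger(a, k, r):
    """Positive root b_dagger of (4.24), assuming 0 < k - r < a."""
    c = (k - r) / a
    if not 0.0 < c < 1.0:
        raise ValueError("a positive root of (4.24) requires 0 < k - r < a")
    return ((1.0 - c * c) / (2.0 * c)) ** 2


def residual_4_24(b_dagger, a, k, r):
    """Left side minus right side of (4.24); zero at the root."""
    return math.sqrt(b_dagger) - math.sqrt(b_dagger + 1.0) - (r - k) / a


# Hypothetical daily-unit parameters, chosen only for illustration.
A, SHIFT, R_DAILY = 0.6, 0.05, 0.0433 / 252
b_dagger = ig_b_dagger(A, SHIFT, R_DAILY)
check = residual_4_24(b_dagger, A, SHIFT, R_DAILY)
```

Since $\sqrt{b} - \sqrt{b + 1}$ increases from $-1$ to $0$ as $b$ runs over $(0, \infty)$ , a positive root exists only when $0 < k_i - r < a_i$ , which is the condition enforced in the sketch.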

From (3.16), (4.21), and (4.23), for each $i = 1, 2, \ldots, N$ the conditional MGF of $R_t$ given $\Theta = i$ under $\mathbb{P}^{h^{\dagger}}$ evaluated at $\eta$ is given by

(4.25) \begin{equation} M^{h^{\dagger}}_{i}(\eta) = \exp\Big(a_i\Big(\sqrt{b^{\dagger}_i} - \sqrt{b^{\dagger}_i + \eta}\Big) + k_i\eta\Big), \quad \eta > - b^{\dagger}_i.\end{equation}

Consequently, under $\mathbb{P}^{h^{\dagger}}$ , the conditional distribution of $R_t$ given $\Theta = i$ is a negative-shifted IG distribution with parameters $a_i$ and $b^{\dagger}_i$ , as well as the shifted parameter $k_i$ . Since $R_1, R_2, \ldots, R_T \mid \Theta$ are (conditionally) independent under $\mathbb{P}^{h^{\dagger}}$ , $R_{1, T} \mid \{ \Theta = i \} \sim \mathrm{NegSIG} (a_i T, b^{\dagger}_i, k_i T)$ under $\mathbb{P}^{h^{\dagger}}$ for each $i = 1, 2, \ldots, N$ . Then, similarly to [Reference Gerber and Shiu26, (4.2.7)],

(4.26) \begin{align} C_0(i) & = \mathbb{E}^{h^{\dagger}}[{\mathrm{e}}^{-rT}(S_T - K)^+ \mid \Theta = i] \nonumber \\ & = S_0J(k_iT - \kappa \mid a_iT,b^{\dagger}_i + 1) - K{\mathrm{e}}^{-rT}J(k_iT - \kappa \mid a_iT,b^{\dagger}_i), \nonumber \\ & = \mathrm{NSIG}(S_0, K, T, r, a_i, b^{\dagger}_i, k_i). \end{align}

Note that for (4.26) to hold, it is required that $k_i T > \kappa$ for each $i = 1, 2, \ldots, N$ .

From (3.21) and (4.26), an analytical expression for the call price is

(4.27) \begin{equation} C_0 = \sum^{N}_{i = 1} \mathrm{NSIG} (S_0, K, T, r, a_i, b^{\dagger}_i, k_i) p_i.\end{equation}

This pricing formula is analytical once the non-linear equation for $b^{\dagger}_i$ in (4.24) has been solved. Furthermore, from (4.24), $b^{\dagger}_i$ does not depend on $b_i$ for each $i = 1, 2, \ldots, N$ . Consequently, the call price $C_0$ in (4.27) does not depend on $b_i$ .
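The component price (4.26) can be sketched with the standard library alone, since $J (x \mid a, b)$ in (4.18) only requires the standard normal CDF. In the sketch below, the second term of (4.18) is evaluated on the log scale to avoid overflow of ${\mathrm{e}}^{2a\sqrt{b}}$ , and $b^{\dagger}_i$ is obtained from an algebraic rearrangement of (4.24). Both choices, together with the function names and parameter values, are assumptions of this illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def log_norm_cdf_neg(z):
    """log Phi(-z) for z >= 0; uses a tail expansion once erfc becomes tiny."""
    if z < 30.0:
        return math.log(0.5 * math.erfc(z / SQRT2))
    z2 = z * z
    return (-0.5 * z2 - math.log(z * math.sqrt(2.0 * math.pi))
            + math.log1p(-1.0 / z2 + 3.0 / (z2 * z2)))


def ig_cdf(x, a, b):
    """J(x | a, b) from (4.18)."""
    if x <= 0.0:
        return 0.0
    u = a / math.sqrt(2.0 * x)
    v = math.sqrt(2.0 * b * x)
    first = 0.5 * math.erfc((u - v) / SQRT2)  # Phi(-u + v)
    second = math.exp(2.0 * a * math.sqrt(b) + log_norm_cdf_neg(u + v))
    return first + second


def nsig_call(s0, strike, t, r, a, k):
    """Component price (4.26); b_dagger solves (4.24), assuming 0 < k - r < a."""
    c = (k - r) / a
    if not 0.0 < c < 1.0:
        raise ValueError("a positive root of (4.24) requires 0 < k - r < a")
    b_dagger = ((1.0 - c * c) / (2.0 * c)) ** 2
    x = k * t - math.log(strike / s0)  # k T - kappa
    return (s0 * ig_cdf(x, a * t, b_dagger + 1.0)
            - strike * math.exp(-r * t) * ig_cdf(x, a * t, b_dagger))


# Hypothetical daily-unit inputs, not estimates from the paper.
S0, K, T, R_DAILY, A, SHIFT = 100.0, 100.0, 21, 0.0433 / 252, 0.6, 0.05
price = nsig_call(S0, K, T, R_DAILY, A, SHIFT)
```

As in the Gamma case, `nsig_call` takes no argument for $b_i$ , which is consistent with the price in (4.27) not depending on it.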

4.4. Hybrid finite mixture model

In this subsection, a hybrid finite mixture model of normal, negative-shifted Gamma, and negative-shifted inverse Gaussian distributions is considered. The hybrid model allows the component distributions to come from different parametric classes of distributions. It provides a feasible way to describe both model uncertainty and parameter uncertainty. The former is interpreted as uncertainty about the parametric form of distributions. Specifically, to describe model uncertainty we consider more than one parametric form for the component distributions and use the mixing mechanism in a finite mixture model to take a ‘weighted average’ of the component distributions across different parametric distribution classes. The idea may perhaps be related to the use of Bayesian averaging to capture model uncertainty in the context of smooth ambiguity pioneered in [Reference Klibanoff, Marinacci and Mukerji43]. To incorporate parameter uncertainty, the mixing mechanism in a finite mixture model is used to take a ‘weighted average’ of the component distributions within the same parametric class of distributions. Informally speaking, model uncertainty is incorporated through the variation or heterogeneity between different parametric classes of distributions, whereas parameter uncertainty is captured through the variation or heterogeneity within the same parametric class of distributions.

Suppose, for each $i = 1, 2, \ldots, N_1$ , the ith-component CDF $F_{i} (x)$ is the CDF of a normal distribution with mean $\mu_i$ and variance $\sigma^2_i$ . Also, for each $i = N_1 + 1, N_1 + 2, \ldots, N_1 + N_2$ , the ith-component CDF $F_{i} (x)$ is the CDF of a negative-shifted Gamma distribution with shape parameter $\alpha_i$ , rate parameter $\beta_i$ , and shifted parameter $k_i$ . Assume, for each $i = N_1 + N_2 + 1, N_1 + N_2 + 2, \ldots, N$ , the ith-component CDF $F_{i} (x)$ is the CDF of a negative-shifted IG distribution with IG parameters $a_i$ and $b_i$ , and the shifted parameter $k_i$ . Then the hybrid finite mixture model can be written as

(4.28) \begin{align} F(x) & = \sum^{N_1}_{i=1}p_iN(x \mid \mu_i,\sigma^2_i) + \sum^{N_1+N_2}_{i=N_1+1}p_i(1 - G(k_i - x \mid \alpha_i,\beta_i)) \nonumber \\ & \quad + \sum^{N}_{i = N_1 + N_2 + 1}p_i(1 - J(k_i - x \mid a_i, b_i))\end{align}

for $x < \min \{ k_{N_1+1}, k_{N_1 + 2}, \ldots, k_{N} \}$ .

Using the results in Sections 4.1–4.3, an analytical expression for the call price is

(4.29) \begin{align} C_0 & = \sum^{N_1}_{i=1}\mathrm{BSM}(S_0,K,T,r,\sigma_i)p_i + \sum^{N_1+N_2}_{i = N_1+1}\mathrm{NSGa}(S_0,K,T,r,\alpha_i,\beta^{\dagger}_i,k_i)p_i \nonumber \\ & \quad + \sum^{N}_{i=N_1+N_2+1}\mathrm{NSIG}(S_0,K,T,r,a_i,b^{\dagger}_i,k_i)p_i.\end{align}

Again, the pricing formula in (4.29) is analytical once the non-linear equation for $b^{\dagger}_i$ in (4.24) has been solved. The existence of an equivalent martingale measure $\mathbb{P}^{h^{\dagger}}$ follows directly from the respective discussions in Sections 4.1–4.3.

5. Estimation procedures and results

In this section we adopt the ECF estimation method to estimate the unknown parameters in eight models that will be used for pricing Bitcoin options in Section 6. A summary of the eight models is provided in Table D.1 in the supplementary material. Models I–V are finite mixture models, while Models VI–VIII are non-mixture models. The latter are used for comparison. Specifically, Model VI is the Black–Scholes–Merton model [Reference Black and Scholes9, Reference Merton52]. Models VII and VIII are based on negative-shifted Gamma and negative-shifted IG distributions, respectively. First, the estimation procedures of the ECF estimation method are briefly reviewed. Then the estimation results based on real Bitcoin price data are provided.

5.1. Estimation procedures

The ECF estimation method [Reference Feuerverger and McDunnough21, Reference Feuerverger and Mureika22, Reference Knight and Yu44, Reference Madan and Seneta48, Reference Madan and Seneta49, Reference Singleton61, Reference Tran79, Reference Yu83] has been applied to estimate unknown parameters of finite mixture models of normal distributions [Reference Bryant and Paulson10, Reference Tran79, Reference Yu83]. Its key idea is to estimate the unknown parameters by minimizing the ‘distance’ between the empirical characteristic function and its theoretical counterpart. It was noted in [Reference Yu83] that the ECF estimation method can be considered when the maximum likelihood estimation method is difficult and a closed-form solution of the (theoretical) characteristic function is available. This is why the ECF estimation method is adopted here. Specifically, for the cases of a finite mixture model of negative-shifted Gamma distributions and a finite mixture model of negative-shifted IG distributions, it is possible that the observed logarithmic returns may not be bounded above by the shifted parameters. This renders the maximum likelihood estimation method intractable. Furthermore, we note that closed-form expressions for the (theoretical) characteristic functions of a negative-shifted Gamma distribution and a negative-shifted IG distribution are available and that the (theoretical) characteristic function of a finite mixture model is a mixture of the (theoretical) characteristic functions of the individual component distributions. Consequently, closed-form expressions for the (theoretical) characteristic functions of the finite mixture model of negative-shifted Gamma distributions and the finite mixture model of negative-shifted IG distributions are obtained. The ECF estimation method is intimately linked with the generalized method of moments (GMM) estimation [Reference Hansen35]. 
See [Reference Yu83] for a detailed discussion on the link between the GMM method and the ECF method. [Reference Gerber and Shiu26] proposed the use of the method of moments to estimate unknown parameters in a shifted Gamma distribution and a shifted IG distribution, and derived the respective closed-form estimators. In the following, the ECF estimation method is briefly discussed in the context of a general finite mixture model.

Let $\{ R_1, R_2, \ldots, R_n \}$ denote a set of n observations about the logarithmic returns of Bitcoin. Then the ECF of $\{ R_1, R_2, \ldots, R_n \}$ evaluated at $u \in {\mathbb{R}}$ is defined as

(5.1) \begin{equation} C_n (u) \,:\!=\, \frac{1}{n} \sum^{n}_{j = 1} \exp (\mathrm{i} u R_j),\end{equation}

where $\mathrm{i} \,:\!=\, \sqrt{-1}$ .
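The definition (5.1) translates directly into code. The following minimal Python sketch evaluates the ECF at a single point $u$ . The function name and the sample values are hypothetical.

```python
import cmath


def ecf(returns, u):
    """Empirical characteristic function C_n(u) in (5.1)."""
    return sum(cmath.exp(1j * u * x) for x in returns) / len(returns)


# Hypothetical daily logarithmic returns, for illustration only.
sample = [0.012, -0.034, 0.005, 0.021, -0.008]
value = ecf(sample, 1.5)
```

By construction $C_n (0) = 1$ and $| C_n (u) | \le 1$ for every real $u$ .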

Let $C (u;\, \varphi)$ denote the (theoretical) characteristic function of the general finite mixture distribution F(x) in (2.1) evaluated at $u \in {\mathbb{R}}$ under $\mathbb{P}$ , where $\varphi$ is a vector of unknown parameters in the finite mixture distribution $F (x) \,:\!=\, F (x;\, \varphi)$ and $\varphi \in \Phi \subset {\mathbb{R}}^k$ for some Euclidean space ${\mathbb{R}}^k$ . For each $j = 1, 2, \ldots, N$ , let $C_j (u; \varphi)$ denote the (theoretical) characteristic function of the jth-component CDF $F_j (x)$ of the general finite mixture distribution F(x) evaluated at $u \in {\mathbb{R}}$ under $\mathbb{P}$ . That is, $C_j (u;\, \varphi)$ is the conditional (theoretical) characteristic function of $R_t$ given $\{ \Theta = j \}$ evaluated at $u \in {\mathbb{R}}$ under $\mathbb{P}$ . Note that for each $j = 1, 2, \ldots, N$ , the vector of unknown parameters in $F_j (x)$ is obtained by putting some components of $\varphi$ equal to zero. To simplify the notation, we still adopt $\varphi$ to denote the vector of unknown parameters in $F_j (x)$ . By putting $\eta = \mathrm{i} u$ in (2.7), we have

(5.2) \begin{align} C (u, \varphi) & = \sum^{N}_{j = 1} p_{j} C_{j} (u, \varphi), \end{align}
(5.3) \begin{align} C (u, \varphi) & = \mathbb{E} [{\mathrm{e}}^{\mathrm{i} u R_t} ] = \int_{\mathcal{S}}{\mathrm{e}}^{\mathrm{i} u x}\,{\mathrm{d}} F(x), \end{align}
(5.4) \begin{align} C_j(u,\varphi) & = \mathbb{E}[{\mathrm{e}}^{\mathrm{i}uR_t} \mid \Theta = j] = \int_{\mathcal{S}_j}{\mathrm{e}}^{\mathrm{i}ux}\,{\mathrm{d}} F_j (x). \end{align}

Note that $\mathcal{S}$ is the support of the probability density function (PDF) of the distribution F(x); $\mathcal{S}_j$ is the support of the PDF of the distribution $F_j (x)$ for each $j = 1, 2, \ldots, N$ . In the case of a finite mixture of negative-shifted Gamma distributions, a closed-form expression for $C_j (u, \varphi)$ is obtained by putting $\eta = \mathrm{i} u$ in (4.10). In the case of a finite mixture of negative-shifted IG distributions, a closed-form expression for $C_j (u, \varphi)$ is obtained by putting $\eta = \mathrm{i} u$ in (4.21).

Then the ECF estimation method is to estimate $\varphi$ by solving the following minimization problem:

(5.5) \begin{eqnarray} \min_{\varphi \in \Phi} \int^{\infty}_{-\infty} | C_n (u) - C (u, \varphi) |^2 g (u)\, {\mathrm{d}} u,\end{eqnarray}

where g(u) is some continuous weighting function. Here, as in [Reference Yu83], we choose $g (u) = \exp (\!- u ^ 2)$ .
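To make the objective in (5.5) concrete, the sketch below approximates the weighted integral with a plain trapezoidal rule on a truncated interval, which is a simplification chosen for this illustration and not the adaptive quadrature used for the results in this paper. The data are simulated from a hypothetical two-component normal mixture; `negsga_cf` shows how (4.10) with $\eta = \mathrm{i} u$ yields a component characteristic function. All names and values are assumptions.

```python
import cmath
import math
import random


def ecf(returns, u):
    """Empirical characteristic function C_n(u) in (5.1)."""
    return sum(cmath.exp(1j * u * x) for x in returns) / len(returns)


def normal_cf(u, mu, sigma):
    """Characteristic function of N(mu, sigma^2)."""
    return cmath.exp(1j * u * mu - 0.5 * (sigma * u) ** 2)


def negsga_cf(u, alpha, beta, k):
    """Characteristic function of NegSGa(alpha, beta, k): (4.10) with eta = i u."""
    return (beta / (beta + 1j * u)) ** alpha * cmath.exp(1j * u * k)


def two_normal_cf(u, p, mu1, s1, mu2, s2):
    """Mixture characteristic function (5.2) with two normal components."""
    return p * normal_cf(u, mu1, s1) + (1.0 - p) * normal_cf(u, mu2, s2)


def ecf_objective(returns, model_cf, half_width=6.0, n_steps=120):
    """Trapezoidal approximation of (5.5) with g(u) = exp(-u^2) on [-half_width, half_width]."""
    h = 2.0 * half_width / n_steps
    total = 0.0
    for j in range(n_steps + 1):
        u = -half_width + j * h
        w = 0.5 if j in (0, n_steps) else 1.0  # trapezoid end-point weights
        total += w * abs(ecf(returns, u) - model_cf(u)) ** 2 * math.exp(-u * u)
    return total * h


# Simulated (hypothetical) returns from a two-component normal mixture.
rng = random.Random(1)
sample = [rng.gauss(-0.002, 0.05) if rng.random() < 0.35 else rng.gauss(0.002, 0.02)
          for _ in range(1000)]

fit_good = ecf_objective(sample, lambda u: two_normal_cf(u, 0.35, -0.002, 0.05, 0.002, 0.02))
fit_poor = ecf_objective(sample, lambda u: two_normal_cf(u, 0.35, -0.002, 0.30, 0.002, 0.20))
```

The objective is much smaller at the generating parameters (`fit_good`) than at the deliberately misspecified volatilities (`fit_poor`). The weight $\exp (\!- u ^ 2)$ is negligible beyond $| u | = 6$ , which is why the truncation has little effect here.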

In practice, it is often the case that there is no closed-form expression for the integral in (5.5). Consequently, the integral is evaluated via some numerical integration procedures. Here we shall use the function integrate in R to compute the integral numerically, which adopts adaptive quadrature of functions and involves a set of discrete points for the variable u of the integration. To solve the minimization problem in (5.5), the differential evolution algorithm is implemented using the DEoptim R package [Reference Ardia, Mullen, Peterson, Ulrich and Boudt2]. Differential evolution is an evolutionary global optimization algorithm that adopts biology-inspired operations on a population to minimize an objective function over successive generations [Reference Ardia, Mullen, Peterson, Ulrich and Boudt2, Reference Mitchell53, Reference Price, Storn and Lampinen56, Reference Storn and Price72]. More detail about the use of differential evolution to estimate the finite mixture models considered in the current paper is provided in Section 5.2. It will be seen that human or subjective judgement may need to be exercised even when this advanced optimization algorithm is adopted. Finally, it may be noted that the ECF estimator enjoys some desirable asymptotic properties such as strong consistency and asymptotic normality. See [Reference Yu83] for a more detailed discussion.
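For readers unfamiliar with differential evolution, the following self-contained Python sketch implements the classic DE/rand/1/bin scheme on a box-constrained problem. It is an illustrative toy, not the DEoptim implementation, and its strategy and control parameters (`weight`, `crossover`, population size) are assumptions that need not match the settings used for the results in this paper.

```python
import random


def differential_evolution(objective, lower, upper, pop_size=30, generations=300,
                           weight=0.8, crossover=0.9, seed=0):
    """Minimize objective over the box [lower, upper] with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(lower)
    pop = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    value = min(max(value, lower[d]), upper[d])  # clip to the box
                else:
                    value = pop[i][d]
                trial.append(value)
            # Selection: keep the trial vector only if it is no worse.
            score = objective(trial)
            if score <= scores[i]:
                pop[i], scores[i] = trial, score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def shifted_sphere(x):
    """Toy objective with minimum 0 at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_f = differential_evolution(shifted_sphere, [-5.0, -5.0], [5.0, 5.0])
```

Because candidate vectors are clipped to the box, the chosen lower and upper boundaries directly restrict which parameter values the search can return.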

5.2. Estimation results

A dataset on daily Bitcoin-USD (BTC-USD) adjusted close prices is used in empirical studies on the eight models in Table D.1. Each of Models I–V has two mixing components. The time period for the BTC-USD prices data is from 1 January 2020 to 29 May 2025 (1976 observations). The dataset was extracted from Yahoo Finance using R. It covers some periods of the Covid pandemic. In [Reference Siu64, Reference Siu66], Bitcoin price data covering some periods of the Covid pandemic were considered. However, [Reference Siu64] focused on risk evaluation based on Value at Risk and Expected Shortfall. [Reference Siu66] studied the impacts of a long memory in the conditional volatility and conditional non-normality on pricing Bitcoin options. The Bitcoin logarithmic returns data comprise 1975 observations, and the empirical studies were performed using R. The estimation results are discussed in this subsection.

The estimation results for the five two-component finite mixture models (Models I–V) are provided in Tables D.2–D.6 in the supplementary material, respectively. Specifically, for each of the five two-component finite mixture models, the estimates of the unknown parameters, the optimal value(s) of the objective function corresponding to the parameter estimates, and the lower and upper boundaries of the parameters are presented. For the finite mixture model of normal distributions (Model I), one set of parameter estimates is provided based on one set of lower and upper boundaries of the parameters. For the other four finite mixture models with non-normal mixing components (Models II–V), two sets of parameter estimates are provided based on two sets of lower and upper boundaries of the parameters. In each of Tables D.2–D.6, ‘fn’ represents the optimal value of the objective function in differential evolution evaluated at the parameter estimates. All the parameter estimates in Tables D.2–D.6 are in daily units. It will be seen from the results that the parameter estimates obtained from differential evolution may be different when different sets of lower and upper boundaries of the parameters are adopted.

In Table D.2, the lower and upper boundaries of $\mu_1$ and $\mu_2$ are given by $-100$ and 100 times the sample mean of the logarithmic returns, respectively. The upper boundaries of $\sigma_1$ and $\sigma_2$ are 100 times the sample standard deviation of the logarithmic returns. From the estimation results, around $34.87\%$ weight is allocated to the first component. The first component corresponds to a ‘bad’ economic regime with a negative mean return and high volatility. The second component corresponds to a ‘good’ economic regime with a positive mean return and low volatility.

There are two sets (Sets 1 and 2) of estimation results in each of Tables D.3–D.6. In each of Tables D.3 and D.4, the upper boundaries of parameters in Set 1 are 5 times their respective method of moments estimates from the sample logarithmic returns data. The upper boundaries of parameters in Set 2 are 10 times their respective method of moments estimates from the sample data. The estimates of $a_i$ and $b_i$ , $i = 1, 2$ , in Table D.4 are obtained from the respective estimates of $\mu^{\scriptstyle{sig}}_i$ and $\lambda_i$ . In each of Tables D.5 and D.6, for each of Sets 1 and 2, the lower and upper boundaries of $\mu$ are $-100$ and 100 times the sample mean of the logarithmic returns, respectively; the upper boundaries of $\sigma$ in Sets 1 and 2 are 100 times the sample standard deviation of the logarithmic returns. Again, the upper boundaries of negative SGa and SIG parameters in Set 1 are 5 times their respective method of moments estimates from the sample logarithmic returns data. The upper boundaries of negative SGa and SIG parameters in Set 2 are 10 times their respective method of moments estimates from the sample data. The estimates of a and b in Table D.6 are obtained from the respective estimates of $\mu^{\scriptstyle{sig}}$ and $\lambda$ . There are two reasons why the upper boundaries of negative SGa and SIG parameters are set as either 5 or 10 times their respective method of moments estimates. Firstly, some of the parameters may become large if their upper boundaries are large. Specifically, the parameters $\beta_i$ (or $\beta$ ) in the negative SGa distributions may become large if their upper boundaries are large. Secondly, the optimal value ‘fn’ may not improve significantly even if the upper boundaries are large. However, the computational time increases if the upper boundaries are large. In each of Tables D.3–D.6, the estimation results from Sets 1 and 2 could be quite different. 
This illustrates that the parameter estimates obtained from differential evolution depend on the chosen boundaries of the parameters. We may need to exercise subjective judgements even though the advanced optimization algorithm is adopted.

In Table D.3, the optimal value ‘fn’ (i.e. the objective function in (5.5)) increases from $1.699\,928\times10^{-10}$ to $2.519\,128\times10^{-9}$ when moving from Set 1 to Set 2. This indicates that the optimal value ‘fn’ may not improve when parameter bounds are relaxed. This happens because the solver did not find the global optimum and converged to a suboptimal point. Although the estimates of $\beta_1$ in Sets 1 and 2 are large, they are still interior optima (i.e. strictly less than the respective upper boundaries). Although the parameter estimates in Set 1 provide a better fit to the data than those in Set 2 according to the optimal value ‘fn’, the parameter estimates in Set 2 will be used to calculate the option prices in Section 6 to illustrate the mixture effect. In Table D.4, the optimal value ‘fn’ increases from $1.512\,852 \times 10^{-9}$ to $3.604\,965\times10^{-9}$ when moving from Set 1 to Set 2. This also indicates that the optimal value ‘fn’ may not improve when the upper boundaries become larger. Since the parameter estimate of b in Set 2 looks more reasonable than that in Set 1, the former will be used to calculate the option prices in Section 6. In Table D.5, the parameter estimates in Set 1 provide a better fit to the data than those in Set 2. Furthermore, the parameter estimate of $\sigma$ in Set 2 is very small (0.002 944 167), which may result in unreasonably small option prices. Consequently, the parameter estimates in Set 1 will be used to calculate the option prices in Section 6. In Table D.6, the estimates of p in Sets 1 and 2 are $0.012\,423\,47$ and $0.578\,201\,806$ , respectively. To illustrate the mixture effect, the parameter estimates in Set 2 may be better than those in Set 1. However, the estimate of $\sigma$ in Set 2 is very small ( $0.002\,358\,598$ ), which may result in unreasonably small option prices. Furthermore, the parameter estimates in Set 1 provide a better fit to the data than those in Set 2. 
Consequently, the parameter estimates in Set 1 will be used to calculate the option prices in Section 6.

Tables D.7–D.9 present the estimation results for the three non-mixture models (Models VI–VIII), respectively.

In Table D.7, both the maximum likelihood estimates (MLE) and the ECF estimates are provided. For the ECF estimates, the lower and upper boundaries of $\mu$ are $-100$ and 100 times the sample mean of the logarithmic returns, respectively. The upper boundary of $\sigma$ is 100 times the sample standard deviation of the logarithmic returns. The optimal value ‘fn’ (i.e. the objective function in (5.5)) evaluated at the ECF estimates is smaller than the optimal value ‘fn’ evaluated at the MLE, though the optimal values evaluated at both the MLE and the ECF estimates are reasonably small. This indicates that the ECF estimation method may be better than the MLE method when estimating the mean and standard deviation of a normal distribution. However, it is possible that the solver did not find the global optimum and converged to a suboptimal point. The ECF estimates will be used to calculate the option prices in Section 6. In each of Tables D.8 and D.9, the upper boundaries for the parameters in Sets 1 and 2 are 5 and 10 times their respective method of moments estimates, respectively. The estimates of a and b in Table D.9 are obtained from those of $\mu^{\scriptstyle{sig}}$ and $\lambda$ . In Table D.8, the parameter estimates in Set 2 provide a better fit to the data and will be used to calculate the option prices in Section 6. In Table D.9, the parameter estimates in Set 1 provide a better fit to the data and will be used to calculate the option prices in Section 6. Using ‘fn’ as a criterion, the fitting performances of Models I–VIII look similar, though Model VIII is the best among them.

6. An application to Bitcoin option pricing

Recently, there has been considerable interest in developing quantitative models for pricing Bitcoin options and options written on cryptocurrencies. For example, [Reference Hou, Wang, Chen and Härdle40, Reference Jalan, Matkovskyy and Aziz42, Reference Siu66, Reference Siu and Elliott67, Reference Venter and Maré81] considered the pricing of Bitcoin options under stochastic volatility and GARCH-type models. [Reference Siu65] constructed Bayesian lower and upper estimates for the prices of ether options under GARCH models by leveraging Bayesian non-linear expectation proposed in [Reference Siu64]. [Reference Cao and Celik13, Reference Hilliard and Ngo39] discussed the pricing of Bitcoin options under jump-diffusion models. [Reference Li, Arab, Liu, Liu and Han45, Reference Pagnottoni55] adopted the machine learning and neural network approaches to price Bitcoin options, respectively. Using the Black–Scholes–Merton model, [Reference Alexander, Chen and Imeraj1] discussed the pricing of crypto quanto and inverse options. From Figure C.1 (panel B) in the supplementary material, the daily logarithmic returns exhibit volatility clustering. Consequently, models with conditional heteroscedasticity and stochastic volatility may provide a better fit to the returns data than finite mixture models. There are three reasons why we consider finite mixture models for pricing Bitcoin options. First, closed-form expressions for the European call Bitcoin options are obtained under finite mixture models (Section 4). Secondly, pricing Bitcoin options under models with conditional heteroscedasticity and stochastic volatility was considered in the aforementioned literature. Lastly, by considering the finite mixture models, we can focus on the impact of non-normality of Bitcoin returns on option pricing.

In this section, the proposed finite mixture models are applied to price Bitcoin options. Specifically, the estimated models (Models I–V) based on real Bitcoin returns data in Section 5 are used to compute the prices of European-style call Bitcoin options. Three estimated non-mixture models (Models VI–VIII) are used for comparison. See Table D.1 for the details. For each of the eight models, a matrix of option prices with different strike prices and maturities is computed. The possible strike prices are $0.8 s_0$ , $0.85 s_0$ , $0.9 s_0$ , $0.95 s_0$ , $s_0$ , $1.05 s_0$ , $1.1 s_0$ , $1.15 s_0$ , and $1.2 s_0$ , where $s_0$ is the current Bitcoin price (i.e. the Bitcoin price at the end of the dataset). The maturities are 21, 42, 63, 126, and 252 trading days. They correspond to one, two, three, six, and twelve months, respectively. Since analytical expressions for the option prices are obtained for Models I–VIII, the option prices can be computed quickly. All option price computations were performed in R. The risk-free interest rate is assumed to be $4.33\%$ per annum (i.e. ${0.0433}/{252}$ per trading day), as taken from the US Federal Funds Effective Rate (FEDFUNDS) in May 2025. Note that the interest rate on Bitcoin is zero [Reference Bariviera, Basgalla, Hasperué and Naiouf5, Reference Siu and Elliott67].

Tables D.10–D.17 in the supplementary material show the prices of European-style call Bitcoin options with different levels of moneyness and maturities from Models I to VIII, respectively.

From Tables D.10–D.17, for each of the eight models (Models I–VIII), the option prices increase as the maturity T increases and decrease as the moneyness ${K}/{s_0}$ increases. These are necessary no-arbitrage conditions [Reference Merton52]. From Tables D.10 and D.11, the option prices from Model I are higher than those from Model II. One possible explanation is that the negative-shifted Gamma distribution has a finite upper limit of its support, which limits the impact of upside movements of the underlying Bitcoin price on the call price and so results in lower call prices. Comparing Table D.11 with Table D.12, the option prices from Model II are lower than those from Model III. One potential explanation is that the negative-shifted IG distribution has a heavier tail than the negative-shifted Gamma distribution. This heavy-tailedness may also explain why some option prices from Model III are higher than the respective prices from Model I, as can be seen by comparing Tables D.10 and D.12. From Table D.13, the option prices are unreasonably small when the maturity is short ($T = 21$ days) as well as when the option is deep out-of-money (${K}/{s_0} = 1.2$). This indicates that Model IV may not price short-lived options and out-of-money options accurately. For this reason, it will not be used to compute the Black–Scholes–Merton implied volatilities in the later part of this section. Comparing Tables D.10 and D.13, option prices from Model IV are lower than those from Model I. This may be partly explained by a relatively small volatility in component 1 of Model IV and a finite upper limit of the support of the negative-shifted Gamma distribution. Comparing Table D.14 with Tables D.10–D.13, Model V gives the most conservative estimates for the option prices among all five finite mixture models. This may be attributed to the combined effect of the heavy-tailedness of a negative-shifted IG distribution and the unbounded support of a normal distribution. Comparing the option prices in Tables D.15–D.17 with those in Tables D.10–D.14, the option prices from the non-mixture models are more conservative than the respective prices from the finite mixture models. This reflects that the finite mixture models provide the flexibility of incorporating two possible economic regimes, one of which gives less conservative option prices.
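The two monotonicity properties cited at the start of the preceding paragraph can be written compactly. Here $C(T, K)$ denotes the model price of a European call with maturity $T$ and strike $K$; this notation is introduced only for illustration and is not taken from the paper. Assuming no dividends and a non-negative risk-free rate, the conditions read:

```latex
C(T_1, K) \le C(T_2, K) \quad \text{for } T_1 < T_2,
\qquad
C(T, K_1) \ge C(T, K_2) \quad \text{for } K_1 < K_2 .
```

With $s_0$ fixed, ordering by strike $K$ is the same as ordering by moneyness ${K}/{s_0}$, so the entries of Tables D.10–D.17 can be checked directly against these inequalities.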

Figure D.1 in the supplementary material plots the Black–Scholes–Merton implied volatilities against moneyness from Models I–III and Model V for three different maturities. Figure D.2 plots the BSM implied volatilities against maturity from Models I–III and Model V for in-the-money (ITM), at-the-money (ATM), and out-of-money (OTM) options. For each moneyness and each maturity, the BSM implied volatility from a model is computed by minimizing the absolute difference between the option price from the model and the theoretical option price from the BSM model evaluated at the implied volatility. Again, the differential evolution algorithm is used for the minimization, with the lower and upper boundaries being 0 and 1, respectively.
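The implied-volatility calculation described above can be sketched as follows. This is a minimal illustration, not the author's code: the paper cites the R package DEoptim, whereas the sketch uses a small hand-written differential evolution loop in pure Python. The function names `bsm_call` and `implied_vol_de`, the population size, the number of generations, the scale factor, and the random seed are all assumptions made for illustration. Only the objective (the absolute price difference) and the search bounds 0 and 1 come from the text.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(s0, k, r, sigma, t):
    """BSM price of a European call (t in years, r continuously compounded)."""
    if sigma <= 0.0:
        # Zero-volatility limit: discounted intrinsic value.
        return max(s0 - k * math.exp(-r * t), 0.0)
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def implied_vol_de(price, s0, k, r, t, lo=0.0, hi=1.0,
                   pop_size=20, gens=200, f=0.7, seed=0):
    """Minimize |BSM(sigma) - price| over [lo, hi] by a one-dimensional
    DE/rand/1 scheme (hypothetical settings, chosen for illustration)."""
    rng = random.Random(seed)

    def loss(sigma):
        return abs(bsm_call(s0, k, r, sigma, t) - price)

    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    cost = [loss(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            trial = pop[a] + f * (pop[b] - pop[c])
            trial = min(max(trial, lo), hi)  # keep the trial inside the bounds
            trial_cost = loss(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best]
```

In use, a model price for a given moneyness and maturity would be passed as `price`. Because the BSM call price is increasing in the volatility, the objective has a single minimum on the interval, so a bracketing root-finder would serve equally well. The upper bound of 1 means that any implied volatility above 100% is reported at the boundary.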

From Figure D.1, all four finite mixture models can produce implied volatility smirks or skews. That is, the BSM implied volatilities for ITM options are higher than those for ATM and OTM options. The BSM implied volatilities from Model V are higher than those from the other three finite mixture models. This is in line with what was observed by comparing Table D.14 with Tables D.10–D.13. From Figure D.2, the four finite mixture models produce consistent term structures of the BSM implied volatilities for ITM, ATM, and OTM options. Specifically, for ITM, ATM, and OTM options, all four finite mixture models produce downward-sloping, nearly flat, and upward-sloping term structures of the BSM implied volatilities, respectively.

7. Conclusion

Finite mixture models were adopted for the canonical valuation of Bitcoin options in a discrete-time economy by integrating knowledge in actuarial science, financial mathematics, applied probability, statistics, econometrics, and data science. Specifically, the time-honored tool in actuarial science, namely the Esscher transform, was used to specify a pricing kernel. Novel finite mixture models were introduced to capture non-normality of returns and model uncertainty. Specifically, a finite mixture model of negative SGa distributions, a finite mixture model of negative SIG distributions, as well as a hybrid finite mixture model of normal, negative SGa, and negative SIG distributions were introduced. The finite mixture models were estimated using the empirical characteristic function estimation method. Analytical pricing formulas for a European call option were obtained for some finite mixture models. The pricing formulas provide quick and convenient ways to compute matrices of option prices and to calibrate the models to real option price data. Using real data on the adjusted close prices of Bitcoin-USD, the estimation results of eight models (five finite mixture models and three non-mixture models) were provided. Numerical results illustrating applications of the eight models to pricing European call Bitcoin options were presented. The results reveal that the hybrid mixture model with a normal distribution and a negative-shifted IG distribution gives the most conservative pricing result. It is also found that four finite mixture models produce implied volatility smirks/skews, and that the finite mixture models produce different patterns for term structures of the implied volatility for ITM, ATM, and OTM options. These results could be helpful to stakeholders in Bitcoin option markets such as traders, regulators, and individual investors.

The proposed finite mixture models and the present study have some limitations, which may suggest directions for further research. The models cannot incorporate conditional heteroscedasticity or stochastic volatility. Finite mixture models with either negative-shifted Gamma distributions or negative-shifted IG distributions impose finite upper limits on the supports of the distributions, which limits the capability of the models to capture large upward movements in Bitcoin returns. The estimation results based on differential evolution depend on the choice of the boundaries for the unknown parameters, which involves subjective judgement. Real Bitcoin option price data were not used in this study.

Acknowledgements

The author sincerely thanks the two reviewers for their positive and meticulous comments.

Funding information

The author also wishes to acknowledge the support from the Discovery Grant from the Australian Research Council (ARC) (Project Number: DP190102674).

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

Data

The data related to the empirical results found in Section 5 can be found at Yahoo Finance; they were extracted using R.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/jpr.2025.10048.

References

Alexander, C., Chen, D. and Imeraj, A. (2023). Crypto quanto and inverse options. Math. Finance 33, 1005–1043. DOI: 10.1111/mafi.12410.
Ardia, D., Mullen, K., Peterson, B., Ulrich, J. and Boudt, K. (2025). Global optimization by differential evolution. R package DEoptim. January 20, 2025. Version 2.2-8.
Badescu, A., Elliott, R. J. and Siu, T. K. (2009). Esscher transforms and consumption-based models. Insurance Math. Econom. 45, 337–347. DOI: 10.1016/j.insmatheco.2009.08.001.
Badescu, A., Kulperger, R. and Lazar, E. (2008). Option valuation with normal mixture GARCH models. Studies Nonlinear Dynam. Econometrics 12, 1580–1580.
Bariviera, A. F., Basgalla, M. J., Hasperué, W. and Naiouf, M. (2017). Some stylized facts of the Bitcoin market. Physica A 484, 82–90. DOI: 10.1016/j.physa.2017.04.159.
Bégin, J. F. and Godin, F. (2023). Option pricing under stochastic volatility models with latent volatility. Quant. Finance 23, 1079–1097. DOI: 10.1080/14697688.2023.2215496.
Bhat, H. S. and Kumar, N. (2012). Option pricing under a normal mixture distribution derived from the Markov tree model. Europ. J. Operat. Res. 223, 762–774. DOI: 10.1016/j.ejor.2012.07.003.
Bishop, C. (2006). Pattern Recognition And Machine Learning. Springer, New York.
Black, F. and Scholes, M. (1973). The pricing of options and corporate liabilities. J. Political Econom. 81, 637–659. DOI: 10.1086/260062.
Bryant, J. L. and Paulson, A. S. (1983). Estimation of mixing properties via distance between characteristic functions. Commun. Statist. Theory Meth. 12, 1009–1029. DOI: 10.1080/03610928308828512.
Bühlmann, H., Delbaen, F., Embrechts, P. and Shiryaev, A. N. (1996). No arbitrage, change of measure and conditional Esscher transforms. CWI Quart. 9, 291–317.
Bühlmann, H., Delbaen, F., Embrechts, P. and Shiryaev, A. N. (1998). On Esscher transforms in discrete finance models. ASTIN Bull. 28, 171–186. DOI: 10.2143/AST.28.2.519064.
Cao, M. and Celik, B. (2021). Valuation of Bitcoin options. J. Futures Markets 41, 1007–1026. DOI: 10.1002/fut.22214.
Ching, W. K., Siu, T. K. and Li, L. M. (2007). Pricing exotic options under a high-order Markovian regime switching model. J. Appl. Math. Decision Sci. 2007, 18014. DOI: 10.1155/2007/18014.
Duan, J. C., Popova, I. and Ritchken, P. (2001). Option pricing under regime switching. Quant. Finance 2, 116–132. DOI: 10.1088/1469-7688/2/2/303.
Elliott, R. J., Chan, L. L. and Siu, T. K. (2005). Option pricing and Esscher transform under regime switching. Ann. Finance 1, 423–432. DOI: 10.1007/s10436-005-0013-z.
Elliott, R. J. and Madan, D. B. (1998). A discrete time equivalent martingale measure. Math. Finance 8, 127–152. DOI: 10.1111/1467-9965.00048.
Elliott, R. J., Siu, T. K. and Badescu, A. (2011). Bond valuation under a discrete-time regime-switching term-structure model and its continuous-time extension. Managerial Finance 37, 1025–1047.
Elliott, R. J., Siu, T. K. and Chan, L. L. (2006). Option pricing for GARCH models with Markov switching. Internat. J. Theoret. Appl. Finance 9, 825–841. DOI: 10.1142/S0219024906003846.
Fan, J. (2014). Features of big data and sparsest solution in high confidence set. In Past, Present, and Future of Statistical Science, eds Lin, X., Genest, C., Banks, D. L., Molenberghs, G., Scott, D. W. and Wang, J. L. Chapman & Hall, New York, pp. 507–523.
Feuerverger, A. and McDunnough, P. (1981). On the efficiency of empirical characteristic function procedures. J. R. Statist. Soc. B 43, 20–27. DOI: 10.1111/j.2517-6161.1981.tb01143.x.
Feuerverger, A. and Mureika, R. A. (1977). The empirical characteristic function and its applications. Ann. Statist. 5, 88–97. DOI: 10.1214/aos/1176343742.
Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer, New York.
Fu, M. C., Li, B., Wu, R. and Zhang, T. (2022). Option pricing under a discrete-time Markov switching stochastic volatility with co-jump model. Frontiers Math. Finance 1, 137–160. DOI: 10.3934/fmf.2021005.
Gemmill, G. and Saflekos, A. (2000). How useful are implied distributions? Evidence from stock-index options. J. Derivatives 7, 83–91. DOI: 10.3905/jod.2000.319123.
Gerber, H. U. and Shiu, E. S. W. (1994). Option pricing by Esscher transforms (with discussions). Trans. Soc. Actuaries 46, 99–191.
Gerber, H. U. and Shiu, E. S. W. (1995). An actuarial bridge to option pricing. In Securitization of Insurance Risk: The 1995 Bowles Symposium. The Society of Actuaries, United States, pp. 45–62.
Gerber, H. U. and Shiu, E. S. W. (1996). Actuarial bridges to dynamic hedging and option pricing. Insurance Math. Econom. 18, 183–218. DOI: 10.1016/0167-6687(96)85007-4.
Godin, F., Lai, V. S. and Trottier, D. A. (2019). Option pricing under regime-switching models: Novel approaches removing path-dependence. Insurance Math. Econom. 87, 130–142. DOI: 10.1016/j.insmatheco.2019.04.006.
Godin, F. and Trottier, D. A. (2021). Option pricing in regime-switching frameworks with the extended Girsanov principle. Insurance Math. Econom. 99, 116–129. DOI: 10.1016/j.insmatheco.2021.02.007.
Goovaerts, M. J. and Laeven, R. J. A. (2008). Actuarial risk measures for financial derivative pricing. Insurance Math. Econom. 42, 540–547.
Guo, C. (2005). Option pricing with heterogeneous expectations. Financial Rev. 33, 81–92. DOI: 10.1111/j.1540-6288.1998.tb01398.x.
Hall, P. and Titterington, D. M. (1984). Efficient nonparametric estimation of mixture proportions. J. R. Statist. Soc. B 46, 465–473. DOI: 10.1111/j.2517-6161.1984.tb01319.x.
Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57, 357–384. DOI: 10.2307/1912559.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054. DOI: 10.2307/1912775.
Harrison, J. M. and Kreps, D. M. (1979). Martingales and arbitrage in multiperiod securities markets. J. Econom. Theory 20, 381–408. DOI: 10.1016/0022-0531(79)90043-7.
Harrison, J. M. and Pliska, S. R. (1981). Martingales and stochastic integrals in the theory of continuous trading. Stoch. Process. Appl. 11, 215–280. DOI: 10.1016/0304-4149(81)90026-0.
Harrison, J. M. and Pliska, S. R. (1983). A stochastic calculus model of continuous trading: Complete markets. Stoch. Process. Appl. 15, 313–316. DOI: 10.1016/0304-4149(83)90038-8.
Hilliard, J. E. and Ngo, J. T. D. (2022). Bitcoin: Jumps, convenience yields, and option prices. Quant. Finance 22, 2079–2091. DOI: 10.1080/14697688.2022.2109989.
Hou, A. J., Wang, W., Chen, C. Y. H. and Härdle, W. K. (2020). Pricing cryptocurrency options. J. Financial Econometrics 18, 250–279.
Ishijima, H. and Kihara, T. (2005). Option pricing with hidden Markov models. Institute of Finance, Waseda University, Tokyo.
Jalan, A., Matkovskyy, R. and Aziz, S. (2021). The Bitcoin options market: A first look at pricing and risk. Applied Economics 53, 2026–2041. DOI: 10.1080/00036846.2020.1854671.
Klibanoff, P., Marinacci, M. and Mukerji, S. (2005). A smooth model of decision making under ambiguity. Econometrica 73, 1849–1892. DOI: 10.1111/j.1468-0262.2005.00640.x.
Knight, J. L. and Yu, J. (2002). Empirical characteristic function in time series estimation. Econometric Theory 18, 691–721. DOI: 10.1017/S026646660218306X.
Li, L., Arab, A., Liu, J., Liu, J. and Han, Z. (2019). Bitcoin options pricing using LSTM-based prediction model and blockchain statistics. In Proc. 2019 IEEE Int. Conf. Blockchain, pp. 67–74.
Liew, C. C. and Siu, T. K. (2010). A hidden Markov regime-switching model for option valuation. Insurance Math. Econom. 47, 374–384. DOI: 10.1016/j.insmatheco.2010.08.003.
Liu, X., Shackleton, M. B., Taylor, S. J. and Xu, X. (2007). Closed-form transformations from risk-neutral to real-world distributions. J. Banking Finance 31, 1501–1520. DOI: 10.1016/j.jbankfin.2006.09.005.
Madan, D. B. and Seneta, E. (1987). Simulation of estimates using the empirical characteristic function. Internat. Statist. Rev. 55, 153–161. DOI: 10.2307/1403191.
Madan, D. B. and Seneta, E. (1990). The variance gamma (V.G.) model from share market returns. J. Business 63, 511–524. DOI: 10.1086/296519.
McLachlan, G. J., Lee, S. X. and Rathnayake, S. I. (2019). Finite mixture models. Ann. Rev. Statist. Appl. 6, 355–378. DOI: 10.1146/annurev-statistics-031017-100325.
McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. John Wiley, Hoboken, NJ. DOI: 10.1002/0471721182.
Merton, R. C. (1973). Theory of rational option pricing. Bell J. Econom. Manag. Sci. 4, 141–183. DOI: 10.2307/3003143.
Mitchell, M. (1998). An Introduction to Genetic Algorithms. MIT Press.
Monfort, A. and Pegoraro, F. (2012). Asset pricing with second-order Esscher transforms. J. Banking 36, 1678–1687. DOI: 10.1016/j.jbankfin.2012.01.014.
Pagnottoni, P. (2019). Neural network models for Bitcoin option pricing. Frontiers Artificial Intell. 2, 5. DOI: 10.3389/frai.2019.00005.
Price, K. V., Storn, R. M. and Lampinen, J. A. (2006). Differential Evolution: A Practical Approach To Global Optimization. Springer, Berlin.
Ritchey, R. J. (1990). Call option valuation for discrete normal mixtures. J. Financial Res. 13, 285–296. DOI: 10.1111/j.1475-6803.1990.tb00633.x.
Rombouts, J. V. K. and Stentoft, L. (2014). Bayesian option pricing using mixed normal heteroskedasticity models. Comput. Statist. Data Anal. 76, 588–605. DOI: 10.1016/j.csda.2013.06.023.
Rombouts, J. V. K. and Stentoft, L. (2015). Option pricing with asymmetric heteroskedastic normal mixture models. Int. J. Forecasting 31, 635–650. DOI: 10.1016/j.ijforecast.2014.09.002.
Rydén, T., Teräsvirta, T. and Åsbrink, S. (1998). Stylized facts of daily return series and the hidden Markov model. J. Appl. Econometrics 13, 217–244. DOI: 10.1002/(SICI)1099-1255(199805/06)13:3<217::AID-JAE476>3.0.CO;2-V.
Singleton, K. J. (2001). Estimation of affine asset pricing models using the empirical characteristic function. J. Econometrics 102, 111–141. DOI: 10.1016/S0304-4076(00)00092-0.
Siu, T. K. (2005). Fair valuation of participating policies with surrender options and regime switching. Insurance Math. Econom. 37, 533–552. DOI: 10.1016/j.insmatheco.2005.05.007.
Siu, T. K. (2011). Regime-switching risk: To price or not to price? Internat. J. Stoch. Anal. 2011, 843246.
Siu, T. K. (2023). Bayesian nonlinear expectation for time series modelling and its application to Bitcoin. Empirical Economics 64, 505–537. DOI: 10.1007/s00181-022-02255-z.
Siu, T. K. (2024). Bayesian lower and upper estimates for Ether option prices with conditional heteroscedasticity and model uncertainty. J. Risk Financial Manag. 17, 436. DOI: 10.3390/jrfm17100436.
Siu, T. K. (2025). Market consistent valuation for Bitcoin options with long memory in conditional volatility and conditional non-normality. J. Futures Markets 45, 917–945.
Siu, T. K. and Elliott, R. J. (2021). Bitcoin option pricing with a SETAR-GARCH model. Europ. J. Finance 27, 564–595. DOI: 10.1080/1351847X.2020.1828962.
Siu, T. K., Fung, E. S. and Ng, M. K. (2011). Option valuation with a discrete-time double Markovian regime-switching model. Appl. Math. Finance 18, 473–490. DOI: 10.1080/1350486X.2011.578457.
Siu, T. K., Tong, H. and Yang, H. (2001). Bayesian risk measures for derivatives via random Esscher transform. North Amer. Actuarial J. 5, 78–91. DOI: 10.1080/10920277.2001.10596000.
Siu, T. K., Tong, H. and Yang, H. (2004). On pricing derivatives under GARCH models: A dynamic Gerber–Shiu approach. North Amer. Actuarial J. 8, 17–31. DOI: 10.1080/10920277.2004.12254408.
Siu, T. K., Tong, H. and Yang, H. (2006). Option pricing under threshold autoregressive models by threshold Esscher transform. J. Indust. Manag. Optim. 2, 177–197.
Storn, R. and Price, K. (1997). Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11, 341–359. DOI: 10.1023/A:1008202821328.
Taylor, S. J. (2005). Asset Price Dynamics, Volatility, And Prediction. Princeton University Press.
Taylor, S. J. (2008). Modelling Financial Time Series, 2nd edn. World Scientific, Singapore.
Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. John Wiley, Chichester.
Tong, H. (1983). Threshold Models In Non-Linear Time Series Analysis. Springer, New York. DOI: 10.1007/978-1-4684-7888-4.
Tong, H. (1990). Non-Linear Time Series: A Dynamical System Approach. Oxford University Press. DOI: 10.1093/oso/9780198522249.001.0001.
Tong, H. (2007). Exploring volatility from a dynamical system perspective. In Proc. 56th Session Int. Statist. Institute. International Statistical Institute, The Hague.
Tran, K. C. (1998). Estimating mixtures of normal distributions via empirical characteristic function. Econometric Rev. 17, 167–183. DOI: 10.1080/07474939808800410.
Tsay, R. S. (2012). An Introduction To Analysis Of Financial Data With R. John Wiley, Hoboken, NJ.
Venter, P. J. and Maré, E. (2021). Univariate and multivariate GARCH models applied to Bitcoin futures option pricing. J. Risk Financial Manag. 14, 261. DOI: 10.3390/jrfm14060261.
Wilkens, S. (2005). Option pricing based on mixtures of distributions: Evidence from the Eurex index and interest rate futures options market. Derivatives Use, Trading Regulation 11, 213–231. DOI: 10.1057/palgrave.dutr.1840020.
Yu, J. (2004). Empirical characteristic function estimation and its applications. Econometric Rev. 23, 93–123. DOI: 10.1081/ETC-120039605.