1. Introduction
McKean–Vlasov stochastic differential equations (SDEs) have received significant attention in recent years due to their broad applications across various fields, such as stochastic control, stochastic games, and statistical physics. These equations were first introduced in [Reference McKean27], inspired by the kinetic theory of Kac [Reference Kac18], and differ from standard SDEs in that their coefficients additionally depend on the probability distribution of the solution process. In the literature, McKean–Vlasov SDEs are also referred to as mean-field SDEs, because they arise as the limits of weakly interacting particle systems as the number of particles tends to infinity (the so-called propagation of chaos [Reference Sznitman39]).
In the development of the aforementioned McKean–Vlasov SDEs, the noise processes considered have been primarily Gaussian. However, systems of practical relevance in physics and biology sometimes require modeling with non-Gaussian noise, as evidenced by abrupt jumps both in individual particles and in the population as a whole. To capture such natural phenomena, it is appropriate to consider (non-Gaussian) Lévy-type perturbations [Reference Applebaum1, Reference Duan11, Reference Liu, Song, Zhai and Zhang23]. In this paper, we focus on the following d-dimensional Lévy-type McKean–Vlasov SDE:

for
$t \in [0,T]$
, with a small parameter
$\varepsilon \gt 0$
. Here,
$\mathscr{L}_{X(t)}$
denotes the law of X(t) at time t, and W(t) is an m-dimensional standard Wiener process defined on the complete probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geqslant 0}, \mathbb{P})$
, with
$(\mathcal{F}_t)_{t\geqslant0}$
satisfying the usual conditions. Let
$(U, \mathcal{U}, \nu)$
be a
$\sigma$
-finite measure space with
$U \subseteq \mathbb{R}^d \setminus \{0\}$
, and let
$N({\textrm{d}} t, {\textrm{d}} z)$
be a Poisson random measure on
$\mathbb{R}^{+} \times U$
with intensity measure
$\nu({\textrm{d}} z) {\textrm{d}} t$
, independent of W(t). The compensated Poisson random measure is given by
$\tilde{N}({\textrm{d}} t, {\textrm{d}} z) = N({\textrm{d}} t, {\textrm{d}} z) - \nu({\textrm{d}} z) \,{\textrm{d}} t$
. The precise assumptions on the coefficients
$b:[0,T] \times \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \to \mathbb{R}^d$
,
$\sigma:[0,T] \times \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \to \mathbb{R}^{d \times m}$
, and
$h:[0,T] \times \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \times \mathbb{R}^d \to \mathbb{R}^d$
will be specified in later sections (see Section 2 for the definition of
$\mathcal{M}_2(\mathbb{R}^d)$
). We also remark that
$X_{\varepsilon}(t-\!)$
is the left limit at the point t, i.e.
$X_{\varepsilon}(t-\!)=\lim_{s\uparrow t}X_{\varepsilon}(s)$
.
The first aim of this paper is to consider the well-posedness of McKean–Vlasov SDEs of the form (1.1). Let us briefly review some previous works on the well-posedness of McKean–Vlasov SDEs with Brownian noise. Under the globally Lipschitz condition, the existence and uniqueness of strong solutions for McKean–Vlasov SDEs were obtained via the fixed-point theorem, for example, in [Reference Bahlali, Mezerdi and Mezerdi2, Reference Carmona and Delarue5]. Results for the case of a one-sided globally Lipschitz drift term and a globally Lipschitz diffusion term can be found in [Reference Dos Reis, Engelhardt and Smith10, Reference Wang40]. To deal with the situation where the coefficients are locally Lipschitz with respect to (w.r.t.) the measure and globally Lipschitz w.r.t. the state variable, Kloeden and Lorenz [Reference Kloeden and Lorenz20] developed a method for constructing interpolated Euler-like approximations. Recently, an extension to locally Lipschitz conditions w.r.t. the state variable under a uniform linear growth assumption was studied by Li et al. [Reference Li, Mao, Song, Wu and Yin22]; see also [Reference Ding and Qiao9]. Moreover, Hong et al. [Reference Hong, Hu and Liu16] examined the strong and weak well-posedness of a class of McKean–Vlasov SDEs whose drift and diffusion coefficients fulfill certain locally monotone conditions, although additional structural assumptions on the coefficients are needed there to ensure a unique solution.
Unlike the case of Brownian noise, the study of McKean–Vlasov SDEs with Lévy noise is still in its infancy, although some interesting works are emerging [Reference Frikha, Konakov and Menozzi12–Reference Graham14, Reference Jourdain, Méléard and Woyczynski17, Reference Mehri, Scheutzow, Stannat and Zangeneh28]. In particular, Hao and Li [Reference Hao and Li15] investigated a class of Lévy-type McKean–Vlasov SDEs satisfying global Lipschitz and linear growth conditions, established the existence and uniqueness of solutions, and explored their intrinsic link with nonlocal Fokker–Planck equations. The well-posedness results have been further developed for the case of superlinear drift, diffusion, and jump coefficients using the fixed-point theorem [Reference Mehri, Scheutzow, Stannat and Zangeneh28, Reference Neelima, Kumar, Dos Reis and Reisinger30]. Recently, Cavallazzi [Reference Cavallazzi6] has proven the strong well-posedness of McKean–Vlasov SDEs driven by a Lévy process having a finite moment of order
$\beta\in[1,2]$
and under standard Lipschitz assumptions on the coefficients.
Motivated by previous works on the Brownian case as well as the Lévy case, in this paper we aim to treat (1.1) imposing only locally Lipschitz conditions w.r.t. the state variable, allowing for a possibly superlinearly growing drift. We highlight that several essential difficulties arise. On the one hand, compared with classical SDEs, standard localization arguments cannot be applied directly due to the distribution-dependent coefficients. On the other hand, the non-Gaussian Lévy noise introduces challenges in both analytic and probabilistic aspects. Therefore, the results for classical SDEs (even with Lévy noise) or for McKean–Vlasov SDEs with Brownian noise cannot be extended directly to McKean–Vlasov SDEs with Lévy noise. In this paper, we develop a Lévy-type technique of Euler-like approximations to overcome the difficulties caused by the local conditions and the distribution dependency. The crux of our method, which differs from the Brownian case [Reference Kloeden and Lorenz20, Reference Li, Mao, Song, Wu and Yin22], lies in handling the drift terms under more general conditions as well as the jump terms.
Apart from the existence and uniqueness of solutions, we are further interested in establishing a stochastic averaging principle for (1.1) with drifts of polynomial growth under locally Lipschitz conditions w.r.t. the state variable. In fact, the averaging principle is a powerful method for extracting effective dynamics from complex systems arising in mechanics, mathematics, and other research areas. Since the pioneering work of Khasminskii [Reference Khasminskii19], the averaging principle for usual SDEs has received significant attention and has stimulated much of the study in controls, stability analyses, and optimization methods. Although the problems considered take different forms (usually classified in terms of the noise or the conditions satisfied by their nonlinear terms), the essence behind the averaging method is to simplify dynamical systems and obtain approximate solutions to differential equations; see, e.g., [Reference Ma and Kang24, Reference Pei, Xu and Wu33, Reference Xu, Duan and Xu42]. Based on the idea of stochastic averaging, the second goal of this paper is to show that the solution of (1.1) converges to the following averaged equation (with
$\bar{X}(0) = x_0$
) as
$\varepsilon$
tends to 0:

in a certain sense, under appropriate averaging conditions. Here,
$\bar{b}: \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \to \mathbb{R}^d$
,
$\bar{\sigma}: \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \to \mathbb{R}^{d \times m}$
, and
$\bar{h}: \mathbb{R}^d \times \mathcal{M}_2(\mathbb{R}^d) \times U \to \mathbb{R}^d$
are Borel measurable functions. For more details on (1.2), see Section 3.
Again, we must point out that, compared with the case of classical SDEs, there are far fewer results on the averaging principle for McKean–Vlasov SDEs due to their distribution-dependent feature. Moreover, the existing studies on averaging principles for McKean–Vlasov SDEs primarily focus on the Brownian case [Reference Shen, Song and Wu36, Reference Xu, Liu and Miao41]. For some interesting results involving other types of noise, e.g., fractional Brownian noise, we refer to [Reference Shen, Xiang and Wu37]. Nevertheless, to the best of the authors’ knowledge, the averaging principle for McKean–Vlasov SDEs with Lévy noise has not yet been considered. This inspires us to establish such an averaging principle.
The real-life applications of the Lévy-type McKean–Vlasov SDE (1.1) and its corresponding averaged equation (1.2) are not explored in this paper. Instead, we present an illustrative toy model in Example 4.1 and refer to [Reference Bahlali, Mezerdi and Mezerdi2, Reference Carmona and Delarue5, Reference Mehri, Scheutzow, Stannat and Zangeneh28] for discussions on potential applications of McKean–Vlasov SDEs with weak coefficient conditions in fields such as physics, finance, and population dynamics. To numerically approximate a solution of the McKean–Vlasov SDE in our setting, it is necessary to introduce an interacting particle system that is connected to the McKean–Vlasov SDE and is shown to converge to the true solution of the McKean–Vlasov SDE. This is popularly known as the propagation of chaos [Reference Sznitman39]. We present such a result in Appendix B. For more recent progress on propagation of chaos for jump processes, we refer to [Reference Cavallazzi6, Reference Frikha, Konakov and Menozzi12–Reference Graham14, Reference Jourdain, Méléard and Woyczynski17, Reference Mehri, Scheutzow, Stannat and Zangeneh28, Reference Neelima, Kumar, Dos Reis and Reisinger30], and the references therein.
The rest of this paper is arranged as follows. In Section 2, we focus on investigating the existence and uniqueness of solutions for a class of McKean–Vlasov SDEs with Lévy-type perturbations. In Section 3, we prove an averaging principle for the solutions of the considered McKean–Vlasov SDEs. In Section 4, we present a specific example to illustrate the theoretical results of this paper. The details of the proof of Lemma 4 and the propagation of chaos result are postponed to Appendix B.
2. Well-posedness of Lévy-type McKean–Vlasov SDEs
We start with some notation used in the sequel. Let
$|\cdot|$
and
$\langle \cdot, \cdot\rangle$
be the Euclidean vector norm and the scalar product in
$\mathbb{R}^d$
, respectively. For a matrix A, we use the Frobenius norm defined as
$\|A\|=\sqrt{\mathrm{tr}[AA^{\text{T}}]}$
, where
$A^{\text{T}}$
represents the transpose of the matrix A. Let
$\mathcal{M}(\mathbb{R}^d)$
denote the space of all probability measures on
$\mathbb{R}^d$
carrying the usual topology of weak convergence. Furthermore, for
$p\geqslant 1$
, let
$\mathcal{M}_p(\mathbb{R}^d)$
represent the subspace of
$\mathcal{M}(\mathbb{R}^d)$
as follows:

For
$\mu_1, \mu_2\in\mathcal{M}_p(\mathbb{R}^d)$
, the
$L^p$
-Wasserstein metric between
$\mu_1$
and
$\mu_2$
is defined as

where
$\mathscr{C}(\mu_1,\mu_2)$
denotes the collection of all probability measures on $\mathbb{R}^d\times\mathbb{R}^d$ whose marginal distributions are
$\mu_1$
,
$\mu_2$
, respectively. Then
$\mathcal{M}_p(\mathbb{R}^d)$
endowed with the above metric is a Polish space.
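In dimension one, and for empirical measures with equally many atoms, this metric has a closed form obtained by sorting the samples (the monotone rearrangement is an optimal coupling). The following pure-Python sketch is purely illustrative and not part of the paper's argument:

```python
def wasserstein_p_1d(xs, ys, p=2):
    """L^p-Wasserstein distance between two empirical measures on R with
    equally many atoms: the optimal coupling matches the sorted samples."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    return (sum(abs(x - y) ** p for x, y in zip(xs, ys)) / n) ** (1.0 / p)

# Dirac measures delta_0 and delta_1 are at distance 1 for every p:
print(wasserstein_p_1d([0.0], [1.0], p=2))  # -> 1.0
```

For measures with unequal numbers of atoms, or in higher dimension, the infimum over couplings no longer reduces to sorting and requires solving a transport problem.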
Let
$\delta_x$
be the Dirac delta measure centered at the point
$x\in\mathbb{R}^d$
. A direct calculation shows that
$\delta_x$
belongs to
$\mathcal{M}_p(\mathbb{R}^d)$
for any
$x\in\mathbb{R}^d$
. Another remark is that if
$\mu_1=\mathscr{L}_X$
and
$\mu_2=\mathscr{L}_Y$
are the distributions of the random variables X and Y, respectively, then

where
$\mathscr{L}_{(X,Y)}$
denotes the joint distribution of the random vector (X, Y).
Given
$T\gt 0$
, let
$D([0,T];\;\mathbb{R}^d)$
be the collection of all càdlàg (i.e. right continuous with left limits) functions from [0, T] to
$\mathbb{R}^d$
. Note that, at the endpoints of the closed interval [0, T], we stipulate that an element in
$D([0,T];\;\mathbb{R}^d)$
is right continuous at 0 and has a left limit at T, respectively. For
$1\leqslant p \lt\infty$
, we use
$L^p(\Omega;\;\mathbb{R}^d)$
to denote the family of all
$\mathbb{R}^d$
-valued random variables Y such that
$\mathbb{E}|Y|^p\lt\infty$
. Similarly, we denote by
$L^p(\Omega;\;D([0,T];\;\mathbb{R}^d))$
the subspace of all
$D([0,T];\;\mathbb{R}^d)$
-valued random variables X that satisfy
$\mathbb{E}[\sup_{0\leqslant t\leqslant T}|X(t)|^p]\lt\infty.$
Then, we present the following proposition.
Proposition 1.
-
(1) The space
$D([0,T];\;\mathbb{R}^d)$ , equipped with the supremum norm, is a Banach space.
-
(2) Let
$p\in [1,\infty)$ . The space
$L^p(\Omega;\;D([0,T];\;\mathbb{R}^d))$ , equipped with the norm
$||X||_{L^p}\;:\!=\;\left(\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X(t)|^p\right]\right)^{\frac{1}{p}}$ , is also a Banach space.
Proof. (1) The proof is primarily based on the properties of càdlàg functions, as outlined on p. 140 of [Reference Applebaum1]. Let
$B([0,T];\;\mathbb{R}^d)$
denote the space of bounded functions from [0, T] to
$\mathbb{R}^d$
. It is important to note that
$B([0,T];\;\mathbb{R}^d)$
, when equipped with the supremum norm, is a Banach space [Reference Applebaum1, p. 6]. Referring to the property (4) in [Reference Applebaum1, p. 140], it follows that
$D([0,T];\;\mathbb{R}^d)\subset B([0,T];\;\mathbb{R}^d)$
. Hence, any Cauchy sequence
$\{f_n\}$
of functions in
$D([0,T];\;\mathbb{R}^d)$
converges uniformly to some bounded function
$f\in B([0,T];\;\mathbb{R}^d)$
. The desired result is then obtained by applying the property (6) in [Reference Applebaum1, p. 140], which states that the uniform limit of a sequence of càdlàg functions on [0, T] is itself càdlàg. (2) The result follows directly from [Reference Pavliotis and Andrew32, Theorem 2.23 and Example 2.25], which establish the general completeness of
$L^p$
-spaces over Banach-valued random variables. For further references, see also [Reference Bogachev3, Theorem 4.1.3] and [Reference Brezis4, Theorem 4.8 and Comment 4 in Chapter 4].
We recall several useful inequalities that will be employed frequently throughout this paper. The first is Young’s inequality, stated as

Next, we list two elementary inequalities:

and

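The displays for (2.1)–(2.3) are not reproduced above. Judging from how they are invoked later (Young's inequality with a parameter $\epsilon$ and conjugate exponents such as $p=\frac{r}{r-2}$, $q=\frac{r}{2}$; the elementary inequality (2.3) with $l=\frac{r}{2}$ producing factors of the form $3^{\frac{r}{2}-1}$ for three terms), they are presumably the following standard forms; this reconstruction is an assumption, not taken from the source:

```latex
% Young's inequality with parameter \epsilon > 0, for conjugate exponents
% 1/p + 1/q = 1 and a, b \geqslant 0:
ab \;\leqslant\; \frac{\epsilon^{p}}{p}\,a^{p} + \frac{1}{q\,\epsilon^{q}}\,b^{q},
% and the elementary power inequalities, for l \geqslant 1:
|a+b|^{l} \;\leqslant\; 2^{l-1}\bigl(|a|^{l}+|b|^{l}\bigr),
\qquad
\Bigl(\sum_{i=1}^{n} a_{i}\Bigr)^{l} \;\leqslant\; n^{\,l-1}\sum_{i=1}^{n} a_{i}^{l},
\quad a_{i}\geqslant 0.
```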
In addition, noting that stochastic integrals w.r.t. compensated Poisson random measures are local martingales, we require the following preparatory results to proceed with the analysis.
Proposition 2.
Let
$H:[0,T]\times U \to \mathbb{R}^d$
be a Borel measurable function satisfying
$\int_0^t\int_U|H(s,z)|^2\nu({\textrm{d}} z)\,{\textrm{d}} s\lt\infty,$
almost surely. Define the stochastic integral
$I_t\;:\!=\;\int_0^t\int_U H(s,z)\tilde{N}({\textrm{d}} s,{\textrm{d}} z)$
. Then, the following estimates hold.
-
(i) For any
$p\geqslant 2$ and
$0\leqslant t\leqslant T$ , there exists a constant
$D_p\gt 0$ such that
(2.4)\begin{align}\mathbb{E}\left(\sup_{0\leqslant s\leqslant t}|I_s|^p\right)&\leqslant D_p\mathbb{E}\left[\left(\int_0^t\int_U|H(s,z)|^2\nu({\textrm{d}} z)\,{\textrm{d}} s\right)^{p/2}\right]\nonumber\\ & \quad +D_p\mathbb{E}\left[\int_0^t\int_U|H(s,z)|^p\nu({\textrm{d}} z)\,{\textrm{d}} s \right].\end{align}This result is commonly referred to as Kunita’s first inequality [Reference Applebaum1, Theorem 4.4.23].
-
(ii) For any
$1\leqslant p\leqslant 2$ and
$0\leqslant t\leqslant T$ , there exists a constant
$K_p\gt 0$ such that
(2.5)\begin{equation}\mathbb{E}\left(\sup_{0\leqslant s\leqslant t}|I_s|^p\right)\leqslant K_p\mathbb{E}\left[\left(\int_0^t\int_U|H(s,z)|^2\nu({\textrm{d}} z)\,{\textrm{d}} s\right)^{p/2}\right].\end{equation}
Proof. We emphasize that this proposition can be viewed as a special instance of Novikov’s result, which is rigorously established in [Reference Novikov31, Theorem 1]; see also [Reference Kühn and Schilling21, Theorem 4.20] for applications of Novikov’s result and its relation to variants of the Burkholder–Davis–Gundy (BDG) inequality. For the case
$p\geqslant 2$
, a proof of the conclusion (i) based on the BDG inequality for local martingales is presented in [Reference Mikulevicius and Pragarauskas29, Lemma 1], whereas an alternative approach utilizing Itô’s formula (applied to
$x\mapsto |x|^p$
) and Doob’s martingale inequality can be found in [Reference Applebaum1, Theorem 4.4.23]. For the case
$1\leqslant p\leqslant 2$
, the conclusion (ii) is stated in [Reference Dareiotis, Kumar and Sabanis7, Lemma 2.1] without proof. To ensure clarity for readers and maintain mathematical rigor, we provide a detailed proof for this case here. For convenience, we define the processes

and

for
$0\leqslant t\leqslant T$
, where
$1\leqslant p\leqslant 2$
and
$\varepsilon\gt0$
is a small parameter.
On the one hand, applying the integration by parts formula (also referred to as Itô’s product formula; see [Reference Applebaum1, Theorem 4.4.13]), we obtain

Noting that
$(A_t+\varepsilon)^{\frac{2-p}{4}}$
is a nonnegative and nondecreasing process, we deduce the bound

Since this estimate holds for all
$t\geqslant 0$
and the right-hand side remains nondecreasing, it follows that

By Hölder’s inequality [Reference Mao26, p. 5], we further derive

On the other hand, applying the Itô isometry for integrals with respect to compensated Poisson random measures yields

By Doob’s martingale inequality [Reference Applebaum1, Theorem 2.1.5], we then obtain

Combining (2.6) and (2.7), and using the fact that
$\mathbb{E}\big[\big(\sup_{0\leqslant s\leqslant t} |J_s|\big)^2\big]=\mathbb{E}\big[\sup_{0\leqslant s\leqslant t} |J_s|^2\big]$
, we arrive at

The result follows by letting
$\varepsilon\to0$
. This completes the proof.
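The Itô isometry used in the proof, $\mathbb{E}|I_t|^2=\mathbb{E}\int_0^t\int_U|H(s,z)|^2\nu({\textrm{d}} z)\,{\textrm{d}} s$, can be checked by simulation. The sketch below is a toy illustration with $H(s,z)=z$ and a hypothetical finite intensity measure $\nu({\textrm{d}} z)=\lambda\,\mathrm{Unif}(-1,1)({\textrm{d}} z)$ (all parameter values are arbitrary choices, not from the paper):

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method: count uniforms whose running product drops below e^{-lam}."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def terminal_value(T, lam, rng):
    """I_T for H(s, z) = z under nu(dz) = lam * Uniform(-1, 1)(dz).
    The compensator T * lam * E[z] vanishes since the jump law is symmetric."""
    n_jumps = poisson(lam * T, rng)
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_jumps))

rng = random.Random(0)
T, lam, n_paths = 1.0, 5.0, 20000
mc_second_moment = sum(terminal_value(T, lam, rng) ** 2
                       for _ in range(n_paths)) / n_paths
isometry_value = T * lam * (1.0 / 3.0)   # T * int_U z^2 nu(dz), with E[z^2] = 1/3
print(mc_second_moment, isometry_value)  # close up to Monte Carlo error
```

With 20000 paths the Monte Carlo estimate agrees with the isometry value to within a few percent; the constants $D_p$, $K_p$ in (2.4)–(2.5) are not computed here, only the underlying second-moment identity is illustrated.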
2.1. Formulation of the well-posedness results
This section is dedicated to establishing the existence and uniqueness theorem for the solutions of the d-dimensional Lévy-type McKean–Vlasov SDEs, i.e.

for
$t\in[0,T]$
with initial condition
$X(0)=x_0$
. The functions b,
$\sigma$
, and h are defined as follows:

where b,
$\sigma$
, and h are Borel measurable functions. We now proceed by providing the precise definition of a solution to (2.8).
Definition 1. We say that (2.8) admits a unique strong solution if there exists an
$\{\mathcal{F}_t\}_{0\leqslant t\leqslant T}$
-adapted
$\mathbb{R}^d$
-valued càdlàg stochastic process
$(X(t))_{t\in[0,T]}$
such that
-
(i)
$X(t)=x_0+\int_0^t b\left(s,X(s-\!),\mathscr{L}_{X(s)}\right) \,{\textrm{d}} s+\sigma\left(s,X(s-\!),\mathscr{L}_{X(s)}\right)\,{\textrm{d}} W(s)+\int_{U}h\left(s,X(s-\!),\mathscr{L}_{X(s)},z\right)\tilde{N}({\textrm{d}} s,{\textrm{d}} z)$ ,
$t\in[0,T]$ ,
$\mathbb{P}$ -almost surely;
-
(ii) if
$Y=(Y(t))_{t\in[0,T]}$ is another solution with
$Y(0)=x_0$ , then
$\mathbb{P}(X(t)=Y(t) \hbox{ for all } t\in[0,T])=1.$
Assume that there exists a constant
$\kappa\geqslant2$
such that the following assumptions hold.
Assumption 1. (One-sided locally Lipschitz condition on the state variable.) For every
$R\gt0$
, there exists a positive constant
$L_R$
such that for any
$t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
with
$|x|\vee|y|\leqslant R$
, and
$\mu\in\mathcal{M}_2(\mathbb{R}^d)$
,

Here, the symbol ‘
$\vee$
’ denotes the maximum, i.e. $a \vee b = \max\{a, b\}$.
Assumption 2. (Globally Lipschitz condition on the measure.) There exists a positive constant L such that, for any
$t\in[0,T]$
,
$x\in\mathbb{R}^d$
, and
$\mu_1,\mu_2\in\mathcal{M}_2(\mathbb{R}^d)$
,

Assumption 3. (Continuity.) For any
$t\in[0,T]$
,
$b(t,\cdot,\cdot), \sigma(t,\cdot,\cdot)$
, and
$\int_Uh(t,\cdot,\cdot,z)\nu({\textrm{d}} z)$
are continuous on
$\mathbb{R}^d\times\mathcal{M}_2(\mathbb{R}^d).$
Assumption 4. (One-sided linear and global linear growth condition.) There exists a positive constant K such that, for any
$t\in[0,T]$
,
$x\in\mathbb{R}^d$
, and
$\mu\in\mathcal{M}_2(\mathbb{R}^d)$
,

Assumption 5. (
$\kappa$
-order growth condition on the drift coefficient.) There exists a positive constant
$K_1$
such that, for any
$t\in[0,T]$
,
$x\in\mathbb{R}^d$
, and
$\mu\in\mathcal{M}_{2}(\mathbb{R}^d)$
,

Assumption 6. (r-order moment condition for the initial data.) Consider
$x_0\in L^r(\Omega;\;\mathbb{R}^d)$
for some
$r\geqslant \max\{\kappa^2/2,4\}$
, i.e.
$\mathbb{E}|x_0|^r\lt\infty.$
Assumption 7. (Additional growth conditions and Lipschitz type conditions on the jump coefficient h.) There exists a positive
$K_2$
such that, for any
$t\in[0,T]$
,
$x\in\mathbb{R}^d$
, and
$\mu\in\mathcal{M}_{2}(\mathbb{R}^d)$
,

In addition, if
$\kappa \gt 2$
, there exist constants
$K_3, L^{\prime}\gt 0$
such that, for any
$t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
, and
$\mu,\mu_1,\mu_2\in\mathcal{M}_{2}(\mathbb{R}^d)$
,


and for every
$R\gt0$
, there exists a constant
$L_R^{\prime}\gt 0$
such that for any
$t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
with
$|x|\vee|y|\leqslant R$
, and
$\mu\in\mathcal{M}_2(\mathbb{R}^d)$
,

The main result of this section is stated as follows.
Theorem 1. (Well-posedness.) Let Assumptions 1–7 be satisfied. Then (2.8) admits a unique strong solution
$(X(t))_{t\in[0,T]}$
$\in L^{\kappa}(\Omega;\;\mathbb{R}^d)$
with the initial value
$X(0)=x_0$
, where
$\kappa\geqslant2$
. Moreover, the following estimate holds:

where
$C\;:\!=\;C(T,r,\mathbb{E}|x_0|^r)$
is a positive constant. Here,
$r\geqslant \max\{\kappa^2/2,4\}$
.
Remark 1. We emphasize that the conditions in Assumptions 1–7 are carefully chosen, and the results in Theorem 1 are broadly applicable.
-
(i) The one-sided locally Lipschitz condition in Assumption 1 is weaker than the classical locally Lipschitz condition. In fact, it is clear that the locally Lipschitz condition implies the one-sided locally Lipschitz condition (via the Cauchy–Schwarz inequality). However, the converse is false. For example, consider
$b(t,x,\mu)=x^3-x^{\frac{1}{3}}+t+\int_{\mathbb{R}} z\mu({\textrm{d}} z)$ in
$\mathbb{R}$ . For
$|x|\vee|y|\leqslant R$ , we have
\begin{align} \langle x-y, b(t,x,\mu)-b(t,y,\mu)\rangle &=|x-y|^2\left(x^2+xy+y^2\right)-(x-y)\left(x^{\frac{1}{3}}-y^{\frac{1}{3}}\right) \notag\\&\leqslant 3R^2|x-y|^2, \notag \end{align}
where we have used that $(x-y)(x^{\frac{1}{3}}-y^{\frac{1}{3}})\geqslant0$ for all x, y. Thus, b is one-sided locally Lipschitz but not locally Lipschitz, since $x^{\frac{1}{3}}$ fails to be Lipschitz near the origin.
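This example can be probed numerically. The throwaway sketch below drops the t- and measure-dependent parts of b, since they cancel in the difference $b(t,x,\mu)-b(t,y,\mu)$; the sample size and the radius R are arbitrary illustrative choices:

```python
import math
import random

def b(x):
    """Drift from the example with the t- and measure-dependent parts dropped."""
    cbrt = math.copysign(abs(x) ** (1.0 / 3.0), x)  # real cube root
    return x ** 3 - cbrt

R = 2.0
rng = random.Random(1)
pairs = [(rng.uniform(-R, R), rng.uniform(-R, R)) for _ in range(10000)]

# One-sided locally Lipschitz: the one-sided quotient never exceeds 3 R^2 on [-R, R].
one_sided = max((x - y) * (b(x) - b(y)) / (x - y) ** 2
                for x, y in pairs if x != y)
print(one_sided <= 3 * R ** 2)         # True

# Not locally Lipschitz: the two-sided quotient blows up near the origin.
print(abs(b(1e-12) - b(0.0)) / 1e-12)  # huge, since the cube root is not Lipschitz at 0
```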
-
(ii) In contrast to the one-sided (globally) Lipschitz condition in the recent paper [Reference Neelima, Kumar, Dos Reis and Reisinger30], which asserts that there exists a constant
$C\gt 0$ such that for any
$x,y\in\mathbb{R}^d$ and
$\mu\in\mathcal{M}_2(\mathbb{R}^d),$
\begin{align} &\langle x-y, b(t,x,\mu)-b(t,y,\mu)\rangle+\|\sigma(t,x,\mu)-\sigma(t,y,\mu)\|^2\notag\\ &\quad +\int_U\left|h(t,x,\mu,z)-h(t,y,\mu,z)\right|^2\nu({\textrm{d}} z)\leqslant C|x-y|^2,\notag \end{align}
our Assumption 1 involves the maximum ‘$\vee$’ instead of the sum ‘
$+$ ’. This makes the condition in Assumption 1 weaker in some cases. For instance, consider b as a one-sided locally Lipschitz function and
$\sigma=h=x$ with
$\nu(U)\lt\infty$ . In this case, Assumption 1 holds, but the one-sided (globally) Lipschitz condition in [Reference Neelima, Kumar, Dos Reis and Reisinger30] is not satisfied.
-
(iii) The result simplifies to the case of pure Brownian motion when
$h\equiv 0$ . In contrast to the Brownian motion model considered in [Reference Li, Mao, Song, Wu and Yin22], where the drift coefficient is required to satisfy a linear growth condition, the present framework imposes only a one-sided linear growth condition on the drift coefficient b. Furthermore, b is permitted to exhibit polynomial growth w.r.t. the state variable, as specified in Assumptions 4 and 5.
-
(iv) Referring to [Reference Sato35, Theorem 25.3], when the jump coefficient h is a submultiplicative function with respect to z, the growth conditions in Assumption 7 can be interpreted as requiring that the jump measure
$[\nu]_{U}$ has a bounded
$|h|^r$ -moment or, equivalently, the associated Lévy motion has bounded
$|h|^r$ -moments, for every
$t\in[0,T]$ ,
$x\in\mathbb{R}^d$ , and
$\mu\in\mathcal{M}_{2}(\mathbb{R}^d)$ , where
$r\geqslant \max\{\kappa^2/2,4\}$ and
$\kappa\geqslant 2$ . In particular, the associated Lévy motion can be said to have bounded r-order moments when
$h(t,x,\mu,z)=z$ . We remark that while Brownian motion can be considered a 2-stable Lévy motion, our assumptions exclude applications involving jump measures associated with
$\alpha$ -stable Lévy motions for
$0\lt\alpha\lt 2$ . This exclusion arises because such
$\alpha$ -stable Lévy motions possess r-order moments only for
$r\lt\alpha$ ; see [Reference Sato35, Example 25.10]. For a recent study addressing the strong well-posedness of McKean–Vlasov SDEs driven by Lévy noise with finite moments of order
$\beta\in[1,2]$ , we refer to [Reference Cavallazzi6]. However, it should be pointed out that the assumptions in [Reference Cavallazzi6] regarding the coefficients with respect to both the space variable and the measure remain within the globally Lipschitz framework.
2.2. Euler-type approximation and auxiliary lemmas
A key aspect of our approach to proving Theorem 1 is the construction of an Euler-like sequence for the McKean–Vlasov SDE (2.8). Once we demonstrate that this sequence is Cauchy in an appropriate complete space (specifically,
$L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
, as is shown later), we can conclude that there exists a limiting process, which is indeed the desired solution to (2.8).
To this end, let
$T\gt 0$
be given, and consider the equidistant partition of the interval [0, T]. For any integer
$n\geqslant1$
, define
$h_n=\frac{T}{n}$
and
$t_k^n=kh_n$
,
$k=0,1,\ldots, n$
. For a fixed k (
$0\leqslant k \leqslant n-1$
) and
$t\in(t_k^n, t_{k+1}^n]$
, we analyze the following approximation:

where
$\mu^{(n)}_{t_k^n}=\mathscr{L}_{X^{(n)}(t_k^n)}$
denotes the law of
$X^{(n)}(t_k^n)$
. Observe that for each fixed k, if the initial value
$X^{(n)}(t_k^n)$
and the distribution
$\mu^{(n)}_{t_k^n}$
(at the left endpoint
$t_k^n$
) are known, then (2.10) reduces to a standard SDE that is independent of the law of
$X^{(n)}(t)$
. We now establish, by induction, the existence and uniqueness of the solution to (2.10).
In fact, for
$k=0$
and
$t\in[0,t_1^n]$
, the distribution is
$ \mu^{(n)}_{0}=\mathscr{L}_{X^{(n)}(0)}=\mathscr{L}_{x_0}$
. Applying Assumptions 1 and 4, we observe that the coefficients in (2.10) (with
$k=0$
) satisfy

and

Referring to [Reference Majka25, Theorem 1.1], the SDE (2.10) (with $k=0$) admits a unique solution on
$[0,t_1^n]$
. Furthermore, by Assumption 5, it follows that for
$r\geqslant\max\{\frac{\kappa^2}{2},4\}$
there exists a positive constant C such that

whose proof is quite similar to that of Lemma 1, and we omit the details here. Therefore, we can define
$X^{(n)}(t_1^n)$
(which satisfies
$\mathbb{E}|X^{(n)}(t_1^n)|^r\lt\infty$
) and
$\mu^{(n)}_{t_1^n}=\mathscr{L}_{X^{(n)}(t_1^n)}$
.
For
$k=1$
and
$t\in(t_1^n,t_2^n]$
, we can use
$(X^{(n)}(t_1^n),\mu^{(n)}_{t_1^n})$
in place of
$(X^{(n)}(0),\mu^{(n)}_{0})$
and repeat the above procedure. Inductively, for any
$k=0,1,\ldots, n-1$
and
$t\in(t_k^n, t_{k+1}^n]$
, we obtain the existence and uniqueness of the solution to the SDE (2.10) as well as the corresponding estimate

by similar arguments.
At this point, we define
$[t]_n=t_k^n$
for all
$t\in (t_k^n,t_{k+1}^n]$
, where
$k=0,1,\ldots, n-1$
. Then, for
$t\in[0,T]$
, we introduce the following approximating SDE

with the initial value
$X^{(n)}(0)=x_0$
, where
$\mu^{(n)}_{[t]_{n}}=\mathscr{L}_{X^{(n)}([t]_{n})}$
. According to the previously presented procedures and results for (2.10), we conclude that there exists a unique solution to (2.12). In fact, for each
$n\geqslant 1$
and
$t\in[0,T]$
, we can always find a certain
$k_\ast$
(
$0\leqslant k_\ast \leqslant n-1$
) such that
$t\in (t_{k_\ast}^n,t_{{k_\ast}+1}^n]$
. Then, the solution to (2.12) can be written as

and it is well-defined based on the results for (2.10) with
$k=0,1,\ldots,k_\ast$
. Moreover, we have the following estimate

Under Assumption 6, which requires that the initial data
$x_0$
satisfies
$\mathbb{E}|x_0|^r\lt\infty$
with
$r\geqslant\max\{\frac{\kappa^2}{2},4\}\geqslant\kappa$
, we deduce that
$X^{(n)}\in L^{r}(\Omega;\;D([0,T];\;\mathbb{R}^d))\subset L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
. Hence, the stochastic processes
$\{X^{(n)}(t)\}_{n\geqslant1}$
given by (2.12) form a sequence in
$L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
. To demonstrate that this sequence is Cauchy, we require the following two auxiliary lemmas.
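Before proceeding, the scheme (2.12) can be sketched numerically. Since $\mu^{(n)}_{[t]_n}$ is not computable exactly, the sketch below replaces it with the empirical measure of an interacting particle system (in the spirit of the propagation-of-chaos result in Appendix B). The one-dimensional coefficients, jump law, and all parameter values are toy choices for illustration only, not the paper's scheme verbatim:

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's method for a Poisson(lam) sample."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def euler_particles(T=1.0, n=50, n_particles=200, lam=2.0, seed=0):
    """Euler-type scheme: the law mu^{(n)}_{[t]_n} is frozen at the left grid
    point t_k^n and approximated by the empirical measure of the particles.
    Toy coefficients (purely illustrative):
        b(t, x, mu)    = -x + mean(mu),
        sigma          = 0.5,
        h(t, x, mu, z) = z,  with nu = lam * Uniform(-1, 1) (compensated).
    """
    rng = random.Random(seed)
    h_n = T / n                                            # step size T/n
    x = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # samples of x_0
    for _ in range(n):
        frozen_mean = sum(x) / n_particles   # empirical stand-in for mu at [t]_n
        new_x = []
        for xi in x:
            drift = (-xi + frozen_mean) * h_n
            diffusion = 0.5 * math.sqrt(h_n) * rng.gauss(0.0, 1.0)
            jumps = sum(rng.uniform(-1.0, 1.0)
                        for _ in range(sample_poisson(lam * h_n, rng)))
            # the compensator lam * h_n * E[z] is zero for this symmetric jump law
            new_x.append(xi + drift + diffusion + jumps)
        x = new_x
    return x

particles = euler_particles()
print(len(particles), sum(v * v for v in particles) / len(particles))
```

The frozen empirical mean plays the role of $\mu^{(n)}_{[t]_n}$ within each subinterval, so each step integrates a classical (non-distribution-dependent) SDE, mirroring the inductive construction of (2.10).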
Lemma 1. (Uniform boundedness property.) Under Assumptions 4, 6, and 7, for any
$T\gt 0$
, there exists a positive constant
$C_r$
(independent of n) such that

Proof. For
$r\geqslant\max\{\frac{\kappa^2}{2},4\}$
and
$t\in[0,T]$
, applying Itô’s formula [Reference Applebaum1, Theorem 4.4.7] to
$|x|^r$
, along with the identity
$\nu({\textrm{d}} z)\,{\textrm{d}} t = N({\textrm{d}} t, {\textrm{d}} z) - \tilde{N}({\textrm{d}} t, {\textrm{d}} z)$
, yields that

By virtue of Assumption 4, Young’s inequality (2.1) (with
$\epsilon=1$
,
$p=\frac{r}{r-2}$
, and
$q=\frac{r}{2}$
), Hölder’s inequality, and the elementary inequality (2.3) (with
$l=\frac{r}{2}$
), one can estimate the second term of (2.15) by

Analogously, the third and fourth terms of (2.15) can be estimated by

and

respectively. Furthermore, note that the map
$y\mapsto|y|^r$
is of class
$C^2$
and the remainder formula for
$|y|^r$
gives

for any
$y, b\in\mathbb{R}^d$
. The last term on the right-hand side of (2.15) can thus be estimated as

Denote the upper bound in (2.20) by
$N_t$
. Substituting (2.16)–(2.20) into (2.15), taking the supremum over [0, u] for
$u\in[0,T]$
and then taking expectations gives that

where

is indeed a local martingale. On the one hand, by the BDG inequality (for the Brownian case) [Reference Mao26, Theorem 7.3 in Chapter 1] and the inequality (2.5) (with
$p=1$
) in Proposition 2, there exists a constant
$C_2\gt 0$
such that

Applying Assumption 4 yields

Then, due to Young’s inequality (2.1) (with
$\epsilon=\frac{1}{2C_2(r-1)}$
,
$p=\frac{r}{r-1}$
, and
$q=r$
), Hölder’s inequality, the elementary inequality (2.3) (with
$l=\frac{r}{2})$
and Lyapunov’s inequality, one can further conclude

On the other hand, utilizing Assumptions 4 and 7, along with Young’s inequality (2.1) (with
$\epsilon=1$
,
$p=\frac{r}{r-2}$
, and
$q=\frac{r}{2}$
), the elementary inequality (2.3) (with
$l=\frac{r}{2})$
, and Lyapunov’s inequality, we obtain the following estimate for the supremum of
$|N_t|$
:

Note that, by (2.13), we have

By combining all the estimates from (2.22) to (2.24) and applying Grönwall’s inequality, we deduce from (2.21) that

where
$\widehat{C}_r=K(r-2)\frac{r^2+r+2C_1}{r}+4\cdot3^{\frac{r}{2}-1}(r+1)+C_2^{r}2^{r+1}(r-1)^{r-1}(2K)^{r}(3T)^{\frac{r}{2}-1}+4C_1\cdot\left(\frac{2K}{r}3^{\frac{r}{2}-1}+K_2\right)$
. It is evident that the positive constant
$C_r$
depends on r, T, K,
$K_2$
, and the initial condition
$x_0$
, but is independent of n. Thus, the proof is complete.
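The Grönwall step at the end of the proof can be illustrated with its discrete analogue (a generic sketch, not tied to the constants above): if $u_n \leqslant a + b\sum_{k<n} u_k$ with $a, b \geqslant 0$, then $u_n \leqslant a(1+b)^n$.

```python
def gronwall_bound(a, b, n):
    """Closed-form bound a * (1 + b)^n from the discrete Gronwall inequality."""
    return a * (1.0 + b) ** n

# Worst case: the recursion with equality, u_n = a + b * sum(u_k, k < n),
# which solves exactly to u_n = a * (1 + b)^n.
a, b, N = 1.0, 0.1, 50
u = []
for n in range(N):
    u.append(a + b * sum(u))
print(all(u[n] <= gronwall_bound(a, b, n) + 1e-9 for n in range(N)))  # True
```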
Lemma 2. (Time Hölder continuity.) Let Assumptions 4–7 hold. For any initial condition
$x_0\in L^r(\Omega;\;\mathbb{R}^d)$
with
$r\geqslant\kappa^2/2$
, there exists a positive constant
$C_{\kappa}$
such that, for any
$0\leqslant s\leqslant t\leqslant T$
with
$|t-s|\leqslant1$
,

Proof. It follows from (2.12) that

By taking expectations on both sides, one obtains

We proceed by estimating
$B_1$
,
$B_2$
, and
$B_3$
individually. To maintain clarity, we present only the core estimation steps for each term. By applying Hölder’s inequality, Assumption 5, Lyapunov’s inequality, and estimate (2.14), we derive the following bound for
$B_1$
:

For
$B_2$
, using the BDG inequality, Hölder’s inequality, Assumption 4, and estimate (2.14), we have

where
$M_{\kappa}=[\kappa^{\kappa+1}/2(\kappa-1)^{\kappa-1}]^{\frac{\kappa}{2}}$
. For
$B_3$
, using Kunita’s first inequality (i.e. the inequality (2.4) in Proposition 2), Hölder’s inequality, Assumptions 4 and 7, and estimate (2.14), we obtain

where D is a positive constant dependent on
$\kappa$
. Consequently, the desired assertion follows by substituting the above estimates on
$B_1$
,
$B_2$
, and
$B_3$
into (2.26) and then taking the supremum over n.
With Lemmas 1 and 2 established, we proceed to demonstrate the following result.
Lemma 3. (Cauchy sequences.) The sequence
$\{X^{(n)}(t)\}_{n\geqslant1}$
given by (2.12) is a Cauchy sequence in
$L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
. Specifically, for any
$n,m\geqslant1$
, the following holds:

Proof. Note that, for
$t\in[0,T]$
, the difference between
$X^{(n)}$
and
$X^{(m)}$
satisfies the following equation:

To facilitate the analysis, we define the stopping time:

for each
$R\gt 0$
. The stopping time technique is employed here to ensure boundedness of the processes
$X^{(n)}$
and
$X^{(m)}$
up to
$\tau_R$
, leveraging the fact that (2.12) describes a classical (non-distribution-dependent) SDE. It is clear that, for
$0\leqslant t\leqslant \tau_R\wedge T$
, we have
$|X^{(n)}(t-\!)|\leqslant R$
and
$|X^{(m)}(t-\!)|\leqslant R$
. Then, by De Morgan’s law, we arrive at

where
$\mathbb{I}_A$
is the indicator function of the set A.
In the subsequent analysis, we estimate each term
$J_1$
and
$J_2$
on the right-hand side of (2.28).
(1) Estimation of the term
$J_1$
. We note that

where the finiteness of the term is guaranteed by Lemma 1. By applying Itô’s formula, we obtain the following representation:

where the individual terms
$J_{i,R}$
, for
$i=1,\ldots,6$
, are given by

In order to take the supremum over time and the expectation, we need to estimate
$\mathbb{E}\big[\sup_{0\leqslant t\leqslant u}J_{i,R}(t)\big]$
for
$i=1, \ldots, 6$
.
Note that the terms
$J_{i,R}$
for
$i=1,2,3$
are standard Lebesgue integrals, and can be estimated in a similar manner. For any
$u\in[0,T]$
, applying Assumptions 1 and 2, we derive

By further applying Young’s inequality (2.1) (with
$\epsilon=1$
,
$p=\frac{\kappa}{\kappa-1}$
and
$q=\kappa$
), we obtain

Analogously, by Assumptions 1 and 2 and Young’s inequality (2.1) (with
$\epsilon=1$
,
$p=\frac{\kappa}{\kappa-2}$
,
$q=\frac{\kappa}{2}$
), we derive the following bounds for
$J_{2,R}$
and
$J_{3,R}$
:

and

As for the last three terms, we first use the remainder formula in (2.19) and Assumption 7 to obtain

We exploit the BDG inequality and Assumptions 2 and 3 to obtain

Then owing to Young’s inequality (2.1) (with
$\epsilon=\frac{1}{24(\kappa-1)}$
,
$p=\frac{\kappa}{\kappa-1}$
, and
$q=\kappa$
), Hölder's inequality, the elementary inequality (2.2) (with
$k=2$
and
$l=\frac{\kappa}{2}$
), the elementary inequality (2.3) (with
$l=\kappa$
), and Lyapunov's inequality, we further have

Finally, we apply the inequality (2.5) (with
$p=1$
) in Proposition 2 and Young’s inequality (2.1) (with
$\epsilon=\frac{1}{4D(\kappa-1)}$
,
$p=\frac{\kappa}{\kappa-1}$
, and
$q=\kappa$
) to obtain

Substituting the estimates derived from (2.30)–(2.35) into (2.29) yields the inequality

Here,
$\widehat{M}_1(u)=\kappa^2 L_R+(\kappa-1+3^{\kappa-1})\sqrt{L}+(\kappa-1)(\kappa-2+2\cdot3^{\kappa-1})L+C_1\Big(L_R+\frac{(\kappa-2+2\cdot3^{\kappa-1})L}{\kappa}+2^{\kappa-1}L_R^{\prime}+6^{\kappa-1}L^{\prime}\Big)+(12^{\kappa}+(2D)^{\kappa})(4(\kappa-1))^{\kappa-1}\left(L_R^{\frac{\kappa}{2}}+L^{\frac{\kappa}{2}}3^{\kappa-1}\right)u^{\frac{\kappa}{2}-1}$
and
$\widehat{M}_2(u)=\sqrt{L}+2L(\kappa-1)+C_1\left(\frac{2L}{\kappa}+2^{\kappa-1}L^{\prime}\right)+(12^{\kappa}+(2D)^{\kappa})L^{\frac{\kappa}{2}}(4(\kappa-1))^{\kappa-1}u^{\frac{\kappa}{2}-1}$
. In addition, for any
$t\in[0,T]$
, the result in Lemma 2 implies that

By these estimates, together with Grönwall’s inequality, we conclude that

(2) Estimation of the term
$J_2$
. With the aid of the Cauchy–Schwarz inequality, we have

Here, the result of Lemma 1 has been utilized. Further, by employing the subadditivity of probability and invoking Lemma 1 once more, we can estimate

By substituting this into (2.37), we further obtain

At this point, we can estimate (2.28) by combining (2.36) and (2.38) as follows:

Note that R is independent of n and m, and
$\frac{C}{R^2}$
converges to 0 as
$R\to\infty$
. For any given
$\varepsilon\gt 0$
, there exists a sufficiently large number
$R(\varepsilon)\gt 0$
, such that,

when
$R_{*}\geqslant R(\varepsilon)$
. Since both
$h_n$
and
$h_m$
converge to 0 as
$n,m\to\infty$
, for the
$\varepsilon\gt 0$
chosen previously, we have

by letting
$n,m\to\infty$
. Consequently, we conclude that (2.27) holds.
2.3. Proof of Theorem 1
In this subsection, we turn to proving the main theorem in this section. The proof consists of three steps.
Step 1: Existence. Let
$\{X^{(n)}(t)\}_{n\geqslant1}$
be the Cauchy sequence in
$L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
given by (2.12). Keep in mind that the space
$L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
, equipped with the norm
$||X||_{L^{\kappa}}\;:\!=\;\left(\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X(t)|^{\kappa}\right]\right)^{\frac{1}{\kappa}}$
, is a Banach space (see Proposition 1). Thus, there exists an
$\{\mathcal{F}_t\}_{0\leqslant t\leqslant T}$
-adapted
$\mathbb{R}^d$
-valued càdlàg stochastic process
$\{X(t)\}_{t\in[0,T]}$
with
$X(0)=x_0$
and
$\mu_t=\mathcal{L}_{X(t)}$
such that

We next prove that
$\{X(t)\}_{t\in[0,T]}$
is a solution to (2.8). Indeed, the main idea is to show that the right-hand side of (2.12) converges in probability to

by taking the limit on both sides of (2.12). Here
$\mu_s=\mathcal{L}(X(s))$
for any
$s\in[0,T]$
.
First, it follows from (2.40) that there exists a subsequence (for notational simplicity, still indexed by n) such that, for all
$s\in[0,T]$
,

By applying Lemma 2, the Wasserstein distance between
$\mu^{(n)}_{[s]_n}$
and
$\mu_s$
satisfies

Taking Assumption 3 into account, it follows immediately that, for all
$s\in[0,T]$
and almost all
$\omega\in\Omega$
,


as
$n\to\infty$
.
Next, we claim that the sequences
$\{b(s, X^{(n)}(s), \mu^{(n)}_{[s]_n})\}_{n\geqslant1}$
and
$\{\sigma(s, X^{(n)}(s), \mu^{(n)}_{[s]_n})\}_{n\geqslant1}$
are uniformly integrable. In fact, from Assumptions 4 and 5 and Lemma 1, we obtain the following uniform boundedness,

and the following uniform absolute continuity,

when
$\mathbb{P}(A)\to 0$
. The uniform integrability of
$\{b(s, X^{(n)}(s), \mu^{(n)}_{[s]_n})\}_{n\geqslant1}$
and
$\{\sigma(s, X^{(n)}(s), \mu^{(n)}_{[s]_n})\}_{n\geqslant1}$
follows from [Reference Shiryaev38, Lemma 3 on p. 190].
Hence, by applying the dominated convergence theorem [Reference Shiryaev38, Theorem 4 on p. 188], together with (2.42), we obtain, for any
$s\in[0,T]$
,


In addition, note that, following from (2.40),

We further have the following estimates based on Assumptions 4 and 5 and Lemma 1:


For any
$t\in[0,T]$
, by applying the dominated convergence theorem in conjunction with (2.43) and (2.45), we eventually obtain

Similarly, in view of (2.44) and (2.46), we arrive at

Finally, we examine the estimates for the integral w.r.t. the Poisson random measure. For any
$u\in[0,T]$
, it follows from the inequality (2.5) (with
$p=1$
) in Proposition 2 and Assumptions 1 and 2 that

By (2.40), (2.41), and Lyapunov's inequality, we deduce that

As a consequence, by (2.47), (2.48), and (2.49), we conclude that the process
$\{X(t)\}_{t\in[0,T]}$
is a strong solution to (2.8). This completes the proof of existence.
Step 2: Boundedness. For
$t\in[0,T]$
, let
$X(t)\in L^{\kappa}(\Omega;\;D([0,T];\;\mathbb{R}^d))$
be a solution to (2.8). In the following, we estimate the rth moment of the solution
$(X(t))_{t\in[0,T]}$
, where
$r\geqslant \max\{\frac{\kappa^2}{2},4\}$
and the initial value
$X(0) = x_0$
satisfies
$\mathbb{E}|x_0|^r \lt \infty$
, as specified in Assumption 6.
For every
$R\gt 0$
, we define the stopping time

It is clear that
$|X(t-\!)|\leqslant R$
for
$0\leqslant t\leqslant \pi_R$
, and
$\mathbb{E}\sup_{0\leqslant t\leqslant u\wedge\pi_R}|X(t-\!)|^r\lt\infty$
for any
$u\in[0,T]$
. To derive an upper-bound for
$\mathbb{E}\sup_{0\leqslant t\leqslant u\wedge\pi_R}|X(t)|^r$
, we employ the procedure similar to that in the proof of Lemma 1, where the case for
$X^{(n)}(t)$
with
$t\in[0,T]$
was considered. Specifically, for
$r\geqslant\max\{\frac{\kappa^2}{2},4\}$
and
$t\in[0,u\wedge\pi_R]$
, and utilizing tools such as Itô’s formula, as demonstrated in the proof of Lemma 1, we estimate that

with
$\widehat{D}=2\cdot3^{\frac{r}{2}-1}(r+1)K+C_2^r2^{r}(r-1)^{r-1}(2K)^{r}(3T)^{\frac{r}{2}-1}+2C_1\cdot\left(\frac{2K}{r}3^{\frac{r}{2}-1}+K_2\right).$
Note that

Since
$\pi_R\to T,$
$\mathbb{P}$
-almost surely, we conclude the proof of the estimate (2.9) by applying Grönwall's inequality and Fatou's lemma. Specifically, we have

Step 3: Uniqueness. Let X(t), Y(t) be two solutions of (2.8) on the same probability space with
$X(0)=Y(0)$
. By (2.14), for a fixed
$r\geqslant\max\{\kappa^2/2, 4\}$
, there exists a positive constant
$C_r$
such that

For a sufficiently large
$R\gt 0$
, we define the stopping time

To proceed, we compare
$|X(t)-Y(t)|$
and
$\bar{\tau}_R$
in this context with
$|X^{(n)}(t)-X^{(m)}(t)|$
and
$\tau_R$
, as introduced in the proof of Lemma 3. Clearly, the same method as used in the proof of Lemma 3 can be applied here, yielding the following estimate:

Letting
$R\to\infty$
gives the uniqueness of the solution to (2.8).
This completes the proof of Theorem 1.
3. Stochastic averaging principle
In this section, we establish a stochastic averaging principle for the following stochastic integral equation

where
$\varepsilon$
is a small positive parameter (
$0\lt\varepsilon\ll1$
). Assuming that (3.1) satisfies the conditions specified in Assumptions 1–7, the existence and uniqueness of its solution follow directly as a consequence of Theorem 1.
As mentioned in Section 1, our main goal is to demonstrate that the solution
$(X_{\varepsilon}(t))_{ t\in[0,T]}$
of (3.1) can be approximated by a simpler (or averaged) process in an appropriate sense. To proceed, we associate (3.1) with the following averaged McKean–Vlasov SDE:

where
$\bar{b}: \mathbb{R}^d\times M_2(\mathbb{R}^d)\to \mathbb{R}^d$
,
$\bar{\sigma}: \mathbb{R}^d\times M_2(\mathbb{R}^{d})\to\mathbb{R}^{d\times m}$
, and
$\bar{h}: \mathbb{R}^d\times M_2(\mathbb{R}^d)\times U\to \mathbb{R}^d$
are Borel measurable functions. To ensure that (3.2) also admits a unique solution and to facilitate the application of stochastic averaging techniques, we impose specific averaging conditions. It is worth noting that these conditions differ slightly from the classical ones (see, e.g., [Reference Shen, Song and Wu36, Reference Xu, Duan and Xu42]) due to the distinct characteristics of the nonlinear terms involved in the equation.
Assumption 8. (Averaging conditions.) There exist positive bounded functions (sometimes referred to as rate functions of convergence)
$\varphi_i$
, defined on $[0,\infty)$, with
$\lim_{t\to\infty}\varphi_i(t)=0$
for
$i=1, 2, 3$
, such that



respectively, for all
$t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
with
$|x|\vee|y|\leqslant R$
, and
$\mu\in\mathcal{M}_2(\mathbb{R}^d)$
. Here,
$C_R^b$
,
$C_R^\sigma$
, and
$C_R^h$
are positive constants.
Furthermore, if
$\kappa\gt 2$
, an additional condition is required.
Assumption 9. (Additional averaging conditions on the jump coefficients.) There exists a positive bounded function
$\varphi$
, defined on $[0,\infty)$, with
$\lim_{t\to\infty}\varphi(t)=0$
, such that


respectively, for all
$t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
with
$|x|\vee|y|\leqslant R$
, and
$\mu\in\mathcal{M}_2(\mathbb{R}^d)$
.
The main theorem on the averaging principle for (3.1) is thus formulated as follows.
Theorem 2. (Averaging principle.) Suppose that Assumptions 1–9 hold. Then, the following averaging principle holds:

As a direct consequence of Theorem 2 and by applying the Chebyshev–Markov inequality, we have the following corollary.
Corollary 1.
The solution
$X_{\varepsilon}(t)$
converges in probability to the averaged solution
$\bar{X}(t)$
. Specifically, for any
$\delta\gt 0$
,

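The deduction behind Corollary 1 is the standard Chebyshev–Markov bound applied to the mean-square estimate of Theorem 2; in symbols,

```latex
\mathbb{P}\Big(\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|\geqslant\delta\Big)
\leqslant\frac{1}{\delta^{2}}\,
\mathbb{E}\Big[\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|^{2}\Big]
\;\longrightarrow\;0\quad\text{as }\varepsilon\to0.
```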
Prior to establishing Theorem 2, it is necessary to address the well-posedness of the averaged equation (3.2). The following lemma ensures this property.
Lemma 4.
Under Assumptions 1–9, there exists a unique solution
$\bar{X}(t)$
to the averaged equation (3.2).
Proof. By Theorem 1, it suffices to verify that the coefficient functions
$\bar{b}$
,
$\bar{\sigma}$
, and
$\bar{h}$
satisfy the conditions required for the existence and uniqueness of the solution. Note that both (3.1) and (3.2) share the same initial condition
$x_0$
. The condition in Assumption 6 is directly satisfied. Regarding the conditions in Assumptions 1–5, we focus on the function
$\bar{b}$
, as similar arguments apply to the functions
$\bar{\sigma}$
and
$\bar{h}$
. Finally, we verify that
$\bar{h}$
satisfies the condition in Assumption 7. The details of these verifications are provided in Appendix A.
We now complete the proof of Theorem 2 as follows.
Proof of Theorem 2. For any
$t\in[0,T]$
, it follows from (3.1) and (3.2) that

To handle the one-sided locally Lipschitz case, we introduce a stopping time
$\eta_R$
for each
$R\gt 0$
defined as

Using De Morgan’s law, the following decomposition holds:

We now proceed to estimate each term on the right-hand side of the equation above.
(1) Estimation of the term $I_1$. We begin by bounding the term $I_1$ as follows:

(3.4)
\begin{equation}I_1=\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|^{2}\mathbb{I}_{\{\eta_R\gt T\}}\right]\leqslant\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^{2}\right]\lt\infty.\end{equation}

Here, we are effectively considering the process up to the stopping time $\eta_R$, which ensures that $|X_{\varepsilon}(t-\!)|$ and $|\bar{X}(t-\!)|$ are bounded by R for all $t\leqslant T$. By applying Itô's formula, we obtain

$$|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^{2}=\sum_{i=1}^5\Lambda_{i}(t),$$

where the individual terms $\Lambda_i$, for $i=1,\ldots,5$, are defined as follows:

\begin{align*} \Lambda_{1}(t)&=2\int_0^{t\wedge\eta_R}\left\langle X_{\varepsilon}(s-\!)-\bar{X}(s-\!), b\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)-\bar{b}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\rangle\, {\textrm{d}} s,\notag\\ \Lambda_{2}(t)&=\int_0^{t\wedge\eta_R}\left\| \sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s,\notag\\ \Lambda_{3}(t)&=\int_0^{t\wedge\eta_R}\int_U\left|h\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)},z\right)-\bar{h}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)},z\right)\right|^2 N({\textrm{d}} s,{\textrm{d}} z),\notag\\ \Lambda_{4}(t)&=2\int_0^{t\wedge\eta_R}\left\langle X_{\varepsilon}(s-\!)-\bar{X}(s-\!), \sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\notag\\&\quad \left .-\,\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\, {\textrm{d}} W(s)\right\rangle,\notag\\ \Lambda_{5}(t)&=2\int_0^{t\wedge\eta_R}\int_U\left\langle X_{\varepsilon}(s-\!)-\bar{X}(s-\!),h\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)},z\right)\right .\notag\\&\quad \left .-\,\bar{h}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)},z\right)\right\rangle\tilde{N}({\textrm{d}} s,{\textrm{d}} z).\notag\end{align*}

By taking the supremum over $t\in[0,u]$ for $u\in[0,T]$ and then taking expectations, we can now estimate $\mathbb{E}\big[\sup_{0\leqslant t\leqslant u}\Lambda_{i}(t)\big]$ for $i=1, \ldots, 5$, respectively. In view of Assumptions 1, 2, and 8, we obtain

(3.5)
\begin{align}\mathbb{E}&\left[\sup_{0\leqslant t\leqslant u}\Lambda_{1}(t)\right]\notag\\&\leqslant2\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}\left\langle X_{\varepsilon}(s-\!)-\bar{X}(s-\!), b\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\,b\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right\rangle\, {\textrm{d}} s\right]\notag\\&\quad +2\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|\cdot\left|b\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\,b\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right|\, {\textrm{d}} s\right]\notag\\&\quad +2\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|\cdot\left|b\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\,\bar{b}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right|\, {\textrm{d}} s\right]\notag\\&\leqslant2L_R\mathbb{E}\left[\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|^2\, {\textrm{d}} s\right]\notag\\&\quad +2\sqrt{L}\mathbb{E}\left[\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|\cdot W_2\left(\mathscr{L}_{X_{\varepsilon}(s)},\mathscr{L}_{\bar{X}(s)}\right)\, {\textrm{d}} s\right]\notag\\&\quad +2\mathbb{E}\left[\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|\cdot\left|b\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\,\bar{b}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right|\, {\textrm{d}} s\right]\notag\\&\leqslant(2L_R+2\sqrt{L}+1)\mathbb{E}\left[\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|^2\, {\textrm{d}} s\right]\notag\\&\quad +u\mathbb{E}\left[\frac{\varepsilon}{u\wedge\eta_R}\int_0^{\frac{u\wedge\eta_R}{\varepsilon}}\left|b\left(s,\bar{X}(s\varepsilon-),\mathscr{L}_{\bar{X}(s\varepsilon)}\right)-\bar{b}\left(\bar{X}(s\varepsilon-),\mathscr{L}_{\bar{X}(s\varepsilon)}\right)\right|^2\, {\textrm{d}} s\right]\notag\\&\leqslant(2L_R+2\sqrt{L}+1)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s+uC_R^b\varphi_1\left(\frac{u\wedge\eta_R}{\varepsilon}\right)\notag\\&\quad\times\left(1+\mathbb{E}\sup_{0\leqslant t\leqslant u}|\bar{X}(t)|^2\right)\notag\\&\leqslant(2L_R+2\sqrt{L}+1)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s+uC_R^b\cdot C\varphi_1\left(\frac{u\wedge\eta_R}{\varepsilon}\right).\end{align}

Here, we have used the fact that, for each $u\in[0,T]$,

$$\left(\mathbb{E}\sup_{0\leqslant t\leqslant u}|\bar{X}(t)|^2\right)^{\frac{1}{2}}\leqslant(\mathbb{E}\sup_{0\leqslant t\leqslant u}|\bar{X}(t)|^r)^{\frac{1}{r}}\lt\infty,\quad \text{if}\ \mathbb{E}|X_{\varepsilon}(0)|^r\lt\infty.$$

Using arguments similar to those for $\Lambda_1$, we establish the following bound for $\Lambda_2$:

(3.6)
\begin{align}&\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\Lambda_{2}(t)\right]\notag\\&\leqslant3\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)-\sigma\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right\|^2\, {\textrm{d}} s\right]\notag\\&\quad +3\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)-\sigma\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right]\notag\\&\quad +3\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\int_0^{t\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right]\notag\\&\leqslant3L_R\mathbb{E}\left[\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|^2\, {\textrm{d}} s\right]+3L\mathbb{E}\left[\int_0^{u\wedge\eta_R}W_2^2\left(\mathscr{L}_{X_{\varepsilon}(s)},\mathscr{L}_{\bar{X}(s)}\right)\, {\textrm{d}} s\right]\notag\\&\quad +3\mathbb{E}\left[\int_0^{u\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right]\notag\\&\leqslant3(L_R+L)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s+3uC_R^{\sigma}\cdot C\varphi_2\left(\frac{u\wedge\eta_R}{\varepsilon}\right).\end{align}

Similarly, using Assumptions 1, 2, and 8, we obtain the following estimate for $\Lambda_3$:

(3.7)
\begin{align}&\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}\Lambda_{3}(t)\right]\notag\\&\leqslant\mathbb{E}\left[\int_0^{u\wedge\eta_R}\int_U\left|h\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)},z\right)-\bar{h}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)},z\right)\right|^2\nu({\textrm{d}} z)\, {\textrm{d}} s\right] \notag\\&\leqslant3(L_R+L)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s+3uC_R^{h}\cdot C\varphi_3\left(\frac{u\wedge\eta_R}{\varepsilon}\right).\end{align}

Next, we apply the BDG inequality, along with Young's inequality (2.1) (with $\epsilon=\frac{1}{24}$ and $p=q=2$) and the estimate (3.6), to derive the following bound for $\Lambda_4$:

(3.8)
\begin{align}\mathbb{E}&\left[\sup_{0\leqslant t\leqslant u}\Lambda_{4}(t)\right]\notag\\&\leqslant2\sqrt{32}\mathbb{E}\left(\int_0^{u\wedge\eta_R}|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|^2\cdot\left\|\sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right)^{\frac{1}{2}}\notag\\&\leqslant12\mathbb{E}\Bigg[\sup_{0\leqslant t\leqslant u}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|\left(\int_0^{u\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right)^{\frac{1}{2}}\Bigg]\notag\\&\leqslant\frac{1}{4}\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\right]+144\mathbb{E}\left[\int_0^{u\wedge\eta_R}\left\|\sigma\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)}\right)\right .\right .\notag\\&\quad \left .\left .-\bar{\sigma}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)}\right)\right\|^2\, {\textrm{d}} s\right]\notag\\&\leqslant\frac{1}{4}\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\right]+432(L_R+L)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)\notag\\&\quad -\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s+432uC_R^{\sigma}\cdot C\varphi_2\left(\frac{u\wedge\eta_R}{\varepsilon}\right).\end{align}

By applying the inequality (2.5) (with $p=1$) from Proposition 2, Young's inequality (2.1) (with $\epsilon=\frac{1}{4D}$, $p=q=2$), and the estimate (3.7), we arrive at the following estimate for $\Lambda_5$:

(3.9)
\begin{align}\mathbb{E}&\left[\sup_{0\leqslant t\leqslant u}\Lambda_{5}(t)\right]\notag\\&\leqslant2D\mathbb{E}\left(\int_0^{u\wedge\eta_R}\int_U|X_{\varepsilon}(s-\!)-\bar{X}(s-\!)|^2\left|h\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)},z\right)\right .\right .\notag\\&\quad \left .\left .-\,\bar{h}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)},z\right)\right|^2\nu({\textrm{d}} z)\, {\textrm{d}} s\right)^{\frac{1}{2}}\notag\\&\leqslant\frac{1}{4}\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\right]\notag\\&\quad +4D^2\mathbb{E}\left[\int_0^{u\wedge\eta_R}\int_U\left|h\left(\frac{s}{\varepsilon},X_{\varepsilon}(s-\!),\mathscr{L}_{X_{\varepsilon}(s)},z\right)-\bar{h}\left(\bar{X}(s-\!),\mathscr{L}_{\bar{X}(s)},z\right)\right|^2\nu({\textrm{d}} z)\, {\textrm{d}} s\right]\notag\\&\leqslant\frac{1}{4}\mathbb{E}\left[\sup_{0\leqslant t\leqslant u}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\right]\notag\\&\quad +12D^2(L_R+L)\int_0^u\mathbb{E}\sup_{0\leqslant t\leqslant s}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\, {\textrm{d}} s\notag\\&\quad +12D^2uC_R^{h}\cdot C\varphi_3\left(\frac{u\wedge\eta_R}{\varepsilon}\right).\end{align}

Finally, substituting the estimates (3.5)–(3.9) into the expression (3.4) for $I_1$, and further utilizing Grönwall's inequality, we obtain

(3.10)
\begin{align}I_1&\leqslant\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t\wedge\eta_R)-\bar{X}(t\wedge\eta_R)|^2\right]\leqslant \widehat{N}_2{\textrm{e}}^{\widehat{N}_1T}\end{align}

with $\widehat{N}_1=4(L_R+\sqrt{L})+2+12\cdot(73+2D^2)(L_R+L)$ and $\widehat{N}_2=2TC_R^b\cdot C\varphi_1\left(\frac{T\wedge\eta_R}{\varepsilon}\right)+870TC_R^{\sigma}\cdot C\varphi_2\left(\frac{T\wedge\eta_R}{\varepsilon}\right)+6(1+4D^2)TC_R^{h}\cdot C\varphi_3\left(\frac{T\wedge\eta_R}{\varepsilon}\right)$.
(2) Estimation of the term $I_2$. Using the Cauchy–Schwarz inequality and Theorem 1, we deduce that

(3.11)
\begin{align}I_2&=\mathbb{E}\left[\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|^2\mathbb{I}_{\{\eta_R\leqslant T\}}\right]\leqslant\sqrt{\mathbb{E}\left(\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|^2\right)^2}\sqrt{\mathbb{E}\left(\mathbb{I}_{\{\eta_R\leqslant T\}}\right)^2}\notag\\&\leqslant2\sqrt{2}\sqrt{\mathbb{E}\left(\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)|^4+\sup_{0\leqslant t\leqslant T}|\bar{X}(t)|^4\right)}\sqrt{\mathbb{E}\left(\mathbb{I}_{\{\eta_R\leqslant T\}}\frac{|X_{\varepsilon}(\eta_R)|^4+|\bar{X}(\eta_R)|^4}{R^4}\right)}\notag\\&\leqslant\frac{2\sqrt{2}}{R^2}\left(\mathbb{E}\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)|^4+\mathbb{E}\sup_{0\leqslant t\leqslant T}|\bar{X}(t)|^4\right)\leqslant \frac{C}{R^2}.\end{align}
By combining the estimates on
$I_1$
and
$I_2$
, i.e. (3.10) and (3.11), we conclude that

Now, for any
$\delta\gt 0$
, we can choose
$R\gt 0$
large enough such that
$\frac{C}{R^2}\lt\frac{\delta}{2}$
. In addition, by taking
$\varepsilon$
sufficiently small and using the averaging conditions in Assumption 8, we obtain that

Thus, the arbitrariness of
$\delta$
implies that
$\mathbb{E}\sup_{0\leqslant t\leqslant T}|X_{\varepsilon}(t)-\bar{X}(t)|^2$
converges to 0, as
$\varepsilon$
goes to 0. This completes the proof.
4. Example
In this section, we provide an illustrative example to demonstrate the theoretical results established in this paper. We highlight that the model (4.1) is carefully designed to satisfy all the conditions of our assumptions and to facilitate the explicit derivation of the corresponding averaged equation.
Example 1. Consider the following one-dimensional McKean–Vlasov SDE

with
$t\in[0,T]$
and the initial condition
$X_{\varepsilon}(0)=x_0$
. Here, W(t) is a scalar Wiener process,
$U=\mathbb{R}\backslash \{ 0\}$
, and
$\nu$
is a finite measure with
$\nu(U)=1$
. Define the following functions:


where
$\psi(x)=x\sin(\log^2(1+x^2))$
and
$\phi(x)=x\sin(\log^{\frac{3}{2}}(1+x^2))$
are continuously differentiable functions. For any
$x\in \mathbb{R}$
, we can show that

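A direct chain-rule computation shows where the bounds used in the estimates below come from (this derivation is supplementary and consistent with the suprema appearing in the well-posedness estimates):

```latex
\psi'(x)=\sin\!\big(\log^{2}(1+x^{2})\big)
  +\frac{4x^{2}\log(1+x^{2})}{1+x^{2}}\cos\!\big(\log^{2}(1+x^{2})\big),\qquad
\phi'(x)=\sin\!\big(\log^{\frac{3}{2}}(1+x^{2})\big)
  +\frac{3x^{2}\sqrt{\log(1+x^{2})}}{1+x^{2}}\cos\!\big(\log^{\frac{3}{2}}(1+x^{2})\big),
```

so that, since $x^{2}/(1+x^{2})\leqslant1$ and $|\sin|\vee|\cos|\leqslant1$, we have $|\psi'(x)|\leqslant1+4\log(1+x^{2})$, $|\phi'(x)|\leqslant1+3\sqrt{\log(1+x^{2})}$, and $|\psi(x)|\vee|\phi(x)|\leqslant|x|$.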
(1) Well-posedness. To show that (4.1) has a unique solution $(X_{\varepsilon}(t))_{t\in[0,T]}$, we need to verify that the conditions in Theorem 1 are satisfied. For any $R\gt 0$, $x,y\in\mathbb{R}$ with $|x|\vee|y|\leqslant R$, and $\mu\in\mathcal{M}_2(\mathbb{R})$, we provide the following estimates:
\begin{align}(x-y)(b(t,x,\mu)-b(t,y,\mu))&=(x-y)(x-x^3-y+y^3)\frac{t}{1+t}\notag\\&\leqslant |x-y|^2-|x-y|^2(x^2+xy+y^2)\notag\\&\leqslant |x-y|^2(1-xy)\leqslant (1+R^2)|x-y|^2\;=\!:\; L_R^1|x-y|^2,\notag\end{align}
\begin{align}|\sigma(t,x,\mu)-\sigma(t,y,\mu)|^2&=\left|\psi(x)-\psi(y)\right|^2\left(\frac{t}{2+t}\right)^2\notag\\&\leqslant \left|\int_0^1(\partial_x\psi)(y+\theta(x-y))(x-y)\, {\textrm{d}} \theta\right|^2\notag\\&\leqslant \Big[\sup_{|z|\leqslant R}\left|(\partial_x\psi)(z)\right|\Big]^2|x-y|^2\notag\\&\leqslant \Big[\sup_{|z|\leqslant R}\left(1+4\log(1+z^2)\right)\Big]^2|x-y|^2\notag\\&\leqslant \left(1+4\log(1+R^2)\right)^2|x-y|^2 \;=\!:\; L_R^2|x-y|^2,\notag\end{align}
\begin{align}\int_U|h(t,x,\mu,z)-h(t,y,\mu,z)|^2\nu({\textrm{d}} z)&=\left|\phi(x)-\phi(y)\right|^2(1-{\textrm{e}}^{-t})^2\notag\\&\leqslant \left|\int_0^1(\partial_x\phi)(y+\theta(x-y))(x-y)\, {\textrm{d}} \theta\right|^2\notag\\&\leqslant \left[\sup_{|z|\leqslant R}|(\partial_x\phi)(z)|\right]^2\cdot|x-y|^2\notag\\&\leqslant \Big[\sup_{|z|\leqslant R}\left(1+3\sqrt{\log(1+z^2)}\right)\Big]^2|x-y|^2\notag\\&\leqslant \left(1+3\sqrt{\log(1+R^2)}\right)^2|x-y|^2 \;=\!:\; L_R^3|x-y|^2.\notag\end{align}
Hence, Assumption 1 holds with $L_R=\max\{L_R^1,L_R^2,L_R^3\}$. Next, we estimate the following for any $x\in\mathbb{R}$ and $\mu_1,\mu_2\in\mathcal{M}_2(\mathbb{R})$:

It follows that b, $\sigma$, and h satisfy Assumptions 2 and 3. Furthermore, using the bounds in (4.2) and the fact that $\frac{t}{1+t}$, $\frac{t}{2+t}$, and $1-{\textrm{e}}^{-t}$ are bounded, we deduce that, for any $x\in\mathbb{R}$ and $\mu\in\mathcal{M}_2(\mathbb{R}),$
\begin{align}x\cdot b(t,x,\mu)&\leqslant x(x-x^3)\left(\frac{t}{1+t}\right)+x\int_{\mathbb{R}}y\mu({\textrm{d}} y)\notag\\&\leqslant x^2+\frac{1}{2}x^2+\frac{1}{2}\left(\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right)^2\leqslant 2\left(1+x^2+W_2^{2}(\mu,\delta_0)\right)\!, \notag\end{align}
\begin{align}|\sigma(t,x,\mu)|^2&=\left|\psi(x)\frac{t}{2+t}+\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right|^2\notag\\&\leqslant 2|\psi(x)|^2+2\left(\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right)^2\leqslant 2\left(1+|x|^2+W_2^2(\mu,\delta_0)\right)\!,\notag\end{align}
\begin{align}\int_U|h(t,x,\mu,z)|^2\nu({\textrm{d}} z)&=\left|\phi(x)(1-{\textrm{e}}^{-t})+\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right|^2 \leqslant 2|\phi(x)|^2+2\left(\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right)^2\notag\\&\leqslant 2\left(1+|x|^2+W_2^2(\mu,\delta_0)\right).\notag\end{align}
Thus, Assumption 4 is satisfied. Moreover, for any $x\in\mathbb{R}$ and $\mu\in\mathcal{M}_2(\mathbb{R})$, we have
\begin{align}|b(t,x,\mu)|^2&\leqslant 2(x-x^3)^2\left(\frac{t}{1+t}\right)^2\!+2\left(\int_{\mathbb{R}}y\mu({\textrm{d}} y)\right)^2\!\leqslant 2x^2-4x^4+2x^6+2W_2^2(\mu,\delta_0)\notag\\&\leqslant 4\left(1+x^6+W_2^{6}(\mu,\delta_0)\right). \notag\end{align}
Hence, Assumption 5 holds with $\kappa=6$. Finally, since
$X_{\varepsilon}(0)$ is a constant, Assumption 6 (with
$r\geqslant 18$ ) naturally holds. Due to the expression of h and the finiteness of
$\nu$, Assumption 7 can be easily verified using the same technique as that used to check Assumptions 1 and 2.
(2) Averaging principle. Define
\begin{align*}&\bar{b}(x,\mu)=x-x^3+\int_{\mathbb{R}}y\mu({\textrm{d}} y), \quad \bar{\sigma}(x,\mu)=\psi(x)+\int_{\mathbb{R}}y\mu({\textrm{d}} y), \notag\\&\quad\bar{h}(x,\mu,z)=\phi(x)+\int_{\mathbb{R}}y\mu({\textrm{d}} y).\end{align*}
We now verify that the averaging conditions in Assumptions 8 and 9 (with $\kappa=6$ and $r\geqslant 18$) are satisfied:
\begin{align}\frac{1}{t}\int_{0}^{t}|b(s,x,\mu)-\bar{b}(x,\mu)|^2\, {\textrm{d}} s&=\frac{1}{t}\int_{0}^{t}|x-x^3|^2\left[1-\frac{s}{1+s}\right]^2\, {\textrm{d}} s=x^2(1-x^2)^2\frac{1}{1+t}\notag\\&\leqslant \varphi_1(t)C_R^b\left(1+|x|^2\right)\notag\end{align}
\begin{align}\frac{1}{t}\int_{0}^{t}|\sigma(s,x,\mu)-\bar{\sigma}(x,\mu)|^2\, {\textrm{d}} s&=\frac{1}{t}\int_{0}^{t}\psi^2(x)\left[1-\frac{s}{2+s}\right]^2\, {\textrm{d}} s=\psi^2(x)\frac{2}{2+t}\notag\\&\leqslant \varphi_2(t)C_R^{\sigma}\left(1+|x|^2\right)\notag\end{align}
\begin{align}\frac{1}{t}\int_{0}^{t}\int_U|h(s,x,\mu,z)-\bar{h}(x,\mu,z)|^2\nu({\textrm{d}} z)\, {\textrm{d}} s&=\frac{1}{t}\int_{0}^{t}\phi^2(x)[1-(1-{\textrm{e}}^{-s})]^2\, {\textrm{d}} s\notag\\&=\phi^2(x)\frac{1-{\textrm{e}}^{-2t}}{2t}\leqslant \varphi_3(t)C_R^{h}\left(1+|x|^2\right)\!,\notag\end{align}
for all $x\in\mathbb{R}$ with
$|x|\leqslant R$ , where the functions
$\varphi_1(t)=\frac{1}{1+t}$ ,
$\varphi_2(t)=\frac{1}{2+t}$ ,
$\varphi_3(t)=\frac{1-{\textrm{e}}^{-2t}}{2t}$ , and
$\varphi(t)=\frac{1-{\textrm{e}}^{-lt}}{lt}$ are continuous, positive, and bounded, with the property that
$\lim_{t\to\infty}\varphi_i(t)=\lim_{t\to\infty}\varphi(t)=0$ , for
$i=1,2,3$ .
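As a sanity check, the closed-form time averages computed above can be verified by numerical quadrature. The following Python sketch (the test point $t$ and step count are arbitrary choices, not from the text) compares trapezoidal approximations of the drift and jump averages against $\varphi_1(t)$-type closed forms:

```python
import numpy as np

# Verify the closed-form time averages used above by trapezoidal quadrature:
#   (1/t) * int_0^t [1 - s/(1+s)]^2 ds = 1/(1+t)
#   (1/t) * int_0^t e^{-2s} ds         = (1 - e^{-2t})/(2t)
def time_average(f, t, n=200_000):
    s = np.linspace(0.0, t, n)
    vals = f(s)
    ds = s[1] - s[0]
    # trapezoidal rule, then divide by t to get the time average
    return float(((vals[:-1] + vals[1:]) * 0.5 * ds).sum() / t)

t = 7.3
avg_b = time_average(lambda s: (1.0 - s / (1.0 + s)) ** 2, t)
avg_h = time_average(lambda s: np.exp(-2.0 * s), t)
print(abs(avg_b - 1.0 / (1.0 + t)))                       # quadrature error only
print(abs(avg_h - (1.0 - np.exp(-2.0 * t)) / (2.0 * t)))  # quadrature error only
```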
Based on the discussion and the result of Theorem 2, the solution of (4.1) can be approximated by the following equation (for
$t\in[0,T]$
and
$\bar{X}(0)=x_0$
)

in the sense of mean square.
We now carry out numerical simulations to compute the solutions of (4.1) and (4.3) with $x_0=1$ and $T=10$, taking $\varepsilon=0.01$ and $\varepsilon=0.001$, respectively. Figure 1(a) and (b) illustrate the comparison between the solution $X_{\varepsilon}(t)$ of (4.1) and the averaged solution $\bar{X}(t)$ of (4.3). As shown, the solutions of the original equation and the averaged equation exhibit strong agreement. In addition, for fixed sample points, the error
$\sup_{0\leqslant t \leqslant 10}|X_{\varepsilon}(t)-\bar{X}(t)|$
decreases when
$\varepsilon$
changes from
$0.01$
to
$0.001$
. This observed behavior aligns with the predictions of the averaging principle stated in Theorem 2.
We remark that in our numerical simulations to approximate the McKean–Vlasov SDEs (4.1) and (4.3), we use N-dimensional systems of interacting particles, which can be regarded as standard SDEs. This approach is based on the so-called propagation of chaos result (see Appendix B). Based on Proposition 3, we briefly introduce an Euler–Maruyama (EM) numerical scheme to approximate the solution of (B.2), which, in turn, serves as an approximation for the solution of the McKean–Vlasov SDE (2.8). To this end, we partition the time interval [0, T] into n subintervals of equal length and define
$t_k^n=kh_n$
for
$k=0,1,\ldots,n$
, where
$n\in\mathbb{N}$
and the step size is given by
$h_n=\frac{T}{n}$
. The EM scheme for the interacting particle system (B.2) is specified by the initial condition
$X^{i,N,n}(0)=X^{i,N}(0)$
and the recurrence relation

where
$X^{i,N,n}(t_k^n)$
denotes the approximation of
$X^{i,N}(t_k^n)$
,
$\mu_{t_k^n}^{X,N,n}=\frac{1}{N}\sum_{j=1}^N\delta_{X^{j,N,n}(t_k^n)}$
is the empirical measure, and
$\Delta W^{i,n}(t_k^n)=W^i(t_{k+1}^n)-W^i(t_k^n)$
is the Brownian increment. To simulate the integrals w.r.t. the compensated Poisson random measure
$\tilde N({\textrm{d}} t,{\textrm{d}} z) = N({\textrm{d}} t,{\textrm{d}} z) - \nu ({\textrm{d}} z)\, {\textrm{d}} t$
, we also employ the technique of introducing a compound Poisson process
$\int_U z\tilde N(t,{\textrm{d}} z)$
, as detailed in [Reference Applebaum1, Section 4.3.2].
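For concreteness, a minimal Python sketch of one possible EM iteration for such an interacting particle system is given below. It assumes, as in the example above, that h does not depend on the jump mark z and that $\nu$ is finite with total mass `lam`, so the compensated Poisson integral reduces to multiplying h by compensated Poisson increments. The concrete coefficients (with `tanh` standing in for the unspecified $\psi$ and $\phi$) and all parameter values are illustrative placeholders, not those of (4.1) or (B.2), and the small parameter $\varepsilon$ is omitted:

```python
import numpy as np

def em_interacting(b, sigma, h, x0, T, n, N, lam, rng):
    # One EM path of N interacting particles on [0, T] with n steps, assuming
    # a z-independent jump coefficient h and finite jump intensity lam.
    hn = T / n
    X = np.full(N, float(x0))
    for k in range(n):
        t = k * hn
        m = X.mean()                                 # moment of the empirical measure
        dW = rng.normal(0.0, np.sqrt(hn), size=N)    # Brownian increments
        dN = rng.poisson(lam * hn, size=N)           # Poisson increments
        X = (X + b(t, X, m) * hn
               + sigma(t, X, m) * dW
               + h(t, X, m) * (dN - lam * hn))       # compensated jump part
    return X

# Illustrative coefficients in the spirit of the example (tanh replaces psi, phi):
b = lambda t, x, m: (x - x**3) * t / (1.0 + t) + m
sigma = lambda t, x, m: np.tanh(x) * t / (2.0 + t) + m
h = lambda t, x, m: np.tanh(x) * (1.0 - np.exp(-t)) + m

rng = np.random.default_rng(0)
X = em_interacting(b, sigma, h, x0=1.0, T=1.0, n=200, N=100, lam=1.0, rng=rng)
print(X.shape)  # (100,)
```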
For this example, we simulate
$N=100$
particles with time step 0.01 and
$T=10$
. Figure 2(a) and (b) depict the realizations of the interacting particle systems associated with the McKean–Vlasov SDEs (4.1) and (4.3), respectively, under the initial conditions
$X_{\varepsilon}(0)=\bar{X}(0)=1$
and
$\varepsilon=0.01$
. Numerically, the Wasserstein distance between the distributions of the solutions to (4.1) (i.e.
$\mathscr{L}_{X_{\varepsilon}(t)}$
) and (4.3) (i.e.
$\mathscr{L}_{\bar{X}(t)}$
) is approximated via the empirical distributions of the interacting particle systems, as illustrated in Figure 2(c).
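In one dimension, the 2-Wasserstein distance between two empirical measures with equally many atoms has a closed form via sorted samples (the optimal coupling on the line is monotone), which is one way such a comparison could be implemented. A minimal sketch, with an illustrative function name and toy data:

```python
import numpy as np

def w2_empirical(x, y):
    # W_2 between two one-dimensional empirical measures with the same number
    # of atoms: match order statistics and take the root mean square gap.
    xs, ys = np.sort(x), np.sort(y)
    return float(np.sqrt(np.mean((xs - ys) ** 2)))

print(w2_empirical(np.array([0.0, 1.0]), np.array([1.0, 2.0])))  # 1.0
```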
Appendix A. Details of the proof of Lemma 4
Proof of Lemma 4. For any
$ t\in[0,T]$
,
$x,y\in\mathbb{R}^d$
and
$\mu,\mu_1,\mu_2 \in\mathcal{M}_{2}(\mathbb{R}^d)$
, we calculate successively that

Similar estimates hold for
$\bar{\sigma}(x,\mu)$
and
$\bar{h}(x,\mu,z)$
. Let
$t\to\infty$
in the above estimates, we conclude that the averaged equation (3.2) satisfies Assumptions 1–5.
We next check the extra conditions for
$\bar{h}$
by calculating that (for
$l=r$
or
$\kappa$
)

Taking
$t\to\infty$
, we conclude that Assumption 7 holds.
Appendix B. Propagation of chaos
For
$N\geqslant1$
and
$i=1,2,\ldots, N$
, let
$(W^i,\tilde{N}^i,X^i(0))$
be independent copies of
$(W,\tilde{N},X(0))$
. We introduce the noninteracting particle system associated with the McKean–Vlasov SDE (2.8). The state
$X^i(t)$
of the particle i is given by

for
$t\in[0,T]$
with initial data
$X^i(0)$
. According to Theorem 1, we have
$\mathscr{L}_{X^i(t)}=\mathscr{L}_{X(t)}$
, for all
$i=1,2,\ldots, N$
. Here, X(t) is the solution of the McKean–Vlasov SDE (2.8) for
$t\in[0,T]$
with initial data
$X(0)=x_0$
.
We also consider the associated interacting particle system

with initial data
$X^{i,N}(0)=X^i(0)$
, where
$\mu_t^{X,N}$
is the empirical measure of N interacting particles given by
$\mu_t^{X,N}=\frac{1}{N}\sum_{j=1}^N\delta_{X^{j,N}(t)}$
. We proceed to establish and prove the propagation of chaos result. Furthermore, we note that compared with the existing literature on the Lévy case, particularly [Reference Cavallazzi6, Reference Mehri, Scheutzow, Stannat and Zangeneh28, Reference Neelima, Kumar, Dos Reis and Reisinger30], the coefficient conditions in our framework are somewhat more relaxed, as discussed in Remark 1.
Proposition 3. (Propagation of chaos.) Suppose Assumptions 1–7 hold and
$r\geqslant4$
. Then, the interacting particle system (B.2) is well-posed and converges to the noninteracting particle system (B.1), that is,

Proof. First, note that the interacting particle system
$\{X^{i,N}\}_{1\leqslant i\leqslant N}$
given in (B.2) can be regarded as a system of ordinary SDEs driven by Lévy noise, taking values in
$\mathbb{R}^{d\times N}$
. Thus, according to [Reference Majka25, Theorem 1.1], it has a unique càdlàg solution under Assumptions A1, A4, A5 such that

for any
$N\geqslant1$
, where
$C\gt 0$
is independent of N.
To handle the one-sided locally Lipschitz case, for any
$1\leqslant i\leqslant N$
and
$R\gt 0$
, define the stopping time:

Then, by De Morgan’s Law, we obtain

where
$\mathbb{I}_A$
is the indicator function of the set A. As in the proof of Theorem 2, we now estimate
$Q_1$
and
$Q_2$
respectively.
(1) Estimation of the term
$Q_1$
. Note that

By Itô’s formula, we have

We now estimate
$Q_{i,R}$
for
$i=1,2,3$
by Assumptions 1 and 2 and obtain that

where

with
$\mu_t^X=\frac{1}{N}\sum_{i=1}^N\delta_{X^{i}(t)}$
the empirical measure of N noninteracting particles. Then, by combining these estimates and applying Grönwall’s inequality, we eventually have

(2) Estimation of the term
$Q_2$
. Using the Cauchy–Schwarz inequality and Theorem 1, we deduce that

With the estimations of
$Q_1$
and
$Q_2$
at hand, we conclude that

Note that, by [Reference Carmona and Delarue5, Theorem 5.8], we have the following estimate for the Wasserstein distance:

Thus, we observe that the right-hand side of the estimate (B.8) converges to 0 as
$N\to\infty$
. The result follows, and the proof is complete.
Acknowledgements
The authors thank Prof. Yanjie Zhang for helpful discussions. The authors also thank the referees for their careful reading of the manuscript and invaluable comments, which were very useful in improving this paper.
Data availability statement
The datasets supporting the findings of this work are available from the corresponding author on reasonable request.
Funding statement
The research of Y. Chao was partially supported by NSFC grant 12101484, the Fundamental Research Funds for the Central Universities (xzy012025071), the Guangdong Provincial Key Laboratory of Mathematical and Neural Dynamical Systems (DSNS2025003), and NSFC grants 12271424 and 12371276. The research of J. Duan was partially supported by NSFC grant 12141107, the Guangdong Provincial Key Laboratory of Mathematical and Neural Dynamical Systems (2024B1212010004), the Cross-Disciplinary Research Team on Data Science and Intelligent Medicine (2023KCXTD054), and the Guangdong–Dongguan Joint Research Grant 2023A1515140016. The research of T. Gao was partially supported by the National Key R&D Program of China (2021ZD0201300) and NSFC grant 12401233. The research of P. Wei was partially supported by CPSF grants 2022TQ0009 and 2022M720264, the Jiangsu Provincial Scientific Research Center of Applied Mathematics (BK20233002), and the National Key R&D Program of China (2020YFA0712800).
Competing interest
The authors declare that no competing interests arose during the preparation or publication of this article.