Solution theory of fractional SDEs in complete subcritical regimes

Lucio Galeati; Máté Gerencsér

doi:10.1017/fms.2024.136

Part of: Stochastic processes Miscellaneous topics - Partial differential equations

Published online by Cambridge University Press: 24 January 2025

Lucio Galeati: Affiliation:
Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica, Università degli Studi dell’Aquila, Edificio Renato Ricamo, via Vetoio, Coppito, L’Aquila, 67100, Italy; E-mail: lucio.galeati@univaq.it
Máté Gerencsér*: Affiliation:
Institute of Analysis and Scientific Computing, TU Wien, Wiedner Hauptstraße 8–10, Vienna, 1040, Austria;
*: E-mail: mate.gerencser@tuwien.ac.at (corresponding author)

Abstract

We consider stochastic differential equations (SDEs) driven by a fractional Brownian motion with a drift coefficient that is allowed to be arbitrarily close to criticality in a scaling sense. We develop a comprehensive solution theory that includes strong existence, path-by-path uniqueness, existence of a solution flow of diffeomorphisms, Malliavin differentiability and $\rho $-irregularity. As a consequence, we can also treat McKean-Vlasov, transport and continuity equations.

MSC classification

Secondary: 35R60: Partial differential equations with randomness, stochastic partial differential equations 60G22: Fractional processes, including fractional Brownian motion

Type: Probability
Information: Forum of Mathematics, Sigma, Volume 13, 2025, e12

DOI: https://doi.org/10.1017/fms.2024.136
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

1 Introduction

Given a vector field $b:\mathbb {R}_+\times \mathbb {R}^d\to \mathbb {R}^d$ , an initial condition $x_0\in \mathbb {R}^d$ and a function $f:\mathbb {R}_+\to \mathbb {R}^d$ , consider the differential equation

(1.1)

$$ \begin{align} X_t = x_0 + \int_0^t b_r ( X_r) \mathrm{d} r + f_t. \end{align} $$

When f is chosen according to some random distribution, one obtains a stochastic differential equation (SDE), which often exhibits much better properties than the unperturbed equation ($f\equiv 0$), even at the level of existence and uniqueness of solutions. This phenomenon is often referred to as regularisation by noise, and its study goes back to the works of Zvonkin [Reference Zvonkin104] and Veretennikov [Reference Veretennikov97]; see the monograph [Reference Flandoli39] for a survey in the case of standard Brownian f.

Although there is plenty of evidence [Reference Catellier and Gubinelli20, Reference Davie32, Reference Galeati and Gubinelli49, Reference Harang and Perkowski58] that it is the pathwise properties of the perturbation that determine the regularisation effects, the available results are far more abundant in the Brownian and, in general, the Markovian case.

However, a wide variety of applications motivate models with anomalous diffusions with long-range memory, including statistical description of turbulence [Reference Kolmogorov62], hydrology [Reference Hurst, Black and Simaika59], anomalous polymer dynamics [Reference Panja83], diffusion in living cells [Reference Szymanski and Weiss94] and rough volatility models in finance [Reference Gatheral, Jaisson and Rosenbaum52]. Such non-Markovian processes are commonly modeled by fractional Brownian motion (fBm). In this case, the lack of Markovian and semimartingale structure renders a large part of a ‘standard’ toolbox (Itô’s formula, Kolmogorov equations, Zvonkin transformation, martingale problem) unavailable. Nevertheless, since fBm paths share many properties with the standard Brownian ones (up to changes in the scaling exponents), one would expect similar regularisation phenomena.

The goal of the present work is twofold. First, we provide the first well-posedness results in the case of non-Markovian noise under demonstrably sharp conditions on b. The optimality follows both from a scaling heuristic (see Section 1.1 below) and from rigorous construction of counterexamples (see Section 1.3 below). The second goal is to expand the existing well-posedness theory by studying various properties of the solutions that are well-known (though often nontrivial) in the Brownian case, but much less so for fractional noise. These include existence, regularity, invertibility of the solution flow, stability with respect to perturbations of the initial condition and/or the nonlinearity, and Malliavin differentiability. The proofs can also be of interest in cases where the results are not new: the methods presented here go beyond not only the Markovian framework but also the scope of Girsanov’s theorem (see Remark 1.8 and Appendix C).

At the same time, the idea is quite intuitive: in order to develop a strong solution theory for (1.1), it is natural to investigate first the solvability of the linearised equation around any given solution X, namely to show that

(1.2)

$$ \begin{align} Y_t= y + \int_0^t \nabla b_r(X_r) Y_r\, \mathrm{d} r \end{align} $$

has a well-defined, unique solution for any $y\in \mathbb {R}^d$; observe that, due to its additive nature, the perturbation f does not appear in (1.2). The study of (1.2) is perfectly in line with the classical setting of a continuously differentiable drift b, where (1.2) can be solved directly and its behaviour matches the Grönwall-type estimates encountered when looking at the difference of any two solutions. However, if b is not assumed to be differentiable, $\nabla b_r(X_r)$ a priori does not make sense, and thus a standard interpretation of (1.2) is no longer possible. The key idea to overcome this difficulty is twofold:

a) $\nabla b(\cdot )$ in (1.2) is not evaluated at arbitrary space points, but rather along the solution X, which can have very special properties inherited from the noise f.
b) In order to give meaning to (1.2) in a Young integral sense, we do not need to define $\nabla b_r(X_r)$ pointwise; instead, it suffices to show that the path
(1.3) $$ \begin{align} t\mapsto L_t:=\int_0^t \nabla b_r(X_r) \mathrm{d} r \end{align} $$
is well-defined and enjoys sufficiently nice time regularity (more precisely, it is of finite p-variation for some $p<2$ ). In view of a), depending on the structure of the noise f, this can be a much more reasonable requirement.

In analogy with the Lipschitz setting, one can then transfer estimates for classical linear Young equations of the form

(1.4)

$$ \begin{align} \sup_{t\in [0,1]} |Y_t|\lesssim e^{C\|L\|_{p-{\mathrm{var}}}^p} |y| \end{align} $$

to pathwise bounds for the difference of any two solutions X and $\tilde X$ with different initial conditions, up to replacing L by another process $\hat L=\hat L(X,\tilde X)$ similar in spirit to (1.3).

In order to rigorously formalise all of the above, it is crucial to identify the correct space of perturbations $\varphi $ such that $X=\varphi +f$ indeed inherits the relevant properties from f; these are the a priori estimates given by Lemmas 2.1–2.4. Correspondingly, we formulate two new versions of the Stochastic Sewing Lemma (SSL) by Lê [Reference Lê71]; cf. Lemmas 2.5 and 2.6 below, which are tailor-made for our analysis. Once this setup is in place, it provides exponential moment estimates of certain additive functionals of X, like the one defined in (1.3), turning pathwise bounds like (1.4) into moment bounds. Finally, once the behaviour of the linearised equation (1.2) is understood, many further properties (uniqueness, stability, differentiability of the flow) of the ODE follow similarly.

1.1 Scaling heuristics and existing literature

One way to obtain a unified view of the many works on regularisation by noise is through a scaling argument; for a similar approach in the Brownian setting and $L^q_t L^p_x$ spaces, see [Reference Beck, Flandoli, Gubinelli and Maurelli8, Section 1.5].

From now on, we sample the perturbation as a fBm $B^H$ with Hurst parameter $H\in (0,+\infty ) \setminus \mathbb {N}$ , which satisfies the scaling relation

(1.5)

$$ \begin{align} (B^H_t)_{t\geq0}\overset{\mathrm{law}}{=}(\lambda^{-H}B^H_{\lambda t})_{t\geq 0}, \quad \forall\, \lambda>0. \end{align} $$
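Since a centred Gaussian process is determined in law by its covariance, the scaling relation (1.5) can be checked at the level of covariances. The following is a minimal sketch, assuming a one-dimensional fBm with $H\in (0,1)$ and the standard covariance $\frac{1}{2}(s^{2H}+t^{2H}-|t-s|^{2H})$; the function names are illustrative and the case $H>1$ is not covered.

```python
def fbm_cov(s, t, H):
    """Covariance E[B^H_s B^H_t] of a one-dimensional fBm (assumes 0 < H < 1)."""
    return 0.5 * (s ** (2 * H) + t ** (2 * H) - abs(t - s) ** (2 * H))


def rescaled_cov(s, t, H, lam):
    """Covariance of the rescaled process lam^{-H} B^H_{lam t}."""
    return lam ** (-2 * H) * fbm_cov(lam * s, lam * t, H)


if __name__ == "__main__":
    # The two covariances should agree up to floating-point error for any lam > 0.
    for H in (0.25, 0.5, 0.75):
        gap = abs(rescaled_cov(0.3, 0.8, H, lam=0.01) - fbm_cov(0.3, 0.8, H))
        print(f"H={H}: |rescaled - original| = {gap:.2e}")
```

For $H=1/2$ the covariance reduces to $\min (s,t)$, the Brownian one.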

Details about the processes $B^H$ are given in Section 1.4 below; let us just briefly recall that $H=1/2$ gives the standard Brownian motion and that this is the only case where $B^H$ is a Markov process. For the values $H=k+1/2$ , $k\in \mathbb {N}_+$ (which we call ‘degenerate Brownian’), the Markovian toolbox is still available since the SDE can be rewritten as a higher-dimensional equation driven by degenerate Brownian noise; see, for example, [Reference Chaudru de Raynal, Honoré and Menozzi24]. For all other choices of H, such tools are unavailable, and the study of the SDE requires a fundamentally different approach. The equation then takes the form

(1.6)

$$ \begin{align} X_t = x_0 + \int_0^t b_r ( X_r) \mathrm{d} r + B^H_t. \end{align} $$
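For intuition only, when b is a genuine function, equation (1.6) can be discretised by an explicit Euler scheme. The sketch below is not taken from the paper: it assumes a scalar equation, a drift supplied as a callable `b(t, x)`, and a driving path already sampled on a uniform grid; the names `euler_path`, `noise` and `dt` are hypothetical. It makes no claim about convergence for the singular drifts considered here.

```python
def euler_path(b, x0, noise, dt):
    """Explicit Euler scheme X_{k+1} = X_k + b(t_k, X_k) * dt + (noise[k+1] - noise[k]).

    `noise` holds the driving path on the grid t_k = k * dt, with noise[0] == 0.
    """
    path = [x0]
    for k in range(len(noise) - 1):
        t_k = k * dt
        increment = noise[k + 1] - noise[k]
        path.append(path[-1] + b(t_k, path[-1]) * dt + increment)
    return path


if __name__ == "__main__":
    # With zero drift the scheme returns the shifted noise path x0 + noise.
    print(euler_path(lambda t, x: 0.0, 1.0, [0.0, 0.5, -0.25, 1.0], dt=0.25))
    # [1.0, 1.5, 0.75, 2.0]
```

Because the noise is additive, it enters the scheme only through its increments, which is why any sampled path can be plugged in.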

In order for the regularising effects of $B^H$ to dominate the irregularities of b, it is natural to require that, when zooming into small scales in a way that keeps the noise strength constant, the nonlinearity vanishes; if this were not the case, and the nonlinearity were dominant, we would expect to see all the same pathologies (e.g., coalescence or branching of solutions) which could manifest in the ODE without noise. Therefore, keeping (1.5) in mind, for a fixed parameter H, we call a space V of functions (or distributions) on $\mathbb {R}_+\times \mathbb {R}^d$ critical (resp. subcritical/supercritical) if for the rescaled drift coefficient

$$ \begin{align*} b^\lambda_t(x)=\lambda^{1-H} b(\lambda t, \lambda^H x), \end{align*} $$

the leading order seminorm $[\cdot ]_V$ (see the examples below for its practical meaning) scales like $[b^\lambda ]_V\sim \lambda ^\gamma [b]_V$, for all $\lambda \leq 1$,Footnote ¹ with $\gamma =0$ (resp. $\gamma>0$ / $\gamma <0$).

We refer to Section 1.5 for more details on the function spaces appearing in the upcoming examples.

Example 1.1. Consider autonomous, inhomogeneous Hölder-Besov spaces $V=B^\alpha _{\infty ,\infty }$, where b does not depend on the time variable. Here, the leading order seminorm is the associated homogeneous seminorm; namely, we take that of $\dot {B}^\alpha _{\infty ,\infty }$ as defined in [Reference Bahouri, Chemin and Danchin5]; alternatively, for $f\in B^\alpha _{\infty ,\infty }$ and $\alpha \geq 0$, one can regard it as $\| (-\Delta )^{\alpha /2} f\|_{B^0_{\infty ,\infty }}$, while for $\alpha <0$, one can define it by duality with the homogeneous seminorm of $\dot {B}^{-\alpha }_{1,1}$. Either way, one finds the scaling relation

$$ \begin{align*} \| f(\eta\, \cdot)\|_{\dot{B}^\alpha_{\infty,\infty}} \sim_\alpha \eta^\alpha \| f(\cdot)\|_{\dot{B}^\alpha_{\infty,\infty}} \quad\, \forall\, (\eta,\alpha)\in \mathbb{R}_{>0}\times \mathbb{R}. \end{align*} $$

Combined with our definition of $b^\lambda $ , one finds $\gamma =1-H+\alpha H$ , and so the subcriticality condition reads as

(1.7)

$$ \begin{align} \alpha>1-\frac{1}{H}. \end{align} $$

However, even in the classical Brownian case, where one gets the condition $\alpha>-1$ , this remains out of reach. Weak well-posedness is known for $\alpha>-1/2$ [Reference Flandoli, Issoglio and Russo41], and a nonstandard kind of well-posedness (where uniqueness is even weaker than uniqueness in law) is shown for $\alpha>-2/3$ [Reference Cannizzaro and Chouk18, Reference Delarue and Diel35], for special classes of drift b. The classical works [Reference Veretennikov97, Reference Zvonkin104] show strong well-posedness for $V=C^\alpha _x$ and $\alpha \geq 0$ .Footnote ² Interestingly, in the degenerate Brownian case, weak well-posedness is proved in [Reference Chaudru de Raynal, Honoré and Menozzi24] in the full regime $\alpha>(2k-1)/(2k+1)$ , which is precisely the condition (1.7). For strong well-posedness, one requires the more restrictive condition

$$ \begin{align*} \alpha>1-\frac{1}{2H}; \end{align*} $$

see [Reference Chaudru de Raynal, Honoré and Menozzi23, Equation (1.11)]. The same condition is required for strong well-posedness in the non-Markovian case for all $H\in (0,\infty )\setminus \mathbb {N}$ ; cf. [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli49, Reference Gerencsér53, Reference Nualart and Ouknine81]. After the first version of this manuscript, the work [Reference Butkovsky, Lê and Mytnik16] appeared, where the authors are able to establish (among several results) weak existence of solutions in the full subcritical regime (1.7), under the additional assumption that b is a Radon measure; however, uniqueness is still open.

Example 1.2. Another well-studied case is the mixed Lebesgue space $V=L^q_tL^p_x$. Here, we can take the seminorm to be $\| \cdot \|_V$ itself; using the scaling relation $\|f(\eta \,\cdot )\|_{L^p_x}= \eta ^{-d/p}\|f\|_{L^p_x}$, one finds $\gamma =1-H-1/q-(Hd)/p$, and the subcritical regime is

(1.8)

$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}<1-H. \end{align} $$

In the classical case $H=1/2$ , equation (1.8) reads as

(1.9)

$$ \begin{align} \frac{2}{q}+\frac{d}{p}<1, \end{align} $$

which is precisely the condition from the classical work [Reference Krylov and Röckner68], where strong well-posedness is proved (under the additional constraint $p\geq 2$); instead, the critical regime corresponds to the celebrated Ladyzhenskaya–Prodi–Serrin (LPS) condition. This case has then been extensively studied by several authors, allowing also for multiplicative noise with Sobolev diffusion coefficients; see, among others, [Reference Fedrizzi and Flandoli37, Reference Xia, Xie, Zhang and Zhao99, Reference Zhang101, Reference Zhang102]. In recent years, even the critical case has been reached [Reference Krylov65, Reference Röckner and Zhao88] under certain constraints on $d,p,q$; the results have been further refined by allowing coefficients in Morrey spaces (cf. [Reference Krylov66, Reference Krylov67]) or form-bounded drifts (cf. [Reference Kinzebulatov and Madou60, Reference Kinzebulatov and Semënov61] and the references therein). It was recently understood in [Reference Zhang and Zhao103] that one can go beyond condition (1.8), up to imposing additional constraints on $\mathrm {div}\, b$; for further progress in this exciting direction, see also [Reference Gräfner and Perkowski55, Reference Hao and Zhang57].

For $H\in (1/2,1)$ , no results are known, and for $H\in (0, 1/2)$ , the main previously known results for weak and strong well-posedness are both from [Reference Lê71], under the stronger conditions

(1.10)

$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2},\qquad\frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}-H, \end{align} $$

respectively, with the additional constraint $p\in [2,\infty ]$ , later removed in [Reference Galeati and Gubinelli49]. It is conjectured in [Reference Lê71] that the first condition in (1.10) is enough to guarantee strong well-posedness. One particular corollary of our result is that for $q\in (1,2]$ , even (1.8) is sufficient. Therefore, we propose to update the conjecture of [Reference Lê71] (if $q\in (1,2]$ , now a theorem) to assert strong well-posedness under the scaling condition (1.8). Let us also mention that we have recently learned about an ongoing work [Reference Butkovsky, Lê and Matsuda15] towards improving (1.10).

Example 1.3. A common generalisation of Examples 1.1 and 1.2 is the space $V=L^q_t C^\alpha _x$ , where (adopting the leading seminorm to be the one of $L^q_t \dot {B}^\alpha _{\infty ,\infty }$ , in agreement with both previous casesFootnote ³ ) the scaling works out to be $\gamma =1-H-1/q+\alpha H$ . Therefore, the subcriticality condition reads as

$$ \begin{align*} \alpha>1-\frac{1}{H}+\frac{1}{Hq}=1-\frac{1}{q'H}, \end{align*} $$

where, here and in the rest of the paper, q and $q'$ are conjugate exponents, $1/q+1/q'=1$ . This generality has only been studied recently in [Reference Galeati and Gubinelli49, Reference Galeati, Harang and Mayorcas51], where strong well-posedness is proved under the stronger condition

(1.11)

$$ \begin{align} \alpha>1-\frac{1}{2H}+\frac{1}{Hq}, \end{align} $$

with the additional constraints $H\in (0,1/2]$ , $q\in (2,\infty ]$ . Note that by setting $\alpha =-d/p$ , condition (1.11) coincides with the second one in (1.10).
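The scaling exponents of Examples 1.1–1.3 can be tabulated with exact rational arithmetic. This is a minimal sketch, assuming the formula $\gamma =1-H-1/q+\alpha H$ from Example 1.3; the helper names and the convention of passing $1/q$ (so that $0$ encodes $q=\infty $) are choices made for this illustration.

```python
from fractions import Fraction


def scaling_exponent(H, alpha, inv_q=Fraction(0)):
    """Exponent gamma = 1 - H - 1/q + alpha * H for V = L^q_t C^alpha_x.

    `inv_q` is 1/q, so inv_q = 0 encodes q = infinity (covers Example 1.1).
    """
    return 1 - H - inv_q + alpha * H


def regime(gamma):
    """Classify a space by the sign of its scaling exponent."""
    if gamma > 0:
        return "subcritical"
    if gamma < 0:
        return "supercritical"
    return "critical"


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Brownian case, autonomous drift: alpha = -1 is the critical exponent in (1.7).
    print(regime(scaling_exponent(half, Fraction(-1))))         # critical
    # Brownian case with q = 2: the threshold 1 - 1/(q'H) equals 0.
    print(regime(scaling_exponent(half, Fraction(1, 4), half)))  # subcritical
```

Setting `alpha = -d/p` recovers the exponent $\gamma =1-H-1/q-(Hd)/p$ of Example 1.2.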

In summary, to the best of our knowledge, weak well-posedness results in a whole subcritical regime are available only in the degenerate Brownian case $H=k+1/2$ , $k\in \mathbb {N}$ , and strong well-posedness only in the standard Brownian case $H=1/2$ .

1.2 Discussion of the main results

In the present paper, we establish strong well-posedness in the full subcritical regime for all $H\in (0,\infty )\setminus \mathbb {N}$ , with coefficients from the class in Example 1.3, under the additional constraint $q\in (1,2]$ . In other terms, our main conditions are summarised by the assumption

(A)

$$ \begin{align} H\in(0,\infty) \setminus \mathbb{N},\qquad q\in(1,2],\qquad \alpha\in\Big(1-\frac{1}{q'H},1\Big). \end{align} $$

The solution theory we present in fact goes beyond strong well-posedness. We show existence in the strong sense not only of solutions but also of solution flows, and uniqueness in the path-by-path sense. Furthermore, several further properties of solutions are established such as stability, continuous differentiability of the flow and its inverse, Malliavin differentiability and $\rho $ -irregularity.

Many of these results are new even in the time-independent case: if b is only a function of x and belongs to $C^\alpha _x$, then the optimal way to fit it into the framework of (A) is to choose $q=2$, leading to the condition $\alpha>1-1/(2H)$. This is the classical condition under which strong well-posedness is known [Reference Catellier and Gubinelli20, Reference Gerencsér53, Reference Nualart and Ouknine81], but several of the further properties have not been previously established.

Our main findings are loosely summarised (without aiming for full precision or generality) in the following statement; the corresponding results (often in a somewhat sharper form) can be found throughout the paper in Theorems 4.3, 4.4, 5.5, 5.6 for i), 3.2 for ii), 6.2 for iii), 6.8 for iv), 7.4 for v), 9.3 for vi), 10.4 for vii). For simplicity, we restrict ourselves to the time interval $t\in [0,1]$ , but it is clear that up to rescaling, we could consider any finite $[0,T]$ (up to allowing the hidden constants to depend on T).

Theorem 1.4. Assume (A) and let $x_0\in \mathbb {R}^d$ , $b\in L^q_t C^\alpha _x$ , $m\in [1,\infty )$ . Then,

i) Strong existence and path-by-path uniqueness holds for (1.6);
ii) For any other $\tilde x_0\in \mathbb {R}^d$ and $\tilde b\in L^q_t C^\alpha _x$ , the associated solutions X and $\tilde X$ satisfy the stability estimate
$$ \begin{align*} \mathbb{E}\bigg[\sup_{t\in[0,1]}|X_t-\tilde X_t|^m\bigg]^{1/m}\lesssim |x_0-\tilde x_0|+\|b- \tilde b\|_{L^q_t C^{\alpha-1}_x}; \end{align*} $$
iii) The solutions form a stochastic flow of diffeomorphisms $\Phi _{s\to t}(x)$ , whose spatial gradient $\nabla \Phi $ is $\mathbb {P}$ -a.s. continuous in all variables; moreover, it holds
$$ \begin{align*} \sup_{0\leq s\leq t\leq 1, x\in \mathbb{R}^d} \mathbb{E}\big[| \nabla \Phi_{s\to t} (x) |^m\big] <\infty; \end{align*} $$
iv) For each $s<t$ and $x\in \mathbb {R}^d$ , the random variable $\omega \mapsto \Phi _{s\to t}(x;\omega )$ is Malliavin differentiable; moreover, it holds
$$ \begin{align*} \sup_{0\leq s\leq t\leq 1, x\in \mathbb{R}^d} \mathbb{E}\big[ \| D \Phi_{s\to t}(x)\|_{\mathcal{H}^H}^m \big]<\infty, \end{align*} $$
where D is the Malliavin derivative and $\mathcal {H}^H$ the Cameron-Martin space of $B^H$ ;
v) Strong existence and uniqueness holds also for the McKean-Vlasov equation
(1.12) $$ \begin{align} X_t=x_0+\int_0^t(b_r\ast\mu_r)(X_r)\mathrm{d} r +B^H_t,\qquad\mu_t=\mathcal{L}(X_t); \end{align} $$
vi) Solutions X are $\mathbb {P}$ -a.s. $\rho $ -irregular for any $\rho <1/(2H)$ ;
vii) If additionally $\alpha>0$ , then for any $p>1$ , strong existence and path-by-path uniqueness holds for solutions $u\in L^\infty _t W^{1,p}_x$ to the transport equation
$$ \begin{align*} \partial_t u + b\cdot \nabla u + \dot{B}^H_t\cdot \nabla u=0 \end{align*} $$
for all initial data $u_0\in W^{1,p}_x$ .

The various aspects of the main results are discussed in detail in their respective sections, so here, let us just briefly comment on them.

The notion of path-by-path uniqueness in i), as a strengthening of the classical pathwise uniqueness, was first established in the seminal work [Reference Davie32] by Davie, with a simpler proof that was later provided by Shaposhnikov [Reference Shaposhnikov91]. This kind of result was then generalised to fBm in [Reference Catellier and Gubinelli20], suggesting it is a consequence of the pathwise properties of the trajectories of the driving noise. Such a uniqueness concept requires giving a pathwise interpretation to the SDE, which becomes nontrivial for $\alpha <0$ , where b can be a distribution of negative regularity and not a function anymore. In this case, following [Reference Catellier and Gubinelli20], we will give meaning to (1.6) as a nonlinear Young ODE; see Section 5 for more details.

Stability estimates in the style of ii) are useful to bypass abstract Yamada-Watanabe arguments and get strong existence directly. Among other possible applications, let us mention their importance in numerical schemes with distributional drifts; see, for example, the recent work [Reference Goudenège, Haress and Richard54]. In this paper, stability estimates play a key role when solving McKean-Vlasov equations as in v); see Section 7. The study of stochastic flows iii) for SDEs goes back to the classical work [Reference Kunita69]; see also [Reference Chen and Li26, Reference Fedrizzi and Flandoli37, Reference Menoukeu-Pamen, Meyer-Brandis, Nilssen, Proske and Zhang76] for flows in irregular settings. In iv), we can in fact derive differentiability with respect to perturbations of the noise quite a bit more generally than Cameron-Martin directions (see Remark 6.9), in line with the observations from [Reference Friz and Victoir42, Reference Kusuoka70]. Concerning v), regularisation by fractional noise for distribution dependent SDEs has been investigated in [Reference Galeati, Harang and Mayorcas51] and recently in [Reference Han56]. Above, we only stated the simplest example of the McKean-Vlasov equation for the sake of presentation. Theorem 7.4 below allows for more general dependence on $(X,\mu )$ . The notion of $\rho $ -irregularity in vi) was introduced by [Reference Catellier and Gubinelli20] as a powerful measurement of the averaging properties of paths. Extending $\rho $ -irregularity from Gaussian processes to perturbed Gaussian processes has previously only been achieved efficiently via Girsanov transform. Here, we provide a simple and more robust alternative. 
Concerning vii), regularisation by noise results for the transport equation were first established for Brownian noise in [Reference Flandoli, Gubinelli and Priola40] and further developed in [Reference Fedrizzi and Flandoli38, Reference Mohammed, Nilssen and Proske78]; see also [Reference Catellier21, Reference Galeati and Gubinelli49, Reference Nilssen79] for further investigations in the fractional case.

The scope of some intermediate estimates we obtain is larger than (A), and therefore, in some regime where we do not obtain strong well-posedness, we still get compactness and therefore existence of weak solutions. To state the result, we need to enforce the following different condition:

(B)

$$ \begin{align} H\in(0,1) ,\quad q\in(1,\infty],\quad \alpha>\frac{1}{2}-\frac{1}{2H}, \quad \alpha>1-\frac{1}{H q'}. \end{align} $$

The proof of Theorem 1.5 is presented in Section 8, where we also define rigorously what we mean by weak solution to (1.6) in this case; see Theorem 8.2 for a more precise statement.

Theorem 1.5. Assume (B) and let $x_0\in \mathbb {R}^d$ , $b\in L^q_t C^\alpha _x$ ; then there exists a weak solution to the SDE (1.6).

Remark 1.6. For $b\in L^q_t C^\alpha _x$ with $\alpha>0$ , existence of weak solutions can be shown classically by standard Peano-type arguments for any choice of $H\in (0,\infty )\setminus \mathbb {N}$ . Therefore, condition (B) is of real interest only when considering $\alpha <0$ ; in this case, $H\in (0,1)$ is not a real restriction, as it follows from the first condition on $\alpha $ . Note further that in the case $q\in (1,2]$ , it always holds $1-\frac {1}{Hq'}\geq \frac {1}{2}-\frac {1}{2H}$ , and so (B) reduces to (A); thus, the interesting cases covered by Theorem 1.5 are for $q\in (2,\infty ]$ .
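The comparison of the two thresholds in Remark 1.6 can be checked on examples. The sketch below assumes only the lower bounds on $\alpha $ stated in (A) and (B); the helper names are illustrative, and the other constraints ($\alpha <1$, the ranges of H and q) are not encoded.

```python
from fractions import Fraction


def lower_bound_A(H, inv_q):
    """Lower bound 1 - 1/(q' H) on alpha in (A), with 1/q' = 1 - 1/q."""
    return 1 - (1 - inv_q) / H


def lower_bound_B(H, inv_q):
    """Lower bound on alpha in (B): the larger of its two thresholds."""
    return max(Fraction(1, 2) - 1 / (2 * H), lower_bound_A(H, inv_q))


if __name__ == "__main__":
    H = Fraction(1, 2)
    # q = 2 (inv_q = 1/2): both conditions give the same bound.
    print(lower_bound_A(H, Fraction(1, 2)), lower_bound_B(H, Fraction(1, 2)))  # 0 0
    # q = infinity (inv_q = 0): (B) is governed by 1/2 - 1/(2H).
    print(lower_bound_A(H, Fraction(0)), lower_bound_B(H, Fraction(0)))        # -1 -1/2
```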

Remark 1.7. For $q=\infty $, condition (B) reduces to $b\in L^\infty _t C^\alpha _x$, $\alpha>\frac {1}{2}-\frac {1}{2H}$. In the Brownian case $H=1/2$, this recovers the condition $\alpha>-1/2$ obtained in [Reference Flandoli, Issoglio and Russo41], which showed uniqueness in law. Recently, [Reference Kremp and Perkowski63, Theorem 6.7] provided counterexamples to uniqueness in law for Brownian SDEs with drifts $b\in C_t C^\alpha _x$, for any $\alpha \leq -1/2$; non-uniqueness here is meant in the class of ‘canonical weak solutions’ (i.e., satisfying a definition à la Bass-Chen [Reference Bass and Chens6]; cf. Definition 8.1 below). So there can be a nontrivial gap between well-posedness results and the prediction offered by scaling arguments. On the positive side, [Reference Butkovsky and Mytnik17] recently proved uniqueness in law of the solutions constructed by Theorem 1.5, at least in the case $H\in (0,1/2]$ and autonomous drift $b\in C^\alpha _x$ with $\alpha>\frac {1}{2}-\frac {1}{2H}$.

Remark 1.8. One fundamental stochastic analytic tool that still applies in the non-Markovian fBm setting is Girsanov’s transform. Indeed, it is heavily used in the seminal works [Reference Catellier and Gubinelli20, Reference Nualart and Ouknine81] and many subsequent ones. However, it has its limitations: in our setting, it only applies under the additional assumption $1-1/(q'H)<0$ (which, in turn, may only happen if $H\in (0,1/2)$ ); see Appendix C for details. Even in the Brownian case $H=1/2$ , our methods yield results beyond the scope of Girsanov’s theorem, which is not available for $q<2$ ; see Remark 1.9 below. Therefore, throughout the article, we avoid Girsanov’s transform altogether.

Another motivation for a Girsanov-free approach is to develop tools that are robust enough to extend to other classes of processes; see [Reference Butkovsky, Dareiotis and Gerencsér13] for some first results on such equations via stochastic sewing for Lévy-driven SDEs and Remarks 1.12–1.13 below for other classes of Gaussian processes which fit our framework.

Remark 1.9. Theorem 1.4 gives new results also in the classical $H=1/2$ case. Indeed, to solve (1.6) with classical tools, one would require a good solution theory of the corresponding Kolmogorov equation

(1.13)

$$ \begin{align} \partial_t u-\tfrac{1}{2}\Delta u=b\cdot\nabla u. \end{align} $$

Suppose that $b\in L^q_t C^\alpha _x$ with $q\in (1,2)$ . Then the naive power counting fails: replacing first u by a smooth function on the right-hand side gives, by Schauder estimates, $u\in L^\infty _t C^\beta _x$ with $\beta =\alpha +2-2/q$ , and so $b\cdot \nabla u\in L^q_t C^{\alpha +1-2/q}_x$ . Since $\alpha +1-2/q<\alpha $ , iterating the procedure implies worse and worse spatial regularity on u, and after finitely many steps, the product $b\cdot \nabla u$ becomes even ill-defined. This is somewhat similar to the issue of the Kolmogorov equation of Lévy SDEs with low stability index, which was circumvented in [Reference Chaudru de Raynal, Menozzi and Priola25]. After this manuscript appeared, Schauder estimates for (1.13) with $b\in L^q_t C^\alpha _x$ with $q\in (1,2)$ were developed in [Reference Wei, Hu and Yuan98].

Remark 1.10. By the embedding $L^p_x\subset C^{-d/p}_x$ , our result immediately implies well-posedness of (1.6) with $L^q_t L^p_x$ drift in the full subcritical regime (with respect to p) (1.8) if $q\in (1,2]$ , which can be seen as a fractional analogue of [Reference Krylov and Röckner68]. Note that unlike in [Reference Lê71], $p\in [1,2)$ is also allowed.
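The reduction in Remark 1.10 amounts to the observation that, with $\alpha =-d/p$, the Hölder-scale condition $\alpha>1-1/(q'H)$ is the same inequality as (1.8). A small sketch with exact rationals illustrates this on a grid of parameters; the function names are hypothetical, and only the two inequalities themselves are encoded.

```python
from fractions import Fraction


def lebesgue_subcritical(H, d, p, q):
    """Condition (1.8): 1/q + H d / p < 1 - H."""
    return Fraction(1) / q + H * d / p < 1 - H


def hoelder_subcritical(H, alpha, q):
    """Condition alpha > 1 - 1/(q' H), where 1/q' = 1 - 1/q."""
    return alpha > 1 - (1 - Fraction(1) / q) / H


if __name__ == "__main__":
    H, d, p, q = Fraction(1, 4), 1, 2, 2
    print(lebesgue_subcritical(H, d, p, q))                # True
    print(hoelder_subcritical(H, -Fraction(d, p), q))      # True
```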

The rest of the article is structured as follows. In Section 1.3, we present some counterexamples in the supercritical regime, demonstrating that (up to reaching the critical equality) condition (A) cannot be improved; we then conclude the introduction by recalling some fundamental properties of fBm in Section 1.4 and by introducing the main notations used throughout the paper in Section 1.5. In Section 2, we state and prove some fundamental lemmata, including the aforementioned a priori estimates for solutions of (1.6) and the two new forms of the stochastic sewing lemma of [Reference Lê71]. Section 3 contains further estimates for additive functionals of processes, as well as a key stability property of solutions. In Sections 4 and 5, we use these estimates to establish well-posedness of (1.6); we distinguish the cases $\alpha>0$ and $\alpha <0$, which require different analyses. Along the way, we prove the existence of a solution semiflow, which we upgrade to a flow of diffeomorphisms in Section 6. Section 7 contains applications of our stability estimates to McKean-Vlasov equations. In Section 8, we construct weak solutions under condition (B), via a compactness argument enabled by the available a priori estimates. In Section 9, we show $\rho $-irregularity of solutions and more general perturbations of fractional Brownian motions. Finally, Section 10 contains applications to transport and continuity equations. In the appendices, we collect some useful tools for which we did not find exact references in the literature: Appendix A contains variants of the Kolmogorov continuity criterion, and Appendix B gives two basic bounds for solutions of Young differential equations. In Appendix C, we summarise relations of various Sobolev spaces and their use in the Girsanov transform for fractional Brownian motions.

1.3 Counterexamples to uniqueness in the supercritical regime

Although the scaling argument is heuristic, one can often construct counterexamples in the supercritical case. The constructions below are motivated by [Reference Chaudru de Raynal22], which gives counterexamples for $q=\infty $ , $\alpha>0$ .

Assume $d\geq 1$ , $H\in (0,1)$ and $(\alpha ,q)\in \mathbb {R}\times (1,\infty )$ satisfy

(1.14)

$$ \begin{align} \alpha<1-\frac{1}{H q'},\qquad \alpha>-1; \end{align} $$

let B be an $\mathbb {R}^d$ -valued stochastic process such that $\mathbb {P}$ -almost surely $B\in C^\gamma $ for all $\gamma \in (0,H)$ . We claim that under (1.14), there exists $b\in L^q_t C^\alpha _x$ such that the equation

(1.15)

$$ \begin{align} X_t=x+\int_0^t b_s(X_s)\,ds+B_t \end{align} $$

with initial condition $x=0$ has at least two solutions whose laws are mutually singular.

We will treat separately the cases $\alpha \in (0,1)$ and $\alpha \in (-1,0)$ .

For $\alpha \in (0,1)$ , the construction is actually one-dimensional and can be extended trivially to higher dimensions by taking $b=(b^i)_{i=1}^d$ with $b^i\equiv 0$ for $i\geq 2$ ; therefore, here we will set $d=1$ . Take $\tilde q>q$ such that $(\alpha ,\tilde q)$ still satisfies (1.14) and define the function

$$ \begin{align*} b_t(x)=t^{-1/\tilde q}\,{\mathrm{sign}}(x)|x|^\alpha; \end{align*} $$

clearly, $b\in L^q_t C^\alpha _x$ . Let $\gamma =1/(\tilde q'(1-\alpha ))$ ; by definition, $\gamma $ satisfies the identity

(1.16)

$$ \begin{align} \gamma=1-\frac{1}{\tilde q}+\gamma\alpha, \end{align} $$

and furthermore, $\gamma <H$ thanks to (1.14). Fix furthermore $\delta>0$ small such that $\delta ^\alpha /\gamma>2\delta $ , which exists since $\alpha \in (0,1)$ . Take $x\in (0,1]$ and consider a weak solution $(X^x,B)$ of (1.15), which is well-known to exist due to the spatial continuity and sublinear growth of b. Define the stopping time

$$ \begin{align*} \tilde\tau:=\inf\{t>0:\, |B_t|\geq \delta t^\gamma\}\wedge 1; \end{align*} $$

it is strictly positive $\mathbb {P}$ -almost surely since $\gamma <H$ and $B\in C^{\tilde \gamma }$ with $\tilde \gamma \in (\gamma ,H)$ . Also define

$$ \begin{align*} \tau_x:=\inf\{t\geq 0:\, X_t^x\leq\delta t^\gamma\}\wedge 1. \end{align*} $$

We claim that $\tau _x\geq \tilde \tau $ . Indeed, $\tau _x>0$ since $x>0$ , and for all $t\leq \tau _x$ by (1.15) and our construction, it holds

(1.17)

$$ \begin{align} X_t^x>\int_0^ts^{-1/\tilde q}(\delta s^\gamma)^{\alpha}\,ds+B_t= (\delta^\alpha/\gamma) t^\gamma+B_t>\delta t^\gamma+\big(\delta t^\gamma+B_t\big), \end{align} $$

where in the intermediate passage, we used (1.16). Since $\tau _x\geq \tilde \tau>0$ $\mathbb {P}$ -a.s., there exists $\rho>0$ independent of $x\in (0,1]$ such that

$$ \begin{align*} \mathbb{P}(\tau_x>\rho)\geq 3/4. \end{align*} $$
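For the reader's convenience, the elementary computations behind (1.16), the bound $\gamma <H$ and the intermediate equality in (1.17) can be spelled out as follows; this is only a routine verification based on the definitions above, using $\alpha <1$ and $H>0$ .

$$ \begin{align*} \gamma=\frac{1}{\tilde q'(1-\alpha)} \ &\Longleftrightarrow\ \gamma(1-\alpha)=\frac{1}{\tilde q'}=1-\frac{1}{\tilde q} \ \Longleftrightarrow\ \gamma=1-\frac{1}{\tilde q}+\gamma\alpha,\\ \alpha<1-\frac{1}{H\tilde q'} \ &\Longleftrightarrow\ \frac{1}{H\tilde q'}<1-\alpha \ \Longleftrightarrow\ \gamma=\frac{1}{\tilde q'(1-\alpha)}<H,\\ \int_0^t s^{-1/\tilde q}(\delta s^\gamma)^{\alpha}\,ds&=\delta^\alpha\int_0^t s^{\gamma-1}\,ds=\frac{\delta^\alpha}{\gamma}\,t^\gamma. \end{align*} $$

In the last line, we used $-1/\tilde q+\gamma \alpha =\gamma -1$ , which is a restatement of (1.16). Since $|B_t|\leq \delta t^\gamma $ for $t\leq \tilde \tau $ , the bracket on the right-hand side of (1.17) is nonnegative on that time interval.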

The laws of $(X^x,B)$ on $C([0,1])^2$ are tight, and therefore by Skorohod’s representation theorem, we may assume that for a sequence $x_n\searrow 0$ , the random variables $(X^{x_n},B^{x_n})$ live on the same probability space and converge in $C([0,1])^2 \mathbb {P}$ -a.s. The limit $(X^{0,+},B^{0,+})$ is a solution to (1.15) with initial condition $0$ and satisfies

$$ \begin{align*} \mathbb{P}\big(X^{0,+}_t>0\,\,\forall t\in(0,\rho]\big)&\geq \mathbb{P}\big(X^{0,+}_t\geq \delta t^\gamma\,\,\forall t\in[0,\rho]\big) \\&=\lim_{n\to\infty}\mathbb{P}\big(X^{x_n}_t\geq \delta t^\gamma\,\,\forall t\in[0,\rho]\big)\geq3/4. \end{align*} $$

Since b is odd, we can run the same argument for $y\in [-1,0)$ : if $X^y$ is a solution to (1.15), then $-X^y$ is a solution for $(-y,-B)$ , and the definition of $\tilde \tau $ only depends on $|B|$ . Therefore, for the same choice of $\rho $ , in this case, one finds that

$$ \begin{align*} \mathbb{P}\big(X^y_t \leq -\delta t^\gamma \,\forall t\in(0,\rho]\big) \geq 3/4, \end{align*} $$

and so, by considering a sequence $y_n\nearrow 0$ and arguing by compactness, one can construct another weak solution $(X^{0,-}, B^{0,-})$ to (1.15) with initial condition $0$ satisfying

$$ \begin{align*} \mathbb{P}\big(X^{0,-}_t<0\,\,\forall t\in(0,\rho]\big)\geq 3/4. \end{align*} $$

This shows that $X^{0,+}$ and $X^{0,-}$ do not have the same law, yielding weak non-uniqueness (we leave it as an exercise to the reader to show that their laws are in fact mutually singular).

In the distributional case $\alpha \in (-1,0)$ , we have to be a bit more careful since the meaning of the SDE becomes unclear if X gets too close to $0$ . To this end, we argue again by stopping times, and the construction we present this time is genuinely d-dimensional. Again, take $\tilde q>q$ such that $(\alpha ,\tilde q)$ still satisfy (1.14) and define a vector field $b=(b^i)_{i=1}^d$ by

$$ \begin{align*} b^1(t,x)=t^{-1/\tilde q}\,{\mathrm{sign}}(x^1)|x|^{\alpha},\qquad b^i(t,x)\equiv 0 \text{ for } i=2,\ldots,d; \end{align*} $$

again, $b\in L^q_t C^\alpha _x$ . Take $x\in (0,1]$ and consider a local-in-time solution $X^x$ of (1.15) with initial condition $x_0=(x,0,\ldots ,0)$ , which is well-known to exist due to the spatial regularity of b locally around $x_0$ . Define $\gamma $ as before, so that $\gamma <H$ and (1.16) holds; let us furthermore take an auxiliary parameter $\delta $ that will be specified later. Define the stopping times

$$ \begin{align*} \tilde\tau:=\inf\{t>0:\, |B_t|\geq \delta t^\gamma\}\wedge 1, \quad \tau_x:=\inf\{t\geq 0:\, (X_t^x)^1\leq\delta t^\gamma\}\wedge 1; \end{align*} $$

as before, $\tilde \tau $ is strictly positive $\mathbb {P}$ -almost surely since $\gamma <H$ . We claim that $\tau _x\geq \tilde \tau $ , for which it suffices to show that for $t\leq \tau _x\wedge \tilde \tau $ , one has $(X_t^x)^1\geq 2\delta t^\gamma $ . If $x>3\delta t^\gamma $ , then by simply using the nonnegativity of the first component of b up to $\tau _x$ and the definition of $\tilde \tau $ , we see that

$$ \begin{align*} (X_t^x)^1\geq x+B_t^1\geq 3\delta t^\gamma-\delta t^\gamma, \end{align*} $$

as required. Suppose now that $x\leq 3\delta t^\gamma $ . Clearly, for $s\leq \tau _x$ , one also has $|X_s^x|\geq \delta s^\gamma $ . Inserting this bound in the equation, we get for $s\leq \tau _x\wedge \tilde \tau $

$$ \begin{align*} (X_s^x)^1&\leq x+\int_0^sr^{-1/\tilde q}(\delta r^\gamma)^{\alpha}\,dr+B_s^1 \\ &=x+ (\delta^{\alpha}/\gamma) s^\gamma+B_s^1 \\ &\leq x+ \big(\delta^{\alpha}/\gamma+\delta\big)s^\gamma; \end{align*} $$

observe that since $\alpha <0$ , we find reversed inequalities compared to the previous case. In particular, if $s\geq t/2$ , then using $x\leq 3\delta t^\gamma $ , we also get

$$ \begin{align*} (X_s^x)^1 \leq \big(3\delta 2^\gamma+\delta+\delta^{\alpha}/\gamma\big)s^\gamma. \end{align*} $$

For $\delta \in (0,1)$ , there exist constants $C',C$ depending only on $d,\alpha ,\gamma $ such that the above bound implies $(X_s^x)^1\leq C'\delta ^{\alpha }s^\gamma $ , as well as $|X_s^x|\leq C \delta ^{\alpha }s^\gamma $ . Using this bound in the equation once more,

$$ \begin{align*} (X_t^x)^1&>\int_{t/2}^ts^{-1/\tilde q}\big(C \delta^{\alpha}s^\gamma\big)^{\alpha}\,ds+B_t^1 \\ &\geq (1/2)C^{\alpha}\delta^{\alpha^2}t^\gamma -\delta t^\gamma. \end{align*} $$

At this point (using the condition $\alpha>-1$ , so that $\alpha ^2<1$ ), one can choose $\delta $ small enough so that the right-hand side is bounded from below by $2\delta t^\gamma $ . With this, we conclude the proof of the property $\tau _x\geq \tilde \tau $ . In other words, for all $t\leq \tilde \tau $ , for all $x\in (0,1]$ , we have $(X_t^x)^1\geq \delta t^\gamma $ . In a symmetric way, for all $t\leq \tilde \tau $ , for all $y\in [-1,0)$ we have $(X_t^y)^1\leq -\delta t^\gamma $ .
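The two quantitative steps used above can be checked explicitly. The following is a sketch of the arithmetic, where C is the constant from the bound $|X_s^x|\leq C \delta ^{\alpha }s^\gamma $ and we use $-1/\tilde q+\gamma \alpha =\gamma -1$ , which follows from (1.16).

$$ \begin{align*} \int_{t/2}^t s^{-1/\tilde q}\big(C\delta^{\alpha}s^\gamma\big)^{\alpha}\,ds&=C^\alpha\delta^{\alpha^2}\int_{t/2}^t s^{\gamma-1}\,ds=C^\alpha\delta^{\alpha^2}\,\frac{1-2^{-\gamma}}{\gamma}\,t^\gamma\geq \frac{1}{2}\,C^\alpha\delta^{\alpha^2}t^\gamma,\\ \frac{1}{2}\,C^\alpha\delta^{\alpha^2}-\delta\geq 2\delta \ &\Longleftrightarrow\ \delta^{1-\alpha^2}\leq \frac{C^\alpha}{6}. \end{align*} $$

Here, $(1-2^{-\gamma })/\gamma \geq 1/2$ because the map $\gamma \mapsto 1-2^{-\gamma }$ is concave on $[0,1]$ , vanishes at $0$ and equals $1/2$ at $1$ (recall that $\gamma <H<1$ ). Since $1-\alpha ^2>0$ , the second condition holds for all sufficiently small $\delta $ ; for instance, any $\delta \in (0,1)$ with $\delta \leq (C^\alpha /6)^{1/(1-\alpha ^2)}$ works.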

We now want to pass to the $x\to 0$ limit, which we can do by noticing that the laws of $(B,\tilde \tau ,X^x,X^{-x})$ are tight on the space

$$ \begin{align*} \mathcal{S}= C([0,1])\times\{(a,g):\,a\in(0,1],g\in C([0,a])^2\} \end{align*} $$

with the metric

$$ \begin{align*} d\big((f,a,g),(f',a',g')\big)=\|f-f'\|_{C([0,1])}+|a-a'|+\|g-g'\|_{C([0,a\wedge a'])^2}. \end{align*} $$

By Prokhorov’s theorem and Skorohod’s representation, we get a sequence $x_n\to 0$ , and on another probability space, a sequence $(\bar B^{x_n},\bar {\tilde \tau }^{x_n},\bar X^{x_n},\bar X^{-x_n})\overset {\mathrm {law}}{=}(B,\tilde \tau ,X^{x_n},X^{-x_n})$ converging $\mathbb {P}$ -almost surely as random variables taking values in $\mathcal {S}$ . The limits $X^{0,+}:=\lim \bar X^{x_n}$ and $X^{0,-}:=\lim \bar X^{-x_n}$ both solve (1.15) with initial condition $0$ and driving noise $B^{0}:=\lim \bar B^{x_n}$ . Moreover, $X^{0,+}_t\geq \delta t^\gamma $ for $t\leq \tilde \tau ^0:=\lim \bar {\tilde \tau }^{x_n}$ and $X^{0,-}_t\leq -\delta t^\gamma $ for $t\leq \tilde \tau ^0$ . Since $\tilde \tau ^0\overset {\mathrm {law}}{=}\tilde \tau $ , it is $\mathbb {P}$ -a.s. positive, and therefore, the laws of $X^{0,+}$ and $X^{0,-}$ are mutually singular (for example, on $C([0,1])$ after extending them as constants after $\tilde \tau ^0$ ).

Remark 1.11. Up to multiplying b by a cutoff function at infinity, by taking $\alpha =-d/(p+\varepsilon )$ for sufficiently small $\varepsilon>0$ , the construction presented in the regime $\alpha <0$ provides non-uniqueness for $b\in L^q_tL^p_x$ , for any pair $(p,q)\in [1,\infty ]^2$ satisfying

(1.18)

$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}>1-H,\qquad p>d. \end{align} $$
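To see how (1.18) arises, one can substitute $\alpha =-d/\tilde p$ with $\tilde p=p+\varepsilon $ into (1.14). The following is a sketch of the algebra, where $\tilde p$ is an auxiliary symbol introduced here only for illustration.

$$ \begin{align*} -\frac{d}{\tilde p}<1-\frac{1}{Hq'} \ &\Longleftrightarrow\ \frac{1}{q'}<H+\frac{Hd}{\tilde p} \ \Longleftrightarrow\ \frac{1}{q}+\frac{Hd}{\tilde p}>1-H,\\ -\frac{d}{\tilde p}>-1 \ &\Longleftrightarrow\ \tilde p>d. \end{align*} $$

Since both inequalities in (1.18) are strict, they remain valid with p replaced by $\tilde p=p+\varepsilon $ as soon as $\varepsilon>0$ is small enough, so that $(\alpha ,q)$ indeed satisfies (1.14).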

If $H=1/2$ , then B can be taken as Brownian motion and (1.18) becomes

(1.19)

$$ \begin{align} \frac{2}{q}+\frac{d}{p}>1,\qquad p>d; \end{align} $$

in particular, the exponents $p,q$ violate the LPS condition (1.9). It is interesting to compare (1.19) to the result from [Reference Krylov64], where weak existence for the Brownian SDE was established under the condition

(1.20)

$$ \begin{align} \frac{1}{q}+\frac{d}{p}\leq 1, \end{align} $$

which is further shown to be optimal by construction of counterexamples in the case $1/q+d/p>1$ . Let us also mention [Reference Galeati47] for a heuristic explanation on why condition (1.20) (as well as (1.21) below) arises naturally when only focusing on weak existence results. Our counterexample shows that under (1.20), uniqueness in law in general does not hold, answering a problem left open in [Reference Krylov64] (see the discussion right above Remark 3.1 therein).

After the completion of this work, it has been further shown in [Reference Butkovsky, Lê and Mytnik16] that in the time-independent case, for $H\in (0,1)$ , there exist $b\in C^\alpha $ with supercritical $\alpha <1-1/H$ for which even weak existence does not hold; see Theorem 2.7 therein. More recently, [Reference Butkovsky and Gallay14] expanded the result from [Reference Krylov64] by establishing weak existence of solutions for $H\in (0,1)$ and $b\in L^q_t L^p_x$ with

(1.21)

$$ \begin{align} \frac{1-H}{q}+ \frac{H d}{p} < 1-H. \end{align} $$

Combined with our counterexample, one gets a regime (namely, the intersection of (1.18) and (1.21)) where weak existence holds but uniqueness in law does not.
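As a concrete illustration that this intersection is nonempty (the specific numbers are chosen here purely as an example), take $H=1/2$ , $d=1$ , $p=2$ and $q=3$ :

$$ \begin{align*} \frac{1}{q}+\frac{Hd}{p}&=\frac{1}{3}+\frac{1}{4}=\frac{7}{12}>\frac{1}{2}=1-H,\qquad p=2>1=d,\\ \frac{1-H}{q}+\frac{Hd}{p}&=\frac{1}{6}+\frac{1}{4}=\frac{5}{12}<\frac{1}{2}=1-H. \end{align*} $$

The first line is (1.18) and the second is (1.21), so this pair $(p,q)$ belongs to both regimes.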

1.4 Preliminaries on fractional Brownian motion

We recall here several facts about fractional Brownian motion (fBm); for some standard references, we refer to [Reference Nualart80, Reference Picard87].

An $\mathbb {R}^d$ -valued fBm of Hurst parameter H is defined as the unique centered Gaussian process with covariance

$$ \begin{align*} \mathbb{E}(B^H_t\otimes B^H_s)=\tfrac{1}{2}\big(|t|^{2H}+|s|^{2H}-|t-s|^{2H}\big) I_d, \end{align*} $$

where $I_d$ denotes the $d\times d$ identity matrix; in other words, its components are i.i.d. one-dimensional fBms. FBm paths are well-known to be $\mathbb {P}$ -a.s. $(H-\varepsilon )$ -Hölder, but nowhere H-Hölder continuous. FBm admits several representations as a stochastic integral; in particular, given any fBm $B^H$ defined on a probability space, one can construct therein a standard Bm W such that

(1.22)

$$ \begin{align} B^H_t=\int_0^t K_H(t,r)\mathrm{d} W_r\quad\forall\, t \geq 0. \end{align} $$

Such a Volterra kernel representation is referred to as canonical since $B^H$ and W generate the same filtration. The exact formula for the kernels $K_H$ can be found in, for example, [Reference Nualart and Ouknine81]. For our purposes, it is enough to recall that $K_H$ is deterministic and $K_H(t,\cdot )\in L^2([0,t])$ .

Another standard representation of fBm is the one introduced in [Reference Mandelbrot and van Ness74]: given $B^H$ , one can construct a two-sided Bm $\tilde W$ such that

(1.23)

$$ \begin{align} B^H_t = \gamma_H \int_{-\infty}^t \big[(t-r)_+^{H-1/2}-(-r)_+^{H-1/2}\big]\, \mathrm{d} \tilde W_r, \end{align} $$

where $\gamma _H=\Gamma (H+1/2)^{-1}$ is a normalizing constant and $x_+$ denotes the positive part.

We will mostly work with representation (1.22), but we invite the reader to keep in mind (1.23) since it is usually easier to manipulate in order to derive key properties of the process, like its local nondeterminism; see (1.24) and the discussion below. Given a filtration $\mathbb {F}$ , we say that $B^H$ is an $\mathbb {F}$ -fBm if the associated W given by (1.22) is an $\mathbb {F}$ -Brownian motion.

FBm of parameter $H=1$ is somewhat trivial or ill-defined (see [Reference Picard87]); however, one can extend the definition to all values $H\in (0,+\infty )\setminus \mathbb {N}$ inductively as in [Reference Perrin, Harba, Berzin-Joseph, Iribarren and Bonami86] by $B^{H+1}_t:=\int _0^t B^H_s\mathrm {d} s$ .

Such definition is consistent with most aforementioned properties: it is still a centered, Gaussian process, with trajectories $\mathbb {P}$ -a.s. in $C^{H-\varepsilon }_t$ but nowhere $C^H_t$ , satisfying the scaling relation (1.5); using stochastic Fubini, one can also easily derive similar representations as (1.22)–(1.23). A key consequence of the last property is that for any $H\in (0,+\infty )\setminus \mathbb {N}$ , there exists a constant $c_H\in (0, +\infty )$ such that

(1.24)

$$ \begin{align} \mathrm{Cov} \big(B^H_t - \mathbb{E}_s B^H_t \big) = c_H |t-s|^{2H} I_d \quad\forall \, s\leq t \end{align} $$

(see [Reference Gerencsér53, Proposition 2.1]); here, $\mathbb {E}_s B^H_t:=\mathbb {E}[B^H_t|\mathcal {F}_s]$ , where $\mathcal {F}_s$ can be the natural filtration of $B^H$ or more generally any filtration such that $B^H$ is a $\mathbb {F}$ -fBm. Property (1.24) is a special form of strong local nondeterminism (LND)⁴; see [Reference Galeati and Gubinelli48, Section 2.4] for a deeper discussion on its relevance on regularisation by noise. Since conditional expectations are also $L^2$ -projections, $B^H_t-\mathbb {E}_s B^H_t$ and $\mathbb {E}_s B^H_t$ are orthogonal Gaussian variables, and thus independent; more generally, $B^H_t-\mathbb {E}_sB^H_t$ is independent of all the history up to time s. Therefore, for any $s\leq t$ , any bounded measurable function $f:\mathbb {R}^d\to \mathbb {R}$ and any other $\mathcal {F}_s$ -measurable random variable X, it holds

(1.25)

$$ \begin{align} \mathbb{E}_s f(B^H_t+X)= P_{ \mathrm{Cov}(B^H_t - \mathbb{E}_s B^H_t)} f(\mathbb{E}_s B^H_t+X) = P_{c_H |t-s|^{2H} I_d} f(\mathbb{E}_s B^H_t+X), \end{align} $$

where in the last passage, we applied (1.24); here, given a symmetric nonnegative $\Sigma $ , $P_\Sigma $ denotes the convolution with the Gaussian density $p_\Sigma $ associated to $\mathcal {N}(0,\Sigma )$ . Throughout the paper, we will adopt the convention that $P_{t I_d}=P_t$ , in agreement with the standard notation for heat kernels, and for simplicity, we will drop the constant $c_H$ , so that in expressions like (1.25), only $P_{|t-s|^{2H}}$ will appear.
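For $H\in (0,1)$ , property (1.24) can be read off from representation (1.23). The following is only a sketch, under the assumptions that $\mathcal {F}_s$ is generated by the increments of $\tilde W$ up to time $s\geq 0$ and that the normalisation is the one in (1.23); the resulting value of $c_H$ is specific to these assumptions.

$$ \begin{align*} B^H_t-\mathbb{E}_s B^H_t&=\gamma_H\int_s^t (t-r)^{H-1/2}\,\mathrm{d}\tilde W_r,\\ \mathrm{Cov}\big(B^H_t-\mathbb{E}_s B^H_t\big)&=\gamma_H^2\int_s^t (t-r)^{2H-1}\,\mathrm{d} r\; I_d=\frac{\gamma_H^2}{2H}\,|t-s|^{2H}\, I_d. \end{align*} $$

The first identity holds because the part of the integral in (1.23) over $r\leq s$ is $\mathcal {F}_s$ -measurable, while the part over $r\in (s,t]$ is centered and independent of $\mathcal {F}_s$ ; the second one is Itô’s isometry applied componentwise.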

Remark 1.12. At the price of slightly anticipating some key concepts which will be introduced throughout the paper, let us discuss here how our methods extend to a larger class of random perturbations $B^H$ than just pure fBm. The main requirement we need, relaxing (1.24), is for $B^H$ to be a Gaussian process⁵ satisfying a two-sided bound

(1.26)

$$ \begin{align} C^{-1} |t-s|^{2H} I_d \leq \mathrm{Cov} \big( B^H_t-\mathbb{E}_s B^H_t \big)\leq C |t-s|^{2H} I_d \end{align} $$

for some $C\in (0,+\infty )$ and for all $s<t$ with $|t-s|$ sufficiently small; here, $\mathcal {F}_t$ is the natural filtration of $B^H$ . More precisely, the upper bound in (1.26) provides a priori estimates in the style of Lemma 2.1, while the lower bound (which is the actual LND property) ensures the regularising effect of $B^H$ and the application of stochastic sewing techniques. Indeed, by using properties of Gaussian convolutions, heat kernel bounds and a relation of the form (1.25), one can still find estimates of the form

$$ \begin{align*} \| \mathbb{E}_s f(B^H_t+X)\|_{L^\infty} & = \| \big( P_{ \mathrm{Cov} (B^H_t-\mathbb{E}_s B^H_t)} f \big) (\mathbb{E}_s B^H_t+X)\|_{L^\infty} \leq \| P_{ \mathrm{Cov} (B^H_t-\mathbb{E}_s B^H_t)} f\|_{L^\infty}\\ & \lesssim \| P_{C^{-1}|t-s|^{2H}} f\|_{L^\infty} \lesssim |t-s|^{\alpha H} \| f\|_{C^\alpha}, \end{align*} $$

for $\alpha \leq 0$ , which are the typical bounds needed throughout the proof. There are some passages where condition (1.26) alone is not enough, and we exploited other properties of fBm. Specifically, the counterexamples in Section 1.3 assume $B^H$ to be $(H-\varepsilon )$ -Hölder continuous and symmetric; the flows constructed in Sections 4–5 need some basic time-continuity $\mathbb {E}|B^H_t-B^H_s|\lesssim |t-s|^{H\wedge 1}$ in order to apply Kolmogorov-type criteria; more substantially, the results from Section 8 rely on a Volterra representation $B^H_t =\int _0^t K(t,s) \mathrm {d} W_s$ . These properties are satisfied by other interesting examples – for instance, type-II fBm and mixed fBm discussed in Remark 1.13 below.

The only section truly specific to fBm is Appendix C, which however, exactly for this reason, is not used throughout the main body of the paper. In this case, ad hoc criteria to check Girsanov transform for fBm are presented; any extension to other processes would require precise knowledge of the associated kernel $K(t,s)$ , and its verification can be very technical; cf. [Reference Nualart and Sönmez82].

Remark 1.13. Standard examples of processes satisfying (1.26) are deterministic additive perturbations of fBm (cf. Lemma 6.7), the so-called type-II fBm [Reference Marinucci and Robinson75] and mixed fBm introduced in [Reference Cheridito27]; given any $H_1\neq H_2$ , the process $B^{H_1}+B^{H_2}$ will satisfy condition (1.26) with $H=H_1\wedge H_2$ , both in the case $B^{H_1}$ and $B^{H_2}$ are sampled independently and the one instead where they are constructed from the same reference Brownian motion. In this case, our results yield a far-reaching generalization (also to any $d\geq 2$ ) of the ones provided in [Reference Nualart and Sönmez82] while not requiring highly technical use of Girsanov transform as therein.

Another interesting example is Bifractional Brownian motion of parameters $(H,K)$ (see [Reference Russo and Tudor90]), which is known to be LND with parameter $HK$ [Reference Tudor and Xiao95]; it is a generalization of fBm ( $K=1$ ), but even in the case $HK=1/2$ is not a semimartingale nor a Dirichlet process, although it scales like standard Bm. Our results show that it has a comparable regularising effect, although not amenable to Markovian/martingale techniques.

Another generalization of fBm is the so-called multifractional Brownian motion, in which the Hurst parameter is allowed to vary continuously in time, $H=H(t)$ ; two nonequivalent definitions for this process are given respectively in [Reference Peltier and Véhel85] (by modifying representation (1.23) by allowing $H=H(t)$ ) and in [Reference Benassi, Jaffard and Roux9] (by a harmonisable representation). In both cases, the process can be shown to be ‘locally LND around t’ with parameter $H(t)$ (see [Reference Ayache, Shieh and Xiao4] in the harmonisable case), and thus, we still expect our strategy to yield interesting results under appropriate modifications. Likely, the admissible range of $\alpha $ here would depend on both the supremum and infimum of $H(t)$ ; we leave more precise investigations for future research.

Finally, let us mention that for (sufficiently regular) solutions $u(x,t)$ to certain linear stochastic PDEs for any fixed x, the process $t\mapsto u(x,t)$ is LND (see, for example, [Reference Tudor and Xiao96]); this fact was exploited crucially in regularisation by noise for nonlinear SPDEs in [Reference Athreya, Butkovsky, Lê and Mytnik3].

1.5 Setup and notation

We provide here in a list all the main notations and conventions adopted throughout the paper.

• We always work on the time interval $t\in [0,1]$ . Increments of functions f on $[0,1]$ are denoted by $f_{s,t}:=f_t-f_s$ .
• Whenever considering a filtered probability space $(\Omega ,\mathcal {F},\mathbb {F},\mathbb {P})$ , we will implicitly assume that the filtration $\mathbb {F}=(\mathcal {F}_t)_{t\in [0,1]}$ satisfies the standard assumptions; in particular, $\mathcal {F}_0$ is complete. To denote conditional expectations, we use the shortcut notation $\mathbb {E}_s Y :=\mathbb {E} [Y| \mathcal {F}_s]$ .
• $L^m$ -norms without further notation are understood with respect to $\omega $ ; that is, $\|Y\|_{L^m}=\big (\mathbb {E}|Y|^m\big )^{1/m}$ for $m<\infty $ and $\|Y\|_{L^\infty }=\mathrm {esssup}_{\omega \in \Omega }|Y(\omega )|$ . For conditional $L^m$ -norms, we use the notation $\|Y\|_{L^m|\mathcal {F}_s}=\big (\mathbb {E}(|Y|^m|\mathcal {F}_s)\big )^{1/m}.$ For any $X,Y\in L^m$ such that Y is $\mathcal {F}_s$ -measurable, by conditional Jensen’s inequality, one has the $\mathbb {P}$ -a.s. bound
(1.27) $$ \begin{align} \|X-\mathbb{E}_s X\|_{L^m|\mathcal{F}_s} \leq \|X-Y\|_{L^m|\mathcal{F}_s} +\|Y-\mathbb{E}_s X\|_{L^m|\mathcal{F}_s} \leq 2\|X-Y\|_{L^m|\mathcal{F}_s}. \end{align} $$
Apart from the usual $L^m$ -norms, we also use the norms $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$ . We will always consider $n\geq m$ , in which case again by conditional Jensen, it holds
$$ \begin{align*} \| X\|_{L^m} \leq \big\|\,\| X \|_{L^m|\mathcal{F}_s}\big\|_{L^n} \end{align*} $$
with equality in the case $m=n$ . Such mixed norms still satisfy natural analogues of classical inequalities like Jensen’s, Hölder’s and Minkowski’s, as can be verified using properties of conditional expectation. Moreover, by the tower property, one can see that for $t\geq s$ , $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_t}\big \|_{L^n}$ is stronger than $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$ .
• Whenever talking about a weak solution X to the SDE (1.6), we will actually mean a tuple $(X,B^H; \Omega , \mathbb {F}, \mathbb {P})$ such that $(\Omega ,\mathbb {F},\mathbb {P})$ is a filtered probability space as above, X is $\mathbb {F}$ -adapted and $B^H$ is a $\mathbb {F}$ -fBm of parameter H. As usual, X is a strong solution if it is adapted to the (standard augmentation of) the filtration generated by $B^H$ . We say that pathwise uniqueness holds for the SDE if for any two solutions $X^1$ , $X^2$ , defined on the same $(\Omega ,\mathbb {F},\mathbb {P})$ , driven by the same $B^H$ and with same initial condition $x_0$ , it holds $X^1\equiv X^2 \mathbb {P}$ -a.s. We warn the reader to keep in mind that all such concepts are rather classical when b is at least a measurable function, so that (1.6) is meaningful in the Lebesgue sense. In the distributional regime $\alpha <0$ , this is not the case anymore. Therefore, the concept of weak solution becomes less standard. We postpone this discussion to the relevant Section 5, similarly for the concept of path-by-path uniqueness.
• Function spaces in the variable $x\in \mathbb {R}^d$ will often be denoted by the subscript x. For instance, standard Lebesgue spaces $L^p(\mathbb {R}^d;\mathbb {R}^m)$ with $p\in [1,\infty ]$ will often be denoted, when the target dimension m is clear, simply by $L^p_x$ . For $\alpha \in \mathbb {R}\setminus \mathbb {N}$ , we denote by $C^\alpha _x$ the inhomogeneous Hölder-Besov space $B^{\alpha }_{\infty ,\infty }$ (cf. [Reference Bahouri, Chemin and Danchin5]); instead, for nonnegative integer $\alpha $ , by $C^\alpha _x$ , we mean the space of bounded measurable functions all of whose weak partial derivatives up to order $\alpha $ are also essentially bounded and measurable (in other words, the Sobolev spaces $C^\alpha _x=W^{\alpha ,\infty }_x$ ); note that with this convention, elements of $C^0_x$ are not necessarily continuous. Recall that for $\alpha \in (0,1)$ , the space $C^\alpha _x=B^\alpha _{\infty ,\infty }$ coincides with the usual space of bounded $\alpha $ -Hölder continuous functions. By $C^{\alpha ,\mathrm {loc}}_x$ , we mean the space of functions f such that for all compactly supported smooth g, one has $f g\in C^\alpha _x$ . More quantitative versions of these are the weighted Hölder spaces $C^{\alpha ,\lambda }_x$ , for $\alpha \in (0,1]$ and $\lambda \in \mathbb {R}$ , defined through the (semi)norms

where $B_R$ is the ball of radius R around the origin.
• Given a Banach space E, we will use the shortcut notation $L^q_t E$ to denote the space $L^q([0,1];E)$ of Bochner measurable functions with finite norm $\| f\|_{L^q E}^q=\int _0^1 \| f_t\|_E^q\, \mathrm {d} t$ for any $q\in [1,\infty ]$ (up to the standard essential supremum convention for $q=\infty $ ). We use the shortcut notation $C_t E = C([0,1];E)$ for the space of continuous, E-valued functions with supremum norm; similarly, for $\gamma \in (0,1)$ , $C^\gamma _t E= C^\gamma ([0,1];E)$ is the space of E-valued, bounded and $\gamma $ -Hölder continuous functions. All definitions can be extended classically to Fréchet spaces E (in particular, allowing for $E=C^{\alpha ,\mathrm {loc}}_x$ or $L^{p,\mathrm {loc}}_x$ ) – for instance, in the case of $L^q_t E$ by requiring the associated countable seminorms $t\mapsto \| f_t \|_k$ to be all $L^q$ -integrable.
• Given a metric space $(E,d_E)$ and $p\in [1,\infty )$ , we say that a continuous E-valued function f on $[0,1]$ is of finite p-variation, in notation $f\in C^{p-{\mathrm {var}}}_t E$ , if
$$ \begin{align*} \|f\|_{p-{\mathrm{var}};E}:=\Big(\sup\sum_{i=0}^{n-1} d_E\big(f_{t_i},f_{t_{i+1}}\big)^p\Big)^{1/p}<\infty, \end{align*} $$
where the supremum runs over all possible partitions $0=t_0\leq t_1\leq \cdots \leq t_n=1$ of $[0,1]$ . The p-variation seminorm on subintervals $[s,t]\subset [0,1]$ is defined similarly and denoted by $\|f\|_{p-{\mathrm {var}};[s,t];E}$ . Whenever $E=\mathbb {R}^m$ for some $m\in \mathbb {N}$ , for simplicity, we just drop it and write $C^{p-{\mathrm {var}}}_t$ , $\|f\|_{p-{\mathrm {var}};[s,t]}$ , and similarly for $C^\alpha _t$ .
• All the notations introduced above can be concatenated by considering a different Banach/Fréchet space at each step. The convention we adopt is that, when writing spaces with respect to different variables, this is to be read from left to right; for example, $L^q_t C^\alpha _x L^m$ stands for $L^q\big ([0,1], C^\alpha (\mathbb {R}^d,L^m(\Omega ))\big )$ . Similarly, one can define, for example, $L^m C^{p-{\mathrm {var}}}_t C^{\alpha ,\mathrm {loc}}_x$ , $C^\gamma _t L^\infty _x$ , and so on. Mind in particular that with this convention $C^\alpha _t C^\alpha _x\neq C^\alpha _{t,x}$ , the latter denoting the space of $\alpha $ -Hölder continuous functions in $(t,x)$ .
• Let us recall some standard heat kernel estimates: for any $\alpha \geq \beta $ , there exists a constant $N=N(d,\alpha ,\beta )$ such that, for all $t\in (0,1]$ , one has the bound
(1.28) $$ \begin{align} \|P_{t}f\|_{C^\alpha_x}\leq N t^{(\beta-\alpha)/2}\|f\|_{C^\beta_x}; \end{align} $$
see [Reference Galeati and Gubinelli49, Lemma A.10] and the references therein for a more general statement.
• For $0\leq S\leq T\leq 1$ , we denote $[S,T]^2_\leq =\{(s,t)\in [S,T]^2:\,s\leq t\}$ and $[S,T]^3_\leq =\{(s,u,t)\in [S,T]^3:\,s\leq u\leq t\}$ . For $(s,t)\in [S,T]^2_\leq $ , define $s_-=s-(t-s)$ . We then set the slightly more restricted sets of pairs/triples as
$$ \begin{align*} & \overline{[S,T]}^2_\leq:=\{(s,t)\in[S,T]^2_\leq:\,s_-\geq S\},\\ & \overline{[S,T]}^3_\leq=\{(s,u,t)\in[S,T]^3_\leq:\,(u-s)\wedge(t-u)\geq (t-s)/3,\,s_-\geq S\}. \end{align*} $$
• Given a Fréchet space E and a map $A:[S,T]^2_\leq \to E$ , we define $\delta A:[S,T]^3_\leq \to E$ by $\delta A_{s,u,t} = A_{s,t}- A_{s,u}- A_{u,t}$ .
• We say that a function $w:[0,1]^2_\leq \to \mathbb {R}_+ $ is a control if it is continuous and superadditive (i.e., $w(s,u)+w(u,t)\leq w(s,t)$ for all $(s,u,t)\in [0,1]^3_\leq $ ). The most common controls for us will be of the form
(1.29) $$ \begin{align} w_{b,\alpha,q}(s,t):=\int_s^t\|b_r\|_{C^\alpha_x}^q\,dr. \end{align} $$
Recall that for any two controls $w_1,w_2$ and $\theta _1,\theta _2\in [0,\infty )$ such that $\theta _1+\theta _2\geq 1$ , $w=w_1^{\theta _1}w_2^{\theta _2}$ is also a control (see [Reference Friz and Victoir45, Exercises 1.8,1.9]). Note also that if w is a control, $\psi $ is an $\mathbb {R}^m$ -valued path and $\gamma \in (0,1]$ , then
(1.30) $$ \begin{align} \|\psi\|_{\frac{1}{\gamma}-{\mathrm{var}}}\leq w(0,1)^\gamma\sup_{0\leq s<t\leq 1}\frac{|\psi_{s,t}|}{w(s,t)^\gamma}; \end{align} $$
conversely, for $p\geq 1$ , if $\psi \in C^{p-{\mathrm {var}}}_t$ , then $w(s,t):=\|\psi \|_{p-{\mathrm {var}};[s,t]}^p$ is a control and $|\psi _{s,t}|\leq w(s,t)^{1/p}$ ; cf. [Reference Friz and Victoir45, Propositions 5.8-5.10].
• The space of probability measures on $\mathbb {R}^d$ is denoted by $\mathcal {P}(\mathbb {R}^d)$ . The law of a random variable X is denoted by $\mathcal {L}(X)$ . For $p\geq 1$ , we denote the p-Wasserstein distance on $\mathcal {P}(\mathbb {R}^d)$ by $\mathbb {W}_p$ , defined as
$$ \begin{align*} \mathbb{W}_p(\mu,\nu)^p=\inf_{\gamma\in \Gamma(\mu,\nu)}\int_{\mathbb{R}^{d}\times\mathbb{R}^d}|x-y|^p\gamma(\mathrm{d} x,\mathrm{d} y), \end{align*} $$
where $\Gamma (\mu ,\nu )$ is the set of all couplings of $\mu $ and $\nu $ (i.e., the probability measures on $\mathbb {R}^{d}\times \mathbb {R}^d$ whose first and second marginals are $\mu $ and $\nu $ , respectively). Note that $\mathbb {W}_p$ can take value $+\infty $ and is defined for any $\mu $ , $\nu $ , without any moment assumption.
• When a statement contains an estimate with a constant depending on a certain set of parameters, in the proof, we do not carry the constants from line to line. Rather, we write $A\lesssim B$ to denote the existence of a constant N depending on the same set of parameters such that $A\leq N B$ . Whenever such a set of parameters includes a parameter that is a norm (this will typically be the norm of the coefficient b), this dependence is always monotone increasing.
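As a small illustration of the notation for controls introduced in the list above, consider $b\in L^q_t C^\alpha _x$ with $q\in [1,\infty )$ and the function $w_{b,\alpha ,q}$ from (1.29). The following sketch uses only the definitions above together with Hölder’s inequality in time.

$$ \begin{align*} w_{b,\alpha,q}(s,u)+w_{b,\alpha,q}(u,t)&=w_{b,\alpha,q}(s,t)\quad\forall\,(s,u,t)\in[0,1]^3_\leq,\\ \int_s^t\|b_r\|_{C^\alpha_x}\,\mathrm{d} r&\leq \Big(\int_s^t\|b_r\|_{C^\alpha_x}^q\,\mathrm{d} r\Big)^{1/q}|t-s|^{1/q'}=w_{b,\alpha,q}(s,t)^{1/q}\,|t-s|^{1/q'}. \end{align*} $$

The first line shows that $w_{b,\alpha ,q}$ is additive, hence superadditive. Since $1/q+1/q'=1$ , the right-hand side of the second line is again a control, by the product rule for controls recalled above applied with $w_1=w_{b,\alpha ,q}$ , $w_2(s,t)=t-s$ , $\theta _1=1/q$ and $\theta _2=1/q'$ .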

2 A priori estimates and stochastic sewing

The key consequence of the subcriticality condition (A) is that in terms of local nondeterminism, drifts of solutions are more regular than the noise; in particular, the solution decomposes as $X=\varphi +B^H$ , where $\varphi $ plays the role of a slow variable, while $B^H$ is the highly oscillating component.⁶ This can be formulated as a precise quantitative bound by looking at the best conditional error committed by predicting the process $\varphi _t$ , given the history up to time s; more precisely, we look for estimates of the form

(2.1)

$$ \begin{align} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\leq w(s,t)^{1/q}|t-s|^{1/q'+\alpha H}\quad \forall\, (s,t)\in [0,1]^2_{\leq}, \end{align} $$

where $m\in [1,\infty )$ , w is a suitable control and $(q,\alpha ,H)$ are the parameters related to b, $B^H$ .

The subcritical regime $\alpha>1-1/(q' H)$ corresponds to the exponent $1/q'+\alpha H$ appearing in (2.1) being greater than H; this is in stark contrast with the lower bound provided by the LND property of fBm (1.24), which tells us that such an estimate cannot hold for $\varphi $ replaced by $B^H$ , justifying the slow-fast heuristic above.

It is also worth pointing out that $1/q'+\alpha H$ is allowed to exceed $1$ (this is indeed always the case for $H>1$ ), which will be used crucially in the following; in this case, the same bound could not hold if in (2.1), $\mathbb {E}_s \varphi _t$ were replaced by $\varphi _s$ , as one can easily check that the only processes satisfying the corresponding condition are the constant ones.

It will become clear in the sequel why (2.1) is exactly the right condition needed in our analysis; for the moment, let us show that solutions to SDEs naturally enjoy (2.1).

Lemma 2.1 below is based on an adaptation of [Reference Gerencsér53, Lemma 2.4], [Reference Butkovsky, Dareiotis and Gerencsér13, Lemma 4.2] to our setting. Note that in the statement, while we enforce the subcritical condition $\alpha>1-1/(q'H)$ , the restriction $q\leq 2$ is not necessary; we do, however, restrict to $\alpha \geq 0$ first. For distributional drifts, similar bounds will be derived from stochastic sewing; see Lemma 2.4 below.

Lemma 2.1. Let $H\in (0,\infty )\setminus \mathbb {N}$ , $q\in [1,\infty )$ , and $\alpha \in [0,1]$ satisfy $\alpha>1-1/(q'H)$ ; let $b\in L^q_t C^\alpha _x$ , X be a weak solution of (1.6) and set $\varphi :=X-B^H$ , so that

$$ \begin{align*} \varphi_t = x_0 + \int_0^t b_r(X_r) \mathrm{d} r. \end{align*} $$

Then, for any $m\in [1,\infty )$ , there exists a constant $N=N(d,H,\alpha ,m,\| b\|_{L^q_t C^\alpha _x})$ such that estimate (2.1) holds with the choice

(2.2)

$$ \begin{align} w(s,t)=N w_{b,\alpha,q}(s,t)= N \int_s^t \| b_r\|_{C^\alpha}^q \mathrm{d} r. \end{align} $$

Proof. First assume that, for some given $\beta \geq 0$ , the bound (2.1) holds with w as above and exponent $\beta $ in place of $1/q'+\alpha H$ . This is definitely the case with $\beta =1/q'$ , as one can see from

(2.3)

$$ \begin{align} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} & \leq 2\big\|\|\varphi_t-\varphi_s\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} \nonumber\\ & \leq 2 \int_s^t \| b_r\|_{C^0}\, \mathrm{d} r \leq 2 w_{b,\alpha,q}(s,t)^{1/q} |t-s|^{1/q'}; \end{align} $$

in the above passages, we applied (1.27), the definition of $\varphi $ and lastly Hölder’s inequality.
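
The last inequality in (2.3) can be spelled out as follows. This is a sketch assuming $q\in (1,\infty )$ and the usual normalisation $\| f\|_{C^0}\leq \| f\|_{C^\alpha }$ for $\alpha \geq 0$ , which is how the norms appear to be used in the text:

$$ \begin{align*} \int_s^t \| b_r\|_{C^0}\, \mathrm{d} r \leq \int_s^t \| b_r\|_{C^\alpha}\, \mathrm{d} r \leq \Big(\int_s^t \| b_r\|_{C^\alpha}^q\, \mathrm{d} r\Big)^{1/q}\Big(\int_s^t 1\, \mathrm{d} r\Big)^{1/q'} = w_{b,\alpha,q}(s,t)^{1/q}\, |t-s|^{1/q'}. \end{align*} $$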

Assuming we already have the bound for a generic $\beta \geq 1/q'$ , we can then apply (1.27) for the choice $Y=\varphi _s + \int _s^t b_r(\mathbb {E}_s X_r) \mathrm {d} r$ , together with the definition of $\varphi $ , to find

$$ \begin{align*} \| \varphi_t -\mathbb{E}_s \varphi_t \|_{L^m|\mathcal{F}_s} & \leq 2 \Big\| \varphi_t - \varphi_s - \int_s^t b_r ( \mathbb{E}_s \varphi_r +\mathbb{E}_s B^H_r) \mathrm{d} r \Big\|_{L^m|\mathcal{F}_s}\\ & \leq 2 \int_s^t \big\| b_r (\varphi_r + B^H_r) - b_r (\mathbb{E}_s \varphi_r +\mathbb{E}_s B^H_r) \big\|_{L^m|\mathcal{F}_s} \,\mathrm{d} r\\ & \leq 2 \int_s^t \| b_r \|_{C^{\alpha}_x} \big\| \varphi_r -\mathbb{E}_s \varphi_r + B^H_r -\mathbb{E}_s B^H_r \big\|_{L^m|\mathcal{F}_s}^{\alpha}\,\mathrm{d} r\\ & \leq 2 \int_s^t \| b_r \|_{C^{\alpha}_x} \big( \| \varphi_r -\mathbb{E}_s \varphi_r\|_{L^m|\mathcal{F}_s}^{\alpha} + \| B^H_r -\mathbb{E}_s B^H_r \|_{L^m|\mathcal{F}_s}^{\alpha}\big)\,\mathrm{d} r; \end{align*} $$

in the above estimates, we used multiple times basic properties of conditional norms like Jensen’s and Minkowski’s inequality. By the properties of fBm recalled in Section 1.4 and the independence of $B_r^H-\mathbb {E}_s B^H_r$ from $\mathcal {F}_s$ , we have the bound

$$ \begin{align*} \big\| \|B^H_r -\mathbb{E}_s B^H_r \|_{L^m|\mathcal{F}_s} \big\|_{L^\infty}\lesssim |r-s|^H \quad \forall s\leq r. \end{align*} $$

Combined with our standing assumption on $\varphi $ , by taking $L^\infty $ -norms on both sides and using Minkowski’s and Hölder’s inequalities for the integral, we get

$$ \begin{align*} \big\|\| \varphi_t -\mathbb{E}_s \varphi_t \|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} & \lesssim \int_s^t \| b_r\|_{C^\alpha_x} \Big( \big\|\| \varphi_r -\mathbb{E}_s \varphi_r \|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}^\alpha + |r-s|^{\alpha H} \Big) \mathrm{d} r\\ & \lesssim \int_s^t \| b_r\|_{C^\alpha_x} \Big( w_{b,\alpha,q}(s,r)^{\alpha/q}|r-s|^{\alpha\beta} + |r-s|^{\alpha H} \Big) \mathrm{d} r\\ & \lesssim w_{b,\alpha,q}(s,t)^{1/q}\Big(w_{b,\alpha,q}(s,t)^{\alpha/q}|t-s|^{\alpha\beta+1/q'} + |t-s|^{\alpha H+1/q'}\Big). \end{align*} $$

In other terms, if $\varphi $ satisfies (2.1) with $1/q'+\alpha H$ replaced by $\beta $ , then it does so also with $\tilde {\beta }=f(\beta )=\alpha (\beta \wedge H)+1/q'$ (up to a change in the generic constant N).

From here, the argument is identical to the one from [Reference Gerencsér53, Lemma 2.4]: by iterating, we can define a sequence $\{\beta _n\}_n$ by $\beta _{n+1}= f(\beta _n)$ with $\beta _0=1/q'$ ; it remains to note that the condition $\alpha>1-1/(q'H)$ guarantees that the only fixed point $\bar {\beta }$ of the map $\tilde {f}(\beta )= \alpha \beta +1/q'$ is strictly larger than H and is attracting exponentially fast any orbit defined by $\tilde {\beta }_{n+1}=\tilde {f}(\tilde \beta _n)$ . Given that the sequences $\{\beta _n\}_n$ and $\{\tilde \beta _n\}_n$ coincide as long as the first one does not exceed H, this necessarily implies that the first one stabilizes to $\beta =\alpha H+ 1/q'$ after a finite number of iterations $\bar {n}$ (i.e., $\beta _n=\alpha H+ 1/q'$ for all $n\geq \bar {n}$ ).
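
The fixed-point claim can be made explicit. The following is a sketch restricted to $\alpha \in [0,1)$ (for $\alpha =1$ the affine map has no fixed point and the orbit simply grows linearly):

$$ \begin{align*} \bar\beta=\alpha\bar\beta+\frac{1}{q'} \ \Longleftrightarrow\ \bar\beta=\frac{1}{q'(1-\alpha)}, \qquad \bar\beta> H \ \Longleftrightarrow\ \alpha> 1-\frac{1}{q'H}, \qquad \tilde\beta_n-\bar\beta=\alpha^n\big(\tilde\beta_0-\bar\beta\big). \end{align*} $$

As an illustration with values chosen here (not taken from the text), let $H=3/4$ , $q=2$ and $\alpha =1/2$ , so that $1-1/(q'H)=1/3<\alpha $ and $\bar \beta =1$ . Then $\beta _0=1/2$ , $\beta _1=3/4$ , $\beta _2=\alpha (\beta _1\wedge H)+1/q'=7/8$ and $\beta _3=\alpha H+1/q'=7/8$ , so the sequence stabilizes at $\alpha H+1/q'$ from $\bar n=2$ onwards.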

Remark 2.2. The case $m=\infty $ can be handled with an appropriate stopping argument; see [Reference Gerencsér53, Lemma 2.4]. This can be used to derive similar bounds for processes that are not exact solutions (for example Picard iterates), but we do not need this generality.

The next ingredient is an a priori estimate for $\alpha <0$ , analogous to Lemma 2.1. Recall that for any adapted process $\varphi $ , by (1.27), one has

$$ \begin{align*} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\leq 2\big\|\|\varphi_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}; \end{align*} $$

in the distributional case, we will directly bound the latter quantity. Unlike Lemma 2.1, here we cannot allow for any $q\in (2,\infty ]$ and subcritical $\alpha $ ; rather, we need to impose the stronger condition (B), which was introduced just before Theorem 1.5.

Remark 2.3. As mentioned in Remark 1.6, for $q\in (1,2]$ , condition (B) reduces to (A). For $q\in (2,\infty )$ , the a priori estimate below will be relevant in Section 8, where we establish existence of weak solutions in a regime where uniqueness is not known. Contrary to Lemma 2.1, the proof of Lemma 2.4 will rely on stochastic sewing techniques. We could use the upcoming very general (but quite technical) Lemma 2.5 for this task; but in order to help the intuition, we prefer first to invoke the result from [Reference Friz, Hocquet and Lê44], whose statement is simpler, and postpone the application of Lemma 2.5 to where it is truly needed (e.g., Lemma 3.1).

Lemma 2.4. Assume (B) and, in addition, $\alpha <0$ . Let $b\in L^q_tC^1_x$ and let X be the unique strong solution to (1.6) for some initial condition $x_0\in \mathbb {R}^d$ ; set $w:=w_{b,\alpha ,q}$ and $\varphi =X-B^H$ . Then for any $m\in [1,\infty )$ , there exists a constant $N=N(m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$ such that for all $(s,t)\in [0,1]_\leq ^2$ , one has the bound

(2.4)

$$ \begin{align} \big\|\| \varphi_{s,t}\|_{L^m\vert \mathcal{F}_s}\big\|_{L^\infty} \leq N w(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}. \end{align} $$

Proof. Up to shifting, we can assume without loss of generality $x_0=0$ ; moreover, we only need to deal with $m\in [2,\infty )$ since $\| \cdot \|_{L^m\vert \mathcal {F}_s} \leq \| \cdot \|_{L^2\vert \mathcal {F}_s}$ otherwise. Fix $m\in [2,\infty )$ and set the shorthand $\beta :=\alpha H +1/q'$ ; recall that by (B), one has $\beta>H$ .

Let us first assume that (2.4) holds with w replaced by another control $\tilde {w}$ ; this is definitely the case for $\tilde w = w_{b,0,q}$ , arguing as in (2.3). Given such $\tilde w$ and any closed subinterval $I\subset [0,1]$ , define

with the convention $0/0=0$ . Fix $(s,t)\in [0,1]_\leq ^2$ and, for any $(s',t')\in [s,t]_\leq ^2$ , set

$$ \begin{align*}A_{s',t'}:=\mathbb{E}_{s'} \int_{s'}^{t'} b_r(\varphi_{s'}+B^H_r)\mathrm{d} r = \int_{s'}^{t'} P_{|r-s'|^{2H}} b_r (\varphi_{s'}+\mathbb{E}_{s'} B^H_r) \mathrm{d} r, \end{align*} $$

where in the second passage, we used conditional Fubini and property (1.25) (please remember our convention about not writing explicitly the constant $c_H$ or the matrix $I_d$ ).

Our aim is to apply the stochastic sewing lemma (in the version given by [Reference Friz, Hocquet and Lê44, Theorem 2.7]) to A in order to find a closed estimate for . By the heat kernel estimates (1.28), we have $\mathbb {P}$ -almost surely

$$ \begin{align*} | A_{s',t'}| &\leq \int_{s'}^{t'} \| P_{|r-s'|^{2H}} b_r\|_{C^0} \mathrm{d} r \lesssim \int_{s'}^{t'} |r-s'|^{\alpha H} \|b_r\|_{C^\alpha} \mathrm{d} r \lesssim |t'-s'|^\beta w(s',t')^{1/q}, \end{align*} $$

where in the last passage, we applied Hölder’s inequality, and the $L^{q'}$ -integrability of $|r-s'|^{\alpha H}$ follows from (B). Similarly, we have the $\mathbb {P}$ -a.s. bound

The integrability of the power follows again from (B), as do the inequalities $\beta +1/q>1/2$ , $2\beta -H +2/q>1$ (we remark that it is only the latter for which the additional condition in (B) was introduced). Therefore, the stochastic sewing lemma [Reference Friz, Hocquet and Lê44, Theorem 2.7] applies and allows us to derive estimates for the sewing $\mathcal {A}$ associated to A. However, one can easily identify $\mathcal {A}_\cdot $ ; indeed, by the spatial regularity of b, we have the bound

$$ \begin{align*} \|\varphi_{s',t'}-A_{s',t'}\|_{L^m}\lesssim |t'-s'|^\varepsilon\, w_{b,1,1}(s',t') \end{align*} $$

for some $\varepsilon>0$ , which allows us to conclude that $\mathcal {A}_{\cdot }=\varphi _{s,\cdot }$ again by [Reference Friz, Hocquet and Lê44, Theorem 2.7-(b)]. Overall, we deduce that there exists a constant $N_0=N_0(m,d,\alpha ,q,H)$ such that

Dividing both sides by $|t'-s'|^\beta w^{1/q}(s',t')$ , taking the supremum over $[s',t']\subset [s,t]$ and using the fact that all our estimates are on $[s,t]\subset [0,1]$ , we obtain

(2.5)

In particular, (2.5) shows that is finite; we can then go again through the whole argument, with $\tilde {w}$ replaced by w, to find

(2.6)

which readily yields a closed estimate for , at least for $[s,t]$ sufficiently small.

Our last task is to remove the smallness condition on $[s,t]$ in order to achieve a global bound. To this end, define a new control $w_\ast $ by $w_\ast (s,t)^{1/q+\beta -H}=w(s,t)^{1/q} |t-s|^{\beta -H}$ and an increasing sequence $\{t_n\}_n$ by $t_0=0$ and $w_\ast (t_n,t_{n+1})^{1/q+\beta -H}=(2N_0)^{-1}$ . Applying (2.6) for $[s,t]=[t_n,t_{n+1}]$ , by construction, one finds .

If $t_1=1$ , this immediately yields the conclusion. Suppose this is not the case. Then for any pair $s<t$ which do not belong to the same subinterval $[t_n,t_{n+1}]$ , there exist $\ell ,m\in \mathbb {N}$ such that $t_{\ell -1}< s\leq t_\ell \leq \ldots \leq t_m\leq t<t_{m+1}$ . Set $\tau _{\ell -1}=s$ , $\tau _i=t_i$ for $i=\ell ,\ldots , m$ and $\tau _{m+1}=t$ . It holds

$$ \begin{align*} \big\| \| \varphi_{s,t}\|_{L^m\vert \mathcal{F}_s}\big\|_{L^\infty} & \leq \sum_{i=\ell-1}^{m} \big\|\| \varphi_{\tau_i,\tau_{i+1}}\|_{L^m\vert \mathcal{F}_s} \big\|_{L^\infty} \leq \sum_{i=\ell-1}^{m} \big\|\| \varphi_{\tau_i,\tau_{i+1}}\|_{L^m\vert \mathcal{F}_{\tau_i}} \big\|_{L^\infty}\\ & \lesssim_{N_0} \sum_{i=\ell-1}^{m} w(\tau_i,\tau_{i+1})^{1/q} |\tau_i-\tau_{i+1}|^\beta \\ & \leq (m+1-\ell)^{-\alpha H} \Big( \sum_{i=\ell-1}^{m} \big [w(\tau_i,\tau_{i+1})^{1/q} |\tau_i-\tau_{i+1}|^\beta\big]^{\frac{1}{1+\alpha H}} \Big)^{1+\alpha H}\\ & \leq (m+1-\ell)^{-\alpha H} w(s,t)^{1/q} |t-s|^\beta, \end{align*} $$

where in the last two passages, we used the fact that $\beta +1/q=1+\alpha H\in (0,1)$ , Jensen’s inequality and the superadditivity of the control $[w(s,t)^{1/q} |t-s|^\beta ]^{\frac {1}{1+\alpha H}}$ . Observe that $m+1-\ell $ is less than or equal to the total number of intervals $[t_n,t_{n+1}]$ . In turn, by their definition and the superadditivity of $w_\ast $ , this is bounded by a multiple of

$$\begin{align*}w_\ast (0,1)=w(0,1)^{\frac{(\alpha H + 1-H)^{-1}}{q}}=\| b\|_{L^q C^\alpha}^{(\alpha H + 1-H)^{-1}},\end{align*}$$

which finally yields the conclusion.
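
The exponent bookkeeping in this last step can be checked as follows. This is a sketch in which the shorthand $\theta $ and the number K of intervals are introduced here for illustration, and which uses the standard fact that a product $w^{a}|t-s|^{b}$ with $a,b\geq 0$ , $a+b=1$ is again a control:

$$ \begin{align*} \theta:=\frac{1}{q}+\beta-H=\frac{1}{q}+\frac{1}{q'}+\alpha H-H=1+\alpha H-H, \qquad w_\ast(s,t)=w(s,t)^{\frac{1}{q\theta}}\,|t-s|^{\frac{\beta-H}{\theta}}, \end{align*} $$

$$ \begin{align*} K\,(2N_0)^{-1/\theta}=\sum_{n=0}^{K-1} w_\ast(t_n,t_{n+1})\leq w_\ast(0,1)=w(0,1)^{\frac{1}{q\theta}}=\| b\|_{L^q C^\alpha}^{1/\theta}. \end{align*} $$

Note that $\theta>1/q>0$ because $\beta>H$ , so all the powers above are well defined, and K is bounded by $(2N_0)^{1/\theta }\| b\|_{L^q C^\alpha }^{1/\theta }$ .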

Next, we formulate two appropriate versions of the stochastic sewing lemma (SSL). After its introduction by Lê [Reference Lê71], in recent years, the SSL has seen many variations. Our first SSL combines three modifications: it incorporates shifting (as in [Reference Gerencsér53]), as well as controls and general $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$ norms (as in [Reference Friz, Hocquet and Lê44, Reference Lê72]). Let us remark that this combination is not completely obvious and comes with a price: due to the shifting, we need a nontrivial ‘time component’ $|t-s|^\varepsilon $ in our estimates, which does not appear in [Reference Friz, Hocquet and Lê44, Reference Lê72]. Nonetheless, the resulting statement is well-suited for our applications, where such ‘time component’ always appears naturally.

Recall the notations from Section 1.5, concerning $[0,1]_\leq $ , $\overline {[S,T]}^2_\leq $ , $s_{-}$ and so on.

Lemma 2.5. Let $w_1,w_2$ be controls, and let $m,n$ satisfy $2\leq m\leq n\leq \infty $ and $m<\infty $ . Let $(S,T)\in [0,1]^2_\leq $ . Assume that $(A_{s,t})_{(s,t)\in \overline {[S,T]}^2_\leq }$ is a continuous mapping from $\overline {[S,T]}^2_\leq $ to $L^m$ such that for all $(s,t)\in \overline {[S,T]}^2_\leq $ , $A_{s,t}$ is $\mathcal {F}_t$ -measurable. Suppose that there exist constants $\varepsilon _1,\varepsilon _2>0$ such that the bounds

(2.7)

$$ \begin{align} \big\|\|A_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}, \end{align} $$

(2.8)

$$ \begin{align} \|\mathbb{E}_{s_-}\delta A_{s,u,t}\|_{L^n}&\leq w_2(s_-,t)|t-s|^{\varepsilon_2} \end{align} $$

hold for all $(s,u,t)\in \overline {[S,T]}^3_\leq $ . Then for all $S<s\leq t\leq T$ , the Riemann sums

(2.9)

$$ \begin{align} \sum_{j=0}^{2^\ell-1} A_{s+j2^{-\ell}(t-s),s+(j+1)2^{-\ell}(t-s)} \end{align} $$

converge as $\ell \to \infty $ in $L^m$ to the increments $\mathcal {A}_t-\mathcal {A}_s$ of an adapted stochastic process $(\mathcal {A}_t)_{t\in [S,T]}$ that is continuous as a mapping from $[S,T]$ to $L^m$ and satisfies $\mathcal {A}_S=0$ . Moreover, $\mathcal {A}$ is the unique such process that satisfies the bounds

(2.10)

$$ \begin{align} \big\|\|\mathcal{A}_{t}-\mathcal{A}_s-A_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq K_1 w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+K_2 w_2(s_-,t)|t-s|^{\varepsilon_2}, \end{align} $$

(2.11)

$$ \begin{align} \|\mathbb{E}_{s_-}\big(\mathcal{A}_t-\mathcal{A}_s-A_{s,t}\big)\|_{L^n}&\leq K_2 w_2(s_-,t)|t-s|^{\varepsilon_2}, \end{align} $$

with some $K_1,K_2$ for all $(s,t)\in \overline {[S,T]}^2_\leq $ . Furthermore, there exists a constant K depending only on $\varepsilon _1,\varepsilon _2,m,n,d$ such that the bounds (2.10)–(2.11) hold with $K_1=K_2=K$ , and moreover, the bound

(2.12)

$$ \begin{align} \big\|\|\mathcal{A}_t-\mathcal{A}_s\|_{L^m|\mathcal{F}_s}\big\|_{L^n}\leq K\big( w_1(s,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s,t)|t-s|^{\varepsilon_2}\big) \end{align} $$

holds for all $(s,t)\in [S,T]^2_\leq $ .

Proof. Since there is by now an abundance of SSLs in the recent literature, we do not aim to give a fully self-contained proof. We provide details only where the combination of the arguments of [Reference Gerencsér53] and [Reference Friz, Hocquet and Lê44, Reference Lê72] is nontrivial.

Step 1 (convergence along dyadic partitions). Let $(s,t)\in \overline {[S,T]}_\leq ^2$ and for each $k=0,1,\ldots $ define $\mathcal {D}_k=\{t_0^k,t_1^k,\ldots ,t_{2^k}^k\}$ , where $t_i^k=s+i2^{-k}(t-s)$ , and set

$$ \begin{align*} \mathcal{A}^k_{s,t}=\sum_{i=1}^{2^k}A_{t_{i-1}^k,t_i^k}. \end{align*} $$

We claim that $\mathcal {A}^k_{s,t}$ converges, and its limit $\tilde {\mathcal {A}}_{s,t}$ satisfies the bounds (2.10)–(2.11) with $K=K_1=K_2$ when replacing $\mathcal {A}_t-\mathcal {A}_s$ by it. In particular, this would also imply the bound

(2.13)

$$ \begin{align} \big\|\|\tilde{\mathcal{A}}_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}\leq K\big( w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2}\big) \end{align} $$

for all $(s,t)\in \overline {[S,T]}^2_\leq $ . The claim clearly follows from the following two bounds:

(2.14)

$$ \begin{align} \big\|\| \mathcal{A}^{k-1} _{s,t} -\mathcal{A}^{k} _{s,t} \|_{L^m|\mathcal{F}_s}\big\|_{L^n} & \lesssim w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}2^{-k\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}, \end{align} $$

(2.15)

$$ \begin{align} \| \mathbb{E}_{s_-} \big( \mathcal{A}^{k-1} _{s,t} -\mathcal{A}^{k} _{s,t} \big) \| _{L^n}& \lesssim w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}. \end{align} $$

It is no loss of generality to assume $k\geq 2$ (otherwise, the trivial bounds below suffice), in which case we write

(2.16)

$$ \begin{align} \mathcal{A}^{k}_{s,t}-\mathcal{A}^{k-1}_{s,t}=-\delta A_{t_0^{k},t_1^{k},t_2^{k}}-\sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}. \end{align} $$
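
The identity behind this decomposition can be checked directly. The following is a sketch, assuming the usual convention $\delta A_{s,u,t}=A_{s,t}-A_{s,u}-A_{u,t}$ (presumably the one fixed in Section 1.5) and using that the dyadic points of level $k-1$ are exactly the even-indexed points of level k, that is, $t^{k-1}_j=t^k_{2j}$ :

$$ \begin{align*} \mathcal{A}^{k-1}_{s,t}-\mathcal{A}^{k}_{s,t} &=\sum_{j=0}^{2^{k-1}-1}\Big(A_{t_{2j}^{k},t_{2j+2}^{k}}-A_{t_{2j}^{k},t_{2j+1}^{k}}-A_{t_{2j+1}^{k},t_{2j+2}^{k}}\Big)\\ &=\delta A_{t_0^{k},t_1^{k},t_2^{k}}+\sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}. \end{align*} $$

The $j=0$ summand is the one treated separately, since no earlier grid point $t^k_{2j-2}$ is available for it.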

For the first term, we use the conditions (2.7)–(2.8) in a trivial way:

$$ \begin{align*} \,&\big\|\|\delta A_{t_{0}^{k},t_{1}^{k},t_{2}^{k}}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} \lesssim w_1(t_{0}^{k}-(t_{2}^{k}-t_{0}^{k}),t_{2}^{k})^{1/2}|t_{2}^{k}-t_{0}^{k}|^{\varepsilon_1}\lesssim w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}2^{-k\varepsilon_1}, \\ \,&\|\mathbb{E}_{s_-} \delta A_{t_{0}^{k},t_{1}^{k},t_{2}^{k}}\|_{L^n} \leq w_2(t_{0}^{k}-(t_{2}^{k}-t_{0}^{k}),t_{2}^{k})|t_{2}^{k}-t_{0}^{k}|^{\varepsilon_2}\lesssim w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}. \end{align*} $$

For the sum in (2.16), we write

(2.17)

$$ \begin{align} \sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}&= \sum_{j=1}^{2^{k-1}-1}\mathbb{E}_{t_{2j-2}^k}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k} \nonumber \\ &\qquad+\sum_{\ell=0}^1\sum_{j=0}^{2^{k-2}}({\mathrm{id}}-\mathbb{E}_{t_{4j+2\ell}^k}\big)\delta A_{t_{4j+2\ell+2}^k,t_{4j+2\ell+3}^k,t_{4j+2\ell+4}^k} \nonumber\\ &=:I_1+I_2, \end{align} $$

where the term $\delta A_{t_{2^k}^k,t_{2^k+1}^k,t_{2^k+2}^k}$ is defined to be $0$ . The point of this unaesthetic decomposition is twofold. First, since $t^k_{2j-2}=t^k_{2j}-(t^k_{2j+2}-t^k_{2j})$ , in the terms in the first sum, there is sufficient shifting in the conditioning so that they can be estimated via the assumed bound (2.8). Second, for each $\ell =0,1$ , the inner sum above is one of martingale differences.

Therefore, we first estimate by the triangle inequality

(2.18)

$$ \begin{align} \big\|\|I_1\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq\sum_{j=1}^{2^{k-1}-1}\big\|\|\mathbb{E}_{t_{2j-2}^k}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} \nonumber\\ &\leq\sum_{j=1}^{2^{k-1}-1}\|\mathbb{E}_{t_{2j}^k-(t_{2j+2}^k-t_{2j}^k)}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k}\big\|_{L^n} \nonumber\\ &\leq \sum_{j=1}^{2^{k-1}-1} w_2(t_{2j-2}^k,t_{2j+2}^k)|t_{2j+2}^k-t_{2j}^k|^{\varepsilon_2} \nonumber\\ &\lesssim |t-s|^{\varepsilon_2}2^{-k\varepsilon_2}w_2(s,t), \end{align} $$

using the superadditivity of $w_2$ in the last line. Similarly, but replacing the triangle inequality by the Burkholder-Davis-Gundy and Minkowski inequalities (e.g., in the form given in [Reference Lê72, Lemma 2.5] for $\mathfrak {p}=2$ ), we have

(2.19)

$$ \begin{align} \big\|\|I_2\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\lesssim\sum_{\ell=0}^1\Big(\sum_{j=0}^{2^{k-2}}\big\|\|\delta A_{t_{4j+2\ell+2}^k,t_{4j+2\ell+3}^k,t_{4j+2\ell+4}^k}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}^2\Big)^{1/2} \nonumber\\ &\lesssim 2^{-k\varepsilon_1}\sum_{\ell=0}^1 \Big(\sum_{j=0}^{2^{k-2}}w_1(t_{4j+2\ell}^k,t_{4j+2\ell+4}^k)\Big)^{1/2} \nonumber\\ &\lesssim |t-s|^{\varepsilon_1}2^{-k\varepsilon_1}w_1(s,t)^{1/2}. \end{align} $$

This proves (2.14). As for (2.15), it is only easier: noting that

$$ \begin{align*} \mathbb{E}_s\sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}=\mathbb{E}_s I_1, \end{align*} $$

we can bound $\|\mathbb {E}_sI_1\|_{L^n}\leq \|I_1\|_{L^n}$ just as in (2.18). This concludes the proof of (2.14)–(2.15).

Step 2 (convergence along regular partitions). Let us say that a partition $\pi =\{s=t_0<t_1<\cdots <t_n=t\}$ is regular if $|\pi |:=\max (t_i-t_{i-1})\leq 2\min (t_i-t_{i-1})$ . For any partition, we can define

$$ \begin{align*} \mathcal{A}^{\pi}_{s,t}=\sum_{i=1}^n A_{t_{i-1},t_i}. \end{align*} $$

Very similarly to Step 1, we get that for any sequence of regular partitions $(\pi _n)_{n\in \mathbb {N}}$ with $|\pi _n|\to 0$ , $\mathcal {A}^{\pi _n}_{s,t}$ converges (for details, see [Reference Gerencsér53, Lemma 2.2]). Therefore, on the one hand, this limit has to coincide with $\tilde {\mathcal {A}}_{s,t}$ , and on the other hand, this limit is clearly additive. Moreover, notice that by construction, $\tilde {\mathcal {A}}_{s,t}$ is $\mathcal {F}_t$ -measurable for all $(s,t)\in \overline {[S,T]}_\leq ^2$ , and since by (2.13) it vanishes in $L^m$ as $|t-s|\to 0$ , the additivity implies that it is continuous in both arguments as a two-parameter process with values in $L^m$ .

Step 3 (the process $\mathcal {A}$ and its bounds). For any $t\in (S,T]$ and $i\geq 0$ , we set $s_i:=(S+2^{-i})\wedge t$ . We then claim that the series

$$ \begin{align*} \mathcal{A}_t:=\sum_{i=1}^\infty \tilde{\mathcal{A}}_{(S+2^{-i})\wedge t,(S+2^{-i+1})\wedge t}=:\sum_{i=1}^\infty \tilde{\mathcal{A}}_{s_i,s_{i-1}} \end{align*} $$

converges. Indeed, since $(s_i,s_{i-1})\in \overline {[S,T]}^2_\leq $ , we may use the bound (2.13). By the trivial bounds $w((s_i)_-,s_{i-1})\leq w(S,t)$ and $|s_{i-1}-s_i|\leq 2^{-i}\mathbf {1}_{t-S\geq 2^{-i}}$ , we get not only the convergence of the series but also the bound

$$ \begin{align*} \big\|\|\mathcal{A}_t\|_{L^m|\mathcal{F}_S}\big\|_{L^n}\leq K\big( w_1(S,t)^{1/2}|t-S|^{\varepsilon_1}+w_2(S,t)|t-S|^{\varepsilon_2}\big). \end{align*} $$

This is precisely (2.12) with $s=S$ . The case for general $(s,t)\in [S,T]_{\leq }^2$ follows in the same way. It is also clear that $\mathcal {A}_S=0$ and, by the remarks in Step 2, that $\mathcal {A}$ is adapted and continuous in $L^m$ . Therefore, $\mathcal {A}$ satisfies all of the claimed properties.
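
The summation behind the bound for $s=S$ is a geometric series. As a sketch, for any $\varepsilon>0$ and $t\in (S,T]$ , writing $i_0:=\min \{i\geq 1:\, 2^{-i}\leq t-S\}$ (a shorthand introduced here), one has

$$ \begin{align*} \sum_{i=1}^\infty 2^{-i\varepsilon}\,\mathbf{1}_{t-S\geq 2^{-i}}=\sum_{i\geq i_0}2^{-i\varepsilon}=\frac{2^{-i_0\varepsilon}}{1-2^{-\varepsilon}}\leq \frac{|t-S|^{\varepsilon}}{1-2^{-\varepsilon}}. \end{align*} $$

Applying this with $\varepsilon =\varepsilon _1$ and $\varepsilon =\varepsilon _2$ to the terms of (2.13), together with the trivial bounds on the controls, gives the stated estimate up to enlarging the constant K.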

Step 4 (Uniqueness). The proof of this is standard and can be found in, for example, [Reference Lê72].

The other version of the SSL that we use seems to be new. In Lemma 2.5, one can transfer $L^m$ bounds from A to $\mathcal {A}$ if $m<\infty $ . The $m=\infty $ case is a bit different: $L^\infty $ bounds on A imply Gaussian moment bounds on $\mathcal {A}$ . An alternative way to obtain Gaussian moment bounds via stochastic sewing is presented in [Reference Butkovsky, Dareiotis and Gerencsér12] (see, for example, Theorem 3.3 and Lemma 4.6 therein), but the conditions herein are easier to verify. The proof relies on a conditional version of the Azuma–Hoeffding inequality; see Lemma A.1 in Appendix A.

Lemma 2.6. Let $(A_{s,t})_{(s,t)\in \overline {[S,T]}^2_\leq }$ be a continuous mapping from $\overline {[S,T]}^2_\leq $ to $L^2$ , with $A_{s,t}$ being $\mathcal {F}_t$ -measurable for all $(s,t)\in \overline {[S,T]}^2_\leq $ , such that the conditions of Lemma 2.5 hold with $m=n=\infty $ ; namely, assume that there exist controls $w_1,w_2$ and constants $\varepsilon _1,\varepsilon _2>0$ such that the bounds

(2.20)

$$ \begin{align} \big\|\|A_{s,t}\|_{L^\infty|\mathcal{F}_s}\big\|_{L^\infty}&\leq w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}, \end{align} $$

(2.21)

$$ \begin{align} \|\mathbb{E}_{s_-}\delta A_{s,u,t}\|_{L^\infty}&\leq w_2(s_-,t)|t-s|^{\varepsilon_2} \end{align} $$

hold for all $(s,u,t)\in \overline {[S,T]}^3_\leq $ . Denote by $(\mathcal {A}_t)_{t\in [S,T]}$ the associated process coming from Lemma 2.5. Then there exist positive constants $\mu $ and K depending only on $\varepsilon _1,\varepsilon _2,d$ such that the bound

(2.22)

$$ \begin{align} \mathbb{E}\bigg[\exp\bigg(\mu\,\frac{|\mathcal{A}_t-\mathcal{A}_s|^2}{\big(w_1(s,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s,t)|t-s|^{\varepsilon_2}\big)^2}\bigg)\bigg\vert \mathcal{F}_s\bigg]\leq K \end{align} $$

holds for all $(s,t)\in [S,T]_{\leq }^2$ .

Proof. We continue using the notation of the proof of Lemma 2.5. Let $(s,t)\in \overline {[S,T]}_\leq ^2$ and $k=0,1,\ldots $ , and let us bound $\mathcal {A}_{s,t}^{k+1}-\mathcal {A}_{s,t}^k$ . The first term on the right-hand side of (2.16) is trivially bounded by $2w_1(s_-,t)^{1/2}|t-s|^{\varepsilon _1}2^{-k\varepsilon _1}$ with probability $1$ . Decomposing the second term into $I_1$ and $I_2$ as in (2.17), a simple use of triangle inequality as in (2.18) yields the $\mathbb {P}$ -almost sure bound

$$ \begin{align*} |I_1|\lesssim 2^{-k\varepsilon_2}|t-s|^{\varepsilon_2}w_2(s,t). \end{align*} $$

As for $I_2$ , recalling that it is the sum of two martingales, for each, we may use the Azuma-Hoeffding inequality. The role of $\delta _j$ as in Lemma A.1 is played by $4w_1(t_{4j+2\ell }^k,t_{4j+2\ell +4}^k)^{1/2}$ , so similarly to the calculation as in (2.19), we get

$$ \begin{align*} \Lambda:=\sum_i \delta_i^2 \lesssim 2^{-2k\varepsilon_1}|t-s|^{2\varepsilon_1}w_1(s,t). \end{align*} $$

Therefore, by (A.1) combined with the aforementioned $\mathbb {P}$ -almost sure bounds, we get that with some $\mu _1>0$ , $K_1$ ,

$$ \begin{align*} \mathbb{E}\bigg[\exp\Big(\mu_12^{k(\varepsilon_1\wedge\varepsilon_2)}\frac{|\mathcal{A}_{s,t}^{k+1}-\mathcal{A}_{s,t}^k|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert \mathcal{F}_{S}\bigg]\leq K_1. \end{align*} $$

Since one can write

$$ \begin{align*} |(\mathcal{A}_t-\mathcal{A}_s)-A_{s,t}|\leq \sum_{k=0}^\infty 2^{-k(\varepsilon_1\wedge\varepsilon_2)}2^{k(\varepsilon_1\wedge\varepsilon_2)}|\mathcal{A}_{s,t}^{k+1}-\mathcal{A}_{s,t}^k|, \end{align*} $$

we get by conditional Jensen’s inequality,

$$ \begin{align*} \mathbb{E}&\bigg[\exp\Big(\mu_1\frac{|(\mathcal{A}_t-\mathcal{A}_s)-A_{s,t}|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert\mathcal{F}_{S}\bigg]\leq \sum_{k=0}^\infty 2^{-k(\varepsilon_1\wedge\varepsilon_2)}K_1. \end{align*} $$

Using again the assumed bounds on $A_{s,t}$ , we get with some other constant $K_2$

$$ \begin{align*} \mathbb{E}&\bigg[\exp\Big(\mu_1\frac{|\mathcal{A}_t-\mathcal{A}_s|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert \mathcal{F}_{S}\bigg]\leq K_2. \end{align*} $$

It only remains to remove the shifts in the denominator and substitute $\mathcal {F}_S$ with $\mathcal {F}_s$ , which can be done just as in Step 3 of the proof of Lemma 2.5, and therefore, we obtain (2.22).

3 Stability

The use of the tools from Section 2 is illustrated by the following lemma, which will play a key role in our analysis. Let us emphasise the important feature of the statement that although h is assumed to have $\delta $ spatial regularity, in the estimate, only its $\alpha -1$ norm is used.

Lemma 3.1. Assume (A) and let $(S,T)\in [0,1]_\leq ^2$ . Suppose that $h\in L^q_t C^\delta _x$ for some $\delta>0$ and let $\varphi $ be an adapted process satisfying (2.1) with $m=1$ and some control w. For $t\in [S,T]$ , define the process

$$ \begin{align*} \psi_t=\int_S^t h_r\big(B^H_r+\varphi_r\big)\mathrm{d} r \end{align*} $$

and set $\varepsilon =1/q'+(\alpha -1)H$ . Then there exist positive constants $\mu $ and K, depending only on H, q, $\alpha $ and d, such that for all $(s,t)\in [S,T]^2_\leq $ , one has the bound

(3.1)

$$ \begin{align} \mathbb{E}\bigg[ \exp\bigg(\mu\,\frac{|\psi_t-\psi_s|^2}{w_{h,\alpha-1,q}(s,t)^{2/q}|t-s|^{2\varepsilon} \big(1+w(s,t)^{1/q}|t-s|^{\varepsilon}\big)^2}\bigg)\bigg\vert \mathcal{F}_s\bigg]\leq K. \end{align} $$

As a consequence, for any $\widetilde m\in [1,\infty )$ , there exists a constant $\tilde K$ , depending only on $\widetilde m$ , H, q, $\alpha $ and d, such that for all $(s,t)\in [S,T]^2_\leq $ , one has the bound

(3.2)

$$ \begin{align} \big\|\|\psi_t-\psi_s\|_{L^{\widetilde m}|\mathcal{F}_s}\big\|_{L^\infty}\leq \tilde K w_{h,\alpha-1,q}(s,t)^{1/q}|t-s|^{\varepsilon}\big(1+w(s,t)^{1/q}|t-s|^{\varepsilon}\big). \end{align} $$

Proof. Note that thanks to the condition (A), $\varepsilon>0$ . For $(s,t)\in \overline {[S,T]}^2_\leq $ , let us set

$$ \begin{align*} A_{s,t}=\mathbb{E}_{s-(t-s)}\int_s^t h_r(B^H_r+\mathbb{E}_{s-(t-s)}\varphi_r)\mathrm{d} r \end{align*} $$

and verify the conditions of Lemma 2.6 (namely those of Lemma 2.5 with $m=n=\infty $ ).

Fix $(s,u,t)\in \overline {[S,T]}_\leq ^3$ and denote $s_1=s-(t-s)$ , $s_2=s-(u-s)$ , $s_3=u-(t-u)$ , $s_4=s$ , $s_5=u$ , $s_6=t$ . These points are almost ordered according to their indices, except $s_3$ and $s_4$ , for which $s_4\leq s_3$ may happen, but this plays no role whatsoever. First, by property (1.25), we have

$$ \begin{align*} A_{s,t}=\int_s^t P_{|r-s_1|^{2H}}h_r\big(\mathbb{E}_{s_1}(B^H_r+\varphi_r)\big)\mathrm{d} r. \end{align*} $$

Therefore, by (1.28) and Hölder’s inequality, it holds

$$ \begin{align*} |A_{s,t}|\leq\int_s^t \|P_{|r-s_1|^{2H}}h_r\|_{C^0_x}\mathrm{d} r\lesssim &\int_s^t|r-s_1|^{(\alpha-1) H}\|h_r\|_{C^{\alpha-1}_x}\mathrm{d} r \\&\lesssim |t-s|^{1/q'+(\alpha-1) H}w_{h,\alpha-1,q}(s,t)^{1/q}. \end{align*} $$

Since $q\leq 2$ , by the definition of $\varepsilon $ , (2.7) is satisfied with $\varepsilon _1=\varepsilon $ and $w_1=N w_{h,\alpha -1,q}^{2/q}$ .

Next, we need to bound $\mathbb {E}_{s-(t-s)}\delta A_{s,u,t}=\mathbb {E}_{s_1}\delta A_{s_4,s_5,s_6}$ . After an elementary rearrangement, we get

$$ \begin{align*} \mathbb{E}_{s_1}\delta A_{s_4,s_5,s_6}=I+J:&=\mathbb{E}_{s_1}\mathbb{E}_{s_2}\int_{s_4}^{s_5} h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)-h_r(B^H_r+\mathbb{E}_{s_2}\varphi_r)\mathrm{d} r \\ &\quad+\mathbb{E}_{s_1}\mathbb{E}_{s_3}\int_{s_5}^{s_6} h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)-h_r(B^H_r+\mathbb{E}_{s_3}\varphi_r)\mathrm{d} r. \end{align*} $$
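
The rearrangement can be spelled out as follows. This is a sketch assuming the usual convention $\delta A_{s,u,t}=A_{s,t}-A_{s,u}-A_{u,t}$ and using the tower property $\mathbb {E}_{s_1}=\mathbb {E}_{s_1}\mathbb {E}_{s_2}=\mathbb {E}_{s_1}\mathbb {E}_{s_3}$ , which is available because $s_1\leq s_2$ and $s_1\leq s_3$ :

$$ \begin{align*} \mathbb{E}_{s_1}A_{s_4,s_6}&=\mathbb{E}_{s_1}\mathbb{E}_{s_2}\int_{s_4}^{s_5} h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)\mathrm{d} r+\mathbb{E}_{s_1}\mathbb{E}_{s_3}\int_{s_5}^{s_6} h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)\mathrm{d} r,\\ \mathbb{E}_{s_1}A_{s_4,s_5}&=\mathbb{E}_{s_1}\mathbb{E}_{s_2}\int_{s_4}^{s_5} h_r(B^H_r+\mathbb{E}_{s_2}\varphi_r)\mathrm{d} r,\\ \mathbb{E}_{s_1}A_{s_5,s_6}&=\mathbb{E}_{s_1}\mathbb{E}_{s_3}\int_{s_5}^{s_6} h_r(B^H_r+\mathbb{E}_{s_3}\varphi_r)\mathrm{d} r. \end{align*} $$

Subtracting the last two lines from the first and grouping the integrals over $[s_4,s_5]$ and $[s_5,s_6]$ yields the terms I and J, respectively.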

The two terms are treated in exactly the same way, so we only detail I. We use (1.28) similarly as before to get

$$ \begin{align*} |I|&\leq\mathbb{E}_{s_1}\int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi_r)\big|\mathrm{d} r \\ &\leq \mathbb{E}_{s_1}\int_{s_4}^{s_5}\|P_{|r-s_2|^{2H}} h_r\|_{C^1_x}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\mathrm{d} r \\ &\lesssim \mathbb{E}_{s_1}\int_{s_4}^{s_5}|r-s_2|^{(\alpha-2)H}\|h_r\|_{C^{\alpha-1}_x}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\mathrm{d} r. \end{align*} $$

By Jensen’s inequality and the assumption on $\varphi $ , we have the $\mathbb {P}$ -almost sure bound

$$ \begin{align*} \mathbb{E}_{s_1}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\leq\mathbb{E}_{s_1}|\mathbb{E}_{s_1}\varphi_r-\varphi_r|\leq w(s_1,r)^{1/q}|t-s|^{1/q'+\alpha H}. \end{align*} $$

Also note that $r\mapsto |r-s_2|^{(\alpha -2)H}\in L^{q'}([s_4,s_5])$ because of the shifted basepoint; in general, this would not be true with $s_2$ replaced by $s_4$ . Therefore, by Hölder’s inequality,

$$ \begin{align*} |I|\lesssim |t-s|^{1/q'+(\alpha-2)H+1/q'+\alpha H}w_{h,\alpha-1,q}(s,t)^{1/q}w(s_1,t)^{1/q}. \end{align*} $$

Note that the exponent of $|t-s|$ is simply $2\varepsilon $ . Using again that $q\leq 2$ , we see that condition (2.8) is satisfied with $\varepsilon _2=2\varepsilon $ and $w_2=N w_{h,\alpha -1,q}(s,t)^{1/q}w(s_1,t)^{1/q}$ .
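
For completeness, the arithmetic behind the exponent of $|t-s|$ reads, with $\varepsilon =1/q'+(\alpha -1)H$ as in the statement:

$$ \begin{align*} \Big(\frac{1}{q'}+(\alpha-2)H\Big)+\Big(\frac{1}{q'}+\alpha H\Big)=\frac{2}{q'}+2(\alpha-1)H=2\varepsilon. \end{align*} $$

The first bracket collects the contribution of the heat kernel factor after Hölder’s inequality, and the second one the contribution of the bound on $\varphi $ .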

It remains to verify that the process $\mathcal {A}$ of Lemma 2.5 is given by $\psi $ . Since $\psi _S=0$ , it suffices to show that

(3.3)

$$ \begin{align} \|\psi_t-\psi_s-A_{s,t}\|_{L^1}\leq \tilde{w}(s_-,t)|t-s|^\kappa \end{align} $$

for all $(s,t)\in \overline {[S,T]}^2_{\leq }$ , with some control $\tilde w$ and some $\kappa>0$ . This follows from three easy bounds: first,

$$ \begin{align*} \Big\|&\psi_t-\psi_s-\int_s^t h_r(B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r\Big\|_{L^1} \\&\leq \int_s^t\|h_r\|_{C^\delta_x} w(s_-,r)^{\delta/q}\mathrm{d} r \leq w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'}w(s_-,t)^{\delta/q}, \end{align*} $$

second,

$$ \begin{align*} \Big\|\int_s^t &h_r(B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r-\int_s^t h_r(\mathbb{E}_{s_-}B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r\Big\|_{L^1} \\&\leq \int_s^t\|h_r\|_{C^\delta_x} |r-s_-|^{\delta H}\mathrm{d} r \lesssim w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'+\delta H}, \end{align*} $$

and third,

$$ \begin{align*} \Big\|\int_s^t &h_r(\mathbb{E}_{s_-}B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r-A_{s,t}\Big\|_{L^1} \\&\leq \int_s^t\|h_r-P_{|r-s_-|^{2H}}h_r\|_{C^0_x} \mathrm{d} r \lesssim w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'+\delta H}. \end{align*} $$

Hence, we can conclude $\psi =\mathcal {A}$ , and (3.1) follows from (2.22).

We will often consider (1.6) with nonzero initial time. If b is a function, a solution of (1.6) on some interval $[S,T]\subset [0,1]$ with initial condition $X_S$ is a process X satisfying

$$ \begin{align*} X_t=X_S+\int_S^t b_r(X_r)\mathrm{d} r+B^H_t-B^H_S \end{align*} $$

for all $t\in [S,T]$ . Our main stability estimate for solutions is then formulated as follows.

Theorem 3.2. Assume (A). Let $\delta>0$ . Let $[S,T]\subset [0,1]$ , and for $i=1,2$ , let $X^i$ be adapted continuous processes satisfying (1.6) on $[S,T]$ with initial conditions $X^i_S$ and drifts $b^i\in L^q_t C^{1+\delta }_x$ . Denote $M=\max _{i=1,2}\|b^i\|_{L^q_t C^\alpha _x}$ . Then for any $m\in [2,\infty )$ , there exists a positive constant $N=N(m,M,H,\alpha ,q,d)$ , such that one has the $\mathbb {P}$ -almost sure bound

(3.4)

$$ \begin{align} \Big\|\sup_{t\in[S,T]}|X^1_t-X^2_t|\Big\|_{L^m|\mathcal{F}_S}\leq N\Big(|X_S^1-X_S^2|+\|b^1-b^2\|_{L^q_t([S,T];C^{\alpha-1}_x)}\Big). \end{align} $$

Moreover, if $b^1=b^2$ , then one also has the $\mathbb {P}$ -almost sure bound

(3.5)

$$ \begin{align} \Big\|\sup_{t\in[S,T]}\big(|X^1_t-X^2_t|^{-1}\big)\Big\|_{L^m|\mathcal{F}_S}\leq N |X_S^1-X_S^2|^{-1}. \end{align} $$

Proof. As usual, we denote $\varphi ^1=X^1-B^H$ and $\varphi ^2=X^2-B^H$ . For $t\in [S,T]$ , we write

(3.6)

$$ \begin{align} X^1_t-X^2_t&=X^1_S-X^2_S+\int_S^t\Big(\int_0^1\nabla b^1_r\big(B^H_r+\lambda\varphi^1_r+(1-\lambda)\varphi^2_r\big)\mathrm{d} \lambda\Big)\cdot(X^1_r-X^2_r)\mathrm{d} r \nonumber\\ &\qquad+\int_S^t(b^1-b^2)_r(B^H_r+\varphi^2_r)\mathrm{d} r. \end{align} $$

Note that $\nabla b^1\in L^q_tC^\delta _x$ , and therefore, the process

$$ \begin{align*} A_t:=\int_0^1A^\lambda_t\mathrm{d} \lambda:=\int_0^1\Big(\int_S^t\nabla b^1_r\big(B^H_r+\lambda\varphi^1_r+(1-\lambda)\varphi^2_r\big)\mathrm{d} r\Big)\mathrm{d} \lambda \end{align*} $$

is well-defined. Define furthermore

$$ \begin{align*} z_t:=\int_S^t(b^1-b^2)_r(B^H_r+\varphi^2_r)\mathrm{d} r. \end{align*} $$

We then apply Lemma 3.1 with $\varphi =\lambda \varphi ^1+(1-\lambda )\varphi ^2$ and $h=\nabla b^1$ , as well as with $\varphi =\varphi ^2$ and $h=b^1-b^2$ . Since $\varphi ^1$ and $\varphi ^2$ are the drift parts of solutions, by Lemma 2.1, the processes $\varphi =\lambda \varphi ^1+(1-\lambda )\varphi ^2$ satisfy the bound (2.2) with control $w=w_{b^1,\alpha ,q}+w_{b^2,\alpha ,q}$ , and so Lemma 3.1 indeed applies. Combining the bound (3.1) with Lemma A.2, we get that there exist random variables $\eta _A,\eta _z$ with Gaussian moments (see Footnote 7) conditionally on $\mathcal {F}_S$ , as well as $\delta>0$ and $p\in (1,2)$ , such that

$$ \begin{align*} \|A\|_{p-{\mathrm{var}};[S,T]}&\leq w_{b^1,\alpha,q}(S,T)^{1/q}\sup_{S\leq s<t\leq T}\frac{|A_t-A_s|}{w_{b^1,\alpha,q}(s,t)^{1/q}|t-s|^\delta} \\ &\leq w_{b^1,\alpha,q}(S,T)^{1/q}\eta_A, \\ \|z\|_{p-{\mathrm{var}};[S,T]}&\leq w_{b^1-b^2,\alpha-1,q}(S,T)^{1/q}\sup_{S\leq s<t\leq T}\frac{|z_t-z_s|}{w_{b^1-b^2,\alpha-1,q}(s,t)^{1/q}|t-s|^\delta} \\ &\leq w_{b^1-b^2,\alpha-1,q}(S,T)^{1/q}\eta_z. \end{align*} $$

We can rewrite (3.6) as

(3.7)

$$ \begin{align} \mathrm{d}(X^1_t-X^2_t)= \mathrm{d} A_t (X^1_t-X^2_t)+\mathrm{d} z_t, \quad (X^1_t-X_t^2)\vert_{t=S}=X^1_S-X^2_S, \end{align} $$

meaning that we are interpreting (3.6) as an affine Young differential equation; see also Appendix B for more details. By applying Lemma B.2 for $x=X^1-X^2$ and $\tilde {p}=p$ , we get

$$ \begin{align*} \sup_{t\in[S,T]}|X_t^1-X_t^2|\lesssim e^{C\|A\|_{p-{\mathrm{var}};[S,T]}^p}\big(|X^1_S-X^2_S|+\|z\|_{p-{\mathrm{var}};[S,T]}\big). \end{align*} $$

Recall that $\eta _A$ satisfies $\mathbb {E}_S [e^{\mu \eta _A^2}]\lesssim 1$ for some $\mu>0$ , and thus also $\mathbb {E}_S[e^{K \eta _A^p}] \lesssim _{K,p} 1$ for all $K>0$ since $p<2$ . Therefore, we obtain

$$ \begin{align*} \mathbb{E}_S\Big[ \sup_{t\in[S,T]}|X_t^1-X_t^2|^m \Big] & \lesssim \mathbb{E}_S[e^{mC\| A\|_{p-{\mathrm{var}}; [S,T]}^p} ] |X^1_S-X^2_S|^m \\&\qquad+ \mathbb{E}_S\Big[ e^{mC\| A\|_{p-{\mathrm{var}};[S,T]}^p} \| z\|_{p-{\mathrm{var}};[S,T]}^m \Big]\\ & \lesssim |X^1_S-X^2_S|^m + w_{b^1-b^2,\alpha-1,q}(S,T)^{m/q}, \end{align*} $$

using conditional Hölder’s inequality to get the last line. This gives (3.4).
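The passage from Gaussian moments of $\eta _A$ to exponential moments of $\eta _A^p$ used above can be made explicit by an elementary pointwise inequality. The following is one possible way to see it; the constant $c_{K,p,\mu }$ is introduced here only for illustration. For $x\geq 0$ , $K>0$ and $p<2$ , distinguishing whether $x^{2-p}\geq K/\mu $ or not gives

$$ \begin{align*} K x^p\leq \mu x^2+c_{K,p,\mu}, \qquad c_{K,p,\mu}:=K\Big(\frac{K}{\mu}\Big)^{\frac{p}{2-p}}, \end{align*} $$

so that

$$ \begin{align*} \mathbb{E}_S\big[e^{K\eta_A^p}\big]\leq e^{c_{K,p,\mu}}\,\mathbb{E}_S\big[e^{\mu\eta_A^2}\big]\lesssim_{K,p} 1. \end{align*} $$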

In case $b^1=b^2$ , we have $z=0$ , and the Young equation (3.7) becomes homogeneous. Moreover, note that Young equations allow time-reversal: if we fix $\tau \in [S,T]$ , write $\tilde A_t=A_{\tau -t}$ , and

$$ \begin{align*} \mathrm{d} Y_t=\mathrm{d} \tilde A_t Y_t,\quad Y_t\vert_{t=0}=X^1_\tau-X^2_\tau, \end{align*} $$

then $Y_{\tau -S}=X_S^1-X_S^2$ . Therefore, by Lemma B.2, we also have the pathwise estimate

$$ \begin{align*} |X_S^1-X_S^2|\lesssim e^{C\|\tilde A\|_{p-{\mathrm{var}};[0,\tau-S]}^p}|X^1_\tau-X^2_\tau|. \end{align*} $$

Of course, $\|\tilde A\|_{p-{\mathrm {var}};[0,\tau -S]}^p=\| A\|_{p-{\mathrm {var}};[S,\tau ]}^p \leq \| A \|_{p-{\mathrm {var}};[S,T]}^p$ , so after rearranging for the inverses, taking supremum in $\tau \in [S,T]$ , and taking $L^m|\mathcal {F}_S$ norms, we get (3.5).
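In the smooth case, the two-sided control expressed by (3.4)–(3.5) can be observed numerically. The following is a minimal sketch, not the method of the proof: it assumes a hypothetical Lipschitz drift $b(x)=\sin x$ with $b^1=b^2$ , samples a fractional Brownian path by Cholesky factorisation of its covariance, and runs an explicit Euler scheme for two initial conditions driven by the same path. All names (`fbm_path`, `euler`), the step count and the value $H=0.3$ are illustrative choices.

```python
import math
import random

def fbm_path(n, hurst, rng):
    """Sample fBm at times k/n, k = 0..n, via Cholesky factorisation of its covariance."""
    times = [(k + 1) / n for k in range(n)]
    cov = [[0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(t - s) ** (2 * hurst))
            for t in times] for s in times]
    # Plain Cholesky factorisation cov = L L^T (adequate for small n).
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(max(acc, 1e-14)) if i == j else acc / low[j][j]
    gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [0.0] + [sum(low[i][k] * gauss[k] for k in range(i + 1)) for i in range(n)]

def drift(t, x):
    # Hypothetical smooth drift with Lipschitz constant 1.
    return math.sin(x)

def euler(x0, noise, n):
    """Explicit Euler scheme for X_t = x0 + int_0^t b_r(X_r) dr + B^H_t on [0, 1]."""
    h = 1.0 / n
    path = [x0]
    for k in range(n):
        path.append(path[-1] + h * drift(k * h, path[-1]) + noise[k + 1] - noise[k])
    return path

rng = random.Random(0)
n, hurst = 50, 0.3
noise = fbm_path(n, hurst, rng)
x1 = euler(0.0, noise, n)  # initial condition 0.0
x2 = euler(0.1, noise, n)  # initial condition 0.1, same driving path
gaps = [abs(a - b) for a, b in zip(x1, x2)]
sup_gap = max(gaps)                       # discrete analogue of the left side of (3.4)
sup_inv_gap = max(1.0 / g for g in gaps)  # discrete analogue of the left side of (3.5)
```

For this drift, the discrete gap $d_k$ satisfies $(1-h)|d_k|\leq |d_{k+1}|\leq (1+h)|d_k|$ with $h=1/n$ , so `sup_gap` and `sup_inv_gap` are bounded by constant multiples of $|X^1_0-X^2_0|$ and $|X^1_0-X^2_0|^{-1}$ , respectively, regardless of the noise sample. This is only the elementary Lipschitz analogue of the theorem, whose constant depends on the drift through $\|b\|_{L^q_t C^\alpha _x}$ alone.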

4 Strong well-posedness for functional drift

We first apply the stability estimate to establish existence and uniqueness of solutions of (1.6) with $\alpha>0$ . In this case, the meaning of solutions is unambiguous, but we will also need the following stronger concepts of solutions.

In the next definition, we denote by $C^{\mathrm {loc}}_x$ the space of continuous functions from $\mathbb {R}^d$ to itself, endowed with the topology of uniform convergence on compact sets. Correspondingly, $L^1_t C^{\mathrm {loc}}_x$ denotes the set of functions $f:[0,1]\times \mathbb {R}^d\to \mathbb {R}^d$ such that, for all smooth compactly supported g, $f g\in L^1([0,1]; C_b(\mathbb {R}^d;\mathbb {R}^d))$ , where $C_b(\mathbb {R}^d;\mathbb {R}^d)$ denotes the Banach space of continuous and bounded functions, endowed with the supremum norm. As for most localized spaces, it is easy to check that $L^1_t C^{\mathrm {loc}}_x$ is a separable Fréchet space.

Definition 4.1.

(i) Assume $b\in L^1_t C^{\mathrm {loc}}_x$ and let $\gamma :[0,1]\to \mathbb {R}^d$ be bounded and measurable. A semiflow associated to the ODE
(4.1) $$ \begin{align} y_t = y_0 + \int_0^t b_s (y_s) \mathrm{d} s + \gamma_t \end{align} $$

is a jointly measurable map $\Phi :[0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ such that
• for all $(s,x)\in [0,1]\times \mathbb {R}^d$ and all $t\in [s,1]$ , one has
$$\begin{align*}\Phi_{s\to t}(x)=x+\int_s^t b_r\big(\Phi_{s\to r}(x)\big)\mathrm{d} r+\gamma_t-\gamma_s; \end{align*}$$
• for all $(s,r,t,x)\in [0,1]^3_\leq \times \mathbb {R}^d$ , one has $\Phi _{s\to t}(x)=\Phi _{r\to t}\big (\Phi _{s\to r}(x)\big )$ .
(ii) A flow is a semiflow such that for all $(s,t)\in [0,1]^2_\leq $ , the map $x\mapsto \Phi _{s\to t}(x)$ is a homeomorphism of $\mathbb {R}^d$ .
(iii) If $\gamma $ is a stochastic process, a random (semi)flow is a jointly measurable map $\Phi :\Omega \times [0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ such that for $\mathbb {P}$ -almost all $\omega \in \Omega $ , the map $\Phi ^\omega :[0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ is a (semi)flow associated to (4.1) with $\gamma =\gamma (\omega )$ .
(iv) We say that a random (semi)flow is adapted if for all $(s,t,x)\in [0,1]^2_\leq \times \mathbb {R}^d$ , the random variable $\Phi _{s\to t}(x)$ is $\mathcal {F}_t$ -measurable.
(v) Given $\beta \in (0,1)$ , we say that a (semi)flow is locally $\beta $ -Hölder continuous if for all K, there exists a constant N such that for all $(s,t,x,y)\in [0,1]_\leq ^2\times B_K^2$ , one has $|\Phi _{s\to t}(x)-\Phi _{s\to t}(y)|\leq N|x-y|^\beta $ .
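The two defining properties of a semiflow can be checked on a toy discretisation. The sketch below is only an illustration under stated assumptions: the drift `drift`, the perturbation `gamma`, the grid size `N_STEPS` and the helper `flow` are hypothetical, and the explicit Euler map merely stands in for the exact solution map of (4.1).

```python
import math

N_STEPS = 200  # grid t_k = k / N_STEPS on [0, 1]

def gamma(t):
    # Hypothetical bounded, measurable perturbation.
    return 0.5 * math.sin(7.0 * t)

def drift(t, x):
    # Hypothetical continuous drift b_t(x), Lipschitz in x with constant at most 1.
    return math.cos(x) / (1.0 + t)

def flow(i, j, x):
    """Euler approximation of Phi_{t_i -> t_j}(x) for grid indices i <= j."""
    h = 1.0 / N_STEPS
    for k in range(i, j):
        t_k, t_next = k * h, (k + 1) * h
        x = x + h * drift(t_k, x) + gamma(t_next) - gamma(t_k)
    return x

# Semiflow property: going from s to t directly or through an intermediate time r.
direct = flow(20, 180, 0.3)
composed = flow(90, 180, flow(20, 90, 0.3))
```

On grid times, the direct and composed maps agree because they perform the same sequence of Euler steps, mirroring $\Phi _{s\to t}=\Phi _{r\to t}\circ \Phi _{s\to r}$ ; for this Lipschitz drift, the discrete map is moreover Lipschitz in x, a much stronger property than the local $\beta $ -Hölder continuity of item (v).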

Remark 4.2. Definition 4.1 is based on Kunita’s classical one; cf. [Reference Kunita69, Theorem II.4.3]; it is slightly different (in fact, stronger) from other definitions proposed in the literature, like [Reference Fedrizzi and Flandoli37, Definition 5.1], due to the ordering of the quantifiers. One can draw a nice analogy between this kind of difference and the one between so-called crude and perfect random dynamical systems; cf. [Reference Zhang101, Remark 2.5].

Theorem 4.3. Assume (A), $\alpha>0$ , and let $b\in L^q_t C^\alpha _x$ . Then there exists an adapted random semiflow of solutions to (1.6) that is furthermore $\mathbb {P}$ -almost surely locally $\beta $ -Hölder continuous for all $\beta \in (0,1)$ .

Proof. Let $m\in [2,\infty )$ , to be specified later. Take a sequence of functions $(b^{n})_{n\in \mathbb {N}}$ such that $b^{n}\in L^q_tC^{2}_x$ and $\|b^{n}\|_{L^q_t C^\alpha _x}\leq \|b\|_{L^q_t C^\alpha _x}$ for all $n\in \mathbb {N}$ , and $\|b^{n}-b\|_{L^q_t C^{\alpha -1}_x}\to 0$ as $n\to \infty $ . Replacing b by $b^{n}$ in (1.6), the equation clearly admits an adapted random semiflow which we denote by $\Phi ^{n}$ . For fixed $(s,t)\in [0,1]^2_\leq $ , $x\in \mathbb {R}^d$ , and $n,n'\in \mathbb {N}$ , we may apply Theorem 3.2 to obtain the bound

$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t}^{n'}(x)\big\|_{L^m}\lesssim \|b^{n}-b^{n'}\|_{L^q_t C^{\alpha-1}_x}. \end{align*} $$

Here and below, the only important feature of the hidden proportionality constant in $\lesssim $ is that it is independent of $n,n'$ . Next, let $(s,s',t),(s,s',t')\in [0,1]^3_\leq $ , $x,x'\in \mathbb {R}^d$ , and $n\in \mathbb {N}$ . Then from applying Theorem 3.2 again, we get

$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t}^{n}(x')\big\|_{L^m}\lesssim |x-x'|; \end{align*} $$

by a trivial estimate, we get

$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t'}^{n}(x)\big\|_{L^m}\lesssim |t-t'|^{H\wedge (1/q')}, \end{align*} $$

and using the semigroup property and Theorem 3.2 once more, we have

(4.2)

$$ \begin{align} \|\Phi_{s\to t}^{n}(x)-\Phi_{s'\to t}^{n}(x)\|_{L^m}&=\|\Phi_{s'\to t}^{n}(\Phi_{s\to s'}^{n}(x))-\Phi_{s'\to t}^{n}(x)\|_{L^m} \nonumber\\ &\lesssim\|\Phi_{s\to s'}^{n}(x)-x\|_{L^m}\lesssim |s'-s|^{H\wedge (1/q')}. \end{align} $$

We therefore get that the sequence $\big (\Phi ^{n}\big )_{n\in \mathbb {N}}$ is, on the one hand, Cauchy in $C_{s,t,x} L^m_\omega $ and, on the other hand, bounded in $C_{s,t}C^1_xL^m_\omega \cap C_x C^{H\wedge (1/q')}_{s,t}L^m_\omega $ . This implies that for some random field $\Phi $ , one has $\Phi ^{n}\to \Phi $ in $C_{s,t}C^{1-\kappa }_xL^m_\omega \cap C_x C^{H\wedge (1/q')-\kappa }_{s,t} L^m_\omega $ , where $\kappa>0$ is arbitrary. By Kolmogorov’s continuity theorem, for sufficiently large m, the convergence also holds in $L^m_\omega C_{s,t}C^{1-2\kappa ,\mathrm {loc}}_x\cap L^m_\omega C^{\mathrm {loc}}_x C^{H\wedge (1/q')-2\kappa }_{s,t}$ . This yields the claimed spatial regularity of $\Phi $ ; the fact that $\Phi $ is indeed a semiflow for (1.6) follows instead from the locally uniform convergence of $\Phi ^n$ to $\Phi $ , each $\Phi ^n$ being a semiflow, and the spatial continuity of the drift b.

Theorem 4.4. Assume (A), $\alpha>0$ , and let $b\in L^q_t C^\alpha _x$ . Then there exists an event $\tilde \Omega $ of full probability such that for all $\omega \in \tilde \Omega $ , for all $(S,T)\in [0,1]^2_\leq $ , $x\in \mathbb {R}^d$ , there exists only one solution to (1.6) on $[S,T]$ with initial condition x.

The theorem will follow immediately from Theorem 4.3 and the following lemma, which is a refinement of the technique illustrated in [Reference Shaposhnikov91, Theorem 3.1].

Lemma 4.5. Let $\gamma :[0,1]\to \mathbb {R}^d$ be bounded and measurable, $b\in L^1_t C^{\alpha ,\mathrm {loc}}_x$ , and consider the ODE (4.1). Suppose that it admits a locally $\beta $ -Hölder continuous semiflow $\Phi $ with

(4.3)

$$ \begin{align} \beta(1+\alpha)>1. \end{align} $$

Then for any $(S,T)\in [0,1]_\leq ^2$ and $y\in \mathbb {R}^d$ , there exists a unique solution to the ODE on the interval $[S,T]$ with initial condition y, given by $\Phi _{S\to \cdot }(y)$ .

Proof. Suppose that there exists another solution to the ODE, given by $(z_t)_{t\in [S,T]}$ . Since both z and $\Phi _{S\to \cdot }(y)$ are bounded, we may and will assume $b\in L^1_t C^{\alpha }_x$ and that $\Phi $ is globally $\beta $ -Hölder continuous. Define the control $w=w_{b,\alpha ,1}$ .

Now let us fix $\tau \in [S,T]$ and define the map $f_t:= \Phi _{t\to \tau } (z_t) - \Phi _{S\to \tau }(y)$ for $t\in [S,\tau ]$ . If we are able to show that f is constant in time, then $f \equiv f_S=0$ (since $z_S=y$ ), which implies $\Phi _{t\to \tau }(z_t)=\Phi _{S\to \tau }(y)$ and in turn, by choosing $t=\tau $ , gives $z_\tau =\Phi _{\tau \to \tau }(z_\tau )=\Phi _{S\to \tau }(y)$ . Since the above argument holds for any $\tau \in [S,T]$ , we reach the conclusion.

It remains to prove that f is constant on $[S,\tau ]$ . To this end, first observe that for any $S\leq s\leq t\leq \tau $ , it holds

(4.4)

$$ \begin{align} |f_{s,t}| & =|\Phi_{t\to \tau}(z_t)-\Phi_{s\to \tau}(z_s)| \nonumber\\ & =|\Phi_{t\to \tau}(z_t)-\Phi_{t\to \tau}(\Phi_{s\to t}(z_s))| \lesssim |\Phi_{s\to t}(z_s)-z_t|^\beta. \end{align} $$

Next, by definition of flow, it holds

$$\begin{align*}\Phi_{s\to t}(z_s)-z_t=\int_s^t [b_r(\Phi_{s\to r} (z_s))-b_r(z_r)] \mathrm{d} r, \end{align*}$$

which immediately implies $|\Phi _{s\to t}(z_s)-z_t|\lesssim w(s,t)$ ; we can improve the estimate by recursively inserting it in the above identity:

$$ \begin{align*} |\Phi_{s\to t}(z_s)-z_t| & \leq \int_s^t |b_r(\Phi_{s\to r} (z_s))-b_r(z_r)| \mathrm{d} r\\ & \leq \int_s^t \| b_r\|_{C^\alpha} |\Phi_{s\to r} (z_s)-z_r|^\alpha \mathrm{d} r \lesssim w(s,t)^{1+\alpha}. \end{align*} $$

Inserting the above in estimate (4.4), we can conclude that

$$\begin{align*}|f_{s,t}| \lesssim |\Phi_{s\to t}(z_s)-z_t|^\beta \lesssim w(s,t)^{\beta (1+\alpha)}.\end{align*}$$

Since $\beta (1+\alpha )>1$ and w is a control, f must be necessarily constant.
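The last step is the standard fact that a path whose increments are bounded by a power $\theta>1$ of a control is constant. A short sketch, assuming (as is usual, though not restated in this section) that a control w is continuous, superadditive and vanishes on the diagonal: writing $\theta =\beta (1+\alpha )$ , for any partition $s=t_0<t_1<\dots <t_n=t$ of $[s,t]\subset [S,\tau ]$ , one has

$$ \begin{align*} |f_{s,t}|\leq \sum_{i=0}^{n-1}|f_{t_i,t_{i+1}}| \lesssim \sum_{i=0}^{n-1} w(t_i,t_{i+1})^{\theta} \leq \Big(\max_{0\leq i<n} w(t_i,t_{i+1})\Big)^{\theta-1} w(s,t), \end{align*} $$

and the right-hand side vanishes as the mesh of the partition tends to zero, by uniform continuity of w. Hence, $f_{s,t}=0$ for all $S\leq s\leq t\leq \tau $ .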

Remark 4.6. In the functional setting of Definition 4.1, path-by-path uniqueness implies pathwise uniqueness, which in turn implies uniqueness in law by the Yamada–Watanabe theorem [Reference Yamada and Watanabe100, Proposition 1]; we refer to [Reference Shaposhnikov and Wresch92] for a general overview on the various notions of strong/weak existence and uniqueness.

Remark 4.7. The statement of Lemma 4.5 is given for deterministic initial data y and semiflow $\Phi $ , but immediately extends to random ones: if $X_0$ is an $\mathcal {F}_0$ -measurable random variable, then $\big (\Phi _{0\to t}(X_0)\big )_{t\in [0,1]}$ is clearly the unique adapted solution with initial condition $X_0$ .

5 Strong well-posedness for distributional drift

When $\alpha <0$ , the very first question one has to address is the meaning of the equation – more precisely, the meaning of the integral in (1.6). We start with some consequences of Lemma 3.1. Denote by $\overline {C^\alpha }$ the closure of $C^1$ in $C^\alpha $ . Recall that for any $\alpha <\alpha '$ , one has $C^{\alpha '}\subset \overline {C^\alpha }$ .

Corollary 5.1. Assume (A) and $\alpha <0$ , and take $\delta>0$ . Define the linear map $T^{B^H}:L^q_tC^{1+\delta }_x\to L^\infty _\omega C_tC^\delta _x$ by

$$ \begin{align*} \big(T^{B^H}h\big)_t(x)=\int_0^t h_r(B^H_r+x)\mathrm{d} r. \end{align*} $$

Denote $w=w_{h,\alpha ,q}$ . Then, for any $m\in [2,\infty )$ , there exists a constant $K=K(m,H,\alpha ,q,d,w(0,1))$ such that for all $(s,t)\in [0,1]^2_\leq $ and $x,y\in \mathbb {R}^d$ , one has the bound

(5.1)

$$ \begin{align} \big\|&\|\big(T^{B^H}h\big)_{s,t}(x)-\big(T^{B^H}h\big)_{s,t}(y)\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} \nonumber \\ &\qquad\leq K|x-y|w(s,t)^{1/q}|t-s|^{1/q'+(\alpha-1)H}. \end{align} $$

Moreover, for any $\kappa \in (0,1)$ sufficiently small, there exists a constant $K=K(m,H,\alpha ,q,d,w(0,1),\kappa )$ such that one has the bound

(5.2)

$$ \begin{align} \Bigg\|\sup_{0\leq s<t\leq 1}\frac{\|\big(T^{B^H}h\big)_{s,t}\|_{C^{1-\kappa,2\kappa}_x}}{w(s,t)^{1/q}|t-s|^{1/q'+(\alpha-1)H-\kappa}}\Bigg\|_{L^m}\leq K. \end{align} $$

Consequently with $p=\big (1+(\alpha -1)H\big )^{-1}\in (1,2)$ , the mapping $h\mapsto T^{B^H} h$ takes values in $L^m_\omega C^{(p+\kappa )-{\mathrm {var}}}_tC^{1-\kappa ,2\kappa }_x$ , and as such, it extends continuously to $L^q_t \overline {C^\alpha _x}$ . This extension also satisfies the bounds (5.1)–(5.2).

Proof. Applying Lemma 3.1 with $t,z\mapsto (x-y)\cdot \int _0^1\nabla h_t(z+\theta x+(1-\theta )y)\mathrm {d} \theta $ in place of h yields (5.1). The bound (5.2) follows from (3.2) and (5.1) by Kolmogorov’s continuity theorem in the form of Corollary A.5.

Corollary 5.1 motivates introducing some temporary notation. Given (A), set $p_{\alpha ,H}=\big (\big (1+(\alpha -1)H\big )^{-1}+2\big )/2\in (1,2)$ , and for any $h\in L^q_t C^\alpha _x$ , we define the event

$$ \begin{align*} \Omega_h:=\Big\{\omega\in \Omega: \, T^{B^H}h(\omega)\in C^{p_{\alpha,H}-{\mathrm{var}}}_tC^{1-\kappa,2\kappa}_x\,\,\forall\kappa>0\Big\}, \end{align*} $$

which is therefore of full probability.
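As a quick sanity check on the exponents, the following sketch computes $\varepsilon =(\alpha -1)H+1/q'$ , $p=\big (1+(\alpha -1)H\big )^{-1}$ and $p_{\alpha ,H}=(p+2)/2$ for given parameters. It assumes that (A) amounts to $q\in (1,2]$ , $\alpha <1$ and $\varepsilon>0$ ; this reading of (A), the function name `young_exponents` and the sample values are assumptions made for illustration.

```python
def young_exponents(alpha, hurst, q):
    """Return (eps, p, p_alpha_h) for drift regularity alpha, Hurst index hurst, integrability q."""
    if not (1.0 < q <= 2.0 and alpha < 1.0):
        raise ValueError("expected q in (1, 2] and alpha < 1")
    q_conj = q / (q - 1.0)                      # conjugate exponent q'
    eps = (alpha - 1.0) * hurst + 1.0 / q_conj  # subcriticality gap; assumed to encode (A)
    if eps <= 0.0:
        raise ValueError("parameters are not subcritical: eps <= 0")
    p = 1.0 / (1.0 + (alpha - 1.0) * hurst)     # variation exponent from Corollary 5.1
    p_alpha_h = (p + 2.0) / 2.0                 # midpoint between p and 2
    return eps, p, p_alpha_h

# Hypothetical sample parameters: a distributional drift with a rough driver.
eps, p, p_alpha_h = young_exponents(alpha=-0.5, hurst=0.3, q=2.0)
```

Under these assumptions, $1<p<p_{\alpha ,H}<2$ , so in particular $2/p_{\alpha ,H}>1$ ; this is the Young-type condition needed when both the integrand path and the driver have finite $p_{\alpha ,H}$ -variation.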

The regularity of $T^{B^H}$ obtained from Corollary 5.1 is sufficient to define a notion of solution via nonlinear Young formalism. For the proof of the next statement, we refer to [Reference Galeati46], which can be readily adapted to the p-variation framework; see also [Reference Anzeletti, Richard and Tanré2].

Lemma 5.2. Let $A:[0,1]\times \mathbb {R}^d\to \mathbb {R}^n$ and $x:[0,1]\to \mathbb {R}^d$ satisfy $A\in C_t^{p-{\mathrm {var}}}C^{\eta ,\mathrm {loc}}_x$ and $x\in C^{\zeta -{\mathrm {var}}}_t$ such that the exponents $p,\zeta \in [1,\infty )$ , $\eta \in (0,1]$ satisfy

$$ \begin{align*} \frac{1}{p}+\frac{\eta}{\zeta}>1. \end{align*} $$

Then the nonlinear Young integral

$$ \begin{align*} y_t=\int_0^t A_{\mathrm{d} s}(x_s):=\lim_{\ell\to \infty}\sum_{j=0}^{2^\ell-1}A_{j2^{-\ell}t,(j+1)2^{-\ell}t}(x_{j2^{-\ell}t}) \end{align*} $$

is well-defined. If $A\in C_t^{p-{\mathrm {var}}}C^{\eta }_x$ , then for all $(s,t)\in [0,1]_\leq ^2$ , y satisfies the bound

(5.3)

where the constant N depends only on $1/p+\eta /\zeta $ .

Definition 5.3. Assume (A), $\alpha <0$ and $b\in L^q_t C^\alpha _x$ . Given $\omega \in \Omega _b$ , we say that a path x is an $\omega $ -path solution to (1.6) if $x=\varphi +B^H(\omega )$ , $\varphi \in C^{\zeta -{\mathrm {var}}}_t$ for some $\zeta $ satisfying $1/p_{\alpha ,H} + 1/\zeta>1$ and the equality

(5.4)

$$ \begin{align} \varphi_t=\varphi_0 + \int_0^t \big(T^{B^H}b(\omega)\big)_{\mathrm{d} s}(\varphi_s) \end{align} $$

holds for all $t\in [0,1]$ , the integral being understood in the nonlinear Young sense. We say that a stochastic process X is a path-by-path solution to (1.6) if, for $\mathbb {P}$ -a.e. $\omega \in \Omega _b$ , $X(\omega )$ is an $\omega $ -path solution in the above sense. Given this formulation of the SDE, the concepts of strong and weak solutions are analogous to the classical ones; see Section 1.5 above.

Typically, we encounter more special cases of nonlinear Young integrals than the generality that Lemma 5.2 allows. First of all, the spatial growth of A is often quantified (as in, for example, Corollary 5.1). Secondly, whenever $\varphi $ is a solution to a nonlinear Young equation, it is automatically of p-variation, and its temporal regularity can be often controlled by that of A (see, for example, [Reference Galeati46, Section 3.2] in the Hölder case or Lemma B.1 in Appendix B).

We can then define the notion of flows similarly to Definition 4.1. In fact, the following definition extends the previous one: for functional drifts, taking $A=T^\gamma b$ , using the Riemann sums characterization of the nonlinear Young integral, one can easily verify that

$$ \begin{align*} \int_0^t (T^\gamma b)_{\mathrm{d} s} (\varphi_s) = \int_0^t b_s(\varphi_s+\gamma_s) \mathrm{d} s \quad \forall\, t\in [0,1]. \end{align*} $$

Therefore, in the functional case, Definitions 4.1 and 5.4 coincide via the change of variables

(5.5)

$$ \begin{align} \Psi_{s\to t}(x)=\Phi_{s\to t}(x+\gamma_s)-\gamma_t. \end{align} $$
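The identity between the nonlinear Young integral of $T^\gamma b$ and the classical integral can be observed numerically through the dyadic sums of Lemma 5.2. The following is a rough sketch assuming a hypothetical smooth drift, perturbation and path (`b`, `gamma`, `phi`); the quadrature helpers are illustrative and not part of the theory.

```python
import math

def gamma(t):
    # Hypothetical bounded perturbation standing in for the noise path.
    return math.sin(3.0 * t)

def b(t, x):
    # Hypothetical smooth, bounded drift.
    return math.cos(x)

def phi(t):
    # Hypothetical Lipschitz path to integrate along.
    return t * t

def averaged_increment(s, t, x, m=16):
    """Midpoint-rule approximation of A_{s,t}(x) = int_s^t b_r(x + gamma_r) dr."""
    h = (t - s) / m
    return h * sum(b(s + (i + 0.5) * h, x + gamma(s + (i + 0.5) * h)) for i in range(m))

def young_sum(t, level):
    """Dyadic Riemann sum sum_j A_{t_j, t_{j+1}}(phi_{t_j}) with t_j = j 2^{-level} t."""
    n = 2 ** level
    return sum(averaged_increment(j * t / n, (j + 1) * t / n, phi(j * t / n))
               for j in range(n))

def direct_integral(t, n=4096):
    """Midpoint-rule approximation of int_0^t b_s(phi_s + gamma_s) ds."""
    h = t / n
    return h * sum(b((i + 0.5) * h, phi((i + 0.5) * h) + gamma((i + 0.5) * h))
                   for i in range(n))

# Gap between the dyadic sums and the classical integral at increasing resolution.
errors = [abs(young_sum(1.0, level) - direct_integral(1.0)) for level in (2, 4, 6, 8)]
```

In this smooth toy setting, the gap shrinks as the dyadic level grows, consistent with the Riemann sums characterization; for genuinely distributional b, only the left-hand side of the identity retains a meaning.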

Definition 5.4. Assume $A\in C_t^{p-{\mathrm {var}}} C^{\eta ,\mathrm {loc}}_x$ for some $\eta \in (0,1]$ , $p\in [1,2)$ satisfying $(1+\eta )/p>1$ . A semiflow associated to the nonlinear Young equation

(5.6)

$$ \begin{align} y_t = y_0 + \int_0^t A_{\mathrm{d} s} (y_s) \end{align} $$

is a jointly measurable map $\Psi :[0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ such that

• for all $(s,x)\in [0,1]\times \mathbb {R}^d$ , one has $\Psi _{s\to \cdot }(x)\in C^{p-{\mathrm {var}}}_t$ and for all $t\in [s,1]$ , one has the equality
$$ \begin{align*} \Psi_{s\to t}(x)=x+\int_s^t A_{\mathrm{d} r}\big(\Psi_{s\to r}(x)\big); \end{align*} $$
• for all $(s,r,t,x)\in [0,1]^3_\leq \times \mathbb {R}^d$ , one has $\Psi _{s\to t}(x)=\Psi _{r\to t}\big (\Psi _{s\to r}(x)\big )$ .

The definitions of flow, random (semi)flow, adaptedness, and Hölder continuity are then exactly as in Definition 4.1.

We are now in the position to state and prove our existence and uniqueness theorems in the case of distributional drift.

Theorem 5.5. Assume (A), $\alpha <0$ , and let $b\in L^q_t C^\alpha _x$ . Then there exists an adapted random semiflow of solutions to (1.6) that is furthermore locally $\beta $ -Hölder continuous $\mathbb {P}$ -almost surely for all $\beta \in (0,1)$ .

Proof. By sacrificing a small amount of regularity, we may and will assume $b\in L^q_t \overline {C^\alpha _x}$ . The proof follows steps similar to those of Theorem 4.3. We take $m\in [2,\infty )$ to be chosen large enough later, as well as a sequence of functions $(b^n)_{n\in \mathbb {N}}$ such that $b^n\in L^q_tC^{2}_x$ and $\|b^n\|_{L^q_t C^\alpha _x}\leq \|b\|_{L^q_t C^\alpha _x}$ for all $n\in \mathbb {N}$ , and $\|b^n-b\|_{L^q_t C^{\alpha -1}_x}\to 0$ as $n\to \infty $ . Replacing b by $b^n$ in (1.6), the equation clearly admits an adapted random semiflow $\Psi ^n_{s\to t}$ . For fixed $(s,t)\in [0,1]^2_\leq $ , $x\in \mathbb {R}^d$ , and $n,n'\in \mathbb {N}$ , by Theorem 3.2, one has the bound

$$ \begin{align*} \big\|\Psi_{s\to t}^n(x)-\Psi_{s\to t}^{n'}(x)\big\|_{L^m}\lesssim \|b^n-b^{n'}\|_{L^q_t C^{\alpha-1}_x}. \end{align*} $$

Similarly, for $(s,t)\in [0,1]^2_\leq $ , $x,x'\in \mathbb {R}^d$ , and $n\in \mathbb {N}$ , Theorem 3.2 yields

$$ \begin{align*} \big\|\Psi_{s\to t}^{n}(x)-\Psi_{s\to t}^{n}(x')\big\|_{L^m}\lesssim |x-x'|. \end{align*} $$

The temporal regularity is obtained from Lemma 2.4: in our present notation, we get

$$ \begin{align*} \big\|\Psi_{s\to t}^{n}(x)-\Psi_{s\to t'}^{n}(x)\big\|_{L^m}\lesssim w_{b,\alpha,q}(t,t')^{1/q}|t'-t|^{\alpha H+1/q'}=:\tilde w(t,t')^{1+\alpha H} \end{align*} $$

with $\tilde w$ defined by the above equality. Regularity in the s variable is obtained precisely as in (4.2). From these estimates, we obtain the convergence

$$ \begin{align*} \Psi^{n}\to \Psi\qquad\text{in }L^m_\omega C_{s,t}C^{1-\kappa,\mathrm{loc}}_x\cap L^m_\omega C^{\mathrm{loc}}_x C^{p_{\alpha,H}-{\mathrm{var}}}_{s,t} \end{align*} $$

to a limit $\Psi $ just as in Theorem 4.3 with all the required properties shown in the same way, except for the fact that $\Psi _{s\to \cdot }(x)$ solves the equation on $[s,1]$ with initial condition x in the nonlinear Young sense. Since at this point s and x are fixed, we assume for simplicity $s=0, x=0$ and denote $\Psi ^{n}_{0\to t}(0)=\psi ^{n}_t$ , $\Psi _{0\to t}(0)=\psi _t$ . It is sufficient to show the convergence

$$ \begin{align*} \int_0^t \big(T^{B^H}b^{n}\big)_{\mathrm{d} s}(\psi^{n}_s)\to \int_0^t\big(T^{B^H}b\big)_{\mathrm{d} s}(\psi_s) \end{align*} $$

in probability for each $t\in [0,1]$ . Recall that by Corollary 5.1, we have that

$$ \begin{align*} T^{B^H}(b^{n}-b)\to 0 \qquad\text{in } C^{p_{\alpha,H}-{\mathrm{var}}}_tC^{1-\kappa,\mathrm{loc}}_x \end{align*} $$

in probability. From the above, we have that $\psi ^{n}$ converges to $\psi $ (and in particular is bounded) in $C^{p_{\alpha ,H}-{\mathrm {var}}}_t$ in probability. Therefore, if we take an auxiliary $\ell \in \mathbb {N}$ and write

$$ \begin{align*} \int_0^t &\big(T^{B^H}b^{n}\big)_{\mathrm{d} s}(\psi^{n}_s)- \int_0^t\big(T^{B^H}b\big)_{\mathrm{d} s}(\psi_s) \\ &=\int_0^t \big(T^{B^H}b^{\ell}\big)_{\mathrm{d} s}(\psi^{n}_s)- \int_0^t\big(T^{B^H}b^{\ell}\big)_{\mathrm{d} s}(\psi_s) \\ &\qquad-\int_0^t \big(T^{B^H}(b^{\ell}-b^{n})\big)_{\mathrm{d} s}(\psi^{n}_s)+ \int_0^t \big(T^{B^H}(b^{\ell}-b)\big)_{\mathrm{d} s}(\psi_s), \end{align*} $$

then we can first choose $\ell $ and n large enough to make the third and fourth integrals small, and then we can keep the same $\ell $ and increase n further to make the difference of the first two terms small, using the Lipschitzness of $b^{\ell }$ . This concludes the proof.

Theorem 5.6. Assume (A), $\alpha <0$ , and let $b\in L^q_t C^\alpha _x$ . Then there exists an event $\tilde \Omega $ of full probability such that for all $\omega \in \tilde \Omega $ , for all $(S,T)\in [0,1]^2_\leq $ , $x\in \mathbb {R}^d$ , there exists only one $\omega $ -path solution to (1.6) on $[S,T]$ with initial condition x; in other words, path-by-path uniqueness holds.

Remark 5.7. In analogy to Remark 4.6, the strong form of uniqueness coming from Theorem 5.6 readily implies pathwise uniqueness of solutions defined on random time intervals (e.g., stopping times) as well as uniqueness in law of weak solutions. In fact, it gives us uniqueness in a larger class of possibly non-adapted pathwise solutions since the nonlinear Young formalism does not require adaptedness of the processes in consideration. However, Theorem 5.5 tells us that the unique solution is in fact a strong one.

Notice, however, that all these considerations only apply in the framework of Definition 5.3 – namely, if the SDE is interpreted in a nonlinear Young sense as (5.4). Differently from the functional one, in the distributional setting, there is no canonical notion of solution, and one can in principle find alternative concepts which fall outside the framework of Definition 5.3 and Theorem 5.6; for a practical example, see Definition 8.1 further below.

Theorem 5.6 follows from a version of Lemma 4.5 in the nonlinear Young setting, which is a generalization of Theorem 5.1 from [Reference Galeati46].

Lemma 5.8. Let $A\in C^{p-{\mathrm {var}}}_t C^{\eta ,\mathrm {loc}}_x$ for some $\eta \in (0,1]$ , $p\in [1,2)$ satisfying $(1+\eta )/p>1$ . Suppose that the nonlinear YDE

$$ \begin{align*} x_t = \int_0^t A_{\mathrm{d} s}(x_s) \end{align*} $$

admits a locally $\beta $ -Hölder continuous semiflow $\Psi $ with any $\beta \in (0,1)$ . Then for any $(S,T)\in [0,1]_\leq ^2$ and $y\in \mathbb {R}^d$ , there exists a unique solution to the nonlinear YDE on $[S,T]$ , which is given by $\Psi _{S\to \cdot }(y)$ .

Proof. The proof is very similar to that of Lemma 4.5, so we will mostly sketch it. Let z be a solution on $[S,T]$ starting from y, which by definition belongs to $C^{q-{\mathrm {var}}}_t$ with some q such that $1/p+\eta /q>1$ . Thus, z is bounded, and in particular, after localizing the argument, we may assume that $\Psi $ is globally $\beta $ -Hölder and that $A\in C^{p-{\mathrm {var}}}_t C^{\eta }_x$ ; furthermore, since the inequalities involving $(\eta ,p,q)$ are strict, we can assume $\eta \in (0,1)$ .

Set ; an application of Lemma B.1 readily informs us that

(5.7)

$$ \begin{align} |\Psi_{s\to t}(x)-x-A_{s,t}(x)| \lesssim w(s,t)^{\frac{1+\eta}{p}} \end{align} $$

uniformly in $(s,t)\in [0,1]_\leq ^2$ and $x\in \mathbb {R}^d$ (the hidden constant can depend on $w(0,1)$ ); a similar bound also holds for $\Psi _{s\to t}(x)$ replaced by $z_t$ .

As before, we fix $\tau \in [S,T]$ and set $f_t:= \Psi _{t\to \tau }(z_t)-\Psi _{S\to \tau }(y)$ ; in order to conclude, it suffices to show that f is constant. As in (4.4), we have $|f_{s,t}|\lesssim |\Psi _{s\to t}(z_s)-z_t|^\beta $ . Moreover, by definition of solution to the YDE and estimate (5.7), it holds that

$$ \begin{align*} |\Psi_{s\to t}(z_s)-z_t| = \big|\Psi_{s\to t}(z_s)-z_s - A_{s,t}(z_s) - (z_t-z_s-A_{s,t}(z_s))\big| \lesssim w(s,t)^{\frac{1+\eta}{p}}. \end{align*} $$

Combining the two estimates, we get

$$ \begin{align*} |f_{s,t}|\lesssim w(s,t)^{\frac{\beta(1+\eta)}{p}}; \end{align*} $$

by assumption, we can choose $\beta $ close enough to $1$ so that $\beta (1+\eta )/p$ is bigger than $1$ , implying the conclusion.

6 Flow regularity and Malliavin differentiability

So far, we have established the existence of a random Hölder continuous semiflow $\Phi _{s\to t}(x)$ ; the aim of this section is to strengthen this result by establishing better properties for $\Phi $ . We will start by showing that $\Phi $ is a random flow in the sense that for each fixed $s<t$ , the maps $x\mapsto \Phi _{s\to t}(x)$ are invertible; see Theorem 6.1 below. The main body of the section is devoted to the proof of Theorem 6.2, showing that both $\Phi _{s\to t}$ and its spatial inverse $\Phi _{s\leftarrow t}$ admit continuous derivatives. We conclude the section by showing that the random variables $\Phi _{s\to t}(x)$ possess a rather strong form of Malliavin differentiability; see Theorem 6.8 below.

From now on, we will use both $\Phi _{s\to t}(x)$ and $\Phi _{s\to t}(x;\omega )$ to denote the semiflow, so as to stress the dependence on the fixed element $\omega \in \Omega $ whenever needed; we start with the promised invertibility.

Theorem 6.1. Let (A) hold, $b\in L^q_tC^\alpha _x$ , and denote by $\Phi _{s\to t}(x;\omega )$ the semiflow of solutions constructed in Theorems 4.3 and 5.5. Then there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde {\Omega }$ and all $(s,t)\in [0,1]^2_\leq $ , the map $x\mapsto \Phi _{s\to t}(x;\omega )$ is a bijection.

Proof. We follow closely the classical arguments by Kunita (cf. [Reference Kunita69, Lemmas II.4.1-II.4.2]), as they are completely independent from the driving noise being Brownian.

First, let us define the family of random variables

$$ \begin{align*} \eta_{s,t}(x,y) := |\Phi_{s\to t}(x) - \Phi_{s\to t}(y)|^{-1}. \end{align*} $$

Set $\gamma =H\wedge 1/q'$ for $\alpha \geq 0$ , $\gamma = \alpha H + 1/q'$ in the case $\alpha <0$ . Recall that the estimates in the proof of Theorem 4.3, respectively Theorem 5.5, overall yield

(6.1)

$$ \begin{align} \| \Phi_{s\to t}(x) - \Phi_{s'\to t'}(y)\|_{L^m} \lesssim |s-s'|^\gamma + |t-t'|^\gamma + |x-y|; \end{align} $$

moreover, by taking expectation in (3.5), we have

(6.2)

$$ \begin{align} \| |\Phi_{s\to t}(x) - \Phi_{s\to t}(y)|^{-1} \|_{L^m} \lesssim |x-y|^{-1}. \end{align} $$

Fix any $\delta>0$ . We can combine estimates (6.1) and (6.2) and argue as in [Reference Kunita69, Lemma II.4.1] to deduce that for any $s<t$ and any x, $x'$ , y, $y'$ satisfying $|x-y|>\delta $ , $|x'-y'|>\delta $ , it holds

(6.3)

$$ \begin{align} \| & \eta_{s,t} (x,y)-\eta_{s',t'}(x',y')\|_{L^m} \nonumber\\ & \lesssim \delta^{-2} \Big[ |x-x'|+|y-y'|+ (1+|x|+|x'|+|y|+|y'|)(|t-t'|^\gamma + |s-s'|^\gamma) \Big]. \end{align} $$

From (6.3), one can apply Kolmogorov’s continuity theorem to deduce that the map $(s,t,x,y)\mapsto \eta _{s,t}(x,y;\omega )$ is continuous on the domain $\{s<t, |x-y|>\delta \}$ for $\mathbb {P}$ -a.e. $\omega $ . As the argument works for any $\delta>0$ , we can find an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde {\Omega }$ , the map $\eta _{s,t}(x,y;\omega )$ is continuous on $\{s<t, |x-y|\neq 0\}$ , which implies that it must also be finite for all $s<t, x\neq y$ . This clearly implies injectivity of $x\mapsto \Phi _{s\to t}(x;\omega )$ for all $s<t$ and $\omega \in \tilde {\Omega }$ .

We move to proving surjectivity, which this time is closely based on [Reference Kunita69, Lemma II.4.2], having established the key inequalities (6.1) and (6.2). Let $\hat {\mathbb {R}}^d=\mathbb {R}^d\cup \{\infty \}$ be the one-point compactification of $\mathbb {R}^d$ ; set $\hat {x}=x/|x|^2$ for $x\in \mathbb {R}^d\setminus \{0\}$ and $\hat {x}=\infty $ for $x=0$ . Define

$$ \begin{align*} \tilde \eta_{s,t}(\hat{x}) = \begin{cases} (1+ |\Phi_{s\to t}(x)|)^{-1}\quad & \text{if } \hat{x}\neq 0,\\ 0 & \text{if } \hat{x}=0. \end{cases} \end{align*} $$

Arguing as in [Reference Kunita69, Lemma II.4.2], we find

(6.4)

$$ \begin{align} \| \tilde\eta_{s,t}(\hat{x}) -\tilde\eta_{s',t'}(\hat{y})\|_{L^m} \lesssim |\hat{x}-\hat{y}| + |t-t'|^\gamma + |s-s'|^\gamma; \end{align} $$

by Kolmogorov’s theorem, we can find an event of full probability, which we still denote by $\tilde {\Omega }$ , such that $\tilde \eta _{s,t}(\hat {x};\omega )$ is continuous at $\hat {x}=0$ and so that $\Phi _{s\to t}(\cdot \,;\omega )$ can be extended to a continuous map from $\hat {\mathbb {R}}^d$ to itself for any $s<t$ and $\omega \in \tilde \Omega $ . This extension, denoted by $\tilde {\Phi }_{s\to t}(x;\omega )$ , is continuous in $(s,t,x)$ for every $\omega \in \tilde \Omega $ , and thus, $\tilde {\Phi }_{s\to t}(\cdot \,; \omega )$ is homotopic to the identity map $\tilde {\Phi }_{s\to s}(\cdot \,;\omega )$ , making it surjective. Its restriction $\Phi _{s\to t}(\cdot \,; \omega )$ to $\mathbb {R}^d$ must then be surjective as well; that is, $x\mapsto \Phi _{s\to t}(x;\omega )$ is surjective for all $s<t$ and $\omega \in \tilde {\Omega }$ .

Our next goal is to establish that $\Phi $ is in fact a random flow of diffeomorphisms; by this, we mean that, in addition to the map $(s,t,x,\omega )\mapsto \Phi _{s\to t}(x;\omega )$ satisfying all the properties listed in Definition 4.1, there exists an event of full probability $\tilde {\Omega }$ such that $x\mapsto \Phi _{s\to t}(x;\omega )$ is a diffeomorphism for all $s<t$ and $\omega \in \tilde {\Omega }$ . We will in fact prove a little bit more:

Theorem 6.2. Let (A) hold, $b\in L^q_tC^\alpha _x$ , and $\Phi $ be the associated random flow. Then there exists a constant $\delta (\alpha ,H)>0$ and an event $\tilde \Omega $ of full probability such that for any $\omega \in \tilde {\Omega }$ and any $s<t$ , the map $x\mapsto \Phi _{s\to t}(x;\omega )$ and its inverse are both $C^{1+\delta ,\mathrm {loc}}_x$ .

In order to prove Theorem 6.2, we will first assume b to be sufficiently smooth ( $b\in L^q_t C^{1+\kappa }_x$ would suffice), so that the associated $\Phi $ is already known to be a flow of diffeomorphisms, and derive estimates which only depend on $\| b\|_{L^q_t C^\alpha _x}$ (cf. Lemma 6.3 and Proposition 6.4 below). Establishing the result rigorously for general b is then accomplished by standard approximation procedures, in the style of Theorems 4.3 and 5.5. We will frequently use the exponent $\varepsilon =(\alpha -1) H+1/q'$ from Lemma 3.1; recall that (A) is equivalent to $\varepsilon>0$ .

Recall that, for regular b, the Jacobian of the flow – namely, the matrix $J_{s\to t}^x := \nabla \Phi _{s\to t}(x)\in \mathbb {R}^{d\times d}$ – is known to satisfy the variational equation

(6.5)

$$ \begin{align} J_{s\to t}^x = I + \int_s^t \nabla b_r(\Phi_{s\to r}(x)) J_{s\to r}^x \mathrm{d} r. \end{align} $$
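As a sanity check, (6.5) can be recovered formally by differentiating the flow equation with respect to the initial datum. The following is only a sketch, assuming b is regular enough to differentiate under the integral sign:

$$ \begin{align*} \Phi_{s\to t}(x) &= x + \int_s^t b_r(\Phi_{s\to r}(x)) \mathrm{d} r + B^H_{s,t},\\ \nabla \Phi_{s\to t}(x) &= I + \int_s^t \nabla b_r(\Phi_{s\to r}(x))\, \nabla \Phi_{s\to r}(x) \mathrm{d} r. \end{align*} $$

The additive noise increment $B^H_{s,t}$ does not depend on x and therefore drops out, so the equation for $J^x_{s\to t}=\nabla \Phi _{s\to t}(x)$ is a pathwise linear equation with random coefficient $\nabla b_r(\Phi _{s\to r}(x))$ .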

Already from this fact we can deduce useful moment estimates for $J^x_{s\to t}$ .

Lemma 6.3. Assume (A) and let $b\in L^q_t C^2_x$ . Then there exists $p(\alpha ,H)<2$ with the following property: for any $m\in [1,\infty )$ , there exists a constant $N=N(m,p,H,\alpha ,q,d,\| b\|_{L^q_t C^\alpha _x})$ such that, for all $x\in \mathbb {R}^d$ and $s\in [0,1]$ , it holds

(6.6)

moreover, for fixed $\delta <\varepsilon $ , for any $x\in \mathbb {R}^d$ and $s \leq t \leq t'$ , it holds

(6.7)

$$ \begin{align} \| J^x_{s\to t} - J^x_{s\to t'} \|_{L^m} \lesssim |t-t'|^\delta. \end{align} $$

Proof. For fixed $s\in [0,1]$ and $x\in \mathbb {R}^d$ , setting $A_{s,t}:= \int _s^t \nabla b_r(\Phi _{s\to r}(x)) \mathrm {d} r$ , equation (6.5) can be regarded as a linear Young differential equation. Arguing as in the proof of Theorem 3.2, one can show that A has finite p-variation for some $p<2$ and that in fact there exists $\mu>0$ (depending on the usual parameters and $\| b\|_{L^q_t C^\alpha _x}$ , but not on x nor s) such that

(6.8)

$$ \begin{align} \mathbb{E}\bigg[ \exp\bigg( \mu \bigg| \sup_{s\leq t<t'\leq 1} \frac{|A_{t,t'}|}{w_{b,\alpha,q}(t,t')^{1/q} |t-t'|^\delta} \bigg|^2\bigg)\bigg] <\infty; \end{align} $$

Lemma B.2 in Appendix B (with $\tilde {p}=p$ ) then implies the pathwise estimate

Claim (6.6) then follows by taking $L^m$ -norms on both sides and observing (as in the proof of Theorem 3.2) that (6.8) implies for all $\lambda>0$ . Similarly, claim (6.7) also follows from Lemma B.2 (this time applying estimate (B.4) therein) combined with (6.8).

The next step in the proof of Theorem 6.2 is given by the following key estimate.

Proposition 6.4. Let b be a regular drift and define $J^x_{s\to t}$ as above; set $\varepsilon =(\alpha -1)H + 1/q'$ . Then there exists $\gamma \in (0,1)$ such that, for any $m\in [1,\infty )$ , there exists $N=N(m,\gamma ,H,\alpha ,q,d,\| b\|_{L^q_t C^\alpha _x})$ such that

(6.9)

$$ \begin{align} \| J^x_{s\to t} - J^y_{s'\to t'}\|_{L^m} \leq N\big[ |x-y|^{\gamma} + |t-t'|^{\varepsilon \gamma} + |s-s'|^{\varepsilon \gamma} \big] \end{align} $$

for all $(s,t), (s',t')\in [0,1]^2_\leq $ and $x,y\in \mathbb {R}^d$ .

The proof requires the following technical refinement of Lemma 3.1.

Lemma 6.5. Assume (A), $h\in L^q_t C^1_x$ , and let $\varphi ^i$ , $i=1,2$ , be two processes satisfying the assumptions of Lemma 3.1 for the same control w; define $\varepsilon $ as therein and set $\psi ^i_t=\int _S^t h_r(B^H_r+\varphi ^i_r) \mathrm {d} r$ . Then for $\gamma \in (0,1)$ satisfying

(6.10)

$$ \begin{align} \varepsilon-\gamma H>0, \quad \varepsilon(2-\gamma)-\gamma H>0, \quad \varepsilon(2-\gamma)-\gamma H + (2-\gamma)/q>1, \end{align} $$

and any $m\in [2,\infty )$ , there exists $N=N(m,\gamma ,H,\alpha ,q,d,\| h\|_{L^q_t C^{\alpha -1}_x})$ such that

$$ \begin{align*} \| (\psi^1-\psi^2)_{s,t} \|_{L^m} \leq N |t-s|^{\varepsilon-\gamma H} w_{h,\alpha-1,q}(s,t)^{\frac{1}{q}} \big(1+w(s,t)\big) \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma. \end{align*} $$

Remark 6.6. The conditions in (6.10) should be understood as ‘ $\gamma $ small enough’. Indeed, note that all three conditions are upper bounds on $\gamma $ , and under condition (A), we can always find $\gamma>0$ satisfying (6.10): as $\gamma \downarrow 0$ , the three conditions become, respectively, $\varepsilon>0$ , $2\varepsilon>0$ , and $2\varepsilon +2/q>1$ , all of which are trivial since $q\leq 2$ .
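To make ‘ $\gamma $ small enough’ concrete, note that each condition in (6.10) is linear in $\gamma $ and can be solved explicitly. The numerical values below are an illustrative choice of parameters, not taken from the text: $H=1/2$ , $q=2$ , $\alpha =1/2$ , for which $\varepsilon =(\alpha -1)H+1/q'=1/4$ .

$$ \begin{align*} \gamma < \frac{\varepsilon}{H}, \qquad \gamma < \frac{2\varepsilon}{\varepsilon + H}, \qquad \gamma < \frac{2\varepsilon + 2/q - 1}{\varepsilon + H + 1/q};\\ \text{for } H=\tfrac12,\ q=2,\ \varepsilon=\tfrac14: \qquad \gamma<\tfrac12, \qquad \gamma<\tfrac23, \qquad \gamma<\tfrac25. \end{align*} $$

In this example, any $\gamma \in (0,2/5)$ is admissible, the third condition being the binding one.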

Proof. The proof is very similar to that of Lemma 3.1, so we will mostly sketch it; the main differences are just the use of Lemma 2.5 with $n=m$ and some interpolation arguments.

Define $A^i_{s,t} = \mathbb {E}_{s-(t-s)}\int _s^t h_r (B^H_r + \mathbb {E}_{s-(t-s)}\varphi ^i_r) \mathrm {d} r$ and $A:=A^1-A^2$ , so that $\psi ^1-\psi ^2$ is the stochastic sewing of $A$ . Arguing similarly as in Lemma 3.1, we have the estimate

$$ \begin{align*} \|A_{s,t}\|_{L^m} & \leq \bigg\| \int_s^t \| P_{|r-s_1|^{2H}} h_r\|_{C^\gamma_x}\, |\mathbb{E}_{s_1} \varphi^1_r-\mathbb{E}_{s_1} \varphi^2_r|^\gamma \mathrm{d} r \bigg\|_{L^m}\\ & \lesssim |t-s|^{\varepsilon-\gamma H} w_{h,\alpha-1,q}(s,t)^{1/q} \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma; \end{align*} $$

the first condition of Lemma 2.5 is verified since $\varepsilon -\gamma H>0$ and $1/q \geq 1/2$ . To control $\mathbb {E}_{s_1} \delta A_{s,u,t}=\mathbb {E}_{s_1} \delta A^1_{s,u,t}-\mathbb {E}_{s_1} \delta A^2_{s,u,t}$ , we decompose it as $\mathbb {E}_{s_1} \delta A_{s,u,t} = I^1-I^2+J^1-J^2$ , similarly to Lemma 3.1. Estimating each of the four terms separately as therein yields

$$ \begin{align*} \sup_i \{|I^i|,|J^i|\}\lesssim |t-s|^{2\varepsilon}w_{h,\alpha-1,q}(s,t)^{1/q}w(s_1,t)^{1/q}; \end{align*} $$

moreover, we have

$$ \begin{align*} \| I^1-I^2\|_{L^m} & \leq \bigg\| \int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi^1_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi^1_r)\big|\mathrm{d} r \\ & \quad -\int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi^2_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi^2_r)\big|\mathrm{d} r\bigg\|_{L^m}\\ & \leq \int_{s_4}^{s_5} \| P_{|r-s_2|^{2H}}h_r\|_{C^1_x} \big( \| \mathbb{E}_{s_1} \varphi^1_r-\mathbb{E}_{s_1} \varphi^2_r\|_{L^m} + \| \mathbb{E}_{s_2} \varphi^1_r-\mathbb{E}_{s_2} \varphi^2_r\|_{L^m}\big) \mathrm{d} r\\ & \lesssim |t-s|^{(\alpha-2)H + 1/q'} w_{h,\alpha-1,q}(s,t)^{1/q} \sup_{r\in [S,T]} \| \varphi_r^1-\varphi^2_r\|_{L^m}, \end{align*} $$

and similarly for $\| J^1-J^2\|_{L^m}$ . Interpolating the two bounds together overall yields

$$ \begin{align*} \| \mathbb{E}_{s_1} \delta A_{s,u,t}\|_{L^m} \lesssim |t-s|^{\varepsilon(2-\gamma)-\gamma H} w_{h,\alpha-1,q}(s,t)^{1/q} w(s_1,t)^{\frac{1-\gamma}{q}} \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma. \end{align*} $$

By the hypothesis (6.10), the power of $|t-s|$ is positive and the total power of all the controls is greater than $1$ . The conclusion then follows from Lemma 2.5.
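The interpolation step can be spelled out as follows. This is a sketch based on the elementary inequality $\min \{a,b\}\leq a^{1-\gamma }b^{\gamma }$ for $a,b\geq 0$ , applied with a the bound coming from the separate estimates on $I^i$ , $J^i$ and b the bound on the differences, and using $(\alpha -2)H+1/q'=\varepsilon -H$ :

$$ \begin{align*} \big( |t-s|^{2\varepsilon} w(s_1,t)^{1/q}\big)^{1-\gamma}\, \big( |t-s|^{\varepsilon-H}\big)^{\gamma} & = |t-s|^{2\varepsilon(1-\gamma)+\gamma(\varepsilon-H)}\, w(s_1,t)^{\frac{1-\gamma}{q}}\\ & = |t-s|^{\varepsilon(2-\gamma)-\gamma H}\, w(s_1,t)^{\frac{1-\gamma}{q}}. \end{align*} $$

The common factor $w_{h,\alpha -1,q}(s,t)^{1/q}$ is unaffected, while the supremum of $\| \varphi ^1_r-\varphi ^2_r\|_{L^m}$ enters with power $\gamma $ . Adding the exponent of $|t-s|$ to the exponents of the two controls gives $\varepsilon (2-\gamma )-\gamma H+(2-\gamma )/q$ , which is exactly the quantity required to exceed $1$ in (6.10).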

Proof of Proposition 6.4.

As usual, we can split estimate (6.9) into three subestimates, with two of the three parameters $(s,t,x)$ fixed and only one varying. From now on, we will fix $\gamma \in (0,1)$ satisfying condition (6.10).

Step 1: $(s,x)$ fixed, $t<t'$ . In this case, the desired estimate is just (6.7) from Lemma 6.3, for the choice $\delta =\gamma \varepsilon < \varepsilon $ .

Step 2: $(s,t)$ fixed, $x\neq y$ . The difference process $v_t:=J^x_{s,t}-J^{y}_{s,t}$ satisfies an affine Young equation of the form $ \mathrm {d} v_t = \mathrm {d} A_t\, v_t + \mathrm {d} z_t$ , $v_s=0$ , for

$$ \begin{align*} A_t = \int_s^t \nabla b_r(\Phi_{s\to r}(x)) \mathrm{d} r, \quad z_t = \int_s^t \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s\to r}(y))\big] J^y_{s\to r} \mathrm{d} r; \end{align*} $$

invoking as usual Lemma B.2 (for $\tilde {p}=p$ ) and applying estimate (6.8), one ends up with

Observe that z itself can be interpreted as a Young integral: $z_t= \int _s^t \mathrm {d} \tilde {A}_r J^y_{s\to r}$ for

$$\begin{align*}\tilde{A}_u:=\int_s^u \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s\to r}(y))\big] \mathrm{d} r. \end{align*}$$

Standard properties of Young integral, together with Cauchy’s inequality, then yield

by estimate (6.6), it only remains to find a bound for . Recall that by construction $\Phi _{s\to r}(x) = \varphi _{s\to r}(x) + B^H_r$ , where the process $\varphi _{s\to \cdot }(x)$ satisfies condition (2.2) (or even (2.4) for $\alpha <0$ ) for $w=w_{b,\alpha ,q}$ . We can apply Lemma 6.5 with the choice $h=\nabla b$ , $\varphi ^1_r=\varphi _{s\to r}(x)$ , $\varphi ^2_r=\varphi _{s\to r}(y)$ to obtain, for all $s\leq r<u\leq 1$ and all $m\in [1,\infty )$ ,

$$ \begin{align*} \| \tilde{A}_{r,u}\|_{L^m} &\lesssim |r-u|^{\varepsilon-\gamma H} w(r,u)^{1/q} (1+ \| b\|_{L^q_t C^\alpha_x}^q) \sup_{r\in [s,1]} \| \varphi^1_r - \varphi^2_r\|_{L^m}^\gamma\\ & \lesssim |r-u|^{\varepsilon-\gamma H} w(r,u)^{1/q} |x-y|^\gamma, \end{align*} $$

where in the second inequality, we used estimate (6.1). By Lemma A.3 in Appendix A, we deduce that, for any $m\in [1,\infty )$ and $\delta <\varepsilon -\gamma H$ , it holds

Combining all the above estimates yields the conclusion in this case.

Step 3: $(t,x)$ fixed, $s<s'$ . This step is mostly a variation on the arguments presented in the previous cases, so we only sketch it. We can write

$$\begin{align*}J^x_{s,t}= J^x_{s,s'} + \int_{s'}^t \nabla b_r(\Phi_{s\to r}(x)) J^x_{s,r} \mathrm{d} r \end{align*}$$

so that the difference $v_t= J^x_{s,t} - J^x_{s',t}$ can be regarded as the solution to an affine Young equation on $[s',t]$ , for A and z defined similarly as in Step 2; the only difference is that now $v_{s'} = J^x_{s,s'}-I$ and $z_t = \int _{s'}^t \mathrm {d} \tilde A_r J^x_{s'\to r}$ for the choice

$$ \begin{align*} \tilde{A}_u:=\int_{s'}^u \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s'\to r}(x))\big] \mathrm{d} r. \end{align*} $$

From here, the estimates are almost identical to those of Step 2, relying on a combination of Lemmas B.2, A.3 and 6.5; however, in this case, an application of Step 1 and estimate (6.1) gives us

$$ \begin{align*} \| J^x_{s\to s'}-I\|_{L^m} \lesssim |s-s'|^{\varepsilon \gamma}, \quad \sup_{r\in [s',1]} \| \Phi_{s\to r}(x)-\Phi_{s'\to r}(x)\|_{L^m}^\gamma \lesssim |s-s'|^{\varepsilon \gamma}. \end{align*} $$

We are now finally ready to complete the following:

Proof of Theorem 6.2.

The argument is based on Theorem II.4.4 from [Reference Kunita69]; assume first b to be a regular field. It is clear from (6.9) that for any $\delta <\varepsilon \gamma $ , the map $(s,t,x)\mapsto J_{s\to t}^x$ is $\mathbb {P}$ -a.s. locally $\delta $ -Hölder continuous, with suitable moment estimates depending only on $\| b\|_{L^q_t C^\alpha _x}$ . Furthermore, letting $K_{s\to t}^x$ denote the inverse of $J_{s\to t}^x$ in the sense of matrices, it is well-known that it solves the linear equation

(6.11)

$$ \begin{align} K_{s\to t}^x = I - \int_s^t K_{s\to r}^x\, \nabla b_r(\Phi_{s\to r}(x)) \mathrm{d} r; \end{align} $$

arguing as in the proof of Proposition 6.4, one can prove that

$$ \begin{align*} \| K^x_{s\to t} - K^y_{s'\to t'}\|_{L^m} \lesssim |x-y|^\gamma + |t-t'|^{\varepsilon \gamma} + |s-s'|^{\varepsilon \gamma} \end{align*} $$

so that $K$ is $\mathbb {P}$ -a.s. locally $\delta $ -Hölder continuous as well.
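For regular b, equation (6.11) can be checked by differentiating the identity $K^x_{s\to t}J^x_{s\to t}=I$ in t. The following formal computation assumes enough regularity for (6.5) to hold in differential form for a.e. t:

$$ \begin{align*} 0 = \partial_t \big( K^x_{s\to t} J^x_{s\to t}\big) & = \big(\partial_t K^x_{s\to t}\big) J^x_{s\to t} + K^x_{s\to t}\, \nabla b_t(\Phi_{s\to t}(x))\, J^x_{s\to t},\\ \text{hence}\quad \partial_t K^x_{s\to t} & = - K^x_{s\to t}\, \nabla b_t(\Phi_{s\to t}(x)), \qquad K^x_{s\to s}=I. \end{align*} $$

Integrating in time gives (6.11). Note that K multiplies $\nabla b$ from the left, whereas in (6.5) the Jacobian J is multiplied by $\nabla b$ from the left; apart from this change of order and the sign, the two equations have the same structure, which is why the arguments of Proposition 6.4 carry over.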

In the case of general $b\in L^q_t C^\alpha _x$ , we can consider a sequence $b^n$ of regular functions such that $b^n\to b$ in $L^q_t C^\alpha _x$ (up to sacrificing a little bit of spatial regularity as usual), in which case we already know that the associated flows $\Phi ^n$ converge to $\Phi $ in $L^m_\omega C_{s,t} C^{\delta ,\mathrm {loc}}_x$ ; combined with the aforementioned moments estimates, one can then upgrade it to convergence in $L^m_\omega C_{s,t} C^{1+\delta ,\mathrm {loc}}_x$ . In particular, the fields $J^{x,n}_{s\to t}=\nabla \Phi ^n_{s\to t}(x)$ and $K^{x,n}_{s\to t}=(\nabla \Phi ^n_{s\to t}(x))^{-1}$ converge respectively to $J^x_{s\to t}$ and $K^x_{s\to t}$ ; by the limiting procedure, there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $ , it holds $J^x_{s\to t}(\omega )=\nabla \Phi _{s\to t}(x;\omega )$ and $J^x_{s\to t}(\omega ) K^x_{s\to t}(\omega )=I$ for all $s<t$ and $x\in \mathbb {R}^d$ , as well as $J(\omega ), K(\omega )\in C_{s,t} C^{\delta ,\mathrm {loc}}_x$ .

Overall, for every $\omega \in \tilde {\Omega }$ , the map $(s,t,x)\mapsto \Phi _{s\to t}(x;\omega )$ has regularity $C_{s,t} C^{1+\delta ,\mathrm {loc}}_x$ , and its Jacobian admits a continuous inverse $K^x_{s\to t}(\omega )$ . But this implies that, for any $s<t$ , $\nabla \Phi _{s\to t}(x;\omega )$ is a nondegenerate matrix for all $x\in \mathbb {R}^d$ , which by the inverse function theorem readily implies that the inverse of $x\mapsto \Phi _{s\to t}(x;\omega )$ must belong to $C^{1+\delta ,\mathrm {loc}}_x$ as well. This concludes the proof.

It is well-known in the regular case that the Jacobian of the flow and the Malliavin derivative satisfy the same type of linear equation. Therefore, as the last main result of the section, we show Malliavin differentiability of the random variables $X^x_{s\to t}(\omega ):= \Phi _{s\to t}(x;\omega )$ . To this end, we start with a simple yet powerful lemma, showing that deterministic perturbations of the driving noise $B^H$ do not affect our solution theory.

Lemma 6.7. Assume (A), $b\in L^q_t C^\alpha _x$ , and let $h: [0,1]\to \mathbb {R}^d$ be a deterministic, measurable function; then for any $s\in [0,1]$ and any $x\in \mathbb {R}^d$ , there exists a pathwise unique strong solution to the perturbed SDE

(6.12)

$$ \begin{align} X_t = x + \int_s^t b_r(X_r) \mathrm{d} r + B^H_{s,t} + h_{s,t} \quad \forall\, t\in [s,1], \end{align} $$

which we denote by $X_{s\to \cdot }(x;h)$ ; in the distributional case $\alpha <0$ , equation (6.12) should be interpreted in the sense of Definition 5.3.

Proof. We give two short alternative arguments to verify the claim. On one hand, carefully going through the proofs of Sections 2–3, the only key properties needed on the process $B^H$ (cf. also Remark 1.12) are its Gaussianity and the two-sided bounds

$$ \begin{align*} \mathbb{E}[ |B^H_t - \mathbb{E}_s B^H_t|^2] \sim |t-s|^{2H}, \end{align*} $$

which are clearly still true for $\tilde {B}^H=B^H+h$ , due to h being deterministic.

Alternatively, if we define $\tilde {b}_t(z):= b_t(z+h_t)$ , $y=x-h_s$ , then any solution X to (6.12) is in a $1$ - $1$ correspondence with a solution $Y:=X-h$ to the unperturbed SDE

$$\begin{align*}Y_t = y + \int_s^t \tilde{b}_r(Y_r)\mathrm{d} r + B^H_{s,t}, \end{align*}$$

and it is clear that $\tilde {b}$ still satisfies condition (A), thus implying its well-posedness.

We can now pass to study Malliavin differentiability of $X^x_{s\to t}$ . To this end, it is convenient to first recall the notion of $\mathcal {H}$ -derivative. Let $\mathcal {H}^H$ denote the Cameron-Martin space associated to $B^H$ ; we say that a function $F:\Omega \to \mathbb {R}$ is $\mathcal {H}$ -continuously differentiable if for $\mathbb {P}$ -a.e. $\omega \in \Omega $ , the map $h\mapsto F(\omega +h)$ is Fréchet differentiable from $\mathcal {H}^H$ to $\mathbb {R}$ . In particular, this implies the existence of a random bounded linear operator $\partial F(\omega )$ , which we call the $\mathcal {H}$ -differential of F, such that $\mathbb {P}$ -a.s.

$$ \begin{align*} \partial F(\omega)(h)=\partial_h F(\omega):= \lim_{\varepsilon\to 0} \frac{F(\omega+\varepsilon h)-F(\omega)}{\varepsilon}. \end{align*} $$

Denote by $\| \partial F\|$ the (random) operator norm of $\partial F(\omega )$ , as a linear operator from $\mathcal {H}^H$ to $\mathbb {R}^d$ . It is known (cf. [Reference Nualart80, Section 4.1.3]) that if $F\in L^2$ and $\| \partial F\|\in L^2$ , then F is Malliavin differentiable and its Malliavin differential $DF$ satisfies $\|D F\|_{\mathcal {H}^H} = \| \partial F\|$ $\mathbb {P}$ -a.s. For this reason, when dealing with $X_{s\to t}^x$ , it will be convenient for us to manipulate directly the directional derivatives $\partial _h X^x_{s\to t}$ . This notion of derivative allows us to consider h coming from a larger class than merely Cameron-Martin paths; see Remark 6.9 below for a more detailed explanation.

Theorem 6.8. Assume (A) and $b\in L^q_t C^\alpha _x$ . In the setting of Lemma 6.7, let us set $X^x_{s,t}(h):= X_{s\to t}(x;h)$ . Then $\mathbb {P}$ -a.s. the random variables $\partial _h X^x_{s\to t}$ exist for all $h\in C^{2-{\mathrm {var}}}_t$ and define a (random) linear map $\partial X^x_{s,t}$ . Moreover, for any $m\in [1,\infty )$ , it holds

(6.13)

$$ \begin{align} \sup_{s\in [0,1],x\in \mathbb{R}^d} \Big\| \sup_{t\in [s,1]} \| \partial X^x_{s,t}\|_{\mathcal{L}(C^{2-{\mathrm{var}}};\mathbb{R}^d)} \Big\|_{L^m}<\infty. \end{align} $$

In particular, $X^x_{s\to t}$ is Malliavin differentiable, and for any $m\in [1,\infty )$ , it holds

(6.14)

$$ \begin{align} \sup_{s\in [0,1], x\in \mathbb{R}^d} \Big\| \sup_{t\in [s,1]} \| D X^x_{s\to t} \|_{\mathcal{H}^H} \Big\|_{L^m} <\infty. \end{align} $$

Proof. For simplicity, we give the proof in the case where b is smooth, so that all the computations are rigorous, but keeping track that the estimate (6.14) only depends on $\| b\|_{L^q_t C^\alpha _x}$ . The general case then follows by standard (but a bit tedious) approximation arguments, similar to those of Theorems 4.3–5.5; for estimate (6.14), one can alternatively invoke [Reference Nualart80, Lemma 1.5.3].

For smooth b, $\partial _h X^x_{s\to t}$ is classically characterized as the unique solution to the affine equation

(6.15)

$$ \begin{align} \partial_h X^x_{s\to t} = \int_s^t \nabla b_r(X^x_{s\to r}) \partial_h X^x_{s\to r} \mathrm{d} r + h_{s,t}. \end{align} $$

Consider the process $A_t:= \int _s^t \nabla b_r(X^x_{s\to r}) \mathrm {d} r$ as usual, which satisfies (6.8), so that it has $\mathbb {P}$ -a.s. finite p-variation for some $p<2$ , and moreover,

(6.16)

for all $\lambda \in \mathbb {R}$ , where the estimate only depends on $\| b\|_{L^q_t C^\alpha _x}$ and does not depend on x or s. Interpreting (6.15) as an affine Young equation and applying Lemma B.2 from Appendix B with $\tilde {p}=2$ , we then find $C>0$ such that

taking first supremum over $h\in C^{2-{\mathrm {var}}}$ with $\| h\|_{2-{\mathrm {var}}}=1$ and then over $t\in [s,1]$ , we arrive at the pathwise $\mathbb {P}$ -a.s. inequality

Taking the $L^m$ -norm on both sides, using (6.16), then readily yields (6.13).

Estimate (6.14) then follows from the isometric identification of $D X^x_{s,t}$ with $\partial X^x_{s,t}$ , so that $\|D X^x_{s,t}\|_{\mathcal {H}^H}= \| \partial X^x_{s,t}\|$ , combined with the functional embedding $\mathcal {H}^H\hookrightarrow C^{2-{\mathrm {var}}}_t$ ; see Lemma C.1 in Appendix C for $H\in (0,1/2)$ and recall that $\mathcal {H}^H\hookrightarrow C^{1-{\mathrm {var}}}_t$ for $H\geq 1/2$ .

Remark 6.9. Results on differentiability beyond the usual Malliavin sense, in the sense of the existence of $\partial _h X^x_{s,t}$ for h belonging to a larger class than $\mathcal {H}^H$ , were already observed for standard SDEs in [Reference Kusuoka70] and have natural explanations in rough path theory (cf. [Reference Cass, Friz and Victoir19, Reference Friz and Victoir42]); in these works, however, only $h\in C^{\tilde p-{\mathrm {var}}}_t$ for some $\tilde p<2$ are allowed. Here instead, not only are we able to reach $C^{2-{\mathrm {var}}}_t$ , but the result can be further strengthened to allow for some $\tilde p>2$ : indeed, the key point is a combination of estimate (6.16) and Lemma B.2, which works as long as the condition $1/\tilde {p}>1-1/p$ is satisfied.
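The admissible range of $\tilde p$ can be read off directly from this condition. The value $p=3/2$ below is purely illustrative and is not claimed to be the exponent produced by the proofs:

$$ \begin{align*} \frac{1}{\tilde p} > 1-\frac{1}{p} \quad \Longleftrightarrow \quad \tilde p < \frac{p}{p-1}, \qquad p<2 \ \Longrightarrow\ \frac{p}{p-1}>2; \qquad \text{e.g. } p=\tfrac32 \ \Longrightarrow\ \tilde p<3. \end{align*} $$

In other words, the closer p is to $1$ , the rougher the admissible perturbations h, while $\tilde p=2$ is always allowed as soon as $p<2$ .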

7 McKean-Vlasov equations

Armed with the stability estimate (3.4), we can now solve distribution dependent SDEs (henceforth DDSDEs) of the form

(7.1)

$$ \begin{align} X_t = X_0 + \int_0^t F_s(X_s,\mu_s)\mathrm{d} s + B^H_t, \quad \mu_t=\mathcal{L}(X_t). \end{align} $$

The initial condition $X_0$ is assumed to be $\mathcal {F}_0$ -measurable – in particular, independent of $B^H$ . The idea that estimates of the form (3.4), where the difference of two drifts only appears in the weaker norm of $L^q_t C^{\alpha -1}_x$ , can be exploited to solve DDSDEs was first introduced in [Reference Galeati, Harang and Mayorcas51]; the results presented here can be regarded as a natural extension, requiring less time regularity on the drift and allowing to cover $H>1$ as well. In particular, as in the previous sections, we will not need to exploit Girsanov transform, which instead played a prominent role in [Reference Galeati, Harang and Mayorcas51].

Since our analysis also includes the case of distributional drifts F, we provide a meaningful definition of solution; observe that in the case F is actually continuous in the space variable (i.e., $\alpha>0$ ), it reduces to the classical one.

Definition 7.1. Let $H\in (0,\infty )\setminus \mathbb {N}$ and $F:[0,1]\times \mathcal {P}(\mathbb {R}^d)\to C^\alpha _x$ be a measurable function. We say that a tuple $(\Omega ,\mathbb {F},\mathbb {P}; X,B^H)$ is a weak solution to (7.1) if

i) $B^H$ is an $\mathbb {F}$ -fBm of parameter H and X is $\mathbb {F}$ -adapted;
ii) setting $b^X_t(\cdot ):=F_t(\cdot ,\mathcal {L}(X_t))$ , it holds $b^X\in L^q_t C^\alpha _x$ for some $(q,\alpha )$ satisfying (A);
iii) X solves the SDE associated to $b^X$ , in the sense of Section 5.

Similarly to Definition 7.1, one can immediately extend the concepts of strong existence, pathwise uniqueness and uniqueness in law to the DDSDE (7.1). With a slight abuse, we will use the terminology input data of the DDSDE (7.1) to indicate both the pair $(X_0,B^H)$ (when discussing strong existence and/or pathwise uniqueness of solutions) and the pair $(\xi ,\mu ^H)=(\mathcal {L}(X_0),\mathcal {L}(B^H))$ (when discussing uniqueness in law). We are now ready to formulate our main assumptions on the drift F.

Assumption 7.2. Let $H\in (0,\infty )\setminus \mathbb {N}$ be fixed and let $F:[0,1]\times \mathcal {P}(\mathbb {R}^d)\to C^\alpha _x$ be a measurable function; we assume that there exist parameters $(\alpha ,q)$ satisfying (A) and $h\in L^q_t$ such that

i) for all $t\in [0,1],\, \mu \in \mathcal {P}(\mathbb {R}^d)$ , it holds $\| F_t(\cdot ,\mu )\|_{C^\alpha _x} \leq h_t$ ;
ii) for all $t\in [0,1],\, \mu ,\nu \in \mathcal {P}(\mathbb {R}^d)$ , it holds $\| F_t(\cdot ,\mu )-F_t(\cdot ,\nu )\|_{C^{\alpha -1}_x} \leq h_t \mathbb {W}_1(\mu ,\nu )$ .

Remark 7.3. Basic examples of F satisfying Assumption 7.2 include the following (for their verification, we refer to Section 2.1 from [Reference Galeati, Harang and Mayorcas51]):

i) The true McKean–Vlasov case $F_t(\cdot ,\mu )=f_t(\cdot )+(g_t\ast \mu )(\cdot )$ for $f,g\in L^q_t C^\alpha _x$ ;
ii) Mean-dependence of the form $F_t(\cdot ,\mu )=f_t(\cdot \,-\langle \mu \rangle )$ , where $\langle \mu \rangle :=\int y\,\mu (\mathrm {d} y)$ ;
iii) The mean $\langle \mu \rangle $ in ii) can be replaced by other functions of statistics (e.g., $\langle \psi ,\mu \rangle $ for $\psi \in C^1_x$ ); one can also take linear combinations of the previous examples.

Also, in Assumption 7.2, we only considered the $1$ -Wasserstein distance $\mathbb {W}_1$ , but, in fact, all the results below would also hold if we replaced $\mathbb {W}_1$ with $\mathbb {W}_p$ for some $p\in (1,\infty )$ .

Theorem 7.4. Let F satisfy Assumption 7.2. Then for any $\mathcal {F}_0$ -measurable $X_0 \in L^1_\omega $ (respectively, $\xi \in \mathcal {P}_1(\mathbb {R}^d)$ ), strong existence, pathwise uniqueness and uniqueness in law of solutions to (7.1) hold.

Proof. We start by showing strong existence and pathwise uniqueness by means of a contraction argument. Specifically, suppose we are given a filtered probability space $(\Omega ,\mathbb {F},\mathbb {P})$ on which are defined an $\mathbb {F}$ -fBm $B^H$ and an $\mathcal {F}_0$ -measurable $X_0\in L^1_\omega $ . Consider the space of adapted processes

$$\begin{align*}E:=\Big\{Y:[0,1]\to\mathbb{R}^d:\, Y \text{ is adapted to }\mathcal{F}_t, \sup_{t\in [0,1]} \| Y_t\|_{L^1}<\infty \Big\},\end{align*}$$

which is a complete metric space when endowed with the metric

$$\begin{align*}d_E(Y,Z):=\sup_{t\in [0,1]} e^{-\lambda \int_0^t |h_s|^q \mathrm{d} s} \| Y_t-Z_t\|_{L^1} \end{align*}$$

for a parameter $\lambda>0$ to be chosen later. Define a map I acting on E by letting $I(Y)$ be the unique solution X to the SDE driven by $B^H$ , with initial data $X_0$ (cf. Remark 4.7) and drift $b^Y_t:=F_t(\cdot \, ,\mathcal {L}(Y_t))$ ; the map I is well-defined thanks to Point i) from Assumption 7.2, ensuring the solvability of such SDE. Note that X is a solution to the DDSDE (7.1) on the space $(\Omega ,\mathbb {F},\mathbb {P})$ with input data $(X_0,B^H)$ if and only if it is a fixed point for I.

We claim that I is a contraction on $(E,d_E)$ ; indeed, given any $Y^1,\,Y^2$ , by the stability estimate (3.4) and Assumption 7.2, for any $t\in [0,1]$ , it holds

$$ \begin{align*} \| I(Y^1)_t-I(Y^2)_t\|_{L^1}^q & \lesssim \int_0^t \| F_s(\cdot\,,\mathcal{L}(Y^1_s))-F_s(\cdot\,,\mathcal{L}(Y^2_s))\|_{C^{\alpha-1}}^q \mathrm{d} s\\ & \lesssim \int_0^t |h_s|^q \,\mathbb{W}_1(\mathcal{L}(Y^1_s),\mathcal{L}(Y^2_s))^q \mathrm{d} s\\ & \lesssim d_E(Y^1,Y^2)^q \int_0^t |h_s|^q\, e^{q \lambda \int_0^s |h_r|^q \mathrm{d} r} \mathrm{d} s\\ & \lesssim (q \lambda)^{-1}\, e^{\lambda q \int_0^t |h_r|^q \mathrm{d} r}\, d_E(Y^1,Y^2)^q. \end{align*} $$

Rearranging the terms, we overall find the estimate

$$\begin{align*}d_E\big(I(Y^1),I(Y^2)\big)^q \leq \frac{C}{q \lambda}\, d_E(Y^1,Y^2)^q, \end{align*}$$

from which contractivity follows by choosing $\lambda $ large enough. Pathwise uniqueness then readily follows; as the argument holds for any choice of $\mathbb {F}$ , we can take $\mathcal {F}_t=\sigma \{X_0, B^H_s, s\leq t\}$ , yielding strong existence.
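The last inequality in the chain of estimates above is an exact integration. Writing $g(t):=\int _0^t |h_r|^q \mathrm {d} r$ (a shorthand introduced here only for this computation), which is absolutely continuous since $h\in L^q_t$ , one has

$$ \begin{align*} \int_0^t |h_s|^q\, e^{q\lambda g(s)} \mathrm{d} s = \int_0^t g'(s)\, e^{q\lambda g(s)} \mathrm{d} s = \frac{e^{q\lambda g(t)}-1}{q\lambda} \leq \frac{e^{q\lambda g(t)}}{q\lambda}. \end{align*} $$

Multiplying by $e^{-q\lambda g(t)}$ and taking the supremum over $t\in [0,1]$ gives the stated bound on $d_E(I(Y^1),I(Y^2))^q$ ; the map I is therefore a strict contraction as soon as $C/(q\lambda )<1$ , that is, for $\lambda>C/q$ .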

To establish uniqueness in law, it suffices to observe that if X is a weak solution, then we can construct a copy of it on any reference probability space simply by solving therein the SDE associated to $b^X_t(\cdot )= F_t(\cdot ,\mathcal {L}(X_t))$ : by weak uniqueness for the SDE associated to $b^X$ (see Remark 4.6), the solution $\tilde {X}$ constructed in this way must have the same law as the original X and thus be a solution to the DDSDE itself. Given any pair of weak solutions $X^1,X^2$ , possibly defined on different probability spaces, we can then construct a coupling $(\tilde {X}^1,\tilde {X}^2)$ of them on the same probability space, solving the DDSDE for the same input data $(X_0,B^H)$ ; by the previous argument, it must hold $\tilde {X}^1\equiv \tilde {X}^2$ and so $\mathcal {L}(X^1)=\mathcal {L}(X^2)$ .

Remark 7.5. In fact, going through the same strategy of proof as in [Reference Galeati, Harang and Mayorcas51] not only allows us to establish wellposedness of the DDSDE but also yields stability estimates for DDSDEs. Specifically, assume we are given fields $F^i$ , $i=1,2$ , satisfying Assumption 7.2 for the same parameters $(\alpha ,q)$ and functions $h^i\in L^q_t$ and define the quantity

$$\begin{align*}\| F^1-F^2\|_{\alpha-1,q} :=\bigg( \int_0^1 \sup_{\mu\in \mathcal{P}_1} \big\| F^1_t(\cdot\,,\mu)-F^2_t(\cdot\,,\mu)\big\|_{C^{\alpha-1}_x}^q \mathrm{d} t\bigg)^{1/q}. \end{align*}$$

Then for any $m\in [1,\infty )$ , there exists a constant C, depending on $\alpha ,q,H,m,d, \| h^i\|_{L^q}$ , such that any two solutions $X^i$ defined on the same space with input data $(X_0^i, B^H)$ satisfy

(7.2)

$$ \begin{align} \big\| \| X^1-X^2\|_{C^0_t} \big\|_{L^m} \leq C \big(\|X^1_0-X^2_0\|_{L^m} + \| F^1-F^2\|_{\alpha-1,q}\big); \end{align} $$

in the case of solutions defined on different spaces, using (7.2) and a coupling argument, we can easily deduce bounds on the Wasserstein distances of their laws. In the true McKean–Vlasov case – namely, $F^i_t(\cdot \,,\mu )=f^i_t+g^i_t\ast \mu $ with $f^i,g^i\in L^q_t C^\alpha _x$ – it holds

$$\begin{align*}\| F^1-F^2\|_{\alpha-1,q} \lesssim \| f^1-f^2\|_{L^q_t C^{\alpha-1}_x} + \| g^1-g^2\|_{L^q_t C^{\alpha-1}_x}. \end{align*}$$
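This bound reflects the fact that convolution with a probability measure does not increase translation-invariant norms. A formal sketch, assuming the $C^{\alpha -1}_x$ -norm is translation invariant (as for the usual Hölder-Besov norms) and writing the convolution as an integral over translates, reads as follows for fixed t and $\mu $ :

$$ \begin{align*} \big\| F^1_t(\cdot\,,\mu)-F^2_t(\cdot\,,\mu)\big\|_{C^{\alpha-1}_x} & \leq \| f^1_t-f^2_t\|_{C^{\alpha-1}_x} + \Big\| \int (g^1_t-g^2_t)(\cdot-y)\, \mu(\mathrm{d} y)\Big\|_{C^{\alpha-1}_x}\\ & \leq \| f^1_t-f^2_t\|_{C^{\alpha-1}_x} + \| g^1_t-g^2_t\|_{C^{\alpha-1}_x}. \end{align*} $$

The right-hand side does not depend on $\mu $ , so one can take the supremum over $\mu $ and then the $L^q_t$ -norm, using Minkowski's inequality to separate the two terms.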

8 Weak compactness and weak existence

So far, we have shown that, under suitable conditions on b (condition (A)), we have (very) strong existence and uniqueness results. However, as we are now going to show, stochastic sewing also allows us to establish weak existence and weak compactness of solutions in the regime (B) (defined just before Theorem 1.5), similarly to [Reference Athreya, Butkovsky, Lê and Mytnik3, Theorem 2.6(i)], [Reference Anzeletti, Richard and Tanré2, Theorem 2.8]. For other applications of sewing techniques and compactness arguments, see also [Reference Bechtold and Hofmanová7].

This section is also our way to say something about the equation in the case $q>2$ that goes beyond the trivial inclusion $L^q_t\subset L^2_t$ .

Since here we assume $\alpha <0$ , it is a priori not fully clear what it means to be a weak solution to the equation. Contrary to Section 5, where a robust interpretation was accomplished by the nonlinear Young formalism, here we will adopt the following, weaker definition, adapting the notion from [Reference Bass and Chen6]. This allows us to prove weak existence more generally; see, however, Remark 8.5 for a comparison.

Definition 8.1. Let $b\in L^q_t C^\alpha _x$ for some $\alpha <0$ . We say that a tuple $(\Omega ,{\mathbb {F}},\mathbb {P};X,B^H)$ consisting of a filtered probability space and a pair of continuous processes $(X,B^H)$ is a weak solution to the SDE

(8.1)

$$ \begin{align} X_t = x_0 + \int_0^t b_s(X_s)\mathrm{d} s + B^H_t \end{align} $$

if $B^H$ is an $\mathbb {F}$ -fBm of parameter H, X is $\mathbb {F}$ -adapted, and $X_t=x_0+V_t+B^H_t$ , where the process $V_t$ has the property that, for any sequence of smooth bounded functions $b^n$ converging to b in $L^q_t C^\alpha _x$ , it holds that

$$ \begin{align*} \Big\|\int_0^\cdot b^n(s,X_s)\mathrm{d} s - V_\cdot\Big\|_{C^0_t} \to 0 \quad \text{in probability.} \end{align*} $$

Theorem 8.2. Let $H\in (0,1)$ and $b\in L^q_t C^\alpha _x$ , satisfying (B). Then for any $x_0\in \mathbb {R}^d$ there exists a weak solution to the SDE (8.1) in the sense of Definition 8.1.

Remark 8.3. The above result is only interesting in the regime $H\in (0,1)$ and $q>2$ , cf. Remark 1.6. Indeed, for $H>1$ condition $\alpha>1/2-1/(2H)$ automatically enforces $\alpha>0$ , for which existence follows by classical Peano-type results; instead for $q\leq 2$ , (B) implies (A) and so strong wellposedness follows from the previous sections.

First we need the following lemma.

Lemma 8.4. Let $H\in (0,1)$ , $(\alpha ,q)$ be parameters satisfying (B); let X be a process defined on a filtered probability space $(\Omega ,\mathbb {F},\mathbb {P})$ of the form $X=\varphi +B^H$ , where $B^H$ is an $\mathbb {F}$ -fBm and $\varphi $ satisfies the property (2.4). For any $f\in L^q_t C^\delta _x$ , $\delta>0$ , let $w_f:=w_{f,\alpha ,q}$ ; then for any $m\in [2,\infty )$ , there exists a deterministic constant $K=K(m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$ , such that

$$ \begin{align*} \bigg\| \Big\| \int_s^t f_r(X_r)\mathrm{d} r\Big\|_{L^m|\mathcal{F}_s} \bigg\|_{L^\infty} \leq K w_f(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}. \end{align*} $$

As a consequence, for any $\varepsilon>0$ , there exists a constant $K=K(\varepsilon ,m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$ such that

(8.2)

$$ \begin{align} \bigg\| \Big\| \int_0^\cdot f_r(X_r)\mathrm{d} r \Big\|_{C^{\alpha H + 1/q'-\varepsilon}_t} \bigg\|_{L^m} \leq K \| f\|_{L^q_t C^\alpha_x}. \end{align} $$

By linearity and density, this allows one to extend the map $f\mapsto \int _0^\cdot f_r(X_r)\mathrm {d} r$ , in a unique way, to a continuous linear map from $L^q_t \overline {C^\alpha _x}$ to $L^m_\omega C^0_t$ .

Proof. We only sketch the proof since it is very similar to others already presented (cf. Lemma 3.1). By Lemma 2.4 and the stochastic sewing (again in the version of [Reference Friz, Hocquet and Lê44, Theorem 2.7]), setting $A_{s,t}:=\mathbb {E}_s \int _s^t f_r(\varphi _s+B^H_r)\mathrm {d} r$ and denoting $\beta =1/q'+\alpha H$ , standard computations imply

$$ \begin{align*} \| A_{s,t}\|_{L^\infty} & \lesssim |t-s|^\beta w_f(s,t)^{1/q},\\ \big\| \|\mathbb{E}_s\delta A_{s,u,t}\|_{L^m| \mathcal{F}_s} \big\|_{L^\infty} & \lesssim |t-s|^{\beta-H} w_f(s,t)^{1/q} \big\| \| \varphi_{s,u}\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\\ & \lesssim |t-s|^{2\beta-H} w_f(s,t)^{1/q} w_b(s,t)^{1/q}. \end{align*} $$

Under condition (B), one can check that the hypotheses of [Reference Friz, Hocquet and Lê44, Theorem 2.7] are satisfied, which easily yields all the desired estimates.

Let us also recall the definition of $\mathbb {F}$ -fBm and the associated Volterra kernel representation (1.22) from Section 1.4. With these preparations, we can now present the following:

Proof of Theorem 8.2.

As before, we can assume $x_0=0$ without loss of generality. Let $b\in L^q_t C^\alpha _x$ with $(q,\alpha )$ satisfying (B) be given. Since (B) is a strict inequality, we can assume without loss of generality that $q<\infty $ , $b\in L^q_t\overline {C^\alpha _x}$ , and in particular, there exists a sequence $\{b^n\}_n\subset L^q_t C^1_x$ such that $b^n\to b$ in $L^q_t C^\alpha _x$ and $\int _s^t \| b^n_r\|_{C^\alpha _x}^q \mathrm {d} r \leq \int _s^t \| b_r\|_{C^\alpha _x}^q \mathrm {d} r$ (this can be accomplished by taking $b^n_r=\rho _{1/n}\ast b_r$ for some standard mollifiers $\{\rho _\delta \}_{\delta>0}$ , up to replacing $\alpha $ with $\alpha -\varepsilon $ ).

To each such $b^n$ , we can associate a solution $X^n=\varphi ^n + B^H$ , where by Lemma 2.4, $\varphi ^n$ satisfy the bound (2.4) for $w=w_{\alpha ,b,q}$ ; this implies in particular that $\| \varphi ^n_{s,t}\|_m \lesssim |t-s|^{\alpha H + 1/q'}$ uniformly in n, which by Kolmogorov’s theorem readily implies the tightness of the family $\{\varphi ^n\}_n$ . As a consequence, the family $\{(\varphi ^n, B^H, W)\}_n$ is tight in $C_t\times C_t\times C_t$ .
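The tightness step can be sketched as follows, under assumptions we state explicitly. Write $\beta :=\alpha H+1/q'$ and assume $\beta>0$ (this positivity is our reading of condition (B); it is not restated in this section). The symbols $\theta $ , R and C are introduced only for this illustration. Taking $m>1/\beta $ and $\theta \in (0,\beta -1/m)$ , Kolmogorov's continuity theorem applied to the uniform moment bound gives

$$ \begin{align*} \sup_n \mathbb{E}\big[\| \varphi^n\|_{C^\theta_t}^m\big] \leq C <\infty, \qquad\text{hence}\qquad \sup_n \mathbb{P}\big(\| \varphi^n\|_{C^\theta_t}>R\big) \leq \frac{C}{R^m}\to 0 \quad\text{as } R\to\infty. \end{align*} $$

Since $\varphi ^n_0=0$ (recall $x_0=0$ ), the sets $\{\varphi : \varphi _0=0,\ \| \varphi \|_{C^\theta _t}\leq R\}$ are compact in $C_t$ by the Arzelà-Ascoli theorem, which is precisely the tightness of $\{\varphi ^n\}_n$ .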

By Prokhorov’s and Skorokhod’s theorems, we can construct another probability space $(\tilde {\Omega },\tilde {\mathcal {F}},\tilde {\mathbb {P}})$ on which there exists a sequence $\{(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n)\}_n$ such that $(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n)$ is distributed as $(\varphi ^n, B^H, W)$ for each n and $(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n) \to (\tilde {\varphi },\tilde {B}^{H}, \tilde {W}) \tilde {\mathbb {P}}$ -a.s. in $C_t\times C_t\times C_t$ . We claim that $\tilde {X}=\tilde {\varphi }+\tilde {B}^H$ is a weak solution to (8.1), in the sense of Definition 8.1. For notational simplicity, we drop the tildes for the rest of the proof.

First of all, we claim that $B^H$ is still distributed as an fBm of parameter H, W as a standard Bm and that the relation $B^H_t = \int _0^t K_H(t,s) \mathrm {d} W_s$ still holds. The first two statements are an immediate consequence of passing to the limit. For the last one, we can use the fact that for each n, the same relation holds between $B^{H,n}$ and $W^n$ , the fact that $K_H(t,\cdot )$ is square integrable and standard results on convergence of stochastic integrals (e.g., [Reference Debussche, Glatt-Holtz and Temam33, Lemma 2.1]) to conclude that for any fixed t, (1.22) holds $\mathbb {P}$ -a.s. The upgrade to a $\mathbb {P}$ -a.s. statement valid for all $t\in [0,1]$ follows from combining this fact with the uniform convergence of $B^{H,n}$ to $B^H$ .

Next, since $X^n=\varphi ^n+B^{H,n}$ is still a solution to the SDE (8.1) with regular drift $b^n$ , $\varphi ^n$ is adapted to $\mathcal {F}^n_t:=\sigma \{ B^{H,n}_s:s\leq t\}=\sigma \{W^n_s: s\leq t\}$ ; so for any $s<t$ , any $t_1,\ldots , t_n \leq s$ and any pair of continuous bounded functions $F,G$ , it holds

$$ \begin{align*} \mathbb{E}\big[F(W^n_{s,t})G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big] = \mathbb{E}\big[F(W^n_{s,t})\big]\,\mathbb{E}\big[G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big]. \end{align*} $$

Passing to the limit as $n\to \infty $ , the same relation holds for W and $\varphi $ in place of $W^n$ and $\varphi ^n$ , which shows that W is an $\mathbb {F}$ -Bm for $\mathcal {F}_t:=\sigma \{(W_s,\varphi _s):s\leq t\}$ ; in particular, $B^H$ is an $\mathbb {F}$ -fBm. Similarly, since $\varphi ^n$ uniformly satisfy the bound (2.4) w.r.t. $\mathcal {F}^n_t$ , it holds

$$ \begin{align*} \mathbb{E}\big[|\varphi^n_{s,t}|^m \, &G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big] \\ &\lesssim \big(w(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}\big)^m \mathbb{E}\big[G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big]. \end{align*} $$

Passing to the limit as $n\to \infty $ , we conclude that $\varphi $ satisfies (2.4) w.r.t. the filtration $\mathcal {F}_t$ .

Finally, it remains to show that X satisfies the relation $X_t= V_t + B^H_t$ for V satisfying the requirements of Definition 8.1. First, since $B^H$ is an $\mathbb {F}$ -fBm and $\varphi $ satisfies (2.4), Lemma 8.4 applies, so that the process $V_t:=\int _0^t b_r(X_r)\mathrm {d} r$ is well-defined; by this, we mean that the map $f\mapsto \int _0^\cdot f_r(X_r)\mathrm {d} r$ admits a unique extension, and V is the limit in $L^m_\omega C^0_t$ of the processes $\int _0^\cdot b^n_r(X_r)\mathrm {d} r$ , for any sequence of smooth $b^n\to b$ in $L^q_t \overline {C^\alpha _x}$ . By linearity, we have

(8.3)

$$ \begin{align} \mathbb{E} \bigg[ \Big\| \int_0^\cdot f_r(X_r)\mathrm{d} r - V_\cdot \Big\|_{C^{\alpha H + 1/q'-\varepsilon}_t}^m \bigg]^{1/m} \lesssim \| f-b\|_{L^q_t C^\alpha_x} \end{align} $$

for any regular f; a similar estimate holds for any $X^n$ , with b replaced by $b^n$ , with the hidden constants being uniform in n. In order to conclude, again thanks to Lemma 8.4, it suffices to show that $\varphi ^n\to V$ ; for any f as above, it holds

$$ \begin{align*} \mathbb{E}\big[\| \varphi^n-V\|_{C^0_t}\big] & \leq \mathbb{E} \bigg[\Big\| \int_0^\cdot [b^n-f]_r(X^n_r) \mathrm{d} r \Big\|_{C^0_t}\bigg] + \mathbb{E}\bigg[\Big\| \int_0^\cdot [f_r(X^n_r)-f_r(X_r)] \mathrm{d} r \Big\|_{C^0_t}\bigg] \\ &\qquad+ \mathbb{E}\bigg[\Big\| \int_0^\cdot f_r(X_r) \mathrm{d} r - V_\cdot \Big\|_{C^0_t}\bigg]\\ & \lesssim \|b^n-f\|_{L^q_t C^\alpha_x} + \mathbb{E} \bigg[\Big\| \int_0^\cdot [f_r(X^n_r)-f_r(X_r)] \mathrm{d} r \Big\|_{C^0_t}\bigg] + \|b-f\|_{L^q_t C^\alpha_x}, \end{align*} $$

where we applied several times estimate (8.3). Since f is regular, $b^n\to b$ and $X^n\to X$ , passing to the limit, we get

$$ \begin{align*} \limsup_{n\to\infty} \mathbb{E}\bigg[\Big\| \int_0^\cdot b^n_r(X^n_r) \mathrm{d} r - V_\cdot \Big\|_{C^0_t}\bigg] \lesssim 2 \|b-f\|_{L^q_t C^\alpha_x}; \end{align*} $$

by the arbitrariness of f, we can conclude that $\varphi ^n \to V = \varphi $ and so that X is a weak solution.

Remark 8.5. Under Assumption (A), the unique strong solution X to the SDE constructed in Section 5 satisfies Definition 8.1, as readily seen by applying Lemma 3.1 with $h^n=b^n-b$ . In most situations, pathwise solutions X to (8.1) in the nonlinear Young sense (cf. Definition 5.3) which are $\mathbb {F}_t$ -adapted are also weak solutions in the sense of Definition 8.1. Indeed, in order to construct such X, usually one must have already verified that $T^{B^H}$ extends to a bounded operator from $L^q_t C^\alpha _x$ to $L^m_\omega C^{p-{\mathrm {var}}}_t C^{\eta ,\mathrm {loc}}_x$ (similarly to Corollary 5.1) and that $X=\varphi +B^H$ with $\varphi \in C^{\zeta -{\mathrm {var}}}_t \mathbb {P}$ -a.s., for suitable parameters $(p,\eta ,\zeta )$ satisfying $1/p +\eta /\zeta>1$ . Linearity of $T^{B^H}$ and stability of nonlinear Young integration $(A,x)\mapsto \int _0^\cdot A(\mathrm {d} s,x_s)$ (cf. [Reference Galeati46, Theorem 2.7-4)]) then yields

$$ \begin{align*} \Big\| \int_0^\cdot b^n(s,X_s)\mathrm{d} s-\int_0^\cdot T^{B^H}b(\mathrm{d} s,\varphi_s) \Big\|_{C^0_t} \lesssim \big\| T^{B^H}(b^n-b)\big\|_{C^{p-{\mathrm{var}}}_t C^{\eta}_{\| \varphi\|_{\infty}}} (1+\| \varphi\|_{C^{\zeta-{\mathrm{var}}}_t}), \end{align*} $$

where the r.h.s. converges in probability to $0$ due to the aforementioned mapping properties of $T^{B^H}$ and the assumption $b^n\to b$ in $L^q_t C^\alpha _x$ .

The converse implication – namely, whether the weak solution constructed in Theorem 8.2 is also a pathwise solution in the nonlinear Young sense – might only be true for a more restricted range of parameters. Let us only sketch the power counting, omitting the arbitrarily small exponents everywhere. The averaged field $T^{B^H}b$ can be constructed as in Corollary 5.1, as an element of $C^{2-{\mathrm {var}}}_t C^{\alpha +1/(2H),\mathrm {loc}}_x$ . Furthermore, we know from Lemma 2.4 that $\varphi \in C^{r-{\mathrm {var}}}_t$ with $1/r=1+\alpha H$ . Therefore, if

(8.4)

$$ \begin{align} \frac{1}{2}+\left(\alpha + \frac{1}{2H} \right)(\alpha H + 1)>1, \end{align} $$

then the nonlinear Young integral $\int _0^\cdot (T^{B^H}b)_{\mathrm {d} t}(\varphi _t)$ is well-defined and agrees with V. Note that the regime (8.4) is nontrivial in the sense that it allows for drifts for which strong uniqueness is not known, since the left-hand side is strictly greater than $1$ for $\alpha =1-1/(2H)$ . We also remark that (8.4) is sufficient, but not necessary to define $\int _0^\cdot (T^{B^H}b)_{\mathrm {d} t}(\varphi _t)$ , since for particular choices of b, the averaged field $T^{B^H}b$ may enjoy better regularity than $C^{2-{\mathrm {var}}}_t C^{\alpha +1/(2H),\mathrm {loc}}_x$ ; see, for example, [Reference Anzeletti, Richard and Tanré2] for such situations.
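For the reader's convenience, here is the elementary evaluation of the left-hand side of (8.4) at the borderline value $\alpha =1-1/(2H)$ ; it only uses the expression in (8.4). At this value, $\alpha +1/(2H)=1$ and $\alpha H+1=H+1/2$ , so that

$$ \begin{align*} \frac{1}{2}+\left(\alpha + \frac{1}{2H} \right)(\alpha H + 1) = \frac{1}{2} + \Big(H+\frac{1}{2}\Big) = 1+H>1. \end{align*} $$

Since the left-hand side depends continuously on $\alpha $ , the strict inequality persists for $\alpha $ slightly below $1-1/(2H)$ .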

For a deeper discussion about equivalence of different solution concepts for distributional drifts, including the nonlinear Young one, Definition 8.1 and others, we refer to [Reference Anzeletti, Richard and Tanré2, Theorem 2.15] and [Reference Butkovsky, Lê and Mytnik16, Theorem 2.11].

9 $\rho $ -irregularity

The goal of this section is to derive some pathwise properties for solutions of (1.6), without appealing to Girsanov transform. Indeed, in the time-homogeneous setting, Girsanov is unavailable for $H>1$ ,Footnote ⁸ while in the time-dependent case, it does not apply for any value of $H>0$ (since we can allow drifts which are only $L^q$ in time, for values of q arbitrarily close to $1$ ). For more details, see Appendix C.

As a meaningful representative of a larger class of pathwise properties, we will focus on the notion of $\rho $ -irregularity, first introduced in [Reference Catellier and Gubinelli20] in the context of regularisation by noise for ODEs; it has later found several applications in regularisation for PDEs (see [Reference Chouk and Gubinelli28, Reference Chouk and Gess29, Reference Chouk, Gubinelli, Li, Li and Oh30, Reference Galeati and Gubinelli48]), and more recently in the inviscid mixing properties of shear flows [Reference Galeati and Gubinelli50]. Let us also mention the recent work [Reference Romito and Tolomeo89] for an alternative notion of irregularity, partially related to this one.

Definition 9.1. Let $\gamma \in (0,1)$ , $\rho>0$ . We say that a function $h\in C([0,1],\mathbb {R}^d)$ is $(\gamma ,\rho )$ -irregular if there exists a constant N such that

$$ \begin{align*} \Big|\int_s^te^{i\xi\cdot h_r}\mathrm{d} r\Big|\leq N\,|\xi|^{-\rho}|t-s|^\gamma \quad \forall \xi\in\mathbb{R}^d,\quad 0\leq s\leq t\leq 1; \end{align*} $$

we denote by $\| \Phi ^h\|_{\mathcal {W}^{\gamma ,\rho }}$ the optimal constant. We say that h is $\rho $ -irregular for short if there exists $\gamma>1/2$ such that it is $(\gamma ,\rho )$ -irregular.
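As a sanity check on Definition 9.1, consider a minimal worked example that is not taken from the text: the linear path $h_r=vr$ in dimension $d=1$ , with a hypothetical slope $v\neq 0$ . For $\xi \neq 0$ and any $\gamma \in (0,1)$ , direct integration and the elementary bound $\min (a,b)\leq a^\gamma b^{1-\gamma }$ give

$$ \begin{align*} \Big|\int_s^t e^{i\xi v r}\mathrm{d} r\Big| = \frac{|e^{i\xi v t}-e^{i\xi v s}|}{|\xi|\,|v|} \leq \min\Big(|t-s|, \frac{2}{|\xi|\,|v|}\Big) \leq \Big(\frac{2}{|v|}\Big)^{1-\gamma} |\xi|^{-(1-\gamma)} |t-s|^{\gamma}. \end{align*} $$

In this computation, the path is therefore $(\gamma ,1-\gamma )$ -irregular for every $\gamma \in (0,1)$ , hence $\rho $ -irregular for every $\rho <1/2$ (take $\gamma =1-\rho $ ). In dimension $d\geq 2$ , the same path is not $\rho $ -irregular for any $\rho>0$ , since the integral equals $t-s$ for all $\xi $ orthogonal to v.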

It was shown in [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli48] that for any $H\in (0,\infty )\setminus \mathbb {N}$ , $B^H$ is $\rho $ -irregular for any $\rho <1/(2H)$ ; we establish the same for a class of perturbations of $B^H$ satisfying the following assumption.

Assumption 9.2. Let $\varphi :[0,1]\to \mathbb {R}^d$ be a continuous adapted process which admits moments of any order; moreover, there exist $\beta> 0$ and a control w such that, for any $m \in [1,\infty )$ , there exists a constant $C_m$ such that

(9.1)

$$ \begin{align} \big\|\| \varphi_t -\mathbb{E}_s \varphi_t\|_{L^1|\mathcal{F}_s}\big\|_{L^m} \leq C_m w(s,t)^{1/2}|t-s|^{\beta}\quad \forall\, 0\leq s\leq t\leq 1. \end{align} $$

Theorem 9.3. Let $H\in (0, +\infty ) \setminus \mathbb {N}$ and let $\varphi $ satisfy Assumption 9.2 with $\beta =H$ ; then $X:=\varphi +B^H$ is $\mathbb {P}$ -almost surely $\rho $ -irregular for any $\rho < 1 /(2H)$ . More precisely, for any such $\rho $ and any $m\in [1,\infty )$ , there exists $\gamma =\gamma (m,\rho )>1/2$ such that

(9.2)

$$ \begin{align} \mathbb{E}[ \| \Phi^X\|_{\mathcal{W}^{\gamma,\rho}}^m ]<\infty. \end{align} $$

Remark 9.4. Let us make some observations on Assumption 9.2 and Theorem 9.3:

• Lemmas 2.1 and 2.4 provide sufficient conditions on q and $\alpha $ that guarantee that solutions of (1.6) with $b\in L^q_t C^\alpha _x$ satisfy Assumption 9.2. Note that in some cases, we can therefore obtain $\rho $ -irregularity of solutions but not uniqueness.
• Our usual toolbox could in principle be also used to study Gaussian moments of $\Phi ^X$ (under a somewhat stronger condition than (9.1)). For simplicity, we do not pursue this in detail.
• In terms of exponents, the condition (9.1) appears to require the same order of ‘regularity’, namely $1/2+H$ , as Girsanov transform (see Appendix C). However, (9.1) is a significantly weaker condition: instead of controlling the usual increments $\varphi _t-\varphi _s$ , one only needs to control the stochastic increments $\varphi _t-\mathbb {E}_s\varphi _t$ , which can be much smaller.
• In [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli48], the additive perturbation problem is studied in detail; the authors try to establish, in a deterministic framework, whether a path $h+\varphi $ can be shown to be $\rho $ -irregular, given the knowledge that h is so and $\varphi $ enjoys higher Hölder regularity. Such results usually come with a loss of regularity in the exponent $\rho $ of at least $1/2$ (cf. [Reference Catellier and Gubinelli20, Theorem 1.6] and [Reference Galeati and Gubinelli48, Lemma 78]); the more probabilistic arguments and stochastic sewing techniques behind Theorem 9.3 instead allow one to cover the whole range $\rho <1/(2H)$ without difficulties.

Proof. In order to conclude, it suffices to prove the following claim: for any $\rho <1/(2H)$ , we can find $\gamma>1/2$ such that for any $m \in [1, \infty )$ , it holds

(9.3)

$$ \begin{align} \Big\| \int_s^t e^{i \xi \cdot X_r} \mathrm{d} r \Big\|_{L^m} \lesssim_m |t-s|^{\gamma} |\xi|^{-\rho} \quad \forall \, \xi \in \mathbb{R}^d, 0 \leqslant s \leqslant t \leqslant 1. \end{align} $$

It is clear that in (9.3), we can restrict to $| \xi | \geqslant 1$ (or $| \xi | \geqslant R$ ) whenever needed, since for small $\xi $ , the estimate is trivial. Once (9.3) is obtained, we can deduce that for any $\tilde {\rho }<\rho -d/m$ , it holds

(9.4)

$$ \begin{align} \mathbb{E} \left[ \int_{\mathbb{R}^d} | \xi |^{\tilde \rho m} \bigg| \int_s^t e^{i\xi\cdot X_r} \mathrm{d} r \bigg|^m \mathrm{d} \xi \right] = \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, m}}^m\big ] \lesssim |t-s|^{\gamma m}; \end{align} $$

here, we follow the notation from [Reference Galeati and Gubinelli48], so that $\mu ^X_{s,t}$ denotes the occupation measure of X on $[s,t]$ and $\mathcal {F} L^{\rho , m}$ denote Fourier–Lebesgue spaces. Applying Lemma 57 from [Reference Galeati and Gubinelli48] to (9.4), together with Assumption 9.2, yields

$$ \begin{align*} \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, \infty}}^m\big] & \lesssim \mathbb{E} \big[\| X \|_{C_t}^d \| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde \rho,m}}^m\big]\\ & \lesssim \mathbb{E} \big[\| X \|_{C_t}^{2 d} \big]^{1 / 2}\, \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, m}}^{2 m}\big]^{1/2} \lesssim |t-s|^{\gamma m}. \end{align*} $$

By the arbitrariness of m and Kolmogorov’s continuity criterion, one then deduces that $\mu ^X\in C^{\tilde \gamma }_t \mathcal {F}L^{\tilde \rho ,\infty }_x$ for any $\tilde {\gamma }<\gamma $ and $\tilde {\rho }<\rho $ ; but this is equivalent to saying that X is $(\tilde {\gamma },\tilde {\rho })$ -irregular; cf. [Reference Galeati and Gubinelli48, Section 3.2]. The arbitrariness of $\rho <1/(2H)$ readily implies the conclusion as well as the moment estimate (9.2).
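The restriction $\tilde {\rho }<\rho -d/m$ used to pass from (9.3) to (9.4) can be traced to an integrability condition at large frequencies. The sketch below assumes that the Fourier–Lebesgue norm carries the weight $|\xi |^{\tilde \rho m}$ (our reading of the notation of [Reference Galeati and Gubinelli48]) and only treats the region $|\xi |\geq 1$ , the complementary region being handled by the trivial bound. Inserting (9.3) and passing to polar coordinates,

$$ \begin{align*} \mathbb{E} \left[ \int_{|\xi|\geq 1} | \xi |^{\tilde \rho m} \bigg| \int_s^t e^{i\xi\cdot X_r} \mathrm{d} r \bigg|^m \mathrm{d} \xi \right] \lesssim |t-s|^{\gamma m} \int_{|\xi|\geq 1} |\xi|^{(\tilde\rho-\rho) m} \mathrm{d} \xi \lesssim |t-s|^{\gamma m} \int_1^\infty r^{(\tilde\rho-\rho)m+d-1} \mathrm{d} r, \end{align*} $$

and the last integral is finite if and only if $(\rho -\tilde \rho )m>d$ , that is, $\tilde \rho <\rho -d/m$ .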

In order to prove the claim (9.3), we will apply Lemma 2.5, with $(S,T)=(0,1)$ , and $n=m$ . Fix $\xi \in \mathbb {R}^d$ ; arguing as in Lemma 2.6, it is easy to check that $\int _0^\cdot e^{i\xi \cdot X_r} \mathrm {d} r$ is the stochastic sewing of

$$\begin{align*}A_{s, t} := \mathbb{E}_{s-(t-s)} \int_s^t e^{i \xi \cdot (\mathbb{E}_{s-(t-s)} \varphi_r + B^H_r)} \mathrm{d} r. \end{align*}$$

Note that for any $r\in (s,t)$ , one has

$$ \begin{align*} \big|\mathbb{E}_{s-(t-s)}e^{i\xi\cdot B^H_r}\big|=\big|\mathbb{E}_{s-(t-s)}e^{i\xi\cdot (B^H_r-\mathbb{E}_{s-(t-s)}B^H_r)}\big|=e^{-c|\xi|^2|r-s+(t-s)|^{2H}}, \end{align*} $$

and therefore, we have

(9.5)

$$ \begin{align} | A_{s,t} | \lesssim e^{-c |\xi|^2 |t-s|^{2 H}} |t-s| \lesssim |\xi|^{-\rho} |t-s|^{1-\rho H}, \end{align} $$

where we used the basic inequality $e^{-c|y|^2}\lesssim |y|^{-\rho }$ . By the assumption on $\rho $ , $\varepsilon _1:=1/2-\rho H>0$ , and therefore, the condition (2.7) is satisfied with $w_1(s,t)=N|\xi |^{-2\rho }(t-s)$ .

As for the second condition of Lemma 2.5, we have for $(s,u,t)\in \overline {[0,1]}_\leq ^3$ that

$$ \begin{align*} \| \mathbb{E}_{s_-} \delta A_{s, u, t} \|_{L^m} & \leq \int_u^t \big\| \mathbb{E}_{u-(t-u)}e^{i \xi \cdot B^H_r} (e^{i \xi \cdot \mathbb{E}_{s-(t-s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{u-(t-u)} \varphi_r}) \big\|_{L^m}\mathrm{d} r\\ &\quad+\int_s^u \big\| \mathbb{E}_{s-(t-s)}e^{i \xi \cdot B^H_r} (e^{i \xi \cdot \mathbb{E}_{s-(t-s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{s-(u-s)} \varphi_r}) \big\|_{L^m}\mathrm{d} r=:I+J. \end{align*} $$

As usual, I and J are treated identically, so we only consider the former. We write

$$ \begin{align*} I & = \int_u^t e^{-c | \xi |^2 | r - u + t - u |^{2 H}} \big\| e^{i \xi \cdot \mathbb{E}_{s - (t - s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{u-(t-u)} \varphi_r} \big\|_{L^m}\mathrm{d} r\\ & \leq e^{- \tilde{c} | \xi |^2 | t - s |^{2 H}} | \xi | \int_u^t \|\mathbb{E}_{s - (t - s)} \varphi_r -\mathbb{E}_{u - (t - u)} \varphi_r \|_{L^m}\mathrm{d} r\\ & \lesssim e^{- \tilde{c} |\xi|^2 |t-s|^{2H}} |\xi|\, w(s_-,t)^{1/2} |t-s|^{1+H}, \end{align*} $$

where in the second line we used $(s,u,t)\in \overline {[0,1]}_{\leq }^3$ and in the last one we used Assumption 9.2. Applying again the basic inequality $e^{- \tilde {c}|y|^{2}} \lesssim |y|^{-1-\rho }$ , we obtain

$$ \begin{align*} \| \mathbb{E}_{s_-} \delta A_{s,u,t} \|_{L^m} \lesssim |\xi|^{-\rho}w(s_-,t)^{1/2}|t-s|^{1-H\rho}. \end{align*} $$

Therefore, condition (2.8) is satisfied with $\varepsilon _2=\varepsilon _1=1/2-\rho H$ and $w_2(s,t)=N|\xi |^{-\rho }w^{1/2}(s,t)(t-s)^{1/2}$ , and by (2.12), we finally get

$$ \begin{align*} \Big\| \int_s^t e^{i \xi \cdot X_r} \mathrm{d} r \Big\|_{L^m} \lesssim|\xi|^{-\rho}|t-s|^{1/2+\varepsilon_1}\big(1+w(s,t)\big), \end{align*} $$

yielding (9.3).
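The power counting in the proof can be summarised as follows; this is only a recap of the two applications of the basic inequality, obtained with the substitution $y=\xi |t-s|^H$ :

$$ \begin{align*} e^{-c |\xi|^2 |t-s|^{2 H}} |t-s| & \lesssim \big(|\xi|\,|t-s|^{H}\big)^{-\rho} |t-s| = |\xi|^{-\rho} |t-s|^{1-\rho H},\\ e^{-\tilde c |\xi|^2 |t-s|^{2 H}} |\xi|\, |t-s|^{1+H} & \lesssim \big(|\xi|\,|t-s|^{H}\big)^{-1-\rho} |\xi|\, |t-s|^{1+H} = |\xi|^{-\rho} |t-s|^{1-\rho H}. \end{align*} $$

In both cases, the resulting time exponent is $1-\rho H=1/2+\varepsilon _1$ , which exceeds $1/2$ exactly when $\rho <1/(2H)$ ; this is where the restriction on $\rho $ enters.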

10 Applications to transport and continuity equations

Having established well-posedness of the characteristic lines $\mathrm {d} X_t= b_t(X_t)\mathrm {d} t + \mathrm {d} B^H_t$ , the next natural step is to investigate the associated stochastic transport equation

(10.1)

$$ \begin{align} \partial_t u + b\cdot\nabla u + \dot B^H\cdot \nabla u =0. \end{align} $$

Natural questions in PDE theory and regularisation by noise for (10.1) are its well-posedness (cf. the seminal work [Reference Flandoli, Gubinelli and Priola40]) and propagation of the regularity of initial data, first addressed in [Reference Fedrizzi and Flandoli38]. Both features need not be true in the absence of noise; among the vast literature, let us mention the following: the work [Reference Modena and Székelyhidi77], where counterexamples to uniqueness are provided even for Sobolev differentiable drifts; [Reference Brué, Colombo and De Lellis10], where it is shown that uniqueness of the generalized Lagrangian flow (in the sense of DiPerna-Lions [Reference DiPerna and Lions36]) does not imply uniqueness of trajectorial solutions to the ODE; finally [Reference Brué and Nguyen11], providing sharp examples showing that DiPerna-Lions flows can at most propagate a ‘logarithmic derivative’ of regularity of the initial data $u_0$ , but not better. As we will see in Theorem 10.4, the presence of $B^H$ prevents all such pathologies, yielding nontrivial regularisation by noise results even in situations where uniqueness of solutions is already known to hold.

Rather than working directly with equation (10.1), following [Reference Flandoli, Gubinelli and Priola40], it is useful to introduce the transformation $\tilde u_t(x)=u_t(x+B^H_t)$ , $\tilde {b}_t(x)=b_t(x+B^H_t)$ , which relates it to

(10.2)

$$ \begin{align} \partial_t \tilde u + \tilde{b}\cdot \nabla \tilde u=0. \end{align} $$

This transformation formally assumes $B^H$ to be differentiable, but the resulting equation (10.2) is then well-defined (at least for bounded b) for any continuous path $B^H$ . More rigorously, we are implicitly assuming that the chain rule applies, which amounts to working with $B^H$ as a geometric rough path, see [Reference Catellier21] for the rigorous equivalence between (10.1)–(10.2) in this case. In the Brownian case, this means that the multiplicative noise must be interpreted in the Stratonovich sense, as in [Reference Flandoli, Gubinelli and Priola40]. However, the resulting PDE (10.2) is well-defined also for values $H\leq 1/4$ , where the rough path formalism no longer applies, and indeed, it can be regarded as a PDE with random drift $\tilde {b}$ , rather than a stochastic PDE.
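For illustration, the formal computation behind the transformation reads as follows, assuming that u is smooth, that $B^H$ is differentiable and that the classical chain rule applies. Using (10.1) in the second step,

$$ \begin{align*} \partial_t \tilde u_t(x) & = (\partial_t u_t)(x+B^H_t) + \dot B^H_t\cdot (\nabla u_t)(x+B^H_t)\\ & = - b_t(x+B^H_t)\cdot (\nabla u_t)(x+B^H_t) = -\tilde b_t(x)\cdot \nabla \tilde u_t(x), \end{align*} $$

which is (10.2).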

A nice feature of the regular regime $H>1$ , included in our setting, is that here, $B^H$ is $\mathbb {P}$ -a.s. differentiable and so (10.1) is perfectly well-defined and the above transformation is completely rigorous (as soon as $(u_t)_{t\in [0,1]}$ is bounded in some function space) and does not involve any ‘choice’ of the rough lift. The above considerations motivate the following definition; from now on, we will use both notations $\tilde {u}_t(x)$ and $\tilde {u}_t(x;\omega )$ to denote $u_t(\omega , x+B^H_t(\omega ))$ , in order to stress the fixed realization $\omega \in \Omega $ whenever needed, and similarly for $\tilde {b}_t(x)$ and $\tilde {b}_t(x;\omega )$ .

Definition 10.1. For a fixed $\omega \in \Omega $ , we say that v is a weak solution to the PDE (10.2) associated to $\tilde {b}_t(x;\omega )$ if $v\in L^1_t W^{1,1,\mathrm {loc}}_{x}$ , $\tilde {b}\cdot \nabla v\in L^1_t L^{1,\mathrm {loc}}_x$ and for any smooth, compactly supported function $\varphi :[0,1]\times \mathbb {R}^d\to \mathbb {R}$ and any $t\in [0,1]$ , it holds

(10.3)

$$ \begin{align} \langle\varphi_t,v_t\rangle-\langle \varphi_0,v_0 \rangle =\int_0^t [\langle \partial_t\varphi_s ,v_s\rangle - \langle \varphi_s, \tilde{b}_s(\cdot\,;\omega)\cdot\nabla v_s \rangle] \mathrm{d} s. \end{align} $$

We say that a stochastic process u is a pathwise solution to the stochastic transport equation (10.1) if for $\mathbb {P}$ -a.e. $\omega \in \Omega $ , the corresponding $\tilde {u}_t(x;\omega )$ is a weak solution to (10.2) associated to $\tilde {b}_t(x;\omega )$ , in the above sense. Finally, a pathwise solution is said to be strong if it is adapted to the filtration generated by $B^H$ .

Similarly to equations (10.1)–(10.2), we can relate the stochastic continuity equation

(10.4)

$$ \begin{align} \partial_t \mu + \nabla\cdot (b\, \mu) + \dot B^H\cdot \nabla \mu =0 \end{align} $$

to its random PDE counterpart

(10.5)

$$ \begin{align} \partial_t \tilde\mu + \nabla \cdot (\tilde b\, \tilde\mu)=0 \end{align} $$

by means of the transformation $\tilde {\mu }_t(x;\omega )=\mu _t(\omega ,x+B^H_t(\omega ))$ . In the next definition, $\mathcal {M}_+=\mathcal {M}_+(\mathbb {R}^d)$ denotes the set of nonnegative finite Radon measures. For $\mu \in \mathcal {M}_+$ , we write $\mu \in L^p_x$ to mean that $\mu $ admits an $L^p$ -integrable density w.r.t. the Lebesgue measure, in which case, with a slight abuse, we will identify $\mu (\mathrm {d} x)=\mu (x) \mathrm {d} x$ .

Definition 10.2. For a fixed $\omega \in \Omega $ , we say that $\rho $ is a weak solution to the PDE (10.5) associated to $\tilde {b}_t(x;\omega )$ if $\rho _t\in \mathcal {M}_+$ for Lebesgue-a.e. t,

$$\begin{align*}\int_0^1\int_{\mathbb{R}^d} |\tilde{b}_t(x;\omega)| \rho_t(\mathrm{d} x)\, \mathrm{d} t<\infty,\end{align*}$$

and for any smooth, compactly supported $\varphi :[0,1]\times \mathbb {R}^d\to \mathbb {R}$ and any $t\in [0,1]$ , it holds

$$\begin{align*}\langle\varphi_t,\rho_t\rangle-\langle \varphi_0,\rho_0 \rangle =\int_0^t \langle \partial_t\varphi_s + \tilde{b}_s(\cdot\,;\omega)\cdot\nabla \varphi_s ,\rho_s\rangle \mathrm{d} s. \end{align*}$$

We say that a stochastic process $\mu $ is a pathwise solution to the stochastic continuity equation (10.4) if for $\mathbb {P}$ -a.e. $\omega \in \Omega $ , the corresponding $\tilde {\mu }_t(x;\omega )$ is a weak solution to (10.5) associated to $\tilde {b}_t(x;\omega )$ , in the above sense. Finally, a pathwise solution is said to be strong if it is adapted to the filtration generated by $B^H$ .
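To see where the weak formulation in Definition 10.2 comes from, one can argue formally, assuming that $\rho $ and $\tilde b$ are smooth and that $\varphi $ is compactly supported, so that integration by parts produces no boundary terms:

$$ \begin{align*} \frac{\mathrm{d}}{\mathrm{d} t} \langle \varphi_t,\rho_t\rangle = \langle \partial_t\varphi_t,\rho_t\rangle + \langle \varphi_t,\partial_t\rho_t\rangle = \langle \partial_t\varphi_t,\rho_t\rangle - \langle \varphi_t,\nabla\cdot(\tilde b_t\,\rho_t)\rangle = \langle \partial_t\varphi_t + \tilde b_t\cdot\nabla\varphi_t,\rho_t\rangle. \end{align*} $$

Integrating in time over $[0,t]$ gives the identity required in the definition.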

As is clear from Definitions 10.1–10.2, in order to treat equations (10.2)–(10.5) in an analytically weak sense, we need $\tilde {b}$ to enjoy some local integrability and thus to be a well-defined measurable function (up to equivalence class). Therefore, in the case of coefficients $b\in L^q_t C^\alpha _x$ with $\alpha <0$ , throughout this section, we will additionally impose that

(10.6)

$$ \begin{align} b\in L^r_t L^r_x + L^r_t L^\infty_x \quad \text{for some } r>1; \end{align} $$

we denote by $r'$ the conjugate exponent (i.e., $1/r'+1/r=1$ ). In the case $\alpha>0$ , we will use the convention $r'=1$ ; in this case, under (A), condition (10.6) is immediately satisfied for $r=q$ . Let us mention that, in the distributional case $\alpha <0$ , other approaches for giving meaning to (10.2)–(10.5) are possible (see Remark 10.9 below), so it is not obvious whether an assumption of the form (10.6) is needed; still, we will adopt it as it allows us to apply nice analytical tools, while already covering a sufficiently rich class of drifts.

Remark 10.3. Let us collect a few useful observations:

i) By standard arguments, whenever a weak solution v to (10.2) exists (in the sense of Definition 10.1), then (up to redefining it on a Lebesgue negligible set of $t\in [0,1]$ ) $t\mapsto v_t$ is continuous w.r.t. suitable weak topologies; in particular, it always makes sense to talk about initial/terminal conditions for such equations. The same considerations apply for pathwise solutions, as well as solutions to the continuity equations (10.4)–(10.5); from now on, we will always work with these weakly continuous in time versions, without specifying it.
ii) If $\rho $ is a weak solution to (10.5), then its mass $\rho _t(\mathbb {R}^d)$ is preserved by the dynamics. In particular, if $\rho \in L^q_t L^p_x$ , then it actually belongs to $L^q_t L^{\tilde p}_x$ for all $\tilde {p}\in [1,p]$ .
iii) In Definition 10.1, we enforce identity (10.3) to hold for all $\varphi $ smooth and compactly supported, but by standard density arguments, it is clear that as soon as more information on v (resp. u) and b is available, then (10.3) can be extended to a larger class of $\varphi $ , as long as all the terms appearing are well-defined. For instance, if $v\in L^\infty _t W^{1,p}_x$ and $b\in L^\infty _t L^\infty _x$ , then it suffices to know that $\varphi , \partial _t \varphi \in L^1_t L^{p'}_x$ , $p'$ being the conjugate of p.
iv) Definitions 10.1–10.2 and the above observations extend easily to the case of backward equations on $[0,T]$ with terminal conditions $u_T$ , $\mu _T$ , rather than forward ones with initial $u_0$ , $\mu _0$ .
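The second claim in observation ii) is a standard interpolation argument, which we sketch here; the parameter $\theta $ is introduced only for this illustration. Given $\tilde p\in [1,p]$ , choose $\theta \in [0,1]$ with $1/\tilde p=\theta +(1-\theta )/p$ . By Hölder's inequality and conservation of mass,

$$ \begin{align*} \| \rho_t\|_{L^{\tilde p}_x} \leq \| \rho_t\|_{L^1_x}^{\theta}\, \| \rho_t\|_{L^p_x}^{1-\theta} = \rho_0(\mathbb{R}^d)^{\theta}\, \| \rho_t\|_{L^p_x}^{1-\theta} \leq \rho_0(\mathbb{R}^d)^{\theta}\, \big(1+\| \rho_t\|_{L^p_x}\big), \end{align*} $$

and the right-hand side belongs to $L^q_t$ since the time interval is bounded.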

The next statement summarizes the main result of this section.

Theorem 10.4. Let b satisfy Assumption (A) and additionally (10.6) if $\alpha <0$ . Then,

i) For any $p\in [r',\infty )$ and $u_0\in W^{1,p}_x$ , there exists a strong pathwise solution u to (10.1), which belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ for all $m\in [1,\infty )$ .

If, moreover, $p>r'$ , then path-by-path uniqueness holds in the class $L^\infty _t W^{1,p}_x$ , in the following sense: there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $ and all $v_0\in W^{1,p}_x$ , there can exist at most one weak solution $v \in L^\infty _t W^{1,p}_x$ to the PDE (10.2) associated to $\tilde {b}_t(x;\omega )$ and with initial condition $v_0$ .
ii) For any $p\in [r',\infty )$ and any positive measure $\mu _0\in L^p_x$ , there exists a strong pathwise solution $\mu $ to (10.4), which belongs to $L^m_\omega L^\infty _t L^p_x$ for all $m\in [1,\infty )$ .

Moreover, path-by-path uniqueness holds in the class $L^\infty _t L^p_x$ , in the following sense: there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $ and all $\mu _0\in L^p_x$ , there can exist at most one weak solution $\rho \in L^\infty _t L^p_x$ to the PDE (10.5) associated to $\tilde {b}_t(x;\omega )$ and with initial condition $\mu _0$ .

Theorem 10.4 will be proved by mostly analytical techniques, once they are combined with the information coming from the previous sections. We will first establish existence of pathwise solutions to equations (10.1)–(10.4) satisfying the desired a priori bounds; see Proposition 10.5.

Uniqueness will be established by two different methods. In the transport case, we will first establish a priori bounds for solutions to the dual equation (backward continuity equation) in Proposition 10.6 and then perform a duality argument (Lemma 10.7); see [Reference DiPerna and Lions36] and [Reference Beck, Flandoli, Gubinelli and Maurelli8] for significant precursors in this direction.

For the continuity equation, we will instead infer uniqueness from Ambrosio’s superposition principle (cf. Theorem 10.8) combined with our path-by-path uniqueness results (Theorems 4.4–5.6). To the best of our knowledge, it is the first time these two results are combined in this way to infer path-by-path uniqueness for (10.4); let us mention, however, that in [Reference Beck, Flandoli, Gubinelli and Maurelli8, Section 4], the opposite idea is developed, proving path-by-path uniqueness for the SDE starting from the corresponding results for (10.4).

Before giving the proofs, let us recall a few notations and basic facts. We will use $\Psi $ to denote the random flow of diffeomorphisms associated to the (random) ODE $\dot \varphi = \tilde {b}_t(\varphi )$ , where we recall the fundamental relation $X_t=\varphi _t+B^H_t$ as well as (5.5). Similarly to Section 6, we will use the notations $J^x_{s\to t} := \nabla \Psi _{s\to t}(x)$ , $K^x_{s\to t} := (J^x_{s\to t})^{-1} = \nabla \Psi _{s \leftarrow t}(\Psi _{s\to t}(x))$ ; we also set $j_{s\to t}(x):=\det J^x_{s\to t}$ , and similarly for $j_{s\leftarrow t}(x)$ . Recall that, in the case of regular b, we have the relations

(10.7)

$$ \begin{align} j_{s\to t}(x) = \exp\Big(\int_s^t \mathrm{div} b_r (\Phi_{s\to r}(x)) \mathrm{d} r\Big), \ \ j_{s\leftarrow t}(x) = \exp\Big(-\int_s^t \mathrm{div} b_r (\Phi_{r\leftarrow t}(x)) \mathrm{d} r\Big). \end{align} $$
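The first identity in (10.7) can be sanity-checked numerically in a toy setting. The sketch below is only an illustration under strong simplifying assumptions that are not part of the article: dimension one, no noise (so that $\Phi =\Psi $ ), and a hypothetical smooth drift $b(x)=\sin x$ ; the function names are ours. It compares the exponential formula for $j_{0\to 1}(x)$ with a finite-difference approximation of $\nabla \Psi _{0\to 1}(x)$ .

```python
import math

def b(x):
    # Hypothetical smooth, time-independent drift (an assumption for illustration).
    return math.sin(x)

def div_b(x):
    # In dimension one, div b is just the derivative of b.
    return math.cos(x)

def flow_and_log_jacobian(x, s, t, steps=2000):
    """Integrate dphi/dr = b(phi) with RK4 and accumulate int_s^t div b(phi_r) dr."""
    h = (t - s) / steps
    phi, log_j = x, 0.0
    for _ in range(steps):
        k1 = b(phi)
        k2 = b(phi + 0.5 * h * k1)
        k3 = b(phi + 0.5 * h * k2)
        k4 = b(phi + h * k3)
        phi_next = phi + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        # Trapezoidal rule for the time integral of div b along the trajectory.
        log_j += 0.5 * h * (div_b(phi) + div_b(phi_next))
        phi = phi_next
    return phi, log_j

def jacobian_by_finite_difference(x, s, t, eps=1e-5):
    """Central difference approximation of the spatial derivative of the flow."""
    right, _ = flow_and_log_jacobian(x + eps, s, t)
    left, _ = flow_and_log_jacobian(x - eps, s, t)
    return (right - left) / (2 * eps)

x0 = 0.3
_, log_j = flow_and_log_jacobian(x0, 0.0, 1.0)
j_formula = math.exp(log_j)                       # right-hand side of (10.7)
j_numeric = jacobian_by_finite_difference(x0, 0.0, 1.0)  # left-hand side of (10.7)
```

In this toy case, the two quantities `j_formula` and `j_numeric` should agree up to discretisation error.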

Proposition 10.5. Let b satisfy Assumption (A), and additionally (10.6) if $\alpha <0$ . Then the following hold.

i) For any $p\in [r',\infty )$ and $u_0\in W^{1,p}_x$ , there exists a strong pathwise solution u to (10.1), which belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ for all $m\in [1,\infty )$ .
ii) For any $p\in [r',\infty )$ and any positive measure $\mu _0$ such that $\mu _0\in L^p_x$ , there exists a strong pathwise solution $\mu $ to (10.4), which belongs to $L^m_\omega L^\infty _t L^p_x$ for all $m\in [1,\infty )$ .

Proof. Let us first assume b to be smooth and derive estimates which only depend on $\| b\|_{L^q_t C^\alpha _x}$ . In this case, the unique solution to (10.2) is given by $\tilde u_t(x)= u_0(\Psi _{0\leftarrow t}(x))$ . Let us give the bound on $\|\nabla \tilde u\|_{L^p}$ , the one for $\| \tilde u\|_{L^p}$ being similar; also observe that these quantities coincide with the corresponding ones for u. It holds

$$ \begin{align*} \sup_{t\in [0,1]} \| \nabla \tilde u_t\|_{L^p}^p & = \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla \tilde u_t(x)|^p \mathrm{d} x\\ & \leq \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla u_0(\Psi_{0\leftarrow t}(x))|^p |\nabla \Psi_{0\leftarrow t}(x)|^p \mathrm{d} x\\& = \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla u_0(y)|^p |\nabla \Psi_{0\leftarrow t}(\Psi_{0\to t}(y))|^p j_{0\to t}(y) \mathrm{d} y\\ & \leq \int_{\mathbb{R}^d} |\nabla u_0(y)|^p \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \, \sup_{t\in [0,1]} j_{0\to t}(y)\, \mathrm{d} y. \end{align*} $$

Taking the $L^m_\omega $ -norm on both sides, we arrive at

$$ \begin{align*} \Big \| \sup_{t\in [0,1]} \| \nabla \tilde u_t\|_{L^p}^p \Big\|_{L^m} & \leq \int_{\mathbb{R}^d} |\nabla u_0(y)|^p \Big\| \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \, \sup_{t\in [0,1]} j_{0\to t}(y) \Big\|_{L^m} \, \mathrm{d} y\\ & \leq \| \nabla u_0\|_{L^p}^p\, \sup_{y\in \mathbb{R}^d} \Big\| \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \Big\|_{L^{2m}}^{1/2} \, \Big\| \sup_{t\in [0,1]} j_{0\to t}(y) \Big\|_{L^{2m}}^{1/2}. \end{align*} $$

The finiteness of arbitrary moments of $\sup _{t\in [0,1]} j_{0\to t}(y)$ comes from identity (10.7), combined with Lemma 3.1 applied to $h=\mathrm {div} b$ and $\varphi _r=\Phi _{0\to r}(y)-B^H_r$ . This estimate is clearly uniform in $y\in \mathbb {R}^d$ . The similar bounds for K follow as in Section 6, using the fact that K solves the linear Young equation (6.11). Up to relabelling $m=m' p$ , we have thus shown that

(10.8)

$$ \begin{align} \| \nabla u\|_{L^m_\omega L^\infty_t L^p_x} \lesssim \| \nabla u_0\|_{L^p_x}. \end{align} $$

We now pass to the case of $\mu $ ; for regular b, solutions are given by the identity

$$ \begin{align*} \tilde \mu_t(x) = \mu_0(\Psi_{0\leftarrow t}(x)) \exp\Big(-\int_0^t \mathrm{div} b_r (\Phi_{r \leftarrow t}(x)) \mathrm{d} r \Big). \end{align*} $$

Arguing similarly to above, it holds

$$ \begin{align*} \Big\| \sup_{t\in [0,1]} \| \tilde \mu_t\|_{L^p_x}^p \Big\|_{L^m} & = \Big\| \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\mu_0(\Psi_{0\leftarrow t}(x))|^p \exp\Big(-p\int_0^t \mathrm{div} b_r (\Phi_{r \leftarrow t}(x)) \mathrm{d} r \Big) \mathrm{d} x \Big\|_{L^m} \\ & = \Big\| \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\mu_0(y)|^p \exp\Big((1-p) \int_0^t \mathrm{div} b_r (\Phi_{0\to r}(y)) \mathrm{d} r \Big) \mathrm{d} y \Big\|_{L^m}\\ & \leq \sup_{y\in \mathbb{R}^d} \Big\| \sup_{t\in [0,1]} \exp\Big((1-p) \int_0^t \mathrm{div} b_r (\Phi_{0\to r}(y))\mathrm{d} r \Big) \Big\|_{L^m} \int_{\mathbb{R}^d} |\mu_0(y)|^p \mathrm{d} y, \end{align*} $$

and so invoking again Lemma 3.1 and relabelling m, we arrive at

(10.9)

$$ \begin{align} \| \tilde \mu\|_{L^m_\omega L^\infty_t L^p_x} \lesssim \| \mu_0\|_{L^p_x}. \end{align} $$

Having established the uniform estimates (10.8)–(10.9), both existence claims for general b now follow from a standard compactness argument (see, for instance, [Reference Pardoux84] or [Reference Flandoli, Gubinelli and Priola40, Theorem 15]), so we will only sketch it quickly.

Consider smooth approximations $b^n\to b$ , $u_0^n\to u_0$ and denote by $u^n$ the associated solutions; by reflexivity of $L^p_t L^p_\omega W^{1,p}_x$ , we can extract a (not relabelled) subsequence such that $u^n\rightharpoonup u$ weakly in $L^p_t L^p_\omega L^p_x$ . By properties of weak convergence, the limit u still belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ and is progressively measurable, since the sequence $u^n$ was so; also observe that, as in Remark 10.3-i), we can assume u to be weakly continuous in time, so that it is in fact adapted. By the linear structure of the PDE, one can then finally verify that u is indeed a pathwise solution. Let us stress that here is where for $\alpha <0$ , the assumption (10.6) is crucial since otherwise, it is unclear whether $b^n\cdot \nabla u^n$ converges to $b\cdot \nabla u$ in a weak sense (both w.r.t. $L^m_\omega $ and by testing against $\varphi \in C^\infty _c$ ); indeed, since $p\geq r'$ , all objects are well-defined in $L^m_\omega L^1_t L^{1,\mathrm {loc}}_x$ , and the claim follows from $b^n\to b$ and $u^n\rightharpoonup u$ . The case of $\mu $ can be treated similarly; the only difference is that since $b\in L^r_t L^r_x + L^r_t L^\infty _x$ and $\mu \in L^m_\omega L^\infty _t (L^{r'}_x\cap L^1_x)$ by Remark 10.3, the additional $\mathbb {P}$ -a.s. integrability constraint $\langle |\tilde b(\omega )|,\tilde \mu (\omega )\rangle <\infty $ coming from Definition 10.2 is also satisfied.

We now turn to establishing existence of sufficiently regular solutions to the continuity equation with well-chosen terminal data; handling the backward nature of the equation yields slightly worsened estimates compared to those of Proposition 10.5.

Proposition 10.6. Let $T\in [0,1]$ and $\mu _T\in L^p$ compactly supported. Then there exists a pathwise solution $\mu $ to (10.4) on $[0,T]$ with terminal condition $\mu \vert _{t=T}=\mu _T$ ; moreover, for any $m\in [1,\infty )$ and any $\tilde p<p$ , it holds $\mu \in L^\infty _t L^m_\omega L^{\tilde p}_x$ .

Proof. We can assume $\mathrm {supp} \mu _T \subset B_R$ for some $R\geq 1$ . We will assume b to be regular and show how to derive suitable a priori estimates; the general case then follows by arguing similarly to Proposition 10.5. The solution is given explicitly by

$$ \begin{align*} \mu_t(x) = \mu_T(\Psi_{t\to T}(x)) \exp\Big( \int_t^T \mathrm{div} b_r (\Psi_{t\to r} (x)) \mathrm{d} r\Big). \end{align*} $$

For any fixed $t\in [0,T]$ , it holds

$$ \begin{align*} \int_{\mathbb{R}^d} |\mu_t(x)|^{\tilde p} \mathrm{d} x & = \int_{\mathbb{R}^d} |\mu_T(\Psi_{t\to T}(x))|^{\tilde p} \exp\Big( \tilde p\int_t^T \mathrm{div} b_r (\Psi_{t\to r} (x)) \mathrm{d} r\Big) \mathrm{d} x\\ & = \int_{\mathbb{R}^d} |\mu_T(y)|^{\tilde p} \exp\Big( (\tilde p-1) \int_t^T \mathrm{div} b_r (\Psi_{r\leftarrow T} (y)) \mathrm{d} r\Big) \mathrm{d} y\\ & \leq \| \mu_T \|_{L^p_x}^{\tilde p} \bigg( \int_{B_R} \exp\Big( \frac{ p(\tilde p-1)}{p-\tilde p} \int_t^T \mathrm{div} b_r (\Psi_{r\leftarrow T} (y)) \mathrm{d} r\Big) \mathrm{d} y \bigg)^{1-\frac{\tilde p}{p}}, \end{align*} $$

where in the last passage, we used first $\mathrm {supp} \mu _T\subset B_R$ and then Hölder’s inequality. Applying again the change of variable $x=\Psi _{t\leftarrow T}(y)$ and the formula for $j_{t\to T}(x)$ , overall we find a constant $\kappa =\kappa (p,\tilde p)$ such that

$$ \begin{align*} \big\| \| \mu_t\|_{L^{\tilde p}_x} \big\|_{L^m} \leq \| \mu_T\|_{L^p_x}^{\tilde p}\, \bigg\| \int_{\Psi_{t\to T}(B_R)} \exp\Big( \kappa \int_t^T \mathrm{div} b_r (\Psi_{t\to r} (y)) \mathrm{d} r\Big) \mathrm{d} y \bigg\|_{L^m}^{1-\frac{\tilde p}{p}}. \end{align*} $$

It remains to estimate the last quantity appearing on the r.h.s. above. To this end, let us set $N_y := j_{t\to T}(y)^\kappa $ ; as usual by Lemma 3.1, it holds $\| N_y\|_{L^m}\lesssim 1$ , with an estimate uniform in y, t and T and only depending on $\| b\|_{L^q_t C^\alpha _x}$ .

Thanks to estimates (6.1) and Lemma A.4, one can show that for any $\tilde m\in [1,\infty )$ and $\lambda>1$ , uniformly in $t\in [0,T]$ it holds

$$ \begin{align*} \big\| \| \Psi_{t\to T}\|_{C^{0,\lambda}} \big\|_{L^{\tilde m}} <\infty\quad \text{ where } \quad \| \Psi_{t\to T}\|_{C^{0,\lambda}}:= \sup_{|x|\geq 1} |x|^{-\lambda} |\Psi_{t\to T}(x)|; \end{align*} $$

this is because one can first show finiteness of the associated $C^{\eta ,\lambda '}_x$ -norm by Lemma A.4, and then deduce from it that $\Psi _{t\to T}$ also belongs to $C^{0,\lambda }_x$ for $\lambda =\lambda '+\eta $ (such an embedding readily follows from the definitions of such spaces).

Therefore, it holds

$$ \begin{align*} \Big\| \int_{\Psi_{t\to T}(B_R)} N_y \mathrm{d} y \Big\|_{L^m} & \leq \sum_{n\in \mathbb{N}} \Big\| \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}} \in [n,n+1)} \int_{\Psi_{t\to T}(B_R)} N_y\, \mathrm{d} y \Big\|_{L^m}\\ & \leq \sum_{n\in \mathbb{N}} \Big\| \int_{B_{(n+1)R^\lambda}} \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n} N_y\, \mathrm{d} y \Big\|_{L^m}\\ & \leq \sum_{n\in \mathbb{N}} \int_{B_{(n+1)R^\lambda}} \| \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n}\|_{L^{2m}} \|N_y\|_{L^{2m}}\, \mathrm{d} y\\& \lesssim \sum_{n\in \mathbb{N}} (n+1)^d R^{\lambda d}\, \mathbb{P} (\| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n)^{\frac{1}{2m}}\\ & \lesssim R^{\lambda d} \sum_{n\in \mathbb{N}} n^{d-\frac{\tilde m}{2m}} \big\| \| \Psi_{t\to T}\|_{C^{0,\lambda}}\big\|_{L^{\tilde m}}^{\frac{\tilde m}{2m}}, \end{align*} $$

where in the last passage, we used Markov’s inequality. Choosing $\tilde {m}$ large enough, so as to make the series convergent, then yields the conclusion.
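To make the choice of $\tilde {m}$ explicit, here is a short worked condition; it is only a sketch based on the last bound displayed above (the term $n=0$ is harmless, since the corresponding probability is bounded by $1$ ). The series $\sum _{n\geq 1} n^{d-\tilde m/(2m)}$ converges if and only if its exponent is strictly smaller than $-1$ , namely,

$$ \begin{align*} d-\frac{\tilde m}{2m} < -1 \quad \Longleftrightarrow \quad \tilde m> 2m(d+1). \end{align*} $$

For instance, for $d=1$ and $m=2$ , any $\tilde m>8$ is admissible.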

The importance of integrability of solutions to the backward continuity equation comes from the following (deterministic) duality lemma.

Lemma 10.7. Let b satisfy (10.6) and let v, $\rho $ be analytic weak solutions to respectively the forward transport and backward continuity equations associated to $\tilde {b}_t(\cdot ;\omega )$ ; assume that $v\in L^\infty _t W^{1,p_1}_x$ and $\rho \in L^{r'}_t (L^1_x\cap L^{p_2}_x)$ for some $p_1,\,p_2$ satisfying

$$\begin{align*}p_1,\,p_2\in [1,\infty),\quad \frac{1}{p_1}+\frac{1}{p_2}+\frac{1}{r}=1. \end{align*}$$

Then it holds

$$ \begin{align*} \langle v_T, \rho_T\rangle = \langle v_0, \rho_0\rangle. \end{align*} $$

Proof. The argument is relatively standard in the analytic community and is based on the use of mollifiers and commutators; see the seminal work [Reference DiPerna and Lions36]. Let $v^\varepsilon =v\ast g^\varepsilon $ for some standard mollifiers $g^\varepsilon $ ; since $v^\varepsilon $ is spatially smooth, we can test it against $\rho $ (cf. Remark 10.3-iii)), which combined with the respective PDEs yields the relation

$$ \begin{align*} \langle v^\varepsilon_T , \rho_T\rangle - \langle v^\varepsilon_0 , \rho_0\rangle = \int_0^T \langle (\tilde b\cdot \nabla v)^\varepsilon - \tilde b\cdot\nabla v^\varepsilon, \rho\rangle \mathrm{d} s. \end{align*} $$

In order to conclude, it then suffices to show that the r.h.s. converges to $0$ as $\varepsilon \to 0$ . Recall that by assumption, $b= b^1+b^2$ with $b^1\in L^r_t L^r_x$ , $b^2\in L^r_t L^\infty _x$ , so that the same holds for $\tilde {b}$ ; we show how to deal with $\tilde b^1$ , the other case being similar. By our assumptions, Hölder’s inequality and properties of mollifiers, it is easy to check that both $(\tilde b^1\cdot \nabla v)^\varepsilon $ and $\tilde b^1\cdot \nabla v^\varepsilon $ converge to $\tilde b^1\cdot \nabla v$ in $L^r_t L^{\tilde r}_x$ , where $\tilde {r}\in (1,\infty )$ is defined by $1/\tilde {r}=1/r+1/p_1$ . But then

$$ \begin{align*} \bigg| \int_0^T \langle (\tilde b^1_t\cdot \nabla v_t)^\varepsilon - \tilde b^1_t\cdot\nabla v^\varepsilon_t, \rho_t\rangle \mathrm{d} t \bigg| & \leq \int_0^T \| (\tilde b^1_t\cdot \nabla v_t)^\varepsilon - \tilde b^1_t\cdot\nabla v^\varepsilon_t\|_{L^{\tilde r}_x}\, \| \rho_t\|_{L^{p_2}_x} \mathrm{d} t\\ & \leq \| (\tilde b^1\cdot \nabla v)^\varepsilon - \tilde b^1\cdot\nabla v^\varepsilon\|_{L^r_t L^{\tilde r}_x} \| \rho\|_{L^{r'}_t L^{p_2}_x}, \end{align*} $$

where the last term converges to $0$ .
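For the reader's convenience, let us spell out the exponent bookkeeping behind the two applications of Hölder’s inequality above; this is merely a restatement of the assumptions of Lemma 10.7 and of the definition of $\tilde r$ . In space, we pair $L^{\tilde r}_x$ with $L^{p_2}_x$ , and in time $L^r_t$ with $L^{r'}_t$ , which is legitimate since

$$ \begin{align*} \frac{1}{\tilde r}+\frac{1}{p_2} = \frac{1}{r}+\frac{1}{p_1}+\frac{1}{p_2} = 1, \qquad \frac{1}{r}+\frac{1}{r'}=1. \end{align*} $$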

As a final ingredient, we give the aforementioned Ambrosio’s superposition principle; we stress that the statement is deterministic, but we will apply it for fixed realizations of the random drift $\tilde {b}(\cdot ;\omega )$ . Although the full statement is a bit technical, we invite the reader to consult the (more heuristic) Theorem 3.1 from [Reference Ambrosio1] to understand the role it plays in our analysis.

Theorem 10.8 (Theorem 3.2 from [Reference Ambrosio1]).

Let $\mu $ be a weak solution to the continuity equation $\partial _t \mu + \nabla \cdot (\mu f)=0$ such that $\mu _t\in \mathcal {M}_+(\mathbb {R}^d)$ for all t and

$$\begin{align*}\int_0^1 \int_{\mathbb{R}^d} |f_t(x)|\, \mu_t(\mathrm{d} x)\, \mathrm{d} t<\infty. \end{align*}$$

Then $\mu $ is a superposition solution, namely, there exists a measure $\eta \in \mathcal {M}_+(\mathbb {R}^d \times C_t)$ , concentrated on the pairs $(x,\varphi )$ satisfying the relation

$$\begin{align*}\varphi_t = x + \int_0^t f_s(\varphi_s)\mathrm{d} s, \end{align*}$$

such that $\mu _t = (e_t)_\sharp \eta $ for all $t\in [0,1]$ , where $e_t(x,\varphi )=\varphi _t$ is the evaluation map and $(e_t)_\sharp \eta $ denotes the pushforward measure.

We are now ready to give the following:

Proof of Theorem 10.4.

Both existence statements come from Proposition 10.5, so we only need to check path-by-path uniqueness.

Let us start with the continuity equation. We claim that the event $\tilde {\Omega }$ of full probability on which path-by-path uniqueness for (10.4) holds is the one for which we have uniqueness of solutions to the ODE $\dot {\varphi }_t=\tilde {b}_t(\varphi _t;\omega )$ for all initial data $x\in \mathbb {R}^d$ ; its existence is granted by Theorems 4.4–5.6, which additionally imply that $\varphi _t=\Psi _{0\to t}(x;\omega )$ . Indeed, suppose we are given any weak solution $\rho \in L^\infty _t L^p_x$ to (10.5); by our assumptions, and possibly Remark 10.3-ii), it holds $\int _0^1 \int _{\mathbb {R}^d} |\tilde {b}_t(x;\omega )| \rho _t(\mathrm {d} x) \mathrm {d} t<\infty $ . We can then apply Theorem 10.8 to deduce that $\rho $ is a superposition solution; since uniqueness of solutions to $\dot {\varphi }_t=\tilde {b}_t(\varphi _t;\omega )$ holds, we readily deduce that $\rho _t = \Psi _{0\to t}(\cdot ;\omega )_\sharp \rho _0$ , which gives uniqueness.

We now pass to consider the transport case; by linearity, we only need to find an event $\tilde \Omega $ on which any weak solution $v\in L^\infty _t W^{1,p}_x$ to (10.2) with $v_0=0$ is necessarily the trivial one. By Remark 10.3-i), we know that any solution is weakly continuous in time; thus, it suffices to verify that $v_t=0$ for all t in a dense subset of $[0,1]$ . To this end, let us fix a countable collection $\{f^n\}_n$ of compactly supported smooth functions which are dense in $C^\infty _x$ and a countable dense set $\Gamma \subset [0,1]$ . By Proposition 10.6, for any $f^n$ and $\tau \in \Gamma $ , we can find a pathwise solution $\mu ^{\tau ,n}$ to the backward continuity equation on $[0,\tau ]$ which $\mathbb {P}$ -a.s. belongs to $L^q_t L^q_x$ for all $q\in [1,\infty )$ . Since everything is countable, we can then find an event $\tilde {\Omega }\subset \Omega $ on which $\mu ^{\tau ,n}(\omega )$ are all defined at once and have the above regularity; we claim that this is the desired event where uniqueness of weak solutions to (10.2) in $L^\infty _t W^{1,p}_x$ holds. Indeed, since q is arbitrarily large and $p>r'$ , we can apply Lemma 10.7 and use the fact that $v_0=0$ to deduce that

$$ \begin{align*} 0 = \langle v_0, \mu^{\tau,n}(\cdot\,;\omega)\rangle = \langle v_\tau, f^n\rangle \quad \forall\, \tau\in \Gamma,\, f^n; \end{align*} $$

by density of $f^n$ , it follows that $v_{\tau }=0$ for all $\tau \in \Gamma $ , which by density of $\Gamma $ and continuity finally implies $v\equiv 0$ .

Remark 10.9. In [Reference Galeati and Gubinelli49, Section 5.2], the authors show how to solve the transport equation (10.1) in a pathwise manner under the assumption that $T^{B^H}b \in C^\gamma _t C^2_x$ for some $\gamma>1/2$ ; in this case, one can treat purely distributional drifts b without enforcing (10.6). However, this assumption is satisfied under more restrictive conditions than (A) (e.g., if $b\in L^\infty _t C^\alpha _x$ for some $\alpha>2-1/(2H)$ ). We believe that existence and uniqueness for (10.1) (resp. (10.4)) should hold under (A) even when $\alpha <0$ , without the need for (10.6), but we leave this problem for future investigations.

A Kolmogorov continuity type criteria

Let us recall (a conditional version of) the classical Azuma–Hoeffding inequality.

Lemma A.1. Let $k\in \mathbb {N}$ and $\{Y_i\}_{i=0}^k$ be a sequence of $\mathbb {R}^d$ -valued martingale differences with respect to some filtration $\{\mathcal {F}_i\}_{i=0}^k$ , with $Y_0=0$ ; assume that there exist deterministic constants $\{\delta _i\}_{i=1}^k$ such that $\mathbb {P}$ -a.s. $|Y_i|\leq \delta _i$ for all i. Then for

$$ \begin{align*} S_j:=\sum_{i=1}^j Y_i,\qquad\Lambda:=\delta_1^2+\cdots+\delta_k^2, \end{align*} $$

one has the $\mathbb {P}$ -a.s. inequality

(A.1)

$$ \begin{align} \mathbb{E}\bigg[ \exp\Big(\frac{|S_k|^2}{4 d \Lambda}\Big)\bigg\vert \mathcal{F}_0\bigg]\leq 3. \end{align} $$

Proof. The proof goes along the same lines as standard Azuma–Hoeffding; since we have not found a direct reference in the literature, we present it here.

First, observe that we can reduce ourselves to the case $d=1$ by reasoning componentwise, the general one following from a simple application of conditional Jensen’s inequality.

Next, we claim that the following version of Hoeffding’s lemma holds: given a random variable X and a $\sigma $ -algebra $\mathcal {F}$ such that $\mathbb {E}[X\vert \mathcal {F}]=0$ and $a\leq X \leq b$ $\mathbb {P}$ -a.s., it holds

(A.2)

$$ \begin{align} \mathbb{E}[\exp(\lambda X)\vert \mathcal{F}] \leq \exp\bigg( \frac{\lambda^2 (b-a)^2}{8}\bigg)\quad \forall\, \lambda\in\mathbb{R}. \end{align} $$

By homogeneity, it suffices to prove (A.2) for $b-a=1$ ; in this case, we have the basic inequality $e^{\lambda x} \leq (b-x)e^{\lambda a} + (x-a)e^{\lambda b}$ for all $x\in [a,b]$ . Evaluating in X and taking conditional expectation, we obtain

$$ \begin{align*} \mathbb{E} [e^{\lambda X}\vert \mathcal{F}]\leq (a+1)e^{\lambda a} - a e^{\lambda (a+1)} = e^{H(\lambda)}, \quad H(\lambda):=\lambda a + \log (1+a - e^\lambda a). \end{align*} $$

It can be readily checked that $H(0)=H'(0)=0$ and $H''(\lambda )\leq 1/4$ , which by Taylor expansion yields $H(\lambda )\leq \lambda ^2/8$ and thus (A.2).
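As a quick plausibility check of (A.2), one can test it on the extremal case of a two-point distribution. The following sketch is an illustration only and makes assumptions not present in the text: the $\sigma $ -algebra $\mathcal {F}$ is trivial, X takes the values $a<0<b$ with the unique probabilities making it centred, and the helper name `hoeffding_gap` and the parameter grid are ours.

```python
import math

def hoeffding_gap(a, b, lam):
    """Return exp(lam^2 (b-a)^2 / 8) - E[exp(lam X)] for the centred two-point law on {a, b}."""
    p_b = -a / (b - a)  # P(X = b), chosen so that E[X] = 0
    p_a = b / (b - a)   # P(X = a)
    mgf = p_a * math.exp(lam * a) + p_b * math.exp(lam * b)
    bound = math.exp(lam ** 2 * (b - a) ** 2 / 8.0)
    return bound - mgf

# Scan a small (arbitrary) grid of endpoints and of values of lambda in [-5, 5].
worst_gap = min(
    hoeffding_gap(a, b, k / 8.0)
    for a in (-2.0, -1.0, -0.5, -0.1)
    for b in (0.1, 0.5, 1.0, 3.0)
    for k in range(-40, 41)
)
```

If (A.2) holds, `worst_gap` is nonnegative up to floating-point rounding; the gap vanishes at $\lambda =0$ .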

Next, given the sequence $\{Y_k\}_k$ as in the hypothesis, we can assume by homogeneity $\Lambda =1$ and apply recursively Hoeffding’s lemma as follows:

$$ \begin{align*} \mathbb{E}[ \exp(\lambda S_k)\vert \mathcal{F}_0] & = \mathbb{E}\big[ \exp(\lambda S_{k-1})\, \mathbb{E}[\exp(\lambda Y_k) \vert \mathcal{F}_{k-1}] \big\vert \mathcal{F}_0\big]\\ & \leq \exp\big( \lambda^2 (2 \delta_k)^2/8\big) \mathbb{E}[ \exp(\lambda S_{k-1})\vert \mathcal{F}_0] \leq \ldots \leq e^{\lambda^2/2}. \end{align*} $$

By the inequality $e^{|x|}\leq e^x+e^{-x}$ and Chernoff’s conditional bound, we have

$$ \begin{align*} \mathbb{P}(|S_k|>a\vert \mathcal{F}_0) \leq \inf_{\lambda >0}e^{-\lambda a}\, \mathbb{E}[ e^{\lambda |S_k|}\vert \mathcal{F}_0] \leq 2 \inf_{\lambda>0} e^{-\lambda a + \lambda^2/2} = 2 e^{-a^2/2}. \end{align*} $$

Therefore, we arrive at

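$$ \begin{align*} \mathbb{E}\bigg[ \exp\Big( \frac{|S_k|^2}{4}\Big)\bigg\vert \mathcal{F}_0\bigg] = \int_0^{+\infty} \mathbb{P}\bigg(\exp\Big( \frac{|S_k|^2}{4}\Big)> s \bigg\vert \mathcal{F}_0\bigg)\, \mathrm{d} s \leq 1 + \int_1^{+\infty} \mathbb{P}\Big(|S_k|> \sqrt{4\log s}\,\Big\vert \mathcal{F}_0\Big)\, \mathrm{d} s \leq 1 + 2\int_1^{+\infty} s^{-2} \mathrm{d} s = 3. \end{align*} $$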
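The bound (A.1) can be checked exactly in the simplest example of bounded martingale differences. The sketch below is an illustration under assumptions of ours, not taken from the text: $d=1$ , trivial $\mathcal {F}_0$ , and $Y_i=\varepsilon _i \delta _i$ with i.i.d. Rademacher signs $\varepsilon _i$ ; the values of $\delta _i$ are arbitrary. Since k is small, the expectation is computed by enumerating all sign patterns rather than by sampling.

```python
import itertools
import math

def exact_exp_moment(deltas):
    """Exact E[exp(S^2 / (4 Lambda))] for S = sum_i eps_i * delta_i with Rademacher eps_i."""
    lam = sum(d * d for d in deltas)  # Lambda = delta_1^2 + ... + delta_k^2
    total = 0.0
    # Enumerate all 2^k equally likely sign patterns.
    for signs in itertools.product((-1.0, 1.0), repeat=len(deltas)):
        s = sum(e * d for e, d in zip(signs, deltas))
        total += math.exp(s * s / (4.0 * lam))
    return total / 2 ** len(deltas)

deltas = [0.5, 1.0, 0.25, 2.0, 1.5, 0.75, 1.25, 0.1]  # arbitrary illustrative bounds
moment = exact_exp_moment(deltas)
```

In this example, `moment` should lie between $1$ and the constant $3$ appearing in (A.1).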

Next, we pass to a conditional Kolmogorov-type lemma, stated in a way which is suitable for our purposes.

Lemma A.2. Let E be a Banach space, $X:[0,T]\to E$ be a continuous random process; suppose there exist $\alpha ,\,\beta \in (0,1]$ , a control $w:[0,T]^2\to [0,\infty )$ , a constant $K>0$ and a $\sigma $ -algebra $\mathcal {F}$ such that

(A.3)

$$ \begin{align} \mathbb{E}\bigg[\exp\bigg(\frac{\| X_{s,t} \|_E^2}{|t-s|^{2\alpha} \, w(s,t)^{2\beta}}\bigg)\bigg\vert \mathcal{F}\bigg] \leq K \quad \forall\, s<t. \end{align} $$

Then for any $\varepsilon>0$ , there exists a constant $\mu =\mu (\varepsilon )>0$ such that

(A.4)

$$ \begin{align} \mathbb{E}\bigg[\exp\bigg( \mu\, \sup_{s<t} \frac{\| X_{s,t}\|_E^2}{|t-s|^{2(\alpha-\varepsilon)} \, w(s,t)^{2\beta}}\bigg)\bigg \vert \mathcal F\bigg] \leq e \, K. \end{align} $$

Proof. Since we are already assuming X to be continuous, the supremum over $s<t$ appearing in (A.4) equals the supremum over $s,\, t$ taken over dyadic points. Up to rescaling, we may assume wlog $T=1$ .

For any $n\in \mathbb {N}$ and $k\in \{0,\ldots , 2^n\}$ , set $t^n_k= k 2^{-n}$ and define a random variable

$$\begin{align*}J=\sum_{n=1}^\infty 2^{-2n} \sum_{k=0}^{2^n-1} \exp\bigg( \frac{\| X_{t^n_k,t^n_{k+1}}\|_E^2}{2^{-2n\alpha} w(t^n_k, t^n_{k+1})^{2\beta}}\bigg); \end{align*}$$

by (A.3), it holds $\mathbb {E}[J\vert \mathcal {F}]\leq K$ . Now take $s,t$ to be dyadic points satisfying $|t-s|\sim 2^{-m}$ . Then by standard chaining arguments (see, for example, the proof of [Reference Friz and Hairer43, Theorem 3.1]), it holds

$$\begin{align*}\| X_{s,t}\|_E \lesssim \sum_{n\geq m} \sup_k \| X_{t^n_k,t^n_{k+1}} \|_E; \end{align*}$$

however, by the definition of J, it holds

$$\begin{align*}\| X_{t^n_k,t^n_{k+1}} \|_E \leq 2^{-n\alpha} w(t^n_k, t^n_{k+1})^\beta \sqrt{\log(2^{2n} J)} \lesssim_\varepsilon 2^{-n(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J}) \end{align*}$$

so that

$$ \begin{align*} \| X_{s,t} \|_E & \lesssim \sum_{n\geq m} 2^{-n(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J})\\ & \lesssim 2^{-m(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J}) \sim |t-s|^{\alpha-\varepsilon} w(s,t)^\beta (1+\sqrt{\log J}). \end{align*} $$

Overall, we deduce the existence of a constant $C=C(\varepsilon )>0$ such that

(A.5)

$$ \begin{align} \sup_{s<t} \frac{\| X_{s,t}\|_E}{|t-s|^{\alpha-\varepsilon} w(s,t)^\beta} \leq C (1+\sqrt{\log J}). \end{align} $$

The conclusion now readily follows by applying $x\mapsto \exp (\mu x^2)$ on both sides of (A.5) and choosing $\mu =\mu (\varepsilon )$ so that $2\mu C^2(\varepsilon ) =1$ , so that

$$ \begin{align*} \mathbb{E}\Big[\exp\big(\mu C^2(1+\sqrt{\log J})^2\big)\Big\vert \mathcal{F}\Big] \leq \mathbb{E}\Big[\exp\big(2\mu C^2(1+\log J)\big)\Big\vert \mathcal{F}\Big] = e\, \mathbb{E}[J\vert \mathcal{F}] \leq e K. \end{align*} $$
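For completeness, here is a short worked verification of two steps used in the proof above; it is a sketch relying only on (A.3), the normalisation $T=1$ and the definition of J. First, applying (A.3) with $s=t^n_k$ , $t=t^n_{k+1}$ , so that $|t-s|=2^{-n}$ , gives

$$ \begin{align*} \mathbb{E}[J\vert \mathcal{F}] = \sum_{n=1}^\infty 2^{-2n} \sum_{k=0}^{2^n-1} \mathbb{E}\bigg[\exp\bigg( \frac{\| X_{t^n_k,t^n_{k+1}}\|_E^2}{2^{-2n\alpha} w(t^n_k, t^n_{k+1})^{2\beta}}\bigg)\bigg\vert \mathcal{F}\bigg] \leq \sum_{n=1}^\infty 2^{-2n}\, 2^n\, K = K. \end{align*} $$

Second, note that $J\geq \sum _{n\geq 1} 2^{-2n} 2^n=1$ , so $\log J\geq 0$ ; the elementary inequality $(1+\sqrt {x})^2\leq 2(1+x)$ for $x\geq 0$ and the choice $2\mu C^2=1$ then give

$$ \begin{align*} \exp\big(\mu C^2(1+\sqrt{\log J})^2\big) \leq \exp\big(2\mu C^2(1+\log J)\big) = \exp(1+\log J) = e\, J. \end{align*} $$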

Going through an almost identical argument, one can also obtain the following result, whose proof is therefore omitted.

Lemma A.3. Let E be a Banach space, $X:[0,T]\to E$ be a continuous random process; suppose there exist $\alpha ,\,\beta \in (0,1]$ , $m\in (1,\infty )$ , a control $w:[0,T]^2\to [0,\infty )$ , a constant $K>0$ and a $\sigma $ -algebra $\mathcal {F}$ such that

(A.6)

$$ \begin{align} \mathbb{E}\big[\,\| X_{s,t}\|_E^m\big \vert \mathcal{F}]^{1/m} \leq K |t-s|^\alpha\, w(s,t)^\beta \quad \forall\, s<t. \end{align} $$

Then for any $0<\gamma <\alpha -1/m$ , there exists a constant $C=C(\alpha ,\gamma ,m)>0$ such that

(A.7)

$$ \begin{align} \mathbb{E}\bigg[ \bigg(\sup_{s<t} \frac{\| X_{s,t}\|_E}{|t-s|^{\gamma} \, w(s,t)^{\beta}}\bigg)^m \bigg \vert \mathcal F\bigg]^{1/m} \leq C \, K. \end{align} $$

Let us also mention that, although for simplicity we assumed in Lemmas A.2 and A.3 to work with a norm $\|\cdot \|_E$ , it suffices for it to be a seminorm instead.

Next, we need some basic lemmas in order to control the space-time regularity of random vector fields $A:[0,1]\times \mathbb {R}^d\to \mathbb {R}^m$ . We start by considering the time independent case.

Lemma A.4. Let $F:\mathbb {R}^d\to \mathbb {R}^n$ be a continuous field and suppose there exist $\alpha \in (0,1]$ , $m\in (1,\infty )$ , a constant $K>0$ and a $\sigma $ -algebra $\mathcal {F}$ such that

(A.8)

$$ \begin{align} \| F(x)-F(y)\|_{L^m\vert \mathcal{F}} \leq K |x-y|^\alpha \quad \forall\, x,y\in\mathbb{R}^d. \end{align} $$

Then for any choice of parameters $\lambda ,\eta \in (0,1]$ such that $\eta <\alpha -d/m$ , $\lambda> \alpha -\eta $ , there exists a constant $C=C(\alpha ,m,d,n,\eta ,\lambda )$ such that

(A.9)

Proof. By arguing componentwise, we can restrict to $n=1$ ; by homogeneity, we can assume $K=1$ . Recall that by the classical Garsia-Rodemich-Rumsay lemma, there exists a constant $c = c(d,\eta ,\alpha ,m)$ such that, for any deterministic continuous function f and any $R>0$ , it holds

thus, taking conditional expectation and applying Fubini, we find

Finally, observe that

with the last quantity being finite under our assumptions.

A combination of Lemmas A.3 and A.4 immediately yields the following.

Corollary A.5. Let $G:[0,1]\times \mathbb {R}^d\to \mathbb {R}^n$ be a continuous random vector field and assume there exist parameters $\alpha ,\beta _1,\beta _2\in (0,1]$ , $m\in (1,\infty )$ , a control w, a constant $K>0$ and a $\sigma $ -algebra $\mathcal {F}$ such that

(A.10)

$$ \begin{align} \| G_{s,t}(x)-G_{s,t}(y)\|_{L^m\vert \mathcal{F}} \leq K |x-y|^\alpha |t-s|^{\beta_1} w(s,t)^{\beta_2}\quad \forall\, x,y\in\mathbb{R}^d,\, s<t. \end{align} $$

Then for any choice of parameters

$$\begin{align*}\gamma <\beta_1-\frac{1}{m},\quad \eta<\alpha-\frac{d}{m},\quad \lambda>\alpha-\eta, \end{align*}$$

there exists $C>0$ , depending on all the previous parameters except K, such that

(A.11)

B Some a priori estimates for Young equations

In this appendix, we prove some basic bounds on (linear and nonlinear) Young differential equations, which are used several times in the article. Such estimates are folklore, but since we did not find an appropriate version in the literature, we provide short proofs.

Lemma B.1. Let $A\in C_t^{p-{\mathrm {var}}} C^{\eta }_x$ with $\eta \in (0,1)$ , $p\in [1,2)$ satisfying $(1+\eta )/p>1$ ; set . Let y be any solution to the nonlinear Young equation

$$ \begin{align*} y_t=y_0+\int_0^t A_{\mathrm{d} s}(y_s) \end{align*} $$

on $[0,1]$ ; then one has the bounds

(B.1)

$$ \begin{align} |y_{s,t}| \lesssim w_A(s,t)^{\frac{1}{p}} + w_A(s,t), \qquad |y_{s,t}-A_{s,t}(y_s)| \lesssim w_A(s,t)^{\frac{1+\eta}{p}} + w_A(s,t)^{\frac{1}{p}+\eta} \end{align} $$

valid for all $(s,t)\in [0,1]_\leq ^2$ , where the hidden constants only depend on $(\eta ,p)$ . Similar bounds also hold for solutions only defined on an interval $[S,T]\subset [0,1]$ .

Proof. By definition, y must be of finite q-variation for some q satisfying $1/p +\eta /q>1$ ; applying (5.3) with $x=y$ , one finds

which in particular shows that y is of finite p-variation. Then going through the same computation with $q=p$ and applying [Reference Friz and Victoir45, Proposition 5.10-(i)], there exists a constant C such that, for any $s\leq t$ , it holds

where in the second step, we used the fact that $\eta \in (0,1)$ and Young’s inequality. This readily implies a local bound of the form

We can then apply [Reference Friz and Victoir45, Proposition 5.10-(ii)] to deduce that, for all $(s,t)\in [0,1]_{\leq }^2$ ,

(B.2)

The first inequality in (B.1) immediately follows from (B.2), the second one from a combination of (B.2) with (5.3) for $x=y$ .

In the next statement instead we pass to consider more standard affine Young equations. In particular, $t\mapsto A_t$ is an $\mathbb {R}^{d\times d}$ -valued map of finite p-variation, and the notation $\int _0^t \mathrm {d} A_s\, x_s$ denotes a usual Young integral, equivalently the (deterministic) sewing of the germ $\Sigma _{s,t}:= A_{s,t} x_s$ .

Lemma B.2. Let x be a solution to the affine Young equation

$$\begin{align*}\mathrm{d} x_t = \mathrm{d} A_t\, x_t + \mathrm{d} z_t, \quad x\vert_{t=0}=x_0, \end{align*}$$

where $A\in C^{p-{\mathrm {var}}}_t \mathbb {R}^{d\times d}$ and $z\in C^{\tilde {p}-{\mathrm {var}}}_t$ , for some $p\in [1,2)$ and $\tilde {p}\geq p$ such that $1/p+1/\tilde {p}>1$ ; assume $z_0=0$ . Then there exists a constant $C=C(p,\tilde {p})>0$ such that

(B.3)

When $z=0$ , setting , it holds

(B.4)

Proof. Let us first apply the change of variable $\theta =x-z$ , so that $\theta $ solves

$$\begin{align*}\mathrm{d} \theta_t = \mathrm{d} A_t\, \theta_t + \mathrm{d} A_t\, z_t = \mathrm{d} A_t\, \theta_t + \mathrm{d} \tilde{z}_t \end{align*}$$

where $\tilde {z}_t:=\int _0^t \mathrm {d} A_s\, z_s$ . The advantage of this maneuver is that $\tilde {z}$ is also of finite p-variation and controlled by (a multiple of) $w^{1/p}$ . Indeed, by Young integration, it holds

(B.5)

For any $s<t$ , define

and similarly for $\tilde {z}$ . Manipulating the equation for $\theta $ in a standard manner, one finds a constant $C>0$ such that, for any $s<t$ , it holds

(B.6)

If $Cw(0,1)^{1/p}\leq 1/2$ , then the estimate (B.6) buckles with $s=0,t=1$ . Otherwise, define recursively an increasing sequence $t_i$ by $t_0=0$ and $C w(t_i,t_{i+1})^{1/p}\in (1/3,1/2)$ , with $t_n=1$ for some n. Set $J_i:=\sup _{r\in [t_i,t_{i+1}]} |\theta _r|$ with the convention $J_{-1}=|x_0|$ . Then thanks to our choice of $t_i$ and equation (B.6), it holds

Recursively, this implies

Finally observe that, by superadditivity of w and our choice of $t_i$ , it holds

$$\begin{align*}n \leq (3C)^p \sum_i w(t_i,t_{i+1}) \leq (3C)^p w(0,1), \end{align*}$$

and therefore by (B.5),

with some other constant $C'>0$ . Substituting this bound back to (B.6), we similarly get

Combining everything yields the claimed bounds (B.3)–(B.4).

C Fractional regularity and Girsanov’s transform

We collect in this appendix several definitions of fractional regularity and show how, in certain regularity regimes, they can be combined with our results, so as to verify the applicability of Girsanov’s transform to the singular SDEs in consideration.

We start by recalling several classical definitions of fractional spaces for paths $f:[0,1]\to E$ , E being a Banach space. For $\beta \in (0,1)$ and $p\in [1,\infty )$ , the fractional Sobolev space $W^{\beta ,p}=W^{\beta ,p}(0,1;E)$ is defined as the set of $f\in L^p(0,1;E)$ such that

$$\begin{align*}\int_{[0,1]^2} \frac{\| f_{s,t}\|_E^p}{|t-s|^{1+\beta p}} \, \mathrm{d} s\, \mathrm{d} t<\infty. \end{align*}$$

Similarly, we define the Besov–Nikolskii spaces $N^{\beta ,p}=N^{\beta ,p}(0,1;E)$ as the collections of all $f\in L^p(0,1;E)$ such that

$$\begin{align*}\sup_{h\in (0,1)} |h|^{-\beta}\, \| f_{h+\cdot} - f_{\cdot}\|_{L^p(0,1-h;E)} <\infty. \end{align*}$$

In the case $p=\infty $ , we will set $W^{\beta ,p}=N^{\beta ,p}=C^\beta $ . Although we will not need it, let us mention that these spaces are particular instances of the Besov spaces $B^\beta _{p,q}$ as defined in [Reference Simon93]; indeed, $W^{\beta ,p}=B^\beta _{p,p}$ and $N^{\beta ,p}=B^\beta _{p,\infty }$ .

There is a final class of spaces we will need, which is an original contribution of this work; many processes arising from stochastic sewing can be shown to belong to this class, thanks to Lemmas A.2–A.3. Given $\beta \in (0,1]$ , $p\in [1,\infty )$ with $\beta> 1/p$ , we define the space $D^{\beta ,p}=D^{\beta ,p}(0,1;E)$ as the set of all f for which there exists a continuous control $w=w(f)$ such that

(C.1)

$$ \begin{align} \| f_{s,t}\|_E \leq |t-s|^{\beta-\frac{1}{p}}\, w(s,t)^{\frac{1}{p}}\quad \forall\, s<t. \end{align} $$

Observe that by superadditivity, if such a control w exists, then the optimal choice must be necessarily given by

$$\begin{align*}w_f(s,t) := \sup \sum_{i=0}^{n-1} \frac{\| f_{t_i,t_{i+1}}\|_E^p}{|t_{i+1}-t_i|^{\beta p-1}}, \end{align*}$$

where the supremum runs over all possible finite partitions $s=t_0 < t_1<\ldots <t_n=t$ of $[s,t]$ . We can therefore endow the space $D^{\beta ,p}$ with the norm

(C.2)

which makes them Banach spaces; observe the analogy with the definition of $C^{p-{\mathrm {var}}}$ and its characterization via controls. In particular, if a function f is known to satisfy (C.1), then it must hold .

For $\beta>1/p$ , we define $W^{\beta ,p}_0=\{f\in W^{\beta ,p}: f_0=0\}$ (as we will shortly see, this is a good definition, as elements of $W^{\beta ,p}$ are continuous functions), and similarly for $N^{\beta ,p}_0$ and $D^{\beta ,p}_0$ .

The next proposition summarises the embeddings between these classes of spaces, as well as the Cameron–Martin spaces $\mathcal {H}^H$ and spaces of finite q-variation.

Proposition C.1. Let $\beta \in (0,1]$ , $p\in [1,\infty )$ with $\beta> 1/p$ ; then, the following hold:

i) for any $\varepsilon>0$ , we have $ W^{\beta ,p} \hookrightarrow D^{\beta ,p} \hookrightarrow N^{\beta ,p} \hookrightarrow W^{\beta -\varepsilon ,p}$ ;
ii) if $\bar \beta \leq \beta $ and $\beta -1/p\geq \bar \beta -1/\bar p$ , then $N^{\beta ,p}\hookrightarrow N^{\bar \beta ,\bar p}$ ; in particular, $N^{\beta ,p} \hookrightarrow C^{\beta -1/p}$ ;
iii) $N^{\beta ,p} \hookrightarrow C^{1/\beta -{\mathrm {var}}} \hookrightarrow N^{\beta ,1/\beta }$ ;
iv) let $H\in (0,1/2)$ and $E=\mathbb {R}^d$ ; then for any $\varepsilon>0$ , it holds
$$ \begin{align*} W^{H+\frac{1}{2}+\varepsilon,2}_0 \hookrightarrow \mathcal{H}^H\hookrightarrow W^{H+\frac{1}{2}-\varepsilon,2}_0; \end{align*} $$
in particular, $\mathcal {H}^H\hookrightarrow C^{q-{\mathrm {var}}}$ for any $q>(H+1/2)^{-1}$ .

Proof. i) The last embedding $N^{\beta ,p} \hookrightarrow W^{\beta -\varepsilon ,p}$ is classical and can be found in [Reference Simon93, Corollary 23]. The embedding $W^{\beta ,p} \hookrightarrow D^{\beta ,p}$ follows from [Reference Friz and Victoir42, Theorem 2]; in particular, by the Garsia–Rodemich–Rumsey lemma, the associated control $w_f$ can be taken as

$$\begin{align*}w_f(s,t) = \int_{[s,t]^2} \frac{\| f_{r,u}\|_E^p}{|r-u|^{1+\beta p}} \, \mathrm{d} r \mathrm{d} u. \end{align*}$$

It remains to show the embedding $D^{\beta ,p} \hookrightarrow N^{\beta ,p}$; this follows the same technique used to show that $C^{p-{\mathrm {var}}}\hookrightarrow N^{1/p,p}$ (see, for example, [Reference Liu, Prömel and Teichmann73, Proposition 4.3]). Indeed, for any $h\in [0,1]$, it holds

$$ \begin{align*} \| f_{h+\cdot} - f_{\cdot}\|_{L^p}^p = \int_0^{1-h} \| f_{t,h+t}\|_E^p \mathrm{d} t \leq |h|^{\beta p-1} \int_0^{1-h} w(t,h+t) \mathrm{d} t, \end{align*} $$

where $w=w_f$. Denoting by K the largest integer such that $Kh\leq 1-h$, we have

$$ \begin{align*} \int_0^{1-h} w(t,h+t) \mathrm{d} t&\leq \int_0^{Kh} w(t,h+t) \mathrm{d} t +|h| w(0,1) \\ & = \sum_{i=0}^{K-1} \int_{ih}^{(i+1)h} w(s,h+s) \mathrm{d} s+|h| w(0,1) \\ &= \int_0^h \sum_{i=0}^{K-1} w(ih+s,(i+1)h+s) \mathrm{d} s+|h| w(0,1) \\ &\leq \int_0^h w(0,1) \mathrm{d} s +|h| w(0,1)= 2 |h| w(0,1), \end{align*} $$

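where in the last inequality, we used the superadditivity of w. Overall, we conclude that $\| f_{h+\cdot } - f_{\cdot }\|_{L^p}\leq 2^{1/p}\, |h|^{\beta }\, w(0,1)^{1/p}$ for all h, which yields the embedding $D^{\beta ,p} \hookrightarrow N^{\beta ,p}$.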

ii) These embeddings can be found in, for example, [Reference Simon93, Corollary 22], [Reference Simon93, Corollary 26].

iii) These embeddings can be found in, for example, [Reference Liu, Prömel and Teichmann73, Proposition 4.1], [Reference Liu, Prömel and Teichmann73, Proposition 4.3].

iv) The second embedding $\mathcal {H}^H\hookrightarrow W^{H+\frac {1}{2}-\varepsilon ,2}_0$ is the result of [Reference Friz and Victoir42, Theorem 3]; the last one follows from it combined with $N^{q,2}\hookrightarrow C^{1/q-{\mathrm {var}}}$ . It only remains to show the first embedding. Although we believe it to be common knowledge, we have not found a proof in the literature; thus, we give a detailed one.

Given $ f\in W^{H+1/2+\varepsilon ,2}_0$ , in order to verify that $f\in \mathcal {H}^H$ , we need to check that $K^{-1}_H f\in L^2$ , where

$$ \begin{align*} K^{-1}_H f = s^{1/2-H} D_{0+}^{1/2-H} s^{H-1/2} D^{2H}_{0+} f \end{align*} $$

(see equation (12) from [Reference Nualart and Ouknine81]); $D_{0+}^\gamma $ denotes the Riemann-Liouville fractional derivative of order $\gamma $ , for which again we refer to [Reference Nualart and Ouknine81].

By using standard embeddings between $W^{\delta ,2}$ spaces and potential spaces $I^+_{\delta ,2}$ (cf. [Reference Decreusefond34, Proposition 5]), up to losing an arbitrarily small fraction of regularity, we know that for any $f\in W^{H+1/2+\varepsilon ,2}_0$, it holds $h:=D^{2H}_{0+} f \in W^{1/2-H+\varepsilon /2,2}$ (this is the only point in the proof where the condition $f(0)=0$ is needed). Thus, we are left with verifying that, for the choice $\gamma =1/2-H$, it holds

$$ \begin{align*} (K^{-1}_H f)_t = C_{\gamma} \bigg( t^{-\gamma} h_t + \gamma t^\gamma \int_0^t \frac{t^{-\gamma} h_t - s^{-\gamma} h_s}{|t-s|^{1+\gamma}} \mathrm{d} s \bigg) \in L^2(0,1;\mathbb{R}^d). \end{align*} $$

From now on, we will drop the constants $C_\gamma $ and $\gamma $ for simplicity.

For the first term, observing that $t^{-\gamma }\in L^r$ for any r such that $1/r>1/2-H$ and that $h\in W^{1/2-H+\varepsilon /2,2}\hookrightarrow L^p$ for $1/p=H-\varepsilon /2$, it is easy to check by Hölder’s inequality that $t^{-\gamma } h_t \in L^2$.
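The Hölder step can be made explicit as follows. The specific value of $r$ below is just one admissible choice, picked here for illustration, and $\varepsilon $ is assumed small enough that $H-\varepsilon /2>0$.

```latex
\begin{align*}
\frac{1}{r} := \frac{1}{2}-H+\frac{\varepsilon}{2}, \qquad \frac{1}{p} = H-\frac{\varepsilon}{2}
\quad &\Longrightarrow\quad \frac{1}{r}+\frac{1}{p}=\frac{1}{2}, \\
\gamma r = \frac{1/2-H}{1/2-H+\varepsilon/2} < 1
\quad &\Longrightarrow\quad \int_0^1 t^{-\gamma r}\,\mathrm{d} t<\infty, \\
\| t^{-\gamma} h_t\|_{L^2} &\leq \| t^{-\gamma}\|_{L^r}\, \| h\|_{L^p} <\infty.
\end{align*}
```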

By adding and subtracting $t^{-\gamma } h_s$ in the numerator and rescaling time in the second resulting term, we can split the integral term into

$$ \begin{align*} I^1_t := \int_0^t \frac{h_t-h_s}{|t-s|^{1+\gamma}} \mathrm{d} s, \quad I^2_t := t^{-\gamma} \int_0^1 \frac{1-s^{-\gamma}}{(1-s)^{1+\gamma}} h_{t s} \mathrm{d} s. \end{align*} $$
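To see where $I^1$ and $I^2$ come from, one can add and subtract $t^{-\gamma } h_s$ in the numerator and then substitute $s=tu$ in the second piece (the variable $u$ is introduced here only for the change of variables):

```latex
\begin{align*}
t^\gamma \int_0^t \frac{t^{-\gamma} h_t - s^{-\gamma} h_s}{|t-s|^{1+\gamma}} \,\mathrm{d} s
&= \int_0^t \frac{h_t-h_s}{|t-s|^{1+\gamma}} \,\mathrm{d} s
 + t^\gamma \int_0^t \frac{t^{-\gamma} - s^{-\gamma}}{(t-s)^{1+\gamma}}\, h_s \,\mathrm{d} s \\
&= I^1_t + t^\gamma \int_0^1 \frac{t^{-\gamma}(1-u^{-\gamma})}{t^{1+\gamma}(1-u)^{1+\gamma}}\, h_{tu}\, t\,\mathrm{d} u \\
&= I^1_t + t^{-\gamma} \int_0^1 \frac{1-u^{-\gamma}}{(1-u)^{1+\gamma}}\, h_{tu} \,\mathrm{d} u
 = I^1_t + I^2_t .
\end{align*}
```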

For $I^1$ , it holds

$$ \begin{align*} \int_0^1 |I^1_t|^2 \mathrm{d} t \leq \int_0^1 \bigg( \int_0^1 \frac{|h_t-h_s|}{|t-s|^{1+\gamma}} \mathrm{d} s\bigg)^2 \mathrm{d} t \lesssim \int_{[0,1]^2} \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \mathrm{d} s \mathrm{d} t \lesssim \| h\|_{W^{\gamma+\varepsilon/2,2}}^2, \end{align*} $$

where in the middle passage, we used Jensen’s inequality. To handle $I^2$ , define $F^\gamma _s := (1-s^{-\gamma })/(1-s)^{1+\gamma }$ ; $F^\gamma $ is only unbounded at the points $s=0$ and $s=1$ , where it behaves asymptotically respectively as $-s^{-\gamma }$ and $(1-s)^{-\gamma }$ , and therefore $F^\gamma \in L^1\cap L^2$ . As before, $h\in L^p$ for $1/p=H-\varepsilon /2< 1/2$ , and therefore by Hölder’s inequality

$$ \begin{align*} |I^2_t| \leq t^{-\gamma} \| F^\gamma\|_{L^{p'}} \| h_{t\cdot}\|_{L^p} \leq t^{-\gamma-\frac{1}{p}} \| F^\gamma\|_{L^{p'}} \| h\|_{L^p} \lesssim t^{\varepsilon/2-1/2}, \end{align*} $$

which readily implies $I^2\in L^2$ as well.
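Two computations are implicit in the bounds for $I^1$ and $I^2$ and can be sketched as follows. For $I^1$, the Jensen step amounts to a Cauchy–Schwarz inequality against the integrable weight $|t-s|^{\varepsilon -1}$; for $I^2$, the scaling of the $L^p$ norm together with $\gamma =1/2-H$ and $1/p=H-\varepsilon /2$ gives the exponent of $t$, which is square integrable near zero.

```latex
\begin{align*}
\bigg( \int_0^1 \frac{|h_t-h_s|}{|t-s|^{1+\gamma}} \,\mathrm{d} s\bigg)^2
&= \bigg( \int_0^1 |t-s|^{\frac{\varepsilon-1}{2}} \cdot
   \frac{|h_t-h_s|}{|t-s|^{\frac{1+2\gamma+\varepsilon}{2}}} \,\mathrm{d} s\bigg)^2 \\
&\leq \int_0^1 |t-s|^{\varepsilon-1} \,\mathrm{d} s
   \int_0^1 \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \,\mathrm{d} s
 \leq \frac{2}{\varepsilon} \int_0^1 \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \,\mathrm{d} s, \\
\| h_{t\cdot}\|_{L^p(0,1)} &= t^{-\frac{1}{p}} \| h\|_{L^p(0,t)} \leq t^{-\frac{1}{p}} \| h\|_{L^p(0,1)},
\qquad \gamma+\frac{1}{p} = \Big(\frac{1}{2}-H\Big)+\Big(H-\frac{\varepsilon}{2}\Big) = \frac{1}{2}-\frac{\varepsilon}{2}, \\
\int_0^1 t^{2(\frac{\varepsilon}{2}-\frac{1}{2})} \,\mathrm{d} t &= \int_0^1 t^{\varepsilon-1} \,\mathrm{d} t = \frac{1}{\varepsilon}<\infty.
\end{align*}
```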

Remark C.2. By Proposition C.1, for a deterministic path g to belong to the Cameron-Martin space $\mathcal {H}^H$ for $H\in (0,1/2)$ , it suffices to verify that $g\in \mathcal {D}^{\beta ,p}$ for parameters $p\in (1,2]$ and $\beta>0$ satisfying

(C.3)

$$ \begin{align} \beta-\frac{1}{p}>H, \end{align} $$

in which case we have the estimate $\| g\|_{\mathcal {H}^H} \lesssim \| g\|_{\mathcal {D}^{\beta ,p}}$. Therefore, if a stochastic process h is adapted and belongs to $\mathcal {D}^{\beta ,p}$, then for a sequence of stopping times $(\tau _n)_{n\in \mathbb {N}}$ satisfying $\tau _n\nearrow \infty $, the laws of $B^H$ and $B^H_\cdot +h_{\cdot \wedge \tau _n}$ are mutually absolutely continuous. If the stronger Novikov-type condition

(C.4)

$$ \begin{align} \mathbb{E}\big[\exp\lambda\|h\|_{\mathcal{D}^{\beta,p}}^2\big]<\infty\quad \forall\,\lambda>0 \end{align} $$

holds, then one can infer the stronger conclusion that the laws of $B^H$ and $B^H_\cdot +h$ are equivalent and that the Radon–Nikodym derivative admits moments of any order; see [Reference Galeati, Harang and Mayorcas51, Proposition 3.10] for a similar statement.
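The chain of embeddings behind this remark can be sketched as follows, assuming $g_0=0$ and choosing auxiliary parameters $\bar \beta <1$ and $\delta >0$ (symbols used only in this sketch) with $H+\tfrac {1}{2}<\bar \beta -\delta $ and $\bar \beta \leq \beta -\tfrac {1}{p}+\tfrac {1}{2}$, which is possible by (C.3). The four embeddings use, in order, items i), ii), i) and iv) of Proposition C.1; note that $\bar \beta \leq \beta $ holds automatically since $p\leq 2$.

```latex
\begin{align*}
\mathcal{D}^{\beta,p}_0
\;\hookrightarrow\; N^{\beta,p}_0
\;\hookrightarrow\; N^{\bar\beta,2}_0
\;\hookrightarrow\; W^{\bar\beta-\delta,2}_0
\;\hookrightarrow\; \mathcal{H}^H,
\qquad
H+\tfrac{1}{2} < \bar\beta-\delta < \bar\beta \leq \beta-\tfrac{1}{p}+\tfrac{1}{2} .
\end{align*}
```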

With the above considerations in mind, we are now ready to present a result on the applicability of Girsanov’s transform, which is the main motivation for this appendix.

Lemma C.3. Assume (A) and that

(C.5)

$$ \begin{align} 1-1/(Hq')<0. \end{align} $$

Let $b\in L^q_t C^\alpha _x$ , $x_0\in \mathbb {R}^d$ , and denote by $\mu $ the law of the solution X to the associated SDE (1.6). Then Girsanov’s transform applies and $\mu $ is equivalent to $\mathcal {L}(x_0+B^H)$ . As a consequence, $\mathrm {supp}\, \mu = C([0,1];\mathbb {R}^d)$ .

Proof. Without loss of generality, we may assume $\alpha <0$ and $x_0=0$ . In view of Remark C.2, we need to verify (C.4) with $h=\varphi =X-B^H$ and with some $\beta $ , p satisfying (C.3).

Let $\kappa>0$ be small enough so that H, $\alpha -\kappa $, and q also satisfy (A), and let $\tilde b\in L^q_tC^{\alpha -\kappa }_x$ with norm $1$. By Lemmas 2.4, 3.1 and A.2, we have that for some $\mu>0$,

(C.6)

$$ \begin{align} \mathbb{E}\bigg[\exp\bigg( \mu\Big\|\int_0^\cdot \tilde b_r(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg) \bigg]<\infty. \end{align} $$

Note that for sufficiently small $\kappa $, the exponents satisfy (C.3) as a consequence of (A). Therefore, (C.6) looks like (C.4), except for the arbitrariness of the coefficient $\lambda $. One can then proceed by an interpolation argument as in [Reference Galeati, Harang and Mayorcas51, Proposition 3.8]: for any $\kappa>0$ and $\lambda>0$, there exist $b^{-}$ and $b^{+}$ such that $b=b^-+b^+$ and

$$ \begin{align*} \frac{2 \lambda}{\mu} \|b^-\|_{L^q_tC^{\alpha-\kappa}_x}^2\leq1,\qquad\qquad\|b^+\|_{L^q_t C^0_x}=:K<\infty, \end{align*} $$

where K may depend on all parameters. Then we can write

$$ \begin{align*} \mathbb{E}&\bigg[\exp\bigg( \lambda\Big\|\int_0^\cdot b_r(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg) \bigg] \\ &\leq e^{2\lambda K^2}\mathbb{E}\bigg[\exp\bigg( \mu\, \frac{2\lambda}{\mu} \Big\|\int_0^\cdot b^-_r(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg)\bigg]<\infty, \end{align*} $$

applying (C.6) with $\sqrt {2\lambda /\mu }b^-$ in place of $\tilde b$ in the last step.
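The elementary facts used in this proof can be sketched as follows. The first line checks (C.3) for the exponents appearing in (C.6), under the assumption (not restated in this appendix) that (A) includes $q\in (1,2]$ and the subcriticality condition $\alpha>1-1/(Hq')$. The second line records that the bounded part $b^+$ produces a path satisfying (C.1) with $\beta =1$ and a control of total mass $K^q$. The third line is the bound $\| u+v\|^2\leq 2\| u\|^2+2\| v\|^2$ applied to the decomposition $b=b^-+b^+$; here the unadorned norm stands for the $\mathcal {D}^{1+(\alpha -\kappa )H-\kappa ,q}$ norm.

```latex
\begin{align*}
\Big(1+(\alpha-\kappa)H-\kappa\Big)-\frac{1}{q}
 &= \alpha H+\frac{1}{q'}-\kappa(1+H) > H
 \quad\text{for small } \kappa, \text{ since } \alpha H+\frac{1}{q'}>H, \\
\Big| \int_s^t b^+_r(B^H_r+\varphi_r)\,\mathrm{d} r\Big|
 &\leq |t-s|^{1-\frac{1}{q}} \bigg(\int_s^t \| b^+_r\|_{C^0_x}^q\,\mathrm{d} r\bigg)^{\frac{1}{q}}, \\
\lambda \Big\|\int_0^\cdot b_r(B^H_r+\varphi_r)\,\mathrm{d} r\Big\|^2
 &\leq 2\lambda \Big\|\int_0^\cdot b^-_r(B^H_r+\varphi_r)\,\mathrm{d} r\Big\|^2 + 2\lambda K^2 .
\end{align*}
```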

Remark C.4. The restriction (C.5) in Lemma C.3 is necessary. Indeed, even taking a space-independent drift $b\in L^q$, so that $\varphi \in W^{1,q}$, the condition $1-1/q>(H+1/2)-1/2$ necessary for the Sobolev embedding implies (C.5). The reader may find this pathological, and rightly so: for such a b, we can deduce everything about the law of $B^H+\varphi $ from the law of $B^H$. Note that this also motivates the use of ‘stochastic regularity’ as in, for example, (2.2), which assigns to deterministic functions (like $\varphi $ in this example) infinite regularity.
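The relation between the Sobolev embedding condition and (C.5) is a one-line computation, assuming that $q'$ denotes the conjugate exponent of $q$, that is, $1/q+1/q'=1$:

```latex
\begin{align*}
1-\frac{1}{q} > \Big(H+\frac{1}{2}\Big)-\frac{1}{2}
\iff \frac{1}{q'} > H
\iff \frac{1}{Hq'} > 1
\iff 1-\frac{1}{Hq'} < 0 .
\end{align*}
```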

Note also that (C.5) enforces $H\in (0,1/2)$. We do not discuss the regime of large H in detail, as Girsanov’s transform becomes less and less useful as H increases. For example, for $H>2$, one has $B^H\in C^2$ and (in the nontrivial case $\alpha <1$) $\varphi \notin C^2$, yielding trivially the mutual singularity of the laws of $B^H$ and $X=B^H+\varphi $. Once again, the way out is to use ‘stochastic regularity’ as a substitute for Girsanov.

Acknowledgements

MG thanks Konstantinos Dareiotis for valuable discussions during the development of the parallel article [Reference Dareiotis and Gerencsér31]. The authors thank the institutions MFO Oberwolfach and TU Wien for their hospitality during their research visits.

Funding statement

This research was funded in whole or in part by the Austrian Science Fund (FWF) [10.55776/P34992]. For open access purposes, the author has applied a CC BY public copyright license to any author accepted manuscript version arising from this submission. LG was funded by the DFG under Germany’s Excellence Strategy - GZ 2047/1, project-id 390685813 and later by the SNSF Grant 182565 and by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number MB22.00034 through the project TENSE.

Competing interest

The authors have no competing interest to declare.

Footnotes

1 Specifically, we are interested in understanding how scales as $\lambda \to 0$ , which is related to studying the local behaviour of solutions; instead, the scaling of as $\lambda \to \infty $ reflects a ‘zoom out’ which identifies the dominant term concerning the long-time dynamics.

2 Please see our convention on the definition of $C^\alpha _x$ from Section 1.5 below, especially for $\alpha \in \mathbb {N}$; in particular, $C^0_x$ is understood as the space of bounded and measurable functions, with $L^\infty $-norm.

3 By Besov embeddings $L^p_x \hookrightarrow B^{-d/p}_{\infty ,\infty }$ , with homogeneous norms behaving in the same way under rescaling.

4 In fact, any integral in time of an LND Gaussian process admitting a moving average representation in the style of (1.23) is still LND; see [Reference Galeati and Gubinelli48, Sec. 4.2, Example iv.].

5 For non-Gaussian processes, one can still find a replacement for (1.26) – for example, in the case of Lévy processes; see [Reference Butkovsky, Dareiotis and Gerencsér13].

6 In the regularisation by noise literature, to the best of our knowledge, this concept originates from [Reference Catellier and Gubinelli20], where a similar pathwise solution ansatz leads to the formalism of nonlinear Young integrals, based on deterministic sewing. Here, also inspired by the works [Reference Lê71, Reference Friz, Hocquet and Lê44, Reference Gerencsér53, Reference Butkovsky, Dareiotis and Gerencsér13], we take a step further and readapt the concept to a more probabilistic setup, where a combination of (2.1), LND and stochastic sewing yields sharper results.

7 Note that in terms of the coefficients, the moments of $\eta _A$ depend on $w_{b^1,\alpha ,q}+w_{b^2,\alpha ,q}$ , while the moments of z depend only on $w_{b^2,\alpha ,q}$ .

8 In the case $H>1$ and $b\in C^\alpha _x$ , $\int _0^\cdot b(B^H_r)\mathrm {d} r\in C^{1+\alpha }_t$ , so that Girsanov would require the condition $1+\alpha>H+1/2$ ; covering the whole regime $\alpha>1-1/(2H)$ would lead to the condition $1-1/(2H)>H-1/2$ , which cannot hold for $H>1$ .

References

Ambrosio, L., ‘Transport equation and Cauchy problem for non-smooth vector fields’, in Calculus of Variations and Nonlinear Partial Differential Equations (Springer, 2008), 1–41.

Anzeletti, L., Richard, A. and Tanré, E., ‘Regularisation by fractional noise for one-dimensional differential equations with distributional drift’, Electron. J. Probab. 28 (2023), Paper No. 135, 49.

Athreya, S., Butkovsky, O., Lê, K. and Mytnik, L., ‘Well-posedness of stochastic heat equation with distributional drift and skew stochastic heat equation’, Comm. Pure Appl. Math. 77(5) (2024), 2708–2777.

Ayache, A., Shieh, N.-R., and Xiao, Y., ‘Multiparameter multifractional Brownian motion: local nondeterminism and joint continuity of the local times’, Ann. Inst. Henri Poincaré Probab. Stat. 47(4) (2011), 1029–1054.

Bahouri, H., Chemin, J.-Y. and Danchin, R., Fourier Analysis and Nonlinear Partial Differential Equations vol. 343 (Springer, Heidelberg, 2011).

Bass, R. F. and Chen, Z.-Q., ‘Stochastic differential equations for Dirichlet processes’, Probab. Theory Related Fields 121(3) (2001), 422–446.

Bechtold, F. and Hofmanová, M., ‘Weak solutions for singular multiplicative SDEs via regularization by noise’, Stochastic Process. Appl. 157 (2023), 413–435.

Beck, L., Flandoli, F., Gubinelli, M. and Maurelli, M., ‘Stochastic ODEs and stochastic linear PDEs with critical drift: regularity, duality and uniqueness’, Electron. J. Probab. 24 (2019), 1–72.

Benassi, A., Jaffard, S. and Roux, D., ‘Elliptic Gaussian random processes’, Rev. Mat. Iberoamericana 13(1) (1997), 19–90.

Brué, E., Colombo, M. and De Lellis, C., ‘Positive solutions of transport equations and classical nonuniqueness of characteristic curves’, Arch. Ration. Mech. Anal. 240(2) (2021), 1055–1090.

Brué, E. and Nguyen, Q.-H., ‘Sharp regularity estimates for solutions of the continuity equation drifted by Sobolev vector fields’, Anal. PDE 14(8) (2021), 2539–2559.

Butkovsky, O., Dareiotis, K. and Gerencsér, M., ‘Approximation of SDEs: a stochastic sewing approach’, Probab. Theory Related Fields (2021).

Butkovsky, O., Dareiotis, K. and Gerencsér, M., ‘Strong rate of convergence of the Euler scheme for SDEs with irregular drift driven by Levy noise’, to appear in Ann. Inst. H. Poincaré Probab. Statist., 2022, arXiv:2204.12926.

Butkovsky, O. and Gallay, S., ‘Weak existence for SDEs with singular drifts and fractional Brownian or Levy noise beyond the subcritical regime’, Preprint, 2023, arXiv:2311.12013.

Butkovsky, O., Lê, K. and Matsuda, T., ‘Regularization of differential equations with integrable drifts by fractional noise’, in preparation, 2024.

Butkovsky, O., Lê, K. and Mytnik, L., ‘Stochastic equations with singular drift driven by fractional Brownian motion’, Preprint, 2023, arXiv:2302.11937.

Butkovsky, O. and Mytnik, L., ‘Weak uniqueness for singular stochastic equations’, Preprint, 2024, arXiv:2405.13780.

Cannizzaro, G. and Chouk, K., ‘Multidimensional SDEs with singular drift and universal construction of the polymer measure with white noise potential’, Ann. Probab. 46(3) (2018).

Cass, T., Friz, P. and Victoir, N., ‘Non-degeneracy of Wiener functionals arising from rough differential equations’, Trans. Amer. Math. Soc. 361(6) (2009), 3359–3371.

Catellier, R. and Gubinelli, M., ‘Averaging along irregular curves and regularisation of ODEs’, Stochastic Process. Appl. 126(8) (2016), 2323–2366.

Catellier, R., ‘Rough linear transport equation with an irregular drift’, Stoch. Partial Differ. Equ. Anal. Comput. 4(3) (2016), 477–534.

Chaudru de Raynal, P.-É., ‘Weak regularization by stochastic drift: Result and counter example’, Discrete Contin. Dyn. Syst. 38(3) (2018), 1269–1291.

Chaudru de Raynal, P.-É., Honoré, I. and Menozzi, S., ‘Strong regularization by Brownian noise propagating through a weak Hörmander structure’, Probab. Theory Related Fields 184(1–2) (2022), 1–83.

Chaudru de Raynal, P.-É., Honoré, I. and Menozzi, S., ‘Regularization effects of a noise propagating through a chain of differential equations: an almost sharp result’, Trans. Amer. Math. Soc. 375(01) (2021), 1–45.

Chaudru de Raynal, P.-É., Menozzi, S. and Priola, E., ‘Schauder estimates for drifted fractional operators in the supercritical case’, J. Funct. Anal. 278(8) (2020), 108425.

Chen, X. and Li, X.-M., ‘Strong completeness for a class of stochastic differential equations with irregular coefficients’, Electron. J. Probab. 19 (2014).

Cheridito, P., ‘Mixed fractional Brownian motion’, Bernoulli 7 (2001), 913–934.

Chouk, K. and Gubinelli, M., ‘Nonlinear PDEs with modulated dispersion I: Nonlinear Schrödinger equations’, Comm. Partial Differential Equations 40(11) (2015), 2047–2081.

Chouk, K. and Gess, B., ‘Path-by-path regularization by noise for scalar conservation laws’, J. Funct. Anal. 277(5) (2019), 1469–1498.

Chouk, K., Gubinelli, M., Li, G., Li, J. and Oh, T., ‘Nonlinear PDEs with modulated dispersion II: Korteweg–de Vries equation’, Preprint, 2024, arXiv:1406.7675 (v2).

Dareiotis, K. and Gerencsér, M., ‘Path-by-path regularisation through multiplicative noise in rough, Young, and ordinary differential equations’, Ann. Probab. 52(5) (2024), 1864–1902.

Davie, A. M., ‘Uniqueness of solutions of stochastic differential equations’, Int. Math. Res. Not. IMRN 24 (2007), Art. ID rnm124, 26.

Debussche, A., Glatt-Holtz, N. and Temam, R., ‘Local martingale and pathwise solutions for an abstract fluids model’, Phys. D 240(14–15) (2011), 1123–1144.

Decreusefond, L., ‘Stochastic integration with respect to Volterra processes’, Ann. Inst. H. Poincaré Probab. Statist. 41(2) (2005), 123–149.

Delarue, F. and Diel, R., ‘Rough paths and 1d SDE with a time dependent distributional drift: application to polymers’, Probab. Theory Related Fields 165(1–2) (2015), 1–63.

DiPerna, R. J. and Lions, P.-L., ‘Ordinary differential equations, transport theory and Sobolev spaces’, Invent. Math. 98(3) (1989), 511–547.

Fedrizzi, E. and Flandoli, F., ‘Hölder flow and differentiability for SDEs with nonregular drift’, Stoch. Anal. Appl. 31(4) (2013), 708–736.

Fedrizzi, E. and Flandoli, F., ‘Noise prevents singularities in linear transport equations’, J. Funct. Anal. 264(6) (2013), 1329–1354.

Flandoli, F., Random Perturbation of PDEs and Fluid Dynamic Models: École d’Été de Probabilités de Saint-Flour XL–2010 vol. 2015 (Springer Science & Business Media, 2011).

Flandoli, F., Gubinelli, M. and Priola, E., ‘Well-posedness of the transport equation by stochastic perturbation’, Invent. Math. 180(1) (2010), 1–53.

Flandoli, F., Issoglio, E. and Russo, F., ‘Multidimensional stochastic differential equations with distributional drift’, Trans. Amer. Math. Soc. 369(3) (2016), 1665–1688.

Friz, P. and Victoir, N., ‘A variation embedding theorem and applications’, J. Funct. Anal. 239(2) (2006), 631–637.

Friz, P. K. and Hairer, M., A Course on Rough Paths (Springer International Publishing, 2020).

Friz, P. K., Hocquet, A. and Lê, K., ‘Rough stochastic differential equations’, Preprint, 2021, arXiv:2106.10340.

Friz, P. K. and Victoir, N.B., Multidimensional Stochastic Processes as Rough Paths (Cambridge University Press, 2009).

Galeati, L., ‘Nonlinear Young differential equations: a review’, J. Dynam. Differential Equations 35(2) (2023), 985–1046.

Galeati, L., ‘A note on weak existence for singular SDEs’, Stoch. Dyn. 24(3) (2024), Paper No. 2450025.

Galeati, L. and Gubinelli, M., ‘Prevalence of $\rho$-irregularity and related properties’, Ann. Inst. H. Poincaré Probab. Statist. 60(4) (2024), 2415–2467. https://doi.org/10.1214/23-AIHP1399.

Galeati, L. and Gubinelli, M., ‘Noiseless regularisation by noise’, Revista Mat. Iberoam. 38(2) (2021), 433–502.

Galeati, L. and Gubinelli, M., ‘Mixing for generic rough shear flows’, SIAM J. Math. Anal. 55(6) (2023), 7240–7272.

Galeati, L., Harang, F. A. and Mayorcas, A., ‘Distribution dependent SDEs driven by additive fractional Brownian motion’, Probab. Theory Related Fields 185(1–2) (2023), 251–309.

Gatheral, J., Jaisson, T. and Rosenbaum, M., ‘Volatility is rough’, Quant. Finance 18(6) (2018), 933–949.

Gerencsér, M., ‘Regularisation by regular noise’, Stoch. Partial Differ. Equ. Anal. Comput. 11(2) (2023), 714–729.

Goudenège, L., Haress, E. M. and Richard, A., ‘Numerical approximation of SDEs with fractional noise and distributional drift’, Stochastic Process. Appl. (2023). https://doi.org/10.1016/j.spa.2024.104533.

Gräfner, L. and Perkowski, N., ‘Weak well-posedness of energy solutions to singular SDEs with supercritical distributional drift’, Preprint, 2024, arXiv:2407.09046.

Han, Y., ‘Solving McKean–Vlasov SDEs via relative entropy’, arXiv:2204.05709, to appear in Ann. Appl. Probab. (2022).

Hao, Z. and Zhang, X., ‘SDEs with supercritical distributional drifts’, Preprint, 2023, arXiv:2312.11145.

Harang, F. A. and Perkowski, N., ‘${C}^{\infty }$-regularization of ODEs perturbed by noise’, Stoch. Dyn. 21(08) (2021).

Hurst, H. E., Black, R. P. and Simaika, Y. M., Long-Term Storage: An Experimental Study (Constable, 1965).

Kinzebulatov, D. and Madou, K. R., ‘Strong solutions of SDEs with singular (form-bounded) drift via Roeckner-Zhao approach’, Preprint, 2023, arXiv:2306.04825.

Kinzebulatov, D. and Semënov, Y. A., ‘Sharp solvability for singular SDEs’, Electron. J. Probab. 28 (2023), Paper No. 69, 15.

Kolmogorov, A. N., ‘Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum’, Acad. Sci. URSS (NS) 26 (1940), 115–118.

Kremp, H. and Perkowski, N., ‘Rough weak solutions for singular Lévy SDEs’, Preprint, 2023, arXiv:2309.15460.

Krylov, N. V., ‘On time inhomogeneous stochastic Itô equations with drift in ${L}_{d+1}$’, Ukraïn. Mat. Zh. 72(9) (2020), 1232–1253.

Krylov, N. V., ‘On strong solutions of Itô’s equations with $\sigma \in {W}_d^1$ and $b\in {L}_d$’, Ann. Probab. 49(6) (2021), 3142–3167.

Krylov, N. V., ‘On strong solutions of time inhomogeneous Itô’s equations with Morrey diffusion gradient and drift. A supercritical case’, Preprint, 2022, arXiv:2211.03719.

Krylov, N. V., ‘On weak and strong solutions of time inhomogeneous Itô’s equations with VMO diffusion and Morrey drift’, Stochastic Process. Appl. 179 (2025), Paper No. 104505.

Krylov, N. V. and Röckner, M., ‘Strong solutions of stochastic equations with singular time dependent drift’, Probab. Theory Related Fields 131(2) (2005), 154–196.

Kunita, H., ‘Stochastic differential equations and stochastic flows of diffeomorphisms’, in Ecole d’été de probabilités de Saint-Flour XII-1982 (Springer, 1984), 143–303.

Kusuoka, S., ‘On the regularity of solutions to SDEs’, in Asymptotic Problems in Probability Theory: Wiener Functionals and Asymptotics (Longman Science & Technical, 1993), 90–103.

Lê, K., ‘A stochastic sewing lemma and applications’, Electron. J. Probab. 25 (2020).

Lê, K., ‘Stochastic sewing in Banach spaces’, Electron. J. Probab. 28 (2023), 1–22.

Liu, C., Prömel, D. J. and Teichmann, J., ‘Characterization of nonlinear Besov spaces’, Trans. Amer. Math. Soc. 373(1) (2020), 529–550.

Mandelbrot, B. B. and van Ness, J. W., ‘Fractional Brownian motions, fractional noises and applications’, SIAM Review 10(4) (1968), 422–437.

Marinucci, D. and Robinson, P. M., ‘Mixed fractional Brownian motion’, J. Statist. Plann. Inference 80(1–2) (1999), 111–122.

Menoukeu-Pamen, O., Meyer-Brandis, T., Nilssen, T., Proske, F. and Zhang, T., ‘A variational approach to the construction and Malliavin differentiability of strong solutions of SDE’s’, Math. Ann. 357(2) (2013), 761–799.

Modena, S. and Székelyhidi, L., ‘Non-uniqueness for the transport equation with Sobolev vector fields’, Ann. PDE 4(2) (2018), 1–38.

Mohammed, S.-E. A., Nilssen, T. K. and Proske, F. N., ‘Sobolev differentiable stochastic flows for SDEs with singular coefficients: applications to the transport equation’, Ann. Probab. 43(3) (2015), 1535–1576.

Nilssen, T., ‘Rough linear PDEs with discontinuous coefficients – existence of solutions via regularization by fractional Brownian motion’, Electron. J. Probab. 25 (2020), 1–33.

Nualart, D., The Malliavin Calculus and Related Topics (Probability and its Applications), second edn. (Springer-Verlag, Berlin, 2006).

Nualart, D. and Ouknine, Y., ‘Regularization of differential equations by fractional noise’, Stochastic Process. Appl. 102(1) (2002), 103–116.

Nualart, D. and Sönmez, E., ‘Regularization of differential equations by two fractional noises’, Stoch. Dyn. 22(6) (2022), Paper No. 2250029, 19.

Panja, D., ‘Generalized Langevin equation formulation for anomalous polymer dynamics’, J. Stat. Mech. Theory Exp. 2010(02) (2010), L02001.

Pardoux, É., ‘Equations aux derivees partielles stochastiques non lineaires monotones etude de solutions fortes de type Itô’, 1975.

Peltier, R.-F. and Véhel, J. L., ‘Multifractional Brownian motion: definition and preliminary results’, PhD thesis, INRIA, 1995.

Perrin, E., Harba, R., Berzin-Joseph, C., Iribarren, I. and Bonami, A., ‘$n$th-order fractional Brownian motion and fractional Gaussian noises’, IEEE Trans. Signal Process. 49(5) (2001), 1049–1059.

Picard, J., Representation Formulae for the Fractional Brownian Motion (Springer Berlin Heidelberg, Berlin, Heidelberg, 2011), 3–70.

Röckner, M. and Zhao, G., ‘SDEs with critical time dependent drifts: strong solutions’, Preprint, 2021, arXiv:2103.05803.

Romito, M. and Tolomeo, L., ‘Yet another notion of irregularity through small ball estimates’, Preprint, 2022, arXiv:2207.02716.

Russo, F. and Tudor, C. A., ‘On bifractional Brownian motion’, Stochastic Process. Appl. 116(5) (2006), 830–856.

Shaposhnikov, A. V., ‘Some remarks on Davie’s uniqueness theorem’, Proc. Edinb. Math. Soc. 59(4) (2016), 1019–1035.

Shaposhnikov, A. and Wresch, L., ‘Pathwise vs. path-by-path uniqueness’, Ann. Inst. Henri Poincaré Probab. Stat. 58(3) (2022), 1640–1649.

Simon, J., ‘Sobolev, Besov and Nikolskii fractional spaces: imbeddings and comparisons for vector valued spaces on an interval’, Ann. Mat. Pura Appl. (4) 157 (1990), 117–148.

Szymanski, J. and Weiss, M., ‘Elucidating the origin of anomalous diffusion in crowded fluids’, Phys. Rev. Lett. 103(3) (2009), 038102.

Tudor, C. A. and Xiao, Y., ‘Sample path properties of bifractional Brownian motion’, Bernoulli 13(4) (2007), 1023–1052.

Tudor, C. A. and Xiao, Y., ‘Sample paths of the solution to the fractional-colored stochastic heat equation’, Stoch. Dyn. 17(1) (2017), 1750004, 20.

Veretennikov, A. Y., ‘On strong solutions and explicit formulas for solutions of stochastic integral equations’, Sb. Math. 39 (1981), 387–403.

Wei, J., Hu, J. and Yuan, C., ‘Stochastic equations with low regularity drifts’, Preprint, 2023, arXiv:2310.00421.

Xia, P., Xie, L., Zhang, X. and Zhao, G., ‘${L}^q\left({L}^p\right)$-theory of stochastic differential equations’, Stochastic Process. Appl. 130(8) (2020), 5188–5211.

Yamada, T. and Watanabe, S., ‘On the uniqueness of solutions of stochastic differential equations’, Kyoto J. Math. 11(1) (1971).

Zhang, X., ‘Stochastic flows of SDEs with irregular coefficients and stochastic transport equations’, Bull. Sci. Math. 134(4) (2010), 340–378.

Zhang, X., ‘Stochastic differential equations with Sobolev diffusion and singular drift and applications’, Ann. Appl. Probab. 26(5) (2016), 2697–2732.

Zhang, X. and Zhao, G., ‘Stochastic Lagrangian path for Leray’s solutions of $3$D Navier–Stokes equations’, Comm. Math. Phys. 381(2) (2021), 491–525.

Zvonkin, A. K., ‘A transformation of the phase space of a diffusion process that removes the drift’, Math. Sb. 22(1) (1974), 129.
