1 Introduction
Given a vector field $b:\mathbb {R}_+\times \mathbb {R}^d\to \mathbb {R}^d$, an initial condition $x_0\in \mathbb {R}^d$ and a function $f:\mathbb {R}_+\to \mathbb {R}^d$, consider the differential equation
$$ \begin{align} X_t = x_0 + \int_0^t b_r ( X_r) \mathrm{d} r + f_t. \end{align} $$
When f is chosen according to some random distribution, one obtains a stochastic differential equation (SDE), which often exhibits much better properties than the unperturbed equation ($f\equiv 0$), even at the level of existence and uniqueness of solutions. This phenomenon is often referred to as regularisation by noise, and its study goes back to the works of Zvonkin [Reference Zvonkin104] and Veretennikov [Reference Veretennikov97]; see the monograph [Reference Flandoli39] for a survey in the case of standard Brownian f.
Although there is plenty of evidence [Reference Catellier and Gubinelli20, Reference Davie32, Reference Galeati and Gubinelli49, Reference Harang and Perkowski58] that it is the pathwise properties of the perturbation that determine the regularisation effects, the available results are far more abundant in the Brownian and, in general, the Markovian case.
However, a wide variety of applications motivate models of anomalous diffusion with long-range memory, including the statistical description of turbulence [Reference Kolmogorov62], hydrology [Reference Hurst, Black and Simaika59], anomalous polymer dynamics [Reference Panja83], diffusion in living cells [Reference Szymanski and Weiss94] and rough volatility models in finance [Reference Gatheral, Jaisson and Rosenbaum52]. Such non-Markovian processes are commonly modelled by fractional Brownian motion (fBm). In this case, the lack of Markovian and semimartingale structure renders a large part of the ‘standard’ toolbox (Itô’s formula, Kolmogorov equations, Zvonkin transformation, martingale problem) unavailable. Nevertheless, since fBm paths share many properties with the standard Brownian ones (up to changes in the scaling exponents), one would expect similar regularisation phenomena.
The goal of the present work is twofold. First, we provide the first well-posedness results in the case of non-Markovian noise under demonstrably sharp conditions on b. The optimality follows both from a scaling heuristic (see Section 1.1 below) and from rigorous construction of counterexamples (see Section 1.3 below). The second goal is to expand the existing well-posedness theory by studying various properties of the solutions that are well-known (though often nontrivial) in the Brownian case, but much less so for fractional noise. These include existence, regularity, invertibility of the solution flow, stability with respect to perturbations of the initial condition and/or the nonlinearity, and Malliavin differentiability. The proofs can also be of interest in cases where the results are not new: the methods presented here go beyond not only the Markovian framework but also the scope of Girsanov’s theorem (see Remark 1.8 and Appendix C).
At the same time, the idea is quite intuitive: in order to develop a strong solution theory for (1.1), it is natural to investigate first the solvability of the linearised equation around any given solution X, namely to show that
$$ \begin{align} Y_t= y + \int_0^t \nabla b_r(X_r) Y_r\, \mathrm{d} r \end{align} $$
has a well-defined, unique solution for any $y\in \mathbb {R}^d$; observe that, due to its additive nature, the perturbation f does not appear in (1.2). The study of (1.2) is perfectly in line with the classical setting of a continuously differentiable drift b, where (1.2) can be solved directly and its behaviour matches the Grönwall-type estimates encountered when looking at the difference of any two solutions. However, if b is not assumed to be differentiable, $\nabla b_r(X_r)$ a priori does not make sense, and thus a standard interpretation of (1.2) is no longer possible. The key idea to overcome this difficulty is twofold:
- a) $\nabla b(\cdot )$ in (1.2) is not evaluated at arbitrary space points, but rather along the solution X, which can have very special properties inherited from the noise f.
- b) In order to give meaning to (1.2) in a Young integral sense, we do not need to define $\nabla b_r(X_r)$ pointwise; instead, it suffices to show that the path
$$ \begin{align} t\mapsto L_t:=\int_0^t \nabla b_r(X_r) \mathrm{d} r \end{align} $$
is well-defined and enjoys sufficiently nice time regularity (more precisely, it is of finite p-variation for some $p<2$). In view of a), depending on the structure of the noise f, this can be a much more reasonable requirement.
In analogy with the Lipschitz setting, one can then transfer estimates for classical linear Young equations of the form
$$ \begin{align} \sup_{t\in [0,1]} |Y_t|\lesssim e^{C\|L\|_{p-{\mathrm{var}}}^p} |y| \end{align} $$
to pathwise bounds for the difference of any two solutions X and $\tilde X$ with different initial conditions, up to replacing L by another process $\hat L=\hat L(X,\tilde X)$ similar in spirit to (1.3).
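To see the analogy concretely, here is a sketch in the classical case, where b is continuously differentiable in space with $\int_0^1 \|\nabla b_r\|_{L^\infty_x} \mathrm{d} r<\infty$. This assumption is made only for the illustration, and the matrix $A_r$ below is hypothetical notation, not necessarily the process $\hat L$ used later. For two solutions X and $\tilde X$ of (1.1) driven by the same f, the perturbation cancels in the difference, and the fundamental theorem of calculus gives
$$ \begin{align*} X_t-\tilde X_t = x_0-\tilde x_0 + \int_0^t A_r\, (X_r-\tilde X_r)\, \mathrm{d} r, \qquad A_r:=\int_0^1 \nabla b_r\big(\theta X_r+(1-\theta)\tilde X_r\big)\, \mathrm{d} \theta, \end{align*} $$
so that Grönwall's lemma yields
$$ \begin{align*} \sup_{t\in [0,1]} |X_t-\tilde X_t| \leq \exp\bigg(\int_0^1 |A_r|\, \mathrm{d} r\bigg)\, |x_0-\tilde x_0|. \end{align*} $$
This has the same shape as (1.4) with $p=1$, where $\int_0^\cdot A_r\, \mathrm{d} r$ plays the role of L and its 1-variation is bounded by $\int_0^1 |A_r|\, \mathrm{d} r$.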
In order to rigorously formalise all of the above, it is crucial to identify the correct space of perturbations $\varphi $ such that $X=\varphi +f$ indeed inherits the relevant properties from f; these are the a priori estimates given by Lemmas 2.1–2.4. Correspondingly, we formulate two new versions of the Stochastic Sewing Lemma (SSL) by Lê [Reference Lê71]; cf. Lemmas 2.5 and 2.6 below, which are tailor-made for our analysis. Once this setup is in place, it provides exponential moment estimates of certain additive functionals of X, like the one defined in (1.3), turning pathwise bounds like (1.4) into moment bounds. Finally, once the behaviour of the linearised equation (1.2) is understood, many further properties (uniqueness, stability, differentiability of the flow) of the ODE follow similarly.
1.1 Scaling heuristics and existing literature
One way to obtain a unified view of the many works on regularisation by noise is through a scaling argument; for a similar approach in the Brownian setting and $L^q_t L^p_x$ spaces, see [Reference Beck, Flandoli, Gubinelli and Maurelli8, Section 1.5].
From now on, we sample the perturbation as an fBm $B^H$ with Hurst parameter $H\in (0,+\infty ) \setminus \mathbb {N}$, which satisfies the scaling relation
$$ \begin{align} (B^H_t)_{t\geq0}\overset{\mathrm{law}}{=}(\lambda^{-H}B^H_{\lambda t})_{t\geq 0}, \quad \forall\, \lambda>0. \end{align} $$
Details about the processes $B^H$ are given in Section 1.4 below; let us just briefly recall that $H=1/2$ gives the standard Brownian motion and that this is the only case where $B^H$ is a Markov process. For the values $H=k+1/2$, $k\in \mathbb {N}_+$ (which we call ‘degenerate Brownian’), the Markovian toolbox is still available since the SDE can be rewritten as a higher-dimensional equation driven by degenerate Brownian noise; see, for example, [Reference Chaudru de Raynal, Honoré and Menozzi24]. For all other choices of H, such tools are unavailable, and the study of the SDE requires a fundamentally different approach. The equation then takes the form
$$ \begin{align} X_t = x_0 + \int_0^t b_r ( X_r) \mathrm{d} r + B^H_t. \end{align} $$
In order for the regularising effects of $B^H$ to dominate the irregularities of b, it is natural to require that, when zooming into small scales in a way that keeps the noise strength constant, the nonlinearity vanishes; if this were not the case, and the nonlinearity were dominant, we would expect to see all the same pathologies (e.g., coalescence or branching of solutions) which could manifest in the ODE without noise. Therefore, keeping (1.5) in mind, for a fixed parameter H, we call a space V of functions (or distributions) on $\mathbb {R}_+\times \mathbb {R}^d$ critical (resp. subcritical/supercritical) if for the rescaled drift coefficient
$$ \begin{align*} b^\lambda_t(x)=\lambda^{1-H} b(\lambda t, \lambda^H x), \end{align*} $$
the leading order seminorm $[\![\, \cdot \,]\!]_V$ (see the examples below for its practical meaning) scales like $[\![ b^\lambda ]\!]_V \sim \lambda ^\gamma [\![ b ]\!]_V$, for all $\lambda \leq 1$, [Footnote 1] with $\gamma =0$ (resp. $\gamma>0$/$\gamma <0$).
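The form of the rescaled drift $b^\lambda $ can be motivated by a short computation, sketched here under the assumption that X solves (1.6) in the classical integral sense. The rescaled process $X^\lambda $ and the noise $\tilde B$ are notation introduced only for this illustration. Using the substitution $r=\lambda s$,
$$ \begin{align*} X^\lambda_t:=\lambda^{-H} X_{\lambda t} &= \lambda^{-H} x_0 + \lambda^{-H}\int_0^{\lambda t} b_r(X_r)\, \mathrm{d} r + \lambda^{-H} B^H_{\lambda t} \\ &= \lambda^{-H} x_0 + \int_0^t \lambda^{1-H}\, b(\lambda s, \lambda^H X^\lambda_s)\, \mathrm{d} s + \tilde B_t, \qquad \tilde B_t:=\lambda^{-H} B^H_{\lambda t}. \end{align*} $$
By (1.5), $\tilde B$ has the same law as $B^H$, so $X^\lambda $ solves an equation of the same form as (1.6), with drift $b^\lambda $ and noise of unchanged strength. Taking $\lambda \to 0$ therefore corresponds to zooming into small scales.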
We refer to Section 1.5 for more details on the function spaces appearing in the upcoming examples.
Example 1.1. Consider autonomous, inhomogeneous Hölder-Besov spaces $V=B^\alpha _{\infty ,\infty }$, where b does not depend on the time variable. Here, the leading order seminorm is the associated homogeneous seminorm; namely, we take it to be $\| \cdot \|_{\dot {B}^\alpha _{\infty ,\infty }}$ as defined in [Reference Bahouri, Chemin and Danchin5]; alternatively, for $f\in B^\alpha _{\infty ,\infty }$ and $\alpha \geq 0$, one can regard it as $\| (-\Delta )^{\alpha /2} f\|_{B^0_{\infty ,\infty }}$, while for $\alpha <0$, one can define it by duality with the homogeneous seminorm of $\dot {B}^{-\alpha }_{1,1}$. Either way, one finds the scaling relation
$$ \begin{align*} \| f(\eta\, \cdot)\|_{\dot{B}^\alpha_{\infty,\infty}} \sim_\alpha \eta^\alpha \| f(\cdot)\|_{\dot{B}^\alpha_{\infty,\infty}} \quad\, \forall\, (\eta,\alpha)\in \mathbb{R}_{>0}\times \mathbb{R}. \end{align*} $$
Combined with our definition of $b^\lambda $, one finds $\gamma =1-H+\alpha H$, and so the subcriticality condition reads as
$$ \begin{align} \alpha>1-\frac{1}{H}. \end{align} $$
However, even in the classical Brownian case, where one gets the condition $\alpha>-1$, this remains out of reach. Weak well-posedness is known for $\alpha>-1/2$ [Reference Flandoli, Issoglio and Russo41], and a nonstandard kind of well-posedness (where uniqueness is even weaker than uniqueness in law) is shown for $\alpha>-2/3$ [Reference Cannizzaro and Chouk18, Reference Delarue and Diel35], for special classes of drift b. The classical works [Reference Veretennikov97, Reference Zvonkin104] show strong well-posedness for $V=C^\alpha _x$ and $\alpha \geq 0$. [Footnote 2] Interestingly, in the degenerate Brownian case, weak well-posedness is proved in [Reference Chaudru de Raynal, Honoré and Menozzi24] in the full regime $\alpha>(2k-1)/(2k+1)$, which is precisely the condition (1.7). For strong well-posedness, one requires the more restrictive condition
$$ \begin{align*} \alpha>1-\frac{1}{2H}; \end{align*} $$
see [Reference Chaudru de Raynal, Honoré and Menozzi23, Equation (1.11)]. The same condition is required for strong well-posedness in the non-Markovian case for all $H\in (0,\infty )\setminus \mathbb {N}$; cf. [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli49, Reference Gerencsér53, Reference Nualart and Ouknine81]. After the first version of this manuscript, the work [Reference Butkovsky, Lê and Mytnik16] appeared, where the authors are able to establish (among several results) weak existence of solutions in the full subcritical regime (1.7), under the additional assumption that b is a Radon measure; however, uniqueness is still open.
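As a sanity check on the exponents in Example 1.1, the computation can be sketched as follows, using only the definition of $b^\lambda $ for time-independent b and the scaling relation of the homogeneous seminorm with $\eta =\lambda ^H$:
$$ \begin{align*} \| b^\lambda\|_{\dot{B}^\alpha_{\infty,\infty}} = \lambda^{1-H}\, \| b(\lambda^H\, \cdot)\|_{\dot{B}^\alpha_{\infty,\infty}} \sim_\alpha \lambda^{1-H+\alpha H}\, \| b\|_{\dot{B}^\alpha_{\infty,\infty}}, \qquad \gamma>0 \iff \alpha H> H-1 \iff \alpha>1-\frac{1}{H}. \end{align*} $$
For $H=1/2$, this gives $\alpha>-1$, and for $H=k+1/2$, it gives $\alpha>1-2/(2k+1)=(2k-1)/(2k+1)$, in agreement with the thresholds quoted in the example.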
Example 1.2. Another well-studied case is the mixed Lebesgue space $V=L^q_tL^p_x$. Here, we can take the seminorm to be $\| \cdot \|_V$ itself; using the scaling relation $\|f(\eta \,\cdot )\|_{L^p_x}= \eta ^{-d/p}\|f\|_{L^p_x}$, one finds $\gamma =1-H-1/q-(Hd)/p$, and the subcritical regime is
$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}<1-H. \end{align} $$
In the classical case $H=1/2$, equation (1.8) reads as
$$ \begin{align} \frac{2}{q}+\frac{d}{p}<1, \end{align} $$
which is precisely the condition from the classical work [Reference Krylov and Röckner68], where strong well-posedness is proved (under the additional constraint $p\geq 2$); instead, the critical regime corresponds to the celebrated Ladyzhenskaya–Prodi–Serrin (LPS) condition. This case has since been extensively studied by several authors, allowing also for multiplicative noise with Sobolev diffusion coefficients; see, among others, [Reference Fedrizzi and Flandoli37, Reference Xia, Xie, Zhang and Zhao99, Reference Zhang101, Reference Zhang102]. In recent years, even the critical case has been reached [Reference Krylov65, Reference Röckner and Zhao88] under certain constraints on $d,p,q$; the results have been further refined by allowing coefficients in Morrey spaces (cf. [Reference Krylov66, Reference Krylov67]) or form-bounded drifts (cf. [Reference Kinzebulatov and Madou60, Reference Kinzebulatov and Semënov61] and the references therein). It was recently understood in [Reference Zhang and Zhao103] that one can go beyond condition (1.8), up to imposing additional constraints on $\mathrm {div}\, b$; for further progress in this exciting direction, see also [Reference Gräfner and Perkowski55, Reference Hao and Zhang57].
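The exponent $\gamma $ in Example 1.2 can be recovered by scaling time and space separately. The following is a sketch for $p,q<\infty $, assuming the time domain is all of $\mathbb {R}_+$ and using the changes of variables $s=\lambda t$ and $y=\lambda ^H x$ (the endpoint cases follow the same pattern with the convention $1/\infty =0$):
$$ \begin{align*} \| b^\lambda\|_{L^q_t L^p_x} = \lambda^{1-H} \bigg(\int_0^\infty \| b(\lambda t, \lambda^H\, \cdot)\|_{L^p_x}^q\, \mathrm{d} t\bigg)^{1/q} = \lambda^{1-H}\, \lambda^{-Hd/p}\, \lambda^{-1/q}\, \| b\|_{L^q_t L^p_x}, \end{align*} $$
so that $\gamma =1-H-1/q-(Hd)/p$. Setting $H=1/2$ in the condition $\gamma>0$ gives $1/q+d/(2p)<1/2$, and multiplying by 2 yields (1.9).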
For $H\in (1/2,1)$, no results are known, and for $H\in (0, 1/2)$, the main previously known results for weak and strong well-posedness are both from [Reference Lê71], under the stronger conditions
$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2},\qquad\frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}-H, \end{align} $$
respectively, with the additional constraint $p\in [2,\infty ]$, later removed in [Reference Galeati and Gubinelli49]. It is conjectured in [Reference Lê71] that the first condition in (1.10) is enough to guarantee strong well-posedness. One particular corollary of our result is that for $q\in (1,2]$, even (1.8) is sufficient. Therefore, we propose to update the conjecture of [Reference Lê71] (if $q\in (1,2]$, now a theorem) to assert strong well-posedness under the scaling condition (1.8). Let us also mention that we have recently learned about an ongoing work [Reference Butkovsky, Lê and Matsuda15] towards improving (1.10).
Example 1.3. A common generalisation of Examples 1.1 and 1.2 is the space $V=L^q_t C^\alpha _x$, where (adopting the leading seminorm to be the one of $L^q_t \dot {B}^\alpha _{\infty ,\infty }$, in agreement with both previous cases [Footnote 3]) the scaling works out to be $\gamma =1-H-1/q+\alpha H$. Therefore, the subcriticality condition reads as
$$ \begin{align*} \alpha>1-\frac{1}{H}+\frac{1}{Hq}=1-\frac{1}{q'H}, \end{align*} $$
where, here and in the rest of the paper, q and $q'$ are conjugate exponents, $1/q+1/q'=1$. This generality has only been studied recently in [Reference Galeati and Gubinelli49, Reference Galeati, Harang and Mayorcas51], where strong well-posedness is proved under the stronger condition
$$ \begin{align} \alpha>1-\frac{1}{2H}+\frac{1}{Hq}, \end{align} $$
with the additional constraints $H\in (0,1/2]$, $q\in (2,\infty ]$. Note that by setting $\alpha =-d/p$, condition (1.11) coincides with the second one in (1.10).
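Both identities used in Example 1.3 are elementary; as a quick check, with $1/q'=1-1/q$,
$$ \begin{align*} 1-\frac{1}{H}+\frac{1}{Hq} = 1-\frac{1}{H}\Big(1-\frac{1}{q}\Big) = 1-\frac{1}{q'H}, \end{align*} $$
while substituting $\alpha =-d/p$ into (1.11) and multiplying by $H>0$ gives
$$ \begin{align*} -\frac{Hd}{p}> H-\frac{1}{2}+\frac{1}{q} \iff \frac{1}{q}+\frac{Hd}{p}<\frac{1}{2}-H, \end{align*} $$
which is the second condition in (1.10). The choice $\alpha =-d/p$ is natural because $L^p_x$ and $\dot {B}^{-d/p}_{\infty ,\infty }$ have the same scaling exponent; compare the scaling relations in Examples 1.1 and 1.2.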
In summary, to the best of our knowledge, weak well-posedness results in a whole subcritical regime are available only in the degenerate Brownian case $H=k+1/2$, $k\in \mathbb {N}$, and strong well-posedness only in the standard Brownian case $H=1/2$.
1.2 Discussion of the main results
In the present paper, we establish strong well-posedness in the full subcritical regime for all $H\in (0,\infty )\setminus \mathbb {N}$, with coefficients from the class in Example 1.3, under the additional constraint $q\in (1,2]$. In other terms, our main conditions are summarised by the assumption
$$ \begin{align} H\in(0,\infty) \setminus \mathbb{N},\qquad q\in(1,2],\qquad \alpha\in\Big(1-\frac{1}{q'H},1\Big). \tag{A} \end{align} $$
The solution theory we present in fact goes beyond strong well-posedness. We show existence in the strong sense not only of solutions but also of solution flows, and uniqueness in the path-by-path sense. Furthermore, several further properties of solutions are established such as stability, continuous differentiability of the flow and its inverse, Malliavin differentiability and $\rho $-irregularity.
Many of these results are new even in the time-independent case: if b is only a function of x and belongs to $C^\alpha _x$, then the optimal way to fit it into the framework of (A) is to choose $q=2$, leading to the condition $\alpha>1-1/(2H)$. This is the classical condition under which strong well-posedness is known [Reference Catellier and Gubinelli20, Reference Gerencsér53, Reference Nualart and Ouknine81], but several of the further properties have not been previously established.
Our main findings are loosely summarised (without aiming for full precision or generality) in the following statement; the corresponding results (often in a somewhat sharper form) can be found throughout the paper in Theorems 4.3, 4.4, 5.5, 5.6 for i), 3.2 for ii), 6.2 for iii), 6.8 for iv), 7.4 for v), 9.3 for vi), 10.4 for vii). For simplicity, we restrict ourselves to the time interval $t\in [0,1]$, but it is clear that up to rescaling, we could consider any finite $[0,T]$ (up to allowing the hidden constants to depend on T).
Theorem 1.4. Assume (A) and let $x_0\in \mathbb {R}^d$, $b\in L^q_t C^\alpha _x$, $m\in [1,\infty )$. Then,
- i) Strong existence and path-by-path uniqueness hold for (1.6);
- ii) For any other $\tilde x_0\in \mathbb {R}^d$ and $\tilde b\in L^q_t C^\alpha _x$, the associated solutions X and $\tilde X$ satisfy the stability estimate
$$ \begin{align*} \mathbb{E}\bigg[\sup_{t\in[0,1]}|X_t-\tilde X_t|^m\bigg]^{1/m}\lesssim |x_0-\tilde x_0|+\|b- \tilde b\|_{L^q_t C^{\alpha-1}_x}; \end{align*} $$
- iii) The solutions form a stochastic flow of diffeomorphisms $\Phi _{s\to t}(x)$, whose spatial gradient $\nabla \Phi $ is $\mathbb {P}$-a.s. continuous in all variables; moreover, it holds
$$ \begin{align*} \sup_{0\leq s\leq t\leq 1, x\in \mathbb{R}^d} \mathbb{E}\big[| \nabla \Phi_{s\to t} (x) |^m\big] <\infty; \end{align*} $$
- iv) For each $s<t$ and $x\in \mathbb {R}^d$, the random variable $\omega \mapsto \Phi _{s\to t}(x;\omega )$ is Malliavin differentiable; moreover, it holds
$$ \begin{align*} \sup_{0\leq s\leq t\leq 1, x\in \mathbb{R}^d} \mathbb{E}\big[ \| D \Phi_{s\to t}(x)\|_{\mathcal{H}^H}^m \big]<\infty, \end{align*} $$
where D is the Malliavin derivative and $\mathcal {H}^H$ the Cameron-Martin space of $B^H$;
- v) Strong existence and uniqueness hold also for the McKean-Vlasov equation
$$ \begin{align} X_t=x_0+\int_0^t(b_r\ast\mu_r)(X_r)\mathrm{d} r +B^H_t,\qquad\mu_t=\mathcal{L}(X_t); \end{align} $$
- vi) Solutions X are $\mathbb {P}$-a.s. $\rho $-irregular for any $\rho <1/(2H)$;
- vii) If additionally $\alpha>0$, then for any $p>1$, strong existence and path-by-path uniqueness hold for solutions $u\in L^\infty _t W^{1,p}_x$ to the transport equation
$$ \begin{align*} \partial_t u + b\cdot \nabla u + \dot{B}^H_t\cdot \nabla u=0 \end{align*} $$
for all initial data $u_0\in W^{1,p}_x$.
The various aspects of the main results are discussed in detail in their respective sections, so here, let us just briefly comment on them.
The notion of path-by-path uniqueness in i), as a strengthening of the classical pathwise uniqueness, was first established in the seminal work [Reference Davie32] by Davie, with a simpler proof later provided by Shaposhnikov [Reference Shaposhnikov91]. This kind of result was then generalised to fBm in [Reference Catellier and Gubinelli20], suggesting that it is a consequence of the pathwise properties of the trajectories of the driving noise. Such a uniqueness concept requires giving a pathwise interpretation to the SDE, which becomes nontrivial for $\alpha <0$, where b can be a distribution of negative regularity and no longer a function. In this case, following [Reference Catellier and Gubinelli20], we will give meaning to (1.6) as a nonlinear Young ODE; see Section 5 for more details.
 Stability estimates in the style of ii) are useful to bypass abstract Yamada-Watanabe arguments and get strong existence directly. Among other possible applications, let us mention their importance in numerical schemes with distributional drifts; see, for example, the recent work [Reference Goudenège, Haress and Richard54]. In this paper, stability estimates play a key role when solving McKean-Vlasov equations as in v); see Section 7. The study of stochastic flows iii) for SDEs goes back to the classical work [Reference Kunita69]; see also [Reference Chen and Li26, Reference Fedrizzi and Flandoli37, Reference Menoukeu-Pamen, Meyer-Brandis, Nilssen, Proske and Zhang76] for flows in irregular settings. In iv), we can in fact derive differentiability with respect to perturbations of the noise quite a bit more generally than Cameron-Martin directions (see Remark 6.9), in line with the observations from [Reference Friz and Victoir42, Reference Kusuoka70]. Concerning v), regularisation by fractional noise for distribution dependent SDEs has been investigated in [Reference Galeati, Harang and Mayorcas51] and recently in [Reference Han56]. Above, we only stated the simplest example of the McKean-Vlasov equation for the sake of presentation. Theorem 7.4 below allows for more general dependence on $(X,\mu )$. The notion of $\rho $-irregularity in vi) was introduced by [Reference Catellier and Gubinelli20] as a powerful measurement of the averaging properties of paths. Extending $\rho $-irregularity from Gaussian processes to perturbed Gaussian processes has previously only been achieved efficiently via Girsanov transform. Here, we provide a simple and more robust alternative. Concerning vii), regularisation by noise results for the transport equation were first established for Brownian noise in [Reference Flandoli, Gubinelli and Priola40] and further developed in [Reference Fedrizzi and Flandoli38, Reference Mohammed, Nilssen and Proske78]; see also [Reference Catellier21, Reference Galeati and Gubinelli49, Reference Nilssen79] for further investigations in the fractional case.
The scope of some intermediate estimates we obtain is larger than (A), and therefore, in some regime where we do not obtain strong well-posedness, we still get compactness and therefore existence of weak solutions. To state the result, we need to enforce the following different condition:
 $$ \begin{align} H\in(0,1) ,\quad q\in(1,\infty],\quad \alpha>\frac{1}{2}-\frac{1}{2H}, \quad \alpha>1-\frac{1}{H q'}. \end{align} $$
The proof of Theorem 1.5 is presented in Section 8, where we also define rigorously what we mean by weak solution to (1.6) in this case; see Theorem 8.2 for a more precise statement.
Theorem 1.5. Assume (B) and let $x_0\in \mathbb {R}^d$, $b\in L^q_t C^\alpha _x$; then there exists a weak solution to the SDE (1.6).
Remark 1.6. For $b\in L^q_t C^\alpha _x$ with $\alpha>0$, existence of weak solutions can be shown classically by standard Peano-type arguments for any choice of $H\in (0,\infty )\setminus \mathbb {N}$. Therefore, condition (B) is of real interest only when considering $\alpha <0$; in this case, $H\in (0,1)$ is not a real restriction, as it follows from the first condition on $\alpha $. Note further that in the case $q\in (1,2]$, it always holds $1-\frac {1}{Hq'}\geq \frac {1}{2}-\frac {1}{2H}$, and so (B) reduces to (A); thus, the interesting cases covered by Theorem 1.5 are for $q\in (2,\infty ]$.
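As a sanity check on the inequality between the two thresholds in Remark 1.6 (a routine computation included here only as a sketch; it uses nothing beyond $1/q'=1-1/q$), one can write their difference as
 $$ \begin{align*} \Big(1-\frac{1}{Hq'}\Big)-\Big(\frac{1}{2}-\frac{1}{2H}\Big)=\frac{1}{2}+\frac{1}{H}\Big(\frac{1}{2}-\frac{1}{q'}\Big). \end{align*} $$
For $q\in (1,2]$, one has $q'\geq 2$, so the last bracket is nonnegative and the right-hand side is at least $1/2$. For $q\in (2,\infty ]$, the bracket is negative, and the right-hand side becomes negative precisely when $H<1-2/q$; only then is the first condition on $\alpha $ in (B) the more restrictive one.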
Remark 1.7. For $q=\infty $, condition (B) reduces to $b\in L^\infty _t C^\alpha _x$, $\alpha>\frac {1}{2}-\frac {1}{2H}$. In the Brownian case $H=1/2$, this recovers the condition $\alpha>-1/2$ obtained in [Reference Flandoli, Issoglio and Russo41], which showed uniqueness in law. Recently, [Reference Kremp and Perkowski63, Theorem 6.7] provided counterexamples to uniqueness in law for Brownian SDEs with drifts $b\in C_t C^\alpha _x$, for any $\alpha \leq -1/2$; non-uniqueness here is meant in the class of ‘canonical weak solutions’ (i.e., satisfying a definition à la Bass-Chen [Reference Bass and Chen6]; cf. Definition 8.1 below). So there can be a nontrivial gap between well-posedness results and the prediction offered by scaling arguments. On the positive side, recently, [Reference Butkovsky and Mytnik17] proved uniqueness in law of the solutions constructed by Theorem 1.5, at least in the case $H\in (0,1/2]$ and autonomous drift $b\in C^\alpha _x$ with $\alpha>\frac {1}{2}-\frac {1}{2H}$.
Remark 1.8. One fundamental stochastic analytic tool that still applies in the non-Markovian fBm setting is Girsanov’s transform. Indeed, it is heavily used in the seminal works [Reference Catellier and Gubinelli20, Reference Nualart and Ouknine81] and many subsequent ones. However, it has its limitations: in our setting, it only applies under the additional assumption $1-1/(q'H)<0$ (which, in turn, may only happen if $H\in (0,1/2)$); see Appendix C for details. Even in the Brownian case $H=1/2$, our methods yield results beyond the scope of Girsanov’s theorem, which is not available for $q<2$; see Remark 1.9 below. Therefore, throughout the article, we avoid Girsanov’s transform altogether.
Another motivation for a Girsanov-free approach is to develop tools that are robust enough to extend to other classes of process; see [Reference Butkovsky, Dareiotis and Gerencsér13] for some first results on such equations via stochastic sewing for Lévy-driven SDEs and Remarks 1.12–1.13 below for other classes of Gaussian processes which fit our framework.
Remark 1.9. Theorem 1.4 gives new results also in the classical $H=1/2$ case. Indeed, to solve (1.6) with classical tools, one would require a good solution theory of the corresponding Kolmogorov equation
 $$ \begin{align} \partial_t u-\tfrac{1}{2}\Delta u=b\cdot\nabla u. \end{align} $$
Suppose that $b\in L^q_t C^\alpha _x$ with $q\in (1,2)$. Then the naive power counting fails: replacing first u by a smooth function on the right-hand side gives, by Schauder estimates, $u\in L^\infty _t C^\beta _x$ with $\beta =\alpha +2-2/q$, and so $b\cdot \nabla u\in L^q_t C^{\alpha +1-2/q}_x$. Since $\alpha +1-2/q<\alpha $, iterating the procedure implies worse and worse spatial regularity on u, and after finitely many steps, the product $b\cdot \nabla u$ becomes even ill-defined. This is somewhat similar to the issue of the Kolmogorov equation of Lévy SDEs with low stability index, which was circumvented in [Reference Chaudru de Raynal, Menozzi and Priola25]. After this manuscript appeared, Schauder estimates for (1.13) with $b\in L^q_t C^\alpha _x$ with $q\in (1,2)$ were developed in [Reference Wei, Hu and Yuan98].
Remark 1.10. By the embedding $L^p_x\subset C^{-d/p}_x$, our result immediately implies well-posedness of (1.6) with $L^q_t L^p_x$ drift in the full subcritical regime (with respect to p) (1.8) if $q\in (1,2]$, which can be seen as a fractional analogue of [Reference Krylov and Röckner68]. Note that unlike in [Reference Lê71], $p\in [1,2)$ is also allowed.
 The rest of the article is structured as follows. In Section 1.3, we present some counterexamples in the supercritical regime, demonstrating that (up to reaching the critical equality) condition (A) cannot be improved; we then conclude the introduction by recalling some fundamental properties of fBm in Section 1.4 and by introducing the main notations used throughout the paper in Section 1.5. In Section 2, we state and prove some fundamental lemmata, including the aforementioned a priori estimates for solutions of (1.6) and the two new forms of the stochastic sewing lemma of [Reference Lê71]. Section 3 contains further estimates for additive functionals of processes, as well as a key stability property of solutions. In Sections 4 and 5, we use these estimates to establish well-posedness of (1.6); we distinguish the cases $\alpha>0$ and $\alpha <0$, which require different analyses. Along the way, we prove the existence of a solution semiflow, which we upgrade to a flow of diffeomorphisms in Section 6. Section 7 contains applications of our stability estimates to McKean-Vlasov equations. In Section 8, we construct weak solutions under condition (B), via a compactness argument enabled by the available a priori estimates. In Section 9, we show $\rho $-irregularity of solutions and more general perturbations of fractional Brownian motions. Finally, Section 10 contains applications to transport and continuity equations. In the appendices, we collect some useful tools for which we did not find exact references in the literature: Appendix A contains variants of the Kolmogorov continuity criterion, and Appendix B gives two basic bounds for solutions of Young differential equations. In Appendix C, we summarise relations of various Sobolev spaces and their use in the Girsanov transform for fractional Brownian motions.
1.3 Counterexamples to uniqueness in the supercritical regime
 Although the scaling argument is heuristic, one can often construct counterexamples in the supercritical case. The constructions below are motivated by [Reference Chaudru de Raynal22], which gives counterexamples for $q=\infty $, $\alpha>0$.
 Assume $d\geq 1$, $H\in (0,1)$ and $(\alpha ,q)\in \mathbb {R}\times (1,\infty )$ satisfy
 $$ \begin{align} \alpha<1-\frac{1}{H q'},\qquad \alpha>-1; \end{align} $$
let B be an $\mathbb {R}^d$-valued stochastic process such that $\mathbb {P}$-almost surely $B\in C^\gamma $ for all $\gamma \in (0,H)$. We claim that under (1.14), there exists $b\in L^q_t C^\alpha _x$ such that the equation
 $$ \begin{align} X_t=x+\int_0^t b_s(X_s)\,ds+B_t \end{align} $$
with initial condition $x_0=0$ has at least two solutions whose laws are mutually singular.
 We will treat separately the cases $\alpha \in (0,1)$ and $\alpha \in (-1,0)$.
 For $\alpha \in (0,1)$, the construction is actually one-dimensional and can be extended trivially to higher dimensions by taking $b=(b^i)_{i=1}^d$ with $b^i\equiv 0$ for $i\geq 2$; therefore, here we will set $d=1$. Take $\tilde q>q$ such that $(\alpha ,\tilde q)$ still satisfies (1.14) and define the function
 $$ \begin{align*} b_t(x)=t^{-1/\tilde q}\,{\mathrm{sign}}(x)|x|^\alpha; \end{align*} $$
clearly, $b\in L^q_t C^\alpha _x$. Let $\gamma =1/(\tilde q'(1-\alpha ))$; by definition, $\gamma $ satisfies the identity
 $$ \begin{align} \gamma=1-\frac{1}{\tilde q}+\gamma\alpha, \end{align} $$
and furthermore, $\gamma <H$ thanks to (1.14). Fix furthermore $\delta>0$ small such that $\delta ^\alpha /\gamma>2\delta $, which exists since $\alpha \in (0,1)$. Take $x\in (0,1]$ and consider a weak solution $(X^x,B)$ of (1.15), which is well-known to exist due to the spatial continuity and sublinear growth of b. Define the stopping time
 $$ \begin{align*} \tilde\tau:=\inf\{t>0:\, |B_t|\geq \delta t^\gamma\}\wedge 1; \end{align*} $$
it is strictly positive $\mathbb {P}$-almost surely since $\gamma <H$ and $B\in C^{\tilde \gamma }$ with $\tilde \gamma \in (\gamma ,H)$. Also define
 $$ \begin{align*} \tau_x:=\inf\{t\geq 0:\, X_t^x\leq\delta t^\gamma\}\wedge 1. \end{align*} $$
We claim that $\tau _x\geq \tilde \tau $. Indeed, $\tau _x>0$ since $x>0$, and for all $t\leq \tau _x$ by (1.15) and our construction, it holds
 $$ \begin{align} X_t^x>\int_0^ts^{-1/\tilde q}(\delta s^\gamma)^{\alpha}\,ds+B_t= (\delta^\alpha/\gamma) t^\gamma+B_t>\delta t^\gamma+\big(\delta t^\gamma+B_t\big), \end{align} $$
where in the intermediate passage, we used (1.16). Since $\tau _x\geq \tilde \tau>0$ $\mathbb {P}$-a.s., there exists $\rho>0$ independent of $x\in (0,1]$ such that
 $$ \begin{align*} \mathbb{P}(\tau_x>\rho)\geq 3/4. \end{align*} $$
The laws of $(X^x,B)$ on $C([0,1])^2$ are tight, and therefore by Skorohod’s representation theorem, we may assume that for a sequence $x_n\searrow 0$, the random variables $(X^{x_n},B^{x_n})$ live on the same probability space and converge in $C([0,1])^2$ $\mathbb {P}$-a.s. The limit $(X^{0,+},B^{0,+})$ is a solution to (1.15) with initial condition $0$ and satisfies
 $$ \begin{align*} \mathbb{P}\big(X^{0,+}_t>0\,\,\forall t\in(0,\rho]\big)&\geq \mathbb{P}\big(X^{0,+}_t\geq \delta t^\gamma\,\,\forall t\in[0,\rho]\big) \\&=\lim_{n\to\infty}\mathbb{P}\big(X^{x_n}_t\geq \delta t^\gamma\,\,\forall t\in[0,\rho]\big)\geq3/4. \end{align*} $$
Since b is odd, we can run the same argument for $y\in [-1,0)$: if $X^y$ is a solution to (1.15), then $-X^y$ is a solution for $(-y,-B)$, and the definition of $\tilde \tau $ only depends on $|B|$. Therefore, for the same choice of $\rho $, in this case, one finds that
 $$ \begin{align*} \mathbb{P}\big(X^y_t \leq -\delta t^\gamma \,\forall t\in(0,\rho]\big) \geq 3/4, \end{align*} $$
and so, by considering a sequence $y_n\nearrow 0$ and arguing by compactness, one can construct another weak solution $(X^{0,-}, B^{0,-})$ to (1.15) with initial condition $0$ satisfying
 $$ \begin{align*} \mathbb{P}\big(X^{0,-}_t<0\,\,\forall t\in(0,\rho]\big)\geq 3/4. \end{align*} $$
This shows that $X^{0,+}$ and $X^{0,-}$ do not have the same law, yielding weak non-uniqueness (we leave it as an exercise to the reader to show that their laws are in fact mutually singular).
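For the reader's convenience, the intermediate equality in (1.17) and the bound $\gamma <H$ used in the case $\alpha \in (0,1)$ can be spelled out as follows; this is only a sketch of a routine verification, relying on nothing beyond (1.14) and (1.16). Since $\gamma \alpha -1/\tilde q=\gamma -1>-1$ by (1.16), the integrand is integrable at $0$ and
 $$ \begin{align*} \int_0^t s^{-1/\tilde q}(\delta s^\gamma)^{\alpha}\,ds=\delta^\alpha\int_0^t s^{\gamma\alpha-1/\tilde q}\,ds=\frac{\delta^\alpha}{1-\frac{1}{\tilde q}+\gamma\alpha}\,t^{1-\frac{1}{\tilde q}+\gamma\alpha}=\frac{\delta^\alpha}{\gamma}\,t^\gamma. \end{align*} $$
Similarly, for $\alpha <1$,
 $$ \begin{align*} \gamma=\frac{1}{\tilde q'(1-\alpha)}<H \quad\Longleftrightarrow\quad 1-\alpha>\frac{1}{H\tilde q'} \quad\Longleftrightarrow\quad \alpha<1-\frac{1}{H\tilde q'}, \end{align*} $$
which is exactly the first inequality in (1.14) for the pair $(\alpha ,\tilde q)$.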
 In the distributional case $\alpha \in (-1,0)$, we have to be a bit more careful since the meaning of the SDE becomes unclear if X gets too close to $0$. To this end, we argue again by stopping times, and the construction we present this time is genuinely d-dimensional. Again, take $\tilde q>q$ such that $(\alpha ,\tilde q)$ still satisfies (1.14) and define a vector field $b=(b^i)_{i=1}^d$ by
 $$ \begin{align*} b^1(t,x)=t^{-1/\tilde q}\,{\mathrm{sign}}(x^1)|x|^{\alpha},\qquad b^i(t,x)\equiv 0 \text{ for } i=2,\ldots,d; \end{align*} $$
again, $b\in L^q_t C^\alpha _x$. Take $x\in (0,1]$ and consider a local-in-time solution $X^x$ of (1.15) with initial condition $x_0=(x,0,\ldots ,0)$, which is well-known to exist due to the spatial regularity of b locally around $x_0$. Define $\gamma $ as before, so that $\gamma <H$ and (1.16) holds; let us furthermore take an auxiliary parameter $\delta $ that will be specified later. Define the stopping times
 $$ \begin{align*} \tilde\tau:=\inf\{t>0:\, |B_t|\geq \delta t^\gamma\}\wedge 1, \quad \tau_x:=\inf\{t\geq 0:\, (X_t^x)^1\leq\delta t^\gamma\}\wedge 1; \end{align*} $$
as before, $\tilde \tau $ is strictly positive $\mathbb {P}$-almost surely since $\gamma <H$. We claim that $\tau _x\geq \tilde \tau $, for which it suffices to show that for $t\leq \tau _x\wedge \tilde \tau $, one has $(X_t^x)^1\geq 2\delta t^\gamma $. If $x>3\delta t^\gamma $, then by simply using the nonnegativity of the first component of b up to $\tau _x$ and the definition of $\tilde \tau $, we see that
 $$ \begin{align*} (X_t^x)^1\geq x+B_t^1\geq 3\delta t^\gamma-\delta t^\gamma, \end{align*} $$
as required. Suppose now that $x\leq 3\delta t^\gamma $. Clearly, for $s\leq \tau _x$, one also has $|X_s^x|\geq \delta s^\gamma $. Inserting this bound in the equation, we get for $s\leq \tau _x\wedge \tilde \tau $
 $$ \begin{align*} (X_s^x)^1&\leq x+\int_0^sr^{-1/\tilde q}(\delta r^\gamma)^{\alpha}\,dr+B_s^1 \\ &=x+ (\delta^{\alpha}/\gamma) s^\gamma+B_s^1 \\ &\leq x+ \big(\delta^{\alpha}/\gamma+\delta\big)s^\gamma; \end{align*} $$
observe that since $\alpha <0$, we find reversed inequalities compared to the previous case. In particular, if $s\geq t/2$, then using $x\leq 3\delta t^\gamma $, we also get
 $$ \begin{align*} (X_s^x)^1 \leq \big(3\delta 2^\gamma+\delta+\delta^{\alpha}/\gamma\big)s^\gamma. \end{align*} $$
For $\delta \in (0,1)$, there exist constants $C',C$ depending only on $d,\alpha ,\gamma $ such that the above bound implies $(X_s^x)^1\leq C'\delta ^{\alpha }s^\gamma $, as well as $|X_s^x|\leq C \delta ^{\alpha }s^\gamma $. Using this bound in the equation once more,
 $$ \begin{align*} (X_t^x)^1&>\int_{t/2}^ts^{-1/\tilde q}\big(C \delta^{\alpha}s^\gamma\big)^{\alpha}\,ds+B_t^1 \\ &\geq (1/2)C^{\alpha}\delta^{\alpha^2}t^\gamma -\delta t^\gamma. \end{align*} $$
At this point (using the condition $\alpha>-1$, so that $\alpha ^2<1$), one can choose $\delta $ small enough so that the right-hand side is bounded from below by $2\delta t^\gamma $. With this, we conclude the proof of the property $\tau _x\geq \tilde \tau $. In other words, for all $t\leq \tilde \tau $, for all $x\in (0,1]$, we have $(X_t^x)^1\geq \delta t^\gamma $. In a symmetric way, for all $t\leq \tilde \tau $, for all $y\in [-1,0)$ we have $(X_t^y)^1\leq -\delta t^\gamma $.
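The last lower bound and the choice of $\delta $ in the case $\alpha \in (-1,0)$ can be made explicit; the following is only a sketch of the computation, with C the constant from the text and using (1.16) to evaluate the integral:
 $$ \begin{align*} \int_{t/2}^t s^{-1/\tilde q}\big(C\delta^{\alpha}s^\gamma\big)^{\alpha}\,ds=\frac{C^{\alpha}\delta^{\alpha^2}}{\gamma}\big(1-2^{-\gamma}\big)t^\gamma\geq \frac{1}{2}C^{\alpha}\delta^{\alpha^2}t^\gamma. \end{align*} $$
The inequality holds because $\gamma \mapsto 1-2^{-\gamma }$ is concave and vanishes at $0$, so $(1-2^{-\gamma })/\gamma $ is nonincreasing on $(0,1]$ and equals $1/2$ at $\gamma =1$. The required bound $(X_t^x)^1\geq 2\delta t^\gamma $ then follows as soon as
 $$ \begin{align*} \frac{1}{2}C^{\alpha}\delta^{\alpha^2}-\delta\geq 2\delta \quad\Longleftrightarrow\quad \delta^{1-\alpha^2}\leq \frac{C^{\alpha}}{6}, \end{align*} $$
which is satisfied for all sufficiently small $\delta $ precisely because $1-\alpha ^2>0$.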
 We now want to pass to the $x\to 0$ limit, which we can do by noticing that the laws of $(B,\tilde \tau ,X^x,X^{-x})$ are tight on the space
 $$ \begin{align*} \mathcal{S}= C([0,1])\times\{(a,g):\,a\in(0,1],g\in C([0,a])^2\} \end{align*} $$
with the metric
 $$ \begin{align*} d\big((f,a,g),(f',a',g')\big)=\|f-f'\|_{C([0,1])}+|a-a'|+\|g-g'\|_{C([0,a\wedge a'])^2}. \end{align*} $$
By Prokhorov’s theorem and Skorohod’s representation, we get a sequence $x_n\to 0$, and on another probability space, a sequence $(\bar B^{x_n},\bar {\tilde \tau }^{x_n},\bar X^{x_n},\bar X^{-x_n})\overset {\mathrm {law}}{=}(B,\tilde \tau ,X^{x_n},X^{-x_n})$ converging $\mathbb {P}$-almost surely as random variables taking values in $\mathcal {S}$. The limits $X^{0,+}:=\lim \bar X^{x_n}$ and $X^{0,-}:=\lim \bar X^{-x_n}$ both solve (1.15) with initial condition $0$ and driving noise $B^{0}:=\lim \bar B^{x_n}$. Moreover, $X^{0,+}_t\geq \delta t^\gamma $ for $t\leq \tilde \tau ^0:=\lim \bar {\tilde \tau }^{x_n}$ and $X^{0,-}_t\leq -\delta t^\gamma $ for $t\leq \tilde \tau ^0$. Since $\tilde \tau ^0\overset {\mathrm {law}}{=}\tilde \tau $, it is $\mathbb {P}$-a.s. positive, and therefore, the laws of $X^{0,+}$ and $X^{0,-}$ are mutually singular (for example, on $C([0,1])$ after extending them as constants after $\tilde \tau ^0$).
).
Remark 1.11. Up to multiplying b by a cutoff function at infinity, by taking $\alpha =-d/(p+\varepsilon )$ for sufficiently small $\varepsilon>0$, the construction presented in the regime $\alpha <0$ provides non-uniqueness for $b\in L^q_tL^p_x$, for any pair $(p,q)\in [1,\infty ]^2$ satisfying
$$ \begin{align} \frac{1}{q}+\frac{Hd}{p}>1-H,\qquad p>d. \end{align} $$
If $H=1/2$, then B can be taken as Brownian motion and (1.18) becomes
$$ \begin{align} \frac{2}{q}+\frac{d}{p}>1,\qquad p>d; \end{align} $$
in particular, the exponents $p,q$ violate the LPS condition (1.9). It is interesting to compare (1.19) to the result from [Reference Krylov64], where weak existence for the Brownian SDE was established under the condition
$$ \begin{align} \frac{1}{q}+\frac{d}{p}\leq 1, \end{align} $$
which is further shown to be optimal by construction of counterexamples in the case $1/q+d/p>1$. Let us also mention [Reference Galeati47] for a heuristic explanation of why condition (1.20) (as well as (1.21) below) arises naturally when only focusing on weak existence results. Our counterexample shows that under (1.20), uniqueness in law in general does not hold, answering a problem left open in [Reference Krylov64] (see the discussion right above Remark 3.1 therein).
After the completion of this work, it has been further shown in [Reference Butkovsky, Lê and Mytnik16] that in the time-independent case, for $H\in (0,1)$, there exist $b\in C^\alpha $ with supercritical $\alpha <1-1/H$ for which even weak existence does not hold; see Theorem 2.7 therein. More recently, [Reference Butkovsky and Gallay14] expanded the result from [Reference Krylov64] by establishing weak existence of solutions for $H\in (0,1)$ and $b\in L^q_t L^p_x$ with
$$ \begin{align} \frac{1-H}{q}+ \frac{H d}{p} < 1-H. \end{align} $$
Combined with our counterexample, one gets a regime (namely, the intersection of (1.18) and (1.21)) where weak existence holds but uniqueness in law does not.
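The arithmetic behind these exponent conditions can be checked mechanically. The following Python sketch is purely illustrative: the function names are hypothetical, it only encodes the inequalities (1.18) and (1.21) as stated above, and it assumes inputs $p,q\in [1,\infty ]$, with `math.inf` standing for $\infty $.

```python
import math


def satisfies_1_18(hurst, d, p, q):
    """Condition (1.18): 1/q + H*d/p > 1 - H together with p > d."""
    return (1.0 / q + hurst * d / p > 1.0 - hurst) and (p > d)


def satisfies_1_21(hurst, d, p, q):
    """Condition (1.21): (1 - H)/q + H*d/p < 1 - H."""
    return (1.0 - hurst) / q + hurst * d / p < 1.0 - hurst


def in_gap_regime(hurst, d, p, q):
    """True when (p, q) lies in the intersection of (1.18) and (1.21)."""
    return satisfies_1_18(hurst, d, p, q) and satisfies_1_21(hurst, d, p, q)


if __name__ == "__main__":
    # Brownian case H = 1/2 in dimension d = 1, two sample exponent pairs.
    print(in_gap_regime(0.5, 1, 4.0, 2.0))
    print(in_gap_regime(0.5, 1, math.inf, math.inf))
```

For instance, for $H=1/2$ and $d=1$, the pair $(p,q)=(4,2)$ satisfies both conditions, so it belongs to the regime in which weak existence holds while uniqueness in law can fail.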
1.4 Preliminaries on fractional Brownian motion
We recall here several facts about fractional Brownian motion (fBm); for some standard references, we refer to [Reference Nualart80, Reference Picard87].
An $\mathbb {R}^d$-valued fBm of Hurst parameter $H\in (0,1)$ is defined as the unique centered Gaussian process with covariance
$$ \begin{align*} \mathbb{E}(B^H_t\otimes B^H_s)=\tfrac{1}{2}\big(|t|^{2H}+|s|^{2H}-|t-s|^{2H}\big) I_d, \end{align*} $$
where $I_d$ denotes the $d\times d$ identity matrix; in other words, its components are i.i.d. one-dimensional fBms. FBm paths are well-known to be $\mathbb {P}$-a.s. $(H-\varepsilon )$-Hölder, but nowhere H-Hölder continuous. FBm admits several representations as a stochastic integral; in particular, given any fBm $B^H$ defined on a probability space, one can construct therein a standard Bm W such that
$$ \begin{align} B^H_t=\int_0^t K_H(t,r)\mathrm{d} W_r\quad\forall\, t \geq 0. \end{align} $$
Such a Volterra kernel representation is referred to as canonical since $B^H$ and W generate the same filtration. The exact formula for the kernels $K_H$ can be found in, for example, [Reference Nualart and Ouknine81]. For our purposes, it is enough to recall that $K_H$ is deterministic and $K_H(t,\cdot )\in L^2([0,t])$.
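Since the law of fBm is determined by its covariance, finite-dimensional samples can be drawn by factorising the covariance matrix. The following Python sketch does this with a hand-written Cholesky factorisation (standard library only). The function names, the restriction to strictly positive and distinct times, and the choice of a Cholesky-based sampler are assumptions of this illustration rather than part of the argument of the paper; it applies only to $H\in (0,1)$.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance E[B_s B_t] of a one-dimensional fBm with H in (0, 1)."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(t) ** h2 + abs(s) ** h2 - abs(t - s) ** h2)


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def sample_fbm(times, hurst, dim=1, seed=None):
    """Sample an R^dim-valued fBm on strictly positive, distinct times.

    Returns `dim` coordinate paths, each a list with len(times) values.
    """
    rng = random.Random(seed)
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    low = cholesky(cov)
    paths = []
    for _ in range(dim):  # the components are i.i.d. one-dimensional fBms
        noise = [rng.gauss(0.0, 1.0) for _ in times]
        paths.append([sum(low[i][k] * noise[k] for k in range(i + 1))
                      for i in range(len(times))])
    return paths
```

The time $0$ is excluded because $B^H_0=0$ makes the covariance matrix singular; the value at $0$ can simply be prepended to each path.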
Another standard representation of fBm is the one introduced in [Reference Mandelbrot and van Ness74]: given $B^H$, one can construct a two-sided Bm $\tilde W$ such that
$$ \begin{align} B^H_t = \gamma_H \int_{-\infty}^t \big[(t-r)_+^{H-1/2}-(-r)_+^{H-1/2}\big]\, \mathrm{d} \tilde W_r, \end{align} $$
where $\gamma _H=\Gamma (H+1/2)^{-1}$ is a normalizing constant and $x_+$ denotes the positive part.
We will mostly work with representation (1.22), but we invite the reader to keep in mind (1.23) since it is usually easier to manipulate in order to derive key properties of the process, like its local nondeterminism; see (1.24) and the discussion below. Given a filtration $\mathbb {F}$, we say that $B^H$ is an $\mathbb {F}$-fBm if the associated W given by (1.22) is an $\mathbb {F}$-Brownian motion.
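As a hedged illustration of why (1.23) is convenient, consider the following simplified situation, which is an assumption of this sketch and not the setting fixed above: let $H\in (0,1)$ and condition on the (hypothetically named) $\sigma $-algebra $\mathcal {G}_s$ generated by the increments of $\tilde W$ up to time $s\geq 0$. For $s\leq t$, the term $(-r)_+^{H-1/2}$ vanishes for $r\in (s,t]$, the part of the integral in (1.23) over $(-\infty ,s]$ is $\mathcal {G}_s$-measurable, and the part over $(s,t]$ is centered and independent of $\mathcal {G}_s$. Itô's isometry then gives
$$ \begin{align*} B^H_t-\mathbb{E}[B^H_t|\mathcal{G}_s] &= \gamma_H\int_s^t (t-r)^{H-1/2}\,\mathrm{d}\tilde W_r,\\ \mathrm{Cov}\big(B^H_t-\mathbb{E}[B^H_t|\mathcal{G}_s]\big) &= \gamma_H^2\int_s^t (t-r)^{2H-1}\,\mathrm{d} r\; I_d = \frac{\gamma_H^2}{2H}\,|t-s|^{2H}\, I_d, \end{align*} $$
which has the same form as (1.24), with constant $\gamma _H^2/(2H)$ in this simplified setting.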
FBm of parameter $H=1$ is somewhat trivial or ill-defined (see [Reference Picard87]); however, one can extend the definition to all values $H\in (0,+\infty )\setminus \mathbb {N}$ inductively as in [Reference Perrin, Harba, Berzin-Joseph, Iribarren and Bonami86] by $B^{H+1}_t:=\int _0^t B^H_s\mathrm {d} s$.
Such a definition is consistent with most aforementioned properties: it is still a centered Gaussian process, with trajectories $\mathbb {P}$-a.s. in $C^{H-\varepsilon }_t$ but nowhere $C^H_t$, satisfying the scaling relation (1.5); using stochastic Fubini, one can also easily derive representations similar to (1.22)–(1.23). A key consequence of the last property is that for any $H\in (0,+\infty )\setminus \mathbb {N}$, there exists a constant $c_H\in (0, +\infty )$ such that
$$ \begin{align} \mathrm{Cov} \big(B^H_t - \mathbb{E}_s B^H_t \big) = c_H |t-s|^{2H} I_d \quad\forall \, s\leq t \end{align} $$
(see [Reference Gerencsér53, Proposition 2.1]); here, $\mathbb {E}_s B^H_t:=\mathbb {E}[B^H_t|\mathcal {F}_s]$, where $\mathcal {F}_s$ can be the natural filtration of $B^H$ or more generally any filtration such that $B^H$ is an $\mathbb {F}$-fBm. Property (1.24) is a special form of strong local nondeterminism (LND) [Footnote 4]; see [Reference Galeati and Gubinelli48, Section 2.4] for a deeper discussion of its relevance to regularisation by noise. Since conditional expectations are also $L^2$-projections, $B^H_t-\mathbb {E}_s B^H_t$ and $\mathbb {E}_s B^H_t$ are orthogonal Gaussian variables, and thus independent; more generally, $B^H_t-\mathbb {E}_sB^H_t$ is independent of all the history up to time s. Therefore, for any $s\leq t$, any bounded measurable function $f:\mathbb {R}^d\to \mathbb {R}$ and any $\mathcal {F}_s$-measurable random variable X, it holds that
$$ \begin{align} \mathbb{E}_s f(B^H_t+X)= P_{ \mathrm{Cov}(B^H_t - \mathbb{E}_s B^H_t)} f(\mathbb{E}_s B^H_t+X) = P_{c_H |t-s|^{2H} I_d} f(\mathbb{E}_s B^H_t+X), \end{align} $$
where in the last passage, we applied (1.24); here, given a symmetric nonnegative $\Sigma $, $P_\Sigma $ denotes the convolution with the Gaussian density $p_\Sigma $ associated to $\mathcal {N}(0,\Sigma )$. Throughout the paper, we will adopt the convention that $P_{t I_d}=P_t$, in agreement with the standard notation for heat kernels, and for simplicity, we will drop the constant $c_H$, so that in expressions like (1.25), only $P_{|t-s|^{2H}}$ will appear.
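The Gaussian conditioning behind (1.24)–(1.25) has a finite-dimensional analogue that can be explored numerically. The sketch below (Python, standard library only; all function names are hypothetical) computes the conditional variance of a one-dimensional fBm $B^H_t$ with $H\in (0,1)$, given its values at finitely many earlier times, through the Schur complement formula. Conditioning on finitely many values rather than on the whole of $\mathcal {F}_s$ is a simplification made here, so the output is only an upper bound for the conditional variance appearing in (1.24).

```python
import math


def fbm_cov(s, t, hurst):
    """Covariance E[B_s B_t] of a one-dimensional fBm with H in (0, 1)."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(t) ** h2 + abs(s) ** h2 - abs(t - s) ** h2)


def conditional_variance(past_times, t, hurst):
    """Var(B_t | B_r, r in past_times) for a one-dimensional fBm.

    Uses the Gaussian formula C_tt - c^T C^{-1} c, where C is the covariance
    matrix at past_times (strictly positive, distinct) and c the vector of
    covariances with B_t. C is factorised as L L^T and L y = c is solved by
    forward substitution, so that c^T C^{-1} c = |y|^2.
    """
    n = len(past_times)
    low = [[0.0] * n for _ in range(n)]
    y = [0.0] * n
    for i in range(n):
        for j in range(i + 1):
            acc = fbm_cov(past_times[i], past_times[j], hurst)
            acc -= sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
        c_i = fbm_cov(past_times[i], t, hurst)
        y[i] = (c_i - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    return fbm_cov(t, t, hurst) - sum(v * v for v in y)
```

For $H=1/2$ and past times ending at $s$, the function returns $t-s$ up to rounding, as expected from the Markov property of Brownian motion; for other values of H, refining the grid of past times can only decrease the output.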
Remark 1.12. At the price of slightly anticipating some key concepts which will be introduced throughout the paper, let us discuss here how our methods extend to a larger class of random perturbations $B^H$ than just pure fBm. The main requirement we need, relaxing (1.24), is for $B^H$ to be a Gaussian process [Footnote 5] satisfying a two-sided bound
$$ \begin{align} C^{-1} |t-s|^{2H} I_d \leq \mathrm{Cov} \big( B^H_t-\mathbb{E}_s B^H_t \big)\leq C |t-s|^{2H} I_d \end{align} $$
for some $C\in (0,+\infty )$ and for all $s<t$ with $|t-s|$ sufficiently small; here, $\mathcal {F}_t$ is the natural filtration of $B^H$. More precisely, the upper bound in (1.26) provides a priori estimates in the style of Lemma 2.1, while the lower bound (which is the actual LND property) ensures the regularising effect of $B^H$ and the application of stochastic sewing techniques. Indeed, by using properties of Gaussian convolutions, heat kernel bounds and a relation of the form (1.25), one can still find estimates of the form
$$ \begin{align*} \| \mathbb{E}_s f(B^H_t+X)\|_{L^\infty} & = \| \big( P_{ \mathrm{Cov} (B^H_t-\mathbb{E}_s B^H_t)} f \big) (\mathbb{E}_s B^H_t+X)\|_{L^\infty} \leq \| P_{ \mathrm{Cov} (B^H_t-\mathbb{E}_s B^H_t)} f\|_{L^\infty}\\ & \lesssim \| P_{C^{-1}|t-s|^{2H}} f\|_{L^\infty} \lesssim |t-s|^{\alpha H} \| f\|_{C^\alpha}, \end{align*} $$
for $\alpha \leq 0$, which are the typical bounds needed throughout the proof. There are some passages where condition (1.26) alone is not enough, and we exploited other properties of fBm. Specifically, the counterexamples in Section 1.3 assume $B^H$ to be $(H-\varepsilon )$-Hölder continuous and symmetric; the flows constructed in Sections 4–5 need some basic time-continuity $\mathbb {E}|B^H_t-B^H_s|\lesssim |t-s|^{H\wedge 1}$ in order to apply Kolmogorov-type criteria; more substantially, the results from Section 8 rely on a Volterra representation $B^H_t =\int _0^t K(t,s) \mathrm {d} W_s$. These properties are satisfied by other interesting examples – for instance, type-II fBm and mixed fBm discussed in Remark 1.13 below.
The only section truly specific to fBm is Appendix C, which, exactly for this reason, is not used in the main body of the paper. There, ad hoc criteria to check the applicability of the Girsanov transform for fBm are presented; any extension to other processes would require precise knowledge of the associated kernel $K(t,s)$, and its verification can be very technical; cf. [Reference Nualart and Sönmez82].
Remark 1.13. Standard examples of processes satisfying (1.26) are deterministic additive perturbations of fBm (cf. Lemma 6.7), the so-called type-II fBm [Reference Marinucci and Robinson75] and mixed fBm introduced in [Reference Cheridito27]; given any $H_1\neq H_2$, the process $B^{H_1}+B^{H_2}$ will satisfy condition (1.26) with $H=H_1\wedge H_2$, both in the case where $B^{H_1}$ and $B^{H_2}$ are sampled independently and in the case where they are constructed from the same reference Brownian motion. In this case, our results yield a far-reaching generalization (also to any $d\geq 2$) of the ones provided in [Reference Nualart and Sönmez82] while not requiring the highly technical use of the Girsanov transform made therein.
Another interesting example is bifractional Brownian motion of parameters $(H,K)$ (see [Reference Russo and Tudor90]), which is known to be LND with parameter $HK$ [Reference Tudor and Xiao95]; it is a generalization of fBm ($K=1$), but even in the case $HK=1/2$ it is neither a semimartingale nor a Dirichlet process, although it scales like standard Bm. Our results show that it has a comparable regularising effect, although it is not amenable to Markovian/martingale techniques.
Another generalization of fBm is the so-called multifractional Brownian motion, in which the Hurst parameter is allowed to vary continuously in time, $H=H(t)$; two nonequivalent definitions for this process are given respectively in [Reference Peltier and Véhel85] (by modifying representation (1.23) by allowing $H=H(t)$) and in [Reference Benassi, Jaffard and Roux9] (by a harmonisable representation). In both cases, the process can be shown to be ‘locally LND around t’ with parameter $H(t)$ (see [Reference Ayache, Shieh and Xiao4] in the harmonisable case), and thus, we still expect our strategy to yield interesting results under appropriate modifications. Likely, the admissible range of $\alpha $ here would depend on both the supremum and infimum of $H(t)$; we leave more precise investigations for future research.
Finally, let us mention that for (sufficiently regular) solutions $u(x,t)$ to certain linear stochastic PDEs, for any fixed x, the process $t\mapsto u(x,t)$ is LND (see, for example, [Reference Tudor and Xiao96]); this fact was exploited crucially in regularisation by noise for nonlinear SPDEs in [Reference Athreya, Butkovsky, Lê and Mytnik3].
1.5 Setup and notation
We provide here in a list all the main notations and conventions adopted throughout the paper.
• We always work on the time interval $t\in [0,1]$. Increments of functions f on $[0,1]$ are denoted by $f_{s,t}:=f_t-f_s$.
• Whenever considering a filtered probability space $(\Omega ,\mathcal {F},\mathbb {F},\mathbb {P})$, we will implicitly assume that the filtration $\mathbb {F}=(\mathcal {F}_t)_{t\in [0,1]}$ satisfies the standard assumptions; in particular, $\mathcal {F}_0$ is complete. To denote conditional expectations, we use the shortcut notation $\mathbb {E}_s Y :=\mathbb {E} [Y| \mathcal {F}_s]$.
• $L^m$-norms without further notation are understood with respect to $\omega $; that is, $\|Y\|_{L^m}=\big (\mathbb {E}|Y|^m\big )^{1/m}$ for $m<\infty $ and $\|Y\|_{L^\infty }=\mathrm {esssup}_{\omega \in \Omega }|Y(\omega )|$. For conditional $L^m$-norms, we use the notation $\|Y\|_{L^m|\mathcal {F}_s}=\big (\mathbb {E}(|Y|^m|\mathcal {F}_s)\big )^{1/m}.$ For any $X,Y\in L^m$ such that Y is $\mathcal {F}_s$-measurable, by conditional Jensen’s inequality, one has the $\mathbb {P}$-a.s. bound
$$ \begin{align} \|X-\mathbb{E}_s X\|_{L^m|\mathcal{F}_s} \leq \|X-Y\|_{L^m|\mathcal{F}_s} +\|Y-\mathbb{E}_s X\|_{L^m|\mathcal{F}_s} \leq 2\|X-Y\|_{L^m|\mathcal{F}_s}. \tag{1.27} \end{align} $$
Apart from the usual $L^m$-norms, we also use the norms $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$. We will always consider $n\geq m$, in which case again by conditional Jensen, it holds
$$ \begin{align*} \| X\|_{L^m} \leq \big\|\,\| X \|_{L^m|\mathcal{F}_s}\big\|_{L^n} \end{align*} $$
with equality in the case $m=n$. Such mixed norms still satisfy natural analogues of classical inequalities like Jensen’s, Hölder’s and Minkowski’s, as can be verified using properties of conditional expectation. Moreover, by the tower property, one can see that for $t\geq s$, $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_t}\big \|_{L^n}$ is stronger than $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$.
• Whenever talking about a weak solution X to the SDE (1.6), we will actually mean a tuple $(X,B^H; \Omega , \mathbb {F}, \mathbb {P})$ such that $(\Omega ,\mathbb {F},\mathbb {P})$ is a filtered probability space as above, X is $\mathbb {F}$-adapted and $B^H$ is an $\mathbb {F}$-fBm of parameter H. As usual, X is a strong solution if it is adapted to (the standard augmentation of) the filtration generated by $B^H$. We say that pathwise uniqueness holds for the SDE if for any two solutions $X^1$, $X^2$, defined on the same $(\Omega ,\mathbb {F},\mathbb {P})$, driven by the same $B^H$ and with the same initial condition $x_0$, it holds that $X^1\equiv X^2$ $\mathbb {P}$-a.s. We warn the reader that all such concepts are rather classical when b is at least a measurable function, so that (1.6) is meaningful in the Lebesgue sense. In the distributional regime $\alpha <0$, this is not the case anymore. Therefore, the concept of weak solution becomes less standard. We postpone this discussion to the relevant Section 5, and similarly for the concept of path-by-path uniqueness.
• Function spaces in the variable $x\in \mathbb {R}^d$ will often be denoted by the subscript x. For instance, standard Lebesgue spaces $L^p(\mathbb {R}^d;\mathbb {R}^m)$ with $p\in [1,\infty ]$ will often be denoted, when the target dimension m is clear, simply by $L^p_x$. For $\alpha \in \mathbb {R}\setminus \mathbb {N}$, we denote by $C^\alpha _x$ the inhomogeneous Hölder-Besov space $B^{\alpha }_{\infty ,\infty }$ (cf. [Reference Bahouri, Chemin and Danchin5]); instead, for nonnegative integer $\alpha $, by $C^\alpha _x$, we mean the space of bounded measurable functions all of whose partial weak derivatives up to order $\alpha $ are also essentially bounded and measurable (in other words, $C^\alpha _x=W^{\alpha ,\infty }_x$ Sobolev spaces); note that with this convention, elements of $C^0_x$ are not necessarily continuous. Recall that for $\alpha \in (0,1)$, the space $C^\alpha _x=B^\alpha _{\infty ,\infty }$ coincides with the usual space of bounded $\alpha $-Hölder continuous functions. By $C^{\alpha ,\mathrm {loc}}_x$, we mean the space of functions f such that for all compactly supported smooth g, one has $f g\in C^\alpha _x$. More quantitative versions of them are the weighted Hölder spaces $C^{\alpha ,\lambda }_x$, for $\alpha \in (0,1]$ and $\lambda \in \mathbb {R}$, defined through the (semi)norms […] where $B_R$ is the ball of radius R around the origin.
• Given a Banach space E, we will use the shortcut notation $L^q_t E$ to denote the space $L^q([0,1];E)$ of Bochner measurable functions with finite norm $\| f\|_{L^q E}^q=\int _0^1 \| f_t\|_E^q\, \mathrm {d} t$ for any $q\in [1,\infty ]$ (up to the standard essential supremum convention for $q=\infty $). We use the shortcut notation $C_t E = C([0,1];E)$ for the space of continuous, E-valued functions with supremum norm; similarly, for $\gamma \in (0,1)$, $C^\gamma _t E= C^\gamma ([0,1];E)$ is the space of E-valued, bounded and $\gamma $-Hölder continuous functions. All definitions can be extended classically to Fréchet spaces E (in particular, allowing for $E=C^{\alpha ,\mathrm {loc}}_x$ or $L^{p,\mathrm {loc}}_x$) – for instance, in the case of $L^q_t E$ by requiring the associated countable seminorms $t\mapsto \| f_t \|_k$ to be all $L^q$-integrable.
• Given a metric space E and $p\in [1,\infty )$, we say that a continuous E-valued function f on $[0,1]$ is of finite p-variation, in notation $f\in C^{p-{\mathrm {var}}}_t E$, if […] where the supremum runs over all possible partitions $0=t_0\leq t_1\leq \cdots \leq t_n=1$ of $[0,1]$. The p-variation seminorm on subintervals $[s,t]\subset [0,1]$ is defined similarly and denoted by […]. Whenever $E=\mathbb {R}^m$ for some $m\in \mathbb {N}$, for simplicity, we just drop it and write $C^{p-{\mathrm {var}}}_t$, […], and similarly for $C^\alpha _t$.
• All the notations introduced above can be concatenated by considering a different Banach/Fréchet space at each step. The convention we adopt is that, when writing spaces with respect to different variables, this is to be read from left to right; for example, $L^q_t C^\alpha _x L^m$ stands for $L^q\big ([0,1], C^\alpha (\mathbb {R}^d,L^m(\Omega ))\big )$. Similarly, one can define, for example, $L^m C^{p-{\mathrm {var}}}_t C^{\alpha ,\mathrm {loc}}_x$, $C^\gamma _t L^\infty _x$, and so on. Mind in particular that with this convention $C^\alpha _t C^\alpha _x\neq C^\alpha _{t,x}$, the latter denoting the space of $\alpha $-Hölder continuous functions in $(t,x)$.
• Let us recall some standard heat kernel estimates: for any $\alpha \geq \beta $, there exists a constant $N=N(d,\alpha ,\beta )$ such that, for all $t\in (0,1]$, one has the bound
$$ \begin{align} \|P_{t}f\|_{C^\alpha_x}\leq N t^{(\beta-\alpha)/2}\|f\|_{C^\beta_x}; \tag{1.28} \end{align} $$
see [Reference Galeati and Gubinelli49, Lemma A.10] and the references therein for a more general statement.
• For $0\leq S\leq T\leq 1$, we denote $[S,T]^2_\leq =\{(s,t)\in [S,T]^2:\,s\leq t\}$ and $[S,T]^3_\leq =\{(s,u,t)\in [S,T]^3:\,s\leq u\leq t\}$. For $(s,t)\in [S,T]^2_\leq $, define $s_-=s-(t-s)$. We then set the slightly more restricted sets of pairs/triples as
$$ \begin{align*} & \overline{[S,T]}^2_\leq:=\{(s,t)\in[S,T]^2_\leq:\,s_-\geq S\},\\ & \overline{[S,T]}^3_\leq:=\{(s,u,t)\in[S,T]^3_\leq:\,(u-s)\wedge(t-u)\geq (t-s)/3,\,s_-\geq S\}. \end{align*} $$
• Given a Fréchet space E and a map $A:[S,T]^2_\leq \to E$, we define $\delta A:[S,T]^3_\leq \to E$ by $\delta A_{s,u,t} = A_{s,t}- A_{s,u}- A_{u,t}$.
• We say that a function $w:[0,1]^2_\leq \to \mathbb {R}_+ $ is a control if it is continuous and superadditive (i.e., $w(s,u)+w(u,t)\leq w(s,t)$ for all $(s,u,t)\in [0,1]^3_\leq $). The most common controls for us will be of the form
$$ \begin{align} w_{b,\alpha,q}(s,t):=\int_s^t\|b_r\|_{C^\alpha_x}^q\,dr. \tag{1.29} \end{align} $$
Recall that for any two controls $w_1,w_2$ and $\theta _1,\theta _2\in [0,\infty )$ such that $\theta _1+\theta _2\geq 1$, $w=w_1^{\theta _1}w_2^{\theta _2}$ is also a control (see [Reference Friz and Victoir45, Exercises 1.8,1.9]). Note also that if w is a control, $\psi $ is an $\mathbb {R}^m$-valued path and $\gamma \in (0,1]$, then
$$ \begin{align} \|\psi\|_{\frac{1}{\gamma}-{\mathrm{var}}}\leq w(0,1)^\gamma\sup_{0\leq s<t\leq 1}\frac{|\psi_{s,t}|}{w(s,t)^\gamma}; \tag{1.30} \end{align} $$
conversely, for $p\geq 1$, if $\psi \in C^{p-{\mathrm {var}}}_t$, then $w(s,t):=\|\psi\|_{p-{\mathrm{var}};[s,t]}^p$ is a control and $|\psi _{s,t}|\leq w(s,t)^{1/p}$; cf. [Reference Friz and Victoir45, Propositions 5.8-5.10].
• The space of probability measures on $\mathbb {R}^d$ is denoted by $\mathcal {P}(\mathbb {R}^d)$. The law of a random variable X is denoted by $\mathcal {L}(X)$. For $p\geq 1$, we denote the p-Wasserstein distance on $\mathcal {P}(\mathbb {R}^d)$ by $\mathbb {W}_p$, defined as
$$ \begin{align*} \mathbb{W}_p(\mu,\nu)^p=\inf_{\gamma\in \Gamma(\mu,\nu)}\int_{\mathbb{R}^{d}\times\mathbb{R}^d}|x-y|^p\gamma(\mathrm{d} x,\mathrm{d} y), \end{align*} $$
where $\Gamma (\mu ,\nu )$ is the set of all couplings of $\mu $ and $\nu $ (i.e., the probability measures on $\mathbb {R}^{d}\times \mathbb {R}^d$ whose first and second marginals are $\mu $ and $\nu $, respectively). Note that $\mathbb {W}_p$ can take value $+\infty $ and is defined for any $\mu $, $\nu $, without any moment assumption.
• When a statement contains an estimate with a constant depending on a certain set of parameters, in the proof, we do not carry the constants from line to line. Rather, we write $A\lesssim B$ to denote the existence of a constant N depending on the same set of parameters such that $A\leq N B$. Whenever such a set of parameters includes a parameter that is a norm (this will typically be the norm of the coefficient b), this dependence is always monotone increasing.
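The superadditivity required of a control, and its stability under products $w_1^{\theta _1}w_2^{\theta _2}$ with $\theta _1+\theta _2\geq 1$, can be checked numerically. The following is a minimal Python sketch and not part of the original argument: the helper names (`integral_control`, `product_control`, `max_superadditivity_defect`), the grid discretisation and the sample integrand standing in for $\|b_r\|_{C^\alpha_x}$ are all illustrative assumptions.

```python
import numpy as np

def integral_control(g, q, n_grid=20001):
    """Approximate w(s, t) = int_s^t g(r)**q dr on [0, 1] (hypothetical helper).

    The integral is discretised with a cumulative trapezoid rule; values
    between grid points are obtained by linear interpolation.
    """
    r = np.linspace(0.0, 1.0, n_grid)
    vals = g(r) ** q
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (vals[1:] + vals[:-1]) * np.diff(r))))
    return lambda s, t: float(np.interp(t, r, cum) - np.interp(s, r, cum))

def product_control(w1, theta1, w2, theta2):
    """Return the candidate control w1**theta1 * w2**theta2."""
    return lambda s, t: w1(s, t) ** theta1 * w2(s, t) ** theta2

def max_superadditivity_defect(w, n_samples=2000, seed=0):
    """Largest sampled value of w(s,u) + w(u,t) - w(s,t) over s <= u <= t."""
    rng = np.random.default_rng(seed)
    triples = np.sort(rng.uniform(0.0, 1.0, size=(n_samples, 3)), axis=1)
    return max(w(s, u) + w(u, t) - w(s, t) for s, u, t in triples)

q = 2.0
# Assumed sample integrand, playing the role of r -> ||b_r||_{C^alpha}.
w_b = integral_control(lambda r: 1.0 + np.sin(6.0 * r) ** 2, q)
w_len = lambda s, t: t - s                                   # the control |t - s|
w_mix = product_control(w_b, 1.0 / q, w_len, 1.0 - 1.0 / q)  # w^{1/q} |t-s|^{1/q'}
defect = max_superadditivity_defect(w_mix)
```

A superadditive function has nonpositive defect, so `defect` should be at most zero up to rounding; the exponents $1/q$ and $1/q'$ are chosen because the product $w^{1/q}|t-s|^{1/q'}$ is the combination that recurs in the estimates of Section 2.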
2 A priori estimates and stochastic sewing
The key consequence of the subcriticality condition (A) is that in terms of local nondeterminism, drifts of solutions are more regular than the noise; in particular, the solution decomposes as $X=\varphi +B^H$, where $\varphi $ plays the role of a slow variable, while $B^H$ is the highly oscillating component (Footnote 6). This can be formulated as a precise quantitative bound by looking at the best conditional error committed by predicting the process $\varphi _t$, given the history up to time s; more precisely, we look for estimates of the form
$$ \begin{align} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\leq w(s,t)^{1/q}|t-s|^{1/q'+\alpha H}\quad \forall\, (s,t)\in [0,1]^2_{\leq}, \end{align} $$
where $m\in [1,\infty )$, w is a suitable control and $(q,\alpha ,H)$ are the parameters related to b, $B^H$.
The subcritical regime $\alpha>1-1/(q' H)$ corresponds to the exponent $1/q'+\alpha H$ appearing in (2.1) being greater than H; this is in stark contrast with the lower bound provided by the LND property of fBm (1.24), which tells us that such an estimate cannot hold for $\varphi $ replaced by $B^H$, justifying the slow-fast heuristic above.
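For the reader's convenience, the equivalence between subcriticality and the exponent comparison can be spelled out in one line; this is only a restatement of the claim above, obtained by multiplying through by $H>0$:
$$ \begin{align*} \alpha>1-\frac{1}{q'H} \quad\Longleftrightarrow\quad \alpha H> H-\frac{1}{q'} \quad\Longleftrightarrow\quad \frac{1}{q'}+\alpha H>H. \end{align*} $$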
It is also worth pointing out that $1/q'+\alpha H$ is allowed to exceed $1$ (this is indeed always the case for $H>1$), which will be used crucially in the following; in this case, the same bound could not hold if in (2.1), $\mathbb {E}_s \varphi _t$ were replaced by $\varphi _s$, as one can easily check that the only processes satisfying the corresponding condition are the constant ones.
It will become clear in the sequel why (2.1) is exactly the right condition needed in our analysis; for the moment, let us show that solutions to SDEs naturally enjoy (2.1).
Lemma 2.1 below is based on an adaptation of [Reference Gerencsér53, Lemma 2.4], [Reference Butkovsky, Dareiotis and Gerencsér13, Lemma 4.2] to our setting. Note that in the statement, while we enforce the subcritical condition $\alpha>1-1/(q'H)$, the restriction $q\leq 2$ is not necessary; we do, however, restrict to $\alpha \geq 0$ first. For distributional drifts, similar bounds will be derived from stochastic sewing; see Lemma 2.4 below.
Lemma 2.1. Let $H\in (0,\infty )\setminus \mathbb {N}$, $q\in [1,\infty )$, and $\alpha \in [0,1]$ satisfy $\alpha>1-1/(q'H)$; let $b\in L^q_t C^\alpha _x$, X be a weak solution of (1.6) and set $\varphi :=X-B^H$, so that
$$ \begin{align*} \varphi_t = x_0 + \int_0^t b_r(X_r) \mathrm{d} r. \end{align*} $$
Then, for any $m\in [1,\infty )$, there exists a constant $N=N(d,H,\alpha ,m,\| b\|_{L^q_t C^\alpha _x})$ such that estimate (2.1) holds with the choice
$$ \begin{align} w(s,t)=N w_{b,\alpha,q}(s,t)= N \int_s^t \| b_r\|_{C^\alpha}^q \mathrm{d} r. \end{align} $$
Proof. First assume that, for some given $\beta \geq 0$, the bound (2.1) holds with w as above and exponent $\beta $ in place of $1/q'+\alpha H$. This is definitely the case with $\beta =1/q'$, as one can see from
$$ \begin{align} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} & \leq 2\big\|\|\varphi_t-\varphi_s\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} \nonumber\\ & \leq 2 \int_s^t \| b_r\|_{C^0}\, \mathrm{d} r \leq 2 w_{b,\alpha,q}(s,t)^{1/q} |t-s|^{1/q'}; \end{align} $$
in the above passages, we applied (1.27), the definition of $\varphi $ and lastly Hölder’s inequality.
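For completeness, the last inequality in the display above can be spelled out as follows. This is a routine verification, assuming the usual convention that the $C^\alpha _x$-norm with $\alpha \geq 0$ dominates the supremum norm, and using Hölder’s inequality with exponents $q$, $q'$ (for $q=1$ the factor $|t-s|^{1/q'}$ is simply 1):
$$ \begin{align*} \int_s^t \| b_r\|_{C^0}\, \mathrm{d} r \leq \int_s^t \| b_r\|_{C^\alpha_x}\, \mathrm{d} r \leq \Big(\int_s^t \| b_r\|_{C^\alpha_x}^q\, \mathrm{d} r\Big)^{1/q} \Big(\int_s^t 1\, \mathrm{d} r\Big)^{1/q'} = w_{b,\alpha,q}(s,t)^{1/q}\,|t-s|^{1/q'}. \end{align*} $$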
Assuming we already have the bound for a generic $\beta \geq 1/q'$, we can then apply (1.27) for the choice $Y=\varphi _s + \int _s^t b_r(\mathbb {E}_s X_r) \mathrm {d} r$, together with the definition of $\varphi $, to find
$$ \begin{align*} \| \varphi_t -\mathbb{E}_s \varphi_t \|_{L^m|\mathcal{F}_s} & \leq 2 \Big\| \varphi_t - \varphi_s - \int_s^t b_r ( \mathbb{E}_s \varphi_r +\mathbb{E}_s B^H_r) \mathrm{d} r \Big\|_{L^m|\mathcal{F}_s}\\ & \leq 2 \int_s^t \big\| b_r (\varphi_r + B^H_r) - b_r (\mathbb{E}_s \varphi_r +\mathbb{E}_s B^H_r) \big\|_{L^m|\mathcal{F}_s} \,\mathrm{d} r\\ & \leq 2 \int_s^t \| b_r \|_{C^{\alpha}_x} \big\| \varphi_r -\mathbb{E}_s \varphi_r + B^H_r -\mathbb{E}_s B^H_r \big\|_{L^m|\mathcal{F}_s}^{\alpha}\,\mathrm{d} r\\ & \leq 2 \int_s^t \| b_r \|_{C^{\alpha}_x} \big( \| \varphi_r -\mathbb{E}_s \varphi_r\|_{L^m|\mathcal{F}_s}^{\alpha} + \| B^H_r -\mathbb{E}_s B^H_r \|_{L^m|\mathcal{F}_s}^{\alpha}\big)\,\mathrm{d} r; \end{align*} $$
in the above estimates, we repeatedly used basic properties of conditional norms, such as Jensen’s and Minkowski’s inequalities. By the properties of fBm recalled in Section 1.4 and the independence of $B_r^H-\mathbb {E}_s B^H_r$ from $\mathcal {F}_s$, we have the bound
$$ \begin{align*} \big\| \|B^H_r -\mathbb{E}_s B^H_r \|_{L^m|\mathcal{F}_s} \big\|_{L^\infty}\lesssim |r-s|^H \quad \forall s\leq r. \end{align*} $$
Combined with our standing assumption on $\varphi $, by taking $L^\infty $-norms on both sides and using Minkowski’s and Hölder’s inequalities for the integral, we get
$$ \begin{align*} \big\|\| \varphi_t -\mathbb{E}_s \varphi_t \|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} & \lesssim \int_s^t \| b_r\|_{C^\alpha_x} \Big( \big\|\| \varphi_r -\mathbb{E}_s \varphi_r \|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}^\alpha + |r-s|^{\alpha H} \Big) \mathrm{d} r\\ & \lesssim \int_s^t \| b_r\|_{C^\alpha_x} \Big( w_{b,\alpha,q}(s,r)^{\alpha/q}|r-s|^{\alpha\beta} + |r-s|^{\alpha H} \Big) \mathrm{d} r\\ & \lesssim w_{b,\alpha,q}(s,t)^{1/q}\Big(w_{b,\alpha,q}(s,t)^{\alpha/q}|t-s|^{\alpha\beta+1/q'} + |t-s|^{\alpha H+1/q'}\Big). \end{align*} $$
In other words, if $\varphi $ satisfies (2.1) with $1/q'+\alpha H$ replaced by $\beta $, then it does so also with $\tilde {\beta }=f(\beta )=\alpha (\beta \wedge H)+1/q'$ (up to a change in the generic constant N).
From here, the argument is identical to the one from [Reference Gerencsér53, Lemma 2.4]: by iterating, we can define a sequence $\{\beta _n\}_n$ by $\beta _{n+1}= f(\beta _n)$ with $\beta _0=1/q'$; it remains to note that the condition $\alpha>1-1/(q'H)$ guarantees that the only fixed point $\bar {\beta }$ of the map $\tilde {f}(\beta )= \alpha \beta +1/q'$ is strictly larger than H and attracts exponentially fast any orbit defined by $\tilde {\beta }_{n+1}=\tilde {f}(\tilde \beta _n)$. Given that the sequences $\{\beta _n\}_n$ and $\{\tilde \beta _n\}_n$ coincide as long as the first one does not exceed H, this necessarily implies that the first one stabilizes to $\beta =\alpha H+ 1/q'$ after a finite number of iterations $\bar {n}$ (i.e., $\beta _n=\alpha H+ 1/q'$ for all $n\geq \bar {n}$).
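The bootstrap on the exponent used in the proof above can be illustrated numerically. The following is a minimal Python sketch, not taken from the paper: the function name `iterate_exponent` and the sample parameters $(\alpha ,H,q)=(0.9,2.5,2)$ are illustrative assumptions, chosen so that $\alpha>1-1/(q'H)=0.8$ and several iterations are needed before the sequence passes H.

```python
def iterate_exponent(alpha, H, q, max_iter=10_000):
    """Iterate beta_{n+1} = alpha * min(beta_n, H) + 1/q' starting from beta_0 = 1/q'.

    Returns the list of visited exponents and the expected limit alpha*H + 1/q'.
    The loop stops as soon as two consecutive iterates coincide.
    """
    inv_q_conj = 1.0 - 1.0 / q          # 1/q', with q' the conjugate exponent of q
    target = alpha * H + inv_q_conj     # exponent appearing in (2.1)
    beta = inv_q_conj
    history = [beta]
    for _ in range(max_iter):
        new_beta = alpha * min(beta, H) + inv_q_conj
        history.append(new_beta)
        if new_beta == beta:            # the sequence has stabilised
            break
        beta = new_beta
    return history, target

# Hypothetical sample parameters in the subcritical regime.
history, target = iterate_exponent(alpha=0.9, H=2.5, q=2.0)
```

In this example the iterates increase from $1/q'$ until they first exceed H, after which the truncation $\beta \wedge H$ freezes the value at $\alpha H+1/q'$. Outside the subcritical regime the iterates would instead approach a fixed point not exceeding H without stabilising in finitely many steps, which is why the sketch caps the number of iterations.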
Remark 2.2. The case $m=\infty $ can be handled with an appropriate stopping argument; see [Reference Gerencsér53, Lemma 2.4]. This can be used to derive similar bounds for processes that are not exact solutions (for example Picard iterates), but we do not need this generality.
The next ingredient is an a priori estimate for $\alpha <0$, analogous to Lemma 2.1. Recall that for any adapted process $\varphi $, by (1.27), one has
$$ \begin{align*} \big\|\|\varphi_t-\mathbb{E}_s\varphi_t\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\leq 2\big\|\|\varphi_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}; \end{align*} $$
in the distributional case, we will directly bound the latter quantity. Unlike Lemma 2.1, here we cannot allow for any $q\in (2,\infty ]$ and subcritical $\alpha $; rather, we need to impose the stronger condition (B), which was introduced just before Theorem 1.5.
Remark 2.3. As mentioned in Remark 1.6, for $q\in (1,2]$, condition (B) reduces to (A). For $q\in (2,\infty )$, the a priori estimate below will be relevant in Section 8, where we establish existence of weak solutions in a regime where the uniqueness is not known. Contrary to Lemma 2.1, the proof of Lemma 2.4 will rely on stochastic sewing techniques. We could use the upcoming very general (but quite technical) Lemma 2.5 for this task; but in order to help the intuition, we prefer first to invoke the result from [Reference Friz, Hocquet and Lê44], whose statement is simpler, and postpone the application of Lemma 2.5 to where it is truly needed (e.g., Lemma 3.1).
Lemma 2.4. Assume (B) and, in addition, $\alpha <0$. Let $b\in L^q_tC^1_x$ and let X be the unique strong solution to (1.6) for some initial condition $x_0\in \mathbb {R}^d$; set $w:=w_{b,\alpha ,q}$ and $\varphi =X-B^H$. Then for any $m\in [1,\infty )$, there exists a constant $N=N(m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$ such that for all $(s,t)\in [0,1]_\leq ^2$, one has the bound
$$ \begin{align} \big\|\| \varphi_{s,t}\|_{L^m\vert \mathcal{F}_s}\big\|_{L^\infty} \leq N w(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}. \end{align} $$
Proof. Up to shifting, we can assume without loss of generality $x_0=0$; moreover, we only need to deal with $m\in [2,\infty )$ since $\| \cdot \|_{L^m\vert \mathcal {F}_s} \leq \| \cdot \|_{L^2\vert \mathcal {F}_s}$ otherwise. Fix $m\in [2,\infty )$ and set the shorthand $\beta :=\alpha H +1/q'$; recall that by (B), one has $\beta>H$.
Let us first assume that (2.4) holds with w replaced by another control $\tilde {w}$; this is definitely the case for $\tilde w = w_{b,0,q}$, arguing as in (2.3). Given such $\tilde w$ and any closed subinterval $I\subset [0,1]$, define

with the convention $0/0=0$. Fix $(s,t)\in [0,1]_\leq ^2$ and, for any $(s',t')\in [s,t]_\leq ^2$, set
$$ \begin{align*}A_{s',t'}:=\mathbb{E}_{s'} \int_{s'}^{t'} b_r(\varphi_{s'}+B^H_r)\mathrm{d} r = \int_{s'}^{t'} P_{|r-s'|^{2H}} b_r (\varphi_{s'}+\mathbb{E}_{s'} B^H_r) \mathrm{d} r, \end{align*} $$
where in the second passage, we used conditional Fubini and property (1.25) (please remember our convention about not writing explicitly the constant $c_H$ or the matrix $I_d$).
Our aim is to apply the stochastic sewing lemma (in the version given by [Reference Friz, Hocquet and Lê44, Theorem 2.7]) to A in order to find a closed estimate for  . By the heat kernel estimates (1.28), we have $\mathbb {P}$-almost surely
$$ \begin{align*} | A_{s',t'}| &\leq \int_{s'}^{t'} \| P_{|r-s'|^{2H}} b_r\|_{C^0} \mathrm{d} r \lesssim \int_{s'}^{t'} |r-s'|^{\alpha H} \|b_r\|_{C^\alpha} \mathrm{d} r \lesssim |t'-s'|^\beta w(s',t')^{1/q}, \end{align*} $$
where in the last passage, we applied Hölder’s inequality, and the $L^{q'}$-integrability of $|r-s'|^{\alpha H}$ follows from (B). Similarly, we have the $\mathbb {P}$-a.s. bound

The integrability of the power follows again from (B), as do the inequalities $\beta +1/q>1/2$, $2\beta -H +2/q>1$ (we remark that it is only the latter for which the additional condition in (B) was introduced). Therefore, the stochastic sewing lemma [Reference Friz, Hocquet and Lê44, Theorem 2.7] applies and allows us to derive estimates for the sewing $\mathcal {A}$ associated to A. However, one can easily identify $\mathcal {A}_\cdot $; indeed, by the spatial regularity of b, we have the bound
$$ \begin{align*} \|\varphi_{s',t'}-A_{s',t'}\|_{L^m}\lesssim |t'-s'|^\varepsilon\, w_{b,1,1}(s',t') \end{align*} $$
for some $\varepsilon>0$, which allows us to conclude that $\mathcal {A}_{\cdot }=\varphi _{s,\cdot }$ again by [Reference Friz, Hocquet and Lê44, Theorem 2.7-(b)]. Overall, we deduce that there exists a constant $N_0=N_0(m,d,\alpha ,q,H)$ such that

Dividing both sides by $|t'-s'|^\beta w^{1/q}(s',t')$, taking supremum over $[s',t']\subset [s,t]$ and using the fact that all our estimates are on $[s,t]\subset [0,1]$, we obtain

In particular, (2.5) shows that  is finite; we can then go again through the whole argument, with $\tilde {w}$ replaced by w, to find

which readily yields a closed estimate for  , at least for $[s,t]$ sufficiently small.
Our last task is to remove the smallness condition on $[s,t]$ in order to achieve a global bound. To this end, define a new control $w_\ast $ by $w_\ast (s,t)^{1/q+\beta -H}=w(s,t)^{1/q} |t-s|^{\beta -H}$ and an increasing sequence $\{t_n\}_n$ by $t_0=0$ and $w_\ast (t_n,t_{n+1})^{1/q+\beta -H}=(2N_0)^{-1}$. Applying (2.6) for $[s,t]=[t_n,t_{n+1}]$, by construction, one finds  .
If $t_1=1$, this immediately yields the conclusion. Suppose this is not the case. Then for any pair $s<t$ which do not belong to the same subinterval $[t_n,t_{n+1}]$, there exist $\ell ,m\in \mathbb {N}$ such that $t_{\ell -1}< s\leq t_\ell \leq \ldots \leq t_m\leq t<t_{m+1}$. Set $\tau _{\ell -1}=s$, $\tau _i=t_i$ for $i=\ell ,\ldots , m$ and $\tau _{m+1}=t$. It holds
$$ \begin{align*} \big\| \| \varphi_{s,t}\|_{L^m\vert \mathcal{F}_s}\big\|_{L^\infty} & \leq \sum_{i=\ell-1}^{m} \big\|\| \varphi_{\tau_i,\tau_{i+1}}\|_{L^m\vert \mathcal{F}_s} \big\|_{L^\infty} \leq \sum_{i=\ell-1}^{m} \big\|\| \varphi_{\tau_i,\tau_{i+1}}\|_{L^m\vert \mathcal{F}_{\tau_i}} \big\|_{L^\infty}\\ & \lesssim_{N_0} \sum_{i=\ell-1}^{m} w(\tau_i,\tau_{i+1})^{1/q} |\tau_i-\tau_{i+1}|^\beta \\ & \leq (m+1-\ell)^{-\alpha H} \Big( \sum_{i=\ell-1}^{m} \big [w(\tau_i,\tau_{i+1})^{1/q} |\tau_i-\tau_{i+1}|^\beta\big]^{\frac{1}{1+\alpha H}} \Big)^{1+\alpha H}\\ & \leq (m+1-\ell)^{-\alpha H} w(s,t)^{1/q} |t-s|^\beta, \end{align*} $$
where in the last two passages, we used the fact that 
 $\beta +1/q=1+\alpha H\in (0,1)$
, Jensen’s inequality and the superadditivity of the control
$[w(s,t)^{1/q} |t-s|^\beta ]^{\frac {1}{1+\alpha H}}$
. Observe that 
 $m+1-\ell $
 is less than or equal to the total number of intervals 
 $[t_n,t_{n+1}]$
. In turn, by their definition and the superadditivity of 
 $w_\ast $
, this is bounded by a multiple of
$$\begin{align*}w_\ast (0,1)=w(0,1)^{\frac{(\alpha H + 1-H)^{-1}}{q}}=\| b\|_{L^q C^\alpha}^{(\alpha H + 1-H)^{-1}},\end{align*}$$
which finally yields the conclusion.
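For the reader's convenience, the Jensen step in the chain above can be spelled out as follows. This is only a sketch: the symbols $N$ (the number of summands), $\theta $ and $a_i$ are shorthand introduced here and are not notation from the text. Writing $\theta =1+\alpha H\in (0,1)$ and $a_i=w(\tau _i,\tau _{i+1})^{1/q}|\tau _i-\tau _{i+1}|^\beta \geq 0$, the concavity of $x\mapsto x^\theta $ gives
 $$ \begin{align*} \sum_{i} a_i = N\cdot\frac{1}{N}\sum_{i}\big(a_i^{1/\theta}\big)^{\theta} \leq N\Big(\frac{1}{N}\sum_{i} a_i^{1/\theta}\Big)^{\theta} = N^{1-\theta}\Big(\sum_{i} a_i^{1/\theta}\Big)^{\theta} = N^{-\alpha H}\Big(\sum_{i} a_i^{1/\theta}\Big)^{1+\alpha H}. \end{align*} $$
Since $\alpha H<0$, the prefactor $N^{-\alpha H}$ grows with the number of summands, which is why that number has to be controlled in the remainder of the argument.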
 Next, we formulate two appropriate versions of the stochastic sewing lemma (SSL). After its introduction by Lê [Reference Lê71], in recent years, the SSL has seen many variations. Our first SSL combines three modifications: it incorporates shifting (as in [Reference Gerencsér53]), as well as controls and general 
 $\big \|\,\|\cdot \|_{L^m|\mathcal {F}_s}\big \|_{L^n}$
 norms (as in [Reference Friz, Hocquet and Lê44, Reference Lê72]). Let us remark that this combination is not completely obvious and comes with a price: due to the shifting, we need a nontrivial ‘time component’
 $|t-s|^\varepsilon $
 in our estimates, which does not appear in [Reference Friz, Hocquet and Lê44, Reference Lê72]. Nonetheless, the resulting statement is well-suited for our applications, where such a ‘time component’ always appears naturally.
 Recall the notations from Section 1.5, concerning 
 $[0,1]_\leq $
,
$\overline {[S,T]}^2_\leq $
, 
 $s_{-}$
 and so on.
Lemma 2.5. Let 
 $w_1,w_2$
 be controls, and let
$m,n$
 satisfy 
 $2\leq m\leq n\leq \infty $
 and
$m<\infty $
. Let 
 $(S,T)\in [0,1]_\leq $
. Assume that
$(A_{s,t})_{(s,t)\in \overline {[S,T]}^2_\leq }$
 is a continuous mapping from 
 $\overline {[S,T]}^2_\leq $
 to
$L^m$
 such that for all 
 $(s,t)\in \overline {[S,T]}^2_\leq $
,
$A_{s,t}$
 is 
 $\mathcal {F}_t$
-measurable. Suppose that there exist constants
$\varepsilon _1,\varepsilon _2>0$
 such that the bounds 
 $$ \begin{align} \big\|\|A_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}, \end{align} $$
$$ \begin{align} \|\mathbb{E}_{s_-}\delta A_{s,u,t}\|_{L^n}&\leq w_2(s_-,t)|t-s|^{\varepsilon_2} \end{align} $$
hold for all 
 $(s,u,t)\in \overline {[S,T]}^3_\leq $
. Then for all
$S<s\leq t\leq T$
, the Riemann sums 
 $$ \begin{align} \sum_{j=0}^{2^\ell-1} A_{s+j2^{-\ell}(t-s),s+(j+1)2^{-\ell}(t-s)} \end{align} $$
converge as 
 $\ell \to \infty $
 in
$L^m$
, to the increments 
 $\mathcal {A}_t-\mathcal {A}_s$
 of an adapted stochastic process
$(\mathcal {A}_t)_{t\in [S,T]}$
 that is continuous as a mapping from 
 $[S,T]$
 to
$L^m$
 and 
 $\mathcal {A}_S=0$
. Moreover,
$\mathcal {A}$
 is the unique such process that satisfies the bounds 
 $$ \begin{align} \big\|\|\mathcal{A}_{t}-\mathcal{A}_s-A_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq K_1 w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+K_2 w_2(s_-,t)|t-s|^{\varepsilon_2}, \end{align} $$
$$ \begin{align} \|\mathbb{E}_{s_-}\big(\mathcal{A}_t-\mathcal{A}_s-A_{s,t}\big)\|_{L^n}&\leq K_2 w_2(s_-,t)|t-s|^{\varepsilon_2}, \end{align} $$
with some 
 $K_1,K_2$
 for all
$(s,u,t)\in \overline {[S,T]}^3_\leq $
. Furthermore, there exists a constant K depending only on 
 $\varepsilon _1,\varepsilon _2,m,n,d$
 such that the bounds (2.10)–(2.11) hold with
$K_1=K_2=K$
, and moreover, the bound 
 $$ \begin{align} \big\|\|\mathcal{A}_t-\mathcal{A}_s\|_{L^m|\mathcal{F}_s}\big\|_{L^n}\leq K\big( w_1(s,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s,t)|t-s|^{\varepsilon_2}\big) \end{align} $$
holds for all 
 $(s,t)\in [S,T]^2_\leq $
.
Proof. Since there is by now an abundance of SSLs in the literature, we do not aim to give a fully self-contained proof. We provide details only where combining the arguments of [Reference Gerencsér53] and [Reference Friz, Hocquet and Lê44, Reference Lê72] is nontrivial.
 
Step 1 (convergence along dyadic partitions). Let 
 $(s,t)\in \overline {[S,T]}_\leq ^2$
 and for each
$k=0,1,\ldots $
 define 
 $\mathcal {D}_k=\{t_0^k,t_1^k,\ldots ,t_{2^k}^k\}$
, where
$t_i^k=s+i2^{-k}(t-s)$
, and set 
 $$ \begin{align*} \mathcal{A}^k_{s,t}=\sum_{i=1}^{2^k}A_{t_{i-1}^k,t_i^k}. \end{align*} $$
We claim that 
 $\mathcal {A}^k_{s,t}$
 converges, and its limit
$\tilde {\mathcal {A}}_{s,t}$
 satisfies the bounds (2.10)–(2.11) with 
 $K=K_1=K_2$
 when replacing
$\mathcal {A}_t-\mathcal {A}_s$
 by it. In particular, this would also imply the bound 
 $$ \begin{align} \big\|\|\tilde{\mathcal{A}}_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}\leq K\big( w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2}\big) \end{align} $$
for all 
 $(s,t)\in \overline {[S,T]}^2_\leq $
. The claim clearly follows from the following two bounds: 
 $$ \begin{align} \big\|\| \mathcal{A}^{k-1} _{s,t} -\mathcal{A}^{k} _{s,t} \|_{L^m|\mathcal{F}_s}\big\|_{L^n} & \lesssim w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}2^{-k\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}, \end{align} $$
$$ \begin{align} \| \mathbb{E}_{s_-} \big( \mathcal{A}^{k-1} _{s,t} -\mathcal{A}^{k} _{s,t} \big) \| _{L^n}& \lesssim w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}. \end{align} $$
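To see why these two bounds suffice, here is a brief sketch. It assumes, as the definition of $\mathcal {D}_0$ suggests, that $\mathcal {A}^0_{s,t}=A_{s,t}$; the constant $C$ below is a placeholder for the implicit constant in (2.14) and is not notation from the text. Telescoping and summing the geometric series $\sum _{k\geq 1}2^{-k\varepsilon }=(2^{\varepsilon }-1)^{-1}$ gives
 $$ \begin{align*} \big\|\|\tilde{\mathcal{A}}_{s,t}-A_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} &\leq \sum_{k=1}^\infty \big\|\|\mathcal{A}^{k}_{s,t}-\mathcal{A}^{k-1}_{s,t}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} \\ &\leq C\Big(\frac{w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}}{2^{\varepsilon_1}-1}+\frac{w_2(s_-,t)|t-s|^{\varepsilon_2}}{2^{\varepsilon_2}-1}\Big), \end{align*} $$
which is a bound of the form (2.10); the analogue of (2.11) follows from (2.15) in the same way. In particular, the resulting constant depends on $\varepsilon _1,\varepsilon _2$ only through the factors $(2^{\varepsilon _i}-1)^{-1}$ and the implicit constants.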
It is no loss of generality to assume 
 $k\geq 2$
 (otherwise, the trivial bounds below suffice), in which case we write
 $$ \begin{align} \mathcal{A}^{k}_{s,t}-\mathcal{A}^{k-1}_{s,t}=-\delta A_{t_0^{k},t_1^{k},t_2^{k}}-\sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}. \end{align} $$
For the first term, we use the conditions (2.7)–(2.8) in a trivial way:
 $$ \begin{align*} \,&\big\|\|\delta A_{t_{0}^{k},t_{1}^{k},t_{2}^{k}}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} \lesssim w_1(t_{0}^{k}-(t_{2}^{k}-t_{0}^{k}),t_{2}^{k})^{1/2}|t_{2}^{k}-t_{0}^{k}|^{\varepsilon_1}\lesssim w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}2^{-k\varepsilon_1}, \\ \,&\|\mathbb{E}_{s_-} \delta A_{t_{0}^{k},t_{1}^{k},t_{2}^{k}}\|_{L^n} \leq w_2(t_{0}^{k}-(t_{2}^{k}-t_{0}^{k}),t_{2}^{k})|t_{2}^{k}-t_{0}^{k}|^{\varepsilon_2}\lesssim w_2(s_-,t)|t-s|^{\varepsilon_2}2^{-k\varepsilon_2}. \end{align*} $$
For the sum in (2.16), we write
 $$ \begin{align} \sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}&= \sum_{j=1}^{2^{k-1}-1}\mathbb{E}_{t_{2j-2}^k}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k} \nonumber \\ &\qquad+\sum_{\ell=0}^1\sum_{j=0}^{2^{k-2}}({\mathrm{id}}-\mathbb{E}_{t_{4j+2\ell}^k}\big)\delta A_{t_{4j+2\ell+2}^k,t_{4j+2\ell+3}^k,t_{4j+2\ell+4}^k} \nonumber\\ &=:I_1+I_2, \end{align} $$
where the term 
 $\delta A_{t_{2^k}^k,t_{2^k+1}^k,t_{2^k+2}^k}$
 is defined to be
$0$
. The point of this unaesthetic decomposition is twofold. First, since 
 $t^k_{2j-2}=t^k_{2j}-(t^k_{2j+2}-t^k_{2j})$
, in the terms in the first sum, there is sufficient shifting in the conditioning so that they can be estimated via the assumed bound (2.8). Second, for each
$\ell =0,1$
, the inner sum above is one of martingale differences.
Therefore, we first estimate by the triangle inequality
 $$ \begin{align} \big\|\|I_1\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\leq\sum_{j=1}^{2^{k-1}-1}\big\|\|\mathbb{E}_{t_{2j-2}^k}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k}\|_{L^m|\mathcal{F}_s}\big\|_{L^n} \nonumber\\ &\leq\sum_{j=1}^{2^{k-1}-1}\|\mathbb{E}_{t_{2j}^k-(t_{2j+2}^k-t_{2j}^k)}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k}\big\|_{L^n} \nonumber\\ &\leq \sum_{j=1}^{2^{k-1}-1} w_2(t_{2j-2}^k,t_{2j+2}^k)|t_{2j+2}^k-t_{2j}^k|^{\varepsilon_2} \nonumber\\ &\lesssim |t-s|^{\varepsilon_2}2^{-k\varepsilon_2}w_2(s,t), \end{align} $$
using the superadditivity of 
 $w_2$
 in the last line. Similarly, but replacing the triangle inequality by the Burkholder-Davis-Gundy and Minkowski inequalities (e.g., in the form given in [Reference Lê72, Lemma 2.5] for
$\mathfrak {p}=2$
), we have 
 $$ \begin{align} \big\|\|I_2\|_{L^m|\mathcal{F}_s}\big\|_{L^n}&\lesssim\sum_{\ell=0}^1\Big(\sum_{j=0}^{2^{k-2}}\big\|\|\delta A_{t_{4j+2\ell+2}^k,t_{4j+2\ell+3}^k,t_{4j+2\ell+4}^k}\|_{L^m|\mathcal{F}_s}\big\|_{L^n}^2\Big)^{1/2} \nonumber\\ &\lesssim 2^{-k\varepsilon_1}\sum_{\ell=0}^1 \Big(\sum_{j=0}^{2^{k-2}}w_1(t_{4j+2\ell}^k,t_{4j+2\ell+4}^k)\Big)^{1/2} \nonumber\\ &\lesssim |t-s|^{\varepsilon_1}2^{-k\varepsilon_1}w_1(s,t)^{1/2}. \end{align} $$
This proves (2.14). As for (2.15), it is only easier: noting that
 $$ \begin{align*} \mathbb{E}_s\sum_{j=1}^{2^{k-1}-1}\delta A_{t_{2j}^{k},t_{2j+1}^{k},t_{2j+2}^{k}}=\mathbb{E}_s I_1, \end{align*} $$
we can bound 
$\|\mathbb {E}_sI_1\|_{L^n}\leq \|I_1\|_{L^n}$
 just as in (2.18). This concludes the proof of (2.14)–(2.15).
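As a sanity check on the decomposition used in this step, the difference of two consecutive dyadic sums can be written out explicitly. The sketch below assumes the usual convention $\delta A_{s,u,t}=A_{s,t}-A_{s,u}-A_{u,t}$, which is not restated in this section. Since $t^{k-1}_j=t^k_{2j}$, one has
 $$ \begin{align*} \mathcal{A}^{k-1}_{s,t}-\mathcal{A}^{k}_{s,t} =\sum_{j=0}^{2^{k-1}-1}\Big(A_{t_{2j}^k,t_{2j+2}^k}-A_{t_{2j}^k,t_{2j+1}^k}-A_{t_{2j+1}^k,t_{2j+2}^k}\Big) =\sum_{j=0}^{2^{k-1}-1}\delta A_{t_{2j}^k,t_{2j+1}^k,t_{2j+2}^k}. \end{align*} $$
The $j=0$ term is the one treated separately by the trivial bounds, because no shifted conditioning time to the left of $t_0^k=s$ is available inside the partition; the remaining terms are the ones split into $I_1$ and $I_2$.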
 
Step 2 (convergence along regular partitions). Let us say that a partition 
 $\pi =\{s=t_0<t_1<\cdots <t_n=t\}$
 is regular if
$|\pi |:=\max (t_i-t_{i-1})\leq 2\min (t_i-t_{i-1})$
. For any partition, we can define 
 $$ \begin{align*} \mathcal{A}^{\pi}_{s,t}=\sum_{i=1}^n A_{t_{i-1},t_i}. \end{align*} $$
Very similarly to Step 1, we get that for any sequence of regular partitions 
 $(\pi _n)_{n\in \mathbb {N}}$
 with
$|\pi _n|\to 0$
, 
 $\mathcal {A}^{\pi _n}_{s,t}$
 converges (for details, see [Reference Gerencsér53, Lemma 2.2]). Therefore, on one hand, this limit has to coincide with 
 $\tilde {\mathcal {A}}_{s,t}$
, and on the other hand, this limit is clearly additive. Moreover, notice that by construction,
$\tilde {\mathcal {A}}_{s,t}$
 is 
 $\mathcal {F}_t$
-measurable for all
$(s,t)\in \overline {[S,T]}_\leq ^2$
, and since it vanishes in 
 $L^m$
, the additivity implies that it is continuous in both arguments as a two-parameter process with values in
$L^m$
.
 
Step 3 (the process 
 $\mathcal {A}$
 and its bounds). For any
$t\in (S,T]$
, we set 
 $t_i:=S+2^{-i}(t-S)$
. We then claim that the series
$$ \begin{align*} \mathcal{A}_t:=\sum_{i=1}^\infty \tilde{\mathcal{A}}_{(S+2^{-i})\wedge t,(S+2^{-i+1})\wedge t}=:\sum_{i=1}^\infty \tilde{\mathcal{A}}_{s_i,s_{i-1}} \end{align*} $$
converges. Indeed, since 
 $(s_i,s_{i-1})\in \overline {[S,T]}^2_\leq $
, we may use the bound (2.13). By the trivial bounds
$w((s_i)_-,s_{i-1})\leq w(S,t)$
 and 
 $|s_{i-1}-s_i|\leq 2^{-i}\mathbf {1}_{t-S\geq 2^{-i}}$
, we get not only the convergence of the series but also the bound
$$ \begin{align*} \big\|\|\mathcal{A}_t\|_{L^m|\mathcal{F}_S}\big\|_{L^n}\leq K\big( w_1(S,t)^{1/2}|t-S|^{\varepsilon_1}+w_2(S,t)|t-S|^{\varepsilon_2}\big). \end{align*} $$
This is precisely (2.12) with 
 $s=S$
. The case for general
$(s,t)\in [S,T]_{\leq }^2$
 follows in the same way. It is also clear that 
 $\mathcal {A}_S=0$
 and, by the remarks in Step 2, that 
 $\mathcal {A}$
 is adapted and continuous in
$L^m$
. Therefore, 
 $\mathcal {A}$
 satisfies all of the claimed properties.
Step 4 (Uniqueness). The proof of this is standard and can be found in, for example, [Reference Lê72].
 The other version of SSL that we use seems to be new. In Lemma 2.5, one can transfer 
 $L^m$
 bounds from A to
$\mathcal {A}$
 if 
 $m<\infty $
. The
$m=\infty $
 case is a bit different: 
 $L^\infty $
 bounds on A imply Gaussian moment bounds on
 $\mathcal {A}$
. An alternative way to obtain Gaussian moment bounds via stochastic sewing is presented in [Reference Butkovsky, Dareiotis and Gerencsér12] (see, for example, Theorem 3.3 and Lemma 4.6 therein), but the conditions herein are easier to verify. The proof relies on a conditional version of the Azuma–Hoeffding inequality; see Lemma A.1 in Appendix A.
Lemma 2.6. Let 
 $(A_{s,t})_{(s,t)\in \overline {[S,T]}^2_\leq }$
 be a continuous mapping from
$\overline {[S,T]}^2_\leq $
 to 
 $L^2$
, with
$A_{s,t} \mathcal {F}_t$
-measurable for all 
 $(s,t)\in \overline {[S,T]}^2_\leq $
, such that the conditions of Lemma 2.5 hold with
$m=n=\infty $
; namely, assume that there exist controls 
 $w_1,w_2$
 and constants
$\varepsilon _1,\varepsilon _2>0$
 such that the bounds 
 $$ \begin{align} \big\|\|A_{s,t}\|_{L^\infty|\mathcal{F}_s}\big\|_{L^\infty}&\leq w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}, \end{align} $$
$$ \begin{align} \|\mathbb{E}_{s_-}\delta A_{s,u,t}\|_{L^\infty}&\leq w_2(s_-,t)|t-s|^{\varepsilon_2} \end{align} $$
hold for all 
 $(s,u,t)\in \overline {[S,T]}^3_\leq $
. Denote by
 $(\mathcal {A}_t)_{t\in [S,T]}$
 the associated process coming from Lemma 2.5. Then there exist positive constants 
 $\mu $
 and K depending only on
$\varepsilon _1,\varepsilon _2,d$
 such that the bound 
 $$ \begin{align} \mathbb{E}\bigg[\exp\bigg(\mu\,\frac{|\mathcal{A}_t-\mathcal{A}_s|^2}{\big(w_1(s,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s,t)|t-s|^{\varepsilon_2}\big)^2}\bigg)\bigg\vert \mathcal{F}_s\bigg]\leq K \end{align} $$
holds for all 
 $(s,t)\in [S,T]_{\leq }^2$
.
Proof. We continue using the notation of the proof of Lemma 2.5. Let 
 $(s,t)\in \overline {[S,T]}_\leq ^2$
 and
$k=0,1,\ldots $
, and let us bound 
 $\mathcal {A}_{s,t}^{k+1}-\mathcal {A}_{s,t}^k$
. The first term on the right-hand side of (2.16) is trivially bounded by
$2w_1(s_-,t)^{1/2}|t-s|^{\varepsilon _1}2^{-k\varepsilon _1}$
 with probability 
 $1$
. Decomposing the second term into
$I_1$
 and 
 $I_2$
 as in (2.17), a simple use of the triangle inequality as in (2.18) yields the 
 $\mathbb {P}$
-almost sure bound
$$ \begin{align*} |I_1|\lesssim 2^{-k\varepsilon_2}|t-s|^{\varepsilon_2}w_2(s,t). \end{align*} $$
As for 
 $I_2$
, recall that it is the sum of two martingales, to each of which we may apply the Azuma–Hoeffding inequality. The role of 
 $\delta _j$
 as in Lemma A.1 is played by
$4w_1(t_{4j+2\ell }^k,t_{4j+2\ell +4}^k)^{1/2}$
, so similarly to the calculation as in (2.19), we get 
 $$ \begin{align*} \Lambda:=\sum_i \delta_i^2 \lesssim 2^{-2k\varepsilon_1}|t-s|^{2\varepsilon_1}w_1(s,t). \end{align*} $$
Therefore, by (A.1) combined with the aforementioned 
 $\mathbb {P}$
-almost sure bounds, we get that with some
$\mu _1>0$
, 
 $K_1$
,
$$ \begin{align*} \mathbb{E}\bigg[\exp\Big(\mu_12^{k(\varepsilon_1\wedge\varepsilon_2)}\frac{|\mathcal{A}_{s,t}^{k+1}-\mathcal{A}_{s,t}^k|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert \mathcal{F}_{S}\bigg]\leq K_1. \end{align*} $$
Since one can write
 $$ \begin{align*} |(\mathcal{A}_t-\mathcal{A}_s)-A_{s,t}|\leq \sum_{k=0}^\infty 2^{-k(\varepsilon_1\wedge\varepsilon_2)}2^{k(\varepsilon_1\wedge\varepsilon_2)}|\mathcal{A}_{s,t}^{k+1}-\mathcal{A}_{s,t}^k|, \end{align*} $$
we get by conditional Jensen’s inequality,
 $$ \begin{align*} \mathbb{E}&\bigg[\exp\Big(\mu_1\frac{|(\mathcal{A}_t-\mathcal{A}_s)-A_{s,t}|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert\mathcal{F}_{S}\bigg]\leq \sum_{k=0}^\infty 2^{-k(\varepsilon_1\wedge\varepsilon_2)}K_1. \end{align*} $$
Using again the assumed bounds on 
 $A_{s,t}$
, we get with some other constant
$K_2$
 
 $$ \begin{align*} \mathbb{E}&\bigg[\exp\Big(\mu_1\frac{|\mathcal{A}_t-\mathcal{A}_s|^2}{(w_1(s_-,t)^{1/2}|t-s|^{\varepsilon_1}+w_2(s_-,t)|t-s|^{\varepsilon_2})^2}\Big)\bigg\vert \mathcal{F}_{S}\bigg]\leq K_2. \end{align*} $$
It only remains to remove the shifts in the denominator and substitute 
 $\mathcal {F}_S$
 with
$\mathcal {F}_s$
, which can be done just as in Step 3 of the proof of Lemma 2.5, and therefore, we obtain (2.22).
3 Stability
 The use of the tools from Section 2 is illustrated by the following lemma, which will play a key role in our analysis. Let us emphasise the important feature of the statement that although h is assumed to have 
 $\delta $
 spatial regularity, in the estimate, only its
$\alpha -1$
 norm is used.
Lemma 3.1. Assume (A) and let 
 $(S,T)\in [0,1]_\leq ^2$
. Suppose that
$h\in L^q_t C^\delta _x$
 for some 
 $\delta>0$
 and let
$\varphi $
 be an adapted process satisfying (2.1) with 
 $m=1$
 and some control w. For
$t\in [S,T]$
, define the process 
 $$ \begin{align*} \psi_t=\int_S^t h_r\big(B^H_r+\varphi_r\big)\mathrm{d} r \end{align*} $$
and set 
 $\varepsilon =1/q'+(\alpha -1)H$
. Then there exist positive constants
$\mu $
 and K, depending only on H, q, 
 $\alpha $
 and d, such that for all
$(s,t)\in [S,T]^2_\leq $
, one has the bound 
 $$ \begin{align} \mathbb{E}\bigg[ \exp\bigg(\mu\,\frac{|\psi_t-\psi_s|^2}{w_{h,\alpha-1,q}(s,t)^{2/q}|t-s|^{2\varepsilon} \big(1+w(s,t)^{1/q}|t-s|^{\varepsilon})^2}\bigg)\bigg\vert \mathcal{F}_s\bigg]\leq K. \end{align} $$
As a consequence, for any 
 $\widetilde m\in [1,\infty )$
, there exists a constant
$\tilde K$
, depending only on 
 $\tilde m$
, H, q,
$\alpha $
 and d, such that for all 
 $(s,t)\in [S,T]^2_\leq $
, one has the bound
$$ \begin{align} \big\|\|\psi_t-\psi_s\|_{L^{\widetilde m}|\mathcal{F}_s}\big\|_{L^\infty}\leq \tilde K w_{h,\alpha-1,q}(s,t)^{1/q}|t-s|^{\varepsilon}\big(1+w(s,t)^{1/q}|t-s|^{\varepsilon}\big). \end{align} $$
Proof. Note that thanks to the condition (A), 
 $\varepsilon>0$
. For
$(s,t)\in \overline {[S,T]}^2_\leq $
, let us set 
 $$ \begin{align*} A_{s,t}=\mathbb{E}_{s-(t-s)}\int_s^t h_r(B^H_r+\mathbb{E}_{s-(t-s)}\varphi_r)\mathrm{d} r \end{align*} $$
and verify the conditions of Lemma 2.6 (namely those of Lemma 2.5 with 
 $m=n=\infty $
).
 Fix 
 $(s,u,t)\in \overline {[S,T]}_\leq ^3$
 and denote
$s_1=s-(t-s)$
, 
 $s_2=s-(u-s)$
,
$s_3=u-(t-u)$
, 
 $s_4=s$
,
$s_5=u$
, 
 $s_6=t$
. These points are almost ordered according to their indices, except
$s_3$
 and 
 $s_4$
, for which
$s_4\leq s_3$
 may happen, but this plays no role whatsoever. First, by property (1.25), we have 
 $$ \begin{align*} A_{s,t}=\int_s^t P_{|r-s_1|^{2H}}h_r\big(\mathbb{E}_{s_1}(B^H_r+\varphi_r)\big)\mathrm{d} r. \end{align*} $$
$$ \begin{align*} A_{s,t}=\int_s^t P_{|r-s_1|^{2H}}h_r\big(\mathbb{E}_{s_1}(B^H_r+\varphi_r)\big)\mathrm{d} r. \end{align*} $$
Therefore, by (1.28) and Hölder’s inequality, it holds
$$ \begin{align*} |A_{s,t}|\leq\int_s^t \|P_{|r-s_1|^{2H}}h_r\|_{C^0_x}\mathrm{d} r\lesssim &\int_s^t|r-s_1|^{(\alpha-1) H}\|h_r\|_{C^{\alpha-1}_x}\mathrm{d} r \\&\lesssim |t-s|^{1/q'+(\alpha-1) H}w_{h,\alpha-1,q}(s,t)^{1/q}. \end{align*} $$
Since $q\leq 2$, by the definition of $\varepsilon $, (2.7) is satisfied with $\varepsilon _1=\varepsilon $ and $w_1=N w_{h,\alpha -1,q}^{2/q}$.
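For the reader's convenience, the Hölder step above can be spelled out as follows. This is only a sketch, assuming (as the notation suggests) that the control is $w_{h,\alpha-1,q}(s,t)=\int_s^t\|h_r\|^q_{C^{\alpha-1}_x}\mathrm{d} r$ and that $q'$ denotes the conjugate exponent of q. For $r\in [s,t]$, one has $t-s\leq r-s_1\leq 2(t-s)$, so $|r-s_1|^{(\alpha-1)H}\lesssim |t-s|^{(\alpha-1)H}$ whatever the sign of $\alpha-1$, and therefore
$$ \begin{align*} \int_s^t|r-s_1|^{(\alpha-1) H}\|h_r\|_{C^{\alpha-1}_x}\mathrm{d} r &\lesssim |t-s|^{(\alpha-1)H}\int_s^t\|h_r\|_{C^{\alpha-1}_x}\mathrm{d} r \\ &\leq |t-s|^{(\alpha-1)H}\,|t-s|^{1/q'}\Big(\int_s^t\|h_r\|^q_{C^{\alpha-1}_x}\mathrm{d} r\Big)^{1/q}. \end{align*} $$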
Next, we need to bound $\mathbb {E}_{s-(t-s)}\delta A_{s,u,t}=\mathbb {E}_{s_1}\delta A_{s_4,s_5,s_6}$. After an elementary rearrangement, we get
$$ \begin{align*} \mathbb{E}_{s_1}\delta A_{s_4,s_5,s_6}=I+J:&=\mathbb{E}_{s_1}\mathbb{E}_{s_2}\int_{s_4}^{s_5} \big[h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)-h_r(B^H_r+\mathbb{E}_{s_2}\varphi_r)\big]\mathrm{d} r \\ &\quad+\mathbb{E}_{s_1}\mathbb{E}_{s_3}\int_{s_5}^{s_6} \big[h_r(B^H_r+\mathbb{E}_{s_1}\varphi_r)-h_r(B^H_r+\mathbb{E}_{s_3}\varphi_r)\big]\mathrm{d} r. \end{align*} $$
The two terms are treated in exactly the same way, so we only detail I. We use (1.28) similarly as before to get
$$ \begin{align*} |I|&\leq\mathbb{E}_{s_1}\int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi_r)\big|\mathrm{d} r \\ &\leq \mathbb{E}_{s_1}\int_{s_4}^{s_5}\|P_{|r-s_2|^{2H}} h_r\|_{C^1_x}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\mathrm{d} r \\ &\lesssim \mathbb{E}_{s_1}\int_{s_4}^{s_5}|r-s_2|^{(\alpha-2)H}\|h_r\|_{C^{\alpha-1}_x}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\mathrm{d} r. \end{align*} $$
By Jensen’s inequality and the assumption on $\varphi $, we have the $\mathbb {P}$-almost sure bound
$$ \begin{align*} \mathbb{E}_{s_1}|\mathbb{E}_{s_1}\varphi_r-\mathbb{E}_{s_2}\varphi_r|\leq\mathbb{E}_{s_1}|\mathbb{E}_{s_1}\varphi_r-\varphi_r|\leq w(s_1,r)^{1/q}|t-s|^{1/q'+\alpha H}. \end{align*} $$
Also note that $r\mapsto |r-s_2|^{(\alpha -2)H}\in L^{q'}([s_4,s_5])$ because of the shifted basepoint; in general, this would not be true with $s_2$ replaced by $s_4$. Therefore, by Hölder’s inequality,
$$ \begin{align*} |I|\lesssim |t-s|^{1/q'+(\alpha-2)H+1/q'+\alpha H}w_{h,\alpha-1,q}(s,t)^{1/q}w(s_1,t)^{1/q}. \end{align*} $$
Note that the exponent of $|t-s|$ is simply $2\varepsilon $. Using again that $q\leq 2$, we see that condition (2.8) is satisfied with $\varepsilon _2=2\varepsilon $ and $w_2=N w_{h,\alpha -1,q}(s,t)^{1/q}w(s_1,t)^{1/q}$.
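As a quick check of the exponent, assuming (as the bound on $|A_{s,t}|$ above suggests) that $\varepsilon =1/q'+(\alpha -1)H$, one computes
$$ \begin{align*} \frac{1}{q'}+(\alpha-2)H+\frac{1}{q'}+\alpha H=\frac{2}{q'}+(2\alpha-2)H=2\Big(\frac{1}{q'}+(\alpha-1)H\Big)=2\varepsilon. \end{align*} $$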
It remains to verify that the process $\mathcal {A}$ of Lemma 2.5 is given by $\psi $. Since $\psi _0=0$, it suffices to show that
$$ \begin{align} \|\psi_t-\psi_s-A_{s,t}\|_{L^1}\leq \tilde{w}(s_-,t)|t-s|^\kappa \end{align} $$
for all $(s,t)\in \overline {[S,T]}^2_{\leq }$, with some control $\tilde w$ and some $\kappa>0$. This follows from three easy bounds: first,
$$ \begin{align*} \Big\|&\psi_t-\psi_s-\int_s^t h_r\big(B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r\Big\|_{L^1} \\&\leq \int_s^t\|h_r\|_{C^\delta_x} w(s_-,r)^{\delta/q}\mathrm{d} r \leq w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'}w(s_-,t)^{\delta/q}, \end{align*} $$
second,
$$ \begin{align*} \Big\|\int_s^t &h_r\big(B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r-\int_s^t h_r\big(\mathbb{E}_{s_-}B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r\Big\|_{L^1} \\&\leq \int_s^t\|h_r\|_{C^\delta_x} |r-s_-|^{\delta H}\mathrm{d} r \lesssim w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'+\delta H}, \end{align*} $$
and third,
$$ \begin{align*} \Big\|\int_s^t &h_r\big(\mathbb{E}_{s_-}B^H_r+\mathbb{E}_{s_-}\varphi_r\big)\mathrm{d} r-A_{s,t}\Big\|_{L^1} \\&\leq \int_s^t\|h_r-P_{|r-s_-|^{2H}}h_r\|_{C^0_x} \mathrm{d} r \lesssim w_{h,\delta,q}(s,t)^{1/q}|t-s|^{1/q'+\delta H}. \end{align*} $$
Hence, we can conclude $\psi =\mathcal {A}$, and (3.1) follows from (2.22).
We will often consider (1.6) with nonzero initial time. If b is a function, a solution of (1.6) on some interval $[S,T]\subset [0,1]$ with initial condition $X_S$ is a process X satisfying
$$ \begin{align*} X_t=X_S+\int_S^t b_r(X_r)\mathrm{d} r+B^H_t-B^H_S \end{align*} $$
for all $t\in [S,T]$. Our main stability estimate for solutions is then formulated as follows.
Theorem 3.2. Assume (A). Let $\delta>0$. Let $[S,T]\subset [0,1]$, and for $i=1,2$, let $X^i$ be adapted continuous processes satisfying (1.6) on $[S,T]$ with initial conditions $X^i_S$ and drifts $b^i\in L^q_t C^{1+\delta }_x$. Denote $M=\max _{i=1,2}\|b^i\|_{L^q_t C^\alpha _x}$. Then for any $m\in [2,\infty )$, there exists a positive constant $N=N(m,M,H,\alpha ,q,d)$, such that one has the $\mathbb {P}$-almost sure bound
$$ \begin{align} \Big\|\sup_{t\in[S,T]}|X^1_t-X^2_t|\Big\|_{L^m|\mathcal{F}_S}\leq N\Big(|X_S^1-X_S^2|+\|b^1-b^2\|_{L^q_t([S,T];C^{\alpha-1}_x)}\Big). \end{align} $$
Moreover, if $b^1=b^2$, then one also has the $\mathbb {P}$-almost sure bound
$$ \begin{align} \Big\|\sup_{t\in[S,T]}\big(|X^1_t-X^2_t|^{-1}\big)\Big\|_{L^m|\mathcal{F}_S}\leq N |X_S^1-X_S^2|^{-1}. \end{align} $$
Proof. As usual, we denote $\varphi ^1=X^1-B^H$ and $\varphi ^2=X^2-B^H$. For $t\in [S,T]$, we write
$$ \begin{align} X^1_t-X^2_t&=X^1_S-X^2_S+\int_S^t\Big(\int_0^1\nabla b^1_r\big(B^H_r+\lambda\varphi^1_r+(1-\lambda)\varphi^2_r\big)\mathrm{d} \lambda\Big)\cdot(X^1_r-X^2_r)\mathrm{d} r \nonumber\\ &\qquad+\int_S^t(b^1-b^2)_r(B^H_r+\varphi^2_r)\mathrm{d} r. \end{align} $$
Note that $\nabla b^1\in L^q_tC^\delta _x$, and therefore, the process
$$ \begin{align*} A_t:=\int_0^1A^\lambda_t\mathrm{d} \lambda:=\int_0^1\Big(\int_S^t\nabla b^1_r\big(B^H_r+\lambda\varphi^1_r+(1-\lambda)\varphi^2_r\big)\mathrm{d} r\Big)\mathrm{d} \lambda \end{align*} $$
is well-defined. Define furthermore
$$ \begin{align*} z_t:=\int_S^t(b^1-b^2)_r(B^H_r+\varphi^2_r)\mathrm{d} r. \end{align*} $$
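The decomposition (3.6) is the usual linearisation of the difference of the drifts. A sketch of the computation, using only $X^i=B^H+\varphi ^i$ and the fundamental theorem of calculus (applicable since $b^1_r\in C^{1+\delta }_x$ for almost every r), reads
$$ \begin{align*} b^1_r(X^1_r)-b^2_r(X^2_r)&=\big[b^1_r(X^1_r)-b^1_r(X^2_r)\big]+(b^1-b^2)_r(X^2_r), \\ b^1_r(X^1_r)-b^1_r(X^2_r)&=\Big(\int_0^1\nabla b^1_r\big(X^2_r+\lambda(X^1_r-X^2_r)\big)\mathrm{d} \lambda\Big)\cdot(X^1_r-X^2_r), \end{align*} $$
where $X^2_r+\lambda (X^1_r-X^2_r)=B^H_r+\lambda \varphi ^1_r+(1-\lambda )\varphi ^2_r$.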
We then apply Lemma 3.1 with $\varphi =\lambda \varphi ^1+(1-\lambda )\varphi ^2$ and $h=\nabla b^1$, as well as with $\varphi =\varphi ^2$ and $h=b^1-b^2$. Since $\varphi ^1$ and $\varphi ^2$ are the drift parts of solutions, by Lemma 2.1, the processes $\varphi =\lambda \varphi ^1+(1-\lambda )\varphi ^2$ satisfy the bound (2.2) with control $w=w_{b^1,\alpha ,q}+w_{b^2,\alpha ,q}$, and so Lemma 3.1 indeed applies. Combining the bound (3.1) with Lemma A.2, we get that there exist random variables $\eta _A,\eta _z$ with Gaussian moments (see Footnote 7) conditionally on $\mathcal {F}_S$, as well as $\delta>0$ and $p\in (1,2)$, such that
$$ \begin{align*} \|A\|_{p-{\mathrm{var}};[S,T]}&\leq w_{b^1,\alpha,q}(S,T)^{1/q}\sup_{S\leq s<t\leq T}\frac{|A_t-A_s|}{w_{b^1,\alpha,q}(s,t)^{1/q}|t-s|^\delta} \\ &\leq w_{b^1,\alpha,q}(S,T)^{1/q}\eta_A, \\ \|z\|_{p-{\mathrm{var}};[S,T]}&\leq w_{b^1-b^2,\alpha-1,q}(S,T)^{1/q}\sup_{S\leq s<t\leq T}\frac{|z_t-z_s|}{w_{b^1-b^2,\alpha-1,q}(s,t)^{1/q}|t-s|^\delta} \\ &\leq w_{b^1-b^2,\alpha-1,q}(S,T)^{1/q}\eta_z. \end{align*} $$
We can rewrite (3.6) as
$$ \begin{align} \mathrm{d}(X^1_t-X^2_t)= \mathrm{d} A_t (X^1_t-X^2_t)+\mathrm{d} z_t, \quad (X^1_t-X_t^2)\vert_{t=S}=X^1_S-X^2_S, \end{align} $$
meaning that we are interpreting (3.6) as an affine Young differential equation; see also Appendix B for more details. By applying Lemma B.2 for $x=X^1-X^2$ and $\tilde {p}=p$, we get
$$ \begin{align*} \sup_{t\in[S,T]}|X_t^1-X_t^2|\lesssim e^{C\|A\|_{p-{\mathrm{var}};[S,T]}^p}\big(|X^1_S-X^2_S|+\|z\|_{p-{\mathrm{var}};[S,T]}\big). \end{align*} $$
Recall that $\eta _A$ satisfies $\mathbb {E}_S [e^{\mu \eta _A^2}]\lesssim 1$ for some $\mu>0$, and thus also $\mathbb {E}_S[e^{K \eta _A^p}] \lesssim _{K,p} 1$ for all $K>0$ since $p<2$. Therefore, we obtain
$$ \begin{align*} \mathbb{E}_S\Big[ \sup_{t\in[S,T]}|X_t^1-X_t^2|^m \Big] & \lesssim \mathbb{E}_S[e^{mC\| A\|_{p-{\mathrm{var}}; [S,T]}^p} ] |X^1_S-X^2_S|^m \\&\qquad+ \mathbb{E}_S\Big[ e^{mC\| A\|_{p-{\mathrm{var}};[S,T]}^p} \| z\|_{p-{\mathrm{var}};[S,T]}^m \Big]\\ & \lesssim |X^1_S-X^2_S|^m + w_{b^1-b^2,\alpha-1,q}(S,T)^{m/q}, \end{align*} $$
using conditional Hölder’s inequality to get the last line. This gives (3.4).
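The passage from Gaussian moments to exponential moments of $\eta _A^p$ used above is a standard consequence of Young’s inequality. A sketch, with conjugate exponents $2/p$ and $2/(2-p)$ and using $\eta _A\geq 0$:
$$ \begin{align*} K\eta_A^p=\big(\mu^{p/2}\eta_A^p\big)\big(K\mu^{-p/2}\big)&\leq \frac{p}{2}\,\mu\,\eta_A^2+\frac{2-p}{2}\big(K\mu^{-p/2}\big)^{2/(2-p)}, \\ \mathbb{E}_S\big[e^{K\eta_A^p}\big]&\leq \exp\Big(\frac{2-p}{2}K^{2/(2-p)}\mu^{-p/(2-p)}\Big)\,\mathbb{E}_S\big[e^{\mu\eta_A^2}\big]\lesssim_{K,p}1, \end{align*} $$
where the second line also uses $p/2<1$.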
In case $b^1=b^2$, we have $z=0$, and the Young equation (3.7) becomes homogeneous. Moreover, note that Young equations allow time-reversal: if we fix $\tau \in [S,T]$, write $\tilde A_t=A_{\tau -t}$, and let Y solve
$$ \begin{align*} \mathrm{d} Y_t=\mathrm{d} \tilde A_t Y_t,\quad Y_t\vert_{t=0}=X^1_\tau-X^2_\tau, \end{align*} $$
then $Y_{\tau -S}=X_S^1-X_S^2$. Therefore, by Lemma B.2, we also have the pathwise estimate
$$ \begin{align*} |X_S^1-X_S^2|\lesssim e^{C\|\tilde A\|_{p-{\mathrm{var}};[0,\tau-S]}^p}|X^1_\tau-X^2_\tau|. \end{align*} $$
Of course, $\|\tilde A\|_{p-{\mathrm {var}};[0,\tau -S]}^p=\| A\|_{p-{\mathrm {var}};[S,\tau ]}^p \leq \| A \|_{p-{\mathrm {var}};[S,T]}^p$, so after rearranging for the inverses, taking the supremum over $\tau \in [S,T]$, and taking $L^m|\mathcal {F}_S$ norms, we get (3.5).
4 Strong well-posedness for functional drift
We first apply the stability estimate to establish existence and uniqueness of solutions of (1.6) with $\alpha>0$. In this case, the meaning of solutions is unambiguous, but we will also need the following stronger concepts of solutions.
In the next definition, we denote by $C^{\mathrm {loc}}_x$ the space of continuous functions from $\mathbb {R}^d$ to itself, endowed with the topology of uniform convergence on compact sets. Correspondingly, $L^1_t C^{\mathrm {loc}}_x$ denotes the set of functions $f:[0,1]\times \mathbb {R}^d\to \mathbb {R}^d$ such that, for all smooth compactly supported g, $f g\in L^1([0,1]; C_b(\mathbb {R}^d;\mathbb {R}^d))$, where $C_b(\mathbb {R}^d;\mathbb {R}^d)$ denotes the Banach space of continuous and bounded functions, endowed with the supremum norm. As for most localized spaces, it is easy to check that $L^1_t C^{\mathrm {loc}}_x$ is a separable Fréchet space.
Definition 4.1.
(i) Assume $b\in L^1_t C^{\mathrm {loc}}_x$ and let $\gamma :[0,1]\to \mathbb {R}^d$ be bounded and measurable. A semiflow associated to the ODE
$$ \begin{align} y_t = y_0 + \int_0^t b_s (y_s) \mathrm{d} s + \gamma_t \tag{4.1} \end{align} $$
is a jointly measurable map $\Phi :[0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ such that
• for all $(s,x)\in [0,1]\times \mathbb {R}^d$ and all $t\in [s,1]$, one has
$$\begin{align*}\Phi_{s\to t}(x)=x+\int_s^t b_r\big(\Phi_{s\to r}(x)\big)\mathrm{d} r+\gamma_t-\gamma_s; \end{align*}$$
• for all $(s,r,t,x)\in [0,1]^3_\leq \times \mathbb {R}^d$, one has $\Phi _{s\to t}(x)=\Phi _{r\to t}\big (\Phi _{s\to r}(x)\big )$.
(ii) A flow is a semiflow such that for all $(s,t)\in [0,1]^2_\leq $, the map $x\mapsto \Phi _{s\to t}(x)$ is a homeomorphism of $\mathbb {R}^d$.
(iii) If $\gamma $ is a stochastic process, a random (semi)flow is a jointly measurable map $\Phi :\Omega \times [0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ such that for $\mathbb {P}$-almost all $\omega \in \Omega $, the map $\Phi ^\omega :[0,1]^2_\leq \times \mathbb {R}^d\to \mathbb {R}^d$ is a (semi)flow associated to (4.1) with $\gamma =\gamma (\omega )$.
(iv) We say that a random (semi)flow is adapted if for all $(s,t,x)\in [0,1]^2_\leq \times \mathbb {R}^d$, the random variable $\Phi _{s\to t}(x)$ is $\mathcal {F}_t$-measurable.
(v) Given $\beta \in (0,1)$, we say that a (semi)flow is locally $\beta $-Hölder continuous if for all K, there exists a constant N such that for all $(s,t,x,y)\in [0,1]_\leq ^2\times B_K^2$, one has $|\Phi _{s\to t}(x)-\Phi _{s\to t}(y)|\leq N|x-y|^\beta $.
Remark 4.2. Definition 4.1 is based on Kunita’s classical one; cf. [Reference Kunita69, Theorem II.4.3]; it is slightly different (in fact, stronger) from other definitions proposed in the literature, like [Reference Fedrizzi and Flandoli37, Definition 5.1], due to the ordering of the quantifiers. One can draw a nice analogy between this kind of difference and the one between so-called crude and perfect random dynamical systems; cf. [Reference Zhang101, Remark 2.5].
Theorem 4.3. Assume (A), $\alpha>0$, and let $b\in L^q_t C^\alpha _x$. Then there exists an adapted random semiflow of solutions to (1.6) that is furthermore $\mathbb {P}$-almost surely locally $\beta $-Hölder continuous for all $\beta \in (0,1)$.
Proof. Let $m\in [2,\infty )$, to be specified later. Take a sequence of functions $(b^{n})_{n\in \mathbb {N}}$ such that $b^{n}\in L^q_tC^{2}_x$ and $\|b^{n}\|_{L^q_t C^\alpha _x}\leq \|b\|_{L^q_t C^\alpha _x}$ for all $n\in \mathbb {N}$, and $\|b^{n}-b\|_{L^q_t C^{\alpha -1}_x}\to 0$ as $n\to \infty $. Replacing b by $b^{n}$ in (1.6), the equation clearly admits an adapted random semiflow which we denote by $\Phi ^{n}$. For fixed $(s,t)\in [0,1]^2_\leq $, $x\in \mathbb {R}^d$, and $n,n'\in \mathbb {N}$, we may apply Theorem 3.2 to obtain the bound
$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t}^{n'}(x)\big\|_{L^m}\lesssim \|b^{n}-b^{n'}\|_{L^q_t C^{\alpha-1}_x}. \end{align*} $$
Here and below, the only important feature of the hidden proportionality constant in $\lesssim $ is that it is independent of $n,n'$. Next, let $(s,s',t),(s,s',t')\in [0,1]^3_\leq $, $x,x'\in \mathbb {R}^d$, and $n\in \mathbb {N}$. Then from applying Theorem 3.2 again, we get
$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t}^{n}(x')\big\|_{L^m}\lesssim |x-x'|; \end{align*} $$
by a trivial estimate, we get
$$ \begin{align*} \big\|\Phi_{s\to t}^{n}(x)-\Phi_{s\to t'}^{n}(x)\big\|_{L^m}\lesssim |t-t'|^{H\wedge (1/q')}, \end{align*} $$
and using the semigroup property and Theorem 3.2 once more, we have
$$ \begin{align} \|\Phi_{s\to t}^{n}(x)-\Phi_{s'\to t}^{n}(x)\|_{L^m}&=\|\Phi_{s'\to t}^{n}(\Phi_{s\to s'}^{n}(x))-\Phi_{s'\to t}^{n}(x)\|_{L^m} \nonumber\\ &\lesssim\|\Phi_{s\to s'}^{n}(x)-x\|_{L^m}\lesssim |s'-s|^{H\wedge (1/q')}. \end{align} $$
We therefore get that the sequence $\big (\Phi ^{n}\big )_{n\in \mathbb {N}}$ is, on the one hand, Cauchy in $C_{s,t,x} L^m_\omega $ and, on the other hand, bounded in $C_{s,t}C^1_xL^m_\omega \cap C_x C^{H\wedge (1/q')}_{s,t}L^m_\omega $. This implies that for some random field $\Phi $, one has $\Phi ^{n}\to \Phi $ in $C_{s,t}C^{1-\kappa }_xL^m_\omega \cap C_x C^{H\wedge (1/q')-\kappa }_{s,t} L^m_\omega $, where $\kappa>0$ is arbitrary. By Kolmogorov’s continuity theorem, for sufficiently large m, the convergence also holds in $L^m_\omega C_{s,t}C^{1-2\kappa ,\mathrm {loc}}_x\cap L^m_\omega C^{\mathrm {loc}}_x C^{H\wedge (1/q')-2\kappa }_{s,t}$. This yields the claimed spatial regularity of $\Phi $. The fact that $\Phi $ is indeed a semiflow for (1.6) follows instead from the locally uniform convergence of $\Phi ^n$ to $\Phi $, each $\Phi ^n$ being a semiflow, and the spatial continuity of the drift b.
Theorem 4.4. Assume (A), $\alpha>0$, and let $b\in L^q_t C^\alpha _x$. Then there exists an event $\tilde \Omega $ of full probability such that for all $\omega \in \tilde \Omega $, for all $(S,T)\in [0,1]^2_\leq $, $x\in \mathbb {R}^d$, there exists only one solution to (1.6) on $[S,T]$ with initial condition x.
The theorem will follow immediately from Theorem 4.3 and the following lemma, which is a refinement of the technique illustrated in [Reference Shaposhnikov91, Theorem 3.1].
Lemma 4.5. Let $\gamma :[0,1]\to \mathbb {R}^d$ be bounded and measurable, $b\in L^1_t C^{\alpha ,\mathrm {loc}}_x$, and consider the ODE (4.1). Suppose that it admits a locally $\beta $-Hölder continuous semiflow $\Phi $ with
$$ \begin{align} \beta(1+\alpha)>1. \end{align} $$
Then for any $(S,T)\in [0,1]_\leq ^2$ and $y\in \mathbb {R}^d$, there exists a unique solution to the ODE on the interval $[S,T]$ with initial condition y, given by $\Phi _{S\to \cdot }(y)$.
Proof. Suppose that there exists another solution to the ODE, given by $(z_t)_{t\in [S,T]}$. Since both z and $\Phi _{S\to \cdot }(y)$ are bounded, we may and will assume $b\in L^1_t C^{\alpha }_x$ and that $\Phi $ is globally $\beta $-Hölder continuous. Define the control $w=w_{b,\alpha ,1}$.
Now let us fix $\tau \in [S,T]$ and define the map $f_t:= \Phi _{t\to \tau } (z_t) - \Phi _{S\to \tau }(y)$ for $t\in [S,\tau ]$. If we are able to show that f is constant in time, then $f \equiv f_S=0$ (recall that $z_S=y$), which implies $\Phi _{t\to \tau }(z_t)=\Phi _{S\to \tau }(y)$ and in turn, by choosing $t=\tau $, gives $z_\tau =\Phi _{\tau \to \tau }(z_\tau )=\Phi _{S\to \tau }(y)$. Since the above argument holds for any $\tau \in [S,T]$, we reach the conclusion.
It remains to prove that f is constant on $[S,\tau ]$. To this end, first observe that for any $S\leq s\leq t\leq \tau $, it holds
$$ \begin{align} |f_{s,t}| & =|\Phi_{t\to \tau}(z_t)-\Phi_{s\to \tau}(z_s)| \nonumber\\ & =|\Phi_{t\to \tau}(z_t)-\Phi_{t\to \tau}(\Phi_{s\to t}(z_s))| \lesssim |\Phi_{s\to t}(z_s)-z_t|^\beta. \end{align} $$
Next, by the definition of semiflow, it holds
$$\begin{align*}\Phi_{s\to t}(z_s)-z_t=\int_s^t [b_r(\Phi_{s\to r} (z_s))-b_r(z_r)] \mathrm{d} r, \end{align*}$$
which immediately implies $|\Phi _{s\to t}(z_s)-z_t|\lesssim w(s,t)$; we can improve the estimate by recursively inserting it in the above identity:
$$ \begin{align*} |\Phi_{s\to t}(z_s)-z_t| & \leq \int_s^t |b_r(\Phi_{s\to r} (z_s))-b_r(z_r)| \mathrm{d} r\\ & \leq \int_s^t \| b_r\|_{C^\alpha} |\Phi_{s\to r} (z_s)-z_r|^\alpha \mathrm{d} r \lesssim w(s,t)^{1+\alpha}. \end{align*} $$
Inserting the above in estimate (4.4), we can conclude that
$$\begin{align*}|f_{s,t}| \lesssim |\Phi_{s\to t}(z_s)-z_t|^\beta \lesssim w(s,t)^{\beta (1+\alpha)}.\end{align*}$$
Since 
 $\beta (1+\alpha )>1$
 and w is a control, f must be necessarily constant.
$\beta (1+\alpha )>1$
 and w is a control, f must be necessarily constant.
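For completeness, the last step can be spelled out as follows; this is only a sketch, under the standard assumption (not restated in this section) that a control $w$ is continuous, superadditive and vanishes on the diagonal. Write $\theta:=\beta(1+\alpha)>1$ and take an arbitrary partition $s=t_0<t_1<\dots<t_N=t$ of $[s,t]\subset[S,\tau]$. Then
$$ \begin{align*} |f_{s,t}| \leq \sum_{i=0}^{N-1}|f_{t_i,t_{i+1}}| \lesssim \sum_{i=0}^{N-1} w(t_i,t_{i+1})^{\theta} \leq \Big(\max_{0\leq i<N} w(t_i,t_{i+1})\Big)^{\theta-1}\, w(s,t), \end{align*} $$
where the last inequality uses superadditivity. By uniform continuity of $w$ and $w(r,r)=0$, the maximum vanishes as the mesh of the partition tends to zero, so that $f_{s,t}=0$ for all $s\leq t$.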
Remark 4.6. In the functional setting of Definition 4.1, path-by-path uniqueness implies pathwise uniqueness, which in turn implies uniqueness in law by the Yamada–Watanabe theorem [Reference Yamada and Watanabe100, Proposition 1]; we refer to [Reference Shaposhnikov and Wresch92] for a general overview on the various notions of strong/weak existence and uniqueness.
Remark 4.7. The statement of Lemma 4.5 is given for deterministic initial data y and semiflow $\Phi$, but immediately extends to random ones: if $X_0$ is an $\mathcal{F}_0$-measurable random variable, then $\big(\Phi_{0\to t}(X_0)\big)_{t\in [0,1]}$ is clearly the unique adapted solution with initial condition $X_0$.
5 Strong well-posedness for distributional drift
When $\alpha<0$, the very first question one has to address is the meaning of the equation, or more precisely, the meaning of the integral in (1.6). We start with some consequences of Lemma 3.1. Denote by $\overline{C^\alpha}$ the closure of $C^1$ in $C^\alpha$. Recall that for any $\alpha<\alpha'$, one has $C^{\alpha'}\subset \overline{C^\alpha}$.
Corollary 5.1. Assume (A) and $\alpha<0$, and take $\delta>0$. Define the linear map $T^{B^H}:L^q_tC^{1+\delta}_x\to L^\infty_\omega C_tC^\delta_x$ by
$$ \begin{align*} \big(T^{B^H}h\big)_t(x)=\int_0^t h_r(B^H_r+x)\mathrm{d} r. \end{align*} $$
Denote $w=w_{h,\alpha,q}$. Then, for any $m\in[2,\infty)$, there exists a constant $K=K(m,H,\alpha,q,d,w(0,1))$ such that for all $(s,t)\in[0,1]^2_\leq$ and $x,y\in\mathbb{R}^d$, one has the bound
$$ \begin{align} \big\|&\|\big(T^{B^H}h\big)_{s,t}(x)-\big(T^{B^H}h\big)_{s,t}(y)\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty} \nonumber \\ &\qquad\leq K|x-y|w(s,t)^{1/q}|t-s|^{1/q'+(\alpha-1)H}. \end{align} $$
Moreover, for any $\kappa\in(0,1)$ sufficiently small, there exists a constant $K=K(m,H,\alpha,q,d,w(0,1),\kappa)$ such that one has the bound
$$ \begin{align} \Bigg\|\sup_{0\leq s<t\leq 1}\frac{\|\big(T^{B^H}h\big)_{s,t}\|_{C^{1-\kappa,2\kappa}_x}}{w(s,t)^{1/q}|t-s|^{1/q'+(\alpha-1)H-\kappa}}\Bigg\|_{L^m}\leq K. \end{align} $$
Consequently, with $p=\big(1+(\alpha-1)H\big)^{-1}\in(1,2)$, the mapping $h\mapsto T^{B^H} h$ takes values in $L^m_\omega C^{(p+\kappa)-{\mathrm{var}}}_tC^{1-\kappa,2\kappa}_x$, and as such, it extends continuously to $L^q_t \overline{C^\alpha_x}$. This extension also satisfies the bounds (5.1)–(5.2).
Proof. Applying Lemma 3.1 with $(t,z)\mapsto (x-y)\cdot \int_0^1\nabla h_t(z+\theta x+(1-\theta)y)\mathrm{d}\theta$ in place of h yields (5.1). The bound (5.2) follows from (3.2) and (5.1) by Kolmogorov’s continuity theorem in the form of Corollary A.5.
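The two ingredients of this proof can be made slightly more explicit; what follows is only a sketch, assuming that $q'$ denotes the conjugate exponent of $q$. For $h\in L^q_tC^{1+\delta}_x$, the fundamental theorem of calculus gives
$$ \begin{align*} h_t(z+x)-h_t(z+y)=(x-y)\cdot\int_0^1\nabla h_t\big(z+\theta x+(1-\theta)y\big)\mathrm{d}\theta, \end{align*} $$
so the spatial increment of $T^{B^H}h$ is again an integral of the type treated in Lemma 3.1, now with $\nabla h$ (one degree less regular than $h$) in place of $h$; this is consistent with the exponent $(\alpha-1)H$ appearing in (5.1). As for the time regularity, the total exponent in the denominator of (5.2) is
$$ \begin{align*} \frac{1}{q}+\frac{1}{q'}+(\alpha-1)H-\kappa = 1+(\alpha-1)H-\kappa=\frac{1}{p}-\kappa, \end{align*} $$
which corresponds to finite $(p+\kappa)$-variation up to relabelling $\kappa$. Note also that $p\in(1,2)$ is equivalent to $(\alpha-1)H\in(-1/2,0)$.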
Corollary 5.1 motivates introducing some temporary notation. Given (A), set $p_{\alpha,H}=\big(\big(1+(\alpha-1)H\big)^{-1}+2\big)/2\in(1,2)$, and for any $h\in L^q_t C^\alpha_x$, we define the event
$$ \begin{align*} \Omega_h:=\Big\{\omega\in \Omega: \, T^{B^H}h(\omega)\in C^{p_{\alpha,H}-{\mathrm{var}}}_tC^{1-\kappa,2\kappa}_x\,\,\forall\kappa>0\Big\}, \end{align*} $$
which is therefore of full probability.
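To see where the full-probability claim comes from, note that $p_{\alpha,H}$ is simply the midpoint between $p=\big(1+(\alpha-1)H\big)^{-1}$ from Corollary 5.1 and $2$:
$$ \begin{align*} p_{\alpha,H}=\frac{p+2}{2},\qquad p_{\alpha,H}-p=2-p_{\alpha,H}=\frac{2-p}{2}>0. \end{align*} $$
Hence $p+\kappa\leq p_{\alpha,H}<2$ for all sufficiently small $\kappa>0$, and, assuming the usual inclusion $C^{p-{\mathrm{var}}}_t\subset C^{p'-{\mathrm{var}}}_t$ for $p\leq p'$, Corollary 5.1 places $T^{B^H}h$ in $C^{p_{\alpha,H}-{\mathrm{var}}}_tC^{1-\kappa,2\kappa}_x$ almost surely for each such $\kappa$.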
The regularity of $T^{B^H}$ obtained from Corollary 5.1 is sufficient to define a notion of solution via nonlinear Young formalism. For the proof of the next statement, we refer to [Reference Galeati46], which can be readily adapted to the p-variation framework; see also [Reference Anzeletti, Richard and Tanré2].
Lemma 5.2. Let $A:[0,1]\times \mathbb{R}^d\to \mathbb{R}^n$ and $x:[0,1]\to \mathbb{R}^d$ satisfy $A\in C_t^{p-{\mathrm{var}}}C^{\eta,\mathrm{loc}}_x$ and $x\in C^{\zeta-{\mathrm{var}}}_t$ such that the exponents $p,\zeta\in[1,\infty)$, $\eta\in(0,1]$ satisfy
$$ \begin{align*} \frac{1}{p}+\frac{\eta}{\zeta}>1. \end{align*} $$
Then the nonlinear Young integral
$$ \begin{align*} y_t=\int_0^t A_{\mathrm{d} s}(x_s):=\lim_{\ell\to \infty}\sum_{j=0}^{2^\ell-1}A_{j2^{-\ell}t,(j+1)2^{-\ell}t}(x_{j2^{-\ell}t}) \end{align*} $$
is well-defined. If $A\in C_t^{p-{\mathrm{var}}}C^{\eta}_x$, then for all $(s,t)\in[0,1]_\leq^2$, y satisfies the bound
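$$ \begin{align} \big|y_{s,t}-A_{s,t}(x_s)\big| \leq N\, [A]_{C^{p-{\mathrm{var}}}_{[s,t]}C^{\eta}_x}\, [x]^{\eta}_{C^{\zeta-{\mathrm{var}}}_{[s,t]}}, \end{align} $$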
where the constant N depends only on $1/p+\eta/\zeta$.
Definition 5.3. Assume (A), $\alpha<0$ and $b\in L^q_t C^\alpha_x$. Given $\omega\in\Omega_b$, we say that a path x is an $\omega$-path solution to (1.6) if $x=\varphi+B^H(\omega)$, $\varphi\in C^{\zeta-{\mathrm{var}}}_t$ for some $\zeta$ satisfying $1/p_{\alpha,H}+1/\zeta>1$ and the equality
$$ \begin{align} \varphi_t=\varphi_0 + \int_0^t \big(T^{B^H}b(\omega)\big)_{\mathrm{d} s}(\varphi_s) \end{align} $$
holds for all $t\in[0,1]$, the integral being understood in the nonlinear Young sense. We say that a stochastic process X is a path-by-path solution to (1.6) if, for $\mathbb{P}$-a.e. $\omega\in\Omega_b$, $X(\omega)$ is an $\omega$-path solution in the above sense. Given this formulation of the SDE, the concepts of strong and weak solutions are analogous to the classical ones; see Section 1.5 above.
Typically, we encounter more special cases of nonlinear Young integrals than the generality that Lemma 5.2 allows. First of all, the spatial growth of A is often quantified (as in, for example, Corollary 5.1). Secondly, whenever $\varphi$ is a solution to a nonlinear Young equation, it is automatically of finite p-variation, and its temporal regularity can often be controlled by that of A (see, for example, [Reference Galeati46, Section 3.2] in the Hölder case or Lemma B.1 in Appendix B).
We can then define the notion of flows similarly to Definition 4.1. In fact, the following definition extends the previous one: for functional drifts, taking $A=T^\gamma b$, using the Riemann sums characterization of the nonlinear Young integral, one can easily verify that
$$ \begin{align*} \int_0^t (T^\gamma b)_{\mathrm{d} s} (\varphi_s) = \int_0^t b_s(\varphi_s+\gamma_s) \mathrm{d} s \quad \forall\, t\in [0,1]. \end{align*} $$
Therefore, in the functional case, Definitions 4.1 and 5.4 coincide via the change of variables
$$ \begin{align} \Psi_{s\to t}(x)=\Phi_{s\to t}(x+\gamma_s)-\gamma_t. \end{align} $$
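The identification of the nonlinear Young integral with the classical one can be checked directly on the Riemann sums. The following is a sketch under two assumptions that are not spelled out in this section: that, by analogy with $T^{B^H}$, one has $(T^\gamma b)_t(x)=\int_0^t b_r(x+\gamma_r)\mathrm{d} r$, and that $b$ is $\alpha$-Hölder in space with $\int_0^1\|b_r\|_{C^\alpha}\mathrm{d} r<\infty$ for some $\alpha\in(0,1]$. For a partition $0=t_0<t_1<\dots<t_N=t$ and a continuous path $\varphi$,
$$ \begin{align*} \Big|\sum_{j=0}^{N-1}(T^\gamma b)_{t_j,t_{j+1}}(\varphi_{t_j})-\int_0^t b_r(\varphi_r+\gamma_r)\mathrm{d} r\Big| &=\Big|\sum_{j=0}^{N-1}\int_{t_j}^{t_{j+1}}\big[b_r(\varphi_{t_j}+\gamma_r)-b_r(\varphi_r+\gamma_r)\big]\mathrm{d} r\Big|\\ &\leq \max_{0\leq j<N}\,\sup_{r\in[t_j,t_{j+1}]}|\varphi_r-\varphi_{t_j}|^{\alpha}\int_0^t\|b_r\|_{C^\alpha}\mathrm{d} r, \end{align*} $$
and the right-hand side tends to zero along the dyadic partitions by the uniform continuity of $\varphi$ on $[0,t]$.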
Definition 5.4. Assume $A\in C_t^{p-{\mathrm{var}}} C^{\eta,\mathrm{loc}}_x$ for some $\eta\in(0,1]$, $p\in[1,2)$ satisfying $(1+\eta)/p>1$. A semiflow associated to the nonlinear Young equation
$$ \begin{align} y_t = y_0 + \int_0^t A_{\mathrm{d} s} (y_s) \end{align} $$
is a jointly measurable map $\Psi:[0,1]^2_\leq\times \mathbb{R}^d\to \mathbb{R}^d$ such that
• for all $(s,x)\in[0,1]\times \mathbb{R}^d$, one has $\Psi_{s\to \cdot}(x)\in C^{p-{\mathrm{var}}}_t$ and for all $t\in[s,1]$, one has the equality $$ \begin{align*} \Psi_{s\to t}(x)=x+\int_s^t A_{\mathrm{d} r}\big(\Psi_{s\to r}(x)\big); \end{align*} $$
• for all $(s,r,t,x)\in[0,1]^3_\leq\times \mathbb{R}^d$, one has $\Psi_{s\to t}(x)=\Psi_{r\to t}\big(\Psi_{s\to r}(x)\big)$.
The definitions of flow, random (semi)flow, adaptedness, and Hölder continuity are then exactly as in Definition 4.1.
We are now in a position to state and prove our existence and uniqueness theorems in the case of distributional drift.
Theorem 5.5. Assume (A), $\alpha<0$, and let $b\in L^q_t C^\alpha_x$. Then there exists an adapted random semiflow of solutions to (1.6) that is furthermore locally $\beta$-Hölder continuous $\mathbb{P}$-almost surely for all $\beta\in(0,1)$.
Proof. At the cost of an arbitrarily small loss of regularity, we may and will assume $b\in L^q_t \overline{C^\alpha_x}$. The proof follows similar steps as that of Theorem 4.3. We take $m\in[2,\infty)$, to be chosen large enough later, as well as a sequence of functions $(b^n)_{n\in\mathbb{N}}$ such that $b^n\in L^q_tC^{2}_x$ and $\|b^n\|_{L^q_t C^\alpha_x}\leq \|b\|_{L^q_t C^\alpha_x}$ for all $n\in\mathbb{N}$, and $\|b^n-b\|_{L^q_t C^{\alpha-1}_x}\to 0$ as $n\to\infty$. Replacing b by $b^n$ in (1.6), the equation clearly admits an adapted random semiflow $\Psi^n_{s\to t}$. For fixed $(s,t)\in[0,1]^2_\leq$, $x\in\mathbb{R}^d$, and $n,n'\in\mathbb{N}$, by Theorem 3.2, one has the bound
$$ \begin{align*} \big\|\Psi_{s\to t}^n(x)-\Psi_{s\to t}^{n'}(x)\big\|_{L^m}\lesssim \|b^n-b^{n'}\|_{L^q_t C^{\alpha-1}_x}. \end{align*} $$
Similarly, for $(s,t)\in[0,1]^2_\leq$, $x,x'\in\mathbb{R}^d$, and $n\in\mathbb{N}$, Theorem 3.2 yields
$$ \begin{align*} \big\|\Psi_{s\to t}^{n}(x)-\Psi_{s\to t}^{n}(x')\big\|_{L^m}\lesssim |x-x'|. \end{align*} $$
The temporal regularity is obtained from Lemma 2.4: in our present notation, we get
$$ \begin{align*} \big\|\Psi_{s\to t}^{n}(x)-\Psi_{s\to t'}^{n}(x)\big\|_{L^m}\lesssim w_{b,\alpha,q}(t,t')^{1/q}|t'-t|^{\alpha H+1/q'}=:\tilde w(t,t')^{1+\alpha H} \end{align*} $$
with $\tilde w$ defined by the above equality. Regularity in the s variable is obtained precisely as in (4.2). From these estimates, we obtain the convergence
$$ \begin{align*} \Psi^{n}\to \Psi\qquad\text{in }L^m_\omega C_{s,t}C^{1-\kappa,\mathrm{loc}}_x\cap L^m_\omega C^{\mathrm{loc}}_x C^{p_{\alpha,H}-{\mathrm{var}}}_{s,t} \end{align*} $$
to a limit $\Psi$ just as in Theorem 4.3 with all the required properties shown in the same way, except for the fact that $\Psi_{s\to \cdot}(x)$ solves the equation on $[s,1]$ with initial condition x in the nonlinear Young sense. Since at this point s and x are fixed, we assume for simplicity $s=0, x=0$ and denote $\Psi^{n}_{0\to t}(0)=\psi^{n}_t$, $\Psi_{0\to t}(0)=\psi_t$. It is sufficient to show the convergence
$$ \begin{align*} \int_0^t \big(T^{B^H}b^{n}\big)_{\mathrm{d} s}(\psi^{n}_s)\to \int_0^t\big(T^{B^H}b\big)_{\mathrm{d} s}(\psi_s) \end{align*} $$
in probability for each $t\in[0,1]$. Recall that by Corollary 5.1, we have that
$$ \begin{align*} T^{B^H}(b^{n}-b)\to 0 \qquad\text{in } C^{p_{\alpha,H}-{\mathrm{var}}}_tC^{1-\kappa,\mathrm{loc}}_x \end{align*} $$
in probability. From the above, we have that $\psi^{n}$ converges to $\psi$ (and in particular is bounded) in $C^{p_{\alpha,H}-{\mathrm{var}}}_t$ in probability. Therefore, if we take an auxiliary $\ell\in\mathbb{N}$ and write
$$ \begin{align*} \int_0^t &\big(T^{B^H}b^{n}\big)_{\mathrm{d} s}(\psi^{n}_s)- \int_0^t\big(T^{B^H}b\big)_{\mathrm{d} s}(\psi_s) \\ &=\int_0^t \big(T^{B^H}b^{\ell}\big)_{\mathrm{d} s}(\psi^{n}_s)- \int_0^t\big(T^{B^H}b^{\ell}\big)_{\mathrm{d} s}(\psi_s) \\ &\qquad-\int_0^t \big(T^{B^H}(b^{\ell}-b^{n})\big)_{\mathrm{d} s}(\psi^{n}_s)+ \int_0^t \big(T^{B^H}(b^{\ell}-b)\big)_{\mathrm{d} s}(\psi_s), \end{align*} $$
then we can first choose $\ell$ and n large enough to make the third and fourth integrals small, and then keep the same $\ell$ and increase n further to make the difference of the first two terms small, using the Lipschitz continuity of $b^{\ell}$. This concludes the proof.
Theorem 5.6. Assume (A), $\alpha<0$, and let $b\in L^q_t C^\alpha_x$. Then there exists an event $\tilde\Omega$ of full probability such that for all $\omega\in\tilde\Omega$, for all $(S,T)\in[0,1]^2_\leq$, $x\in\mathbb{R}^d$, there exists only one $\omega$-path solution to (1.6) on $[S,T]$ with initial condition x; in other words, path-by-path uniqueness holds.
Remark 5.7. In analogy to Remark 4.6, the strong form of uniqueness coming from Theorem 5.6 readily implies pathwise uniqueness of solutions defined on random time intervals (e.g., stopping times) as well as uniqueness in law of weak solutions. In fact, it gives us uniqueness in a larger class of possibly non-adapted pathwise solutions since the nonlinear Young formalism does not require adaptedness of the processes in consideration. However, Theorem 5.5 tells us that the unique solution is in fact a strong one.
Notice, however, that all these considerations only apply in the framework of Definition 5.3, namely if the SDE is interpreted in the nonlinear Young sense as (5.4). Unlike in the functional setting, in the distributional setting there is no canonical notion of solution, and one can in principle find alternative concepts which fall outside the framework of Definition 5.3 and Theorem 5.6; for a practical example, see Definition 8.1 further below.
Theorem 5.6 follows from a version of Lemma 4.5 in the nonlinear Young setting, which is a generalization of Theorem 5.1 from [Reference Galeati46].
Lemma 5.8. Let $A\in C^{p-{\mathrm{var}}}_t C^{\eta,\mathrm{loc}}_x$ for some $\eta\in(0,1]$, $p\in[1,2)$ satisfying $(1+\eta)/p>1$. Suppose that the nonlinear YDE
$$ \begin{align*} x_t = \int_0^t A_{\mathrm{d} s}(x_s) \end{align*} $$
admits a semiflow $\Psi$ that is locally $\beta$-Hölder continuous for every $\beta\in(0,1)$. Then for any $(S,T)\in[0,1]_\leq^2$ and $y\in\mathbb{R}^d$, there exists a unique solution to the nonlinear YDE on $[S,T]$ starting from y, which is given by $\Psi_{S\to \cdot}(y)$.
Proof. The proof is very similar to that of Lemma 4.5, so we will mostly sketch it. Let z be a solution on $[S,T]$ starting from y, which by definition belongs to $C^{q-{\mathrm{var}}}_t$ with some q such that $1/p+\eta/q>1$. Thus, z is bounded, and in particular, after localizing the argument, we may assume that $\Psi$ is globally $\beta$-Hölder and that $A\in C^{p-{\mathrm{var}}}_t C^{\eta}_x$; furthermore, since the inequalities involving $(\eta,p,q)$ are strict, we can assume $\eta\in(0,1)$.
Set $w(s,t):=[A]^p_{C^{p-{\mathrm{var}}}_{[s,t]}C^{\eta}_x}$, the p-th power of the p-variation seminorm of A on $[s,t]$; an application of Lemma B.1 readily informs us that
$$ \begin{align} |\Psi_{s\to t}(x)-x-A_{s,t}(x)| \lesssim w(s,t)^{\frac{1+\eta}{p}} \end{align} $$
uniformly in $(s,t)\in[0,1]_\leq^2$ and $x\in\mathbb{R}^d$ (the hidden constant can depend on $w(0,1)$); a similar bound also holds for $\Psi_{s\to t}(x)$ replaced by $z_t$.
As before, we fix $\tau\in[S,T]$ and set $f_t:= \Psi_{t\to \tau}(z_t)-\Psi_{S\to \tau}(y)$; in order to conclude, it suffices to show that f is constant. As in (4.4), we have $|f_{s,t}|\lesssim |\Psi_{s\to t}(z_s)-z_t|^\beta$. Moreover, by definition of solution to the YDE and estimate (5.7), it holds that
$$ \begin{align*} |\Psi_{s\to t}(z_s)-z_t| = \big|\Psi_{s\to t}(z_s)-z_s - A_{s,t}(z_s) - (z_t-z_s-A_{s,t}(z_s))\big| \lesssim w(s,t)^{\frac{1+\eta}{p}}. \end{align*} $$
Combining the two estimates, we get
$$ \begin{align*} |f_{s,t}|\lesssim w(s,t)^{\frac{\beta(1+\eta)}{p}}; \end{align*} $$
by assumption, we can choose $\beta$ close enough to $1$ so that $\beta(1+\eta)/p$ is bigger than $1$, implying the conclusion.
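The admissible range of $\beta$ can be read off explicitly. Since $(1+\eta)/p>1$ by assumption,
$$ \begin{align*} \frac{\beta(1+\eta)}{p}>1 \quad\Longleftrightarrow\quad \beta>\frac{p}{1+\eta},\qquad\text{where}\quad \frac{p}{1+\eta}\in(0,1), \end{align*} $$
so any $\beta\in\big(p/(1+\eta),1\big)$ works. With such a choice, the increments of f are bounded by a control raised to a power strictly larger than one, which, as at the end of the proof of Lemma 4.5, forces f to be constant.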
6 Flow regularity and Malliavin differentiability
So far, we have established the existence of a random Hölder continuous semiflow $\Phi_{s\to t}(x)$; the aim of this section is to strengthen this result by establishing better properties for $\Phi$. We will start by showing that $\Phi$ is a random flow in the sense that for each fixed $s<t$, the maps $x\mapsto \Phi_{s\to t}(x)$ are invertible; see Theorem 6.1 below. The main body of the section is devoted to the proof of Theorem 6.2, showing that both $\Phi_{s\to t}$ and its spatial inverse $\Phi_{s\leftarrow t}$ admit continuous derivatives. We conclude the section by showing that the random variables $\Phi_{s\to t}(x)$ possess a rather strong form of Malliavin differentiability; see Theorem 6.8 below.
From now on, we will use both $\Phi_{s\to t}(x)$ and $\Phi_{s\to t}(x;\omega)$ to denote the semiflow, so as to stress the dependence on the fixed element $\omega\in\Omega$ whenever needed; we start with the promised invertibility.
Theorem 6.1. Let (A) hold, $b\in L^q_tC^\alpha_x$, and denote by $\Phi_{s\to t}(x;\omega)$ the semiflow of solutions constructed in Theorems 4.3 and 5.5. Then there exists an event $\tilde{\Omega}$ of full probability such that, for all $\omega\in\tilde{\Omega}$ and all $(s,t)\in[0,1]^2_\leq$, the map $x\mapsto \Phi_{s\to t}(x;\omega)$ is a bijection.
Proof. We closely follow the classical arguments by Kunita (cf. [Reference Kunita69, Lemmas II.4.1-II.4.2]), as they do not rely on the driving noise being Brownian.
First, let us define the family of random variables
$$ \begin{align*} \eta_{s,t}(x,y) := |\Phi_{s\to t}(x) - \Phi_{s\to t}(y)|^{-1}. \end{align*} $$
Set $\gamma=H\wedge 1/q'$ for $\alpha\geq 0$, $\gamma = \alpha H + 1/q'$ in the case $\alpha<0$. Recall that the estimates in the proof of Theorem 4.3, respectively Theorem 5.5, overall yield
$$ \begin{align} \| \Phi_{s\to t}(x) - \Phi_{s'\to t'}(y)\|_{L^m} \lesssim |s-s'|^\gamma + |t-t'|^\gamma + |x-y|; \end{align} $$
moreover, by taking expectation in (3.5), we have
$$ \begin{align} \| |\Phi_{s\to t}(x) - \Phi_{s\to t}(y)|^{-1} \|_{L^m} \lesssim |x-y|^{-1}. \end{align} $$
Fix any $\delta>0$. We can combine estimates (6.1) and (6.2) and argue as in [Reference Kunita69, Lemma II.4.1] to deduce that for any $s<t$ and any x, $x'$, y, $y'$ satisfying $|x-y|>\delta$, $|x'-y'|>\delta$, it holds
$$ \begin{align} \| & \eta_{s,t} (x,y)-\eta_{s',t'}(x',y')\|_{L^m} \nonumber\\ & \lesssim \delta^{-2} \Big[ |x-x'|+|y-y'|+ (1+|x|+|x'|+|y|+|y'|)(|t-t'|^\gamma + |s-s'|^\gamma) \Big]. \end{align} $$
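One possible route to (6.3) is sketched here, under the assumption that (6.1) and (6.2) hold for every $m\in[2,\infty)$ with constants depending on m. Writing $a=\Phi_{s\to t}(x)-\Phi_{s\to t}(y)$ and $a'=\Phi_{s'\to t'}(x')-\Phi_{s'\to t'}(y')$, the reverse triangle inequality gives
$$ \begin{align*} |\eta_{s,t}(x,y)-\eta_{s',t'}(x',y')| = \frac{\big||a'|-|a|\big|}{|a|\,|a'|} \leq \eta_{s,t}(x,y)\,\eta_{s',t'}(x',y')\,|a-a'|, \end{align*} $$
and Hölder's inequality together with (6.1)–(6.2) yields
$$ \begin{align*} \|\eta_{s,t}(x,y)-\eta_{s',t'}(x',y')\|_{L^m} &\leq \|\eta_{s,t}(x,y)\|_{L^{3m}}\,\|\eta_{s',t'}(x',y')\|_{L^{3m}}\,\|a-a'\|_{L^{3m}}\\ &\lesssim |x-y|^{-1}|x'-y'|^{-1}\Big[|x-x'|+|y-y'|+|s-s'|^\gamma+|t-t'|^\gamma\Big], \end{align*} $$
which is bounded by the right-hand side of (6.3) whenever $|x-y|>\delta$ and $|x'-y'|>\delta$.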
From (6.3), one can apply Kolmogorov’s continuity theorem to deduce that the map $(s,t,x,y)\mapsto \eta_{s,t}(x,y;\omega)$ is continuous on the domain $\{s<t, |x-y|>\delta\}$ for $\mathbb{P}$-a.e. $\omega$. As the argument works for any $\delta>0$, we can find an event $\tilde{\Omega}$ of full probability such that, for all $\omega\in\tilde{\Omega}$, the map $\eta_{s,t}(x,y;\omega)$ is continuous on $\{s<t, |x-y|\neq 0\}$, which implies that it must also be finite for all $s<t, x\neq y$. This clearly implies injectivity of $x\mapsto \Phi_{s\to t}(x;\omega)$ for all $s<t$ and $\omega\in\tilde{\Omega}$.
We move to proving surjectivity, which this time is closely based on [Reference Kunita69, Lemma II.4.2], having established the key inequalities (6.1) and (6.2). Let $\hat {\mathbb {R}}^d=\mathbb {R}^d\cup \{\infty \}$ be the one-point compactification of $\mathbb {R}^d$; set $\hat {x}=x/|x|^2$ for $x\in \mathbb {R}^d\setminus \{0\}$ and $\hat {x}=\infty $ for $x=0$. Define
$$ \begin{align*} \tilde \eta_{s,t}(\hat{x}) = \begin{cases} (1+ |\Phi_{s\to t}(x)|)^{-1}\quad & \text{if } \hat{x}\in \mathbb{R}^d\setminus\{0\}\\ 0 & \text{if } \hat{x}=0. \end{cases} \end{align*} $$
Arguing as in [Reference Kunita69, Lemma II.4.2], we find
$$ \begin{align} \| \tilde\eta_{s,t}(\hat{x}) -\tilde\eta_{s',t'}(\hat{y})\|_{L^m} \lesssim |\hat{x}-\hat{y}| + |t-t'|^\gamma + |s-s'|^\gamma; \end{align} $$
by Kolmogorov’s theorem, we can find an event of full probability, which we still denote by $\tilde {\Omega }$, such that $\tilde \eta _{s,t}(\hat {x};\omega )$ is continuous at $\hat {x}=0$ and so that $\Phi _{s\to t}(\cdot ;\omega )$ can be extended to a continuous map from $\hat {\mathbb {R}}^d$ to itself for any $s<t$ and $\omega \in \tilde \Omega $. This extension, denoted by $\tilde {\Phi }_{s\to t}(x;\omega )$, is continuous in $(s,t,x)$ for every $\omega \in \tilde \Omega $, and thus, $\tilde {\Phi }_{s\to t}(\cdot \,; \omega )$ is homotopic to the identity map $\tilde {\Phi }_{s\to s}(\cdot \,;\omega )$, making it surjective. Since the extension maps $\infty $ to $\infty $, its restriction to $\mathbb {R}^d$ must then be surjective as well, from which we can conclude that $x\mapsto \Phi _{s\to t}(x;\omega )$ is surjective for all $s<t$ and $\omega \in \tilde {\Omega }$.
Our next goal is to establish that $\Phi $ is in fact a random flow of diffeomorphisms; by this, we mean that, in addition to the map $(s,t,x,\omega )\mapsto \Phi _{s\to t}(x;\omega )$ satisfying all the properties listed in Definition 4.1, there exists an event of full probability $\tilde {\Omega }$ such that $x\mapsto \Phi _{s\to t}(x;\omega )$ is a diffeomorphism for all $s<t$ and $\omega \in \tilde {\Omega }$. We will in fact prove a little bit more:
Theorem 6.2. Let (A) hold, $b\in L^q_tC^\alpha _x$, and $\Phi $ be the associated random flow. Then there exists a constant $\delta (\alpha ,H)>0$ and an event $\tilde \Omega $ of full probability such that for any $\omega \in \tilde {\Omega }$ and any $s<t$, the map $x\mapsto \Phi _{s\to t}(x;\omega )$ and its inverse are both $C^{1+\delta ,\mathrm {loc}}_x$.
In order to prove Theorem 6.2, we will first assume b to be sufficiently smooth ($b\in L^q_t C^{1+\kappa }_x$ would suffice), so that the associated $\Phi $ is already known to be a flow of diffeomorphisms, and derive estimates which only depend on $\| b\|_{L^q_t C^\alpha _x}$ (cf. Lemma 6.3 and Proposition 6.4 below). Establishing the result rigorously for general b is then accomplished by standard approximation procedures, in the style of Theorems 4.3 and 5.5. We will frequently use the exponent $\varepsilon =(\alpha -1) H+1/q'$ from Lemma 3.1; recall that (A) is equivalent to $\varepsilon>0$.
Recall that, for regular b, the Jacobian of the flow, namely the matrix $J_{s\to t}^x := \nabla \Phi _{s\to t}(x)\in \mathbb {R}^{d\times d}$, is known to satisfy the variational equation
$$ \begin{align} J_{s\to t}^x = I + \int_s^t \nabla b_r(\Phi_{s\to r}(x)) J_{s\to r}^x \mathrm{d} r. \end{align} $$
Already from this fact we can deduce useful moment estimates for $J^x_{s\to t}$.
Lemma 6.3. Assume (A) and let $b\in L^q_t C^2_x$. Then there exists $p(\alpha ,H)<2$ with the following property: for any $m\in [1,\infty )$, there exists a constant $N=N(m,p,H,\alpha ,q,d,\| b\|_{L^q_t C^\alpha _x})$ such that, for all $x\in \mathbb {R}^d$ and $s\in [0,1]$, it holds

moreover, for fixed $\delta <\varepsilon $, for any $x\in \mathbb {R}^d$ and $s \leq t \leq t'$, it holds
$$ \begin{align} \| J^x_{s\to t} - J^x_{s\to t'} \|_{L^m} \lesssim |t-t'|^\delta. \end{align} $$
Proof. For fixed $s\in [0,1]$ and $x\in \mathbb {R}^d$, setting $A_{s,t}:= \int _s^t \nabla b_r(\Phi _{s\to r}(x)) \mathrm {d} r$, equation (6.5) can be regarded as a linear Young differential equation. Arguing as in the proof of Theorem 3.2, one can show that A has finite p-variation for some $p<2$ and that in fact there exists $\mu>0$ (depending on the usual parameters and $\| b\|_{L^q_t C^\alpha _x}$, but not on x nor s) such that
$$ \begin{align} \mathbb{E}\bigg[ \exp\bigg( \mu \bigg| \sup_{s\leq t<t'\leq 1} \frac{|A_{t,t'}|}{w_{b,\alpha,q}(t,t')^{1/q} |t-t'|^\delta} \bigg|^2\bigg)\bigg] <\infty; \end{align} $$
Lemma B.2 in Appendix B (with $\tilde {p}=p$) then implies the pathwise estimate

Claim (6.6) then follows by taking $L^m$-norms on both sides and observing (as in the proof of Theorem 3.2) that (6.8) implies  for all $\lambda>0$. Similarly, claim (6.7) also follows from Lemma B.2 (this time applying estimate (B.4) therein) combined with (6.8).
The next step in the proof of Theorem 6.2 is given by the following key estimate.
Proposition 6.4. Let b be a regular drift and define $J^x_{s\to t}$ as above; set $\varepsilon =(\alpha -1)H + 1/q'$. Then there exists $\gamma \in (0,1)$ such that, for any $m\in [1,\infty )$, there exists $N=N(m,\gamma ,H,\alpha ,q,d,\| b\|_{L^q_t C^\alpha _x})$ such that
$$ \begin{align} \| J^x_{s\to t} - J^y_{s'\to t'}\|_{L^m} \leq N\big[ |x-y|^{\gamma} + |t-t'|^{\varepsilon \gamma} + |s-s'|^{\varepsilon \gamma} \big] \end{align} $$
for all $(s,t), (s',t')\in [0,1]^2_\leq $ and $x,y\in \mathbb {R}^d$.
The proof requires the following technical refinement of Lemma 3.1.
Lemma 6.5. Assume (A), $h\in L^q_t C^1_x$, and let $\varphi ^i$, $i=1,2$, be two processes satisfying the assumptions of Lemma 3.1 for the same control w; define $\varepsilon $ as therein and set $\psi ^i_t=\int _S^t h_r(B^H_r+\varphi ^i_r) \mathrm {d} r$. Then for $\gamma \in (0,1)$ satisfying
$$ \begin{align} \varepsilon-\gamma H>0, \quad \varepsilon(2-\gamma)-\gamma H>0, \quad \varepsilon(2-\gamma)-\gamma H + (2-\gamma)/q>1, \end{align} $$
and any $m\in [2,\infty )$, there exists $N=N(m,\gamma ,H,\alpha ,q,d,\| h\|_{L^q_t C^{\alpha -1}_x})$ such that
$$ \begin{align*} \| (\psi^1-\psi^2)_{s,t} \|_{L^m} \leq N |t-s|^{\varepsilon-\gamma H} w_{h,\alpha-1,q}(s,t)^{\frac{1}{q}} \big(1+w(s,t)\big) \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma. \end{align*} $$
Remark 6.6. The conditions in (6.10) should be understood as ‘$\gamma $ small enough’. Indeed, note that all three conditions are upper bounds on $\gamma $, and under condition (A), we can always find $\gamma>0$ satisfying (6.10): as $\gamma \downarrow 0$, the three conditions become, respectively, $\varepsilon>0$, $2\varepsilon>0$, and $2\varepsilon +2/q>1$, all of which hold since $\varepsilon>0$ and $q\leq 2$.
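As a reading aid, the three conditions in (6.10) can be solved explicitly for $\gamma $; the rearrangement below is elementary and uses no input beyond (6.10) itself:
$$ \begin{align*} \varepsilon-\gamma H>0 \quad &\Longleftrightarrow\quad \gamma<\frac{\varepsilon}{H},\\ \varepsilon(2-\gamma)-\gamma H>0 \quad &\Longleftrightarrow\quad \gamma<\frac{2\varepsilon}{\varepsilon+H},\\ \varepsilon(2-\gamma)-\gamma H + \frac{2-\gamma}{q}>1 \quad &\Longleftrightarrow\quad \gamma<\frac{2\varepsilon+2/q-1}{\varepsilon+H+1/q}. \end{align*} $$
When $\varepsilon>0$ and $q\leq 2$, all three right-hand sides are strictly positive, so any $\gamma \in (0,1)$ below their minimum is admissible.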
Proof. The proof is very similar to that of Lemma 3.1, so we will mostly sketch it; the main differences are just the use of Lemma 2.5 with $n=m$ and some interpolation arguments.
Define $A^i_{s,t} = \mathbb {E}_{s-(t-s)}\int _s^t h_r (B^H_r + \mathbb {E}_{s-(t-s)}\varphi ^i_r) \mathrm {d} r$ so that $\psi ^1-\psi ^2$ is the stochastic sewing of $A:=A^1-A^2$. Arguing similarly as in Lemma 3.1, we have the estimate
$$ \begin{align*} \|A_{s,t}\|_{L^m} & \leq \bigg\| \int_s^t \| P_{|r-s_1|^{2H}} h_r\|_{C^\gamma_x}\, |\mathbb{E}_{s_1} \varphi^1_r-\mathbb{E}_{s_1} \varphi^2_r|^\gamma \mathrm{d} r \bigg\|_{L^m}\\ & \lesssim |t-s|^{\varepsilon-\gamma H} w_{h,\alpha-1,q}(s,t)^{1/q} \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma; \end{align*} $$
the first condition of Lemma 2.5 is verified since $\varepsilon -\gamma H>0$ and $1/q \geq 1/2$. To control $\mathbb {E}_{s_1} \delta A_{s,u,t}=\mathbb {E}_{s_1} \delta A^1_{s,u,t}-\mathbb {E}_{s_1} \delta A^2_{s,u,t}$, we can decompose it as $\mathbb {E}_{s_1} \delta A_{s,u,t} = I^1-I^2+J^1-J^2$, similarly to Lemma 3.1. Estimating each one of them separately as therein yields
$$ \begin{align*} \sup_i \{|I^i|,|J^i|\}\lesssim |t-s|^{2\varepsilon}w_{h,\alpha-1,q}(s,t)^{1/q}w(s_1,t)^{1/q}; \end{align*} $$
moreover, we have
$$ \begin{align*} \| I^1-I^2\|_{L^m} & \leq \bigg\| \int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi^1_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi^1_r)\big|\mathrm{d} r \\ & \quad -\int_{s_4}^{s_5}\big|P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_1}\varphi^2_r)-P_{|r-s_2|^{2H}}h_r(\mathbb{E}_{s_2}B^H_r+\mathbb{E}_{s_2}\varphi^2_r)\big|\mathrm{d} r\bigg\|_{L^m}\\ & \leq \int_{s_4}^{s_5} \| P_{|r-s_2|^{2H}}h_r\|_{C^1_x} \big( \| \mathbb{E}_{s_1} \varphi^1_r-\mathbb{E}_{s_1} \varphi^2_r\|_{L^m} + \| \mathbb{E}_{s_2} \varphi^1_r-\mathbb{E}_{s_2} \varphi^2_r\|_{L^m}\big) \mathrm{d} r\\ & \lesssim |t-s|^{(\alpha-2)H + 1/q'} w_{h,\alpha-1,q}(s,t)^{1/q} \sup_{r\in [S,T]} \| \varphi_r^1-\varphi^2_r\|_{L^m}, \end{align*} $$
and similarly for $\| J^1-J^2\|_{L^m}$. Interpolating the two bounds together overall yields
$$ \begin{align*} \| \mathbb{E}_{s_1} \delta A_{s,u,t}\|_{L^m} \lesssim |t-s|^{\varepsilon(2-\gamma)-\gamma H} w_{h,\alpha-1,q}(s,t)^{1/q} w(s_1,t)^{\frac{1-\gamma}{q}} \sup_{r\in [S,T]} \| \varphi^1_r-\varphi^2_r\|_{L^m}^\gamma. \end{align*} $$
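The exponents in the last display can be checked by hand. Assuming the interpolation is the elementary one, $\min \{a,b\}\leq a^{1-\gamma }b^{\gamma }$, applied to the bound of order $|t-s|^{2\varepsilon }$ and to the Lipschitz-type bound of order $|t-s|^{(\alpha -2)H+1/q'}$, and recalling that $(\alpha -2)H+1/q'=\varepsilon -H$, the power of $|t-s|$ is
$$ \begin{align*} (1-\gamma)\, 2\varepsilon + \gamma\, (\varepsilon-H) = \varepsilon(2-\gamma)-\gamma H, \end{align*} $$
while $w(s_1,t)$ carries the power $(1-\gamma )/q$ and the supremum carries the power $\gamma $. Adding the powers of $|t-s|$ and of the two controls gives
$$ \begin{align*} \varepsilon(2-\gamma)-\gamma H + \frac{1}{q} + \frac{1-\gamma}{q} = \varepsilon(2-\gamma)-\gamma H + \frac{2-\gamma}{q}, \end{align*} $$
which is exactly the quantity required to exceed $1$ in the third condition of (6.10).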
By the hypothesis (6.10), the power of $|t-s|$ is positive and the total power of all the controls is greater than $1$. The conclusion then follows from Lemma 2.5.
Proof of Proposition 6.4.
As usual, we can split estimate (6.9) into three subestimates, with two of the three parameters $(s,t,x)$ fixed and only one varying. From now on, we will fix $\gamma \in (0,1)$ satisfying condition (6.10).
 
Step 1: $(s,x)$ fixed, $t<t'$. In this case, the desired estimate is just (6.7) from Lemma 6.3, for the choice $\delta =\gamma \varepsilon < \varepsilon $.
 
Step 2: $(s,t)$ fixed, $x\neq y$. The difference process $v_t:=J^x_{s\to t}-J^{y}_{s\to t}$ satisfies an affine Young equation of the form $ \mathrm {d} v_t = \mathrm {d} A_t\, v_t + \mathrm {d} z_t$, $v_s=0$, for
$$ \begin{align*} A_t = \int_s^t \nabla b_r(\Phi_{s\to r}(x)) \mathrm{d} r, \quad z_t = \int_s^t \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s\to r}(y))\big] J^y_{s\to r} \mathrm{d} r; \end{align*} $$
invoking as usual Lemma B.2 (for $\tilde {p}=p$) and applying estimate (6.8), one ends up with

Observe that z itself can be interpreted as a Young integral: $z_t= \int _s^t \mathrm {d} \tilde {A}_r J^y_{s\to r}$ for
$$\begin{align*}\tilde{A}_u:=\int_s^u \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s\to r}(y))\big] \mathrm{d} r. \end{align*}$$
Standard properties of the Young integral, together with Cauchy’s inequality, then yield

by estimate (6.6), it only remains to find a bound for  . Recall that by construction $\Phi _{s\to r}(x) = \varphi _{s\to r}(x) + B^H_r$, where the process $\varphi _{s\to \cdot }(x)$ satisfies condition (2.2) (or even (2.4) for $\alpha <0$) for $w=w_{b,\alpha ,q}$. We can apply Lemma 6.5 with the choice $h=\nabla b$, $\varphi ^1_r=\varphi _{s\to r}(x)$, $\varphi ^2_r=\varphi _{s\to r}(y)$ to obtain, for all $s\leq r<u\leq 1$ and all $m\in [1,\infty )$,
$$ \begin{align*} \| \tilde{A}_{r,u}\|_{L^m} &\lesssim |r-u|^{\varepsilon-\gamma H} w(r,u)^{1/q} (1+ \| b\|_{L^q_t C^\alpha_x}^q) \sup_{r\in [s,1]} \| \varphi^1_r - \varphi^2_r\|_{L^m}^\gamma\\ & \lesssim |r-u|^{\varepsilon-\gamma H} w(r,u)^{1/q} |x-y|^\gamma, \end{align*} $$
where in the second inequality, we used estimate (6.1). By Lemma A.3 in Appendix A, we deduce that, for any $m\in [1,\infty )$ and $\delta <\varepsilon -\gamma H$, it holds

Combining all the above estimates yields the conclusion in this case.
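For the reader's convenience, here is the formal computation behind the affine equation for v used in Step 2; it is only a sketch, and relies on nothing more than the variational equation (6.5) written at x and at y, with $v_r=J^x_{s\to r}-J^y_{s\to r}$:
$$ \begin{align*} v_t & = \int_s^t \big[ \nabla b_r(\Phi_{s\to r}(x))\, J^x_{s\to r} - \nabla b_r(\Phi_{s\to r}(y))\, J^y_{s\to r} \big] \mathrm{d} r\\ & = \int_s^t \nabla b_r(\Phi_{s\to r}(x))\, v_r\, \mathrm{d} r + \int_s^t \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s\to r}(y))\big] J^y_{s\to r}\, \mathrm{d} r\\ & = \int_s^t \mathrm{d} A_r\, v_r + z_t. \end{align*} $$
The second line follows by adding and subtracting $\nabla b_r(\Phi _{s\to r}(x))\, J^y_{s\to r}$ inside the integral.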
 
Step 3: $(t,x)$ fixed, $s<s'$. This step is mostly a variation on the arguments presented in the previous cases, so we only sketch it. We can write
$$\begin{align*}J^x_{s\to t}= J^x_{s\to s'} + \int_{s'}^t \nabla b_r(\Phi_{s\to r}(x)) J^x_{s\to r} \mathrm{d} r \end{align*}$$
so that the difference $v_t= J^x_{s\to t} - J^x_{s'\to t}$ can be regarded as the solution to an affine Young equation on $[s',t]$, for A and z defined similarly as in Step 2; the only difference is that now $v_{s'} = J^x_{s\to s'}-I$ and $z_t = \int _{s'}^t \mathrm {d} \tilde A_r J^x_{s'\to r}$ for the choice
$$ \begin{align*} \tilde{A}_u:=\int_{s'}^u \big[ \nabla b_r(\Phi_{s\to r}(x)) - \nabla b_r(\Phi_{s'\to r}(x))\big] \mathrm{d} r. \end{align*} $$
From here, the estimates are almost identical to those of Step 2, relying on a combination of Lemmas B.2, A.3 and 6.5; however, in this case, an application of Step 1 and estimate (6.1) gives us
$$ \begin{align*} \| J^x_{s\to s'}-I\|_{L^m} \lesssim |s-s'|^{\varepsilon \gamma}, \quad \sup_{r\in [s',1]} \| \Phi_{s\to r}(x)-\Phi_{s'\to r}(x)\|_{L^m}^\gamma \lesssim |s-s'|^{\varepsilon \gamma}. \end{align*} $$
We are now finally ready to complete the following:
Proof of Theorem 6.2.
The argument is based on Theorem II.4.4 from [Reference Kunita69]; assume first b to be a regular field. It is clear from (6.9) and Kolmogorov’s continuity theorem that for any $\delta <\varepsilon \gamma $, the map $(s,t,x)\mapsto J_{s\to t}^x$ is $\mathbb {P}$-a.s. locally $\delta $-Hölder continuous, with suitable moment estimates depending only on $\| b\|_{L^q_t C^\alpha _x}$. Furthermore, letting $K_{s\to t}^x$ denote the inverse of $J_{s\to t}^x$ in the sense of matrices, it is well-known that it solves the linear equation
$$ \begin{align} K_{s\to t}^x = I - \int_s^t K_{s\to r}^x\, \nabla b_r(\Phi_{s\to r}(x)) \mathrm{d} r; \end{align} $$
arguing as in the proof of Proposition 6.4, one can prove that
$$ \begin{align*} \| K^x_{s\to t} - K^y_{s'\to t'}\|_{L^m} \lesssim |x-y|^\gamma + |t-t'|^{\varepsilon \gamma} + |s-s'|^{\varepsilon \gamma} \end{align*} $$
so that it is $\mathbb {P}$-a.s. locally $\delta $-Hölder continuous as well.
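The linear equation for K can be recovered formally from (6.5). As a sketch, assume b is regular enough that $t\mapsto J^x_{s\to t}$ is absolutely continuous with invertible values; differentiating the identity $K^x_{s\to t} J^x_{s\to t}=I$ in t and using $\partial _t J^x_{s\to t}=\nabla b_t(\Phi _{s\to t}(x))\, J^x_{s\to t}$ gives, for a.e. t,
$$ \begin{align*} \partial_t K^x_{s\to t} = - K^x_{s\to t}\, \big(\partial_t J^x_{s\to t}\big)\, K^x_{s\to t} = - K^x_{s\to t}\, \nabla b_t(\Phi_{s\to t}(x))\, J^x_{s\to t} K^x_{s\to t} = - K^x_{s\to t}\, \nabla b_t(\Phi_{s\to t}(x)), \end{align*} $$
which, together with $K^x_{s\to s}=I$, is the differential form of the linear equation stated above. Note that the matrix $\nabla b$ multiplies K from the right, whereas it multiplies J from the left in (6.5).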
In the case of general $b\in L^q_t C^\alpha _x$, we can consider a sequence $b^n$ of regular functions such that $b^n\to b$ in $L^q_t C^\alpha _x$ (up to sacrificing a little bit of spatial regularity as usual), in which case we already know that the associated flows $\Phi ^n$ converge to $\Phi $ in $L^m_\omega C_{s,t} C^{\delta ,\mathrm {loc}}_x$; combined with the aforementioned moment estimates, one can then upgrade it to convergence in $L^m_\omega C_{s,t} C^{1+\delta ,\mathrm {loc}}_x$. In particular, the fields $J^{x,n}_{s\to t}=\nabla \Phi ^n_{s\to t}(x)$ and $K^{x,n}_{s\to t}=(\nabla \Phi ^n_{s\to t}(x))^{-1}$ converge respectively to $J^x_{s\to t}$ and $K^x_{s\to t}$; by the limiting procedure, there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $, it holds $J^x_{s\to t}(\omega )=\nabla \Phi _{s\to t}(x;\omega )$ and $J^x_{s\to t}(\omega ) K^x_{s\to t}(\omega )=I$ for all $s<t$ and $x\in \mathbb {R}^d$, as well as $J(\omega ), K(\omega )\in C_{s,t} C^{\delta ,\mathrm {loc}}_x$.
Overall, for every $\omega \in \tilde {\Omega }$, the map $(s,t,x)\mapsto \Phi _{s\to t}(x;\omega )$ has regularity $C_{s,t} C^{1+\delta ,\mathrm {loc}}_x$, and its Jacobian admits a continuous inverse $K^x_{s\to t}(\omega )$. This implies that, for any $s<t$, $\nabla \Phi _{s\to t}(x;\omega )$ is a nondegenerate matrix for all $x\in \mathbb {R}^d$, which by the inverse function theorem readily implies that the inverse of $x\mapsto \Phi _{s\to t}(x;\omega )$ must belong to $C^{1+\delta ,\mathrm {loc}}_x$ as well. This concludes the proof.
It is well-known in the regular case that the Jacobian of the flow and the Malliavin derivative satisfy the same type of linear equation. Therefore, as the last main result of the section, we show Malliavin differentiability of the random variables $X^x_{s\to t}(\omega ):= \Phi _{s\to t}(x;\omega )$. To this end, we start with a simple yet powerful lemma, showing that deterministic perturbations of the driving noise $B^H$ do not affect our solution theory.
Lemma 6.7. Assume (A), let $b\in L^q_t C^\alpha _x$, and let $h: [0,1]\to \mathbb {R}^d$ be a deterministic, measurable function; then for any $s\in [0,1]$ and any $x\in \mathbb {R}^d$, there exists a pathwise unique strong solution to the perturbed SDE
$$ \begin{align} X_t = x + \int_s^t b_r(X_r) \mathrm{d} r + B^H_{s,t} + h_{s,t} \quad \forall\, t\in [s,1], \end{align} $$
which we denote by $X_{s\to \cdot }(x;h)$; in the distributional case $\alpha <0$, equation (6.12) should be interpreted in the sense of Definition 5.3.
Proof. We give two short alternative arguments to verify the claim. On the one hand, carefully going through the proofs of Sections 2–3, the only key properties needed on the process $B^H$ (cf. also Remark 1.12) are its Gaussianity and the two-sided bounds
$$ \begin{align*} \mathbb{E}[ |B^H_t - \mathbb{E}_s B^H_t|^2] \sim |t-s|^{2H}, \end{align*} $$
which are clearly still true for $\tilde {B}^H=B^H+h$, due to h being deterministic.
Alternatively, if we define $\tilde {b}_r(z):= b_r(z+h_r)$, $y=x-h_s$, then solutions X to (6.12) are in $1$-$1$ correspondence with solutions $Y:=X-h$ to the unperturbed SDE
$$\begin{align*}Y_t = y + \int_s^t \tilde{b}_r(Y_r)\mathrm{d} r + B^H_{s,t}, \end{align*}$$
and it is clear that $\tilde {b}$ still satisfies condition (A), thus implying its well-posedness.
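The change of variables behind this second argument can be checked in one line. The following is a sketch for smooth b, writing $h_{s,t}=h_t-h_s$ as in (6.12) and using the shifted process $Y_t:=X_t-h_t$; it is only meant to make the bookkeeping explicit.

$$ \begin{align*} Y_t = X_t - h_t &= x + \int_s^t b_r(X_r)\mathrm{d} r + B^H_{s,t} + (h_t-h_s) - h_t\\ &= (x-h_s) + \int_s^t b_r(Y_r+h_r)\mathrm{d} r + B^H_{s,t}. \end{align*} $$

Thus Y solves an equation of the unperturbed form with initial datum $x-h_s$ and drift $z\mapsto b_r(z+h_r)$. Since spatial translations leave the $C^\alpha _x$-norm unchanged, the shifted drift has the same $L^q_t C^\alpha _x$-norm as b, the only additional point to check being its measurability in time.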
We can now pass to study Malliavin differentiability of $X^x_{s\to t}$. To this end, it is convenient to first recall the notion of $\mathcal {H}$-derivative. Let $\mathcal {H}^H$ denote the Cameron-Martin space associated to $B^H$; we say that a function $F:\Omega \to \mathbb {R}$ is $\mathcal {H}$-continuously differentiable if for $\mathbb {P}$-a.e. $\omega \in \Omega $, the map $h\mapsto F(\omega +h)$ is Fréchet differentiable from $\mathcal {H}^H$ to $\mathbb {R}$. In particular, this implies the existence of a random bounded linear operator $\partial F(\omega )$, which we call the $\mathcal {H}$-differential of F, such that $\mathbb {P}$-a.s.
$$ \begin{align*} \partial F(\omega)(h)=\partial_h F(\omega):= \lim_{\varepsilon\to 0} \frac{F(\omega+\varepsilon h)-F(\omega)}{\varepsilon}. \end{align*} $$
Denote by $\| \partial F\|$ the (random) operator norm of $\partial F(\omega )$, as a linear operator from $\mathcal {H}^H$ to $\mathbb {R}$. It is known (cf. [Reference Nualart80, Section 4.1.3]) that if $F\in L^2$ and $\| \partial F\|\in L^2$, then F is Malliavin differentiable and its Malliavin differential $DF$ satisfies $\|D F\|_{\mathcal {H}^H} = \| \partial F\|$ $\mathbb {P}$-a.s. For this reason, when dealing with $X_{s\to t}^x$, it will be convenient for us to manipulate directly the directional derivatives $\partial _h X^x_{s\to t}$. This notion of derivative allows one to consider h coming from a larger class than merely Cameron-Martin paths; see Remark 6.9 below for a more detailed explanation.
Theorem 6.8. Assume (A) and $b\in L^q_t C^\alpha _x$. In the setting of Lemma 6.7, let us set $X^x_{s,t}(h):= X_{s\to t}(x;h)$. Then $\mathbb {P}$-a.s. the random variables $\partial _h X^x_{s\to t}$ exist for all $h\in C^{2-{\mathrm {var}}}_t$ and define a (random) linear map $\partial X^x_{s,t}$. Moreover, for any $m\in [1,\infty )$, it holds
$$ \begin{align} \sup_{s\in [0,1],x\in \mathbb{R}^d} \Big\| \sup_{t\in [s,1]} \| \partial X^x_{s,t}\|_{\mathcal{L}(C^{2-{\mathrm{var}}};\mathbb{R}^d)} \Big\|_{L^m}<\infty. \end{align} $$
In particular, $X^x_{s\to t}$ is Malliavin differentiable, and for any $m\in [1,\infty )$, it holds
$$ \begin{align} \sup_{s\in [0,1], x\in \mathbb{R}^d} \Big\| \sup_{t\in [s,1]} \| D X^x_{s\to t} \|_{\mathcal{H}^H} \Big\|_{L^m} <\infty. \end{align} $$
Proof. For simplicity, we give the proof in the case where b is smooth, so that all the computations are rigorous, but keeping track that the estimate (6.14) only depends on $\| b\|_{L^q_t C^\alpha _x}$. The general case then follows by standard (but a bit tedious) approximation arguments, similar to those of Theorems 4.3–5.5; for estimate (6.14), one can alternatively invoke [Reference Nualart80, Lemma 1.5.3].
For smooth b, $\partial _h X^x_{s\to t}$ is classically characterized as the unique solution to the affine equation
$$ \begin{align} \partial_h X^x_{s\to t} = \int_s^t \nabla b_r(X^x_{s\to r}) \partial_h X^x_{s\to r} \mathrm{d} r + h_{s,t}. \end{align} $$
Consider the process $A_t:= \int _s^t \nabla b_r(X^x_{s\to r}) \mathrm {d} r$ as usual, which satisfies (6.8), so that it has $\mathbb {P}$-a.s. finite p-variation for some $p<2$, and moreover,

for all $\lambda \in \mathbb {R}$, where the estimate only depends on $\| b\|_{L^q_t C^\alpha _x}$ and does not depend on x or s. Interpreting (6.15) as an affine Young equation and applying Lemma B.2 from Appendix B with $\tilde {p}=2$, we then find $C>0$ such that

taking first the supremum over $h\in C^{2-{\mathrm {var}}}$ with $\| h\|_{2-{\mathrm {var}}}=1$ and then over $t\in [s,1]$, we arrive at the pathwise $\mathbb {P}$-a.s. inequality

Taking the $L^m$-norm on both sides, using (6.16), then readily yields (6.13).
Estimate (6.14) then follows from the isometric identification of $D X^x_{s,t}$ with $\partial X^x_{s,t}$, so that $\|D X^x_{s,t}\|_{\mathcal {H}^H}= \| \partial X^x_{s,t}\|$, combined with the functional embedding $\mathcal {H}^H\hookrightarrow C^{2-{\mathrm {var}}}_t$; see Lemma C.1 in Appendix C for $H\in (0,1/2)$ and recall that $\mathcal {H}^H\hookrightarrow C^{1-{\mathrm {var}}}_t$ for $H\geq 1/2$.
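To see why a bound of exponential type in A is natural here, it may help to look at the scalar case $d=1$ with smooth b, where the affine equation for $v_t:=\partial _h X^x_{s\to t}$ can be solved explicitly. The computation below is an illustrative sketch under these simplifying assumptions, not the argument used in the proof; the notation $a_r:=\nabla b_r(X^x_{s\to r})$ is introduced only for this purpose, so that $A_t=\int _s^t a_r\,\mathrm {d} r$.

$$ \begin{align*} v_t = \int_s^t a_r v_r\,\mathrm{d} r + h_{s,t} \quad\Longrightarrow\quad v_t = h_{s,t} + \int_s^t e^{A_t-A_r}\, h_{s,r}\, a_r\,\mathrm{d} r, \end{align*} $$

as one checks by substituting the right-hand side into the equation and exchanging the order of integration. In particular, v depends linearly on h and satisfies $|v_t|\leq \sup _{r\in [s,t]}|h_{s,r}|\,\big (1+e^{\sup _{r\in [s,t]}|A_t-A_r|}\int _s^t |a_r|\,\mathrm {d} r\big )$. This crude bound involves $\int _s^t |a_r|\,\mathrm {d} r$, which is not controlled by $\| b\|_{L^q_t C^\alpha _x}$; this is the reason for working with the p-variation of A and Young integration instead.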
Remark 6.9. Results on differentiability beyond the usual Malliavin sense, in the sense of the existence of $\partial _h X^x_{s,t}$ for h belonging to a larger class than $\mathcal {H}^H$, were already observed for standard SDEs in [Reference Kusuoka70] and have natural explanations in rough path theory (cf. [Reference Cass, Friz and Victoir19, Reference Friz and Victoir42]); in these works, however, only $h\in C^{\tilde p-{\mathrm {var}}}_t$ for some $\tilde p<2$ are allowed. Here instead, not only are we able to reach $C^{2-{\mathrm {var}}}_t$, but the result can be further strengthened to allow for some $\tilde p>2$: indeed, the key point is a combination of estimate (6.16) and Lemma B.2, which works as long as the condition $1/\tilde {p}>1-1/p$ is satisfied.
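The admissible range of $\tilde p$ in this last condition can be made explicit. As a worked example for $p\in (1,2)$ (the numerical value $p=3/2$ is chosen purely for illustration):

$$ \begin{align*} \frac{1}{\tilde p} > 1-\frac{1}{p} \quad\Longleftrightarrow\quad \tilde p < \frac{p}{p-1}, \qquad \text{e.g. } p=\tfrac32 \ \Longrightarrow\ \tilde p<3. \end{align*} $$

Since $p<2$ gives $p/(p-1)>2$, the value $\tilde p=2$ is always admissible, with some room above it that grows as p decreases towards 1.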
7 McKean-Vlasov equations
Armed with the stability estimate (3.4), we can now solve distribution dependent SDEs (henceforth DDSDEs) of the form
$$ \begin{align} X_t = X_0 + \int_0^t F_s(X_s,\mu_s)\mathrm{d} s + B^H_t, \quad \mu_t=\mathcal{L}(X_t). \end{align} $$
The initial condition $X_0$ is assumed to be $\mathcal {F}_0$-measurable – in particular, independent of $B^H$. The idea that estimates of the form (3.4), where the difference of two drifts only appears in the weaker norm of $L^q_t C^{\alpha -1}_x$, can be exploited to solve DDSDEs was first introduced in [Reference Galeati, Harang and Mayorcas51]; the results presented here can be regarded as a natural extension, requiring less time regularity on the drift and allowing us to cover $H>1$ as well. In particular, as in the previous sections, we will not need to exploit the Girsanov transform, which instead played a prominent role in [Reference Galeati, Harang and Mayorcas51].
Since our analysis also includes the case of distributional drifts F, we provide a meaningful definition of solution; observe that in the case F is actually continuous in the space variable (i.e., $\alpha>0$), it reduces to the classical one.
Definition 7.1. Let $H\in (0,\infty )\setminus \mathbb {N}$ and $F:[0,1]\times \mathcal {P}(\mathbb {R}^d)\to C^\alpha _x$ be a measurable function. We say that a tuple $(\Omega ,\mathbb {F},\mathbb {P}; X,B^H)$ is a weak solution to (7.1) if
Similarly to Definition 7.1, one can immediately extend the concepts of strong existence, pathwise uniqueness and uniqueness in law to the DDSDE (7.1). With a slight abuse, we will use the terminology input data of the DDSDE (7.1) to indicate both the pair $(X_0,B^H)$ (when discussing strong existence and/or pathwise uniqueness of solutions) and the pair $(\xi ,\mu ^H)=(\mathcal {L}(X_0),\mathcal {L}(B^H))$ (when discussing uniqueness in law). We are now ready to formulate our main assumptions on the drift F.
Assumption 7.2. Let $H\in (0,\infty )\setminus \mathbb {N}$ be fixed and $F:[0,1]\times \mathcal {P}(\mathbb {R}^d)\to C^\alpha _x$ be a measurable function; we assume that there exist parameters $(\alpha ,q)$ satisfying (A) and $h\in L^q_t$ such that
- i) for all $t\in [0,1],\, \mu \in \mathcal {P}(\mathbb {R}^d)$, it holds $\| F_t(\cdot ,\mu )\|_{C^\alpha _x} \leq h_t$;
- ii) for all $t\in [0,1],\, \mu ,\nu \in \mathcal {P}(\mathbb {R}^d)$, it holds $\| F_t(\cdot ,\mu )-F_t(\cdot ,\nu )\|_{C^{\alpha -1}_x} \leq h_t \mathbb {W}_1(\mu ,\nu )$.
Remark 7.3. Basic examples of F satisfying Assumption 7.2 include the following (for their verification, we refer to Section 2.1 from [Reference Galeati, Harang and Mayorcas51]):
- i) The true McKean–Vlasov case $F_t(\cdot ,\mu )=f_t(\cdot )+(g_t\ast \mu )(\cdot )$ for $f,g\in L^q_t C^\alpha _x$;
- ii) Mean-dependence of the form $F_t(\cdot ,\mu )=f_t(\cdot \,-\langle \mu \rangle )$, where $\langle \mu \rangle :=\int y\,\mu (\mathrm {d} y)$;
- iii) The mean $\langle \mu \rangle $ in ii) can be replaced by other functions of statistics (e.g., $\langle \psi ,\mu \rangle $ for $\psi \in C^1_x$); one can also take linear combinations of the previous examples.
Also, in Assumption 7.2, we only considered the $1$-Wasserstein distance $\mathbb {W}_1$, but, in fact, all the results below would also hold if we replaced $\mathbb {W}_1$ with $\mathbb {W}_p$ for some $p\in (1,\infty )$.
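For the reader's convenience, here is a sketch of why example i) satisfies point ii) of Assumption 7.2; the details are in the reference cited above. The argument assumes two standard facts, namely that the $C^{\alpha -1}_x$-norm is translation invariant and that $\| \nabla g\|_{C^{\alpha -1}_x}\lesssim \| g\|_{C^\alpha _x}$, and for distributional g it is to be understood after mollification. For any coupling $\pi $ of $\mu $ and $\nu $,

$$ \begin{align*} \| g_t\ast\mu - g_t\ast\nu\|_{C^{\alpha-1}_x} &= \Big\| \int \big( g_t(\cdot-y)-g_t(\cdot-y')\big)\,\pi(\mathrm{d} y,\mathrm{d} y')\Big\|_{C^{\alpha-1}_x}\\ &\leq \int \| g_t(\cdot-y)-g_t(\cdot-y')\|_{C^{\alpha-1}_x}\,\pi(\mathrm{d} y,\mathrm{d} y') \lesssim \| g_t\|_{C^\alpha_x} \int |y-y'|\,\pi(\mathrm{d} y,\mathrm{d} y'). \end{align*} $$

Taking the infimum over couplings gives a bound by $\| g_t\|_{C^\alpha _x}\mathbb {W}_1(\mu ,\nu )$, so that both points of Assumption 7.2 hold with $h_t$ a suitable multiple of $\| f_t\|_{C^\alpha _x}+\| g_t\|_{C^\alpha _x}$, which belongs to $L^q_t$.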
Theorem 7.4. Let F satisfy Assumption 7.2. Then for any $\mathcal {F}_0$-measurable $X_0 \in L^1_\omega $ (respectively, $\xi \in \mathcal {P}_1(\mathbb {R}^d)$), strong existence, pathwise uniqueness and uniqueness in law of solutions to (7.1) hold.
Proof. We start by showing strong existence and pathwise uniqueness by means of a contraction argument. Specifically, suppose we are given a filtered probability space $(\Omega ,\mathbb {F},\mathbb {P})$ on which are defined an $\mathbb {F}$-fBm $B^H$ and an $\mathcal {F}_0$-measurable $X_0\in L^1_\omega $. Consider the space of adapted processes
$$\begin{align*}E:=\Big\{Y:[0,1]\to\mathbb{R}^d:\, Y \text{ is adapted to }\mathcal{F}_t, \sup_{t\in [0,1]} \| Y_t\|_{L^1}<\infty \Big\},\end{align*}$$
which is a complete metric space when endowed with the metric
$$\begin{align*}d_E(Y,Z):=\sup_{t\in [0,1]} e^{-\lambda \int_0^t |h_s|^q \mathrm{d} s} \| Y_t-Z_t\|_{L^1} \end{align*}$$
for a parameter $\lambda>0$ to be chosen later. Define a map I acting on E by letting $I(Y)$ be the unique solution X to the SDE driven by $B^H$, with initial data $X_0$ (cf. Remark 4.7) and drift $b^Y_t:=F_t(\cdot \, ,\mathcal {L}(Y_t))$; the map I is well-defined thanks to Point i) from Assumption 7.2, ensuring the solvability of such SDE. Note that X is a solution to the DDSDE (7.1) on the space $(\Omega ,\mathbb {F},\mathbb {P})$ with input data $(X_0,B^H)$ if and only if it is a fixed point for I.
We claim that I is a contraction on $(E,d_E)$; indeed, given any $Y^1,\,Y^2$, by the stability estimate (3.4) and Assumption 7.2, for any $t\in [0,1]$, it holds
$$ \begin{align*} \| I(Y^1)_t-I(Y^2)_t\|_{L^1}^q & \lesssim \int_0^t \| F_s(\cdot\,,\mathcal{L}(Y^1_s))-F_s(\cdot\,,\mathcal{L}(Y^2_s))\|_{C^{\alpha-1}}^q \mathrm{d} s\\ & \lesssim \int_0^t |h_s|^q \,\mathbb{W}_1(\mathcal{L}(Y^1_s),\mathcal{L}(Y^2_s))^q \mathrm{d} s\\ & \lesssim d_E(Y^1,Y^2)^q \int_0^t |h_s|^q\, e^{q \lambda \int_0^s |h_r|^q \mathrm{d} r} \mathrm{d} s\\ & \lesssim (q \lambda)^{-1}\, e^{\lambda q \int_0^t |h_r|^q \mathrm{d} r}\, d_E(Y^1,Y^2)^q. \end{align*} $$
Rearranging the terms, we overall find the estimate
$$\begin{align*}d_E\big(I(Y^1),I(Y^2)\big)^q \leq \frac{C}{q \lambda}\, d_E(Y^1,Y^2)^q, \end{align*}$$
from which contractivity follows by choosing $\lambda $ appropriately. Pathwise uniqueness then readily follows; as the argument holds for any choice of $\mathbb {F}$, we can take $\mathcal {F}_t=\sigma \{X_0, B^H_s, s\leq t\}$, yielding strong existence.
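The last two steps of the chain of inequalities, and the resulting choice of $\lambda $, can be spelled out as follows; this is a routine computation, with C denoting the implicit constant and $\Lambda _t:=\int _0^t |h_r|^q\mathrm {d} r$ a shorthand introduced only here. The third inequality uses $\mathbb {W}_1(\mathcal {L}(Y^1_s),\mathcal {L}(Y^2_s))\leq \| Y^1_s-Y^2_s\|_{L^1}\leq e^{\lambda \Lambda _s}\, d_E(Y^1,Y^2)$, while the fourth follows from

$$ \begin{align*} \int_0^t |h_s|^q\, e^{q\lambda \Lambda_s}\,\mathrm{d} s = \frac{1}{q\lambda}\big(e^{q\lambda\Lambda_t}-1\big)\leq \frac{1}{q\lambda}\,e^{q\lambda \Lambda_t}. \end{align*} $$

Multiplying by $e^{-q\lambda \Lambda _t}$ and taking the supremum over t gives the contraction estimate; any $\lambda >C/q$ then makes the constant $C/(q\lambda )$ strictly smaller than 1.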
To establish uniqueness in law, it suffices to observe that if X is a weak solution, then we can construct a copy of it on any reference probability space simply by solving therein the SDE associated to $b^X_t(\cdot )= F_t(\cdot ,\mathcal {L}(X_t))$: by weak uniqueness for the SDE associated to $b^X$ (see Remark 4.6), the solution $\tilde {X}$ constructed in this way must have the same law as the original X and thus be a solution to the DDSDE itself. Given any pair of weak solutions $X^1,X^2$, possibly defined on different probability spaces, we can then construct a coupling $(\tilde {X}^1,\tilde {X}^2)$ of them on the same probability space, solving the DDSDE for the same input data $(X_0,B^H)$; by the previous argument, it must hold $\tilde {X}^1\equiv \tilde {X}^2$ and so $\mathcal {L}(X^1)=\mathcal {L}(X^2)$.
Remark 7.5. In fact, going through the same strategy of proof as in [Reference Galeati, Harang and Mayorcas51] allows one not only to establish wellposedness of the DDSDE but also to derive stability estimates for it. Specifically, assume we are given fields $F^i$, $i=1,2$, satisfying Assumption 7.2 for the same parameters $(\alpha ,q)$ and functions $h^i\in L^q_t$ and define the quantity
$$\begin{align*}\| F^1-F^2\|_{\alpha-1,q} :=\bigg( \int_0^1 \sup_{\mu\in \mathcal{P}_1} \big\| F^1_t(\cdot\,,\mu)-F^2_t(\cdot\,,\mu)\big\|_{C^{\alpha-1}_x}^q \mathrm{d} t\bigg)^{1/q}. \end{align*}$$
Then for any $m\in [1,\infty )$, there exists a constant C, depending on $\alpha ,q,H,m,d, \| h^i\|_{L^q}$, such that any two solutions $X^i$ defined on the same space with input data $(X_0^i, B^H)$ satisfy
$$ \begin{align} \big\| \| X^1-X^2\|_{C^0_t} \big\|_{L^m} \leq C \big(\|X^1_0-X^2_0\|_{L^m} + \| F^1-F^2\|_{\alpha-1,q}\big); \end{align} $$
in the case of solutions defined on different spaces, using (7.2) and a coupling argument, we can easily deduce bounds on the Wasserstein distances of their laws. In the true McKean–Vlasov case – namely, $F^i_t(\cdot \,,\mu )=f^i_t+g^i_t\ast \mu $ with $f^i,g^i\in L^q_t C^\alpha _x$ – it holds
$$\begin{align*}\| F^1-F^2\|_{\alpha-1,q} \lesssim \| f^1-f^2\|_{L^q_t C^{\alpha-1}_x} + \| g^1-g^2\|_{L^q_t C^{\alpha-1}_x}. \end{align*}$$
8 Weak compactness and weak existence
So far, we have shown that, under suitable conditions on b (condition (A)), we have (very) strong existence and uniqueness results. However, as we are now going to show, stochastic sewing also allows one to establish weak existence and weak compactness of solutions in the regime (B) (defined just before Theorem 1.5), similarly to [Reference Athreya, Butkovsky, Lê and Mytnik3, Theorem 2.6(i)], [Reference Anzeletti, Richard and Tanré2, Theorem 2.8]. For other applications of sewing techniques and compactness arguments, see also [Reference Bechtold and Hofmanová7].
This section is also our way to say something about the equation in the case $q>2$ that goes beyond the trivial inclusion $L^q_t\subset L^2_t$.
Since here we assume $\alpha <0$, it is a priori not fully clear what it means to be a weak solution to the equation. Contrary to Section 5, where a robust interpretation was accomplished by the nonlinear Young formalism, here we will adopt the following, weaker definition, adapting the notion from [Reference Bass and Chen6]. This allows us to prove weak existence more generally; see, however, Remark 8.5 for a comparison.
Definition 8.1. Let $b\in L^q_t C^\alpha _x$ for some $\alpha <0$. We say that a tuple $(\Omega ,{\mathbb {F}},\mathbb {P};X,B^H)$ consisting of a filtered probability space and a pair of continuous processes $(X,B^H)$ is a weak solution to the SDE
$$ \begin{align} X_t = x_0 + \int_0^t b_s(X_s)\mathrm{d} s + B^H_t \end{align} $$
if $B^H$ is an $\mathbb {F}$-fBm of parameter H, X is $\mathbb {F}$-adapted, and $X_t=x_0+V_t+B^H_t$, where the process $V_t$ has the property that, for any sequence of smooth bounded functions $b^n$ converging to b in $L^q_t C^\alpha _x$, it holds that
$$ \begin{align*} \Big\|\int_0^\cdot b^n(s,X_s)\mathrm{d} s - V_\cdot\Big\|_{C^0_t} \to 0 \quad \text{in probability.} \end{align*} $$
Theorem 8.2. Let $H\in (0,1)$ and $b\in L^q_t C^\alpha _x$, satisfying (B). Then for any $x_0\in \mathbb {R}^d$ there exists a weak solution to the SDE (8.1) in the sense of Definition 8.1.
Remark 8.3. The above result is only interesting in the regime $H\in (0,1)$ and $q>2$, cf. Remark 1.6. Indeed, for $H>1$, the condition $\alpha>1/2-1/(2H)$ automatically enforces $\alpha>0$, for which existence follows by classical Peano-type results; instead, for $q\leq 2$, (B) implies (A) and so strong wellposedness follows from the previous sections.
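The dichotomy in H is a one-line computation; the numerical values below are illustrative only.

$$ \begin{align*} \frac12-\frac{1}{2H} = \frac{H-1}{2H} \ \begin{cases} <0, & H\in(0,1),\\ >0, & H>1,\end{cases} \qquad \text{e.g. } H=\tfrac14:\ -\tfrac32, \quad H=\tfrac32:\ \tfrac16. \end{align*} $$

Thus a negative regularity exponent $\alpha <0$ is compatible with this lower bound only when $H<1$.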
First we need the following lemma.
Lemma 8.4. Let $H\in (0,1)$, $(\alpha ,q)$ be parameters satisfying (B); let X be a process defined on a filtered probability space $(\Omega ,\mathbb {F},\mathbb {P})$ of the form $X=\varphi +B^H$, where $B^H$ is an $\mathbb {F}$-fBm and $\varphi $ satisfies the property (2.4). For any $f\in L^q_t C^\delta _x$, $\delta>0$, let $w_f:=w_{f,\alpha ,q}$; then for any $m\in [2,\infty )$, there exists a deterministic constant $K=K(m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$, such that
$$ \begin{align*} \bigg\| \Big\| \int_s^t f_r(X_r)\mathrm{d} r\Big\|_{L^m|\mathcal{F}_s} \bigg\|_{L^\infty} \leq K w_f(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}. \end{align*} $$
As a consequence, for any $\varepsilon>0$, there exists a constant $K=K(\varepsilon ,m,d,\alpha ,q,H,\| b\|_{L^q_t C^\alpha _x})$ such that
$$ \begin{align} \bigg\| \Big\| \int_0^\cdot f_r(X_r)\mathrm{d} r \Big\|_{C^{\alpha H + 1/q'-\varepsilon}_t} \bigg\|_{L^m} \leq K \| f\|_{L^q_t C^\alpha_x}. \end{align} $$
By linearity and density, this allows one to extend continuously, in a unique way, the map $f\mapsto \int _0^\cdot f_r(X_r)\mathrm {d} r$ from $L^q_t \overline {C^\alpha _x}$ to $L^m_\omega C^0_t$.
Proof. We only sketch the proof since it is very similar to others already presented (cf. Lemma 3.1). By Lemma 2.4 and the stochastic sewing (again in the version of [Reference Friz, Hocquet and Lê44, Theorem 2.7]), setting $A_{s,t}:=\mathbb {E}_s \int _s^t f_r(\varphi _s+B^H_r)\mathrm {d} r$ and denoting $\beta =1/q'+\alpha H$, standard computations imply
$$ \begin{align*} \| A_{s,t}\|_{L^\infty} & \lesssim |t-s|^\beta w_f(s,t)^{1/q},\\ \big\| \|\mathbb{E}_s\delta A_{s,u,t}\|_{L^m| \mathcal{F}_s} \big\|_{L^\infty} & \lesssim |t-s|^{\beta-H} w_f(s,t)^{1/q} \big\| \| \varphi_{s,u}\|_{L^m|\mathcal{F}_s}\big\|_{L^\infty}\\ & \lesssim |t-s|^{2\beta-H} w_f(s,t)^{1/q} w_b(s,t)^{1/q}. \end{align*} $$
Under condition (B), one can check that the hypotheses of [Reference Friz, Hocquet and Lê44, Theorem 2.7] are satisfied, which easily yields all the desired estimates.
Let us also recall the definition of $\mathbb {F}$-fBm and the associated Volterra kernel representation (1.22) from Section 1.4. With these preparations, we can now present the following:
Proof of Theorem 8.2.
As before, we can assume $x_0=0$ without loss of generality. Let $b\in L^q_t C^\alpha _x$ with $(q,\alpha )$ satisfying (B) be given. Since (B) is a strict inequality, we can assume without loss of generality that $q<\infty $, $b\in L^q_t\overline {C^\alpha _x}$, and in particular, there exists a sequence $\{b^n\}_n\subset L^q_t C^1_x$ such that $b^n\to b$ in $L^q_t C^\alpha _x$ and $\int _s^t \| b^n_r\|_{C^\alpha _x}^q \mathrm {d} r \leq \int _s^t \| b_r\|_{C^\alpha _x}^q \mathrm {d} r$ (this can be accomplished by taking $b^n_r=\rho _{1/n}\ast b_r$ for some standard mollifiers $\{\rho _\delta \}_{\delta>0}$, up to replacing $\alpha $ with $\alpha -\varepsilon $).
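The mollification step can be illustrated numerically. The sketch below is not part of the proof: it assumes a one-dimensional, 1-periodic, discretised drift (the grid size, the test function `b` and the helper names `holder_seminorm` and `mollify` are all hypothetical choices made for illustration). It shows why $\| \rho _\delta \ast b_r\|_{C^\alpha _x}\leq \| b_r\|_{C^\alpha _x}$: convolution with a nonnegative kernel of unit mass is an average of translates, so it cannot increase the Hölder seminorm.

```python
import numpy as np

def holder_seminorm(f, alpha):
    """Discrete alpha-Hoelder seminorm of samples f on the periodic grid {k/n}."""
    n = len(f)
    return max(
        np.max(np.abs(np.roll(f, -k) - f)) / (k / n) ** alpha
        for k in range(1, n // 2 + 1)
    )

def mollify(f, delta):
    """Circular convolution of f with a bump kernel supported in (-delta, delta)."""
    n = len(f)
    x = np.arange(n) / n
    u = np.minimum(x, 1.0 - x) / delta       # rescaled periodic distance to 0
    kernel = np.zeros(n)
    inside = u < 1.0
    kernel[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2))
    kernel /= kernel.sum()                    # nonnegative weights summing to 1
    # (kernel * f)(x_i) = sum_j kernel[j] * f(x_{i-j}): an average of translates of f
    return sum(kernel[j] * np.roll(f, j) for j in np.flatnonzero(kernel))

n, alpha = 256, 0.5
x = np.arange(n) / n
b = np.abs(np.sin(np.pi * x)) ** alpha        # 1-periodic, alpha-Hoelder, not C^1
for delta in (1 / 4, 1 / 16, 1 / 64):
    b_n = mollify(b, delta)
    # seminorm of the mollified drift never exceeds that of b (up to rounding)
    print(delta, holder_seminorm(b_n, alpha), holder_seminorm(b, alpha))
```

The same averaging argument gives the bound for the supremum norm, and integrating the $q$-th power in time yields the inequality between the integrals used above.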
To each such $b^n$, we can associate a solution $X^n=\varphi ^n + B^H$, where by Lemma 2.4, $\varphi ^n$ satisfy the bound (2.4) for $w=w_{\alpha ,b,q}$; this implies in particular that $\| \varphi ^n_{s,t}\|_m \lesssim |t-s|^{\alpha H + 1/q'}$ uniformly in n, which by Kolmogorov’s theorem readily implies the tightness of the family $\{\varphi ^n\}_n$. As a consequence, the family $\{(\varphi ^n, B^H, W)\}_n$ is tight in $C_t\times C_t\times C_t$.
By Prokhorov’s and Skorokhod’s theorems, we can construct another probability space $(\tilde {\Omega },\tilde {\mathcal {F}},\tilde {\mathbb {P}})$ on which there exists a sequence $\{(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n)\}_n$ such that $(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n)$ is distributed as $(\varphi ^n, B^H, W)$ for each n and $(\tilde {\varphi }^n,\tilde {B}^{H,n}, \tilde {W}^n) \to (\tilde {\varphi },\tilde {B}^{H}, \tilde {W})$ $\tilde {\mathbb {P}}$-a.s. in $C_t\times C_t\times C_t$. We claim that $\tilde {X}=\tilde {\varphi }+\tilde {B}^H$ is a weak solution to (8.1), in the sense of Definition 8.1. For notational simplicity, we drop the tildes for the rest of the proof.
First of all, we claim that $B^H$ is still distributed as an fBm of parameter H, W as a standard Bm and that the relation $B^H_t = \int _0^t K_H(t,s) \mathrm {d} W_s$ still holds. The first two statements are an immediate consequence of passing to the limit. For the last one, we can use the fact that for each n, the same relation holds between $B^{H,n}$ and $W^n$, the fact that $K_H(t,\cdot )$ is square integrable and standard results on convergence of stochastic integrals (e.g., [Reference Debussche, Glatt-Holtz and Temam33, Lemma 2.1]) to conclude that for any fixed t, (1.22) holds $\mathbb {P}$-a.s. The upgrade to a $\mathbb {P}$-a.s. statement valid for all $t\in [0,1]$ follows from combining this fact with the uniform convergence of $B^{H,n}$ to $B^H$.
Next, since $X^n=\varphi ^n+B^{H,n}$ is still a solution to the SDE (8.1) with regular drift $b^n$, $\varphi ^n$ is adapted to $\mathcal {F}^n_t:=\sigma \{ B^{H,n}_s:s\leq t\}=\sigma \{W^n_s: s\leq t\}$; so for any $s<t$, any $t_1,\ldots , t_n \leq s$ and any pair of continuous bounded functions $F,G$, it holds
$$ \begin{align*} \mathbb{E}\big[F(W^n_{s,t})G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big] = \mathbb{E}\big[F(W^n_{s,t})\big]\,\mathbb{E}\big[G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big]. \end{align*} $$
Passing to the limit as $n\to \infty $, the same relation holds for W and $\varphi $ in place of $W^n$ and $\varphi ^n$, which shows that W is an $\mathbb {F}$-Bm for $\mathcal {F}_t:=\sigma \{(W_s,\varphi _s):s\leq t\}$; in particular, $B^H$ is an $\mathbb {F}$-fBm. Similarly, since $\varphi ^n$ uniformly satisfy the bound (2.4) w.r.t. $\mathcal {F}^n_t$, it holds
$$ \begin{align*} \mathbb{E}\big[|\varphi^n_{s,t}|^m \, &G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big] \\ &\lesssim \big(w(s,t)^{1/q} |t-s|^{\alpha H + 1/q'}\big)^m \mathbb{E}\big[G(W^n_{t_1}, \varphi^n_{t_1},\ldots,W^n_{t_n},\varphi^n_{t_n})\big]. \end{align*} $$
Passing to the limit as $n\to \infty $, we conclude that $\varphi $ satisfies (2.4) w.r.t. the filtration $\mathcal {F}_t$.
Finally, it remains to show that X satisfies the relation $X_t= V_t + B^H_t$ for V satisfying the requirements of Definition 8.1. First, since $B^H$ is an $\mathbb {F}$-fBm and $\varphi $ satisfies (2.4), Lemma 8.4 applies, so that the process $V_t:=\int _0^t b_r(X_r)\mathrm {d} r$ is well-defined; by this, we mean that the map $f\mapsto \int _0^\cdot f_r(X_r)\mathrm {d} r$ admits a unique extension, and V is the limit in $L^m_\omega C^0_t$ of the processes $\int _0^\cdot b^n_r(X_r)\mathrm {d} r$, for any sequence of smooth $b^n\to b$ in $L^q_t \overline {C^\alpha _x}$. By linearity, we have
$$ \begin{align} \mathbb{E} \bigg[ \Big\| \int_0^\cdot f_r(X_r)\mathrm{d} r - V_\cdot \Big\|_{C^{\alpha H + 1/q'-\varepsilon}_t}^m \bigg]^{1/m} \lesssim \| f-b\|_{L^q_t C^\alpha_x} \end{align} $$
for any regular f; a similar estimate holds for any $X^n$, with b replaced by $b^n$, with the hidden constants being uniform in n. In order to conclude, again thanks to Lemma 8.4, it suffices to show that $\varphi ^n\to V$; for any f as above, it holds
$$ \begin{align*} \mathbb{E}\big[\| \varphi^n-V\|_{C^0_t}\big] & \leq \mathbb{E} \bigg[\Big\| \int_0^\cdot [b^n-f]_r(X^n_r) \mathrm{d} r \Big\|_{C^0_t}\bigg] + \mathbb{E}\bigg[\Big\| \int_0^\cdot [f_r(X^n_r)-f_r(X_r)] \mathrm{d} r \Big\|_{C^0_t}\bigg] \\ &\qquad+ \mathbb{E}\bigg[\Big\| \int_0^\cdot f_r(X_r) \mathrm{d} r - V_\cdot \Big\|_{C^0_t}\bigg]\\ & \lesssim \|b^n-f\|_{L^q_t C^\alpha_x} + \mathbb{E} \bigg[\Big\| \int_0^\cdot [f_r(X^n_r)-f_r(X_r)] \mathrm{d} r \Big\|_{C^0_t}\bigg] + \|b-f\|_{L^q_t C^\alpha_x}, \end{align*} $$
where we applied several times estimate (8.3). Since f is regular, $b^n\to b$ and $X^n\to X$, passing to the limit, we get
$$ \begin{align*} \limsup_{n\to\infty} \mathbb{E}\bigg[\Big\| \int_0^\cdot b^n_r(X^n_r) \mathrm{d} r - V_\cdot \Big\|_{C^0_t}\bigg] \lesssim 2 \|b-f\|_{L^q_t C^\alpha_x}; \end{align*} $$
by the arbitrariness of f, we can conclude that $\varphi ^n \to V = \varphi $ and so that X is a weak solution.
Remark 8.5. Under Assumption (A), the unique strong solution X to the SDE constructed in Section 5 satisfies Definition 8.1, as readily seen by applying Lemma 3.1 with $h^n=b^n-b$. In most situations, pathwise solutions X to (8.1) in the nonlinear Young sense (cf. Definition 5.3) which are $\mathbb {F}_t$-adapted are also weak solutions in the sense of Definition 8.1. Indeed, in order to construct such X, usually one must have already verified that $T^{B^H}$ extends to a bounded operator from $L^q_t C^\alpha _x$ to $L^m_\omega C^{p-{\mathrm {var}}}_t C^{\eta ,\mathrm {loc}}_x$ (similarly to Corollary 5.1) and that $X=\varphi +B^H$ with $\varphi \in C^{\zeta -{\mathrm {var}}}_t$ $\mathbb {P}$-a.s., for suitable parameters $(p,\eta ,\zeta )$ satisfying $1/p +\eta /\zeta>1$. Linearity of $T^{B^H}$ and stability of nonlinear Young integration $(A,x)\mapsto \int _0^\cdot A(\mathrm {d} s,x_s)$ (cf. [Reference Galeati46, Theorem 2.7-4)]) then yield
$$ \begin{align*} \Big\| \int_0^\cdot b^n(s,X_s)\mathrm{d} s-\int_0^\cdot T^{B^H}b(\mathrm{d} s,\varphi_s) \Big\|_{C^0_t} \lesssim \big\| T^{B^H}(b^n-b)\big\|_{C^{p-{\mathrm{var}}}_t C^{\eta}_{\| \varphi\|_{\infty}}} (1+\| \varphi\|_{C^{\zeta-{\mathrm{var}}}_t}), \end{align*} $$
where the r.h.s. converges in probability to $0$ due to the aforementioned mapping properties of $T^{B^H}$ and the assumption $b^n\to b$ in $L^q_t C^\alpha _x$.
The converse implication – namely, whether the weak solution constructed in Theorem 8.2 is also a pathwise solution in the nonlinear Young sense – might only be true for a more restricted range of parameters. Let us only sketch the power counting, omitting the arbitrarily small exponents everywhere. The averaged field $T^{B^H}b$ can be constructed as in Corollary 5.1, as an element of $C^{2-{\mathrm {var}}}_t C^{\alpha +1/(2H),\mathrm {loc}}_x$. Furthermore, we know from Lemma 2.4 that $\varphi \in C^{r-{\mathrm {var}}}_t$ with $1/r=1+\alpha H$. Therefore, if
$$ \begin{align} \frac{1}{2}+\left(\alpha + \frac{1}{2H} \right)(\alpha H + 1)>1, \end{align} $$
then the nonlinear Young integral $\int _0^\cdot (T^{B^H}b)_{\mathrm {d} t}(\varphi _t)$ is well-defined and agrees with V. Note that the regime (8.4) is nontrivial in the sense that it allows for drifts for which strong uniqueness is not known, since the left-hand side is strictly greater than $1$ for $\alpha =1-1/(2H)$. We also remark that (8.4) is sufficient, but not necessary, to define $\int _0^\cdot (T^{B^H}b)_{\mathrm {d} t}(\varphi _t)$, since for particular choices of b, the averaged field $T^{B^H}b$ may enjoy better regularity than $C^{2-{\mathrm {var}}}_t C^{\alpha +1/(2H),\mathrm {loc}}_x$; see, for example, [Reference Anzeletti, Richard and Tanré2] for such situations.
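As a quick check of the power counting (a sketch only, with all arbitrarily small exponents omitted as in the text), one can evaluate the left-hand side of (8.4) at the endpoint $\alpha =1-1/(2H)$:

```latex
\begin{align*}
\alpha + \frac{1}{2H} = 1, \qquad \alpha H + 1 = H + \frac{1}{2}
\quad\Longrightarrow\quad
\frac{1}{2} + \Big(\alpha + \frac{1}{2H}\Big)(\alpha H + 1)
  = \frac{1}{2} + H + \frac{1}{2} = 1 + H > 1 .
\end{align*}
```

Since the left-hand side of (8.4) depends continuously on $\alpha $, the strict inequality persists for values of $\alpha $ slightly below this endpoint.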
For a deeper discussion about equivalence of different solution concepts for distributional drifts, including the nonlinear Young one, Definition 8.1 and others, we refer to [Reference Anzeletti, Richard and Tanré2, Theorem 2.15] and [Reference Butkovsky, Lê and Mytnik16, Theorem 2.11].
9 $\rho $-irregularity
The goal of this section is to derive some pathwise properties for solutions of (1.6), without appealing to Girsanov transform. Indeed, in the time-homogeneous setting, Girsanov is unavailable for $H>1$ (Footnote 8), while in the time-dependent case, it does not apply for any value of $H>0$ (since we can allow drifts which are only $L^q$ in time, for values of q arbitrarily close to $1$). For more details, see Appendix C.
As a meaningful representative of a larger class of pathwise properties, we will focus on the notion of $\rho $-irregularity, first introduced in [Reference Catellier and Gubinelli20] in the context of regularisation by noise for ODEs; it has later found several applications in regularisation for PDEs (see [Reference Chouk and Gubinelli28, Reference Chouk and Gess29, Reference Chouk, Gubinelli, Li, Li and Oh30, Reference Galeati and Gubinelli48]), and more recently in the inviscid mixing properties of shear flows [Reference Galeati and Gubinelli50]. Let us also mention the recent work [Reference Romito and Tolomeo89] for an alternative notion of irregularity, partially related to this one.
Definition 9.1. Let $\gamma \in (0,1)$, $\rho>0$. We say that a function $h\in C([0,1],\mathbb {R}^d)$ is $(\gamma ,\rho )$-irregular if there exists a constant N such that
$$ \begin{align*} \Big|\int_s^te^{i\xi\cdot h_r}\mathrm{d} r\Big|\leq N\,|\xi|^{-\rho}|t-s|^\gamma \quad \forall \xi\in\mathbb{R}^d,\quad 0\leq s\leq t\leq 1; \end{align*} $$
we denote by $\| \Phi ^h\|_{\mathcal {W}^{\gamma ,\rho }}$ the optimal constant. We say that h is $\rho $-irregular for short if there exists $\gamma>1/2$ such that it is $(\gamma ,\rho )$-irregular.
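To get a feeling for the quantity in Definition 9.1, one can compute a discrete proxy for the optimal constant on a sampled path. The sketch below is purely illustrative and rests on assumptions not made in the text: the path is sampled on a uniform grid of $[0,1]$, time integrals are replaced by left Riemann sums, and only finitely many frequencies are tested, so the function `irregularity_constant` (a hypothetical name) returns a lower bound for a discretised version of $\| \Phi ^h\|_{\mathcal {W}^{\gamma ,\rho }}$ and cannot certify irregularity.

```python
import numpy as np

def irregularity_constant(h, xis, gamma, rho):
    """Largest value of |xi|^rho |int_s^t exp(i xi.h_r) dr| / |t-s|^gamma over
    grid points s < t and the given nonzero frequencies xis (discrete proxy)."""
    h = np.asarray(h, dtype=float).reshape(len(h), -1)        # shape (n, d)
    xis = np.asarray(xis, dtype=float).reshape(len(xis), -1)  # shape (k, d)
    n = h.shape[0]
    dt = 1.0 / n
    i, j = np.triu_indices(n + 1, k=1)   # all grid pairs s = i*dt < t = j*dt
    lengths = (j - i) * dt
    best = 0.0
    for xi in xis:
        # cum[k] approximates int_0^{k dt} exp(i xi . h_r) dr by a left Riemann sum
        cum = np.concatenate(([0.0], np.cumsum(np.exp(1j * (h @ xi))) * dt))
        ratio = np.abs(cum[j] - cum[i]) / lengths ** gamma
        best = max(best, np.linalg.norm(xi) ** rho * ratio.max())
    return best

# Illustration with a simulated standard Brownian path (H = 1/2, d = 1).
rng = np.random.default_rng(0)
n = 400
bm = np.concatenate(([0.0], np.cumsum(rng.normal(scale=np.sqrt(1 / n), size=n - 1))))
xis = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
print(irregularity_constant(bm, xis, gamma=0.6, rho=0.8))
```

For a constant path the integral equals $t-s$ for every $\xi $, so the proxy grows like $|\xi |^\rho $ as larger frequencies are included; this reflects the fact that such a path is not $\rho $-irregular for any $\rho>0$.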
It was shown in [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli48] that for any $H\in (0,\infty )\setminus \mathbb {N}$, $B^H$ is $\rho $-irregular for any $\rho <1/(2H)$; we establish the same for a class of perturbations of $B^H$ satisfying the following assumption.
Assumption 9.2. Let $\varphi :[0,1]\to \mathbb {R}^d$ be a continuous adapted process which admits moments of any order; moreover, there exist $\beta> 0$ and a control w such that, for any $m \in [1,\infty )$, there exists a constant $C_m$ such that
$$ \begin{align} \big\|\| \varphi_t -\mathbb{E}_s \varphi_t\|_{L^1|\mathcal{F}_s}\big\|_{L^m} \leq C_m w(s,t)^{1/2}|t-s|^{\beta}\quad \forall\, 0\leq s\leq t\leq 1. \end{align} $$
Theorem 9.3. Let $H\in (0, +\infty ) \setminus \mathbb {N}$ and let $\varphi $ satisfy Assumption 9.2 with $\beta =H$; then $X:=\varphi +B^H$ is $\mathbb {P}$-almost surely $\rho $-irregular for any $\rho < 1 /(2H)$. More precisely, for any such $\rho $ and any $m\in [1,\infty )$, there exists $\gamma =\gamma (m,\rho )>1/2$ such that
$$ \begin{align} \mathbb{E}[ \| \Phi^X\|_{\mathcal{W}^{\gamma,\rho}}^m ]<\infty. \end{align} $$
Remark 9.4. Let us make some observations on Assumption 9.2 and Theorem 9.3:
• Lemmas 2.1 and 2.4 provide sufficient conditions on q and $\alpha $ that guarantee that solutions of (1.6) with $b\in L^q_t C^\alpha _x$ satisfy Assumption 9.2. Note that in some cases, we can therefore obtain $\rho $-irregularity of solutions but not uniqueness.
• Our usual toolbox could in principle also be used to study Gaussian moments of $\Phi ^X$ (under a somewhat stronger condition than (9.1)). For simplicity, we do not pursue this in detail.
• In terms of exponents, the condition (9.1) appears to require the same order of ‘regularity’, namely $1/2+H$, as Girsanov transform (see Appendix C). However, (9.1) is a significantly weaker condition: instead of controlling the usual increments $\varphi _t-\varphi _s$, one only needs to control the stochastic increments $\varphi _t-\mathbb {E}_s\varphi _t$, which can be much smaller.
• In [Reference Catellier and Gubinelli20, Reference Galeati and Gubinelli48], the additive perturbation problem is studied in detail; the authors try to establish, in a deterministic framework, whether a path $h+\varphi $ can be shown to be $\rho $-irregular, given the knowledge that h is so and $\varphi $ enjoys higher Hölder regularity. Such results usually come with a loss of regularity in the exponent $\rho $ of at least $1/2$ (cf. [Reference Catellier and Gubinelli20, Theorem 1.6] and [Reference Galeati and Gubinelli48, Lemma 78]); the use of more probabilistic arguments and stochastic sewing techniques from Theorem 9.3 instead allows one to cover the whole range $\rho <1/(2H)$ without difficulties.
Proof. In order to conclude, it suffices to prove the following claim: for any $\rho <1/(2H)$, we can find $\gamma>1/2$ such that for any $m \in [1, \infty )$, it holds
$$ \begin{align} \Big\| \int_s^t e^{i \xi \cdot X_r} \mathrm{d} r \Big\|_{L^m} \lesssim_m |t-s|^{\gamma} |\xi|^{-\rho} \quad \forall \, \xi \in \mathbb{R}^d, 0 \leqslant s \leqslant t \leqslant 1. \end{align} $$
It is clear that in (9.3), we can restrict to $| \xi | \geqslant 1$ (or $| \xi | \geqslant R$) whenever needed, since for small $\xi $, the estimate is trivial. Once (9.3) is obtained, we can deduce that for any $\tilde {\rho }<\rho -d/m$, it holds
$$ \begin{align} \mathbb{E} \left[ \int_{\mathbb{R}^d} | \xi |^{\tilde \rho} \bigg| \int_s^t e^{i\xi\cdot X_r} \mathrm{d} r \bigg|^m \mathrm{d} \xi \right] = \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, m}}^m\big ] \lesssim |t-s|^{\gamma m}; \end{align} $$
here, we follow the notation from [Reference Galeati and Gubinelli48], so that $\mu ^X_{s,t}$ denotes the occupation measure of X on $[s,t]$ and $\mathcal {F} L^{\rho , m}$ denote the Fourier–Lebesgue spaces. Applying Lemma 57 from [Reference Galeati and Gubinelli48] to (9.4), together with Assumption 9.2 and the Cauchy–Schwarz inequality, yields
$$ \begin{align*} \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, \infty}}^m\big] & \lesssim \mathbb{E} \big[\| X \|_{C_t}^d \| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde \rho,m}}^m\big]\\ & \lesssim \mathbb{E} \big[\| X \|_{C_t}^{2 d} \big]^{1 / 2}\, \mathbb{E} \big[\| \mu^X_{s, t} \|_{\mathcal{F} L^{\tilde\rho, m}}^{2 m}\big]^{1/2} \lesssim |t-s|^{\gamma m}. \end{align*} $$
By the arbitrariness of m and Kolmogorov’s continuity criterion, one then deduces that $\mu ^X\in C^{\tilde \gamma }_t \mathcal {F}L^{\tilde \rho ,\infty }_x$ for any $\tilde {\gamma }<\gamma $ and $\tilde {\rho }<\rho $; but this is equivalent to saying that X is $(\tilde {\gamma },\tilde {\rho })$-irregular; cf. [Reference Galeati and Gubinelli48, Section 3.2]. The arbitrariness of $\rho <1/(2H)$ readily implies the conclusion as well as the moment estimate (9.2).
In order to prove the claim (9.3), we will apply Lemma 2.5, with $(S,T)=(0,1)$, and $n=m$. Fix $\xi \in \mathbb {R}^d$; arguing as in Lemma 2.6, it is easy to check that $\int _0^\cdot e^{i\xi \cdot X_r} \mathrm {d} r$ is the stochastic sewing of
$$\begin{align*}A_{s, t} := \mathbb{E}_{s-(t-s)} \int_s^t e^{i \xi \cdot (\mathbb{E}_{s-(t-s)} \varphi_r + B^H_r)} \mathrm{d} r. \end{align*}$$
Note that for any $r\in (s,t)$, one has
$$ \begin{align*} \big|\mathbb{E}_{s-(t-s)}e^{i\xi\cdot B^H_r}\big|=\big|\mathbb{E}_{s-(t-s)}e^{i\xi\cdot (B^H_r-\mathbb{E}_{s-(t-s)}B^H_r)}\big|=e^{-c|\xi|^2|r-s+(t-s)|^{2H}}, \end{align*} $$
and therefore, we have
$$ \begin{align} | A_{s,t} | \lesssim e^{-c |\xi|^2 |t-s|^{2 H}} |t-s| \lesssim |\xi|^{-\rho} |t-s|^{1-\rho H}, \end{align} $$
where we used the basic inequality $e^{-c|y|^2}\lesssim |y|^{-\rho }$. By the assumption on $\rho $, $\varepsilon _1:=1/2-\rho H>0$, and therefore, the condition (2.7) is satisfied with $w_1(s,t)=N|\xi |^{-2\rho }(t-s)$.
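For the reader's convenience, the elementary steps behind the bound on $A_{s,t}$ can be spelled out as follows. This is only a sketch: we use that $r-s+(t-s)\geq t-s$ for $r\in (s,t)$, apply the basic inequality with $y=|\xi |\,|t-s|^H$, and, as an assumption about the form of (2.7), read it as a bound of the type $\| A_{s,t}\|_{L^m}\leq w_1(s,t)^{1/2}|t-s|^{\varepsilon _1}$.

```latex
\begin{align*}
|A_{s,t}|
 &\leq \int_s^t e^{-c|\xi|^2 |r-s+(t-s)|^{2H}} \,\mathrm{d} r
 \leq e^{-c|\xi|^2|t-s|^{2H}}\,|t-s| \\
 &\lesssim \big(|\xi|\,|t-s|^{H}\big)^{-\rho}\,|t-s|
 = |\xi|^{-\rho}\,|t-s|^{1-\rho H},\\
w_1(s,t)^{1/2}\,|t-s|^{\varepsilon_1}
 &= N^{1/2}|\xi|^{-\rho}\,|t-s|^{1/2}\,|t-s|^{1/2-\rho H}
 = N^{1/2}|\xi|^{-\rho}\,|t-s|^{1-\rho H}.
\end{align*}
```

The two right-hand sides agree up to the constant $N^{1/2}$, which is how the choice of $w_1$ and $\varepsilon _1$ matches the bound on $A_{s,t}$.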
As for the second condition of Lemma 2.5, we have for $(s,u,t)\in \overline {[0,1]}_\leq ^3$ that
$$ \begin{align*} \| \mathbb{E}_{s_-} \delta A_{s, u, t} \|_{L^m} & \leq \int_u^t \big\| \mathbb{E}_{u-(t-u)}e^{i \xi \cdot B^H_r} (e^{i \xi \cdot \mathbb{E}_{s-(t-s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{u-(t-u)} \varphi_r}) \big\|_{L^m}\mathrm{d} r\\ &\quad+\int_s^u \big\| \mathbb{E}_{s-(t-s)}e^{i \xi \cdot B^H_r} (e^{i \xi \cdot \mathbb{E}_{s-(t-s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{s-(u-s)} \varphi_r}) \big\|_{L^m}\mathrm{d} r=:I+J. \end{align*} $$
As usual, I and J are treated identically, so we only consider the former. We write
 $$ \begin{align*} I & = \int_u^t e^{-c | \xi |^2 | r - u + t - u |^{2 H}} \big\| e^{i \xi \cdot \mathbb{E}_{s - (t - s)} \varphi_r} - e^{i \xi \cdot \mathbb{E}_{u-(t-u)} \varphi_r} \big\|_{L^m}\mathrm{d} r\\ & \leq e^{- \tilde{c} | \xi |^2 | t - s |^{2 H}} | \xi | \int_u^t \|\mathbb{E}_{s - (t - s)} \varphi_r -\mathbb{E}_{u - (t - u)} \varphi_r \|_{L^m}\mathrm{d} r\\ & \lesssim e^{- \tilde{c} |\xi|^2 |t-s|^{2H}} |\xi|\, w(s_-,t)^{1/2} |t-s|^{1+H}, \end{align*} $$
where in the second line we used $(s,u,t)\in \overline {[0,1]}_{\leq }^3$ and in the last one we used Assumption 9.2. Applying again the basic inequality $e^{- \tilde {c}|y|^{2}} \lesssim |y|^{-1-\rho }$, we obtain
$$ \begin{align*} \| \mathbb{E}_{s_-} \delta A_{s,u,t} \|_{L^m} \lesssim |\xi|^{-\rho}w(s_-,t)^{1/2}|t-s|^{1-H\rho}. \end{align*} $$
Therefore, condition (2.8) is satisfied with $\varepsilon _2=\varepsilon _1=1/2-\rho H$ and $w_2(s,t)=N|\xi |^{-\rho }w^{1/2}(s,t)(t-s)^{1/2}$, and by (2.12), we finally get
$$ \begin{align*} \Big\| \int_s^t e^{i \xi \cdot X_r} \mathrm{d} r \Big\|_{L^m} \lesssim|\xi|^{-\rho}|t-s|^{1/2+\varepsilon_1}\big(1+w(s,t)\big), \end{align*} $$
yielding (9.3).
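For the reader's convenience, the exponent bookkeeping behind the two uses of the basic inequality can be sketched as follows. The auxiliary variable $y:=|\xi |\,|t-s|^{H}$ is introduced here only for illustration, and all constants are absorbed into $\lesssim $. For the first condition, since $|r-s+(t-s)|\geq |t-s|$ for $r\in (s,t)$,
$$ \begin{align*} e^{-c|\xi|^2|t-s|^{2H}}\,|t-s| = e^{-c y^2}\,|t-s| \lesssim y^{-\rho}\,|t-s| = |\xi|^{-\rho}\,|t-s|^{1-\rho H}, \end{align*} $$
while for the second one,
$$ \begin{align*} e^{-\tilde c y^2}\,|\xi|\,|t-s|^{1+H} \lesssim y^{-1-\rho}\,|\xi|\,|t-s|^{1+H} = |\xi|^{-1-\rho}\,|t-s|^{-H-\rho H}\,|\xi|\,|t-s|^{1+H} = |\xi|^{-\rho}\,|t-s|^{1-\rho H}. \end{align*} $$
In both cases the resulting power of $|t-s|$ is $1-\rho H=1/2+\varepsilon _1$, which is the exponent appearing in the final bound.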
10 Applications to transport and continuity equations
Having established well-posedness of the characteristic lines $\mathrm {d} X_t= b_t(X_t)\mathrm {d} t + \mathrm {d} B^H_t$, the next natural step is to investigate the associated stochastic transport equation
$$ \begin{align} \partial_t u + b\cdot\nabla u + \dot B^H\cdot \nabla u =0. \end{align} $$
Natural questions in PDE theory and regularisation by noise for (10.1) are its well-posedness (cf. the seminal work [Reference Flandoli, Gubinelli and Priola40]) and propagation of the regularity of initial data, first addressed in [Reference Fedrizzi and Flandoli38]. Both features may fail in the absence of noise; among the vast literature, let us mention the following: the work [Reference Modena and Székelyhidi77], where counterexamples to uniqueness are provided even for Sobolev differentiable drifts; [Reference Brué, Colombo and De Lellis10], where it is shown that uniqueness of the generalized Lagrangian flow (in the sense of DiPerna-Lions [Reference DiPerna and Lions36]) does not imply uniqueness of trajectorial solutions to the ODE; finally [Reference Brué and Nguyen11], providing sharp examples showing that DiPerna-Lions flows can at most propagate a ‘logarithmic derivative’ of regularity of the initial data $u_0$, but not better. As we will see in Theorem 10.4, the presence of $B^H$ prevents all such pathologies, yielding nontrivial regularisation by noise results even in situations where uniqueness of solutions is already known to hold.
Rather than working directly with equation (10.1), following [Reference Flandoli, Gubinelli and Priola40], it is useful to introduce the transformation $\tilde u_t(x)=u_t(x+B^H_t)$, $\tilde {b}_t(x)=b_t(x+B^H_t)$, which relates it to
$$ \begin{align} \partial_t \tilde u + \tilde{b}\cdot \nabla \tilde u=0. \end{align} $$
This transformation formally assumes $B^H$ to be differentiable, but the resulting equation (10.2) is then well-defined (at least for bounded b) for any continuous path $B^H$. More rigorously, we are implicitly assuming that the chain rule applies, which amounts to working with $B^H$ as a geometric rough path; see [Reference Catellier21] for the rigorous equivalence between (10.1)–(10.2) in this case. In the Brownian case, this means that the multiplicative noise must be interpreted in the Stratonovich sense, as in [Reference Flandoli, Gubinelli and Priola40]. However, the resulting PDE (10.2) is well-defined also for values $H\leq 1/4$, where the rough path formalism no longer applies, and indeed, it can be regarded as a PDE with random drift $\tilde {b}$, rather than a stochastic PDE.
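The formal computation behind this transformation can be sketched as follows, under the purely illustrative assumption that both $u$ and $B^H$ are differentiable in time, so that the classical chain rule applies:
$$ \begin{align*} \partial_t \tilde u_t(x) & = (\partial_t u_t)(x+B^H_t) + \dot B^H_t\cdot (\nabla u_t)(x+B^H_t)\\ & = -\,b_t(x+B^H_t)\cdot (\nabla u_t)(x+B^H_t) = -\,\tilde b_t(x)\cdot \nabla \tilde u_t(x), \end{align*} $$
where the second equality uses (10.1) evaluated at the point $x+B^H_t$ and the last one uses $\nabla \tilde u_t(x)=(\nabla u_t)(x+B^H_t)$. The two terms containing $\dot B^H$ cancel exactly, which is why the noise enters (10.2) only through the shifted drift $\tilde {b}$.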
A nice feature of the regular regime $H>1$, included in our setting, is that here, $B^H$ is $\mathbb {P}$-a.s. differentiable and so (10.1) is perfectly well-defined and the above transformation is completely rigorous (as soon as $(u_t)_{t\in [0,1]}$ is bounded in some function space) and does not involve any ‘choice’ of the rough lift. The above considerations motivate the following definition; from now on, we will use both notations $\tilde {u}_t(x)$ and $\tilde {u}_t(x;\omega )$ to denote $u_t(\omega , x+B^H_t(\omega ))$, in order to stress the fixed realization $\omega \in \Omega $ whenever needed, and similarly for $\tilde {b}_t(x)$ and $\tilde {b}_t(x;\omega )$.
Definition 10.1. For a fixed $\omega \in \Omega $, we say that v is a weak solution to the PDE (10.2) associated to $\tilde {b}_t(x;\omega )$ if $v\in L^1_t W^{1,1,\mathrm {loc}}_{x}$, $\tilde {b}\cdot \nabla v\in L^1_t L^{1,\mathrm {loc}}_x$ and for any smooth, compactly supported function $\varphi :[0,1]\times \mathbb {R}^d\to \mathbb {R}$ and any $t\in [0,1]$, it holds
$$ \begin{align} \langle\varphi_t,v_t\rangle-\langle \varphi_0,v_0 \rangle =\int_0^t [\langle \partial_t\varphi_s ,v_s\rangle - \langle \varphi_s, \tilde{b}_s(\cdot\,;\omega)\cdot\nabla v_s \rangle] \mathrm{d} s. \end{align} $$
We say that a stochastic process u is a pathwise solution to the stochastic transport equation (10.1) if for $\mathbb {P}$-a.e. $\omega \in \Omega $, the corresponding $\tilde {u}_t(x;\omega )$ is a weak solution to (10.2) associated to $\tilde {b}_t(x;\omega )$, in the above sense. Finally, a pathwise solution is said to be strong if it is adapted to the filtration generated by $B^H$.
Similarly to equations (10.1)–(10.2), we can relate the stochastic continuity equation
$$ \begin{align} \partial_t \mu + \nabla\cdot (b\, \mu) + \dot B^H\cdot \nabla \mu =0 \end{align} $$
to its random PDE counterpart
$$ \begin{align} \partial_t \tilde\mu + \nabla \cdot (\tilde b\, \tilde\mu)=0 \end{align} $$
by means of the transformation $\tilde {\mu }_t(x;\omega )=\mu _t(\omega ,x+B^H_t(\omega ))$. In the next definition, $\mathcal {M}_+=\mathcal {M}_+(\mathbb {R}^d)$ denotes the set of nonnegative finite Radon measures. For $\mu \in \mathcal {M}_+$, we write $\mu \in L^p_x$ to mean that $\mu $ admits an $L^p$-integrable density w.r.t. the Lebesgue measure, in which case, with a slight abuse, we will identify $\mu (\mathrm {d} x)=\mu (x) \mathrm {d} x$.
Definition 10.2. For a fixed $\omega \in \Omega $, we say that $\rho $ is a weak solution to the PDE (10.5) associated to $\tilde {b}_t(x;\omega )$ if $\rho _t\in \mathcal {M}_+$ for Lebesgue-a.e. t,
$$\begin{align*}\int_0^1\int_{\mathbb{R}^d} |\tilde{b}_t(x;\omega)| \rho_t(\mathrm{d} x)\,\mathrm{d} t<\infty,\end{align*}$$
and for any smooth, compactly supported $\varphi :[0,1]\times \mathbb {R}^d\to \mathbb {R}$ and any $t\in [0,1]$, it holds
$$\begin{align*}\langle\varphi_t,\rho_t\rangle-\langle \varphi_0,\rho_0 \rangle =\int_0^t \langle \partial_t\varphi_s + \tilde{b}_s(\cdot\,;\omega)\cdot\nabla \varphi_s ,\rho_s\rangle \mathrm{d} s. \end{align*}$$
We say that a stochastic process $\mu $ is a pathwise solution to the stochastic continuity equation (10.4) if for $\mathbb {P}$-a.e. $\omega \in \Omega $, the corresponding $\tilde {\mu }_t(x;\omega )$ is a weak solution to (10.5) associated to $\tilde {b}_t(x;\omega )$, in the above sense. Finally, a pathwise solution is said to be strong if it is adapted to the filtration generated by $B^H$.
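As a consistency check, the weak formulation of the continuity equation can be derived formally from (10.5). The sketch below assumes, for illustration only, that $\rho $ has a smooth density and that $\tilde {b}$ is smooth, so that integration by parts is justified; neither assumption is part of the definition. For a smooth, compactly supported test function $\varphi $,
$$ \begin{align*} \frac{\mathrm{d}}{\mathrm{d} s}\langle \varphi_s,\rho_s\rangle = \langle \partial_s\varphi_s,\rho_s\rangle + \langle \varphi_s,\partial_s\rho_s\rangle = \langle \partial_s\varphi_s,\rho_s\rangle - \langle \varphi_s,\nabla\cdot(\tilde b_s\,\rho_s)\rangle = \langle \partial_s\varphi_s + \tilde b_s\cdot\nabla\varphi_s,\rho_s\rangle. \end{align*} $$
Integrating over $s\in [0,t]$ gives an identity in which all derivatives fall on $\varphi $, which is why the definition makes sense for measure-valued $\rho $ as soon as $|\tilde {b}|$ is integrable against $\rho $.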
As is clear from Definitions 10.1–10.2, in order to treat equations (10.2)–(10.5) in an analytically weak sense, we need $\tilde {b}$ to enjoy some local integrability and thus to be a well-defined measurable function (up to equivalence class). Therefore, in the case of coefficients $b\in L^q_t C^\alpha _x$ with $\alpha <0$, throughout this section, we will additionally impose that
$$ \begin{align} b\in L^r_t L^r_x + L^r_t L^\infty_x \quad \text{for some } r>1; \end{align} $$
we denote by $r'$ the conjugate exponent (i.e., $1/r'+1/r=1$). In the case $\alpha>0$, we will use the convention $r'=1$; in this case, under (A), condition (10.6) is immediately satisfied for $r=q$. Let us mention that, in the distributional case $\alpha <0$, other approaches for giving meaning to (10.2)–(10.5) are possible (see Remark 10.9 below), so it is not obvious whether an assumption of the form (10.6) is needed; still, we will adopt it as it allows us to apply nice analytical tools, while already covering a sufficiently rich class of drifts.
Remark 10.3. Let us collect a few useful observations:
- i) By standard arguments, whenever a weak solution v to (10.2) exists (in the sense of Definition 10.1), then (up to redefining it on a Lebesgue negligible set of $t\in [0,1]$) $t\mapsto v_t$ is continuous w.r.t. suitable weak topologies; in particular, it always makes sense to talk about initial/terminal conditions for such equations. The same considerations apply for pathwise solutions, as well as solutions to the continuity equations (10.4)–(10.5); from now on, we will always work with these weakly continuous in time versions, without specifying it.
- ii) If $\rho $ is a weak solution to (10.5), then its mass $\rho _t(\mathbb {R}^d)$ is preserved by the dynamics. In particular, if $\rho \in L^q_t L^p_x$, then it actually belongs to $L^q_t L^{\tilde p}_x$ for all $\tilde {p}\in [1,p]$.
- iii) In Definition 10.1, we enforce identity (10.3) to hold for all $\varphi $ smooth and compactly supported, but by standard density arguments, it is clear that as soon as more information on v (resp. u) and b is available, then (10.3) can be extended to a larger class of $\varphi $, as long as all the terms appearing are well-defined. For instance, if $v\in L^\infty _t W^{1,p}_x$ and $b\in L^\infty _t L^\infty _x$, then it suffices to know that $\varphi , \partial _t \varphi \in L^1_t L^{p'}_x$, $p'$ being the conjugate of p.
- iv) Definitions 10.1–10.2 and the above observations extend easily to the case of backward equations on $[0,T]$ with terminal conditions $u_T$, $\mu _T$, rather than forward ones with initial $u_0$, $\mu _0$.
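The integrability claim in point ii) can be sketched via interpolation of Lebesgue norms; the parameter $\theta $ below is auxiliary notation introduced only for this illustration. Given $\tilde {p}\in [1,p]$, choose $\theta \in [0,1]$ with $1/\tilde {p}=\theta +(1-\theta )/p$. Since $\rho _t\geq 0$ and the mass is preserved, $\|\rho _t\|_{L^1_x}=\rho _0(\mathbb {R}^d)$, and therefore
$$ \begin{align*} \|\rho_t\|_{L^{\tilde p}_x} \leq \|\rho_t\|_{L^1_x}^{\theta}\, \|\rho_t\|_{L^p_x}^{1-\theta} = \rho_0(\mathbb{R}^d)^{\theta}\, \|\rho_t\|_{L^p_x}^{1-\theta}. \end{align*} $$
Taking the $L^q$-norm in $t\in [0,1]$ and using Jensen's inequality on the unit interval (as $1-\theta \leq 1$) then yields
$$ \begin{align*} \|\rho\|_{L^q_t L^{\tilde p}_x} \leq \rho_0(\mathbb{R}^d)^{\theta}\, \|\rho\|_{L^q_t L^p_x}^{1-\theta}. \end{align*} $$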
The next statement summarizes the main result of this section.
Theorem 10.4. Let b satisfy Assumption (A) and additionally (10.6) if $\alpha <0$. Then,
- i) For any $p\in [r',\infty )$ and $u_0\in W^{1,p}_x$, there exists a strong pathwise solution u to (10.1), which belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ for all $m\in [1,\infty )$. If, moreover, $p>r'$, then path-by-path uniqueness holds in the class $L^\infty _t W^{1,p}_x$, in the following sense: there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $ and all $v_0\in W^{1,p}_x$, there can exist at most one weak solution $v \in L^\infty _t W^{1,p}_x$ to the PDE (10.2) associated to $\tilde {b}_t(x;\omega )$ and with initial condition $v_0$.
- ii) For any $p\in [r',\infty )$ and any positive measure $\mu _0\in L^p_x$, there exists a strong pathwise solution $\mu $ to (10.4), which belongs to $L^m_\omega L^\infty _t L^p_x$ for all $m\in [1,\infty )$. Moreover, path-by-path uniqueness holds in the class $L^\infty _t L^p_x$, in the following sense: there exists an event $\tilde {\Omega }$ of full probability such that, for all $\omega \in \tilde \Omega $ and all $\mu _0\in L^p_x$, there can exist at most one weak solution $\rho \in L^\infty _t L^p_x$ to the PDE (10.5) associated to $\tilde {b}_t(x;\omega )$ and with initial condition $\mu _0$.
Theorem 10.4 will be proved by mostly analytical techniques, once they are combined with the information coming from the previous sections. We will first establish existence of pathwise solutions to equations (10.1)–(10.4) satisfying the desired a priori bounds; see Proposition 10.5.
Uniqueness will be established by two different methods. In the transport case, we will first establish a priori bounds for solutions to the dual equation (backward continuity equation) in Proposition 10.6 and then perform a duality argument (Lemma 10.7); see [Reference DiPerna and Lions36] and [Reference Beck, Flandoli, Gubinelli and Maurelli8] for significant precursors in this direction.
For the continuity equation, we will instead infer uniqueness from Ambrosio’s superposition principle (cf. Theorem 10.8) combined with our path-by-path uniqueness results (Theorems 4.4–5.6). To the best of our knowledge, it is the first time these two results are combined in this way to infer path-by-path uniqueness for (10.4); let us mention, however, that in [Reference Beck, Flandoli, Gubinelli and Maurelli8, Section 4], the opposite idea is developed, proving path-by-path uniqueness for the SDE starting from the corresponding results for (10.4).
Before giving the proofs, let us recall a few notations and basic facts. We will use $\Psi $ to denote the random flow of diffeomorphisms associated to the (random) ODE $\dot \varphi = \tilde {b}_t(\varphi )$, where we recall the fundamental relation $X_t=\varphi _t+B^H_t$ as well as (5.5). Similarly to Section 6, we will use the notations $J^x_{s\to t} := \nabla \Psi _{s\to t}(x)$, $K^x_{s\to t} := (J^x_{s\to t})^{-1} = \nabla \Psi _{s \leftarrow t}(\Psi _{s\to t}(x))$; we also set $j_{s\to t}(x):=\det J^x_{s\to t}$, and similarly for $j_{s\leftarrow t}(x)$. Recall that, in the case of regular b, we have the relations
$$ \begin{align} j_{s\to t}(x) = \exp\Big(\int_s^t \mathrm{div} b_r (\Phi_{s\to r}(x)) \mathrm{d} r\Big), \ \ j_{s\leftarrow t}(x) = \exp\Big(-\int_s^t \mathrm{div} b_r (\Phi_{r\leftarrow t}(x)) \mathrm{d} r\Big). \end{align} $$
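Relations of this type come from the classical Jacobi (Liouville) formula. The following is only a sketch for a generic smooth vector field $g$ with flow $\Psi $; the symbol $g$ is a stand-in introduced here for illustration. Differentiating the flow equation in $x$ gives the linear equation $\partial _t J^x_{s\to t}=\nabla g_t(\Psi _{s\to t}(x))\,J^x_{s\to t}$ with $J^x_{s\to s}=I$, and hence
$$ \begin{align*} \partial_t \det J^x_{s\to t} = \mathrm{tr}\big(\nabla g_t(\Psi_{s\to t}(x))\big)\, \det J^x_{s\to t} = \mathrm{div}\, g_t(\Psi_{s\to t}(x))\, \det J^x_{s\to t}, \qquad \det J^x_{s\to s}=1, \end{align*} $$
so that $\det J^x_{s\to t}=\exp \big (\int _s^t \mathrm {div}\, g_r(\Psi _{s\to r}(x))\,\mathrm {d} r\big )$. With $g=\tilde {b}$ one has $\mathrm {div}\,\tilde {b}_r(z)=\mathrm {div}\, b_r(z+B^H_r)$; rewriting the result in terms of $\Phi $ relies on relation (5.5), which is not reproduced in this section. The formula for the inverse flow follows from $j_{s\leftarrow t}(\Psi _{s\to t}(x))\,j_{s\to t}(x)=1$.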
Proposition 10.5. Let b satisfy Assumption (A), and additionally (10.6) if $\alpha <0$. Then,
- i) For any $p\in [r',\infty )$ and $u_0\in W^{1,p}_x$, there exists a strong pathwise solution u to (10.1), which belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ for all $m\in [1,\infty )$.
- ii) For any $p\in [r',\infty )$ and any positive measure $\mu _0$ such that $\mu _0\in L^p_x$, there exists a strong pathwise solution $\mu $ to (10.4), which belongs to $L^m_\omega L^\infty _t L^p_x$ for all $m\in [1,\infty )$.
Proof. Let us first assume b to be smooth and derive estimates which only depend on $\| b\|_{L^q_t C^\alpha _x}$. In this case, the unique solution to (10.2) is given by $\tilde u_t(x)= u_0(\Psi _{0\leftarrow t}(x))$. Let us give the bound on $\|\nabla \tilde u\|_{L^p}$, the one for $\| \tilde u\|_{L^p}$ being similar; also observe that these quantities coincide with the corresponding ones for u. It holds
$$ \begin{align*} \sup_{t\in [0,1]} \| \nabla \tilde u_t\|_{L^p}^p & = \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla \tilde u_t(x)|^p \mathrm{d} x\\ & \leq \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla u_0(\Psi_{0\leftarrow t}(x))|^p |\nabla \Psi_{0\leftarrow t}(x)|^p \mathrm{d} x\\& = \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\nabla u_0(y)|^p |\nabla \Psi_{0\leftarrow t}(\Psi_{0\to t}(y))|^p j_{0\to t}(y) \mathrm{d} y\\ & \leq \int_{\mathbb{R}^d} |\nabla u_0(y)|^p \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \, \sup_{t\in [0,1]} j_{0\to t}(y)\, \mathrm{d} y. \end{align*} $$
Taking the $L^m_\omega $-norm on both sides, we arrive at
$$ \begin{align*} \Big \| \sup_{t\in [0,1]} \| \nabla \tilde u_t\|_{L^p}^p \Big\|_{L^m} & \leq \int_{\mathbb{R}^d} |\nabla u_0(y)|^p \Big\| \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \, \sup_{t\in [0,1]} j_{0\to t}(y) \Big\|_{L^m} \, \mathrm{d} y\\ & \leq \| \nabla u_0\|_{L^p}^p\, \sup_{y\in \mathbb{R}^d} \Big\| \sup_{t\in [0,1]} |K_{0\to t}(y)|^p \Big\|_{L^{2m}}^{1/2} \, \Big\| \sup_{t\in [0,1]} j_{0\to t}(y) \Big\|_{L^{2m}}^{1/2}. \end{align*} $$
The finiteness of arbitrary moments of $\sup _{t\in [0,1]} j_{0\to t}(y)$ comes from identity (10.7), combined with Lemma 3.1 applied to $h=\mathrm {div} b$ and $\varphi _r=\Phi _{0\to r}(y)-B^H_r$. This estimate is clearly uniform in $y\in \mathbb {R}^d$. The similar bounds for K follow as in Section 6, using the fact that K solves the linear Young equation (6.11). Up to relabelling $m=m' p$, we have thus shown that
$$ \begin{align} \| \nabla u\|_{L^m_\omega L^\infty_t L^p_x} \lesssim \| \nabla u_0\|_{L^p_x}. \end{align} $$
We now pass to the case of $\mu $; for regular b, solutions are given by the identity
$$ \begin{align*} \tilde \mu_t(x) = \mu_0(\Psi_{0\leftarrow t}(x)) \exp\Big(-\int_0^t \mathrm{div} b_r (\Phi_{r \leftarrow t}(x)) \mathrm{d} r \Big). \end{align*} $$
Arguing similarly to above, it holds
$$ \begin{align*} \Big\| \sup_{t\in [0,1]} \| \tilde \mu_t\|_{L^p_x}^p \Big\|_{L^m} & = \Big\| \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\mu_0(\Psi_{0\leftarrow t}(x))|^p \exp\Big(-p\int_0^t \mathrm{div} b_r (\Phi_{r \leftarrow t}(x)) \mathrm{d} r \Big) \mathrm{d} x \Big\|_{L^m} \\ & = \Big\| \sup_{t\in [0,1]} \int_{\mathbb{R}^d} |\mu_0(y)|^p \exp\Big((1-p) \int_0^t \mathrm{div} b_r (\Phi_{0\to r}(y)) \mathrm{d} r \Big) \mathrm{d} y \Big\|_{L^m}\\ & \leq \sup_{y\in \mathbb{R}^d} \Big\| \sup_{t\in [0,1]} \exp\Big((1-p) \int_0^t \mathrm{div} b_r (\Phi_{0\to r}(y))\mathrm{d} r \Big) \Big\|_{L^m} \int_{\mathbb{R}^d} |\mu_0(y)|^p \mathrm{d} y, \end{align*} $$
and so invoking again Lemma 3.1 and relabelling m, we arrive at
$$ \begin{align} \| \tilde \mu\|_{L^m_\omega L^\infty_t L^p_x} \lesssim \| \mu_0\|_{L^p_x}. \end{align} $$
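The exponent $(1-p)$ in the computation above can be traced as follows; this is a sketch in the notation of (10.7), with the shorthand $D_t(y)$ introduced here only for illustration. Under the change of variables $x=\Psi _{0\to t}(y)$, one has $\mathrm {d} x=j_{0\to t}(y)\,\mathrm {d} y$, the backward trajectory through $x$ is identified with the forward one through $y$, and so the weight in the integrand becomes
$$ \begin{align*} \exp\big(-p\,D_t(y)\big)\, j_{0\to t}(y) = \exp\big(-p\,D_t(y)\big)\exp\big(D_t(y)\big) = \exp\big((1-p)\,D_t(y)\big), \qquad D_t(y):=\int_0^t \mathrm{div}\, b_r(\Phi_{0\to r}(y))\,\mathrm{d} r. \end{align*} $$
In particular, for $p=1$ the weight equals $1$ and the computation reduces to conservation of mass, consistent with Remark 10.3-ii).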
Having established the uniform estimates (10.8)–(10.9), both existence claims for general b now follow from a standard compactness argument (see, for instance, [Reference Pardoux84] or [Reference Flandoli, Gubinelli and Priola40, Theorem 15]), so we will only sketch it quickly.
Consider smooth approximations $b^n\to b$, $u_0^n\to u_0$ and denote by $u^n$ the associated solutions; by reflexivity of $L^p_t L^p_\omega W^{1,p}_x$, we can extract a (not relabelled) subsequence such that $u^n\rightharpoonup u$ weakly in $L^p_t L^p_\omega L^p_x$. By properties of weak convergence, the limit u still belongs to $L^m_\omega L^\infty _t W^{1,p}_x$ and is progressively measurable, since the sequence $u^n$ was so; also observe that, as in Remark 10.3-i), we can assume u to be weakly continuous in time, so that it is in fact adapted. By the linear structure of the PDE, one can then finally verify that u is indeed a pathwise solution. Let us stress that this is where, for $\alpha <0$, assumption (10.6) is crucial, since otherwise it is unclear whether $b^n\cdot \nabla u^n$ converges to $b\cdot \nabla u$ in a weak sense (both w.r.t. $L^m_\omega $ and by testing against $\varphi \in C^\infty _c$); indeed, since $p\geq r'$, all objects are well-defined in $L^m_\omega L^1_t L^{1,\mathrm {loc}}_x$, and the claim follows from $b^n\to b$ and $u^n\rightharpoonup u$. The case of $\mu $ can be treated similarly; the only difference is that since $b\in L^r_t L^r_x + L^r_t L^\infty _x$ and $\mu \in L^m_\omega L^\infty _t (L^{r'}_x\cap L^1_x)$ by Remark 10.3, the additional $\mathbb {P}$-a.s. integrability constraint $\langle |\tilde b(\omega )|,\tilde \mu (\omega )\rangle <\infty $ coming from Definition 10.2 is also satisfied.
We now turn to establishing existence of sufficiently regular solutions to the continuity equation with well-chosen terminal data; handling the backward nature of the equation yields slightly worsened estimates compared to those of Proposition 10.5.
Proposition 10.6. Let $T\in [0,1]$ and let $\mu _T\in L^p$ be compactly supported. Then there exists a pathwise solution $\mu $ to (10.4) on $[0,T]$ with terminal condition $\mu \vert _{t=T}=\mu _T$; moreover, for any $m\in [1,\infty )$ and any $\tilde p<p$, it holds $\mu \in L^\infty _t L^m_\omega L^{\tilde p}_x$.
Proof. We can assume $\mathrm {supp} \mu _T \subset B_R$ for some $R\geq 1$. We will assume b to be regular and show how to derive suitable a priori estimates; the general case then follows by arguing similarly to Proposition 10.5. The solution is given explicitly by
$$ \begin{align*} \mu_t(x) = \mu_T(\Psi_{t\to T}(x)) \exp\Big( \int_t^T \mathrm{div} b_r (\Psi_{t\to r} (x)) \mathrm{d} r\Big). \end{align*} $$
For any fixed $t\in [0,T]$, it holds
$$ \begin{align*} \int_{\mathbb{R}^d} |\mu_t(x)|^{\tilde p} \mathrm{d} x & = \int_{\mathbb{R}^d} |\mu_T(\Psi_{t\to T}(x))|^{\tilde p} \exp\Big( \tilde p\int_t^T \mathrm{div} b_r (\Psi_{t\to r} (x)) \mathrm{d} r\Big) \mathrm{d} x\\ & = \int_{\mathbb{R}^d} |\mu_T(y)|^{\tilde p} \exp\Big( (\tilde p-1) \int_t^T \mathrm{div} b_r (\Psi_{r\leftarrow T} (y)) \mathrm{d} r\Big) \mathrm{d} y\\ & \leq \| \mu_T \|_{L^p_x}^{\tilde p} \bigg( \int_{B_R} \exp\Big( \frac{ p(\tilde p-1)}{p-\tilde p} \int_t^T \mathrm{div} b_r (\Psi_{r\leftarrow T} (y)) \mathrm{d} r\Big) \mathrm{d} y \bigg)^{1-\frac{\tilde p}{p}}, \end{align*} $$
where in the last passage, we used first $\mathrm {supp} \mu _T\subset B_R$ and then Hölder’s inequality. Applying again the change of variable $x=\Psi _{t\leftarrow T}(y)$ and the formula for $j_{t\to T}(x)$, overall we find a constant $\kappa =\kappa (p,\tilde p)$ such that
$$ \begin{align*} \big\| \| \mu_t\|_{L^{\tilde p}_x} \big\|_{L^m} \leq \| \mu_T\|_{L^p_x}^{\tilde p}\, \bigg\| \int_{\Psi_{t\to T}(B_R)} \exp\Big( \kappa \int_t^T \mathrm{div} b_r (\Psi_{t\to r} (y)) \mathrm{d} r\Big) \mathrm{d} y \bigg\|_{L^m}^{1-\frac{\tilde p}{p}}. \end{align*} $$
It remains to estimate the last quantity appearing on the r.h.s. above. To this end, let us set $N_y := j_{t\to T}(y)^\kappa $; as usual by Lemma 3.1, it holds $\| N_y\|_{L^m}\lesssim 1$, with an estimate uniform in y, t and T and only depending on $\| b\|_{L^q_t C^\alpha _x}$.
Thanks to estimates (6.1) and Lemma A.4, one can show that for any $\tilde m\in [1,\infty )$ and $\lambda>1$, uniformly in $t\in [0,T]$ it holds
$$ \begin{align*} \big\| \| \Psi_{t\to T}\|_{C^{0,\lambda}} \big\|_{L^{\tilde m}} <\infty\quad \text{ where } \quad \| \Psi_{t\to T}\|_{C^{0,\lambda}}:= \sup_{|x|\geq 1} |x|^{-\lambda} |\Psi_{t\to T}(x)|; \end{align*} $$
this is because one can first show finiteness of the associated $C^{\eta ,\lambda '}_x$-norm by Lemma A.4, and then deduce from it that $\Psi _{t\to T}$ also belongs to $C^{0,\lambda }_x$ for $\lambda =\lambda '+\eta $ (such an embedding readily follows from the definitions of such spaces).
Therefore, it holds
 $$ \begin{align*} \Big\| \int_{\Psi_{t\to T}(B_R)} N_y \mathrm{d} y \Big\|_{L^m} & \leq \sum_{n\in \mathbb{N}} \Big\| \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}} \in [n,n+1)} \int_{\Psi_{t\to T}(B_R)} N_y\, \mathrm{d} y \Big\|_{L^m}\\ & \leq \sum_{n\in \mathbb{N}} \Big\| \int_{B_{(n+1)R^\lambda}} \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n} N_y\, \mathrm{d} y \Big\|_{L^m}\\ & \leq \sum_{n\in \mathbb{N}} \int_{B_{(n+1)R^\lambda}} \| \chi_{ \| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n}\|_{L^{2m}} \|N_y\|_{L^{2m}}\, \mathrm{d} y\\& \lesssim \sum_{n\in \mathbb{N}} (n+1)^d R^{\lambda d}\, \mathbb{P} (\| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n)^{\frac{1}{2m}}\\ & \lesssim R^{\lambda d} \sum_{n\in \mathbb{N}} n^{d-\frac{\tilde m}{2m}} \big\| \| \Psi_{t\to T}\|_{C^{0,\lambda}}\big\|_{L^{\tilde m}}^{\frac{\tilde m}{2m}}, \end{align*} $$
where in the last passage, we used Markov’s inequality. Choosing $\tilde {m}$ large enough, so as to make the series convergent, then yields the conclusion.
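To make the choice of $\tilde m$ explicit, the last two steps can be sketched as follows. This bookkeeping is an added illustration, restricted to the terms with $n\geq 1$; the term $n=0$ is bounded directly, since a probability is at most $1$. Markov’s inequality gives
$$ \begin{align*} \mathbb{P} (\| \Psi_{t\to T}\|_{C^{0,\lambda}}\geq n)^{\frac{1}{2m}} \leq \Big( n^{-\tilde m}\, \mathbb{E}\big[ \| \Psi_{t\to T}\|_{C^{0,\lambda}}^{\tilde m}\big] \Big)^{\frac{1}{2m}} = n^{-\frac{\tilde m}{2m}}\, \big\| \| \Psi_{t\to T}\|_{C^{0,\lambda}}\big\|_{L^{\tilde m}}^{\frac{\tilde m}{2m}}, \qquad n\geq 1, \end{align*} $$
and since $(n+1)^d\leq 2^d n^d$ for $n\geq 1$, the resulting series $\sum_{n\geq 1} n^{d-\frac{\tilde m}{2m}}$ converges as soon as $\frac{\tilde m}{2m}-d>1$, that is, for any $\tilde m>2m(d+1)$.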
The importance of integrability of solutions to the backward continuity equation comes from the following (deterministic) duality lemma.
Lemma 10.7. Let b satisfy (10.6) and let v, $\rho $ be analytic weak solutions to, respectively, the forward transport and backward continuity equations associated to $\tilde {b}_t(\cdot ;\omega )$; assume that $v\in L^\infty _t W^{1,p_1}_x$ and $\rho \in L^{r'}_t (L^1_x\cap L^{p_2}_x)$ for some $p_1,\,p_2$ satisfying
$$\begin{align*}p_1,\,p_2\in [1,\infty),\quad \frac{1}{p_1}+\frac{1}{p_2}+\frac{1}{r}=1. \end{align*}$$
Then it holds
$$ \begin{align*} \langle v_T, \rho_T\rangle = \langle v_0, \rho_0\rangle. \end{align*} $$
Proof. The argument is relatively standard in the analytic community and is based on the use of mollifiers and commutators; see the seminal work [Reference DiPerna and Lions36]. Let $v^\varepsilon =v\ast g^\varepsilon $ for some standard mollifiers $g^\varepsilon $; since $v^\varepsilon $ is spatially smooth, we can test it against $\rho $ (cf. Remark 10.3-iii)), which combined with the respective PDEs yields the relation
$$ \begin{align*} \langle v^\varepsilon_T , \rho_T\rangle - \langle v^\varepsilon_0 , \rho_0\rangle = \int_0^T \langle (\tilde b\cdot \nabla v)^\varepsilon - \tilde b\cdot\nabla v^\varepsilon, \rho\rangle \mathrm{d} s. \end{align*} $$
In order to conclude, it then suffices to show that the r.h.s. converges to $0$ as $\varepsilon \to 0$. Recall that by assumption, $b= b^1+b^2$ with $b^1\in L^r_t L^r_x$, $b^2\in L^r_t L^\infty _x$, so that the same holds for $\tilde {b}$; we show how to deal with $\tilde b^1$, the other case being similar. By our assumptions, Hölder’s inequality and properties of mollifiers, it is easy to check that both $(\tilde b^1\cdot \nabla v)^\varepsilon $ and $\tilde b^1\cdot \nabla v^\varepsilon $ converge to $\tilde b^1\cdot \nabla v$ in $L^r_t L^{\tilde r}_x$, where $\tilde {r}\in (1,\infty )$ is defined by $1/\tilde {r}=1/r+1/p_1$. But then
 $$ \begin{align*} \bigg| \int_0^T \langle (\tilde b^1_t\cdot \nabla v_t)^\varepsilon - \tilde b^1_t\cdot\nabla v^\varepsilon_t, \rho_t\rangle \mathrm{d} t \bigg| & \leq \int_0^T \| (\tilde b^1_t\cdot \nabla v_t)^\varepsilon - \tilde b^1_t\cdot\nabla v^\varepsilon_t\|_{L^{\tilde r}_x}\, \| \rho_t\|_{L^{p_2}_x} \mathrm{d} t\\ & \leq \| (\tilde b^1\cdot \nabla v)^\varepsilon - \tilde b^1\cdot\nabla v^\varepsilon\|_{L^r_t L^{\tilde r}_x} \| \rho\|_{L^{r'}_t L^{p_2}_x}, \end{align*} $$
where the last term converges to $0$.
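As a quick check of the exponents used in this proof (an added remark, relying only on the definition of $\tilde r$ and on the relation between $p_1$, $p_2$ and $r$ in the statement), observe that
$$ \begin{align*} \frac{1}{\tilde r}+\frac{1}{p_2} = \frac{1}{r}+\frac{1}{p_1}+\frac{1}{p_2} = 1, \qquad \frac{1}{r}+\frac{1}{r'}=1, \qquad \| \tilde b^1_t\cdot \nabla v_t\|_{L^{\tilde r}_x} \leq \| \tilde b^1_t\|_{L^r_x}\, \| \nabla v_t\|_{L^{p_1}_x}. \end{align*} $$
Thus, the first inequality in the last display is Hölder’s inequality in space with the conjugate pair $(\tilde r,p_2)$, the second one is Hölder’s inequality in time with the conjugate pair $(r,r')$, and the third relation explains why $\tilde b^1\cdot \nabla v$ belongs to $L^r_t L^{\tilde r}_x$ whenever $v\in L^\infty _t W^{1,p_1}_x$.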
As a final ingredient, we give Ambrosio’s aforementioned superposition principle; we stress that the statement is deterministic, but we will apply it for fixed realizations of the random drift $\tilde {b}(\cdot ;\omega )$. Although the full statement is a bit technical, we invite the reader to consult the (more heuristic) Theorem 3.1 from [Reference Ambrosio1] to understand the role it plays in our analysis.
Theorem 10.8 (Theorem 3.2 from [Reference Ambrosio1]).
Let $\mu $ be a weak solution to the continuity equation $\partial _t \mu + \nabla \cdot (\mu f)=0$ such that $\mu _t\in \mathcal {M}_+(\mathbb {R}^d)$ for all t and
$$\begin{align*}\int_0^1 \int_{\mathbb{R}^d} |f_t(x)|\, \mu_t(\mathrm{d} x)\, \mathrm{d} t<\infty. \end{align*}$$
Then $\mu $ is a superposition solution, namely, there exists a measure $\eta \in \mathcal {M}_+(\mathbb {R}^d \times C_t)$, concentrated on the pairs $(x,\varphi )$ satisfying the relation
$$\begin{align*}\varphi_t = x + \int_0^t f_s(\varphi_s)\mathrm{d} s, \end{align*}$$
such that $\mu _t = (e_t)_\sharp \eta $ for all $t\in [0,1]$, where $e_t(x,\varphi )=\varphi _t$ is the evaluation map and $(e_t)_\sharp \eta $ denotes the pushforward measure.
We are now ready to give the following:
Proof of Theorem 10.4.
Both existence statements come from Proposition 10.5, so we only need to check path-by-path uniqueness.
Let us start with the continuity equation. We claim that the event $\tilde {\Omega }$ of full probability on which path-by-path uniqueness for (10.4) holds is the one for which we have uniqueness of solutions to the ODE $\dot {\varphi }_t=\tilde {b}_t(\varphi _t;\omega )$ for all $x\in \mathbb {R}^d$; its existence is granted by Theorems 4.4–5.6, which additionally imply that $\varphi _t=\Psi _{0\to t}(x;\omega )$. Indeed, suppose we are given any weak solution $\rho \in L^\infty _t L^p_x$ to (10.5); by our assumptions, and possibly Remark 10.3-ii), it holds $\int _0^1 \int _{\mathbb {R}^d} |\tilde {b}_t(x;\omega )| \mu _t(\mathrm {d} x) \mathrm {d} t<\infty $. We can then apply Theorem 10.8 to deduce that $\rho $ is a superposition solution; since uniqueness of solutions to $\dot {\varphi }_t=\tilde {b}_t(\varphi _t;\omega )$ holds, we readily deduce that $\rho _t = \Psi _{0\to t}(\cdot ;\omega )_\sharp \rho _0$, which gives uniqueness.
We now turn to the transport case; by linearity, we only need to find an event $\tilde \Omega $ on which any weak solution $v\in L^\infty _t W^{1,p}_x$ to (10.2) with $v_0=0$ is necessarily the trivial one. By Remark 10.3-i), we know that any solution is weakly continuous in time; thus, it suffices to verify that $v_t=0$ for all t in a dense subset of $[0,1]$. To this end, let us fix a countable collection $\{f^n\}_n$ of compactly supported smooth functions which are dense in $C^\infty _x$ and a countable dense set $\Gamma \subset [0,1]$. By Proposition 10.6, for any $f^n$ and $\tau \in \Gamma $, we can find a pathwise solution $\mu ^{\tau ,n}$ to the backward continuity equation on $[0,\tau ]$, with terminal condition $\mu ^{\tau ,n}\vert _{t=\tau }=f^n$, which $\mathbb {P}$-a.s. belongs to $L^q_t L^q_x$ for all $q\in [1,\infty )$. Since everything is countable, we can then find an event $\tilde {\Omega }\subset \Omega $ on which $\mu ^{\tau ,n}(\omega )$ are all defined at once and have the above regularity; we claim that this is the desired event where uniqueness of weak solutions to (10.2) in $L^\infty _t W^{1,p}_x$ holds. Indeed, since q is arbitrarily large and $p>r'$, we can apply Lemma 10.7 and use the fact that $v_0=0$ to deduce that
 $$ \begin{align*} 0 = \langle v_0, \mu^{\tau,n}(\cdot\,;\omega)\rangle = \langle v_\tau, f^n\rangle \quad \forall\, \tau\in \Gamma,\, f^n; \end{align*} $$
by density of $f^n$, it follows that $v_{\tau }=0$ for all $\tau \in \Gamma $, which by density of $\Gamma $ and continuity finally implies $v\equiv 0$.
Remark 10.9. In [Reference Galeati and Gubinelli49, Section 5.2], the authors show how to solve the transport equation (10.1) in a pathwise manner under the assumption that $T^{B^H}b \in C^\gamma _t C^2_x$ for some $\gamma>1/2$; in this case, one can treat purely distributional drifts b without enforcing (10.6). However, this assumption is satisfied under more restrictive conditions than (A) (e.g., if $b\in L^\infty _t C^\alpha _x$ for some $\alpha>2-1/(2H)$). We believe that existence and uniqueness for (10.1) (resp. (10.4)) should hold under (A) even when $\alpha <0$, without the need for (10.6), but we leave this problem for future investigations.
A Kolmogorov continuity type criteria
Let us recall (a conditional version of) the classical Azuma–Hoeffding inequality.
Lemma A.1. Let $k\in \mathbb {N}$ and $\{Y_i\}_{i=0}^k$ be a sequence of $\mathbb {R}^d$-valued martingale differences with respect to some filtration $\{\mathcal {F}_i\}_{i=0}^k$, with $Y_0=0$; assume that there exist deterministic constants $\{\delta _i\}_{i=1}^k$ such that $\mathbb {P}$-a.s. $|Y_i|\leq \delta _i$ for all i. Then for
$$ \begin{align*} S_j:=\sum_{i=1}^j Y_i,\qquad\Lambda:=\delta_1^2+\cdots+\delta_k^2, \end{align*} $$
one has the $\mathbb {P}$-a.s. inequality
$$ \begin{align} \mathbb{E}\bigg[ \exp\Big(\frac{|S_k|^2}{4 d \Lambda}\Big)\bigg\vert \mathcal{F}_0\bigg]\leq 3. \end{align} $$
Proof. The proof goes along the same lines as standard Azuma–Hoeffding; since we have not found a direct reference in the literature, we present it here.
First, observe that we can reduce ourselves to the case $d=1$ by reasoning componentwise, the general one following from a simple application of conditional Jensen’s inequality.
Next, we claim that the following version of Hoeffding’s lemma holds: given a random variable X and a $\sigma $-algebra $\mathcal {F}$ such that $\mathbb {E}[X\vert \mathcal {F}]=0$ and $a\leq X \leq b$ $\mathbb {P}$-a.s., it holds
$$ \begin{align} \mathbb{E}[\exp(\lambda X)\vert \mathcal{F}] \leq \exp\bigg( \frac{\lambda^2 (b-a)^2}{8}\bigg)\quad \forall\, \lambda\in\mathbb{R}. \end{align} $$
By homogeneity, it suffices to prove (A.2) for $b-a=1$; in this case, we have the basic inequality $e^{\lambda x} \leq (b-x)e^{\lambda a} + (x-a)e^{\lambda b}$ for all $x\in [a,b]$. Evaluating at X and taking conditional expectation, we obtain
$$ \begin{align*} \mathbb{E} [e^{\lambda X}\vert \mathcal{F}]\leq (a+1)e^{\lambda a} - a e^{\lambda (a+1)} = e^{H(\lambda)}, \quad H(\lambda):=\lambda a + \log (1+a - e^\lambda a). \end{align*} $$
It can be readily checked that $H(0)=H'(0)=0$ and $H''(\lambda )\leq 1/4$, which by Taylor expansion yields $H(\lambda )\leq \lambda ^2/8$ and thus (A.2).
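The properties of H used here can be verified by a direct computation; the following is a sketch, in which the auxiliary quantity $\theta (\lambda )$ is introduced only for illustration. Note first that $a\in [-1,0]$, since $\mathbb {E}[X\vert \mathcal {F}]=0$ and $a\leq X\leq a+1$, so that $1+a-ae^\lambda>0$ and H is well defined. Then
$$ \begin{align*} H'(\lambda) = a + \theta(\lambda), \qquad H''(\lambda) = \theta(\lambda)\big(1-\theta(\lambda)\big)\leq \frac{1}{4}, \qquad \theta(\lambda) := \frac{-a e^{\lambda}}{1+a-a e^{\lambda}}\in [0,1], \end{align*} $$
so that $H'(0)=a-a=0$, and Taylor’s formula with Lagrange remainder gives $H(\lambda )=\frac {1}{2}H''(\xi )\lambda ^2\leq \lambda ^2/8$ for some $\xi $ between $0$ and $\lambda $.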
Next, given the sequence $\{Y_k\}_k$ as in the hypothesis, we can assume by homogeneity $\Lambda =1$ and apply recursively Hoeffding’s lemma as follows:
 $$ \begin{align*} \mathbb{E}[ \exp(\lambda S_k)\vert \mathcal{F}_0] & = \mathbb{E}\big[ \exp(\lambda S_{k-1})\, \mathbb{E}[\exp(\lambda Y_k) \vert \mathcal{F}_{k-1}] \big\vert \mathcal{F}_0\big]\\ & \leq \exp\big( \lambda^2 (2 \delta_k)^2/8\big) \mathbb{E}[ \exp(\lambda S_{k-1})\vert \mathcal{F}_0] \leq \ldots \leq e^{\lambda^2/2}. \end{align*} $$
By the inequality $e^{|x|}\leq e^x+e^{-x}$ and Chernoff’s conditional bound, we have
$$ \begin{align*} \mathbb{P}(|S_k|>a\vert \mathcal{F}_0) \leq \inf_{\lambda >0}e^{-\lambda a}\, \mathbb{E}[ e^{\lambda |S_k|}\vert \mathcal{F}_0] \leq 2 \inf_{\lambda>0} e^{-\lambda a + \lambda^2/2} = 2 e^{-a^2/2}. \end{align*} $$
Therefore, we arrive at
$$ \begin{align*} \mathbb{E}\bigg[ \exp\Big( \frac{|S_k|^2}{4}\Big)\bigg\vert \mathcal{F}_0\bigg] = 1 + \int_1^{+\infty} \mathbb{P}\bigg(|S_k|> \sqrt{4\log s}\,\bigg\vert \mathcal{F}_0\bigg)\, \mathrm{d} s \leq 1 + 2\int_1^{+\infty} s^{-2} \mathrm{d} s = 3. \end{align*} $$
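For completeness, the elementary computations behind the last three displays can be sketched as follows (an added illustration; recall the normalisation $\Lambda =\delta _1^2+\cdots +\delta _k^2=1$). Each step of the recursion applies (A.2) with $b-a=2\delta _i$, for $a>0$ the infimum in the Chernoff bound is attained at $\lambda =a$, and the tail bound is evaluated at $a=\sqrt {4\log s}$ for $s\geq 1$:
$$ \begin{align*} \prod_{i=1}^k \exp\Big(\frac{\lambda^2 (2\delta_i)^2}{8}\Big) = \exp\Big(\frac{\lambda^2 \Lambda}{2}\Big) = e^{\lambda^2/2}, \qquad \inf_{\lambda>0} \Big( -\lambda a + \frac{\lambda^2}{2}\Big) = -\frac{a^2}{2}, \qquad 2 e^{-\frac{1}{2} (4\log s)} = 2 s^{-2}. \end{align*} $$
The integral over $s\in [0,1]$ contributes at most $1$, since the integrand is a probability, which accounts for the first summand in the final bound.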
Next, we pass to a conditional Kolmogorov-type lemma, stated in a way which is suitable for our purposes.
Lemma A.2. Let E be a Banach space, $X:[0,T]\to E$ be a continuous random process; suppose there exist $\alpha ,\,\beta \in (0,1]$, a control $w:[0,T]^2\to [0,\infty )$, a constant $K>0$ and a $\sigma $-algebra $\mathcal {F}$ such that
$$ \begin{align} \mathbb{E}\bigg[\exp\bigg(\frac{\| X_{s,t} \|_E^2}{|t-s|^{2\alpha} \, w(s,t)^{2\beta}}\bigg)\bigg\vert \mathcal{F}\bigg] \leq K \quad \forall\, s<t. \end{align} $$
Then for any $\varepsilon>0$, there exists a constant $\mu =\mu (\varepsilon )>0$ such that
$$ \begin{align} \mathbb{E}\bigg[\exp\bigg( \mu\, \sup_{s<t} \frac{\| X_{s,t}\|_E^2}{|t-s|^{2(\alpha-\varepsilon)} \, w(s,t)^{2\beta}}\bigg)\bigg \vert \mathcal F\bigg] \leq e \, K. \end{align} $$
Proof. Since we are already assuming X to be continuous, the supremum over $s<t$ appearing in (A.4) equals the supremum over $s,\, t$ taken over dyadic points. Up to rescaling, we may assume wlog $T=1$.
For any $n\in \mathbb {N}$ and $k\in \{0,\ldots , 2^n\}$, set $t^n_k= k 2^{-n}$ and define a random variable
 $$\begin{align*}J=\sum_{n=1}^\infty 2^{-2n} \sum_{k=0}^{2^n-1} \exp\bigg( \frac{\| X_{t^n_k,t^n_{k+1}}\|_E^2}{2^{-2n\alpha} w(t^n_k, t^n_{k+1})^{2\beta}}\bigg); \end{align*}$$
by (A.3), it holds $\mathbb {E}[J\vert \mathcal {F}]\leq K$. Now take $s,t$ to be dyadic points satisfying $|t-s|\sim 2^{-m}$. Then by standard chaining arguments (see, for example, the proof of [Reference Friz and Hairer43, Theorem 3.1]), it holds
 $$\begin{align*}\| X_{s,t}\|_E \lesssim \sum_{n\geq m} \sup_k \| X_{t^n_k,t^n_{k+1}} \|_E; \end{align*}$$
however, by the definition of J, it holds
 $$\begin{align*}\| X_{t^n_k,t^n_{k+1}} \|_E \leq 2^{-n\alpha} w(t^n_k, t^n_{k+1})^\beta \sqrt{\log(2^{2n} J)} \lesssim_\varepsilon 2^{-n(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J}) \end{align*}$$
so that
 $$ \begin{align*} \| X_{s,t} \|_E & \lesssim \sum_{n\geq m} 2^{-n(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J})\\ & \lesssim 2^{-m(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J}) \sim |t-s|^{\alpha-\varepsilon} w(s,t)^\beta (1+\sqrt{\log J}). \end{align*} $$
$$ \begin{align*} \| X_{s,t} \|_E & \lesssim \sum_{n\geq m} 2^{-n(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J})\\ & \lesssim 2^{-m(\alpha-\varepsilon)} w(s,t)^\beta (1+\sqrt{\log J}) \sim |t-s|^{\alpha-\varepsilon} w(s,t)^\beta (1+\sqrt{\log J}). \end{align*} $$
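For the reader's convenience, the two inequalities in the bound on $\| X_{t^n_k,t^n_{k+1}} \|_E$ above can be unpacked as follows. This is only a sketch of the elementary steps; it assumes, as is implicit in the chaining argument, that the dyadic intervals involved are contained in $[s,t]$, so that $w(t^n_k,t^n_{k+1})\leq w(s,t)$ by monotonicity of controls. Since all summands in the definition of J are nonnegative, each of them is bounded by J; moreover, $J\geq \sum_{n\geq 1} 2^{-2n}\, 2^n = 1$, so that $\log J\geq 0$. Hence
$$ \begin{align*} 2^{-2n} \exp\bigg( \frac{\| X_{t^n_k,t^n_{k+1}}\|_E^2}{2^{-2n\alpha} w(t^n_k, t^n_{k+1})^{2\beta}}\bigg) \leq J \quad &\Longrightarrow \quad \| X_{t^n_k,t^n_{k+1}} \|_E \leq 2^{-n\alpha} w(t^n_k, t^n_{k+1})^\beta \sqrt{\log(2^{2n} J)},\\ \sqrt{\log(2^{2n} J)} \leq \sqrt{2n\log 2} + \sqrt{\log J} \quad &\lesssim_\varepsilon \quad 2^{n\varepsilon} (1+\sqrt{\log J}), \end{align*} $$
where the last line uses $\sqrt{a+b}\leq \sqrt{a}+\sqrt{b}$ for $a,b\geq 0$ and $\sqrt{n}\lesssim_\varepsilon 2^{n\varepsilon}$.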
Overall, we deduce the existence of a constant $C=C(\varepsilon )>0$ such that
$$ \begin{align} \sup_{s<t} \frac{\| X_{s,t}\|_E}{|t-s|^{\alpha-\varepsilon} w(s,t)^\beta} \leq C (1+\sqrt{\log J}). \end{align} $$
The conclusion now readily follows by applying $x\mapsto \exp (\mu x^2)$ to both sides of (A.5) and choosing $\mu =\mu (\varepsilon )$ such that $2\mu C^2(\varepsilon ) =1$, which gives
$$ \begin{align*} \mathbb{E}\Big[\exp\big(\mu C^2(1+\sqrt{\log J})^2\big)\Big\vert \mathcal{F}\Big] \leq \mathbb{E}\Big[\exp\big(2\mu C^2(1+\log J)\big)\Big\vert \mathcal{F}\Big] = e\, \mathbb{E}[J\vert \mathcal{F}] \leq e K. \end{align*} $$
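The first inequality in the last display rests on an elementary bound, and the equality on the choice of $\mu$. As a brief sketch (recall that $J\geq 1$, since each exponential in its definition is at least 1 and $\sum_{n\geq 1} 2^{-2n}\, 2^n=1$, so that $\log J\geq 0$):
$$ \begin{align*} (1+\sqrt{x})^2 = 1+2\sqrt{x}+x \leq 2(1+x) \quad \forall\, x\geq 0, \qquad \exp\big(2\mu C^2(1+\log J)\big) = \exp(1+\log J) = e\, J \quad \text{when } 2\mu C^2=1. \end{align*} $$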
Going through an almost identical argument, one can also obtain the following result, whose proof is therefore omitted.
Lemma A.3. Let E be a Banach space, $X:[0,T]\to E$ be a continuous random process; suppose there exist $\alpha ,\,\beta \in (0,1]$, $m\in (1,\infty )$, a control $w:[0,T]^2\to [0,\infty )$, a constant $K>0$ and a $\sigma $-algebra $\mathcal {F}$ such that
$$ \begin{align} \mathbb{E}\big[\,\| X_{s,t}\|_E^m\big \vert \mathcal{F}\big]^{1/m} \leq K |t-s|^\alpha\, w(s,t)^\beta \quad \forall\, s<t. \end{align} $$
Then for any $0<\gamma <\alpha -1/m$, there exists a constant $C=C(\alpha ,\gamma ,m)>0$ such that
$$ \begin{align} \mathbb{E}\bigg[ \bigg(\sup_{s<t} \frac{\| X_{s,t}\|_E}{|t-s|^{\gamma} \, w(s,t)^{\beta}}\bigg)^m \bigg \vert \mathcal F\bigg]^{1/m} \leq C \, K. \end{align} $$
Let us also mention that, although for simplicity we stated Lemmas A.2 and A.3 for a norm $\|\cdot \|_E$, it suffices for it to be a seminorm instead.
Next, we need some basic lemmas in order to control the space-time regularity of random vector fields $A:[0,1]\times \mathbb {R}^d\to \mathbb {R}^m$. We start by considering the time-independent case.
Lemma A.4. Let $F:\mathbb {R}^d\to \mathbb {R}^n$ be a continuous field and suppose there exist $\alpha \in (0,1]$, $m\in (1,\infty )$, a constant $K>0$ and a $\sigma $-algebra $\mathcal {F}$ such that
$$ \begin{align} \| F(x)-F(y)\|_{L^m\vert \mathcal{F}} \leq K |x-y|^\alpha \quad \forall\, x,y\in\mathbb{R}^d. \end{align} $$
Then for any choice of parameters $\lambda ,\eta \in (0,1]$ such that $\eta <\alpha -d/m$, $\lambda> \alpha -\eta $, there exists a constant $C=C(\alpha ,m,d,n,\eta ,\lambda )$ such that

Proof. By arguing componentwise, we can restrict to $n=1$; by homogeneity, we can assume $K=1$. Recall that by the classical Garsia-Rodemich-Rumsay lemma, there exists a constant $c = c(d,\eta ,\alpha ,m)$ such that, for any deterministic continuous function f and any $R>0$, it holds

thus, taking conditional expectation and applying Fubini, we find

Finally, observe that

with the last quantity being finite under our assumptions.
A combination of Lemmas A.3 and A.4 immediately yields the following.
Corollary A.5. Let $G:[0,1]\times \mathbb {R}^d\to \mathbb {R}^n$ be a continuous random vector field and assume there exist parameters $\alpha ,\beta _1,\beta _2\in (0,1]$, $m\in (1,\infty )$, a control w, a constant $K>0$ and a $\sigma $-algebra $\mathcal {F}$ such that
$$ \begin{align} \| G_{s,t}(x)-G_{s,t}(y)\|_{L^m\vert \mathcal{F}} \leq K |x-y|^\alpha |t-s|^{\beta_1} w(s,t)^{\beta_2}\quad \forall\, x,y\in\mathbb{R}^d,\, s<t. \end{align} $$
Then for any choice of parameters
$$\begin{align*}\gamma <\beta_1-\frac{1}{m},\quad \eta<\alpha-\frac{d}{m},\quad \lambda>\alpha-\eta, \end{align*}$$
there exists $C>0$, depending on all the previous parameters except K, such that

B Some a priori estimates for Young equations
In this appendix, we prove some basic bounds on (linear and nonlinear) Young differential equations, which are used several times in the article. Such estimates are folklore, but since we did not find an appropriate version in the literature, we provide short proofs.
Lemma B.1. Let $A\in C_t^{p-{\mathrm {var}}} C^{\eta }_x$ with $\eta \in (0,1)$, $p\in [1,2)$ satisfying $(1+\eta )/p>1$; set  . Let y be any solution to the nonlinear Young equation
$$ \begin{align*} y_t=y_0+\int_0^t A_{\mathrm{d} s}(y_s) \end{align*} $$
on $[0,1]$; then one has the bounds
$$ \begin{align} |y_{s,t}| \lesssim w_A(s,t)^{\frac{1}{p}} + w_A(s,t), \qquad |y_{s,t}-A_{s,t}(y_s)| \lesssim w_A(s,t)^{\frac{1+\eta}{p}} + w_A(s,t)^{\frac{1}{p}+\eta} \end{align} $$
valid for all $(s,t)\in [0,1]_\leq ^2$, where the hidden constants only depend on $(\eta ,p)$. Similar bounds also hold for solutions only defined on an interval $[S,T]\subset [0,1]$.
Proof. By definition, y must be of finite q-variation for some q satisfying $1/p +\eta /q>1$; applying (5.3) with $x=y$, one finds

which in particular shows that y is of finite p-variation. Then going through the same computation with $q=p$ and applying [Reference Friz and Victoir45, Proposition 5.10-(i)], there exists a constant C such that, for any $s\leq t$, it holds

where in the second step, we used the fact that $\eta \in (0,1)$ and Young’s inequality. This readily implies a local bound of the form

We can then apply [Reference Friz and Victoir45, Proposition 5.10-(ii)] to deduce that, for all $(s,t)\in [0,1]_{\leq }^2$,

The first inequality in (B.1) immediately follows from (B.2), the second one from a combination of (B.2) with (5.3) for $x=y$.
In the next statement, we consider instead the more standard affine Young equations. In particular, $t\mapsto A_t$ is an $\mathbb {R}^{d\times d}$-valued map of finite p-variation, and the notation $\int _0^t \mathrm {d} A_s\, x_s$ denotes a usual Young integral, equivalently the (deterministic) sewing of the germ $\Sigma _{s,t}:= A_{s,t} x_s$.
Lemma B.2. Let x be a solution to the affine Young equation
$$\begin{align*}\mathrm{d} x_t = \mathrm{d} A_t\, x_t + \mathrm{d} z_t, \quad x\vert_{t=0}=x_0, \end{align*}$$
where $A\in C^{p-{\mathrm {var}}}_t \mathbb {R}^{d\times d}$ and $z\in C^{\tilde {p}-{\mathrm {var}}}_t$, for some $p\in [1,2)$ and $\tilde {p}\geq p$ such that $1/p+1/\tilde {p}>1$; assume $z_0=0$. Then there exists a constant $C=C(p,\tilde {p})>0$ such that

When $z=0$, setting  , it holds

Proof. Let us first apply the change of variable $\theta =x-z$, so that $\theta $ solves
$$\begin{align*}\mathrm{d} \theta_t = \mathrm{d} A_t\, \theta_t + \mathrm{d} A_t\, z_t = \mathrm{d} A_t\, \theta_t + \mathrm{d} \tilde{z}_t \end{align*}$$
where $\tilde {z}_t:=\int _0^t \mathrm {d} A_s\, z_s$. The advantage of this maneuver is that $\tilde {z}$ is also of finite p-variation and controlled by (a multiple of) $w^{1/p}$. Indeed, by Young integration, it holds

For any $s<t$, define

and similarly for $\tilde {z}$. Manipulating the equation for $\theta $ in a standard manner, one finds a constant $C>0$ such that, for any $s<t$, it holds

If $Cw(0,1)^{1/p}\leq 1/2$, then the estimate (B.6) buckles with $s=0,t=1$. Otherwise, define recursively an increasing sequence $t_i$ by $t_0=0$ and $C w(t_i,t_{i+1})^{1/p}\in (1/3,1/2)$ and $t_n=1$ for some n. Set $J_i:=\sup _{r\in [t_i,t_{i+1}]} |\theta _r|$ with the convention $J_{-1}=|x_0|$. Then thanks to our choice of $t_i$ and equation (B.6), it holds

Recursively, this implies

Finally observe that, by superadditivity of w and our choice of $t_i$, it holds
$$\begin{align*}n \leq (3C)^p \sum_i w(t_i,t_{i+1}) \leq (3C)^p w(0,1), \end{align*}$$
and therefore by (B.5),

with some other constant $C'>0$. Substituting this bound back into (B.6), we similarly get

C Fractional regularity and Girsanov’s transform
We collect in this appendix several definitions of fractional regularity and show how, in certain regularity regimes, they can be combined with our results so as to verify the applicability of Girsanov’s transform to the singular SDEs under consideration.
We start by recalling several classical definitions of fractional spaces for paths $f:[0,1]\to E$, E being a Banach space. For $\beta \in (0,1)$ and $p\in [1,\infty )$, the fractional Sobolev space $W^{\beta ,p}=W^{\beta ,p}(0,1;E)$ is defined as the set of $f\in L^p(0,1;E)$ such that

Similarly, we define the Besov–Nikolskii spaces $N^{\beta ,p}=N^{\beta ,p}(0,1;E)$ as the collection of all $f\in L^p(0,1;E)$ such that

In the case $p=\infty $, we will set $W^{\beta ,p}=N^{\beta ,p}=C^\beta $. Although we will not need it, let us mention that these spaces are particular instances of the Besov spaces $B^\beta _{p,q}$ as defined in [Reference Simon93]; indeed, $W^{\beta ,p}=B^\beta _{p,p}$ and $N^{\beta ,p}=B^\beta _{p,\infty }$.
There is a final class of spaces we will need, which is an original contribution of this work; many processes arising from stochastic sewing can be shown to belong to this class, thanks to Lemmas A.2–A.3. Given $\beta \in (0,1]$, $p\in [1,\infty )$ with $\beta> 1/p$, we define the space $D^{\beta ,p}=D^{\beta ,p}(0,1;E)$ as the set of all f for which there exists a continuous control $w=w(f)$ such that
$$ \begin{align} \| f_{s,t}\|_E \leq |t-s|^{\beta-\frac{1}{p}}\, w(s,t)^{\frac{1}{p}}\quad \forall\, s<t. \end{align} $$
Observe that by superadditivity, if such a control w exists, then the optimal choice must necessarily be given by

where the supremum runs over all possible finite partitions $s=t_0 < t_1<\ldots <t_n=t$ of $[s,t]$. We can therefore endow the space $D^{\beta ,p}$ with the norm

which makes them Banach spaces; observe the analogy with the definition of $C^{p-{\mathrm {var}}}$ and its characterization via controls. In particular, if a function f is known to satisfy (C.1), then it must hold  .
For $\beta>1/p$, we define $W^{\beta ,p}_0=\{f\in W^{\beta ,p}: f_0=0\}$ (as we will shortly see, this is a good definition, as elements of $W^{\beta ,p}$ are continuous functions), and similarly for $N^{\beta ,p}_0$ and $D^{\beta ,p}_0$.
The next proposition summarises the embeddings between these classes of spaces, as well as the Cameron–Martin spaces $\mathcal {H}^H$ and spaces of finite q-variation.
Proposition C.1. Let $\beta \in (0,1]$, $p\in [1,\infty )$ with $\beta> 1/p$; then, the following hold:
- i) for any $\varepsilon>0$, we have $ W^{\beta ,p} \hookrightarrow D^{\beta ,p} \hookrightarrow N^{\beta ,p} \hookrightarrow W^{\beta -\varepsilon ,p}$;
- ii) if $\bar \beta \leq \beta $ and $\beta -1/p\geq \bar \beta -1/\bar p$, then $N^{\beta ,p}\hookrightarrow N^{\bar \beta ,\bar p}$; in particular, $N^{\beta ,p} \hookrightarrow C^{\beta -1/p}$;
- iii) $N^{\beta ,p} \hookrightarrow C^{1/\beta -{\mathrm {var}}} \hookrightarrow N^{\beta ,1/\beta }$;
- iv) let $H\in (0,1/2)$ and $E=\mathbb {R}^d$; then for any $\varepsilon>0$, it holds
$$ \begin{align*} W^{H+\frac{1}{2}+\varepsilon,2}_0 \hookrightarrow \mathcal{H}^H\hookrightarrow W^{H+\frac{1}{2}-\varepsilon,2}_0; \end{align*} $$
in particular, $\mathcal {H}^H\hookrightarrow C^{q-{\mathrm {var}}}$ for any $q>(H+1/2)^{-1}$.
Proof. i) The last embedding $N^{\beta ,p} \hookrightarrow W^{\beta -\varepsilon ,p}$ is classical and can be found in [Reference Simon93, Corollary 23]. The embedding $W^{\beta ,p} \hookrightarrow D^{\beta ,p}$ follows from [Reference Friz and Victoir42, Theorem 2]; in particular, by the Garsia-Rodemich-Rumsay lemma, the associated control $w_f$ can be taken as
$$\begin{align*}w_f(s,t) = \int_{[s,t]^2} \frac{\| f_{r,u}\|_E^p}{|r-u|^{1+\beta p}} \, \mathrm{d} r \mathrm{d} u. \end{align*}$$
It remains to show the embedding $\mathcal {D}^{\beta ,p} \hookrightarrow N^{\beta ,p}$; this follows the same technique used to show that $C^{p-{\mathrm {var}}}\hookrightarrow N^{1/p,p}$ (see, for example, [Reference Liu, Prömel and Teichmann73, Proposition 4.3]). Indeed, for any $h\in [0,1]$, it holds
$$ \begin{align*} \| f_{h+\cdot} - f_{\cdot}\|_{L^p}^p = \int_0^{1-h} \| f_{t,h+t}\|_E^p \mathrm{d} t \leq |h|^{\beta p-1} \int_0^{1-h} w(t,h+t) \mathrm{d} t, \end{align*} $$
where  . Denoting by K the largest integer such that $Kh\leq 1-h$, we have
$$ \begin{align*} \int_0^{1-h} w(t,h+t) \mathrm{d} t&\leq \int_0^{Kh} w(t,h+t) \mathrm{d} t +|h| w(0,1) \\ & = \sum_{i=0}^{K-1} \int_{ih}^{(i+1)h} w(s,h+s) \mathrm{d} s+|h| w(0,1) \\ &= \int_0^h \sum_{i=0}^{K-1} w(ih+s,(i+1)h+s) \mathrm{d} s+|h| w(0,1) \\ &\leq \int_0^h w(0,1) \mathrm{d} s +|h| w(0,1)= 2 |h| w(0,1), \end{align*} $$
where in the last inequality, we used the superadditivity of w. Overall, we conclude that  .
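As a sketch of how the two displays above combine (the normalisation of the $N^{\beta ,p}$ seminorm as $\sup _{h} |h|^{-\beta }\| f_{h+\cdot } - f_{\cdot }\|_{L^p}$ is the standard one and is assumed here, since only the increments enter the computation), one finds
$$ \begin{align*} \| f_{h+\cdot} - f_{\cdot}\|_{L^p}^p \leq |h|^{\beta p-1}\cdot 2 |h|\, w(0,1) = 2 |h|^{\beta p}\, w(0,1) \quad \Longrightarrow \quad \sup_{h\in (0,1]} \frac{\| f_{h+\cdot} - f_{\cdot}\|_{L^p}}{|h|^{\beta}} \leq 2^{\frac{1}{p}}\, w(0,1)^{\frac{1}{p}}. \end{align*} $$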
ii) These embeddings can be found in, for example, [Reference Simon93, Corollary 22], [Reference Simon93, Corollary 26].
iii) These embeddings can be found in, for example, [Reference Liu, Prömel and Teichmann73, Proposition 4.1], [Reference Liu, Prömel and Teichmann73, Proposition 4.3].
iv) The second embedding $\mathcal {H}^H\hookrightarrow W^{H+\frac {1}{2}-\varepsilon ,2}_0$ is the result of [Reference Friz and Victoir42, Theorem 3]; the last one follows from it combined with $N^{q,2}\hookrightarrow C^{1/q-{\mathrm {var}}}$. It only remains to show the first embedding. Although we believe it to be common knowledge, we have not found a proof in the literature; thus, we give a detailed one.
Given $ f\in W^{H+1/2+\varepsilon ,2}_0$, in order to verify that $f\in \mathcal {H}^H$, we need to check that $K^{-1}_H f\in L^2$, where
$$ \begin{align*} K^{-1}_H f = s^{1/2-H} D_{0+}^{1/2-H} s^{H-1/2} D^{2H}_{0+} f \end{align*} $$
(see equation (12) from [Reference Nualart and Ouknine81]); $D_{0+}^\gamma $ denotes the Riemann-Liouville fractional derivative of order $\gamma $, for which again we refer to [Reference Nualart and Ouknine81].
By using standard embeddings between $W^{\delta ,2}$ spaces and potential spaces $I^+_{\delta ,2}$ (cf. [Reference Decreusefond34, Proposition 5]), up to losing an arbitrarily small fraction of regularity, we know that for any $f\in W^{H+1/2+\varepsilon ,2}_0$, it holds $h:=D^{2H}_{0+} f \in W^{1/2-H+\varepsilon /2,2}$ (this is the only point in the proof where the condition $f(0)=0$ is needed). Thus, we are left with verifying that, for the choice $\gamma =1/2-H$, it holds
$$ \begin{align*} (K^{-1}_H f)_t = C_{\gamma} \bigg( t^{-\gamma} h_t + \gamma t^\gamma \int_0^t \frac{t^{-\gamma} h_t - s^{-\gamma} h_s}{|t-s|^{1+\gamma}} \mathrm{d} s \bigg) \in L^2(0,1;\mathbb{R}^d). \end{align*} $$
From now on, we will drop the constants $C_\gamma $ and $\gamma $ for simplicity.
For the first term, observing that $t^{-\gamma }\in L^r$ for any r such that $1/r>1/2-H$ and that $h\in W^{1/2-H+\varepsilon /2,2}\hookrightarrow L^p$ for $1/p=H-\varepsilon /2$, it is easy to check by Hölder’s inequality (with $1/r=1/2-H+\varepsilon /2$, so that $1/r+1/p=1/2$) that $t^{-\gamma } h_t \in L^2$.
By time rescaling and addition and subtraction, we can split the integral term respectively into
$$ \begin{align*} I^1_t := \int_0^t \frac{h_t-h_s}{|t-s|^{1+\gamma}} \mathrm{d} s, \quad I^2_t := t^{-\gamma} \int_0^1 \frac{1-s^{-\gamma}}{(1-s)^{1+\gamma}} h_{t s} \mathrm{d} s. \end{align*} $$
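The splitting can be verified directly. As a sketch, adding and subtracting $t^{-\gamma } h_s$ in the numerator and then substituting $s=tu$ in the second integral gives
$$ \begin{align*} t^\gamma \int_0^t \frac{t^{-\gamma} h_t - s^{-\gamma} h_s}{|t-s|^{1+\gamma}} \mathrm{d} s &= \int_0^t \frac{h_t-h_s}{|t-s|^{1+\gamma}} \mathrm{d} s + t^\gamma \int_0^t \frac{t^{-\gamma}-s^{-\gamma}}{|t-s|^{1+\gamma}}\, h_s\, \mathrm{d} s\\ &= I^1_t + t^{-\gamma} \int_0^1 \frac{1-u^{-\gamma}}{(1-u)^{1+\gamma}}\, h_{tu}\, \mathrm{d} u = I^1_t + I^2_t. \end{align*} $$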
For $I^1$, it holds
$$ \begin{align*} \int_0^1 |I^1_t|^2 \mathrm{d} t \leq \int_0^1 \bigg( \int_0^1 \frac{|h_t-h_s|}{|t-s|^{1+\gamma}} \mathrm{d} s\bigg)^2 \mathrm{d} t \lesssim \int_{[0,1]^2} \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \mathrm{d} s \mathrm{d} t \lesssim \| h\|_{W^{\gamma+\varepsilon/2,2}}^2, \end{align*} $$
where in the middle passage, we used Jensen’s inequality. To handle $I^2$, define $F^\gamma _s := (1-s^{-\gamma })/(1-s)^{1+\gamma }$; $F^\gamma $ is only unbounded at the points $s=0$ and $s=1$, where it behaves asymptotically respectively as $-s^{-\gamma }$ and $(1-s)^{-\gamma }$, and therefore $F^\gamma \in L^1\cap L^2$. As before, $h\in L^p$ for $1/p=H-\varepsilon /2< 1/2$, and therefore by Hölder’s inequality
$$ \begin{align*} |I^2_t| \leq t^{-\gamma} \| F^\gamma\|_{L^{p'}} \| h_{t\cdot}\|_{L^p} \sim t^{-\gamma-\frac{1}{p}} \| F^\gamma\|_{L^{p'}} \| h\|_{L^p} \sim t^{\varepsilon/2-1/2}, \end{align*} $$
which readily implies $I^2\in L^2$ as well.
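For completeness, the middle inequality in the estimate for $I^1$ can be obtained as follows. This is a sketch written via the Cauchy–Schwarz inequality, which is one possible route: splitting the kernel as $|t-s|^{-(1+\gamma )} = |t-s|^{-(1-\varepsilon )/2}\, |t-s|^{-(1+2\gamma +\varepsilon )/2}$, one has, for every $t\in [0,1]$,
$$ \begin{align*} \bigg( \int_0^1 \frac{|h_t-h_s|}{|t-s|^{1+\gamma}} \mathrm{d} s\bigg)^2 \leq \int_0^1 \frac{\mathrm{d} s}{|t-s|^{1-\varepsilon}} \int_0^1 \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \mathrm{d} s \leq \frac{2}{\varepsilon} \int_0^1 \frac{|h_t-h_s|^2}{|t-s|^{1+2\gamma+\varepsilon}} \mathrm{d} s, \end{align*} $$
and integrating in t yields the bound by the double integral.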
Remark C.2. By Proposition C.1, for a deterministic path g to belong to the Cameron-Martin space $\mathcal {H}^H$ for $H\in (0,1/2)$, it suffices to verify that $g\in \mathcal {D}^{\beta ,p}$ for parameters $p\in (1,2]$ and $\beta>0$ satisfying
$$ \begin{align} \beta-\frac{1}{p}>H, \end{align} $$
in which case we have the estimate $\| g\|_{\mathcal {H}^H} \lesssim \| g\|_{\mathcal {D}^{\beta ,p}}$. Therefore, if a stochastic process h is adapted and belongs to $\mathcal {D}^{\beta ,p}$, then for a sequence of stopping times $(\tau _n)_{n\in \mathbb {N}}$ satisfying $\tau _n\nearrow \infty $, the laws of $B^H$ and $B^H_\cdot +h_{\cdot \wedge \tau _n}$ are mutually absolutely continuous. If the stronger Novikov-type condition
$$ \begin{align} \mathbb{E}\big[\exp\lambda\|h\|_{\mathcal{D}^{\beta,p}}^2\big]<\infty\quad \forall\,\lambda>0 \end{align} $$
holds, then one can infer the stronger conclusion that the laws of $B^H$ and $B^H_\cdot +h$ are equivalent and that the Radon-Nikodym derivative admits moments of any order; see [Reference Galeati, Harang and Mayorcas51, Proposition 3.10] for a similar statement.
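The first claim of the remark can be traced through Proposition C.1 as follows. This is only a sketch, assuming in addition that $g_0=0$ and $E=\mathbb {R}^d$; the auxiliary parameters $\bar \beta $ and $\varepsilon $ are introduced here for illustration. Setting $\bar \beta :=\beta -1/p+1/2$ and $\varepsilon :=(\beta -1/p-H)/2>0$, one has $\bar \beta \leq \beta $ (since $p\leq 2$) and $\bar \beta -\varepsilon = H+1/2+\varepsilon $, so that parts i), ii) and iv) give
$$ \begin{align*} \mathcal{D}^{\beta,p}_0 \hookrightarrow N^{\beta,p}_0 \hookrightarrow N^{\bar\beta,2}_0 \hookrightarrow W^{\bar\beta-\varepsilon,2}_0 = W^{H+\frac{1}{2}+\varepsilon,2}_0 \hookrightarrow \mathcal{H}^H. \end{align*} $$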
With the above considerations in mind, we are now ready to present a result on the applicability of Girsanov’s transform, which is the main motivation for this appendix.
Lemma C.3. Assume (A) and that
$$ \begin{align} 1-1/(Hq')<0. \end{align} $$
Let $b\in L^q_t C^\alpha _x$, $x_0\in \mathbb {R}^d$, and denote by $\mu $ the law of the solution X to the associated SDE (1.6). Then Girsanov’s transform applies and $\mu $ is equivalent to $\mathcal {L}(x_0+B^H)$. As a consequence, $\mathrm {supp}\, \mu = C([0,1];\mathbb {R}^d)$.
Proof. Without loss of generality, we may assume $\alpha <0$ and $x_0=0$. In view of Remark C.2, we need to verify (C.4) with $h=\varphi =X-B^H$ and with some $\beta $, p satisfying (C.3).
Let $\kappa>0$ be small enough so that H, $\alpha -\kappa $, and q also satisfy (A), and let $\tilde b\in L^q_tC^{\alpha -\kappa }_x$ with norm $1$. By Lemmas 2.4, 3.1 and A.2, we have that with some $\mu>0$,
$$ \begin{align} \mathbb{E}\bigg[\exp\bigg( \mu\Big\|\int_0^\cdot \tilde b_r(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg) \bigg]<\infty. \end{align} $$
Note that for sufficiently small $\kappa $, the exponents satisfy (C.3) as a consequence of (A). Therefore, (C.6) looks like (C.4), except that the coefficient $\mu $ in the exponent is not arbitrary. One can then proceed by an interpolation argument as in [Reference Galeati, Harang and Mayorcas51, Proposition 3.8]: for any $\kappa>0$ and $\lambda>0$, there exist $b^{-}$ and $b^{+}$ such that $b=b^-+b^+$ and
 $$ \begin{align*} \frac{2 \lambda}{\mu} \|b^-\|_{L^q_tC^{\alpha-\kappa}_x}^2\leq1,\qquad\qquad\|b^+\|_{L^q_t C^0_x}=:K<\infty, \end{align*} $$
where K may depend on all parameters. Then we can write
 $$ \begin{align*} \mathbb{E}&\bigg[\exp\bigg( \lambda\Big\|\int_0^\cdot b(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg) \bigg] \\ &\leq e^{2K^2}\mathbb{E}\bigg[\exp\bigg( \mu\, \frac{2\lambda}{\mu} \Big\|\int_0^\cdot b^-(B^H_r+\varphi_r)\mathrm{d} r \Big\|_{\mathcal{D}^{1+(\alpha-\kappa) H-\kappa,q}}^2\bigg)\bigg]<\infty, \end{align*} $$
applying (C.6) with $\sqrt {2\lambda /\mu }b^-$ in place of $\tilde b$ in the last step.
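The splitting step in the proof above can be unpacked as follows. This is only a sketch: the shorthands $\beta $ and $I^{\pm }$ are introduced here for brevity and are not notation of the paper, and we assume that $\|\cdot \|_{\mathcal {D}^{\beta ,q}}$ satisfies the triangle inequality. Writing $\beta =1+(\alpha -\kappa )H-\kappa $ and $I^{\pm }=\int _0^\cdot b^{\pm }_r(B^H_r+\varphi _r)\mathrm {d} r$, the elementary bound $(x+y)^2\leq 2x^2+2y^2$ gives
 $$ \begin{align*} \lambda\|I^-+I^+\|_{\mathcal{D}^{\beta,q}}^2 \leq 2\lambda\|I^+\|_{\mathcal{D}^{\beta,q}}^2 + \mu\,\frac{2\lambda}{\mu}\|I^-\|_{\mathcal{D}^{\beta,q}}^2, \qquad \frac{2\lambda}{\mu}\|I^-\|_{\mathcal{D}^{\beta,q}}^2 = \Big\|\int_0^\cdot \sqrt{2\lambda/\mu}\, b^-_r(B^H_r+\varphi_r)\mathrm{d} r\Big\|_{\mathcal{D}^{\beta,q}}^2. \end{align*} $$
The first term on the right of the inequality is the contribution of the bounded part $b^+$, which the displayed estimate controls through K by a deterministic factor (the precise constant is not rederived here). The second term is exactly of the form handled by (C.6), since $\sqrt {2\lambda /\mu }\,b^-$ has $L^q_tC^{\alpha -\kappa }_x$ norm at most $1$.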
Remark C.4. The restriction (C.5) in Lemma C.3 is necessary. Indeed, even taking a space-independent drift $b\in L^q$, so that $\varphi \in W^{1,q}$, the condition $1-1/q>(H+1/2)-1/2$ necessary for the Sobolev embedding implies (C.5). The reader may find this pathological, and rightly so: for such a b, we can deduce everything about the law of $B^H+\varphi $ from the law of $B^H$. Note that this also motivates the use of ‘stochastic regularity’ as in, for example, (2.2), which assigns to deterministic functions (like $\varphi $ in this example) infinite regularity.
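The relation between the Sobolev embedding condition in Remark C.4 and (C.5) is an elementary computation, sketched here under the assumption (standard, but not restated in this appendix) that $q'$ denotes the conjugate exponent of q, that is, $1/q+1/q'=1$:
 $$ \begin{align*} 1-\frac{1}{q}>\Big(H+\frac12\Big)-\frac12 \quad\Longleftrightarrow\quad \frac{1}{q'}>H \quad\Longleftrightarrow\quad Hq'<1 \quad\Longleftrightarrow\quad 1-\frac{1}{Hq'}<0, \end{align*} $$
where the last two equivalences use $H>0$ and $q'>0$. The final inequality is precisely (C.5).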
 Note also that (C.5) enforces $H\in (0,1/2)$. We do not discuss the regime of large H in detail, as Girsanov’s transform becomes less and less useful as H increases. For example, for $H>2$, one has $B^H\in C^2$ and (in the nontrivial case $\alpha <1$) $\varphi \notin C^2$, yielding trivially the mutual singularity of the laws of $B^H$ and $X=B^H+\varphi $. Once again, the way out is to use ‘stochastic regularity’ as a substitute for Girsanov.
Acknowledgements
MG thanks Konstantinos Dareiotis for valuable discussions during the development of the parallel article [Reference Dareiotis and Gerencsér31]. The authors thank the institutions MFO Oberwolfach and TU Wien for their hospitality during their research visits.
Funding statement
This research was funded in whole or in part by the Austrian Science Fund (FWF) [10.55776/P34992]. For open access purposes, the author has applied a CC BY public copyright license to any author accepted manuscript version arising from this submission. LG was funded by the DFG under Germany’s Excellence Strategy - GZ 2047/1, project-id 390685813 and later by the SNSF Grant 182565 and by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number MB22.00034 through the project TENSE.
Competing interest
The authors have no competing interest to declare.