1 Introduction
Rigorously deriving the continuum equations of classical fluid mechanics as large-scale descriptions of locally conserved quantities in Newtonian particle systems is a famous open problem in mathematical physics, and it has seen little progress [Reference Bonetto, Lebowitz and Rey-Bellet5]. Morrey [Reference Morrey41] gave a formal derivation based on local equilibrium and local Gibbs states, but a rigorous proof of the necessary local ergodicity of Hamiltonian systems has remained elusive. Considering instead statistical mechanical systems, which may be viewed as Hamiltonian systems with additional randomness, largely resolves this difficulty. Indeed, there has been remarkable progress on deriving many continuum fluid equations, known as hydrodynamic limits, from stochastic interacting particle systems, largely based on the works of Guo–Papanicolaou–Varadhan [Reference Guo, Papnicolaou and Varadhan28], Varadhan [Reference Varadhan48] and Yau [Reference Yau52] that make precise the two notions of local equilibrium and local Gibbs states for stochastic systems; see [Reference De Masi, Laniro, Pellegrinotti, Presutti, Lebowitz and Montroll16, Reference Kipnis and Landim37, Reference Spohn47] for thorough reviews in this direction.
However, a complete picture of hydrodynamic equations via statistical mechanics requires understanding conjecturally universal fluctuations of locally conserved quantities in the stochastic model about hydrodynamic limits. (By ‘universal’, we mean a scaling limit for fluctuations that does not depend on precise microscopic structures of the system at hand, only the choice of scaling and a few numbers, such as moments of certain random variables.) On this front, much less is known. We discuss the history of this universality problem shortly. To highlight its significance, a general derivation of universal local fluctuations was asked for by Spohn [Reference Spohn47], in the form of Conjecture II.3.6, and by Jensen–Yau [Reference Jensen and Yau35], in the form of an open problem in Lecture 7; almost no progress has been made in the past few decades according to [Reference Goncalves, Landim and Milanes26]. Let us expand on this more precisely.
• Conjecture II.3.6 in [Reference Spohn47] asks the question of how to use local statistics to derive scaling limits for fluctuations of hydrodynamic limits in nonstationary interacting particle systems. The physical reasoning given therein supposes that the nonstationary particle system is sufficiently close to stationary at local scales. (This is the ‘extended local equilibrium hypothesis’ therein.) Using this information, one can then deduce formally what the scaling limit for fluctuations should be. The question, which is what Conjecture II.3.6 asks, is how to prove (any of) this rigorously. (Technically, Conjecture II.3.6 in [Reference Spohn47] asks about nonstationary particle systems whose hydrodynamic limits are also nonconstant. We do not address this case simply because the scaling for the models that we study in this paper does not seem to allow for it. We clarify this later in the introduction. In any case, the heart of Conjecture II.3.6 in [Reference Spohn47] is a method of using local statistics and local stationarity to derive scaling limits for fluctuations. As a smooth profile is approximately constant on local scales, a thorough investigation for nonstationary systems with constant hydrodynamic limits should, in principle, shed light on the case of nonconstant but smooth hydrodynamic limits.)
• Problem 3 in Section 7 of [Reference Jensen and Yau35] asks the same as Conjecture II.3.6 in [Reference Spohn47] with the same interest in nonstationary models. ([Reference Jensen and Yau35] also emphasizes interest in ‘nonequilibrium’ cases, which includes the case of nonconstant hydrodynamic limit discussed in the previous bullet point.) [Reference Jensen and Yau35], however, notes that for nonstationary and/or nonequilibrium models, only one result [Reference Chang and Yau8] was available at the time. In particular, [Reference Jensen and Yau35] asks for progress beyond this specific work, which already addresses a large class of models. We further discuss [Reference Chang and Yau8] shortly. ([Reference Jensen and Yau35] also asks for scaling limits for fluctuations in nonstationary and nonequilibrium systems in dimension $\mathrm {d}\geqslant 2$ . We do not address this case in this paper, and we leave it for future work.)
Additionally, since Spohn [Reference Spohn47] and Jensen–Yau [Reference Jensen and Yau35], there has been a surge of activity and interest in nonlinear KPZ statistics (where ‘KPZ’ means Kardar–Parisi–Zhang) as large-scale limits of fluctuations [54]. Here, even less is known.
We respond to these conjectures and open problems with a general derivation of the so-called Boltzmann–Gibbs principle based on local dynamic properties of the stochastic model as asked for by Spohn [Reference Spohn47]. It combines well enough with stochastic analytic methods to rigorously derive KPZ fluctuations from a large class of stochastic particle systems that are beyond perturbations of stochastically reversible models and therefore in some version of nonequilibrium, in the spirit of Problem 3 in Section 7 of [Reference Jensen and Yau35]. To start, we discuss relevant prior work and questions of Spohn [Reference Spohn47] and Jensen–Yau [Reference Jensen and Yau35]; see also Chapter 11 of [Reference Kipnis and Landim37].
• The Boltzmann–Gibbs principle was originally developed by Brox–Rost [Reference Brox and Rost6] to derive hydrodynamic limit fluctuations. Their method succeeds only for statistically stationary/equilibrium systems. It has since been streamlined [Reference Kipnis and Landim37, Reference Komorowski, Landim and Olla40] and derived for many equilibrium models [Reference Chang7, Reference De Masi, Presutti, Spohn and Wick18, Reference Dittrieb and Gartner20, Reference Goncalves and Jara24, Reference Goncalves, Jara and Sethuraman25, Reference Gubinelli and Perkowski27, Reference Landim and Vares39, Reference Spohn, Fritz, Jaffe and Szasz44, Reference Spohn45, Reference Spohn and Papanicolao46, Reference Zhu53]. However, assuming, or even explicitly knowing, statistical equilibrium is certainly a restrictive global constraint. For example, interactions with stochastic reservoirs, or so-called ‘open boundaries’, immediately break any understanding of invariant measures [Reference Corwin and Knizel10] except in special situations. Moreover, it is not even believed that the equilibrium method should succeed in a general nonequilibrium setting; see [Reference Chang and Yau8].
• To avoid a need for understanding global invariant measures explicitly, Jara–Menezes [Reference Jara and Menezes34] adapted the relative entropy method of Yau [Reference Yau52], which was originally introduced for deriving hydrodynamic limits, to rigorously implement the strategy of local equilibrium/Gibbs states due to Morrey [Reference Morrey41]. However, as we work at the delicate fluctuation scale, in [Reference Jara and Menezes34] the authors require a strong initial closeness to local Gibbs states in a global sense; initially, the model is close to a Gibbs state at local scales, but this must be true everywhere in order to solve a global many-body eigenvalue problem. So, this method also depends on strong global assumptions. In any case, by this method, Jara–Menezes derive fluctuations for a smoothly inhomogeneous exclusion process [Reference Jara and Menezes34] whose variants were studied in [Reference Corwin and Tsai14, Reference Covert and Rezakhanlou15, Reference Jara and Menezes34]. In [Reference Jara and Landim33], Jara–Landim do this for a class of exclusion processes with additional Glauber-type disturbances/perturbations.
• In a groundbreaking work of Chang–Yau [Reference Chang and Yau8], hydrodynamic limit fluctuations were rigorously derived, with continuum limit given by a linear Gaussian partial differential equation (PDE), basically without any conditions on the initial data beyond being reasonable initial data for the limit stochastic PDE. Chang–Yau [Reference Chang and Yau8] specialize to a system of diffusions; their work is similarly based upon solving a many-body eigenvalue problem by means of large-deviations estimates and close-to-optimal log-Sobolev inequalities for the global invariant measure. Therefore, although the results of Chang–Yau [Reference Chang and Yau8] are for nonequilibrium systems, analysis of global Gibbs states/invariant measures is essential. Moreover, it is unclear if the work of [Reference Chang and Yau8] can be used to access KPZ fluctuations in nonequilibrium models. This is because the KPZ equation requires analytic considerations to solve when outside the invariant measure, and the analysis in [Reference Chang and Yau8] seems difficult to upgrade at the level of appropriate norms; see Remark 4.2.
In this paper, the Boltzmann–Gibbs principle is derived with local, and thus more general, considerations involving only system dynamics, not by directly exploiting global invariant measures. Modulo details, the ingredients for our method are listed below; for a more detailed illustration of the method, see Section 3.2, which we have set up in a fairly general fashion.
• On local mesoscopic scales, the dynamics admit an almost-optimal and ‘elliptic’ log-Sobolev inequality; this implies strong local relaxation of dynamics as assumed by Morrey [Reference Morrey41]. This assumption is very different than global assumptions in [Reference Chang and Yau8]. For example, in many models containing interactions with stochastic reservoirs at localized ‘boundary’ points, the invariant measure is poorly understood except for a small set of special cases [Reference Corwin and Knizel10]. For such ‘open boundary models’, the invariant measures are by no means perturbations of their ‘boundary-free’ versions, even on macroscopic scales; see [Reference Corwin and Knizel10]. In particular, this obstructs the approach of [Reference Chang and Yau8], which is based on calculations for local marginals of an explicit global invariant measure. However, except in $\mathrm {O}(1)$ -many small sets near the reservoirs, the local dynamics, and thus their invariant measures, are unaffected. Also, locally near any reservoir, the system looks like a half-space model, which has better understood invariant measures. Thus, the method in this paper has potential applications to open boundary models, which are also of a nonequilibrium flavor and, again, behave quite differently than models without boundary; see [Reference Corwin and Knizel10, Reference Corwin and Shen11, Reference Goncalves, Landim and Milanes26, Reference Yang50]. See [Reference Yang50], in particular, for an application of the first steps of the method developed herein to a class of open boundary models (whose invariant measures are unknown and never used in [Reference Yang50]); we discuss this further in Section 1.3. In a similar spirit, the models in [Reference Jara and Menezes34] do not admit explicit invariant measures because of the smooth inhomogeneity; this is one motivation for [Reference Jara and Menezes34]. 
But locally, a smooth profile is basically constant, so the inhomogeneity does not obstruct our method (modulo a few perturbations). By the same token, our method seems to hold for fluctuations of smooth nonconstant hydrodynamic limits as in [Reference Chang and Yau8, Reference Jara and Menezes34] after some perturbative adjustments. However, for the scaling that gives nonlinear KPZ statistics, making sense of nonconstant hydrodynamic limits itself seems to pose an issue. Indeed, these should formally be ‘infinite-time’ (or ‘infinite-speed’) viscosity solutions to hyperbolic Hamilton–Jacobi equations, whose meaning is only clear for constant solutions. For this reason only, we do not discuss fluctuations about nonconstant hydrodynamic limits in this paper.
• On local mesoscopic scales, regularity of fluctuations is roughly that of a white noise, which is what we expect for their stochastic partial differential equation (SPDE) limits; this is not an assumption and usually falls out of the analysis, and it controls which local Gibbs states are relevant.
• We emphasize these ingredients concern only local dynamics of the model. Properties of the global invariant measure may be helpful at a technical level, but they should not be essential to deriving SPDEs from fluctuations. For example, the methods in this paper use an explicit product measure that happens to be an invariant measure for the entire process (as opposed to just the dynamics in a local set). However, we do not necessarily require that this measure is invariant. All we need are entropy production bounds (see Lemma 8.9). (Intuitively, these bounds are a convenient quantification of local equilibration; see the first bullet point in this list. The aforementioned invariance just makes this calculation much shorter than those in [Reference Yang50, Reference Yau52]; see Lemma 7.4 in [Reference Yang50], for instance.)
As for initial data, we only require that it can be made sense of by the macroscopic SPDE. This is, again, a basic requirement.
One upshot of the locality in our method is a Boltzmann–Gibbs principle which holds in a much stronger topology than in [Reference Chang and Yau8]. Beyond being possibly of interest in its own right, this seems to be important for deriving KPZ equation fluctuations, whose solution theory currently requires either a strong stationary assumption [Reference Goncalves and Jara24] that we aim to avoid or analysis in relatively strong norms [Reference Bertini and Giacomin3, Reference Hairer29].
Instead of developing a general theory of deriving Boltzmann–Gibbs principles, we specialize to KPZ fluctuations in a class of nonintegrable and nonstationary interacting particle systems. The main result of this paper, namely Theorem 1.8, is convergence of height function (or ‘current’) fluctuations to the KPZ equation for a class of exclusion processes with environment-dependent dynamics. These are of high interest both in KPZ [Reference Goncalves and Jara24] and beyond KPZ [Reference De Masi, Presutti, Spohn and Wick18, Reference Funaki22, Reference Funaki, Handa and Uchiyama23, Reference Komoriya38, Reference Landim and Vares39]. This adds to the almost empty set of nonintegrable, nonstationary interacting particle systems for which universality of KPZ equation fluctuations is proven.
Let us introduce the KPZ equation more precisely below, in which $\xi $ is Gaussian space-time white noise on $\mathbb R_{\geqslant 0}\times \mathbb {T}^{1}$ with $\mathbb {T}^{1}=\mathbb R/\mathbb Z$ the torus, in which $\bar {\mathfrak {d}}$ is constant and in which $\infty $ is meant to suggest equation (1.1) as a scaling limit:
The KPZ equation (1.1) was originally derived in [Reference Kardar, Parisi and Zhang36] as a universal model for dynamical interface fluctuations describing the statistics of propagating fires, bacterial colonies, epidemic spread, tumor growth and crack formations. However, the important observation that $\mathbf {u}^{\infty }=\nabla \mathbf {h}^{\infty }$ describes hydrodynamic fluctuations was already apparent in [Reference Kardar, Parisi and Zhang36]. As for a brief history, in [Reference Bertini and Giacomin3], Bertini–Giacomin show that height function fluctuations in the asymmetric simple exclusion process (ASEP) converge to KPZ with $\bar {\mathfrak {d}}=0$ . In [Reference Bertini and Giacomin3], the integrability of ASEP is leveraged crucially. Related works [Reference Corwin, Ghosal, Shen and Tsai9, Reference Corwin, Shen and Tsai12, Reference Corwin and Tsai13] employed the same integrability method to show convergence to KPZ for height function fluctuations in a limited number of special systems. For nonintegrable models, there has only been a successful general approach for stationary systems [Reference Goncalves and Jara24, Reference Gubinelli and Perkowski27]. Progress for nonintegrable, nonstationary particle systems is minimal beyond a few works that we discuss after presenting Theorem 1.8. Environment-dependent speed-change dynamics are of particular interest for KPZ (see Big Picture Question 1.6 of [54]), which is why we study them here.
In a nutshell, the difficulty in universality of equation (1.1) and the Boltzmann–Gibbs principle is as follows. Suppose we let $\mathbf {h}'$ denote the solution to equation (1.1), but instead of $|\nabla \mathbf {h}'|^{2}$ we have $\mathbf {F}(\nabla \mathbf {h}')$ for a general $\mathbf {F}$ . In [Reference Kardar, Parisi and Zhang36], a formal Taylor series implies $\mathbf {h}'$ converges to equation (1.1) under a ‘critical scaling’ with an explicit $\mathbf {F}$ -dependent coefficient in front of the quadratic and explicit $\bar {\mathfrak {d}}=\bar {\mathfrak {d}}(\mathbf {F})$ ; see [Reference Hairer and Quastel31]. Such coefficients are wrong, however, unless $\mathbf {F}$ is a degree-two polynomial, in which case the calculation is trivial because one already starts with KPZ. The picture for particle systems is similar. General environment dependence roughly corresponds to general nonlinearities $\mathbf {F}$ whose effective limits we must compute. Moreover, the integrable ASEP model that was studied in [Reference Bertini and Giacomin3] is associated to degree-2 $\mathbf {F}$ for which homogenization is formally trivial. Making precise the asymptotics for general $\mathbf {F}$ is the heart of proving universality, and it is one of our main motivations.
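For orientation, the formal Taylor series heuristic mentioned above can be sketched as follows. The scalar variable $u$ and the notation $\mathbf {F}'(0)$ , $\mathbf {F}''(0)$ are ours and are introduced only for illustration; this is a hedged reading of the heuristic, not a statement from [Reference Kardar, Parisi and Zhang36]. Expanding $\mathbf {F}$ at zero slope gives
$$ \begin{align*} \mathbf{F}(u) \ = \ \mathbf{F}(0)+\mathbf{F}'(0)u+2^{-1}\mathbf{F}''(0)u^{2}+\mathrm{O}(|u|^{3}), \quad u=\nabla\mathbf{h}'. \end{align*} $$
Naively, the constant $\mathbf {F}(0)$ is removed by a deterministic recentering of the height, $\mathbf {F}'(0)$ would be read off as the transport coefficient, and $2^{-1}\mathbf {F}''(0)$ would be read off as the coefficient of the quadratic. As stated above, these naive values are correct only when $\mathbf {F}$ is a degree-two polynomial.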
One explanation for why the Taylor series heuristic in [Reference Kardar, Parisi and Zhang36] is incorrect is that KPZ is a singular SPDE; the roughness of the $\xi $ -noise makes the equation classically ill-posed. A way of solving equation (1.1) (see [Reference Bertini and Giacomin3]) is to instead define $\mathbf {h}^{\infty }=-\log \mathbf {Z}^{\infty }$ where $\mathbf {Z}^{\infty }$ solves the stochastic heat equation (SHE) below, which can be solved with Itô–Walsh calculus; this is the Cole–Hopf transform:
We conclude by tying the Boltzmann–Gibbs principle and KPZ into the same story: The Boltzmann–Gibbs principle is the mechanism by which the correct coefficients in the limit $\mathbf {h}'\to \mathbf {h}^{\infty }$ , which we discussed in a previous paragraph, are computed.
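To make the Cole–Hopf transform concrete, here is a purely formal computation. It assumes that equation (1.1) reads $\partial _{\mathrm {t}}\mathbf {h}^{\infty }=2^{-1}\Delta \mathbf {h}^{\infty }-2^{-1}|\nabla \mathbf {h}^{\infty }|^{2}+\bar {\mathfrak {d}}\nabla \mathbf {h}^{\infty }+\xi $ ; this normalization is our assumption, chosen to be consistent with $\mathbf {h}^{\infty }=-\log \mathbf {Z}^{\infty }$ and with the $+\bar {\mathfrak {d}}\nabla $ transport term discussed in Remark 1.9. The computation treats $\xi $ as if it were smooth and ignores the Itô correction, which is exactly where renormalization enters in a rigorous treatment. With $\mathbf {Z}^{\infty }=\mathbf {Exp}(-\mathbf {h}^{\infty })$ , the chain rule gives
$$ \begin{align*} \partial_{\mathrm{t}}\mathbf{Z}^{\infty} \ = \ -\mathbf{Z}^{\infty}\partial_{\mathrm{t}}\mathbf{h}^{\infty}, \quad \nabla\mathbf{Z}^{\infty} \ = \ -\mathbf{Z}^{\infty}\nabla\mathbf{h}^{\infty}, \quad \Delta\mathbf{Z}^{\infty} \ = \ -\mathbf{Z}^{\infty}\Delta\mathbf{h}^{\infty}+\mathbf{Z}^{\infty}|\nabla\mathbf{h}^{\infty}|^{2}. \end{align*} $$
Combining these identities under the assumed form of equation (1.1), we get
$$ \begin{align*} 2^{-1}\Delta\mathbf{Z}^{\infty}+\bar{\mathfrak{d}}\nabla\mathbf{Z}^{\infty}-\mathbf{Z}^{\infty}\xi \ = \ -\mathbf{Z}^{\infty}\left(2^{-1}\Delta\mathbf{h}^{\infty}-2^{-1}|\nabla\mathbf{h}^{\infty}|^{2}+\bar{\mathfrak{d}}\nabla\mathbf{h}^{\infty}+\xi\right) \ = \ \partial_{\mathrm{t}}\mathbf{Z}^{\infty}. \end{align*} $$
So, at this formal level, the quadratic nonlinearity cancels and $\mathbf {Z}^{\infty }$ solves a linear equation with multiplicative noise, which is why the transform is useful.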
1.1 The model
We start by introducing the interacting particle systems of interest as Markov processes on a finite state space. In words, the process below is the ASEP system in [Reference Bertini and Giacomin3] with additional environment-dependent asymmetry of speed N that affects the nonlinearity in the dynamics of its height function in a nonintegrable fashion; the height function is constructed in Definition 1.1. The parameter $N\in \mathbb Z_{>0}$ is a scaling parameter that we take infinitely large to recover limit SPDEs; this is the ‘large-N limit’.
• Define $\mathbb {T}_{N}=\mathbb Z/N\mathbb Z$ to be the microscopic N-point torus that we embed into the one-dimensional lattice $\mathbb Z$ upon identifying every element in $\mathbb {T}_{N}$ by an integer between 0 and $N-1$ . Arithmetic on the torus $\mathbb {T}_{N}$ is taken mod N unless said otherwise.
• Given any set $\mathbb {K}_{N}\subseteq \mathbb {T}_{N}$ , define the corresponding state space $\Omega _{\mathbb {K}_{N}}=\{\pm 1\}^{\mathbb {K}_{N}}$ . For convenience, we define $\Omega =\Omega _{\mathbb {T}_{N}}$ . Elements of $\Omega _{\mathbb {K}_{N}}$ are denoted by $\eta =(\eta _{x})_{x\in \mathbb {K}_{N}}$ . The interpretation of $\eta $ -variables in terms of particles is the following. Adopting the spin notation of [Reference Dembo and Tsai19], if $\eta _{x}=1$ , there is a particle at $x\in \mathbb {T}_{N}$ . Otherwise, if $\eta _{x}=-1$ , there is no particle there.
• Consider $\mathfrak {d}:\Omega \to \mathbb R$ independent of N and define $\mathfrak {d}_{x}(\eta )=\mathfrak {d}(\tau _{x}\eta )$ , in which $\tau _{x}\eta $ shifts the $\eta $ -configuration by ${x}\in \mathbb {T}_{N}$ to recenter at x so that $(\tau _{x}\eta )_{z}=\eta _{z+x}$ for all $z\in \mathbb {T}_{N}\kern-1.5pt$ . We now let $\mathsf {L}_{x}$ denote the infinitesimal generator for a symmetric simple exclusion process with speed 1 on $\{x,x+1\}\subseteq \mathbb {T}_{N}\kern-1.5pt$ . To specify it, for any $\eta \in \Omega $ , let $\eta ^{z,w}\in \Omega $ be the configuration defined by $\eta ^{z,w}_{x}=\eta _{x}$ if $x\neq z,w$ and $\eta ^{z,w}_{z}=\eta _{w}$ and $\eta ^{z,w}_{w}=\eta _{z}$ . (In other words, $\eta ^{z,w}$ swaps occupation numbers at $z,w$ .) For any $\mathfrak {f}:\Omega \to \mathbb R$ , we define
$$ \begin{align*} \mathsf{L}_{x}\mathfrak{f}(\eta) \ = \ \mathfrak{f}(\eta^{x,x+1})-\mathfrak{f}(\eta). \end{align*} $$
We now define the generator of the Markov process of interest here via $\mathsf {L}_{N}=\mathsf {L}_{N,\mathrm {S}}+\mathsf {L}_{N,\mathrm {A}}$ :
$$ \begin{align} \mathsf{L}_{N,\mathrm{S}} \ = \ 2^{-1}N^{2}{\sum}_{x\in\mathbb{T}_{N}}\mathsf{L}_{x} \quad \mathrm{and} \quad \mathsf{L}_{N,\mathrm{A}} \ = \ 2^{-1}N^{\frac32}{\sum}_{x\in\mathbb{T}_{N}}\left(\mathbf{1}_{\substack{\eta_{x}=-1\\ \eta_{x+1}=1}}-\mathbf{1}_{\substack{\eta_{x}=1\\ \eta_{x+1}=-1}}\right)\left(1+N^{-\frac12}\mathfrak{d}_{x}\right)\mathsf{L}_{x}. \tag{1.3} \end{align} $$
• Denote by $\eta _{\mathrm {t}}$ the particle configuration at time $\mathrm {t}\geqslant 0$ under the Markov process with generator $\mathsf {L}_{N}$ . More generally, given any $\mathrm {t}\geqslant 0$ , any $x\in \mathbb {T}_{N}$ , and any functional $\mathfrak {f}:\Omega \to \mathbb R^{d}$ , we define $\mathfrak {f}_{\mathrm {t},x}=\mathfrak {f}(\tau _{x}\eta _{\mathrm {t}})$ ; recall the spatial shift $\tau _{x}$ from above. (We will introduce assumptions on the initial data $\eta _{0}$ in Definition 1.6 below. For now, the reader can think of it as given.)
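The generator (1.3) can be read as a list of jump rates: the swap $\eta \mapsto \eta ^{x,x+1}$ occurs at rate $2^{-1}N^{2}$ plus the signed asymmetric correction. The following is a minimal simulation sketch of one jump of the process using the standard Gillespie algorithm, which is our choice for illustration and not a method from this paper. The functional `example_d` is a hypothetical choice of $\mathfrak {d}$ , and all function names are ours. The sketch assumes N is large enough relative to the bound on $\mathfrak {d}$ that all rates are nonnegative.

```python
import numpy as np

def shift(eta, x):
    """Return tau_x eta, i.e. (tau_x eta)_z = eta_{z+x} with indices mod N."""
    return np.roll(eta, -x)

def example_d(eta):
    """Hypothetical local functional d (an assumption, not from the paper).
    Index -1 is site N-1, which is -1 mod N on the torus."""
    return float(eta[-1] + eta[2])

def bond_rate(eta, x, d=example_d):
    """Rate of the swap eta -> eta^{x,x+1} under L_N = L_{N,S} + L_{N,A}."""
    N = len(eta)
    y = (x + 1) % N
    # +1 for a leftward jump (x+1 -> x), -1 for a rightward jump, 0 if eta_x = eta_{x+1}
    sign = int(eta[x] == -1 and eta[y] == 1) - int(eta[x] == 1 and eta[y] == -1)
    return 0.5 * N**2 + 0.5 * N**1.5 * sign * (1.0 + d(shift(eta, x)) / np.sqrt(N))

def gillespie_step(eta, rng, d=example_d):
    """Sample the waiting time and the next configuration (one Gillespie step)."""
    N = len(eta)
    rates = np.array([bond_rate(eta, x, d) for x in range(N)])
    total = rates.sum()
    dt = rng.exponential(1.0 / total)         # exponential holding time
    x = int(rng.choice(N, p=rates / total))   # bond chosen proportionally to its rate
    y = (x + 1) % N
    new = eta.copy()
    new[x], new[y] = eta[y], eta[x]           # swap occupation variables at x, x+1
    return new, dt
```

Since every move is a swap, the sum of $\eta _{x}$ over the torus is preserved by each step; swaps between two equal spins are kept for simplicity even though they do not change the configuration.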
Definition 1.1. Define the following height function, in which $\mathbf {h}^{N}_{T,0}$ is equal to $2N^{-1/2}$ times the net flux of particles crossing $0$ , with the convention that leftward traveling particles count as positive flux; this is the same height function as in [Reference Bertini and Giacomin3] but now on the torus $\mathbb {T}_{N}\kern-1.5pt$ . Also, we assume $\mathbf {h}^{N}_{0,0}=0$ .
We now define the Gartner transform, for which we introduce the renormalization term $\mathrm {R}=\mathrm {R}_{1}+\mathrm {R}_{2}$ with $\mathrm {R}_{1}=2^{-1}N-24^{-1}$ and $\mathrm {R}_{2}=N^{1/2}\mathrm {R}_{2,1}+\mathrm {R}_{2,2}+\mathrm {R}_{2,3}=N^{1/2}\mathrm {R}_{2,1}{+2^{-1}\bar {\mathfrak {d}}}+\mathrm {R}_{2,3}$ , in which $\bar {\mathfrak {d}}$ is from Definition 2.2. The constant $\bar {\mathfrak {d}}$ is the same constant appearing in the $\mathrm {SHE}(\bar {\mathfrak {d}})$ scaling limit in our main result of Theorem 1.8. We define the remaining two terms $\mathrm {R}_{2,1}$ and $\mathrm {R}_{2,3}$ shortly; roughly, they come from the hydrodynamic flux of the $\mathfrak {d}$ -asymmetry. First, we have
To define the renormalization counterterm $\mathrm {R}_{2,1}$ , define $\mathbf {E}_{0}$ as the expectation with respect to the product Bernoulli measure on $\Omega $ whose one-dimensional marginals have expectation equal to the hydrodynamic limit 0. Define $\mathrm {R}_{2,1}={-}2^{-1}{\mathbf {E}_{0}}(\mathfrak {d}-\mathfrak {d}\eta _{0}\eta _{1})$ as the hydrodynamic limit of the flux of the environment-dependent asymmetry. In particular, in the exponential defining $\mathbf {Z}^{N}\kern-1.5pt$ , we look at height function fluctuations after subtracting the leading-order hydrodynamic limit/flux. Indeed, hydrodynamic limits tell us the normalized height function (not its fluctuations in $\mathbf {h}^{N}$ ) is roughly $\mathrm {R}_{2,1}$ in expectation when close to a constant density profile, and when multiplying by $N^{1/2}$ to study fluctuations, we must renormalize by $N^{1/2}\mathrm {R}_{2,1}$ . This provides an interpretation, from interacting particle systems and hydrodynamic limits, of renormalizations needed in [Reference Hairer29] to make sense of KPZ.
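Since $\mathbf {E}_{0}$ is the product Bernoulli measure with mean-zero marginals on $\{\pm 1\}$ , it is the uniform measure on configurations, so $\mathrm {R}_{2,1}$ can be computed by brute-force enumeration for any local $\mathfrak {d}$ . The sketch below does this; the function names and the example functionals in the usage note are hypothetical and serve only as an illustration.

```python
from itertools import product

def expectation_E0(f, sites):
    """Average f over all +-1 configurations on `sites`.
    This is the product Bernoulli expectation with mean-zero marginals."""
    sites = list(sites)
    configs = [dict(zip(sites, values)) for values in product((-1, 1), repeat=len(sites))]
    return sum(f(eta) for eta in configs) / len(configs)

def R21(d, sites):
    """R_{2,1} = -(1/2) E_0(d - d * eta_0 * eta_1); `sites` must contain 0, 1
    and every site that d depends on."""
    return -0.5 * expectation_E0(lambda eta: d(eta) - d(eta) * eta[0] * eta[1], sites)
```

For instance, the constant functional `lambda eta: 1.0` on `sites=[0, 1]` gives `R21` equal to `-0.5`, because $\mathbf {E}_{0}(\eta _{0}\eta _{1})=0$ under the product measure.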
To wrap up this construction, let us define the order 1 counterterm $\mathrm {R}_{2,3}={\mathbf {E}_{0}}\widetilde {\mathfrak {s}}$ , where $\widetilde {\mathfrak {s}}$ is a functional defined in Definition 2.2. Roughly speaking, it captures, at a level of hydrodynamic limits, a transport-induced growth of the height function coming from the $\mathfrak {d}$ asymmetry; this is the parallel, for the $\mathfrak {d}$ asymmetry, of the $24^{-1}$ -term in the renormalization constant $\mathrm {R}_{1}$ which comes from the leading-order asymmetry and that was also present in [Reference Bertini and Giacomin3]; see Remark 2.3 for a more detailed explanation of $\mathrm {R}_{2,3}$ .
Remark 1.2. We linearly interpolate values of functions on $\mathbb {T}_{N}$ for all times to get a piecewise linear function on $N\mathbb {T}^{1}=\mathbb R/N\mathbb Z$ .
1.2 The theorem
Our main result is showing that $\mathbf {Z}^{N}\to \mathrm {SHE}(\bar {\mathfrak {d}})$ for a particular value of $\bar {\mathfrak {d}}\in \mathbb R$ that is determined by a few statistics of the $\mathfrak {d}$ -asymmetry; we shortly specify this value. First, we require a structural assumption for the $\mathfrak {d}$ -asymmetry, which is also necessary in the approach to universality of KPZ by what are known as energy solutions in [Reference Goncalves and Jara24, Reference Goncalves, Jara and Sethuraman25], for example. Such an assumption is often called a gradient condition. It implies (see [Reference Goncalves and Jara24, Reference Goncalves, Jara and Sethuraman25]) a family of explicit product invariant measures.
Assumption 1.3. The support of $\mathfrak {d}:\Omega \to \mathbb R$ , defined as the smallest subset of $\mathbb {T}_{N}$ such that $\mathfrak {d}$ depends only on $\eta $ -variables in that subset, is contained in a neighborhood of $0\in \mathbb {T}_{N}$ with radius at most the uniformly bounded constant $\mathfrak {l}_{\mathfrak {d}}\in \mathbb Z_{>0}$ . There is a uniformly bounded functional $\mathfrak {w}$ whose support is contained in the same neighborhood so that $\mathfrak {d}\nabla _{1}^{\mathbf {X}}\eta =\mathfrak {d}(\eta _{1}-\eta _{0})=\nabla _{1}^{\mathbf {X}}\mathfrak {w}=\tau _{1}\mathfrak {w}-\mathfrak {w}$ .
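Because $\mathfrak {d}$ and $\mathfrak {w}$ are local, the gradient condition in Assumption 1.3 is a finite identity that can be checked by enumerating configurations on a small window. The sketch below does this for one hypothetical pair, $\mathfrak {d}=\eta _{-1}+\eta _{2}$ and $\mathfrak {w}=\eta _{-1}\eta _{0}+\eta _{0}\eta _{1}-\eta _{-1}\eta _{1}$ . This pair is our own illustrative example and is not taken from the paper; configurations are stored as dictionaries from sites to $\pm 1$ .

```python
from itertools import product

def shift(eta, x):
    """tau_x eta for eta stored as {site: +-1}: (tau_x eta)_z = eta_{z+x}."""
    return {z - x: v for z, v in eta.items()}

def d(eta):
    """Hypothetical asymmetry functional (an assumption, for illustration only)."""
    return eta[-1] + eta[2]

def w(eta):
    """Hypothetical candidate for the functional w paired with d above."""
    return eta[-1] * eta[0] + eta[0] * eta[1] - eta[-1] * eta[1]

def gradient_condition_holds(d, w, sites=range(-1, 3)):
    """Check d * (eta_1 - eta_0) == tau_1 w - w on every configuration on `sites`."""
    sites = list(sites)
    for values in product((-1, 1), repeat=len(sites)):
        eta = dict(zip(sites, values))
        lhs = d(eta) * (eta[1] - eta[0])
        rhs = w(shift(eta, 1)) - w(eta)
        if lhs != rhs:
            return False
    return True
```

The window must be large enough that both $\mathfrak {w}$ and $\tau _{1}\mathfrak {w}$ only read sites inside it; for the pair above, the sites $-1,0,1,2$ suffice.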
Remark 1.4. We can perturb Assumption 1.3 to make invariant measures globally intractable. Little would change if perturbations are not too large, so that log-Sobolev inequalities on mesoscopic scales do not drastically change. For example, perturbations that affect the system on global timescales but take too long for mesoscopic dynamics to detect are certainly allowable.
We turn to scaling limits. This starts with the following rescaling operators that give ‘macroscopic’ coordinates.
Definition 1.5. Given $\psi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ , define $\Gamma ^{N,\mathbf {X}}\psi :\mathbb {T}^{1}\to \mathbb R$ via linearly interpolating values of $\Gamma ^{N,\mathbf {X}}\psi _{x}=\psi _{0,Nx}$ for $x\in N^{-1}\mathbb {T}_{N}\subseteq \mathbb {T}^{1}$ . Define $\Gamma ^{N}\psi :\mathbb R_{\geqslant 0}\times \mathbb {T}^{1}\to \mathbb R$ by interpolating values of $\Gamma ^{N}\psi _{\mathrm {t},x}=\psi _{\mathrm {t},Nx}$ for $x\in N^{-1}\mathbb {T}_{N}\subseteq \mathbb {T}^{1}$ .
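The spatial rescaling in Definition 1.5 amounts to placing the N microscopic values on the grid $N^{-1}\mathbb {T}_{N}$ and interpolating linearly, with wrap-around on the torus. A minimal sketch, assuming NumPy and using the hypothetical name `gamma_space` for $\Gamma ^{N,\mathbf {X}}$ , is the following; it relies on the `period` argument of `numpy.interp` to handle the segment between the last grid point and $1\equiv 0$ .

```python
import numpy as np

def gamma_space(psi0, x):
    """Evaluate the piecewise linear interpolation of psi_{0,Nx} at macroscopic
    points x on the unit torus R/Z. `psi0` holds the time-0 values on T_N."""
    N = len(psi0)
    grid = np.arange(N) / N  # the embedded lattice N^{-1} T_N inside T^1
    return np.interp(x, grid, psi0, period=1.0)
```

The operator $\Gamma ^{N}$ applies the same interpolation at each fixed time $\mathrm {t}$ .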
We now present a class of initial conditions of the particle system/the height function for which the KPZ equation limit will be established. We are almost forced to take some assumption of the following kind because the limit SPDEs themselves need reasonable initial data to be well defined.
Definition 1.6. A probability measure on $\Omega $ is stable if the following conditions are satisfied. First, with probability 1 under said measure, the total number of particles on $\mathbb {T}_{N}$ is $N/2$ . Equivalently, the sum of $\eta _{x}$ over $x\in \mathbb {T}_{N}$ under said measure is zero. Next, provided any $p\geqslant 1$ and any $\mathfrak {l}\in \mathbb Z$ and any $\mathfrak {u}\in [0,2^{-1})$ , we have the following estimate uniformly over $\mathbb {T}_{N}\kern-1.5pt$ , in which $\nabla _{\mathfrak {l}}^{\mathbf {X}}$ is a spatial gradient $\nabla _{\mathfrak {l}}^{\mathbf {X}}\phi _{x}=\phi _{x+\mathfrak {l}}-\phi _{x}$ for any $\phi :\mathbb {T}_{N}\to \mathbb R$ :
Also, $\Gamma ^{N,\mathbf {X}}\mathbf {h}^{N}$ converges in law as $N\to \infty $ with respect to the uniform norm on the space $\mathscr {C}(\mathbb {T}^{1})$ of continuous functions.
Remark 1.7. We make a few clarifications about Definition 1.6. The assumption that $\eta _{x}$ sums to zero with probability 1 guarantees that $\eta _{T,x}$ sums (over $x\in \mathbb {T}_{N}$ ) to zero with probability 1 for all $T\geqslant 0$ . Indeed, the total particle number is conserved. From this, we deduce the gradient relation $\eta _{T,x}=N^{1/2}(\mathbf {h}_{T,x}^{N}-\mathbf {h}_{T,x-1}^{N})$ .
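The role of the zero-sum condition in the gradient relation can be checked numerically. The sketch below assumes that the height profile is recovered from its value at 0 by summing $N^{-1/2}\eta _{y}$ over $y=1,\ldots ,x$ ; this is our reading of the gradient relation, and the function names are hypothetical. The relation at $x=0$ , which compares sites 0 and $N-1$ across the torus, holds exactly when the spins sum to zero.

```python
import numpy as np

def height_profile(eta, h0=0.0):
    """h_x = h0 + N^{-1/2} * sum_{y=1}^{x} eta_y for x = 0, ..., N-1."""
    N = len(eta)
    partial_sums = np.concatenate(([0.0], np.cumsum(eta[1:])))
    return h0 + partial_sums / np.sqrt(N)

def gradient_relation_holds(eta, h):
    """Check eta_x = N^{1/2} (h_x - h_{x-1}) for every x, with x-1 taken mod N."""
    N = len(eta)
    return bool(np.allclose(np.sqrt(N) * (h - np.roll(h, 1)), eta))
```

For a configuration whose spins do not sum to zero, the check fails at $x=0$ , which illustrates why particle-number conservation is invoked before deducing the relation.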
Let us give examples of stable initial data. Take the product (mean-zero) Bernoulli measure on $\{\pm 1\}^{\mathbb {T}_{N}}$ . Condition on the subset of $\eta \in \{\pm 1\}^{\mathbb {T}_{N}}$ where $\eta _{x}$ sums to zero over $x\in \mathbb {T}_{N}\kern-1.5pt$ . A Brownian bridge version of Donsker’s invariance principle implies that these are stable initial data, and the limit of $\Gamma ^{N,\mathbf {X}}\mathbf {h}^{N}$ is a Brownian bridge on $\mathbb {T}^{1}$ . These stable initial data give the stationary measure for the $\eta $ process. An example of a deterministic (and thus nonstationary) stable measure is given by the flat data $\eta _{x}=(-1)^{x}$ . In this case, the limit of $\Gamma ^{N,\mathbf {X}}\mathbf {h}^{N}$ is the zero function on $\mathbb {T}^{1}$ . In general, given any continuous function $\mathsf {F}$ on $\mathbb {T}^{1}$ , one can construct stable initial data such that $\Gamma ^{N,\mathbf {X}}\mathbf {h}^{N}$ has limit $\mathsf {F}$ . (This is a random walk bridge version of the fact that a Brownian bridge has dense support in $\mathscr {C}(\mathbb {T}^{1})$ .)
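The two examples of stable initial data above are easy to generate. In the sketch below, which uses hypothetical function names and assumes N is even, the conditioned Bernoulli measure is sampled as a uniformly random arrangement of $N/2$ particles and $N/2$ holes. This is the same as conditioning the mean-zero product measure on the zero-sum event, since the product measure is uniform on configurations. The height is taken to vanish at 0, matching the convention $\mathbf {h}^{N}_{0,0}=0$ .

```python
import numpy as np

def flat_data(N):
    """Deterministic flat configuration eta_x = (-1)^x (N even)."""
    return np.array([(-1) ** x for x in range(N)])

def conditioned_bernoulli(N, rng):
    """Uniform sample from configurations on T_N whose spins sum to zero (N even)."""
    eta = np.array([1] * (N // 2) + [-1] * (N // 2))
    rng.shuffle(eta)  # in-place uniform random permutation
    return eta

def height_sup_norm(eta):
    """sup_x |h_x| for h_x = N^{-1/2} * sum_{y=1}^{x} eta_y with h_0 = 0."""
    N = len(eta)
    h = np.concatenate(([0.0], np.cumsum(eta[1:]))) / np.sqrt(N)
    return float(np.max(np.abs(h)))
```

For flat data, the height alternates between 0 and $-N^{-1/2}$ , so its uniform norm is at most $N^{-1/2}$ , in line with the zero-function limit stated above.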
Finally, if $\mathbf {Z}^{N}$ is uniformly bounded above and below, then Definition 1.6 is basically equivalent to the same but $\mathbf {h}^{N}$ is replaced by $\mathbf {Z}^{N}\kern-1.5pt$ . Actually, we can take $\mathbf {Z}^{N}$ singular as SHE is smoothing; this would only change the topology in which we study $\mathbf {Z}^{N}\kern-1.5pt$ .
Let $\mathscr {D}_{1}$ be the Skorokhod space until time 1 with values in $\mathscr {C}(\mathbb {T}^{1})$ ; see [Reference Billingsley, Wiley and Sons4] for the Skorokhod topology. The final time 1 is not important and can be replaced by any fixed time independent of N. We will not explicitly give here the transport coefficient $\bar {\mathfrak {d}}$ , which appears in the limit $\mathrm {SHE}(\bar {\mathfrak {d}})$ , until Definition 2.2 because it requires nontrivial setup. The key feature is that it agrees with the equilibrium linear transport coefficient in [Reference Goncalves and Jara24] given by a correction to a hydrodynamic limit contribution of the $\mathfrak {d}$ -asymmetry.
Theorem 1.8. Suppose we take the sequence of Gartner transforms $\mathbf {Z}^{N}$ with stable initial data for $\mathbf {h}^{N}$ . If we adopt Assumption 1.3, the renormalized sequence $\Gamma ^{N}\mathbf {Z}^{N}$ converges to $\mathrm {SHE}(\bar {\mathfrak {d}})$ with respect to the Skorokhod topology on $\mathscr {D}_{1}$ with the initial data $\lim _{N\to \infty }\mathbf {Exp}(-\Gamma ^{N,\mathbf {X}}\mathbf {h}^{N})$ . The transport coefficient $\bar {\mathfrak {d}}\in \mathbb R$ that determines the limit $\mathrm {SHE}(\bar {\mathfrak {d}})$ is a derivative of an equilibrium expectation of the flux corresponding to the $\mathfrak {d}$ -asymmetry; see Definition 2.2.
Remark 1.9. Observe the $\mathfrak {d}$ -asymmetry is biased to the left. Moreover, any jump $x+1\to x$ in the system increases the value of $\mathbf {h}^{N}$ at x. Thus, the average growth speed of $\mathbf {h}^{N}$ increases as $\mathfrak {d}$ increases; this is why the leading-order $N^{1/2}\mathrm {R}_{2,1}$ -renormalization from the $\mathfrak {d}$ -asymmetry is proportional to $\mathfrak {d}$ . This implies the nonlinearity in the hydrodynamic limit, which resembles the role of general $\mathbf {F}$ in our Taylor expansion discussion prior to equation (1.2), is proportional to and therefore ‘positive’ in $\mathfrak {d}$ . Said Taylor expansion heuristic ultimately deduces from this that the KPZ/SHE limits for fluctuations have $+\bar {\mathfrak {d}}\nabla $ instead of $-\bar {\mathfrak {d}}\nabla $ ; although the exact coefficients predicted by Taylor expansion are possibly incorrect if not done carefully, its qualitative prediction for direction of transport/growth is correct, as substantiated by Theorem 1.8.
To the author’s knowledge, Theorem 1.8 is a first result on nonintegrable and nonstationary interacting particle systems in which a homogenized linear transport term $\bar {\mathfrak {d}}\nabla $ in $\mathrm {SHE}(\bar {\mathfrak {d}})$ is obtained in the KPZ limit. The proof estimates what-will-be-error terms quantitatively, so we can let the speed of the $\mathfrak {d}$ -asymmetry be a slightly larger power of N to obtain $\mathrm {SHE}(\text{‘}\infty \text{’})$ , where ‘ $\infty $ ’ means that we follow a characteristic of constant but diverging speed. Also, as in [Reference Chang and Yau8], the Boltzmann–Gibbs principle is sometimes applied to linearize environment dependence of symmetric dynamics, recovering a Laplacian in the limit from a nonlinear second-order operator. The method herein can do this after a few refinements. In a nutshell, this is homogenization in the top-order differential; Theorem 1.8 is homogenization of lower-order terms, while perturbations in the top order are more delicate. To give a complete discussion of our method, we discuss the refinements in relation to [Reference Chang and Yau8]; see Remarks 4.11 and 5.6. But we defer details to future work; they are not complicated once we give Remarks 4.11 and 5.6 and apply calculations already in [Reference Chang and Yau8] that combine naturally and generally with the ideas in this paper. These details are also separate from singular KPZ fluctuations of interest here.
1.3 Additional context
Theorem 1.8 can be interpreted in the following fashion. We establish in this paper a general method of deriving the Boltzmann–Gibbs principle for interacting particle systems, and to illustrate its utility and ‘strength’, we obtain nonlinear and singular KPZ fluctuations for a general set of particle system height functions. By ‘strength’, we refer to the fact that earlier work, including but certainly not limited to [Reference Brox and Rost6, Reference Chang and Yau8, Reference Goncalves and Jara24, Reference Jara and Menezes34], establishes Boltzmann–Gibbs principles that only hold in a sense that is too weak to address the singular behavior of KPZ. Indeed, the KPZ equation and SHE are PDEs that must be interpreted in a sufficiently strong topology to establish convergence in the context of particle systems [Reference Amir, Corwin and Quastel1, Reference Bertini and Giacomin3, Reference Dembo and Tsai19]; while previous Boltzmann–Gibbs principles do not play well with the analytic procedure needed to solve SHE, our method gives a Boltzmann–Gibbs principle that does, allowing us to rigorously derive singular KPZ fluctuations. By ‘strength’, we also refer to the local nature of our Boltzmann–Gibbs principle and its derivation.
Towards universality of KPZ, Theorem 1.8 contributes to an almost empty set of nonstationary and nonintegrable interacting particle systems for which convergence to KPZ is rigorously shown. In [Reference Dembo and Tsai19], nonsimple exclusion processes of maximal-particle-jump length at most three are studied successfully. These are basically integrable if one is able to analyze hydrodynamic limits [Reference Dembo and Tsai19]. In [Reference Yang49], the jump-length condition in [Reference Dembo and Tsai19] was removed; the necessary ingredient was a very weak form of the Boltzmann–Gibbs principle to show effective vanishing for a one-dimensional subspace of ‘fluctuating observables’ or ‘pseudo-gradients’ as defined in [Reference Yang49]. The key technical development here is to upgrade the weak principle of [Reference Yang49] to a full Boltzmann–Gibbs principle that not only applies to fluctuating observables but computes generally nonzero effective limits of general local observables. To this end, we develop a nonstationary version of the multiscale renormalization of [Reference Goncalves and Jara24, Reference Sethuraman and Xu43]. This necessary multiscale step is part of what makes the full Boltzmann–Gibbs principle more difficult compared to [Reference Yang49]; we use only the more robust one-block step of [Reference Guo, Papnicolaou and Varadhan28] to analyze locally fluctuating terms, but to compute the macroscopic effects of general local observables, we require the more difficult two-blocks step of [Reference Guo, Papnicolaou and Varadhan28]. Finally, with some rather technical multiscale refinements, [Reference Yang49] extends to KPZ fluctuations in open boundary systems [Reference Yang50], for which little is known, by locality of its method; again, the same holds for our Boltzmann–Gibbs principle, letting us add to the universality of the so-called open KPZ equation [Reference Corwin and Shen11], for which minimal progress has been made. 
Extensions of earlier work on hydrodynamic fluctuations of linear Gaussian type, such as [Reference Chang and Yau8], to open boundary versions are also possible by using the method herein and similar technical refinements. These are all currently being carried out by the author.
In [Reference Hairer29, Reference Hairer30, Reference Hairer and Quastel31, Reference Hairer and Xu32], regularity structures were used to study both the KPZ equation and its universality for generalizations of KPZ for nonquadratic nonlinearities that we discussed earlier. However, regularity structures depend on strong assumptions on the $\xi $ -noise. To the author’s knowledge, it is not known how to apply regularity structures to tackle either universality of KPZ or Boltzmann–Gibbs principles for interacting particle systems. It would certainly be interesting to see these developments.
1.4 The infinite volume case
This paper treats particle systems on a discrete torus (with limit SPDE on a compact torus). The use of the torus is purely for technical convenience as it gives a priori spatial compactness. (It is a frequent assumption in studies of large-scale asymptotics of interacting particle systems; see [Reference Chang and Yau8], for instance.) However, all our methods are spatially local, and the limit SPDE (1.1) is well posed on the full line $\mathbb R$ , so the infinite volume case for systems on $\mathbb Z$ should be doable, for example, via the method in [Reference Yang49].
1.5 Outline
As this paper has many technically involved moving pieces, we make an effort to explain many points.
• In Section 2, we derive an approximate microscopic version of $\mathrm {SHE}(\bar {\mathfrak {d}})$ for $\mathbf {Z}^{N}\kern-1.5pt$ . This is standard for proving KPZ fluctuations.
• Section 3 proves Theorem 1.8 assuming three key ingredients, the last of which we show in Section 3 and the first two of which we outline. In particular, we introduce and discuss the Boltzmann–Gibbs principle. We then outline the rest of the paper.
1.6 Conventions
We write here a list of conventions, including notation, that are used frequently in the paper.
• We use Landau big-Oh notation. Also, $\mathrm {a}\lesssim \mathrm {b}$ is equivalent to $\mathrm {a}=\mathrm {O}(\mathrm {b})$ , and $\mathrm {a}\gtrsim \mathrm {b}$ is equivalent to $\mathrm {b}\lesssim \mathrm {a}$ .
• The notation $\mathrm {SHE}(\bar {\mathfrak {d}})$ stands for the solution of SHE (1.2) with linear transport coefficient $\bar {\mathfrak {d}}$ .
• We let $\mathscr {D}_{1}$ / $\mathscr {C}_{1}$ be the Skorokhod space of cadlag paths/space of continuous paths until time 1 valued in $\mathscr {C}(\mathbb {T}^{1})$ .
• The microscopic length scale is order 1. The macroscopic length scale is order N. Mesoscopic length scales are in between. The microscopic timescale is order $N^{-2}$ . The macroscopic timescale is order $1$ . Mesoscopic timescales are in between.
• Set $\mathbb {T}_{N}=\mathbb Z/N\mathbb Z$ and $\mathbb {T}^{1}=\mathbb R/\mathbb Z$ . Recall we chose an embedding $\mathbb {T}_{N}\subseteq \mathbb Z$ . For $\alpha>0$ , define $\alpha \mathbb {T}^{1}=\mathbb R/\alpha \mathbb Z$ .
• Whenever we say $\mathbb {I}\subseteq \mathbb {T}_{N}$ for some subset $\mathbb {I}\subseteq \mathbb Z$ , we mean its image under the mod- $|\mathbb {T}_{N}|$ map $\mathbb Z\to \mathbb Z/N\mathbb Z=\mathbb {T}_{N}\subseteq \mathbb Z$ .
• Provided any $z\in \mathbb {T}_{N}\kern-1.5pt$ , we let $|z|$ denote the absolute value after the embedding $\mathbb {T}_{N}\subseteq \mathbb Z$ .
• For stable initial data, see Definition 1.6 and Remark 1.7. For rescaling operators $\Gamma ^{N,\mathbf {X}}$ and $\Gamma ^{N}$ , see Definition 1.5.
• For any $\eta \in \Omega $ and $x\in \mathbb {T}_{N}\kern-1.5pt$ , define $\tau _{x}\eta $ to be the configuration defined by $(\tau _{x}\eta )_{z}=\eta _{-x+z}$ for all $z\in \mathbb {T}_{N}$ . For any functional $\mathfrak {f}:\Omega \to \mathbb R$ and $x\in \mathbb {T}_{N}$ and $S\geqslant 0$ , we define $\tau _{x}\mathfrak {f}=\mathfrak {f}\circ \tau _{x}:\Omega \to \mathbb R$ to recenter $\mathfrak {f}$ at x and $\mathfrak {f}_{S,y}=\tau _{y}\mathfrak {f}(\eta _{S})$ .
• Given any functional $\mathfrak {f}:\Omega \to \mathbb R$ , we define its support to be the smallest subset of $\mathbb {T}_{N}$ for which $\mathfrak {f}$ depends only on $\eta $ -variables in that subset. For example, if $\mathfrak {f}(\eta )=\eta _{0}$ , then the support of $\mathfrak {f}$ is the single point $\{0\}\subseteq \mathbb {T}_{N}$ .
• For the $\mathfrak {l}_{\mathfrak {d}}$ length scale and the support of $\mathfrak {d}:\Omega \to \mathbb R$ , see Assumption 1.3.
• For $\mathrm {t}_{\mathrm {st}}$ or $\varepsilon _{\mathrm {ap}}$ and $\varepsilon _{\mathrm {RN}}$ , see Definition 3.1. For $\varepsilon _{1}$ and $\varepsilon _{\mathrm {RN},1}$ , see Propositions 4.6 and 4.7. For $\mathbf {Y}^{N}$ , see Definition 3.5.
• Provided any finite, not necessarily uniformly bounded, set $\mathrm {I}$ , define the averaged summation $\widetilde {\sum }_{x\in \mathrm {I}}=|\mathrm {I}|^{-1}{\sum }_{x\in \mathrm {I}}$ .
• For any $\phi :\mathbb {T}_{N}\to \mathbb R$ and $\mathfrak {l}\in \mathbb Z$ , define the spatial gradient on length scale $\mathfrak {l}$ by $\nabla _{\mathfrak {l}}^{\mathbf {X}}\phi _{x}=\phi _{x+\mathfrak {l}}-\phi _{x}$ . We also define the discrete Laplacian via the composition $\Delta =-\nabla _{1}^{\mathbf {X}}\nabla _{-1}^{\mathbf {X}}$ . Lastly, define $\Delta ^{!!}=N^{2}\Delta $ and $\nabla _{\mathfrak {l}}^{!}=N\nabla _{\mathfrak {l}}^{\mathbf {X}}$ .
• For any $\psi :[0,1]\to \mathbb R$ and $\mathrm {t}\in \mathbb R$ , define the time gradient on timescale $\mathrm {t}$ by $\nabla _{\mathrm {t}}^{\mathbf {T}}\psi _{\mathrm {s}}=\psi _{\mathrm {s}+\mathrm {t}}-\psi _{\mathrm {s}}$ ; if $\mathrm {s}+\mathrm {t}\not \in [0,1]$ , then replace $\mathrm {s}+\mathrm {t}$ in the definition of $\nabla ^{\mathbf {T}}_{\mathrm {t}}\psi _{\mathrm {s}}$ by the boundary point $\{0,1\}$ closest to $\mathrm {s}+\mathrm {t}$ .
• For any $a,b\in \mathbb R$ , we define the discretized interval $[\![a,b]\!]=[a,b]\cap \mathbb Z$ .
• For any $p\geqslant 1$ , we let $\|\|_{\omega ;p}$ denote the p-norm with respect to all the randomness in the particle system. Provided any $\mathrm {t}\geqslant 0$ and spatial set $\mathbb {K}$ and function $\phi $ , we define $\|\phi \|_{\mathrm {t};\mathbb {K}}=\sup _{(s,y)\in [0,\mathrm {t}]\times \mathbb {K}}|\phi _{s,y}|$ .
• For any $S,T\geqslant 0$ , we define $\mathbf {O}_{S,T}=|T-S|$ usually as an on-diagonal heat kernel factor; see Proposition A.3.
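The discrete operators in the conventions above can be sanity-checked numerically. The following is a minimal sketch, assuming functions on $\mathbb {T}_{N}$ are stored as NumPy arrays indexed by $x=0,\ldots ,N-1$ with periodic wrap-around; the helper names `grad`, `laplacian` and `avg_sum` are illustrative and are not notation from this paper. It checks that $-\nabla _{1}^{\mathbf {X}}\nabla _{-1}^{\mathbf {X}}$ is the usual second difference and that the rescaled operators approximate derivatives in the macroscopic variable $x/N$ (with $N\nabla _{-1}^{\mathbf {X}}$ approximating minus the first derivative).

```python
import numpy as np

def grad(phi, l):
    """Spatial gradient on length scale l: (grad phi)_x = phi_{x+l} - phi_x on the torus."""
    return np.roll(phi, -l) - phi

def laplacian(phi):
    """Discrete Laplacian, defined as the composition -grad_1 grad_{-1}."""
    return -grad(grad(phi, -1), 1)

def avg_sum(values):
    """Averaged summation: |I|^{-1} times the sum over a finite set I."""
    values = np.asarray(values, dtype=float)
    return values.sum() / values.size

N = 200
x = np.arange(N)
phi = np.sin(2 * np.pi * x / N)  # smooth test function sampled on T_N

# The composition agrees with the usual second difference.
second_diff = np.roll(phi, -1) + np.roll(phi, 1) - 2 * phi
lap_err = np.max(np.abs(laplacian(phi) - second_diff))

# Rescaled operators versus derivatives of sin(2 pi u) at u = x / N.
d1 = (2 * np.pi) * np.cos(2 * np.pi * x / N)        # first derivative
d2 = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * x / N)  # second derivative
err_grad_plus = np.max(np.abs(N * grad(phi, 1) - d1))
err_grad_minus = np.max(np.abs(N * grad(phi, -1) + d1))  # N * grad_{-1} ~ -derivative
err_lap = np.max(np.abs(N**2 * laplacian(phi) - d2))
```

The first-order errors are of order $N^{-1}$ and the Laplacian error is of order $N^{-2}$ for this smooth test function; nothing here depends on the particle system.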
2 Approximate microscopic SHE
Definition 2.1. For $\sigma \in \mathbb R$ , let $\mathbf {E}_{\sigma }$ be the expectation with respect to product Bernoulli measure on $\Omega $ with $\mathbf {E}_{\sigma }\eta _{x} = \sigma $ for $x\in \mathbb {T}_{N}\kern-1.5pt$ .
Definition 2.2. Define $\mathfrak {q} \overset {}= \frac 12\mathfrak {d}-\frac 12\mathfrak {d}\cdot \eta _{0}\eta _{1}$ . Its support is contained in for $\mathfrak {l}_{\mathfrak {d}}\in \mathbb Z_{\geqslant 1}$ uniformly bounded.
• Define $\widetilde {\mathfrak {q}}\overset {}=\tau _{-2\mathfrak {l}_{\mathfrak {d}}}\mathfrak {q}$ to shift the support of $\mathfrak {q}$ strictly to the left of $0\in \mathbb {T}_{N}\kern-1.5pt$ . Define $\bar {\mathfrak {q}}=\widetilde {\mathfrak {q}}-\mathbf {E}_{0}\widetilde {\mathfrak {q}}-\bar {\mathfrak {d}}\eta _{0}$ with $\bar {\mathfrak {d}}\overset {}=\partial _{\sigma }\mathbf {E}_{\sigma }\widetilde {\mathfrak {q}}|_{\sigma =0}$ .
• Define $\widetilde {\mathfrak {s}}(\eta )\overset {}=-\widetilde {\mathfrak {q}}(\eta )\cdot \sum _{y=0}^{2\mathfrak {l}_{\mathfrak {d}}-1}\eta _{-y}$ and the $\mathbf {E}_{0}$ -fluctuation $\mathfrak {s}\overset {}=\widetilde {\mathfrak {s}}-\mathbf {E}_{0}\widetilde {\mathfrak {s}}$ .
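To make the definition of $\bar {\mathfrak {d}}$ concrete, the following is a minimal sketch that approximates $\partial _{\sigma }\mathbf {E}_{\sigma }\mathfrak {q}|_{\sigma =0}$ by brute-force enumeration and a central difference. It uses $\mathfrak {q}$ rather than $\widetilde {\mathfrak {q}}$ , which gives the same expectation because the product measure is shift invariant. The two choices of $\mathfrak {d}$ below are hypothetical and are not checked against Assumption 1.3; all function names are illustrative.

```python
from itertools import product

def expectation(f, sigma, sites):
    """E_sigma f, enumerating eta in {-1,+1}^sites under the product measure with mean sigma."""
    total = 0.0
    for values in product((-1, 1), repeat=len(sites)):
        weight = 1.0
        for v in values:
            weight *= (1 + sigma * v) / 2  # P(eta_x = v) when E eta_x = sigma
        total += weight * f(dict(zip(sites, values)))
    return total

def make_flux(d):
    """q = d/2 - (d/2) * eta_0 * eta_1, as in Definition 2.2."""
    return lambda eta: 0.5 * d(eta) - 0.5 * d(eta) * eta[0] * eta[1]

def dbar(d, sites, h=1e-4):
    """Central-difference approximation of the sigma-derivative of E_sigma q at sigma = 0."""
    q = make_flux(d)
    return (expectation(q, h, sites) - expectation(q, -h, sites)) / (2 * h)

sites = (0, 1, 2)
d_constant = lambda eta: 1.0                # hypothetical: no environment dependence
d_example = lambda eta: 1.0 + 0.5 * eta[2]  # hypothetical: depends on site 2 only

dbar_constant = dbar(d_constant, sites)  # E_sigma q = (1 - sigma^2) / 2
dbar_example = dbar(d_example, sites)    # E_sigma q = (1 + sigma / 2)(1 - sigma^2) / 2
```

For the constant choice the derivative at $\sigma =0$ vanishes, while the second choice has a linear term in $\mathfrak {q}$ and gives a value close to $1/4$ .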
Remark 2.3. Recall $\mathbf {E}_{0}\widetilde {\mathfrak {s}}$ is a part of the renormalization constant in the exponential $\mathbf {Z}^{N}\kern-1.5pt$ . To understand this renormalization, since $\widetilde {\mathfrak {q}}$ is local, we can write it as a polynomial in $\eta _{x}$ -variables for x in a fixed neighborhood of the origin. When we multiply its degree $\neq 1$ monomials by a linear term to get $\widetilde {\mathfrak {s}}$ , we get a polynomial with no constant term and therefore zero $\mathbf {E}_{0}$ -expectation. Thus, degree $\neq 1$ terms in $\widetilde {\mathfrak {q}}$ , and therefore in $\mathfrak {q}$ , do not produce constants that need to be renormalized. However, a linear term in $\mathfrak {q}$ can turn into a constant after multiplication by a linear statistic since $\eta _{x}^{2}=1$ , and nonzero constants have nonzero $\mathbf {E}_{0}$ -expectation, so these terms yield constants that then need to be part of the renormalization of the height function and $\mathbf {Z}^{N}\kern-1.5pt$ . On the other hand, if $\mathfrak {q}$ is replaced by the linear functional $\eta \mapsto \eta _{0}$ , then $\eta _{T,x}\mathbf {Z}_{T,x}^{N}\approx c_{1}N^{1/2}\nabla _{-1}^{\mathbf {X}}\mathbf {Z}^{N}_{T,x}+c_{2}\mathbf {Z}^{N}_{T,x}$ with constants $c_{1},c_{2}$ ultimately follows by Taylor expansion as in Section 2 of [Reference Dembo and Tsai19]. One can readily check that $c_{2}$ is obtained by replacing $\widetilde {\mathfrak {q}}(\eta )$ by $\eta \mapsto \eta _{0}$ in $\widetilde {\mathfrak {s}}$ and then taking $\mathbf {E}_{0}$ .
Therefore, the renormalization $\mathbf {E}_{0}\widetilde {\mathfrak {s}}$ in $\mathbf {Z}^{N}$ can be equivalently computed by first linearizing the $\widetilde {\mathfrak {q}}$ -environment dependence to get ASEP without environment dependence as in [Reference Bertini and Giacomin3] and then computing the renormalization for this homogenized/linearized ASEP by following the classical calculation in [Reference Bertini and Giacomin3] of Bertini–Giacomin.
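The mechanism in Remark 2.3 can be illustrated with a small polynomial calculation. The sketch below is an assumption-laden toy: a local functional is stored as a dictionary from sets of sites to coefficients, products are reduced with $\eta _{x}^{2}=1$ , and $\mathbf {E}_{0}$ reads off the constant term. The functional `q_tilde` is hypothetical (it is not derived from any particular $\mathfrak {d}$ ), and the helper names are illustrative.

```python
def multiply(f, g):
    """Multiply polynomials in eta-variables, reducing with eta_x^2 = 1."""
    out = {}
    for A, a in f.items():
        for B, b in g.items():
            C = A ^ B  # symmetric difference: a repeated site cancels since eta_x^2 = 1
            out[C] = out.get(C, 0) + a * b
    return out

def mean_zero_density(f):
    """E_0 f: under the density-zero product measure only the constant term survives."""
    return f.get(frozenset(), 0)

def s_tilde(q_tilde, l_d):
    """s~ = -q~ * sum_{y=0}^{2 l_d - 1} eta_{-y}, as in Definition 2.2."""
    linear_statistic = {frozenset({-y}): 1 for y in range(2 * l_d)}
    return {A: -c for A, c in multiply(q_tilde, linear_statistic).items()}

# Hypothetical q~ supported strictly to the left of 0: constant + linear + quadratic terms.
l_d = 1
q_tilde = {
    frozenset(): 3,          # degree 0
    frozenset({-1}): 5,      # degree 1
    frozenset({-1, -2}): 7,  # degree 2
}
linear_part = {A: c for A, c in q_tilde.items() if len(A) == 1}

renorm_full = mean_zero_density(s_tilde(q_tilde, l_d))
renorm_linear = mean_zero_density(s_tilde(linear_part, l_d))
```

In this toy example both quantities equal $-5$ : only the linear term of `q_tilde` contributes to $\mathbf {E}_{0}\widetilde {\mathfrak {s}}$ , which matches the statement that the renormalization can be computed after linearizing the environment dependence.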
Proposition 2.4. We have the following with notation defined afterwards, in which $|\mathfrak {b}_{i;}|\lesssim 1$ are possibly random:
• Let us first define the discrete first-order gradient $\nabla _{\mathfrak {l}}^{\mathbf {X}}\varphi _{x}=\varphi _{x+\mathfrak {l}}-\varphi _{x}$ provided any $\mathfrak {l} \in \mathbb Z$ and $\varphi :\mathbb {T}_{N}\to \mathbb R$ . We proceed to define $\Delta ^{!!} = -N^{2}\nabla _{1}^{\mathbf {X}}\nabla _{-1}^{\mathbf {X}}$ and $\nabla _{\mathfrak {l}}^{!} = N\nabla _{\mathfrak {l}}^{\mathbf {X}}$ , in agreement with the conventions in Section 1.6. The first term in the equation above is defined by $\mathscr {L}_{N} = 2^{-1}\Delta ^{!!} + \bar {\mathfrak {d}}\nabla _{-1}^{!}$ .
• The $\mathrm {d}\xi _{\bullet ,x}^{N}$ -term is a martingale differential/compensated Poisson process corresponding to jumps over $\{x,x+1\}\subseteq \mathbb {T}_{N}\kern-1.5pt$ . Put precisely, it is the following measure in T (given any x) that describes the change in $\mathbf {Z}_{T,x}^{N}$ according to clocks in the $\eta $ process:
$$ \begin{align*} \mathrm{d}\xi_{T,x}^{N} & = (\mathrm{e}^{2N^{-\frac12}}-1)\mathbf{1}_{\eta_{T,x}=1}\mathbf{1}_{\eta_{T,x+1}=-1}[\mathrm{d}\mathcal{Q}_{T,x}^{N,\mathrm{S},\to}-\tfrac12N^{2}\mathrm{d} T]\\& \quad + (\mathrm{e}^{-2N^{-\frac12}}-1)\mathbf{1}_{\eta_{T,x}=-1}\mathbf{1}_{\eta_{T,x+1}=1}[\mathrm{d}\mathcal{Q}_{T,x}^{N,\mathrm{S},\leftarrow}-\tfrac12N^{2}\mathrm{d} T] \\& \quad - (\mathrm{e}^{2N^{-\frac12}}-1)\mathbf{1}_{\eta_{T,x}=1}\mathbf{1}_{\eta_{T,x+1}=-1}[\mathrm{d}\mathcal{Q}_{T,x}^{N,\mathrm{A},\to}-(\tfrac12N^{\frac32}+\tfrac12N\mathfrak{d}_{x}(\eta_{T}))\mathrm{d} T]\\& \quad + (\mathrm{e}^{-2N^{-\frac12}}-1)\mathbf{1}_{\eta_{T,x}=-1}\mathbf{1}_{\eta_{T,x+1}=1}[\mathrm{d}\mathcal{Q}_{T,x}^{N,\mathrm{A},\leftarrow}-(\tfrac12N^{\frac32}+\tfrac12N\mathfrak{d}_{x}(\eta_{T}))\mathrm{d} T]. \end{align*} $$
The clocks $\mathcal {Q}^{N,\mathrm {S},\to }_{T,x}$ and $\mathcal {Q}^{N,\mathrm {S},\leftarrow }_{T,x}$ are Poisson processes in T of speed $2^{-1}N^{2}$ . The clocks $\mathcal {Q}^{N,\mathrm {A},\to }_{T,x}$ and $\mathcal {Q}^{N,\mathrm {A},\leftarrow }_{T,x}$ are Poisson processes in T of speed $2^{-1}N^{3/2}+2^{-1}N\mathfrak {d}_{x}(\eta _{T})$ , which is positive for sufficiently large N as $|\mathfrak {d}|\lesssim 1$ . Lastly, the predictable quadratic covariation between any two distinct (compensated) Poisson clocks is zero.
• When we write $\nabla _{\star }^{!}$ , we sum over the choices $\star =1,-2\mathfrak {l}_{\mathfrak {d}}$ with $\mathfrak {b}_{2;}$ depending possibly on $\star $ but still uniformly bounded.
We provide a proof of Proposition 2.4 at the end of the subsection to avoid obstructing important takeaways in this section. Roughly speaking, the particle system at hand is ASEP from [Reference Dembo and Tsai19] but with only simple jumps and additional asymmetry $N^{-1}\mathfrak {d}$ , and the Gartner transform $\mathbf {Z}^{N}$ is also that of [Reference Dembo and Tsai19] but with additional deterministic $\mathrm {R}_{2}t$ drift in the exponential. In view of these two observations, we follow the derivation of the microscopic SHE for the Gartner transform in Section 2 of [Reference Dembo and Tsai19]. Roughly, the only difference is the $N^{-1}\mathfrak {d}$ asymmetry. As jumps in $\mathbf {Z}^{N}$ are order $N^{-1/2}\mathbf {Z}^{N}\kern-1.5pt$ , the effect of $N^{-1}\mathfrak {d}$ asymmetry is order $N^{1/2}\mathbf {Z}^{N}$ after time scaling. We linearize the flux $\mathfrak {q}$ of this $N^{-1}\mathfrak {d}$ asymmetry to get $\bar {\mathfrak {q}}$ in Definition 2.2, and Taylor expansions/summation by parts give us the last three terms in the $\mathbf {Z}^{N}$ -equation after cancelling with the additional $\mathrm {R}_{2}$ -drift in $\mathbf {Z}^{N}\kern-1.5pt$ . The last three terms in the $\mathbf {Z}^{N}$ equation ultimately vanish in the large-N limit. Now, to make Proposition 2.4 useful, we consider its mild form.
Definition 2.5. We let $\mathbf {H}_{S,T,x,y}^{N}$ on $\mathbb R_{\geqslant 0}^{2}\times \mathbb {T}_{N}^{2}$ be the heat kernel defined by $\mathbf {H}_{S,S,x,y}^{N}=\mathbf {1}_{x=y}$ and $\partial _{T}\mathbf {H}_{S,T,x,y}^{N}=\mathscr {L}_{N}\mathbf {H}_{S,T,x,y}^{N}$ , where $\mathscr {L}_{N}$ acts on the backwards spatial variable $x\in \mathbb {T}_{N}\kern-1.5pt$ . Provided any test function $\varphi :\mathbb R\times \mathbb {T}_{N}\to \mathbb R$ , we additionally define a pair of space-time and spatial heat operators for which we give three ways that each operator may be written in this paper:
Corollary 2.6. Admit the setting and notation of Proposition 2.4 . We have the stochastic integral equation
Proof of Proposition 2.4.
We follow the derivation of the microscopic SHE in Section 2 of [Reference Dembo and Tsai19]. Following their first steps at the beginning of Section 2, we derive the following for the time-differential of $\mathbf {Z}^{N}\kern-1.5pt$ , which we discuss below:
We clarify $\Phi ^{\mathrm {S}}$ and $\Phi ^{\mathrm {A},1}$ and $\Phi ^{\mathrm {A},2}$ coefficients shortly. We briefly note, however, that equation (2.2) is a martingale/Dynkin decomposition for $\mathbf {Z}^{N}\kern-1.5pt$ , where the martingale is explicitly recorded in terms of Poisson clocks in the particle system. In particular, to derive equation (2.2), one starts with the following integral equation (that comes from the Dynkin formula), in which the stochastic integral on the left-hand side (LHS) should be interpreted as integration of $\mathbf {Z}^{N}_{S,x}$ against the measure $\mathrm {d}\xi _{S,x}^{N}$ :
where $\mathrm {R}$ is the renormalization constant in Definition 1.1, and $\mathsf {L}_{N}$ is the generator of the particle system. Indeed, as in Section 2 of [Reference Dembo and Tsai19], integrating all of the clock terms in $\mathrm {d}\xi ^{N}_{S,x}$ accounts for the total change $\mathbf {Z}_{T,x}^{N}-\mathbf {Z}_{0,x}^{N}$ . The drift terms in $\mathrm {d}\xi ^{N}_{S,x}$ account for the generator term $\mathsf {L}_{N}\mathbf {Z}^{N}_{S,x}$ . The $\mathsf {L}_{N,\mathrm {S}}$ part of $\mathsf {L}_{N}$ yields $N^{2}\Phi ^{\mathrm {S}}\mathbf {Z}^{N}$ in equation (2.2), and the $\mathsf {L}_{N,\mathrm {A}}$ part yields $N^{2}\Phi ^{\mathrm {A},1}\mathbf {Z}^{N}+N^{2}\Phi ^{\mathrm {A},2}\mathbf {Z}^{N}\kern-1.5pt$ . (One can define $\Phi ^{\mathrm {S}},\Phi ^{\mathrm {A},1}, \Phi ^{\mathrm {A},2}$ for this to be true, at which point we are left to compute these terms.) In particular, the claim about vanishing quadratic covariations follows since each (compensated) Poisson clock in the statement of Proposition 2.4 comes from a different $\mathsf {L}_{x}$ operator in $\mathsf {L}_{N}$ . The $\mathrm {R}\mathbf {Z}^{N}$ term on the right-hand side (RHS) of the previous display comes from the fact that $\mathbf {Z}^{N}_{T,x}$ exponentiates $\mathrm {R}T$ , and it gives $\mathrm {R}_{1}\mathbf {Z}^{N}+\mathrm {R}_{2}\mathbf {Z}^{N}$ in equation (2.2).
The exact formulas for $\Phi ^{\mathrm {S}}$ and $\Phi ^{\mathrm {A},1}$ are not important to this proof as we deal with them via citing the calculations in Section 2 of [Reference Dembo and Tsai19]; the same applies to $\mathrm {R}_{1}$ . The emphasis of this calculation will be computing $\Phi ^{\mathrm {A},2}$ and $\mathrm {R}_{2}$ , the former of which is equal to the following ‘instantaneous’ growth/change in $\mathbf {Z}^{N}$ that results from motion of the particle system through the $\mathfrak {d}$ -asymmetry:
In Section 2 of [Reference Dembo and Tsai19], through Taylor expansion and lengthy though elementary calculations, the authors identify the contribution in equation (2.2) of $\Phi ^{\mathrm {S}}$ and $\Phi ^{\mathrm {A},1}$ and $\mathrm {R}_{1}$ with a discrete approximation of the continuum Laplacian. Precisely, they establish the identity
Provided equations (2.2) and (2.3), we are left with computing $\Phi ^{\mathrm {A},2}$ and $\mathrm {R}_{2}$ contributions. To this end, it will be convenient to first define $\mathrm {E}_{\pm ,N}=\mathbf {Exp}(\pm 2N^{-1/2})-1$ along with two ‘trigonometric-type’ functions $\mathrm {T}^{\pm ,N}=\mathrm {E}_{-,N}\pm \mathrm {E}_{+,N}$ . Let us also observe the identity $2\mathbf {1}(\eta =\pm 1)=1\pm \eta $ for $\eta \in \{\pm 1\}$ , which can be checked immediately. This allows us to rewrite the indicator functions in $\Phi ^{\mathrm {A},2}$ explicitly as local functionals of the particle system and thus lets us compute as follows:
As $\widetilde {\mathfrak {q}}_{T,x}-\mathfrak {q}_{T,x}=\nabla _{-2\mathfrak {l}_{\mathfrak {d}}}^{\mathbf {X}}\mathfrak {q}_{T,x}$ , where $\nabla _{-2\mathfrak {l}_{\mathfrak {d}}}^{\mathbf {X}}$ acts on x (see Definition 2.2), for the first term in equation (2.5), we have
We will now multiply the calculation (2.5) by $\mathbf {Z}^{N}\kern-1.5pt$ , use the identity (2.7), and then add the additional drift $N^{-2}\mathrm {R}_{2}\mathbf {Z}^{N}\kern-1.5pt$ . We will match the resulting sum and identities to the non- $\Delta $ and non- $\xi ^{N}$ terms in the proposed SDE for $\mathbf {Z}^{N}\kern-1.5pt$ . For the purposes of clearer organization, we write these calculations in the following bullet points. We address each term in equation (2.7) in written order. Let us clarify that, throughout the following list, we may change $\mathfrak {b}_{1;}$ from line to line, but it is always a sum of an N-independent number of order $N^{-1/2}$ error terms that come from Taylor expansion. Lastly, recall $\mathrm {R}_{2}=N^{1/2}\mathrm {R}_{2,1}+\mathrm {R}_{2,2}+\mathrm {R}_{2,3}$ .
• Let us first match $4^{-1}N\mathrm {T}^{-,N}\bar {\mathfrak {q}}$ from equation (2.7) plugged into equation (2.5) to $-N^{1/2}\bar {\mathfrak {q}}$ in the proposed SDE up to error $\mathrm {O}(N^{-1/2})$ . This follows, by definition of $\mathrm {T}^{-,N}$ from immediately before equation (2.5), via $\mathrm {T}^{-,N}\sim -4N^{-1/2}+\mathrm {O}(N^{-3/2})$ .
• Let us now match $4^{-1}N\mathrm {T}^{-,N}\mathbf {E}_{0}\widetilde {\mathfrak {q}}$ , obtained by plugging equation (2.7) in equation (2.5), with $-N^{1/2}\mathrm {R}_{2,1}$ so that these terms cancel each other in the $\mathbf {Z}^{N}$ SDE, again up to $\mathrm {O}(N^{-1/2})$ that adds to $\mathfrak {b}_{1;}$ . By definition $\mathrm {R}_{2,1}={-}2^{-1}\mathbf {E}_{0}(\mathfrak {d}-\mathfrak {d}\cdot \eta _{0}\eta _{1})={-}\mathbf {E}_{0}\mathfrak {q}={-}\mathbf {E}_{0}\widetilde {\mathfrak {q}}$ since the product Bernoulli measure in $\mathbf {E}_{0}$ is invariant under spatial shifts. It now suffices to again use $\mathrm {T}^{-,N}\sim -4N^{-1/2}+\mathrm {O}(N^{-3/2})$ .
• We match $\mathrm {R}_{2,2}+4^{-1}N\mathrm {T}^{-,N}\bar {\mathfrak {d}}\eta $ again obtained by plugging equation (2.7) in equation (2.5) to the first-order operator $-\bar {\mathfrak {d}}\nabla _{-1}^{!}=-N\bar {\mathfrak {d}}\nabla _{-1}^{\mathbf {X}}$ in $\mathscr {L}_{N}$ up to $\mathrm {O}(N^{-1/2})$ to be absorbed into $\mathfrak {b}_{1;}$ :
$$ \begin{align} 4^{-1}N\mathrm{T}^{-,N}\bar{\mathfrak{d}}\eta \mathbf{Z}^{N} + \mathrm{R}_{2,2}\mathbf{Z}^{N} \ = \ -\bar{\mathfrak{d}}\nabla_{-1}^{!}\mathbf{Z}^{N} \ = \ -N\bar{\mathfrak{d}}\nabla_{-1}^{\mathbf{X}}\mathbf{Z}^{N}\kern-1.5pt. \tag{2.8} \end{align} $$
We compute $\nabla _{-1}^{\mathbf {X}}\mathbf {Z}^{N}$ with Taylor expansion via its definition (see Section 2 of [Reference Dembo and Tsai19]):
$$ \begin{align*} \nabla_{-1}^{\mathbf{X}}\mathbf{Z}^{N}_{T,x} \ &= \ \mathrm{e}^{-\mathbf{h}_{T,x-1}^{N}+\mathrm{R}T}-\mathrm{e}^{-\mathbf{h}_{T,x}^{N}+\mathrm{R}T} \ = \ (\mathrm{e}^{\mathbf{h}_{T,x}^{N}-\mathbf{h}_{T,x-1}^{N}}-1)\mathbf{Z}_{T,x}^{N} \\ &= \ (N^{-\frac12}\eta_{T,x}+2^{-1}N^{-1}+\mathrm{O}(N^{-\frac32}))\mathbf{Z}_{T,x}^{N}. \end{align*} $$
Thus, $-N\bar {\mathfrak {d}}\nabla ^{\mathbf {X}}_{-1}\mathbf {Z}^{N}\sim (-N^{1/2}\bar {\mathfrak {d}}\eta -2^{-1}\bar {\mathfrak {d}})\mathbf {Z}^{N}+\mathrm {O}(N^{-1/2})\mathbf {Z}^{N}$ . On the other hand, Taylor expansion gives $4^{-1}N\mathrm {T}^{-,N}\bar {\mathfrak {d}}\eta \sim -N^{1/2}\bar {\mathfrak {d}}\eta +\mathrm {O}(N^{-1/2})$ that can again be absorbed by $\mathfrak {b}_{1;}$ . Recalling $\mathrm {R}_{2,2}=2^{-1}\bar {\mathfrak {d}}$ , we get the desired matching (2.8).
• We move to $-4^{-1}N\mathrm {T}^{-,N}\nabla ^{\mathbf {X}}_{-2\mathfrak {l}_{\mathfrak {d}}}\mathfrak {q}_{T,x}\cdot \mathbf {Z}^{N}_{T,x}$ again obtained by plugging equation (2.7) in equation (2.5). We compute/match it as follows:
$$ \begin{align} -4^{-1}N\mathrm{T}^{-,N}{\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}\mathfrak{q}_{T,x}\cdot\mathbf{Z}^{N}_{T,x}} + \mathrm{R}_{2,3}\mathbf{Z}^{N}_{T,x} \ = \ -\mathfrak{s}_{T,x}\mathbf{Z}^{N}_{T,x}+N^{1/2}\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}(\mathfrak{b}_{2;T,x}\mathbf{Z}^{N}_{T,x}). \tag{2.9} \end{align} $$
We clarify $\mathfrak {b}_{2;}$ shortly. We start with the calculation below, which we explain afterwards; recall $\widetilde {\mathfrak {q}}=\tau _{-2\mathfrak {l}_{\mathfrak {d}}}\mathfrak {q}$ :
$$ \begin{align} &-4^{-1}N\mathrm{T}^{-,N}{\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}\mathfrak{q}_{T,x}\cdot\mathbf{Z}^{N}_{T,x}} \ = \ -4^{-1}N\mathrm{T}^{-,N}{\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}\left(\mathfrak{q}_{T,x}\mathbf{Z}^{N}_{T,x}\right)} + 4^{-1}N\mathrm{T}^{-,N}\widetilde{\mathfrak{q}}_{T,x}\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}\mathbf{Z}^{N}_{T,x} \nonumber \\ &= \ -4^{-1}N\mathrm{T}^{-,N}{\nabla^{\mathbf{X}}_{-2\mathfrak{l}_{\mathfrak{d}}}\left(\mathfrak{q}_{T,x}\mathbf{Z}^{N}_{T,x}\right)} + 4^{-1}N\mathrm{T}^{-,N}\widetilde{\mathfrak{q}}_{T,x}N^{-1/2}\left({\sum}_{\mathfrak{j}=1}^{2\mathfrak{l}_{\mathfrak{d}}}\tau_{x-\mathfrak{j}}\eta_{T} + \mathrm{O}(N^{-1})\right)\mathbf{Z}^{N}_{T,x}. \tag{2.10} \end{align} $$
The first line follows by a discrete version of the Leibniz rule that can be verified by unfolding discrete gradients and cancelling terms. The second line (2.10) follows by Taylor expanding $\nabla ^{\mathbf {X}}_{-2\mathfrak {l}_{\mathfrak {d}}}\mathbf {Z}^{N}_{T,x}$ as in Section 2 of [Reference Dembo and Tsai19]. Because $\mathrm {T}^{-,N}=-4N^{-1/2}+\mathrm {O}(N^{-3/2})$ , we can absorb $\mathrm {O}(N^{-1})$ in equation (2.10) to $N^{-1/2}\mathfrak {b}_{1;}$ and drop it from equation (2.10). This also implies that the second term in equation (2.10) is $-\widetilde {\mathfrak {s}}\mathbf {Z}^{N}=-\mathfrak {s}\mathbf {Z}^{N}-\mathbf {E}_{0}\widetilde {\mathfrak {s}}\mathbf {Z}^{N}=-\mathfrak {s}\mathbf {Z}^{N}-\mathrm {R}_{2,3}\mathbf {Z}^{N}$ . Lastly, the first term in equation (2.10) has the form $N^{-1/2}\nabla _{-2\mathfrak {l}_{\mathfrak {d}}}^{!}(\mathfrak {b}_{2;T,x}\mathbf {Z}^{N}_{T,x})$ for $|\mathfrak {b}_{2;}|\lesssim 1$ . Combining this paragraph with equation (2.10) gives the desired matching (2.9).
• We are left with analyzing the last term in equation (2.5). For this, first recall that the gradient condition we have assumed provides the current representation $\mathfrak {d}_{T,x}\nabla ^{\mathbf {X}}_{1}\eta _{T,x}=\nabla ^{\mathbf {X}}_{1}\mathfrak {w}_{T,x}$ , where $\mathfrak {w}$ is uniformly bounded. Moreover, we observe that $|\mathrm {T}^{+,N}|\lesssim N^{-1}$ , which is smaller than our estimate for $\mathrm {T}^{-,N}$ by a factor of $N^{-1/2}$ . Thus, we may employ the same argument as in the previous bullet point, replacing $\mathfrak {q}$ with $\mathfrak {w}$ and $-2\mathfrak {l}_{\mathfrak {d}}$ with $1$ , to identify the last term in equation (2.5), after multiplying by $\mathbf {Z}^{N}_{T,x}$ , to be of the form $N^{-1/2}\nabla ^{!}_{1}(\mathfrak {b}_{2;T,x}\mathbf {Z}^{N}_{T,x})+\mathrm {O}(N^{-1/2})\mathbf {Z}^{N}_{T,x}$ . We clarify that, here, there is no matching of $\widetilde {\mathfrak {s}}$ -terms with $\mathrm {R}_{2}$ -terms because the $N^{-1/2}$ factor we gain from having the coefficient $\mathrm {T}^{+,N}$ instead of $\mathrm {T}^{-,N}$ renders all such terms order $N^{-1/2}$ , so they are absorbed by $\mathfrak {b}_{1;}$ .
This completes the proof.
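The elementary expansions used repeatedly in the proof above can be checked numerically. The following is a minimal sketch with illustrative names (`E`, `T`, `grad`) mirroring $\mathrm {E}_{\pm ,N}$ , $\mathrm {T}^{\pm ,N}$ and $\nabla _{\mathfrak {l}}^{\mathbf {X}}$ ; it checks the indicator identity $2\mathbf {1}(\eta =\pm 1)=1\pm \eta $ , the asymptotics $\mathrm {T}^{-,N}=-4N^{-1/2}+\mathrm {O}(N^{-3/2})$ and $|\mathrm {T}^{+,N}|\lesssim N^{-1}$ , the second-order Taylor expansion behind the computation of $\nabla _{-1}^{\mathbf {X}}\mathbf {Z}^{N}$ , and the discrete Leibniz rule on arbitrary integer test data. It does not reproduce equations (2.5) or (2.7).

```python
import math

def E(sign, N):
    """E_{+-,N} = exp(+-2 N^{-1/2}) - 1; expm1 avoids cancellation for large N."""
    return math.expm1(sign * 2.0 / math.sqrt(N))

def T(sign, N):
    """T^{+-,N} = E_{-,N} +- E_{+,N}."""
    return E(-1, N) + sign * E(+1, N)

# Identity 2 * 1(eta = +-1) = 1 +- eta for eta in {-1, +1}.
indicator_ok = all(
    2 * int(eta == s) == 1 + s * eta for eta in (-1, 1) for s in (-1, 1)
)

ratios = []
for N in (10**2, 10**4, 10**6):
    a = N ** -0.5
    # T^{-,N} = -4a - (8/3) a^3 + ..., so this ratio should be close to -8/3.
    r_minus = (T(-1, N) + 4 * a) / a**3
    # T^{+,N} = 4 a^2 + ..., so this ratio should be close to 4.
    r_plus = T(+1, N) / a**2
    # exp(a * eta) - 1 = a * eta + a^2 / 2 + O(a^3), using eta^2 = 1.
    taylor_err = max(
        abs(math.expm1(a * eta) - (a * eta + a**2 / 2)) for eta in (-1, 1)
    )
    ratios.append((N, r_minus, r_plus, taylor_err / a**3))

def grad(phi, l):
    """(grad_l phi)_x = phi_{x+l} - phi_x on the discrete torus."""
    n = len(phi)
    return [phi[(x + l) % n] - phi[x] for x in range(n)]

# Discrete Leibniz rule: (grad_l f) * g = grad_l(f * g) - (shifted f) * grad_l g,
# where (shifted f)_x = f_{x+l}. Integer test data keeps the check exact.
n, l = 12, -4
f = [(3 * x * x + 1) % 7 for x in range(n)]
g = [(5 * x + 2) % 11 for x in range(n)]
fg = [f[x] * g[x] for x in range(n)]
lhs = [grad(f, l)[x] * g[x] for x in range(n)]
rhs = [grad(fg, l)[x] - f[(x + l) % n] * grad(g, l)[x] for x in range(n)]
leibniz_ok = lhs == rhs
```

The normalized remainders in `ratios` stay bounded as N grows, which is the content of the $\mathrm {O}(\cdot )$ statements; the shifted factor in the Leibniz rule plays the role of $\widetilde {\mathfrak {q}}$ in equation (2.10).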
3 Proof of Theorem 1.8
At a high level, the proof of Theorem 1.8 is built on an analysis of the semidiscrete stochastic integral equation from Corollary 2.6. As with [Reference Dembo and Tsai19, Reference Yang49], our main goal will be to prove that only the first two terms therein contribute in the large-N limit in a ‘high probability’ sense. The last two terms on the RHS of this equation are easily shown to vanish in the large-N limit by deterministic and analytic estimates, at least if we assume that the Gartner transform and its space-time supremum are not totally ill behaved; such an assumption will ultimately be justified by the fact that the Gartner transform is supposed to resemble the solution of SHE on the compact torus, which itself is uniformly continuous in space-time. But the $\mathfrak {s}$ -term in Corollary 2.6 does not admit such an elementary analytic estimate since it does not necessarily have a deterministically small prefactor. The probabilistic approach we take to study the heat operator with the $\mathfrak {s}$ -functional is based on the feature that it vanishes at a level of ‘hydrodynamic limits’ since the global $\eta $ -density for our initial data is roughly zero, and by construction the expectation of $\mathfrak {s}$ with respect to the product Bernoulli measure of this $\eta $ -density is also zero. Equivalently, in the language of [Reference Dembo and Tsai19, Reference Yang49], the $\mathfrak {s}$ -term is ‘weakly vanishing’. We will make this ‘hydrodynamic’ argument precise in Lemma 3.17.
We are left with analyzing the order $N^{1/2}$ term in the stochastic equation of Corollary 2.6. Because $N^{1/2}$ certainly diverges in the large-N limit, neither the previous analytic nor the hydrodynamic limit arguments will succeed. In fact, if we replace the particle-system-dependent term $\bar {\mathfrak {q}}$ with any local $\mathfrak {f}$ that has ‘zero hydrodynamic limit’ like $\mathfrak {s}$ above, it is likely false that the corresponding heat operator term in Corollary 2.6 acting on $N^{1/2}\mathfrak {f}\mathbf {Z}^{N}$ will vanish in the large-N limit, based on the equilibrium calculations in [Reference Goncalves and Jara24], for example. Therefore, we must take advantage of $\bar {\mathfrak {q}}$ being the local functional $\widetilde {\mathfrak {q}}$ after subtracting off its ‘leading order’ behavior beyond the hydrodynamic limit when averaged in space-time against the heat kernel and $\mathbf {Z}^{N}$ . We will do this through a nonstationary first-order Boltzmann–Gibbs principle, which will require a combination of analytic and probabilistic ingredients. The analytic considerations required mainly amount to regularity estimates of $\mathbf {Z}^{N}$ , which by calculus implies regularity of $\mathbf {h}^{N}$ and, by definition, controls local invariant measures that are parameterized by $\eta $ -density. For this reason, we first define the following stopping times, which uniformly control $\mathbf {Z}^{N}$ and its space-time regularity. In the construction below, we will require a strange integer condition that is ultimately unnecessary; it will just make presentation later in the paper clearer and more convenient.
Definition 3.1. Consider $\varepsilon _{\mathrm {ap}}>0$ arbitrarily small but bounded below uniformly and chosen so that $N^{\varepsilon _{\mathrm {ap}}}$ is an integer. We note this may force $\varepsilon _{\mathrm {ap}}$ to be N-dependent, but this is okay; we only need its uniform positivity and smallness. Define
where $\|\|_{\mathrm {t};\mathbb {K}}$ is the $\mathscr {L}^{\infty }([0,\mathrm {t}]\times \mathbb {K})$ -norm. We now introduce space-time scales on which we want a priori regularity estimates:
• We first define $\mathbb {I}^{\mathbf {T},1}\overset {\bullet }=\{N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}\}_{\mathfrak {j}\geqslant 0}\cap [0,N^{-1}]$ . Observe that $N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}$ are positive integer multiples of $N^{-2}$ .
• We now define $\mathbb {I}^{\mathbf {T}}\overset {\bullet }=\ \{\mathfrak {k}N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}\}$ , in which $1\leqslant \mathfrak {k}\leqslant N^{\varepsilon _{\mathrm {ap}}}$ and $\mathfrak {j}\geqslant 0$ ranges over all indices for which $N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}\leqslant N^{-1}$ .
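As a quick check of the integrality claim in the first bullet point, write $M=N^{\varepsilon _{\mathrm {ap}}}$ , which is a positive integer by Definition 3.1 (the symbol $M$ is introduced here only for illustration). Then

$$ \begin{align*} N^{-2+\mathfrak{j}\varepsilon_{\mathrm{ap}}} \ = \ (N^{\varepsilon_{\mathrm{ap}}})^{\mathfrak{j}}N^{-2} \ = \ M^{\mathfrak{j}}N^{-2} \quad \mathrm{and} \quad \mathfrak{k}N^{-2+\mathfrak{j}\varepsilon_{\mathrm{ap}}} \ = \ \mathfrak{k}M^{\mathfrak{j}}N^{-2}, \end{align*} $$

so every element of $\mathbb {I}^{\mathbf {T},1}$ and of $\mathbb {I}^{\mathbf {T}}$ is a positive integer multiple of $N^{-2}$ . Moreover, the constraint $N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}\leqslant N^{-1}$ is equivalent to $\mathfrak {j}\leqslant \varepsilon _{\mathrm {ap}}^{-1}$ , so there are at most $\varepsilon _{\mathrm {ap}}^{-1}+1$ admissible indices $\mathfrak {j}$ , and hence $\mathbb {I}^{\mathbf {T}}$ has at most $N^{\varepsilon _{\mathrm {ap}}}(\varepsilon _{\mathrm {ap}}^{-1}+1)$ elements.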
We also define/assume $\varepsilon _{\mathrm {RN}}=999^{-999}\geqslant 999^{999}\varepsilon _{\mathrm {ap}}$ and then define the length scale $\mathfrak {l}_{N}\overset {\bullet }=N^{1/2+\varepsilon _{\mathrm {RN}}}$ . We now define the two stopping times below in which we recall $\nabla ^{\mathbf {X}}_{\mathfrak {l}}\phi _{x}=\phi _{x+\mathfrak {l}}-\phi _{x}$ and $\nabla ^{\mathbf {T}}_{\mathrm {s}}\psi _{t,x}=\psi _{(1\wedge (t+\mathrm {s}))\vee 0,x}-\psi _{t,x}$ for $(t,x)\in [0,1]\times \mathbb {T}_{N}\kern-1.5pt$ :
We conclude by defining the stopping time $\mathfrak {t}_{\mathrm {st}}={\mathfrak {t}_{\mathrm {ap}}}\wedge \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}\wedge \mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ that is contained in $[0,1]$ with probability 1 and whose purpose is to supply a priori space-time control on the Gartner transform. Let us clarify that the utility behind the two regularity stopping times $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ will be to yield a priori estimates that are necessary to perform a renormalization scheme during the proof of the Boltzmann–Gibbs principle, while the utility behind $\mathfrak {t}_{\mathrm {ap}}$ is to avoid having to simultaneously apply probabilistic and analytic estimates to study particle system data and control $\mathbf {Z}^{N}\kern-1.5pt$ , the latter being ignorable if we look before the stopping time $\mathfrak {t}_{\mathrm {ap}}$ .
Remark 3.2. The stopping time $\mathfrak {t}_{\mathrm {ap}}$ also gives a priori lower bounds on $\mathbf {Z}^{N}\kern-1.5pt$ . This will be important in the proof of the Boltzmann–Gibbs principle. In particular, we require regularity estimates of the height function. However, since the height function solves an equation that becomes a singular SPDE in the large-N limit and because singular SPDE analysis becomes difficult to conduct at the level of the particle system, we instead deduce regularity of height functions in terms of regularity of the Gartner transform as the Gartner transform equation becomes a nonsingular SPDE in the large-N limit. Calculus then tells us that a priori upper and lower bounds for the Gartner transform suffice to deduce regularity of the height function.
Remark 3.3. We expect the Gartner transform to look like the solution of SHE in the large-N limit, which, roughly speaking, has Holder regularity with exponent $2^{-1}$ in space and with exponent $4^{-1}$ in time. Therefore, the conditions/inequalities defining the stopping times $\mathfrak {t}_{\mathrm {ap}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ are actually quite lenient because of the $N^{\varepsilon _{\mathrm {ap}}}$ factors and the assumption that $\varepsilon _{\mathrm {ap}}>0$ is universal and thus uniformly bounded from below. In particular, we will eventually be able to show that these three stopping times are all equal to 1 with sufficiently high probability, so their a priori estimates ‘self-propagate’.
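To illustrate the leniency claimed in Remark 3.3, here is a heuristic computation. It assumes, as the remark suggests but does not prove, that $\mathbf {Z}^{N}$ has the space-time Holder regularity of SHE in macroscopic coordinates, in which a lattice displacement $\mathfrak {l}$ corresponds to the macroscopic distance $\mathfrak {l}N^{-1}$ and time is already macroscopic:

$$ \begin{align*} |\nabla^{\mathbf{X}}_{\mathfrak{l}}\mathbf{Z}^{N}_{t,x}| \ \approx \ (\mathfrak{l}N^{-1})^{1/2} \quad \mathrm{and} \quad |\nabla^{\mathbf{T}}_{\mathrm{s}}\mathbf{Z}^{N}_{t,x}| \ \approx \ \mathrm{s}^{1/4}. \end{align*} $$

For instance, at the length scale $\mathfrak {l}=\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ the spatial gradient would be of order $N^{-1/4+\varepsilon _{\mathrm {RN}}/2}$ , and at the smallest timescale $\mathrm {s}=N^{-2}$ in $\mathbb {I}^{\mathbf {T}}$ the time gradient would be of order $N^{-1/2}$ . The additional factors of $N^{\varepsilon _{\mathrm {ap}}}$ mentioned in the remark then leave a polynomial-in-N margin over these heuristic sizes.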
Remark 3.4. The constant $999^{999}$ in $\varepsilon _{\mathrm {RN}}\geqslant 999^{999}\varepsilon _{\mathrm {ap}}$ can be replaced by any sufficiently large but universal constant.
To take advantage of stopping times in Definition 3.1, we now introduce the following auxiliary processes, the first of which stops the Gartner transform at the minimum stopping time $\mathfrak {t}_{\mathrm {st}}$ and the second of which evolves according to the same type of SHE dynamic as the Gartner transform though ignoring space-time sets where the conditions defining $\mathfrak {t}_{\mathrm {st}}$ fail, thus making the second auxiliary process amenable to the analysis of this paper, including the proof of the Boltzmann–Gibbs principle.
Definition 3.5. Define $\mathbf {Y}^{N}_{T,x}=\mathbf {Z}^{N}_{T,x}\mathbf {1}(T\leqslant \mathfrak {t}_{\mathrm {st}})$ , and define the process $\mathbf {U}^{N}$ on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}$ via the stochastic equation
where $\nabla _{\star }^{!}$ means what it does in Proposition 2.4.
Remark 3.6. The product $\mathbf {U}^{N}\mathrm {d}\xi ^{N}$ denotes compensated jumps of a martingale, where the jumps at $(T,x)$ are given by the jumps of $\mathrm {d}\xi ^{N}$ at $(T,x)$ from Proposition 2.4 times the value $\mathbf {U}^{N}$ at $(T,x)$ . In fact, whenever we write a product of a space-time function and $\mathrm {d}\xi ^{N}\kern-1.5pt$ , we mean exactly this where $\mathbf {U}^{N}$ is replaced by said space-time function. We additionally observe that for any functions $\mathbf {F}_{1},\mathbf {F}_{2}$ , we have the identity $\mathbf {F}_{1}\mathrm {d}\xi ^{N}-\mathbf {F}_{2}\mathrm {d}\xi ^{N}=(\mathbf {F}_{1}-\mathbf {F}_{2})\mathrm {d}\xi ^{N}$ , as $\mathbf {F}_{1}\mathrm {d}\xi ^{N}$ and $\mathbf {F}_{2}\mathrm {d}\xi ^{N}$ are coupled and always jump together.
To justify studying the $\mathbf {U}^{N}$ process, let us observe that on the event $\mathfrak {t}_{\mathrm {st}}=1$ we have not changed the $\mathbf {Z}^{N}$ equation in Corollary 2.6 and have simply defined $\mathbf {U}^{N}$ with the same stochastic equation. Because the stochastic equation is linear in the solution $\mathbf {U}^{N}$ , we have uniqueness of solutions with the same initial data by elementary considerations, and thus $\mathbf {Z}^{N}=\mathbf {U}^{N}$ on such an event. In general, we have this identification between $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ until $\mathfrak {t}_{\mathrm {st}}$ regardless of its value.
Lemma 3.7. Provided any $\mathrm {t}\in [0,1]$ , we have the containment of events $\{\mathfrak {t}_{\mathrm {st}}=\mathrm {t}\}\subseteq \cap _{0\leqslant \mathrm {s}\leqslant \mathrm {t}}\cap _{x\in \mathbb {T}_{N}}\{\mathbf {Z}^{N}_{\mathrm {s},x}=\mathbf {U}^{N}_{\mathrm {s},x}\}$ .
We reiterate that working with the $\mathbf {U}^{N}$ process will be convenient because of the a priori space-time regularity estimates built into the $\mathbf {Y}^{N}$ process therein, while Lemma 3.7 guarantees us $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ are equal on the event where $\mathfrak {t}_{\mathrm {st}}=1$ which, as noted in Remark 3.3, we will show happens with sufficiently high probability. Then, after taking advantage of the cutoff in the stopping time $\mathfrak {t}_{\mathrm {st}}$ , we compare $\mathbf {U}^{N}$ to the following process that forgets the order $N^{1/2}$ term in the $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ equations.
Definition 3.8. Define the process $\mathbf {Q}^{N}$ on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}$ via the following stochastic integral equation
where $\nabla _{\star }^{!}$ means what it does in Proposition 2.4.
We will now introduce the three key ingredients in the proof of Theorem 1.8. The first ingredient shows $\mathfrak {t}_{\mathrm {st}}=1$ with a notion of high probability we will introduce shortly. This first step allows us to deduce Theorem 1.8 from itself but replacing $\mathbf {Z}^{N}$ therein with $\mathbf {U}^{N}$ introduced in Definition 3.5. The second step then compares $\mathbf {U}^{N}$ with the auxiliary process $\mathbf {Q}^{N}$ in Definition 3.8. Proofs of these two ingredients require the Boltzmann–Gibbs principle and take up the majority of this paper. The third step is to then prove Theorem 1.8 but replacing $\mathbf {Z}^{N}$ with $\mathbf {Q}^{N}\kern-1.5pt$ . This last step is fairly standard, as noted at the beginning of this section.
Definition 3.9. Consider any generic event $\mathcal {E}$ . In the following, we think of constants $\delta>0$ as arbitrarily small but universal, and we think of constants $\kappa \geqslant 0$ as arbitrarily large but universal.
• We say $\mathcal {E}$ holds with high probability if for any $\delta>0$ , we have $\mathbf {P}(\mathcal {E}^{C})\leqslant \delta +C_{\delta }\mathrm {o}_{N}\kern-1pt$ , where $\mathrm {o}_{N}\to _{N\to \infty }0$ uniformly in $\delta $ .
• We say $\mathcal {E}$ holds with overwhelming probability if for any $\kappa \geqslant 0$ , we have $\mathbf {P}(\mathcal {E}^{C})\lesssim _{\kappa }N^{-\kappa }\kern-1pt$ .
Remark 3.10. Any event $\mathcal {E}$ that satisfies the probability estimate $\mathbf {P}(\mathcal {E}^{C})\lesssim N^{-\beta }$ for some, but not necessarily every, constant $\beta>0$ holds with high probability because we may take $\mathrm {o}_{N}=N^{-\beta }$ . However, it does not necessarily hold with overwhelming probability.
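The claim in Remark 3.10 can be spelled out in one line. Below, $C$ denotes the implied constant in the assumed bound; it is introduced here only for illustration. For every $\delta>0$ ,

$$ \begin{align*} \mathbf{P}(\mathcal{E}^{C}) \ \leqslant \ CN^{-\beta} \ \leqslant \ \delta+CN^{-\beta}, \end{align*} $$

so the condition in Definition 3.9 holds with $C_{\delta }=C$ and $\mathrm {o}_{N}=N^{-\beta }$ , neither of which depends on $\delta $ . In the other direction, taking $\kappa =1$ in the definition of overwhelming probability gives $\mathbf {P}(\mathcal {E}^{C})\lesssim N^{-1}$ , so overwhelming probability implies high probability, whereas a bound $N^{-\beta }$ for one fixed $\beta $ says nothing about $N^{-\kappa }$ for $\kappa>\beta $ .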
Proposition 3.11. The event $\{\mathfrak {t}_{\mathrm {st}}=1\}$ holds with high probability.
Proposition 3.12. Define the difference process $\mathbf {D}^{N}=\mathbf {U}^{N}-\mathbf {Q}^{N}$ on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\kern-1.5pt$ . There exists a universal constant $\beta>0$ such that the event $\{\|\mathbf {D}^{N}\|_{1;\mathbb {T}_{N}}\lesssim N^{-\beta }\}$ holds with high probability, where the implied constant is also universal.
Proposition 3.13. The rescaled process $\Gamma ^{N}\mathbf {Q}^{N}$ is tight in the large-N limit in the Skorokhod space $\mathscr {D}_{1}$ . Moreover, every limit point in $\mathscr {D}_{1}$ of $\Gamma ^{N}\mathbf {Q}^{N}$ is the solution of $\ \mathrm {SHE}(\bar {\mathfrak {d}})$ with initial data equal to the spatially rescaled initial data $\lim _{N\to \infty }\Gamma ^{N,\mathbf {X}}\mathbf {Z}^{N}\kern-1.5pt$ .
Proof of Theorem 1.8.
Proposition 3.11 shows the difference $\mathbf {Z}^{N}-\mathbf {U}^{N}$ converges to 0 in probability in the Skorokhod space $\mathscr {D}_{1}$ ; with high probability the difference is identically the zero process in $\mathscr {D}_{1}$ . Proposition 3.12 shows the difference $\mathbf {D}^{N}=\mathbf {U}^{N}-\mathbf {Q}^{N}$ also converges to 0 in probability in $\mathscr {D}_{1}$ because Proposition 3.12 shows that $\mathbf {D}^{N}$ converges to 0 in probability with respect to the uniform metric on $\mathscr {D}_{1}$ , and the uniform metric on $\mathscr {D}_{1}$ is stronger than the Skorokhod topology on $\mathscr {D}_{1}$ . To justify the last claim, it is enough to take the identity function within the infimum on the RHS of (12.13) in [Reference Billingsley, Wiley and Sons4]; though [Reference Billingsley, Wiley and Sons4] studies just $\mathbb R$ -valued processes, the same is true of processes valued in any separable Banach space, including $\mathscr {C}(\mathbb {T}^{1})$ . Combining these two observations implies $\mathbf {Z}^{N}-\mathbf {Q}^{N}\to 0$ in $\mathscr {D}_{1}$ . Finally, observe that Proposition 3.13 implies $\Gamma ^{N}\mathbf {Q}^{N}$ converges to what we propose $\Gamma ^{N}\mathbf {Z}^{N}$ converges to in $\mathscr {D}_{1}$ . Because $\Gamma ^{N}$ is a rescaling operator that is continuous with respect to the Skorokhod topology on $\mathscr {D}_{1}$ , as it only rescales in space, standard probability shows $\Gamma ^{N}\mathbf {Z}^{N}\to \mathrm {SHE}(\bar {\mathfrak {d}})$ in $\mathscr {D}_{1}$ with initial data $\lim _{N\to \infty }\Gamma ^{N,\mathbf {X}}\mathbf {Z}^{N}$ .
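The comparison between the uniform metric and the Skorokhod topology used in this proof can be made explicit. In the following sketch, $\mathrm {d}^{\circ }$ denotes the Skorokhod metric, $\Lambda $ the class of increasing homeomorphisms of $[0,1]$ and $\mathrm {I}$ the identity map; this notation follows the standard form of (12.13) in [Reference Billingsley, Wiley and Sons4] and is not used elsewhere in this paper. Choosing $\lambda =\mathrm {I}$ in the infimum gives

$$ \begin{align*} \mathrm{d}^{\circ}(\mathbf{x},\mathbf{y}) \ = \ \inf_{\lambda\in\Lambda}\max\Big\{\sup_{\mathrm{t}\in[0,1]}|\lambda(\mathrm{t})-\mathrm{t}|, \ \sup_{\mathrm{t}\in[0,1]}\|\mathbf{x}_{\mathrm{t}}-\mathbf{y}_{\lambda(\mathrm{t})}\|\Big\} \ \leqslant \ \sup_{\mathrm{t}\in[0,1]}\|\mathbf{x}_{\mathrm{t}}-\mathbf{y}_{\mathrm{t}}\|. \end{align*} $$

Combined with the triangle inequality, this yields

$$ \begin{align*} \|\mathbf{Z}^{N}-\mathbf{Q}^{N}\|_{1;\mathbb{T}_{N}} \ \leqslant \ \|\mathbf{Z}^{N}-\mathbf{U}^{N}\|_{1;\mathbb{T}_{N}}+\|\mathbf{D}^{N}\|_{1;\mathbb{T}_{N}}, \end{align*} $$

where the first term on the RHS vanishes on the event $\{\mathfrak {t}_{\mathrm {st}}=1\}$ by Lemma 3.7 and the second is at most of order $N^{-\beta }$ with high probability by Proposition 3.12.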
We have now established Theorem 1.8 by taking Proposition 3.11, Proposition 3.12 and Proposition 3.13 for granted. Again, the proofs of Proposition 3.11 and Proposition 3.12 will be the purpose of the rest of this paper after the current section, and for this we establish a version of the nonstationary first-order Boltzmann–Gibbs principle. On the other hand, the proof of Proposition 3.13 is fairly straightforward, as we alluded to near the beginning of this section, provided the analysis in [Reference Dembo and Tsai19], especially Lemma 2.5 therein and its proof. Also, Proposition 3.13 will be important in establishing Proposition 3.11 and Proposition 3.12 because it will yield a priori stability/bounds for studying the $\mathbf {Z}^{N}$ -SPDE. With this in mind, we prove Proposition 3.13 in the current section. We will then conclude this section by writing an outline and discussion of the proofs of Proposition 3.11 and Proposition 3.12 whose details we devote the rest of this paper to. We first make the following point to avoid circular logic. Proposition 3.13 consists of tightness of $\Gamma ^{N}\mathbf {Q}^{N}$ and identification of limit points. Tightness is independent of Propositions 3.11 and 3.12, but identification of limit points uses Propositions 3.11 and 3.12. Propositions 3.11 and 3.12, in turn, require tightness of $\Gamma ^{N}\mathbf {Q}^{N}$ .
3.1 Proof of Proposition 3.13
Following the proof of Theorem 1.1 in [Reference Dembo and Tsai19], which is given in Section 3 of [Reference Dembo and Tsai19], let us first show the tightness claim in Proposition 3.13 with moment estimates for the auxiliary process $\mathbf {Q}^{N}$ ; this is the analog of Proposition 3.2 and Corollary 3.3 in [Reference Dembo and Tsai19]. Afterwards, we will identify subsequential limit points of $\Gamma ^{N}\mathbf {Q}^{N}$ in the Skorokhod space $\mathscr {D}_{1}$ as $\mathrm {SHE}(\bar {\mathfrak {d}})$ . This is the analog of the analysis behind Section 3.2 in [Reference Dembo and Tsai19], and it similarly amounts to proving that all the limit points of $\Gamma ^{N}\mathbf {Q}^{N}$ satisfy a martingale problem associated to $\mathrm {SHE}(\bar {\mathfrak {d}})$ . We emphasize there is no real difficulty in choosing nonzero $\bar {\mathfrak {d}}$ .
Lemma 3.14. The sequence $\Gamma ^{N}\mathbf {Q}^{N}$ is tight with respect to the Skorokhod topology on $\mathscr {D}_{1}$ .
Proof. Provided heat kernel estimates and martingale estimates in Appendix A, the moment estimates in Proposition 3.2 in [Reference Dembo and Tsai19] for the Gartner transform therein hold for $\mathbf {Q}^{N}$ , as long as we remove the exponential weights therein and then replace the spatial gradients therein with spatial gradients on the torus $\mathbb {T}_{N}\kern-1.5pt$ . Thus, it suffices to follow the proof of Corollary 3.3 in [Reference Dembo and Tsai19].
We are now left with identifying limit points in $\mathscr {D}_{1}$ of the sequence $\Gamma ^{N}\mathbf {Q}^{N}$ . As we briefly alluded to above, we will require the following martingale problem formulation of $\mathrm {SHE}(\bar {\mathfrak {d}})$ . Technically, the following martingale problem formulation of $\mathrm {SHE}(\bar {\mathfrak {d}})$ differs from the martingale problems for SPDEs introduced and employed in [Reference Bertini and Giacomin3, Reference Dembo and Tsai19] and related papers unless $\bar {\mathfrak {d}}=0$ because of the additional first-order linear transport operator. But this transport is lower order and introduces only a linear drift.
Definition 3.15. Let us first choose any pair of spatial test functions $\psi _{1;\cdot },\psi _{2;\cdot }:\mathbb {T}\to \mathbb R$ and any pair of space-time test functions $\phi _{1;\cdot ,\cdot },\phi _{2;\cdot ,\cdot }:\mathbb R_{\geqslant 0}\times \mathbb {T}\to \mathbb R$ , where we recall that $\mathbb {T}=\mathbb R/\mathbb Z$ is the unit torus. We also define the following bilinear pairings.
• Define $\langle \psi _{1;\cdot },\psi _{2;\cdot }\rangle _{\mathbb {T}}=\int _{\mathbb {T}}\psi _{1;x}\psi _{2;x}\mathrm {d} x$ and for $\mathrm {t}\in [0,1]$ , let $\langle \phi _{1;\cdot ,\cdot },\phi _{2;\cdot ,\cdot }\rangle _{\mathrm {t};\mathbb {T}}=\int _{0}^{\mathrm {t}}\langle \phi _{1;\mathrm {s},\cdot },\phi _{2;\mathrm {s},\cdot }\rangle _{\mathbb {T}}\mathrm {d}\mathrm {s}$ .
Consider a possibly random continuous process $\mathbf {S}_{\cdot ,\cdot }\in \mathscr {C}_{1}$ . Let us say $\mathbf {S}_{\cdot ,\cdot }$ solves the $\mathrm {SHE}(\bar {\mathfrak {d}})$ martingale problem if:
• We have $\mathbf {S}_{0,\cdot }=\lim _{N\to \infty }\Gamma ^{N,\mathbf {X}}\mathbf {Z}^{N}$ and the second moment bound $\|\mathbf {S}_{\mathrm {t},x}\|_{\omega ;2}\lesssim 1$ uniformly over all $\mathrm {t}\in [0,1]$ and $x\in \mathbb {T}$ .
• Let $\mathscr {L}^{\ast }$ be the formal adjoint of the continuum differential operator $\mathscr {L}=2^{-1}\Delta +\bar {\mathfrak {d}}\nabla $ with respect to the Lebesgue measure on $\mathbb {T}$ . For any smooth, time-constant test function $\phi \in \mathscr {C}^{\infty }(\mathbb {T})$ , the following are local $\mathbb R$ -valued martingales in $\mathrm {t}\in [0,1]$ :
(3.4) $$ \begin{align} \mathbf{m}_{\mathrm{t}}(\phi) \ \overset{\bullet}= \ \langle\phi,\mathbf{S}_{\mathrm{t},\cdot}\rangle_{\mathbb{T}}-\langle\phi,\mathbf{S}_{0,\cdot}\rangle_{\mathbb{T}}-\langle\mathscr{L}^{\ast}\phi,\mathbf{S}_{\cdot,\cdot}\rangle_{\mathrm{t};\mathbb{T}} \quad \mathrm{and} \quad \mathbf{m}_{\mathrm{t}}(\phi)^{2}-\langle\phi^{2},\mathbf{S}_{\cdot,\cdot}^{2}\rangle_{\mathrm{t};\mathbb{T}}. \end{align} $$
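For concreteness, the adjoint $\mathscr {L}^{\ast }$ can be computed by integrating by parts on the torus, where no boundary terms appear. The following sketch assumes that $\bar {\mathfrak {d}}$ is a constant and uses a generic smooth function $\mathbf {f}$ on $\mathbb {T}$ , introduced here only for illustration:

$$ \begin{align*} \langle\phi,\mathscr{L}\mathbf{f}\rangle_{\mathbb{T}} \ = \ \int_{\mathbb{T}}\phi_{x}\big(2^{-1}\Delta\mathbf{f}_{x}+\bar{\mathfrak{d}}\nabla\mathbf{f}_{x}\big)\mathrm{d}x \ = \ \int_{\mathbb{T}}\big(2^{-1}\Delta\phi_{x}-\bar{\mathfrak{d}}\nabla\phi_{x}\big)\mathbf{f}_{x}\mathrm{d}x \ = \ \langle\mathscr{L}^{\ast}\phi,\mathbf{f}\rangle_{\mathbb{T}}, \end{align*} $$

so that $\mathscr {L}^{\ast }=2^{-1}\Delta -\bar {\mathfrak {d}}\nabla $ under this assumption. In particular, the first process in (3.4) being a martingale is, formally, the weak formulation of an equation whose drift is $\mathscr {L}\mathbf {S}$ , tested against $\phi $ .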
In [Reference Bertini and Giacomin3, Reference Dembo and Tsai19], the key feature of the martingale problem for $\mathrm {SHE}=\mathrm {SHE}(0)$ is that any solution is equal to the mild solution as probability measures on the path-space $\mathscr {C}_{1}$ . It turns out solutions of $\mathrm {SHE}(\bar {\mathfrak {d}})$ share a similar property since $\mathrm {SHE}(\bar {\mathfrak {d}})$ is still linear.
Lemma 3.16. If $\mathbf {S}\in \mathscr {C}_{1}$ is a solution of the $\mathrm {SHE}(\bar {\mathfrak {d}})$ martingale problem, then $\mathbf {S}=\mathrm {SHE}(\bar {\mathfrak {d}})$ as probability measures on $\mathscr {C}_{1}$ .
Let us now identify limit points in $\mathscr {D}_{1}$ of the sequence $\Gamma ^{N}\mathbf {Q}^{N}$ as $\mathrm {SHE}(\bar {\mathfrak {d}})$ with the martingale problem in Definition 3.15 and the uniqueness result Lemma 3.16. To this end, we follow Section 3.2 of [Reference Dembo and Tsai19]. The first step is to compute the predictable bracket of the martingale differential $\mathbf {Q}^{N}\mathrm {d}\xi ^{N}$ . The environment dependence is lower order, so it is negligible in the large-N limit. We ultimately get a predictable bracket equal to that in Proposition 3.4 in [Reference Dembo and Tsai19] for $m=1$ , after replacing all Gartner transforms therein with $\mathbf {Q}^{N}$ . The next step to identify $\Gamma ^{N}\mathbf {Q}^{N}$ is then the following hydrodynamic limit, as was the situation in Section 3.2 in [Reference Dembo and Tsai19]. Thus, the proof of Proposition 3.13 amounts to proving the following parallel to Lemma 2.5 in [Reference Dembo and Tsai19].
Lemma 3.17. Consider any local function $\mathfrak {f}:\Omega \to \mathbb R$ whose support is a uniformly bounded neighborhood of $0\in \mathbb {T}_{N}\kern-1.5pt$ . Suppose $\mathbf {E}_{0}\mathfrak {f}=0$ , in which $\mathbf {E}_{0}$ is the expectation with respect to the product Bernoulli measure on $\Omega $ whose one-dimensional marginals vanish in expectation. Let us define the space-time shift $\mathfrak {f}_{S,y}=\tau _{y}\mathfrak {f}(\eta _{S})$ . For any $\phi \in \mathscr {C}^{\infty }(\mathbb {T})$ and $\mathrm {t}\in [0,1]$ , we have
Proof. Recall $\mathbf {Q}^{N}$ satisfies spatial regularity on macroscopic length scales as we explained in the proof of Lemma 3.14. Thus, by following the proof of Lemma 2.5 in [Reference Dembo and Tsai19], it suffices to replace $\mathfrak {f}_{S,y}$ in the proposed limit with its spatial average over a block of length $\delta N^{1/2}$ with $\delta>0$ small but taken to zero after taking the large-N limit. At this point, following the proof of Lemma 2.5 from [Reference Dembo and Tsai19] suffices because we also have the pointwise moment estimate to bound $\mathbf {Q}^{N}$ as argued in the proof of Lemma 3.14, and because $\mathbf {E}_{0}\mathfrak {f}=0$ , the one-block and two-blocks schemes from Section 4 of [Reference Dembo and Tsai19] let us replace the block average of $\mathfrak {f}$ by the block average of $\eta $ ; these steps are successful here as well by virtue of entropy production estimates in a finite volume of order N that are even better than the entropy production in Lemma 4.1 in [Reference Dembo and Tsai19] that was used in the proof of Lemma 2.5 in [Reference Dembo and Tsai19]. Lastly, to estimate the block average of $\eta $ of length $\delta N^{1/2}$ with $\delta>0$ vanishing after the large-N limit, it suffices to note $\mathfrak {t}_{\mathrm {st}}=1$ with high probability by Proposition 3.11, and $\mathfrak {t}_{\mathrm {st}}=1$ implies regularity estimates for $\mathbf {Z}^{N}$ that are used to show the vanishing of the $\eta $ -block average at hand; see the proof of Lemma 2.5 in [Reference Dembo and Tsai19] for more details on this last point. This completes the proof.
3.2 Strategy
Recall $\bar {\mathfrak {q}}$ in Definition 2.2; it is the correction of a local statistic by its hydrodynamic limit and appropriate linear projection. In what follows, all we need from boldface objects is that they are possibly random but have space-time regularity at worst matching SHE or KPZ, and all we need from the $\mathbf {H}^{N}$ operator is that it is integration in space-time against a reasonably smooth test function (though in this paper, we specialize to the heat operator in Definition 2.5).
Proposition 3.11 and Proposition 3.12 effectively follow by showing the order $N^{1/2}$ -term in the $\mathbf {U}^{N}$ equation is small. Indeed, this would imply the estimate in Proposition 3.12 by standard methods for linear equations, as the $\mathbf {U}^{N}$ equation in Definition 3.5 and the $\mathbf {Q}^{N}$ equation in Definition 3.8 differ only in this $N^{1/2}$ term. On the other hand, provided that $\mathbf {U}^{N}\approx \mathbf {Q}^{N}$ via Proposition 3.12, space-time estimates for $\mathbf {U}^{N}$ are inherited from those for $\mathbf {Q}^{N}$ , which we have already shown behaves like $\mathrm {SHE}(\bar {\mathfrak {d}})$ and therefore satisfies significantly improved versions of the estimates defining $\mathfrak {t}_{\mathrm {st}}$ , namely replacing $\varepsilon _{\mathrm {ap}}$ therein with $\varepsilon _{\mathrm {ap}}/999$ for example. Thus, $\mathbf {U}^{N}$ satisfies improved versions of the regularity estimates defining $\mathfrak {t}_{\mathrm {st}}$ . Using Lemma 3.7, this implies that $\mathbf {Z}^{N}$ satisfies the same estimates before the stopping time $\mathfrak {t}_{\mathrm {st}}$ , after which we may extend these estimates past the stopping time $\mathfrak {t}_{\mathrm {st}}$ upon directly studying $\mathbf {Z}^{N}$ on very short/submicroscopic timescales. This shows the estimates defining $\mathfrak {t}_{\mathrm {st}}$ are self-propagating, and thus $\mathfrak {t}_{\mathrm {st}}=1$ , with high probability as claimed in Proposition 3.11.
We now discuss showing the order $N^{1/2}$ -term in the $\mathbf {U}^{N}$ equation from Definition 3.5 is small, which we state as the following heuristic that is usually known as a Boltzmann–Gibbs principle; we will prove a stronger version of the following.
Heuristic 3.18. We have the convergence-in-probability $\|\mathbf {H}^{N}(N^{1/2}\bar {\mathfrak {q}}\mathbf {Y}^{N})\|_{1;\mathbb {T}_{N}}\to 0$ in the large-N limit.
3.2.1 Approach via mesoscopic equilibrium
The strategy for Heuristic 3.18 is based on replacing $\bar {\mathfrak {q}}$ by its invariant measure expectation via ergodic theory on the mesoscopic length scale $N^{1/2+\varepsilon _{\mathrm {RN}}}$ . By invariant measure, we technically mean a canonical measure expectation of parameter $\sigma $ given by the $\eta $ -density on a block of length of order $N^{1/2+\varepsilon _{\mathrm {RN}}}$ ; see Definition 4.4 for what this means. The philosophy, coming from [Reference Guo, Papnicolaou and Varadhan28], of this approach is that the particle system evolves on extremely fast $N^{2}$ timescales, and thus on mesoscopic/local scales, relaxation to invariant measure happens quickly.
To make this discussion a little more concrete, we will present evidence of the following statement. Again, we refer the reader to Definition 4.4 for the canonical measure used below. We also refer the reader to Lemma 1 and Lemma 3 in [Reference Goncalves and Jara24] for another set of results that are slightly weaker but philosophically analogous to the following.
Heuristic 3.19. Let $\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}(\tau _{y}\eta _{S})$ be the canonical measure expectation of $\bar {\mathfrak {q}}_{S,y}$ on a block of length scale $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ whose $\sigma $ -parameter is equal to the $\eta $ -density in a length $\mathfrak {l}_{N}$ neighborhood of the support of $\bar {\mathfrak {q}}_{S,y}$ . We have the convergence in probability $\|\mathbf {H}^{N}(N^{1/2}(\bar {\mathfrak {q}}_{S,y}-{\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}(\tau _{y}\eta _{S})}))\|_{1;\mathbb {T}_{N}}\to 0$ in the large-N limit.
To explain the benefit of Heuristic 3.19, Proposition 8 in [Reference Goncalves and Jara24] basically shows $|{\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}(\tau _{y}\eta _{S})}|\lesssim \mathfrak {l}_{N}^{-1}=N^{-1/2-\varepsilon _{\mathrm {RN}}}$ . This beats $N^{1/2}$ , so it remains to prove Heuristic 3.19. We clarify that this bound only holds if the $\eta $ -density at scale $\mathfrak {l}_{N}$ is controlled by $\mathfrak {l}_{N}^{-1/2}$ , which is basically equivalent to $\mathbf {h}^{N}$ and $\mathbf {Z}^{N}$ satisfying regularity estimates defining $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ in Definition 3.1.
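The arithmetic behind the phrase ‘this beats $N^{1/2}$ ’ can be sketched as follows. The sketch assumes that the factor $\mathbf {Y}^{N}$ from Heuristic 3.18 is carried along and that the heat operator $\mathbf {H}^{N}$ is bounded on $\mathscr {L}^{\infty }$ with a universal constant; both are assumptions made for illustration rather than statements taken from the text:

$$ \begin{align*} \|\mathbf{H}^{N}(N^{1/2}\mathsf{E}^{\mathrm{can}}_{1/2+\varepsilon_{\mathrm{RN}}}\mathbf{Y}^{N})\|_{1;\mathbb{T}_{N}} \ \lesssim \ N^{1/2}\cdot N^{-1/2-\varepsilon_{\mathrm{RN}}}\cdot\|\mathbf{Y}^{N}\|_{1;\mathbb{T}_{N}} \ = \ N^{-\varepsilon_{\mathrm{RN}}}\|\mathbf{Y}^{N}\|_{1;\mathbb{T}_{N}}. \end{align*} $$

The RHS vanishes in the large-N limit as long as $\|\mathbf {Y}^{N}\|_{1;\mathbb {T}_{N}}$ grows more slowly than $N^{\varepsilon _{\mathrm {RN}}}$ , which is consistent with the choice $\varepsilon _{\mathrm {RN}}\geqslant 999^{999}\varepsilon _{\mathrm {ap}}$ in Definition 3.1 if the a priori bound on $\mathbf {Y}^{N}$ is a small power of N of order $\varepsilon _{\mathrm {ap}}$ . By the triangle inequality, Heuristic 3.18 would then follow from Heuristic 3.19.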
3.2.2 Evidence for Heuristic 3.19
The replacement that we proposed in Heuristic 3.19 will not be performed in one step. We need to replace $\bar {\mathfrak {q}}$ by its canonical measure expectations on progressively larger length scales until we hit $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ , with every length scale being a small but universal power of N larger than the previous scale. This ‘renormalization’ is similar to that of Lemma 2 in [Reference Goncalves and Jara24]; see also proofs of Theorems 1.1 and 1.2 in [Reference Sethuraman and Xu43]. In what follows, $\log _{N}(\mathrm {a})=\frac {\log \mathrm {a}}{\log N}$ is the base N logarithm.
Heuristic 3.20. For $\mathfrak {l}\in \mathbb Z_{\geqslant 0}$ larger than the length of the support of $\bar {\mathfrak {q}}$ , we let ${\mathsf {E}^{\mathrm {can}}_{\log _{N}\mathfrak {l}}(\tau _{y}\eta _{S})}$ be canonical measure expectation of $\bar {\mathfrak {q}}_{S,y}$ on a neighborhood of the support of $\bar {\mathfrak {q}}_{S,y}$ with length scale $\mathfrak {l}$ . Take $\varepsilon _{\mathrm {RN},1}>0$ sufficiently small but universal. We have the following uniformly, in probability, in all $\mathfrak {l}$ -indices larger than the length of the support of $\bar {\mathfrak {q}}_{S,y}$ that also satisfy $N^{\varepsilon _{\mathrm {RN},1}}\mathfrak {l}\leqslant \mathfrak {l}_{N}$ :
Let $\mathfrak {l}_{0}\in \mathbb Z_{\geqslant 0}$ be any uniformly bounded length scale that is larger than the length of the support of $\bar {\mathfrak {q}}$ . We additionally have the claimed convergence in probability in Heuristic 3.19 if we replace $\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}(\tau _{y}\eta _{S})$ therein by $\mathsf {E}^{\mathrm {can}}_{\log _{N}\mathfrak {l}_{0}}(\tau _{y}\eta _{S})$ .
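The way the two parts of Heuristic 3.20 combine into Heuristic 3.19 can be sketched as a telescoping sum. The sketch assumes that the $\mathsf {R}$ terms are differences of canonical measure expectations at consecutive length scales; it also uses the hypothetical indexing $\mathfrak {l}_{k}=N^{k\varepsilon _{\mathrm {RN},1}}\mathfrak {l}_{0}$ for $k<K$ with $\mathfrak {l}_{K}=\mathfrak {l}_{N}$ , which does not appear in the text:

$$ \begin{align*} \bar{\mathfrak{q}}_{S,y}-\mathsf{E}^{\mathrm{can}}_{1/2+\varepsilon_{\mathrm{RN}}}(\tau_{y}\eta_{S}) \ = \ \big(\bar{\mathfrak{q}}_{S,y}-\mathsf{E}^{\mathrm{can}}_{\log_{N}\mathfrak{l}_{0}}(\tau_{y}\eta_{S})\big)+\sum_{k=0}^{K-1}\big(\mathsf{E}^{\mathrm{can}}_{\log_{N}\mathfrak{l}_{k}}(\tau_{y}\eta_{S})-\mathsf{E}^{\mathrm{can}}_{\log_{N}\mathfrak{l}_{k+1}}(\tau_{y}\eta_{S})\big). \end{align*} $$

Since each step gains a factor $N^{\varepsilon _{\mathrm {RN},1}}$ in length scale and the total gain is at most $N^{1/2+\varepsilon _{\mathrm {RN}}}$ , the number of steps satisfies $K\lesssim \varepsilon _{\mathrm {RN},1}^{-1}$ , which is bounded uniformly in N. Thus, if each summand vanishes in the sense of Heuristic 3.20 uniformly in the length scale, then so does the finite sum.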
Let us focus on the proposed bound for $\mathsf {R}$ terms in Heuristic 3.20 and discuss necessary adjustments for the difference between $\bar {\mathfrak {q}}$ and $\mathsf {E}^{\mathrm {can}}_{\log _{N}\mathfrak {l}_{0}}$ afterwards; analyses of both will be similar to each other except for an important technical obstruction faced by the latter difference. The first key observation for the $\mathsf {R}$ terms is their fluctuation property; with respect to any invariant canonical measure on the support of $\mathsf {R}$ , the function $\mathsf {R}$ vanishes in expectation. Given that the support of $\mathsf {R}$ is mesoscopic in scale and given the fast relaxation to invariant measures on mesoscopic length scales, the functional $\mathsf {R}$ is rapidly fluctuating on mesoscopic timescales. To take advantage of these fluctuations, we average $\mathsf {R}$ on mesoscopic space-time scales, in contrast to macroscopic scales that are used in [Reference Chang and Yau8]. Assuming for now that we can replace $\mathsf {R}$ by such a space-time average and additionally assuming that the law of the particle system around the support of $\mathsf {R}$ is an invariant canonical measure, we would be able to control this mesoscopic space-time average of $\mathsf {R}$ as if it were the space-time average of a noise by the Kipnis–Varadhan inequality; see Appendix 1.6 in [Reference Kipnis and Landim37]. To be precise, if $\mathfrak {l}(\mathsf {R}_{\log _{N}\mathfrak {l}})$ is twice the support length of $\mathsf {R}_{\log _{N}\mathfrak {l}}$ ,
Thus, the LHS exhibits Brownian behavior in space-time. The factor $N^{-1}$ on the RHS, which makes equation (3.7) extremely useful for obtaining Heuristic 3.20, comes from the fact that the system evolves at a speed $N^{2}$ . Therefore, convergence to invariant measure happens on timescales of order $N^{-2}$ , creating more fluctuation before time $\mathfrak {t}_{\mathrm {av}}$ . In a similar spirit, the factor $\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ on the RHS of equation (3.7), which actually makes equation (3.7) worse as we increase the length scale of the support of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ in Heuristic 3.20, comes from the fact that we require the particle system to converge to invariant measure in a neighborhood of the support of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ in order to exploit its fluctuations; this happens more slowly as the support of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ increases. As for $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|$ :
• Heuristic 3.19 replaces $\bar {\mathfrak {q}}$ by the $\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}$ -term that has good deterministic bounds. This feature of $\mathsf {E}^{\mathrm {can}}_{1/2+\varepsilon _{\mathrm {RN}}}$ is not exclusive to the length scale $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ . Precisely, as the length scales of the $\mathsf {E}^{\mathrm {can}}$ terms in Heuristic 3.20 defining the $\mathsf {R}$ terms increase, such $\mathsf {E}^{\mathrm {can}}$ functionals decrease in magnitude as noted after Heuristic 3.19, and therefore so do the $\mathsf {R}$ functions. It turns out that the competing factors $\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ and $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|$ on the RHS of equation (3.7) almost perfectly cancel because $\mathsf {E}^{\mathrm {can}}_{\log _{N}\mathfrak {l}}$ and ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ are controlled by the inverse of the length scale. This is why the multiscale replacement in Heuristic 3.20 is feasible.
Given the previous bullet point, we will want to take $\mathfrak {t}_{\mathrm {av}}\sim N^{-1}$ and $\mathfrak {l}_{\mathrm {av}}\gg 1$ . Actually, for our applications of estimates of the form (3.7) in this paper, we will take $\mathfrak {t}_{\mathrm {av}}$ slightly smaller than $N^{-1}$ and $\mathfrak {l}_{\mathrm {av}}$ a mesoscopic length scale noticeably larger than simply $\mathfrak {l}_{\mathrm {av}}\gg 1$ , but this is entirely for technical reasons.
3.2.3 Replacement by space-time averages
In the discussion of Heuristic 3.20 given after its statement, we omitted an important issue of replacing ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ terms with space-time averages. We first explain why we can replace by spatial averages.
• Recall from the paragraph following the single bullet point above that we want to replace ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ by its spatial average on length scale $\mathfrak {l}_{\mathrm {av}}\gg 1$ . If we replace ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ in the heat operator in Heuristic 3.20 with its spatial average on length scale $\mathfrak {l}_{\mathrm {av}}$ , the error is controlled by the difference between ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ and its spatial translations, where the length scale for the translations is at most $\mathfrak {l}_{\mathrm {av}}\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ ; see the LHS of equation (3.7). These differences are spatial gradients of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ ; to bound these when multiplied by $\mathbf {Y}^{N}$ and plugged into the heat operator, we apply summation by parts. This lets us transfer the spatial gradients from ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ to $\mathbf {Y}^{N}$ and the $\mathbf {H}^{N}$ heat kernel, so it suffices to estimate spatial gradients of these smoother objects. The heat kernel is macroscopically smooth, so its spatial gradients carry a factor of $N^{-1}$ . The $\mathbf {Y}^{N}$ term is constructed with a priori spatial regularity bounds; by Definition 3.1 and Definition 3.5, this $\mathbf {Y}^{N}$ factor is basically macroscopically spatially Hölder- $\frac 12$ continuous. Combining these two estimates for the spatial regularity of the $\mathbf {H}^{N}$ heat kernel and of $\mathbf {Y}^{N}$ , the error in introducing the spatial average of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ in Heuristic 3.20 is basically at most the following; recall from Definition 3.5 that $\mathbf {Y}^{N}$ is basically bounded and recall from earlier in this paragraph that the max length scale of spatial gradients here is $\mathfrak {l}_{\mathrm {av}}\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ :
(3.8) $$ \begin{align} N^{-1}\mathfrak{l}_{\mathrm{av}}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})N^{\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}| + N^{\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}|N^{-\frac12}\mathfrak{l}_{\mathrm{av}}^{\frac12}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})^{\frac12}. \end{align} $$
As $\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|\lesssim 1$ like in the proof idea of Heuristic 3.20, if $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|\gg 1$ , then choosing $\mathfrak {l}_{\mathrm {av}}\gg _{\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})}1$ to not be too big makes equation (3.8) small.
• If $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|\not \gg 1$ , the first term in equation (3.8) still vanishes in the large-N limit if we pick $\mathfrak {l}_{\mathrm {av}}\ll N^{1/2}$ , which we certainly will in this paper. As for the second term in equation (3.8), we recall that this term comes from blindly controlling the spatial gradients of $\mathbf {Y}^{N}$ via its spatial Hölder regularity. However, we also know that $\mathbf {Y}^{N}$ , whenever it is nonzero and equal to $\mathbf {Z}^{N}$ , is explicit in terms of the particle system by definition. Therefore, its spatial gradient on the length scale $w\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ for $|w|\leqslant \mathfrak {l}_{\mathrm {av}}$ , if nonzero, is $\mathbf {Y}^{N}$ itself times an explicit functional of the particle system whose support, it turns out after explicit calculation, is contained outside the support of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ and of length at most $\mathfrak {l}_{\mathrm {av}}\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})\lesssim \mathfrak {l}_{\mathrm {av}}$ and thus not too large. We emphasize that this disjoint-support condition is a consequence of the shift of $\mathfrak {q}$ in Definition 2.2, which is actually the exact purpose of that shift. Ultimately, the product between this functional and ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ , which is lower order because gradients of the explicit formula for $\mathbf {Y}^{N}$ introduce factors of $N^{-1/2}$ , admits an inverse-length-scale estimate and satisfies a fluctuation property similar to that of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ itself because of the disjoint-support condition, so we can apply to it a simpler version of this analysis.
• Again, we actually pick $\mathfrak {l}_{\mathrm {av}}\gg 1$ more precisely. For ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ terms whose support lengths are asymptotically large but still below a certain N-dependent threshold, we take $\mathfrak {l}_{\mathrm {av}}\approx \mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ . For ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ whose supports have lengths above this threshold, we will be less strict with $\mathfrak {l}_{\mathrm {av}}$ and take advantage of the consequently small $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|$ in equation (3.8), letting it do the work.
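To make the sizes in equation (3.8) concrete, the following sketch simplifies its two terms using only the inverse-length-scale bound $\mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|\lesssim 1$ quoted above; it is bookkeeping under that single assumption and not a substitute for the quantitative estimates proved later:
$$ \begin{align} N^{-1}\mathfrak{l}_{\mathrm{av}}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})N^{\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}| \ &\lesssim \ N^{-\frac12}\mathfrak{l}_{\mathrm{av}}, \\ N^{\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}|N^{-\frac12}\mathfrak{l}_{\mathrm{av}}^{\frac12}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})^{\frac12} \ &= \ \mathfrak{l}_{\mathrm{av}}^{\frac12}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})^{\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}| \ \lesssim \ \mathfrak{l}_{\mathrm{av}}^{\frac12}\mathfrak{l}({\mathsf{R}_{\log_{N}\mathfrak{l}}})^{-\frac12}. \end{align} $$
The first bound vanishes in the large-N limit as soon as $\mathfrak {l}_{\mathrm {av}}\ll N^{1/2}$ . The second bound is small only when $\mathfrak {l}_{\mathrm {av}}\ll \mathfrak {l}({\mathsf {R}_{\log _{N}\mathfrak {l}}})$ , which is consistent with the need for the finer argument via the explicit formula for $\mathbf {Y}^{N}$ in the second bullet point.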
We now discuss the problem of introducing a time average of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ after introducing a spatial average.
• Similar to the replacement by spatial average for ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ terms from Heuristic 3.20, replacements by time averages for ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ terms therein contribute errors that are controlled by time-gradients of the $\mathbf {H}^{N}$ heat kernel and of $\mathbf {Y}^{N}$ . The $\mathbf {H}^{N}$ heat kernel is smooth in time and the latter has time regularity of Hölder- $\frac 14$ , basically. Thus, similar to equation (3.8), we deduce that the error in said replacement by time average on timescale $\mathfrak {t}_{\mathrm {av}}$ is controlled by the following, in which the $\mathfrak {l}_{\mathrm {av}}$ -based factor comes from the fact that we have already spatially averaged ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ on length scale $\mathfrak {l}_{\mathrm {av}}\gg 1$ and thus gained an a priori estimate for ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ /its scale- $\mathfrak {l}_{\mathrm {av}}$ spatial average because of its fluctuating behavior as in equation (3.7):
(3.9) $$ \begin{align} \mathfrak{t}_{\mathrm{av}}N^{\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}| + N^{\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}|\mathfrak{t}_{\mathrm{av}}^{\frac14}. \end{align} $$
Recall we want $\mathfrak {t}_{\mathrm {av}}=N^{-1}$ ; the first term in equation (3.9) vanishes in the large-N limit. The second term, however, clearly blows up. Instead, we take $\mathfrak {t}_{\mathrm {av}}=\mathfrak {t}_{\mathrm {av},1}=N^{-2}\mathfrak {l}_{\mathrm {av}}$ and pretend $\mathfrak {l}_{\mathrm {av}}=N^{\varepsilon }$ for $\varepsilon>0$ small but universal, which will ultimately be the case later in this paper. This choice of $\mathfrak {t}_{\mathrm {av}}$ makes it so both terms in equation (3.9) vanish in the large-N limit.
• We replaced the spatial average of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ with its time average on scale $\mathfrak {t}_{\mathrm {av},1}=N^{-2}\mathfrak {l}_{\mathrm {av}}$ . Let us now replace this time average with its own time average on a timescale $\mathfrak {t}_{\mathrm {av},2}=N^{\rho }\mathfrak {t}_{\mathrm {av},1}$ , where $\rho>0$ is small but universal. Similar to equation (3.9), we establish the following rough estimate for the error in this time-average replacement, but with a key distinction we explain below:
(3.10) $$ \begin{align} \mathfrak{t}_{\mathrm{av},2}N^{\frac12}N^{-1}\mathfrak{t}_{\mathrm{av},1}^{-\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}| + N^{\frac12}N^{-1}\mathfrak{t}_{\mathrm{av},1}^{-\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12}|{\mathsf{R}_{\log_{N}\mathfrak{l}}}|\mathfrak{t}_{\mathrm{av},2}^{\frac14}. \end{align} $$
Besides replacing timescales, the difference between equations (3.9) and (3.10) is $\mathfrak {t}_{\mathrm {av},1}$ -based factors. These come from the fact that we have already time averaged the spatial average of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ on timescale $\mathfrak {t}_{\mathrm {av},1}$ , so we get an improved a priori estimate similar to equation (3.7). If we choose $\mathfrak {t}_{\mathrm {av},2}\leqslant N^{-1}$ , the first term in equation (3.10) certainly vanishes in the large-N limit, as $\mathfrak {t}_{\mathrm {av},1}\gg N^{-2}$ . On the other hand, a simple calculation implies that the second term in (3.10) vanishes in the large-N limit if we choose $\rho =\varepsilon /999$ , where we recall $\varepsilon $ is defined via $\mathfrak {l}_{\mathrm {av}}=N^{\varepsilon }$ . Thus, we have successfully replaced the time average on timescale $\mathfrak {t}_{\mathrm {av},1}$ of the spatial average of ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ with its time average on the timescale $\mathfrak {t}_{\mathrm {av},2}\gg \mathfrak {t}_{\mathrm {av},1}$ . As the double time average is basically an average on the larger timescale, in this step we have basically replaced the timescale $\mathfrak {t}_{\mathrm {av},1}$ by $\mathfrak {t}_{\mathrm {av},2}\gg \mathfrak {t}_{\mathrm {av},1}$ .
• We then iteratively boost the timescale by $N^{\rho }$ until we hit the maximal timescale. This strongly resembles the renormalization procedure discussed after equation (3.7) and in Lemma 2 in [Reference Goncalves and Jara24] but in the time direction and not the spatial direction. In particular, it is key that each replacement of timescale improves a priori estimates for ${\mathsf {R}_{\log _{N}\mathfrak {l}}}$ .
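The power counting behind equations (3.9) and (3.10) can be sketched as follows. We drop the common factor $|{\mathsf {R}_{\log _{N}\mathfrak {l}}}|$ , which we treat as at most order one purely for bookkeeping (an assumption made only for this illustration), and we substitute $\mathfrak {l}_{\mathrm {av}}=N^{\varepsilon }$ , $\mathfrak {t}_{\mathrm {av},1}=N^{-2+\varepsilon }$ and $\mathfrak {t}_{\mathrm {av},2}=N^{-2+\varepsilon +\rho }$ . For equation (3.9) with $\mathfrak {t}_{\mathrm {av}}=\mathfrak {t}_{\mathrm {av},1}$ :
$$ \begin{align} \mathfrak{t}_{\mathrm{av},1}N^{\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12} \ = \ N^{-\frac32+\frac{\varepsilon}{2}} \quad \mathrm{and} \quad N^{\frac12}\mathfrak{l}_{\mathrm{av}}^{-\frac12}\mathfrak{t}_{\mathrm{av},1}^{\frac14} \ = \ N^{\frac12-\frac{\varepsilon}{2}-\frac12+\frac{\varepsilon}{4}} \ = \ N^{-\frac{\varepsilon}{4}}. \end{align} $$
By contrast, the choice $\mathfrak {t}_{\mathrm {av}}=N^{-1}$ gives $N^{\frac 12}\mathfrak {l}_{\mathrm {av}}^{-\frac 12}\mathfrak {t}_{\mathrm {av}}^{\frac 14}=N^{\frac 14-\frac {\varepsilon }{2}}$ for the second term, which diverges for small $\varepsilon $ . For equation (3.10), the common prefactor is $N^{\frac 12}N^{-1}\mathfrak {t}_{\mathrm {av},1}^{-\frac 12}\mathfrak {l}_{\mathrm {av}}^{-\frac 12}=N^{\frac 12-\varepsilon }$ , so that
$$ \begin{align} \mathfrak{t}_{\mathrm{av},2}N^{\frac12-\varepsilon} \ = \ N^{-\frac32+\rho} \quad \mathrm{and} \quad N^{\frac12-\varepsilon}\mathfrak{t}_{\mathrm{av},2}^{\frac14} \ = \ N^{-\frac{3\varepsilon}{4}+\frac{\rho}{4}}. \end{align} $$
In this bookkeeping, the second exponent is negative whenever $\rho <3\varepsilon $ , so the choice $\rho =\varepsilon /999$ is comfortably within range.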
3.2.4 Nonequilibrium calculations
The previous heuristics are justifiable if the model is at an invariant measure. In general, we will reduce estimates to invariant measure calculations by virtue of the local equilibrium method in [Reference Guo, Papnicolaou and Varadhan28], namely the one-block and two-blocks estimates, which basically suggest that statistics for our large-scale system are very close to some invariant measure at mesoscopic scales. In particular, we employ the following strategy that will later be made quantitatively precise.
• The local equilibrium method in [Reference Guo, Papnicolaou and Varadhan28] is based on entropy-Dirichlet form duality and therefore highly robust under perturbations [Reference Yau52], unlike the approach of Chang–Yau [Reference Chang and Yau8] via the global invariant measure/eigenvalue problem. It implies the Dirichlet form of the system is very small on mesoscopic blocks. By the log-Sobolev inequality of [Reference Yau51], the same is true for relative entropy.
• By the relative entropy inequality, we may try to reduce calculations to those at invariant measures. However, relative entropy estimates from the previous bullet point will not be good enough to perform any direct comparison to invariant measures for the purposes of proving Heuristic 3.20; this is because local equilibrium reduction by the entropy inequality on larger scales needs sharper large-deviations bounds for terms we are trying to reduce to equilibrium. Thus, we see a competition between deterioration in local equilibrium reduction in replacing ${\mathsf {E}_{\log _{N}\mathfrak {l}}}$ in Heuristic 3.20 by itself on progressively larger scales, versus improving bounds for ${\mathsf {E}_{\log _{N}\mathfrak {l}}}$ on progressively larger scales. As before, this competition sufficiently cancels.
We conclude with the following outline for the paper in view of this strategy discussion/this entire section.
• In Section 4, we present the main technical ingredient of this paper, the nonstationary first-order Boltzmann–Gibbs principle. This is a quantitative version of Heuristic 3.18. We state three ingredients for its proof, the first two of which give a quantitative version of Heuristic 3.20 and the last of which is the step used to prove Heuristic 3.18 assuming Heuristic 3.19, namely the inverse-length-scale bound for ${\mathsf {E}_{\log _{N}\mathfrak {l}}}$ terms. We give only the relatively short proof of the last of these three ingredients in that section; we defer technically involved proofs of the first two ingredients to the last part of the paper before the appendix.
• In Section 5, we state and prove a second weaker version of the Boltzmann–Gibbs principle that controls gradients of the heat operator in Heuristic 3.18. This will be used in order to prove the space-time regularity estimates defining the stopping time $\mathfrak {t}_{\mathrm {st}}$ in Definition 3.1 are self-propagating. This is the goal of Section 6, which we carry out by estimating space-time regularity of each term in the $\mathbf {U}^{N}$ equation individually using a moment calculation exactly like in the proof of Proposition 3.2 in [Reference Dembo and Tsai19], for example, except we will require one application of the aforementioned second/weaker Boltzmann–Gibbs principle to control the gradient of the order $N^{1/2}$ heat operator term in Corollary 2.6.
• In Section 7, we combine the estimates in Sections 4, 5 and 6 with a priori $\mathbf {Q}^{N}$ estimates, which are standard to prove, to show Proposition 3.11 and Proposition 3.12.
• For the sake of clarity, we shortly reintroduce $\mathsf {E}^{\mathrm {can}}$ and $\mathsf {R}$ notation from this subsection (as well as a few additional and related constructions) more systematically.
4 Boltzmann–Gibbs principle I – statement
The main result of this section is the Boltzmann–Gibbs principle. This allows us to access corrections to the $\mathfrak {q}$ -term in Definition 2.2, or equivalently after spatial translation, the $\widetilde {\mathfrak {q}}$ functional therein, beyond its hydrodynamic limit.
Theorem 4.1. In what follows, let $\mathbf {E}$ be expectation with respect to the law of the $\mathbf {h}_{T,\cdot }^{N}$ and $\eta _{T,\cdot }$ processes with stable initial data. There exists universal $\beta _{\mathrm {BG}}>0$ independent of $\varepsilon _{\mathrm {RN}}>0$ so that with universal implied constant,
Remark 4.2. The Boltzmann–Gibbs principle, for example, in [Reference Brox and Rost6, Reference Chang and Yau8], is usually stated in a much weaker form, namely pointwise in space-time rather than in a uniform space-time norm as in Theorem 4.1. But such an estimate is not well suited for norms.
Remark 4.3. The estimate (4.1) holds if we change $\bar {\mathfrak {q}}$ by replacing $\mathfrak {q}$ in its definition (see Definition 2.2) with any local functional supported to the left of $0$ , say $\mathfrak {y}$ . By local, although we always use it in this paper to mean uniformly bounded support, we can actually allow for the support of this ‘new’ functional $\mathfrak {y}$ that replaces $\mathfrak {q}$ to grow with N; the RHS of equation (4.1) for $\mathfrak {y}$ in place of $\mathfrak {q}$ would then have a factor that grows as the $100$ -th power, for example, of the support length of $\mathfrak {y}$ .
The Boltzmann–Gibbs principle for sufficiently well-behaved stationary models is generally accessible by one application of the one-block estimate of [Reference Guo, Papnicolaou and Varadhan28] and Sobolev inequalities, which generally hold only for stationary models; see Chapter 11 in [Reference Kipnis and Landim37]. Like in [Reference Chang and Yau8], however, for nonstationary particle systems we require a multiscale idea, and in this paper we will adopt the multiscale analysis in [Reference Goncalves and Jara24, Reference Sethuraman and Xu43] that was actually originally implemented for stationary particle systems to prove a refinement of the Boltzmann–Gibbs principle, though our implementation is different from that in [Reference Goncalves and Jara24] due to the nonstationary nature of models considered herein. We set up such a multiscale analysis in the following constructions, which effectively outline a procedure of local equilibrium on small mesoscopic blocks and a renormalization scheme that bootstraps equilibrium on smaller mesoscopic blocks to equilibrium on progressively larger mesoscopic blocks; see our discussion of Heuristic 3.20. First, we must introduce key probability/invariant measures.
Definition 4.4. Consider any subset $\mathbb {I}\subseteq \mathbb {T}_{N}$ and any $\sigma \in \mathbb R$ . We define the canonical measure $\mu _{\sigma ,\mathbb {I}}^{\mathrm {can}}$ to be the uniform measure on the set of $\eta \in \Omega _{\mathbb {I}}$ for which the $\eta $ -average on $\mathbb {I}$ is equal to $\sigma $ . Define the grand-canonical measure $\mu _{\sigma ,\mathbb {I}}$ as the product Bernoulli measure on $\Omega _{\mathbb {I}}$ whose one-dimensional marginals have expectation equal to $\sigma $ . These two probability measures are each defined precisely below, and we will also let $\mu _{\sigma }=\mu _{\sigma ,\mathbb {T}_{N}}$ denote the grand-canonical ensemble of parameter $\sigma $ on the entire set $\mathbb {T}_{N}\kern-1.5pt$ :
For clarity, we mention that the canonical ensemble of parameter $\sigma $ on any subset $\mathbb {I}\subseteq \mathbb {T}_{N}$ is the measure obtained upon taking any grand-canonical ensemble on $\mathbb {I}$ and conditioning on the support of the canonical measure/hyperplane with $\eta $ -average on $\mathbb {I}$ equal to $\sigma $ . Moreover, the projection/pushforward of this canonical ensemble onto any subset $\mathbb {I}'\subseteq \mathbb {I}$ is a convex combination of canonical measures on $\mathbb {I}'$ ; the coefficient in such a convex combination that corresponds to the canonical measure with parameter $\sigma '$ on $\mathbb {I}'$ is the probability of this $\sigma '$ -hyperplane in $\Omega _{\mathbb {I}'}$ under the $\sigma $ -canonical measure on $\mathbb {I}$ . Lastly, when taking the expectation of any functional $\mathfrak {f}$ with respect to a grand-canonical measure, we may take this grand-canonical measure on any neighborhood of the support of $\mathfrak {f}$ , as marginals are jointly independent under grand-canonical measures.
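As an illustration of the conditioning statement, suppose for concreteness that $\Omega _{\mathbb {I}}=\{\pm 1\}^{\mathbb {I}}$ ; this state space is an assumption made only for this sketch, since the state space is fixed elsewhere in the paper. Under that assumption, the product Bernoulli measure with mean $\sigma '\in (-1,1)$ assigns to a configuration $\eta $ with exactly $k$ sites equal to $+1$ the weight
$$ \begin{align} \mu_{\sigma',\mathbb{I}}(\eta) \ = \ \prod_{x\in\mathbb{I}}\frac{1+\sigma'\eta_{x}}{2} \ = \ \left(\frac{1+\sigma'}{2}\right)^{k}\left(\frac{1-\sigma'}{2}\right)^{|\mathbb{I}|-k}. \end{align} $$
This weight depends on $\eta $ only through $k$ , equivalently only through the $\eta $ -average on $\mathbb {I}$ . Conditioning on the hyperplane where this average equals $\sigma $ therefore yields the uniform measure on that hyperplane for every choice of $\sigma '$ , which is the canonical measure described in Definition 4.4.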
Definition 4.5. Below, we take $\varepsilon _{1},\varepsilon _{\mathrm {RN},1}>0$ arbitrarily small but universal and thus uniformly bounded from below.
• We establish two notations for the following empirical $\eta $ -density at time $S\geqslant 0$ in a neighborhood of $y\in \mathbb {T}_{N}$ of length $N^{\varepsilon _{1}}$ . We use the $\sigma $ -notation when we think of the following as a parameter for canonical and grand-canonical ensembles/measures in Definition 4.4, and we use the latter $\mathsf {A}$ -notation when we think of it as an ‘averaging operator’ functional on $\Omega $ :
(4.3) $$ \begin{align} \sigma_{\varepsilon_{1},S,y} \ \overset{\bullet}= \ {\mathsf{A}^{\mathbf{X}}_{\varepsilon_{1},y}(\eta_{S})} \ \overset{\bullet}= \ \widetilde{\sum}_{0\leqslant w\leqslant N^{\varepsilon_{1}}}\eta_{S,y-w}. \end{align} $$
• Define the following conditional expectation of the $\bar {\mathfrak {q}}$ functional viewed as a function of $\sigma _{\varepsilon _{1},S,y}$ or of $\eta _{S,\cdot }$ on the block $\{y-w\}_{0\leqslant w\leqslant N^{\varepsilon _{1}}}$ appearing in equation (4.3). This conditional expectation is the expectation of $\bar {\mathfrak {q}}_{S,y}$ with respect to the canonical measure of parameter $\sigma _{\varepsilon _{1},S,y}$ defined immediately above. We additionally define another expectation operator of $\bar {\mathfrak {q}}_{0,0}$ but now with respect to a grand-canonical measure corresponding to the same $\eta $ -density/profile $\sigma _{\varepsilon _{1},S,y}$ defined immediately above:
(4.4) $$ \begin{align} {\mathsf{E}_{\varepsilon_{1}}^{\mathrm{can}}(\tau_{y}\eta_{S})} \ \overset{\bullet}= \ {\mathbf{E}_{0}}\left(\bar{\mathfrak{q}}_{S,y}\middle|{\mathsf{A}^{\mathbf{X}}_{\varepsilon_{1},y}(\eta_{S})}\right) \quad \mathrm{and} \quad {\mathsf{E}_{\varepsilon_{1}}^{\mathrm{gc}}(\tau_{y}\eta_{S})} \ \overset{\bullet}= \ {\mathbf{E}_{\sigma_{\varepsilon_{1},S,y}}}\bar{\mathfrak{q}}_{0,0}. \end{align} $$
We emphasize that the support of $\bar {\mathfrak {q}}_{S,y}$ is contained strictly in the block $\{y-w\}_{0\leqslant w\leqslant N^{\varepsilon _{1}}}$ for any $S\geqslant 0$ and $y\in \mathbb {T}_{N}$ , which is the support of $\mathsf {A}^{\mathbf {X}}_{\varepsilon _{1},y}(\eta _{S})$ . More generally, given any functional $\mathfrak {f}:\Omega \to \mathbb R$ with support strictly contained in this block, we let $\mathsf {E}^{\mathrm {can}}_{\varepsilon _{1}}(\tau _{y}\eta _{S};\mathfrak {f})$ be as above but replacing $\bar {\mathfrak {q}}_{S,y}$ by $\mathfrak {f}$ . We now define the difference between $\bar {\mathfrak {q}}$ and its $N^{\varepsilon _{1}}$ -local expectation:
(4.5) $$ \begin{align} {\mathsf{S}_{\varepsilon_{1}}(\tau_{y}\eta_{S})} \ \overset{\bullet}= \ \bar{\mathfrak{q}}_{S,y}-{\mathsf{E}_{\varepsilon_{1}}^{\mathrm{can}}(\tau_{y}\eta_{S})}. \end{align} $$
• Observe now that the previous constructions extend from our predetermined choice of $\varepsilon _{1}>0$ to any $\varepsilon _{1}\geqslant 0$ . With this, we conclude this construction with a renormalization/transfer-of-scales operator for any $\delta \geqslant 0$ :
(4.6) $$ \begin{align} {\mathsf{R}_{\delta}(\tau_{y}\eta_{S}) \ \overset{\bullet}= \ \mathsf{E}_{\delta}^{\mathrm{can}}(\tau_{y}\eta_{S}) - \mathsf{E}^{\mathrm{can}}_{\delta+\varepsilon_{\mathrm{RN},1}}(\tau_{y}\eta_{S})}. \end{align} $$
• We emphasize that the constructions in the above bullet points are functionals $\Omega \to \mathbb R$ evaluated at (shifts of) $\eta _{S}$ . In particular, they make sense upon plugging in any $\eta $ instead of (shifts of) $\eta _{S}$ .
We explain the proof of Theorem 4.1; even though we did so in the previous section, for clarity we present it with the above notation. The key is to replace $\bar {\mathfrak {q}}$ in equation (4.1) by its $\mathsf {E}^{\mathrm {can}}$ -expectation on the length scale $N^{1/2+\beta '}$ , in which $\beta '>0$ is universal. The motivation behind such replacement is the following observation. The functional $\bar {\mathfrak {q}}$ vanishes in $\mathbf {E}_{0}$ expectation, and because the global $\eta $ density is roughly 0, the fluctuations $\mathbf {E}_{0}\bar {\mathfrak {q}}-\mathsf {E}^{\mathrm {can}}(\tau _{y}\eta _{S})$ at length scale $\mathfrak {l}$ are at most order $\mathfrak {l}^{-1/2}$ by the central limit theorem, for example. Taking $\mathfrak {l}=N^{1/2+\beta '}$ does not allow the scale- $\mathfrak {l}$ expectation $\mathsf {E}^{\mathrm {can}}(\tau _{y}\eta _{S})$ to beat the $N^{1/2}$ factor on the LHS of equation (4.1). But $\mathbf {E}_{0}\bar {\mathfrak {q}}=0$ requires only the correction $\mathbf {E}_{0}\widetilde {\mathfrak {q}}$ in Definition 2.2. The purpose of the additional linear correction, in a technical sense, is to actually cancel the leading order behavior of the scale- $\mathfrak {l}$ expectation of $\bar {\mathfrak {q}}$ so that, according to Proposition 8 of [Reference Goncalves and Jara24], the fluctuations at length scale $\mathfrak {l}$ are order at most $\mathfrak {l}^{-1}$ . Thus, our choice of $\mathfrak {l}$ beats $N^{1/2}$ because of the extra exponent $\beta '$ . We note that showing the $\eta $ -density is roughly 0 is easy in the stationary case; in the nonstationary case, we need regularity of $\mathbf {Y}^{N}$ .
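The exponent arithmetic in this paragraph can be checked directly. With $\mathfrak {l}=N^{1/2+\beta '}$ , the two candidate fluctuation bounds, multiplied by the $N^{1/2}$ factor they must beat, read as follows; this is only the power counting described above and carries no constants:
$$ \begin{align} N^{\frac12}\mathfrak{l}^{-\frac12} \ = \ N^{\frac14-\frac{\beta'}{2}} \quad \mathrm{and} \quad N^{\frac12}\mathfrak{l}^{-1} \ = \ N^{-\beta'}. \end{align} $$
For small $\beta '>0$ the first quantity diverges, whereas the second vanishes in the large-N limit, which is why the improvement from $\mathfrak {l}^{-1/2}$ to $\mathfrak {l}^{-1}$ provided by the linear correction matters.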
Let us now explain how the replacement of $\bar {\mathfrak {q}}$ in equation (4.1) by its $\mathsf {E}^{\mathrm {can}}$ -expectation on length $N^{1/2+\beta '}$ will be justified. As suggested by the constructions in Definition 4.5, we will first replace $\bar {\mathfrak {q}}$ with its $\mathsf {E}^{\mathrm {can}}$ -expectation at the length scale $N^{\varepsilon _{1}}$ with $\varepsilon _{1}>0$ from Definition 4.5 sufficiently small though universal. The error in this first replacement step is the heat operator acting on $N^{1/2}\mathbf {Y}^{N}$ times the difference $\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})$ from Definition 4.5, which is a fluctuating factor with small support with size of order $N^{\varepsilon _{1}}$ . We will then estimate this fluctuating factor using basically the methods of [Reference Yang49]; as noted in Section 3.2, this roughly amounts to averaging out in time these fluctuations, applying the Kipnis–Varadhan inequality (see Appendix 1.6 in [Reference Kipnis and Landim37]) at stationarity, and then performing reduction to stationarity by a ‘local equilibrium’ estimate via the entropy inequality.
We have now replaced $\bar {\mathfrak {q}}$ in equation (4.1) with its $\mathsf {E}^{\mathrm {can}}$ -expectation with respect to the small mesoscopic length scale $N^{\varepsilon _{1}}$ . The next step is to replace this $\mathsf {E}^{\mathrm {can}}$ -expectation with another $\mathsf {E}^{\mathrm {can}}$ -expectation but on the slightly larger mesoscopic length scale $N^{\varepsilon _{1}+\varepsilon _{\mathrm {RN},1}}$ where $\varepsilon _{\mathrm {RN},1}$ in Definition 4.5 is arbitrarily small but universal. As noted at the end of Section 3.2, we encounter additional obstructions when we try to replace by $\mathsf {E}^{\mathrm {can}}$ -expectation on larger length scales. Indeed, the entropy inequality breaks down when we try to reduce to equilibrium on larger subsets unless we have better a priori estimates for $\mathsf {E}^{\mathrm {can}}$ on larger scales. This a priori control on $\mathsf {E}^{\mathrm {can}}$ -expectations is explained in the first paragraph after Definition 4.5, and it provides enough extra benefit from the initial replacement to then perform a replacement by $\mathsf {E}^{\mathrm {can}}$ -expectation on a slightly larger length scale, so long as $\varepsilon _{\mathrm {RN},1}$ is sufficiently smaller than $\varepsilon _{1}$ , so that the jump in length scales is not so large that the extra benefit from the previous scale- $N^{\varepsilon _{1}}$ replacement is insufficient. Ultimately, our analysis remains intact as we increase the length scale. We then iterate until we reach the desired length scale $N^{1/2+\beta '}$ .
We write three ingredients below for the proof of Theorem 4.1, each corresponding to one of the three paragraphs above. The first is the initial replacement of $\bar {\mathfrak {q}}$ in equation (4.1) with its $\mathsf {E}^{\mathrm {can}}$ -expectation on scale $N^{\varepsilon _{1}}$ in the second paragraph above. The second is the multiscale ‘renormalization’ of length scales from the third paragraph. The last is the inverse-length-scale bound on $\mathsf {E}^{\mathrm {can}}(\tau _{y}\eta _{S})$ .
Proposition 4.6. Take $\varepsilon _{1}=1/14$ . There exists a universal constant $\beta _{1}>0$ , which is again uniformly bounded from below, such that the following holds, in which the $\|\|_{1;\mathbb {T}_{N}}$ norm is with respect to $(T,x)$ -variables in the heat operator on the LHS:
Proposition 4.7. Suppose $\varepsilon _{\mathrm {RN},1}>0$ is sufficiently small but universal depending only on $\varepsilon _{1}>0$ . Define $\mathfrak {b}_{+}\in \mathbb Z_{\geqslant 0}$ to be the largest nonnegative integer $\mathfrak {b}$ so that $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$ , where $\varepsilon _{\mathrm {RN}}>0$ is the universal constant from Definition 3.1. There is a universal constant $\beta _{2}>0$ , which is therefore uniformly bounded from below, such that the following expectation estimate holds, again in which the $\|\|_{1;\mathbb {T}_{N}}$ norm is with respect to $(T,x)$ -variables in the heat operator on the LHS:
We also have $\mathfrak {b}_{+}\lesssim _{\varepsilon _{1},\varepsilon _{\mathrm {RN},1},\varepsilon _{\mathrm {RN}}}1$ , so the supremum on the LHS of equation (4.8) may be replaced by a sum.
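The bound on $\mathfrak {b}_{+}$ can be made explicit from its definition in Proposition 4.7. Assuming $\varepsilon _{1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$ , which holds for the choice $\varepsilon _{1}=1/14$ in Proposition 4.6, the largest admissible integer is
$$ \begin{align} \mathfrak{b}_{+} \ = \ \left\lfloor\frac{\frac12+\varepsilon_{\mathrm{RN}}-\varepsilon_{1}}{\varepsilon_{\mathrm{RN},1}}\right\rfloor \ \leqslant \ \frac{\frac12+\varepsilon_{\mathrm{RN}}}{\varepsilon_{\mathrm{RN},1}}. \end{align} $$
Since all three parameters are universal, the number of renormalization steps is bounded independently of N, so a supremum over these steps costs at most a constant factor when replaced by a sum.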
Proposition 4.8. Suppose that $\varepsilon _{\mathrm {RN},1}\leqslant 999^{-999}\varepsilon _{\mathrm {RN}}$ , where $\varepsilon _{\mathrm {RN}}>0$ is from Definition 3.1. We have the following deterministic estimate, again in which the $\|\|_{1;\mathbb {T}_{N}}$ norm is with respect to $(T,x)$ -variables in the heat operator on the LHS:
Remark 4.9. Note that $\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$ , so we have a priori regularity estimates for $\mathbf {Y}^{N}$ on the length scale $N^{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ defining the canonical measure expectation in equation (4.9); see Definitions 3.1 and 3.5 for why this is true.
Proof of Theorem 4.1.
We have the following tautological decomposition that uses linearity of the heat operator to replace $\bar {\mathfrak {q}}$ by its $\mathsf {E}^{\mathrm {can}}$ on length scale $N^{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ and then collects the error $\mathsf {S}$ :
We proceed with the following multiscale decomposition of the second term on the RHS of equation (4.10) that rewrites the difference $\mathsf {S}$ of $\bar {\mathfrak {q}}$ with $\mathsf {E}^{\mathrm {can}}$ on length scale $N^{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ in terms of a telescoping sum of the successive differences of $\mathsf {E}^{\mathrm {can}}$ terms on progressively larger length scales; again, the following is by definition and by linearity of the heat operator:
We plug equation (4.11) into the second term on the RHS of equation (4.10). We then take $\|\|_{1;\mathbb {T}_{N}}$ norms of both sides of the resulting identity, employ the triangle inequality for $\|\|_{1;\mathbb {T}_{N}}$ , take expectations and apply Proposition 4.6, Proposition 4.7, and Proposition 4.8.
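At the level of functionals, before multiplying by $N^{1/2}\mathbf {Y}^{N}$ and applying the heat operator, the telescoping used in this proof can be sketched purely from Definition 4.5. The summation range below is our reading of the definition of $\mathfrak {b}_{+}$ in Proposition 4.7 and should be treated as an assumption of this sketch:
$$ \begin{align} \bar{\mathfrak{q}}_{S,y} - \mathsf{E}^{\mathrm{can}}_{\varepsilon_{1}+\mathfrak{b}_{+}\varepsilon_{\mathrm{RN},1}}(\tau_{y}\eta_{S}) \ &= \ \bar{\mathfrak{q}}_{S,y} - \mathsf{E}^{\mathrm{can}}_{\varepsilon_{1}}(\tau_{y}\eta_{S}) + \sum_{\mathfrak{b}=0}^{\mathfrak{b}_{+}-1}\left(\mathsf{E}^{\mathrm{can}}_{\varepsilon_{1}+\mathfrak{b}\varepsilon_{\mathrm{RN},1}}(\tau_{y}\eta_{S}) - \mathsf{E}^{\mathrm{can}}_{\varepsilon_{1}+(\mathfrak{b}+1)\varepsilon_{\mathrm{RN},1}}(\tau_{y}\eta_{S})\right) \\ &= \ \mathsf{S}_{\varepsilon_{1}}(\tau_{y}\eta_{S}) + \sum_{\mathfrak{b}=0}^{\mathfrak{b}_{+}-1}\mathsf{R}_{\varepsilon_{1}+\mathfrak{b}\varepsilon_{\mathrm{RN},1}}(\tau_{y}\eta_{S}). \end{align} $$
The first term on the last line is the object of Proposition 4.6, each summand is of the type controlled in Proposition 4.7, and the remaining $\mathsf {E}^{\mathrm {can}}$ -expectation on the largest scale is handled by Proposition 4.8.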
We defer the proofs of Proposition 4.6 and Proposition 4.7 to the last nonappendix sections because of their complexity.
4.1 Proof of Proposition 4.8
The only preliminary ingredient we need for the current argument is the following estimate for which we employ crucially the a priori space-time regularity estimates in $\mathbf {Y}^{N}$ . Its proof is relatively quick; it is an idea used in [Reference Dembo and Tsai19] in the proof of the hydrodynamic limit estimate of Lemma 2.5 therein where $\eta $ -variables are realized as $\mathbf {h}^{N}$ gradients.
Lemma 4.10. Suppose the inequalities for $\varepsilon _{1}$ and $\varepsilon _{\mathrm {RN},1}$ and $\varepsilon _{\mathrm {RN}}$ and $\varepsilon _{\mathrm {ap}}$ in Definition 3.1 and Proposition 4.8 hold. Then we have the following deterministic estimates:
Proof. The first estimate in equation (4.12) is immediate by definition of $\mathbf {Y}^{N}$ in Definition 3.5. Indeed, it suffices to look just at times $T\leqslant \mathfrak {t}_{\mathrm {st}}$ because afterwards, we have $\mathbf {Y}^{N}=0$ . Similarly, until the stopping time $\mathfrak {t}_{\mathrm {st}}$ we have $\mathbf {Y}^{N}=\mathbf {Z}^{N}\kern-1.5pt$ , where $\mathbf {Z}^{N}$ is uniformly bounded by $N^{\varepsilon _{\mathrm {ap}}}$ times uniformly bounded factors. Thus, we are left with proving the second bound in equation (4.12). Note the following that relates the $\mathsf {A}^{\mathbf {X}}$ term to $\mathbf {h}^{N}$ , whose proof follows by $\eta _{T,x}=N^{1/2}(\mathbf {h}_{T,x}^{N}-\mathbf {h}_{T,x-1}^{N})$ and in which we set $\widetilde {\varepsilon }_{1}={\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ :
We refer to the proof of Lemma 2.5 in [Reference Dembo and Tsai19] for a similar identity in which $N^{\widetilde {\varepsilon }_{1}}$ is instead a small multiple of $N^{1/2}$. We now employ elementary calculus for the logarithm to establish the following estimate for the far RHS of equation (4.13). Roughly speaking, because the derivative of the logarithm is singular only at 0 and is otherwise uniformly smooth, the gradient on the far RHS of equation (4.13) may be controlled by the same gradient of $\mathbf {Z}^{N}$, multiplied by the space-time supremum of $(\mathbf {Z}^{N})^{-1}$. Extending equation (4.13) this way,
Observe that the space-time norms on the RHS of equation (4.14) are a space-time supremum until the stopping time $\mathfrak {t}_{\mathrm {st}}$ . Until this stopping time, we have a uniform upper bound for the first norm on the RHS of equation (4.14) of $N^{2\varepsilon _{\mathrm {ap}}}$ by definition. Similarly, because we have assumed the inequality $N^{\widetilde {\varepsilon }_{1}}\leqslant \mathfrak {l}_{N}$ by construction in Proposition 4.7, where $\widetilde {\varepsilon }_{1}=\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}$ is from Proposition 4.7 and $\mathfrak {l}_{N}\in \mathbb Z_{\geqslant 0}$ is from Definition 3.1, by definition of $\mathfrak {t}_{\mathrm {st}}$ in Definition 3.1 we get a priori spatial regularity estimates for $\mathbf {Z}^{N}\kern-1.5pt$ , which imply the second norm on the RHS of equation (4.14) is bounded above by $N^{2\varepsilon _{\mathrm {ap}}}N^{-1}(1+N^{\widetilde {\varepsilon }_{1}})(1+\|\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}})^{4}$ , which may be thought of as $N^{2\varepsilon _{\mathrm {ap}}}$ times the square of the spatial Holder regularity estimate of exponent $\frac 12$ for $\mathbf {Z}^{N}\kern-1.5pt$ . By Definition 3.1, we also know $\|\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}}\lesssim N^{\varepsilon _{\mathrm {ap}}}$ . Thus, we get via (4.14) and this paragraph that
Recall from Proposition 4.7 that $\mathfrak {b}_{+}$ is the largest nonnegative integer $\mathfrak {b}$ with $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$. As $\varepsilon _{\mathrm {RN},1}\leqslant 999^{-999}\varepsilon _{\mathrm {RN}}$ by our assumption, we obtain the lower bound $\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}\geqslant \frac 12+\frac {99}{100}\varepsilon _{\mathrm {RN}}$, for example. Indeed, if this lower bound failed, then we could increase $\mathfrak {b}_{+}$ by 1 while adding at most $999^{-999}\varepsilon _{\mathrm {RN}}$ to the left-hand side, and this would not push a quantity below $\frac 12+\frac {99}{100}\varepsilon _{\mathrm {RN}}$ past $\frac 12+\varepsilon _{\mathrm {RN}}$, contradicting the maximality of $\mathfrak {b}_{+}$. Combining equation (4.15) with this lower bound for $\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}$ finishes the proof of the lemma.
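As a sanity check on the exponent bookkeeping in the last paragraph, the following minimal sketch computes $\mathfrak {b}_{+}$ for illustrative parameter values and confirms the two-sided bound. The function name `largest_step` and the numerical values of $\varepsilon _{1}$, $\varepsilon _{\mathrm {RN}}$ and $\varepsilon _{\mathrm {RN},1}$ are assumptions chosen only for illustration (in particular, the ratio $10^{-3}$ stands in for $999^{-999}$); they are not taken from Definition 3.1.

```python
from fractions import Fraction
from math import floor

def largest_step(eps_1, eps_rn, eps_rn1):
    """Largest nonnegative integer b with eps_1 + b*eps_rn1 <= 1/2 + eps_rn."""
    target = Fraction(1, 2) + eps_rn
    if eps_1 > target:
        raise ValueError("no nonnegative b satisfies the constraint")
    return floor((target - eps_1) / eps_rn1)

# Illustrative (hypothetical) values; 10**-3 stands in for 999**-999.
eps_rn = Fraction(1, 100)
eps_rn1 = eps_rn / 1000
eps_1 = Fraction(1, 14)

b_plus = largest_step(eps_1, eps_rn, eps_rn1)
eps_tilde = eps_1 + b_plus * eps_rn1

upper = Fraction(1, 2) + eps_rn
lower = Fraction(1, 2) + Fraction(99, 100) * eps_rn
print(b_plus, lower <= eps_tilde <= upper)
```

Exact rational arithmetic is used so that the comparison with the thresholds is not affected by floating-point rounding.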
We proceed with the proof of Proposition 4.8. The first step we take is to replace the canonical measure expectation $\mathsf {E}^{\mathrm {can}}$ in the heat operator on the LHS of equation (4.9) by a grand-canonical measure expectation $\mathsf {E}^{\mathrm {gc}}$ evaluated at the same $\eta $-density $\sigma _{\widetilde {\varepsilon }_{1},S,y}$ and the same functional $\bar {\mathfrak {q}}$, where we have again employed the notation $\widetilde {\varepsilon }_{1}={\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ introduced in the proof of Lemma 4.10 just to ease notation. For this, we apply Proposition 8 in [Reference Goncalves and Jara24] with the choice of function $f=\mathfrak {q}_{0,0}$ and with the choice of length scale therein to be $\ell =N^{\widetilde {\varepsilon }_{1}}$:
The last/second inequality in equation (4.16) follows by the same observation that we made in the final paragraph in the proof of Lemma 4.10. If we multiply the LHS by $N^{1/2}\mathbf {Y}^{N}_{S,y}$ and put this in the heat operator, since $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ , it is enough to show Proposition 4.8 but with $\mathsf {E}^{\mathrm {gc}}$ in place of $\mathsf {E}^{\mathrm {can}}$ , therefore completing the desired first step/replacement. To control $\mathsf {E}^{\mathrm {gc}}$ , let us first recall from Definition 4.5 that $\mathsf {E}^{\mathrm {gc}}$ is expectation of $\bar {\mathfrak {q}}_{0,0}$ with respect to a grand-canonical ensemble of parameter $\sigma _{\widetilde {\varepsilon }_{1},S,y}$ . We will now Taylor expand this function of $\sigma _{\widetilde {\varepsilon }_{1},S,y}$ up to second order around the value $\sigma =0$ and obtain the following estimate:
The first term on the RHS of equation (4.17) is easily checked to be 0, as the linear term in $\bar {\mathfrak {q}}$ has expectation 0, and what is left is just $\widetilde {\mathfrak {q}}_{0,0}$ minus its expectation with respect to ${\mathbf {E}_{0}}$ . The key idea is that the second term also vanishes because $\bar {\mathfrak {d}}\partial _{\sigma }{\mathbf {E}_{\sigma }}\eta =\bar {\mathfrak {d}}\partial _{\sigma }\sigma =\bar {\mathfrak {d}}$ , and $\bar {\mathfrak {d}}$ is defined to equal $\partial _{\sigma }{\mathbf {E}_{\sigma }}\widetilde {\mathfrak {q}}_{0,0}|_{\sigma =0}$ , while the constant expectation of $\bar {\mathfrak {q}}_{0,0}$ certainly vanishes after $\partial _{\sigma }$ differentiation. Let us refer the reader to Definition 2.2 for definitions of all functionals and factors just mentioned. Thus, by this paragraph and equation (4.17), we are left with proving equation (4.9) upon replacing $\mathsf {E}^{\mathrm {can}}$ with $\mathsf {E}^{\mathrm {gc}}$ and then replacing $\mathsf {E}^{\mathrm {gc}}$ with the big-Oh term on the RHS of equation (4.17). That estimate follows by Lemma 4.10, as $\sigma _{\widetilde {\varepsilon }_{1},S,y}={\mathsf {A}^{\mathbf {X}}_{\widetilde {\varepsilon }_{1},y}(\eta _{S})}$ by definition. This completes the proof.
Remark 4.11. If we were to apply our method to environment dependence in reversible dynamics, it is fairly standard [Reference Brox and Rost6, Reference Chang and Yau8, Reference Kipnis and Landim37] that we would need to prove Theorem 4.1 but with the spatial gradient of the heat operator on the LHS of equation (4.1). Since gradients of $\mathbf {H}^{N}$ introduce higher-degree short-time singularities of $\mathbf {H}^{N}$ , we need to resolve more singular factors during the proof of equation (4.1) with this extra gradient. There are ultimately several possible ways to resolve such singularities. For the purposes of computing scaling limits of fluctuations, however, for linear non-KPZ limits of interest in [Reference Chang and Yau8], for example, the simplest would be to smooth the short-time behavior of $\mathbf {H}^{N}$ by convolving against a time-1 heat kernel. This would remove the higher-order singularity while only changing this paper by revising the fluctuation scaling limit of main interest to hold only after smoothing, thus with respect to a weaker topology that is the topology used for fluctuation scaling limits in previous literature anyway; see [Reference Brox and Rost6, Reference Chang and Yau8, Reference Jara and Menezes34, Reference Kipnis and Landim37]. But in the current paper, the singular on-diagonal factors in $\mathbf {H}^{N}$ actually pose no issue in proving convergence in Theorem 1.8 in quite a strong sense. This is a concrete example of ‘analytic’ strength of our method, compatible with PDE ideas to solve $\mathrm {SHE}$ .
5 Boltzmann–Gibbs principle II
The point of this section is to establish a second version of the nonstationary first-order Boltzmann–Gibbs principle. To motivate it, we emphasize that the proof of Theorem 4.1 requires important a priori space-time regularity estimates on $\mathbf {Z}^{N}$ that were engineered into the definition of $\mathbf {Y}^{N}$ via the stopping time $\mathfrak {t}_{\mathrm {st}}$. We will need to establish such a priori space-time regularity estimates in order for $\mathbf {Y}^{N}$ to be a faithful proxy for $\mathbf {Z}^{N}$. It turns out that establishing the important time-regularity estimates is a rather straightforward set of moment estimates for the $\mathbf {Z}^{N}$ equation. However, for technical reasons, this is not true for establishing the required spatial regularity defining $\mathfrak {t}_{\mathrm {st}}$. Indeed, a direct moment bound on spatial regularity of the order $N^{1/2}$ term in the stochastic equation from Corollary 2.6, without analyzing $\bar {\mathfrak {q}}$ carefully, ends up being much worse than the required spatial regularity estimate in $\mathfrak {t}_{\mathrm {st}}$. To resolve this issue, we will need to estimate spatial gradients of the order $N^{1/2}$ term in the stochastic equation from Corollary 2.6 by taking advantage of the fluctuating behavior of the $\bar {\mathfrak {q}}$ function as we did in the proof of Theorem 4.1. This leads to our second version of the Boltzmann–Gibbs principle in Theorem 5.3, which we present after introducing some notation.
Definition 5.1. Consider any $\phi :\mathbb {T}_{N}\to \mathbb R$ . Define the following normalized maximal gradient on the length scale $\mathfrak {l}_{+}\in \mathbb Z_{\geqslant 0}$ :
We extend the previous normalized maximal gradient to heat operators in the following fashion in which $\Phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ :
Remark 5.2. Intuitively, provided any function $\phi :\mathbb {T}_{N}\to \mathbb R$ that is ‘smooth’ on scale $\mathfrak {l}_{+}\in \mathbb Z_{\geqslant 0}$, its normalized maximal gradient on this length scale will be controlled, roughly speaking. The goal for Theorem 5.3 will be to prove that the homogenization estimate in Theorem 4.1, or actually a slightly weaker version, holds not just uniformly in space-time but at the level of normalized maximal gradients with respect to the length scale $\mathfrak {l}_{N}$ in Definition 3.1 on which we want to get spatial regularity of the Gartner transform. Let us also emphasize that the above extensions of the normalized maximal gradients to the spatial and space-time heat operators are emphatically not the normalized maximal gradients of the heat operators themselves when we view them as functions in their own right. This is because of the absolute value inside the sum and integral in (5.2).
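The distinction drawn at the end of Remark 5.2 can be seen in a toy computation. The sketch below is only an illustration under loud assumptions: it uses a hypothetical box-averaging kernel on a discrete torus rather than the heat kernel $\mathbf {H}^{N}$, a one-step gradient rather than a maximum over length scales, and no normalization. It compares the gradient of the smoothed function with the quantity obtained by placing absolute values inside the sum.

```python
def kernel_gradient(n, width):
    """Row x -> list over y of K(x+1,y) - K(x,y) for a box kernel on Z/nZ."""
    def K(x, y):
        # Hypothetical kernel: uniform average over a window of `width` sites.
        return 1.0 / width if (y - x) % n < width else 0.0
    return [[K((x + 1) % n, y) - K(x, y) for y in range(n)] for x in range(n)]

def grad_of_operator(dK, phi):
    """max_x |sum_y dK[x][y] * phi[y]|: gradient of the smoothed function."""
    return max(abs(sum(d * p for d, p in zip(row, phi))) for row in dK)

def abs_inside(dK, phi):
    """max_x sum_y |dK[x][y]| * |phi[y]|: absolute values inside the sum."""
    return max(sum(abs(d) * abs(p) for d, p in zip(row, phi)) for row in dK)

n, width = 16, 4
dK = kernel_gradient(n, width)
phi = [(-1) ** y for y in range(n)]  # rapidly oscillating test function

print(grad_of_operator(dK, phi), abs_inside(dK, phi))
```

For this oscillating input the first quantity benefits from cancellation while the second does not, which is the sense in which the extension with absolute values is a genuinely larger object than the gradient of the operator itself.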
Theorem 5.3. There exists a universal constant $\beta>0$, necessarily uniformly bounded below, that is independent of $\varepsilon _{\mathrm {RN}}>0$ from Definition 3.1 such that for the length scale $\mathfrak {l}_{N}$ in Definition 3.1, we have the expectation estimate
We clarify there are no absolute value bars around the normalized maximal gradient ‘operator’ on the LHS of equation (5.3).
The proof of Theorem 5.3 is similar to that of Theorem 4.1 in architecture; it is a mix of probabilistic homogenization estimates along with stochastic regularity estimates built into the $\mathbf {Y}^{N}$ process in the heat operator on the LHS of equation (5.3). However, before we discuss the proof, we briefly explain its utility; this will be explored in detail in Section 6. Recall our motivation for Theorem 5.3 is to show a priori spatial regularity in $\mathfrak {t}_{\mathrm {st}}$ ‘propagates itself’ with high probability, thus $\mathfrak {t}_{\mathrm {st}}=1$ with high probability. Take any $\mathfrak {l}$ with $|\mathfrak {l}|\lesssim |\mathfrak {l}_{N}|$, where $\mathfrak {l}_{N}$ is as in Definition 3.1/Theorem 5.3. Theorem 5.3 gives, with high probability simultaneously in $\mathfrak {l}$,
The last bound in the above display follows via bounding $|\mathfrak {l}|\lesssim |\mathfrak {l}_{N}|=N^{1/2+\varepsilon _{\mathrm {RN}}}$ . Because $\beta>0$ in Theorem 5.3 is uniformly bounded below and independent of $\varepsilon _{\mathrm {RN}}$ while $\varepsilon _{\mathrm {RN}}$ is arbitrarily small but universal, the far RHS of the previous display is at most $N^{-1/2}|\mathfrak {l}|^{1/2}$ , proving the a priori spatial regularity of $\mathbf {Y}^{N}$ at least propagates the same level of spatial regularity of the order $N^{1/2}$ term in the stochastic equation for $\mathbf {U}^{N}$ ; we employ another argument in Section 7 via Lemma 3.7 to transfer this to $\mathbf {Z}^{N}\kern-1.5pt$ .
5.1 Proof of Theorem 5.3
Take any $(T,x)\in [0,1]\times \mathbb {T}_{N}\kern-1.5pt$ , and define $T_{N}=T-N^{-1/2-999\varepsilon _{\mathrm {RN}}}$ . The triangle inequality gives
The second term on the RHS of equation (5.5) is estimated deterministically. The first will be estimated basically via Theorem 4.1.
Lemma 5.4. We have the following estimate for the length scale $\mathfrak {l}_{N}$ in Definition 3.1:
Lemma 5.5. There exists a universal constant $\beta>0$ such that for $\mathfrak {l}_{N}$ in Definition 3.1, we have
Clearly, the triangle inequality (5.5) combined with Lemmas 5.4 and 5.5 implies Theorem 5.3 as $\varepsilon _{\mathrm {ap}}\leqslant 999^{-999}\varepsilon _{\mathrm {RN}}$ . Lemma 5.4 will be a straightforward consequence of heat estimates in Proposition A.3. Lemma 5.5 will be proved more delicately:
• We will replace $\bar {\mathfrak {q}}$ on the LHS of equation (5.7) with $\mathsf {E}^{\mathrm {can}}_{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ from Proposition 4.8. However, we cannot directly cite Proposition 4.6 and Proposition 4.7 for this because of the normalized maximal gradient on the LHS of equation (5.7) that is absent from Proposition 4.6 and Proposition 4.7. This will require gymnastics with heat operators that we demonstrate when we give a precise proof.
• Having made the previous replacement, we observe the proof of Proposition 4.8 is done through Lemma 4.10, which provides a deterministic estimate for $\mathsf {E}^{\mathrm {can}}_{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ . Thus, we will have the gradient of the heat operator acting on a small function; this is small by the heat estimates in Proposition A.3.
5.1.1 Proof of Lemma 5.4
Recall $\bar {\mathfrak {q}}\lesssim 1$ and $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ by construction in Definitions 3.1 and 3.5. Therefore, we have the following straightforward bound by controlling an integral/sum by replacing the integrand/summand with its absolute value:
It suffices to note that the $\|\|_{1;\mathbb {T}_{N}}$-norm on the RHS of equation (5.8) is bounded by $N^{-1-1/4-999\varepsilon _{\mathrm {RN}}/2}$, since the heat operator is smooth on the macroscopic length scale $N$, which provides the factor of $N^{-1}$; see equation (A.6) in Proposition A.3. We emphasize that the time integral in the heat operator is short: the indicator function of $S\geqslant T_{N}$ restricts it to an interval of length $N^{-1/2-999\varepsilon _{\mathrm {RN}}}$.
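The exponent count behind this bound can be sketched as follows. The sketch assumes, as quoted in the proof of Lemma 5.5 rather than restated here from Proposition A.3, that the spatial gradient of the heat kernel at time lag $r$ has $1$-norm of order $N^{-1}r^{-1/2}$. Writing $\mathrm {t}_{(N)}=T-T_{N}=N^{-1/2-999\varepsilon _{\mathrm {RN}}}$ for the length of the time window,

$$ \begin{aligned} \int_{0}^{\mathrm{t}_{(N)}}N^{-1}r^{-1/2}\,\mathrm{d}r \ = \ 2N^{-1}\mathrm{t}_{(N)}^{1/2} \ = \ 2N^{-1}N^{-\frac14-\frac{999}{2}\varepsilon_{\mathrm{RN}}} \ = \ 2N^{-1-\frac14-\frac{999}{2}\varepsilon_{\mathrm{RN}}}. \end{aligned} $$

The integrable singularity $r^{-1/2}$ at short times is what converts the window length $\mathrm {t}_{(N)}$ into the factor $\mathrm {t}_{(N)}^{1/2}$.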
5.1.2 Proof of Lemma 5.5
We will first employ the following triangle inequality, recalling notation from Proposition 4.8:
Following the proof of Proposition 4.8, note the $\mathsf {E}^{\mathrm {can}}\mathbf {Y}^{N}$ term in equation (5.10) is at most $N^{5\varepsilon _{\mathrm {ap}}}N^{-1/2-99\varepsilon _{\mathrm {RN}}/100}$ deterministically. Thus, we get the following where we again use (A.6) in Proposition A.3 to get the last bound below as we did in the proof of Lemma 5.4, while to get the first bound we also drop the time-set indicator function in the heat operator after replacing everything in the heat operator by its absolute value, including the heat kernel gradient, which is okay for the sake of an upper bound:
Because $\varepsilon _{\mathrm {ap}}\leqslant 999^{-999}\varepsilon _{\mathrm {RN}}$ , the above display shows the contribution of equation (5.10) is certainly controlled by the RHS of the proposed estimate (5.7). Thus, it suffices to prove the same about equation (5.9). To this end, we consider the following.
• For any $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ , we have the following identity by Proposition A.3 in which $\mathrm {t}_{(N)}=T-T_{N}=N^{-1/2-999\varepsilon _{\mathrm {RN}}}$ ; below, on the RHS, the outer spatial heat operator sums over $w\in \mathbb {T}_{N}$ and the inner space-time heat operator integrates/sums over space-time variables $(S,y)$ :
(5.11) $$ \begin{align} \mathbf{H}_{T,x}^{N}(\phi_{S,y}\mathbf{1}_{S\leqslant T_{N}}) \ = \ \mathbf{H}^{N,\mathbf{X}}_{\mathrm{t}_{(N)},x}\left(\mathbf{H}^{N}_{T_{N},w}(\phi_{S,y})\right). \end{align} $$
Taking normalized maximal gradients, we obtain the following estimate from the above identity. Let the normalized maximal gradient act on the outer spatial heat operator on the RHS of the previous identity. We control this normalized maximal gradient of the spatial heat operator by pulling out the $\|\|_{1;\mathbb {T}_{N}}$-norm of the inner space-time heat operator it acts on, replacing spatial gradients of $\mathbf {H}^{N}$ by their absolute values, and summing over $\mathbb {T}_{N}$:
(5.12) $$ \begin{align} \|\widetilde{\nabla}_{\mathfrak{l}_{N}}^{\mathbf{X}}\mathbf{H}_{T,x}^{N}(\phi_{S,y}\mathbf{1}_{S\leqslant T_{N}})\|_{1;\mathbb{T}_{N}} \ \leqslant \ \||\widetilde{\nabla}_{\mathfrak{l}_{N}}^{\mathbf{X}}|\mathbf{H}_{\mathrm{t}_{(N)},x}^{N,\mathbf{X}}(1)\|_{1;\mathbb{T}_{N}}\|\mathbf{H}^{N}(\phi_{S,y})\|_{1;\mathbb{T}_{N}}. \end{align} $$
• The first factor within the RHS of equation (5.12) above is the 1-norm on $\mathbb {T}_{N}$ in the forwards spatial variable of the spatial gradient of the $\mathbf {H}^{N}$ heat kernel at time $\mathrm {t}_{(N)}$ , maximized over $\mathbb {T}_{N}$ with respect to the backwards spatial variable. Via equation (A.3) in Proposition A.3, this is at most uniformly bounded factors times $N^{-1}\mathrm {t}_{(N)}^{-1/2}\lesssim N^{-3/4+999\varepsilon _{\mathrm {RN}}/2}$ . Therefore, we get from this and equation (5.12)
(5.13) $$ \begin{align} \|\widetilde{\nabla}_{\mathfrak{l}_{N}}^{\mathbf{X}}\mathbf{H}_{T,x}^{N}(\phi_{S,y}\mathbf{1}_{S\leqslant T_{N}})\|_{1;\mathbb{T}_{N}} \ \lesssim \ N^{-\frac34+\frac{999}{2}\varepsilon_{\mathrm{RN}}}\|\mathbf{H}^{N}(\phi_{S,y})\|_{1;\mathbb{T}_{N}}. \end{align} $$
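The identity (5.11) is an instance of the semigroup property of the heat kernel. A minimal sketch of the computation follows, in kernel notation that is hypothetical and not taken from Proposition A.3: we write $\mathbf{K}_{r,x,y}$ for a time-homogeneous heat kernel at time lag $r$, and we assume that $\mathbf{H}^{N}_{T,x}(\phi)=\int_{0}^{T}\sum_{y}\mathbf{K}_{T-S,x,y}\phi_{S,y}\,\mathrm{d}S$, that $\mathbf{H}^{N,\mathbf{X}}_{r,x}(\psi)=\sum_{w}\mathbf{K}_{r,x,w}\psi_{w}$, and that $T_{N}\geqslant 0$. The semigroup property $\mathbf{K}_{T-S,x,y}=\sum_{w}\mathbf{K}_{T-T_{N},x,w}\mathbf{K}_{T_{N}-S,w,y}$ for $S\leqslant T_{N}$ then gives

$$ \begin{aligned} \mathbf{H}^{N}_{T,x}(\phi_{S,y}\mathbf{1}_{S\leqslant T_{N}}) \ &= \ \int_{0}^{T_{N}}\sum_{y\in\mathbb{T}_{N}}\mathbf{K}_{T-S,x,y}\phi_{S,y}\,\mathrm{d}S \\ &= \ \sum_{w\in\mathbb{T}_{N}}\mathbf{K}_{\mathrm{t}_{(N)},x,w}\int_{0}^{T_{N}}\sum_{y\in\mathbb{T}_{N}}\mathbf{K}_{T_{N}-S,w,y}\phi_{S,y}\,\mathrm{d}S \ = \ \mathbf{H}^{N,\mathbf{X}}_{\mathrm{t}_{(N)},x}\left(\mathbf{H}^{N}_{T_{N},w}(\phi_{S,y})\right). \end{aligned} $$

The point of the factorization is that any spatial gradient in $x$ now falls only on the outer kernel at the fixed positive time $\mathrm{t}_{(N)}$, which is what makes the bound (5.13) free of short-time singularities.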
We use equation (5.13) for $\phi =N^{1/2}(\bar {\mathfrak {q}}-\mathsf {E}^{\mathrm {can}}_{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}})\mathbf {Y}^{N}$ . Following the multiscale decomposition (4.11) in the proof of Theorem 4.1, by Propositions 4.6 and 4.7, we get the following that we explain shortly; below, $\beta>0$ is universal and independent of $\varepsilon _{\mathrm {RN}}$ :
The independence from $\varepsilon _{\mathrm {RN}}$ /universal feature of the exponent $\beta $ on the RHS of (5.14) follows by the observation that to replace $\bar {\mathfrak {q}}$ with $\mathsf {E}^{\mathrm {can}}_{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}}$ from the proof of Theorem 4.1 using Proposition 4.6 and Proposition 4.7, the estimates in Proposition 4.6 and Proposition 4.7 have upper bounds that are universal negative powers of N independent of $\varepsilon _{\mathrm {RN}}$ . Taking $\varepsilon _{\mathrm {RN}}$ sufficiently small shows that equation (5.9) is bounded above by the far RHS of equation (5.14), and thus controlled by the RHS of the proposed estimate (5.7), upon possibly adjusting the value of $\beta $ by a universal positive factor. □
Remark 5.6. If we take a second-order spatial gradient in Theorem 5.3 instead of first-order gradient, which would be relevant if we were to apply our method to derive Boltzmann–Gibbs principles to study environment dependence in the reversible dynamics of the particle system, we would have to resolve a higher-degree short-time singularity of the heat kernel. Unlike Remark 4.11, however, we cannot just smooth since Theorem 5.3 will be used later to control density fluctuations, which by definition leads us to the LHS of equation (5.3) without smoothing. Instead, for non-KPZ fluctuations of interest in [Reference Chang and Yau8], we estimate Sobolev regularity of the density fluctuation $N\nabla ^{\mathbf {X}}_{1}\mathbf {h}^{N}$ by the proof of Theorem 2 in Chang–Yau [Reference Chang and Yau8]. It amounts to estimating said regularity by a general energy estimate that becomes useful if we have an ‘a priori’ Boltzmann–Gibbs principle. Regularity gives the Boltzmann–Gibbs principle via our local method. Then we iterate via fixed-point methods, using this Boltzmann–Gibbs principle to get regularity and so forth. Though this approach is inapplicable here since we study singular KPZ fluctuations, we make this remark in case of potential interest and to emphasize how one may apply our methods to generalize [Reference Chang and Yau8], for example, to nontrivial perturbations of environment-dependent exclusion processes as in [Reference Jara and Landim33, Reference Jara and Menezes34] or open boundary models, as we noted in the introduction.
6 Regularity estimates
The purpose of this section is to establish a ‘self-propagating’ aspect of the a priori regularity estimates defining $\mathfrak {t}_{\mathrm {st}}$ and $\mathbf {Y}^{N}\kern-1.5pt$ ; see Definitions 3.1 and 3.5. The self-propagating feature of the time-regularity estimate follows from a fairly straightforward set of moment estimates; see (3.14) in Proposition 3.2 in [Reference Dembo and Tsai19]. The self-propagating feature of the spatial regularity estimates will require the second Boltzmann–Gibbs principle in Theorem 5.3, and this will produce a weaker but sufficient result.
Proposition 6.1. Consider any arbitrarily small but universal constant $\vartheta>0$. Given any possibly random time $\mathfrak {t}_{\mathrm {r}}\in [0,1]$, let us define the following pair of events, in which we recall the notation of Definition 3.1:
There exists a universal constant $\beta _{\mathrm {r}}>0$ , which is thus uniformly bounded from below, such that for any $\kappa>0$ , we have
6.1 $\mathcal {E}_{\vartheta }^{\mathbf {T}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ estimate
We first focus on getting the time-regularity estimate, namely that for the $\mathcal {E}_{\vartheta }^{\mathbf {T}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ probability. Following the proof of time regularity estimates for the Gartner transform in Proposition 3.2 of [Reference Dembo and Tsai19], we estimate time regularity of $\mathbf {U}^{N}$ by its defining stochastic equation in Definition 3.5. Specifically, we control time regularity of each term therein. We start with the following result that does this for all terms except initial data and $\mathrm {d}\xi ^{N}$ terms, which we treat separately.
Lemma 6.2. Take any $|\mathfrak {k}|\lesssim 1$ . We have the following deterministic estimate in which $\mathfrak {f}_{1},\mathfrak {f}_{2}:\Omega \to \mathbb R$ are uniformly bounded:
Proof. We will apply time-regularity estimates on the heat operator $\mathbf {H}^{N}$ from Proposition A.3 to estimate the first supremum on the LHS of equation (6.4). We additionally apply a mixed space-time regularity estimate on $\mathbf {H}^{N}$ in Proposition A.3 to estimate the second supremum on the LHS of equation (6.4). The former time regularity estimate gives the following in which the equality follows trivially, and the last estimate follows by recalling that we are restricting to timescales $\mathrm {s}\leqslant N^{-1}$ and that $|\mathbf {Y}^{N}|\lesssim N^{\varepsilon _{\mathrm {ap}}}$ by construction:
Similarly, the aforementioned mixed space-time regularity estimate for $\mathbf {H}^{N}$ in Proposition A.3 gives
Combining the previous two estimates (6.5) and (6.6) while recalling $\varepsilon _{\mathrm {ap}}$ is arbitrarily small but still universal would provide the proposed estimate (6.4) if we dropped the 1 term on the RHS of equation (6.4), and the squared norms therein were replaced by nonsquared norms. But this would imply equation (6.4) as written via the inequality $2|a|\leqslant 1+a^{2}$ , which holds for all $a\in \mathbb R$ .
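For completeness, the elementary inequality invoked in the last step follows from expanding a square:

$$ \begin{aligned} 0 \ \leqslant \ (1-|a|)^{2} \ = \ 1-2|a|+a^{2} \quad\Longrightarrow\quad 2|a| \ \leqslant \ 1+a^{2}, \quad\text{and thus}\quad |a| \ \leqslant \ \tfrac12(1+a^{2}) \ \leqslant \ 1+a^{2}. \end{aligned} $$

Applied with $a$ equal to a norm of $\mathbf {Y}^{N}$, this converts an upper bound that is linear in that norm into one of the form $1$ plus the squared norm, which is the shape described for the RHS of equation (6.4).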
We will proceed by estimating time regularity of the initial data term from the $\mathbf {U}^{N}$ equation in Definition 3.5. At this point in this subsection, in contrast to Lemma 6.2, our estimates will not be deterministic. In particular, the proof of the following estimate is based on establishing uniform upper bounds on moments of time gradients for each point in space-time. We will then glue these estimates together to establish a high-probability time-regularity estimate simultaneously over some very fine discretization of space-time. We then conclude with a much simpler estimate to control submicroscopic short-time regularity. This will allow us to bootstrap from a discrete set of times to a continuous set of times.
Lemma 6.3. Consider any $\vartheta ,\kappa>0$ arbitrarily small and large, respectively, but both universal. We have
Proof. We proceed with steps briefly outlined prior to the statement of Lemma 6.3, namely a pointwise moment estimate, a union bound estimate and the short-time regularity estimate, which we write in this order.
• Observe by Proposition A.3, for example, the spatial operators $\mathbf {H}^{N,\mathbf {X}}$ satisfy the classical semigroup property for heat kernels and Markov processes. Thus, because we assume stable initial data for the Gartner transform, we will follow the proof of (3.14) of Proposition 3.2 in [Reference Dembo and Tsai19] to get the following for fixed $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ and $T\geqslant 0$ and $x\in \mathbb {T}_{N}$ with arbitrary $p\geqslant 1$ and $\gamma>0$ ; below, the $\|\|_{\omega ;2p}$ -norm is with respect to the randomness in the particle system:
(6.8) $$ \begin{align} \|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})\|_{\omega;2p} \ \lesssim_{p,\gamma} \ \mathrm{s}^{1/4-\gamma} \ \lesssim \ N^{2\gamma}\mathrm{s}^{1/4}. \end{align} $$
Recall from the definition of stable initial data that we may take any $\gamma>0$ in the previous estimate (6.8). The last estimate in equation (6.8) follows as $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ implies $\mathrm {s}\geqslant N^{-2}$, and thus $\mathrm {s}^{-\gamma }\leqslant N^{2\gamma }$. Applying the Chebyshev inequality, from equation (6.8) we establish the following probability estimate where, provided any $\vartheta ,\kappa>0$, we take $\gamma>0$ sufficiently small with $p\geqslant 1$ sufficiently large but both depending only on $\vartheta ,\kappa>0$ so that $2p\gamma -2p\vartheta \leqslant -2\kappa $; we note that the following is uniform in space-time:
(6.9) $$ \begin{align} \mathbf{P}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})| \geqslant N^{\vartheta}\mathrm{s}^{\frac14}\right) \ \lesssim_{p,\gamma} \ N^{-2p\vartheta}N^{2p\gamma} \ \leqslant \ N^{-2\kappa}. \end{align} $$
We emphasize that the dependence on $\gamma>0$ and $p\geqslant 1$ in equation (6.9) is now dependence on $\vartheta ,\kappa>0$.
• Consider a time-discretization $\mathbb {I}^{\mathbf {T},\mathrm {d}}=\{\mathfrak {j}N^{-99}\}_{\mathfrak {j}=0}^{N^{99}}$ , and let $\mathbb {I}^{\mathrm {d}}=\mathbb {I}^{\mathbf {T},\mathrm {d}}\times \mathbb {T}_{N}$ be the space-time for this time discretization. We now employ a union bound along with the previous probability estimate (6.9) to get
(6.10) $$ \begin{align} \mathbf{P}\left(\sup_{\mathrm{s}\in\mathbb{I}^{\mathbf{T}}} \sup_{(T,x)\in\mathbb{I}^{\mathrm{d}}}|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})|\geqslant N^{\vartheta}\mathrm{s}^{\frac14}\right) &\leqslant \sum_{\substack{\mathrm{s}\in\mathbb{I}^{\mathbf{T}}\\ (T,x)\in\mathbb{I}^{\mathrm{d}}}}\mathbf{P} \left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})| \geqslant N^{\vartheta}\mathrm{s}^{\frac14}\right) \lesssim N^{-2\kappa+101}. \end{align} $$
We emphasize that the final estimate on the far RHS of equation (6.10) follows from applying equation (6.9) to each probability in the summation in the middle of equation (6.10) and then multiplying by the size of the product set $\mathbb {I}^{\mathbf {T}}\times \mathbb {I}^{\mathrm {d}}$; the size of $\mathbb {I}^{\mathbf {T}}$ from Definition 3.5 is uniformly bounded by $\kappa _{\varepsilon _{\mathrm {ap}}}N^{\varepsilon _{\mathrm {ap}}}$ because it is parameterized by one index set of size $N^{\varepsilon _{\mathrm {ap}}}$ and another index set that is in bijection with the set of exponents $\{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}\}_{\mathfrak {j}\geqslant 0}\cap [-2,1]$. We also used the upper bound $|\mathbb {I}^{\mathrm {d}}|=|\mathbb {I}^{\mathbf {T},\mathrm {d}}||\mathbb {T}_{N}|\lesssim N^{99}N=N^{100}$.
• Let us now bootstrap from the discretization estimate (6.10) to the proposed estimate (6.7) over the entire semidiscrete space-time $[0,\mathfrak {t}_{\mathrm {r}}]\times \mathbb {T}_{N}$. To this end, let us first observe that $\mathbf {H}^{N,\mathbf {X}}(\mathbf {Z}^{N})$ is in the kernel of the operator $\partial _{T}-\mathscr {L}_{N}$ because it is a linear combination of heat kernels, each of which vanishes under this operator. Thus, given any $0\leqslant \mathfrak {t}_{1}\leqslant \mathfrak {t}_{2}$, we have
(6.11) $$ \begin{align} \sup_{x\in\mathbb{T}_{N}}|\mathbf{H}^{N,\mathbf{X}}_{\mathfrak{t}_{2},x}(\mathbf{Z}_{0,\bullet}^{N})-\mathbf{H}^{N,\mathbf{X}}_{\mathfrak{t}_{1},x}(\mathbf{Z}_{0,\bullet}^{N})| \ &\leqslant \ \int_{\mathfrak{t}_{1}}^{\mathfrak{t}_{2}}\sup_{x\in\mathbb{T}_{N}}|\mathscr{L}_{N}\mathbf{H}_{r,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\bullet}^{N})| \mathrm{d} r \nonumber\\ &\leqslant \ |\mathfrak{t}_{2}-\mathfrak{t}_{1}|\sup_{\mathfrak{t}_{1}\leqslant r\leqslant\mathfrak{t}_{2}}\sup_{x\in\mathbb{T}_{N}}|\mathscr{L}_{N}\mathbf{H}_{r,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\bullet}^{N})|. \end{align} $$
Observe $\mathscr {L}_{N}:\mathscr {L}^{\infty }(\mathbb {T}_{N})\to \mathscr {L}^{\infty }(\mathbb {T}_{N})$ has operator norm $\mathrm {O}(N^{2})$; see Proposition 2.4. Combining this with equation (6.11), and the observation in Proposition A.3 that the spatial heat operator $\mathbf {H}^{N,\mathbf {X}}:\mathscr {L}^{\infty }(\mathbb {T}_{N})\to \mathscr {L}^{\infty }(\mathbb {T}_{N})$ has operator norm 1, provides
(6.12) $$ \begin{align} \sup_{x\in\mathbb{T}_{N}}|\mathbf{H}^{N,\mathbf{X}}_{\mathfrak{t}_{2},x}(\mathbf{Z}_{0,\bullet}^{N})-\mathbf{H}^{N,\mathbf{X}}_{\mathfrak{t}_{1},x}(\mathbf{Z}_{0,\bullet}^{N})| \ \lesssim \ N^{2}|\mathfrak{t}_{2}-\mathfrak{t}_{1}|\sup_{\substack{\mathfrak{t}_{1}\leqslant r\leqslant\mathfrak{t}_{2}\\x\in\mathbb{T}_{N}}}|\mathbf{H}_{r,x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\bullet}^{N})| \ \leqslant \ N^{2}|\mathfrak{t}_{2}-\mathfrak{t}_{1}|\|\mathbf{Z}^{N}\|_{0;\mathbb{T}_{N}}. \end{align} $$
We will now establish the proposed estimate (6.7). First, we observe that, choosing $\kappa \geqslant 300$ arbitrarily large but still universal, we may work on the complement of the event in the probability on the far LHS of equation (6.10); this complement fails with probability at most $N^{-2\kappa +101}\leqslant N^{-3\kappa /2}$ times factors depending only on $\vartheta ,\kappa $.
Given any $\mathrm {t}\in [0,\mathfrak {t}_{\mathrm {r}}]$, let $\mathrm {t}^{\mathrm {d}}$ denote any element in $\mathbb {I}^{\mathbf {T},\mathrm {d}}$ which minimizes $|\mathrm {t}-\mathrm {t}^{\mathrm {d}}|$. Because the elements in $\mathbb {I}^{\mathbf {T},\mathrm {d}}$ are evenly spaced by $N^{-99}$, we automatically have $|\mathrm {t}^{\mathrm {d}}-\mathrm {t}|\leqslant N^{-99}$. We now transfer a time gradient at $\mathrm {t}$ onto one at $\mathrm {t}^{\mathrm {d}}$ and collect the errors:
(6.13) $$ \begin{align} \nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{\mathrm{t},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})=\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{\mathrm{t}^{\mathrm{d}},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N}) + (\mathbf{H}_{\mathrm{t}-\mathrm{s},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}_{\mathrm{t}^{\mathrm{d}}-\mathrm{s},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})) + (\mathbf{H}^{N,\mathbf{X}}_{\mathrm{t}^{\mathrm{d}},x}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}^{N,\mathbf{X}}_{\mathrm{t},x}(\mathbf{Z}_{0,\cdot}^{N})). \end{align} $$
We observe $(\mathrm {t}^{\mathrm {d}},x)\in \mathbb {I}^{\mathrm {d}}$ by construction. Because we work on the complement of the event in the probability on the far LHS of equation (6.10), the first term on the RHS of equation (6.13), after dividing by $\mathrm {s}^{1/4}$ and taking a supremum on $\mathbb {I}^{\mathrm {d}}$, is at most $N^{\vartheta }$. On the other hand, by equation (6.12) for $\{\mathfrak {t}_{1},\mathfrak {t}_{2}\}=\{\mathrm {t}-\mathrm {s},\mathrm {t}^{\mathrm {d}}-\mathrm {s}\}$ and $\{\mathfrak {t}_{1},\mathfrak {t}_{2}\}=\{\mathrm {t},\mathrm {t}^{\mathrm {d}}\}$ we get the following upon recalling $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ implies $\mathrm {s}^{-1/4}\lesssim N^{1/2}$.
We explain the last estimate in the following display; it is deterministic because of our conditioning: (6.14) $$ \begin{align} \mathrm{s}^{-1/4}\|\mathbf{H}_{\mathrm{t}-\mathrm{s},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}_{\mathrm{t}^{\mathrm{d}}-\mathrm{s},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})\|_{\mathfrak{t}_{\mathrm{r}};\mathbb{T}_{N}} \ \lesssim \ N^{1/2}N^{2}|\mathrm{t}-\mathrm{s}-\mathrm{t}^{\mathrm{d}}+\mathrm{s}|\|\mathbf{Z}^{N}\|_{0;\mathbb{T}_{N}} \ \lesssim \ N^{-96}\|\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{r}};\mathbb{T}_{N}}. \end{align} $$ The last estimate in equation (6.14) follows by recalling $|\mathrm {t}-\mathrm {s}-\mathrm {t}^{\mathrm {d}}+\mathrm {s}|=|\mathrm {t}-\mathrm {t}^{\mathrm {d}}| \leqslant N^{-99}$ and by realizing that $\mathbf {U}^{N}$ at time 0 is equal to $\mathbf {Z}^{N}$ at time 0 by construction in Definition 3.5, and this lets us replace $\mathbf {Z}^{N}$ with $\mathbf {U}^{N}$ in the middle of equation (6.14); the final step then bounds $\|\|_{0;\mathbb {T}_{N}}$ by $\|\|_{\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N}}$. We establish the same estimate upon replacing what is inside the norm on the far LHS of equation (6.14) by the third/last term on the RHS of equation (6.13) via the same reasoning. Thus, on the complement of the event in the probability on the far LHS of equation (6.10), the complement of the event in the probability in equation (6.7) holds. As this complement event in equation (6.10) fails with probability at most $N^{-\kappa }$ times $\vartheta ,\kappa $-dependent factors, the same is true for the complement event in equation (6.7) as well.
This completes the proof.
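As a sanity check on the exponents used in the bootstrapping step above, the following elementary computation (an illustration only, not part of the original argument) confirms that the requirement $\kappa \geqslant 300$ is more than enough for the probability bound, and that the grid spacing $N^{-99}$ beats the combined factor $\mathrm {s}^{-1/4}N^{2}\lesssim N^{5/2}$: $$ \begin{align*} N^{-2\kappa+100}\leqslant N^{-3\kappa/2} \ &\Longleftrightarrow \ \tfrac{\kappa}{2}\geqslant 100 \ \Longleftrightarrow \ \kappa\geqslant 200, \\ \mathrm{s}^{-1/4}N^{2}|\mathrm{t}-\mathrm{t}^{\mathrm{d}}| \ &\lesssim \ N^{1/2}\cdot N^{2}\cdot N^{-99} \ = \ N^{-96.5} \ \leqslant \ N^{-96}. \end{align*} $$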
We establish a similar time-regularity estimate for the $\mathrm {d}\xi ^{N}$-term in the $\mathbf {U}^{N}$ equation from Definition 3.5. The strategy is the same, but the martingale theory needed to study this term efficiently requires some additional gymnastics. We emphasize that the quadratic nature of the regularity estimates we are currently proving arises naturally in the proof of the next result.
Lemma 6.4. Consider any $\vartheta ,\kappa \in \mathbb R_{>0}$ arbitrarily small and large, respectively, but both universal. We have
Proof. We employ a slightly adapted version of the strategy as in the proof of Lemma 6.3. In particular, the first step we will take, for $\kappa>0$ in the lemma large, is proving the following pointwise estimate for which $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ and $(T,x)\in [0,1]\times \mathbb {T}_{N}\kern-1.5pt$ ; we will defer the proof of the following estimate (6.16) until later in this argument to avoid obscuring the strategy of this proof:
We proceed with a union bound over $(T,x)\in \mathbb {I}^{\mathrm {d}}$ with $\mathbb {I}^{\mathrm {d}}$ the discretization in the proof of Lemma 6.3; we deduce from equation (6.16) the following whose proof follows that of equation (6.10), where the last estimate in equation (6.17) below follows by choosing $\kappa>0$ large:
Following the proof of Lemma 6.3, we obtain time regularity of $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ for short times of order $N^{-99}$ in order to bootstrap the estimates in equation (6.17) on $\mathbb {I}^{\mathrm {d}}$ to estimates over the entire semidiscrete space-time $[0,\mathfrak {t}_{\mathrm {r}}]\times \mathbb {T}_{N}$. However, unlike in the proof of Lemma 6.3, the quantity $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ of interest is a space-time heat operator, not a spatial heat operator. Therefore, the semidiscrete PDE that it satisfies is the same $\mathscr {L}_{N}$-heat equation satisfied by the $\mathbf {H}^{N}$ heat kernel but with an additional martingale differential. This makes the short-time estimates for $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ more complicated, so we adopt another approach. Consider the $\mathbf {U}^{N}$ equation in Definition 3.5. The short-time regularity for $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ is tautologically controlled by the short-time regularity for all other terms in that $\mathbf {U}^{N}$ equation. We have already addressed short-time regularity for all of these terms in the $\mathbf {U}^{N}$ equation, for example, in Lemma 6.2 and the proof of Lemma 6.3, except for $\mathbf {U}^{N}$ itself. Because $\mathbf {U}^{N}$ evolves in large part through jumps in the particle system, we will not establish any deterministic short-time regularity estimates like we did for the other terms in the $\mathbf {U}^{N}$ equation; instead, we get short-time regularity with high probability.
Precisely, we get the following, for which we consider the space-time $\mathbb {I}^{\mathrm {d},\mathfrak {t}_{\mathrm {r}}}=(\mathbb {J}^{\mathbf {T}}\cap [0,\mathfrak {t}_{\mathrm {r}}])\times \mathbb {T}_{N}\kern-1.5pt$ , where $\mathbb {J}^{\mathbf {T}}\subset [0,1]$ has size $|\mathbb {J}^{\mathbf {T}}|\leqslant N^{200}$ ; we eventually take, for example, in equation (6.19), the set $\mathbb {J}^{\mathbf {T}}=\{\mathrm {t}^{\mathrm {d}}-\mathrm {s}\}$ for $\mathrm {t}^{\mathrm {d}}\in \mathbb {I}^{\mathbf {T},\mathrm {d}}=\{\mathfrak {j}N^{-99}\}_{\mathfrak {j}=0}^{N^{99}}$ and $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ ; see Definition 3.1:
We again provide the proof of equation (6.18) at the end of this argument to avoid obscuring the main point. Let us restrict to the complement of the events inside the probabilities in equations (6.17) and (6.18). Now, we will follow the proof of Lemma 6.3 starting with equation (6.13) and replacing $\mathbf {H}^{N,\mathbf {X}}(\mathbf {Z}^{N})$ by $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$. The first term on the RHS of the resulting equation is controlled by restricting to the complement of the event in the probability in equation (6.17). To control the second and third terms on the RHS of the resulting equation, we use the following, obtained by restricting to the complement of the event in the probability in equation (6.18). Indeed, with notation as in equation (6.13), in equation (6.19) below we have $|\mathrm {t}-\mathrm {t}^{\mathrm {d}}|=|\mathrm {t}-\mathrm {s}-(\mathrm {t}^{\mathrm {d}}-\mathrm {s})|\leqslant N^{-99}$. Thus, we may control the LHS of equation (6.19) if we restrict to the complement of the event in equation (6.18), since the LHS of equation (6.19) is a scale $\leqslant N^{-99}$ time gradient of $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ evaluated at a point in the discretization $\mathbb {I}^{\mathrm {d},\mathfrak {t}_{\mathrm {r}}}$ with the choice of $\mathbb {J}^{\mathbf {T}}$ explained right before equation (6.18):
The final estimate in equation (6.19) follows by recalling $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ implies $\mathrm {s}\geqslant N^{-2}$ , and thus $\mathrm {s}^{-1/4}\lesssim N^{1/2}$ . Thus, whenever the events from equations (6.17) and (6.18) themselves fail, we deduce that the event in the probability in equation (6.15) fails as well. Because these events in equations (6.17) and (6.18) succeed with probability $\mathrm {O}_{\kappa ,\vartheta }(N^{-3\kappa /2})$ each, like the end of the proof of Lemma 6.3, we deduce that the probability that the event in equation (6.15) succeeds is at most $\mathrm {O}_{\kappa ,\vartheta }(N^{-3\kappa /2})\lesssim _{\kappa ,\vartheta }N^{-\kappa }$ . This completes the proof modulo the proofs of the probability estimates (6.16) and (6.18), which we provide below.
• We will first prove equation (6.16). To this end, let us first define $\mathscr {N}(\mathbf {U}^{N})=1+\|\mathbf {U}^{N}\|_{T;\mathbb {T}_{N}}^{2}$ in order to ease notation. We now employ the Chebyshev inequality with $p\geqslant 2$ to be determined shortly to establish the following upper bound for the LHS of equation (6.16):
(6.20) $$ \begin{align} &\mathbf{P}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| \geqslant N^{\vartheta}\mathrm{s}^{\frac14}(1+\|\mathbf{U}^{N}\|_{T;\mathbb{T}_{N}}^{2})\right) \nonumber\\ &\lesssim \ N^{-2p\vartheta}\mathrm{s}^{-p/2}\mathbf{E}\left(\mathscr{N}(\mathbf{U}^{N})^{-2p}|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\right). \end{align} $$To motivate the next step, observe that for regularity of space-time heat operators in Lemma 6.2, we could pull out $\mathbf {U}^{N}$ and $\mathbf {Y}^{N}$ factors from the integral/heat operator upon inserting extra factors given by their space-time supremum norms; this is $\mathscr {L}^{1}/\mathscr {L}^{\infty }$ interpolation. However, to study the heat operator in the expectation on the RHS of equation (6.20), we require martingale inequalities. In particular, we need to take advantage of cancellations in $\mathbf {U}^{N}\mathrm {d}\xi ^{N}$ that appear when integrating against the heat kernel. This prevents us from applying the $\mathscr {L}^{1}/\mathscr {L}^{\infty }$ interpolation. The alternative we take begins with a level set decomposition/bound:(6.21) $$ \begin{align} \mathbf{E}\left(\mathscr{N}(\mathbf{U}^{N})^{-2p}|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\right) \ &\lesssim_{p} \ {\sum}_{\mathfrak{l}=1}^{\infty}\mathfrak{l}^{-2p}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right). \end{align} $$The estimate (6.21) follows by considering level sets of $\mathscr {N}(\mathbf {U}^{N})$ ; on the $[\mathfrak {l},\mathfrak {l}+1]$ level set, we may employ the deterministic bound $\mathscr {N}(\mathbf {U}^{N})^{-1}\lesssim \mathfrak {l}^{-1}$ . 
Next, we move a factor of $\mathfrak {l}^{-p}$ into the expectation and get the following, which we explain after: (6.22) $$ \begin{align} &{\sum}_{\mathfrak{l}=1}^{\infty}\mathfrak{l}^{-2p}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right) \nonumber\\ &= \ {\sum}_{\mathfrak{l}=1}^{\infty}\mathfrak{l}^{-p}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathfrak{l}^{-1/2}\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right) \nonumber \\ &\leqslant \ \left({\sum}_{\mathfrak{l}=1}^{\infty}\mathfrak{l}^{-p}\right)\sup_{\mathfrak{l}\geqslant1}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathfrak{l}^{-1/2}\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right) \nonumber \\ &\lesssim \ \sup_{\mathfrak{l}\geqslant1}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathfrak{l}^{-1/2}\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right). \end{align} $$ The first identity in the previous display follows by moving $\mathfrak {l}^{-p}$ into the expectation, then into the $2p$-th power upon replacing it by $\mathfrak {l}^{-1/2}$, and then moving this deterministic scalar through both the linear time gradient and the heat operator. The final estimate in (6.22) follows from an elementary bound on the summation in the line before, which converges because $p\geqslant 2$. By definition of $\mathscr {N}(\mathbf {U}^{N})$, if $\mathscr {N}(\mathbf {U}^{N})\in [\mathfrak {l},\mathfrak {l}+1]$, the process $\mathfrak {l}^{-1/2}\mathbf {U}^{N}$ is uniformly bounded deterministically.
Moreover, because $\mathbf {U}^{N}$ is adapted to the underlying filtration of the particle system, so is the deterministic multiple $\mathfrak {l}^{-1/2}\mathbf {U}^{N}$ . In particular, we may replace $\mathfrak {l}^{-1/2}\mathbf {U}^{N}$ in equation (6.22) with the adapted process $(\mathfrak {l}^{-1/2}\mathbf {U}^{N}\wedge 100)\vee (-100)$ , drop the indicator function in equation (6.22) and then follow the proof of time regularity (3.14) in [Reference Dembo and Tsai19]. For the last step, we will need to apply the time-regularity estimates from Proposition A.3 for the $\mathbf {H}^{N}$ heat kernel instead of those in [Reference Dembo and Tsai19] along with the martingale inequality in Lemma A.4 that extends Lemma 3.1 in [Reference Dembo and Tsai19], which is proved only for the Gartner transform, to uniformly bounded adapted processes. This ultimately gives, for $\varrho>0$ arbitrarily small but universal, the following in which we stress that $\mathfrak {l}^{-1/2}\mathbf {U}^{N}$ is uniformly bounded and adapted on the LHS below:(6.23) $$ \begin{align} \sup_{\mathfrak{l}\geqslant1}\mathbf{E}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathfrak{l}^{-1/2}\mathbf{U}^{N}\mathrm{d}\xi^{N})| ^{2p}\mathbf{1}_{\mathscr{N}(\mathbf{U}^{N})\in[\mathfrak{l},\mathfrak{l}+1]}\right) \ \lesssim_{p,\varrho} \ \mathrm{s}^{p/2-p\varrho} \ \lesssim \ N^{2p\varrho}\mathrm{s}^{p/2}. \end{align} $$We recall that times $\mathrm {s}\in \mathbb {I}^{\mathbf {T}}$ of interest satisfy $\mathrm {s}\geqslant N^{-2}$ , which implies $\mathrm {s}^{-1}\leqslant N^{2}$ and thus provides the final estimate in equation (6.23). 
We now combine equations (6.20), (6.21), (6.22) and (6.23) to deduce (6.24) $$ \begin{align} \mathbf{P}\left(|\nabla_{-\mathrm{s}}^{\mathbf{T}}\mathbf{H}_{T,x}^{N}(\mathbf{U}^{N}\mathrm{d}\xi^{N})| \geqslant N^{\vartheta}\mathrm{s}^{\frac14}(1+\|\mathbf{U}^{N}\|_{T;\mathbb{T}_{N}}^{2})\right) \ \lesssim_{p,\varrho} \ N^{-2p\vartheta}N^{2p\varrho}\mathrm{s}^{-p/2}\mathrm{s}^{p/2} \ = \ N^{-2p\vartheta+2p\varrho}. \end{align} $$ Now, given any $\vartheta ,\kappa>0$, we choose $\varrho>0$ sufficiently small and $p\geqslant 2$ sufficiently large, but both depending only on $\vartheta ,\kappa $, so that the exponent on the far RHS of equation (6.24) is less than or equal to $-\kappa $. We emphasize that the dependence on p and $\varrho $ in equation (6.24) becomes dependence on $\vartheta ,\kappa $. This completes the proof of equation (6.16).
• We move to the proof of equation (6.18). To this end, it suffices to replace $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ therein with each other term in the $\mathbf {U}^{N}$ equation from Definition 3.5. Indeed, if we can prove that the short-time regularity for every other term in the $\mathbf {U}^{N}$ equation exceeds the lower bound in the event in the probability in equation (6.18) with probability at most $N^{-2\kappa }$ times $\vartheta ,\kappa $ -dependent factors, then by using the triangle inequality and a union bound, we can deduce the same for $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ , which is the proposed estimate (6.18), if we also adjust the implied constant by a factor of 100. This is the fact that if $a=b+c$ , then $a\geqslant d$ implies $b\geqslant d/2$ or, not exclusively, $c\geqslant d/2$ . To control short-time regularity of every other term besides $\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ on the RHS of the $\mathbf {U}^{N}$ equation from Definition 3.5, we apply Lemma 6.2 and the third bullet point from the proof of Lemma 6.3. It remains to control short-time regularity for $\mathbf {U}^{N}$ itself. This is done in Lemma A.6, so we are done with proving equation (6.18).
This completes the proof.
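To make two steps from the proof of equation (6.16) concrete, we record a short worked computation; it is an illustrative sketch, assuming that $\|\|_{T;\mathbb {T}_{N}}$ is the space-time supremum norm (so that it is homogeneous under deterministic scalars), and the specific choice of $\varrho ,p$ below is only one admissible option. First, on the level set $\mathscr {N}(\mathbf {U}^{N})\in [\mathfrak {l},\mathfrak {l}+1]$, the rescaled process is bounded by 1, which is consistent with truncating at $\pm 100$. Second, taking $\varrho =\vartheta /2$ and $p\geqslant \max (2,\kappa /\vartheta )$ makes the exponent in equation (6.24) at most $-\kappa $: $$ \begin{align*} \|\mathfrak{l}^{-1/2}\mathbf{U}^{N}\|_{T;\mathbb{T}_{N}}^{2} \ &= \ \mathfrak{l}^{-1}\left(\mathscr{N}(\mathbf{U}^{N})-1\right) \ \leqslant \ \mathfrak{l}^{-1}(\mathfrak{l}+1-1) \ = \ 1, \\ -2p\vartheta+2p\varrho \ &= \ -2p\vartheta+p\vartheta \ = \ -p\vartheta \ \leqslant \ -\kappa. \end{align*} $$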
Corollary 6.5. Admit the setting of Proposition 6.1 . We have $\mathbf {P}(\mathcal {E}_{\vartheta }^{\mathbf {T}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})) \lesssim _{\vartheta ,\kappa } N^{-\kappa }$ .
6.2 $\mathcal {E}_{\vartheta }^{\mathbf {X}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ estimate
The proof of the $\mathcal {E}_{\vartheta }^{\mathbf {X}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ estimate in Proposition 6.1 will follow basically the same strategy but, as we mentioned at the beginning of the section, we require additional input of Theorem 5.3 to control the spatial gradient of the order $N^{1/2}$ term in the $\mathbf {U}^{N}$ equation. In view of similarities with the proof of the $\mathcal {E}_{\vartheta }^{\mathbf {T}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ estimate, we start as follows.
Lemma 6.6. Take any $|\mathfrak {k}|\lesssim 1$ and $|\mathfrak {l}|\leqslant \mathfrak {l}_{N}$ , with $\mathfrak {l}_{N}$ in Definition 3.1 . If $\mathfrak {f}_{i}$ are uniformly bounded, then
Proof. We follow the proof of Lemma 6.2 but instead of time-regularity estimates of the heat operator $\mathbf {H}^{N}$ in Proposition A.3, we instead apply spatial-regularity estimates therein. As $\mathbf {H}^{N}$ is macroscopically smooth, this gives the estimate but for $N|\mathfrak {l}|^{-1}$ in place of $N^{1/2}|\mathfrak {l}|^{-1/2}$ in the first term on the LHS of equation (6.25). But this is stronger as $N|\mathfrak {l}|^{-1}\geqslant N^{1/2}|\mathfrak {l}|^{-1/2}$ for $|\mathfrak {l}|\leqslant \mathfrak {l}_{N}\leqslant N$ .
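The comparison of prefactors at the end of the previous proof can be checked in one line. The following is an illustrative verification, assuming $\mathfrak {l}\neq 0$ so that $1\leqslant |\mathfrak {l}|\leqslant N$; it uses only that a number which is at least 1 is dominated by its square: $$ \begin{align*} N^{1/2}|\mathfrak{l}|^{-1/2} \ \geqslant \ 1 \quad \Longrightarrow \quad N|\mathfrak{l}|^{-1} \ = \ \left(N^{1/2}|\mathfrak{l}|^{-1/2}\right)^{2} \ \geqslant \ N^{1/2}|\mathfrak{l}|^{-1/2}. \end{align*} $$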
Lemma 6.7. Consider any $\vartheta ,\kappa>0$ arbitrarily small and large, respectively, but both universal. We have
Proof. We follow the proof of Lemma 6.3. In particular, we prove a pointwise probability estimate analogous to equation (6.9) and a union bound analogous to equation (6.10). We conclude with continuity/bootstrap analogous to the third bullet point in the proof of Lemma 6.3.
• Observe that the operator $\mathscr {L}_{N}$ commutes with any constant coefficient spatial gradient; this can be easily verified. Because the spatial heat operator $\mathbf {H}^{N,\mathbf {X}}$ is a matrix/operator exponential of a constant multiple of $\mathscr {L}_{N}$, spatial gradients commute with the spatial heat operator. With this and the proof of (3.13) in [Reference Dembo and Tsai19], we establish the following for fixed $T\geqslant 0$ and $x\in \mathbb {T}_{N}$ with arbitrary $p\geqslant 1$ and $\gamma>0$, in which we assume $\mathfrak {l}\neq 0$ as the case $\mathfrak {l}=0$ is trivial:
(6.27) $$ \begin{align} \|\nabla^{\mathbf{X}}_{\mathfrak{l}}\mathbf{H}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})\|_{\omega;2p} \ = \ \|\mathbf{H}^{N,\mathbf{X}}(\nabla^{\mathbf{X}}_{\mathfrak{l}}\mathbf{Z}_{0,\cdot}^{N})\|_{\omega;2p} \ \lesssim_{p,\gamma} \ N^{-1/2+\gamma}|\mathfrak{l}|^{1/2-\gamma} \ \leqslant \ N^{\gamma}N^{-1/2}|\mathfrak{l}|^{1/2}. \end{align} $$ The last inequality in (6.27) follows because $|\mathfrak {l}|$ is a nonzero integer, so $|\mathfrak {l}|^{-\gamma }\leqslant 1$. When we follow the proof of (3.13) in [Reference Dembo and Tsai19], we employ the heat kernel estimates in Proposition A.3 for $\mathbf {H}^{N,\mathbf {X}}$ rather than the heat kernel estimates in [Reference Dembo and Tsai19]. The Chebyshev inequality then gives, for $p\geqslant 1$, the following, in which, given $\vartheta ,\kappa>0$, we choose $\gamma>0$ sufficiently small and p sufficiently large, but both depending only on $\vartheta ,\kappa $, such that $-2p\vartheta +2p\gamma \leqslant -2\kappa $: (6.28) $$ \begin{align} \mathbf{P}\left(|\nabla^{\mathbf{X}}_{\mathfrak{l}}\mathbf{H}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})| \geqslant N^{-1/2+\vartheta}|\mathfrak{l}|^{1/2}\right) \ \lesssim_{p,\gamma} \ N^{p-2p\vartheta}|\mathfrak{l}|^{-p}N^{2p\gamma}N^{-p}|\mathfrak{l}|^{p} \ \leqslant \ N^{-2\kappa}. \end{align} $$ Again, the dependence on $p,\gamma \in \mathbb R_{>0}$ in the previous estimate (6.28) is now dependence on $\vartheta ,\kappa $.
• Consider the same discretization $\mathbb {I}^{\mathrm {d}}$ from the proof of Lemma 6.3. A union bound in the same fashion as that used to prove (6.10), when combined with (6.28), gives the following; recall $|\mathbb {I}^{\mathrm {d}}|\lesssim N^{100}$ as seen in the proof of Lemma 6.3, and $\mathfrak {l}_{N}\leqslant N$:
(6.29) In what follows, we will take $\kappa $ sufficiently large so that $2\kappa -101\geqslant 3\kappa /2$.
• We complete the proof via bootstrapping our estimate on $\mathbb {I}^{\mathrm {d}}$ to an estimate on the entire semidiscrete space-time $[0,\mathfrak {t}_{\mathrm {r}}]\times \mathbb {T}_{N}\kern-1.5pt$ . To this end, given any $\mathrm {t}\in [0,\mathfrak {t}_{\mathrm {r}}]$ , we again let $\mathrm {t}^{\mathrm {d}}$ be any element in $\mathbb {I}^{\mathbf {T},\mathrm {d}}=\{\mathfrak {j}N^{-99}\}_{\mathfrak {j}\geqslant 0}\cap [0,1]$ that minimizes $|\mathrm {t}-\mathrm {t}^{\mathrm {d}}|$ . We now provide the following parallel to equation (6.13) where $x\in \mathbb {T}_{N}$ is arbitrary:
(6.30) $$ \begin{align} \nabla_{\mathfrak{l}}^{\mathbf{X}}\mathbf{H}^{N,\mathbf{X}}_{\mathrm{t},x}(\mathbf{Z}_{0,\cdot}^{N}) =\nabla_{\mathfrak{l}}^{\mathbf{X}}\mathbf{H}^{N,\mathbf{X}}_{\mathrm{t}^{\mathrm{d}},x}(\mathbf{Z}_{0,\cdot}^{N}) + (\mathbf{H}_{\mathrm{t},x+\mathfrak{l}}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}_{\mathrm{t}^{\mathrm{d}},x+\mathfrak{l}}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})) + (\mathbf{H}_{\mathrm{t}^{\mathrm{d}},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}_{\mathrm{t},x}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})). \end{align} $$Following the third bullet point in the proof of Lemma 6.3, we have an estimate for the first term on the RHS of equation (6.30) outside an event of probability at most $N^{-3\kappa /2}$ times $\vartheta ,\kappa $ -dependent factors. We additionally have deterministic estimates for the second and third terms on the RHS of equation (6.30) by short-time continuity; see equation (6.12). This gives the following analog of equation (6.14):(6.31) $$ \begin{align} N^{1/2}|\mathfrak{l}|^{-1/2}\|\mathbf{H}_{\mathrm{t},x+\mathfrak{l}}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})-\mathbf{H}_{\mathrm{t}^{\mathrm{d}},x+\mathfrak{l}}^{N,\mathbf{X}}(\mathbf{Z}_{0,\cdot}^{N})\|_{\mathfrak{t}_{\mathrm{r}};\mathbb{T}_{N}} \ \lesssim \ N^{5/2}|\mathrm{t}-\mathrm{t}^{\mathrm{d}}|\|\mathbf{Z}^{N}\|_{0;\mathbb{T}_{N}} \ \lesssim \ N^{-96}\|\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{r}};\mathbb{T}_{N}}. \end{align} $$The first estimate in equation (6.31) follows by $|\mathfrak {l}|\geqslant 1$ combined with equation (6.12). The second estimate in equation (6.31) follows by $|\mathrm {t}-\mathrm {t}^{\mathrm {d}}|\leqslant N^{-99}$ and $\|\mathbf {Z}^{N}\|_{0;\mathbb {T}_{N}}\leqslant \|\mathbf {U}^{N}\|_{\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N}}$ , both of which we used in the third bullet point in the proof of Lemma 6.3.
We now apply the reasoning of the last paragraph in the third bullet point in the proof of Lemma 6.3 to finish the proof.
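For readers tracking the exponents in the first two bullet points, the following sketch spells out the arithmetic behind the Chebyshev step and the union bound. It is an illustration under the stated bounds $|\mathbb {I}^{\mathrm {d}}|\lesssim N^{100}$ and $\mathfrak {l}_{N}\leqslant N$, with the number of length scales $\mathfrak {l}$ taken to be $\lesssim N$ and implied constants absorbed. The first line is the threshold $N^{-1/2+\vartheta }|\mathfrak {l}|^{1/2}$ raised to the power $-2p$ times the $2p$-th power of the moment bound in (6.27); the second line counts the terms in the union bound: $$ \begin{align*} \left(N^{-1/2+\vartheta}|\mathfrak{l}|^{1/2}\right)^{-2p}\left(N^{\gamma}N^{-1/2}|\mathfrak{l}|^{1/2}\right)^{2p} \ &= \ N^{p-2p\vartheta}|\mathfrak{l}|^{-p}N^{2p\gamma}N^{-p}|\mathfrak{l}|^{p} \ = \ N^{-2p\vartheta+2p\gamma}, \\ N^{100}\cdot N\cdot N^{-2\kappa} \ &= \ N^{-2\kappa+101} \ \leqslant \ N^{-3\kappa/2} \quad \text{whenever } \kappa\geqslant 202. \end{align*} $$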
Lemma 6.8. Consider any $\vartheta ,\kappa>0$ arbitrarily small and large, respectively, but both universal. We have
Proof. We follow the proof of Lemma 6.4 upon replacing $\mathrm {s}^{-\frac 14}$ factors by $N^{\frac 12}|\mathfrak {l}|^{-\frac 12}$ and replacing $\nabla _{-\mathrm {s}}^{\mathbf {T}}\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ terms by $\nabla _{\mathfrak {l}}^{\mathbf {X}}\mathbf {H}^{N}(\mathbf {U}^{N}\mathrm {d}\xi ^{N})$ terms. Precisely, we can first establish equation (6.16) with these replacements upon using the same argument given in the proof of Lemma 6.4, except instead of following the proof of (3.14) in [Reference Dembo and Tsai19] we follow the proof of (3.13) in [Reference Dembo and Tsai19]. Taking a union bound over all length scales $1\leqslant |\mathfrak {l}|\leqslant \mathfrak {l}_{N}$ and all space-time points in the discretization $\mathbb {I}^{\mathrm {d}}$ then gives equation (6.17) with the same replacements. The short-time estimate equation (6.18) without any replacements lets us bootstrap from an estimate on $\mathbb {I}^{\mathrm {d}}$ to one over the entire set $[0,\mathfrak {t}_{\mathrm {r}}]\times \mathbb {T}_{N}$ in the same fashion as the end of the proof of Lemma 6.4.
Corollary 6.9. Admit the setting of Proposition 6.1 . We have $\mathbf {P}(\mathcal {E}_{\vartheta }^{\mathbf {X}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})) \lesssim _{\vartheta } N^{-\beta _{\mathrm {r}}}$ .
Proof. Like in the proof of Corollary 6.5, first observe $\nabla ^{\mathbf {X}}\mathbf {U}^{N}$ is controlled by $\nabla ^{\mathbf {X}}$ of the terms on the RHS of the $\mathbf {U}^{N}$ equation in Definition 3.5. Such $\nabla ^{\mathbf {X}}$ terms can be controlled by the proposed lower bound in the event $\mathcal {E}_{\vartheta }^{\mathbf {X}}(\mathfrak {t}_{\mathrm {r}};\mathbb {T}_{N})$ with the appropriate probability by applying Lemmas 6.6, 6.7 and 6.8, except for the order $N^{1/2}$ term in the $\mathbf {U}^{N}$ equation. For this term, we employ equation (5.4), which, by Theorem 5.3, holds with the desired probability of at least $1-\mathrm {O}_{\vartheta }(N^{-\beta _{\mathrm {r}}})$ for $\beta _{\mathrm {r}}>0$ universal.
7 Proof of Proposition 3.11 and Proposition 3.12
7.1 Preliminary $\mathbf {Q}^{N}$ estimates
Recall that Proposition 3.12 proposes a comparison between $\mathbf {U}^{N}$ and $\mathbf {Q}^{N}$. For this comparison to be useful, it will be important to ensure that $\mathbf {Q}^{N}$ is ‘reasonable’. This matters all the more because our proof of Proposition 3.11, which amounts to establishing estimates for $\mathbf {U}^{N}$ and $\mathbf {Z}^{N}$, will itself use the comparison between $\mathbf {U}^{N}$ and $\mathbf {Q}^{N}$. Before we start with the details of this subsection, let us recall the notions of high/overwhelming probability in Definition 3.9. The first estimate we present is an upper bound with respect to $\|\|_{1;\mathbb {T}_{N}}$ with overwhelming probability.
Lemma 7.1. Provided any $\vartheta>0$ uniformly bounded from below, we have $\|\mathbf {Q}^{N}\|_{1;\mathbb {T}_{N}}\leqslant N^{\vartheta }$ with overwhelming probability.
Proof. We start with the following inequality that we explain and justify afterwards. Roughly speaking, the following inequality controls a supremum over the semidiscrete space-time $[0,1]\times \mathbb {T}_{N}$ in terms of one over the discretization $\mathbb {I}^{\mathrm {d}}=\mathbb {I}^{\mathbf {T},\mathrm {d}}\times \mathbb {T}_{N}$ with $\mathbb {I}^{\mathbf {T},\mathrm {d}}=\{\mathfrak {j}N^{-99}\}_{0\leqslant \mathfrak {j}\leqslant N^{99}}$ and in terms of short-time estimates for $\mathbf {Q}^{N}$ . We emphasize the following estimate is deterministic:
The estimate (7.1) is proved using reasoning similar to that used in the third bullet point from the proof of Lemma 6.3. Consider any $\mathrm {t}\in [0,1]$, and let $\mathrm {t}^{\mathrm {d}}$ be any element in $\mathbb {I}^{\mathbf {T},\mathrm {d}}$ that minimizes $|\mathrm {t}-\mathrm {t}^{\mathrm {d}}|$. We will write $\mathbf {Q}^{N}$ evaluated at $\mathrm {t}$ as $\mathbf {Q}^{N}$ evaluated at $\mathrm {t}^{\mathrm {d}}$ plus the corresponding difference, the difference being a time gradient on a timescale $\mathrm {t}-\mathrm {t}^{\mathrm {d}}$ evaluated at $\mathrm {t}^{\mathrm {d}}\in \mathbb {I}^{\mathbf {T},\mathrm {d}}$. Therefore, we are left with estimating each term on the RHS of equation (7.1). Observe that it suffices to estimate each by $2^{-1}N^{\vartheta }$ with overwhelming probability, as the intersection of two events, both of which hold with overwhelming probability, also holds with overwhelming probability, which is a consequence of the union bound. Moreover, according to Lemma A.6, the second term on the RHS of equation (7.1) is at most a small multiple of the first term on the RHS of equation (7.1) with overwhelming probability. Thus, it suffices to bound from above the first term on the RHS of equation (7.1) by $4^{-1}N^{\vartheta }$ with overwhelming probability. For this, we use moment bounds for $\mathbf {Q}^{N}$ resembling those for the Gartner transform from Proposition 3.2 in [Reference Dembo and Tsai19].
Now recall from the proof of Lemma 3.14 that for any $p\geqslant 1$, the $2p$-moment of $\mathbf {Q}^{N}$ at any point in $[0,1]\times \mathbb {T}_{N}$ is bounded by a constant depending only on p. This follows via the observation, which we made at the beginning of the proof of Lemma 3.14, that $\mathbf {Q}^{N}$ satisfies the moment estimate (3.12) in [Reference Dembo and Tsai19] if we remove the subexponential weights therein. Therefore, we have the following estimate by a union bound, the Chebyshev inequality and this moment estimate for $\mathbf {Q}^{N}$:
The final estimate in equation (7.2) follows from the straightforward observation $|\mathbb {I}^{\mathrm {d}}|=|\mathbb {I}^{\mathbf {T},\mathrm {d}}||\mathbb {T}_{N}|\lesssim N^{100}$ . We can choose $p\geqslant 1$ arbitrarily large but depending only on the fixed $\vartheta>0$ so that the complement of the event in the probability on the far LHS of equation (7.2) holds with overwhelming probability. Therefore, we have proved the lemma if we replace $\|\|_{1;\mathbb {T}_{N}}$ with the supremum over the discrete space-time set $\mathbb {I}^{\mathrm {d}}$ , which completes the proof as noted after equation (7.1).
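The mechanism described in the last two paragraphs (union bound, Chebyshev inequality, uniform moment bound) can be sketched as follows. This is an illustrative reconstruction of the type of computation involved, not a reproduction of display (7.2): the subscript notation $\mathbf {Q}^{N}_{\mathrm {t},x}$ and the target exponent $D>0$ are assumptions introduced here for illustration only. $$ \begin{align*} \mathbf{P}\left(\sup_{(\mathrm{t},x)\in\mathbb{I}^{\mathrm{d}}}|\mathbf{Q}^{N}_{\mathrm{t},x}|\geqslant 4^{-1}N^{\vartheta}\right) \ &\leqslant \ {\sum}_{(\mathrm{t},x)\in\mathbb{I}^{\mathrm{d}}}\mathbf{P}\left(|\mathbf{Q}^{N}_{\mathrm{t},x}|\geqslant 4^{-1}N^{\vartheta}\right) \\ &\leqslant \ |\mathbb{I}^{\mathrm{d}}|\,4^{2p}N^{-2p\vartheta}\sup_{(\mathrm{t},x)\in\mathbb{I}^{\mathrm{d}}}\mathbf{E}|\mathbf{Q}^{N}_{\mathrm{t},x}|^{2p} \ \lesssim_{p} \ N^{100-2p\vartheta}. \end{align*} $$ In this sketch, any choice $p\geqslant (100+D)/(2\vartheta )$ makes the far RHS at most $N^{-D}$ up to a constant depending only on $\vartheta ,D$, which is how a large but $\vartheta $-dependent p absorbs the $N^{100}$ cost of the union bound.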
The second ingredient we present for this subsection is a lower bound, or equivalently an upper bound for the inverse of $\mathbf {Q}^{N}$ with respect to the $\|\|_{1;\mathbb {T}_{N}}$ norm. We clarify that the following estimate holds with high probability, in contrast to the upper bound in Lemma 7.1 that holds with overwhelming probability. This is because the upcoming proof is slightly less quantitative. We also emphasize the importance of stable initial data for the upcoming lower bound estimate.
Lemma 7.2. Provided any $\vartheta>0$ uniformly bounded from below, we have $\|(\mathbf {Q}^{N})^{-1}\|_{1;\mathbb {T}_{N}}\leqslant N^{\vartheta }$ with high probability.
Proof. Observe that a space-time uniform upper bound of $N^{\vartheta }$ for the inverse of $\mathbf {Q}^{N}$ is equivalent to a space-time uniform lower bound of $N^{-\vartheta }$ for $\mathbf {Q}^{N}$ itself. Additionally, we observe that the initial data of $\mathbf {Q}^{N}$ , which is the initial data of the Gartner transform $\mathbf {Z}^{N}\kern-1.5pt$ , is uniformly bounded above and below on $\mathbb {T}_{N}$ because it is the exponential of a function that is uniformly bounded above and below. Lemma 7.2 then follows from a standard analysis based on combining these observations, the comparison principle for the defining equation of $\mathbf {Q}^{N}$ in Definition 3.8, and tightness estimates for the same defining equation, at least with continuous initial data, that we alluded to in Proposition 3.13. Roughly speaking, because $\mathbf {Q}^{N}$ is initially uniformly bounded below, by the comparison principle for the $\mathbf {Q}^{N}$ equation it suffices to prove uniform lower bounds for constant initial data given by the infimum of the $\mathbf {Q}^{N}$ initial data. If the $\mathbf {Q}^{N}$ equation only had a spatial heat operator, constant data would be preserved and the result would follow. For some short but N-independent time $\mathfrak {t}^{+}$ and for any N-independent $\gamma>0$ , perturbative analysis would provide a high probability $\mathfrak {t}^{+},\gamma $ -dependent lower bound for $\mathbf {Q}^{N}$ . Again, by using the comparison principle, it suffices to provide a uniform lower bound for time $1-\mathfrak {t}^{+}$ for the solution to the $\mathbf {Q}^{N}$ equation with initial data given by the $\mathfrak {t}^{+},\gamma $ -dependent space-time infimum/lower bound. We then iterate this scheme, namely by providing high probability lower bounds for constant data for sufficiently short but N-independent times $\mathfrak {t}^{+}$ , requiring only a $\mathfrak {t}^{+}$ -dependent number of steps. 
We emphasize that this iteration does not break down because each perturbative step in this strategy amounts to estimates for the $\mathbf {Q}^{N}$ equation with constant initial data. By linearity of the $\mathbf {Q}^{N}$ equation, the value of this constant initial data does not matter in terms of how much smaller, in a proportional/multiplicative sense, the solution is after a short time $\mathfrak {t}^{+}$; namely, our analysis of the $\mathbf {Q}^{N}$ equation does not change in each step even if the constant initial data is different between steps. This completes the proof.
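The iteration scheme above can be summarized schematically. The following is a heuristic sketch only: the constants $\mathfrak {c}_{0}>0$ (the infimum of the initial data), $\delta \in (0,1)$ (a hypothetical $\mathfrak {t}^{+},\gamma $-dependent proportional loss per step) and $\mathfrak {K}=\lceil 1/\mathfrak {t}^{+}\rceil $ (the number of steps) are notation introduced here for illustration and do not appear in the original argument. If each step of length $\mathfrak {t}^{+}$ loses at most the multiplicative factor $\delta $ with high probability, then by linearity and the comparison principle, $$ \begin{align*} \inf_{[\mathfrak{k}\mathfrak{t}^{+},(\mathfrak{k}+1)\mathfrak{t}^{+}]\times\mathbb{T}_{N}}\mathbf{Q}^{N} \ \geqslant \ \delta^{\mathfrak{k}+1}\mathfrak{c}_{0} \quad \text{for } 0\leqslant\mathfrak{k}<\mathfrak{K}, \qquad \text{and so} \qquad \inf_{[0,1]\times\mathbb{T}_{N}}\mathbf{Q}^{N} \ \geqslant \ \delta^{\mathfrak{K}}\mathfrak{c}_{0} \ \geqslant \ N^{-\vartheta} \end{align*} $$ for all sufficiently large N, since $\delta ^{\mathfrak {K}}\mathfrak {c}_{0}$ does not depend on N. Because $\mathfrak {K}$ is also independent of N, a union bound over the $\mathfrak {K}$ steps preserves the high probability of the conclusion.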
7.2 Proof of Proposition 3.12
The first step that we take is to write the difference $\mathbf {D}^{N}=\mathbf {U}^{N}-\mathbf {Q}^{N}$ explicitly in terms of the difference between the respective stochastic equations for $\mathbf {U}^{N}$ in Definition 3.5 and $\mathbf {Q}^{N}$ in Definition 3.8. Because the heat operators are linear, and because the stochastic equations in Definitions 3.5 and 3.8 are both linear in their respective solutions $\mathbf {U}^{N}$ and $\mathbf {Q}^{N}$ , it is straightforward to verify the following stochastic equation for $\mathbf {D}^{N}$ , in which the spatial heat operators/initial data terms in the $\mathbf {U}^{N}$ and $\mathbf {Q}^{N}$ equations cancel, and in which we use notation in Definitions 3.5 and 3.8:
We emphasize that the order $N^{1/2}$ term in the $\mathbf {U}^{N}$ equation does not have a matching term in the $\mathbf {Q}^{N}$ equation. According to the Boltzmann–Gibbs principle in Theorem 4.1, we expect the second term from the RHS of equation (7.3) to vanish in the large-N limit. In this case, the $\mathbf {D}^{N}$ term solves a linear equation with zero initial data and vanishingly small ‘forcing’, which suggests that $\mathbf {D}^{N}$ vanishes uniformly in the large-N limit. To actually prove this, we will obtain bounds for $\mathbf {D}^{N}$ by employing the moment strategy for proofs of tightness in [Reference Bertini and Giacomin3, Reference Dembo and Tsai19]. We note, however, that Theorem 4.1 only provides an estimate with respect to the first moment, whereas the aforementioned SPDE analysis of [Reference Bertini and Giacomin3, Reference Dembo and Tsai19] relies on bounds for quite high moments. This is a technical issue, but it is nontrivial to resolve. The first step we take to resolve it is introducing the following stopping time and ‘cutoff’-type $\mathbf {C}^{N}$ process.
Definition 7.3. Recall the universal constant $\beta _{\mathrm {BG}}>0$ from Theorem 4.1; recall it is uniformly bounded from below. Define
We additionally define the stopped process $\widetilde {\mathbf {Y}}^{N}=\mathbf {Y}^{N}\mathbf {1}(T\leqslant \mathfrak {t}_{\mathrm {BG}})$ , and we also define $\mathbf {C}^{N}$ to be the solution to the following stochastic equation on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\kern-1.5pt$ , whose solutions are unique by standard linear theory:
where $\nabla _{\star }^{!}$ means what it does in Proposition 2.4.
The following first result in this subsection justifies analyzing $\mathbf {C}^{N}$ as a ‘high probability proxy’ for $\mathbf {D}^{N}$ .
Lemma 7.4. With high probability, we have $\mathfrak {t}_{\mathrm {BG}}=1$ . Thus, with high probability, we have $\mathbf {C}^{N}=\mathbf {D}^{N}$ on $[0,1]\times \mathbb {T}_{N}\kern-1.5pt$ .
Proof. We emphasize that the second high probability claim in Lemma 7.4 follows from the first high probability claim in Lemma 7.4 for the same reason Lemma 3.7 holds; the terms $\mathbf {D}^{N}$ and $\mathbf {C}^{N}$ solve the same stochastic equation, whose solutions are unique, until the time $\mathfrak {t}_{\mathrm {BG}}$ , which according to the first claim is equal to $1$ with high probability. To prove the first high probability claim in Lemma 7.4, note $\mathfrak {t}_{\mathrm {BG}}\neq 1$ implies $\mathfrak {t}_{\mathrm {BG}}<1$ , as $\mathfrak {t}_{\mathrm {BG}}\leqslant 1$ deterministically. We now claim $\mathfrak {t}_{\mathrm {BG}}<1$ implies
The first inequality in equation (7.5) follows trivially because $\mathfrak {t}_{\mathrm {BG}}\leqslant 1$ deterministically. To justify the second inequality in equation (7.5), we assume the opposite, so the final inequality in equation (7.5) is reversed and strict. The heat operator is continuous in time with probability 1 as the product $N^{1/2}\bar {\mathfrak {q}}\mathbf {Y}^{N}$ is finite, even if not uniformly bounded in N; we emphasize we are not claiming quantitative regularity of the heat operator that is controlled in the large-N limit by any means. If the last inequality in equation (7.5) is reversed while strict, then for $\mathfrak {t}_{\mathrm {BG}}<1$ we may find $\mathrm {t}\in (\mathfrak {t}_{\mathrm {BG}},1]$ by continuity of the heat operator such that
Because we have assumed the reverse of the last inequality in equation (7.5), this implies $\mathfrak {t}_{\mathrm {BG}}$ must actually be at least $\mathrm {t}$ because until time $\mathrm {t}$ , the $\mathbf {H}^{N}(N^{1/2}\bar {\mathfrak {q}}\mathbf {Y}^{N})$ term is strictly less than $N^{-\beta _{\mathrm {BG}}/999}$ , and therefore by continuity in time of this $\mathbf {H}^{N}(N^{1/2}\bar {\mathfrak {q}}\mathbf {Y}^{N})$ term, we may ‘wait’ a positive amount after $\mathrm {t}$ to see $\mathbf {H}^{N}(N^{1/2}\bar {\mathfrak {q}}\mathbf {Y}^{N})$ exceed $N^{-\beta _{\mathrm {BG}}/999}$ . The condition $\mathfrak {t}_{\mathrm {BG}}\geqslant \mathrm {t}$ we have just established contradicts the fact that $\mathrm {t}>\mathfrak {t}_{\mathrm {BG}}$ by construction. This provides the last inequality in equation (7.5).
We now recap that $\mathfrak {t}_{\mathrm {BG}}\neq 1$ implies $\mathfrak {t}_{\mathrm {BG}}<1$ , which in turn implies the inequalities (7.5). Therefore, to prove $\mathfrak {t}_{\mathrm {BG}}=1$ occurs with high probability, we estimate the probability of observing the inequalities in equation (7.5). This is at most $\mathrm {O}(N^{-\beta _{\mathrm {BG}}/99})$ by Theorem 4.1, which estimates the expectation of the far LHS of equation (7.5), and the Markov inequality.
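For concreteness, the Markov inequality step can be sketched as follows. We write $\mathfrak {X}$ for the far LHS of equation (7.5), which is at least $N^{-\beta _{\mathrm {BG}}/999}$ on the event $\mathfrak {t}_{\mathrm {BG}}<1$ , and we assume, purely for illustration, that Theorem 4.1 provides $\mathbf {E}\mathfrak {X}\lesssim N^{-\mathfrak {a}}$ for some exponent $\mathfrak {a}>0$ . The symbols $\mathfrak {X}$ , $\mathfrak {a}$ , $\mathbf {P}$ and $\mathbf {E}$ are ours, and the precise exponent should be read off from Theorem 4.1:
$$ \begin{align} \mathbf{P}\left(\mathfrak{t}_{\mathrm{BG}}<1\right) \ \leqslant \ \mathbf{P}\left(\mathfrak{X}\geqslant N^{-\beta_{\mathrm{BG}}/999}\right) \ \leqslant \ N^{\beta_{\mathrm{BG}}/999}\,\mathbf{E}\mathfrak{X} \ \lesssim \ N^{-\mathfrak{a}+\beta_{\mathrm{BG}}/999}. \end{align} $$
In this sketch, any exponent $\mathfrak {a}\geqslant \beta _{\mathrm {BG}}/999+\beta _{\mathrm {BG}}/99$ makes the far RHS at most of order $N^{-\beta _{\mathrm {BG}}/99}$ , which matches the bound quoted above.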
We now introduce the following deterministic estimate that explains the utility of introducing $\mathfrak {t}_{\mathrm {BG}}$ and $\widetilde {\mathbf {Y}}^{N}$ and $\widetilde {\mathfrak {D}}^{N}$ .
Lemma 7.5. We have the deterministic estimate $\|\mathbf {H}^{N}(N^{1/2}\bar {\mathfrak {q}}\widetilde {\mathbf {Y}}^{N})\|_{1;\mathbb {T}_{N}}\leqslant N^{-\beta _{\mathrm {BG}}/999}$ .
Proof. Because $\mathfrak {t}_{\mathrm {BG}}\leqslant 1$ deterministically, and because $\widetilde {\mathbf {Y}}^{N}$ vanishes after time $\mathfrak {t}_{\mathrm {BG}}$ by construction, we may employ Proposition A.3 to deduce the following consequence of the $\mathscr {L}^{\infty }$ -contractive property and the semigroup property of spatial heat operators; see the proof of Lemma 5.5, namely equation (5.12) therein, for a ‘gradient’ version of the following:
It now suffices to observe the RHS of equation (7.7) is bounded above by $N^{-\beta _{\mathrm {BG}}/999}$ , because this proposed upper bound is true if we replace $\mathfrak {t}_{\mathrm {BG}}$ with any $\mathrm {t}<\mathfrak {t}_{\mathrm {BG}}$ and, like in the proof of Lemma 7.4, the heat operator $\mathbf {H}^{N}$ is continuous in space-time.
The last ingredient we require for the proof for Proposition 3.12 is a pointwise moment estimate for $\mathbf {C}^{N}$ which is proved with stochastic analytic means like those used in the proof of Proposition 3.2 in [Reference Dembo and Tsai19] for the Gartner transform therein. Afterwards, we will ‘glue’ this pointwise estimate to a uniform estimate on $[0,1]\times \mathbb {T}_{N}$ via union bound and continuity.
Lemma 7.6. Consider any $p\geqslant 1$ . We have the estimate $\|\mathbf {C}^{N}_{T,x}\|_{\omega ;2p}\lesssim _{p}N^{-\beta _{\mathrm {BG}}/999}$ uniformly on $[0,1]\times \mathbb {T}_{N}\kern-1.5pt$ .
Proof. We estimate the $\|\|_{\omega ;2p}^{2}$ squared norm for every term on the RHS of the $\mathbf {C}^{N}$ equation from Definition 7.3. We first employ Lemma 7.5 to establish the following estimate uniformly in $p\geqslant 1$ and uniformly in space-time; let us clarify that the first bound below uses an elementary/general $\mathscr {L}^{p}\leqslant \mathscr {L}^{\infty }$ bound for random variables:
For the remaining terms in the $\mathbf {C}^{N}$ equation in Definition 7.3, we will follow the proof of (3.12) in Proposition 3.2 in [Reference Dembo and Tsai19]. Similar to the proof of Lemma 3.14, all of the estimates used to prove (3.12) in Proposition 3.2 in [Reference Dembo and Tsai19] hold for the corresponding terms in the $\mathbf {C}^{N}$ equation from Definition 7.3 as we only need the heat kernel estimates in Proposition A.3 and, to control the $\mathbf {C}^{N}\mathrm {d}\xi ^{N}$ term in the $\mathbf {C}^{N}$ equation, the martingale inequality in Lemma A.4 that generalizes Lemma 3.1 in [Reference Dembo and Tsai19] beyond the Gartner transform. We ultimately deduce from equation (7.8) and this paragraph the following integral bound for $\|\|_{\omega ;2p}^{2}$ ; recall that $\mathbf {O}_{S,T}=|T-S|$ :
We emphasize the estimate (7.9) holds uniformly in space-time on the LHS. Thus, we may extend equation (7.9) upon taking a supremum over $\mathbb {T}_{N}$ on the LHS therein. At this point, we may employ the Gronwall inequality to deduce, for times $T\leqslant 1$ :
with the last inequality above following by an elementary integral calculation inside the exponential in the middle of equation (7.10).
From Lemma 7.6, we get the following union bound estimate that controls $\mathbf {C}^{N}$ over a very fine discretization of space-time.
Corollary 7.7. Define $\mathbb {I}^{\mathrm {d}}=\mathbb {I}^{\mathbf {T},\mathrm {d}}\times \mathbb {T}_{N}$ , in which $\mathbb {I}^{\mathbf {T},\mathrm {d}}=\{\mathfrak {j}N^{-99}\}_{\mathfrak {j}=0}^{N^{99}}$ . The following holds with overwhelming probability:
Proof. Provided any $(\mathrm {t},x)\in \mathbb {I}^{\mathrm {d}}$ , the Chebyshev inequality implies the probability estimate
Therefore, a union bound implies that the probability the proposed estimate (7.11) fails is bounded by $\kappa _{p}N^{-2p\beta _{\mathrm {BG}}/9999}$ times the number $|\mathbb {I}^{\mathrm {d}}|$ of points we take a union bound over, in which $\kappa _{p}\geqslant 1$ depends only on $p\geqslant 1$ . As $|\mathbb {I}^{\mathrm {d}}|=|\mathbb {I}^{\mathbf {T},\mathrm {d}}||\mathbb {T}_{N}|\lesssim N^{100}$ , we know the probability the proposed estimate in equation (7.11) fails is bounded by $\kappa _{p}N^{-2p\beta _{\mathrm {BG}}/9999+100}$ . Taking $p\geqslant 1$ arbitrarily large implies equation (7.11) holds with overwhelming probability.
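As a worked instance of the last step, in which the target exponent $\mathfrak {d}$ is our notation: for any fixed $\mathfrak {d}>0$ , choosing $p\geqslant 9999(100+\mathfrak {d})/(2\beta _{\mathrm {BG}})$ gives $2p\beta _{\mathrm {BG}}/9999\geqslant 100+\mathfrak {d}$ , and therefore
$$ \begin{align} \kappa_{p}N^{-2p\beta_{\mathrm{BG}}/9999+100} \ \leqslant \ \kappa_{p}N^{-\mathfrak{d}}. \end{align} $$
The constant $\kappa _{p}$ depends on $\mathfrak {d}$ through p but not on N, so the failure probability decays faster than any fixed power of N, which is how we read ‘overwhelming probability’ here.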
Proof of Proposition 3.12 .
Throughout, observe that the intersection of any uniformly bounded number of events that hold with high probability also holds with high probability, which can be easily shown with the union bound for the complements of these events. With this, by Lemma 7.4, it suffices to prove Proposition 3.12 but replacing $\mathbf {D}^{N}$ by $\mathbf {C}^{N}$ . Next, we employ the estimate (7.1), bootstrapping an estimate from a discretization to the continuum, but for $\mathbf {C}^{N}$ in place of $\mathbf {Q}^{N}$ ; observe the proof of equation (7.1) is blind to what $\mathbf {Q}^{N}$ actually is:
It suffices to estimate each term from the RHS of equation (7.13) by $N^{-\beta }$ times a universal constant with high probability. For the first term on the RHS of equation (7.13), we employ Corollary 7.7. For the second term, we employ Lemma A.6, which implies the second term on the RHS of equation (7.13) is controlled by the first term on the RHS of equation (7.13). This finishes the proof of Proposition 3.12.
7.3 Proof of Proposition 3.11
We use a continuity method that is frequently used in the study of PDE. Roughly speaking, we observe that for stable initial data, regularity estimates defining the stopping time $\mathfrak {t}_{\mathrm {st}}$ of current interest from Definition 3.1 are satisfied at time 0, at least with high probability. We then condition on path-space events in which $\mathbf {U}^{N}$ admits sufficiently good upper bounds, which will be inherited by sufficiently good upper and lower bounds for $\mathbf {Q}^{N}$ and $\mathbf {D}^{N}=\mathbf {U}^{N}-\mathbf {Q}^{N}$ . We also condition on path-space events in which the space-time regularity of $\mathbf {U}^{N}$ is sufficiently good provided upper and lower bounds; this will follow from the probability estimates in Proposition 6.1. In particular, until time $\mathfrak {t}_{\mathrm {st}}$ , we basically know the space-time estimates defining $\mathfrak {t}_{\mathrm {st}}$ with high probability except upon replacing $\mathbf {Z}^{N}$ in there with $\mathbf {U}^{N}$ . However, Lemma 3.7 implies that since we look at times before $\mathfrak {t}_{\mathrm {st}}$ , we do not actually have to make such a replacement. Now, if $\mathfrak {t}_{\mathrm {st}}\neq 1$ , in which case $\mathfrak {t}_{\mathrm {st}}<1$ , we may apply the short-time estimates in Lemma A.6 to push the space-time estimates in $\mathfrak {t}_{\mathrm {st}}$ for $\mathbf {Z}^{N}$ past $\mathfrak {t}_{\mathrm {st}}$ by a very small amount of time, thus contradicting the definition of $\mathfrak {t}_{\mathrm {st}}$ similar to our proof of Lemma 7.4. We clarify that the crux of the strategy is the observation that we can turn slightly suboptimal estimates for $\mathbf {Z}^{N}$ into slightly better suboptimal estimates, which are closer to ‘the truth’. 
This is because we need the a priori suboptimal estimates in $\mathfrak {t}_{\mathrm {st}}$ only to analyze the $\bar {\mathfrak {q}}$ term in the $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ equations, and such a term is vanishingly small anyway with respect to space-time regularity norms in $\mathfrak {t}_{\mathrm {st}}$ . Therefore, the space-time behavior of the $\bar {\mathfrak {q}}$ term in the $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ equations is, with high probability, better than the space-time behavior of $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ that we assume through the stopping time $\mathfrak {t}_{\mathrm {st}}$ , while the other terms in the $\mathbf {Z}^{N}$ and $\mathbf {U}^{N}$ equations admit ‘good’ space-time estimates by standard moment bounds as in [Reference Bertini and Giacomin3, Reference Dembo and Tsai19]. To make this precise, we introduce another set of stopping times.
Definition 7.8. We define $\varepsilon _{\mathrm {ap},1}=999^{-999}\varepsilon _{\mathrm {ap}}\wedge 999^{-999}\beta $ , where $\varepsilon _{\mathrm {ap}}>0$ is from Definition 3.1 and $\beta>0$ is the universal constant in Proposition 3.12. We now define the following pair of stopping times, the first of which provides uniform upper and lower bounds for $\mathbf {Q}^{N}$ , and where the second stopping time below provides a uniform upper bound for $\mathbf {D}^{N}$ :
We proceed with the following time regularity stopping time, in which $\mathbb {I}^{\mathbf {T}}$ is the set of discrete mesoscopic timescales that were defined in Definition 3.1; we use the same exponent $\varepsilon _{\mathrm {ap},1}$ below as we did for $\mathfrak {t}_{\mathrm {st},1}$ and $\mathfrak {t}_{\mathrm {st},2}$ :
We additionally define the following spatial regularity stopping time, where $\mathfrak {l}_{N}$ is the maximal length scale for spatial gradients that was used in the stopping time $\mathfrak {t}_{\mathrm {st}}$ in Definition 3.1. We again use the exponent $\varepsilon _{\mathrm {ap},1}$ below as we did for $\mathfrak {t}_{\mathrm {st},1}$ , $\mathfrak {t}_{\mathrm {st},2}$ and $\mathfrak {t}_{\mathrm {st},3}$ :
We proceed with defining the following a priori short-time estimate random time for $\mathbf {Z}^{N}\kern-1.5pt$ . We emphasize that the following time is not a stopping time as it looks forward in the future and thus it is not adapted to the filtration of the interacting particle system. However, this will not be important as our analysis in this section is deterministic after we have established Lemma 7.9 below:
We conclude by defining $\mathfrak {t}_{\mathrm {st},6}=\mathfrak {t}_{\mathrm {st},1}\wedge \mathfrak {t}_{\mathrm {st},2}\wedge \mathfrak {t}_{\mathrm {st},3}\wedge \mathfrak {t}_{\mathrm {st},4}\wedge \mathfrak {t}_{\mathrm {st},5}$ .
Lemma 7.9. With high probability, we have $\mathfrak {t}_{\mathrm {st},6}=1$ .
Proof. As remarked at the beginning of the proof for Proposition 3.12, the intersection of a uniformly bounded number of events that hold with high probability also holds with high probability. Therefore, because $\mathfrak {t}_{\mathrm {st},6}$ is the minimum of $\mathfrak {t}_{\mathrm {st},1},\ldots ,\mathfrak {t}_{\mathrm {st},5}$ , it suffices to prove that $\mathfrak {t}_{\mathrm {st},\mathfrak {j}}=1$ with high probability for any $\mathfrak {j}\in \{1,\ldots ,5\}$ . For $\mathfrak {j}=1$ , we first observe that the $\|\|_{\mathrm {t};\mathbb {T}_{N}}$ norm is monotone nondecreasing in $\mathrm {t}$ . Thus, because $\mathfrak {t}_{\mathrm {st},1}\leqslant 1$ , if $\mathfrak {t}_{\mathrm {st},1}\neq 1$ , then $\mathfrak {t}_{\mathrm {st},1}<1$ , so the lower bound defining $\mathfrak {t}_{\mathrm {st},1}$ is actually realized for some $\mathrm {t}\in [0,1)$ , and thus the upper bounds in Lemma 7.1 and Lemma 7.2 fail on this event. By the probability estimates in Lemmas 7.1 and 7.2, such failure happens outside an event of high probability, so we have $\mathfrak {t}_{\mathrm {st},1}=1$ with high probability. A similar argument but when using Proposition 3.12 in place of Lemmas 7.1 and 7.2 shows $\mathfrak {t}_{\mathrm {st},2}=1$ with high probability as well. We proceed with showing $\mathfrak {t}_{\mathrm {st},3}=1$ with high probability. Consider Proposition 6.1 for the choice of stopping time $\mathfrak {t}_{\mathrm {r}}=\mathfrak {t}_{\mathrm {st},3}$ . Let us assume that $\mathfrak {t}_{\mathrm {st},3}\neq 1$ , and thus like the previous argument we have $\mathfrak {t}_{\mathrm {st},3}<1$ for this event. Also, similar to the previous argument, note if $\mathfrak {t}_{\mathrm {st},3}<1$ , then the lower bound defining $\mathfrak {t}_{\mathrm {st},3}$ is realized at the time $\mathfrak {t}_{\mathrm {st},3}$ . In particular, through the time-regularity estimate in Proposition 6.1 we know that this only happens outside an event that happens with high probability, and thus $\mathfrak {t}_{\mathrm {st},3}=1$ with high probability. 
The same argument but using the spatial regularity estimate from Proposition 6.1 for $\mathfrak {t}_{\mathrm {r}}=\mathfrak {t}_{\mathrm {st},4}$ implies that $\mathfrak {t}_{\mathrm {st},4}=1$ with high probability as well. We are left with proving $\mathfrak {t}_{\mathrm {st},5}=1$ with high probability. This follows immediately from Lemma A.6, so we are done.
Proof of Proposition 3.11 .
We first observe the following union bound inequality, which tells us that if $\mathfrak {t}_{\mathrm {st}}<1$ , then $\mathfrak {t}_{\mathrm {st},6}<1$ or $\mathfrak {t}_{\mathrm {st},6}=1$ and $\mathfrak {t}_{\mathrm {st}}<1$ , where $\mathfrak {t}_{\mathrm {st},6}$ is the last stochastic time defined in Definition 7.8:
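In symbols, and as a hedged restatement read off from the sentence above rather than a verbatim formula, this union bound takes the following form, in which $\mathbf {P}$ denotes the underlying probability (our notation):
$$ \begin{align} \mathbf{P}\left(\mathfrak{t}_{\mathrm{st}}<1\right) \ \leqslant \ \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}<1\right) + \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}=1, \ \mathfrak{t}_{\mathrm{st}}<1\right). \end{align} $$
This is how we read equation (7.17): its RHS consists of a first probability controlled by Lemma 7.9 and a second probability that must be treated separately.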
We apply Lemma 7.9 and deduce the first probability on the RHS of equation (7.17) is at most $\gamma +\kappa _{\gamma }\mathrm {o}_{N}$ for any $\gamma>0$ , where $\mathrm {o}_{N}$ vanishes in the large-N limit uniformly in $\gamma>0$ . Thus, it suffices to deduce the same estimate for the second term on the RHS of equation (7.17). Actually, we will prove the second probability on the RHS of equation (7.17) is equal to 0. To this end, let us recall the definition of $\mathfrak {t}_{\mathrm {st}}$ from Definition 3.1 and, again using a union bound inequality, get the following upper bound for the second term on the RHS of equation (7.17), which follows by conditioning on which of $\mathfrak {t}_{\mathrm {ap}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ in Definition 3.1 is smallest and equal to $\mathfrak {t}_{\mathrm {st}}$ :
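Written out, and again as a hedged restatement of the sentence above that uses only the fact that $\mathfrak {t}_{\mathrm {st}}$ equals the smallest of $\mathfrak {t}_{\mathrm {ap}}$ , $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ and $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ (the symbol $\mathbf {P}$ for the underlying probability is our notation), this upper bound has three terms:
$$ \begin{align} \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}=1, \ \mathfrak{t}_{\mathrm{st}}<1\right) \ \leqslant \ \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}=1, \ \mathfrak{t}_{\mathrm{ap}}=\mathfrak{t}_{\mathrm{st}}<1\right) + \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}=1, \ \mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}}=\mathfrak{t}_{\mathrm{st}}<1\right) + \mathbf{P}\left(\mathfrak{t}_{\mathrm{st},6}=1, \ \mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}}=\mathfrak{t}_{\mathrm{st}}<1\right). \end{align} $$
The three terms on the RHS correspond, in order, to the a priori bounds, the time regularity and the spatial regularity of $\mathbf {Z}^{N}$ .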
We are left with showing each term on the RHS of the above is equal to 0; this would give the proof of Proposition 3.11, again because equation (7.18) is an upper bound for the second term on the RHS of equation (7.17). We will organize our computations for each probability in equation (7.18) in one of three bullet points below. First, we assume N is sufficiently large so that $N^{\varepsilon _{\mathrm {ap},1}}\geqslant 99999$ , for example.
• We treat the first term in equation (7.18). To this end, consider $0<\mathfrak {t}_{N}\leqslant N^{-100}$ so $\mathfrak {t}_{\mathrm {ap}}+\mathfrak {t}_{N}\leqslant 1$ . Because $\mathfrak {t}_{\mathrm {st},6}=1$ by assumption of the event we are working on, we know $\mathfrak {t}_{\mathrm {st},5}=1$ as well. By definition of $\mathfrak {t}_{\mathrm {st},5}$ in Definition 7.8, we deduce the following short-time estimate, which relates the value of $\mathbf {Z}^{N}$ after time $\mathfrak {t}_{\mathrm {ap}}$ and until time $\mathfrak {t}_{\mathrm {ap}}+\mathfrak {t}_{N}$ to its values at time $\mathfrak {t}_{\mathrm {ap}}$ ; the following first inequality is proved by using the proof for equation (7.1), which we recall is blind to what $\mathbf {Q}^{N}$ actually is, while the second inequality estimating short-time behavior of $\mathbf {Z}^{N}$ follows from the identity $\mathfrak {t}_{\mathrm {st},5}=1$ we have just noted:
(7.19) $$ \begin{align} \|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \leqslant \ \|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} + \sup_{\mathrm{s}\in[0,N^{-99}]}\|\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \leqslant \ \|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} + N^{-\frac12+\varepsilon_{\mathrm{ap},1}}\|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}}. \end{align} $$
Recall from Lemma 3.7 that until time $\mathfrak {t}_{\mathrm {st}}=\mathfrak {t}_{\mathrm {ap}}$ , we have the identification $\mathbf {Z}^{N}=\mathbf {U}^{N}=\mathbf {Q}^{N}+\mathbf {D}^{N}$ , where we recall that $\mathbf {U}^{N}$ is defined in Definition 3.5, $\mathbf {Q}^{N}$ is defined in Definition 3.8, and $\mathbf {D}^{N}$ is defined in Proposition 3.12. Since $\mathfrak {t}_{\mathrm {st},6}=1$ , we also have $\mathfrak {t}_{\mathrm {st},1}=1$ and $\mathfrak {t}_{\mathrm {st},2}=1$ by assumption, and this allows us to extend equation (7.19) as follows:
(7.20) $$ \begin{align} \|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \lesssim \ \|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \leqslant \ \|\mathbf{Q}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}}+\|\mathbf{D}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \lesssim \ N^{\varepsilon_{\mathrm{ap},1}}. \end{align} $$
We recall $\varepsilon _{\mathrm {ap},1}\leqslant 999^{-999}\varepsilon _{\mathrm {ap}}$ with $\varepsilon _{\mathrm {ap}}>0$ in Definition 3.1. We also recall N is large enough so that even with the implied constants in equation (7.20), we deduce that the far LHS of equation (7.20) is at most $N^{2\varepsilon _{\mathrm {ap},1}}\leqslant N^{\varepsilon _{\mathrm {ap}}/2}$ . 
Parallel to equation (7.19), we also get, by applying $a^{-1}-(a+b)^{-1}\leqslant ba^{-2}$ for $a>0$ and $b\geqslant 0$ and by recalling $\mathfrak {t}_{\mathrm {st},5}=1$ that controls $(\mathbf {Z}^{N})^{-1}\nabla ^{\mathbf {T}}\mathbf {Z}^{N}$ for short times, that
(7.21) $$ \begin{align} \|(\mathbf{Z}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \leqslant \ \|(\mathbf{Z}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} + \sup_{\mathrm{s}\in[0,N^{-99}]}\|(\mathbf{Z}^{N})^{-2}\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \lesssim \ \|(\mathbf{Z}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}}, \end{align} $$
while parallel to equation (7.20), we extend equation (7.21) to the following estimate in which we now invoke the lower bound for $\mathbf {Q}^{N}$ that comes from the constraint $\mathfrak {t}_{\mathrm {st},1}=1$ along with the upper bound for $\mathbf {D}^{N}$ that comes from our assumption $\mathfrak {t}_{\mathrm {st},2}=1$ :
(7.22) $$ \begin{align} \|(\mathbf{Z}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \lesssim \ \|(\mathbf{Z}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \lesssim \ \|(\mathbf{Q}^{N})^{-1}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} + \|(\mathbf{Q}^{N})^{-2}\mathbf{D}^{N}\|_{\mathfrak{t}_{\mathrm{ap}};\mathbb{T}_{N}} \ \lesssim \ N^{\varepsilon_{\mathrm{ap},1}}. \end{align} $$
Indeed, the last estimate above follows from the assumption that $\varepsilon _{\mathrm {ap},1}\leqslant 999^{-999}\beta $ , and thus the $N^{-\varepsilon _{\mathrm {ap},1}}$ lower bound for $\mathbf {Q}^{N}$ that we get from $\mathfrak {t}_{\mathrm {st},1}=1$ is much larger than the $N^{-\beta /2}$ upper bound for $\mathbf {D}^{N}$ that we get from $\mathfrak {t}_{\mathrm {st},2}=1$ , at least in the large-N limit. 
We emphasize that $\varepsilon _{\mathrm {ap},1}\leqslant 999^{-999}\varepsilon _{\mathrm {ap}}$ by construction as well, and thus the far LHS of equation (7.22) is bounded above by $N^{\varepsilon _{\mathrm {ap}}/2}$ without any implied constants or extra factors. Thus, equations (7.20) and (7.22) imply the lower bound in the infimum defining $\mathfrak {t}_{\mathrm {ap}}$ fails for all times $\mathfrak {t}\in [0,1]$ before $\mathfrak {t}_{\mathrm {ap}}+\mathfrak {t}_{N}$ . This contradicts the definition of $\mathfrak {t}_{\mathrm {ap}}$ if $\mathfrak {t}_{\mathrm {ap}}<1$ , as these lower bounds necessarily fail at and/or immediately after $\mathfrak {t}_{\mathrm {ap}}<1$ by definition of $\mathfrak {t}_{\mathrm {ap}}$ . This shows the first probability in equation (7.18) is 0.
• We move to the second probability in equation (7.18), which amounts to estimating time gradients of $\mathbf {Z}^{N}\kern-1.5pt$ . In particular, take $0<\mathfrak {t}_{N}\leqslant N^{-100}$ so $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}\leqslant 1$ similar to the previous bullet point. Consider any $0\leqslant \mathfrak {t}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ . Define $0\leqslant \mathfrak {t}_{0}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ to be the closest such time to $\mathfrak {t}$ . Last, take any $\mathfrak {r}\in \mathbb {I}^{\mathbf {T}}$ , with $\mathbb {I}^{\mathbf {T}}$ in Definition 3.1. The time gradient of $\mathbf {Z}^{N}$ evaluated at time $\mathfrak {t}$ with respect to timescale $-\mathfrak {r}$ is the time gradient of $\mathbf {Z}^{N}$ with respect to the same timescale $-\mathfrak {r}$ but evaluated at time $\mathfrak {t}_{0}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ , if we include two error terms that result from replacing the times at which we evaluate $\mathbf {Z}^{N}\kern-1.5pt$ . To be precise, this first error term is given by the difference of $\mathbf {Z}^{N}$ at time $\mathfrak {t}+\mathfrak {r}$ with $\mathbf {Z}^{N}$ at time $\mathfrak {t}_{0}+\mathfrak {r}$ , and the second error term is given by the difference of $\mathbf {Z}^{N}$ at time $\mathfrak {t}$ with $\mathbf {Z}^{N}$ at time $\mathfrak {t}_{0}$ . We observe now that the difference between any of these two pairs of times at which we compare the values of $\mathbf {Z}^{N}$ is bounded by $N^{-100}$ because the distance of any time $\mathfrak {t}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ to the set of times less than or equal to $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ is at most $\mathfrak {t}_{N}\leqslant N^{-100}$ . 
The conclusion of the last three sentences is the following, uniform over allowable timescales $\mathfrak {r}\in \mathbb {I}^{\mathbf {T}}$ and which is a time-gradient version of (7.19):
(7.23) $$ \begin{align} \|\nabla_{-\mathfrak{r}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \leqslant \ \|\nabla_{-\mathfrak{r}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}}+2\sup_{\mathrm{s}\in[0,N^{-99}]}\|\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}}. \end{align} $$
Because we assume $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}=\mathfrak {t}_{\mathrm {st}}$ , the first term on the RHS of equation (7.23) stays the same if we replace $\mathbf {Z}^{N}$ by $\mathbf {U}^{N}$ , as a consequence of the pathwise identification in Lemma 3.7. Because $\mathfrak {t}_{\mathrm {st},6}=1$ by assumption of the event in the second probability in equation (7.18) on which we are working, we have the identity $\mathfrak {t}_{\mathrm {st},3}=1$ ; see Definition 7.8. The identity $\mathfrak {t}_{\mathrm {st},3}=1$ implies that the estimate in the infimum defining $\mathfrak {t}_{\mathrm {st},3}$ fails for $\mathfrak {t}=1$ , which therefore controls the first term on the RHS of equation (7.23) via an upper bound we specify shortly. On the other hand, because $\mathfrak {t}_{\mathrm {st},6}=1$ by assumption, we may similarly deduce that $\mathfrak {t}_{\mathrm {st},5}=1$ holds automatically. By definition of $\mathfrak {t}_{\mathrm {st},5}$ , the identity $\mathfrak {t}_{\mathrm {st},5}=1$ similarly implies that the estimate in the infimum defining $\mathfrak {t}_{\mathrm {st},5}$ also fails if $\mathfrak {t}=1$ . This bounds the second term on the RHS of equation (7.23) by the short-time factor of $N^{-1/2+\varepsilon _{\mathrm {ap},1}}$ times the same norm but for $\mathbf {Z}^{N}$ instead of its scale- $\mathrm {s}$ time gradient. 
Ultimately, from this paragraph and equation (7.23) multiplied by $\mathfrak {r}^{-1/4}$ , we deduce the following for which we note $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}\leqslant 1$ , so all norms may be pushed to time 1 as we are only concerned with upper bounds. Let us clarify the second bound below follows by $\mathfrak {r}\in \mathbb {I}^{\mathbf {T}}$ , which implies $\mathfrak {r}\geqslant N^{-2}$ and $\mathfrak {r}^{-1/4}\leqslant N^{1/2}$ ; also note $\mathbf {U}^{N}=\mathbf {Z}^{N}$ until time $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ :
(7.24) $$ \begin{align} \mathfrak{r}^{-\frac14}\|\nabla_{-\mathfrak{r}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ &\leqslant \ \mathfrak{r}^{-\frac14}\|\nabla_{-\mathfrak{r}}^{\mathbf{T}}\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}}+ 2\mathfrak{r}^{-\frac14}\sup_{\mathrm{s}\in[0,N^{-99}]}\|\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}} \end{align} $$
(7.25) $$ \begin{align} &\leqslant \ N^{\varepsilon_{\mathrm{ap},1}}(1+\|\mathbf{U}^{N}\|_{1;\mathbb{T}_{N}}^{2}) + 2\mathfrak{r}^{-\frac14}N^{-\frac12+\varepsilon_{\mathrm{ap},1}}\|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}}\end{align} $$
(7.26) $$ \begin{align}&\leqslant \ N^{\varepsilon_{\mathrm{ap},1}}(1+\|\mathbf{U}^{N}\|_{1;\mathbb{T}_{N}}^{2}) + 2N^{\varepsilon_{\mathrm{ap},1}}\|\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}};\mathbb{T}_{N}}. \end{align} $$
Because we now have $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}=\mathfrak {t}_{\mathrm {st}}$ on the event we currently work on, we may follow the second inequality in equation (7.20) and estimate $\mathbf {U}^{N}$ by $\mathbf {Q}^{N}$ and $\mathbf {D}^{N}$ . 
Similar to the end of equation (7.20), we remark that $\mathfrak {t}_{\mathrm {st},6}=1$ implies $\mathfrak {t}_{\mathrm {st},1}=1$ and $\mathfrak {t}_{\mathrm {st},2}=1$ automatically, which, again as in the end of equation (7.20), implies upper bounds for each of $\mathbf {Q}^{N}$ and $\mathbf {D}^{N}=\mathbf {U}^{N}-\mathbf {Q}^{N}$ given by $N^{\varepsilon _{\mathrm {ap},1}}$ each, for example, and thus an upper bound for $\mathbf {U}^{N}$ of the same order. In particular, via this paragraph and the estimate (7.26), we deduce the following estimate in which we again recall $\varepsilon _{\mathrm {ap},1}\leqslant 999^{-999}\varepsilon _{\mathrm {ap}}$ so that the middle term below is at most $N^{\varepsilon _{\mathrm {ap}}/2}$ even with the implied constants/factors in the first estimate below:
(7.27) $$ \begin{align} \mathfrak{r}^{-\frac14}\|\nabla_{-\mathfrak{r}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{T}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \lesssim \ N^{3\varepsilon_{\mathrm{ap},1}} \ \leqslant \ N^{\varepsilon_{\mathrm{ap}}/2}. \end{align} $$
Because the last estimate in equation (7.27) is uniform over admissible time-gradient timescales $\mathfrak {r}\in \mathbb {I}^{\mathbf {T}}$ , we observe that the estimate in the infimum defining $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ fails if $\mathfrak {t}=\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ . Because the LHS of said estimate in said infimum is monotone nondecreasing in $\mathfrak {t}\geqslant 0$ , we observe that it also fails for all times $\mathfrak {t}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ . 
Thus, by definition of $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}$ , we have $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}>\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ as long as $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}<1$ so that we can actually find $\mathfrak {t}_{N}>0$ satisfying $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}\leqslant 1$ . The previous inequality $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}>\mathfrak {t}_{\mathrm {RN}}^{\mathbf {T}}+\mathfrak {t}_{N}$ is a clear contradiction for $\mathfrak {t}_{N}>0$ , so the second probability in equation (7.18) must be that of an empty event and thus equal to zero.
• We move to the last probability in equation (7.18) for spatial gradients of $\mathbf {Z}^{N}\kern-1.5pt$ . We follow a strategy similar to the previous bullet point but replacing time gradients by spatial gradients. In particular, let us first take $0<\mathfrak {t}_{N}\leqslant N^{-100}$ such that $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}+\mathfrak {t}_{N}\leqslant 1$ . We may replace any spatial gradient of $\mathbf {Z}^{N}$ evaluated at any time $\mathfrak {t}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}+\mathfrak {t}_{N}$ with a spatial gradient of $\mathbf {Z}^{N}$ but evaluated at a time $\mathfrak {t}_{0}\leqslant \mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}$ satisfying $|\mathfrak {t}-\mathfrak {t}_{0}|\leqslant \mathfrak {t}_{N}\leqslant N^{-100}$ , if we account for the resulting errors given by scale $\mathrm {s}\leqslant N^{-99}$ time gradients of $\mathbf {Z}^{N}\kern-1.5pt$ , which come by replacing $\mathbf {Z}^{N}$ at times $\mathfrak {t}+\mathrm {s}$ and $\mathfrak {t}$ with $\mathbf {Z}^{N}$ at times $\mathfrak {t}_{0}+\mathrm {s}$ and $\mathfrak {t}_{0}$ , respectively. Below, we have taken any arbitrary length scale $1\leqslant |\mathfrak {l}|\leqslant \mathfrak {l}_{N}$ with $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ from Definition 3.1:
(7.28) $$ \begin{align} \|\nabla_{\mathfrak{l}}^{\mathbf{X}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ \leqslant \ \|\nabla_{\mathfrak{l}}^{\mathbf{X}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}};\mathbb{T}_{N}}+\sup_{\mathrm{s}\in[0,N^{-99}]}\|\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}};\mathbb{T}_{N}}. \end{align} $$Let us multiply both sides of equation (7.28) by $N^{1/2}|\mathfrak {l}|^{-1/2}$ . We argue as in the second bullet point. Because we have assumed $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}=\mathfrak {t}_{\mathrm {st}}$ on the event we are currently trying to prove has zero probability, the identification in Lemma 3.7 lets us replace $\mathbf {Z}^{N}$ with $\mathbf {U}^{N}$ in the first term on the RHS of (7.28). Because we have assumed $\mathfrak {t}_{\mathrm {st},6}=1$ on the current event as well, by Definition 7.8 we get $\mathfrak {t}_{\mathrm {st},4}=1$ automatically. This identity then implies the inequality in the infimum defining $\mathfrak {t}_{\mathrm {st},4}$ fails for $\mathfrak {t}=1$ , thereby providing an upper bound for the first term on the RHS of equation (7.28), which we specify shortly. On the other hand, we again note that $\mathfrak {t}_{\mathrm {st},6}=1$ implies that $\mathfrak {t}_{\mathrm {st},5}=1$ automatically, and this last identity provides an estimate for the second term on the RHS of equation (7.28). Again, we note $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}+\mathfrak {t}_{N}\leqslant 1$ by construction, so all norms may be pushed to time 1 since we are only concerned with upper bounds. 
We deduce the following parallel to equation (7.26), for which we note $|\mathfrak {l}|^{-1}\leqslant 1$ trivially:(7.29) $$ \begin{align} N^{1/2}|\mathfrak{l}|^{-1/2}\|\nabla_{\mathfrak{l}}^{\mathbf{X}} \mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}}+\mathfrak{t}_{N};\mathbb{T}_{N}} \ &\leqslant \ \|\nabla_{\mathfrak{l}}^{\mathbf{X}}\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}}; \mathbb{T}_{N}}+\sup_{\mathrm{s}\in[0,N^{-99}]}\|\nabla_{\mathrm{s}}^{\mathbf{T}}\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}}; \mathbb{T}_{N}}\end{align} $$(7.30) $$ \begin{align}&\leqslant \ N^{\varepsilon_{\mathrm{ap},1}}(1+\|\mathbf{U}^{N}\|_{1;\mathbb{T}_{N}}^{2}) + N^{1/2}|\mathfrak{l}|^{-1/2}N^{-\frac12+\varepsilon_{\mathrm{ap},1}}\|\mathbf{Z}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}};\mathbb{T}_{N}}\end{align} $$(7.31) $$ \begin{align}&\leqslant \ N^{\varepsilon_{\mathrm{ap},1}}(1+\|\mathbf{U}^{N}\|_{1;\mathbb{T}_{N}}^{2}) + N^{\varepsilon_{\mathrm{ap},1}}\|\mathbf{U}^{N}\|_{\mathfrak{t}_{\mathrm{RN}}^{\mathbf{X}};\mathbb{T}_{N}}. \end{align} $$We now proceed with the argument in the second bullet point above starting with the paragraph immediately after equation (7.26). This provides an upper bound of $N^{\varepsilon _{\mathrm {ap}}/2}$ for the LHS of equation (7.30) uniformly in $1\leqslant |\mathfrak {l}|\leqslant \mathfrak {l}_{N}$ , which, as we assumed $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}+\mathfrak {t}_{N}\leqslant 1$ with $\mathfrak {t}_{N}>0$ given that $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}<1$ , implies $\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}>\mathfrak {t}_{\mathrm {RN}}^{\mathbf {X}}+\mathfrak {t}_{N}$ , and this is a clear contradiction because $\mathfrak {t}_{N}$ is strictly positive.
We have shown that each probability in equation (7.18) is equal to zero. Together with equation (7.17) and the paragraph following it, which we used to control the first probability on the RHS of equation (7.17), this completes the proof of Proposition 3.11.
8 Boltzmann–Gibbs principle I – preliminary estimates
We record general estimates used in the proofs of both Propositions 4.6 and 4.7, as these proofs are similar in strategy. This includes a deterministic heat operator estimate that lets us replace space-time suprema of space-time heat operators by an integral whose expectation we can directly take. We estimate said expectation by a localization procedure for mesoscopic space-time averages of local functionals and then use a ‘local equilibrium’ estimate via the one-block and two-blocks steps of [Reference Guo, Papnicolaou and Varadhan28] and the log-Sobolev inequality of [Reference Yau51], ultimately reducing all of our calculations to standard equilibrium estimates that we will introduce. We conclude with a multiscale scheme to replace local functionals by space-time averages via step-by-step replacements.
8.0.1 Heat operator estimate
Our first estimate is deterministic. First, some convenient notation.
Definition 8.1. For any possibly random function $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ and any $\mathrm {t}\geqslant 0$ , define the space-time integral/sum
Lemma 8.2. Consider any possibly random function $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ . Provided any $\gamma>0$ , we have the estimate
Proof. Take $\mathrm {t}\in [0,1]$ and $x\in \mathbb {T}_{N}\kern-1.5pt$ . For any $\mathrm {s}\geqslant 0$ , we define $\mathrm {s}_{\sim }=\mathbf {O}_{\mathrm {s},\mathrm {t}}\vee N^{-2}$ with $\mathbf {O}_{\mathrm {s},\mathrm {t}}=|\mathrm {t}-\mathrm {s}|$ . We have
For the second estimate in equation (8.3), recall that the space-time heat operator $\mathbf {H}^{N}$ is integration against the heat kernel in space-time. The second estimate in equation (8.3) is then an immediate consequence of the Hölder inequality with respect to this integration, with Hölder conjugate exponents $3$ and $3/2$ . To build on equation (8.3), let us treat the first factor from the far RHS of equation (8.3). Note $\mathrm {s}_{\sim }^{-1}$ is independent of the spatial summation against the heat kernel, and because the heat kernel is a probability measure with respect to the forwards spatial variable, the first factor on the RHS of equation (8.3) turns into the integral of $\mathrm {s}_{\sim }^{-1}$ on the integration domain $[0,\mathrm {t}]\subseteq [0,1]$ . Although $\mathbf {O}_{\mathrm {s},\mathrm {t}}^{-1}$ is not integrable near $\mathrm {t}$ , because we have regularized $\mathbf {O}_{\mathrm {s},\mathrm {t}}$ by replacing it with $\mathrm {s}_{\sim }$ , the resulting integral is logarithmic in N and therefore at most $C_{\gamma }N^{\gamma }$ where $\gamma>0$ is arbitrary. This gives
Thus, it remains to bound the heat operator $\mathbf {H}^{N}$ in equation (8.5) by $\mathbf {I}_{1}$ . Recall this heat operator is integration in space-time against the heat kernel. By Proposition A.3, the heat kernel is $\mathrm {O}(N^{-1}\mathrm {s}_{\sim }^{-1/2})$ . The $\mathrm {s}_{\sim }^{-1/2}$ factor cancels the $\mathrm {s}_{\sim }^{1/2}$ factor in the heat operator in equation (8.5). The $N^{-1}$ factor in this heat kernel estimate makes the sum over $\mathbb {T}_{N}$ into an average over $\mathbb {T}_{N}$ , because $|\mathbb {T}_{N}|\lesssim N$ , thus we are left with $\mathbf {I}_{\mathrm {t}}(\cdot )$ instead of $\mathbf {H}^{N}_{\mathrm {t},x}(\mathrm {s}_{\sim }^{1/2}\cdot )$ . As $|\phi ||\mathbf {Y}^{N}|\geqslant 0$ , we may extend $\mathbf {I}_{\mathrm {t}}(|\phi |^{3/2}|\mathbf {Y}^{N}|^{3/2})\leqslant \mathbf {I}_{1}(|\phi |^{3/2}|\mathbf {Y}^{N}|^{3/2})$ , thus yielding the RHS of the proposed estimate from equation (8.5). Because the RHS of the proposed estimate is independent of the original space-time variables $\mathrm {t}$ and x, it bounds the far LHS of equation (8.3) uniformly in these variables. This yields the first estimate in equation (8.2). The second inequality follows by $|\mathbf {Y}^{N}|\lesssim N^{\varepsilon _{\mathrm {ap}}}$ ; see Definitions 3.1 and 3.5.
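The logarithmic bound on the regularized time integral used in this proof can be checked by hand. The display below is a sketch under the definitions in the proof ( $\mathrm {s}_{\sim }=|\mathrm {t}-\mathrm {s}|\vee N^{-2}$ and $0<\mathrm {t}\leqslant 1$ , with $N\geqslant 1$ ); the substitution variable $u$ and the explicit constant $1+2\gamma ^{-1}$ are introduced here for illustration and do not come from the original text.

```latex
% Substitute u = t - s, so that s_~ = u \vee N^{-2} for s in [0, t].
\begin{align*}
\int_{0}^{\mathrm{t}} \mathrm{s}_{\sim}^{-1}\,\mathrm{d}\mathrm{s}
  &= \int_{0}^{\mathrm{t}} (u \vee N^{-2})^{-1}\,\mathrm{d}u \\
  &= \int_{0}^{\mathrm{t}\wedge N^{-2}} N^{2}\,\mathrm{d}u
   + \int_{\mathrm{t}\wedge N^{-2}}^{\mathrm{t}} u^{-1}\,\mathrm{d}u \\
  &= N^{2}(\mathrm{t}\wedge N^{-2}) + \log\frac{\mathrm{t}}{\mathrm{t}\wedge N^{-2}} \\
  &\leqslant 1 + 2\log N .
\end{align*}
% For any \gamma > 0, the elementary bound \log x \leqslant x - 1 applied to x = N^{\gamma} gives
\begin{align*}
\log N = \gamma^{-1}\log N^{\gamma} \leqslant \gamma^{-1} N^{\gamma},
\qquad\text{so}\qquad
1 + 2\log N \leqslant (1 + 2\gamma^{-1})\, N^{\gamma}.
\end{align*}
```

In particular, one admissible choice of the constant is $C_{\gamma }=1+2\gamma ^{-1}$ , which is finite for every fixed $\gamma>0$ but degenerates as $\gamma \to 0$ .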
8.0.2 Localization map
We eventually apply Lemma 8.2 with $\phi $ equal to the time average of a local functional of the particle system. Although the functional in the time average is local, its time average itself is, in principle, completely nonlocal, because even on mesoscopic timescales the values of $\eta $ -variables far away from the support of the integrated local functional may affect the $\eta $ -variables inside the support of the integrated local functional in finite time. This is just the fact that random walks can travel arbitrarily far in finite time. However, the probability of noninteracting random walks traveling much farther than their expected maximal displacement vanishes exponentially fast. We extend this to the $\eta $ -variables, which are random walks that interact via exclusion. Before we give the main estimate of this localization, we introduce convenient notation for the rest of this paper.
Definition 8.3. Provided any $\eta \in \Omega $ and any time $\mathrm {t}\geqslant 0$ and any length scale $\mathfrak {l}\in \mathbb Z_{\geqslant 0}$ , define a configuration $\mathrm {Loc}_{\mathrm {t},\mathfrak {l}}\eta \in \Omega $ by the following ‘trivial extension’ of the projection of $\eta $ onto , in which $\mathfrak {L}_{\mathrm {t},\mathfrak {l}}=N^{1+\gamma _{0}}\mathrm {t}^{1/2}+N^{3/2+\gamma _{0}}\mathrm {t}+N^{\gamma _{0}}\mathfrak {l}$ where $\gamma _{0}>0$ is taken as a fixed universal constant satisfying $\gamma _{0}\leqslant 999^{-999}\varepsilon _{\mathrm {ap}}\wedge 999^{-999}\varepsilon _{\mathrm {RN}}\wedge 999^{-999}\varepsilon _{1}\wedge 999^{-999}\varepsilon _{\mathrm {RN},1}$ :
Remark 8.4. We briefly explain $\mathbb {B}_{\mathrm {t},\mathfrak {l}}$ . Take a simple symmetric random walk of order $N^{2}$ speed plus a random order $N^{3/2}$ speed asymmetry. Suppose this random walk starts outside $\mathbb {B}_{\mathrm {t},\mathfrak {l}}$ and let it walk for time $\mathrm {t}$ . The probability that this random walk hits the set is bounded by the probability the maximal process/displacement is at least $N^{1+\gamma _{0}}\mathrm {t}^{1/2}+N^{3/2+\gamma _{0}}\mathrm {t}$ . Because of the extra $N^{\gamma _{0}}$ factor, this occurs with exponentially small probability courtesy of sub-Gaussian martingale inequalities applied to the simple symmetric random walk with large-deviations estimates for the Poisson number of asymmetric drift/jumps. Thus, by union bound, the probability that any of a polynomial number of such random walks hits is also exponentially small in N, as the subexponential bound beats the polynomial-in-N number of random walks asymptotically.
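The tail bound invoked in Remark 8.4 can be illustrated numerically in a simplified setting. The sketch below is a toy and not the walk of the remark: it assumes a discrete-time simple symmetric random walk with no asymmetric drift, computes the exact probability that its maximal displacement reaches a barrier, and compares it with the sub-Gaussian bound $2\exp (-m^{2}/(2n))$ that follows from Doob's maximal inequality applied to the exponential martingale. The function names, the number of steps, and the number of walks in the union bound are hypothetical choices made here for illustration.

```python
import math


def hit_probability(n_steps: int, barrier: int) -> float:
    """Exact P(max_{k <= n_steps} |S_k| >= barrier) for a discrete-time
    simple symmetric random walk S started at 0 (toy model, no drift)."""
    # alive[i] is the probability that the walk sits at site i - (barrier - 1)
    # and has not yet reached +barrier or -barrier.
    size = 2 * barrier - 1
    alive = [0.0] * size
    alive[barrier - 1] = 1.0
    for _ in range(n_steps):
        nxt = [0.0] * size
        for i, p in enumerate(alive):
            if p == 0.0:
                continue
            # Mass stepping outside the index range has hit the barrier
            # and is dropped, i.e. absorbed.
            if i > 0:
                nxt[i - 1] += 0.5 * p
            if i < size - 1:
                nxt[i + 1] += 0.5 * p
        alive = nxt
    return 1.0 - sum(alive)


def subgaussian_bound(n_steps: int, barrier: int) -> float:
    """Two-sided maximal bound 2 * exp(-barrier^2 / (2 * n_steps)), capped at 1."""
    return min(1.0, 2.0 * math.exp(-barrier * barrier / (2.0 * n_steps)))


# The typical displacement after n steps is of order sqrt(n) = 20; barriers at
# larger multiples of sqrt(n) play the role of the extra N^{gamma_0} factor.
N_STEPS = 400
NUM_WALKS = 10_000  # hypothetical polynomial number of walks for a union bound
for multiple in (1, 2, 3, 4, 5):
    m = multiple * 20
    exact = hit_probability(N_STEPS, m)
    bound = subgaussian_bound(N_STEPS, m)
    union = min(1.0, NUM_WALKS * bound)
    print(f"barrier={m:4d}  exact={exact:.3e}  bound={bound:.3e}  union={union:.3e}")
```

With a barrier of order $n^{1/2+\gamma }$ , the bound decays like $\exp (-n^{2\gamma }/2)$ , which eventually beats any polynomial number of walks; this mirrors the union bound described in the remark.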
Roughly speaking, the primary technical goal of our analysis is to reduce estimates for local functionals to the same estimates but after pretending the model is at an invariant measure. The philosophy of local equilibrium from [Reference Guo, Papnicolaou and Varadhan28], which we make precise in Lemma 8.8 and Lemma 8.9, will only succeed for local functionals of the particle system. Thus, we want to ignore $\eta $ -values outside the block $\mathbb {B}_{\mathrm {t},\mathfrak {l}}$ from Definition 8.3, while changing the space-time averages of the functionals whose analysis we want to reduce to local equilibrium only by an asymptotically negligible amount. This is the ultimate goal of Lemma 8.6 below, for example.
We proceed with additional notation for space-time averaging operators, which we will employ for local functionals.
Definition 8.5. Provided any timescale $\mathfrak {t}_{\mathrm {av}}\geqslant 0$ , any length scale $\mathfrak {l}_{\mathrm {av}}\in \mathbb Z_{\geqslant 0}$ , and any functional $\mathfrak {f}:\Omega \to \mathbb R$ , let us define the following where $\mathfrak {l}_{\mathfrak {f}}$ is the smallest nonnegative integer for which $\mathfrak {f}$ and the shifts $\tau _{\mathfrak {l}_{\mathfrak {f}}}\mathfrak {f}$ and $\tau _{-\mathfrak {l}_{\mathfrak {f}}}\mathfrak {f}$ have mutually disjoint supports:
We adopt the convention that $\mathfrak {I}^{\mathbf {T}}_{0}$ and $\mathfrak {I}^{\mathbf {X}}_{0}$ and $\mathfrak {I}^{\mathbf {X}}_{1}$ are identity maps (there is morally no difference between $\mathfrak {I}^{\mathbf {X}}_{1}$ and the identity map except for a harmless spatial shift). We will drop any identity maps from the notation.
Let $\mathscr {D}(\mathbb R_{\geqslant 0},\Omega )$ be the space of sample particle system paths, on which the system induces a path-space probability measure.
Lemma 8.6. Consider a functional $\mathfrak {f}:\Omega \to \mathbb R$ whose support is contained in the block with $\mathfrak {l}\in \mathbb Z_{\geqslant 0}$ . Provided any $\mathfrak {t}_{\mathrm {av}}\in [0,1]$ and $\mathfrak {l}_{\mathrm {av}}\in \mathbb Z_{\geqslant 0}$ , we have the following for any $\kappa>0$ , where we use notation defined after:
We first introduce the parameter $\mathfrak {l}_{\mathrm {tot}}=99\mathfrak {l}+99\mathfrak {l}\mathfrak {l}_{\mathrm {av}}$ . Let us also recall the parameter $\gamma _{0}>0$ from Definition 8.3 . Moreover, the expectation $\mathbf {E}^{\mathrm {dyn}}_{\cdot }$ denotes the expectation with respect to the path-space measure on $\mathscr {D}(\mathbb R_{\geqslant 0},\Omega )$ of the particle system with the initial configuration $\cdot \in \Omega $ . We take $\cdot =\eta $ and $\cdot =\mathrm {Loc}=\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}(\eta )$ in the previous estimate (8.8).
Proof. Note equation (8.8) compares expectations of the same space-time average of $\mathfrak {f}$ with respect to the same dynamics but with initial configurations that only disagree outside $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ ; that is, it compares two $\eta $ -processes whose fixed initial configurations differ only outside $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ (see Definition 8.3). Therefore, the LHS of equation (8.8) is bounded above by $\|\mathfrak {f}\|_{\omega ;\infty }$ times the probability that these two $\eta $ -processes, with initial configurations $\eta $ and $\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}(\eta )$ , see different $\eta $ -values in $\mathbb {B}_{\mathrm {tot}}$ at any time before or at $\mathfrak {t}_{\mathrm {av}}$ , under some coupling of the two processes. Indeed, the expectations on the LHS of equation (8.8) only differ on such an event, as the time average of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f})$ evaluated at $\mathfrak {t}_{\mathrm {av}}$ only depends on $\eta $ in $\mathbb {B}_{\mathrm {tot}}$ until $\mathfrak {t}_{\mathrm {av}}$ . Thus, it suffices to bound the path-space probability that the two processes disagree in , which contains the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f})$ , at any time before or at $\mathfrak {t}_{\mathrm {av}}$ by $\mathrm {O}_{\kappa ,\gamma _{0}}(N^{-\kappa })$ .
It is left to couple the two $\eta $ -processes with initial configurations $\eta $ and $\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}(\eta )$ . We will not use the basic coupling for exclusion processes; instead, we use a slight modification of it, which we explain shortly. We refer to the process with initial configuration $\eta $ as Species 1, and that with initial configuration $\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}(\eta )$ as Species 2.
• Define a discrepancy between Species 1 and Species 2 as a point x where $\eta _{x}=1$ in one species and $\eta _{x}=-1$ in the other.
• For any jump under a Poisson clock coming from the symmetric part of the generator, we realize such a jump from one point in $\mathbb {T}_{N}$ to another as swapping $\eta $ -values at those points; see the symmetric part $\mathsf {L}_{N,\mathrm {S}}$ in equation (1.3) of the generator $\mathsf {L}_{N}$ . We couple the symmetric parts of the dynamics in Species 1 and Species 2 by coupling these ‘spin-swap’ bond clocks; Species 1 and Species 2 always swap $\eta $ -variables together under symmetric clocks. This coupling can never create new discrepancies, only transport them. Also, individual discrepancies evolve as free and symmetric random walks; under this coupling of symmetric dynamics, a discrepancy moves as a simple symmetric random walk of speed $N^{2}/2$ that is never suppressed, not even by the exclusion condition in the particle system. This free and symmetric feature of the discrepancy walks would not be true if we instead employed the basic coupling for the symmetric dynamics, but it is directly verifiable for this bond coupling (according to symmetric bond clocks, a discrepancy always jumps without being blocked with equal speeds to the left and right, because Species 1 and 2 always swap $\eta $ -variables together along the activated bond, and this moves the discrepancy along said bond).
• To couple clocks of asymmetric dynamics, suppose there is a particle at x in Species 1. If x is vacant in Species 2, there is no coupling at x. If x is occupied in Species 2 and the speed of an asymmetric jump from x is equal among both species (in both directions), we couple the jumps (so particles jump together in the same direction); this is the basic coupling. If the asymmetry speeds of jumps from x are not equal among the two species in at least one direction, we do not couple the jumps and let them move independently. This difference in asymmetry speeds comes from the $\mathfrak {d}$ -asymmetry. Thus, it can only happen if $\mathfrak {d}_{x}$ takes different values between the species. As $\mathfrak {d}$ has support length at most $2\mathfrak {l}_{\mathfrak {d}}$ (see Assumption 1.3), a difference between $\mathfrak {d}_{x}$ -values in the two species can only happen in a $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ -length neighborhood of an already present discrepancy between the two species. The basic coupling between particles in the two species whose asymmetry speeds are equal cannot create discrepancies; it only introduces a speed $\mathrm {O}(N^{3/2})$ random drift/killing. The ‘noncoupling’ of the asymmetry jumps of nonequal speeds, however, can create up to two discrepancies in a single clock ring. Because these discrepancies can be created potentially anywhere in a length- $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ neighborhood of a discrepancy, this introduces a ‘branching’ mechanism with uniformly bounded number of offspring plus $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})\lesssim 1$ drift at speed $\mathrm {O}(N^{3/2})$ (actually, it is $\mathrm {O}(N)$ , but $\mathrm {O}(N^{3/2})$ is enough).
• To summarize, the dynamics of a discrepancy according to the previous bullet points is a branching symmetric simple random walk of speed $\mathrm {O}(N^{2})$ with a random uniformly bounded drift/killing of speed $\mathrm {O}(N^{3/2})$ . Thus, it is a (nontrivially correlated) collection of symmetric simple random walks of speed $\mathrm {O}(N^{2})$ with an additional random drift/killing with speed $\mathrm {O}(N^{3/2})$ . Moreover, because the total number of discrepancies/walks is bounded by the total number of initial discrepancies, which is at most $|\mathbb {T}_{N}|\lesssim N$ , plus the total number of ringings in the two species until time $1$ , which is Poisson of speed $\mathrm {O}(N^{10})$ , by standard tail estimates for the Poisson distribution, we have $\mathrm {O}(N^{100})$ -many discrepancies/walks outside of an event with exponentially low probability in N. (The number of discrepancies at any time is trivially at most $|\mathbb {T}_{N}|\lesssim N$ , but it is not necessarily true that the number of discrepancy walks we must consider is at most $|\mathbb {T}_{N}|$ , because a discrepancy can be killed to let another be born via branching, which implies the number of ‘family members’/discrepancies we must consider can be arbitrarily large.)
According to the previous bullet points, we are left to bound the probability that $\mathrm {O}(N^{100})$ -many discrepancy walks end up in the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f})$ before time $\mathfrak {t}_{\mathrm {av}}$ , where the law of these discrepancy walks is described in the second sentence in the final bullet point above. This means that one of these walks that starts outside $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ travels into $\mathbb {B}_{\mathrm {tot}}$ . Per Remark 8.4, this probability is exponentially small in $N^{\gamma _{0}}$ and thus at most $\mathrm {O}_{\kappa ,\gamma _{0}}(N^{-\kappa })$ , so we are done.
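The claim in the proof above that the bond coupling of the symmetric dynamics only transports discrepancies, and never creates or blocks them, can be checked on a toy example. The sketch below makes several simplifying assumptions that are not in the original text: it keeps only the symmetric part of the dynamics, replaces the Poisson bond clocks by a discrete sequence of uniformly chosen bonds on a small torus, and plants a single discrepancy by hand. All function and variable names are hypothetical.

```python
import random


def swap_bond(config: list, x: int) -> None:
    """Swap the eta-values across the bond {x, x+1} on the discrete torus."""
    y = (x + 1) % len(config)
    config[x], config[y] = config[y], config[x]


def discrepancies(eta1: list, eta2: list) -> set:
    """Sites where the two species carry different eta-values."""
    return {x for x, (a, b) in enumerate(zip(eta1, eta2)) if a != b}


def run_bond_coupling(eta1: list, eta2: list, n_rings: int, rng: random.Random) -> list:
    """Ring n_rings uniformly chosen bonds; at each ring BOTH species swap
    across the same bond. Returns the discrepancy set after every ring."""
    history = [discrepancies(eta1, eta2)]
    for _ in range(n_rings):
        x = rng.randrange(len(eta1))
        swap_bond(eta1, x)
        swap_bond(eta2, x)
        history.append(discrepancies(eta1, eta2))
    return history


rng = random.Random(0)
TORUS_SIZE = 40
species1 = [rng.choice((-1, 1)) for _ in range(TORUS_SIZE)]
species2 = list(species1)
species2[10] = -species2[10]  # plant a single discrepancy at site 10

history = run_bond_coupling(species1, species2, n_rings=2000, rng=rng)
path = [next(iter(d)) for d in history]  # position of the lone discrepancy
print("discrepancy counts seen:", {len(d) for d in history})
print("first positions:", path[:15])
```

Because both species apply the same site permutation at every ring, the discrepancy set is permuted along with them: the lone discrepancy jumps exactly when one of its two adjacent bonds rings, regardless of the occupation of the neighboring sites. The asymmetric part of the dynamics, which is responsible for branching and drift, is deliberately left out of this toy.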
8.0.3 Local equilibrium
In the current section, we take advantage of the estimate in Lemma 8.6 on the localization map therein. The first step that we will take is the following expectation estimate of the $\mathbf {I}_{1}$ -term from Lemma 8.2, in which we will take $\phi $ to be the space-time average from Definition 8.5 for a generic choice of functional $\mathfrak {f}$ . First, we introduce useful notation.
Definition 8.7. Consider any initial probability measure $\mu _{0,N}$ on $\Omega $ . Provided any $\mathrm {t}\geqslant 0$ , we define $\mu _{\mathrm {t},N}$ to be the probability measure on $\Omega $ obtained upon evolving $\mu _{0,N}$ under the forward Kolmogorov equation associated to the interacting particle system for time $\mathrm {t}$ . Let us define $\mathfrak {P}_{\mathrm {t}}$ to be the Radon–Nikodym derivative of $\mu _{\mathrm {t},N}$ with respect to the grand-canonical measure $\mu _{0}$ . We also define $\bar {\mathfrak {P}}_{1}$ as the average of $\mathfrak {P}_{\mathrm {t}}$ over space-time shifts, for which we define the action $\tau _{y}\mathfrak {P}_{\mathrm {t}}(\eta )=\mathfrak {P}_{\mathrm {t}}(\tau _{y}\eta )$ for any $y\in \mathbb {T}_{N}\kern-1.5pt$ :
In the construction above, we can certainly replace the action of $\tau _{-y}$ by $\tau _{y}$ without changing $\bar {\mathfrak {P}}_{1}$ . In general, we can replace $\tau _{-y}$ with any bijection on $\mathbb {T}_{N}$ evaluated at $y\in \mathbb {T}_{N}$ ; this follows immediately by changing variables in the summation.
Lemma 8.8. Consider any $0\leqslant \mathfrak {t}_{\mathrm {av}}\leqslant 1$ and any $\mathfrak {l}_{\mathrm {av}}\in \mathbb Z_{\geqslant 0}$ . Consider any functional $\mathfrak {f}:\Omega \to \mathbb R$ whose support is contained in the block . We again define $\mathfrak {l}_{\mathrm {tot}}=99\mathfrak {l}+99\mathfrak {l}\mathfrak {l}_{\mathrm {av}}$ as in Lemma 8.6 . For any $\kappa>0$ , we have the following in which we recall the $\mathbf {E}^{\mathrm {dyn}}$ expectations and $\mathrm {Loc}=\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}(\eta )$ in Lemma 8.6 , and $\bar {\mathfrak {P}}_{1}$ in Definition 8.7 :
Proof. Let us start by computing the expectation on the far LHS of equation (8.10). Because $\mathbf {I}_{1}$ is a deterministic and linear operator, we can move the expectation past the $\mathbf {I}_{1}$ operator; observe that what the expectation now hits is a functional of the path-space $\mathscr {D}(\mathbb R_{\geqslant 0},\Omega )$ , namely of the $\eta $ -process, starting at time S until time $S+\mathfrak {t}_{\mathrm {av}}$ . This is the same as sampling the time-S configuration and using it as the time-zero/initial configuration for the process after ‘resetting’ time S to be time 0. Therefore, we rewrite the expectation of this space-time average as the path-space expectation with a fixed initial configuration, over which we then take expectation with respect to the law of the particle system at time $S\geqslant 0$ . Precisely, we deduce the following with explanation given after; we note the following explanation additionally requires only recentering $\mathfrak {f}_{S,y}$ and spatially shifting $\eta _{S}$ accordingly:
To establish equation (8.11), we rewrite the expectation of the path-space functional $\mathfrak {I}^{\mathbf {T}}\mathfrak {I}^{\mathbf {X}}(\mathfrak {f}_{S,y})$ as an expectation with respect to the path-space measure after time $S\geqslant 0$ , with the initial configuration then averaged with respect to the law of the particle system at time $S\geqslant 0$ . We emphasize that, a priori, the inner $\mathbf {E}^{\mathrm {dyn}}$ expectation has initial configuration $\eta _{S}$ instead of $\tau _{y}\eta _{S}$ , and $\mathfrak {f}_{0,0}$ on the RHS is $\mathfrak {f}_{0,y}$ ; although it is now evaluated at time $0$ and initial configuration $\eta _{S}$ due to our time-S shift, it is still centered at $y\in \mathbb {T}_{N}$ and not at $0\in \mathbb {T}_{N}\kern-1.5pt$ . However, the path-space expectation $\mathbf {E}^{\mathrm {dyn}}$ is invariant under any spatial shift, because the particle system dynamic law is invariant under spatial shifts, so we may shift the initial configuration via $\tau _{y}$ and study instead the space-time average of $\mathfrak {f}_{0,0}$ rather than $\mathfrak {f}_{0,y}$ . We now implement the averaging procedure from the one-block step of [Reference Guo, Papnicolaou and Varadhan28]. This starts by observing that the inner $\mathbf {E}^{\mathrm {dyn}}$ is a function of only $\tau _{y}\eta _{S}$ , and the function itself at which we evaluate $\tau _{y}\eta _{S}$ is a dynamic path-space expectation, which is itself independent of $y\in \mathbb {T}_{N}$ and $S\geqslant 0$ . Now, rewrite the RHS of equation (8.11) as follows by noting the expectation of $\tau _{y}\eta _{S}$ is that of $\tau _{y}\eta $ times the Radon–Nikodym derivative $\mathfrak {P}_{S}$ for the law of the particle system at time S with respect to the grand-canonical product measure $\mu _{0}$ , where $\eta $ is distributed according to said grand-canonical measure:
For the RHS of equation (8.12), inside the outermost expectation we change variables $\eta \mapsto \tau _{-y}\eta $ , and thus $\tau _{y}\eta \mapsto \eta $ , per point $y\in \mathbb {T}_{N}\kern-1.5pt$ . The grand-canonical ensemble is invariant under these spatial shifts. This places the $\tau _{y}$ operator on the Radon–Nikodym derivative $\mathfrak {P}_{S}$ and leaves the resulting $\mathbf {E}^{\mathrm {dyn}}$ independent of the integration space-time variables in $\mathbf {I}_{1}$ . With the Fubini theorem, this gives
Note $\bar {\mathfrak {P}}_{1}=\mathbf {I}_{1}(\tau _{-y}\mathfrak {P}_{S})$ ; see Definition 8.7. Combining previous identities (8.11), (8.12) and (8.14) with this observation gives:
We are left with replacing the $\eta $ -variable in $\mathbf {E}^{\mathrm {dyn}}$ from the far RHS of equation (8.15) with the localization map $\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ in Definition 8.3. For this we employ Lemma 8.6, which provides the additional $N^{-\kappa }\|\mathfrak {f}\|_{\omega ;\infty }$ term in equation (8.10).
We will now take advantage of Lemma 8.8 by essentially removing the $\bar {\mathfrak {P}}_{1}$ density from the RHS of equation (8.10), upon collecting additional error terms. The mechanism for this replacement is the relative entropy inequality, the log Sobolev inequality of [Reference Yau51], and an entropy production estimate, all of which are standard and whose uses will be specified and explained below. We state the following estimate in a general framework because Propositions 4.6 and 4.7 require different modifications to the RHS of equation (8.10) before applying Lemma 8.9 below.
Lemma 8.9. Take any uniformly bounded functional $\mathfrak {h}:\Omega \to \mathbb R$ whose support is contained in a subset denoted by $\mathbb {B}$ . Provided any $\kappa \geqslant 0$ satisfying $\kappa \lesssim 1+\|\mathfrak {h}\|_{\omega ;\infty }^{-1}\lesssim \|\mathfrak {h}\|_{\omega ;\infty }^{-1}$ , we have the following (recall the canonical measures from Definition 4.4 ):
Proof. First, observe we may replace $\bar {\mathfrak {P}}_{1}$ on the LHS of equation (8.16) with its projection/conditional expectation on $\mathbb {B}$ , as the functional $\mathfrak {h}$ depends only on $\eta $ -variables in $\mathbb {B}$ . We will let $\Pi _{\mathbb {B}}\bar {\mathfrak {P}}_{1}$ denote this projection. Moreover, we may condition on the $\eta $ -density on $\mathbb {B}$ . If $\mathfrak {p}_{\sigma }$ is the probability of the support of $\mu _{\sigma ,\mathbb {B}}^{\mathrm {can}}$ under the $\Pi _{\mathbb {B}}\bar {\mathfrak {P}}_{1}$ measure, we get the following where $\Sigma _{\sigma }\subseteq \Omega _{\mathbb {B}}$ is the support of $\mu _{\sigma ,\mathbb {B}}^{\mathrm {can}}$ and in which the sum over all $\sigma \in \mathbb R$ on the RHS of equation (8.17) below is finite because only finitely many hyperplanes $\Sigma _{\sigma }\subseteq \Omega _{\mathbb {B}}$ in the finite set $\Omega _{\mathbb {B}}$ are nonempty; note the sum over $\sigma \in \mathbb R$ of the disjoint hyperplanes $\{\mathbf {1}_{\Sigma _{\sigma }}\}_{\sigma \in \mathbb R}$ is equal to 1:
We forget any $\sigma \in \mathbb R$ for which $\mathfrak {p}_{\sigma }=0$ on the far RHS of equation (8.17), as these terms do not show up when we condition on all possible $\sigma $ -values. We now observe that the $\sigma $ -indexed expectation on the far RHS of equation (8.17) is expectation of $|\mathfrak {h}|$ times the Radon–Nikodym derivative of $\Pi _{\mathbb {B}}\bar {\mathfrak {P}}_{1}\mathbf {1}_{\Sigma _{\sigma }}\mathrm {d}\mu _{0}$ with respect to the canonical measure $\mathrm {d}\mu _{\sigma ,\mathbb {B}}^{\mathrm {can}}$ . So, we may use the relative entropy inequality, which may be found in Appendix 1.8 of [Reference Kipnis and Landim37], with a constant $\kappa>0$ , in which $\mathfrak {D}^{\sigma }_{\mathrm {KL}}(\cdot )$ denotes relative entropy with respect to $\mu _{\sigma ,\mathbb {B}}^{\mathrm {can}}$ onto $\mathbb {B}$ , which also may be found/defined in Appendix 1.8 of [Reference Kipnis and Landim37]; for the second term on the RHS of equation (8.18) below, we estimate a sum over $\sigma \in \mathbb R$ against probabilities $\mathfrak {p}_{\sigma }$ in terms of a supremum over $\sigma \in \mathbb R$ :
We now study the RHS of equation (8.18). Below, the first bullet point basically follows the standard probability calculations in the proof of Lemma 3.3 in [Reference Chang and Yau8], starting after (3.20) therein, and the usual one-block step in [Reference Guo, Papnicolaou and Varadhan28]. The second bullet point is calculus.
• We first analyze the first term on the RHS of equation (8.18). By the log Sobolev inequality with diffusive constant $\mathrm {O}(|\mathbb {B}|^{2})$ in Theorem A of [Reference Yau51], we bound the $\mathfrak {D}^{\sigma }_{\mathrm {KL}}$ term by $\mathrm {O}(|\mathbb {B}|^{2})$ times the Dirichlet form of $\mathfrak {p}_{\sigma }^{-1}\Pi _{\mathbb {B}}\bar {\mathfrak {P}}_{1}\mathbf {1}_{\Sigma _{\sigma }}$ . The resulting convex combination over $\sigma $ of these Dirichlet forms is, by standard probability, the Dirichlet form of $\Pi _{\mathbb {B}}\bar {\mathfrak {P}}_{1}$ with respect to the grand-canonical measure $\mu _{0}$ projected on $\mathbb {B}$ . By standard entropy production as in Lemma 4.1 in [Reference Dembo and Tsai19], without the need for boundary considerations, and Proposition 4.3 in [Reference Dembo and Tsai19], this is then controlled by $N^{-2}|\mathbb {B}|$ . This gives an upper bound for the first term on the RHS of equation (8.18) given by the first term on the RHS of the proposed estimate (8.16).
• Because $\kappa \lesssim \|\mathfrak {h}\|_{\omega ;\infty }^{-1}$ by assumption, the argument $\kappa |\mathfrak {h}|$ in the exponential in equation (8.18) is uniformly bounded. Since the exponential function is uniformly Lipschitz on uniformly bounded sets, for $\widetilde {\kappa }>0$ universal and independent of $\kappa $ ,
(8.19) $$ \begin{align} \log\mathbf{E}^{\mu_{\sigma,\mathbb{B}}^{\mathrm{can}}}\mathbf{Exp}\left(\kappa|\mathfrak{h}|\right) \ \leqslant \ \log\mathbf{E}^{\mu_{\sigma,\mathbb{B}}^{\mathrm{can}}}\left(\mathbf{Exp}(0) + \widetilde{\kappa}\kappa|\mathfrak{h}|\right) \ = \ \log\mathbf{E}^{\mu_{\sigma,\mathbb{B}}^{\mathrm{can}}}\left(1+\widetilde{\kappa}\kappa|\mathfrak{h}|\right) \ \leqslant \ \widetilde{\kappa}\kappa\mathbf{E}^{\mu_{\sigma,\mathbb{B}}^{\mathrm{can}}}|\mathfrak{h}|. \end{align} $$
Dividing by $\kappa $ estimates the second term on the RHS of equation (8.18) by the second term in the proposed estimate (8.16).
This completes the proof.
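The Lipschitz step in equation (8.19) can be made explicit. The following worked bound is only a sketch: it assumes $\kappa |\mathfrak {h}|\leqslant C$ for some constant $C>0$ (the symbol $C$ is introduced here for illustration and does not appear in the original argument), and it writes $\mathbf {E}$ for expectation with respect to $\mu _{\sigma ,\mathbb {B}}^{\mathrm {can}}$ . It uses convexity of the exponential on $[0,C]$ and the elementary inequality $\log (1+x)\leqslant x$ for $x\geqslant 0$ .

```latex
\begin{align*}
\mathbf{Exp}(x) \ &\leqslant \ 1+\frac{\mathbf{Exp}(C)-1}{C}\,x \qquad \text{for all } 0\leqslant x\leqslant C, \\
\log\mathbf{E}\,\mathbf{Exp}\left(\kappa|\mathfrak{h}|\right) \ &\leqslant \ \log\left(1+\widetilde{\kappa}\kappa\mathbf{E}|\mathfrak{h}|\right) \ \leqslant \ \widetilde{\kappa}\kappa\mathbf{E}|\mathfrak{h}|, \qquad \widetilde{\kappa} \ = \ \frac{\mathbf{Exp}(C)-1}{C}.
\end{align*}
```

In this form, $\widetilde {\kappa }$ depends only on the uniform bound $C$ for $\kappa |\mathfrak {h}|$ and not on $\kappa $ itself.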
8.0.4 Equilibrium estimates
We now record estimates on stationary particle systems that will be crucial for studying expectations of space-time averages, given our reduction to local equilibrium in Lemma 8.9. The first is a spatial average estimate, which exploits spatially fluctuating behavior of local functionals at the stationary measure. This will be used as a large-deviations-type estimate in future applications.
Lemma 8.10. Suppose $\{\mathfrak {f}_{\mathfrak {j}}\}_{\mathfrak {j}\geqslant 0}$ are uniformly bounded, and their respective supports are contained inside $\{\mathbb {B}_{\mathfrak {j}}\}_{\mathfrak {j}\geqslant 0}$ . Suppose $\{\mathbb {B}_{\mathfrak {j}}\}_{\mathfrak {j}\geqslant 0}$ are mutually disjoint and that $\mathfrak {f}_{\mathfrak {j}}$ vanishes in expectation with respect to any canonical measure on its support for every $\mathfrak {j}$ . We have the following for any $\gamma ,\kappa>0$ , where probability and expectation below are both with respect to any canonical measure on $\mathbb {B}_{1}\cup \ldots \cup \mathbb {B}_{\mathfrak {J}}$ , and $\mathcal {E}_{\mathfrak {J}}$ is the event where the average of $\mathfrak {f}_{1},\ldots ,\mathfrak {f}_{\mathfrak {J}}$ exceeds $N^{\gamma }|\mathfrak {J}|^{-1/2}\max _{\mathfrak {j}=1,\ldots ,\mathfrak {J}}\|\mathfrak {f}_{\mathfrak {j}}\|_{\omega ;\infty }$ in absolute value:
Proof. We note $\mathfrak {f}_{\mathfrak {j}}$ are conditionally mean zero. Indeed, their supports are mutually disjoint, and each is mean zero with respect to every canonical measure; for any canonical measure on a set in $\mathbb {T}_{N}$ , conditioning on one subset induces a convex combination of canonical measures on any other nonintersecting subset. Standard concentration inequalities, like the Azuma martingale inequality, therefore give that the average of $\mathfrak {f}_{1},\ldots ,\mathfrak {f}_{\mathfrak {J}}$ is sub-Gaussian with zero mean and variance of order $|\mathfrak {J}|^{-1}\max _{\mathfrak {j}=1,\ldots ,\mathfrak {J}}\|\mathfrak {f}_{\mathfrak {j}}\|_{\omega ;\infty }^{2}$ , from which the proposed estimate follows by treating this average of $\mathfrak {f}_{1},\ldots ,\mathfrak {f}_{\mathfrak {J}}$ functionals as if it were Gaussian with zero mean and variance $|\mathfrak {J}|^{-1}\max _{\mathfrak {j}=1,\ldots ,\mathfrak {J}}\|\mathfrak {f}_{\mathfrak {j}}\|_{\omega ;\infty }^{2}$ and applying standard Gaussian moment generating function control. This yields an estimate for $\mathbf {P}(\mathcal {E}_{\mathfrak {J}})$ that is exponentially small in $N^{\gamma }$ and thus $\mathrm {O}_{\gamma ,\kappa }(N^{-\kappa })$ for any $\kappa>0$ .
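For concreteness, the concentration step can be quantified as follows. This is a sketch assuming the Azuma–Hoeffding inequality for martingale differences bounded by $\mathfrak {M}=\max _{\mathfrak {j}=1,\ldots ,\mathfrak {J}}\|\mathfrak {f}_{\mathfrak {j}}\|_{\omega ;\infty }$ ; the symbols $\mathfrak {M}$ , $\mathfrak {S}_{\mathfrak {J}}$ and $\lambda $ are introduced here for illustration only, and the constants are not claimed to match those of the estimate in Lemma 8.10.

```latex
\begin{align*}
\mathfrak{S}_{\mathfrak{J}} \ &= \ \sum_{\mathfrak{j}=1}^{\mathfrak{J}}\mathfrak{f}_{\mathfrak{j}}, \qquad \mathbf{P}\left(|\mathfrak{S}_{\mathfrak{J}}|\geqslant\lambda\right) \ \leqslant \ 2\mathbf{Exp}\left(-\frac{\lambda^{2}}{2\mathfrak{J}\mathfrak{M}^{2}}\right) \quad \text{for any } \lambda\geqslant0, \\
\lambda \ &= \ N^{\gamma}\mathfrak{J}^{1/2}\mathfrak{M} \quad \Longrightarrow \quad \mathbf{P}(\mathcal{E}_{\mathfrak{J}}) \ \leqslant \ 2\mathbf{Exp}\left(-\tfrac{1}{2}N^{2\gamma}\right) \ \lesssim_{\gamma,\kappa} \ N^{-\kappa}.
\end{align*}
```

The choice of $\lambda $ corresponds to the average $\mathfrak {J}^{-1}\mathfrak {S}_{\mathfrak {J}}$ exceeding $N^{\gamma }\mathfrak {J}^{-1/2}\mathfrak {M}$ in absolute value, which is the event $\mathcal {E}_{\mathfrak {J}}$ .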
We proceed with equilibrium estimates for space-time averages instead of just spatial averages. The primary advantage of this is the ability to exploit the ‘more ergodic’ time averaging of statistics of the particle system; recall the time scaling is $N^{2}$ whereas the spatial scaling is N. However, the following estimates only hold in a second moment at best, a priori, and are thus quite far from the large deviations scale of Lemma 8.10; see Proposition 7 and Corollary 1 in [Reference Goncalves and Jara24] for more details.
Lemma 8.11. Suppose that $\mathfrak {f}$ is a uniformly bounded functional, and its support is contained in $\mathbb {B}\subseteq \mathbb {T}_{N}$ . We additionally assume that the expectation of $\mathfrak {f}$ with respect to any canonical measure on $\mathbb {B}$ is equal to zero. Provided any timescale $\mathfrak {t}_{\mathrm {av}}\geqslant 0$ and any length scale $\mathfrak {l}_{\mathrm {av}}\in \mathbb Z_{\geqslant 0}$ and any $\kappa>0$ , we have the following estimate that we clarify/explain afterwards and for which we recall the notation of Definition 8.3, Definition 8.5, and Lemma 8.6:
We have used the abbreviation $\mathrm {Loc}=\mathrm {Loc}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}\eta $ , where $\mathfrak {l}_{\mathrm {tot}}=99|\mathbb {B}|+99|\mathbb {B}|\mathfrak {l}_{\mathrm {av}}$ is much larger than the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ . Observe $\mathrm {Loc}$ is only a function of $\eta $ -variables on the neighborhood $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ ; therefore, so is the inner expectation. The outer expectation on the LHS of equation (8.21) is the expectation over these $\eta $ -variables in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ , sampled from canonical ensemble on $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ of $\eta $ -density equal to $\sigma $ . In particular, inside the supremum on the LHS of equation (8.21) is the expectation of the square of the space-time average of $\mathfrak {f}_{0,0}$ , where the initial configuration for the space-time average/particle system has $\eta $ -variables in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ sampled via the canonical ensemble of parameter $\sigma $ on $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ and has $\eta $ -variables outside $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ deterministically equal to $1$ .
Proof. Suppose that, instead of the $\Omega $ -valued process/particle system considered in this paper, the particle system in question in Lemma 8.11 is actually valued in $\Omega _{\mathbb {B}_{2}}$ with $\mathbb {B}_{2}=\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ . In particular, suppose the particle system/particle random walks are $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ -periodic, in which case the particle/ $\eta $ configuration (on $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ ) in equation (8.21) is distributed according to the canonical measure on $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ . Observe this $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ -periodic system has canonical measures as invariant measures; this holds for the same reason that the $\mathbb {T}_{N}$ -periodic system has canonical measures on $\mathbb {T}_{N}$ as invariant measures. Therefore, the Kipnis–Varadhan inequality in Appendix 1.6 of [Reference Kipnis and Landim37] implies that, uniformly in $\sigma $ , the double expectation on the LHS of equation (8.21) is bounded above by $\mathrm {O}(\mathfrak {t}_{\mathrm {av}}^{-1})$ times a squared Sobolev norm of the spatial average $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ . From Proposition 6 in [Reference Goncalves and Jara24], said squared Sobolev norm of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ is $\mathrm {O}(N^{-2}\mathfrak {l}_{\mathrm {av}}^{-1}\|\mathfrak {f}\|_{\omega ;\infty }^{2}|\mathbb {B}|^{2})$ , where $|\mathbb {B}|$ is the support length of $\mathfrak {f}$ .
Thus, we have established the proposed estimate (8.21) if we can replace the $\Omega $ -valued/‘original’ particle system with the $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ -periodic system, thereby forgetting $\eta $ outside $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ .
We now make the aforementioned replacement and estimate the resulting error, which will provide the $N^{-\kappa }$ -term on the RHS of equation (8.21). We will use a coupling argument similar to the proof of Lemma 8.6. In what follows, we refer to the $\Omega $ -valued/‘original’ particle system as Species 1, and we refer to the $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ -periodic system appearing below as Species 2.
• As in the proof of Lemma 8.6, the symmetric dynamic in Species 1 may be thought of as attaching Poisson clocks to bonds in $\mathbb {T}_{N}$ connecting nearest neighbors, where the ringing of the Poisson clock associated to a given bond corresponds to swapping $\eta $ -variables at the points attached to that bond. For Species 2, let us also construct the symmetric dynamic as attaching Poisson clocks to bonds in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ that connect points that are distance 1 apart with respect to the geodesic/torus distance on $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ ; this includes the maximum and minimum of $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ , for example. For those bonds that appear in both Species 1 and Species 2, we will use the same bond clocks, so that shared/common bonds always swap $\eta $ -spins together. For bonds which are shared between Species 1 and Species 2, we use the modified basic coupling for the respective asymmetric dynamics from the proof of Lemma 8.6 (to account for the $\mathfrak {d}$ -asymmetry). All other bonds are then chosen arbitrarily/independently.
• Observe that the error in the LHS of equation (8.21) after replacing the $\Omega $ -valued/‘original’ system with the $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ -periodic system is $\mathrm {O}(\|\mathfrak {f}\|_{\omega ;\infty }^{2})\lesssim 1$ times the probability Species 1 and Species 2, under the coupling in the previous bullet point, have discrepancy inside the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ , similar to the proof of Lemma 8.6. Below, we identify a discrepancy in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ with its entire ancestry, similar to the final bullet point in the proof of Lemma 8.6 when we considered a branching random walk as a collection of correlated random walks. In particular, even if the discrepancy was born from a branching, we identify it as a random walk that followed its ancestors until said branching, after which it becomes its own branching random walk.
• Suppose that we observe a discrepancy in the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ , and therefore in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ . First, this discrepancy must have been born at a point where the clocks are not all coupled between Species 1 and Species 2 (like in the proof of Lemma 8.6, coupled clocks cannot create discrepancies). By construction, such points are initially within $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of the boundary of $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ . Second, this discrepancy must have propagated into the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ by length-1 jumps from $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of said boundary. Third, while said discrepancy in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ travels to the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ , when it gets $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ away from the boundary of $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ , it then travels according to the branching random walk that we described in the last bullet point in the proof of Lemma 8.6 because the different boundary conditions in the two species become irrelevant when we are in $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ and beyond $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of its boundary.
Therefore, we see said branching random walk travel at least the distance from within $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of the boundary of $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ to the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ , if we see a discrepancy in the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ at all. (It may be the case that one of these discrepancy random walks returns to within $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of the boundary of $\mathbb {B}_{\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {tot}}}$ , where it does not travel like the aforementioned branching random walk, but in this case, as it travels into the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ we just wait for it to get beyond $\mathrm {O}(\mathfrak {l}_{\mathfrak {d}})$ of said boundary again.) Thus, the probability that we see any discrepancy in the support of $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\mathrm {av}}}(\mathfrak {f}_{0,0})$ is controlled by random walk probabilities and a large deviations bound for the number of discrepancy walks as in the last bullet point in the proof of Lemma 8.6.
This completes the proof.
8.0.5 Spatial replacement
We introduce a set of replacement estimates that allow us to introduce space-time averaging for a functional that is multiplied by $\mathbf {Y}^{N}$ and the heat kernel while estimating the error in doing so.
Definition 8.12. Consider any functional $\mathfrak {f}:\Omega \to \mathbb R$ and any pair of length scales $\mathfrak {l},\mathfrak {l}'\in \mathbb Z_{\geqslant 0}$ . Define a transfer-of-length-scale operator $\mathfrak {D}^{\mathbf {X}}_{\mathfrak {l},\mathfrak {l}'}(\mathfrak {f})=\mathfrak {I}_{\mathfrak {l}}^{\mathbf {X}}(\mathfrak {f})-\mathfrak {I}_{\mathfrak {l}'}^{\mathbf {X}}(\mathfrak {f})$ , where the $\mathfrak {I}^{\mathbf {X}}$ operator, with identity time average $\mathfrak {I}^{\mathbf {T}}$ operator, is from Definition 8.5.
Lemma 8.13. Consider any $\mathfrak {f}:\Omega \to \mathbb R$ whose support has length at most $\mathfrak {l}_{\mathfrak {f}}$ along with any length scale $\mathfrak {l}$ satisfying $|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}\leqslant \mathfrak {l}_{N}$ with $\mathfrak {l}_{N}$ from Definition 3.1. For any $\mathrm {t}\geqslant 0$ , we have the following in which we let $\bar {\mathfrak {l}}=|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}$ in the statement and proof of this result:
Remark 8.14. The assumption $|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}\leqslant \mathfrak {l}_{N}$ will be important because we need spatial regularity of $\mathbf {Y}^{N}$ on length scale $|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}$ , and we only guarantee this if $|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}\leqslant \mathfrak {l}_{N}$ by the constructions in Definition 3.1 and Definition 3.5. We will actually relax the assumption $|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}\leqslant \mathfrak {l}_{N}$ moderately in a later ‘adapted’ application of Lemma 8.13, namely in the proof of Lemma 11.1, with explanation. The first term on the RHS of equation (8.22) would not change if $\bar {\mathfrak {l}}=|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}\approx \mathfrak {l}_{N}N^{\gamma }$ for $\gamma>0$ small; only the second term would change slightly.
Proof. The $\mathfrak {D}^{\mathbf {X}}$ -term on the LHS of equation (8.22) may be realized as an average of spatial gradients of $\mathfrak {f}$ on length scales that are at most $|\bar {\mathfrak {l}}|$ . Indeed, the $\mathfrak {I}^{\mathbf {X}}(\mathfrak {f})$ -term defining the $\mathfrak {D}^{\mathbf {X}}$ -term on the LHS of equation (8.22) is an average of spatial translations of $\mathfrak {f}$ with length scale at most $|\bar {\mathfrak {l}}|$ , and the difference of each spatial translation with $\mathfrak {f}$ is a spatial gradient of $\mathfrak {f}$ of the same length scale. Thus, it suffices to prove equation (8.22) but replacing $\mathfrak {D}^{\mathbf {X}}_{0,\mathfrak {l}}$ on the LHS of equation (8.22) by $\nabla ^{\mathbf {X}}_{\mathfrak {l}'}$ for any $|\mathfrak {l}'|\leqslant \bar {\mathfrak {l}}=|\mathfrak {l}|\mathfrak {l}_{\mathfrak {f}}$ . Letting $\mathfrak {l}'$ be such a length scale, we start with the following discrete-type Leibniz rule; it may be checked directly:
The second line follows by the triangle inequality for $\|\|_{1;\mathbb {T}_{N}}$ and linearity of expectation. Note the additional spatial shift in $\mathbf {Y}^{N}$ follows from the discrete nature of the spatial gradients; if we considered instead an ‘infinitesimal’ length scale, this shift would disappear as $\mathfrak {l}'\to 0$ and we would recover the usual Leibniz rule. We will now estimate each of the terms in equation (8.24). For the first term, we may employ the heat operator gradient estimate in Proposition A.3 along with the estimate $|\mathbf {Y}^{N}|\lesssim N^{\varepsilon _{\mathrm {ap}}}$ that follows via Definitions 3.1 and 3.5; this estimates the first term in (8.24) by moving $\nabla ^{\mathbf {X}}$ onto the macroscopically smooth $\mathbf {H}^{N}$ :
since any time average is uniformly bounded by its input $|\mathfrak {I}^{\mathbf {T}}(\mathfrak {f})|\leqslant \|\mathfrak {f}\|_{\omega ;\infty }$ , since $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ and since $|\mathfrak {l}'|\leqslant |\bar {\mathfrak {l}}|\leqslant |\mathfrak {l}_{N}|=N^{1/2+\varepsilon _{\mathrm {RN}}}$ ; see Definitions 3.1 and 3.5. Thus, we are left to estimate the second term in equation (8.24). Observe $\mathbf {Y}^{N}=0$ or $\mathbf {Y}^{N}=\mathbf {Z}^{N}$ . The former case is trivial, and the latter case implies $\mathbf {Z}^{N}$ has a priori spatial regularity on length scale $\mathfrak {l}'$ since $|\mathfrak {l}'|\leqslant |\bar {\mathfrak {l}}|\leqslant \mathfrak {l}_{N}$ ; see Definitions 3.1 and 3.5. This spatial regularity controls the gradient in the second term in equation (8.24) uniformly in space-time, so
As $|\mathfrak {l}'|\leqslant \bar {\mathfrak {l}}$ , we ultimately deduce that the second term in equation (8.24) is bounded by the second term in equation (8.22), so we are done.
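The discrete Leibniz rule and the step of moving the gradient onto the heat kernel can be checked numerically. The sketch below is not the identity from the paper verbatim: it assumes the convention that the discrete gradient on length scale $\mathfrak {l}$ is $\varphi (x+\mathfrak {l})-\varphi (x)$ on a discrete torus, and the arrays `f`, `Y`, `H` are random stand-ins (hypothetical names) for the functional, for $\mathbf {Y}^{N}$ and for one spatial slice of the heat kernel.

```python
import numpy as np

def shift(phi, l):
    """Spatial shift on the discrete torus: (shift_l phi)(x) = phi(x + l)."""
    return np.roll(phi, -l)

def grad(phi, l):
    """Assumed gradient convention: (grad_l phi)(x) = phi(x + l) - phi(x)."""
    return shift(phi, l) - phi

rng = np.random.default_rng(0)
N, l = 64, 5                      # torus size and length scale (illustrative values)
f = rng.standard_normal(N)        # stand-in for the functional
Y = rng.standard_normal(N)        # stand-in for Y^N at a fixed time
H = rng.random(N)                 # stand-in for a heat kernel slice

# Discrete Leibniz rule: (grad_l f) * (shift_l Y) = grad_l(f Y) - f * grad_l Y.
# The shift on Y is the extra term that has no analogue in the continuum rule.
lhs = grad(f, l) * shift(Y, l)
rhs = grad(f * Y, l) - f * grad(Y, l)
leibniz_error = np.max(np.abs(lhs - rhs))

# Summation by parts on the torus: sum_x H grad_l(u) = sum_x grad_{-l}(H) u.
u = f * Y
sbp_error = abs(np.sum(H * grad(u, l)) - np.sum(grad(H, -l) * u))
```

Both `leibniz_error` and `sbp_error` are at the level of floating-point rounding, which reflects that the two identities are exact algebraic statements rather than approximations.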
8.0.6 Multiscale time replacement
The last preliminary estimates we introduce will be important for replacing functionals and their spatial averages with their respective time averages. We emphasize that such replacement by mesoscopic time average is difficult because of the poor time regularity of the $\mathbf {Y}^{N}$ process against which we multiply the functionals/spatial averages that we want to replace with their respective time averages. This ultimately leads us to a multiscale replacement, which, per standard multiscale analysis, forces us to simultaneously take advantage of progressively improving estimates for space-time averages on progressively larger timescales; see equation (8.21) and its dependence on the timescale $\mathfrak {t}_{\mathrm {av}}$ therein. First, some convenient notation.
Definition 8.15. Consider any $\mathfrak {f}:\Omega \to \mathbb R$ and any pair of timescales $\mathfrak {t},\mathfrak {t}'\geqslant 0$ . We define the transfer-of-timescale operator $\mathfrak {D}_{\mathfrak {t},\mathfrak {t}'}^{\mathbf {T}}(\mathfrak {f})=\mathfrak {I}_{\mathfrak {t}}^{\mathbf {T}}(\mathfrak {f})-\mathfrak {I}_{\mathfrak {t}'}^{\mathbf {T}}(\mathfrak {f})$ , where $\mathfrak {I}^{\mathbf {T}}$ is defined in Definition 8.5 by taking the identity spatial average/ $\mathfrak {l}_{\mathrm {av}}=0$ therein.
The first step we take is the introduction of a time average with respect to some timescale, which we take as the microscopic timescale $N^{-2}$ in the following preliminary estimate. We emphasize that the following estimate is established by an integration-by-parts-type calculation; in order to estimate the integrated time gradient of a functional, we will move such time gradient onto the other factors/integrands. We then estimate these time gradients along with another pair of ultimately negligible ‘short-time’ boundary terms/integrals. We note that the proof of the following is a time-version of Lemma 8.13, though it is somewhat more involved because time gradients of $\mathbf {Y}^{N}$ may cross the time at which $\mathbf {Y}^{N}$ goes from being equal to $\mathbf {Z}^{N}$ to when it is zero.
Lemma 8.16. Consider any functional $\mathfrak {f}:\Omega \to \mathbb R$ and the timescales $\mathfrak {t}_{-\infty }=0$ and $\mathfrak {t}_{0}=N^{-2}$ . Provided any $\gamma>0$ , we have
Proof. Observe the $\mathfrak {D}$ -operator on the LHS of equation (8.26) is a difference between a scale $N^{-2}$ time average and $\mathfrak {f}$ itself, then multiplied by $\mathbf {Y}^{N}$ and integrated against the heat kernel defining $\mathbf {H}^{N}$ . Thus, it is an average of time gradients of $\mathfrak {f}$ with respect to timescales that are between $\mathfrak {t}_{-\infty }=0$ and $\mathfrak {t}_{0}=N^{-2}$ . We then estimate time-gradients with respect to these timescales uniformly over said timescales. In particular, it suffices to get, for any $\gamma>0$ :
We now write the following Leibniz-rule-type identity, which is a time version of equation (8.24); because we are taking time-gradients with respect to positive timescales, instead of differentiating on infinitesimal timescales, the following identity includes additional time shifts that would disappear if we took the timescale $\mathrm {s}$ to zero. We emphasize, however, that the following identity may be easily checked just by expanding time-gradients on the RHS of the first line below and cancelling terms:
The second line (8.29) follows by the triangle inequality along with tautologically rewriting the $\mathbf {Y}^{N}$ gradient. Let us now estimate each term in equation (8.29) uniformly in the allowed timescales $\mathrm {s}$ . For the first term in equation (8.29), we use Proposition A.3 and a priori upper bounds for $\mathbf {Y}^{N}$ from Definitions 3.1 and 3.5 to get the following deterministic estimate for any $\gamma>0$ , which moves $\nabla ^{\mathbf {T}}$ onto the macroscopically smooth (in time) heat operator $\mathbf {H}^{N}$ :
Observe that the far RHS of equation (8.30) is the first term on the RHS of equation (8.26), so we are left to show the second term in equation (8.29) is controlled by the RHS of equation (8.26). To this end, by construction in Definitions 3.1 and 3.5, we have, for $\mathbb {I}=\mathfrak {t}_{\mathrm {st}}+[0,\mathrm {s}]$ , that
where the last inequality follows by the observation $\mathrm {s}\geqslant 0$ implies $\nabla ^{\mathbf {T}}_{-\mathrm {s}}\mathbf {Y}^{N}=\nabla ^{\mathbf {T}}_{-\mathrm {s}}\mathbf {Z}^{N}$ before time $\mathfrak {t}_{\mathrm {st}}$ , and after time $\mathfrak {t}_{\mathrm {st}}+\mathrm {s}$ , we have $\nabla _{-\mathrm {s}}^{\mathbf {T}}\mathbf {Y}^{N}=0$ (see Definition 3.5). We now take expectations in (8.33). For the first term, note $\|\mathfrak {f}\|_{\omega ;\infty }$ -factor is constant. By Lemma A.6, with overwhelming probability, we have $\|\nabla _{-\mathrm {s}}^{\mathbf {T}}\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}}\lesssim _{\gamma } N^{-1/2+\gamma }\|\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}}$ with $\gamma>0$ arbitrary but fixed and for any $0\leqslant \mathrm {s}\leqslant N^{-2}$ . On the complement of this event, by construction of $\mathfrak {t}_{\mathrm {st}}$ in Definition 3.1, we know deterministically that $\|\nabla _{-\mathrm {s}}^{\mathbf {T}}\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}}\lesssim N^{\varepsilon _{\mathrm {ap}}}$ . Thus, again since $\|\mathbf {Z}^{N}\|_{\mathfrak {t}_{\mathrm {st}};\mathbb {T}_{N}}\lesssim N^{\varepsilon _{\mathrm {ap}}}$ by Definition 3.1, for the first term in equation (8.33),
so it remains to estimate the expectation of the second term in equation (8.33). For this, we recall that $|\mathbf {Y}^{N}|\lesssim N^{\varepsilon _{\mathrm {ap}}}$ by construction in Definitions 3.1 and 3.5. So by equation (A.5) for $\mathbb {I}=\mathfrak {t}_{\mathrm {st}}+[0,\mathrm {s}]$ , the second term in equation (8.33) is controlled by $N^{\varepsilon _{\mathrm {ap}}}\|\mathfrak {f}\|_{\omega ;\infty }$ times the length $|\mathbb {I}|=\mathrm {s}\leqslant N^{-2}$ since $\mathbf {H}^{N}$ integrates, in time over $\mathbb {I}$ , the spatial contractions $\mathbf {H}^{N,\mathbf {X}}$ :
Combining the previous two displays with equation (8.33) along with equations (8.29) and (8.30) gives equation (8.27), so we are done.
The second step we take is the following multiscale estimate. Its proof is basically that of Lemma 8.16.
Lemma 8.17. Consider the set $\mathbb {I}^{\mathbf {T},1}$ of timescales in Definition 3.1; set $\mathfrak {t}_{\mathfrak {j}}=N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}\in \mathbb {I}^{\mathbf {T},1}$ for indices $\mathfrak {j}\geqslant 0$ . Provided any pair of adjacent timescales $\mathfrak {t}_{\mathfrak {j}}$ and $\mathfrak {t}_{\mathfrak {j}+1}$ satisfying $\mathfrak {t}_{\mathfrak {j}},\mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1}$ , we have the following estimate provided any $\gamma>0$ :
Provided any $\mathfrak {J}\in \mathbb Z_{\geqslant 1}$ with $\mathfrak {t}_{\mathfrak {J}}\leqslant N^{-1}$ , we additionally have the following estimate, again for any $\gamma>0$ :
Proof. The second estimate (8.37) is an immediate consequence of equation (8.36) courtesy of the following observations.
• Observe $\mathfrak {D}_{\mathfrak {t}_{0},\mathfrak {t}_{\mathfrak {J}}}^{\mathbf {T}}$ is a telescoping sum of $|\mathfrak {J}|$ -many $\mathfrak {D}_{\mathfrak {t}_{\mathfrak {j}},\mathfrak {t}_{\mathfrak {j}+1}}^{\mathbf {T}}$ terms.
• Plugging the aforementioned telescoping sum into the LHS of equation (8.37), we apply the triangle inequality for $\|\|_{1;\mathbb {T}_{N}}$ and linearity of expectation, which implies the LHS of equation (8.37) is at most $|\mathfrak {J}|$ times the supremum of the RHS of equation (8.36) over $0\leqslant \mathfrak {j}<\mathfrak {J}$ .
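The telescoping sum in the first observation can be written out directly from Definition 8.15; the display below only unpacks that definition and introduces no new notation.

```latex
\begin{align*}
\mathfrak{D}^{\mathbf{T}}_{\mathfrak{t}_{0},\mathfrak{t}_{\mathfrak{J}}}(\mathfrak{f}) \ = \ \mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{0}}(\mathfrak{f})-\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{J}}}(\mathfrak{f}) \ = \ \sum_{\mathfrak{j}=0}^{\mathfrak{J}-1}\left(\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}}}(\mathfrak{f})-\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}+1}}(\mathfrak{f})\right) \ = \ \sum_{\mathfrak{j}=0}^{\mathfrak{J}-1}\mathfrak{D}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}},\mathfrak{t}_{\mathfrak{j}+1}}(\mathfrak{f}).
\end{align*}
```

The sum has exactly $\mathfrak {J}$ terms, which is the source of the factor $|\mathfrak {J}|$ in the second observation.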
We will now prove equation (8.36). This starts with the following computation of the $\mathfrak {D}^{\mathbf {T}}$ difference operator on the LHS of equation (8.36). The next identity follows from observing that because $\mathfrak {t}_{\mathfrak {j}+1}$ is a positive integer multiple of $\mathfrak {t}_{\mathfrak {j}}$ by assumption/choice of $\varepsilon _{\mathrm {ap}}$ in Definition 3.1, we may write the time average on the timescale $\mathfrak {t}_{\mathfrak {j}+1}$ as the average of $\mathfrak {t}_{\mathfrak {j}+1}\mathfrak {t}_{\mathfrak {j}}^{-1}$ many time averages on the timescale $\mathfrak {t}_{\mathfrak {j}}$ , each of these time averages carrying a time shift given by an integer multiple of $\mathfrak {t}_{\mathfrak {j}}$ . This resembles the fact that an average of 10 terms can be written as an average of 5 ‘other’ terms, where each of the 5 ‘other’ terms is an average of pairs of the 10 terms with neighboring indices, for example. Ultimately, we use this representation to rewrite $\mathfrak {D}^{\mathbf {T}}$ as an average of time-gradients of the time average on the smaller timescale $\mathfrak {t}_{\mathfrak {j}}$ ; the second step requires putting the first term in the middle of equation (8.38) into the average:
By the triangle inequality, to establish equation (8.36), it suffices to prove the estimate
We may certainly replace the averaged sum on the LHS of equation (8.39) with a supremum over the same index set and prove the resulting estimate. Consider any $\mathfrak {k}$ in the index set on the LHS of equation (8.39). Similar to equation (8.29), we write the time gradient on the LHS of equation (8.39) as $\nabla ^{\mathbf {T}}\mathfrak {I}^{\mathbf {T}}(\mathfrak {f})\mathbf {Y}^{N}=\nabla ^{\mathbf {T}}(\mathfrak {I}^{\mathbf {T}}(\mathfrak {f})\mathbf {Y}^{N})-\mathfrak {I}^{\mathbf {T}}(\mathfrak {f})\nabla ^{\mathbf {T}}\mathbf {Y}^{N}$ with an additional time shift in $\mathbf {Y}^{N}$ that ultimately turns into reversing the timescale for the time gradient of $\mathbf {Y}^{N}$ . Again, similar to equation (8.29), this and the triangle inequality give the estimate
We are now left with estimating each term on the RHS of equation (8.40). To this end, we will follow the calculation (8.30) from the proof of Lemma 8.16 and apply Proposition A.3 but with $\mathrm {s}$ in equation (8.30) replaced by $\mathfrak {k}\mathfrak {t}_{\mathfrak {j}}$ . Because $\mathfrak {k}\leqslant \mathfrak {t}_{\mathfrak {j}+1}\mathfrak {t}_{\mathfrak {j}}^{-1}-1$ , we may deduce the timescale inequality $\mathfrak {k}\mathfrak {t}_{\mathfrak {j}}\leqslant \mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1}$ . Ultimately, we establish the following deterministic estimate provided any $\gamma>0$ :
We emphasize the final inequality in equation (8.41), similar to equation (8.30), requires the a priori bound $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ that follows via definition of $\mathbf {Y}^{N}$ in Definition 3.5 and of $\mathfrak {t}_{\mathrm {st}}$ in Definition 3.1. We now move to the second term on the RHS of equation (8.40). For this, we follow the estimate for the second term in equation (8.29) given in the proof of equation (8.26). In particular, we start with the calculation giving equation (8.33) but with $\mathrm {s}$ therein and in the set $\mathbb {I}$ replaced by $\mathfrak {k}\mathfrak {t}_{\mathfrak {j}}$ and proceed verbatim. The only difference is that instead of using Lemma A.6 to estimate $\nabla ^{\mathbf {T}}\mathbf {Y}^{N}$ before time $\mathfrak {t}_{\mathrm {st}}$ , we instead use the a priori spatial regularity estimate defining $\mathfrak {t}_{\mathrm {st}}$ for $\mathbf {Y}^{N}$ ; see Definitions 3.1 and 3.5. Also, the final display in the proof of Lemma 8.16 should have its far RHS replaced with $N^{-1+\varepsilon _{\mathrm {ap}}}$ , because in this case $|\mathbb {I}|=\mathfrak {k}\mathfrak {t}_{\mathfrak {j}}\leqslant \mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1}$ (while before we used $|\mathbb {I}|\lesssim N^{-2}$ ), but this does not change the validity of the lemma.
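The block decomposition of time averages used at the start of the proof of equation (8.36) has a simple discrete analogue. The following is a minimal sketch under stated assumptions: time is discretized into integer steps, the time average is taken forward in time (the paper's convention may differ), and the names `time_average`, `coarse_from_fine` and `transfer_via_gradients` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def time_average(a, s, t):
    """Average of the series a over the t consecutive steps s, ..., s + t - 1."""
    return a[s:s + t].mean()

def coarse_from_fine(a, s, t, m):
    """Scale-(m*t) average written as the mean of m scale-t averages,
    the k-th one shifted in time by k*t."""
    return np.mean([time_average(a, s + k * t, t) for k in range(m)])

def transfer_via_gradients(a, s, t, m):
    """I_t - I_{m t} as minus the mean over k of the time gradients
    I_t(s + k*t) - I_t(s); the k = 0 gradient vanishes."""
    base = time_average(a, s, t)
    grads = [time_average(a, s + k * t, t) - base for k in range(m)]
    return -np.mean(grads)

rng = np.random.default_rng(1)
a = rng.standard_normal(1000)     # stand-in for the functional along a trajectory
s, t, m = 7, 10, 8                # start time, fine scale, ratio of the two scales

direct = time_average(a, s, t) - time_average(a, s, m * t)
decomposition_error = abs(coarse_from_fine(a, s, t, m) - time_average(a, s, m * t))
gradient_error = abs(transfer_via_gradients(a, s, t, m) - direct)
```

Both errors are at rounding level. The point of the representation is that each gradient involves a time shift of at most $(m-1)t$ , which is smaller than the coarse scale $mt$ , matching the constraint $\mathfrak {k}\mathfrak {t}_{\mathfrak {j}}\leqslant \mathfrak {t}_{\mathfrak {j}+1}$ in the proof.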
9 Boltzmann–Gibbs principle I – proof of proposition 4.6
This section consists of technical manipulations that apply the preliminary ingredients from Section 8. To clarify the discussion, we first present the main ingredients needed for the proof of Proposition 4.6, with explanations of their statements and proofs. We then combine these ingredients to deduce Proposition 4.6 and afterwards prove each ingredient.
9.0.1 Spatial average
The first step we take is replacement of the fluctuation $\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})$ inside the heat operator on the LHS of equation (4.7) by spatial average on length scale $N^{1/6}$ . The choice of this length scale is motivated as follows. We first want to choose the length scale long enough so that we may exploit cancellations in spatial averages; see Lemmas 8.10 and 8.11. However, the larger the length scale we pick, the larger the error in such a replacement. It turns out $N^{1/6}$ is an appropriate compromise.
In the following Lemma 9.1, the fraction $6/25$ matters only as an upper bound for the sum of the length scale exponent $1/6$ and the $\varepsilon _{1}$ exponent $1/14$ . The bound in Lemma 9.1 roughly follows by a summation-by-parts argument; it controls the difference between the $\mathsf {S}$ -term and its spatial average, after multiplying by $\mathbf {Y}^{N}$ and $\mathbf {H}^{N}$ and integrating in space-time, by regularity of $\mathbf {Y}^{N}$ and $\mathbf {H}^{N}$ . The only other input for Lemma 9.1 is an explicit formula for the spatial gradients of $\mathbf {Y}^{N}$ in terms of the particle system to estimate its spatial regularity explicitly; Lemma 8.13 is not enough. Recall the transfer-of-spatial scales in Definition 8.12.
Lemma 9.1. Define the length scale $\mathfrak {l}_{1}=N^{1/6}$ . We have the following estimate in which $\mathfrak {e}:\Omega \to \mathbb R$ is described afterwards:
-
• We have $\|\mathfrak {e}\|_{\omega ;\infty }\lesssim 1$ . The support $\mathbb {B}_{\mathfrak {e}}$ of $\mathfrak {e}$ is contained in the ball of radius $N^{6/25}$ centered at $0\in \mathbb {T}_{N}$ .
-
• For any parameter $\sigma \in \mathbb R$ , the functional $\mathfrak {e}$ vanishes in expectation with respect to the canonical measure $\mu _{\sigma ,\mathbb {B}_{\mathfrak {e}}}^{\mathrm {can}}$ on its support.
Remark 9.2. As we noted before, the exponent $6/25$ on the LHS of equation (9.1) comes as an upper bound for $1/6+1/14$ , which are the exponents in the length scale $\mathfrak {l}_{1}$ and the support length of $\mathsf {S}$ . In particular, the transfer-of-length-scales operator from the LHS of equation (9.1) is the difference of $\mathsf {S}$ and spatial translates with disjoint supports, and thus has support length $N^{1/6}N^{1/14}\leqslant N^{6/25}$ .
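The exponent bookkeeping in Remark 9.2 can be checked with exact rational arithmetic. The following is a minimal sketch; the variable names are ours and purely illustrative, and only the exponents $1/6$ , $1/14$ and $6/25$ are taken from the text.

```python
from fractions import Fraction

# Exponents taken from Lemma 9.1 and Remark 9.2 (all are powers of N).
length_scale_exp = Fraction(1, 6)   # l_1 = N^{1/6}
support_exp = Fraction(1, 14)       # support length of S is N^{eps_1}, eps_1 = 1/14
claimed_bound = Fraction(6, 25)     # exponent used for the support of e

# Support length of the transfer-of-length-scales term: N^{1/6} * N^{1/14}.
combined_exp = length_scale_exp + support_exp
slack = claimed_bound - combined_exp

print(combined_exp)  # 5/21
print(slack)         # 1/525
```

The slack is positive but small, so $6/25$ is only slightly larger than the exact exponent $1/6+1/14$ ; it also stays below $1/2$ , which is the constraint needed later for the absolute convergence of the series in equation (9.8).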
We conclude the first step by introducing a cutoff on the spatial average of $\mathsf {S}_{\varepsilon _{1}}$ on length scale $N^{1/6}$ . First, we will introduce notation for the cutoff of the spatial average that we motivate shortly and that will be used throughout the rest of this subsection. Its utility is providing a priori upper bounds that will be useful for applications of reduction to local equilibrium in Lemma 8.9. Indeed, observe that in Lemma 8.9, allowed choices for the constant $\kappa $ therein depend on a priori upper bounds on functionals that we want to use Lemma 8.9 for. With better a priori deterministic bounds, we can pick a larger/better value of $\kappa $ . For motivation, assuming $\mathfrak {f}$ vanishes in expectation with respect to any canonical ensemble on its support, averages of its spatial translates with disjoint support satisfy a central limit theorem (CLT) type estimate in Lemma 8.10. The cutoff defining $\bar {\mathfrak {I}}^{\mathbf {X}}$ below vanishes by default when $\mathfrak {I}^{\mathbf {X}}$ exceeds this CLT-type upper bound, which according to Lemma 8.10 occurs with exponentially small probability in N. Thus, we deduce that with respect to any canonical ensemble on the support of the average $\mathfrak {I}^{\mathbf {X}}$ , the cutoff $\bar {\mathfrak {I}}^{\mathbf {X}}$ does nothing outside of an event of exponentially small probability. For general measures, we reduce locally to canonical measures via Lemma 8.9.
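To illustrate why a CLT-type cutoff is rarely active, the sketch below evaluates the classical Hoeffding bound for an average of independent, mean-zero variables with values in $[-1,1]$ . This independent setting is an assumption made only for illustration; it is not the canonical-measure setting of Lemma 8.10, and the function names and the value `eps=0.05` are hypothetical.

```python
import math

def hoeffding_tail(num_terms: int, threshold: float) -> float:
    """Hoeffding bound on P(|average| >= threshold) for independent,
    mean-zero variables taking values in [-1, 1]."""
    return 2.0 * math.exp(-num_terms * threshold ** 2 / 2.0)

def cutoff_failure_bound(N: float, eps: float) -> float:
    """Bound on the probability that an average of l = N^{1/6} terms
    exceeds the CLT-type threshold l^{-1/2} * N^{eps}."""
    num_terms = round(N ** (1.0 / 6.0))
    threshold = num_terms ** -0.5 * N ** eps
    return hoeffding_tail(num_terms, threshold)

# The bound equals 2 * exp(-N^{2 * eps} / 2), which decays rapidly in N.
for N in (1e6, 1e12, 1e18):
    print(N, cutoff_failure_bound(N, eps=0.05))
```

In this toy setting the number of averaged terms cancels out of the bound, so the decay is governed only by the extra factor $N^{\varepsilon }$ in the threshold.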
Definition 9.3. Provided any functional $\mathfrak {f}:\Omega \to \mathbb R$ and length scale $\mathfrak {l}_{\mathrm {av}}\in \mathbb Z_{\geqslant 0}$ , we define, recalling Definition 8.5,
Lemma 9.4. Define $\widetilde {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}}=\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}}-\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}}$ for any $\mathfrak {l}\in \mathbb Z$ . We have the following for which we recall $\mathfrak {l}_{1}=N^{1/6}$ in Lemma 9.1 :
9.0.2 Time average
We now replace the cutoff spatial average $\bar {\mathfrak {I}}^{\mathbf {X}}$ introduced in the previous Lemma 9.4 by a time average on a mesoscopic timescale that is roughly of order $N^{-1}$ . We say ‘roughly’ because the timescale for time averaging must belong to the set $\mathbb {I}^{\mathbf {T}}$ from Definition 3.1 in order to use Lemma 8.16 and Lemma 8.17: the time-regularity bounds on $\mathbf {Y}^{N}$ that the time-average replacement relies on are available only on the timescales in $\mathbb {I}^{\mathbf {T}}$ . For the statement of the following result, we first recall the transfer-of-timescale operator in Definition 8.15. For its proof, we employ Lemma 8.16 and Lemma 8.17 along with the a priori estimates for the cutoff $\bar {\mathfrak {I}}^{\mathbf {X}}$ : Lemma 8.16 and Lemma 8.17 yield the error in the proposed time-average replacement, and this error is in turn controlled by the estimates for $\bar {\mathfrak {I}}^{\mathbf {X}}$ . We also provide an analog of Lemma 9.5 below but with $N^{1/2}\bar {\mathfrak {I}}^{\mathbf {X}}$ replaced by $N^{6/25}\mathfrak {e}$ ; its proof is that of Lemma 9.5 up to a few cosmetic changes, so we will only provide the necessary adjustments when addressing $N^{6/25}\mathfrak {e}$ .
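The restriction to a discrete set of timescales can be made concrete. The sketch below assumes the timescales form the geometric grid $\mathfrak {t}_{\mathfrak {j}}=N^{-2+\mathfrak {j}\varepsilon _{\mathrm {ap}}}$ , which is our reading of the facts $\mathfrak {t}_{0}=N^{-2}$ and growth by a factor $N^{\varepsilon _{\mathrm {ap}}}$ per step used in this section; the value $\varepsilon _{\mathrm {ap}}=1/100$ and the helper name are hypothetical. It works with exponents of $N$ only.

```python
from fractions import Fraction
from math import floor

def largest_grid_exponent(target_exp: Fraction, eps_ap: Fraction):
    """Return (j, exponent of t_j) for the largest t_j = N^{-2 + j*eps_ap}
    satisfying t_j <= N^{target_exp}."""
    j = floor((target_exp + 2) / eps_ap)
    return j, Fraction(-2) + j * eps_ap

eps_ap = Fraction(1, 100)  # hypothetical value, chosen only for illustration

j1, exp1 = largest_grid_exponent(Fraction(-10, 9), eps_ap)
j2, exp2 = largest_grid_exponent(Fraction(-1), eps_ap)

print(j1, exp1)  # 88 -28/25
print(j2, exp2)  # 100 -1
```

In both cases the selected exponent lies within $\varepsilon _{\mathrm {ap}}$ of the target, and the index is at most of order $\varepsilon _{\mathrm {ap}}^{-1}$ , so it is bounded independently of $N$ .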
Lemma 9.5. Consider $\mathfrak {j}_{1}\in \mathbb Z_{\geqslant 0}$ such that $\mathfrak {t}_{\mathfrak {j}_{1}}\in \mathbb {I}^{\mathbf {T},1}$ is the largest time in $\mathbb {I}^{\mathbf {T},1}$ satisfying $\mathfrak {t}_{\mathfrak {j}_{1}}\leqslant N^{-10/9}$ and consider $\mathfrak {j}_{2}\in \mathbb Z_{\geqslant 0}$ so that $\mathfrak {t}_{\mathfrak {j}_{2}}\in \mathbb {I}^{\mathbf {T},1}$ is the largest time in $\mathbb {I}^{\mathbf {T},1}$ satisfying $\mathfrak {t}_{\mathfrak {j}_{2}}\leqslant N^{-1}$ . We have lower bounds $\mathfrak {t}_{\mathfrak {j}_{1}}\geqslant N^{-10/9-\varepsilon _{\mathrm {ap}}}$ and $\mathfrak {t}_{\mathfrak {j}_{2}}\geqslant N^{-1-\varepsilon _{\mathrm {ap}}}$ because $\mathfrak {t}_{\mathfrak {j}}$ increases by a factor of $N^{\varepsilon _{\mathrm {ap}}}$ in $\mathfrak {j}$ . We additionally have the following pair of expectation estimates:
9.0.3 Final estimates
We now take advantage of replacing functionals inside the heat operator by the respective time averages on the mesoscopic timescales $\mathfrak {t}_{\mathfrak {j}_{i}}$ from Lemma 9.5. This starts with the following estimate whose proof is effectively given in the proof of Lemma 9.5 in terms of technical details, in particular by equilibrium considerations in Lemma 8.11 and then a reduction to equilibrium by Lemma 8.9. These are overviewed in the ‘Strategy’ subsection of Section 3. We similarly establish an analog for the time average of $N^{6/25}\mathfrak {e}$ on the timescale $\mathfrak {t}_{\mathfrak {j}_{2}}$ instead of $\mathfrak {t}_{\mathfrak {j}_{1}}$ . Again, the proof is basically given in that of Lemma 9.5.
Lemma 9.6. Consider the timescales $\mathfrak {t}_{\mathfrak {j}_{i}}\in \mathbb {I}^{\mathbf {T},1}$ from Lemma 9.5 . We have the estimate
We may now deduce Proposition 4.6 upon step-by-step replacements and the triangle inequality for $\|\|_{1;\mathbb {T}_{N}}$ and $\mathbf {E}$ . In each of the replacements below, we inherit the notation of the lemma cited therein.
-
• By Lemma 9.1, we may replace $N^{1/2}\mathsf {S}_{\varepsilon _{1}}$ by $N^{1/2}\mathfrak {I}_{\mathfrak {l}_{1}}^{\mathbf {X}}(\mathsf {S}_{\varepsilon _{1}})+N^{6/25}\mathfrak {e}$ while controlling the error in doing so.
-
• By Lemma 9.4, we may further replace $N^{1/2}\mathfrak {I}_{\mathfrak {l}_{1}}^{\mathbf {X}}(\mathsf {S}_{\varepsilon _{1}})+N^{6/25}\mathfrak {e}$ by $N^{1/2}\bar {\mathfrak {I}}_{\mathfrak {l}_{1}}^{\mathbf {X}}(\mathsf {S}_{\varepsilon _{1}})+N^{6/25}\mathfrak {e}$ .
-
• By Lemma 9.5, we may then replace $N^{1/2}\bar {\mathfrak {I}}_{\mathfrak {l}_{1}}^{\mathbf {X}}(\mathsf {S}_{\varepsilon _{1}})+N^{6/25}\mathfrak {e}$ by $N^{1/2}\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}_{1}}}\bar {\mathfrak {I}}_{\mathfrak {l}_{1}}^{\mathbf {X}}(\mathsf {S}_{\varepsilon _{1}})+N^{6/25}\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}_{2}}}(\mathfrak {e})$ .
It now suffices to apply Lemma 9.6 and the triangle inequality.
Proof of Lemma 9.1 .
We will make explicit the functional $\mathfrak {e}$ in this proof. The first step that we take is to observe the $\mathfrak {D}^{\mathbf {X}}$ -term on the LHS of equation (9.1) is the average of spatial-gradients of $\mathsf {S}_{\varepsilon _{1}}$ with respect to length scales between $N^{\varepsilon _{1}}$ and $\mathfrak {l}_{1}N^{\varepsilon _{1}}$ , as the $\mathfrak {D}^{\mathbf {X}}$ -term is the difference between $\mathsf {S}_{\varepsilon _{1}}$ itself and the average of all its spatial-translates $\tau _{-N^{\varepsilon _{1}}\mathfrak {k}}\mathsf {S}_{\varepsilon _{1}}$ for all $\mathfrak {k}=1$ to $\mathfrak {k}=\mathfrak {l}_{1}$ . We then employ a discrete-type Leibniz rule similar to that used to establish equation (8.24). Ultimately, this gives
by definition, and for $\mathsf {S}_{S,y}={\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})}$ , per $\mathfrak {k}$ on the RHS of equation (9.6), we get, parallel to equation (8.24),
We eventually employ a spatial heat operator estimate in Proposition A.3 to analyze the first term on the RHS of equation (9.7) uniformly in $\mathfrak {k}$ -variables on the RHS of equation (9.6). First, we continue by expanding the second term on the RHS of equation (9.7). To this end, we recall that either $\mathbf {Y}^{N}=0$ or $\mathbf {Y}^{N}=\mathbf {Z}^{N}$ . We consider the latter case as the former case is trivial. By definition of the Gärtner transform $\mathbf {Z}^{N}$ in terms of the $\eta $ -variables, Taylor expansion implies the scale- $\mathfrak {k}$ spatial gradient of $\mathbf {Y}^{N}=\mathbf {Z}^{N}$ is equal to
The infinite series in front of $\mathbf {Z}^{N}$ in equation (9.8) is $\mathrm {O}(N^{-1/2+\varepsilon _{1}}|\mathfrak {k}|)$ . Indeed, this infinite series converges absolutely provided $N^{\varepsilon _{1}}|\mathfrak {k}|\leqslant N^{\alpha }$ with $\alpha <1/2$ , which is the case here for $\varepsilon _{1}=1/14$ and $|\mathfrak {k}|\leqslant \mathfrak {l}_{1}=N^{1/6}$ for $\alpha =6/25$ . Let $\mathfrak {e}_{\mathfrak {k}}$ be the product of this infinite series factor in equation (9.8) with $N^{1/2}\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})$ . We emphasize the following features of $\mathfrak {e}_{\mathfrak {k}}$ .
-
• We have $|\mathfrak {e}_{\mathfrak {k}}|\lesssim N^{\varepsilon _{1}}|\mathfrak {k}|\leqslant N^{6/25}$ since, as we explained before, the infinite series in equation (9.8) is $\mathrm {O}(N^{-1/2+\varepsilon _{1}}|\mathfrak {k}|)$ and $N^{1/2}|\mathsf {S}_{\varepsilon _{1}}|\lesssim N^{1/2}$ since $\bar {\mathfrak {q}}$ is uniformly bounded; see Definition 2.2.
-
• The product $\mathfrak {e}_{\mathfrak {k}}$ has support contained in a neighborhood of radius $N^{6/25}$ centered at $0\in \mathbb {T}_{N}\kern-1.5pt$ . Indeed, the $N^{1/2}\mathsf {S}_{\varepsilon _{1}}$ factor has support contained in a radius $N^{\varepsilon _{1}}$ neighborhood of $0$ with $\varepsilon _{1}=14^{-1}$ , and the infinite series in equation (9.8) has support contained in ; this can be seen by looking at which $\eta $ -variables appear in the far RHS of equation (9.8).
-
• The product $\mathfrak {e}_{\mathfrak {k}}$ vanishes in expectation with respect to any canonical measure on its support. Indeed, this is the case for the $\mathsf {S}_{\varepsilon _{1}}$ factor as can be seen in Definition 4.5, while the support of $\mathsf {S}_{\varepsilon _{1}}$ is contained strictly to the left of $0\in \mathbb {T}_{N}$ and thus disjoint from the support of the infinite series in equation (9.8). Here, we crucially use the property that the projection of any canonical measure over one set onto any subset is a convex combination of canonical measures on the subset, which can be seen by observing that the canonical measure is always the uniform measure on its support. In particular, when we take the expectation of $\mathfrak {e}_{\mathfrak {k}}$ with respect to any canonical measure on its support, we may first take an expectation of the $\mathsf {S}_{\varepsilon _{1}}$ factor with respect to the projection of this canonical measure to the support of $\mathsf {S}_{\varepsilon _{1}}$ , which equals a convex combination of canonical measures over the support of $\mathsf {S}_{\varepsilon _{1}}$ , and deduce that the expectation of $\mathfrak {e}_{\mathfrak {k}}$ with respect to any canonical measure on its support vanishes.
-
• Let $\mathfrak {e}=-N^{-6/25}\widetilde {\sum }_{\mathfrak {k}=1,\ldots ,\mathfrak {l}_{1}}\mathfrak {e}_{\mathfrak {k}}$ . The sign is not so important.
The support of $\mathfrak {e}$ satisfies the conditions of $\mathfrak {e}_{\mathfrak {k}}$ supports from the second bullet point above. Moreover, $\mathfrak {e}$ vanishes in expectation with respect to any canonical measure on its support because each $\mathfrak {e}_{\mathfrak {k}}$ that it averages together satisfies this condition, and any canonical measure on the support of $\mathfrak {e}$ projects to a convex combination of canonical measures on the support of each $\mathfrak {e}_{\mathfrak {k}}$ . Lastly, we have $|\mathfrak {e}|\lesssim N^{-6/25}\sup _{\mathfrak {k}}|\mathfrak {e}_{\mathfrak {k}}|\lesssim 1$ ; see the first bullet point in the above list. Using everything after equation (9.6), we obtain the following in which the $N^{6/25}$ factor on the LHS compensates introducing a factor of $N^{-6/25}$ for $\mathfrak {e}$ :
It remains to take the $\|\|_{1;\mathbb {T}_{N}}$ -norm of both sides of equation (9.9) and estimate the resulting RHS. By the triangle inequality, it suffices to control the $\|\|_{1;\mathbb {T}_{N}}$ -norm of each $\mathfrak {k}$ -indexed term on the RHS of equation (9.9) uniformly in the index $\mathfrak {k}$ . For this, we apply the spatial gradient estimate in Proposition A.3, which transfers the spatial gradient onto the heat kernel in $\mathbf {H}^{N}$ and then integrates the resulting time-integrable singularity. Ultimately, we get the following estimate uniformly in $\mathfrak {k}$ -indices on the RHS of equation (9.9) with universal implied constant:
The final inequality in equation (9.10) follows by power counting and $N^{\varepsilon _{1}}|\mathfrak {k}|\leqslant N^{6/25}$ and the a priori bound $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ .
Proof of Lemma 9.4 .
Consider $\gamma =999^{-999}\varepsilon _{\mathrm {ap}}$ . Via Lemma 8.2 for $\phi _{S,y}=|\widetilde {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}(\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S}))|$ and this choice of $\gamma $ , we deduce
where the last inequality follows from applying the Hölder inequality with exponent $3/2$ with respect to the $\mathbf {E}$ -expectation. We now apply Lemma 8.8 to ‘transfer’ the space-time averaging on the RHS of equation (9.11) to the law of the particle system; in this application of Lemma 8.8, we make the following choices for inputs/parameters:
-
• Pick $\mathfrak {t}_{\mathrm {av}},\mathfrak {l}_{\mathrm {av}}=0$ and $\mathfrak {f}_{S,y}=\widetilde {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})=\mathrm {O}(1)$ with support in and $\varepsilon _{1}=\frac {1}{14}$ .
In this case, the $\mathbf {E}^{\mathrm {dyn}}_{\mathrm {Loc}}$ expectation on the RHS of equation (8.10) does nothing since $\mathfrak {t}_{\mathrm {av}}=0$ , so the path-space dependence of the space-time average from the RHS of equation (9.11) is only through its initial condition $\mathrm {Loc}(\eta )$ that is equal to $\eta $ itself as far as $\mathfrak {f}$ is concerned because the $\mathrm {Loc}$ map only cuts off $\eta $ outside the support of $\mathfrak {f}$ by construction in Definition 8.3/Lemma 8.8. Thus, as $\mathfrak {f}$ is uniformly bounded, we deduce the following estimate from Lemma 8.8 with the aforementioned specialization:
Let us now estimate the first term within the RHS of equation (9.12). We will do this through Lemma 8.9 for $\mathfrak {h}=|\widetilde {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}(\mathsf {S}_{\varepsilon _{1}}(\bar {\mathfrak {q}}))|^{3/2}$ , whose support is contained in a block with length of order $N^{\varepsilon _{1}}\mathfrak {l}_{1}\lesssim N^{1/14+1/6}\leqslant N^{6/25}$ . We also choose $\kappa =1$ in this application of Lemma 8.9, so we deduce the following in which $\mathbb {B}$ denotes the support of our choice of $\mathfrak {h}=|\widetilde {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}(\mathsf {S}_{\varepsilon _{1}}(\bar {\mathfrak {q}}))|^{3/2}$ :
Observe the term inside the expectation on the far RHS is equal to zero on the event where the indicator function defining $\widetilde {\mathfrak {I}}^{\mathbf {X}}$ is not zero. Thus, because $\bar {\mathfrak {q}}$ and its functionals are uniformly bounded, the expectation on the far RHS of equation (9.13) is at most uniformly bounded factors times the probability that the indicator function in Definition 9.3 fails. We estimate this using Lemma 8.10 with the choice of functions $\mathfrak {f}_{\mathfrak {j}}=\tau _{-\mathfrak {j}N^{\varepsilon _{1}}}\mathsf {S}_{\varepsilon _{1}}(\eta )$ for $\mathfrak {j}\geqslant 1$ and $\gamma =\varepsilon _{\mathrm {ap}}$ , whose supports are mutually disjoint since $\mathsf {S}_{\varepsilon _{1}}$ has support length $N^{\varepsilon _{1}}$ by construction in Definition 4.5, and for $\mathfrak {J}=\mathfrak {l}_{1}$ . Thus, we have
We now combine equations (9.11), (9.12), (9.13) and (9.14) along with elementary power counting in N to deduce the claim.
Proof of Lemma 9.5 .
We establish the proposed estimate for the first term on the LHS of equation (9.4), so we formally set $\mathfrak {e}=0$ for now. Observe $\mathfrak {j}_{1}\lesssim _{\varepsilon _{\mathrm {ap}}}1$ for $\mathfrak {j}_{1}$ in the statement of Lemma 9.5, as $\mathfrak {t}_{\mathfrak {j}}$ increases by a factor of $N^{\varepsilon _{\mathrm {ap}}}$ with each step in the index $\mathfrak {j}$ . Also, we emphasize the important assumption $\mathfrak {t}_{\mathfrak {j}_{1}}\leqslant N^{-1}$ . Lastly, we note that via the triangle inequality, it suffices to control the LHS of equation (9.4) both with $\mathfrak {t}_{\mathfrak {j}_{1}}$ replaced by $\mathfrak {t}_{0}=N^{-2}$ and with $0$ replaced by $\mathfrak {t}_{0}=N^{-2}$ , namely
Use Lemma 8.16 and (8.37) in Lemma 8.17 with $\mathfrak {f}_{S,y}=N^{1/2}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}{(\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ and $\gamma =\varepsilon _{\mathrm {ap}}$ and, for Lemma 8.17, $\mathfrak {J}=\mathfrak {j}_{1}$ . Lemma 8.16 estimates the first term on the RHS of equation (9.15) at the cost of $\mathrm {O}(\mathrm {RHS}((9.4)))$ since our choice of $\mathfrak {f}$ admits an a priori cutoff:
We note equation (8.37) in Lemma 8.17 controls the second term on the RHS of equation (9.15) with $\mathfrak {J}=\mathfrak {j}_{1}\lesssim 1$ and $\mathfrak {f}=N^{1/2}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}(\mathsf {S}_{\varepsilon _{1}}(\eta ))$ and $\gamma =\varepsilon _{\mathrm {ap}}$ . As this choice of $\mathfrak {f}$ satisfies $|\mathfrak {f}|\lesssim N^{1/2}$ , this shows the second term on the RHS of equation (9.15) is
We apply Lemma 8.2 with $\phi _{S,y}=N^{1/2}\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}}}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ ; for $\mathfrak {j}\leqslant \mathfrak {j}_{1}$ , this gives
It suffices to estimate the RHS of equation (9.19) uniformly in $\mathfrak {j}$ satisfying $\mathfrak {t}_{\mathfrak {j}}\leqslant N^{-1}$ . To this end, we employ Lemma 8.8 to estimate the expectation of this individual integral by the expectation of a single functional against the space-time averaged law of the particle system. This provides the following for which we forget, for now, the $2/3$ -power on the RHS of equation (9.19), in which $\mathrm {Loc}=\mathrm {Loc}_{\mathfrak {t}_{\mathfrak {j}},\mathfrak {l}_{\mathrm {tot}}}$ of Definition 8.3/Lemma 8.8 is taken with $\gamma _{0}=\varepsilon _{\mathrm {ap}}$ and $\mathfrak {f}_{S,y}=\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ and $\mathfrak {l}_{\mathrm {av}}=1$ , as our choice of $\mathfrak {f}$ already accounts for the spatial averaging, and $\mathfrak {l}=N^{\varepsilon _{1}}\mathfrak {l}_{1}\leqslant N^{6/25}$ equal to the support length of our choice of functional $\mathfrak {f}_{S,y}=\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ :
Plugging in the second term on the RHS of equation (9.20) into the RHS of equation (9.19), its contribution is controlled by the RHS of the proposed estimate (9.4), so it suffices to estimate the first term on the RHS of equation (9.20). For this purpose, we will employ Lemma 8.9 with the following choices for inputs $\kappa $ and $\mathfrak {h}$ ; for the choices below, we recall $\mathfrak {l}_{1}=N^{1/6}$ from Lemma 9.1.
-
• We will choose the constant $\kappa $ in the statement of Lemma 8.9 to be $\kappa =N^{-3\varepsilon _{\mathrm {ap}}/2}\mathfrak {l}_{1}^{3/4}=N^{1/8-3\varepsilon _{\mathrm {ap}}/2}$ .
-
• Now, choose $\mathfrak {h}$ in Lemma 8.9 to be the $\mathbf {E}^{\mathrm {dyn}}$ functional. Observe first that these two bullet points are ‘compatible’ for applying Lemma 8.9 because the $\mathbf {E}^{\mathrm {dyn}}$ functional is uniformly bounded by the time average it is taking expectation of. This time average is controlled uniformly by the $\|\|_{\omega ;\infty }$ -norm of the quantity it is averaging, which in this case is the $\bar {\mathfrak {I}}^{\mathbf {X}}$ functional. But this $\bar {\mathfrak {I}}^{\mathbf {X}}$ functional is at most $N^{-1/12+\varepsilon _{\mathrm {ap}}}$ ; see Definition 9.3. Taking the $-3/2$ -power of this bound gives $\kappa $ .
-
• Observe the support of the $\mathfrak {h}=\mathbf {E}^{\mathrm {dyn}}$ functional is equal to the support of $\mathrm {Loc}$ from our application of Lemma 8.8 that yielded equation (9.20), as the $\mathbf {E}^{\mathrm {dyn}}$ functional takes $\mathrm {Loc}$ as its initial configuration for the path-space expectation. The support of $\mathrm {Loc}$ is given in Definition 8.3/Lemma 8.8, which we emphasize is taken with $\gamma _{0}=\varepsilon _{\mathrm {ap}}$ and $\mathfrak {t}=\mathfrak {t}_{\mathfrak {j}}$ and $\mathfrak {l}\lesssim N^{\varepsilon _{1}}\mathfrak {l}_{1}$ for $\varepsilon _{1}=1/14$ and $\mathfrak {l}_{1}=N^{1/6}$ ; indeed, according to Lemma 8.8 we take the parameter $|\mathfrak {l}\mathfrak {l}_{\mathrm {av}}|$ for the $\mathrm {Loc}$ support equal to $\mathrm {O}(1)$ times the support length of $\mathsf {S}$ that we are space-time averaging on the RHS of equation (9.20), which is of order $N^{\varepsilon _{1}}$ , times the length scale of this spatial averaging, which is order $\mathfrak {l}_{1}$ ; we also add $\mathrm {O}(N^{\varepsilon _{1}})$ , which is basically the support length of $\mathsf {S}_{\varepsilon _{1}}(\eta )$ , but this is lower order.
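The compatibility of the two choices above reduces to exponent arithmetic, which the following sketch checks. It assumes only the stated a priori bound $N^{-1/12+\varepsilon _{\mathrm {ap}}}$ on $\bar {\mathfrak {I}}^{\mathbf {X}}$ and $\mathfrak {l}_{1}=N^{1/6}$ ; the pair encoding and the helper `power` are our own illustrative conventions, with $\varepsilon _{\mathrm {ap}}$ kept symbolic.

```python
from fractions import Fraction

def power(exponent, p):
    """Exponent pair of (N^{c + d*eps_ap})^p, where exponent = (c, d)."""
    c, d = exponent
    return (c * p, d * p)

# Pairs (c, d) encode N^{c + d*eps_ap}; eps_ap is kept symbolic through d.
l1 = (Fraction(1, 6), Fraction(0))              # l_1 = N^{1/6}
cutoff_bound = (Fraction(-1, 12), Fraction(1))  # a priori bound N^{-1/12 + eps_ap}

kappa_from_bound = power(cutoff_bound, Fraction(-3, 2))
l1_factor = power(l1, Fraction(3, 4))           # l_1^{3/4}

print(kappa_from_bound)
print(l1_factor)
```

The first pair encodes $N^{1/8-3\varepsilon _{\mathrm {ap}}/2}$ and the second encodes $N^{1/8}$ , so the $-3/2$ -power of the a priori bound agrees with $\kappa =N^{-3\varepsilon _{\mathrm {ap}}/2}\mathfrak {l}_{1}^{3/4}$ .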
Lemma 8.9 with the aforementioned choices lets us control the first term on the RHS of equation (9.20) by two terms, one depending on the support $\mathbb {B}$ of $\mathfrak {h}$ and another being the supremum of canonical measure expectations. This first support-term, after multiplying by the prefactors before the expectation on the RHS of equation (9.20), is ultimately negligible courtesy of the following calculation:
The last inequality in equation (9.21) follows by $\mathfrak {l}_{1}=N^{1/6}$ and $\varepsilon _{1}=1/14$ and, from Definition 3.1, that $\mathfrak {t}_{\mathfrak {j}}\leqslant N^{-1}$ and $\mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{\varepsilon _{\mathrm {ap}}}\mathfrak {t}_{\mathfrak {j}}$ . By plugging this in the RHS of equation (9.19) and taking its $2/3$ -power, we deduce that its contribution is controlled by the RHS of the proposed estimate (9.4). We are now left to estimate the supremum of canonical measure expectations on the RHS of the estimate we obtain when employing Lemma 8.9 with the previous list of choices for inputs. For clarity, let us record below the supremum that we are left to estimate; we will insert it into the RHS of equation (9.19) and show that its contribution is controlled by the RHS of the proposed estimate (9.4). Below, $\mathbf {E}^{\sigma }$ denotes expectation with respect to the canonical measure of parameter $\sigma $ on the support of $\mathbf {E}^{\mathrm {dyn}}$ /of $\mathrm {Loc}$ :
We take the same $\mathrm {Loc}$ as we did for our applications of Lemma 8.9 in the previous quantity $\Phi $ . To estimate $\Phi $ , we proceed with the following two-step estimate, which is basically applying Lemma 8.11, but first removing the cutoff for the spatial average on the RHS of equation (9.22) that is absent in Lemma 8.11. Intuitively, this cutoff does nothing with very high probability by Lemma 8.10.
-
• We first replace $\bar {\mathfrak {I}}^{\mathbf {X}}$ by $\mathfrak {I}^{\mathbf {X}}$ . The cost in doing so is recorded in the following estimate:
(9.23) $$ \begin{align} \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} \ \lesssim \ \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} + \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}(\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})-\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}}))|^{\frac32}. \end{align} $$
We will estimate the second term within the RHS of equation (9.23). By thinking of the $\mathfrak {I}^{\mathbf {T}}$ time average as an expectation, we will first move the $3/2$ -power and absolute value past the $\mathfrak {I}^{\mathbf {T}}$ average via the Hölder inequality to get
(9.24) $$ \begin{align} \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}(\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})-\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}}))|^{\frac32} \ \leqslant \ \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}(|\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})-\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32}). \end{align} $$
Following the proof of Lemma 8.11, we first replace $\mathbf {E}^{\mathrm {dyn}}$ in equation (9.24) with an expectation with respect to the path-space measure corresponding to the particle system but with periodic boundary conditions on the support of $\mathrm {Loc}$ if we allow error of at most order $N^{-100}$ , as $\bar {\mathfrak {q}}$ is uniformly bounded. We now move both expectations, after this replacement, on the RHS of equation (9.24) past the $\mathfrak {I}^{\mathbf {T}}$ time average by the Fubini theorem. Also, from the proof of Lemma 8.11, for this smaller periodic system on the support of $\mathrm {Loc}$ , the $\sigma $ -canonical measure defining the expectation $\mathbf {E}^{\sigma }$ on the RHS of equation (9.24) is an invariant measure. Therefore, it suffices to estimate the expectation of what is inside the $\mathfrak {I}^{\mathbf {T}}$ average on the RHS of equation (9.24) when we sample the $\eta $ -variables in the support of $\mathrm {Loc}$ by the $\sigma $ -canonical measure. As the support of $\mathfrak {I}^{\mathbf {X}}-\bar {\mathfrak {I}}^{\mathbf {X}}$ is contained in that of $\mathrm {Loc}$ , and since projections of canonical measures onto smaller subsets are convex combinations of canonical measures, it suffices to estimate expectation of $|\mathfrak {I}^{\mathbf {X}}-\bar {\mathfrak {I}}^{\mathbf {X}}|$ with respect to any canonical measure. By the large-deviations estimate in Lemma 8.10, as $|\mathfrak {I}^{\mathbf {X}}-\bar {\mathfrak {I}}^{\mathbf {X}}|$ is uniformly bounded, this expectation is at most the probability $\mathrm {O}(N^{-100})$ that the indicator function defining $\bar {\mathfrak {I}}^{\mathbf {X}}$ fails.
Ultimately, from this paragraph and the bound $\mathfrak {t}_{\mathfrak {j}+1}\leqslant 1$ , we get the following, which then, after plugging into the RHS of equation (9.19) and taking its $2/3$ -power, has contribution controlled by the RHS of the proposed equation (9.4):
(9.25) $$ \begin{align} N^{\frac34+8\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}+1}^{\frac38}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}(|\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})-\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32}) \ \lesssim \ N^{\frac34+8\varepsilon_{\mathrm{ap}}}N^{-100} \ \lesssim \ N^{-99}. \end{align} $$
-
• We now estimate the first term on the RHS of equation (9.23). We first employ the Hölder inequality to boost the $3/2$ exponent to $2$ , so
(9.26) $$ \begin{align} N^{\frac34+8\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}+1}^{\frac38}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} \ &\leqslant \left(N^{1+\frac{32}{3}\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}+1}^{\frac12}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{2}\right)^{\frac34}. \end{align} $$
We now apply Lemma 8.11 to the RHS of equation (9.26) where $\mathfrak {f}$ in Lemma 8.11 is taken to be $\mathsf {S}_{\varepsilon _{1}}(\bar {\mathfrak {q}})$ here, which satisfies the assumptions needed of $\mathfrak {f}$ in Lemma 8.11 as noted in Definition 4.5. We clarify we also take $\mathfrak {t}_{\mathrm {av}}=\mathfrak {t}_{\mathfrak {j}}$ and $\mathfrak {l}_{\mathrm {av}}=\mathfrak {l}_{1}=N^{1/6}$ .
This ultimately provides the following estimate for which we recall $\mathfrak {t}_{\mathfrak {j}+1}=N^{\varepsilon _{\mathrm {ap}}}\mathfrak {t}_{\mathfrak {j}}$ and $\mathfrak {t}_{\mathfrak {j}}\geqslant N^{-2}$ and the support of $\mathsf {S}$ has length order $N^{\varepsilon _{1}}$ with $\varepsilon _{1}=1/14$ , the first two of which follow by construction in Definition 3.1 and the last in Definition 4.5/Proposition 4.6:
(9.27) $$ \begin{align} N^{1+\frac{32}{3}\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}+1}^{\frac12}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}}}^{\mathbf{T}}\mathfrak{I}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{2} \ \lesssim \ N^{1+\frac{35}{3}\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}}^{\frac12}N^{-2}\mathfrak{t}_{\mathfrak{j}}^{-1}\mathfrak{l}_{1}^{-1}N^{\frac{2}{14}}+N^{-100} \ \lesssim \ N^{-\frac{1}{999}+\frac{35}{3}\varepsilon_{\mathrm{ap}}}. \end{align} $$
Plugging the above upper bound (9.27) in the RHS of equation (9.19) and taking its $2/3$ -power proves the contribution of the first term in the $\Phi $ -decomposition (9.23) is controlled by the RHS of the proposed estimate (9.4).
The previous two bullet points estimate $\Phi $ in equation (9.22), so that its contribution, after plugging it into the RHS of equation (9.19) and taking its $2/3$ -power, is appropriately controlled, so we are done.
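As a sanity check on the last inequality in (9.27), the exponent of $N$ can be tallied directly. This is only our own bookkeeping under the bounds recalled above, namely $\mathfrak {t}_{\mathfrak {j}}\geqslant N^{-2}$ (so $\mathfrak {t}_{\mathfrak {j}}^{-1/2}\leqslant N$ ) and $\mathfrak {l}_{1}=N^{1/6}$ , and it ignores the harmless $N^{-100}$ term: $$ \begin{align*} N^{1+\frac{35}{3}\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}}^{\frac12}N^{-2}\mathfrak{t}_{\mathfrak{j}}^{-1}\mathfrak{l}_{1}^{-1}N^{\frac{2}{14}} \ &= \ N^{-1+\frac{35}{3}\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}}^{-\frac12}N^{-\frac16+\frac17} \ \leqslant \ N^{-1+1-\frac{1}{42}+\frac{35}{3}\varepsilon_{\mathrm{ap}}} \ = \ N^{-\frac{1}{42}+\frac{35}{3}\varepsilon_{\mathrm{ap}}} \ \leqslant \ N^{-\frac{1}{999}+\frac{35}{3}\varepsilon_{\mathrm{ap}}}. \end{align*} $$ The passage from $\mathfrak {t}_{\mathfrak {j}+1}^{1/2}$ to $\mathfrak {t}_{\mathfrak {j}}^{1/2}$ on the RHS of (9.27) costs a factor $N^{\varepsilon _{\mathrm {ap}}/2}$ , which is absorbed since $\frac {32}{3}+\frac 12\leqslant \frac {35}{3}$ .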
It now suffices to estimate the second term on the LHS of equation (9.4). To this end, it suffices to follow the argument we have given to estimate the first term on the LHS of equation (9.4) but with the following technical adjustments; we also explain intuitively why it works.
• In applying Lemma 8.16 and equation (8.37) in Lemma 8.17, we instead choose $\mathfrak {f}=N^{6/25}\mathfrak {e}$ from the second term on the LHS of equation (9.4).
• When applying Lemma 8.2 and Lemma 8.8, we instead integrate/apply the heat operator against our choice $\mathfrak {f}=N^{6/25}\mathfrak {e}$ from the previous bullet point and choose $\mathrm {Loc}=\mathrm {Loc}_{\mathfrak {t}_{\mathfrak {j}},\mathfrak {l}_{\mathrm {tot}}}$ with $\mathfrak {l}_{\mathrm {tot}}\lesssim N^{6/25}$ , because the support of $\mathfrak {e}$ has a length of order $N^{6/25}$ , which follows by construction in Lemma 9.1, and there is no added length scale gain for $\mathfrak {f}=N^{6/25}\mathfrak {e}$ from spatial averaging.
• When applying Lemma 8.9, we instead choose $\kappa =1$ and $\mathfrak {h}$ equal to $\mathbf {E}^{\mathrm {dyn}}$ of the time average of $\mathfrak {e}$ , which we recall has support with length of order $N^{6/25}$ . These choices of $\kappa $ and $\mathfrak {h}$ are ‘compatible’ as $\mathfrak {e}$ is uniformly bounded according to Lemma 9.1.
• The strategy we used to bound the first term on the LHS of equation (9.4) but with these modifications successfully controls the second term on the LHS of equation (9.4) for the following reason. We have a smaller power of N for this second term; this means our estimates should be $N^{-1/2+6/25}=N^{-13/50}$ better than those for the first term on the LHS of equation (9.4). However, we also lose the spatial averaging, which introduces factors basically of order $N^{-1/12}$ , so our estimates are actually only $N^{-13/50+1/12}=N^{-53/300}$ better. Moreover, the support of the spatial average $\mathfrak {I}^{\mathbf {X}}$ for the first term on the LHS of equation (9.4) has basically the same length as the support of $\mathfrak {e}$ . Lastly, when applying Lemma 8.11, the length of the support of the functional we space-time average has now increased from order $N^{\varepsilon _{1}}=N^{1/14}$ to $N^{6/25}$ . As the estimate in Lemma 8.11 depends linearly on the support, our estimates are actually $N^{-53/300+6/25-1/14}=N^{-17/2100}$ better. In particular, we get sharper estimates for the second term on the LHS of equation (9.4) when we modify the analysis for the first term therein via the previous three bullet points.
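The exponent bookkeeping in the last bullet point can be laid out in one line. This is a sketch that only tracks the powers of $N$ quoted there and treats the factors that are 'basically of order' a given power as exact: $$ \begin{align*} -\tfrac12+\tfrac{6}{25}=-\tfrac{13}{50}, \qquad -\tfrac{13}{50}+\tfrac{1}{12}=-\tfrac{53}{300}, \qquad -\tfrac{53}{300}+\tfrac{6}{25}-\tfrac{1}{14}=\tfrac{-371+504-150}{2100}=-\tfrac{17}{2100}<0. \end{align*} $$ So the net exponent stays negative after all three adjustments, although only by a small margin.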
This completes the proof, as we have estimated both terms on the LHS of the proposed estimate (9.4) by the RHS of equation (9.4).
Proof of Lemma 9.6 .
We again forget the second term on the LHS of equation (9.5) for now and focus on the first term therein. The first step we take is to introduce additional spatial averaging for the space-time average on the LHS of equation (9.5). Unlike Lemma 9.1, however, we will not require an explicit formula for gradients of the $\mathbf {Y}^{N}$ process and instead employ the replacement estimate in Lemma 8.13. We will pick $\mathfrak {l}=N^{1/6}$ in our forthcoming application of Lemma 8.13. We also pick $\mathfrak {f}_{S,y}=N^{1/2}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ on the LHS of equation (9.5). Note the choice of $\mathfrak {f}$ depends only on $\eta $ -variables in a block of length $N^{\varepsilon _{1}}\mathfrak {l}_{1}$ , with $\varepsilon _{1}=1/14$ and $\mathfrak {l}_{1}=N^{1/6}$ ; see Definitions 4.5 and 9.3. We ultimately deduce the first term on the LHS of equation (9.5) is bounded above by $\mathrm {O}(1)$ times
The factor $\bar {\mathfrak {l}}$ in (9.28) via Lemma 8.13 is equal to the length scale for spatial averaging $\mathfrak {l}_{1}=N^{1/6}$ times the length of the support of $\mathfrak {f}_{S,y}=N^{1/2}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ (which is $\mathrm {O}(N^{\varepsilon _{1}}\mathfrak {l}_{1})\lesssim N^{6/25}$ ). Since $|\bar {\mathfrak {l}}|\leqslant N^{1/2}$ , Lemma 8.13 applies. We now explain equation (9.28).
• Lemma 8.13 for $\mathfrak {l}=N^{\frac 16}$ and $\mathfrak {f}_{S,y}=N^{\frac 12}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})$ implies the difference between the first term on the LHS of equation (9.5) and the first term in equation (9.28) is controlled by the RHS of equation (8.22) with these choices. It suffices to note that these two terms on the RHS of equation (8.22) are controlled by the third and second terms in equation (9.28), respectively, as $|\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{1}}({\mathsf {S}_{\varepsilon _{1}}(\tau _{y}\eta _{S})})|\lesssim N^{\varepsilon _{\mathrm {ap}}}|\mathfrak {l}_{1}|^{-\frac 12}=N^{-\frac {1}{12}+\varepsilon _{\mathrm {ap}}}$ .
The last term in equation (9.28) is clearly controlled by the RHS of the proposed estimate (9.5). It remains to control the first two terms in equation (9.28), for which we employ the following two bullet points based on the proof of Lemma 9.5.
• To analyze the second term in equation (9.28), we directly follow the proof of Lemma 9.5 starting from equation (9.19) but dropping the prefactor $N^{3\varepsilon _{\mathrm {ap}}}\mathfrak {t}^{1/4}$ and choosing $\mathfrak {j}=\mathfrak {j}_{1}$ from Lemma 9.5/Lemma 9.6. In particular, we will make the same choices in our applications of results in Section 8, and we ultimately deduce the second term in equation (9.28) is controlled by the RHS of the proposed estimate (9.5). Intuitively, we succeed for the following reason. We lose a factor of $\mathfrak {t}^{1/4}$ , but in the calculations starting at equation (9.19) in the proof of Lemma 9.5 we only use the bound $\mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1}$ , so we only lose a factor of $N^{-1/4}$ . On the other hand, the prefactor is no longer $N^{1/2}$ but rather $\bar {\mathfrak {l}}^{1/2}$ . Recalling $\bar {\mathfrak {l}}\leqslant N^{1/6}N^{\varepsilon _{1}}\mathfrak {l}_{1}\leqslant N^{1/6}N^{6/25} = N^{61/150}$ , we also gain a factor $N^{-1/2+61/300}=N^{-89/300}$ that beats out the $N^{1/4}$ factor that we obtained in dropping $\mathfrak {t}^{1/4}$ from earlier in this bullet point.
• We now analyze the first term in equation (9.28). In this case, we will also follow the proof for Lemma 9.5 starting with equation (9.19), although now we must address the additional $\mathfrak {I}^{\mathbf {X}}$ operator in the first term in equation (9.28). We start with Lemma 8.2 for $\phi $ equal to the $\mathfrak {I}^{\mathbf {T}}\mathfrak {I}^{\mathbf {X}}\bar {\mathfrak {I}}^{\mathbf {X}}$ term in the first term in equation (9.28) to get the following parallel of equation (9.19):
(9.29) $$ \begin{align} \mathbf{E}\|\mathbf{H}^{N}(N^{\frac12}\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}_{1}}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}}\bar{\mathfrak{I}}^{\mathbf{X}}_{\mathfrak{l}_{1}}({\mathsf{S}_{\varepsilon_{1}}(\tau_{y}\eta_{S})})\mathbf{Y}^{N}_{S,y})\|_{1;\mathbb{T}_{N}} \lesssim(N^{\frac34+2\varepsilon_{\mathrm{ap}}}\mathbf{E}\mathbf{I}_{1}(|\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}_{1}}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}}\bar{\mathfrak{I}}^{\mathbf{X}}_{\mathfrak{l}_{1}}({\mathsf{S}_{\varepsilon_{1}}(\tau_{y}\eta_{S})})|^{\frac32}))^{\frac23}. \end{align} $$We now apply Lemma 8.8 to obtain the following parallel of equation (9.20) in the proof of Lemma 9.5; we make the same choices for inputs for Lemma 8.8, except our choice for $\mathfrak {l}_{\mathrm {av}}$ is now equal to $N^{1/6}$ instead of 0. We basically establish equation (9.20) but with the additional $\mathfrak {I}^{\mathbf {X}}$ operator, which is present in equation (9.29), and without any $\mathfrak {t}_{\mathfrak {j}+1}$ -dependent prefactor, which is present in equation (9.20), and for which the $\mathrm {Loc}$ term below is now defined with $\mathfrak {l}_{\mathrm {tot}}$ being that from the proof of Lemma 9.5 but times $N^{1/6}$ since $\mathfrak {l}_{\mathrm {tot}}$ takes into account the spatial-average length scale $\mathfrak {l}_{\mathrm {av}}=N^{1/6}$ coming from $\mathfrak {I}^{\mathbf {X}}_{N^{1/6}}$ in equation (9.29) (see Lemma 8.8 for $\mathrm {Loc}$ and $\mathfrak {l}_{\mathrm {tot}}$ ):(9.30) $$ \begin{align} N^{\frac34+2\varepsilon_{\mathrm{ap}}}\mathbf{E}\mathbf{I}_{1}\left(|\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}_{1}}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}}\bar{\mathfrak{I}}^{\mathbf{X}}_{\mathfrak{l}_{1}}({\mathsf{S}_{\varepsilon_{1}}(\tau_{y}\eta_{S})})|^{\frac32}\right) \ &\lesssim \ 
N^{\frac34+2\varepsilon_{\mathrm{ap}}}{\mathbf{E}_{0}}\bar{\mathfrak{P}}_{1}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}^{\mathbf{T}}_{\mathfrak{t}_{\mathfrak{j}_{1}}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}}\bar{\mathfrak{I}}^{\mathbf{X}}_{\mathfrak{l}_{1}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} + N^{-100}. \end{align} $$The second term on the RHS of equation (9.30) is controlled by the RHS of the proposed estimate (9.5) after taking $2/3$ -powers upon plugging its contribution into the RHS of equation (9.29). We now apply Lemma 8.9 with the same choices as we made in the proof of Lemma 9.5, which are explicitly declared prior to equation (9.21), but $\mathfrak {l}_{\mathrm {av}}=N^{1/6}$ . This bounds the first term on the RHS of equation (9.30) by the sum of a support term plus a supremum of canonical measure expectations of the $\mathbf {E}^{\mathrm {dyn}}$ term on the RHS of equation (9.30). The first support term is estimated in the exact same fashion as equation (9.21), except we do not have the helpful $\mathfrak {t}_{\mathfrak {j}+1}$ -dependent factor, namely its $3/8$ -power. However, this factor is actually not needed to prove the upper bound on the far RHS of equation (9.30). Also, the support of $\mathbf {E}^{\mathrm {dyn}}$ is changed as $\mathfrak {l}_{\mathrm {tot}}$ has changed as noted before equation (9.30), so our version of equation (9.21) must be adjusted via replacing $N^{\varepsilon _{1}}\mathfrak {l}_{1}$ therein by $\mathfrak {l}_{\mathrm {av}}N^{\varepsilon _{1}}\mathfrak {l}_{1}=N^{1/6}N^{\varepsilon _{1}}\mathfrak {l}_{1}$ , though the upper bound in equation (9.21) still holds after this adjustment. Ultimately, the contribution of the support term/first term on the RHS of equation (8.16) for our choices of inputs in Lemma 8.9 is controlled by the RHS of the proposed estimate (9.5) after plugging into equation (9.29) and taking $2/3$ -powers. 
We are left to bound canonical measure expectations; by Lemma 8.9 these are an analog of equation (9.22):(9.31) $$ \begin{align} \Phi \ \overset{\bullet}= \ {\sup}_{\sigma\in\mathbb R}N^{\frac34+2\varepsilon_{\mathrm{ap}}}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}_{1}}}^{\mathbf{T}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}}\bar{\mathfrak{I}}_{\mathfrak{l}_{1}}^{\mathbf{X}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32}. \end{align} $$By following the first bullet point after equation (9.22), we can first remove the bar over $\bar {\mathfrak {I}}^{\mathbf {X}}$ in equation (9.31). Now, we observe that the double $\mathfrak {I}^{\mathbf {X}}$ average on length scales $N^{1/6}$ and $\mathfrak {l}_{1}$ is actually a single $\mathfrak {I}^{\mathbf {X}}$ average on the product of the length scales $N^{1/6}\mathfrak {l}_{1}$ . Thus,(9.32) $$ \begin{align} \Phi \ \lesssim \ {\sup}_{\sigma\in\mathbb R}N^{\frac34+\frac32\varepsilon_{\mathrm{ap}}}\mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}_{1}}}^{\mathbf{T}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}\mathfrak{l}_{1}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} + N^{-100} \ \overset{\bullet}= \ \Phi' + N^{-100}. \end{align} $$At this point, we will now directly follow the second bullet point containing the estimate (9.26) but now with a spatial-average length scale equal to $N^{1/6}\mathfrak {l}_{1}$ . Intuitively, in the estimate (9.27), we lose the $\mathfrak {t}_{\mathfrak {j}+1}$ -dependent factor, namely its square root, thus we gain the bad factor of $N^{5/9}$ because $\mathfrak {t}_{\mathfrak {j}_{1}}\geqslant N^{-10/9-\varepsilon _{\mathrm {ap}}}$ by Lemma 9.5. 
On the other hand, the additional $N^{1/6}$ factor for the length scale gives us an additional $N^{-1/6}$ factor in equation (9.27) because that estimate is ‘inversely’ linear in the length scale:(9.33) $$ \begin{align} \mathbf{E}^{\sigma}\mathbf{E}_{\mathrm{Loc}}^{\mathrm{dyn}}|\mathfrak{I}_{\mathfrak{t}_{\mathfrak{j}_{1}}}^{\mathbf{T}}\mathfrak{I}^{\mathbf{X}}_{N^{1/6}\mathfrak{l}_{1}}(\mathsf{S}_{\varepsilon_{1}})|^{\frac32} \ \lesssim \ \left(N^{-2+6\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}_{1}}^{-1}N^{-\frac16}\mathfrak{l}_{1}^{-1}N^{\frac{2}{14}}+N^{-100}\right)^{3/4} \ \lesssim \ N^{-\frac34-\frac{1}{999}+8\varepsilon_{\mathrm{ap}}}. \end{align} $$The last estimate in equation (9.33) follows by power-counting; recall $\mathfrak {l}_{1}=N^{1/6}$ in Lemma 9.1 and the $\mathfrak {t}_{\mathfrak {j}_{1}}$ -lower bound prior to equation (9.33).
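The power-counting behind the last bound in (9.33) can be checked as follows. This is a sketch of our own, assuming only $\mathfrak {l}_{1}=N^{1/6}$ and the lower bound $\mathfrak {t}_{\mathfrak {j}_{1}}\geqslant N^{-10/9-\varepsilon _{\mathrm {ap}}}$ quoted before equation (9.33), with the $N^{-100}$ term dropped: $$ \begin{align*} N^{-2+6\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}_{1}}^{-1}N^{-\frac16}\mathfrak{l}_{1}^{-1}N^{\frac{2}{14}} \ &\leqslant \ N^{-2+\frac{10}{9}-\frac16-\frac16+\frac17+7\varepsilon_{\mathrm{ap}}} \ = \ N^{-\frac{68}{63}+7\varepsilon_{\mathrm{ap}}}, \\ \left(N^{-\frac{68}{63}+7\varepsilon_{\mathrm{ap}}}\right)^{\frac34} \ &= \ N^{-\frac{17}{21}+\frac{21}{4}\varepsilon_{\mathrm{ap}}} \ \leqslant \ N^{-\frac34-\frac{1}{999}+8\varepsilon_{\mathrm{ap}}}. \end{align*} $$ The final step uses $\frac {17}{21}-\frac 34=\frac {5}{84}\geqslant \frac {1}{999}$ and $\frac {21}{4}\leqslant 8$ .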
We have estimated the first two terms in equation (9.28) by the above two bullet points, completing the proposed estimate for the first term on the LHS of equation (9.5). It remains to estimate the second term on the LHS of equation (9.5). For this, we will follow the analysis in the proof of Lemma 9.5 for the second term on the LHS of equation (9.4). In particular, we lose an additional factor $\mathfrak {t}^{1/4}$ , which at best provides a factor of $N^{-1/4}$ because $\mathfrak {t}\leqslant N^{-1}$ for this lemma. Let us now observe in the final bullet point in the proof of Lemma 9.5, until the application of Lemma 8.11 mentioned therein, the benefit we gain, over the explicitly written upper bounds in the proof of Lemma 9.5, from having the smaller $N^{6/25}$ prefactor, as opposed to $N^{1/2}$ , is a factor of $N^{-13/50}$ . This certainly beats out the $N^{1/4}$ we have gained from forgetting $\mathfrak {t}^{1/4}$ . As for the application of Lemma 8.11 mentioned in the final bullet point in the proof of Lemma 9.5, observe that we only use the bound $\mathfrak {t}^{-1/2}\leqslant N^{-1}$ , whereas for the current lemma we have $\mathfrak {t}^{-1/2}\leqslant N^{-1/2+\varepsilon _{\mathrm {ap}}}$ . Therefore, the $N^{1/4}$ we must include from forgetting $\mathfrak {t}^{1/4}$ that we noted before is compensated for by the additional $N^{-1/2+\varepsilon _{\mathrm {ap}}}$ factor we gain from the improved bound $\mathfrak {t}^{-1/2}\leqslant N^{-1} \to \mathfrak {t}^{-1/2}\leqslant N^{-1/2+\varepsilon _{\mathrm {ap}}}$ . We conclude that the analysis for $N^{6/25}\mathfrak {e}$ near the end of the proof of Lemma 9.5 estimates the second term on the LHS of the proposed bound (9.5) by the RHS of equation (9.5). We have now estimated both terms on the LHS of the proposed estimate (9.5), so we are done.
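For reference, the two comparisons against the $N^{1/4}$ loss from dropping $\mathfrak {t}^{1/4}$ in this proof reduce to elementary fractions. The tally below is our own bookkeeping of the exponents quoted in the text, for the $\bar {\mathfrak {l}}^{1/2}$ prefactor and for the $N^{6/25}$ prefactor, respectively: $$ \begin{align*} -\tfrac{89}{300}+\tfrac14=-\tfrac{14}{300}=-\tfrac{7}{150}<0, \qquad -\tfrac{13}{50}+\tfrac14=-\tfrac{1}{100}<0. \end{align*} $$ In both cases the gain from the smaller prefactor outweighs the loss, the second only narrowly.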
10 Boltzmann–Gibbs principle I – proof of Proposition 4.7, Case I
The organization of this section is similar to that of Section 9. We will present the main ingredients that we need in the proof of Proposition 4.7 in what we will define shortly as Case I and then deduce Case I from these ingredients. We then provide the proof of each of these ingredients, which, similar to Section 9, consist of a replacement by spatial average, a large-deviations-type cutoff for this spatial average, replacement by time average via multiscale analysis and a ‘final estimate’ for the time average. We decompose the proof of Proposition 4.7 into two cases. The first, which is the one of interest here, is the case where the index $\mathfrak {b}\in \mathbb Z_{\geqslant 0}$ in the supremum on the LHS of equation (4.8) is chosen so that $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\leqslant 1/4$ . In this case, our strategy follows basically that for the proof of Proposition 4.6, but it is technically easier since the $\mathsf {R}_{\delta }$ term we study in Proposition 4.7 admits a priori estimates:
Lemma 10.1. Consider $\delta \geqslant 0$ with $\delta +\varepsilon _{\mathrm {RN},1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$ . We have the following estimate for which we recall the notation (4.6) and in which $\mathsf {R}^{\mathrm {cut}}$ is explained afterwards:
We define ${\mathsf {R}_{\delta }^{\mathrm {cut}}(\tau _{y}\eta _{S})=\tau _{y}\mathsf {R}^{\mathrm {cut}}_{\delta }(\eta _{S})}$ , where $\mathsf {R}_{\delta }^{\mathrm {cut}}(\eta )$ has support contained in that of $\mathsf {R}_{\delta }(\eta )$ in (4.6). Moreover:
• We have the deterministic bound $|\mathsf {R}_{\delta }^{\mathrm {cut}}(\eta )|\lesssim N^{10\varepsilon _{\mathrm {ap}}}N^{-\delta }$ , where $10$ is just a large constant to be treated loosely.
• The term $\mathsf {R}_{\delta }^{\mathrm {cut}}(\eta )$ vanishes in expectation with respect to any canonical measure on its support; see Definition 4.4 .
It will be convenient for us to introduce the following notation that distinguishes the current Case I, namely restricting to $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ in the supremum on the LHS of equation (4.8). We also introduce notation for the $\mathsf {R}^{\mathrm {cut}}$ functionals relevant to Proposition 4.7.
Definition 10.2. Let us define $\mathfrak {b}_{\mathrm {mid}}\in \mathbb Z_{\geqslant 0}$ as the largest nonnegative integer for which $\varepsilon _{1}+\mathfrak {b}_{\mathrm {mid}}\varepsilon _{\mathrm {RN},1}\leqslant 1/4$ . In particular, observe that $\mathfrak {b}_{\mathrm {mid}}\leqslant \mathfrak {b}_{+}$ , where $\mathfrak {b}_{+}$ is defined in Proposition 4.7. We additionally define $\mathsf {R}^{\mathfrak {b}}_{S,y}={\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}^{\mathrm {cut}}(\tau _{y}\eta _{S})}$ that satisfies:
• The support length of $\mathsf {R}^{\mathfrak {b}}$ is order $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ as it has the same support as $\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}$ in equation (4.6).
• The functional $\mathsf {R}^{\mathfrak {b}}$ satisfies the deterministic estimate $|\mathsf {R}^{\mathfrak {b}}|\lesssim N^{10\varepsilon _{\mathrm {ap}}}N^{-\varepsilon _{1}-\mathfrak {b}\varepsilon _{\mathrm {RN},1}}$ by construction; see Lemma 10.1.
• Lastly, to ease notation, we will define and sometimes use $N^{\beta +\varepsilon _{\mathrm {RN},1}}\mathfrak {l}_{\beta ,\mathfrak {b}}=N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ , where $\beta =999^{-99}$ .
We clarify the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ as basically the length of the support of $\mathsf {R}^{\mathfrak {b}-1}$ , and thus basically of $\mathsf {R}^{\mathfrak {b}}$ up to ultimately negligible factors of $N^{\varepsilon _{\mathrm {RN},1}}$ , but a factor of $N^{-\beta }$ smaller. Actually, it will not be important to be so careful about $N^{\varepsilon _{\mathrm {ap}}}$ and $N^{\varepsilon _{\mathrm {RN},1}}$ factors, as $\beta $ is much larger than $\varepsilon _{\mathrm {ap}}$ and $\varepsilon _{\mathrm {RN},1}$ of Definition 3.1, so $N^{-\beta }$ factors will beat all relevant powers of $N^{\varepsilon _{\mathrm {ap}}}$ and $N^{\varepsilon _{\mathrm {RN},1}}$ .
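To make the relation between the two descriptions of $\mathfrak {l}_{\beta ,\mathfrak {b}}$ explicit, one may solve the defining identity in Definition 10.2 for the length scale. This is a direct rearrangement, written out as a sketch: $$ \begin{align*} \mathfrak{l}_{\beta,\mathfrak{b}} \ = \ N^{-\beta-\varepsilon_{\mathrm{RN},1}}N^{\varepsilon_{1}+\mathfrak{b}\varepsilon_{\mathrm{RN},1}+\varepsilon_{\mathrm{RN},1}} \ = \ N^{\varepsilon_{1}+\mathfrak{b}\varepsilon_{\mathrm{RN},1}-\beta} \ = \ N^{-\beta}\cdot N^{\varepsilon_{1}+(\mathfrak{b}-1)\varepsilon_{\mathrm{RN},1}+\varepsilon_{\mathrm{RN},1}}. \end{align*} $$ The last factor is the order of the support length of $\mathsf {R}^{\mathfrak {b}-1}$ by the first bullet point in Definition 10.2, and the middle expression agrees with the formula for $\mathfrak {l}_{\beta ,\mathfrak {b}}$ in Lemma 10.3.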
Outside a priori estimates for $\mathsf {R}^{\mathfrak {b}}$ -terms in Lemma 10.1 and Definition 10.2 that we did not have for the $\mathsf {S}$ -terms in the proof of Proposition 4.6, we emphasize the proof of Proposition 4.7 basically follows that of Proposition 4.6 except for a few technical differences whose impact on the proof can be readily checked. In particular, many estimates have the same flavor with only minor differences in power-counting that ultimately amount to elementary arithmetic.
10.0.1 Spatial average
The following result replaces $\mathsf {R}^{\mathfrak {b}}$ by spatial averages on length scales $\mathfrak {l}_{\beta ,\mathfrak {b}}$ and provides an analog of Lemma 9.1 but for $\mathsf {R}^{\mathfrak {b}}$ instead of $\mathsf {S}$ . We first emphasize a difference between the following result and Lemma 9.1. In Lemma 9.1, replacing $\mathsf {S}$ with a spatial average forces us to analyze explicitly the leading-order error term $N^{6/25}\mathfrak {e}$ . For the following result, we instead employ the a priori estimate for $\mathsf {R}^{\mathfrak {b}}$ in Lemma 10.1 to avoid this issue. In particular, Lemma 8.13 is enough.
Lemma 10.3. Define the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}=N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}-\beta }$ , where $\beta =999^{-99}$ . We have the following uniformly in $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ for which we recall the notation for differences of spatial averages on different length scales in Definition 8.12 :
Lemma 10.3 lets us replace $\mathsf {R}^{\mathfrak {b}}$ with its spatial average on length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ , which for clarity we recall is basically the support length of $\mathsf {R}^{\mathfrak {b}-1}$ times $N^{-\beta }$ for $\beta =999^{-99}$ . Analogous to Lemma 9.4, we now replace this spatial average by a cutoff that holds at a large deviations scale with respect to any canonical measure and therefore for general measures after space-time averaging courtesy of the local equilibrium reduction in Lemma 8.9. We first provide a technical comment: the local equilibrium reduction in Lemma 8.9 deteriorates as the support of the functional, in this case the length- $\mathfrak {l}_{\beta ,\mathfrak {b}}$ average of $\mathsf {R}^{\mathfrak {b}}$ , increases, thus as $\mathfrak {b}$ increases. However, it also improves as $\mathfrak {b}$ increases because, according to Lemma 10.1, a priori estimates for $\mathsf {R}^{\mathfrak {b}}$ also improve as $\mathfrak {b}$ increases; this is ultimately enough to counter the aforementioned deterioration. We apply this observation throughout this section. Otherwise, the proof of the following ‘cutoff replacement’ follows the general strategy for that of Lemma 9.4.
Lemma 10.4. Recall the operator $\widetilde {\mathfrak {I}}^{\mathbf {X}}=\mathfrak {I}^{\mathbf {X}}-\bar {\mathfrak {I}}^{\mathbf {X}}$ from Lemma 9.4 . We have the following estimate uniformly in $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ , in which we recall the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ and $\beta =999^{-99}$ that were both used in the statement of Lemma 10.3 :
10.0.2 Time average
Following Lemma 9.5, we will now replace the cutoff spatial average of $\mathsf {R}^{\mathfrak {b}}$ introduced in Lemma 10.4 by a time average on an appropriate mesoscopic timescale. However, we instead replace it by a time average with respect to a $\mathfrak {b}$ -dependent timescale that is shorter than the roughly $N^{-1}$ timescale used in the proof of Proposition 4.6. This last difference is technical, as we will see when we estimate time averages of $\mathsf {R}^{\mathfrak {b}}$ on this $\mathfrak {b}$ -dependent timescale. Before we state the following result, we first recall the transfer-of-timescale operator in Definition 8.15. Let us also make another technical comment: the Kipnis–Varadhan inequality for the equilibrium estimates in Lemma 8.11 deteriorates as the support of the functional $\mathsf {R}^{\mathfrak {b}}$ we are time averaging in Lemma 8.11 increases, in this case as the index $\mathfrak {b}$ increases. We will counter such deterioration with the improving a priori bound on $\mathsf {R}^{\mathfrak {b}}$ in Lemma 10.1. These competing factors basically cancel, so the proof of Lemma 9.5 holds almost verbatim.
Lemma 10.5. Provided $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ , consider $\mathfrak {j}_{+}\in \mathbb Z_{\geqslant 0}$ such that $\mathfrak {t}_{\mathfrak {j}_{+}}\in \mathbb {I}^{\mathbf {T},1}$ is the largest time in $\mathbb {I}^{\mathbf {T},1}$ satisfying $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1+\beta }\mathfrak {l}_{\beta ,\mathfrak {b}}^{-1}$ , where $\mathfrak {l}_{\beta ,\mathfrak {b}}$ and $\beta $ are both defined in Lemma 10.3 . As $\mathfrak {l}_{\beta ,\mathfrak {b}}\geqslant N^{\varepsilon _{1}-\beta }$ with $\varepsilon _{1}=\frac {1}{14}\geqslant 999\beta $ , we have $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1}$ and
10.0.3 Final estimates
We now estimate the time average of $\mathsf {R}^{\mathfrak {b}}$ uniformly in $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ on the timescale $\mathfrak {t}_{\mathfrak {j}_{+}}$ ‘reached’ with multiscale replacement in Lemma 10.5. This amounts to the analog below for Lemma 9.6. We apply the same remarks about the simultaneous deterioration and improvement of the estimates implied by Lemma 8.11 and Lemma 10.1 as the index $\mathfrak {b}$ increases. Otherwise, the proof of the following estimate is basically that of Lemma 9.6.
Lemma 10.6. Consider any $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ and the corresponding timescale $\mathfrak {t}_{\mathfrak {j}_{+}}$ from Lemma 10.5 . Uniformly in $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ , we have
Let us now prove Proposition 4.7 in Case I, where the index $\mathfrak {b}\in \mathbb Z_{\geqslant 0}$ in the supremum on the LHS of equation (4.8) satisfies $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ . We make the following replacements to the LHS of equation (4.8) and cite results in this section that control errors in such replacements.
• Lemma 10.1 lets us replace $\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}$ on the LHS of equation (4.8) with $\mathsf {R}^{\mathfrak {b}}$ with clearly controllable error. Indeed, Lemma 10.1 is applicable for $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ as by definition of $\mathfrak {b}_{\mathrm {mid}}$ in Definition 10.2, we have $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\leqslant 1/4\leqslant 1/2+\varepsilon _{\mathrm {RN}}$ if $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ .
• We now replace $\mathsf {R}^{\mathfrak {b}}$ by $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ with $\mathfrak {l}_{\beta ,\mathfrak {b}}$ in Lemma 10.3. Lemma 10.3 controls the error by a universal negative power of N as $\beta =999^{-99}$ is much larger than $\varepsilon _{\mathrm {ap}},\varepsilon _{\mathrm {RN},1}$ .
• We now replace $\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ by $\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ . The error is controlled by a universal negative power of N by Lemma 10.4.
• Replace $\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ by $\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}_{+}}}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ with $\mathfrak {t}_{\mathfrak {j}_{+}}$ in Lemma 10.5. The error, by Lemma 10.5, is a universal negative power of N.
• We now apply Lemma 10.6 to estimate the resulting heat operator acting on $N^{1/2}\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}_{+}}}\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})\mathbf {Y}^{N}$ .
Combining the previous bullet points with the triangle inequality for $\|\|_{1;\mathbb {T}_{N}}$ and $\mathbf {E}$ completes the proof.
Proof of Lemma 10.1 .
We first extend the $\mathsf {E}^{\mathrm {can}}$ -expectations in Definition 4.5 to any functional $\mathfrak {f}$ instead of just $\bar {\mathfrak {q}}$ . Let us observe the following quantity vanishes under expectation with respect to any canonical ensemble on its support. We clarify/emphasize the second term below is an expectation of the functional $\mathfrak {f}_{0,0}=\mathfrak {f}$ with respect to the canonical measure on with parameter equal to the $\eta $ -density $\sigma _{\delta +\varepsilon _{\mathrm {RN},1},S,y}$ on this set at time S; see Definition 4.5. We also clarify that we require $\mathfrak {f}$ to be supported in a uniformly bounded neighborhood to the left of $0\in \mathbb {T}_{N}$ like $\bar {\mathfrak {q}}$ , and thus contained in . This implies the support of equation (10.6) is contained in that of $\mathsf {R}_{\delta }(\tau _{y}\eta )$ of equation (4.6).
Vanishing of equation (10.6) by canonical measure expectation follows by tower property of conditional expectation and that the projection of any canonical measure on any larger set onto any smaller subset is a convex combination of canonical measures; see the proof of Lemma 2 in [Reference Goncalves and Jara24]. We eventually take $\mathsf {R}^{\mathrm {cut}}$ to be a quantity of the form (10.6) in which $\mathfrak {f}$ admits the deterministic upper bound required for $\mathsf {R}^{\mathrm {cut}}$ . To this end, we first recall $\mathsf {R}$ in equation (4.6). Again, by the projection property for canonical measures, now combined with the tower property for expectation, we get the following with notation explained afterwards:
• Let $\sigma (\delta )$ be a random $\eta $ -density on . Its law is given by that of the $\eta $ -density on $y+\mathbb {I}^{\delta }$ according to the measure defining the expectation $\mathsf {E}^{\mathrm {can}}_{\delta +\varepsilon _{\mathrm {RN},1}}(\tau _{y}\eta _{S})$ . (Note $y+\mathbb {I}^{\delta }$ is the subset that the canonical measure in $\mathsf {E}^{\mathrm {can}}_{\delta }(\tau _{y}\eta _{S})$ is defined on.)
• Define $\mathfrak {f}_{1}=\mathsf {E}^{\mathrm {can}}_{\delta ,\sigma (\delta )}(\bar {\mathfrak {q}})$ to be the canonical measure expectation of $\bar {\mathfrak {q}}$ with respect to the $\sigma (\delta )$ -canonical measure on $y+\mathbb {I}^{\delta }$ .
• Observe that $\mathfrak {f}_{1}$ is a functional of the particle system, and it depends only on the random variable/ $\eta $ -density $\sigma (\delta )$ from the first bullet point. In particular, the second term/iterated $\mathsf {E}$ -expectation is an expectation of $\mathfrak {f}_{1}$ , where the randomness in the inside expectation is now through the random $\eta $ -density $\sigma (\delta )$ that is sampled with respect to the canonical measure defining the outer expectation on the RHS of equation (10.7). So, we get equation (10.7) by first conditioning $\bar {\mathfrak {q}}$ in the $\mathsf {E}^{\mathrm {can}}_{\delta +\varepsilon _{\mathrm {RN},1}}(\tau _{y}\eta _{S})$ -expectation in (4.6) on $\sigma (\delta )$ ; again, we emphasize canonical measures project to canonical measures, so taking said expectation conditioning on $\sigma (\delta )$ leads to canonical measure expectation of $\bar {\mathfrak {q}}$ with parameter $\sigma (\delta )$ on its defining set $y+\mathbb {I}^{\delta }$ . This is what equation (10.7) says. We clarify that $\mathsf {E}^{\mathrm {can}}_{\delta }(\tau _{y}\eta _{S})$ and $\mathsf {E}^{\mathrm {can}}_{\delta ,\sigma (\delta )}(\bar {\mathfrak {q}})$ are the same function, but the former is evaluated at $\sigma _{\delta ,S,y}$ , that is, the scale- $N^{\delta }$ density of the actual particle system at time S and point y, and the latter is evaluated at the random $\sigma (\delta )$ sampled via $\mathsf {E}^{\mathrm {can}}_{\delta +\varepsilon _{\mathrm {RN},1}}(\tau _{y}\eta _{S})$ as explained.
We now make the following observation, which implies that it suffices to provide a priori estimates for $\mathbb {I}^{\delta }$ expectations.
-
• Suppose $|{\mathsf {E}^{\mathrm {can}}_{\delta }(\tau _{y}\eta _{S})}|+|\mathsf {E}^{\mathrm {can}}_{\delta ,\sigma (\delta )}(\bar {\mathfrak {q}})|\leqslant N^{10\varepsilon _{\mathrm {ap}}}N^{-\delta }$ . This is not necessarily true; we will show it is sufficiently close to true.
-
• The previous bullet point would finish the proof, since equation (10.7) provides a representation of $\mathsf {R}_{\delta }(\eta )$ as equation (10.6) for $\mathfrak {f}= \mathsf {E}^{\mathrm {can}}_{\delta ,\sigma (\delta )}(\bar {\mathfrak {q}})$ , which would certainly be $\mathrm {O}(N^{10\varepsilon _{\mathrm {ap}}}N^{-\delta })$ if the previous bullet point were true.
In view of the previous bullet points, it suffices to make the following replacements in equation (10.7), provided that $\mathbf {Y}^{N}\neq 0$ , for which we first establish convenient notation $\mathsf {C}_{\alpha }(a)=a\mathbf {1}(|a|\leqslant \alpha )$ for a cutoff operator/map where $a\in \mathbb R$ is any real number; note that the replacements below do not change the support of any term in equation (10.7):
Indeed, whenever $\mathbf {Y}^{N}=0$ , then the proposed estimate in Lemma 10.1 is trivial. Moreover, defining $\mathsf {R}^{\mathrm {cut}}$ to be the RHS of equation (10.7) but with the replacements in equation (10.8), the previous two bullet points would imply that $\mathsf {R}^{\mathrm {cut}}$ satisfies the proposed pair of properties claimed in the lemma. We now show the replacements (10.8) to the RHS of equation (10.7), whenever $\mathbf {Y}^{N}\neq 0$ , only provide error that, after multiplication by $\mathbf {Y}^{N}$ , is controlled by the RHS of the proposed estimate.
-
• The first replacement in equation (10.8) gives no error whenever $\mathbf {Y}^{N}\neq 0$ . Indeed, Lemma 4.10 still holds with $\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}$ therein replaced with $\delta $ here, because all we require for Lemma 4.10 is a priori spatial regularity of $\mathbf {Y}^{N}$ on the length scale $N^{\delta }\leqslant \mathfrak {l}_{N}$ if $\delta +\varepsilon _{\mathrm {RN},1}\leqslant \frac 12+\varepsilon _{\mathrm {RN}}$ , which holds by assumption here; see Definition 3.1 for $\mathfrak {l}_{N}$ . Thus, we may assume the canonical measure parameter $\sigma _{\delta +\varepsilon _{\mathrm {RN},1},S,y}$ defining $\mathsf {E}^{\mathrm {can}}_{\delta }(\tau _{y}\eta _{S})$ in equation (10.7) is at most $N^{5\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ , from which we show the first replacement in equation (10.8) does nothing by following the proof of Proposition 4.8.
-
• We move to the second replacement in equation (10.8) applied to equation (10.7). Similar to the previous bullet point, for the outer expectation in the second term on the RHS of equation (10.7), we know that its canonical measure parameter satisfies $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}\leqslant N^{4\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ ; combining the previous bullet point with this most recent observation gives
(10.9) $$ \begin{align} {\mathsf{R}_{\delta}(\tau_{y}\eta_{S})}\mathbf{Y}_{S,y}^{N} \ &= \ \mathbf{1}(|\sigma_{\delta+\varepsilon_{\mathrm{RN},1},S,y}|\leqslant N^{5\varepsilon_{\mathrm{ap}}}N^{-\delta/2}){\mathsf{E}_{\delta}^{\mathrm{can}}(\tau_{y}\eta_{S})}\mathbf{Y}_{S,y}^{N}\end{align} $$
(10.10) $$ \begin{align}&- \mathbf{1}(|\sigma_{\delta+2\varepsilon_{\mathrm{RN},1},S,y}|\leqslant N^{4\varepsilon_{\mathrm{ap}}}N^{-\delta/2}){\mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S}; \mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}}))}\mathbf{Y}_{S,y}^{N}. \end{align} $$
We clarify we have $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}\leqslant N^{4\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ instead of $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}\leqslant N^{5\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ because the extra $N^{\varepsilon _{\mathrm {ap}}}$ produced in the proof of Proposition 4.8 and Lemma 4.10 comes from applying the bound $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ in the proof of Lemma 4.10 that we do not yet need because we are not bounding $\mathbf {Y}^{N}$ to get the previous display.
-
• A digression. Fix $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}$ , and consider the canonical measure on its support with this parameter, namely the measure defining $\mathsf {E}^{\mathrm {can}}_{\delta +\varepsilon _{\mathrm {RN},1}}(\tau _{y}\eta _{S})$ . If we instead consider the corresponding grand-canonical measure, then $\sigma (\delta )$ would be an average of $|\mathbb {I}^{\delta }|=N^{\delta +\varepsilon _{\mathrm {RN},1}}$ -many independent Bernoulli random variables with expectation $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}$ . Concentration inequalities would then imply $|\sigma (\delta )-\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}|\geqslant N^{\varepsilon _{\mathrm {ap}}}\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}$ happens with exponentially small probability in N. This can be seen by viewing $\sigma (\delta )$ as a random walk indexed by $\mathbb {I}^{\delta }$ with drift $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}$ . The only difference in this discussion if we look at the canonical ensemble instead of grand-canonical ensemble is that $\sigma (\delta )$ is the length- $|\mathbb {I}^{\delta }|$ -increment of a random walk bridge with drift $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}$ , for which subexponential concentration inequalities are also readily available.
-
• Given that $\sigma _{\delta +2\varepsilon _{\mathrm {RN},1},S,y}\leqslant N^{4\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ , we note that the $\sigma (\delta )$ density on the smaller subset $\mathbb {I}^{\delta }$ , inside the previous display, is bounded by $N^{5\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ with overwhelming probability; see Definition 3.9. Indeed, by thinking of $\sigma (\delta )$ as an $\mathbb {I}^{\delta }$ -indexed increment of a random walk bridge with drift $N^{4\varepsilon _{\mathrm {ap}}}$ , standard sub-Gaussian concentration inequalities for random walk bridges show that $\sigma (\delta )$ deviates from its normalized drift $N^{4\varepsilon _{\mathrm {ap}}}N^{-\delta /2}$ plus its Brownian-type fluctuation $N^{-\delta /2}$ by a factor of $N^{\varepsilon _{\mathrm {ap}}}$ with exponentially small probability in N, and therefore the claimed bound holds with overwhelming probability.
-
• We now have the following where $\mathcal {E}=\{|\sigma (\delta )|\leqslant N^{5\varepsilon _{\mathrm {ap}}}N^{-\delta /2}\}$ with complement $\mathcal {E}^{C}$ ; we explain these calculations after:
(10.11) $$ \begin{align} &\mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S};\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}})) \nonumber\\ &= \ \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}} (\tau_{y}\eta_{S};\mathbf{1}_{\mathcal{E}}\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}})) + \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}} (\tau_{y}\eta_{S};\mathbf{1}_{\mathcal{E}^{C}}\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}})) \end{align} $$
(10.12) $$ \begin{align} &= \ \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S};\mathbf{1}_{\mathcal{E}}\mathsf{C}_{N^{10\varepsilon_{\mathrm{ap}}}N^{-\delta}}(\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}}))) + \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S};\mathbf{1}_{\mathcal{E}^{C}}\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}})) \end{align} $$
(10.13) $$ \begin{align} &= \ \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S};\mathsf{C}_{N^{10\varepsilon_{\mathrm{ap}}}N^{-\delta}}(\mathsf{E}_{\delta,\sigma(\delta)}^{\mathrm{can}}(\bar{\mathfrak{q}}))) + \mathsf{E}_{\delta+\varepsilon_{\mathrm{RN},1}}^{\mathrm{can}}(\tau_{y}\eta_{S};\mathrm{O}(\mathbf{1}_{\mathcal{E}^{C}})). \end{align} $$
The first line (10.11) is trivial: $1=\mathbf {1}_{\mathcal {E}}+\mathbf {1}_{\mathcal {E}^{C}}$ . The second line (10.12) follows by the argument in the first bullet point in the current list. The third line (10.13) follows again by writing $\mathbf {1}_{\mathcal {E}}=1-\mathbf {1}_{\mathcal {E}^{C}}$ . If we plug the second term in equation (10.13) into equation (10.10), by the previous bullet point and that $|\bar {\mathfrak {q}}|\lesssim 1$ , we get $\mathrm {O}(N^{-200})$ , which is controlled by the RHS of the proposed estimate.
-
• As noted in the current bullet point list prior to equations (10.9) and (10.10), we can now drop the indicator functions in (10.9) and (10.10), so the error in making the replacements (10.8) in (10.7) is also appropriately controlled by the RHS of the proposed estimate.
Defining $\mathsf {R}^{\mathrm {cut}}$ to be the RHS of equation (10.7) but with replacements (10.8) for the RHS of equation (10.7), the previous bullet points provide the proposed estimate for $\mathbf {Y}^{N}\neq 0$ , whereas the estimate is trivial if $\mathbf {Y}^{N}=0$ . Moreover, as noted prior to equation (10.8), the $\mathsf {R}^{\mathrm {cut}}$ functional satisfies all the required properties in the statement of the lemma, so we are done.
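The concentration step used in the digression above can be illustrated numerically. The following is a minimal sketch, not part of the argument: it assumes $\pm 1$-valued spins, identifies the canonical measure on a block with the uniform measure on configurations having a fixed number of $+1$ sites, and uses arbitrary small block sizes; the function names are hypothetical. It samples the density on a sub-block and records the largest deviation from the fixed density on the big block, which one expects to be of the order of the inverse square root of the sub-block length.

```python
import random


def canonical_sample(block_len, n_plus, rng):
    """Uniformly random +/-1 configuration with exactly n_plus sites equal to +1.

    Assumption: this uniform measure stands in for the canonical measure
    on a block of length block_len.
    """
    config = [1] * n_plus + [-1] * (block_len - n_plus)
    rng.shuffle(config)
    return config


def density(config):
    """Average spin of a configuration."""
    return sum(config) / len(config)


def worst_sub_block_deviation(block_len, sub_len, n_plus, trials, seed=0):
    """Largest |sigma_sub - sigma_big| over independent canonical samples.

    sigma_big is the fixed density on the big block; sigma_sub is the density
    on the first sub_len sites, which play the role of the smaller block.
    """
    rng = random.Random(seed)
    sigma_big = (2 * n_plus - block_len) / block_len
    worst = 0.0
    for _ in range(trials):
        config = canonical_sample(block_len, n_plus, rng)
        worst = max(worst, abs(density(config[:sub_len]) - sigma_big))
    return sigma_big, worst


if __name__ == "__main__":
    # Illustrative sizes only; they are not the scales used in the paper.
    sigma_big, worst = worst_sub_block_deviation(4096, 256, 2048, trials=200)
    print("big-block density:", sigma_big)
    print("worst sub-block deviation:", worst)
    print("reference scale sub_len ** -0.5:", 256 ** -0.5)
```

Sampling the first `sub_len` sites of a shuffled configuration is the random walk bridge picture from the digression: the partial sums form a walk conditioned on its endpoint.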
Proof of Lemma 10.3 .
We apply Lemma 8.13 with the choices $\mathfrak {t}=0$ and $\mathfrak {f}=N^{1/2}\mathsf {R}^{\mathfrak {b}}$ and $\mathfrak {l}=\mathfrak {l}_{\beta ,\mathfrak {b}}$ . Along with a few other gymnastics and conditions that need to be checked, which we explain shortly, this ultimately provides the following estimate; we first clarify that the prefactor $N^{5\varepsilon _{\mathrm {ap}}}\mathfrak {l}_{\beta ,\mathfrak {b}}^{1/2}N^{\frac 12\varepsilon _{1}+\frac 12\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\frac 12\varepsilon _{\mathrm {RN},1}}$ for the second term on the RHS of equation (10.14) comes from noting $\bar {\mathfrak {l}}$ in Lemma 8.13 is the product of $\mathfrak {l}=\mathfrak {l}_{\beta ,\mathfrak {b}}$ and the support length of $\mathfrak {f}=N^{1/2}\mathsf {R}^{\mathfrak {b}}$ , which is given in Definition 10.2:
Lemma 8.13 with the previous choices yields equation (10.14) with no changes to the first term but with $\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }$ in the second term on the RHS of equation (10.14) replaced by $\mathbf {E}\|\mathbf {H}^{N}(|\mathsf {R}^{\mathfrak {b}}|)\|_{1;\mathbb {T}_{N}}$ . But $\mathbf {E}\|\mathbf {H}^{N}(|\mathsf {R}^{\mathfrak {b}}|)\|_{1;\mathbb {T}_{N}}\leqslant \|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }$ . Lastly, Lemma 8.13 with our choices of $\mathfrak {f}$ and $\mathfrak {l}$ may only be applied if the support length of $\mathfrak {f}=N^{1/2}\mathsf {R}^{\mathfrak {b}}$ times $\mathfrak {l}=\mathfrak {l}_{\beta ,\mathfrak {b}}$ is at most $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ from Definition 3.1. This follows since $\mathfrak {l}_{\beta ,\mathfrak {b}}$ is at most $N^{-\beta }$ times the support length of $\mathsf {R}^{\mathfrak {b}}$ , while the constraint $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ guarantees the square of the support length of $\mathsf {R}^{\mathfrak {b}}$ is at most $N^{1/2+2\varepsilon _{\mathrm {RN},1}}\leqslant \mathfrak {l}_{N}$ by construction in Definition 10.2; for the last bound concerning $\mathfrak {l}_{N}$ , we recall $\mathfrak {l}_{N}$ in Definition 3.1 and note that by Definition 10.2, we clearly have the inequality $100\varepsilon _{\mathrm {RN},1}\leqslant \varepsilon _{\mathrm {RN}}$ . It now suffices to plug in $\mathfrak {l}_{\beta ,\mathfrak {b}}$ and $\mathsf {R}^{\mathfrak {b}}$ bounds in Definition 10.2; note $|\mathsf {R}^{\mathfrak {b}}|\lesssim N^{-\varepsilon _{1}+10\varepsilon _{\mathrm {ap}}}=N^{-1/14+10\varepsilon _{\mathrm {ap}}}$ , and $\beta \geqslant 999\varepsilon _{\mathrm {RN}}+999\varepsilon _{\mathrm {RN},1}$ .
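The support-length condition checked in this proof can be summarized in one chain of inequalities. The display below only restates the bounds quoted above; the shorthand $|\mathrm{supp}\,\mathsf{R}^{\mathfrak{b}}|$ for the support length of $\mathsf{R}^{\mathfrak{b}}$ is introduced here purely for illustration.

$$ \begin{aligned} |\mathrm{supp}\,\mathsf{R}^{\mathfrak{b}}|\cdot\mathfrak{l}_{\beta,\mathfrak{b}} \ &\leqslant \ N^{-\beta}|\mathrm{supp}\,\mathsf{R}^{\mathfrak{b}}|^{2} \\ &\leqslant \ N^{-\beta}N^{1/2+2\varepsilon_{\mathrm{RN},1}} \\ &\leqslant \ N^{1/2+\varepsilon_{\mathrm{RN}}} \ = \ \mathfrak{l}_{N}. \end{aligned} $$

The last step uses $\beta \geqslant 0$ and $2\varepsilon_{\mathrm{RN},1}\leqslant 100\varepsilon_{\mathrm{RN},1}\leqslant \varepsilon_{\mathrm{RN}}$.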
Proof of Lemma 10.4 .
We will follow the proof of Lemma 9.4. In particular, it suffices to copy and paste that argument except we formally replace $\mathsf {S}_{\varepsilon _{1}}$ with $\mathsf {R}^{\mathfrak {b}}_{S,y}$ and $\mathfrak {l}_{1}$ with $\mathfrak {l}_{\beta ,\mathfrak {b}}$ . This has the following effects on that argument and its proofs.
-
• The bounds (9.11) and (9.12) still hold as written after the aforementioned replacements. The estimate (9.13) also ‘almost’ holds because the application of Lemma 8.9 used to obtain it, even after the aforementioned replacement, is still valid since $\mathsf {R}^{\mathfrak {b}}$ is still uniformly bounded. However, the first term on the RHS of equation (9.13) must be adjusted as the support length of the ‘new’ functional $\mathsf {R}^{\mathfrak {b}}$ is no longer order $N^{6/25}$ , whose cube is present in the first term on the RHS of equation (9.13). Recalling from Definition 10.2 that for $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ the support length of $\mathsf {R}^{\mathfrak {b}}$ is of order at most $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}\leqslant N^{1/3}$ , after the replacement $\mathsf {S}\to \mathsf {R}^{\mathfrak {b}}$ the first term on the RHS of equation (9.13) has its $N^{18/25}$ factor replaced by N, which controls the cube of the support length of $\mathsf {R}^{\mathfrak {b}}$ .
-
• The estimate (9.14) still holds after the replacement of $\mathfrak {l}_{1}$ by $\mathfrak {l}_{\beta ,\mathfrak {b}}$ and of $\mathsf {S}$ by $\mathsf {R}^{\mathfrak {b}}$ , as $|\mathsf {R}^{\mathfrak {b}}|\lesssim 1$ and $\mathsf {R}^{\mathfrak {b}}$ also vanishes in expectation with respect to any canonical measure on its support. In particular, Lemma 8.10 holds with $\mathfrak {f}_{\mathfrak {j}}$ equal to spatial shifts of $\mathsf {R}^{\mathfrak {b}}$ with mutually disjoint supports and with $\mathfrak {J}=\mathfrak {l}_{\beta ,\mathfrak {b}}$ .
We deduce the claim from directly following the proof of Lemma 9.4, at least upon checking that the replacement of $N^{18/25}$ with N on the RHS of equation (9.13) still makes the contribution of the first term on the RHS of equation (9.13), after plugging into equations (9.11) and (9.12), controlled by the RHS of the proposed estimate (10.3). This follows by elementary power-counting, so we are done.
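For orientation, the exponent bookkeeping behind the adjusted first term of equation (9.13) is the following comparison of cubes of support lengths, using only the exponents quoted in the first bullet point above:

$$ \begin{aligned} (N^{6/25})^{3} \ &= \ N^{18/25}, \\ (N^{\varepsilon_{1}+\mathfrak{b}\varepsilon_{\mathrm{RN},1}+\varepsilon_{\mathrm{RN},1}})^{3} \ &\leqslant \ (N^{1/3})^{3} \ = \ N \quad \text{for } \mathfrak{b}\leqslant\mathfrak{b}_{\mathrm{mid}}. \end{aligned} $$

This only confirms that the factor $N$ dominates the cube of the support length of $\mathsf{R}^{\mathfrak{b}}$; the remaining power-counting against the RHS of (10.3) depends on the other factors in equations (9.11)-(9.13).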
Proof of Lemma 10.5 .
We follow the proof of Lemma 9.5. Observe $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1}$ . Indeed, $\mathfrak {l}_{\beta ,\mathfrak {b}}$ is $N^{-\beta }$ times the support length of $\mathsf {R}^{\mathfrak {b}-1}$ , and the support length of $\mathsf {R}^{\mathfrak {b}-1}$ is $\gtrsim N^{-10\varepsilon _{\mathrm {RN},1}}N^{\varepsilon _{1}}$ . As $\varepsilon _{\mathrm {RN},1}$ and $\beta $ are much smaller than $\varepsilon _{1}=1/14$ from the statement of Proposition 4.6, the factor of $N^{-\varepsilon _{1}}$ beats powers of $N^{\beta }$ and $N^{\varepsilon _{\mathrm {RN},1}}$ , and by inspecting the definition in the statement of Lemma 10.5, we deduce the proposed timescale upper bound. Also, throughout the following proof, whenever we appeal to the proof of Lemma 9.5, we replace $\mathfrak {l}_{1}$ by $\mathfrak {l}_{\beta ,\mathfrak {b}}$ from Definition 10.2.
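The timescale bound can be made explicit. As a sketch, assume the upper bound $\mathfrak{t}_{\mathfrak{j}_{+}}\leqslant N^{-1+\beta}\mathfrak{l}_{\beta,\mathfrak{b}}^{-1}$ from the statement of Lemma 10.5 and the lower bound $\mathfrak{l}_{\beta,\mathfrak{b}}\geqslant N^{\varepsilon_{1}-\beta}$ from Definition 10.2, both of which are recalled in the proof of Lemma 10.6. Then

$$ \begin{aligned} \mathfrak{t}_{\mathfrak{j}_{+}} \ \leqslant \ N^{-1+\beta}\mathfrak{l}_{\beta,\mathfrak{b}}^{-1} \ \leqslant \ N^{-1+\beta}N^{-\varepsilon_{1}+\beta} \ = \ N^{-1-\varepsilon_{1}+2\beta} \ \leqslant \ N^{-1}, \end{aligned} $$

where the last inequality holds as soon as $2\beta\leqslant\varepsilon_{1}=1/14$, which is the case for $\beta=999^{-99}$.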
We directly follow the argument in the proof of Lemma 9.5 preceding equation (9.19) but now with cutoff spatial averages of $\mathsf {R}^{\mathfrak {b}}$ rather than $\mathsf {S}$ . Because $|\mathsf {R}^{\mathfrak {b}}|\lesssim 1$ and $|\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})|\lesssim N^{-\alpha }$ for $\alpha \gtrsim 1$ , we obtain equation (9.16) but with $\mathfrak {f}=\mathsf {R}^{\mathfrak {b}}$ instead of $\mathfrak {f}=\mathsf {S}$ and with $\alpha>0$ universal instead of $1/12$ on the far RHS of equation (9.16). Thus, it ultimately suffices to control from above the following quantity that is analogous to equation (9.19) uniformly in the indices $\mathfrak {j}<\mathfrak {j}_{+}$ of interest:
The estimate in equation (10.15) follows from applying Lemma 8.2 as equation (9.19) did but now with $\mathsf {S}$ replaced by $\mathsf {R}^{\mathfrak {b}}$ and $\mathfrak {l}_{1}$ by $\mathfrak {l}_{\beta ,\mathfrak {b}}$ . Following the paragraph after equation (9.19) and prior to equation (9.20), because of the first paragraph in this proof it suffices to estimate the RHS of equation (10.15) for all $N^{-2}\lesssim \mathfrak {t}_{\mathfrak {j}}\leqslant N^{-1}$ , which by construction in Definition 3.1 means $\mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1+\varepsilon _{\mathrm {ap}}}$ . To this end, observe equation (9.20) holds with the replacement $\mathsf {S}\to \mathsf {R}^{\mathfrak {b}}$ except $\mathrm {Loc}$ therein is now with respect to length scale $\mathfrak {l}_{\mathrm {tot}}$ from Definition 8.3/Lemma 8.8, which is also taken with $\gamma _{0}=\varepsilon _{\mathrm {ap}}$ , with the choice $\mathfrak {l}$ equal to the support length of $\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ , which is $\mathfrak {l}_{\beta ,\mathfrak {b}}$ times the support length of $\mathsf {R}^{\mathfrak {b}}$ written in Definition 10.2, and with $\mathfrak {l}_{\mathrm {av}}=1$ because the spatial-averaging scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ is already built into $\mathfrak {l}$ . The effect of this distinction in $\mathfrak {l}_{\mathrm {tot}}$ will be given shortly. For clarity, let us record this estimate below, which we reference shortly:
Similar to the second term on the RHS of equation (9.20), the second term on the RHS of equation (10.16) has contribution ultimately controlled by the RHS of the proposed estimate. To study the first term on the RHS of equation (10.16), we use Lemma 8.9 as with the first term on the RHS of equation (9.20) from the proof of Lemma 9.5. We will make the same choices for inputs/ingredients for Lemma 8.9 as we made to analyze the first term on the RHS of equation (9.20), except with the following adjustment that takes into consideration the different a priori estimates we have on the cutoff spatial average $\bar {\mathfrak {I}}^{\mathbf {X}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}(\mathsf {R}^{\mathfrak {b}})$ as opposed to $\mathsf {S}_{\varepsilon _{1}}$ .
-
• First, let us clarify we choose $\mathfrak {h}$ to be the $\mathbf {E}^{\mathrm {dyn}}$ -term on the RHS of equation (10.16), so with $\mathsf {R}^{\mathfrak {b}}$ and not $\mathsf {S}$ .
-
• We choose $\kappa =N^{-3\varepsilon _{\mathrm {ap}}/2}\mathfrak {l}_{\beta ,\mathfrak {b}}^{3/4}N^{-15\varepsilon _{\mathrm {ap}}+3\varepsilon _{1}/2+3\mathfrak {b}\varepsilon _{\mathrm {RN},1}/2}$ for the $\kappa $ constant in the statement of Lemma 8.9. As $\kappa |\bar {\mathfrak {I}}_{\mathfrak {l}_{\beta ,\mathfrak {b}}}^{\mathbf {X}}(\mathsf {R}^{\mathfrak {b}})|\lesssim 1$ , this choice of $\kappa $ is compatible with our choice of $\mathfrak {h}$ , so our application of Lemma 8.9 with these choices is legal.
Similar to the proof of Lemma 9.5 and bounds on the first term on the RHS of equation (9.20), Lemma 8.9 bounds the first term on the RHS of equation (10.16) in terms of two quantities. The first of these two terms is the far LHS of equation (9.21), which is ultimately negligible even after replacing $N^{\varepsilon _{1}}\mathfrak {l}_{1}$ in equation (9.21) with our new choice of $\mathfrak {l}_{\mathrm {tot}}$ adapted to the support length of $\mathsf {R}^{\mathfrak {b}}$ and the spatial-average length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ . Recall from Lemma 8.8 that $\mathfrak {l}_{\mathrm {tot}}$ is bounded by the spatial-average length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ times the support length $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ of $\mathsf {R}^{\mathfrak {b}}$ given in Definition 10.2. With the new choice for $\kappa $ made in the bullet point list above, we deduce that the first upper-bound term for the first term on the RHS of equation (10.16), namely the far LHS of equation (9.21) after replacing $N^{\varepsilon _{1}}\mathfrak {l}_{1}$ with $\mathfrak {l}_{\mathrm {tot}}$ , is ultimately controlled by the RHS of the proposed estimate (10.4). This can be verified with an elementary power-counting after plugging into the middle of equation (9.21) our choice of $\kappa $ in the bullet points above and replacing $N^{\varepsilon _{1}}\mathfrak {l}_{1}$ in equation (9.21) by our new $\mathfrak {l}_{\mathrm {tot}}$ . In particular, if $|\mathbb {B}|$ denotes the support length of $\mathbf {E}^{\mathrm {dyn}}$ , we have the following estimate that is analogous to equation (9.21):
Recalling $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\leqslant 1/4$ and $\mathfrak {l}_{\beta ,\mathfrak {b}}\leqslant N^{1/4+\beta }$ if $\mathfrak {b}\leqslant \mathfrak {b}_{\mathrm {mid}}$ by construction in Definition 10.2, the contribution of the RHS of equation (10.17), after plugging into equation (10.16) and taking its $2/3$ -power in (10.15), is controlled by the RHS of the proposed estimate (10.4) as we also have $\mathfrak {t}_{\mathfrak {j}}\leqslant N^{-1}$ and $\mathfrak {t}_{\mathfrak {j}+1}\leqslant N^{-1+\varepsilon _{\mathrm {ap}}}$ , the first noted in the first paragraph of this proof and the latter by Definition 3.1.
We move to the second upper-bound term for the first term on the RHS of equation (10.16) that results from our application of Lemma 8.9. This is the $\Phi $ -term in equation (9.22) except $\mathsf {S}\to \mathsf {R}^{\mathfrak {b}}$ and $\mathfrak {l}_{1}\to \mathfrak {l}_{\beta ,\mathfrak {b}}$ . In particular, everything until/before equation (9.27) and after equation (9.22) holds with the replacements $\mathsf {S}\to \mathsf {R}^{\mathfrak {b}}$ and $\mathfrak {l}_{1}\to \mathfrak {l}_{\beta ,\mathfrak {b}}$ ; indeed, $\mathsf {R}^{\mathfrak {b}}$ is uniformly bounded and vanishes in expectation with respect to any canonical measure on its support, so Lemma 8.10 applies to $\mathsf {R}^{\mathfrak {b}}$ and averages of its spatial translates. However, the estimate (9.27) must be modified to account for the new/longer support length of $\mathsf {R}^{\mathfrak {b}}$ as well as the spatial-average length scale in the $\bar {\mathfrak {I}}^{\mathbf {X}}$ -term on the RHS of equation (10.16) and the improved a priori deterministic estimates on $\mathsf {R}^{\mathfrak {b}}$ . In particular, by Lemma 8.11 but with the choice of $\mathfrak {f}=\mathsf {R}^{\mathfrak {b}}$ , we have the following estimate similar to how equation (9.27) was derived; we justify the following estimate afterwards:
In contrast to equation (9.27), the $N^{2/14}$ -factor therein is replaced by the square of the support length of $\mathsf {R}^{\mathfrak {b}}$ that is order $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ as written in Definition 10.2. Moreover, the spatial-average length scale $\mathfrak {l}_{1}$ in equation (9.27) is replaced by the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ . Lastly, we included the $\|\|_{\omega ;\infty }$ -factor in Lemma 8.11 in our estimate (10.18), which we did not do in equation (9.27). Recalling now the $\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }$ -estimate in Definition 10.2 and $\mathfrak {t}_{\mathfrak {j}}\geqslant N^{-2}$ and $\mathfrak {l}_{\beta ,\mathfrak {b}}\geqslant N^{\varepsilon _{1}-\beta }$ with $\varepsilon _{1}=1/14$ much larger than $\varepsilon _{\mathrm {ap}}$ and $\varepsilon _{\mathrm {RN},1}$ and $\beta $ , an elementary power-counting calculation shows the RHS of equation (10.18) is $\mathrm {O}(N^{-\alpha })$ for $\alpha>0$ universal. Therefore, as with the end of the proof of Lemma 9.5 after equation (9.27), we are done.
Proof of Lemma 10.6 .
Unlike the proof of Lemma 9.6, we will not need to introduce additional spatial averaging, so the proof of the current Lemma 10.6 is much simpler. We start via the following version of equation (10.15), which is just equation (10.15) but without prefactors and for the maximal timescale $\mathfrak {t}_{\mathfrak {j}_{+}}$ and with an additional $\mathbf {Y}^{N}$ -factor; we explain its quick proof/derivation afterwards:
Indeed, to prove equation (10.19), we recall $|\mathbf {Y}^{N}|\leqslant N^{\varepsilon _{\mathrm {ap}}}$ to forget $\mathbf {Y}^{N}$ on the LHS and apply Lemma 8.2 in the same way as we did to get equation (10.15) and deduce equation (10.19). For the RHS of equation (10.19), we have the following by Lemma 8.8 in the same way as we derived equation (10.16), in which the $\mathrm {Loc}$ term is, like in equation (10.16), also chosen in Definition 8.3/Lemma 8.8 with $\mathfrak {l}_{\mathrm {tot}}$ defined by $\mathfrak {l}_{\mathrm {av}}=1$ and $\mathfrak {l}$ equal to $\mathfrak {l}_{\beta ,\mathfrak {b}}$ times the support length of $\mathsf {R}^{\mathfrak {b}}$ , which we recall is explicitly written in Definition 10.2:
The contribution of the second term on the RHS of equation (10.20), after plugging it in the RHS of equation (10.19) and taking $2/3$ -powers, is bounded by the RHS of the proposed estimate (10.5). We now estimate the first term on the RHS of equation (10.20) via Lemma 8.9. In particular, let us apply Lemma 8.9 in the same way as we did in the proof of Lemma 10.5, namely with the same choices of $\mathfrak {h}$ and $\kappa $ therein. This estimates the first term on the RHS of equation (10.20) by two terms, just as in the proof of Lemma 10.5. The first of these, namely the first term on the RHS of equation (8.16), depends on the support of $\mathbf {E}^{\mathrm {dyn}}$ . It is ultimately controlled via the following, where $\mathbb {B}$ is the support of $\mathbf {E}^{\mathrm {dyn}}$ for which $\mathfrak {l}_{\mathrm {tot}}$ is explained prior to equation (10.20); we explain the estimates afterwards:
Let us recall the support length $|\mathbb {B}|$ of $\mathbf {E}^{\mathrm {dyn}}$ is given in the statement of Lemma 8.8, and this gives the first estimate in equation (10.21). The second estimate in equation (10.21) follows from elementary power-counting together with the choices we made for terms in equation (10.21), which we recall in the bullet points below. There, we refer back to Definition 10.2 and Lemma 10.3 and the proof of Lemma 10.5 for notation/constructions.
-
• Recall from the statement of Lemma 10.5 that the timescale in equation (10.21) satisfies the upper bound $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1+\beta }\mathfrak {l}_{\beta ,\mathfrak {b}}^{-1}$ .
-
• Recall from bullet points after equation (10.16) that $\kappa \gtrsim \mathfrak {l}_{\beta ,\mathfrak {b}}^{3/4}N^{3\varepsilon _{1}/2+3\mathfrak {b}\varepsilon _{\mathrm {RN},1}/2}N^{10\varepsilon _{\mathrm {ap}}+20\beta }\gtrsim \mathfrak {l}_{\beta ,\mathfrak {b}}^{9/4}$ .
-
• Third, note $\mathfrak {l}_{\beta ,\mathfrak {b}}\geqslant N^{\varepsilon _{1}-\beta }$ with $\varepsilon _{1}=1/14$ much bigger than $\beta $ for all $\mathfrak {b}\geqslant 0$ , which follows by construction in Definition 10.2.
-
• We clarify that $\mathfrak {l}_{\mathrm {tot}}$ is controlled by the product of the $\mathsf {R}^{\mathfrak {b}}$ -support length $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ and the spatial-averaging length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ , both of these from Definition 10.2, as we explained prior to equation (10.20). Thus, $\mathfrak {l}_{\mathrm {tot}}\lesssim N^{10\beta }\mathfrak {l}_{\beta ,\mathfrak {b}}\lesssim N^{1/4+\varepsilon _{\mathrm {RN},1}+10\beta }$ .
The estimate (10.21) controls the first term in the bound for the first term on the RHS of equation (10.20) that arises from an application of Lemma 8.9. Let us now estimate the second term in said bound/the RHS of equation (8.16). Following the paragraph prior to equation (10.18), this second term is a large negative power of N plus the following with estimates below to be justified/explained afterwards:
We now make the following observations for factors in the first term on the RHS of equation (10.22).
-
• Note $\mathfrak {t}_{\mathfrak {j}_{+}}\geqslant N^{-1+\beta -2\varepsilon _{\mathrm {ap}}}\mathfrak {l}_{\beta ,\mathfrak {b}}^{-1}$ , as $\mathfrak {t}_{\mathfrak {j}}$ increases by a factor of $N^{\varepsilon _{\mathrm {ap}}}$ in the index and $\mathfrak {t}_{\mathfrak {j}_{+}}$ is the last $\mathfrak {t}_{\mathfrak {j}}$ to satisfy $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1+\beta }\mathfrak {l}_{\beta ,\mathfrak {b}}^{-1}$ .
-
• Note $N^{2\varepsilon _{1}+2\mathfrak {b}\varepsilon _{\mathrm {RN},1}+2\varepsilon _{\mathrm {RN},1}}\|\mathsf {R}^{\mathfrak {b}}_{0,0}\|_{\omega ;\infty }^{2}\lesssim N^{2\varepsilon _{\mathrm {RN},1}}$ ; see Definition 10.2. This is the utility of bounds for $\mathsf {R}^{\mathfrak {b}}$ that improve in $\mathfrak {b}$ . Also, we have $\mathfrak {l}_{\beta ,\mathfrak {b}}\gtrsim N^{-\beta }N^{\varepsilon _{1}}$ for $\varepsilon _{1}=1/14$ ; again see Definition 10.2.
With this pair of observations, like the proof of Lemma 9.6, we deduce the contribution of the first term on the RHS of equation (10.22) is controlled by the RHS of the proposed bound (10.5). Combining this with equation (10.21) to estimate (10.19) completes the proof.
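The lower bound on $\mathfrak{t}_{\mathfrak{j}_{+}}$ in the first observation above can be unpacked as follows. This is a sketch under two assumptions that should be checked against Definition 3.1: consecutive timescales satisfy $\mathfrak{t}_{\mathfrak{j}+1}\leqslant N^{\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}}$, and the timescale $\mathfrak{t}_{\mathfrak{j}_{+}+1}$ exists. By maximality of $\mathfrak{j}_{+}$, we have $\mathfrak{t}_{\mathfrak{j}_{+}+1}>N^{-1+\beta}\mathfrak{l}_{\beta,\mathfrak{b}}^{-1}$, and therefore

$$ \begin{aligned} \mathfrak{t}_{\mathfrak{j}_{+}} \ \geqslant \ N^{-\varepsilon_{\mathrm{ap}}}\mathfrak{t}_{\mathfrak{j}_{+}+1} \ > \ N^{-1+\beta-\varepsilon_{\mathrm{ap}}}\mathfrak{l}_{\beta,\mathfrak{b}}^{-1} \ \geqslant \ N^{-1+\beta-2\varepsilon_{\mathrm{ap}}}\mathfrak{l}_{\beta,\mathfrak{b}}^{-1}. \end{aligned} $$

The extra factor of $N^{-\varepsilon_{\mathrm{ap}}}$ in the final bound is slack and is not needed under these assumptions.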
11 Boltzmann–Gibbs principle I – proof of Proposition 4.7, Case II
The strategy we take in this section is remarkably similar to the strategy of the previous section. In particular, we will employ Lemma 10.1 to replace ${\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}(\tau _{y}\eta _{S})}$ by ${\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}^{\mathrm {cut}}(\tau _{y}\eta _{S})}$ on the LHS of equation (4.8) for all . Recall $\mathfrak {b}_{\mathrm {mid}}$ is the index cutoff that distinguishes Case I and Case II of Proposition 4.7; see Definition 10.2. Afterwards:
-
• First, we define $\mathsf {R}^{\mathfrak {b}}_{S,y}={\mathsf {R}_{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}}^{\mathrm {cut}}(\tau _{y}\eta _{S})}$ throughout this section, as in the previous section, to ease notation.
-
• Second, we replace $\mathsf {R}^{\mathfrak {b}}$ with a spatial average like with the previous section. However, we will average it here on spatial-scale $\mathfrak {l}_{\beta }=N^{\beta }$ , not the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ that matches, up to the factor of $N^{-\beta }$ , the length of the support of $\mathsf {R}^{\mathfrak {b}}$ . We cannot average it on the length scale $\mathfrak {l}_{\beta ,\mathfrak {b}}$ in this section, as controlling the resulting spatial gradients would require spatial regularity estimates for $\mathbf {Y}^{N}$ on length scales that are well beyond those which we have a priori $\mathbf {Y}^{N}$ estimates for. However, as the support of $\mathsf {R}^{\mathfrak {b}}$ is larger in Case II, the a priori estimates for $\mathsf {R}^{\mathfrak {b}}$ in Lemma 10.1 are better than they generally were in Case I; this helps. Actually, for this reason we ultimately will not need to replace this spatial average of $\mathsf {R}^{\mathfrak {b}}$ with a cutoff as in Lemma 10.4.
-
• Third, we replace the spatial average of $\mathsf {R}^{\mathfrak {b}}$ with its time average/the space-time average of $\mathsf {R}^{\mathfrak {b}}$ with respect to a timescale that is roughly equal to $\mathfrak {t}_{\mathfrak {j}_{+}}=N^{-1-\beta /2}$ and in particular independent of $\mathfrak {b}$ , although this last feature will not be important. Again, we say ‘roughly’ because we will need to use a timescale contained in $\mathbb {I}^{\mathbf {T},1}$ for the technical reason that we only have a priori regularity estimates for $\mathbf {Y}^{N}$ on these timescales. After this replacement we will estimate this last space-time average of $\mathsf {R}^{\mathfrak {b}}$ to complete the proof of Case II of Proposition 4.7, and thus the proof of Proposition 4.7 when combined with the last section.
• The previous three steps, in terms of the technical estimates, are done with the same general tools introduced in Section 8.
To ease the reading below, we recall the following facts about $\mathsf {R}^{\mathfrak {b}}$ from after Definition 10.2, which follow from Lemma 10.1.
• The support of $\mathsf {R}^{\mathfrak {b}}$ has length of order $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ , and we have $\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }\leqslant N^{10\varepsilon _{\mathrm {ap}}}N^{-\varepsilon _{1}-\mathfrak {b}\varepsilon _{\mathrm {RN},1}}$ by construction.
• Lastly, as in Definition 10.2, we will define and sometimes use $N^{\beta +\varepsilon _{\mathrm {RN},1}}\mathfrak {l}_{\beta ,\mathfrak {b}}=N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ , where $\beta =999^{-99}$ .
Similar to the previous two sections, we provide each of the previous ingredients listed above and use them to establish Case II of Proposition 4.7. We then provide the proof for each of the ingredients to complete this section. Only the proof of spatial-average replacement in Lemma 11.1 requires an additional idea, while other proofs will effectively be copied.
11.0.1 Spatial average
We start with the aforementioned replacement of $\mathsf {R}^{\mathfrak {b}}$ with its spatial average on length scale $N^{\beta }$ . The proof of the following result is highly similar to that of Lemma 10.3, so we refer to that argument with necessary adjustments, including one important detail, when we present the proof of Lemma 11.1 below.
Lemma 11.1. We define the length scale $\mathfrak {l}_{\beta }=N^{\beta }$ for $\beta =999^{-99}$ . Uniformly in , we have the following estimate for which we recall the transfer-of-spatial-scale operator from Definition 8.12 :
11.0.2 Time Average
The following is the replacement by a time average described in the third bullet point above. For its proof, we essentially copy that of Lemma 10.5 with technical modifications. Recall $\mathbb {I}^{\mathbf {T},1}$ from Definition 3.1.
Lemma 11.2. Let $\mathfrak {j}_{+}\in \mathbb Z_{\geqslant 0}$ be the index for which $\mathfrak {t}_{\mathfrak {j}_{+}}$ is the largest time in $\mathbb {I}^{\mathbf {T},1}$ satisfying $\mathfrak {t}_{\mathfrak {j}_{+}}\leqslant N^{-1-\beta /2}$; here $\beta =999^{-99}$. For , we have the following; recall the transfer-of-timescale operator in Definition 8.15:
11.0.3 Final estimates
Our last ingredient before we deduce Case II of Proposition 4.7 is the following estimate on the space-time average $\mathfrak {I}^{\mathbf {T}}_{\mathfrak {t}_{\mathfrak {j}_{+}}}\mathfrak {I}^{\mathbf {X}}_{\mathfrak {l}_{\beta }}\mathsf {R}^{\mathfrak {b}}$ for $\mathfrak {l}_{\beta }$ in Lemma 11.1 and for $\mathfrak {t}_{\mathfrak {j}_{+}}$ in Lemma 11.2. This final ingredient serves as an analog of Lemma 9.6 and Lemma 10.6. Indeed, as for those two results, most of the work is done in the lemma immediately before.
Lemma 11.3. Take the timescale $\mathfrak {t}_{\mathfrak {j}_{+}}$ from Lemma 11.2 . Uniformly in , we have the following estimate:
Case II of Proposition 4.7, namely Proposition 4.7 but restricting to , follows from Lemmas 11.1, 11.2, and 11.3 combined with the same replacement reasoning that we used in the proof of Case I of Proposition 4.7 at the end of the previous section. Together with the previous section, this concludes the proof of Proposition 4.7 entirely.
Proof of Lemma 11.1.
Let us follow the proof of Lemma 10.3, though our application of Lemma 8.13 will be somewhat illegal but remedied as we soon explain. Formally, let us apply Lemma 8.13 with $\mathfrak {f}=N^{1/2}\mathsf {R}^{\mathfrak {b}}$ and $\mathfrak {t}=0$ as in the proof of Lemma 10.3, but now with $\mathfrak {l}=\mathfrak {l}_{\beta }=N^{\beta }$ , where $\beta =999^{-99}$ . We claim that this provides the following inequality that we justify afterwards:
If we could apply Lemma 8.13 with the aforementioned choices, then equation (11.4) would follow just as equation (10.14) did, except that the extra factor of $\mathfrak {l}_{\beta }^{1/2}$ would not be necessary in the second term on the RHS of equation (11.4). However, for $\mathfrak {b}<\mathfrak {b}_{+}$ it is not necessarily true that the support length of $\mathsf {R}^{\mathfrak {b}}$ times $\mathfrak {l}=\mathfrak {l}_{\beta }$ is bounded above by $\mathfrak {l}_{N}$ in Definition 3.1; if $\mathfrak {b}=\mathfrak {b}_{+}-1$, then the support length of $\mathsf {R}^{\mathfrak {b}}$ is $\mathrm {O}(N^{\varepsilon _{1}+\mathfrak {b}_{+}\varepsilon _{\mathrm {RN},1}})$, as we noted in the bullet point list prior to Lemma 11.1. It is certainly possible that the support length of $\mathsf {R}^{\mathfrak {b}}$ is very close to or basically equal to $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ by construction in the statement of Proposition 4.7, so after multiplying by $\mathfrak {l}=\mathfrak {l}_{\beta }=N^{\beta }$ the resulting product may exceed $\mathfrak {l}_{N}$. This is remedied by the following observations.
• The only reason why we require the $\bar {\mathfrak {l}}\leqslant \mathfrak {l}_{N}$ constraint in the proof of Lemma 8.13 is so that we have a priori spatial regularity estimates for $\mathbf {Y}^{N}$ on the length scale $\bar {\mathfrak {l}}$ , which is defined in the statement of Lemma 8.13, by construction in Definition 3.5.
• However, even if $\bar {\mathfrak {l}}=\mathfrak {l}_{\beta }N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ exceeds $\mathfrak {l}_{N}$ , it only does by a factor of order $\mathfrak {l}_{\beta }=N^{\beta }$ for $\mathfrak {b}<\mathfrak {b}_{+}$ . Indeed, $\mathfrak {l}_{\beta }^{-1}\bar {\mathfrak {l}}$ is always bounded by $\mathfrak {l}_{N}=N^{1/2+\varepsilon _{\mathrm {RN}}}$ in Definition 3.1 for all $\mathfrak {b}<\mathfrak {b}_{+}$ by construction of $\mathfrak {b}_{+}$ in the statement of Proposition 4.7. Rewriting spatial gradients of $\mathbf {Y}^{N}$ on length scales of order $\mathfrak {l}_{\beta }N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ as order- $\mathfrak {l}_{\beta }$ -many spatial gradients on the length scale $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ , we may control the spatial regularity of $\mathbf {Y}^{N}$ on length scales of order $\mathfrak {l}_{\beta }N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ by $\mathfrak {l}_{\beta }=N^{\beta }$ times spatial regularity estimates for $\mathbf {Y}^{N}$ on length scale $N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ .
• The above length- $\mathfrak {l}_{\beta }N^{\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1}}$ spatial regularity bound on $\mathbf {Y}^{N}$ is worse by a factor of $\mathfrak {l}_{\beta }^{1/2}$ than what the proof of Lemma 8.13 needs, since the proof of Lemma 8.13 uses Hölder regularity with exponent basically $1/2$ for $\mathbf {Y}^{N}$, whereas our bound is linear in $\mathfrak {l}_{\beta }$ rather than of square-root order. Because the spatial regularity of $\mathbf {Y}^{N}$ is relevant only for the second term on the RHS of (8.22)/(11.4), this is why we get equation (11.4) with $\mathfrak {l}_{\beta }$ in the second term on the RHS and not its square root as Lemma 8.13 says; see Remark 8.14.
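The second and third observations rest on an elementary telescoping identity: a gradient on a long length scale is a sum of consecutive gradients on a shorter scale, so the triangle inequality loses a factor linear in the number of pieces, whereas Hölder-$1/2$ heuristics would suggest only its square root. The following Python sketch illustrates this on a toy integer-valued profile. The random-walk path standing in for $\mathbf {Y}^{N}$, the helper names `gradient` and `telescoped`, and all numerical values are illustrative assumptions, not objects from the paper.

```python
import random

def gradient(profile, x, length):
    """Discrete gradient on scale `length`: profile[x + length] - profile[x]."""
    return profile[x + length] - profile[x]

def telescoped(profile, x, short, count):
    """Sum of `count` consecutive scale-`short` gradients starting at x."""
    return sum(gradient(profile, x + k * short, short) for k in range(count))

# Toy profile: a simple random walk path (integer-valued, so identities are exact).
random.seed(0)
profile = [0]
for _ in range(400):
    profile.append(profile[-1] + random.choice((-1, 1)))

short, count = 8, 5      # `count` plays the role of l_beta (hypothetical values)
x = 17

# Telescoping identity: one long gradient equals the sum of the short ones.
long_gradient = gradient(profile, x, short * count)
assert long_gradient == telescoped(profile, x, short, count)

# Triangle inequality: the long gradient is at most `count` times the
# worst short gradient, i.e. the bound is linear (not square root) in `count`.
worst_short = max(abs(gradient(profile, y, short))
                  for y in range(len(profile) - short))
assert abs(long_gradient) <= count * worst_short
```

The linear factor `count` in the last assertion is the analog of the factor $\mathfrak {l}_{\beta }$ appearing in the second term of (11.4), in place of the $\mathfrak {l}_{\beta }^{1/2}$ that a direct Hölder-$1/2$ estimate on the long scale would give.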
By equation (11.4), as in the proof of Lemma 10.3, it suffices to use $N^{\frac 12(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1})}|\mathsf {R}^{\mathfrak {b}}|\lesssim N^{10\varepsilon _{\mathrm {ap}}}N^{-\frac 12(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}-\varepsilon _{\mathrm {RN},1})}\lesssim N^{-1/4+10\varepsilon _{\mathrm {ap}}+\varepsilon _{\mathrm {RN},1}/2}$ for $\mathfrak {b}>\mathfrak {b}_{\mathrm {mid}}$ and $\beta =999^{-99}$.
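For convenience, the exponent bookkeeping in this last step can be spelled out as follows. This is only a sketch: it uses the bound $\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }\leqslant N^{10\varepsilon _{\mathrm {ap}}}N^{-\varepsilon _{1}-\mathfrak {b}\varepsilon _{\mathrm {RN},1}}$ recalled in the bullet list before Lemma 11.1, and it assumes that the condition $\mathfrak {b}>\mathfrak {b}_{\mathrm {mid}}$ amounts to $\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}\geqslant 1/2$ (the definition of $\mathfrak {b}_{\mathrm {mid}}$ is not restated in this section, so this reading is an assumption).

$$
N^{\frac 12(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1})}\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }
\leqslant N^{10\varepsilon _{\mathrm {ap}}}N^{\frac 12(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1}+\varepsilon _{\mathrm {RN},1})-(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1})}
= N^{10\varepsilon _{\mathrm {ap}}}N^{-\frac 12(\varepsilon _{1}+\mathfrak {b}\varepsilon _{\mathrm {RN},1})+\frac 12\varepsilon _{\mathrm {RN},1}}
\leqslant N^{-\frac 14+10\varepsilon _{\mathrm {ap}}+\frac 12\varepsilon _{\mathrm {RN},1}}.
$$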
Proof of Lemma 11.2.
We directly follow the proof of Lemma 10.5 verbatim, but we replace $\mathfrak {l}_{\beta ,\mathfrak {b}}$ therein by $\mathfrak {l}_{\beta }=N^{\beta }$ . For the sake of precision/clarity, the estimates (10.15) and (10.16) both hold with the previous length-scale replacement and for $\mathfrak {b}$ -indices of interest in the current lemma, as do equations (10.17) and (10.18). Moreover, it is easy to check that in the latter two of these bounds, the upper bounds with the aforementioned length-scale replacement, after plugging into equations (10.15) and (10.16) and taking $2/3$ -powers, are controlled by the RHS of the proposed estimate (11.2). Indeed, given $\mathfrak {b}>\mathfrak {b}_{\mathrm {mid}}$ , we have $\kappa =N^{-3\varepsilon _{\mathrm {ap}}/2}N^{-3\beta /4}\|\mathsf {R}^{\mathfrak {b}}\|_{\omega ;\infty }^{3/2}\geqslant N^{-3\varepsilon _{\mathrm {ap}}/2-3\beta /4+3/8}$ , which is enough to control equation (10.17); see Definition 10.2. For equation (10.18), all that we need to estimate the RHS of equation (10.18) is $\mathfrak {l}_{\beta ,\mathfrak {b}}\geqslant N^{\beta }$ since $\mathfrak {t}_{\mathfrak {j}}\geqslant N^{-2}$ ; by construction, we still have $\mathfrak {l}_{\beta }=N^{\beta }$ , so replacing $\mathfrak {l}_{\beta ,\mathfrak {b}}$ by $\mathfrak {l}_{\beta }$ is not an issue.
Proof of Lemma 11.3.
We directly follow the proof of Lemma 10.6 verbatim, except we replace $\mathfrak {l}_{\beta ,\mathfrak {b}}$ therein with $\mathfrak {l}_{\beta }=N^{\beta }$ and we replace $\mathfrak {t}_{\mathfrak {j}_{+}}$ therein with $\mathfrak {t}_{\mathfrak {j}_{+}}$ defined in the statement of Lemma 11.2/Lemma 11.3. Similar to the proof of Lemma 11.2, it is enough to verify that the estimates (10.19), (10.20), (10.21) and (10.22) from the proof of Lemma 10.6 that we are following still hold with the aforementioned length-scale and timescale replacements. For equations (10.19) and (10.20), this is because the applicability of Lemma 8.2 and Lemma 8.8 does not depend on the space-time scales. For equations (10.21) and (10.22), this is a consequence of power-counting in N. For equation (10.21), it is enough to note $\kappa \geqslant N^{-3\varepsilon _{\mathrm {ap}}-3\beta /4+\frac 38}$ after our replacement $\mathfrak {l}_{\beta ,\mathfrak {b}}\to \mathfrak {l}_{\beta }$, as noted in the proof of Lemma 11.2. We clarify that replacing $\mathfrak {t}_{\mathfrak {j}_{+}}$ from the proof of Lemma 10.6 by $\mathfrak {t}_{\mathfrak {j}_{+}}$ in the current lemma only weakens equation (10.22) by a factor of $N^{\beta /2}$, and this loss is dominated by the factor $\mathfrak {l}_{\beta }^{-1}\lesssim N^{-\beta }$ obtained by replacing $\mathfrak {l}_{\beta ,\mathfrak {b}}$ by $\mathfrak {l}_{\beta }$ in (10.22).
A Auxiliary estimates
A.1 Heat estimates
We start with the following, from which heat kernel estimates ultimately follow.
Lemma A.1. Let us define $\mathbf {H}^{N,\mathbb Z}$ to be the full-line heat kernel on $\mathbb Z$ satisfying the following conditions.
• Define $\Delta _{\mathbb Z}^{!!}=N^{2}\Delta _{\mathbb Z}$ and $\nabla _{\mathbb Z,-1}^{!}=N\nabla _{\mathbb Z,-1}$ , with $\Delta _{\mathbb Z}$ the Laplacian on $\mathbb Z$ and $\nabla _{\mathbb Z,-1}$ the negative-direction gradient on $\mathbb Z$ .
• Provided $0\leqslant S\leqslant T$ and $x,y\in \mathbb Z$ , we have $\mathbf {H}_{S,S,x,y}^{N,\mathbb Z}=\mathbf {1}_{x=y}$ and $\partial _{T}\mathbf {H}_{S,T,x,y}^{N,\mathbb Z}=2^{-1}\Delta _{\mathbb Z}^{!!}\mathbf {H}_{S,T,x,y}^{N,\mathbb Z}+\bar {\mathfrak {d}}\nabla _{\mathbb Z,-1}^{!}\mathbf {H}_{S,T,x,y}^{N,\mathbb Z}$ .
We have the following identity relating $\mathbf {H}^{N}$ and $\mathbf {H}^{N,\mathbb Z}$ and the following Chapman–Kolmogorov equation, in which $S\leqslant R\leqslant T$ and :
Proof. To show the first identity in equation (A.1), note both sides are equal to $\mathbf {1}_{x=y}$ if $S=T$. Indeed, if $x,y\in \mathbb {T}_{N}$, then $x=y+\mathfrak {k}|\mathbb {T}_{N}|$ can only happen for $\mathfrak {k}=0$. Next, we note that both sides vanish under $\partial _{T}-\mathscr {L}_{N}$ for $T>S$, where $\mathscr {L}_{N}$ acts on x. By uniqueness of solutions to linear ordinary differential equations (ODEs), the first identity holds. To show the second identity, note both sides equal $\mathbf {H}^{N}_{S,R,x,y}$ at $T=R$. Then, note both sides vanish under $\partial _{T}-\mathscr {L}_{N}$ for $T>R$, where $\mathscr {L}_{N}$ acts on x. So, the second identity holds, again, by uniqueness.
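As a numerical sanity check of the two identities just proved, the following Python sketch compares a full-line kernel folded onto a small torus with the torus kernel itself, and then tests the Chapman–Kolmogorov property. It is a minimal sketch under stated assumptions: the first identity is taken to be the usual periodization (a sum of the full-line kernel over $y+\mathfrak {k}|\mathbb {T}_{N}|$), and the jump rates, torus size and times are toy values rather than the paper's $N$-dependent scaling. All function names are hypothetical.

```python
import math

def poisson_pmf(mean, n_max):
    """Return [P(Poisson(mean) = n) for n = 0..n_max], computed iteratively."""
    pmf = [math.exp(-mean)]
    for n in range(1, n_max + 1):
        pmf.append(pmf[-1] * mean / n)
    return pmf

def line_kernel(t, rate_right, rate_left, n_max):
    """Law on Z at time t of a walk from 0: Poisson(right) minus Poisson(left)."""
    right = poisson_pmf(rate_right * t, n_max)
    left = poisson_pmf(rate_left * t, n_max)
    kernel = {}
    for i, a in enumerate(right):
        for j, b in enumerate(left):
            kernel[i - j] = kernel.get(i - j, 0.0) + a * b
    return kernel

def torus_kernel(t, rate_right, rate_left, size, n_max):
    """Law on Z/size at time t of the same walk, via uniformization."""
    total = rate_right + rate_left
    p_right = rate_right / total
    dist = [0.0] * size
    dist[0] = 1.0
    out = [0.0] * size
    for weight in poisson_pmf(total * t, n_max):
        for x in range(size):
            out[x] += weight * dist[x]
        step = [0.0] * size
        for x in range(size):
            step[(x + 1) % size] += p_right * dist[x]
            step[(x - 1) % size] += (1.0 - p_right) * dist[x]
        dist = step
    return out

def fold(kernel, size):
    """Periodize a kernel on Z: sum its values over each residue class mod size."""
    folded = [0.0] * size
    for x, p in kernel.items():
        folded[x % size] += p
    return folded

def convolve_torus(a, b):
    """Convolution on Z/size, i.e. the Chapman-Kolmogorov composition."""
    size = len(a)
    return [sum(a[z] * b[(x - z) % size] for z in range(size))
            for x in range(size)]

size, right, left = 7, 2.0, 3.0          # toy parameters (asymmetric = drift)
whole = torus_kernel(0.7, right, left, size, 60)
periodized = fold(line_kernel(0.7, right, left, 60), size)
composed = convolve_torus(torus_kernel(0.4, right, left, size, 60),
                          torus_kernel(0.3, right, left, size, 60))
assert max(abs(a - b) for a, b in zip(periodized, whole)) < 1e-12
assert max(abs(a - b) for a, b in zip(composed, whole)) < 1e-12
```

The truncation level `60` only needs to be large compared with the Poisson means involved; for the toy values above, the discarded mass is far below the tolerance used in the assertions.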
Definition A.2. Provided $\mathfrak {l},\mathfrak {l}'\in \mathbb Z$ and any function $\phi :\mathbb {T}_{N}\to \mathbb R$, we define the composition $\nabla _{\mathfrak {l},\mathfrak {l}'}^{\mathbf {X}}\phi =\nabla _{\mathfrak {l}}^{\mathbf {X}}(\nabla _{\mathfrak {l}'}^{\mathbf {X}}\phi )$.
The following result collects pointwise (and summed) estimates for the $\mathbf {H}^{N}$ heat kernel, which can be interpreted as those for a Gaussian heat kernel (or its periodic version) at times of order $N^{2}$ . Proving them amounts to the following steps. First, to prove the pointwise and spatial regularity estimates listed below, it suffices to assume $\bar {\mathfrak {d}}=0$ . Indeed, the $\mathbf {H}^{N}$ heat kernel is the density for a symmetric simple random walk plus constant speed drift. It is therefore the convolution of a Poisson density function (for the law of the position of the drift) with the $\mathbf {H}^{N}$ kernel for $\bar {\mathfrak {d}}=0$ . Convolution with the Poisson density function is contractive in all pointwise and spatial regularity norms used below, so reduction to $\bar {\mathfrak {d}}=0$ follows. To prove bounds in the case of $\bar {\mathfrak {d}}=0$ , it suffices to use the first identity in equation (A.1) with bounds in Proposition A.1 and Corollary A.2 of [Reference Dembo and Tsai19], which have subexponential decay in space, and their higher-order analogs, which are proven by the same method. To prove the time-regularity bounds below, it suffices to note that time gradients of $\mathbf {H}^{N}$ are time integrals of its spatial gradients because of the PDE that $\mathbf {H}^{N}$ satisfies. Then, we can use spatial regularity estimates that we just explained. (In particular, even for mixed space-time gradients, we are always left with estimating iterated spatial gradients of $\mathbf {H}^{N}$ .)
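The reduction to $\bar {\mathfrak {d}}=0$ above uses only that convolution with a probability density cannot increase sup-type norms of a function or of its spatial gradients. The following Python sketch checks this on a small torus. It is an illustration under assumptions: the Poisson density is truncated and renormalized for numerical convenience, and the test function, torus size and helper names are hypothetical.

```python
import math

def truncated_poisson(mean, terms):
    """Poisson(mean) weights on 0..terms-1, renormalized to sum to one."""
    raw = [math.exp(-mean) * mean ** k / math.factorial(k) for k in range(terms)]
    total = sum(raw)
    return [w / total for w in raw]

def convolve_with_density(values, weights):
    """Periodic convolution: average of shifted copies of `values`."""
    size = len(values)
    return [sum(w * values[(x - k) % size] for k, w in enumerate(weights))
            for x in range(size)]

def sup_norm(values):
    """Supremum norm on the torus."""
    return max(abs(v) for v in values)

def gradient_sup(values, length):
    """Supremum of the scale-`length` discrete gradient on the torus."""
    size = len(values)
    return max(abs(values[(x + length) % size] - values[x]) for x in range(size))

weights = truncated_poisson(1.5, 8)
values = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
smoothed = convolve_with_density(values, weights)

# Contraction in the sup norm and in the gradient sup norm (up to rounding).
assert sup_norm(smoothed) <= sup_norm(values) + 1e-12
assert gradient_sup(smoothed, 1) <= gradient_sup(values, 1) + 1e-12
```

Both inequalities follow from the triangle inequality, since the convolution is a convex combination of translates and gradients commute with translations.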
Proposition A.3. We first take $0\leqslant S\leqslant T\leqslant 1$. Provided any $\mathfrak {l},\mathfrak {l}'\in \mathbb Z$ and any $0\leqslant \nu \leqslant 1$, we have the following estimates, in which spatial gradients act on $x\in \mathbb {T}_{N}$; recall $\mathbf {O}_{S,T}=|T-S|$:
We have the following summation estimates under the same assumptions made/with the same parameters prior to equation (A.2):
Additionally consider any timescale $\mathrm {t}\geqslant 0$ . We have the following in which the time-gradient acts on $T\geqslant 0$ :
We now list heat operator estimates. For any $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ and $\mathbb {I}\subseteq \mathbb R_{\geqslant 0}$ , we have space-time contraction estimates:
Let us now recall notation of Definition 5.1 . Provided any $\mathfrak {r}\geqslant 0$ , we have the spatial-gradient estimates
We have the following time-regularity heat operator estimates if $\mathrm {t}\geqslant N^{-2}$ ; below we take $\gamma>0$ arbitrary:
The estimates in equation (A.7) also hold for $\mathrm {t}\in \mathbb R$ in general. Lastly, for any possibly random $\mathrm {t}_{0}\geqslant 0$ , we have the following two identities, the first by the Chapman–Kolmogorov equation in equation (A.1) and the second by combining the first with the spatial contraction in equation (A.5):
A.2 Martingale estimates
We provide a generalization of the martingale inequality from Lemma 3.1 in [Reference Dembo and Tsai19]. The issue with Lemma 3.1 in [Reference Dembo and Tsai19] is that it only holds for the Gärtner transform, as its explicit formula was important in the proof. On the other hand, the proof of Lemma 3.1 in [Reference Dembo and Tsai19] uses this explicit formula only to estimate the short-time behavior of the Gärtner transform. Thus, because short-time behavior does not depend on explicit formulas, we have the following generalization to other processes such as $\mathbf {U}^{N}$ from Definition 3.5, which is important to analyze the $\mathbf {U}^{N}\mathrm {d}\xi ^{N}$ term in the $\mathbf {U}^{N}$ equation in Definition 3.5. However, the following generalization of Lemma 3.1 of [Reference Dembo and Tsai19] is similar in proof and statement, so we refer to Lemma 3.1 in [Reference Dembo and Tsai19].
Lemma A.4. Consider any $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ and the following local quadratic function of $\phi $ provided fixed times $0\leqslant \mathfrak {t}_{1}\leqslant \mathfrak {t}_{2}$ ; in the following, we additionally define $\lfloor t\rfloor _{N}$ as the largest element in $N^{-2}\mathbb Z_{\geqslant 0}$ that is less than t:
Take $\mathbf {X}^{N}$ on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}$ satisfying the following for $\mathrm {V}_{i}:\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\times \Omega _{\mathbb {T}_{N}}\to \mathbb R$ and $\mathfrak {l}\in \mathbb Z$ fixed; recall $\mathscr {L}_{N}$ in Proposition 2.4:
Suppose $|\mathrm {V}_{i}|\leqslant N^{3/2}$ uniformly over $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\times \Omega _{\mathbb {T}_{N}}$ . For any deterministic $p\geqslant 1$ and $0\leqslant \mathfrak {t}_{1}\leqslant \mathfrak {t}_{2}$ and $\phi :\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\to \mathbb R$ ,
Lastly, equation (A.11) holds for $\mathbf {X}^{N}=\mathbf {U}^{N}$ in Definition 3.5 and $\mathbf {X}^{N}=\mathbf {Q}^{N}$ in Definition 3.8 and $\mathbf {X}^{N}=\mathbf {C}^{N}$ in Definition 7.3 .
Before we discuss the proof, we introduce a brief digression. Consider the fundamental solution $\mathbf {J}$ with variables in $\mathbb R_{\geqslant 0}^{2}\times \mathbb {T}_{N}^{2}$ and $\mathbf {J}_{S,S,x,y}=\mathbf {1}_{x=y}$ to the following deterministic parabolic equation on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}$ whose utility we explain afterwards:
If we ‘forget’ the $\mathbf {X}^{N}\mathrm {d}\xi ^{N}$ term in the $\mathbf {X}^{N}$ equation in Lemma A.4, the resulting equation is stochastic only via the $\mathrm {V}$ functions. Also, the resulting linear equation has fundamental solution controlled by $\mathbf {J}$ , as the coefficients in said equation are bounded by $N^{3/2}$ . This motivates the following PDE estimate that follows by standard estimates for $\mathscr {L}_{N}$ and the Gronwall inequality.
Lemma A.5. We have $\mathbf {J}\geqslant 0$ and, defining $\mathbf {J}_{x,y}^{\mathrm {s}}=\sup _{0\leqslant \mathrm {t}\leqslant N^{-2}}\mathbf {J}_{\mathrm {s},\mathrm {s}+\mathrm {t},x,y}$ , we have the deterministic estimate
Proof of Lemma A.4.
The estimate (A.11) is basically that of Lemma 3.1 in [Reference Dembo and Tsai19], except we take spatial suprema of moments on the RHS. In particular, in view of the paragraph preceding Lemma A.4 it suffices to show, for $\lfloor t\rfloor _{N}$ defined in Lemma A.4,
We note $\mathbf {X}_{t}^{N}$ can be controlled by $\mathbf {J}_{\lfloor t\rfloor _{N},t}$ spatially integrated against $\mathbf {X}_{\lfloor t\rfloor _{N}}^{N}$ times an exponential of a Poisson clock counter that was introduced in the proof of Lemma 3.1 of [Reference Dembo and Tsai19]. We then follow the proof of Lemma 3.1 of [Reference Dembo and Tsai19] upon estimating the spatial integral of $\mathbf {J}$ against $\|\mathbf {X}^{N}_{\lfloor t\rfloor _{N}}\|_{\omega ;2p}^{2}$ by the spatial supremum of the latter $\mathbf {X}^{N}$ -moment, Lemma A.5 and exponential estimates for the Poisson distribution in the proof of Lemma 3.1 of [Reference Dembo and Tsai19].
A.3 Short-time estimates
We provide a general short-time bound, with respect to space-time supremum norms rather than moments as in the short-time estimates used in Lemma A.4. Lemma A.6 follows by deterministic control on $\mathscr {L}_{N,\mathrm {V}}$ below and by noting that jumps in $\mathbf {X}^{N}$, which are of order $N^{-1/2}\mathbf {X}^{N}$, are driven by clocks of polynomial-in-N speed, which cannot ring too many times in very short times.
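The heuristic that clocks of polynomial speed cannot ring many times in a very short window can be quantified by a standard Chernoff bound for the Poisson distribution. The Python sketch below compares the exact Poisson tail with that bound. It is only an illustration: the clock speed $N^{2}$, the window length $N^{-5/2}$ and the threshold are assumed toy values, not the parameters of Lemma A.6, and the function names are hypothetical.

```python
import math

def poisson_tail(mean, k, terms=200):
    """P(Poisson(mean) >= k), summing `terms` pmf values starting at k."""
    log_pmf = -mean + k * math.log(mean) - math.lgamma(k + 1)
    pmf = math.exp(log_pmf)
    total = 0.0
    for n in range(k, k + terms):
        total += pmf
        pmf *= mean / (n + 1)
    return total

def chernoff_bound(mean, k):
    """Chernoff bound exp(-mean) * (e * mean / k)**k, valid for k > mean."""
    return math.exp(-mean + k * (1.0 + math.log(mean / k)))

N = 100
rate = N ** 2            # assumed polynomial clock speed
window = N ** -2.5       # assumed very short time window
mean = rate * window     # expected number of rings in the window
k = 10                   # threshold for "too many" rings

tail = poisson_tail(mean, k)
bound = chernoff_bound(mean, k)
assert 0.0 < tail <= bound
```

Because the mean number of rings is a negative power of N in this toy setup, the bound decays faster than any fixed power of N as the threshold `k` grows, which is the kind of decay needed for an overwhelming-probability statement.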
Lemma A.6. Consider any process $\mathbf {X}^{N}$ on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}$ satisfying the following stochastic equation, where $\mathrm {V}_{i}$ are functionals on $\mathbb R_{\geqslant 0}\times \mathbb {T}_{N}\times \Omega _{\mathbb {T}_{N}}$ , and the operator $\mathscr {L}_{N,\mathrm {V}}$ is defined via the second equation below for $\mathfrak {l}\in \mathbb Z$ :
Suppose $\mathrm {V}_{1}$ and $\mathrm {V}_{2}$ satisfy the estimates $|\mathrm {V}_{1}|+|\mathrm {V}_{2}|\lesssim N^{\frac 32}$ uniformly in all variables, and suppose $|\mathfrak {l}|\lesssim 1$ . If $\mathbf {X}^{N}\not \equiv 0$ , we have the following estimate with overwhelming probability (see Definition 3.9 ) in which $\varepsilon _{\mathrm {ap},1}>0$ is a small universal constant:
Acknowledgments
The author thanks Amir Dembo for useful discussion. The author would also like to thank the editor(s) and anonymous referees for their detailed feedback, which greatly improved this paper.
Competing interests
The author has no competing interests to declare.
Financial support
The author was funded by a fellowship from the ARCS Foundation.