1 Introduction
Let
$d\geq 2$
. For
$\lambda , t>0$
, the Bochner-Riesz means
$S^\lambda _t f$
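are defined as
$$ \widehat {S^\lambda _t f}(\xi ) = \Big (1-\frac {|\xi |^2}{t^2}\Big )_+^{\lambda } \widehat {f}(\xi ). $$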
We consider the pointwise convergence problem for the Bochner-Riesz means: for which
$\lambda $
does
$S^\lambda _t f$
converge to f almost everywhere as
$t\to \infty $
for arbitrary
$f\in L^p(\mathbb {R}^d)$
? For
$p> 2$
, this is the case if
${\lambda>\max (d(\frac {1}{2}-\frac {1}{p})-\frac {1}{2}, 0)}$
. This result is due to Carbery [Reference Carbery4] for
$d=2$
and Carbery, Rubio de Francia and Vega [Reference Carbery, Rubio de Francia and Vega5] for every
$d\geq 2$
(see also [Reference Lee and Seeger18] for endpoint results). Note that the given range of
$\lambda $
is precisely the one for which
$S^\lambda _t$
is known to be bounded on
$L^p(\mathbb {R}^2)$
and is conjectured to be bounded on
$L^p(\mathbb {R}^d)$
for
$d>2$
(see [Reference Carleson and Sjölin6, Reference Córdoba7, Reference Fefferman12] for
$d=2$
and [Reference Guo, Oh, Wang, Wu and Zhang14] and references therein for partial results for
$d\geq 3$
). Meanwhile, it is not known what happens when
$\lambda =0$
and
$p=2$
, unlike the one-dimensional case where the Carleson-Hunt theorem is available.
For the case
$1\leq p < 2$
, by Stein’s maximal principle [Reference Stein20], almost everywhere convergence of an arbitrary
$L^p$
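function is equivalent to the following weak-type estimate:
$$ \Big \| \sup _{t>0} |S^\lambda _t f| \Big \|_{L^{p,\infty }(\mathbb {R}^d)} \lesssim \| f\|_{L^p(\mathbb {R}^d)} \tag{1.1} $$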
for test functions f. Here, by
$A\lesssim B$
, we mean
$A\leq CB$
for an absolute positive constant C. Examples of Fefferman [Reference Fefferman11] and Tao [Reference Tao22] show that (1.1) fails unless
$\lambda> \max (0,d(\frac {1}{p}-\frac {1}{2}) - \frac {1}{2p})$
. We prove a new partial result on this problem in dimension
$d=2$
.
Theorem 1.1. Let
$d=2$
and
$p=\frac {86}{57}$
. Then (1.1) holds for any
$\lambda>\frac {9}{86}$
.
By an interpolation of classical
$L^1$
and
$L^2$
estimates, (1.1) holds if
$\lambda> (d-1)(\frac {1}{p}-\frac {1}{2})$
. For
$d=2$
, Tao [Reference Tao23] improved this classical sufficient condition by giving an
$L^{10/7}$
-estimate. Building on [Reference Tao23], Li and Wu [Reference Li and Wu19] gave a new
$L^{18/13}$
-estimate using the Bourgain-Demeter decoupling theorem [Reference Bourgain and Demeter2]. More recently, Gan and Wu [Reference Gan and Wu13] obtained an improved
$L^{10/7}$
-estimate by using a weighted version of the
$\ell ^p$
-decoupling theorem. Moreover, they gave a new partial result in dimension
$d=3$
. See Theorem 4.9 for an improved bound in
$\mathbb {R}^3$
.
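As a quick sanity check on the numerology, the following Python sketch compares the threshold of Theorem 1.1 with the classical sufficient condition and with the necessary condition quoted above, at $d=2$ and $p=\frac {86}{57}$. The function names are ours and purely illustrative; the computation uses exact rational arithmetic only.

```python
from fractions import Fraction as F

def classical_threshold(d, p):
    """Classical sufficient condition: lambda > (d-1)(1/p - 1/2)."""
    return (d - 1) * (1 / p - F(1, 2))

def necessary_threshold(d, p):
    """Necessary condition from the text: lambda > max(0, d(1/p - 1/2) - 1/(2p))."""
    return max(F(0), d * (1 / p - F(1, 2)) - 1 / (2 * p))

d, p = 2, F(86, 57)
new = F(9, 86)  # threshold stated in Theorem 1.1

classical = classical_threshold(d, p)   # equals 14/86 = 7/43
necessary = necessary_threshold(d, p)   # the second entry of the max is negative here
print(necessary, new, classical)
```

At this exponent the necessary condition reduces to $\lambda>0$, so the threshold $\frac {9}{86}$ sits strictly between it and the classical value $\frac {7}{43}$.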
Combining Theorem 1.1 with the classical
$L^1$
and
$L^2$
estimates and the
$L^{18/13}$
-estimate [Reference Li and Wu19] yields the following corollary, which improves previously known results for
$18/13<p<2$
.
Corollary 1.2. Let
$d=2$
,
$1< p< 2$
and

Then (1.1) holds. Consequently,
$\lim _{t\to \infty } S^\lambda _t f(x) = f(x)$
for almost every
$x\in \mathbb {R}^2$
for any
$f\in L^p(\mathbb {R}^2)$
.
Our new estimate is based on weighted
$\ell ^p$
-decoupling inequalities to be discussed below. We first recall the
$\ell ^p$
-decoupling theorem, which follows from the Bourgain-Demeter
$\ell ^2$
-decoupling theorem [Reference Bourgain and Demeter2]. Let
$S\subset \mathbb {R}^d$
be a strictly convex
$C^2$
hypersurface with Gaussian curvature comparable to
$1$
. For
$R\gg 1$
, cover the
$R^{-1}$
-neighborhood of S by finitely overlapping rectangles
$\theta $
of dimensions
$R^{-1/2} \times \cdots \times R^{-1/2} \times R^{-1}$
. Suppose that the Fourier transform of
$f_\theta $
is supported on
$\theta $
and
$f=\sum _\theta f_\theta $
. The
$\ell ^p$
-decoupling inequality says that, for
$2\leq p\leq \frac {2(d+1)}{d-1}$
and any ball
$B_R\subset \mathbb {R}^d$
of radius R,

where we denote by
$A\lessapprox B$
the inequality
$A \leq C_\epsilon R^\epsilon B$
, which holds for any
$\epsilon>0$
.
Gan and Wu [Reference Gan and Wu13] observed that the
$\ell ^p$
-decoupling estimate can be improved when the integration of f is taken over a subset
$Y\subset B_R$
for the exponent
$p=\frac {2d}{d-1}$
and showed that it yields an improved weak-type estimate (1.1). Here we give a refinement of their weighted
$\ell ^p$
-decoupling estimates for a wider range of p.
Theorem 1.3. Let
$p_d=\frac {2(d+1)}{d-1}$
,
$2\leq p \leq p_d$
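, and
$$ \alpha (p) = \min \Big \{ \frac {d+1}{2d}\Big (\frac {1}{p}-\frac {1}{p_d}\Big ),\ \frac {1}{d-1}\Big (\frac {1}{2}-\frac {1}{p}\Big ) \Big \}. \tag{1.2} $$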
Then for any
$Y\subset B_R$
,

Note that
$\alpha (p)>0$
for
$2<p<p_d$
, so the factor
$(|Y|R^{-d})^{\alpha (p)}$
represents a gain over the
$\ell ^p$
-decoupling inequality. Gan and Wu [Reference Gan and Wu13] proved (1.3) for the exponent
$p=\frac {2d}{d-1}$
with
$\alpha (p)=\frac {d+1}{2d}\Big (\frac {1}{p}-\frac {1}{p_d}\Big ) = \frac {1}{16}$
for
$d=2$
and with
$\alpha (p) = \frac {1}{3d-1}\Big (\frac {1}{2}-\frac {1}{p}\Big ) = \frac {1}{2d(3d-1)}$
for
$d\geq 3$
.
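The two closed forms quoted for the Gan-Wu exponents at $p=\frac {2d}{d-1}$ can be checked with exact arithmetic. The sketch below is only an arithmetic verification of the identities stated in the preceding sentence; the helper names are hypothetical.

```python
from fractions import Fraction as F

def p_d(d):
    """Endpoint decoupling exponent p_d = 2(d+1)/(d-1)."""
    return F(2 * (d + 1), d - 1)

def gan_wu_alpha(d):
    """Exponent quoted from Gan and Wu at p = 2d/(d-1)."""
    p = F(2 * d, d - 1)
    if d == 2:
        return F(d + 1, 2 * d) * (1 / p - 1 / p_d(d))
    return F(1, 3 * d - 1) * (F(1, 2) - 1 / p)

print(gan_wu_alpha(2))                          # the d = 2 value
print([gan_wu_alpha(d) for d in range(3, 6)])   # values for d = 3, 4, 5
```

For $d=2$ this evaluates to $\frac {1}{16}$, and for $d\geq 3$ it agrees with $\frac {1}{2d(3d-1)}$.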
We next discuss the sharpness of Theorem 1.3. The exponent
$\alpha (p)$
given in (1.2) is optimal when
$\frac {2(d^2+2d-1)}{d^2+1} \leq p\leq p_d$
, which is the range where
$\alpha (p)=\frac {d+1}{2d}(\frac {1}{p}-\frac {1}{p_d})$
. Indeed, if (1.3) holds for any
$Y\subset B_R$
, then
$\alpha (p)$
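must obey
$$ \alpha (p) \leq \min \Big \{ \frac {d+1}{2d}\Big (\frac {1}{p}-\frac {1}{p_d}\Big ),\ \frac {1}{2}-\frac {1}{p} \Big \}. \tag{1.4} $$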
To see this, one takes
$\widehat {f_\theta } = \frac {1}{|\theta |}\chi _{\theta }$
for a bump function
$\chi _\theta $
supported on
$\theta $
and lets
$Y=B_c(0)$
for sufficiently small
$c\sim 1$
(cf. [Reference Gan and Wu13]). Then
$|\sum _\theta f_\theta (x)| \sim \# \theta \sim R^{\frac {d-1}{2}}$
on Y. This example gives
$\alpha (p)\leq \frac {d+1}{2d}\left (\frac {1}{p}-\frac {1}{p_d}\right )$
. To see the other upper bound, it suffices to take
$f=f_\theta $
for a fixed
$\theta $
and take
$Y=\theta ^*$
, the rectangle dual to
$\theta $
.
Comparing (1.4) and (1.2), we see that the exponent
$\alpha (p)$
given in Theorem 1.3 is optimal for
$d=2$
for all
$2\leq p\leq 6$
. Note that
$\alpha (p)$
takes the maximum value
$\alpha (p)=\frac {1}{7}$
at
$p=\frac {14}{5}$
; we will use Theorem 1.3 with this particular p for our application to the Bochner-Riesz means. However, in dimensions
$d\geq 3$
, we do not currently have any reason to believe that the exponent
$\alpha (p) = \frac {1}{d-1}\big (\frac {1}{2}-\frac {1}{p}\big )$
from (1.2) for p close to
$2$
is optimal. Moreover, if we assume further conditions on
$Y\subset B_R$
, then (1.3) may hold with an exponent
$\alpha (p)$
that does not obey (1.4); see [Reference Gan and Wu13] for a discussion of such a possibility when
$|Y|\sim R^{1/2}$
and
$d=2$
.
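To make the shape of $\alpha (p)$ concrete, here is a small Python sketch that assembles it from the two branches quoted in the text and checks the stated values for $d=2$. Assigning the second branch to the whole range below the crossover exponent is our assumption (the text only says "for p close to 2"), and all function names are illustrative.

```python
from fractions import Fraction as F

def p_d(d):
    """Endpoint decoupling exponent p_d = 2(d+1)/(d-1)."""
    return F(2 * (d + 1), d - 1)

def crossover(d):
    """Exponent 2(d^2+2d-1)/(d^2+1) from which the first branch is used."""
    return F(2 * (d * d + 2 * d - 1), d * d + 1)

def branch_high(d, p):
    return F(d + 1, 2 * d) * (1 / p - 1 / p_d(d))

def branch_low(d, p):
    return F(1, d - 1) * (F(1, 2) - 1 / p)

def alpha(d, p):
    """alpha(p), pieced together from the two branches (assumed form)."""
    return branch_high(d, p) if p >= crossover(d) else branch_low(d, p)

# d = 2: the branches meet at p = 14/5, where alpha takes its largest value.
print(crossover(2), alpha(2, F(14, 5)))
```

The two branches agree at the crossover exponent in every dimension, and for $d=2$ the common value there is $\frac {1}{7}$.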
Our proof of Theorem 1.3 builds on [Reference Gan and Wu13], which is based on a broad-narrow analysis going back to the work of Bourgain and Guth [Reference Bourgain and Guth3] (see also [Reference Guth15]). In the broad case, we use the Bennett-Carbery-Tao multilinear restriction estimate [Reference Bennett, Carbery and Tao1] to exploit the transversality and combine it with the
$\ell ^p$
-decoupling theorem as in [Reference Gan and Wu13]. In the narrow case, we do induction on scales in a way similar to the one developed in the proof of the refined Strichartz estimate by Du, Guth and Li [Reference Du, Guth and Li8]. The novelty of our proof lies in the analysis of the narrow case. In order to perform induction on scales more efficiently, we utilize a weighted decoupling estimate (see Theorem 2.5), which originates from the refined and weighted decoupling estimates obtained by Guth, Iosevich, Ou and Wang [Reference Guth, Iosevich, Ou and Wang16] and Du, Ou, Ren and Zhang [Reference Du, Ou, Ren and Zhang9]. Given Theorem 1.3, Theorem 1.1 follows from the argument of [Reference Gan and Wu13].
Plan of the Paper
In Section 2, we prove a multilinear weighted decoupling estimate, which will be used to handle the broad part. We also state a weighted refined decoupling estimate to be used in the narrow part. In Section 3, we prove Theorem 1.3. In Section 4, we give an exposition of the paper [Reference Gan and Wu13] on the implication of weighted
$\ell ^p$
-decoupling estimates to the weak-type estimate (1.1) for the Bochner-Riesz means, completing the proof of Theorem 1.1. In addition, we give an improved bound in dimension
$d=3$
. Throughout the paper, we denote
$p_d = \frac {2(d+1)}{d-1}$
with the convention
$p_1 = \infty $
.
2 Weighted and refined decoupling inequalities
In this section, we prove a multilinear weighted refined decoupling inequality, Theorem 2.3, and then introduce a weighted refined decoupling estimate, Theorem 2.5, which will be used in the proof of Theorem 1.3.
2.1 A multilinear weighted decoupling estimate
We first recall the multilinear restriction estimate [Reference Bennett, Carbery and Tao1]. Suppose that
$U_j\subset \mathbb {S}^{d-1}$
for
$j=1,2,\cdots ,d$
, such that for any
$v_j\in U_j$
,

for some
$0<\nu \leq 1$
.
Theorem 2.1 (Multilinear restriction)
For each
$1\leq j\leq d$
, let
$S_j \subset S$
be a cap such that the set
$U_j\subset \mathbb {S}^{d-1}$
of vectors normal to
$S_j$
satisfies (2.1). Suppose that the Fourier transform of
$f_j$
is supported on the
$R^{-1}$
-neighborhood of
$S_j$
. Then

Next, we recall the refined decoupling estimate, which is stated in terms of wave packets. We recall the setup. Given
$0<\epsilon \ll 1$
, let
$\epsilon _1 = \epsilon ^{100}$
. For each
$\theta $
, let
$\mathbb {T}_\theta $
denote a collection of finitely overlapping tubes T covering
$\mathbb {R}^d$
of length
$\sim R^{1+\epsilon _1}$
and radius
$\sim R^{(1+\epsilon _1)/2}$
with long axis orthogonal to
$\theta $
. Let
$\mathbb {T}(R) = \cup _\theta \mathbb {T}_\theta $
. Given
$T\in \mathbb {T}(R)$
, let
$\theta (T)$
denote the
$\theta $
for which
$T\in \mathbb {T}_\theta $
.
We say that a function
$f_T$
is microlocalized to
$(T,\theta )$
if
$f_T$
and
$\widehat {f_T}$
are essentially supported on
$2T$
and
$2\theta $
, respectively, in the sense that the
$L^p$
norms of the restrictions of
$f_T$
to
$(2T)^c$
in the physical space and to
$(2\theta )^c$
in the frequency space are both
$O(R^{-100d} \| f_T \|_{L^p})$
. Terms involving
$O(R^{-100d})$
can be absorbed into the main term of the estimate, so we will ignore these terms in the following statements and proofs. We recall that a function f whose Fourier transform is supported on the
$R^{-1}$
-neighborhood of S admits a wave packet decomposition
$f=\sum _\theta f_\theta =\sum _{T\in \mathbb {T}(R)} f_T$
, where
$f_T$
is microlocalized to
$(T,\theta (T))$
.
We can now state the refined decoupling theorem, [Reference Guth, Iosevich, Ou and Wang16, Theorem 4.2].
Theorem 2.2 (Refined decoupling)
Let
$\mathbb {W} \subset \mathbb {T}(R)$
and suppose that
$f=\sum _{T\in \mathbb {W}} f_T$
, where
$f_T$
is microlocalized to
$(T,\theta (T))$
. Let
$Y\subset B_R$
be the union of a collection of
$R^{1/2}$
-cubes Q, each of which is contained in
$3T$
for at most
$M\geq 1$
tubes
$T\in \mathbb {W}$
. Then for
$2\leq p\leq p_d$
,

The following is a multilinear refinement of Theorem 2.2.
Theorem 2.3. For each
$j=1,2,\cdots , d$
, let
$S_j \subset S$
be a cap such that the set
$U_j\subset \mathbb {S}^{d-1}$
of vectors normal to
$S_j$
satisfies (2.1). Let
$\mathbb {W}_j \subset \mathbb {T}(R)$
such that
$2\theta (T)$
is contained in the
$O(R^{-1})$
-neighborhood of
$S_j$
for every
$T\in \mathbb {W}_j$
and let
$f_j = \sum _{T\in \mathbb {W}_j} f_T$
, where
$f_T$
is microlocalized to
$(T,\theta (T))$
. Let
$\{Q\}$
be a collection of
$R^{1/2}$
-cubes contained in
$B_R$
such that each cube in the collection is contained in
$3T$
for at most
$M_j$
tubes
$T\in \mathbb {W}_j$
for each
$1\leq j\leq d$
. Let Y be a subset of
$\cup _Q Q$
. Then for
$2 \leq p\leq p_d$
,

For the proof, we interpolate Theorem 2.1 and Theorem 2.2. We remark that the use of refined decoupling is optional here as we shall use Theorem 2.3 with
$M_j \lesssim R^{\frac {d-1}{2}}$
. We note that Theorem 2.3 for the case
$p=\frac {2d}{d-1}$
was essentially proved in [Reference Gan and Wu13]. Our proof is slightly more direct than the one given there in that it does not go through the multilinear Kakeya estimate.
Proof. We prove the result for
$2\leq p\leq \frac {2d}{d-1}$
; the result for the remaining p follows from interpolation with the statement for
$p=p_d$
, which is a consequence of Hölder and Theorem 2.2.
We may view
$f_T$
as essentially constant on T; this can be made precise by using the frequency localization property of
$f_T$
; cf. Section 3.1. By dyadic pigeonholing, we may reduce the matter to the case where, for each
$j=1,\cdots ,d$
,
$| f_T | \sim \gamma _j$
on T for some
$\gamma _j>0$
for all
$T\in \mathbb {W}_j$
. By homogeneity, we may assume that
$\gamma _j=1$
. Then for
$1\leq q< \infty $
,

By Hölder, we get

The multilinear restriction estimate and the
$L^2$
-orthogonality imply

However, by Hölder and the refined decoupling estimate, Theorem 2.2, we have

We combine the
$(d+1)(\frac {1}{p}-\frac {1}{p_d})$
-th power of (2.3) and
$(d+1)(\frac {1}{2}-\frac {1}{p})$
-th power of (2.4). Then we get

Finally, we combine this estimate with (2.2).
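The two powers used to combine (2.3) and (2.4) should form a convex combination. The sketch below checks, in exact arithmetic, that they are nonnegative and sum to 1 for $2\leq p\leq p_d$; it verifies only this bookkeeping, not the estimates themselves, and the function name is ours.

```python
from fractions import Fraction as F

def weights(d, p):
    """Powers applied to (2.3) and (2.4) in the proof of Theorem 2.3."""
    inv_pd = F(d - 1, 2 * (d + 1))               # 1/p_d
    w_restriction = (d + 1) * (1 / p - inv_pd)   # power on (2.3)
    w_decoupling = (d + 1) * (F(1, 2) - 1 / p)   # power on (2.4)
    return w_restriction, w_decoupling

for d in (2, 3, 4):
    p = F(2 * d, d - 1)
    print(d, weights(d, p), sum(weights(d, p)))
```

The sum equals $(d+1)(\frac {1}{2}-\frac {1}{p_d}) = 1$ independently of p.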
2.2 A weighted refined decoupling estimate
Let
$v(T) \in \mathbb {S}^{d-1}$
denote the direction of
$T\in \mathbb {T}$
. Following [Reference Du, Ou, Ren and Zhang9], for
$R^{-1/2}\leq r\leq 1$
and
$1\leq m\leq d$
, we say that
$f = \sum _{T\in \mathbb {W}} f_T$
has
$(r,m)$
-concentrated frequencies if there exists an m-dimensional subspace
$V\subset \mathbb {R}^d$
such that
$\angle (v(T), V) \leq r$
for any
$T \in \mathbb {W}$
.
When f has
$(R^{-1/2},m)$
-concentrated frequencies, one can decouple f by
$\ell ^2$
-decoupling in dimension m (cf. [Reference Guth15, Lemma 9.3]). In the paper [Reference Du, Ou, Ren and Zhang9], it was observed that this continues to hold for the refined decoupling estimate as well.
Theorem 2.4 ([Reference Du, Ou, Ren and Zhang9, Theorem 1.2(a)])
Let
$\mathbb {W} \subset \mathbb {T}(R)$
and suppose that
$f=\sum _{T\in \mathbb {W}} f_T$
, where
$f_T$
is microlocalized to
$(T,\theta (T))$
. Let
$Y\subset B_R$
be the union of a collection of
$R^{1/2}$
-cubes Q, each of which is contained in
$3T$
for at most
$M\geq 1$
tubes
$T\in \mathbb {W}$
. Suppose that f has
$(R^{-1/2},m)$
-concentrated frequencies for some
$1\leq m\leq d$
. Then for
$2\leq p\leq p_m$
,

A consequence of Theorem 2.4 is a weighted refined decoupling estimate with weights
$w: Y\to [0,1]$
such that
$\int _{Q} w \lesssim R^{\alpha /2}$
holds for any
$R^{1/2}$
-cube Q in Y (see [Reference Du, Ou, Ren and Zhang9, Theorem 1.2(b)]). Here we state a slightly more general version, which does not require such an assumption.
Theorem 2.5 (Weighted refined decoupling)
Let
$1\leq m\leq d$
and
$2\leq p\leq p_m$
. Suppose that
${f= \sum _{T\in \mathbb {W}} f_T}$
, where
$\mathbb {W} \subset \mathbb {T}(R)$
and each
$f_T$
is microlocalized to
$(T,\theta (T))$
. Assume that f has
$(R^{-1/2},m)$
-concentrated frequencies. Let
$\{ Q\}$
be a collection of disjoint
$R^{1/2}$
-cubes in
$B_R$
such that each Q is contained in
$3T$
for at most
$M\geq 1$
tubes
$T\in \mathbb {W}$
. Let Y be a subset of
$\cup _Q Q$
. Let
$w: Y \to [0,1]$
and
$w(T) := \int _T w$
. Then

The power of
$\max _{T\in \mathbb {W}} \frac {w(3T)}{|T|}$
represents a gain over the refined decoupling estimate. Theorem 2.5 essentially follows from the proof of [Reference Du, Ou, Ren and Zhang9, Theorem 1.2(b)]. We include a sketch of the proof for the sake of completeness.
Proof. By dyadic pigeonholing, we may assume that each
$R^{1/2}$
-cube Q in the collection is contained in
$3T$
for
$\sim M$
tubes
$T\in \mathbb {W}$
. By Hölder,

Next, we use Theorem 2.4 with exponent
$p=p_m$
and then replace the
$\ell ^{p_m}(L^{p_m})$
norm back to
$\ell ^{p}(L^{p})$
as in the proof of Theorem 2.3, which gives

Then

Combining this inequality and (2.5) finishes the proof.
3 Broad-narrow analysis: Proof of Theorem 1.3
In this section, we prove the following theorem, which implies Theorem 1.3 by the wave packet decomposition.
Theorem 3.1. Suppose that
$\mathbb {W} \subset \mathbb {T}(R)$
and
$f=\sum _{T\in \mathbb {W}} f_T$
, where
$f_T$
is microlocalized to
$(T,\theta (T))$
. Let
$p_d = \frac {2(d+1)}{d-1}$
and
$Y\subset B_R$
. Then for
$2\leq p \leq p_d$
and
$\alpha (p)$
defined in (1.2),

We turn to the proof of Theorem 3.1. We decompose f at a scale dictated by the parameter
${K=R^{\epsilon _1}=R^{\epsilon ^{100}}}$
. Consider a cover of
$S$
by rectangles
$\tau $
of dimensions
$K^{-1}\times \cdots \times K^{-1}\times K^{-2}$
such that for each
$T\in \mathbb {W}$
, there exist at least one and at most
$O(1)$
rectangles
$\tau $
containing
$\theta (T)$
. We fix one such
$\tau =\tau (T)$
for each
$T\in \mathbb {W}$
. We let
$\mathbb {W}_\tau = \{ T\in \mathbb {W}: \tau (T) = \tau \}$
and
$f_\tau = \sum _{T \in \mathbb {W}_\tau } f_T$
, so that
$f=\sum _\tau f_\tau $
.
For each lattice
$K^2$
-cube B intersecting Y, define the ‘significant set’

Then the definition of
$S(B)$
ensures that for each
$K^2$
-cube B,

We say that a lattice
$K^2$
-cube B intersecting Y is narrow if there exists a
$(d-1)$
-dimensional subspace
$V\subset \mathbb {R}^d$
such that for every
$\tau \in S(B)$
,

where
$G(\tau )\in \mathbb {S}^{d-1}$
denotes the direction normal to
$\tau $
. Otherwise, we say that B is broad. If B is broad, then there exist
$\tau _1,\tau _2,\cdots ,\tau _d \in S(B)$
such that

Let
$\mathcal {B}_{\operatorname {broad}}$
denote the collection of broad
$K^2$
-cubes and
$\mathcal {B}_{\operatorname {narrow}}$
denote the collection of narrow
$K^2$
-cubes. We let

We say that we are in the broad case if
$\| f\|_{L^p(Y_{\operatorname {broad}})} \geq \| f\|_{L^p(Y_{\operatorname {narrow}})}$
. Otherwise, we say that we are in the narrow case. We handle each case in the following subsections.
3.1 Broad case
In this subsection, we denote by
$A \lessapprox B$
expressions of the form
$A\leq K^{O(1)} B$
; the loss of
$K^{O(1)}$
is harmless as long as it is at most
$R^{\epsilon /2}$
.
In the broad case, we have
$\| f\|_{L^p(Y)} \lesssim \| f\|_{L^p(Y_{\operatorname {broad}})}$
. Let
$B\in \mathcal {B}_{\operatorname {broad}}$
. Then there exist
$\tau _1,\tau _2,\cdots ,\tau _d \in S(B)$
satisfying (3.2). We fix such a d-tuple and denote
$\bar {\tau }(B) = (\tau _1,\cdots ,\tau _d)$
. Let

Since
$\# \Gamma \lesssim K^{O(1)}$
, by dyadic pigeonholing, there exist
$\bar {\tau }=(\tau _1,\cdots ,\tau _d) \in \Gamma $
and a sub-collection of broad
$K^2$
-cubes
$\mathcal {B}_{\operatorname {broad}}^1$
such that
$\bar {\tau }(B) = \bar {\tau }$
for all
$B\in \mathcal {B}_{\operatorname {broad}}^1$
and

Let
$B\in \mathcal {B}_{\operatorname {broad}}^1$
. Since
$\tau _j \in S(B)$
for each j, we have

Using the Fourier support property of
$f_j$
, we morally have the following reverse Hölder inequality (cf. [Reference Du and Zhang10]):

We shall first assume (3.3), and then make it precise later. Summing (3.3) over
$B\in \mathcal {B}_{\operatorname {broad}}^1$
, we get

By using Theorem 2.3 with the bound
$M_j\lesssim R^{\frac {d-1}{2}}$
and
$\mathbb {W}_{\tau _j} \subset \mathbb {W}$
, we obtain

which implies the claimed estimate (3.1) as
$\alpha (p) \leq \frac {d+1}{2d}(\frac {1}{p}-\frac {1}{p_d})$
.
Now we make the step (3.3) rigorous following [Reference Hickman17]. Since
$f_{\tau _j}$
has compact Fourier support, we may write
$f_{\tau _j} = f_{\tau _j}* \psi $
for some
$\psi $
with compact Fourier support. Therefore, we have
$|f_{\tau _j}| \lesssim |f_{\tau _j}|* \Psi _{1}$
, where
$\Psi _{\rho }(x) = \rho (\rho +|x|)^{-(d+1)}$
. Moreover,

for all
$r>0$
. This follows from Hölder for
$r>1$
. The case
$0<r<1$
can be proved by using Bernstein’s inequality; see [Reference Hickman17, Lemma 5.9]. We also note that
$\Psi _{1}(x) = K^{2d}\Psi _{K^{2}}(K^2 x) \lessapprox \Psi _{K^2}(x)$
. Using (3.5) with
$r=p/d$
, we get

We are going to use the following two properties of
$\Psi _{K^2}$
; its
$L^1$
norm is comparable to
$1$
, and it is locally constant on balls of radius
$K^2$
. The latter implies that
$|f_{\tau _j}|^{\frac {p}{d}}* \Psi _{K^2}(x) \sim |f_{\tau _j}|^{\frac {p}{d}}* \Psi _{K^2}(y)$
whenever
$|x-y|\lesssim K^2$
. Therefore, by (3.6),

Let
$f_{\tau _j,y_j} (x)= f_{\tau _j}(x-y_j)$
. Summing this estimate over
$B\in \mathcal {B}_{\operatorname {broad}}^1$
, we get

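The scaling relation between $\Psi _1$ and $\Psi _{K^2}$ used in this subsection can be tested numerically. The sketch below evaluates $\Psi _\rho $ as a function of $|x|$ and checks both the identity $\Psi _1(x) = K^{2d}\Psi _{K^2}(K^2x)$ and the pointwise bound $\Psi _1(x)\leq K^{2d}\Psi _{K^2}(x)$ for $K\geq 1$; the sample values of d, K and $|x|$ are arbitrary choices of ours.

```python
import math

def Psi(rho, r, d):
    """Psi_rho evaluated at a point of norm r: rho * (rho + r)^(-(d+1))."""
    return rho * (rho + r) ** (-(d + 1))

def check(d, K, radii):
    """Verify the scaling identity and the K^{O(1)} domination at the given radii."""
    for r in radii:
        lhs = Psi(1.0, r, d)
        rescaled = K ** (2 * d) * Psi(K ** 2, K ** 2 * r, d)
        bound = K ** (2 * d) * Psi(K ** 2, r, d)
        assert math.isclose(lhs, rescaled, rel_tol=1e-9)
        assert lhs <= bound * (1 + 1e-9)
    return True

print(check(2, 10.0, [0.0, 0.5, 1.0, 50.0, 1e4]))
```

The second check reflects that $K^{2d}\Psi _{K^2}(x) = (1+|x|/K^2)^{-(d+1)}$, which dominates $(1+|x|)^{-(d+1)}$ when $K\geq 1$, so the loss is the harmless factor $K^{2d}$.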
3.2 Narrow case
In the narrow case, we use induction on the scale R. In this subsection, we denote by
$A \lessapprox B$
expressions of the form
$A\leq (\log R )^{O(1)} B$
and keep track of powers of K.
We start by considering the base case
$R\sim 1$
. Since
$|\mathbb {W}|\sim 1$
, we have

which is better than the claimed estimate since
$\alpha (p) \leq 1/p$
and
$|Y|\lesssim 1$
.
Let
$B\in \mathcal {B}_{\operatorname {narrow}}$
. Recall that
$\| f\|_{L^p(B\cap Y)} \sim \| \sum _{\tau \in S(B)} f_\tau \|_{L^p(B\cap Y)}$
. For each
$\tau $
, we use a wave packet decomposition

where
$f_{T_1}$
is microlocalized to
$(T_1,\tau )$
. Note that
$\mathbb {T}_{\tau } \subset \mathbb {T}(K^2)$
and each
$T_1\in \mathbb {T}_{\tau }$
has length
$\sim K^{2(1+\epsilon _1)}$
and radius
$\sim K^{(1+\epsilon _1)}$
. For dyadic
$R^{-1000d} \leq \eta \leq |T_1|$
, let
$\mathbb {T}_{\tau ,\eta }$
denote the collection of
$T_1\in \mathbb {T}_{\tau }$
such that
$|3T_1 \cap Y| \sim \eta $
. The contribution of tubes
$T_1 \in \mathbb {T}_\tau $
such that
$|3T_1 \cap Y| \lesssim R^{-1000d}$
is negligible; it can be absorbed into the main term by crude estimates. By dyadic pigeonholing, there exists dyadic
$\eta $
such that

We fix this
$\eta $
and let
$\mathbb {T}(K^2;B)$
denote the collection of
$T_1\in \cup _{\tau \in S(B)} \mathbb {T}_{\tau ,\eta }$
such that
$2T_1$
intersects B. Then

For each
$B\in \mathcal {B}_{\operatorname {narrow}}$
,
$\sum _{T_1 \in \mathbb {T}(K^2;B)} f_{T_1}$
has
$(K^{-1},d-1)$
-concentrated frequencies and
$|S(B)| \lesssim K^{d-2}$
. Therefore, by the weighted refined decoupling Theorem 2.5,

For each
$\tau $
, we cover
$B_R$
by parallelepipeds
$\square $
of dimensions
$K^{-1}R \times \cdots \times K^{-1}R \times R$
with the long axis perpendicular to
$\tau $
. We denote the collection of
$\square $
by
$\mathbb {B}_\tau $
and
$\mathbb {B} = \cup _\tau \mathbb {B}_\tau $
. In addition, given
$\square \in \mathbb {B}$
, we denote by
$\tau (\square )$
the
$\tau $
for which
$\square \in \mathbb {B}_\tau $
.
Let
$\mathbb {T}_\square $
denote the collection of all
$T_1 \in \cup _{B\in \mathcal {B}_{\operatorname {narrow}}} \mathbb {T}(K^2;B)$
such that
$2T_1$
intersects
$\square $
and
${\tau (T_1) = \tau (\square )}$
. Let
$Y_\square =\cup _{T_1 \in \mathbb {T}_{\square } } 2T_1$
. Let
$\mathbb {W}_\square $
denote the collection of
$T\in \mathbb {W}_{\tau (\square )}$
such that
$2T$
intersects
$Y_\square $
and define
$f_\square = \sum _{T\in \mathbb {W}_\square } f_T$
. We record here that

Moreover,
$f_\square $
is microlocalized to
$(\square ,\tau (\square ))$
, and we have

Summing the p-th power of (3.7) over narrow
$B\in \mathcal {B}_{\operatorname {narrow}}$
, we get

Next, we apply the induction hypothesis to
$f_{\square }$
after a parabolic rescaling. Without loss of generality, suppose that
$\tau (\square )$
is contained in
$B_{K^{-1}}^{d-1}(0) \times B^1_{K^{-2}}(0)$
. Let
$L(x_1,\cdots ,x_d) = (Kx_1,\cdots ,Kx_{d-1},K^2 x_d)$
. Then we may apply the induction hypothesis to
$f_{\square }\circ L$
over
$L^{-1}(Y_\square ) \subset L^{-1}(\square )$
at the scale
$R_1= R/K^2$
:

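The effect of the parabolic rescaling on the relevant dimensions can be tracked by pure exponent bookkeeping. In the sketch below a length $K^aR^b$ is stored as the pair $(a,b)$; we assume, as a normalization of ours, that the long axis of $\square $ points in the $x_d$ direction, and we use that the Fourier support of $f\circ L$ is the image under the diagonal map L of the Fourier support of f.

```python
from fractions import Fraction as F

# A length K^a * R^b is represented by the exponent pair (a, b).
def times(u, v):
    return (u[0] + v[0], u[1] + v[1])

def inv(u):
    return (-u[0], -u[1])

def power(u, e):
    return (u[0] * e, u[1] * e)

d = 3  # illustrative; only the number of short directions depends on d
L = [(F(1), F(0))] * (d - 1) + [(F(2), F(0))]            # diag(K, ..., K, K^2)
box = [(F(-1), F(1))] * (d - 1) + [(F(0), F(1))]          # K^{-1}R x ... x K^{-1}R x R
theta = [(F(0), F(-1, 2))] * (d - 1) + [(F(0), F(-1))]    # R^{-1/2} x ... x R^{-1}

phys = [times(inv(l), b) for l, b in zip(L, box)]    # side lengths of L^{-1}(box)
freq = [times(l, t) for l, t in zip(L, theta)]       # side lengths of L(theta)
R1 = (F(-2), F(1))                                   # R_1 = R / K^2
print(phys, freq)
```

Every side of $L^{-1}(\square )$ has length $R_1$, and the rescaled caps have dimensions $R_1^{-1/2}\times \cdots \times R_1^{-1/2}\times R_1^{-1}$, consistent with applying the induction hypothesis at scale $R_1$.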
For every
$\square $
,

Thus, we have

By combining this estimate, (3.8) and (3.9), we obtain

Note that

since
$\eta \lesssim |T_1|$
and
$\frac {1}{p}-\frac {1}{p_{d-1}} - \alpha (p) \geq 0$
. Moreover,
$(d-1)\alpha (p) - (\frac {1}{2}-\frac {1}{p}) \leq 0$
by the choice of
$\alpha (p)$
. Therefore, we have

and the induction closes.
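The two exponent conditions used to close the induction can be checked on a grid of exponents. The following sketch assumes that $\alpha (p)$ is the minimum of the two expressions appearing in the text (our reading of (1.2)) and uses the convention $p_1=\infty $; it is a numerical sanity check, not a proof.

```python
from fractions import Fraction as F

def inv_p(m):
    """1/p_m for p_m = 2(m+1)/(m-1); gives 0 when m = 1, i.e. p_1 = infinity."""
    return F(m - 1, 2 * (m + 1))

def alpha(d, p):
    """Assumed form of alpha(p): the minimum of the two branches in the text."""
    return min(F(d + 1, 2 * d) * (1 / p - inv_p(d)),
               F(1, d - 1) * (F(1, 2) - 1 / p))

def closes(d, p):
    """Both sign conditions used at the end of the narrow case."""
    a = alpha(d, p)
    first = 1 / p - inv_p(d - 1) - a >= 0
    second = (d - 1) * a - (F(1, 2) - 1 / p) <= 0
    return first and second

grid = [(d, 2 + (F(2 * (d + 1), d - 1) - 2) * F(k, 20))
        for d in range(2, 7) for k in range(21)]
print(all(closes(d, p) for d, p in grid))
```

The same grid also confirms $\alpha (p)\leq 1/p$, which is the inequality used in the base case.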
4 Convergence of Bochner-Riesz means: Proof of Theorem 1.1
In this section, we prove Theorem 1.1. Given Theorem 3.1, it essentially follows from the argument of Gan and Wu [Reference Gan and Wu13]. The main goal of this section is to give an exposition of their argument. We fix
$j\in \mathbb {N}$
.
Definition 4.1. A multiplier
$m_j$
is type j if

and

for some
$a \in C_0^\infty $
supported on the annulus
$|x|\sim 1$
.
We use the notation
$\widehat {m(D)f}(\xi ) = m(\xi )\widehat {f}(\xi )$
. The Bochner-Riesz means
$S_t^\lambda f$
can be decomposed as

where
$m_j$
is type j for
$j\geq 1$
and
$m_0(D/t) f$
is dominated by the Hardy-Littlewood maximal function uniformly in t (see [Reference Stein21, Reference Tao22]). Tao [Reference Tao22] reduced the weak-type estimate (1.1) to a bound on the maximal function
$f \mapsto \sup _{t\sim 1} |m_j(D/t)f|$
. To be specific, if

holds for some
$\lambda _0$
for type j multipliers
$m_j$
and test functions f for all
$j\geq 1$
, then the weak-type estimate (1.1) holds for
$\lambda>\lambda _0$
. Here and in the following, we denote by
$A\lessapprox B$
the inequality
$A\leq C_\epsilon 2^{\epsilon j} B$
or
$A\leq C_\epsilon R^{\epsilon } B$
, which holds for any
$\epsilon>0$
.
4.1 Linearization
In this subsection, we reduce Theorem 1.1 to the following.
Proposition 4.2. Fix a collection of disjoint intervals
$\{I_1, \cdots , I_{2^j} \}$
of length
$\sim 2^{-j}$
for which
$ {\cup _{l} I_l =[1/2,1]}$
. Let
$c_l \in I_l$
and let
$\{ F_1, \cdots , F_{2^j}\}$
be a partition of
$B(0,C2^j)$
. For a type j multiplier
$m_j$
, let T denote the linear operator defined as

Then

where the implicit constant is independent of the choice of
$\{c_l, F_l\}$
.
We postpone the proof of Proposition 4.2 to the following subsections and first look at its implications. Clearly, T in Proposition 4.2 is dominated by the maximal function
$f \mapsto \sup _{t\sim 1} |m_j(D/t)f|$
. Conversely, we have the following:
Lemma 4.3. Let T be as in Proposition 4.2. Assume that for any type j multiplier
$m_j$
, there exists a constant
$A_j>0$
independent of
$\{F_l, c_l\}_{1\leq l\leq 2^j}$
such that

Then for every type j multiplier
$m_j$
,

Proof. Let
$m_j$
be a type j multiplier. We linearize and discretize the maximal operator. Fix a measurable function
$t:B(0,CR) \to [1/2,1]$
. For each
$l=1,\cdots ,2^j$
, define
$F_l = \{ x\in B(0,CR) : t(x) \in I_l \}$
so that

Let
$a_l$
denote the left endpoint of the interval
$I_l$
. By the fundamental theorem of calculus and Hölder, we have

For each
$1\leq k\leq d$
, let
$m_{j,k} (\xi ) = |I_l| \xi _k \partial _{k} m_{j}(\xi )$
. Then
$m_{j,k}$
is a type j multiplier and

Choose
$b_l\in I_l$
for which

Then

By summing the inequality over l and using the assumption, we get

Since this holds for any measurable function t, the claim is proved.
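The discretization step in the proof can be illustrated on sample points. The sketch below splits $[1/2,1]$ into $2^j$ intervals of equal length $2^{-j-1}$ and places a point x in $F_l$ when $t(x)\in I_l$. The indexing from 0, the half-open convention for the intervals, and the sample function t are our own illustrative choices.

```python
from collections import defaultdict

def interval_index(t, j):
    """Index l in {0, ..., 2^j - 1} of the interval I_l containing t."""
    n = 2 ** j
    if not 0.5 <= t <= 1.0:
        raise ValueError("t must lie in [1/2, 1]")
    # I_l = [1/2 + l / (2n), 1/2 + (l + 1) / (2n)); t = 1 goes to the last interval.
    return min(int((t - 0.5) * 2 * n), n - 1)

def partition_by_time(points, t, j):
    """Return {l: [x, ...]}, where x is placed in F_l when t(x) lies in I_l."""
    sets = defaultdict(list)
    for x in points:
        sets[interval_index(t(x), j)].append(x)
    return dict(sets)

sample = partition_by_time([0.0, 1.0, 2.0, 3.0], lambda x: 0.5 + x / 6, j=1)
print(sample)
```

Since the sets $F_l$ are the preimages of disjoint intervals covering $[1/2,1]$, they partition the domain of t, as required in Proposition 4.2.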
By Lemma 4.3 and Tao’s reduction discussed earlier, we have reduced Theorem 1.1 to Proposition 4.2, which will be proved in the following subsections.
4.2 Decompositions
In this subsection, we work on
$\mathbb {R}^d$
for
$d\geq 2$
. We fix
$\{ c_l,F_l\}_{l=1}^{2^j}$
as in Proposition 4.2 and let

Given any exponent
$1\leq p_0\leq \infty $
, by dyadic pigeonholing, there exist
$\mathcal {J} \subset \{1,2,\cdots , 2^j\}$
and
$2^{-100d j} \leq \gamma \leq 2^{jd}$
for which
$|F_l|\sim \gamma $
for all
$l\in \mathcal {J}$
and

where

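The dyadic pigeonholing used here follows a generic pattern: group the indices l by the dyadic size of $|F_l|$ and keep the group with the largest total contribution. The sketch below shows this pattern on hypothetical inputs `sizes` and `contributions`; it is not tied to the specific norms in the text, and it ignores the sets that are too small to matter.

```python
import math
from collections import defaultdict

def pigeonhole(sizes, contributions):
    """Pick the dyadic size class carrying the largest total contribution.

    Returns (k, indices, number_of_classes), where sizes[l] lies in [2^k, 2^(k+1))
    for every selected index l.
    """
    buckets = defaultdict(list)
    for l, (size, c) in enumerate(zip(sizes, contributions)):
        buckets[math.floor(math.log2(size))].append((l, c))
    best = max(buckets, key=lambda k: sum(c for _, c in buckets[k]))
    return best, [l for l, _ in buckets[best]], len(buckets)

k, J, n_classes = pigeonhole([1.0, 1.5, 4.0, 5.0, 0.25], [1, 1, 3, 3, 10])
print(k, J, n_classes)
```

The selected class carries at least a $1/N$ share of the total, where N is the number of nonempty classes. In the setting of the text there are $O(j)$ dyadic values of $\gamma $ between $2^{-100dj}$ and $2^{jd}$, so the loss is only polynomial in j.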
By real interpolation, it suffices to work with
$f=1_E$
, the characteristic function of
$E\subset B(0,2^j)$
. Given
$0<\epsilon \ll 1$
, let
$0<\epsilon _1 \ll \epsilon $
. For a technical reason, we work with the parameter
$R=2^{(1-\epsilon _1)j}$
, but one may identify R and
$2^j$
as
$R\approx 2^j$
. The main result of this subsection is the following.
Proposition 4.4 (cf. [Reference Gan and Wu13])
Let
$E\subset B(0,2^j)$
and
$2\leq p \leq p_d$
. Let
$R=2^{(1-\epsilon _1)j}$
. Suppose that (3.1) holds with the exponent
$\alpha (p)$
. Then

Remark 4.5. When
$d=2$
and
$p=6$
, Proposition 4.4 gives the
$L^{18/13}$
bound obtained in [Reference Li and Wu19]. However, when
$d\geq 3$
and
$p=p_d$
, Proposition 4.4 gives sufficient conditions for the weak-type estimate (1.1), which are no better than the classical sufficient condition
$\lambda>(d-1)(\frac {1}{p}-\frac {1}{2})$
.
Proposition 4.4 follows from the argument of [Reference Gan and Wu13]. We sketch the proof for the sake of exposition. We decompose E according to its density. For a parameter
$\beta _2>0$
to be chosen, let
$\{ Q\}$
denote the collection of maximal dyadic cubes such that

where
$l(Q)$
is the side-length of Q. Let
$E_2= \cup _Q (E\cap Q)$
and
$E_1 = E \setminus E_2$
. Then

for any dyadic cube B; otherwise, some dyadic cube Q in the collection would contain B, leading to the contradiction
$|E_1 \cap B|=0$
. These sets
$E_1$
and
$E_2$
are the low and the high density parts of E, respectively. Note that
$T_{\mathcal {J}}1_E = T_{\mathcal {J}} 1_{E_1} + T_{\mathcal {J}} 1_{E_2}$
.
For the high-density part
$E_2$
, we have the following estimate from a local
$L^2$
-estimate, which goes back to [Reference Tao23].
Lemma 4.6 [Reference Gan and Wu13, Lemma 3.4]
For any
$\alpha>0$
,

For the low-density part
$E_1$
, we use a wave-packet decomposition. Let
$R=2^{j(1-\epsilon _1)}$
, so that
$m_j$
is essentially supported on the
$R^{-1}$
-neighborhood of
$\mathbb {S}^{d-1}$
. Cover
$\mathbb {S}^{d-1}$
by finitely overlapping caps
$\{ \theta \}$
of diameter
$R^{-1/2}$
and let
$\{ \chi _{\theta } \}$
be an associated smooth partition of unity. Define
$m_{j,\theta }(\xi ) = m_j(\xi ) \chi _{\theta }(\xi /|\xi |)$
so that
$m_{j,\theta }(\xi /c_l)$
is essentially a bump function supported on a box
$\theta _l$
of dimensions
$R^{-1/2}\times \cdots \times R^{-1/2}\times R^{-1}$
.
Let
$f_{\theta _l}$
denote
$m_{j,\theta }(D/c_l) f$
. Let
$\mathbb {T}_\theta $
denote the collection of tubes as in Section 2. Then
$f_{\theta _l}$
is morally constant on each
$T\in \mathbb {T}_{\theta }$
. Indeed, we may choose an
$L^1$
-normalized non-negative function
$\Psi _\theta $
such that
$\Psi _\theta (x) \sim \Psi _\theta (y)$
for all
$x,y\in T$
and

up to a negligible error term (cf. Section 3.1). Therefore,
$ |f_{\theta _l}|* \Psi _\theta (x) \sim |f_{\theta _l}|* \Psi _\theta (y) $
for all
$x,y \in T$
. Let
$f=\chi _{E_1}$
. Given
$\beta _1>0$
, we let

and
$\mathbb {T}_{\theta _l,large} = \mathbb {T}_{\theta } \setminus \mathbb {T}_{\theta _l,small}$
. Let
$\{ \chi _{T} \}_{T\in \mathbb {T}_\theta }$
denote a smooth partition of unity associated with the covering
$\mathbb {T}_\theta $
. Note that
$f_{\theta _l} \chi _{T}$
is microlocalized to
$(T, \theta _l)$
for each
$T\in \mathbb {T}_\theta $
. We define

We define
$T_{large}(f)$
similarly, which gives the decomposition
$T_{\mathcal {J}} f = T_{small}(f)+T_{large}(f)$
.
Lemma 4.7 [Reference Gan and Wu13, Lemmas 3.2 and 3.3]
Suppose that the weighted
$\ell ^p$
-decoupling estimate (3.1) holds with the exponent
$\alpha (p)$
. Then

The assumption on the weighted
$\ell ^p$
-decoupling inequality (3.1) is utilized only for the bound (4.5). For the sake of completeness, we give the proof.
Proof of (4.5)
By the weighted
$\ell ^p$
-decoupling estimate (3.1),

Let
$f= 1_{E_1}$
. Note that

Therefore,

by the
$L^2$
-orthogonality, giving the claimed bound.
We may now prove Proposition 4.4.
Proof of Proposition 4.4
By Lemma 4.7,

We take
$\beta _1$
so that the two terms are equal, which gives

By combining Lemma 4.6 and (4.6),

Finally, it suffices to choose
$\beta _2$
so that the two terms are equal:

4.3 Proof of Proposition 4.2
By applying Proposition 4.4 and Theorem 3.1 with
$p=\frac {14}{5}$
and
$\alpha (p) = \frac {1}{7}$
, we obtain the restricted weak-type estimate

or equivalently,

By Hölder’s inequality (for Lorentz spaces), we get

In order to use the gain of
$(\gamma R^{-2})^{\frac {1}{13}}$
in (4.7), we interpolate it with an
$L^{4/3}$
-estimate. We recall the
$L^{4/3}$
-estimate due to Carleson and Sjölin [Reference Carleson and Sjölin6] and Córdoba [Reference Córdoba7]:

Since the
$L^{4/3}$
-operator norm of
$m_j(D/t)$
is independent of
$t>0$
, we have

We interpolate (4.7) and (4.8) so that we may apply
$\gamma R^{-2} |\mathcal {J}| \lessapprox 1$
. To be specific, take
$\theta = \frac {39}{43}$
so that
$\theta \cdot \frac {1}{13} = (1-\theta ) \frac {3}{4}$
. Then
$\theta \frac {17}{26} + (1-\theta ) \frac {3}{4}=\frac {57}{86}$
. Real interpolation of (4.7) and (4.8) gives

which completes the proof of Proposition 4.2 by (4.3).
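The choice of the interpolation parameter can be reproduced in exact arithmetic. The sketch below solves $\theta \cdot \frac {1}{13} = (1-\theta )\frac {3}{4}$ for $\theta $ and then forms the combination $\theta \frac {17}{26}+(1-\theta )\frac {3}{4}$ from the text; reading the result as the reciprocal of the exponent in Theorem 1.1 is our interpretation, and the variable names are illustrative.

```python
from fractions import Fraction as F

g = F(1, 13)                # power of gamma R^{-2} gained in (4.7)
a, b = F(17, 26), F(3, 4)   # the two exponents combined in the text

theta = b / (g + b)         # solves theta * g == (1 - theta) * b
inv_p = theta * a + (1 - theta) * b
print(theta, inv_p, 1 / inv_p)
```

This gives $\theta =\frac {39}{43}$ and the value $\frac {57}{86}$, whose reciprocal $\frac {86}{57}$ is the exponent appearing in Theorem 1.1.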
Remark 4.8. Consider the example
$f(x) = e^{2\pi i x_2} \psi (x_1,2^{-j/2} x_2)$
from [Reference Tao22]. In this example, if
${x_1\sim x_2 \sim 2^j}$
, then

where
$t(x) = |x|/x_2$
. Thus, we may choose
$\{ c_l,F_l\}$
so that
$\gamma \sim 2^{j}$
,
$|\mathcal {J}|\sim 2^j$
,
$\| T_{\mathcal {J}} f\|_{L^p} \gtrsim 2^{2j(\frac {1}{p}-\frac {1}{2})}$
and
$\| f\|_{L^q} \sim 2^{j\frac {1}{2q}}$
. In particular,

This suggests that there may be significant room for improvement in the estimate (4.8). Any improvement to (4.8) would advance our knowledge of the almost everywhere convergence of the Bochner-Riesz means.
4.4 An improved bound for
$d= 3$
Theorem 4.9. Let
$d=3$
,
$1< p< 2$
and

Then the weak-type estimate (1.1) holds. Consequently,
$\lim _{t\to \infty } S^\lambda _t f(x) = f(x)$
for almost every
$x\in \mathbb {R}^3$
for any
$f\in L^p(\mathbb {R}^3)$
.
Theorem 4.9 gives a small improvement to the result obtained in [Reference Gan and Wu13]; when
$p=\frac {3}{2}$
, (1.1) holds for
$\lambda> \frac {27}{85}=0.317\cdots $
, improving the sufficient condition
$\lambda> \frac {107}{325}=0.329\cdots $
from [Reference Gan and Wu13].
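For orientation, the thresholds at $d=3$ and $p=\frac {3}{2}$ can be placed side by side. The sketch below compares the two values quoted above with the classical sufficient condition $\lambda >(d-1)(\frac {1}{p}-\frac {1}{2})$ and the necessary condition $\lambda >\max (0,d(\frac {1}{p}-\frac {1}{2})-\frac {1}{2p})$ from the introduction; the variable names are ours.

```python
from fractions import Fraction as F

d, p = 3, F(3, 2)
classical = (d - 1) * (1 / p - F(1, 2))                       # classical sufficient threshold
necessary = max(F(0), d * (1 / p - F(1, 2)) - 1 / (2 * p))    # necessary threshold
new, previous = F(27, 85), F(107, 325)                        # values quoted in the text

for name, value in [("necessary", necessary), ("new", new),
                    ("previous", previous), ("classical", classical)]:
    print(f"{name:10s} {str(value):8s} {float(value):.4f}")
```

The ordering is $\frac {1}{6}<\frac {27}{85}<\frac {107}{325}<\frac {1}{3}$, so a gap to the necessary condition remains at this exponent.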
Proof. Let
$E\subset B(0,2^j)$
. By taking
$p=\frac {14}{5}$
and
$\alpha (p)=\frac {1}{14}$
in Proposition 4.4, we get

or equivalently,

By Hölder’s inequality, we get

However, by the sharp
$L^p$
estimate for the Bochner-Riesz means at the exponent
$p=\frac {13}{9}$
due to Wu [Reference Wu24] (see also [Reference Guo, Oh, Wang, Wu and Zhang14]), there holds

Therefore,

We interpolate (4.9) and (4.10) so that we may apply
$\gamma |\mathcal {J}| R^{-3} \lessapprox 1$
. This gives

This implies, by (4.3) and Lemma 4.3, that

By interpolation with the standard
$L^1$
and
$L^2$
estimates, we get for
$1\leq p \leq 2$
,

By the reduction of Tao, this implies Theorem 4.9.
Funding statement
Partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 21309222).
Competing interests
The authors have no competing interests to declare.