On equidistribution of polynomial sequences in quotients of $PSL_2(\mathbb {R})$

Published online by Cambridge University Press:  30 October 2025

LAURITZ STRECK*
Affiliation:
Centre for Mathematical Sciences, University of Cambridge , Wilberforce Road, Cambridge, CB3 0WA, UK

Abstract

In this paper, it is shown that for every lattice $\Gamma \subset PSL_2(\mathbb {R})$, there exists a $c>0$ such that for any $0 \leq \gamma <c$, the sequence $p h(n^{1+\gamma })$ equidistributes for any $p \in \Gamma \backslash PSL_2(\mathbb {R})$, where h is the horocycle flow. This makes modest progress towards a conjecture of Shah and generalizes a result of Venkatesh [Sparse equidistribution problems, period bounds, and subconvexity. Ann. of Math. (2) 172(2) (2010), 989–1094], who established the same equidistribution for co-compact lattices. The proof uses a dichotomy between good equidistribution estimates and approximability of $\{p h(t), t \leq T \}$ by closed horocycles of small period.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

1 Introduction

Consider the (multiplicative) group $G:=PSL_2(\mathbb {R})$ with a Haar measure $\mu _G$ . A lattice $\Gamma \subset G$ is a discrete subgroup such that the quotient $X:=\Gamma \backslash G$ has a fundamental domain in G of finite Haar measure. The Haar measure then descends to a finite measure $\mu _X$ . We define the matrices

$$ \begin{align*} h(x):=\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \quad a(y):=\begin{pmatrix} y^{{1}/{2}} & 0 \\ 0 & y^{-{1}/{2}} \end{pmatrix}. \end{align*} $$

The geodesic flow at time t of $p \in X$ is defined by $g_t(p):=p a(e^t)$ and the horocycle flow at time t is defined by $h_t(p ):=p h(t)$ .

While the orbit $g_t(p )$ for $t \to \infty $ can behave quite irregularly depending on the initial point, the horocycle orbit $h_t(p )$ is known to behave much more rigidly. Before we detail the known results, we pin down some notation. We say that the orbit $h_t(p )$ equidistributes with respect to $\mu _X$ if for any compactly supported, continuous function f on X,

$$ \begin{align*} \lim_{T \to \infty} \frac{1}{T} \int_{0}^T f(ph(t)) \; dt = \int f \; d\mu_X. \end{align*} $$

Similarly, we say that the orbit equidistributes along a sequence $a_n \in \mathbb {R}$ with respect to $\mu _X$ if

$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f(ph(a_n)) = \int f \; d\mu_X. \end{align*} $$

Lastly, a point $p \in X$ is called periodic if there is a $t_0 \in \mathbb {R}$ such that $p =p h(t_0)$ . In this case, the horocycle orbit will be trapped in the periodic orbit and will never equidistribute with respect to $\mu _X$ ; the system $t \mapsto p h(t)$ is then isomorphic to the circle-rotation ${x \mapsto x+t_0^{-1}}$ on the torus $\mathbb {R}/\mathbb {Z}$ . Below, we use ‘ $p h(a_n)$ equidistributes’ as a shorthand for ‘for all non-periodic $p \in X$ , $p h(a_n)$ equidistributes with respect to $\mu _X$ ’. It was shown by Dani and Smillie that both $p h(t)$ for $t \in \mathbb {R}$ and $p h(n)$ for $n \in \mathbb {N}$ equidistribute.

It was subsequently asked what happens for sequences other than $\mathbb {N}$ . Margulis conjectured that $p h(p_n)$ , where $p_n$ is the nth prime number, should also equidistribute. Shah conjectured that for any $\gamma \geq 0$ , $p h(n^{1+\gamma })$ would equidistribute. We remark that these results follow for $\mu _X$ -almost every $p \in X$ from the work of Bourgain in a much more general context [Reference Bourgain1]. The challenge is really to establish equidistribution for all non-periodic $p \in X$ .

Venkatesh made progress on Shah’s conjecture by showing that for co-compact $\Gamma $ , there is a small $c=c(\Gamma )>0$ such that for all $0 \leq \gamma <c$ and all $p \in X$ , $p h(n^{1+\gamma })$ equidistributes [Reference Venkatesh9]. Venkatesh’s proof operates by controlling arithmetic sequences of the type $p h(sn)$ for $n \in \{0, \ldots , N-1\}$ with s small compared with N. Controlling these sparse sequences also means that the almost-primes equidistribute for co-compact $\Gamma $ ; that is, for sufficiently big R, $p h(q)$ equidistributes, where q runs over all numbers having at most R prime factors. That controlling sparse sequences is enough to control the almost-primes can be seen either using sieve methods or using the pseudo-random measure $\nu $ , introduced by Goldston and Yıldırım, and subsequently used by Green and Tao to show that the primes contain arbitrarily long arithmetic progressions [Reference Goldston and Yildirim3, Reference Green and Tao4] (see [Reference Sarnak and Ubis6] for a proof of the equidistribution of almost-primes using sieve methods and [Reference Streck7] for a proof using the pseudo-random measure $\nu $ ).

Using Venkatesh’s method in the case of a non-compact lattice, one can show that $p h(n^{1+\gamma })$ and $ph(q)$ , q almost prime, equidistribute under the assumption of a Diophantine condition on p [Reference McAdam5, Reference Zheng10, Reference Zheng11]. This Diophantine condition ensures that the horocycle orbit $ph(t)$ equidistributes with rate $T^{-\varepsilon }$ for all T, which is needed for Venkatesh’s argument. Using the fact that for any point p, there are some times $T_i \to \infty $ such that $p h(t), t \leq T_i$ equidistributes with error $T_i^{-\varepsilon }$ , one can also deduce with the same method that the orbits $p h(n^{1+\gamma })$ and $ph(q)$ , q almost prime, are dense.

However, showing equidistribution for all p is significantly harder, as there are p such that there are times T for which the equidistribution of $p h(t), t \leq T$ is far worse than polynomial. In this case, Venkatesh’s method cannot be applied.

Sarnak and Ubis were the first to show such a sparse equidistribution result for all initial p. They showed that the almost-primes equidistribute for $\Gamma =PSL_2(\mathbb {Z})$ , which is not co-compact [Reference Sarnak and Ubis6]. It was subsequently proved by the author that the almost-primes equidistribute for all lattices $\Gamma $ in $PSL_2(\mathbb {R})$ [Reference Streck7].

In this paper, the equidistribution of $p h(n^{1+\gamma })$ is established for small $\gamma $ in the setting of a general lattice. This generalises Venkatesh’s result from co-compact $\Gamma $ to all lattices $\Gamma $ in $PSL_2(\mathbb {R})$ and makes (modest) progress on the conjecture of Shah.

We make this precise in the result below, which is the main result of this paper. For this, we need some more notation and start by defining the metric $d_X$ . The group $G=PSL_2(\mathbb {R})$ comes with a natural left-invariant metric $d_G$ (see for example [Reference Einsiedler and Ward2, Ch. 9]). This metric descends to X via $d_X(\Gamma g, \Gamma h):=\inf _{\gamma \in \Gamma } d_G(g, \gamma h)$ . We also fix a point $p_0 \in X$ and define $\mathrm {dist}(p):=d_X(p, p_0)$ .

For two functions $f, g \colon U \to \mathbb {R}$ , we write $f \ll g$ or $f=O(g)$ if there is a constant C such that $|f(x)| \leq C |g(x)|$ for all $x \in U$ , where U is some domain. In this paper, this constant C implicit in the definition is always allowed to depend on the lattice $\Gamma $ and the choice of $\gamma $ , but nothing else. We write $f \sim g$ if both $f \ll g$ and $g \ll f$ .

For a function $f \in C^4(X)$ , let $\Vert f \Vert _{W^4}$ be its Sobolev norm in the Hilbert space $W^{4, 2}$ involving the fourth derivative and let $\Vert f \Vert _{\infty , j}$ be the supremum norm of the jth derivatives. Define

$$ \begin{align*} \Vert f \Vert:=\Vert f \Vert_{W^4}+\Vert f \Vert_{\infty, 1}+\Vert f \Vert_{\infty, 0}; \end{align*} $$

this norm is the same one Strömbergsson used to show his equidistribution result [Reference Strömbergsson8]. In his result, the equidistribution properties of a horocycle piece $\{p h(t), 0 \leq t \leq T\}$ are measured in terms of the parameter

$$ \begin{align*} r(p, T):=T \exp(-\mathrm{dist}(g_{\log T}(p))), \end{align*} $$

which will be an important quantity to measure the equidistribution properties throughout this paper; its significance and role in the proof will be discussed below in more detail. It is well known that $r(p, T) \to \infty $ as $T \to \infty $ for any non-periodic p.

We let $\beta $ be the constant in Theorem 1.2; it ultimately comes from the rate of effective mixing. The constant in Theorem 1.1 can be taken to be $c={\beta }/{600}$ .

Theorem 1.1. For any lattice $\Gamma \subset PSL_2(\mathbb {R})$ , there is a constant $c=c(\Gamma )>0$ such that for any $0 \leq \gamma \leq c$ , any non-periodic $p \in X$ and any function $f \in C^4(X)$ with $\Vert f \Vert =1$ ,

$$ \begin{align*} \bigg| \frac{1}{T} \sum_{n \leq T} f(p h(n^{1+\gamma})) -\int f \; d\mu_X \bigg| \ll r^{-{\beta}/{4}}, \end{align*} $$

where $r=r(p, T^{1+\gamma })$ .

To prove Theorem 1.1, we will split the range into different intervals and use Taylor expansion on each one. On an interval $[T_0, T_1]$ , the function $t^{1+\gamma }$ will be approximately equal to $T_0^{1+\gamma } + (1+\gamma ) T_0^\gamma (t-T_0)$ , provided that $T_0$ is not too small and that the range is not too long. The question thus becomes how well $ph(ns)$ for $s \sim T^{\gamma }$ equidistributes. To control these sparse arithmetic sequences, we need two results.

The first one is the following theorem, which is a straightforward consequence of combining Strömbergsson’s equidistribution result [Reference Strömbergsson8] with Venkatesh’s method [Reference Venkatesh9], as performed, for example, by Zheng [Reference Zheng10].

Theorem 1.2. [Reference Zheng10, Theorem 1.2]

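Let $\Gamma $ be a non-compact lattice in G. Let $f \in C^4(X)$ with $\Vert f \Vert < \infty $ and $1 \leq s<T$ . Then,

$$ \begin{align*} \bigg| \frac{s}{T} \sum_{1 \leq j \leq T/s} f(p h(sj)) -\int f \; d\mu_X \bigg| \ll s^{{1}/{2}} r^{-{\beta}/{2}} \Vert f \Vert \end{align*} $$

for any initial point $p \in X$ , where $r=r(p, T)$ . The parameter $\tfrac 16>\beta >0$ and the implied constant depend only on $\Gamma $ .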

In the cases where r is big compared with T (say $r \geq T^\varepsilon $ for some absolute $\varepsilon $ ), this theorem in itself is enough to show equidistribution of the sequence $ph(n^{1+\gamma })$ .

The result below will be used to deal with the case in which the equidistribution is bad. It was proved by the author in [Reference Streck7] to show equidistribution of almost-primes. Its proof uses ideas of Sarnak and Ubis [Reference Sarnak and Ubis6] and has parallels to [Reference Strömbergsson8], whose proof in turn uses ideas going back to Marina Ratner. This result encompasses the dichotomy mentioned in the abstract.

Lemma 1.3. [Reference Streck7, Lemma 1.3]

Let $\Gamma $ be a lattice in $G=PSL_2(\mathbb {R})$ and let $X=\Gamma \backslash G$ . Let $p \in X$ and $T \geq 0$ . Let $\delta>0$ and $K \leq T$ .

There is an interval $I_0 \subset [0,T]$ of size $|I_0| \leq \delta ^{-1} K^2$ such that: for all $t_0 \in [0,T] \backslash I_0$ , there is a segment $\{\xi h(t), t \leq K\}$ of a closed horocycle approximating $\{ph(t_0+t), 0 \leq t \leq K\}$ of order $\delta $ , in the sense that

$$ \begin{align*} \text{ for all } 0 \leq t \leq K, \quad d_X(ph(t_0+t), \xi h(t)) \leq \delta. \end{align*} $$

The period $P=P(t_0, p)$ of this closed horocycle is at most $ P \ll r(p, T)$ .

Moreover, one can ensure $P \gg \eta ^2 r$ for some $\eta>0$ by weakening the bound on $I_0$ to $|I_0| \leq \max (\delta ^{-1} K^2, \eta T)$ .

2 On the behaviour of the equidistribution parameter in Theorem 1.2

Besides Lemma 1.3 itself, we will also need some of the other material in [Reference Streck7, Ch. 4] to prove Theorem 1.1. We recall this material here, going slightly beyond what is presented in [Reference Streck7].

It is well known that $G \cong T_1 \mathbb {H}$ , where $\mathbb {H}$ is the upper half-plane with the hyperbolic metric. Then, $X=\Gamma \backslash G$ has as fundamental domain a set $T_1F$ , where F is a geodesic polygon in $\mathbb {H}$ , that is, a polygon with finitely many vertices with the edges being pieces of geodesics [Reference Einsiedler and Ward2]. This fundamental polygon F has finitely many vertices touching the boundary of the upper half-plane, either at the axis with imaginary part equal to zero or at infinity. After identifying vertices that are in the same orbit under the action of $\Gamma $ , one gets the cusps of X, which we will denote by $r_1, \ldots , r_n$ . Any such cusp $r_i$ is in 1–1 correspondence with an element $\gamma _i \in \Gamma $ with the property that $\gamma _i$ fixes $r_i$ and that $\gamma _i$ is conjugate to $h(1)$ (see [Reference Streck7, Lemma 3.1]). For each cusp, there are elements $\sigma _i \in G$ such that $\sigma _i r_i=\infty $ and $\sigma _i \gamma _i \sigma _i^{-1}=h(1)$ .

For $g \in G$ , we define $Y^0_i(g):=\mathrm {Im}(\sigma _i g)$ , where

$$ \begin{align*} \mathrm{Im}\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix} \right):=\frac{1}{c^2+d^2} \end{align*} $$

is the imaginary part of the matrix projected to $\mathbb {H}$ . We also set for $p=\Gamma g_p \in X$ , $y_i^0(p):=\max _{\gamma \in \Gamma } Y^0_i(\gamma g_p)$ .

It was shown in [Reference Streck7, Lemma 4.1] that there exist disjoint neighbourhoods $C_i \subset X$ of each cusp $r_i$ with $K=X \backslash \cup C_i$ being compact such that for any $p \in C_i$ , $\exp (\mathrm {dist}(p)) \sim y_i^0(p)$ (while of course $\exp (\mathrm {dist}(p)) \sim 1$ for $p \in K$ ). Arguing as in [Reference Streck7, proof of 1 in Lemma 4.1], one also sees that if $p=\Gamma g_p \in C_i$ and $g_p$ is such that $Y_i^0(g_p)=y_i^0(p)$ , then for any $\gamma \in \Gamma $ , either $Y_i^0(\gamma g_p) \ll 1$ or $Y_i^0(\gamma g_p)=Y_i^0(g_p)$ (which is the case in which ${\sigma _i \gamma g_p=h(n) \sigma _i g_p}$ and $\gamma =(\gamma _i)^n$ for some n). This implies in particular that there is an absolute constant $C=C(\Gamma )$ such that if $g_p$ is such that $Y_i^0(g_p) \geq C$ , then

$$ \begin{align*} Y_i^0(g_p) \sim y_i^0(p) \sim\exp(\mathrm{dist}(p)), \end{align*} $$

where the second equivalence holds because $y_i^0(p) \geq C$ implies that $p \in C_i$ for C sufficiently big.

From the expression above, the reader sees the relation to the equidistribution parameters

$$ \begin{align*} r(q, K):=K \exp(-\mathrm{dist}(g_{\log K}(q))) \end{align*} $$

appearing in Theorems 1.1 and 1.2.

Observation 2.1. There is an absolute $c_0=c_0(\Gamma )>0$ such that for any T and any p, if there is a representative $g_p$ of p and an i such that for $\sigma _i g_p=:(\begin {smallmatrix} a & b \\ c & d \end {smallmatrix})$ , $\max (T^2 c^2, d^2) \leq c_0 T$ , then $r(p, T) \sim \max (T^2 c^2, d^2)$ .

Proof. We have that

$$ \begin{align*} 2 \max(T^2 c^2, d^2) \geq T (c^2T+d^2T^{-1})=Y_i^0(g_{\log T}(g_p))^{-1} T. \end{align*} $$

Thus, $Y_i^0(g_{\log T}(g_p)) \geq \tfrac 12 c_0^{-1}$ , which shows that

$$ \begin{align*} \exp(\mathrm{dist}(g_{\log T}(p))) \sim Y_i^0(g_{\log T}(g_p)) \end{align*} $$

by the argument above, provided that $c_0$ is sufficiently small.
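
For completeness, the remaining step can be spelled out as follows. This is only a sketch in the notation above, using the definition of $r(p, T)$ and the equivalence just displayed:

$$ \begin{align*} r(p, T)=T \exp(-\mathrm{dist}(g_{\log T}(p))) \sim T \, Y_i^0(g_{\log T}(g_p))^{-1} = T^2 c^2+d^2 \sim \max(T^2 c^2, d^2), \end{align*} $$

where the last equivalence uses $\max (x, y) \leq x+y \leq 2 \max (x, y)$ for $x, y \geq 0$ .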

3 Proof of Theorem 1.1

We start by approximating $t^{1+\gamma }$ with sparse arithmetic sequences. More precisely, we write

$$ \begin{align*} t^{1+\gamma}=T_0^{1+\gamma}+(1+\gamma) T_0^\gamma (t-T_0)+O(T^{-{1}/{6}}) \end{align*} $$

on $[T_0, T_0+T^{1/3}]$ for $T_0 \geq T^{5/6}$ using Taylor expansion.
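
The size of the Taylor remainder can be checked numerically. The following is a minimal sketch, not part of the argument: the values of `T` and `gamma` are hypothetical, and the helper names are ours. It compares the actual linearization error on $[T_0, T_0+T^{1/3}]$ with the Lagrange bound $\tfrac 12 (1+\gamma )\gamma T_0^{\gamma -1} T^{2/3}$ .

```python
def taylor_error(t, T0, gamma):
    """Absolute error of the first-order Taylor approximation of t**(1+gamma) around T0."""
    exact = t ** (1.0 + gamma)
    linear = T0 ** (1.0 + gamma) + (1.0 + gamma) * T0 ** gamma * (t - T0)
    return abs(exact - linear)


def lagrange_bound(T0, length, gamma):
    """Lagrange remainder bound (1+gamma)*gamma/2 * T0**(gamma-1) * length**2.

    Valid for 0 <= gamma < 1 and t in [T0, T0 + length], because the second
    derivative (1+gamma)*gamma*x**(gamma-1) is decreasing in x.
    """
    return 0.5 * (1.0 + gamma) * gamma * T0 ** (gamma - 1.0) * length ** 2


# Illustrative (hypothetical) parameters, not taken from the paper.
T = 1.0e6
gamma = 1.0e-3
T0 = T ** (5.0 / 6.0)        # smallest admissible left endpoint
length = T ** (1.0 / 3.0)    # length of the interval

# Largest error on a grid of 101 points in [T0, T0 + length].
worst = max(taylor_error(T0 + length * k / 100.0, T0, gamma) for k in range(101))
bound = lagrange_bound(T0, length, gamma)
print(worst, bound)
```

For these sample values, both the observed error and the Lagrange bound lie far below $T^{-1/6}$ .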

We will split into several cases. To govern in which case we are, we fix some $\varepsilon>0$ and impose that $\gamma < ({\varepsilon \beta }/{6})$ . We will see at the end which value of $\varepsilon $ makes everything work (which will turn out to be $\varepsilon ={1}/{100})$ .

To apply the results about sparse equidistribution, we are thus tasked with evaluating expressions of the form

$$ \begin{align*} \bigg| \frac{1}{K} \sum_{n \leq K} f(q h((1+\gamma) T_0^\gamma n )) -\int f \; d\mu_X \bigg| \end{align*} $$

for $q=ph(T_0^{1+\gamma })$ and $T^{1/6} \leq K \leq T^{1/3}$ , given some $T_0 \leq T$ . In the case where $r(q, K) \geq T^\varepsilon $ , Theorem 1.2 is enough to deduce good equidistribution.

If $r(q, K) \leq T^\varepsilon $ , then $g_{\log K}(q)$ must lie in the neighbourhood $C_i$ of some cusp $r_i$ , as explained in the previous section. In this case, there is a (essentially unique) representative $g_q$ of q such that $r(q, K) \sim \max (K^2 c^2, d^2)$ , where we set

$$ \begin{align*} \begin{pmatrix} a & b \\ c & d \end{pmatrix}:=\sigma_i g_q, \end{align*} $$

now and for the next couple of pages.

One then has to split into two more cases. The distinction between these cases is governed by

$$ \begin{align*} W_q:=\bigg| \frac{d}{c} \bigg|. \end{align*} $$

The relevance of this $W_q$ is that it measures the time it takes until one gets from bad to good equidistribution again. More precisely, by Observation 2.1,

(3.1) $$ \begin{align} r(q, K) \sim \begin{cases} d^2,& K \leq W_q, \\ d^2 \dfrac{K^2}{W_q^2},& K \geq W_q, \end{cases} \end{align} $$

as long as $r(q, K) \leq c_0 K$ .

This means that even if q and K are such that $r(q, K) \leq T^{\varepsilon }$ , one has that $r(q, T^\varepsilon W_q) \geq T^{2\varepsilon }$ . Together with Theorem 1.2, this will be good enough to show effective equidistribution under all assumptions except for those of Proposition 3.1. Under those assumptions, which encompass the most interesting case, almost the entire horocycle orbit $\{ph(t), t \leq T^{1+\gamma }\}$ is close to periodic horocycle orbits of small period. In this case, one will need Lemma 1.3 to conclude.

Proposition 3.1. Let $\Gamma $ and $\gamma <c$ be as in Theorem 1.1, and let $\varepsilon ={1}/{100}$ . Let $p \in X$ and T be such that $r(p, T^{1+\gamma }) \leq T^{4\varepsilon }$ and $W_p \geq T^{1-\varepsilon }$ . Then, for f as in Theorem 1.1,

$$ \begin{align*} \bigg|\frac{1}{T} \sum_{n \leq T} f(ph(n^{1+\gamma})) - \int f \; d\mu_X \bigg| \ll r^{-{\beta}/{4}}. \end{align*} $$

To prove Theorem 1.1, we will first show how one can reduce its proof to Proposition 3.1 using Observation 2.1 and Theorem 1.2. We will then prove Proposition 3.1.

Proof of Theorem 1.1 assuming Proposition 3.1

Say we are given some $t_0$ and set ${q=ph(t_0^{1+\gamma })}$ . If $r:=r(q, T^{1/6}) \geq T^{\varepsilon }$ , then we know by Theorem 1.2 that for any f with $\Vert f \Vert \leq 1$ ,

$$ \begin{align*} \bigg| \frac{1}{T^{{1}/{6}}} \sum_{n \leq T^{{1}/{6}}} f(q h((1+\gamma) t_0^\gamma n )) -\int f \; d\mu_X \bigg| \ll T^{{\gamma}/{2}} r^{-{\beta}/{2}} \leq r^{-{\beta}/{4}}, \end{align*} $$

where we recall $\gamma \leq ({\varepsilon \beta }/{6})$ . We are thus done unless there is a q such that ${r=r(q, T^{1/6}) \leq T^{\varepsilon }}$ . As we saw in §2, then with c and d as defined previously,

(3.2) $$ \begin{align} r \sim \max(T^{{2}/{6}} c^2, d^2). \end{align} $$

If $c^2T^{2/6}$ attains the maximum in (3.2), or equivalently, if $W_q \leq T^{1/6}$ , then $r(q, T^{1/4}) \sim T^{1/6} r \geq T^{1/6}$ by (3.1) and we are done by Theorem 1.2. We can thus assume $W_q \geq T^{1/6}$ . The following claim shows how one can improve the lower bound on $W_q$ further.

Claim 3.2. Let $q=p h(t_0^{1+\gamma })$ such that $r \leq T^\varepsilon $ . Set $W:=W_q$ . If $W \leq T^{1-\varepsilon }$ , then for $K=W^{1+\varepsilon }$ and for f with $\Vert f \Vert \leq 1$ ,

$$ \begin{align*} \bigg| \frac{1}{K} \sum_{0 \leq n \leq K} f(p h((t_0+n)^{1+\gamma})) -\int f \; d\mu_X \bigg| \ll r^{-{\beta}/{4}}. \end{align*} $$

Proof of Claim 3.2

Fix some $W^{1+\varepsilon } \geq s \geq W^{1+{\varepsilon }/{2}}$ and note that then $c^2s^2 \sim W^{-2} s^2 d^2 \gg d^2$ . Thus,

$$ \begin{align*} r(qh(s), T^{{1}/{3}}) &\sim \max (T^{{2}/{6}} c^2, (d+cs)^2 ) \sim \max (T^{{2}/{6}} c^2, c^2 s^2 )\\ &=c^2 s^2 \sim \bigg(\frac{s}{W}\bigg)^2 r \geq r W^\varepsilon \geq r T^{{\varepsilon}/{6}}, \end{align*} $$

where the first equivalence is due to Observation 2.1, which is applicable because $ ({s}/{W})^2 r \ll T^{3\varepsilon }$ . Applying Theorem 1.2 shows that

$$ \begin{align*} &\bigg| \frac{1}{T^{{1}/{3}}} \sum_{n \leq T^{{1}/{3}}} f(p h(t_0^{1+\gamma}+s) h((1+\gamma) (t_0+s)^\gamma n )) -\int f \; d\mu_X \bigg| \\ &\quad\ll T^{{\gamma}/{2}} T^{-{\varepsilon \beta}/{12}} r^{-{\beta}/{2}}\leq r^{-{\beta}/{2}}. \end{align*} $$

Now, we use Taylor approximation as above to split the orbit of $(t_0+n)^{1+\gamma }$ with $n \leq K$ into different ranges $[s, s+T^{1/3}]$ and note that for all but a $W^{-{\varepsilon }/{2}} T^{\gamma }$ proportion of s, one has $(t_0+s)^{1+\gamma }-t_0^{1+\gamma } \geq W^{1+{\varepsilon }/{2}}$ . As $W^{-{\varepsilon }/{2}} T^{\gamma } \leq T^{-{\varepsilon }/{12}} \leq r^{-{\beta }/{4}}$ , the claim is shown.

We have thus shown the conclusion of Theorem 1.1 unless there is a $q=p h(t_0^{1+\gamma })$ such that $r(q, T^{1/6}) \leq T^\varepsilon $ and $W_q \geq T^{1-\varepsilon }$ . We let c and d be as defined above and note that in the case considered, $r(q, T^{1/6}) \sim \max (c^2 T^{1/3}, d^2)=d^2$ by definition of $W_q$ . By (3.1), this implies that

$$ \begin{align*} r(q, T^{1+\gamma}) \ll d^2 \frac{T^{2(1+\gamma)}}{W^2_q}\ll T^{4\varepsilon}. \end{align*} $$

Lastly, to get an error term in $r(p, T^{1+\gamma })$ instead of $r(q, T^{1+\gamma })$ , we note that $g_q h(-t_0^{1+\gamma })$ is a representative of p and that because $(d-ct_0^{1+\gamma })^2 \ll d^2 T^{2(1+\gamma )} W^{-2} \ll T^{4\varepsilon }$ ,

$$ \begin{align*} r(p, T^{1+\gamma}) \sim \max(c^2 T^{2(1+\gamma)}, (d-ct_0^{1+\gamma})^2 ) \ll T^ {4 \varepsilon}, \end{align*} $$

where the first equivalence is due to Observation 2.1. We have thus reduced the proof of Theorem 1.1 to the assumptions of Proposition 3.1.

It now only remains to show Proposition 3.1, which is the main part of the proof of Theorem 1.1.

Proof of Proposition 3.1

Let p and T be given such that $r:=r(p, T^{1+\gamma }) \leq T^{4\varepsilon }$ and $W:=W_p \geq T^{1-\varepsilon }$ . Here, $W=| {d}/{c}|$ , with c and d as defined in Observation 2.1. We also let $g:=g_p$ and $\sigma _i$ be as in Observation 2.1. We invoke Lemma 1.3 to split the orbit $[0, T^{1+\gamma }]$ into pieces of length $K=T^{1/3}$ . As in [Reference Streck7, proof of Lemma 1.3 in Ch. 4], we now parametrize the orbit using the equation

$$ \begin{align*} \sigma_i g h(W+s)=lh(s)=h\bigg(\alpha-\frac{Rs}{s^2+1}\bigg)a\bigg(\frac{R}{s^2+1}\bigg)k(-\mathrm{arccot} \; s), \end{align*} $$

where $l:=\sigma _i g h(W)=:(\alpha +iR, -i)$ is the highest point of the horocycle orbit and

$$ \begin{align*} k(\theta)=\begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \end{align*} $$

is the (subsequently unimportant) rotation component.

Given an $M \leq T$ , we then have that $p h(M^{1+\gamma }+t), t \leq T^{1/3}$ is at distance at most $O(T^{-1/6})$ from the orbit on a periodic horocycle $\xi h(t), t \leq T^{1/3}$ with its period being equal to $y^{-1}$ , where

$$ \begin{align*} y:=\frac{R}{(M^{1+\gamma}-W)^2+1}. \end{align*} $$

By the second clause in Lemma 1.3, we can assume $r \gg y^{-1} \gg \delta ^2 r$ except on an interval of proportion $\delta $ , where $\delta $ is to be chosen later. Using Taylor approximation on $t^{1+\gamma }$ , we thus want to bound

$$ \begin{align*} \bigg| \frac{(1+\gamma) M^\gamma}{T^{{1}/{3}}}\sum_{(1+\gamma) M^\gamma n \leq T^{{1}/{3}}} f(\xi h((1+\gamma)M^\gamma n)) - \int f \,d\mu_X \bigg|. \end{align*} $$

However, we may run into problems here: if for example $y^{-1}=(1+\gamma )M^\gamma $ , the points do not equidistribute at all in the periodic horocycle. To deal with this and related obstructions, we proceed similarly to [Reference Streck7, the proof of Claim 5.2]. For notational convenience, we set $s:=(1+\gamma ) M^\gamma $ . Let $ q \in \mathbb {N}$ with $y^{-1} \leq q \leq y s^{-1} T^{1/3}$ be such that

$$ \begin{align*} \bigg| s y - \frac{a}{q} \bigg| \leq \frac{y^{-1} s}{q T^{{1}/{3}}} \end{align*} $$

for some a coprime to q (such q exists by the pigeonhole principle). The problem case occurs if q is small compared with $y^{-1}$ . If, however, q is sufficiently big, there are so many distinct points in the interval $[0, y^{-1}]$ that they cannot help being dense enough to approximate $\int _0^{1} f(\xi h(t y^{-1})) \,dt$ by force, as we show now.

Claim 3.3. If $q \geq y^{-3}$ , then

$$ \begin{align*} \bigg| \frac{s}{T^{{1}/{3}}}\sum_{s n \leq T^{{1}/{3}}} f(\xi h(s n)) -\int_0^{1} f(\xi h(t y^{-1})) \,dt \bigg| \ll y \ll \delta^{-2} r^{-1}, \end{align*} $$

where $q, s, y$ and $\xi $ all depend on M.

Proof of Claim 3.3

(The argument in the proof of this claim was suggested by Adrián Ubis.) We set $F(t):=f(\xi h(t y^{-1}))$ , which is $1$ -periodic. Because the function f is $1$ -Lipschitz with respect to the hyperbolic metric, the function F is $y^{-1}$ -Lipschitz. We wish to show

$$ \begin{align*} \bigg| \frac{s}{T^{{1}/{3}}} \sum_{sn \leq T^{{1}/{3}}} F(nsy)-\int_0^1 F(t) \,dt \bigg| \ll y. \end{align*} $$

For this, we note that as for any n,

$$ \begin{align*} \bigg|sny-n\frac{a}{q} \bigg| \leq n \frac{y^{-1} s}{q T^{{1}/{3}}}, \end{align*} $$

we have

$$ \begin{align*} \frac{s}{T^{{1}/{3}}} \sum_{sn \leq T^{{1}/{3}}} F(nsy) &= O\bigg(\frac{y^{-2}}{q}\bigg)+\frac{s}{T^{{1}/{3}}} \sum_{sn \leq T^{{1}/{3}}} F\bigg(n \frac{a}{q}\bigg)\\ &=O\bigg(\frac{y^{-2}}{q}\bigg)+O\bigg(\frac{qs}{T^{{1}/{3}}} \bigg)+\frac{1}{q} \sum_{j=0}^{q-1} F\bigg(\frac{ja}{q}\bigg) \end{align*} $$

by the periodicity of F. As a is coprime to q, it does not play a role in the last average and can be dropped. Furthermore, for any $t \leq ({1}/{q})$ ,

$$ \begin{align*} F\bigg(\frac{j}{q}\bigg)=O\bigg(\frac{y^{-1}}{q}\bigg)+F\bigg(\frac{j}{q}+t\bigg), \end{align*} $$

so

$$ \begin{align*} \frac{1}{q} \sum_{j=0}^{q-1} F\bigg(\frac{j}{q}\bigg) &= O\bigg(\frac{y^{-1}}{q}\bigg)+\frac{1}{q} \sum_{j=0}^{q-1} \int_0^1 F\bigg(\frac{j+t}{q} \bigg) \,dt \\ &= O\bigg(\frac{y^{-1}}{q}\bigg)+\int_0^1 F(t) \,dt. \end{align*} $$

As both $y^{-2}q^{-1}$ and $qsT^{-1/3}$ are $O(y)$ , this implies the claim.
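
Two elementary ingredients of this argument can be illustrated numerically: the pigeonhole (Dirichlet) approximation used to choose q, and the fact that multiplying by a coprime to q only permutes the points $j/q$ . The sketch below is ours and purely illustrative. The function names, the test function `F` and the sample values $x=\sqrt {2}$ , $Q=100$ are assumptions, and only full periods are averaged, so the boundary term $O(qsT^{-1/3})$ does not appear.

```python
import math


def dirichlet_approximation(x, Q):
    """Return (a, q) with 1 <= q <= Q, gcd(a, q) == 1 and |x - a/q| <= 1/(q*Q).

    Brute-force search; existence is guaranteed by Dirichlet's theorem
    (pigeonhole principle).
    """
    for q in range(1, Q + 1):
        a = round(x * q)
        if abs(x * q - a) <= 1.0 / Q:
            g = math.gcd(a, q)
            return a // g, q // g
    raise AssertionError("unreachable for real x and integer Q >= 1")


def average_over_multiples(F, a, q):
    """Average of the 1-periodic function F over the points j*a/q, j = 0..q-1."""
    return sum(F((j * a % q) / q) for j in range(q)) / q


def average_over_grid(F, q):
    """Average of F over the equally spaced points j/q, j = 0..q-1."""
    return sum(F(j / q) for j in range(q)) / q


def F(t):
    """A 1-periodic, 1-Lipschitz test function with integral 1/4 over [0, 1]."""
    return abs((t % 1.0) - 0.5)


a, q = dirichlet_approximation(math.sqrt(2.0), 100)
lhs = average_over_multiples(F, a, q)   # average over the multiples j*a/q
rhs = average_over_grid(F, q)           # average over the grid j/q
print(a, q, lhs, rhs)
```

Since a is coprime to q, the two averages agree up to rounding, and the grid average differs from $\int _0^1 F(t) \,dt$ by at most the Lipschitz constant divided by q.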

By Strömbergsson’s result [Reference Strömbergsson8],

$$ \begin{align*} \bigg| y \int_0^{y^{-1}} f(\xi h(t)) \,dt - \int f \; d\mu_X \bigg| \ll y^\beta \ll (\delta^{-2} r^{-1})^\beta, \end{align*} $$

so we see from Claim 3.3 that

$$ \begin{align*} \bigg| \frac{(1+\gamma) M^\gamma}{T^{{1}/{3}}}\sum_{(1+\gamma) M^\gamma n \leq T^{{1}/{3}}} f(\xi h((1+\gamma)M^\gamma n)) - \int f \,d\mu_X \bigg| \ll (\delta^{2} r)^{-\beta} \end{align*} $$

unless there is a $q \leq y^{-3} \leq r^{3} $ and a coprime to q such that

$$ \begin{align*} \bigg| (1+\gamma)M^\gamma y - \frac{a}{q} \bigg| \ll M^{\gamma}y^{-1}T^{-{1}/{3}} \leq r T^{-{1}/{3}+\gamma}. \end{align*} $$

To conclude the proof of Theorem 1.1, we just have to show that this is a very exceptional occurrence.

Fortunately, this is what one would expect: if we let

$$ \begin{align*} I_{q, a}:=\bigg\{v \in \mathbb{R}: \bigg| v - \frac{a}{q} \bigg| \leq r T^{-{1}/{3}+\gamma} \bigg\} \end{align*} $$

denote the problem intervals for $q \leq r^3$ and $(a, q)=1$ , we note that their lengths are proportional to $r T^{-1/3+\gamma }$ . Moreover, given distinct intervals $ I_{q_1, a_1}, I_{q_2, a_2}$ , the gap between their centres is at least of order $r^{-6}$ , as

$$ \begin{align*} \bigg|\frac{a_1}{q_1}-\frac{a_2}{q_2}\bigg|\geq \frac{1}{q_1 q_2} \geq r^{-6}. \end{align*} $$
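
The separation of distinct reduced fractions follows from $|a_1 q_2-a_2 q_1| \geq 1$ . As a quick sanity check, the following sketch (ours, with a hypothetical small bound `Q` playing the role of $r^3$ ) enumerates all reduced fractions in $[0,1]$ with denominator at most Q in exact arithmetic and computes their minimal gap.

```python
from fractions import Fraction
from math import gcd


def reduced_fractions(Q):
    """All reduced fractions a/q in [0, 1] with 1 <= q <= Q, sorted increasingly."""
    fracs = {
        Fraction(a, q)
        for q in range(1, Q + 1)
        for a in range(q + 1)
        if gcd(a, q) == 1
    }
    return sorted(fracs)


def minimal_gap(Q):
    """Smallest distance between two distinct reduced fractions with denominators <= Q."""
    fracs = reduced_fractions(Q)
    return min(b - a for a, b in zip(fracs, fracs[1:]))


Q = 10  # hypothetical stand-in for r**3
print(minimal_gap(Q), Fraction(1, Q * Q))
```

For `Q = 10`, the minimal gap is attained by neighbours such as $8/9$ and $9/10$ and is consistent with the lower bound $Q^{-2}$ .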

As $r \ll T^{4\varepsilon }$ , this means that the set $E:=\bigcup _{q \leq r^3, (a, q)=1} I_{q, a}$ makes up only a tiny proportion of the entire range. Unless the function

$$ \begin{align*} G(t):= \frac{t^\gamma R}{(t^{1+\gamma}-W)^2+1}=t^\gamma y \end{align*} $$

is highly concentrated on a small part of its range, our problem case $\{t \leq T: (1+\gamma ) G(t) \in E \}$ will thus only occur on a negligible proportion of $[0,T]$ . The following claim shows that G does not behave in this unusual manner.

Claim 3.4. For all but an $O(\delta + \delta ^{-5} r^7 T^{-1/3+\gamma })$ proportion of $t \leq T$ , there do not exist $q \leq r^3$ and a coprime to q such that

$$ \begin{align*} \bigg|(1+\gamma) G(t)-\frac{a}{q}\bigg| \leq r T^{-{1}/{3}+\gamma}. \end{align*} $$

Before we show the claim, we show how it implies Proposition 3.1. The claim implies that at most a small proportion of the intervals we split $[0,T]$ into when applying Taylor approximation will be bad; for the others, we know equidistribution from Claim 3.3. Collecting all the different error terms together,

$$ \begin{align*} \bigg|\frac{1}{T} \sum_{n \leq T} f(ph(n^{1+\gamma})) - \int f \; d\mu_X \bigg| \ll \delta + \delta^{-5} r^7 T^{-{1}/{3}+\gamma} + (\delta^{-2} r^{-1})^\beta, \end{align*} $$

where the error terms come from, in that order, Lemma 1.3 and Claim 3.4, the contribution of the problem intervals $I_{q, a}$ on which the sequence $\xi h((1+\gamma ) M^\gamma n)$ does not equidistribute in the periodic horocycle, and the comparison with $\int f \; d\mu _X$ on the good intervals. Setting $\delta =r^{-{1}/{10}}$ takes care of the first and third terms, while, recalling that $r \ll T^{4\varepsilon }$ , we can control the second term by setting $\varepsilon ={1}/{100}$ . This concludes the proof of Proposition 3.1 (and thus also the proof of Theorem 1.1) with only Claim 3.4 left to be shown.
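
The exponent bookkeeping behind these choices can be checked as follows. This is a sketch under the standing assumptions $r \ll T^{4\varepsilon }$ , $\gamma < {\varepsilon \beta }/{6}$ and $\beta < \tfrac 16$ , for $r \geq 1$ and with implied constants suppressed. With $\delta =r^{-{1}/{10}}$ and $\varepsilon ={1}/{100}$ ,

$$ \begin{align*} \delta &= r^{-{1}/{10}} \leq r^{-{\beta}/{4}}, \\ (\delta^{-2} r^{-1})^\beta &= r^{-{4\beta}/{5}} \leq r^{-{\beta}/{4}}, \\ \delta^{-5} r^7 T^{-{1}/{3}+\gamma} &= r^{{15}/{2}} T^{-{1}/{3}+\gamma} \ll T^{30\varepsilon-{1}/{3}+\gamma} = T^{-{1}/{30}+\gamma} \leq T^{-{\beta}/{100}} \ll r^{-{\beta}/{4}}. \end{align*} $$

The first two lines use ${\beta }/{4} < {1}/{10}$ . In the third line, the inequality $T^{-{1}/{30}+\gamma } \leq T^{-{\beta }/{100}}$ holds because $\gamma < {\beta }/{600}$ and ${\beta }/{600}+{\beta }/{100} < {1}/{30}$ , and the final step uses $r^{-{\beta }/{4}} \gg T^{-\varepsilon \beta }=T^{-{\beta }/{100}}$ .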

Proof of Claim 3.4

To show this claim, we use the following simple lemma, whose proof is left to the reader as an exercise.

Lemma 3.5. Let $I \subset \mathbb {R}$ be an open interval and let $G \colon I \to \mathbb {R}$ be continuously differentiable such that $0<c \leq |G^\prime (t)| \leq C$ for all $t \in I$ . Let $\theta>0$ and let $a_1<b_1<a_2<\cdots <a_{n-1}<b_{n-1}<a_n$ be real numbers with the property that $b_i-a_i \leq \theta (a_{i+1}-b_i)$ for all $1 \leq i<n$ . Then, for $E:=(a_1, b_1) \cup \cdots \cup (a_{n-1}, b_{n-1})$ ,

$$ \begin{align*} |\{t \in I: G(t) \in E \}| \leq 2 \theta C c^{-1} |I| \end{align*} $$

provided that $|I| \geq \theta C c^{-1}$ .
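
As a plausibility check of Lemma 3.5 (not a proof), one can test it on a simple monotone example. The sketch below uses the hypothetical choice $G(t)=t^2$ on $I=(1,2)$ , so that $c=2$ and $C=4$ , together with equally spaced intervals whose length is at most $\theta $ times the gap between them. All names and numbers are ours.

```python
import math


def preimage_measure(intervals, lo, hi):
    """Lebesgue measure of {t in (lo, hi): t**2 in E} for E a union of disjoint intervals.

    Uses that G(t) = t**2 is increasing on (lo, hi) when lo > 0, so the
    preimage of (a, b) is (sqrt(a), sqrt(b)) intersected with (lo, hi).
    """
    total = 0.0
    for a, b in intervals:
        left = max(lo, math.sqrt(max(a, 0.0)))
        right = min(hi, math.sqrt(max(b, 0.0)))
        total += max(0.0, right - left)
    return total


# Hypothetical test data: G(t) = t**2 on I = (1, 2), so c = 2 <= G'(t) <= 4 = C.
lo, hi = 1.0, 2.0
c, C = 2.0, 4.0
theta = 0.1
length, gap = 0.005, 0.1   # interval length is at most theta * gap
intervals = [
    (1.0 + k * (length + gap), 1.0 + k * (length + gap) + length)
    for k in range(40)
]

measure = preimage_measure(intervals, lo, hi)
bound = 2.0 * theta * C / c * (hi - lo)
print(measure, bound)
```

In this example, the measure of the preimage stays well below the bound $2 \theta C c^{-1} |I|$ .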

To apply this to the function

$$ \begin{align*} G(t)= \frac{t^\gamma R}{(t^{1+\gamma}-W)^2+1}=t^\gamma y \end{align*} $$

in which we are interested, we need to calculate its derivative. We see that

$$ \begin{align*} \frac{dy}{dt}(t)=-\frac{2(1+\gamma) t^\gamma (t^{1+\gamma} - W)R}{((t^{1+\gamma}-W)^2+1)^2}=-\frac{2y(1+\gamma) t^\gamma (t^{1+\gamma} - W)}{(t^{1+\gamma}-W)^2+1} \end{align*} $$

and thus

$$ \begin{align*} G^\prime(t)= y t^{\gamma-1} \bigg(\gamma - \frac{2(1+\gamma) t^{1+\gamma} (t^{1+\gamma} -W) }{(t^{1+\gamma}-W)^2+1} \bigg). \end{align*} $$
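
The closed form for $G^\prime $ can be sanity-checked against a finite-difference derivative. The following sketch is ours. The parameter values are hypothetical and chosen only for numerical convenience, not to reflect the sizes of R and W in the proof.

```python
def y(t, gamma, R, W):
    """y(t) = R / ((t**(1+gamma) - W)**2 + 1)."""
    return R / ((t ** (1.0 + gamma) - W) ** 2 + 1.0)


def G(t, gamma, R, W):
    """G(t) = t**gamma * y(t)."""
    return t ** gamma * y(t, gamma, R, W)


def G_prime_formula(t, gamma, R, W):
    """Closed form for G'(t) as displayed in the text."""
    u = t ** (1.0 + gamma)
    bracket = gamma - 2.0 * (1.0 + gamma) * u * (u - W) / ((u - W) ** 2 + 1.0)
    return y(t, gamma, R, W) * t ** (gamma - 1.0) * bracket


def G_prime_numeric(t, gamma, R, W, h=1e-6):
    """Central finite-difference approximation of G'(t)."""
    return (G(t + h, gamma, R, W) - G(t - h, gamma, R, W)) / (2.0 * h)


# Hypothetical sample values.
t, gamma, R, W = 2.0, 0.1, 1.0, 5.0
exact = G_prime_formula(t, gamma, R, W)
approx = G_prime_numeric(t, gamma, R, W)
print(exact, approx)
```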

We recall that in Lemma 1.3, we exclude an interval $J_0$ of proportion $\delta $ to ensure ${r^{-1} \ll y \ll \delta ^{-2} r^{-1}}$ . We also exclude a set $J_1$ comprising two intervals of proportion $\delta $ to ensure $t \geq \delta T$ and $|{W}/{t^{1+\gamma }}-1| \geq \delta $ . This ensures that $r^{-1} T^{\gamma -1} \ll y t^{\gamma -1} \ll \delta ^{-3} r^{-1} T^{\gamma -1}$ on the range $[0,T] \backslash (J_0 \cup J_1)$ . If we can bound the expression in the bracket in a similar manner up to factors of powers of $\delta ^{-1}$ , the claim will follow from Lemma 3.5.

To do this, we note that for $t \in [0,T] \backslash J_1$ ,

$$ \begin{align*} \bigg| \frac{1}{(t^{1+\gamma}-W)^2+1}-\frac{1}{(t^{1+\gamma}-W)^2} \bigg|=O(\delta^{-4} T^{-4(1+\gamma)}), \end{align*} $$

which implies

$$ \begin{align*} G^\prime(t)=y t^{\gamma-1} \bigg(\gamma + \frac{2(1+\gamma)}{{W}/{t^{1+\gamma}}-1} + O(\delta^{-4} T^{-2}) \bigg). \end{align*} $$

We set $J_2:=\{t: |{W}/{t^{1+\gamma }}-(1-{2(1+\gamma )}/{\gamma })| \leq \delta \}$ , which is the interval of proportion $\delta $ on which the second term roughly cancels out the first. We then have that

$$ \begin{align*} \delta \ll \gamma \bigg|\frac{W}{t^{1+\gamma}}-1\bigg|^{-1} \bigg|\frac{W}{t^{1+\gamma}}-1+\frac{2(1+\gamma)}{\gamma}\bigg|=\bigg|\gamma + \frac{2(1+\gamma)}{{W}/{t^{1+\gamma}}-1} \bigg| \ll \delta^{-1} \end{align*} $$

on $[0,T] \backslash (J_1 \cup J_2)$ , which implies that

$$ \begin{align*} \delta r^{-1} T^{\gamma-1} \ll |G^\prime(t)| \ll \delta^{-4} r^{-1} T^{\gamma-1} \end{align*} $$

on $[0,T] \backslash (J_0 \cup J_1 \cup J_2)$. We can now apply Lemma 3.5 to each of the remaining intervals. Recalling that each problem interval $I_{q, a}$ is of length $r T^{-1/3+\gamma }$ and the gap between any two successive intervals is of size at least $0.9 r^{-6}$, we find that

$$ \begin{align*} \frac{1}{T}| \{t \in [0,T] \backslash (J_0 \cup J_1 \cup J_2): (1+\gamma)G(t) \in E \}| \ll \delta^{-5} r^7 T^{-{1}/{3}+\gamma}, \end{align*} $$

where, as before, $E=\bigcup _{q \leq r^3, (a, q)=1} I_{q, a}$ . This shows Claim 3.4, which was the last missing piece in the proof of Theorem 1.1.
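
The bookkeeping behind the final bound can be sketched as follows, with all implied constants suppressed. The ratio of interval length to gap gives an admissible $\theta $ in Lemma 3.5, and the two-sided bound on $|G^\prime (t)|$ gives the ratio $C c^{-1}$:

$$ \begin{align*} \theta = \frac{r T^{-1/3+\gamma}}{0.9\, r^{-6}} \ll r^{7} T^{-1/3+\gamma}, \qquad \frac{C}{c} \ll \frac{\delta^{-4} r^{-1} T^{\gamma-1}}{\delta r^{-1} T^{\gamma-1}} = \delta^{-5}, \qquad 2 \theta C c^{-1} |I| \ll \delta^{-5} r^{7} T^{-1/3+\gamma} |I|. \end{align*} $$

Summing over the boundedly many remaining intervals $I \subset [0,T]$, whose lengths add up to at most $T$, and dividing by $T$ gives the stated estimate. The factor $(1+\gamma )$ in front of $G$ rescales $c$ and $C$ equally and so leaves $C c^{-1}$ unchanged. This sketch assumes that each remaining interval satisfies the length condition $|I| \geq \theta C c^{-1}$ of Lemma 3.5.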

Acknowledgements

This is a follow-up paper to [7], which is based on the master’s thesis I did at the Hebrew University of Jerusalem in 2020. As such, I am thankful for the support by my thesis advisor Tamar Ziegler and by Elon Lindenstrauss, who also suggested that the result in the present paper should be achievable with the ideas in [7]. I thank my PhD supervisor Péter Varjú for giving me the freedom to finish the work on these two papers while doing my PhD with him. Above all, I am grateful to Adrián Ubis, who suggested the argument used in the proof of Claim 3.3 in his review of the previous paper, simplifying the proof in [7] considerably. Without getting this new perspective on the material two years later, I would not even have thought of revisiting the problem solved in this paper. The author received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 803711).

References

[1] Bourgain, J. Pointwise ergodic theorems for arithmetic sets. Publ. Math. Inst. Hautes Études Sci. 69 (1989), 5–45; with an appendix by the author, Harry Furstenberg, Yitzhak Katznelson and Donald S. Ornstein. doi:10.1007/BF02698838
[2] Einsiedler, M. and Ward, T. Ergodic Theory: With a View Towards Number Theory (Graduate Texts in Mathematics, 259). Springer, London, 2011. doi:10.1007/978-0-85729-021-2
[3] Goldston, D. A. and Yildirim, C. Y. Higher correlations of divisor sums related to primes I: triple correlations. Integers 3 (2003), A05.
[4] Green, B. and Tao, T. The primes contain arbitrarily long arithmetic progressions. Ann. of Math. (2) 167 (2004), 481–547. doi:10.4007/annals.2008.167.481
[5] McAdam, T. Almost-primes in horospherical flows on the space of lattices. J. Mod. Dyn. 15 (2018), 277–327.
[6] Sarnak, P. and Ubis, A. The horocycle flow at prime times. J. Math. Pures Appl. (9) 103 (2011), 575–618. doi:10.1016/j.matpur.2014.07.004
[7] Streck, L. Non-concentration of primes in $\Gamma \backslash PSL_2(\mathbb{R})$. Israel J. Math. http://doi.org/10.1007/s11856-025-2744-z. Published online 27 March 2025.
[8] Strömbergsson, A. On the deviation of ergodic averages for horocycle flows. J. Mod. Dyn. 7(2) (2013), 291–328. doi:10.3934/jmd.2013.7.291
[9] Venkatesh, A. Sparse equidistribution problems, period bounds and subconvexity. Ann. of Math. (2) 172(2) (2010), 989–1094. doi:10.4007/annals.2010.172.989
[10] Zheng, C. Sparse equidistribution of unipotent orbits in finite-volume quotients. J. Mod. Dyn. 10 (2016), 1–21. doi:10.3934/jmd.2016.10.1
[11] Zheng, C. On the density of some sparse horocycles. Proc. Indian Acad. Sci. Math. Sci. 134(1) (2024), 0006. doi:10.1007/s12044-023-00774-y