1. Introduction
In previous work [Reference Van Lieshout and Markwitz25] we developed statistical methods for state estimation on interval-censored data. The motivating example was that of determining the occurrence times of residential burglaries based on police reports. In the criminology literature, such data are known as aoristic crime data [Reference Ratcliffe20, Reference Ratcliffe and McCullagh21]. Aoristic crime studies have mainly focused on ad hoc methods [Reference Ashby and Bowers1], which can be helpful but may miss dependencies such as the near-repeat effect [Reference Bernasco4]. We developed a Bayesian statistical method that can account for inter-event dependencies [Reference Van Lieshout and Markwitz25].
This approach assumed that for each event occurrence the censoring mechanism is governed by a stochastic process. Specifically, an alternating renewal process was used to split time up into observable and partially observable periods according to the two phases of the renewal process. Either the event is fully observed, in which case the exact time of occurrence is recorded, or only the interval between two jumps is recorded. The approach was shown to lead to a tractable mark distribution and is therefore amenable to Monte Carlo methods for simulation.
The censoring mechanism based on alternating renewal processes imposes time-homogeneity. In reality, events rarely occur homogeneously in time. For instance, returning to the motivating example, there may be times of day that are more likely to be censored due to the periodic behaviour of potential victims, such as being at work or asleep. Additionally, burglars may choose to commit crimes at different rates at certain times of the day based on their perception of victim behaviour. Thus, there may be inhomogeneity in both the underlying distribution of occurrence times and the censoring mechanism.
This paper introduces a new model that rectifies the shortcomings discussed above. For the censoring mechanism, we propose a non-homogeneous two-state semi-Markov process [Reference Janssen11, Reference Janssen and Manca12, Reference Korolyuk and Swishchuk16, Reference Ross23], also known as a non-homogeneous alternating renewal process. Conditional intensity-based methods [Reference Haezendonck and De Vylder10] are used to guarantee existence and we derive the joint, marginal, and conditional distributions of the recorded starting point and interval length for each occurrence time. We then propose a marked point process model [Reference Daley and Vere-Jones7] for the complete data using a non-homogeneous Markov point process [Reference Van Lieshout24] for the ground process of event occurrences and a mark kernel based on the non-homogeneous alternating renewal process. We illustrate the model by means of parametric examples that can describe various types of non-homogeneous behaviour, culminating in a comparison of non-homogeneous and homogeneous models.
The plan of this paper is as follows. In Section 2 we recall the definition of a non-homogeneous alternating renewal process on the half line and give an explicit expression for the joint distribution of the time since the last jump and that to go until the next jump. In Section 3 we formulate our marked point process model and study the conditional distribution of the ground process given the union of marks. In Section 4 we present some parametric examples; a demonstration of the model in action is given in Section 5.
2. The non-homogeneous alternating renewal process
2.1. Definition and notation
 Let 
 $(\Omega, \mathcal{A}, {P})$
 be a probability space. Consider the two-dimensional stochastic process
$(\Omega, \mathcal{A}, {P})$
 be a probability space. Consider the two-dimensional stochastic process 
 $(S_i, X_i)$
,
$(S_i, X_i)$
, 
 $i \in \mathbb{N}_0$
, on
$i \in \mathbb{N}_0$
, on 
 $(\Omega, \mathcal{A}, {P})$
 with values in
$(\Omega, \mathcal{A}, {P})$
 with values in 
 $\{ 0, 1 \} \times \mathbb{R}^+$
. Here,
$\{ 0, 1 \} \times \mathbb{R}^+$
. Here, 
 $S_i$
 denotes the ith state that the process is in, and
$S_i$
 denotes the ith state that the process is in, and 
 $0 = X_0 \leq X_1 \leq \cdots$
 are the jump times. Call a time interval that the process spends in state 0 a Z-phase and in state 1 a Y-phase, in analogy to [Reference Van Lieshout and Markwitz25]. We set
$0 = X_0 \leq X_1 \leq \cdots$
 are the jump times. Call a time interval that the process spends in state 0 a Z-phase and in state 1 a Y-phase, in analogy to [Reference Van Lieshout and Markwitz25]. We set 
 $S_0 = 1$
.
$S_0 = 1$
.
 The non-homogeneous alternating renewal process can simply be seen as a special case of the semi-Markov process with only two states. In the semi-Markov paradigm, the tuple 
 $(S_n, X_n)_{n=1}^{\infty}$
 defines a non-homogeneous alternating renewal process if
$(S_n, X_n)_{n=1}^{\infty}$
 defines a non-homogeneous alternating renewal process if 
 \begin{align} \mathbb{P}(S_{n+1} = j&, X_{n+1} \leq x\mid (S_0, X_0) = (s_0, x_0), \ldots, (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{n+1} = j, X_{n+1} \leq x\mid (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{n+1} = j, X_{n+1} - X_n \leq x - x_n \mid (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{1} = j, X_{1} - X_0 \leq x - x_n \mid (S_0, X_0) = (s_n, x_n)), \end{align}
\begin{align} \mathbb{P}(S_{n+1} = j&, X_{n+1} \leq x\mid (S_0, X_0) = (s_0, x_0), \ldots, (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{n+1} = j, X_{n+1} \leq x\mid (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{n+1} = j, X_{n+1} - X_n \leq x - x_n \mid (S_n, X_n) = (s_n, x_n)) \nonumber \\ &= \mathbb{P}(S_{1} = j, X_{1} - X_0 \leq x - x_n \mid (S_0, X_0) = (s_n, x_n)), \end{align}
i.e. the joint conditional probability distribution of the sojourn time
 \begin{align} T_{n+1} = X_{n+1} - X_{n} \end{align}
\begin{align} T_{n+1} = X_{n+1} - X_{n} \end{align}
and the next state 
 $S_{n+1}$
 depends only on the nth state
$S_{n+1}$
 depends only on the nth state 
 $S_n$
 and its jump time
$S_n$
 and its jump time 
 $X_n$
, not on the entire history of the process [Reference Çinlar6, Reference Janssen and Manca12, Reference Korolyuk and Swishchuk16, Reference Ross23] nor on the index n. This process is only Markov at the jump times, hence the name semi-Markov.
$X_n$
, not on the entire history of the process [Reference Çinlar6, Reference Janssen and Manca12, Reference Korolyuk and Swishchuk16, Reference Ross23] nor on the index n. This process is only Markov at the jump times, hence the name semi-Markov.
 It follows from (2.1) that the distribution of a non-homogeneous alternating renewal process is completely specified by the starting state (or its probability distribution) and a semi-Markov kernel G that describes the transition rates from state i to state j. Formally, for 
 $\tau \geq 0$
,
$\tau \geq 0$
, 
 $x\geq 0$
, and
$x\geq 0$
, and 
 $i,j \in \{ 0,1\}$
,
$i,j \in \{ 0,1\}$
, 
 \begin{equation} G_{ij}(x, \tau) = \mathbb{P}(S_{n+1} = j,\,T_{n+1} \leq \tau \mid S_{n} = i, X_{n} = x), \end{equation}
\begin{equation} G_{ij}(x, \tau) = \mathbb{P}(S_{n+1} = j,\,T_{n+1} \leq \tau \mid S_{n} = i, X_{n} = x), \end{equation}
regardless of 
 $n = 0, 1, \ldots$
 Since only two transitions are possible, we proceed by using the notation
$n = 0, 1, \ldots$
 Since only two transitions are possible, we proceed by using the notation 
 $G_{Y}(x, \tau) =G_{10}(x, \tau)$
 and
$G_{Y}(x, \tau) =G_{10}(x, \tau)$
 and 
 $G_{Z}(x, \tau) = G_{01}(x, \tau)$
, where the subscript denotes the state that the process is in after jump time x. In the remainder of this paper, we shall assume that, for all
$G_{Z}(x, \tau) = G_{01}(x, \tau)$
, where the subscript denotes the state that the process is in after jump time x. In the remainder of this paper, we shall assume that, for all 
 $x\geq 0$
,
$x\geq 0$
, 
 $G_Y(x, \cdot)$
 and
$G_Y(x, \cdot)$
 and 
 $G_Z(x, \cdot)$
 are absolutely continuous with respect to Lebesgue measure and write
$G_Z(x, \cdot)$
 are absolutely continuous with respect to Lebesgue measure and write 
 $g_Y(x,\cdot)$
 and
$g_Y(x,\cdot)$
 and 
 $g_Z(x, \cdot)$
 respectively for their Radon–Nikodym derivatives.
$g_Z(x, \cdot)$
 respectively for their Radon–Nikodym derivatives.
2.2. Conditional intensities and non-explosion conditions
 The conditional intensity, also known as the stochastic intensity, of a temporal point process describes the infinitesimal conditional probability of occurrence given the history of the process [Reference Karr14]. More precisely, for 
 $n=0, 1, \ldots$
 and
$n=0, 1, \ldots$
 and 
 $0 = x_0 \leq x_1 \leq \ldots \leq x_n\leq x$
,
$0 = x_0 \leq x_1 \leq \ldots \leq x_n\leq x$
, 
 \begin{equation*} \lambda_{n+1}(x;\ x_1, \ldots, x_n) \, {\mathrm{d}} x = \mathbb{P}(X_{n+1} \leq x+{\mathrm{d}} x\mid X_{n+1} \geq x, X_0 = 0, X_1 = x_1, \ldots, X_n = x_n). \end{equation*}
\begin{equation*} \lambda_{n+1}(x;\ x_1, \ldots, x_n) \, {\mathrm{d}} x = \mathbb{P}(X_{n+1} \leq x+{\mathrm{d}} x\mid X_{n+1} \geq x, X_0 = 0, X_1 = x_1, \ldots, X_n = x_n). \end{equation*}
For a non-homogeneous alternating renewal process, the 
 $\lambda_{n+1}(\cdot\,; \cdot)$
 are closely related to the hazard rates of the sojourn times. To see this, recall that
$\lambda_{n+1}(\cdot\,; \cdot)$
 are closely related to the hazard rates of the sojourn times. To see this, recall that 
 $S_0 = 1$
 and assume that
$S_0 = 1$
 and assume that 
 $n+1$
 is odd. Then, using (2.2), the conditional intensity of the jump process at time x given jumps at times
$n+1$
 is odd. Then, using (2.2), the conditional intensity of the jump process at time x given jumps at times 
 $0 \leq x_1 \leq x_2 \leq \cdots \leq x_n$
 can be simplified as
$0 \leq x_1 \leq x_2 \leq \cdots \leq x_n$
 can be simplified as 
 \begin{align*} \lambda_{n+1}(x;\ x_1, \ldots, x_n)\,{\mathrm{d}} x & = \mathbb{P}(X_{n+1} \leq x+ {\mathrm{d}} x\mid X_{n+1} \geq x, X_n = x_n, S_n = 1) \\ & = \frac{\mathbb{P}(x - x_n \leq T_{n+1} \leq x - x_n + {\mathrm{d}} x\mid X_n = x_n, S_n = 1)}{ \mathbb{P}(T_{n+1} \geq x-x_n\mid X_n = x_n, S_n = 1)} \\ & = \frac{g_Y(x_n, x - x_n) \, {\mathrm{d}} x }{1-G_Y(x_n, x - x_n)}\end{align*}
\begin{align*} \lambda_{n+1}(x;\ x_1, \ldots, x_n)\,{\mathrm{d}} x & = \mathbb{P}(X_{n+1} \leq x+ {\mathrm{d}} x\mid X_{n+1} \geq x, X_n = x_n, S_n = 1) \\ & = \frac{\mathbb{P}(x - x_n \leq T_{n+1} \leq x - x_n + {\mathrm{d}} x\mid X_n = x_n, S_n = 1)}{ \mathbb{P}(T_{n+1} \geq x-x_n\mid X_n = x_n, S_n = 1)} \\ & = \frac{g_Y(x_n, x - x_n) \, {\mathrm{d}} x }{1-G_Y(x_n, x - x_n)}\end{align*}
whenever well-defined, using the absolute continuity of the semi-Markov kernel. Note that this is exactly the hazard rate of the sojourn times. When 
 $G_Y(x_n, x-x_n) = 1$
, the conditional intensity is set to zero. For even
$G_Y(x_n, x-x_n) = 1$
, the conditional intensity is set to zero. For even 
 $n+1$
, a similar argument holds with
$n+1$
, a similar argument holds with 
 $g_Z$
 and
$g_Z$
 and 
 $G_Z$
 instead of
$G_Z$
 instead of 
 $g_Y$
 and
$g_Y$
 and 
 $G_Y$
.
$G_Y$
.
The conditional intensity is a convenient tool to guarantee the existence of the process. Indeed, [Reference Haezendonck and De Vylder10] developed suitable comparison criteria under which explosion, the situation in which there are infinitely many jumps in a finite time span, can be prevented. More formally, the following proposition holds.
Proposition 2.1. (Explosion, from [Reference Haezendonck and De Vylder10].) Let 
 $X_n$
 and
$X_n$
 and 
 $X_n^*$
 be two temporal point processes with corresponding conditional intensities
$X_n^*$
 be two temporal point processes with corresponding conditional intensities 
 $\lambda$
 and
$\lambda$
 and 
 $\lambda^*$
. If
$\lambda^*$
. If
- 
• for every  $n \in \mathbb{N}_0$
, $n \in \mathbb{N}_0$
, $\lambda_{n+1} \leq \lambda^*_{n+1}$
; $\lambda_{n+1} \leq \lambda^*_{n+1}$
;
- 
• for every  $n\in\mathbb{N}$
, either $n\in\mathbb{N}$
, either $\lambda_{n+1}(x;\ x_1, \ldots, x_n)$
 or $\lambda_{n+1}(x;\ x_1, \ldots, x_n)$
 or $\lambda^*_{n+1}(x;\ x_1, \ldots, x_n)$
 depends only on $\lambda^*_{n+1}(x;\ x_1, \ldots, x_n)$
 depends only on $x-x_n$
, $x-x_n$
,
 
the probability of explosion at or before time x of the point process defined by 
 $\lambda$
 is at most as big as that of the point process defined by
$\lambda$
 is at most as big as that of the point process defined by 
 $\lambda^*$
. Under the same conditions, for all
$\lambda^*$
. Under the same conditions, for all 
 $n\in{\mathbb{N}}$
 and
$n\in{\mathbb{N}}$
 and 
 $x\geq 0$
,
$x\geq 0$
, 
 $\mathbb{P}(X_n \leq x ) \leq \mathbb{P}(X_{n}^* \leq x)$
.
$\mathbb{P}(X_n \leq x ) \leq \mathbb{P}(X_{n}^* \leq x)$
.
Proof. See [Reference Haezendonck and De Vylder10, Corollaries 1, 2, and 5].
 Below, we establish existence for two common families of sojourn time distributions, the Gamma and the Weibull. We remark that in the semi-Markov paradigm, due to the Markov property, 
 $\lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1, \ldots, x_n)$
, as there is no dependence on previous values
$\lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1, \ldots, x_n)$
, as there is no dependence on previous values 
 $x_1, \ldots, x_{n-1}$
 of
$x_1, \ldots, x_{n-1}$
 of 
 $X_n$
. We will therefore use this notation in what follows.
$X_n$
. We will therefore use this notation in what follows.
Proposition 2.2. (Existence for Gamma distribution.) Let 
 $(S_n, X_n)_{n=1}^{\infty}$
 be a non-homogeneous alternating renewal process with values in
$(S_n, X_n)_{n=1}^{\infty}$
 be a non-homogeneous alternating renewal process with values in 
 $\{ 0, 1\} \times \mathbb{R}^+$
 with
$\{ 0, 1\} \times \mathbb{R}^+$
 with 
 $S_0=1$
,
$S_0=1$
, 
 $X_0=0$
, and semi-Markov kernels
$X_0=0$
, and semi-Markov kernels 
 $G_Y(x,\cdot)$
,
$G_Y(x,\cdot)$
, 
 $G_Z(x, \cdot)$
 that follow Gamma distributions with shape and rate parameters
$G_Z(x, \cdot)$
 that follow Gamma distributions with shape and rate parameters 
 $\theta_Y(x) = (k_Y(x), \lambda_Y(x))$
 and
$\theta_Y(x) = (k_Y(x), \lambda_Y(x))$
 and 
 $\theta_Z(x) = (k_Z(x), \lambda_Z(x))$
 in
$\theta_Z(x) = (k_Z(x), \lambda_Z(x))$
 in 
 $[1,\infty) \times (0,\infty)$
. Write
$[1,\infty) \times (0,\infty)$
. Write 
 $X_\infty = \lim_{n\to \infty} X_n$
 for the time of explosion. If
$X_\infty = \lim_{n\to \infty} X_n$
 for the time of explosion. If 
 $\lambda_Y(x) \leq c$
 and
$\lambda_Y(x) \leq c$
 and 
 $\lambda_Z(x) \leq c$
 for all
$\lambda_Z(x) \leq c$
 for all 
 $x\in\mathbb{R}^+$
 and some
$x\in\mathbb{R}^+$
 and some 
 $c \gt 0$
, then
$c \gt 0$
, then 
 $\mathbb{P}(X_\infty \lt \infty) = 0$
.
$\mathbb{P}(X_\infty \lt \infty) = 0$
.
 The proof is deferred to Appendix A.1. Important special cases include 
 $k_T(x) = 1$
 for exponential distributions or, more generally,
$k_T(x) = 1$
 for exponential distributions or, more generally, 
 $k_T(x) \in \mathbb{N}$
 corresponding to Erlang-distributed phases, where T can be either Y or Z.
$k_T(x) \in \mathbb{N}$
 corresponding to Erlang-distributed phases, where T can be either Y or Z.
Proposition 2.3. (Existence for Weibull distribution.) Let 
 $(S_n, X_n)_{n=1}^{\infty}$
 be a non-homogeneous alternating renewal process with values in
$(S_n, X_n)_{n=1}^{\infty}$
 be a non-homogeneous alternating renewal process with values in 
 $\{ 0, 1\} \times \mathbb{R}^+$
 with
$\{ 0, 1\} \times \mathbb{R}^+$
 with 
 $S_0=1$
,
$S_0=1$
, 
 $X_0=0$
, and semi-Markov kernels
$X_0=0$
, and semi-Markov kernels 
 $G_Y(x,\cdot)$
,
$G_Y(x,\cdot)$
, 
 $G_Z(x, \cdot)$
 that follow Weibull distributions with shape and rate parameters
$G_Z(x, \cdot)$
 that follow Weibull distributions with shape and rate parameters 
 $\theta_Y(x) = (k_Y(x), \lambda_Y(x))$
 and
$\theta_Y(x) = (k_Y(x), \lambda_Y(x))$
 and 
 $\theta_Z(x) = (k_Z(x), \lambda_Z(x))$
 in
$\theta_Z(x) = (k_Z(x), \lambda_Z(x))$
 in 
 $(0,\infty) \times (0,\infty)$
. Write
$(0,\infty) \times (0,\infty)$
. Write 
 $X_\infty = \lim_{n\to \infty} X_n$
 for the time of explosion. If:
$X_\infty = \lim_{n\to \infty} X_n$
 for the time of explosion. If:
- 
(i)  $\lambda_Y(x), \lambda_Z(x) \leq c$
 for some $\lambda_Y(x), \lambda_Z(x) \leq c$
 for some $c \gt 0$
, and $c \gt 0$
, and
- 
(ii) either  $1 \leq k_Y(x) \leq k$
, $1 \leq k_Y(x) \leq k$
, $1 \leq k_Z(x) \leq k$
 for some $1 \leq k_Z(x) \leq k$
 for some $k \geq 1$
, or $k \geq 1$
, or $k_Y(x) = k_Z(x) = k$
 for some $k_Y(x) = k_Z(x) = k$
 for some $k \gt 0$
, $k \gt 0$
,
 
then 
 $\mathbb{P}(X_\infty \lt \infty) = 0$
.
$\mathbb{P}(X_\infty \lt \infty) = 0$
.
 For the proof, see Appendix A.2. The case where 
 $k = 1$
 corresponds to exponential sojourn times.
$k = 1$
 corresponds to exponential sojourn times.
2.3. Renewal function: Existence and boundedness
 The process counting the number of cycles having occurred by time 
 $t \geq 0$
 can be written as
$t \geq 0$
 can be written as 
 $N(t) = \sup\{n \in \mathbb{N}_0\colon X_{2n} \leq t\}$
, where a cycle is an interval of time within which each state occurs once. Thus, N is a counting measure. The distribution of
$N(t) = \sup\{n \in \mathbb{N}_0\colon X_{2n} \leq t\}$
, where a cycle is an interval of time within which each state occurs once. Thus, N is a counting measure. The distribution of 
 $X_{2n}$
, the jump time after completing the nth cycle, is, for
$X_{2n}$
, the jump time after completing the nth cycle, is, for 
 $n\in\mathbb{N}_0$
 and
$n\in\mathbb{N}_0$
 and 
 $t\geq 0$
,
$t\geq 0$
, 
 \begin{equation*} F_{2n}(t) = \mathbb{P}\Bigg( \sum_{i=1}^{2n} T_i \leq t \Bigg) = \mathbb{P}(X_{2n} \leq t) = \mathbb{P}(N(t) \geq n).\end{equation*}
\begin{equation*} F_{2n}(t) = \mathbb{P}\Bigg( \sum_{i=1}^{2n} T_i \leq t \Bigg) = \mathbb{P}(X_{2n} \leq t) = \mathbb{P}(N(t) \geq n).\end{equation*}
The renewal function is defined, analogously to that of the classic alternating renewal process, as 
 $M(t) = \mathbb{E}N(t)$
,
$M(t) = \mathbb{E}N(t)$
, 
 $t \geq 0$
 [Reference Janssen and Manca12]. In our case,
$t \geq 0$
 [Reference Janssen and Manca12]. In our case, 
 \begin{equation*} M(t) = \mathbb{E}N(t) = \sum_{n=0}^{\infty} \mathbb{P}(N(t) \gt n) = \sum_{n=1}^{\infty} \mathbb{P}(X_{2n} \leq t) = \sum_{n=1}^{\infty} F_{2n}(t)\end{equation*}
\begin{equation*} M(t) = \mathbb{E}N(t) = \sum_{n=0}^{\infty} \mathbb{P}(N(t) \gt n) = \sum_{n=1}^{\infty} \mathbb{P}(X_{2n} \leq t) = \sum_{n=1}^{\infty} F_{2n}(t)\end{equation*}
is a 2n-fold convolution.
The following corollaries to Propositions 2.2 and 2.3 hold.
Corollary 2.1. (Renewal function bound for Gamma.) Let 
 $(S_n, X_n)_{n=1}^{\infty}$
 be as in Proposition 2.2. Then its renewal function M(t) satisfies
$(S_n, X_n)_{n=1}^{\infty}$
 be as in Proposition 2.2. Then its renewal function M(t) satisfies 
 $M(t) \leq ct$
,
$M(t) \leq ct$
, 
 $t\geq 0$
.
$t\geq 0$
.
For the proof, see Appendix A.3.
Corollary 2.2. (Renewal function bound for Weibull.) Let 
 $(S_n, X_n)_{n=1}^{\infty}$
 be as in Proposition 2.3. Then its renewal function M(t) is finite and bounded from above by the expectation
$(S_n, X_n)_{n=1}^{\infty}$
 be as in Proposition 2.3. Then its renewal function M(t) is finite and bounded from above by the expectation 
 ${\mathbb{E}} (N^*(t))$
 of a renewal process
${\mathbb{E}} (N^*(t))$
 of a renewal process 
 $N^*(t)$
 with Weibull-distributed sojourn times having shape parameter k and rate parameter c.
$N^*(t)$
 with Weibull-distributed sojourn times having shape parameter k and rate parameter c.
For the proof, see Appendix A.4.
2.4. Age and excess distributions
 Now that the theoretical groundwork for the censoring mechanism has been laid, we proceed by determining the joint distribution of age and excess. The age A(t) is the time elapsed since the last phase change, and B(t), the excess, is the time remaining until the next phase change. For all t where the process is in state 0, or the Z-phase, we assume that the occurrence time can be observed perfectly. Therefore we only consider age and excess with respect to state 1, or the Y-phase. This also provides us with a concrete censoring mechanism: depending on the phase within which a point falls, we note either the exact occurrence time, or the interval corresponding to the age and excess functions. Obtaining their joint distribution allows us to specify the likelihood of intervals based on their starting point and length in terms of the semi-Markov kernel 
 $G_Y$
. See Figure 1 for a visualisation of the age and excess functions.
$G_Y$
. See Figure 1 for a visualisation of the age and excess functions.

Figure 1. Visualisation of a non-homogeneous alternating renewal process with initial values 
 $S_0 = 1$
 and
$S_0 = 1$
 and 
 $X_0 = 0$
. At the dotted line, one cycle has passed – i.e. the process has taken both possible state values. The jump times correspond to a change of state. Since t falls in state 1, a non-zero age A(t) and excess B(t) are recorded. If t were to fall in state 0, an exact time would be recorded.
$X_0 = 0$
. At the dotted line, one cycle has passed – i.e. the process has taken both possible state values. The jump times correspond to a change of state. Since t falls in state 1, a non-zero age A(t) and excess B(t) are recorded. If t were to fall in state 0, an exact time would be recorded.
Proposition 2.4. (Age and excess.) Consider a non-homogeneous alternating renewal process 
 $(S_n, X_n)_{n=1}^{\infty}$
 with values in
$(S_n, X_n)_{n=1}^{\infty}$
 with values in 
 $\{0,1\} \times\mathbb{R}^+$
 with
$\{0,1\} \times\mathbb{R}^+$
 with 
 $S_0=1$
,
$S_0=1$
, 
 $X_0=0$
, semi-Markov kernels
$X_0=0$
, semi-Markov kernels 
 $G_Y$
 and
$G_Y$
 and 
 $G_Z$
, and associated counting measure N(t),
$G_Z$
, and associated counting measure N(t), 
 $t\geq 0$
. Let the age process with respect to the Y-phase be
$t\geq 0$
. Let the age process with respect to the Y-phase be 
 $A(t) = (t - X_{2N(t)})\,\mathbf{1}\{X_{2N(t) + 1} \gt t\}$
 and define the excess with respect to the Y-phase as
$A(t) = (t - X_{2N(t)})\,\mathbf{1}\{X_{2N(t) + 1} \gt t\}$
 and define the excess with respect to the Y-phase as 
 $B(t) = (X_{2N(t) + 1} - t)\,\mathbf{1}\{X_{2N(t) + 1} \gt t\}$
, where
$B(t) = (X_{2N(t) + 1} - t)\,\mathbf{1}\{X_{2N(t) + 1} \gt t\}$
, where 
 $X_{2N(t)}$
 is the jump time immediately after N(t) cycles have been completed. Then, for
$X_{2N(t)}$
 is the jump time immediately after N(t) cycles have been completed. Then, for 
 $t\geq 0$
,
$t\geq 0$
, 
 $0\leq x \leq t$
, and
$0\leq x \leq t$
, and 
 $z\geq 0$
,
$z\geq 0$
, 
 \begin{align} & \mathbb{P}(A(t) \leq x, B(t) \leq z) \nonumber \\ & = G_Y(0,t) - \int_{t-x}^{t} [1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) - \int_0^{t-x} [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s) \nonumber \\ & \quad + \mathbf{1}\{x=t\}[G_Y(0, t+z) - G_Y(0,t)]. \end{align}
\begin{align} & \mathbb{P}(A(t) \leq x, B(t) \leq z) \nonumber \\ & = G_Y(0,t) - \int_{t-x}^{t} [1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) - \int_0^{t-x} [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s) \nonumber \\ & \quad + \mathbf{1}\{x=t\}[G_Y(0, t+z) - G_Y(0,t)]. \end{align}
For the proof, see Appendix A.5.
 From Proposition 2.4 we conclude that the probability that time 
 $t \geq 0$
 falls in a Z-phase is given by
$t \geq 0$
 falls in a Z-phase is given by 
 \begin{equation} w_t = \mathbb{P}(A(t) \leq 0, B(t) \leq 0) = G_Y(0,t) - \int_0^{t} [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s).\end{equation}
\begin{equation} w_t = \mathbb{P}(A(t) \leq 0, B(t) \leq 0) = G_Y(0,t) - \int_0^{t} [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s).\end{equation}
This case constitutes the atomic part of (2.4). The singular component on the line 
 $x=t$
 has total mass
$x=t$
 has total mass 
 $1 - G_Y(0,t)$
 and represents the case that t falls before the first jump of the alternating renewal process.
$1 - G_Y(0,t)$
 and represents the case that t falls before the first jump of the alternating renewal process.
The absolutely continuous component of (2.4) can be written as
 \begin{align*}\int_0^x \int_0^z g_Y(t-u, u+v) m(t-u)\, {\mathrm{d}} u\,{\mathrm{d}} v\end{align*}
\begin{align*}\int_0^x \int_0^z g_Y(t-u, u+v) m(t-u)\, {\mathrm{d}} u\,{\mathrm{d}} v\end{align*}
provided that the Radon–Nikodym derivatives m of M and 
 $g_Y$
 of
$g_Y$
 of 
 $G_Y$
 exist. Recall that in our proposed censoring mechanism, when t falls in a Y-phase the entire interval
$G_Y$
 exist. Recall that in our proposed censoring mechanism, when t falls in a Y-phase the entire interval 
 $[t-A(t), t+B(t)]$
 is reported, which may be parametrised by the left-most point
$[t-A(t), t+B(t)]$
 is reported, which may be parametrised by the left-most point 
 $t-A(t)$
 and length
$t-A(t)$
 and length 
 $A(t) + B(t)$
. Suppose that
$A(t) + B(t)$
. Suppose that 
 $A(t) = u$
,
$A(t) = u$
, 
 $B(t) = v$
, and apply the change of variables
$B(t) = v$
, and apply the change of variables 
 $a = t- u$
 and
$a = t- u$
 and 
 $l = u + v$
. We find that the joint probability density function of left-most point and length is
$l = u + v$
. We find that the joint probability density function of left-most point and length is 
 \begin{equation} q_t(a, l) = \frac{m(a)g_Y(a, l)}{\int_0^t [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)} \mathbf{1} \{ 0 \leq a \leq t \leq a+l;\ l\geq 0 \}\end{equation}
\begin{equation} q_t(a, l) = \frac{m(a)g_Y(a, l)}{\int_0^t [1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)} \mathbf{1} \{ 0 \leq a \leq t \leq a+l;\ l\geq 0 \}\end{equation}
upon scaling.
Proposition 2.5. (Marginal and conditional distribution.) Let 
 $g_Y$
 and m be as before, and let (A,L) be distributed according to
$g_Y$
 and m be as before, and let (A,L) be distributed according to 
 $q_t(a,l)$
 given by (2.6). Then the marginal probability density function of A at
$q_t(a,l)$
 given by (2.6). Then the marginal probability density function of A at 
 $a \in [0, t]$
 is
$a \in [0, t]$
 is 
 \begin{equation} f_t(a) = \frac{m(a)[1-G_Y(a,t-a)]}{\int_{0}^t [1 - G_Y(s, t-s)]\, {\mathrm{d}} M(s) }\end{equation}
\begin{equation} f_t(a) = \frac{m(a)[1-G_Y(a,t-a)]}{\int_{0}^t [1 - G_Y(s, t-s)]\, {\mathrm{d}} M(s) }\end{equation}
and the conditional probability density function of L given 
 $A=a$
 is, for
$A=a$
 is, for 
 $l \in [t-a, \infty)$
,
$l \in [t-a, \infty)$
, 
 \begin{equation} f_{t,\,L\,|A=a}(l) = \frac{g_Y(a, l)}{1-G_Y(a,t-a)}.\end{equation}
\begin{equation} f_{t,\,L\,|A=a}(l) = \frac{g_Y(a, l)}{1-G_Y(a,t-a)}.\end{equation}
For the proof, see Appendix A.6.
3. A model for non-homogeneous interval censoring
3.1. Model formulation
The ensemble of potentially censored occurrence times can be mathematically formalised as a marked point process [Reference Daley and Vere-Jones7]. The ground process of points represents the uncensored event occurrences, which we model by a Markov point process [Reference Van Lieshout24] defined by a probability density with respect to a unit-rate Poisson process. Temporal variations can be taken into account as well as interactions between the points. Each point is subsequently marked, independently of other points, either by an atom at the point when it is observed perfectly, or by the interval in which the point lies in case of censoring. The mark kernel that governs the random censoring is based on the distribution of age and excess in a non-homogeneous alternating renewal process.
 Formally, let 
 $\mathcal{X}$
 be an open set on the real line. The state space
$\mathcal{X}$
 be an open set on the real line. The state space 
 $\mathcal{N}_{\mathcal{X}}$
 of a point process X consists of finite sets
$\mathcal{N}_{\mathcal{X}}$
 of a point process X consists of finite sets 
 $\{x_1, x_2, \ldots, x_n\} \subset\mathcal{X}$
,
$\{x_1, x_2, \ldots, x_n\} \subset\mathcal{X}$
, 
 $n \in \mathbb{N}_0$
, which we equip with the Borel
$n \in \mathbb{N}_0$
, which we equip with the Borel 
 $\sigma$
-algebra of the weak topology [Reference Daley and Vere-Jones7, Appendix A2]. Let p be a measurable, non-negative function on
$\sigma$
-algebra of the weak topology [Reference Daley and Vere-Jones7, Appendix A2]. Let p be a measurable, non-negative function on 
 $\mathcal{X}$
 that integrates to unity, and
$\mathcal{X}$
 that integrates to unity, and 
 $\sim$
 be a symmetric, reflexive relation on
$\sim$
 be a symmetric, reflexive relation on 
 $\mathcal{X}$
. A point process X on
$\mathcal{X}$
. A point process X on 
 $\mathcal{X}$
 having probability density p with respect to a unit-rate Poisson process is Markov with respect to
$\mathcal{X}$
 having probability density p with respect to a unit-rate Poisson process is Markov with respect to 
 $\sim$
 if [Reference Ripley and Kelly22, Reference Van Lieshout24]:
$\sim$
 if [Reference Ripley and Kelly22, Reference Van Lieshout24]:
- 
(i) p is hereditary, i.e.  $p(\mathbf{x}) \gt 0$
 implies that $p(\mathbf{x}) \gt 0$
 implies that $p(\mathbf{y}) \gt 0$
 for all subsets $p(\mathbf{y}) \gt 0$
 for all subsets $\mathbf{y}$
 of $\mathbf{y}$
 of $\mathbf{x}$
, and $\mathbf{x}$
, and
- 
(ii) the conditional intensity, defined as  ${p(\mathbf{x} \cup \{ t \} )}/{p(\mathbf{x})}$
, with ${p(\mathbf{x} \cup \{ t \} )}/{p(\mathbf{x})}$
, with $a/0 = 0$
 for $a/0 = 0$
 for $a \geq 0$
, depends only on the neighbourhood $a \geq 0$
, depends only on the neighbourhood $\{ x \in \mathbf{x} \colon x \sim t \}$
 of t in $\{ x \in \mathbf{x} \colon x \sim t \}$
 of t in $\mathbf{x}$
 for every $\mathbf{x}$
 for every $t\in \mathcal{X}\setminus \mathbf{x}$
 and every $t\in \mathcal{X}\setminus \mathbf{x}$
 and every $\mathbf{x} = \{ x_1, \ldots, x_n\} \subset \mathcal{X}$
 for which $\mathbf{x} = \{ x_1, \ldots, x_n\} \subset \mathcal{X}$
 for which $p(\mathbf{x}) \gt 0$
. $p(\mathbf{x}) \gt 0$
.
 An interaction function is a family 
 $\phi_0, \phi_1, \phi_2,\ldots$
 of non-negative functions
$\phi_0, \phi_1, \phi_2,\ldots$
 of non-negative functions 
 $\phi_i$
 defined on configurations of i points that take the value one whenever the configuration contains a pair
$\phi_i$
 defined on configurations of i points that take the value one whenever the configuration contains a pair 
 $\{ x_1, x_2 \}$
 of points that are unrelated, i.e.
$\{ x_1, x_2 \}$
 of points that are unrelated, i.e. 
 $x_1 \not\sim x_2$
. By the Hammersley–Clifford theorem [Reference Ripley and Kelly22], a Markov density p can be factorised as
$x_1 \not\sim x_2$
. By the Hammersley–Clifford theorem [Reference Ripley and Kelly22], a Markov density p can be factorised as 
 \begin{equation}p(\mathbf{x}) = \prod_{\mathbf{y}\subset\mathbf{x}} \phi_{|\mathbf{y}|}(\mathbf{y})\end{equation}
\begin{equation}p(\mathbf{x}) = \prod_{\mathbf{y}\subset\mathbf{x}} \phi_{|\mathbf{y}|}(\mathbf{y})\end{equation}
for some interaction function 
 $\phi_i$
, writing
$\phi_i$
, writing 
 $|\cdot|$
 for cardinality. The function
$|\cdot|$
 for cardinality. The function 
 $\phi_1(x)$
 can be used to model temporal variations in the likelihood of events occurring. Higher-order terms
$\phi_1(x)$
 can be used to model temporal variations in the likelihood of events occurring. Higher-order terms 
 $\phi_2, \phi_3, \ldots$
 govern interactions between pairs, triples, or tuples of points.
$\phi_2, \phi_3, \ldots$
 govern interactions between pairs, triples, or tuples of points.
 The points x in a realisation 
 $\mathbf{x}$
 of X are marked independently according to a mark kernel
$\mathbf{x}$
 of X are marked independently according to a mark kernel 
 $\nu(\cdot \mid x)$
 on
$\nu(\cdot \mid x)$
 on 
 ${\mathbb{R}}\times{\mathbb{R}}^+$
. A mark (a, l) represents an interval
${\mathbb{R}}\times{\mathbb{R}}^+$
. A mark (a, l) represents an interval 
 $[a,\ a+l]$
 that starts at a and has length l. The mark kernel
$[a,\ a+l]$
 that starts at a and has length l. The mark kernel 
 $\nu$
, which describes the conditional distribution of the mark given a location in time [Reference Daley and Vere-Jones7], formalises the non-homogeneous alternating renewal process censoring discussed in Section 2. For demonstrative purposes, we assumed a starting time of 0, which we now set to
$\nu$
, which describes the conditional distribution of the mark given a location in time [Reference Daley and Vere-Jones7], formalises the non-homogeneous alternating renewal process censoring discussed in Section 2. For demonstrative purposes, we assumed a starting time of 0, which we now set to 
 $-\infty$
. Doing so also allows us to ignore the singular component. Hence, the appropriate time-dependent mark kernel
$-\infty$
. Doing so also allows us to ignore the singular component. Hence, the appropriate time-dependent mark kernel 
 $\nu(\cdot \mid x)$
,
$\nu(\cdot \mid x)$
, 
 $x\in\mathcal{X}$
, for a Borel subset
$x\in\mathcal{X}$
, for a Borel subset 
 $A \subset \mathbb{R} \times \mathbb{R}^+$
 is
$A \subset \mathbb{R} \times \mathbb{R}^+$
 is 
 \begin{align} \nu(A \mid x) & = \bigg(1 - {\int_{-\infty}^x[1 - G_Y(s, x-s)]\,{\mathrm{d}} M(s)}\bigg) \delta(\{(x,0)\} \cap A) \nonumber \\ & \quad + \int_{-\infty}^x\int_{x-a}^{\infty} \mathbf{1}\{ (a,l) \in A \} \, G_Y(a,{\mathrm{d}} l)\,{\mathrm{d}} M(a). \end{align}
\begin{align} \nu(A \mid x) & = \bigg(1 - {\int_{-\infty}^x[1 - G_Y(s, x-s)]\,{\mathrm{d}} M(s)}\bigg) \delta(\{(x,0)\} \cap A) \nonumber \\ & \quad + \int_{-\infty}^x\int_{x-a}^{\infty} \mathbf{1}\{ (a,l) \in A \} \, G_Y(a,{\mathrm{d}} l)\,{\mathrm{d}} M(a). \end{align}
Write W for the marked point process defined by 
 $p{(}{\cdot}{)}$
 and
$p{(}{\cdot}{)}$
 and 
 $\nu(\cdot \mid \cdot)$
 [Reference Daley and Vere-Jones7, Proposition 4.IV]. A realisation
$\nu(\cdot \mid \cdot)$
 [Reference Daley and Vere-Jones7, Proposition 4.IV]. A realisation 
 $\mathbf{w}$
 is of the form
$\mathbf{w}$
 is of the form 
 \begin{align*}\mathbf{w} = \{w_1, w_2, \ldots, w_n\} = \{(x_1, (a_1, l_1)), (x_2, (a_2, l_2)), \ldots, (x_n, (a_n, l_n))\}\end{align*}
\begin{align*}\mathbf{w} = \{w_1, w_2, \ldots, w_n\} = \{(x_1, (a_1, l_1)), (x_2, (a_2, l_2)), \ldots, (x_n, (a_n, l_n))\}\end{align*}
for 
 $a_i \leq x_i \leq a_i+l_i$
 for all
$a_i \leq x_i \leq a_i+l_i$
 for all 
 $i = 1, 2, \ldots, n$
. We denote the set of realisations by
$i = 1, 2, \ldots, n$
. We denote the set of realisations by 
 $\mathcal{N}_{ {\mathcal{X}}\times (\mathbb{R} \times \mathbb{R}^+) }$
.
$\mathcal{N}_{ {\mathcal{X}}\times (\mathbb{R} \times \mathbb{R}^+) }$
.
 The model description is complete by noting that the observable pattern of marks after censoring is 
 $U = \bigcup_{(x_i, (a_i,l_i)) \in W} \{ (a_i, l_i) \}$
. To obtain the probability distribution of U, write, for F in the Borel
$U = \bigcup_{(x_i, (a_i,l_i)) \in W} \{ (a_i, l_i) \}$
. To obtain the probability distribution of U, write, for F in the Borel 
 $\sigma$
-algebra of the weak topology on
$\sigma$
-algebra of the weak topology on 
 $\mathcal{N}_{ {\mathcal{\mathbb{R} \times \mathbb{R}^+}}}$
,
$\mathcal{N}_{ {\mathcal{\mathbb{R} \times \mathbb{R}^+}}}$
, 
 \begin{align*} \mathbb{P}(U\in F \mid X = \mathbf{x} ) = \int_{(\mathbb{R}\times\mathbb{R}^+)^n} \mathbf{1}(\{ (a_1, l_1), \ldots, (a_n, l_n) \} \in F) \prod_{i=1}^n {\mathrm{d}} \nu((a_i, l_i)\mid x_i),\end{align*}
\begin{align*} \mathbb{P}(U\in F \mid X = \mathbf{x} ) = \int_{(\mathbb{R}\times\mathbb{R}^+)^n} \mathbf{1}(\{ (a_1, l_1), \ldots, (a_n, l_n) \} \in F) \prod_{i=1}^n {\mathrm{d}} \nu((a_i, l_i)\mid x_i),\end{align*}
where 
 $\mathbf{x} = \{x_1, \ldots, x_n\}$
, and then take the expectation with respect to X.
$\mathbf{x} = \{x_1, \ldots, x_n\}$
, and then take the expectation with respect to X.
3.2. Conditional distribution
 Write 
 $\mathbf{u}$
 for a realisation of the interval set U. We are interested in the conditional distributions of X and W given U.
$\mathbf{u}$
 for a realisation of the interval set U. We are interested in the conditional distributions of X and W given U.
Theorem 3.1. (Conditional distribution.) Let W be a marked point process with ground process X on the open set 
 ${\mathcal{X}} \subset {\mathbb{R}}$
 defined by its probability density function p with respect to the distribution of a unit-rate Poisson process having independent marks distributed according to the mark kernel
${\mathcal{X}} \subset {\mathbb{R}}$
 defined by its probability density function p with respect to the distribution of a unit-rate Poisson process having independent marks distributed according to the mark kernel 
 $\nu( \cdot \mid x) $
 for
$\nu( \cdot \mid x) $
 for 
 $x \in {\mathcal{X}}$
 given by (3.2). Let
$x \in {\mathcal{X}}$
 given by (3.2). Let 
 ${\mathbf{u}}$
 be a realisation of U that consists of an atomic part
${\mathbf{u}}$
 be a realisation of U that consists of an atomic part 
 $\{ (a_1, 0), \ldots, (a_m, 0) \}$
,
$\{ (a_1, 0), \ldots, (a_m, 0) \}$
, 
 $m\in{\mathbb{N}}_0$
, and a non-atomic part
$m\in{\mathbb{N}}_0$
, and a non-atomic part 
 $\{ (a_{m+1}, l_{m+1}), \ldots, (a_n, l_n) \}$
,
$\{ (a_{m+1}, l_{m+1}), \ldots, (a_n, l_n) \}$
, 
 $n \geq m$
. Then the conditional distribution of X given
$n \geq m$
. Then the conditional distribution of X given 
 $U = {\mathbf{u}}$
 satisfies, for A in the Borel
$U = {\mathbf{u}}$
 satisfies, for A in the Borel 
 $\sigma$
-algebra of the weak topology on
$\sigma$
-algebra of the weak topology on 
 ${\mathcal{N}}_{\mathcal{X}}$
,
${\mathcal{N}}_{\mathcal{X}}$
, 
 \begin{align*} \mathbb{P}(X \in A\mid U=\mathbf{u}) = & \int_{\mathcal{X}^{n-m}} p(\{a_1, \ldots, a_m, x_{1}, \ldots, x_{n-m}\}) \mathbf{1}_A(\{a_1, \ldots, a_m, x_{1}, \ldots, x_{n-m}\}) \\ & \quad \sum_{\substack{D_1, \ldots, D_{n-m} \\ \cup_j \{ D_j \} = \{1, \ldots, n-m\}}} \prod_{i=1}^{n-m} \mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{D_i})\,{\mathrm{d}} x_i \times c(\mathbf{u}),\end{align*}
\begin{align*} \mathbb{P}(X \in A\mid U=\mathbf{u}) = & \int_{\mathcal{X}^{n-m}} p(\{a_1, \ldots, a_m, x_{1}, \ldots, x_{n-m}\}) \mathbf{1}_A(\{a_1, \ldots, a_m, x_{1}, \ldots, x_{n-m}\}) \\ & \quad \sum_{\substack{D_1, \ldots, D_{n-m} \\ \cup_j \{ D_j \} = \{1, \ldots, n-m\}}} \prod_{i=1}^{n-m} \mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{D_i})\,{\mathrm{d}} x_i \times c(\mathbf{u}),\end{align*}
provided that the normalisation constant
 \begin{equation*} c(\mathbf{u}) = \Bigg[\int_{\mathcal{X}^{n-m}} p(\mathbf{x} \cup \{a_1, \ldots, a_m\}) \sum_{\substack{D_1, \ldots, D_{n-m} \\ \cup_j \{ D_j \} = \{1, \ldots, n-m\}}} \prod_{i=1}^{n-m} \mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{D_i})\,{\mathrm{d}} x_i\Bigg]^{-1}\end{equation*}
\begin{equation*} c(\mathbf{u}) = \Bigg[\int_{\mathcal{X}^{n-m}} p(\mathbf{x} \cup \{a_1, \ldots, a_m\}) \sum_{\substack{D_1, \ldots, D_{n-m} \\ \cup_j \{ D_j \} = \{1, \ldots, n-m\}}} \prod_{i=1}^{n-m} \mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{D_i})\,{\mathrm{d}} x_i\Bigg]^{-1}\end{equation*}
exists in 
 $(0,\infty)$
.
$(0,\infty)$
.
Proof. We must prove (see, e.g., [Reference Daley and Vere-Jones7, Appendix 3.1]), for each A in the Borel 
 $\sigma$
-algebra of
$\sigma$
-algebra of 
 ${\mathcal{N}}_{\mathcal{X}}$
 with respect to the weak topology and each F in the Borel
${\mathcal{N}}_{\mathcal{X}}$
 with respect to the weak topology and each F in the Borel 
 $\sigma$
-algebra of the weak topology on
$\sigma$
-algebra of the weak topology on 
 ${\mathcal{N}}_{{\mathbb{R}}\times{\mathbb{R}}^+}$
, that
${\mathcal{N}}_{{\mathbb{R}}\times{\mathbb{R}}^+}$
, that 
 \begin{equation*}{\mathbb{E}}[ {\mathbf{1}}_F(U) {\mathbb{P}}( X \in A \mid U )] ={\mathbb{E}}[ {\mathbf{1}}_F(U) {\mathbf{1}}_A(X)],\end{equation*}
\begin{equation*}{\mathbb{E}}[ {\mathbf{1}}_F(U) {\mathbb{P}}( X \in A \mid U )] ={\mathbb{E}}[ {\mathbf{1}}_F(U) {\mathbf{1}}_A(X)],\end{equation*}
as in [Reference Van Lieshout and Markwitz25, (4)]. From the model description, writing 
 $w_t$
 for the modification of (2.5) over
$w_t$
 for the modification of (2.5) over 
 $({-}{\infty,} t)$
,
$({-}{\infty,} t)$
, 
 $\ell$
 for Lebesgue measure, and
$\ell$
 for Lebesgue measure, and 
 $| \cdot |$
 for cardinality,
$| \cdot |$
 for cardinality, 
 \begin{align} & \mathbb{E}[\mathbf{1}_F(U)\mathbf{1}_A(X)] \nonumber \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \nonumber \\ & \qquad \int_{\mathcal{X}^n}\mathbf{1}_A(\mathbf{x})p(\{x_1,\ldots,x_n\})\sum_{C_0\subset\{1,\ldots,n\}} \frac{1}{(n - |C_0|)!} \prod_{i\in C_0} w_{x_i} \nonumber \\ & \qquad\quad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}}\mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \nonumber \\ & \qquad\qquad \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{C_j\} = \{1,\ldots,n\}\setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j, a_j + l_j]}(x_{C_j})\,{\mathrm{d}} a_j\,{\mathrm{d}} l_j \prod_{i=1}^n{\mathrm{d}} x_i. \end{align}
\begin{align} & \mathbb{E}[\mathbf{1}_F(U)\mathbf{1}_A(X)] \nonumber \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \nonumber \\ & \qquad \int_{\mathcal{X}^n}\mathbf{1}_A(\mathbf{x})p(\{x_1,\ldots,x_n\})\sum_{C_0\subset\{1,\ldots,n\}} \frac{1}{(n - |C_0|)!} \prod_{i\in C_0} w_{x_i} \nonumber \\ & \qquad\quad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}}\mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \nonumber \\ & \qquad\qquad \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{C_j\} = \{1,\ldots,n\}\setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j, a_j + l_j]}(x_{C_j})\,{\mathrm{d}} a_j\,{\mathrm{d}} l_j \prod_{i=1}^n{\mathrm{d}} x_i. \end{align}
Similarly,
 \begin{align*} & {\mathbb{E}}[{\mathbf{1}}_F(U){\mathbb{P}}(X \in A \mid U )] \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \\ & \qquad \int_{\mathcal{X}^n} p(\{x_1, \ldots, x_n\})\sum_{C_0 \subset\{1,\ldots,n\}} \frac{1}{(n - |C_0|)!}\prod_{i \in C_0} w_{x_i} \\ & \qquad\quad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}}\mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad \mathbb{P}(X \in A\mid U = \{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{ C_j\} = \{1,\ldots, n\} \setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j, a_j + l_j]}(x_{C_j})\,{\mathrm{d}} a_j\,{\mathrm{d}} l_j\prod_{i=1}^n{\mathrm{d}} x_i.\end{align*}
\begin{align*} & {\mathbb{E}}[{\mathbf{1}}_F(U){\mathbb{P}}(X \in A \mid U )] \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \\ & \qquad \int_{\mathcal{X}^n} p(\{x_1, \ldots, x_n\})\sum_{C_0 \subset\{1,\ldots,n\}} \frac{1}{(n - |C_0|)!}\prod_{i \in C_0} w_{x_i} \\ & \qquad\quad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}}\mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad \mathbb{P}(X \in A\mid U = \{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{ C_j\} = \{1,\ldots, n\} \setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j, a_j + l_j]}(x_{C_j})\,{\mathrm{d}} a_j\,{\mathrm{d}} l_j\prod_{i=1}^n{\mathrm{d}} x_i.\end{align*}
Next, we plug in the expression for 
 $\mathbb{P}(X \in A\mid U=\mathbf{u})$
 proposed in the statement of the theorem. Upon substitution, changing integration order, and rearranging, we obtain
$\mathbb{P}(X \in A\mid U=\mathbf{u})$
 proposed in the statement of the theorem. Upon substitution, changing integration order, and rearranging, we obtain 
 \begin{align*} & {\mathbb{E}}[{\mathbf{1}}_F(U){\mathbb{P}}(X \in A \mid U )] \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \\ & \qquad \sum_{C_0 \subset\{1,\ldots,n\}}\frac{1}{(n - |C_0|)!} \\ & \qquad\quad \int_{\mathcal{X}^n} p(\mathbf{y} \cup \mathbf{x}_{C_0})\mathbf{1}_A(\mathbf{y}\cup\mathbf{x}_{C_0}) \prod_{i \in C_0} w_{x_i}\end{align*}
\begin{align*} & {\mathbb{E}}[{\mathbf{1}}_F(U){\mathbb{P}}(X \in A \mid U )] \\ & = \sum_{n=0}^{\infty}\frac{{\mathrm{e}}^{-\ell(\mathcal{X})}}{n!} \\ & \qquad \sum_{C_0 \subset\{1,\ldots,n\}}\frac{1}{(n - |C_0|)!} \\ & \qquad\quad \int_{\mathcal{X}^n} p(\mathbf{y} \cup \mathbf{x}_{C_0})\mathbf{1}_A(\mathbf{y}\cup\mathbf{x}_{C_0}) \prod_{i \in C_0} w_{x_i}\end{align*}
 \begin{align*} & \qquad\qquad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}} \mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad\quad c(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|}) \cup (\mathbf{x}_{C_0}\times\{0\})) \\ & \qquad\qquad\quad \sum_{\substack{D_1,\ldots,D_{n-|C_0|} \\ \cup_j \{D_j\} = \{1,\ldots,n\} \setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j,a_j + l_j]}(y_{D_j}) \\ & \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \Bigg(\int_{\mathcal{X}^{n-|C_0|}}p(\mathbf{x}) \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{C_j\} = \{1,\ldots,n\} \setminus C_0}} \\ & \qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \prod_{j=1}^{n-|C_0|}\mathbf{1}_{[a_j, a_j+l_j]}(x_{C_j}) \prod_{j\not\in C_0} {\mathrm{d}} x_j \Bigg) \\ & \qquad\qquad\quad \prod_{j=1}^{n-|C_0|}{\mathrm{d}} a_j\,{\mathrm{d}} l_j\prod_{i\in C_0}{\mathrm{d}} x_i\prod_{k=1}^{n-|C_0|}{\mathrm{d}} y_k = \mathbb{E}[\mathbf{1}_F(U)\mathbf{1}_A(X)],\end{align*}
\begin{align*} & \qquad\qquad \int_{(\mathbb{R}\times\mathbb{R}_0^+)^{n-|C_0|}} \mathbf{1}_F(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|})\} \cup (\mathbf{x}_{C_0} \times \{{0}\})) \\ & \qquad\qquad\quad c(\{(a_1,l_1),\ldots,(a_{n-|C_0|},l_{n-|C_0|}) \cup (\mathbf{x}_{C_0}\times\{0\})) \\ & \qquad\qquad\quad \sum_{\substack{D_1,\ldots,D_{n-|C_0|} \\ \cup_j \{D_j\} = \{1,\ldots,n\} \setminus C_0}} \prod_{j=1}^{n-|C_0|} m(a_j)g_Y(a_j,l_j)\mathbf{1}_{[a_j,a_j + l_j]}(y_{D_j}) \\ & \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \Bigg(\int_{\mathcal{X}^{n-|C_0|}}p(\mathbf{x}) \sum_{\substack{C_1,\ldots,C_{n-|C_0|} \\ \cup_j \{C_j\} = \{1,\ldots,n\} \setminus C_0}} \\ & \qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad \prod_{j=1}^{n-|C_0|}\mathbf{1}_{[a_j, a_j+l_j]}(x_{C_j}) \prod_{j\not\in C_0} {\mathrm{d}} x_j \Bigg) \\ & \qquad\qquad\quad \prod_{j=1}^{n-|C_0|}{\mathrm{d}} a_j\,{\mathrm{d}} l_j\prod_{i\in C_0}{\mathrm{d}} x_i\prod_{k=1}^{n-|C_0|}{\mathrm{d}} y_k = \mathbb{E}[\mathbf{1}_F(U)\mathbf{1}_A(X)],\end{align*}
since the term within brackets cancels out against the normalisation constant 
 $c{(}{\cdot}{)}$
.
$c{(}{\cdot}{)}$
.
Strikingly, although the marking mechanism is more complicated than that in [Reference Van Lieshout and Markwitz25], the conditional distribution of X has the same form.
 The conditional distribution of W can be obtained in the same vein, by considering 
 $\mathbf{1}_{A}(W)$
 instead of
$\mathbf{1}_{A}(W)$
 instead of 
 $\mathbf{1}_A(X)$
 for A a Borel set in
$\mathbf{1}_A(X)$
 for A a Borel set in 
 $\mathcal{N}_{\mathcal{X}\times(\mathbb{R}\times\mathbb{R}^+)}$
, the space of marked point configurations. It is given by
$\mathcal{N}_{\mathcal{X}\times(\mathbb{R}\times\mathbb{R}^+)}$
, the space of marked point configurations. It is given by 
 \begin{align*} & \mathbb{P}(W \in A \mid U=\mathbf{u}) \\ & \propto \int_{\mathcal{X}^{n-m}}p(\{a_1,\ldots,a_m,x_{1},\ldots,x_{n-m}\}) \\ & \qquad \mathbf{1}_A(\{(a_1,(a_1,0)),\ldots,(a_m,(a_m,0)),(x_{1},(a_{m+1},l_{m+1})),\ldots,(x_{n-m},(a_{n},l_{n}))\}) \\ & \qquad \prod_{i=1}^{n-m}\mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{i})\,{\mathrm{d}} x_i .\end{align*}
\begin{align*} & \mathbb{P}(W \in A \mid U=\mathbf{u}) \\ & \propto \int_{\mathcal{X}^{n-m}}p(\{a_1,\ldots,a_m,x_{1},\ldots,x_{n-m}\}) \\ & \qquad \mathbf{1}_A(\{(a_1,(a_1,0)),\ldots,(a_m,(a_m,0)),(x_{1},(a_{m+1},l_{m+1})),\ldots,(x_{n-m},(a_{n},l_{n}))\}) \\ & \qquad \prod_{i=1}^{n-m}\mathbf{1}_{[a_{m+i}, a_{m+i} + l_{m+i}]}(x_{i})\,{\mathrm{d}} x_i .\end{align*}
4. Modelling considerations
 In this section, we consider parametric forms for 
 $p{(}{\cdot}{)}$
 and
$p{(}{\cdot}{)}$
 and 
 $\nu(\cdot \mid x)$
.
$\nu(\cdot \mid x)$
.
4.1. Non-homogeneous point process densities
 We look first at inhomogeneity that manifests itself via the occurrence time distribution. In view of (3.1), it is natural to add inhomogeneity by means of the first-order interaction function 
 $\phi_1$
, a procedure known as type I inhomogeneity [Reference Jensen and Nielsen13]. The idea is to let
$\phi_1$
, a procedure known as type I inhomogeneity [Reference Jensen and Nielsen13]. The idea is to let 
 $\phi_1(\{ x \}) = \beta(x)$
 vary over time according to a measurable function
$\phi_1(\{ x \}) = \beta(x)$
 vary over time according to a measurable function 
 $\beta$
 that maps
$\beta$
 that maps 
 $x \in \mathcal{X}$
 to
$x \in \mathcal{X}$
 to 
 $[0,\infty)$
. In many applications, it may make sense to model
$[0,\infty)$
. In many applications, it may make sense to model 
 $\beta$
 as a step function. More specifically, given a measurable partition
$\beta$
 as a step function. More specifically, given a measurable partition 
 $B_1, \ldots, B_K$
 of
$B_1, \ldots, B_K$
 of 
 $\mathcal{X}$
, set
$\mathcal{X}$
, set 
 $\beta(x) = \sum_{k=1}^{K}\beta_k\mathbf{1}_{B_k}(x)$
,
$\beta(x) = \sum_{k=1}^{K}\beta_k\mathbf{1}_{B_k}(x)$
, 
 $x \in \mathcal{X}$
, where
$x \in \mathcal{X}$
, where 
 $\beta_k \geq 0$
 is the value that
$\beta_k \geq 0$
 is the value that 
 $\beta$
 takes in the corresponding set
$\beta$
 takes in the corresponding set 
 $B_k$
.
$B_k$
.
 The function 
 $\phi_1$
 can be combined with classic second- and higher-order interaction functions. For instance, the probability density of the non-homogeneous area-interaction point process [Reference Baddeley and van Lieshout3] becomes
$\phi_1$
 can be combined with classic second- and higher-order interaction functions. For instance, the probability density of the non-homogeneous area-interaction point process [Reference Baddeley and van Lieshout3] becomes 
 \begin{align} p(\mathbf{x}) = \alpha_p \Bigg( \prod_{x \in \mathbf{x}} \beta(x) \Bigg) \exp[{-}(\log \gamma)\, \ell( {\mathcal{X}} \cap U_r(\mathbf{x}) )] \end{align}
\begin{align} p(\mathbf{x}) = \alpha_p \Bigg( \prod_{x \in \mathbf{x}} \beta(x) \Bigg) \exp[{-}(\log \gamma)\, \ell( {\mathcal{X}} \cap U_r(\mathbf{x}) )] \end{align}
with respect to a unit-rate Poisson process on 
 $\mathcal{X}$
. The parameter
$\mathcal{X}$
. The parameter 
 $\gamma$
 quantifies the interaction strength, r the radius of interaction, and
$\gamma$
 quantifies the interaction strength, r the radius of interaction, and 
 $\alpha_p = c(\beta({\cdot}), \gamma)$
 is a normalisation constant [Reference Baddeley and van Lieshout3] that depends on the function
$\alpha_p = c(\beta({\cdot}), \gamma)$
 is a normalisation constant [Reference Baddeley and van Lieshout3] that depends on the function 
 $\beta$
 as well as on
$\beta$
 as well as on 
 $\gamma$
. Additionally,
$\gamma$
. Additionally, 
 $U_r(\mathbf{x}) =\bigcup_{i=1}^n B(x_i, r)$
, where
$U_r(\mathbf{x}) =\bigcup_{i=1}^n B(x_i, r)$
, where 
 $B(x_i,r)$
 is the closed interval
$B(x_i,r)$
 is the closed interval 
 $[ x_i-r, x_i + r]$
. We observe regularity for
$[ x_i-r, x_i + r]$
. We observe regularity for 
 $\gamma \lt 1$
 and clustering for
$\gamma \lt 1$
 and clustering for 
 $\gamma \gt 1$
, and
$\gamma \gt 1$
, and 
 $\gamma = 1$
 corresponds to a non-homogeneous Poisson process with intensity function
$\gamma = 1$
 corresponds to a non-homogeneous Poisson process with intensity function 
 $\beta$
. For further examples, we refer to [Reference Van Lieshout24].
$\beta$
. For further examples, we refer to [Reference Van Lieshout24].
4.2. Parametric modelling of the mark kernel
 To proceed, parametric forms for 
 $G_Y$
 and m must be developed. We begin by modelling
$G_Y$
 and m must be developed. We begin by modelling 
 $G_Y$
, the semi-Markov kernel that determines the length of time until the next jump. We may take one of the time-dependent probability density functions considered in Section 2.2. For instance,
$G_Y$
, the semi-Markov kernel that determines the length of time until the next jump. We may take one of the time-dependent probability density functions considered in Section 2.2. For instance, 
 $g_Y(a, \cdot)$
 could be the density function of an exponential distribution with rate
$g_Y(a, \cdot)$
 could be the density function of an exponential distribution with rate 
 \begin{align} \lambda(a;\ \alpha) = \alpha (b + \sin(c a)), \quad a \in \mathbb{R}, \end{align}
\begin{align} \lambda(a;\ \alpha) = \alpha (b + \sin(c a)), \quad a \in \mathbb{R}, \end{align}
where c specifies the period and 
 $b \geq 1$
 the elevation away from 0. The parameter
$b \geq 1$
 the elevation away from 0. The parameter 
 $\alpha$
 determines the amplitude of the harmonic.
$\alpha$
 determines the amplitude of the harmonic.
 We could proceed in a similar fashion for 
 $g_Z$
. However, there are two problems with such an approach. From a probabilistic point of view, tractable expressions for the renewal density m in terms of the semi-Markov kernels
$g_Z$
. However, there are two problems with such an approach. From a probabilistic point of view, tractable expressions for the renewal density m in terms of the semi-Markov kernels 
 $G_Y$
 and
$G_Y$
 and 
 $G_Z$
 do not seem to exist, and, statistically speaking, lengths of Z-phases cannot be observed. Therefore, we shall model m directly. The following proposition justifies this approach.
$G_Z$
 do not seem to exist, and, statistically speaking, lengths of Z-phases cannot be observed. Therefore, we shall model m directly. The following proposition justifies this approach.
Proposition 4.1. (Renewal function modelling.) Let 
 $(S_n, X_n)_{n=1}^\infty$
 be a non-homogeneous alternating renewal process on
$(S_n, X_n)_{n=1}^\infty$
 be a non-homogeneous alternating renewal process on 
 $\{ 1 \}\times \mathbb{R}^+$
 with
$\{ 1 \}\times \mathbb{R}^+$
 with 
 $S_0 = 1$
 and
$S_0 = 1$
 and 
 $X_0=0$
 having semi-Markov kernel
$X_0=0$
 having semi-Markov kernel 
 $G_Y$
 defined by a density function
$G_Y$
 defined by a density function 
 $g_Y(t, \tau)$
,
$g_Y(t, \tau)$
, 
 $t\in\mathbb{R}^+$
,
$t\in\mathbb{R}^+$
, 
 $\tau \in [0,\infty)$
, and write
$\tau \in [0,\infty)$
, and write 
 $\tilde m$
 for the density of its renewal function
$\tilde m$
 for the density of its renewal function 
 $\tilde M(t) = \sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t )$
. If
$\tilde M(t) = \sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t )$
. If 
 $h(t)\colon \mathbb{R}^+ \to [0,\infty)$
 is a Borel-measurable function such that
$h(t)\colon \mathbb{R}^+ \to [0,\infty)$
 is a Borel-measurable function such that 
 $h(t) \leq \tilde{m}(t)$
, then there exists a non-homogeneous alternating renewal process on
$h(t) \leq \tilde{m}(t)$
, then there exists a non-homogeneous alternating renewal process on 
 $\{ 0, 1 \} \times \mathbb{R}^+$
 with
$\{ 0, 1 \} \times \mathbb{R}^+$
 with 
 $G_{10} = G_Y$
 and renewal density h.
$G_{10} = G_Y$
 and renewal density h.
Proof. As 
 $0 \leq {h(t)} / {\tilde{m}(t)} \leq 1$
, we may use a time-dependent thinning approach with retention probability
$0 \leq {h(t)} / {\tilde{m}(t)} \leq 1$
, we may use a time-dependent thinning approach with retention probability 
 $p(t) = {h(t)} / {\tilde{m}(t)}$
. Algorithmically, the sought-after process can be constructed as follows. Initialise
$p(t) = {h(t)} / {\tilde{m}(t)}$
. Algorithmically, the sought-after process can be constructed as follows. Initialise 
 $\widehat{S}_0 = 1$
,
$\widehat{S}_0 = 1$
, 
 $\widehat{X}_0 = 0$
, and
$\widehat{X}_0 = 0$
, and 
 $\widehat{X}_1= X_1$
. Also, set
$\widehat{X}_1= X_1$
. Also, set 
 $\widehat{S}_{2i} = 1$
,
$\widehat{S}_{2i} = 1$
, 
 $\widehat{S}_{2i-1} = 0$
 for
$\widehat{S}_{2i-1} = 0$
 for 
 $i\in{\mathbb{N}}$
, and
$i\in{\mathbb{N}}$
, and 
 $j=1$
. For each jump time
$j=1$
. For each jump time 
 $X_i$
,
$X_i$
, 
 $i=1, 2, \ldots$
,
$i=1, 2, \ldots$
,
- 
• with probability  $p( X_{i})$
, if j is even, update $p( X_{i})$
, if j is even, update $\widehat{X}_{j+1} = X_{i+1}$
 and increment j by 1; for odd j update $\widehat{X}_{j+1} = X_{i+1}$
 and increment j by 1; for odd j update $\widehat{X}_{j+1} = \widehat{X}_j$
, $\widehat{X}_{j+1} = \widehat{X}_j$
, $\widehat{X}_{j+2} = X_{i+1}$
, and increment j by 2; $\widehat{X}_{j+2} = X_{i+1}$
, and increment j by 2;
- 
• otherwise, if j is odd, update  $\widehat{X}_{j+1} = X_{i+1}$
 and increment j by 1; for even j update $\widehat{X}_{j+1} = X_{i+1}$
 and increment j by 1; for even j update $\widehat{X}_j = X_{i+1}$
 leaving j unchanged. $\widehat{X}_j = X_{i+1}$
 leaving j unchanged.
 Because complete cycles correspond to intervals in between accepted points 
 $X_i$
,
$X_i$
, 
 $i=0, 1,2,\ldots$
,
$i=0, 1,2,\ldots$
, 
 \begin{align*}H(t) = \sum_{n=1}^\infty {\mathbb{P}}( \widehat{X}_{2n} \leq t ) =\sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t, X_n\,{\mathrm{retained}}) =\int_0^t \frac{h(s)}{\tilde{m}(s)}\,{\mathrm{d}} \tilde M(s).\end{align*}
\begin{align*}H(t) = \sum_{n=1}^\infty {\mathbb{P}}( \widehat{X}_{2n} \leq t ) =\sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t, X_n\,{\mathrm{retained}}) =\int_0^t \frac{h(s)}{\tilde{m}(s)}\,{\mathrm{d}} \tilde M(s).\end{align*}
We can therefore conclude that the intensity of the thinned process is 
 $[{h(t)}/{\tilde{m}(t)}] \tilde{m}(t) = h(t)$
 (see, e.g., [Reference Daley and Vere-Jones7, pp. 78–79]) and hence
$[{h(t)}/{\tilde{m}(t)}] \tilde{m}(t) = h(t)$
 (see, e.g., [Reference Daley and Vere-Jones7, pp. 78–79]) and hence 
 $(\widehat S_n, \widehat X_n)_{n=1}^\infty$
 is a non-homogeneous alternating renewal process that satisfies the proposed conditions.
$(\widehat S_n, \widehat X_n)_{n=1}^\infty$
 is a non-homogeneous alternating renewal process that satisfies the proposed conditions.
As an illustration, suppose that m is a step function,
 \begin{align} m(t) = \sum_{j=1}^{J} \delta_j\mathbf{1}_{A_j}(t), \quad t\in \mathbb{R}^+, \end{align}
\begin{align} m(t) = \sum_{j=1}^{J} \delta_j\mathbf{1}_{A_j}(t), \quad t\in \mathbb{R}^+, \end{align}
that takes J different values 
 $\delta_j \gt 0$
 on bounded Borel sets
$\delta_j \gt 0$
 on bounded Borel sets 
 $A_j$
 (
$A_j$
 (
 $j=1,\ldots, J$
, with
$j=1,\ldots, J$
, with 
 $J\in{\mathbb{N}}$
) and 0 elsewhere. The following corollary lays out conditions under which
$J\in{\mathbb{N}}$
) and 0 elsewhere. The following corollary lays out conditions under which 
 $h=m$
 is the renewal density of an alternating renewal process whose Y-phases are governed by (4.2).
$h=m$
 is the renewal density of an alternating renewal process whose Y-phases are governed by (4.2).
Corollary 4.1. (Bound for proposed form of m.) A sufficient condition for (4.3) to be the renewal density of a non-homogeneous alternating renewal process on 
 $\{0,1\} \times{\mathbb{R}}^+$
 with
$\{0,1\} \times{\mathbb{R}}^+$
 with 
 $G_Y$
 given by (4.2) on
$G_Y$
 given by (4.2) on 
 ${\mathbb{R}}^+$
 is that for all
${\mathbb{R}}^+$
 is that for all 
 $j=1, \ldots, J$
 we have
$j=1, \ldots, J$
 we have 
 $\delta_j \leq \alpha(b-1)$
.
$\delta_j \leq \alpha(b-1)$
.
Proof. For (4.3) to induce a non-homogeneous alternating renewal process, we require 
 $h(t) \leq \tilde{m}(t)$
, where
$h(t) \leq \tilde{m}(t)$
, where 
 $\tilde{m}(t)$
 is the Radon–Nikodym derivative of the renewal function
$\tilde{m}(t)$
 is the Radon–Nikodym derivative of the renewal function 
 $\tilde{M}(t)$
. In Proposition 4.1 we defined
$\tilde{M}(t)$
. In Proposition 4.1 we defined 
 $\tilde M(t) = \sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t )$
, with
$\tilde M(t) = \sum_{n=1}^\infty {\mathbb{P}}( X_n \leq t )$
, with 
 $(X_n)_{n=0}^{\infty}$
 being its associated jump process of only Y-phases. By construction, its conditional intensity is
$(X_n)_{n=0}^{\infty}$
 being its associated jump process of only Y-phases. By construction, its conditional intensity is 
 $\tilde{\lambda}_{n+1}(t;\ t_1, \ldots, t_n) =\lambda(t_n;\ \alpha)$
 for all
$\tilde{\lambda}_{n+1}(t;\ t_1, \ldots, t_n) =\lambda(t_n;\ \alpha)$
 for all 
 $0 \leq t_1\leq \cdots \leq t_n \leq t$
.
$0 \leq t_1\leq \cdots \leq t_n \leq t$
.
 Observe that 
 $\inf \{ \lambda(t;\ \alpha) \colon t\in {\mathbb{R}} \} =\alpha(b-1)$
. Construct a Poisson process
$\inf \{ \lambda(t;\ \alpha) \colon t\in {\mathbb{R}} \} =\alpha(b-1)$
. Construct a Poisson process 
 $N^*(t)$
 with intensity
$N^*(t)$
 with intensity 
 $\nu = \alpha(b-1)$
. By [Reference Haezendonck and De Vylder10, Corollary 1], as
$\nu = \alpha(b-1)$
. By [Reference Haezendonck and De Vylder10, Corollary 1], as 
 $\lambda_{n+1}^*(t;\ t_1, \ldots, t_n) \leq\tilde{\lambda}_{n+1}(t;\ t_1, \ldots, t_n)$
, we may conclude that the renewal function
$\lambda_{n+1}^*(t;\ t_1, \ldots, t_n) \leq\tilde{\lambda}_{n+1}(t;\ t_1, \ldots, t_n)$
, we may conclude that the renewal function 
 $\nu t$
 of
$\nu t$
 of 
 $N^*(t)$
 is bounded from above by
$N^*(t)$
 is bounded from above by 
 $ \tilde{M}(t)$
 for all t. Hence also
$ \tilde{M}(t)$
 for all t. Hence also 
 $\nu \leq \tilde m(t)$
.
$\nu \leq \tilde m(t)$
.
 For 
 $h=m$
 as in (4.3), in order to have
$h=m$
 as in (4.3), in order to have 
 $\sum_{j=1}^{J} \delta_j\mathbf{1}_{A_j}(t) \leq \alpha(b-1)$
, it is sufficient that
$\sum_{j=1}^{J} \delta_j\mathbf{1}_{A_j}(t) \leq \alpha(b-1)$
, it is sufficient that 
 $\delta_j \leq \alpha(b-1)$
 for all
$\delta_j \leq \alpha(b-1)$
 for all 
 $j = 1, \ldots, J$
 to guarantee that m(t) is the renewal density of a non-homogeneous alternating renewal process.
$j = 1, \ldots, J$
 to guarantee that m(t) is the renewal density of a non-homogeneous alternating renewal process.
 As noted before, in practice, the starting point 0 is moved back to 
 $-\infty$
. Realisations
$-\infty$
. Realisations 
 $\mathbf{u}$
 from the specified model may be obtained as follows. First, a set of points
$\mathbf{u}$
 from the specified model may be obtained as follows. First, a set of points 
 $\mathbf{x} \subset \mathcal{X}$
 in time is chosen according to the probability density function
$\mathbf{x} \subset \mathcal{X}$
 in time is chosen according to the probability density function 
 $p({\cdot})$
 by, for example, coupling from the past [Reference Kendall15] or the Metropolis–Hastings algorithm [Reference Geyer9]. Next, for each point
$p({\cdot})$
 by, for example, coupling from the past [Reference Kendall15] or the Metropolis–Hastings algorithm [Reference Geyer9]. Next, for each point 
 $x\in\mathbf{x}$
, it is determined whether or not it is an atom based on
$x\in\mathbf{x}$
, it is determined whether or not it is an atom based on 
 $w_x$
, see (2.5). If this is not the case, we appeal to Proposition 2.5 and use rejection sampling with a proposal distribution that simulates a uniformly distributed point in
$w_x$
, see (2.5). If this is not the case, we appeal to Proposition 2.5 and use rejection sampling with a proposal distribution that simulates a uniformly distributed point in 
 $A_j \cap ({-}\infty, x]$
 chosen with probability
$A_j \cap ({-}\infty, x]$
 chosen with probability 
 $\delta_j \ell(A_j\cap({-}\infty,x]) / \sum_{i =1}^J \delta_i \ell(A_i \cap({-}\infty, x])$
 and acceptance probability
$\delta_j \ell(A_j\cap({-}\infty,x]) / \sum_{i =1}^J \delta_i \ell(A_i \cap({-}\infty, x])$
 and acceptance probability 
 $\exp{[}{-}\lambda(a;\ \alpha) (x - a) ]$
. The result is a sample a from
$\exp{[}{-}\lambda(a;\ \alpha) (x - a) ]$
. The result is a sample a from 
 $f_x(a)$
, cf. (2.7). The length is then sampled according to an exponential distribution with parameter
$f_x(a)$
, cf. (2.7). The length is then sampled according to an exponential distribution with parameter 
 $\lambda(a; \alpha)$
 shifted by
$\lambda(a; \alpha)$
 shifted by 
 $x-a$
 (see (2.8)). It is interesting to observe that, in contrast to the alternating renewal case studied in [Reference Van Lieshout and Markwitz25], using the marginal distribution with respect to A and then the conditional given A is computationally simpler than sampling L first.
$x-a$
 (see (2.8)). It is interesting to observe that, in contrast to the alternating renewal case studied in [Reference Van Lieshout and Markwitz25], using the marginal distribution with respect to A and then the conditional given A is computationally simpler than sampling L first.
4.3. Statistical aspects
 In practical applications, both the family of probability density functions 
 $g_Y(t, \tau;\ \theta)$
 for the sojourn times in phase Y and the function
$g_Y(t, \tau;\ \theta)$
 for the sojourn times in phase Y and the function 
 $m(t;\ \xi)$
 rely on unknown parameters
$m(t;\ \xi)$
 rely on unknown parameters 
 $\eta = (\theta, \xi)$
 that must be estimated. The log-likelihood
$\eta = (\theta, \xi)$
 that must be estimated. The log-likelihood 
 $L(\eta;\ \mathbf{u})$
 follows directly from (3.3). Upon observing
$L(\eta;\ \mathbf{u})$
 follows directly from (3.3). Upon observing 
 $\mathbf{u} = \{ (a_1, 0), \ldots, (a_m, 0),(a_{m+1}, l_{m+1}),\ldots, (a_n, l_n) \}$
,
$\mathbf{u} = \{ (a_1, 0), \ldots, (a_m, 0),(a_{m+1}, l_{m+1}),\ldots, (a_n, l_n) \}$
, 
 \begin{equation} L(\eta;\ \mathbf{u}) = \sum_{i=1}^{m} \log\bigg(1 \!-\! \int_{-\infty}^{a_i}[1\!-\!G_Y(s,a_i-s;\ \theta)]m(s;\ \xi)\,{\mathrm{d}} s\bigg)\! +\! \sum_{i=m+1}^n \log(m(a_i;\ \xi)g_Y(a_i,l_i;\ \theta)).\end{equation}
\begin{equation} L(\eta;\ \mathbf{u}) = \sum_{i=1}^{m} \log\bigg(1 \!-\! \int_{-\infty}^{a_i}[1\!-\!G_Y(s,a_i-s;\ \theta)]m(s;\ \xi)\,{\mathrm{d}} s\bigg)\! +\! \sum_{i=m+1}^n \log(m(a_i;\ \xi)g_Y(a_i,l_i;\ \theta)).\end{equation}
When the semi-Markov kernels 
 $G_Y$
 and
$G_Y$
 and 
 $G_Z$
, as well as the renewal density
$G_Z$
, as well as the renewal density 
 $m \equiv ( \mathbb{E} Y +\mathbb{E} Z )^{-1}$
, do not vary in time, (4.4) reduces to the renewal likelihood in [Reference Van Lieshout and Markwitz25].
$m \equiv ( \mathbb{E} Y +\mathbb{E} Z )^{-1}$
, do not vary in time, (4.4) reduces to the renewal likelihood in [Reference Van Lieshout and Markwitz25].
 We illustrate the procedure by means of a specific example. For the sojourn times, we take an exponential model; for the function m, we use (4.3). Assume that 
 $G_Y(t, \cdot)$
,
$G_Y(t, \cdot)$
, 
 $t\in \mathbb{R}$
, is the cumulative distribution function of an exponential distribution with rate parameter
$t\in \mathbb{R}$
, is the cumulative distribution function of an exponential distribution with rate parameter 
 $\lambda(t)$
 as in (4.2), so that
$\lambda(t)$
 as in (4.2), so that 
 $G_Y(t, l;\ \lambda(t)) = 1 - {\mathrm{e}}^{-l \lambda(t)}$
. In the homogeneous case where
$G_Y(t, l;\ \lambda(t)) = 1 - {\mathrm{e}}^{-l \lambda(t)}$
. In the homogeneous case where 
 $\lambda(t) \equiv \alpha \gt 0$
 (that is,
$\lambda(t) \equiv \alpha \gt 0$
 (that is, 
 $b=1$
 and
$b=1$
 and 
 $c=0$
),
$c=0$
), 
 $J = 1$
,
$J = 1$
, 
 $A_1 = \mathbb{R}$
, and
$A_1 = \mathbb{R}$
, and 
 $0 \leq \delta_1 \leq \alpha$
, we obtain
$0 \leq \delta_1 \leq \alpha$
, we obtain 
 \begin{equation} \int_{-\infty}^{t}[1 - G_Y(s, t-s)]m(s)\,{\mathrm{d}} s = \int_{-\infty}^{t}{\mathrm{e}}^{-\alpha(t-s)}\,{\mathrm{d}} s = \frac{1}{\alpha} = \mathbb{E}[Y],\end{equation}
\begin{equation} \int_{-\infty}^{t}[1 - G_Y(s, t-s)]m(s)\,{\mathrm{d}} s = \int_{-\infty}^{t}{\mathrm{e}}^{-\alpha(t-s)}\,{\mathrm{d}} s = \frac{1}{\alpha} = \mathbb{E}[Y],\end{equation}
where 
 $\mathbb{E}[Y]$
 refers to the expected length of a Y-phase in a homogeneous alternating renewal process. Using this result, we obtain
$\mathbb{E}[Y]$
 refers to the expected length of a Y-phase in a homogeneous alternating renewal process. Using this result, we obtain 
 $f_Y(l;\ \alpha) = g_Y(t,l;\ \lambda(t))$
, where
$f_Y(l;\ \alpha) = g_Y(t,l;\ \lambda(t))$
, where 
 $f_Y$
 is the probability density function of the length of such a Y-phase (see, e.g., [Reference Van Lieshout and Markwitz25, Theorem 2.2]). Following through, we additionally obtain
$f_Y$
 is the probability density function of the length of such a Y-phase (see, e.g., [Reference Van Lieshout and Markwitz25, Theorem 2.2]). Following through, we additionally obtain 
 \begin{equation*} L(\alpha,\delta_1;\ \mathbf{u}) = m\log\bigg(1-\frac{\delta_1}{\alpha}\bigg) + (n-m)\log\delta_1 + (n-m)\log\alpha - \alpha \sum_{i=m+1}^n l_i,\end{equation*}
\begin{equation*} L(\alpha,\delta_1;\ \mathbf{u}) = m\log\bigg(1-\frac{\delta_1}{\alpha}\bigg) + (n-m)\log\delta_1 + (n-m)\log\alpha - \alpha \sum_{i=m+1}^n l_i,\end{equation*}
which corresponds exactly to the simplified version of [Reference Van Lieshout and Markwitz25, (7)], the log-likelihood in the homogeneous case.
 Returning to the non-homogeneous case, the atom probability for a given time 
 $x \in \mathcal{X}$
 is
$x \in \mathcal{X}$
 is 
 \begin{equation*} w_x = 1 - \sum_{j=1}^{J}\delta_j\int_{A_j \cap ({-}\infty, x]}{\mathrm{e}}^{-(x-s)\lambda(s)}\,{\mathrm{d}} s.\end{equation*}
\begin{equation*} w_x = 1 - \sum_{j=1}^{J}\delta_j\int_{A_j \cap ({-}\infty, x]}{\mathrm{e}}^{-(x-s)\lambda(s)}\,{\mathrm{d}} s.\end{equation*}
The likelihood equation (4.4), after substitution and discarding of terms that do not depend on the parameters, becomes
 \begin{align*} L(\delta,\alpha;\ \mathbf{u}) & = \sum_{i=1}^{m}\log\Bigg(1-\sum_{j=1}^{J}\delta_j\int_{(a_i-A_j)\cap[0,\infty)}{\mathrm{e}}^{-\alpha r(b+\sin(ca_i-cr))}\,{\mathrm{d}} r\Bigg) \\ & \quad + \sum_{i=m+1}^n\log\Bigg(\sum_{j=1}^{J}\delta_j\mathbf{1}_{A_j}(a_i)\Bigg) + (n-m)\log\alpha-\alpha\sum_{i=m+1}^{n}l_i(b + \sin(ca_i)).\end{align*}
\begin{align*} L(\delta,\alpha;\ \mathbf{u}) & = \sum_{i=1}^{m}\log\Bigg(1-\sum_{j=1}^{J}\delta_j\int_{(a_i-A_j)\cap[0,\infty)}{\mathrm{e}}^{-\alpha r(b+\sin(ca_i-cr))}\,{\mathrm{d}} r\Bigg) \\ & \quad + \sum_{i=m+1}^n\log\Bigg(\sum_{j=1}^{J}\delta_j\mathbf{1}_{A_j}(a_i)\Bigg) + (n-m)\log\alpha-\alpha\sum_{i=m+1}^{n}l_i(b + \sin(ca_i)).\end{align*}
The resulting equations can be solved numerically to find optimal values for 
 $\delta_k$
,
$\delta_k$
, 
 $k = 1, \ldots, J$
, and
$k = 1, \ldots, J$
, and 
 $\alpha$
 under the inequality constraints
$\alpha$
 under the inequality constraints 
 $0 \leq \delta_j \leq \alpha(b-1)$
,
$0 \leq \delta_j \leq \alpha(b-1)$
, 
 $j=1, \ldots, J$
.
$j=1, \ldots, J$
.
 The distribution of unobserved occurrence times may also be considered as a parameter to be estimated using the reported intervals. To do so, since the form of the conditional distribution of W given U according to Theorem 3.1 is identical to that for alternating renewal process-based censoring, the simulation techniques developed in [Reference Van Lieshout and Markwitz25] to obtain realisations of the marked occurrence times given a sample 
 $\mathbf{u}$
 of U apply. A brief overview is provided below.
$\mathbf{u}$
 of U apply. A brief overview is provided below.
 Let 
 $\zeta$
 be the parameter vector corresponding to the area-interaction process parameters
$\zeta$
 be the parameter vector corresponding to the area-interaction process parameters 
 $(\beta({\cdot}), \gamma)$
. Write
$(\beta({\cdot}), \gamma)$
. Write 
 $p(\mathbf{x)}$
, the probability density from (4.1), as
$p(\mathbf{x)}$
, the probability density from (4.1), as 
 $p(\mathbf{x};\ \zeta) = c(\zeta)h(\mathbf{x};\ \zeta)$
, where c refers to the normalisation constant as before and h is the unnormalised density. Reference parameter values
$p(\mathbf{x};\ \zeta) = c(\zeta)h(\mathbf{x};\ \zeta)$
, where c refers to the normalisation constant as before and h is the unnormalised density. Reference parameter values 
 $\zeta_0$
 are picked using a Monte Carlo EM approach [Reference Geyer8], and the likelihood given the interval set
$\zeta_0$
 are picked using a Monte Carlo EM approach [Reference Geyer8], and the likelihood given the interval set 
 $U = {\mathbf{u}}$
, for N samples, is estimated by
$U = {\mathbf{u}}$
, for N samples, is estimated by 
 \begin{equation*} l_N(\zeta) = \log\Bigg(\frac{1}{N}\sum_{i=1}^N\frac{h(X_{{\mathbf{u}},i};\ \zeta)}{h(X_{{\mathbf{u}},i};\ \zeta_0)}\Bigg) - \log\Bigg(\frac{1}{N}\sum_{i=1}^N\frac{h(X_{i};\ \zeta)}{h(X_{i};\ \zeta_0)}\Bigg),\end{equation*}
\begin{equation*} l_N(\zeta) = \log\Bigg(\frac{1}{N}\sum_{i=1}^N\frac{h(X_{{\mathbf{u}},i};\ \zeta)}{h(X_{{\mathbf{u}},i};\ \zeta_0)}\Bigg) - \log\Bigg(\frac{1}{N}\sum_{i=1}^N\frac{h(X_{i};\ \zeta)}{h(X_{i};\ \zeta_0)}\Bigg),\end{equation*}
where 
 $X_i$
 are samples from the area-interaction process, and
$X_i$
 are samples from the area-interaction process, and 
 $X_{\mathbf{u}, i}$
 are samples from the conditional distribution of occurrence times given
$X_{\mathbf{u}, i}$
 are samples from the conditional distribution of occurrence times given 
 $U = {\mathbf{u}}$
 [Reference Van Lieshout and Markwitz25]. The likelihood is then maximised to find the optimal value(s) of
$U = {\mathbf{u}}$
 [Reference Van Lieshout and Markwitz25]. The likelihood is then maximised to find the optimal value(s) of 
 $\zeta$
 [Reference Geyer8].
$\zeta$
 [Reference Geyer8].
Once the area-interaction parameters have been estimated, a Metropolis–Hastings algorithm [Reference Brooks, Gelman, Jones and Meng5, Reference Meyn and Tweedie18, Reference Møller and Waagepetersen19] for a fixed number of points can be used to determine the most likely location of the point within each interval, given the parameter values. For further details and conditions under which the algorithm converges to the desired distribution, we refer to [Reference Van Lieshout and Markwitz25, Propositions 4.3–4.5].
5. Illustrations in practice
 To show how the non-homogeneous model behaves, we present a few examples that show the difference in behaviour between a homogeneous model and a non-homogeneous model. Recall that, broadly speaking, there are three sources of inhomogeneity: the interval lengths as governed by 
 $g_Y$
, the renewal density m, and the ground process responsible for the uncensored event occurrences. Throughout this section, we set
$g_Y$
, the renewal density m, and the ground process responsible for the uncensored event occurrences. Throughout this section, we set 
 ${\mathcal{X}} = (0,1)$
. In general, we see that the homogeneous model, due to the stationarity assumption, is not able to encapsulate the complex behaviour that might be present in real-life applications, whereas the non-homogeneous model has the potential to.
${\mathcal{X}} = (0,1)$
. In general, we see that the homogeneous model, due to the stationarity assumption, is not able to encapsulate the complex behaviour that might be present in real-life applications, whereas the non-homogeneous model has the potential to.
5.1. Model misspecification
 The first source of inhomogeneity in our model is the semi-Markov kernel 
 $G_Y(a, \cdot)$
 which determines the time until the next jump for starting point
$G_Y(a, \cdot)$
 which determines the time until the next jump for starting point 
 $a\in \mathbb{R}$
. In the homogeneous case, as in (4.5), these may be assumed to be independent and identically Gamma or Weibull distributed [Reference Van Lieshout and Markwitz25]. In the non-homogeneous case, the density function is dependent on a.
$a\in \mathbb{R}$
. In the homogeneous case, as in (4.5), these may be assumed to be independent and identically Gamma or Weibull distributed [Reference Van Lieshout and Markwitz25]. In the non-homogeneous case, the density function is dependent on a.
 We have devised the following experiment to illustrate the effect of using a homogeneous model erroneously. We generate a realisation of intervals using the non-homogeneous alternating renewal process model, where the interval censoring mechanism is governed by a Weibull distribution with shape parameter 
 $k=1$
 and rate parameter
$k=1$
 and rate parameter 
 $\lambda_Y(t;\ \alpha) = \alpha(1.6+\sin(2\pi t))$
 (see (4.2)) for
$\lambda_Y(t;\ \alpha) = \alpha(1.6+\sin(2\pi t))$
 (see (4.2)) for 
 $\alpha = 1$
. Regarding the other model ingredients, we assume that
$\alpha = 1$
. Regarding the other model ingredients, we assume that 
 $p({\cdot})$
 is of the form given in (4.1) with
$p({\cdot})$
 is of the form given in (4.1) with 
 $\beta = 400$
 and
$\beta = 400$
 and 
 $\gamma = 1$
, i.e. a homogeneous Poisson process on
$\gamma = 1$
, i.e. a homogeneous Poisson process on 
 $\mathcal{X}$
 with intensity 400. We additionally set
$\mathcal{X}$
 with intensity 400. We additionally set 
 $m(t) = 0.6\, \mathbf{1}_{ [-0.2, 1) }(t)$
.
$m(t) = 0.6\, \mathbf{1}_{ [-0.2, 1) }(t)$
.
 After generating these intervals using the non-homogeneous model, we now assume, wrongly, that these intervals were instead generated by a homogeneous interval censoring scheme. Specifically, we fit a Weibull distribution with parameters 
 $k \gt 0$
 and constant rate
$k \gt 0$
 and constant rate 
 $\lambda_Y(t;\ \alpha) \equiv \alpha$
 for
$\lambda_Y(t;\ \alpha) \equiv \alpha$
 for 
 $\alpha \gt 0$
 using maximum likelihood estimation (see, e.g., [Reference Van Lieshout and Markwitz25, Section 6.1.1]) and we obtain parameter estimates
$\alpha \gt 0$
 using maximum likelihood estimation (see, e.g., [Reference Van Lieshout and Markwitz25, Section 6.1.1]) and we obtain parameter estimates 
 $\widehat{k} = 0.9$
 and
$\widehat{k} = 0.9$
 and 
 $\widehat{\alpha} = 2.0$
.
$\widehat{\alpha} = 2.0$
.
 We plot the survival time densities for both sets of parameters. We must choose an exact point in time to obtain the Weibull parameters for the non-homogeneous model, so we assume that we are looking at time 
 $t=0.6$
. This plot, for both models, can be seen in Figure 2. Compared to the actual model, the homogeneous model is able to roughly discern the shape of the distribution, but struggles with the scale. A homogeneous model would have generated more intervals shorter than about
$t=0.6$
. This plot, for both models, can be seen in Figure 2. Compared to the actual model, the homogeneous model is able to roughly discern the shape of the distribution, but struggles with the scale. A homogeneous model would have generated more intervals shorter than about 
 $0.5$
, and fewer of longer length.
$0.5$
, and fewer of longer length.

Figure 2. The unbroken line corresponds to the actual probability density of interval length for 
 $k=1$
 and
$k=1$
 and 
 $\lambda(0.6;\ 1) = 1$
. The dotted line corresponds to the estimated survival time density.
$\lambda(0.6;\ 1) = 1$
. The dotted line corresponds to the estimated survival time density.
5.2. Inhomogeneity in renewal density and survival time
 In our second experiment, we add inhomogeneity in m to the the model and study the effect on 
 $f_x$
, cf. (2.7). As in Section 5.1, consider an exponential semi-Markov kernel density
$f_x$
, cf. (2.7). As in Section 5.1, consider an exponential semi-Markov kernel density 
 $g_Y$
 with rate parameter either constant,
$g_Y$
 with rate parameter either constant, 
 $\lambda(t;\ \alpha) = 1.3\alpha$
, or varying in time according to
$\lambda(t;\ \alpha) = 1.3\alpha$
, or varying in time according to 
 $\lambda(t;\ \alpha) = \alpha(1.3 + \sin(2\pi t))$
. Furthermore, set
$\lambda(t;\ \alpha) = \alpha(1.3 + \sin(2\pi t))$
. Furthermore, set 
 $m(t) = 0.4$
 for
$m(t) = 0.4$
 for 
 $t \in [{-}0.2, 1)$
 in the constant case, and
$t \in [{-}0.2, 1)$
 in the constant case, and 
 \begin{align*}m(t) = \begin{cases} 0.4 & t \in [{-}0.2, 0.4) \\ 0.1 & t \in [0.4, 1) \end{cases}\end{align*}
\begin{align*}m(t) = \begin{cases} 0.4 & t \in [{-}0.2, 0.4) \\ 0.1 & t \in [0.4, 1) \end{cases}\end{align*}
in the time-varying case. We set 
 $\alpha = 1.6$
, so the largest value of
$\alpha = 1.6$
, so the largest value of 
 $\delta_i$
 which guarantees that
$\delta_i$
 which guarantees that 
 $g_Y$
 is the Radon-Nikodym derivative of a semi-Markov kernel is
$g_Y$
 is the Radon-Nikodym derivative of a semi-Markov kernel is 
 $1.6 \times ( 1.3 - 1 ) = 0.48$
. Figure 3 shows the graphs of
$1.6 \times ( 1.3 - 1 ) = 0.48$
. Figure 3 shows the graphs of 
 $f_x({\cdot})$
 for the four possible combinations of
$f_x({\cdot})$
 for the four possible combinations of 
 $g_Y$
 and m obtained from 200,000 samples from
$g_Y$
 and m obtained from 200,000 samples from 
 $q_x$
 for
$q_x$
 for 
 $x=1$
. In Figures 3a and 3b, we assume that
$x=1$
. In Figures 3a and 3b, we assume that 
 $\lambda$
 is constant. When m is also constant as in Figure 3a, the marginal distribution of the starting times, by Proposition 2.5, is a shifted exponential distribution. If m is allowed to vary in time, the exponential curve is broken at
$\lambda$
 is constant. When m is also constant as in Figure 3a, the marginal distribution of the starting times, by Proposition 2.5, is a shifted exponential distribution. If m is allowed to vary in time, the exponential curve is broken at 
 $t=0.4$
, the discontinuity point of m, resulting in a zigzag pattern. In both Figures 3a and 3c, m is constant, but in Figure 3c the rate parameter of
$t=0.4$
, the discontinuity point of m, resulting in a zigzag pattern. In both Figures 3a and 3c, m is constant, but in Figure 3c the rate parameter of 
 $g_Y$
 varies according to a harmonic. The resulting sinusoidal modulation is clearly visible. Finally, allowing m to vary too results in a break at its discontinuity point
$g_Y$
 varies according to a harmonic. The resulting sinusoidal modulation is clearly visible. Finally, allowing m to vary too results in a break at its discontinuity point 
 $t=0.4$
 as seen in Figure 3d. We conclude that the non-homogeneous model is able to capture various types of inhomogeneity well and is far more flexible than the homogeneous one.
$t=0.4$
 as seen in Figure 3d. We conclude that the non-homogeneous model is able to capture various types of inhomogeneity well and is far more flexible than the homogeneous one.

Figure 3. Probability density function of the starting time 
 $f_x({\cdot})$
 with
$f_x({\cdot})$
 with 
 $x=1$
 for various choices of
$x=1$
 for various choices of 
 $g_Y$
 and m.
$g_Y$
 and m.
5.3. Inhomogeneity in occurrence time distribution
 In the previous subsections, we have assumed that the first-order interaction function 
 $\beta$
 of the point process X of occurrence times remains constant over the entire sampling window (0, 1). In our final example, we relax this assumption in that we consider a ‘peak time’ in which events are more likely to occur and investigate the effect on the conditional distribution of occurrences. More precisely, we take an area-interaction model (4.1) with
$\beta$
 of the point process X of occurrence times remains constant over the entire sampling window (0, 1). In our final example, we relax this assumption in that we consider a ‘peak time’ in which events are more likely to occur and investigate the effect on the conditional distribution of occurrences. More precisely, we take an area-interaction model (4.1) with 
 \begin{align*}\beta(y) = \begin{cases} 3, & y \in [0, c_1), \\ 5, & y \in [c_1, c_2), \\ 3, & y \in [c_2, 1], \end{cases} \end{align*}
\begin{align*}\beta(y) = \begin{cases} 3, & y \in [0, c_1), \\ 5, & y \in [c_1, c_2), \\ 3, & y \in [c_2, 1], \end{cases} \end{align*}
and critical range 
 $[c_1, c_2) = [0.81, 0.85)$
. The radius of interaction is set to
$[c_1, c_2) = [0.81, 0.85)$
. The radius of interaction is set to 
 $r=0.1$
 and we consider both a regular
$r=0.1$
 and we consider both a regular 
 $(\eta = -1.2)$
 and a clustered
$(\eta = -1.2)$
 and a clustered 
 $(\eta = 1.2)$
 model.
$(\eta = 1.2)$
 model.
 As in [Reference Van Lieshout and Markwitz25], consider the set 
 $\mathbf{u} = \{(0.45, 0.4),(0.51, 0), (0.58, 0)\}$
 that contains one non-degenerate interval. Recall that the entries are parametrised as (a, l), where a is the starting point and l is the length. Figure 4 plots the conditional distribution of the occurrence time on the interval
$\mathbf{u} = \{(0.45, 0.4),(0.51, 0), (0.58, 0)\}$
 that contains one non-degenerate interval. Recall that the entries are parametrised as (a, l), where a is the starting point and l is the length. Figure 4 plots the conditional distribution of the occurrence time on the interval 
 $[0.45, 0.85]$
 given
$[0.45, 0.85]$
 given 
 ${\mathbf{u}}$
 for the regular and clustered models. To create this figure, a Metropolis–Hastings algorithm [Reference Van Lieshout and Markwitz25, Algorithm 4.2] was run for 600 000 time steps, with the first 100 000 iterations thrown out due to burn-in.
${\mathbf{u}}$
 for the regular and clustered models. To create this figure, a Metropolis–Hastings algorithm [Reference Van Lieshout and Markwitz25, Algorithm 4.2] was run for 600 000 time steps, with the first 100 000 iterations thrown out due to burn-in.

Figure 4. A comparison between a regular and clustered model with a ‘peak time’ added by changing the intensity function within a critical range.
 The general shape of the graphs is similar to the corresponding plots for constant 
 $\beta = 3$
 in [Reference Van Lieshout and Markwitz25, Figures 2 and 3]. For the clustered model, the occurrence time is more likely to happen close to the atoms; for regular models the probability density is shifted away from the atoms. In the non-homogeneous case, the higher value of
$\beta = 3$
 in [Reference Van Lieshout and Markwitz25, Figures 2 and 3]. For the clustered model, the occurrence time is more likely to happen close to the atoms; for regular models the probability density is shifted away from the atoms. In the non-homogeneous case, the higher value of 
 $\beta$
 during the peak times causes a clear bump in the range
$\beta$
 during the peak times causes a clear bump in the range 
 $[c_1, c_2) =[0.81, 0.85)$
, demonstrating the ability of the non-homogeneous model to favour certain occurrence times over others.
$[c_1, c_2) =[0.81, 0.85)$
, demonstrating the ability of the non-homogeneous model to favour certain occurrence times over others.
The previous three experiments show the effects that the added non-homogeneity may have on the model. It is able to draw from a more complicated interval censoring scheme to generate intervals and provide a more realistic starting time distribution based on the semi-Markov kernel and Markov renewal function, while having the functionality of splitting the observation window into regions of differing likelihood, and deal with more complicated interaction structures in the underlying point process. For a concrete application of the non-homogeneous model to a crime dataset, we direct the reader to [Reference Markwitz17]. An application of the homogeneous model can be found in [Reference Van Lieshout and Markwitz25, Section 6].
6. Conclusion
We introduced a time-dependent interval censoring mechanism that splits time into observable and partially observable phases by means of a non-homogeneous alternating renewal process on the real line. The process was shown to be well-defined for a range of Gamma and Weibull semi-Markov kernels. We extended tools from renewal theory to derive families of time-dependent joint distributions of age and excess, which in turn characterise the probability distribution of censored intervals. We then constructed a model wherein a possibly non-homogeneous point process provides a mechanism to select points on the real line, which are independently marked by the intervals resulting from the censoring mechanism. For this model, a conditional distribution was posited and verified. The influence of the model components was demonstrated through parametrised examples. In future, we intend to apply this model to data on arson fires and to add a spatial component.
Appendix A. Proofs of propositions
A.1. Proposition 2.2
 
Proof of Proposition 2.2. The probability density and cumulative distribution functions of the Gamma distribution with shape and rate parameters k(x) and 
 $\lambda(x)$
 are, for
$\lambda(x)$
 are, for 
 $\tau \geq 0$
,
$\tau \geq 0$
, 
 \begin{equation*} g(x,\tau;\ k(x),\lambda(x)) = \frac{{\lambda(x)}^{k(x)}\tau^{k(x)-1}{\mathrm{e}}^{-\lambda(x)\tau}}{\Gamma(k(x))}, \qquad G(x,\tau;\ k(x),\lambda(x)) = \frac{\gamma(k(x),\lambda(x)\tau)}{\Gamma(k(x))},\end{equation*}
\begin{equation*} g(x,\tau;\ k(x),\lambda(x)) = \frac{{\lambda(x)}^{k(x)}\tau^{k(x)-1}{\mathrm{e}}^{-\lambda(x)\tau}}{\Gamma(k(x))}, \qquad G(x,\tau;\ k(x),\lambda(x)) = \frac{\gamma(k(x),\lambda(x)\tau)}{\Gamma(k(x))},\end{equation*}
writing 
 $\Gamma$
 for the gamma function and
$\Gamma$
 for the gamma function and 
 $\gamma$
 for the lower incomplete gamma function. The conditional intensity is, for
$\gamma$
 for the lower incomplete gamma function. The conditional intensity is, for 
 $n=0,1, \ldots$
 and
$n=0,1, \ldots$
 and 
 $0 = x_0 \leq x_1 \leq \ldots \leq x_n \leq x$
,
$0 = x_0 \leq x_1 \leq \ldots \leq x_n \leq x$
, 
 \begin{align*} \lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1,\ldots,x_n) & = \frac{g_T(x_n,x-x_n;\ k_T({x_n}),\lambda_T({x_n}))}{1-G_T(x_n,x-x_n;\ k_T({x_n}),\lambda_T({x_n}))} \\ & = \frac{\lambda_T({x_n})^{k_T({x_n})}(x-x_n)^{k_T({x_n})-1}{\mathrm{e}}^{-\lambda_T({x_n})(x-x_n)}} {\int_{\lambda_T({x_n})(x-x_n)}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u},\end{align*}
\begin{align*} \lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1,\ldots,x_n) & = \frac{g_T(x_n,x-x_n;\ k_T({x_n}),\lambda_T({x_n}))}{1-G_T(x_n,x-x_n;\ k_T({x_n}),\lambda_T({x_n}))} \\ & = \frac{\lambda_T({x_n})^{k_T({x_n})}(x-x_n)^{k_T({x_n})-1}{\mathrm{e}}^{-\lambda_T({x_n})(x-x_n)}} {\int_{\lambda_T({x_n})(x-x_n)}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u},\end{align*}
where 
 $g_T$
 is either
$g_T$
 is either 
 $g_Y$
 or
$g_Y$
 or 
 $g_Z$
. We examine the limiting behaviour as
$g_Z$
. We examine the limiting behaviour as 
 $x\to\infty$
 and observe that
$x\to\infty$
 and observe that 
 \begin{align*} \lim_{x \to \infty}g_T(x_n, x-x_n;\ k_T(x_n), \lambda_T(x_n)) = 0, \quad \lim_{x\to\infty}(1-G_T(x_n,x-x_n;\ k_T(x_n)\lambda_T(x_n))) = 0.\end{align*}
\begin{align*} \lim_{x \to \infty}g_T(x_n, x-x_n;\ k_T(x_n), \lambda_T(x_n)) = 0, \quad \lim_{x\to\infty}(1-G_T(x_n,x-x_n;\ k_T(x_n)\lambda_T(x_n))) = 0.\end{align*}
Noting that both are differentiable on 
 $(0,\infty)$
, by L’Hôpital’s rule,
$(0,\infty)$
, by L’Hôpital’s rule, 
 \begin{equation*} \lim_{x\to\infty}\lambda_{n+1}(x;\ x_n) = \lim_{x\to\infty}\frac{\lambda_T(x_n)(x-x_n)-(k_T(x_n)-1)}{x-x_n} = \lambda_T(x_n)\end{equation*}
\begin{equation*} \lim_{x\to\infty}\lambda_{n+1}(x;\ x_n) = \lim_{x\to\infty}\frac{\lambda_T(x_n)(x-x_n)-(k_T(x_n)-1)}{x-x_n} = \lambda_T(x_n)\end{equation*}
after simplifying.
 To prove monotonicity, we now show that 
 $\lambda_{n+1}(x;\ x_n)$
 is increasing in
$\lambda_{n+1}(x;\ x_n)$
 is increasing in 
 $x\geq x_n$
. In that case, we obtain the previous equation as the upper bound. Write
$x\geq x_n$
. In that case, we obtain the previous equation as the upper bound. Write 
 $t = \lambda_T(x_n) ( x - x_n)$
. Then
$t = \lambda_T(x_n) ( x - x_n)$
. Then 
 $\lambda_{n+1}(x;\ x_n)$
 can be written as
$\lambda_{n+1}(x;\ x_n)$
 can be written as 
 $\lambda_T(x_n) h(t)$
 for
$\lambda_T(x_n) h(t)$
 for 
 \begin{align*}h(t) = \frac{t^{k_T(x_n)-1} {\mathrm{e}}^{-t}}{ \int_t^\infty u^{k_T(x_n) - 1} {\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{align*}
\begin{align*}h(t) = \frac{t^{k_T(x_n)-1} {\mathrm{e}}^{-t}}{ \int_t^\infty u^{k_T(x_n) - 1} {\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{align*}
Therefore, it suffices to show that the function 
 $t \to \log h(t)$
 is non-decreasing in
$t \to \log h(t)$
 is non-decreasing in 
 $t \gt 0$
. Now,
$t \gt 0$
. Now, 
 \begin{align*} \frac{\partial}{\partial t} \log h(t) &= \frac{k_T(x_n) -1}{t} - 1 + \frac{t^{k_T(x_n) -1} {\mathrm{e}}^{-t}}{\int_{t} ^{\infty} u^{k_T({x_n}) - 1} {\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{align*}
\begin{align*} \frac{\partial}{\partial t} \log h(t) &= \frac{k_T(x_n) -1}{t} - 1 + \frac{t^{k_T(x_n) -1} {\mathrm{e}}^{-t}}{\int_{t} ^{\infty} u^{k_T({x_n}) - 1} {\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{align*}
If 
 $t \lt k_T(x_n) - 1$
, we see directly that the derivative is positive. Otherwise, use integration by parts to simplify the last term on the right-hand side to
$t \lt k_T(x_n) - 1$
, we see directly that the derivative is positive. Otherwise, use integration by parts to simplify the last term on the right-hand side to 
 \begin{equation*} \frac{1-\int_{t}^{\infty}[({k_T(x_n)-1})/{u}]u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}{\int_{t}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{equation*}
\begin{equation*} \frac{1-\int_{t}^{\infty}[({k_T(x_n)-1})/{u}]u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}{\int_{t}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}.\end{equation*}
Consequently,
 \begin{equation*} \frac{\partial}{\partial t}\log h(t) = \frac{\int_{t}^{\infty}\{[({k_T(x_n)-1})/{t}]-[({k_T(x_n)-1})/{u}]\}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u} {\int_{t}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}\end{equation*}
\begin{equation*} \frac{\partial}{\partial t}\log h(t) = \frac{\int_{t}^{\infty}\{[({k_T(x_n)-1})/{t}]-[({k_T(x_n)-1})/{u}]\}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u} {\int_{t}^{\infty}u^{k_T({x_n})-1}{\mathrm{e}}^{-u}\,{\mathrm{d}} u}\end{equation*}
is non-negative. We conclude that 
 $ \lambda_{n+1}(x;\ x_n) $
 is bounded by
$ \lambda_{n+1}(x;\ x_n) $
 is bounded by 
 $\lambda_T(x_n)$
 for all
$\lambda_T(x_n)$
 for all 
 $k_T(x_n) \geq 1$
.
$k_T(x_n) \geq 1$
.
 Recall our assumption that 
 $\sup_{x\in\mathbb{R}^+} \text{max}(\lambda_Y(x),\lambda_Z(x) ) \leq c$
. We may then construct a Poisson process
$\sup_{x\in\mathbb{R}^+} \text{max}(\lambda_Y(x),\lambda_Z(x) ) \leq c$
. We may then construct a Poisson process 
 $N^*$
 with intensity
$N^*$
 with intensity 
 $\lambda^*_{n+1}(x;\ x_n) = c$
. Clearly,
$\lambda^*_{n+1}(x;\ x_n) = c$
. Clearly, 
 $\lambda^*$
 satisfies the second condition of [10, Corollary 2]. Moreover,
$\lambda^*$
 satisfies the second condition of [10, Corollary 2]. Moreover, 
 $\lambda_{n+1} \leq c = \lambda^*_{n+1}$
 for every
$\lambda_{n+1} \leq c = \lambda^*_{n+1}$
 for every 
 $n \in \mathbb{N}_0$
. Since a Poisson process with constant intensity has probability zero to explode, we conclude that
$n \in \mathbb{N}_0$
. Since a Poisson process with constant intensity has probability zero to explode, we conclude that 
 $\mathbb{P}(X_\infty \lt \infty) = 0$
.
$\mathbb{P}(X_\infty \lt \infty) = 0$
.
A.2. Proposition 2.3
 
Proof of Proposition 2.3. Let 
 $G_T(x, \cdot)$
 and the associated parameter vector
$G_T(x, \cdot)$
 and the associated parameter vector 
 $(\lambda_T(x)$
,
$(\lambda_T(x)$
, 
 $k_T(x))$
 correspond to either Y- or Z-phase cases. The probability density and cumulative distribution functions of the Weibull distribution with shape and rate parameters k(x) and
$k_T(x))$
 correspond to either Y- or Z-phase cases. The probability density and cumulative distribution functions of the Weibull distribution with shape and rate parameters k(x) and 
 $\lambda(x)$
 are, for
$\lambda(x)$
 are, for 
 $\tau \geq 0$
,
$\tau \geq 0$
, 
 \begin{align*} g(x,\tau;\ k(x),\lambda({x})) & = k(x)\lambda(x)(\lambda({x})\tau)^{k(x)-1}{\mathrm{e}}^{-(\lambda({x})\tau)^{k(x)}}, \\ G(x,\tau;\ k(x),\lambda({x})) & = 1 - {\mathrm{e}}^{-(\lambda({x})\tau)^{k(x)}}. \end{align*}
\begin{align*} g(x,\tau;\ k(x),\lambda({x})) & = k(x)\lambda(x)(\lambda({x})\tau)^{k(x)-1}{\mathrm{e}}^{-(\lambda({x})\tau)^{k(x)}}, \\ G(x,\tau;\ k(x),\lambda({x})) & = 1 - {\mathrm{e}}^{-(\lambda({x})\tau)^{k(x)}}. \end{align*}
The conditional intensity is therefore, for 
 $n=0,1,\ldots$
 and
$n=0,1,\ldots$
 and 
 $0 = x_0 \leq x_1 \leq \ldots \leq x_n \leq x$
,
$0 = x_0 \leq x_1 \leq \ldots \leq x_n \leq x$
, 
 \begin{equation*} \lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1, \ldots, x_n) = k_T(x_n) \lambda_T(x_n)( \lambda_T(x_n) (x-x_n) )^{k_T(x_n)-1}. \end{equation*}
\begin{equation*} \lambda_{n+1}(x;\ x_n) = \lambda_{n+1}(x;\ x_1, \ldots, x_n) = k_T(x_n) \lambda_T(x_n)( \lambda_T(x_n) (x-x_n) )^{k_T(x_n)-1}. \end{equation*}
Since the conditional intensity is unbounded, we cannot use a Poisson process to bound 
 $\lambda_{n+1}$
. Instead, we turn to a homogeneous renewal process
$\lambda_{n+1}$
. Instead, we turn to a homogeneous renewal process 
 $N^*$
 with sojourn times that are Weibull distributed with shape parameter k and rate parameter c. By the strong law of large numbers, since the expected sojourn times are strictly positive,
$N^*$
 with sojourn times that are Weibull distributed with shape parameter k and rate parameter c. By the strong law of large numbers, since the expected sojourn times are strictly positive, 
 $N^*$
 has explosion probability zero [Reference Ross23, Section 3.1]. Also,
$N^*$
 has explosion probability zero [Reference Ross23, Section 3.1]. Also, 
 \begin{align*} \lambda_{n+1}(x;\ x_n) \leq \lambda^*_{n+1}(x;\ x_n) = k c^k (x-x_n)^{k-1} \end{align*}
\begin{align*} \lambda_{n+1}(x;\ x_n) \leq \lambda^*_{n+1}(x;\ x_n) = k c^k (x-x_n)^{k-1} \end{align*}
and the right-hand side is a function of 
 $x-x_n$
 only. By [Reference Haezendonck and De Vylder10, Corollary 2],
$x-x_n$
 only. By [Reference Haezendonck and De Vylder10, Corollary 2], 
 $\mathbb{P}(X_\infty \lt \infty) = 0$
.
$\mathbb{P}(X_\infty \lt \infty) = 0$
.
A.3. Corollary 2.1
 
Proof of Corollary 2.1. Construct a Poisson process 
 $N^*(t)$
 with intensity c as in the proof of Proposition 2.2 and write
$N^*(t)$
 with intensity c as in the proof of Proposition 2.2 and write 
 $X^*_n$
 for its jump times. By [Reference Haezendonck and De Vylder10, Corollary 1],
$X^*_n$
 for its jump times. By [Reference Haezendonck and De Vylder10, Corollary 1], 
 $\mathbb{P}(X_{2n} \leq t)$
 is bounded from above by
$\mathbb{P}(X_{2n} \leq t)$
 is bounded from above by 
 $\mathbb{P}(X^*_{2n} \leq t)$
. Therefore,
$\mathbb{P}(X^*_{2n} \leq t)$
. Therefore, 
 \begin{equation*} \mathbb{E}N(t) = \sum_{n=1}^{\infty} \mathbb{P}(X_{2n} \leq t) \leq \sum_{n=1}^{\infty} \mathbb{P}(X^*_{2n} \leq t) = \mathbb{E}N^*(t) = ct. \end{equation*}
\begin{equation*} \mathbb{E}N(t) = \sum_{n=1}^{\infty} \mathbb{P}(X_{2n} \leq t) \leq \sum_{n=1}^{\infty} \mathbb{P}(X^*_{2n} \leq t) = \mathbb{E}N^*(t) = ct. \end{equation*}
A.4. Corollary 2.2
 
Proof of Corollary 2.2. Construct the renewal process 
 $N^*$
 as in the proof of Proposition 2.3. Then, as in the proof of Corollary 2.1,
$N^*$
 as in the proof of Proposition 2.3. Then, as in the proof of Corollary 2.1, 
 $\mathbb{E}(N(t)) \leq \mathbb{E}(N^*(t))$
. Also,
$\mathbb{E}(N(t)) \leq \mathbb{E}(N^*(t))$
. Also, 
 $\mathbb{E}(N^*(t)) \lt \infty$
 (see [Reference Asmussen2] or [Reference Ross23, Proposition 3.2.2.]).
$\mathbb{E}(N^*(t)) \lt \infty$
 (see [Reference Asmussen2] or [Reference Ross23, Proposition 3.2.2.]).
A.5. Proposition 2.4
 
Proof of Proposition 2.4. By construction, 
 $X_0 = 0$
 and
$X_0 = 0$
 and 
 $S_0 = 1$
. Now, for
$S_0 = 1$
. Now, for 
 $0 \leq x \lt t$
,
$0 \leq x \lt t$
, 
 \begin{align*} \mathbb{P}(A(t) \gt x) & = \mathbb{P}(t - X_{2N(t)} \gt x, X_{2N(t) + 1} \gt t\mid S_0 = 1, X_0 = 0) \\ & = \sum_{n=0}^{\infty}\mathbb{P}(t - X_{2n} \gt x, X_{2n + 1} \gt t, N(t) = n\mid S_0 = 1, X_0 = 0) \\ & = 1 - \mathbb{P}(T_1 \leq t\mid S_0 = 1, X_0 = 0) \\ & \quad + \sum_{n=1}^{\infty} \mathbb{P}(t - X_{2n} \gt x, X_{2n + 1} \gt t\mid S_0 = 1, X_0 = 0) \end{align*}
\begin{align*} \mathbb{P}(A(t) \gt x) & = \mathbb{P}(t - X_{2N(t)} \gt x, X_{2N(t) + 1} \gt t\mid S_0 = 1, X_0 = 0) \\ & = \sum_{n=0}^{\infty}\mathbb{P}(t - X_{2n} \gt x, X_{2n + 1} \gt t, N(t) = n\mid S_0 = 1, X_0 = 0) \\ & = 1 - \mathbb{P}(T_1 \leq t\mid S_0 = 1, X_0 = 0) \\ & \quad + \sum_{n=1}^{\infty} \mathbb{P}(t - X_{2n} \gt x, X_{2n + 1} \gt t\mid S_0 = 1, X_0 = 0) \end{align*}
after simplifying and removing redundant conditions. Note that by (2.3), and as we know that we are guaranteed to be in state 1, 
 $\mathbb{P}(T_1 \leq t\mid S_0 = 1, X_0 = 0) = G_Y(0, t)$
. Continuing,
$\mathbb{P}(T_1 \leq t\mid S_0 = 1, X_0 = 0) = G_Y(0, t)$
. Continuing, 
 \begin{align*} \mathbb{P}(A(t) \gt x) & = 1 - G_Y(0,t) + \sum_{n=1}^{\infty}\mathbb{P}(X_{2n} \lt t- x, X_{2n + 1} \gt t\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0,t) + \int_0^{t-x}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s), \end{align*}
\begin{align*} \mathbb{P}(A(t) \gt x) & = 1 - G_Y(0,t) + \sum_{n=1}^{\infty}\mathbb{P}(X_{2n} \lt t- x, X_{2n + 1} \gt t\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0,t) + \int_0^{t-x}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s), \end{align*}
using the law of total probability and Fubini’s theorem. Considering the discrete components,
 \begin{align*} \mathbb{P}(A(t) = 0) & = 1 - P(A(t) \gt 0) = G_Y(0,t) - \int_0^{t}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s), \\ \mathbb{P}(A(t) = t) & = \mathbb{P}(X_1 \gt t) = 1 - G_Y(0, t). \end{align*}
\begin{align*} \mathbb{P}(A(t) = 0) & = 1 - P(A(t) \gt 0) = G_Y(0,t) - \int_0^{t}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s), \\ \mathbb{P}(A(t) = t) & = \mathbb{P}(X_1 \gt t) = 1 - G_Y(0, t). \end{align*}
 Next, we turn to the excess. For 
 $z \geq 0$
,
$z \geq 0$
, 
 \begin{align*} \mathbb{P}(B(t) \gt z) & = \mathbb{P}(X_{2N(t)+1} \gt t + z\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0, t+z) + \int_0^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s), \\ \mathbb{P}(B(t) = 0) & = 1 - \mathbb{P}(B(t) \gt 0) = G_Y(0,t) - \int_0^{t}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s). \end{align*}
\begin{align*} \mathbb{P}(B(t) \gt z) & = \mathbb{P}(X_{2N(t)+1} \gt t + z\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0, t+z) + \int_0^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s), \\ \mathbb{P}(B(t) = 0) & = 1 - \mathbb{P}(B(t) \gt 0) = G_Y(0,t) - \int_0^{t}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s). \end{align*}
Using similar arguments, we can find an expression for the joint probability 
 $\mathbb{P}(A(t) \gt x$
,
$\mathbb{P}(A(t) \gt x$
, 
 $B(t) \gt z)$
. For
$B(t) \gt z)$
. For 
 $z \in [0, \infty)$
 and
$z \in [0, \infty)$
 and 
 $x \in [0, t)$
,
$x \in [0, t)$
, 
 \begin{align*} \mathbb{P}(A(t) \gt x, B(t) \gt z) & = \mathbb{P}(X_{2N(t)+1} \gt t + z, t - X_{2N(t)} \gt x\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0, t+z) + \int_0^{t-x}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s). \end{align*}
\begin{align*} \mathbb{P}(A(t) \gt x, B(t) \gt z) & = \mathbb{P}(X_{2N(t)+1} \gt t + z, t - X_{2N(t)} \gt x\mid S_0 = 1, X_0 = 0) \\ & = 1 - G_Y(0, t+z) + \int_0^{t-x}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s). \end{align*}
 We can now handle the event 
 $\{ A(t) \leq x, B(t) \leq z \}$
 for
$\{ A(t) \leq x, B(t) \leq z \}$
 for 
 $0 \leq x \lt t$
,
$0 \leq x \lt t$
, 
 $z \geq 0$
 as follows:
$z \geq 0$
 as follows: 
 \begin{align*} \mathbb{P}(A(t) \leq x, B(t) \leq z) & = \mathbb{P}(A(t) \gt x, B(t) \gt z) + 1 - \mathbb{P}(B(t) \gt z) - \mathbb{P}(A(t) \gt x) \\ & = G_Y(0,t) - \int_{t-x}^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) \\ & \quad - \int_0^{t-x}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s). \end{align*}
\begin{align*} \mathbb{P}(A(t) \leq x, B(t) \leq z) & = \mathbb{P}(A(t) \gt x, B(t) \gt z) + 1 - \mathbb{P}(B(t) \gt z) - \mathbb{P}(A(t) \gt x) \\ & = G_Y(0,t) - \int_{t-x}^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) \\ & \quad - \int_0^{t-x}[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s). \end{align*}
Finally, for 
 $x=t$
,
$x=t$
, 
 \begin{align*} \mathbb{P}(A(t) \leq t, B(t) \leq z) = G_Y(0, t+z) - \int_0^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) \end{align*}
\begin{align*} \mathbb{P}(A(t) \leq t, B(t) \leq z) = G_Y(0, t+z) - \int_0^{t}[1 - G_Y(s, t + z - s)]\,{\mathrm{d}} M(s) \end{align*}
and we obtain the proposed expression.
A.6. Proposition 2.5
 
Proof of Proposition 2.5. Assume that 
 $0 \leq a \leq t \leq a+l$
 and
$0 \leq a \leq t \leq a+l$
 and 
 $l \geq 0$
. The marginal distribution of the starting time
$l \geq 0$
. The marginal distribution of the starting time 
 $f_t(a)$
 is
$f_t(a)$
 is 
 \begin{align*} f_t(a) = \int q_t(a,l)\,{\mathrm{d}} l & = \frac{m(a)}{\int_{0}^t[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)}\int_{t-a}^{\infty} g_Y(a, l)\,{\mathrm{d}} l \\ & = \frac{m(a)[1-G_Y(a,t-a)]}{\int_{0}^t[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)}, \\ f_{t,L|A=a}(l) = \frac{q_t(a,l)}{f_a(a)} & = \frac{g_Y(a, l)}{1-G_Y(a,t-a)}. \end{align*}
\begin{align*} f_t(a) = \int q_t(a,l)\,{\mathrm{d}} l & = \frac{m(a)}{\int_{0}^t[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)}\int_{t-a}^{\infty} g_Y(a, l)\,{\mathrm{d}} l \\ & = \frac{m(a)[1-G_Y(a,t-a)]}{\int_{0}^t[1 - G_Y(s, t-s)]\,{\mathrm{d}} M(s)}, \\ f_{t,L|A=a}(l) = \frac{q_t(a,l)}{f_a(a)} & = \frac{g_Y(a, l)}{1-G_Y(a,t-a)}. \end{align*}
Acknowledgement
The research was funded by the Dutch Research Council (NWO), grant number: OCENW.KLEIN.068. We would also like to thank the referees for their comments, which we feel have allowed us to greatly improve the quality of the work.
Funding Information
There are no funding bodies to thank relating to the creation of this article.
Competing Interests
There were no competing interests to declare which arose during the preparation or publication process of this article.
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 











