This paper introduces a method for pricing insurance policies using market data. The approach is designed for scenarios in which an insurance company seeks to enter a new market (in our case, pet insurance) for which it lacks historical data. The methodology involves an iterative two-step process. First, a suitable parameter is proposed to characterize the underlying risk. Second, the resulting pure premium is linked to the observed commercial premium using an isotonic regression model. To validate the method, comprehensive testing is conducted on synthetic data, followed by its application to a dataset of actual pet insurance rates. To facilitate practical implementation, we have developed an R package called IsoPriceR. By addressing the challenge of pricing insurance policies in the absence of historical data, this method helps enhance pricing strategies in emerging markets.
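The second step, linking pure premiums to commercial premiums through a monotone fit, can be illustrated with a minimal sketch of isotonic regression via the pool-adjacent-violators algorithm. This is not the IsoPriceR implementation; the function name `isotonic_fit` and the premium values below are hypothetical and chosen only for illustration.

```python
def isotonic_fit(y, w=None):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators).

    y is assumed to be ordered by the explanatory variable
    (here: by increasing pure premium).
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the non-decreasing constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand the blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical data: pure premiums in ascending order and the
# commercial premiums observed for the same policies.
pure_premiums = [10.0, 20.0, 30.0, 40.0, 50.0]
commercial = [15.0, 14.0, 30.0, 28.0, 45.0]
fitted = isotonic_fit(commercial)
print(list(zip(pure_premiums, fitted)))
```

Adjacent observations that break monotonicity are pooled and replaced by their weighted mean, so the fitted commercial premium never decreases as the pure premium increases.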
The cumulative residual extropy has been proposed recently as an alternative measure of extropy to the cumulative distribution function of a random variable. In this paper, the concept of cumulative residual extropy has been extended to cumulative residual extropy inaccuracy (CREI) and dynamic cumulative residual extropy inaccuracy (DCREI). Some lower and upper bounds for these measures are provided. A characterization problem for the DCREI measure under the proportional hazard rate model is studied. Nonparametric estimators for CREI and DCREI measures based on kernel and empirical methods are suggested. Also, a simulation study is presented to evaluate the performance of the suggested measures. Simulation results show that the kernel-based estimator performs better than the empirical-based estimator. Finally, applications of the DCREI measure for model selection are provided using two real data sets.
Reliability analysis of stress–strength models usually assumes that the stress and strength variables are independent. However, in numerous real-world scenarios, stress and strength variables exhibit dependence. This paper investigates reliability estimation in a multicomponent stress–strength model for a parallel-series system, assuming that the dependence between stress and strength is described by the Clayton copula. The estimators for the unknown parameters and the system reliability are derived using the two-step maximum likelihood estimation and the maximum product spacing methods. Additionally, confidence intervals are constructed using asymptotic normal distribution theory and the bootstrap method. Furthermore, Monte Carlo simulations are conducted to compare the effectiveness of the proposed inference methods. Finally, a real dataset is analyzed for illustrative purposes.
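For readers unfamiliar with the dependence structure mentioned here, the following is a minimal sketch of the standard bivariate Clayton copula, $C_\theta(u,v) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}$ for $\theta > 0$. It only evaluates the copula; it does not reproduce the paper's estimation procedure, and the function name `clayton_cdf` is a hypothetical choice.

```python
def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v) for theta > 0 and u, v in [0, 1]."""
    if theta <= 0:
        raise ValueError("this sketch covers only theta > 0")
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("u and v must lie in [0, 1]")
    # The copula is 0 on the lower boundary; handle it explicitly
    # to avoid raising 0 to a negative power.
    if u == 0.0 or v == 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


# Larger theta means stronger positive dependence between the margins.
print(clayton_cdf(0.5, 0.5, 1.0))
print(clayton_cdf(0.5, 0.5, 5.0))
```

As $\theta \to 0$ the copula approaches the independence copula $uv$, which corresponds to the classical independent stress–strength setting.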
In this paper, we use an information theoretic approach called cumulative residual extropy (CRJ) to compare mixed used systems. We establish mixture representations for the CRJ of mixed used systems and then explore the measure and comparison results among these systems. We compare the mixed used systems based on stochastic orders and stochastically ordered conditional coefficients vectors. Additionally, we derive bounds for the CRJ of mixed used systems with independent and identically distributed components. We also propose the Jensen-cumulative residual extropy (JCRJ) divergence to calculate the complexity of systems. To demonstrate the utility of these results, we calculate and compare the CRJ and JCRJ divergence of mixed used systems in the exponential model. Furthermore, we determine the optimal system configuration based on signature under a criterion function derived from JCRJ in the exponential model.
Graph-based semi-supervised learning methods combine the graph structure and labeled data to classify unlabeled data. In this work, we study the effect of a noisy oracle on classification. In particular, we derive the maximum a posteriori (MAP) estimator for clustering a degree corrected stochastic block model when a noisy oracle reveals a fraction of the labels. We then propose an algorithm derived from a continuous relaxation of the MAP, and we establish its consistency. Numerical experiments show that our approach achieves promising performance on synthetic and real data sets, even in the case of very noisy labeled data.
Inference in spatial and spatio-temporal models can be challenging for a variety of reasons. For example, non-Gaussianity often leads to analytically intractable integrals; we may be in a ‘big’ data setting, whereby the number of observations renders traditional methods too computationally expensive; we may wish to make inferences over spatial supports that are different to those of our measurements; or, we may wish to use a statistical model whose likelihood function is either unavailable or computationally intractable. In this thesis, I develop several techniques that help to alleviate these challenges.
We investigate some aspects of the problem of the estimation of birth distributions (BDs) in multi-type Galton–Watson trees (MGWs) with unobserved types. More precisely, we consider two-type MGWs called spinal-structured trees. This kind of tree is characterized by a spine of special individuals whose BD $\nu$ is different from the other individuals in the tree (called normal, and whose BD is denoted by $\mu$). In this work, we show that even in such a very structured two-type population, our ability to distinguish the two types and estimate $\mu$ and $\nu$ is constrained by a trade-off between the growth-rate of the population and the similarity of $\mu$ and $\nu$. Indeed, if the growth-rate is too large, large deviation events are likely to be observed in the sampling of the normal individuals, preventing us from distinguishing them from special ones. Roughly speaking, our approach succeeds if $r\lt \mathfrak{D}(\mu,\nu)$, where r is the exponential growth-rate of the population and $\mathfrak{D}$ is a divergence measuring the dissimilarity between $\mu$ and $\nu$.
We study the community detection problem on a Gaussian mixture model, in which vertices are divided into $k\geq 2$ distinct communities. The major difference in our model is that the intensities for Gaussian perturbations are different for different entries in the observation matrix, and we do not assume that every community has the same number of vertices. We explicitly find the necessary and sufficient conditions for the exact recovery of the maximum likelihood estimation, which can give a sharp phase transition for the exact recovery even though the Gaussian perturbations are not identically distributed; see Section 7. Applications include the community detection on hypergraphs.
In this paper, we introduce a novel way to quantify the remaining inaccuracy of order statistics by utilizing the concept of extropy. We explore various properties and characteristics of this new measure. Additionally, we extend the notion of inaccuracy for ordered random variables to a dynamic version and demonstrate that this dynamic information measure uniquely determines the distribution function. Moreover, we investigate specific lifetime distributions by analyzing the residual inaccuracy of the first-order statistics. Nonparametric kernel estimation of the proposed measure is suggested. Simulation results show that the kernel estimator with bandwidth selection using the cross-validation method has the best performance. Finally, an application of the proposed measure to model selection is provided.
In this paper we study the drift parameter estimation for reflected stochastic linear differential equations of a large signal. We discuss the consistency and asymptotic distributions of the trajectory fitting estimator (TFE).
Several information measures have been proposed and studied in the literature. One such measure is extropy, a complementary dual function of entropy. Its meaning and related aging notions have not yet been studied in great detail. In this paper, we first illustrate that extropy information ranks the uniformity of a wide array of absolutely continuous families. We then discuss several theoretical merits of extropy. We also provide a closed-form expression of it for finite mixture distributions. Finally, the dynamic versions of extropy are also discussed, specifically the residual extropy and past extropy measures.
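As a brief worked illustration (not taken from the paper itself), the usual definition of the extropy of an absolutely continuous random variable $X$ with density $f$, together with two standard closed forms, reads as follows. The symbols $b$ and $\lambda$ are introduced here only for the examples.

```latex
% Extropy of X with density f
J(X) = -\frac{1}{2} \int_{-\infty}^{\infty} f^{2}(x)\, dx .

% Uniform distribution on [0, b]: f(x) = 1/b on [0, b]
J(X) = -\frac{1}{2} \int_{0}^{b} \frac{1}{b^{2}}\, dx = -\frac{1}{2b} .

% Exponential distribution with rate \lambda: f(x) = \lambda e^{-\lambda x}, x \ge 0
J(X) = -\frac{1}{2} \int_{0}^{\infty} \lambda^{2} e^{-2\lambda x}\, dx = -\frac{\lambda}{4} .
```

In the uniform case the extropy increases toward zero as $b$ grows, which is one simple way to see how extropy ranks distributions by how spread out they are.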
Let $(Z_n)_{n\geq0}$ be a supercritical Galton–Watson process. Consider the Lotka–Nagaev estimator for the offspring mean. In this paper we establish self-normalized Cramér-type moderate deviations and Berry–Esseen bounds for the Lotka–Nagaev estimator. The results are believed to be optimal or near-optimal.
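The Lotka–Nagaev estimator of the offspring mean is the ratio $Z_{n+1}/Z_n$ of consecutive generation sizes. The sketch below computes it on a simulated path; the offspring law (uniform on $\{1,2,3\}$, mean 2), the seed, and the helper names are assumptions made purely for illustration.

```python
import random


def lotka_nagaev(z_n, z_next):
    """Lotka-Nagaev estimator Z_{n+1} / Z_n of the offspring mean."""
    if z_n <= 0:
        raise ValueError("the estimator is defined only when Z_n > 0")
    return z_next / z_n


def next_generation(z_n, rng):
    """Size of the next generation for a hypothetical offspring law:
    each individual has 1, 2 or 3 children with equal probability."""
    return sum(rng.choice((1, 2, 3)) for _ in range(z_n))


rng = random.Random(0)  # fixed seed so the run is reproducible
z = 1
for _ in range(10):  # grow the population for ten generations
    z_prev, z = z, next_generation(z, rng)

estimate = lotka_nagaev(z_prev, z)
print(z_prev, z, estimate)
```

Because the estimator averages over $Z_n$ individuals, its fluctuations shrink as the supercritical population grows, which is the regime in which moderate deviation and Berry–Esseen results are of interest.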
Consider the problem of determining the Bayesian credibility mean $E(X_{n+1}|X_1,\cdots, X_n)$ when the random claims $X_1,\cdots, X_n$, given the parameter vector $\boldsymbol{\Psi}$, are sampled from a K-component mixture family of distributions whose members are the union of different families of distributions. This article begins by deriving a recursive formula for such a Bayesian credibility mean. Moreover, under the assumption that, using additional information $Z_{i,1},\cdots,Z_{i,m}$, one may probabilistically determine that a random claim $X_i$ belongs to a given population (or distribution), the recursive formula simplifies to an exact Bayesian credibility mean whenever all components of the mixture distribution belong to the exponential families of distributions. For situations where a 2-component mixture family of distributions is an appropriate choice for data modelling, the article shows, using the logistic regression model, how one may employ such additional information to derive a Bayesian credibility model, called the Logistic Regression Credibility (LRC) model, for a finite mixture of distributions. The LRC model is compared with its competitor, the Regression Tree Credibility (RTC) model. More precisely, under the squared error loss function, the LRC’s risk function dominates the RTC’s risk function at least in an interval around $0.5$. Several examples illustrate the practical application of our findings.
We introduce a new measure of inaccuracy based on extropy between distributions of the nth upper (lower) record value and parent random variable and discuss some properties of it. A characterization problem for the proposed extropy inaccuracy measure has been studied. It is also shown that the defined measure of inaccuracy is invariant under scale but not under location transformation. We characterize certain specific lifetime distribution functions. Nonparametric estimators based on the empirical and kernel methods for the proposed measures are also obtained. The performance of estimators is also discussed using a real dataset.
Let $\{X_n\}_{n\in{\mathbb{N}}}$ be an ${\mathbb{X}}$-valued iterated function system (IFS) of Lipschitz maps defined as $X_0 \in {\mathbb{X}}$ and for $n\geq 1$, $X_n\;:\!=\;F(X_{n-1},\vartheta_n)$, where $\{\vartheta_n\}_{n \ge 1}$ are independent and identically distributed random variables with common probability distribution $\mathfrak{p}$, $F(\cdot,\cdot)$ is Lipschitz continuous in the first variable, and $X_0$ is independent of $\{\vartheta_n\}_{n \ge 1}$. Under parametric perturbation of both F and $\mathfrak{p}$, we are interested in the robustness of the V-geometrical ergodicity property of $\{X_n\}_{n\in{\mathbb{N}}}$, of its invariant probability measure, and finally of the probability distribution of $X_n$. Specifically, we propose a pattern of assumptions for studying such robustness properties for an IFS. This pattern is implemented for the autoregressive processes with autoregressive conditional heteroscedastic errors, and for IFS under roundoff error or under thresholding/truncation. Moreover, we provide a general set of assumptions covering the classical Feller-type hypotheses for an IFS to be a V-geometrical ergodic process. An accurate bound for the rate of convergence is also provided.
Motivated by recent studies of big samples, this work aims to construct a parametric model which is characterized by the following features: (i) a ‘local’ reinforcement, i.e. a reinforcement mechanism mainly based on the last observations, (ii) a random persistent fluctuation of the predictive mean, and (iii) a long-term almost sure convergence of the empirical mean to a deterministic limit, together with a chi-squared goodness-of-fit result for the limit probabilities. This triple purpose is achieved by the introduction of a new variant of the Eggenberger–Pólya urn, which we call the rescaled Pólya urn. We provide a complete asymptotic characterization of this model, pointing out that, for a certain choice of the parameters, it has properties different from the ones typically exhibited by the other urn models in the literature. Therefore, beyond the possible statistical application, this work could be interesting for those who are concerned with stochastic processes with reinforcement.
In this paper, we study sample size thresholds for maximum likelihood estimation for tensor normal models. Given the model parameters and the number of samples, we determine whether, almost surely, (1) the likelihood function is bounded from above, (2) maximum likelihood estimates (MLEs) exist, and (3) MLEs exist uniquely. We obtain a complete answer for both real and complex models. One consequence of our results is that almost sure boundedness of the log-likelihood function guarantees almost sure existence of an MLE. Our techniques are based on invariant theory and castling transforms.
Bifurcating Markov chains (BMCs) are Markov chains indexed by a full binary tree representing the evolution of a trait along a population where each individual has two children. We provide a central limit theorem for additive functionals of BMCs under $L^2$-ergodic conditions with three different regimes. This completes the pointwise approach developed in a previous work. As an application, we study the elementary case of a symmetric bifurcating autoregressive process, which justifies the nontrivial hypothesis considered on the kernel transition of the BMCs. We illustrate in this example the phase transition observed in the fluctuations.