Questions of ‘how best to acquire data’ are essential to modelling and prediction in the natural and social sciences, engineering applications, and beyond. Optimal experimental design (OED) formalizes these questions and creates computational methods to answer them. This article presents a systematic survey of modern OED, from its foundations in classical design theory to current research involving OED for complex models. We begin by reviewing criteria used to formulate an OED problem and thus to encode the goal of performing an experiment. We emphasize the flexibility of the Bayesian and decision-theoretic approach, which encompasses information-based criteria that are well-suited to nonlinear and non-Gaussian statistical models. We then discuss methods for estimating or bounding the values of these design criteria; this endeavour can be quite challenging due to strong nonlinearities, high parameter dimension, large per-sample costs, or settings where the model is implicit. A complementary set of computational issues involves optimization methods used to find a design; we discuss such methods in the discrete (combinatorial) setting of observation selection and in settings where an exact design can be continuously parametrized. Finally we present emerging methods for sequential OED that build non-myopic design policies, rather than explicit designs; these methods naturally adapt to the outcomes of past experiments in proposing new experiments, while seeking coordination among all experiments to be performed. Throughout, we highlight important open questions and challenges.
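As a point of reference for the information-based criteria mentioned above (a standard formulation added here for illustration, not quoted from the abstract), the expected information gain of a design $d$ for parameters $\theta$ and data $y$ is
$$U(d)=\mathbb{E}_{y\mid d}\!\left[D_{\mathrm{KL}}\!\big(p(\theta\mid y,d)\,\big\|\,p(\theta)\big)\right]=\iint p(y\mid\theta,d)\,p(\theta)\,\log\frac{p(y\mid\theta,d)}{p(y\mid d)}\,d\theta\,dy,$$
i.e. the mutual information between parameters and observations; a Bayesian optimal design maximizes $U(d)$ over the design space.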
Information generating functions (IGFs) have attracted considerable interest because of their ability to generate various information measures. The IGF of an absolutely continuous random variable (see Golomb, S. (1966). The information generating function of a probability distribution. IEEE Transactions on Information Theory, 12(1), 75–77) depends on its density function. However, several models have intractable cumulative distribution functions yet possess explicit quantile functions. For this reason, we propose a quantile version of the IGF and explore some of its properties. The effect of increasing transformations on it is then studied, and bounds are obtained. The proposed generating function is studied in particular for escort and generalized escort distributions. Some connections between the quantile-based IGF (Q-IGF) order and well-known stochastic orders are established. Finally, the proposed Q-IGF is extended to residual and past lifetimes. Several examples are presented throughout to illustrate the theoretical results, and an inferential application of the proposed methodology is also discussed.
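For orientation (the classical form, with a quantile analogue sketched under the usual change of variables; this is added for illustration and is not quoted from the abstract): Golomb's information generating function of a density $f$ is $I_X(s)=\int f(x)^s\,dx$, whose derivative at $s=1$ equals the negative Shannon entropy. Writing $Q=F^{-1}$ for the quantile function and $q=Q'$ for the quantile density, so that $f(Q(u))=1/q(u)$, a natural quantile-based analogue is
$$I_X(s)=\int_0^1 q(u)^{\,1-s}\,du,$$
which involves only the quantile function and is therefore computable for models with intractable distribution functions.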
We investigate the convergence rate of multi-marginal optimal transport costs that are regularized with the Boltzmann–Shannon entropy, as the noise parameter $\varepsilon $ tends to $0$. We establish lower and upper bounds on the difference with the unregularized cost of the form $C\varepsilon \log (1/\varepsilon )+O(\varepsilon )$ for some explicit dimensional constants $C$ depending on the marginals and on the ground cost, but not on the optimal transport plans themselves. Upper bounds are obtained for Lipschitz costs, with a finer estimate for locally semiconcave costs, and lower bounds for $\mathscr {C}^2$ costs satisfying a signature condition on the mixed second derivatives that may include degenerate costs, thus generalizing previous results restricted to the two-marginal case and to nondegenerate costs. In particular, we obtain matching bounds in some typical situations where the optimal plan is deterministic.
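For context (one common formulation, added here for illustration), the entropy-regularized multi-marginal problem reads
$$\mathrm{OT}_\varepsilon(\mu_1,\ldots,\mu_m)\;=\;\inf_{\gamma\in\Pi(\mu_1,\ldots,\mu_m)}\;\int c\,d\gamma\;+\;\varepsilon\,\mathrm{Ent}(\gamma),\qquad \mathrm{Ent}(\gamma)=\int \gamma\log\gamma,$$
where $\Pi(\mu_1,\ldots,\mu_m)$ is the set of couplings with the prescribed marginals; the bounds above control $\mathrm{OT}_\varepsilon-\mathrm{OT}_0$ as $\varepsilon\to 0$.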
We generalize to a broader class of decoupled measures a result of Ziv and Merhav on universal estimation of the specific cross (or relative) entropy, originally for a pair of multilevel Markov measures. Our generalization focuses on abstract decoupling conditions and covers pairs of suitably regular g-measures and pairs of equilibrium measures arising from the “small space of interactions” in mathematical statistical mechanics.
It is proven that a conjecture of Tao (2010) holds true for log-concave random variables on the integers: for every $n \geq 1$, if $X_1,\ldots,X_n$ are i.i.d. integer-valued, log-concave random variables, then $H(X_1+\cdots+X_n) \geq H(X_1) + \tfrac{1}{2}\log n - o(1)$ as $H(X_1) \to \infty$, where $H(X_1)$ denotes the (discrete) Shannon entropy. The problem is reduced to the continuous setting by taking independent continuous uniforms $U_1,\ldots,U_n$ on $(0,1)$ and relating the discrete entropy of the sum to the differential entropy of the smoothed sum of the $X_i+U_i$.
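A standard identity behind such discrete-to-continuous reductions (recalled here for illustration; the precise statements proved in the paper are not reproduced above): if $X$ is integer-valued and $U$ is an independent uniform on $(0,1)$, then the density of $X+U$ equals $\mathbb{P}(X=k)$ on each interval $(k,k+1)$, so the differential entropy satisfies $h(X+U)=H(X)$ exactly.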
Measures of uncertainty are a topic of considerable and growing interest. Recently, the introduction of extropy as a measure of uncertainty, dual to Shannon entropy, has opened up interest in new aspects of the subject. Since there are many versions of entropy, a unified formulation has been introduced to work with all of them in an easy way. Here we consider the possibility of defining a unified formulation for extropy by introducing a measure depending on two parameters. For particular choices of parameters, this measure provides the well-known formulations of extropy. Moreover, the unified formulation of extropy is also analyzed in the context of the Dempster–Shafer theory of evidence, and an application to classification problems is given.
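For reference (the standard definition, not restated in the abstract above): the extropy of an absolutely continuous random variable $X$ with density $f$ is $J(X)=-\tfrac{1}{2}\int f^{2}(x)\,dx$, the measure dual to the Shannon entropy $H(X)=-\int f(x)\log f(x)\,dx$; the two-parameter formulation discussed here is designed to recover such known formulations as special cases.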
An extension of Shannon’s entropy power inequality for the case where one of the summands is Gaussian was provided by Costa in 1985 and is known as Costa’s concavity inequality. We consider the additive Gaussian noise channel under a more realistic assumption, namely that the input and noise components are not independent and that their dependence structure follows the well-known multivariate Gaussian copula. Two generalizations for the first- and second-order derivatives of the differential entropy of the output signal for dependent multivariate random variables are derived. It is shown that some previous results in the literature are particular versions of our results. Using these derivatives, concavity of the entropy power is proved under certain mild conditions. Finally, special one-dimensional versions of our general results are described, which indeed reveal an extension of the one-dimensional case of Costa’s concavity inequality to the dependent case. An illustrative example is also presented.
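For context (the classical statement being generalized, recalled here for illustration): with entropy power $N(X)=\tfrac{1}{2\pi e}\exp\!\big(\tfrac{2}{n}h(X)\big)$ for an $n$-dimensional $X$ with differential entropy $h$, Costa's 1985 result says that $t\mapsto N(X+\sqrt{t}\,Z)$ is concave on $[0,\infty)$ when $Z$ is a standard Gaussian vector independent of $X$; the results above relax this independence assumption through a Gaussian copula.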
The principle of maximum entropy is a well-known approach to produce a model for data-generating distributions. In this approach, if partial knowledge about the distribution is available in terms of a set of information constraints, then the model that maximizes entropy under these constraints is used for the inference. In this paper, we propose a new three-parameter lifetime distribution using the maximum entropy principle under the constraints on the mean and a general index. We then present some statistical properties of the new distribution, including hazard rate function, quantile function, moments, characterization, and stochastic ordering. We use the maximum likelihood estimation technique to estimate the model parameters. A Monte Carlo study is carried out to evaluate the performance of the estimation method. In order to illustrate the usefulness of the proposed model, we fit the model to three real data sets and compare its relative performance with respect to the beta generalized Weibull family.
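As a simple reminder of the principle (a textbook example, not the construction used in the paper): among all densities supported on $[0,\infty)$ with a fixed mean $\mu$, the entropy maximizer is the exponential density $f(x)=\mu^{-1}e^{-x/\mu}$; the three-parameter distribution proposed here arises analogously, with an additional constraint on a general index.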
There has recently been growing interest in studying the variability of uncertainty measures in information theory. To this end, varentropy has been introduced and examined for one-sided truncated random variables. Since the interval entropy is instrumental in summarizing properties of a system and its components when failure occurs between two time points, studying the variability of this measure refines the extracted information. In this article, we introduce the concept of varentropy for doubly truncated random variables. A detailed study of theoretical results, taking into account transformations, monotonicity, and other conditions, is proposed. A simulation study is carried out to investigate the behavior of varentropy on shrinking intervals for simulated and real-life data sets. Furthermore, applications related to the choice of the most acceptable system and to the first-passage times of an Ornstein–Uhlenbeck jump-diffusion process are illustrated.
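For reference (the standard non-truncated definition, added here for orientation): the varentropy of an absolutely continuous random variable $X$ with density $f$ is $V(X)=\mathrm{Var}\big[-\log f(X)\big]=\int f(x)\big(\log f(x)\big)^{2}\,dx-\Big(\int f(x)\log f(x)\,dx\Big)^{2}$, the variance of the information content whose doubly truncated analogue is studied above.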
This paper extends the fundamental concepts of entropy, information, and divergence to the case where the distribution function and the respective survival function play the central role in their definition. The main aim is to provide an overview of these three categories of information measures and of their cumulative and survival counterparts. It also introduces and discusses Csiszár-type cumulative and survival divergences and the analogous Fisher-type information based on cumulative and survival functions.
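For reference (the classical density-based definition whose cumulative and survival analogues are discussed above): Csiszár's $\phi$-divergence between densities $f$ and $g$ is $D_\phi(f\|g)=\int g(x)\,\phi\!\big(f(x)/g(x)\big)\,dx$ for a convex $\phi$ with $\phi(1)=0$; broadly speaking, the cumulative and survival versions considered here base such measures on distribution and survival functions instead of densities.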
Two-sided bounds are explored for concentration functions and Rényi entropies in the class of discrete log-concave probability distributions. They are used to derive certain variants of the entropy power inequalities.
Several authors have investigated the question of whether canonical logic-based accounts of belief revision, and especially the theory of AGM revision operators, are compatible with the dynamics of Bayesian conditioning. Here we show that Leitgeb’s stability rule for acceptance, which has been offered as a possible solution to the Lottery paradox, makes it possible to bridge AGM revision and Bayesian update: using the stability rule, we prove that AGM revision operators emerge from Bayesian conditioning by an application of the principle of maximum entropy. In situations of information loss, or whenever the agent relies on a qualitative description of her information state—such as a plausibility ranking over hypotheses, or a belief set—the dynamics of AGM belief revision are compatible with Bayesian conditioning; indeed, through the maximum entropy principle, conditioning naturally generates AGM revision operators. This mitigates an impossibility theorem of Lin and Kelly for tracking Bayesian conditioning with AGM revision, and suggests an approach to the compatibility problem that highlights the information loss incurred by acceptance rules in passing from probabilistic to qualitative representations of belief.
This paper provides a functional analogue of the recently initiated dual Orlicz–Brunn–Minkowski theory for star bodies. We first propose the Orlicz addition of measures, and establish the dual functional Orlicz–Brunn–Minkowski inequality. Based on a family of linear Orlicz additions of two measures, we provide an interpretation for the famous $f$-divergence. Jensen’s inequality for integrals is also proved to be equivalent to the newly established dual functional Orlicz–Brunn–Minkowski inequality. An optimization problem for the $f$-divergence is proposed, and related functional affine isoperimetric inequalities are established.
In this paper, we introduce the notions of a relative operator (α, β)-entropy and a Tsallis relative operator (α, β)-entropy as two-parameter extensions of the relative operator entropy and the Tsallis relative operator entropy. We apply a perspective approach to prove the joint convexity or concavity of these new notions under certain conditions on α and β. Indeed, we give the parametric extensions, but in such a manner that they remain jointly convex or jointly concave.
Significance Statement. What is novel here is that we convincingly demonstrate how our techniques can be used to give simple proofs for the old and new theorems for the functions that are relevant to quantum statistics. Our proof strategy shows that the joint convexity of the perspective of some functions plays a crucial role to give simple proofs for the joint convexity (resp. concavity) of some relative operator entropies.
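For reference (the classical one-parameter notions that the (α, β)-versions above extend, recalled here for illustration): for positive operators $A$ and $B$, the relative operator entropy is $S(A\,|\,B)=A^{1/2}\log\!\big(A^{-1/2}BA^{-1/2}\big)A^{1/2}$, and the Tsallis relative operator entropy is $T_\lambda(A\,|\,B)=\big(A^{1/2}(A^{-1/2}BA^{-1/2})^{\lambda}A^{1/2}-A\big)/\lambda$ for $\lambda\in(0,1]$, which recovers $S(A\,|\,B)$ in the limit $\lambda\to 0$.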
In the present communication, we introduce quantile-based (dynamic) inaccuracy measures and study their properties. Such measures provide an alternative approach to evaluating the inaccuracy contained in assumed statistical models. There are several models whose quantile functions are available in tractable form even though their distribution functions are not available explicitly; in such cases, the traditional distribution-function approach fails to compute the inaccuracy between two random variables. Various examples are provided for illustration, and some bounds are obtained. The effect of monotone transformations is studied and characterizations are provided.
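For reference (the classical density-based measure that the quantile approach above reworks): the Kerridge inaccuracy between a true density $f$ and an assumed density $g$ is $K(f,g)=-\int f(x)\log g(x)\,dx=h(f)+D_{\mathrm{KL}}(f\|g)$, where $h(f)$ is the differential entropy; when a model is specified only through its quantile function, this distribution-function route becomes impractical, which motivates the quantile-based versions studied here.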
The Shannon entropy based on the probability density function is a key information measure with applications in different areas. Some alternative information measures have been proposed in the literature. Two relevant ones are the cumulative residual entropy (based on the survival function) and the cumulative past entropy (based on the distribution function). Recently, some extensions of these measures have been proposed. Here, we obtain some properties for the generalized cumulative past entropy. In particular, we prove that it determines the underlying distribution. We also study this measure in coherent systems and a closely related generalized past cumulative Kerridge inaccuracy measure.
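For reference (the standard definitions of the two measures named above, for a nonnegative random variable $X$ with distribution function $F$ and survival function $\bar F=1-F$): the cumulative residual entropy is $\mathcal{E}(X)=-\int_0^{\infty}\bar F(x)\log\bar F(x)\,dx$ and the cumulative past entropy is $\mathcal{CE}(X)=-\int_0^{\infty}F(x)\log F(x)\,dx$; the generalized versions studied here build on the latter.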
In this paper, we perform a detailed spectral study of the liberation process associated with two symmetries of arbitrary ranks: $(R,S)\mapsto (R,U_{t}SU_{t}^{\ast })_{t\geqslant 0}$, where $(U_{t})_{t\geqslant 0}$ is a free unitary Brownian motion freely independent from $\{R,S\}$. Our main tool is free stochastic calculus, which allows us to derive a partial differential equation (PDE) for the Herglotz transform of the unitary process defined by $Y_{t}:=RU_{t}SU_{t}^{\ast }$. It turns out that this is exactly the PDE governing the flow of an analytic function transform of the spectral measure of the operator $X_{t}:=PU_{t}QU_{t}^{\ast }P$, where $P,Q$ are the orthogonal projections associated to $R,S$. Next, we relate the two spectral measures of $RU_{t}SU_{t}^{\ast }$ and of $PU_{t}QU_{t}^{\ast }P$ via their moment sequences and use this relationship to develop a theory of subordination for the boundary values of the Herglotz transform. In particular, we explicitly compute the subordinate function and extend its inverse continuously to the unit circle. As an application, we prove the identity $i^{\ast }(\mathbb{C}P+\mathbb{C}(I-P);\mathbb{C}Q+\mathbb{C}(I-Q))=-\chi _{\text{orb}}(P,Q)$.
We consider dynamic versions of the mutual information of lifetime distributions, with a focus on past lifetimes, residual lifetimes, and mixed lifetimes evaluated at different instants. This allows us to study multicomponent systems, by measuring the dependence in conditional lifetimes of two components having possibly different ages. We provide some bounds, and investigate the mutual information of residual lifetimes within the time-transformed exponential model (under both the assumptions of unbounded and truncated lifetimes). Moreover, with reference to the order statistics of a random sample, we evaluate explicitly the mutual information between the minimum and the maximum, conditional on inspection at different times, and show that it is distribution-free in a special case. Finally, we develop a copula-based approach aiming to express the dynamic mutual information for past and residual bivariate lifetimes in an alternative way.
Given two absolutely continuous nonnegative independent random variables, we define the reversed relevation transform as dual to the relevation transform. We first apply such transforms to the lifetimes of the components of parallel and series systems under suitable proportionality assumptions on the hazard rates. Furthermore, we prove that the (reversed) relevation transform is commutative if and only if the proportional (reversed) hazard rate model holds. By repeated application of the reversed relevation transform we construct a decreasing sequence of random variables which leads to new weighted probability densities. We obtain various relations involving ageing notions and stochastic orders. We also exploit the connection of such a sequence to the cumulative entropy and to an operator that is dual to the Dickson-Hipp operator. Iterative formulae for computing the mean and the cumulative entropy of the random variables of the sequence are finally investigated.
We describe a general framework for realistic analysis of sorting algorithms, and we apply it to the average-case analysis of three basic sorting algorithms (QuickSort, InsertionSort, BubbleSort). Usually the analysis deals with the mean number of key comparisons, but here we view keys as words produced by the same source, which are compared via their symbols in lexicographic order. The ‘realistic’ cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. For sorting algorithms, and with respect to key comparisons, the average-case complexity of QuickSort is asymptotic to $2n\log n$, InsertionSort to $n^2/4$ and BubbleSort to $n^2/2$. With respect to symbol comparisons, we prove that their average-case complexity becomes $\Theta(n\log^2 n)$, $\Theta(n^2)$, $\Theta(n^2\log n)$. In these three cases, we describe the dominant constants which exhibit the probabilistic behaviour of the source (namely entropy and coincidence) with respect to the algorithm.
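As a small illustrative experiment (a toy sketch in Python, not the analysis carried out in the paper; the alphabet, word length, and input sizes below are arbitrary choices), one can count both kinds of comparisons when QuickSort sorts random words drawn from a memoryless source:

import random

ALPHABET = "ab"      # memoryless source over a two-symbol alphabet (arbitrary choice)
WORD_LEN = 64        # long enough that ties between distinct keys are effectively impossible

def random_word(rng):
    return "".join(rng.choice(ALPHABET) for _ in range(WORD_LEN))

def compare(u, v, counts):
    # One key comparison; symbols are compared left to right until they differ.
    counts["keys"] += 1
    for cu, cv in zip(u, v):
        counts["symbols"] += 1
        if cu != cv:
            return -1 if cu < cv else 1
    return (len(u) > len(v)) - (len(u) < len(v))

def quicksort(words, counts):
    # Plain QuickSort with the first element as pivot (inputs arrive in random order).
    if len(words) <= 1:
        return words
    pivot, rest = words[0], words[1:]
    smaller, larger = [], []
    for w in rest:
        (smaller if compare(w, pivot, counts) < 0 else larger).append(w)
    return quicksort(smaller, counts) + [pivot] + quicksort(larger, counts)

rng = random.Random(0)
for n in (1000, 2000, 4000):
    counts = {"keys": 0, "symbols": 0}
    quicksort([random_word(rng) for _ in range(n)], counts)
    print(n, counts["keys"], counts["symbols"])

The key-comparison counts should grow roughly like $2n\ln n$, while the symbol-comparison counts should grow noticeably faster, in line with the $\Theta(n\log^2 n)$ behaviour described above.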