This chapter describes how to characterize data and the distribution of data. We also describe how the shape of the normal distribution enables hypothesis testing. In the section on regression, we look at how two variables, or two ways of measuring data, are related to each other. We use simple linear regression as an introduction to multiple regression, the technique used in the development of a number of traditional readability measures. A more sophisticated form of regression, called logistic regression, is also discussed; it is applied in the case studies of Chapters 4 to 6.
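As a minimal illustration of the regression ideas above, the following Python sketch fits a simple linear regression by ordinary least squares; the function name and the example data are illustrative assumptions, not taken from the chapter.

```python
from statistics import mean

def simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y ~ a + b*x for paired samples."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope: covariance over variance of x
    a = y_bar - b * x_bar  # intercept: line passes through the mean point
    return a, b

# Points lying exactly on y = 2x + 1 are recovered exactly.
a, b = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Multiple regression generalizes this to several predictors; logistic regression replaces the linear response with the log-odds of a binary outcome.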
Brownian motion is a continuous-time process obtained by taking the limit of a scaled random walk. Alternatively, Brownian motion can be defined in an axiomatic way, using a set of fundamental properties including the normal distribution property. We consider various transforms of Brownian motion, including scaling, shifting, and the exponential transform. The exponential transform gives rise to geometric Brownian motion, which is often used to model asset prices or to build Radon–Nikodym derivative processes. We conclude the chapter by proving Girsanov's theorem. We recall that the distributions of random variables depend on the probability measure at hand; hence, the distributional properties of a stochastic process are affected by a change of measure. Consequently, a process may display different properties (e.g., different distributions) under different measures. In particular, a process may display the properties of a Brownian motion under one measure but not under another. Girsanov's theorem explains how Brownian motion properties are affected when the probability measure is changed using an exponential martingale as the Radon–Nikodym derivative process.
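The random-walk construction and the exponential transform described above can be sketched in a few lines of Python; the function names and the drift and volatility values are illustrative assumptions, not taken from the chapter.

```python
import math
import random

def brownian_path(n_steps, t_max, seed=0):
    """Approximate a Brownian path on [0, t_max] as a scaled random walk:
    W(k*dt) is a cumulative sum of i.i.d. N(0, dt) increments."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def geometric_brownian(w, t_max, s0=1.0, mu=0.05, sigma=0.2):
    """Exponential transform of a Brownian path: geometric Brownian motion
    S(t) = s0 * exp((mu - sigma^2/2) * t + sigma * W(t))."""
    n = len(w) - 1
    return [s0 * math.exp((mu - 0.5 * sigma**2) * (t_max * k / n) + sigma * w[k])
            for k in range(n + 1)]

w = brownian_path(1000, 1.0)
s = geometric_brownian(w, 1.0)  # strictly positive, unlike W itself
```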
The objective of the trial was to evaluate the effects of arginine supplementation in the feed of gestating sows on the variability of piglet birth weight. Piglet weight was evaluated using descriptive analysis, correlation analysis, and analysis of variance with a 2 × 3 factorial arrangement. This arrangement combined no supplementation or supplementation with 1.0 % L-arginine with three periods. Period 1: from days 25 to 53 of gestation, providing 23 g/day from days 25 to 28 and 18 g/day from days 29 to 53 of gestation. Period 2: from days 30 to 60 of gestation and from day 80 of gestation to farrowing, providing 18 g/day in the first interval and 45 g/day in the second. Period 3: from day 85 of gestation to farrowing, with 24 g/day provided from day 85 until farrowing and 28 g/day from days 85 to 107, increasing to 56 g/day from day 108 until farrowing. Supplementation with 1.0 % L-arginine reduced the percentage of total piglets born and piglets born alive with less than 800 g by 2.26 and 2.05 percentage points, respectively, and increased the percentage of total piglets born and piglets born alive between 1601 and 1800 g by 5.89 and 6.08 percentage points, respectively. Supplementing with 1.0 % L-arginine improves litter uniformity, with an average reduction of 4.06 percentage points in the piglet population of less than 1180 g and an increase of 4.70 percentage points in the piglet population of 1180 to 1890 g.
The Graded Response Model (GRM; Samejima, Estimation of ability using a response pattern of graded scores, Psychometric Monograph No. 17, Richmond, VA: The Psychometric Society, 1969) can be derived by assuming a linear regression of a continuous variable, Z, on the trait, θ, to underlie the ordinal item scores (Takane & de Leeuw in Psychometrika, 52:393–408, 1987). Traditionally, a normal distribution is specified for Z, implying homoscedastic error variances and a normally distributed θ. In this paper, we present the Heteroscedastic GRM with Skewed Latent Trait, which extends the traditional GRM by incorporating heteroscedastic error variances and a skew-normal latent trait. An appealing property of the extended GRM is that it includes the traditional GRM as a special case. This enables specific tests on the normality assumption of Z. We show how violations of normality in Z can lead to asymmetrical category response functions. The ability to test this normality assumption is beneficial from both a statistical and a substantive perspective. In a simulation study, we show the viability of the model and investigate the specificity of the effects. We apply the model to a dataset on affect and a dataset on alexithymia.
A unidimensional latent trait model for continuous ratings is developed. This model is an extension of Andrich's rating formulation which assumes that the response process at latent thresholds is governed by the dichotomous Rasch model. Item characteristic functions and information functions are used to illustrate that the model takes ceiling and floor effects into account. Both the dichotomous Rasch model and a linear model with normally distributed error can be derived as limiting cases. The separability of the structural and incidental parameters is demonstrated and a procedure for estimating the parameters is outlined.
Under consideration is a test battery of binary items. The responses of n individuals are assumed to follow a Rasch model. It is further assumed that the latent individual parameters are distributed within a given population in accordance with a normal distribution. Methods are then considered for estimating the mean and variance of this latent population distribution. Also considered are methods for checking whether a normal population distribution fits the data. The developed methods are applied to data from an achievement test and from an attitude test.
Chapter 2 discusses the different graphic techniques for describing data. These include bar graphs, histograms, and frequency polygons, as well as shapes and patterns of distributions. Raw scores are often arranged into frequency distributions: orderly arrangements of intervals of a given size, each containing a count of the scores that fall within it. Frequency distributions organize data in ways that can be viewed and interpreted by investigators. Constructing intervals of specified sizes to encompass the scores in a distribution enables investigators to capture features that suggest particular statistical techniques for data analysis. Large amounts of data collected from questionnaires, observations, or experiments can be summarized by a simple chart or graph. The value of such graphic representations for making data meaningful, relevant, and understandable cannot be overstated.
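The construction of a frequency distribution described above can be sketched as follows; the function name, interval size, and scores are our illustrative assumptions.

```python
from collections import Counter

def frequency_distribution(scores, interval_size, low):
    """Group raw scores into intervals [low, low+size), [low+size, low+2*size), ...
    and count how many scores fall in each interval."""
    bins = Counter((score - low) // interval_size for score in scores)
    return {(low + k * interval_size, low + (k + 1) * interval_size): n
            for k, n in sorted(bins.items())}

scores = [52, 55, 61, 64, 68, 70, 71, 75, 79, 83, 88, 90]
table = frequency_distribution(scores, interval_size=10, low=50)
# the interval [70, 80) contains the four scores 70, 71, 75, 79
```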
Students are introduced to the logic, foundations, and basics of statistical inference. The need for samples is discussed first, followed by how samples can be used to make inferences about the larger population. The normal distribution is then discussed, along with Z-scores, to illustrate basic probability and the logic of statistical significance.
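A minimal sketch of Z-scores and the normal-probability logic mentioned above, using only Python's standard library; the function name and data are ours, not the chapter's.

```python
from statistics import NormalDist, mean, stdev

def z_scores(sample):
    """Standardize each observation: z = (x - mean) / standard deviation."""
    m, s = mean(sample), stdev(sample)
    return [(x - m) / s for x in sample]

# Probability statements follow from the standard normal distribution:
# about 95% of values lie within |z| < 1.96.
p_within = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)
```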
This chapter covers the two topics of descriptive statistics and the normal distribution. We first discuss the role of descriptive statistics and the measures of central tendency, variance, and standard deviation. We also provide examples of the kinds of graphs often used in descriptive statistics. We next discuss the normal distribution, its properties, and its role in descriptive and inferential statistical analysis.
This chapter reviews some essential concepts of probability and statistics, including: line plots, histograms, scatter plots, mean, median, quantiles, variance, random variables, probability density functions, expectation of a random variable, covariance and correlation, independence, the normal distribution (also known as the Gaussian distribution), and the chi-square distribution. These concepts provide the foundation for the statistical methods discussed in the rest of this book.
Let $k\geqslant 1$ be a natural number and $\omega _k(n)$ denote the number of distinct prime factors of a natural number n with multiplicity k. We estimate the first and second moments of the functions $\omega _k$ with $k\geqslant 1$. Moreover, we prove that the function $\omega _1(n)$ has normal order $\log \log n$ and the function $(\omega _1(n)-\log \log n)/\sqrt {\log \log n}$ has a normal distribution. Finally, we prove that the functions $\omega _k(n)$ with $k\geqslant 2$ do not have normal order $F(n)$ for any nondecreasing nonnegative function F.
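A small empirical illustration of these ideas (ours, not part of the paper): trial division computes $\omega_k(n)$, and the sample mean of $\omega_1(n)$ for $n \leqslant N$ can be compared with $\log \log N$, its normal order.

```python
from math import log

def omega_k(n, k):
    """Number of distinct primes dividing n with multiplicity exactly k."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            m = 0
            while n % p == 0:
                n //= p
                m += 1
            if m == k:
                count += 1
        p += 1
    if n > 1 and k == 1:   # any leftover prime factor has multiplicity 1
        count += 1
    return count

# Mean of omega_1(n) over n <= N, compared with its normal order log log N.
N = 20_000
avg = sum(omega_k(n, 1) for n in range(2, N + 1)) / (N - 1)
target = log(log(N))   # both are of the same order of magnitude
```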
No matter how much care is taken during an experiment, or how sophisticated the equipment used, values obtained through measurement are influenced by errors. Errors can be thought of as acting to conceal the true value of the quantity sought through experiment. Random errors cause values obtained through measurement to occur above and below the true value. This chapter considers statistically based methods for dealing with variability in experimental data, such as that caused by random errors. As statistics can be described as the science of assembling, organising, and interpreting numerical data, it is an ideal tool for assisting in the analysis of experimental data.
Log-concave random variables and their various properties play an increasingly important role in probability, statistics, and other fields. For a distribution F, denote by 𝒟F the set of distributions G such that the convolution of F and G has a log-concave probability mass function or probability density function. In this paper, we investigate sufficient and necessary conditions under which 𝒟F ⊆ 𝒟G, where F and G belong to a parametric family of distributions. Both discrete and continuous settings are considered.
Numerous papers have investigated the distribution of birth weight. This interest arises from the association between birth weight and the future health of the child. The birth weight distribution commonly differs slightly from the Gaussian distribution and is typically split into two components: a predominant Gaussian distribution and an unspecified ‘residual’ distribution. In this study, we consider birth weight data from the Åland Islands (Finland) for the period 1885–1998. We compare birth weight between males and females and among singletons, twins, and triplets. Our study confirms that, on average, birth weight was highest among singletons, intermediate among twins, and lowest among triplets. A marked difference in mean birth weight between singleton males and females was found. For singletons, the distribution of birth weight differed significantly from the normal distribution, but for twins the normal distribution held.
We propose a novel approach to introducing hypothesis testing into the biology curriculum. Instead of telling students the hypothesis and what kind of data to collect, followed by a rigid recipe for testing the hypothesis with a given test statistic, we ask students to develop a hypothesis and a mathematical model that describes the null hypothesis. Simulation of the model under the null hypothesis allows students to compare their experimental data to what they would expect under the null hypothesis, leading to a much more intuitive understanding of hypothesis testing. This approach has been tested both in the classroom and in faculty workshops, and we provide some suggestions for implementation based on our experiences.
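The simulation idea can be sketched as follows, for a hypothetical "is the coin fair?" null model; the function names, counts, and observed value are our illustrative assumptions, not the authors' materials.

```python
import random

def simulate_null(n_flips, n_sims, seed=0):
    """Simulate the null hypothesis 'the coin is fair': the distribution of
    the number of heads in n_flips flips, across n_sims simulated runs."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(n_flips))
            for _ in range(n_sims)]

def p_value(observed_heads, null_draws):
    """One-sided p-value: the fraction of simulated runs at least as
    extreme as the observed count."""
    return sum(d >= observed_heads for d in null_draws) / len(null_draws)

null = simulate_null(n_flips=100, n_sims=10_000)
p = p_value(observed_heads=65, null_draws=null)  # 65/100 heads observed
```

Students compare their observed statistic directly against the simulated null distribution, rather than consulting a table for a prescribed test statistic.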
This paper presents new Gaussian approximations for the cumulative distribution function P(Aλ ≤ s) of a Poisson random variable Aλ with mean λ. Using an integral transformation, we first bring the Poisson distribution into quasi-Gaussian form, which permits evaluation in terms of the normal distribution function Φ. The quasi-Gaussian form contains an implicitly defined function y, which is closely related to the Lambert W-function. A detailed analysis of y leads to a powerful asymptotic expansion and sharp bounds on P(Aλ ≤ s). The results for P(Aλ ≤ s) differ from most classical results related to the central limit theorem in that the leading term Φ(β) is replaced by Φ(α), where α is a simple function of s that converges to β as s tends to ∞. Changing β into α turns out to increase precision for small and moderately large values of s. The results for P(Aλ ≤ s) lead to similar results related to the Erlang B formula. The asymptotic expansion for Erlang's B formula is shown to give rise to accurate approximations; the bounds obtained appear to be the sharpest in the literature thus far.
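As a rough illustration of the kind of approximation discussed (not the paper's refined Φ(α) term, whose definition is not reproduced here), the following sketch compares the exact Poisson CDF with a classical continuity-corrected CLT term; the function names are ours.

```python
from math import exp
from statistics import NormalDist

def poisson_cdf(s, lam):
    """Exact P(A_lambda <= s) by summing the Poisson pmf with the
    recurrence pmf(k+1) = pmf(k) * lam / (k+1)."""
    pmf, total = exp(-lam), 0.0
    for k in range(s + 1):
        total += pmf
        pmf *= lam / (k + 1)
    return total

def clt_approx(s, lam):
    """Classical leading term: Phi evaluated at a continuity-corrected
    standardized argument.  The paper's Phi(alpha) sharpens this further."""
    return NormalDist().cdf((s + 0.5 - lam) / lam ** 0.5)

exact = poisson_cdf(100, 100.0)
approx = clt_approx(100, 100.0)
```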
The target measure μ is the distribution of a random vector in a box ℬ, a Cartesian product of bounded intervals. The Gibbs sampler is a Markov chain with invariant measure μ. A ‘coupling from the past’ construction of the Gibbs sampler is used to show ergodicity of the dynamics and to perfectly simulate μ. An algorithm to sample vectors with multinormal distribution truncated to ℬ is then implemented.
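A plain Gibbs sampler for a bivariate normal truncated to a box can be sketched as follows. This omits the coupling-from-the-past construction that makes the paper's simulation perfect, and all names and parameter values are our illustrative assumptions.

```python
import random
from statistics import NormalDist

def truncated_normal(rng, mu, sigma, a, b):
    """Inverse-CDF draw from N(mu, sigma^2) truncated to [a, b]."""
    nd = NormalDist(mu, sigma)
    u = rng.uniform(nd.cdf(a), nd.cdf(b))
    x = nd.inv_cdf(u)
    return min(max(x, a), b)   # guard against floating-point round-off

def gibbs_truncated_bivariate(rho, box, n_iter, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho,
    truncated to the box [a1,b1] x [a2,b2].  Each coordinate is resampled
    from its full conditional N(rho * other, 1 - rho^2), truncated."""
    rng = random.Random(seed)
    (a1, b1), (a2, b2) = box
    s = (1 - rho ** 2) ** 0.5
    x, y = (a1 + b1) / 2, (a2 + b2) / 2   # start inside the box
    samples = []
    for _ in range(n_iter):
        x = truncated_normal(rng, rho * y, s, a1, b1)
        y = truncated_normal(rng, rho * x, s, a2, b2)
        samples.append((x, y))
    return samples

draws = gibbs_truncated_bivariate(rho=0.5, box=((0.0, 1.0), (0.0, 2.0)),
                                  n_iter=2000)
```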
The convex hull of n independent random points in ℝd, chosen according to the normal distribution, is called a Gaussian polytope. Estimates for the variance of the number of i-faces and for the variance of the ith intrinsic volume of a Gaussian polytope in ℝd, d∈ℕ, are established by means of the Efron-Stein jackknife inequality and a new formula of Blaschke-Petkantschin type. These estimates imply laws of large numbers for the number of i-faces and for the ith intrinsic volume of a Gaussian polytope as n→∞.
Let F be a probability distribution function with density f. We assume that (a) F has finite moments of every positive integer order, and (b) the classical problem of moments for F has a nonunique solution (F is M-indeterminate). Our goal is to describe a Stieltjes class S = S(f, h) = {fε : fε(x) = f(x)[1 + εh(x)], ε ∈ [−1, 1]}, where h is a ‘small’ perturbation function. Such a class S consists of different distributions Fε (fε is the density of Fε) all sharing the same moments as those of F, thus illustrating the nonuniqueness of F, and of any Fε, in terms of the moments. Power transformations of distributions such as the normal, log-normal, and exponential are considered, and for them Stieltjes classes are written explicitly. We define a characteristic of S called an index of dissimilarity and calculate its value in some cases. A new Stieltjes class involving a power of the normal distribution is presented. An open question about the inverse Gaussian distribution is formulated. Related topics are briefly discussed.