Wolfgang von der Linden, Technische Universität Graz, Austria; Volker Dose, Max-Planck-Institut für Plasmaphysik, Garching, Germany; Udo von Toussaint, Max-Planck-Institut für Plasmaphysik, Garching, Germany
It is the common experience of all experimental scientists that repeated measurements of supposedly one and the same quantity occasionally yield data in striking disagreement with all the others. Such data are usually called outliers, and there are numerous conceivable reasons for them. Formally, we may consider a sequence of measurements d_j with confidence limits d_j ± σ_j. If the distance |d_j − d_k| between any two data points d_j and d_k becomes larger than the combined error (σ_j² + σ_k²)^{1/2}, the data start to become inconsistent, and eventually at least one of them must be regarded as an outlier. The terms ‘inconsistent’ and ‘outlier’ have no strict meaning, and it is exactly this lack of uniqueness in the definition which enforces a treatment by probability theory. We shall consider two different cases. Inconsistency of the data may result from a wrong estimate of the measurement error σ_k; Carl Friedrich Gauss was already concerned about measurement uncertainties and stated that ‘the variances of the measurements are practically never known exactly’ [81]. Inconsistency may also arise from measurements distorted by signals from some unrecognized spurious source, leading to ‘strong’ deviations of d_k from the mainstream. We shall begin with the first case.
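The pairwise consistency criterion above can be sketched in a few lines of code. This is an illustrative check only (the data values are invented), comparing |d_j − d_k| against the errors combined in quadrature:

```python
import math

def consistent(d_j, sigma_j, d_k, sigma_k):
    """True if two measurements agree within their combined 1-sigma error,
    i.e. |d_j - d_k| <= sqrt(sigma_j**2 + sigma_k**2)."""
    return abs(d_j - d_k) <= math.sqrt(sigma_j**2 + sigma_k**2)

# Invented measurements of "the same" quantity with 1-sigma errors;
# the last value is deliberately discrepant.
data = [(4.9, 0.2), (5.1, 0.3), (7.8, 0.25)]

for i in range(len(data)):
    for j in range(i + 1, len(data)):
        d_i, s_i = data[i]
        d_j, s_j = data[j]
        print(f"d{i} vs d{j}: consistent = {consistent(d_i, s_i, d_j, s_j)}")
```

As the text stresses, such a hard threshold is only a heuristic; the probabilistic treatment developed below replaces it with a graded assessment.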
Erroneously measured uncertainties
In the preceding sections we have boldly assumed that the standard deviation σi of the error distribution for the measured quantity di is known exactly. This assumption almost never applies.
Regression is a technique for describing how a response variable y varies with the values of so-called input variables x. One distinguishes ‘simple regression’, with only one input variable x, from ‘multiple regression’, with many input variables x. Predictions are based on a model function y = f(x|a) that depends on the model parameters a. At the heart of regression analysis lies the determination of the parameters a, either because they bear a direct (physical) meaning or because they are used along with the model function to make predictions. The reader not familiar with the general ideas of parameter estimation may want to read Part III [p. 227] first. In the literature on frequentist statistics, regression analysis is generally based on the assumption that the measured values of the response variables are independently and normally distributed with equal noise levels. Regression analysis in frequentist statistics then boils down to fitting the model parameters such that the sum of the squared deviations between model and data is minimized. A widespread application is the linear regression model, where the function f is linear in the variables x and the parameters a.
In the Bayesian framework, there is no need for any restrictions. Here we will deal with the general problem of inferring the parameters a of an arbitrary model function f(x∣a). In order to cover the bulk of applications we will restrict the following studies to Gaussian errors.
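The least-squares procedure described above can be illustrated for the simplest case, a straight line y = a0 + a1·x with independent Gaussian errors of equal size. The closed-form estimates below follow from setting the derivatives of the sum of squared residuals to zero; the data values are invented for illustration:

```python
def fit_line(x, y):
    """Least-squares fit of y = a0 + a1*x, assuming independent,
    normally distributed errors of equal size.

    Minimizing sum_i (y_i - a0 - a1*x_i)**2 yields the standard
    closed-form estimates for slope and intercept."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    a1 = sxy / sxx          # slope
    a0 = ybar - a1 * xbar   # intercept
    return a0, a1

# Noise-free illustrative data on the line y = 1 + 2x are recovered exactly
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
print(fit_line(x, y))  # -> (1.0, 2.0)
```

The Bayesian treatment developed in this chapter generalizes this: the same Gaussian likelihood appears, but priors and marginalization lift the restrictions of the frequentist recipe.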
The present book is comprehensive and application-oriented, written by physicists with the emphasis on physics-related topics. However, the general concepts, ideas and numerical techniques presented here are not restricted to physics but are equally applicable to all natural sciences, as well as to engineering.
Physics is a fairly expansive discipline in the natural sciences, both financially and intellectually. Considerable effort and financial means go into the planning, design and operation of modern physics experiments. Disappointingly little attention is usually paid to the analysis of the collected data, which hardly ever goes beyond the 200-year-old method of least squares. A possible reason for this imbalance of effort lies in the problems which physicists encounter with traditional frequentist statistics. The great statistician G. E. Box made this point as early as 1962: ‘I believe, for instance, that it would be very difficult to persuade an intelligent physicist that current statistical practice was sensible, but there would be much less difficulty with an approach via likelihood and Bayes' theorem.’ This citation describes fairly precisely the adventure we have experienced with growing enthusiasm during the last 20 years. Bayesian reasoning is nothing but common physicists' logic, expressed, however, in a rigorous and consistent mathematical form. Data analysis without a proper background in probability theory and statistics is like performing an experiment without knowing what the electronic devices are good for and how they are used properly.
Probability theory has a long, eventful, and still not fully settled history. As pointed out in [63]: ‘For all human history, people have invented methods for coming to terms with the seemingly unpredictable vicissitudes of existence … Oracles, amulets, and incantations belonged to the indispensable techniques for interpreting and influencing the fate of communities and individuals alike … In the place of superstition there was to be calculation — a project aiming at nothing less than the rationalization of fortune. From that moment on, there was no more talk of fortune but instead of this atrophied cousin: chance.’
The only consistent mathematical way to handle chance, or rather probability, is provided by the rules of (Bayesian) probability theory. But what does the notion ‘probability’ really mean? Although it might appear obvious at first sight, it actually has different connotations and definitions, which will be discussed in the following sections.
For the sake of a smooth introduction to probability theory, we will forego a closer definition of some technical terms, as long as their colloquial meaning suffices for understanding the concepts. A precise definition of these terms will be given in a later section.
Classical definition of ‘probability’
The first quantitative definition of the term ‘probability’ appears in the work of Blaise Pascal (1623–1662) and Pierre de Fermat (1601–1665). Antoine Gombauld, Chevalier de Méré, Sieur de Baussay (1607–1685) pointed out to them that ‘…mathematics does not apply to real life’.
As outlined earlier, in frequentist reasoning there is no such thing as a probability of or for a hypothesis, as the latter is not a random variable. The basic concept of hypothesis tests in frequentist theory has been introduced briefly in Section 2.3 [p. 28]. The key idea is fairly simple, though not entirely obvious, and we will start with a transparent example.
Introduction
Let us consider a basic example of quality control. A manufacturer sells components of electronic devices. In order to verify that the production line is working properly, he or she periodically takes samples and checks whether all electronic features are correct. If this is the case the component is called intact, otherwise defective. Let the number of elements per test (sample size) be N and the number of defective components be denoted by n. By virtue of an agreement with the clients, the percentage of defective components should not exceed a given threshold q. For the manufacturer, the production line is operating optimally when the mean number of defective parts in the sample is μ = qN. If there are more defective components, the manufacturer has to pay a penalty, and if there are fewer, the production line could be modified in one way or another to become more cost-effective.
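With each component independently defective with probability q, the number of defectives n in a sample of N follows a binomial distribution, with mean μ = qN. A small sketch makes this concrete; the values N = 100 and q = 0.05 are illustrative only:

```python
from math import comb

def binom_pmf(n, N, q):
    """Probability of exactly n defective parts in a sample of N,
    when each part is independently defective with probability q."""
    return comb(N, n) * q**n * (1 - q)**(N - n)

N, q = 100, 0.05            # illustrative sample size and agreed defect rate
mu = q * N                  # expected number of defectives: 5.0

# Probability of seeing more than 10 defectives if the line is on target
tail = sum(binom_pmf(n, N, q) for n in range(11, N + 1))
print(f"mean = {mu}, P(n > 10 | q = {q}) = {tail:.4f}")
```

A frequentist hypothesis test of the kind discussed below asks precisely this sort of question: how improbable is the observed count n under the hypothesis that the defect rate is q?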