This book is written as a guide for the presentation of experimental data including a consistent treatment of experimental errors and inaccuracies. It is meant for experimentalists in physics, astronomy, chemistry, life sciences and engineering. However, it can be equally useful for theoreticians who produce simulation data: they are often confronted with statistical data analysis for which the same methods apply as for the analysis of experimental data. The emphasis in this book is on the determination of best estimates for the values and inaccuracies of parameters in a theory, given experimental data. This is the problem area encountered by most physical scientists and engineers. The problem area of experimental design and hypothesis testing – excellently covered by many textbooks – is only touched on but not treated in this book.
The text can be used in education on error analysis, either in conjunction with experimental classes or in separate courses on data analysis and presentation. It is written in such a way – by including examples and exercises – that most students will be able to acquire the necessary knowledge through self-study as well. The book is also meant to be kept for later reference in practical applications. For this purpose a set of “data sheets” and a number of useful computer programs are included.
This book consists of several parts. Part I contains the main body of the text.
This chapter is about the presentation of experimental results. When the value of a physical quantity is reported, the uncertainty in the value must be properly reported too, and it must be clear to the reader what kind of uncertainty is meant and how it has been estimated. Given the uncertainty, the value must be reported with the proper number of digits. But the quantity also has a unit that must be reported according to international standards. Thus this chapter is about reporting your results: this is the last thing you do, but we'll make it the first chapter before more serious matters require attention.
How to report a series of measurements
In most cases you derive a result on the basis of a series of (similar) measurements. In general you do not report all individual outcomes of the measurements, but you report the best estimates of the quantity you wish to “measure,” based on the experimental data and on the model you use to derive the required quantity from the data. In fact, you use a data reduction method. In a publication you are required to be explicit about the method used to derive the end result from the data. However, in certain cases you may also choose to report details of the data themselves (preferably in an appendix or deposited as “additional material”); this enables the reader to check your results or apply alternative data reduction methods.
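As a minimal sketch of such a data reduction (the readings below are invented and the formatting is just one reasonable choice, not the book's own code), the mean of a series of measurements and the standard error of that mean can be computed and reported with a matching number of digits:

```python
import numpy as np

# Invented repeated readings of one quantity (arbitrary units), for illustration only.
y = np.array([10.2, 10.5, 9.8, 10.1, 10.4, 9.9, 10.3, 10.0])

n = len(y)
mean = y.mean()            # best estimate of the measured quantity
s = y.std(ddof=1)          # sample standard deviation
sem = s / np.sqrt(n)       # standard error (inaccuracy) of the mean

# Report the value and its uncertainty with a matching number of digits.
print("result = %.2f +/- %.2f  (n = %d)" % (mean, sem, n))
```

The number of digits kept in the printed result should match the size of the uncertainty, as discussed above.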
This appendix contains programs, functions or code fragments written in Python. Each code is referred to in the text; the page where the reference is made is given in the header.
First some general instructions are given on how to work with these codes. Python is a general-purpose interpreted language; it is open source, and interpreters are freely available for most platforms, including Windows. Most applications in this book use a powerful numerical array extension, NumPy, which also provides basic tools for linear algebra, Fourier transforms and random numbers. Although Python version 3 is available, at the time of writing NumPy requires Python version 2, the latest being 2.6. In addition, applications may require the scientific tools library SciPy, which relies on NumPy. Importing SciPy automatically implies the import of NumPy.
Users are advised first to download Python 2.6, then the most recent stable version of NumPy, and then SciPy. Further instructions for Windows users can be found at www.hjcb.nl/python.
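Assuming the installation described above has succeeded, a quick sanity check (not part of the book's own code) is to import both libraries and print their version numbers:

```python
# Quick sanity check: import the numerical libraries and report their versions.
import numpy
import scipy

print("NumPy version: %s" % numpy.__version__)
print("SciPy version: %s" % scipy.__version__)
```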
There are several options for producing plots, for example Gnuplot.py, based on the gnuplot package, or rpy, based on the statistical package “R.” There are many more. Since the user may find it difficult to make a choice, we have added yet another, very simple to use, plotting module called plotsvg.py. It can be downloaded from the author's website.
If you want to fit parameters in a functional relation to experimental data, the best method is a least-squares analysis: Find the parameters that minimize the sum of squared deviations of the measured values from the values predicted by your function. In this chapter both linear and nonlinear least-squares fits are considered. It is explained how you can test the validity or effectiveness of the fit and how you can determine the expected inaccuracies in the optimal values of the parameters.
Introduction
Consider the following task: you wish to devise a function y = f(x) that fits a number of data points (xi, yi), i = 1, …, n, as accurately as possible. Usually you have – based on theoretical considerations – a set of functions to choose from, and those functions may still contain one or more as yet undetermined parameters. In order to select the “best” function and parameters you must use some kind of measure for the deviation of the data points from the function. If this deviation measure is a single value, you can then select the function and parameters that minimize this deviation.
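As a concrete illustration, the following sketch takes the sum of squared deviations as the deviation measure and fits a straight line to a handful of invented data points using scipy.optimize.curve_fit; this is just one of several possible tools, not the method developed in this chapter.

```python
import numpy as np
from scipy.optimize import curve_fit

# Invented data points (xi, yi), for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def f(x, a, b):
    # Trial function with two adjustable parameters a and b.
    return a * x + b

# curve_fit adjusts a and b to minimize the sum of squared deviations.
popt, pcov = curve_fit(f, x, y)
perr = np.sqrt(np.diag(pcov))    # standard inaccuracies of the parameters

print("a = %.3f +/- %.3f" % (popt[0], perr[0]))
print("b = %.3f +/- %.3f" % (popt[1], perr[1]))
```

The diagonal of the covariance matrix returned by curve_fit contains the variances of the fitted parameters, from which their standard inaccuracies follow.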
This task is not at all straightforward and you may be lured into pitfalls during the process. For example, your choice of functions and parameters may be so large and your set of data may be so small that you can choose a function that exactly fits your data.
There are errors and there are uncertainties. The latter are unavoidable; ultimately it is the omnipresent thermal noise that causes the results of measurements to be imprecise. After considering how avoidable errors can be identified and corrected, this chapter concentrates on the propagation and combination of uncertainties in composite functional relations.
Classification of errors
There are several types of error in experimental outcomes:
(i) (accidental, stupid or intended) mistakes
(ii) systematic deviations
(iii) random errors or uncertainties
The first type we shall ignore. Accidental mistakes can be avoided by careful checking and double checking. Stupid mistakes are accidental errors that have been overlooked. Intended mistakes (e.g. selecting data that suit your purpose) purposely mislead the reader and belong to the category of scientific crimes.
Systematic errors
Systematic errors have a non-random character and distort the result of a measurement. They result from erroneous calibration or just from a lack of proper calibration of a measuring instrument, from careless measurements (uncorrected parallax, uncorrected zero-point deviations, time measurements uncorrected for reaction time, etc.), from impurities in materials, or from causes the experimenter is not aware of. The latter are certainly the most dangerous type of error; such errors are likely to show up when results are compared to those of other experimentalists at other laboratories. Therefore independent corroboration of experimental results is required before a critical experiment (e.g. one that overthrows an accepted theory) can be trusted.
It is impossible to measure physical quantities without errors. In most cases errors result from deviations and inaccuracies caused by the measuring apparatus or from the inaccurate reading of the display, but even with optimal instruments and digital displays there are always fluctuations in the measured data. Ultimately there is random thermal noise affecting all quantities that are determined at a finite temperature. Any experimentally determined quantity therefore has a certain inaccuracy. If the experiment were to be repeated, the result would be (slightly) different. One could say that the result of a particular experiment is no more than a random sample from a probability distribution. When reporting the result of an experiment, it is important to also report the extent of the uncertainty, e.g. in terms of the best estimate of some measure of the width of the probability distribution. When experimental data are processed and conclusions are drawn from them, knowledge of the experimental uncertainties is essential to assess the reliability of the conclusions.
Ideally, you should specify the probability distribution from which the reported experimental value is supposed to be a random sample. The problem is that you have only one experiment; even if your experiment consists of many observations of which you report the average, you have only one average to report.
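The following sketch (pure NumPy, with an invented “true” value and noise level, not taken from the book) makes this concrete by simulating many repetitions of an experiment of n observations; the spread of the resulting averages shows the width of the distribution from which a single reported average is only one sample.

```python
import numpy as np

np.random.seed(1)     # fixed seed so the sketch is reproducible

true_value = 5.0      # invented "true" value of the quantity
sigma = 0.3           # invented standard deviation of a single observation
n = 25                # observations per simulated experiment
repeats = 10000       # number of simulated experiments

# Each row is one experiment; each experiment yields one average.
averages = np.random.normal(true_value, sigma, size=(repeats, n)).mean(axis=1)

print("spread of the averages : %.4f" % averages.std(ddof=1))
print("expected sigma/sqrt(n) : %.4f" % (sigma / np.sqrt(n)))
```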
In this chapter the reader is requested to sit back and think. Think about what you are doing and why, and what your conclusions really mean. You have a theory, containing a number of unknown – or insufficiently known – parameters, and you have a set of experimental data. You wish to use the data to validate your theory and to determine or refine the parameters in your theory. Your data contain inaccuracies and whatever you infer from your data contains inaccuracies as well. While the probability distribution of the data, given the theory, is often known or derivable from counting events, the inverse, i.e., the inferred probability distribution of the estimated parameters given the experimental outcome, is of a different, more subjective kind. Scientists who reject any subjective measures must restrict themselves to hypothesis testing. If you want more, turn to Bayes.
Direct and inverse probabilities
Consider the reading of a sensitive digital voltmeter sensing a constant small voltage – say in the microvolt range – during a given time, say 1 millisecond. Repeat the experiment many times. Since the voltmeter itself adds a random noise due to the thermal fluctuations in its input circuit, your observations yi will be samples from a probability distribution f(yi − θ), where θ is the real voltage of the source. You can determine f by collecting many samples.
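As a rough numerical illustration (the voltage, noise level and number of readings below are invented), such repeated readings can be simulated and an estimate of f recovered from their histogram:

```python
import numpy as np

np.random.seed(2)

theta = 3.0           # invented "real" source voltage (microvolt)
noise_sigma = 0.5     # invented width of the voltmeter's thermal noise

# Many repeated readings: each is theta plus a random noise sample.
readings = theta + np.random.normal(0.0, noise_sigma, size=5000)

# A normalized histogram of (yi - theta) estimates the distribution f.
freq, edges = np.histogram(readings - theta, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Print a few points of the estimated f.
for c, h in zip(centers[::8], freq[::8]):
    print("f(%+.2f) ~ %.3f" % (c, h))
```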