Quantization theory deals primarily with continuous-amplitude signals and continuous-amplitude dither. However, within a digital signal processor or a digital computer, both the signal and the dither are represented with finite word length. Examples are digital FIR and IIR filtering, digital control, and numerical calculations. In these cases, intermediate results (e.g., products of numbers), whose amplitude is discrete, have excess bit length, so they must be re-quantized to fit the word length of the memory. Before re-quantization, digital dither may be added to the signal; sometimes this is even necessary to avoid limit cycles and hysteresis (see Fig. J.1, and Exercises 17.10–17.12, page 462).
Another scenario in which the dither is digital arises when the dither is generated within the computer for the quantization of analog signals. In this case, each dither sample is usually produced by a pseudo-random number generator, and a D/A converter converts the number to an analog level that is added to the input of the quantizer before quantization.
In both cases, it is useful to know the properties of the most common digital dithers. Therefore, in this appendix we investigate the properties of digital dither that is to be added to a digital signal before re-quantization.
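As an illustration, the sketch below re-quantizes finely represented samples to a coarser step after adding triangular-PDF (TPDF) digital dither. The function name, step size, and sample values are our own choices for illustration, not taken from the text.

```python
import random

def requantize_with_tpdf_dither(x, q, rng=random.Random(0)):
    """Re-quantize sample x to step size q after adding triangular-PDF
    (TPDF) dither of +/- q peak amplitude.

    TPDF dither is generated here as the sum of two independent uniform
    variables; in a real DSP it would come from a pseudo-random number
    generator with finite word length."""
    dither = (rng.random() - 0.5) * q + (rng.random() - 0.5) * q
    return q * round((x + dither) / q)

# Example: re-quantize fine-grained samples to a coarse step q = 1/16.
q = 1.0 / 16
samples = [0.123456, -0.654321, 0.5, 0.03125]
requantized = [requantize_with_tpdf_dither(x, q) for x in samples]

# Every output lands on the coarse grid (an integer multiple of q).
assert all(abs(y / q - round(y / q)) < 1e-9 for y in requantized)
```

The total error per sample is bounded by the dither amplitude plus half a quantum, i.e. by 1.5q in this sketch.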
QUANTIZATION OF REPRESENTABLE SAMPLES
An interesting approach was presented by Wannamaker, Lipshitz, Vanderkooy, and Wright (2000). They recognized that, in general, no digital dither can completely remove the quantization bias.
Discrete signals are sampled in time and quantized in amplitude. The granularity of such signals, caused by both sampling and quantization, can be analyzed by making use of sampling theory. This chapter reviews sampling theory and develops it in a conventional way for the analysis of sampling in time and for the description of sampled signals. Chapter 3 reviews basic statistical theory related to probability density, characteristic function, and moments. Chapter 4 will show how sampling theory and statistical ideas can be used to analyze quantization.
The origins of sampling theory and interpolation theory go back to the work of Cauchy, Borel, Lagrange, Laplace, and Fourier, if not further. We do not have the space here to recount the whole history of sampling, so we will only highlight some major points. For historical details, refer to Higgins (1985), Jerri (1977), and Marks (1991).
The sampling theorem, like many other fundamental theorems, was developed gradually by the giants of science, and it is not easy to determine the exact date of its appearance. Shannon (1949) remarks about its imprecise formulation that “this is a fact which is common knowledge in the communication art.”
According to Higgins (1985), the first statement that is essentially equivalent to the sampling theorem is due to Borel (1897). The most often cited early paper, however, is that of E. T. Whittaker (1915).
Representation of physical quantities in terms of floating-point numbers allows one to cover a very wide dynamic range with a relatively small number of digits. Given this type of representation, roundoff errors are roughly proportional to the amplitude of the represented quantity. In contrast, roundoff errors with uniform quantization are bounded between ±q/2 and are not in any way proportional to the represented quantity.
Floating-point is in most cases so advantageous over fixed-point number representation that it is rapidly becoming ubiquitous. The movement toward usage of floating-point numbers is accelerating as the speed of floating-point calculation is increasing and the cost of implementation is going down. For this reason, it is essential to have a method of analysis for floating-point quantization and floating-point arithmetic.
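The proportionality of floating-point roundoff error to amplitude can be observed directly in IEEE double precision. The sketch below uses Python's `math.ulp`, which returns the spacing between a number and the next representable double; this is a Python-specific illustration, not from the text.

```python
import math

# In IEEE double precision, the spacing between adjacent representable
# numbers (one "ulp") grows with magnitude, so round-to-nearest error,
# at most half an ulp, is roughly proportional to the quantity itself.
for x in [1.0, 1.0e3, 1.0e6, 1.0e9]:
    print(f"x = {x:10g}   ulp = {math.ulp(x):.3e}   relative = {math.ulp(x)/x:.3e}")

assert math.ulp(1.0e9) > math.ulp(1.0e6) > math.ulp(1.0)  # absolute spacing grows
assert 2**-53 < math.ulp(1.0e6) / 1.0e6 <= 2**-52         # relative spacing ~ 2^-52
```

This is exactly the contrast with uniform quantization, where the error bound ±q/2 is the same at every amplitude.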
THE FLOATING-POINT QUANTIZER
Binary numbers have become accepted as the basis for all digital computation. We therefore describe floating-point representation in terms of binary numbers. Other number bases are entirely possible, such as base 10 or base 16, but modern digital hardware is built on the binary base.
We begin by counting with nonnegative binary floating-point numbers as illustrated in Fig. 12.1; the numbers in the table are chosen to provide a simple example. The counting starts with the number 0, represented here by 00000. Each number is multiplied by 2^E, where E is an exponent. Initially, let E = 0. Continuing the count, the next number is 1, represented by 00001, and so forth.
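A toy model of this counting scheme can be sketched as follows; it reads a 5-bit mantissa pattern as a plain binary integer and scales it by 2^E. This is illustrative only and does not reproduce the exact format of Fig. 12.1.

```python
# Toy nonnegative floating-point numbers: a 5-bit mantissa, counted
# upward from 00000, scaled by 2**E.  (Illustrative only; not the
# exact format of Fig. 12.1.)
def toy_float(mantissa_bits: str, E: int) -> int:
    mantissa = int(mantissa_bits, 2)   # read the bit pattern as an integer
    return mantissa * 2**E

# Counting with E = 0 gives consecutive integers ...
counted = [toy_float(format(i, "05b"), 0) for i in range(4)]
assert counted == [0, 1, 2, 3]

# ... while a larger exponent scales the same mantissa pattern.
assert toy_float("00001", 0) == 1
assert toy_float("00001", 3) == 8
```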
Roundoff errors in calculations are often neglected by scientists. The success of the IEEE double precision standard makes most of us think that the precision of a simple personal computer is virtually infinite. Common sense cannot really grasp the meaning of 16 precise decimal digits.
However, roundoff errors can easily destroy the result of a calculation, even if it looks reasonable. Therefore, it is worth investigating them even for IEEE double precision representation.
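Two classic effects are easy to reproduce in IEEE double precision: an addition whose contribution is lost entirely because the gap between adjacent doubles near 10^16 is 2, and accumulated roundoff that makes ten additions of 0.1 differ from 1.0.

```python
# Near 1e16 the spacing of doubles is 2, so adding 1.0 has no effect:
# the exact sum 1e16 + 1 rounds back to 1e16.
lost = (1e16 + 1.0) - 1e16
assert lost == 0.0           # exact arithmetic would give 1.0

# Accumulated roundoff: 0.1 has no exact binary representation, and
# ten additions of it do not sum to exactly 1.0.
s = 0.0
for _ in range(10):
    s += 0.1
assert s != 1.0
assert abs(s - 1.0) < 1e-15  # yet the result "looks reasonable"
```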
COMPARISON TO REFERENCE VALUES
Investigation of roundoff errors is most straightforwardly based on comparison of the results to reference values, that is, to the outcome of an ideally precise calculation.
First of all, we need to note that the method of comparison is not well defined. Comparison is usually done by taking the difference between the finite-precision result and the reference value. This is very reasonable, but it cannot be applied in all cases. If, for example, we investigate a stand-alone resonant system, the reference and finite-precision results could be two similar sine waves with slightly different frequencies. In such a case, the difference between the imprecise and the precise result grows significantly with time, although the two outputs remain very similar. Therefore, the basis of comparison must be chosen very carefully.
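The sine-wave example can be sketched numerically. The frequency offset below is an arbitrary stand-in for the cumulative effect of finite precision in a resonator, chosen only to make the divergence visible.

```python
import math

# Reference output vs. a "finite-precision" output modeled as the same
# sine wave with a slightly offset frequency.
f0, offset = 1.0, 1.0e-4

def reference(t):
    return math.sin(2 * math.pi * f0 * t)

def imprecise(t):
    return math.sin(2 * math.pi * f0 * (1 + offset) * t)

early = max(abs(reference(t) - imprecise(t)) for t in (k * 0.01 for k in range(100)))
late  = max(abs(reference(t) - imprecise(t)) for t in (1000 + k * 0.01 for k in range(100)))

# The raw difference grows by orders of magnitude over time, although
# both outputs remain unit-amplitude sine waves of nearly equal frequency.
assert late > 50 * early
```

This is why the raw difference signal is a poor basis of comparison for such systems.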
When the input to a quantizer is a sampled time series represented by x1, x2, x3, …, the quantization noise is a time series represented by ν1, ν2, ν3, … Suppose that the input time series is stationary and that its statistics satisfy the conditions for multivariable QT II (it would be sufficient that two–variable QT II conditions were satisfied for x1 and x2, x1 and x3, x1 and x4, and so forth, because of stationarity). As such, the quantization noise will be uncorrelated with the quantizer input, and the quantization noise will be white, i.e. uncorrelated over time. The PQN model applies. The autocorrelation function of the quantizer output will be equal to the autocorrelation function of the input plus the autocorrelation function of the quantization noise.
Fig. 20.1(a) is a sketch of an autocorrelation function of a quantizer input signal. Fig. 20.1(b) shows the autocorrelation function of the quantization noise when the PQN model applies. Fig. 20.1(c) shows the corresponding autocorrelation function of the quantizer output.
Corresponding to the autocorrelation functions of Fig. 20.1, the power spectrum of the quantizer output is equal to the power spectrum of the input plus the power spectrum of the quantization noise. The noise spectrum is flat, with a total power of q^2/12.
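These PQN predictions can be checked empirically. The sketch below quantizes zero-mean, unit-variance Gaussian samples with a step size small relative to the standard deviation (all parameters are our own choices), and verifies that the noise power is close to q^2/12 and that the lag-one autocorrelation is negligible.

```python
import random

# Quantize Gaussian samples with step q and test the PQN predictions.
rng = random.Random(1)
q = 0.1
x = [rng.gauss(0.0, 1.0) for _ in range(200000)]
nu = [q * round(v / q) - v for v in x]           # quantization noise

# Noise power is close to q^2/12 ...
mean_sq = sum(v * v for v in nu) / len(nu)
assert abs(mean_sq - q * q / 12) < 0.05 * (q * q / 12)

# ... and the noise is white: lag-one correlation is negligible.
r1 = sum(nu[i] * nu[i + 1] for i in range(len(nu) - 1)) / (len(nu) - 1)
assert abs(r1) < 0.05 * mean_sq
```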
For many years, rumors have been circulating in the realm of digital signal processing about quantization noise:
(a) the noise is additive and white and uncorrelated with the signal being quantized, and
(b) the noise is uniformly distributed between plus and minus half a quantum, giving it zero mean and a mean square of one-twelfth the square of a quantum.
Many successful systems incorporating uniform quantization, with designs based on these rumors, have been built and placed into service worldwide, thereby reinforcing the rumors' veracity. Yet simple reasoning leads one to conclude that:
(a) quantization noise is deterministically related to the signal being quantized and is certainly not independent of it,
(b) the probability density of the noise certainly depends on the probability density of the signal being quantized, and
(c) if the signal being quantized is correlated over time, the noise will certainly have some correlation over time.
In spite of the “simple reasoning,” the rumors are true under most circumstances, or at least true to a very good approximation. When the rumors are true, wonderful things happen:
(a) digital signal processing systems are easy to design, and
(b) systems with quantization that are truly nonlinear behave like linear systems.
In order for the rumors to be true, it is necessary that the signal being quantized obey a quantizing condition. There are in fact several quantizing conditions, all pertaining to the probability density function (PDF) and the characteristic function (CF) of the signal being quantized.
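When such a condition is approximately satisfied (for example, for a sine wave whose amplitude is large compared with q), rumor (b) can be observed directly. A minimal sketch with arbitrarily chosen parameters:

```python
import math

# A sine wave with amplitude A >> q approximately satisfies the
# quantizing conditions; its roundoff noise then behaves as rumor (b)
# claims.  Amplitude, step size, and frequency are arbitrary choices.
q, A = 0.05, 5.0
f = 1 / math.sqrt(2)     # irrational frequency: sample phases equidistribute
x = [A * math.sin(2 * math.pi * f * k) for k in range(100000)]
nu = [q * round(v / q) - v for v in x]

assert max(abs(v) for v in nu) <= q / 2 + 1e-12        # bounded by +/- q/2
assert abs(sum(nu) / len(nu)) < 0.01 * q               # zero mean
mean_sq = sum(v * v for v in nu) / len(nu)
assert abs(mean_sq - q * q / 12) < 0.1 * (q * q / 12)  # mean square ~ q^2/12
```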
We have seen when discussing the sampling theorem in Section 2.2 that the conditions of the theorem are met exactly only if the signal being sampled is perfectly bandlimited. This is rarely the case, since perfect bandlimitedness implies that the signal cannot be time-limited. Such a signal can easily be defined mathematically, but measured signals are always time-limited, so the condition of the sampling theorem can be met only approximately. While the sinc-function wave is theoretically bandlimited, its time-truncated versions are not, so the sampling theorem can be applied only approximately. Nevertheless, sampling theory has proved to be very powerful despite its approximate applicability.
The situation is similar with the quantizing theorems. Bandlimitedness of the CF would imply that the PDF is not amplitude-limited. Since measured signals are always amplitude-limited, the quantizing theorems can be applied only approximately. As with the sampling theorem, this does not prevent the quantizing theorems from being very powerful in many applications.
Most distributions that are applied in practice, like the Gaussian, exponential, or chi-squared, do not have bandlimited characteristic functions. This fact does not prevent the application of the quantizing theorems, provided that the quantum step size is significantly smaller than the standard deviation. Nevertheless, it is of interest to ask whether there are distributions whose characteristic functions are perfectly bandlimited, similarly to the sinc function. In the following paragraphs we discuss some examples of distributions whose CFs are perfectly bandlimited.
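One classical example is the density f(x) = (1 - cos x)/(pi x^2), whose CF is the triangle max(0, 1 - |u|), perfectly bandlimited to |u| <= 1. This is a standard Fourier-transform pair; the sketch below verifies it by brute-force numerical integration (integration limits and step size are our own choices).

```python
import math

# Fejer-kernel density: f(x) = (1 - cos x) / (pi * x^2).  Its CF is the
# triangle max(0, 1 - |u|), i.e. perfectly bandlimited to |u| <= 1.
def f(x):
    if abs(x) < 1e-6:
        return 1.0 / (2.0 * math.pi)   # limit as x -> 0
    return (1.0 - math.cos(x)) / (math.pi * x * x)

def cf(u, X=500.0, h=0.01):
    """Trapezoid approximation of the CF, integral of f(x)*cos(ux) over [-X, X]."""
    n = int(2 * X / h)
    s = 0.0
    for k in range(n + 1):
        x = -X + k * h
        w = 0.5 if k in (0, n) else 1.0
        s += w * f(x) * math.cos(u * x)
    return s * h

assert abs(cf(0.0) - 1.0) < 0.01   # f integrates to 1
assert abs(cf(0.5) - 0.5) < 0.01   # CF equals 1 - |u| inside the band
assert abs(cf(1.5)) < 0.01         # CF vanishes for |u| > 1
```

The slow 1/x^2 decay of f is the price of the perfectly bandlimited CF, mirroring the sinc function's behavior in the sampling-theorem setting.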