In Chapter 20 we discussed how to guess the outcome of a binary random variable. We now extend the discussion to random variables that take on more than two (but still finitely many) values. Statisticians call this problem “multi-hypothesis testing” to indicate that there may be more than two hypotheses. Rather than using H, we now denote the random variable whose outcome we wish to guess by M. (In Chapter 20 we used H for “hypothesis”; now we use M for “message.”) We denote the number of possible values that M can take by $\mathsf{M}$ and assume that $\mathsf{M} \geq 2$. (The case $\mathsf{M} = 2$ corresponds to binary hypothesis testing.) As before, the “labels” are not important and there is no loss in generality in assuming that M takes values in the set $\mathcal{M} = \{1, \ldots, \mathsf{M}\}$. (In the binary case we used the traditional labels 0 and 1, but now we prefer $1, 2, \ldots, \mathsf{M}$.)
The Setup
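A random variable $M$ takes values in the set $\mathcal{M} = \{1, \ldots, \mathsf{M}\}$, where $\mathsf{M} \geq 2$, according to the prior
$$\pi_m = \Pr[M = m], \quad m \in \mathcal{M},$$
where
$$\pi_m \geq 0, \quad m \in \mathcal{M},$$
and where
$$\sum_{m \in \mathcal{M}} \pi_m = 1.$$
We say that the prior is nondegenerate if
$$\pi_m > 0, \quad m \in \mathcal{M},$$
with the inequalities being strict, so $M$ can take on any value in $\mathcal{M}$ with positive probability. We say that the prior is uniform if
$$\pi_1 = \cdots = \pi_{\mathsf{M}} = \frac{1}{\mathsf{M}}.$$
The observation is a random vector $\mathbf{Y}$ taking values in $\mathbb{R}^d$. We assume that for each $m \in \mathcal{M}$ the distribution of $\mathbf{Y}$ conditional on $M = m$ has the density
$$f_{\mathbf{Y}|M=m}(\cdot), \quad m \in \mathcal{M},$$
where $f_{\mathbf{Y}|M=m}(\cdot)$ is a nonnegative Borel measurable function that integrates to one over $\mathbb{R}^d$.
A guessing rule is a Borel measurable function $\phi_{\text{Guess}} \colon \mathbb{R}^d \to \mathcal{M}$ from the space of possible observations $\mathbb{R}^d$ to the set of possible messages $\mathcal{M}$. We think about $\phi_{\text{Guess}}(\mathbf{y}_{\text{obs}})$ as the guess we form after observing that $\mathbf{Y} = \mathbf{y}_{\text{obs}}$. The error probability associated with the guessing rule $\phi_{\text{Guess}}(\cdot)$ is given by
$$\Pr(\text{error}) = \Pr\bigl[\phi_{\text{Guess}}(\mathbf{Y}) \neq M\bigr].$$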
Note that two sources of randomness determine whether we err or not: the realization of M and the generation of Y conditional on that realization. A guessing rule is said to be optimal if no other guessing rule achieves a lower probability of error.
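To make the two sources of randomness concrete, the following is a minimal Python sketch that estimates the error probability of one particular guessing rule by simulation. Everything specific in it is an assumption made for illustration and does not come from the text: three messages, the prior (0.5, 0.3, 0.2), scalar observations (d = 1) with unit-variance Gaussian conditional densities centered at -2, 0, and 2, and a "nearest mean" rule. That rule is just one admissible guessing rule; no claim is made that it is optimal for this prior.

```python
import math
import random

# Hypothetical example values (not from the text).
MESSAGES = [1, 2, 3]            # the set of possible messages {1, ..., M}
PRIOR = [0.5, 0.3, 0.2]         # nonnegative and summing to one
MEANS = [-2.0, 0.0, 2.0]        # mean of Y conditional on M = m (variance 1)


def nearest_mean_guess(y_obs):
    """An illustrative guessing rule: the message whose mean is closest to y_obs."""
    return min(MESSAGES, key=lambda m: abs(y_obs - MEANS[m - 1]))


def estimate_error_probability(guess, trials, rng):
    """Monte Carlo estimate of the probability that guess(Y) differs from M."""
    errors = 0
    for _ in range(trials):
        # First source of randomness: the realization of M, drawn from the prior.
        m = rng.choices(MESSAGES, weights=PRIOR)[0]
        # Second source: the generation of Y conditional on M = m.
        y = rng.gauss(MEANS[m - 1], 1.0)
        if guess(y) != m:
            errors += 1
    return errors / trials


def q_function(x):
    """Probability that a standard Gaussian exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exact_nearest_mean_error():
    """Closed-form error probability of the nearest-mean rule in this example.

    The rule's decision thresholds are -1 and 1, so the outer messages err
    with probability Q(1) and the middle message with probability 2 Q(1).
    """
    return (PRIOR[0] + 2.0 * PRIOR[1] + PRIOR[2]) * q_function(1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    print("simulated:", estimate_error_probability(nearest_mean_guess, 100_000, rng))
    print("exact:    ", exact_nearest_mean_error())
```

Any other function from observations to messages can be passed in place of `nearest_mean_guess`, which makes it easy to compare candidate rules by their estimated error probabilities.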
In Digital Communications the task of the receiver is to observe the channel outputs and to use these observations to accurately guess the data bits that were sent by the transmitter, i.e., the data bits that were fed to the modulator. Ideally, the guessing would be perfect, i.e., the receiver would make no errors. This, alas, is typically impossible because of the distortions and noise that the channel introduces. Indeed, while one can usually recover the data bits from the transmitted waveform (provided that the modulator is a one-to-one mapping), the receiver has no access to the transmitted waveform but only to the received waveform. And since the latter is typically a noisy version of the former, some errors are usually unavoidable.
In this chapter we shall begin our study of how to guess intelligently, i.e., how, given the channel output, one should guess the data bits with as low a probability of error as possible. This study will help us not only in the design of receivers but also in the design of modulators that allow for reliable decoding from the channel's output.
In the engineering literature the process of guessing the data bits based on the channel output is called “decoding.” In the statistics literature this process is called “hypothesis testing.” We like “guessing” because it demystifies the process.
In most applications the channel output is a continuous-time waveform and we seek to decode a large number of bits. Nevertheless, for pedagogical reasons, we shall begin our study with the simpler case where we wish to decode only a single data bit. This corresponds in the statistics literature to “binary hypothesis testing,” where the term “binary” reminds us that in this guessing problem there are only two alternatives. Moreover, we shall assume that the observation, rather than being a continuous-time waveform, is a vector or a scalar. In fact, we shall begin our study with the simplest case where there are no observations at all.
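As a small illustration of the no-observation case, the sketch below guesses a single bit from its prior alone. The function name and the parameter names `pi0` and `pi1` (the prior probabilities of the labels 0 and 1) are assumptions introduced here. With nothing observed, any guess is a constant, so the best one can do is pick the more likely value, and the resulting error probability is the smaller of the two prior probabilities.

```python
import math


def guess_without_observation(pi0, pi1):
    """Guess a binary random variable when nothing is observed.

    pi0 and pi1 are the (hypothetical) prior probabilities of the values
    0 and 1. Returns the guess and its probability of error.
    """
    if pi0 < 0 or pi1 < 0 or not math.isclose(pi0 + pi1, 1.0):
        raise ValueError("prior must be nonnegative and sum to one")
    # Guessing the more likely value errs exactly when the less likely
    # value occurs; ties are broken in favor of 0.
    guess = 0 if pi0 >= pi1 else 1
    error_probability = min(pi0, pi1)
    return guess, error_probability
```

For example, `guess_without_observation(0.7, 0.3)` guesses 0 and errs with probability 0.3.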
Claude Shannon, the father of Information Theory, described the fundamental problem of point-to-point communications in his classic 1948 paper as “that of reproducing at one point either exactly or approximately a message selected at another point.” How engineers solve this problem is the subject of this book. But unlike Shannon's general problem, where the message can be an image, a sound clip, or a movie, here we restrict ourselves to bits. We thus envision that the original message is either a binary sequence to start with, or else that it was described using bits by a device outside our control and that our job is to reproduce the describing bits with high reliability. The issue of how images or text files are converted efficiently into bits is the subject of lossy and lossless data compression and is addressed in texts on information theory and on quantization.
The engineering solutions to the point-to-point communication problem greatly depend on the available resources and on the channel between the points. They typically bring together beautiful techniques from Fourier Analysis, Hilbert Spaces, Probability Theory, and Decision Theory. The purpose of this book is to introduce the reader to these techniques and to their interplay.
The book is intended for advanced undergraduates and beginning graduate students. The key prerequisites are basic courses in Calculus, Linear Algebra, and Probability Theory. A course in Linear Systems is a plus but not a must, because all the results from Linear Systems that are needed for this book are summarized in Chapters 5 and 6. But more importantly, the book requires a certain mathematical maturity and patience, because we begin with first principles and develop the theory before discussing its engineering applications. The book is for those who appreciate the views along the way as much as getting to the destination; who like to “stop and smell the roses;” and who prefer fundamentals to acronyms. I firmly believe that those with a sound foundation can easily pick up the acronyms and learn the jargon on the job, but that once one leaves the academic environment, one rarely has the time or peace of mind to study fundamentals.