Markov chains are the simplest mathematical models for random phenomena evolving in time. Their simple structure makes it possible to say a great deal about their behaviour. At the same time, the class of Markov chains is rich enough to serve in many applications. This makes Markov chains the first and most important examples of random processes. Indeed, the whole of the mathematical study of random processes can be regarded as a generalization in one way or another of the theory of Markov chains.
This book is an account of the elementary theory of Markov chains, with applications. It was conceived as a text for advanced undergraduates or master's level students, and is developed from a course taught to undergraduates for several years. There are no strict prerequisites but it is envisaged that the reader will have taken a course in elementary probability. In particular, measure theory is not a prerequisite.
The first half of the book is based on lecture notes for the undergraduate course. Illustrative examples introduce many of the key ideas. Careful proofs are given throughout. There is a selection of exercises, which forms the basis of classwork done by the students, and which has been tested over several years. Chapter 1 deals with the theory of discrete-time Markov chains, and is the basis of all that follows. You must begin here. The material is quite straightforward and the ideas introduced permeate the whole book.
In the first three chapters we have given an account of the elementary theory of Markov chains. This already covers a great many applications, but is just the beginning of the theory of Markov processes. The further theory inevitably involves more sophisticated techniques which, although having their own interest, can obscure the overall structure. On the other hand, the overall structure is, to a large extent, already present in the elementary theory. We therefore thought it worthwhile to discuss some features of the further theory in the context of simple Markov chains, namely, martingales, potential theory, electrical networks and Brownian motion. The idea is that the Markov chain case serves as a guiding metaphor for more complicated processes. So the reader familiar with Markov chains may find this chapter helpful alongside more general higher-level texts. At the same time, further insight is gained into Markov chains themselves.
Martingales
A martingale is a process whose average value remains constant in a particular strong sense, which we shall make precise shortly. This is a sort of balancing property. Often, the identification of martingales is a crucial step in understanding the evolution of a stochastic process.
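This balancing property is easy to see numerically. The sketch below (not an example from the text) simulates many paths of a symmetric random walk, the simplest martingale: each step is +1 or −1 with equal probability, so the average value stays constant at its starting point.

```python
import random

# Simulate one path of a symmetric random walk started at 0.
def walk(steps, rng):
    x, path = 0, [0]
    for _ in range(steps):
        x += rng.choice((-1, 1))
        path.append(x)
    return path

rng = random.Random(1)
paths = [walk(20, rng) for _ in range(20000)]

# The mean across paths stays (approximately) constant at 0 for every n,
# illustrating the martingale property E[X_n] = X_0.
for n in (0, 5, 10, 20):
    mean_n = sum(p[n] for p in paths) / len(paths)
    print(n, round(mean_n, 3))
```

Individual paths wander widely, of course; it is only the conditional average that is pinned down.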
Applications of Markov chains arise in many different areas. Some have already appeared to illustrate the theory, from games of chance to the evolution of populations, from calculating the fair price for a random reward to calculating the probability that an absent-minded professor is caught without an umbrella. In a real-world problem involving random processes you should always look for Markov chains. They are often easy to spot. Once a Markov chain is identified, there is a qualitative theory which limits the sorts of behaviour that can occur – we know, for example, that every state is either recurrent or transient. There are also good computational methods – for hitting probabilities and expected rewards, and for long-run behaviour via invariant distributions.
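The computational methods mentioned above are very direct in practice. As a minimal illustration (the two-state chain and its transition probabilities are invented for this sketch), the invariant distribution can be found simply by iterating the transition matrix:

```python
# A two-state chain with transition matrix P, where P[i][j] is the
# probability of moving from state i to state j in one step.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def step(dist, P):
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

dist = [1.0, 0.0]          # start in state 0 with certainty
for _ in range(100):       # repeated steps converge to the invariant distribution
    dist = step(dist, P)

print([round(x, 4) for x in dist])   # pi satisfying pi = pi P
```

For this chain the invariant distribution can be checked by hand: pi = pi P gives pi(1) = 0.2 pi(0), so pi = (5/6, 1/6).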
In this chapter we shall look at five areas of application in detail: biological models, queueing models, resource management models, Markov decision processes and Markov chain Monte Carlo. In each case our aim is to provide an introduction rather than a systematic account or survey of the field. References to books for further reading are given in each section.
Markov chains in biology
Randomness is often an appropriate model for systems of high complexity, such as are often found in biology. We have already illustrated some aspects of the theory by simple models with a biological interpretation. See Example 1.1.5 (virus), Exercise 1.1.6 (octopus), Example 1.3.4 (birth-and-death chain) and Exercise 2.5.1 (bacteria).
The material on continuous-time Markov chains is divided between this chapter and the next. The theory takes some time to set up, but once up and running it follows a very similar pattern to the discrete-time case. To emphasise this we have put the setting-up in this chapter and the rest in the next. If you wish, you can begin with Chapter 3, provided you take certain basic properties on trust, which are reviewed in Section 3.1. The first three sections of Chapter 2 fill in some necessary background information and are independent of each other. Section 2.4 on the Poisson process and Section 2.5 on birth processes provide a gentle warm-up for general continuous-time Markov chains. These processes are simple and particularly important examples of continuous-time chains. Sections 2.6–2.8, especially 2.8, deal with the heart of the continuous-time theory. There is an irreducible level of difficulty at this point, so we advise that Sections 2.7 and 2.8 are read selectively at first. Some examples of more general processes are given in Section 2.9. As in Chapter 1 the exercises form an important part of the text.
Q-matrices and their exponentials
In this section we shall discuss some of the basic properties of Q-matrices and explain their connection with continuous-time Markov chains.
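The connection can be previewed numerically: for a Q-matrix Q, the transition probabilities at time t form the matrix exponential P(t) = exp(tQ). The sketch below computes this by the truncated power series sum of (tQ)^k / k!; the particular 2×2 Q-matrix is an invented example, not one from the text.

```python
import math

# An invented Q-matrix: off-diagonal entries non-negative, rows sum to zero.
Q = [[-2.0, 2.0],
     [1.0, -1.0]]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(Q, t, terms=30):
    """Truncated series for exp(tQ) = sum over k of (tQ)^k / k!."""
    n = len(Q)
    tQ = [[t * Q[i][j] for j in range(n)] for i in range(n)]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, tQ)                       # now (tQ)^k * k!/(k-1)!...
        term = [[x / k for x in row] for row in term]  # ...divided back: (tQ)^k / k!
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

P = mat_exp(Q, 0.5)
print([[round(x, 4) for x in row] for row in P])
```

Each row of P(t) is a probability distribution: the entries are non-negative and every row sums to 1, which is exactly the property the Q-matrix conditions are designed to guarantee.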
This chapter is concerned with the initial processing of analogue and digital signal sources prior to transmission over a public switched telephone network (PSTN). A major topic is pulse code modulation (PCM), a technique for encoding analogue signals and transmitting them over a digital link. (The name was originally chosen by analogy with other pulse modulation techniques such as pulse amplitude modulation (PAM) and pulse width modulation (PWM).) PCM involves sampling the analogue waveform at an appropriate rate, encoding the samples in digital (normally binary) form, and then transmitting the coded samples using a suitable digital waveform (which is also often binary but may well be ternary, quaternary or other). At the receiver the digital waveform is decoded, and the original message signal is reconstructed from the sample values. The complete process is illustrated in Fig. 7.1, where the transmitted digital signal has been shown as a baseband binary waveform for simplicity.
The most common application of PCM is perhaps in telephony, although the analogue message signal can originate from a wide range of sources other than a telephone handset: telemetry, radio, video, and so on. Similar techniques are also used for digital audio recording.
Later sections of the chapter introduce alternatives to standard PCM for the digital transmission of analogue signals, and also discuss the need for modems when using digital sources such as facsimile or computer terminals. Finally, some aspects of interfacing to an integrated services digital network (ISDN) are introduced.
Many of the concepts discussed in this book can be illustrated by simple computer simulations. Commercial packages are available for some of them, or simple programs can be written in programming languages such as ‘Basic’ or ‘Pascal’. However, for many topics in digital systems a spreadsheet provides a particularly easy and illuminating demonstration of important techniques.
The following are examples of how spreadsheets can be used to illustrate some of the topics in the text. The details will depend upon the particular spreadsheet package used: it is assumed that readers are already familiar with the package to which they have access. Two different spreadsheet packages are used in the examples below. The zero forcing equaliser is illustrated in ‘Excel’ on an ‘Apple’ ‘Macintosh’ computer, while the other two examples are shown implemented on ‘SuperCalc4’.
The reader is encouraged to use these examples as the starting point for experimentation with different parameters and configurations. Other examples of the use of spreadsheets for modelling in engineering will be found in Bissell and Chapman (1989).
A zero forcing equaliser (Section 4.3.4)
Fig. C1 shows a spreadsheet to simulate the zero forcing equaliser of Fig. 4.13. The input (column A) shifts into the first delay stage (column B) then into the second delay stage (column C). The three coefficients (−0.266, 0.866 and 0.204) have been entered in locations E3, E4 and E5 respectively.
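The same shift-and-weight mechanics can be expressed directly in code rather than in spreadsheet cells. The sketch below implements a 3-tap transversal filter with the coefficients quoted above; the input pulse values are invented for illustration and are not taken from Fig. C1.

```python
# Tap coefficients from the spreadsheet example (cells E3, E4, E5).
coeffs = [-0.266, 0.866, 0.204]

def equalise(inputs, coeffs):
    """Transversal filter: output = weighted sum of the current input
    and delayed copies, one delay stage per extra coefficient."""
    delay = [0.0] * (len(coeffs) - 1)   # the two delay stages (columns B and C)
    out = []
    for x in inputs:
        taps = [x] + delay
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
        delay = taps[:-1]               # shift the samples along
    return out

# An isolated distorted pulse (invented values) passed through the filter.
pulse = [0.3, 1.0, 0.25, 0.0, 0.0]
print([round(y, 4) for y in equalise(pulse, coeffs)])
```

Changing the coefficients and re-running plays the same role as editing cells E3 to E5 in the spreadsheet.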
In this book we try to give a representative (but not comprehensive) treatment of the digital transmission of signals. Our main aim has been to render the material truly accessible to second or third year undergraduate students, practising engineers requiring updating, or graduate physicists or mathematicians beginning work in the digital transmission sector of the telecommunications industry. This has led to a book whose important features are:
A limited number of topics, dealt with in depth
An emphasis on the engineering context and interpretation of mathematical models
Relevance to both students and practising engineers
Engineering is a pragmatic activity, and its models and theory primarily a means to an end. As with other engineering disciplines, much of telecommunications is driven by practicalities: the design of line codes (Chapter 6), or the synchronous digital hierarchy (Chapter 8), for example, owe little to any complicated theoretical analysis of digital telecommunications! Yet even such pragmatic activities take place against a background of constraints which telecommunications engineers sooner or later translate into highly abstract models involving bandwidth, spectra, noise density, probability distributions, error rates, and so on. To present these vitally important ideas, in a limited number of contexts, but in sufficient detail to be properly understood by the reader, is the main aim of this book. Thus time- and frequency-domain modelling tools form one constant theme (whether as part of the theory of pulse shaping and signal detection in Chapter 4, or as a background to the niceties of optical receiver design in Chapter 9); the constant battle against noise and the drive to minimise errors is another.
Fig. 2.1 introduces some of the most important phenomena which need to be modelled and analysed in digital signal transmission. It shows part of a system for transmitting binary data coded as two different voltage levels: a simple non-return-to-zero (NRZ) code has been assumed, in which a binary 1 is represented by a positive voltage for the duration of a complete symbol period, and binary 0 is represented by zero volts. The transmission medium might be a coaxial cable, as often used in small scale local networks. Similar principles apply, however, to systems using other transmission media and/or more complicated codes.
Fig. 2.1 also shows (not to scale) typical waveforms at various points in the system. After passing down the cable the original waveform A is attenuated (by an amount depending on the length of the cable) and, because of a finite system response time, the clear transitions between the two voltage levels have become indistinct. In practice there will also be a delay, corresponding to the time taken for the signal to pass along the cable, although this has not been shown explicitly. Neither are the effects of any noise included.
To counteract the distortion illustrated, the system includes an equaliser, which ‘sharpens’ the received waveform, so that the relationship of the equaliser output C to the original binary symbols is much clearer. Passing this equalised waveform through a threshold detector (a circuit whose output is one of two voltage levels depending on whether the input is greater or less than a pre-set threshold) has the result of generating a binary signal very similar to the transmitted one.
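The threshold detector itself is one line of logic: sample the equalised waveform once per symbol period and compare with the pre-set threshold. In this sketch the sample values are invented, standing in for the equalised waveform C of Fig. 2.1.

```python
# Half-way between the two nominal levels (1 V and 0 V in the NRZ example).
threshold = 0.5

# One sample of the equalised waveform per symbol period (invented values).
received = [0.92, 0.15, 0.81, 0.07, 0.88]

# Regenerate the binary sequence by comparison with the threshold.
bits = [1 if v > threshold else 0 for v in received]
print(bits)   # regenerated binary sequence: [1, 0, 1, 0, 1]
```

Provided the equaliser keeps the two levels well separated at the sampling instants, noise must push a sample across the threshold before an error occurs, which is why threshold setting and eye opening matter so much in later chapters.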
So far, this book has been concerned with models and processes which are relevant to a wide variety of telecommunication systems. The linear modelling tools of Part 1, and the digital techniques of coding, modulation and pulse processing discussed in Part 2, are applicable to line telephony, digital microwave links, satellite and mobile systems, and many other fields. In a book this size it is impossible to deal with all areas of modern digital telecommunications. To set the previous material into context, we have chosen therefore to concentrate in this final part on transmission aspects of just one, large-scale, system: the digital public switched telephone system or PSTN. The wide variety of topics with which a modern telecommunications engineer needs to be conversant is reflected particularly here, ranging from the borders of electronics almost to those of software engineering.
Fig. 1.1 of the Introduction showed part of an integrated services digital network (ISDN), in which signals from a variety of sources are transmitted over a universal network. Some of these signals are inherently digital (such as computer data), others are by nature analogue (such as speech input to a telephone handset). In an ISDN they are all transmitted as digital signals with a common format, and their origin is immaterial as far as network management is concerned.
There are currently (1992) few examples of true ISDNs. Nevertheless, in many countries the public switched telephone network is evolving in this direction. Fig. 3 shows part of such a telephone network. The boxes represent exchanges interconnected by numerous trunk routes.
In the Introduction to Part 2, the channel coding layer was identified as being concerned with maintaining the integrity of the conveyed data sequence. In one sense, of course, both line coding and pulse shaping are also chosen with this ultimate end in mind. The precise form of channel coding employed in a particular application depends therefore on the characteristics of the data, the nature of the channel, and the choice of line code and/or pulse shape.
The term channel coding is often used as a near-synonym for error detection/correction coding. Examples discussed in this chapter are Hamming codes, cyclic redundancy check (CRC), and convolutional coding. We have chosen a rather wider interpretation of the term channel coding, however, so as to include techniques sometimes used to make up for limitations inherent in a particular line code or signalling scheme. For example, the line codes HDB3 and CMI guarantee a good timing content, whereas AMI does not. Systems using AMI therefore sometimes incorporate scrambling before line coding to reduce the probability of long strings of data without transitions. Similarly, differential coding can be used to overcome the problems of error propagation which can be encountered with partial response signalling.
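Differential coding is simple enough to sketch completely. In the toy implementation below (a generic XOR scheme, not necessarily the exact arrangement used with any particular line code), each encoded bit is the XOR of the current data bit with the previous encoded bit, so the information is carried in transitions rather than absolute levels:

```python
def diff_encode(bits, start=0):
    """Each output bit is the XOR of the data bit with the previous output bit."""
    out, prev = [], start
    for b in bits:
        prev = prev ^ b
        out.append(prev)
    return out

def diff_decode(bits, start=0):
    """Recover each data bit as the XOR of consecutive received bits."""
    out, prev = [], start
    for b in bits:
        out.append(prev ^ b)
        prev = b
    return out

data = [1, 0, 1, 1, 0, 0, 1]
coded = diff_encode(data)
assert diff_decode(coded) == data   # round trip with a clean channel

# Invert every transmitted bit (a polarity reversal on the line): because
# only transitions carry information, the data is still recovered.
inverted = [1 - b for b in coded]
recovered = diff_decode(inverted, start=1)
print(recovered)
```

The implementation uses exactly the ingredients noted below: a one-bit delay element and an exclusive-OR in a feedback path, which is why it sits naturally alongside scrambling, CRC and convolutional coding.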
There is also another reason for treating together such topics as differential coding, scrambling, CRC and convolutional coding. As will be seen in what follows, the implementation of all these processes has much in common, particularly in the use of delay elements, feedback and shift registers.
The signal models introduced so far (periodic signals and isolated pulses) are all deterministic – that is, their behaviour at any instant in time is completely specified in advance. Clearly, though, the signals encountered in telecommunications systems do not often behave like this. The precise sequence of digital symbols transmitted as a message is not known in advance – otherwise there would be no point in transmitting it! And as far as noise is concerned, although the range of voltages to be expected might be known, there is no way of predicting the precise noise level at any particular instant in time. Many waveforms in telecommunications therefore need to be treated as random variables, and modelled using statistical tools. This chapter introduces such tools.
Statistical averages (means)
In Chapter 2 the mean value of a periodic signal was identified with its zero-frequency or d.c. component. A similar idea applies to a random waveform. If the random signal is lowpass filtered so as to remove almost all the time-varying components, as illustrated in Fig. 3.1, then the result is a waveform which wanders only slightly from its mean value or d.c. level. The mean value of a random waveform is therefore equal to its d.c. or zero-frequency component.
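A quick numerical check of this idea (with an invented waveform, not one from the text): averaging a long stretch of a random signal, which is what an ideal lowpass filter effectively does, recovers its d.c. level.

```python
import random

# Invented random waveform: uniform noise superimposed on a 0.5 V d.c. level.
rng = random.Random(0)
samples = [0.5 + rng.uniform(-1, 1) for _ in range(10000)]

# The long-run average settles close to the 0.5 V d.c. component.
mean = sum(samples) / len(samples)
print(round(mean, 3))
```

The instantaneous values wander over a 2 V range, yet the average, like the output of the lowpass filter in Fig. 3.1, stays close to the mean value.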
A mathematical expression for the mean value follows immediately. Fig. 3.2 is an enlarged version of the input to the filter in Fig. 3.1.
2.1(a) 2.5 exp(j(ωt + π/2)) + 2.5 exp(−j(ωt + π/2)); double-sided spectral lines of amplitude 2.5, phase ±π/2 at ±ω.
(b) 0.5[exp(jωt) + exp(−jωt)] + 0.25[exp(j(3ωt + π/4)) + exp(−j(3ωt + π/4))]; spectral lines of amplitude 0.5, zero phase at ±ω and amplitude 0.25, phase ±π/4 at ±3ω.
2.2(a) G(0) = 10⁻² V s, zero crossings every 100 Hz; (b) G(0) = 6.25 × 10⁻⁴ V s, zero crossings every 8 kHz.
2.3(a) 0.5 V high, extending from −1 to +1 ms; (b) 10 V high, extending from −0.025 to +0.025 s.
Chapter 3
3.1 V²/3 W.
3.2(a) Zero, as there is no d.c. spectral component; (b) 60 mV.
3.3 Output mean-square voltage 2 × 10⁻⁵ V². In other words, the first-order filter lets through approximately half as much noise power again as an ideal filter with the same cut-off. Alternatively, an ideal filter with a cut-off of πfc/2 would pass the same noise power as the first-order filter.
3.4 The continuous component in each case will be similar to Fig. 3.17(a) but scaled in frequency by a factor of 2 (spectral nulls at integral multiples of 2/T hertz). The bipolar impulse train has no spectral lines, so neither does the RZ bipolar pulse train. The unipolar impulse train has lines at d.c. and all multiples of 1/T; these appear in the spectrum of the RZ pulse train as appropriately scaled lines at d.c., 1/T and frequencies near to the side-lobe peaks in the continuous spectral component.