In this chapter we discuss the solution of problems by a number of processors working in concert. In specifying an algorithm for such a setting, we must specify not only the sequence of actions of individual processors, but also the actions they take in response to the actions of other processors. The organization and use of multiple processors has come to be divided into two categories: parallel processing and distributed processing. In the former, a number of processors are coupled together fairly tightly: they are similar processors running at roughly the same speeds and they frequently exchange information with relatively small delays in the propagation of such information. For such a system, we wish to assert that at the end of a certain time period, all the processors will have terminated and will collectively hold the solution to the problem. In distributed processing, on the other hand, less is assumed about the speeds of the processors or the delays in propagating information between them. Thus, the focus is on establishing that algorithms terminate at all, on guaranteeing the correctness of the results, and on counting the number of messages that are sent between processors in solving a problem. We begin by studying a model for parallel computation. We then describe several parallel algorithms in this model: sorting, finding maximal independent sets in graphs, and finding maximum matchings in graphs. We also describe the randomized solution of two problems in distributed computation: the choice coordination problem and the Byzantine agreement problem.
THE study of random walks on graphs is fascinating in its own right. In addition, it has a number of applications to the design and analysis of randomized algorithms. This chapter will be devoted to studying random walks on graphs, and to some of their algorithmic applications. We start by describing a simple algorithm for the 2-SAT problem, and analyze it by studying the properties of random walks on the line. Following a brief treatment of the basics of Markov chains, we consider random walks on undirected graphs. It is shown that there is a strong connection between random walks and the theory of electric networks. Random walks are then applied to the problem of determining the connectivity of graphs. Next, we turn to the study of random walks on expander graphs. We define a class of expanders and use algebraic graph theory to characterize their properties. Finally, we illustrate the special properties of random walks on expanders via an application to probability amplification.
Let G = (V,E) be a connected, undirected graph with n vertices and m edges. For a vertex v ∈ V, Γ(v) denotes the set of neighbors of v in G. A random walk on G is the following process, which occurs in a sequence of discrete steps: starting at a vertex v0, we proceed at the first step to a randomly chosen neighbor of v0. This may be thought of as choosing a random edge incident on v0 and walking along it to a vertex v1 ∈ Γ(v0).
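The step-by-step process just described can be sketched as follows; the adjacency-list representation and function names here are our own illustration, not notation from the text.

```python
import random

def random_walk(graph, start, steps):
    """Simulate a simple random walk on an undirected graph.

    `graph` is an adjacency-list dict mapping each vertex to the list
    of its neighbors (the set Gamma(v) above).  Returns the sequence of
    vertices v0, v1, ..., v_steps visited.
    """
    path = [start]
    v = start
    for _ in range(steps):
        # choose a random edge incident on v and walk along it
        v = random.choice(graph[v])
        path.append(v)
    return path

# A 4-cycle: from every vertex the walk moves to one of its two neighbors.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
walk = random_walk(cycle, 0, 10)
```

Each consecutive pair in the returned path is guaranteed to be an edge of the graph, which is a convenient sanity check when experimenting with walks.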
ALL the algorithms we have studied so far receive their entire inputs at one time. We turn our attention to online algorithms, which receive and process the input in partial amounts. In a typical setting, an online algorithm receives a sequence of requests for service. It must service each request before it receives the next one. In servicing each request, the algorithm has a choice of several alternatives, each with an associated cost. The alternative chosen at a step may influence the costs of alternatives on future requests. Examples of such situations arise in data-structuring, resource-allocation in operating systems, finance, and distributed computing.
In an online setting, it is often meaningless to have an absolute performance measure for an algorithm. This is because in most such settings, any algorithm for processing requests can be forced to incur an unbounded cost by appropriately choosing the input sequence (we study examples of this below); thus, it becomes difficult, if not impossible, to perform a comparison of competing strategies. Consequently, we compare the total cost of the online algorithm on a sequence of requests, to the total cost of an offline algorithm that services the same sequence of requests. We refer to such an analysis of an online algorithm as a competitive analysis; we will make these notions formal presently.
IN this chapter we consider several fundamental optimization problems involving graphs: all-pairs shortest paths, minimum cuts, and minimum spanning trees. In each case, deterministic polynomial time algorithms are known, but the use of randomization allows us to obtain significantly faster solutions for these problems. We show that the problem of computing all-pairs shortest paths is reducible, via a randomized reduction, to the problem of multiplying two integer matrices. We present a fast randomized algorithm for the min-cut problem in undirected graphs, thereby providing evidence that this problem may be easier than the max-flow problem. Finally, we present a linear-time randomized algorithm for the problem of finding minimum spanning trees.
Unless stated otherwise, all the graphs we consider are assumed to be undirected and without multiple edges or self-loops. For shortest paths and min-cuts we restrict our attention to unweighted graphs, although in some cases the results generalize to weighted graphs; we give references in the discussion at the end of the chapter.
All-pairs Shortest Paths
Let G = (V,E) be an undirected, connected graph with V = {1,…,n} and |E| = m. The adjacency matrix A is an n × n 0-1 matrix with Aij = Aji = 1 if and only if the edge (i,j) is present in E. Given A, we define the distance matrix D as an n × n matrix with non-negative integer entries such that Dij equals the length of a shortest path from vertex i to vertex j.
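As a baseline for these definitions, D can be computed from A by a breadth-first search from every vertex; this naive sketch (our own, using 0-indexed vertices) runs in O(nm) time, whereas the chapter's point is that a randomized reduction to matrix multiplication does asymptotically better.

```python
from collections import deque

def distance_matrix(adj):
    """Compute the distance matrix D from an n x n 0-1 adjacency
    matrix `adj` of a connected, undirected graph, by running a
    breadth-first search from each source vertex in turn."""
    n = len(adj)
    D = [[None] * n for _ in range(n)]
    for s in range(n):
        D[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] == 1 and D[s][v] is None:
                    D[s][v] = D[s][u] + 1  # first visit = shortest distance
                    queue.append(v)
    return D

# Path graph on three vertices: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
D = distance_matrix(A)
```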
THE last decade has witnessed a tremendous growth in the area of randomized algorithms. During this period, randomized algorithms went from being a tool in computational number theory to finding widespread application in many types of algorithms. Two benefits of randomization have spearheaded this growth: simplicity and speed. For many applications, a randomized algorithm is the simplest algorithm available, or the fastest, or both.
This book presents the basic concepts in the design and analysis of randomized algorithms at a level accessible to advanced undergraduates and to graduate students. We expect it will also prove to be a reference to professionals wishing to implement such algorithms and to researchers seeking to establish new results in the area.
Organization and Course Information
We assume that the reader has had undergraduate courses in Algorithms and Complexity, and in Probability Theory. The book is organized into two parts. The first part, consisting of seven chapters, presents basic tools from probability theory and probabilistic analysis that are recurrent in algorithmic applications. Applications are given along with each tool to illustrate the tool in concrete settings. The second part of the book also contains seven chapters, each focusing on one area of application of randomized algorithms. The seven areas of application we have selected are: data structures, graph algorithms, geometric algorithms, number theoretic algorithms, counting algorithms, parallel and distributed algorithms, and online algorithms. Naturally, some of the algorithms used for illustration in Part I do fall into one of these seven categories.
SOME of the most notable results in theoretical computer science, particularly in complexity theory, have involved a non-trivial use of algebraic techniques combined with randomization. In this chapter we describe some basic randomization techniques with an underlying algebraic flavor. We begin by describing Freivalds’ technique for the verification of identities involving matrices, polynomials, and integers. We describe how this generalizes to the Schwartz-Zippel technique for identities involving multivariate polynomials, and we illustrate this technique by applying it to the problem of detecting the existence of perfect matchings in graphs. Then we present a related technique that leads to an efficient randomized algorithm for pattern matching in strings. We conclude with some complexity-theoretic applications of the techniques introduced here. In particular, we define interactive proof systems and demonstrate such systems for the graph non-isomorphism problem and the problem of counting the number of satisfying truth assignments for a Boolean formula. We then refine this concept into that of an efficiently verifiable proof and demonstrate such proofs for the satisfiability problem. We indicate how these concepts have led to a completely different view of classical complexity classes, as well as the new results obtained via the resulting insight into the structure of these classes.
Most of these techniques and their applications involve (sometimes indirectly) a fingerprinting mechanism, which can be described as follows. Consider the problem of deciding the equality of two elements x and y drawn from a large universe U.
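One concrete instance of such a fingerprinting mechanism, sketched here for large integers, is to compare residues modulo a random prime instead of the elements themselves; the function and parameter names are our own illustration, and the chapter develops the general technique properly.

```python
import random

def fingerprint_equal(x, y, prime_pool, trials=1):
    """Probabilistic equality test by fingerprinting.

    Rather than comparing the (possibly huge) integers x and y directly,
    compare their residues modulo a randomly chosen prime p.  If x == y
    the fingerprints always agree; if x != y the test errs only when p
    happens to divide x - y, which few primes in a large pool can do.
    """
    for _ in range(trials):
        p = random.choice(prime_pool)
        if x % p != y % p:
            return False  # differing fingerprints certify x != y
    return True  # fingerprints agreed every time: probably equal

# A small illustrative pool; a real use would draw primes from a large range.
primes = [101, 103, 107, 109, 113]
```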
IN this chapter we apply randomization to hard counting problems. After defining the class #P, we present several #P-complete problems. We present a (randomized) polynomial time approximation scheme for the problem of counting the number of satisfying truth assignments for a DNF formula. The problem of approximate counting of perfect matchings in a bipartite graph is shown to be reducible to that of the uniform generation of perfect matchings. We describe a solution to the latter problem using the rapid mixing property of a suitably defined random walk, provided the input graph is sufficiently dense. We conclude with an overview of the estimation of the volume of a convex body.
We say that a decision problem Π is in NP if for any YES-instance I of Π, there exists a proof that I is a YES-instance that can be verified in polynomial time. Equivalently, we can cast the decision problem as a language recognition problem, where the language consists of suitable encodings of all YES-instances of Π. A proof now certifies the membership in the language of an encoded instance of the problem. Usually the proof of membership corresponds to a “solution” to the search version of the decision problem Π: for instance, if Π were the problem of deciding whether a given graph is Hamiltonian, a possible proof of this for a Hamiltonian graph (YES-instance) would be a Hamiltonian cycle in the graph. In the counting version of this problem, we wish to compute the number of proofs that an instance I is a YES-instance.
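For the Hamiltonicity example, the polynomial-time verifier is easy to make concrete; this sketch (our own, with an adjacency-list representation) checks a claimed Hamiltonian cycle as the certificate.

```python
def verify_hamiltonian_cycle(graph, cycle):
    """Polynomial-time verifier for the certificate that a graph is
    Hamiltonian: the proof is a claimed Hamiltonian cycle.

    `graph` is an adjacency-list dict; `cycle` is a list of vertices.
    Accept iff the cycle visits every vertex exactly once and each
    consecutive pair (wrapping around) is joined by an edge.
    """
    n = len(graph)
    if len(cycle) != n or set(cycle) != set(graph):
        return False  # not a permutation of the vertex set
    return all(cycle[(i + 1) % n] in graph[cycle[i]] for i in range(n))

# The 4-cycle graph is Hamiltonian; the ordering 0, 1, 2, 3 certifies it.
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
```

The counting version then asks how many such orderings are accepted, which is in general much harder than deciding whether one exists.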
The theory of numbers plays a central role in several areas of great importance to computer science, such as cryptography, pseudo-random number generation, complexity theory, algebraic problems, coding theory, and combinatorics, to name just a few. We have already seen that relatively simple properties of prime numbers allow us to devise k-wise independent variables (Chapter 3), and number-theoretic ideas are at the heart of the algebraic techniques in randomization discussed in Chapter 7.
In this chapter, we focus on solving number-theoretic problems using randomized techniques. Since the structure of finite fields depends on the properties of prime numbers, algebraic problems involving polynomials over such fields are also treated in this chapter. We start with a review of some basic concepts in number theory and algebra. Then we develop a variety of randomized algorithms, most notably for the problems of computing square roots, solving polynomial equations, and testing primality. Connections with other areas, such as cryptography and complexity theory, are also pointed out along the way.
There are several unique features in the use of randomization in number theory. As will soon become clear, the use of randomization is fairly simple in that most of the algorithms will start by picking a random number from some domain and then work deterministically from there on. We will claim that with high probability the chosen random number has some desirable property. The hard part usually will be establishing this claim, which will require us to use non-trivial ideas from number theory and algebra.
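The pattern described here, namely pick one random number, then proceed deterministically, is visible even in the simplest probabilistic primality test; the sketch below uses Fermat's little theorem, which is weaker than the tests the chapter develops (Carmichael numbers fool it), but it illustrates the shape of the argument.

```python
import random

def fermat_test(n, trials=20):
    """Probabilistic compositeness test based on Fermat's little theorem.

    If n is prime, then a**(n-1) % n == 1 for every a in 1..n-1, so any
    a violating this is a witness proving n composite.  The randomness
    is confined to choosing a; everything after that is deterministic.
    """
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:  # modular exponentiation, 3-arg pow
            return False  # a is a witness: n is certainly composite
    return True  # no witness found: n is probably prime
```

Establishing that a random a is a witness with good probability is exactly the kind of claim the text says requires the non-trivial number theory.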
IN this chapter we study several ideas that are basic to the design and analysis of randomized algorithms. All the topics in this chapter share a game-theoretic viewpoint, which enables us to think of a randomized algorithm as a probability distribution on deterministic algorithms. This leads to Yao's Minimax Principle, which can be used to establish a lower bound on the performance of a randomized algorithm.
Game Tree Evaluation
We begin with another simple illustration of linearity of expectation, in the setting of game tree evaluation. This example will demonstrate a randomized algorithm whose expected running time is smaller than that of any deterministic algorithm. It will also serve as a vehicle for demonstrating a standard technique for deriving a lower bound on the running time of any randomized algorithm for a problem.
A game tree is a rooted tree in which internal nodes at even distance from the root are labeled MIN and internal nodes at odd distance are labeled MAX. Associated with each leaf is a real number, which we call its value. The evaluation of the game tree is the following process. Each leaf returns the value associated with it. Each MAX node returns the largest value returned by its children, and each MIN node returns the smallest value returned by its children. Given a tree with values at the leaves, the evaluation problem is to determine the value returned by the root.
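The evaluation process can be stated directly in code; this minimal sketch (our own representation, with nested tuples for internal nodes) evaluates children in random order, which is the idea behind the randomized algorithm analyzed later, though the pruning that makes it fast is omitted here.

```python
import random

def evaluate(node):
    """Evaluate a game tree given as nested structures.

    A leaf is a number.  An internal node is a pair ("MIN", children)
    or ("MAX", children).  Each MIN node returns the smallest value
    returned by its children, each MAX node the largest, exactly as in
    the definition above.  Children are visited in random order.
    """
    if isinstance(node, (int, float)):
        return node  # a leaf returns its associated value
    label, children = node
    children = list(children)
    random.shuffle(children)  # randomized visiting order
    values = [evaluate(c) for c in children]
    return min(values) if label == "MIN" else max(values)

# Root MIN over two MAX nodes: min(max(3, 5), max(2, 9)) = min(5, 9) = 5.
tree = ("MIN", [("MAX", [3, 5]), ("MAX", [2, 9])])
```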
IN Chapters 1 and 2, we bounded the expected running times of several randomized algorithms. While the expectation of a random variable (such as a running time) may be small, it may frequently assume values that are far higher. In analyzing the performance of a randomized algorithm, we often like to show that the behavior of the algorithm is good almost all the time. For example, it is more desirable to show that the running time is small with high probability, not just that it has a small expectation. In this chapter we will begin the study of general methods for proving statements of this type. We will begin by examining a family of stochastic processes that is fundamental to the analysis of many randomized algorithms: these are called occupancy problems. This motivates the study (in this chapter and the next) of general bounds on the probability that a random variable deviates far from its expectation, enabling us to avoid such custom-made analyses. The probability that a random variable deviates by a given amount from its expectation is referred to as a tail probability for that deviation. Readers wishing to review basic material on probability and distributions may consult Appendix C.
Occupancy Problems
We begin with an example of an occupancy problem. In such problems we envision each of m indistinguishable objects ("balls") being randomly assigned to one of n distinct classes ("bins"). In other words, each ball is placed in a bin chosen independently and uniformly at random.
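The experiment is straightforward to simulate, which is often a useful first check against the tail bounds derived later; the function name and parameters below are our own.

```python
import random

def throw_balls(m, n, rng=random):
    """Occupancy experiment: throw m indistinguishable balls into n
    distinct bins, each ball landing in a bin chosen independently and
    uniformly at random.  Returns the list of resulting bin loads."""
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1  # uniform, independent placement
    return loads

loads = throw_balls(1000, 10)
```

Summing the loads always recovers m, and with m = 1000 balls in n = 10 bins each load concentrates near m/n = 100, the kind of deviation-from-expectation behavior the chapter quantifies.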