In this chapter we study problems on directed graphs and see how the techniques developed in previous chapters generalize to this setting. We first consider exact formulations for the arborescence problem and a vertex connectivity problem in directed graphs. For the latter, we demonstrate the iterative method in a more sophisticated uncrossing context, applied to biset families rather than the set families of previous chapters. We then extend these results to degree-bounded variants of the problems and use the iterative method to obtain bicriteria results, unlike previous chapters where the algorithm was optimal on the cost and violated only the degree constraints.
Given a directed graph D = (V, A) and a root vertex r ∈ V, a spanning r-arborescence is a subgraph of D such that there is a directed path from r to every vertex in V − r. The minimum spanning arborescence problem is to find a spanning r-arborescence of minimum total cost. We will show an integral characterization using iterative proofs, and extend this result in two directions. Given a directed graph D and a root vertex r, a rooted k-connected subgraph is a subgraph of D such that there are k internally vertex-disjoint directed paths from r to every vertex in V − r. The minimum rooted k-connected subgraph problem is to find a rooted k-connected subgraph of minimum total cost. We extend the proofs for the minimum arborescence problem to show an integral characterization in this more general setting.
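For concreteness, here is a sketch of the standard directed-cut relaxation for the minimum spanning arborescence problem (the chapter's exact notation may differ; δin(S) denotes the set of arcs entering S):

$$
\begin{aligned}
\text{minimize} \quad & \sum_{a \in A} c_a x_a \\
\text{subject to} \quad & x(\delta^{\mathrm{in}}(S)) \ge 1 && \forall\, S \subseteq V - r,\ S \ne \emptyset \\
& x_a \ge 0 && \forall\, a \in A
\end{aligned}
$$

For the rooted k-connected subgraph problem, the right-hand side 1 is replaced by k, and the constraints are written over bisets so as to capture internal vertex-disjointness.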
Given a weighted undirected graph, the maximum matching problem is to find a matching with maximum total weight. In his seminal paper, Edmonds [35] described an integral polytope for the matching problem and gave the famous blossom algorithm for solving the problem in polynomial time.
In this chapter, we will show the integrality of the formulation given by Edmonds [35] using the iterative method. The argument applies uncrossing in a more intricate manner than before, and hence we provide a detailed proof. Then, using the local ratio method, we will show how to extend the iterative method to obtain approximation algorithms for the hypergraph matching problem, a generalization of the matching problem to hypergraphs.
Graph matching
Matchings in bipartite graphs are considerably simpler than matchings in general graphs; indeed, the linear programming relaxation considered in Chapter 3 for the bipartite matching problem is not integral when applied to general graphs. See Figure 9.1 for a simple example.
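A minimal instance is the triangle: with only the degree constraints x(δ(v)) ≤ 1, the all-halves point is feasible,

$$
G = K_3, \qquad x_e = \tfrac{1}{2} \ \text{for each edge}
\;\;\Rightarrow\;\; x(\delta(v)) = 1 \ \forall v, \qquad \sum_{e} x_e = \tfrac{3}{2},
$$

whereas every matching in K3 has at most one edge; the odd-set constraints in the relaxation LPM(G) below cut this fractional point off.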
Linear programming relaxation
Given an undirected graph G = (V, E) with a weight function w: E → ℝ on the edges, the linear programming relaxation for the maximum matching problem due to Edmonds is given by the following LPM(G). Recall that E(S) denotes the set of edges with both endpoints in S ⊆ V and x(F) is a shorthand for ∑e∈F xe for F ⊆ E.
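Written out, LPM(G) is the familiar odd-set formulation:

$$
\begin{aligned}
\text{maximize} \quad & \sum_{e \in E} w_e x_e \\
\text{subject to} \quad & x(\delta(v)) \le 1 && \forall\, v \in V \\
& x(E(S)) \le \frac{|S| - 1}{2} && \forall\, S \subseteq V,\ |S| \ \text{odd},\ |S| \ge 3 \\
& x_e \ge 0 && \forall\, e \in E
\end{aligned}
$$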
Although there are exponentially many inequalities in LPM(G), there is an efficient separation oracle for this linear program, obtained by Padberg and Rao using Gomory-Hu trees.
Even though we mentioned the paper by Jain [75] as the first explicit application of the iterative method to approximation algorithms, several earlier results can be reinterpreted in this light, which is what we set out to do in this chapter. We first present a result by Beck and Fiala [12] on hypergraph discrepancy, whose proof is closest in spirit to the other proofs in this book. We then present a result by Steinitz [127] on rearrangements of sums in a geometric setting, which is the earliest application that we know of, followed by an approximation algorithm by Skutella [123] for the single source unsplittable flow problem. Next, we present the additive approximation algorithm for the bin packing problem by Karmarkar and Karp [77], which remains one of the most sophisticated uses of the iterative relaxation method. Finally, we sketch a recent application of the iterative method, augmented with randomized rounding, to the undirected Steiner tree problem [20], following the simplification due to Chakrabarty et al. [24].
A discrepancy theorem
In this section, we present the Beck–Fiala theorem from discrepancy theory using an iterative method. Given a hypergraph G = (V, E), a 2-coloring of the hypergraph is defined as an assignment χ: V → {−1, +1} on the vertices. The discrepancy of a hyperedge e is defined as discχ(e) = ∑v∈e χ(v), and the discrepancy of the hypergraph G is defined as discχ(G) = maxe∈E(G) |discχ(e)|.
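To make the iterative argument concrete, here is a minimal sketch of the Beck–Fiala rounding in Python (our illustration rather than the book's presentation; it assumes numpy and scipy, and the function name and vertex labelling are ours). Variables start at 0, edges with more than t floating variables are kept tight, and each step fixes at least one variable at ±1:

```python
import numpy as np
from scipy.linalg import null_space

def beck_fiala_coloring(n, edges):
    """Round a fractional coloring to chi: V -> {-1, +1} iteratively.

    n: number of vertices, labelled 0..n-1; edges: list of sets of vertices.
    If every vertex lies in at most t edges, every edge ends with
    |sum_{v in e} chi(v)| <= 2t - 1 (the Beck-Fiala bound).
    """
    t = max((sum(1 for e in edges if v in e) for v in range(n)), default=1)
    x = np.zeros(n)              # fractional coloring; every vertex floating at 0
    floating = set(range(n))     # vertices not yet fixed to +/-1
    while floating:
        flo = sorted(floating)
        # An edge stays "active" while it has more than t floating vertices;
        # a counting argument gives #active < #floating, so the system below
        # always has a nontrivial null space to move along.
        active = [e for e in edges
                  if sum(1 for v in e if v in floating) > t]
        if not active:
            # No active edges: rounding each floating vertex to its nearest
            # sign changes any remaining edge sum by < 2 per floating vertex.
            for v in flo:
                x[v] = 1.0 if x[v] >= 0 else -1.0
            break
        A = np.array([[1.0 if v in e else 0.0 for v in flo] for e in active])
        d = null_space(A)[:, 0]  # direction that keeps every active sum fixed
        # Step along d until the first floating variable reaches -1 or +1.
        alpha = min(((1.0 if d[i] > 0 else -1.0) - x[v]) / d[i]
                    for i, v in enumerate(flo) if abs(d[i]) > 1e-12)
        for i, v in enumerate(flo):
            x[v] += alpha * d[i]
            if abs(x[v]) >= 1.0 - 1e-9:   # hit the boundary: fix the variable
                x[v] = 1.0 if x[v] > 0 else -1.0
                floating.discard(v)
    return x
```

For instance, beck_fiala_coloring(4, [{0, 1, 2}, {1, 2, 3}]) returns a ±1 vector whose sum over each edge has absolute value at most 3 (here t = 2), matching the 2t − 1 bound.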
In this first chapter we motivate our method via the assignment problem. Through this problem, we highlight the basic ingredients and ideas of the method. We then give an outline of how a typical chapter in the rest of the book is structured, and how the remaining chapters are organized.
The assignment problem
Consider the classical assignment problem: Given a bipartite graph G = (V1 ∪ V2, E) with |V1| = |V2| and weight function w: E → ℝ+, the objective is to match every vertex in V1 with a distinct vertex in V2 to minimize the total weight (cost) of the matching. This is also called the minimum weight bipartite perfect matching problem in the literature and is a fundamental problem in combinatorial optimization. See Figure 1.1 for an example of a perfect matching in a bipartite graph.
One approach to the assignment problem is to model it as a linear programming problem. A linear program is a mathematical formulation of the problem with a system of linear constraints, which can contain both equalities and inequalities, and a linear objective function that is to be maximized or minimized. In the assignment problem, we associate a variable xuv with every {u, v} ∈ E. Ideally, we would like the variables to take one of two values, zero or one (hence, in the ideal case, they are binary variables). When xuv is set to one, the model signals that this pair is matched; when xuv is set to zero, it signals that this pair is not matched.
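In the variables just introduced, relaxing the binary requirement to nonnegativity gives the natural linear programming relaxation:

$$
\begin{aligned}
\text{minimize} \quad & \sum_{\{u,v\} \in E} w_{uv}\, x_{uv} \\
\text{subject to} \quad & \sum_{v \,:\, \{u,v\} \in E} x_{uv} = 1 && \forall\, u \in V_1 \\
& \sum_{u \,:\, \{u,v\} \in E} x_{uv} = 1 && \forall\, v \in V_2 \\
& x_{uv} \ge 0 && \forall\, \{u,v\} \in E
\end{aligned}
$$

As this chapter shows, every extreme point of this relaxation is integral, so an optimal assignment can be read off from an optimal extreme point solution.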
In this chapter, we will study the spanning tree problem in undirected graphs. First, we will study an exact linear programming formulation and show its integrality using the iterative method. To do this, we will introduce the uncrossing method, a very powerful technique in combinatorial optimization that will play a crucial role in the proof and will recur at numerous places in later chapters. We will show two different iterative algorithms for the spanning tree problem, each using a different choice of 1-element to pick in the solution. For the second iterative algorithm, we show three different correctness proofs for the existence of a 1-element in an extreme point solution: a global counting argument, a local integral token counting argument and a local fractional token counting argument. These token counting arguments will be used in many proofs in later chapters.
We then address the degree-bounded minimum-cost spanning tree problem. We show how the methods developed for the exact characterization of the spanning tree polyhedron are useful in designing approximation algorithms for this NP-hard problem. We give two additive approximation algorithms: the first follows the first approach for spanning trees and naturally generalizes to give a simple proof of the additive two approximation result of Goemans [59]; the second follows the second approach for spanning trees and uses the local fractional token counting argument to provide a very simple proof of the additive one approximation result of Singh and Lau [125].
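For reference, here is a sketch of the exact formulation in question, the subtour relaxation of the spanning tree polytope, together with the degree constraints added in the bounded variant (here W denotes the set of vertices carrying degree bounds B_v; this notation is ours):

$$
\begin{aligned}
\text{minimize} \quad & \sum_{e \in E} c_e x_e \\
\text{subject to} \quad & x(E(V)) = |V| - 1 \\
& x(E(S)) \le |S| - 1 && \forall\, \emptyset \ne S \subset V \\
& x(\delta(v)) \le B_v && \forall\, v \in W \ \ \text{(degree-bounded variant only)} \\
& x_e \ge 0 && \forall\, e \in E
\end{aligned}
$$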
In this chapter, we consider a simple model, based on a directed tree representation of the variables and constraints, called network matrices. We show how this model as well as its dual have integral optima when used as constraint matrices with integral right-hand sides. Finally, we show the applications of these models, especially in proving the integrality of the dual of the matroid intersection problem in Chapter 5, as well as the dual of the submodular flow problem in Chapter 7.
While our treatment of network matrices is based on their relation to uncrossed structures and their representations, they play a crucial role in the characterization of totally unimodular matrices, which are precisely the matrices that yield integral polytopes whenever they are used as constraint matrices with integral right-hand sides [121]. Note that total unimodularity of network matrices automatically implies integrality of the dual program when the right-hand sides of the dual are integral.
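As a small reminder of the definition: a matrix is totally unimodular if every square submatrix has determinant 0, +1 or −1. A standard example is the vertex–arc incidence matrix of a directed graph; for the digraph on vertices {1, 2, 3} with arcs (1,2), (2,3), (1,3), with +1 at the head and −1 at the tail of each arc, this is

$$
M = \begin{pmatrix} -1 & 0 & -1 \\ 1 & -1 & 0 \\ 0 & 1 & 1 \end{pmatrix},
$$

and every square submatrix of such a matrix has determinant in {0, ±1}, so linear programs with this constraint matrix and integral right-hand sides have integral extreme points.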
The integrality of the dual of the matroid intersection and submodular flow polyhedra can alternatively be derived by showing the total dual integrality of these systems [121]. Although our proof of these facts uses iterative rounding directly on the dual, there is a close connection between these two lines of proof, since both use the underlying structure of the span of the constraints defining the extreme points of the corresponding linear program.
Quoting Lovász from his paper “Submodular Functions and Convexity” [94]:
Several recent combinatorial studies involving submodularity fit into the following pattern. Take a classical graph-theoretical result (e.g. the Marriage Theorem, the Max-flow-min-cut Theorem etc.), and replace certain linear functions occurring in the problem (either in the objective function or in the constraints) by submodular functions. Often the generalizations of the original theorems obtained this way remain valid; sometimes even the proofs carry over. What is important here to realize is that these generalizations are by no means l'art pour l'art. In fact, the range of applicability of certain methods can be extended tremendously by this trick.
The submodular flow model is an excellent example to illustrate this point. In this chapter, we introduce the submodular flow problem as a generalization of the minimum cost circulation problem. We then show the integrality of its LP relaxation and its dual using the iterative method, and discuss many applications of the main result. We also apply the iterative method to an NP-hard degree-bounded generalization and show some applications of this result as well.
The crux of the integrality of the submodular flow formulation is the property that a maximal set of tight constraints forms a cross-free family. This representation allows an inductive token counting argument to show the existence of a 1-element in an optimal extreme point solution. We will see that this representation is precisely the one we will eventually encounter in Chapter 8 on network matrices.
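In one common sign convention (following Edmonds and Giles), the submodular flow polyhedron for a crossing submodular function f with f(∅) = f(V) = 0 and arc bounds d ≤ x ≤ c is

$$
\left\{\, x \in \mathbb{R}^A \;:\; d(a) \le x_a \le c(a) \ \ \forall a \in A, \quad x(\delta^{\mathrm{in}}(S)) - x(\delta^{\mathrm{out}}(S)) \le f(S) \ \ \forall S \subseteq V \,\right\}.
$$

Taking f ≡ 0 forces flow conservation at every vertex and recovers the circulations, which is the specialization mentioned above.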
Risk assessment is in many respects acknowledged as a scientific discipline per se: there are many master's and PhD programmes worldwide covering this field, and many scientific journals and conferences highlighting the area. However, there are few books addressing the scientific basis of this discipline, which is unfortunate, as the area of risk assessment is growing rapidly and there is an enormous drive and enthusiasm to implement risk assessment methods in organisations. Without a proper basis, risk assessment would fail as a scientific method or activity. Consider the following example, a statement from an experienced risk assessment team about uncertainty in quantitative risk assessments (Aven, 2008a):
The assessments are based on the “best estimates” obtained by using the company's standards for models and data. It is acknowledged that there are uncertainties associated with all elements in the assessment, from the hazard identification to the models and probability calculations. It is concluded that the precision of the assessment is limited, and that one must take this into consideration when comparing the results with the risk acceptance criteria and tolerability limits.
Based on such a statement, one may question what the scientific basis of the risk assessment is. Everything is uncertain, but is not risk assessment performed to assess the uncertainties? From the cited statement it looks like the risk assessment generates uncertainty. In any event, does this acknowledgment – that a considerable amount of uncertainty exists – affect the analyses and the conclusions?
In this chapter we present the announced requirements of reliability and validity that will be used to verify that a risk assessment is scientific (Section 3.3). But first, in Section 3.1, we give some reflections on risk assessment as a scientific method, motivated by two interesting editorials in the first issue of the journal Risk Analysis (Cumming, 1981; Weinberg, 1981), written in relation to the establishment of the Society for Risk Analysis. We also provide a brief review of the traditional sciences (Section 3.2), such as the natural sciences, social sciences, mathematics and probability theory, to place risk assessment in a broader scientific context. A key issue is to what extent risk assessment should be judged by reference to these traditional science paradigms, or whether it is a science per se.
Reflections on risk assessment being a scientific method
Cumming (1981) concludes that the process of analysing or assessing risks involves science, and consequently is a scientific activity. However, according to Cumming, risk assessment is not a scientific method per se. He writes:
Risk assessment cannot demand the certainty and completeness of science. It must produce answers because decisions will be made, with or without its input. The quality of societal decisions will be influenced by the quality of the risk information which goes into them, and the long term success of a society is influenced by the quality of its decisions. Thus, risk assessment is an important activity. It depends on science and has an important stake in receiving the input of good science.
In Chapters 5–7 we have seen how risk assessments and risk management are influenced by the risk perspectives. Now we would like to go one step further, and provide guidance on what should be the preferred approach to risk. The basis for the guidance is the discussion in the previous chapters. First, we need to clarify what we mean by risk. A number of definitions and interpretations of the risk concept exist, as discussed in Chapter 2. Many of these are probability-based. Below (Section 8.1) we present and discuss a structure for characterising the definitions, which is founded on a clear distinction between (Aven, 2010f)
(a) risk as a concept based on events, consequences and uncertainties;
(b) risk as a modelled, quantitative concept; and
(c) risk descriptions.
The discussion leads to an approach for conceptualising and assessing risk which is based on risk defined by (a), i.e. founded on the (A,C,U) risk perspective, in which the probability-based definitions of risk are viewed as model parameters and/or risk descriptions. The approach provides clear guidance on how to think when conceptualising and assessing risk in practice.
Next in this chapter (Section 8.2) we present and discuss a general model-based framework for risk assessments. Starting from an industry guide to quantitative uncertainty analysis and management, clarifications and simplifications are made to ensure consistency with the (A,C,U) risk perspective. Some simple examples are included to motivate and explain the basic ideas of the framework.
In this chapter we study the scientific platform of risk assessments when the objective of these assessments is accurate risk estimation. We first summarise the key concepts, probability and risk, using the set-up introduced in Chapter 2. We then conduct the assessments for the three cases, and on this basis we study the scientific quality of the risk assessments. The focus is on the scientific requirements of reliability and validity defined in Chapter 3.
The risk assessments presented in Sections 5.2–5.4 are rather comprehensive and detailed, although many simplifications have been made. If the reader is mainly concerned about the discussion of the scientific requirements of reliability and validity, a quick reading of these sections would suffice provided the reader is familiar with the statistical nomenclature and methods used. However, to fully appreciate the discussion in Section 5.5 it is necessary to go into the details of the assessments in Sections 5.2–5.4. For example, we cannot evaluate what the main quantities of interest are in the study or see the importance of key assumptions made in the assessments, without looking into the contexts of the analyses and precisely describing how the analyses are carried out.
Next we will study the scientific platform of risk assessments when the objective of these assessments is to describe uncertainties. We follow the same structure as in the previous chapter. We first summarise the framework introduced in Chapter 2 for assessing risk in such a setting, and clarify key concepts like probability and risk. We then conduct the assessments for the three cases, and on this basis we study the scientific quality of the risk assessments. The focus is again on the scientific requirements of reliability and validity defined in Chapter 3. We distinguish between an (A,C,Pf)-based risk perspective (referred to as the probability of frequency approach) and an (A,C,U)-based risk perspective; in the former, risk is defined through chances (the Bayesian term for frequentist probabilities, i.e. fractions of “successes” in the long run; refer to Chapter 2), and in the latter, risk is defined through uncertainties.
Scientific basis
We consider an activity and distinguish between the following two ways of looking at risk:
Risk is defined through chances (frequentist probabilities)
Risk = (A,C,Pf), where Pf is a chance (a relative frequency-interpreted probability) or a related parameter, such as the expected number of occurrences of the event A per unit of time, where the expectation is taken with respect to the chance distribution (the relative frequency-interpreted probability distribution).
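In symbols, the chance of the event A is understood as the limiting fraction of occurrences in an infinitely long series of similar situations:

$$
P_f(A) = \lim_{n \to \infty} \frac{\text{number of occurrences of } A \text{ in } n \text{ similar situations}}{n}.
$$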