The Weibull distribution finds wide application in reliability theory and is useful in analyzing failure and maintenance data. Its popularity arises from the fact that it offers flexibility in modeling failure rates, is easy to work with, and, most importantly, adequately describes many physical life processes. Examples include electronic components, ball bearings, semiconductors, motors, various biological organisms, fatigued materials, and corrosion and leakage of batteries. Classical and Bayesian techniques of Weibull estimation are described in [Abernathy et al., 1983] and [Kapur and Lamberson, 1977].
Components with constant failure rates (i.e. exponential life distributions) need not be maintained, only inspected: if they are found unfailed on inspection, they are ‘as good as new’. Components with increasing failure rates usually require preventive maintenance. Life data on such components is often heavily censored, as components are removed from service for many reasons other than failure. Weibull methods are therefore discussed in relation to censoring. In this chapter we assume that the censoring is independent (random); that is, the censoring process is independent of the failure process. This may arise, for example, when components undergo planned revision, or fail through overload caused by some upstream failure. In other cases a service sojourn may terminate for a reason which is not itself a failure but is related to failure: when components are removed during preventive maintenance to repair degraded performance, we certainly may not assume that the censoring is independent.
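To make the estimation problem concrete, the sketch below fits a Weibull model to right-censored life data by maximum likelihood; this is a minimal illustration, not the book's own procedure, and the failure times, censoring times and shape/scale parameterization are assumed for the example.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical life data (hours): observed failures and right-censored removals.
failures = np.array([412.0, 608.0, 745.0, 1100.0, 1370.0])
censored = np.array([500.0, 900.0, 1500.0])   # removed from service unfailed

def neg_log_lik(params):
    """Negative Weibull log-likelihood with right censoring.
    params = (log shape, log scale), so both parameters stay positive."""
    beta, eta = np.exp(params)
    ll = np.sum(np.log(beta / eta) + (beta - 1.0) * np.log(failures / eta)
                - (failures / eta) ** beta)       # density terms for observed failures
    ll += np.sum(-(censored / eta) ** beta)       # survival terms for censored units
    return -ll

res = minimize(neg_log_lik, x0=np.log([1.0, failures.mean()]), method="Nelder-Mead")
beta_hat, eta_hat = np.exp(res.x)
print(f"shape = {beta_hat:.2f}, scale = {eta_hat:.0f} hours")
```

A shape estimate above 1 would indicate an increasing failure rate, the case in which preventive maintenance is usually worthwhile.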
Project risk management is a rapidly growing area with applications in all engineering areas. We shall particularly concentrate on applications within the construction industry, but the techniques discussed are more widely applicable. Construction risks have been the subject of study for many years. In particular, [Thompson and Perry, 1992] gives a good overall guide to the subject with a large number of references. Several different models and examples of risk analyses for large projects are given in [Cooper and Chapman, 1987]. A description of many projects (in particular high-technology projects) from the twentieth century, the problems encountered during management, and the generic lessons learnt are given in [Morris, 1994].
Large-scale infrastructure projects typically have long lead times, suffer from high political and financial uncertainties, and often rely on innovative but uncertain technologies. Because of the high risks and costs involved, it has become common to apply risk management techniques with the aim of gaining insight into the principal sources of uncertainty in costs and/or time.
A project risk analysis performed by a candidate contractor before it bids for work is valuable because it can give the management quantitative insight into the sources of uncertainty in a project. This gives management a guide to the risks that need to be dealt with in the contract, or in financing arrangements.
The general problem of statistical inference is one in which, given observations of some random phenomenon, we try to make an inference about the probability distribution describing it. Much of statistics is devoted to the problem of inference. Usually we will suppose that the distribution is one of a family of distributions f(t|θ) parameterized by θ, and we try to make an assessment of the likely values taken by θ. An example is the exponential distribution f(t|λ) = λ exp(−λt), but also the joint distribution of n independent samples from the same exponential, f(t1, …, tn|λ) = λ^n exp(−λ(t1 + … + tn)), falls into the same category and is relevant when making inference on the basis of n independent samples.
Unfortunately, statisticians are not in agreement about the ways in which statistical inference should be carried out. There is a plethora of estimation methods which give rise to different estimates. Statisticians are not even in agreement about the principles that should be used to judge the quality of estimation techniques. The various creeds of statistician, of which the most important categories are Bayesian and frequentist, differ largely in the choice of principles to which they subscribe. (An entertaining guide to the differences is given in Bradley Efron's paper ‘Why isn't everyone a Bayesian?’ and the heated discussion that follows it [Efron, 1986].) To some extent the question is whether one thinks that statistical inference should be inductive or deductive.
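To make the contrast tangible, here is a minimal sketch (not taken from the text) of a frequentist and a Bayesian treatment of the exponential example above: the maximum likelihood estimate of the rate λ next to the posterior under an assumed conjugate Gamma prior, using synthetic data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
t = rng.exponential(scale=1 / 0.4, size=20)   # synthetic data, true rate 0.4

# Frequentist: maximum likelihood estimate of the exponential rate.
lam_mle = len(t) / t.sum()

# Bayesian: a Gamma(a, b) prior on the rate is conjugate to the exponential,
# giving a Gamma(a + n, b + sum(t)) posterior (prior parameters assumed here).
a, b = 1.0, 1.0
posterior = stats.gamma(a + len(t), scale=1.0 / (b + t.sum()))
lo, hi = posterior.interval(0.90)

print(f"MLE of lambda: {lam_mle:.3f}")
print(f"Posterior mean: {posterior.mean():.3f}, 90% credible interval: ({lo:.3f}, {hi:.3f})")
```

With a flat enough prior the two answers are numerically close; the disagreement described above is about interpretation and principles rather than arithmetic.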
This chapter gives a brief introduction to the relatively new and expanding field of uncertainty analysis. Fundamental concepts are introduced, but theorems will not be proved here. Since uncertainty analysis is effectively dependent on computer support, the models used in uncertainty analysis are discussed in relation to simulation methods. A good elementary introduction to simulation is found in the book of Ross [Ross, 1990].
Uncertainty analysis was introduced with the Rasmussen Report WASH-1400 [NRC, 1975] which, as we recall, made extensive use of subjective probabilities. It was anticipated that the decision makers would not accept a single number as the probability of a catastrophic accident with a nuclear reactor. Instead a distribution over possible values for the probability of a catastrophic accident was computed, using estimates of the uncertainty of the input variables. Since that study, uncertainty analyses have rapidly become standard for large technical studies aiming at consensus in areas with substantial uncertainty. The techniques of uncertainty analysis are not restricted to fault tree probability calculations; rather, they can be applied to any quantitative model. Uncertainty analysis is commonplace for large studies in accident consequence modeling, environmental risk studies and structural reliability.
Mathematical formulation of uncertainty analysis
Mathematically, uncertainty analysis concerns itself with the following problem. Given some function M(X1, …, Xn) of uncertain quantities X1, …, Xn, determine the distribution of M(X1, …, Xn) on the basis of some information about the joint distribution of X1, …, Xn.
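In practice this is usually done by Monte Carlo simulation, as noted above. The sketch below propagates assumed input distributions through an illustrative model M(X1, X2); both the model and the distributions are stand-ins, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

# Assumed (illustrative) distributions for the uncertain inputs.
x1 = rng.lognormal(mean=0.0, sigma=0.5, size=N)
x2 = rng.uniform(low=0.8, high=1.2, size=N)

# An illustrative model M(X1, X2); any quantitative model could stand in its place.
m = x1 * x2 ** 2

# Empirical summary of the output distribution.
print(f"mean = {m.mean():.3f}, "
      f"5%-95% range = ({np.quantile(m, 0.05):.3f}, {np.quantile(m, 0.95):.3f})")
```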
We have written this book for numerate readers who have taken a first university course in probability and statistics, and who are interested in mastering the conceptual and mathematical foundations of probabilistic risk analysis. It has been developed from course notes used at Delft University of Technology. An MSc course on risk analysis is given there to mathematicians and students from various engineering faculties. A selection of topics, depending on the specific interests of the students, is made from the chapters in the book. The mathematical background required varies from topic to topic, but all relevant probability and statistics are contained in Chapters 3 and 4.
Probabilistic risk analysis differs from other areas of applied science because it attempts to model events that (almost) never occur. When such an event does occur, the underlying systems and organizations are often changed so that it cannot occur in the same way again. Because of this, the probabilistic risk analyst must have a strong conceptual and mathematical background.
The first chapter surveys the history of risk analysis applications. Chapter 2 explains why probability is used to model uncertainty and why we adopt a subjective definition of probability in spite of its limitations. Chapters 3 and 4 provide the technical background in probability and statistics that is used in the rest of the book. The remaining chapters are more-or-less technically independent of each other, except that Chapter 7 must follow Chapter 6, and Chapter 14 should follow Chapter 13.
Reliability data is not simply ‘there’ waiting to be gathered. A failure rate is not an intrinsic property of a component like mass or charge. Rather, reliability parameters characterize populations that emerge from complex interactions of components, operating environments and maintenance regimes. This chapter presents mathematical tools for defining and analyzing populations from which reliability data is to be gathered. The chapter is long because the mathematical sophistication required of a practicing risk/reliability analyst has increased significantly in recent years. Whereas in the past the choices of statistical populations and analytic methods were hard-wired into the design of the data collection facility, today the analyst must play an increasingly active role in defining statistical populations relative to his/her particular needs.
The first step is to become clear about why we want reliability data. Modern reliability data banks (RDBs) are intended to serve at least three types of users: (1) the maintenance engineer interested in measuring and optimizing maintenance performance, (2) the component designer interested in optimizing component performance, and (3) the risk/reliability analyst wishing to predict reliability of complex systems in which the component operates.
To serve these users modern RDBs distinguish up to ten failure modes, often grouped into critical failures, degraded failures and incipient failures. Degraded and incipient failures are often associated with preventive maintenance. Whereas critical failures are of primary interest in risk and reliability calculations, a maintenance engineer is also interested in degraded and incipient failures.
How can we choose a probabilistic risk acceptance criterion, a probabilistic safety goal, or specify how changes of risk baseline should influence other design and operational decisions? Basically, we have to compare risks from different, perhaps very different activities. What quantities should be compared? There is an ocean of literature on this subject. The story begins with the first probabilistic risk analysis, WASH-1400, and books which come quickly to mind are [Shrader-Frechette, 1985], [Maclean, 1986], [Lowrance, 1976] and [Fischhoff et al., 1981]. A few positions, and associated pitfalls, are set forth below. For convenience we restrict attention to one undesirable event, namely death.
Single statistics representing risk
Deaths per million
The most common quantity used to compare risks is ‘deaths per million’. Covello et al. [Covello et al., 1989] give many examples of the use of this statistic. Similar tables are given by the British Health and Safety Executive [HSE, 1987]. The Dutch government's risk policy statement [MVROM, 1989] gives a variation on this method by tabulating the yearly risk of death as ‘one in X’.
Table 18.1 shows a few numbers taken from Table B.1 of [Covello et al., 1989], ‘Annual risk of death in the United States’. For each ‘cause’ the number of deaths per year per million is given.
In this chapter we shall give a brief introduction to some of the important ideas in decision theory. Decision analysis is an important area of application for quantitative risk analysis. Although many risk analyses are performed simply to show that an installation conforms to the requirements of a regulator, quantitative risk analyses are increasingly being used as input to a decision making process. Logic seems to dictate that a quantitative risk analysis be followed by a quantitative decision analysis.
The field of decision analysis is far too rich to describe in a single chapter. We shall, then, deal quickly with the basic notions required to give simple examples of multi-attribute decision making under uncertainty for a single decision-maker. See [French, 1988] and [Keeney and Raiffa, 1993] for deeper discussions of all the issues involved.
Decision theory has long been split by heated discussions between protagonists of normative theories and those of descriptive theories. A descriptive theory tries to represent the beliefs and preferences of individuals as they actually are. A normative theory tries to model how an individual's beliefs and preferences should be structured if they are to satisfy certain elementary consistency rules which might be expected of a rational individual.
More recently, these two groups have moved towards each other. In particular, adherents of the normative approach see their methods as an aid to the decision-maker in achieving consistent (rational) preferences.
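As a small foretaste of the examples treated later, the sketch below ranks two hypothetical alternatives by expected multi-attribute utility, using an additive utility with assumed attribute weights, scenario probabilities and single-attribute scores; none of these numbers come from the text.

```python
import numpy as np

# Two hypothetical alternatives scored on two attributes (cost, safety),
# with single-attribute utilities already scaled to [0, 1].
# utilities[alternative, scenario, attribute]
utilities = np.array([
    [[0.8, 0.6], [0.4, 0.9]],   # alternative A under scenarios 1 and 2
    [[0.6, 0.7], [0.7, 0.5]],   # alternative B under scenarios 1 and 2
])
scenario_probs = np.array([0.7, 0.3])   # assumed scenario probabilities
weights = np.array([0.4, 0.6])          # assumed attribute weights (cost, safety)

# Additive multi-attribute utility per scenario, then expectation over scenarios.
mau = utilities @ weights               # shape: (alternative, scenario)
expected_utility = mau @ scenario_probs
best = ["A", "B"][int(np.argmax(expected_utility))]
print(expected_utility, "-> choose alternative", best)
```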
Probabilistic risk analysis (PRA), also called quantitative risk analysis (QRA) or probabilistic safety analysis (PSA), is now widely applied in many sectors, including transport, construction, energy, chemical processing, aerospace, the military, and even project planning and financial management. In many of these areas PRA techniques have been adopted as part of the regulatory framework by the relevant authorities. In other areas the analytic PRA methodology is increasingly applied to validate claims for safety or to demonstrate the need for further improvement. The trend in all areas is for PRA to feed into tools for management decision making, forming the new area of risk management.
Since PRA tools are becoming ever more widely applied, and are growing in sophistication, one of the aims of this book is to introduce the reader to the main tools used in PRA, and in particular to some of the more recent developments in PRA modeling. Another important aim, though, is to give the reader a good understanding of uncertainty and the extent to which it can be modeled mathematically by using probability. We believe that it is of critical importance not just to understand the mechanics of the techniques involved in PRA, but also to understand the foundations of the subject in order to judge the limitations of the various techniques available. The most important part of the foundations is the study of uncertainty. What do we mean by uncertainty? How might we quantify it?
In many complex systems involving interaction between humans and machines, the largest contribution to the probability of system failure comes from basic failures or initiating events caused by humans. Kirwan ([Kirwan, 1994], Appendix 1) reviews twelve accidents and one incident occurring between 1966 and 1986, including the Space Shuttle accident and Three Mile Island, all of which were largely caused by human error. The realization of the extent of human involvement in major accidents has, in the Netherlands, led to the choice of a completely automated decision system for closing and opening that country's newest storm surge barrier.
Since humans can both initiate and mitigate accidents, it is clear that the influence of humans on total system reliability must be considered in any complete probabilistic risk analysis.
The first human reliability assessment was made as part of the final version of the WASH-1400 study. At that time the methodology was largely restricted to studies of the failure probability for elementary tasks. A human error probability, HEP, is the probability that an error occurs when carrying out a given task. In many situations in which human reliability is an important factor, the operator has to interpret (possibly incorrect) instrumentation data, make deductions about the problems at hand, and take decisions involving billion-dollar trade-offs under conditions of high uncertainty.
Fault and event trees are modeling tools used as part of a quantitative analysis of a system. Other semi-quantitative or qualitative analyses, such as failure modes and effects analysis (FMEA), are often performed in preparation for a more exact analysis. Such tools are outside the (quantitative) scope of this book, and the interested reader is referred to [Kumamoto and Henley, 1996] and [Andrews and Moss, 1993]. These books also provide further information and more examples on fault tree modeling, as does the Fault Tree Handbook [Vesely et al., 1981].
Fault tree and event tree analyses are two of the basic tools in system analysis. Both methodologies give rise to a pictorial representation of a statement in Boolean logic. We shall concentrate on fault tree analysis, but briefly explain the difference in the situations modeled by event trees and fault trees.
Event trees use ‘forward logic’. They begin with an initiating event (an abnormal incident) and ‘propagate’ this event through the system under study by considering all possible ways in which it can affect the behaviour of the (sub)system. The nodes of an event tree represent the possible functioning or malfunctioning of a (sub)system. If a sufficient set of such systems functions normally, the plant returns to normal operating conditions. A path through an event tree resulting in an accident is called an accident sequence.
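To make the Boolean representation concrete, the following sketch (not an example from the references above) evaluates a toy fault tree whose top event is (A AND B) OR C, with independent basic events and assumed probabilities; the gate formulas rely on the independence of their inputs, which holds here because the two inputs to the OR gate share no basic events.

```python
# Assumed basic-event probabilities for a toy fault tree: TOP = (A AND B) OR C.
p = {"A": 0.01, "B": 0.02, "C": 0.005}

def and_gate(*probs):
    """Probability that all (independent) inputs occur."""
    out = 1.0
    for q in probs:
        out *= q
    return out

def or_gate(*probs):
    """Probability that at least one of the (independent) inputs occurs."""
    none = 1.0
    for q in probs:
        none *= 1.0 - q
    return 1.0 - none

p_top = or_gate(and_gate(p["A"], p["B"]), p["C"])
print(f"P(top event) = {p_top:.6f}")
```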
The subject of dependent failures is one of the more important issues affecting the credibility and validity of standard risk analysis methods, and it is an issue around which much confusion and misleading terminology exist. This treatment draws on procedures for dealing with common causes issued by the US Nuclear Regulatory Commission and the International Atomic Energy Agency.
Component failure data versus incident reporting
Component reliability data banks typically collect individual component failure events and demands and/or operational times. From such data alone it is impossible to estimate the probabilities of dependent failures. For this we need information on the joint failures of components, which becomes available only when incidents involving multiple failures of components are recorded as such. Standard data banks do not collect data on incidents. There are, however, isolated exercises in incident reporting. There is also an ongoing program to analyze the so-called ‘licensee event reports’ in the American commercial nuclear power sector and draw conclusions for probabilistic risk analysis. An issue of Reliability Engineering and System Safety (27, 1990) was devoted to ‘accident sequence precursor analysis’, and several contributions describe this program (see for example [Cooke and Goossens, 1990]). The use of incident reporting in probabilistic risk analysis deserves more attention than it has received to date, but will not be discussed further here. Recently an international common cause database for the nuclear sector has been established [Carlsson et al., 1998].
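To illustrate why joint-failure records matter, the sketch below uses a simple count-based estimate in the beta-factor parameterization, one standard common-cause model (the text above does not commit to a particular model); all counts and exposure times are hypothetical.

```python
# Hypothetical incident data for a group of similar components.
independent_failures = 47        # events in which a single component failed
common_cause_failures = 3        # events in which multiple components failed together
total_operating_hours = 1.2e6    # pooled exposure time for the component group

total_failures = independent_failures + common_cause_failures
lam_total = total_failures / total_operating_hours   # overall failure rate per hour
beta = common_cause_failures / total_failures        # fraction attributed to common cause
lam_ccf = beta * lam_total                           # common-cause failure rate per hour

print(f"beta = {beta:.3f}, common-cause failure rate = {lam_ccf:.2e} per hour")
```

Without the second count, i.e. without incidents recorded as joint failures, beta cannot be estimated at all, which is the point made above.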