In Chapter 3 we saw how belief networks are used to represent statements about independence of variables in a probabilistic model. Belief networks are simply one way to unite probability and graphical representation. Many others exist, all under the general heading of ‘graphical models’. Each has specific strengths and weaknesses. Broadly, graphical models fall into two classes: those useful for modelling, such as belief networks, and those useful for inference. This chapter will survey the most popular models from each class.
Graphical models
Graphical Models (GMs) are depictions of independence/dependence relationships for distributions. Each class of GM is a particular union of graph and probability constructs and details the form of independence assumptions represented. Graphical models are useful since they provide a framework for studying a wide class of probabilistic models and associated algorithms. In particular they help to clarify modelling assumptions and provide a unified framework under which inference algorithms in different communities can be related.
It needs to be emphasised that all forms of GM have a limited ability to graphically express conditional (in)dependence statements [281]. As we've seen, belief networks are useful for modelling ancestral conditional independence. In this chapter we'll introduce other types of GM that are more suited to representing different assumptions. Here we'll focus on Markov networks, chain graphs (which marry belief and Markov networks) and factor graphs. There are many more inhabitants of the zoo of graphical models, see [73, 314].
In Part I we discussed inference and showed that for certain models this is computationally tractable. However, for many models of interest, one cannot perform inference exactly and approximations are required.
In Part V we discuss approximate inference methods, beginning with sampling-based approaches. These are popular and well known in many branches of the mathematical sciences, having their origins in chemistry and physics. We also discuss alternative deterministic approximate inference methods which in some cases can have remarkably accurate performance.
It is important to bear in mind that no single algorithm is going to be best on all inference tasks. For this reason, we attempt throughout to explain the assumptions behind the techniques so that one may select an appropriate technique for the problem at hand.
Natural organisms inhabit a dynamical environment and arguably a large part of natural intelligence is in modelling causal relations and consequences of actions. In this sense, modelling temporal data is of fundamental interest. In a more artificial environment, there are many instances where predicting the future is of interest, particularly in areas such as finance and also in tracking of moving objects.
In Part IV, we discuss some of the classical models of timeseries that may be used to represent temporal data and also to make predictions of the future. Many of these models are well known in different branches of science from physics to engineering and are heavily used in areas such as speech recognition, financial prediction and control. We also discuss some more sophisticated models in Chapter 25, which may be skipped at first reading.
As an allusion to the fact that natural organisms inhabit a temporal world, we also address in Chapter 26 some basic models of how information processing might be achieved in distributed systems.
When the distribution is multiply connected it would be useful to have a generic inference approach that is efficient in its reuse of messages. In this chapter we discuss an important structure, the junction tree, which by clustering variables enables one to perform message passing efficiently (although the structure on which the message passing occurs may consist of intractably large clusters). Of central importance is the junction tree itself, on which different message-passing procedures can be defined. The junction tree helps forge links with the computational complexity of inference in fields from computer science to statistics and physics.
Clustering variables
In Chapter 5 we discussed efficient inference for singly connected graphs, for which variable elimination and message-passing schemes are appropriate. In the multiply connected case, however, one cannot in general perform inference by passing messages only along existing links in the graph. The idea behind the Junction Tree Algorithm (JTA) is to form a new representation of the graph in which variables are clustered together, resulting in a singly connected graph in the cluster variables (albeit on a different graph). The main focus of the development will be on marginal inference, though similar techniques apply to different inferences, such as finding the most probable state of the distribution.
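The clustering idea can be illustrated on a small example. The sketch below uses a hypothetical four-variable belief network containing a loop; moralisation (marrying co-parents and dropping edge directions) makes the graph chordal here, and the maximal cliques then serve as clusters that form a singly connected tree joined by their intersection, the separator:

```python
# Sketch: clustering a multiply connected belief network into a tree of clusters.
# Hypothetical model: a -> b, a -> c, b -> d, c -> d. The loop a-b-d-c means plain
# message passing on the original graph is not valid.
parents = {'a': [], 'b': ['a'], 'c': ['a'], 'd': ['b', 'c']}

# Moralisation: connect co-parents of each child and drop edge directions.
moral = {v: set() for v in parents}
for child, ps in parents.items():
    for p in ps:
        moral[child].add(p)
        moral[p].add(child)
    for i, p in enumerate(ps):          # marry parents of a common child
        for q in ps[i + 1:]:
            moral[p].add(q)
            moral[q].add(p)

# Marrying b and c (the parents of d) already makes this graph chordal, so the
# maximal cliques can be read off directly and clustered into a tree.
clusters = [frozenset('abc'), frozenset('bcd')]
separator = clusters[0] & clusters[1]
print(sorted(separator))                # ['b', 'c']
```

Messages passed between the two clusters travel through the separator {b, c}, exactly as in singly connected message passing, but now over cluster variables.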
At this stage it is important to point out that the JTA is not a magic method to deal with intractabilities resulting from multiply connected graphs; it is simply a way to perform correct inference on a multiply connected graph by transforming to a singly connected structure.
Hidden Markov models assume that the underlying process is discrete; linear dynamical systems assume that it is continuous. However, there are scenarios in which the underlying system might jump from one continuous regime to another. In this chapter we discuss a class of models that can be used in this situation. Unfortunately the technical demands of this class of models are somewhat more involved than in previous chapters, although the models are correspondingly more powerful.
Introduction
Complex timeseries which are not well described globally by a single linear dynamical system may be divided into segments, each modelled by a potentially different LDS. Such models can handle situations in which the underlying model 'jumps' from one parameter setting to another. For example, a single LDS might well represent the normal flows in a chemical plant. When a break in a pipeline occurs, the dynamics of the system change from one set of linear flow equations to another. This scenario can be modelled using a set of two linear systems, each with different parameters. The discrete latent variable at each time, s_t ∈ {normal, pipe broken}, indicates which of the LDSs is most appropriate at the current time. This is called a Switching LDS (SLDS) and is used in many disciplines, from econometrics to machine learning [12, 63, 59, 235, 324, 189].
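Sampling from such a model is straightforward. The following is a minimal one-dimensional sketch with made-up parameters: a Markov chain on the switch s_t picks which linear dynamics generate the continuous state at each step.

```python
import numpy as np

# Minimal SLDS sampling sketch (1D). All parameters are made up for illustration:
# regime 0 ('normal') has slow dynamics, regime 1 ('pipe broken') decays faster.
rng = np.random.default_rng(0)

A = {0: 0.99, 1: 0.7}             # state-transition coefficient per regime
trans = np.array([[0.95, 0.05],   # switch transition p(s_t | s_{t-1})
                  [0.10, 0.90]])

T = 100
s = np.zeros(T, dtype=int)        # discrete switch: 0 = normal, 1 = pipe broken
h = np.zeros(T)                   # continuous latent state
v = np.zeros(T)                   # noisy observation
for t in range(1, T):
    s[t] = rng.choice(2, p=trans[s[t - 1]])            # sample the regime
    h[t] = A[s[t]] * h[t - 1] + rng.normal(scale=0.1)  # regime-dependent dynamics
    v[t] = h[t] + rng.normal(scale=0.2)                # emission

print(s[:10])
```

Inference (recovering s and h from v) is the hard part, since the posterior mixes over exponentially many switch sequences; this is what makes the chapter technically demanding.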
Probabilistic models explicitly take into account uncertainty and deal with our imperfect knowledge of the world. Such models are of fundamental significance in Machine Learning since our understanding of the world will always be limited by our observations and understanding. We will focus initially on using probabilistic models as a kind of expert system.
In Part I, we assume that the model is fully specified. That is, given a model of the environment, how can we use it to answer questions of interest? We will relate the complexity of inferring quantities of interest to the structure of the graph describing the model. In addition, we will describe operations in terms of manipulations on the corresponding graphs. As we will see, provided the graphs are simple tree-like structures, most quantities of interest can be computed efficiently.
Part I deals with manipulating mainly discrete variable distributions and forms the background to all the later material in the book.
In Part II we address how to learn a model from data. In particular we will discuss learning a model as a form of inference on an extended distribution, now taking into account the parameters of the model.
Learning a model or model parameters from data forces us to deal with uncertainty since with only limited data we can never be certain which is the ‘correct’ model. We also address how the structure of a model, not just its parameters, can in principle be learned.
In Part II we show how learning can be achieved under simplifying assumptions, such as maximum likelihood, which sets the parameters to those under which the observed data are most probable. We also discuss the problems that arise when, as is often the case, there is missing data.
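The maximum likelihood idea can be seen in its simplest form with i.i.d. Bernoulli data (a made-up coin-flip dataset here): the parameter that makes the observations most probable is just the empirical frequency.

```python
# Maximum likelihood for a Bernoulli parameter theta = p(x = 1):
# maximising the likelihood prod_n theta^x_n (1 - theta)^(1 - x_n)
# gives theta_ml = (number of ones) / (number of observations).
data = [1, 0, 1, 1, 0, 1, 1, 1]        # hypothetical observations
theta_ml = sum(data) / len(data)
print(theta_ml)                         # 0.75
```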
Together with Part I, Part II prepares the basic material required to embark on understanding models in machine learning, having the tools required to learn models from data and subsequently query them to answer questions of interest.
Sampling methods are popular and well known for approximate inference. In this chapter we give an introduction to the less well known class of deterministic approximation techniques. These have been spectacularly successful in branches of the information sciences and many have their origins in the study of large-scale physical systems.
Introduction
Deterministic approximate inference methods are an alternative to the sampling techniques discussed in Chapter 27. Drawing exact independent samples is typically computationally intractable and assessing the quality of the sample estimates is difficult. In this chapter we discuss some alternatives. The first, Laplace's method, is a simple perturbation technique. The second class of methods are those that produce rigorous bounds on quantities of interest. Such methods are interesting since they provide certain knowledge – it may be sufficient, for example, to show that a marginal probability is greater than 0.1 in order to make an informed decision. A further class of methods are the consistency methods, such as loopy belief propagation. Such methods have revolutionised certain fields, including error correction [197]. It is important to bear in mind that no single approximation technique, deterministic or stochastic, is going to beat all others on all problems, given the same computational resources. In this sense, insight as to the properties of the various approximations is useful in matching an approximation method to the problem at hand.
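Laplace's method can be sketched in one dimension: fit a Gaussian at the mode of exp(-E(x)) and use its (analytic) normalisation as the approximation. The energy function below is made up for illustration.

```python
import numpy as np

# Laplace's method sketch: approximate Z = int exp(-E(x)) dx by expanding E to
# second order around its minimum x*, giving
#   Z ~= exp(-E(x*)) * sqrt(2*pi / E''(x*)).
E  = lambda x: 0.25 * x**4 + 0.5 * x**2   # illustrative energy, minimum at 0
E2 = lambda x: 3.0 * x**2 + 1.0           # second derivative of E
x_star = 0.0                              # mode of exp(-E), found by inspection

Z_laplace = np.exp(-E(x_star)) * np.sqrt(2 * np.pi / E2(x_star))

# Numerical ground truth by a fine Riemann sum (the integrand is negligible
# outside [-10, 10]).
xs = np.linspace(-10.0, 10.0, 20001)
Z_true = np.sum(np.exp(-E(xs))) * (xs[1] - xs[0])

print(Z_laplace, Z_true)
```

Here the quartic term only sharpens the distribution relative to the fitted Gaussian, so the Laplace estimate overshoots the true normalisation; such perturbation approximations carry no general accuracy guarantee, in contrast to the bounding methods discussed next.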
We can now make a first connection between probability and graph theory. A belief network introduces structure into a probabilistic model by using graphs to represent independence assumptions among the variables. Probability operations such as marginalising and conditioning then correspond to simple operations on the graph, and details about the model can be ‘read’ from the graph. There is also a benefit in terms of computational efficiency. Belief networks cannot capture all possible relations among variables. However, they are natural for representing ‘causal’ relations, and they are a part of the family of graphical models we study further in Chapter 4.
The benefits of structure
It's tempting to think of feeding a mass of undigested data and probability distributions into a computer and getting back good predictions and useful insights in extremely complex environments. Unfortunately, such a naive approach is likely to fail. The number of possible ways variables can interact is extremely large, so that without some sensible assumptions we are unlikely to make a useful model. Independently specifying all the entries of a table p(x1, …, xN) over N binary variables xi takes O(2^N) space, which is impractical for more than a handful of variables. This is clearly infeasible in many machine learning and related application areas where we need to deal with distributions on potentially hundreds if not millions of variables. Structure is also important for computational tractability of inferring quantities of interest.
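The gap between the unstructured and structured cases is easy to quantify. If each variable in a belief network has at most k parents, each conditional table needs at most 2^k entries per variable, so storage grows linearly rather than exponentially in N:

```python
# Storage for a joint distribution over N binary variables:
# unconstrained table versus a belief network with at most k parents per variable.
N, k = 30, 3
full_table = 2**N - 1        # independent entries of an unconstrained joint
factorised = N * 2**k        # upper bound on conditional-table entries
print(full_table, factorised)   # 1073741823 240
```

With N = 30 and k = 3 the structured model needs a few hundred numbers where the full table needs over a billion, which is the quantitative sense in which sensible independence assumptions make modelling feasible.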